How to Find ALL/Broken links using Selenium WebDriver?

Why should you check broken links?

You should always make sure that no link on your webpage is broken, so that users never land on an error page.

Link verification is one of the most common testing practices: each link is opened to confirm that it works correctly.

This testing is generally performed when a new build is deployed to the server. A quality analyst clicks each link and verifies that it works, so that the user lands on the correct page.

The check also looks for error codes such as 404 or 5xx and confirms that the server response code is 200.

However, this testing is monotonous when done by hand, which makes it a good candidate for automation.

Find Broken Links in Selenium:

This tutorial gives step-by-step guidance, explains the approach, and walks through the code:

Prerequisite:

  • A Selenium project should already be created.

How to find all links of a webpage in Selenium?

  • Step 1: Approach to find all links on the webpage: In HTML, every link is defined by an anchor (<a>) tag, and its URL is stored in the href attribute.

Example (Format in which links are stored on webpage):

<a id="js-link-box-es" href="//es.wikipedia.org/" title="Español — Wikipedia — La enciclopedia libre" class="link-box" data-slogan="La enciclopedia libre">
<strong>Español</strong>
<small>1 588 000+ artículos</small>
</a>
  • Step 2: Selenium provides the findElements method, which returns every WebElement matching a given locator as a List. Locating by the tag name "a" returns all the links on the page:

List<WebElement> allLinks = driver.findElements(By.tagName("a"));

All the anchor elements of the page are now stored in the allLinks list. The complete code is below:

package stepDef;

import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.Test;

import java.util.List;
import java.util.concurrent.TimeUnit;

public class VerifyPageLinks {
  @Test
  public void verifyBrokenLinks() {
    // WebDriverManager downloads the ChromeDriver binary automatically
    WebDriverManager.chromedriver().version("79.0").setup();
    WebDriver driver = new ChromeDriver();
    driver.get("https://www.wikipedia.org/");
    driver.manage().window().maximize();
    driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
    // Store all the links on the webpage in a typed List
    List<WebElement> allLinks = driver.findElements(By.tagName("a"));
    System.out.println("Total Number of Links: " + allLinks.size());
    for (WebElement link : allLinks) {
      // The value printed here is the URL held in the href attribute
      System.out.println("Link Text:" + link.getAttribute("href"));
    }
    driver.close();
  }
}

Console output:
Total Number of Links: 323
Link Text:https://en.wikipedia.org/
Link Text:https://es.wikipedia.org/
Link Text:https://ja.wikipedia.org/
Link Text:https://de.wikipedia.org/
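
Notice that the HTML example stores a protocol-relative href (//es.wikipedia.org/), while the console shows https://es.wikipedia.org/. For links, getAttribute("href") in practice returns the URL already resolved against the page address. The sketch below reproduces that resolution with the JDK's java.net.URI, so you can see where the https: prefix comes from. The class name HrefResolution, the method resolveHref and the /portal/about path are illustrative assumptions, not part of Selenium or of the page under test.

```java
import java.net.URI;

final class HrefResolution {
    // Illustrative helper: resolves an href against the page URL,
    // roughly the way a browser does before following the link.
    public static String resolveHref(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "https://www.wikipedia.org/";
        // Protocol-relative href: the scheme is taken from the page URL
        System.out.println(resolveHref(page, "//es.wikipedia.org/")); // prints https://es.wikipedia.org/
        // Hypothetical root-relative href: scheme and host are taken from the page URL
        System.out.println(resolveHref(page, "/portal/about")); // prints https://www.wikipedia.org/portal/about
    }
}
```

Because the driver already hands back absolute URLs, the test code does not need this step; it is shown only to explain the difference between the markup and the printed output.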

How to validate all links of the webpage?

  • Step 3: Approach to test a valid link
  • Test 1: Link text should not be blank: The link text is the visible text between the anchor tags, and we can read it with the Selenium getText() method. The URL itself is stored in the href attribute and is read with getAttribute("href"). Note that getText() returns an empty string, not null, for a link without visible text, so assertNotNull only guards against a null value.
// We want to run the whole test for all the links, so we use SoftAssert
SoftAssert soft = new SoftAssert();
for (WebElement link : allLinks) {
  String actualLinkText = link.getText();
  System.out.println("Link Text" + actualLinkText);
  System.out.println("URL within HREF: " + link.getAttribute("href"));
  // actualLinkText should not be null, so use the TestNG method assertNotNull
  soft.assertNotNull(actualLinkText);
}
// Report all collected failures at the end
soft.assertAll();
  • Test 2: Test broken link: When a request is sent to a server with the "GET" method, the server returns response code "200" for a valid URL. This tutorial treats any other response code as a broken link; codes in the 4xx and 5xx ranges are the ones that signal client and server errors.

To get the response code, we send a "GET" request with HttpURLConnection, an abstract class in java.net that provides methods for making HTTP requests over the network.

Here is the code to get response from server:

@Test
public void brokenLink() {
  String urForTest = "https://www.wikipedia.org/";
  try {
    URL url = new URL(urForTest);
    HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
    // Set the request method to "GET"
    httpConnection.setRequestMethod("GET");
    httpConnection.connect();
    int serverResponseCode = httpConnection.getResponseCode();
    System.out.println("Server Response Code: " + serverResponseCode);
  } catch (Exception e) {
    e.printStackTrace();
  }
}

Console output:
Server Response Code: 200
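
A strict comparison with 200 also flags other successful responses, such as 204 (No Content). As a sketch of a more tolerant rule, the hypothetical helper isBroken below treats any 4xx or 5xx code as broken, along with a code of 0 or less, which stands for "no response received". The class name, the method name and the thresholds are assumptions made for illustration; adjust them to your own policy.

```java
final class LinkStatus {
    // Hypothetical helper: true when the status code indicates a broken link.
    // Codes <= 0 mean that no HTTP response was obtained (for example, a connection failure).
    public static boolean isBroken(int statusCode) {
        return statusCode <= 0 || statusCode >= 400;
    }

    public static void main(String[] args) {
        int[] samples = {200, 204, 301, 404, 500, 0};
        for (int code : samples) {
            System.out.println(code + " -> " + (isBroken(code) ? "broken" : "ok"));
        }
    }
}
```

With a helper like this, an assertion could read soft.assertFalse(LinkStatus.isBroken(responseCode), ...) instead of comparing with 200 directly.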

  • Step 4: Make one reusable method, getResponseCode, that returns the server response code for any URL. (Reuse the broken link test code and make it generic.)
  • Step 5: Add the assertions below to the test:
// Test Case 2: Verify response code
String urlForTest = link.getAttribute("href");
int responseCode = getResponseCode(urlForTest);
System.out.println("Link: " + urlForTest + "Response From Server: " + responseCode);
soft.assertEquals(responseCode, 200, "Testing Response of URL");
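
One possible variation of the reusable method is sketched below. It is not the version used in this tutorial: it sends a HEAD request instead of GET so that no response body is downloaded, sets connect and read timeouts so that one slow server cannot stall the run, and releases the connection when done. The class name ResponseCodeFetcher, the 5-second timeouts and the -1 return value for failures are assumptions chosen for illustration.

```java
import java.net.HttpURLConnection;
import java.net.URL;

final class ResponseCodeFetcher {
    // Returns the HTTP status code for the URL, or -1 if no response was obtained
    // (malformed URL, unsupported protocol, timeout, connection failure).
    public static int getResponseCode(String urlForTest) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(urlForTest).openConnection();
            connection.setRequestMethod("HEAD");   // ask for headers only
            connection.setConnectTimeout(5000);    // assumed limit, in milliseconds
            connection.setReadTimeout(5000);       // assumed limit, in milliseconds
            return connection.getResponseCode();
        } catch (Exception e) {
            return -1;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Needs network access; the printed code depends on the server
        System.out.println(getResponseCode("https://www.wikipedia.org/"));
        // A string that is not a URL never reaches the network
        System.out.println(getResponseCode("not a url")); // prints -1
    }
}
```

Some servers do not support HEAD and answer with an error such as 405, so a fallback to GET may be needed before a link is reported as broken.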

Complete code:

package stepDef;

import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.Test;
import org.testng.asserts.SoftAssert;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class VerifyPageLinks {

  @Test
  public void verifyBrokenLinks() {
    // WebDriverManager downloads the WebDriver binary automatically
    WebDriverManager.chromedriver().version("79.0").setup();
    WebDriver driver = new ChromeDriver();
    driver.get("https://www.wikipedia.org/");
    driver.manage().window().maximize();
    driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);

    // Store all the links on the webpage in a typed List
    List<WebElement> allLinks = driver.findElements(By.tagName("a"));
    System.out.println("Total Number of Links: " + allLinks.size());

    // Soft assertions let the loop continue after a failed check
    SoftAssert soft = new SoftAssert();
    for (WebElement link : allLinks) {
      String actualLinkText = link.getText();
      System.out.println("Link Text" + actualLinkText);
      System.out.println("URL within HREF: " + link.getAttribute("href"));
      // Test Case 1: actualLinkText should not be null
      soft.assertNotNull(actualLinkText);
      // Test Case 2: Verify response code
      String urlForTest = link.getAttribute("href");
      int responseCode = getResponseCode(urlForTest);
      System.out.println("Link: " + urlForTest + "Response From Server: " + responseCode);
      soft.assertEquals(responseCode, 200, "Testing Response of URL" + urlForTest);
    }
    // Close the browser first: assertAll() throws if any soft assertion failed
    driver.close();
    soft.assertAll();
  }

  @Test
  public void brokenLink() {
    String urForTest = "https://www.wikipedia.org/";
    try {
      URL url = new URL(urForTest);
      HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
      httpConnection.setRequestMethod("GET");
      httpConnection.connect();
      int serverResponseCode = httpConnection.getResponseCode();
      System.out.println("Server Response Code: " + serverResponseCode);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  // Reusable method: returns the server response code, or 0 if the request fails
  public int getResponseCode(String urForTest) {
    int serverResponseCode = 0;
    try {
      URL url = new URL(urForTest);
      HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
      httpConnection.setRequestMethod("GET");
      httpConnection.connect();
      serverResponseCode = httpConnection.getResponseCode();
      System.out.println("Server Response Code: " + serverResponseCode);
    } catch (Exception e) {
      e.printStackTrace();
    }
    return serverResponseCode;
  }
}

Console:
URL within HREF: https://ja.wikipedia.org/
Server Response Code: 200
Link: https://ja.wikipedia.org/Response From Server: 200
Link TextDeutsch
2 416 000+ Artikel
URL within HREF: https://de.wikipedia.org/
Server Response Code: 200
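
On a page with hundreds of links, the loop above requests every href it finds, including duplicates and values that are not web addresses at all (a missing href, or schemes such as mailto: and javascript:). One way to trim the list first is sketched below: collect the href strings, keep only http and https URLs, and drop repeats while preserving order. The class LinkFilter, the method checkableUrls and the sample values are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class LinkFilter {
    // Keeps only distinct http(s) URLs, preserving first-seen order.
    public static List<String> checkableUrls(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Sample values standing in for link.getAttribute("href") results
        List<String> hrefs = Arrays.asList(
            "https://en.wikipedia.org/", null, "", "mailto:someone@example.com",
            "javascript:void(0)", "https://en.wikipedia.org/", "https://es.wikipedia.org/");
        System.out.println(checkableUrls(hrefs));
        // prints [https://en.wikipedia.org/, https://es.wikipedia.org/]
    }
}
```

In the test, the href of each WebElement would be added to a list, passed through a filter like this, and only the remaining URLs sent to getResponseCode.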

See also our article: Install & Setup Selenium step by step tutorial

Wonderful! You have completed the broken link validation code and can now test any webpage for broken links. We hope this article helps you write your own code. Feel free to reach out to us at query@thoughtcoders.com.

Reference: https://docs.oracle.com/javase/8/docs/api/java/net/HttpURLConnection.html
