PHP: Extract HTML Tags/Elements from a Web Page

In PHP, you can download a web page using file_get_contents or curl. Once you have downloaded the page, you can process it. For example, suppose we want to extract the image URLs from a web page.
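As a rough sketch of the curl route, assuming the curl extension is enabled, a download helper might look like the following. The function name downloadWithCurl, the 10 second default timeout, and the choice to follow redirects are illustrative assumptions, not something this post prescribes.

```php
<?php
// Hypothetical helper (name and defaults are assumptions for illustration).
// Returns the response body as a string, or false on failure.
function downloadWithCurl(string $url, int $timeoutSeconds = 10) {
  $handle = curl_init($url);
  if ($handle === false) {
    return false;
  }

  curl_setopt_array($handle, [
    CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
    CURLOPT_FOLLOWLOCATION => true,            // follow HTTP redirects
    CURLOPT_FAILONERROR    => true,            // treat HTTP status >= 400 as a failure
    CURLOPT_TIMEOUT        => $timeoutSeconds, // give up after this many seconds
  ]);

  // With CURLOPT_RETURNTRANSFER set, curl_exec returns the body or false.
  return curl_exec($handle);
}
```

Because the helper returns false on failure, a caller can check the result the same way it would check the result of file_get_contents.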

We know that an image tag has the following structure, with the image URL in its src attribute:

<img src="image.gif" alt="Image Description" />

Keeping this in mind, we write the following program:


<?php
function extractElementsFromWebPage($webPage, $tagName) {
  // Creating a DOMDocument object.
  $dom = new DOMDocument;

  // Parsing the HTML from the web page. Real-world HTML is often
  // malformed, so @ suppresses the warnings loadHTML would print.
  if (@$dom->loadHTML($webPage)) {
    // Extracting the specified elements from the web page
    return $dom->getElementsByTagName($tagName);
  }
  return FALSE;
}

function downloadURL($URL) {
  // file_get_contents returns FALSE if the download fails
  $webPage = file_get_contents($URL);
  return $webPage;
}

// Put the URL of the page to process between the quotes.
$webPage = downloadURL("");
if ($webPage) {
  $imageElements = extractElementsFromWebPage($webPage, 'img');
  if ($imageElements) {
    foreach ($imageElements as $imageElement) {
      // Extracting the URLs
      echo $imageElement->getAttribute('src'), "\n";
    }
  } else {
    echo "Error in parsing the web page\n";
  }
} else {
  echo "Error in downloading the web page\n";
}

A few things are worth understanding about this program.

First, it uses file_get_contents to download the web page. Then it uses PHP's DOMDocument class to parse the HTML. These steps live in two functions:

  1. downloadURL
  2. extractElementsFromWebPage

downloadURL wraps file_get_contents and returns the page contents, or FALSE if the download fails. extractElementsFromWebPage creates a DOMDocument, calls its loadHTML method to parse the page, and then calls getElementsByTagName to extract the specified elements. In our case, we pass 'img' to extract the img elements.
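To try the loadHTML and getElementsByTagName steps without downloading anything, here is a minimal sketch that parses an inline HTML string and collects attribute values into an array. The helper name extractAttributeValues and the sample markup are made up for this example. It also swaps the @ operator for libxml_use_internal_errors, which is one common way to keep parser warnings about malformed HTML off the output.

```php
<?php
// Hypothetical helper: returns the value of $attribute for every element
// named $tagName in $html, as a plain array of strings.
function extractAttributeValues(string $html, string $tagName, string $attribute): array {
  // Nothing to parse; newer PHP versions reject an empty string in loadHTML.
  if ($html === '') {
    return [];
  }

  // Collect libxml parse warnings internally instead of printing them.
  $previous = libxml_use_internal_errors(true);

  $dom = new DOMDocument;
  $loaded = $dom->loadHTML($html);

  // Discard the collected warnings and restore the previous setting.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  if (!$loaded) {
    return [];
  }

  $values = [];
  foreach ($dom->getElementsByTagName($tagName) as $element) {
    // Skip elements that do not carry the requested attribute.
    if ($element->hasAttribute($attribute)) {
      $values[] = $element->getAttribute($attribute);
    }
  }
  return $values;
}

// Sample markup invented for this example.
$html = '<html><body>'
      . '<img src="image.gif" alt="Image Description" />'
      . '<p>Some text</p>'
      . '<img src="photos/cat.png" alt="A cat">'
      . '</body></html>';

foreach (extractAttributeValues($html, 'img', 'src') as $src) {
  echo $src, "\n";
}
```

Returning a plain array rather than the DOMNodeList makes the result easy to count, filter, or pass to other functions. Passing a different tag name and attribute, such as 'a' and 'href', would collect link targets in the same way.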

To execute the program, run it from the command line:

$ php extractElements.php
