Tuesday, 3 April 2012

PHP: extract HTML content between two markers


The functions below extract the HTML content found between two markers (a start pattern and an end pattern) using plain string functions.


// Returns the content between the first $startPattern and the next $endPattern,
// or false if either marker is missing. NOTE: this extracts only the first match.

function extractContent($html, $startPattern, $endPattern){
    $start = strpos($html, $startPattern);
    if ($start === false) return false;
    $contentStart = $start + strlen($startPattern); // first character after the start marker
    $end = strpos($html, $endPattern, $contentStart);
    if ($end === false) return false;
    return substr($html, $contentStart, $end - $contentStart);
}
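
To see how the offsets fit together, here is a small step-by-step sketch on a short sample string. The sample string and the $contentStart variable name are only for illustration and are assumptions, not part of the function above.

```php
<?php
$html = '<h1>content</h1>';
$startPattern = '<h1>';
$endPattern = '</h1>';

$start = strpos($html, $startPattern);            // 0: the start marker opens the string
$contentStart = $start + strlen($startPattern);   // 4: first character after '<h1>'
$end = strpos($html, $endPattern, $contentStart); // 11: position where '</h1>' begins
$content = substr($html, $contentStart, $end - $contentStart); // takes 11 - 4 = 7 characters

echo $content; // content
```

Note that the search for the end marker begins exactly at $contentStart. Starting it one character later would miss the end marker whenever the content is empty, as in '<h1></h1>'.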


Usage:
For example, to extract the text from <h1>content value that you want to extract...</h1>, you can call:


$html = '<h1>content value that you want to extract...</h1>';
$startPattern = '<h1>';
$endPattern = '</h1>';

echo extractContent($html, $startPattern, $endPattern); // content value that you want to extract...


Other tricks:


// Returns an array; each element is the content of one match, in document order.

function extractAllContents($html, $startPattern, $endPattern){
    $contents = array();
    $index = 0;
    while (($start = strpos($html, $startPattern, $index)) !== false){
        $contentStart = $start + strlen($startPattern); // first character after the start marker
        $end = strpos($html, $endPattern, $contentStart);
        if ($end === false) break; // no closing marker: stop scanning
        $contents[] = substr($html, $contentStart, $end - $contentStart);
        $index = $end + strlen($endPattern); // resume right after the end marker
    }
    return $contents;
}
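
If you would rather not track offsets by hand, the same result can probably be had with a regular expression. The sketch below swaps the strpos loop for preg_match_all; the function name extractAllWithRegex is hypothetical and not part of the code above, and the <li> sample data is made up for the example.

```php
<?php
// Hypothetical helper (an assumption, not from the original post):
// collects every piece of content between two markers using a regex.
function extractAllWithRegex($html, $startPattern, $endPattern) {
    // preg_quote() escapes regex metacharacters in the markers; '/' is the delimiter.
    $regex = '/' . preg_quote($startPattern, '/')
           . '(.*?)'
           . preg_quote($endPattern, '/') . '/s';
    // The non-greedy (.*?) stops at the nearest end marker;
    // the "s" modifier lets "." match newlines too.
    preg_match_all($regex, $html, $matches);
    return $matches[1]; // captured contents, in document order
}

$html = '<li>one</li><li>two</li><li></li>';
$items = extractAllWithRegex($html, '<li>', '</li>');
print_r($items); // three elements: 'one', 'two' and an empty string
```

Like the string-based version, this treats the markers as literal text, so it does not understand nested tags or attributes. For anything beyond simple, predictable markup, a real HTML parser is likely the safer choice.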
