In my code I want to extract links and their text from an old website. It works, but there is a problem: some pages use ol>li tags and others use ul>li tags inside a table. I have about 400 different pages, and to extract the links I have to switch between ol and ul every time.

The easiest and most time-saving way for me would be to extract the links and text by defining the specific <table> that contains them. But when I define <table>, it also extracts links from other tables that I don't want.

The structure of the target table, which contains ol>li or ul>li tags:
<table style="width:850px;" cellspacing="0" cellpadding="1" border="3">
  <tbody>
    <tr>
      <td style="text-align: center; background-color: rgb(51, 51, 204);">
        <h1>my links</h1>
      </td>
    </tr>
    <tr>
      <td>
        <ol>
          <li><a href="http://websitelink.com/page1.php">page 1</a></li>
          <li><a href="http://websitelink.com/page2.php">page 2</a></li>
          <li><a href="http://websitelink.com/page3.php">page 3</a></li>
          <li><a href="http://websitelink.com/page4.php">page 4</a></li>
        </ol>
        ...
        <ul>
          <li><a href="http://websitelink.com/a.php">page a</a></li>
          <li><a href="http://websitelink.com/b.php">page b</a></li>
          <li><a href="http://websitelink.com/c.php">page c</a></li>
          <li><a href="http://websitelink.com/d.php">page d</a></li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>
My current PHP code:
$html = file_get_contents('http://mywebsitelink.com/pagename.html');
$dom = new DOMDocument;
@$dom->loadHTML($html);
// I have to switch between 'ul' and 'ol' here instead of being able to define the table
$olTags = $dom->getElementsByTagName('ol');
foreach ($olTags as $list) {
    $links = $list->getElementsByTagName('a');
    foreach ($links as $link) {
        $text = $link->nodeValue;
        $href = $link->getAttribute('href');
        if (!empty($text) && !empty($href)) {
            echo "link title: " . $text . " location: " . $href . "<br />";
        }
    }
}
$html = file_get_contents('http://mywebsitelink.com/pagename.html');
$dom = new DOMDocument;
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// The union matches the <a> elements in both ol>li and ul>li lists
$theTags = $xpath->query('//table/tbody/tr/td/ol/li/a|//table/tbody/tr/td/ul/li/a');
// Each match is already an <a>; calling getElementsByTagName('a') on it would return nothing
foreach ($theTags as $oneLink) {
    $text = $oneLink->nodeValue;
    $href = $oneLink->getAttribute('href');
    if (!empty($text) && !empty($href)) {
        echo "link title: " . $text . " location: " . $href . "<br />";
    }
}
[...]
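The XPath query above still matches list links in every table on the page. One possible way to limit it to a single table is to pick the table by something unique to it, for example its heading. The following is only a rough sketch under that assumption: extractTableLinks() and its parameters are hypothetical names that are not part of the original code, the sample HTML is made up for the demonstration, and it assumes the target table is not nested inside another table.

```php
<?php
// Hypothetical helper: returns the links found in the list items of the
// table whose first <h1> reads $headingText.
function extractTableLinks(string $html, string $headingText): array
{
    $dom = new DOMDocument();
    // Old pages are rarely valid HTML; collect parser warnings instead of using @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query('//table') as $table) {
        // Assumption: the wanted table is identified by its heading text.
        $heading = $xpath->query('.//h1', $table)->item(0);
        if ($heading === null || trim($heading->textContent) !== $headingText) {
            continue;
        }
        // ".//li/a" matches links in both <ol> and <ul> lists of this table.
        foreach ($xpath->query('.//li/a[@href]', $table) as $a) {
            $links[] = [
                'text' => trim($a->textContent),
                'href' => $a->getAttribute('href'),
            ];
        }
    }
    return $links;
}

// Made-up sample: the first table should be ignored, the second one is the target.
$sample = '<table><tr><td><h1>other</h1></td></tr>'
        . '<tr><td><ul><li><a href="http://websitelink.com/x.php">skip me</a></li></ul></td></tr></table>'
        . '<table><tr><td><h1>my links</h1></td></tr>'
        . '<tr><td><ol><li><a href="http://websitelink.com/page1.php">page 1</a></li></ol>'
        . '<ul><li><a href="http://websitelink.com/a.php">page a</a></li></ul></td></tr></table>';

foreach (extractTableLinks($sample, 'my links') as $link) {
    echo "link title: " . $link['text'] . " location: " . $link['href'] . "\n";
}
```

If the heading differs across the 400 pages, the same idea should work with another distinguishing feature, such as an attribute test on the table element (for example its border or style attribute) in place of the heading check.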