simple_html_dom(1)

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images
foreach($html->find('img') as $element)
       echo $element->src . '<br>';

// Find all links
foreach($html->find('a') as $element)
       echo $element->href . '<br>';

// Create DOM from string
$html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');

$html->find('div', 1)->class = 'bar';

$html->find('div[id=hello]', 0)->innertext = 'foo';

echo $html; // Output: <div id="hello">foo</div><div id="world" class="bar">World</div>

 

// Dump contents (without tags) from HTML
echo file_get_html('http://www.google.com/')->plaintext;

// Create DOM from URL
$html = file_get_html('http://slashdot.org/');

// Find all article blocks and collect their fields
$articles = array();
foreach($html->find('div.article') as $article) {
    $item = array();
    $item['title']   = $article->find('div.title', 0)->plaintext;
    $item['intro']   = $article->find('div.intro', 0)->plaintext;
    $item['details'] = $article->find('div.details', 0)->plaintext;
    $articles[] = $item;
}

print_r($articles);

How to create an HTML DOM object?

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from an HTML file
$html->load_file('test.htm');

How to find HTML elements?

// Find all anchors; returns an array of element objects
$ret = $html->find('a');

// Find the (N)th anchor (zero-based); returns an element object, or null if not found
$ret = $html->find('a', 0);

// Find the last anchor; returns an element object, or null if not found
$ret = $html->find('a', -1);

// Find all <div> elements that have an id attribute
$ret = $html->find('div[id]');

// Find all <div> elements whose id attribute equals foo
$ret = $html->find('div[id=foo]');

// Find all elements with id=foo
$ret = $html->find('#foo');

// Find all elements with class=foo
$ret = $html->find('.foo');

// Find all elements that have an id attribute
$ret = $html->find('*[id]');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> with class="hello"
$es = $html->find('table.hello td');

// Find all <td> tags with attribute align=center inside <table> tags
$es = $html->find('table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
       foreach($ul->find('li') as $li)
       {
             // do something...
       }
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

The following operators are supported in attribute selectors:

Filter               Description
[attribute]          Matches elements that have the specified attribute.
[!attribute]         Matches elements that don't have the specified attribute.
[attribute=value]    Matches elements that have the specified attribute with a certain value.
[attribute!=value]   Matches elements that don't have the specified attribute with a certain value.
[attribute^=value]   Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value]   Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value]   Matches elements that have the specified attribute and it contains a certain value.
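
The filters above can be tried side by side on a small string. This is only a sketch: it assumes simple_html_dom.php sits next to the script, the markup is made up for illustration, and hrefs_for() is a hypothetical helper, not part of the library.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical markup, made up for this illustration.
$html = str_get_html(
    '<a href="https://example.com/a.pdf" title="Report">A</a>'
  . '<a href="http://example.org/b.html">B</a>'
  . '<a href="https://example.com/c.html" title="Home">C</a>'
);

// Hypothetical helper: collect the href of every element matched by a selector.
function hrefs_for($html, $selector) {
    $out = array();
    foreach ($html->find($selector) as $a) {
        $out[] = $a->href;
    }
    return $out;
}

print_r(hrefs_for($html, 'a[title]'));        // anchors that have a title attribute
print_r(hrefs_for($html, 'a[!title]'));       // anchors without a title attribute
print_r(hrefs_for($html, 'a[href^=https]'));  // href starts with "https"
print_r(hrefs_for($html, 'a[href$=pdf]'));    // href ends with "pdf"
print_r(hrefs_for($html, 'a[href*=org]'));    // href contains "org"
```

Wrapping find() in a helper keeps each filter on one line, so the selectors can be compared directly against the table.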
