Parsing HTML Tags with PHP Simple HTML DOM Parser

程序员文章站 2022-05-26 20:58:20


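I tried out PHP Simple HTML DOM Parser for parsing HTML pages and it works quite well. It builds a DOM tree that makes it easy to pick out the content of an HTML page, which makes it handy for scraping.

Here is an example; you can also download the archive from SourceForge and look at the examples included in it: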

Scraping data with PHP Simple HTML DOM Parser

PHP Simple HTML DOM Parser, written for PHP 5+, lets you manipulate HTML in a very easy way. Because it tolerates invalid HTML, this parser is a better choice than PHP scripts that rely on complicated regexes to extract information from web pages.

Before extracting anything, create a DOM from either a URL or a file. The following script extracts links and images from a website:

PHP code:

// Load the parser once per script (see the note further down)
include 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.microsoft.com/');

// Extract links
foreach ($html->find('a') as $element)
    echo $element->href . '<br>';

// Extract images
foreach ($html->find('img') as $element)
    echo $element->src . '<br>';
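The link and image loops above follow one pattern: run a selector, then read a single attribute from every match. A minimal sketch of that pattern as a reusable helper is shown here. The name `collect_attribute` is hypothetical and not part of the library; the helper only assumes that the object passed in offers the `find()` method used above. It also skips elements whose attribute is missing or empty, and drops duplicates.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): gathers one attribute
// from every element matched by a selector, skipping missing or empty values.
function collect_attribute($dom, string $selector, string $attribute): array
{
    $values = [];
    foreach ($dom->find($selector) as $element) {
        $value = $element->$attribute;
        if (is_string($value) && $value !== '') {
            $values[] = $value;
        }
    }
    // Drop duplicates while keeping first-seen order.
    return array_values(array_unique($values));
}

// Usage sketch; runs only when the parser has been included beforehand.
if (function_exists('str_get_html')) {
    $html = str_get_html('<a href="/a">A</a><a>no href</a><img src="x.png">');
    print_r(collect_attribute($html, 'a', 'href'));
    print_r(collect_attribute($html, 'img', 'src'));
}
```

Collecting values into an array first, instead of echoing inside the loop, makes it easier to filter, sort, or store the results later.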


The parser can also be used to modify HTML elements:

PHP code:

// Create DOM from string
$html = str_get_html('<div id="simple">Simple</div><div>Parser</div>');

$html->find('div', 1)->class = 'bar';
$html->find('div[id=simple]', 0)->innertext = 'Foo';

// Output: <div id="simple">Foo</div><div class="bar">Parser</div>
echo $html;
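Chaining directly on `find('div', 1)` works only when a second `div` really exists; otherwise the assignment is attempted on something that is not an element. A minimal sketch of a guarded version follows. The helper name `set_attribute_at` is hypothetical, and the sketch assumes that `find()` with an index gives back no element object when nothing sits at that position.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): sets an attribute on the
// $index-th match of $selector. Returns false instead of failing when there
// is no element at that position.
function set_attribute_at($dom, string $selector, int $index, string $name, string $value): bool
{
    $element = $dom->find($selector, $index);
    if (!is_object($element)) {
        return false; // nothing matched at that position
    }
    $element->$name = $value;
    return true;
}

// Usage sketch; runs only when the parser has been included beforehand.
if (function_exists('str_get_html')) {
    $html = str_get_html('<div id="simple">Simple</div><div>Parser</div>');
    var_dump(set_attribute_at($html, 'div', 1, 'class', 'bar')); // second div
    var_dump(set_attribute_at($html, 'div', 5, 'class', 'bar')); // no sixth div
    echo $html;
}
```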


Do you wish to retrieve content without any tags?

PHP code:

echo file_get_html('http://www.yahoo.com/')->plaintext;

In the package files of this parser (http://simplehtmldom.sourceforge.net/) you can find some scraping examples for digg, imdb, and slashdot. Let’s create one that extracts the first 10 results (titles only) for the keyword “php” from Google:

PHP code:

$url = 'http://www.google.com/search?hl=en&q=php&btnG=Search';

// Create DOM from URL
$html = file_get_html($url);

// Match all 'a' tags whose class attribute equals 'l'
foreach ($html->find('a[class=l]') as $key => $info)
{
    echo ($key + 1) . '. ' . $info->plaintext . "<br>\n";
}
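The loop above prints every match it finds, so the limit of 10 depends on how many results the page happens to contain. A minimal sketch that enforces the limit explicitly is shown here. The helper name `numbered_titles` and the inline sample markup are assumptions made for illustration; the live Google markup may well no longer use the class `l`, so the selector should be treated as an example only.

```php
<?php
// Hypothetical helper: numbers the first $limit titles, trimming whitespace.
function numbered_titles(array $titles, int $limit = 10): array
{
    $lines = [];
    foreach (array_slice(array_values($titles), 0, $limit) as $i => $title) {
        $lines[] = ($i + 1) . '. ' . trim($title);
    }
    return $lines;
}

// Usage sketch on inline sample markup; runs only when the parser is included.
if (function_exists('str_get_html')) {
    $html = str_get_html('<a class="l">First title</a><a class="l">Second title</a>');
    $titles = [];
    foreach ($html->find('a[class=l]') as $info) {
        $titles[] = $info->plaintext;
    }
    echo implode("\n", numbered_titles($titles)), "\n";
}
```

Keeping the numbering separate from the parsing means the same helper works no matter which selector ends up matching the result titles.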

NOTE: Make sure to include the parser before using any of its functions:

PHP code:

include 'simple_html_dom.php';
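A plain `include` only raises a warning when the file is missing, and the script then fails later at the first parser call. A minimal sketch of a more defensive load is shown here. The helper name `load_parser` and the assumption that the file sits next to the calling script are illustrative only.

```php
<?php
// Hypothetical helper: includes the parser once and reports whether the
// entry points used in this article are available afterwards.
function load_parser(string $path): bool
{
    if (!is_file($path)) {
        return false;
    }
    require_once $path;
    return function_exists('str_get_html') && function_exists('file_get_html');
}

// Assumes the file sits next to this script; adjust the path to your layout.
if (!load_parser(__DIR__ . '/simple_html_dom.php')) {
    echo "Could not load simple_html_dom.php\n";
}
```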

For more information on using this parser, consider checking the ‘PHP Simple HTML DOM Parser’ manual. The package files can be downloaded from the SourceForge project page given above.
