Scraping the content of this page with PHP

程序员文章站 (Programmer Article Station) 2024-01-18 21:30:16

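The parts I need to scrape are marked with red lines.
Only the first page is needed.
Page to scrape: http://www.mafengwo.cn/yj/10206/2-0-1.html
I have always used the simple_html_dom.php class for this, but this time it fails to fetch anything from the page.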

Replies (solutions)

You can extract it with a regular expression.

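$url = "http://www.mafengwo.cn/yj/10206/2-0-1.html";
// HTTP context options: a 10-second timeout plus explicit request headers.
$opts = array(
    'http' => array(
        'timeout' => 10,
        'header' => "User-Agent: php\r\n" .
                    "Cookie: foo=bar\r\n"
    )
);
$context = stream_context_create($opts);
// Returns the page body as a string, or false on failure.
$data = file_get_contents($url, false, $context);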

With this, the page can be read.
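For what it's worth, file_get_contents() returns false when the request fails, so it may help to wrap the call and check the result before parsing. The following is only a minimal sketch along the lines of the snippet above; the helper name fetch_page and its $timeout parameter are made up for illustration and are not from the original reply.

```php
<?php
/**
 * Hypothetical helper: fetch a URL (or a local path) and return its body,
 * or null when the read fails.
 */
function fetch_page(string $url, int $timeout = 10): ?string
{
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeout,
            'user_agent' => 'php', // sent as the User-Agent request header
        ],
    ]);

    // file_get_contents() returns false on failure and raises a warning;
    // "@" silences the warning so the caller only has to check the result.
    $data = @file_get_contents($url, false, $context);

    return $data === false ? null : $data;
}

// Usage (performs a real HTTP request, so it is left commented out here):
// $data = fetch_page('http://www.mafengwo.cn/yj/10206/2-0-1.html');
// if ($data === null) { die('fetch failed'); }
```

The http options are simply ignored for local files, which makes the helper easy to try against a saved copy of the page.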

Quoting: "You can extract it with a regular expression."

I don't know how to write regular expressions.
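Since regular expressions came up, here is a rough sketch of what such an extraction might look like. The sample markup is invented for illustration, as the real page's structure is not shown in this thread, and extract_posts is a hypothetical helper name. Patterns like these are brittle against markup changes, so treat it as a starting point only.

```php
<?php
// Hypothetical sample markup standing in for the fetched page ($data).
$data = <<<HTML
<div class="post-list"><ul>
<li><img src="http://example.com/a.jpg"><h2><a href="/i/1.html"> First trip </a></h2></li>
<li><img src="http://example.com/b.jpg"><h2><a href="/i/2.html">Second trip</a></h2></li>
</ul></div>
HTML;

/**
 * Hypothetical helper: pull the image URL and title out of every <li> block.
 */
function extract_posts(string $html): array
{
    $posts = [];

    // Capture the body of each <li>...</li>; "s" lets "." match newlines,
    // "i" makes tag names case-insensitive, ".*?" keeps the match non-greedy.
    preg_match_all('#<li\b[^>]*>(.*?)</li>#si', $html, $items);

    foreach ($items[1] as $item) {
        // First src="..." attribute of an <img> tag inside the item.
        $img = preg_match('#<img\b[^>]*\bsrc="([^"]*)"#i', $item, $m) ? $m[1] : null;

        // Text of the first <h2>, with nested tags stripped.
        $title = preg_match('#<h2\b[^>]*>(.*?)</h2>#si', $item, $m)
            ? trim(strip_tags($m[1]))
            : null;

        $posts[] = ['img' => $img, 'title' => $title];
    }

    return $posts;
}

$posts = extract_posts($data);
print_r($posts);
```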

No problem. Just pass a stream context and it will work.

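include 'simple_html_dom.php';

// Forward the visitor's browser User-Agent to the target site.
// $_SERVER['HTTP_USER_AGENT'] is only set when the script handles a web request.
$opts = array(
    'http' => array(
        'user_agent' => $_SERVER['HTTP_USER_AGENT']
    )
);
$context = stream_context_create($opts);
$url = 'http://www.mafengwo.cn/yj/10206/2-0-1.html';
$html = file_get_html($url, false, $context);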

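Yes, it works now and the output is the content shown above. But I can't tell the pieces apart, and I don't know how to split the information up with a regular expression.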

Surely it's not that hard?

include 'simple_html_dom.php';

$opts = array(
    'http' => array(
        'user_agent' => $_SERVER['HTTP_USER_AGENT']
    )
);
$context = stream_context_create($opts);
$url = 'http://www.mafengwo.cn/yj/10206/2-0-1.html';
$html = file_get_html($url, false, $context);

// Every <li> under "div.post-list ul" is one entry in the list.
$div = $html->find('div.post-list ul');
foreach ($div[0]->find('li') as $i => $item) {
    echo $item->find('img')[0]->src, PHP_EOL;          // image URL
    echo trim($item->find('h2')[0]->text()), PHP_EOL;  // title
    echo trim($item->find('div')[3]->text()), PHP_EOL; // text of the item's fourth <div>
    //echo '**', $item->innertext(), PHP_EOL;
}
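If you would rather not depend on simple_html_dom at all, roughly the same lookup can probably be done with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is a different technique from the reply above, shown only as a sketch: the sample markup and the parse_post_list name are assumptions, and only the div.post-list selector comes from the code above.

```php
<?php
// Hypothetical sample markup; the real page's HTML is not shown in the thread.
$sample = '<div class="post-list"><ul>'
    . '<li><img src="http://example.com/a.jpg"><h2> First trip </h2></li>'
    . '<li><img src="http://example.com/b.jpg"><h2>Second trip</h2></li>'
    . '</ul></div>';

/**
 * Hypothetical helper: collect image URL and title for each list item.
 */
function parse_post_list(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely strict; collect parser errors quietly.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML() treat input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "div.post-list ul li".
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " post-list ")]//ul/li';

    $rows = [];
    foreach ($xpath->query($query) as $li) {
        $img = $xpath->query('.//img', $li)->item(0);
        $h2  = $xpath->query('.//h2', $li)->item(0);
        $rows[] = [
            'img'   => $img ? $img->getAttribute('src') : null,
            'title' => $h2 ? trim($h2->textContent) : null,
        ];
    }

    return $rows;
}

$rows = parse_post_list($sample);
print_r($rows);
```

Returning an array instead of echoing makes it easier to tell the fields apart afterwards, which was the difficulty raised earlier in the thread.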
