Why can't I fetch any content from this site with file_get_contents?
程序员文章站
2024-02-02 13:59:40
http://www.hdwallpapersimages.com/
The site displays fine in my browser. I first tried file_get_contents and got empty content. Then I used ChinaZ's Baidu-spider and Google-spider simulators to fetch it, and the request still timed out. So I copied my browser's request headers and fetched the page with file_get_contents again, but the result is still empty. Here is my code:
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n".
            "Accept-Encoding:gzip, deflate, sdch\r\n".
            "Accept-Language:zh-CN,zh;q=0.8,en;q=0.6\r\n".
            "Cache-Control:max-age=0\r\n".
            "Cookie:viewed_cookie_policy=yes; __utmt=1; __utma=37938810.875942873.1452954236.1453114091.1453209277.3; __utmb=37938810.30.10.1453209277; __utmc=37938810; __utmz=37938810.1452954236.1.1.utmcsr=bing|utmccn=(organic)|utmcmd=organic|utmctr=hd%20wallpaper; __unam=eb5fde1-1524ad24043-4a580705-62\r\n".
            "Host:www.hdwallpapersimages.com\r\n".
            "Proxy-Connection:keep-alive\r\n".
            "Upgrade-Insecure-Requests:1\r\n".
            "User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36\r\n"
    )
);
$context = stream_context_create($opts);
echo file_get_contents('http://www.hdwallpapersimages.com', false, $context);
Replies:
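Is the site you're trying to fetch failing to open?
I can't open the site either, haha. On the machine where you run the script, try curl directly and see whether it returns any content.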
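Along the same lines as the curl check, it may help to make file_get_contents report why it came back empty instead of failing silently. The following is only a minimal sketch, not a confirmed fix for this site: the helper names fetch_with_diagnostics and parse_status_code, the 10-second timeout, and the shortened User-Agent string are all assumptions made for illustration. It sets a timeout, keeps the body on 4xx/5xx responses, and returns PHP's own error message when the request fails (for example, on a connection timeout).

```php
<?php
// Hypothetical helper: extract the status code of the final response
// from the header lines PHP collects for the http:// wrapper.
function parse_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK"; after redirects there
        // may be several, so keep the last one seen.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical helper: fetch a URL and report why it failed, if it did.
function fetch_with_diagnostics(string $url, float $timeout = 10.0): array
{
    $context = stream_context_create([
        'http' => [
            'method'        => 'GET',
            'timeout'       => $timeout, // seconds; give up instead of hanging
            'ignore_errors' => true,     // still return the body on 4xx/5xx
            // No Accept-Encoding header, so a compressed body is not requested.
            // The User-Agent value is an assumed placeholder.
            'header'        => "User-Agent: Mozilla/5.0 (compatible; fetch-test)\r\n",
        ],
    ]);

    error_clear_last();
    // Suppress the warning here and read it back via error_get_last() below.
    $body = @file_get_contents($url, false, $context);
    // The http wrapper fills $http_response_header in the calling scope
    // (newer PHP versions also offer http_get_last_response_headers()).
    $headers = $http_response_header ?? [];

    return [
        'ok'     => $body !== false,
        'status' => parse_status_code($headers),
        'error'  => $body === false
            ? (error_get_last()['message'] ?? 'unknown error')
            : null,
        'body'   => $body === false ? '' : $body,
    ];
}
```

Calling fetch_with_diagnostics('http://www.hdwallpapersimages.com/') and dumping the 'ok', 'status', and 'error' entries should show whether the request timed out, was refused, or returned an error page. If curl on the same machine also returns nothing, the problem is most likely the network path or the site itself rather than the request headers.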