正则表达式:获取京东的 列表和详情-2019-10-18作业
程序员文章站
2022-03-20 23:10:15
...
结果图:
代码:
<?php $szUrl = "https://search-x.jd.com/Search?callback=jQuery6137026&area=16&enc=utf-8&keyword=%E5%A5%B3%E5%A3%AB%E7%9D%A1%E8%A1%A3&adType=7&page=1&ad_ids=291%3A33&xtest=new_search&_=1571796452352"; $UserAgent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 3.5.21022; .NET CLR 1.0.3705; .NET CLR 1.1.4322)'; $curl = curl_init(); curl_setopt($curl, CURLOPT_URL, $szUrl); curl_setopt($curl, CURLOPT_HEADER, 0); //0表示不输出Header,1表示输出 curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false); curl_setopt($curl, CURLOPT_ENCODING, ''); curl_setopt($curl, CURLOPT_USERAGENT, $UserAgent); curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); $content = curl_exec($curl); //改接口shop_name标识 $start='/color/'; $title='yanse'; $a=preg_replace($start,$title,$content); //改接口pc_price标识 $start='/pc_price/'; $title='jiage'; $result=preg_replace($start,$title,$a); $substr = substr($result, 21, -2); //对 JSON 格式的字符串进行解码,转换为 PHP 数组 true为数组,false为对象 $result = json_decode($substr, true); echo '<pre>'.print_r($result,true) ;
总结:
算是没用正则写吧,直接抓接口,接口换浏览器测试过了也不需要拼接。起码现在学会了抓接口。空余时间再直接抓页面获取指定区域的内容试试