Scraping Ganji.com (赶集网) listings with Python and bs4
2022-10-03 13:13:56
The complete script is shown below.
import csv

import requests
from bs4 import BeautifulSoup

# Send a mobile browser User-Agent with the request
headers = {"User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Mobile Safari/537.36"}
url = "http://bj.ganji.com/zufang/f4/pnl/"

# Fetch the listing page and parse it with the lxml parser
r = requests.get(url, headers=headers).text
soup = BeautifulSoup(r, "lxml")

# Each rental listing is a <dl> element inside the list container <div>
items = soup.find("div", class_="f-list js-tips-list").find_all("dl", class_="f-list-item-wrap min-line-height f-clear")

rows = []  # renamed from `list` to avoid shadowing the built-in
for i in items:
    name = i.find("dd", class_="dd-item title").a.string
    struct = i.find("dd", class_="dd-item size").span.text
    price = i.find("dd", class_="dd-item info").find("span", class_="num").string
    rows.append([name, struct, price])

# Write one CSV row per listing: title, layout, price
with open("赶集网.csv", "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f)
    w.writerows(rows)
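The script writes whatever bs4 returns straight into the CSV. In Beautiful Soup, `.string` returns None when a tag has more than one child, and the extracted text often carries surrounding whitespace. The sketch below is one possible way to tidy the rows before writing them. It is not part of the original post: the column names, the `clean` and `write_listings` helpers, and the sample rows are all made up for illustration.

```python
import csv
import io

# Hypothetical column names; the original script writes no header row.
FIELDNAMES = ["name", "layout", "price"]


def clean(value):
    """Return stripped text, or an empty string when the value is None."""
    return value.strip() if value is not None else ""


def write_listings(rows, fileobj):
    """Write a header plus cleaned rows to fileobj; return how many rows were written."""
    writer = csv.writer(fileobj)
    writer.writerow(FIELDNAMES)
    count = 0
    for row in rows:
        cleaned = [clean(cell) for cell in row]
        # Skip listings where every field is empty (nothing was extracted).
        if not any(cleaned):
            continue
        writer.writerow(cleaned)
        count += 1
    return count


# Made-up sample data standing in for the scraped [name, struct, price] rows.
sample = [
    ["  Sunny two-bedroom near subway ", "2室1厅", "4500"],
    [None, None, None],
    ["Studio", None, "2800"],
]

buffer = io.StringIO(newline="")
written = write_listings(sample, buffer)
print(written)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with newline="" as in the script above. If the CSV is meant to be opened in Excel, encoding="utf-8-sig" may be worth trying, since it adds a byte order mark that Excel uses to detect UTF-8.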
Original article: https://blog.csdn.net/sober_1/article/details/109640200