Python 3 crawling: fetching a web page with urllib while carrying a cookie
程序员文章站
2022-04-13 15:33:54
As shown below:
import urllib.request

url = 'https://weibo.cn/5273088553/info'

# Normal access, without a cookie:
# headers = {
#     'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'
# }

# Access carrying a cookie:
headers = {
    'Host': 'weibo.cn',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    # 'Referer': 'https://weibo.cn/',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cookie': '_t_wm=c1913301844388de10cba9d0bb7bbf1e; sub=_2a253wy_dderhgenm7fer-cbjzj-ihxvup7gvrdv6pujbkdanlxpdkw1nsespjz6v1ga5myw2heub9ytqw3nyy19u; suhb=0bt8spepegz439; scf=aua-hpsw5-z78-02nmuv8ctwxzcmn4xj91qyshkdxh4w9w0fcbpei6hy5e6vobedqtxtfqobcd2d32r0o_5jsrk.; ssologinstate=1516199821',
}

request = urllib.request.Request(url=url, headers=headers)
response = urllib.request.urlopen(request)

# Print the whole page:
# print(response.read().decode('gbk'))

# Write the content to a file:
with open('weibo.html', 'wb') as fp:
    fp.write(response.read())
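The code above pastes the session cookie into the request headers by hand. As an alternative sketch (not from the original article), the standard library can also manage cookies automatically via http.cookiejar: an empty jar is attached to an opener, and any Set-Cookie headers from responses opened through it are stored and resent on later requests to the same site. The User-Agent string below is a placeholder assumption, and the actual network request is left commented out.

```python
import http.cookiejar
import urllib.request

# Sketch: let urllib collect and resend cookies automatically instead of
# hard-coding a Cookie header. The jar starts empty; cookies set by
# responses opened through this opener are stored in it.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
# Placeholder User-Agent, not taken from the article:
opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64)')]

# Performing the request needs network access, so it stays commented here:
# response = opener.open('https://weibo.cn/5273088553/info')
# with open('weibo.html', 'wb') as fp:
#     fp.write(response.read())

print(len(jar))  # no request has been made yet, so the jar holds 0 cookies
```

Note that a page requiring a logged-in session (like the Weibo profile above) still needs either a login flow through this opener or a manually supplied cookie; the jar only automates storage and resending.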
That is the whole of this piece on fetching a web page with urllib while carrying a cookie in Python 3. I hope it serves as a useful reference, and thank you for your continued support.