
Python 3 crawling: fetching a page with urllib while carrying a cookie

程序员文章站 2022-04-13 15:33:54

As shown below:

import urllib.request
import urllib.parse

url = 'https://weibo.cn/5273088553/info'

# Plain request, without a cookie:
# headers = {
#     'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'
# }

# Request carrying a cookie (headers copied from the browser's developer tools)
headers = {
    'Host': 'weibo.cn',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    # 'Referer': 'https://weibo.cn/',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cookie': '_t_wm=c1913301844388de10cba9d0bb7bbf1e; sub=_2a253wy_dderhgenm7fer-cbjzj-ihxvup7gvrdv6pujbkdanlxpdkw1nsespjz6v1ga5myw2heub9ytqw3nyy19u; suhb=0bt8spepegz439; scf=aua-hpsw5-z78-02nmuv8ctwxzcmn4xj91qyshkdxh4w9w0fcbpei6hy5e6vobedqtxtfqobcd2d32r0o_5jsrk.; ssologinstate=1516199821',
}

request = urllib.request.Request(url=url, headers=headers)
response = urllib.request.urlopen(request)

# Print the whole response:
# print(response.read().decode('utf-8'))

# Or write the page to a file:
with open('weibo.html', 'wb') as fp:
    fp.write(response.read())
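Copying the `Cookie` header by hand works, but it goes stale once the session expires. As an alternative sketch (not part of the original article), the standard library's `http.cookiejar` can manage cookies automatically: an opener built with `HTTPCookieProcessor` stores every cookie the server sets and re-sends it on subsequent requests through the same opener.

```python
import urllib.request
import http.cookiejar

# A CookieJar records cookies from Set-Cookie response headers and
# attaches them to later requests made through the same opener.
cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar)
)

# Default headers sent with every request from this opener.
opener.addheaders = [
    ('User-Agent',
     'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
     '(KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'),
]

# Cookies set by any response (e.g. after a login request) are now
# carried automatically on follow-up requests:
# response = opener.open('https://weibo.cn/5273088553/info')
# with open('weibo.html', 'wb') as fp:
#     fp.write(response.read())
```

This avoids pasting a cookie string at all: perform the login request through `opener` once, and the jar keeps the session alive for the crawl that follows.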

That is the whole method for fetching a page with urllib while carrying a cookie in Python 3. I hope it gives you a useful reference, and thank you for your continued support.