A Detailed Tutorial on the Python requests Library, with Usage Examples
Preface
A while ago I wrote a crawler that scrapes the danmaku (bullet comments) of a Bilibili video (张大仙弹幕统计,仙友们来看看有没有你的弹幕吧!). Lately I have been writing a lot of network-programming scripts, so I decided to study the commonly used parts of the requests library properly. Also, the Chinese translation of the official documentation leaves much to be desired; I recommend reading the English version.
Before reading this article you should be familiar with the HTTP and HTTPS protocols. If you are not, see: 网络-http协议学习笔记(消息结构、请求方法、状态码等)
Installation
pip install requests
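To confirm that the installation worked, you can import the library and print its version; a quick check, nothing here is specific to this tutorial:

import requests

print(requests.__version__)  # e.g. 2.23.0, the version the examples below were run with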
Requests
GET request
Function
get(url, params=None, **kwargs)
Parameters
url: the URL to request
params: the query parameters to send with the request
Code
def Get():
    # query-string parameters to append to the URL
    payload = {"lady": "killer", "key": "9"}
    # pretend to be a regular browser instead of python-requests
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko)'
                      ' Chrome/52.0.2743.116 Safari/537.36'
    }
    # URL_GET = "http://httpbin.org/get" (defined in the full listing below)
    req = requests.get(url=URL_GET, params=payload, headers=headers)
    print("url:", req.url)
    print("status code:", req.status_code)
    print("cookies", req.cookies)
    print("text:", req.text)
    print("json:", req.json())
    req.close()
Result
url: http://httpbin.org/get?lady=killer&key=9
status code: 200
cookies <RequestsCookieJar[]>
text: {
"args": {
"key": "9",
"lady": "killer"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Host": "httpbin.org",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
"X-Amzn-Trace-Id": "Root=1-5f8450e4-30d63b07549e5df637b48d4d"
},
"origin": "59.64.129.128",
"url": "http://httpbin.org/get?lady=killer&key=9"
}
json: {'args': {'key': '9', 'lady': 'killer'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36', 'X-Amzn-Trace-Id': 'Root=1-5f8450e4-30d63b07549e5df637b48d4d'}, 'origin': '59.64.129.128', 'url': 'http://httpbin.org/get?lady=killer&key=9'}
The URL actually requested already has the parameters appended; in other words, instead of passing params you could simply write the query string into the URL yourself.
For the meaning of the status codes, see the article linked above.
text is the response body as a string.
The JSON format is the one you will use most often; json() parses the response body into a Python dict.
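A minimal sketch of these points, using the same httpbin.org endpoint as above: passing params produces exactly the URL you would get by writing the query string by hand, and json() turns the body into a dict.

import requests

with_params = requests.get("http://httpbin.org/get", params={"lady": "killer", "key": "9"})
by_hand = requests.get("http://httpbin.org/get?lady=killer&key=9")

print(with_params.url)          # http://httpbin.org/get?lady=killer&key=9
print(by_hand.url)              # the same URL, written out by hand
print(with_params.status_code)  # 200 on success
data = with_params.json()       # parse the JSON body into a dict
print(data["args"])             # {'lady': 'killer', 'key': '9'}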
POST request
Function
post(url, data=None, json=None, **kwargs)
url: the URL to request
data: the data to send in the request body; a dict, list of tuples, bytes, or a file-like object
json: a JSON-serializable object to send as the request body (a sketch of this is shown after the results below)
Code
def Post():
    # form fields to send in the request body
    data = {'bupt': 'beijing university of posts and communication', 'age': '65'}
    # URL_POST = "http://httpbin.org/post" (defined in the full listing below)
    req = requests.post(URL_POST, data=data)
    print("url:", req.url)
    print("status code:", req.status_code)
    print("cookies", req.cookies)
    print("text:", req.text)
    print("json:", req.json())
    req.close()
Result
url: http://httpbin.org/post
status code: 200
cookies <RequestsCookieJar[]>
text: {
"args": {},
"data": "",
"files": {},
"form": {
"age": "65",
"bupt": "beijing university of posts and communication"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "57",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.23.0",
"X-Amzn-Trace-Id": "Root=1-5f845f63-5103e9761cd94b3b1c9905f3"
},
"json": null,
"origin": "59.64.129.128",
"url": "http://httpbin.org/post"
}
json: {'args': {}, 'data': '', 'files': {}, 'form': {'age': '65', 'bupt': 'beijing university of posts and communication'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Content-Length': '57', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.23.0', 'X-Amzn-Trace-Id': 'Root=1-5f845f63-5103e9761cd94b3b1c9905f3'}, 'json': None, 'origin': '59.64.129.128', 'url': 'http://httpbin.org/post'}
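The example above sends form-encoded data, which is why httpbin echoes it back under "form" and the Content-Type is application/x-www-form-urlencoded. The json parameter from the list above is the alternative: requests serializes the object to JSON and sets Content-Type to application/json for you. A minimal sketch, again against the httpbin.org test endpoint:

import requests

payload = {'bupt': 'beijing university of posts and communication', 'age': '65'}
req = requests.post("http://httpbin.org/post", json=payload)  # body is JSON, not a form
print(req.request.headers['Content-Type'])  # application/json
print(req.json()['json'])                   # httpbin echoes the parsed JSON body back
req.close()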
Advanced
Sometimes a request only succeeds if you are logged in, and the account information is usually carried in cookies.
Setting cookies
You can obtain the cookies by capturing packets,
or by pressing F12 and typing document.cookie in the browser console (a sketch for turning that string into a dict follows the example below).
Code
def Get_with_cookie():
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko)'
                      ' Chrome/52.0.2743.116 Safari/537.36'
    }
    # cookies copied from the browser; the leading spaces in some keys come from
    # splitting document.cookie on ';' without stripping them
    cookies = {'BIDUPSID': 'BE00784DEDAD866ECD4C82326FD62EAB', ' PSTM': '1554688108', ' MCITY': '-%3A',
               ' ispeed_lsm': '0', ' BAIDUID': '1588105DA2B8508F2F85540F7FC96277:FG', ' H_PS_PSSID': '',
               ' BDRCVFR[NPt2Vg_wYt_]': 'mk3SLVN4HKm', ' delPer': '0', ' BD_CK_SAM': '1', ' PSINO': '2',
               ' BD_UPN': '13314752', ' sug': '3', ' sugstore': '0', ' ORIGIN': '0', ' bdime': '0',
               ' H_PS_645EC': '6c6bmTTznHXowtzIVMcI9ybyItnOQcEc5yoROOp7O7Q4a0FQWI7CHETQrdW05P4yg3zuSFV3',
               ' BA_HECTOR': '24a4800l810k25k1ur1fo8obe0j', ' BDORZ': 'FFFB88E999055A3F8A630C64834BD6D0'}
    req = requests.get(URL_COOKIE, headers=headers, cookies=cookies)
    print("url:", req.url)
    print("status code:", req.status_code)
    print("cookies", req.cookies)
    # save the search result page so it can be opened in a browser
    with open('ask_baidu_requets.html', 'w', encoding='utf-8') as f:
        f.write(req.text)
    req.close()
Console output
url: https://www.baidu.com/s?tn=02003390_hao_pg&ie=utf-8&wd=requests
status code: 200
cookies <RequestsCookieJar[<Cookie BAIDUID=1588105DA2B8508F2F85540F7FC96277:FG=1 for .baidu.com/>, <Cookie BDRCVFR[NPt2Vg_wYt_]=mk3SLVN4HKm for .baidu.com/>, <Cookie H_PS_PSSID= for .baidu.com/>, <Cookie PSINO=2 for .baidu.com/>, <Cookie delPer=0 for .baidu.com/>, <Cookie BDSVRTM=12 for www.baidu.com/>, <Cookie BD_CK_SAM=1 for www.baidu.com/>]>
Generated file: the code above saves the response body (the Baidu search result page for "requests") to ask_baidu_requets.html.
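If you copied the cookies as one long string from document.cookie, you still have to turn that string into a dict before passing it to requests. A minimal sketch of one way to do that, combined with requests.Session, which keeps cookies (and headers) across requests automatically; the cookie string below is an illustrative placeholder, not real account data:

import requests

# placeholder copied from document.cookie - not real credentials
raw = "BAIDUID=1588105DA2B8508F2F85540F7FC96277:FG; delPer=0; PSINO=2"

# split "name=value; name=value" into a dict, stripping the stray spaces
cookies = dict(pair.strip().split("=", 1) for pair in raw.split(";") if "=" in pair)

session = requests.Session()
session.cookies.update(cookies)                        # send these cookies on every request
session.headers.update({'User-Agent': 'Mozilla/5.0'})

resp = session.get("https://www.baidu.com/s?tn=02003390_hao_pg&ie=utf-8&wd=requests")
print(resp.status_code)
# any Set-Cookie headers from the server are stored in session.cookies and
# will be sent automatically on the next session.get() / session.post()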
Timeout setting
The timeout parameter
The timeout only limits how long requests waits while connecting and while waiting for the server's response data; it does not bound the total time of the whole download. It can be a single number of seconds or a (connect, read) tuple.
Code
def Get_with_timeout():
    # 0.001 s is far too short for a real connection, so this raises an exception
    req = requests.get('http://github.com', timeout=0.001)
    print(req.text)
Result
With a timeout of 0.001 seconds the connection cannot be established in time, so the call raises a requests.exceptions.ConnectTimeout exception.
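In a real script you normally catch the exception instead of letting it crash the program, and use a more realistic limit. A minimal sketch with a (connect, read) tuple:

import requests

try:
    # up to 3 seconds to connect, up to 10 seconds to wait for data
    req = requests.get('http://github.com', timeout=(3, 10))
    print(req.status_code)
except requests.exceptions.Timeout:
    # covers both ConnectTimeout and ReadTimeout
    print("the request timed out")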
Full code
"""
--coding:utf-8--
@File: learnrequests.py
@Author:frank yu
@DateTime: 2020.10.12 17:10
@Contact: frankyu112058@gmail.com
@Description:
"""
import requests
URL_GET = "http://httpbin.org/get"
URL_POST = "http://httpbin.org/post"
URL_COOKIE = "https://www.baidu.com/s?tn=02003390_hao_pg&ie=utf-8&wd=requests"
def Get():
payload = {"lady": "killer", "key": "9"}
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko)\
Chrome/52.0.2743.116 Safari/537.36'
}
req = requests.get(url=URL_GET, params=payload, headers=headers)
print("url:", req.url)
print("状态码:", req.status_code)
print("cookies", req.cookies)
print("text:", req.text)
print("json:", req.json())
req.close()
def Post():
data = {'bupt': 'beijing university of posts and communication', 'age': '65'}
req = requests.post(URL_POST, data=data)
print("url:", req.url)
print("状态码:", req.status_code)
print("cookies", req.cookies)
print("text:", req.text)
print("json:", req.json())
req.close()
def Get_with_cookie():
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko)\
Chrome/52.0.2743.116 Safari/537.36'
}
cookies = {'BIDUPSID': 'BE00784DEDAD866ECD4C82326FD62EAB', ' PSTM': '1554688108', ' MCITY': '-%3A',
' ispeed_lsm': '0', ' BAIDUID': '1588105DA2B8508F2F85540F7FC96277:FG', ' H_PS_PSSID': '',
' BDRCVFR[NPt2Vg_wYt_]': 'mk3SLVN4HKm', ' delPer': '0', ' BD_CK_SAM': '1', ' PSINO': '2',
' BD_UPN': '13314752', ' sug': '3', ' sugstore': '0', ' ORIGIN': '0', ' bdime': '0',
' H_PS_645EC': '6c6bmTTznHXowtzIVMcI9ybyItnOQcEc5yoROOp7O7Q4a0FQWI7CHETQrdW05P4yg3zuSFV3',
' BA_HECTOR': '24a4800l810k25k1ur1fo8obe0j', ' BDORZ': 'FFFB88E999055A3F8A630C64834BD6D0'}
req = requests.get(URL_COOKIE, headers=headers, cookies=cookies)
print("url:", req.url)
print("状态码:", req.status_code)
print("cookies", req.cookies)
with open('ask_baidu_requets.html', 'w', encoding='utf-8') as f:
f.write(req.text)
req.close()
def Get_with_timeout():
req = requests.get('http://github.com', timeout=0.001)
print(req.text)
if __name__ == "__main__":
# Get()
# Post()
# Get_with_cookie()
Get_with_timeout()
For more Python-related content, see: 【python总结】python学习框架梳理
Original article: https://blog.csdn.net/lady_killer9/article/details/109032337