Python Web Scraping Series 2: Introduction to response and parse

程序员文章站 2022-07-07 19:12:25

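I. Parsing the response

The object returned by urlopen provides the following methods:

(1) geturl: returns the URL of the page that was retrieved

(2) info: returns the meta-information of the response, such as its headers

(3) getcode: returns the HTTP status code of the response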

from urllib import request

# Inspect the response object returned by urlopen

if __name__ == "__main__":

    url = "https://www.baidu.com"

    rsp = request.urlopen(url)

    print("url:{0}".format(rsp.geturl()))  # URL of the page

    print("================")

    print("info:{0}".format(rsp.info()))  # response header information

    print("================")

    print("code:{0}".format(rsp.getcode()))  # status code returned for the request
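The header information returned by info() usually includes a Content-Type header, which often names the character set needed to decode the page body. Below is a minimal sketch, assuming the header value is already available as a string. The helper names charset_from_content_type and decode_body are hypothetical, and the sample header values are made up for illustration.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present
    return msg.get_content_charset() or default


def decode_body(raw, content_type):
    """Decode response bytes using the charset from the Content-Type value."""
    charset = charset_from_content_type(content_type)
    return raw.decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical header value and body, not taken from a real response
    print(charset_from_content_type("text/html; charset=GBK"))
    print(decode_body("中文".encode("gbk"), "text/html; charset=GBK"))
```

For an HTTP response, rsp.info() is an http.client.HTTPMessage, which offers the same get_content_charset() method directly, so the helper is only needed when working from a raw header string.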

II. parse

1. Using request.data

There are two ways to access the network:

(1) GET (2) POST
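With urllib, which of the two methods is used depends on whether a request body is supplied. The sketch below only builds Request objects and does not send them. The URL and the form field kw are placeholders chosen for illustration.

```python
from urllib import parse, request

url = "http://www.example.com/search"  # placeholder address, never contacted

# No data: urllib treats the request as a GET
get_req = request.Request(url)

# data must be bytes, so the dict is URL-encoded and then encoded to bytes
form = parse.urlencode({"kw": "python"}).encode("utf-8")
post_req = request.Request(url, data=form)

print(get_req.get_method())
print(post_req.get_method())
```

Passing either object to request.urlopen would send the request with the method shown.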

2. urllib.parse is used to parse URLs

 

from urllib import request, parse

# Build a search URL with parse.urlencode and fetch the page

if __name__ == "__main__":

    url = "http://www.baidu.com/s?"

    wd = input("input your keyword:")

    # To pass data, put it in a dict

    qs = {

        "wd": wd

    }

    # Convert the dict to a URL-encoded query string

    qs = parse.urlencode(qs)  # encode the keyword

    fullurl = url + qs  # Baidu search takes the base address plus the encoded keyword

    print(fullurl)

    rsp = request.urlopen(fullurl)

    html = rsp.read()

    html = html.decode()  # decode the bytes; decode() defaults to UTF-8

    print(html)
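Besides encoding a query string, urllib.parse can also take a URL apart again. Here is a short sketch of the round trip. The keyword is a fixed sample value rather than user input, and nothing is sent over the network.

```python
from urllib import parse

base = "http://www.baidu.com/s?"

# Encode a sample keyword that contains a space and non-ASCII characters
qs = parse.urlencode({"wd": "python 爬虫"})
fullurl = base + qs
print(fullurl)

# Split the URL into its components
parts = parse.urlparse(fullurl)
print(parts.netloc)
print(parts.path)

# Decode the query string back into a dict of lists
params = parse.parse_qs(parts.query)
print(params["wd"])
```

parse_qs returns a list for every key because a query string may repeat the same parameter.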

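III. Source code

reptile2_reposeanlysis.py

https://github.com/ruigege66/pythonreptile/blob/master/reptile2_reposeanlysis.py

2. CSDN: https://blog.csdn.net/weixin_44630050 (心悦君兮君不知-睿)

3. cnblogs (博客园): https://www.cnblogs.com/ruigege0000/

4. You are welcome to follow the WeChat official account 傅里叶变换 (Fourier Transform), a personal account used only for learning and exchange; reply "礼包" in the account's chat to get big-data learning materials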
