python3 spider 02: getting the URL, headers, and status of an HTML response

程序员文章站 2022-05-08 10:56:21
# coding: utf-8
from urllib import request

if __name__ == '__main__':
    req = request.Request('http://www.csdn.net')
    response = request.urlopen(req)
    # Read the final URL of the response (after any redirects)
    url = response.geturl()
    print(url)
    # Read the response headers
    info = response.info()
    print(info)
    # Read the HTTP status code
    code = response.getcode()
    print(code)
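
The script above prints the whole header block at once. When only one field is needed, the object returned by `response.info()` (also available as `response.headers`) can be queried by name. The following is a minimal sketch, not part of the original post: `DemoHandler`, `fetch_metadata`, and `run_header_demo` are hypothetical names, and a throwaway local server stands in for a real site so the example runs without network access.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler standing in for a real site."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_metadata(url, timeout=10):
    """Return (final_url, status, content_type, charset) for a GET request."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = request.build_opener(request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as response:
        headers = response.headers  # the same object that info() returns
        return (
            response.geturl(),
            response.status,
            headers.get("Content-Type"),     # lookup is case-insensitive
            headers.get_content_charset(),   # charset parsed from Content-Type
        )


def run_header_demo():
    """Start the local server on a free port, fetch once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_metadata("http://127.0.0.1:%d/" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_header_demo())
```

Against a real site, `fetch_metadata('http://www.csdn.net')` would be called directly; the local server exists only to keep the sketch self-contained.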

Output

---------- run python ----------
https://www.csdn.net/


Server: openresty
Date: Thu, 08 Nov 2018 16:38:59 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
Vary: Accept-Encoding
Set-Cookie: uuid_tt_dd=10_37029271700-1541695139580-132031; Expires=Thu, 01 Jan 2025 00:00:00 GMT; Path=/; Domain=.csdn.net;
Set-Cookie: dc_session_id=10_1541695139580.760940; Expires=Thu, 01 Jan 2025 00:00:00 GMT; Path=/; Domain=.csdn.net;
Vary: Accept-Encoding
Set-Cookie: uuid_tt_dd=9797795953525243556_20181109; expires=Sun, 05-Nov-2028 16:38:59 GMT; Max-Age=315360000; path=/; domain=csdn.net
Strict-Transport-Security: max-age=864000


200

Output completed (1 sec consumed) - Normal Termination
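
Note that the request went to `http://www.csdn.net` while the first printed line is `https://www.csdn.net/`. This is consistent with `urlopen` following a redirect, in which case `geturl()` reports the address of the final response rather than the one requested. The sketch below reproduces that behavior under stated assumptions: `RedirectHandler`, `final_url_and_status`, `run_redirect_demo`, and the `/old` and `/new` paths are all hypothetical, and a local server is used instead of the real site.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old redirects to /new, anything else returns 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def final_url_and_status(url):
    """Open url, following redirects, and return (final_url, status)."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = request.build_opener(request.ProxyHandler({}))
    with opener.open(url, timeout=10) as response:
        return response.geturl(), response.status


def run_redirect_demo():
    """Serve on a free local port, request /old, and report where we ended up."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, final_url_and_status(base + "/old")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, (url, status) = run_redirect_demo()
    print(url, status)
```

The status returned is that of the final response, so the intermediate 302 is not visible to the caller by default.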