Building IP Proxies and an IP Proxy Pool
# IP proxies and building an IP proxy pool
# Overview: using an IP proxy means letting the crawler fetch the target site through a proxy IP instead of its own address
# Proxy IPs can be obtained from services such as Xici proxy or Daxiang proxy; with Daxiang proxy, choose IPs outside mainland China (paid)
# Hands-on: setting up a single proxy IP
# import urllib.request
# ip="120.83.104.222:6675"   # a foreign IP works best
# proxy=urllib.request.ProxyHandler({"http":ip})
# print(proxy)
# opener=urllib.request.build_opener(proxy,urllib.request.HTTPHandler)
# urllib.request.install_opener(opener)
# url="http://www.baidu.com"
# data=urllib.request.urlopen(url).read().decode("utf-8","ignore")
# print(data)
# fh=open("D:\\pythonprojects\\result\\666.html","w",encoding="utf-8")
# fh.write(data)
# fh.close()
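# (A minimal sketch, not from the original article: to confirm traffic really
#  goes through the proxy, request an IP-echo endpoint such as
#  http://httpbin.org/ip and check that the reported address matches the proxy IP.)
# import urllib.request
# ip="120.83.104.222:6675"   # example proxy, same placeholder as above
# proxy=urllib.request.ProxyHandler({"http":ip})
# opener=urllib.request.build_opener(proxy,urllib.request.HTTPHandler)
# urllib.request.install_opener(opener)
# print(urllib.request.urlopen("http://httpbin.org/ip").read().decode("utf-8","ignore"))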
# First way to build an IP proxy pool (suitable when the proxy IPs are stable): pick one at random for each request
import random
import urllib.request
ippools=[
'112.95.190.126:9999',
'113.247.252.114:9090',
'61.128.208.94:3128',
]
def ip(ippools):
    # pick a random proxy from the pool and install it as the global opener
    thisip=random.choice(ippools)
    print(thisip)
    proxy=urllib.request.ProxyHandler({"http":thisip})
    opener=urllib.request.build_opener(proxy,urllib.request.HTTPHandler)
    urllib.request.install_opener(opener)
# ip(ippools)
# url="http://www.baidu.com"
# data=urllib.request.urlopen(url).read()
# #.decode("utf-8","ignore")
# print(len(data))
for i in range(0,3):
    try:   # proxy IPs can go stale, so catch exceptions and move on
        ip(ippools)
        url="http://www.baidu.com"
        data=urllib.request.urlopen(url).read()
        # .decode("utf-8","ignore") is not needed here since we write raw bytes
        print(len(data))
        fh=open("D:\\pythonprojects\\result\\"+str(i)+".html",'wb')
        fh.write(data)
        fh.close()
    except Exception as err:
        print(err)
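
# (A sketch added for illustration, not part of the original article: since
#  proxies go stale, the pool can be filtered down to the IPs that still answer
#  a quick test request before the crawl starts.)
def check_pool(ippools):
    alive=[]
    for thisip in ippools:
        try:
            proxy=urllib.request.ProxyHandler({"http":thisip})
            opener=urllib.request.build_opener(proxy,urllib.request.HTTPHandler)
            # use the opener directly instead of installing it globally
            opener.open("http://www.baidu.com",timeout=5).read()
            alive.append(thisip)
        except Exception as err:
            print(thisip,"failed:",err)
    return alive

# ippools=check_pool(ippools)   # refresh the pool before crawling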