Python Crawler: Scraping Job Postings and Storing Them in a Database
I recently picked up Selenium, so I gave Tencent's job listings (hr.tencent.com) a quick go and saved the postings into a MySQL database.
from lxml import etree
from selenium import webdriver
import pymysql


def geturl(fullurl):
    # Load one listing page and collect the link to each job posting on it.
    browser.get(fullurl)
    shouye_html_text = browser.page_source
    shouye_ele = etree.HTML(shouye_html_text)
    # The href attributes are relative, so prepend the site root.
    zp_list = shouye_ele.xpath('//*[@id="position"]/div[1]/table/tbody/tr/td/a/@href')
    zp_url_list = []
    for zp_url_tail in zp_list:
        zp_url = 'https://hr.tencent.com/' + zp_url_tail
        zp_url_list.append(zp_url)
    return zp_url_list


def getinfo(zp_url_list):
    # Open each posting, extract its details, and write one row into MySQL.
    for zp_url in zp_url_list:
        browser.get(zp_url)
        zp_info_html = browser.page_source
        zp_ele = etree.HTML(zp_info_html)
        zp_info_title = str(zp_ele.xpath('//*[@id="sharetitle"]/text()')[0])
        zp_info_location = str(zp_ele.xpath('//*[@id="position_detail"]/div/table/tbody/tr[2]/td[1]/text()')[0])
        zp_info_type = str(zp_ele.xpath('//*[@id="position_detail"]/div/table/tbody/tr[2]/td[2]/text()')[0])
        zp_info_num = str(zp_ele.xpath('//*[@id="position_detail"]/div/table/tbody/tr[2]/td[3]/text()')[0])
        # The requirements are a list of <li> texts; str() stores the whole list as one string.
        zp_info_need = str(zp_ele.xpath('//*[@id="position_detail"]/div/table/tbody/tr[3]/td/ul/li/text()'))
        connection = pymysql.connect(host='localhost', user='root', password='1234', db='txzp')
        try:
            with connection.cursor() as cursor:
                sql = "insert into `txzp_info` (`title`, `location`, `type`, `num`, `need`) values (%s, %s, %s, %s, %s)"
                cursor.execute(sql, (zp_info_title, zp_info_location, zp_info_type, zp_info_num, zp_info_need))
            connection.commit()
        finally:
            connection.close()
        print(zp_info_title, zp_info_location, zp_info_type, zp_info_num, zp_info_need)


if __name__ == '__main__':
    browser = webdriver.Chrome()
    pages = int(input('How many pages? '))
    for i in range(0, pages):
        # Each listing page shows 10 postings; the start parameter is the offset.
        url = 'https://hr.tencent.com/position.php?keywords=&tid=0&start={}'
        fullurl = url.format(str(i * 10))
        zp_url_list = geturl(fullurl)
        getinfo(zp_url_list)
    browser.close()
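The INSERT above assumes a MySQL database named txzp with a txzp_info table already in place. The original post does not show the schema, so the following one-off setup script is only a sketch: the column names come from the INSERT statement, while the types are my assumptions (anything wide enough to hold the scraped strings works).

import pymysql

# One-off setup sketch: create the txzp database and txzp_info table used by the crawler.
# Column names match the INSERT statement above; the types are assumed, not from the article.
connection = pymysql.connect(host='localhost', user='root', password='1234')
try:
    with connection.cursor() as cursor:
        cursor.execute("CREATE DATABASE IF NOT EXISTS txzp DEFAULT CHARACTER SET utf8mb4")
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS txzp.txzp_info (
                `id` INT AUTO_INCREMENT PRIMARY KEY,
                `title` VARCHAR(255),
                `location` VARCHAR(100),
                `type` VARCHAR(100),
                `num` VARCHAR(50),
                `need` TEXT
            ) DEFAULT CHARACTER SET utf8mb4
        """)
    connection.commit()
finally:
    connection.close()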
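One detail worth flagging: getinfo opens and closes a new MySQL connection for every posting, which works but reconnects far more often than needed. A common refactor is to open the connection once in the main block and pass it in; the sketch below assumes getinfo is changed to accept the connection as a second parameter and drops its own connect/close calls.

# Sketch: reuse one connection for the whole run instead of reconnecting per posting.
# Assumes getinfo(zp_url_list, connection) uses the passed-in connection internally.
if __name__ == '__main__':
    browser = webdriver.Chrome()
    connection = pymysql.connect(host='localhost', user='root', password='1234', db='txzp')
    try:
        pages = int(input('How many pages? '))
        for i in range(0, pages):
            fullurl = 'https://hr.tencent.com/position.php?keywords=&tid=0&start={}'.format(i * 10)
            getinfo(geturl(fullurl), connection)
    finally:
        connection.close()
        browser.close()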