Scraping the lowest temperature of every city from China Weather (weather.com.cn) with Python and charting the ten lowest with matplotlib
程序员文章站
2022-03-22 20:48:28
import requests
import lxml
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
url1 = 'http://www.weather.com.cn/textFC/hb.shtml'
url2 = 'http://www.weather.com.cn/textFC/db.shtml'
url3 = 'http://www.weather.com.cn/textFC/hd.shtml'
url4 = 'http://www.weather.com.cn/textFC/hz.shtml'
url5 = 'http://www.weather.com.cn/textFC/hn.shtml'
url6 = 'http://www.weather.com.cn/textFC/xb.shtml'
url7 = 'http://www.weather.com.cn/textFC/xn.shtml'
url8 = 'http://www.weather.com.cn/textFC/gat.shtml'
# Note: url8 (gat) is defined but not included in url_list, so that page is not crawled.
url_list = [url1, url2, url3, url4, url5, url6, url7]
main_url = 'http://www.weather.com.cn'
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'}
city_weather_list = []
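The region pages above differ only in a short code (hb, db, hd, and so on). As a minimal sketch, the list could be built from those codes instead of from separate variables. The names BASE_URL, REGION_CODES and build_region_urls are hypothetical and not part of the original script; gat is left out here only to mirror the original url_list.

```python
# Hypothetical helper: build the region page URLs from their short codes.
BASE_URL = 'http://www.weather.com.cn/textFC/{code}.shtml'

# Codes copied from the URLs listed above. 'gat' is omitted because the
# original script defines url8 but does not add it to url_list.
REGION_CODES = ['hb', 'db', 'hd', 'hz', 'hn', 'xb', 'xn']


def build_region_urls(codes=REGION_CODES):
    """Return one textFC page URL per region code."""
    return [BASE_URL.format(code=code) for code in codes]


region_urls = build_region_urls()
print(region_urls[0])  # http://www.weather.com.cn/textFC/hb.shtml
```

Adding or dropping a region then means editing one list entry rather than a variable and the list that references it.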
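def get_weather(url):
    """Collect the lowest temperature of every city listed on the page."""
    resp = requests.get(url=url, headers=HEADERS)
    html = resp.content.decode('utf-8')
    soup = BeautifulSoup(html, "lxml")
    # Use the first div whose class is exactly 'conMidtab' and walk its tables.
    div1 = soup.select("div[class='conMidtab']")[0]
    tables = div1.select("table")
    for table in tables:
        # Skip the first two rows of each table (treated as header rows).
        trs = table.select("tr")[2:]
        for tr in trs:
            city_weather = {}
            # The first link in the row is taken as the city name.
            city = tr.select("a")[0].string
            # The cell with width='86' is taken as the lowest temperature.
            temp_lowest = tr.select("td[width='86']")[0].string
            city_weather['city'] = city
            city_weather['temp_lowest'] = int(temp_lowest)
            city_weather_list.append(city_weather)
    return city_weather_list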
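In get_weather, int(temp_lowest) raises an exception if the cell text is missing (BeautifulSoup's .string returns None when a tag has more than one child) or is not a number. A minimal sketch of a more tolerant conversion follows. The helper name parse_temp is hypothetical, and stripping a trailing ℃ is an assumption about what a cell might contain, not something observed on the site.

```python
def parse_temp(text):
    """Return the temperature as an int, or None if the cell is not numeric."""
    if text is None:
        return None
    # Assumption: a cell may carry whitespace or a trailing degree sign.
    cleaned = text.strip().replace('℃', '')
    try:
        return int(cleaned)
    except ValueError:
        return None


print(parse_temp(' -12 '))  # -12
print(parse_temp('-'))      # None
print(parse_temp(None))     # None
```

A caller would then skip rows where parse_temp returns None instead of letting one bad cell stop the whole crawl.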
def get_all_weather():
    for url in url_list:
        get_weather(url)
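if __name__ == '__main__':
    get_all_weather()
    # Sort ascending by lowest temperature and keep the first ten entries.
    city_weather_list.sort(key=lambda data: data['temp_lowest'])
    lowest_ten_list = city_weather_list[0:10]
    graphic = pd.DataFrame(lowest_ten_list)
    print(graphic)
    # matplotlib's default fonts cannot render Chinese, so select a font that can (LiSu here).
    # If negative numbers render as boxes, also set plt.rcParams['axes.unicode_minus'] = False.
    plt.rcParams['font.sans-serif'] = ['LiSu']
    graphic.plot.bar(x='city', y='temp_lowest', color='green')
    # Chart title: "Ranking of the cities with the lowest temperatures nationwide"
    plt.title('全国气温最低城市排行', fontsize=20, fontweight='bold')
    plt.xticks(rotation=0, fontsize=10)
    plt.yticks(rotation=90, fontsize=20)
    plt.show()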
The resulting chart is shown below.
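The script picks the ten lowest entries by sorting the whole list in place and slicing it. As a sketch of an alternative, heapq.nsmallest from the standard library returns the same kind of result without reordering the original list. The sample records and the helper name lowest_n are made up for illustration.

```python
import heapq

# Made-up sample records in the same shape as city_weather_list.
sample = [
    {'city': 'A', 'temp_lowest': -5},
    {'city': 'B', 'temp_lowest': 3},
    {'city': 'C', 'temp_lowest': -12},
    {'city': 'D', 'temp_lowest': 0},
]


def lowest_n(records, n=10):
    """Return the n records with the smallest temp_lowest, coldest first."""
    return heapq.nsmallest(n, records, key=lambda r: r['temp_lowest'])


coldest = lowest_n(sample, n=2)
print([r['city'] for r in coldest])  # ['C', 'A']
```

The returned list can be passed to pd.DataFrame in the same way as lowest_ten_list.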