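Using Python to scrape August temperatures from Moji Weather and send them by email
Goal: fetch the temperatures for the whole of August from Moji Weather and send them to a mailbox as a txt file.
Environment: Python 3.6, PyCharm, and a 163 mailbox (the 163 account must have its authorization code enabled, otherwise sending fails).
Approach: 1. Fetch the source of the Moji Weather page that shows the August temperatures.
2. Filter the source with regular expressions or an HTML parser to extract the useful data; this article uses regular expressions.
3. Save the data to a txt file.
4. Send the file by email.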
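I. Fetching the page source
There are many ways to fetch the Moji Weather page source. This article uses the get() method of the requests library. Plenty of tutorials are available online, and my earlier posts cover it as well, so it is not explained in detail here. The code: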
import requests
from requests.exceptions import RequestException

url = 'https://tianqi.moji.com/weather/china/hubei/jiangxia-district'

def get_html(url):
    header = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) '
                      'AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/52.0.2743.116 Safari/537.36'
    }  # pretend to be a browser
    try:
        # the request sits inside try so that network errors are caught too
        response = requests.get(url, headers=header)
        if response.status_code == 200:
            return response.text
    except RequestException:
        print("Failed to request the page!")
    return None
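II. Filtering the data
This article filters the data with regular expressions. How is that done? Start by looking at a screenshot of the page source and note the parts marked with red boxes.
The screenshot shows that the dates appear in the source as <em>01</em>, <em>02</em>, and so on. em is an HTML element that tells the browser to render the enclosed text as emphasized. The matching regular expression is therefore r'<em>(\d\d)</em>', and the code that extracts the dates is: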
import re

reg_date = r'<em>(\d\d)</em>'      # pattern for the date
reg_num = re.compile(reg_date)     # precompile the pattern so it can be reused
regdalist = reg_num.findall(html)  # match against the whole page source
Note: findall returns the result as a list.
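As a minimal sketch of the date pattern in action, the snippet below runs it on a small hand-written HTML fragment. The fragment and the name sample_html are made up for illustration and are not the real Moji Weather source.

```python
import re

# Hypothetical fragment imitating the date markup described above
sample_html = '<li><em>01</em></li><li><em>02</em></li><li><em>Aug</em></li>'

date_pattern = re.compile(r'<em>(\d\d)</em>')
dates = date_pattern.findall(sample_html)  # only two-digit contents match

print(dates)  # ['01', '02']
```

The third em element is skipped because its content is not two digits, which is why the pattern uses \d\d rather than a catch-all such as .*?.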
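The page source for the weather condition looks like alt="雷阵雨 "></b> (雷阵雨 means thunderstorm), and the matching regular expression is r'alt="(.*?)"></b>'.
The source for the temperature and the wind looks like <p>26/36°</p> <p>东风 3级</p> (east wind, level 3), and the matching regular expression is r'<p>(.*?)</p>'.
Anyone who reads the whole page source will notice other <p>.....</p> fragments elsewhere. Does the regular expression pick those up too? Yes, it does, but there are only a few, so they can be deleted from the list, leaving just the data we need.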
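The sketch below applies the two patterns to a made-up fragment and then drops the unrelated <p> match. Filtering by the shape of the text is an assumption introduced here as one possible approach; the full code at the end of the article deletes the extra entries by list index instead. The fragment and all variable names are hypothetical.

```python
import re

# Hypothetical fragment: one day of data plus an unrelated <p> element
sample_html = (
    '<b><img src="w4.png" alt="雷阵雨"></b>'
    '<p>26/36°</p> <p>东风 3级</p>'
    '<p>some unrelated paragraph</p>'
)

weather = re.findall(r'alt="(.*?)"></b>', sample_html)
paragraphs = re.findall(r'<p>(.*?)</p>', sample_html)

# Keep only entries shaped like "26/36°" (temperature) or ending in "级" (wind level)
wanted = [p for p in paragraphs
          if re.fullmatch(r'-?\d+/-?\d+°', p) or p.endswith('级')]

print(weather)  # ['雷阵雨']
print(wanted)   # ['26/36°', '东风 3级']
```

A shape-based filter does not depend on where the stray entries sit in the list, at the cost of having to describe every wanted format.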
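III. Saving to a txt file
How do you save the data to a txt file?
Open the file (it is created automatically if it does not exist), write the data into it, and finally close it.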
Example:
htmlcode = 'asjgojdopajdsafjbfoso'
pageFile = open('pageCode.txt', 'w')  # open pageCode.txt for writing
pageFile.write(htmlcode)              # write the data
pageFile.close()                      # remember to close what you opened
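The same open, write, close sequence can also be sketched with a with block, which closes the file automatically. The sample rows, the temporary directory and the explicit utf-8 encoding are assumptions made for this illustration.

```python
import os
import tempfile

# Made-up sample rows standing in for the scraped data
rows = ['01 雷阵雨 26/36° 东风 3级', '02 多云 27/35° 南风 2级']

# Hypothetical output path inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'pageCode.txt')

# The with block closes the file even if a write raises an exception
with open(path, 'w', encoding='utf-8') as page_file:
    for row in rows:
        page_file.write(row + '\n')

# Read the file back to confirm what was stored
with open(path, encoding='utf-8') as page_file:
    saved = page_file.read().splitlines()

print(saved == rows)  # True
```

Passing an explicit encoding keeps the Chinese text readable regardless of the system's default encoding.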
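IV. Sending the txt file by email
How do you send an attachment by email?
Step 1: choose the sender's mail server and port. The 163 mail server uses port 25, and the QQ mail server uses port 465 (over SSL).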
email_sever = 'smtp.163.com'  # use the 163 mail server
email_port = 25               # the corresponding port
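The port also decides which smtplib class fits: port 465 expects an SSL connection from the first byte, while port 25 starts as plain SMTP. The helper below is a hypothetical sketch of that choice and makes no network connection; which ports a given provider accepts should be checked against its own documentation.

```python
import smtplib

def smtp_class_for(port):
    """Return the smtplib class suited to the port (hypothetical helper)."""
    # 465 is the implicit-SSL port, so the TLS handshake starts immediately
    if int(port) == 465:
        return smtplib.SMTP_SSL
    # other ports such as 25 start as plain SMTP
    return smtplib.SMTP

print(smtp_class_for(25).__name__)     # SMTP
print(smtp_class_for('465').__name__)  # SMTP_SSL
```

With a QQ mailbox on port 465, the connection would then be created through smtplib.SMTP_SSL instead of smtplib.SMTP.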
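Step 2: set the sender, the recipient and the subject.
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr

email = MIMEMultipart()
email['From'] = formataddr(("xxxx", email_sender))   # xxxx: sender's display name
email['To'] = formataddr(("xxxx", email_receiver))   # xxxx: recipient's display name
email['Subject'] = "xxxxxxx"                         # subject of the message
Step 3: add the body text and the attachment.
# body text
message = "XXXXXX"
textApart = MIMEText(message)
email.attach(textApart)
# attachment
txtFile = r'xxxxxxxxx'                               # path of the txt file to attach
with open(txtFile, 'rb') as f:
    txtApart = MIMEApplication(f.read())
txtApart.add_header('Content-Disposition', 'attachment', filename=('gbk', '', txtFile))
email.attach(txtApart)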
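To see what such a message contains before sending it, the sketch below builds one in memory and parses it back. The addresses, the subject, the file name august.txt and the attachment bytes are all placeholders invented for this example.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr

msg = MIMEMultipart()
msg['From'] = formataddr(('Sender', 'sender@example.com'))      # placeholder address
msg['To'] = formataddr(('Receiver', 'receiver@example.com'))    # placeholder address
msg['Subject'] = 'August temperatures'
msg.attach(MIMEText('See the attached file.'))

# These bytes stand in for the content of the txt file
content = '01 26/36°\n'.encode('utf-8')
attachment = MIMEApplication(content)
attachment.add_header('Content-Disposition', 'attachment', filename='august.txt')
msg.attach(attachment)

# Parse the serialized message back and list its parts
parsed = message_from_string(msg.as_string())
parts = [(p.get_content_type(), p.get_filename()) for p in parsed.get_payload()]
print(parts)  # [('text/plain', None), ('application/octet-stream', 'august.txt')]
```

The body text and the attachment travel as two separate parts of one multipart message, and only the part carrying a Content-Disposition header has a file name.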
Step 4: send the email.
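import smtplib

def sendmail(mail_sever, mail_port, mail_sender, mail_pass, mail_receiver):
    try:
        # the with block sends QUIT and closes the connection on exit
        with smtplib.SMTP(mail_sever, mail_port) as mail:   # connect to the mail server and port
            mail.login(mail_sender, mail_pass)               # log in with the authorization code
            mail.sendmail(mail_sender, [mail_receiver], email.as_string())  # send the message
        print("Email sent successfully!")
    except (smtplib.SMTPException, OSError):
        print("Failed to send the email!")

sendmail(email_sever, email_port, email_sender, email_pass, email_receiver)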
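That completes the whole process. This article is for reference only; suggestions for improvement are welcome. Thank you! The full code is at the end of the article.
References:
https://www.cnblogs.com/Axi8/p/5757270.html (for saving the txt file)
https://blog.csdn.net/handsomekang/article/details/9811355 (for sending an email with an attachment)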
# Fetch the weather for the whole of August from Moji Weather, save it to a txt file and send it by email
# -*- coding: utf-8 -*-
import requests
from requests.exceptions import RequestException
import re
import smtplib
from email.mime.text import MIMEText
from email.utils import formataddr
from email.mime.multipart import MIMEMultipart
from email.mime.application import MIMEApplication
# Data fetching section
def get_html(url):
    header = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) '
                      'AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/52.0.2743.116 Safari/537.36'
    }  # pretend to be a browser
    try:
        response = requests.get(url, headers=header)  # request the page
        if response.status_code == 200:               # return the source if the request succeeded
            return response.text
    except RequestException:                          # report request errors
        print("Failed to request the page!")
    return None
def html_Select(html):
    reg_date = r'<em>(\d\d)</em>'      # date
    reg_st = r'alt="(.*?)"></b>'       # weather condition
    reg_team = r'<p>(.*?)</p>'         # temperature and wind
    reg_num = re.compile(reg_date)     # precompile the patterns
    reg_wea = re.compile(reg_st)
    reg_temp = re.compile(reg_team)
    regdalist = reg_num.findall(html)  # match against the whole page source
    regstlist = reg_wea.findall(html)
    regtelist = reg_temp.findall(html)
    # delete the extra matches that are not daily data (see section II);
    # these positions are specific to this page
    del regdalist[0]
    del regtelist[62]
    # print(regdalist)
    # print(regstlist)
    # print(regtelist)
    num = 0    # index into the date and weather lists
    num_o = 0  # index into the temperature/wind list (two entries per day)
    with open('8月份温度.txt', 'w', encoding='utf-8') as temp_File:
        while num < 32:
            for data in regdalist[num:num+1]:
                temp_File.write(data)
            for weather in regstlist[num:num+1]:
                temp_File.write(weather)
            for temp in regtelist[num_o:num_o+2]:
                temp_File.write(temp + "\n")
            # temp_File.write("=================================")
            num += 1
            num_o += 2
# Email section
# mail server and port
email_sever = 'smtp.163.com'    # use the 163 mail server
email_port = 25                 # the corresponding port
# sender and recipient
email_sender = 'aaa@qq.com'     # sender address (a 163 address when using smtp.163.com)
email_pass = 'xxxxxxxx'         # the authorization code, not the login password
email_receiver = 'aaa@qq.com'

# build the message: subject, body text and attachment
def build_email(txtFile):
    email = MIMEMultipart()
    email['From'] = formataddr(("xxxxx", email_sender))
    email['To'] = formataddr(("xxxxxxxx", email_receiver))
    email['Subject'] = "xxxxxxxxx"
    # body text
    message = "xxxxxxxxxxx"
    email.attach(MIMEText(message))
    # attachment
    with open(txtFile, 'rb') as f:
        txtApart = MIMEApplication(f.read())
    txtApart.add_header('Content-Disposition', 'attachment', filename=('gbk', '', txtFile))
    email.attach(txtApart)
    return email

# send the email
def send_email(mail_sever, mail_port, mail_sender, mail_pass, mail_reciver, email):
    try:
        # the with block sends QUIT and closes the connection on exit
        with smtplib.SMTP(mail_sever, mail_port) as mail:
            mail.login(mail_sender, mail_pass)
            mail.sendmail(mail_sender, [mail_reciver], email.as_string())
        print("Email sent successfully!")
    except (smtplib.SMTPException, OSError):
        print("Failed to send the email!")

if __name__ == '__main__':
    url = 'https://tianqi.moji.com/weather/china/hubei/jiangxia-district'
    html = get_html(url)
    if html is not None:
        html_Select(html)
        txtFile = r'xxxxxxxxxxxx'     # path of the txt file to attach
        email = build_email(txtFile)  # read the attachment only after the file has been written
        send_email(email_sever, email_port, email_sender, email_pass, email_receiver, email)
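As an alternative to the index counters used when writing the file in html_Select, the pairing of the three lists can be sketched with slicing and zip. The helper name format_rows and the sample data are hypothetical, and the output layout (one line per day) differs from the file written by the full code above.

```python
def format_rows(dates, weathers, temps_winds):
    """Pair each date with its weather and its temperature and wind entries."""
    temps = temps_winds[0::2]   # even positions: temperature, e.g. '26/36°'
    winds = temps_winds[1::2]   # odd positions: wind, e.g. '东风 3级'
    # zip stops at the shortest list, so no index can run out of range
    return [' '.join(row) for row in zip(dates, weathers, temps, winds)]

# Made-up sample data in the shape produced by the three findall calls
rows = format_rows(['01', '02'],
                   ['雷阵雨', '多云'],
                   ['26/36°', '东风 3级', '27/35°', '南风 2级'])
print(rows)  # ['01 雷阵雨 26/36° 东风 3级', '02 多云 27/35° 南风 2级']
```

Because zip stops at the shortest input, a month with fewer than 31 days or a short match list simply produces fewer rows.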