Opening a file:
f = open("file path", 'r')  # the first argument is the path to the file
data = f.read()
print(data)
f.close()  # release the resource
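Editing a file
Change the trailing 'r' (read-only) to 'w' (write).
f = open("file path", 'w')
f.write("x=1")  # writes x=1; 'w' overwrites: the file is emptied first, then written
f.close()  # close the file so the data is flushed to disk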
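The open, read/write, and close steps above can also be written with a `with` block, which closes the file automatically even if an error occurs partway through. The sketch below is only an illustration: the file name `demo.txt` and the temporary directory are assumptions made for the example. It also shows mode 'a', which appends instead of overwriting.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' truncates the file (or creates it) before writing.
with open(path, "w", encoding="utf8") as f:
    f.write("x=1")

# Opening with 'w' again discards the previous contents.
with open(path, "w", encoding="utf8") as f:
    f.write("y=2")

# 'a' appends to the end instead of truncating.
with open(path, "a", encoding="utf8") as f:
    f.write("\nz=3")

# 'r' is read-only; the file is closed when the block ends.
with open(path, "r", encoding="utf8") as f:
    data = f.read()

print(data)
```

After this runs, the file holds only the second write plus the appended line, because the first write was overwritten.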
Using the wordcloud library
Common wordcloud methods
Method | Description |
---|---|
w.generate(txt) | Loads the text txt into the WordCloud object w, e.g. w.generate("Python and WordCloud") |
w.to_file(filename) | Writes the word cloud to an image file in .png or .jpg format, e.g. w.to_file("outfile.png") |
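Configuring wordcloud.WordCloud
Parameter | Description |
---|---|
width | Width of the generated image; default 400 pixels |
height | Height of the generated image; default 200 pixels |
min_font_size | Smallest font size used in the word cloud; default 4 |
max_font_size | Largest font size used in the word cloud; adjusted automatically to the image height |
font_step | Step between font sizes in the word cloud; default 1 |
font_path | Path to the font file; default None |
max_words | Maximum number of words shown in the word cloud; default 200 |
stopwords | Words to exclude, i.e. the set of words that are not shown |
mask | Shape of the word cloud; rectangular by default. The mask image must be loaded with imread() |
background_color | Background color of the image; default black |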
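As a sketch of how these parameters fit together, the block below collects a few of them in a dictionary and passes them to `wordcloud.WordCloud`. The values and the helper name `render_cloud` are assumptions made for illustration. The import sits inside the helper so that the settings can be inspected even where the third-party package is not installed.

```python
# Illustrative values only; each key is a WordCloud keyword argument.
settings = {
    "width": 800,                 # image width in pixels (default 400)
    "height": 400,                # image height in pixels (default 200)
    "min_font_size": 8,           # smallest font size used
    "max_words": 50,              # cap on the number of words drawn
    "stopwords": {"and", "the"},  # words to leave out
    "background_color": "white",  # default is black
}


def render_cloud(text, out_path, **options):
    """Render text to an image file (hypothetical helper)."""
    import wordcloud  # third-party: pip install wordcloud

    w = wordcloud.WordCloud(**options)
    w.generate(text)     # load the text and lay out the words
    w.to_file(out_path)  # write a .png or .jpg image
    return out_path
```

A call might look like `render_cloud("Python and WordCloud", "demo_cloud.png", **settings)`, where the output name is again just an example. For Chinese text, a `font_path` pointing at a font with Chinese glyphs would also be needed.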
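Word-frequency count for English text
f = open(r'C:\Users\quyang\PycharmProjects\untitled\hamlet.txt', "r", encoding="utf8")
data = f.read().lower()  # lowercase so that "The" and "the" count as one word
f.close()
# print(data)
data_split = data.split()  # split on any whitespace, including newlines
# print(data_split)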
count_dict = {}
for word in data_split:
    if word not in count_dict:
        count_dict[word] = 1
    else:
        count_dict[word] += 1
# print(count_dict)

def func(i):
    # sort key: the count in each (word, count) pair
    return i[1]

lt = list(count_dict.items())
lt.sort(key=func)
lt.reverse()  # highest counts first
for i in lt[0:10]:
    print(f'{i[0]:^7}{i[1]:^5}')  # centre the word in 7 columns and the count in 5
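The counting and sorting steps above can be condensed with `collections.Counter` from the standard library. This is an alternative sketch, not the method used in these notes, and the sample sentence is made up to stand in for the contents of hamlet.txt.

```python
from collections import Counter

# Made-up sample text standing in for the file contents.
text = "To be or not to be that is the question to be"

# Lowercase, then split on any run of whitespace.
words = text.lower().split()

# Counter builds the word -> count mapping in one step.
counts = Counter(words)

# most_common(n) returns the n highest counts, largest first.
top = counts.most_common(3)

for word, n in top:
    # :^7 centres the word in 7 columns, :^5 centres the count in 5.
    print(f"{word:^7}{n:^5}")
```

`Counter` replaces the manual `if word not in count_dict` branch, and `most_common` replaces the separate sort, reverse, and slice.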
Word-frequency count for Chinese text
import jieba

f = open(r'D:\上海Python11期视频\预科班\threekingdoms.txt', 'r', encoding='utf8')
data = f.read()
f.close()
# print(data)
data_jieba = jieba.lcut(data)  # segment the text into a list of words
print(data_jieba)
count_dict = {}
for word in data_jieba:
    if len(word) == 1:  # skip single characters and punctuation
        continue
    if word in {"将军", "却说", "荆州", "二人", "不可", "不能", "如此", "商议"}:  # skip frequent words that are not names
        continue
    # if word == '孔明曰':
    #     word = '孔明'
    # elif word == '玄德曰':
    #     word = '玄德'
    if '曰' in word:  # strip '曰' ("said") so that '孔明曰' is counted as '孔明'
        word = word.replace('曰', '')
    if word in count_dict:
        count_dict[word] += 1
    else:
        count_dict[word] = 1

def func(i):
    # sort key: the count in each (word, count) pair
    return i[1]

data_list = list(count_dict.items())
data_list.sort(key=func)
data_list.reverse()  # highest counts first
print(data_list)
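The filtering rules above (skip single characters, skip an exclusion set, strip 曰 so that 孔明曰 counts as 孔明) can be tried without the novel or jieba. In the sketch below, the token list is a made-up stand-in for the output of `jieba.lcut`, the exclusion set is a subset of the one above, and the helper name `count_names` is hypothetical.

```python
from collections import Counter

# Words to ignore: a subset of the exclusion set used above.
EXCLUDED = {"将军", "却说", "商议"}


def count_names(tokens):
    """Count tokens after applying the same filters as the loop above."""
    counts = Counter()
    for word in tokens:
        if len(word) == 1:   # drop single characters and punctuation
            continue
        if word in EXCLUDED:  # drop frequent words that are not names
            continue
        word = word.replace("曰", "")  # "孔明曰" becomes "孔明"
        counts[word] += 1
    return counts


# Made-up tokens standing in for jieba.lcut(data).
tokens = ["却说", "孔明曰", "，", "玄德", "孔明", "将军", "玄德曰", "孔明"]
result = count_names(tokens)
print(result.most_common())
```

Keeping the filters in one function makes it easy to check them on a handful of tokens before running them over the full text.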
Word cloud
# import wordcloud
# f = open(r'C:\Users\quyang\PycharmProjects\untitled\threekingdoms.txt', 'r', encoding="utf8")
# data = f.read()
# w = wordcloud.WordCloud(font_path=r'C:\Windows\Fonts\simfang.ttf')  # a font that supports Chinese
# w.generate(data)  # generate the image
# w.to_file('outfile2.png')
Using your own image as the shape
import wordcloud
from imageio import imread

mask = imread(r'C:\Users\quyang\PycharmProjects\untitled\tst3.png')  # image whose shape the word cloud takes
f = open(r'C:\Users\quyang\PycharmProjects\untitled\threekingdoms.txt', 'r', encoding='utf8')
data = f.read()
f.close()
w = wordcloud.WordCloud(font_path=r'C:\Windows\Fonts\simsun.ttc', mask=mask, background_color="white")
w.generate(data)
w.to_file("outfile2.png")