
Find the 10 most frequent words in a file and how many times each occurs

程序员文章站 2023-03-26 16:45:04
import re

# A word is a run of ASCII letters, upper or lower case.
regex = "[a-zA-Z]+"

with open("./test.py") as f:
    lines = f.readlines()

# Map each word to the number of times it appears in the file.
worddict = dict()
for line in lines:
    words = re.findall(regex, line)
    for word in words:
        if word in worddict:
            worddict[word] += 1
        else:
            worddict[word] = 1

# Sort by count, highest first, and keep only the 10 most frequent words.
words_top10 = sorted(worddict.items(), key=lambda x: x[1], reverse=True)[:10]

print(words_top10)
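
The same count can also be done with `collections.Counter` from the standard library, which replaces both the manual dictionary updates and the sort. This is only a minimal sketch of that alternative, not the original author's method: the names `top_words`, `WORD_RE`, and `sample` are made up for illustration, and it works on a string rather than a file so it can be run as-is.

```python
import re
from collections import Counter

# Same word pattern as above: runs of ASCII letters.
WORD_RE = re.compile(r"[a-zA-Z]+")


def top_words(text, n=10):
    """Return the n most common words in text as (word, count) pairs.

    Hypothetical helper for illustration; matching is case-sensitive,
    just like the dictionary-based version.
    """
    return Counter(WORD_RE.findall(text)).most_common(n)


# Illustrative input; for a file, pass f.read() instead.
sample = "the cat and the dog and the bird"
print(top_words(sample, 2))  # [('the', 3), ('and', 2)]
```

To use it on a file, read the whole file with `f.read()` inside the `with open(...)` block and pass the result to `top_words`. If "The" and "the" should count as one word, lowercasing the text first with `text.lower()` is one option.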