jieba word segmentation error: AttributeError: 'float' object has no attribute 'decode'

程序员文章站 (Programmer Article Station), 2022-03-26 19:32:34

I have recently been doing research on news reports, and when I segmented the text with jieba the program failed with AttributeError: 'float' object has no attribute 'decode'.

Original code

Only the part of the code that raises the error is shown below.

content_S = []
current_segment = jieba.lcut(content)
if len(current_segment) > 1 and current_segment != '\n':
    content_S.append(current_segment)
contents_clean = []
all_words = []
stopwords_lst = stopwords['stopword'].tolist()

After running the code

C:\Users\Administrator\Anaconda3\lib\site-packages\jieba\_compat.py in strdecode(sentence)
     35     if not isinstance(sentence, text_type):
     36         try:
---> 37             sentence = sentence.decode('utf-8')
     38         except UnicodeDecodeError:
     39             sentence = sentence.decode('gbk', 'ignore')

AttributeError: 'float' object has no attribute 'decode'
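This happens because the text to be segmented contains numbers: one of the values passed to jieba is a float rather than a string (for example a numeric cell, or a missing value that pandas loads as the float NaN). jieba calls .decode() on anything that is not already text, and a float has no such method. Adding exception handling around the call is enough to let the program run normally.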
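
To make the cause concrete, here is a minimal sketch that does not need jieba at all. The helper `is_segmentable` and the `samples` list are hypothetical names introduced only for illustration; the sketch simply shows which kinds of values are real text and confirms that a float has no `decode` method.

```python
def is_segmentable(value):
    """Return True only for non-empty text that a segmenter can handle.

    Hypothetical helper for illustration; it is not part of jieba.
    """
    return isinstance(value, str) and value.strip() != ""


# Made-up sample values of the kind that can appear in a text column.
samples = ["新闻报道的内容", 3.14, float("nan"), None, ""]

for value in samples:
    # Show each value, its type name, and whether it is safe to segment.
    print(repr(value), type(value).__name__, is_segmentable(value))

# A float has no .decode method, which is what the traceback complains about.
has_decode = hasattr(3.14, "decode")
print(has_decode)  # → False
```

Checking the type of the offending value this way can help confirm whether the problem is a genuine number or a missing value before deciding how to handle it.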

Modified code

import jieba

def drop_words(content):
    """Remove stop words."""
    content_S = []
    try:
        current_segment = jieba.lcut(content)
    except AttributeError:
        # content is not a string (e.g. a float), so there is nothing to segment
        current_segment = []
    if len(current_segment) > 1:  # also skips a lone newline token
        content_S.append(current_segment)
    contents_clean = []
    all_words = []
    stopwords_lst = stopwords['stopword'].tolist()
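
As an alternative to catching the exception, the non-text values can be filtered out before they ever reach the segmenter. The sketch below is only one possible approach, not the method used in this article: `safe_lcut`, `segment_all` and the `cut` parameter are hypothetical names, and `str.split` is used as a stand-in segmenter so the example runs without jieba installed. In the setting described here, `cut` would be `jieba.lcut`.

```python
def safe_lcut(content, cut):
    """Segment content with cut, returning [] for anything that is not text.

    Hypothetical helper; cut is any callable that maps a str to a list of tokens.
    """
    if not isinstance(content, str):
        # Floats (including NaN), ints and None cannot be segmented.
        return []
    return cut(content)


def segment_all(contents, cut):
    """Segment every item and keep only results with more than one token."""
    content_S = []
    for content in contents:
        current_segment = safe_lcut(content, cut)
        if len(current_segment) > 1:
            content_S.append(current_segment)
    return content_S


# Made-up rows mixing text with the kinds of values that trigger the error.
rows = ["a b c", float("nan"), 2022, "\n", "d e"]

# str.split stands in for jieba.lcut here (assumption for illustration only).
result = segment_all(rows, str.split)
print(result)  # → [['a', 'b', 'c'], ['d', 'e']]
```

Compared with a bare try/except, an explicit type check makes it obvious which inputs are skipped and avoids hiding an unrelated AttributeError raised elsewhere in the call.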

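I am a beginner myself, so my first reaction when I hit a bug is to search Baidu. I found that someone else had run into the same problem, but after following their solution my program still raised the error, so I had to work it out on my own. I hope this helps solve the problem for others.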