
Installing tesseract with yum on CentOS 7, and installing tesserocr for Python 3 with pip

程序员文章站 2022-05-29 12:14:45

# Install the EPEL repository:

yum -y install epel-release

# Install tesseract:

yum -y install tesseract

# List the languages tesseract supports:

tesseract --list-langs

List of available languages (1):
eng
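
The same check can be scripted from Python. This is a minimal sketch, assuming `tesseract` is on the PATH; the helper names `parse_langs` and `list_langs` are illustrative and not part of tesseract. Some tesseract releases print the list on stderr rather than stdout, so both streams are merged.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is the "List of available languages (N):" header.
    if lines and lines[0].lower().startswith("list of available languages"):
        lines = lines[1:]
    return lines


def list_langs():
    """Run tesseract and return its language codes (hypothetical helper)."""
    # stderr is folded into stdout because the stream used differs by version.
    proc = subprocess.run(
        ["tesseract", "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # works on Python 3.6, as used in this article
    )
    return parse_langs(proc.stdout)
```

Calling `list_langs()` on the machine above should return the same codes that the command prints.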
 

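Only English is available out of the box. More language packs can be fetched with git (the repository's default branch targets Tesseract 4.x, so for the 3.04 build installed here the 3.04.00 tag may be the matching one):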

git clone https://github.com/tesseract-ocr/tessdata.git
mv tessdata/* /usr/share/tesseract/tessdata
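
To confirm that the language files landed where tesseract looks for them, a short Python check can help. This is a sketch only: the function name `missing_traineddata` is hypothetical, the directory is the one used in the `mv` command, and the language codes are just examples.

```python
import os


def missing_traineddata(tessdata_dir, langs):
    """Return the language codes in `langs` that have no .traineddata file."""
    return [
        lang for lang in langs
        if not os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))
    ]


if __name__ == "__main__":
    # Directory from the mv command; "eng" and "chi_sim" are example codes.
    print(missing_traineddata("/usr/share/tesseract/tessdata", ["eng", "chi_sim"]))
```

An empty list means every requested language has a data file in that directory.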

Install pillow and tesserocr with pip:

pip3 install pillow tesserocr

pillow installs successfully, but tesserocr fails with an error:

Installing collected packages: tesserocr

  Running setup.py install for tesserocr ... error

    Complete output from command /usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile:

    pkg-config failed to find tesseract/lept libraries: b"Package tesseract was not found in the pkg-config search path.\nPerhaps you should add the directory containing `tesseract.pc'\nto the PKG_CONFIG_PATH environment variable\nNo package 'tesseract' found\n"

    Supporting tesseract v3.04.00

    Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 197632}}

    /usr/local/python3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'

      warnings.warn(msg)

    running install

    running build

    running build_ext

    building 'tesserocr' extension

    creating build

    creating build/temp.linux-x86_64-3.6

    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/python3/include/python3.6m -c tesserocr.cpp -o build/temp.linux-x86_64-3.6/tesserocr.o

    tesserocr.cpp:597:34: fatal error: leptonica/allheaders.h: No such file or directory

     #include "leptonica/allheaders.h"

                                      ^

    compilation terminated.

    error: command 'gcc' failed with exit status 1

   

    ----------------------------------------

Command "/usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-i48iarbe/tesserocr/
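
Two lines in this log carry the diagnosis: pkg-config cannot find the `tesseract` package, and gcc cannot find `leptonica/allheaders.h`. Both point at missing development files rather than a problem with pip itself. As a rough sketch, assuming the log has been saved as text, those lines can be pulled out with two regular expressions; the name `diagnose` is illustrative.

```python
import re

# Matches gcc lines such as:
#   tesserocr.cpp:597:34: fatal error: leptonica/allheaders.h: No such file or directory
MISSING_HEADER = re.compile(r"fatal error: (\S+): No such file or directory")
# Matches pkg-config lines such as: No package 'tesseract' found
MISSING_PACKAGE = re.compile(r"No package '([^']+)' found")


def diagnose(log):
    """Return (missing headers, missing pkg-config packages) found in a build log."""
    return MISSING_HEADER.findall(log), MISSING_PACKAGE.findall(log)
```

Run against the output above, this reports the header `leptonica/allheaders.h` and the package `tesseract`.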

# Fix: the build lacks the development headers and pkg-config files, so install the tesseract-devel package:

yum -y install tesseract-devel 
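
Before retrying pip, it may be worth checking that pkg-config now resolves both libraries named in the error (`tesseract` and `lept`). A minimal sketch, assuming `pkg-config` is installed; `pkg_config_exists` is a hypothetical helper name.

```python
import subprocess


def pkg_config_exists(name):
    """Return True if pkg-config can resolve the package `name`."""
    try:
        # --exists prints nothing and signals the result via the exit status.
        return subprocess.call(["pkg-config", "--exists", name]) == 0
    except FileNotFoundError:
        # pkg-config itself is not installed.
        return False


if __name__ == "__main__":
    for name in ("tesseract", "lept"):
        print(name, "found" if pkg_config_exists(name) else "missing")
```

If either one is still reported missing, the tesserocr build is likely to fail in the same way as before.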

# Then install tesserocr again with pip:

pip3 install tesserocr

No errors this time. Done!
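
A quick way to confirm that the module actually works is to import it and ask which Tesseract it was built against. This is a small sketch using `tesserocr.tesseract_version()` and `tesserocr.get_languages()`; the wrapper `tesserocr_info` is a hypothetical name, and it returns `None` when the module cannot be imported.

```python
def tesserocr_info():
    """Return tesserocr's view of the Tesseract install, or None if not importable."""
    try:
        import tesserocr
    except ImportError:
        return None
    # get_languages() returns a (tessdata path, list of language codes) tuple.
    path, langs = tesserocr.get_languages()
    return {
        "version": tesserocr.tesseract_version(),
        "tessdata": path,
        "languages": langs,
    }


if __name__ == "__main__":
    print(tesserocr_info())
```

From there, recognizing text in an image is a call such as `tesserocr.image_to_text(Image.open("sample.png"))` with `Image` imported from PIL, where `sample.png` is a placeholder file name.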