
Image text recognition (OCR) in Java

程序员文章站 2024-02-17 18:07:22

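While developing recently I needed to recognize text in images. After going through material online, I found that Google has an offline tool for this, Tesseract-OCR. Below is a demo of using it from Java.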

Before anything else, the OCR tool has to be installed locally:


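The lower file in the original screenshot is the offline installer. It is required, and installing it with the default options is recommended.

The upper file is the Chinese language pack. If your network can reach the download servers, you can instead select language packs for online installation while running the installer. Many languages are available; only English is installed by default.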

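Once the exe is installed, copy the upper file (the language pack) into the tessdata folder under the installation directory, for example C:\Program Files (x86)\Tesseract-OCR\tessdata.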
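To confirm that the language pack ended up in the right place before running any recognition, a small check along these lines may help. It is only a sketch: the class and method names are made up for illustration, and it assumes the language data follows Tesseract's usual `<lang>.traineddata` naming (for example `chi_sim.traineddata`).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tesseract or of the original post.
final class TessdataCheck {

    /** Path where the data file for a language code is expected to live. */
    static Path languageFile(Path installDir, String lang) {
        return installDir.resolve("tessdata").resolve(lang + ".traineddata");
    }

    /** True if the language data file exists under installDir/tessdata. */
    static boolean hasLanguage(Path installDir, String lang) {
        return Files.isRegularFile(languageFile(installDir, lang));
    }

    /** Convenience overload taking the install directory as a string. */
    static boolean hasLanguage(String installDir, String lang) {
        return hasLanguage(Paths.get(installDir), lang);
    }
}
```

Calling `TessdataCheck.hasLanguage("C:\\Program Files (x86)\\Tesseract-OCR", "chi_sim")` before starting recognition gives an early, readable failure instead of an error from the external process.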

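The two files below it are optional packages. If you do not route the image through a temporary file, you do not need to include them.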


First, a class that generates a temporary file so that the source file cannot be damaged, based on an example by fellow blogger @gunner:

package org.ink.image.textrz;

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.Locale;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;

import com.sun.media.imageio.plugins.tiff.TIFFImageWriteParam;

public class ImageIOHelper {
  private Locale locale = Locale.CHINESE;

  /**
   * Constructor with a user-supplied locale
   * @param locale locale passed to the TIFF write parameters
   */
  public ImageIOHelper(Locale locale) {
    this.locale = locale;
  }

  /**
   * Default constructor, uses Locale.CHINESE
   */
  public ImageIOHelper() {
  }

  /**
   * Creates a temporary TIFF copy of the image so the original file is never touched
   * @param imageFile source image
   * @param imageFormat format name of the source image, e.g. png, jpg
   * @return temporary TIFF file
   * @throws IOException if the image cannot be read or the copy cannot be written
   */
  public File createImage(File imageFile, String imageFormat) throws IOException {
    Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(imageFormat);
    Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
    // Fail with a clear message instead of a NoSuchElementException
    if (!readers.hasNext()) {
      throw new IOException("No image reader for format: " + imageFormat);
    }
    if (!writers.hasNext()) {
      throw new IOException("No TIFF image writer available");
    }
    ImageReader reader = readers.next();
    ImageWriter writer = writers.next();
    File tempFile = tempImageFile(imageFile);
    // try-with-resources closes both streams even if reading or writing fails
    try (ImageInputStream iis = ImageIO.createImageInputStream(imageFile);
         ImageOutputStream ios = ImageIO.createImageOutputStream(tempFile)) {
      reader.setInput(iis);
      IIOMetadata streamMetadata = reader.getStreamMetadata();
      // Use the configured locale (the original always passed Locale.CHINESE here)
      TIFFImageWriteParam tiffWriteParam = new TIFFImageWriteParam(locale);
      tiffWriteParam.setCompressionMode(ImageWriteParam.MODE_DISABLED);
      BufferedImage bi = reader.read(0);
      IIOImage image = new IIOImage(bi, null, reader.getImageMetadata(0));
      writer.setOutput(ios);
      writer.write(streamMetadata, image, tiffWriteParam);
    } finally {
      writer.dispose();
      reader.dispose();
    }
    hideOnWindows(tempFile);
    return tempFile;
  }

  /**
   * Builds the temp file name: original name plus a suffix, with a .tif extension
   * @param imageFile source image
   * @return temp file located next to the source image
   */
  private File tempImageFile(File imageFile) {
    String name = imageFile.getName();
    int dot = name.lastIndexOf('.');
    String base = dot > 0 ? name.substring(0, dot) : name;
    return new File(imageFile.getParentFile(), base + "_text_recognize_temp.tif");
  }

  /**
   * Marks the file as hidden. attrib is a Windows command, so other systems are skipped,
   * and the call is made only after the file has been written.
   */
  private void hideOnWindows(File file) throws IOException {
    String osName = System.getProperty("os.name").toLowerCase(Locale.ROOT);
    if (osName.startsWith("windows")) {
      Runtime.getRuntime().exec(new String[] {"attrib", "+h", file.getAbsolutePath()});
    }
  }

}
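The class above relies on the com.sun.media TIFF classes, which are not part of the JDK. As a hedged alternative, the sketch below assumes Java 9 or later, where javax.imageio ships a built-in TIFF writer, and produces the same kind of temporary copy with the standard library only. The class and method names are illustrative, and compression settings are left at the writer's defaults rather than disabled as above.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper; assumes the built-in TIFF plugin of Java 9+.
final class TiffTempFiles {

    /** Derives "<name>_text_recognize_temp.tif" next to the source file. */
    static File tempTiffFor(File source) {
        String name = source.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return new File(source.getParentFile(), base + "_text_recognize_temp.tif");
    }

    /** Writes the image as TIFF; returns false if no TIFF writer is installed. */
    static boolean writeTiff(BufferedImage image, File target) {
        try {
            return ImageIO.write(image, "tiff", target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the source image and stores a TIFF copy beside it. */
    static File copyAsTiff(File source) {
        try {
            BufferedImage image = ImageIO.read(source);
            if (image == null) {
                throw new IOException("Unsupported image: " + source);
            }
            File target = tempTiffFor(source);
            if (!writeTiff(image, target)) {
                throw new IOException("No TIFF writer available");
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

`TiffTempFiles.copyAsTiff(new File("photo.png"))` would then stand in for `createImage`. On older Java versions `writeTiff` simply returns false, so the extra jar is still needed there.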

Below is the class that performs the actual recognition:

package org.ink.image.textrz;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

import org.jdesktop.swingx.util.OS;

/**
 * Text recognition utilities
 * @author ink.flower
 *
 */
public class OCRUtil {
  private final String LANG_OPTION = "-l"; // lowercase letter L, not the digit 1
  private final String EOL = System.getProperty("line.separator");
  private String tessPath = "C:\\Program Files (x86)\\Tesseract-OCR"; // default OCR install path
  private String transName = "chi_sim"; // default language pack: Simplified Chinese

  /**
   * Constructor that sets the Tesseract-OCR install path and the language
   * @param tessPath Tesseract-OCR install path
   * @param transFileName language code, e.g. eng for eng.traineddata
   */
  public OCRUtil(String tessPath, String transFileName) {
    this.tessPath = tessPath;
    this.transName = transFileName;
  }

  /**
   * Default constructor, install path is "C:\Program Files (x86)\Tesseract-OCR"
   */
  public OCRUtil() {
  }

  public String getTessPath() {
    return tessPath;
  }

  public void setTessPath(String tessPath) {
    this.tessPath = tessPath;
  }

  public String getTransName() {
    return transName;
  }

  public void setTransName(String transName) {
    this.transName = transName;
  }

  public String getLANG_OPTION() {
    return LANG_OPTION;
  }

  public String getEOL() {
    return EOL;
  }

  /**
   * Recognizes the text in an image
   * @param imageFile source image
   * @param imageFormat format name of the source image, e.g. png
   * @return text recognized in the image
   * @throws Exception if the temp copy or the recognition fails
   */
  public String recognizeText(File imageFile, String imageFormat) throws Exception {
    File tempImage = new ImageIOHelper().createImage(imageFile, imageFormat);
    return ocrImages(tempImage, imageFile);
  }

  /**
   * Recognizes the text in an image
   * @param imageFile source image
   * @param imageFormat format name of the source image, e.g. png
   * @param locale locale used when writing the temporary TIFF
   * @return text recognized in the image
   * @throws Exception if the temp copy or the recognition fails
   */
  public String recognizeText(File imageFile, String imageFormat, Locale locale) throws Exception {
    File tempImage = new ImageIOHelper(locale).createImage(imageFile, imageFormat);
    return ocrImages(tempImage, imageFile);
  }

  /**
   * Runs tesseract on the temporary image and reads back the text it produced
   * @param tempImage temporary TIFF copy, deleted when recognition finishes
   * @param imageFile original image, used to locate the working directory
   * @return recognized text
   * @throws IOException
   * @throws InterruptedException
   */
  private String ocrImages(File tempImage, File imageFile) throws IOException, InterruptedException {
    File outputFile = new File(imageFile.getParentFile(), "output");
    // tesseract appends ".txt" to the output base name
    File resultFile = new File(outputFile.getAbsolutePath() + ".txt");
    StringBuilder strB = new StringBuilder();
    List<String> cmd = new ArrayList<String>();
    if (OS.isLinux()) {
      cmd.add("tesseract"); // expected to be on the PATH
    } else {
      cmd.add(new File(tessPath, "tesseract").getPath());
    }
    cmd.add(tempImage.getName());
    cmd.add(outputFile.getName());
    cmd.add(LANG_OPTION);
    cmd.add(transName);
    ProcessBuilder pb = new ProcessBuilder(cmd);
    pb.directory(imageFile.getParentFile());
    pb.redirectErrorStream(true);
    int w;
    try {
      Process process = pb.start();
      // Drain the merged stdout/stderr so the child can never block on a full pipe
      try (InputStream out = process.getInputStream()) {
        byte[] buf = new byte[4096];
        while (out.read(buf) != -1) {
          // output is discarded
        }
      }
      w = process.waitFor();
    } finally {
      tempImage.delete(); // delete the temporary working file
    }
    if (w != 0) {
      String msg;
      switch (w) {
      case 1:
        msg = "Errors accessing files. There may be spaces in your image's filename.";
        break;
      case 29:
        msg = "Cannot recognize the image or its selected region.";
        break;
      case 31:
        msg = "Unsupported image format.";
        break;
      default:
        msg = "Errors occurred.";
      }
      throw new RuntimeException(msg);
    }
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(resultFile), "UTF-8"))) {
      String str;
      while ((str = in.readLine()) != null) {
        strB.append(str).append(EOL);
      }
    } finally {
      resultFile.delete();
    }
    return strB.toString();
  }
}
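The core of the class above is the command line it assembles: executable, input image, output base name, then `-l` and the language code. The sketch below isolates that step so it can be inspected or tested without Tesseract installed. The class and method names are made up for illustration; the argument order follows what the code above passes to the process.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the argument order used by OCRUtil.
final class TesseractCommand {

    /**
     * Builds: <executable> <imageName> <outputBase> -l <lang>
     * tesseract then writes its text to <outputBase>.txt in the working directory.
     */
    static List<String> build(String executable, String imageName,
                              String outputBase, String lang) {
        List<String> cmd = new ArrayList<>();
        cmd.add(executable);
        cmd.add(imageName);
        cmd.add(outputBase);
        cmd.add("-l"); // lowercase letter L
        cmd.add(lang);
        return cmd;
    }
}
```

The resulting list can be handed straight to `new ProcessBuilder(cmd)`, with `directory(...)` set to the folder that contains the image so that the relative file names resolve.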

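In my experiments, running recognition directly on a large image that contains a lot of text tends to give poor results. It works better to follow the approach from a separate post on image cropping: cut out one region of the image first, then recognize only that piece.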


This raises the success rate considerably.
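The cropping post itself is not reproduced here, so the following is only a minimal sketch of the idea using the standard `BufferedImage.getSubimage` call. The class name, the clipping behavior, and the choice to copy the pixels into a new RGB image are assumptions made for illustration.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper: cut a region out of an image before passing it to OCR.
final class RegionCrop {

    /** Returns an independent copy of the part of source covered by region. */
    static BufferedImage crop(BufferedImage source, Rectangle region) {
        Rectangle bounds = new Rectangle(0, 0, source.getWidth(), source.getHeight());
        // Clip the request to the image so getSubimage cannot throw
        Rectangle clipped = bounds.intersection(region);
        if (clipped.isEmpty()) {
            throw new IllegalArgumentException("Region lies outside the image");
        }
        BufferedImage view = source.getSubimage(
                clipped.x, clipped.y, clipped.width, clipped.height);
        // getSubimage shares pixel data with the source, so copy it out
        int[] pixels = view.getRGB(0, 0, clipped.width, clipped.height,
                null, 0, clipped.width);
        BufferedImage copy = new BufferedImage(
                clipped.width, clipped.height, BufferedImage.TYPE_INT_RGB);
        copy.setRGB(0, 0, clipped.width, clipped.height, pixels, 0, clipped.width);
        return copy;
    }
}
```

The cropped image can be saved with `ImageIO.write` and then passed to `recognizeText` like any other image file.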

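That is the offline recognition version. How well it performs varies from image to image, so measure and analyze the results on your own images when you use it. I hope this helps with your learning, and thank you for your support.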