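Converting PDF files to high-resolution images with Java libraries
I recently needed to convert PDF files into high-resolution images, using the pdfbox and fontbox libraries. The renderImageWithDPI method lets you specify the rendering resolution. The higher the DPI, the sharper the image, but the conversion takes longer and the output file is larger.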
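To get a feel for how quickly the output grows with DPI, here is a minimal sketch of the arithmetic. PDF page sizes are expressed in points (1/72 inch), so a page rendered at a given DPI is scaled by roughly dpi / 72. The class and method names (DpiMath, pixels, rawRgbBytes) are made up for this illustration and are not part of PDFBox, and the exact rounding PDFBox applies may differ by a pixel.

```java
final class DpiMath {
    // Approximate pixel length of a page side given its length in points (1 pt = 1/72 inch).
    static int pixels(double lengthInPoints, double dpi) {
        return (int) Math.round(lengthInPoints * dpi / 72.0);
    }

    // Approximate uncompressed size of an RGB raster (3 bytes per pixel).
    static long rawRgbBytes(double widthPt, double heightPt, double dpi) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * 3L;
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 points.
        for (double dpi : new double[] {72, 90, 200, 300}) {
            System.out.println(dpi + " dpi: " + pixels(612, dpi) + " x " + pixels(792, dpi)
                    + " px, about " + rawRgbBytes(612, 792, dpi) / (1024 * 1024) + " MiB uncompressed");
        }
    }
}
```

Doubling the DPI quadruples the number of pixels, which is why both rendering time and file size climb steeply at higher settings.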
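Note: as Adobe's software grows more capable and supports more and more formats, Java libraries cannot always keep up, so PDFs that use newer formats may fail to convert.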
1 Add the dependencies
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/fontbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>
2 The code
// Imports this snippet needs. FileUtil and IoUtil are the author's own helper classes;
// log is assumed to be an SLF4J-style logger and ArrayUtils to come from commons-lang3.
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.commons.lang3.ArrayUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

public static void convertPdf2Image(String pdfPath, String imageDirPath) {
    log.info("start convert pdf file:[{}] to image path:[{}]", pdfPath, imageDirPath);
    if (!new File(pdfPath).exists()) {
        log.info("pdfFileName:[{}] not exist", pdfPath);
        return;
    }
    if (!new File(imageDirPath).exists()) {
        log.info("imageDir:[{}] not exist", imageDirPath);
        return;
    }
    byte[] pdfContent = FileUtil.getFileContentByte(pdfPath);
    String fileName = FileUtil.getFileName(pdfPath);
    float dpi = 200;
    convertPdf2Image(pdfContent, fileName, imageDirPath, dpi);
    log.info("convert pdf file:[{}] to image success", fileName);
}

private static void convertPdf2Image(byte[] pdfContent, String pdfFileName, String imageDirPath, float dpi) {
    log.info("convert pdfFileName:[{}] to imageDir:[{}] with dpi:[{}]", pdfFileName, imageDirPath, dpi);
    if (ArrayUtils.isEmpty(pdfContent)) {
        return;
    }
    // Use at least 90 DPI so the output stays legible
    if (dpi < 90) {
        dpi = 90;
    }
    // Output files are named <imageDir>/<pdfFileName>_<pageIndex>.png
    String baseDir = imageDirPath;
    if (baseDir.endsWith("/") || baseDir.endsWith("\\")) {
        baseDir += pdfFileName + "_";
    } else {
        baseDir += File.separator + pdfFileName + "_";
    }
    PDDocument document = null;
    BufferedOutputStream outputStream = null;
    try {
        document = PDDocument.load(pdfContent);
        int pageCount = document.getNumberOfPages();
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        String imgPath;
        // Render each page (0-based index) and write it out as a PNG
        for (int i = 0; i < pageCount; i++) {
            imgPath = baseDir + i + ".png";
            outputStream = new BufferedOutputStream(new FileOutputStream(imgPath));
            BufferedImage image = pdfRenderer.renderImageWithDPI(i, dpi, ImageType.RGB);
            ImageIO.write(image, "png", outputStream);
            outputStream.close();
            log.info("convert to png, total[{}], now[{}], ori:[{}], des[{}]", pageCount, i + 1, pdfFileName, imgPath);
        }
    } catch (IOException e) {
        log.error("convert pdf to image error, pdfFileName:" + pdfFileName, e);
    } finally {
        IoUtil.closeSilently(outputStream);
        IoUtil.closeSilently(document);
    }
}

// Code for IoUtil.closeSilently
public static void closeSilently(Closeable io) {
    if (io != null) {
        try {
            io.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
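As a possible variation on the per-page write step, the sketch below uses try-with-resources so the stream is closed even when the write fails, and builds the output path with java.nio instead of string concatenation. It uses only the JDK. PageImageWriter, pageImagePath and writePng are hypothetical names introduced for this example, and the BufferedImage would come from renderImageWithDPI in the real conversion.

```java
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

final class PageImageWriter {
    // Builds <imageDir>/<pdfFileName>_<pageIndex>.png, the same naming scheme as the code above.
    static Path pageImagePath(Path imageDir, String pdfFileName, int pageIndex) {
        return imageDir.resolve(pdfFileName + "_" + pageIndex + ".png");
    }

    // Writes one rendered page as PNG; the stream is closed automatically on success or failure.
    static void writePng(BufferedImage image, Path target) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            // ImageIO.write returns false when no suitable writer is registered.
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("No PNG writer available for " + target);
            }
        }
    }
}
```

Inside the page loop this would be called as writePng(image, pageImagePath(dir, pdfFileName, i)), which removes the need for the shared outputStream variable and the explicit close calls.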
Problems encountered in practice
1) ERROR o.a.p.contentstream.PDFStreamEngine 911 - Cannot read JBIG2 image: jbig2-imageio is not installed
2) Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
Both problems are solved by adding the JAI and JBIG2 ImageIO plugins, that is, by adding jai-imageio-core, jai-imageio-jpeg2000 and jbig2-imageio as dependencies:
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.github.jai-imageio/jai-imageio-jpeg2000 -->
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/jbig2-imageio -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.2</version>
</dependency>
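One way to check whether the plugins were actually picked up is to ask ImageIO which readers are registered. The sketch below uses only the JDK's javax.imageio API. The class name ImageIoPluginCheck is made up for this example, and the format names "JBIG2" and "JPEG2000" are assumed to be the names the plugins register under, so treat a false result as a hint rather than proof.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

final class ImageIoPluginCheck {
    // Returns true if at least one ImageIO reader is registered for the given format name.
    static boolean hasReader(String formatName) {
        // Pick up plugins that were added to the classpath after ImageIO was first initialised.
        ImageIO.scanForPlugins();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "png" ships with the JDK; the other two depend on the plugins being on the classpath.
        for (String format : new String[] {"png", "JBIG2", "JPEG2000"}) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
    }
}
```

Running this once at startup, before converting any PDFs, makes a missing plugin show up immediately instead of as an error in the middle of a batch.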
Sample files that trigger these problems
https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/000208-p1.pdf
https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/001659-p14.pdf
https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/main%20doc.pdf
References
https://*.com/questions/42169154/pdfbox1-8-12-convert-pdf-to-white-page-image
https://*.com/questions/20424796/pdf-box-generating-blank-images-due-to-jbig2-images-in-it
https://blog.csdn.net/qq_15801963/article/details/80746830
https://my.oschina.net/u/2345654/blog/1058192