
Converting PDF Files to High-Resolution Images with a Java Library

程序员文章站 2022-08-02 10:38:42

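I recently needed to convert PDF files into high-resolution images, and used the PDFBox and FontBox libraries to do it. The renderImageWithDPI method lets you specify the rendering resolution: the higher the DPI, the sharper and larger the resulting image, and the longer the conversion takes.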
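
To get a feel for how the DPI setting drives output size: PDF page dimensions are expressed in points (1/72 inch), so each side of the rendered bitmap is roughly points × DPI / 72 pixels. The sketch below uses only the JDK. The class name DpiEstimate and its methods are hypothetical, the 4-bytes-per-pixel figure assumes an uncompressed int-packed RGB image in memory, and PDFBox's own rounding may differ by a pixel.

```java
// Hypothetical helper, not part of PDFBox: estimates rendered size from page size and DPI.
final class DpiEstimate {
    private static final double POINTS_PER_INCH = 72.0;

    /** Approximate pixel length of one side, given its length in PDF points. */
    static long pixels(double points, double dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    /** Approximate in-memory size in bytes, assuming 4 bytes per pixel. */
    static long approxBytes(double widthPt, double heightPt, double dpi) {
        return pixels(widthPt, dpi) * pixels(heightPt, dpi) * 4L;
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 points.
        for (double dpi : new double[] {72, 90, 200, 300}) {
            System.out.printf("%.0f dpi -> %d x %d px, ~%d MB in memory%n",
                    dpi, pixels(612, dpi), pixels(792, dpi),
                    approxBytes(612, 792, dpi) / (1024 * 1024));
        }
    }
}
```

Doubling the DPI quadruples the pixel count, which is why rendering time and file size grow quickly at higher settings.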

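Note: Adobe's software keeps getting more capable and supports more and more formats, so Java libraries cannot always convert every file. Files that use newer formats may therefore fail to convert.

1 Add the dependencies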

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/fontbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>

 

2 The code

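import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.commons.lang3.ArrayUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

// log is assumed to be an SLF4J-style logger; FileUtil and IoUtil are the author's own helper classes.
// ArrayUtils comes from Apache Commons Lang 3, which must also be on the classpath.

public static void convertPdf2Image(String pdfPath, String imageDirPath) {
        log.info("start convert pdf file:[{}] to image path:[{}]", pdfPath, imageDirPath);
        if (!new File(pdfPath).exists()) {
            log.info("pdfFileName:[{}] not exist", pdfPath);
            return;
        }
        if (!new File(imageDirPath).exists()) {
            log.info("imageDir:[{}] not exist", imageDirPath);
            return;
        }
        byte[] pdfContent = FileUtil.getFileContentByte(pdfPath);
        String fileName = FileUtil.getFileName(pdfPath);
        float dpi = 200;
        convertPdf2Image(pdfContent, fileName, imageDirPath, dpi);
        log.info("convert pdf file:[{}] to image success", fileName);
    }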
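
The method above gives up when the output directory is missing. If creating it on demand is acceptable, a variant along the following lines could be used instead. This is only a sketch built on the JDK's java.nio.file API; the class OutputDirs and the methods ensureDirectory and pageImagePath are hypothetical names, not part of the original code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: prepares the image output directory and per-page file paths.
final class OutputDirs {
    /**
     * Creates the directory (and any missing parents) if needed.
     * Throws IOException if the path exists but is not a directory.
     */
    static Path ensureDirectory(String imageDirPath) throws IOException {
        Path dir = Paths.get(imageDirPath);
        // createDirectories does nothing when the directory already exists.
        return Files.createDirectories(dir);
    }

    /** Builds the per-page file path: <dir>/<pdfFileName>_<pageIndex>.png */
    static Path pageImagePath(Path dir, String pdfFileName, int pageIndex) {
        // resolve() inserts the platform separator, so no manual "/" or "\\" checks are needed.
        return dir.resolve(pdfFileName + "_" + pageIndex + ".png");
    }
}
```

Whether a missing directory should be created or treated as an error depends on the caller; the original code chooses the stricter behavior.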

private static void convertPdf2Image(byte[] pdfContent, String pdfFileName, String imageDirPath, float dpi) {
        log.info("convert pdfFileName:[{}] to imageDir:[{}] with dpi:[{}]", pdfFileName, imageDirPath, dpi);
        if (ArrayUtils.isEmpty(pdfContent)) {
            return;
        }
        // Enforce a minimum of 90 DPI so the output stays legible
        if (dpi < 90) {
            dpi = 90;
        }
        // Output files are named <imageDirPath>/<pdfFileName>_<pageIndex>.png
        String baseDir = imageDirPath;
        if (baseDir.endsWith("/") || baseDir.endsWith("\\")) {
            baseDir += pdfFileName + "_";
        } else {
            baseDir += File.separator + pdfFileName + "_";
        }
        PDDocument document = null;
        try {
            document = PDDocument.load(pdfContent);
            int pageCount = document.getNumberOfPages();
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            for (int i = 0; i < pageCount; i++) {
                String imgPath = baseDir + i + ".png";
                // Render before opening the file, so a rendering failure leaves no empty file behind
                BufferedImage image = pdfRenderer.renderImageWithDPI(i, dpi, ImageType.RGB);
                // try-with-resources closes each page's stream even if ImageIO.write throws
                try (BufferedOutputStream outputStream = new BufferedOutputStream(new FileOutputStream(imgPath))) {
                    ImageIO.write(image, "png", outputStream);
                }
                log.info("convert to png, total[{}], now[{}], ori:[{}], des[{}]", pageCount, i + 1, pdfFileName, imgPath);
            }
        } catch (IOException e) {
            log.error("convert pdf to image error, pdfFileName:" + pdfFileName, e);
        } finally {
            IoUtil.closeSilently(document);
        }
    }

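// Code for IoUtil.closeSilently (requires java.io.Closeable and java.io.IOException)
public static void closeSilently(Closeable io) {
        if (io != null) {
            try {
                io.close();
            } catch (IOException e) {
                // Best-effort cleanup: report the failure but do not propagate it
                e.printStackTrace();
            }
        }
    }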
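
Since Java 7, try-with-resources can take over most of what a closeSilently helper does: resources declared in the try header are closed automatically, in reverse order of declaration, even when the body throws. PDDocument implements Closeable, so it can be managed the same way. A minimal JDK-only sketch follows; TryWithResourcesDemo and TrackedResource are hypothetical stand-ins for the real document and stream.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: shows when and in what order try-with-resources closes things.
final class TryWithResourcesDemo {
    /** Stand-in for a stream or document; records when it is closed. */
    static final class TrackedResource implements Closeable {
        private final String name;
        private final List<String> events;

        TrackedResource(String name, List<String> events) {
            this.name = name;
            this.events = events;
        }

        @Override
        public void close() {
            events.add("closed " + name);
        }
    }

    /** Opens two resources, fails inside the body, and returns what happened. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        try (TrackedResource document = new TrackedResource("document", events);
             TrackedResource output = new TrackedResource("output", events)) {
            events.add("working");
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // Both resources have already been closed by the time this runs.
            events.add("caught " + e.getMessage());
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [working, closed output, closed document, caught simulated failure]
        System.out.println(run());
    }
}
```

One difference from closeSilently is that an exception thrown by close() is not swallowed: it either propagates or is attached to the primary exception as a suppressed exception.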

Problems encountered in practice

1) ERROR o.a.p.contentstream.PDFStreamEngine 911 - Cannot read JBIG2 image: jbig2-imageio is not installed

2) Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed

Both problems are resolved by adding the JAI and JBIG2 plugins, that is, the jai-imageio-core, jai-imageio-jpeg2000, and jbig2-imageio dependencies:

<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.github.jai-imageio/jai-imageio-jpeg2000 -->
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/jbig2-imageio -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.2</version>
</dependency>
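
Image I/O plugins of this kind are picked up by javax.imageio once they are on the classpath, so one way to confirm that they are active is to ask ImageIO which readers it knows about. The following is a sketch under the assumption that the plugins register under the format names JBIG2 and JPEG2000 (the names that appear in the error messages); the printed list of reader format names shows what is actually registered in a given environment. The class ImageReaderCheck is a hypothetical name.

```java
import java.util.Arrays;
import javax.imageio.ImageIO;

// Hypothetical startup check, not part of PDFBox.
final class ImageReaderCheck {
    /** True if at least one registered ImageIO reader claims the given format name. */
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Rescan the classpath in case plugins became visible after ImageIO was initialized.
        ImageIO.scanForPlugins();
        // "png" ships with the JDK; the other two names are assumed plugin format names.
        for (String format : new String[] {"png", "JBIG2", "JPEG2000"}) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
        System.out.println("All reader format names: "
                + Arrays.toString(ImageIO.getReaderFormatNames()));
    }
}
```

Running this once before and once after adding the dependencies makes it easy to see whether the plugins were actually loaded.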

Sample files that trigger the problems

https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/000208-p1.pdf

https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/001659-p14.pdf

https://github.com/crazycodelove/studentservice/blob/master/sys/src/main/resources/pdffile/main%20doc.pdf

 

 

References

https://*.com/questions/42169154/pdfbox1-8-12-convert-pdf-to-white-page-image

https://*.com/questions/20424796/pdf-box-generating-blank-images-due-to-jbig2-images-in-it

https://blog.csdn.net/qq_15801963/article/details/80746830

https://my.oschina.net/u/2345654/blog/1058192