Fixing garbled Chinese text when converting PPT to PDF in Java
程序员文章站
2024-03-06 20:08:26
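The conversion works by rendering each PPT slide to an image and then generating the PDF from those images. One problem came up along the way: with both .ppt and .pptx files, Chinese text came out garbled, rendered as empty boxes. For .ppt files a fix is easy to find online: set all text to one uniform font. With .pptx, a slide that uses a single Chinese font renders fine, but a slide that mixes 微软雅黑 (Microsoft YaHei) and 宋体 (SimSun) ends up with part of the Chinese text as boxes. My suspicion is that POI only reads the first font when it processes the slide, so text in the other Chinese fonts comes out garbled.
I searched Baidu and Google for a long time. I saw a mention that someone had reported this as a bug on the Apache site, and that the reply was that it is a font problem. I think POI could probably handle this itself by reading the original font and setting it as the current font, although that would likely cost a fair amount of performance. I suspect many people have spent as much time as I did hunting for a solution, and there is almost nothing ready-made online. I worked through it step by step and eventually found a fix. I won't cover the .ppt format, since solutions for it are easy to find; for .pptx I could not find one anywhere.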
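If the cause really is a font problem, as the Apache reply suggests, one thing worth checking is whether the font you plan to force is installed on the machine that renders the slides. The sketch below is only an illustration under that assumption: `FontPicker` and `firstInstalled` are hypothetical names, not part of POI or iText. It picks the first preferred family name that appears in a list of installed families.

```java
import java.util.Collection;
import java.util.List;

class FontPicker {
    /**
     * Returns the first preferred font family that is installed
     * (case-insensitive match), or null if none of them is.
     * The returned value is the name exactly as the system reports it.
     */
    static String firstInstalled(List<String> preferred, Collection<String> installed) {
        for (String family : preferred) {
            for (String candidate : installed) {
                if (candidate.equalsIgnoreCase(family)) {
                    return candidate;
                }
            }
        }
        return null;
    }
}
```

The installed list can come from `GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames()`. Family names may be reported differently depending on the locale (for example 宋体 versus SimSun), so it is probably safest to pass both spellings in the preferred list.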
The pptx rendered to an image before the fix:
The pptx rendered to an image after the fix:
Solution:
Read every shape and set its text to one uniform font. The code I found online for this did not work, so here is my own modified version:
for (XSLFShape shape : slide[i].getShapes()) {
    if (shape instanceof XSLFTextShape) {
        XSLFTextShape txtShape = (XSLFTextShape) shape;
        System.out.println("txtshape" + (i + 1) + ":" + txtShape.getShapeName());
        System.out.println("text:" + txtShape.getText());
        // Force every text run in this shape to the same Chinese font
        for (XSLFTextParagraph textPara : txtShape.getTextParagraphs()) {
            List<XSLFTextRun> textRunList = textPara.getTextRuns();
            for (XSLFTextRun textRun : textRunList) {
                textRun.setFontFamily("宋体");
            }
        }
    }
}
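Setting every run to 宋体 also changes the font of text that contains no Chinese at all. If that matters for your slides, one possible refinement is to override the font only on runs that contain Han characters. This is a sketch, not something I have verified against every deck; `HanText` and `containsHan` are hypothetical helper names, and only standard `java.lang.Character` calls are used.

```java
class HanText {
    /** Returns true if the text contains at least one Han (CJK ideograph) code point. */
    static boolean containsHan(String text) {
        if (text == null) {
            return false;
        }
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN) {
                return true;
            }
            // Advance by one or two chars, depending on whether this is a surrogate pair
            i += Character.charCount(codePoint);
        }
        return false;
    }
}
```

Inside the inner loop, the call would then become `if (HanText.containsHan(textRun.getText())) { textRun.setFontFamily("宋体"); }`. Whether this still avoids the boxes on a slide with mixed fonts is something to test on your own files.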
The full code is below (apart from my own fix above, most of it is code from *):
// Imports for POI 3.9 (HSLF/XSLF) and iText 5.x
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.List;

import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.RichTextRun;
import org.apache.poi.hslf.usermodel.SlideShow;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextRun;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;

public static void convertPPTToPDF(String sourcePath, String destinationPath, String fileType) throws Exception {
    FileInputStream inputStream = new FileInputStream(sourcePath);
    // Render at 2x so the slide images stay sharp in the PDF
    double zoom = 2;
    AffineTransform at = new AffineTransform();
    at.setToScale(zoom, zoom);
    Document pdfDocument = new Document();
    PdfWriter pdfWriter = PdfWriter.getInstance(pdfDocument, new FileOutputStream(destinationPath));
    PdfPTable table = new PdfPTable(1);
    // The document is opened inside each branch, after the page size is set,
    // so that the first page also gets the slide dimensions
    Dimension pgSize = null;
    Image slideImage = null;
    BufferedImage img = null;
    if (fileType.equalsIgnoreCase(".ppt")) {
        SlideShow ppt = new SlideShow(inputStream);
        inputStream.close();
        pgSize = ppt.getPageSize();
        Slide slide[] = ppt.getSlides();
        pdfDocument.setPageSize(new Rectangle((float) pgSize.getWidth(), (float) pgSize.getHeight()));
        pdfWriter.open();
        pdfDocument.open();
        for (int i = 0; i < slide.length; i++) {
            // .ppt fix: force every rich text run to one font
            TextRun[] truns = slide[i].getTextRuns();
            for (int k = 0; k < truns.length; k++) {
                RichTextRun[] rtruns = truns[k].getRichTextRuns();
                for (int l = 0; l < rtruns.length; l++) {
                    // int index = rtruns[l].getFontIndex();
                    // String name = rtruns[l].getFontName();
                    rtruns[l].setFontIndex(1);
                    rtruns[l].setFontName("宋体");
                }
            }
            img = new BufferedImage((int) Math.ceil(pgSize.width * zoom), (int) Math.ceil(pgSize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgSize.width, pgSize.height));
            slide[i].draw(graphics);
            graphics.dispose();
            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    }
    if (fileType.equalsIgnoreCase(".pptx")) {
        XMLSlideShow ppt = new XMLSlideShow(inputStream);
        inputStream.close();
        pgSize = ppt.getPageSize();
        XSLFSlide slide[] = ppt.getSlides();
        pdfDocument.setPageSize(new Rectangle((float) pgSize.getWidth(), (float) pgSize.getHeight()));
        pdfWriter.open();
        pdfDocument.open();
        for (int i = 0; i < slide.length; i++) {
            // .pptx fix: force every text run in every text shape to one font
            for (XSLFShape shape : slide[i].getShapes()) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txtShape = (XSLFTextShape) shape;
                    // System.out.println("txtshape" + (i + 1) + ":" + txtShape.getShapeName());
                    // System.out.println("text:" + txtShape.getText());
                    for (XSLFTextParagraph textPara : txtShape.getTextParagraphs()) {
                        List<XSLFTextRun> textRunList = textPara.getTextRuns();
                        for (XSLFTextRun textRun : textRunList) {
                            textRun.setFontFamily("宋体");
                        }
                    }
                }
            }
            img = new BufferedImage((int) Math.ceil(pgSize.width * zoom), (int) Math.ceil(pgSize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgSize.width, pgSize.height));
            slide[i].draw(graphics);
            // Uncomment to dump each slide as a JPEG for debugging:
            // FileOutputStream out = new FileOutputStream("src/main/resources/test" + i + ".jpg");
            // javax.imageio.ImageIO.write(img, "jpg", out);
            graphics.dispose();
            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    }
    pdfDocument.add(table);
    pdfDocument.close();
    pdfWriter.close();
    System.out.println("Powerpoint file converted to PDF successfully");
}
Maven configuration:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <!-- <version>3.13</version> -->
    <version>3.9</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <!-- <version>3.10-final</version> -->
    <version>3.9</version>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.7</version>
</dependency>
<dependency>
    <groupId>com.itextpdf.tool</groupId>
    <artifactId>xmlworker</artifactId>
    <version>5.5.7</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <!-- <version>3.12</version> -->
    <version>3.9</version>
</dependency>
That is the fix for garbled Chinese text when converting PPT to PDF in Java. I hope it helps with your own work.