Converting PDF to HTML with Aspose.PDF for .NET, Examples Explained (2): Splitting CSS into Pages

程序员文章站 (Programmer Article Station), 2022-04-09 15:35:07

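Aspose.PDF for .NET is an advanced PDF processing and parsing API for performing document management and manipulation tasks in cross-platform applications. The API can be used to generate, modify, convert, render, secure, and print PDF documents without Adobe Acrobat. It also provides PDF compression options, table creation and manipulation, graphics and image functions, extensive hyperlink functionality, stamp and watermark tasks, extended security controls, and custom font handling.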

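PDF is one of the most popular document formats today, and many applications use it as their final output. Because it supports multiple data types and is portable, it is a preferred format for creating and sharing content. As a .NET developer building document management applications, you may want to embed processing features that read PDF documents and convert them to other file formats, such as HTML.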

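In this article, we explore and demonstrate the conversion features of the Aspose.PDF for .NET API, reading PDF files and converting them to HTML with a variety of options.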

PDF to HTML: Splitting CSS into Pages

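When a PDF file is converted to HTML, a CSS file containing the formatting information is created. Aspose.PDF for .NET can split the output HTML into pages, and it can also split the CSS into multiple pages.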

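The HtmlSaveOptions class has a property named SplitIntoPages, which splits the output HTML into pages when the files are generated. You may also want the CSS split per page rather than written as a single CSS file. For this purpose, a flag named SplitCssIntoPages was introduced on the HtmlSaveOptions class. When this property is set to true, the converter splits the output CSS into parts/pages that correspond to the individual HTML pages it creates. The following code snippet shows how to use the flag.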

// Namespaces the snippet relies on.
using System.IO;
using Aspose.Pdf;

// Path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion_PDFToHTMLFormat();

// 1) Clean up the target folders
string htmlFile = Path.GetFullPath(dataDir + "resultant.html");
string imagesDir = Path.GetDirectoryName(htmlFile) + @"\35942_files";
string cssDir = Path.GetDirectoryName(htmlFile) + @"\35942_css_files";
if (Directory.Exists(imagesDir)) { Directory.Delete(imagesDir, true); }
if (Directory.Exists(cssDir)) { Directory.Delete(cssDir, true); }

// 2) Create the document to convert
Document pdfDocument = new Document(dataDir + "input.pdf");

// 3) Tune the conversion options
HtmlSaveOptions options = new HtmlSaveOptions();
options.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsPngImagesEmbeddedIntoSvg; // <- to get compatibility with previous behavior and therefore same result of tests
// Split the HTML output into pages
options.SplitIntoPages = true;
// Split the CSS into pages
options.SplitCssIntoPages = true;
options.CustomCssSavingStrategy = new HtmlSaveOptions.CssSavingStrategy(Strategy_4_CSS_MULTIPAGE_SAVING_RIGHT_WAY);
options.CustomStrategyOfCssUrlCreation = new HtmlSaveOptions.CssUrlMakingStrategy(Strategy_5_CSS_MAKING_CUSTOM_URL_FOR_MULTIPAGING);
// 4) Run the conversion
pdfDocument.Save(htmlFile, options);

// Called once per CSS part: writes that part to its own file, named by part number.
private static void Strategy_4_CSS_MULTIPAGE_SAVING_RIGHT_WAY(HtmlSaveOptions.CssSavingInfo partSavingInfo)
{
    string dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion_PDFToHTMLFormat();

    string outPath = dataDir + "style_xyz_page" + partSavingInfo.CssNumber.ToString() + ".css";
    // Read the whole CSS stream and write it to disk.
    BinaryReader reader = new BinaryReader(partSavingInfo.ContentStream);
    File.WriteAllBytes(outPath, reader.ReadBytes((int)partSavingInfo.ContentStream.Length));
}

// Returns the URL template used to reference the CSS parts; {0} is the placeholder for the part number.
private static string Strategy_5_CSS_MAKING_CUSTOM_URL_FOR_MULTIPAGING(HtmlSaveOptions.CssUrlRequestInfo requestInfo)
{
    return "/document-viewer/getcss?cssid=4544554445_page{0}";
}

If you have any questions or requests, feel free to join the Aspose technical discussion group (642018183).
