
Converting Word documents to HTML with PHP: an example

程序员文章站 2022-05-29 16:30:37
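Word documents are a poor fit for web pages, and copying their content into a page piece by piece is tedious. If you are still copying by hand, there is a better way: this article walks through an example of converting a Word document to HTML with PHP.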

For a clean conversion from Office formats to PDF or HTML, Microsoft Office on Windows is still the best choice: LibreOffice does not convert perfectly, and WPS has no API.

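First check that the COM module is enabled. If phpinfo() lists a com_dotnet module, it is already enabled. If not, edit php.ini and remove the comment character in front of this line:

com.allow_dcom = true

Then restart the web server. The official PHP documentation says the COM module was built in before PHP 5.4.5 (it gives 5.3.15 as the corresponding cutoff on the 5.3 branch), but in practice that is not always true: the PHP 5.3.39 build downloaded from the official site did not have the COM module built in.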
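
The same check can be done from a script instead of reading the phpinfo() page. This is a minimal sketch, assuming the extension is registered under the name com_dotnet as phpinfo() shows it; com_available() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: reports whether the COM extension is usable.
function com_available()
{
    // extension_loaded() looks at the list of loaded extensions;
    // class_exists() confirms that the COM class itself is defined.
    return extension_loaded('com_dotnet') && class_exists('COM');
}

echo com_available()
    ? "COM is enabled\n"
    : "COM is not enabled, check php.ini\n";
```

Running this once before any conversion gives a clear message instead of a fatal "class not found" error.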

If the module is not built in, also add the following line to php.ini, provided the extension file is present in your ext folder:

extension=php_com_dotnet.dll

Restart again and the module is ready. The code is as follows:

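function word2html($wordname, $htmlname)
{
    // Start Word through COM; new COM() throws a com_exception on failure
    $word = new COM("word.application");
    $word->Visible = 1; // show the Word window; set to 0 to keep it hidden
    // Open the document and save it as HTML (8 = wdFormatHTML)
    $doc = $word->Documents->Open($wordname);
    $doc->SaveAs($htmlname, 8);
    $doc->Close(false); // close without saving changes to the source file
    $word->Quit();
    $word = null; // release the COM object
}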

word2html('D:/www/test/6.docx','D:/www/test/6.html');
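
The call above assumes that the file exists and that Word starts cleanly. The sketch below wraps the same COM calls with basic checks and exception handling. The name word2html_safe() and the true/false return convention are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical wrapper around the same COM calls; returns true on success.
function word2html_safe($wordname, $htmlname)
{
    // Fail early if the source file is missing or COM is not available,
    // so winword.exe is never started for a conversion that cannot work.
    if (!is_file($wordname) || !class_exists('COM')) {
        return false;
    }
    try {
        $word = new COM("word.application");
    } catch (Exception $e) {
        return false; // Word is not installed or could not be started
    }
    $ok = true;
    try {
        $word->Visible = 0;                          // no window on a server
        $doc = $word->Documents->Open(realpath($wordname));
        $doc->SaveAs($htmlname, 8);                  // 8 = wdFormatHTML
        $doc->Close(false);                          // discard changes
    } catch (Exception $e) {
        $ok = false;                                 // e.g. locked or corrupt file
    }
    $word->Quit();                                   // always stop winword.exe
    return $ok;
}

// A missing input file is rejected before Word is started.
var_dump(word2html_safe('D:/www/test/missing.docx', 'D:/www/test/missing.html'));
```

Calling Quit() on every path matters here because a winword.exe process left running keeps its file locks and can block later conversions.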

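Notes:

1. The generated HTML is quite messy when you view its source.

2. The conversion launches winword.exe.

3. If the page keeps loading and never finishes, rename the document and run the conversion again.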
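
Note 3 can be automated by converting a freshly named copy instead of the original file. The article does not say why renaming helps, so this sketch only assumes that a new file name is enough. copy_with_unique_name() is a hypothetical helper.

```php
<?php
// Hypothetical helper: copies a document to a uniquely named file in the
// same directory and returns the new path, or false if the copy fails.
function copy_with_unique_name($path)
{
    if (!is_file($path)) {
        return false;
    }
    $info = pathinfo($path);
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    // uniqid() gives a time-based suffix so each attempt gets a new name.
    $copy = $info['dirname'] . '/' . $info['filename'] . '_' . uniqid() . $ext;
    return copy($path, $copy) ? $copy : false;
}

// Demo with a throwaway file standing in for a real .docx.
$src = tempnam(sys_get_temp_dir(), 'doc');
file_put_contents($src, 'demo');
$copy = copy_with_unique_name($src);
echo $copy === false ? "copy failed\n" : "convert this copy instead: $copy\n";
if ($copy !== false) {
    unlink($copy);
}
unlink($src);
```

The returned path would then be passed to word2html() in place of the original, and the copy deleted afterwards.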

A supplementary example, a function that cleans up the HTML Word produces:

function lego_clean($text) {
    // $text is an array of lines, e.g. from file(); preg_replace is used
    // because eregi_replace was removed in PHP 7
    $text = implode("\r", $text);

    // normalize white space
    $text = preg_replace('/[[:space:]]+/', ' ', $text);
    $text = str_replace("> <", ">\r\r<", $text);
    $text = str_replace("<br>", "<br>\r", $text);

    // remove everything before <body>
    $body = strstr($text, "<body");
    if ($body !== false) {
        $text = $body;
    }

    // reduce indented paragraphs to plain <p> or <blockquote>
    $text = preg_replace('/<p [^>]*BodyTextIndent[^>]*>([^\r\n|]*)<\/p>/i', '<p>$1</p>', $text);
    $text = preg_replace('/<p [^>]*margin-left[^>]*>([^\r\n|]*)<\/p>/i', '<blockquote>$1</blockquote>', $text);
    $text = str_replace("&nbsp;", "", $text);

    // clean up whatever is left inside <p> and <li>
    $text = preg_replace('/<p [^>]*>/i', '<p>', $text);
    $text = preg_replace('/<li [^>]*>/i', '<li>', $text);

    // kill unwanted tags
    $text = preg_replace('/<\/?span[^>]*>/i', '', $text);
    $text = preg_replace('/<\/?body[^>]*>/i', '', $text);
    $text = preg_replace('/<\/?div[^>]*>/i', '', $text);
    $text = preg_replace('/<![^>]*>/', '', $text);
    $text = preg_replace('/<\/?[a-z]:[^>]*>/i', '', $text);

    // kill style and on mouse* tags
    $text = preg_replace('/([ \f\r\t\n\'"])style=[^>]+/i', '$1', $text);
    $text = preg_replace('/([ \f\r\t\n\'"])on[a-z]+=[^>]+/i', '$1', $text);

    // remove empty paragraphs
    $text = str_replace("<p></p>", "", $text);

    // remove closing </html>
    $text = str_replace("</html>", "", $text);

    // clean up white space again
    $text = preg_replace('/[[:space:]]+/', ' ', $text);
    $text = str_replace("> <", ">\r\r<", $text);
    $text = str_replace("<br>", "<br>\r", $text);

    return $text;
}
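
To see what rules of this kind do, the sketch below applies a handful of them to a small fragment written in the style of Word's HTML output. clean_word_fragment() and the sample markup are hypothetical and only illustrate the idea. It is not the full function above.

```php
<?php
// Hypothetical demo helper: applies a few Word-cleanup rules to a string.
function clean_word_fragment($html)
{
    // Collapse <p ...> with attributes down to a bare <p>.
    $html = preg_replace('/<p [^>]*>/i', '<p>', $html);
    // Drop <span> and </span>, keeping their content.
    $html = preg_replace('/<\/?span[^>]*>/i', '', $html);
    // Drop Office namespace tags such as <o:p> and </o:p>.
    $html = preg_replace('/<\/?[a-z]:[^>]*>/i', '', $html);
    // Remove non-breaking-space entities, then paragraphs left empty.
    $html = str_replace('&nbsp;', '', $html);
    return str_replace('<p></p>', '', $html);
}

$in = '<p class=MsoNormal style="margin:0"><span lang=EN-US>Hello<o:p></o:p></span></p>'
    . '<p class=MsoNormal><o:p>&nbsp;</o:p></p>';
echo clean_word_fragment($in), "\n"; // prints <p>Hello</p>
```

Regular expressions are enough for this predictable, machine-generated markup, but they are not a general HTML parser, so the output is worth checking on real documents.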

