Converting Word to HTML in Java with POI

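This article uses POI to convert Word documents to HTML. Both .doc and .docx files are supported, and images and styles are preserved in the converted output.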
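Because .doc and .docx go through different converters, a caller usually has to decide which path to take first. The following is a minimal sketch that decides by file extension only; `WordFormatDetector`, `WordFormat`, and `detect` are hypothetical names introduced for illustration and are not part of POI.

```java
import java.util.Locale;

// Hypothetical helper: picks the conversion path from the file name alone.
class WordFormatDetector {
    enum WordFormat { DOC, DOCX, UNKNOWN }

    static WordFormat detect(String fileName) {
        if (fileName == null) {
            return WordFormat.UNKNOWN;
        }
        // Compare case-insensitively so "REPORT.DOCX" is also recognized.
        String lower = fileName.toLowerCase(Locale.ROOT);
        // Check ".docx" before ".doc"; the order keeps the intent explicit.
        if (lower.endsWith(".docx")) {
            return WordFormat.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return WordFormat.DOC;
        }
        return WordFormat.UNKNOWN;
    }
}
```

A caller could then route `DOC` to the .doc conversion method and `DOCX` to the .docx one. An extension check does not inspect the file content, so a renamed file would still be misrouted.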

1. Add the Maven dependencies

 
 org.apache.poi 
 poi 
 3.14 
 
 
 org.apache.poi 
 poi-scratchpad 
 3.14 
 
 
 org.apache.poi 
 poi-ooxml 
 3.14 
 
 
 fr.opensagres.xdocreport 
 xdocreport 
 1.0.6 
 
 
 org.apache.poi 
 poi-ooxml-schemas 
 3.14 
 
 
 org.apache.poi 
 ooxml-schemas 
 1.3 
 

2. Conversion code

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.xwpf.converter.core.BasicURIResolver;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.w3c.dom.Document;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;

public class Test {
  // Convert a .doc file to HTML
  void docToHtml() throws Exception {
    String sourceFileName = "C:\\doc\\test.doc";
    String targetFileName = "C:\\html\\test.html";
    // This directory must already exist; FileOutputStream does not create it
    String imagePathStr = "C:\\html\\image\\";
    HWPFDocument wordDocument;
    // Close the input stream once the document has been read
    try (FileInputStream in = new FileInputStream(sourceFileName)) {
      wordDocument = new HWPFDocument(in);
    }
    Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(document);
    // Save each picture and return its path relative to the HTML file
    wordToHtmlConverter.setPicturesManager((content, pictureType, name, width, height) -> {
      try (FileOutputStream out = new FileOutputStream(imagePathStr + name)) {
        out.write(content);
      } catch (Exception e) {
        e.printStackTrace();
      }
      return "image/" + name;
    });
    wordToHtmlConverter.processDocument(wordDocument);
    Document htmlDocument = wordToHtmlConverter.getDocument();
    DOMSource domSource = new DOMSource(htmlDocument);
    StreamResult streamResult = new StreamResult(new File(targetFileName));

    // Serialize the DOM tree as indented UTF-8 HTML
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer serializer = tf.newTransformer();
    serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    serializer.setOutputProperty(OutputKeys.METHOD, "html");
    serializer.transform(domSource, streamResult);
  }

  // Convert a .docx file to HTML
  public void docxToHtml() throws Exception {
    String sourceFileName = "D:\\ac\\00.docx";
    String targetFileName = "D:\\ac\\test.html";
    String imagePathStr = "D:\\ac\\image\\";
    // try-with-resources closes both the input stream and the writer
    try (FileInputStream in = new FileInputStream(sourceFileName);
         OutputStreamWriter outputStreamWriter =
             new OutputStreamWriter(new FileOutputStream(targetFileName), "utf-8")) {
      XWPFDocument document = new XWPFDocument(in);
      XHTMLOptions options = XHTMLOptions.create();
      // Folder where extracted images are stored
      options.setExtractor(new FileImageExtractor(new File(imagePathStr)));
      // Path prefix used for images inside the HTML
      options.URIResolver(new BasicURIResolver("image"));
      XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
      xhtmlConverter.convert(document, outputStreamWriter, options);
    }
  }
}
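The last step of docToHtml hands a DOM tree to a JDK Transformer, which writes it out as HTML. The sketch below isolates that step using only the JDK, so it can be tried without POI. It also creates the target's parent directory first, since the JDK file streams do not create missing directories. `HtmlDomWriter`, `sampleDocument`, and `write` are hypothetical names, and the small hand-built DOM is only a stand-in for the tree that WordToHtmlConverter would produce.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper that isolates the DOM-to-HTML serialization step.
class HtmlDomWriter {

    // Builds a tiny stand-in DOM: <html><body><p>text</p></body></html>.
    static Document sampleDocument(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent(text);
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the DOM as indented UTF-8 HTML, creating parent directories first.
    static void write(Document doc, Path target) {
        try {
            Path parent = target.toAbsolutePath().getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            try (OutputStream out = Files.newOutputStream(target)) {
                serializer.transform(new DOMSource(doc), new StreamResult(out));
            }
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `HtmlDomWriter.write(HtmlDomWriter.sampleDocument("hello"), target)` with any writable `target` path produces a small HTML file. The same directory preparation could be applied to the image folder before pictures are saved.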

Demo: https://www.xiaoyun.studio/app/preview.html

That is all for this article. I hope it helps with your studies, and thank you for supporting 考高分网.
