Contents
1 Converting to an HTML file
2 Converting to an XML file
3 Converting to a text file
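POI also supports format conversion for Word doc files. A doc document can be converted to an HTML file, to an XML file in the XSL-FO vocabulary, or to a plain-text file. Each of these conversions is carried out by a specific subclass of AbstractWordConverter.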
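1 Converting to an HTML file
Converting a doc document to HTML is done with the WordToHtmlConverter class, which reproduces the original document's styling in HTML as closely as it can. Sample code: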
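Java code:

import java.io.*;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.junit.Test;

/**
 * Converts a Word document to HTML.
 */
@Test
public void testWordToHtml() throws Exception {
    // try-with-resources closes both streams even if the conversion fails
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.html")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        WordToHtmlConverter converter = new WordToHtmlConverter(
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
        // Convert the HWPFDocument into a DOM tree
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Indent the output for readability
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        // Write to a byte stream instead of a FileWriter: a FileWriter would use the
        // platform default charset, which may not match the declared utf-8 encoding
        transformer.transform(new DOMSource(converter.getDocument()), new StreamResult(out));
    }
}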
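The serialization step at the end of the test can be tried on its own, without POI on the classpath. The following is a minimal sketch, assuming a small hand-built DOM tree stands in for the result of converter.getDocument(); the class DomToHtmlSketch and its methods are hypothetical names used only for illustration and are not part of POI.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToHtmlSketch {

    // Hypothetical stand-in for the DOM tree that a converter would produce.
    public static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent("Hello from a DOM tree");
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes a DOM tree with the "html" output method into a UTF-8 string.
    public static String toHtml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            // A byte stream lets the transformer apply the declared encoding itself.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(sampleDocument()));
    }
}
```

The exact whitespace of the output depends on the XSLT implementation bundled with the JDK, so it is better to check for the expected elements and text than to compare the whole string.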
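2 Converting to an XML file
Converting a doc document to XML is done with the WordToFoConverter class. Its output is an XSL-FO (XSL Formatting Objects) document, an XML vocabulary that describes page layout and formatting, rather than a dump of the doc file's internal structure. Sample code: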
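Java code:

import java.io.*;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToFoConverter;
import org.junit.Test;

/**
 * Converts a Word document to XSL-FO.
 */
@Test
public void testWordToFo() throws Exception {
    // try-with-resources closes both streams even if the conversion fails
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.xml")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        WordToFoConverter converter = new WordToFoConverter(
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
        // Convert the HWPFDocument into a DOM tree
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Indent the output for readability
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // XSL-FO is XML, so the xml output method is used here
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // Write to a byte stream so the file is really encoded as utf-8
        transformer.transform(new DOMSource(converter.getDocument()), new StreamResult(out));
    }
}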
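For readers unfamiliar with XSL-FO, it may help to see the general shape of such a document. The sketch below builds a very small FO tree by hand with the JDK's DOM API and serializes it as XML. It is an illustration of the vocabulary only, not the output of WordToFoConverter, which will be considerably richer; the class MinimalFoSketch, its methods, and the master name "page" are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class MinimalFoSketch {

    // Namespace URI of the XSL-FO vocabulary.
    public static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Creates an FO element with the given local name and appends it to the parent.
    private static Element child(Document doc, Element parent, String localName) {
        Element e = doc.createElementNS(FO_NS, "fo:" + localName);
        parent.appendChild(e);
        return e;
    }

    // Builds a one-page FO document holding a single block of text.
    public static Document minimalFo(String text) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element root = doc.createElementNS(FO_NS, "fo:root");
            // Declare the prefix explicitly so it appears in the serialized output.
            root.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:fo", FO_NS);
            doc.appendChild(root);

            // Page geometry: one page master with a body region.
            Element masters = child(doc, root, "layout-master-set");
            Element pageMaster = child(doc, masters, "simple-page-master");
            pageMaster.setAttribute("master-name", "page");
            child(doc, pageMaster, "region-body");

            // Content: a page sequence whose flow contains one text block.
            Element sequence = child(doc, root, "page-sequence");
            sequence.setAttribute("master-reference", "page");
            Element flow = child(doc, sequence, "flow");
            flow.setAttribute("flow-name", "xsl-region-body");
            child(doc, flow, "block").setTextContent(text);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the DOM tree with the "xml" output method into a UTF-8 string.
    public static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(minimalFo("Hello, XSL-FO")));
    }
}
```

An FO file like this is an intermediate format: it is normally passed to a separate FO processor to produce paginated output, which is outside the scope of POI.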
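3 Converting to a text file
Converting a doc document to text is done with WordToTextConverter. It builds a DOM tree that holds the document's text; serializing that tree with the text output method writes only the text content, so the result is a plain-text file. Sample code: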
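Java code:

import java.io.*;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToTextConverter;
import org.junit.Test;

/**
 * Converts a Word document to plain text.
 */
@Test
public void testWordToText() throws Exception {
    // try-with-resources closes both streams even if the conversion fails
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.txt")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        WordToTextConverter converter = new WordToTextConverter(
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
        // Convert the HWPFDocument into a DOM tree
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // The text output method writes only text nodes and drops all markup
        transformer.setOutputProperty(OutputKeys.METHOD, "text");
        // Write to a byte stream so the file is really encoded as utf-8
        transformer.transform(new DOMSource(converter.getDocument()), new StreamResult(out));
    }
}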
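The reason the output file contains no tags is the "text" output method, which can be observed without POI. Here is a minimal sketch, assuming a hand-built DOM tree in place of the converter's document; TextMethodSketch and its element names are hypothetical and do not reflect the structure WordToTextConverter actually produces.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TextMethodSketch {

    // Hypothetical stand-in for a converter's DOM: two paragraphs under one root.
    public static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            for (String line : new String[] {"First paragraph\n", "Second paragraph\n"}) {
                Element p = doc.createElement("p");
                p.setTextContent(line);
                root.appendChild(p);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the DOM tree with the "text" output method: only text nodes are
    // written, element tags are dropped.
    public static String toText(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "text");
            // An in-memory StringWriter is enough here because no bytes are written.
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(toText(sampleDocument()));
    }
}
```

Because the text method adds no separators of its own, any line breaks in the result have to be present in the text nodes themselves, which is why the sample strings end with a newline.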
(Note: this article is based on POI 3.9.)