Contents
1 Converting to an HTML file
2 Converting to an XML file
3 Converting to a text file
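POI also provides format conversion for Word doc files. The content of a Word document can be converted to an HTML file, to an XML file (XSL-FO, as produced by WordToFoConverter), or to a plain text file. Each of these conversions is performed by a specific subclass of AbstractWordConverter.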
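1 Converting to an HTML file
Converting a doc document to HTML is done with the WordToHtmlConverter class, which tries to reproduce the styling of the original document in HTML as closely as it can. Sample code: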
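Java code
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.junit.Test;

/**
 * Converts a Word document to HTML.
 * @throws Exception
 */
@Test
public void testWordToHtml() throws Exception {
    WordToHtmlConverter converter = new WordToHtmlConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    // try-with-resources closes both streams when the test finishes
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.html")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        // Convert the HWPFDocument; the result is built up in the converter's DOM
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Indent the output
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        // A byte stream (rather than a FileWriter) lets the Transformer apply the encoding set above
        transformer.transform(
                new DOMSource(converter.getDocument()),
                new StreamResult(out));
    }
}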
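When the converted document contains non-ASCII text, it matters whether the StreamResult wraps a Writer or an OutputStream: with a Writer, the Writer's own charset decides which bytes reach the disk, whereas with an OutputStream the Transformer applies OutputKeys.ENCODING itself. The following is a minimal sketch of the byte-stream variant using only the JDK. The class name Utf8SerializationSketch and the hand-built <doc> element are made up for illustration; the small DOM merely stands in for converter.getDocument().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it uses only the JDK, not POI.
class Utf8SerializationSketch {

    /** Builds a tiny DOM that stands in for converter.getDocument(). */
    static Document sampleDocument(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("doc");
            root.setTextContent(text);
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the DOM to bytes; the Transformer does the UTF-8 encoding. */
    static byte[] toUtf8Bytes(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the text, parses the bytes back, and returns the root element's text. */
    static String roundTrip(String text) {
        try {
            byte[] bytes = toUtf8Bytes(sampleDocument(text));
            Document parsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The text should survive serialization and re-parsing unchanged
        System.out.println(roundTrip("\u4e2d\u6587 test"));
    }
}
```

In the conversion test, the same idea amounts to passing a FileOutputStream to StreamResult instead of a FileWriter.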
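2 Converting to an XML file
Converting a doc document to XML is done with the WordToFoConverter class. It produces an XSL-FO (XSL Formatting Objects) document, an XML vocabulary that describes the content and layout of the doc document. Sample code: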
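Java code
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToFoConverter;
import org.junit.Test;

/**
 * Converts a Word document to XSL-FO.
 * @throws Exception
 */
@Test
public void testWordToFo() throws Exception {
    WordToFoConverter converter = new WordToFoConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    // try-with-resources closes both streams when the test finishes
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.xml")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        // Convert the HWPFDocument; the result is built up in the converter's DOM
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Indent the output
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // OutputKeys.METHOD is deliberately not set to "html" here, so the result is written as XML
        transformer.transform(
                new DOMSource(converter.getDocument()),
                new StreamResult(out));
    }
}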
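The only difference from the HTML example on the serialization side is that OutputKeys.METHOD is left unset, so the Transformer falls back to XML output for a root element that is not html. The sketch below illustrates this with the JDK alone. The class name FoSerializationSketch is hypothetical, and the two-element DOM is a toy fragment in the XSL-FO namespace, not a complete FO document and not what WordToFoConverter actually emits.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it uses only the JDK, not POI.
class FoSerializationSketch {

    // Namespace URI of XSL Formatting Objects
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Builds a toy fo:root/fo:block tree and serializes it without setting METHOD. */
    static String serializeSampleFo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(FO_NS, "fo:root");
            Element block = doc.createElementNS(FO_NS, "fo:block");
            block.setTextContent("Hello");
            root.appendChild(block);
            doc.appendChild(root);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // OutputKeys.METHOD is intentionally left at its default
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints an XML declaration followed by the fo: elements
        System.out.println(serializeSampleFo());
    }
}
```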
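3 Converting to a text file
Converting a doc document to text is done with WordToTextConverter. Like the other converters, it builds a DOM document, which is then written out as plain text by using the "text" output method. Sample code: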
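Java code
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToTextConverter;
import org.junit.Test;

/**
 * Converts a Word document to text.
 * @throws Exception
 */
@Test
public void testWordToText() throws Exception {
    WordToTextConverter converter = new WordToTextConverter(
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
    // try-with-resources closes both streams when the test finishes
    try (InputStream is = new FileInputStream("D:\\test.doc");
         OutputStream out = new FileOutputStream("D:\\converter.txt")) {
        HWPFDocument wordDocument = new HWPFDocument(is);
        // Convert the HWPFDocument; the result is built up in the converter's DOM
        converter.processDocument(wordDocument);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        // Indent the output
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // The "text" method writes only character data, without markup
        transformer.setOutputProperty(OutputKeys.METHOD, "text");
        transformer.transform(
                new DOMSource(converter.getDocument()),
                new StreamResult(out));
    }
}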
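What the "text" output method does can be seen in isolation with a small JDK-only sketch: the Transformer walks the DOM and writes only the text nodes, dropping all element markup. The class name TextMethodSketch and the <doc>/<p> elements are invented for this illustration and are not the structure that WordToTextConverter builds.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it uses only the JDK, not POI.
class TextMethodSketch {

    /** Serializes a two-paragraph DOM with the "text" output method. */
    static String serializeAsText() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("doc");
            doc.appendChild(root);
            for (String line : new String[] {"Hello", "world"}) {
                Element p = doc.createElement("p");
                p.setTextContent(line);
                root.appendChild(p);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Only the character data of the tree is written, no tags
            transformer.setOutputProperty(OutputKeys.METHOD, "text");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeAsText());
    }
}
```

Because the markup is discarded entirely, any line breaks in the final text file have to come from text nodes in the DOM itself.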
(Note: this article is based on POI 3.9.)