Parsing .odt files in Java

1. Understanding .odt files

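An .odt file is the OpenDocument Text format produced by OpenOffice, and it can also be opened directly in Microsoft Office. Under the hood it is just a ZIP archive, so any archive tool can open it. Inside there is a content.xml file; the text you see in the document sits inside the XML elements of that file (for example text:p for paragraphs).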
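To see this for yourself, here is a minimal sketch that lists the entries inside an .odt file with java.util.zip. The class name OdtEntryLister and the method listEntries are my own names for illustration, not part of any library.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, named for illustration only
public class OdtEntryLister {

    // Returns the names of all entries stored in the archive at odtPath
    public static List<String> listEntries(String odtPath) throws IOException {
        List<String> names = new ArrayList<>();
        // An .odt file is a ZIP archive, so ZipFile can open it directly
        try (ZipFile zipFile = new ZipFile(odtPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java OdtEntryLister <file.odt>");
            return;
        }
        for (String name : listEntries(args[0])) {
            System.out.println(name);
        }
    }
}
```

For a typical document the list should include content.xml, along with other package files such as mimetype.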

2. Extracting the archive

As noted above, an .odt file is just a ZIP archive, so it can be extracted the same way as any other ZIP file:

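// Imports used by the two methods in this section
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public void parseFile(String filePath) throws Exception {
    File file = new File(filePath).getAbsoluteFile();
    // Make sure the source file exists
    if (!file.exists()) {
        throw new FileNotFoundException("File does not exist: " + filePath);
    }
    // Extract into the directory that contains the source file;
    // odtUncompress takes two String paths, not File objects
    odtUncompress(file.getPath(), file.getParent());
}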


    public static void odtUncompress(String inputFile, String destDirPath) throws IOException {
        File srcFile = new File(inputFile); // the .odt archive to extract
        // Make sure the source file exists
        if (!srcFile.exists()) {
            throw new FileNotFoundException(srcFile.getPath() + " does not exist");
        }
        // try-with-resources closes the archive even if extraction fails
        try (ZipFile zipFile = new ZipFile(srcFile)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                File targetFile = new File(destDirPath, entry.getName());
                if (entry.isDirectory()) {
                    // Directory entry: create the directory under the destination
                    targetFile.mkdirs();
                } else {
                    // File entry: make sure the parent directory exists first
                    File parentDir = targetFile.getParentFile();
                    if (parentDir != null && !parentDir.exists()) {
                        parentDir.mkdirs();
                    }
                    // Copy the entry's bytes into the target file
                    try (InputStream is = zipFile.getInputStream(entry);
                         FileOutputStream fos = new FileOutputStream(targetFile)) {
                        byte[] buf = new byte[1024];
                        int len;
                        while ((len = is.read(buf)) != -1) {
                            fos.write(buf, 0, len);
                        }
                    }
                }
            }
        }
    }
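One caveat with extraction code like the above: entry names come from the archive itself, so a crafted file could contain names such as "../../evil.txt" that end up outside the destination directory. The following is a minimal sketch of a guard, assuming you resolve every entry through it before writing. ZipPathGuard and resolveSafely are hypothetical names I made up for this example.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper class, named for illustration only
public class ZipPathGuard {

    // Resolves entryName against destDir and rejects any name that
    // would land outside destDir (for example "../evil.txt")
    public static File resolveSafely(File destDir, String entryName) throws IOException {
        File target = new File(destDir, entryName);
        // Canonical paths have "." and ".." segments resolved
        String destPath = destDir.getCanonicalPath() + File.separator;
        String targetPath = target.getCanonicalPath();
        if (!targetPath.startsWith(destPath)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return target;
    }
}
```

In the extraction loop you would then build the target with resolveSafely(new File(destDirPath), entry.getName()) instead of concatenating the two strings.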

3. Reading the XML content

Because I need to modify the XML content, I still start from the .odt file and read content.xml straight out of the archive:

import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public void originalContent(String srcFile) throws Exception {
    // try-with-resources closes the archive when we are done
    try (ZipFile zipFile = new ZipFile(srcFile)) {
        // Only content.xml is of interest
        ZipEntry entry = zipFile.getEntry("content.xml");
        if (entry == null) {
            return;
        }
        // Build the DOM document
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder docBuilder = domFactory.newDocumentBuilder();
        Document doc;
        try (InputStream in = zipFile.getInputStream(entry)) {
            doc = docBuilder.parse(in);
        }

        // Get all paragraph elements
        NodeList list = doc.getElementsByTagName("text:p");

        for (int a = 0; a < list.getLength(); a++) {
            // Collect the text of this paragraph recursively and print it;
            // a fresh StringBuilder per paragraph replaces the shared static String
            StringBuilder text = new StringBuilder();
            getText(list.item(a), text);
            System.out.println(text);
        }
    }
}


// Recursively appends the text of node and all of its descendants to text
private static void getText(Node node, StringBuilder text) {
    if (node.getNodeType() == Node.TEXT_NODE) {
        text.append(node.getNodeValue());
        return;
    }
    NodeList childNodes = node.getChildNodes();
    for (int a = 0; a < childNodes.getLength(); a++) {
        getText(childNodes.item(a), text);
    }
}
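Since the goal is to modify the XML, here is a rough sketch of one way to do that with the same DOM API: walk the tree, replace a string in every text node, and serialize the document back to XML with a Transformer. ContentXmlEditor and replaceText are hypothetical names of mine, and this touches every text node in the document (headings included), not only text:p paragraphs.

```java
import java.io.InputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, named for illustration only
public class ContentXmlEditor {

    // Parses the XML, replaces "from" with "to" in every text node,
    // and returns the modified document as an XML string
    public static String replaceText(InputStream xml, String from, String to) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        replaceInTextNodes(doc.getDocumentElement(), from, to);

        // Serialize the DOM tree back to text
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Visits node and all descendants once, editing text nodes in place
    private static void replaceInTextNodes(Node node, String from, String to) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(node.getNodeValue().replace(from, to));
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            replaceInTextNodes(children.item(i), from, to);
        }
    }
}
```

Note that a phrase split across several child elements (for example half of it inside a text:span) spans more than one text node, so this simple per-node replacement will not match it.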

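Packing the extracted files back into an archive works much like ordinary ZIP compression, so I won't write it out here; there are plenty of examples around. Just give the result an .odt extension. One thing to watch: the OpenDocument package format expects the mimetype file to be the first entry in the archive and to be stored uncompressed.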
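For reference, here is a minimal sketch of the repacking step, assuming the extracted files live in their own directory and the output file is written somewhere outside it. It writes the mimetype entry first and uncompressed, which is what the OpenDocument package format expects, then adds everything else with normal compression. OdtPacker and pack are hypothetical names for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, named for illustration only
public class OdtPacker {

    private static final String MIMETYPE = "application/vnd.oasis.opendocument.text";

    // Packs the extracted directory srcDir into odtFile.
    // Assumption: odtFile is NOT located inside srcDir.
    public static void pack(Path srcDir, Path odtFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(odtFile))) {
            // 1. "mimetype" goes first and is stored without compression.
            //    STORED entries need size and CRC set before putNextEntry.
            byte[] mime = MIMETYPE.getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mime);
            ZipEntry mimeEntry = new ZipEntry("mimetype");
            mimeEntry.setMethod(ZipEntry.STORED);
            mimeEntry.setSize(mime.length);
            mimeEntry.setCompressedSize(mime.length);
            mimeEntry.setCrc(crc.getValue());
            zos.putNextEntry(mimeEntry);
            zos.write(mime);
            zos.closeEntry();

            // 2. Every other regular file is added with the default DEFLATED method
            List<Path> files;
            try (Stream<Path> walk = Files.walk(srcDir)) {
                files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
            }
            for (Path file : files) {
                // ZIP entry names always use forward slashes
                String name = srcDir.relativize(file).toString().replace(File.separatorChar, '/');
                if (name.equals("mimetype")) {
                    continue; // already written above
                }
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

A call such as OdtPacker.pack(Paths.get("extracted"), Paths.get("result.odt")) would then rebuild the document; both paths here are placeholders.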
