Example 1 with XWPFRelation

Use of org.apache.poi.xwpf.usermodel.XWPFRelation in project poi by apache.

Class XWPFHeaderFooterPolicy, method createFooter.

/**
 * Creates a new footer of the specified type, to which the
 * supplied (and previously unattached!) paragraphs are
 * added.
 */
public XWPFFooter createFooter(Enum type, XWPFParagraph[] pars) {
    XWPFFooter footer = getFooter(type);
    if (footer == null) {
        FtrDocument ftrDoc = FtrDocument.Factory.newInstance();
        XWPFRelation relation = XWPFRelation.FOOTER;
        int i = getRelationIndex(relation);
        XWPFFooter wrapper = (XWPFFooter) doc.createRelationship(relation, XWPFFactory.getInstance(), i);
        wrapper.setXWPFDocument(doc);
        String pStyle = "Footer";
        CTHdrFtr ftr = buildFtr(type, pStyle, wrapper, pars);
        wrapper.setHeaderFooter(ftr);
        ftrDoc.setFtr(ftr);
        assignFooter(wrapper, type);
        footer = wrapper;
    }
    return footer;
}
Also used : XWPFRelation(org.apache.poi.xwpf.usermodel.XWPFRelation) CTHdrFtr(org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHdrFtr) XWPFFooter(org.apache.poi.xwpf.usermodel.XWPFFooter) FtrDocument(org.openxmlformats.schemas.wordprocessingml.x2006.main.FtrDocument)

Example 2 with XWPFRelation

Use of org.apache.poi.xwpf.usermodel.XWPFRelation in project tika by apache.

Class XWPFEventBasedWordExtractor, method handleDocumentPart.

private void handleDocumentPart(PackagePart documentPart, StringBuilder sb) throws IOException, SAXException {
    //load the numbering/list manager and styles from the main document part
    XWPFNumbering numbering = loadNumbering(documentPart);
    XWPFListManager xwpfListManager = new XWPFListManager(numbering);
    //headers
    try {
        PackageRelationshipCollection headersPRC = documentPart.getRelationshipsByType(XWPFRelation.HEADER.getRelation());
        if (headersPRC != null) {
            for (int i = 0; i < headersPRC.size(); i++) {
                PackagePart header = documentPart.getRelatedPart(headersPRC.getRelationship(i));
                handlePart(header, xwpfListManager, sb);
            }
        }
    } catch (InvalidFormatException e) {
        LOG.warn("Invalid format", e);
    }
    //main document
    handlePart(documentPart, xwpfListManager, sb);
    //for now, just dump other components at end
    for (XWPFRelation rel : new XWPFRelation[] { XWPFRelation.FOOTNOTE, XWPFRelation.COMMENT, XWPFRelation.FOOTER, XWPFRelation.ENDNOTE }) {
        try {
            PackageRelationshipCollection prc = documentPart.getRelationshipsByType(rel.getRelation());
            if (prc != null) {
                for (int i = 0; i < prc.size(); i++) {
                    PackagePart packagePart = documentPart.getRelatedPart(prc.getRelationship(i));
                    handlePart(packagePart, xwpfListManager, sb);
                }
            }
        } catch (InvalidFormatException e) {
            LOG.warn("Invalid format", e);
        }
    }
}
Also used : XWPFRelation(org.apache.poi.xwpf.usermodel.XWPFRelation) XWPFNumbering(org.apache.poi.xwpf.usermodel.XWPFNumbering) PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) XWPFListManager(org.apache.tika.parser.microsoft.ooxml.XWPFListManager) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) InvalidFormatException(org.apache.poi.openxml4j.exceptions.InvalidFormatException)
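
The lookups in the method above select a part's relationships by their type URI. As a rough illustration of what that filtering amounts to, the sketch below parses a relationships (.rels) XML fragment with the JDK's own XML parser and keeps the targets whose Type attribute matches. It does not use POI or Tika: the class name RelsFilterSketch, the method targetsByType and the sample XML are made up for this example, and the header and footer URIs are written out by hand on the assumption that they are the values XWPFRelation.HEADER.getRelation() and XWPFRelation.FOOTER.getRelation() return.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI or Tika.
class RelsFilterSketch {
    // Assumed to match the relationship URIs used for XWPF headers and footers.
    static final String REL_BASE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/";
    static final String HEADER_REL = REL_BASE + "header";
    static final String FOOTER_REL = REL_BASE + "footer";

    // Made-up sample of a document part's relationships.
    static final String SAMPLE =
        "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
        + "<Relationship Id=\"rId1\" Type=\"" + HEADER_REL + "\" Target=\"header1.xml\"/>"
        + "<Relationship Id=\"rId2\" Type=\"" + FOOTER_REL + "\" Target=\"footer1.xml\"/>"
        + "<Relationship Id=\"rId3\" Type=\"" + HEADER_REL + "\" Target=\"header2.xml\"/>"
        + "</Relationships>";

    // Returns the Target of every Relationship whose Type equals the given URI,
    // in document order.
    static List<String> targetsByType(String relsXml, String type) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(relsXml.getBytes(StandardCharsets.UTF_8)));
            NodeList rels = dom.getElementsByTagNameNS("*", "Relationship");
            List<String> targets = new ArrayList<>();
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                if (type.equals(rel.getAttribute("Type"))) {
                    targets.add(rel.getAttribute("Target"));
                }
            }
            return targets;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse relationships XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints [header1.xml, header2.xml]
        System.out.println(targetsByType(SAMPLE, HEADER_REL));
    }
}
```

In the real extractor the matching relationships are then resolved to parts with getRelatedPart and handed to handlePart; this sketch stops at the target names.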

Example 3 with XWPFRelation

Use of org.apache.poi.xwpf.usermodel.XWPFRelation in project poi by apache.

Class XWPFHeaderFooterPolicy, method createHeader.

/**
 * Creates a new header of the specified type, to which the
 * supplied (and previously unattached!) paragraphs are
 * added.
 */
public XWPFHeader createHeader(Enum type, XWPFParagraph[] pars) {
    XWPFHeader header = getHeader(type);
    if (header == null) {
        HdrDocument hdrDoc = HdrDocument.Factory.newInstance();
        XWPFRelation relation = XWPFRelation.HEADER;
        int i = getRelationIndex(relation);
        XWPFHeader wrapper = (XWPFHeader) doc.createRelationship(relation, XWPFFactory.getInstance(), i);
        wrapper.setXWPFDocument(doc);
        String pStyle = "Header";
        CTHdrFtr hdr = buildHdr(type, pStyle, wrapper, pars);
        wrapper.setHeaderFooter(hdr);
        hdrDoc.setHdr(hdr);
        assignHeader(wrapper, type);
        header = wrapper;
    }
    return header;
}
Also used : XWPFRelation(org.apache.poi.xwpf.usermodel.XWPFRelation) HdrDocument(org.openxmlformats.schemas.wordprocessingml.x2006.main.HdrDocument) CTHdrFtr(org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHdrFtr) XWPFHeader(org.apache.poi.xwpf.usermodel.XWPFHeader)

Example 4 with XWPFRelation

Use of org.apache.poi.xwpf.usermodel.XWPFRelation in project poi by apache.

Class ExtractorFactory, method createExtractor.

/**
 * Tries to determine the actual type of file and produces a matching text-extractor for it.
 *
 * @param pkg An {@link OPCPackage}.
 * @return A {@link POIXMLTextExtractor} for the given file.
 * @throws IOException If an error occurs while reading the file.
 * @throws OpenXML4JException If an error is found while parsing the OpenXML file format.
 * @throws XmlException If an XML parsing error occurs.
 * @throws IllegalArgumentException If no matching file type could be found.
 */
public static POIXMLTextExtractor createExtractor(OPCPackage pkg) throws IOException, OpenXML4JException, XmlException {
    try {
        // Check for the normal Office core document
        PackageRelationshipCollection core;
        core = pkg.getRelationshipsByType(CORE_DOCUMENT_REL);
        // If nothing was found, try some of the other OOXML-based core types
        if (core.size() == 0) {
            // Could it be an OOXML-Strict one?
            core = pkg.getRelationshipsByType(STRICT_DOCUMENT_REL);
        }
        if (core.size() == 0) {
            // Could it be a visio one?
            core = pkg.getRelationshipsByType(VISIO_DOCUMENT_REL);
            if (core.size() == 1)
                return new XDGFVisioExtractor(pkg);
        }
        // Should just be a single core document, complain if not
        if (core.size() != 1) {
            throw new IllegalArgumentException("Invalid OOXML Package received - expected 1 core document, found " + core.size());
        }
        // Grab the core document part, and try to identify from that
        final PackagePart corePart = pkg.getPart(core.getRelationship(0));
        final String contentType = corePart.getContentType();
        // Is it XSSF?
        for (XSSFRelation rel : XSSFExcelExtractor.SUPPORTED_TYPES) {
            if (rel.getContentType().equals(contentType)) {
                if (getPreferEventExtractor()) {
                    return new XSSFEventBasedExcelExtractor(pkg);
                }
                return new XSSFExcelExtractor(pkg);
            }
        }
        // Is it XWPF?
        for (XWPFRelation rel : XWPFWordExtractor.SUPPORTED_TYPES) {
            if (rel.getContentType().equals(contentType)) {
                return new XWPFWordExtractor(pkg);
            }
        }
        // Is it XSLF?
        for (XSLFRelation rel : XSLFPowerPointExtractor.SUPPORTED_TYPES) {
            if (rel.getContentType().equals(contentType)) {
                return new XSLFPowerPointExtractor(pkg);
            }
        }
        // Special handling for SlideShow theme files
        if (XSLFRelation.THEME_MANAGER.getContentType().equals(contentType)) {
            return new XSLFPowerPointExtractor(new XSLFSlideShow(pkg));
        }
        // How about xlsb?
        for (XSSFRelation rel : XSSFBEventBasedExcelExtractor.SUPPORTED_TYPES) {
            if (rel.getContentType().equals(contentType)) {
                return new XSSFBEventBasedExcelExtractor(pkg);
            }
        }
        throw new IllegalArgumentException("No supported documents found in the OOXML package (found " + contentType + ")");
    } catch (IOException | OpenXML4JException | XmlException | RuntimeException e) {
        // ensure that we close the package again if there is an error opening it, however
        // we need to revert the package to not re-write the file via close(), which is very likely not wanted for a TextExtractor!
        pkg.revert();
        throw e;
    }
}
Also used : XSSFRelation(org.apache.poi.xssf.usermodel.XSSFRelation) XDGFVisioExtractor(org.apache.poi.xdgf.extractor.XDGFVisioExtractor) XSSFBEventBasedExcelExtractor(org.apache.poi.xssf.extractor.XSSFBEventBasedExcelExtractor) PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) XSSFExcelExtractor(org.apache.poi.xssf.extractor.XSSFExcelExtractor) XWPFWordExtractor(org.apache.poi.xwpf.extractor.XWPFWordExtractor) IOException(java.io.IOException) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) XSLFSlideShow(org.apache.poi.xslf.usermodel.XSLFSlideShow) XWPFRelation(org.apache.poi.xwpf.usermodel.XWPFRelation) OpenXML4JException(org.apache.poi.openxml4j.exceptions.OpenXML4JException) XSSFEventBasedExcelExtractor(org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor) XSLFPowerPointExtractor(org.apache.poi.xslf.extractor.XSLFPowerPointExtractor) XmlException(org.apache.xmlbeans.XmlException) XSLFRelation(org.apache.poi.xslf.usermodel.XSLFRelation)
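
Stripped of the package handling, the method above is a lookup: read the content type of the core part, return the extractor whose supported types include it, and otherwise fail with an IllegalArgumentException. A minimal POI-free sketch of that dispatch follows. The class ContentTypeDispatchSketch, the ExtractorKind enum and the method kindFor are hypothetical, and the table lists only the main-part content types of ordinary .xlsx, .docx and .pptx files, whereas the SUPPORTED_TYPES arrays in POI are assumed to cover further variants such as templates and macro-enabled files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the content-type checks in createExtractor.
class ContentTypeDispatchSketch {
    // Illustrative labels; the real method returns extractor instances.
    enum ExtractorKind { EXCEL, WORD, POWERPOINT }

    // Hand-written table of main-part content types (an assumption of this
    // sketch, not a copy of POI's SUPPORTED_TYPES arrays).
    static final Map<String, ExtractorKind> SUPPORTED = new LinkedHashMap<>();
    static {
        SUPPORTED.put(
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml",
            ExtractorKind.EXCEL);
        SUPPORTED.put(
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
            ExtractorKind.WORD);
        SUPPORTED.put(
            "application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml",
            ExtractorKind.POWERPOINT);
    }

    // Maps a core part's content type to an extractor kind, failing the same
    // way the original does when nothing matches.
    static ExtractorKind kindFor(String contentType) {
        ExtractorKind kind = SUPPORTED.get(contentType);
        if (kind == null) {
            throw new IllegalArgumentException(
                "No supported documents found in the OOXML package (found " + contentType + ")");
        }
        return kind;
    }

    public static void main(String[] args) {
        // Prints WORD
        System.out.println(kindFor(
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"));
    }
}
```

A map lookup is used here for brevity; the original instead walks each format's SUPPORTED_TYPES in a fixed order, which also lets it pick between event-based and regular Excel extractors.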

Example 5 with XWPFRelation

Use of org.apache.poi.xwpf.usermodel.XWPFRelation in project tika by apache.

Class OOXMLExtractorFactory, method trySXWPF.

private static POIXMLTextExtractor trySXWPF(OPCPackage pkg) throws XmlException, OpenXML4JException, IOException {
    PackageRelationshipCollection packageRelationshipCollection = pkg.getRelationshipsByType("http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument");
    if (packageRelationshipCollection.size() == 0) {
        packageRelationshipCollection = pkg.getRelationshipsByType("http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument");
    }
    if (packageRelationshipCollection.size() == 0) {
        return null;
    }
    PackagePart corePart = pkg.getPart(packageRelationshipCollection.getRelationship(0));
    String targetContentType = corePart.getContentType();
    for (XWPFRelation relation : XWPFWordExtractor.SUPPORTED_TYPES) {
        if (targetContentType.equals(relation.getContentType())) {
            return new XWPFEventBasedWordExtractor(pkg);
        }
    }
    return null;
}
Also used : XWPFRelation(org.apache.poi.xwpf.usermodel.XWPFRelation) PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) XWPFEventBasedWordExtractor(org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor)

Aggregations

XWPFRelation (org.apache.poi.xwpf.usermodel.XWPFRelation): 6
PackagePart (org.apache.poi.openxml4j.opc.PackagePart): 4
PackageRelationshipCollection (org.apache.poi.openxml4j.opc.PackageRelationshipCollection): 4
IOException (java.io.IOException): 2
InvalidFormatException (org.apache.poi.openxml4j.exceptions.InvalidFormatException): 2
OpenXML4JException (org.apache.poi.openxml4j.exceptions.OpenXML4JException): 2
XWPFNumbering (org.apache.poi.xwpf.usermodel.XWPFNumbering): 2
XmlException (org.apache.xmlbeans.XmlException): 2
CTHdrFtr (org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHdrFtr): 2
ZipException (java.util.zip.ZipException): 1
XDGFVisioExtractor (org.apache.poi.xdgf.extractor.XDGFVisioExtractor): 1
XSLFPowerPointExtractor (org.apache.poi.xslf.extractor.XSLFPowerPointExtractor): 1
XSLFRelation (org.apache.poi.xslf.usermodel.XSLFRelation): 1
XSLFSlideShow (org.apache.poi.xslf.usermodel.XSLFSlideShow): 1
XSSFBEventBasedExcelExtractor (org.apache.poi.xssf.extractor.XSSFBEventBasedExcelExtractor): 1
XSSFEventBasedExcelExtractor (org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor): 1
XSSFExcelExtractor (org.apache.poi.xssf.extractor.XSSFExcelExtractor): 1
XSSFRelation (org.apache.poi.xssf.usermodel.XSSFRelation): 1
XWPFWordExtractor (org.apache.poi.xwpf.extractor.XWPFWordExtractor): 1
XWPFFooter (org.apache.poi.xwpf.usermodel.XWPFFooter): 1