
Example 51 with PackagePart

Use of org.apache.poi.openxml4j.opc.PackagePart in the Apache POI project.

From the class TestXSSFBugs, method bug45431.

/**
 * We should carry VBA macros over after save
 */
@Test
public void bug45431() throws IOException, InvalidFormatException {
    XSSFWorkbook wb1 = XSSFTestDataSamples.openSampleWorkbook("45431.xlsm");
    OPCPackage pkg1 = wb1.getPackage();
    assertTrue(wb1.isMacroEnabled());
    // Check the various macro related bits can be found
    PackagePart vba = pkg1.getPart(PackagingURIHelper.createPartName("/xl/vbaProject.bin"));
    assertNotNull(vba);
    // And the drawing bit
    PackagePart drw = pkg1.getPart(PackagingURIHelper.createPartName("/xl/drawings/vmlDrawing1.vml"));
    assertNotNull(drw);
    // Save and re-open, both still there
    XSSFWorkbook wb2 = XSSFTestDataSamples.writeOutAndReadBack(wb1);
    pkg1.close();
    wb1.close();
    OPCPackage pkg2 = wb2.getPackage();
    assertTrue(wb2.isMacroEnabled());
    vba = pkg2.getPart(PackagingURIHelper.createPartName("/xl/vbaProject.bin"));
    assertNotNull(vba);
    drw = pkg2.getPart(PackagingURIHelper.createPartName("/xl/drawings/vmlDrawing1.vml"));
    assertNotNull(drw);
    // And again, just to be sure
    XSSFWorkbook wb3 = XSSFTestDataSamples.writeOutAndReadBack(wb2);
    pkg2.close();
    wb2.close();
    OPCPackage pkg3 = wb3.getPackage();
    assertTrue(wb3.isMacroEnabled());
    vba = pkg3.getPart(PackagingURIHelper.createPartName("/xl/vbaProject.bin"));
    assertNotNull(vba);
    drw = pkg3.getPart(PackagingURIHelper.createPartName("/xl/drawings/vmlDrawing1.vml"));
    assertNotNull(drw);
    pkg3.close();
    wb3.close();
}
Also used : SXSSFWorkbook(org.apache.poi.xssf.streaming.SXSSFWorkbook) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) OPCPackage(org.apache.poi.openxml4j.opc.OPCPackage) Test(org.junit.Test)

Example 52 with PackagePart

Use of org.apache.poi.openxml4j.opc.PackagePart in the Apache Tika project.

From the class ZipContainerDetector, method detectOfficeOpenXML.

/**
 * Detects the type of an Office Open XML (OOXML) file from an
 * already opened package.
 */
public static MediaType detectOfficeOpenXML(OPCPackage pkg) {
    // Check for the normal Office core document
    PackageRelationshipCollection core = pkg.getRelationshipsByType(PackageRelationshipTypes.CORE_DOCUMENT);
    // Otherwise check for some other Office core document types
    if (core.size() == 0) {
        core = pkg.getRelationshipsByType(STRICT_CORE_DOCUMENT);
    }
    if (core.size() == 0) {
        core = pkg.getRelationshipsByType(VISIO_DOCUMENT);
    }
    // Unless exactly one core document relationship was found, skip detection
    if (core.size() != 1) {
        // Invalid OOXML Package received
        return null;
    }
    // Get the type of the core document part
    PackagePart corePart = pkg.getPart(core.getRelationship(0));
    String coreType = corePart.getContentType();
    // Turn that into the type of the overall document
    String docType = coreType.substring(0, coreType.lastIndexOf('.'));
    // The Macro Enabled formats are a little special
    if (docType.toLowerCase(Locale.ROOT).endsWith("macroenabled")) {
        docType = docType.toLowerCase(Locale.ROOT) + ".12";
    }
    if (docType.toLowerCase(Locale.ROOT).endsWith("macroenabledtemplate")) {
        docType = MACRO_TEMPLATE_PATTERN.matcher(docType).replaceAll("macroenabled.12");
    }
    // Build the MediaType object and return
    return MediaType.parse(docType);
}
Also used : PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) PackagePart(org.apache.poi.openxml4j.opc.PackagePart)
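
The string handling at the end of detectOfficeOpenXML can be tried without an OOXML package. The sketch below copies only that step into a standalone class so the transformation is easier to follow. The class name DocTypeSketch and the method toDocType are hypothetical, and the definition of MACRO_TEMPLATE_PATTERN is an assumption, since the listing above uses the constant without showing it. The content types in main are sample inputs chosen for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class DocTypeSketch {
    // Assumption: the listing does not show this constant, so this
    // definition is a guess at a pattern that would make the code work.
    private static final Pattern MACRO_TEMPLATE_PATTERN =
            Pattern.compile("macroenabledtemplate$", Pattern.CASE_INSENSITIVE);

    /** Maps the content type of a core part to a document type string. */
    public static String toDocType(String coreType) {
        // Drop everything from the last '.' onwards, e.g. ".main+xml".
        // Note: this throws if coreType contains no '.' at all.
        String docType = coreType.substring(0, coreType.lastIndexOf('.'));
        // Macro-enabled documents get a lower-cased name with a ".12" suffix
        if (docType.toLowerCase(Locale.ROOT).endsWith("macroenabled")) {
            docType = docType.toLowerCase(Locale.ROOT) + ".12";
        }
        // Macro-enabled templates collapse to the same "macroenabled.12" ending
        if (docType.toLowerCase(Locale.ROOT).endsWith("macroenabledtemplate")) {
            docType = MACRO_TEMPLATE_PATTERN.matcher(docType).replaceAll("macroenabled.12");
        }
        return docType;
    }

    public static void main(String[] args) {
        // Sample inputs: a plain document, a macro-enabled document,
        // and a macro-enabled template.
        System.out.println(toDocType(
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"));
        System.out.println(toDocType(
                "application/vnd.ms-word.document.macroEnabled.main+xml"));
        System.out.println(toDocType(
                "application/vnd.ms-word.template.macroEnabledTemplate.main+xml"));
    }
}
```

Under the assumed pattern, the second input comes out as application/vnd.ms-word.document.macroenabled.12. The original method then passes the resulting string to MediaType.parse, which this sketch leaves out to stay free of Tika dependencies.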

Example 53 with PackagePart

Use of org.apache.poi.openxml4j.opc.PackagePart in the Apache Tika project.

From the class OOXMLExtractorFactory, method trySXSLF.

private static POIXMLTextExtractor trySXSLF(OPCPackage pkg) throws XmlException, OpenXML4JException, IOException {
    PackageRelationshipCollection packageRelationshipCollection = pkg.getRelationshipsByType("http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument");
    if (packageRelationshipCollection.size() == 0) {
        packageRelationshipCollection = pkg.getRelationshipsByType("http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument");
    }
    if (packageRelationshipCollection.size() == 0) {
        return null;
    }
    PackagePart corePart = pkg.getPart(packageRelationshipCollection.getRelationship(0));
    String targetContentType = corePart.getContentType();
    XSLFRelation[] xslfRelations = org.apache.poi.xslf.extractor.XSLFPowerPointExtractor.SUPPORTED_TYPES;
    for (XSLFRelation xslfRelation : xslfRelations) {
        if (xslfRelation.getContentType().equals(targetContentType)) {
            return new XSLFEventBasedPowerPointExtractor(pkg);
        }
    }
    if (XSLFRelation.THEME_MANAGER.getContentType().equals(targetContentType)) {
        return new XSLFEventBasedPowerPointExtractor(pkg);
    }
    return null;
}
Also used : PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) XSLFRelation(org.apache.poi.xslf.usermodel.XSLFRelation) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) XSLFEventBasedPowerPointExtractor(org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor)

Example 54 with PackagePart

Use of org.apache.poi.openxml4j.opc.PackagePart in the Apache Tika project.

From the class SXSLFPowerPointExtractorDecorator, method handleBasicRelatedParts.

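/**
 * This should handle the comments, master, notes, etc.
 *
 * @param contentType     relationship type used to look up the related parts
 * @param xhtmlClassLabel value of the class attribute on the wrapping div
 * @param parentPart      part whose relationships are searched
 * @param contentHandler  handler that receives the generated SAX events
 */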
private void handleBasicRelatedParts(String contentType, String xhtmlClassLabel, PackagePart parentPart, ContentHandler contentHandler) throws SAXException {
    PackageRelationshipCollection relatedPartPRC = null;
    try {
        relatedPartPRC = parentPart.getRelationshipsByType(contentType);
    } catch (InvalidFormatException e) {
        metadata.add(TikaCoreProperties.TIKA_META_EXCEPTION_WARNING, ExceptionUtils.getStackTrace(e));
    }
    if (relatedPartPRC != null && relatedPartPRC.size() > 0) {
        AttributesImpl attributes = new AttributesImpl();
        attributes.addAttribute("", "class", "class", "CDATA", xhtmlClassLabel);
        contentHandler.startElement("", "div", "div", attributes);
        for (int i = 0; i < relatedPartPRC.size(); i++) {
            PackageRelationship relatedPartPackageRelationship = relatedPartPRC.getRelationship(i);
            try {
                PackagePart relatedPartPart = parentPart.getRelatedPart(relatedPartPackageRelationship);
                try (InputStream stream = relatedPartPart.getInputStream()) {
                    context.getSAXParser().parse(stream, new OfflineContentHandler(new EmbeddedContentHandler(contentHandler)));
                } catch (IOException | TikaException e) {
                    metadata.add(TikaCoreProperties.TIKA_META_EXCEPTION_WARNING, ExceptionUtils.getStackTrace(e));
                }
            } catch (InvalidFormatException e) {
                metadata.add(TikaCoreProperties.TIKA_META_EXCEPTION_WARNING, ExceptionUtils.getStackTrace(e));
            }
        }
        contentHandler.endElement("", "div", "div");
    }
}
Also used : PackageRelationship(org.apache.poi.openxml4j.opc.PackageRelationship) AttributesImpl(org.xml.sax.helpers.AttributesImpl) OfflineContentHandler(org.apache.tika.sax.OfflineContentHandler) TikaException(org.apache.tika.exception.TikaException) PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) CloseShieldInputStream(org.apache.commons.io.input.CloseShieldInputStream) InputStream(java.io.InputStream) EmbeddedContentHandler(org.apache.tika.sax.EmbeddedContentHandler) IOException(java.io.IOException) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) InvalidFormatException(org.apache.poi.openxml4j.exceptions.InvalidFormatException)

Example 55 with PackagePart

Use of org.apache.poi.openxml4j.opc.PackagePart in the Apache Tika project.

From the class SXWPFWordExtractorDecorator, method addRelatedParts.

private void addRelatedParts(PackagePart documentPart, List<PackagePart> relatedParts) {
    for (String relation : MAIN_PART_RELATIONS) {
        PackageRelationshipCollection prc = null;
        try {
            prc = documentPart.getRelationshipsByType(relation);
            if (prc != null) {
                for (int i = 0; i < prc.size(); i++) {
                    PackagePart packagePart = documentPart.getRelatedPart(prc.getRelationship(i));
                    relatedParts.add(packagePart);
                }
            }
        } catch (InvalidFormatException e) {
            // Deliberately ignored: parts for this relation type are skipped
        }
    }
}
Also used : PackageRelationshipCollection(org.apache.poi.openxml4j.opc.PackageRelationshipCollection) PackagePart(org.apache.poi.openxml4j.opc.PackagePart) InvalidFormatException(org.apache.poi.openxml4j.exceptions.InvalidFormatException)

Aggregations

PackagePart (org.apache.poi.openxml4j.opc.PackagePart): 118
OutputStream (java.io.OutputStream): 38
PackageRelationship (org.apache.poi.openxml4j.opc.PackageRelationship): 27
OPCPackage (org.apache.poi.openxml4j.opc.OPCPackage): 25
InvalidFormatException (org.apache.poi.openxml4j.exceptions.InvalidFormatException): 24
PackageRelationshipCollection (org.apache.poi.openxml4j.opc.PackageRelationshipCollection): 23
PackagePartName (org.apache.poi.openxml4j.opc.PackagePartName): 19
QName (javax.xml.namespace.QName): 18
IOException (java.io.IOException): 17
XmlOptions (org.apache.xmlbeans.XmlOptions): 17
InputStream (java.io.InputStream): 11
Test (org.junit.Test): 11
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 9
POIXMLException (org.apache.poi.POIXMLException): 8
XmlException (org.apache.xmlbeans.XmlException): 8
OpenXML4JException (org.apache.poi.openxml4j.exceptions.OpenXML4JException): 7
ArrayList (java.util.ArrayList): 6
TikaException (org.apache.tika.exception.TikaException): 6
URI (java.net.URI): 5
SAXException (org.xml.sax.SAXException): 5