
Example 1 with DirectoryNode

Use of org.apache.poi.poifs.filesystem.DirectoryNode in the Apache POI project.

Class EmbeddedExtractor, method extractAll.

protected void extractAll(ShapeContainer<?> parent, List<EmbeddedData> embeddings) throws IOException {
    for (Shape shape : parent) {
        EmbeddedData data = null;
        if (shape instanceof ObjectData) {
            ObjectData od = (ObjectData) shape;
            try {
                if (od.hasDirectoryEntry()) {
                    data = extractOne((DirectoryNode) od.getDirectory());
                } else {
                    String contentType = CONTENT_TYPE_BYTES;
                    if (od instanceof XSSFObjectData) {
                        contentType = ((XSSFObjectData) od).getObjectPart().getContentType();
                    }
                    data = new EmbeddedData(od.getFileName(), od.getObjectData(), contentType);
                }
            } catch (Exception e) {
                LOG.log(POILogger.WARN, "Entry not found / readable - ignoring OLE embedding", e);
            }
        } else if (shape instanceof Picture) {
            data = extractOne((Picture) shape);
        } else if (shape instanceof ShapeContainer) {
            extractAll((ShapeContainer<?>) shape, embeddings);
        }
        if (data == null) {
            continue;
        }
        data.setShape(shape);
        String filename = data.getFilename();
        String extension = (filename == null || filename.lastIndexOf('.') == -1) ? ".bin" : filename.substring(filename.lastIndexOf('.'));
        // try to find an alternative name
        if (filename == null || "".equals(filename) || filename.startsWith("MBD") || filename.startsWith("Root Entry")) {
            filename = shape.getShapeName();
            if (filename != null) {
                filename += extension;
            }
        }
        // default to dummy name
        if (filename == null || "".equals(filename)) {
            filename = "picture_" + embeddings.size() + extension;
        }
        filename = filename.trim();
        data.setFilename(filename);
        embeddings.add(data);
    }
}
Also used : Shape(org.apache.poi.ss.usermodel.Shape) Picture(org.apache.poi.ss.usermodel.Picture) ObjectData(org.apache.poi.ss.usermodel.ObjectData) XSSFObjectData(org.apache.poi.xssf.usermodel.XSSFObjectData) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) ShapeContainer(org.apache.poi.ss.usermodel.ShapeContainer) Ole10NativeException(org.apache.poi.poifs.filesystem.Ole10NativeException) IOException(java.io.IOException)
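
The naming fallback at the end of extractAll can be read in isolation from the POI types. The following is a minimal sketch, not POI code: the class name EmbeddingNames and the method resolve are hypothetical, and the shape name and the current size of the embeddings list are passed in as plain parameters instead of being read from a Shape and a List.

```java
final class EmbeddingNames {
    private EmbeddingNames() {}

    /**
     * Hypothetical helper mirroring the fallback order used in extractAll:
     * keep a usable file name, otherwise use the shape name plus the
     * extension, otherwise use "picture_" + index plus the extension.
     */
    static String resolve(String filename, String shapeName, int index) {
        // Extension of the original name, or ".bin" when there is none
        String extension = (filename == null || filename.lastIndexOf('.') == -1)
                ? ".bin"
                : filename.substring(filename.lastIndexOf('.'));

        // Generic OLE names carry no information, so try the shape name
        if (filename == null || filename.isEmpty()
                || filename.startsWith("MBD") || filename.startsWith("Root Entry")) {
            filename = (shapeName == null) ? null : shapeName + extension;
        }

        // Default to a dummy name
        if (filename == null || filename.isEmpty()) {
            filename = "picture_" + index + extension;
        }
        return filename.trim();
    }

    public static void main(String[] args) {
        System.out.println(resolve("report.docx", "Object 1", 0));
        System.out.println(resolve("MBD0001A2B3", "Object 1", 0));
        System.out.println(resolve(null, null, 3));
    }
}
```

Note that the extension is computed from the original name before any fallback, so a generic name without a dot always ends up with ".bin".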

Example 2 with DirectoryNode

Use of org.apache.poi.poifs.filesystem.DirectoryNode in the Apache POI project.

Class EmbeddedExtractor, method copyNodes.

protected static void copyNodes(DirectoryNode src, DirectoryNode dest) throws IOException {
    for (Entry e : src) {
        if (e instanceof DirectoryNode) {
            DirectoryNode srcDir = (DirectoryNode) e;
            DirectoryNode destDir = (DirectoryNode) dest.createDirectory(srcDir.getName());
            destDir.setStorageClsid(srcDir.getStorageClsid());
            copyNodes(srcDir, destDir);
        } else {
            // try-with-resources closes the stream even if createDocument throws
            try (InputStream is = src.createDocumentInputStream(e)) {
                dest.createDocument(e.getName(), is);
            }
        }
    }
}
Also used : Entry(org.apache.poi.poifs.filesystem.Entry) DocumentInputStream(org.apache.poi.poifs.filesystem.DocumentInputStream) InputStream(java.io.InputStream) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode)
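
The recursion in copyNodes is easier to follow on a toy tree. The sketch below is an analogy that uses only the standard library and none of the POI classes: Dir is a hypothetical stand-in for a directory node, its clsid field is a plain string standing in for the storage class ID, and documents are byte arrays instead of streams.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TreeCopySketch {
    private TreeCopySketch() {}

    /** Hypothetical stand-in for a directory node. */
    static final class Dir {
        final Map<String, Dir> dirs = new LinkedHashMap<>();
        final Map<String, byte[]> docs = new LinkedHashMap<>();
        String clsid = ""; // stand-in for the storage class ID
    }

    /** Copies every child of src into dest, recursing into sub-directories. */
    static void copyNodes(Dir src, Dir dest) {
        for (Map.Entry<String, Dir> e : src.dirs.entrySet()) {
            Dir srcDir = e.getValue();
            // Create the matching directory, carry over its class ID, then descend
            Dir destDir = new Dir();
            destDir.clsid = srcDir.clsid;
            dest.dirs.put(e.getKey(), destDir);
            copyNodes(srcDir, destDir);
        }
        for (Map.Entry<String, byte[]> e : src.docs.entrySet()) {
            // Documents are leaves: copy the bytes under the same name
            dest.docs.put(e.getKey(), e.getValue().clone());
        }
    }

    public static void main(String[] args) {
        Dir src = new Dir();
        src.docs.put("Workbook", new byte[] {1, 2, 3});
        Dir sub = new Dir();
        sub.clsid = "example-clsid";
        sub.docs.put("Ole", new byte[] {4});
        src.dirs.put("MBD0001", sub);

        Dir dest = new Dir();
        copyNodes(src, dest);
        System.out.println(dest.dirs.keySet() + " " + dest.docs.keySet());
    }
}
```

As in the method above, only the children of the source are copied; the class ID of the top-level source directory itself is left for the caller to handle.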

Example 3 with DirectoryNode

Use of org.apache.poi.poifs.filesystem.DirectoryNode in the Apache POI project.

Class HSSFWorkbook, method addOlePackage.

/**
 * Adds an OLE package manager object with the given POIFS to the sheet
 *
 * @param poiData a POIFS containing the embedded document, to be added
 * @param label the label of the payload
 * @param fileName the original filename
 * @param command the command to open the payload
 * @return the index of the added ole object
 * @throws IOException if the object can't be embedded
 */
public int addOlePackage(POIFSFileSystem poiData, String label, String fileName, String command) throws IOException {
    DirectoryNode root = poiData.getRoot();
    Map<String, ClassID> olemap = getOleMap();
    for (Map.Entry<String, ClassID> entry : olemap.entrySet()) {
        if (root.hasEntry(entry.getKey())) {
            root.setStorageClsid(entry.getValue());
            break;
        }
    }
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    poiData.writeFilesystem(bos);
    return addOlePackage(bos.toByteArray(), label, fileName, command);
}
Also used : FilteringDirectoryNode(org.apache.poi.poifs.filesystem.FilteringDirectoryNode) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) ClassID(org.apache.poi.hpsf.ClassID) UnicodeString(org.apache.poi.hssf.record.common.UnicodeString) ByteArrayOutputStream(java.io.ByteArrayOutputStream) LittleEndianByteArrayOutputStream(org.apache.poi.util.LittleEndianByteArrayOutputStream) Map(java.util.Map) HashMap(java.util.HashMap)
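
The loop over getOleMap() picks a class ID by checking which well-known entry the root directory contains, and stops at the first match. The contents of that map are not shown in the snippet, so the sketch below is an assumption-laden illustration only: ClsidLookupSketch and findClassId are hypothetical names, class IDs are represented as plain strings, and the map values are placeholders rather than real CLSIDs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ClsidLookupSketch {
    private ClsidLookupSketch() {}

    /**
     * Returns the class ID of the first map entry whose key is present
     * among the root entry names, or empty if none matches.
     */
    static Optional<String> findClassId(Set<String> rootEntryNames, Map<String, String> oleMap) {
        for (Map.Entry<String, String> entry : oleMap.entrySet()) {
            if (rootEntryNames.contains(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder values; the real map is built by getOleMap()
        Map<String, String> oleMap = new LinkedHashMap<>();
        oleMap.put("Workbook", "clsid-for-spreadsheet");
        oleMap.put("WordDocument", "clsid-for-text-document");

        Set<String> root = new HashSet<>(Arrays.asList("WordDocument", "1Table"));
        System.out.println(findClassId(root, oleMap).orElse("unsupported"));
    }
}
```

Returning an Optional leaves the reaction to a missing match to the caller: addOlePackage simply leaves the class ID untouched in that case.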

Example 4 with DirectoryNode

Use of org.apache.poi.poifs.filesystem.DirectoryNode in the Apache POI project.

Class ExtractorFactory, method getEmbededDocsTextExtractors.

/**
 * Returns an array of text extractors, one for each of
 *  the embedded documents in the file (if there are any).
 * If there are no embedded documents, you'll get back an
 *  empty array. Otherwise, you'll get one open
 *  {@link POITextExtractor} for each embedded file.
 */
public static POITextExtractor[] getEmbededDocsTextExtractors(POIOLE2TextExtractor ext) throws IOException, OpenXML4JException, XmlException {
    // All the embedded directories we spotted
    ArrayList<Entry> dirs = new ArrayList<Entry>();
    // For anything else not directly held in as a POIFS directory
    ArrayList<InputStream> nonPOIFS = new ArrayList<InputStream>();
    // Find all the embedded directories
    DirectoryEntry root = ext.getRoot();
    if (root == null) {
        throw new IllegalStateException("The extractor didn't know which POIFS it came from!");
    }
    if (ext instanceof ExcelExtractor) {
        // These are in MBD... under the root
        Iterator<Entry> it = root.getEntries();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (entry.getName().startsWith("MBD")) {
                dirs.add(entry);
            }
        }
    } else if (ext instanceof WordExtractor) {
        // These are in ObjectPool -> _... under the root
        try {
            DirectoryEntry op = (DirectoryEntry) root.getEntry("ObjectPool");
            Iterator<Entry> it = op.getEntries();
            while (it.hasNext()) {
                Entry entry = it.next();
                if (entry.getName().startsWith("_")) {
                    dirs.add(entry);
                }
            }
        } catch (FileNotFoundException e) {
            logger.log(POILogger.INFO, "Ignoring FileNotFoundException while extracting Word document", e.getLocalizedMessage());
            // ignored here
        }
    //} else if(ext instanceof PowerPointExtractor) {
    // Tricky, not stored directly in poifs
    // TODO
    } else if (ext instanceof OutlookTextExtactor) {
        // Stored in the Attachment blocks
        MAPIMessage msg = ((OutlookTextExtactor) ext).getMAPIMessage();
        for (AttachmentChunks attachment : msg.getAttachmentFiles()) {
            if (attachment.getAttachData() != null) {
                byte[] data = attachment.getAttachData().getValue();
                nonPOIFS.add(new ByteArrayInputStream(data));
            } else if (attachment.getAttachmentDirectory() != null) {
                dirs.add(attachment.getAttachmentDirectory().getDirectory());
            }
        }
    }
    // Create the extractors
    if (dirs.size() == 0 && nonPOIFS.size() == 0) {
        return new POITextExtractor[0];
    }
    ArrayList<POITextExtractor> textExtractors = new ArrayList<POITextExtractor>();
    for (Entry dir : dirs) {
        textExtractors.add(createExtractor((DirectoryNode) dir));
    }
    for (InputStream nonPOIF : nonPOIFS) {
        try {
            textExtractors.add(createExtractor(nonPOIF));
        } catch (IllegalArgumentException e) {
            // Ignore, just means it didn't contain
            //  a format we support as yet
            logger.log(POILogger.INFO, "Format not supported yet", e.getLocalizedMessage());
        } catch (XmlException e) {
            throw new IOException(e.getMessage(), e);
        } catch (OpenXML4JException e) {
            throw new IOException(e.getMessage(), e);
        }
    }
    return textExtractors.toArray(new POITextExtractor[textExtractors.size()]);
}
Also used : PushbackInputStream(java.io.PushbackInputStream) ByteArrayInputStream(java.io.ByteArrayInputStream) InputStream(java.io.InputStream) ArrayList(java.util.ArrayList) FileNotFoundException(java.io.FileNotFoundException) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) IOException(java.io.IOException) DirectoryEntry(org.apache.poi.poifs.filesystem.DirectoryEntry) WordExtractor(org.apache.poi.hwpf.extractor.WordExtractor) XWPFWordExtractor(org.apache.poi.xwpf.extractor.XWPFWordExtractor) MAPIMessage(org.apache.poi.hsmf.MAPIMessage) Entry(org.apache.poi.poifs.filesystem.Entry) OutlookTextExtactor(org.apache.poi.hsmf.extractor.OutlookTextExtactor) OpenXML4JException(org.apache.poi.openxml4j.exceptions.OpenXML4JException) POITextExtractor(org.apache.poi.POITextExtractor) XSSFExcelExtractor(org.apache.poi.xssf.extractor.XSSFExcelExtractor) ExcelExtractor(org.apache.poi.hssf.extractor.ExcelExtractor) XSSFEventBasedExcelExtractor(org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor) XSSFBEventBasedExcelExtractor(org.apache.poi.xssf.extractor.XSSFBEventBasedExcelExtractor) XmlException(org.apache.xmlbeans.XmlException) Iterator(java.util.Iterator) AttachmentChunks(org.apache.poi.hsmf.datatypes.AttachmentChunks)

Example 5 with DirectoryNode

Use of org.apache.poi.poifs.filesystem.DirectoryNode in the Apache POI project.

Class HSLFSlideShow, method addEmbed.

/**
 * Add an embedded object to this presentation
 *
 * @return 0-based index of the embedded object
 */
public int addEmbed(POIFSFileSystem poiData) {
    DirectoryNode root = poiData.getRoot();
    // prepare embedded data
    if (new ClassID().equals(root.getStorageClsid())) {
        // need to set class id
        Map<String, ClassID> olemap = getOleMap();
        ClassID classID = null;
        for (Map.Entry<String, ClassID> entry : olemap.entrySet()) {
            if (root.hasEntry(entry.getKey())) {
                classID = entry.getValue();
                break;
            }
        }
        if (classID == null) {
            throw new IllegalArgumentException("Unsupported embedded document");
        }
        root.setStorageClsid(classID);
    }
    ExEmbed exEmbed = new ExEmbed();
    // remove unnecessary info, so we don't need to specify the type
    // of the ole object multiple times
    Record[] children = exEmbed.getChildRecords();
    exEmbed.removeChild(children[2]);
    exEmbed.removeChild(children[3]);
    exEmbed.removeChild(children[4]);
    ExEmbedAtom eeEmbed = exEmbed.getExEmbedAtom();
    eeEmbed.setCantLockServerB(true);
    ExOleObjAtom eeAtom = exEmbed.getExOleObjAtom();
    eeAtom.setDrawAspect(ExOleObjAtom.DRAW_ASPECT_VISIBLE);
    eeAtom.setType(ExOleObjAtom.TYPE_EMBEDDED);
    // eeAtom.setSubType(ExOleObjAtom.SUBTYPE_EXCEL);
    // should be ignored?!?, see MS-PPT ExOleObjAtom, but LibreOffice sets it ...
    eeAtom.setOptions(1226240);
    ExOleObjStg exOleObjStg = new ExOleObjStg();
    try {
        final String OLESTREAM_NAME = "Ole";
        if (!root.hasEntry(OLESTREAM_NAME)) {
            // the following data was taken from an example LibreOffice document;
            // besides this "Ole" record there were several other records, e.g. CompObj,
            // OlePresXXX, but it seems that they aren't necessary
            byte[] oleBytes = { 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
            poiData.createDocument(new ByteArrayInputStream(oleBytes), OLESTREAM_NAME);
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        poiData.writeFilesystem(bos);
        exOleObjStg.setData(bos.toByteArray());
    } catch (IOException e) {
        throw new HSLFException(e);
    }
    int psrId = addPersistentObject(exOleObjStg);
    exOleObjStg.setPersistId(psrId);
    eeAtom.setObjStgDataRef(psrId);
    int objectId = addToObjListAtom(exEmbed);
    eeAtom.setObjID(objectId);
    return objectId;
}
Also used : HSLFException(org.apache.poi.hslf.exceptions.HSLFException) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) ClassID(org.apache.poi.hpsf.ClassID) ByteArrayOutputStream(java.io.ByteArrayOutputStream) IOException(java.io.IOException) ByteArrayInputStream(java.io.ByteArrayInputStream) EscherBSERecord(org.apache.poi.ddf.EscherBSERecord) EscherOptRecord(org.apache.poi.ddf.EscherOptRecord) EscherContainerRecord(org.apache.poi.ddf.EscherContainerRecord) HashMap(java.util.HashMap) Map(java.util.Map)

Aggregations

DirectoryNode (org.apache.poi.poifs.filesystem.DirectoryNode) 47
Test (org.junit.Test) 16
InputStream (java.io.InputStream) 15
POIFSFileSystem (org.apache.poi.poifs.filesystem.POIFSFileSystem) 13
NPOIFSFileSystem (org.apache.poi.poifs.filesystem.NPOIFSFileSystem) 12
Entry (org.apache.poi.poifs.filesystem.Entry) 9
ByteArrayInputStream (java.io.ByteArrayInputStream) 8
ByteArrayOutputStream (java.io.ByteArrayOutputStream) 8
IOException (java.io.IOException) 8
OPOIFSFileSystem (org.apache.poi.poifs.filesystem.OPOIFSFileSystem) 6
FileInputStream (java.io.FileInputStream) 5
FileNotFoundException (java.io.FileNotFoundException) 5
DocumentInputStream (org.apache.poi.poifs.filesystem.DocumentInputStream) 5
HSSFWorkbook (org.apache.poi.hssf.usermodel.HSSFWorkbook) 4
HWPFDocument (org.apache.poi.hwpf.HWPFDocument) 4
File (java.io.File) 3
ArrayList (java.util.ArrayList) 3
AttachmentChunks (org.apache.poi.hsmf.datatypes.AttachmentChunks) 3
DirectoryEntry (org.apache.poi.poifs.filesystem.DirectoryEntry) 3
DocumentEntry (org.apache.poi.poifs.filesystem.DocumentEntry) 3