
Example 6 with Entry

use of org.apache.poi.poifs.filesystem.Entry in project poi by apache.

the class EmbeddedObjects method main.

@SuppressWarnings("unused")
public static void main(String[] args) throws Exception {
    POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(args[0]));
    HSSFWorkbook workbook = new HSSFWorkbook(fs);
    for (HSSFObjectData obj : workbook.getAllEmbeddedObjects()) {
        //the OLE2 Class Name of the object
        String oleName = obj.getOLE2ClassName();
        DirectoryNode dn = (obj.hasDirectoryEntry()) ? (DirectoryNode) obj.getDirectory() : null;
        Closeable document = null;
        if (oleName.equals("Worksheet")) {
            document = new HSSFWorkbook(dn, fs, false);
        } else if (oleName.equals("Document")) {
            document = new HWPFDocument(dn);
        } else if (oleName.equals("Presentation")) {
            document = new HSLFSlideShow(dn);
        } else {
            if (dn != null) {
                // The object is stored as a directory (DirectoryNode). Examine its entries to find out what it is
                for (Entry entry : dn) {
                    String name = entry.getName();
                }
            } else {
                // There is no DirectoryEntry
                // Recover the object's data from the HSSFObjectData instance.
                byte[] objectData = obj.getObjectData();
            }
        }
        if (document != null) {
            document.close();
        }
    }
    workbook.close();
}
Also used : HWPFDocument(org.apache.poi.hwpf.HWPFDocument) Entry(org.apache.poi.poifs.filesystem.Entry) POIFSFileSystem(org.apache.poi.poifs.filesystem.POIFSFileSystem) Closeable(java.io.Closeable) HSSFObjectData(org.apache.poi.hssf.usermodel.HSSFObjectData) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) HSLFSlideShow(org.apache.poi.hslf.usermodel.HSLFSlideShow) FileInputStream(java.io.FileInputStream) HSSFWorkbook(org.apache.poi.hssf.usermodel.HSSFWorkbook)
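
The main method above closes the embedded document by hand after a null check, so the close is skipped if anything in between throws. The following is a minimal JDK-only sketch of the same open/use/close shape written with try-with-resources. NullableCloseSketch, open and handle are hypothetical names, and the counting resource stands in for the POI document classes; it is not POI code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the embedded-document handling in main; not POI code.
final class NullableCloseSketch {
    static int closedCount = 0;

    /** Returns a resource for a known class name, or null when nothing needs opening. */
    static Closeable open(String oleName) {
        if ("Worksheet".equals(oleName)) {
            return () -> closedCount++; // close() only counts invocations here
        }
        return null;
    }

    /** Opens, uses and closes the resource; returns whether one was opened. */
    static boolean handle(String oleName) {
        // try-with-resources skips close() for a null resource and still
        // closes a non-null one if the body throws.
        try (Closeable document = open(oleName)) {
            return document != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints: true false 1
        System.out.println(handle("Worksheet") + " " + handle("Package") + " " + closedCount);
    }
}
```

Because a null resource is allowed, the explicit `if (document != null)` check is not needed in this form.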

Example 7 with Entry

use of org.apache.poi.poifs.filesystem.Entry in project poi by apache.

the class LoadEmbedded method loadEmbedded.

public static void loadEmbedded(HSSFWorkbook workbook) throws IOException {
    for (HSSFObjectData obj : workbook.getAllEmbeddedObjects()) {
        //the OLE2 Class Name of the object
        String oleName = obj.getOLE2ClassName();
        if (oleName.equals("Worksheet")) {
            DirectoryNode dn = (DirectoryNode) obj.getDirectory();
            HSSFWorkbook embeddedWorkbook = new HSSFWorkbook(dn, false);
            embeddedWorkbook.close();
        } else if (oleName.equals("Document")) {
            DirectoryNode dn = (DirectoryNode) obj.getDirectory();
            HWPFDocument embeddedWordDocument = new HWPFDocument(dn);
            embeddedWordDocument.close();
        } else if (oleName.equals("Presentation")) {
            DirectoryNode dn = (DirectoryNode) obj.getDirectory();
            SlideShow<?, ?> embeddedSlideShow = new HSLFSlideShow(dn);
            embeddedSlideShow.close();
        } else {
            if (obj.hasDirectoryEntry()) {
                // The object is stored as a directory (DirectoryNode). Examine its entries to find out what it is
                DirectoryNode dn = (DirectoryNode) obj.getDirectory();
                for (Entry entry : dn) {
                    //System.out.println(oleName + "." + entry.getName());
                }
            } else {
                // There is no DirectoryEntry
                // Recover the object's data from the HSSFObjectData instance.
                byte[] objectData = obj.getObjectData();
            }
        }
    }
}
Also used : HWPFDocument(org.apache.poi.hwpf.HWPFDocument) Entry(org.apache.poi.poifs.filesystem.Entry) HSSFObjectData(org.apache.poi.hssf.usermodel.HSSFObjectData) DirectoryNode(org.apache.poi.poifs.filesystem.DirectoryNode) HSLFSlideShow(org.apache.poi.hslf.usermodel.HSLFSlideShow) HSSFWorkbook(org.apache.poi.hssf.usermodel.HSSFWorkbook)
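
Examples 6 and 7 both branch on the same three OLE2 class-name strings ("Worksheet", "Document", "Presentation"). One way to keep that mapping in a single place is sketched below with the JDK only. EmbeddedKind and fromOleName are hypothetical names, not POI types. Treating null as OTHER is also an assumption: the examples above do not show whether getOLE2ClassName() can return null.

```java
// Hypothetical helper: EmbeddedKind is not a POI type.
enum EmbeddedKind {
    WORKSHEET, DOCUMENT, PRESENTATION, OTHER;

    /** Maps an OLE2 class name to a kind; null and unknown names map to OTHER. */
    static EmbeddedKind fromOleName(String oleName) {
        if (oleName == null) {
            return OTHER; // switching on a null String would throw NullPointerException
        }
        switch (oleName) {
            case "Worksheet":
                return WORKSHEET;
            case "Document":
                return DOCUMENT;
            case "Presentation":
                return PRESENTATION;
            default:
                return OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromOleName("Document")); // prints DOCUMENT
    }
}
```

A caller could then switch on the returned kind and open the matching POI document class in each case, leaving the OTHER case for the directory walk or the raw object data.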

Example 8 with Entry

use of org.apache.poi.poifs.filesystem.Entry in project poi by apache.

the class HSSFObjectData method getDirectory.

@Override
public DirectoryEntry getDirectory() throws IOException {
    EmbeddedObjectRefSubRecord subRecord = findObjectRecord();
    int streamId = subRecord.getStreamId().intValue();
    String streamName = "MBD" + HexDump.toHex(streamId);
    Entry entry = _root.getEntry(streamName);
    if (entry instanceof DirectoryEntry) {
        return (DirectoryEntry) entry;
    }
    throw new IOException("Stream " + streamName + " was not an OLE2 directory");
}
Also used : Entry(org.apache.poi.poifs.filesystem.Entry) DirectoryEntry(org.apache.poi.poifs.filesystem.DirectoryEntry) IOException(java.io.IOException)
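
getDirectory looks the embedded storage up by a name built from the stream id. The JDK-only sketch below shows how such a name might be formed. It assumes that HexDump.toHex(int) yields eight upper-case hex digits with no prefix, which is an assumption about POI and is not confirmed by the code above. EmbeddedStreamNames and storageNameFor are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper, not part of POI.
final class EmbeddedStreamNames {
    private EmbeddedStreamNames() {
    }

    /** Builds "MBD" followed by the stream id as eight upper-case hex digits (assumed format). */
    static String storageNameFor(int streamId) {
        return String.format(Locale.ROOT, "MBD%08X", streamId);
    }

    public static void main(String[] args) {
        System.out.println(storageNameFor(0x0002F1A3)); // prints MBD0002F1A3
    }
}
```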

Example 9 with Entry

use of org.apache.poi.poifs.filesystem.Entry in project poi by apache.

the class VBAMacroReader method readMacros.

/**
 * Reads VBA Project modules from a VBA Project directory located at
 * {@code macroDir} into {@code modules}.
 *
 * @since 3.15-beta2
 */
protected void readMacros(DirectoryNode macroDir, ModuleMap modules) throws IOException {
    for (Entry entry : macroDir) {
        if (!(entry instanceof DocumentNode)) {
            continue;
        }
        String name = entry.getName();
        DocumentNode document = (DocumentNode) entry;
        DocumentInputStream dis = new DocumentInputStream(document);
        try {
            if ("dir".equalsIgnoreCase(name)) {
                // process DIR
                RLEDecompressingInputStream in = new RLEDecompressingInputStream(dis);
                String streamName = null;
                int recordId = 0;
                try {
                    while (true) {
                        recordId = in.readShort();
                        if (EOF == recordId || VERSION_INDEPENDENT_TERMINATOR == recordId) {
                            break;
                        }
                        int recordLength = in.readInt();
                        switch(recordId) {
                            case PROJECTVERSION:
                                trySkip(in, 6);
                                break;
                            case PROJECTCODEPAGE:
                                int codepage = in.readShort();
                                modules.charset = Charset.forName(CodePageUtil.codepageToEncoding(codepage, true));
                                break;
                            case STREAMNAME:
                                streamName = readString(in, recordLength, modules.charset);
                                int reserved = in.readShort();
                                if (reserved != STREAMNAME_RESERVED) {
                                    throw new IOException("Expected x0032 after stream name before Unicode stream name, but found: " + Integer.toHexString(reserved));
                                }
                                int unicodeNameRecordLength = in.readInt();
                                readUnicodeString(in, unicodeNameRecordLength);
                                // do something with this at some point
                                break;
                            case MODULEOFFSET:
                                readModule(in, streamName, modules);
                                break;
                            default:
                                trySkip(in, recordLength);
                                break;
                        }
                    }
                } catch (final IOException e) {
                    throw new IOException("Error occurred while reading macros at section id " + recordId + " (" + HexDump.shortToHex(recordId) + ")", e);
                } finally {
                    in.close();
                }
            } else if (!startsWithIgnoreCase(name, "__SRP") && !startsWithIgnoreCase(name, "_VBA_PROJECT")) {
                // process module, skip __SRP and _VBA_PROJECT since these do not contain macros
                readModule(dis, name, modules);
            }
        } finally {
            dis.close();
        }
    }
}
Also used : RLEDecompressingInputStream(org.apache.poi.util.RLEDecompressingInputStream) Entry(org.apache.poi.poifs.filesystem.Entry) ZipEntry(java.util.zip.ZipEntry) DocumentNode(org.apache.poi.poifs.filesystem.DocumentNode) IOException(java.io.IOException) DocumentInputStream(org.apache.poi.poifs.filesystem.DocumentInputStream)
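
The "dir" branch of readMacros walks a sequence of records, each a short id followed by an int length, and skips every record it does not handle. The sketch below reproduces that loop shape over a plain byte array using only the JDK. The little-endian byte order, the two record ids and the ASCII payload are assumptions chosen for illustration. RecordLoopSketch, readNames and sample are hypothetical names, and no decompression is involved.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; the record ids are placeholders, not POI constants.
final class RecordLoopSketch {
    static final int TERMINATOR = 0x0010;
    static final int NAME_RECORD = 0x001A;

    /** Walks id/length/payload records and returns the NAME_RECORD payloads as strings. */
    static List<String> readNames(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        List<String> names = new ArrayList<>();
        while (buf.remaining() >= 2) {
            int recordId = buf.getShort() & 0xFFFF;
            if (recordId == TERMINATOR) {
                break; // the terminator carries no length field in this sketch
            }
            if (buf.remaining() < 4) {
                throw new IllegalArgumentException("Truncated length field");
            }
            int recordLength = buf.getInt();
            if (recordLength < 0 || recordLength > buf.remaining()) {
                throw new IllegalArgumentException("Bad length " + recordLength
                        + " for record 0x" + Integer.toHexString(recordId));
            }
            if (recordId == NAME_RECORD) {
                byte[] payload = new byte[recordLength];
                buf.get(payload);
                names.add(new String(payload, StandardCharsets.US_ASCII));
            } else {
                // Unknown record: skip its payload, like the default branch above
                buf.position(buf.position() + recordLength);
            }
        }
        return names;
    }

    /** Builds one skipped record, one name record and a terminator. */
    static byte[] sample() {
        byte[] name = "Module1".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.allocate(8 + 6 + name.length + 2).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 0x0003).putInt(2).put((byte) 0xE4).put((byte) 0x04);
        buf.putShort((short) NAME_RECORD).putInt(name.length).put(name);
        buf.putShort((short) TERMINATOR);
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(readNames(sample())); // prints [Module1]
    }
}
```

Validating the length before skipping keeps a corrupt record from moving the read position outside the data.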

Example 10 with Entry

use of org.apache.poi.poifs.filesystem.Entry in project poi by apache.

the class OLE2ScratchpadExtractorFactory method identifyEmbeddedResources.

/**
 * Identifies the embedded resources of the file behind the given
 *  extractor (if there are any). Embedded OLE2 directories are
 *  added to {@code dirs}; attachments stored as raw bytes are
 *  added to {@code nonPOIFS} as streams. Both lists are left
 *  unchanged if nothing is found.
 */
public static void identifyEmbeddedResources(POIOLE2TextExtractor ext, List<Entry> dirs, List<InputStream> nonPOIFS) throws IOException {
    // Find all the embedded directories
    DirectoryEntry root = ext.getRoot();
    if (root == null) {
        throw new IllegalStateException("The extractor didn't know which POIFS it came from!");
    }
    if (ext instanceof WordExtractor) {
        // These are in ObjectPool -> _... under the root
        try {
            DirectoryEntry op = (DirectoryEntry) root.getEntry("ObjectPool");
            Iterator<Entry> it = op.getEntries();
            while (it.hasNext()) {
                Entry entry = it.next();
                if (entry.getName().startsWith("_")) {
                    dirs.add(entry);
                }
            }
        } catch (FileNotFoundException e) {
            // ignored here
        }
        // } else if (ext instanceof PowerPointExtractor) {
        //     Tricky, not stored directly in poifs
        //     TODO
    } else if (ext instanceof OutlookTextExtactor) {
        // Stored in the Attachment blocks
        MAPIMessage msg = ((OutlookTextExtactor) ext).getMAPIMessage();
        for (AttachmentChunks attachment : msg.getAttachmentFiles()) {
            if (attachment.getAttachData() != null) {
                byte[] data = attachment.getAttachData().getValue();
                nonPOIFS.add(new ByteArrayInputStream(data));
            } else if (attachment.getAttachmentDirectory() != null) {
                dirs.add(attachment.getAttachmentDirectory().getDirectory());
            }
        }
    }
}
Also used : MAPIMessage(org.apache.poi.hsmf.MAPIMessage) Entry(org.apache.poi.poifs.filesystem.Entry) DirectoryEntry(org.apache.poi.poifs.filesystem.DirectoryEntry) OutlookTextExtactor(org.apache.poi.hsmf.extractor.OutlookTextExtactor) ByteArrayInputStream(java.io.ByteArrayInputStream) FileNotFoundException(java.io.FileNotFoundException) AttachmentChunks(org.apache.poi.hsmf.datatypes.AttachmentChunks) WordExtractor(org.apache.poi.hwpf.extractor.WordExtractor)

Aggregations

Entry (org.apache.poi.poifs.filesystem.Entry) 24
DirectoryEntry (org.apache.poi.poifs.filesystem.DirectoryEntry) 12
IOException (java.io.IOException) 9
DirectoryNode (org.apache.poi.poifs.filesystem.DirectoryNode) 9
FileNotFoundException (java.io.FileNotFoundException) 6
InputStream (java.io.InputStream) 6
DocumentEntry (org.apache.poi.poifs.filesystem.DocumentEntry) 6
DocumentInputStream (org.apache.poi.poifs.filesystem.DocumentInputStream) 6
DocumentNode (org.apache.poi.poifs.filesystem.DocumentNode) 4
POIFSFileSystem (org.apache.poi.poifs.filesystem.POIFSFileSystem) 4
ArrayList (java.util.ArrayList) 3
AttachmentChunks (org.apache.poi.hsmf.datatypes.AttachmentChunks) 3
HWPFDocument (org.apache.poi.hwpf.HWPFDocument) 3
OldWordFileFormatException (org.apache.poi.hwpf.OldWordFileFormatException) 3
BufferedInputStream (java.io.BufferedInputStream) 2
ByteArrayInputStream (java.io.ByteArrayInputStream) 2
FileInputStream (java.io.FileInputStream) 2
POITextExtractor (org.apache.poi.POITextExtractor) 2
HSLFSlideShow (org.apache.poi.hslf.usermodel.HSLFSlideShow) 2
MAPIMessage (org.apache.poi.hsmf.MAPIMessage) 2