Example 6 with HSLFShape

use of org.apache.poi.hslf.usermodel.HSLFShape in project poi by apache.

the class TestBackground method readBackground.

/**
 * Read fill information from a reference ppt file
 */
@Test
public void readBackground() throws IOException {
    HSLFSlideShow ppt = HSLFTestDataSamples.getSlideShow("backgrounds.ppt");
    HSLFFill fill;
    HSLFShape shape;
    List<HSLFSlide> slide = ppt.getSlides();
    fill = slide.get(0).getBackground().getFill();
    assertEquals(HSLFFill.FILL_PICTURE, fill.getFillType());
    shape = slide.get(0).getShapes().get(0);
    assertEquals(HSLFFill.FILL_SOLID, shape.getFill().getFillType());
    fill = slide.get(1).getBackground().getFill();
    assertEquals(HSLFFill.FILL_PATTERN, fill.getFillType());
    shape = slide.get(1).getShapes().get(0);
    assertEquals(HSLFFill.FILL_BACKGROUND, shape.getFill().getFillType());
    fill = slide.get(2).getBackground().getFill();
    assertEquals(HSLFFill.FILL_TEXTURE, fill.getFillType());
    shape = slide.get(2).getShapes().get(0);
    assertEquals(HSLFFill.FILL_PICTURE, shape.getFill().getFillType());
    fill = slide.get(3).getBackground().getFill();
    assertEquals(HSLFFill.FILL_SHADE_CENTER, fill.getFillType());
    shape = slide.get(3).getShapes().get(0);
    assertEquals(HSLFFill.FILL_SHADE, shape.getFill().getFillType());
    ppt.close();
}
Also used : HSLFShape(org.apache.poi.hslf.usermodel.HSLFShape) HSLFSlideShow(org.apache.poi.hslf.usermodel.HSLFSlideShow) HSLFFill(org.apache.poi.hslf.usermodel.HSLFFill) HSLFSlide(org.apache.poi.hslf.usermodel.HSLFSlide) Test(org.junit.Test)

Example 7 with HSLFShape

use of org.apache.poi.hslf.usermodel.HSLFShape in project poi by apache.

the class Hyperlinks method main.

public static void main(String[] args) throws Exception {
    for (int i = 0; i < args.length; i++) {
        HSLFSlideShow ppt;
        // close the input stream even if the slide show fails to load
        try (FileInputStream is = new FileInputStream(args[i])) {
            ppt = new HSLFSlideShow(is);
        }
        for (HSLFSlide slide : ppt.getSlides()) {
            System.out.println("\nslide " + slide.getSlideNumber());
            // read hyperlinks from the slide's text runs
            System.out.println("- reading hyperlinks from the text runs");
            for (List<HSLFTextParagraph> paras : slide.getTextParagraphs()) {
                for (HSLFTextParagraph para : paras) {
                    for (HSLFTextRun run : para) {
                        HSLFHyperlink link = run.getHyperlink();
                        if (link != null) {
                            System.out.println(toStr(link, run.getRawText()));
                        }
                    }
                }
            }
            // in PowerPoint you can assign a hyperlink to a shape without text,
            // for example to a Line object. The code below demonstrates how to
            // read such hyperlinks
            System.out.println("- reading hyperlinks from the slide's shapes");
            for (HSLFShape sh : slide.getShapes()) {
                if (sh instanceof HSLFSimpleShape) {
                    HSLFHyperlink link = ((HSLFSimpleShape) sh).getHyperlink();
                    if (link != null) {
                        System.out.println(toStr(link, null));
                    }
                }
            }
        }
        ppt.close();
    }
}
Also used : HSLFTextRun(org.apache.poi.hslf.usermodel.HSLFTextRun) HSLFShape(org.apache.poi.hslf.usermodel.HSLFShape) HSLFTextParagraph(org.apache.poi.hslf.usermodel.HSLFTextParagraph) HSLFHyperlink(org.apache.poi.hslf.usermodel.HSLFHyperlink) HSLFSimpleShape(org.apache.poi.hslf.usermodel.HSLFSimpleShape) HSLFSlideShow(org.apache.poi.hslf.usermodel.HSLFSlideShow) HSLFSlide(org.apache.poi.hslf.usermodel.HSLFSlide) FileInputStream(java.io.FileInputStream)

Example 8 with HSLFShape

use of org.apache.poi.hslf.usermodel.HSLFShape in project tika by apache.

the class HSLFExtractor method parse.

protected void parse(DirectoryNode root, XHTMLContentHandler xhtml) throws IOException, SAXException, TikaException {
    HSLFSlideShow ss = new HSLFSlideShow(root);
    List<HSLFSlide> _slides = ss.getSlides();
    xhtml.startElement("div", "class", "slideShow");
    /* Iterate over slides and extract text */
    for (HSLFSlide slide : _slides) {
        xhtml.startElement("div", "class", "slide");
        // Slide header, if present
        HeadersFooters hf = slide.getHeadersFooters();
        if (hf != null && hf.isHeaderVisible() && hf.getHeaderText() != null) {
            xhtml.startElement("p", "class", "slide-header");
            xhtml.characters(hf.getHeaderText());
            xhtml.endElement("p");
        }
        // Slide master, if present
        extractMaster(xhtml, slide.getMasterSheet());
        // Slide text
        {
            xhtml.startElement("div", "class", "slide-content");
            textRunsToText(xhtml, slide.getTextParagraphs());
            xhtml.endElement("div");
        }
        // Table text
        for (HSLFShape shape : slide.getShapes()) {
            if (shape instanceof HSLFTable) {
                extractTableText(xhtml, (HSLFTable) shape);
            }
        }
        // Slide footer, if present
        if (hf != null && hf.isFooterVisible() && hf.getFooterText() != null) {
            xhtml.startElement("p", "class", "slide-footer");
            xhtml.characters(hf.getFooterText());
            xhtml.endElement("p");
        }
        // Comments, if present
        StringBuilder authorStringBuilder = new StringBuilder();
        for (Comment comment : slide.getComments()) {
            authorStringBuilder.setLength(0);
            xhtml.startElement("p", "class", "slide-comment");
            if (comment.getAuthor() != null) {
                authorStringBuilder.append(comment.getAuthor());
            }
            if (comment.getAuthorInitials() != null) {
                if (authorStringBuilder.length() > 0) {
                    authorStringBuilder.append(" ");
                }
                authorStringBuilder.append('(').append(comment.getAuthorInitials()).append(')');
            }
            if (authorStringBuilder.length() > 0) {
                if (comment.getText() != null) {
                    authorStringBuilder.append(" - ");
                }
                xhtml.startElement("b");
                xhtml.characters(authorStringBuilder.toString());
                xhtml.endElement("b");
            }
            if (comment.getText() != null) {
                xhtml.characters(comment.getText());
            }
            xhtml.endElement("p");
        }
        // Now any embedded resources
        handleSlideEmbeddedResources(slide, xhtml);
        // Find the Notes for this slide and extract inline
        HSLFNotes notes = slide.getNotes();
        if (notes != null) {
            xhtml.startElement("div", "class", "slide-notes");
            textRunsToText(xhtml, notes.getTextParagraphs());
            xhtml.endElement("div");
        }
        // Slide complete
        xhtml.endElement("div");
    }
    // All slides done
    xhtml.endElement("div");
    /* notes */
    xhtml.startElement("div", "class", "slide-notes");
    HashSet<Integer> seenNotes = new HashSet<>();
    HeadersFooters hf = ss.getNotesHeadersFooters();
    for (HSLFSlide slide : _slides) {
        HSLFNotes notes = slide.getNotes();
        if (notes == null) {
            continue;
        }
        Integer id = notes._getSheetNumber();
        // add() returns false when this notes sheet has already been emitted
        if (!seenNotes.add(id)) {
            continue;
        }
        // Repeat the Notes header, if set
        if (hf != null && hf.isHeaderVisible() && hf.getHeaderText() != null) {
            xhtml.startElement("p", "class", "slide-note-header");
            xhtml.characters(hf.getHeaderText());
            xhtml.endElement("p");
        }
        // Notes text
        textRunsToText(xhtml, notes.getTextParagraphs());
        // Repeat the notes footer, if set
        if (hf != null && hf.isFooterVisible() && hf.getFooterText() != null) {
            xhtml.startElement("p", "class", "slide-note-footer");
            xhtml.characters(hf.getFooterText());
            xhtml.endElement("p");
        }
    }
    handleSlideEmbeddedPictures(ss, xhtml);
    xhtml.endElement("div");
}
Also used : HeadersFooters(org.apache.poi.hslf.model.HeadersFooters) HSLFNotes(org.apache.poi.hslf.usermodel.HSLFNotes) Comment(org.apache.poi.hslf.model.Comment) HSLFShape(org.apache.poi.hslf.usermodel.HSLFShape) HSLFTable(org.apache.poi.hslf.usermodel.HSLFTable) HSLFSlideShow(org.apache.poi.hslf.usermodel.HSLFSlideShow) HSLFSlide(org.apache.poi.hslf.usermodel.HSLFSlide) HashSet(java.util.HashSet)
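
The comment handling in parse above assembles an author label of the form "Author (Initials) - " before writing the comment text. The label logic can be isolated as a small pure function. What follows is only a minimal sketch using plain strings, not code from Tika or POI: the class name CommentLabel, the method authorLabel and the hasText flag are hypothetical names introduced for illustration.

```java
final class CommentLabel {

    // Hypothetical helper (not part of Tika or POI). It mirrors the label logic
    // in parse(): author, then initials in parentheses, then " - " only when
    // a label exists and comment text follows.
    static String authorLabel(String author, String initials, boolean hasText) {
        StringBuilder sb = new StringBuilder();
        if (author != null) {
            sb.append(author);
        }
        if (initials != null) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('(').append(initials).append(')');
        }
        if (sb.length() > 0 && hasText) {
            sb.append(" - ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "[Jane Doe (JD) - ]"
        System.out.println("[" + authorLabel("Jane Doe", "JD", true) + "]");
        // prints "[(JD)]"
        System.out.println("[" + authorLabel(null, "JD", false) + "]");
    }
}
```

An empty result corresponds to the case where parse skips the bold label element and writes only the comment text.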

Example 9 with HSLFShape

use of org.apache.poi.hslf.usermodel.HSLFShape in project tika by apache.

the class HSLFExtractor method extractMaster.

private void extractMaster(XHTMLContentHandler xhtml, HSLFMasterSheet master) throws SAXException {
    if (master == null) {
        return;
    }
    List<HSLFShape> shapes = master.getShapes();
    if (shapes == null || shapes.isEmpty()) {
        return;
    }
    xhtml.startElement("div", "class", "slide-master-content");
    for (HSLFShape shape : shapes) {
        if (shape != null && !HSLFMasterSheet.isPlaceholder(shape)) {
            if (shape instanceof HSLFTextShape) {
                HSLFTextShape tsh = (HSLFTextShape) shape;
                String text = tsh.getText();
                if (text != null) {
                    xhtml.element("p", text);
                }
            }
        }
    }
    xhtml.endElement("div");
}
Also used : HSLFShape(org.apache.poi.hslf.usermodel.HSLFShape) HSLFTextShape(org.apache.poi.hslf.usermodel.HSLFTextShape)

Example 10 with HSLFShape

use of org.apache.poi.hslf.usermodel.HSLFShape in project tika by apache.

the class HSLFExtractor method handleSlideEmbeddedResources.

private void handleSlideEmbeddedResources(HSLFSlide slide, XHTMLContentHandler xhtml) throws TikaException, SAXException, IOException {
    List<HSLFShape> shapes;
    try {
        shapes = slide.getShapes();
    } catch (NullPointerException e) {
        // Sometimes HSLF hits problems
        // Please open POI bugs for any you come across!
        EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
        return;
    }
    for (HSLFShape shape : shapes) {
        if (shape instanceof OLEShape) {
            OLEShape oleShape = (OLEShape) shape;
            HSLFObjectData data = null;
            try {
                data = oleShape.getObjectData();
            } catch (NullPointerException e) {
                /* getObjectData sometimes throws an NPE. */
                EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                continue;
            }
            if (data != null) {
                String objID = Integer.toString(oleShape.getObjectID());
                // Embedded object: add a <div class="embedded" id="X"/> so the
                // consumer can see where in the main text each embedded
                // document occurred.
                AttributesImpl attributes = new AttributesImpl();
                attributes.addAttribute("", "class", "class", "CDATA", "embedded");
                attributes.addAttribute("", "id", "id", "CDATA", objID);
                xhtml.startElement("div", attributes);
                xhtml.endElement("div");
                InputStream dataStream = null;
                try {
                    dataStream = data.getData();
                } catch (Exception e) {
                    EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                    continue;
                }
                try (TikaInputStream stream = TikaInputStream.get(dataStream)) {
                    String mediaType = null;
                    if ("Excel.Chart.8".equals(oleShape.getProgID())) {
                        mediaType = "application/vnd.ms-excel";
                    } else {
                        MediaType mt = getTikaConfig().getDetector().detect(stream, new Metadata());
                        mediaType = mt.toString();
                    }
                    if (mediaType.equals("application/x-tika-msoffice-embedded; format=comp_obj")) {
                        try (NPOIFSFileSystem npoifs = new NPOIFSFileSystem(new CloseShieldInputStream(stream))) {
                            handleEmbeddedOfficeDoc(npoifs.getRoot(), objID, xhtml);
                        }
                    } else {
                        handleEmbeddedResource(stream, objID, objID, mediaType, xhtml, false);
                    }
                } catch (IOException e) {
                    EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                }
            }
        }
    }
}
Also used : TikaInputStream(org.apache.tika.io.TikaInputStream) CloseShieldInputStream(org.apache.tika.io.CloseShieldInputStream) InputStream(java.io.InputStream) Metadata(org.apache.tika.metadata.Metadata) IOException(java.io.IOException) HSLFObjectData(org.apache.poi.hslf.usermodel.HSLFObjectData) OLEShape(org.apache.poi.hslf.model.OLEShape) TikaException(org.apache.tika.exception.TikaException) SAXException(org.xml.sax.SAXException) NPOIFSFileSystem(org.apache.poi.poifs.filesystem.NPOIFSFileSystem) HSLFShape(org.apache.poi.hslf.usermodel.HSLFShape) AttributesImpl(org.xml.sax.helpers.AttributesImpl) MediaType(org.apache.tika.mime.MediaType)
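
In handleSlideEmbeddedResources above, the media type of an embedded object is fixed for the "Excel.Chart.8" ProgID and otherwise left to content detection. That decision can be sketched without Tika or POI on the classpath. This is an illustrative assumption, not the extractor's actual structure: the class EmbeddedMediaType, the method choose and the Supplier standing in for the detector are all hypothetical.

```java
import java.util.function.Supplier;

final class EmbeddedMediaType {

    // Hypothetical helper (not part of Tika). The detector is passed in as a
    // Supplier so that it is consulted only when the ProgID gives no answer.
    static String choose(String progId, Supplier<String> detector) {
        if ("Excel.Chart.8".equals(progId)) {
            return "application/vnd.ms-excel";
        }
        return detector.get();
    }

    public static void main(String[] args) {
        // prints "application/vnd.ms-excel"
        System.out.println(choose("Excel.Chart.8", () -> "application/octet-stream"));
        // prints "image/png"
        System.out.println(choose(null, () -> "image/png"));
    }
}
```

Writing the comparison as "Excel.Chart.8".equals(progId), as the original does, keeps the check safe when the ProgID is null.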

Aggregations

HSLFShape (org.apache.poi.hslf.usermodel.HSLFShape) 20
HSLFSlide (org.apache.poi.hslf.usermodel.HSLFSlide) 17
HSLFSlideShow (org.apache.poi.hslf.usermodel.HSLFSlideShow) 17
Test (org.junit.Test) 12
ByteArrayInputStream (java.io.ByteArrayInputStream) 4
ByteArrayOutputStream (java.io.ByteArrayOutputStream) 4
HSLFTable (org.apache.poi.hslf.usermodel.HSLFTable) 4
HSLFTextShape (org.apache.poi.hslf.usermodel.HSLFTextShape) 4
FileInputStream (java.io.FileInputStream) 3
HSLFAutoShape (org.apache.poi.hslf.usermodel.HSLFAutoShape) 3
HSLFLine (org.apache.poi.hslf.usermodel.HSLFLine) 3
HSLFObjectData (org.apache.poi.hslf.usermodel.HSLFObjectData) 3
HSLFPictureData (org.apache.poi.hslf.usermodel.HSLFPictureData) 3
HSLFTextRun (org.apache.poi.hslf.usermodel.HSLFTextRun) 3
Rectangle2D (java.awt.geom.Rectangle2D) 2
InputStream (java.io.InputStream) 2
HashSet (java.util.HashSet) 2
Comment (org.apache.poi.hslf.model.Comment) 2
HeadersFooters (org.apache.poi.hslf.model.HeadersFooters) 2
OLEShape (org.apache.poi.hslf.model.OLEShape) 2