
Example 6 with CloseShieldInputStream

Use of org.apache.tika.io.CloseShieldInputStream in the Apache Tika project.

Class HSLFExtractor, method handleSlideEmbeddedResources.

private void handleSlideEmbeddedResources(HSLFSlide slide, XHTMLContentHandler xhtml) throws TikaException, SAXException, IOException {
    List<HSLFShape> shapes;
    try {
        shapes = slide.getShapes();
    } catch (NullPointerException e) {
        // Sometimes HSLF hits problems
        // Please open POI bugs for any you come across!
        EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
        return;
    }
    for (HSLFShape shape : shapes) {
        if (shape instanceof OLEShape) {
            OLEShape oleShape = (OLEShape) shape;
            HSLFObjectData data = null;
            try {
                data = oleShape.getObjectData();
            } catch (NullPointerException e) {
                /* getObjectData sometimes throws an NPE. */
                EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                continue;
            }
            if (data != null) {
                String objID = Integer.toString(oleShape.getObjectID());
                // Embedded object: emit an empty <div class="embedded" id="X"/>
                // so that consumers can see where in the main text each
                // embedded document occurred.
                AttributesImpl attributes = new AttributesImpl();
                attributes.addAttribute("", "class", "class", "CDATA", "embedded");
                attributes.addAttribute("", "id", "id", "CDATA", objID);
                xhtml.startElement("div", attributes);
                xhtml.endElement("div");
                InputStream dataStream = null;
                try {
                    dataStream = data.getData();
                } catch (Exception e) {
                    EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                    continue;
                }
                try (TikaInputStream stream = TikaInputStream.get(dataStream)) {
                    String mediaType = null;
                    if ("Excel.Chart.8".equals(oleShape.getProgID())) {
                        mediaType = "application/vnd.ms-excel";
                    } else {
                        MediaType mt = getTikaConfig().getDetector().detect(stream, new Metadata());
                        mediaType = mt.toString();
                    }
                    if (mediaType.equals("application/x-tika-msoffice-embedded; format=comp_obj")) {
                        try (NPOIFSFileSystem npoifs = new NPOIFSFileSystem(new CloseShieldInputStream(stream))) {
                            handleEmbeddedOfficeDoc(npoifs.getRoot(), objID, xhtml);
                        }
                    } else {
                        handleEmbeddedResource(stream, objID, objID, mediaType, xhtml, false);
                    }
                } catch (IOException e) {
                    EmbeddedDocumentUtil.recordEmbeddedStreamException(e, parentMetadata);
                }
            }
        }
    }
}
Also used: TikaInputStream (org.apache.tika.io.TikaInputStream), CloseShieldInputStream (org.apache.tika.io.CloseShieldInputStream), InputStream (java.io.InputStream), Metadata (org.apache.tika.metadata.Metadata), IOException (java.io.IOException), HSLFObjectData (org.apache.poi.hslf.usermodel.HSLFObjectData), OLEShape (org.apache.poi.hslf.model.OLEShape), TikaException (org.apache.tika.exception.TikaException), SAXException (org.xml.sax.SAXException), NPOIFSFileSystem (org.apache.poi.poifs.filesystem.NPOIFSFileSystem), HSLFShape (org.apache.poi.hslf.usermodel.HSLFShape), AttributesImpl (org.xml.sax.helpers.AttributesImpl), MediaType (org.apache.tika.mime.MediaType)
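
In the example above, the NPOIFSFileSystem is opened in a try-with-resources block over a CloseShieldInputStream, presumably so that closing the file system does not also close the enclosing TikaInputStream. The following is a minimal sketch of that idea using only java.io. ShieldedStream, TrackingStream, and readFirstByteAndClose are hypothetical stand-ins written for this illustration; they are not the Tika or POI classes, and the real CloseShieldInputStream may behave differently in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class CloseShieldSketch {

    /** Hypothetical stand-in for a close shield: forwards reads, ignores close(). */
    static final class ShieldedStream extends FilterInputStream {
        ShieldedStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Intentionally empty: the wrapped stream stays open for its owner.
        }
    }

    /** Test helper that records whether close() reached the underlying stream. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Stand-in for a consumer that closes whatever stream it is handed. */
    static int readFirstByteAndClose(InputStream in) throws IOException {
        try (InputStream toClose = in) {
            return toClose.read();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream outer = new TrackingStream("ab".getBytes(StandardCharsets.US_ASCII));

        // The consumer closes the shield, not the stream behind it.
        int first = readFirstByteAndClose(new ShieldedStream(outer));
        System.out.println("first byte: " + (char) first);
        System.out.println("outer closed: " + outer.closed);

        // The owner can keep reading from the original stream afterwards.
        System.out.println("next byte: " + (char) outer.read());
    }
}
```

The owner of the outer stream remains responsible for closing it, which in the example above is handled by the enclosing try-with-resources on the TikaInputStream.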

Example 7 with CloseShieldInputStream

Use of org.apache.tika.io.CloseShieldInputStream in the Codice DDF project.

Class TikaInputTransformer, method createThumbnail.

private void createThumbnail(InputStream input, Metacard metacard) {
    try {
        Image image = ImageIO.read(new CloseShieldInputStream(input));
        if (null != image) {
            BufferedImage bufferedImage = new BufferedImage(image.getWidth(null), image.getHeight(null), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = bufferedImage.createGraphics();
            graphics.drawImage(image, null, null);
            graphics.dispose();
            BufferedImage thumb = Scalr.resize(bufferedImage, 200);
            try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                ImageIO.write(thumb, "jpeg", out);
                byte[] thumbBytes = out.toByteArray();
                metacard.setAttribute(new AttributeImpl(Metacard.THUMBNAIL, thumbBytes));
            }
        } else {
            LOGGER.debug("Unable to read image from input stream to create thumbnail.");
        }
    } catch (Exception e) {
        LOGGER.debug("Unable to read image from input stream to create thumbnail.", e);
    }
}
Also used: AttributeImpl (ddf.catalog.data.impl.AttributeImpl), ByteArrayOutputStream (java.io.ByteArrayOutputStream), Image (java.awt.Image), BufferedImage (java.awt.image.BufferedImage), CloseShieldInputStream (org.apache.tika.io.CloseShieldInputStream), CatalogTransformerException (ddf.catalog.transform.CatalogTransformerException), SAXException (org.xml.sax.SAXException), TransformerException (javax.xml.transform.TransformerException), TransformerConfigurationException (javax.xml.transform.TransformerConfigurationException), IOException (java.io.IOException), Graphics2D (java.awt.Graphics2D)
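
The resize-and-encode step in the example above relies on Scalr, a third-party library. As a rough illustration of the same flow with the JDK alone, the sketch below scales an image with Java2D and encodes it to JPEG bytes. ThumbnailSketch, scaleToWidth, and toJpegBytes are hypothetical names introduced here, and scaling to a fixed width is a simplification that is not necessarily equivalent to what Scalr.resize does.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ThumbnailSketch {

    /** Scales the source to the given width, keeping the aspect ratio. */
    static BufferedImage scaleToWidth(BufferedImage source, int targetWidth) {
        int targetHeight = Math.max(1, source.getHeight() * targetWidth / source.getWidth());
        // TYPE_INT_RGB has no alpha channel, which the JPEG writer requires.
        BufferedImage scaled =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = scaled.createGraphics();
        try {
            graphics.setRenderingHint(
                    RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            graphics.drawImage(source, 0, 0, targetWidth, targetHeight, null);
        } finally {
            graphics.dispose();
        }
        return scaled;
    }

    /** Encodes the image as JPEG and returns the bytes. */
    static byte[] toJpegBytes(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(image, "jpeg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage source = new BufferedImage(400, 200, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = scaleToWidth(source, 200);
        byte[] thumbBytes = toJpegBytes(thumb);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()
                + ", " + thumbBytes.length + " bytes");
    }
}
```

The resulting byte array plays the role of thumbBytes in the example above, where it is stored on the metacard under Metacard.THUMBNAIL.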

Aggregations

CloseShieldInputStream (org.apache.tika.io.CloseShieldInputStream): 7
TikaInputStream (org.apache.tika.io.TikaInputStream): 5
IOException (java.io.IOException): 4
InputStream (java.io.InputStream): 3
SAXException (org.xml.sax.SAXException): 3
TikaException (org.apache.tika.exception.TikaException): 2
Metadata (org.apache.tika.metadata.Metadata): 2
MediaType (org.apache.tika.mime.MediaType): 2
AttributesImpl (org.xml.sax.helpers.AttributesImpl): 2
AttributeImpl (ddf.catalog.data.impl.AttributeImpl): 1
CatalogTransformerException (ddf.catalog.transform.CatalogTransformerException): 1
Graphics2D (java.awt.Graphics2D): 1
Image (java.awt.Image): 1
BufferedImage (java.awt.image.BufferedImage): 1
BufferedInputStream (java.io.BufferedInputStream): 1
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 1
File (java.io.File): 1
FilterOutputStream (java.io.FilterOutputStream): 1
Socket (java.net.Socket): 1
URL (java.net.URL): 1