Example 31 with TemporaryResources

Use of org.apache.tika.io.TemporaryResources in the Apache Tika project.

Class PooledTimeSeriesParser, method parse.

/**
 * Parses a document stream into a sequence of XHTML SAX events. Fills in
 * related document metadata in the given metadata object.
 * <p>
 * The given document stream is consumed but not closed by this method. The
 * responsibility to close the stream remains on the caller.
 * <p>
 * Information about the parsing context can be passed in the context
 * parameter. See the parser implementations for the kinds of context
 * information they expect.
 *
 * @param stream   the document stream (input)
 * @param handler  handler for the XHTML SAX events (output)
 * @param metadata document metadata (input and output)
 * @param context  parse context
 * @throws IOException   if the document stream could not be read
 * @throws SAXException  if the SAX events could not be processed
 * @throws TikaException if the document could not be parsed
 * @since Apache Tika 0.5
 */
@Override
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException {
    if (!isAvailable) {
        LOG.warn("PooledTimeSeries not installed!");
        return;
    }
    XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
    TemporaryResources tmp = new TemporaryResources();
    try {
        TikaInputStream tikaStream = TikaInputStream.get(stream, tmp);
        File input = tikaStream.getFile();
        String cmdOutput = computePoT(input);
        try (InputStream ofStream = new FileInputStream(new File(input.getAbsoluteFile() + ".of.txt"));
             InputStream ogStream = new FileInputStream(new File(input.getAbsoluteFile() + ".hog.txt"))) {
            extractHeaderOutput(ofStream, metadata, "of");
            extractHeaderOutput(ogStream, metadata, "og");
            xhtml.startDocument();
            doExtract(ofStream, xhtml, "Histogram of Optical Flows (HOF)", metadata.get("of_frames"), metadata.get("of_vecSize"));
            doExtract(ogStream, xhtml, "Histogram of Oriented Gradients (HOG)", metadata.get("og_frames"), metadata.get("og_vecSize"));
            xhtml.endDocument();
        }
        // Temporary workaround for TIKA-1445 - until we can specify
        //  composite parsers with strategies (eg Composite, Try In Turn),
        //  always send the video onwards to the regular parser to have
        //  its metadata extracted as well
        _TMP_VIDEO_METADATA_PARSER.parse(tikaStream, handler, metadata, context);
    } finally {
        tmp.dispose();
    }
}
Also used: TikaInputStream (org.apache.tika.io.TikaInputStream), FileInputStream (java.io.FileInputStream), InputStream (java.io.InputStream), TemporaryResources (org.apache.tika.io.TemporaryResources), XHTMLContentHandler (org.apache.tika.sax.XHTMLContentHandler), File (java.io.File)
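
The parse method above spools the caller's stream to a file so that a file-based tool can read it, then releases the temporary resources in a finally block without closing the caller's stream. The same lifecycle can be sketched with the Java standard library alone. This is only an illustration under stated assumptions: TempSpoolSketch, processViaTempFile and lastTempFile are hypothetical names, none of Tika's classes are used, and taking the file size stands in for whatever work needs a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for the temporary-resource pattern; not Tika API.
class TempSpoolSketch {

    // Path of the most recent temp file, kept only so cleanup can be checked.
    static Path lastTempFile;

    // Spools the stream to a temporary file, does file-based work on it,
    // and always deletes the file. The stream is consumed but not closed.
    static long processViaTempFile(InputStream stream) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".bin");
        lastTempFile = tmp;
        try {
            // createTempFile already created the file, so it must be replaced.
            Files.copy(stream, tmp, StandardCopyOption.REPLACE_EXISTING);
            // Placeholder for a tool that needs a real file on disk.
            return Files.size(tmp);
        } finally {
            // Runs on success and on failure, like tmp.dispose() above.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // The caller owns the stream and closes it with try-with-resources.
        try (InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8))) {
            long size = processViaTempFile(in);
            System.out.println(size + " bytes spooled; temp file still exists: " + Files.exists(lastTempFile));
        }
    }
}
```

Keeping the cleanup in finally means the temporary file is removed even when the file-based step throws, which is the same reason the original wraps its work in try and calls tmp.dispose() at the end.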

Aggregations

TemporaryResources (org.apache.tika.io.TemporaryResources) 31
TikaInputStream (org.apache.tika.io.TikaInputStream) 30
TikaException (org.apache.tika.exception.TikaException) 15
XHTMLContentHandler (org.apache.tika.sax.XHTMLContentHandler) 14
File (java.io.File) 11
IOException (java.io.IOException) 8
InputStream (java.io.InputStream) 6
SAXException (org.xml.sax.SAXException) 6
FileInputStream (java.io.FileInputStream) 4
EncryptedDocumentException (org.apache.tika.exception.EncryptedDocumentException) 4
Metadata (org.apache.tika.metadata.Metadata) 4
MediaType (org.apache.tika.mime.MediaType) 4
ZipArchiveEntry (org.apache.commons.compress.archivers.zip.ZipArchiveEntry) 2
EmbeddedDocumentExtractor (org.apache.tika.extractor.EmbeddedDocumentExtractor) 2
JempboxExtractor (org.apache.tika.parser.image.xmp.JempboxExtractor) 2
IsoFile (com.coremedia.iso.IsoFile) 1
Box (com.coremedia.iso.boxes.Box) 1
FileTypeBox (com.coremedia.iso.boxes.FileTypeBox) 1
MetaBox (com.coremedia.iso.boxes.MetaBox) 1
MovieBox (com.coremedia.iso.boxes.MovieBox) 1