Example 1 with RereadableInputStream

Use of org.apache.tika.utils.RereadableInputStream in the Apache Tika project.

Class TestRereadableInputStream, method doACloseBehaviorTest.

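private void doACloseBehaviorTest(boolean wantToClose) throws IOException {
    TestInputStream tis = createTestInputStream();
    // Fourth constructor argument: whether closing the wrapper also closes the wrapped stream.
    RereadableInputStream ris = new RereadableInputStream(tis, 5, true, wantToClose);
    ris.close();
    // The wrapped stream must be closed exactly when wantToClose is true.
    assertEquals(wantToClose, tis.isClosed());
    // Clean up if the wrapper left the wrapped stream open.
    if (!tis.isClosed()) {
        tis.close();
    }
}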
Also used: RereadableInputStream (org.apache.tika.utils.RereadableInputStream)

Example 2 with RereadableInputStream

Use of org.apache.tika.utils.RereadableInputStream in the Apache Tika project.

Class TSDParser, method parse.

@Override
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException {
    // Try to parse the TSD file
    try (RereadableInputStream ris = new RereadableInputStream(stream, 2048, true, true)) {
        Metadata TSDAndEmbeddedMetadata = new Metadata();
        List<TSDMetas> tsdMetasList = this.extractMetas(ris);
        this.buildMetas(tsdMetasList, metadata != null && metadata.size() > 0 ? TSDAndEmbeddedMetadata : metadata);
        XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
        xhtml.startDocument();
        ris.rewind();
        // Try to parse the file embedded in the TSD file
        this.parseTSDContent(ris, handler, TSDAndEmbeddedMetadata, context);
        xhtml.endDocument();
    }
}
Also used: RereadableInputStream (org.apache.tika.utils.RereadableInputStream), Metadata (org.apache.tika.metadata.Metadata), XHTMLContentHandler (org.apache.tika.sax.XHTMLContentHandler)

Example 3 with RereadableInputStream

Use of org.apache.tika.utils.RereadableInputStream in the Apache Tika project.

Class TestRereadableInputStream, method test.

@Test
public void test() throws IOException {
    InputStream is = createTestInputStream();
    RereadableInputStream ris = new RereadableInputStream(is, MEMORY_THRESHOLD, true, true);
    try {
        for (int pass = 0; pass < NUM_PASSES; pass++) {
            for (int byteNum = 0; byteNum < TEST_SIZE; byteNum++) {
                int byteRead = ris.read();
                assertEquals("Pass = " + pass + ", byte num should be " + byteNum + " but is " + byteRead + ".", byteNum, byteRead);
            }
            ris.rewind();
        }
    } finally {
        // The RereadableInputStream should close the original input
        // stream (if it hasn't already).
        ris.close();
    }
}
Also used: BufferedInputStream (java.io.BufferedInputStream), FileInputStream (java.io.FileInputStream), RereadableInputStream (org.apache.tika.utils.RereadableInputStream), InputStream (java.io.InputStream), Test (org.junit.Test)

Example 4 with RereadableInputStream

Use of org.apache.tika.utils.RereadableInputStream in the Apache Tika project.

Class TestRereadableInputStream, method doTestRewind.

private void doTestRewind(boolean readToEndOnRewind) throws IOException {
    RereadableInputStream ris = null;
    try {
        InputStream s1 = createTestInputStream();
        ris = new RereadableInputStream(s1, 5, readToEndOnRewind, true);
        // Read a single byte, so exactly one byte has been stored so far.
        ris.read();
        assertEquals(1, ris.getSize());
        ris.rewind();
        // The stored size should grow past 1 on rewind only when readToEndOnRewind is true.
        boolean moreBytesWereRead = (ris.getSize() > 1);
        assertEquals(readToEndOnRewind, moreBytesWereRead);
    } finally {
        if (ris != null) {
            ris.close();
        }
    }
}
Also used: RereadableInputStream (org.apache.tika.utils.RereadableInputStream), BufferedInputStream (java.io.BufferedInputStream), FileInputStream (java.io.FileInputStream), InputStream (java.io.InputStream)
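
The read-then-rewind pattern exercised in these examples can be illustrated without a Tika dependency. The following is only a minimal sketch and not Tika's implementation: SimpleRereadableStream and all of its members are hypothetical names. It assumes the whole stream fits in memory, so it has no memory threshold, and it omits the read-to-end-on-rewind option.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical, simplified stand-in for a rereadable stream (not Tika's class).
 * Every byte read from the source is copied into an in-memory buffer so that
 * rewind() can replay the bytes seen so far from the beginning.
 */
class SimpleRereadableStream extends InputStream {
    private final InputStream source;
    private final boolean closeSourceOnClose;
    private final ByteArrayOutputStream seen = new ByteArrayOutputStream();
    private InputStream replay; // non-null while buffered bytes are being replayed

    SimpleRereadableStream(InputStream source, boolean closeSourceOnClose) {
        this.source = source;
        this.closeSourceOnClose = closeSourceOnClose;
    }

    @Override
    public int read() throws IOException {
        if (replay != null) {
            int b = replay.read();
            if (b != -1) {
                return b;
            }
            replay = null; // buffer exhausted; fall through to the source
        }
        int b = source.read();
        if (b != -1) {
            seen.write(b); // remember the byte for later rewinds
        }
        return b;
    }

    /** Restart reading from the first byte that was ever read. */
    public void rewind() {
        replay = new ByteArrayInputStream(seen.toByteArray());
    }

    /** Number of bytes pulled from the source so far. */
    public int getSize() {
        return seen.size();
    }

    @Override
    public void close() throws IOException {
        if (closeSourceOnClose) {
            source.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30};
        try (SimpleRereadableStream s =
                 new SimpleRereadableStream(new ByteArrayInputStream(data), true)) {
            s.read();   // consume one byte from the source
            s.rewind(); // go back to the start
            int b;
            while ((b = s.read()) != -1) {
                System.out.println(b); // prints 10, 20, 30 on separate lines
            }
        }
    }
}
```

After a rewind, the sketch first replays the buffered bytes and then continues with the unread remainder of the source, which is why a second pass yields the complete content even when the first pass stopped early.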

Aggregations

RereadableInputStream (org.apache.tika.utils.RereadableInputStream): 4
BufferedInputStream (java.io.BufferedInputStream): 2
FileInputStream (java.io.FileInputStream): 2
InputStream (java.io.InputStream): 2
Metadata (org.apache.tika.metadata.Metadata): 1
XHTMLContentHandler (org.apache.tika.sax.XHTMLContentHandler): 1
Test (org.junit.Test): 1