Example 1 with TrackingHandler

Use of org.apache.tika.TikaTest.TrackingHandler in the Apache Tika project.

Class FictionBookParserTest, method testEmbedded.

@Test
public void testEmbedded() throws Exception {
    try (InputStream input = FictionBookParserTest.class.getResourceAsStream("/test-documents/test.fb2")) {
        ContainerExtractor extractor = new ParserContainerExtractor();
        TikaInputStream stream = TikaInputStream.get(input);
        assertTrue(extractor.isSupported(stream));
        // Process it
        TrackingHandler handler = new TrackingHandler();
        extractor.extract(stream, null, handler);
        assertEquals(2, handler.filenames.size());
    }
}
Also used: InputStream (java.io.InputStream), TikaInputStream (org.apache.tika.io.TikaInputStream), TrackingHandler (org.apache.tika.TikaTest.TrackingHandler), ContainerExtractor (org.apache.tika.extractor.ContainerExtractor), ParserContainerExtractor (org.apache.tika.extractor.ParserContainerExtractor), Test (org.junit.Test)

Example 2 with TrackingHandler

Use of org.apache.tika.TikaTest.TrackingHandler in the Apache Tika project.

Class TNEFParserTest, method testBodyAndAttachments.

/**
 * Check that the RTF body and the attachments are returned
 * as expected.
 */
@Test
public void testBodyAndAttachments() throws Exception {
    ContainerExtractor extractor = new ParserContainerExtractor();
    // Process it recursively; the result will contain
    // the message body (RTF) and the attachments.
    // Note: `file` and process(...) are defined outside this snippet.
    TrackingHandler handler = process(file, extractor, true);
    assertEquals(6, handler.filenames.size());
    assertEquals(6, handler.mediaTypes.size());
    // We know the filenames for all of them
    assertEquals("message.rtf", handler.filenames.get(0));
    assertEquals(MediaType.application("rtf"), handler.mediaTypes.get(0));
    assertEquals("quick.doc", handler.filenames.get(1));
    assertEquals(MediaType.application("msword"), handler.mediaTypes.get(1));
    assertEquals("quick.html", handler.filenames.get(2));
    assertEquals(MediaType.text("html"), handler.mediaTypes.get(2));
    assertEquals("quick.pdf", handler.filenames.get(3));
    assertEquals(MediaType.application("pdf"), handler.mediaTypes.get(3));
    assertEquals("quick.txt", handler.filenames.get(4));
    assertEquals(MediaType.text("plain"), handler.mediaTypes.get(4));
    assertEquals("quick.xml", handler.filenames.get(5));
    assertEquals(MediaType.application("xml"), handler.mediaTypes.get(5));
}
Also used: TrackingHandler (org.apache.tika.TikaTest.TrackingHandler), ContainerExtractor (org.apache.tika.extractor.ContainerExtractor), ParserContainerExtractor (org.apache.tika.extractor.ParserContainerExtractor), Test (org.junit.Test)
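
Both examples read handler.filenames and handler.mediaTypes after extraction. The following is a minimal, self-contained sketch of that recording pattern. It is not Tika's actual TrackingHandler: ResourceCallback, RecordingHandler, and the extract stub are hypothetical stand-ins, and media types are plain strings here rather than MediaType objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the callback a container extractor invokes
// once per embedded resource (not a Tika interface).
interface ResourceCallback {
    void handle(String filename, String mediaType);
}

// Records every embedded resource it is told about, in call order,
// so a test can assert on the collected lists afterwards.
class RecordingHandler implements ResourceCallback {
    final List<String> filenames = new ArrayList<>();
    final List<String> mediaTypes = new ArrayList<>();

    @Override
    public void handle(String filename, String mediaType) {
        filenames.add(filename);
        mediaTypes.add(mediaType);
    }
}

class RecordingHandlerDemo {
    // Hypothetical extractor stub: reports a fixed set of embedded resources.
    static void extract(ResourceCallback callback) {
        callback.handle("message.rtf", "application/rtf");
        callback.handle("quick.doc", "application/msword");
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        extract(handler);
        System.out.println(handler.filenames);
        System.out.println(handler.mediaTypes);
    }
}
```

Because both lists are appended to in the same call, index i in filenames and index i in mediaTypes describe the same embedded resource, which is why the TNEF test can check them pairwise.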

Aggregations

TrackingHandler (org.apache.tika.TikaTest.TrackingHandler): 2
ContainerExtractor (org.apache.tika.extractor.ContainerExtractor): 2
ParserContainerExtractor (org.apache.tika.extractor.ParserContainerExtractor): 2
Test (org.junit.Test): 2
InputStream (java.io.InputStream): 1
TikaInputStream (org.apache.tika.io.TikaInputStream): 1