
Example 6 with Tika

Use of org.apache.tika.Tika in the Apache Tika project.

Class ClassParserTest, method testClassParsing.

@Test
public void testClassParsing() throws Exception {
    String path = "/test-documents/AutoDetectParser.class";
    Metadata metadata = new Metadata();
    String content = new Tika().parseToString(ClassParserTest.class.getResourceAsStream(path), metadata);
    assertEquals("AutoDetectParser", metadata.get(TikaCoreProperties.TITLE));
    assertEquals("AutoDetectParser.class", metadata.get(Metadata.RESOURCE_NAME_KEY));
    assertTrue(content.contains("package org.apache.tika.parser;"));
    assertTrue(content.contains("class AutoDetectParser extends CompositeParser"));
    assertTrue(content.contains("private org.apache.tika.mime.MimeTypes types"));
    // Method signatures as rendered in the extracted text
    assertTrue(content.contains("public void parse("
            + "java.io.InputStream, org.xml.sax.ContentHandler,"
            + " org.apache.tika.metadata.Metadata) throws"
            + " java.io.IOException, org.xml.sax.SAXException,"
            + " org.apache.tika.exception.TikaException;"));
    assertTrue(content.contains("private byte[] getPrefix(java.io.InputStream, int)"
            + " throws java.io.IOException;"));
}
Also used: Metadata (org.apache.tika.metadata.Metadata), Tika (org.apache.tika.Tika), Test (org.junit.Test)

Example 7 with Tika

Use of org.apache.tika.Tika in the Apache Tika project.

Class AudioParserTest, method testWAV.

@Test
public void testWAV() throws Exception {
    String path = "/test-documents/testWAV.wav";
    Metadata metadata = new Metadata();
    String content = new Tika().parseToString(AudioParserTest.class.getResourceAsStream(path), metadata);
    assertEquals("audio/x-wav", metadata.get(Metadata.CONTENT_TYPE));
    assertEquals("44100.0", metadata.get("samplerate"));
    assertEquals("2", metadata.get("channels"));
    assertEquals("16", metadata.get("bits"));
    assertEquals("PCM_SIGNED", metadata.get("encoding"));
    // The test expects no text content from the WAV file, only metadata
    assertEquals("", content);
}
Also used: Metadata (org.apache.tika.metadata.Metadata), Tika (org.apache.tika.Tika), Test (org.junit.Test)

Example 8 with Tika

Use of org.apache.tika.Tika in the Apache Tika project.

Class TikaVersionTest, method testGetVersion.

@Test
public void testGetVersion() throws Exception {
    Response response = WebClient.create(endPoint + VERSION_PATH).type("text/plain").accept("text/plain").get();
    assertEquals(new Tika().toString(), getStringFromInputStream((InputStream) response.getEntity()));
}
Also used: Response (javax.ws.rs.core.Response), InputStream (java.io.InputStream), Tika (org.apache.tika.Tika), Test (org.junit.Test)

Example 9 with Tika

Use of org.apache.tika.Tika in the Apache Tika project.

Class TikaWelcomeTest, method testGetTextWelcome.

@Test
public void testGetTextWelcome() throws Exception {
    Response response = WebClient.create(endPoint + WELCOME_PATH).type("text/plain").accept("text/plain").get();
    String text = getStringFromInputStream((InputStream) response.getEntity());
    assertContains(new Tika().toString(), text);
    // Check our details were found
    assertContains("GET " + WELCOME_PATH, text);
    assertContains("=> text/plain", text);
    // Check that the Tika Version details come through too
    assertContains("GET " + VERSION_PATH, text);
}
Also used: Response (javax.ws.rs.core.Response), Tika (org.apache.tika.Tika), Test (org.junit.Test)

Example 10 with Tika

Use of org.apache.tika.Tika in the Apache Tika project.

Class TIAParsingExample, method parseToReaderExample.

public static void parseToReaderExample() throws Exception {
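    File document = new File("example.doc");
    // parse() returns a Reader over the extracted text; try-with-resources closes it
    try (Reader reader = new Tika().parse(document)) {
        char[] buffer = new char[1000];
        // read() returns the number of chars read, or -1 at end of stream
        int n = reader.read(buffer);
        while (n != -1) {
            // Print only the n chars that were actually filled
            System.out.append(CharBuffer.wrap(buffer, 0, n));
            n = reader.read(buffer);
        }
    }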
    }
}
Also used: Reader (java.io.Reader), Tika (org.apache.tika.Tika), File (java.io.File)
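
The read loop in parseToReaderExample can be tried without Tika or an example.doc file on hand. The following is a minimal sketch, assuming a plain java.io.StringReader as a stand-in for the Reader that Tika returns. The class name ReaderDrainSketch and the helper drain are hypothetical and are not part of Tika.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReaderDrainSketch {

    // Hypothetical helper (not a Tika API): collects everything a Reader
    // produces, using the same fixed-size buffer loop as the example above.
    public static String drain(Reader reader, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            // A zero-length buffer would make read() return 0 forever.
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[bufferSize];
        int n = reader.read(buffer);
        while (n != -1) {
            // Append only the n chars that this read actually filled.
            out.append(buffer, 0, n);
            n = reader.read(buffer);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for the Reader returned by Tika.parse(...).
        try (Reader reader = new StringReader("text that a parser might return")) {
            System.out.println(drain(reader, 8));
        }
    }
}
```

A deliberately small buffer (8 chars here) forces several passes through the loop, which makes it easier to check that partial reads are handled correctly.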

Aggregations

Tika (org.apache.tika.Tika): 50
Test (org.junit.Test): 32
Metadata (org.apache.tika.metadata.Metadata): 28
TikaTest (org.apache.tika.TikaTest): 12
TikaConfig (org.apache.tika.config.TikaConfig): 12
ByteArrayInputStream (java.io.ByteArrayInputStream): 11
File (java.io.File): 6
InputStream (java.io.InputStream): 6
URL (java.net.URL): 5
HashSet (java.util.HashSet): 4
TikaInputStream (org.apache.tika.io.TikaInputStream): 4
Ignore (org.junit.Ignore): 4
FileInputStream (java.io.FileInputStream): 3
Before (org.junit.Before): 3
IOException (java.io.IOException): 2
ArrayList (java.util.ArrayList): 2
Response (javax.ws.rs.core.Response): 2
CompositeDetector (org.apache.tika.detect.CompositeDetector): 2
TikaException (org.apache.tika.exception.TikaException): 2
MimeTypes (org.apache.tika.mime.MimeTypes): 2