Example 1 with ParallelFileProcessingResult

use of org.apache.tika.batch.ParallelFileProcessingResult in project tika by apache.

the class OutputStreamFactoryTest method testSkip.

@Test
public void testSkip() throws Exception {
    Path outputDir = getNewOutputDir("os-factory-skip-");
    Map<String, String> args = getDefaultArgs("basic", outputDir);
    // "skip" leaves an already existing output file untouched.
    args.put("handleExisting", "skip");
    BatchProcess runner = getNewBatchRunner("/tika-batch-config-test.xml", args);
    ParallelFileProcessingResult result = run(runner);
    assertEquals(1, countChildren(outputDir));
    // A second run with a fresh runner must not add another output file.
    runner = getNewBatchRunner("/tika-batch-config-test.xml", args);
    result = run(runner);
    assertEquals(1, countChildren(outputDir));
}
Also used : Path(java.nio.file.Path) ParallelFileProcessingResult(org.apache.tika.batch.ParallelFileProcessingResult) BatchProcess(org.apache.tika.batch.BatchProcess) Test(org.junit.Test)

Example 2 with ParallelFileProcessingResult

use of org.apache.tika.batch.ParallelFileProcessingResult in project tika by apache.

the class OutputStreamFactoryTest method testIllegalState.

@Test
public void testIllegalState() throws Exception {
    Path outputDir = getNewOutputDir("os-factory-illegal-state-");
    Map<String, String> args = getDefaultArgs("basic", outputDir);
    BatchProcess runner = getNewBatchRunner("/tika-batch-config-test.xml", args);
    run(runner);
    assertEquals(1, countChildren(outputDir));
    boolean illegalState = false;
    try {
        // Reusing the same runner instance is expected to fail.
        ParallelFileProcessingResult result = run(runner);
    } catch (ExecutionException e) {
        // Future.get() wraps the task's exception, so inspect the cause.
        if (e.getCause() instanceof IllegalStateException) {
            illegalState = true;
        }
    }
    assertTrue("Should have been an illegal state exception", illegalState);
}
Also used : Path(java.nio.file.Path) ParallelFileProcessingResult(org.apache.tika.batch.ParallelFileProcessingResult) BatchProcess(org.apache.tika.batch.BatchProcess) ExecutionException(java.util.concurrent.ExecutionException) Test(org.junit.Test)
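
The test above checks `e.getCause()` rather than catching `IllegalStateException` directly, because `Future.get` reports any exception thrown by the submitted task wrapped in an `ExecutionException`. The following is a minimal, self-contained sketch of that behavior. `ExecutionExceptionDemo` and `OneShotTask` are hypothetical names invented for illustration; `OneShotTask` is only a stand-in for a task that refuses to run twice and is not how Tika's `BatchProcess` is implemented.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ExecutionExceptionDemo {

    // Hypothetical stand-in for a task that may only be run once.
    static class OneShotTask implements Callable<String> {
        private boolean started = false;

        @Override
        public synchronized String call() {
            if (started) {
                throw new IllegalStateException("task already run");
            }
            started = true;
            return "done";
        }
    }

    // Submits the task and reports whether Future.get failed with an
    // ExecutionException whose cause is an IllegalStateException.
    static boolean failsWithIllegalState(Callable<String> task) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(task);
            future.get(10, TimeUnit.SECONDS);
            return false;
        } catch (ExecutionException e) {
            // The task's own exception is available as the cause.
            return e.getCause() instanceof IllegalStateException;
        } finally {
            // Always release the worker thread.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        OneShotTask task = new OneShotTask();
        System.out.println(failsWithIllegalState(task)); // first run
        System.out.println(failsWithIllegalState(task)); // second run, same instance
    }
}
```

Returning a boolean from the helper keeps the wrapping behavior visible at the call site, much like the `illegalState` flag in the test.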

Example 3 with ParallelFileProcessingResult

use of org.apache.tika.batch.ParallelFileProcessingResult in project tika by apache.

the class FSBatchTestBase method run.

protected ParallelFileProcessingResult run(BatchProcess process) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
        Future<ParallelFileProcessingResult> futureResult = executor.submit(process);
        return futureResult.get(10, TimeUnit.SECONDS);
    } finally {
        // Release the worker thread even if get() throws or times out.
        executor.shutdownNow();
    }
}
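Also used : ParallelFileProcessingResult(org.apache.tika.batch.ParallelFileProcessingResult) BatchProcess(org.apache.tika.batch.BatchProcess) ExecutorService(java.util.concurrent.ExecutorService) Executors(java.util.concurrent.Executors) Future(java.util.concurrent.Future) TimeUnit(java.util.concurrent.TimeUnit)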

Example 4 with ParallelFileProcessingResult

use of org.apache.tika.batch.ParallelFileProcessingResult in project tika by apache.

the class HandlerBuilderTest method testXML.

@Test
public void testXML() throws Exception {
    Path outputDir = getNewOutputDir("handler-xml-");
    Map<String, String> args = getDefaultArgs("basic", outputDir);
    // Ask the batch handler to write XML output.
    args.put("basicHandlerType", "xml");
    BatchProcess runner = getNewBatchRunner("/tika-batch-config-test.xml", args);
    ParallelFileProcessingResult result = run(runner);
    // The output file name is the input name (test0.xml) plus an ".xml" suffix.
    Path outputFile = outputDir.resolve("test0.xml.xml");
    String resultString = readFileToString(outputFile, UTF_8);
    assertTrue(resultString.contains("<html xmlns=\"http://www.w3.org/1999/xhtml\">"));
    assertTrue(resultString.contains("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"));
    assertTrue(resultString.contains("This is tika-batch's first test file"));
}
Also used : Path(java.nio.file.Path) ParallelFileProcessingResult(org.apache.tika.batch.ParallelFileProcessingResult) BatchProcess(org.apache.tika.batch.BatchProcess) Test(org.junit.Test)

Example 5 with ParallelFileProcessingResult

use of org.apache.tika.batch.ParallelFileProcessingResult in project tika by apache.

the class HandlerBuilderTest method testRecursiveParserWrapper.

@Test
public void testRecursiveParserWrapper() throws Exception {
    Path outputDir = getNewOutputDir("handler-recursive-parser");
    Map<String, String> args = getDefaultArgs("basic", outputDir);
    args.put("basicHandlerType", "txt");
    args.put("recursiveParserWrapper", "true");
    BatchProcess runner = getNewBatchRunner("/tika-batch-config-test.xml", args);
    ParallelFileProcessingResult result = run(runner);
    // With recursiveParserWrapper enabled, the test expects a ".json" output file.
    Path outputFile = outputDir.resolve("test0.xml.json");
    String resultString = readFileToString(outputFile, UTF_8);
    assertTrue(resultString.contains("\"author\":\"Nikolai Lobachevsky\""));
    // The apostrophe appears escaped as \u0027 in the JSON text.
    assertTrue(resultString.contains("tika-batch\\u0027s first test file"));
}
Also used : Path(java.nio.file.Path) ParallelFileProcessingResult(org.apache.tika.batch.ParallelFileProcessingResult) BatchProcess(org.apache.tika.batch.BatchProcess) Test(org.junit.Test)

Aggregations

ParallelFileProcessingResult (org.apache.tika.batch.ParallelFileProcessingResult): 9
BatchProcess (org.apache.tika.batch.BatchProcess): 8
Path (java.nio.file.Path): 7
Test (org.junit.Test): 7
ExecutorService (java.util.concurrent.ExecutorService): 2
HashMap (java.util.HashMap): 1
ExecutionException (java.util.concurrent.ExecutionException): 1
CommandLine (org.apache.commons.cli.CommandLine): 1
CommandLineParser (org.apache.commons.cli.CommandLineParser): 1
DefaultParser (org.apache.commons.cli.DefaultParser): 1
Option (org.apache.commons.cli.Option): 1
BatchProcessBuilder (org.apache.tika.batch.builders.BatchProcessBuilder): 1
TikaInputStream (org.apache.tika.io.TikaInputStream): 1