
Example 1 with SearchException

Use of org.icij.datashare.batch.SearchException in project datashare by ICIJ.

In class BatchSearchLoop, method run.

public void run() {
    logger.info("Datashare running in batch mode. Waiting batch from ds:batchsearch.queue ({})", batchSearchQueue.getClass());
    String currentBatchId = null;
    waitForMainLoopCalled.countDown();
    loopThread = Thread.currentThread();
    while (!POISON.equals(currentBatchId) && !exitAsked) {
        try {
            currentBatchId = batchSearchQueue.poll(60, TimeUnit.SECONDS);
            if (currentBatchId != null && !POISON.equals(currentBatchId)) {
                BatchSearch batchSearch = repository.get(currentBatchId);
                if (batchSearch.state == BatchSearchRecord.State.QUEUED) {
                    repository.setState(batchSearch.uuid, BatchSearchRecord.State.RUNNING);
                    currentBatchSearchRunner.set(factory.createBatchSearchRunner(batchSearch, repository::saveResults));
                    currentBatchSearchRunner.get().call();
                    currentBatchSearchRunner.set(null);
                    repository.setState(batchSearch.uuid, BatchSearchRecord.State.SUCCESS);
                } else {
                    logger.warn("batch search {} not ran because in state {}", batchSearch.uuid, batchSearch.state);
                }
            }
        } catch (JooqBatchSearchRepository.BatchNotFoundException notFound) {
            logger.warn("batch was not executed : {}", notFound.toString());
        } catch (BatchSearchRunner.CancelException cancelEx) {
            logger.info("cancelling batch search {}", currentBatchId);
            batchSearchQueue.offer(currentBatchId);
            repository.reset(currentBatchId);
        } catch (SearchException sex) {
            logger.error("exception while running batch " + currentBatchId, sex);
            repository.setState(currentBatchId, sex);
        } catch (InterruptedException e) {
            logger.warn("main loop interrupted");
        }
    }
    logger.info("exiting main loop");
}
Also used : BatchSearch(org.icij.datashare.batch.BatchSearch) JooqBatchSearchRepository(org.icij.datashare.db.JooqBatchSearchRepository) SearchException(org.icij.datashare.batch.SearchException)
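
The loop in Example 1 follows the common "poison pill" pattern: it polls a blocking queue with a timeout and stops once a sentinel id arrives. The following is a minimal sketch of that pattern only, not the datashare implementation. The class name, the `drain` helper and the `"POISON"` value are assumptions made for illustration; the real POISON constant is not shown above. Unlike the example, the sketch leaves the loop when interrupted and restores the interrupt flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PoisonPillLoopSketch {
    // Hypothetical sentinel value; the real POISON constant is not shown in the example.
    static final String POISON = "POISON";

    /** Consumes ids from the queue until the sentinel arrives; returns the ids that were processed. */
    static List<String> drain(BlockingQueue<String> queue) {
        List<String> processed = new ArrayList<>();
        String current = null;
        while (!POISON.equals(current)) {
            try {
                // poll returns null on timeout, so the exit condition is re-checked periodically
                current = queue.poll(1, TimeUnit.SECONDS);
                if (current != null && !POISON.equals(current)) {
                    processed.add(current); // stands in for running one batch search
                }
            } catch (InterruptedException e) {
                // choice made for this sketch: restore the flag and stop
                Thread.currentThread().interrupt();
                break;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("batch-1");
        queue.add("batch-2");
        queue.add(POISON);
        System.out.println(drain(queue)); // prints [batch-1, batch-2]
    }
}
```

Anything enqueued after the sentinel is never consumed, which is why Example 3 adds the batch id before calling enqueuePoison().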

Example 2 with SearchException

Use of org.icij.datashare.batch.SearchException in project datashare by ICIJ.

In class BatchSearchRunnerTest, method test_run_batch_search_with_throttle_should_not_last_more_than_max_time.

@Test
public void test_run_batch_search_with_throttle_should_not_last_more_than_max_time() throws Exception {
    mockSearch.willReturn(5, createDoc("doc").build());
    BatchSearch batchSearch = new BatchSearch("uuid1", project("test-datashare"), "name1", "desc1", asSet("query1", "query2"), new Date(), BatchSearch.State.QUEUED, local());
    Date beforeBatch = timeRule.now;
    SearchException searchException = assertThrows(SearchException.class, () -> new BatchSearchRunner(indexer, new PropertiesProvider(new HashMap<String, String>() {
        {
            put(BATCH_THROTTLE, "1000");
            put(BATCH_SEARCH_MAX_TIME, "1");
        }
    }), batchSearch, resultConsumer).call());
    assertThat(searchException.toString()).contains("Batch timed out after 1s");
    assertThat(timeRule.now().getTime() - beforeBatch.getTime()).isEqualTo(1000);
}
Also used : PropertiesProvider(org.icij.datashare.PropertiesProvider) BatchSearch(org.icij.datashare.batch.BatchSearch) HashMap(java.util.HashMap) SearchException(org.icij.datashare.batch.SearchException) Date(java.util.Date) Test(org.junit.Test)

Example 3 with SearchException

Use of org.icij.datashare.batch.SearchException in project datashare by ICIJ.

In class BatchSearchLoopTestInt, method test_run_batch_search_failure.

@Test
public void test_run_batch_search_failure() throws Exception {
    when(factory.createBatchSearchRunner(any(), any())).thenThrow(new SearchException("query", new RuntimeException()));
    BatchSearchLoop app = new BatchSearchLoop(repository, batchSearchQueue, factory);
    batchSearchQueue.add(batchSearch.uuid);
    app.enqueuePoison();
    app.run();
    verify(repository).setState(batchSearch.uuid, BatchSearch.State.RUNNING);
    verify(repository).setState(eq(batchSearch.uuid), any(SearchException.class));
}
Also used : SearchException(org.icij.datashare.batch.SearchException) BatchSearchLoop(org.icij.datashare.tasks.BatchSearchLoop) Test(org.junit.Test)

Example 4 with SearchException

Use of org.icij.datashare.batch.SearchException in project datashare by ICIJ.

In class BatchSearchRunner, method call.

@Override
public Integer call() throws SearchException {
    int numberOfResults = 0;
    int throttleMs = parseInt(propertiesProvider.get(BATCH_THROTTLE).orElse("0"));
    int maxTimeSeconds = parseInt(propertiesProvider.get(BATCH_SEARCH_MAX_TIME).orElse("100000"));
    int scrollSize = min(parseInt(propertiesProvider.get(SCROLL_SIZE).orElse("1000")), MAX_SCROLL_SIZE);
    callThread = Thread.currentThread();
    // for tests
    callWaiterLatch.countDown();
    logger.info("running {} queries for batch search {} on project {} with throttle {}ms and scroll size of {}", batchSearch.queries.size(), batchSearch.uuid, batchSearch.project, throttleMs, scrollSize);
    String query = null;
    try {
        for (String s : batchSearch.queries.keySet()) {
            query = s;
            Indexer.Searcher searcher = indexer.search(batchSearch.project.getId(), Document.class)
                    .with(query, batchSearch.fuzziness, batchSearch.phraseMatches)
                    .withFieldValues("contentType", batchSearch.fileTypes.toArray(new String[] {}))
                    .withPrefixQuery("dirname", batchSearch.paths.toArray(new String[] {}))
                    .withoutSource("content")
                    .limit(scrollSize);
            List<? extends Entity> docsToProcess = searcher.scroll().collect(toList());
            long beforeScrollLoop = DatashareTime.getInstance().currentTimeMillis();
            while (docsToProcess.size() != 0 && numberOfResults < MAX_BATCH_RESULT_SIZE - MAX_SCROLL_SIZE) {
                if (cancelAsked) {
                    throw new CancelException();
                }
                resultConsumer.apply(batchSearch.uuid, query, (List<Document>) docsToProcess);
                if (DatashareTime.getInstance().currentTimeMillis() - beforeScrollLoop < maxTimeSeconds * 1000) {
                    DatashareTime.getInstance().sleep(throttleMs);
                } else {
                    throw new SearchException(query, new TimeoutException("Batch timed out after " + maxTimeSeconds + "s"));
                }
                numberOfResults += docsToProcess.size();
                docsToProcess = searcher.scroll().collect(toList());
            }
            searcher.clearScroll();
            totalProcessed += 1;
        }
    } catch (ElasticsearchStatusException esEx) {
        throw new SearchException(query, stream(esEx.getSuppressed()).filter(t -> t instanceof ResponseException).findFirst().orElse(esEx));
    } catch (IOException | InterruptedException ex) {
        throw new SearchException(query, ex);
    }
    logger.info("done batch search {} with success", batchSearch.uuid);
    return numberOfResults;
}
Also used : ResponseException(org.elasticsearch.client.ResponseException) SearchException(org.icij.datashare.batch.SearchException) IOException(java.io.IOException) Document(org.icij.datashare.text.Document) ElasticsearchStatusException(org.elasticsearch.ElasticsearchStatusException) Indexer(org.icij.datashare.text.indexing.Indexer) TimeoutException(java.util.concurrent.TimeoutException)
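
The throttle and timeout check inside the scroll loop of Example 4 can be isolated as follows. This is a sketch under stated assumptions rather than the project's code: `FakeClock` is a hypothetical stand-in for DatashareTime so the loop runs without real sleeping, and `process` and its `pages` parameter are invented for illustration. The multiplication uses `1000L` to avoid int overflow for very large limits, which differs slightly from the example.

```java
import java.util.concurrent.TimeoutException;

public class ThrottledLoopSketch {
    /** Hypothetical controllable clock: sleeping only advances the counter. */
    static class FakeClock {
        long nowMs = 0;
        long currentTimeMillis() { return nowMs; }
        void sleep(long ms) { nowMs += ms; }
    }

    /**
     * Handles up to `pages` pages. After each page it sleeps throttleMs if the
     * elapsed time is still below maxTimeSeconds, otherwise it fails with a timeout.
     */
    static int process(int pages, int throttleMs, int maxTimeSeconds, FakeClock clock)
            throws TimeoutException {
        long start = clock.currentTimeMillis();
        int done = 0;
        while (done < pages) {
            done++; // stands in for consuming one page of results
            if (clock.currentTimeMillis() - start < maxTimeSeconds * 1000L) {
                clock.sleep(throttleMs);
            } else {
                throw new TimeoutException("Batch timed out after " + maxTimeSeconds + "s");
            }
        }
        return done;
    }

    public static void main(String[] args) {
        FakeClock clock = new FakeClock();
        try {
            // throttle of 1000 ms with a 1 s limit: the second page hits the limit
            process(5, 1000, 1, clock);
        } catch (TimeoutException e) {
            System.out.println(e.getMessage() + " at " + clock.nowMs + " ms");
        }
    }
}
```

With a 1000 ms throttle and a 1 s limit, the first page sleeps once and the second page trips the limit at exactly 1000 ms, which is consistent with the elapsed time asserted in Example 2.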

Example 5 with SearchException

Use of org.icij.datashare.batch.SearchException in project datashare by ICIJ.

In class BatchSearchRunnerIntTest, method test_search_with_error.

@Test
public void test_search_with_error() throws Exception {
    Document mydoc = createDoc("docId1").with("mydoc").build();
    indexer.add(TEST_INDEX, mydoc);
    BatchSearch search = new BatchSearch(project(TEST_INDEX), "name", "desc", asSet("AND mydoc"), User.local());
    SearchException sex = assertThrows(SearchException.class, () -> new BatchSearchRunner(indexer, new PropertiesProvider(), search, resultConsumer).call());
    assertThat(sex.toString()).contains("Failed to parse query [AND mydoc]");
}
Also used : PropertiesProvider(org.icij.datashare.PropertiesProvider) BatchSearch(org.icij.datashare.batch.BatchSearch) SearchException(org.icij.datashare.batch.SearchException) Document(org.icij.datashare.text.Document)
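
Examples 2 and 5 assert on the text of the wrapped cause through toString(). The source of SearchException is not shown here, so the following is only a sketch of one way an exception can carry the failing query and still expose the cause's message. `QueryFailure` and `parse` are hypothetical names, and whether the real class is checked or unchecked is not known from the examples. The sketch relies on the standard behaviour of the `Throwable(Throwable cause)` constructor, which uses `cause.toString()` as the detail message.

```java
public class WrappedQueryExceptionSketch {
    /** Hypothetical stand-in; not the real org.icij.datashare.batch.SearchException. */
    static class QueryFailure extends Exception {
        final String query;

        QueryFailure(String query, Throwable cause) {
            super(cause); // detail message becomes cause.toString()
            this.query = query;
        }
    }

    /** Toy parser that rejects queries starting with an operator and wraps the failure. */
    static int parse(String query) throws QueryFailure {
        try {
            if (query.startsWith("AND ")) {
                throw new IllegalArgumentException("Failed to parse query [" + query + "]");
            }
            return query.length();
        } catch (IllegalArgumentException e) {
            throw new QueryFailure(query, e);
        }
    }

    public static void main(String[] args) {
        try {
            parse("AND mydoc");
        } catch (QueryFailure e) {
            // the wrapper's toString() includes the cause's class name and message
            System.out.println(e.query + " -> " + e);
        }
    }
}
```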

Aggregations

SearchException (org.icij.datashare.batch.SearchException): 5
BatchSearch (org.icij.datashare.batch.BatchSearch): 3
PropertiesProvider (org.icij.datashare.PropertiesProvider): 2
Document (org.icij.datashare.text.Document): 2
Test (org.junit.Test): 2
IOException (java.io.IOException): 1
Date (java.util.Date): 1
HashMap (java.util.HashMap): 1
TimeoutException (java.util.concurrent.TimeoutException): 1
ElasticsearchStatusException (org.elasticsearch.ElasticsearchStatusException): 1
ResponseException (org.elasticsearch.client.ResponseException): 1
JooqBatchSearchRepository (org.icij.datashare.db.JooqBatchSearchRepository): 1
BatchSearchLoop (org.icij.datashare.tasks.BatchSearchLoop): 1
Indexer (org.icij.datashare.text.indexing.Indexer): 1