Example 41 with GetRequest

use of org.elasticsearch.action.get.GetRequest in project incubator-gobblin by apache.

the class TransportWriterVariant method getTestClient.

@Override
public TestClient getTestClient(Config config) throws IOException {
    final ElasticsearchTransportClientWriter transportClientWriter = new ElasticsearchTransportClientWriter(config);
    final TransportClient transportClient = transportClientWriter.getTransportClient();
    return new TestClient() {

        @Override
        public GetResponse get(GetRequest getRequest) throws IOException {
            try {
                return transportClient.get(getRequest).get();
            } catch (Exception e) {
                throw new IOException(e);
            }
        }

        @Override
        public void recreateIndex(String indexName) throws IOException {
            DeleteIndexRequestBuilder dirBuilder = transportClient.admin().indices().prepareDelete(indexName);
            try {
                DeleteIndexResponse diResponse = dirBuilder.execute().actionGet();
            } catch (IndexNotFoundException ie) {
                System.out.println("Index not found... that's ok");
            }
            CreateIndexRequestBuilder cirBuilder = transportClient.admin().indices().prepareCreate(indexName);
            CreateIndexResponse ciResponse = cirBuilder.execute().actionGet();
            Assert.assertTrue(ciResponse.isAcknowledged(), "Create index succeeded");
        }

        @Override
        public void close() throws IOException {
            transportClientWriter.close();
        }
    };
}
Also used : DeleteIndexResponse(org.elasticsearch.action.admin.indices.delete.DeleteIndexResponse) TransportClient(org.elasticsearch.client.transport.TransportClient) GetRequest(org.elasticsearch.action.get.GetRequest) CreateIndexRequestBuilder(org.elasticsearch.action.admin.indices.create.CreateIndexRequestBuilder) IndexNotFoundException(org.elasticsearch.index.IndexNotFoundException) IOException(java.io.IOException) DeleteIndexRequestBuilder(org.elasticsearch.action.admin.indices.delete.DeleteIndexRequestBuilder) CreateIndexResponse(org.elasticsearch.action.admin.indices.create.CreateIndexResponse)
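
The get override in this example blocks on the future returned by the client and rethrows any failure as an IOException. A minimal, stdlib-only sketch of that pattern follows. It assumes the client's future behaves like a plain java.util.concurrent.Future; the class name BlockingGet and the method await are hypothetical and do not come from gobblin or Elasticsearch. Unlike the catch-all in the example, this variant unwraps the ExecutionException and restores the interrupt flag.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper: Future<T> stands in for the future returned by the client call.
final class BlockingGet {

    private BlockingGet() {
    }

    // Waits for the result and reports any failure as an IOException.
    static <T> T await(Future<T> future) throws IOException {
        try {
            return future.get();
        } catch (ExecutionException e) {
            // Keep the real cause rather than the ExecutionException wrapper.
            throw new IOException(e.getCause());
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A future that is already complete returns its value immediately.
        System.out.println(await(CompletableFuture.completedFuture("doc-1")));

        // A failed future surfaces as an IOException carrying the original cause.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("index missing"));
        try {
            await(failed);
        } catch (IOException e) {
            System.out.println(e.getCause().getMessage());
        }
    }
}
```

Whether unwrapping the cause is preferable depends on how much of the stack trace the caller wants to see; the example's broader catch (Exception e) is simpler but also wraps unchecked exceptions.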

Example 42 with GetRequest

use of org.elasticsearch.action.get.GetRequest in project datashare by ICIJ.

the class ElasticsearchIndexerTest method test_bulk_add_should_add_ner_pipeline_once_and_for_empty_list.

@Test
public void test_bulk_add_should_add_ner_pipeline_once_and_for_empty_list() throws IOException {
    Document doc = new org.icij.datashare.text.Document("id", project("prj"), Paths.get("doc.txt"), "content", Language.FRENCH, Charset.defaultCharset(), "application/pdf", new HashMap<>(), INDEXED, new HashSet<Pipeline.Type>() {

        {
            add(OPENNLP);
        }
    }, 432L);
    indexer.add(TEST_INDEX, doc);
    assertThat(indexer.bulkAdd(TEST_INDEX, OPENNLP, emptyList(), doc)).isTrue();
    GetResponse resp = es.client.get(new GetRequest(TEST_INDEX, doc.getId()), RequestOptions.DEFAULT);
    assertThat(resp.getSourceAsMap().get("status")).isEqualTo("DONE");
    assertThat((ArrayList<String>) resp.getSourceAsMap().get("nerTags")).containsExactly("OPENNLP");
}
Also used : ScriptType(org.elasticsearch.script.ScriptType) Type(org.icij.datashare.text.nlp.Pipeline.Type) GetRequest(org.elasticsearch.action.get.GetRequest) Document(org.icij.datashare.text.Document) GetResponse(org.elasticsearch.action.get.GetResponse) Test(org.junit.Test)

Example 43 with GetRequest

use of org.elasticsearch.action.get.GetRequest in project datashare by ICIJ.

the class ElasticsearchIndexerTest method test_bulk_add_with_root_document.

@Test
public void test_bulk_add_with_root_document() throws IOException {
    Document root = createDoc("root").build();
    assertThat(indexer.bulkAdd(TEST_INDEX, asList(createDoc("doc1").withRootId(root.getId()).build(), createDoc("doc2").withRootId(root.getId()).build()))).isTrue();
    assertThat(((Document) indexer.get(TEST_INDEX, "doc1")).getRootDocument()).isEqualTo(root.getId());
    assertThat(((Document) indexer.get(TEST_INDEX, "doc2")).getRootDocument()).isEqualTo(root.getId());
    assertThat(es.client.get(new GetRequest(TEST_INDEX, "doc1"), RequestOptions.DEFAULT).getFields().get("_routing").getValues()).isEqualTo(asList(root.getId()));
    assertThat(es.client.get(new GetRequest(TEST_INDEX, "doc2"), RequestOptions.DEFAULT).getFields().get("_routing").getValues()).isEqualTo(asList(root.getId()));
}
Also used : GetRequest(org.elasticsearch.action.get.GetRequest) Document(org.icij.datashare.text.Document) Test(org.junit.Test)

Example 44 with GetRequest

use of org.elasticsearch.action.get.GetRequest in project datashare by ICIJ.

the class ElasticsearchSpewerTest method test_simple_write.

@Test
public void test_simple_write() throws Exception {
    final TikaDocument document = new DocumentFactory().withIdentifier(new PathIdentifier()).create(get("test-file.txt"));
    final ParsingReader reader = new ParsingReader(new ByteArrayInputStream("test".getBytes()));
    document.setReader(reader);
    spewer.write(document);
    GetResponse documentFields = es.client.get(new GetRequest(TEST_INDEX, document.getId()), RequestOptions.DEFAULT);
    assertThat(documentFields.isExists()).isTrue();
    assertThat(documentFields.getId()).isEqualTo(document.getId());
    assertEquals(new HashMap<String, String>() {

        {
            put("name", "Document");
        }
    }, documentFields.getSourceAsMap().get("join"));
    ArgumentCaptor<Message> argument = ArgumentCaptor.forClass(Message.class);
    verify(publisher).publish(eq(Channel.NLP), argument.capture());
    assertThat(argument.getValue().content).includes(entry(Field.DOC_ID, document.getId()));
}
Also used : DocumentFactory(org.icij.extract.document.DocumentFactory) Message(org.icij.datashare.com.Message) ParsingReader(org.apache.tika.parser.ParsingReader) ByteArrayInputStream(java.io.ByteArrayInputStream) GetRequest(org.elasticsearch.action.get.GetRequest) PathIdentifier(org.icij.extract.document.PathIdentifier) TikaDocument(org.icij.extract.document.TikaDocument) GetResponse(org.elasticsearch.action.get.GetResponse) Test(org.junit.Test)
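
The assertEquals call in this test builds its expected value with double-brace initialization, which creates an anonymous HashMap subclass. Map equality in java.util depends only on the entries, so a simpler expected value should compare the same way. The sketch below illustrates this with the standard library only; the class name JoinFieldExpectation and its methods are hypothetical and are not part of datashare.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of two ways to build the expected "join" value.
final class JoinFieldExpectation {

    private JoinFieldExpectation() {
    }

    // Same construction style as the test: an anonymous HashMap subclass.
    static Map<String, String> doubleBrace() {
        return new HashMap<String, String>() {
            {
                put("name", "Document");
            }
        };
    }

    // Immutable single-entry map with no extra class.
    static Map<String, String> singleton() {
        return Collections.singletonMap("name", "Document");
    }

    public static void main(String[] args) {
        // Both directions hold because Map.equals compares entry sets.
        System.out.println(doubleBrace().equals(singleton()));
        System.out.println(singleton().equals(doubleBrace()));
    }
}
```

Both lines print true, so either form could serve as the expected argument when the actual value is a Map with the same single entry.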

Example 45 with GetRequest

use of org.elasticsearch.action.get.GetRequest in project datashare by ICIJ.

the class ElasticsearchSpewerTest method test_embedded_document.

@Test
public void test_embedded_document() throws Exception {
    Path path = get(Objects.requireNonNull(getClass().getResource("/docs/embedded_doc.eml")).getPath());
    final TikaDocument document = new Extractor().extract(path);
    spewer.write(document);
    GetResponse documentFields = es.client.get(new GetRequest(TEST_INDEX, document.getId()), RequestOptions.DEFAULT);
    assertTrue(documentFields.isExists());
    SearchRequest searchRequest = new SearchRequest();
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.multiMatchQuery("simple.tiff", "content"));
    searchRequest.source(searchSourceBuilder);
    SearchResponse response = es.client.search(searchRequest, RequestOptions.DEFAULT);
    assertThat(response.getHits().getTotalHits().value).isGreaterThan(0);
    // assertThat(response.getHits().getAt(0).getId()).endsWith("embedded.pdf");
    verify(publisher, times(2)).publish(eq(Channel.NLP), any(Message.class));
}
Also used : Path(java.nio.file.Path) SearchRequest(org.elasticsearch.action.search.SearchRequest) Message(org.icij.datashare.com.Message) GetRequest(org.elasticsearch.action.get.GetRequest) TikaDocument(org.icij.extract.document.TikaDocument) Extractor(org.icij.extract.extractor.Extractor) GetResponse(org.elasticsearch.action.get.GetResponse) SearchSourceBuilder(org.elasticsearch.search.builder.SearchSourceBuilder) SearchResponse(org.elasticsearch.action.search.SearchResponse) Test(org.junit.Test)

Aggregations

GetRequest (org.elasticsearch.action.get.GetRequest) 45
GetResponse (org.elasticsearch.action.get.GetResponse) 29
Test (org.junit.Test) 14
IOException (java.io.IOException) 13
IndexRequest (org.elasticsearch.action.index.IndexRequest) 9
HashMap (java.util.HashMap) 7
TikaDocument (org.icij.extract.document.TikaDocument) 7
FetchSourceContext (org.elasticsearch.search.fetch.subphase.FetchSourceContext) 6
ArrayList (java.util.ArrayList) 5
DocumentFactory (org.icij.extract.document.DocumentFactory) 5
ByteArrayInputStream (java.io.ByteArrayInputStream) 4
ParsingReader (org.apache.tika.parser.ParsingReader) 4
ElasticsearchException (org.elasticsearch.ElasticsearchException) 4
BulkItemResponse (org.elasticsearch.action.bulk.BulkItemResponse) 4
DeleteRequest (org.elasticsearch.action.delete.DeleteRequest) 4
SearchRequest (org.elasticsearch.action.search.SearchRequest) 4
UpdateRequest (org.elasticsearch.action.update.UpdateRequest) 4
PathIdentifier (org.icij.extract.document.PathIdentifier) 4
BulkRequest (org.elasticsearch.action.bulk.BulkRequest) 3
MultiGetRequest (org.elasticsearch.action.get.MultiGetRequest) 3