Examples with ParsedDocument - org.elasticsearch.index.mapper.ParsedDocument

Example 96 with ParsedDocument

Use of org.elasticsearch.index.mapper.ParsedDocument in project crate by crate.

In the class ReadOnlyEngineTests, method testReadOnlyEngine.

@Test
public void testReadOnlyEngine() throws Exception {
    IOUtils.close(engine, store);
    Engine readOnlyEngine = null;
    final AtomicLong globalCheckpoint = new AtomicLong(SequenceNumbers.NO_OPS_PERFORMED);
    try (Store store = createStore()) {
        EngineConfig config = config(defaultSettings, store, createTempDir(), newMergePolicy(), null, null, globalCheckpoint::get);
        int numDocs = scaledRandomIntBetween(10, 1000);
        final SeqNoStats lastSeqNoStats;
        final List<DocIdSeqNoAndSource> lastDocIds;
        try (InternalEngine engine = createEngine(config)) {
            Engine.Get get = null;
            for (int i = 0; i < numDocs; i++) {
                ParsedDocument doc = testParsedDocument(Integer.toString(i), null, testDocument(), new BytesArray("{}"), null);
                engine.index(new Engine.Index(newUid(doc), doc, i, primaryTerm.get(), 1, null, Engine.Operation.Origin.REPLICA, System.nanoTime(), -1, false, SequenceNumbers.UNASSIGNED_SEQ_NO, 0));
                if (get == null || rarely()) {
                    get = newGet(doc);
                }
                if (rarely()) {
                    engine.flush();
                }
                globalCheckpoint.set(randomLongBetween(globalCheckpoint.get(), engine.getPersistedLocalCheckpoint()));
            }
            engine.syncTranslog();
            globalCheckpoint.set(randomLongBetween(globalCheckpoint.get(), engine.getPersistedLocalCheckpoint()));
            engine.flush();
            readOnlyEngine = new ReadOnlyEngine(engine.engineConfig, engine.getSeqNoStats(globalCheckpoint.get()), engine.getTranslogStats(), false, Function.identity());
            lastSeqNoStats = engine.getSeqNoStats(globalCheckpoint.get());
            lastDocIds = getDocIds(engine, true);
            assertThat(readOnlyEngine.getPersistedLocalCheckpoint(), equalTo(lastSeqNoStats.getLocalCheckpoint()));
            assertThat(readOnlyEngine.getSeqNoStats(globalCheckpoint.get()).getMaxSeqNo(), equalTo(lastSeqNoStats.getMaxSeqNo()));
            assertThat(getDocIds(readOnlyEngine, false), equalTo(lastDocIds));
            for (int i = 0; i < numDocs; i++) {
                if (randomBoolean()) {
                    String delId = Integer.toString(i);
                    engine.delete(new Engine.Delete(delId, newUid(delId), primaryTerm.get()));
                }
                if (rarely()) {
                    engine.flush();
                }
            }
            Engine.Searcher external = readOnlyEngine.acquireSearcher("test", Engine.SearcherScope.EXTERNAL);
            Engine.Searcher internal = readOnlyEngine.acquireSearcher("test", Engine.SearcherScope.INTERNAL);
            assertSame(external.getIndexReader(), internal.getIndexReader());
            assertThat(external.getIndexReader(), instanceOf(DirectoryReader.class));
            DirectoryReader dirReader = external.getDirectoryReader();
            ElasticsearchDirectoryReader esReader = getElasticsearchDirectoryReader(dirReader);
            IndexReader.CacheHelper helper = esReader.getReaderCacheHelper();
            assertNotNull(helper);
            assertEquals(helper.getKey(), dirReader.getReaderCacheHelper().getKey());
            IOUtils.close(external, internal);
            // the locked down engine should still point to the previous commit
            assertThat(readOnlyEngine.getPersistedLocalCheckpoint(), equalTo(lastSeqNoStats.getLocalCheckpoint()));
            assertThat(readOnlyEngine.getSeqNoStats(globalCheckpoint.get()).getMaxSeqNo(), equalTo(lastSeqNoStats.getMaxSeqNo()));
            assertThat(getDocIds(readOnlyEngine, false), equalTo(lastDocIds));
        }
        // Close and reopen the main engine
        try (InternalEngine recoveringEngine = new InternalEngine(config)) {
            recoveringEngine.recoverFromTranslog(translogHandler, Long.MAX_VALUE);
            // the locked down engine should still point to the previous commit
            assertThat(readOnlyEngine.getPersistedLocalCheckpoint(), equalTo(lastSeqNoStats.getLocalCheckpoint()));
            assertThat(readOnlyEngine.getSeqNoStats(globalCheckpoint.get()).getMaxSeqNo(), equalTo(lastSeqNoStats.getMaxSeqNo()));
            assertThat(getDocIds(readOnlyEngine, false), equalTo(lastDocIds));
        }
    } finally {
        IOUtils.close(readOnlyEngine);
    }
}

Also used : BytesArray(org.elasticsearch.common.bytes.BytesArray) ElasticsearchDirectoryReader(org.elasticsearch.common.lucene.index.ElasticsearchDirectoryReader) ElasticsearchDirectoryReader.getElasticsearchDirectoryReader(org.elasticsearch.common.lucene.index.ElasticsearchDirectoryReader.getElasticsearchDirectoryReader) DirectoryReader(org.apache.lucene.index.DirectoryReader) Store(org.elasticsearch.index.store.Store) AtomicLong(java.util.concurrent.atomic.AtomicLong) SeqNoStats(org.elasticsearch.index.seqno.SeqNoStats) ParsedDocument(org.elasticsearch.index.mapper.ParsedDocument) IndexReader(org.apache.lucene.index.IndexReader) Test(org.junit.Test)

Example 97 with ParsedDocument

Use of org.elasticsearch.index.mapper.ParsedDocument in project elasticsearch by elastic.

In the class TermVectorsService, method generateTermVectorsFromDoc.

private static Fields generateTermVectorsFromDoc(IndexShard indexShard, TermVectorsRequest request) throws IOException {
    // Parse the document; for now this also updates the mapping, just as percolate does
    ParsedDocument parsedDocument = parseDocument(indexShard, indexShard.shardId().getIndexName(), request.type(), request.doc(), request.xContentType());
    // select the right fields and generate term vectors
    ParseContext.Document doc = parsedDocument.rootDoc();
    Set<String> seenFields = new HashSet<>();
    Collection<GetField> getFields = new HashSet<>();
    for (IndexableField field : doc.getFields()) {
        MappedFieldType fieldType = indexShard.mapperService().fullName(field.name());
        if (!isValidField(fieldType)) {
            continue;
        }
        if (request.selectedFields() != null && !request.selectedFields().contains(field.name())) {
            continue;
        }
        // Set.add returns false when the field name has already been processed
        if (!seenFields.add(field.name())) {
            continue;
        }
        String[] values = doc.getValues(field.name());
        getFields.add(new GetField(field.name(), Arrays.asList((Object[]) values)));
    }
    return generateTermVectors(indexShard, XContentHelper.convertToMap(parsedDocument.source(), true, request.xContentType()).v2(), getFields, request.offsets(), request.perFieldAnalyzer(), seenFields);
}

Also used : IndexableField(org.apache.lucene.index.IndexableField) GetField(org.elasticsearch.index.get.GetField) ParsedDocument(org.elasticsearch.index.mapper.ParsedDocument) ParseContext(org.elasticsearch.index.mapper.ParseContext) MappedFieldType(org.elasticsearch.index.mapper.MappedFieldType) HashSet(java.util.HashSet)
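
The loop in the example above keeps only the first occurrence of each field name and skips names that are not in the request's selected fields. A minimal, self-contained sketch of that filtering pattern follows. The class and method names (FirstOccurrenceFilter, firstOccurrences) are hypothetical and plain strings stand in for Lucene fields; this is not part of the Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FirstOccurrenceFilter {

    // Hypothetical helper: returns each field name once, in first-seen order.
    // A null `selected` set means "accept every field", mirroring the
    // request.selectedFields() != null check in the example above.
    static List<String> firstOccurrences(List<String> fieldNames, Set<String> selected) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String name : fieldNames) {
            if (selected != null && !selected.contains(name)) {
                continue; // not requested
            }
            if (!seen.add(name)) {
                continue; // Set.add returns false for a name seen before
            }
            result.add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("title", "body", "title", "tags", "body");
        System.out.println(firstOccurrences(names, null));                    // [title, body, tags]
        System.out.println(firstOccurrences(names, Set.of("title", "tags"))); // [title, tags]
    }
}
```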

Example 98 with ParsedDocument

Use of org.elasticsearch.index.mapper.ParsedDocument in project elasticsearch by elastic.

In the class IndexShard, method prepareIndex.

static Engine.Index prepareIndex(DocumentMapperForType docMapper, SourceToParse source, long seqNo, long primaryTerm, long version, VersionType versionType, Engine.Operation.Origin origin, long autoGeneratedIdTimestamp, boolean isRetry) {
    long startTime = System.nanoTime();
    ParsedDocument doc = docMapper.getDocumentMapper().parse(source);
    if (docMapper.getMapping() != null) {
        doc.addDynamicMappingsUpdate(docMapper.getMapping());
    }
    MappedFieldType uidFieldType = docMapper.getDocumentMapper().uidMapper().fieldType();
    Query uidQuery = uidFieldType.termQuery(doc.uid(), null);
    Term uid = MappedFieldType.extractTerm(uidQuery);
    return new Engine.Index(uid, doc, seqNo, primaryTerm, version, versionType, origin, startTime, autoGeneratedIdTimestamp, isRetry);
}

Also used : ParsedDocument(org.elasticsearch.index.mapper.ParsedDocument) Query(org.apache.lucene.search.Query) MappedFieldType(org.elasticsearch.index.mapper.MappedFieldType) CheckIndex(org.apache.lucene.index.CheckIndex) Index(org.elasticsearch.index.Index) Term(org.apache.lucene.index.Term)

Example 99 with ParsedDocument

Use of org.elasticsearch.index.mapper.ParsedDocument in project elasticsearch by elastic.

In the class InternalEngineTests, method testEnableGcDeletes.

public void testEnableGcDeletes() throws Exception {
    try (Store store = createStore();
        Engine engine = new InternalEngine(config(defaultSettings, store, createTempDir(), newMergePolicy(), IndexRequest.UNSET_AUTO_GENERATED_TIMESTAMP, null))) {
        engine.config().setEnableGcDeletes(false);
        // Add document
        Document document = testDocument();
        document.add(new TextField("value", "test1", Field.Store.YES));
        ParsedDocument doc = testParsedDocument("1", "test", null, document, B_2, null);
        engine.index(new Engine.Index(newUid(doc), doc, SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, 1, VersionType.EXTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime(), -1, false));
        // Delete document we just added:
        engine.delete(new Engine.Delete("test", "1", newUid(doc), SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, 10, VersionType.EXTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime()));
        // Get should not find the document
        Engine.GetResult getResult = engine.get(new Engine.Get(true, newUid(doc)));
        assertThat(getResult.exists(), equalTo(false));
        // Give the gc pruning logic a chance to kick in
        Thread.sleep(1000);
        if (randomBoolean()) {
            engine.refresh("test");
        }
        // Delete non-existent document
        engine.delete(new Engine.Delete("test", "2", newUid("2"), SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, 10, VersionType.EXTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime()));
        // Get should not find the document (we never indexed uid=2):
        getResult = engine.get(new Engine.Get(true, newUid("2")));
        assertThat(getResult.exists(), equalTo(false));
        // Try to index uid=1 with a too-old version, should fail:
        Engine.Index index = new Engine.Index(newUid(doc), doc, SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, 2, VersionType.EXTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime(), -1, false);
        Engine.IndexResult indexResult = engine.index(index);
        assertTrue(indexResult.hasFailure());
        assertThat(indexResult.getFailure(), instanceOf(VersionConflictEngineException.class));
        // Get should still not find the document
        getResult = engine.get(new Engine.Get(true, newUid(doc)));
        assertThat(getResult.exists(), equalTo(false));
        // Try again to index uid=1 (newUid(doc)) with a too-old version, should still fail:
        Engine.Index index1 = new Engine.Index(newUid(doc), doc, SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, 2, VersionType.EXTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime(), -1, false);
        indexResult = engine.index(index1);
        assertTrue(indexResult.hasFailure());
        assertThat(indexResult.getFailure(), instanceOf(VersionConflictEngineException.class));
        // Get should not find the document
        getResult = engine.get(new Engine.Get(true, newUid(doc)));
        assertThat(getResult.exists(), equalTo(false));
    }
}

Also used : Store(org.elasticsearch.index.store.Store) Index(org.elasticsearch.index.Index) ParsedDocument(org.elasticsearch.index.mapper.ParsedDocument) Document(org.elasticsearch.index.mapper.ParseContext.Document) TextField(org.apache.lucene.document.TextField)
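
The test above indexes id 1 at external version 1, deletes it at version 10, and then expects an index at version 2 to fail with a version conflict. The toy model below sketches why, under two assumptions: an externally versioned write is accepted only when its version is strictly greater than the last recorded one, and the version recorded by a delete is never pruned (as when GC of deletes is disabled). ExternalVersionSketch and its methods are hypothetical names, not the engine's real implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class ExternalVersionSketch {
    // Last version recorded per id, including versions left behind by deletes (tombstones).
    private final Map<String, Long> versions = new HashMap<>();
    // Documents that are currently live (not deleted).
    private final Map<String, String> live = new HashMap<>();

    // Returns true if the write was applied, false on a version conflict.
    boolean index(String id, String source, long version) {
        if (isConflict(id, version)) {
            return false;
        }
        versions.put(id, version);
        live.put(id, source);
        return true;
    }

    // Deleting an unknown id succeeds and still records a tombstone version.
    boolean delete(String id, long version) {
        if (isConflict(id, version)) {
            return false;
        }
        versions.put(id, version);
        live.remove(id);
        return true;
    }

    boolean exists(String id) {
        return live.containsKey(id);
    }

    // Assumed rule: the supplied version must be strictly greater than the recorded one.
    private boolean isConflict(String id, long version) {
        Long current = versions.get(id);
        return current != null && version <= current;
    }

    public static void main(String[] args) {
        ExternalVersionSketch engine = new ExternalVersionSketch();
        System.out.println(engine.index("1", "{}", 1)); // true
        System.out.println(engine.delete("1", 10));     // true
        System.out.println(engine.index("1", "{}", 2)); // false: tombstone holds version 10
        System.out.println(engine.exists("1"));         // false
    }
}
```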

Example 100 with ParsedDocument

Use of org.elasticsearch.index.mapper.ParsedDocument in project elasticsearch by elastic.

In the class InternalEngineTests, method testSkipTranslogReplay.

public void testSkipTranslogReplay() throws IOException {
    final int numDocs = randomIntBetween(1, 10);
    for (int i = 0; i < numDocs; i++) {
        ParsedDocument doc = testParsedDocument(Integer.toString(i), "test", null, testDocument(), new BytesArray("{}"), null);
        Engine.Index firstIndexRequest = new Engine.Index(newUid(doc), doc, SequenceNumbersService.UNASSIGNED_SEQ_NO, 0, Versions.MATCH_DELETED, VersionType.INTERNAL, PRIMARY, System.nanoTime(), -1, false);
        Engine.IndexResult indexResult = engine.index(firstIndexRequest);
        assertThat(indexResult.getVersion(), equalTo(1L));
    }
    engine.refresh("test");
    try (Engine.Searcher searcher = engine.acquireSearcher("test")) {
        TopDocs topDocs = searcher.searcher().search(new MatchAllDocsQuery(), randomIntBetween(numDocs, numDocs + 10));
        assertThat(topDocs.totalHits, equalTo(numDocs));
    }
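    // Reopen the engine without replaying the translog: the documents were refreshed
    // but never flushed, so none of them should be visible afterwards
    engine.close();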
    engine = new InternalEngine(engine.config());
    try (Engine.Searcher searcher = engine.acquireSearcher("test")) {
        TopDocs topDocs = searcher.searcher().search(new MatchAllDocsQuery(), randomIntBetween(numDocs, numDocs + 10));
        assertThat(topDocs.totalHits, equalTo(0));
    }
}

Also used : Searcher(org.elasticsearch.index.engine.Engine.Searcher) TopDocs(org.apache.lucene.search.TopDocs) BytesArray(org.elasticsearch.common.bytes.BytesArray) ParsedDocument(org.elasticsearch.index.mapper.ParsedDocument) Index(org.elasticsearch.index.Index) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) LongPoint(org.apache.lucene.document.LongPoint)
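
The test above indexes documents without flushing them, reopens the engine, and expects zero hits. A toy model of that behaviour is sketched below, assuming that unflushed operations live only in a translog and become visible to a reopened engine only if the translog is replayed. All names here (TranslogReplaySketch, DurableState, ToyEngine) are hypothetical and greatly simplified compared with InternalEngine.

```java
import java.util.ArrayList;
import java.util.List;

public class TranslogReplaySketch {

    // State that survives an engine restart: committed docs plus the pending translog.
    static final class DurableState {
        final List<String> committed = new ArrayList<>();
        final List<String> translog = new ArrayList<>();
    }

    static final class ToyEngine {
        private final DurableState state;
        private final List<String> visible;

        // Opening an engine exposes only what was committed.
        ToyEngine(DurableState state) {
            this.state = state;
            this.visible = new ArrayList<>(state.committed);
        }

        // An indexed doc is logged to the translog and visible to this engine instance.
        void index(String doc) {
            state.translog.add(doc);
            visible.add(doc);
        }

        // Flushing moves logged operations into the committed set.
        void flush() {
            state.committed.addAll(state.translog);
            state.translog.clear();
        }

        // Meant to be called at most once, on a freshly opened engine.
        void recoverFromTranslog() {
            visible.addAll(state.translog);
        }

        int visibleCount() {
            return visible.size();
        }
    }

    public static void main(String[] args) {
        DurableState state = new DurableState();
        ToyEngine engine = new ToyEngine(state);
        engine.index("doc-0");
        engine.index("doc-1");
        System.out.println(engine.visibleCount());   // 2

        ToyEngine reopened = new ToyEngine(state);   // no replay requested
        System.out.println(reopened.visibleCount()); // 0

        reopened.recoverFromTranslog();
        System.out.println(reopened.visibleCount()); // 2
    }
}
```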

Aggregations

ParsedDocument (org.elasticsearch.index.mapper.ParsedDocument): 211
Test (org.junit.Test): 85
LongPoint (org.apache.lucene.document.LongPoint): 59
BytesArray (org.elasticsearch.common.bytes.BytesArray): 58
Matchers.containsString (org.hamcrest.Matchers.containsString): 57
Store (org.elasticsearch.index.store.Store): 52
Searcher (org.elasticsearch.index.engine.Engine.Searcher): 46
DocumentMapper (org.elasticsearch.index.mapper.DocumentMapper): 35
IOException (java.io.IOException): 32
AtomicLong (java.util.concurrent.atomic.AtomicLong): 31
MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery): 31
IndexableField (org.apache.lucene.index.IndexableField): 30
Term (org.apache.lucene.index.Term): 28
TopDocs (org.apache.lucene.search.TopDocs): 28
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField): 27
Index (org.elasticsearch.index.Index): 27
UncheckedIOException (java.io.UncheckedIOException): 26
Field (org.apache.lucene.document.Field): 26
TextField (org.apache.lucene.document.TextField): 26
ArrayList (java.util.ArrayList): 25