Example 61 with StringField

Use of org.apache.lucene.document.StringField in project lucene-solr by apache.

From the class StrategyTestCase, method getDocuments.

protected List<Document> getDocuments(Iterator<SpatialTestData> sampleData) {
    List<Document> documents = new ArrayList<>();
    while (sampleData.hasNext()) {
        SpatialTestData data = sampleData.next();
        Document document = new Document();
        document.add(new StringField("id", data.id, Field.Store.YES));
        document.add(new StringField("name", data.name, Field.Store.YES));
        Shape shape = data.shape;
        shape = convertShapeFromGetDocuments(shape);
        if (shape != null) {
            for (Field f : strategy.createIndexableFields(shape)) {
                document.add(f);
            }
            if (storeShape) {
                // just for diagnostics
                document.add(new StoredField(strategy.getFieldName(), shape.toString()));
            }
        }
        documents.add(document);
    }
    return documents;
}
Also used : StringField(org.apache.lucene.document.StringField) StoredField(org.apache.lucene.document.StoredField) Field(org.apache.lucene.document.Field) Shape(org.locationtech.spatial4j.shape.Shape) ArrayList(java.util.ArrayList) Document(org.apache.lucene.document.Document)

Example 62 with StringField

Use of org.apache.lucene.document.StringField in project lucene-solr by apache.

From the class AnalyzingInfixSuggester, method buildDocument.

private Document buildDocument(BytesRef text, Set<BytesRef> contexts, long weight, BytesRef payload) throws IOException {
    String textString = text.utf8ToString();
    Document doc = new Document();
    FieldType ft = getTextFieldType();
    doc.add(new Field(TEXT_FIELD_NAME, textString, ft));
    doc.add(new Field("textgrams", textString, ft));
    doc.add(new StringField(EXACT_TEXT_FIELD_NAME, textString, Field.Store.NO));
    doc.add(new BinaryDocValuesField(TEXT_FIELD_NAME, text));
    doc.add(new NumericDocValuesField("weight", weight));
    if (payload != null) {
        doc.add(new BinaryDocValuesField("payloads", payload));
    }
    if (contexts != null) {
        for (BytesRef context : contexts) {
            doc.add(new StringField(CONTEXTS_FIELD_NAME, context, Field.Store.NO));
            doc.add(new SortedSetDocValuesField(CONTEXTS_FIELD_NAME, context));
        }
    }
    return doc;
}
Also used : SortField(org.apache.lucene.search.SortField) NumericDocValuesField(org.apache.lucene.document.NumericDocValuesField) SortedSetDocValuesField(org.apache.lucene.document.SortedSetDocValuesField) BinaryDocValuesField(org.apache.lucene.document.BinaryDocValuesField) StringField(org.apache.lucene.document.StringField) Field(org.apache.lucene.document.Field) TextField(org.apache.lucene.document.TextField) Document(org.apache.lucene.document.Document) BytesRef(org.apache.lucene.util.BytesRef) FieldType(org.apache.lucene.document.FieldType)

Example 63 with StringField

Use of org.apache.lucene.document.StringField in project lucene-solr by apache.

From the class SpellChecker, method addGram.

private static void addGram(String text, Document doc, int ng1, int ng2) {
    int len = text.length();
    for (int ng = ng1; ng <= ng2; ng++) {
        String key = "gram" + ng;
        String end = null;
        for (int i = 0; i < len - ng + 1; i++) {
            String gram = text.substring(i, i + ng);
            FieldType ft = new FieldType(StringField.TYPE_NOT_STORED);
            ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS);
            Field ngramField = new Field(key, gram, ft);
            // spellchecker does not use positional queries, but we want freqs
            // for scoring these multivalued n-gram fields.
            doc.add(ngramField);
            if (i == 0) {
                // only one term possible in the startXXField, TF/pos and norms aren't needed.
                Field startField = new StringField("start" + ng, gram, Field.Store.NO);
                doc.add(startField);
            }
            end = gram;
        }
        if (end != null) {
            // may not be present if len==ng1
            // only one term possible in the endXXField, TF/pos and norms aren't needed.
            Field endField = new StringField("end" + ng, end, Field.Store.NO);
            doc.add(endField);
        }
    }
}
Also used : StringField(org.apache.lucene.document.StringField) Field(org.apache.lucene.document.Field) FieldType(org.apache.lucene.document.FieldType)
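
The n-gram bookkeeping in addGram can be followed without Lucene on the classpath. The block below is a minimal sketch, not the SpellChecker implementation: the class GramSketch and the method grams are hypothetical names, and a map from field name to values stands in for the Document. It assumes the same loop bounds as the method above, so each gram size ng yields every substring of length ng under "gram" + ng, the first one under "start" + ng, and the last one under "end" + ng.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GramSketch {

    /**
     * Returns field name -> values, mirroring the fields addGram adds to a Document.
     * Hypothetical helper for illustration only; no Lucene types are involved.
     */
    static Map<String, List<String>> grams(String text, int ng1, int ng2) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        int len = text.length();
        for (int ng = ng1; ng <= ng2; ng++) {
            String end = null;
            // Slide a window of width ng over the text.
            for (int i = 0; i < len - ng + 1; i++) {
                String gram = text.substring(i, i + ng);
                fields.computeIfAbsent("gram" + ng, k -> new ArrayList<>()).add(gram);
                if (i == 0) {
                    // The first window is also recorded as the single "start" value.
                    fields.computeIfAbsent("start" + ng, k -> new ArrayList<>()).add(gram);
                }
                end = gram;
            }
            if (end != null) {
                // The last window is recorded as the single "end" value;
                // it is absent when the text is shorter than ng.
                fields.computeIfAbsent("end" + ng, k -> new ArrayList<>()).add(end);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Print the field layout for a sample word with gram sizes 2 and 3.
        System.out.println(grams("lucene", 2, 3));
    }
}
```

When the text has exactly ng characters, the single gram is recorded under all three field names; when it is shorter, no fields are produced for that gram size.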

Example 64 with StringField

Use of org.apache.lucene.document.StringField in project lucene-solr by apache.

From the class SearchEquivalenceTestBase, method beforeClass.

@BeforeClass
public static void beforeClass() throws Exception {
    Random random = random();
    directory = newDirectory();
    stopword = "" + randomChar();
    CharacterRunAutomaton stopset = new CharacterRunAutomaton(Automata.makeString(stopword));
    analyzer = new MockAnalyzer(random, MockTokenizer.WHITESPACE, false, stopset);
    RandomIndexWriter iw = new RandomIndexWriter(random, directory, analyzer);
    Document doc = new Document();
    Field id = new StringField("id", "", Field.Store.NO);
    Field field = new TextField("field", "", Field.Store.NO);
    doc.add(id);
    doc.add(field);
    // index some docs
    int numDocs = TEST_NIGHTLY ? atLeast(1000) : atLeast(100);
    for (int i = 0; i < numDocs; i++) {
        id.setStringValue(Integer.toString(i));
        field.setStringValue(randomFieldContents());
        iw.addDocument(doc);
    }
    // delete some docs
    int numDeletes = numDocs / 20;
    for (int i = 0; i < numDeletes; i++) {
        Term toDelete = new Term("id", Integer.toString(random.nextInt(numDocs)));
        if (random.nextBoolean()) {
            iw.deleteDocuments(toDelete);
        } else {
            iw.deleteDocuments(new TermQuery(toDelete));
        }
    }
    reader = iw.getReader();
    s1 = newSearcher(reader);
    s2 = newSearcher(reader);
    iw.close();
}
Also used : StringField(org.apache.lucene.document.StringField) Field(org.apache.lucene.document.Field) TextField(org.apache.lucene.document.TextField) Random(java.util.Random) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) CharacterRunAutomaton(org.apache.lucene.util.automaton.CharacterRunAutomaton) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) BeforeClass(org.junit.BeforeClass)

Example 65 with StringField

Use of org.apache.lucene.document.StringField in project lucene-solr by apache.

From the class TestMemoryIndexAgainstRAMDir, method testDocValuesMemoryIndexVsNormalIndex.

public void testDocValuesMemoryIndexVsNormalIndex() throws Exception {
    Document doc = new Document();
    long randomLong = random().nextLong();
    doc.add(new NumericDocValuesField("numeric", randomLong));
    int numValues = atLeast(5);
    for (int i = 0; i < numValues; i++) {
        randomLong = random().nextLong();
        doc.add(new SortedNumericDocValuesField("sorted_numeric", randomLong));
        if (random().nextBoolean()) {
            // randomly duplicate field/value
            doc.add(new SortedNumericDocValuesField("sorted_numeric", randomLong));
        }
    }
    BytesRef randomTerm = new BytesRef(randomTerm());
    doc.add(new BinaryDocValuesField("binary", randomTerm));
    if (random().nextBoolean()) {
        doc.add(new StringField("binary", randomTerm, Field.Store.NO));
    }
    randomTerm = new BytesRef(randomTerm());
    doc.add(new SortedDocValuesField("sorted", randomTerm));
    if (random().nextBoolean()) {
        doc.add(new StringField("sorted", randomTerm, Field.Store.NO));
    }
    numValues = atLeast(5);
    for (int i = 0; i < numValues; i++) {
        randomTerm = new BytesRef(randomTerm());
        doc.add(new SortedSetDocValuesField("sorted_set", randomTerm));
        if (random().nextBoolean()) {
            // randomly duplicate field/value
            doc.add(new SortedSetDocValuesField("sorted_set", randomTerm));
        }
        if (random().nextBoolean()) {
            // randomly just add a normal string field
            doc.add(new StringField("sorted_set", randomTerm, Field.Store.NO));
        }
    }
    MockAnalyzer mockAnalyzer = new MockAnalyzer(random());
    MemoryIndex memoryIndex = MemoryIndex.fromDocument(doc, mockAnalyzer);
    IndexReader indexReader = memoryIndex.createSearcher().getIndexReader();
    LeafReader leafReader = indexReader.leaves().get(0).reader();
    Directory dir = newDirectory();
    IndexWriter writer = new IndexWriter(dir, newIndexWriterConfig(random(), mockAnalyzer));
    writer.addDocument(doc);
    writer.close();
    IndexReader controlIndexReader = DirectoryReader.open(dir);
    LeafReader controlLeafReader = controlIndexReader.leaves().get(0).reader();
    NumericDocValues numericDocValues = leafReader.getNumericDocValues("numeric");
    NumericDocValues controlNumericDocValues = controlLeafReader.getNumericDocValues("numeric");
    assertEquals(0, numericDocValues.nextDoc());
    assertEquals(0, controlNumericDocValues.nextDoc());
    assertEquals(controlNumericDocValues.longValue(), numericDocValues.longValue());
    SortedNumericDocValues sortedNumericDocValues = leafReader.getSortedNumericDocValues("sorted_numeric");
    assertEquals(0, sortedNumericDocValues.nextDoc());
    SortedNumericDocValues controlSortedNumericDocValues = controlLeafReader.getSortedNumericDocValues("sorted_numeric");
    assertEquals(0, controlSortedNumericDocValues.nextDoc());
    assertEquals(controlSortedNumericDocValues.docValueCount(), sortedNumericDocValues.docValueCount());
    for (int i = 0; i < controlSortedNumericDocValues.docValueCount(); i++) {
        assertEquals(controlSortedNumericDocValues.nextValue(), sortedNumericDocValues.nextValue());
    }
    BinaryDocValues binaryDocValues = leafReader.getBinaryDocValues("binary");
    BinaryDocValues controlBinaryDocValues = controlLeafReader.getBinaryDocValues("binary");
    assertEquals(0, binaryDocValues.nextDoc());
    assertEquals(0, controlBinaryDocValues.nextDoc());
    assertEquals(controlBinaryDocValues.binaryValue(), binaryDocValues.binaryValue());
    SortedDocValues sortedDocValues = leafReader.getSortedDocValues("sorted");
    SortedDocValues controlSortedDocValues = controlLeafReader.getSortedDocValues("sorted");
    assertEquals(controlSortedDocValues.getValueCount(), sortedDocValues.getValueCount());
    assertEquals(0, sortedDocValues.nextDoc());
    assertEquals(0, controlSortedDocValues.nextDoc());
    assertEquals(controlSortedDocValues.binaryValue(), sortedDocValues.binaryValue());
    assertEquals(controlSortedDocValues.ordValue(), sortedDocValues.ordValue());
    assertEquals(controlSortedDocValues.lookupOrd(0), sortedDocValues.lookupOrd(0));
    SortedSetDocValues sortedSetDocValues = leafReader.getSortedSetDocValues("sorted_set");
    assertEquals(0, sortedSetDocValues.nextDoc());
    SortedSetDocValues controlSortedSetDocValues = controlLeafReader.getSortedSetDocValues("sorted_set");
    assertEquals(0, controlSortedSetDocValues.nextDoc());
    assertEquals(controlSortedSetDocValues.getValueCount(), sortedSetDocValues.getValueCount());
    for (long controlOrd = controlSortedSetDocValues.nextOrd(); controlOrd != SortedSetDocValues.NO_MORE_ORDS; controlOrd = controlSortedSetDocValues.nextOrd()) {
        assertEquals(controlOrd, sortedSetDocValues.nextOrd());
        assertEquals(controlSortedSetDocValues.lookupOrd(controlOrd), sortedSetDocValues.lookupOrd(controlOrd));
    }
    assertEquals(SortedSetDocValues.NO_MORE_ORDS, sortedSetDocValues.nextOrd());
    indexReader.close();
    controlIndexReader.close();
    dir.close();
}
Also used : NumericDocValues(org.apache.lucene.index.NumericDocValues) Document(org.apache.lucene.document.Document) BinaryDocValuesField(org.apache.lucene.document.BinaryDocValuesField) DoublePoint(org.apache.lucene.document.DoublePoint) LongPoint(org.apache.lucene.document.LongPoint) IntPoint(org.apache.lucene.document.IntPoint) FloatPoint(org.apache.lucene.document.FloatPoint) SortedNumericDocValuesField(org.apache.lucene.document.SortedNumericDocValuesField) NumericDocValuesField(org.apache.lucene.document.NumericDocValuesField) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) StringField(org.apache.lucene.document.StringField) SortedDocValuesField(org.apache.lucene.document.SortedDocValuesField) SortedSetDocValuesField(org.apache.lucene.document.SortedSetDocValuesField) BytesRef(org.apache.lucene.util.BytesRef) Directory(org.apache.lucene.store.Directory) RAMDirectory(org.apache.lucene.store.RAMDirectory)

Aggregations

StringField (org.apache.lucene.document.StringField): 328
Document (org.apache.lucene.document.Document): 306
Directory (org.apache.lucene.store.Directory): 227
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 129
NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField): 94
Term (org.apache.lucene.index.Term): 93
RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter): 82
BytesRef (org.apache.lucene.util.BytesRef): 74
IndexSearcher (org.apache.lucene.search.IndexSearcher): 58
TextField (org.apache.lucene.document.TextField): 57
DirectoryReader (org.apache.lucene.index.DirectoryReader): 56
BinaryDocValuesField (org.apache.lucene.document.BinaryDocValuesField): 55
ArrayList (java.util.ArrayList): 54
Field (org.apache.lucene.document.Field): 51
IndexReader (org.apache.lucene.index.IndexReader): 51
TermQuery (org.apache.lucene.search.TermQuery): 50
IndexWriter (org.apache.lucene.index.IndexWriter): 46
SortedNumericDocValuesField (org.apache.lucene.document.SortedNumericDocValuesField): 43
NRTCachingDirectory (org.apache.lucene.store.NRTCachingDirectory): 43
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 41