Search in sources :

Example 76 with TextField

use of org.apache.lucene.document.TextField in project lucene-solr by apache.

the class TestBM25Similarity method testLengthEncodingBackwardCompatibility.

public void testLengthEncodingBackwardCompatibility() throws IOException {
    Similarity similarity = new BM25Similarity();
    for (int indexCreatedVersionMajor : new int[] { Version.LUCENE_6_0_0.major, Version.LATEST.major }) {
        for (int length : new int[] { 1, 2, 4 }) {
            // these length values are encoded accurately on both cases
            Directory dir = newDirectory();
            // set the version on the directory
            new SegmentInfos(indexCreatedVersionMajor).commit(dir);
            IndexWriter w = new IndexWriter(dir, newIndexWriterConfig().setSimilarity(similarity));
            Document doc = new Document();
            String value = IntStream.range(0, length).mapToObj(i -> "b").collect(Collectors.joining(" "));
            doc.add(new TextField("foo", value, Store.NO));
            w.addDocument(doc);
            IndexReader reader = DirectoryReader.open(w);
            IndexSearcher searcher = newSearcher(reader);
            searcher.setSimilarity(similarity);
            Explanation expl = searcher.explain(new TermQuery(new Term("foo", "b")), 0);
            Explanation docLen = findExplanation(expl, "fieldLength");
            assertNotNull(docLen);
            assertEquals(docLen.toString(), length, (int) docLen.getValue());
            w.close();
            reader.close();
            dir.close();
        }
    }
}
Also used : IntStream(java.util.stream.IntStream) Explanation(org.apache.lucene.search.Explanation) DirectoryReader(org.apache.lucene.index.DirectoryReader) Term(org.apache.lucene.index.Term) IOException(java.io.IOException) Collectors(java.util.stream.Collectors) Version(org.apache.lucene.util.Version) SegmentInfos(org.apache.lucene.index.SegmentInfos) Document(org.apache.lucene.document.Document) IndexWriter(org.apache.lucene.index.IndexWriter) TermQuery(org.apache.lucene.search.TermQuery) Directory(org.apache.lucene.store.Directory) Store(org.apache.lucene.document.Field.Store) LuceneTestCase(org.apache.lucene.util.LuceneTestCase) TextField(org.apache.lucene.document.TextField) IndexReader(org.apache.lucene.index.IndexReader) IndexSearcher(org.apache.lucene.search.IndexSearcher) IndexSearcher(org.apache.lucene.search.IndexSearcher) TermQuery(org.apache.lucene.search.TermQuery) SegmentInfos(org.apache.lucene.index.SegmentInfos) Explanation(org.apache.lucene.search.Explanation) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) IndexWriter(org.apache.lucene.index.IndexWriter) IndexReader(org.apache.lucene.index.IndexReader) TextField(org.apache.lucene.document.TextField) Directory(org.apache.lucene.store.Directory)

Example 77 with TextField

use of org.apache.lucene.document.TextField in project lucene-solr by apache.

the class TestBooleanSimilarity method testPhraseScoreIsEqualToBoost.

public void testPhraseScoreIsEqualToBoost() throws IOException {
    Directory dir = newDirectory();
    RandomIndexWriter w = new RandomIndexWriter(random(), dir, newIndexWriterConfig().setSimilarity(new BooleanSimilarity()));
    Document doc = new Document();
    doc.add(new TextField("foo", "bar baz quux", Store.NO));
    w.addDocument(doc);
    DirectoryReader reader = w.getReader();
    w.close();
    IndexSearcher searcher = newSearcher(reader);
    searcher.setSimilarity(new BooleanSimilarity());
    PhraseQuery query = new PhraseQuery(2, "foo", "bar", "quux");
    TopDocs topDocs = searcher.search(query, 2);
    assertEquals(1, topDocs.totalHits);
    assertEquals(1f, topDocs.scoreDocs[0].score, 0f);
    topDocs = searcher.search(new BoostQuery(query, 7), 2);
    assertEquals(1, topDocs.totalHits);
    assertEquals(7f, topDocs.scoreDocs[0].score, 0f);
    reader.close();
    dir.close();
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TopDocs(org.apache.lucene.search.TopDocs) PhraseQuery(org.apache.lucene.search.PhraseQuery) DirectoryReader(org.apache.lucene.index.DirectoryReader) TextField(org.apache.lucene.document.TextField) Document(org.apache.lucene.document.Document) BoostQuery(org.apache.lucene.search.BoostQuery) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory)

Example 78 with TextField

use of org.apache.lucene.document.TextField in project lucene-solr by apache.

the class TestValueSources method beforeClass.

@BeforeClass
public static void beforeClass() throws Exception {
    dir = newDirectory();
    analyzer = new MockAnalyzer(random());
    IndexWriterConfig iwConfig = newIndexWriterConfig(analyzer);
    iwConfig.setMergePolicy(newLogMergePolicy());
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwConfig);
    for (String[] doc : documents) {
        Document document = new Document();
        document.add(new StringField("id", doc[0], Field.Store.NO));
        document.add(new SortedDocValuesField("id", new BytesRef(doc[0])));
        document.add(new NumericDocValuesField("double", Double.doubleToRawLongBits(Double.parseDouble(doc[1]))));
        document.add(new NumericDocValuesField("float", Float.floatToRawIntBits(Float.parseFloat(doc[2]))));
        document.add(new NumericDocValuesField("int", Integer.parseInt(doc[3])));
        document.add(new NumericDocValuesField("long", Long.parseLong(doc[4])));
        document.add(new StringField("string", doc[5], Field.Store.NO));
        document.add(new SortedDocValuesField("string", new BytesRef(doc[5])));
        document.add(new TextField("text", doc[6], Field.Store.NO));
        document.add(new SortedNumericDocValuesField("floatMv", NumericUtils.floatToSortableInt(Float.parseFloat(doc[7]))));
        document.add(new SortedNumericDocValuesField("floatMv", NumericUtils.floatToSortableInt(Float.parseFloat(doc[8]))));
        document.add(new SortedNumericDocValuesField("floatMv", NumericUtils.floatToSortableInt(Float.parseFloat(doc[9]))));
        document.add(new SortedNumericDocValuesField("doubleMv", NumericUtils.doubleToSortableLong(Double.parseDouble(doc[7]))));
        document.add(new SortedNumericDocValuesField("doubleMv", NumericUtils.doubleToSortableLong(Double.parseDouble(doc[8]))));
        document.add(new SortedNumericDocValuesField("doubleMv", NumericUtils.doubleToSortableLong(Double.parseDouble(doc[9]))));
        document.add(new SortedNumericDocValuesField("intMv", Long.parseLong(doc[10])));
        document.add(new SortedNumericDocValuesField("intMv", Long.parseLong(doc[11])));
        document.add(new SortedNumericDocValuesField("intMv", Long.parseLong(doc[12])));
        document.add(new SortedNumericDocValuesField("longMv", Long.parseLong(doc[10])));
        document.add(new SortedNumericDocValuesField("longMv", Long.parseLong(doc[11])));
        document.add(new SortedNumericDocValuesField("longMv", Long.parseLong(doc[12])));
        iw.addDocument(document);
    }
    reader = iw.getReader();
    searcher = newSearcher(reader);
    iw.close();
}
Also used : SortedNumericDocValuesField(org.apache.lucene.document.SortedNumericDocValuesField) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) SortedNumericDocValuesField(org.apache.lucene.document.SortedNumericDocValuesField) NumericDocValuesField(org.apache.lucene.document.NumericDocValuesField) StringField(org.apache.lucene.document.StringField) SortedDocValuesField(org.apache.lucene.document.SortedDocValuesField) TextField(org.apache.lucene.document.TextField) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) BytesRef(org.apache.lucene.util.BytesRef) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig) BeforeClass(org.junit.BeforeClass)

Example 79 with TextField

use of org.apache.lucene.document.TextField in project lucene-solr by apache.

the class TestLongNormValueSource method beforeClass.

@BeforeClass
public static void beforeClass() throws Exception {
    dir = newDirectory();
    analyzer = new MockAnalyzer(random());
    IndexWriterConfig iwConfig = newIndexWriterConfig(analyzer);
    iwConfig.setMergePolicy(newLogMergePolicy());
    iwConfig.setSimilarity(sim);
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwConfig);
    Document doc = new Document();
    doc.add(new TextField("text", "this is a test test test", Field.Store.NO));
    iw.addDocument(doc);
    doc = new Document();
    doc.add(new TextField("text", "second test", Field.Store.NO));
    iw.addDocument(doc);
    reader = iw.getReader();
    searcher = newSearcher(reader);
    iw.close();
}
Also used : MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) TextField(org.apache.lucene.document.TextField) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig) BeforeClass(org.junit.BeforeClass)

Example 80 with TextField

use of org.apache.lucene.document.TextField in project lucene-solr by apache.

the class SearchEquivalenceTestBase method beforeClass.

@BeforeClass
public static void beforeClass() throws Exception {
    Random random = random();
    directory = newDirectory();
    stopword = "" + randomChar();
    CharacterRunAutomaton stopset = new CharacterRunAutomaton(Automata.makeString(stopword));
    analyzer = new MockAnalyzer(random, MockTokenizer.WHITESPACE, false, stopset);
    RandomIndexWriter iw = new RandomIndexWriter(random, directory, analyzer);
    Document doc = new Document();
    Field id = new StringField("id", "", Field.Store.NO);
    Field field = new TextField("field", "", Field.Store.NO);
    doc.add(id);
    doc.add(field);
    // index some docs
    int numDocs = TEST_NIGHTLY ? atLeast(1000) : atLeast(100);
    for (int i = 0; i < numDocs; i++) {
        id.setStringValue(Integer.toString(i));
        field.setStringValue(randomFieldContents());
        iw.addDocument(doc);
    }
    // delete some docs
    int numDeletes = numDocs / 20;
    for (int i = 0; i < numDeletes; i++) {
        Term toDelete = new Term("id", Integer.toString(random.nextInt(numDocs)));
        if (random.nextBoolean()) {
            iw.deleteDocuments(toDelete);
        } else {
            iw.deleteDocuments(new TermQuery(toDelete));
        }
    }
    reader = iw.getReader();
    s1 = newSearcher(reader);
    s2 = newSearcher(reader);
    iw.close();
}
Also used : StringField(org.apache.lucene.document.StringField) Field(org.apache.lucene.document.Field) TextField(org.apache.lucene.document.TextField) Random(java.util.Random) MockAnalyzer(org.apache.lucene.analysis.MockAnalyzer) CharacterRunAutomaton(org.apache.lucene.util.automaton.CharacterRunAutomaton) StringField(org.apache.lucene.document.StringField) TextField(org.apache.lucene.document.TextField) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) BeforeClass(org.junit.BeforeClass)

Aggregations

TextField (org.apache.lucene.document.TextField)192 Document (org.apache.lucene.document.Document)171 Directory (org.apache.lucene.store.Directory)99 MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer)61 Term (org.apache.lucene.index.Term)61 IndexWriter (org.apache.lucene.index.IndexWriter)58 IndexSearcher (org.apache.lucene.search.IndexSearcher)55 IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)52 Field (org.apache.lucene.document.Field)50 StringField (org.apache.lucene.document.StringField)48 BytesRef (org.apache.lucene.util.BytesRef)48 RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter)44 IndexReader (org.apache.lucene.index.IndexReader)43 TermQuery (org.apache.lucene.search.TermQuery)41 NumericDocValuesField (org.apache.lucene.document.NumericDocValuesField)31 SortedDocValuesField (org.apache.lucene.document.SortedDocValuesField)30 TopDocs (org.apache.lucene.search.TopDocs)29 RAMDirectory (org.apache.lucene.store.RAMDirectory)29 FieldType (org.apache.lucene.document.FieldType)23 Query (org.apache.lucene.search.Query)23