Examples with StandardAnalyzer - org.apache.lucene.analysis.standard.StandardAnalyzer

Example 11 with StandardAnalyzer

use of org.apache.lucene.analysis.standard.StandardAnalyzer in project elasticsearch by elastic.

the class TermVectorsUnitTests method writeStandardTermVector.

private void writeStandardTermVector(TermVectorsResponse outResponse) throws IOException {
    Directory dir = newDirectory();
    IndexWriterConfig conf = new IndexWriterConfig(new StandardAnalyzer());
    conf.setOpenMode(OpenMode.CREATE);
    IndexWriter writer = new IndexWriter(dir, conf);
    FieldType type = new FieldType(TextField.TYPE_STORED);
    type.setStoreTermVectorOffsets(true);
    type.setStoreTermVectorPayloads(false);
    type.setStoreTermVectorPositions(true);
    type.setStoreTermVectors(true);
    type.freeze();
    Document d = new Document();
    d.add(new Field("id", "abc", StringField.TYPE_STORED));
    d.add(new Field("title", "the1 quick brown fox jumps over  the1 lazy dog", type));
    d.add(new Field("desc", "the1 quick brown fox jumps over  the1 lazy dog", type));
    writer.updateDocument(new Term("id", "abc"), d);
    writer.commit();
    writer.close();
    DirectoryReader dr = DirectoryReader.open(dir);
    IndexSearcher s = new IndexSearcher(dr);
    TopDocs search = s.search(new TermQuery(new Term("id", "abc")), 1);
    ScoreDoc[] scoreDocs = search.scoreDocs;
    int doc = scoreDocs[0].doc;
    Fields termVectors = dr.getTermVectors(doc);
    EnumSet<Flag> flags = EnumSet.of(Flag.Positions, Flag.Offsets);
    outResponse.setFields(termVectors, null, flags, termVectors);
    dr.close();
    dir.close();
}

Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) TermQuery(org.apache.lucene.search.TermQuery) DirectoryReader(org.apache.lucene.index.DirectoryReader) Term(org.apache.lucene.index.Term) Document(org.apache.lucene.document.Document) Flag(org.elasticsearch.action.termvectors.TermVectorsRequest.Flag) FieldType(org.apache.lucene.document.FieldType) ScoreDoc(org.apache.lucene.search.ScoreDoc) TopDocs(org.apache.lucene.search.TopDocs) StringField(org.apache.lucene.document.StringField) Field(org.apache.lucene.document.Field) TextField(org.apache.lucene.document.TextField) Fields(org.apache.lucene.index.Fields) IndexWriter(org.apache.lucene.index.IndexWriter) StandardAnalyzer(org.apache.lucene.analysis.standard.StandardAnalyzer) Directory(org.apache.lucene.store.Directory) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)

Example 12 with StandardAnalyzer

use of org.apache.lucene.analysis.standard.StandardAnalyzer in project elasticsearch by elastic.

the class CustomUnifiedHighlighterTests method testMultiPhrasePrefixQuery.

public void testMultiPhrasePrefixQuery() throws Exception {
    final String[] inputs = { "The quick brown fox." };
    final String[] outputs = { "The <b>quick</b> <b>brown</b> <b>fox</b>." };
    MultiPhrasePrefixQuery query = new MultiPhrasePrefixQuery();
    query.add(new Term("text", "quick"));
    query.add(new Term("text", "brown"));
    query.add(new Term("text", "fo"));
    assertHighlightOneDoc("text", inputs, new StandardAnalyzer(), query, Locale.ROOT, BreakIterator.getSentenceInstance(Locale.ROOT), 0, outputs);
}

Also used : StandardAnalyzer(org.apache.lucene.analysis.standard.StandardAnalyzer) MultiPhrasePrefixQuery(org.elasticsearch.common.lucene.search.MultiPhrasePrefixQuery) Term(org.apache.lucene.index.Term)

Example 13 with StandardAnalyzer

use of org.apache.lucene.analysis.standard.StandardAnalyzer in project elasticsearch by elastic.

the class CustomUnifiedHighlighterTests method testSentenceBoundedBreakIterator.

public void testSentenceBoundedBreakIterator() throws Exception {
    final String[] inputs = { "The quick brown fox in a long sentence with another quick brown fox. " + "Another sentence with brown fox." };
    final String[] outputs = { "The <b>quick</b> <b>brown</b>", "<b>fox</b> in a long", "with another <b>quick</b>", "<b>brown</b> <b>fox</b>.", "sentence with <b>brown</b>", "<b>fox</b>." };
    BooleanQuery query = new BooleanQuery.Builder().add(new TermQuery(new Term("text", "quick")), BooleanClause.Occur.SHOULD).add(new TermQuery(new Term("text", "brown")), BooleanClause.Occur.SHOULD).add(new TermQuery(new Term("text", "fox")), BooleanClause.Occur.SHOULD).build();
    assertHighlightOneDoc("text", inputs, new StandardAnalyzer(), query, Locale.ROOT, BoundedBreakIteratorScanner.getSentence(Locale.ROOT, 10), 0, outputs);
}

Also used : BooleanQuery(org.apache.lucene.search.BooleanQuery) AllTermQuery(org.elasticsearch.common.lucene.all.AllTermQuery) TermQuery(org.apache.lucene.search.TermQuery) StandardAnalyzer(org.apache.lucene.analysis.standard.StandardAnalyzer) Term(org.apache.lucene.index.Term)

Example 14 with StandardAnalyzer

use of org.apache.lucene.analysis.standard.StandardAnalyzer in project elasticsearch by elastic.

the class CustomUnifiedHighlighterTests method testSimple.

public void testSimple() throws Exception {
    final String[] inputs = { "This is a test. Just a test1 highlighting from unified highlighter.", "This is the second highlighting value to perform highlighting on a longer text that gets scored lower.", "This is highlighting the third short highlighting value.", "Just a test4 highlighting from unified highlighter." };
    String[] expectedPassages = { "Just a test1 <b>highlighting</b> from unified highlighter.", "This is the second <b>highlighting</b> value to perform <b>highlighting</b> on a" + " longer text that gets scored lower.", "This is <b>highlighting</b> the third short <b>highlighting</b> value.", "Just a test4 <b>highlighting</b> from unified highlighter." };
    Query query = new TermQuery(new Term("text", "highlighting"));
    assertHighlightOneDoc("text", inputs, new StandardAnalyzer(), query, Locale.ROOT, BreakIterator.getSentenceInstance(Locale.ROOT), 0, expectedPassages);
}

Also used : AllTermQuery(org.elasticsearch.common.lucene.all.AllTermQuery) TermQuery(org.apache.lucene.search.TermQuery) Query(org.apache.lucene.search.Query) CommonTermsQuery(org.apache.lucene.queries.CommonTermsQuery) PhraseQuery(org.apache.lucene.search.PhraseQuery) AllTermQuery(org.elasticsearch.common.lucene.all.AllTermQuery) MultiPhrasePrefixQuery(org.elasticsearch.common.lucene.search.MultiPhrasePrefixQuery) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) TermQuery(org.apache.lucene.search.TermQuery) BooleanQuery(org.apache.lucene.search.BooleanQuery) StandardAnalyzer(org.apache.lucene.analysis.standard.StandardAnalyzer) Term(org.apache.lucene.index.Term)

Example 15 with StandardAnalyzer

use of org.apache.lucene.analysis.standard.StandardAnalyzer in project elasticsearch by elastic.

the class CustomUnifiedHighlighterTests method testRepeat.

public void testRepeat() throws Exception {
    final String[] inputs = { "Fun  fun fun  fun  fun  fun  fun  fun  fun  fun" };
    final String[] outputs = { "<b>Fun</b>  <b>fun</b> <b>fun</b>", "<b>fun</b>  <b>fun</b>", "<b>fun</b>  <b>fun</b>  <b>fun</b>", "<b>fun</b>  <b>fun</b>" };
    Query query = new TermQuery(new Term("text", "fun"));
    assertHighlightOneDoc("text", inputs, new StandardAnalyzer(), query, Locale.ROOT, BoundedBreakIteratorScanner.getSentence(Locale.ROOT, 10), 0, outputs);
    query = new PhraseQuery.Builder().add(new Term("text", "fun")).add(new Term("text", "fun")).build();
    assertHighlightOneDoc("text", inputs, new StandardAnalyzer(), query, Locale.ROOT, BoundedBreakIteratorScanner.getSentence(Locale.ROOT, 10), 0, outputs);
}

Aggregations

StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer)112 Analyzer (org.apache.lucene.analysis.Analyzer)37 IndexWriter (org.apache.lucene.index.IndexWriter)36 Document (org.apache.lucene.document.Document)29 IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)29 IndexSearcher (org.apache.lucene.search.IndexSearcher)24 Term (org.apache.lucene.index.Term)22 RAMDirectory (org.apache.lucene.store.RAMDirectory)21 Test (org.junit.Test)21 Query (org.apache.lucene.search.Query)20 BooleanQuery (org.apache.lucene.search.BooleanQuery)19 TermQuery (org.apache.lucene.search.TermQuery)19 IOException (java.io.IOException)16 Before (org.junit.Before)15 IndexReader (org.apache.lucene.index.IndexReader)14 HashMap (java.util.HashMap)13 Field (org.apache.lucene.document.Field)13 ArrayList (java.util.ArrayList)12 QueryParser (org.apache.lucene.queryparser.classic.QueryParser)12 Directory (org.apache.lucene.store.Directory)12