Search in sources:

Example 11 with LogByteSizeMergePolicy

Use of org.apache.lucene.index.LogByteSizeMergePolicy in project lucene-solr by apache.

In the class TestDirectoryTaxonomyReader, the method testOpenIfChangedNoChangesButSegmentMerges:

@Test
public void testOpenIfChangedNoChangesButSegmentMerges() throws Exception {
    // test openIfChanged() when the taxonomy hasn't really changed, but segments
    // were merged. The NRT reader will be reopened, and ParentArray used to assert
    // that the new reader contains more ordinals than were given by the old
    // TaxReader version
    Directory dir = newDirectory();
    // hold onto the IndexWriter so we can call forceMerge() later;
    // note that we don't close it here, since the DirectoryTaxonomyWriter (DTW) closes it.
    final IndexWriter iw = new IndexWriter(dir, new IndexWriterConfig(new MockAnalyzer(random())).setMergePolicy(new LogByteSizeMergePolicy()));
    DirectoryTaxonomyWriter writer = new DirectoryTaxonomyWriter(dir) {

        @Override
        protected IndexWriter openIndexWriter(Directory directory, IndexWriterConfig config) throws IOException {
            return iw;
        }
    };
    // add a category so that the following DTR open will cause a flush and 
    // a new segment will be created
    writer.addCategory(new FacetLabel("a"));
    TaxonomyReader reader = new DirectoryTaxonomyReader(writer);
    assertEquals(2, reader.getSize());
    assertEquals(2, reader.getParallelTaxonomyArrays().parents().length);
    // merge all the segments so that NRT reader thinks there's a change 
    iw.forceMerge(1);
    // now calling openIfChanged should trip on the wrong assert in ParentArray's ctor
    TaxonomyReader newtr = TaxonomyReader.openIfChanged(reader);
    assertNotNull(newtr);
    reader.close();
    reader = newtr;
    assertEquals(2, reader.getSize());
    assertEquals(2, reader.getParallelTaxonomyArrays().parents().length);
    reader.close();
    writer.close();
    dir.close();
}
Also used: MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer), LogByteSizeMergePolicy (org.apache.lucene.index.LogByteSizeMergePolicy), IndexWriter (org.apache.lucene.index.IndexWriter), TaxonomyReader (org.apache.lucene.facet.taxonomy.TaxonomyReader), FacetLabel (org.apache.lucene.facet.taxonomy.FacetLabel), RAMDirectory (org.apache.lucene.store.RAMDirectory), Directory (org.apache.lucene.store.Directory), IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig), Test (org.junit.Test)
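
Beyond the no-argument constructor used in the test above, LogByteSizeMergePolicy exposes a few tuning knobs inherited from LogMergePolicy plus its own byte-size thresholds. The following is a minimal sketch of configuring them on an IndexWriter; the directory path, analyzer choice, and threshold values are illustrative assumptions, not settings taken from the example above.

Directory dir = FSDirectory.open(Paths.get("/tmp/merge-policy-demo"));
LogByteSizeMergePolicy mp = new LogByteSizeMergePolicy();
// merge up to 10 same-level segments at a time (inherited from LogMergePolicy)
mp.setMergeFactor(10);
// segments larger than this total byte size are left out of normal (non-forced) merges
mp.setMaxMergeMB(512.0);
// segments below this size are all treated as being on the lowest merge level
mp.setMinMergeMB(1.6);
IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer()).setMergePolicy(mp);
IndexWriter writer = new IndexWriter(dir, iwc);
writer.close();
dir.close();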

Example 12 with LogByteSizeMergePolicy

Use of org.apache.lucene.index.LogByteSizeMergePolicy in project lucene-solr by apache.

In the class FileBasedSpellChecker, the method loadExternalFileDictionary:

private void loadExternalFileDictionary(SolrCore core, SolrIndexSearcher searcher) {
    try {
        IndexSchema schema = null == searcher ? core.getLatestSchema() : searcher.getSchema();
        // Get the field's analyzer
        if (fieldTypeName != null && schema.getFieldTypeNoEx(fieldTypeName) != null) {
            FieldType fieldType = schema.getFieldTypes().get(fieldTypeName);
            // Do index-time analysis using the given fieldType's analyzer
            RAMDirectory ramDir = new RAMDirectory();
            LogMergePolicy mp = new LogByteSizeMergePolicy();
            mp.setMergeFactor(300);
            IndexWriter writer = new IndexWriter(ramDir, new IndexWriterConfig(fieldType.getIndexAnalyzer()).setMaxBufferedDocs(150).setMergePolicy(mp).setOpenMode(IndexWriterConfig.OpenMode.CREATE));
            List<String> lines = core.getResourceLoader().getLines(sourceLocation, characterEncoding);
            for (String s : lines) {
                Document d = new Document();
                d.add(new TextField(WORD_FIELD_NAME, s, Field.Store.NO));
                writer.addDocument(d);
            }
            writer.forceMerge(1);
            writer.close();
            dictionary = new HighFrequencyDictionary(DirectoryReader.open(ramDir), WORD_FIELD_NAME, 0.0f);
        } else {
            // check if character encoding is defined
            if (characterEncoding == null) {
                dictionary = new PlainTextDictionary(core.getResourceLoader().openResource(sourceLocation));
            } else {
                dictionary = new PlainTextDictionary(new InputStreamReader(core.getResourceLoader().openResource(sourceLocation), characterEncoding));
            }
        }
    } catch (IOException e) {
        log.error("Unable to load spellings", e);
    }
}
Also used: InputStreamReader (java.io.InputStreamReader), IOException (java.io.IOException), Document (org.apache.lucene.document.Document), RAMDirectory (org.apache.lucene.store.RAMDirectory), FieldType (org.apache.solr.schema.FieldType), HighFrequencyDictionary (org.apache.lucene.search.spell.HighFrequencyDictionary), LogByteSizeMergePolicy (org.apache.lucene.index.LogByteSizeMergePolicy), IndexWriter (org.apache.lucene.index.IndexWriter), PlainTextDictionary (org.apache.lucene.search.spell.PlainTextDictionary), LogMergePolicy (org.apache.lucene.index.LogMergePolicy), TextField (org.apache.lucene.document.TextField), IndexSchema (org.apache.solr.schema.IndexSchema), IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)
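
For context on how the dictionary built above is typically consumed: Lucene's org.apache.lucene.search.spell.SpellChecker can index any Dictionary, including the HighFrequencyDictionary created in this method, into its own spell index. A minimal sketch follows, assuming the dictionary field from the example above; the analyzer choice and the misspelled query term are illustrative assumptions.

// index the dictionary into a dedicated (here in-memory) spell index
Directory spellDir = new RAMDirectory();
SpellChecker spellChecker = new SpellChecker(spellDir);
IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer()).setMergePolicy(new LogByteSizeMergePolicy());
spellChecker.indexDictionary(dictionary, iwc, /* fullMerge */ false);
// ask for up to 5 suggestions for a (hypothetical) misspelled term
String[] suggestions = spellChecker.suggestSimilar("wrod", 5);
spellChecker.close();
spellDir.close();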

Aggregations

LogByteSizeMergePolicy (org.apache.lucene.index.LogByteSizeMergePolicy): 12 usages
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 10 usages
IndexWriter (org.apache.lucene.index.IndexWriter): 8 usages
RAMDirectory (org.apache.lucene.store.RAMDirectory): 6 usages
StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer): 3 usages
Document (org.apache.lucene.document.Document): 3 usages
Term (org.apache.lucene.index.Term): 3 usages
Directory (org.apache.lucene.store.Directory): 3 usages
IOException (java.io.IOException): 2 usages
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 2 usages
StringField (org.apache.lucene.document.StringField): 2 usages
FacetLabel (org.apache.lucene.facet.taxonomy.FacetLabel): 2 usages
TaxonomyReader (org.apache.lucene.facet.taxonomy.TaxonomyReader): 2 usages
DirectoryReader (org.apache.lucene.index.DirectoryReader): 2 usages
IndexReader (org.apache.lucene.index.IndexReader): 2 usages
LogMergePolicy (org.apache.lucene.index.LogMergePolicy): 2 usages
TermQuery (org.apache.lucene.search.TermQuery): 2 usages
BitSetProducer (org.apache.lucene.search.join.BitSetProducer): 2 usages
Accountable (org.apache.lucene.util.Accountable): 2 usages
ElasticsearchDirectoryReader (org.elasticsearch.common.lucene.index.ElasticsearchDirectoryReader): 2 usages