Search in sources :

Example 1 with PlainTextDictionary

use of org.apache.lucene.search.spell.PlainTextDictionary in project camel by apache.

the class LuceneSuggestionStrategy method suggestEndpointOptions.

@Override
public String[] suggestEndpointOptions(Set<String> names, String unknownOption) {
    // each option must be on a separate line in a String
    StringBuilder sb = new StringBuilder();
    for (String name : names) {
        sb.append(name);
        sb.append("\n");
    }
    StringReader reader = new StringReader(sb.toString());
    try {
        PlainTextDictionary words = new PlainTextDictionary(reader);
        // use in-memory lucene spell checker to make the suggestions
        RAMDirectory dir = new RAMDirectory();
        SpellChecker checker = new SpellChecker(dir);
        checker.indexDictionary(words, new IndexWriterConfig(new KeywordAnalyzer()), false);
        return checker.suggestSimilar(unknownOption, maxSuggestions);
    } catch (Exception e) {
    // ignore
    }
    return null;
}
Also used : KeywordAnalyzer(org.apache.lucene.analysis.core.KeywordAnalyzer) PlainTextDictionary(org.apache.lucene.search.spell.PlainTextDictionary) StringReader(java.io.StringReader) SpellChecker(org.apache.lucene.search.spell.SpellChecker) RAMDirectory(org.apache.lucene.store.RAMDirectory) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)

Example 2 with PlainTextDictionary

use of org.apache.lucene.search.spell.PlainTextDictionary in project lucene-solr by apache.

the class FileBasedSpellChecker method loadExternalFileDictionary.

private void loadExternalFileDictionary(SolrCore core, SolrIndexSearcher searcher) {
    try {
        IndexSchema schema = null == searcher ? core.getLatestSchema() : searcher.getSchema();
        // Get the field's analyzer
        if (fieldTypeName != null && schema.getFieldTypeNoEx(fieldTypeName) != null) {
            FieldType fieldType = schema.getFieldTypes().get(fieldTypeName);
            // Do index-time analysis using the given fieldType's analyzer
            RAMDirectory ramDir = new RAMDirectory();
            LogMergePolicy mp = new LogByteSizeMergePolicy();
            mp.setMergeFactor(300);
            IndexWriter writer = new IndexWriter(ramDir, new IndexWriterConfig(fieldType.getIndexAnalyzer()).setMaxBufferedDocs(150).setMergePolicy(mp).setOpenMode(IndexWriterConfig.OpenMode.CREATE));
            List<String> lines = core.getResourceLoader().getLines(sourceLocation, characterEncoding);
            for (String s : lines) {
                Document d = new Document();
                d.add(new TextField(WORD_FIELD_NAME, s, Field.Store.NO));
                writer.addDocument(d);
            }
            writer.forceMerge(1);
            writer.close();
            dictionary = new HighFrequencyDictionary(DirectoryReader.open(ramDir), WORD_FIELD_NAME, 0.0f);
        } else {
            // check if character encoding is defined
            if (characterEncoding == null) {
                dictionary = new PlainTextDictionary(core.getResourceLoader().openResource(sourceLocation));
            } else {
                dictionary = new PlainTextDictionary(new InputStreamReader(core.getResourceLoader().openResource(sourceLocation), characterEncoding));
            }
        }
    } catch (IOException e) {
        log.error("Unable to load spellings", e);
    }
}
Also used : InputStreamReader(java.io.InputStreamReader) IOException(java.io.IOException) Document(org.apache.lucene.document.Document) RAMDirectory(org.apache.lucene.store.RAMDirectory) FieldType(org.apache.solr.schema.FieldType) HighFrequencyDictionary(org.apache.lucene.search.spell.HighFrequencyDictionary) LogByteSizeMergePolicy(org.apache.lucene.index.LogByteSizeMergePolicy) IndexWriter(org.apache.lucene.index.IndexWriter) PlainTextDictionary(org.apache.lucene.search.spell.PlainTextDictionary) LogMergePolicy(org.apache.lucene.index.LogMergePolicy) TextField(org.apache.lucene.document.TextField) IndexSchema(org.apache.solr.schema.IndexSchema) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)

Aggregations

IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)2 PlainTextDictionary (org.apache.lucene.search.spell.PlainTextDictionary)2 RAMDirectory (org.apache.lucene.store.RAMDirectory)2 IOException (java.io.IOException)1 InputStreamReader (java.io.InputStreamReader)1 StringReader (java.io.StringReader)1 KeywordAnalyzer (org.apache.lucene.analysis.core.KeywordAnalyzer)1 Document (org.apache.lucene.document.Document)1 TextField (org.apache.lucene.document.TextField)1 IndexWriter (org.apache.lucene.index.IndexWriter)1 LogByteSizeMergePolicy (org.apache.lucene.index.LogByteSizeMergePolicy)1 LogMergePolicy (org.apache.lucene.index.LogMergePolicy)1 HighFrequencyDictionary (org.apache.lucene.search.spell.HighFrequencyDictionary)1 SpellChecker (org.apache.lucene.search.spell.SpellChecker)1 FieldType (org.apache.solr.schema.FieldType)1 IndexSchema (org.apache.solr.schema.IndexSchema)1