Example 1 with WordParser

Use of zemberek.morphology.parser.WordParser in the project lucene-solr-analysis-turkish by iorixxx.

The inform method of the class Zemberek3StemFilterFactory:

@Override
public void inform(ResourceLoader loader) throws IOException {
    // Collect the lines of every configured dictionary file, if any were given.
    List<String> lines = new ArrayList<>();
    if (dictionaryFiles != null && !dictionaryFiles.trim().isEmpty()) {
        for (String file : splitFileNames(dictionaryFiles)) {
            lines.addAll(getLines(loader, file.trim()));
        }
    }
    if (lines.isEmpty()) {
        // No custom entries: use the default dictionaries shipped with Zemberek3.
        this.parser = TurkishWordParserGenerator.createWithDefaults().getParser();
        return;
    }
    // Build a lexicon graph from the custom entries and create a parser over it.
    SuffixProvider suffixProvider = new TurkishSuffixes();
    RootLexicon lexicon = new TurkishDictionaryLoader(suffixProvider).load(lines);
    DynamicLexiconGraph graph = new DynamicLexiconGraph(suffixProvider);
    graph.addDictionaryItems(lexicon);
    this.parser = new WordParser(graph);
}
Also used: SuffixProvider (zemberek.morphology.lexicon.SuffixProvider), TurkishDictionaryLoader (zemberek.morphology.lexicon.tr.TurkishDictionaryLoader), TurkishSuffixes (zemberek.morphology.lexicon.tr.TurkishSuffixes), ArrayList (java.util.ArrayList), RootLexicon (zemberek.morphology.lexicon.RootLexicon), DynamicLexiconGraph (zemberek.morphology.lexicon.graph.DynamicLexiconGraph), WordParser (zemberek.morphology.parser.WordParser)
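
The fallback decision made by inform can be looked at in isolation, without Zemberek or Lucene on the classpath. The following is a minimal sketch under stated assumptions, not the project's code: the class DictionaryFallbackSketch and the methods collectLines and usesDefaults are hypothetical names, splitFileNames here is a simplified stand-in for the factory helper of the same name (it only splits on commas), and a plain Map takes the place of the ResourceLoader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DictionaryFallbackSketch {

    // Simplified, hypothetical stand-in for the factory's splitFileNames helper:
    // splits a comma-separated list, trims each name, and drops empty entries.
    static List<String> splitFileNames(String fileNames) {
        List<String> result = new ArrayList<>();
        if (fileNames == null) {
            return result;
        }
        for (String name : fileNames.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Collects the lines of every named file. The map is an assumption that
    // plays the role of the ResourceLoader: file name -> file lines.
    static List<String> collectLines(String dictionaryFiles,
                                     Map<String, List<String>> resources) {
        List<String> lines = new ArrayList<>();
        for (String file : splitFileNames(dictionaryFiles)) {
            lines.addAll(resources.getOrDefault(file, List.of()));
        }
        return lines;
    }

    // True when no custom dictionary lines are available, i.e. the case in
    // which inform falls back to the default dictionaries.
    static boolean usesDefaults(String dictionaryFiles,
                                Map<String, List<String>> resources) {
        return collectLines(dictionaryFiles, resources).isEmpty();
    }

    public static void main(String[] args) {
        Map<String, List<String>> resources = Map.of(
                "a.dict", List.of("elma", "armut"),
                "empty.dict", List.of());

        System.out.println(usesDefaults(null, resources));                 // true
        System.out.println(usesDefaults("empty.dict", resources));         // true
        System.out.println(usesDefaults("a.dict, empty.dict", resources)); // false
    }
}
```

Both an unset attribute and a set of files that yields no lines lead to the same outcome, which is why the two cases can share a single default branch.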
