Example 6 with Analysis

use of org.puimula.libvoikko.Analysis in project sukija by ahomansikka.

the class VoikkoUtils method panalyze.

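// Tries each candidate string produced by CharCombinator and stores the base
// forms of the first candidate that Voikko can analyze in result.
// Returns false, leaving result untouched, if no candidate can be analyzed.
public static final boolean panalyze(Voikko voikko, String word, Set<String> result, String from, String to) {
    CharCombinator charCombinator = new CharCombinator(word, from, to);
    Iterator<String> iterator = charCombinator.iterator();
    while (iterator.hasNext()) {
        final String s = iterator.next();
        List<Analysis> list = voikko.analyze(s);
        if (!list.isEmpty()) {
            // Discard earlier contents so result holds only this candidate's base forms.
            result.clear();
            result.addAll(getBaseForms(list));
            return true;
        }
    }
    return false;
}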
Also used : Analysis(org.puimula.libvoikko.Analysis) CharCombinator(peltomaa.sukija.util.CharCombinator)
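
The control flow of panalyze (try candidates in order, stop at the first one the analyzer accepts) can be tried without libvoikko. The following is a minimal sketch under stated assumptions: `variants` is a hypothetical stand-in for CharCombinator, whose real behavior is not shown on this page, and the analyzer is a plain function that returns base forms from a toy lexicon instead of a Voikko instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class FirstMatchSketch {
    // Hypothetical stand-in for CharCombinator (an assumption, not the real class):
    // yields the word itself plus every variant in which some occurrences
    // of `from` are replaced by `to`.
    static List<String> variants(String word, char from, char to) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (char c : word.toCharArray()) {
            List<String> next = new ArrayList<>();
            for (String prefix : out) {
                next.add(prefix + c);
                if (c == from) {
                    next.add(prefix + to);
                }
            }
            out = next;
        }
        return out;
    }

    // Same shape as panalyze: the first candidate with a non-empty analysis wins.
    static boolean firstMatch(Function<String, List<String>> analyzer, String word,
                              Set<String> result, char from, char to) {
        for (String candidate : variants(word, from, to)) {
            List<String> baseForms = analyzer.apply(candidate);
            if (!baseForms.isEmpty()) {
                result.clear();
                result.addAll(baseForms);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Toy lexicon standing in for Voikko; \u0161 is the letter s with caron.
        Map<String, List<String>> lexicon = new HashMap<>();
        lexicon.put("\u0161akki", Collections.singletonList("\u0161akki"));

        Set<String> result = new LinkedHashSet<>();
        boolean found = firstMatch(
                w -> lexicon.getOrDefault(w, Collections.<String>emptyList()),
                "sakki", result, 's', '\u0161');
        System.out.println(found + " " + result);
    }
}
```

Because the loop returns on the first hit, the order in which candidates are generated decides which analysis ends up in the result set.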

Example 7 with Analysis

use of org.puimula.libvoikko.Analysis in project sukija by ahomansikka.

the class CompoundWordEndSuggestion method suggest.

public boolean suggest(String word, VoikkoAttribute voikkoAtt) {
    //System.out.println ("CompoundWordEndSuggestion1 " + word);
    extraBaseForms.clear();
    boolean found = false;
    sb.delete(0, sb.length());
    boolean hasToken = false;
    for (Token token : trie.tokenize(word)) {
        //System.out.println ("CompoundWordEndSuggestion1 " + word + " " + token.getFragment());
        if (token.isMatch()) {
            //System.out.println ("CompoundWordEndSuggestion2 " + word + " " + token.getFragment() + " " + word.substring (token.getEmit().getStart()));
            List<Analysis> list = voikko.analyze(word.substring(token.getEmit().getStart()));
            if (list.size() > 0) {
                String start = word.substring(0, token.getEmit().getStart());
                for (Analysis a : list) {
                    final String BASEFORM = a.get("BASEFORM").toLowerCase();
                    //System.out.println ("CompoundWordEndSuggestion3 " + word + " " + token.getFragment() + " " + BASEFORM);
                    final String END = map.get(token.getFragment());
                    if (BASEFORM.endsWith(END)) {
                        if (addStart)
                            addStart(start, voikkoAtt);
                        if (addEnd)
                            extraBaseForms.add(END);
                        //System.out.println ("CompoundWordEndSuggestion4 " + word + " " + (start + BASEFORM));
                        extraBaseForms.add(start + BASEFORM);
                        final String DH = dehyphen(start, BASEFORM);
                        if (DH != null)
                            extraBaseForms.add(DH);
                        // Cannot return yet, because there may be more than one result.
                        found = true;
                    }
                }
                if (found) {
                    voikkoAtt.addAnalysis(list);
                    //                    + " " + VoikkoUtils.getBaseForms(voikkoAtt.getAnalysis()) + " " + extraBaseForms.toString());
                    return true;
                }
            }
        }
    }
    return false;
}
Also used : Analysis(org.puimula.libvoikko.Analysis)
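
The core idea of suggest (split the word where a known ending begins, analyze only the tail, then glue the untouched head back onto the tail's base form) can be sketched without the trie or Voikko. This is a simplified illustration, not the project's implementation: it uses `String.indexOf` in place of the trie tokenizer, a plain function in place of `voikko.analyze`, and the names `CompoundEndSketch`, `endings` and `baseFormsOf` are hypothetical. Unlike the original, it collects every suggestion instead of returning after the first matching token.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class CompoundEndSketch {
    // endings maps a fragment that may occur inside the word to the ending
    // that the tail's base form is expected to have (assumed layout).
    static List<String> suggest(String word, Map<String, String> endings,
                                Function<String, List<String>> baseFormsOf) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : endings.entrySet()) {
            int start = word.indexOf(e.getKey());
            if (start < 0) {
                continue; // fragment does not occur in this word
            }
            String head = word.substring(0, start);
            // Analyze only the tail, starting where the fragment begins.
            for (String baseForm : baseFormsOf.apply(word.substring(start))) {
                String lower = baseForm.toLowerCase();
                if (lower.endsWith(e.getValue())) {
                    out.add(head + lower);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> endings = new HashMap<>();
        endings.put("talo", "talo");

        // Toy analyzer: knows only the inflected form "taloissa".
        Function<String, List<String>> analyzer = w ->
                w.equals("taloissa")
                        ? Collections.singletonList("talo")
                        : Collections.<String>emptyList();

        System.out.println(suggest("kivitaloissa", endings, analyzer)); // prints [kivitalo]
    }
}
```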

Aggregations

Analysis (org.puimula.libvoikko.Analysis): 7
ArrayList (java.util.ArrayList): 1
HashSet (java.util.HashSet): 1
Tokenizer (org.apache.lucene.analysis.Tokenizer): 1
CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute): 1
FlagsAttribute (org.apache.lucene.analysis.tokenattributes.FlagsAttribute): 1
BaseFormAttribute (peltomaa.sukija.attributes.BaseFormAttribute): 1
OriginalWordAttribute (peltomaa.sukija.attributes.OriginalWordAttribute): 1
HVTokenizer (peltomaa.sukija.finnish.HVTokenizer): 1
CharCombinator (peltomaa.sukija.util.CharCombinator): 1