Example 1 with FlagsAttributeImpl

Use of org.apache.lucene.analysis.tokenattributes.FlagsAttributeImpl in project sukija by ahomansikka.

Class BaseFormTester, method test.

public static void test(Reader reader, Writer writer, Voikko voikko, boolean successOnly) throws IOException {
    TokenStream t = new HVTokenizer();
    ((Tokenizer) t).setReader(reader);
    t = new BaseFormFilter(t, voikko, successOnly);
    CharTermAttribute termAtt = t.addAttribute(CharTermAttribute.class);
    BaseFormAttribute baseFormAtt = t.addAttribute(BaseFormAttribute.class);
    FlagsAttribute flagsAtt = t.addAttribute(FlagsAttribute.class);
    OriginalWordAttribute originalWordAtt = t.addAttribute(OriginalWordAttribute.class);
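    // Original word of the group being collected; empty until the first token arrives.
    String orig = "";
    // Base forms collected for the current original word, sorted and de-duplicated.
    TreeSet<String> tset = new TreeSet<>();
    // Standalone copy of the most recent token's flags, held outside the stream's attributes.
    FlagsAttribute flagsA = new FlagsAttributeImpl();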
    try {
        t.reset();
        while (t.incrementToken()) {
            // The original word changed: write out the finished group before starting the next.
            if (!orig.isEmpty() && !orig.equals(originalWordAtt.getOriginalWord())) {
                writer.write("Sana: " + orig);
                if (Constants.hasFlag(flagsA, Constants.FOUND)) {
                    writer.write(" M " + toString(tset));
                }
                writer.write("\n");
                writer.flush();
                tset.clear();
            }
            orig = originalWordAtt.getOriginalWord();
            tset.addAll(baseFormAtt.getBaseForms());
            flagsA.setFlags(flagsAtt.getFlags());
        }
        // Write out the last group, which the loop above never flushes.
        writer.write("Sana: " + orig);
        if (Constants.hasFlag(flagsA, Constants.FOUND)) {
            writer.write(" M " + toString(tset));
        }
        writer.write("\n");
        writer.flush();
        t.end();
    } finally {
        t.close();
    }
/*
    try {
      t.reset();
      while (t.incrementToken()) {
        writer.write ("Sana: " + originalWordAtt.getOriginalWord()
                      + " " + termAtt.toString()
                      + " " + Constants.toString (flagsAtt)
                      + " " + baseFormAtt.getBaseForms().toString()
                      + "\n");
        writer.flush();
      }
      t.end();
    }
    finally {
      t.close();
    }
*/
}
Also used: HVTokenizer (peltomaa.sukija.finnish.HVTokenizer), TokenStream (org.apache.lucene.analysis.TokenStream), FlagsAttribute (org.apache.lucene.analysis.tokenattributes.FlagsAttribute), CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute), FlagsAttributeImpl (org.apache.lucene.analysis.tokenattributes.FlagsAttributeImpl), BaseFormAttribute (peltomaa.sukija.attributes.BaseFormAttribute), TreeSet (java.util.TreeSet), OriginalWordAttribute (peltomaa.sukija.attributes.OriginalWordAttribute), Tokenizer (org.apache.lucene.analysis.Tokenizer)
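
The loop in the example groups consecutive tokens that share the same original word and writes one line per group. The sketch below reproduces only that grouping logic in plain Java, without Lucene or Voikko, so it can be run on its own. The `Token` class, the `FOUND` bit value, and the sample data are hypothetical stand-ins; the real `Constants.FOUND`, `Constants.hasFlag`, and `toString(tset)` helpers are not shown in the example, so a simple bit test and `TreeSet.toString()` are assumed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class GroupByOriginalWord {
    // Hypothetical flag bit; the real value of Constants.FOUND is not shown in the example.
    static final int FOUND = 1;

    // Hypothetical stand-in for the attributes of one token emitted by the filter.
    static final class Token {
        final String originalWord;
        final List<String> baseForms;
        final int flags;

        Token(String originalWord, List<String> baseForms, int flags) {
            this.originalWord = originalWord;
            this.baseForms = baseForms;
            this.flags = flags;
        }
    }

    // Groups consecutive tokens by original word, returning one output line per group.
    static List<String> group(List<Token> tokens) {
        List<String> lines = new ArrayList<>();
        String orig = "";
        int flags = 0;
        TreeSet<String> forms = new TreeSet<>();
        for (Token t : tokens) {
            // The original word changed: emit the finished group.
            if (!orig.isEmpty() && !orig.equals(t.originalWord)) {
                lines.add(format(orig, flags, forms));
                forms.clear();
            }
            orig = t.originalWord;
            forms.addAll(t.baseForms);
            flags = t.flags; // like setFlags in the example, this keeps only the latest token's flags
        }
        // Emit the last group; as in the example, this also runs for empty input.
        lines.add(format(orig, flags, forms));
        return lines;
    }

    // Assumed formatting: base forms are appended only when the FOUND bit is set.
    static String format(String orig, int flags, TreeSet<String> forms) {
        String line = "Sana: " + orig;
        if ((flags & FOUND) != 0) {
            line += " M " + forms;
        }
        return line;
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("w1", List.of("b2"), FOUND));
        tokens.add(new Token("w1", List.of("b1"), FOUND));
        tokens.add(new Token("w2", List.of(), 0));
        for (String line : group(tokens)) {
            System.out.println(line);
        }
    }
}
```

With the placeholder data in `main`, the two `w1` tokens collapse into a single line with their base forms sorted, and `w2` is written without base forms because its `FOUND` bit is not set.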
