
Example 1 with CFSA2Serializer

Use of morfologik.fsa.builders.CFSA2Serializer in the languagetool project by languagetool-org.

Class GermanSpellerRuleTest, method getDictionary.

private Dictionary getDictionary(List<byte[]> lines, InputStream infoFile) throws IOException {
    // The builder is fed input in lexical byte order, so sort first (this reorders the caller's list).
    Collections.sort(lines, FSABuilder.LEXICAL_ORDERING);
    FSA fsa = FSABuilder.build(lines);
    // Serialize the automaton to memory in CFSA2 format, then read it back together with the .info metadata.
    ByteArrayOutputStream fsaOutStream = new CFSA2Serializer().serialize(fsa, new ByteArrayOutputStream());
    ByteArrayInputStream fsaInStream = new ByteArrayInputStream(fsaOutStream.toByteArray());
    return Dictionary.read(fsaInStream, infoFile);
}
Also used: CFSA2Serializer (morfologik.fsa.builders.CFSA2Serializer), FSA (morfologik.fsa.FSA), ByteArrayInputStream (java.io.ByteArrayInputStream), ByteArrayOutputStream (java.io.ByteArrayOutputStream)
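
The example above sorts the input with FSABuilder.LEXICAL_ORDERING before building the automaton. As far as we know, that comparator orders byte arrays by unsigned byte value, which differs from Java's default signed byte comparison for non-ASCII input. The following is a minimal stdlib-only sketch of that ordering; the class UnsignedLexicalSort and its members are hypothetical stand-ins and not part of Morfologik, and Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class UnsignedLexicalSort {
    // Hypothetical stand-in for FSABuilder.LEXICAL_ORDERING (an assumption, not the library's code):
    // compares arrays element by element, treating each byte as a value in 0..255.
    static final Comparator<byte[]> UNSIGNED_LEXICAL = (a, b) -> Arrays.compareUnsigned(a, b);

    // Encodes the words as UTF-8, sorts the byte arrays, and decodes them again.
    static List<String> sortedWords(List<String> words) {
        List<byte[]> lines = new ArrayList<>();
        for (String word : words) {
            lines.add(word.getBytes(StandardCharsets.UTF_8));
        }
        lines.sort(UNSIGNED_LEXICAL);
        List<String> result = new ArrayList<>();
        for (byte[] line : lines) {
            result.add(new String(line, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        // The second word starts with A-umlaut, whose UTF-8 encoding begins with byte 0xC3.
        System.out.println(sortedWords(List.of("Zebra", "\u00C4pfel", "Apfel")));
    }
}
```

With unsigned comparison the word starting with the umlaut sorts last, because its first byte (0xC3) is larger than any ASCII byte. A signed comparison would treat 0xC3 as negative and put that word first.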

Example 2 with CFSA2Serializer

Use of morfologik.fsa.builders.CFSA2Serializer in the languagetool project by languagetool-org.

Class MorfologikMultiSpeller, method getDictionary.

private Dictionary getDictionary(List<byte[]> lines, String dictPath) throws IOException {
    Dictionary dictFromCache = dicPathToDict.get(dictPath);
    if (dictFromCache != null) {
        return dictFromCache;
    } else {
        // Creating the dictionary at runtime can easily take 50ms for spelling.txt files
        // that are ~50KB. We don't want that overhead for every check of a short sentence,
        // so we cache the result:
        Collections.sort(lines, FSABuilder.LEXICAL_ORDERING);
        FSA fsa = FSABuilder.build(lines);
        ByteArrayOutputStream fsaOutStream = new CFSA2Serializer().serialize(fsa, new ByteArrayOutputStream());
        ByteArrayInputStream fsaInStream = new ByteArrayInputStream(fsaOutStream.toByteArray());
        String infoFile = dictPath.replace(".dict", ".info");
        Dictionary dict = Dictionary.read(fsaInStream, JLanguageTool.getDataBroker().getFromResourceDirAsStream(infoFile));
        dicPathToDict.put(dictPath, dict);
        return dict;
    }
}
Also used: Dictionary (morfologik.stemming.Dictionary), CFSA2Serializer (morfologik.fsa.builders.CFSA2Serializer), FSA (morfologik.fsa.FSA)
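
The example above caches each built dictionary by path using a get-then-put sequence on dicPathToDict. One possible variant, sketched here under assumptions, is to let ConcurrentHashMap.computeIfAbsent do the check and the insert in a single atomic step. The class BuildCache, its builder function, and the build counter are hypothetical names for illustration and do not appear in LanguageTool; a plain String stands in for the dictionary so the sketch runs with the standard library alone.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class BuildCache<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> builder;              // the expensive build step
    private final AtomicInteger builds = new AtomicInteger(); // counts how often the builder ran

    public BuildCache(Function<String, V> builder) {
        this.builder = builder;
    }

    // Returns the cached value for the path, building it at most once per key.
    public V get(String path) {
        return cache.computeIfAbsent(path, p -> {
            builds.incrementAndGet();
            return builder.apply(p);
        });
    }

    public int buildCount() {
        return builds.get();
    }

    public static void main(String[] args) {
        // Stand-in builder: derives the .info name, as the example above does for infoFile.
        BuildCache<String> cache = new BuildCache<>(p -> p.replace(".dict", ".info"));
        cache.get("a.dict");
        cache.get("a.dict");
        System.out.println(cache.buildCount());
    }
}
```

In the original method the build step can throw IOException, which a Function cannot propagate directly; a real adaptation would need to wrap it, for example in UncheckedIOException.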

Aggregations

FSA (morfologik.fsa.FSA): 2
CFSA2Serializer (morfologik.fsa.builders.CFSA2Serializer): 2
ByteArrayInputStream (java.io.ByteArrayInputStream): 1
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 1
Dictionary (morfologik.stemming.Dictionary): 1