Examples with PreprocessingContext - org.carrot2.text.preprocessing.PreprocessingContext

Example 1 with PreprocessingContext

Use of org.carrot2.text.preprocessing.PreprocessingContext in project lucene-solr by apache.

From the class EchoTokensClusteringAlgorithm, method process.

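@Override
public void process() throws ProcessingException {
    // Run the preprocessing pipeline over the documents with an empty query, as English text.
    final PreprocessingContext preprocessingContext = preprocessing.preprocess(documents, "", LanguageCode.ENGLISH);
    clusters = new ArrayList<>();
    // Echo every token back as its own cluster, labelled with the token's text.
    for (char[] token : preprocessingContext.allTokens.image) {
        // Some positions carry no image (null); skip them.
        if (token != null) {
            clusters.add(new Cluster(new String(token)));
        }
    }
}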

Also used: PreprocessingContext (org.carrot2.text.preprocessing.PreprocessingContext), Cluster (org.carrot2.core.Cluster)
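
The null check in the loop above can be tried without Carrot2 on the classpath. The following is a minimal sketch, not the library's code: the class TokenImageSketch, the method toLabels and the sample array are hypothetical, and the assumption that null entries stand for marker positions (for example, boundaries between documents) is an assumption rather than something the snippet confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for iterating over a token image array such as allTokens.image.
class TokenImageSketch {

    /** Converts every non-null token image to a String, skipping null entries. */
    public static List<String> toLabels(char[][] tokenImages) {
        List<String> labels = new ArrayList<>();
        for (char[] image : tokenImages) {
            if (image != null) {
                labels.add(new String(image));
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up data: a null entry sits between two groups of tokens.
        char[][] images = {
            "data".toCharArray(), "mining".toCharArray(), null, "text".toCharArray()
        };
        System.out.println(toLabels(images)); // prints [data, mining, text]
    }
}
```

Without the null check, new String(token) would throw a NullPointerException on the first null entry.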

Example 2 with PreprocessingContext

Use of org.carrot2.text.preprocessing.PreprocessingContext in project lucene-solr by apache.

From the class EchoStemsClusteringAlgorithm, method process.

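@Override
public void process() throws ProcessingException {
    // Run the preprocessing pipeline over the documents with an empty query, as English text.
    final PreprocessingContext preprocessingContext = preprocessing.preprocess(documents, "", LanguageCode.ENGLISH);
    final AllTokens allTokens = preprocessingContext.allTokens;
    final AllWords allWords = preprocessingContext.allWords;
    final AllStems allStems = preprocessingContext.allStems;
    clusters = new ArrayList<>();
    for (int i = 0; i < allTokens.image.length; i++) {
        // A negative word index means the token has no entry in allWords; skip it.
        final int wordIndex = allTokens.wordIndex[i];
        if (wordIndex >= 0) {
            // Follow the chain token -> word -> stem and echo the stem as a cluster label.
            final int stemIndex = allWords.stemIndex[wordIndex];
            clusters.add(new Cluster(new String(allStems.image[stemIndex])));
        }
    }
}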

Also used: PreprocessingContext (org.carrot2.text.preprocessing.PreprocessingContext), AllStems (org.carrot2.text.preprocessing.PreprocessingContext.AllStems), Cluster (org.carrot2.core.Cluster), AllTokens (org.carrot2.text.preprocessing.PreprocessingContext.AllTokens), AllWords (org.carrot2.text.preprocessing.PreprocessingContext.AllWords)
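
The lookup in this example walks through three parallel-array tables: a token points to a word, and a word points to a stem. As a rough illustration of that indexing scheme only, here is a self-contained sketch that uses plain arrays in place of the Carrot2 types. The class StemLookupSketch, the method stemsForTokens and all sample data are hypothetical; only the field roles (wordIndex, stemIndex, image) mirror the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the allTokens -> allWords -> allStems index chain.
class StemLookupSketch {

    /**
     * Resolves each token to the text of its stem.
     *
     * @param tokenWordIndex per token: index into the word table, or a negative value if none
     * @param wordStemIndex  per word: index into the stem table
     * @param stemImages     per stem: the stem's characters
     */
    public static List<String> stemsForTokens(int[] tokenWordIndex, int[] wordStemIndex, char[][] stemImages) {
        List<String> stems = new ArrayList<>();
        for (int wordIndex : tokenWordIndex) {
            if (wordIndex < 0) {
                continue; // token is not a word (for example, punctuation in this made-up data)
            }
            int stemIndex = wordStemIndex[wordIndex];
            stems.add(new String(stemImages[stemIndex]));
        }
        return stems;
    }

    public static void main(String[] args) {
        // Made-up token stream: "clusters" "," "clustering" "search"
        int[] tokenWordIndex = {0, -1, 1, 2};
        // Made-up word table: clusters -> stem 0, clustering -> stem 0, search -> stem 1
        int[] wordStemIndex = {0, 0, 1};
        char[][] stemImages = {"cluster".toCharArray(), "search".toCharArray()};

        System.out.println(stemsForTokens(tokenWordIndex, wordStemIndex, stemImages));
        // prints [cluster, cluster, search]
    }
}
```

Because the result is built per token rather than per distinct stem, a stem shared by several tokens appears once for each of them, which matches the loop in the example.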

Aggregations

Cluster (org.carrot2.core.Cluster): 2
PreprocessingContext (org.carrot2.text.preprocessing.PreprocessingContext): 2
AllStems (org.carrot2.text.preprocessing.PreprocessingContext.AllStems): 1
AllTokens (org.carrot2.text.preprocessing.PreprocessingContext.AllTokens): 1
AllWords (org.carrot2.text.preprocessing.PreprocessingContext.AllWords): 1