
Example 1 with WordCountOperator

Use of edu.uci.ics.texera.dataflow.wordcount.WordCountOperator in project textdb by TextDB.

From class WordCountTest, method computePayLoadWordCount.

// Computes word counts from the payload of each tuple in the given table.
// Note: the `attribute` parameter is unused; the counted attribute is chosen per table.
public static HashMap<String, Integer> computePayLoadWordCount(String tableName, String attribute) throws TexeraException {
    ScanBasedSourceOperator scanSource = new ScanBasedSourceOperator(new ScanSourcePredicate(tableName));
    // Pick the attribute and Lucene analyzer that match the table's language.
    WordCountOperator wordCount;
    if (tableName.equals(COUNT_TABLE)) {
        wordCount = new WordCountOperator(new WordCountOperatorPredicate(TestConstants.DESCRIPTION, LuceneAnalyzerConstants.standardAnalyzerString()));
    } else if (tableName.equals(COUNT_CHINESE_TABLE)) {
        wordCount = new WordCountOperator(new WordCountOperatorPredicate(TestConstantsChineseWordCount.DESCRIPTION, LuceneAnalyzerConstants.chineseAnalyzerString()));
    } else {
        // Fail fast; the original left wordCount null here and hit a NullPointerException below.
        throw new IllegalArgumentException("Unsupported table: " + tableName);
    }
    HashMap<String, Integer> result = new HashMap<>();
    wordCount.setInputOperator(scanSource);
    wordCount.open();
    try {
        // Drain the operator: each output tuple carries one word and its count.
        Tuple tuple;
        while ((tuple = wordCount.getNextTuple()) != null) {
            result.put((String) tuple.getField(WordCountOperator.WORD).getValue(), (Integer) tuple.getField(WordCountOperator.COUNT).getValue());
        }
    } finally {
        // Release the operator even if iteration throws.
        wordCount.close();
    }
    return result;
}
Also used: HashMap (java.util.HashMap), WordCountOperator (edu.uci.ics.texera.dataflow.wordcount.WordCountOperator), WordCountOperatorPredicate (edu.uci.ics.texera.dataflow.wordcount.WordCountOperatorPredicate), ScanBasedSourceOperator (edu.uci.ics.texera.dataflow.source.scan.ScanBasedSourceOperator), ScanSourcePredicate (edu.uci.ics.texera.dataflow.source.scan.ScanSourcePredicate), Tuple (edu.uci.ics.texera.api.tuple.Tuple)
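
The open / getNextTuple / close loop in the method above follows a pull-based iterator pattern. The following is a minimal sketch of that pattern with no Texera dependencies. The `PullOperator` interface and the `InMemoryWordCount` class are hypothetical stand-ins, not Texera classes, and a simple regex tokenizer is assumed in place of a Lucene analyzer, so the tokens it produces may differ from what WordCountOperator would emit.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class WordCountSketch {

    /** Hypothetical pull-based operator; NOT the Texera operator interface. */
    interface PullOperator<T> {
        void open();
        T getNext();   // returns null once the operator is exhausted
        void close();
    }

    /** Hypothetical operator that emits one (word, count) entry per distinct word. */
    static final class InMemoryWordCount implements PullOperator<Map.Entry<String, Integer>> {
        private final List<String> lines;
        private Iterator<Map.Entry<String, Integer>> cursor;

        InMemoryWordCount(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public void open() {
            // Assumption: lowercase the text and split on non-word characters.
            Map<String, Integer> counts = new HashMap<>();
            for (String line : lines) {
                for (String token : line.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (!token.isEmpty()) {
                        counts.merge(token, 1, Integer::sum);
                    }
                }
            }
            cursor = counts.entrySet().iterator();
        }

        @Override
        public Map.Entry<String, Integer> getNext() {
            return (cursor != null && cursor.hasNext()) ? cursor.next() : null;
        }

        @Override
        public void close() {
            cursor = null;
        }
    }

    /** Drains an operator into a map, closing it even if iteration fails. */
    static Map<String, Integer> drain(PullOperator<Map.Entry<String, Integer>> op) {
        Map<String, Integer> result = new HashMap<>();
        op.open();
        try {
            Map.Entry<String, Integer> entry;
            while ((entry = op.getNext()) != null) {
                result.put(entry.getKey(), entry.getValue());
            }
        } finally {
            op.close();
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = drain(new InMemoryWordCount(List.of("the cat", "The dog")));
        System.out.println(counts);
    }
}
```

Keeping the drain loop in its own method means the same code can consume any operator that follows the open / next / close contract, and the `finally` block ensures `close()` runs on every path.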
