Example 1 with FlinkWordCount

Use of org.apache.gora.examples.flink.FlinkWordCount in the Apache Gora project.

From the class MapReduceTestUtils, method testFlinkWordCount:

public static void testFlinkWordCount(Configuration conf, DataStore<String, WebPage> inStore, DataStore<String, TokenDatum> outStore) throws Exception {
    // Both data stores must be Hadoop-based (DataStoreBase) so the Configuration can be set on them
    ((DataStoreBase<String, WebPage>) inStore).setConf(conf);
    ((DataStoreBase<String, TokenDatum>) outStore).setConf(conf);
    // create input
    WebPageDataCreator.createWebPageData(inStore);
    // run Flink Job
    FlinkWordCount flinkWordCount = new FlinkWordCount();
    flinkWordCount.wordCount(inStore, outStore, conf);
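    // assert results: recompute the expected token counts from the input contents
    // (held in actualCounts, despite the name) and compare them with outStore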
    HashMap<String, Integer> actualCounts = new HashMap<>();
    for (String content : WebPageDataCreator.CONTENTS) {
        if (content != null) {
            for (String token : content.split(" ")) {
                Integer count = actualCounts.get(token);
                if (count == null)
                    count = 0;
                actualCounts.put(token, ++count);
            }
        }
    }
    for (Map.Entry<String, Integer> entry : actualCounts.entrySet()) {
        assertTokenCount(outStore, entry.getKey(), entry.getValue());
    }
}
Also used: HashMap (java.util.HashMap), FlinkWordCount (org.apache.gora.examples.flink.FlinkWordCount), DataStoreBase (org.apache.gora.store.impl.DataStoreBase), Map (java.util.Map)
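
The expected-count loop in the test above (get, null check, put) can be written more compactly with Map.merge. The following is a minimal sketch that uses only the JDK. The class name ExpectedTokenCounts and the method countTokens are hypothetical names chosen for illustration and are not part of Gora.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Apache Gora: rebuilds the expected
// word counts the same way the test does, using only the JDK.
class ExpectedTokenCounts {

    // Counts how often each space-separated token occurs across all
    // non-null entries of contents.
    static Map<String, Integer> countTokens(String[] contents) {
        Map<String, Integer> counts = new HashMap<>();
        for (String content : contents) {
            if (content == null) {
                continue; // the test skips null contents as well
            }
            for (String token : content.split(" ")) {
                // merge stores 1 for a new token, otherwise adds 1 to the old count
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Under these assumptions, the test could pass WebPageDataCreator.CONTENTS to countTokens and then iterate over the returned map exactly as it does over actualCounts.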

Aggregations

HashMap (java.util.HashMap): 1, Map (java.util.Map): 1, FlinkWordCount (org.apache.gora.examples.flink.FlinkWordCount): 1, DataStoreBase (org.apache.gora.store.impl.DataStoreBase): 1