Example 1 with RemoveDuplicates

Use of org.apache.apex.malhar.lib.window.accumulation.RemoveDuplicates in project apex-malhar by apache.

The class DeDupExample, method populateDAG.

@Override
public void populateDAG(DAG dag, Configuration conf) {
    Collector collector = new Collector();
    // Create a stream that reads files from a local folder and emits their lines one by one downstream.
    ApexStream<String> stream = StreamFactory
        .fromFolder("./src/test/resources/wordcount", name("textInput"))
        // Split each line into words on runs of punctuation and whitespace.
        .flatMap(new Function.FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> f(String input) {
                return Arrays.asList(input.split("[\\p{Punct}\\s]+"));
            }
        }, name("ExtractWords"))
        // Normalize case so that "Word" and "word" are treated as duplicates.
        .map(new Function.MapFunction<String, String>() {
            @Override
            public String f(String input) {
                return input.toLowerCase();
            }
        }, name("ToLowerCase"));
    // Apply the window and trigger options: one global window, accumulating panes, early firings every second.
    stream
        .window(new WindowOption.GlobalWindow(),
            new TriggerOption().accumulatingFiredPanes().withEarlyFiringsAtEvery(Duration.standardSeconds(1)))
        .accumulate(new RemoveDuplicates<String>(), name("RemoveDuplicates"))
        .print(name("console"))
        .endWith(collector, collector.input)
        .populateDag(dag);
}
Also used: Function (org.apache.apex.malhar.lib.function.Function), TriggerOption (org.apache.apex.malhar.lib.window.TriggerOption), WindowOption (org.apache.apex.malhar.lib.window.WindowOption), RemoveDuplicates (org.apache.apex.malhar.lib.window.accumulation.RemoveDuplicates)
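
The method above needs the Apex runtime to execute. As a rough illustration of the same three steps (split lines into words, lowercase them, drop duplicates), here is a minimal plain-Java sketch that runs on its own. DedupSketch and distinctWords are hypothetical names and are not part of apex-malhar. A LinkedHashSet stands in for the RemoveDuplicates accumulation, and windowing and triggers are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the Apex pipeline; not part of apex-malhar.
final class DedupSketch {

    // Splits each line into words, lowercases them, and keeps the first occurrence of each word.
    static List<String> distinctWords(List<String> lines) {
        // Assumption: insertion order is preserved here; the Apex operator may order its output differently.
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            for (String word : line.split("[\\p{Punct}\\s]+")) {
                // split() yields an empty first token when a line starts with a delimiter.
                if (!word.isEmpty()) {
                    seen.add(word.toLowerCase());
                }
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hello, world!", "hello again");
        System.out.println(distinctWords(lines));
    }
}
```

With the two sample lines in main, the sketch prints [hello, world, again]. Unlike the Apex example, which re-emits the accumulated set on every trigger firing, this version produces a single result once all input has been read.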