
Example 1 with JetEngine

Use of org.apache.gora.jet.JetEngine in project gora by apache.

Class LogAnalyticsJet, method main.

/**
 * In the main method pageviews are fetched through the jet source connector.
 * Then those are grouped by url and day. Then a counting aggregator is
 * applied to calculate the aggregated daily pageviews. Then the result is
 * output through the jet sink connector to a gora compatible data store.
 */
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // inStore and outStore are fields of the enclosing class (declarations not shown here).
    inStore = DataStoreFactory.getDataStore(Long.class, Pageview.class, conf);
    outStore = DataStoreFactory.getDataStore(String.class, MetricDatum.class, conf);
    Query<Long, Pageview> query = inStore.newQuery();
    JetEngine<Long, Pageview, String, MetricDatum> jetEngine = new JetEngine<>();
    Pipeline p = Pipeline.create();
    p.drawFrom(jetEngine.createDataSource(inStore, query))
        // Group pageviews by URL.
        .groupingKey(e -> e.getValue().getUrl().toString())
        // Within each URL, count pageviews per day.
        .aggregate(groupingBy(e -> getDay(e.getValue().getTimestamp()), counting()))
        .map(e -> {
            MetricDatum metricDatum = new MetricDatum();
            String url = e.getKey();
            for (Map.Entry<Long, Long> item : e.getValue().entrySet()) {
                long timeStamp = item.getKey();
                // The entry value holds the pageview count for that day.
                long sum = item.getValue();
                metricDatum.setTimestamp(timeStamp);
                metricDatum.setMetric(sum);
            }
            metricDatum.setMetricDimension(url);
            return new JetInputOutputFormat<String, MetricDatum>(url + "_" + "ip", metricDatum);
        })
        .peek()
        .drainTo(jetEngine.createDataSink(outStore));
    JetInstance jet = Jet.newJetInstance();
    try {
        jet.newJob(p).join();
    } finally {
        Jet.shutdownAll();
    }
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) JetInstance(com.hazelcast.jet.JetInstance) MetricDatum(org.apache.gora.tutorial.log.generated.MetricDatum) JetInputOutputFormat(org.apache.gora.jet.JetInputOutputFormat) Pipeline(com.hazelcast.jet.pipeline.Pipeline) Pageview(org.apache.gora.tutorial.log.generated.Pageview) JetEngine(org.apache.gora.jet.JetEngine) Map(java.util.Map)
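
The grouping described in the comment above (by URL, then by day, then counting) can be tried without a Jet cluster or a Gora data store. The following is a minimal sketch using only java.util.stream. The class DailyPageviewSketch and its View type are hypothetical stand-ins, not part of Gora, and getDay is assumed to truncate a millisecond timestamp to the start of its day, since the helper itself is not shown in the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DailyPageviewSketch {
    public static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    /** Hypothetical stand-in for Pageview: only the URL and the timestamp. */
    public static final class View {
        final String url;
        final long timestamp;

        public View(String url, long timestamp) {
            this.url = url;
            this.timestamp = timestamp;
        }
    }

    /** Assumed behaviour of getDay: truncate a millisecond timestamp to the start of its day. */
    public static long getDay(long timestamp) {
        return (timestamp / DAY_MILLIS) * DAY_MILLIS;
    }

    /** Groups views by URL, then by day, and counts the views in each (URL, day) bucket. */
    public static Map<String, Map<Long, Long>> countByUrlAndDay(List<View> views) {
        return views.stream().collect(
            Collectors.groupingBy(v -> v.url,
                Collectors.groupingBy(v -> getDay(v.timestamp), Collectors.counting())));
    }

    public static void main(String[] args) {
        List<View> views = Arrays.asList(
            new View("/index.html", 1_000L),
            new View("/index.html", 2_000L),
            new View("/index.html", DAY_MILLIS + 5L),
            new View("/about.html", 3_000L));
        // Print one line per (URL, day) bucket; iteration order is not guaranteed.
        countByUrlAndDay(views).forEach((url, perDay) ->
            perDay.forEach((day, count) ->
                System.out.println(url + " " + day + " " + count)));
    }
}
```

The result has the same shape as the value handed to the map step in the pipeline: a URL key with a map from day to count. Each URL can hold several days, so code that turns the result into records would normally emit one record per (URL, day) entry.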
