Example 1 with BatchSourceFunction

Use of io.cdap.cdap.etl.spark.function.BatchSourceFunction in project cdap by caskdata.

From the class BatchSparkPipelineDriver, method getSource.

@Override
protected SparkCollection<RecordInfo<Object>> getSource(StageSpec stageSpec,
                                                        FunctionCache.Factory functionCacheFactory,
                                                        StageStatisticsCollector collector) {
    // Bundle the stage spec, the Spark execution context (sec), and the statistics collector.
    PluginFunctionContext pluginFunctionContext = new PluginFunctionContext(stageSpec, sec, collector);
    // Adapt the batch source stage to a Spark flat-map over (key, value) input pairs.
    FlatMapFunction<Tuple2<Object, Object>, RecordInfo<Object>> sourceFunction =
        new BatchSourceFunction(pluginFunctionContext, functionCacheFactory.newCache());
    this.functionCacheFactory = functionCacheFactory;
    // Create the stage's input RDD of pairs, then apply the source function to every pair.
    return new RDDCollection<>(sec, functionCacheFactory, jsc, new SQLContext(jsc), datasetContext, sinkFactory,
        sourceFactory.createRDD(sec, jsc, stageSpec.getName(), Object.class, Object.class).flatMap(sourceFunction));
}
Also used: PluginFunctionContext (io.cdap.cdap.etl.spark.function.PluginFunctionContext), BatchSourceFunction (io.cdap.cdap.etl.spark.function.BatchSourceFunction), RecordInfo (io.cdap.cdap.etl.common.RecordInfo), Tuple2 (scala.Tuple2), SQLContext (org.apache.spark.sql.SQLContext)
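
The getSource method above turns an RDD of (key, value) pairs into records by passing a function object to flatMap. The following is a minimal, Spark-free sketch of that pattern, not the CDAP implementation. Every name in it is hypothetical: FlatMapper stands in for Spark's FlatMapFunction (assuming the Spark 2.x shape, where call returns an Iterator), Tagged stands in for RecordInfo, and the null-dropping rule is invented purely to show that a flat-map may emit zero records for an input.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class FlatMapSourceSketch {

    /** Hypothetical stand-in for Spark's FlatMapFunction: one input, zero or more outputs. */
    interface FlatMapper<T, R> {
        Iterator<R> call(T input) throws Exception;
    }

    /** Hypothetical stand-in for RecordInfo: a value tagged with the stage that emitted it. */
    static final class Tagged<T> {
        final String stageName;
        final T value;

        Tagged(String stageName, T value) {
            this.stageName = stageName;
            this.value = value;
        }
    }

    /** Builds a mapper that drops pairs with a null value and tags the rest with the stage name. */
    static FlatMapper<Map.Entry<Object, Object>, Tagged<Object>> sourceFunction(String stageName) {
        return pair -> pair.getValue() == null
            ? Collections.<Tagged<Object>>emptyIterator()
            : Collections.singletonList(new Tagged<Object>(stageName, pair.getValue())).iterator();
    }

    /** Applies the mapper to every input and concatenates the results, as flatMap does. */
    static <T, R> List<R> flatMap(List<T> inputs, FlatMapper<T, R> mapper) throws Exception {
        List<R> out = new ArrayList<>();
        for (T input : inputs) {
            Iterator<R> results = mapper.call(input);
            while (results.hasNext()) {
                out.add(results.next());
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        List<Map.Entry<Object, Object>> pairs = Arrays.asList(
            new SimpleEntry<Object, Object>(0L, "alice"),
            new SimpleEntry<Object, Object>(1L, null),
            new SimpleEntry<Object, Object>(2L, "bob"));
        // The pair with the null value produces no output record.
        for (Tagged<Object> record : flatMap(pairs, sourceFunction("source"))) {
            System.out.println(record.stageName + ": " + record.value);
        }
    }
}
```

Keeping the per-record logic in a separate function object, as the sketch does, is what lets the driver code stay a single flatMap call regardless of which source is plugged in.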

Aggregations

RecordInfo (io.cdap.cdap.etl.common.RecordInfo): 1
BatchSourceFunction (io.cdap.cdap.etl.spark.function.BatchSourceFunction): 1
PluginFunctionContext (io.cdap.cdap.etl.spark.function.PluginFunctionContext): 1
SQLContext (org.apache.spark.sql.SQLContext): 1
Tuple2 (scala.Tuple2): 1