Use of io.cdap.cdap.etl.spark.function.BatchSourceFunction in project cdap by caskdata.
The class BatchSparkPipelineDriver, method getSource.
@Override
protected SparkCollection<RecordInfo<Object>> getSource(StageSpec stageSpec,
                                                        FunctionCache.Factory functionCacheFactory,
                                                        StageStatisticsCollector collector) {
  PluginFunctionContext pluginFunctionContext = new PluginFunctionContext(stageSpec, sec, collector);
  // Applied to each (key, value) pair read from the source; emits zero or more RecordInfo objects.
  FlatMapFunction<Tuple2<Object, Object>, RecordInfo<Object>> sourceFunction =
    new BatchSourceFunction(pluginFunctionContext, functionCacheFactory.newCache());
  this.functionCacheFactory = functionCacheFactory;
  // Create the source RDD, flat-map it through the source function, and wrap it in an RDDCollection.
  return new RDDCollection<>(sec, functionCacheFactory, jsc, new SQLContext(jsc), datasetContext, sinkFactory,
                             sourceFactory.createRDD(sec, jsc, stageSpec.getName(), Object.class, Object.class)
                               .flatMap(sourceFunction));
}