
Example 1 with SinkEmitter

Use of co.cask.cdap.etl.batch.mapreduce.SinkEmitter in project cdap by caskdata.

From the class TransformExecutorFactory, method setPipeTransformDetail.

private <KEY_OUT, VAL_OUT> void setPipeTransformDetail(PipelinePhase pipeline, String stageName, Map<String, PipeTransformDetail> transformations, Map<String, ErrorOutputWriter<Object, Object>> transformErrorSinkMap, OutputWriter<KEY_OUT, VAL_OUT> outputWriter) throws Exception {
    if (pipeline.getSinks().contains(stageName)) {
        StageInfo stageInfo = pipeline.getStage(stageName);
        // If a connector sink or joiner is at the end of the pipeline, do not remove the stage name: the connector
        // sink saves the stage name along with the record, and the joiner takes the stage name along with its input.
        String pluginType = stageInfo.getPluginType();
        boolean removeStageName = !(pluginType.equals(Constants.CONNECTOR_TYPE) || pluginType.equals(BatchJoiner.PLUGIN_TYPE));
        boolean isErrorConsumer = pluginType.equals(ErrorTransform.PLUGIN_TYPE);
        transformations.put(stageName, new PipeTransformDetail(stageName, removeStageName, isErrorConsumer, getTransformation(stageInfo), new SinkEmitter<>(stageName, outputWriter)));
        return;
    }
    try {
        addTransformation(pipeline, stageName, transformations, transformErrorSinkMap);
    } catch (Exception e) {
        // Catch the Exception to generate a User Error Log for the Pipeline
        PIPELINE_LOG.error("Failed to start pipeline stage '{}' with the error: {}. Please review your pipeline configuration and check the system logs for more details.", stageName, Throwables.getRootCause(e).getMessage(), Throwables.getRootCause(e));
        throw e;
    }
    for (String output : pipeline.getDag().getNodeOutputs(stageName)) {
        setPipeTransformDetail(pipeline, output, transformations, transformErrorSinkMap, outputWriter);
        transformations.get(stageName).addTransformation(output, transformations.get(output));
    }
}
Also used: StageInfo (co.cask.cdap.etl.planner.StageInfo), SinkEmitter (co.cask.cdap.etl.batch.mapreduce.SinkEmitter)
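
The recursion in setPipeTransformDetail can be hard to follow with the CDAP types in the way. Here is a minimal, self-contained sketch of the same wiring pattern: a sink stage is registered and the recursion stops, while any other stage is registered and then linked to each of its outputs. StageNode and PipeWiringSketch are hypothetical stand-ins written for illustration; they are not CDAP classes and do not reproduce PipeTransformDetail, SinkEmitter, or the error handling above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for PipeTransformDetail: a named stage plus its downstream stages.
final class StageNode {
    final String name;
    final boolean sink;
    final List<StageNode> downstream = new ArrayList<>();

    StageNode(String name, boolean sink) {
        this.name = name;
        this.sink = sink;
    }

    // Pushes a record through this stage: a sink records it, any other stage forwards it.
    void process(String record, List<String> written) {
        if (sink) {
            written.add(name + ":" + record);
            return;
        }
        for (StageNode next : downstream) {
            next.process(record, written);
        }
    }
}

final class PipeWiringSketch {
    // Registers stageName, then recursively registers and links each of its outputs.
    static void wire(Map<String, List<String>> dag, Set<String> sinks,
                     String stageName, Map<String, StageNode> stages) {
        if (sinks.contains(stageName)) {
            // Sinks terminate the recursion, as in the first branch of the original method.
            stages.put(stageName, new StageNode(stageName, true));
            return;
        }
        stages.put(stageName, new StageNode(stageName, false));
        List<String> outputs = dag.getOrDefault(stageName, Collections.<String>emptyList());
        for (String output : outputs) {
            wire(dag, sinks, output, stages);
            stages.get(stageName).downstream.add(stages.get(output));
        }
    }

    public static void main(String[] args) {
        // Assumed example pipeline: source -> parse -> {sinkA, sinkB}
        Map<String, List<String>> dag = new HashMap<>();
        dag.put("source", Arrays.asList("parse"));
        dag.put("parse", Arrays.asList("sinkA", "sinkB"));
        Set<String> sinks = new HashSet<>(Arrays.asList("sinkA", "sinkB"));

        Map<String, StageNode> stages = new LinkedHashMap<>();
        wire(dag, sinks, "source", stages);

        List<String> written = new ArrayList<>();
        stages.get("source").process("r1", written);
        System.out.println(written); // prints [sinkA:r1, sinkB:r1]
    }
}
```

As in the original method, a stage reachable through more than one path would be visited and registered again in this sketch; whether that matters depends on how the real pipeline DAG is shaped.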

Aggregations

SinkEmitter (co.cask.cdap.etl.batch.mapreduce.SinkEmitter): 1
StageInfo (co.cask.cdap.etl.planner.StageInfo): 1