Search in sources :

Example 1 with TxRunnable

use of co.cask.cdap.api.TxRunnable in project cdap by caskdata.

the class MapReduceRuntimeService method destroy.

/**
   * Calls the destroy method of {@link ProgramLifecycle}.
   */
private void destroy(final boolean succeeded, final String failureInfo) throws Exception {
    // if any exception happens during output committing, we want the MapReduce to fail.
    // for that to happen it is not sufficient to set the status to failed, we have to throw an exception,
    // otherwise the shutdown completes successfully and the completed() callback is called.
    // thus: remember the exception and throw it at the end.
    final AtomicReference<Exception> failureCause = new AtomicReference<>();
    // TODO (CDAP-1952): this should be done in the output committer, to make the M/R fail if addPartition fails
    try {
        context.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext ctxt) throws Exception {
                ClassLoader oldClassLoader = ClassLoaders.setContextClassLoader(job.getConfiguration().getClassLoader());
                try {
                    for (Map.Entry<String, ProvidedOutput> output : context.getOutputs().entrySet()) {
                        commitOutput(succeeded, output.getKey(), output.getValue().getOutputFormatProvider(), failureCause);
                        if (succeeded && failureCause.get() != null) {
                            // mapreduce was successful but this output committer failed: call onFailure() for all committers
                            for (ProvidedOutput toFail : context.getOutputs().values()) {
                                commitOutput(false, toFail.getAlias(), toFail.getOutputFormatProvider(), failureCause);
                            }
                            break;
                        }
                    }
                    // if there was a failure, we must throw an exception to fail the transaction
                    // this will roll back all the outputs and also make sure that postCommit() is not called
                    // throwing the failure cause: it will be wrapped in a TxFailure and handled in the outer catch()
                    Exception cause = failureCause.get();
                    if (cause != null) {
                        failureCause.set(null);
                        throw cause;
                    }
                } finally {
                    ClassLoaders.setContextClassLoader(oldClassLoader);
                }
            }
        });
    } catch (TransactionFailureException e) {
        LOG.error("Transaction failure when committing dataset outputs", e);
        if (failureCause.get() != null) {
            failureCause.get().addSuppressed(e);
        } else {
            failureCause.set(e);
        }
    }
    final boolean success = succeeded && failureCause.get() == null;
    context.setState(getProgramState(success, failureInfo));
    final TransactionControl txControl = mapReduce instanceof ProgramLifecycle ? Transactions.getTransactionControl(TransactionControl.IMPLICIT, MapReduce.class, mapReduce, "destroy") : TransactionControl.IMPLICIT;
    try {
        if (TransactionControl.IMPLICIT == txControl) {
            context.execute(new TxRunnable() {

                @Override
                public void run(DatasetContext context) throws Exception {
                    doDestroy(success);
                }
            });
        } else {
            doDestroy(success);
        }
    } catch (Throwable e) {
        if (e instanceof TransactionFailureException && e.getCause() != null && !(e instanceof TransactionConflictException)) {
            e = e.getCause();
        }
        LOG.warn("Error executing the destroy method of the MapReduce program {}", context.getProgram().getName(), e);
    }
    // this is needed to make the run fail if there was an exception. See comment at beginning of this method
    if (failureCause.get() != null) {
        throw failureCause.get();
    }
}
Also used : ProgramLifecycle(co.cask.cdap.api.ProgramLifecycle) TransactionConflictException(org.apache.tephra.TransactionConflictException) AtomicReference(java.util.concurrent.atomic.AtomicReference) ProvidedOutput(co.cask.cdap.internal.app.runtime.batch.dataset.output.ProvidedOutput) ProvisionException(com.google.inject.ProvisionException) IOException(java.io.IOException) TransactionFailureException(org.apache.tephra.TransactionFailureException) URISyntaxException(java.net.URISyntaxException) TransactionConflictException(org.apache.tephra.TransactionConflictException) AbstractMapReduce(co.cask.cdap.api.mapreduce.AbstractMapReduce) MapReduce(co.cask.cdap.api.mapreduce.MapReduce) JarEntry(java.util.jar.JarEntry) TransactionFailureException(org.apache.tephra.TransactionFailureException) TxRunnable(co.cask.cdap.api.TxRunnable) TransactionControl(co.cask.cdap.api.annotation.TransactionControl) WeakReferenceDelegatorClassLoader(co.cask.cdap.common.lang.WeakReferenceDelegatorClassLoader) CombineClassLoader(co.cask.cdap.common.lang.CombineClassLoader) DatasetContext(co.cask.cdap.api.data.DatasetContext)

Example 2 with TxRunnable

use of co.cask.cdap.api.TxRunnable in project cdap by caskdata.

the class StreamingBatchSinkFunction method call.

@Override
public Void call(JavaRDD<T> data, Time batchTime) throws Exception {
    final long logicalStartTime = batchTime.milliseconds();
    MacroEvaluator evaluator = new DefaultMacroEvaluator(sec.getWorkflowToken(), sec.getRuntimeArguments(), logicalStartTime, sec.getSecureStore(), sec.getNamespace());
    PluginContext pluginContext = new SparkPipelinePluginContext(sec.getPluginContext(), sec.getMetrics(), stageInfo.isStageLoggingEnabled(), stageInfo.isProcessTimingEnabled());
    final SparkBatchSinkFactory sinkFactory = new SparkBatchSinkFactory();
    final String stageName = stageInfo.getName();
    final BatchSink<Object, Object, Object> batchSink = pluginContext.newPluginInstance(stageName, evaluator);
    boolean isPrepared = false;
    boolean isDone = false;
    try {
        sec.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext datasetContext) throws Exception {
                SparkBatchSinkContext sinkContext = new SparkBatchSinkContext(sinkFactory, sec, datasetContext, logicalStartTime, stageInfo);
                batchSink.prepareRun(sinkContext);
            }
        });
        isPrepared = true;
        sinkFactory.writeFromRDD(data.flatMapToPair(sinkFunction), sec, stageName, Object.class, Object.class);
        isDone = true;
        sec.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext datasetContext) throws Exception {
                SparkBatchSinkContext sinkContext = new SparkBatchSinkContext(sinkFactory, sec, datasetContext, logicalStartTime, stageInfo);
                batchSink.onRunFinish(true, sinkContext);
            }
        });
    } catch (Exception e) {
        LOG.error("Error writing to sink {} for the batch for time {}.", stageName, logicalStartTime, e);
    } finally {
        if (isPrepared && !isDone) {
            sec.execute(new TxRunnable() {

                @Override
                public void run(DatasetContext datasetContext) throws Exception {
                    SparkBatchSinkContext sinkContext = new SparkBatchSinkContext(sinkFactory, sec, datasetContext, logicalStartTime, stageInfo);
                    batchSink.onRunFinish(false, sinkContext);
                }
            });
        }
    }
    return null;
}
Also used : MacroEvaluator(co.cask.cdap.api.macro.MacroEvaluator) DefaultMacroEvaluator(co.cask.cdap.etl.common.DefaultMacroEvaluator) SparkPipelinePluginContext(co.cask.cdap.etl.spark.plugin.SparkPipelinePluginContext) PluginContext(co.cask.cdap.api.plugin.PluginContext) SparkBatchSinkContext(co.cask.cdap.etl.spark.batch.SparkBatchSinkContext) SparkPipelinePluginContext(co.cask.cdap.etl.spark.plugin.SparkPipelinePluginContext) SparkBatchSinkFactory(co.cask.cdap.etl.spark.batch.SparkBatchSinkFactory) TxRunnable(co.cask.cdap.api.TxRunnable) DefaultMacroEvaluator(co.cask.cdap.etl.common.DefaultMacroEvaluator) DatasetContext(co.cask.cdap.api.data.DatasetContext)

Example 3 with TxRunnable

use of co.cask.cdap.api.TxRunnable in project cdap by caskdata.

the class StreamingSparkSinkFunction method call.

@Override
public Void call(JavaRDD<T> data, Time batchTime) throws Exception {
    final long logicalStartTime = batchTime.milliseconds();
    MacroEvaluator evaluator = new DefaultMacroEvaluator(sec.getWorkflowToken(), sec.getRuntimeArguments(), logicalStartTime, sec.getSecureStore(), sec.getNamespace());
    final PluginContext pluginContext = new SparkPipelinePluginContext(sec.getPluginContext(), sec.getMetrics(), stageInfo.isStageLoggingEnabled(), stageInfo.isProcessTimingEnabled());
    final String stageName = stageInfo.getName();
    final SparkSink<T> sparkSink = pluginContext.newPluginInstance(stageName, evaluator);
    boolean isPrepared = false;
    boolean isDone = false;
    try {
        sec.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext datasetContext) throws Exception {
                SparkPluginContext context = new BasicSparkPluginContext(sec, datasetContext, stageInfo);
                sparkSink.prepareRun(context);
            }
        });
        isPrepared = true;
        final SparkExecutionPluginContext sparkExecutionPluginContext = new SparkStreamingExecutionContext(sec, JavaSparkContext.fromSparkContext(data.rdd().context()), logicalStartTime, stageInfo);
        final JavaRDD<T> countedRDD = data.map(new CountingFunction<T>(stageName, sec.getMetrics(), "records.in", null)).cache();
        sec.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext context) throws Exception {
                sparkSink.run(sparkExecutionPluginContext, countedRDD);
            }
        });
        isDone = true;
        sec.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext datasetContext) throws Exception {
                SparkPluginContext context = new BasicSparkPluginContext(sec, datasetContext, stageInfo);
                sparkSink.onRunFinish(true, context);
            }
        });
    } catch (Exception e) {
        LOG.error("Error while executing sink {} for the batch for time {}.", stageName, logicalStartTime, e);
    } finally {
        if (isPrepared && !isDone) {
            sec.execute(new TxRunnable() {

                @Override
                public void run(DatasetContext datasetContext) throws Exception {
                    SparkPluginContext context = new BasicSparkPluginContext(sec, datasetContext, stageInfo);
                    sparkSink.onRunFinish(false, context);
                }
            });
        }
    }
    return null;
}
Also used : MacroEvaluator(co.cask.cdap.api.macro.MacroEvaluator) DefaultMacroEvaluator(co.cask.cdap.etl.common.DefaultMacroEvaluator) SparkPipelinePluginContext(co.cask.cdap.etl.spark.plugin.SparkPipelinePluginContext) BasicSparkPluginContext(co.cask.cdap.etl.spark.batch.BasicSparkPluginContext) SparkExecutionPluginContext(co.cask.cdap.etl.api.batch.SparkExecutionPluginContext) PluginContext(co.cask.cdap.api.plugin.PluginContext) SparkPluginContext(co.cask.cdap.etl.api.batch.SparkPluginContext) SparkStreamingExecutionContext(co.cask.cdap.etl.spark.streaming.SparkStreamingExecutionContext) CountingFunction(co.cask.cdap.etl.spark.function.CountingFunction) SparkPipelinePluginContext(co.cask.cdap.etl.spark.plugin.SparkPipelinePluginContext) SparkExecutionPluginContext(co.cask.cdap.etl.api.batch.SparkExecutionPluginContext) TxRunnable(co.cask.cdap.api.TxRunnable) DefaultMacroEvaluator(co.cask.cdap.etl.common.DefaultMacroEvaluator) DatasetContext(co.cask.cdap.api.data.DatasetContext) BasicSparkPluginContext(co.cask.cdap.etl.spark.batch.BasicSparkPluginContext) SparkPluginContext(co.cask.cdap.etl.api.batch.SparkPluginContext) BasicSparkPluginContext(co.cask.cdap.etl.spark.batch.BasicSparkPluginContext)

Example 4 with TxRunnable

use of co.cask.cdap.api.TxRunnable in project cdap by caskdata.

the class Transactions method execute.

/**
   * Executes the given {@link TxCallable} using the given {@link Transactional}.
   *
   * @param transactional the {@link Transactional} to use for transactional execution.
   * @param callable the {@link TxCallable} to be executed inside a transaction
   * @param <V> type of the result
   * @return value returned by the given {@link TxCallable}.
   * @throws TransactionFailureException if failed to execute the given {@link TxRunnable} in a transaction
   *
   * TODO: CDAP-6103 Move this to {@link Transactional} when revamping tx supports in program.
   */
public static <V> V execute(Transactional transactional, final TxCallable<V> callable) throws TransactionFailureException {
    final AtomicReference<V> result = new AtomicReference<>();
    transactional.execute(new TxRunnable() {

        @Override
        public void run(DatasetContext context) throws Exception {
            result.set(callable.call(context));
        }
    });
    return result.get();
}
Also used : TxRunnable(co.cask.cdap.api.TxRunnable) AtomicReference(java.util.concurrent.atomic.AtomicReference) DatasetContext(co.cask.cdap.api.data.DatasetContext) TransactionFailureException(org.apache.tephra.TransactionFailureException)

Example 5 with TxRunnable

use of co.cask.cdap.api.TxRunnable in project cdap by caskdata.

the class AbstractContext method execute.

/**
   * Execute in a transaction with optional retry on conflict.
   */
public void execute(final TxRunnable runnable, boolean retryOnConflict) throws TransactionFailureException {
    ClassLoader oldClassLoader = ClassLoaders.setContextClassLoader(getClass().getClassLoader());
    try {
        Transactional txnl = retryOnConflict ? Transactions.createTransactionalWithRetry(transactional, RetryStrategies.retryOnConflict(20, 100)) : transactional;
        txnl.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext context) throws Exception {
                ClassLoader oldClassLoader = setContextCombinedClassLoader();
                try {
                    runnable.run(context);
                } finally {
                    ClassLoaders.setContextClassLoader(oldClassLoader);
                }
            }
        });
    } finally {
        ClassLoaders.setContextClassLoader(oldClassLoader);
    }
}
Also used : TxRunnable(co.cask.cdap.api.TxRunnable) CombineClassLoader(co.cask.cdap.common.lang.CombineClassLoader) DatasetContext(co.cask.cdap.api.data.DatasetContext) LineageDatasetContext(co.cask.cdap.data.LineageDatasetContext) TransactionFailureException(org.apache.tephra.TransactionFailureException) DatasetInstantiationException(co.cask.cdap.api.data.DatasetInstantiationException) TopicNotFoundException(co.cask.cdap.api.messaging.TopicNotFoundException) IOException(java.io.IOException) Transactional(co.cask.cdap.api.Transactional)

Aggregations

TxRunnable (co.cask.cdap.api.TxRunnable)38 DatasetContext (co.cask.cdap.api.data.DatasetContext)37 IOException (java.io.IOException)18 TransactionFailureException (org.apache.tephra.TransactionFailureException)17 TransactionConflictException (org.apache.tephra.TransactionConflictException)11 DatasetManagementException (co.cask.cdap.api.dataset.DatasetManagementException)10 TransactionControl (co.cask.cdap.api.annotation.TransactionControl)8 Table (co.cask.cdap.api.dataset.table.Table)7 ApplicationNotFoundException (co.cask.cdap.common.ApplicationNotFoundException)6 ProgramNotFoundException (co.cask.cdap.common.ProgramNotFoundException)6 NoSuchElementException (java.util.NoSuchElementException)5 AtomicReference (java.util.concurrent.atomic.AtomicReference)5 TransactionNotInProgressException (org.apache.tephra.TransactionNotInProgressException)5 ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle)4 KeyValueTable (co.cask.cdap.api.dataset.lib.KeyValueTable)4 Put (co.cask.cdap.api.dataset.table.Put)3 SparkExecutionPluginContext (co.cask.cdap.etl.api.batch.SparkExecutionPluginContext)3 ApplicationSpecification (co.cask.cdap.api.app.ApplicationSpecification)2 DataSetException (co.cask.cdap.api.dataset.DataSetException)2 CloseableIterator (co.cask.cdap.api.dataset.lib.CloseableIterator)2