Search in sources:

Example 1 with ProgramLifecycle

Use of co.cask.cdap.api.ProgramLifecycle in project cdap by caskdata.

The class MapReduceRuntimeService, method destroy:

/**
   * Calls the destroy method of {@link ProgramLifecycle}.
   */
private void destroy(final boolean succeeded, final String failureInfo) throws Exception {
    // if any exception happens during output committing, we want the MapReduce to fail.
    // for that to happen it is not sufficient to set the status to failed, we have to throw an exception,
    // otherwise the shutdown completes successfully and the completed() callback is called.
    // thus: remember the exception and throw it at the end.
    final AtomicReference<Exception> failureCause = new AtomicReference<>();
    // TODO (CDAP-1952): this should be done in the output committer, to make the M/R fail if addPartition fails
    try {
        context.execute(new TxRunnable() {

            @Override
            public void run(DatasetContext ctxt) throws Exception {
                ClassLoader oldClassLoader = ClassLoaders.setContextClassLoader(job.getConfiguration().getClassLoader());
                try {
                    for (Map.Entry<String, ProvidedOutput> output : context.getOutputs().entrySet()) {
                        commitOutput(succeeded, output.getKey(), output.getValue().getOutputFormatProvider(), failureCause);
                        if (succeeded && failureCause.get() != null) {
                            // mapreduce was successful but this output committer failed: call onFailure() for all committers
                            for (ProvidedOutput toFail : context.getOutputs().values()) {
                                commitOutput(false, toFail.getAlias(), toFail.getOutputFormatProvider(), failureCause);
                            }
                            break;
                        }
                    }
                    // if there was a failure, we must throw an exception to fail the transaction
                    // this will roll back all the outputs and also make sure that postCommit() is not called
                    // throwing the failure cause: it will be wrapped in a TxFailure and handled in the outer catch()
                    Exception cause = failureCause.get();
                    if (cause != null) {
                        failureCause.set(null);
                        throw cause;
                    }
                } finally {
                    ClassLoaders.setContextClassLoader(oldClassLoader);
                }
            }
        });
    } catch (TransactionFailureException e) {
        LOG.error("Transaction failure when committing dataset outputs", e);
        if (failureCause.get() != null) {
            failureCause.get().addSuppressed(e);
        } else {
            failureCause.set(e);
        }
    }
    final boolean success = succeeded && failureCause.get() == null;
    context.setState(getProgramState(success, failureInfo));
    final TransactionControl txControl = mapReduce instanceof ProgramLifecycle
        ? Transactions.getTransactionControl(TransactionControl.IMPLICIT, MapReduce.class, mapReduce, "destroy")
        : TransactionControl.IMPLICIT;
    try {
        if (TransactionControl.IMPLICIT == txControl) {
            context.execute(new TxRunnable() {

                @Override
                public void run(DatasetContext context) throws Exception {
                    doDestroy(success);
                }
            });
        } else {
            doDestroy(success);
        }
    } catch (Throwable e) {
        if (e instanceof TransactionFailureException && e.getCause() != null && !(e instanceof TransactionConflictException)) {
            e = e.getCause();
        }
        LOG.warn("Error executing the destroy method of the MapReduce program {}", context.getProgram().getName(), e);
    }
    // this is needed to make the run fail if there was an exception. See comment at beginning of this method
    if (failureCause.get() != null) {
        throw failureCause.get();
    }
}
Also used: ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle), TransactionConflictException (org.apache.tephra.TransactionConflictException), AtomicReference (java.util.concurrent.atomic.AtomicReference), ProvidedOutput (co.cask.cdap.internal.app.runtime.batch.dataset.output.ProvidedOutput), ProvisionException (com.google.inject.ProvisionException), IOException (java.io.IOException), TransactionFailureException (org.apache.tephra.TransactionFailureException), URISyntaxException (java.net.URISyntaxException), AbstractMapReduce (co.cask.cdap.api.mapreduce.AbstractMapReduce), MapReduce (co.cask.cdap.api.mapreduce.MapReduce), JarEntry (java.util.jar.JarEntry), TxRunnable (co.cask.cdap.api.TxRunnable), TransactionControl (co.cask.cdap.api.annotation.TransactionControl), WeakReferenceDelegatorClassLoader (co.cask.cdap.common.lang.WeakReferenceDelegatorClassLoader), CombineClassLoader (co.cask.cdap.common.lang.CombineClassLoader), DatasetContext (co.cask.cdap.api.data.DatasetContext)
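
The runtime code above decides, via the instanceof check, whether the user's MapReduce program has its destroy() run inside an implicit transaction or under whatever @TransactionPolicy annotation it declares. A minimal user-side sketch of the annotated case, with hypothetical names, assuming AbstractMapReduce implements ProgramLifecycle as in the CDAP API:

import co.cask.cdap.api.annotation.TransactionControl;
import co.cask.cdap.api.annotation.TransactionPolicy;
import co.cask.cdap.api.mapreduce.AbstractMapReduce;

public class PurgeTempMapReduce extends AbstractMapReduce {

    // Hypothetical program. Annotating destroy() with EXPLICIT transaction
    // control makes the runtime call doDestroy(success) directly instead of
    // wrapping it in context.execute(...).
    @Override
    @TransactionPolicy(TransactionControl.EXPLICIT)
    public void destroy() {
        // clean up temporary state here, managing any transactions manually
    }
}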

Example 2 with ProgramLifecycle

Use of co.cask.cdap.api.ProgramLifecycle in project cdap by caskdata.

The class WorkflowDriver, method initializeWorkflow:

@SuppressWarnings("unchecked")
private Workflow initializeWorkflow() throws Exception {
    Class<?> clz = Class.forName(workflowSpec.getClassName(), true, program.getClassLoader());
    if (!Workflow.class.isAssignableFrom(clz)) {
        throw new IllegalStateException(String.format("%s is not a Workflow.", clz));
    }
    Class<? extends Workflow> workflowClass = (Class<? extends Workflow>) clz;
    final Workflow workflow = new InstantiatorFactory(false).get(TypeToken.of(workflowClass)).create();
    // set metrics
    Reflections.visit(workflow, workflow.getClass(), new MetricsFieldSetter(basicWorkflowContext.getMetrics()));
    if (!(workflow instanceof ProgramLifecycle)) {
        return workflow;
    }
    final TransactionControl txControl = Transactions.getTransactionControl(TransactionControl.IMPLICIT, Workflow.class, workflow, "initialize", WorkflowContext.class);
    basicWorkflowToken.setCurrentNode(workflowSpec.getName());
    basicWorkflowContext.setState(new ProgramState(ProgramStatus.INITIALIZING, null));
    basicWorkflowContext.initializeProgram((ProgramLifecycle) workflow, basicWorkflowContext, txControl, false);
    runtimeStore.updateWorkflowToken(workflowRunId, basicWorkflowToken);
    return workflow;
}
Also used: InstantiatorFactory (co.cask.cdap.common.lang.InstantiatorFactory), MetricsFieldSetter (co.cask.cdap.internal.app.runtime.MetricsFieldSetter), ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle), TransactionControl (co.cask.cdap.api.annotation.TransactionControl), Workflow (co.cask.cdap.api.workflow.Workflow), ProgramState (co.cask.cdap.api.ProgramState)
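
initializeProgram(...) is only reached when the workflow implements ProgramLifecycle, so a user workflow opts in by overriding initialize(WorkflowContext). A minimal sketch with hypothetical names, assuming AbstractWorkflow implements ProgramLifecycle<WorkflowContext> as in the CDAP API:

import co.cask.cdap.api.workflow.AbstractWorkflow;
import co.cask.cdap.api.workflow.WorkflowContext;

public class NightlyETLWorkflow extends AbstractWorkflow {

    @Override
    protected void configure() {
        setName("NightlyETLWorkflow");
        addMapReduce("TransformMapReduce"); // hypothetical MapReduce in the same app
    }

    // Dispatched by initializeWorkflow() above, under the resolved tx control.
    @Override
    public void initialize(WorkflowContext context) throws Exception {
        super.initialize(context);
        // seed the workflow token before the first node runs
        context.getToken().put("start.time", String.valueOf(System.currentTimeMillis()));
    }
}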

Example 3 with ProgramLifecycle

Use of co.cask.cdap.api.ProgramLifecycle in project cdap by caskdata.

The class WorkflowDriver, method destroyWorkflow:

@SuppressWarnings("unchecked")
private void destroyWorkflow() {
    if (!(workflow instanceof ProgramLifecycle)) {
        return;
    }
    final TransactionControl txControl = Transactions.getTransactionControl(TransactionControl.IMPLICIT, Workflow.class, workflow, "destroy");
    try {
        basicWorkflowToken.setCurrentNode(workflowSpec.getName());
        basicWorkflowContext.destroyProgram((ProgramLifecycle) workflow, basicWorkflowContext, txControl, false);
        runtimeStore.updateWorkflowToken(workflowRunId, basicWorkflowToken);
    } catch (Throwable t) {
        LOG.error(String.format("Failed to destroy the Workflow %s", workflowRunId), t);
    }
}
Also used: ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle), TransactionControl (co.cask.cdap.api.annotation.TransactionControl), BasicThrowable (co.cask.cdap.proto.BasicThrowable)
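
The destroy() counterpart can inspect the final program state, for instance to react differently to failed runs. Continuing the hypothetical workflow sketched above (getState() and ProgramStatus are CDAP API calls; the cleanup itself is left as a placeholder):

import co.cask.cdap.api.ProgramStatus;

// Invoked by destroyWorkflow() above, under the transaction control resolved
// from any @TransactionPolicy annotation on this method.
@Override
public void destroy() {
    if (getContext().getState().getStatus() == ProgramStatus.FAILED) {
        // hypothetical cleanup of partial outputs for failed runs
    }
}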

Example 4 with ProgramLifecycle

Use of co.cask.cdap.api.ProgramLifecycle in project cdap by caskdata.

The class SparkRuntimeService, method initialize:

/**
   * Calls {@link Spark#beforeSubmit(SparkClientContext)} for pre-3.5 Spark programs;
   * otherwise calls {@link ProgramLifecycle#initialize}.
   */
@SuppressWarnings("unchecked")
private void initialize() throws Exception {
    // AbstractSpark implements final initialize(context) and requires subclass to
    // implement initialize(), whereas programs that directly implement Spark have
    // the option to override initialize(context) (if they implement ProgramLifecycle)
    final TransactionControl txControl = spark instanceof AbstractSpark
        ? Transactions.getTransactionControl(TransactionControl.IMPLICIT, AbstractSpark.class, spark, "initialize")
        : spark instanceof ProgramLifecycle
            ? Transactions.getTransactionControl(TransactionControl.IMPLICIT, Spark.class, spark, "initialize",
                                                 SparkClientContext.class)
            : TransactionControl.IMPLICIT;
    TxRunnable runnable = new TxRunnable() {

        @Override
        public void run(DatasetContext ctxt) throws Exception {
            Cancellable cancellable = SparkRuntimeUtils.setContextClassLoader(new SparkClassLoader(runtimeContext));
            try {
                context.setState(new ProgramState(ProgramStatus.INITIALIZING, null));
                if (spark instanceof ProgramLifecycle) {
                    ((ProgramLifecycle) spark).initialize(context);
                } else {
                    spark.beforeSubmit(context);
                }
            } finally {
                cancellable.cancel();
            }
        }
    };
    if (TransactionControl.IMPLICIT == txControl) {
        context.execute(runnable);
    } else {
        runnable.run(context);
    }
}
Also used: ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle), TxRunnable (co.cask.cdap.api.TxRunnable), Cancellable (org.apache.twill.common.Cancellable), TransactionControl (co.cask.cdap.api.annotation.TransactionControl), ProgramState (co.cask.cdap.api.ProgramState), DatasetContext (co.cask.cdap.api.data.DatasetContext), AbstractSpark (co.cask.cdap.api.spark.AbstractSpark)
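
The comment in initialize() distinguishes AbstractSpark subclasses, which override a no-arg initialize() that the final initialize(context) delegates to, from programs implementing Spark directly. A minimal sketch of the first shape, with hypothetical names (setMainClass and getContext() come from AbstractSpark; setSparkConf is on SparkClientContext):

import co.cask.cdap.api.spark.AbstractSpark;
import org.apache.spark.SparkConf;

public class PageRankSpark extends AbstractSpark {

    @Override
    protected void configure() {
        setMainClass(PageRankProgram.class); // hypothetical Spark main class
    }

    // Overrides the no-arg initialize(); runs under the transaction control
    // resolved by SparkRuntimeService.initialize() above.
    @Override
    protected void initialize() throws Exception {
        getContext().setSparkConf(new SparkConf().set("spark.driver.memory", "1g"));
    }
}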

Example 5 with ProgramLifecycle

Use of co.cask.cdap.api.ProgramLifecycle in project cdap by caskdata.

The class MapperWrapper, method run:

@SuppressWarnings("unchecked")
@Override
public void run(Context context) throws IOException, InterruptedException {
    MapReduceClassLoader classLoader = MapReduceClassLoader.getFromConfiguration(context.getConfiguration());
    ClassLoader weakReferenceClassLoader = new WeakReferenceDelegatorClassLoader(classLoader);
    BasicMapReduceTaskContext basicMapReduceContext = classLoader.getTaskContextProvider().get(context);
    String program = basicMapReduceContext.getProgramName();
    // this is a hook for periodic flushing of changes buffered by datasets (to avoid OOME)
    WrappedMapper.Context flushingContext = createAutoFlushingContext(context, basicMapReduceContext);
    basicMapReduceContext.setHadoopContext(flushingContext);
    InputSplit inputSplit = context.getInputSplit();
    if (inputSplit instanceof MultiInputTaggedSplit) {
        basicMapReduceContext.setInputContext(InputContexts.create((MultiInputTaggedSplit) inputSplit));
    }
    ClassLoader programClassLoader = classLoader.getProgramClassLoader();
    Mapper delegate = createMapperInstance(programClassLoader, getWrappedMapper(context.getConfiguration()), context, program);
    // injecting runtime components, like datasets, etc.
    try {
        Reflections.visit(delegate, delegate.getClass(), new PropertyFieldSetter(basicMapReduceContext.getSpecification().getProperties()), new MetricsFieldSetter(basicMapReduceContext.getMetrics()), new DataSetFieldSetter(basicMapReduceContext));
    } catch (Throwable t) {
        Throwable rootCause = Throwables.getRootCause(t);
        USERLOG.error("Failed to initialize program '{}' with error: {}. Please check the system logs for more details.", program, rootCause.getMessage(), rootCause);
        throw new IOException(String.format("Failed to inject fields to %s", delegate.getClass()), t);
    }
    ClassLoader oldClassLoader;
    if (delegate instanceof ProgramLifecycle) {
        oldClassLoader = ClassLoaders.setContextClassLoader(weakReferenceClassLoader);
        try {
            ((ProgramLifecycle) delegate).initialize(new MapReduceLifecycleContext(basicMapReduceContext));
        } catch (Exception e) {
            Throwable rootCause = Throwables.getRootCause(e);
            USERLOG.error("Failed to initialize program '{}' with error: {}. Please check the system logs for more " + "details.", program, rootCause.getMessage(), rootCause);
            throw new IOException(String.format("Failed to initialize mapper with %s", basicMapReduceContext), e);
        } finally {
            ClassLoaders.setContextClassLoader(oldClassLoader);
        }
    }
    oldClassLoader = ClassLoaders.setContextClassLoader(weakReferenceClassLoader);
    try {
        delegate.run(flushingContext);
    } finally {
        ClassLoaders.setContextClassLoader(oldClassLoader);
    }
    // flush any dataset operations still buffered (e.g., held in memory by the tx agent)
    try {
        basicMapReduceContext.flushOperations();
    } catch (Exception e) {
        throw new IOException("Failed to flush operations at the end of mapper of " + basicMapReduceContext, e);
    }
    // Close all writers created by MultipleOutputs
    basicMapReduceContext.closeMultiOutputs();
    if (delegate instanceof ProgramLifecycle) {
        oldClassLoader = ClassLoaders.setContextClassLoader(weakReferenceClassLoader);
        try {
            ((ProgramLifecycle<? extends RuntimeContext>) delegate).destroy();
        } catch (Exception e) {
            LOG.error("Error during destroy of mapper {}", basicMapReduceContext, e);
        // Do nothing, try to finish
        } finally {
            ClassLoaders.setContextClassLoader(oldClassLoader);
        }
    }
}
Also used: ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle), IOException (java.io.IOException), DataSetFieldSetter (co.cask.cdap.internal.app.runtime.DataSetFieldSetter), Mapper (org.apache.hadoop.mapreduce.Mapper), WrappedMapper (org.apache.hadoop.mapreduce.lib.map.WrappedMapper), WeakReferenceDelegatorClassLoader (co.cask.cdap.common.lang.WeakReferenceDelegatorClassLoader), PropertyFieldSetter (co.cask.cdap.common.lang.PropertyFieldSetter), MetricsFieldSetter (co.cask.cdap.internal.app.runtime.MetricsFieldSetter), MultiInputTaggedSplit (co.cask.cdap.internal.app.runtime.batch.dataset.input.MultiInputTaggedSplit), RuntimeContext (co.cask.cdap.api.RuntimeContext), TaggedInputSplit (co.cask.cdap.internal.app.runtime.batch.dataset.input.TaggedInputSplit), InputSplit (org.apache.hadoop.mapreduce.InputSplit)
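
MapperWrapper only calls initialize(...) and destroy() when the delegate mapper implements ProgramLifecycle. A minimal sketch of such a mapper, with hypothetical names; MapReduceTaskContext is the API interface behind the MapReduceLifecycleContext handed to initialize() above:

import java.io.IOException;

import co.cask.cdap.api.ProgramLifecycle;
import co.cask.cdap.api.mapreduce.MapReduceTaskContext;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenizingMapper extends Mapper<LongWritable, Text, Text, LongWritable>
    implements ProgramLifecycle<MapReduceTaskContext<Text, LongWritable>> {

    private MapReduceTaskContext<Text, LongWritable> taskContext;

    // Called by MapperWrapper.run() before delegate.run(flushingContext).
    @Override
    public void initialize(MapReduceTaskContext<Text, LongWritable> context) throws Exception {
        this.taskContext = context; // e.g., resolve runtime arguments or datasets here
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
        for (String word : value.toString().split("\\s+")) {
            context.write(new Text(word), new LongWritable(1L));
        }
    }

    // Called by MapperWrapper.run() after the mapper finishes; note that
    // failures thrown from destroy() are only logged by the wrapper.
    @Override
    public void destroy() {
        // release resources acquired in initialize()
    }
}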

Aggregations

ProgramLifecycle (co.cask.cdap.api.ProgramLifecycle): 11
TransactionControl (co.cask.cdap.api.annotation.TransactionControl): 6
WeakReferenceDelegatorClassLoader (co.cask.cdap.common.lang.WeakReferenceDelegatorClassLoader): 5
TxRunnable (co.cask.cdap.api.TxRunnable): 4
DatasetContext (co.cask.cdap.api.data.DatasetContext): 4
IOException (java.io.IOException): 4
ProgramState (co.cask.cdap.api.ProgramState): 3
CombineClassLoader (co.cask.cdap.common.lang.CombineClassLoader): 3
MetricsFieldSetter (co.cask.cdap.internal.app.runtime.MetricsFieldSetter): 3
RuntimeContext (co.cask.cdap.api.RuntimeContext): 2
AbstractMapReduce (co.cask.cdap.api.mapreduce.AbstractMapReduce): 2
AbstractSpark (co.cask.cdap.api.spark.AbstractSpark): 2
PropertyFieldSetter (co.cask.cdap.common.lang.PropertyFieldSetter): 2
DataSetFieldSetter (co.cask.cdap.internal.app.runtime.DataSetFieldSetter): 2
ProvisionException (com.google.inject.ProvisionException): 2
URISyntaxException (java.net.URISyntaxException): 2
TransactionConflictException (org.apache.tephra.TransactionConflictException): 2
TransactionFailureException (org.apache.tephra.TransactionFailureException): 2
Cancellable (org.apache.twill.common.Cancellable): 2
MapReduce (co.cask.cdap.api.mapreduce.MapReduce): 1