Example 1 with SerializedThrowable

use of org.apache.flink.runtime.util.SerializedThrowable in project flink by apache.

the class JobClient method submitJobDetached.

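/**
 * Submits a job in detached mode. The method sends the JobGraph to the
 * JobManager and waits for the answer whether the job could be started or not.
 *
 * @param jobManagerGateway Gateway to the JobManager which will execute the jobs
 * @param config The cluster-wide configuration.
 * @param jobGraph The job
 * @param timeout Timeout in which the JobManager must have responded.
 * @param classLoader The class loader used to deserialize a failure cause returned by the JobManager
 * @throws JobExecutionException if the submission fails or the JobManager's response is invalid
 */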
public static void submitJobDetached(ActorGateway jobManagerGateway, Configuration config, JobGraph jobGraph, FiniteDuration timeout, ClassLoader classLoader) throws JobExecutionException {
    checkNotNull(jobManagerGateway, "The jobManagerGateway must not be null.");
    checkNotNull(jobGraph, "The jobGraph must not be null.");
    checkNotNull(timeout, "The timeout must not be null.");
    LOG.info("Checking and uploading JAR files");
    try {
        jobGraph.uploadUserJars(jobManagerGateway, timeout, config);
    } catch (IOException e) {
        throw new JobSubmissionException(jobGraph.getJobID(), "Could not upload the program's JAR files to the JobManager.", e);
    }
    Object result;
    try {
        // only receive the Acknowledge for the job submission message
        Future<Object> future = jobManagerGateway.ask(new JobManagerMessages.SubmitJob(jobGraph, ListeningBehaviour.DETACHED), timeout);
        result = Await.result(future, timeout);
    } catch (TimeoutException e) {
        throw new JobTimeoutException(jobGraph.getJobID(), "JobManager did not respond within " + timeout.toString(), e);
    } catch (Throwable t) {
        throw new JobSubmissionException(jobGraph.getJobID(), "Failed to send job to JobManager: " + t.getMessage(), t.getCause());
    }
    if (result instanceof JobManagerMessages.JobSubmitSuccess) {
        JobID respondedID = ((JobManagerMessages.JobSubmitSuccess) result).jobId();
        // validate response
        if (!respondedID.equals(jobGraph.getJobID())) {
            throw new JobExecutionException(jobGraph.getJobID(), "JobManager responded for wrong Job. This Job: " + jobGraph.getJobID() + ", response: " + respondedID);
        }
    } else if (result instanceof JobManagerMessages.JobResultFailure) {
        try {
            SerializedThrowable t = ((JobManagerMessages.JobResultFailure) result).cause();
            throw t.deserializeError(classLoader);
        } catch (JobExecutionException e) {
            throw e;
        } catch (Throwable t) {
            throw new JobExecutionException(jobGraph.getJobID(), "JobSubmission failed: " + t.getMessage(), t);
        }
    } else {
        throw new JobExecutionException(jobGraph.getJobID(), "Unexpected response from JobManager: " + result);
    }
}
Also used : JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) IOException(java.io.IOException) SerializedThrowable(org.apache.flink.runtime.util.SerializedThrowable) JobID(org.apache.flink.api.common.JobID) TimeoutException(java.util.concurrent.TimeoutException)

Example 2 with SerializedThrowable

use of org.apache.flink.runtime.util.SerializedThrowable in project flink by apache.

the class JobClient method awaitJobResult.

/**
 * Given a JobListeningContext, awaits the result of the job execution that this context is bound to.
 *
 * @param listeningContext The listening context of the job execution
 * @return The result of the execution
 * @throws JobExecutionException if anything goes wrong while monitoring the job
 */
public static JobExecutionResult awaitJobResult(JobListeningContext listeningContext) throws JobExecutionException {
    final JobID jobID = listeningContext.getJobID();
    final ActorRef jobClientActor = listeningContext.getJobClientActor();
    final Future<Object> jobSubmissionFuture = listeningContext.getJobResultFuture();
    final FiniteDuration askTimeout = listeningContext.getTimeout();
    // retrieves class loader if necessary
    final ClassLoader classLoader = listeningContext.getClassLoader();
    // ping the JobClientActor from time to time to check if it is still running
    while (!jobSubmissionFuture.isCompleted()) {
        try {
            Await.ready(jobSubmissionFuture, askTimeout);
        } catch (InterruptedException e) {
            // restore the interrupt flag and keep the original exception as the cause
            Thread.currentThread().interrupt();
            throw new JobExecutionException(jobID, "Interrupted while waiting for job completion.", e);
        } catch (TimeoutException e) {
            try {
                // ping the JobClientActor to see if it is still alive
                Await.result(Patterns.ask(jobClientActor, new Identify(true), Timeout.durationToTimeout(askTimeout)), askTimeout);
                // we got a reply, continue waiting for the job result
            } catch (Exception eInner) {
                // the health check failed; this is only an error if the result has not arrived in the meantime
                if (!jobSubmissionFuture.isCompleted()) {
                    throw new JobExecutionException(jobID, "JobClientActor seems to have died before the JobExecutionResult could be retrieved.", eInner);
                }
            }
        }
    }
    final Object answer;
    try {
        // we have already awaited the result, zero time to wait here
        answer = Await.result(jobSubmissionFuture, Duration.Zero());
    } catch (Throwable throwable) {
        throw new JobExecutionException(jobID, "Couldn't retrieve the JobExecutionResult from the JobManager.", throwable);
    } finally {
        // failsafe shutdown of the client actor
        jobClientActor.tell(PoisonPill.getInstance(), ActorRef.noSender());
    }
    // second block handles the actual response
    if (answer instanceof JobManagerMessages.JobResultSuccess) {
        LOG.info("Job execution complete");
        SerializedJobExecutionResult result = ((JobManagerMessages.JobResultSuccess) answer).result();
        if (result != null) {
            try {
                return result.toJobExecutionResult(classLoader);
            } catch (Throwable t) {
                throw new JobExecutionException(jobID, "Job was successfully executed but JobExecutionResult could not be deserialized.", t);
            }
        } else {
            throw new JobExecutionException(jobID, "Job was successfully executed but result contained a null JobExecutionResult.");
        }
    } else if (answer instanceof JobManagerMessages.JobResultFailure) {
        LOG.info("Job execution failed");
        SerializedThrowable serThrowable = ((JobManagerMessages.JobResultFailure) answer).cause();
        if (serThrowable != null) {
            Throwable cause = serThrowable.deserializeError(classLoader);
            if (cause instanceof JobExecutionException) {
                throw (JobExecutionException) cause;
            } else {
                throw new JobExecutionException(jobID, "Job execution failed", cause);
            }
        } else {
            throw new JobExecutionException(jobID, "Job execution failed with null as failure cause.");
        }
    } else if (answer instanceof JobManagerMessages.JobNotFound) {
        throw new JobRetrievalException(((JobManagerMessages.JobNotFound) answer).jobID(), "Couldn't retrieve Job " + jobID + " because it was not running.");
    } else {
        throw new JobExecutionException(jobID, "Unknown answer from JobManager after submitting the job: " + answer);
    }
}
Also used : ActorRef(akka.actor.ActorRef) JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) FiniteDuration(scala.concurrent.duration.FiniteDuration) Identify(akka.actor.Identify) TimeoutException(java.util.concurrent.TimeoutException) IOException(java.io.IOException) FlinkUserCodeClassLoader(org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoader) SerializedThrowable(org.apache.flink.runtime.util.SerializedThrowable) JobID(org.apache.flink.api.common.JobID)
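
The waiting loop in awaitJobResult blocks on the result for one timeout interval at a time and, whenever an interval expires, checks that the party producing the result is still alive. A minimal sketch of that pattern using only java.util.concurrent follows. The class AwaitWithHealthCheck, the method awaitResult and the workerIsAlive callback are hypothetical names chosen for illustration; they are not part of Flink or Akka, and the liveness check stands in for the Identify ping.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Illustrative only: none of these names exist in Flink. */
final class AwaitWithHealthCheck {

    /**
     * Waits for the future in slices of pollMillis. After each slice that
     * times out, asks workerIsAlive whether the producer is still running.
     */
    static <T> T awaitResult(CompletableFuture<T> result, BooleanSupplier workerIsAlive, long pollMillis)
            throws InterruptedException, ExecutionException {
        while (!result.isDone()) {
            try {
                result.get(pollMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The result may have arrived between the timeout and the
                // health check, so only fail if it is still missing.
                if (!workerIsAlive.getAsBoolean() && !result.isDone()) {
                    throw new IllegalStateException("Worker died before the result could be retrieved.", e);
                }
            }
        }
        // The future is complete, so this call returns (or throws) immediately.
        return result.get();
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> result = new CompletableFuture<>();
        new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
                // complete anyway; this is only a demo
            }
            result.complete("job finished");
        }).start();
        System.out.println(awaitResult(result, () -> true, 50)); // prints "job finished"
    }
}
```

Re-checking isDone() after a failed health check mirrors the second isCompleted() test in the method above: a dead worker is not an error if it delivered its result first.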

Example 3 with SerializedThrowable

use of org.apache.flink.runtime.util.SerializedThrowable in project flink by apache.

the class ExecutionGraph method notifyJobStatusChange.

private void notifyJobStatusChange(JobStatus newState, Throwable error) {
    if (jobStatusListeners.size() > 0) {
        final long timestamp = System.currentTimeMillis();
        final Throwable serializedError = error == null ? null : new SerializedThrowable(error);
        for (JobStatusListener listener : jobStatusListeners) {
            try {
                listener.jobStatusChanges(getJobID(), newState, timestamp, serializedError);
            } catch (Throwable t) {
                LOG.warn("Error while notifying JobStatusListener", t);
            }
        }
    }
}
Also used : SerializedThrowable(org.apache.flink.runtime.util.SerializedThrowable)
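
notifyJobStatusChange wraps each listener call in its own try/catch, so one failing listener cannot prevent the remaining listeners from being notified. The sketch below isolates that pattern with plain java.util.function.Consumer listeners. IsolatedNotification and notifyListeners are hypothetical names for illustration, and the failure counter is an addition that the Flink method does not have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative only: not a Flink class. */
final class IsolatedNotification {

    /** Delivers the event to every listener and returns how many of them threw. */
    static <E> int notifyListeners(List<Consumer<E>> listeners, E event) {
        int failures = 0;
        for (Consumer<E> listener : listeners) {
            try {
                listener.accept(event);
            } catch (Throwable t) {
                // Log and continue, so later listeners still receive the event.
                failures++;
                System.err.println("Error while notifying listener: " + t);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        List<Consumer<String>> listeners = new ArrayList<>();
        listeners.add(event -> {
            throw new IllegalStateException("broken listener");
        });
        listeners.add(received::add);

        int failures = notifyListeners(listeners, "RUNNING");
        System.out.println(failures + " failure(s), delivered: " + received);
    }
}
```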

Example 4 with SerializedThrowable

use of org.apache.flink.runtime.util.SerializedThrowable in project flink by apache.

the class JobSubmissionClientActor method tryToSubmitJob.

private void tryToSubmitJob() {
    LOG.info("Sending message to JobManager {} to submit job {} ({}) and wait for progress", jobManager.path().toString(), jobGraph.getName(), jobGraph.getJobID());
    Futures.future(new Callable<Object>() {

        @Override
        public Object call() throws Exception {
            ActorGateway jobManagerGateway = new AkkaActorGateway(jobManager, leaderSessionID);
            LOG.info("Upload jar files to job manager {}.", jobManager.path());
            try {
                jobGraph.uploadUserJars(jobManagerGateway, timeout, clientConfig);
            } catch (IOException exception) {
                getSelf().tell(decorateMessage(new JobManagerMessages.JobResultFailure(new SerializedThrowable(new JobSubmissionException(jobGraph.getJobID(), "Could not upload the jar files to the job manager.", exception)))), ActorRef.noSender());
            }
            LOG.info("Submit job to the job manager {}.", jobManager.path());
            jobManager.tell(decorateMessage(new JobManagerMessages.SubmitJob(jobGraph, ListeningBehaviour.EXECUTION_RESULT_AND_STATE_CHANGES)), getSelf());
            // issue a SubmissionTimeout message to check that we submit the job within
            // the given timeout
            getContext().system().scheduler().scheduleOnce(timeout, getSelf(), decorateMessage(JobClientMessages.getSubmissionTimeout()), getContext().dispatcher(), ActorRef.noSender());
            return null;
        }
    }, getContext().dispatcher());
}
Also used : AkkaActorGateway(org.apache.flink.runtime.instance.AkkaActorGateway) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) JobManagerMessages(org.apache.flink.runtime.messages.JobManagerMessages) IOException(java.io.IOException) SerializedThrowable(org.apache.flink.runtime.util.SerializedThrowable)

Example 5 with SerializedThrowable

use of org.apache.flink.runtime.util.SerializedThrowable in project flink by apache.

the class ProducerFailedExceptionTest method testCauseIsSerialized.

@Test
public void testCauseIsSerialized() throws Exception {
    // Tests that the cause is stringified, because it might be an instance
    // of a user-level Exception, which cannot be deserialized by the
    // remote receiver's system class loader.
    ProducerFailedException e = new ProducerFailedException(new Exception());
    assertNotNull(e.getCause());
    assertTrue(e.getCause() instanceof SerializedThrowable);
}
Also used : CancelTaskException(org.apache.flink.runtime.execution.CancelTaskException) SerializedThrowable(org.apache.flink.runtime.util.SerializedThrowable) Test(org.junit.Test)
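
The comment in the test above gives the motivation for this kind of wrapper: an exception thrown by user code may belong to a class that the receiving side cannot load. One way to cope with that, sketched here under assumptions, is to ship the original exception as bytes together with its stringified stack trace, and to fall back to the wrapper itself when the bytes cannot be turned back into an object. StringifiedThrowable and getFullStringifiedStackTrace are hypothetical names and this is not Flink's implementation; only the method name deserializeError(ClassLoader) is borrowed from the examples on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical stand-in for illustration; not Flink's SerializedThrowable. */
final class StringifiedThrowable extends Exception {
    private static final long serialVersionUID = 1L;

    private final byte[] serializedOriginal; // null if the original was not serializable
    private final String fullStackTrace;

    StringifiedThrowable(Throwable original) {
        super(original.getClass().getName() + ": " + original.getMessage());
        StringWriter text = new StringWriter();
        original.printStackTrace(new PrintWriter(text, true));
        this.fullStackTrace = text.toString();

        byte[] bytes = null;
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(original);
            oos.flush();
            bytes = bos.toByteArray();
        } catch (IOException e) {
            // Not serializable: only the string form travels.
        }
        this.serializedOriginal = bytes;
    }

    String getFullStringifiedStackTrace() {
        return fullStackTrace;
    }

    /** Returns the original exception if the loader can resolve it, else this wrapper. */
    Throwable deserializeError(ClassLoader classLoader) {
        if (serializedOriginal == null) {
            return this;
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(serializedOriginal)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
                try {
                    // Prefer the supplied (for example user-code) class loader.
                    return Class.forName(desc.getName(), false, classLoader);
                } catch (ClassNotFoundException e) {
                    return super.resolveClass(desc);
                }
            }
        }) {
            return (Throwable) ois.readObject();
        } catch (Throwable t) {
            // Class missing or stream unreadable: the wrapper is still usable.
            return this;
        }
    }

    public static void main(String[] args) {
        Throwable original = new IllegalStateException("boom");
        Throwable restored = new StringifiedThrowable(original)
                .deserializeError(StringifiedThrowable.class.getClassLoader());
        // prints "java.lang.IllegalStateException: boom"
        System.out.println(restored.getClass().getName() + ": " + restored.getMessage());
    }
}
```

Because the wrapper extends a JDK exception type and holds only a byte array and a string, the receiver can always deserialize the wrapper itself, even when the original exception class is unavailable.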

Aggregations

SerializedThrowable (org.apache.flink.runtime.util.SerializedThrowable): 5
IOException (java.io.IOException): 3
JobManagerMessages (org.apache.flink.runtime.messages.JobManagerMessages): 3
TimeoutException (java.util.concurrent.TimeoutException): 2
JobID (org.apache.flink.api.common.JobID): 2
ActorRef (akka.actor.ActorRef): 1
Identify (akka.actor.Identify): 1
CancelTaskException (org.apache.flink.runtime.execution.CancelTaskException): 1
FlinkUserCodeClassLoader (org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoader): 1
ActorGateway (org.apache.flink.runtime.instance.ActorGateway): 1
AkkaActorGateway (org.apache.flink.runtime.instance.AkkaActorGateway): 1
Test (org.junit.Test): 1
FiniteDuration (scala.concurrent.duration.FiniteDuration): 1