Example 1 with Terminated

use of akka.actor.Terminated in project flink by apache.

the class JobClientActor method handleMessage.

@Override
protected void handleMessage(Object message) {
    if (message instanceof ExecutionGraphMessages.ExecutionStateChanged) {
        logAndPrintMessage((ExecutionGraphMessages.ExecutionStateChanged) message);
    } else if (message instanceof ExecutionGraphMessages.JobStatusChanged) {
        logAndPrintMessage((ExecutionGraphMessages.JobStatusChanged) message);
    } else if (message instanceof JobManagerLeaderAddress) {
        JobManagerLeaderAddress msg = (JobManagerLeaderAddress) message;
        if (jobManager != null) {
            // only print this message when we had been connected to a JobManager before
            logAndPrintMessage("New JobManager elected. Connecting to " + msg.address());
        }
        disconnectFromJobManager();
        this.leaderSessionID = msg.leaderSessionID();
        if (msg.address() != null) {
            // Resolve the job manager leader address to obtain an ActorRef
            AkkaUtils.getActorRefFuture(msg.address(), getContext().system(), timeout).onSuccess(new OnSuccess<ActorRef>() {

                @Override
                public void onSuccess(ActorRef result) throws Throwable {
                    getSelf().tell(decorateMessage(new JobManagerActorRef(result)), ActorRef.noSender());
                }
            }, getContext().dispatcher());
        }
    } else if (message instanceof JobManagerActorRef) {
        // Resolved JobManager ActorRef
        JobManagerActorRef msg = (JobManagerActorRef) message;
        connectToJobManager(msg.jobManager());
        logAndPrintMessage("Connected to JobManager at " + msg.jobManager());
        connectedToJobManager();
    } else if (message instanceof JobManagerMessages.JobResultMessage) {
        // client is only interested in the final job result
        if (LOG.isDebugEnabled()) {
            LOG.debug("Received {} message from JobManager", message.getClass().getSimpleName());
        }
        // forward the success to the original client
        if (isClientConnected()) {
            this.client.tell(decorateMessage(message), getSelf());
        }
        terminate();
    } else if (message instanceof Terminated) {
        ActorRef target = ((Terminated) message).getActor();
        if (jobManager.equals(target)) {
            LOG.info("Lost connection to JobManager {}. Triggering connection timeout.", jobManager.path());
            disconnectFromJobManager();
            // ConnectionTimeout extends RequiresLeaderSessionID
            if (isClientConnected()) {
                getContext().system().scheduler().scheduleOnce(timeout, getSelf(), decorateMessage(JobClientMessages.getConnectionTimeout()), getContext().dispatcher(), ActorRef.noSender());
            }
        } else {
            LOG.warn("Received 'Terminated' for unknown actor {}.", target);
        }
    } else if (JobClientMessages.getConnectionTimeout().equals(message)) {
        // check if we haven't found a job manager yet
        if (!isJobManagerConnected()) {
            final JobClientActorConnectionTimeoutException errorMessage = new JobClientActorConnectionTimeoutException("Lost connection to the JobManager.");
            final Object replyMessage = decorateMessage(new Status.Failure(errorMessage));
            if (isClientConnected()) {
                client.tell(replyMessage, getSelf());
            }
            // Connection timeout reached, let's terminate
            terminate();
        }
    } else if (!isJobManagerConnected() && getClientMessageClass().equals(message.getClass())) {
        LOG.info("Received {} but there is no connection to a JobManager yet.", message);
        // We want to submit/attach to a job, but we haven't found a job manager yet.
        // Give it another chance to find a job manager within the given timeout.
        getContext().system().scheduler().scheduleOnce(timeout, getSelf(), decorateMessage(JobClientMessages.getConnectionTimeout()), getContext().dispatcher(), ActorRef.noSender());
        handleCustomMessage(message);
    } else {
        if (!toBeTerminated) {
            handleCustomMessage(message);
        } else {
            // we're about to receive a PoisonPill because toBeTerminated == true
            String msg = getClass().getName() + " is about to be terminated. Therefore, the job submission cannot be executed.";
            LOG.error(msg);
            getSender().tell(decorateMessage(new Status.Failure(new Exception(msg))), ActorRef.noSender());
        }
    }
}
Also used : JobStatus(org.apache.flink.runtime.jobgraph.JobStatus) Status(akka.actor.Status) JobManagerLeaderAddress(org.apache.flink.runtime.messages.JobClientMessages.JobManagerLeaderAddress) JobManagerActorRef(org.apache.flink.runtime.messages.JobClientMessages.JobManagerActorRef) ActorRef(akka.actor.ActorRef) Terminated(akka.actor.Terminated) ExecutionGraphMessages(org.apache.flink.runtime.messages.ExecutionGraphMessages)

Example 2 with Terminated

use of akka.actor.Terminated in project flink by apache.

the class ProcessReaper method onReceive.

@Override
public void onReceive(Object message) {
    if (message instanceof Terminated) {
        try {
            Terminated term = (Terminated) message;
            String name = term.actor().path().toSerializationFormat();
            if (log != null) {
                log.error("Actor " + name + " terminated, stopping process...");
                // give the log some time to reach disk
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    // not really a problem if we don't sleep...
                }
            }
        } finally {
            System.exit(exitCode);
        }
    }
}
Also used : Terminated(akka.actor.Terminated)
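
The listing above puts System.exit in a finally block so that the process stops even if reporting the termination throws. A minimal sketch of that pattern follows, using only the JDK. The ReaperSketch class and its reap method are hypothetical names introduced for illustration, and the exit action is passed in as a parameter (an assumption, not how ProcessReaper is written) so the behavior can be observed without ending the JVM.

```java
import java.util.function.IntConsumer;

/** Hypothetical illustration; not part of Flink or Akka. */
final class ReaperSketch {

    /**
     * Runs the report step, then always invokes the exit action,
     * even when the report step throws.
     */
    static void reap(Runnable report, IntConsumer exit, int exitCode) {
        try {
            report.run();
        } finally {
            // runs on both the normal and the exceptional path
            exit.accept(exitCode);
        }
    }

    public static void main(String[] args) {
        final int[] seen = {0};
        try {
            // the report step fails, the exit action still runs
            reap(() -> { throw new IllegalStateException("logging failed"); },
                 code -> seen[0] = code,
                 7);
        } catch (IllegalStateException expected) {
            // propagates only because the stand-in exit action does not stop the JVM
        }
        System.out.println("exit action received code " + seen[0]);
    }
}
```

With System::exit passed as the exit action, the exception would never surface, because the JVM halts inside the finally block.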

Example 3 with Terminated

use of akka.actor.Terminated in project flink by apache.

the class TaskManagerRegistrationTest method testTaskManagerResumesConnectAfterJobManagerFailure.

/**
 * Validate that the TaskManager attempts to re-connect after it lost the connection
 * to the JobManager.
 */
@Test
public void testTaskManagerResumesConnectAfterJobManagerFailure() {
    new JavaTestKit(actorSystem) {

        {
            ActorGateway fakeJobManager1Gateway = null;
            ActorGateway fakeJobManager2Gateway = null;
            ActorGateway taskManagerGateway = null;
            final String JOB_MANAGER_NAME = "ForwardingJobManager";
            try {
                fakeJobManager1Gateway = TestingUtils.createForwardingActor(actorSystem, getTestActor(), Option.apply(JOB_MANAGER_NAME));
                final ActorGateway fakeJM1Gateway = fakeJobManager1Gateway;
                // we make the test actor (the test kit) the JobManager to intercept
                // the messages
                taskManagerGateway = createTaskManager(actorSystem, fakeJobManager1Gateway, config, true, false);
                final ActorGateway tm = taskManagerGateway;
                // validate initial registration
                new Within(timeout) {

                    @Override
                    protected void run() {
                        // the TaskManager should try to register
                        expectMsgClass(RegisterTaskManager.class);
                        // we accept the registration
                        tm.tell(new AcknowledgeRegistration(new InstanceID(), 45234), fakeJM1Gateway);
                    }
                };
                // kill the first forwarding JobManager
                watch(fakeJobManager1Gateway.actor());
                stopActor(fakeJobManager1Gateway.actor());
                final ActorGateway gateway = fakeJobManager1Gateway;
                new Within(timeout) {

                    @Override
                    protected void run() {
                        Object message = null;
                        // skip any other messages that are queued up in the
                        // testing actor's mailbox until the Terminated arrives
                        while (!(message instanceof Terminated)) {
                            message = receiveOne(timeout);
                        }
                        Terminated terminatedMessage = (Terminated) message;
                        assertEquals(gateway.actor(), terminatedMessage.actor());
                    }
                };
                fakeJobManager1Gateway = null;
                // now start the second fake JobManager and expect that
                // the TaskManager registers again
                // the second fake JM needs to have the same actor URL
                // since we cannot reliably wait until the actor is unregistered (name is
                // available again) we loop with multiple tries for 20 seconds
                long deadline = 20000000000L + System.nanoTime();
                do {
                    try {
                        fakeJobManager2Gateway = TestingUtils.createForwardingActor(actorSystem, getTestActor(), Option.apply(JOB_MANAGER_NAME));
                    } catch (InvalidActorNameException e) {
                        // wait and retry
                        Thread.sleep(100);
                    }
                } while (fakeJobManager2Gateway == null && System.nanoTime() - deadline < 0);
                final ActorGateway fakeJM2GatewayClosure = fakeJobManager2Gateway;
                // expect the next registration
                new Within(timeout) {

                    @Override
                    protected void run() {
                        expectMsgClass(RegisterTaskManager.class);
                        // we accept the registration
                        tm.tell(new AcknowledgeRegistration(new InstanceID(), 45234), fakeJM2GatewayClosure);
                    }
                };
            } catch (Throwable e) {
                e.printStackTrace();
                fail(e.getMessage());
            } finally {
                stopActor(taskManagerGateway);
                stopActor(fakeJobManager1Gateway);
                stopActor(fakeJobManager2Gateway);
            }
        }
    };
}
Also used : InvalidActorNameException(akka.actor.InvalidActorNameException) AcknowledgeRegistration(org.apache.flink.runtime.messages.RegistrationMessages.AcknowledgeRegistration) InstanceID(org.apache.flink.runtime.instance.InstanceID) ActorGateway(org.apache.flink.runtime.instance.ActorGateway) Terminated(akka.actor.Terminated) JavaTestKit(akka.testkit.JavaTestKit) Test(org.junit.Test)
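
The test above retries actor creation for up to 20 seconds because the old actor's name may not be free yet. The same retry-until-deadline loop can be sketched with the JDK alone. DeadlineRetry and retryUntil are hypothetical names, and IllegalStateException stands in for InvalidActorNameException; none of this is Flink or Akka API. The sketch compares elapsed time rather than absolute System.nanoTime() values, which is the form the nanoTime Javadoc recommends.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Flink or Akka. */
final class DeadlineRetry {

    /**
     * Calls action until it returns a non-null value or timeoutNanos has elapsed.
     * Exceptions of type retryOn mean "not ready yet"; all others propagate.
     * Returns null if the timeout elapses or the thread is interrupted.
     */
    static <T> T retryUntil(Supplier<T> action,
                            Class<? extends RuntimeException> retryOn,
                            long timeoutNanos,
                            long pauseMillis) {
        final long start = System.nanoTime();
        T result = null;
        do {
            try {
                result = action.get();
            } catch (RuntimeException e) {
                if (!retryOn.isInstance(e)) {
                    throw e;
                }
                try {
                    // wait and retry
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            // elapsed-time comparison stays correct if nanoTime wraps around
        } while (result == null && System.nanoTime() - start < timeoutNanos);
        return result;
    }

    public static void main(String[] args) {
        final int[] attempts = {0};
        String name = retryUntil(() -> {
            // simulate a name that only becomes available on the third attempt
            if (++attempts[0] < 3) {
                throw new IllegalStateException("name still taken");
            }
            return "ForwardingJobManager";
        }, IllegalStateException.class, 20_000_000_000L, 100L);
        System.out.println(name + " after " + attempts[0] + " attempts");
    }
}
```

A caller should still check the result for null, since the loop can end by timeout as well as by success.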

Aggregations

Terminated (akka.actor.Terminated): 3
ActorRef (akka.actor.ActorRef): 1
InvalidActorNameException (akka.actor.InvalidActorNameException): 1
Status (akka.actor.Status): 1
JavaTestKit (akka.testkit.JavaTestKit): 1
ActorGateway (org.apache.flink.runtime.instance.ActorGateway): 1
InstanceID (org.apache.flink.runtime.instance.InstanceID): 1
JobStatus (org.apache.flink.runtime.jobgraph.JobStatus): 1
ExecutionGraphMessages (org.apache.flink.runtime.messages.ExecutionGraphMessages): 1
JobManagerActorRef (org.apache.flink.runtime.messages.JobClientMessages.JobManagerActorRef): 1
JobManagerLeaderAddress (org.apache.flink.runtime.messages.JobClientMessages.JobManagerLeaderAddress): 1
AcknowledgeRegistration (org.apache.flink.runtime.messages.RegistrationMessages.AcknowledgeRegistration): 1
Test (org.junit.Test): 1