Example 1 with TriggerRegistrationAtJobManager

Use of org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager in the Apache Flink project.

From class FlinkResourceManager, method triggerConnectingToJobManager.

/**
 * Causes the resource manager to announce itself at the new leader JobManager and
 * obtain its connection information and the currently known TaskManagers.
 *
 * @param leaderAddress The Akka actor URL of the new leader JobManager.
 */
protected void triggerConnectingToJobManager(String leaderAddress) {
    LOG.info("Trying to associate with JobManager leader {}", leaderAddress);
    final Object registerMessage = decorateMessage(new RegisterResourceManager(self()));
    final Object retryMessage = decorateMessage(new TriggerRegistrationAtJobManager(leaderAddress));
    // send the registration message to the JobManager
    ActorSelection jobManagerSel = context().actorSelection(leaderAddress);
    Future<Object> future = Patterns.ask(jobManagerSel, registerMessage, new Timeout(messageTimeout));
    future.onComplete(new OnComplete<Object>() {

        @Override
        public void onComplete(Throwable failure, Object msg) {
            // only process if we haven't been connected in the meantime
            if (jobManager == null) {
                if (msg != null) {
                    if (msg instanceof LeaderSessionMessage && ((LeaderSessionMessage) msg).message() instanceof RegisterResourceManagerSuccessful) {
                        self().tell(msg, ActorRef.noSender());
                    } else {
                        LOG.error("Invalid response type to registration at JobManager: {}", msg);
                        self().tell(retryMessage, ActorRef.noSender());
                    }
                } else {
                    // no success
                    LOG.error("Resource manager could not register at JobManager", failure);
                    self().tell(retryMessage, ActorRef.noSender());
                }
            }
        }
    }, context().dispatcher());
}
Also used: LeaderSessionMessage (org.apache.flink.runtime.messages.JobManagerMessages.LeaderSessionMessage), TriggerRegistrationAtJobManager (org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager), ActorSelection (akka.actor.ActorSelection), Timeout (akka.util.Timeout), RegisterResourceManagerSuccessful (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful), RegisterResourceManager (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManager)
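
The branching inside the onComplete callback above can be looked at in isolation from Akka. The following is only a minimal sketch under stated assumptions: the class name RegistrationDecisionSketch, the Success marker class, the Action enum, and the decide method are all hypothetical and do not exist in Flink, and a plain CompletableFuture stands in for the future returned by Patterns.ask. The LeaderSessionMessage unwrapping from the original is deliberately left out.

```java
import java.util.concurrent.CompletableFuture;

public class RegistrationDecisionSketch {

    /** Hypothetical stand-in for RegisterResourceManagerSuccessful. */
    static final class Success {}

    /** What the callback does with the outcome of the registration request. */
    enum Action { IGNORE, FORWARD, RETRY }

    /**
     * Mirrors the branching of onComplete(failure, msg) in simplified form:
     * ignore the outcome when already connected, forward a successful reply,
     * and retry on an unexpected reply type or on failure (msg == null).
     */
    static Action decide(boolean alreadyConnected, Throwable failure, Object msg) {
        if (alreadyConnected) {
            return Action.IGNORE;
        }
        if (msg instanceof Success) {
            return Action.FORWARD;
        }
        // Either the reply had the wrong type or the request failed.
        return Action.RETRY;
    }

    public static void main(String[] args) {
        // Assumption: this future plays the role of the ask result.
        CompletableFuture<Object> reply = new CompletableFuture<>();
        reply.whenComplete((msg, failure) ->
                System.out.println(decide(false, failure, msg)));
        // Simulate a timed-out registration attempt.
        reply.completeExceptionally(new RuntimeException("timeout"));
    }
}
```

In the original method both retry branches send the same TriggerRegistrationAtJobManager message back to the actor itself, which is why the sketch collapses them into a single RETRY outcome.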

Example 2 with TriggerRegistrationAtJobManager

Use of org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager in the Apache Flink project.

From class ResourceManagerTest, method testTriggerReconnect.

@Test
public void testTriggerReconnect() {
    new JavaTestKit(system) {

        {
            new Within(duration("10 seconds")) {

                @Override
                protected void run() {
                    // set a long lookup timeout so that a lookup timeout makes the test fail
                    Configuration longTimeoutConfig = config.clone();
                    longTimeoutConfig.setString(ConfigConstants.AKKA_LOOKUP_TIMEOUT, "99999 s");
                    fakeJobManager = TestingUtils.createForwardingActor(system, getTestActor(), Option.<String>empty());
                    resourceManager = TestingUtils.createResourceManager(system, fakeJobManager.actor(), longTimeoutConfig);
                    // wait for registration message
                    RegisterResourceManager msg = expectMsgClass(RegisterResourceManager.class);
                    // all went well
                    resourceManager.tell(new RegisterResourceManagerSuccessful(fakeJobManager.actor(), Collections.<ResourceID>emptyList()), fakeJobManager);
                    // force a reconnect
                    resourceManager.tell(new TriggerRegistrationAtJobManager(fakeJobManager.actor()), fakeJobManager);
                    // new registration attempt should come in
                    expectMsgClass(RegisterResourceManager.class);
                }
            };
        }
    };
}
Also used: Configuration (org.apache.flink.configuration.Configuration), TriggerRegistrationAtJobManager (org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager), ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID), RegisterResourceManagerSuccessful (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful), JavaTestKit (akka.testkit.JavaTestKit), RegisterResourceManager (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManager), Test (org.junit.Test)

Example 3 with TriggerRegistrationAtJobManager

Use of org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager in the Apache Flink project.

From class FlinkResourceManager, method handleMessage.

/**
 * This method receives the actor messages after they have been filtered for
 * a match with the leader session.
 *
 * @param message The incoming actor message.
 */
@Override
protected void handleMessage(Object message) {
    try {
        if (message instanceof CheckAndAllocateContainers) {
            checkWorkersPool();
        } else if (message instanceof SetWorkerPoolSize) {
            SetWorkerPoolSize msg = (SetWorkerPoolSize) message;
            adjustDesignatedNumberOfWorkers(msg.numberOfWorkers());
        } else if (message instanceof RemoveResource) {
            RemoveResource msg = (RemoveResource) message;
            removeRegisteredResource(msg.resourceId());
        } else if (message instanceof NotifyResourceStarted) {
            NotifyResourceStarted msg = (NotifyResourceStarted) message;
            handleResourceStarted(sender(), msg.getResourceID());
        } else if (message instanceof NewLeaderAvailable) {
            NewLeaderAvailable msg = (NewLeaderAvailable) message;
            newJobManagerLeaderAvailable(msg.leaderAddress(), msg.leaderSessionId());
        } else if (message instanceof TriggerRegistrationAtJobManager) {
            TriggerRegistrationAtJobManager msg = (TriggerRegistrationAtJobManager) message;
            triggerConnectingToJobManager(msg.jobManagerAddress());
        } else if (message instanceof RegisterResourceManagerSuccessful) {
            RegisterResourceManagerSuccessful msg = (RegisterResourceManagerSuccessful) message;
            jobManagerLeaderConnected(msg.jobManager(), msg.currentlyRegisteredTaskManagers());
        } else if (message instanceof StopCluster) {
            StopCluster msg = (StopCluster) message;
            shutdownCluster(msg.finalStatus(), msg.message());
            sender().tell(decorateMessage(StopClusterSuccessful.getInstance()), ActorRef.noSender());
        } else if (message instanceof RegisterInfoMessageListener) {
            if (jobManager != null) {
                infoMessageListeners.add(sender());
                // answer as the JobManager
                sender().tell(decorateMessage(RegisterInfoMessageListenerSuccessful.get()), jobManager);
            }
        } else if (message instanceof UnRegisterInfoMessageListener) {
            infoMessageListeners.remove(sender());
        } else if (message instanceof FatalErrorOccurred) {
            FatalErrorOccurred fatalErrorOccurred = (FatalErrorOccurred) message;
            fatalError(fatalErrorOccurred.message(), fatalErrorOccurred.error());
        } else {
            // --- unknown messages
            LOG.error("Discarding unknown message: {}", message);
        }
    } catch (Throwable t) {
        // fatal error, needs master recovery
        fatalError("Error processing actor message", t);
    }
}
Also used: CheckAndAllocateContainers (org.apache.flink.runtime.clusterframework.messages.CheckAndAllocateContainers), UnRegisterInfoMessageListener (org.apache.flink.runtime.clusterframework.messages.UnRegisterInfoMessageListener), RegisterInfoMessageListener (org.apache.flink.runtime.clusterframework.messages.RegisterInfoMessageListener), TriggerRegistrationAtJobManager (org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager), NewLeaderAvailable (org.apache.flink.runtime.clusterframework.messages.NewLeaderAvailable), FatalErrorOccurred (org.apache.flink.runtime.clusterframework.messages.FatalErrorOccurred), SetWorkerPoolSize (org.apache.flink.runtime.clusterframework.messages.SetWorkerPoolSize), StopCluster (org.apache.flink.runtime.clusterframework.messages.StopCluster), RegisterResourceManagerSuccessful (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful), RemoveResource (org.apache.flink.runtime.clusterframework.messages.RemoveResource), NotifyResourceStarted (org.apache.flink.runtime.clusterframework.messages.NotifyResourceStarted)
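
The Javadoc of handleMessage says messages arrive only after being filtered for a match with the leader session. How Flink performs that filtering is not shown in this listing, so the following is only a rough sketch of the idea under assumptions: the class LeaderSessionFilterSketch, the SessionEnvelope type, and the filter method are hypothetical names, and comparing session IDs as UUIDs for equality is an assumption rather than a confirmed detail of the Flink implementation.

```java
import java.util.Objects;
import java.util.Optional;
import java.util.UUID;

public class LeaderSessionFilterSketch {

    /** Hypothetical envelope pairing a payload with a leader session ID. */
    static final class SessionEnvelope {
        final UUID leaderSessionId;
        final Object message;

        SessionEnvelope(UUID leaderSessionId, Object message) {
            this.leaderSessionId = leaderSessionId;
            this.message = message;
        }
    }

    /**
     * Unwraps the payload if the envelope carries the expected session ID;
     * otherwise returns empty so the caller can discard the message.
     */
    static Optional<Object> filter(UUID expectedSessionId, SessionEnvelope envelope) {
        if (Objects.equals(expectedSessionId, envelope.leaderSessionId)) {
            return Optional.ofNullable(envelope.message);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        UUID current = UUID.randomUUID();
        UUID stale = UUID.randomUUID();
        // A message from the current session passes the filter.
        System.out.println(filter(current, new SessionEnvelope(current, "ping")).isPresent());
        // A message tagged with an older session is dropped.
        System.out.println(filter(current, new SessionEnvelope(stale, "ping")).isPresent());
    }
}
```

Only payloads that pass such a filter would reach an instanceof dispatch like the one in handleMessage, which keeps messages from a previous leader from being acted on.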

Aggregations

RegisterResourceManagerSuccessful (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManagerSuccessful): 3
TriggerRegistrationAtJobManager (org.apache.flink.runtime.clusterframework.messages.TriggerRegistrationAtJobManager): 3
RegisterResourceManager (org.apache.flink.runtime.clusterframework.messages.RegisterResourceManager): 2
ActorSelection (akka.actor.ActorSelection): 1
JavaTestKit (akka.testkit.JavaTestKit): 1
Timeout (akka.util.Timeout): 1
Configuration (org.apache.flink.configuration.Configuration): 1
CheckAndAllocateContainers (org.apache.flink.runtime.clusterframework.messages.CheckAndAllocateContainers): 1
FatalErrorOccurred (org.apache.flink.runtime.clusterframework.messages.FatalErrorOccurred): 1
NewLeaderAvailable (org.apache.flink.runtime.clusterframework.messages.NewLeaderAvailable): 1
NotifyResourceStarted (org.apache.flink.runtime.clusterframework.messages.NotifyResourceStarted): 1
RegisterInfoMessageListener (org.apache.flink.runtime.clusterframework.messages.RegisterInfoMessageListener): 1
RemoveResource (org.apache.flink.runtime.clusterframework.messages.RemoveResource): 1
SetWorkerPoolSize (org.apache.flink.runtime.clusterframework.messages.SetWorkerPoolSize): 1
StopCluster (org.apache.flink.runtime.clusterframework.messages.StopCluster): 1
UnRegisterInfoMessageListener (org.apache.flink.runtime.clusterframework.messages.UnRegisterInfoMessageListener): 1
ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID): 1
LeaderSessionMessage (org.apache.flink.runtime.messages.JobManagerMessages.LeaderSessionMessage): 1
Test (org.junit.Test): 1