Example 1 with KubernetesPod

Use of org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod in the Apache Flink project.

Class KubernetesResourceManagerDriverTest, method testOnPodAdded.

@Test
public void testOnPodAdded() throws Exception {
    new Context() {

        {
            final CompletableFuture<KubernetesPod> createPodFuture = new CompletableFuture<>();
            final CompletableFuture<KubernetesWorkerNode> requestResourceFuture = new CompletableFuture<>();
            flinkKubeClientBuilder.setCreateTaskManagerPodFunction((pod) -> {
                createPodFuture.complete(pod);
                return FutureUtils.completedVoidFuture();
            });
            runTest(() -> {
                // request new pod
                runInMainThread(() -> getDriver().requestResource(TASK_EXECUTOR_PROCESS_SPEC).thenAccept(requestResourceFuture::complete));
                final KubernetesPod pod = new TestingKubernetesPod(createPodFuture.get(TIMEOUT_SEC, TimeUnit.SECONDS).getName(), true, false);
                // prepare validation:
                // - complete requestResourceFuture in main thread with correct
                // KubernetesWorkerNode
                final CompletableFuture<Void> validationFuture = requestResourceFuture.thenAccept((workerNode) -> {
                    validateInMainThread();
                    assertThat(workerNode.getResourceID().toString(), is(pod.getName()));
                });
                // send onAdded event
                getPodCallbackHandler().onAdded(Collections.singletonList(pod));
                // make sure finishing validation
                validationFuture.get(TIMEOUT_SEC, TimeUnit.SECONDS);
            });
        }
    };
}
Also used : CompletableFuture(java.util.concurrent.CompletableFuture) TestingKubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.TestingKubernetesPod) KubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod) Test(org.junit.Test)

Example 2 with KubernetesPod

Class Fabric8FlinkKubeClientTest, method testStopAndCleanupCluster.

@Test
public void testStopAndCleanupCluster() throws Exception {
    this.flinkKubeClient.createJobManagerComponent(this.kubernetesJobManagerSpecification);
    final KubernetesPod kubernetesPod = new KubernetesPod(new PodBuilder().editOrNewMetadata().withName(TASKMANAGER_POD_NAME).endMetadata().editOrNewSpec().endSpec().build());
    this.flinkKubeClient.createTaskManagerPod(kubernetesPod).get();
    assertEquals(1, this.kubeClient.apps().deployments().inNamespace(NAMESPACE).list().getItems().size());
    assertEquals(1, this.kubeClient.configMaps().inNamespace(NAMESPACE).list().getItems().size());
    assertEquals(2, this.kubeClient.services().inNamespace(NAMESPACE).list().getItems().size());
    assertEquals(1, this.kubeClient.pods().inNamespace(NAMESPACE).list().getItems().size());
    this.flinkKubeClient.stopAndCleanupCluster(CLUSTER_ID);
    assertTrue(this.kubeClient.apps().deployments().inNamespace(NAMESPACE).list().getItems().isEmpty());
}
Also used : PodBuilder(io.fabric8.kubernetes.api.model.PodBuilder) KubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod) Test(org.junit.Test)

Example 3 with KubernetesPod

Class KubernetesResourceManagerDriver, method recoverWorkerNodesFromPreviousAttempts.

// ------------------------------------------------------------------------
// Internal
// ------------------------------------------------------------------------
private void recoverWorkerNodesFromPreviousAttempts() throws ResourceManagerException {
    List<KubernetesPod> podList = flinkKubeClient.getPodsWithLabels(KubernetesUtils.getTaskManagerSelectors(clusterId));
    final List<KubernetesWorkerNode> recoveredWorkers = new ArrayList<>();
    for (KubernetesPod pod : podList) {
        final KubernetesWorkerNode worker = new KubernetesWorkerNode(new ResourceID(pod.getName()));
        final long attempt = worker.getAttempt();
        if (attempt > currentMaxAttemptId) {
            currentMaxAttemptId = attempt;
        }
        if (pod.isTerminated() || !pod.isScheduled()) {
            stopPod(pod.getName());
        } else {
            recoveredWorkers.add(worker);
        }
    }
    log.info("Recovered {} pods from previous attempts, current attempt id is {}.", recoveredWorkers.size(), ++currentMaxAttemptId);
    // Should not invoke resource event handler on the main thread executor.
    // We are in the initializing thread. The main thread executor is not yet ready.
    getResourceEventHandler().onPreviousAttemptWorkersRecovered(recoveredWorkers);
}
Also used : ResourceID(org.apache.flink.runtime.clusterframework.types.ResourceID) ArrayList(java.util.ArrayList) KubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod)
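
The recovery loop in Example 3 can be tried out without Flink or Kubernetes on the classpath. What follows is only a minimal sketch under stated assumptions: PodInfo, RecoveryResult, parseAttempt and recover are hypothetical stand-ins invented for illustration, not Flink APIs. Reading the attempt id from the second-to-last dash-separated field of the pod name is likewise an assumption, modelled on the "-taskmanager-1-1" name that appears in Example 5. The sketch keeps the two decisions visible in the original: track the highest attempt id seen, and stop any pod that is terminated or not yet scheduled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecoverySketch {

    // Hypothetical stand-in for KubernetesPod: only the three properties read in Example 3.
    static final class PodInfo {
        final String name;
        final boolean scheduled;
        final boolean terminated;

        PodInfo(String name, boolean scheduled, boolean terminated) {
            this.name = name;
            this.scheduled = scheduled;
            this.terminated = terminated;
        }
    }

    // Hypothetical result holder: pods kept, pods to stop, and the next attempt id.
    static final class RecoveryResult {
        final List<String> recovered = new ArrayList<>();
        final List<String> stopped = new ArrayList<>();
        long nextAttemptId;
    }

    // ASSUMPTION: pod names end in "-<attempt>-<index>"; this is not taken from Flink's parser.
    static long parseAttempt(String podName) {
        String[] parts = podName.split("-");
        return Long.parseLong(parts[parts.length - 2]);
    }

    static RecoveryResult recover(List<PodInfo> pods, long currentMaxAttemptId) {
        RecoveryResult result = new RecoveryResult();
        long maxAttempt = currentMaxAttemptId;
        for (PodInfo pod : pods) {
            // Every pod counts towards the max attempt id, even one that gets stopped.
            maxAttempt = Math.max(maxAttempt, parseAttempt(pod.name));
            if (pod.terminated || !pod.scheduled) {
                result.stopped.add(pod.name);
            } else {
                result.recovered.add(pod.name);
            }
        }
        // Mirrors the pre-increment in the original log statement: the new attempt is max + 1.
        result.nextAttemptId = maxAttempt + 1;
        return result;
    }

    public static void main(String[] args) {
        List<PodInfo> pods = Arrays.asList(
                new PodInfo("demo-taskmanager-1-1", true, true),
                new PodInfo("demo-taskmanager-2-1", true, false));
        RecoveryResult r = recover(pods, 0L);
        System.out.println("recovered=" + r.recovered
                + " stopped=" + r.stopped
                + " nextAttemptId=" + r.nextAttemptId);
    }
}
```

Returning the stop list instead of calling a client keeps the sketch free of side effects; the original calls stopPod directly inside the loop.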

Example 4 with KubernetesPod

Class KubernetesResourceManagerDriver, method requestResource.

@Override
public CompletableFuture<KubernetesWorkerNode> requestResource(TaskExecutorProcessSpec taskExecutorProcessSpec) {
    final KubernetesTaskManagerParameters parameters = createKubernetesTaskManagerParameters(taskExecutorProcessSpec);
    final KubernetesPod taskManagerPod = KubernetesTaskManagerFactory.buildTaskManagerKubernetesPod(taskManagerPodTemplate, parameters);
    final String podName = taskManagerPod.getName();
    final CompletableFuture<KubernetesWorkerNode> requestResourceFuture = new CompletableFuture<>();
    requestResourceFutures.put(podName, requestResourceFuture);
    log.info("Creating new TaskManager pod with name {} and resource <{},{}>.", podName, parameters.getTaskManagerMemoryMB(), parameters.getTaskManagerCPU());
    final CompletableFuture<Void> createPodFuture = flinkKubeClient.createTaskManagerPod(taskManagerPod);
    FutureUtils.assertNoException(createPodFuture.handleAsync((ignore, exception) -> {
        if (exception != null) {
            log.warn("Could not create pod {}, exception: {}", podName, exception);
            CompletableFuture<KubernetesWorkerNode> future = requestResourceFutures.remove(taskManagerPod.getName());
            if (future != null) {
                future.completeExceptionally(exception);
            }
        } else {
            log.info("Pod {} is created.", podName);
        }
        return null;
    }, getMainThreadExecutor()));
    return requestResourceFuture;
}
Also used : TaskExecutorProcessSpec(org.apache.flink.runtime.clusterframework.TaskExecutorProcessSpec) FlinkException(org.apache.flink.util.FlinkException) ResourceManagerUtils(org.apache.flink.runtime.util.ResourceManagerUtils) ExternalResourceUtils(org.apache.flink.runtime.externalresource.ExternalResourceUtils) ExceptionUtils(org.apache.flink.util.ExceptionUtils) HashMap(java.util.HashMap) CompletableFuture(java.util.concurrent.CompletableFuture) KubernetesService(org.apache.flink.kubernetes.kubeclient.resources.KubernetesService) KubernetesWatch(org.apache.flink.kubernetes.kubeclient.resources.KubernetesWatch) KubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod) ArrayList(java.util.ArrayList) TaskManagerOptions(org.apache.flink.configuration.TaskManagerOptions) ProcessMemoryUtils(org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils) FutureUtils(org.apache.flink.util.concurrent.FutureUtils) ResourceManagerException(org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException) Map(java.util.Map) KubernetesTaskManagerParameters(org.apache.flink.kubernetes.kubeclient.parameters.KubernetesTaskManagerParameters) ContaineredTaskManagerParameters(org.apache.flink.runtime.clusterframework.ContaineredTaskManagerParameters) KubernetesResourceManagerDriverConfiguration(org.apache.flink.kubernetes.configuration.KubernetesResourceManagerDriverConfiguration) ResourceID(org.apache.flink.runtime.clusterframework.types.ResourceID) HighAvailabilityMode(org.apache.flink.runtime.jobmanager.HighAvailabilityMode) Nullable(javax.annotation.Nullable) BlobServerOptions(org.apache.flink.configuration.BlobServerOptions) KubernetesUtils(org.apache.flink.kubernetes.utils.KubernetesUtils) AbstractResourceManagerDriver(org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver) KubernetesConfigOptions(org.apache.flink.kubernetes.configuration.KubernetesConfigOptions) 
ResourceManagerDriver(org.apache.flink.runtime.resourcemanager.active.ResourceManagerDriver) ApplicationStatus(org.apache.flink.runtime.clusterframework.ApplicationStatus) Configuration(org.apache.flink.configuration.Configuration) JobManagerOptions(org.apache.flink.configuration.JobManagerOptions) FlinkPod(org.apache.flink.kubernetes.kubeclient.FlinkPod) KubernetesTaskManagerFactory(org.apache.flink.kubernetes.kubeclient.factory.KubernetesTaskManagerFactory) Preconditions(org.apache.flink.util.Preconditions) BootstrapTools(org.apache.flink.runtime.clusterframework.BootstrapTools) File(java.io.File) List(java.util.List) GlobalConfiguration(org.apache.flink.configuration.GlobalConfiguration) KubernetesTooOldResourceVersionException(org.apache.flink.kubernetes.kubeclient.resources.KubernetesTooOldResourceVersionException) Optional(java.util.Optional) Constants(org.apache.flink.kubernetes.utils.Constants) FlinkKubeClient(org.apache.flink.kubernetes.kubeclient.FlinkKubeClient)
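
The future bookkeeping in Example 4, together with the onAdded callback exercised in Example 1, follows a common pattern: register a pending future under the pod name, fail it if pod creation fails, and complete it once the pod is reported as added. The class below is a hypothetical, single-threaded sketch of that pattern using only java.util.concurrent. PendingRequestsSketch, request, onPodAdded and pendingCount are names made up for illustration, and the result type is simplified to the pod name rather than a KubernetesWorkerNode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

final class PendingRequestsSketch {

    // Pending requests keyed by pod name. A plain HashMap is enough here because the
    // sketch is single-threaded; the original confines access to its main thread executor.
    private final Map<String, CompletableFuture<String>> pending = new HashMap<>();

    // Registers a pending request and ties its failure to the pod-creation future.
    public CompletableFuture<String> request(String podName, CompletableFuture<Void> createPodFuture) {
        CompletableFuture<String> result = new CompletableFuture<>();
        pending.put(podName, result);
        createPodFuture.whenComplete((ignored, error) -> {
            if (error != null) {
                // Creation failed: drop the registration and propagate the error.
                CompletableFuture<String> future = pending.remove(podName);
                if (future != null) {
                    future.completeExceptionally(error);
                }
            }
            // On success nothing is completed yet; the request waits for onPodAdded.
        });
        return result;
    }

    // Called when the pod is observed; completes the matching request, if any.
    public void onPodAdded(String podName) {
        CompletableFuture<String> future = pending.remove(podName);
        if (future != null) {
            future.complete(podName);
        }
    }

    public int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingRequestsSketch sketch = new PendingRequestsSketch();
        CompletableFuture<Void> created = new CompletableFuture<>();
        CompletableFuture<String> worker = sketch.request("demo-taskmanager-1-1", created);
        created.complete(null);                    // pod creation succeeded
        sketch.onPodAdded("demo-taskmanager-1-1"); // pod reported as added
        System.out.println("done=" + worker.isDone() + " pending=" + sketch.pendingCount());
    }
}
```

Successful creation deliberately does not complete the request, matching the original, where the success branch only logs and the returned future stays pending until a later event.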

Example 5 with KubernetesPod

Class KubernetesResourceManagerDriverTest, method testRecoverPreviousAttemptWorkersPodTerminated.

@Test
public void testRecoverPreviousAttemptWorkersPodTerminated() throws Exception {
    new Context() {

        {
            final KubernetesPod previousAttemptPod = new TestingKubernetesPod(CLUSTER_ID + "-taskmanager-1-1", true, true);
            final CompletableFuture<String> stopPodFuture = new CompletableFuture<>();
            final CompletableFuture<Collection<KubernetesWorkerNode>> recoveredWorkersFuture = new CompletableFuture<>();
            flinkKubeClientBuilder.setGetPodsWithLabelsFunction((ignore) -> Collections.singletonList(previousAttemptPod)).setStopPodFunction((podName) -> {
                stopPodFuture.complete(podName);
                return FutureUtils.completedVoidFuture();
            });
            resourceEventHandlerBuilder.setOnPreviousAttemptWorkersRecoveredConsumer(recoveredWorkersFuture::complete);
            runTest(() -> {
                // validate the terminated pod from previous attempt is not recovered
                // and is removed
                assertThat(recoveredWorkersFuture.get(TIMEOUT_SEC, TimeUnit.SECONDS), empty());
                assertThat(stopPodFuture.get(TIMEOUT_SEC, TimeUnit.SECONDS), is(previousAttemptPod.getName()));
            });
        }
    };
}
Also used : TaskExecutorProcessSpec(org.apache.flink.runtime.clusterframework.TaskExecutorProcessSpec) Deadline(org.apache.flink.api.common.time.Deadline) TestingKubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.TestingKubernetesPod) CompletableFuture(java.util.concurrent.CompletableFuture) KubernetesPod(org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod) ArrayList(java.util.ArrayList) TaskManagerOptions(org.apache.flink.configuration.TaskManagerOptions) ResourceRequirements(io.fabric8.kubernetes.api.model.ResourceRequirements) FutureUtils(org.apache.flink.util.concurrent.FutureUtils) Duration(java.time.Duration) Assert.fail(org.junit.Assert.fail) KubernetesResourceManagerDriverConfiguration(org.apache.flink.kubernetes.configuration.KubernetesResourceManagerDriverConfiguration) MatcherAssert.assertThat(org.hamcrest.MatcherAssert.assertThat) ResourceID(org.apache.flink.runtime.clusterframework.types.ResourceID) TestingFlinkKubeClient(org.apache.flink.kubernetes.kubeclient.TestingFlinkKubeClient) Matchers.empty(org.hamcrest.Matchers.empty) KubernetesConfigOptions(org.apache.flink.kubernetes.configuration.KubernetesConfigOptions) ResourceManagerDriver(org.apache.flink.runtime.resourcemanager.active.ResourceManagerDriver) ExpectedTestException(org.apache.flink.runtime.operators.testutils.ExpectedTestException) Assert.assertNotNull(org.junit.Assert.assertNotNull) Collection(java.util.Collection) Test(org.junit.Test) TimeUnit(java.util.concurrent.TimeUnit) Consumer(java.util.function.Consumer) List(java.util.List) WatchCallbackHandler(org.apache.flink.kubernetes.kubeclient.FlinkKubeClient.WatchCallbackHandler) KubernetesTooOldResourceVersionException(org.apache.flink.kubernetes.kubeclient.resources.KubernetesTooOldResourceVersionException) ResourceManagerDriverTestBase(org.apache.flink.runtime.resourcemanager.active.ResourceManagerDriverTestBase) Matchers.is(org.hamcrest.Matchers.is) 
CommonTestUtils(org.apache.flink.runtime.testutils.CommonTestUtils) Constants(org.apache.flink.kubernetes.utils.Constants) Collections(java.util.Collections) FlinkKubeClient(org.apache.flink.kubernetes.kubeclient.FlinkKubeClient)

Aggregations

KubernetesPod (org.apache.flink.kubernetes.kubeclient.resources.KubernetesPod): 8
ArrayList (java.util.ArrayList): 4
Test (org.junit.Test): 4
PodBuilder (io.fabric8.kubernetes.api.model.PodBuilder): 3
CompletableFuture (java.util.concurrent.CompletableFuture): 3
FlinkPod (org.apache.flink.kubernetes.kubeclient.FlinkPod): 3
ResourceID (org.apache.flink.runtime.clusterframework.types.ResourceID): 3
Pod (io.fabric8.kubernetes.api.model.Pod): 2
List (java.util.List): 2
TaskManagerOptions (org.apache.flink.configuration.TaskManagerOptions): 2
KubernetesConfigOptions (org.apache.flink.kubernetes.configuration.KubernetesConfigOptions): 2
KubernetesResourceManagerDriverConfiguration (org.apache.flink.kubernetes.configuration.KubernetesResourceManagerDriverConfiguration): 2
FlinkKubeClient (org.apache.flink.kubernetes.kubeclient.FlinkKubeClient): 2
KubernetesTooOldResourceVersionException (org.apache.flink.kubernetes.kubeclient.resources.KubernetesTooOldResourceVersionException): 2
TestingKubernetesPod (org.apache.flink.kubernetes.kubeclient.resources.TestingKubernetesPod): 2
Constants (org.apache.flink.kubernetes.utils.Constants): 2
TaskExecutorProcessSpec (org.apache.flink.runtime.clusterframework.TaskExecutorProcessSpec): 2
ResourceManagerDriver (org.apache.flink.runtime.resourcemanager.active.ResourceManagerDriver): 2
Container (io.fabric8.kubernetes.api.model.Container): 1
ContainerBuilder (io.fabric8.kubernetes.api.model.ContainerBuilder): 1