
Example 1 with RelocationConfiguration

Use of com.netflix.titus.supplementary.relocation.RelocationConfiguration in the titus-control-plane project by Netflix.

Class DefaultNodeConditionControllerTest, method checkTasksTerminatedDueToBadNodeConditions.

@Test
public void checkTasksTerminatedDueToBadNodeConditions() {
    // Mock jobs, tasks & nodes
    Map<String, TitusNode> nodeMap = buildNodes();
    List<Job<BatchJobExt>> jobs = getJobs(true);
    Map<String, List<Task>> tasksByJobIdMap = buildTasksForJobAndNodeAssignment(new ArrayList<>(nodeMap.values()), jobs);
    TitusRuntime titusRuntime = mock(TitusRuntime.class);
    when(titusRuntime.getRegistry()).thenReturn(new DefaultRegistry());
    RelocationConfiguration configuration = mock(RelocationConfiguration.class);
    when(configuration.getBadNodeConditionPattern()).thenReturn(".*Failure");
    when(configuration.isTaskTerminationOnBadNodeConditionEnabled()).thenReturn(true);
    NodeDataResolver nodeDataResolver = mock(NodeDataResolver.class);
    when(nodeDataResolver.resolve()).thenReturn(nodeMap);
    JobDataReplicator jobDataReplicator = mock(JobDataReplicator.class);
    when(jobDataReplicator.getStalenessMs()).thenReturn(0L);
    ReadOnlyJobOperations readOnlyJobOperations = mock(ReadOnlyJobOperations.class);
    when(readOnlyJobOperations.getJobs()).thenReturn(new ArrayList<>(jobs));
    tasksByJobIdMap.forEach((key, value) -> when(readOnlyJobOperations.getTasks(key)).thenReturn(value));
    JobManagementClient jobManagementClient = mock(JobManagementClient.class);
    Set<String> terminatedTaskIds = new HashSet<>();
    when(jobManagementClient.killTask(anyString(), anyBoolean(), any())).thenAnswer(invocation -> {
        String taskIdToBeTerminated = invocation.getArgument(0);
        terminatedTaskIds.add(taskIdToBeTerminated);
        return Mono.empty();
    });
    DefaultNodeConditionController nodeConditionCtrl = new DefaultNodeConditionController(configuration, nodeDataResolver, jobDataReplicator, readOnlyJobOperations, jobManagementClient, titusRuntime);
    ExecutionContext executionContext = ExecutionContext.newBuilder().withIteration(ExecutionId.initial()).build();
    StepVerifier.create(nodeConditionCtrl.handleNodesWithBadCondition(executionContext)).verifyComplete();
    // Exactly two tasks are expected to be terminated
    assertThat(terminatedTaskIds).hasSize(2);
    verifyTerminatedTasksOnBadNodes(terminatedTaskIds, tasksByJobIdMap, nodeMap);
}
Also used : JobDataReplicator(com.netflix.titus.runtime.connector.jobmanager.JobDataReplicator) ReadOnlyJobOperations(com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations) JobManagementClient(com.netflix.titus.runtime.connector.jobmanager.JobManagementClient) NodeDataResolver(com.netflix.titus.supplementary.relocation.connector.NodeDataResolver) ArgumentMatchers.anyString(org.mockito.ArgumentMatchers.anyString) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) ExecutionContext(com.netflix.titus.common.framework.scheduler.ExecutionContext) DefaultRegistry(com.netflix.spectator.api.DefaultRegistry) ArrayList(java.util.ArrayList) List(java.util.List) TitusNode(com.netflix.titus.supplementary.relocation.connector.TitusNode) Job(com.netflix.titus.api.jobmanager.model.job.Job) RelocationConfiguration(com.netflix.titus.supplementary.relocation.RelocationConfiguration) HashSet(java.util.HashSet) Test(org.junit.Test)
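
The test above configures the bad node condition pattern as ".*Failure". As a minimal sketch of how such a pattern might select node conditions, the class below applies it with java.util.regex. The class name BadNodeConditionMatcher, the sample condition names, and the choice of a full match (matches() rather than find()) are assumptions made for illustration; the snippet does not show how DefaultNodeConditionController applies the pattern.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of titus-control-plane.
class BadNodeConditionMatcher {
    private final Pattern badConditionPattern;

    BadNodeConditionMatcher(String regex) {
        this.badConditionPattern = Pattern.compile(regex);
    }

    // Assumption: the whole condition name must match the pattern.
    boolean isBad(String conditionName) {
        return badConditionPattern.matcher(conditionName).matches();
    }

    // Keeps only the condition names that match the pattern, preserving order.
    List<String> badConditions(List<String> conditionNames) {
        return conditionNames.stream()
                .filter(this::isBad)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        BadNodeConditionMatcher matcher = new BadNodeConditionMatcher(".*Failure");
        // Made-up condition names, for illustration only.
        List<String> conditions = List.of("DiskFailure", "Ready", "NetworkFailure", "FailureRecovered");
        System.out.println(matcher.badConditions(conditions)); // prints [DiskFailure, NetworkFailure]
    }
}
```

With a full match, a name such as "FailureRecovered" is not selected because it does not end in "Failure". Under find() semantics it would be selected.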

Example 2 with RelocationConfiguration

Use of com.netflix.titus.supplementary.relocation.RelocationConfiguration in the titus-control-plane project by Netflix.

Class DefaultNodeConditionControllerTest, method noTerminationsOnDataStaleness.

@Test
public void noTerminationsOnDataStaleness() {
    TitusRuntime titusRuntime = mock(TitusRuntime.class);
    when(titusRuntime.getRegistry()).thenReturn(new DefaultRegistry());
    RelocationConfiguration configuration = mock(RelocationConfiguration.class);
    when(configuration.getBadNodeConditionPattern()).thenReturn(".*Problem");
    when(configuration.isTaskTerminationOnBadNodeConditionEnabled()).thenReturn(true);
    when(configuration.getDataStalenessThresholdMs()).thenReturn(8000L);
    NodeDataResolver nodeDataResolver = mock(NodeDataResolver.class);
    when(nodeDataResolver.getStalenessMs()).thenReturn(5L);
    JobDataReplicator jobDataReplicator = mock(JobDataReplicator.class);
    when(jobDataReplicator.getStalenessMs()).thenReturn(10L);
    ReadOnlyJobOperations readOnlyJobOperations = mock(ReadOnlyJobOperations.class);
    JobManagementClient jobManagementClient = mock(JobManagementClient.class);
    Set<String> terminatedTaskIds = new HashSet<>();
    when(jobManagementClient.killTask(anyString(), anyBoolean(), any())).thenAnswer(invocation -> {
        String taskIdToBeTerminated = invocation.getArgument(0);
        terminatedTaskIds.add(taskIdToBeTerminated);
        return Mono.empty();
    });
    DefaultNodeConditionController nodeConditionCtrl = new DefaultNodeConditionController(configuration, nodeDataResolver, jobDataReplicator, readOnlyJobOperations, jobManagementClient, titusRuntime);
    ExecutionContext executionContext = ExecutionContext.newBuilder().withIteration(ExecutionId.initial()).build();
    StepVerifier.create(nodeConditionCtrl.handleNodesWithBadCondition(executionContext)).verifyComplete();
    // No tasks terminated
    assertThat(terminatedTaskIds).isEmpty();
}
Also used : JobDataReplicator(com.netflix.titus.runtime.connector.jobmanager.JobDataReplicator) ReadOnlyJobOperations(com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations) ExecutionContext(com.netflix.titus.common.framework.scheduler.ExecutionContext) DefaultRegistry(com.netflix.spectator.api.DefaultRegistry) JobManagementClient(com.netflix.titus.runtime.connector.jobmanager.JobManagementClient) NodeDataResolver(com.netflix.titus.supplementary.relocation.connector.NodeDataResolver) ArgumentMatchers.anyString(org.mockito.ArgumentMatchers.anyString) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) RelocationConfiguration(com.netflix.titus.supplementary.relocation.RelocationConfiguration) HashSet(java.util.HashSet) Test(org.junit.Test)
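
The test above stubs a staleness threshold (getDataStalenessThresholdMs) together with staleness readings from the node data resolver and the job data replicator. A minimal sketch of one way such a guard could work is shown below. The class StalenessGuard and the rule "stale when the larger of the two readings exceeds the threshold" are assumptions for illustration only; the snippet does not show the comparison DefaultNodeConditionController actually performs.

```java
// Hypothetical guard, not part of titus-control-plane.
class StalenessGuard {
    private final long thresholdMs;

    StalenessGuard(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    // Assumption: data is stale when the worst (largest) staleness exceeds the threshold.
    boolean isStale(long nodeDataStalenessMs, long jobDataStalenessMs) {
        return Math.max(nodeDataStalenessMs, jobDataStalenessMs) > thresholdMs;
    }

    public static void main(String[] args) {
        // Values stubbed in the test above: threshold 8000 ms, readings 5 ms and 10 ms.
        System.out.println(new StalenessGuard(8000L).isStale(5L, 10L)); // prints false
        // A lower, made-up threshold for contrast.
        System.out.println(new StalenessGuard(8L).isStale(5L, 10L));    // prints true
    }
}
```

Under this assumed rule, the readings stubbed in the test (5 ms and 10 ms) stay below the 8000 ms threshold. The empty terminatedTaskIds set would then follow from the unstubbed mocks returning no nodes or jobs rather than from a staleness check.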

Example 3 with RelocationConfiguration

Use of com.netflix.titus.supplementary.relocation.RelocationConfiguration in the titus-control-plane project by Netflix.

Class DefaultNodeConditionControllerTest, method badNodeConditionsIgnoredForJobsNotOptingIn.

@Test
public void badNodeConditionsIgnoredForJobsNotOptingIn() {
    Map<String, TitusNode> nodeMap = buildNodes();
    // Jobs with attribute "terminateContainerOnBadAgent" = False, i.e. not opted in
    List<Job<BatchJobExt>> jobs = getJobs(false);
    Map<String, List<Task>> tasksByJobIdMap = buildTasksForJobAndNodeAssignment(new ArrayList<>(nodeMap.values()), jobs);
    TitusRuntime titusRuntime = mock(TitusRuntime.class);
    when(titusRuntime.getRegistry()).thenReturn(new DefaultRegistry());
    RelocationConfiguration configuration = mock(RelocationConfiguration.class);
    when(configuration.getBadNodeConditionPattern()).thenReturn(".*Failure");
    when(configuration.isTaskTerminationOnBadNodeConditionEnabled()).thenReturn(true);
    NodeDataResolver nodeDataResolver = mock(NodeDataResolver.class);
    when(nodeDataResolver.resolve()).thenReturn(nodeMap);
    JobDataReplicator jobDataReplicator = mock(JobDataReplicator.class);
    when(jobDataReplicator.getStalenessMs()).thenReturn(0L);
    ReadOnlyJobOperations readOnlyJobOperations = mock(ReadOnlyJobOperations.class);
    when(readOnlyJobOperations.getJobs()).thenReturn(new ArrayList<>(jobs));
    tasksByJobIdMap.forEach((key, value) -> when(readOnlyJobOperations.getTasks(key)).thenReturn(value));
    JobManagementClient jobManagementClient = mock(JobManagementClient.class);
    Set<String> terminatedTaskIds = new HashSet<>();
    when(jobManagementClient.killTask(anyString(), anyBoolean(), any())).thenAnswer(invocation -> {
        String taskIdToBeTerminated = invocation.getArgument(0);
        terminatedTaskIds.add(taskIdToBeTerminated);
        return Mono.empty();
    });
    DefaultNodeConditionController nodeConditionController = new DefaultNodeConditionController(configuration, nodeDataResolver, jobDataReplicator, readOnlyJobOperations, jobManagementClient, titusRuntime);
    ExecutionContext executionContext = ExecutionContext.newBuilder().withIteration(ExecutionId.initial()).build();
    StepVerifier.create(nodeConditionController.handleNodesWithBadCondition(executionContext)).verifyComplete();
    // No tasks should be terminated for jobs that have not opted in
    assertThat(terminatedTaskIds).isEmpty();
}
Also used : JobDataReplicator(com.netflix.titus.runtime.connector.jobmanager.JobDataReplicator) ReadOnlyJobOperations(com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations) JobManagementClient(com.netflix.titus.runtime.connector.jobmanager.JobManagementClient) NodeDataResolver(com.netflix.titus.supplementary.relocation.connector.NodeDataResolver) ArgumentMatchers.anyString(org.mockito.ArgumentMatchers.anyString) TitusRuntime(com.netflix.titus.common.runtime.TitusRuntime) ExecutionContext(com.netflix.titus.common.framework.scheduler.ExecutionContext) DefaultRegistry(com.netflix.spectator.api.DefaultRegistry) ArrayList(java.util.ArrayList) List(java.util.List) TitusNode(com.netflix.titus.supplementary.relocation.connector.TitusNode) Job(com.netflix.titus.api.jobmanager.model.job.Job) RelocationConfiguration(com.netflix.titus.supplementary.relocation.RelocationConfiguration) HashSet(java.util.HashSet) Test(org.junit.Test)
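
The comment in the test above refers to a job attribute named "terminateContainerOnBadAgent". As a rough sketch of an opt-in check driven by such an attribute, the class below reads it from a plain string map. The class OptInCheck, the use of a Map<String, String> for job attributes, and the default of "not opted in" when the attribute is absent are all assumptions for illustration; the snippet does not show how Titus stores or evaluates this attribute.

```java
import java.util.Map;

// Hypothetical helper, not part of titus-control-plane.
class OptInCheck {
    // Attribute name taken from the comment in the test above.
    static final String ATTRIBUTE = "terminateContainerOnBadAgent";

    // Assumption: a missing attribute means the job has not opted in.
    static boolean optedIn(Map<String, String> jobAttributes) {
        // Boolean.parseBoolean is case-insensitive: "True" and "true" both yield true.
        return Boolean.parseBoolean(jobAttributes.getOrDefault(ATTRIBUTE, "false"));
    }

    public static void main(String[] args) {
        System.out.println(optedIn(Map.of(ATTRIBUTE, "True")));  // prints true
        System.out.println(optedIn(Map.of(ATTRIBUTE, "False"))); // prints false
        System.out.println(optedIn(Map.of()));                   // prints false
    }
}
```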

Aggregations

DefaultRegistry (com.netflix.spectator.api.DefaultRegistry): 3
ReadOnlyJobOperations (com.netflix.titus.api.jobmanager.service.ReadOnlyJobOperations): 3
ExecutionContext (com.netflix.titus.common.framework.scheduler.ExecutionContext): 3
TitusRuntime (com.netflix.titus.common.runtime.TitusRuntime): 3
JobDataReplicator (com.netflix.titus.runtime.connector.jobmanager.JobDataReplicator): 3
JobManagementClient (com.netflix.titus.runtime.connector.jobmanager.JobManagementClient): 3
RelocationConfiguration (com.netflix.titus.supplementary.relocation.RelocationConfiguration): 3
NodeDataResolver (com.netflix.titus.supplementary.relocation.connector.NodeDataResolver): 3
HashSet (java.util.HashSet): 3
Test (org.junit.Test): 3
ArgumentMatchers.anyString (org.mockito.ArgumentMatchers.anyString): 3
Job (com.netflix.titus.api.jobmanager.model.job.Job): 2
TitusNode (com.netflix.titus.supplementary.relocation.connector.TitusNode): 2
ArrayList (java.util.ArrayList): 2
List (java.util.List): 2