Example 16 with ExecutionVertex

Use of org.apache.flink.runtime.executiongraph.ExecutionVertex in project flink by apache.

From the class CheckpointCoordinatorTest, method testExternalizedCheckpoints.

/**
	 * Tests that the externalized checkpoint configuration is respected.
	 */
@Test
public void testExternalizedCheckpoints() throws Exception {
    try {
        final JobID jid = new JobID();
        final long timestamp = System.currentTimeMillis();
        // create some mock Execution vertices that receive the checkpoint trigger messages
        final ExecutionAttemptID attemptID1 = new ExecutionAttemptID();
        ExecutionVertex vertex1 = mockExecutionVertex(attemptID1);
        // set up the coordinator and validate the initial state
        CheckpointCoordinator coord = new CheckpointCoordinator(
                jid,
                600000,
                600000,
                0,
                Integer.MAX_VALUE,
                ExternalizedCheckpointSettings.externalizeCheckpoints(true),
                new ExecutionVertex[] { vertex1 },
                new ExecutionVertex[] { vertex1 },
                new ExecutionVertex[] { vertex1 },
                new StandaloneCheckpointIDCounter(),
                new StandaloneCompletedCheckpointStore(1),
                "fake-directory",
                Executors.directExecutor());
        assertTrue(coord.triggerCheckpoint(timestamp, false));
        for (PendingCheckpoint checkpoint : coord.getPendingCheckpoints().values()) {
            CheckpointProperties props = checkpoint.getProps();
            CheckpointProperties expected = CheckpointProperties.forExternalizedCheckpoint(true);
            assertEquals(expected, props);
        }
        // shut down the coordinator
        coord.shutdown(JobStatus.FINISHED);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) JobID(org.apache.flink.api.common.JobID) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) IOException(java.io.IOException) Test(org.junit.Test)
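
The single-argument mockExecutionVertex helper used in this and several of the following examples is not included in this listing. The sketch below is a hypothetical illustration of what such a helper could look like, assuming Mockito and the standard Execution / ExecutionVertex accessors; it is not the actual CheckpointCoordinatorTest code.

// Hypothetical helper (illustration only): wires a mocked Execution into a
// mocked ExecutionVertex so the coordinator has an execution attempt to
// trigger checkpoints on. The real test helper may stub additional methods.
private static ExecutionVertex mockExecutionVertex(ExecutionAttemptID attemptID) {
    final Execution execution = mock(Execution.class);
    when(execution.getAttemptId()).thenReturn(attemptID);
    when(execution.getState()).thenReturn(ExecutionState.RUNNING);
    final ExecutionVertex vertex = mock(ExecutionVertex.class);
    when(vertex.getCurrentExecutionAttempt()).thenReturn(execution);
    return vertex;
}

Returning a mock from getCurrentExecutionAttempt() is what lets Example 19 below stub triggerCheckpoint(...) on the Execution it obtains from the vertex.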

Example 17 with ExecutionVertex

Use of org.apache.flink.runtime.executiongraph.ExecutionVertex in project flink by apache.

From the class CheckpointCoordinatorTest, method mockExecutionJobVertex.

static ExecutionJobVertex mockExecutionJobVertex(JobVertexID jobVertexID, int parallelism, int maxParallelism) {
    final ExecutionJobVertex executionJobVertex = mock(ExecutionJobVertex.class);
    ExecutionVertex[] executionVertices = new ExecutionVertex[parallelism];
    for (int i = 0; i < parallelism; i++) {
        executionVertices[i] = mockExecutionVertex(new ExecutionAttemptID(), jobVertexID, parallelism, maxParallelism, ExecutionState.RUNNING);
        when(executionVertices[i].getParallelSubtaskIndex()).thenReturn(i);
    }
    when(executionJobVertex.getJobVertexId()).thenReturn(jobVertexID);
    when(executionJobVertex.getTaskVertices()).thenReturn(executionVertices);
    when(executionJobVertex.getParallelism()).thenReturn(parallelism);
    when(executionJobVertex.getMaxParallelism()).thenReturn(maxParallelism);
    when(executionJobVertex.isMaxParallelismConfigured()).thenReturn(true);
    return executionJobVertex;
}
Also used : ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) ExecutionJobVertex(org.apache.flink.runtime.executiongraph.ExecutionJobVertex) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) AcknowledgeCheckpoint(org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) DeclineCheckpoint(org.apache.flink.runtime.messages.checkpoint.DeclineCheckpoint)
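
For a concrete call site, testRestoreLatestCheckpointedState (Example 20 below) builds its job vertices with this helper. Using the same values as that test, a call looks like this:

// Mocked job vertex with 3 RUNNING subtasks and a max parallelism of 42,
// matching jobVertex1 in testRestoreLatestCheckpointedState below.
ExecutionJobVertex jobVertex1 = mockExecutionJobVertex(new JobVertexID(), 3, 42);
// one mocked ExecutionVertex per subtask index 0..2
ExecutionVertex[] subtasks = jobVertex1.getTaskVertices();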

Example 18 with ExecutionVertex

Use of org.apache.flink.runtime.executiongraph.ExecutionVertex in project flink by apache.

From the class CheckpointCoordinatorTest, method testCheckpointStatsTrackerPendingCheckpointCallback.

/**
	 * Tests that the pending checkpoint stats callbacks are created.
	 */
@Test
public void testCheckpointStatsTrackerPendingCheckpointCallback() {
    final long timestamp = System.currentTimeMillis();
    ExecutionVertex vertex1 = mockExecutionVertex(new ExecutionAttemptID());
    // set up the coordinator and validate the initial state
    CheckpointCoordinator coord = new CheckpointCoordinator(
            new JobID(),
            600000,
            600000,
            0,
            Integer.MAX_VALUE,
            ExternalizedCheckpointSettings.none(),
            new ExecutionVertex[] { vertex1 },
            new ExecutionVertex[] { vertex1 },
            new ExecutionVertex[] { vertex1 },
            new StandaloneCheckpointIDCounter(),
            new StandaloneCompletedCheckpointStore(1),
            null,
            Executors.directExecutor());
    CheckpointStatsTracker tracker = mock(CheckpointStatsTracker.class);
    coord.setCheckpointStatsTracker(tracker);
    when(tracker.reportPendingCheckpoint(anyLong(), anyLong(), any(CheckpointProperties.class))).thenReturn(mock(PendingCheckpointStats.class));
    // Trigger a checkpoint and verify callback
    assertTrue(coord.triggerCheckpoint(timestamp, false));
    verify(tracker, times(1)).reportPendingCheckpoint(eq(1L), eq(timestamp), eq(CheckpointProperties.forStandardCheckpoint()));
}
Also used : ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)

Example 19 with ExecutionVertex

Use of org.apache.flink.runtime.executiongraph.ExecutionVertex in project flink by apache.

From the class CheckpointCoordinatorTest, method testMinTimeBetweenCheckpointsInterval.

/**
	 * This test verifies that a certain minimum time passes after a completed
	 * checkpoint before the next one is triggered.
	 */
@Test
public void testMinTimeBetweenCheckpointsInterval() throws Exception {
    final JobID jid = new JobID();
    // create some mock execution vertices and trigger some checkpoint
    final ExecutionAttemptID attemptID = new ExecutionAttemptID();
    final ExecutionVertex vertex = mockExecutionVertex(attemptID);
    final Execution executionAttempt = vertex.getCurrentExecutionAttempt();
    final BlockingQueue<Long> triggerCalls = new LinkedBlockingQueue<>();
    doAnswer(new Answer<Void>() {

        @Override
        public Void answer(InvocationOnMock invocation) throws Throwable {
            triggerCalls.add((Long) invocation.getArguments()[0]);
            return null;
        }
    }).when(executionAttempt).triggerCheckpoint(anyLong(), anyLong(), any(CheckpointOptions.class));
    final long delay = 50;
    final CheckpointCoordinator coord = new CheckpointCoordinator(
            jid,
            2,          // periodic interval is 2 ms
            200_000,    // timeout is very long (200 s)
            delay,      // 50 ms delay between checkpoints
            1,
            ExternalizedCheckpointSettings.none(),
            new ExecutionVertex[] { vertex },
            new ExecutionVertex[] { vertex },
            new ExecutionVertex[] { vertex },
            new StandaloneCheckpointIDCounter(),
            new StandaloneCompletedCheckpointStore(2),
            "dummy-path",
            Executors.directExecutor());
    try {
        coord.startCheckpointScheduler();
        // wait until the first checkpoint was triggered
        Long firstCallId = triggerCalls.take();
        assertEquals(1L, firstCallId.longValue());
        AcknowledgeCheckpoint ackMsg = new AcknowledgeCheckpoint(jid, attemptID, 1L);
        // tell the coordinator that the checkpoint is done
        final long ackTime = System.nanoTime();
        coord.receiveAcknowledgeMessage(ackMsg);
        // wait until the next checkpoint is triggered
        Long nextCallId = triggerCalls.take();
        final long nextCheckpointTime = System.nanoTime();
        assertEquals(2L, nextCallId.longValue());
        final long delayMillis = (nextCheckpointTime - ackTime) / 1_000_000;
        // we need to add one ms here to account for rounding errors
        if (delayMillis + 1 < delay) {
            fail("checkpoint came too early: delay was " + delayMillis + " but should have been at least " + delay);
        }
    } finally {
        coord.stopCheckpointScheduler();
        coord.shutdown(JobStatus.FINISHED);
    }
}
Also used : ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) LinkedBlockingQueue(java.util.concurrent.LinkedBlockingQueue) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) AcknowledgeCheckpoint(org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) Execution(org.apache.flink.runtime.executiongraph.Execution) InvocationOnMock(org.mockito.invocation.InvocationOnMock) Matchers.anyLong(org.mockito.Matchers.anyLong) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)

Example 20 with ExecutionVertex

Use of org.apache.flink.runtime.executiongraph.ExecutionVertex in project flink by apache.

From the class CheckpointCoordinatorTest, method testRestoreLatestCheckpointedState.

/**
	 * Tests that the checkpointed partitioned and non-partitioned state is assigned properly to
	 * the {@link Execution} upon recovery.
	 *
	 * @throws Exception
	 */
@Test
public void testRestoreLatestCheckpointedState() throws Exception {
    final JobID jid = new JobID();
    final long timestamp = System.currentTimeMillis();
    final JobVertexID jobVertexID1 = new JobVertexID();
    final JobVertexID jobVertexID2 = new JobVertexID();
    int parallelism1 = 3;
    int parallelism2 = 2;
    int maxParallelism1 = 42;
    int maxParallelism2 = 13;
    final ExecutionJobVertex jobVertex1 = mockExecutionJobVertex(jobVertexID1, parallelism1, maxParallelism1);
    final ExecutionJobVertex jobVertex2 = mockExecutionJobVertex(jobVertexID2, parallelism2, maxParallelism2);
    List<ExecutionVertex> allExecutionVertices = new ArrayList<>(parallelism1 + parallelism2);
    allExecutionVertices.addAll(Arrays.asList(jobVertex1.getTaskVertices()));
    allExecutionVertices.addAll(Arrays.asList(jobVertex2.getTaskVertices()));
    ExecutionVertex[] arrayExecutionVertices = allExecutionVertices.toArray(new ExecutionVertex[allExecutionVertices.size()]);
    // set up the coordinator and validate the initial state
    CheckpointCoordinator coord = new CheckpointCoordinator(
            jid,
            600000,
            600000,
            0,
            Integer.MAX_VALUE,
            ExternalizedCheckpointSettings.none(),
            arrayExecutionVertices,
            arrayExecutionVertices,
            arrayExecutionVertices,
            new StandaloneCheckpointIDCounter(),
            new StandaloneCompletedCheckpointStore(1),
            null,
            Executors.directExecutor());
    // trigger the checkpoint
    coord.triggerCheckpoint(timestamp, false);
    assertTrue(coord.getPendingCheckpoints().keySet().size() == 1);
    long checkpointId = Iterables.getOnlyElement(coord.getPendingCheckpoints().keySet());
    CheckpointMetaData checkpointMetaData = new CheckpointMetaData(checkpointId, 0L);
    List<KeyGroupRange> keyGroupPartitions1 = StateAssignmentOperation.createKeyGroupPartitions(maxParallelism1, parallelism1);
    List<KeyGroupRange> keyGroupPartitions2 = StateAssignmentOperation.createKeyGroupPartitions(maxParallelism2, parallelism2);
    for (int index = 0; index < jobVertex1.getParallelism(); index++) {
        ChainedStateHandle<StreamStateHandle> nonPartitionedState = generateStateForVertex(jobVertexID1, index);
        ChainedStateHandle<OperatorStateHandle> partitionableState = generateChainedPartitionableStateHandle(jobVertexID1, index, 2, 8, false);
        KeyGroupsStateHandle partitionedKeyGroupState = generateKeyGroupState(jobVertexID1, keyGroupPartitions1.get(index), false);
        SubtaskState checkpointStateHandles = new SubtaskState(nonPartitionedState, partitionableState, null, partitionedKeyGroupState, null);
        AcknowledgeCheckpoint acknowledgeCheckpoint = new AcknowledgeCheckpoint(jid, jobVertex1.getTaskVertices()[index].getCurrentExecutionAttempt().getAttemptId(), checkpointId, new CheckpointMetrics(), checkpointStateHandles);
        coord.receiveAcknowledgeMessage(acknowledgeCheckpoint);
    }
    for (int index = 0; index < jobVertex2.getParallelism(); index++) {
        ChainedStateHandle<StreamStateHandle> nonPartitionedState = generateStateForVertex(jobVertexID2, index);
        ChainedStateHandle<OperatorStateHandle> partitionableState = generateChainedPartitionableStateHandle(jobVertexID2, index, 2, 8, false);
        KeyGroupsStateHandle partitionedKeyGroupState = generateKeyGroupState(jobVertexID2, keyGroupPartitions2.get(index), false);
        SubtaskState checkpointStateHandles = new SubtaskState(nonPartitionedState, partitionableState, null, partitionedKeyGroupState, null);
        AcknowledgeCheckpoint acknowledgeCheckpoint = new AcknowledgeCheckpoint(jid, jobVertex2.getTaskVertices()[index].getCurrentExecutionAttempt().getAttemptId(), checkpointId, new CheckpointMetrics(), checkpointStateHandles);
        coord.receiveAcknowledgeMessage(acknowledgeCheckpoint);
    }
    List<CompletedCheckpoint> completedCheckpoints = coord.getSuccessfulCheckpoints();
    assertEquals(1, completedCheckpoints.size());
    Map<JobVertexID, ExecutionJobVertex> tasks = new HashMap<>();
    tasks.put(jobVertexID1, jobVertex1);
    tasks.put(jobVertexID2, jobVertex2);
    coord.restoreLatestCheckpointedState(tasks, true, false);
    // verify the restored state
    verifyStateRestore(jobVertexID1, jobVertex1, keyGroupPartitions1);
    verifyStateRestore(jobVertexID2, jobVertex2, keyGroupPartitions2);
}
Also used : HashMap(java.util.HashMap) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) ArrayList(java.util.ArrayList) KeyGroupRange(org.apache.flink.runtime.state.KeyGroupRange) ExecutionVertex(org.apache.flink.runtime.executiongraph.ExecutionVertex) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle) StreamStateHandle(org.apache.flink.runtime.state.StreamStateHandle) ByteStreamStateHandle(org.apache.flink.runtime.state.memory.ByteStreamStateHandle) ExecutionJobVertex(org.apache.flink.runtime.executiongraph.ExecutionJobVertex) AcknowledgeCheckpoint(org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) DeclineCheckpoint(org.apache.flink.runtime.messages.checkpoint.DeclineCheckpoint) AcknowledgeCheckpoint(org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) OperatorStateHandle(org.apache.flink.runtime.state.OperatorStateHandle) JobID(org.apache.flink.api.common.JobID) Test(org.junit.Test)
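
The restore in this test leans on StateAssignmentOperation.createKeyGroupPartitions to split the key-group space [0, maxParallelism) into one KeyGroupRange per subtask. The method below is a rough, standalone sketch of such an even split, not Flink's actual implementation (KeyGroupRangeAssignment may place the remainder on different subtasks); it only illustrates the shape of the result.

// Illustrative sketch only: splits key groups 0 .. numberKeyGroups-1 into
// `parallelism` contiguous, near-equal ranges (inclusive bounds).
// Assumes java.util.List and java.util.ArrayList are available.
static List<int[]> evenKeyGroupSplit(int numberKeyGroups, int parallelism) {
    List<int[]> ranges = new ArrayList<>(parallelism);
    for (int i = 0; i < parallelism; i++) {
        int start = i * numberKeyGroups / parallelism;
        int end = (i + 1) * numberKeyGroups / parallelism - 1;
        ranges.add(new int[] { start, end });
    }
    return ranges;
}

With the numbers used in this test, the sketch gives [0, 13], [14, 27], [28, 41] for (42, 3) and [0, 5], [6, 12] for (13, 2); in the test itself, each subtask acknowledges and later restores state for exactly the KeyGroupRange assigned to its index.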

Aggregations

ExecutionVertex (org.apache.flink.runtime.executiongraph.ExecutionVertex): 65
Test (org.junit.Test): 47
JobID (org.apache.flink.api.common.JobID): 42
ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID): 41
AcknowledgeCheckpoint (org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint): 23
IOException (java.io.IOException): 15
Execution (org.apache.flink.runtime.executiongraph.Execution): 15
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID): 15
ExecutionJobVertex (org.apache.flink.runtime.executiongraph.ExecutionJobVertex): 12
DeclineCheckpoint (org.apache.flink.runtime.messages.checkpoint.DeclineCheckpoint): 12
HashMap (java.util.HashMap): 10
ArrayList (java.util.ArrayList): 8
TriggerStackTraceSample (org.apache.flink.runtime.messages.StackTraceSampleMessages.TriggerStackTraceSample): 8
StreamStateHandle (org.apache.flink.runtime.state.StreamStateHandle): 7
ExecutionGraph (org.apache.flink.runtime.executiongraph.ExecutionGraph): 5
IntermediateResultPartition (org.apache.flink.runtime.executiongraph.IntermediateResultPartition): 5
SimpleSlot (org.apache.flink.runtime.instance.SimpleSlot): 5
ResultPartitionID (org.apache.flink.runtime.io.network.partition.ResultPartitionID): 5
KeyGroupRange (org.apache.flink.runtime.state.KeyGroupRange): 5
KeyGroupsStateHandle (org.apache.flink.runtime.state.KeyGroupsStateHandle): 5