Example 6 with CancelCheckpointMarker

use of org.apache.flink.runtime.io.network.api.CancelCheckpointMarker in project flink by apache.

the class StreamTask method performCheckpoint.

private boolean performCheckpoint(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions, CheckpointMetrics checkpointMetrics) throws Exception {
    LOG.debug("Starting checkpoint ({}) {} on task {}", checkpointMetaData.getCheckpointId(), checkpointOptions.getCheckpointType(), getName());
    synchronized (lock) {
        if (isRunning) {
            // we can do a checkpoint
            // Since both state checkpointing and downstream barrier emission occur in this
            // lock scope, they form an atomic operation regardless of the order in which they occur.
            // Given this, we immediately emit the checkpoint barriers, so the downstream operators
            // can start their checkpoint work as soon as possible
            operatorChain.broadcastCheckpointBarrier(checkpointMetaData.getCheckpointId(), checkpointMetaData.getTimestamp(), checkpointOptions);
            checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);
            return true;
        } else {
            // we cannot perform our checkpoint - let the downstream operators know that they
            // should not wait for any input from this operator
            // we cannot broadcast the cancellation markers on the 'operator chain', because it may not
            // yet be created
            final CancelCheckpointMarker message = new CancelCheckpointMarker(checkpointMetaData.getCheckpointId());
            Exception exception = null;
            for (ResultPartitionWriter output : getEnvironment().getAllWriters()) {
                try {
                    output.writeBufferToAllChannels(EventSerializer.toBuffer(message));
                } catch (Exception e) {
                    exception = ExceptionUtils.firstOrSuppressed(new Exception("Could not send cancel checkpoint marker to downstream tasks.", e), exception);
                }
            }
            if (exception != null) {
                throw exception;
            }
            return false;
        }
    }
}
Also used : ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) CancelTaskException(org.apache.flink.runtime.execution.CancelTaskException) IOException(java.io.IOException)
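
The not-running branch above keeps writing to the remaining outputs after one of them fails, and only rethrows at the end. A minimal JDK-only sketch of that pattern follows. The `Output` interface, the `broadcast` method and the local `firstOrSuppressed` helper are hypothetical stand-ins written for illustration; they are not Flink's `ResultPartitionWriter` or `ExceptionUtils`, and the real helper may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstOrSuppressedSketch {

    /** Hypothetical stand-in for an output channel; not a Flink type. */
    public interface Output {
        void write(String event) throws Exception;
    }

    /** Keeps the first failure and attaches each later one to it as a suppressed exception. */
    public static Exception firstOrSuppressed(Exception newException, Exception previous) {
        if (previous == null) {
            return newException;
        }
        previous.addSuppressed(newException);
        return previous;
    }

    /** Tries every output, even if some fail, then rethrows the collected failure. */
    public static void broadcast(List<Output> outputs, String event) throws Exception {
        Exception exception = null;
        for (Output output : outputs) {
            try {
                output.write(event);
            } catch (Exception e) {
                exception = firstOrSuppressed(new Exception("Could not send " + event, e), exception);
            }
        }
        if (exception != null) {
            throw exception;
        }
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        List<Output> outputs = new ArrayList<>();
        outputs.add(e -> { throw new IllegalStateException("channel 0 closed"); });
        outputs.add(delivered::add); // the healthy output still receives the event
        outputs.add(e -> { throw new IllegalStateException("channel 2 closed"); });
        try {
            broadcast(outputs, "cancel-41");
        } catch (Exception ex) {
            System.out.println("delivered=" + delivered + ", suppressed=" + ex.getSuppressed().length);
        }
    }
}
```

The design choice here is that one broken channel should not prevent the other downstream tasks from learning that the checkpoint was cancelled.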

Example 7 with CancelCheckpointMarker

use of org.apache.flink.runtime.io.network.api.CancelCheckpointMarker in project flink by apache.

the class BarrierBuffer method getNextNonBlocked.

// ------------------------------------------------------------------------
//  Buffer and barrier handling
// ------------------------------------------------------------------------
@Override
public BufferOrEvent getNextNonBlocked() throws Exception {
    while (true) {
        // process buffered BufferOrEvents before grabbing new ones
        BufferOrEvent next;
        if (currentBuffered == null) {
            next = inputGate.getNextBufferOrEvent();
        } else {
            next = currentBuffered.getNext();
            if (next == null) {
                completeBufferedSequence();
                return getNextNonBlocked();
            }
        }
        if (next != null) {
            if (isBlocked(next.getChannelIndex())) {
                // if the channel is blocked, we just store the BufferOrEvent
                bufferSpiller.add(next);
                checkSizeLimit();
            } else if (next.isBuffer()) {
                return next;
            } else if (next.getEvent().getClass() == CheckpointBarrier.class) {
                if (!endOfStream) {
                    // process barriers only if there is a chance of the checkpoint completing
                    processBarrier((CheckpointBarrier) next.getEvent(), next.getChannelIndex());
                }
            } else if (next.getEvent().getClass() == CancelCheckpointMarker.class) {
                processCancellationBarrier((CancelCheckpointMarker) next.getEvent());
            } else {
                if (next.getEvent().getClass() == EndOfPartitionEvent.class) {
                    processEndOfPartition();
                }
                return next;
            }
        } else if (!endOfStream) {
            // end of input stream. stream continues with the buffered data
            endOfStream = true;
            releaseBlocksAndResetBarriers();
            return getNextNonBlocked();
        } else {
            // final end of both input and buffered data
            return null;
        }
    }
}
Also used : EndOfPartitionEvent(org.apache.flink.runtime.io.network.api.EndOfPartitionEvent) CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) BufferOrEvent(org.apache.flink.runtime.io.network.partition.consumer.BufferOrEvent)
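
To make the blocking and cancellation paths above easier to follow, here is a heavily simplified, JDK-only sketch of barrier alignment. `AlignmentSketch` and all of its methods are hypothetical names; records and events are plain strings, and the sketch assumes only one checkpoint is aligned at a time. It is not the `BarrierBuffer` implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class AlignmentSketch {
    private final boolean[] blocked;                        // channels that already delivered a barrier
    private final Queue<String> buffered = new ArrayDeque<>();
    private final List<String> emitted = new ArrayList<>();
    private long currentCheckpointId = -1L;
    private int barriersReceived;

    public AlignmentSketch(int numChannels) {
        this.blocked = new boolean[numChannels];
    }

    /** Records on a blocked channel are held back; all others pass through. */
    public void onRecord(int channel, String record) {
        if (blocked[channel]) {
            buffered.add(record);
        } else {
            emitted.add(record);
        }
    }

    /** Blocks the channel; once every channel has delivered its barrier, forwards the barrier. */
    public void onBarrier(int channel, long checkpointId) {
        if (barriersReceived == 0) {
            currentCheckpointId = checkpointId;
        } else if (checkpointId != currentCheckpointId || blocked[channel]) {
            return; // simplification: other checkpoints and duplicates are ignored
        }
        blocked[channel] = true;
        barriersReceived++;
        if (barriersReceived == blocked.length) {
            emitted.add("barrier-" + checkpointId);
            release();
        }
    }

    /** Forwards the cancellation and, if it hits the alignment in progress, aborts it. */
    public void onCancel(long checkpointId) {
        emitted.add("cancel-" + checkpointId);
        if (barriersReceived > 0 && checkpointId == currentCheckpointId) {
            release();
        }
    }

    /** Unblocks all channels and replays the held-back records in arrival order. */
    private void release() {
        Arrays.fill(blocked, false);
        barriersReceived = 0;
        while (!buffered.isEmpty()) {
            emitted.add(buffered.poll());
        }
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        AlignmentSketch sketch = new AlignmentSketch(2);
        sketch.onBarrier(0, 7L);   // channel 0 is now blocked
        sketch.onRecord(0, "a");   // held back
        sketch.onRecord(1, "b");   // passes through
        sketch.onCancel(7L);       // aborts the alignment and releases "a"
        System.out.println(sketch.emitted());
    }
}
```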

Example 8 with CancelCheckpointMarker

use of org.apache.flink.runtime.io.network.api.CancelCheckpointMarker in project flink by apache.

the class StreamTaskCancellationBarrierTest method testDeclineCallOnCancelBarrierOneInput.

/**
 * This test verifies (for one-input tasks) that stream tasks react in the following way to
 * receiving a checkpoint cancellation barrier:
 *
 *   - send a "decline checkpoint" notification out (to the JobManager)
 *   - emit a cancellation barrier downstream
 */
@Test
public void testDeclineCallOnCancelBarrierOneInput() throws Exception {
    OneInputStreamTask<String, String> task = new OneInputStreamTask<String, String>();
    OneInputStreamTaskTestHarness<String, String> testHarness = new OneInputStreamTaskTestHarness<>(task, 1, 2, BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
    testHarness.setupOutputForSingletonOperatorChain();
    StreamConfig streamConfig = testHarness.getStreamConfig();
    StreamMap<String, String> mapOperator = new StreamMap<>(new IdentityMap());
    streamConfig.setStreamOperator(mapOperator);
    StreamMockEnvironment environment = spy(testHarness.createEnvironment());
    // start the task
    testHarness.invoke(environment);
    testHarness.waitForTaskRunning();
    // emit cancellation barriers
    testHarness.processEvent(new CancelCheckpointMarker(2L), 0, 1);
    testHarness.processEvent(new CancelCheckpointMarker(2L), 0, 0);
    testHarness.waitForInputProcessing();
    // the decline call should go to the coordinator
    verify(environment, times(1)).declineCheckpoint(eq(2L), any(CheckpointDeclineOnCancellationBarrierException.class));
    // a cancellation barrier should have been emitted downstream
    Object result = testHarness.getOutput().poll();
    assertNotNull("nothing emitted", result);
    assertTrue("wrong type emitted", result instanceof CancelCheckpointMarker);
    assertEquals("wrong checkpoint id", 2L, ((CancelCheckpointMarker) result).getCheckpointId());
    // cancel and shutdown
    testHarness.endInput();
    testHarness.waitForTaskCompletion();
}
Also used : CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) StreamConfig(org.apache.flink.streaming.api.graph.StreamConfig) CheckpointDeclineOnCancellationBarrierException(org.apache.flink.runtime.checkpoint.decline.CheckpointDeclineOnCancellationBarrierException) CoStreamMap(org.apache.flink.streaming.api.operators.co.CoStreamMap) StreamMap(org.apache.flink.streaming.api.operators.StreamMap) Test(org.junit.Test)

Example 9 with CancelCheckpointMarker

use of org.apache.flink.runtime.io.network.api.CancelCheckpointMarker in project flink by apache.

the class OneInputStreamTaskTest method testOvertakingCheckpointBarriers.

/**
 * This test verifies that checkpoint barriers and barrier buffers work correctly with
 * concurrent checkpoint barriers where one checkpoint is "overtaking" another checkpoint, i.e.
 * some inputs receive barriers from an earlier checkpoint, thereby blocking,
 * then all inputs receive barriers from a later checkpoint.
 */
@Test
public void testOvertakingCheckpointBarriers() throws Exception {
    final OneInputStreamTask<String, String> mapTask = new OneInputStreamTask<String, String>();
    final OneInputStreamTaskTestHarness<String, String> testHarness = new OneInputStreamTaskTestHarness<String, String>(mapTask, 2, 2, BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);
    testHarness.setupOutputForSingletonOperatorChain();
    StreamConfig streamConfig = testHarness.getStreamConfig();
    StreamMap<String, String> mapOperator = new StreamMap<String, String>(new IdentityMap());
    streamConfig.setStreamOperator(mapOperator);
    ConcurrentLinkedQueue<Object> expectedOutput = new ConcurrentLinkedQueue<Object>();
    long initialTime = 0L;
    testHarness.invoke();
    testHarness.waitForTaskRunning();
    testHarness.processEvent(new CheckpointBarrier(0, 0, CheckpointOptions.forFullCheckpoint()), 0, 0);
    // These elements should be buffered until we receive barriers from
    // all inputs
    testHarness.processElement(new StreamRecord<String>("Hello-0-0", initialTime), 0, 0);
    testHarness.processElement(new StreamRecord<String>("Ciao-0-0", initialTime), 0, 0);
    // These elements should be forwarded, since we did not yet receive a checkpoint barrier
    // on that input, only add to same input, otherwise we would not know the ordering
    // of the output since the Task might read the inputs in any order
    testHarness.processElement(new StreamRecord<String>("Hello-1-1", initialTime), 1, 1);
    testHarness.processElement(new StreamRecord<String>("Ciao-1-1", initialTime), 1, 1);
    expectedOutput.add(new StreamRecord<String>("Hello-1-1", initialTime));
    expectedOutput.add(new StreamRecord<String>("Ciao-1-1", initialTime));
    testHarness.waitForInputProcessing();
    // we should not yet see the barrier, only the two elements from non-blocked input
    TestHarnessUtil.assertOutputEquals("Output was not correct.", expectedOutput, testHarness.getOutput());
    // Now give a later barrier to all inputs, this should unblock the first channel,
    // thereby allowing the two blocked elements through
    testHarness.processEvent(new CheckpointBarrier(1, 1, CheckpointOptions.forFullCheckpoint()), 0, 0);
    testHarness.processEvent(new CheckpointBarrier(1, 1, CheckpointOptions.forFullCheckpoint()), 0, 1);
    testHarness.processEvent(new CheckpointBarrier(1, 1, CheckpointOptions.forFullCheckpoint()), 1, 0);
    testHarness.processEvent(new CheckpointBarrier(1, 1, CheckpointOptions.forFullCheckpoint()), 1, 1);
    expectedOutput.add(new CancelCheckpointMarker(0));
    expectedOutput.add(new StreamRecord<String>("Hello-0-0", initialTime));
    expectedOutput.add(new StreamRecord<String>("Ciao-0-0", initialTime));
    expectedOutput.add(new CheckpointBarrier(1, 1, CheckpointOptions.forFullCheckpoint()));
    testHarness.waitForInputProcessing();
    TestHarnessUtil.assertOutputEquals("Output was not correct.", expectedOutput, testHarness.getOutput());
    // Then send the earlier barrier on the remaining channels; these should be ignored
    testHarness.processEvent(new CheckpointBarrier(0, 0, CheckpointOptions.forFullCheckpoint()), 0, 1);
    testHarness.processEvent(new CheckpointBarrier(0, 0, CheckpointOptions.forFullCheckpoint()), 1, 0);
    testHarness.processEvent(new CheckpointBarrier(0, 0, CheckpointOptions.forFullCheckpoint()), 1, 1);
    testHarness.waitForInputProcessing();
    testHarness.endInput();
    testHarness.waitForTaskCompletion();
    TestHarnessUtil.assertOutputEquals("Output was not correct.", expectedOutput, testHarness.getOutput());
}
Also used : CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) StreamConfig(org.apache.flink.streaming.api.graph.StreamConfig) CheckpointBarrier(org.apache.flink.runtime.io.network.api.CheckpointBarrier) ConcurrentLinkedQueue(java.util.concurrent.ConcurrentLinkedQueue) StreamMap(org.apache.flink.streaming.api.operators.StreamMap) Test(org.junit.Test)
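
The output order this test expects (cancellation marker for checkpoint 0, then the held-back records, then the barrier for checkpoint 1) can be reproduced with a small JDK-only sketch of the overtaking rule. `OvertakeSketch` and its methods are hypothetical names chosen for illustration, and records and events are plain strings; the real barrier handling is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OvertakeSketch {
    private final boolean[] blocked;
    private final List<String> buffered = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();
    private long currentId = -1L;   // id of the newest checkpoint seen so far
    private int numBarriers;        // barriers received for the alignment in progress

    public OvertakeSketch(int numChannels) {
        this.blocked = new boolean[numChannels];
    }

    public void onRecord(int channel, String record) {
        (blocked[channel] ? buffered : emitted).add(record);
    }

    public void onBarrier(int channel, long id) {
        if (numBarriers > 0) {
            if (id == currentId) {
                if (blocked[channel]) {
                    return;                          // duplicate barrier on this channel
                }
                blocked[channel] = true;
                numBarriers++;
            } else if (id > currentId) {
                emitted.add("cancel-" + currentId);  // the older checkpoint was overtaken
                release();
                begin(channel, id);
            } else {
                return;                              // late barrier of an older checkpoint
            }
        } else if (id > currentId) {
            begin(channel, id);
        } else {
            return;                                  // checkpoint already completed or aborted
        }
        if (numBarriers == blocked.length) {
            emitted.add("barrier-" + id);
            release();
        }
    }

    private void begin(int channel, long id) {
        currentId = id;
        blocked[channel] = true;
        numBarriers = 1;
    }

    private void release() {
        Arrays.fill(blocked, false);
        numBarriers = 0;
        emitted.addAll(buffered);
        buffered.clear();
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        OvertakeSketch sketch = new OvertakeSketch(2);
        sketch.onBarrier(0, 0L);    // checkpoint 0 starts aligning on channel 0
        sketch.onRecord(0, "x");    // held back
        sketch.onBarrier(0, 1L);    // checkpoint 1 overtakes checkpoint 0
        sketch.onBarrier(1, 1L);    // checkpoint 1 is now complete
        sketch.onBarrier(1, 0L);    // ignored: checkpoint 0 was already aborted
        System.out.println(sketch.emitted());
    }
}
```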

Example 10 with CancelCheckpointMarker

use of org.apache.flink.runtime.io.network.api.CancelCheckpointMarker in project flink by apache.

the class StreamTaskCancellationBarrierTest method testEmitCancellationBarrierWhenNotReady.

/**
 * This test checks that tasks emit a proper cancel checkpoint barrier if a "trigger checkpoint"
 * message arrives before they are ready.
 */
@Test
public void testEmitCancellationBarrierWhenNotReady() throws Exception {
    StreamTask<String, ?> task = new InitBlockingTask();
    StreamTaskTestHarness<String> testHarness = new StreamTaskTestHarness<>(task, BasicTypeInfo.STRING_TYPE_INFO);
    testHarness.setupOutputForSingletonOperatorChain();
    // start the task - it cannot get past the 'init()' method
    testHarness.invoke();
    // tell the task to commence a checkpoint
    boolean result = task.triggerCheckpoint(new CheckpointMetaData(41L, System.currentTimeMillis()), CheckpointOptions.forFullCheckpoint());
    assertFalse("task triggered checkpoint though not ready", result);
    // a cancellation barrier should have been emitted downstream
    Object emitted = testHarness.getOutput().poll();
    assertNotNull("nothing emitted", emitted);
    assertTrue("wrong type emitted", emitted instanceof CancelCheckpointMarker);
    assertEquals("wrong checkpoint id", 41L, ((CancelCheckpointMarker) emitted).getCheckpointId());
}
Also used : CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) CheckpointMetaData(org.apache.flink.runtime.checkpoint.CheckpointMetaData) Test(org.junit.Test)

Aggregations

CancelCheckpointMarker (org.apache.flink.runtime.io.network.api.CancelCheckpointMarker): 10
Test (org.junit.Test): 6
CheckpointBarrier (org.apache.flink.runtime.io.network.api.CheckpointBarrier): 5
StreamConfig (org.apache.flink.streaming.api.graph.StreamConfig): 4
IOException (java.io.IOException): 3
CoStreamMap (org.apache.flink.streaming.api.operators.co.CoStreamMap): 3
ByteBuffer (java.nio.ByteBuffer): 2
ConcurrentLinkedQueue (java.util.concurrent.ConcurrentLinkedQueue): 2
CheckpointOptions (org.apache.flink.runtime.checkpoint.CheckpointOptions): 2
CheckpointType (org.apache.flink.runtime.checkpoint.CheckpointOptions.CheckpointType): 2
CheckpointDeclineOnCancellationBarrierException (org.apache.flink.runtime.checkpoint.decline.CheckpointDeclineOnCancellationBarrierException): 2
AbstractEvent (org.apache.flink.runtime.event.AbstractEvent): 2
StreamMap (org.apache.flink.streaming.api.operators.StreamMap): 2
ByteOrder (java.nio.ByteOrder): 1
CheckpointMetaData (org.apache.flink.runtime.checkpoint.CheckpointMetaData): 1
CancelTaskException (org.apache.flink.runtime.execution.CancelTaskException): 1
EndOfPartitionEvent (org.apache.flink.runtime.io.network.api.EndOfPartitionEvent): 1
ResultPartitionWriter (org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter): 1
BufferOrEvent (org.apache.flink.runtime.io.network.partition.consumer.BufferOrEvent): 1
TestTaskEvent (org.apache.flink.runtime.io.network.util.TestTaskEvent): 1