Example 16 with ResultPartitionWriter

use of org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter in project flink by apache.

the class StreamTask method performCheckpoint.

private boolean performCheckpoint(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions, CheckpointMetrics checkpointMetrics) throws Exception {
    LOG.debug("Starting checkpoint ({}) {} on task {}", checkpointMetaData.getCheckpointId(), checkpointOptions.getCheckpointType(), getName());
    synchronized (lock) {
        if (isRunning) {
            // we can do a checkpoint
            // Since both state checkpointing and downstream barrier emission occurs in this
            // lock scope, they are an atomic operation regardless of the order in which they occur.
            // Given this, we immediately emit the checkpoint barriers, so the downstream operators
            // can start their checkpoint work as soon as possible
            operatorChain.broadcastCheckpointBarrier(checkpointMetaData.getCheckpointId(), checkpointMetaData.getTimestamp(), checkpointOptions);
            checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);
            return true;
        } else {
            // we cannot perform our checkpoint - let the downstream operators know that they
            // should not wait for any input from this operator
            // we cannot broadcast the cancellation markers on the 'operator chain', because it may not
            // yet be created
            final CancelCheckpointMarker message = new CancelCheckpointMarker(checkpointMetaData.getCheckpointId());
            Exception exception = null;
            for (ResultPartitionWriter output : getEnvironment().getAllWriters()) {
                try {
                    output.writeBufferToAllChannels(EventSerializer.toBuffer(message));
                } catch (Exception e) {
                    exception = ExceptionUtils.firstOrSuppressed(new Exception("Could not send cancel checkpoint marker to downstream tasks.", e), exception);
                }
            }
            if (exception != null) {
                throw exception;
            }
            return false;
        }
    }
}
Also used : ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) CancelCheckpointMarker(org.apache.flink.runtime.io.network.api.CancelCheckpointMarker) CancelTaskException(org.apache.flink.runtime.execution.CancelTaskException) IOException(java.io.IOException)
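
The else-branch above tries every writer before rethrowing, keeping the first failure and attaching later ones to it. A minimal, self-contained sketch of that pattern follows. It assumes nothing from Flink: `MarkerSink`, `broadcast`, and this local `firstOrSuppressed` are hypothetical stand-ins written for illustration, and the helper only approximates what `ExceptionUtils.firstOrSuppressed` appears to do in the snippet.

```java
import java.util.ArrayList;
import java.util.List;

public class CancelMarkerBroadcastSketch {

    /** Hypothetical stand-in for a writer; this is not Flink's ResultPartitionWriter. */
    interface MarkerSink {
        void send(String marker) throws Exception;
    }

    /** Keeps the first exception and attaches every later one to it as suppressed. */
    static <T extends Throwable> T firstOrSuppressed(T newException, T previous) {
        if (previous == null) {
            return newException;
        }
        previous.addSuppressed(newException);
        return previous;
    }

    /** Tries every sink before rethrowing, so one failing sink does not starve the others. */
    static void broadcast(List<MarkerSink> sinks, String marker) throws Exception {
        Exception failure = null;
        for (MarkerSink sink : sinks) {
            try {
                sink.send(marker);
            } catch (Exception e) {
                failure = firstOrSuppressed(new Exception("Could not send marker.", e), failure);
            }
        }
        if (failure != null) {
            throw failure;
        }
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        List<MarkerSink> sinks = new ArrayList<>();
        sinks.add(m -> { throw new IllegalStateException("sink 0 down"); });
        sinks.add(delivered::add);
        sinks.add(m -> { throw new IllegalStateException("sink 2 down"); });
        try {
            broadcast(sinks, "cancel-42");
        } catch (Exception e) {
            // The healthy sink still received the marker even though its neighbours failed.
            System.out.println("cause=" + e.getCause().getMessage()
                    + " suppressed=" + e.getSuppressed().length
                    + " delivered=" + delivered);
        }
    }
}
```

The design choice worth noting is that the loop never exits early: downstream tasks that can still be reached are told about the cancellation, and the caller sees all failures through a single exception.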

Example 17 with ResultPartitionWriter

use of org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter in project flink by apache.

the class OperatorChain method createStreamOutput.

private <T> RecordWriterOutput<T> createStreamOutput(StreamEdge edge, StreamConfig upStreamConfig, int outputIndex, Environment taskEnvironment, String taskName) {
    // OutputTag, return null if not sideOutput
    OutputTag sideOutputTag = edge.getOutputTag();
    TypeSerializer outSerializer = null;
    if (edge.getOutputTag() != null) {
        // side output
        outSerializer = upStreamConfig.getTypeSerializerSideOut(edge.getOutputTag(), taskEnvironment.getUserClassLoader());
    } else {
        // main output
        outSerializer = upStreamConfig.getTypeSerializerOut(taskEnvironment.getUserClassLoader());
    }
    @SuppressWarnings("unchecked") StreamPartitioner<T> outputPartitioner = (StreamPartitioner<T>) edge.getPartitioner();
    LOG.debug("Using partitioner {} for output {} of task {}", outputPartitioner, outputIndex, taskName);
    ResultPartitionWriter bufferWriter = taskEnvironment.getWriter(outputIndex);
    // we initialize the partitioner here with the number of key groups (aka max. parallelism)
    if (outputPartitioner instanceof ConfigurableStreamPartitioner) {
        int numKeyGroups = bufferWriter.getNumTargetKeyGroups();
        if (0 < numKeyGroups) {
            ((ConfigurableStreamPartitioner) outputPartitioner).configure(numKeyGroups);
        }
    }
    StreamRecordWriter<SerializationDelegate<StreamRecord<T>>> output = new StreamRecordWriter<>(bufferWriter, outputPartitioner, upStreamConfig.getBufferTimeout());
    output.setMetricGroup(taskEnvironment.getMetricGroup().getIOMetricGroup());
    return new RecordWriterOutput<>(output, outSerializer, sideOutputTag, this);
}
Also used : ConfigurableStreamPartitioner(org.apache.flink.streaming.runtime.partitioner.ConfigurableStreamPartitioner) StreamPartitioner(org.apache.flink.streaming.runtime.partitioner.StreamPartitioner) ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) SerializationDelegate(org.apache.flink.runtime.plugable.SerializationDelegate) RecordWriterOutput(org.apache.flink.streaming.runtime.io.RecordWriterOutput) StreamRecordWriter(org.apache.flink.streaming.runtime.io.StreamRecordWriter) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) OutputTag(org.apache.flink.util.OutputTag)

Example 18 with ResultPartitionWriter

use of org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter in project flink by apache.

the class StreamMockEnvironment method addOutput.

public <T> void addOutput(final Queue<Object> outputList, final TypeSerializer<T> serializer) {
    try {
        // The record-oriented writers wrap the buffer writer. We mock it
        // to collect the returned buffers and deserialize the content to
        // the output list
        BufferProvider mockBufferProvider = mock(BufferProvider.class);
        when(mockBufferProvider.requestBufferBlocking()).thenAnswer(new Answer<Buffer>() {

            @Override
            public Buffer answer(InvocationOnMock invocationOnMock) throws Throwable {
                return new Buffer(MemorySegmentFactory.allocateUnpooledSegment(bufferSize), mock(BufferRecycler.class));
            }
        });
        ResultPartitionWriter mockWriter = mock(ResultPartitionWriter.class);
        when(mockWriter.getNumberOfOutputChannels()).thenReturn(1);
        when(mockWriter.getBufferProvider()).thenReturn(mockBufferProvider);
        final RecordDeserializer<DeserializationDelegate<T>> recordDeserializer = new AdaptiveSpanningRecordDeserializer<DeserializationDelegate<T>>();
        final NonReusingDeserializationDelegate<T> delegate = new NonReusingDeserializationDelegate<T>(serializer);
        // Add records and events from the buffer to the output list
        doAnswer(new Answer<Void>() {

            @Override
            public Void answer(InvocationOnMock invocationOnMock) throws Throwable {
                Buffer buffer = (Buffer) invocationOnMock.getArguments()[0];
                addBufferToOutputList(recordDeserializer, delegate, buffer, outputList);
                return null;
            }
        }).when(mockWriter).writeBuffer(any(Buffer.class), anyInt());
        doAnswer(new Answer<Void>() {

            @Override
            public Void answer(InvocationOnMock invocationOnMock) throws Throwable {
                Buffer buffer = (Buffer) invocationOnMock.getArguments()[0];
                addBufferToOutputList(recordDeserializer, delegate, buffer, outputList);
                return null;
            }
        }).when(mockWriter).writeBufferToAllChannels(any(Buffer.class));
        outputs.add(mockWriter);
    } catch (Throwable t) {
        t.printStackTrace();
        fail(t.getMessage());
    }
}
Also used : Buffer(org.apache.flink.runtime.io.network.buffer.Buffer) AdaptiveSpanningRecordDeserializer(org.apache.flink.runtime.io.network.api.serialization.AdaptiveSpanningRecordDeserializer) ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) InvocationOnMock(org.mockito.invocation.InvocationOnMock) NonReusingDeserializationDelegate(org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate) BufferProvider(org.apache.flink.runtime.io.network.buffer.BufferProvider) DeserializationDelegate(org.apache.flink.runtime.plugable.DeserializationDelegate)

Example 19 with ResultPartitionWriter

use of org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter in project flink by apache.

the class StreamRecordWriterTest method getMockWriter.

private static ResultPartitionWriter getMockWriter(int numPartitions) throws Exception {
    BufferProvider mockProvider = mock(BufferProvider.class);
    when(mockProvider.requestBufferBlocking()).thenAnswer(new Answer<Buffer>() {

        @Override
        public Buffer answer(InvocationOnMock invocation) {
            return new Buffer(MemorySegmentFactory.allocateUnpooledSegment(4096), FreeingBufferRecycler.INSTANCE);
        }
    });
    ResultPartitionWriter mockWriter = mock(ResultPartitionWriter.class);
    when(mockWriter.getBufferProvider()).thenReturn(mockProvider);
    when(mockWriter.getNumberOfOutputChannels()).thenReturn(numPartitions);
    return mockWriter;
}
Also used : Buffer(org.apache.flink.runtime.io.network.buffer.Buffer) InvocationOnMock(org.mockito.invocation.InvocationOnMock) ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) BufferProvider(org.apache.flink.runtime.io.network.buffer.BufferProvider)

Example 20 with ResultPartitionWriter

use of org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter in project flink by apache.

the class Task method doRun.

private void doRun() {
    // ----------------------------
    while (true) {
        ExecutionState current = this.executionState;
        if (current == ExecutionState.CREATED) {
            if (transitionState(ExecutionState.CREATED, ExecutionState.DEPLOYING)) {
                // success, we can start our work
                break;
            }
        } else if (current == ExecutionState.FAILED) {
            // we were immediately failed. tell the TaskManager that we reached our final state
            notifyFinalState();
            if (metrics != null) {
                metrics.close();
            }
            return;
        } else if (current == ExecutionState.CANCELING) {
            if (transitionState(ExecutionState.CANCELING, ExecutionState.CANCELED)) {
                // we were immediately canceled. tell the TaskManager that we reached our final
                // state
                notifyFinalState();
                if (metrics != null) {
                    metrics.close();
                }
                return;
            }
        } else {
            if (metrics != null) {
                metrics.close();
            }
            throw new IllegalStateException("Invalid state for beginning of operation of task " + this + '.');
        }
    }
    // all resource acquisitions and registrations from here on
    // need to be undone in the end
    Map<String, Future<Path>> distributedCacheEntries = new HashMap<>();
    TaskInvokable invokable = null;
    try {
        // ----------------------------
        // Task Bootstrap - We periodically
        // check for canceling as a shortcut
        // ----------------------------
        // activate safety net for task thread
        LOG.debug("Creating FileSystem stream leak safety net for task {}", this);
        FileSystemSafetyNet.initializeSafetyNetForThread();
        // first of all, get a user-code classloader
        // this may involve downloading the job's JAR files and/or classes
        LOG.info("Loading JAR files for task {}.", this);
        userCodeClassLoader = createUserCodeClassloader();
        final ExecutionConfig executionConfig = serializedExecutionConfig.deserializeValue(userCodeClassLoader.asClassLoader());
        if (executionConfig.getTaskCancellationInterval() >= 0) {
            // override task cancellation interval from Flink config if set in ExecutionConfig
            taskCancellationInterval = executionConfig.getTaskCancellationInterval();
        }
        if (executionConfig.getTaskCancellationTimeout() >= 0) {
            // override task cancellation timeout from Flink config if set in ExecutionConfig
            taskCancellationTimeout = executionConfig.getTaskCancellationTimeout();
        }
        if (isCanceledOrFailed()) {
            throw new CancelTaskException();
        }
        // ----------------------------------------------------------------
        // register the task with the network stack
        // this operation may fail if the system does not have enough
        // memory to run the necessary data exchanges
        // the registration must also strictly be undone
        // ----------------------------------------------------------------
        LOG.debug("Registering task at network: {}.", this);
        setupPartitionsAndGates(consumableNotifyingPartitionWriters, inputGates);
        for (ResultPartitionWriter partitionWriter : consumableNotifyingPartitionWriters) {
            taskEventDispatcher.registerPartition(partitionWriter.getPartitionId());
        }
        // next, kick off the background copying of files for the distributed cache
        try {
            for (Map.Entry<String, DistributedCache.DistributedCacheEntry> entry : DistributedCache.readFileInfoFromConfig(jobConfiguration)) {
                LOG.info("Obtaining local cache file for '{}'.", entry.getKey());
                Future<Path> cp = fileCache.createTmpFile(entry.getKey(), entry.getValue(), jobId, executionId);
                distributedCacheEntries.put(entry.getKey(), cp);
            }
        } catch (Exception e) {
            throw new Exception(String.format("Exception while adding files to distributed cache of task %s (%s).", taskNameWithSubtask, executionId), e);
        }
        if (isCanceledOrFailed()) {
            throw new CancelTaskException();
        }
        // ----------------------------------------------------------------
        // call the user code initialization methods
        // ----------------------------------------------------------------
        TaskKvStateRegistry kvStateRegistry = kvStateService.createKvStateTaskRegistry(jobId, getJobVertexId());
        Environment env = new RuntimeEnvironment(jobId, vertexId, executionId, executionConfig, taskInfo, jobConfiguration, taskConfiguration, userCodeClassLoader, memoryManager, ioManager, broadcastVariableManager, taskStateManager, aggregateManager, accumulatorRegistry, kvStateRegistry, inputSplitProvider, distributedCacheEntries, consumableNotifyingPartitionWriters, inputGates, taskEventDispatcher, checkpointResponder, operatorCoordinatorEventGateway, taskManagerConfig, metrics, this, externalResourceInfoProvider);
        // Make sure the user code classloader is accessible thread-locally.
        // We are setting the correct context class loader before instantiating the invokable
        // so that it is available to the invokable during its entire lifetime.
        executingThread.setContextClassLoader(userCodeClassLoader.asClassLoader());
        // When constructing invokable, separate threads can be constructed and thus should be
        // monitored for system exit (in addition to invoking thread itself monitored below).
        FlinkSecurityManager.monitorUserSystemExitForCurrentThread();
        try {
            // now load and instantiate the task's invokable code
            invokable = loadAndInstantiateInvokable(userCodeClassLoader.asClassLoader(), nameOfInvokableClass, env);
        } finally {
            FlinkSecurityManager.unmonitorUserSystemExitForCurrentThread();
        }
        // ----------------------------------------------------------------
        // actual task core work
        // ----------------------------------------------------------------
        // we must make strictly sure that the invokable is accessible to the cancel() call
        // by the time we switched to running.
        this.invokable = invokable;
        restoreAndInvoke(invokable);
        // make sure we enter the catch block if the task left the invoke() method
        // due to the fact that it has been canceled
        if (isCanceledOrFailed()) {
            throw new CancelTaskException();
        }
        // finish the produced partitions. if this fails, we consider the execution failed.
        for (ResultPartitionWriter partitionWriter : consumableNotifyingPartitionWriters) {
            if (partitionWriter != null) {
                partitionWriter.finish();
            }
        }
        // if that fails, the task was canceled/failed in the meantime
        if (!transitionState(ExecutionState.RUNNING, ExecutionState.FINISHED)) {
            throw new CancelTaskException();
        }
    } catch (Throwable t) {
        // ----------------------------------------------------------------
        // the execution failed. either the invokable code properly failed, or
        // an exception was thrown as a side effect of cancelling
        // ----------------------------------------------------------------
        t = preProcessException(t);
        try {
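            // transition into our final state; loop to retry when the state changes
            // concurrently via calls to cancel() or to failExternally()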
            while (true) {
                ExecutionState current = this.executionState;
                if (current == ExecutionState.RUNNING || current == ExecutionState.INITIALIZING || current == ExecutionState.DEPLOYING) {
                    if (ExceptionUtils.findThrowable(t, CancelTaskException.class).isPresent()) {
                        if (transitionState(current, ExecutionState.CANCELED, t)) {
                            cancelInvokable(invokable);
                            break;
                        }
                    } else {
                        if (transitionState(current, ExecutionState.FAILED, t)) {
                            cancelInvokable(invokable);
                            break;
                        }
                    }
                } else if (current == ExecutionState.CANCELING) {
                    if (transitionState(current, ExecutionState.CANCELED)) {
                        break;
                    }
                } else if (current == ExecutionState.FAILED) {
                    // in state failed already, no transition necessary any more
                    break;
                } else {
                    // unexpected state, go to failed
                    if (transitionState(current, ExecutionState.FAILED, t)) {
                        LOG.error("Unexpected state in task {} ({}) during an exception: {}.", taskNameWithSubtask, executionId, current);
                        break;
                    }
                }
                // else fall through the loop and retry
            }
        } catch (Throwable tt) {
            String message = String.format("FATAL - exception in exception handler of task %s (%s).", taskNameWithSubtask, executionId);
            LOG.error(message, tt);
            notifyFatalError(message, tt);
        }
    } finally {
        try {
            LOG.info("Freeing task resources for {} ({}).", taskNameWithSubtask, executionId);
            // clear the reference to the invokable. this helps guard against holding references
            // to the invokable and its structures in cases where this Task object is still
            // referenced
            this.invokable = null;
            // free the network resources
            releaseResources();
            // free memory resources
            if (invokable != null) {
                memoryManager.releaseAll(invokable);
            }
            // remove all of the tasks resources
            fileCache.releaseJob(jobId, executionId);
            // close and de-activate safety net for task thread
            LOG.debug("Ensuring all FileSystem streams are closed for task {}", this);
            FileSystemSafetyNet.closeSafetyNetAndGuardedResourcesForThread();
            notifyFinalState();
        } catch (Throwable t) {
            // an error in the resource cleanup is fatal
            String message = String.format("FATAL - exception in resource cleanup of task %s (%s).", taskNameWithSubtask, executionId);
            LOG.error(message, t);
            notifyFatalError(message, t);
        }
        // errors here will only be logged
        try {
            metrics.close();
        } catch (Throwable t) {
            LOG.error("Error during metrics de-registration of task {} ({}).", taskNameWithSubtask, executionId, t);
        }
    }
}
Also used : Path(org.apache.flink.core.fs.Path) ExecutionState(org.apache.flink.runtime.execution.ExecutionState) HashMap(java.util.HashMap) ResultPartitionWriter(org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) TaskKvStateRegistry(org.apache.flink.runtime.query.TaskKvStateRegistry) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) TaskNotRunningException(org.apache.flink.runtime.operators.coordination.TaskNotRunningException) WrappingRuntimeException(org.apache.flink.util.WrappingRuntimeException) CheckpointException(org.apache.flink.runtime.checkpoint.CheckpointException) CancelTaskException(org.apache.flink.runtime.execution.CancelTaskException) InvocationTargetException(java.lang.reflect.InvocationTargetException) FlinkException(org.apache.flink.util.FlinkException) RunnableWithException(org.apache.flink.util.function.RunnableWithException) RejectedExecutionException(java.util.concurrent.RejectedExecutionException) FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) IOException(java.io.IOException) Future(java.util.concurrent.Future) CompletableFuture(java.util.concurrent.CompletableFuture) TaskInvokable(org.apache.flink.runtime.jobgraph.tasks.TaskInvokable) ShuffleEnvironment(org.apache.flink.runtime.shuffle.ShuffleEnvironment) NettyShuffleEnvironment(org.apache.flink.runtime.io.network.NettyShuffleEnvironment) Environment(org.apache.flink.runtime.execution.Environment) Map(java.util.Map)
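
The opening `while (true)` loop of `doRun` reads the current state, attempts a conditional transition, and retries if another thread changed the state in between. The snippet does not show how `transitionState` is implemented, so the sketch below assumes a plain compare-and-set on an `AtomicReference`. The class `StateTransitionSketch`, its reduced `State` enum, and the method names are hypothetical and only illustrate the retry loop.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StateTransitionSketch {

    /** Reduced, hypothetical state set; the real ExecutionState has more values. */
    enum State { CREATED, DEPLOYING, CANCELING, CANCELED, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.CREATED);

    /** Assumed behaviour: succeeds only if the state is still the expected one. */
    boolean transition(State expected, State next) {
        return state.compareAndSet(expected, next);
    }

    State current() {
        return state.get();
    }

    /** Returns true if work may start, false if a final state was reached first. */
    boolean begin() {
        while (true) {
            State current = state.get();
            if (current == State.CREATED) {
                if (transition(State.CREATED, State.DEPLOYING)) {
                    return true;
                }
            } else if (current == State.FAILED) {
                return false;
            } else if (current == State.CANCELING) {
                if (transition(State.CANCELING, State.CANCELED)) {
                    return false;
                }
            } else {
                throw new IllegalStateException("Invalid state for beginning of operation: " + current);
            }
            // A failed compare-and-set means the state changed concurrently: re-read and retry.
        }
    }

    public static void main(String[] args) {
        StateTransitionSketch fresh = new StateTransitionSketch();
        System.out.println(fresh.begin() + " " + fresh.current());

        StateTransitionSketch canceledEarly = new StateTransitionSketch();
        canceledEarly.transition(State.CREATED, State.CANCELING);
        System.out.println(canceledEarly.begin() + " " + canceledEarly.current());
    }
}
```

Re-reading the state on every iteration is what makes the loop safe against a concurrent cancel or failure: the conditional transition either wins or the loop observes the new state and takes the matching branch.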

Aggregations

ResultPartitionWriter (org.apache.flink.runtime.io.network.api.writer.ResultPartitionWriter) 37
ResultPartition (org.apache.flink.runtime.io.network.partition.ResultPartition) 12
JobID (org.apache.flink.api.common.JobID) 11
IOException (java.io.IOException) 10
Test (org.junit.Test) 10
CompletingCheckpointResponder (org.apache.flink.streaming.util.CompletingCheckpointResponder) 8
FlinkRuntimeException (org.apache.flink.util.FlinkRuntimeException) 8
ExecutionAttemptID (org.apache.flink.runtime.executiongraph.ExecutionAttemptID) 7
EndOfData (org.apache.flink.runtime.io.network.api.EndOfData) 7
CompletableFuture (java.util.concurrent.CompletableFuture) 6
CheckpointMetaData (org.apache.flink.runtime.checkpoint.CheckpointMetaData) 6
CancelTaskException (org.apache.flink.runtime.execution.CancelTaskException) 6
StreamRecord (org.apache.flink.streaming.runtime.streamrecord.StreamRecord) 6
ArrayList (java.util.ArrayList) 5
Future (java.util.concurrent.Future) 5
CheckpointMetrics (org.apache.flink.runtime.checkpoint.CheckpointMetrics) 5
CheckpointOptions (org.apache.flink.runtime.checkpoint.CheckpointOptions) 5
SavepointType (org.apache.flink.runtime.checkpoint.SavepointType) 5
TaskStateSnapshot (org.apache.flink.runtime.checkpoint.TaskStateSnapshot) 5
StopMode (org.apache.flink.runtime.io.network.api.StopMode) 5