
Example 1 with CommittableWithLineage

Use of org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage in project flink by apache.

From the class CompactorOperatorStateHandler, method drain.

private void drain() throws ExecutionException, InterruptedException {
    checkState(holdingSummary != null);
    checkState(holdingSummary.getNumberOfPendingCommittables() == holdingSummary.getNumberOfCommittables() && holdingSummary.getNumberOfCommittables() == holdingMessages.size() + compactingMessages.size());
    Long checkpointId = holdingSummary.getCheckpointId().isPresent() ? holdingSummary.getCheckpointId().getAsLong() : null;
    int subtaskId = holdingSummary.getSubtaskId();
    if (!compactingRequests.isEmpty()) {
        CompletableFuture.allOf(compactingRequests.stream().map(r -> r.f1).toArray(CompletableFuture[]::new)).join();
        for (Tuple2<CompactorRequest, CompletableFuture<Iterable<FileSinkCommittable>>> compacting : compactingRequests) {
            CompletableFuture<Iterable<FileSinkCommittable>> future = compacting.f1;
            checkState(future.isDone());
            // Exception is thrown if it's completed exceptionally
            for (FileSinkCommittable c : future.get()) {
                holdingMessages.add(new CommittableWithLineage<>(c, checkpointId, subtaskId));
            }
        }
    }
    // Appending the compacted committable to the holding summary
    CommittableSummary<FileSinkCommittable> summary = new CommittableSummary<>(holdingSummary.getSubtaskId(), holdingSummary.getNumberOfSubtasks(), holdingSummary.getCheckpointId().isPresent() ? holdingSummary.getCheckpointId().getAsLong() : null, holdingMessages.size(), holdingMessages.size(), holdingSummary.getNumberOfFailedCommittables());
    output.collect(new StreamRecord<>(summary));
    for (CommittableMessage<FileSinkCommittable> committable : holdingMessages) {
        output.collect(new StreamRecord<>(committable));
    }
    // Remaining requests should be all done and their results are all emitted.
    // From now on the operator is stateless.
    remainingRequestsState.clear();
    compactingRequests.clear();
    compactingMessages.clear();
    holdingSummary = null;
    holdingMessages = null;
    if (writerStateDrained) {
        // We can pass through everything if the writer state is also drained.
        stateDrained = true;
        compactService.close();
        compactService = null;
    }
}
Also used : CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Either(org.apache.flink.types.Either) CompletableFuture(java.util.concurrent.CompletableFuture) ArrayList(java.util.ArrayList) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) CheckpointListener(org.apache.flink.api.common.state.CheckpointListener) Map(java.util.Map) PendingFileRecoverable(org.apache.flink.streaming.api.functions.sink.filesystem.InProgressFileWriter.PendingFileRecoverable) RemainingRequestsSerializer(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator.RemainingRequestsSerializer) Preconditions.checkState(org.apache.flink.util.Preconditions.checkState) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) BucketWriter(org.apache.flink.streaming.api.functions.sink.filesystem.BucketWriter) AbstractStreamOperator(org.apache.flink.streaming.api.operators.AbstractStreamOperator) BoundedOneInput(org.apache.flink.streaming.api.operators.BoundedOneInput) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting) ExecutionException(java.util.concurrent.ExecutionException) CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) CommittableSummary(org.apache.flink.streaming.api.connector.sink2.CommittableSummary) List(java.util.List) REMAINING_REQUESTS_RAW_STATES_DESC(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator.REMAINING_REQUESTS_RAW_STATES_DESC) SimpleVersionedSerializer(org.apache.flink.core.io.SimpleVersionedSerializer) SimpleVersionedListState(org.apache.flink.streaming.api.operators.util.SimpleVersionedListState) Internal(org.apache.flink.annotation.Internal) OneInputStreamOperator(org.apache.flink.streaming.api.operators.OneInputStreamOperator) FileCompactor(org.apache.flink.connector.file.sink.compactor.FileCompactor) IdenticalFileCompactor(org.apache.flink.connector.file.sink.compactor.IdenticalFileCompactor) StateInitializationContext(org.apache.flink.runtime.state.StateInitializationContext)
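
The drain method in Example 1 first blocks on all pending compaction futures and only then reads each result. A minimal sketch of that wait-then-collect pattern, using only java.util.concurrent and no Flink types, might look like the following. The class name DrainFuturesSketch and the method drainAll are hypothetical names chosen for illustration, and plain strings stand in for FileSinkCommittable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class DrainFuturesSketch {

    /**
     * Hypothetical helper: blocks until every future is complete, then
     * flattens the results in the order the futures were submitted.
     */
    public static <T> List<T> drainAll(List<CompletableFuture<List<T>>> pending) {
        // allOf(...) completes once all inputs have completed. join() rethrows
        // a failure of any input as an unchecked CompletionException.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();

        List<T> results = new ArrayList<>();
        for (CompletableFuture<List<T>> future : pending) {
            // Every future is already done here, so join() does not block.
            results.addAll(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        List<CompletableFuture<List<String>>> pending = new ArrayList<>();
        pending.add(CompletableFuture.supplyAsync(() -> List.of("compacted-0")));
        pending.add(CompletableFuture.completedFuture(List.of("cleanup-.0", "cleanup-.1")));

        // prints [compacted-0, cleanup-.0, cleanup-.1]
        System.out.println(drainAll(pending));
    }
}
```

Unlike drain, which reads results with get() and therefore declares ExecutionException and InterruptedException, this sketch uses join() throughout so that failures surface as unchecked exceptions.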

Example 2 with CommittableWithLineage

Use of org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage in project flink by apache.

From the class CompactorOperator, method emitCompacted.

private void emitCompacted(@Nullable Long checkpointId) throws Exception {
    List<FileSinkCommittable> compacted = new ArrayList<>();
    Iterator<Tuple2<CompactorRequest, CompletableFuture<Iterable<FileSinkCommittable>>>> iter = compactingRequests.iterator();
    while (iter.hasNext()) {
        Tuple2<CompactorRequest, CompletableFuture<Iterable<FileSinkCommittable>>> compacting = iter.next();
        CompletableFuture<Iterable<FileSinkCommittable>> future = compacting.f1;
        if (future.isDone()) {
            iter.remove();
            // Exception is thrown if it's completed exceptionally
            for (FileSinkCommittable c : future.get()) {
                compacted.add(c);
            }
        }
    }
    if (compacted.isEmpty()) {
        return;
    }
    // A summary must be sent before all results during this checkpoint
    CommittableSummary<FileSinkCommittable> summary = new CommittableSummary<>(getRuntimeContext().getIndexOfThisSubtask(), getRuntimeContext().getNumberOfParallelSubtasks(), checkpointId, compacted.size(), compacted.size(), 0);
    output.collect(new StreamRecord<>(summary));
    for (FileSinkCommittable c : compacted) {
        CommittableWithLineage<FileSinkCommittable> comm = new CommittableWithLineage<>(c, checkpointId, getRuntimeContext().getIndexOfThisSubtask());
        output.collect(new StreamRecord<>(comm));
    }
}
Also used : CommittableSummary(org.apache.flink.streaming.api.connector.sink2.CommittableSummary) CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) ArrayList(java.util.ArrayList) CompletableFuture(java.util.concurrent.CompletableFuture) Tuple2(org.apache.flink.api.java.tuple.Tuple2) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable)
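
The comment in emitCompacted states that a summary must be sent before all results of a checkpoint. The ordering can be sketched without Flink on the classpath as shown below. Every type here (Message, Summary, WithLineage) is a hypothetical stand-in written for this illustration, not the Flink class of similar name, and the emit method returns a list instead of writing to an operator output.

```java
import java.util.ArrayList;
import java.util.List;

public class SummaryFirstSketch {

    /** Hypothetical stand-in for CommittableMessage; not the Flink interface. */
    public interface Message {}

    /** Hypothetical stand-in for CommittableSummary: announces how many committables follow. */
    public static final class Summary implements Message {
        public final int subtaskId;
        public final Long checkpointId; // nullable, as in emitCompacted
        public final int count;

        Summary(int subtaskId, Long checkpointId, int count) {
            this.subtaskId = subtaskId;
            this.checkpointId = checkpointId;
            this.count = count;
        }
    }

    /** Hypothetical stand-in for CommittableWithLineage: one committable plus its origin. */
    public static final class WithLineage<T> implements Message {
        public final T committable;
        public final Long checkpointId;
        public final int subtaskId;

        WithLineage(T committable, Long checkpointId, int subtaskId) {
            this.committable = committable;
            this.checkpointId = checkpointId;
            this.subtaskId = subtaskId;
        }
    }

    /** Returns the messages in emission order: one summary, then each committable. */
    public static <T> List<Message> emit(List<T> committables, Long checkpointId, int subtaskId) {
        List<Message> out = new ArrayList<>();
        if (committables.isEmpty()) {
            return out; // nothing to announce, mirroring the early return in emitCompacted
        }
        out.add(new Summary(subtaskId, checkpointId, committables.size()));
        for (T c : committables) {
            out.add(new WithLineage<>(c, checkpointId, subtaskId));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Message> out = emit(List.of("compacted-0", "cleanup-.0"), 2L, 0);
        System.out.println(out.size()); // prints 3
    }
}
```

The count carried by the summary presumably lets a downstream consumer know how many committables to expect for that subtask and checkpoint, which is why the summary is emitted first.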

Example 3 with CommittableWithLineage

Use of org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage in project flink by apache.

From the class CompactCoordinatorTest, method testStateHandler.

@Test
public void testStateHandler() throws Exception {
    FileCompactStrategy strategy = Builder.newBuilder().setSizeThreshold(10).build();
    CompactCoordinator coordinator = new CompactCoordinator(strategy, getTestCommittableSerializer());
    // with . prefix
    FileSinkCommittable committable0 = committable("0", ".0", 5);
    FileSinkCommittable committable1 = committable("0", ".1", 6);
    // without . prefix
    FileSinkCommittable committable2 = committable("0", "2", 6);
    OperatorSubtaskState state;
    try (OneInputStreamOperatorTestHarness<CommittableMessage<FileSinkCommittable>, CompactorRequest> harness = new OneInputStreamOperatorTestHarness<>(coordinator)) {
        harness.setup();
        harness.open();
        harness.processElement(message(committable0));
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(1);
        state = harness.snapshot(1, 1);
    }
    CompactCoordinatorStateHandler handler = new CompactCoordinatorStateHandler(getTestCommittableSerializer());
    try (OneInputStreamOperatorTestHarness<CommittableMessage<FileSinkCommittable>, Either<CommittableMessage<FileSinkCommittable>, CompactorRequest>> harness = new OneInputStreamOperatorTestHarness<>(handler)) {
        harness.setup(new EitherSerializer<>(new SimpleVersionedSerializerTypeSerializerProxy<>(() -> new CommittableMessageSerializer<>(getTestCommittableSerializer())), new SimpleVersionedSerializerTypeSerializerProxy<>(() -> new CompactorRequestSerializer(getTestCommittableSerializer()))));
        harness.initializeState(state);
        harness.open();
        Assert.assertEquals(1, harness.extractOutputValues().size());
        harness.processElement(message(committable1));
        harness.processElement(message(committable2));
        List<Either<CommittableMessage<FileSinkCommittable>, CompactorRequest>> results = harness.extractOutputValues();
        Assert.assertEquals(3, results.size());
        // restored request
        Assert.assertTrue(results.get(0).isRight());
        assertToCompact(results.get(0).right(), committable0);
        // committable with . prefix should also be passed through
        Assert.assertTrue(results.get(1).isLeft() && results.get(1).left() instanceof CommittableWithLineage);
        Assert.assertEquals(((CommittableWithLineage<FileSinkCommittable>) results.get(1).left()).getCommittable(), committable1);
        // committable without . prefix should be passed through normally
        Assert.assertTrue(results.get(2).isLeft() && results.get(2).left() instanceof CommittableWithLineage);
        Assert.assertEquals(((CommittableWithLineage<FileSinkCommittable>) results.get(2).left()).getCommittable(), committable2);
    }
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) CompactCoordinatorStateHandler(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinatorStateHandler) OneInputStreamOperatorTestHarness(org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) CompactCoordinator(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinator) OperatorSubtaskState(org.apache.flink.runtime.checkpoint.OperatorSubtaskState) SimpleVersionedSerializerTypeSerializerProxy(org.apache.flink.core.io.SimpleVersionedSerializerTypeSerializerProxy) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) Either(org.apache.flink.types.Either) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) CompactorRequestSerializer(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequestSerializer) Test(org.junit.Test)

Example 4 with CommittableWithLineage

Use of org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage in project flink by apache.

From the class SinkWriterOperator, method emitCommittables.

private void emitCommittables(Long checkpointId) throws IOException, InterruptedException {
    if (!emitDownstream) {
        // prepareCommit() must still be called on the writer,
        // although no committables are forwarded
        if (sinkWriter instanceof PrecommittingSinkWriter) {
            ((PrecommittingSinkWriter<?, ?>) sinkWriter).prepareCommit();
        }
        return;
    }
    Collection<CommT> committables = ((PrecommittingSinkWriter<?, CommT>) sinkWriter).prepareCommit();
    StreamingRuntimeContext runtimeContext = getRuntimeContext();
    int indexOfThisSubtask = runtimeContext.getIndexOfThisSubtask();
    output.collect(new StreamRecord<>(new CommittableSummary<>(indexOfThisSubtask, runtimeContext.getNumberOfParallelSubtasks(), checkpointId, committables.size(), committables.size(), 0)));
    for (CommT committable : committables) {
        output.collect(new StreamRecord<>(new CommittableWithLineage<>(committable, checkpointId, indexOfThisSubtask)));
    }
}
Also used : PrecommittingSinkWriter(org.apache.flink.api.connector.sink2.TwoPhaseCommittingSink.PrecommittingSinkWriter) CommittableSummary(org.apache.flink.streaming.api.connector.sink2.CommittableSummary) CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) StreamingRuntimeContext(org.apache.flink.streaming.api.operators.StreamingRuntimeContext)

Example 5 with CommittableWithLineage

Use of org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage in project flink by apache.

From the class CompactorOperatorTest, method testCompact.

@Test
public void testCompact() throws Exception {
    FileCompactor fileCompactor = new RecordWiseFileCompactor<>(new DecoderBasedReader.Factory<>(IntDecoder::new));
    CompactorOperator compactor = createTestOperator(fileCompactor);
    try (OneInputStreamOperatorTestHarness<CompactorRequest, CommittableMessage<FileSinkCommittable>> harness = new OneInputStreamOperatorTestHarness<>(compactor)) {
        harness.setup();
        harness.open();
        harness.processElement(request("0", Arrays.asList(committable("0", ".0", 5), committable("0", ".1", 5)), null));
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(1);
        harness.snapshot(1, 1L);
        harness.notifyOfCompletedCheckpoint(1);
        compactor.getAllTasksFuture().join();
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(2);
        // 1 summary + 1 compacted + 2 cleanup
        List<CommittableMessage<FileSinkCommittable>> results = harness.extractOutputValues();
        Assert.assertEquals(4, results.size());
        SinkV2Assertions.assertThat((CommittableSummary<?>) results.get(0)).hasPendingCommittables(3);
        SinkV2Assertions.assertThat((CommittableWithLineage<?>) results.get(1)).hasCommittable(committable("0", "compacted-0", 10));
        SinkV2Assertions.assertThat((CommittableWithLineage<?>) results.get(2)).hasCommittable(cleanupPath("0", ".0"));
        SinkV2Assertions.assertThat((CommittableWithLineage<?>) results.get(3)).hasCommittable(cleanupPath("0", ".1"));
    }
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) CompactorOperator(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator) CommittableSummary(org.apache.flink.streaming.api.connector.sink2.CommittableSummary) CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) OneInputStreamOperatorTestHarness(org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) Test(org.junit.Test)

Aggregations

CommittableWithLineage (org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) 10
CommittableSummary (org.apache.flink.streaming.api.connector.sink2.CommittableSummary) 9
CommittableMessage (org.apache.flink.streaming.api.connector.sink2.CommittableMessage) 8
FileSinkCommittable (org.apache.flink.connector.file.sink.FileSinkCommittable) 6
CompactorRequest (org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) 5
OneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) 5
Test (org.junit.Test) 5
CompactorOperator (org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator) 4
OperatorSubtaskState (org.apache.flink.runtime.checkpoint.OperatorSubtaskState) 4
Either (org.apache.flink.types.Either) 3
ArrayList (java.util.ArrayList) 2
CompletableFuture (java.util.concurrent.CompletableFuture) 2
Tuple2 (org.apache.flink.api.java.tuple.Tuple2) 2
PendingFileRecoverable (org.apache.flink.streaming.api.functions.sink.filesystem.InProgressFileWriter.PendingFileRecoverable) 2
List (java.util.List) 1
Map (java.util.Map) 1
ExecutionException (java.util.concurrent.ExecutionException) 1
Internal (org.apache.flink.annotation.Internal) 1
VisibleForTesting (org.apache.flink.annotation.VisibleForTesting) 1
CheckpointListener (org.apache.flink.api.common.state.CheckpointListener) 1