Example 1 with CommittableMessage

use of org.apache.flink.streaming.api.connector.sink2.CommittableMessage in project flink by apache.

the class CompactorOperatorStateHandler method drain.

private void drain() throws ExecutionException, InterruptedException {
    checkState(holdingSummary != null);
    checkState(
            holdingSummary.getNumberOfPendingCommittables() == holdingSummary.getNumberOfCommittables()
                    && holdingSummary.getNumberOfCommittables()
                            == holdingMessages.size() + compactingMessages.size());
    Long checkpointId = holdingSummary.getCheckpointId().isPresent() ? holdingSummary.getCheckpointId().getAsLong() : null;
    int subtaskId = holdingSummary.getSubtaskId();
    if (!compactingRequests.isEmpty()) {
        CompletableFuture.allOf(compactingRequests.stream().map(r -> r.f1).toArray(CompletableFuture[]::new)).join();
        for (Tuple2<CompactorRequest, CompletableFuture<Iterable<FileSinkCommittable>>> compacting : compactingRequests) {
            CompletableFuture<Iterable<FileSinkCommittable>> future = compacting.f1;
            checkState(future.isDone());
            // get() throws an ExecutionException if the future completed exceptionally
            for (FileSinkCommittable c : future.get()) {
                holdingMessages.add(new CommittableWithLineage<>(c, checkpointId, subtaskId));
            }
        }
    }
    // Append the compacted committables to the holding summary by rebuilding it with the updated counts
    CommittableSummary<FileSinkCommittable> summary =
            new CommittableSummary<>(
                    subtaskId,
                    holdingSummary.getNumberOfSubtasks(),
                    checkpointId,
                    holdingMessages.size(),
                    holdingMessages.size(),
                    holdingSummary.getNumberOfFailedCommittables());
    output.collect(new StreamRecord<>(summary));
    for (CommittableMessage<FileSinkCommittable> committable : holdingMessages) {
        output.collect(new StreamRecord<>(committable));
    }
    // Remaining requests should be all done and their results are all emitted.
    // From now on the operator is stateless.
    remainingRequestsState.clear();
    compactingRequests.clear();
    compactingMessages.clear();
    holdingSummary = null;
    holdingMessages = null;
    if (writerStateDrained) {
        // We can pass through everything if the writer state is also drained.
        stateDrained = true;
        compactService.close();
        compactService = null;
    }
}
Also used : CommittableWithLineage(org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Either(org.apache.flink.types.Either) CompletableFuture(java.util.concurrent.CompletableFuture) ArrayList(java.util.ArrayList) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) CheckpointListener(org.apache.flink.api.common.state.CheckpointListener) Map(java.util.Map) PendingFileRecoverable(org.apache.flink.streaming.api.functions.sink.filesystem.InProgressFileWriter.PendingFileRecoverable) RemainingRequestsSerializer(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator.RemainingRequestsSerializer) Preconditions.checkState(org.apache.flink.util.Preconditions.checkState) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) BucketWriter(org.apache.flink.streaming.api.functions.sink.filesystem.BucketWriter) AbstractStreamOperator(org.apache.flink.streaming.api.operators.AbstractStreamOperator) BoundedOneInput(org.apache.flink.streaming.api.operators.BoundedOneInput) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting) ExecutionException(java.util.concurrent.ExecutionException) CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) CommittableSummary(org.apache.flink.streaming.api.connector.sink2.CommittableSummary) List(java.util.List) REMAINING_REQUESTS_RAW_STATES_DESC(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator.REMAINING_REQUESTS_RAW_STATES_DESC) SimpleVersionedSerializer(org.apache.flink.core.io.SimpleVersionedSerializer) SimpleVersionedListState(org.apache.flink.streaming.api.operators.util.SimpleVersionedListState) Internal(org.apache.flink.annotation.Internal) OneInputStreamOperator(org.apache.flink.streaming.api.operators.OneInputStreamOperator) FileCompactor(org.apache.flink.connector.file.sink.compactor.FileCompactor) 
IdenticalFileCompactor(org.apache.flink.connector.file.sink.compactor.IdenticalFileCompactor) StateInitializationContext(org.apache.flink.runtime.state.StateInitializationContext)
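
The wait-then-collect pattern in drain can be tried in isolation with the JDK alone. The following is a minimal sketch, not Flink code: the class DrainSketch and the method drainAll are hypothetical names, and plain strings stand in for FileSinkCommittable. It also shows one detail worth knowing when reading the method above: CompletableFuture.allOf(...).join() already rethrows a failure of any future as an unchecked CompletionException, so the per-future result call only runs when all futures succeeded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper, not part of Flink: waits for pending futures and flattens their results. */
final class DrainSketch {

    /** Blocks until every future is done, then gathers all results in request order. */
    static <T> List<T> drainAll(List<CompletableFuture<List<T>>> pending) {
        // allOf completes once every given future has completed, normally or exceptionally.
        // join() rethrows a failure as an unchecked CompletionException.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();

        List<T> results = new ArrayList<>();
        for (CompletableFuture<List<T>> future : pending) {
            // Every future is already done here, so join() returns without blocking.
            // (The method above uses get(), which reports failures as a checked ExecutionException.)
            results.addAll(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        List<CompletableFuture<List<String>>> pending = new ArrayList<>();
        pending.add(CompletableFuture.supplyAsync(() -> List.of("part-0", "part-1")));
        pending.add(CompletableFuture.completedFuture(List.of("part-2")));
        System.out.println(drainAll(pending));
    }
}
```

Results come back in the order of the request list, not in completion order, because the loop walks the list after the barrier.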

Example 2 with CommittableMessage

the class FileSink method addPreCommitTopology.

@Override
public DataStream<CommittableMessage<FileSinkCommittable>> addPreCommitTopology(DataStream<CommittableMessage<FileSinkCommittable>> committableStream) {
    FileCompactStrategy strategy = bucketsBuilder.getCompactStrategy();
    if (strategy == null) {
        // Compaction is not enabled. Handlers are added to process the remaining states of the
        // compact coordinator and the compactor operators.
        SingleOutputStreamOperator<Either<CommittableMessage<FileSinkCommittable>, CompactorRequest>> coordinatorOp =
                committableStream
                        .forward()
                        .transform(
                                "CompactorCoordinator",
                                new EitherTypeInfo<>(
                                        committableStream.getType(),
                                        new CompactorRequestTypeInfo(bucketsBuilder::getCommittableSerializer)),
                                new CompactCoordinatorStateHandlerFactory(bucketsBuilder::getCommittableSerializer))
                        .setParallelism(committableStream.getParallelism())
                        .uid("FileSinkCompactorCoordinator");
        return coordinatorOp
                .forward()
                .transform(
                        "CompactorOperator",
                        committableStream.getType(),
                        new CompactorOperatorStateHandlerFactory(
                                bucketsBuilder::getCommittableSerializer, bucketsBuilder::createBucketWriter))
                .setParallelism(committableStream.getParallelism())
                .uid("FileSinkCompactorOperator");
    }
    // An explicit rebalance is required here; otherwise the partitioner would be forward, which
    // is in fact the partitioner from the writers to the committers.
    SingleOutputStreamOperator<CompactorRequest> coordinatorOp =
            committableStream
                    .rebalance()
                    .transform(
                            "CompactorCoordinator",
                            new CompactorRequestTypeInfo(bucketsBuilder::getCommittableSerializer),
                            new CompactCoordinatorFactory(strategy, bucketsBuilder::getCommittableSerializer))
                    .setParallelism(1)
                    .uid("FileSinkCompactorCoordinator");
    // The parallelism of the compactors is not configurable at present, since it must be
    // identical to that of the committers. Otherwise the committable summary and the
    // committables may be distributed to different committers, which will cause a failure.
    TypeInformation<CommittableMessage<FileSinkCommittable>> committableType = committableStream.getType();
    return coordinatorOp
            .transform(
                    "CompactorOperator",
                    committableType,
                    new CompactorOperatorFactory(
                            strategy,
                            bucketsBuilder.getFileCompactor(),
                            bucketsBuilder::getCommittableSerializer,
                            bucketsBuilder::createBucketWriter))
            .setParallelism(committableStream.getParallelism())
            .uid("FileSinkCompactorOperator");
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) CompactorOperatorFactory(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperatorFactory) CompactCoordinatorFactory(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinatorFactory) FileCompactStrategy(org.apache.flink.connector.file.sink.compactor.FileCompactStrategy) CompactorOperatorStateHandlerFactory(org.apache.flink.connector.file.sink.compactor.operator.CompactorOperatorStateHandlerFactory) CompactorRequestTypeInfo(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequestTypeInfo) CompactCoordinatorStateHandlerFactory(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinatorStateHandlerFactory) Either(org.apache.flink.types.Either) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest)

Example 3 with CommittableMessage

the class CompactCoordinatorTest method testSizeThreshold.

@Test
public void testSizeThreshold() throws Exception {
    FileCompactStrategy strategy = Builder.newBuilder().setSizeThreshold(10).build();
    CompactCoordinator coordinator = new CompactCoordinator(strategy, getTestCommittableSerializer());
    try (OneInputStreamOperatorTestHarness<CommittableMessage<FileSinkCommittable>, CompactorRequest> harness = new OneInputStreamOperatorTestHarness<>(coordinator)) {
        harness.setup();
        harness.open();
        FileSinkCommittable committable0 = committable("0", ".0", 5);
        FileSinkCommittable committable1 = committable("0", ".1", 6);
        harness.processElement(message(committable0));
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.processElement(message(committable1));
        List<CompactorRequest> results = harness.extractOutputValues();
        Assert.assertEquals(1, results.size());
        assertToCompact(results.get(0), committable0, committable1);
        harness.processElement(message(committable("0", ".2", 5)));
        harness.processElement(message(committable("1", ".0", 5)));
        Assert.assertEquals(1, harness.extractOutputValues().size());
    }
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) OneInputStreamOperatorTestHarness(org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) CompactCoordinator(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinator) Test(org.junit.Test)
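
The behaviour this test checks can be modelled without Flink. The sketch below is a rough approximation under stated assumptions, not the CompactCoordinator implementation: SizeThresholdSketch and its add method are hypothetical names, files are plain strings, sizes are accumulated per bucket, and a batch is released once the bucket total reaches the threshold (whether the real trigger compares with >= or > is not visible in the test and is assumed here).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of a size-based compaction trigger; not Flink code. */
final class SizeThresholdSketch {
    private final long sizeThreshold;
    private final Map<String, List<String>> pendingFiles = new HashMap<>();
    private final Map<String, Long> pendingSizes = new HashMap<>();

    SizeThresholdSketch(long sizeThreshold) {
        this.sizeThreshold = sizeThreshold;
    }

    /** Buffers a file and returns the bucket's batch once its total size reaches the threshold. */
    Optional<List<String>> add(String bucketId, String file, long size) {
        pendingFiles.computeIfAbsent(bucketId, k -> new ArrayList<>()).add(file);
        long total = pendingSizes.merge(bucketId, size, Long::sum);
        if (total < sizeThreshold) {
            // Assumption: files below the threshold stay buffered for this bucket.
            return Optional.empty();
        }
        // Release the batch and reset the bucket so later files start a new batch.
        pendingSizes.remove(bucketId);
        return Optional.of(pendingFiles.remove(bucketId));
    }

    public static void main(String[] args) {
        SizeThresholdSketch sketch = new SizeThresholdSketch(10);
        System.out.println(sketch.add("0", ".0", 5)); // below the threshold, nothing released
        System.out.println(sketch.add("0", ".1", 6)); // bucket "0" reaches 11, batch released
        System.out.println(sketch.add("0", ".2", 5)); // new batch for bucket "0", still below
        System.out.println(sketch.add("1", ".0", 5)); // bucket "1" is tracked separately
    }
}
```

The calls in main mirror the test: the first two files in bucket "0" form one request, while the later files in buckets "0" and "1" stay buffered because neither bucket reaches the threshold on its own.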

Example 4 with CommittableMessage

the class CompactCoordinatorTest method testCompactOverMultipleCheckpoints.

@Test
public void testCompactOverMultipleCheckpoints() throws Exception {
    FileCompactStrategy strategy = Builder.newBuilder().enableCompactionOnCheckpoint(3).build();
    CompactCoordinator coordinator = new CompactCoordinator(strategy, getTestCommittableSerializer());
    try (OneInputStreamOperatorTestHarness<CommittableMessage<FileSinkCommittable>, CompactorRequest> harness = new OneInputStreamOperatorTestHarness<>(coordinator)) {
        harness.setup();
        harness.open();
        FileSinkCommittable committable0 = committable("0", ".0", 5);
        FileSinkCommittable committable1 = committable("0", ".1", 6);
        harness.processElement(message(committable0));
        harness.processElement(message(committable1));
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(1);
        harness.snapshot(1, 1);
        harness.prepareSnapshotPreBarrier(2);
        harness.snapshot(2, 2);
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(3);
        harness.snapshot(3, 3);
        List<CompactorRequest> results = harness.extractOutputValues();
        Assert.assertEquals(1, results.size());
        assertToCompact(results.get(0), committable0, committable1);
    }
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) OneInputStreamOperatorTestHarness(org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) CompactCoordinator(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinator) Test(org.junit.Test)
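
The checkpoint-count trigger exercised by this test can be approximated in plain Java as well. This is only a sketch under assumptions: CheckpointCountSketch, add and onCheckpoint are hypothetical names, files are strings, buckets are ignored, and the counter is assumed to reset after each released batch, which the test does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a trigger that releases a batch every N checkpoints; not Flink code. */
final class CheckpointCountSketch {
    private final int checkpointsBeforeCompact;
    private final List<String> pending = new ArrayList<>();
    private int checkpointsSinceLastCompact;

    CheckpointCountSketch(int checkpointsBeforeCompact) {
        this.checkpointsBeforeCompact = checkpointsBeforeCompact;
    }

    /** Buffers a file until enough checkpoints have passed. */
    void add(String file) {
        pending.add(file);
    }

    /** Called once per checkpoint; returns the buffered files on every N-th call, else an empty list. */
    List<String> onCheckpoint() {
        checkpointsSinceLastCompact++;
        if (checkpointsSinceLastCompact < checkpointsBeforeCompact) {
            return List.of();
        }
        // Assumption: the counter restarts after a batch has been released.
        checkpointsSinceLastCompact = 0;
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    public static void main(String[] args) {
        CheckpointCountSketch sketch = new CheckpointCountSketch(3);
        sketch.add(".0");
        sketch.add(".1");
        System.out.println(sketch.onCheckpoint()); // checkpoint 1
        System.out.println(sketch.onCheckpoint()); // checkpoint 2
        System.out.println(sketch.onCheckpoint()); // checkpoint 3 releases both files
    }
}
```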

Example 5 with CommittableMessage

the class CompactCoordinatorTest method testCompactOnEndOfInput.

@Test
public void testCompactOnEndOfInput() throws Exception {
    FileCompactStrategy strategy = Builder.newBuilder().setSizeThreshold(10).build();
    CompactCoordinator coordinator = new CompactCoordinator(strategy, getTestCommittableSerializer());
    try (OneInputStreamOperatorTestHarness<CommittableMessage<FileSinkCommittable>, CompactorRequest> harness = new OneInputStreamOperatorTestHarness<>(coordinator)) {
        harness.setup();
        harness.open();
        FileSinkCommittable committable0 = committable("0", ".0", 5);
        harness.processElement(message(committable0));
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.prepareSnapshotPreBarrier(1);
        harness.snapshot(1, 1);
        Assert.assertEquals(0, harness.extractOutputValues().size());
        harness.endInput();
        List<CompactorRequest> results = harness.extractOutputValues();
        Assert.assertEquals(1, results.size());
        assertToCompact(results.get(0), committable0);
    }
}
Also used : CommittableMessage(org.apache.flink.streaming.api.connector.sink2.CommittableMessage) FileSinkCommittable(org.apache.flink.connector.file.sink.FileSinkCommittable) CompactorRequest(org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest) OneInputStreamOperatorTestHarness(org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness) CompactCoordinator(org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinator) Test(org.junit.Test)

Aggregations

CommittableMessage (org.apache.flink.streaming.api.connector.sink2.CommittableMessage): 19
OneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness): 14
CompactorRequest (org.apache.flink.connector.file.sink.compactor.operator.CompactorRequest): 13
FileSinkCommittable (org.apache.flink.connector.file.sink.FileSinkCommittable): 12
Test (org.junit.Test): 12
CompactCoordinator (org.apache.flink.connector.file.sink.compactor.operator.CompactCoordinator): 8
OperatorSubtaskState (org.apache.flink.runtime.checkpoint.OperatorSubtaskState): 8
CommittableSummary (org.apache.flink.streaming.api.connector.sink2.CommittableSummary): 8
CommittableWithLineage (org.apache.flink.streaming.api.connector.sink2.CommittableWithLineage): 8
CompactorOperator (org.apache.flink.connector.file.sink.compactor.operator.CompactorOperator): 4
Either (org.apache.flink.types.Either): 4
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest): 3
PendingFileRecoverable (org.apache.flink.streaming.api.functions.sink.filesystem.InProgressFileWriter.PendingFileRecoverable): 2
Test (org.junit.jupiter.api.Test): 2
ArrayList (java.util.ArrayList): 1
List (java.util.List): 1
Map (java.util.Map): 1
CompletableFuture (java.util.concurrent.CompletableFuture): 1
ExecutionException (java.util.concurrent.ExecutionException): 1
Internal (org.apache.flink.annotation.Internal): 1