Example 21 with RecoverableFsDataOutputStream

Use of org.apache.flink.core.fs.RecoverableFsDataOutputStream in the Apache Flink project.

Class HadoopS3RecoverableWriterITCase, method testCleanupRecoverableState.

@Test(expected = FileNotFoundException.class)
public void testCleanupRecoverableState() throws Exception {
    final RecoverableWriter writer = getRecoverableWriter();
    final Path path = new Path(basePathForTest, "part-0");
    final RecoverableFsDataOutputStream stream = writer.open(path);
    stream.write(bytesOf(testData1));
    S3Recoverable recoverable = (S3Recoverable) stream.persist();
    stream.closeForCommit().commit();
    // The data is still readable because the temporary object has not been deleted yet.
    final String content = getContentsOfFile(new Path('/' + recoverable.incompleteObjectName()));
    Assert.assertEquals(testData1, content);
    boolean successfullyDeletedState = writer.cleanupRecoverableState(recoverable);
    Assert.assertTrue(successfullyDeletedState);
    int retryTimes = 10;
    final long delayMs = 1000;
    // Retry several times to verify that the file is eventually deleted.
    while (retryTimes > 0) {
        // This read should throw FileNotFoundException because the file was deleted.
        getContentsOfFile(new Path('/' + recoverable.incompleteObjectName()));
        retryTimes--;
        Thread.sleep(delayMs);
    }
}
Also used : Path(org.apache.flink.core.fs.Path) RecoverableWriter(org.apache.flink.core.fs.RecoverableWriter) RecoverableFsDataOutputStream(org.apache.flink.core.fs.RecoverableFsDataOutputStream) S3Recoverable(org.apache.flink.fs.s3.common.writer.S3Recoverable) Test(org.junit.Test)
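
The retry loop in the example above relies on a read eventually throwing once the deletion takes effect. A minimal standalone sketch of the same polling idea, using only java.nio.file on a local temp file rather than S3 or any Flink API, might look like the following. The class name DeletionPoller and the method awaitDeletion are hypothetical names introduced for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class DeletionPoller {

    /**
     * Hypothetical helper: repeatedly tries to read the file and reports
     * whether a read failed with NoSuchFileException within maxAttempts.
     */
    static boolean awaitDeletion(Path file, int maxAttempts, long delayMs)
            throws IOException, InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                Files.readAllBytes(file); // succeeds while the file still exists
            } catch (NoSuchFileException e) {
                return true; // the deletion is now observable
            }
            Thread.sleep(delayMs); // wait before the next attempt
        }
        return false; // the file was still readable on every attempt
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("part-0", ".tmp");
        Files.delete(tmp);
        System.out.println(awaitDeletion(tmp, 10, 100));
    }
}
```

Unlike the JUnit test, which lets the exception propagate to satisfy @Test(expected = ...), this sketch catches it and returns a boolean, so the outcome can be checked with an ordinary assertion.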

Example 22 with RecoverableFsDataOutputStream

Use of org.apache.flink.core.fs.RecoverableFsDataOutputStream in the Apache Flink project.

Class HadoopS3RecoverableWriterITCase, method testCommitAfterNormalClose.

@Test
public void testCommitAfterNormalClose() throws Exception {
    final RecoverableWriter writer = getRecoverableWriter();
    final Path path = new Path(basePathForTest, "part-0");
    final RecoverableFsDataOutputStream stream = writer.open(path);
    stream.write(bytesOf(testData1));
    stream.closeForCommit().commit();
    Assert.assertEquals(testData1, getContentsOfFile(path));
}
Also used : Path(org.apache.flink.core.fs.Path) RecoverableWriter(org.apache.flink.core.fs.RecoverableWriter) RecoverableFsDataOutputStream(org.apache.flink.core.fs.RecoverableFsDataOutputStream) Test(org.junit.Test)

Example 23 with RecoverableFsDataOutputStream

Use of org.apache.flink.core.fs.RecoverableFsDataOutputStream in the Apache Flink project.

Class HadoopS3RecoverableWriterITCase, method testResumeAfterMultiplePersist.

private void testResumeAfterMultiplePersist(final String persistName, final String expectedFinalContents, final String firstItemToWrite, final String secondItemToWrite, final String thirdItemToWrite) throws Exception {
    final Path path = new Path(basePathForTest, "part-0");
    final RecoverableWriter initWriter = getRecoverableWriter();
    final Map<String, RecoverableWriter.ResumeRecoverable> recoverables = new HashMap<>(4);
    try (final RecoverableFsDataOutputStream stream = initWriter.open(path)) {
        recoverables.put(INIT_EMPTY_PERSIST, stream.persist());
        stream.write(bytesOf(firstItemToWrite));
        recoverables.put(INTERM_WITH_STATE_PERSIST, stream.persist());
        recoverables.put(INTERM_WITH_NO_ADDITIONAL_STATE_PERSIST, stream.persist());
        // and write some more data
        stream.write(bytesOf(secondItemToWrite));
        recoverables.put(FINAL_WITH_EXTRA_STATE, stream.persist());
    }
    final SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> serializer = initWriter.getResumeRecoverableSerializer();
    final byte[] serializedRecoverable = serializer.serialize(recoverables.get(persistName));
    // Get a new serializer from a new writer to make sure that no pre-initialized state leaks in.
    final RecoverableWriter newWriter = getRecoverableWriter();
    final SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> deserializer = newWriter.getResumeRecoverableSerializer();
    final RecoverableWriter.ResumeRecoverable recoveredRecoverable = deserializer.deserialize(serializer.getVersion(), serializedRecoverable);
    final RecoverableFsDataOutputStream recoveredStream = newWriter.recover(recoveredRecoverable);
    recoveredStream.write(bytesOf(thirdItemToWrite));
    recoveredStream.closeForCommit().commit();
    Assert.assertEquals(expectedFinalContents, getContentsOfFile(path));
}
Also used : Path(org.apache.flink.core.fs.Path) RecoverableWriter(org.apache.flink.core.fs.RecoverableWriter) HashMap(java.util.HashMap) RecoverableFsDataOutputStream(org.apache.flink.core.fs.RecoverableFsDataOutputStream)
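
The expected final contents in the example above depend on which persist point is resumed. The toy model below is a sketch of that idea only: it assumes a resume point can be represented as a byte offset and that resuming discards everything written after it. ResumeSketch, ToyStream, and Checkpoint are hypothetical in-memory stand-ins, not Flink classes, and real writers (for example the S3 one) track considerably more state.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ResumeSketch {

    /** Hypothetical resume point: only remembers how many bytes were written. */
    static final class Checkpoint {
        final int offset;
        Checkpoint(int offset) { this.offset = offset; }
    }

    /** Toy in-memory stream (an assumption for illustration, not Flink's stream). */
    static final class ToyStream {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        void write(String s) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            buffer.write(bytes, 0, bytes.length);
        }

        Checkpoint persist() {
            return new Checkpoint(buffer.size());
        }

        /** Resuming keeps only the bytes written before the checkpoint. */
        static ToyStream recover(ToyStream old, Checkpoint cp) {
            ToyStream resumed = new ToyStream();
            byte[] kept = Arrays.copyOf(old.buffer.toByteArray(), cp.offset);
            resumed.buffer.write(kept, 0, kept.length);
            return resumed;
        }

        String contents() {
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Mirrors the four persist points of the test, then resumes from one of them. */
    static String resumeFrom(int index) {
        ToyStream stream = new ToyStream();
        Checkpoint[] points = new Checkpoint[4];
        points[0] = stream.persist(); // nothing written yet
        stream.write("one-");
        points[1] = stream.persist(); // after the first item
        points[2] = stream.persist(); // no additional data since points[1]
        stream.write("two-");
        points[3] = stream.persist(); // after the second item
        ToyStream resumed = ToyStream.recover(stream, points[index]);
        resumed.write("three");
        return resumed.contents();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            System.out.println(i + " -> " + resumeFrom(i));
        }
    }
}
```

In this model, resuming from the empty persist point yields only the third item, both intermediate points yield the first and third items, and the final point yields all three.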

Example 24 with RecoverableFsDataOutputStream

Use of org.apache.flink.core.fs.RecoverableFsDataOutputStream in the Apache Flink project.

Class FileSinkCommittableSerializerMigrationTest, method prepareDeserializationInProgressToCleanup.

@Test
@Ignore
public void prepareDeserializationInProgressToCleanup() throws IOException {
    String scenario = "in-progress";
    java.nio.file.Path path = resolveVersionPath(CURRENT_VERSION, scenario);
    BucketWriter<String, String> bucketWriter = createBucketWriter();
    RecoverableWriter writer = FileSystem.getLocalFileSystem().createRecoverableWriter();
    FileSinkCommittableSerializer serializer = new FileSinkCommittableSerializer(bucketWriter.getProperties().getPendingFileRecoverableSerializer(), bucketWriter.getProperties().getInProgressFileRecoverableSerializer());
    RecoverableFsDataOutputStream outputStream = writer.open(new Path(path.resolve("content").toString()));
    outputStream.write(IN_PROGRESS_CONTENT.getBytes(StandardCharsets.UTF_8));
    ResumeRecoverable resumeRecoverable = outputStream.persist();
    OutputStreamBasedInProgressFileRecoverable recoverable = new OutputStreamBasedInProgressFileRecoverable(resumeRecoverable);
    FileSinkCommittable committable = new FileSinkCommittable("0", recoverable);
    byte[] bytes = serializer.serialize(committable);
    Files.write(path.resolve("committable"), bytes);
}
Also used : Path(org.apache.flink.core.fs.Path) RecoverableWriter(org.apache.flink.core.fs.RecoverableWriter) RecoverableFsDataOutputStream(org.apache.flink.core.fs.RecoverableFsDataOutputStream) OutputStreamBasedInProgressFileRecoverable(org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.OutputStreamBasedInProgressFileRecoverable) ResumeRecoverable(org.apache.flink.core.fs.RecoverableWriter.ResumeRecoverable) Ignore(org.junit.Ignore) Test(org.junit.Test)

Aggregations

RecoverableFsDataOutputStream (org.apache.flink.core.fs.RecoverableFsDataOutputStream): 24
RecoverableWriter (org.apache.flink.core.fs.RecoverableWriter): 21
Test (org.junit.Test): 20
Path (org.apache.flink.core.fs.Path): 17
Ignore (org.junit.Ignore): 4
ResumeRecoverable (org.apache.flink.core.fs.RecoverableWriter.ResumeRecoverable): 3
MockBlobStorage (org.apache.flink.fs.gs.storage.MockBlobStorage): 3
GSRecoverableWriter (org.apache.flink.fs.gs.writer.GSRecoverableWriter): 3
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 2
CommitRecoverable (org.apache.flink.core.fs.RecoverableWriter.CommitRecoverable): 2
S3Recoverable (org.apache.flink.fs.s3.common.writer.S3Recoverable): 2
OutputStreamBasedInProgressFileRecoverable (org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.OutputStreamBasedInProgressFileRecoverable): 2
OutputStreamBasedPendingFileRecoverable (org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.OutputStreamBasedPendingFileRecoverable): 2
IOException (java.io.IOException): 1
HashMap (java.util.HashMap): 1
FSDataInputStream (org.apache.flink.core.fs.FSDataInputStream): 1
GSBlobIdentifier (org.apache.flink.fs.gs.storage.GSBlobIdentifier): 1
OutputStreamBasedInProgressFileRecoverableSerializer (org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.OutputStreamBasedInProgressFileRecoverableSerializer): 1
OutputStreamBasedPendingFileRecoverableSerializer (org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.OutputStreamBasedPendingFileRecoverableSerializer): 1
Path (org.apache.hadoop.fs.Path): 1