
Example 11 with CheckpointedPosition

Use of org.apache.flink.connector.file.src.util.CheckpointedPosition in the Apache Flink project.

From class FileSourceSplitSerializer, method deserializeV1.

private static FileSourceSplit deserializeV1(byte[] serialized) throws IOException {
    final DataInputDeserializer in = new DataInputDeserializer(serialized);
    final String id = in.readUTF();
    final Path path = new Path();
    path.read(in);
    final long offset = in.readLong();
    final long len = in.readLong();
    final long modificationTime = in.readLong();
    final long fileSize = in.readLong();
    final String[] hosts = readStringArray(in);
    // a leading boolean flags whether a checkpointed reader position (two longs) follows
    final CheckpointedPosition readerPosition =
            in.readBoolean() ? new CheckpointedPosition(in.readLong(), in.readLong()) : null;
    // instantiate a new split and cache the serialized form
    return new FileSourceSplit(id, path, offset, len, modificationTime, fileSize, hosts, readerPosition, serialized);
}
Also used: Path (org.apache.flink.core.fs.Path), CheckpointedPosition (org.apache.flink.connector.file.src.util.CheckpointedPosition), DataInputDeserializer (org.apache.flink.core.memory.DataInputDeserializer)
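
The optional reader position above is encoded as a presence flag followed by two longs. The following is a minimal sketch of that pattern only, using plain java.io streams instead of Flink's DataInputDeserializer and DataOutputSerializer. PositionSketch and OptionalFieldCodec are hypothetical names introduced for illustration and are not part of Flink.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Optional;

// Hypothetical stand-in for CheckpointedPosition: only the two long fields.
final class PositionSketch {
    final long offset;
    final long recordsAfterOffset;

    PositionSketch(long offset, long recordsAfterOffset) {
        this.offset = offset;
        this.recordsAfterOffset = recordsAfterOffset;
    }
}

// Hypothetical codec showing the "boolean flag, then payload" layout.
final class OptionalFieldCodec {

    // Writes the presence flag, then the two longs only if a position exists.
    static byte[] write(Optional<PositionSketch> position) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(position.isPresent());
            if (position.isPresent()) {
                out.writeLong(position.get().offset);
                out.writeLong(position.get().recordsAfterOffset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the flag first; the longs are consumed only when the flag is true.
    static Optional<PositionSketch> read(byte[] serialized) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(serialized))) {
            if (!in.readBoolean()) {
                return Optional.empty();
            }
            long offset = in.readLong();
            long recordsAfterOffset = in.readLong();
            return Optional.of(new PositionSketch(offset, recordsAfterOffset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The reader has to consume fields in exactly the order the writer produced them, which is why the flag is read before either long.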

Example 12 with CheckpointedPosition

Use of org.apache.flink.connector.file.src.util.CheckpointedPosition in the Apache Flink project.

From class FileSourceSplitSerializer, method serialize.

@Override
public byte[] serialize(FileSourceSplit split) throws IOException {
    checkArgument(split.getClass() == FileSourceSplit.class, "Cannot serialize subclasses of FileSourceSplit");
    // optimization: the splits lazily cache their own serialized form
    if (split.serializedFormCache != null) {
        return split.serializedFormCache;
    }
    final DataOutputSerializer out = SERIALIZER_CACHE.get();
    out.writeUTF(split.splitId());
    split.path().write(out);
    out.writeLong(split.offset());
    out.writeLong(split.length());
    out.writeLong(split.fileModificationTime());
    out.writeLong(split.fileSize());
    writeStringArray(out, split.hostnames());
    final Optional<CheckpointedPosition> readerPosition = split.getReaderPosition();
    out.writeBoolean(readerPosition.isPresent());
    if (readerPosition.isPresent()) {
        out.writeLong(readerPosition.get().getOffset());
        out.writeLong(readerPosition.get().getRecordsAfterOffset());
    }
    final byte[] result = out.getCopyOfBuffer();
    out.clear();
    // optimization: cache the serialized form, so we avoid the byte work during repeated
    // serialization
    split.serializedFormCache = result;
    return result;
}
Also used: DataOutputSerializer (org.apache.flink.core.memory.DataOutputSerializer), CheckpointedPosition (org.apache.flink.connector.file.src.util.CheckpointedPosition)
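
The serialize method above stores its result on the split so that later calls return the same bytes without re-encoding. A minimal sketch of that caching pattern follows, reduced to a single string field. CachedSplitSketch and CachingSerializerSketch are hypothetical names, and the encodeCalls counter exists only to make the effect observable.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical split with one immutable field and a lazily filled cache slot.
final class CachedSplitSketch {
    final String id;
    byte[] serializedFormCache; // null until the first serialization

    CachedSplitSketch(String id) {
        this.id = id;
    }
}

// Hypothetical serializer that reuses the cached bytes when they exist.
final class CachingSerializerSketch {
    int encodeCalls = 0; // counts how often the real encoding work ran

    byte[] serialize(CachedSplitSketch split) {
        // fast path: an earlier call already produced the bytes
        if (split.serializedFormCache != null) {
            return split.serializedFormCache;
        }
        encodeCalls++;
        byte[] result = split.id.getBytes(StandardCharsets.UTF_8);
        // remember the bytes on the split itself for later calls
        split.serializedFormCache = result;
        return result;
    }
}
```

This kind of cache stays correct only while the serialized fields never change after the first call, and while callers treat the returned array as read-only, since every call hands out the same array instance.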

Aggregations

CheckpointedPosition (org.apache.flink.connector.file.src.util.CheckpointedPosition) 12
Path (org.apache.flink.core.fs.Path) 5
Test (org.junit.Test) 4
FileSourceSplit (org.apache.flink.connector.file.src.FileSourceSplit) 3
IOException (java.io.IOException) 2
Configuration (org.apache.flink.configuration.Configuration) 2
BulkFormat (org.apache.flink.connector.file.src.reader.BulkFormat) 2
RowData (org.apache.flink.table.data.RowData) 2
AtomicInteger (java.util.concurrent.atomic.AtomicInteger) 1
AtomicReference (java.util.concurrent.atomic.AtomicReference) 1
FileRecordFormat (org.apache.flink.connector.file.src.reader.FileRecordFormat) 1
StreamFormat (org.apache.flink.connector.file.src.reader.StreamFormat) 1
TestingFileSystem (org.apache.flink.connector.file.src.testutils.TestingFileSystem) 1
FileStatus (org.apache.flink.core.fs.FileStatus) 1
DataInputDeserializer (org.apache.flink.core.memory.DataInputDeserializer) 1
DataOutputSerializer (org.apache.flink.core.memory.DataOutputSerializer) 1
GenericRowData (org.apache.flink.table.data.GenericRowData) 1
BigIntType (org.apache.flink.table.types.logical.BigIntType) 1
BooleanType (org.apache.flink.table.types.logical.BooleanType) 1
DecimalType (org.apache.flink.table.types.logical.DecimalType) 1