
Example 1 with SizeAwareDataInputStream

Use of org.apache.hudi.common.fs.SizeAwareDataInputStream in project hudi by apache.

From the class HoodieDeleteBlock, method getKeysToDelete:

public HoodieKey[] getKeysToDelete() {
    try {
        if (keysToDelete == null) {
            if (!getContent().isPresent() && readBlockLazily) {
                // read content from disk
                inflate();
            }
            SizeAwareDataInputStream dis = new SizeAwareDataInputStream(new DataInputStream(new ByteArrayInputStream(getContent().get())));
            int version = dis.readInt();
            int dataLength = dis.readInt();
            byte[] data = new byte[dataLength];
            dis.readFully(data);
            this.keysToDelete = SerializationUtils.<HoodieKey[]>deserialize(data);
            deflate();
        }
        return keysToDelete;
    } catch (IOException io) {
        throw new HoodieIOException("Unable to generate keys to delete from block content", io);
    }
}
Also used: ByteArrayInputStream (java.io.ByteArrayInputStream), DataInputStream (java.io.DataInputStream), FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream), SizeAwareDataInputStream (org.apache.hudi.common.fs.SizeAwareDataInputStream), HoodieKey (org.apache.hudi.common.model.HoodieKey), IOException (java.io.IOException), HoodieIOException (org.apache.hudi.exception.HoodieIOException)
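The byte layout that getKeysToDelete parses is a format-version int, a payload-length int, and then the serialized payload bytes. A minimal round-trip sketch of that layout using only plain JDK streams (LengthPrefixedCodec, its method names, and the version constant are illustrative, not part of Hudi's API; a plain DataInputStream stands in for SizeAwareDataInputStream):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LengthPrefixedCodec {

    static final int FORMAT_VERSION = 1; // illustrative version number

    // Write a block body: format version, payload length, then the payload.
    static byte[] encode(byte[] payload) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(baos);
            dos.writeInt(FORMAT_VERSION);
            dos.writeInt(payload.length);
            dos.write(payload);
            dos.flush();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mirror of the parsing in getKeysToDelete, with a plain
    // DataInputStream in place of Hudi's SizeAwareDataInputStream.
    static byte[] decode(byte[] content) {
        try {
            DataInputStream dis = new DataInputStream(new ByteArrayInputStream(content));
            int version = dis.readInt();    // format version (ignored here)
            int dataLength = dis.readInt(); // length prefix
            byte[] data = new byte[dataLength];
            dis.readFully(data);
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "key-1,key-2".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(decode(encode(payload)), payload)); // prints "true"
    }
}
```

In the real method the decoded bytes are then handed to SerializationUtils.deserialize to produce the HoodieKey array; the sketch stops at the raw payload.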

Example 2 with SizeAwareDataInputStream

Use of org.apache.hudi.common.fs.SizeAwareDataInputStream in project hudi by apache.

From the class HoodieAvroDataBlock, method getBlock:

/**
 * This method is retained to provide backwards compatibility to HoodieArchivedLogs which were written using
 * HoodieLogFormat V1.
 */
@Deprecated
public static HoodieAvroDataBlock getBlock(byte[] content, Schema readerSchema) throws IOException {
    SizeAwareDataInputStream dis = new SizeAwareDataInputStream(new DataInputStream(new ByteArrayInputStream(content)));
    // 1. Read the schema written out
    int schemaLength = dis.readInt();
    byte[] compressedSchema = new byte[schemaLength];
    dis.readFully(compressedSchema, 0, schemaLength);
    Schema writerSchema = new Schema.Parser().parse(decompress(compressedSchema));
    if (readerSchema == null) {
        readerSchema = writerSchema;
    }
    GenericDatumReader<IndexedRecord> reader = new GenericDatumReader<>(writerSchema, readerSchema);
    // 2. Get the total records
    int totalRecords = dis.readInt();
    List<IndexedRecord> records = new ArrayList<>(totalRecords);
    // 3. Read the content
    for (int i = 0; i < totalRecords; i++) {
        int recordLength = dis.readInt();
        Decoder decoder = DecoderFactory.get().binaryDecoder(content, dis.getNumberOfBytesRead(), recordLength, null);
        IndexedRecord record = reader.read(null, decoder);
        records.add(record);
        dis.skipBytes(recordLength);
    }
    dis.close();
    return new HoodieAvroDataBlock(records, readerSchema);
}
Also used: ByteArrayInputStream (java.io.ByteArrayInputStream), DataInputStream (java.io.DataInputStream), FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream), SizeAwareDataInputStream (org.apache.hudi.common.fs.SizeAwareDataInputStream), ArrayList (java.util.ArrayList), Schema (org.apache.avro.Schema), GenericDatumReader (org.apache.avro.generic.GenericDatumReader), IndexedRecord (org.apache.avro.generic.IndexedRecord), Decoder (org.apache.avro.io.Decoder), BinaryDecoder (org.apache.avro.io.BinaryDecoder)
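getBlock depends on getNumberOfBytesRead() to know the offset of each length-prefixed record inside the original byte array, so it can point the Avro binary decoder directly at that slice and then skip past it. A minimal sketch of such a position-tracking wrapper (the class name is illustrative; Hudi's SizeAwareDataInputStream plays this role):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative counting wrapper: every byte consumed through this stream
// advances a counter, so callers can index back into the source array.
public class CountingInputStream extends FilterInputStream {
    private long bytesRead = 0;

    public CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        bytesRead += skipped;
        return skipped;
    }

    public long getNumberOfBytesRead() {
        return bytesRead;
    }

    public static void main(String[] args) throws IOException {
        // A 4-byte length prefix followed by the record bytes, as in getBlock.
        byte[] content = {0, 0, 0, 2, 42, 43};
        CountingInputStream cis = new CountingInputStream(new ByteArrayInputStream(content));
        DataInputStream dis = new DataInputStream(cis);
        int recordLength = dis.readInt();          // length prefix: 2
        long offset = cis.getNumberOfBytesRead();  // record body starts at offset 4
        System.out.println(recordLength + " @ " + offset); // prints "2 @ 4"
    }
}
```

With this offset in hand, getBlock can call DecoderFactory.get().binaryDecoder(content, offset, recordLength, null) to decode the record in place, then skipBytes(recordLength) to advance the stream to the next length prefix.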

Aggregations

ByteArrayInputStream (java.io.ByteArrayInputStream): 2
DataInputStream (java.io.DataInputStream): 2
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 2
SizeAwareDataInputStream (org.apache.hudi.common.fs.SizeAwareDataInputStream): 2
IOException (java.io.IOException): 1
ArrayList (java.util.ArrayList): 1
Schema (org.apache.avro.Schema): 1
GenericDatumReader (org.apache.avro.generic.GenericDatumReader): 1
IndexedRecord (org.apache.avro.generic.IndexedRecord): 1
BinaryDecoder (org.apache.avro.io.BinaryDecoder): 1
Decoder (org.apache.avro.io.Decoder): 1
HoodieKey (org.apache.hudi.common.model.HoodieKey): 1
HoodieIOException (org.apache.hudi.exception.HoodieIOException): 1