Example 6 with HoodieHFileReader

Use of org.apache.hudi.io.storage.HoodieHFileReader in the Apache Hudi project.

From the class TestHoodieBackedTableMetadata, method verifyMetadataRecordKeyExcludeFromPayloadBaseFiles.

/**
 * Verify the records persisted in the metadata table base files. As asserted below, each
 * record read back from the base file must not carry the record key meta field, and the
 * key field in its payload must be non-empty.
 *
 * @param table - Metadata table
 */
private void verifyMetadataRecordKeyExcludeFromPayloadBaseFiles(HoodieTable table) throws IOException {
    table.getHoodieView().sync();
    List<FileSlice> fileSlices = table.getSliceView().getLatestFileSlices(MetadataPartitionType.FILES.getPartitionPath()).collect(Collectors.toList());
    if (!fileSlices.get(0).getBaseFile().isPresent()) {
        throw new IllegalStateException("Base file not available!");
    }
    final HoodieBaseFile baseFile = fileSlices.get(0).getBaseFile().get();
    HoodieHFileReader hoodieHFileReader = new HoodieHFileReader(context.getHadoopConf().get(), new Path(baseFile.getPath()), new CacheConfig(context.getHadoopConf().get()));
    List<Pair<String, IndexedRecord>> records = hoodieHFileReader.readAllRecords();
    records.forEach(entry -> {
        assertNull(((GenericRecord) entry.getSecond()).get(HoodieRecord.RECORD_KEY_METADATA_FIELD));
        final String keyInPayload = (String) ((GenericRecord) entry.getSecond()).get(HoodieMetadataPayload.KEY_FIELD_NAME);
        assertFalse(keyInPayload.isEmpty());
    });
}
Also used: Path (org.apache.hadoop.fs.Path), HoodieBaseFile (org.apache.hudi.common.model.HoodieBaseFile), FileSlice (org.apache.hudi.common.model.FileSlice), HoodieHFileReader (org.apache.hudi.io.storage.HoodieHFileReader), CacheConfig (org.apache.hadoop.hbase.io.hfile.CacheConfig), Pair (org.apache.hadoop.hbase.util.Pair)
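
The per-record check inside the forEach above can be tried without a Hudi or HBase classpath. The following is a minimal sketch under stated assumptions, not the project's code: plain java.util maps stand in for Avro GenericRecord, the class MetadataRecordKeyCheck and its method findViolations are hypothetical names, and the two field-name constants are assumed values of HoodieRecord.RECORD_KEY_METADATA_FIELD and HoodieMetadataPayload.KEY_FIELD_NAME that should be checked against the Hudi version in use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: plain maps replace Avro GenericRecord, no Hudi classes are used.
class MetadataRecordKeyCheck {

    // Assumed values of HoodieRecord.RECORD_KEY_METADATA_FIELD and
    // HoodieMetadataPayload.KEY_FIELD_NAME; verify them against your Hudi version.
    static final String RECORD_KEY_METADATA_FIELD = "_hoodie_record_key";
    static final String KEY_FIELD_NAME = "key";

    /**
     * Returns the file keys of all records that would fail the two assertions
     * in the test: the meta field must be null and the payload key must be non-empty.
     */
    static List<String> findViolations(Map<String, Map<String, Object>> recordsByKey) {
        List<String> violations = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : recordsByKey.entrySet()) {
            Map<String, Object> record = entry.getValue();
            // Mirrors assertNull(record.get(RECORD_KEY_METADATA_FIELD)).
            boolean metaFieldPresent = record.get(RECORD_KEY_METADATA_FIELD) != null;
            // Mirrors assertFalse(keyInPayload.isEmpty()); a missing key also counts as a failure.
            Object payloadKey = record.get(KEY_FIELD_NAME);
            boolean payloadKeyMissing =
                !(payloadKey instanceof String) || ((String) payloadKey).isEmpty();
            if (metaFieldPresent || payloadKeyMissing) {
                violations.add(entry.getKey());
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the result is deterministic.
        Map<String, Map<String, Object>> records = new LinkedHashMap<>();

        Map<String, Object> valid = new HashMap<>();
        valid.put(KEY_FIELD_NAME, "partition_a");
        records.put("partition_a", valid);

        Map<String, Object> emptyPayloadKey = new HashMap<>();
        emptyPayloadKey.put(KEY_FIELD_NAME, "");
        records.put("partition_b", emptyPayloadKey);

        System.out.println(findViolations(records)); // prints [partition_b]
    }
}
```

Collecting the failing keys instead of asserting on the first one is a choice made for this sketch only; it reports every offending record in a single pass, whereas the test method above fails at the first record that breaks either condition.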

Aggregations

HoodieHFileReader (org.apache.hudi.io.storage.HoodieHFileReader): 6
CacheConfig (org.apache.hadoop.hbase.io.hfile.CacheConfig): 5
Path (org.apache.hadoop.fs.Path): 4
IndexedRecord (org.apache.avro.generic.IndexedRecord): 3
Pair (org.apache.hadoop.hbase.util.Pair): 3
FileSlice (org.apache.hudi.common.model.FileSlice): 3
HoodieBaseFile (org.apache.hudi.common.model.HoodieBaseFile): 3
ClosableIterator (org.apache.hudi.common.util.ClosableIterator): 2
Schema (org.apache.avro.Schema): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
InLineFileSystem (org.apache.hudi.common.fs.inline.InLineFileSystem): 1
HoodieTableType (org.apache.hudi.common.model.HoodieTableType): 1
HoodieTableMetaClient (org.apache.hudi.common.table.HoodieTableMetaClient): 1
HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig): 1
HoodieTableMetadata (org.apache.hudi.metadata.HoodieTableMetadata): 1
HoodieTable (org.apache.hudi.table.HoodieTable): 1
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest): 1
ValueSource (org.junit.jupiter.params.provider.ValueSource): 1