Example 16 with HoodieAvroDataBlock

Use of org.apache.hudi.common.table.log.block.HoodieAvroDataBlock in project hudi by apache.

From class ArchivedCommitsCommand, method showCommits:

@CliCommand(value = "show archived commits", help = "Read commits from archived files and show details")
public String showCommits(
        @CliOption(key = { "skipMetadata" }, help = "Skip displaying commit metadata", unspecifiedDefaultValue = "true") boolean skipMetadata,
        @CliOption(key = { "limit" }, help = "Limit commits", unspecifiedDefaultValue = "10") final Integer limit,
        @CliOption(key = { "sortBy" }, help = "Sorting Field", unspecifiedDefaultValue = "") final String sortByField,
        @CliOption(key = { "desc" }, help = "Ordering", unspecifiedDefaultValue = "false") final boolean descending,
        @CliOption(key = { "headeronly" }, help = "Print Header Only", unspecifiedDefaultValue = "false") final boolean headerOnly) throws IOException {
    System.out.println("===============> Showing only " + limit + " archived commits <===============");
    HoodieTableMetaClient metaClient = HoodieCLI.getTableMetaClient();
    String basePath = metaClient.getBasePath();
    Path archivePath = new Path(metaClient.getArchivePath() + "/.commits_.archive*");
    FileStatus[] fsStatuses = FSUtils.getFs(basePath, HoodieCLI.conf).globStatus(archivePath);
    List<Comparable[]> allCommits = new ArrayList<>();
    for (FileStatus fs : fsStatuses) {
        // read the archived file; try-with-resources guarantees the reader is closed even on error
        try (HoodieLogFormat.Reader reader = HoodieLogFormat.newReader(FSUtils.getFs(basePath, HoodieCLI.conf),
                new HoodieLogFile(fs.getPath()), HoodieArchivedMetaEntry.getClassSchema())) {
            List<IndexedRecord> readRecords = new ArrayList<>();
            // read the avro blocks
            while (reader.hasNext()) {
                HoodieAvroDataBlock blk = (HoodieAvroDataBlock) reader.next();
                try (ClosableIterator<IndexedRecord> recordItr = blk.getRecordItr()) {
                    recordItr.forEachRemaining(readRecords::add);
                }
            }
            List<Comparable[]> readCommits = readRecords.stream().map(r -> (GenericRecord) r)
                    .map(r -> readCommit(r, skipMetadata)).collect(Collectors.toList());
            allCommits.addAll(readCommits);
        }
    }
    TableHeader header = new TableHeader().addTableHeaderField("CommitTime").addTableHeaderField("CommitType");
    if (!skipMetadata) {
        header = header.addTableHeaderField("CommitDetails");
    }
    return HoodiePrintHelper.print(header, new HashMap<>(), sortByField, descending, limit, headerOnly, allCommits);
}
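The command discovers archive files with the Hadoop glob ".commits_.archive*". As a side note, the same wildcard semantics can be checked with the JDK's java.nio PathMatcher; a minimal sketch (the file names below are hypothetical, for illustration only):

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

public class ArchiveGlobDemo {
    public static void main(String[] args) {
        // Same wildcard pattern the command passes to globStatus
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:.commits_.archive*");
        // Hypothetical archive file name: matches the literal prefix
        System.out.println(matcher.matches(Path.of(".commits_.archive.1_1-0-1"))); // true
        // Unrelated file name: no match
        System.out.println(matcher.matches(Path.of("commit.log"))); // false
    }
}
```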
Also used : Path(org.apache.hadoop.fs.Path) HoodieArchivedMetaEntry(org.apache.hudi.avro.model.HoodieArchivedMetaEntry) Reader(org.apache.hudi.common.table.log.HoodieLogFormat.Reader) Option(org.apache.hudi.common.util.Option) HashMap(java.util.HashMap) ClosableIterator(org.apache.hudi.common.util.ClosableIterator) FileStatus(org.apache.hadoop.fs.FileStatus) CliOption(org.springframework.shell.core.annotation.CliOption) ArrayList(java.util.ArrayList) HoodieTableMetaClient(org.apache.hudi.common.table.HoodieTableMetaClient) HoodieLogFile(org.apache.hudi.common.model.HoodieLogFile) HoodieLogFormat(org.apache.hudi.common.table.log.HoodieLogFormat) HoodieTimeline(org.apache.hudi.common.table.timeline.HoodieTimeline) IndexedRecord(org.apache.avro.generic.IndexedRecord) SpecificData(org.apache.avro.specific.SpecificData) CommandMarker(org.springframework.shell.core.CommandMarker) GenericRecord(org.apache.avro.generic.GenericRecord) CliCommand(org.springframework.shell.core.annotation.CliCommand) TableHeader(org.apache.hudi.cli.TableHeader) IOException(java.io.IOException) HoodieCommitMetadata(org.apache.hudi.avro.model.HoodieCommitMetadata) Collectors(java.util.stream.Collectors) HoodieCLI(org.apache.hudi.cli.HoodieCLI) Component(org.springframework.stereotype.Component) List(java.util.List) HoodieAvroDataBlock(org.apache.hudi.common.table.log.block.HoodieAvroDataBlock) HoodiePrintHelper(org.apache.hudi.cli.HoodiePrintHelper) FSUtils(org.apache.hudi.common.fs.FSUtils)
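Each data block's records are drained inside try-with-resources so the iterator's underlying resources are released even if collection fails. A stdlib-only sketch of that pattern (the ClosableIterator stand-in below is illustrative, not Hudi's actual interface):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// A closable iterator combines Iterator with AutoCloseable so block readers
// can release buffers when draining finishes; this stand-in is illustrative only.
interface ClosableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close();
}

public class DrainDemo {
    // Wrap a plain list iterator in the closable interface
    static ClosableIterator<String> recordsOf(List<String> src) {
        Iterator<String> it = src.iterator();
        return new ClosableIterator<>() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { return it.next(); }
            public void close() { /* release underlying resources here */ }
        };
    }

    public static void main(String[] args) {
        List<String> read = new ArrayList<>();
        // Same pattern as the command: drain the iterator inside try-with-resources
        try (ClosableIterator<String> itr = recordsOf(List.of("c1", "c2"))) {
            itr.forEachRemaining(read::add);
        }
        System.out.println(read); // [c1, c2]
    }
}
```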

Aggregations

HoodieAvroDataBlock (org.apache.hudi.common.table.log.block.HoodieAvroDataBlock): 16
IndexedRecord (org.apache.avro.generic.IndexedRecord): 14
HashMap (java.util.HashMap): 12
ArrayList (java.util.ArrayList): 10
Path (org.apache.hadoop.fs.Path): 10
HoodieLogFormat (org.apache.hudi.common.table.log.HoodieLogFormat): 10
HoodieLogFile (org.apache.hudi.common.model.HoodieLogFile): 9
IOException (java.io.IOException): 8
Schema (org.apache.avro.Schema): 8
List (java.util.List): 7
GenericRecord (org.apache.avro.generic.GenericRecord): 7
Collectors (java.util.stream.Collectors): 6
FileStatus (org.apache.hadoop.fs.FileStatus): 6
Option (org.apache.hudi.common.util.Option): 6
Map (java.util.Map): 5
HoodieAvroUtils (org.apache.hudi.avro.HoodieAvroUtils): 5
HoodieTableMetaClient (org.apache.hudi.common.table.HoodieTableMetaClient): 5
HeaderMetadataType (org.apache.hudi.common.table.log.block.HoodieLogBlock.HeaderMetadataType): 5
FileSystem (org.apache.hadoop.fs.FileSystem): 4
HoodieArchivedMetaEntry (org.apache.hudi.avro.model.HoodieArchivedMetaEntry): 4