Example 1 with ExternalSpillableMap

Use of org.apache.hudi.common.util.collection.ExternalSpillableMap in the Apache Hudi project.

Class SpillableMapBasedFileSystemView, method createFileIdToBootstrapBaseFileMap.

@Override
protected Map<HoodieFileGroupId, BootstrapBaseFileMapping> createFileIdToBootstrapBaseFileMap(Map<HoodieFileGroupId, BootstrapBaseFileMapping> fileGroupIdBootstrapBaseFileMap) {
    try {
        LOG.info("Creating bootstrap base File Map using external spillable Map. Max Mem=" + maxMemoryForBootstrapBaseFile + ", BaseDir=" + baseStoreDir);
        new File(baseStoreDir).mkdirs();
        Map<HoodieFileGroupId, BootstrapBaseFileMapping> pendingMap = new ExternalSpillableMap<>(maxMemoryForBootstrapBaseFile, baseStoreDir, new DefaultSizeEstimator<>(), new DefaultSizeEstimator<>(), diskMapType, isBitCaskDiskMapCompressionEnabled);
        pendingMap.putAll(fileGroupIdBootstrapBaseFileMap);
        return pendingMap;
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Also used : HoodieFileGroupId(org.apache.hudi.common.model.HoodieFileGroupId) ExternalSpillableMap(org.apache.hudi.common.util.collection.ExternalSpillableMap) IOException(java.io.IOException) BootstrapBaseFileMapping(org.apache.hudi.common.model.BootstrapBaseFileMapping) DefaultSizeEstimator(org.apache.hudi.common.util.DefaultSizeEstimator) File(java.io.File)
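
The constructor call above passes a memory budget and a base directory, which suggests a map that keeps entries on the heap until the budget is used up and writes the rest to disk. The following is a minimal sketch of that idea in plain Java; it is not Hudi's implementation. ToySpillableMap and all of its members are hypothetical names, and it counts entries instead of estimating bytes, which is a simplifying assumption.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Toy illustration only; NOT Hudi's ExternalSpillableMap. */
class ToySpillableMap<K extends Serializable, V extends Serializable> implements Closeable {
    // Assumption: an entry count stands in for a byte budget plus size estimator.
    private final int maxInMemoryEntries;
    private final Path spillDir;
    private final Map<K, V> inMemory = new HashMap<>();
    private final Map<K, Path> onDisk = new HashMap<>();

    ToySpillableMap(int maxInMemoryEntries, Path spillDir) throws IOException {
        this.maxInMemoryEntries = maxInMemoryEntries;
        this.spillDir = Files.createDirectories(spillDir);
    }

    void put(K key, V value) throws IOException {
        if (inMemory.containsKey(key) || inMemory.size() < maxInMemoryEntries) {
            // Entry fits on the heap; drop any stale spilled copy of the same key.
            Path stale = onDisk.remove(key);
            if (stale != null) {
                Files.deleteIfExists(stale);
            }
            inMemory.put(key, value);
            return;
        }
        // Heap budget exhausted: serialize the value to its own spill file.
        Path file = onDisk.get(key);
        if (file == null) {
            file = Files.createTempFile(spillDir, "spill-", ".bin");
            onDisk.put(key, file);
        }
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(value);
        }
    }

    @SuppressWarnings("unchecked")
    V get(K key) throws IOException, ClassNotFoundException {
        V value = inMemory.get(key);
        if (value != null) {
            return value;
        }
        Path file = onDisk.get(key);
        if (file == null) {
            return null;
        }
        // Spilled entries are deserialized on every read.
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (V) in.readObject();
        }
    }

    int inMemoryEntries() {
        return inMemory.size();
    }

    int diskEntries() {
        return onDisk.size();
    }

    /** Deletes the spill files and clears both maps. */
    @Override
    public void close() throws IOException {
        for (Path file : onDisk.values()) {
            Files.deleteIfExists(file);
        }
        onDisk.clear();
        inMemory.clear();
    }
}
```

With a budget of two entries, a third put lands on disk and is read back through the same get call, so callers can treat the structure as an ordinary key-value store as long as they close it when done.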

Example 2 with ExternalSpillableMap

Use of org.apache.hudi.common.util.collection.ExternalSpillableMap in the Apache Hudi project.

Class SpillableMapBasedFileSystemView, method createPartitionToFileGroups.

@Override
protected Map<String, List<HoodieFileGroup>> createPartitionToFileGroups() {
    try {
        LOG.info("Creating Partition To File groups map using external spillable Map. Max Mem=" + maxMemoryForFileGroupMap + ", BaseDir=" + baseStoreDir);
        new File(baseStoreDir).mkdirs();
        return (Map<String, List<HoodieFileGroup>>) (new ExternalSpillableMap<>(maxMemoryForFileGroupMap, baseStoreDir, new DefaultSizeEstimator(), new DefaultSizeEstimator<>(), diskMapType, isBitCaskDiskMapCompressionEnabled));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Also used : ExternalSpillableMap(org.apache.hudi.common.util.collection.ExternalSpillableMap) IOException(java.io.IOException) DefaultSizeEstimator(org.apache.hudi.common.util.DefaultSizeEstimator) File(java.io.File) Map(java.util.Map) HoodieFileGroup(org.apache.hudi.common.model.HoodieFileGroup)

Example 3 with ExternalSpillableMap

Use of org.apache.hudi.common.util.collection.ExternalSpillableMap in the Apache Hudi project.

Class HoodieMergeHandle, method init.

/**
 * Load the new incoming records into a map keyed by record key.
 */
protected void init(String fileId, Iterator<HoodieRecord<T>> newRecordsItr) {
    initializeIncomingRecordsMap();
    while (newRecordsItr.hasNext()) {
        HoodieRecord<T> record = newRecordsItr.next();
        // update the new location of the record, so we know where to find it next
        if (needsUpdateLocation()) {
            record.unseal();
            record.setNewLocation(new HoodieRecordLocation(instantTime, fileId));
            record.seal();
        }
        // NOTE: Once records are added to the (spillable) map, DO NOT modify them; later changes won't persist
        keyToNewRecords.put(record.getRecordKey(), record);
    }
    LOG.info("Number of entries in MemoryBasedMap => " + ((ExternalSpillableMap) keyToNewRecords).getInMemoryMapNumEntries() + ", Total size in bytes of MemoryBasedMap => " + ((ExternalSpillableMap) keyToNewRecords).getCurrentInMemoryMapSize() + ", Number of entries in BitCaskDiskMap => " + ((ExternalSpillableMap) keyToNewRecords).getDiskBasedMapNumEntries() + ", Size of file spilled to disk => " + ((ExternalSpillableMap) keyToNewRecords).getSizeOfFileOnDiskInBytes());
}
Also used : ExternalSpillableMap(org.apache.hudi.common.util.collection.ExternalSpillableMap) HoodieRecordLocation(org.apache.hudi.common.model.HoodieRecordLocation)
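
The NOTE inside init warns against changing a record after it has been put into the spillable map. A likely reason, stated here as an assumption and not confirmed by the snippet, is that a spilled entry is stored as a serialized copy, so later changes to the original object never reach it. The sketch below shows that effect with plain Java serialization; SpillMutationDemo, Rec, spill and read are hypothetical names and no Hudi classes are involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SpillMutationDemo {
    /** Hypothetical stand-in for a record with a mutable location. */
    static class Rec implements Serializable {
        private static final long serialVersionUID = 1L;
        String location;

        Rec(String location) {
            this.location = location;
        }
    }

    /** Serializes the record, the way a disk-backed map would store it. */
    static byte[] spill(Rec record) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(record);
        }
        return bytes.toByteArray();
    }

    /** Deserializes a previously spilled record. */
    static Rec read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Rec) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Rec record = new Rec("file-1");
        byte[] stored = spill(record);   // the "put" into a spilled map
        record.location = "file-2";      // mutation after the put
        System.out.println(read(stored).location); // prints "file-1"
    }
}
```

This is why init updates the record's location before calling put: a change made afterwards could be lost for any entry that was already spilled.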

Example 4 with ExternalSpillableMap

Use of org.apache.hudi.common.util.collection.ExternalSpillableMap in the Apache Hudi project.

Class HoodieMergeHandle, method close.

@Override
public List<WriteStatus> close() {
    try {
        writeIncomingRecords();
        if (keyToNewRecords instanceof ExternalSpillableMap) {
            ((ExternalSpillableMap) keyToNewRecords).close();
        } else {
            keyToNewRecords.clear();
        }
        writtenRecordKeys.clear();
        if (fileWriter != null) {
            fileWriter.close();
            fileWriter = null;
        }
        long fileSizeInBytes = FSUtils.getFileSize(fs, newFilePath);
        HoodieWriteStat stat = writeStatus.getStat();
        stat.setTotalWriteBytes(fileSizeInBytes);
        stat.setFileSizeInBytes(fileSizeInBytes);
        stat.setNumWrites(recordsWritten);
        stat.setNumDeletes(recordsDeleted);
        stat.setNumUpdateWrites(updatedRecordsWritten);
        stat.setNumInserts(insertRecordsWritten);
        stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());
        RuntimeStats runtimeStats = new RuntimeStats();
        runtimeStats.setTotalUpsertTime(timer.endTimer());
        stat.setRuntimeStats(runtimeStats);
        performMergeDataValidationCheck(writeStatus);
        LOG.info(String.format("MergeHandle for partitionPath %s fileID %s, took %d ms.", stat.getPartitionPath(), stat.getFileId(), runtimeStats.getTotalUpsertTime()));
        return Collections.singletonList(writeStatus);
    } catch (IOException e) {
        throw new HoodieUpsertException("Failed to close UpdateHandle", e);
    }
}
Also used : HoodieWriteStat(org.apache.hudi.common.model.HoodieWriteStat) HoodieUpsertException(org.apache.hudi.exception.HoodieUpsertException) ExternalSpillableMap(org.apache.hudi.common.util.collection.ExternalSpillableMap) RuntimeStats(org.apache.hudi.common.model.HoodieWriteStat.RuntimeStats) IOException(java.io.IOException) HoodieIOException(org.apache.hudi.exception.HoodieIOException)

Example 5 with ExternalSpillableMap

Use of org.apache.hudi.common.util.collection.ExternalSpillableMap in the Apache Hudi project.

Class SpillableMapBasedFileSystemView, method createFileIdToPendingCompactionMap.

@Override
protected Map<HoodieFileGroupId, Pair<String, CompactionOperation>> createFileIdToPendingCompactionMap(Map<HoodieFileGroupId, Pair<String, CompactionOperation>> fgIdToPendingCompaction) {
    try {
        LOG.info("Creating Pending Compaction map using external spillable Map. Max Mem=" + maxMemoryForPendingCompaction + ", BaseDir=" + baseStoreDir);
        new File(baseStoreDir).mkdirs();
        Map<HoodieFileGroupId, Pair<String, CompactionOperation>> pendingMap = new ExternalSpillableMap<>(maxMemoryForPendingCompaction, baseStoreDir, new DefaultSizeEstimator<>(), new DefaultSizeEstimator<>(), diskMapType, isBitCaskDiskMapCompressionEnabled);
        pendingMap.putAll(fgIdToPendingCompaction);
        return pendingMap;
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Also used : HoodieFileGroupId(org.apache.hudi.common.model.HoodieFileGroupId) ExternalSpillableMap(org.apache.hudi.common.util.collection.ExternalSpillableMap) IOException(java.io.IOException) DefaultSizeEstimator(org.apache.hudi.common.util.DefaultSizeEstimator) File(java.io.File) Pair(org.apache.hudi.common.util.collection.Pair)

Aggregations

ExternalSpillableMap (org.apache.hudi.common.util.collection.ExternalSpillableMap) 7
IOException (java.io.IOException) 6
File (java.io.File) 5
DefaultSizeEstimator (org.apache.hudi.common.util.DefaultSizeEstimator) 5
HoodieFileGroupId (org.apache.hudi.common.model.HoodieFileGroupId) 4
HoodieInstant (org.apache.hudi.common.table.timeline.HoodieInstant) 2
Map (java.util.Map) 1
BootstrapBaseFileMapping (org.apache.hudi.common.model.BootstrapBaseFileMapping) 1
HoodieFileGroup (org.apache.hudi.common.model.HoodieFileGroup) 1
HoodieRecordLocation (org.apache.hudi.common.model.HoodieRecordLocation) 1
HoodieWriteStat (org.apache.hudi.common.model.HoodieWriteStat) 1
RuntimeStats (org.apache.hudi.common.model.HoodieWriteStat.RuntimeStats) 1
Pair (org.apache.hudi.common.util.collection.Pair) 1
HoodieIOException (org.apache.hudi.exception.HoodieIOException) 1
HoodieUpsertException (org.apache.hudi.exception.HoodieUpsertException) 1