
Example 1 with LazyIterableIterator

Use of org.apache.hudi.client.utils.LazyIterableIterator in project hudi by apache.

In the class HoodieBucketIndex, method tagLocation.

@Override
public <R> HoodieData<HoodieRecord<R>> tagLocation(HoodieData<HoodieRecord<R>> records, HoodieEngineContext context, HoodieTable hoodieTable) throws HoodieIndexException {
    HoodieData<HoodieRecord<R>> taggedRecords = records.mapPartitions(recordIter -> {
        // partitionPath -> bucketId -> fileInfo
        Map<String, Map<Integer, Pair<String, String>>> partitionPathFileIDList = new HashMap<>();
        return new LazyIterableIterator<HoodieRecord<R>, HoodieRecord<R>>(recordIter) {

            @Override
            protected void start() {
            }

            @Override
            protected HoodieRecord<R> computeNext() {
                HoodieRecord<R> record = recordIter.next();
                int bucketId = BucketIdentifier.getBucketId(record, config.getBucketIndexHashField(), numBuckets);
                String partitionPath = record.getPartitionPath();
                // Load the bucketId -> fileInfo mapping once per partition and cache it.
                if (!partitionPathFileIDList.containsKey(partitionPath)) {
                    partitionPathFileIDList.put(partitionPath, loadPartitionBucketIdFileIdMapping(hoodieTable, partitionPath));
                }
                // Tag the record only if its bucket already maps to a file.
                if (partitionPathFileIDList.get(partitionPath).containsKey(bucketId)) {
                    Pair<String, String> fileInfo = partitionPathFileIDList.get(partitionPath).get(bucketId);
                    return HoodieIndexUtils.getTaggedRecord(record, Option.of(new HoodieRecordLocation(fileInfo.getRight(), fileInfo.getLeft())));
                }
                // No file exists for this bucket yet, so return the record untagged.
                return record;
            }

            @Override
            protected void end() {
            }
        };
    }, true);
    return taggedRecords;
}
Also used: LazyIterableIterator (org.apache.hudi.client.utils.LazyIterableIterator), HashMap (java.util.HashMap), HoodieRecord (org.apache.hudi.common.model.HoodieRecord), HoodieRecordLocation (org.apache.hudi.common.model.HoodieRecordLocation), Map (java.util.Map)
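
The pattern in tagLocation (wrap the input iterator, look up each record's bucket, and load the per-partition mapping only the first time a partition is seen) can be sketched without any Hudi dependency. The following is a minimal sketch, not Hudi's implementation: it uses a plain java.util.Iterator instead of LazyIterableIterator, and the names LazyTaggingSketch, Rec, bucketId, tagLazily and loadMapping are all hypothetical stand-ins. The bucket function (non-negative key hash modulo the bucket count) is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class LazyTaggingSketch {

    /** Hypothetical stand-in for a record; fileId stays null until the record is tagged. */
    static final class Rec {
        final String key;
        final String partitionPath;
        final String fileId;

        Rec(String key, String partitionPath, String fileId) {
            this.key = key;
            this.partitionPath = partitionPath;
            this.fileId = fileId;
        }
    }

    /** Assumed bucket function: non-negative hash of the key, modulo the bucket count. */
    static int bucketId(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    /**
     * Wraps the input iterator so that records are tagged one at a time, on demand.
     * loadMapping (hypothetical) returns a non-null bucketId -> fileId map for a partition
     * and is called at most once per partition thanks to the cache.
     */
    static Iterator<Rec> tagLazily(Iterator<Rec> input, int numBuckets,
                                   Function<String, Map<Integer, String>> loadMapping) {
        // partitionPath -> bucketId -> fileId
        Map<String, Map<Integer, String>> cache = new HashMap<>();
        return new Iterator<Rec>() {
            @Override
            public boolean hasNext() {
                return input.hasNext();
            }

            @Override
            public Rec next() {
                Rec rec = input.next();
                Map<Integer, String> buckets = cache.computeIfAbsent(rec.partitionPath, loadMapping);
                String fileId = buckets.get(bucketId(rec.key, numBuckets));
                // Records whose bucket has no file yet pass through untagged.
                return fileId == null ? rec : new Rec(rec.key, rec.partitionPath, fileId);
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, String> p1Buckets = new HashMap<>();
        p1Buckets.put(0, "file-a");
        // Only partition "p1" has a known file; every other partition has an empty mapping.
        Function<String, Map<Integer, String>> loader =
            partition -> partition.equals("p1") ? p1Buckets : new HashMap<Integer, String>();

        List<Rec> input = Arrays.asList(new Rec("k1", "p1", null), new Rec("k2", "p2", null));
        Iterator<Rec> tagged = tagLazily(input.iterator(), 1, loader);
        while (tagged.hasNext()) {
            Rec rec = tagged.next();
            System.out.println(rec.key + " -> " + rec.fileId);
        }
    }
}
```

Using computeIfAbsent collapses the containsKey/put pair from the original into a single call. Nothing is loaded until the caller starts pulling records, which is the point of the lazy wrapper.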

Aggregations

HashMap (java.util.HashMap): 1, Map (java.util.Map): 1, LazyIterableIterator (org.apache.hudi.client.utils.LazyIterableIterator): 1, HoodieRecord (org.apache.hudi.common.model.HoodieRecord): 1, HoodieRecordLocation (org.apache.hudi.common.model.HoodieRecordLocation): 1