
Example 1 with Block

Use of org.apache.carbondata.hadoop.internal.index.Block in project carbondata by apache.

The IndexedSegment class, getSplits method:

@Override
public List<InputSplit> getSplits(JobContext job, FilterResolverIntf filterResolver) throws IOException {
    // do the following:
    // 1. create the index (or fetch it from cache) by the filter name in the configuration
    // 2. use the index to filter, producing only the matching blocks
    // 3. create one input split per filtered block
    List<InputSplit> output = new LinkedList<>();
    Index index = loader.load(this);
    List<Block> blocks = index.filter(job, filterResolver);
    for (Block block : blocks) {
        output.add(makeInputSplit(block));
    }
    return output;
}
Also used: Block (org.apache.carbondata.hadoop.internal.index.Block), Index (org.apache.carbondata.hadoop.internal.index.Index), InputSplit (org.apache.hadoop.mapreduce.InputSplit), LinkedList (java.util.LinkedList)
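The three-step pattern above (load an index, filter blocks, wrap each survivor in a split) can be sketched without any CarbonData or Hadoop dependencies. The Block and Split records and the predicate standing in for the index are hypothetical stand-ins, not CarbonData classes; this is a minimal illustration of the shape of getSplits, not its implementation.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.function.Predicate;

// Minimal sketch of the load-filter-split pattern behind IndexedSegment.getSplits.
// Block, Split, and the Predicate-based "index" are hypothetical stand-ins.
public class FilterSplitSketch {
    record Block(String path, long offset, long length) {}
    record Split(String path, long offset, long length) {}

    // Step 2: keep only the blocks the index predicate accepts.
    static List<Block> filter(List<Block> all, Predicate<Block> indexPredicate) {
        List<Block> kept = new LinkedList<>();
        for (Block b : all) {
            if (indexPredicate.test(b)) {
                kept.add(b);
            }
        }
        return kept;
    }

    // Step 3: turn one surviving block into an input split.
    static Split makeInputSplit(Block b) {
        return new Split(b.path(), b.offset(), b.length());
    }

    // Steps 2 and 3 combined, mirroring the loop in getSplits.
    static List<Split> getSplits(List<Block> all, Predicate<Block> indexPredicate) {
        List<Split> output = new LinkedList<>();
        for (Block block : filter(all, indexPredicate)) {
            output.add(makeInputSplit(block));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
                new Block("part-0", 0, 128),
                new Block("part-1", 128, 64));
        // Pretend the index only matches blocks longer than 100 bytes.
        List<Split> splits = getSplits(blocks, b -> b.length() > 100);
        System.out.println(splits.size());
    }
}
```

The point of the shape is that the index prunes blocks before any split is built, so downstream tasks are only scheduled for data that can match the filter.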

Example 2 with Block

Use of org.apache.carbondata.hadoop.internal.index.Block in project carbondata by apache.

The InMemoryBTreeIndex class, filter method:

@Override
public List<Block> filter(JobContext job, FilterResolverIntf filter) throws IOException {
    List<Block> result = new LinkedList<>();
    FilterExpressionProcessor filterExpressionProcessor = new FilterExpressionProcessor();
    AbsoluteTableIdentifier identifier = AbsoluteTableIdentifier.from(segment.getPath(), "", "");
    // for this segment fetch blocks matching filter in BTree
    List<DataRefNode> dataRefNodes = getDataBlocksOfSegment(job, filterExpressionProcessor, identifier, filter);
    for (DataRefNode dataRefNode : dataRefNodes) {
        BlockBTreeLeafNode leafNode = (BlockBTreeLeafNode) dataRefNode;
        TableBlockInfo tableBlockInfo = leafNode.getTableBlockInfo();
        result.add(new CarbonInputSplit(
                segment.getId(),
                tableBlockInfo.getDetailInfo().getBlockletId().toString(),
                new Path(tableBlockInfo.getFilePath()),
                tableBlockInfo.getBlockOffset(),
                tableBlockInfo.getBlockLength(),
                tableBlockInfo.getLocations(),
                tableBlockInfo.getBlockletInfos().getNoOfBlockLets(),
                tableBlockInfo.getVersion(),
                null));
    }
    return result;
}
Also used: Path (org.apache.hadoop.fs.Path), TableBlockInfo (org.apache.carbondata.core.datastore.block.TableBlockInfo), FilterExpressionProcessor (org.apache.carbondata.core.scan.filter.FilterExpressionProcessor), AbsoluteTableIdentifier (org.apache.carbondata.core.metadata.AbsoluteTableIdentifier), Block (org.apache.carbondata.hadoop.internal.index.Block), DataRefNode (org.apache.carbondata.core.datastore.DataRefNode), CarbonInputSplit (org.apache.carbondata.hadoop.CarbonInputSplit), LinkedList (java.util.LinkedList), BlockBTreeLeafNode (org.apache.carbondata.core.datastore.impl.btree.BlockBTreeLeafNode)
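The loop in filter walks the B-tree leaf nodes that survived pruning, reads each leaf's TableBlockInfo, and builds one split from that metadata. A dependency-free sketch of just that leaf-to-split conversion follows; LeafNode, BlockInfo, and Split are hypothetical stand-ins for BlockBTreeLeafNode, TableBlockInfo, and CarbonInputSplit.

```java
import java.util.LinkedList;
import java.util.List;

// Sketch of converting pruned B-tree leaf nodes into splits, mirroring the
// loop in InMemoryBTreeIndex.filter. All type names here are hypothetical
// stand-ins, not CarbonData classes.
public class LeafToSplitSketch {
    record BlockInfo(String filePath, long offset, long length, String[] locations) {}
    record LeafNode(BlockInfo info) {}
    record Split(String filePath, long offset, long length, String[] locations) {}

    // For each leaf, read the block metadata it carries and build one split.
    static List<Split> toSplits(List<LeafNode> leaves) {
        List<Split> result = new LinkedList<>();
        for (LeafNode leaf : leaves) {
            BlockInfo info = leaf.info();
            result.add(new Split(info.filePath(), info.offset(),
                                 info.length(), info.locations()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<LeafNode> leaves = List.of(
                new LeafNode(new BlockInfo("part-0.carbondata", 0, 4096,
                                           new String[] {"host1"})));
        System.out.println(toSplits(leaves).get(0).filePath());
    }
}
```

A design point worth noting from the original: the split carries the block's host locations, which lets the scheduler place tasks near the data.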
