Example 6 with VectorGroupByDesc

use of org.apache.hadoop.hive.ql.plan.VectorGroupByDesc in project hive by apache.

the class Vectorizer method validateGroupByOperator.

private boolean validateGroupByOperator(GroupByOperator op, boolean isReduce, boolean isTezOrSpark) {
    GroupByDesc desc = op.getConf();
    if (desc.isGroupingSetsPresent()) {
        setOperatorIssue("Grouping sets not supported");
        return false;
    }
    if (desc.pruneGroupingSetId()) {
        setOperatorIssue("Pruning grouping set id not supported");
        return false;
    }
    if (desc.getMode() != GroupByDesc.Mode.HASH && desc.isDistinct()) {
        setOperatorIssue("DISTINCT not supported");
        return false;
    }
    boolean ret = validateExprNodeDesc(desc.getKeys(), "Key");
    if (!ret) {
        return false;
    }
    /**
     *
     * GROUP BY DEFINITIONS:
     *
     * GroupByDesc.Mode enumeration:
     *
     *    The different modes of a GROUP BY operator.
     *
     *    These descriptions are hopefully less cryptic than the comments for GroupByDesc.Mode.
     *
     *        COMPLETE       Aggregates original rows into full aggregation row(s).
     *
     *                       If the key length is 0, this is also called Global aggregation and
     *                       1 output row is produced.
     *
     *                       When the key length is > 0, the original rows come in ALREADY GROUPED.
     *
     *                       An example for key length > 0 is a GROUP BY being applied to the
     *                       ALREADY GROUPED rows coming from an upstream JOIN operator.  Or,
     *                       ALREADY GROUPED rows coming from upstream MERGEPARTIAL GROUP BY
     *                       operator.
     *
     *        PARTIAL1       The first of 2 (or more) phases that aggregates ALREADY GROUPED
     *                       original rows into partial aggregations.
     *
     *                       Subsequent phases PARTIAL2 (optional) and MERGEPARTIAL will merge
     *                       the partial aggregations and output full aggregations.
     *
     *        PARTIAL2       Accept ALREADY GROUPED partial aggregations and merge them into another
     *                       partial aggregation.  Output the merged partial aggregations.
     *
     *                       (Haven't seen this one used)
     *
     *        PARTIALS       (Behaves for non-distinct the same as PARTIAL2; and behaves for
     *                       distinct the same as PARTIAL1.)
     *
     *        FINAL          Accept ALREADY GROUPED original rows and aggregate them into
     *                       full aggregations.
     *
     *                       Example is a GROUP BY being applied to rows from a sorted table, where
     *                       the group key is the table sort key (or a prefix).
     *
     *        HASH           Accept UNORDERED original rows and aggregate them into a memory table.
     *                       Output the partial aggregations on closeOp (or low memory).
     *
     *                       Similar to PARTIAL1 except original rows are UNORDERED.
     *
     *                       Commonly used in both Mapper and Reducer nodes.  Always followed by
     *                       a Reducer with MERGEPARTIAL GROUP BY.
     *
     *        MERGEPARTIAL   Always first operator of a Reducer.  Data is grouped by reduce-shuffle.
     *
     *                       (Behaves for non-distinct aggregations the same as FINAL; and behaves
     *                       for distinct aggregations the same as COMPLETE.)
     *
     *                       The output is full aggregation(s).
     *
     *                       Used in Reducers after a stage with a HASH GROUP BY operator.
     *
     *
     *  VectorGroupByDesc.ProcessingMode for VectorGroupByOperator:
     *
     *     GLOBAL         No key.  All rows --> 1 full aggregation on end of input
     *
     *     HASH           Rows aggregated into a hash table on group key -->
     *                        1 partial aggregation per key (normally, unless there is spilling)
     *
     *     MERGE_PARTIAL  As first operator in a REDUCER, partial aggregations come grouped from
     *                    reduce-shuffle -->
     *                        aggregate the partial aggregations and emit full aggregation on
     *                        endGroup / closeOp
     *
     *     STREAMING      Rows come from PARENT operator ALREADY GROUPED -->
     *                        aggregate the rows and emit full aggregation on key change / closeOp
     *
     *     NOTE: Hash can spill partial result rows prematurely if it runs low on memory.
     *     NOTE: Streaming has to compare keys where MergePartial gets an endGroup call.
     *
     *
     *  DECIDER: Which VectorGroupByDesc.ProcessingMode for VectorGroupByOperator?
     *
     *     Decides using GroupByDesc.Mode and whether there are keys with the
     *     VectorGroupByDesc.groupByDescModeToVectorProcessingMode method.
     *
     *         Mode.COMPLETE      --> (numKeys == 0 ? ProcessingMode.GLOBAL : ProcessingMode.STREAMING)
     *
     *         Mode.HASH          --> ProcessingMode.HASH
     *
     *         Mode.MERGEPARTIAL  --> (numKeys == 0 ? ProcessingMode.GLOBAL : ProcessingMode.MERGE_PARTIAL)
     *
     *         Mode.PARTIAL1,
     *         Mode.PARTIAL2,
     *         Mode.PARTIALS,
     *         Mode.FINAL        --> ProcessingMode.STREAMING
     *
     */
    boolean hasKeys = (desc.getKeys().size() > 0);
    ProcessingMode processingMode = VectorGroupByDesc.groupByDescModeToVectorProcessingMode(desc.getMode(), hasKeys);
    Pair<Boolean, Boolean> retPair = validateAggregationDescs(desc.getAggregators(), processingMode, hasKeys);
    if (!retPair.left) {
        return false;
    }
    // If all the aggregation outputs are primitive, we can output VectorizedRowBatch.
    // Otherwise, the rest of the operator tree will be row mode.
    VectorGroupByDesc vectorDesc = new VectorGroupByDesc();
    desc.setVectorDesc(vectorDesc);
    vectorDesc.setVectorOutput(retPair.right);
    vectorDesc.setProcessingMode(processingMode);
    LOG.info("Vector GROUP BY operator will use processing mode " + processingMode.name() + ", isVectorOutput " + vectorDesc.isVectorOutput());
    return true;
}
Also used : ProcessingMode(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc.ProcessingMode) VectorGroupByDesc(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc) UDFToBoolean(org.apache.hadoop.hive.ql.udf.UDFToBoolean) GroupByDesc(org.apache.hadoop.hive.ql.plan.GroupByDesc)
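
The DECIDER table in the comment above can be read as a small pure function. The following is a minimal standalone sketch of that table, assuming only what the comment states. The class GroupByModeDeciderSketch and its two enums are local stand-ins written for illustration; they are not Hive's GroupByDesc.Mode or VectorGroupByDesc.ProcessingMode, and the real groupByDescModeToVectorProcessingMode method may differ in detail.

```java
// Illustrative stand-in only: these enums and this class are NOT Hive's classes.
final class GroupByModeDeciderSketch {

    /** Names copied from the GroupByDesc.Mode list in the comment above. */
    enum Mode { COMPLETE, PARTIAL1, PARTIAL2, PARTIALS, FINAL, HASH, MERGEPARTIAL }

    /** Names copied from the VectorGroupByDesc.ProcessingMode list in the comment above. */
    enum ProcessingMode { GLOBAL, HASH, MERGE_PARTIAL, STREAMING }

    /** Applies the DECIDER table: the GROUP BY mode plus "are there keys?" picks the processing mode. */
    static ProcessingMode decide(Mode mode, boolean hasKeys) {
        switch (mode) {
            case COMPLETE:
                // No keys means one global output row; with keys the rows arrive already grouped.
                return hasKeys ? ProcessingMode.STREAMING : ProcessingMode.GLOBAL;
            case HASH:
                return ProcessingMode.HASH;
            case MERGEPARTIAL:
                return hasKeys ? ProcessingMode.MERGE_PARTIAL : ProcessingMode.GLOBAL;
            case PARTIAL1:
            case PARTIAL2:
            case PARTIALS:
            case FINAL:
                return ProcessingMode.STREAMING;
            default:
                throw new IllegalArgumentException("Unexpected mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Print the whole table for both the keyed and the keyless case.
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": no keys -> " + decide(mode, false)
                    + ", with keys -> " + decide(mode, true));
        }
    }
}
```

Only COMPLETE and MERGEPARTIAL depend on whether keys are present; every other mode maps to a single processing mode.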

Example 7 with VectorGroupByDesc

use of org.apache.hadoop.hive.ql.plan.VectorGroupByDesc in project hive by apache.

the class Vectorizer method vectorizeOperator.

public Operator<? extends OperatorDesc> vectorizeOperator(Operator<? extends OperatorDesc> op, VectorizationContext vContext, boolean isTezOrSpark, VectorTaskColumnInfo vectorTaskColumnInfo) throws HiveException {
    Operator<? extends OperatorDesc> vectorOp = null;
    boolean isNative;
    switch(op.getType()) {
        case TABLESCAN:
            vectorOp = vectorizeTableScanOperator(op, vContext);
            isNative = true;
            break;
        case MAPJOIN:
            {
                if (op instanceof MapJoinOperator) {
                    VectorMapJoinInfo vectorMapJoinInfo = new VectorMapJoinInfo();
                    MapJoinDesc desc = (MapJoinDesc) op.getConf();
                    boolean specialize = canSpecializeMapJoin(op, desc, isTezOrSpark, vContext, vectorMapJoinInfo);
                    if (!specialize) {
                        Class<? extends Operator<?>> opClass = null;
                        // *NON-NATIVE* vector map differences for LEFT OUTER JOIN and Filtered...
                        List<ExprNodeDesc> bigTableFilters = desc.getFilters().get((byte) desc.getPosBigTable());
                        boolean isOuterAndFiltered = (!desc.isNoOuterJoin() && bigTableFilters.size() > 0);
                        if (!isOuterAndFiltered) {
                            opClass = VectorMapJoinOperator.class;
                        } else {
                            opClass = VectorMapJoinOuterFilteredOperator.class;
                        }
                        vectorOp = OperatorFactory.getVectorOperator(opClass, op.getCompilationOpContext(), op.getConf(), vContext);
                        isNative = false;
                    } else {
                        // TEMPORARY Until Native Vector Map Join with Hybrid passes tests...
                        // HiveConf.setBoolVar(physicalContext.getConf(),
                        //    HiveConf.ConfVars.HIVEUSEHYBRIDGRACEHASHJOIN, false);
                        vectorOp = specializeMapJoinOperator(op, vContext, desc, vectorMapJoinInfo);
                        isNative = true;
                        if (vectorTaskColumnInfo != null) {
                            if (usesVectorUDFAdaptor(vectorMapJoinInfo.getBigTableKeyExpressions())) {
                                vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                            }
                            if (usesVectorUDFAdaptor(vectorMapJoinInfo.getBigTableValueExpressions())) {
                                vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                            }
                        }
                    }
                } else {
                    Preconditions.checkState(op instanceof SMBMapJoinOperator);
                    SMBJoinDesc smbJoinSinkDesc = (SMBJoinDesc) op.getConf();
                    VectorSMBJoinDesc vectorSMBJoinDesc = new VectorSMBJoinDesc();
                    smbJoinSinkDesc.setVectorDesc(vectorSMBJoinDesc);
                    vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), smbJoinSinkDesc, vContext);
                    isNative = false;
                }
            }
            break;
        case REDUCESINK:
            {
                VectorReduceSinkInfo vectorReduceSinkInfo = new VectorReduceSinkInfo();
                ReduceSinkDesc desc = (ReduceSinkDesc) op.getConf();
                boolean specialize = canSpecializeReduceSink(desc, isTezOrSpark, vContext, vectorReduceSinkInfo);
                if (!specialize) {
                    vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), op.getConf(), vContext);
                    isNative = false;
                } else {
                    vectorOp = specializeReduceSinkOperator(op, vContext, desc, vectorReduceSinkInfo);
                    isNative = true;
                    if (vectorTaskColumnInfo != null) {
                        if (usesVectorUDFAdaptor(vectorReduceSinkInfo.getReduceSinkKeyExpressions())) {
                            vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                        }
                        if (usesVectorUDFAdaptor(vectorReduceSinkInfo.getReduceSinkValueExpressions())) {
                            vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                        }
                    }
                }
            }
            break;
        case FILTER:
            {
                vectorOp = vectorizeFilterOperator(op, vContext);
                isNative = true;
                if (vectorTaskColumnInfo != null) {
                    VectorFilterDesc vectorFilterDesc = (VectorFilterDesc) ((AbstractOperatorDesc) vectorOp.getConf()).getVectorDesc();
                    VectorExpression vectorPredicateExpr = vectorFilterDesc.getPredicateExpression();
                    if (usesVectorUDFAdaptor(vectorPredicateExpr)) {
                        vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                    }
                }
            }
            break;
        case SELECT:
            {
                vectorOp = vectorizeSelectOperator(op, vContext);
                isNative = true;
                if (vectorTaskColumnInfo != null) {
                    VectorSelectDesc vectorSelectDesc = (VectorSelectDesc) ((AbstractOperatorDesc) vectorOp.getConf()).getVectorDesc();
                    VectorExpression[] vectorSelectExprs = vectorSelectDesc.getSelectExpressions();
                    if (usesVectorUDFAdaptor(vectorSelectExprs)) {
                        vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                    }
                }
            }
            break;
        case GROUPBY:
            {
                vectorOp = vectorizeGroupByOperator(op, vContext);
                isNative = false;
                if (vectorTaskColumnInfo != null) {
                    VectorGroupByDesc vectorGroupByDesc = (VectorGroupByDesc) ((AbstractOperatorDesc) vectorOp.getConf()).getVectorDesc();
                    if (!vectorGroupByDesc.isVectorOutput()) {
                        vectorTaskColumnInfo.setGroupByVectorOutput(false);
                    }
                    VectorExpression[] vecKeyExpressions = vectorGroupByDesc.getKeyExpressions();
                    if (usesVectorUDFAdaptor(vecKeyExpressions)) {
                        vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                    }
                    VectorAggregateExpression[] vecAggregators = vectorGroupByDesc.getAggregators();
                    for (VectorAggregateExpression vecAggr : vecAggregators) {
                        if (usesVectorUDFAdaptor(vecAggr.inputExpression())) {
                            vectorTaskColumnInfo.setUsesVectorUDFAdaptor(true);
                        }
                    }
                }
            }
            break;
        case FILESINK:
            {
                FileSinkDesc fileSinkDesc = (FileSinkDesc) op.getConf();
                VectorFileSinkDesc vectorFileSinkDesc = new VectorFileSinkDesc();
                fileSinkDesc.setVectorDesc(vectorFileSinkDesc);
                vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), fileSinkDesc, vContext);
                isNative = false;
            }
            break;
        case LIMIT:
            {
                LimitDesc limitDesc = (LimitDesc) op.getConf();
                VectorLimitDesc vectorLimitDesc = new VectorLimitDesc();
                limitDesc.setVectorDesc(vectorLimitDesc);
                vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), limitDesc, vContext);
                isNative = true;
            }
            break;
        case EVENT:
            {
                AppMasterEventDesc eventDesc = (AppMasterEventDesc) op.getConf();
                VectorAppMasterEventDesc vectorEventDesc = new VectorAppMasterEventDesc();
                eventDesc.setVectorDesc(vectorEventDesc);
                vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), eventDesc, vContext);
                isNative = true;
            }
            break;
        case HASHTABLESINK:
            {
                SparkHashTableSinkDesc sparkHashTableSinkDesc = (SparkHashTableSinkDesc) op.getConf();
                VectorSparkHashTableSinkDesc vectorSparkHashTableSinkDesc = new VectorSparkHashTableSinkDesc();
                sparkHashTableSinkDesc.setVectorDesc(vectorSparkHashTableSinkDesc);
                vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), sparkHashTableSinkDesc, vContext);
                isNative = true;
            }
            break;
        case SPARKPRUNINGSINK:
            {
                SparkPartitionPruningSinkDesc sparkPartitionPruningSinkDesc = (SparkPartitionPruningSinkDesc) op.getConf();
                VectorSparkPartitionPruningSinkDesc vectorSparkPartitionPruningSinkDesc = new VectorSparkPartitionPruningSinkDesc();
                sparkPartitionPruningSinkDesc.setVectorDesc(vectorSparkPartitionPruningSinkDesc);
                vectorOp = OperatorFactory.getVectorOperator(op.getCompilationOpContext(), sparkPartitionPruningSinkDesc, vContext);
                isNative = true;
            }
            break;
        default:
            // These are children of GROUP BY operators with non-vector outputs.
            isNative = false;
            vectorOp = op;
            break;
    }
    Preconditions.checkState(vectorOp != null);
    if (vectorTaskColumnInfo != null && !isNative) {
        vectorTaskColumnInfo.setAllNative(false);
    }
    LOG.debug("vectorizeOperator " + vectorOp.getClass().getName());
    LOG.debug("vectorizeOperator " + vectorOp.getConf().getClass().getName());
    if (vectorOp != op) {
        fixupParentChildOperators(op, vectorOp);
        ((AbstractOperatorDesc) vectorOp.getConf()).setVectorMode(true);
    }
    return vectorOp;
}
Also used : VectorMapJoinInnerStringOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerStringOperator) VectorReduceSinkLongOperator(org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkLongOperator) VectorMapJoinOuterLongOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator) VectorReduceSinkStringOperator(org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkStringOperator) VectorMapJoinInnerBigOnlyMultiKeyOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyMultiKeyOperator) VectorMapJoinLeftSemiMultiKeyOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinLeftSemiMultiKeyOperator) VectorMapJoinLeftSemiStringOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinLeftSemiStringOperator) VectorMapJoinLeftSemiLongOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinLeftSemiLongOperator) VectorReduceSinkMultiKeyOperator(org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkMultiKeyOperator) VectorMapJoinOuterFilteredOperator(org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOuterFilteredOperator) VectorMapJoinInnerBigOnlyLongOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator) VectorMapJoinInnerBigOnlyStringOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyStringOperator) VectorMapJoinInnerMultiKeyOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerMultiKeyOperator) VectorMapJoinOuterStringOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterStringOperator) VectorMapJoinOperator(org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator) VectorMapJoinInnerLongOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator) VectorMapJoinOuterMultiKeyOperator(org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterMultiKeyOperator) 
AppMasterEventDesc(org.apache.hadoop.hive.ql.plan.AppMasterEventDesc) VectorAppMasterEventDesc(org.apache.hadoop.hive.ql.plan.VectorAppMasterEventDesc) SMBJoinDesc(org.apache.hadoop.hive.ql.plan.SMBJoinDesc) VectorSMBJoinDesc(org.apache.hadoop.hive.ql.plan.VectorSMBJoinDesc) VectorFileSinkDesc(org.apache.hadoop.hive.ql.plan.VectorFileSinkDesc) FileSinkDesc(org.apache.hadoop.hive.ql.plan.FileSinkDesc) VectorReduceSinkInfo(org.apache.hadoop.hive.ql.plan.VectorReduceSinkInfo) VectorSparkPartitionPruningSinkDesc(org.apache.hadoop.hive.ql.plan.VectorSparkPartitionPruningSinkDesc) SparkPartitionPruningSinkDesc(org.apache.hadoop.hive.ql.optimizer.spark.SparkPartitionPruningSinkDesc) ArrayList(java.util.ArrayList) List(java.util.List) VectorSelectDesc(org.apache.hadoop.hive.ql.plan.VectorSelectDesc) VectorReduceSinkDesc(org.apache.hadoop.hive.ql.plan.VectorReduceSinkDesc) ReduceSinkDesc(org.apache.hadoop.hive.ql.plan.ReduceSinkDesc) VectorFilterDesc(org.apache.hadoop.hive.ql.plan.VectorFilterDesc) SparkHashTableSinkDesc(org.apache.hadoop.hive.ql.plan.SparkHashTableSinkDesc) VectorSparkHashTableSinkDesc(org.apache.hadoop.hive.ql.plan.VectorSparkHashTableSinkDesc) AbstractOperatorDesc(org.apache.hadoop.hive.ql.plan.AbstractOperatorDesc) MapJoinDesc(org.apache.hadoop.hive.ql.plan.MapJoinDesc) VectorMapJoinDesc(org.apache.hadoop.hive.ql.plan.VectorMapJoinDesc) VectorMapJoinInfo(org.apache.hadoop.hive.ql.plan.VectorMapJoinInfo)
VectorAggregateExpression(org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorAggregateExpression) VectorLimitDesc(org.apache.hadoop.hive.ql.plan.VectorLimitDesc) LimitDesc(org.apache.hadoop.hive.ql.plan.LimitDesc) VectorGroupByDesc(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc) VectorExpression(org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression)
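
Every branch of the switch in vectorizeOperator ends by setting isNative, and several also report use of a vector UDF adaptor. Both facts are folded into the task-level object as one-way flags. The sketch below isolates that bookkeeping pattern. TaskInfo, OpResult and record are hypothetical names introduced here for illustration, not Hive classes, and the initial flag values are an assumption.

```java
// Illustrative stand-in only: TaskInfo and OpResult are hypothetical, not Hive classes.
final class NativeFlagSketch {

    /** Hypothetical stand-in for the task-level summary (VectorTaskColumnInfo in the listing above). */
    static final class TaskInfo {
        // Assumption: a task starts out "all native" and with no adaptor use recorded.
        private boolean allNative = true;
        private boolean usesAdaptor = false;

        void setAllNative(boolean value) { allNative = value; }
        void setUsesVectorUDFAdaptor(boolean value) { usesAdaptor = value; }
        boolean isAllNative() { return allNative; }
        boolean getUsesVectorUDFAdaptor() { return usesAdaptor; }
    }

    /** Hypothetical per-operator outcome of one vectorization step. */
    static final class OpResult {
        final String name;
        final boolean isNative;
        final boolean usesAdaptor;

        OpResult(String name, boolean isNative, boolean usesAdaptor) {
            this.name = name;
            this.isNative = isNative;
            this.usesAdaptor = usesAdaptor;
        }
    }

    /** Folds one operator's outcome into the task summary; the summary may be null, as in the listing. */
    static void record(TaskInfo info, OpResult result) {
        if (info == null) {
            return;
        }
        if (!result.isNative) {
            info.setAllNative(false);          // one non-native operator clears the flag for good
        }
        if (result.usesAdaptor) {
            info.setUsesVectorUDFAdaptor(true); // one adaptor use sets the flag for good
        }
    }

    public static void main(String[] args) {
        TaskInfo info = new TaskInfo();
        // Native/non-native values follow the switch above: TABLESCAN and SELECT native, GROUPBY not.
        record(info, new OpResult("TABLESCAN", true, false));
        record(info, new OpResult("GROUPBY", false, false));
        record(info, new OpResult("SELECT", true, true));
        System.out.println("allNative=" + info.isAllNative()
                + " usesVectorUDFAdaptor=" + info.getUsesVectorUDFAdaptor());
    }
}
```

Because each flag only ever moves in one direction, the order in which operators are visited does not change the final summary.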

Example 8 with VectorGroupByDesc

use of org.apache.hadoop.hive.ql.plan.VectorGroupByDesc in project hive by apache.

the class Vectorizer method vectorizeGroupByOperator.

/*
   * NOTE: The VectorGroupByDesc has already been allocated and partially populated.
   */
public static Operator<? extends OperatorDesc> vectorizeGroupByOperator(Operator<? extends OperatorDesc> groupByOp, VectorizationContext vContext) throws HiveException {
    GroupByDesc groupByDesc = (GroupByDesc) groupByOp.getConf();
    List<ExprNodeDesc> keysDesc = groupByDesc.getKeys();
    VectorExpression[] vecKeyExpressions = vContext.getVectorExpressions(keysDesc);
    ArrayList<AggregationDesc> aggrDesc = groupByDesc.getAggregators();
    final int size = aggrDesc.size();
    VectorAggregateExpression[] vecAggregators = new VectorAggregateExpression[size];
    int[] projectedOutputColumns = new int[size];
    for (int i = 0; i < size; ++i) {
        AggregationDesc aggDesc = aggrDesc.get(i);
        vecAggregators[i] = vContext.getAggregatorExpression(aggDesc);
        // GroupBy generates a new vectorized row batch...
        projectedOutputColumns[i] = i;
    }
    VectorGroupByDesc vectorGroupByDesc = (VectorGroupByDesc) groupByDesc.getVectorDesc();
    vectorGroupByDesc.setKeyExpressions(vecKeyExpressions);
    vectorGroupByDesc.setAggregators(vecAggregators);
    vectorGroupByDesc.setProjectedOutputColumns(projectedOutputColumns);
    return OperatorFactory.getVectorOperator(groupByOp.getCompilationOpContext(), groupByDesc, vContext);
}
Also used : VectorGroupByDesc(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc) VectorAggregateExpression(org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorAggregateExpression) VectorExpression(org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression) AggregationDesc(org.apache.hadoop.hive.ql.plan.AggregationDesc) ExprNodeDesc(org.apache.hadoop.hive.ql.plan.ExprNodeDesc) GroupByDesc(org.apache.hadoop.hive.ql.plan.GroupByDesc)

Example 9 with VectorGroupByDesc

use of org.apache.hadoop.hive.ql.plan.VectorGroupByDesc in project hive by apache.

the class TestVectorGroupByOperator method buildKeyGroupByDesc.

private static GroupByDesc buildKeyGroupByDesc(VectorizationContext ctx, String aggregate, String column, TypeInfo dataTypeInfo, String key, TypeInfo keyTypeInfo) {
    GroupByDesc desc = buildGroupByDescType(ctx, aggregate, GenericUDAFEvaluator.Mode.PARTIAL1, column, dataTypeInfo);
    ((VectorGroupByDesc) desc.getVectorDesc()).setProcessingMode(ProcessingMode.HASH);
    ExprNodeDesc keyExp = buildColumnDesc(ctx, key, keyTypeInfo);
    ArrayList<ExprNodeDesc> keys = new ArrayList<ExprNodeDesc>();
    keys.add(keyExp);
    desc.setKeys(keys);
    desc.getOutputColumnNames().add("_col1");
    return desc;
}
Also used : VectorGroupByDesc(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc) ArrayList(java.util.ArrayList) ExprNodeDesc(org.apache.hadoop.hive.ql.plan.ExprNodeDesc) GroupByDesc(org.apache.hadoop.hive.ql.plan.GroupByDesc)
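
buildKeyGroupByDesc switches the descriptor to ProcessingMode.HASH as soon as a key is added. As a rough illustration of what the comment in Example 6 describes for that mode (unordered rows aggregated into an in-memory table, results emitted on close), here is a toy sketch. HashGroupBySketch is a hypothetical class that sums long values per string key; it does not model Hive's vectorized batches, partial aggregation buffers, or low-memory spilling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative toy only: not Hive's VectorGroupByOperator.
final class HashGroupBySketch {

    // One running sum per group key; insertion order is kept so output is deterministic.
    private final Map<String, Long> sums = new LinkedHashMap<>();

    /** Accepts rows in any order and folds each value into the entry for its key. */
    void process(String key, long value) {
        sums.merge(key, value, Long::sum);
    }

    /** Emits one aggregation per key and resets the table, loosely mirroring output on closeOp. */
    Map<String, Long> close() {
        Map<String, Long> out = new LinkedHashMap<>(sums);
        sums.clear();
        return out;
    }

    public static void main(String[] args) {
        HashGroupBySketch groupBy = new HashGroupBySketch();
        // Rows for key "b" are not adjacent; hash mode does not require grouped input.
        groupBy.process("b", 2L);
        groupBy.process("a", 1L);
        groupBy.process("b", 5L);
        System.out.println(groupBy.close());
    }
}
```

A streaming-mode operator would not need the map: because its input is already grouped, it could keep a single running aggregation and emit it whenever the key changes.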

Example 10 with VectorGroupByDesc

use of org.apache.hadoop.hive.ql.plan.VectorGroupByDesc in project hive by apache.

the class TestVectorGroupByOperator method buildGroupByDescType.

private static GroupByDesc buildGroupByDescType(VectorizationContext ctx, String aggregate, GenericUDAFEvaluator.Mode mode, String column, TypeInfo dataType) {
    AggregationDesc agg = buildAggregationDesc(ctx, aggregate, mode, column, dataType);
    ArrayList<AggregationDesc> aggs = new ArrayList<AggregationDesc>();
    aggs.add(agg);
    ArrayList<String> outputColumnNames = new ArrayList<String>();
    outputColumnNames.add("_col0");
    GroupByDesc desc = new GroupByDesc();
    desc.setVectorDesc(new VectorGroupByDesc());
    desc.setOutputColumnNames(outputColumnNames);
    desc.setAggregators(aggs);
    ((VectorGroupByDesc) desc.getVectorDesc()).setProcessingMode(ProcessingMode.GLOBAL);
    return desc;
}
Also used : ArrayList(java.util.ArrayList) VectorGroupByDesc(org.apache.hadoop.hive.ql.plan.VectorGroupByDesc) AggregationDesc(org.apache.hadoop.hive.ql.plan.AggregationDesc) GroupByDesc(org.apache.hadoop.hive.ql.plan.GroupByDesc)

Aggregations

VectorGroupByDesc (org.apache.hadoop.hive.ql.plan.VectorGroupByDesc): 10
GroupByDesc (org.apache.hadoop.hive.ql.plan.GroupByDesc): 9
ArrayList (java.util.ArrayList): 8
AggregationDesc (org.apache.hadoop.hive.ql.plan.AggregationDesc): 5
CompilationOpContext (org.apache.hadoop.hive.ql.CompilationOpContext): 4
FakeCaptureOutputOperator (org.apache.hadoop.hive.ql.exec.vector.util.FakeCaptureOutputOperator): 4
ExprNodeDesc (org.apache.hadoop.hive.ql.plan.ExprNodeDesc): 4
HashMap (java.util.HashMap): 2
HashSet (java.util.HashSet): 2
Map (java.util.Map): 2
Set (java.util.Set): 2
VectorExpression (org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression): 2
VectorAggregateExpression (org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorAggregateExpression): 2
ByteWritable (org.apache.hadoop.hive.serde2.io.ByteWritable): 2
DoubleWritable (org.apache.hadoop.hive.serde2.io.DoubleWritable): 2
ShortWritable (org.apache.hadoop.hive.serde2.io.ShortWritable): 2
TimestampWritable (org.apache.hadoop.hive.serde2.io.TimestampWritable): 2
BooleanWritable (org.apache.hadoop.io.BooleanWritable): 2
FloatWritable (org.apache.hadoop.io.FloatWritable): 2
IntWritable (org.apache.hadoop.io.IntWritable): 2