Example 6 with SMBJoinDesc

Use of org.apache.hadoop.hive.ql.plan.SMBJoinDesc in project hive by apache.

The class SparkSMBJoinHintOptimizer, method removeSmallTableReduceSink.

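/**
 * In a bucket map join, a ReduceSink marks each small-table parent (ReduceSinks are removed
 * from the big-table side).
 * In an SMB join, ReduceSinks are not expected on any parent, whether from a small or a big table,
 * so the ones left on the small-table parents are removed here.
 *
 * @param mapJoinOp the map join operator whose small-table ReduceSink parents are removed
 */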
@SuppressWarnings("unchecked")
private void removeSmallTableReduceSink(MapJoinOperator mapJoinOp) {
    // Wrap the map join's descriptor to read the big-table position.
    SMBJoinDesc smbJoinDesc = new SMBJoinDesc(mapJoinOp.getConf());
    List<Operator<? extends OperatorDesc>> parentOperators = mapJoinOp.getParentOperators();
    for (int i = 0; i < parentOperators.size(); i++) {
        Operator<? extends OperatorDesc> par = parentOperators.get(i);
        // Only small-table parents are candidates; the big-table parent is left untouched.
        if (i != smbJoinDesc.getPosBigTable()) {
            if (par instanceof ReduceSinkOperator) {
                List<Operator<? extends OperatorDesc>> grandParents = par.getParentOperators();
                Preconditions.checkArgument(grandParents.size() == 1, "AssertionError: expect # of parents to be 1, but was " + grandParents.size());
                Operator<? extends OperatorDesc> grandParent = grandParents.get(0);
                // Splice out the ReduceSink: the grandparent now feeds the map join directly.
                grandParent.removeChild(par);
                grandParent.setChildOperators(Utilities.makeList(mapJoinOp));
                mapJoinOp.getParentOperators().set(i, grandParent);
            }
        }
    }
}
Also used: ReduceSinkOperator (org.apache.hadoop.hive.ql.exec.ReduceSinkOperator), MapJoinOperator (org.apache.hadoop.hive.ql.exec.MapJoinOperator), Operator (org.apache.hadoop.hive.ql.exec.Operator), SMBJoinDesc (org.apache.hadoop.hive.ql.plan.SMBJoinDesc), OperatorDesc (org.apache.hadoop.hive.ql.plan.OperatorDesc)
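
The re-parenting step in removeSmallTableReduceSink can be tried out without a Hive build. The following is a minimal sketch, not Hive code: Node, link, bypassSmallTableSinks and buildExample are hypothetical stand-ins for the operator classes, and the isReduceSink flag replaces the instanceof ReduceSinkOperator check. It assumes, as the original method asserts, that each reduce sink being removed has exactly one parent.

```java
import java.util.ArrayList;
import java.util.List;

public class ReduceSinkBypassSketch {

    /** Hypothetical stand-in for a Hive operator: a named node with parent and child links. */
    static class Node {
        final String name;
        final boolean isReduceSink;
        final List<Node> parents = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isReduceSink) {
            this.name = name;
            this.isReduceSink = isReduceSink;
        }
    }

    /** Wires parent -> child in both directions. */
    static void link(Node parent, Node child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Splices out every reduce-sink parent of the join except the one at bigTablePos. */
    static void bypassSmallTableSinks(Node join, int bigTablePos) {
        for (int i = 0; i < join.parents.size(); i++) {
            Node parent = join.parents.get(i);
            if (i == bigTablePos || !parent.isReduceSink) {
                continue; // big-table side, or nothing to remove
            }
            if (parent.parents.size() != 1) {
                throw new IllegalArgumentException(
                    "expected exactly 1 parent, but was " + parent.parents.size());
            }
            Node grandParent = parent.parents.get(0);
            // Mirrors the original: the grandparent's children become just the join.
            grandParent.children.clear();
            grandParent.children.add(join);
            // Keep the join's parent list positionally aligned with the table order.
            join.parents.set(i, grandParent);
        }
    }

    /** Builds: bigScan -> join, smallScan -> reduceSink -> join (big table at position 0). */
    static Node buildExample() {
        Node bigScan = new Node("bigScan", false);
        Node smallScan = new Node("smallScan", false);
        Node reduceSink = new Node("reduceSink", true);
        Node join = new Node("join", false);
        link(bigScan, join);
        link(smallScan, reduceSink);
        link(reduceSink, join);
        return join;
    }

    public static void main(String[] args) {
        Node join = buildExample();
        bypassSmallTableSinks(join, 0);
        for (Node parent : join.parents) {
            System.out.println(parent.name);
        }
    }
}
```

Running main lists the join's parents after the rewrite: bigScan, then smallScan, with the reduce sink no longer in the path. Replacing the parent in place with set(i, ...) rather than removing and re-adding keeps every table at its original position, which matters because the big table is identified by index.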

Aggregations

SMBJoinDesc (org.apache.hadoop.hive.ql.plan.SMBJoinDesc) 6
ArrayList (java.util.ArrayList) 5
List (java.util.List) 3
MapJoinOperator (org.apache.hadoop.hive.ql.exec.MapJoinOperator) 3
Operator (org.apache.hadoop.hive.ql.exec.Operator) 3
ReduceSinkOperator (org.apache.hadoop.hive.ql.exec.ReduceSinkOperator) 3
SMBMapJoinOperator (org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator) 3
MapJoinDesc (org.apache.hadoop.hive.ql.plan.MapJoinDesc) 3
OperatorDesc (org.apache.hadoop.hive.ql.plan.OperatorDesc) 3
HashMap (java.util.HashMap) 2
JoinOperator (org.apache.hadoop.hive.ql.exec.JoinOperator) 2
VectorMapJoinOperator (org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator) 2
VectorMapJoinOuterFilteredOperator (org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOuterFilteredOperator) 2
VectorExpression (org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression) 2
Serializable (java.io.Serializable) 1
LinkedHashMap (java.util.LinkedHashMap) 1
Set (java.util.Set) 1
ImmutablePair (org.apache.commons.lang3.tuple.ImmutablePair) 1
Configuration (org.apache.hadoop.conf.Configuration) 1
Path (org.apache.hadoop.fs.Path) 1