Example 1 with SparkPartitionPruningSinkOperator

Use of org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator in the Apache Hive project.

From the SparkRemoveDynamicPruningBySize class, method process:

@Override
public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procContext, Object... nodeOutputs) throws SemanticException {
    OptimizeSparkProcContext context = (OptimizeSparkProcContext) procContext;
    SparkPartitionPruningSinkOperator op = (SparkPartitionPruningSinkOperator) nd;
    SparkPartitionPruningSinkDesc desc = op.getConf();
    // Compare the branch's estimated output size against the configured cap.
    if (desc.getStatistics().getDataSize() > context.getConf().getLongVar(ConfVars.SPARK_DYNAMIC_PARTITION_PRUNING_MAX_DATA_SIZE)) {
        // At this point we've found the fork in the operator pipeline that has
        // the pruning sink as a child plan; remove that branch.
        OperatorUtils.removeBranch(op);
        LOG.info("Disabling dynamic pruning for: " + desc.getTableScan().getName() + ". Expected data size is too big: " + desc.getStatistics().getDataSize());
    }
    return false;
}
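The decision the snippet makes is simple: if the estimated data size of the pruning branch exceeds a configured byte threshold, the branch is cut from the plan because the pruning would cost more than it saves. That check can be sketched standalone as follows; the class and method names here are illustrative, not Hive APIs, and the 100 MB default is an assumption for the example:

```java
// Minimal sketch of a size-based pruning decision, assuming a configurable
// byte threshold. Names are illustrative and not part of Hive's API.
public class PruningBySize {

    private final long maxDataSizeBytes;

    public PruningBySize(long maxDataSizeBytes) {
        this.maxDataSizeBytes = maxDataSizeBytes;
    }

    /** Returns true when the pruning branch should be removed from the plan. */
    public boolean shouldRemove(long expectedDataSizeBytes) {
        return expectedDataSizeBytes > maxDataSizeBytes;
    }

    public static void main(String[] args) {
        // Hypothetical 100 MB cap, mirroring a size-threshold configuration.
        PruningBySize decider = new PruningBySize(100L * 1024 * 1024);
        System.out.println(decider.shouldRemove(500L * 1024 * 1024)); // prints "true"
        System.out.println(decider.shouldRemove(1024L));              // prints "false"
    }
}
```

In the real optimizer the threshold comes from the ConfVars.SPARK_DYNAMIC_PARTITION_PRUNING_MAX_DATA_SIZE setting read off the session configuration, and the removal is performed by OperatorUtils.removeBranch rather than a boolean return.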
Also used:
SparkPartitionPruningSinkDesc (org.apache.hadoop.hive.ql.optimizer.spark.SparkPartitionPruningSinkDesc)
OptimizeSparkProcContext (org.apache.hadoop.hive.ql.parse.spark.OptimizeSparkProcContext)
SparkPartitionPruningSinkOperator (org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator)

Aggregations

SparkPartitionPruningSinkDesc (org.apache.hadoop.hive.ql.optimizer.spark.SparkPartitionPruningSinkDesc): 1 usage
OptimizeSparkProcContext (org.apache.hadoop.hive.ql.parse.spark.OptimizeSparkProcContext): 1 usage
SparkPartitionPruningSinkOperator (org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator): 1 usage