Example 36 with DrillCostFactory

Use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by axbaretto.

The class UnionExchangePrel, method computeSelfCost.

/**
 * A UnionExchange processes a total of M rows coming from N senders and
 * combines them into a single output stream.  Note that there is
 * no sort or merge operation going on. For costing purposes, we can
 * assume each sender is sending M/N rows to a single receiver.
 * (See DrillCostBase for symbol notations)
 * C =  CPU cost of SV remover for M/N rows
 *      + Network cost of sending M/N rows to 1 destination.
 * So, C = (s * M/N) + (w * M/N)
 * Total cost = N * C
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
    if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
        return super.computeSelfCost(planner, mq).multiplyBy(.1);
    }
    RelNode child = this.getInput();
    double inputRows = mq.getRowCount(child);
    int rowWidth = child.getRowType().getFieldCount() * DrillCostBase.AVG_FIELD_WIDTH;
    double svrCpuCost = DrillCostBase.SVR_CPU_COST * inputRows;
    double networkCost = DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth;
    DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
    return costFactory.makeCost(inputRows, svrCpuCost, 0, networkCost);
}
Also used: DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory), RelNode (org.apache.calcite.rel.RelNode)
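
The Javadoc above gives the cost as N * C with C computed over M/N rows, so N cancels and the method works on the total input row count directly. The code also scales the network term by the row width, which the Javadoc formula leaves implicit. A minimal standalone sketch of that arithmetic follows; the class name and all constant values are placeholders chosen for illustration, not values taken from DrillCostBase.

```java
// Standalone sketch of the UnionExchange cost arithmetic.
// All constants are assumed placeholder values, NOT Drill's actual settings.
final class UnionExchangeCostSketch {
    static final double SVR_CPU_COST = 8.0;        // assumed per-row CPU cost of the SV remover
    static final double BYTE_NETWORK_COST = 512.0; // assumed cost of sending one byte
    static final int AVG_FIELD_WIDTH = 8;          // assumed average field width in bytes

    /** CPU component: s * M, since N * (s * M/N) reduces to s * M. */
    static double cpuCost(double inputRows) {
        return SVR_CPU_COST * inputRows;
    }

    /** Network component: w * M * rowWidth, with rowWidth = fieldCount * AVG_FIELD_WIDTH. */
    static double networkCost(double inputRows, int fieldCount) {
        int rowWidth = fieldCount * AVG_FIELD_WIDTH;
        return BYTE_NETWORK_COST * inputRows * rowWidth;
    }

    public static void main(String[] args) {
        double rows = 1_000;
        int fields = 4;
        System.out.println("cpu     = " + cpuCost(rows));
        System.out.println("network = " + networkCost(rows, fields));
    }
}
```

With these placeholder constants the network term dominates the CPU term, and it grows with the number of fields as well as the number of rows.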

Example 37 with DrillCostFactory

Use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.

The class DrillJoinRelBase, method computeHashJoinCostWithRowCntKeySize.

public static RelOptCost computeHashJoinCostWithRowCntKeySize(RelOptPlanner planner, double probeRowCount, double buildRowCount, int keySize) {
    // cpu cost of hashing the join keys for the build side
    double cpuCostBuild = DrillCostBase.HASH_CPU_COST * keySize * buildRowCount;
    // cpu cost of hashing the join keys for the probe side
    double cpuCostProbe = DrillCostBase.HASH_CPU_COST * keySize * probeRowCount;
    // cpu cost of evaluating each leftkey=rightkey join condition
    double joinConditionCost = DrillCostBase.COMPARE_CPU_COST * keySize;
    double factor = PrelUtil.getPlannerSettings(planner).getOptions().getOption(ExecConstants.HASH_JOIN_TABLE_FACTOR_KEY).float_val;
    long fieldWidth = PrelUtil.getPlannerSettings(planner).getOptions().getOption(ExecConstants.AVERAGE_FIELD_WIDTH_KEY).num_val;
    // table + hashValues + links
    double memCost = ((fieldWidth * keySize) + IntHolder.WIDTH + IntHolder.WIDTH) * buildRowCount * factor;
    // the probe-side row count determines the join condition comparison cost
    double cpuCost = joinConditionCost * probeRowCount + cpuCostBuild + cpuCostProbe;
    DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
    return costFactory.makeCost(buildRowCount + probeRowCount, cpuCost, 0, 0, memCost);
}
Also used: DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory)
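
In the method above, the CPU cost depends on both inputs, while the memory cost depends only on the build side. The following standalone sketch reproduces that arithmetic so the asymmetry is easy to see. The class name is hypothetical and the constants are assumed placeholder values (including the 4-byte integer width standing in for IntHolder.WIDTH); the table factor and field width are passed in as plain parameters instead of being read from planner options.

```java
// Standalone sketch of the hash join cost arithmetic.
// All constants are assumed placeholder values, NOT Drill's actual settings.
final class HashJoinCostSketch {
    static final double HASH_CPU_COST = 8.0;    // assumed cost of hashing one key column
    static final double COMPARE_CPU_COST = 4.0; // assumed cost of one key comparison
    static final int INT_WIDTH = 4;             // assumed width of hashValues and links entries

    /** Hashing on both sides plus one comparison per key column per probe row. */
    static double cpuCost(double probeRows, double buildRows, int keySize) {
        double build = HASH_CPU_COST * keySize * buildRows;
        double probe = HASH_CPU_COST * keySize * probeRows;
        double compare = COMPARE_CPU_COST * keySize * probeRows;
        return build + probe + compare;
    }

    /** Memory for the hash table: key bytes + hashValues + links, per build row. */
    static double memCost(double buildRows, int keySize, long fieldWidth, double factor) {
        return ((fieldWidth * keySize) + INT_WIDTH + INT_WIDTH) * buildRows * factor;
    }

    public static void main(String[] args) {
        // Same pair of inputs, with the build and probe roles swapped.
        System.out.println("small build: mem = " + memCost(100, 2, 8, 1.1));
        System.out.println("large build: mem = " + memCost(1_000, 2, 8, 1.1));
        System.out.println("cpu (probe 1000, build 100) = " + cpuCost(1_000, 100, 2));
    }
}
```

Because only buildRowCount enters the memory term, putting the smaller input on the build side lowers the estimated memory cost under this model.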

Example 38 with DrillCostFactory

Use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.

The class UnionExchangePrel, method computeSelfCost.

/**
 * A UnionExchange processes a total of M rows coming from N senders and
 * combines them into a single output stream.  Note that there is
 * no sort or merge operation going on. For costing purposes, we can
 * assume each sender is sending M/N rows to a single receiver.
 * (See DrillCostBase for symbol notations)
 * C =  CPU cost of SV remover for M/N rows
 *      + Network cost of sending M/N rows to 1 destination.
 * So, C = (s * M/N) + (w * M/N)
 * Total cost = N * C
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
    if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
        return super.computeSelfCost(planner, mq).multiplyBy(.1);
    }
    RelNode child = this.getInput();
    double inputRows = mq.getRowCount(child);
    int rowWidth = child.getRowType().getFieldCount() * DrillCostBase.AVG_FIELD_WIDTH;
    double svrCpuCost = DrillCostBase.SVR_CPU_COST * inputRows;
    double networkCost = DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth;
    DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
    return costFactory.makeCost(inputRows, svrCpuCost, 0, networkCost);
}
Also used: DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory), RelNode (org.apache.calcite.rel.RelNode)

Example 39 with DrillCostFactory

Use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.

The class TopNPrel, method computeSelfCost.

/**
 * Cost of doing Top-N is proportional to M log N where M is the total number of
 * input rows and N is the limit for Top-N.  This makes Top-N preferable to Sort
 * since the cost of a full Sort is proportional to M log M.
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
    if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
        // We use multiplier 0.05 for TopN operator, and 0.1 for Sort, to make TopN a preferred choice.
        return super.computeSelfCost(planner, mq).multiplyBy(0.05);
    }
    RelNode child = this.getInput();
    double inputRows = mq.getRowCount(child);
    int numSortFields = this.collation.getFieldCollations().size();
    double cpuCost = DrillCostBase.COMPARE_CPU_COST * numSortFields * inputRows * (Math.log(limit) / Math.log(2));
    // assume in-memory for now until we enforce operator-level memory constraints
    double diskIOCost = 0;
    DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
    return costFactory.makeCost(inputRows, cpuCost, diskIOCost, 0);
}
Also used: DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory), RelNode (org.apache.calcite.rel.RelNode)
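
The Javadoc's comparison of M log N against M log M can be checked numerically. The sketch below is standalone: the class name is hypothetical, the comparison cost is an assumed placeholder, and fullSortCpuCost is written directly from the Javadoc's "M log M" description, not taken from Drill's sort operator.

```java
// Standalone sketch comparing the Top-N cost formula with a full-sort estimate.
// COMPARE_CPU_COST is an assumed placeholder value, NOT Drill's actual setting.
final class TopNCostSketch {
    static final double COMPARE_CPU_COST = 4.0;

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** Top-N: proportional to M * log2(N), where N is the limit. */
    static double topNCpuCost(double inputRows, int numSortFields, int limit) {
        return COMPARE_CPU_COST * numSortFields * inputRows * log2(limit);
    }

    /** Full sort, per the Javadoc description: proportional to M * log2(M). */
    static double fullSortCpuCost(double inputRows, int numSortFields) {
        return COMPARE_CPU_COST * numSortFields * inputRows * log2(inputRows);
    }

    public static void main(String[] args) {
        double rows = 1_000_000;
        System.out.println("top-1024  = " + topNCpuCost(rows, 1, 1024));
        System.out.println("full sort = " + fullSortCpuCost(rows, 1));
    }
}
```

One property of the formula worth keeping in mind: for a limit of 1 the logarithm is zero, so the CPU term evaluates to zero regardless of the input size.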

Example 40 with DrillCostFactory

Use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.

The class ScanPrel, method computeSelfCost.

@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, RelMetadataQuery mq) {
    final PlannerSettings settings = PrelUtil.getPlannerSettings(planner);
    final ScanStats stats = getGroupScan().getScanStats(settings);
    final int columnCount = getRowType().getFieldCount();
    if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
        return planner.getCostFactory().makeCost(stats.getRecordCount() * columnCount, stats.getCpuCost(), stats.getDiskCost());
    }
    double rowCount = mq.getRowCount(this);
    // As DRILL-4083 points out, when columnCount == 0, cpuCost becomes zero,
    // which makes the costs of HiveScan and HiveDrillNativeParquetScan the same.
    // For now, assume the CPU cost is proportional to the row count.
    // Note that this ignores the disk cost estimate (which should be a proxy for
    // row count * row width).
    double cpuCost = rowCount * Math.max(columnCount, 1);
    // If the scan stats report a positive CPU cost, multiply the default CPU cost by it.
    if (stats.getCpuCost() > 0) {
        cpuCost *= stats.getCpuCost();
    }
    double ioCost = stats.getDiskCost();
    DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
    return costFactory.makeCost(rowCount, cpuCost, ioCost, 0);
}
Also used: DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory), ScanStats (org.apache.drill.exec.physical.base.ScanStats)
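
The effect of the Math.max(columnCount, 1) guard and the optional stats multiplier can be seen in isolation. This is a minimal standalone sketch with a hypothetical class name; the statsCpuCost parameter stands in for the value returned by stats.getCpuCost().

```java
// Standalone sketch of the scan CPU cost rule used in the non-default costing path.
final class ScanCostSketch {
    /**
     * rowCount * max(columnCount, 1), optionally scaled by a positive
     * CPU cost reported by the scan statistics.
     */
    static double cpuCost(double rowCount, int columnCount, double statsCpuCost) {
        double cpuCost = rowCount * Math.max(columnCount, 1);
        if (statsCpuCost > 0) {
            cpuCost *= statsCpuCost;
        }
        return cpuCost;
    }

    public static void main(String[] args) {
        // A scan projecting zero columns still gets a non-zero cost.
        System.out.println("0 columns, no stats cost  = " + cpuCost(1_000, 0, 0));
        System.out.println("3 columns, no stats cost  = " + cpuCost(1_000, 3, 0));
        System.out.println("3 columns, stats cost 2.0 = " + cpuCost(1_000, 3, 2.0));
    }
}
```

The guard keeps a zero-column scan from costing nothing, so two scan implementations over the same data are not tied at zero CPU cost.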

Aggregations

DrillCostFactory (org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory): 46
RelNode (org.apache.calcite.rel.RelNode): 21
ScanStats (org.apache.drill.exec.physical.base.ScanStats): 4
DrillCostBase (org.apache.drill.exec.planner.cost.DrillCostBase): 2
DbGroupScan (org.apache.drill.exec.physical.base.DbGroupScan): 1