use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by axbaretto.
the class UnionExchangePrel method computeSelfCost.
/**
 * A UnionExchange processes a total of M rows coming from N senders and
 * combines them into a single output stream. Note that there is
 * no sort or merge operation going on. For costing purposes, we can
 * assume each sender is sending M/N rows to a single receiver.
 * (See DrillCostBase for symbol notations)
 * C = CPU cost of SV remover for M/N rows
 *     + Network cost of sending M/N rows to 1 destination.
 * So, C = (s * M/N) + (w * M/N)
 * Total cost = N * C
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    return super.computeSelfCost(planner, mq).multiplyBy(.1);
  }
  RelNode child = this.getInput();
  double inputRows = mq.getRowCount(child);
  int rowWidth = child.getRowType().getFieldCount() * DrillCostBase.AVG_FIELD_WIDTH;
  double svrCpuCost = DrillCostBase.SVR_CPU_COST * inputRows;
  double networkCost = DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth;
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(inputRows, svrCpuCost, 0, networkCost);
}
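The Javadoc states the cost per sender (C, for M/N rows) and then multiplies by N, so N cancels and the method can work directly on the total row count M. The standalone sketch below reproduces that arithmetic without any Drill classes. The constant values are placeholders chosen for illustration, not values read from DrillCostBase, and the class and method names are hypothetical.

```java
// Standalone sketch of the UnionExchange cost arithmetic; no Drill classes are used.
final class UnionExchangeCostSketch {
  // Placeholder constants (assumptions for illustration, not read from DrillCostBase).
  static final double SVR_CPU_COST = 8.0;        // CPU cost per row for the SV remover
  static final double BYTE_NETWORK_COST = 512.0; // cost per byte sent over the network
  static final int AVG_FIELD_WIDTH = 8;          // assumed average bytes per field

  // N * (s * M/N) simplifies to s * M, so only the total row count is needed.
  static double svrCpuCost(double inputRows) {
    return SVR_CPU_COST * inputRows;
  }

  // Network cost scales with rows and with the estimated row width in bytes.
  static double networkCost(double inputRows, int fieldCount) {
    int rowWidth = fieldCount * AVG_FIELD_WIDTH;
    return BYTE_NETWORK_COST * inputRows * rowWidth;
  }

  public static void main(String[] args) {
    double rows = 1_000; // M: total rows across all senders
    int fields = 4;      // columns in the child's row type
    System.out.println("cpu     = " + svrCpuCost(rows));
    System.out.println("network = " + networkCost(rows, fields));
  }
}
```

With these placeholder constants the network term dominates the CPU term by several orders of magnitude, which is one way a cost model like this can steer a planner away from plans that move many wide rows.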
use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.
the class DrillJoinRelBase method computeHashJoinCostWithRowCntKeySize.
public static RelOptCost computeHashJoinCostWithRowCntKeySize(RelOptPlanner planner, double probeRowCount, double buildRowCount, int keySize) {
  // cpu cost of hashing the join keys for the build side
  double cpuCostBuild = DrillCostBase.HASH_CPU_COST * keySize * buildRowCount;
  // cpu cost of hashing the join keys for the probe side
  double cpuCostProbe = DrillCostBase.HASH_CPU_COST * keySize * probeRowCount;
  // cpu cost of evaluating each leftkey=rightkey join condition
  double joinConditionCost = DrillCostBase.COMPARE_CPU_COST * keySize;
  double factor = PrelUtil.getPlannerSettings(planner).getOptions().getOption(ExecConstants.HASH_JOIN_TABLE_FACTOR_KEY).float_val;
  long fieldWidth = PrelUtil.getPlannerSettings(planner).getOptions().getOption(ExecConstants.AVERAGE_FIELD_WIDTH_KEY).num_val;
  // table + hashValues + links
  double memCost = ((fieldWidth * keySize) + IntHolder.WIDTH + IntHolder.WIDTH) * buildRowCount * factor;
  // the probe-side row count determines the join condition comparison cost
  double cpuCost = joinConditionCost * probeRowCount + cpuCostBuild + cpuCostProbe;
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(buildRowCount + probeRowCount, cpuCost, 0, 0, memCost);
}
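To see how the build and probe row counts enter this formula asymmetrically, the following standalone sketch evaluates the same expressions for both join orientations. It does not call Drill. The constants are placeholder values for illustration, INT_WIDTH stands in for IntHolder.WIDTH and is assumed to be 4 bytes, and the class and method names are hypothetical.

```java
// Standalone sketch of the hash join cost arithmetic; no Drill classes are used.
final class HashJoinCostSketch {
  // Placeholder constants (assumptions for illustration, not read from DrillCostBase).
  static final double HASH_CPU_COST = 8.0;
  static final double COMPARE_CPU_COST = 4.0;
  static final int INT_WIDTH = 4; // assumed stand-in for IntHolder.WIDTH

  // Hash both sides, then compare the join keys once per probe row.
  static double cpuCost(double probeRows, double buildRows, int keySize) {
    double build = HASH_CPU_COST * keySize * buildRows;
    double probe = HASH_CPU_COST * keySize * probeRows;
    double compare = COMPARE_CPU_COST * keySize * probeRows;
    return build + probe + compare;
  }

  // Memory is charged only for the build side: table + hashValues + links.
  static double memCost(double buildRows, int keySize, long fieldWidth, double factor) {
    return ((fieldWidth * keySize) + INT_WIDTH + INT_WIDTH) * buildRows * factor;
  }

  public static void main(String[] args) {
    int keySize = 2;
    long fieldWidth = 8;
    double factor = 1.0;
    // Small table on the build side.
    System.out.println("build=100:  cpu=" + cpuCost(1000, 100, keySize)
        + " mem=" + memCost(100, keySize, fieldWidth, factor));
    // Same tables with the sides swapped.
    System.out.println("build=1000: cpu=" + cpuCost(100, 1000, keySize)
        + " mem=" + memCost(1000, keySize, fieldWidth, factor));
  }
}
```

Under these assumptions, swapping the sides trades a lower comparison cost for a tenfold higher memory cost, since memory depends only on the build row count.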
use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.
the class UnionExchangePrel method computeSelfCost.
/**
 * A UnionExchange processes a total of M rows coming from N senders and
 * combines them into a single output stream. Note that there is
 * no sort or merge operation going on. For costing purposes, we can
 * assume each sender is sending M/N rows to a single receiver.
 * (See DrillCostBase for symbol notations)
 * C = CPU cost of SV remover for M/N rows
 *     + Network cost of sending M/N rows to 1 destination.
 * So, C = (s * M/N) + (w * M/N)
 * Total cost = N * C
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    return super.computeSelfCost(planner, mq).multiplyBy(.1);
  }
  RelNode child = this.getInput();
  double inputRows = mq.getRowCount(child);
  int rowWidth = child.getRowType().getFieldCount() * DrillCostBase.AVG_FIELD_WIDTH;
  double svrCpuCost = DrillCostBase.SVR_CPU_COST * inputRows;
  double networkCost = DrillCostBase.BYTE_NETWORK_COST * inputRows * rowWidth;
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(inputRows, svrCpuCost, 0, networkCost);
}
use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.
the class TopNPrel method computeSelfCost.
/**
 * Cost of doing Top-N is proportional to M log N where M is the total number of
 * input rows and N is the limit for Top-N. This makes Top-N preferable to Sort
 * since cost of full Sort is proportional to M log M.
 */
@Override
public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    // We use multiplier 0.05 for TopN operator, and 0.1 for Sort, to make TopN a preferred choice.
    return super.computeSelfCost(planner, mq).multiplyBy(0.05);
  }
  RelNode child = this.getInput();
  double inputRows = mq.getRowCount(child);
  int numSortFields = this.collation.getFieldCollations().size();
  double cpuCost = DrillCostBase.COMPARE_CPU_COST * numSortFields * inputRows * (Math.log(limit) / Math.log(2));
  // assume in-memory for now until we enforce operator-level memory constraints
  double diskIOCost = 0;
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(inputRows, cpuCost, diskIOCost, 0);
}
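The M log N versus M log M claim in the Javadoc can be checked numerically. The sketch below is standalone and uses no Drill classes. The Top-N expression mirrors the method above, while the full-sort expression is built only from the Javadoc's statement that sort cost is proportional to M log M and is an assumption about its exact form. The constant value and all names are hypothetical.

```java
// Standalone sketch comparing Top-N and full-sort CPU cost; no Drill classes are used.
final class TopNCostSketch {
  // Placeholder constant (assumption for illustration, not read from DrillCostBase).
  static final double COMPARE_CPU_COST = 4.0;

  static double log2(double x) {
    return Math.log(x) / Math.log(2);
  }

  // Mirrors the Top-N formula: comparisons * sort fields * M * log2(N).
  static double topNCpuCost(double inputRows, int numSortFields, long limit) {
    return COMPARE_CPU_COST * numSortFields * inputRows * log2(limit);
  }

  // Assumed form of the full-sort cost, taken from the Javadoc's "M log M".
  static double fullSortCpuCost(double inputRows, int numSortFields) {
    return COMPARE_CPU_COST * numSortFields * inputRows * log2(inputRows);
  }

  public static void main(String[] args) {
    double rows = 1_000_000;
    System.out.println("top-10 cost    = " + topNCpuCost(rows, 1, 10));
    System.out.println("full sort cost = " + fullSortCpuCost(rows, 1));
  }
}
```

One property of the formula worth noting: a limit of 1 gives log2(1) = 0, so the estimated CPU cost is zero regardless of the input size.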
use of org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory in project drill by apache.
the class ScanPrel method computeSelfCost.
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, RelMetadataQuery mq) {
  final PlannerSettings settings = PrelUtil.getPlannerSettings(planner);
  final ScanStats stats = getGroupScan().getScanStats(settings);
  final int columnCount = getRowType().getFieldCount();
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    return planner.getCostFactory().makeCost(stats.getRecordCount() * columnCount, stats.getCpuCost(), stats.getDiskCost());
  }
  double rowCount = mq.getRowCount(this);
  // As DRILL-4083 points out, when columnCount == 0, cpuCost becomes zero,
  // which makes the costs of HiveScan and HiveDrillNativeParquetScan the same.
  // For now, assume cpu cost is proportional to row count.
  // Note that this ignores the disk cost estimate (which should be a proxy for
  // row count * row width.)
  double cpuCost = rowCount * Math.max(columnCount, 1);
  // If a positive value for CPU cost is given, multiply the default CPU cost by the given CPU cost.
  if (stats.getCpuCost() > 0) {
    cpuCost *= stats.getCpuCost();
  }
  double ioCost = stats.getDiskCost();
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(rowCount, cpuCost, ioCost, 0);
}
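The column-count guard described in the DRILL-4083 comment is easy to exercise in isolation. The sketch below is standalone, uses no Drill classes, and takes the scan statistics as plain arguments. The class and method names are hypothetical.

```java
// Standalone sketch of the scan CPU cost arithmetic; no Drill classes are used.
final class ScanCostSketch {
  // statsCpuCost stands in for the value returned by ScanStats.getCpuCost().
  static double cpuCost(double rowCount, int columnCount, double statsCpuCost) {
    // Treat a zero-column scan as one column so the cost never collapses to zero.
    double cpu = rowCount * Math.max(columnCount, 1);
    // A positive stats value acts as a multiplier on the default estimate.
    if (statsCpuCost > 0) {
      cpu *= statsCpuCost;
    }
    return cpu;
  }

  public static void main(String[] args) {
    System.out.println("no columns:       " + cpuCost(1000, 0, 0));
    System.out.println("three columns:    " + cpuCost(1000, 3, 0));
    System.out.println("with stats of 2x: " + cpuCost(1000, 3, 2.0));
  }
}
```

Because a non-positive stats value is ignored rather than multiplied in, a group scan that reports no CPU estimate still receives a cost proportional to rows times columns.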