use of org.apache.drill.exec.physical.base.ScanStats in project drill by axbaretto.
the class OpenTSDBGroupScan method getScanStats.
@Override
public ScanStats getScanStats() {
  ServiceImpl client = storagePlugin.getClient();
  Map<String, String> params = fromRowData(openTSDBScanSpec.getTableName());
  Set<MetricDTO> allMetrics = client.getAllMetrics(params);
  long numMetrics = allMetrics.size();
  float approxDiskCost = 0;
  if (numMetrics != 0) {
    MetricDTO metricDTO = allMetrics.iterator().next();
    // This method estimates the size of a Java object (the number of bytes of memory it occupies).
    // More detailed information about how this estimation works can be found in this article:
    // http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html
    approxDiskCost = SizeEstimator.estimate(metricDTO) * numMetrics;
  }
  return new ScanStats(ScanStats.GroupScanProperty.EXACT_ROW_COUNT, numMetrics, 1, approxDiskCost);
}
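The disk cost here is a sample-and-extrapolate estimate: the measured in-memory size of one fetched metric is multiplied by the total metric count. A minimal standalone sketch of the same idea, with a hypothetical estimateBytes helper standing in for Drill's SizeEstimator.estimate:

import java.util.Set;

public class DiskCostSketch {
  // Hypothetical stand-in for SizeEstimator.estimate(Object); a fixed shallow
  // size is assumed here, whereas the real estimator walks the object graph.
  static long estimateBytes(Object o) {
    return 480;
  }

  static float approxDiskCost(Set<String> allMetrics) {
    long numMetrics = allMetrics.size();
    if (numMetrics == 0) {
      return 0;
    }
    // Sample one element and extrapolate: sizeOf(sample) * count.
    return estimateBytes(allMetrics.iterator().next()) * numMetrics;
  }

  public static void main(String[] args) {
    // Two metrics at an assumed 480 bytes each -> estimated disk cost of 960.0.
    System.out.println(approxDiskCost(Set.of("sys.cpu.user", "sys.cpu.sys")));
  }
}

Note that the estimate is only as good as the sampled element; metrics with very different tag sets would skew it.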
use of org.apache.drill.exec.physical.base.ScanStats in project drill by axbaretto.
the class ScanPrel method computeSelfCost.
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, RelMetadataQuery mq) {
  final PlannerSettings settings = PrelUtil.getPlannerSettings(planner);
  final ScanStats stats = this.groupScan.getScanStats(settings);
  final int columnCount = this.getRowType().getFieldCount();
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    return planner.getCostFactory().makeCost(stats.getRecordCount() * columnCount, stats.getCpuCost(), stats.getDiskCost());
  }
  // double rowCount = RelMetadataQuery.getRowCount(this);
  double rowCount = stats.getRecordCount();
  // As DRILL-4083 points out, when columnCount == 0, cpuCost becomes zero,
  // which makes the costs of HiveScan and HiveDrillNativeParquetScan the same.
  // For now, assume cpu cost is proportional to row count.
  double cpuCost = rowCount * Math.max(columnCount, 1);
  // If a positive CPU cost is given, multiply the default CPU cost by the given value.
  if (stats.getCpuCost() > 0) {
    cpuCost *= stats.getCpuCost();
  }
  // Even though the scan reads from disk, all currently generated plans need to read
  // the same amount of data, so keeping the disk io cost at 0 is ok for now.
  // In the future we might consider alternative scans that go against projections or
  // different compression schemes etc. that affect the amount of data read. Such
  // alternatives would affect both cpu and io cost.
  double ioCost = 0;
  DrillCostFactory costFactory = (DrillCostFactory) planner.getCostFactory();
  return costFactory.makeCost(rowCount, cpuCost, ioCost, 0);
}
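The resulting CPU cost is easy to trace by hand. A short sketch of the same arithmetic with hypothetical inputs (1,000,000 rows, 8 projected columns, and a plugin-supplied CPU cost factor of 2.5):

public class ScanPrelCostSketch {
  public static void main(String[] args) {
    double rowCount = 1_000_000;   // assumed stats.getRecordCount()
    int columnCount = 8;           // assumed getRowType().getFieldCount()
    double statsCpuCost = 2.5;     // assumed stats.getCpuCost()

    // Math.max guards against a zero column count (see DRILL-4083 above).
    double cpuCost = rowCount * Math.max(columnCount, 1); // 8,000,000
    if (statsCpuCost > 0) {
      cpuCost *= statsCpuCost; // scaled to 20,000,000
    }
    System.out.println("cpuCost = " + cpuCost + ", ioCost = 0");
  }
}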
use of org.apache.drill.exec.physical.base.ScanStats in project drill by apache.
the class IcebergGroupScan method getScanStats.
@Override
public ScanStats getScanStats() {
  int expectedRecordsPerChunk = 1_000_000;
  if (maxRecords >= 0) {
    expectedRecordsPerChunk = Math.max(maxRecords, 1);
  }
  int estimatedRecords = chunks.size() * expectedRecordsPerChunk;
  return new ScanStats(ScanStats.GroupScanProperty.NO_EXACT_ROW_COUNT, estimatedRecords, 1, 0);
}
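The estimate assumes one million records per chunk unless a row limit has been pushed down (maxRecords >= 0), in which case the per-chunk expectation is clamped to at least 1 so the product never collapses to zero. A standalone sketch of that clamping (hypothetical helper; it also widens to long, which the original int multiplication does not):

public class IcebergEstimateSketch {
  static long estimatedRecords(int chunkCount, int maxRecords) {
    int expectedRecordsPerChunk = 1_000_000;
    if (maxRecords >= 0) {
      // A pushed-down LIMIT caps the expectation; Math.max keeps it >= 1
      // so the estimate stays positive even for LIMIT 0.
      expectedRecordsPerChunk = Math.max(maxRecords, 1);
    }
    return (long) chunkCount * expectedRecordsPerChunk;
  }

  public static void main(String[] args) {
    System.out.println(estimatedRecords(4, -1)); // no limit: 4000000
    System.out.println(estimatedRecords(4, 10)); // LIMIT 10: 40
    System.out.println(estimatedRecords(4, 0));  // LIMIT 0: clamped to 4
  }
}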
use of org.apache.drill.exec.physical.base.ScanStats in project drill by apache.
the class OpenTSDBGroupScan method getScanStats.
The method body is identical to the axbaretto snippet shown above.
use of org.apache.drill.exec.physical.base.ScanStats in project drill by apache.
the class DrillScanRel method computeSelfCost.
// TODO: this method is the same as the one for ScanPrel... eventually we should consolidate
// this and a few other methods in a common base class which would be extended
// by both logical and physical rels.
// TODO: Further changes may have caused the versions to diverge.
// TODO: Does not compute IO cost by default, but should. Changing that may break
// existing plugins.
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, RelMetadataQuery mq) {
  final ScanStats stats = getGroupScan().getScanStats(settings);
  int columnCount = Utilities.isStarQuery(columns) ? STAR_COLUMN_COST : getRowType().getFieldCount();
  // double rowCount = RelMetadataQuery.getRowCount(this);
  double rowCount = Math.max(1, stats.getRecordCount());
  double valueCount = rowCount * columnCount;
  if (PrelUtil.getSettings(getCluster()).useDefaultCosting()) {
    // Ideally the planner should control the cost model; that is, this path should be removed.
    return planner.getCostFactory().makeCost(valueCount, stats.getCpuCost(), stats.getDiskCost());
  }
  double cpuCost;
  double ioCost;
  if (stats.getGroupScanProperty().hasFullCost()) {
    cpuCost = stats.getCpuCost();
    ioCost = stats.getDiskCost();
  } else {
    // For now, assume cpu cost is proportional to row count and number of columns.
    cpuCost = valueCount;
    // The default io cost should be proportional to valueCount, but is kept at 0 for now.
    ioCost = 0;
  }
  return planner.getCostFactory().makeCost(rowCount, cpuCost, ioCost);
}
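Which branch runs depends on whether the group scan's ScanStats reports a full cost. A standalone sketch of the two outcomes, with hypothetical stand-ins for hasFullCost(), getCpuCost(), and getDiskCost():

import java.util.Arrays;

public class DrillScanCostSketch {
  static double[] selfCost(double recordCount, int columnCount,
                           boolean hasFullCost, double statsCpu, double statsDisk) {
    double rowCount = Math.max(1, recordCount);
    double valueCount = rowCount * columnCount;
    double cpuCost;
    double ioCost;
    if (hasFullCost) {
      // Trust the plugin's own cpu and disk estimates.
      cpuCost = statsCpu;
      ioCost = statsDisk;
    } else {
      // Fallback: cpu proportional to rows * columns, io ignored.
      cpuCost = valueCount;
      ioCost = 0;
    }
    return new double[] {rowCount, cpuCost, ioCost};
  }

  public static void main(String[] args) {
    // Same 1M-row, 8-column scan, with and without full plugin-supplied costs.
    System.out.println(Arrays.toString(selfCost(1_000_000, 8, true, 5e6, 2e7)));
    System.out.println(Arrays.toString(selfCost(1_000_000, 8, false, 0, 0)));
  }
}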