Search in sources :

Example 1 with LocalCostEstimate

use of io.trino.cost.LocalCostEstimate in project trino by trinodb.

the class DetermineSemiJoinDistributionType method getSemiJoinNodeWithCost.

private PlanNodeWithCost getSemiJoinNodeWithCost(SemiJoinNode possibleJoinNode, Context context) {
    TypeProvider types = context.getSymbolAllocator().getTypes();
    StatsProvider stats = context.getStatsProvider();
    boolean replicated = possibleJoinNode.getDistributionType().get() == REPLICATED;
    /*
         *   HACK!
         *
         *   Currently cost model always has to compute the total cost of an operation.
         *   For SEMI-JOIN the total cost consist of 4 parts:
         *     - Cost of exchanges that have to be introduced to execute a JOIN
         *     - Cost of building a hash table
         *     - Cost of probing a hash table
         *     - Cost of building an output for matched rows
         *
         *   When output size for a SEMI-JOIN cannot be estimated the cost model returns
         *   UNKNOWN cost for the join.
         *
         *   However assuming the cost of SEMI-JOIN output is always the same, we can still make
         *   cost based decisions based on the input cost for different types of SEMI-JOINs.
         *
         *   TODO Decision about the distribution should be based on LocalCostEstimate only when PlanCostEstimate cannot be calculated. Otherwise cost comparator cannot take query.max-memory into account.
         */
    int estimatedSourceDistributedTaskCount = taskCountEstimator.estimateSourceDistributedTaskCount(context.getSession());
    LocalCostEstimate cost = calculateJoinCostWithoutOutput(possibleJoinNode.getSource(), possibleJoinNode.getFilteringSource(), stats, types, replicated, estimatedSourceDistributedTaskCount);
    return new PlanNodeWithCost(cost.toPlanCost(), possibleJoinNode);
}
Also used : StatsProvider(io.trino.cost.StatsProvider) TypeProvider(io.trino.sql.planner.TypeProvider) LocalCostEstimate(io.trino.cost.LocalCostEstimate)

Example 2 with LocalCostEstimate

use of io.trino.cost.LocalCostEstimate in project trino by trinodb.

the class DetermineJoinDistributionType method getJoinNodeWithCost.

private PlanNodeWithCost getJoinNodeWithCost(Context context, JoinNode possibleJoinNode) {
    TypeProvider types = context.getSymbolAllocator().getTypes();
    StatsProvider stats = context.getStatsProvider();
    boolean replicated = possibleJoinNode.getDistributionType().get() == REPLICATED;
    /*
         *   HACK!
         *
         *   Currently cost model always has to compute the total cost of an operation.
         *   For JOIN the total cost consist of 4 parts:
         *     - Cost of exchanges that have to be introduced to execute a JOIN
         *     - Cost of building a hash table
         *     - Cost of probing a hash table
         *     - Cost of building an output for matched rows
         *
         *   When output size for a JOIN cannot be estimated the cost model returns
         *   UNKNOWN cost for the join.
         *
         *   However assuming the cost of JOIN output is always the same, we can still make
         *   cost based decisions based on the input cost for different types of JOINs.
         *
         *   Although the side flipping can be made purely based on stats (smaller side
         *   always goes to the right), determining JOIN type is not that simple. As when
         *   choosing REPLICATED over REPARTITIONED join the cost of exchanging and building
         *   the hash table scales with the number of nodes where the build side is replicated.
         *
         *   TODO Decision about the distribution should be based on LocalCostEstimate only when PlanCostEstimate cannot be calculated. Otherwise cost comparator cannot take query.max-memory into account.
         */
    int estimatedSourceDistributedTaskCount = taskCountEstimator.estimateSourceDistributedTaskCount(context.getSession());
    LocalCostEstimate cost = calculateJoinCostWithoutOutput(possibleJoinNode.getLeft(), possibleJoinNode.getRight(), stats, types, replicated, estimatedSourceDistributedTaskCount);
    return new PlanNodeWithCost(cost.toPlanCost(), possibleJoinNode);
}
Also used : StatsProvider(io.trino.cost.StatsProvider) TypeProvider(io.trino.sql.planner.TypeProvider) LocalCostEstimate(io.trino.cost.LocalCostEstimate)

Aggregations

LocalCostEstimate (io.trino.cost.LocalCostEstimate)2 StatsProvider (io.trino.cost.StatsProvider)2 TypeProvider (io.trino.sql.planner.TypeProvider)2