Search in sources :

Example 6 with SymbolStatsEstimate

use of io.trino.cost.SymbolStatsEstimate in project trino by trinodb.

the class TestDetermineJoinDistributionType method testPartitionWhenBothTablesEqual.

@Test
public void testPartitionWhenBothTablesEqual() {
    int aRows = 10_000;
    int bRows = 10_000;
    assertDetermineJoinDistributionType().setSystemProperty(JOIN_DISTRIBUTION_TYPE, JoinDistributionType.AUTOMATIC.name()).overrideStats("valuesA", PlanNodeStatsEstimate.builder().setOutputRowCount(aRows).addSymbolStatistics(ImmutableMap.of(new Symbol("A1"), new SymbolStatsEstimate(0, 100, 0, 640000, 100))).build()).overrideStats("valuesB", PlanNodeStatsEstimate.builder().setOutputRowCount(bRows).addSymbolStatistics(ImmutableMap.of(new Symbol("B1"), new SymbolStatsEstimate(0, 100, 0, 640000, 100))).build()).on(p -> p.join(INNER, p.values(new PlanNodeId("valuesA"), aRows, p.symbol("A1", BIGINT)), p.values(new PlanNodeId("valuesB"), bRows, p.symbol("B1", BIGINT)), ImmutableList.of(new JoinNode.EquiJoinClause(p.symbol("A1", BIGINT), p.symbol("B1", BIGINT))), ImmutableList.of(p.symbol("A1", BIGINT)), ImmutableList.of(p.symbol("B1", BIGINT)), Optional.empty())).matches(join(INNER, ImmutableList.of(equiJoinClause("A1", "B1")), Optional.empty(), Optional.of(PARTITIONED), values(ImmutableMap.of("A1", 0)), values(ImmutableMap.of("B1", 0))));
}
Also used : PARTITIONED(io.trino.sql.planner.plan.JoinNode.DistributionType.PARTITIONED) INNER(io.trino.sql.planner.plan.JoinNode.Type.INNER) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) Assert.assertEquals(org.testng.Assert.assertEquals) Test(org.testng.annotations.Test) PlanMatchPattern.filter(io.trino.sql.planner.assertions.PlanMatchPattern.filter) REPLICATED(io.trino.sql.planner.plan.JoinNode.DistributionType.REPLICATED) Lookup.noLookup(io.trino.sql.planner.iterative.Lookup.noLookup) VarcharType(io.trino.spi.type.VarcharType) LEFT(io.trino.sql.planner.plan.JoinNode.Type.LEFT) RuleAssert(io.trino.sql.planner.iterative.rule.test.RuleAssert) Type(io.trino.sql.planner.plan.JoinNode.Type) PlanBuilder.expressions(io.trino.sql.planner.iterative.rule.test.PlanBuilder.expressions) ImmutableList(com.google.common.collect.ImmutableList) NaN(java.lang.Double.NaN) DistributionType(io.trino.sql.planner.plan.JoinNode.DistributionType) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) PlanBuilder(io.trino.sql.planner.iterative.rule.test.PlanBuilder) PlanMatchPattern.equiJoinClause(io.trino.sql.planner.assertions.PlanMatchPattern.equiJoinClause) JoinNode(io.trino.sql.planner.plan.JoinNode) TableScanNode(io.trino.sql.planner.plan.TableScanNode) PlanMatchPattern.join(io.trino.sql.planner.assertions.PlanMatchPattern.join) PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) JOIN_MAX_BROADCAST_TABLE_SIZE(io.trino.SystemSessionProperties.JOIN_MAX_BROADCAST_TABLE_SIZE) TaskCountEstimator(io.trino.cost.TaskCountEstimator) Symbol(io.trino.sql.planner.Symbol) AfterClass(org.testng.annotations.AfterClass) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) RuleTester.defaultRuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester.defaultRuleTester) ImmutableMap(com.google.common.collect.ImmutableMap) BeforeClass(org.testng.annotations.BeforeClass) FULL(io.trino.sql.planner.plan.JoinNode.Type.FULL) RuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester) PlanMatchPattern.values(io.trino.sql.planner.assertions.PlanMatchPattern.values) TRUE_LITERAL(io.trino.sql.tree.BooleanLiteral.TRUE_LITERAL) JoinDistributionType(io.trino.sql.planner.OptimizerConfig.JoinDistributionType) DetermineJoinDistributionType.getSourceTablesSizeInBytes(io.trino.sql.planner.iterative.rule.DetermineJoinDistributionType.getSourceTablesSizeInBytes) CostComparator(io.trino.cost.CostComparator) PlanMatchPattern.enforceSingleRow(io.trino.sql.planner.assertions.PlanMatchPattern.enforceSingleRow) BIGINT(io.trino.spi.type.BigintType.BIGINT) RIGHT(io.trino.sql.planner.plan.JoinNode.Type.RIGHT) JOIN_DISTRIBUTION_TYPE(io.trino.SystemSessionProperties.JOIN_DISTRIBUTION_TYPE) ImmutableListMultimap(com.google.common.collect.ImmutableListMultimap) Optional(java.util.Optional) ValuesNode(io.trino.sql.planner.plan.ValuesNode) TestingColumnHandle(io.trino.testing.TestingMetadata.TestingColumnHandle) PlanBuilder.expression(io.trino.sql.planner.iterative.rule.test.PlanBuilder.expression) PlanNodeIdAllocator(io.trino.sql.planner.PlanNodeIdAllocator) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) Symbol(io.trino.sql.planner.Symbol) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) Test(org.testng.annotations.Test)

Example 7 with SymbolStatsEstimate

use of io.trino.cost.SymbolStatsEstimate in project trino by trinodb.

the class TestDetermineSemiJoinDistributionType method testReplicatesWhenNotRestricted.

@Test
public void testReplicatesWhenNotRestricted() {
    // variable width so that average row size is respected
    Type symbolType = createUnboundedVarcharType();
    int aRows = 10_000;
    int bRows = 10;
    PlanNodeStatsEstimate probeSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(aRows).addSymbolStatistics(ImmutableMap.of(new Symbol("A1"), new SymbolStatsEstimate(0, 100, 0, 640000, 10))).build();
    PlanNodeStatsEstimate buildSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(bRows).addSymbolStatistics(ImmutableMap.of(new Symbol("B1"), new SymbolStatsEstimate(0, 100, 0, 640000, 10))).build();
    // B table is small enough to be replicated in AUTOMATIC_RESTRICTED mode
    assertDetermineSemiJoinDistributionType().setSystemProperty(JOIN_DISTRIBUTION_TYPE, JoinDistributionType.AUTOMATIC.name()).setSystemProperty(JOIN_MAX_BROADCAST_TABLE_SIZE, "100MB").overrideStats("valuesA", probeSideStatsEstimate).overrideStats("valuesB", buildSideStatsEstimate).on(p -> {
        Symbol a1 = p.symbol("A1", symbolType);
        Symbol b1 = p.symbol("B1", symbolType);
        return p.semiJoin(p.values(new PlanNodeId("valuesA"), aRows, a1), p.values(new PlanNodeId("valuesB"), bRows, b1), a1, b1, p.symbol("output"), Optional.empty(), Optional.empty(), Optional.empty());
    }).matches(semiJoin("A1", "B1", "output", Optional.of(REPLICATED), values(ImmutableMap.of("A1", 0)), values(ImmutableMap.of("B1", 0))));
    probeSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(aRows).addSymbolStatistics(ImmutableMap.of(new Symbol("A1"), new SymbolStatsEstimate(0, 100, 0, 640000d * 10000, 10))).build();
    buildSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(bRows).addSymbolStatistics(ImmutableMap.of(new Symbol("B1"), new SymbolStatsEstimate(0, 100, 0, 640000d * 10000, 10))).build();
    // B table exceeds AUTOMATIC_RESTRICTED limit therefore it is partitioned
    assertDetermineSemiJoinDistributionType().setSystemProperty(JOIN_DISTRIBUTION_TYPE, JoinDistributionType.AUTOMATIC.name()).setSystemProperty(JOIN_MAX_BROADCAST_TABLE_SIZE, "100MB").overrideStats("valuesA", probeSideStatsEstimate).overrideStats("valuesB", buildSideStatsEstimate).on(p -> {
        Symbol a1 = p.symbol("A1", symbolType);
        Symbol b1 = p.symbol("B1", symbolType);
        return p.semiJoin(p.values(new PlanNodeId("valuesA"), aRows, a1), p.values(new PlanNodeId("valuesB"), bRows, b1), a1, b1, p.symbol("output"), Optional.empty(), Optional.empty(), Optional.empty());
    }).matches(semiJoin("A1", "B1", "output", Optional.of(PARTITIONED), values(ImmutableMap.of("A1", 0)), values(ImmutableMap.of("B1", 0))));
}
Also used : PlanMatchPattern.semiJoin(io.trino.sql.planner.assertions.PlanMatchPattern.semiJoin) Type(io.trino.spi.type.Type) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) Test(org.testng.annotations.Test) PlanMatchPattern.filter(io.trino.sql.planner.assertions.PlanMatchPattern.filter) PARTITIONED(io.trino.sql.planner.plan.SemiJoinNode.DistributionType.PARTITIONED) RuleAssert(io.trino.sql.planner.iterative.rule.test.RuleAssert) PlanBuilder.expressions(io.trino.sql.planner.iterative.rule.test.PlanBuilder.expressions) ImmutableList(com.google.common.collect.ImmutableList) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) JOIN_MAX_BROADCAST_TABLE_SIZE(io.trino.SystemSessionProperties.JOIN_MAX_BROADCAST_TABLE_SIZE) TaskCountEstimator(io.trino.cost.TaskCountEstimator) Symbol(io.trino.sql.planner.Symbol) AfterClass(org.testng.annotations.AfterClass) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) RuleTester.defaultRuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester.defaultRuleTester) ImmutableMap(com.google.common.collect.ImmutableMap) BeforeClass(org.testng.annotations.BeforeClass) RuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester) PlanMatchPattern.values(io.trino.sql.planner.assertions.PlanMatchPattern.values) REPLICATED(io.trino.sql.planner.plan.SemiJoinNode.DistributionType.REPLICATED) TRUE_LITERAL(io.trino.sql.tree.BooleanLiteral.TRUE_LITERAL) JoinDistributionType(io.trino.sql.planner.OptimizerConfig.JoinDistributionType) CostComparator(io.trino.cost.CostComparator) BIGINT(io.trino.spi.type.BigintType.BIGINT) JOIN_DISTRIBUTION_TYPE(io.trino.SystemSessionProperties.JOIN_DISTRIBUTION_TYPE) Optional(java.util.Optional) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) Type(io.trino.spi.type.Type) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) JoinDistributionType(io.trino.sql.planner.OptimizerConfig.JoinDistributionType) PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) Symbol(io.trino.sql.planner.Symbol) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) Test(org.testng.annotations.Test)

Example 8 with SymbolStatsEstimate

use of io.trino.cost.SymbolStatsEstimate in project trino by trinodb.

the class TestDetermineSemiJoinDistributionType method testReplicatesWhenSourceIsSmall.

@Test
public void testReplicatesWhenSourceIsSmall() {
    // variable width so that average row size is respected
    Type symbolType = createUnboundedVarcharType();
    int aRows = 10_000;
    int bRows = 10;
    PlanNodeStatsEstimate probeSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(aRows).addSymbolStatistics(ImmutableMap.of(new Symbol("A1"), new SymbolStatsEstimate(0, 100, 0, 640000d * 10000, 10))).build();
    PlanNodeStatsEstimate buildSideStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(bRows).addSymbolStatistics(ImmutableMap.of(new Symbol("B1"), new SymbolStatsEstimate(0, 100, 0, 640000d * 10000, 10))).build();
    PlanNodeStatsEstimate buildSideSourceStatsEstimate = PlanNodeStatsEstimate.builder().setOutputRowCount(bRows).addSymbolStatistics(ImmutableMap.of(new Symbol("B1"), new SymbolStatsEstimate(0, 100, 0, 64, 10))).build();
    // build side exceeds AUTOMATIC_RESTRICTED limit but source plan nodes are small
    // therefore replicated distribution type is chosen
    assertDetermineSemiJoinDistributionType().setSystemProperty(JOIN_DISTRIBUTION_TYPE, JoinDistributionType.AUTOMATIC.name()).setSystemProperty(JOIN_MAX_BROADCAST_TABLE_SIZE, "100MB").overrideStats("valuesA", probeSideStatsEstimate).overrideStats("filterB", buildSideStatsEstimate).overrideStats("valuesB", buildSideSourceStatsEstimate).on(p -> {
        Symbol a1 = p.symbol("A1", symbolType);
        Symbol b1 = p.symbol("B1", symbolType);
        return p.semiJoin(p.values(new PlanNodeId("valuesA"), aRows, a1), p.filter(new PlanNodeId("filterB"), TRUE_LITERAL, p.values(new PlanNodeId("valuesB"), bRows, b1)), a1, b1, p.symbol("output"), Optional.empty(), Optional.empty(), Optional.empty());
    }).matches(semiJoin("A1", "B1", "output", Optional.of(REPLICATED), values(ImmutableMap.of("A1", 0)), filter("true", values(ImmutableMap.of("B1", 0)))));
}
Also used : PlanMatchPattern.semiJoin(io.trino.sql.planner.assertions.PlanMatchPattern.semiJoin) Type(io.trino.spi.type.Type) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) Test(org.testng.annotations.Test) PlanMatchPattern.filter(io.trino.sql.planner.assertions.PlanMatchPattern.filter) PARTITIONED(io.trino.sql.planner.plan.SemiJoinNode.DistributionType.PARTITIONED) RuleAssert(io.trino.sql.planner.iterative.rule.test.RuleAssert) PlanBuilder.expressions(io.trino.sql.planner.iterative.rule.test.PlanBuilder.expressions) ImmutableList(com.google.common.collect.ImmutableList) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) JOIN_MAX_BROADCAST_TABLE_SIZE(io.trino.SystemSessionProperties.JOIN_MAX_BROADCAST_TABLE_SIZE) TaskCountEstimator(io.trino.cost.TaskCountEstimator) Symbol(io.trino.sql.planner.Symbol) AfterClass(org.testng.annotations.AfterClass) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) RuleTester.defaultRuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester.defaultRuleTester) ImmutableMap(com.google.common.collect.ImmutableMap) BeforeClass(org.testng.annotations.BeforeClass) RuleTester(io.trino.sql.planner.iterative.rule.test.RuleTester) PlanMatchPattern.values(io.trino.sql.planner.assertions.PlanMatchPattern.values) REPLICATED(io.trino.sql.planner.plan.SemiJoinNode.DistributionType.REPLICATED) TRUE_LITERAL(io.trino.sql.tree.BooleanLiteral.TRUE_LITERAL) JoinDistributionType(io.trino.sql.planner.OptimizerConfig.JoinDistributionType) CostComparator(io.trino.cost.CostComparator) BIGINT(io.trino.spi.type.BigintType.BIGINT) JOIN_DISTRIBUTION_TYPE(io.trino.SystemSessionProperties.JOIN_DISTRIBUTION_TYPE) Optional(java.util.Optional) PlanNodeId(io.trino.sql.planner.plan.PlanNodeId) Type(io.trino.spi.type.Type) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) JoinDistributionType(io.trino.sql.planner.OptimizerConfig.JoinDistributionType) PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) Symbol(io.trino.sql.planner.Symbol) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) Test(org.testng.annotations.Test)

Example 9 with SymbolStatsEstimate

use of io.trino.cost.SymbolStatsEstimate in project trino by trinodb.

the class TestApplyPreferredTableWriterPartitioning method testThresholdWithNullFraction.

@Test
public void testThresholdWithNullFraction() {
    // Null value in partition column should increase the number of partitions by 1
    PlanNodeStatsEstimate stats = PlanNodeStatsEstimate.builder().addSymbolStatistics(ImmutableMap.of(new Symbol("col_one"), new SymbolStatsEstimate(0, 0, .5, 0, 49))).build();
    assertPreferredPartitioning(new PartitioningScheme(Partitioning.create(FIXED_HASH_DISTRIBUTION, ImmutableList.of(new Symbol("col_one"))), ImmutableList.of(new Symbol("col_one")))).withSession(SESSION_WITH_PREFERRED_PARTITIONING_DEFAULT_THRESHOLD).overrideStats(NODE_ID, stats).matches(SUCCESSFUL_MATCH);
}
Also used : PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) Symbol(io.trino.sql.planner.Symbol) PartitioningScheme(io.trino.sql.planner.PartitioningScheme) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) Test(org.testng.annotations.Test)

Example 10 with SymbolStatsEstimate

use of io.trino.cost.SymbolStatsEstimate in project trino by trinodb.

the class TestApplyPreferredTableWriterPartitioning method testThresholdWithMultiplePartitions.

@Test
public void testThresholdWithMultiplePartitions() {
    PlanNodeStatsEstimate stats = PlanNodeStatsEstimate.builder().addSymbolStatistics(ImmutableMap.of(new Symbol("col_one"), new SymbolStatsEstimate(0, 0, 0, 0, 5))).addSymbolStatistics(ImmutableMap.of(new Symbol("col_two"), new SymbolStatsEstimate(0, 0, 0, 0, 10))).build();
    assertPreferredPartitioning(new PartitioningScheme(Partitioning.create(FIXED_HASH_DISTRIBUTION, ImmutableList.of(new Symbol("col_one"), new Symbol("col_two"))), ImmutableList.of(new Symbol("col_one"), new Symbol("col_two")))).withSession(SESSION_WITH_PREFERRED_PARTITIONING_DEFAULT_THRESHOLD).overrideStats(NODE_ID, stats).matches(SUCCESSFUL_MATCH);
}
Also used : PlanNodeStatsEstimate(io.trino.cost.PlanNodeStatsEstimate) Symbol(io.trino.sql.planner.Symbol) PartitioningScheme(io.trino.sql.planner.PartitioningScheme) SymbolStatsEstimate(io.trino.cost.SymbolStatsEstimate) Test(org.testng.annotations.Test)

Aggregations

SymbolStatsEstimate (io.trino.cost.SymbolStatsEstimate)21 Symbol (io.trino.sql.planner.Symbol)21 Test (org.testng.annotations.Test)20 PlanNodeStatsEstimate (io.trino.cost.PlanNodeStatsEstimate)18 VarcharType.createUnboundedVarcharType (io.trino.spi.type.VarcharType.createUnboundedVarcharType)18 PlanNodeId (io.trino.sql.planner.plan.PlanNodeId)18 ImmutableList (com.google.common.collect.ImmutableList)16 ImmutableMap (com.google.common.collect.ImmutableMap)16 JoinDistributionType (io.trino.sql.planner.OptimizerConfig.JoinDistributionType)16 Optional (java.util.Optional)16 JOIN_DISTRIBUTION_TYPE (io.trino.SystemSessionProperties.JOIN_DISTRIBUTION_TYPE)15 JOIN_MAX_BROADCAST_TABLE_SIZE (io.trino.SystemSessionProperties.JOIN_MAX_BROADCAST_TABLE_SIZE)15 CostComparator (io.trino.cost.CostComparator)15 PlanMatchPattern.values (io.trino.sql.planner.assertions.PlanMatchPattern.values)15 RuleAssert (io.trino.sql.planner.iterative.rule.test.RuleAssert)15 RuleTester (io.trino.sql.planner.iterative.rule.test.RuleTester)15 RuleTester.defaultRuleTester (io.trino.sql.planner.iterative.rule.test.RuleTester.defaultRuleTester)15 AfterClass (org.testng.annotations.AfterClass)15 BeforeClass (org.testng.annotations.BeforeClass)15 PlanMatchPattern.equiJoinClause (io.trino.sql.planner.assertions.PlanMatchPattern.equiJoinClause)12