Example 21 with HashJoinPOP

Use of org.apache.drill.exec.physical.config.HashJoinPOP in the Apache Drill project.

From class TestHashJoinSpill, method testLeftOuterHashJoinSpill.

@SuppressWarnings("unchecked")
@Test
public void testLeftOuterHashJoinSpill() {
    HashJoinPOP joinConf = new HashJoinPOP(null, null, Lists.newArrayList(joinCond("lft", "EQUALS", "rgt")), JoinRelType.LEFT, null);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_partitions", 8);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_rows_in_batch", 64);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.max_batches_in_memory", 12);
    // Put some duplicate values
    List<String> leftTable = Lists.newArrayList("[{\"lft\": 0, \"a\" : \"a string\"}]", "[{\"lft\": 0, \"a\" : \"a different string\"},{\"lft\": 0, \"a\" : \"yet another\"}]");
    List<String> rightTable = Lists.newArrayList("[{\"rgt\": 0, \"b\" : \"a string\"}]", "[{\"rgt\": 0, \"b\" : \"a different string\"},{\"rgt\": 0, \"b\" : \"yet another\"}]");
    // 100_000
    int numRows = 4_000;
    for (int cnt = 1; cnt <= numRows / 2; cnt++) {
        // the inner (right) side gets only half the keys, so the left-outer join must also emit unmatched left rows
        // leftTable.add("[{\"lft\": " + cnt + ", \"a\" : \"a string\"}]");
        rightTable.add("[{\"rgt\": " + cnt + ", \"b\" : \"a string\"}]");
    }
    for (int cnt = 1; cnt <= numRows; cnt++) {
        leftTable.add("[{\"lft\": " + cnt + ", \"a\" : \"a string\"}]");
        // rightTable.add("[{\"rgt\": " + cnt + ", \"b\" : \"a string\"}]");
    }
    legacyOpTestBuilder().physicalOperator(joinConf).inputDataStreamsJson(Lists.newArrayList(leftTable, rightTable)).baselineColumns("lft", "a", "b", "rgt").expectedTotalRows(numRows + 9).go();
}
Also used : HashJoinPOP(org.apache.drill.exec.physical.config.HashJoinPOP) OperatorTest(org.apache.drill.categories.OperatorTest) SlowTest(org.apache.drill.categories.SlowTest) Test(org.junit.Test)
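
The expected total of numRows + 9 in the test above can be checked with plain collections. The following is a minimal sketch, not Drill code: the class JoinRowCount and its methods are hypothetical names introduced here, and only the join keys from the test data are modeled. Both sides carry three rows with key 0, which yields 3 x 3 = 9 matched rows; the remaining left keys each produce one row, matched or not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Drill: counts the rows a left outer
// equi-join would produce, using only the join keys of each side.
class JoinRowCount {

    static long leftOuterJoinRows(List<Integer> leftKeys, List<Integer> rightKeys) {
        // Count how many right rows exist for each key.
        Map<Integer, Long> rightCounts = new HashMap<>();
        for (int key : rightKeys) {
            rightCounts.merge(key, 1L, Long::sum);
        }
        long rows = 0;
        for (int key : leftKeys) {
            long matches = rightCounts.getOrDefault(key, 0L);
            // An unmatched left row still yields one output row (null-padded).
            rows += (matches == 0) ? 1 : matches;
        }
        return rows;
    }

    // Rebuilds the key distribution used by testLeftOuterHashJoinSpill.
    static long expectedLeftOuterRows(int numRows) {
        List<Integer> left = new ArrayList<>();
        List<Integer> right = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            left.add(0);   // three duplicate rows with key 0 on each side
            right.add(0);
        }
        for (int cnt = 1; cnt <= numRows / 2; cnt++) {
            right.add(cnt);
        }
        for (int cnt = 1; cnt <= numRows; cnt++) {
            left.add(cnt);
        }
        return leftOuterJoinRows(left, right);
    }

    public static void main(String[] args) {
        System.out.println(expectedLeftOuterRows(4_000)); // prints 4009
    }
}
```

With numRows = 4_000 the count is 9 + 2_000 matched + 2_000 unmatched = 4_009, which agrees with expectedTotalRows(numRows + 9).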

Example 22 with HashJoinPOP

Use of org.apache.drill.exec.physical.config.HashJoinPOP in the Apache Drill project.

From class TestHashJoinSpill, method testSimpleHashJoinSpill.

@SuppressWarnings("unchecked")
@Test
// Should spill, including recursive spill
public void testSimpleHashJoinSpill() {
    HashJoinPOP joinConf = new HashJoinPOP(null, null, Lists.newArrayList(joinCond("lft", "EQUALS", "rgt")), JoinRelType.INNER, null);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_partitions", 4);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_rows_in_batch", 64);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.max_batches_in_memory", 8);
    // Put some duplicate values
    List<String> leftTable = Lists.newArrayList("[{\"lft\": 0, \"a\" : \"a string\"}]", "[{\"lft\": 0, \"a\" : \"a different string\"},{\"lft\": 0, \"a\" : \"yet another\"}]");
    List<String> rightTable = Lists.newArrayList("[{\"rgt\": 0, \"b\" : \"a string\"}]", "[{\"rgt\": 0, \"b\" : \"a different string\"},{\"rgt\": 0, \"b\" : \"yet another\"}]");
    int numRows = 2_500;
    for (int cnt = 1; cnt <= numRows; cnt++) {
        leftTable.add("[{\"lft\": " + cnt + ", \"a\" : \"a string\"}]");
        rightTable.add("[{\"rgt\": " + cnt + ", \"b\" : \"a string\"}]");
    }
    legacyOpTestBuilder().physicalOperator(joinConf).inputDataStreamsJson(Lists.newArrayList(leftTable, rightTable)).baselineColumns("lft", "a", "b", "rgt").expectedTotalRows(numRows + 9).go();
}
Also used : HashJoinPOP(org.apache.drill.exec.physical.config.HashJoinPOP) OperatorTest(org.apache.drill.categories.OperatorTest) SlowTest(org.apache.drill.categories.SlowTest) Test(org.junit.Test)
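
A rough way to see why the options in the test above are expected to force a spill is to compare the number of batches the input needs with the configured in-memory limit. The sketch below is back-of-the-envelope arithmetic only: it assumes rows are packed into full batches and spread evenly across partitions, which is a simplification and not Drill's actual memory accounting. The class SpillEstimate and its method are hypothetical names.

```java
// Hypothetical estimate, not Drill code: compares the batches needed for
// one side of the join against the in-memory batch limit set in the test.
class SpillEstimate {

    // Ceiling division: number of batches needed to hold the given rows.
    static int batchesNeeded(int rows, int rowsPerBatch) {
        return (rows + rowsPerBatch - 1) / rowsPerBatch;
    }

    public static void main(String[] args) {
        int rows = 2_500 + 3;        // numRows plus the three duplicate key-0 rows
        int rowsPerBatch = 64;       // exec.hashjoin.num_rows_in_batch
        int maxBatchesInMemory = 8;  // exec.hashjoin.max_batches_in_memory
        int numPartitions = 4;       // exec.hashjoin.num_partitions

        int totalBatches = batchesNeeded(rows, rowsPerBatch);
        // Assumes an even spread of batches over the partitions.
        int batchesPerPartition = batchesNeeded(totalBatches, numPartitions);

        System.out.println("total batches: " + totalBatches);
        System.out.println("batches per partition: " + batchesPerPartition);
        System.out.println("exceeds in-memory limit: " + (totalBatches > maxBatchesInMemory));
    }
}
```

Under these assumptions one side needs far more batches than the limit allows, so it cannot be held in memory at once.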

Example 23 with HashJoinPOP

Use of org.apache.drill.exec.physical.config.HashJoinPOP in the Apache Drill project.

From class TestHashJoinSpill, method testRightOuterHashJoinSpill.

@SuppressWarnings("unchecked")
@Test
public void testRightOuterHashJoinSpill() {
    HashJoinPOP joinConf = new HashJoinPOP(null, null, Lists.newArrayList(joinCond("lft", "EQUALS", "rgt")), JoinRelType.RIGHT, null);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_partitions", 4);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.num_rows_in_batch", 64);
    operatorFixture.getOptionManager().setLocalOption("exec.hashjoin.max_batches_in_memory", 8);
    // Put some duplicate values
    List<String> leftTable = Lists.newArrayList("[{\"lft\": 0, \"a\" : \"a string\"}]", "[{\"lft\": 0, \"a\" : \"a different string\"},{\"lft\": 0, \"a\" : \"yet another\"}]");
    List<String> rightTable = Lists.newArrayList("[{\"rgt\": 0, \"b\" : \"a string\"}]", "[{\"rgt\": 0, \"b\" : \"a different string\"},{\"rgt\": 0, \"b\" : \"yet another\"}]");
    int numRows = 8_000;
    for (int cnt = 1; cnt <= numRows; cnt++) {
        // leftTable.add("[{\"lft\": " + cnt + ", \"a\" : \"a string\"}]");
        rightTable.add("[{\"rgt\": " + cnt + ", \"b\" : \"a string\"}]");
    }
    legacyOpTestBuilder().physicalOperator(joinConf).inputDataStreamsJson(Lists.newArrayList(leftTable, rightTable)).baselineColumns("lft", "a", "b", "rgt").expectedTotalRows(numRows + 9).go();
}
Also used : HashJoinPOP(org.apache.drill.exec.physical.config.HashJoinPOP) OperatorTest(org.apache.drill.categories.OperatorTest) SlowTest(org.apache.drill.categories.SlowTest) Test(org.junit.Test)

Example 24 with HashJoinPOP

Use of org.apache.drill.exec.physical.config.HashJoinPOP in the Apache Drill project.

From class TestHashJoinOutcome, method testHashJoinOutcomes.

/**
 * Runs the hash join where one side's second container is uninitialized.
 * @param uninitializedSide which side (right or left) has the uninitialized container
 * @param specialOutcome the outcome returned with the uninitialized container
 * @param expectedOutcome the outcome the hash join is expected to return
 */
private void testHashJoinOutcomes(UninitializedSide uninitializedSide, RecordBatch.IterOutcome specialOutcome, RecordBatch.IterOutcome expectedOutcome) {
    inputOutcomesLeft.add(RecordBatch.IterOutcome.OK_NEW_SCHEMA);
    inputOutcomesLeft.add(uninitializedSide.isRight ? RecordBatch.IterOutcome.OK : specialOutcome);
    inputOutcomesRight.add(RecordBatch.IterOutcome.OK_NEW_SCHEMA);
    inputOutcomesRight.add(uninitializedSide.isRight ? specialOutcome : RecordBatch.IterOutcome.OK);
    final MockRecordBatch mockInputBatchRight = new MockRecordBatch(operatorFixture.getFragmentContext(), opContext, uninitializedSide.isRight ? uninitialized2ndInputContainersRight : inputContainerRight, inputOutcomesRight, batchSchemaRight);
    final MockRecordBatch mockInputBatchLeft = new MockRecordBatch(operatorFixture.getFragmentContext(), opContext, uninitializedSide.isRight ? inputContainerLeft : uninitialized2ndInputContainersLeft, inputOutcomesLeft, batchSchemaLeft);
    List<JoinCondition> conditions = Lists.newArrayList();
    conditions.add(new JoinCondition(SqlKind.EQUALS.toString(), FieldReference.getWithQuotedRef("leftcol"), FieldReference.getWithQuotedRef("rightcol")));
    HashJoinPOP hjConf = new HashJoinPOP(null, null, conditions, JoinRelType.INNER);
    HashJoinBatch hjBatch = new HashJoinBatch(hjConf, operatorFixture.getFragmentContext(), mockInputBatchLeft, mockInputBatchRight);
    RecordBatch.IterOutcome gotOutcome = hjBatch.next();
    // JUnit's assertSame takes the expected value first, then the actual one
    assertSame(RecordBatch.IterOutcome.OK_NEW_SCHEMA, gotOutcome);
    gotOutcome = hjBatch.next();
    // verify returned outcome
    assertSame(expectedOutcome, gotOutcome);
}
Also used : MockRecordBatch(org.apache.drill.exec.physical.impl.MockRecordBatch) RecordBatch(org.apache.drill.exec.record.RecordBatch) HashJoinPOP(org.apache.drill.exec.physical.config.HashJoinPOP) JoinCondition(org.apache.drill.common.logical.data.JoinCondition)

Example 25 with HashJoinPOP

Use of org.apache.drill.exec.physical.config.HashJoinPOP in the Apache Drill project.

From class TestNullInputMiniPlan, method testHashJoinEmptyBoth.

@Test
public void testHashJoinEmptyBoth() throws Exception {
    final PhysicalOperator join = new HashJoinPOP(null, null, Lists.newArrayList(joinCond("a", "EQUALS", "b")), JoinRelType.INNER, null);
    testTwoInputNullBatchHandling(join);
}
Also used : PhysicalOperator(org.apache.drill.exec.physical.base.PhysicalOperator) HashJoinPOP(org.apache.drill.exec.physical.config.HashJoinPOP) Test(org.junit.Test)

Aggregations

HashJoinPOP (org.apache.drill.exec.physical.config.HashJoinPOP) 34
Test (org.junit.Test) 30
PhysicalOperator (org.apache.drill.exec.physical.base.PhysicalOperator) 11
RecordBatch (org.apache.drill.exec.record.RecordBatch) 10
BatchSchema (org.apache.drill.exec.record.BatchSchema) 8
OperatorTest (org.apache.drill.categories.OperatorTest) 6
LegacyOperatorTestBuilder (org.apache.drill.test.LegacyOperatorTestBuilder) 6
SlowTest (org.apache.drill.categories.SlowTest) 5
JoinCondition (org.apache.drill.common.logical.data.JoinCondition) 5
BatchSchemaBuilder (org.apache.drill.exec.record.BatchSchemaBuilder) 4
SchemaBuilder (org.apache.drill.exec.record.metadata.SchemaBuilder) 4
SchemaBuilder (org.apache.drill.test.rowSet.schema.SchemaBuilder) 4
ArrayList (java.util.ArrayList) 3
JoinRelType (org.apache.calcite.rel.core.JoinRelType) 3
RuntimeFilterDef (org.apache.drill.exec.work.filter.RuntimeFilterDef) 3
MockRecordBatch (org.apache.drill.exec.physical.impl.MockRecordBatch) 2
BloomFilterDef (org.apache.drill.exec.work.filter.BloomFilterDef) 2
RowSet (org.apache.drill.exec.physical.rowSet.RowSet) 1
VectorContainer (org.apache.drill.exec.record.VectorContainer) 1