Example 11 with ResultSetLoader

use of org.apache.drill.exec.physical.rowSet.ResultSetLoader in project drill by axbaretto.

the class TestResultSetLoaderOverflow method testCloseWithOverflow.

/**
 * Load a batch to overflow. Then, close the loader with the overflow
 * batch unharvested. The loader should release the memory allocated
 * to the unused overflow vectors.
 */
@Test
public void testCloseWithOverflow() {
    TupleMetadata schema = new SchemaBuilder().add("s", MinorType.VARCHAR).buildSchema();
    ResultSetOptions options = new OptionBuilder().setRowCountLimit(ValueVector.MAX_ROW_COUNT).setSchema(schema).build();
    ResultSetLoader rsLoader = new ResultSetLoaderImpl(fixture.allocator(), options);
    RowSetLoader rootWriter = rsLoader.writer();
    rsLoader.startBatch();
    byte[] value = new byte[512];
    Arrays.fill(value, (byte) 'X');
    int count = 0;
    while (!rootWriter.isFull()) {
        rootWriter.start();
        rootWriter.scalar(0).setBytes(value, value.length);
        rootWriter.save();
        count++;
    }
    assertTrue(count < ValueVector.MAX_ROW_COUNT);
    // Harvest the full batch
    RowSet result = fixture.wrap(rsLoader.harvest());
    result.clear();
    // Close without harvesting the overflow batch.
    rsLoader.close();
}
Also used : ResultSetLoader(org.apache.drill.exec.physical.rowSet.ResultSetLoader) TupleMetadata(org.apache.drill.exec.record.metadata.TupleMetadata) SchemaBuilder(org.apache.drill.test.rowSet.schema.SchemaBuilder) RowSet(org.apache.drill.test.rowSet.RowSet) RowSetLoader(org.apache.drill.exec.physical.rowSet.RowSetLoader) ResultSetOptions(org.apache.drill.exec.physical.rowSet.impl.ResultSetLoaderImpl.ResultSetOptions) SubOperatorTest(org.apache.drill.test.SubOperatorTest) Test(org.junit.Test)
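
The overflow behaviour described in the comment above can be pictured with a toy model. This is only a sketch in plain Java, not Drill code: `ToyOverflowLoader`, its byte budget, and `pendingRows()` are hypothetical names invented for illustration, and the real loader works on value vectors rather than lists. The row that does not fit is parked in a look-ahead batch, `isFull()` turns true, and `close()` drops the look-ahead row even though nobody harvested it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a batch loader with a byte budget; not Drill code. */
class ToyOverflowLoader {
    private final int capacityBytes;
    private List<byte[]> current = new ArrayList<>();
    private List<byte[]> overflow = new ArrayList<>();
    private int usedBytes;
    private boolean full;

    ToyOverflowLoader(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    boolean isFull() {
        return full;
    }

    /** Saves a row; a row that does not fit goes to the look-ahead batch. */
    void writeRow(byte[] value) {
        if (full) {
            throw new IllegalStateException("batch is full; harvest first");
        }
        if (usedBytes + value.length > capacityBytes) {
            overflow.add(value);
            full = true;
        } else {
            current.add(value);
            usedBytes += value.length;
        }
    }

    /** Hands back the full batch; the overflow row starts the next batch. */
    List<byte[]> harvest() {
        List<byte[]> result = current;
        current = overflow;
        overflow = new ArrayList<>();
        usedBytes = 0;
        for (byte[] row : current) {
            usedBytes += row.length;
        }
        full = false;
        return result;
    }

    /** Rows still held by the loader that no caller has harvested. */
    int pendingRows() {
        return current.size() + overflow.size();
    }

    /** Drops everything still held, including an unharvested overflow row. */
    void close() {
        current.clear();
        overflow.clear();
        usedBytes = 0;
        full = false;
    }
}
```

As in the test above, the write loop counts one more row than the harvested batch contains, because the last row written is the one that overflowed.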

Example 12 with ResultSetLoader

use of org.apache.drill.exec.physical.rowSet.ResultSetLoader in project drill by axbaretto.

the class TestResultSetLoaderProjection method testProjectionDynamic.

@Test
public void testProjectionDynamic() {
    List<SchemaPath> selection = Lists.newArrayList(SchemaPath.getSimplePath("c"), SchemaPath.getSimplePath("b"), SchemaPath.getSimplePath("e"));
    ResultSetOptions options = new OptionBuilder().setProjection(selection).build();
    ResultSetLoader rsLoader = new ResultSetLoaderImpl(fixture.allocator(), options);
    RowSetLoader rootWriter = rsLoader.writer();
    rootWriter.addColumn(SchemaBuilder.columnSchema("a", MinorType.INT, DataMode.REQUIRED));
    rootWriter.addColumn(SchemaBuilder.columnSchema("b", MinorType.INT, DataMode.REQUIRED));
    rootWriter.addColumn(SchemaBuilder.columnSchema("c", MinorType.INT, DataMode.REQUIRED));
    rootWriter.addColumn(SchemaBuilder.columnSchema("d", MinorType.INT, DataMode.REQUIRED));
    doProjectionTest(rsLoader);
}
Also used : ResultSetLoader(org.apache.drill.exec.physical.rowSet.ResultSetLoader) SchemaPath(org.apache.drill.common.expression.SchemaPath) RowSetLoader(org.apache.drill.exec.physical.rowSet.RowSetLoader) ResultSetOptions(org.apache.drill.exec.physical.rowSet.impl.ResultSetLoaderImpl.ResultSetOptions) SubOperatorTest(org.apache.drill.test.SubOperatorTest) Test(org.junit.Test)
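
The helper `doProjectionTest` is not shown on this page, so what the loader does with the unprojected columns `a` and `d` cannot be read off the example. As a rough sketch under the assumption that only columns named in the projection list are kept, and that they keep the order in which they were added, the selection logic looks like this. `ToyProjection` is a hypothetical class written for illustration, with exact (case-sensitive) name matching for simplicity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy projection filter; an illustration only, not Drill code. */
class ToyProjection {
    /**
     * Returns the added columns that the projection list names,
     * in the order the columns were added.
     */
    static List<String> projected(List<String> selection, List<String> added) {
        Set<String> wanted = new HashSet<>(selection);
        List<String> result = new ArrayList<>();
        for (String column : added) {
            if (wanted.contains(column)) {
                result.add(column);
            }
        }
        return result;
    }
}
```

With the selection `c, b, e` and the added columns `a, b, c, d` from the test, this sketch keeps `b` and `c`; `e` is requested but never added, so it does not appear.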

Example 13 with ResultSetLoader

use of org.apache.drill.exec.physical.rowSet.ResultSetLoader in project drill by axbaretto.

the class TestResultSetLoaderProtocol method testOverwriteRow.

/**
 * The writer protocol allows a client to write to a row any number of times
 * before invoking <tt>save()</tt>. In this case, each new value simply
 * overwrites the previous value. Here, we test the most basic case: a simple,
 * flat tuple with no arrays. We use a very large Varchar that would, if
 * overwrite were not working, cause vector overflow.
 * <p>
 * The ability to overwrite rows is seldom needed except in one future use
 * case: writing a row, then applying a filter "in-place" to discard unwanted
 * rows, without having to send the row downstream.
 * <p>
 * Because of this use case, specific rules apply when discarding rows or
 * overwriting values.
 * <ul>
 * <li>Values can be written once per row. Fixed-width columns actually allow
 * multiple writes. But, because of the way variable-width columns work,
 * multiple writes will cause undefined results.</li>
 * <li>To overwrite a row, call <tt>start()</tt> without calling
 * <tt>save()</tt> on the previous row. Doing so ignores data for the
 * previous row and starts a new row in place of the old one.</li>
 * </ul>
 * Note that there is no explicit method to discard a row. Instead,
 * the rule is that a row is not saved until <tt>save()</tt> is called.
 */
@Test
public void testOverwriteRow() {
    TupleMetadata schema = new SchemaBuilder().add("a", MinorType.INT).add("b", MinorType.VARCHAR).buildSchema();
    ResultSetLoaderImpl.ResultSetOptions options = new OptionBuilder().setSchema(schema).setRowCountLimit(ValueVector.MAX_ROW_COUNT).build();
    ResultSetLoader rsLoader = new ResultSetLoaderImpl(fixture.allocator(), options);
    RowSetLoader rootWriter = rsLoader.writer();
    // Can't use the shortcut to populate rows when doing overwrites.
    ScalarWriter aWriter = rootWriter.scalar("a");
    ScalarWriter bWriter = rootWriter.scalar("b");
    // Write 100,000 rows, overwriting 99% of them. This will cause vector
    // overflow and data corruption if overwrite does not work; but will happily
    // produce the correct result if everything works as it should.
    byte[] value = new byte[512];
    Arrays.fill(value, (byte) 'X');
    int count = 0;
    rsLoader.startBatch();
    while (count < 100_000) {
        rootWriter.start();
        count++;
        aWriter.setInt(count);
        bWriter.setBytes(value, value.length);
        if (count % 100 == 0) {
            rootWriter.save();
        }
    }
    // Verify using a reader.
    RowSet result = fixture.wrap(rsLoader.harvest());
    assertEquals(count / 100, result.rowCount());
    RowSetReader reader = result.reader();
    int rowId = 1;
    while (reader.next()) {
        assertEquals(rowId * 100, reader.scalar("a").getInt());
        assertTrue(Arrays.equals(value, reader.scalar("b").getBytes()));
        rowId++;
    }
    result.clear();
    rsLoader.close();
}
Also used : ResultSetLoader(org.apache.drill.exec.physical.rowSet.ResultSetLoader) TupleMetadata(org.apache.drill.exec.record.metadata.TupleMetadata) SchemaBuilder(org.apache.drill.test.rowSet.schema.SchemaBuilder) SingleRowSet(org.apache.drill.test.rowSet.RowSet.SingleRowSet) RowSet(org.apache.drill.test.rowSet.RowSet) RowSetLoader(org.apache.drill.exec.physical.rowSet.RowSetLoader) RowSetReader(org.apache.drill.test.rowSet.RowSetReader) ScalarWriter(org.apache.drill.exec.vector.accessor.ScalarWriter) SubOperatorTest(org.apache.drill.test.SubOperatorTest) Test(org.junit.Test)
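
The rule that a row exists only once `save()` is called can be shown with a small fixed-size writer. This is a sketch under simplifying assumptions, not the Drill implementation: `ToyOverwriteWriter` and `writeSparse` are hypothetical names, and a single `int` column stands in for the real vectors. The write position advances only in `save()`, so calling `start()` again simply reuses the slot of the unsaved row.

```java
/** Toy fixed-size writer showing start()/save() overwrite; not Drill code. */
class ToyOverwriteWriter {
    private final int[] values;
    private int rowCount;    // saved rows; also the current write position
    private boolean rowOpen;

    ToyOverwriteWriter(int capacity) {
        values = new int[capacity];
    }

    /** Opens a row at the current position, replacing any unsaved row. */
    void start() {
        rowOpen = true;
    }

    void setInt(int value) {
        if (!rowOpen) {
            throw new IllegalStateException("call start() first");
        }
        values[rowCount] = value;
    }

    /** Commits the open row by advancing the write position. */
    void save() {
        if (!rowOpen) {
            throw new IllegalStateException("no open row to save");
        }
        rowCount++;
        rowOpen = false;
    }

    int rowCount() {
        return rowCount;
    }

    int get(int row) {
        return values[row];
    }

    /** Writes 1..total but saves only every keepEvery-th row. */
    static ToyOverwriteWriter writeSparse(int total, int keepEvery) {
        // Storage is sized for the saved rows only (plus one open slot).
        ToyOverwriteWriter writer = new ToyOverwriteWriter(total / keepEvery + 1);
        for (int i = 1; i <= total; i++) {
            writer.start();
            writer.setInt(i);
            if (i % keepEvery == 0) {
                writer.save();
            }
        }
        return writer;
    }
}
```

Because the backing array holds only the saved rows, the loop would run out of space if discarded rows consumed storage, which mirrors how the test above relies on overflow not occurring.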

Example 14 with ResultSetLoader

use of org.apache.drill.exec.physical.rowSet.ResultSetLoader in project drill by axbaretto.

the class TestResultSetLoaderProtocol method testCaseInsensitiveSchema.

/**
 * Schemas are case-insensitive by default. Verify that
 * the schema mechanism works, with emphasis on
 * case-insensitive lookup.
 * <p>
 * The tests here and elsewhere build columns from a
 * <tt>MaterializedField</tt>. Doing so is rather old-school;
 * better to use the newer <tt>ColumnMetadata</tt> which provides
 * additional information. The code here simply uses the <tt>MaterializedField</tt>
 * to create a <tt>ColumnMetadata</tt> implicitly.
 */
@Test
public void testCaseInsensitiveSchema() {
    ResultSetLoader rsLoader = new ResultSetLoaderImpl(fixture.allocator());
    RowSetLoader rootWriter = rsLoader.writer();
    TupleMetadata schema = rootWriter.schema();
    assertEquals(0, rsLoader.schemaVersion());
    // No columns defined in schema
    assertNull(schema.metadata("a"));
    try {
        schema.column(0);
        fail();
    } catch (IndexOutOfBoundsException e) {
        // Expected
    }
    try {
        rootWriter.column("a");
        fail();
    } catch (UndefinedColumnException e) {
        // Expected
    }
    try {
        rootWriter.column(0);
        fail();
    } catch (IndexOutOfBoundsException e) {
        // Expected
    }
    // Define a column
    assertEquals(0, rsLoader.schemaVersion());
    MaterializedField colSchema = SchemaBuilder.columnSchema("a", MinorType.VARCHAR, DataMode.REQUIRED);
    rootWriter.addColumn(colSchema);
    assertEquals(1, rsLoader.schemaVersion());
    // Can now be found, case insensitive
    assertTrue(colSchema.isEquivalent(schema.column(0)));
    ColumnMetadata colMetadata = schema.metadata(0);
    assertSame(colMetadata, schema.metadata("a"));
    assertSame(colMetadata, schema.metadata("A"));
    assertNotNull(rootWriter.column(0));
    assertNotNull(rootWriter.column("a"));
    assertNotNull(rootWriter.column("A"));
    assertEquals(1, schema.size());
    assertEquals(0, schema.index("a"));
    assertEquals(0, schema.index("A"));
    try {
        rootWriter.addColumn(colSchema);
        fail();
    } catch (IllegalArgumentException e) {
        // Expected
    }
    try {
        MaterializedField testCol = SchemaBuilder.columnSchema("A", MinorType.VARCHAR, DataMode.REQUIRED);
        rootWriter.addColumn(testCol);
        fail();
    } catch (IllegalArgumentException e) {
        // Expected
        assertTrue(e.getMessage().contains("Duplicate"));
    }
    // Can still add required fields while writing the first row.
    rsLoader.startBatch();
    rootWriter.start();
    rootWriter.scalar(0).setString("foo");
    MaterializedField col2 = SchemaBuilder.columnSchema("b", MinorType.VARCHAR, DataMode.REQUIRED);
    rootWriter.addColumn(col2);
    assertTrue(col2.isEquivalent(schema.column(1)));
    ColumnMetadata col2Metadata = schema.metadata(1);
    assertSame(col2Metadata, schema.metadata("b"));
    assertSame(col2Metadata, schema.metadata("B"));
    assertEquals(2, schema.size());
    assertEquals(1, schema.index("b"));
    assertEquals(1, schema.index("B"));
    rootWriter.scalar(1).setString("second");
    // After the first row, an optional or repeated column can be added.
    // A required column is also allowed: its values will be back-filled.
    rootWriter.save();
    rootWriter.start();
    rootWriter.scalar(0).setString("bar");
    rootWriter.scalar(1).setString("");
    MaterializedField col3 = SchemaBuilder.columnSchema("c", MinorType.VARCHAR, DataMode.REQUIRED);
    rootWriter.addColumn(col3);
    assertTrue(col3.isEquivalent(schema.column(2)));
    ColumnMetadata col3Metadata = schema.metadata(2);
    assertSame(col3Metadata, schema.metadata("c"));
    assertSame(col3Metadata, schema.metadata("C"));
    assertEquals(3, schema.size());
    assertEquals(2, schema.index("c"));
    assertEquals(2, schema.index("C"));
    rootWriter.scalar("c").setString("c.2");
    MaterializedField col4 = SchemaBuilder.columnSchema("d", MinorType.VARCHAR, DataMode.OPTIONAL);
    rootWriter.addColumn(col4);
    assertTrue(col4.isEquivalent(schema.column(3)));
    ColumnMetadata col4Metadata = schema.metadata(3);
    assertSame(col4Metadata, schema.metadata("d"));
    assertSame(col4Metadata, schema.metadata("D"));
    assertEquals(4, schema.size());
    assertEquals(3, schema.index("d"));
    assertEquals(3, schema.index("D"));
    rootWriter.scalar("d").setString("d.2");
    MaterializedField col5 = SchemaBuilder.columnSchema("e", MinorType.VARCHAR, DataMode.REPEATED);
    rootWriter.addColumn(col5);
    assertTrue(col5.isEquivalent(schema.column(4)));
    ColumnMetadata col5Metadata = schema.metadata(4);
    assertSame(col5Metadata, schema.metadata("e"));
    assertSame(col5Metadata, schema.metadata("E"));
    assertEquals(5, schema.size());
    assertEquals(4, schema.index("e"));
    assertEquals(4, schema.index("E"));
    rootWriter.array(4).set("e1", "e2", "e3");
    rootWriter.save();
    // Verify. No reason to expect problems, but might as well check.
    RowSet result = fixture.wrap(rsLoader.harvest());
    assertEquals(5, rsLoader.schemaVersion());
    SingleRowSet expected = fixture.rowSetBuilder(result.batchSchema()).addRow("foo", "second", "", null, strArray()).addRow("bar", "", "c.2", "d.2", strArray("e1", "e2", "e3")).build();
    new RowSetComparison(expected).verifyAndClearAll(result);
    // Handy way to test that close works to abort an in-flight batch
    // and clean up.
    rsLoader.close();
}
Also used : ColumnMetadata(org.apache.drill.exec.record.metadata.ColumnMetadata) SingleRowSet(org.apache.drill.test.rowSet.RowSet.SingleRowSet) RowSetComparison(org.apache.drill.test.rowSet.RowSetComparison) ResultSetLoader(org.apache.drill.exec.physical.rowSet.ResultSetLoader) TupleMetadata(org.apache.drill.exec.record.metadata.TupleMetadata) RowSet(org.apache.drill.test.rowSet.RowSet) MaterializedField(org.apache.drill.exec.record.MaterializedField) RowSetLoader(org.apache.drill.exec.physical.rowSet.RowSetLoader) UndefinedColumnException(org.apache.drill.exec.vector.accessor.TupleWriter.UndefinedColumnException) SubOperatorTest(org.apache.drill.test.SubOperatorTest) Test(org.junit.Test)
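
The name handling that the test checks (lookups ignore case, and re-adding a name that differs only in case is rejected) can be sketched with a map keyed on a normalized name. `ToySchema` is a hypothetical class for illustration and says nothing about how `TupleMetadata` is actually implemented; returning -1 for an unknown name is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy case-insensitive column index; an illustration only, not Drill code. */
class ToySchema {
    private final List<String> names = new ArrayList<>();
    private final Map<String, Integer> indexByKey = new HashMap<>();

    /** Normalizes a column name so that "a" and "A" share one key. */
    private static String key(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    /** Adds a column and returns its index; rejects case-insensitive duplicates. */
    int add(String name) {
        String key = key(name);
        if (indexByKey.containsKey(key)) {
            throw new IllegalArgumentException("Duplicate column: " + name);
        }
        int index = names.size();
        names.add(name);
        indexByKey.put(key, index);
        return index;
    }

    /** Returns the column index, or -1 if no such column exists (sketch convention). */
    int index(String name) {
        Integer index = indexByKey.get(key(name));
        return index == null ? -1 : index;
    }

    /** Returns the name exactly as it was first added. */
    String name(int index) {
        return names.get(index);
    }

    int size() {
        return names.size();
    }
}
```

Keeping the original spelling in the list while indexing on the lower-cased key means the schema reports names as the caller wrote them but finds them regardless of case.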

Example 15 with ResultSetLoader

use of org.apache.drill.exec.physical.rowSet.ResultSetLoader in project drill by axbaretto.

the class TestResultSetLoaderProtocol method testInitialSchema.

/**
 * Provide a schema up front to the loader; schema is built before
 * the first row.
 * <p>
 * Also verifies the test-time method to set a row of values using
 * a single method.
 */
@Test
public void testInitialSchema() {
    TupleMetadata schema = new SchemaBuilder().add("a", MinorType.INT).addNullable("b", MinorType.INT).add("c", MinorType.VARCHAR).buildSchema();
    ResultSetLoaderImpl.ResultSetOptions options = new OptionBuilder().setSchema(schema).build();
    ResultSetLoader rsLoader = new ResultSetLoaderImpl(fixture.allocator(), options);
    RowSetLoader rootWriter = rsLoader.writer();
    rsLoader.startBatch();
    rootWriter.addRow(10, 100, "fred").addRow(20, null, "barney").addRow(30, 300, "wilma");
    RowSet actual = fixture.wrap(rsLoader.harvest());
    RowSet expected = fixture.rowSetBuilder(schema).addRow(10, 100, "fred").addRow(20, null, "barney").addRow(30, 300, "wilma").build();
    new RowSetComparison(expected).verifyAndClearAll(actual);
    rsLoader.close();
}
Also used : RowSetComparison(org.apache.drill.test.rowSet.RowSetComparison) ResultSetLoader(org.apache.drill.exec.physical.rowSet.ResultSetLoader) TupleMetadata(org.apache.drill.exec.record.metadata.TupleMetadata) SchemaBuilder(org.apache.drill.test.rowSet.schema.SchemaBuilder) SingleRowSet(org.apache.drill.test.rowSet.RowSet.SingleRowSet) RowSet(org.apache.drill.test.rowSet.RowSet) RowSetLoader(org.apache.drill.exec.physical.rowSet.RowSetLoader) SubOperatorTest(org.apache.drill.test.SubOperatorTest) Test(org.junit.Test)
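
The "single method" row shortcut used above pairs naturally with an up-front schema: each call supplies one value per column, and a null is acceptable only where the column is nullable. The following is a minimal sketch of that idea with hypothetical names (`ToyRowBatch`, its constructor arguments, and `get`); it does not reproduce the validation the real `addRow` performs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy row batch with a fixed schema and a chained addRow(); not Drill code. */
class ToyRowBatch {
    private final String[] names;
    private final boolean[] nullable;
    private final List<Object[]> rows = new ArrayList<>();

    ToyRowBatch(String[] names, boolean[] nullable) {
        if (names.length != nullable.length) {
            throw new IllegalArgumentException("names and nullable flags must align");
        }
        this.names = names.clone();
        this.nullable = nullable.clone();
    }

    /** Appends one row; returns this so that calls can be chained. */
    ToyRowBatch addRow(Object... values) {
        if (values.length != names.length) {
            throw new IllegalArgumentException("expected " + names.length + " values");
        }
        for (int i = 0; i < values.length; i++) {
            if (values[i] == null && !nullable[i]) {
                throw new IllegalArgumentException("column " + names[i] + " is not nullable");
            }
        }
        rows.add(values.clone());
        return this;
    }

    int rowCount() {
        return rows.size();
    }

    Object get(int row, int column) {
        return rows.get(row)[column];
    }
}
```

Here the schema from the test maps to the names `a`, `b`, `c` with only `b` nullable, so the row `(20, null, "barney")` is accepted while a null in `a` or `c` would be rejected.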

Aggregations

ResultSetLoader (org.apache.drill.exec.physical.rowSet.ResultSetLoader): 45
RowSetLoader (org.apache.drill.exec.physical.rowSet.RowSetLoader): 44
SubOperatorTest (org.apache.drill.test.SubOperatorTest): 44
Test (org.junit.Test): 44
TupleMetadata (org.apache.drill.exec.record.metadata.TupleMetadata): 38
SchemaBuilder (org.apache.drill.test.rowSet.schema.SchemaBuilder): 38
RowSet (org.apache.drill.test.rowSet.RowSet): 34
SingleRowSet (org.apache.drill.test.rowSet.RowSet.SingleRowSet): 28
RowSetComparison (org.apache.drill.test.rowSet.RowSetComparison): 17
ResultSetOptions (org.apache.drill.exec.physical.rowSet.impl.ResultSetLoaderImpl.ResultSetOptions): 16
TupleWriter (org.apache.drill.exec.vector.accessor.TupleWriter): 14
ScalarWriter (org.apache.drill.exec.vector.accessor.ScalarWriter): 13
RowSetReader (org.apache.drill.test.rowSet.RowSetReader): 12
BatchSchema (org.apache.drill.exec.record.BatchSchema): 6
SchemaPath (org.apache.drill.common.expression.SchemaPath): 5
TupleReader (org.apache.drill.exec.vector.accessor.TupleReader): 5
ArrayWriter (org.apache.drill.exec.vector.accessor.ArrayWriter): 4
ScalarElementReader (org.apache.drill.exec.vector.accessor.ScalarElementReader): 4
MaterializedField (org.apache.drill.exec.record.MaterializedField): 3
ArrayReader (org.apache.drill.exec.vector.accessor.ArrayReader): 3