Examples with ColumnVector - org.apache.hadoop.hive.ql.exec.vector.ColumnVector

Example 1 with ColumnVector

use of org.apache.hadoop.hive.ql.exec.vector.ColumnVector in project hive by apache.

the class FakeVectorRowBatchFromObjectIterables method produceNextBatch.

@Override
public VectorizedRowBatch produceNextBatch() {
    batch.size = 0;
    batch.selectedInUse = false;
    for (int i = 0; i < types.length; ++i) {
        ColumnVector col = batch.cols[i];
        col.noNulls = true;
        col.isRepeating = false;
    }
    while (!eof && batch.size < this.batchSize) {
        int r = batch.size;
        for (int i = 0; i < types.length; ++i) {
            Iterator<Object> it = iterators.get(i);
            if (!it.hasNext()) {
                eof = true;
                break;
            }
            Object value = it.next();
            if (null == value) {
                batch.cols[i].isNull[batch.size] = true;
                batch.cols[i].noNulls = false;
            } else {
                // Must reset the isNull, could be set from prev batch use
                batch.cols[i].isNull[batch.size] = false;
                columnAssign[i].assign(batch.cols[i], batch.size, value);
            }
        }
        if (!eof) {
            batch.size += 1;
        }
    }
    return batch;
}

Also used : TimestampColumnVector(org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector) DecimalColumnVector(org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector) BytesColumnVector(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) ColumnVector(org.apache.hadoop.hive.ql.exec.vector.ColumnVector) DoubleColumnVector(org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector)

Example 2 with ColumnVector

use of org.apache.hadoop.hive.ql.exec.vector.ColumnVector in project hive by apache.

the class TestVectorGenericDateExpressions method testDateAddColCol.

private void testDateAddColCol(VectorExpression.Type colType1, boolean isPositive) {
    LongColumnVector date1 = newRandomLongColumnVector(10000, size);
    LongColumnVector days2 = newRandomLongColumnVector(1000, size);
    ColumnVector col1 = castTo(date1, colType1);
    LongColumnVector output = new LongColumnVector(size);
    VectorizedRowBatch batch = new VectorizedRowBatch(3, size);
    batch.cols[0] = col1;
    batch.cols[1] = days2;
    batch.cols[2] = output;
    validateDateAdd(batch, date1, days2, colType1, isPositive);
    TestVectorizedRowBatch.addRandomNulls(date1);
    batch.cols[0] = castTo(date1, colType1);
    validateDateAdd(batch, date1, days2, colType1, isPositive);
    TestVectorizedRowBatch.addRandomNulls(days2);
    batch.cols[1] = days2;
    validateDateAdd(batch, date1, days2, colType1, isPositive);
}

Also used : VectorizedRowBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch) TestVectorizedRowBatch(org.apache.hadoop.hive.ql.exec.vector.TestVectorizedRowBatch) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) TimestampColumnVector(org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector) BytesColumnVector(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) ColumnVector(org.apache.hadoop.hive.ql.exec.vector.ColumnVector)

Example 3 with ColumnVector

use of org.apache.hadoop.hive.ql.exec.vector.ColumnVector in project hive by apache.

the class TestVectorLogicalExpressions method testFilterExprOrExprWithSelectInUse.

@Test
public void testFilterExprOrExprWithSelectInUse() {
    VectorizedRowBatch batch1 = getBatchThreeBooleanCols();
    SelectColumnIsTrue expr1 = new SelectColumnIsTrue(0);
    SelectColumnIsFalse expr2 = new SelectColumnIsFalse(1);
    FilterExprOrExpr orExpr = new FilterExprOrExpr();
    orExpr.setChildExpressions(new VectorExpression[] { expr1, expr2 });
    // Evaluate batch1 so that temporary arrays in the expression
    // have residual values to interfere in later computation
    orExpr.evaluate(batch1);
    // Swap column vectors, but keep selected vector unchanged
    ColumnVector tmp = batch1.cols[0];
    batch1.cols[0] = batch1.cols[1];
    batch1.cols[1] = tmp;
    // Make sure row-7 is in the output.
    batch1.cols[1].isNull[7] = false;
    ((LongColumnVector) batch1.cols[1]).vector[7] = 0;
    orExpr.evaluate(batch1);
    assertEquals(3, batch1.size);
    assertEquals(0, batch1.selected[0]);
    assertEquals(3, batch1.selected[1]);
    assertEquals(7, batch1.selected[2]);
}

Also used : VectorizedRowBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) ColumnVector(org.apache.hadoop.hive.ql.exec.vector.ColumnVector) DoubleColumnVector(org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) Test(org.junit.Test)

Example 4 with ColumnVector

use of org.apache.hadoop.hive.ql.exec.vector.ColumnVector in project hive by apache.

the class TestVectorGenericDateExpressions method testDateAddColScalar.

private void testDateAddColScalar(VectorExpression.Type colType1, boolean isPositive) {
    LongColumnVector date1 = newRandomLongColumnVector(10000, size);
    ColumnVector col1 = castTo(date1, colType1);
    long scalar2 = newRandom(1000);
    LongColumnVector output = new LongColumnVector(size);
    VectorizedRowBatch batch = new VectorizedRowBatch(2, size);
    batch.cols[0] = col1;
    batch.cols[1] = output;
    validateDateAdd(batch, colType1, scalar2, isPositive, date1);
    TestVectorizedRowBatch.addRandomNulls(batch.cols[0]);
    validateDateAdd(batch, colType1, scalar2, isPositive, date1);
}

Example 5 with ColumnVector

use of org.apache.hadoop.hive.ql.exec.vector.ColumnVector in project hive by apache.

the class TestVectorGenericDateExpressions method testDateDiffScalarCol.

@Test
public void testDateDiffScalarCol() {
    for (VectorExpression.Type scalarType1 : dateTimestampStringTypes) {
        for (VectorExpression.Type colType2 : dateTimestampStringTypes) {
            LongColumnVector date2 = newRandomLongColumnVector(10000, size);
            LongColumnVector output = new LongColumnVector(size);
            ColumnVector col2 = castTo(date2, colType2);
            VectorizedRowBatch batch = new VectorizedRowBatch(2, size);
            batch.cols[0] = col2;
            batch.cols[1] = output;
            long scalar1 = newRandom(1000);
            validateDateDiff(batch, scalar1, scalarType1, colType2, date2);
            TestVectorizedRowBatch.addRandomNulls(date2);
            batch.cols[0] = castTo(date2, colType2);
            validateDateDiff(batch, scalar1, scalarType1, colType2, date2);
        }
    }
    VectorExpression udf;
    byte[] bytes = "error".getBytes(utf8);
    VectorizedRowBatch batch = new VectorizedRowBatch(2, 1);
    udf = new VectorUDFDateDiffScalarCol(new Timestamp(0), 0, 1);
    udf.setInputTypes(VectorExpression.Type.TIMESTAMP, VectorExpression.Type.STRING);
    batch.cols[0] = new BytesColumnVector(1);
    batch.cols[1] = new LongColumnVector(1);
    BytesColumnVector bcv = (BytesColumnVector) batch.cols[0];
    bcv.vector[0] = bytes;
    bcv.start[0] = 0;
    bcv.length[0] = bytes.length;
    udf.evaluate(batch);
    Assert.assertEquals(batch.cols[1].isNull[0], true);
    udf = new VectorUDFDateDiffScalarCol(bytes, 0, 1);
    udf.setInputTypes(VectorExpression.Type.STRING, VectorExpression.Type.TIMESTAMP);
    batch.cols[0] = new LongColumnVector(1);
    batch.cols[1] = new LongColumnVector(1);
    udf.evaluate(batch);
    Assert.assertEquals(batch.cols[1].isNull[0], true);
}

Also used : VectorizedRowBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch) TestVectorizedRowBatch(org.apache.hadoop.hive.ql.exec.vector.TestVectorizedRowBatch) BytesColumnVector(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector) Timestamp(java.sql.Timestamp) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) TimestampColumnVector(org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector) BytesColumnVector(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector) LongColumnVector(org.apache.hadoop.hive.ql.exec.vector.LongColumnVector) ColumnVector(org.apache.hadoop.hive.ql.exec.vector.ColumnVector) Test(org.junit.Test)

Aggregations

ColumnVector (org.apache.hadoop.hive.ql.exec.vector.ColumnVector)43 LongColumnVector (org.apache.hadoop.hive.ql.exec.vector.LongColumnVector)24 BytesColumnVector (org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector)19 TimestampColumnVector (org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector)14 DoubleColumnVector (org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector)11 VectorizedRowBatch (org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch)9 DecimalColumnVector (org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector)8 HiveException (org.apache.hadoop.hive.ql.metadata.HiveException)4 TestVectorizedRowBatch (org.apache.hadoop.hive.ql.exec.vector.TestVectorizedRowBatch)3 Output (org.apache.hadoop.hive.serde2.ByteStream.Output)3 BinarySortableSerializeWrite (org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableSerializeWrite)3 Test (org.junit.Test)3 ParseException (java.text.ParseException)2 IOException (java.io.IOException)1 Timestamp (java.sql.Timestamp)1 ArrayList (java.util.ArrayList)1 ColumnStreamData (org.apache.hadoop.hive.common.io.encoded.EncodedColumnBatch.ColumnStreamData)1 LlapDataBuffer (org.apache.hadoop.hive.llap.cache.LlapDataBuffer)1 SerDeStripeMetadata (org.apache.hadoop.hive.llap.io.decode.GenericColumnVectorProducer.SerDeStripeMetadata)1 JoinUtil (org.apache.hadoop.hive.ql.exec.JoinUtil)1