Example 1 with OrcList

use of org.apache.orc.mapred.OrcList in project druid by druid-io.

the class OrcStructConverterTest method testConvertRootFieldWithListOfNonNullPrimitivesReturningValuesAsTheyAre.

@Test
public void testConvertRootFieldWithListOfNonNullPrimitivesReturningValuesAsTheyAre() {
    final TypeDescription listType = TypeDescription.createList(TypeDescription.createInt());
    final OrcList<IntWritable> orcList = new OrcList<>(listType);
    orcList.addAll(IntStream.range(0, 3).mapToObj(i -> new IntWritable(i * 10)).collect(Collectors.toList()));
    final List<Integer> expectedResult = orcList.stream().map(IntWritable::get).collect(Collectors.toList());
    final OrcStructConverter converter = new OrcStructConverter(false);
    assertConversion(converter, listType, expectedResult, orcList);
}
Also used : OrcList(org.apache.orc.mapred.OrcList) TypeDescription(org.apache.orc.TypeDescription) IntWritable(org.apache.hadoop.io.IntWritable) Test(org.junit.Test)

Example 2 with OrcList

use of org.apache.orc.mapred.OrcList in project druid by druid-io.

the class OrcStructConverter method convertField.

/**
 * Convert an ORC struct field as though it were a map, by fieldIndex. Complex types will be transformed
 * into Java lists and maps when possible ({@link OrcStructConverter#convertList} and
 * {@link OrcStructConverter#convertMap}), and primitive types will be extracted into an
 * ingestion-friendly state (e.g. 'int' and 'long'). Finally, if a field is not present (negative
 * fieldIndex) or its value is null, this method will return null.
 *
 * Note: "Union" types are not currently supported and will be returned as null.
 */
@Nullable
Object convertField(OrcStruct struct, int fieldIndex) {
    if (fieldIndex < 0) {
        return null;
    }
    TypeDescription schema = struct.getSchema();
    TypeDescription fieldDescription = schema.getChildren().get(fieldIndex);
    WritableComparable fieldValue = struct.getFieldValue(fieldIndex);
    if (fieldValue == null) {
        return null;
    }
    if (fieldDescription.getCategory().isPrimitive()) {
        return convertPrimitive(fieldDescription, fieldValue, binaryAsString);
    } else {
        /*
          ORC TYPE    WRITABLE TYPE
          array       org.apache.orc.mapred.OrcList
          map         org.apache.orc.mapred.OrcMap
          struct      org.apache.orc.mapred.OrcStruct
          uniontype   org.apache.orc.mapred.OrcUnion
         */
        switch(fieldDescription.getCategory()) {
            case LIST:
                OrcList orcList = (OrcList) fieldValue;
                return convertList(fieldDescription, orcList, binaryAsString);
            case MAP:
                OrcMap map = (OrcMap) fieldValue;
                return convertMap(fieldDescription, map, binaryAsString);
            case STRUCT:
                OrcStruct structMap = (OrcStruct) fieldValue;
                return convertStructToMap(structMap);
            case UNION:
                // Union types are not supported; fall through and return null.
            default:
                return null;
        }
    }
}
Also used : OrcStruct(org.apache.orc.mapred.OrcStruct) WritableComparable(org.apache.hadoop.io.WritableComparable) OrcList(org.apache.orc.mapred.OrcList) TypeDescription(org.apache.orc.TypeDescription) OrcMap(org.apache.orc.mapred.OrcMap) Nullable(javax.annotation.Nullable)
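
The category dispatch in convertField can be tried out without ORC or Druid on the classpath. The following is a minimal sketch under stated assumptions: ConvertFieldSketch, its Category enum and its Field class are hypothetical stand-ins invented for illustration, not the real org.apache.orc or Druid types, and primitives are passed through unchanged instead of going through convertPrimitive.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only: none of these types come from ORC or Druid.
final class ConvertFieldSketch {
    // Loosely mirrors the categories handled by the switch in convertField.
    enum Category { INT, STRING, LIST, MAP, STRUCT, UNION }

    // A value tagged with its category, standing in for a writable plus its type description.
    static final class Field {
        final Category category;
        final Object value;

        Field(Category category, Object value) {
            this.category = category;
            this.value = value;
        }
    }

    // Recursively turns a tagged value into plain Java objects; returns null for
    // missing values and for the unsupported UNION category.
    @SuppressWarnings("unchecked")
    static Object convert(Field field) {
        if (field == null || field.value == null) {
            return null;
        }
        switch (field.category) {
            case INT:
            case STRING:
                // Primitives are returned as they are in this sketch.
                return field.value;
            case LIST: {
                List<Object> out = new ArrayList<>();
                for (Field element : (List<Field>) field.value) {
                    out.add(convert(element)); // null elements stay null
                }
                return out;
            }
            case MAP:
            case STRUCT: {
                Map<String, Object> out = new LinkedHashMap<>();
                for (Map.Entry<String, Field> e : ((Map<String, Field>) field.value).entrySet()) {
                    out.put(e.getKey(), convert(e.getValue()));
                }
                return out;
            }
            case UNION:
            default:
                return null;
        }
    }
}
```

In this sketch, as in the method above, an unsupported category and a missing value both yield null, so a caller cannot tell the two cases apart from the return value alone.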

Example 3 with OrcList

use of org.apache.orc.mapred.OrcList in project druid by druid-io.

the class OrcStructConverterTest method testConvertRootFieldWithListOfNullsReturningListOfNulls.

@Test
public void testConvertRootFieldWithListOfNullsReturningListOfNulls() {
    final TypeDescription listType = TypeDescription.createList(TypeDescription.createInt());
    final OrcList<IntWritable> orcList = new OrcList<>(listType);
    IntStream.range(0, 3).forEach(i -> orcList.add(null));
    final List<Integer> expectedResult = new ArrayList<>();
    IntStream.range(0, 3).forEach(i -> expectedResult.add(null));
    final OrcStructConverter converter = new OrcStructConverter(false);
    assertConversion(converter, listType, expectedResult, orcList);
}
Also used : OrcList(org.apache.orc.mapred.OrcList) ArrayList(java.util.ArrayList) TypeDescription(org.apache.orc.TypeDescription) IntWritable(org.apache.hadoop.io.IntWritable) Test(org.junit.Test)
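
This test builds its expected list by adding nulls directly, because mapping each element through a getter such as IntWritable::get would throw a NullPointerException on a null element. The sketch below shows one way a null-preserving unwrap could look. NullSafeUnwrapSketch, IntBox and unwrapAll are hypothetical names introduced here for illustration; IntBox merely stands in for a Hadoop writable, and this is not how OrcStructConverter is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Druid or ORC.
final class NullSafeUnwrapSketch {
    // Stand-in for a wrapper type such as IntWritable.
    static final class IntBox {
        private final int value;

        IntBox(int value) {
            this.value = value;
        }

        int get() {
            return value;
        }
    }

    // Applies the getter to every non-null element and keeps nulls in place,
    // so the output has the same size and ordering as the input.
    static <W, T> List<T> unwrapAll(List<W> wrapped, Function<W, T> getter) {
        List<T> out = new ArrayList<>(wrapped.size());
        for (W element : wrapped) {
            out.add(element == null ? null : getter.apply(element));
        }
        return out;
    }
}
```

Calling NullSafeUnwrapSketch.unwrapAll(boxes, NullSafeUnwrapSketch.IntBox::get) on a list that mixes IntBox instances and nulls returns a List<Integer> with the nulls left in their original positions.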

Aggregations

TypeDescription (org.apache.orc.TypeDescription): 3
OrcList (org.apache.orc.mapred.OrcList): 3
IntWritable (org.apache.hadoop.io.IntWritable): 2
Test (org.junit.Test): 2
ArrayList (java.util.ArrayList): 1
Nullable (javax.annotation.Nullable): 1
WritableComparable (org.apache.hadoop.io.WritableComparable): 1
OrcMap (org.apache.orc.mapred.OrcMap): 1
OrcStruct (org.apache.orc.mapred.OrcStruct): 1