Search in sources :

Example 1 with HiveDrillNativeParquetScan

use of org.apache.drill.exec.store.hive.HiveDrillNativeParquetScan in project drill by apache.

the class ConvertHiveParquetScanToDrillParquetScan method createNativeScanRel.

/**
   * Helper method which creates a DrillScalRel with native HiveScan.
   */
private DrillScanRel createNativeScanRel(final Map<String, String> partitionColMapping, final DrillScanRel hiveScanRel) throws Exception {
    final RelDataTypeFactory typeFactory = hiveScanRel.getCluster().getTypeFactory();
    final RelDataType varCharType = typeFactory.createSqlType(SqlTypeName.VARCHAR);
    final List<String> nativeScanColNames = Lists.newArrayList();
    final List<RelDataType> nativeScanColTypes = Lists.newArrayList();
    for (RelDataTypeField field : hiveScanRel.getRowType().getFieldList()) {
        final String dirColName = partitionColMapping.get(field.getName());
        if (dirColName != null) {
            // partition column
            nativeScanColNames.add(dirColName);
            nativeScanColTypes.add(varCharType);
        } else {
            nativeScanColNames.add(field.getName());
            nativeScanColTypes.add(field.getType());
        }
    }
    final RelDataType nativeScanRowType = typeFactory.createStructType(nativeScanColTypes, nativeScanColNames);
    // Create the list of projected columns set in HiveScan. The order of this list may not be same as the order of
    // columns in HiveScan row type. Note: If the HiveScan.getColumn() contains a '*', we just need to add it as it is,
    // unlike above where we expanded the '*'. HiveScan and related (subscan) can handle '*'.
    final List<SchemaPath> nativeScanCols = Lists.newArrayList();
    for (SchemaPath colName : hiveScanRel.getColumns()) {
        final String partitionCol = partitionColMapping.get(colName.getAsUnescapedPath());
        if (partitionCol != null) {
            nativeScanCols.add(SchemaPath.getSimplePath(partitionCol));
        } else {
            nativeScanCols.add(colName);
        }
    }
    final HiveScan hiveScan = (HiveScan) hiveScanRel.getGroupScan();
    final HiveDrillNativeParquetScan nativeHiveScan = new HiveDrillNativeParquetScan(hiveScan.getUserName(), hiveScan.hiveReadEntry, hiveScan.storagePlugin, nativeScanCols, null);
    return new DrillScanRel(hiveScanRel.getCluster(), hiveScanRel.getTraitSet(), hiveScanRel.getTable(), nativeHiveScan, nativeScanRowType, nativeScanCols);
}
Also used : RelDataTypeField(org.apache.calcite.rel.type.RelDataTypeField) DrillScanRel(org.apache.drill.exec.planner.logical.DrillScanRel) SchemaPath(org.apache.drill.common.expression.SchemaPath) RelDataTypeFactory(org.apache.calcite.rel.type.RelDataTypeFactory) HiveScan(org.apache.drill.exec.store.hive.HiveScan) RelDataType(org.apache.calcite.rel.type.RelDataType) HiveDrillNativeParquetScan(org.apache.drill.exec.store.hive.HiveDrillNativeParquetScan)

Aggregations

RelDataType (org.apache.calcite.rel.type.RelDataType)1 RelDataTypeFactory (org.apache.calcite.rel.type.RelDataTypeFactory)1 RelDataTypeField (org.apache.calcite.rel.type.RelDataTypeField)1 SchemaPath (org.apache.drill.common.expression.SchemaPath)1 DrillScanRel (org.apache.drill.exec.planner.logical.DrillScanRel)1 HiveDrillNativeParquetScan (org.apache.drill.exec.store.hive.HiveDrillNativeParquetScan)1 HiveScan (org.apache.drill.exec.store.hive.HiveScan)1