
Example 1 with PartitionTransformSpec

Use of org.apache.hadoop.hive.ql.parse.PartitionTransformSpec in project hive by apache.

The class HiveIcebergStorageHandler, method getPartitionTransformSpec:

@Override
public List<PartitionTransformSpec> getPartitionTransformSpec(org.apache.hadoop.hive.ql.metadata.Table hmsTable) {
    List<PartitionTransformSpec> result = new ArrayList<>();
    TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
    Table table = IcebergTableUtil.getTable(conf, tableDesc.getProperties());
    return table.spec().fields().stream().map(f -> {
        PartitionTransformSpec spec = new PartitionTransformSpec();
        spec.setColumnName(table.schema().findColumnName(f.sourceId()));
        // right now the only way to fetch the transform type and its params is through the toString() call
        String transformName = f.transform().toString().toUpperCase();
        // if the transform name contains '[' it means it has some config params
        if (transformName.contains("[")) {
            spec.setTransformType(PartitionTransformSpec.TransformType.valueOf(transformName.substring(0, transformName.indexOf("["))));
            spec.setTransformParam(Optional.of(Integer.valueOf(transformName.substring(transformName.indexOf("[") + 1, transformName.indexOf("]")))));
        } else {
            spec.setTransformType(PartitionTransformSpec.TransformType.valueOf(transformName));
            spec.setTransformParam(Optional.empty());
        }
        return spec;
    }).collect(Collectors.toList());
}
Also used : ExprNodeGenericFuncDesc(org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) TableDesc(org.apache.hadoop.hive.ql.plan.TableDesc) HadoopConfigurable(org.apache.iceberg.hadoop.HadoopConfigurable) ListIterator(java.util.ListIterator) URISyntaxException(java.net.URISyntaxException) Catalogs(org.apache.iceberg.mr.Catalogs) LoggerFactory(org.slf4j.LoggerFactory) Date(org.apache.hadoop.hive.common.type.Date) SemanticException(org.apache.hadoop.hive.ql.parse.SemanticException) JobID(org.apache.hadoop.mapred.JobID) AbstractSerDe(org.apache.hadoop.hive.serde2.AbstractSerDe) StatsSetupConst(org.apache.hadoop.hive.common.StatsSetupConst) OutputCommitter(org.apache.hadoop.mapred.OutputCommitter) AlterTableType(org.apache.hadoop.hive.ql.ddl.table.AlterTableType) Throwables(org.apache.iceberg.relocated.com.google.common.base.Throwables) Map(java.util.Map) Configuration(org.apache.hadoop.conf.Configuration) InputFormat(org.apache.hadoop.mapred.InputFormat) URI(java.net.URI) PrimitiveTypeInfo(org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo) HiveStorageHandler(org.apache.hadoop.hive.ql.metadata.HiveStorageHandler) HiveStoragePredicateHandler(org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler) Splitter(org.apache.iceberg.relocated.com.google.common.base.Splitter) OutputFormat(org.apache.hadoop.mapred.OutputFormat) ExprNodeDesc(org.apache.hadoop.hive.ql.plan.ExprNodeDesc) WriteEntity(org.apache.hadoop.hive.ql.hooks.WriteEntity) Collection(java.util.Collection) Partish(org.apache.hadoop.hive.ql.stats.Partish) HiveMetaHook(org.apache.hadoop.hive.metastore.HiveMetaHook) FileSinkDesc(org.apache.hadoop.hive.ql.plan.FileSinkDesc) InputFormatConfig(org.apache.iceberg.mr.InputFormatConfig) Schema(org.apache.iceberg.Schema) Collectors(java.util.stream.Collectors) SessionState(org.apache.hadoop.hive.ql.session.SessionState) PartitionSpecParser(org.apache.iceberg.PartitionSpecParser) Serializable(java.io.Serializable) SchemaParser(org.apache.iceberg.SchemaParser) List(java.util.List) Optional(java.util.Optional) TableProperties(org.apache.iceberg.TableProperties) SessionStateUtil(org.apache.hadoop.hive.ql.session.SessionStateUtil) HiveException(org.apache.hadoop.hive.ql.metadata.HiveException) LockType(org.apache.hadoop.hive.metastore.api.LockType) ConvertAstToSearchArg(org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg) HashMap(java.util.HashMap) ExprNodeDynamicListDesc(org.apache.hadoop.hive.ql.plan.ExprNodeDynamicListDesc) ArrayList(java.util.ArrayList) SearchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument) Utilities(org.apache.hadoop.hive.ql.exec.Utilities) JobStatus(org.apache.hadoop.mapred.JobStatus) PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec) ExprNodeColumnDesc(org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc) Properties(java.util.Properties) Logger(org.slf4j.Logger) Timestamp(org.apache.hadoop.hive.common.type.Timestamp) ExprNodeConstantDesc(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc) Table(org.apache.iceberg.Table) HiveConf(org.apache.hadoop.hive.conf.HiveConf) Maps(org.apache.iceberg.relocated.com.google.common.collect.Maps) IOException(java.io.IOException) SerializationUtil(org.apache.iceberg.util.SerializationUtil) JobConf(org.apache.hadoop.mapred.JobConf) SnapshotSummary(org.apache.iceberg.SnapshotSummary) JobContext(org.apache.hadoop.mapred.JobContext) Deserializer(org.apache.hadoop.hive.serde2.Deserializer) Preconditions(org.apache.iceberg.relocated.com.google.common.base.Preconditions) 
JobContextImpl(org.apache.hadoop.mapred.JobContextImpl) HiveAuthorizationProvider(org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider) SerializableTable(org.apache.iceberg.SerializableTable) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting) Table(org.apache.iceberg.Table) SerializableTable(org.apache.iceberg.SerializableTable) ArrayList(java.util.ArrayList) TableDesc(org.apache.hadoop.hive.ql.plan.TableDesc) PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec)
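
The parsing above is purely string based, since the transform type is recovered from Iceberg's Transform toString() output (for example "bucket[16]"). A minimal standalone sketch of the same substring logic, using plain Java and made-up sample strings rather than the Hive/Iceberg API:

import java.util.Optional;

public class TransformStringParseSketch {
    public static void main(String[] args) {
        // Sample strings in the shape produced by Transform#toString()
        for (String raw : new String[] { "bucket[16]", "truncate[4]", "identity", "year" }) {
            String name = raw.toUpperCase();
            String type;
            Optional<Integer> param;
            if (name.contains("[")) {
                // e.g. "BUCKET[16]" -> type BUCKET, param 16
                type = name.substring(0, name.indexOf("["));
                param = Optional.of(Integer.valueOf(name.substring(name.indexOf("[") + 1, name.indexOf("]"))));
            } else {
                // e.g. "IDENTITY" -> no parameter
                type = name;
                param = Optional.empty();
            }
            System.out.println(raw + " -> " + type + " " + param);
        }
    }
}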

Example 2 with PartitionTransformSpec

Use of org.apache.hadoop.hive.ql.parse.PartitionTransformSpec in project hive by apache.

The class DDLPlanUtils, method getPartitionsBySpec:

private String getPartitionsBySpec(Table table) {
    if (table.isNonNative() && table.getStorageHandler() != null && table.getStorageHandler().supportsPartitionTransform()) {
        List<PartitionTransformSpec> specs = table.getStorageHandler().getPartitionTransformSpec(table);
        if (specs.isEmpty()) {
            return "";
        }
        List<String> partitionTransforms = new ArrayList<>();
        for (PartitionTransformSpec spec : specs) {
            if (spec.getTransformType() == PartitionTransformSpec.TransformType.IDENTITY) {
                partitionTransforms.add(spec.getColumnName());
            } else {
                partitionTransforms.add(spec.getTransformType().name() + "(" + (spec.getTransformParam().isPresent() ? spec.getTransformParam().get() + ", " : "") + spec.getColumnName() + ")");
            }
        }
        return "PARTITIONED BY SPEC ( \n" + StringUtils.join(partitionTransforms, ", \n") + ")";
    }
    return "";
}
Also used : ArrayList(java.util.ArrayList) PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec)
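
As an illustration, a standalone sketch (plain Java, hypothetical column names "dept" and "id") of the DDL fragment the method above would assemble for an identity partition plus a parameterized BUCKET transform:

import java.util.Arrays;
import java.util.List;

public class PartitionSpecDdlSketch {
    public static void main(String[] args) {
        // Identity transforms are rendered as the bare column name,
        // parameterized transforms as NAME(param, column).
        List<String> partitionTransforms = Arrays.asList("dept", "BUCKET(16, id)");
        String ddl = "PARTITIONED BY SPEC ( \n" + String.join(", \n", partitionTransforms) + ")";
        System.out.println(ddl);
        // PARTITIONED BY SPEC (
        // dept,
        // BUCKET(16, id))
    }
}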

Example 3 with PartitionTransformSpec

Use of org.apache.hadoop.hive.ql.parse.PartitionTransformSpec in project hive by apache.

The class TextDescTableFormatter, method addPartitionTransformData:

private void addPartitionTransformData(DataOutputStream out, Table table, boolean isOutputPadded) throws IOException {
    String partitionTransformOutput = "";
    if (table.isNonNative() && table.getStorageHandler() != null && table.getStorageHandler().supportsPartitionTransform()) {
        List<PartitionTransformSpec> partSpecs = table.getStorageHandler().getPartitionTransformSpec(table);
        if (partSpecs != null && !partSpecs.isEmpty()) {
            TextMetaDataTable metaDataTable = new TextMetaDataTable();
            partitionTransformOutput += LINE_DELIM + "# Partition Transform Information" + LINE_DELIM + "# ";
            metaDataTable.addRow(DescTableDesc.PARTITION_TRANSFORM_SPEC_SCHEMA.split("#")[0].split(","));
            for (PartitionTransformSpec spec : partSpecs) {
                String[] row = new String[2];
                row[0] = spec.getColumnName();
                if (spec.getTransformType() != null) {
                    row[1] = spec.getTransformParam().isPresent() ? spec.getTransformType().name() + "[" + spec.getTransformParam().get() + "]" : spec.getTransformType().name();
                }
                metaDataTable.addRow(row);
            }
            partitionTransformOutput += metaDataTable.renderTable(isOutputPadded);
        }
    }
    out.write(partitionTransformOutput.getBytes(StandardCharsets.UTF_8));
}
Also used : PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec) TextMetaDataTable(org.apache.hadoop.hive.ql.ddl.ShowUtils.TextMetaDataTable)
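
The transform column of each row is built by the ternary above. A standalone sketch of that label formatting, in plain Java with hypothetical values:

import java.util.Optional;

public class TransformLabelSketch {
    // Mirrors how row[1] is built: the transform type name, with the numeric
    // parameter appended in brackets when one is present.
    static String label(String transformTypeName, Optional<Integer> transformParam) {
        return transformParam.isPresent()
            ? transformTypeName + "[" + transformParam.get() + "]"
            : transformTypeName;
    }

    public static void main(String[] args) {
        System.out.println(label("BUCKET", Optional.of(16)));    // BUCKET[16]
        System.out.println(label("IDENTITY", Optional.empty())); // IDENTITY
    }
}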

Example 4 with PartitionTransformSpec

Use of org.apache.hadoop.hive.ql.parse.PartitionTransformSpec in project hive by apache.

The class CreateTableDesc, method toTable:

public Table toTable(HiveConf conf) throws HiveException {
    Table tbl = new Table(tableName.getDb(), tableName.getTable());
    if (getTblProps() != null) {
        tbl.getTTable().getParameters().putAll(getTblProps());
    }
    if (getNumBuckets() != -1) {
        tbl.setNumBuckets(getNumBuckets());
    }
    if (getStorageHandler() != null) {
        tbl.setProperty(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE, getStorageHandler());
    }
    HiveStorageHandler storageHandler = tbl.getStorageHandler();
    /*
     * If the user didn't specify a SerDe, we use the default.
     */
    String serDeClassName;
    if (getSerName() == null) {
        if (storageHandler == null) {
            serDeClassName = PlanUtils.getDefaultSerDe().getName();
            LOG.info("Default to " + serDeClassName + " for table " + tableName);
        } else {
            serDeClassName = storageHandler.getSerDeClass().getName();
            LOG.info("Use StorageHandler-supplied " + serDeClassName + " for table " + tableName);
        }
    } else {
        // let's validate that the serde exists
        serDeClassName = getSerName();
        DDLUtils.validateSerDe(serDeClassName, conf);
    }
    tbl.setSerializationLib(serDeClassName);
    if (getFieldDelim() != null) {
        tbl.setSerdeParam(serdeConstants.FIELD_DELIM, getFieldDelim());
        tbl.setSerdeParam(serdeConstants.SERIALIZATION_FORMAT, getFieldDelim());
    }
    if (getFieldEscape() != null) {
        tbl.setSerdeParam(serdeConstants.ESCAPE_CHAR, getFieldEscape());
    }
    if (getCollItemDelim() != null) {
        tbl.setSerdeParam(serdeConstants.COLLECTION_DELIM, getCollItemDelim());
    }
    if (getMapKeyDelim() != null) {
        tbl.setSerdeParam(serdeConstants.MAPKEY_DELIM, getMapKeyDelim());
    }
    if (getLineDelim() != null) {
        tbl.setSerdeParam(serdeConstants.LINE_DELIM, getLineDelim());
    }
    if (getNullFormat() != null) {
        tbl.setSerdeParam(serdeConstants.SERIALIZATION_NULL_FORMAT, getNullFormat());
    }
    if (getSerdeProps() != null) {
        Iterator<Map.Entry<String, String>> iter = getSerdeProps().entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry<String, String> m = iter.next();
            tbl.setSerdeParam(m.getKey(), m.getValue());
        }
    }
    Optional<List<FieldSchema>> cols = Optional.ofNullable(getCols());
    Optional<List<FieldSchema>> partCols = Optional.ofNullable(getPartCols());
    if (storageHandler != null && storageHandler.alwaysUnpartitioned()) {
        tbl.getSd().setCols(new ArrayList<>());
        cols.ifPresent(c -> tbl.getSd().getCols().addAll(c));
        if (partCols.isPresent() && !partCols.get().isEmpty()) {
            // Add the partition columns to the normal columns and save the transform to the session state
            tbl.getSd().getCols().addAll(partCols.get());
            List<PartitionTransformSpec> spec = PartitionTransform.getPartitionTransformSpec(partCols.get());
            if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {
                throw new HiveException("Query state attached to Session state must be not null. " + "Partition transform metadata cannot be saved.");
            }
        }
    } else {
        cols.ifPresent(c -> tbl.setFields(c));
        partCols.ifPresent(c -> tbl.setPartCols(c));
    }
    if (getBucketCols() != null) {
        tbl.setBucketCols(getBucketCols());
    }
    if (getSortCols() != null) {
        tbl.setSortCols(getSortCols());
    }
    if (getComment() != null) {
        tbl.setProperty("comment", getComment());
    }
    if (getLocation() != null) {
        tbl.setDataLocation(new Path(getLocation()));
    }
    if (getSkewedColNames() != null) {
        tbl.setSkewedColNames(getSkewedColNames());
    }
    if (getSkewedColValues() != null) {
        tbl.setSkewedColValues(getSkewedColValues());
    }
    tbl.getTTable().setTemporary(isTemporary());
    tbl.setStoredAsSubDirectories(isStoredAsSubDirectories());
    tbl.setInputFormatClass(getInputFormat());
    tbl.setOutputFormatClass(getOutputFormat());
    // Only persist the input/output format in the metastore when it is explicitly specified.
    // Otherwise, load lazily via StorageHandler at query time.
    if (getInputFormat() != null && !getInputFormat().isEmpty()) {
        tbl.getTTable().getSd().setInputFormat(tbl.getInputFormatClass().getName());
    }
    if (getOutputFormat() != null && !getOutputFormat().isEmpty()) {
        tbl.getTTable().getSd().setOutputFormat(tbl.getOutputFormatClass().getName());
    }
    if (CreateTableOperation.doesTableNeedLocation(tbl)) {
        // If location is specified - ensure that it is a full qualified name
        CreateTableOperation.makeLocationQualified(tbl, conf);
    }
    if (isExternal()) {
        tbl.setProperty("EXTERNAL", "TRUE");
        tbl.setTableType(TableType.EXTERNAL_TABLE);
    }
    // If the sorted columns are a superset of the bucketed columns, record this fact; it can
    // later be used to optimize some group-by queries. The order does not matter as long as
    // the bucketed columns fall within the first
    // 'n' columns where 'n' is the length of the bucketed columns.
    if ((tbl.getBucketCols() != null) && (tbl.getSortCols() != null)) {
        List<String> bucketCols = tbl.getBucketCols();
        List<Order> sortCols = tbl.getSortCols();
        if ((sortCols.size() > 0) && (sortCols.size() >= bucketCols.size())) {
            boolean found = true;
            Iterator<String> iterBucketCols = bucketCols.iterator();
            while (iterBucketCols.hasNext()) {
                String bucketCol = iterBucketCols.next();
                boolean colFound = false;
                for (int i = 0; i < bucketCols.size(); i++) {
                    if (bucketCol.equals(sortCols.get(i).getCol())) {
                        colFound = true;
                        break;
                    }
                }
                if (colFound == false) {
                    found = false;
                    break;
                }
            }
            if (found) {
                tbl.setProperty("SORTBUCKETCOLSPREFIX", "TRUE");
            }
        }
    }
    if (colStats != null) {
        ColumnStatisticsDesc colStatsDesc = new ColumnStatisticsDesc(colStats.getStatsDesc());
        colStatsDesc.setCatName(tbl.getCatName());
        colStatsDesc.setDbName(tbl.getDbName());
        colStatsDesc.setTableName(tbl.getTableName());
        String engine = colStats.getEngine();
        if (engine == null) {
            engine = org.apache.hadoop.hive.conf.Constants.HIVE_ENGINE;
        }
        ColumnStatistics columnStatistics = new ColumnStatistics(colStatsDesc, colStats.getStatsObj());
        columnStatistics.setEngine(engine);
        tbl.getTTable().setColStats(columnStatistics);
        // Statistics carry an associated write id for a transactional table; it is needed to
        // update column statistics.
        if (replWriteId > 0) {
            tbl.getTTable().setWriteId(replWriteId);
        }
    }
    // When replicating, statistics are obtained from the source, so do not
    // reset them on the replica.
    if (replicationSpec == null || !replicationSpec.isInReplicationScope()) {
        if (!this.isCTAS && (tbl.getPath() == null || (!isExternal() && tbl.isEmpty()))) {
            if (!tbl.isPartitioned() && conf.getBoolVar(HiveConf.ConfVars.HIVESTATSAUTOGATHER)) {
                StatsSetupConst.setStatsStateForCreateTable(tbl.getTTable().getParameters(), MetaStoreUtils.getColumnNames(tbl.getCols()), StatsSetupConst.TRUE);
            }
        } else {
            StatsSetupConst.setStatsStateForCreateTable(tbl.getTTable().getParameters(), null, StatsSetupConst.FALSE);
        }
    }
    if (ownerName != null) {
        tbl.setOwner(ownerName);
    }
    return tbl;
}
Also used : Path(org.apache.hadoop.fs.Path) Order(org.apache.hadoop.hive.metastore.api.Order) ColumnStatistics(org.apache.hadoop.hive.metastore.api.ColumnStatistics) HiveStorageHandler(org.apache.hadoop.hive.ql.metadata.HiveStorageHandler) Table(org.apache.hadoop.hive.ql.metadata.Table) HiveException(org.apache.hadoop.hive.ql.metadata.HiveException) SQLCheckConstraint(org.apache.hadoop.hive.metastore.api.SQLCheckConstraint) SQLNotNullConstraint(org.apache.hadoop.hive.metastore.api.SQLNotNullConstraint) SQLUniqueConstraint(org.apache.hadoop.hive.metastore.api.SQLUniqueConstraint) SQLDefaultConstraint(org.apache.hadoop.hive.metastore.api.SQLDefaultConstraint) ColumnStatisticsDesc(org.apache.hadoop.hive.metastore.api.ColumnStatisticsDesc) List(java.util.List) ArrayList(java.util.ArrayList) Map(java.util.Map) HashMap(java.util.HashMap) PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec)
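
The PartitionTransformSpec-related part of this method is the alwaysUnpartitioned() branch: the declared partition columns are folded into the regular column list, so the table stays unpartitioned as far as the metastore is concerned, and only the transform spec is stashed in the session state for later use. A minimal sketch of that folding step, in plain Java with hypothetical column names and no Hive classes:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnpartitionedFoldSketch {
    public static void main(String[] args) {
        List<String> cols = new ArrayList<>(Arrays.asList("id", "name"));
        List<String> partCols = Arrays.asList("dept");

        // e.g. the Iceberg storage handler reports alwaysUnpartitioned()
        boolean alwaysUnpartitioned = true;
        if (alwaysUnpartitioned) {
            // Partition columns become ordinary columns; only the transform spec
            // (derived from partCols) would be saved to the session state.
            cols.addAll(partCols);
        }
        System.out.println(cols); // [id, name, dept]
    }
}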

Example 5 with PartitionTransformSpec

Use of org.apache.hadoop.hive.ql.parse.PartitionTransformSpec in project hive by apache.

The class AlterTableSetPartitionSpecAnalyzer, method analyzeCommand:

@Override
protected void analyzeCommand(TableName tableName, Map<String, String> partitionSpec, ASTNode command) throws SemanticException {
    Table table = getTable(tableName);
    validateAlterTableType(table, AlterTableType.SETPARTITIONSPEC, false);
    inputs.add(new ReadEntity(table));
    List<PartitionTransformSpec> partitionTransformSpec = PartitionTransform.getPartitionTransformSpec(command);
    if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, partitionTransformSpec)) {
        throw new SemanticException("Query state attached to Session state must be not null. " + "Partition transform metadata cannot be saved.");
    }
    AlterTableSetPartitionSpecDesc desc = new AlterTableSetPartitionSpecDesc(tableName, partitionSpec);
    rootTasks.add(TaskFactory.get(new DDLWork(getInputs(), getOutputs(), desc)));
}
Also used : ReadEntity(org.apache.hadoop.hive.ql.hooks.ReadEntity) Table(org.apache.hadoop.hive.ql.metadata.Table) DDLWork(org.apache.hadoop.hive.ql.ddl.DDLWork) PartitionTransformSpec(org.apache.hadoop.hive.ql.parse.PartitionTransformSpec) SemanticException(org.apache.hadoop.hive.ql.parse.SemanticException)
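
For context, a hedged sketch of the kind of PartitionTransformSpec list that PartitionTransform.getPartitionTransformSpec(command) might produce for a statement such as ALTER TABLE tbl SET PARTITION SPEC (dept, bucket(16, id)). It uses only the setters visible in Example 1 and assumes that TransformType includes BUCKET alongside IDENTITY:

import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import org.apache.hadoop.hive.ql.parse.PartitionTransformSpec;

public class SetPartitionSpecSketch {
    public static void main(String[] args) {
        List<PartitionTransformSpec> specs = new ArrayList<>();

        // Identity partition on "dept"
        PartitionTransformSpec identity = new PartitionTransformSpec();
        identity.setColumnName("dept");
        identity.setTransformType(PartitionTransformSpec.TransformType.IDENTITY);
        identity.setTransformParam(Optional.empty());
        specs.add(identity);

        // Bucket partition on "id" with 16 buckets (assumes TransformType.BUCKET exists)
        PartitionTransformSpec bucket = new PartitionTransformSpec();
        bucket.setColumnName("id");
        bucket.setTransformType(PartitionTransformSpec.TransformType.BUCKET);
        bucket.setTransformParam(Optional.of(16));
        specs.add(bucket);

        System.out.println(specs.size() + " partition transform specs built");
    }
}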

Aggregations

PartitionTransformSpec (org.apache.hadoop.hive.ql.parse.PartitionTransformSpec): 6
ArrayList (java.util.ArrayList): 3
List (java.util.List): 3
HashMap (java.util.HashMap): 2
Map (java.util.Map): 2
Properties (java.util.Properties): 2
Configuration (org.apache.hadoop.conf.Configuration): 2
HiveException (org.apache.hadoop.hive.ql.metadata.HiveException): 2
HiveStorageHandler (org.apache.hadoop.hive.ql.metadata.HiveStorageHandler): 2
Table (org.apache.hadoop.hive.ql.metadata.Table): 2
SemanticException (org.apache.hadoop.hive.ql.parse.SemanticException): 2
SessionStateUtil (org.apache.hadoop.hive.ql.session.SessionStateUtil): 2
IOException (java.io.IOException): 1
Serializable (java.io.Serializable): 1
URI (java.net.URI): 1
URISyntaxException (java.net.URISyntaxException): 1
Collection (java.util.Collection): 1
ListIterator (java.util.ListIterator): 1
Optional (java.util.Optional): 1
Collectors (java.util.stream.Collectors): 1