
Example 1 with HiveMetastoreClientWrapper

Use of org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper in project flink by apache.

From the class HiveParserUtils, method getFunctionInfo.

// Get FunctionInfo and always look for it in metastore when FunctionRegistry returns null.
public static FunctionInfo getFunctionInfo(String funcName) throws SemanticException {
    FunctionInfo res = FunctionRegistry.getFunctionInfo(funcName);
    if (res == null) {
        SessionState sessionState = SessionState.get();
        HiveConf hiveConf = sessionState != null ? sessionState.getConf() : null;
        if (hiveConf != null) {
            // TODO: need to support overriding hive version
            try (HiveMetastoreClientWrapper hmsClient = new HiveMetastoreClientWrapper(hiveConf, HiveShimLoader.getHiveVersion())) {
                String[] parts = FunctionUtils.getQualifiedFunctionNameParts(funcName);
                Function function = hmsClient.getFunction(parts[0], parts[1]);
                getSessionHiveShim()
                        .registerTemporaryFunction(
                                FunctionUtils.qualifyFunctionName(parts[1], parts[0]),
                                Thread.currentThread()
                                        .getContextClassLoader()
                                        .loadClass(function.getClassName()));
                res = FunctionRegistry.getFunctionInfo(funcName);
            } catch (NoSuchObjectException e) {
                LOG.warn("Function {} doesn't exist in metastore", funcName);
            } catch (Exception e) {
                LOG.warn("Failed to look up function in metastore", e);
            }
        }
    }
    return res;
}
Also used: SessionState (org.apache.hadoop.hive.ql.session.SessionState), SqlAggFunction (org.apache.calcite.sql.SqlAggFunction), HiveAggSqlFunction (org.apache.flink.table.planner.functions.utils.HiveAggSqlFunction), BridgingSqlFunction (org.apache.flink.table.planner.functions.bridging.BridgingSqlFunction), SqlUserDefinedTableFunction (org.apache.calcite.sql.validate.SqlUserDefinedTableFunction), HiveTableSqlFunction (org.apache.flink.table.planner.functions.utils.HiveTableSqlFunction), Function (org.apache.hadoop.hive.metastore.api.Function), HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper), WindowFunctionInfo (org.apache.hadoop.hive.ql.exec.WindowFunctionInfo), FunctionInfo (org.apache.hadoop.hive.ql.exec.FunctionInfo), HiveConf (org.apache.hadoop.hive.conf.HiveConf), NoSuchObjectException (org.apache.hadoop.hive.metastore.api.NoSuchObjectException), NlsString (org.apache.calcite.util.NlsString), InvocationTargetException (java.lang.reflect.InvocationTargetException), IOException (java.io.IOException), SemanticException (org.apache.hadoop.hive.ql.parse.SemanticException), FlinkHiveException (org.apache.flink.connectors.hive.FlinkHiveException), HiveException (org.apache.hadoop.hive.ql.metadata.HiveException)
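
For context, a minimal sketch of how this helper might be called; the function name "mydb.my_udf" and the wrapping method are hypothetical, and an active Hive SessionState with a valid HiveConf is assumed so that the metastore fallback can actually run.

// Hypothetical caller (names assumed, not part of the example above). The checked
// SemanticException declared by getFunctionInfo is simply propagated here.
static void printUdfInfo() throws SemanticException {
    FunctionInfo info = HiveParserUtils.getFunctionInfo("mydb.my_udf");
    if (info != null) {
        System.out.println("Resolved function: " + info.getDisplayName());
    }
}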

Example 2 with HiveMetastoreClientWrapper

Use of org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper in project flink by apache.

From the class HiveTableSink, method consume.

private DataStreamSink<?> consume(ProviderContext providerContext, DataStream<RowData> dataStream, boolean isBounded, DataStructureConverter converter) {
    checkAcidTable(catalogTable.getOptions(), identifier.toObjectPath());
    try (HiveMetastoreClientWrapper client = HiveMetastoreClientFactory.create(HiveConfUtils.create(jobConf), hiveVersion)) {
        Table table = client.getTable(identifier.getDatabaseName(), identifier.getObjectName());
        StorageDescriptor sd = table.getSd();
        Class hiveOutputFormatClz = hiveShim.getHiveOutputFormatClass(Class.forName(sd.getOutputFormat()));
        boolean isCompressed = jobConf.getBoolean(HiveConf.ConfVars.COMPRESSRESULT.varname, false);
        HiveWriterFactory writerFactory =
                new HiveWriterFactory(
                        jobConf,
                        hiveOutputFormatClz,
                        sd.getSerdeInfo(),
                        tableSchema,
                        getPartitionKeyArray(),
                        HiveReflectionUtils.getTableMetadata(hiveShim, table),
                        hiveShim,
                        isCompressed);
        String extension =
                Utilities.getFileExtension(
                        jobConf, isCompressed, (HiveOutputFormat<?, ?>) hiveOutputFormatClz.newInstance());
        OutputFileConfig.OutputFileConfigBuilder fileNamingBuilder =
                OutputFileConfig.builder()
                        .withPartPrefix("part-" + UUID.randomUUID().toString())
                        .withPartSuffix(extension == null ? "" : extension);
        final int parallelism = Optional.ofNullable(configuredParallelism).orElse(dataStream.getParallelism());
        if (isBounded) {
            OutputFileConfig fileNaming = fileNamingBuilder.build();
            return createBatchSink(dataStream, converter, sd, writerFactory, fileNaming, parallelism);
        } else {
            if (overwrite) {
                throw new IllegalStateException("Streaming mode not support overwrite.");
            }
            Properties tableProps = HiveReflectionUtils.getTableMetadata(hiveShim, table);
            return createStreamSink(providerContext, dataStream, sd, tableProps, writerFactory, fileNamingBuilder, parallelism);
        }
    } catch (TException e) {
        throw new CatalogException("Failed to query Hive metaStore", e);
    } catch (IOException e) {
        throw new FlinkRuntimeException("Failed to create staging dir", e);
    } catch (ClassNotFoundException e) {
        throw new FlinkHiveException("Failed to get output format class", e);
    } catch (IllegalAccessException | InstantiationException e) {
        throw new FlinkHiveException("Failed to instantiate output format instance", e);
    }
}
Also used: TException (org.apache.thrift.TException), CatalogTable (org.apache.flink.table.catalog.CatalogTable), Table (org.apache.hadoop.hive.metastore.api.Table), HiveTableUtil.checkAcidTable (org.apache.flink.table.catalog.hive.util.HiveTableUtil.checkAcidTable), StorageDescriptor (org.apache.hadoop.hive.metastore.api.StorageDescriptor), CatalogException (org.apache.flink.table.catalog.exceptions.CatalogException), UncheckedIOException (java.io.UncheckedIOException), IOException (java.io.IOException), Properties (java.util.Properties), OutputFileConfig (org.apache.flink.streaming.api.functions.sink.filesystem.OutputFileConfig), HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper), FlinkRuntimeException (org.apache.flink.util.FlinkRuntimeException), HiveWriterFactory (org.apache.flink.connectors.hive.write.HiveWriterFactory)
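
The metastore access above follows the factory-plus-try-with-resources pattern. A minimal standalone sketch of that pattern, assuming a hypothetical table mydb.mytable and a bare JobConf (in the real sink the jobConf carries the cluster configuration):

// Sketch only: open a metastore client, read one table, and let try-with-resources
// close the client. The database/table names and the empty JobConf are assumptions.
static void printTableLocation() {
    try (HiveMetastoreClientWrapper client =
            HiveMetastoreClientFactory.create(
                    HiveConfUtils.create(new JobConf()), HiveShimLoader.getHiveVersion())) {
        Table t = client.getTable("mydb", "mytable");
        System.out.println("Table location: " + t.getSd().getLocation());
    } catch (TException e) {
        throw new CatalogException("Failed to query Hive metastore", e);
    }
}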

Example 3 with HiveMetastoreClientWrapper

Use of org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper in project flink by apache.

From the class HivePartitionUtils, method getAllPartitions.

/**
 * Returns all HiveTablePartitions of a hive table, returns single HiveTablePartition if the
 * hive table is not partitioned.
 */
public static List<HiveTablePartition> getAllPartitions(JobConf jobConf, String hiveVersion, ObjectPath tablePath, List<String> partitionColNames, List<Map<String, String>> remainingPartitions) {
    List<HiveTablePartition> allHivePartitions = new ArrayList<>();
    try (HiveMetastoreClientWrapper client = HiveMetastoreClientFactory.create(HiveConfUtils.create(jobConf), hiveVersion)) {
        String dbName = tablePath.getDatabaseName();
        String tableName = tablePath.getObjectName();
        Table hiveTable = client.getTable(dbName, tableName);
        Properties tableProps = HiveReflectionUtils.getTableMetadata(HiveShimLoader.loadHiveShim(hiveVersion), hiveTable);
        if (partitionColNames != null && partitionColNames.size() > 0) {
            List<Partition> partitions = new ArrayList<>();
            if (remainingPartitions != null) {
                for (Map<String, String> spec : remainingPartitions) {
                    partitions.add(client.getPartition(dbName, tableName, partitionSpecToValues(spec, partitionColNames)));
                }
            } else {
                partitions.addAll(client.listPartitions(dbName, tableName, (short) -1));
            }
            for (Partition partition : partitions) {
                HiveTablePartition hiveTablePartition = toHiveTablePartition(partitionColNames, tableProps, partition);
                allHivePartitions.add(hiveTablePartition);
            }
        } else {
            allHivePartitions.add(new HiveTablePartition(hiveTable.getSd(), tableProps));
        }
    } catch (TException e) {
        throw new FlinkHiveException("Failed to collect all partitions from hive metaStore", e);
    }
    return allHivePartitions;
}
Also used: TException (org.apache.thrift.TException), Partition (org.apache.hadoop.hive.metastore.api.Partition), HiveTablePartition (org.apache.flink.connectors.hive.HiveTablePartition), Table (org.apache.hadoop.hive.metastore.api.Table), HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper), FlinkHiveException (org.apache.flink.connectors.hive.FlinkHiveException), ArrayList (java.util.ArrayList), Properties (java.util.Properties)
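
A minimal sketch of a call site, assuming a hypothetical table mydb.mytable partitioned by dt and hour; passing null for remainingPartitions lists all partitions instead of pruning to a subset.

// Hypothetical call site: list every partition of mydb.mytable (no pruning).
JobConf jobConf = new JobConf();
ObjectPath tablePath = new ObjectPath("mydb", "mytable");
List<HiveTablePartition> partitions =
        HivePartitionUtils.getAllPartitions(
                jobConf,
                HiveShimLoader.getHiveVersion(),
                tablePath,
                Arrays.asList("dt", "hour"), // partition column names (assumed)
                null); // null = no pruning, return all partitions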

Example 4 with HiveMetastoreClientWrapper

Use of org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper in project flink by apache.

From the class HivePartitionFetcherContextBase, method open.

@Override
public void open() throws Exception {
    metaStoreClient = new HiveMetastoreClientWrapper(HiveConfUtils.create(confWrapper.conf()), hiveShim);
    table = metaStoreClient.getTable(tablePath.getDatabaseName(), tablePath.getObjectName());
    tableSd = table.getSd();
    tableProps = HiveReflectionUtils.getTableMetadata(hiveShim, table);
    String extractorKind = configuration.get(PARTITION_TIME_EXTRACTOR_KIND);
    String extractorClass = configuration.get(PARTITION_TIME_EXTRACTOR_CLASS);
    String formatterPattern = configuration.get(PARTITION_TIME_EXTRACTOR_TIMESTAMP_FORMATTER);
    String extractorPattern = configuration.get(PARTITION_TIME_EXTRACTOR_TIMESTAMP_PATTERN);
    extractor =
            PartitionTimeExtractor.create(
                    Thread.currentThread().getContextClassLoader(),
                    extractorKind,
                    extractorClass,
                    extractorPattern,
                    formatterPattern);
    tableLocation = new Path(table.getSd().getLocation());
    partValuesToCreateTime = new HashMap<>();
}
Also used: ObjectPath (org.apache.flink.table.catalog.ObjectPath), Path (org.apache.hadoop.fs.Path), HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper)

Example 5 with HiveMetastoreClientWrapper

Use of org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper in project flink by apache.

From the class HiveTablePartition, method ofPartition.

/**
 * Creates a HiveTablePartition to represent a hive partition.
 *
 * @param hiveConf the HiveConf used to connect to HMS
 * @param hiveVersion the version of hive in use, if it's null the version will be automatically
 *     detected
 * @param dbName name of the database
 * @param tableName name of the table
 * @param partitionSpec map from each partition column to its value. The map should contain
 *     exactly all the partition columns and in the order in which the partition columns are
 *     defined
 */
public static HiveTablePartition ofPartition(HiveConf hiveConf, @Nullable String hiveVersion, String dbName, String tableName, LinkedHashMap<String, String> partitionSpec) {
    HiveShim hiveShim = getHiveShim(hiveVersion);
    try (HiveMetastoreClientWrapper client = new HiveMetastoreClientWrapper(hiveConf, hiveShim)) {
        Table hiveTable = client.getTable(dbName, tableName);
        Partition hivePartition = client.getPartition(dbName, tableName, new ArrayList<>(partitionSpec.values()));
        return new HiveTablePartition(hivePartition.getSd(), partitionSpec, HiveReflectionUtils.getTableMetadata(hiveShim, hiveTable));
    } catch (TException e) {
        throw new FlinkHiveException(String.format("Failed to create HiveTablePartition for partition %s of hive table %s.%s", partitionSpec, dbName, tableName), e);
    }
}
Also used: TException (org.apache.thrift.TException), Partition (org.apache.hadoop.hive.metastore.api.Partition), Table (org.apache.hadoop.hive.metastore.api.Table), HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper), HiveShim (org.apache.flink.table.catalog.hive.client.HiveShim)
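
A minimal usage sketch, assuming a hypothetical table mydb.sales partitioned by (dt, region). A LinkedHashMap is used because it preserves insertion order, which satisfies the javadoc requirement that the partition columns appear in their defined order; passing null for hiveVersion lets the version be auto-detected.

// Hypothetical partition spec; keys must follow the order in which the partition
// columns are defined on the table, which LinkedHashMap preserves.
LinkedHashMap<String, String> spec = new LinkedHashMap<>();
spec.put("dt", "2022-01-01");
spec.put("region", "emea");
HiveTablePartition partition =
        HiveTablePartition.ofPartition(new HiveConf(), null, "mydb", "sales", spec);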

Aggregations

HiveMetastoreClientWrapper (org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper): 5 usages
Table (org.apache.hadoop.hive.metastore.api.Table): 3 usages
TException (org.apache.thrift.TException): 3 usages
IOException (java.io.IOException): 2 usages
Properties (java.util.Properties): 2 usages
FlinkHiveException (org.apache.flink.connectors.hive.FlinkHiveException): 2 usages
Partition (org.apache.hadoop.hive.metastore.api.Partition): 2 usages
UncheckedIOException (java.io.UncheckedIOException): 1 usage
InvocationTargetException (java.lang.reflect.InvocationTargetException): 1 usage
ArrayList (java.util.ArrayList): 1 usage
SqlAggFunction (org.apache.calcite.sql.SqlAggFunction): 1 usage
SqlUserDefinedTableFunction (org.apache.calcite.sql.validate.SqlUserDefinedTableFunction): 1 usage
NlsString (org.apache.calcite.util.NlsString): 1 usage
HiveTablePartition (org.apache.flink.connectors.hive.HiveTablePartition): 1 usage
HiveWriterFactory (org.apache.flink.connectors.hive.write.HiveWriterFactory): 1 usage
OutputFileConfig (org.apache.flink.streaming.api.functions.sink.filesystem.OutputFileConfig): 1 usage
CatalogTable (org.apache.flink.table.catalog.CatalogTable): 1 usage
ObjectPath (org.apache.flink.table.catalog.ObjectPath): 1 usage
CatalogException (org.apache.flink.table.catalog.exceptions.CatalogException): 1 usage
HiveShim (org.apache.flink.table.catalog.hive.client.HiveShim): 1 usage