
Example 1 with VisibleForTesting

Use of org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting in project hive by apache.

The class HiveTableOperations, method acquireLock:

@VisibleForTesting
long acquireLock() throws UnknownHostException, TException, InterruptedException {
    final LockComponent lockComponent = new LockComponent(LockType.EXCL_WRITE, LockLevel.TABLE, database);
    lockComponent.setTablename(tableName);
    final LockRequest lockRequest = new LockRequest(Lists.newArrayList(lockComponent), System.getProperty("user.name"), InetAddress.getLocalHost().getHostName());
    LockResponse lockResponse = metaClients.run(client -> client.lock(lockRequest));
    AtomicReference<LockState> state = new AtomicReference<>(lockResponse.getState());
    long lockId = lockResponse.getLockid();
    final long start = System.currentTimeMillis();
    long duration = 0;
    boolean timeout = false;
    try {
        if (state.get().equals(LockState.WAITING)) {
            // The retry count is normally the upper bound on attempts for Tasks.run(); in fact, Tasks.run()
            // makes at most `retries + 1` attempts. Here the lock-acquisition timeout is the real upper
            // bound, so the retry count only needs to be large. However, Integer.MAX_VALUE would overflow
            // into Integer.MIN_VALUE when `retries + 1` is computed, so the count is set conservatively
            // to `Integer.MAX_VALUE - 100` to stay clear of the boundary.
            Tasks.foreach(lockId)
                .retry(Integer.MAX_VALUE - 100)
                .exponentialBackoff(lockCheckMinWaitTime, lockCheckMaxWaitTime, lockAcquireTimeout, 1.5)
                .throwFailureWhenFinished()
                .onlyRetryOn(WaitingForLockException.class)
                .run(id -> {
                try {
                    LockResponse response = metaClients.run(client -> client.checkLock(id));
                    LockState newState = response.getState();
                    state.set(newState);
                    if (newState.equals(LockState.WAITING)) {
                        throw new WaitingForLockException("Waiting for lock.");
                    }
                } catch (InterruptedException e) {
                    // Clear the interrupt status flag
                    Thread.interrupted();
                    LOG.warn("Interrupted while waiting for lock.", e);
                }
            }, TException.class);
        }
    } catch (WaitingForLockException waitingForLockException) {
        timeout = true;
        duration = System.currentTimeMillis() - start;
    } finally {
        if (!state.get().equals(LockState.ACQUIRED)) {
            unlock(Optional.of(lockId));
        }
    }
    // Timed out and the lock was not acquired
    if (timeout && !state.get().equals(LockState.ACQUIRED)) {
        throw new CommitFailedException("Timed out after %s ms waiting for lock on %s.%s", duration, database, tableName);
    }
    if (!state.get().equals(LockState.ACQUIRED)) {
        throw new CommitFailedException("Could not acquire the lock on %s.%s, lock request ended in state %s", database, tableName, state);
    }
    return lockId;
}
Also used : LockComponent(org.apache.hadoop.hive.metastore.api.LockComponent) LockResponse(org.apache.hadoop.hive.metastore.api.LockResponse) AtomicReference(java.util.concurrent.atomic.AtomicReference) LockState(org.apache.hadoop.hive.metastore.api.LockState) LockRequest(org.apache.hadoop.hive.metastore.api.LockRequest) CommitFailedException(org.apache.iceberg.exceptions.CommitFailedException) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting)
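For context, a caller typically holds the returned lock only for the duration of the metastore update and releases it in a finally block. The sketch below is illustrative rather than the actual Iceberg commit path; doCommit() is a hypothetical helper standing in for the HMS update.

long lockId = acquireLock();
try {
    doCommit(); // hypothetical: update the HMS table while holding the lock
} finally {
    // Always release the lock, mirroring the unlock(Optional.of(lockId)) call in acquireLock() above
    unlock(Optional.of(lockId));
}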

Example 2 with VisibleForTesting

Use of org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting in project hive by apache.

The class HiveTableOperations, method persistTable:

@VisibleForTesting
void persistTable(Table hmsTable, boolean updateHiveTable) throws TException, InterruptedException {
    if (updateHiveTable) {
        // The HMS table already exists: alter it in place
        metaClients.run(client -> {
            // Tell HMS not to recompute table statistics as part of this alter call
            EnvironmentContext envContext = new EnvironmentContext(ImmutableMap.of(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE));
            ALTER_TABLE.invoke(client, database, tableName, hmsTable, envContext);
            return null;
        });
    } else {
        // No HMS table yet: create it
        metaClients.run(client -> {
            client.createTable(hmsTable);
            return null;
        });
    }
}
Also used : EnvironmentContext(org.apache.hadoop.hive.metastore.api.EnvironmentContext) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting)
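Because persistTable is package-private and marked @VisibleForTesting, tests can stub it to simulate metastore failures during a commit. A minimal sketch with Mockito, assuming `ops` is a HiveTableOperations instance already built by the test harness (the setup is omitted):

// A minimal sketch, not the actual Iceberg test code: fail the HMS update mid-commit.
HiveTableOperations spyOps = Mockito.spy(ops);
Mockito.doThrow(new TException("Simulated metastore failure"))
    .when(spyOps)
    .persistTable(Mockito.any(), Mockito.anyBoolean());
// A commit that goes through spyOps now surfaces the TException instead of persisting the table.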

Example 3 with VisibleForTesting

Use of org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting in project hive by apache.

The class HiveIcebergStorageHandler, method overlayTableProperties:

/**
 * Stores the serializable table data in the configuration.
 * Currently the following is handled:
 * <ul>
 *   <li>Table - in case the table is serializable</li>
 *   <li>Location</li>
 *   <li>Schema</li>
 *   <li>Partition specification</li>
 *   <li>FileIO for handling table files</li>
 *   <li>Location provider used for file generation</li>
 *   <li>Encryption manager for encryption handling</li>
 * </ul>
 * @param configuration The configuration storing the catalog information
 * @param tableDesc The table which we want to store to the configuration
 * @param map The map of the configuration properties which we append with the serialized data
 */
@VisibleForTesting
static void overlayTableProperties(Configuration configuration, TableDesc tableDesc, Map<String, String> map) {
    Properties props = tableDesc.getProperties();
    Table table = IcebergTableUtil.getTable(configuration, props);
    String schemaJson = SchemaParser.toJson(table.schema());
    // map overrides tableDesc properties
    Maps.fromProperties(props).entrySet().stream()
        .filter(entry -> !map.containsKey(entry.getKey()))
        .forEach(entry -> map.put(entry.getKey(), entry.getValue()));
    map.put(InputFormatConfig.TABLE_IDENTIFIER, props.getProperty(Catalogs.NAME));
    map.put(InputFormatConfig.TABLE_LOCATION, table.location());
    map.put(InputFormatConfig.TABLE_SCHEMA, schemaJson);
    props.put(InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(table.spec()));
    // serialize table object into config
    Table serializableTable = SerializableTable.copyOf(table);
    checkAndSkipIoConfigSerialization(configuration, serializableTable);
    map.put(InputFormatConfig.SERIALIZED_TABLE_PREFIX + tableDesc.getTableName(), SerializationUtil.serializeToBase64(serializableTable));
    // Remove this property, otherwise the job.xml becomes invalid: column comments are separated with '\0',
    // and the serialization utils cannot serialize that character
    map.remove("columns.comments");
    // save schema into table props as well to avoid repeatedly hitting the HMS during serde initializations
    // this is an exception to the interface documentation, but it's a safe operation to add this property
    props.put(InputFormatConfig.TABLE_SCHEMA, schemaJson);
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) TableDesc(org.apache.hadoop.hive.ql.plan.TableDesc) Map(java.util.Map) Properties(java.util.Properties) Catalogs(org.apache.iceberg.mr.Catalogs) InputFormatConfig(org.apache.iceberg.mr.InputFormatConfig) SchemaParser(org.apache.iceberg.SchemaParser) PartitionSpecParser(org.apache.iceberg.PartitionSpecParser) Maps(org.apache.iceberg.relocated.com.google.common.collect.Maps) Table(org.apache.iceberg.Table) SerializableTable(org.apache.iceberg.SerializableTable) SerializationUtil(org.apache.iceberg.util.SerializationUtil) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting)
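On the read side, tasks can recover the table from the job configuration without another metastore round trip. A minimal sketch, assuming `conf` is the Configuration that received the overlaid properties and `tableName` matches tableDesc.getTableName():

// A minimal sketch: read back what overlayTableProperties stored in the configuration.
String serialized = conf.get(InputFormatConfig.SERIALIZED_TABLE_PREFIX + tableName);
Table table = SerializationUtil.deserializeFromBase64(serialized);
String location = conf.get(InputFormatConfig.TABLE_LOCATION);
Schema schema = SchemaParser.fromJson(conf.get(InputFormatConfig.TABLE_SCHEMA));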
