
Example 16 with Expression

use of org.apache.iceberg.expressions.Expression in project presto by prestodb.

From the class IcebergUtil, method getTableScan:

public static TableScan getTableScan(TupleDomain<IcebergColumnHandle> predicates, Optional<Long> snapshotId, Table icebergTable) {
    Expression expression = ExpressionConverter.toIcebergExpression(predicates);
    TableScan tableScan = icebergTable.newScan().filter(expression);
    return snapshotId.map(id -> isSnapshot(icebergTable, id) ? tableScan.useSnapshot(id) : tableScan.asOfTime(id)).orElse(tableScan);
}
Also used : HdfsEnvironment(com.facebook.presto.hive.HdfsEnvironment) MetastoreContext(com.facebook.presto.hive.metastore.MetastoreContext) ICEBERG_TABLE_TYPE_VALUE(org.apache.iceberg.BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE) PrestoException(com.facebook.presto.spi.PrestoException) WRITE_LOCATION_PROVIDER_IMPL(org.apache.iceberg.TableProperties.WRITE_LOCATION_PROVIDER_IMPL) TABLE_TYPE_PROP(org.apache.iceberg.BaseMetastoreTableOperations.TABLE_TYPE_PROP) PartitionField(org.apache.iceberg.PartitionField) LocationProvider(org.apache.iceberg.io.LocationProvider) TableOperations(org.apache.iceberg.TableOperations) SchemaTableName(com.facebook.presto.spi.SchemaTableName) Expression(org.apache.iceberg.expressions.Expression) ExtendedHiveMetastore(com.facebook.presto.hive.metastore.ExtendedHiveMetastore) HistoryEntry(org.apache.iceberg.HistoryEntry) Locale(java.util.Locale) TypeManager(com.facebook.presto.common.type.TypeManager) Map(java.util.Map) TABLE_COMMENT(com.facebook.presto.hive.HiveMetadata.TABLE_COMMENT) DEFAULT_FILE_FORMAT(org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT) HdfsContext(com.facebook.presto.hive.HdfsContext) LocationProviders.locationsFor(org.apache.iceberg.LocationProviders.locationsFor) TypeConverter.toPrestoType(com.facebook.presto.iceberg.TypeConverter.toPrestoType) HiveColumnConverterProvider(com.facebook.presto.hive.HiveColumnConverterProvider) BaseTable(org.apache.iceberg.BaseTable) ImmutableMap(com.google.common.collect.ImmutableMap) Table(org.apache.iceberg.Table) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) TableScan(org.apache.iceberg.TableScan) Schema(org.apache.iceberg.Schema) FileFormat(org.apache.iceberg.FileFormat) TupleDomain(com.facebook.presto.common.predicate.TupleDomain) String.format(java.lang.String.format) ConnectorSession(com.facebook.presto.spi.ConnectorSession) Streams.stream(com.google.common.collect.Streams.stream) List(java.util.List) 
IcebergPrestoModelConverters.toIcebergTableIdentifier(com.facebook.presto.iceberg.util.IcebergPrestoModelConverters.toIcebergTableIdentifier) NOT_SUPPORTED(com.facebook.presto.spi.StandardErrorCode.NOT_SUPPORTED) PartitionSpec(org.apache.iceberg.PartitionSpec) Optional(java.util.Optional) Pattern(java.util.regex.Pattern) DEFAULT_FILE_FORMAT_DEFAULT(org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT_DEFAULT) ICEBERG_INVALID_SNAPSHOT_ID(com.facebook.presto.iceberg.IcebergErrorCode.ICEBERG_INVALID_SNAPSHOT_ID) Lists.reverse(com.google.common.collect.Lists.reverse) Snapshot(org.apache.iceberg.Snapshot)
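The `snapshotId.map(...).orElse(...)` idiom in getTableScan picks a time-travel scan only when an id is present, and otherwise falls back to the plain scan. A minimal stdlib sketch of the same selection pattern; the `Scan` record and the even/odd stand-in for `isSnapshot` are hypothetical, not Iceberg API:

```java
import java.util.Optional;

public class ScanSelection {
    // Hypothetical stand-in for an Iceberg TableScan.
    record Scan(String description) {}

    static Scan select(Optional<Long> snapshotId, Scan base) {
        // When an id is present, branch on whether it names a snapshot
        // or a timestamp (here faked with an even/odd check); otherwise
        // fall back to the unscoped base scan.
        return snapshotId
                .map(id -> id % 2 == 0
                        ? new Scan("useSnapshot(" + id + ")")
                        : new Scan("asOfTime(" + id + ")"))
                .orElse(base);
    }

    public static void main(String[] args) {
        System.out.println(select(Optional.of(42L), new Scan("plain")).description());
        System.out.println(select(Optional.empty(), new Scan("plain")).description());
    }
}
```

The point of the idiom is that the `Optional` chain replaces an explicit `if (snapshotId.isPresent())` block with a single expression.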

Example 17 with Expression

use of org.apache.iceberg.expressions.Expression in project hive by apache.

From the class HiveIcebergInputFormat, method getSplits:

@Override
public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
    // Convert Hive filter to Iceberg filter
    String hiveFilter = job.get(TableScanDesc.FILTER_EXPR_CONF_STR);
    if (hiveFilter != null) {
        ExprNodeGenericFuncDesc exprNodeDesc = SerializationUtilities.deserializeObject(hiveFilter, ExprNodeGenericFuncDesc.class);
        SearchArgument sarg = ConvertAstToSearchArg.create(job, exprNodeDesc);
        try {
            Expression filter = HiveIcebergFilterFactory.generateFilterExpression(sarg);
            job.set(InputFormatConfig.FILTER_EXPRESSION, SerializationUtil.serializeToBase64(filter));
        } catch (UnsupportedOperationException e) {
            LOG.warn("Unable to create Iceberg filter, continuing without filter (will be applied by Hive later): ", e);
        }
    }
    job.set(InputFormatConfig.SELECTED_COLUMNS, job.get(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, ""));
    job.set(InputFormatConfig.AS_OF_TIMESTAMP, job.get(TableScanDesc.AS_OF_TIMESTAMP, "-1"));
    job.set(InputFormatConfig.SNAPSHOT_ID, job.get(TableScanDesc.AS_OF_VERSION, "-1"));
    String location = job.get(InputFormatConfig.TABLE_LOCATION);
    return Arrays.stream(super.getSplits(job, numSplits)).map(split -> new HiveIcebergSplit((IcebergSplit) split, location)).toArray(InputSplit[]::new);
}
Also used : CombineHiveInputFormat(org.apache.hadoop.hive.ql.io.CombineHiveInputFormat) ExprNodeGenericFuncDesc(org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) Arrays(java.util.Arrays) ConvertAstToSearchArg(org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg) ColumnProjectionUtils(org.apache.hadoop.hive.serde2.ColumnProjectionUtils) AbstractMapredIcebergRecordReader(org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader) IcebergSplit(org.apache.iceberg.mr.mapreduce.IcebergSplit) LoggerFactory(org.slf4j.LoggerFactory) SerializationUtilities(org.apache.hadoop.hive.ql.exec.SerializationUtilities) TableScanDesc(org.apache.hadoop.hive.ql.plan.TableScanDesc) SearchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument) DynConstructors(org.apache.iceberg.common.DynConstructors) Utilities(org.apache.hadoop.hive.ql.exec.Utilities) VectorizedSupport(org.apache.hadoop.hive.ql.exec.vector.VectorizedSupport) Expression(org.apache.iceberg.expressions.Expression) Configuration(org.apache.hadoop.conf.Configuration) Path(org.apache.hadoop.fs.Path) FileMetadataCache(org.apache.hadoop.hive.common.io.FileMetadataCache) Container(org.apache.iceberg.mr.mapred.Container) Logger(org.slf4j.Logger) IcebergSplitContainer(org.apache.iceberg.mr.mapreduce.IcebergSplitContainer) Reporter(org.apache.hadoop.mapred.Reporter) HiveConf(org.apache.hadoop.hive.conf.HiveConf) InputFormatConfig(org.apache.iceberg.mr.InputFormatConfig) IOException(java.io.IOException) SerializationUtil(org.apache.iceberg.util.SerializationUtil) VectorizedInputFormatInterface(org.apache.hadoop.hive.ql.exec.vector.VectorizedInputFormatInterface) MapredIcebergInputFormat(org.apache.iceberg.mr.mapred.MapredIcebergInputFormat) DataCache(org.apache.hadoop.hive.common.io.DataCache) JobConf(org.apache.hadoop.mapred.JobConf) Record(org.apache.iceberg.data.Record) MetastoreUtil(org.apache.iceberg.hive.MetastoreUtil) 
LlapCacheOnlyInputFormatInterface(org.apache.hadoop.hive.ql.io.LlapCacheOnlyInputFormatInterface) InputSplit(org.apache.hadoop.mapred.InputSplit) IcebergInputFormat(org.apache.iceberg.mr.mapreduce.IcebergInputFormat) Preconditions(org.apache.iceberg.relocated.com.google.common.base.Preconditions) RecordReader(org.apache.hadoop.mapred.RecordReader)
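`SerializationUtil.serializeToBase64` exists because `JobConf` only carries string properties, so the Iceberg filter object has to be serialized into a string to travel with the job. A stdlib-only sketch of that round trip, with a plain `HashMap` and a made-up property key standing in for the real `JobConf` and `InputFormatConfig.FILTER_EXPRESSION`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class FilterRoundTrip {
    // Serialize any Serializable object to a Base64 string.
    static String serializeToBase64(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Decode and deserialize the object on the receiving side.
    static Object deserializeFromBase64(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // stands in for JobConf
        conf.put("iceberg.mr.filter.expression", serializeToBase64("row_count > 100"));
        System.out.println(deserializeFromBase64(conf.get("iceberg.mr.filter.expression")));
    }
}
```

The catch of `UnsupportedOperationException` in getSplits matters here: if the filter cannot be translated, the property is simply left unset and Hive applies the predicate itself after the scan.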

Example 18 with Expression

use of org.apache.iceberg.expressions.Expression in project metacat by Netflix.

From the class IcebergTableHandler, method getIcebergTablePartitionMap:

/**
 * Gets the partition metrics map for the given table.
 *
 * @param tableName         qualified table name
 * @param partitionsRequest the partition list request, possibly carrying a filter
 * @param icebergTable      the Iceberg table
 * @return map from partition name to its scan metrics
 */
public Map<String, ScanSummary.PartitionMetrics> getIcebergTablePartitionMap(final QualifiedName tableName, final PartitionListRequest partitionsRequest, final Table icebergTable) {
    final long start = this.registry.clock().wallTime();
    final Map<String, ScanSummary.PartitionMetrics> result;
    try {
        if (!Strings.isNullOrEmpty(partitionsRequest.getFilter())) {
            final IcebergFilterGenerator icebergFilterGenerator = new IcebergFilterGenerator(icebergTable.schema().columns());
            final Expression filter = (Expression) new PartitionParser(new StringReader(partitionsRequest.getFilter())).filter().jjtAccept(icebergFilterGenerator, null);
            result = this.icebergTableOpWrapper.getPartitionMetricsMap(icebergTable, filter);
        } else {
            result = this.icebergTableOpWrapper.getPartitionMetricsMap(icebergTable, null);
        }
    } catch (ParseException ex) {
        log.error("Iceberg filter parse error: ", ex);
        throw new IllegalArgumentException(String.format("Iceberg filter parse error. Ex: %s", ex.getMessage()));
    } catch (IllegalStateException e) {
        registry.counter(registry.createId(IcebergRequestMetrics.CounterGetPartitionsExceedThresholdFailure.getMetricName()).withTags(tableName.parts())).increment();
        final String message = String.format("Number of partitions queried for table %s exceeded the threshold %d", tableName, connectorContext.getConfig().getMaxPartitionsThreshold());
        log.warn(message);
        throw new IllegalArgumentException(message);
    } finally {
        final long duration = registry.clock().wallTime() - start;
        log.info("Time taken to getIcebergTablePartitionMap {} is {} ms", tableName, duration);
        this.recordTimer(IcebergRequestMetrics.TagGetPartitionMap.getMetricName(), duration);
        this.increaseCounter(IcebergRequestMetrics.TagGetPartitionMap.getMetricName(), tableName);
    }
    return result;
}
Also used : PartitionParser(com.netflix.metacat.common.server.partition.parser.PartitionParser) Expression(org.apache.iceberg.expressions.Expression) StringReader(java.io.StringReader) ParseException(com.netflix.metacat.common.server.partition.parser.ParseException) IcebergFilterGenerator(com.netflix.metacat.connector.hive.util.IcebergFilterGenerator)
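The try/finally shape above guarantees the timer and counter are recorded whether the call succeeds, the filter fails to parse, or the partition threshold is exceeded. A small stdlib sketch of that pattern; the `recordTimer` hook and the return value are hypothetical stand-ins for the registry calls:

```java
public class TimedCall {
    static long recordedMs = -1;

    // Hypothetical metrics hook; the real code reports to a metrics registry.
    static void recordTimer(long durationMs) {
        recordedMs = durationMs;
    }

    static int partitionCount(boolean fail) {
        final long start = System.currentTimeMillis();
        try {
            if (fail) {
                throw new IllegalArgumentException("filter parse error");
            }
            return 3;
        } finally {
            // Runs on both the success and the failure path,
            // mirroring the finally block in getIcebergTablePartitionMap.
            recordTimer(System.currentTimeMillis() - start);
        }
    }

    public static void main(String[] args) {
        System.out.println(partitionCount(false));
        try {
            partitionCount(true);
        } catch (IllegalArgumentException expected) {
            System.out.println("failed, duration still recorded: " + (recordedMs >= 0));
        }
    }
}
```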

Example 19 with Expression

use of org.apache.iceberg.expressions.Expression in project presto by prestodb.

From the class ExpressionConverter, method toIcebergExpression:

public static Expression toIcebergExpression(TupleDomain<IcebergColumnHandle> tupleDomain) {
    if (tupleDomain.isAll()) {
        return alwaysTrue();
    }
    if (!tupleDomain.getDomains().isPresent()) {
        return alwaysFalse();
    }
    Map<IcebergColumnHandle, Domain> domainMap = tupleDomain.getDomains().get();
    Expression expression = alwaysTrue();
    for (Map.Entry<IcebergColumnHandle, Domain> entry : domainMap.entrySet()) {
        IcebergColumnHandle columnHandle = entry.getKey();
        Domain domain = entry.getValue();
        expression = and(expression, toIcebergExpression(columnHandle.getName(), columnHandle.getType(), domain));
    }
    return expression;
}
Also used : Expression(org.apache.iceberg.expressions.Expression) Domain(com.facebook.presto.common.predicate.Domain) TupleDomain(com.facebook.presto.common.predicate.TupleDomain) Map(java.util.Map)
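The loop in toIcebergExpression folds the per-column domains into one conjunction, starting from `alwaysTrue()` as the identity of AND. The same fold can be sketched with stdlib `Predicate`s; the column names, integer values, and equality predicate here are made up for illustration:

```java
import java.util.Map;
import java.util.function.Predicate;

public class ConjunctionFold {
    // Fold a map of column -> required value into one AND-ed predicate,
    // mirroring: expression = and(expression, toIcebergExpression(...)).
    static Predicate<Map<String, Integer>> fold(Map<String, Integer> domains) {
        // Start from the identity of AND, mirroring alwaysTrue().
        Predicate<Map<String, Integer>> expression = row -> true;
        for (Map.Entry<String, Integer> entry : domains.entrySet()) {
            String column = entry.getKey();
            int required = entry.getValue();
            expression = expression.and(row -> row.getOrDefault(column, -1) == required);
        }
        return expression;
    }

    public static void main(String[] args) {
        Predicate<Map<String, Integer>> filter = fold(Map.of("x", 1, "y", 2));
        System.out.println(filter.test(Map.of("x", 1, "y", 2))); // true
        System.out.println(filter.test(Map.of("x", 1, "y", 3))); // false
    }
}
```

The two early returns handle the degenerate cases before the fold: an unconstrained domain becomes `alwaysTrue()`, and an unsatisfiable one becomes `alwaysFalse()`.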

Example 20 with Expression

use of org.apache.iceberg.expressions.Expression in project drill by apache.

From the class TestFilterTransformer, method testToFilterIsNotNull:

@Test
public void testToFilterIsNotNull() {
    Expression expected = Expressions.notNull(MetastoreColumn.ROW_GROUP_INDEX.columnName());
    Expression actual = transformer.transform(FilterExpression.isNotNull(MetastoreColumn.ROW_GROUP_INDEX));
    assertEquals(expected.toString(), actual.toString());
}
Also used : FilterExpression(org.apache.drill.metastore.expressions.FilterExpression) Expression(org.apache.iceberg.expressions.Expression) Test(org.junit.Test) IcebergBaseTest(org.apache.drill.metastore.iceberg.IcebergBaseTest)
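Note that the test compares `toString()` renderings rather than the expression objects themselves, a common workaround when an expression type does not provide a structural `equals`. The effect can be sketched with a hypothetical type (the class and its rendering are made up, not Iceberg's):

```java
public class StringEqualityCheck {
    // Hypothetical expression type without a structural equals().
    static final class NotNull {
        final String column;
        NotNull(String column) { this.column = column; }
        @Override public String toString() {
            return "not_null(ref(name=\"" + column + "\"))";
        }
    }

    public static void main(String[] args) {
        NotNull expected = new NotNull("rgi");
        NotNull actual = new NotNull("rgi");
        // Default Object.equals is identity-based, so two equivalent
        // expressions compare unequal...
        System.out.println(expected.equals(actual));                       // false
        // ...while their string renderings compare equal.
        System.out.println(expected.toString().equals(actual.toString())); // true
    }
}
```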

Aggregations

Expression (org.apache.iceberg.expressions.Expression): 40
FilterExpression (org.apache.drill.metastore.expressions.FilterExpression): 28
IcebergBaseTest (org.apache.drill.metastore.iceberg.IcebergBaseTest): 26
Test (org.junit.Test): 26
MetastoreColumn (org.apache.drill.metastore.MetastoreColumn): 5
Map (java.util.Map): 4
Path (org.apache.hadoop.fs.Path): 3
TupleDomain (com.facebook.presto.common.predicate.TupleDomain): 2
HashMap (java.util.HashMap): 2
LogicalExpression (org.apache.drill.common.expression.LogicalExpression): 2
TableMetadataUnit (org.apache.drill.metastore.components.tables.TableMetadataUnit): 2
Delete (org.apache.drill.metastore.iceberg.operate.Delete): 2
MapWork (org.apache.hadoop.hive.ql.plan.MapWork): 2
TableScan (org.apache.iceberg.TableScan): 2
Domain (com.facebook.presto.common.predicate.Domain): 1
Marker (com.facebook.presto.common.predicate.Marker): 1
Range (com.facebook.presto.common.predicate.Range): 1
SortedRangeSet (com.facebook.presto.common.predicate.SortedRangeSet): 1
ValueSet (com.facebook.presto.common.predicate.ValueSet): 1
ArrayType (com.facebook.presto.common.type.ArrayType): 1