
Example 16 with HdfsContext

use of com.facebook.presto.hive.HdfsContext in project presto by prestodb.

the class SemiTransactionalHiveMetastore method finishInsertIntoExistingTable.

public synchronized void finishInsertIntoExistingTable(ConnectorSession session, String databaseName, String tableName, Path currentLocation, List<String> fileNames, PartitionStatistics statisticsUpdate) {
    // Data can only be inserted into partitions and unpartitioned tables. They can never be inserted into a partitioned table.
    // Therefore, this method assumes that the table is unpartitioned.
    setShared();
    SchemaTableName schemaTableName = new SchemaTableName(databaseName, tableName);
    Action<TableAndMore> oldTableAction = tableActions.get(schemaTableName);
    MetastoreContext metastoreContext = new MetastoreContext(
            session.getIdentity(),
            session.getQueryId(),
            session.getClientInfo(),
            session.getSource(),
            getMetastoreHeaders(session),
            isUserDefinedTypeEncodingEnabled(session),
            columnConverterProvider);
    if (oldTableAction == null || oldTableAction.getData().getTable().getTableType().equals(TEMPORARY_TABLE)) {
        Table table = getTable(metastoreContext, databaseName, tableName).orElseThrow(() -> new TableNotFoundException(schemaTableName));
        PartitionStatistics currentStatistics = getTableStatistics(metastoreContext, databaseName, tableName);
        HdfsContext context = new HdfsContext(session, databaseName, tableName, table.getStorage().getLocation(), false);
        tableActions.put(schemaTableName, new Action<>(
                ActionType.INSERT_EXISTING,
                new TableAndMore(
                        table,
                        Optional.empty(),
                        Optional.of(currentLocation),
                        Optional.of(fileNames),
                        false,
                        merge(currentStatistics, statisticsUpdate),
                        statisticsUpdate),
                context));
        return;
    }
    switch (oldTableAction.getType()) {
        case DROP:
            throw new TableNotFoundException(schemaTableName);
        case ADD:
        case ALTER:
        case INSERT_EXISTING:
            throw new UnsupportedOperationException("Inserting into an unpartitioned table that were added, altered, or inserted into in the same transaction is not supported");
        default:
            throw new IllegalStateException("Unknown action type");
    }
}
Also used : TableNotFoundException(com.facebook.presto.spi.TableNotFoundException) HdfsContext(com.facebook.presto.hive.HdfsContext) SchemaTableName(com.facebook.presto.spi.SchemaTableName)
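
A note on the HdfsContext built in this example: the five-argument constructor records, alongside the session, the schema name, the table name, the existing table location, and a final boolean that every call site on this page passes as false because it always points at an already-existing location (it reads as an "is new table" flag, but treat that naming as an assumption inferred from these usages). A minimal annotated sketch of the same call, adding nothing beyond comments:

HdfsContext context = new HdfsContext(
        session,                              // ConnectorSession carrying identity and session properties
        databaseName,                         // schema name
        tableName,                            // table name
        table.getStorage().getLocation(),     // existing table directory resolved from the metastore
        false);                               // assumed "new table" flag; false here, since the location already exists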

Example 17 with HdfsContext

use of com.facebook.presto.hive.HdfsContext in project presto by prestodb.

the class DeltaPageSourceProvider method createPageSource.

@Override
public ConnectorPageSource createPageSource(ConnectorTransactionHandle transactionHandle, ConnectorSession session, ConnectorSplit split, ConnectorTableLayoutHandle layout, List<ColumnHandle> columns, SplitContext splitContext) {
    DeltaSplit deltaSplit = (DeltaSplit) split;
    DeltaTableLayoutHandle deltaTableLayoutHandle = (DeltaTableLayoutHandle) layout;
    DeltaTableHandle deltaTableHandle = deltaTableLayoutHandle.getTable();
    HdfsContext hdfsContext = new HdfsContext(session, deltaSplit.getSchema(), deltaSplit.getTable(), deltaSplit.getFilePath(), false);
    Path filePath = new Path(deltaSplit.getFilePath());
    List<DeltaColumnHandle> deltaColumnHandles = columns.stream().map(DeltaColumnHandle.class::cast).collect(Collectors.toList());
    List<DeltaColumnHandle> regularColumnHandles = deltaColumnHandles.stream().filter(columnHandle -> columnHandle.getColumnType() != PARTITION).collect(Collectors.toList());
    ConnectorPageSource dataPageSource = createParquetPageSource(
            hdfsEnvironment,
            session.getUser(),
            hdfsEnvironment.getConfiguration(hdfsContext, filePath),
            filePath,
            deltaSplit.getStart(),
            deltaSplit.getLength(),
            deltaSplit.getFileSize(),
            regularColumnHandles,
            deltaTableHandle.toSchemaTableName(),
            getParquetMaxReadBlockSize(session),
            isParquetBatchReadsEnabled(session),
            isParquetBatchReaderVerificationEnabled(session),
            typeManager,
            deltaTableLayoutHandle.getPredicate(),
            fileFormatDataSourceStats,
            false);
    return new DeltaPageSource(deltaColumnHandles, convertPartitionValues(deltaColumnHandles, deltaSplit.getPartitionValues()), dataPageSource);
}
Also used : Path(org.apache.hadoop.fs.Path) ParquetTypeUtils.nestedColumnPath(com.facebook.presto.parquet.ParquetTypeUtils.nestedColumnPath) ColumnIOConverter.constructField(org.apache.parquet.io.ColumnIOConverter.constructField) HdfsEnvironment(com.facebook.presto.hive.HdfsEnvironment) RichColumnDescriptor(com.facebook.presto.parquet.RichColumnDescriptor) DeltaColumnHandle.getPushedDownSubfield(com.facebook.presto.delta.DeltaColumnHandle.getPushedDownSubfield) ConnectorTransactionHandle(com.facebook.presto.spi.connector.ConnectorTransactionHandle) ParquetCorruptionException(com.facebook.presto.parquet.ParquetCorruptionException) ParquetTypeUtils.lookupColumnByName(com.facebook.presto.parquet.ParquetTypeUtils.lookupColumnByName) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) SchemaTableName(com.facebook.presto.spi.SchemaTableName) Collectors.toMap(java.util.stream.Collectors.toMap) SplitContext(com.facebook.presto.spi.SplitContext) ParquetTypeUtils.getDescriptors(com.facebook.presto.parquet.ParquetTypeUtils.getDescriptors) Configuration(org.apache.hadoop.conf.Configuration) Map(java.util.Map) Path(org.apache.hadoop.fs.Path) DeltaColumnHandle.isPushedDownSubfield(com.facebook.presto.delta.DeltaColumnHandle.isPushedDownSubfield) RuntimeStats(com.facebook.presto.common.RuntimeStats) FileFormatDataSourceStats(com.facebook.presto.hive.FileFormatDataSourceStats) HdfsContext(com.facebook.presto.hive.HdfsContext) ConnectorPageSourceProvider(com.facebook.presto.spi.connector.ConnectorPageSourceProvider) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) ParquetDataSource(com.facebook.presto.parquet.ParquetDataSource) SUBFIELD(com.facebook.presto.delta.DeltaColumnHandle.ColumnType.SUBFIELD) GroupType(org.apache.parquet.schema.GroupType) ImmutableMap(com.google.common.collect.ImmutableMap) DELTA_MISSING_DATA(com.facebook.presto.delta.DeltaErrorCode.DELTA_MISSING_DATA) ColumnIndexStore(org.apache.parquet.internal.filter2.columnindex.ColumnIndexStore) Collectors(java.util.stream.Collectors) ColumnIOConverter.findNestedColumnIO(org.apache.parquet.io.ColumnIOConverter.findNestedColumnIO) FileNotFoundException(java.io.FileNotFoundException) String.format(java.lang.String.format) ColumnIndexFilterUtils(com.facebook.presto.parquet.reader.ColumnIndexFilterUtils) ConnectorSession(com.facebook.presto.spi.ConnectorSession) MessageType(org.apache.parquet.schema.MessageType) DataSize(io.airlift.units.DataSize) List(java.util.List) DELTA_CANNOT_OPEN_SPLIT(com.facebook.presto.delta.DeltaErrorCode.DELTA_CANNOT_OPEN_SPLIT) ColumnDescriptor(org.apache.parquet.column.ColumnDescriptor) ParquetTypeUtils.columnPathFromSubfield(com.facebook.presto.parquet.ParquetTypeUtils.columnPathFromSubfield) BlockMetaData(org.apache.parquet.hadoop.metadata.BlockMetaData) ColumnIO(org.apache.parquet.io.ColumnIO) Optional(java.util.Optional) DELTA_PARQUET_SCHEMA_MISMATCH(com.facebook.presto.delta.DeltaErrorCode.DELTA_PARQUET_SCHEMA_MISMATCH) ParquetPageSource(com.facebook.presto.hive.parquet.ParquetPageSource) REGULAR(com.facebook.presto.delta.DeltaColumnHandle.ColumnType.REGULAR) HdfsParquetDataSource.buildHdfsParquetDataSource(com.facebook.presto.hive.parquet.HdfsParquetDataSource.buildHdfsParquetDataSource) DeltaSessionProperties.getParquetMaxReadBlockSize(com.facebook.presto.delta.DeltaSessionProperties.getParquetMaxReadBlockSize) MessageColumnIO(org.apache.parquet.io.MessageColumnIO) MetadataReader(com.facebook.presto.parquet.cache.MetadataReader) 
PARTITION(com.facebook.presto.delta.DeltaColumnHandle.ColumnType.PARTITION) Strings.nullToEmpty(com.google.common.base.Strings.nullToEmpty) Utils(com.facebook.presto.common.Utils) ConnectorTableLayoutHandle(com.facebook.presto.spi.ConnectorTableLayoutHandle) PredicateUtils.predicateMatches(com.facebook.presto.parquet.predicate.PredicateUtils.predicateMatches) PrestoException(com.facebook.presto.spi.PrestoException) DeltaSessionProperties.isParquetBatchReaderVerificationEnabled(com.facebook.presto.delta.DeltaSessionProperties.isParquetBatchReaderVerificationEnabled) ArrayList(java.util.ArrayList) ParquetTypeUtils.getSubfieldType(com.facebook.presto.parquet.ParquetTypeUtils.getSubfieldType) Inject(javax.inject.Inject) ParquetTypeUtils.getParquetTypeByName(com.facebook.presto.parquet.ParquetTypeUtils.getParquetTypeByName) Subfield(com.facebook.presto.common.Subfield) ImmutableList(com.google.common.collect.ImmutableList) TypeManager(com.facebook.presto.common.type.TypeManager) Objects.requireNonNull(java.util.Objects.requireNonNull) Predicate(com.facebook.presto.parquet.predicate.Predicate) ParquetPageSourceFactory.checkSchemaMatch(com.facebook.presto.hive.parquet.ParquetPageSourceFactory.checkSchemaMatch) DELTA_BAD_DATA(com.facebook.presto.delta.DeltaErrorCode.DELTA_BAD_DATA) AggregatedMemoryContext.newSimpleAggregatedMemoryContext(com.facebook.presto.memory.context.AggregatedMemoryContext.newSimpleAggregatedMemoryContext) PredicateUtils.buildPredicate(com.facebook.presto.parquet.predicate.PredicateUtils.buildPredicate) Type(com.facebook.presto.common.type.Type) ParquetTypeUtils.getColumnIO(com.facebook.presto.parquet.ParquetTypeUtils.getColumnIO) IOException(java.io.IOException) ParquetTypeUtils.nestedColumnPath(com.facebook.presto.parquet.ParquetTypeUtils.nestedColumnPath) DeltaSessionProperties.isParquetBatchReadsEnabled(com.facebook.presto.delta.DeltaSessionProperties.isParquetBatchReadsEnabled) Domain(com.facebook.presto.common.predicate.Domain) TupleDomain(com.facebook.presto.common.predicate.TupleDomain) AggregatedMemoryContext(com.facebook.presto.memory.context.AggregatedMemoryContext) ParquetReader(com.facebook.presto.parquet.reader.ParquetReader) PERMISSION_DENIED(com.facebook.presto.spi.StandardErrorCode.PERMISSION_DENIED) Field(com.facebook.presto.parquet.Field) ConnectorSplit(com.facebook.presto.spi.ConnectorSplit) ConnectorPageSource(com.facebook.presto.spi.ConnectorPageSource) DeltaTypeUtils.convertPartitionValue(com.facebook.presto.delta.DeltaTypeUtils.convertPartitionValue) ColumnHandle(com.facebook.presto.spi.ColumnHandle) AccessControlException(org.apache.hadoop.security.AccessControlException) FileMetaData(org.apache.parquet.hadoop.metadata.FileMetaData) ParquetMetadata(org.apache.parquet.hadoop.metadata.ParquetMetadata) Block(com.facebook.presto.common.block.Block) HdfsContext(com.facebook.presto.hive.HdfsContext) ConnectorPageSource(com.facebook.presto.spi.ConnectorPageSource)
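
In this page-source path the HdfsContext is only used to resolve the Hadoop Configuration for the split's file before the Parquet reader opens it. The distilled handoff, using just the calls visible in the method above (a sketch, assuming the same hdfsEnvironment field), looks like this:

HdfsContext hdfsContext = new HdfsContext(session, deltaSplit.getSchema(), deltaSplit.getTable(), deltaSplit.getFilePath(), false);
Path filePath = new Path(deltaSplit.getFilePath());
// The context plus path select the appropriate Hadoop configuration for this session and file.
Configuration configuration = hdfsEnvironment.getConfiguration(hdfsContext, filePath);
// "configuration" is then handed to createParquetPageSource(...) together with the split's start, length, and file size.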

Example 18 with HdfsContext

use of com.facebook.presto.hive.HdfsContext in project presto by prestodb.

the class DeltaClient method loadDeltaTableLog.

private Optional<DeltaLog> loadDeltaTableLog(ConnectorSession session, Path tableLocation, SchemaTableName schemaTableName) {
    try {
        HdfsContext hdfsContext = new HdfsContext(session, schemaTableName.getSchemaName(), schemaTableName.getTableName(), tableLocation.toString(), false);
        FileSystem fileSystem = hdfsEnvironment.getFileSystem(hdfsContext, tableLocation);
        if (!fileSystem.isDirectory(tableLocation)) {
            return Optional.empty();
        }
        return Optional.of(DeltaLog.forTable(hdfsEnvironment.getConfiguration(hdfsContext, tableLocation), tableLocation));
    } catch (IOException ioException) {
        throw new PrestoException(GENERIC_INTERNAL_ERROR, "Failed to load Delta table: " + ioException.getMessage(), ioException);
    }
}
Also used : FileSystem(org.apache.hadoop.fs.FileSystem) PrestoException(com.facebook.presto.spi.PrestoException) HdfsContext(com.facebook.presto.hive.HdfsContext) IOException(java.io.IOException)

Example 19 with HdfsContext

use of com.facebook.presto.hive.HdfsContext in project presto by prestodb.

the class RaptorPageSinkProvider method createPageSink.

@Override
public ConnectorPageSink createPageSink(ConnectorTransactionHandle transactionHandle, ConnectorSession session, ConnectorInsertTableHandle tableHandle, PageSinkContext pageSinkContext) {
    checkArgument(!pageSinkContext.isCommitRequired(), "Raptor connector does not support page sink commit");
    RaptorInsertTableHandle handle = (RaptorInsertTableHandle) tableHandle;
    return new RaptorPageSink(
            new HdfsContext(session),
            pageSorter,
            storageManager,
            temporalFunction,
            handle.getTransactionId(),
            toColumnIds(handle.getColumnHandles()),
            handle.getColumnTypes(),
            toColumnIds(handle.getSortColumnHandles()),
            handle.getSortOrders(),
            handle.getBucketCount(),
            toColumnIds(handle.getBucketColumnHandles()),
            handle.getTemporalColumnHandle(),
            getWriterMaxBufferSize(session),
            maxAllowedFilesPerWriter);
}
Also used : HdfsContext(com.facebook.presto.hive.HdfsContext)
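
Raptor has no HDFS schema, table, or path to record at this point, so it uses the session-only constructor; the resulting context just carries the session's identity and related session information into the page sink. A two-line sketch contrasting the two constructor shapes seen on this page (the second line's variable names are illustrative only):

HdfsContext sinkContext = new HdfsContext(session);                                          // Example 19: session-only
HdfsContext tableContext = new HdfsContext(session, schemaName, tableName, location, false); // Examples 16-18: table-scoped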

Aggregations

HdfsContext (com.facebook.presto.hive.HdfsContext): 19
PrestoException (com.facebook.presto.spi.PrestoException): 12
SchemaTableName (com.facebook.presto.spi.SchemaTableName): 10
List (java.util.List): 9
ImmutableList (com.google.common.collect.ImmutableList): 7
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList): 7
ArrayList (java.util.ArrayList): 7
Path (org.apache.hadoop.fs.Path): 7
Collectors.toList (java.util.stream.Collectors.toList): 6
Type (com.facebook.presto.common.type.Type): 5
ConnectorSession (com.facebook.presto.spi.ConnectorSession): 5
ImmutableMap (com.google.common.collect.ImmutableMap): 5
IOException (java.io.IOException): 5
Map (java.util.Map): 5
Objects.requireNonNull (java.util.Objects.requireNonNull): 5
Optional (java.util.Optional): 5
HdfsEnvironment (com.facebook.presto.hive.HdfsEnvironment): 4
Inject (javax.inject.Inject): 4
FileFormat (org.apache.iceberg.FileFormat): 4
TupleDomain (com.facebook.presto.common.predicate.TupleDomain): 3