Example 76 with TrinoException

Use of io.trino.spi.TrinoException in project trino by trinodb.

From the class HivePageSourceProvider, method createPageSource:

@Override
public ConnectorPageSource createPageSource(
        ConnectorTransactionHandle transaction,
        ConnectorSession session,
        ConnectorSplit split,
        ConnectorTableHandle tableHandle,
        List<ColumnHandle> columns,
        DynamicFilter dynamicFilter) {
    HiveTableHandle hiveTable = (HiveTableHandle) tableHandle;
    HiveSplit hiveSplit = (HiveSplit) split;
    if (shouldSkipBucket(hiveTable, hiveSplit, dynamicFilter)) {
        return new EmptyPageSource();
    }
    List<HiveColumnHandle> hiveColumns = columns.stream().map(HiveColumnHandle.class::cast).collect(toList());
    List<HiveColumnHandle> dependencyColumns = hiveColumns.stream().filter(HiveColumnHandle::isBaseColumn).collect(toImmutableList());
    if (hiveTable.isAcidUpdate()) {
        hiveColumns = hiveTable.getUpdateProcessor()
                .orElseThrow(() -> new IllegalArgumentException("update processor not present"))
                .mergeWithNonUpdatedColumns(hiveColumns);
    }
    Path path = new Path(hiveSplit.getPath());
    boolean originalFile = ORIGINAL_FILE_PATH_MATCHER.matcher(path.toString()).matches();
    List<ColumnMapping> columnMappings = ColumnMapping.buildColumnMappings(
            hiveSplit.getPartitionName(),
            hiveSplit.getPartitionKeys(),
            hiveColumns,
            hiveSplit.getBucketConversion().map(BucketConversion::getBucketColumnHandles).orElse(ImmutableList.of()),
            hiveSplit.getTableToPartitionMapping(),
            path,
            hiveSplit.getBucketNumber(),
            hiveSplit.getEstimatedFileSize(),
            hiveSplit.getFileModifiedTime());
    // This can happen when dynamic filters are collected after partition splits were listed.
    if (shouldSkipSplit(columnMappings, dynamicFilter)) {
        return new EmptyPageSource();
    }
    Configuration configuration = hdfsEnvironment.getConfiguration(new HdfsContext(session), path);
    TupleDomain<HiveColumnHandle> simplifiedDynamicFilter = dynamicFilter.getCurrentPredicate()
            .transformKeys(HiveColumnHandle.class::cast)
            .simplify(domainCompactionThreshold);
    Optional<ConnectorPageSource> pageSource = createHivePageSource(
            pageSourceFactories,
            cursorProviders,
            configuration,
            session,
            path,
            hiveSplit.getBucketNumber(),
            hiveSplit.getStart(),
            hiveSplit.getLength(),
            hiveSplit.getEstimatedFileSize(),
            hiveSplit.getSchema(),
            hiveTable.getCompactEffectivePredicate().intersect(simplifiedDynamicFilter),
            hiveColumns,
            typeManager,
            hiveSplit.getBucketConversion(),
            hiveSplit.getBucketValidation(),
            hiveSplit.isS3SelectPushdownEnabled(),
            hiveSplit.getAcidInfo(),
            originalFile,
            hiveTable.getTransaction(),
            columnMappings);
    if (pageSource.isPresent()) {
        ConnectorPageSource source = pageSource.get();
        if (hiveTable.isAcidDelete() || hiveTable.isAcidUpdate()) {
            checkArgument(orcFileWriterFactory.isPresent(), "orcFileWriterFactory not supplied but required for DELETE and UPDATE");
            HivePageSource hivePageSource = (HivePageSource) source;
            OrcPageSource orcPageSource = (OrcPageSource) hivePageSource.getDelegate();
            ColumnMetadata<OrcType> columnMetadata = orcPageSource.getColumnTypes();
            int acidRowColumnId = originalFile ? 0 : ACID_ROW_STRUCT_COLUMN_ID;
            HiveType rowType = fromOrcTypeToHiveType(columnMetadata.get(new OrcColumnId(acidRowColumnId)), columnMetadata);
            long currentSplitNumber = hiveSplit.getSplitNumber();
            if (currentSplitNumber >= MAX_NUMBER_OF_SPLITS) {
                throw new TrinoException(GENERIC_INSUFFICIENT_RESOURCES, format("Number of splits is higher than maximum possible number of splits %d", MAX_NUMBER_OF_SPLITS));
            }
            long initialRowId = currentSplitNumber << PER_SPLIT_ROW_ID_BITS;
            return new HiveUpdatablePageSource(
                    hiveTable,
                    hiveSplit.getPartitionName(),
                    hiveSplit.getStatementId(),
                    source,
                    typeManager,
                    hiveSplit.getBucketNumber(),
                    path,
                    originalFile,
                    orcFileWriterFactory.get(),
                    configuration,
                    session,
                    rowType,
                    dependencyColumns,
                    hiveTable.getTransaction().getOperation(),
                    initialRowId,
                    MAX_NUMBER_OF_ROWS_PER_SPLIT);
        }
        return source;
    }
    throw new RuntimeException("Could not find a file reader for split " + hiveSplit);
}
Also used: OrcColumnId(io.trino.orc.metadata.OrcColumnId) Configuration(org.apache.hadoop.conf.Configuration) ConnectorPageSource(io.trino.spi.connector.ConnectorPageSource) EmptyPageSource(io.trino.spi.connector.EmptyPageSource) HdfsContext(io.trino.plugin.hive.HdfsEnvironment.HdfsContext) Path(org.apache.hadoop.fs.Path) OrcPageSource(io.trino.plugin.hive.orc.OrcPageSource) OrcType(io.trino.orc.metadata.OrcType) TrinoException(io.trino.spi.TrinoException) OrcTypeToHiveTypeTranslator.fromOrcTypeToHiveType(io.trino.plugin.hive.orc.OrcTypeToHiveTypeTranslator.fromOrcTypeToHiveType) BucketConversion(io.trino.plugin.hive.HiveSplit.BucketConversion)
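
The most interesting step above is the derivation of initialRowId: each split gets a disjoint block of row IDs by shifting its split number left by PER_SPLIT_ROW_ID_BITS, after first checking the split number fits under MAX_NUMBER_OF_SPLITS. A minimal, self-contained sketch of that scheme; the constant values here are illustrative assumptions, not necessarily the ones Trino uses:

class SplitRowIds {
    // Illustrative constants: the bit width of the per-row part of the ID
    // is an assumption, not necessarily Trino's value.
    private static final int PER_SPLIT_ROW_ID_BITS = 42;
    private static final long MAX_NUMBER_OF_ROWS_PER_SPLIT = 1L << PER_SPLIT_ROW_ID_BITS;
    private static final long MAX_NUMBER_OF_SPLITS = 1L << (63 - PER_SPLIT_ROW_ID_BITS);

    // First row ID owned by the given split. The high bits encode the split
    // number, so each split can assign up to MAX_NUMBER_OF_ROWS_PER_SPLIT
    // consecutive IDs without colliding with any other split.
    static long initialRowId(long splitNumber) {
        if (splitNumber >= MAX_NUMBER_OF_SPLITS) {
            throw new IllegalStateException("too many splits: " + splitNumber);
        }
        return splitNumber << PER_SPLIT_ROW_ID_BITS;
    }
}

With these assumed values, split 3 owns the ID range [3 << 42, 4 << 42), so row IDs stay unique across splits without any coordination between them.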

Example 77 with TrinoException

Use of io.trino.spi.TrinoException in project trino by trinodb.

From the class RecordFileWriter, method commit:

@Override
public void commit() {
    try {
        recordWriter.close(false);
        committed = true;
    } catch (IOException e) {
        throw new TrinoException(HIVE_WRITER_CLOSE_ERROR, "Error committing write to Hive", e);
    }
}
Also used: TrinoException(io.trino.spi.TrinoException) IOException(java.io.IOException) UncheckedIOException(java.io.UncheckedIOException)
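
This commit method shows the connector's standard boundary pattern: catch the checked IOException and rethrow it as an unchecked TrinoException tagged with an error code (HIVE_WRITER_CLOSE_ERROR here). A minimal sketch of the same pattern with a hypothetical stand-in exception type, since the idiom is not specific to Trino:

import java.io.IOException;
import java.io.Writer;

// Stand-in for io.trino.spi.TrinoException: unchecked, carrying a
// classification code alongside the message and cause.
class ConnectorException extends RuntimeException {
    private final String errorCode;

    ConnectorException(String errorCode, String message, Throwable cause) {
        super(message, cause);
        this.errorCode = errorCode;
    }

    String getErrorCode() {
        return errorCode;
    }
}

class CommitExample {
    static void commit(Writer recordWriter) {
        try {
            recordWriter.close();
        }
        catch (IOException e) {
            // rethrow unchecked so callers see a classified failure
            // instead of a raw checked exception
            throw new ConnectorException("WRITER_CLOSE_ERROR", "Error committing write", e);
        }
    }
}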

Example 78 with TrinoException

Use of io.trino.spi.TrinoException in project trino by trinodb.

From the class SortingFileWriter, method commit:

@Override
public void commit() {
    if (!sortBuffer.isEmpty()) {
        // skip temporary files entirely if the total output size is small
        if (tempFiles.isEmpty()) {
            sortBuffer.flushTo(outputWriter::appendRows);
            outputWriter.commit();
            return;
        }
        flushToTempFile();
    }
    try {
        writeSorted();
        outputWriter.commit();
    } catch (UncheckedIOException e) {
        throw new TrinoException(HIVE_WRITER_CLOSE_ERROR, "Error committing write to Hive", e);
    }
}
Also used: TrinoException(io.trino.spi.TrinoException) UncheckedIOException(java.io.UncheckedIOException)
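
Note that this commit catches UncheckedIOException rather than IOException: writeSorted() performs I/O inside code paths that cannot declare checked exceptions, so failures surface pre-wrapped. A short, Trino-independent illustration of how that wrapper arises:

import java.io.Flushable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

class FlushExample {
    // A lambda passed to forEach cannot throw the checked IOException, so the
    // standard idiom is to wrap it in UncheckedIOException; this is exactly
    // the type commit() above expects to catch and translate.
    static void flushAll(List<Flushable> sinks) {
        sinks.forEach(sink -> {
            try {
                sink.flush();
            }
            catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}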

Example 79 with TrinoException

Use of io.trino.spi.TrinoException in project trino by trinodb.

From the class SortingFileWriter, method writeTempFile:

private void writeTempFile(Consumer<TempFileWriter> consumer) {
    Path tempFile = getTempFileName();
    try (TempFileWriter writer = new TempFileWriter(types, tempFileSinkFactory.createSink(fileSystem, tempFile))) {
        consumer.accept(writer);
        writer.close();
        tempFiles.add(new TempFile(tempFile, writer.getWrittenBytes()));
    } catch (IOException | UncheckedIOException e) {
        cleanupFile(tempFile);
        throw new TrinoException(HIVE_WRITER_DATA_ERROR, "Failed to write temporary file: " + tempFile, e);
    }
}
Also used: Path(org.apache.hadoop.fs.Path) TempFileWriter(io.trino.plugin.hive.util.TempFileWriter) TrinoException(io.trino.spi.TrinoException) UncheckedIOException(java.io.UncheckedIOException) IOException(java.io.IOException)
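
writeTempFile demonstrates the cleanup-on-failure idiom: if writing fails for any reason, delete the partial temp file before rethrowing, so failed sorts do not leak files on disk. A generic sketch of the same idiom using only java.nio; the file prefix and message are illustrative, not Trino's:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempFileExample {
    static Path writeTempFile(String contents) throws IOException {
        Path tempFile = Files.createTempFile("sort", ".tmp");
        try (BufferedWriter writer = Files.newBufferedWriter(tempFile)) {
            writer.write(contents);
            return tempFile;
        }
        catch (IOException e) {
            try {
                Files.deleteIfExists(tempFile); // best-effort cleanup of the partial file
            }
            catch (IOException cleanupFailure) {
                e.addSuppressed(cleanupFailure);
            }
            throw new UncheckedIOException("Failed to write temporary file: " + tempFile, e);
        }
    }
}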

Example 80 with TrinoException

Use of io.trino.spi.TrinoException in project trino by trinodb.

From the class HiveSplitSource, method addToQueue:

ListenableFuture<Void> addToQueue(InternalHiveSplit split) {
    if (stateReference.get().getKind() != INITIAL) {
        return immediateVoidFuture();
    }
    if (estimatedSplitSizeInBytes.addAndGet(split.getEstimatedSizeInBytes()) > maxOutstandingSplitsBytes) {
        // If this limit is hit, it means individual splits are huge.
        if (loggedHighMemoryWarning.compareAndSet(false, true)) {
            highMemorySplitSourceCounter.update(1);
            log.warn("Split buffering for %s.%s in query %s exceeded memory limit (%s). %s splits are buffered.", databaseName, tableName, queryId, succinctBytes(maxOutstandingSplitsBytes), getBufferedInternalSplitCount());
        }
        throw new TrinoException(HIVE_EXCEEDED_SPLIT_BUFFERING_LIMIT, format("Split buffering for %s.%s exceeded memory limit (%s). %s splits are buffered.", databaseName, tableName, succinctBytes(maxOutstandingSplitsBytes), getBufferedInternalSplitCount()));
    }
    bufferedInternalSplitCount.incrementAndGet();
    OptionalInt bucketNumber = split.getBucketNumber();
    return queues.offer(bucketNumber, split);
}
Also used: TrinoException(io.trino.spi.TrinoException) OptionalInt(java.util.OptionalInt)
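
addToQueue combines two concurrency idioms worth noting: AtomicLong.addAndGet gives race-free memory accounting across producer threads, and AtomicBoolean.compareAndSet ensures the high-memory warning is logged exactly once. A stripped-down sketch of the pair; the limit, class name, and message are placeholders:

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class SplitBufferLimiter {
    private final AtomicLong bufferedBytes = new AtomicLong();
    private final AtomicBoolean warned = new AtomicBoolean();
    private final long maxBytes;

    SplitBufferLimiter(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    void account(long splitBytes) {
        // addAndGet updates and reads atomically, so concurrent callers
        // cannot slip past the limit between a read and a write
        if (bufferedBytes.addAndGet(splitBytes) > maxBytes) {
            // compareAndSet flips false -> true for exactly one caller,
            // so the warning is emitted at most once per source
            if (warned.compareAndSet(false, true)) {
                System.err.printf("split buffering exceeded %d bytes%n", maxBytes);
            }
            throw new IllegalStateException("split buffering exceeded memory limit");
        }
    }
}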

Aggregations

TrinoException (io.trino.spi.TrinoException): 623
IOException (java.io.IOException): 151
ImmutableList (com.google.common.collect.ImmutableList): 105
List (java.util.List): 100
Type (io.trino.spi.type.Type): 93
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList): 90
SchemaTableName (io.trino.spi.connector.SchemaTableName): 83
Path (org.apache.hadoop.fs.Path): 83
Optional (java.util.Optional): 79
ArrayList (java.util.ArrayList): 77
Map (java.util.Map): 76
ImmutableMap (com.google.common.collect.ImmutableMap): 70
Objects.requireNonNull (java.util.Objects.requireNonNull): 69
ConnectorSession (io.trino.spi.connector.ConnectorSession): 63
TableNotFoundException (io.trino.spi.connector.TableNotFoundException): 62
ImmutableSet (com.google.common.collect.ImmutableSet): 56
VarcharType (io.trino.spi.type.VarcharType): 54
Set (java.util.Set): 54
Slice (io.airlift.slice.Slice): 53
Table (io.trino.plugin.hive.metastore.Table): 52