Search in sources :

Example 36 with OrcCorruptionException

use of io.prestosql.orc.OrcCorruptionException in project hetu-core by openlookeng.

the class ShortColumnReader method readBlock.

@Override
public Block<Short> readBlock() throws IOException {
    if (!rowGroupOpen) {
        openRowGroup();
    }
    if (readOffset > 0) {
        if (presentStream != null) {
            // skip ahead the present bit reader, but count the set bits
            // and use this as the skip size for the data reader
            readOffset = presentStream.countBitsSet(readOffset);
        }
        if (readOffset > 0) {
            if (dataStream == null) {
                throw new OrcCorruptionException(column.getOrcDataSourceId(), "Value is not null but data stream is missing");
            }
            dataStream.skip(readOffset);
        }
    }
    Block block;
    if (dataStream == null) {
        if (presentStream == null) {
            throw new OrcCorruptionException(column.getOrcDataSourceId(), "Value is null but present stream is missing");
        }
        presentStream.skip(nextBatchSize);
        block = RunLengthEncodedBlock.create(SmallintType.SMALLINT, null, nextBatchSize);
    } else if (presentStream == null) {
        block = readNonNullBlock();
    } else {
        boolean[] isNull = new boolean[nextBatchSize];
        int nullCount = presentStream.getUnsetBits(nextBatchSize, isNull);
        if (nullCount == 0) {
            block = readNonNullBlock();
        } else if (nullCount != nextBatchSize) {
            block = readNullBlock(isNull, nextBatchSize - nullCount);
        } else {
            block = RunLengthEncodedBlock.create(SmallintType.SMALLINT, null, nextBatchSize);
        }
    }
    readOffset = 0;
    nextBatchSize = 0;
    return block;
}
Also used : RunLengthEncodedBlock(io.prestosql.spi.block.RunLengthEncodedBlock) ShortArrayBlock(io.prestosql.spi.block.ShortArrayBlock) Block(io.prestosql.spi.block.Block) OrcCorruptionException(io.prestosql.orc.OrcCorruptionException)

Example 37 with OrcCorruptionException

use of io.prestosql.orc.OrcCorruptionException in project hetu-core by openlookeng.

the class OrcDeletedRows method isDeleted.

private boolean isDeleted(OrcAcidRowId sourcePageRowId) {
    if (sortedRowsIterator == null) {
        for (WriteIdInfo deleteDeltaInfo : deleteDeltaLocations.getDeleteDeltas()) {
            Path path = createPath(deleteDeltaLocations.getPartitionLocation(), deleteDeltaInfo, sourceFileName);
            try {
                FileSystem fileSystem = hdfsEnvironment.getFileSystem(sessionUser, path, configuration);
                FileStatus fileStatus = hdfsEnvironment.doAs(sessionUser, () -> fileSystem.getFileStatus(path));
                pageSources.add(pageSourceFactory.createPageSource(fileStatus.getPath(), fileStatus.getLen(), fileStatus.getModificationTime()));
            } catch (FileNotFoundException ignored) {
                // source file does not have a delta delete file in this location
                continue;
            } catch (PrestoException e) {
                throw e;
            } catch (OrcCorruptionException e) {
                throw new PrestoException(HiveErrorCode.HIVE_BAD_DATA, format("Failed to read ORC file: %s", path), e);
            } catch (RuntimeException | IOException e) {
                throw new PrestoException(HiveErrorCode.HIVE_CURSOR_ERROR, format("Failed to read ORC file: %s", path), e);
            }
        }
        List<Type> columnTypes = ImmutableList.of(BigintType.BIGINT, IntegerType.INTEGER, BigintType.BIGINT);
        // Last index for rowIdHandle
        List<Integer> sortFields = ImmutableList.of(0, 1, 2);
        List<SortOrder> sortOrders = ImmutableList.of(SortOrder.ASC_NULLS_FIRST, SortOrder.ASC_NULLS_FIRST, SortOrder.ASC_NULLS_FIRST);
        sortedRowsIterator = HiveUtil.getMergeSortedPages(pageSources, columnTypes, sortFields, sortOrders);
    }
    do {
        if (currentPage == null || currentPageOffset >= currentPage.getPositionCount()) {
            currentPage = null;
            currentPageOffset = 0;
            if (sortedRowsIterator.hasNext()) {
                currentPage = sortedRowsIterator.next();
            } else {
                // No more entries in deleted_delta
                return false;
            }
        }
        do {
            deletedRowId.set(currentPage, currentPageOffset);
            if (deletedRowId.compareTo(sourcePageRowId) == 0) {
                // source row is deleted.
                return true;
            } else if (deletedRowId.compareTo(sourcePageRowId) > 0) {
                // So current source row is not deleted.
                return false;
            }
            currentPageOffset++;
        } while (currentPageOffset < currentPage.getPositionCount());
    } while (sortedRowsIterator.hasNext());
    // No more entries;
    return false;
}
Also used : Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) FileNotFoundException(java.io.FileNotFoundException) SortOrder(io.prestosql.spi.block.SortOrder) PrestoException(io.prestosql.spi.PrestoException) IOException(java.io.IOException) UncheckedIOException(java.io.UncheckedIOException) BigintType(io.prestosql.spi.type.BigintType) Type(io.prestosql.spi.type.Type) IntegerType(io.prestosql.spi.type.IntegerType) FileSystem(org.apache.hadoop.fs.FileSystem) WriteIdInfo(io.prestosql.plugin.hive.WriteIdInfo) OrcCorruptionException(io.prestosql.orc.OrcCorruptionException)

Aggregations

OrcCorruptionException (io.prestosql.orc.OrcCorruptionException)37 Block (io.prestosql.spi.block.Block)14 RunLengthEncodedBlock (io.prestosql.spi.block.RunLengthEncodedBlock)12 LongStreamCheckpoint (io.prestosql.orc.checkpoint.LongStreamCheckpoint)7 DecimalStreamCheckpoint (io.prestosql.orc.checkpoint.DecimalStreamCheckpoint)4 InputStreamCheckpoint.createInputStreamCheckpoint (io.prestosql.orc.checkpoint.InputStreamCheckpoint.createInputStreamCheckpoint)4 LongStreamV2Checkpoint (io.prestosql.orc.checkpoint.LongStreamV2Checkpoint)4 LongArrayBlock (io.prestosql.spi.block.LongArrayBlock)4 Slice (io.airlift.slice.Slice)3 LongStreamV1Checkpoint (io.prestosql.orc.checkpoint.LongStreamV1Checkpoint)3 PrestoException (io.prestosql.spi.PrestoException)3 ByteStreamCheckpoint (io.prestosql.orc.checkpoint.ByteStreamCheckpoint)2 ByteArrayInputStream (io.prestosql.orc.stream.ByteArrayInputStream)2 LongInputStream (io.prestosql.orc.stream.LongInputStream)2 WriteIdInfo (io.prestosql.plugin.hive.WriteIdInfo)2 ByteArrayBlock (io.prestosql.spi.block.ByteArrayBlock)2 IntArrayBlock (io.prestosql.spi.block.IntArrayBlock)2 SortOrder (io.prestosql.spi.block.SortOrder)2 VariableWidthBlock (io.prestosql.spi.block.VariableWidthBlock)2 BigintType (io.prestosql.spi.type.BigintType)2