
Example 11 with OrcDataSourceId

Use of io.trino.orc.OrcDataSourceId in project trino by trinodb.

The class SortingFileWriter, method mergeFiles.

private void mergeFiles(Iterable<TempFile> files, Consumer<Page> consumer) {
    // The Closer closes every registered data source when the try block exits
    try (Closer closer = Closer.create()) {
        Collection<Iterator<Page>> iterators = new ArrayList<>();
        // Open one ORC data source per temporary file and wrap it in a page iterator
        for (TempFile tempFile : files) {
            Path file = tempFile.getPath();
            OrcDataSource dataSource = new HdfsOrcDataSource(
                    new OrcDataSourceId(file.toString()),
                    fileSystem.getFileStatus(file).getLen(),
                    new OrcReaderOptions(),
                    fileSystem.open(file),
                    new FileFormatDataSourceStats());
            closer.register(dataSource);
            iterators.add(new TempFileReader(types, dataSource));
        }
        // Merge the per-file iterators and pass each merged page to the consumer
        new MergingPageIterator(iterators, types, sortFields, sortOrders, typeOperators).forEachRemaining(consumer);
        // The inputs are fully consumed, so the temporary files can be deleted
        for (TempFile tempFile : files) {
            Path file = tempFile.getPath();
            if (!fileSystem.delete(file, false)) {
                throw new IOException("Failed to delete temporary file: " + file);
            }
        }
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
Also used : Closer(com.google.common.io.Closer) Path(org.apache.hadoop.fs.Path) OrcDataSource(io.trino.orc.OrcDataSource) HdfsOrcDataSource(io.trino.plugin.hive.orc.HdfsOrcDataSource) MergingPageIterator(io.trino.plugin.hive.util.MergingPageIterator) OrcDataSourceId(io.trino.orc.OrcDataSourceId) ArrayList(java.util.ArrayList) UncheckedIOException(java.io.UncheckedIOException) IOException(java.io.IOException) OrcReaderOptions(io.trino.orc.OrcReaderOptions) Iterator(java.util.Iterator) TempFileReader(io.trino.plugin.hive.util.TempFileReader)
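
The merge step in mergeFiles combines several individually sorted inputs into one sorted stream. As a rough illustration of that general idea, here is a minimal k-way merge over plain Java iterators using a priority queue. It is only a sketch: the class KWayMerge and its merge method are hypothetical names, it works on simple values rather than Trino Page objects, and it is not a description of how MergingPageIterator is actually implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper for illustration only; not part of Trino.
final class KWayMerge {
    private KWayMerge() {}

    // Pairs the current head element of one input with the iterator it came from.
    private static final class Head<T> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }
    }

    // Merges inputs that are each already sorted by 'order' into one sorted list.
    static <T> List<T> merge(List<Iterator<T>> sortedInputs, Comparator<? super T> order) {
        PriorityQueue<Head<T>> heap =
                new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));
        // Seed the heap with the first element of every non-empty input.
        for (Iterator<T> input : sortedInputs) {
            if (input.hasNext()) {
                heap.add(new Head<>(input.next(), input));
            }
        }
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the smallest head, then refill the heap from the same input.
            Head<T> smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Iterator<Integer>> runs = List.of(
                List.of(1, 4, 9).iterator(),
                List.of(2, 3, 10).iterator(),
                List.of(5).iterator());
        System.out.println(merge(runs, Comparator.naturalOrder()));
    }
}
```

The heap holds at most one element per input, so memory stays proportional to the number of inputs rather than to their total size, which is the usual reason to merge sorted temporary files this way.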

Example 12 with OrcDataSourceId

Use of io.trino.orc.OrcDataSourceId in project trino by trinodb.

The class TestLongDecode, method assertVIntRoundTrip.

private static void assertVIntRoundTrip(SliceOutput output, long value, boolean signed) throws IOException {
    // write using Hive's code
    output.reset();
    if (signed) {
        writeVslong(output, value);
    } else {
        writeVulong(output, value);
    }
    Slice hiveBytes = Slices.copyOf(output.slice());
    // write using Trino's code, and verify they are the same
    output.reset();
    writeVLong(output, value, signed);
    Slice trinoBytes = Slices.copyOf(output.slice());
    assertEquals(trinoBytes, hiveBytes);
    // read using Hive's code
    if (signed) {
        long readValueOld = readVslong(hiveBytes.getInput());
        assertEquals(readValueOld, value);
    } else {
        long readValueOld = readVulong(hiveBytes.getInput());
        assertEquals(readValueOld, value);
    }
    // read using Trino's code
    long readValueNew = readVInt(
            signed,
            new OrcInputStream(OrcChunkLoader.create(
                    new OrcDataSourceId("test"),
                    hiveBytes,
                    Optional.empty(),
                    newSimpleAggregatedMemoryContext())));
    assertEquals(readValueNew, value);
}
Also used : OrcDataSourceId(io.trino.orc.OrcDataSourceId) Slice(io.airlift.slice.Slice)
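
The test above checks that a variable-length integer written by one implementation can be read back by another. For readers unfamiliar with the encoding, the following is a minimal, self-contained sketch of a base-128 varint round trip, assuming the usual ORC convention: seven payload bits per byte, least significant group first, the high bit marking continuation, and zigzag encoding for signed values. The class VarIntSketch and its write and read methods are hypothetical names for illustration; they are not the Trino or Hive methods called in the test.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of Trino or Hive.
final class VarIntSketch {
    private VarIntSketch() {}

    // Encodes a long as a base-128 varint; signed values are zigzag-encoded first.
    static byte[] write(long value, boolean signed) {
        long bits = signed ? (value << 1) ^ (value >> 63) : value;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits at a time, setting the high bit while more bits remain.
        while ((bits & ~0x7FL) != 0) {
            out.write((int) ((bits & 0x7F) | 0x80));
            bits >>>= 7;
        }
        out.write((int) bits);
        return out.toByteArray();
    }

    // Decodes a varint from the start of the array, undoing zigzag if signed.
    static long read(byte[] bytes, boolean signed) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break; // last byte of this value
            }
        }
        return signed ? (result >>> 1) ^ -(result & 1) : result;
    }

    public static void main(String[] args) {
        long[] samples = {0, 1, -1, 127, 128, 300, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long sample : samples) {
            // Signed round trip works for every long value.
            if (read(write(sample, true), true) != sample) {
                throw new AssertionError("signed round trip failed for " + sample);
            }
            // Unsigned round trip treats the long as a raw 64-bit pattern.
            if (read(write(sample, false), false) != sample) {
                throw new AssertionError("unsigned round trip failed for " + sample);
            }
        }
    }
}
```

Zigzag encoding maps small negative numbers to small unsigned ones, so a value such as -1 still fits in a single byte instead of the ten bytes its two's-complement pattern would need.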

Aggregations

OrcDataSourceId (io.trino.orc.OrcDataSourceId) 12
OrcDataSource (io.trino.orc.OrcDataSource) 7
TrinoException (io.trino.spi.TrinoException) 6
IOException (java.io.IOException) 6
FileSystem (org.apache.hadoop.fs.FileSystem) 6
OrcReaderOptions (io.trino.orc.OrcReaderOptions) 5
Path (org.apache.hadoop.fs.Path) 5
ImmutableMap (com.google.common.collect.ImmutableMap) 4
OrcReader (io.trino.orc.OrcReader) 4
FileFormatDataSourceStats (io.trino.plugin.hive.FileFormatDataSourceStats) 4
HdfsEnvironment (io.trino.plugin.hive.HdfsEnvironment) 4
ConnectorSession (io.trino.spi.connector.ConnectorSession) 4
Type (io.trino.spi.type.Type) 4
List (java.util.List) 4
Objects.requireNonNull (java.util.Objects.requireNonNull) 4
Optional (java.util.Optional) 4
Inject (javax.inject.Inject) 4
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream) 4
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList) 3
Test (org.testng.annotations.Test) 3