Example 1 with MergingPageIterator

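Use of io.prestosql.plugin.hive.util.MergingPageIterator in the project hetu-core by openlookeng.

The method mergeFiles of the class SortingFileWriter. It opens a TempFileReader over each sorted temporary file, passes the merged pages from MergingPageIterator to the consumer, and then deletes the temporary files.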

private void mergeFiles(Iterable<TempFile> files, Consumer<Page> consumer) {
    // The Closer closes every registered reader when the try block exits
    try (Closer closer = Closer.create()) {
        Collection<Iterator<Page>> iterators = new ArrayList<>();
        // Open one reader per sorted temporary file
        for (TempFile tempFile : files) {
            Path file = tempFile.getPath();
            FileStatus fileStatus = fileSystem.getFileStatus(file);
            OrcDataSource dataSource = new HdfsOrcDataSource(
                    new OrcDataSourceId(file.toString()),
                    fileStatus.getLen(),
                    new DataSize(1, MEGABYTE),
                    new DataSize(8, MEGABYTE),
                    new DataSize(8, MEGABYTE),
                    false,
                    fileSystem.open(file),
                    new FileFormatDataSourceStats(),
                    fileStatus.getModificationTime());
            TempFileReader reader = new TempFileReader(types, dataSource);
            // Closing the reader also closes the data source
            closer.register(reader);
            iterators.add(reader);
        }
        // Merge the sorted inputs and hand each merged page to the consumer
        new MergingPageIterator(iterators, types, sortFields, sortOrders).forEachRemaining(consumer);
        // Delete each temporary file and verify that it is gone
        for (TempFile tempFile : files) {
            Path file = tempFile.getPath();
            fileSystem.delete(file, false);
            if (fileSystem.exists(file)) {
                throw new IOException("Failed to delete temporary file: " + file);
            }
        }
    } catch (IOException e) {
        // Rethrow as unchecked because the method declares no checked exceptions
        throw new UncheckedIOException(e);
    }
}
Also used: Closer (com.google.common.io.Closer), Path (org.apache.hadoop.fs.Path), OrcDataSource (io.prestosql.orc.OrcDataSource), HdfsOrcDataSource (io.prestosql.plugin.hive.orc.HdfsOrcDataSource), MergingPageIterator (io.prestosql.plugin.hive.util.MergingPageIterator), FileStatus (org.apache.hadoop.fs.FileStatus), OrcDataSourceId (io.prestosql.orc.OrcDataSourceId), ArrayList (java.util.ArrayList), UncheckedIOException (java.io.UncheckedIOException), IOException (java.io.IOException), DataSize (io.airlift.units.DataSize), Iterator (java.util.Iterator), TempFileReader (io.prestosql.plugin.hive.util.TempFileReader)
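
The merge step in mergeFiles combines several already-sorted inputs into one sorted stream. The sketch below shows that general idea, a k-way merge over a heap, on plain values using only java.util. The class KWayMergeIterator and its nested Entry are hypothetical names introduced for illustration. This is not the implementation of MergingPageIterator, which works on Page objects and sort fields rather than on single comparable values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical illustration: merges iterators that are each already sorted. */
final class KWayMergeIterator<T> implements Iterator<T> {
    /** The current head element of one input, plus the rest of that input. */
    private static final class Entry<T> {
        final T head;
        final Iterator<T> rest;

        Entry(T head, Iterator<T> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    // Holds at most one entry per input, ordered by the head element
    private final PriorityQueue<Entry<T>> heap;

    KWayMergeIterator(List<Iterator<T>> inputs, Comparator<? super T> order) {
        Comparator<Entry<T>> byHead = (a, b) -> order.compare(a.head, b.head);
        heap = new PriorityQueue<>(byHead);
        for (Iterator<T> input : inputs) {
            // Empty inputs contribute nothing and never enter the heap
            if (input.hasNext()) {
                heap.add(new Entry<>(input.next(), input));
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heap.isEmpty();
    }

    @Override
    public T next() {
        Entry<T> smallest = heap.poll();
        if (smallest == null) {
            throw new NoSuchElementException();
        }
        // Refill the heap from the input that just supplied the smallest head
        if (smallest.rest.hasNext()) {
            heap.add(new Entry<>(smallest.rest.next(), smallest.rest));
        }
        return smallest.head;
    }
}
```

As in mergeFiles, the merged result can be drained with forEachRemaining, for example `new KWayMergeIterator<>(inputs, Comparator.<Integer>naturalOrder()).forEachRemaining(System.out::println)`. The heap never holds more than one element per input, so memory use depends on the number of inputs rather than on their total length.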
