Example 1 with RuntimeIOException

Use of org.apache.iceberg.exceptions.RuntimeIOException in the Apache project incubator-gobblin.

Class IcebergUtils, method getIcebergDataFileWithMetric.

/**
 * Method to get DataFile with format and metrics information
 * This method is mainly used to get the file to be added
 */
public static DataFile getIcebergDataFileWithMetric(
        org.apache.gobblin.metadata.DataFile file,
        PartitionSpec partitionSpec,
        StructLike partition,
        Configuration conf,
        Map<Integer, Integer> schemaIdMap) {
    Path filePath = new Path(file.getFilePath());
    DataFiles.Builder dataFileBuilder = DataFiles.builder(partitionSpec);
    try {
        // Use absolute path to support federation
        dataFileBuilder
                .withPath(filePath.toUri().getRawPath())
                .withFileSizeInBytes(filePath.getFileSystem(conf).getFileStatus(filePath).getLen())
                .withFormat(file.getFileFormat());
    } catch (IOException exception) {
        throw new RuntimeIOException(exception, "Failed to get dataFile for path: %s", filePath);
    }
    if (partition != null) {
        dataFileBuilder.withPartition(partition);
    }
    Metrics metrics = new Metrics(
            file.getFileMetrics().getRecordCount(),
            IcebergUtils.getMapFromIntegerLongPairs(file.getFileMetrics().getColumnSizes(), schemaIdMap),
            IcebergUtils.getMapFromIntegerLongPairs(file.getFileMetrics().getValueCounts(), schemaIdMap),
            IcebergUtils.getMapFromIntegerLongPairs(file.getFileMetrics().getNullValueCounts(), schemaIdMap),
            IcebergUtils.getMapFromIntegerBytesPairs(file.getFileMetrics().getLowerBounds(), schemaIdMap),
            IcebergUtils.getMapFromIntegerBytesPairs(file.getFileMetrics().getUpperBounds(), schemaIdMap));
    return dataFileBuilder.withMetrics(metrics).build();
}
Also used : Path(org.apache.hadoop.fs.Path) RuntimeIOException(org.apache.iceberg.exceptions.RuntimeIOException) Metrics(org.apache.iceberg.Metrics) DataFiles(org.apache.iceberg.DataFiles) IOException(java.io.IOException)
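
The catch block in Example 1 converts a checked IOException into an unchecked exception whose message names the path that failed. A minimal sketch of that pattern follows. It uses the JDK's java.io.UncheckedIOException as a stand-in so that it runs without Iceberg on the classpath; the class WrapIoExample and the helper sizeOf are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WrapIoExample {

    // Hypothetical helper: returns the file size in bytes and rethrows any
    // IOException as an unchecked exception that names the failing path.
    static long sizeOf(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            // Stand-in for: new RuntimeIOException(e, "Failed to get dataFile for path: %s", path)
            throw new UncheckedIOException(
                    String.format("Failed to get size for path: %s", path), e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("wrap-io", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println("size = " + sizeOf(tmp));

        Files.delete(tmp);
        try {
            sizeOf(tmp); // the file no longer exists, so Files.size fails
        } catch (UncheckedIOException e) {
            System.out.println("wrapped: " + e.getMessage());
        }
    }
}
```

Callers of sizeOf need no throws clause, and the original IOException remains available through getCause().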

Example 2 with RuntimeIOException

Use of org.apache.iceberg.exceptions.RuntimeIOException in the Apache project drill.

Class ParquetFileWriter, method write.

@Override
public File write() {
    Objects.requireNonNull(location, "File create location must be specified");
    Objects.requireNonNull(name, "File name must be specified");
    OutputFile outputFile = table.io().newOutputFile(
            new Path(location, FileFormat.PARQUET.addExtension(name)).toUri().getPath());
    FileAppender<Record> fileAppender = null;
    try {
        fileAppender = Parquet.write(outputFile)
                .forTable(table)
                .createWriterFunc(GenericParquetWriter::buildWriter)
                .build();
        fileAppender.addAll(records);
        fileAppender.close();
        // metrics are available only when file was written (i.e. close method was executed)
        return new File(outputFile, fileAppender.metrics());
    } catch (IOException | ClassCastException | RuntimeIOException e) {
        if (fileAppender != null) {
            try {
                fileAppender.close();
            } catch (Exception ex) {
                // The write has already failed; ignore any exception from close and rethrow the original one
            }
        }
        throw new IcebergMetastoreException(String.format("Unable to write data into parquet file [%s]", outputFile.location()), e);
    }
}
Also used : OutputFile(org.apache.iceberg.io.OutputFile) Path(org.apache.hadoop.fs.Path) IcebergMetastoreException(org.apache.drill.metastore.iceberg.exceptions.IcebergMetastoreException) RuntimeIOException(org.apache.iceberg.exceptions.RuntimeIOException) GenericParquetWriter(org.apache.iceberg.data.parquet.GenericParquetWriter) Record(org.apache.iceberg.data.Record) IOException(java.io.IOException)
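
Example 2 relies on two things: metrics can be read only after the appender is closed, and a failed write still attempts a close before rethrowing the original error. The sketch below reproduces that control flow with JDK types only. CountingAppender, writeAll and rowCount are hypothetical stand-ins, not Iceberg or Drill APIs. Unlike the original, which silently ignores a failure during close, this variant attaches it to the original exception as a suppressed exception.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;

public class CloseOnFailureExample {

    // Hypothetical stand-in for a file appender: its row count ("metrics")
    // may only be read after close() has been called.
    static class CountingAppender implements Closeable {
        private final boolean failOnAdd;
        private long rows = 0;
        private boolean closed = false;

        CountingAppender(boolean failOnAdd) {
            this.failOnAdd = failOnAdd;
        }

        void add(String row) throws IOException {
            if (failOnAdd) {
                throw new IOException("simulated write failure");
            }
            rows++;
        }

        @Override
        public void close() {
            closed = true;
        }

        long rowCount() {
            if (!closed) {
                throw new IllegalStateException("metrics are only available after close()");
            }
            return rows;
        }
    }

    // Writes all rows, closes the appender, then reads the metrics. On failure
    // it still tries to close and rethrows the original error with context.
    static long writeAll(CountingAppender appender, Iterable<String> rows) {
        try {
            for (String row : rows) {
                appender.add(row);
            }
            appender.close();
            return appender.rowCount();
        } catch (IOException e) {
            try {
                appender.close();
            } catch (RuntimeException closeFailure) {
                // Keep the close failure visible without hiding the root cause
                e.addSuppressed(closeFailure);
            }
            throw new IllegalStateException("Unable to write rows", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAll(new CountingAppender(false), Arrays.asList("a", "b")));
        try {
            writeAll(new CountingAppender(true), Arrays.asList("a"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + " <- " + e.getCause().getMessage());
        }
    }
}
```

Try-with-resources is not used here because the result is read after the explicit close, mirroring the ordering constraint noted in the comment in Example 2.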

Aggregations

IOException (java.io.IOException): 2
Path (org.apache.hadoop.fs.Path): 2
RuntimeIOException (org.apache.iceberg.exceptions.RuntimeIOException): 2
IcebergMetastoreException (org.apache.drill.metastore.iceberg.exceptions.IcebergMetastoreException): 1
DataFiles (org.apache.iceberg.DataFiles): 1
Metrics (org.apache.iceberg.Metrics): 1
Record (org.apache.iceberg.data.Record): 1
GenericParquetWriter (org.apache.iceberg.data.parquet.GenericParquetWriter): 1
OutputFile (org.apache.iceberg.io.OutputFile): 1