
Example 1 with HFileOutputFormat2

Use of org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2 in project hbase by apache.

From the createMultiHFileRecordWriter method of the MultiHFileOutputFormat class, which lazily creates one HFileOutputFormat2 record writer per destination table.

static <V extends Cell> RecordWriter<ImmutableBytesWritable, V> createMultiHFileRecordWriter(final TaskAttemptContext context) throws IOException {
    // Get the path of the output directory
    final Path outputPath = FileOutputFormat.getOutputPath(context);
    final Path outputDir = new FileOutputCommitter(outputPath, context).getWorkPath();
    final Configuration conf = context.getConfiguration();
    final FileSystem fs = outputDir.getFileSystem(conf);
    // Map of tables to writers
    final Map<ImmutableBytesWritable, RecordWriter<ImmutableBytesWritable, V>> tableWriters = new HashMap<>();
    return new RecordWriter<ImmutableBytesWritable, V>() {

        @Override
        public void write(ImmutableBytesWritable tableName, V cell) throws IOException, InterruptedException {
            RecordWriter<ImmutableBytesWritable, V> tableWriter = tableWriters.get(tableName);
            // First cell seen for this table: create its output directory and writer
            if (tableWriter == null) {
                // using the table name as the directory name
                final Path tableOutputDir = new Path(outputDir, Bytes.toString(tableName.copyBytes()));
                fs.mkdirs(tableOutputDir);
                LOG.info("Writing table '" + Bytes.toString(tableName.copyBytes()) + "' data into directory " + tableOutputDir);
                // Create writer for one specific table
                tableWriter = new HFileOutputFormat2.HFileRecordWriter<>(context, tableOutputDir);
                // Put table into map
                tableWriters.put(tableName, tableWriter);
            }
            // Delegate to the per-table writer; the underlying HFile writer
            // derives the row from the cell itself, so null is passed as the key
            tableWriter.write(null, cell);
        }

        @Override
        public void close(TaskAttemptContext c) throws IOException, InterruptedException {
            for (RecordWriter<ImmutableBytesWritable, V> writer : tableWriters.values()) {
                writer.close(c);
            }
        }
    };
}
Also used: Path (org.apache.hadoop.fs.Path), ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable), Configuration (org.apache.hadoop.conf.Configuration), HashMap (java.util.HashMap), FileOutputCommitter (org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter), TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext), RecordWriter (org.apache.hadoop.mapreduce.RecordWriter), FileSystem (org.apache.hadoop.fs.FileSystem), HFileOutputFormat2 (org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2)
