
Example 1 with AvroKeyRecordReader

Use of org.apache.avro.mapreduce.AvroKeyRecordReader in project pinot by LinkedIn.

From the class DelegatingAvroKeyInputFormat, method createRecordReader:

@Override
public org.apache.hadoop.mapreduce.RecordReader<org.apache.avro.mapred.AvroKey<T>, NullWritable> createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException {
    LOGGER.info("DelegatingAvroKeyInputFormat.createRecordReader() for split:{}", split);
    FileSplit fileSplit = (FileSplit) split;
    Configuration configuration = context.getConfiguration();
    // Derive the logical source name from the split's file path.
    String sourceName = getSourceNameFromPath(fileSplit, configuration);
    LOGGER.info("Source name for path {}: {}", fileSplit.getPath(), sourceName);
    // The job configuration carries a JSON map of source name -> Avro schema JSON.
    Map<String, String> schemaJSONMapping = new ObjectMapper().readValue(configuration.get("schema.json.mapping"), MAP_STRING_STRING_TYPE);
    LOGGER.info("Schema JSON mapping: {}", schemaJSONMapping);
    String sourceSchemaJSON = schemaJSONMapping.get(sourceName);
    // Parse the per-source schema and delegate to the stock Avro record reader.
    Schema schema = new Schema.Parser().parse(sourceSchemaJSON);
    return new AvroKeyRecordReader<T>(schema);
}
Also used: Configuration (org.apache.hadoop.conf.Configuration), Schema (org.apache.avro.Schema), FileSplit (org.apache.hadoop.mapreduce.lib.input.FileSplit), ObjectMapper (org.codehaus.jackson.map.ObjectMapper), AvroKeyRecordReader (org.apache.avro.mapreduce.AvroKeyRecordReader)
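The key idea in the snippet is that the reader resolves a per-source Avro schema from a JSON map stored under the `schema.json.mapping` configuration key, keyed by the source name derived from the split's path. A minimal sketch of that lookup step, using only plain Java (no Hadoop, Avro, or Jackson dependencies; the map contents and source names here are hypothetical illustrations, not values from the Pinot codebase):

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaMappingSketch {

    // Mimics the per-source schema lookup done in createRecordReader:
    // given the deserialized schema.json.mapping map and a source name,
    // return that source's Avro schema JSON.
    static String resolveSchemaJson(Map<String, String> schemaJsonMapping, String sourceName) {
        String schemaJson = schemaJsonMapping.get(sourceName);
        if (schemaJson == null) {
            // In the real method a missing entry would surface later as a
            // failure in Schema.Parser.parse; failing fast here makes the
            // misconfiguration explicit.
            throw new IllegalArgumentException("No schema registered for source: " + sourceName);
        }
        return schemaJson;
    }

    public static void main(String[] args) {
        // Hypothetical mapping: source name -> Avro schema JSON string.
        Map<String, String> mapping = new HashMap<>();
        mapping.put("events", "{\"type\":\"record\",\"name\":\"Event\",\"fields\":[]}");

        System.out.println(resolveSchemaJson(mapping, "events"));
    }
}
```

In the real job, the equivalent of `mapping` would be serialized to JSON and placed into the Hadoop `Configuration` under `schema.json.mapping` at job-setup time, which is what allows a single input format to delegate to schema-specific `AvroKeyRecordReader` instances.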

Aggregations

Schema (org.apache.avro.Schema): 2
AvroKeyRecordReader (org.apache.avro.mapreduce.AvroKeyRecordReader): 2
Configuration (org.apache.hadoop.conf.Configuration): 2
FileSplit (org.apache.hadoop.mapreduce.lib.input.FileSplit): 2
ObjectMapper (org.codehaus.jackson.map.ObjectMapper): 2