Example 1 with Parser

Use of org.apache.avro.Schema.Parser in project nifi by apache.

In the class ValidateRecord, method getValidationSchema.

protected RecordSchema getValidationSchema(final ProcessContext context, final FlowFile flowFile, final RecordReader reader) throws MalformedRecordException, IOException, SchemaNotFoundException {
    final String schemaAccessStrategy = context.getProperty(SCHEMA_ACCESS_STRATEGY).getValue();
    if (schemaAccessStrategy.equals(READER_SCHEMA.getValue())) {
        return reader.getSchema();
    } else if (schemaAccessStrategy.equals(SCHEMA_NAME_PROPERTY.getValue())) {
        final SchemaRegistry schemaRegistry = context.getProperty(SCHEMA_REGISTRY).asControllerService(SchemaRegistry.class);
        final String schemaName = context.getProperty(SCHEMA_NAME).evaluateAttributeExpressions(flowFile).getValue();
        final SchemaIdentifier schemaIdentifier = SchemaIdentifier.builder().name(schemaName).build();
        return schemaRegistry.retrieveSchema(schemaIdentifier);
    } else if (schemaAccessStrategy.equals(SCHEMA_TEXT_PROPERTY.getValue())) {
        final String schemaText = context.getProperty(SCHEMA_TEXT).evaluateAttributeExpressions(flowFile).getValue();
        final Parser parser = new Schema.Parser();
        final Schema avroSchema = parser.parse(schemaText);
        return AvroTypeUtil.createSchema(avroSchema);
    } else {
        throw new ProcessException("Invalid Schema Access Strategy: " + schemaAccessStrategy);
    }
}
Also used: ProcessException (org.apache.nifi.processor.exception.ProcessException), RecordSchema (org.apache.nifi.serialization.record.RecordSchema), Schema (org.apache.avro.Schema), SchemaIdentifier (org.apache.nifi.serialization.record.SchemaIdentifier), SchemaRegistry (org.apache.nifi.schemaregistry.services.SchemaRegistry), Parser (org.apache.avro.Schema.Parser)

Example 2 with Parser

Use of org.apache.avro.Schema.Parser in project Gaffer by gchq.

In the class AvroJobInitialiser, method initialiseInput.

private void initialiseInput(final Job job, final MapReduce operation) throws IOException {
    if (null == avroSchemaFilePath) {
        throw new IllegalArgumentException("Avro schema file path has not been set");
    }
    final Schema schema = new Parser().parse(new File(avroSchemaFilePath));
    AvroJob.setInputKeySchema(job, schema);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    for (final Map.Entry<String, String> entry : operation.getInputMapperPairs().entrySet()) {
        if (entry.getValue().contains(job.getConfiguration().get(MAPPER_GENERATOR))) {
            AvroKeyInputFormat.addInputPath(job, new Path(entry.getKey()));
        }
    }
}
Also used: Path (org.apache.hadoop.fs.Path), Schema (org.apache.avro.Schema), File (java.io.File), Map (java.util.Map), Parser (org.apache.avro.Schema.Parser)

Example 3 with Parser

Use of org.apache.avro.Schema.Parser in project Gaffer by gchq.

In the class AvroJobInitialiser, method initialiseInput.

private void initialiseInput(final Job job, final MapReduceOperation operation) throws IOException {
    if (null == avroSchemaFilePath) {
        throw new IllegalArgumentException("Avro schema file path has not been set");
    }
    final Schema schema = new Parser().parse(new File(avroSchemaFilePath));
    AvroJob.setInputKeySchema(job, schema);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    List<String> paths = operation.getInputPaths();
    for (final String path : paths) {
        AvroKeyInputFormat.addInputPath(job, new Path(path));
    }
}
Also used: Path (org.apache.hadoop.fs.Path), Schema (org.apache.avro.Schema), File (java.io.File), Parser (org.apache.avro.Schema.Parser)

Example 4 with Parser

Use of org.apache.avro.Schema.Parser in project flink by apache.

In the class GlueSchemaRegistryInputStreamDeserializer, method getSchemaAndDeserializedStream.

/**
 * Get schema and remove extra Schema Registry information within input stream.
 *
 * @param in input stream
 * @return schema of object within input stream
 * @throws IOException Exception during decompression
 */
public Schema getSchemaAndDeserializedStream(InputStream in) throws IOException {
    byte[] inputBytes = new byte[in.available()];
    in.read(inputBytes);
    in.reset();
    MutableByteArrayInputStream mutableByteArrayInputStream = (MutableByteArrayInputStream) in;
    String schemaDefinition = glueSchemaRegistryDeserializationFacade.getSchemaDefinition(inputBytes);
    byte[] deserializedBytes = glueSchemaRegistryDeserializationFacade.getActualData(inputBytes);
    mutableByteArrayInputStream.setBuffer(deserializedBytes);
    Schema schema;
    try {
        Parser schemaParser = new Schema.Parser();
        schema = schemaParser.parse(schemaDefinition);
    } catch (SchemaParseException e) {
        String message = "Error occurred while parsing schema, see inner exception for details.";
        throw new AWSSchemaRegistryException(message, e);
    }
    return schema;
}
Also used: AWSSchemaRegistryException (com.amazonaws.services.schemaregistry.exception.AWSSchemaRegistryException), MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), SchemaParseException (org.apache.avro.SchemaParseException), Schema (org.apache.avro.Schema), Parser (org.apache.avro.Schema.Parser)
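
Example 4 copies the stream contents with in.available() and in.read(...) and then calls in.reset(). That pattern is only safe for in-memory streams, because InputStream.read(byte[]) is allowed to return fewer bytes than requested. The following is a minimal standard-library sketch of the same "read everything, then rewind" step, not the flink implementation. The class name StreamPeek and the method peekAll are hypothetical, and it assumes Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of flink or Avro.
final class StreamPeek {

    /**
     * Reads every remaining byte from a stream that supports mark/reset,
     * then rewinds the stream so the same bytes can be read again.
     */
    static byte[] peekAll(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(Integer.MAX_VALUE);       // remember the current position
        byte[] bytes = in.readAllBytes(); // reads until end of stream (Java 9+)
        in.reset();                       // rewind to the marked position
        return bytes;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {10, 20, 30, 40};
        InputStream in = new ByteArrayInputStream(payload);

        byte[] peeked = peekAll(in);

        System.out.println(Arrays.equals(peeked, payload)); // prints "true"
        System.out.println(in.available());                 // prints "4"
    }
}
```

Calling mark before reading makes the rewind explicit instead of relying on the stream's initial mark position, and readAllBytes avoids sizing the buffer from available(), which is only an estimate for streams that are not backed by a byte array.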

Aggregations

Schema (org.apache.avro.Schema): 4
Parser (org.apache.avro.Schema.Parser): 4
File (java.io.File): 2
Path (org.apache.hadoop.fs.Path): 2
AWSSchemaRegistryException (com.amazonaws.services.schemaregistry.exception.AWSSchemaRegistryException): 1
Map (java.util.Map): 1
SchemaParseException (org.apache.avro.SchemaParseException): 1
MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream): 1
ProcessException (org.apache.nifi.processor.exception.ProcessException): 1
SchemaRegistry (org.apache.nifi.schemaregistry.services.SchemaRegistry): 1
RecordSchema (org.apache.nifi.serialization.record.RecordSchema): 1
SchemaIdentifier (org.apache.nifi.serialization.record.SchemaIdentifier): 1