Search in sources:

Example 1 with MutableByteArrayInputStream

Use of org.apache.flink.formats.avro.utils.MutableByteArrayInputStream in project flink by apache.

From the class AvroRowDeserializationSchema, method readObject.

@SuppressWarnings("unchecked")
private void readObject(ObjectInputStream inputStream) throws ClassNotFoundException, IOException {
    // Restore the serialized state.
    recordClazz = (Class<? extends SpecificRecord>) inputStream.readObject();
    schemaString = inputStream.readUTF();
    // Rebuild the transient fields from the schema string.
    typeInfo = (RowTypeInfo) AvroSchemaConverter.<Row>convertToTypeInfo(schemaString);
    schema = new Schema.Parser().parse(schemaString);
    // Prefer a generated specific record when a record class is available;
    // otherwise fall back to a generic record.
    if (recordClazz != null) {
        record = (SpecificRecord) SpecificData.newInstance(recordClazz, schema);
    } else {
        record = new GenericData.Record(schema);
    }
    datumReader = new SpecificDatumReader<>(schema);
    // The mutable stream is created once; each incoming message is later
    // swapped in via setBuffer() without allocating a new stream.
    this.inputStream = new MutableByteArrayInputStream();
    decoder = DecoderFactory.get().binaryDecoder(this.inputStream, null);
    jodaConverter = JodaConverter.getConverter();
}
Also used: MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), Row (org.apache.flink.types.Row), GenericData (org.apache.avro.generic.GenericData)
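
These fields come together in the schema's per-record path. A minimal sketch of that path, simplified for illustration (convertAvroRecordToRow stands in for the class's row-conversion step, and error handling is omitted):

// Sketch, not the verbatim Flink method: the stream, decoder, and record
// are all reused across calls, so deserializing a message allocates nothing.
public Row deserialize(byte[] message) throws IOException {
    // Swap the new message into the existing stream instead of wrapping it in a new one.
    inputStream.setBuffer(message);
    // Passing the previous record lets Avro reuse its field storage.
    record = datumReader.read(record, decoder);
    return convertAvroRecordToRow(schema, typeInfo, record);
}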

Example 2 with MutableByteArrayInputStream

Use of org.apache.flink.formats.avro.utils.MutableByteArrayInputStream in project flink by apache.

From the class GlueSchemaRegistryInputStreamDeserializerTest, method testGetSchemaAndDeserializedStream_withoutCompression_succeeds.

/**
 * Tests that getSchemaAndDeserializedStream succeeds when compression is not enabled.
 */
@Test
public void testGetSchemaAndDeserializedStream_withoutCompression_succeeds() throws IOException {
    compressionByte = COMPRESSION_DEFAULT_BYTE;
    compressionHandler = new GlueSchemaRegistryDefaultCompression();
    // Frame an uncompressed payload: header version byte, compression byte, then the Avro-encoded record.
    ByteArrayOutputStream byteArrayOutputStream = buildByteArrayOutputStream(AWSSchemaRegistryConstants.HEADER_VERSION_BYTE, compressionByte);
    byte[] bytes = writeToExistingStream(byteArrayOutputStream, encodeData(userDefinedPojo, new SpecificDatumWriter<>(userSchema)));
    // Point the reusable stream at the framed bytes.
    MutableByteArrayInputStream mutableByteArrayInputStream = new MutableByteArrayInputStream();
    mutableByteArrayInputStream.setBuffer(bytes);
    glueSchemaRegistryDeserializationFacade = new MockGlueSchemaRegistryDeserializationFacade(bytes, glueSchema, NONE);
    GlueSchemaRegistryInputStreamDeserializer glueSchemaRegistryInputStreamDeserializer = new GlueSchemaRegistryInputStreamDeserializer(glueSchemaRegistryDeserializationFacade);
    Schema resultSchema = glueSchemaRegistryInputStreamDeserializer.getSchemaAndDeserializedStream(mutableByteArrayInputStream);
    // The schema recovered from the stream must match the registered definition.
    assertThat(resultSchema.toString()).isEqualTo(glueSchema.getSchemaDefinition());
}
Also used: MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), Schema (org.apache.avro.Schema), ByteArrayOutputStream (java.io.ByteArrayOutputStream), GlueSchemaRegistryDefaultCompression (com.amazonaws.services.schemaregistry.common.GlueSchemaRegistryDefaultCompression), SpecificDatumWriter (org.apache.avro.specific.SpecificDatumWriter), Test (org.junit.Test)
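
The helpers buildByteArrayOutputStream and writeToExistingStream are defined elsewhere in the test class. A plausible reconstruction, assuming the Glue Schema Registry framing of header version byte, compression byte, 16-byte schema-version UUID, then payload (these bodies are an assumption, not the real helpers):

// Hypothetical reconstruction of the test helpers; the layout is inferred
// from the registry's wire format. Needs java.nio.ByteBuffer and java.util.UUID.
private static ByteArrayOutputStream buildByteArrayOutputStream(byte headerVersionByte, byte compressionByte) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(headerVersionByte);
    out.write(compressionByte);
    // 16-byte schema version id, as in the registry's framing.
    ByteBuffer uuidBytes = ByteBuffer.allocate(16);
    UUID schemaVersionId = UUID.randomUUID();
    uuidBytes.putLong(schemaVersionId.getMostSignificantBits());
    uuidBytes.putLong(schemaVersionId.getLeastSignificantBits());
    out.write(uuidBytes.array());
    return out;
}

private static byte[] writeToExistingStream(ByteArrayOutputStream out, byte[] payload) throws IOException {
    out.write(payload);
    return out.toByteArray();
}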

Example 3 with MutableByteArrayInputStream

Use of org.apache.flink.formats.avro.utils.MutableByteArrayInputStream in project flink by apache.

From the class GlueSchemaRegistryInputStreamDeserializerTest, method testGetSchemaAndDeserializedStream_withWrongSchema_throwsException.

/**
 * Tests that getSchemaAndDeserializedStream throws an exception when the schema is invalid.
 */
@Test
public void testGetSchemaAndDeserializedStream_withWrongSchema_throwsException() throws IOException {
    // Deliberately malformed: the favorite_number field repeats "name" where "type" belongs.
    String schemaDefinition = "{"
            + "\"type\":\"record\","
            + "\"name\":\"User\","
            + "\"namespace\":\"org.apache.flink.formats.avro.glue.schema.registry\","
            + "\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"},"
            + "{\"name\":\"favorite_number\",\"name\":[\"int\",\"null\"]},"
            + "{\"name\":\"favorite_color\",\"type\":[\"string\",\"null\"]}"
            + "]}";
    MutableByteArrayInputStream mutableByteArrayInputStream = new MutableByteArrayInputStream();
    glueSchema = new com.amazonaws.services.schemaregistry.common.Schema(schemaDefinition, DataFormat.AVRO.name(), testTopic);
    glueSchemaRegistryDeserializationFacade = new MockGlueSchemaRegistryDeserializationFacade(new byte[20], glueSchema, NONE);
    GlueSchemaRegistryInputStreamDeserializer awsSchemaRegistryInputStreamDeserializer = new GlueSchemaRegistryInputStreamDeserializer(glueSchemaRegistryDeserializationFacade);
    // The bad definition should surface as a wrapped AWSSchemaRegistryException.
    thrown.expect(AWSSchemaRegistryException.class);
    thrown.expectMessage("Error occurred while parsing schema, see inner exception for details.");
    awsSchemaRegistryInputStreamDeserializer.getSchemaAndDeserializedStream(mutableByteArrayInputStream);
}
Also used: MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), Test (org.junit.Test)
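
For contrast, the well-formed version of the offending field swaps the duplicated "name" key for "type" (a corrected fragment, shown only for illustration):

// Corrected: each field object carries exactly one "name" and one "type".
String correctedField = "{\"name\":\"favorite_number\",\"type\":[\"int\",\"null\"]}";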

Example 4 with MutableByteArrayInputStream

Use of org.apache.flink.formats.avro.utils.MutableByteArrayInputStream in project flink by apache.

From the class GlueSchemaRegistryInputStreamDeserializerTest, method testGetSchemaAndDeserializedStream_withCompression_succeeds.

/**
 * Tests that getSchemaAndDeserializedStream succeeds when compression is enabled.
 */
@Test
public void testGetSchemaAndDeserializedStream_withCompression_succeeds() throws IOException {
    COMPRESSION compressionType = COMPRESSION.ZLIB;
    compressionByte = AWSSchemaRegistryConstants.COMPRESSION_BYTE;
    compressionHandler = new GlueSchemaRegistryDefaultCompression();
    // Frame a ZLIB-compressed payload: header version byte, compression byte, then the compressed record.
    ByteArrayOutputStream byteArrayOutputStream = buildByteArrayOutputStream(AWSSchemaRegistryConstants.HEADER_VERSION_BYTE, compressionByte);
    byte[] bytes = writeToExistingStream(byteArrayOutputStream, compressData(encodeData(userDefinedPojo, new SpecificDatumWriter<>(userSchema))));
    MutableByteArrayInputStream mutableByteArrayInputStream = new MutableByteArrayInputStream();
    mutableByteArrayInputStream.setBuffer(bytes);
    glueSchemaRegistryDeserializationFacade = new MockGlueSchemaRegistryDeserializationFacade(bytes, glueSchema, compressionType);
    GlueSchemaRegistryInputStreamDeserializer glueSchemaRegistryInputStreamDeserializer = new GlueSchemaRegistryInputStreamDeserializer(glueSchemaRegistryDeserializationFacade);
    Schema resultSchema = glueSchemaRegistryInputStreamDeserializer.getSchemaAndDeserializedStream(mutableByteArrayInputStream);
    // Decompression happens inside the facade; the recovered schema must still match.
    assertThat(resultSchema.toString()).isEqualTo(glueSchema.getSchemaDefinition());
}
Also used: MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), Schema (org.apache.avro.Schema), COMPRESSION (com.amazonaws.services.schemaregistry.utils.AWSSchemaRegistryConstants.COMPRESSION), ByteArrayOutputStream (java.io.ByteArrayOutputStream), GlueSchemaRegistryDefaultCompression (com.amazonaws.services.schemaregistry.common.GlueSchemaRegistryDefaultCompression), Test (org.junit.Test)
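
compressData is another helper not shown here. Because the handler is GlueSchemaRegistryDefaultCompression, which uses ZLIB, plain java.util.zip produces equivalent output; the following is an illustrative stand-in, not the real helper:

// Illustrative ZLIB compression matching the default handler's format.
// Requires java.util.zip.Deflater.
private static byte[] compressData(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
    byte[] chunk = new byte[4096];
    while (!deflater.finished()) {
        int n = deflater.deflate(chunk);
        out.write(chunk, 0, n);
    }
    deflater.end();
    return out.toByteArray();
}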

Example 5 with MutableByteArrayInputStream

Use of org.apache.flink.formats.avro.utils.MutableByteArrayInputStream in project flink by apache.

From the class GlueSchemaRegistryInputStreamDeserializer, method getSchemaAndDeserializedStream.

/**
 * Gets the schema and strips the extra Schema Registry header from the input stream,
 * leaving only the actual data in the stream's buffer.
 *
 * @param in input stream (expected to be a MutableByteArrayInputStream)
 * @return schema of the object within the input stream
 * @throws IOException if reading or decompression fails
 */
public Schema getSchemaAndDeserializedStream(InputStream in) throws IOException {
    // For a ByteArrayInputStream, available() is the remaining buffer size and a
    // single read() fills the array, so the return value is intentionally unchecked.
    byte[] inputBytes = new byte[in.available()];
    in.read(inputBytes);
    in.reset();
    MutableByteArrayInputStream mutableByteArrayInputStream = (MutableByteArrayInputStream) in;
    String schemaDefinition = glueSchemaRegistryDeserializationFacade.getSchemaDefinition(inputBytes);
    // getActualData strips the registry header and decompresses if necessary.
    byte[] deserializedBytes = glueSchemaRegistryDeserializationFacade.getActualData(inputBytes);
    // Swap the cleaned payload back into the caller's stream in place.
    mutableByteArrayInputStream.setBuffer(deserializedBytes);
    Schema schema;
    try {
        Parser schemaParser = new Schema.Parser();
        schema = schemaParser.parse(schemaDefinition);
    } catch (SchemaParseException e) {
        String message = "Error occurred while parsing schema, see inner exception for details.";
        throw new AWSSchemaRegistryException(message, e);
    }
    return schema;
}
Also used: AWSSchemaRegistryException (com.amazonaws.services.schemaregistry.exception.AWSSchemaRegistryException), MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream), SchemaParseException (org.apache.avro.SchemaParseException), Schema (org.apache.avro.Schema), Parser (org.apache.avro.Schema.Parser)
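
Every example above leans on the same small trick: java.io.ByteArrayInputStream keeps its buffer in protected fields, so a subclass can swap the buffer without reallocating the stream or the decoder wrapped around it. A sketch of the utility's essential shape, inferred from the usage above (not a verbatim copy of the Flink class):

import java.io.ByteArrayInputStream;

// One stream instance serves many messages: setBuffer() replaces the
// underlying array and rewinds the read position.
public class MutableByteArrayInputStream extends ByteArrayInputStream {

    public MutableByteArrayInputStream() {
        super(new byte[0]);
    }

    // Replace the buffer and reset position and limit for the next read.
    public void setBuffer(byte[] buf) {
        this.buf = buf;
        this.pos = 0;
        this.count = buf.length;
    }
}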

Aggregations

MutableByteArrayInputStream (org.apache.flink.formats.avro.utils.MutableByteArrayInputStream): 6
Schema (org.apache.avro.Schema): 3
Test (org.junit.Test): 3
GlueSchemaRegistryDefaultCompression (com.amazonaws.services.schemaregistry.common.GlueSchemaRegistryDefaultCompression): 2
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 2
GenericData (org.apache.avro.generic.GenericData): 2
AWSSchemaRegistryException (com.amazonaws.services.schemaregistry.exception.AWSSchemaRegistryException): 1
COMPRESSION (com.amazonaws.services.schemaregistry.utils.AWSSchemaRegistryConstants.COMPRESSION): 1
Parser (org.apache.avro.Schema.Parser): 1
SchemaParseException (org.apache.avro.SchemaParseException): 1
SpecificData (org.apache.avro.specific.SpecificData): 1
SpecificDatumWriter (org.apache.avro.specific.SpecificDatumWriter): 1
Row (org.apache.flink.types.Row): 1