
Example 1 with SchemaRegistryException

Use of org.apache.gobblin.metrics.kafka.SchemaRegistryException in the project incubator-gobblin by apache.

From the class SchemaRegistryVersionWriter, method readSchemaVersioningInformation:

@Override
public Schema readSchemaVersioningInformation(DataInputStream inputStream) throws IOException {
    if (inputStream.readByte() != KafkaAvroSchemaRegistry.MAGIC_BYTE) {
        throw new IOException("MAGIC_BYTE not found in Avro message.");
    }
    byte[] byteKey = new byte[schemaIdLengthBytes];
    int bytesRead = inputStream.read(byteKey, 0, schemaIdLengthBytes);
    if (bytesRead != schemaIdLengthBytes) {
        throw new IOException(String.format("Could not read enough bytes for schema id. Expected: %d, found: %d.", schemaIdLengthBytes, bytesRead));
    }
    String hexKey = Hex.encodeHexString(byteKey);
    try {
        return this.registry.getSchemaByKey(hexKey);
    } catch (SchemaRegistryException sre) {
        throw new IOException("Failed to retrieve schema for key " + hexKey, sre);
    }
}
Also used: SchemaRegistryException(org.apache.gobblin.metrics.kafka.SchemaRegistryException) IOException(java.io.IOException)
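The header format that readSchemaVersioningInformation parses (a magic byte followed by a fixed-length schema id) can be sketched with plain java.io, independent of the Gobblin classes. The MAGIC_BYTE value of 0x0 and the 16-byte id length below are assumptions for the demo, not values confirmed by this example; readFully is used instead of read so a short read fails with EOFException rather than a manual length check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WireFormatSketch {
    static final byte MAGIC_BYTE = 0x0;     // assumed value, standing in for KafkaAvroSchemaRegistry.MAGIC_BYTE
    static final int SCHEMA_ID_LENGTH = 16; // assumed id length in bytes

    // Parse the header the same way readSchemaVersioningInformation does,
    // but return the schema id as a hex string instead of resolving it in a registry.
    static String readSchemaId(DataInputStream in) throws IOException {
        if (in.readByte() != MAGIC_BYTE) {
            throw new IOException("MAGIC_BYTE not found in Avro message.");
        }
        byte[] idBytes = new byte[SCHEMA_ID_LENGTH];
        in.readFully(idBytes); // throws EOFException on a short read
        StringBuilder hex = new StringBuilder();
        for (byte b : idBytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        // Build a message: magic byte, all-zero schema id, then the payload.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(MAGIC_BYTE);
        out.write(new byte[SCHEMA_ID_LENGTH]);
        out.write("payload".getBytes());

        String id = readSchemaId(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(id);
    }
}
```

In the real method the hex key is then passed to registry.getSchemaByKey, and any SchemaRegistryException is wrapped in an IOException so callers only deal with the stream contract.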

Example 2 with SchemaRegistryException

Use of org.apache.gobblin.metrics.kafka.SchemaRegistryException in the project incubator-gobblin by apache.

From the class IcebergMetadataWriter, method updateSchema:

private void updateSchema(TableMetadata tableMetadata, Map<String, String> props, String topicName) {
    // Set default schema versions
    props.put(SCHEMA_CREATION_TIME_KEY, tableMetadata.lastSchemaVersion.or(DEFAULT_CREATION_TIME));
    // Update Schema
    try {
        if (tableMetadata.candidateSchemas.isPresent() && tableMetadata.candidateSchemas.get().size() > 0) {
            Cache candidates = tableMetadata.candidateSchemas.get();
            // Only have default schema, so either we calculate schema from event or the schema does not have creation time, directly update it
            if (candidates.size() == 1 && candidates.getIfPresent(DEFAULT_CREATION_TIME) != null) {
                updateSchemaHelper(DEFAULT_CREATION_TIME, (Schema) candidates.getIfPresent(DEFAULT_CREATION_TIME), props, tableMetadata.table.get());
            } else {
                // update schema if candidates contains the schema that has the same creation time with the latest schema
                org.apache.avro.Schema latestSchema = (org.apache.avro.Schema) schemaRegistry.getLatestSchemaByTopic(topicName);
                String creationTime = AvroUtils.getSchemaCreationTime(latestSchema);
                if (creationTime == null) {
                    log.warn("Schema from schema registry does not contain creation time, check config for schema registry class");
                } else if (candidates.getIfPresent(creationTime) != null) {
                    updateSchemaHelper(creationTime, (Schema) candidates.getIfPresent(creationTime), props, tableMetadata.table.get());
                }
            }
        }
    } catch (SchemaRegistryException e) {
        log.error("Cannot get schema from schema registry, will not update this schema", e);
    }
}
Also used: Schema(org.apache.iceberg.Schema) SchemaRegistryException(org.apache.gobblin.metrics.kafka.SchemaRegistryException) Cache(com.google.common.cache.Cache)
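The branch logic in updateSchema (prefer a lone default candidate, otherwise pick the candidate whose creation time matches the registry's latest schema) can be isolated into a small selection function. This is a sketch using a plain Map in place of the Guava Cache; the sentinel value "0" for DEFAULT_CREATION_TIME is an assumption for illustration only.

```java
import java.util.Map;

public class CandidateSelection {
    static final String DEFAULT_CREATION_TIME = "0"; // assumed sentinel, not the real Gobblin constant

    // Mirrors the decision in updateSchema: return the creation time of the
    // candidate schema that should be applied, or null when no update is safe.
    static String chooseCreationTime(Map<String, String> candidates, String latestCreationTime) {
        // Only the default candidate exists: apply it directly.
        if (candidates.size() == 1 && candidates.containsKey(DEFAULT_CREATION_TIME)) {
            return DEFAULT_CREATION_TIME;
        }
        // Otherwise only update when a candidate matches the latest schema's creation time.
        if (latestCreationTime != null && candidates.containsKey(latestCreationTime)) {
            return latestCreationTime;
        }
        // No creation time on the registry schema, or no matching candidate: skip the update.
        return null;
    }
}
```

Note that in the original method a missing creation time on the registry schema only produces a warning, and a SchemaRegistryException downgrades the whole update to a logged error rather than failing the writer.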

Example 3 with SchemaRegistryException

Use of org.apache.gobblin.metrics.kafka.SchemaRegistryException in the project incubator-gobblin by apache.

From the class KafkaSchemaChangeInjector, method injectControlMessagesBefore:

/**
 * Inject a {@link org.apache.gobblin.stream.MetadataUpdateControlMessage} if the latest schema has changed. Check whether there is a new latest
 * schema if the input record's schema is not present in the schema cache.
 *
 * @param inputRecordEnvelope input record envelope
 * @param workUnitState work unit state
 * @return the injected messages
 */
@Override
public Iterable<ControlMessage<DecodeableKafkaRecord>> injectControlMessagesBefore(RecordEnvelope<DecodeableKafkaRecord> inputRecordEnvelope, WorkUnitState workUnitState) {
    DecodeableKafkaRecord consumerRecord = inputRecordEnvelope.getRecord();
    S schemaIdentifier = getSchemaIdentifier(consumerRecord);
    String topicName = consumerRecord.getTopic();
    // Only consult the registry when this schema identifier has not been seen yet,
    // since the registry lookup is expensive.
    if (this.schemaCache.getIfPresent(schemaIdentifier) == null) {
        try {
            Schema latestSchema = this.schemaRegistry.getLatestSchemaByTopic(topicName);
            this.schemaCache.put(schemaIdentifier, "");
            // latest schema changed, so inject a metadata update control message
            if (!latestSchema.equals(this.latestSchema)) {
                // update the metadata in this injector since the control message is only applied downstream
                this.globalMetadata = GlobalMetadata.builderWithInput(this.globalMetadata, Optional.of(latestSchema)).build();
                // update the latestSchema
                this.latestSchema = latestSchema;
                // inject a metadata update control message before the record so that the downstream constructs
                // are aware of the new schema before processing the record
                ControlMessage datasetLevelMetadataUpdate = new MetadataUpdateControlMessage(this.globalMetadata);
                return Collections.singleton(datasetLevelMetadataUpdate);
            }
        } catch (SchemaRegistryException e) {
            throw new RuntimeException("Exception when getting the latest schema for topic " + topicName, e);
        }
    }
    // no schema change detected
    return null;
}
Also used: DecodeableKafkaRecord(org.apache.gobblin.kafka.client.DecodeableKafkaRecord) MetadataUpdateControlMessage(org.apache.gobblin.stream.MetadataUpdateControlMessage) Schema(org.apache.avro.Schema) SchemaRegistryException(org.apache.gobblin.metrics.kafka.SchemaRegistryException) ControlMessage(org.apache.gobblin.stream.ControlMessage)
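The caching pattern in Example 3 (skip the registry lookup for identifiers already seen, and only signal when the latest schema actually changed) can be shown without the Gobblin or Kafka types. This sketch replaces the Guava cache with a HashMap and the registry call with a Supplier; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class SchemaChangeDetector {
    // Stands in for the Guava schemaCache; presence of a key means "already checked".
    private final Map<String, String> seen = new HashMap<>();
    private String latestSchema; // last schema observed from the registry

    // Returns true when a registry lookup (simulated by fetchLatest) reveals a new
    // latest schema; the real injector would emit a MetadataUpdateControlMessage here.
    boolean checkForChange(String schemaIdentifier, Supplier<String> fetchLatest) {
        if (seen.containsKey(schemaIdentifier)) {
            return false; // identifier already checked; skip the expensive lookup
        }
        String latest = fetchLatest.get();
        seen.put(schemaIdentifier, "");
        if (!latest.equals(this.latestSchema)) {
            this.latestSchema = latest; // remember the new latest schema
            return true;
        }
        return false;
    }
}
```

As in the original, the registry is consulted at most once per distinct schema identifier, so a steady stream of records with a known schema never touches the registry.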

Aggregations

SchemaRegistryException (org.apache.gobblin.metrics.kafka.SchemaRegistryException)3 Cache (com.google.common.cache.Cache)1 IOException (java.io.IOException)1 Schema (org.apache.avro.Schema)1 DecodeableKafkaRecord (org.apache.gobblin.kafka.client.DecodeableKafkaRecord)1 ControlMessage (org.apache.gobblin.stream.ControlMessage)1 MetadataUpdateControlMessage (org.apache.gobblin.stream.MetadataUpdateControlMessage)1 Schema (org.apache.iceberg.Schema)1