Example 6 with Entry

use of com.google.cloud.datacatalog.v1.Entry in project DataflowTemplates by GoogleCloudPlatform.

Class PubSubChangeConsumer, method handleBatch.

@Override
public void handleBatch(List<SourceRecord> records, RecordCommitter committer) throws InterruptedException {
    ImmutableList.Builder<ApiFuture<String>> futureListBuilder = ImmutableList.builder();
    Set<Publisher> usedPublishers = new HashSet<>();
    // TODO(pabloem): Improve the commit logic.
    for (SourceRecord r : records) {
        // Debezium publishes updates for each table in a separate Kafka topic, which is the fully
        // qualified name of the MySQL table (e.g. dbInstanceName.databaseName.table_name).
        String tableName = r.topic();
        if (whitelistedTables.contains(tableName)) {
            Row updateRecord = translator.translate(r);
            if (updateRecord == null) {
                continue;
            }
            if (!observedTables.contains(tableName)) {
                Entry result = schemaUpdater.updateSchemaForTable(tableName, updateRecord.getSchema());
                if (result == null) {
                    throw new InterruptedException("A problem occurred when communicating with Cloud Data Catalog");
                }
                observedTables.add(tableName);
            }
            Publisher pubSubPublisher = this.getPubSubPublisher(tableName);
            if (pubSubPublisher == null) {
                // stop execution without committing any more messages.
                throw new InterruptedException("Unable to create a PubSub topic for table " + tableName);
            }
            usedPublishers.add(pubSubPublisher);
            PubsubMessage.Builder messageBuilder = PubsubMessage.newBuilder();
            LOG.debug("Update Record is: {}", updateRecord);
            try {
                RowCoder recordCoder = getCoderForRow(tableName, updateRecord);
                ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                recordCoder.encode(updateRecord, outputStream);
                ByteString encodedUpdate = ByteString.copyFrom(outputStream.toByteArray());
                PubsubMessage message = messageBuilder.setData(encodedUpdate).putAttributes("table", tableName).build();
                futureListBuilder.add(pubSubPublisher.publish(message));
            } catch (IOException e) {
                LOG.error("Caught exception {} when trying to encode record {}. Stopping processing.", e, updateRecord);
                return;
            }
        } else {
            LOG.debug("Discarding record: {}", r);
        }
        committer.markProcessed(r);
    }
    usedPublishers.forEach(p -> p.publishAllOutstanding());
    for (ApiFuture<String> f : futureListBuilder.build()) {
        try {
            String result = f.get();
            LOG.debug("Result from PubSub Publish Future: {}", result);
        } catch (ExecutionException e) {
            LOG.error("Exception when executing future {}: {}. Stopping execution.", f, e);
            return;
        }
    }
    committer.markBatchFinished();
}
Also used: RowCoder(org.apache.beam.sdk.coders.RowCoder) ImmutableList(com.google.common.collect.ImmutableList) ByteString(com.google.protobuf.ByteString) Publisher(com.google.cloud.pubsub.v1.Publisher) ByteArrayOutputStream(java.io.ByteArrayOutputStream) IOException(java.io.IOException) SourceRecord(org.apache.kafka.connect.source.SourceRecord) PubsubMessage(com.google.pubsub.v1.PubsubMessage) ApiFuture(com.google.api.core.ApiFuture) Entry(com.google.cloud.datacatalog.v1beta1.Entry) Row(org.apache.beam.sdk.values.Row) ExecutionException(java.util.concurrent.ExecutionException) HashSet(java.util.HashSet)
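
The commit pattern in handleBatch (publish every record, wait on all publish futures, and mark the batch finished only when none of them failed) can be sketched with JDK classes alone. This is a minimal illustration, not the project's implementation: BatchCommitSketch, its nested Committer interface (a hypothetical stand-in for the RecordCommitter seen above), and the publish function are all assumptions, and CompletableFuture is swapped in for ApiFuture so the sketch runs without the Google Cloud libraries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Function;

final class BatchCommitSketch {

    /** Hypothetical stand-in for the RecordCommitter used by handleBatch. */
    interface Committer<R> {
        void markProcessed(R record);
        void markBatchFinished();
    }

    /**
     * Publishes every record, then waits for all publish futures.
     * Marks the batch finished and returns true only if every publish succeeded.
     */
    static <R> boolean handleBatch(
            List<R> records,
            Function<R, CompletableFuture<String>> publish,
            Committer<R> committer) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (R record : records) {
            // Start the asynchronous publish and remember its future.
            futures.add(publish.apply(record));
            committer.markProcessed(record);
        }
        for (CompletableFuture<String> future : futures) {
            try {
                future.get(); // blocks until this publish completes
            } catch (ExecutionException e) {
                return false; // a publish failed: leave the batch uncommitted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        committer.markBatchFinished();
        return true;
    }
}
```

Returning before markBatchFinished leaves the batch uncommitted, which mirrors the early return in the ExecutionException branch of the example above.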

Example 7 with Entry

use of com.google.cloud.datacatalog.v1.Entry in project DataflowTemplates by GoogleCloudPlatform.

Class DataCatalogSchemaUtils, method lookupPubSubEntry.

static Entry lookupPubSubEntry(DataCatalogClient client, String pubsubTopic, String gcpProject) {
    String linkedResource = String.format(DATA_CATALOG_PUBSUB_URI_TEMPLATE, gcpProject, pubsubTopic);
    LOG.info("Looking up LinkedResource {}", linkedResource);
    LookupEntryRequest request = LookupEntryRequest.newBuilder().setLinkedResource(linkedResource).build();
    try {
        return client.lookupEntry(request);
    } catch (ApiException e) {
        // Log the failure once through the logger and signal it to the caller with null.
        LOG.error("ApiException thrown by Data Catalog API:", e);
        return null;
    }
}
Also used : LookupEntryRequest(com.google.cloud.datacatalog.v1beta1.LookupEntryRequest) Entry(com.google.cloud.datacatalog.v1beta1.Entry) ApiException(com.google.api.gax.rpc.ApiException)
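
The lookup above has two separable steps: formatting the linked-resource name, then mapping a failed call to "no entry". The sketch below shows both with JDK classes only. Everything in it is an assumption made for illustration: LookupSketch and its methods are hypothetical, the template value is assumed because the definition of DATA_CATALOG_PUBSUB_URI_TEMPLATE is not shown in the excerpt (it follows the Pub/Sub linked-resource format documented for Data Catalog), and Optional is swapped in for the null return used by the original method.

```java
import java.util.Optional;
import java.util.function.Function;

final class LookupSketch {

    // Assumed value: the excerpt does not show how the original constant is defined.
    static final String PUBSUB_URI_TEMPLATE = "//pubsub.googleapis.com/projects/%s/topics/%s";

    /** Builds the linked-resource name, project first and topic second, as in the example above. */
    static String linkedResource(String gcpProject, String pubsubTopic) {
        return String.format(PUBSUB_URI_TEMPLATE, gcpProject, pubsubTopic);
    }

    /**
     * Runs a caller-supplied lookup (hypothetical stand-in for the Data Catalog client call)
     * and maps both a null result and a runtime failure to an empty Optional.
     */
    static <E> Optional<E> lookup(
            Function<String, E> lookupFn, String gcpProject, String pubsubTopic) {
        try {
            return Optional.ofNullable(lookupFn.apply(linkedResource(gcpProject, pubsubTopic)));
        } catch (RuntimeException e) {
            // ApiException is unchecked, so a failed call would land here.
            return Optional.empty();
        }
    }
}
```

An Optional return makes the "no entry" case explicit at the call site, where the original relies on callers remembering to check for null.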

Aggregations

Entry (com.google.cloud.datacatalog.v1beta1.Entry) 5
IOException (java.io.IOException) 4
AlreadyExistsException (com.google.api.gax.rpc.AlreadyExistsException) 2
ApiException (com.google.api.gax.rpc.ApiException) 2
DataCatalogClient (com.google.cloud.datacatalog.v1beta1.DataCatalogClient) 2
LookupEntryRequest (com.google.cloud.datacatalog.v1beta1.LookupEntryRequest) 2
ApiFuture (com.google.api.core.ApiFuture) 1
NotFoundException (com.google.api.gax.rpc.NotFoundException) 1
PermissionDeniedException (com.google.api.gax.rpc.PermissionDeniedException) 1
CreateEntryGroupRequest (com.google.cloud.datacatalog.v1.CreateEntryGroupRequest) 1
CreateEntryRequest (com.google.cloud.datacatalog.v1.CreateEntryRequest) 1
DataCatalogClient (com.google.cloud.datacatalog.v1.DataCatalogClient) 1
Entry (com.google.cloud.datacatalog.v1.Entry) 1
EntryGroup (com.google.cloud.datacatalog.v1.EntryGroup) 1
EntryGroupName (com.google.cloud.datacatalog.v1.EntryGroupName) 1
CreateEntryGroupRequest (com.google.cloud.datacatalog.v1beta1.CreateEntryGroupRequest) 1
CreateEntryRequest (com.google.cloud.datacatalog.v1beta1.CreateEntryRequest) 1
EntryGroup (com.google.cloud.datacatalog.v1beta1.EntryGroup) 1
ListEntriesRequest (com.google.cloud.datacatalog.v1beta1.ListEntriesRequest) 1
ListEntriesResponse (com.google.cloud.datacatalog.v1beta1.ListEntriesResponse) 1