Example 1 with DeleteBigQueryDataFn

Use of com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn in project DataflowTemplates by GoogleCloudPlatform.

From the class DeleteBigQueryDataFnTest, method testTransform_withDeleteSourceDataEnabled_doesntTruncateSpecialPartitions.

/**
 * Test that DeleteBigQueryDataFn doesn't attempt to delete special BigQuery partitions even if
 * {@code deleteSourceData = true}.
 *
 * <p>As per <a
 * href="https://cloud.google.com/bigquery/docs/managing-partitioned-tables#delete_a_partition">
 * this documentation</a>, special partitions "__NULL__" and "__UNPARTITIONED__" cannot be
 * deleted.
 */
@Test
@Category(NeedsRunner.class)
public void testTransform_withDeleteSourceDataEnabled_doesntTruncateSpecialPartitions() {
    Options options = TestPipeline.testingPipelineOptions().as(Options.class);
    options.setDeleteSourceData(true);
    BigQueryTablePartition.Builder builder =
        BigQueryTablePartition.builder().setLastModificationTime(System.currentTimeMillis() * 1000);
    BigQueryTablePartition p1 = builder.setPartitionName("__NULL__").build();
    BigQueryTablePartition p2 = builder.setPartitionName("__UNPARTITIONED__").build();
    BigQueryTablePartition p3 = builder.setPartitionName("NORMAL_PARTITION").build();
    BigQueryTable t1 = table.toBuilder()
        .setPartitions(Arrays.asList(p1, p2, p3))
        .setPartitioningColumn("column-name-doesnt-matter")
        .build();
    DeleteBigQueryDataFn fn = new DeleteBigQueryDataFn().withTestBqClientFactory(() -> bqMock);
    testPipeline
        .apply("CreateInput", Create.of(KV.of(t1, p1), KV.of(t1, p2), KV.of(t1, p3)).withCoder(fnCoder))
        .apply("TestDeleteBigQueryDataFn", ParDo.of(fn));
    testPipeline.run(options);
    // Only the regular partition may be deleted; the two special partitions must be left alone.
    verify(bqMock, times(1)).delete(TableId.of("pr1", "d1", "t1$NORMAL_PARTITION"));
    verifyNoMoreInteractions(bqMock);
}
Also used: Options (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options), BigQueryTablePartition (com.google.cloud.teleport.v2.values.BigQueryTablePartition), BigQueryTable (com.google.cloud.teleport.v2.values.BigQueryTable), Category (org.junit.experimental.categories.Category), Test (org.junit.Test)
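
The rule that the test above checks (skip the special partitions "__NULL__" and "__UNPARTITIONED__", and address a regular partition as "table$partition") can be sketched in plain Java. This is only an illustrative sketch, not the actual DeleteBigQueryDataFn implementation: the class PartitionDeletePlanner and its methods isDeletable and deletableTargets are hypothetical names, and no BigQuery client is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of DataflowTemplates): decides which partitions may be deleted.
final class PartitionDeletePlanner {
    // Special partition names that, per the BigQuery documentation cited in the test, cannot be deleted.
    private static final Set<String> SPECIAL_PARTITIONS =
        new HashSet<>(Arrays.asList("__NULL__", "__UNPARTITIONED__"));

    private PartitionDeletePlanner() {}

    // Returns true for any partition name that is not one of the special partitions.
    public static boolean isDeletable(String partitionName) {
        return !SPECIAL_PARTITIONS.contains(partitionName);
    }

    // Builds "table$partition" names for every deletable partition, preserving input order.
    public static List<String> deletableTargets(String tableName, List<String> partitionNames) {
        List<String> targets = new ArrayList<>();
        for (String partitionName : partitionNames) {
            if (isDeletable(partitionName)) {
                targets.add(tableName + "$" + partitionName);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("__NULL__", "__UNPARTITIONED__", "NORMAL_PARTITION");
        // prints [t1$NORMAL_PARTITION]
        System.out.println(deletableTargets("t1", partitions));
    }
}
```

With the same three partition names the test uses, only "t1$NORMAL_PARTITION" survives the filter, which matches the single delete call the test verifies on the mock.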

Example 2 with DeleteBigQueryDataFn

Use of com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn in project DataflowTemplates by GoogleCloudPlatform.

From the class DataplexBigQueryToGcs, method transformPipeline.

@VisibleForTesting
static void transformPipeline(Pipeline pipeline, List<BigQueryTable> tables, DataplexBigQueryToGcsOptions options,
        String targetRootPath, BigQueryServices testBqServices, BigQueryClientFactory testBqClientFactory) {
    List<PCollection<KV<BigQueryTable, KV<BigQueryTablePartition, String>>>> fileCollections =
        new ArrayList<>(tables.size());
    // Export every table and key each export result by its source table.
    tables.forEach(table -> {
        fileCollections.add(pipeline
            .apply(String.format("ExportTable-%s", table.getTableName()),
                new BigQueryTableToGcsTransform(table, targetRootPath, options.getFileFormat(), options.getFileCompression(), options.getEnforceSamePartitionKey())
                    .withTestServices(testBqServices))
            .apply(String.format("AttachTableKeys-%s", table.getTableName()), WithKeys.of(table)));
    });
    PCollection<KV<BigQueryTable, KV<BigQueryTablePartition, String>>> exportFileResults =
        PCollectionList.of(fileCollections).apply("FlattenTableResults", Flatten.pCollections());
    PCollection<Void> metadataUpdateResults =
        exportFileResults.apply("UpdateDataplexMetadata", new UpdateDataplexBigQueryToGcsExportMetadataTransform());
    // Keep only the (table, partition) pair, wait for the metadata update, then run DeleteBigQueryDataFn.
    exportFileResults
        .apply(MapElements
            .into(TypeDescriptors.kvs(TypeDescriptor.of(BigQueryTable.class), TypeDescriptor.of(BigQueryTablePartition.class)))
            .via((SerializableFunction<KV<BigQueryTable, KV<BigQueryTablePartition, String>>, KV<BigQueryTable, BigQueryTablePartition>>)
                input -> KV.of(input.getKey(), input.getValue().getKey())))
        .apply("WaitForMetadataUpdate", Wait.on(metadataUpdateResults))
        .apply("TruncateBigQueryData", ParDo.of(new DeleteBigQueryDataFn().withTestBqClientFactory(testBqClientFactory)));
}
Also used: BigQueryTablePartition (com.google.cloud.teleport.v2.values.BigQueryTablePartition), SerializableFunction (org.apache.beam.sdk.transforms.SerializableFunction), ArrayList (java.util.ArrayList), DeleteBigQueryDataFn (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn), KV (org.apache.beam.sdk.values.KV), PCollection (org.apache.beam.sdk.values.PCollection), BigQueryTable (com.google.cloud.teleport.v2.values.BigQueryTable), UpdateDataplexBigQueryToGcsExportMetadataTransform (com.google.cloud.teleport.v2.transforms.UpdateDataplexBigQueryToGcsExportMetadataTransform), BigQueryTableToGcsTransform (com.google.cloud.teleport.v2.transforms.BigQueryTableToGcsTransform), VisibleForTesting (com.google.common.annotations.VisibleForTesting)

Example 3 with DeleteBigQueryDataFn

Use of com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn in project DataflowTemplates by GoogleCloudPlatform.

From the class DeleteBigQueryDataFnTest, method testTransform_withDeleteSourceDataDisabled_doesntTruncateData.

@Test
@Category(NeedsRunner.class)
public void testTransform_withDeleteSourceDataDisabled_doesntTruncateData() {
    Options options = TestPipeline.testingPipelineOptions().as(Options.class);
    options.setDeleteSourceData(false);
    BigQueryTable partitionedTable = table.toBuilder()
        .setPartitions(Collections.singletonList(partition))
        .setPartitioningColumn("column-name-doesnt-matter")
        .build();
    DeleteBigQueryDataFn fn = new DeleteBigQueryDataFn().withTestBqClientFactory(() -> bqMock);
    // Feed both a partitioned table with its partition and a table paired with a null partition.
    PCollection<Void> actual = testPipeline
        .apply("CreateInput", Create.of(KV.of(partitionedTable, partition), KV.of(table, (BigQueryTablePartition) null)).withCoder(fnCoder))
        .apply("TestDeleteBigQueryDataFn", ParDo.of(fn));
    PAssert.that(actual).empty();
    testPipeline.run(options);
    // With deleteSourceData disabled, the BigQuery client must never be called.
    verifyNoMoreInteractions(bqMock);
}
Also used: Options (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options), BigQueryTable (com.google.cloud.teleport.v2.values.BigQueryTable), Category (org.junit.experimental.categories.Category), Test (org.junit.Test)

Aggregations

BigQueryTable (com.google.cloud.teleport.v2.values.BigQueryTable): 3
Options (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options): 2
BigQueryTablePartition (com.google.cloud.teleport.v2.values.BigQueryTablePartition): 2
Test (org.junit.Test): 2
Category (org.junit.experimental.categories.Category): 2
BigQueryTableToGcsTransform (com.google.cloud.teleport.v2.transforms.BigQueryTableToGcsTransform): 1
DeleteBigQueryDataFn (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn): 1
UpdateDataplexBigQueryToGcsExportMetadataTransform (com.google.cloud.teleport.v2.transforms.UpdateDataplexBigQueryToGcsExportMetadataTransform): 1
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
ArrayList (java.util.ArrayList): 1
SerializableFunction (org.apache.beam.sdk.transforms.SerializableFunction): 1
KV (org.apache.beam.sdk.values.KV): 1
PCollection (org.apache.beam.sdk.values.PCollection): 1