
Example 6 with BigQueryTablePartition

use of com.google.cloud.teleport.v2.values.BigQueryTablePartition in project DataflowTemplates by GoogleCloudPlatform.

the class DeleteBigQueryDataFnTest method testTransform_withDeleteSourceDataEnabled_doesntTruncateSpecialPartitions.

/**
 * Test that DeleteBigQueryDataFn doesn't attempt to delete special BigQuery partitions even if
 * {@code deleteSourceData = true}.
 *
 * <p>As per <a
 * href="https://cloud.google.com/bigquery/docs/managing-partitioned-tables#delete_a_partition">
 * this documentation</a>, special partitions "__NULL__" and "__UNPARTITIONED__" cannot be
 * deleted.
 */
@Test
@Category(NeedsRunner.class)
public void testTransform_withDeleteSourceDataEnabled_doesntTruncateSpecialPartitions() {
    Options options = TestPipeline.testingPipelineOptions().as(Options.class);
    options.setDeleteSourceData(true);
    BigQueryTablePartition.Builder builder =
        BigQueryTablePartition.builder()
            .setLastModificationTime(System.currentTimeMillis() * 1000);
    BigQueryTablePartition p1 = builder.setPartitionName("__NULL__").build();
    BigQueryTablePartition p2 = builder.setPartitionName("__UNPARTITIONED__").build();
    BigQueryTablePartition p3 = builder.setPartitionName("NORMAL_PARTITION").build();
    BigQueryTable t1 =
        table.toBuilder()
            .setPartitions(Arrays.asList(p1, p2, p3))
            .setPartitioningColumn("column-name-doesnt-matter")
            .build();
    DeleteBigQueryDataFn fn = new DeleteBigQueryDataFn().withTestBqClientFactory(() -> bqMock);
    testPipeline
        .apply("CreateInput", Create.of(KV.of(t1, p1), KV.of(t1, p2), KV.of(t1, p3)).withCoder(fnCoder))
        .apply("TestDeleteBigQueryDataFn", ParDo.of(fn));
    testPipeline.run(options);
    verify(bqMock, times(1)).delete(TableId.of("pr1", "d1", "t1$NORMAL_PARTITION"));
    verifyNoMoreInteractions(bqMock);
}
Also used : Options(com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options) BigQueryTablePartition(com.google.cloud.teleport.v2.values.BigQueryTablePartition) BigQueryTable(com.google.cloud.teleport.v2.values.BigQueryTable) Category(org.junit.experimental.categories.Category) Test(org.junit.Test)
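The rule this test checks (skip the special partitions "__NULL__" and "__UNPARTITIONED__", and address a normal partition as "table$partition") can be illustrated with a minimal plain-Java sketch. It is not the DeleteBigQueryDataFn implementation: the class SpecialPartitionSketch and its methods are hypothetical names introduced only for illustration, and the sketch works on plain strings rather than BigQueryTable or TableId objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of DataflowTemplates.
final class SpecialPartitionSketch {
    // Partition names that the Javadoc above says cannot be deleted.
    private static final Set<String> SPECIAL_PARTITIONS =
        new HashSet<>(Arrays.asList("__NULL__", "__UNPARTITIONED__"));

    static boolean isSpecialPartition(String partitionName) {
        return SPECIAL_PARTITIONS.contains(partitionName);
    }

    // Builds the "table$partition" form seen in the test's TableId.of(...) call.
    static String partitionDecorator(String tableName, String partitionName) {
        return tableName + "$" + partitionName;
    }

    // Returns decorated IDs only for partitions that are safe to delete.
    static List<String> deletablePartitionIds(String tableName, List<String> partitionNames) {
        List<String> result = new ArrayList<>();
        for (String name : partitionNames) {
            if (!isSpecialPartition(name)) {
                result.add(partitionDecorator(tableName, name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [t1$NORMAL_PARTITION]
        System.out.println(deletablePartitionIds(
            "t1", Arrays.asList("__NULL__", "__UNPARTITIONED__", "NORMAL_PARTITION")));
    }
}
```

With the three partitions from the test as input, only the normal partition survives the check, which matches the single delete call the test verifies on the mock.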

Example 7 with BigQueryTablePartition

use of com.google.cloud.teleport.v2.values.BigQueryTablePartition in project DataflowTemplates by GoogleCloudPlatform.

the class DataplexBigQueryToGcsFilterTest method test_whenNoFilterOptions_filterAcceptsAllTablesAndPartitions.

@Test
public void test_whenNoFilterOptions_filterAcceptsAllTablesAndPartitions() {
    BigQueryTable.Builder t = table();
    BigQueryTablePartition p = partition().build();
    options.setTables(null);
    options.setExportDataModifiedBeforeDateTime(null);
    Filter f = new DataplexBigQueryToGcsFilter(options, new ArrayList<String>());
    assertThat(f.shouldSkipUnpartitionedTable(t)).isFalse();
    assertThat(f.shouldSkipPartitionedTable(t, Collections.singletonList(p))).isFalse();
    assertThat(f.shouldSkipPartition(t, p)).isFalse();
}
Also used : BigQueryTablePartition(com.google.cloud.teleport.v2.values.BigQueryTablePartition) Filter(com.google.cloud.teleport.v2.utils.BigQueryMetadataLoader.Filter) BigQueryTable(com.google.cloud.teleport.v2.values.BigQueryTable) Test(org.junit.Test)

Example 8 with BigQueryTablePartition

use of com.google.cloud.teleport.v2.values.BigQueryTablePartition in project DataflowTemplates by GoogleCloudPlatform.

the class DataplexBigQueryToGcs method transformPipeline.

@VisibleForTesting
static void transformPipeline(Pipeline pipeline, List<BigQueryTable> tables, DataplexBigQueryToGcsOptions options, String targetRootPath, BigQueryServices testBqServices, BigQueryClientFactory testBqClientFactory) {
    List<PCollection<KV<BigQueryTable, KV<BigQueryTablePartition, String>>>> fileCollections = new ArrayList<>(tables.size());
    tables.forEach(table -> {
        fileCollections.add(
            pipeline
                .apply(
                    String.format("ExportTable-%s", table.getTableName()),
                    new BigQueryTableToGcsTransform(table, targetRootPath, options.getFileFormat(), options.getFileCompression(), options.getEnforceSamePartitionKey())
                        .withTestServices(testBqServices))
                .apply(String.format("AttachTableKeys-%s", table.getTableName()), WithKeys.of(table)));
    });
    PCollection<KV<BigQueryTable, KV<BigQueryTablePartition, String>>> exportFileResults = PCollectionList.of(fileCollections).apply("FlattenTableResults", Flatten.pCollections());
    PCollection<Void> metadataUpdateResults = exportFileResults.apply("UpdateDataplexMetadata", new UpdateDataplexBigQueryToGcsExportMetadataTransform());
    exportFileResults
        .apply(
            MapElements.into(TypeDescriptors.kvs(TypeDescriptor.of(BigQueryTable.class), TypeDescriptor.of(BigQueryTablePartition.class)))
                .via((SerializableFunction<KV<BigQueryTable, KV<BigQueryTablePartition, String>>, KV<BigQueryTable, BigQueryTablePartition>>) input -> KV.of(input.getKey(), input.getValue().getKey())))
        .apply("WaitForMetadataUpdate", Wait.on(metadataUpdateResults))
        .apply("TruncateBigQueryData", ParDo.of(new DeleteBigQueryDataFn().withTestBqClientFactory(testBqClientFactory)));
}
Also used : BigQueryTablePartition(com.google.cloud.teleport.v2.values.BigQueryTablePartition) SerializableFunction(org.apache.beam.sdk.transforms.SerializableFunction) ArrayList(java.util.ArrayList) DeleteBigQueryDataFn(com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn) KV(org.apache.beam.sdk.values.KV) PCollection(org.apache.beam.sdk.values.PCollection) BigQueryTable(com.google.cloud.teleport.v2.values.BigQueryTable) UpdateDataplexBigQueryToGcsExportMetadataTransform(com.google.cloud.teleport.v2.transforms.UpdateDataplexBigQueryToGcsExportMetadataTransform) BigQueryTableToGcsTransform(com.google.cloud.teleport.v2.transforms.BigQueryTableToGcsTransform) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Example 9 with BigQueryTablePartition

use of com.google.cloud.teleport.v2.values.BigQueryTablePartition in project DataflowTemplates by GoogleCloudPlatform.

the class DeleteBigQueryDataFnTest method testTransform_withDeleteSourceDataDisabled_doesntTruncateData.

@Test
@Category(NeedsRunner.class)
public void testTransform_withDeleteSourceDataDisabled_doesntTruncateData() {
    Options options = TestPipeline.testingPipelineOptions().as(Options.class);
    options.setDeleteSourceData(false);
    BigQueryTable partitionedTable =
        table.toBuilder()
            .setPartitions(Collections.singletonList(partition))
            .setPartitioningColumn("column-name-doesnt-matter")
            .build();
    DeleteBigQueryDataFn fn = new DeleteBigQueryDataFn().withTestBqClientFactory(() -> bqMock);
    PCollection<Void> actual =
        testPipeline
            .apply("CreateInput", Create.of(KV.of(partitionedTable, partition), KV.of(table, (BigQueryTablePartition) null)).withCoder(fnCoder))
            .apply("TestDeleteBigQueryDataFn", ParDo.of(fn));
    PAssert.that(actual).empty();
    testPipeline.run(options);
    verifyNoMoreInteractions(bqMock);
}
Also used : Options(com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options) BigQueryTable(com.google.cloud.teleport.v2.values.BigQueryTable) Category(org.junit.experimental.categories.Category) Test(org.junit.Test)

Example 10 with BigQueryTablePartition

use of com.google.cloud.teleport.v2.values.BigQueryTablePartition in project DataflowTemplates by GoogleCloudPlatform.

the class DataplexBigQueryToGcsFilterTest method test_whenTablesSet_filterExcludesTablesByName.

@Test
public void test_whenTablesSet_filterExcludesTablesByName() {
    BigQueryTable.Builder includedTable1 = table().setTableName("includedTable1");
    BigQueryTable.Builder includedTable2 = table().setTableName("includedTable2");
    BigQueryTable.Builder excludedTable = table().setTableName("excludedTable");
    BigQueryTablePartition p = partition().build();
    options.setTables("includedTable1,includedTable2");
    options.setExportDataModifiedBeforeDateTime(null);
    Filter f = new DataplexBigQueryToGcsFilter(options, new ArrayList<String>());
    assertThat(f.shouldSkipUnpartitionedTable(includedTable1)).isFalse();
    assertThat(f.shouldSkipUnpartitionedTable(includedTable2)).isFalse();
    assertThat(f.shouldSkipUnpartitionedTable(excludedTable)).isTrue();
    assertThat(f.shouldSkipPartitionedTable(includedTable1, Collections.singletonList(p))).isFalse();
    assertThat(f.shouldSkipPartitionedTable(includedTable2, Collections.singletonList(p))).isFalse();
    assertThat(f.shouldSkipPartitionedTable(excludedTable, Collections.singletonList(p))).isTrue();
    assertThat(f.shouldSkipPartition(includedTable1, p)).isFalse();
    assertThat(f.shouldSkipPartition(includedTable2, p)).isFalse();
    // Should NOT skip PARTITIONS, only tables as a whole because of their name:
    assertThat(f.shouldSkipPartition(excludedTable, p)).isFalse();
}
Also used : BigQueryTablePartition(com.google.cloud.teleport.v2.values.BigQueryTablePartition) Filter(com.google.cloud.teleport.v2.utils.BigQueryMetadataLoader.Filter) BigQueryTable(com.google.cloud.teleport.v2.values.BigQueryTable) Test(org.junit.Test)
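The behavior these assertions describe (a comma-separated include list that skips whole tables by name but never skips individual partitions by name) can be sketched in plain Java. This is an assumption-laden illustration, not the DataplexBigQueryToGcsFilter source: TableNameFilterSketch, its constructor, and its string-based method signatures are hypothetical, and trimming whitespace around names is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a table-name filter; not the real DataplexBigQueryToGcsFilter.
final class TableNameFilterSketch {
    // null means "no table filter configured", so every table is accepted.
    private final Set<String> includedTables;

    TableNameFilterSketch(String commaSeparatedTables) {
        if (commaSeparatedTables == null || commaSeparatedTables.trim().isEmpty()) {
            includedTables = null;
        } else {
            includedTables = new HashSet<>();
            for (String name : commaSeparatedTables.split(",")) {
                includedTables.add(name.trim()); // trimming is an assumption of this sketch
            }
        }
    }

    // A table is skipped only when a filter is set and the name is not on the list.
    boolean shouldSkipTable(String tableName) {
        return includedTables != null && !includedTables.contains(tableName);
    }

    // The name filter applies to tables as a whole, so partitions are never skipped by name.
    boolean shouldSkipPartition(String tableName, String partitionName) {
        return false;
    }
}
```

Constructing the sketch with "includedTable1,includedTable2" makes shouldSkipTable("excludedTable") return true while shouldSkipPartition("excludedTable", ...) still returns false, mirroring the last assertion in the test. Passing null accepts every table, as in the no-filter-options case.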

Aggregations

BigQueryTablePartition (com.google.cloud.teleport.v2.values.BigQueryTablePartition): 13
BigQueryTable (com.google.cloud.teleport.v2.values.BigQueryTable): 11
Test (org.junit.Test): 9
Filter (com.google.cloud.teleport.v2.utils.BigQueryMetadataLoader.Filter): 6
Options (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn.Options): 3
ArrayList (java.util.ArrayList): 3
Category (org.junit.experimental.categories.Category): 3
PCollection (org.apache.beam.sdk.values.PCollection): 2
Table (com.google.api.services.bigquery.model.Table): 1
TableFieldSchema (com.google.api.services.bigquery.model.TableFieldSchema): 1
TableRow (com.google.api.services.bigquery.model.TableRow): 1
TableSchema (com.google.api.services.bigquery.model.TableSchema): 1
TimePartitioning (com.google.api.services.bigquery.model.TimePartitioning): 1
TableId (com.google.cloud.bigquery.TableId): 1
TableResult (com.google.cloud.bigquery.TableResult): 1
TableReadOptions (com.google.cloud.bigquery.storage.v1beta1.ReadOptions.TableReadOptions): 1
ReadSession (com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession): 1
DataplexBigQueryToGcsOptions (com.google.cloud.teleport.v2.options.DataplexBigQueryToGcsOptions): 1
BigQueryTableToGcsTransform (com.google.cloud.teleport.v2.transforms.BigQueryTableToGcsTransform): 1
DeleteBigQueryDataFn (com.google.cloud.teleport.v2.transforms.DeleteBigQueryDataFn): 1