
Example 1 with ReadOperation

Use of org.apache.beam.sdk.io.gcp.spanner.ReadOperation in the project java-docs-samples by GoogleCloudPlatform.

Class SpannerReadAll, method main.

public static void main(String[] args) {
    Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
    Pipeline p = Pipeline.create(options);
    SpannerConfig spannerConfig = SpannerConfig.create().withInstanceId(options.getInstanceId()).withDatabaseId(options.getDatabaseId());
    // [START spanner_dataflow_readall]
    PCollection<Struct> allRecords = p
        // Query the schema metadata for the names of the tables in the default schema.
        .apply(SpannerIO.read()
            .withSpannerConfig(spannerConfig)
            .withQuery("SELECT t.table_name FROM information_schema.tables AS t "
                + "WHERE t.table_catalog = '' AND t.table_schema = ''"))
        // Turn each table name into a ReadOperation that selects all rows of that table.
        .apply(MapElements.into(TypeDescriptor.of(ReadOperation.class))
            .via((SerializableFunction<Struct, ReadOperation>) input -> {
                String tableName = input.getString(0);
                return ReadOperation.create().withQuery("SELECT * FROM " + tableName);
            }))
        // Execute every ReadOperation and collect the resulting rows.
        .apply(SpannerIO.readAll().withSpannerConfig(spannerConfig));
    // [END spanner_dataflow_readall]
    PCollection<Long> dbEstimatedSize = allRecords.apply(EstimateSize.create()).apply(Sum.longsGlobally());
    dbEstimatedSize.apply(ToString.elements()).apply(TextIO.write().to(options.getOutput()).withoutSharding());
    p.run().waitUntilFinish();
}
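
The mapping step in the middle of the pipeline is a fan-out: each table name produced by the first query becomes one "SELECT * FROM ..." query, and SpannerIO.readAll() then runs all of them. The following is a minimal plain-Java sketch of that step only, without Beam or Spanner. The class TableQueries, its methods, and the table names "Singers" and "Albums" are hypothetical and are not part of the original sample.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TableQueries {
    // Builds the per-table query the same way the lambda in the pipeline does.
    static String selectAll(String tableName) {
        return "SELECT * FROM " + tableName;
    }

    // Maps every table name to its query, one query per table.
    static List<String> toQueries(List<String> tableNames) {
        return tableNames.stream()
            .map(TableQueries::selectAll)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "Singers" and "Albums" are made-up table names for illustration.
        for (String query : toQueries(List.of("Singers", "Albums"))) {
            System.out.println(query);
        }
    }
}
```

In the pipeline itself, each such query string is wrapped with ReadOperation.create().withQuery(...) rather than collected into a list.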
Also used: SpannerConfig (org.apache.beam.sdk.io.gcp.spanner.SpannerConfig), MapElements (org.apache.beam.sdk.transforms.MapElements), ToString (org.apache.beam.sdk.transforms.ToString), TypeDescriptor (org.apache.beam.sdk.values.TypeDescriptor), Sum (org.apache.beam.sdk.transforms.Sum), SerializableFunction (org.apache.beam.sdk.transforms.SerializableFunction), PipelineOptionsFactory (org.apache.beam.sdk.options.PipelineOptionsFactory), PCollection (org.apache.beam.sdk.values.PCollection), SpannerIO (org.apache.beam.sdk.io.gcp.spanner.SpannerIO), Description (org.apache.beam.sdk.options.Description), ReadOperation (org.apache.beam.sdk.io.gcp.spanner.ReadOperation), Struct (com.google.cloud.spanner.Struct), Validation (org.apache.beam.sdk.options.Validation), Pipeline (org.apache.beam.sdk.Pipeline), PipelineOptions (org.apache.beam.sdk.options.PipelineOptions), TextIO (org.apache.beam.sdk.io.TextIO)
