
Example 1 with Supplier

Use of org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier in the Apache Beam project.

Class PubsubReadIT, method testReadPublicData.

@Test
public void testReadPublicData() throws Exception {
    // The pipeline will never terminate on its own
    pipeline.getOptions().as(TestPipelineOptions.class).setBlockOnRun(false);
    PCollection<String> messages = pipeline.apply(PubsubIO.readStrings().fromTopic("projects/pubsub-public-data/topics/taxirides-realtime"));
    messages.apply("waitForAnyMessage", signal.signalSuccessWhen(messages.getCoder(), anyMessages -> true));
    Supplier<Void> start = signal.waitForStart(Duration.standardMinutes(5));
    pipeline.apply(signal.signalStart());
    PipelineResult job = pipeline.run();
    start.get();
    signal.waitForSuccess(Duration.standardMinutes(5));
    // A runner may not support cancel
    try {
        job.cancel();
    } catch (UnsupportedOperationException exc) {
        // noop
    }
}
Also used : TestPipelineOptions(org.apache.beam.sdk.testing.TestPipelineOptions) PipelineResult(org.apache.beam.sdk.PipelineResult) Duration(org.joda.time.Duration) RunWith(org.junit.runner.RunWith) Set(java.util.Set) SerializableFunction(org.apache.beam.sdk.transforms.SerializableFunction) Test(org.junit.Test) JUnit4(org.junit.runners.JUnit4) PCollection(org.apache.beam.sdk.values.PCollection) Supplier(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier) Rule(org.junit.Rule) Strings(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Strings) TestPipeline(org.apache.beam.sdk.testing.TestPipeline)
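
In Example 1 the Supplier<Void> returned by signal.waitForStart(...) is created before the pipeline runs, and start.get() blocks until the start signal arrives or the timeout passes. The sketch below imitates that shape in memory with a CountDownLatch. It is not how Beam's TestPubsubSignal works internally; the class StartSignalSketch and its methods are hypothetical names used only for illustration, and java.util.function.Supplier stands in for the vendored Guava Supplier.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical in-memory stand-in for a "wait for start" signal. */
public class StartSignalSketch {
    private final CountDownLatch started = new CountDownLatch(1);

    /** Marks the start as having happened; releases any waiting supplier. */
    public void signalStart() {
        started.countDown();
    }

    /**
     * Returns a supplier whose get() blocks until signalStart() has been called,
     * or throws IllegalStateException once the timeout elapses.
     */
    public Supplier<Void> waitForStart(long timeoutMillis) {
        return () -> {
            try {
                if (!started.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    throw new IllegalStateException("Timed out waiting for start signal");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for start", e);
            }
            return null; // Void has no instances, so null is the only possible result
        };
    }

    public static void main(String[] args) {
        StartSignalSketch signal = new StartSignalSketch();
        // Obtain the supplier first, then trigger the start, then block on get().
        Supplier<Void> start = signal.waitForStart(1000);
        new Thread(signal::signalStart).start();
        start.get();
        System.out.println("start signal received");
    }
}
```

Obtaining the supplier before triggering the start mirrors the order used in the test above, where waitForStart is called before pipeline.run().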

Example 2 with Supplier

Use of org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier in the Apache Beam project.

Class BigQuerySourceBase, method createSources.

List<BoundedSource<T>> createSources(List<ResourceId> files, TableSchema schema, List<MatchResult.Metadata> metadata) throws IOException, InterruptedException {
    final String jsonSchema = BigQueryIO.JSON_FACTORY.toString(schema);
    SerializableFunction<GenericRecord, T> fnWrapper = new SerializableFunction<GenericRecord, T>() {

        private Supplier<TableSchema> schema = Suppliers.memoize(Suppliers.compose(new TableSchemaFunction(), Suppliers.ofInstance(jsonSchema)));

        @Override
        public T apply(GenericRecord input) {
            return parseFn.apply(new SchemaAndRecord(input, schema.get()));
        }
    };
    List<BoundedSource<T>> avroSources = Lists.newArrayList();
    // Use the match metadata when it is available; otherwise fall back to the file paths.
    if (metadata != null) {
        for (MatchResult.Metadata file : metadata) {
            avroSources.add(AvroSource.from(file).withParseFn(fnWrapper, getOutputCoder()));
        }
    } else {
        for (ResourceId file : files) {
            avroSources.add(AvroSource.from(file.toString()).withParseFn(fnWrapper, getOutputCoder()));
        }
    }
    return ImmutableList.copyOf(avroSources);
}
Also used : BoundedSource(org.apache.beam.sdk.io.BoundedSource) SerializableFunction(org.apache.beam.sdk.transforms.SerializableFunction) MatchResult(org.apache.beam.sdk.io.fs.MatchResult) ResourceId(org.apache.beam.sdk.io.fs.ResourceId) Supplier(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier) GenericRecord(org.apache.avro.generic.GenericRecord)
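
Example 2 wraps schema parsing in Suppliers.memoize(Suppliers.compose(..., Suppliers.ofInstance(jsonSchema))), so the JSON string is converted at most once, on the first schema.get(), and the result is reused for every record. The sketch below shows the same idea using only the JDK. The memoize and compose helpers here are simplified stand-ins written for this example, not Guava's implementation, and the "parse" step is a placeholder that returns the string length.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

public class MemoizedSupplierSketch {

    /** Returns a supplier that calls the delegate at most once and caches its result. */
    public static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private boolean initialized;
            private T value;

            @Override
            public synchronized T get() {
                if (!initialized) {
                    value = delegate.get();
                    initialized = true;
                }
                return value;
            }
        };
    }

    /** Returns a supplier that applies fn to whatever the source supplies. */
    public static <A, B> Supplier<B> compose(Function<A, B> fn, Supplier<A> source) {
        return () -> fn.apply(source.get());
    }

    public static void main(String[] args) {
        AtomicInteger parseCount = new AtomicInteger();
        Supplier<String> jsonSchema = () -> "{\"fields\":[]}";
        // Placeholder for an expensive parse step; counts how often it runs.
        Function<String, Integer> parse = json -> {
            parseCount.incrementAndGet();
            return json.length();
        };

        Supplier<Integer> schema = memoize(compose(parse, jsonSchema));
        schema.get();
        schema.get();
        System.out.println("parse calls: " + parseCount.get());
    }
}
```

Deferring the parse until the first get() also keeps the expensive object out of the enclosing function's constructor, which is presumably why the original holds a Supplier rather than the parsed TableSchema itself.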

Example 3 with Supplier

Use of org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier in the Apache Beam project.

Class FhirIOReadIT, method testFhirIORead.

@Test
public void testFhirIORead() throws Exception {
    pipeline.getOptions().as(DirectOptions.class).setBlockOnRun(false);
    FhirIO.Read.Result result = pipeline.apply(PubsubIO.readStrings().fromSubscription(pubsubSubscription)).apply(FhirIO.readResources());
    PCollection<String> resources = result.getResources();
    resources.apply("waitForAnyMessage", signal.signalSuccessWhen(resources.getCoder(), anyResources -> true));
    // wait for any resource
    Supplier<Void> start = signal.waitForStart(Duration.standardMinutes(5));
    pipeline.apply(signal.signalStart());
    PipelineResult job = pipeline.run();
    start.get();
    signal.waitForSuccess(Duration.standardMinutes(5));
    // A runner may not support cancel
    try {
        job.cancel();
    } catch (UnsupportedOperationException exc) {
        // noop
    }
}
Also used : Arrays(java.util.Arrays) TopicPath(org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.TopicPath) PipelineResult(org.apache.beam.sdk.PipelineResult) Duration(org.joda.time.Duration) RunWith(org.junit.runner.RunWith) Parameters(org.junit.runners.Parameterized.Parameters) SecureRandom(java.security.SecureRandom) Supplier(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier) TestPipeline(org.apache.beam.sdk.testing.TestPipeline) After(org.junit.After) DirectOptions(org.apache.beam.runners.direct.DirectOptions) TestPubsubSignal(org.apache.beam.sdk.io.gcp.pubsub.TestPubsubSignal) Parameterized(org.junit.runners.Parameterized) Before(org.junit.Before) PubsubClient(org.apache.beam.sdk.io.gcp.pubsub.PubsubClient) TestPubsubOptions(org.apache.beam.sdk.io.gcp.pubsub.TestPubsubOptions) Collection(java.util.Collection) SubscriptionPath(org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath) IOException(java.io.IOException) PubsubGrpcClient(org.apache.beam.sdk.io.gcp.pubsub.PubsubGrpcClient) Test(org.junit.Test) PCollection(org.apache.beam.sdk.values.PCollection) PubsubIO(org.apache.beam.sdk.io.gcp.pubsub.PubsubIO) Rule(org.junit.Rule) HEALTHCARE_DATASET_TEMPLATE(org.apache.beam.sdk.io.gcp.healthcare.HL7v2IOTestUtil.HEALTHCARE_DATASET_TEMPLATE)

Aggregations

Supplier (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier) 3
PipelineResult (org.apache.beam.sdk.PipelineResult) 2
TestPipeline (org.apache.beam.sdk.testing.TestPipeline) 2
SerializableFunction (org.apache.beam.sdk.transforms.SerializableFunction) 2
PCollection (org.apache.beam.sdk.values.PCollection) 2
Duration (org.joda.time.Duration) 2
Rule (org.junit.Rule) 2
Test (org.junit.Test) 2
RunWith (org.junit.runner.RunWith) 2
IOException (java.io.IOException) 1
SecureRandom (java.security.SecureRandom) 1
Arrays (java.util.Arrays) 1
Collection (java.util.Collection) 1
Set (java.util.Set) 1
GenericRecord (org.apache.avro.generic.GenericRecord) 1
DirectOptions (org.apache.beam.runners.direct.DirectOptions) 1
BoundedSource (org.apache.beam.sdk.io.BoundedSource) 1
MatchResult (org.apache.beam.sdk.io.fs.MatchResult) 1
ResourceId (org.apache.beam.sdk.io.fs.ResourceId) 1
HEALTHCARE_DATASET_TEMPLATE (org.apache.beam.sdk.io.gcp.healthcare.HL7v2IOTestUtil.HEALTHCARE_DATASET_TEMPLATE) 1