
Example 41 with PCollection

Use of org.apache.beam.sdk.values.PCollection in project beam by apache.

From the class GroupByKeyTranslatorBatch, method translateNode.

@Override
public void translateNode(GroupByKey<K, V> transform, Twister2BatchTranslationContext context) {
    PCollection<KV<K, V>> input = context.getInput(transform);
    BatchTSetImpl<WindowedValue<KV<K, V>>> inputTSet = context.getInputDataSet(input);
    final KvCoder<K, V> coder = (KvCoder<K, V>) input.getCoder();
    Coder<K> inputKeyCoder = coder.getKeyCoder();
    WindowingStrategy windowingStrategy = input.getWindowingStrategy();
    WindowFn<KV<K, V>, BoundedWindow> windowFn =
        (WindowFn<KV<K, V>, BoundedWindow>) windowingStrategy.getWindowFn();
    final WindowedValue.WindowedValueCoder<V> wvCoder =
        WindowedValue.FullWindowedValueCoder.of(coder.getValueCoder(), windowFn.windowCoder());
    // Serialize each element to a (key bytes, windowed-value bytes) tuple so Twister2 can shuffle it.
    KeyedTSet<byte[], byte[]> keyedTSet =
        inputTSet.mapToTuple(new MapToTupleFunction<K, V>(inputKeyCoder, wvCoder));
    // TODO: add support for specifying a partition function; that would use
    // keyedPartition instead of keyedGather.
    ComputeTSet<KV<K, Iterable<WindowedValue<V>>>, Iterator<Tuple<byte[], Iterator<byte[]>>>> groupedByKeyTSet =
        keyedTSet.keyedGather().map(new ByteToWindowFunction(inputKeyCoder, wvCoder));
    // Now also group by window.
    SystemReduceFnBuffering reduceFnBuffering = new SystemReduceFnBuffering(coder.getValueCoder());
    ComputeTSet<WindowedValue<KV<K, Iterable<V>>>, Iterable<KV<K, Iterator<WindowedValue<V>>>>> outputTSet =
        groupedByKeyTSet.direct().<WindowedValue<KV<K, Iterable<V>>>>flatmap(
            new GroupByWindowFunction(windowingStrategy, reduceFnBuffering, context.getOptions()));
    PCollection<KV<K, Iterable<V>>> output = context.getOutput(transform);
    context.setOutputDataSet(output, outputTSet);
}
Also used:
import java.util.Iterator;
import org.apache.beam.runners.twister2.translators.functions.ByteToWindowFunction;
import org.apache.beam.runners.twister2.translators.functions.GroupByWindowFunction;
import org.apache.beam.runners.twister2.translators.functions.internal.SystemReduceFnBuffering;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.apache.beam.sdk.transforms.windowing.WindowFn;
import org.apache.beam.sdk.util.WindowedValue;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.WindowingStrategy;
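
For context, a minimal, self-contained sketch of the user-facing pipeline shape this translator handles; the keys and values below are illustrative, not taken from the Beam sources.

import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class GroupByKeyExample {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        // A small keyed input; on the Twister2 runner, the GroupByKey below is
        // translated by GroupByKeyTranslatorBatch.translateNode.
        PCollection<KV<String, Integer>> input = pipeline.apply(
            Create.of(Arrays.asList(KV.of("a", 1), KV.of("a", 2), KV.of("b", 3))));
        // Groups values by key and, per the translation above, also by window.
        PCollection<KV<String, Iterable<Integer>>> grouped =
            input.apply(GroupByKey.<String, Integer>create());
        pipeline.run().waitUntilFinish();
    }
}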

Example 42 with PCollection

Use of org.apache.beam.sdk.values.PCollection in project beam by apache.

From the class FlattenTranslatorBatch, method translateNode.

@Override
public void translateNode(Flatten.PCollections<T> transform, Twister2BatchTranslationContext context) {
    Collection<PCollection<?>> pcs = context.getInputs().values();
    List<BatchTSetImpl<WindowedValue<T>>> tSets = new ArrayList<>();
    BatchTSetImpl<WindowedValue<T>> unionTSet;
    if (pcs.isEmpty()) {
        // Flattening zero inputs still has to produce an output, so emit an empty source.
        final TSetEnvironment tsetEnv = context.getEnvironment();
        unionTSet = ((BatchTSetEnvironment) tsetEnv)
            .createSource(new Twister2EmptySource(), context.getOptions().getParallelism());
    } else {
        for (PValue pc : pcs) {
            BatchTSetImpl<WindowedValue<T>> curr = context.getInputDataSet(pc);
            tSets.add(curr);
        }
        // Union the first TSet with the rest; a single input needs no union.
        BatchTSetImpl<WindowedValue<T>> first = tSets.remove(0);
        Collection<TSet<WindowedValue<T>>> others = new ArrayList<>(tSets);
        if (!others.isEmpty()) {
            unionTSet = first.union(others);
        } else {
            unionTSet = first;
        }
    }
    context.setOutputDataSet(context.getOutput(transform), unionTSet);
}
Also used:
import edu.iu.dsc.tws.api.tset.sets.TSet;
import edu.iu.dsc.tws.tset.env.BatchTSetEnvironment;
import edu.iu.dsc.tws.tset.env.TSetEnvironment;
import edu.iu.dsc.tws.tset.sets.batch.BatchTSetImpl;
import java.util.ArrayList;
import org.apache.beam.runners.twister2.translation.wrappers.Twister2EmptySource;
import org.apache.beam.sdk.util.WindowedValue;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PValue;
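
For context, a minimal sketch of the user-facing Flatten this translator handles; the two sources and their contents are illustrative.

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

public class FlattenExample {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        PCollection<String> first = pipeline.apply("first", Create.of("a", "b"));
        PCollection<String> second = pipeline.apply("second", Create.of("c"));
        // Flatten merges the PCollections; the translator above turns this into a
        // TSet union (or an empty source when the input list is empty).
        PCollection<String> merged =
            PCollectionList.of(first).and(second).apply(Flatten.pCollections());
        pipeline.run().waitUntilFinish();
    }
}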

Example 43 with PCollection

Use of org.apache.beam.sdk.values.PCollection in project beam by apache.

From the class ExpansionService, method expand.

@VisibleForTesting
/*package*/
ExpansionApi.ExpansionResponse expand(ExpansionApi.ExpansionRequest request) {
    LOG.info("Expanding '{}' with URN '{}'", request.getTransform().getUniqueName(), request.getTransform().getSpec().getUrn());
    LOG.debug("Full transform: {}", request.getTransform());
    Set<String> existingTransformIds = request.getComponents().getTransformsMap().keySet();
    Pipeline pipeline = createPipeline();
    boolean isUseDeprecatedRead = ExperimentalOptions.hasExperiment(pipelineOptions, "use_deprecated_read") || ExperimentalOptions.hasExperiment(pipelineOptions, "beam_fn_api_use_deprecated_read");
    if (!isUseDeprecatedRead) {
        ExperimentalOptions.addExperiment(pipeline.getOptions().as(ExperimentalOptions.class), "beam_fn_api");
        // TODO(BEAM-10670): Remove this when we address performance issue.
        ExperimentalOptions.addExperiment(pipeline.getOptions().as(ExperimentalOptions.class), "use_sdf_read");
    } else {
        LOG.warn("Using use_depreacted_read in portable runners is runner-dependent. The " + "ExpansionService will respect that, but if your runner does not have support for " + "native Read transform, your Pipeline will fail during Pipeline submission.");
    }
    RehydratedComponents rehydratedComponents = RehydratedComponents.forComponents(request.getComponents()).withPipeline(pipeline);
    Map<String, PCollection<?>> inputs = request.getTransform().getInputsMap().entrySet().stream().collect(Collectors.toMap(Map.Entry::getKey, input -> {
        try {
            return rehydratedComponents.getPCollection(input.getValue());
        } catch (IOException exn) {
            throw new RuntimeException(exn);
        }
    }));
    String urn = request.getTransform().getSpec().getUrn();
    TransformProvider transformProvider = null;
    if (getUrn(ExpansionMethods.Enum.JAVA_CLASS_LOOKUP).equals(urn)) {
        AllowList allowList = pipelineOptions.as(ExpansionServiceOptions.class).getJavaClassLookupAllowlist();
        assert allowList != null;
        transformProvider = new JavaClassLookupTransformProvider(allowList);
    } else {
        transformProvider = getRegisteredTransforms().get(urn);
        if (transformProvider == null) {
            throw new UnsupportedOperationException("Unknown urn: " + request.getTransform().getSpec().getUrn());
        }
    }
    List<String> classpathResources = transformProvider.getDependencies(request.getTransform().getSpec(), pipeline.getOptions());
    pipeline.getOptions().as(PortablePipelineOptions.class).setFilesToStage(classpathResources);
    Map<String, PCollection<?>> outputs = transformProvider.apply(pipeline, request.getTransform().getUniqueName(), request.getTransform().getSpec(), inputs);
    // Needed to find which transform was new...
    SdkComponents sdkComponents = rehydratedComponents.getSdkComponents(Collections.emptyList()).withNewIdPrefix(request.getNamespace());
    sdkComponents.registerEnvironment(Environments.createOrGetDefaultEnvironment(pipeline.getOptions().as(PortablePipelineOptions.class)));
    Map<String, String> outputMap = outputs.entrySet().stream().collect(Collectors.toMap(Map.Entry::getKey, output -> {
        try {
            return sdkComponents.registerPCollection(output.getValue());
        } catch (IOException exn) {
            throw new RuntimeException(exn);
        }
    }));
    if (isUseDeprecatedRead) {
        SplittableParDo.convertReadBasedSplittableDoFnsToPrimitiveReadsIfNecessary(pipeline);
    }
    RunnerApi.Pipeline pipelineProto = PipelineTranslation.toProto(pipeline, sdkComponents);
    String expandedTransformId = Iterables.getOnlyElement(pipelineProto.getRootTransformIdsList().stream().filter(id -> !existingTransformIds.contains(id)).collect(Collectors.toList()));
    RunnerApi.Components components = pipelineProto.getComponents();
    RunnerApi.PTransform expandedTransform = components.getTransformsOrThrow(expandedTransformId).toBuilder().setUniqueName(expandedTransformId).clearOutputs().putAllOutputs(outputMap).build();
    LOG.debug("Expanded to {}", expandedTransform);
    return ExpansionApi.ExpansionResponse.newBuilder().setComponents(components.toBuilder().removeTransforms(expandedTransformId)).setTransform(expandedTransform).addAllRequirements(pipelineProto.getRequirementsList()).build();
}
Also used:
import com.google.auto.service.AutoService;
import java.io.IOException;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.Set;
import java.util.stream.Collectors;
import org.apache.beam.model.expansion.v1.ExpansionApi;
import org.apache.beam.model.expansion.v1.ExpansionServiceGrpc;
import org.apache.beam.model.pipeline.v1.ExternalTransforms.ExpansionMethods;
import org.apache.beam.model.pipeline.v1.ExternalTransforms.ExternalConfigurationPayload;
import org.apache.beam.model.pipeline.v1.RunnerApi;
import org.apache.beam.model.pipeline.v1.SchemaApi;
import org.apache.beam.runners.core.construction.Environments;
import org.apache.beam.runners.core.construction.PipelineTranslation;
import org.apache.beam.runners.core.construction.RehydratedComponents;
import org.apache.beam.runners.core.construction.SdkComponents;
import org.apache.beam.runners.core.construction.SplittableParDo;
import org.apache.beam.runners.fnexecution.artifact.ArtifactRetrievalService;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.PipelineRunner;
import org.apache.beam.sdk.coders.Coder;
import org.apache.beam.sdk.coders.RowCoder;
import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
import org.apache.beam.sdk.expansion.service.JavaClassLookupTransformProvider.AllowList;
import org.apache.beam.sdk.options.ExperimentalOptions;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.PortablePipelineOptions;
import org.apache.beam.sdk.schemas.NoSuchSchemaException;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.Schema.Field;
import org.apache.beam.sdk.schemas.SchemaCoder;
import org.apache.beam.sdk.schemas.SchemaRegistry;
import org.apache.beam.sdk.schemas.SchemaTranslation;
import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.PDone;
import org.apache.beam.sdk.values.PInput;
import org.apache.beam.sdk.values.POutput;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString;
import org.apache.beam.vendor.grpc.v1p43p2.io.grpc.Server;
import org.apache.beam.vendor.grpc.v1p43p2.io.grpc.ServerBuilder;
import org.apache.beam.vendor.grpc.v1p43p2.io.grpc.stub.StreamObserver;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.CaseFormat;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Converter;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Throwables;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
import org.checkerframework.checker.nullness.qual.MonotonicNonNull;
import org.checkerframework.checker.nullness.qual.Nullable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static org.apache.beam.runners.core.construction.BeamUrns.getUrn;
import static org.apache.beam.runners.core.construction.resources.PipelineResources.detectClassPathResourcesToStage;
import static org.apache.beam.sdk.util.Preconditions.checkArgumentNotNull;
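
For context, a minimal sketch of how the experiment flags consulted at the top of expand are set and read through PipelineOptions; the flag value shown is illustrative.

import org.apache.beam.sdk.options.ExperimentalOptions;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class ExperimentFlagSketch {
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.create();
        // Opt in to the runner-dependent primitive Read path that the warning above describes.
        ExperimentalOptions.addExperiment(options.as(ExperimentalOptions.class), "use_deprecated_read");
        // expand() performs the same check before deciding how to translate Read transforms.
        boolean useDeprecatedRead = ExperimentalOptions.hasExperiment(options, "use_deprecated_read");
        System.out.println("use_deprecated_read = " + useDeprecatedRead);
    }
}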

Example 44 with PCollection

Use of org.apache.beam.sdk.values.PCollection in project beam by apache.

From the class ReduceByKeyTranslator, method translate.

@Override
public PCollection<KV<KeyT, OutputT>> translate(ReduceByKey<InputT, KeyT, ValueT, ?, OutputT> operator, PCollectionList<InputT> inputs) {
    // todo Could we even do values sorting in Beam ? And do we want it?
    checkState(!operator.getValueComparator().isPresent(), "Values sorting is not supported.");
    final UnaryFunction<InputT, KeyT> keyExtractor = operator.getKeyExtractor();
    final UnaryFunction<InputT, ValueT> valueExtractor = operator.getValueExtractor();
    final PCollection<InputT> input = operator.getWindow().map(window -> PCollectionLists.getOnlyElement(inputs).apply(window)).orElseGet(() -> PCollectionLists.getOnlyElement(inputs));
    // ~ create key & value extractor
    final MapElements<InputT, KV<KeyT, ValueT>> extractor = MapElements.via(new KeyValueExtractor<>(keyExtractor, valueExtractor));
    final PCollection<KV<KeyT, ValueT>> extracted = input.apply("extract-keys", extractor).setTypeDescriptor(TypeDescriptors.kvs(TypeAwareness.orObjects(operator.getKeyType()), TypeAwareness.orObjects(operator.getValueType())));
    final AccumulatorProvider accumulators = new LazyAccumulatorProvider(AccumulatorProvider.of(inputs.getPipeline()));
    if (operator.isCombinable()) {
        // if operator is combinable we can process it in more efficient way
        @SuppressWarnings("unchecked") final PCollection combined;
        if (operator.isCombineFnStyle()) {
            combined = extracted.apply("combine", Combine.perKey(asCombineFn(operator)));
        } else {
            combined = extracted.apply("combine", Combine.perKey(asCombiner(operator.getReducer(), accumulators, operator.getName().orElse(null))));
        }
        @SuppressWarnings("unchecked") final PCollection<KV<KeyT, OutputT>> cast = (PCollection) combined;
        return cast.setTypeDescriptor(operator.getOutputType().orElseThrow(() -> new IllegalStateException("Unable to infer output type descriptor.")));
    }
    return extracted.apply("group", GroupByKey.create()).setTypeDescriptor(TypeDescriptors.kvs(TypeAwareness.orObjects(operator.getKeyType()), TypeDescriptors.iterables(TypeAwareness.orObjects(operator.getValueType())))).apply("reduce", ParDo.of(new ReduceDoFn<>(operator.getReducer(), accumulators, operator.getName().orElse(null)))).setTypeDescriptor(operator.getOutputType().orElseThrow(() -> new IllegalStateException("Unable to infer output type descriptor.")));
}
Also used:
import java.util.stream.StreamSupport;
import org.apache.beam.sdk.coders.CannotProvideCoderException;
import org.apache.beam.sdk.coders.Coder;
import org.apache.beam.sdk.coders.CoderRegistry;
import org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider;
import org.apache.beam.sdk.extensions.euphoria.core.client.functional.BinaryFunction;
import org.apache.beam.sdk.extensions.euphoria.core.client.functional.CombinableBinaryFunction;
import org.apache.beam.sdk.extensions.euphoria.core.client.functional.ReduceFunctor;
import org.apache.beam.sdk.extensions.euphoria.core.client.functional.UnaryFunction;
import org.apache.beam.sdk.extensions.euphoria.core.client.functional.VoidFunction;
import org.apache.beam.sdk.extensions.euphoria.core.client.operator.ReduceByKey;
import org.apache.beam.sdk.extensions.euphoria.core.client.type.TypeAwareness;
import org.apache.beam.sdk.extensions.euphoria.core.client.util.PCollectionLists;
import org.apache.beam.sdk.extensions.euphoria.core.translate.collector.AdaptableCollector;
import org.apache.beam.sdk.extensions.euphoria.core.translate.collector.SingleValueCollector;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.checkerframework.checker.nullness.qual.Nullable;
import static java.util.Objects.requireNonNull;
import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
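
For context, a minimal sketch of what the combinable fast path amounts to in plain Beam: a combinable reduce becomes Combine.perKey instead of GroupByKey plus ParDo. The input values are illustrative.

import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class CombinePerKeyExample {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        PCollection<KV<String, Integer>> extracted = pipeline.apply(
            Create.of(Arrays.asList(KV.of("a", 1), KV.of("a", 2), KV.of("b", 3))));
        // Summing is combinable, so it can run as Combine.perKey and be lifted into
        // partial aggregation before the shuffle, instead of GroupByKey + ParDo.
        PCollection<KV<String, Integer>> sums =
            extracted.apply("combine", Combine.<String, Integer, Integer>perKey(Sum.ofIntegers()));
        pipeline.run().waitUntilFinish();
    }
}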

Example 45 with PCollection

Use of org.apache.beam.sdk.values.PCollection in project beam by apache.

From the class PubsubTableProviderIT, method testSQLWithBytePayload.

@Test
public void testSQLWithBytePayload() throws Exception {
    // Prepare messages to send later
    List<PubsubMessage> messages = ImmutableList.of(objectsProvider.messageIdName(ts(1), 3, "foo"), objectsProvider.messageIdName(ts(2), 5, "bar"), objectsProvider.messageIdName(ts(3), 7, "baz"));
    String createTableString = String.format("CREATE EXTERNAL TABLE message (\n" + "event_timestamp TIMESTAMP, \n" + "attributes MAP<VARCHAR, VARCHAR>, \n" + "payload VARBINARY \n" + ") \n" + "TYPE '%s' \n" + "LOCATION '%s' \n" + "TBLPROPERTIES '{ " + "\"protoClass\" : \"%s\", " + "\"timestampAttributeKey\" : \"ts\" }'", tableProvider.getTableType(), eventsTopic.topicPath(), PayloadMessages.SimpleMessage.class.getName());
    String queryString = "SELECT message.payload AS some_bytes FROM message";
    // Initialize SQL environment and create the pubsub table
    BeamSqlEnv sqlEnv = BeamSqlEnv.inMemory(new PubsubTableProvider());
    sqlEnv.executeDdl(createTableString);
    // Apply the PTransform to query the pubsub topic
    PCollection<Row> queryOutput = query(sqlEnv, pipeline, queryString);
    // Observe the query results and send success signal after seeing the expected messages
    Schema justBytesSchema = Schema.builder().addField("some_bytes", FieldType.BYTES.withNullable(true)).build();
    Row expectedRow0 = row(justBytesSchema, (Object) messages.get(0).getPayload());
    Row expectedRow1 = row(justBytesSchema, (Object) messages.get(1).getPayload());
    Row expectedRow2 = row(justBytesSchema, (Object) messages.get(2).getPayload());
    Set<Row> expected = ImmutableSet.of(expectedRow0, expectedRow1, expectedRow2);
    queryOutput.apply("waitForSuccess", resultSignal.signalSuccessWhen(SchemaCoder.of(justBytesSchema), observedRows -> observedRows.equals(expected)));
    // Start the pipeline
    pipeline.run();
    // Block until a subscription for this topic exists
    eventsTopic.assertSubscriptionEventuallyCreated(pipeline.getOptions().as(GcpOptions.class).getProject(), Duration.standardMinutes(5));
    // Start publishing the messages once the main pipeline has started and the signaling topic is ready
    eventsTopic.publish(messages);
    // Poll the signaling topic for success message
    resultSignal.waitForSuccess(timeout);
}
Also used:
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.Collectors;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
import org.apache.beam.sdk.extensions.protobuf.PayloadMessages;
import org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv;
import org.apache.beam.sdk.extensions.sql.impl.JdbcConnection;
import org.apache.beam.sdk.extensions.sql.impl.JdbcDriver;
import org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils;
import org.apache.beam.sdk.extensions.sql.meta.provider.SchemaIOTableProviderWrapper;
import org.apache.beam.sdk.extensions.sql.meta.provider.TableProvider;
import org.apache.beam.sdk.extensions.sql.meta.store.InMemoryMetaStore;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
import org.apache.beam.sdk.io.gcp.pubsub.TestPubsubSignal;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.Schema.FieldType;
import org.apache.beam.sdk.schemas.SchemaCoder;
import org.apache.beam.sdk.schemas.utils.AvroUtils;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.util.common.ReflectHelpers;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.vendor.calcite.v1_28_0.com.google.common.collect.ImmutableList;
import org.apache.beam.vendor.calcite.v1_28_0.com.google.common.collect.ImmutableMap;
import org.apache.beam.vendor.calcite.v1_28_0.com.google.common.collect.ImmutableSet;
import org.apache.beam.vendor.calcite.v1_28_0.org.apache.calcite.jdbc.CalciteConnection;
import org.hamcrest.Matcher;
import org.joda.time.Duration;
import org.joda.time.Instant;
import org.junit.Ignore;
import org.junit.Rule;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameter;
import org.junit.runners.Parameterized.Parameters;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static java.nio.charset.StandardCharsets.UTF_8;
import static org.apache.beam.sdk.testing.JsonMatcher.jsonBytesLike;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.allOf;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.hasEntry;
import static org.hamcrest.Matchers.hasProperty;
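
For context, a minimal sketch of how the expected rows above are constructed from a schema; it mirrors the shape of justBytesSchema but is a standalone illustration, not the test's row() helper.

import java.nio.charset.StandardCharsets;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.Schema.FieldType;
import org.apache.beam.sdk.values.Row;

public class RowSketch {
    public static void main(String[] args) {
        // One nullable VARBINARY column, like the query's some_bytes projection.
        Schema justBytesSchema =
            Schema.builder().addField("some_bytes", FieldType.BYTES.withNullable(true)).build();
        // Row.withSchema populates fields positionally; the payload bytes are illustrative.
        Row expectedRow = Row.withSchema(justBytesSchema)
            .addValue("foo".getBytes(StandardCharsets.UTF_8))
            .build();
        System.out.println(expectedRow);
    }
}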

Aggregations

PCollection (org.apache.beam.sdk.values.PCollection): 198
Test (org.junit.Test): 133
TestPipeline (org.apache.beam.sdk.testing.TestPipeline): 61
KV (org.apache.beam.sdk.values.KV): 61
Map (java.util.Map): 59
List (java.util.List): 58
Rule (org.junit.Rule): 57
RunWith (org.junit.runner.RunWith): 54
PAssert (org.apache.beam.sdk.testing.PAssert): 52
Instant (org.joda.time.Instant): 46
Duration (org.joda.time.Duration): 45
JUnit4 (org.junit.runners.JUnit4): 45
ParDo (org.apache.beam.sdk.transforms.ParDo): 44
TupleTag (org.apache.beam.sdk.values.TupleTag): 42
Pipeline (org.apache.beam.sdk.Pipeline): 41
Create (org.apache.beam.sdk.transforms.Create): 41
ArrayList (java.util.ArrayList): 40
Serializable (java.io.Serializable): 39
PTransform (org.apache.beam.sdk.transforms.PTransform): 37
Row (org.apache.beam.sdk.values.Row): 37