Example 1 with AccumulatorProvider

use of org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider in project beam by apache.

the class JoinTranslator method translate.

@Override
PCollection<KV<KeyT, OutputT>> translate(Join<LeftT, RightT, KeyT, OutputT> operator, PCollection<LeftT> left, PCollection<KV<KeyT, LeftT>> leftKeyed, PCollection<RightT> right, PCollection<KV<KeyT, RightT>> rightKeyed) {
    final AccumulatorProvider accumulators = new LazyAccumulatorProvider(AccumulatorProvider.of(leftKeyed.getPipeline()));
    final TupleTag<LeftT> leftTag = new TupleTag<>();
    final TupleTag<RightT> rightTag = new TupleTag<>();
    final JoinFn<LeftT, RightT, KeyT, OutputT> joinFn = getJoinFn(operator, leftTag, rightTag, accumulators);
    return KeyedPCollectionTuple.of(leftTag, leftKeyed).and(rightTag, rightKeyed).apply("co-group-by-key", CoGroupByKey.create()).apply(joinFn.getFnName(), ParDo.of(joinFn));
}
Also used : TupleTag(org.apache.beam.sdk.values.TupleTag) AccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider)
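The translate method above co-groups the two keyed inputs and then hands each key's grouped values to a join function. The following is a minimal, Beam-free sketch of that idea using plain collections. The names `CoGroupJoinSketch`, `Grouped`, `coGroup`, and `innerJoin` are hypothetical, and inner-join semantics are an assumption made for illustration; the actual `JoinFn` returned by `getJoinFn` is not shown in the source and may behave differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical, Beam-free illustration of "co-group by key, then join". */
final class CoGroupJoinSketch {

  /** Values from both sides that share one key. */
  static final class Grouped<L, R> {
    final List<L> left = new ArrayList<>();
    final List<R> right = new ArrayList<>();
  }

  /** Step 1: bucket both keyed inputs under their key (the co-group step). */
  static <K, L, R> Map<K, Grouped<L, R>> coGroup(
      List<Map.Entry<K, L>> leftKeyed, List<Map.Entry<K, R>> rightKeyed) {
    Map<K, Grouped<L, R>> groups = new LinkedHashMap<>();
    for (Map.Entry<K, L> e : leftKeyed) {
      groups.computeIfAbsent(e.getKey(), k -> new Grouped<>()).left.add(e.getValue());
    }
    for (Map.Entry<K, R> e : rightKeyed) {
      groups.computeIfAbsent(e.getKey(), k -> new Grouped<>()).right.add(e.getValue());
    }
    return groups;
  }

  /** Step 2: per key, emit one output per left/right pair (assumed inner join). */
  static <K, L, R, O> List<Map.Entry<K, O>> innerJoin(
      Map<K, Grouped<L, R>> groups, BiFunction<L, R, O> joiner) {
    List<Map.Entry<K, O>> out = new ArrayList<>();
    groups.forEach((key, g) -> {
      for (L l : g.left) {
        for (R r : g.right) {
          out.add(Map.entry(key, joiner.apply(l, r)));
        }
      }
    });
    return out;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> left = List.of(Map.entry("a", 1), Map.entry("b", 2));
    List<Map.Entry<String, String>> right = List.of(Map.entry("a", "x"), Map.entry("a", "y"));
    // Key "b" has no right-hand match, so an inner join drops it.
    System.out.println(innerJoin(coGroup(left, right), (l, r) -> l + r));
  }
}
```

In the demo, key "a" produces two joined pairs, while key "b" produces none because it has no right-hand values.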

Example 2 with AccumulatorProvider

use of org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider in project beam by apache.

the class ReduceByKeyTranslator method translate.

@Override
public PCollection<KV<KeyT, OutputT>> translate(ReduceByKey<InputT, KeyT, ValueT, ?, OutputT> operator, PCollectionList<InputT> inputs) {
    // TODO: Could we even do value sorting in Beam? And do we want it?
    checkState(!operator.getValueComparator().isPresent(), "Values sorting is not supported.");
    final UnaryFunction<InputT, KeyT> keyExtractor = operator.getKeyExtractor();
    final UnaryFunction<InputT, ValueT> valueExtractor = operator.getValueExtractor();
    final PCollection<InputT> input = operator.getWindow().map(window -> PCollectionLists.getOnlyElement(inputs).apply(window)).orElseGet(() -> PCollectionLists.getOnlyElement(inputs));
    // ~ create key & value extractor
    final MapElements<InputT, KV<KeyT, ValueT>> extractor = MapElements.via(new KeyValueExtractor<>(keyExtractor, valueExtractor));
    final PCollection<KV<KeyT, ValueT>> extracted = input.apply("extract-keys", extractor).setTypeDescriptor(TypeDescriptors.kvs(TypeAwareness.orObjects(operator.getKeyType()), TypeAwareness.orObjects(operator.getValueType())));
    final AccumulatorProvider accumulators = new LazyAccumulatorProvider(AccumulatorProvider.of(inputs.getPipeline()));
    if (operator.isCombinable()) {
        // if the operator is combinable, we can process it in a more efficient way
        @SuppressWarnings("unchecked") final PCollection combined;
        if (operator.isCombineFnStyle()) {
            combined = extracted.apply("combine", Combine.perKey(asCombineFn(operator)));
        } else {
            combined = extracted.apply("combine", Combine.perKey(asCombiner(operator.getReducer(), accumulators, operator.getName().orElse(null))));
        }
        @SuppressWarnings("unchecked") final PCollection<KV<KeyT, OutputT>> cast = (PCollection) combined;
        return cast.setTypeDescriptor(operator.getOutputType().orElseThrow(() -> new IllegalStateException("Unable to infer output type descriptor.")));
    }
    return extracted.apply("group", GroupByKey.create()).setTypeDescriptor(TypeDescriptors.kvs(TypeAwareness.orObjects(operator.getKeyType()), TypeDescriptors.iterables(TypeAwareness.orObjects(operator.getValueType())))).apply("reduce", ParDo.of(new ReduceDoFn<>(operator.getReducer(), accumulators, operator.getName().orElse(null)))).setTypeDescriptor(operator.getOutputType().orElseThrow(() -> new IllegalStateException("Unable to infer output type descriptor.")));
}
Also used : KV(org.apache.beam.sdk.values.KV) TypeDescriptor(org.apache.beam.sdk.values.TypeDescriptor) CoderRegistry(org.apache.beam.sdk.coders.CoderRegistry) Combine(org.apache.beam.sdk.transforms.Combine) Coder(org.apache.beam.sdk.coders.Coder) BinaryFunction(org.apache.beam.sdk.extensions.euphoria.core.client.functional.BinaryFunction) SerializableFunction(org.apache.beam.sdk.transforms.SerializableFunction) SimpleFunction(org.apache.beam.sdk.transforms.SimpleFunction) PCollectionList(org.apache.beam.sdk.values.PCollectionList) Objects.requireNonNull(java.util.Objects.requireNonNull) AdaptableCollector(org.apache.beam.sdk.extensions.euphoria.core.translate.collector.AdaptableCollector) StreamSupport(java.util.stream.StreamSupport) ReduceFunctor(org.apache.beam.sdk.extensions.euphoria.core.client.functional.ReduceFunctor) SingleValueCollector(org.apache.beam.sdk.extensions.euphoria.core.translate.collector.SingleValueCollector) Nullable(org.checkerframework.checker.nullness.qual.Nullable) ReduceByKey(org.apache.beam.sdk.extensions.euphoria.core.client.operator.ReduceByKey) DoFn(org.apache.beam.sdk.transforms.DoFn) MapElements(org.apache.beam.sdk.transforms.MapElements) CannotProvideCoderException(org.apache.beam.sdk.coders.CannotProvideCoderException) VoidFunction(org.apache.beam.sdk.extensions.euphoria.core.client.functional.VoidFunction) GroupByKey(org.apache.beam.sdk.transforms.GroupByKey) TypeAwareness(org.apache.beam.sdk.extensions.euphoria.core.client.type.TypeAwareness) UnaryFunction(org.apache.beam.sdk.extensions.euphoria.core.client.functional.UnaryFunction) PCollection(org.apache.beam.sdk.values.PCollection) CombinableBinaryFunction(org.apache.beam.sdk.extensions.euphoria.core.client.functional.CombinableBinaryFunction) ParDo(org.apache.beam.sdk.transforms.ParDo) Preconditions.checkState(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState) TypeDescriptors(org.apache.beam.sdk.values.TypeDescriptors) 
PCollectionLists(org.apache.beam.sdk.extensions.euphoria.core.client.util.PCollectionLists) AccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider)
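The ReduceByKey translation above takes one of two routes: a per-key combine when the operator is combinable, or a group-by-key followed by a reducer over each key's values. The sketch below contrasts the two routes on plain Java collections rather than Beam transforms. `ReduceByKeySketch`, `combinePerKey`, and `groupThenReduce` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical, Beam-free contrast of "combine per key" and "group, then reduce". */
final class ReduceByKeySketch {

  /** Combinable route: fold each value into a running per-key result. */
  static <K, V> Map<K, V> combinePerKey(
      List<Map.Entry<K, V>> input, BinaryOperator<V> combiner) {
    Map<K, V> result = new LinkedHashMap<>();
    for (Map.Entry<K, V> e : input) {
      // Only one partial result per key is held at any time.
      result.merge(e.getKey(), e.getValue(), combiner);
    }
    return result;
  }

  /** General route: collect all values per key, then reduce each group once. */
  static <K, V, O> Map<K, O> groupThenReduce(
      List<Map.Entry<K, V>> input, Function<List<V>, O> reducer) {
    Map<K, List<V>> groups = new LinkedHashMap<>();
    for (Map.Entry<K, V> e : input) {
      groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
    }
    Map<K, O> result = new LinkedHashMap<>();
    groups.forEach((key, values) -> result.put(key, reducer.apply(values)));
    return result;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> input =
        List.of(Map.entry("a", 1), Map.entry("b", 5), Map.entry("a", 2));
    System.out.println(combinePerKey(input, Integer::sum));
    System.out.println(
        groupThenReduce(input, values -> values.stream().mapToInt(Integer::intValue).sum()));
  }
}
```

Both calls in the demo yield the same per-key sums. The combine route never holds a key's full value list, which is one plausible reason the source comment calls that path more efficient.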

Example 3 with AccumulatorProvider

use of org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider in project beam by apache.

the class SingleValueCollectorTest method testBasicAccumulatorsAccess.

@Test
public void testBasicAccumulatorsAccess() {
    final AccumulatorProvider accumulators = accumulatorFactory.create();
    SingleValueCollector collector = new SingleValueCollector(accumulators, "test-no_op_name");
    Counter counter = collector.getCounter(TEST_COUNTER_NAME);
    Assert.assertNotNull(counter);
    Histogram histogram = collector.getHistogram(TEST_HISTOGRAM_NAME);
    Assert.assertNotNull(histogram);
    // collector.getTimer() <- not yet supported
}
Also used : Histogram(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Histogram) Counter(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Counter) SingleJvmAccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.testkit.accumulators.SingleJvmAccumulatorProvider) AccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider) Test(org.junit.Test)

Example 4 with AccumulatorProvider

use of org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider in project beam by apache.

the class FlatMapTranslator method translate.

@Override
public PCollection<OutputT> translate(FlatMap<InputT, OutputT> operator, PCollectionList<InputT> inputs) {
    final AccumulatorProvider accumulators = new LazyAccumulatorProvider(AccumulatorProvider.of(inputs.getPipeline()));
    final Mapper<InputT, OutputT> mapper = new Mapper<>(operator.getName().orElse(null), operator.getFunctor(), accumulators, operator.getEventTimeExtractor().orElse(null), operator.getAllowedTimestampSkew());
    return PCollectionLists.getOnlyElement(inputs).apply("mapper", ParDo.of(mapper)).setTypeDescriptor(TypeAwareness.orObjects(operator.getOutputType()));
}
Also used : AccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider)
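The translators above wrap the pipeline's provider in a `LazyAccumulatorProvider`, whose source is not shown here. Judging by its name alone, it presumably defers creating or resolving the underlying provider until an accumulator is first requested; that is an assumption. The sketch below shows only that deferred-initialization pattern. `CounterProvider`, `MapCounterProvider`, and `LazyCounterProvider` are hypothetical stand-ins, not Beam classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical stand-in for an accumulator provider; not the Beam interface. */
interface CounterProvider {
  AtomicLong getCounter(String name);
}

/** Eager implementation that keeps one counter per name. */
final class MapCounterProvider implements CounterProvider {
  private final Map<String, AtomicLong> counters = new HashMap<>();

  @Override
  public AtomicLong getCounter(String name) {
    return counters.computeIfAbsent(name, n -> new AtomicLong());
  }
}

/** Wrapper that builds its delegate only when a counter is first requested. */
final class LazyCounterProvider implements CounterProvider {
  private final Supplier<CounterProvider> factory;
  private CounterProvider delegate; // stays null until first use

  LazyCounterProvider(Supplier<CounterProvider> factory) {
    this.factory = factory;
  }

  @Override
  public synchronized AtomicLong getCounter(String name) {
    if (delegate == null) {
      delegate = factory.get(); // assumed behavior: create on first access
    }
    return delegate.getCounter(name);
  }

  synchronized boolean isInitialized() {
    return delegate != null;
  }
}

final class LazyProviderDemo {
  public static void main(String[] args) {
    LazyCounterProvider lazy = new LazyCounterProvider(MapCounterProvider::new);
    System.out.println("initialized before use: " + lazy.isInitialized());
    lazy.getCounter("elements").incrementAndGet();
    System.out.println("initialized after use: " + lazy.isInitialized());
  }
}
```

Constructing the wrapper is cheap and has no side effects, so it can be created while a pipeline is being assembled and only touch the real provider once processing starts.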

Example 5 with AccumulatorProvider

use of org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider in project beam by apache.

the class SingleJvmAccumulatorProviderTest method testBasicAccumulatorsFunction.

@Test
public void testBasicAccumulatorsFunction() {
    final AccumulatorProvider accumulators = accFactory.create();
    Counter counter = accumulators.getCounter(TEST_COUNTER_NAME);
    Assert.assertNotNull(counter);
    counter.increment();
    counter.increment(2);
    Map<String, Long> counterSnapshots = accFactory.getCounterSnapshots();
    long counterValue = counterSnapshots.get(TEST_COUNTER_NAME);
    Assert.assertEquals(3L, counterValue);
    Histogram histogram = accumulators.getHistogram(TEST_HISTOGRAM_NAME);
    Assert.assertNotNull(histogram);
    histogram.add(1);
    histogram.add(2, 2);
    Map<String, Map<Long, Long>> histogramSnapshots = accFactory.getHistogramSnapshots();
    Map<Long, Long> histogramValue = histogramSnapshots.get(TEST_HISTOGRAM_NAME);
    long numOfValuesOfOne = histogramValue.get(1L);
    Assert.assertEquals(1L, numOfValuesOfOne);
    long numOfValuesOfTwo = histogramValue.get(2L);
    Assert.assertEquals(2L, numOfValuesOfTwo);
    // collector.getTimer() <- not yet supported
}
Also used : Histogram(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Histogram) Counter(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Counter) Map(java.util.Map) SingleJvmAccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.testkit.accumulators.SingleJvmAccumulatorProvider) AccumulatorProvider(org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider) Test(org.junit.Test)
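The test above implies a simple contract: counters accumulate increments, histograms count how often each value was added (`add(2, 2)` records the value 2 twice), and both can be read back as snapshots keyed by name. The following is a minimal in-memory sketch of that contract. `InMemoryAccumulators` and its nested `Counter` and `Histogram` classes are hypothetical and are not the Beam `SingleJvmAccumulatorProvider` implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical single-JVM accumulator store mirroring the behavior the test checks. */
final class InMemoryAccumulators {

  static final class Counter {
    private long value;

    void increment() {
      increment(1);
    }

    void increment(long by) {
      value += by;
    }

    long get() {
      return value;
    }
  }

  static final class Histogram {
    private final Map<Long, Long> buckets = new TreeMap<>();

    void add(long value) {
      add(value, 1);
    }

    /** Records {@code value} as having occurred {@code times} times. */
    void add(long value, long times) {
      buckets.merge(value, times, Long::sum);
    }

    Map<Long, Long> snapshot() {
      return new TreeMap<>(buckets); // defensive copy
    }
  }

  private final Map<String, Counter> counters = new HashMap<>();
  private final Map<String, Histogram> histograms = new HashMap<>();

  Counter getCounter(String name) {
    return counters.computeIfAbsent(name, n -> new Counter());
  }

  Histogram getHistogram(String name) {
    return histograms.computeIfAbsent(name, n -> new Histogram());
  }

  /** Current counter values by name. */
  Map<String, Long> getCounterSnapshots() {
    Map<String, Long> out = new HashMap<>();
    counters.forEach((name, counter) -> out.put(name, counter.get()));
    return out;
  }

  /** Current value-to-occurrence maps by histogram name. */
  Map<String, Map<Long, Long>> getHistogramSnapshots() {
    Map<String, Map<Long, Long>> out = new HashMap<>();
    histograms.forEach((name, histogram) -> out.put(name, histogram.snapshot()));
    return out;
  }
}
```

Replaying the test's calls against this sketch gives a counter snapshot of 3 and histogram occurrences of 1 for the value 1 and 2 for the value 2, matching the assertions in the example.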

Aggregations

AccumulatorProvider (org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.AccumulatorProvider): 6
Counter (org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Counter): 3
Histogram (org.apache.beam.sdk.extensions.euphoria.core.client.accumulators.Histogram): 3
SingleJvmAccumulatorProvider (org.apache.beam.sdk.extensions.euphoria.core.testkit.accumulators.SingleJvmAccumulatorProvider): 3
Test (org.junit.Test): 3
Map (java.util.Map): 2
Objects.requireNonNull (java.util.Objects.requireNonNull): 1
StreamSupport (java.util.stream.StreamSupport): 1
CannotProvideCoderException (org.apache.beam.sdk.coders.CannotProvideCoderException): 1
Coder (org.apache.beam.sdk.coders.Coder): 1
CoderRegistry (org.apache.beam.sdk.coders.CoderRegistry): 1
BinaryFunction (org.apache.beam.sdk.extensions.euphoria.core.client.functional.BinaryFunction): 1
CombinableBinaryFunction (org.apache.beam.sdk.extensions.euphoria.core.client.functional.CombinableBinaryFunction): 1
ReduceFunctor (org.apache.beam.sdk.extensions.euphoria.core.client.functional.ReduceFunctor): 1
UnaryFunction (org.apache.beam.sdk.extensions.euphoria.core.client.functional.UnaryFunction): 1
VoidFunction (org.apache.beam.sdk.extensions.euphoria.core.client.functional.VoidFunction): 1
ReduceByKey (org.apache.beam.sdk.extensions.euphoria.core.client.operator.ReduceByKey): 1
TypeAwareness (org.apache.beam.sdk.extensions.euphoria.core.client.type.TypeAwareness): 1
PCollectionLists (org.apache.beam.sdk.extensions.euphoria.core.client.util.PCollectionLists): 1
AdaptableCollector (org.apache.beam.sdk.extensions.euphoria.core.translate.collector.AdaptableCollector): 1