Example 26 with OffsetRange

Use of org.apache.beam.sdk.io.range.OffsetRange in the Apache Beam project.

Class OutputAndTimeBoundedSplittableProcessElementInvokerTest, method testInvokeProcessElementOutputDisallowedBeforeTryClaim.

@Test
public void testInvokeProcessElementOutputDisallowedBeforeTryClaim() throws Exception {
    DoFn<Void, String> brokenFn = new DoFn<Void, String>() {

        @ProcessElement
        public void process(ProcessContext c, RestrictionTracker<OffsetRange, Long> tracker) {
            c.output("foo");
        }

        @GetInitialRestriction
        public OffsetRange getInitialRestriction(@Element Void element) {
            throw new UnsupportedOperationException("Should not be called in this test");
        }
    };
    // e is presumably a JUnit ExpectedException rule declared on the test class (not shown here).
    e.expectMessage("Output is not allowed before tryClaim()");
    runTest(brokenFn, new OffsetRange(0, 5));
}
Also used: OffsetRange (org.apache.beam.sdk.io.range.OffsetRange), RestrictionTracker (org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker), DoFn (org.apache.beam.sdk.transforms.DoFn), Test (org.junit.Test)
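
The test above expects output to be rejected until a position has been claimed. The following is a minimal plain-Java sketch of that contract, not Beam's implementation: the class ClaimGuardSketch, its fields, and the choice to reset the guard after a failed claim are all hypothetical and exist only for illustration. Only the error message is taken from the test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not a Beam class.
class ClaimGuardSketch {
    private final long from; // inclusive lower bound of the range
    private final long to;   // exclusive upper bound of the range
    private boolean claimed = false;
    final List<String> outputs = new ArrayList<>();

    ClaimGuardSketch(long from, long to) {
        this.from = from;
        this.to = to;
    }

    // Returns true only when the position lies inside the half-open range [from, to).
    // Assumption of this sketch: a failed claim also closes the guard again.
    boolean tryClaim(long position) {
        claimed = position >= from && position < to;
        return claimed;
    }

    // Rejects output unless the most recent claim succeeded.
    void output(String value) {
        if (!claimed) {
            throw new IllegalStateException("Output is not allowed before tryClaim()");
        }
        outputs.add(value);
    }

    public static void main(String[] args) {
        ClaimGuardSketch guard = new ClaimGuardSketch(0, 5);
        try {
            guard.output("foo"); // no claim yet, so this is rejected
        } catch (IllegalStateException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
        if (guard.tryClaim(0)) {
            guard.output("foo"); // accepted after a successful claim
        }
        System.out.println(guard.outputs);
    }
}
```

In this sketch the guard is a single boolean; a real tracker would also have to account for ordering of claims and for splits, which are left out here.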

Example 27 with OffsetRange

Use of org.apache.beam.sdk.io.range.OffsetRange in the Apache Beam project.

Class PerSubscriptionPartitionSdfTest, method process.

@Test
@SuppressWarnings("argument.type.incompatible")
public void process() throws Exception {
    when(processor.runFor(MAX_SLEEP_TIME)).thenReturn(ProcessContinuation.resume());
    when(processorFactory.newProcessor(any(), any(), any())).thenAnswer(args -> {
        @Nonnull RestrictionTracker<OffsetRange, OffsetByteProgress> wrapped = args.getArgument(1);
        when(tracker.tryClaim(any())).thenReturn(true).thenReturn(false);
        assertTrue(wrapped.tryClaim(OffsetByteProgress.of(example(Offset.class), 123)));
        assertFalse(wrapped.tryClaim(OffsetByteProgress.of(Offset.of(333333), 123)));
        return processor;
    });
    doReturn(Optional.of(example(Offset.class))).when(processor).lastClaimed();
    assertEquals(ProcessContinuation.resume(), sdf.processElement(tracker, PARTITION, output));
    verify(processorFactory).newProcessor(eq(PARTITION), any(), eq(output));
    InOrder order = inOrder(processor);
    order.verify(processor).runFor(MAX_SLEEP_TIME);
    order.verify(processor).lastClaimed();
    InOrder order2 = inOrder(committerFactory, committer);
    order2.verify(committerFactory).apply(PARTITION);
    order2.verify(committer).commitOffset(Offset.of(example(Offset.class).value() + 1));
}
Also used: OffsetRange (org.apache.beam.sdk.io.range.OffsetRange), InOrder (org.mockito.InOrder), Nonnull (javax.annotation.Nonnull), Offset (com.google.cloud.pubsublite.Offset), Test (org.junit.Test)

Example 28 with OffsetRange

Use of org.apache.beam.sdk.io.range.OffsetRange in the Apache Beam project.

Class ReadChangeStreamPartitionRangeTrackerTest, method testTrySplitReturnsNullForInitialPartition.

@Test
public void testTrySplitReturnsNullForInitialPartition() {
    final PartitionMetadata partition = mock(PartitionMetadata.class);
    final OffsetRange range = new OffsetRange(100, 200);
    final ReadChangeStreamPartitionRangeTracker tracker = new ReadChangeStreamPartitionRangeTracker(partition, range);
    when(partition.getPartitionToken()).thenReturn(InitialPartition.PARTITION_TOKEN);
    assertNull(tracker.trySplit(0.0D));
}
Also used: OffsetRange (org.apache.beam.sdk.io.range.OffsetRange), PartitionMetadata (org.apache.beam.sdk.io.gcp.spanner.changestreams.model.PartitionMetadata), Test (org.junit.Test)

Example 29 with OffsetRange

Use of org.apache.beam.sdk.io.range.OffsetRange in the Apache Beam project.

Class BundleSplitter, method getBundleSizes.

List<OffsetRange> getBundleSizes(int desiredNumBundles, long start, long end) {
    List<OffsetRange> result = new ArrayList<>();
    double[] relativeSizes = getRelativeBundleSizes(desiredNumBundles);
    // Generate offset ranges proportional to the relative sizes.
    double s = sum(relativeSizes);
    long startOffset = start;
    double sizeSoFar = 0;
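    for (int i = 0; i < relativeSizes.length; ++i) {
        sizeSoFar += relativeSizes[i];
        // Pin the last bundle to the exact end so rounding never leaves a gap.
        long endOffset = (i == relativeSizes.length - 1) ? end : (long) (start + sizeSoFar * (end - start) / s);
        // Skip bundles that round down to an empty range.
        if (startOffset != endOffset) {
            result.add(new OffsetRange(startOffset, endOffset));
        }
        // The next bundle starts where this one ended, keeping the ranges contiguous.
        startOffset = endOffset;
    }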
    return result;
}
Also used: OffsetRange (org.apache.beam.sdk.io.range.OffsetRange), ArrayList (java.util.ArrayList)
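
The splitting logic in getBundleSizes can be tried without Beam on the classpath. The sketch below reproduces the same arithmetic under stated assumptions: the class ProportionalSplitSketch and its split method are hypothetical names, each range is a plain long[] {from, to} standing in for OffsetRange, and the weights are passed in directly instead of coming from getRelativeBundleSizes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; long[] {from, to} stands in for OffsetRange.
class ProportionalSplitSketch {

    // Splits [start, end) into contiguous half-open ranges sized in proportion to weights.
    static List<long[]> split(double[] weights, long start, long end) {
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        List<long[]> result = new ArrayList<>();
        long from = start;
        double soFar = 0;
        for (int i = 0; i < weights.length; i++) {
            soFar += weights[i];
            // The last range is pinned to end so that rounding cannot leave a gap.
            long to = (i == weights.length - 1)
                ? end
                : (long) (start + soFar * (end - start) / total);
            // Ranges that round down to zero length are dropped.
            if (from != to) {
                result.add(new long[] {from, to});
            }
            from = to;
        }
        return result;
    }

    public static void main(String[] args) {
        // Weights 1:1:2 over [0, 100) give [0, 25), [25, 50), [50, 100).
        for (long[] r : split(new double[] {1, 1, 2}, 0, 100)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Because each boundary is computed from the cumulative weight, not from the individual weight, rounding errors do not accumulate from one range to the next.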

Example 30 with OffsetRange

Use of org.apache.beam.sdk.io.range.OffsetRange in the Apache Beam project.

Class ReadChangeStreamPartitionDoFn, method initialRestriction.

/**
 * The restriction for a partition will be defined from the start and end timestamp to query the
 * partition for. These timestamps are converted to microseconds. The {@link OffsetRange}
 * restriction represents a closed-open interval, while the start / end timestamps represent a
 * closed-closed interval, so we add 1 microsecond to the end timestamp to convert it to
 * closed-open.
 *
 * <p>In this function we also update the partition state to {@link
 * PartitionMetadata.State#RUNNING}.
 *
 * @param partition the partition to be queried
 * @return the offset range from the partition start timestamp to the partition end timestamp + 1
 *     microsecond
 */
@GetInitialRestriction
public OffsetRange initialRestriction(@Element PartitionMetadata partition) {
    final String token = partition.getPartitionToken();
    final com.google.cloud.Timestamp startTimestamp = partition.getStartTimestamp();
    final long startMicros = TimestampConverter.timestampToMicros(startTimestamp);
    // Offset range represents closed-open interval
    final long endMicros = Optional.ofNullable(partition.getEndTimestamp())
        .map(TimestampConverter::timestampToMicros)
        .map(micros -> micros + 1)
        .orElse(TimestampConverter.MAX_MICROS + 1);
    final com.google.cloud.Timestamp partitionScheduledAt = partition.getScheduledAt();
    final com.google.cloud.Timestamp partitionRunningAt = daoFactory.getPartitionMetadataDao().updateToRunning(token);
    if (partitionScheduledAt != null && partitionRunningAt != null) {
        metrics.updatePartitionScheduledToRunning(
            new Duration(
                partitionScheduledAt.toSqlTimestamp().getTime(),
                partitionRunningAt.toSqlTimestamp().getTime()));
    }
    return new OffsetRange(startMicros, endMicros);
}
Also used: AttributeValue (io.opencensus.trace.AttributeValue), DaoFactory (org.apache.beam.sdk.io.gcp.spanner.changestreams.dao.DaoFactory), Manual (org.apache.beam.sdk.transforms.splittabledofn.WatermarkEstimators.Manual), ChangeStreamMetrics (org.apache.beam.sdk.io.gcp.spanner.changestreams.ChangeStreamMetrics), PartitionMetadataDao (org.apache.beam.sdk.io.gcp.spanner.changestreams.dao.PartitionMetadataDao), Duration (org.joda.time.Duration), LoggerFactory (org.slf4j.LoggerFactory), PARTITION_ID_ATTRIBUTE_LABEL (org.apache.beam.sdk.io.gcp.spanner.changestreams.ChangeStreamMetrics.PARTITION_ID_ATTRIBUTE_LABEL), TimestampConverter (org.apache.beam.sdk.io.gcp.spanner.changestreams.TimestampConverter), DataChangeRecord (org.apache.beam.sdk.io.gcp.spanner.changestreams.model.DataChangeRecord), DataChangeRecordAction (org.apache.beam.sdk.io.gcp.spanner.changestreams.action.DataChangeRecordAction), QueryChangeStreamAction (org.apache.beam.sdk.io.gcp.spanner.changestreams.action.QueryChangeStreamAction), Tracing (io.opencensus.trace.Tracing), RestrictionTracker (org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker), DoFn (org.apache.beam.sdk.transforms.DoFn), Tracer (io.opencensus.trace.Tracer), ChangeStreamDao (org.apache.beam.sdk.io.gcp.spanner.changestreams.dao.ChangeStreamDao), Logger (org.slf4j.Logger), HeartbeatRecordAction (org.apache.beam.sdk.io.gcp.spanner.changestreams.action.HeartbeatRecordAction), Scope (io.opencensus.common.Scope), UnboundedPerElement (org.apache.beam.sdk.transforms.DoFn.UnboundedPerElement), ManualWatermarkEstimator (org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator), Serializable (java.io.Serializable), ChildPartitionsRecordAction (org.apache.beam.sdk.io.gcp.spanner.changestreams.action.ChildPartitionsRecordAction), PartitionMetadata (org.apache.beam.sdk.io.gcp.spanner.changestreams.model.PartitionMetadata), Instant (org.joda.time.Instant), ReadChangeStreamPartitionRangeTracker (org.apache.beam.sdk.io.gcp.spanner.changestreams.restriction.ReadChangeStreamPartitionRangeTracker), Optional (java.util.Optional), PartitionMetadataMapper (org.apache.beam.sdk.io.gcp.spanner.changestreams.mapper.PartitionMetadataMapper), ActionFactory (org.apache.beam.sdk.io.gcp.spanner.changestreams.action.ActionFactory), MapperFactory (org.apache.beam.sdk.io.gcp.spanner.changestreams.mapper.MapperFactory), OffsetRange (org.apache.beam.sdk.io.range.OffsetRange), ChangeStreamRecordMapper (org.apache.beam.sdk.io.gcp.spanner.changestreams.mapper.ChangeStreamRecordMapper)
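
The Javadoc for initialRestriction describes converting a closed-closed timestamp interval into the closed-open interval that OffsetRange represents by adding one microsecond to the end. A minimal sketch of that conversion on plain long values follows. The class ClosedOpenSketch, the method toClosedOpen, and the maxInclusive parameter (a stand-in for TimestampConverter.MAX_MICROS) are hypothetical names introduced for illustration, and the overflow guard via Math.addExact is an addition of this sketch, not something the original method does.

```java
import java.util.OptionalLong;

// Hypothetical stand-alone sketch; long[] {from, to} stands in for OffsetRange.
class ClosedOpenSketch {

    // Converts the closed interval [startInclusive, endInclusive] into the
    // half-open interval [startInclusive, endInclusive + 1).
    // When no end is given, maxInclusive is used as the last valid position.
    static long[] toClosedOpen(long startInclusive, OptionalLong endInclusive, long maxInclusive) {
        long last = endInclusive.orElse(maxInclusive);
        // addExact throws ArithmeticException instead of silently wrapping on overflow.
        long endExclusive = Math.addExact(last, 1);
        return new long[] {startInclusive, endExclusive};
    }

    public static void main(String[] args) {
        long[] bounded = toClosedOpen(10, OptionalLong.of(20), 1000);
        long[] unbounded = toClosedOpen(10, OptionalLong.empty(), 1000);
        System.out.println("[" + bounded[0] + ", " + bounded[1] + ")");
        System.out.println("[" + unbounded[0] + ", " + unbounded[1] + ")");
    }
}
```

With a closed end of 20, position 20 must still be claimable, which is why the exclusive end becomes 21.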

Aggregations

OffsetRange (org.apache.beam.sdk.io.range.OffsetRange): 63
Test (org.junit.Test): 53
Instant (org.joda.time.Instant): 8
ArrayList (java.util.ArrayList): 5
OffsetRangeTracker (org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker): 5
Progress (org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker.Progress): 5
ProcessContinuation (org.apache.beam.sdk.transforms.DoFn.ProcessContinuation): 4
PartitionMetadata (org.apache.beam.sdk.io.gcp.spanner.changestreams.model.PartitionMetadata): 3
DoFn (org.apache.beam.sdk.transforms.DoFn): 3
BigDecimal (java.math.BigDecimal): 2
RestrictionTracker (org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker): 2
Offset (com.google.cloud.pubsublite.Offset): 1
SuppressFBWarnings (edu.umd.cs.findbugs.annotations.SuppressFBWarnings): 1
Scope (io.opencensus.common.Scope): 1
AttributeValue (io.opencensus.trace.AttributeValue): 1
Tracer (io.opencensus.trace.Tracer): 1
Tracing (io.opencensus.trace.Tracing): 1
Serializable (java.io.Serializable): 1
Map (java.util.Map): 1
Optional (java.util.Optional): 1