Use of org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartition in project beam by apache.
The class ChildPartitionsRecordAction, method run.
/**
 * This is the main processing function for a {@link ChildPartitionsRecord}. It returns an {@link
 * Optional} of {@link ProcessContinuation} indicating whether the caller should stop processing.
 * If the returned {@link Optional} is empty, the caller can continue processing. If an {@link
 * Optional} containing {@link ProcessContinuation#stop()} is returned, this function was unable
 * to claim the timestamp of the {@link ChildPartitionsRecord}, so the caller should stop.
*
 * <p>When processing the {@link ChildPartitionsRecord}, the following procedure is applied:
*
* <ol>
* <li>We try to claim the child partition record timestamp. If it is not possible, we stop here
* and return.
* <li>We update the watermark to the child partition record timestamp.
 * <li>For each child partition, we try to insert it into the metadata table if it does not
 * already exist.
 * <li>For each child partition, we check whether it originates from a split or a merge and
 * increment the corresponding metric.
* </ol>
*
 * <p>Handling of partition split and merge cases is detailed below:
*
* <ul>
 * <li>Partition Splits: child partition tokens should not exist in the partition metadata
 * table, so new rows are simply added to it. In case of a bundle retry, duplicate entries
 * are silently ignored.
 * <li>Partition Merges: the first parent partition that receives the child token succeeds in
 * inserting it; the remaining parents silently skip the insertion.
* </ul>
*
* @param partition the current partition being processed
* @param record the change stream child partition record received
* @param tracker the restriction tracker of the {@link
* org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn} SDF
* @param watermarkEstimator the watermark estimator of the {@link
* org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn} SDF
 * @return {@link Optional#empty()} if the caller can continue processing more records; a
 *     non-empty {@link Optional} containing {@link ProcessContinuation#stop()} if this function
 *     was unable to claim the {@link ChildPartitionsRecord} timestamp
*/
@VisibleForTesting
public Optional<ProcessContinuation> run(
    PartitionMetadata partition,
    ChildPartitionsRecord record,
    RestrictionTracker<OffsetRange, Long> tracker,
    ManualWatermarkEstimator<Instant> watermarkEstimator) {

  final String token = partition.getPartitionToken();
  try (Scope scope =
      TRACER.spanBuilder("ChildPartitionsRecordAction").setRecordEvents(true).startScopedSpan()) {
    TRACER
        .getCurrentSpan()
        .putAttribute(PARTITION_ID_ATTRIBUTE_LABEL, AttributeValue.stringAttributeValue(token));

    LOG.debug("[" + token + "] Processing child partition record " + record);

    final Timestamp startTimestamp = record.getStartTimestamp();
    final Instant startInstant = new Instant(startTimestamp.toSqlTimestamp().getTime());
    final long startMicros = TimestampConverter.timestampToMicros(startTimestamp);
    if (!tracker.tryClaim(startMicros)) {
      LOG.debug(
          "[" + token + "] Could not claim queryChangeStream(" + startTimestamp + "), stopping");
      return Optional.of(ProcessContinuation.stop());
    }
    watermarkEstimator.setWatermark(startInstant);

    for (ChildPartition childPartition : record.getChildPartitions()) {
      processChildPartition(partition, record, childPartition);
    }

    LOG.debug("[" + token + "] Child partitions action completed successfully");
    return Optional.empty();
  }
}
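To make the contract above concrete, here is a minimal caller sketch. The processRecords helper and its surrounding loop are hypothetical; in Beam, the real orchestration lives in the change stream action classes invoked by the ReadChangeStreamPartitionDoFn SDF.

// Hypothetical caller sketch: stop as soon as any record action requests it,
// otherwise resume the partition after the whole batch is processed.
private ProcessContinuation processRecords(
    List<ChildPartitionsRecord> records,
    PartitionMetadata partition,
    RestrictionTracker<OffsetRange, Long> tracker,
    ManualWatermarkEstimator<Instant> watermarkEstimator) {
  for (ChildPartitionsRecord record : records) {
    final Optional<ProcessContinuation> maybeContinuation =
        action.run(partition, record, tracker, watermarkEstimator);
    if (maybeContinuation.isPresent()) {
      // run() could not claim the record timestamp; propagate stop() to the runner.
      return maybeContinuation.get();
    }
  }
  // All records processed; ask the runner to resume this partition later.
  return ProcessContinuation.resume();
}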
Use of org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartition in project beam by apache.
The class ChildPartitionsRecordAction, method processChildPartition.
// Unboxing of runInTransaction result will not produce a null value, we can ignore it
@SuppressWarnings("nullness")
private void processChildPartition(
    PartitionMetadata partition, ChildPartitionsRecord record, ChildPartition childPartition) {

  try (Scope scope =
      TRACER
          .spanBuilder("ChildPartitionsRecordAction.processChildPartition")
          .setRecordEvents(true)
          .startScopedSpan()) {
    TRACER
        .getCurrentSpan()
        .putAttribute(
            PARTITION_ID_ATTRIBUTE_LABEL,
            AttributeValue.stringAttributeValue(partition.getPartitionToken()));

    final String partitionToken = partition.getPartitionToken();
    final String childPartitionToken = childPartition.getToken();
    final boolean isSplit = isSplit(childPartition);
    LOG.debug(
        "[" + partitionToken + "] Processing child partition"
            + (isSplit ? " split" : " merge") + " event");

    final PartitionMetadata row =
        toPartitionMetadata(
            record.getStartTimestamp(),
            partition.getEndTimestamp(),
            partition.getHeartbeatMillis(),
            childPartition);
    LOG.debug("[" + partitionToken + "] Inserting child partition token " + childPartitionToken);
    final Boolean insertedRow =
        partitionMetadataDao
            .runInTransaction(
                transaction -> {
                  if (transaction.getPartition(childPartitionToken) == null) {
                    transaction.insert(row);
                    return true;
                  } else {
                    return false;
                  }
                })
            .getResult();
    if (insertedRow && isSplit) {
      metrics.incPartitionRecordSplitCount();
    } else if (insertedRow) {
      metrics.incPartitionRecordMergeCount();
    } else {
      LOG.debug(
          "[" + partitionToken + "] Child token " + childPartitionToken
              + " already exists, skipping...");
    }
  }
}
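The snippet above calls an isSplit helper that is not shown. A plausible implementation, under the assumption that in Spanner change streams a child produced by a split carries exactly one parent token while a child produced by a merge carries the tokens of all merged parents:

// Sketch of the helper used above (an assumption, not shown in the snippet):
// a split child has exactly one parent token, a merge child has several.
private boolean isSplit(ChildPartition childPartition) {
  return childPartition.getParentTokens().size() == 1;
}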
Use of org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartition in project beam by apache.
The class ChildPartitionsRecordActionTest, method testRestrictionClaimedAndIsSplitCaseAndChildExists.
@Test
public void testRestrictionClaimedAndIsSplitCaseAndChildExists() {
  final String partitionToken = "partitionToken";
  final long heartbeat = 30L;
  final Timestamp startTimestamp = Timestamp.ofTimeMicroseconds(10L);
  final Timestamp endTimestamp = Timestamp.ofTimeMicroseconds(20L);
  final PartitionMetadata partition = mock(PartitionMetadata.class);
  final ChildPartitionsRecord record =
      new ChildPartitionsRecord(
          startTimestamp,
          "recordSequence",
          Arrays.asList(
              new ChildPartition("childPartition1", partitionToken),
              new ChildPartition("childPartition2", partitionToken)),
          null);
  when(partition.getEndTimestamp()).thenReturn(endTimestamp);
  when(partition.getHeartbeatMillis()).thenReturn(heartbeat);
  when(partition.getPartitionToken()).thenReturn(partitionToken);
  when(tracker.tryClaim(10L)).thenReturn(true);
  when(transaction.getPartition("childPartition1")).thenReturn(mock(Struct.class));
  when(transaction.getPartition("childPartition2")).thenReturn(mock(Struct.class));

  final Optional<ProcessContinuation> maybeContinuation =
      action.run(partition, record, tracker, watermarkEstimator);

  assertEquals(Optional.empty(), maybeContinuation);
  verify(watermarkEstimator).setWatermark(new Instant(startTimestamp.toSqlTimestamp().getTime()));
}
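The tests reference fields (action, dao, metrics, transaction, tracker, watermarkEstimator) whose wiring is not shown. Below is a hedged sketch of what the fixture setup could look like, with the runInTransaction stub delegating the lambda to the mocked transaction so that the getPartition(...) stubs above take effect; the constructor arguments and generic signatures are assumptions, not the verbatim Beam test code.

// Hypothetical fixture sketch: Mockito mocks wired so the DAO delegates the
// transactional lambda to the mocked transaction context.
@Before
public void setUp() {
  dao = mock(PartitionMetadataDao.class);
  metrics = mock(ChangeStreamMetrics.class);
  transaction = mock(InTransactionContext.class);
  tracker = mock(RestrictionTracker.class);
  watermarkEstimator = mock(ManualWatermarkEstimator.class);
  action = new ChildPartitionsRecordAction(dao, metrics);
  when(dao.runInTransaction(any()))
      .thenAnswer(
          invocation -> {
            // Apply the lambda passed by processChildPartition to the mocked
            // transaction, then wrap its result the way the DAO would.
            Function<InTransactionContext, Boolean> fn = invocation.getArgument(0);
            final Boolean result = fn.apply(transaction);
            final TransactionResult<Boolean> transactionResult = mock(TransactionResult.class);
            when(transactionResult.getResult()).thenReturn(result);
            return transactionResult;
          });
}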
Use of org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartition in project beam by apache.
The class ChildPartitionsRecordActionTest, method testRestrictionNotClaimed.
@Test
public void testRestrictionNotClaimed() {
  final String partitionToken = "partitionToken";
  final Timestamp startTimestamp = Timestamp.ofTimeMicroseconds(10L);
  final PartitionMetadata partition = mock(PartitionMetadata.class);
  final ChildPartitionsRecord record =
      new ChildPartitionsRecord(
          startTimestamp,
          "recordSequence",
          Arrays.asList(
              new ChildPartition("childPartition1", partitionToken),
              new ChildPartition("childPartition2", partitionToken)),
          null);
  when(partition.getPartitionToken()).thenReturn(partitionToken);
  when(tracker.tryClaim(10L)).thenReturn(false);

  final Optional<ProcessContinuation> maybeContinuation =
      action.run(partition, record, tracker, watermarkEstimator);

  assertEquals(Optional.of(ProcessContinuation.stop()), maybeContinuation);
  verify(watermarkEstimator, never()).setWatermark(any());
  verify(dao, never()).insert(any());
}
Use of org.apache.beam.sdk.io.gcp.spanner.changestreams.model.ChildPartition in project beam by apache.
The class ChangeStreamRecordMapperTest, method testMappingStructRowFromInitialPartitionToChildPartitionRecord.
/**
 * When a row is read from the initial ("fake") partition, the mapper adds the initial partition
 * token as a parent of each child partition.
 */
@Test
public void testMappingStructRowFromInitialPartitionToChildPartitionRecord() {
  final Struct struct =
      recordsToStructWithStrings(
          new ChildPartitionsRecord(
              Timestamp.ofTimeSecondsAndNanos(10L, 20),
              "1",
              Arrays.asList(
                  new ChildPartition("childToken1", Sets.newHashSet()),
                  new ChildPartition("childToken2", Sets.newHashSet())),
              null));
  final ChildPartitionsRecord expected =
      new ChildPartitionsRecord(
          Timestamp.ofTimeSecondsAndNanos(10L, 20),
          "1",
          Arrays.asList(
              new ChildPartition("childToken1", Sets.newHashSet(InitialPartition.PARTITION_TOKEN)),
              new ChildPartition("childToken2", Sets.newHashSet(InitialPartition.PARTITION_TOKEN))),
          null);
  final PartitionMetadata initialPartition =
      partition.toBuilder().setPartitionToken(InitialPartition.PARTITION_TOKEN).build();

  assertEquals(
      Collections.singletonList(expected),
      mapper.toChangeStreamRecords(initialPartition, struct, resultSetMetadata));
}
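The substitution rule the test exercises can be sketched as below. The helper name is hypothetical, and InitialPartition.isInitialPartition is assumed to compare a token against the reserved initial partition token:

// Sketch of the rule exercised by the test (helper name is hypothetical):
// rows read from the initial partition carry no parent tokens, so the mapper
// substitutes the initial partition token as the parent of each child.
private ChildPartition withInitialParentIfNeeded(
    PartitionMetadata partition, ChildPartition childPartition) {
  if (InitialPartition.isInitialPartition(partition.getPartitionToken())) {
    return new ChildPartition(
        childPartition.getToken(), Sets.newHashSet(InitialPartition.PARTITION_TOKEN));
  }
  return childPartition;
}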