
Example 16 with DataflowStepContext

Use of org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext in project beam by apache.

The class UserDistributionMonitoringInfoToCounterUpdateTransformer, method validate:

private Optional<String> validate(MonitoringInfo monitoringInfo) {
    Optional<String> validatorResult = specValidator.validate(monitoringInfo);
    if (validatorResult.isPresent()) {
        return validatorResult;
    }
    String urn = monitoringInfo.getUrn();
    if (!urn.equals(Urns.USER_DISTRIBUTION_INT64)) {
        throw new RuntimeException(String.format("Received unexpected counter urn. Expected urn: %s, received: %s", Urns.USER_DISTRIBUTION_INT64, urn));
    }
    String type = monitoringInfo.getType();
    if (!type.equals(TypeUrns.DISTRIBUTION_INT64_TYPE)) {
        throw new RuntimeException(String.format("Received unexpected counter type. Expected type: %s, received: %s", TypeUrns.DISTRIBUTION_INT64_TYPE, type));
    }
    final String ptransform = monitoringInfo.getLabelsMap().get(MonitoringInfoConstants.Labels.PTRANSFORM);
    DataflowStepContext stepContext = transformIdMapping.get(ptransform);
    if (stepContext == null) {
        return Optional.of("Encountered user-counter MonitoringInfo with unknown ptransformId: " + monitoringInfo.toString());
    }
    return Optional.empty();
}
Also used: DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext)
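The validate method above follows a common pattern: hard failures (an unexpected urn or type) throw, while soft failures (an unknown ptransform id) return a message the caller can log and skip. A minimal, self-contained sketch of that pattern, using illustrative names (SimpleValidator, EXPECTED_URN) rather than Beam's classes:

```java
import java.util.Map;
import java.util.Optional;

// Sketch of the validation pattern: throw on wiring bugs, return a message
// for recoverable problems, return Optional.empty() on success.
class SimpleValidator {
    // Illustrative constant; Beam reads this from MonitoringInfoConstants.Urns.
    static final String EXPECTED_URN = "beam:metric:user:distribution_int64:v1";

    private final Map<String, String> knownTransforms;

    SimpleValidator(Map<String, String> knownTransforms) {
        this.knownTransforms = knownTransforms;
    }

    Optional<String> validate(String urn, String ptransformId) {
        if (!EXPECTED_URN.equals(urn)) {
            // An unexpected urn indicates a wiring bug, so fail hard.
            throw new RuntimeException("Received unexpected counter urn: " + urn);
        }
        if (!knownTransforms.containsKey(ptransformId)) {
            // An unknown step is a soft failure: report it and let the caller skip.
            return Optional.of("Unknown ptransformId: " + ptransformId);
        }
        return Optional.empty();
    }
}
```

The Optional-of-error-message return type lets the caller treat "present" as "skip this MonitoringInfo" without exception overhead.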

Example 17 with DataflowStepContext

Use of org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext in project beam by apache.

The class UserDistributionMonitoringInfoToCounterUpdateTransformer, method transform:

/**
 * Transforms user counter MonitoringInfo to relevant CounterUpdate.
 *
 * @return Relevant CounterUpdate or null if transformation failed.
 */
@Override
@Nullable
public CounterUpdate transform(MonitoringInfo monitoringInfo) {
    Optional<String> validationResult = validate(monitoringInfo);
    if (validationResult.isPresent()) {
        LOG.debug(validationResult.get());
        return null;
    }
    DistributionData data = decodeInt64Distribution(monitoringInfo.getPayload());
    Map<String, String> miLabels = monitoringInfo.getLabelsMap();
    final String ptransform = miLabels.get(MonitoringInfoConstants.Labels.PTRANSFORM);
    final String counterName = miLabels.get(MonitoringInfoConstants.Labels.NAME);
    final String counterNamespace = miLabels.get(MonitoringInfoConstants.Labels.NAMESPACE);
    CounterStructuredNameAndMetadata name = new CounterStructuredNameAndMetadata();
    DataflowStepContext stepContext = transformIdMapping.get(ptransform);
    name.setName(
            new CounterStructuredName()
                .setOrigin(Origin.USER.toString())
                .setName(counterName)
                .setOriginalStepName(stepContext.getNameContext().originalName())
                .setOriginNamespace(counterNamespace))
        .setMetadata(new CounterMetadata().setKind(Kind.DISTRIBUTION.toString()));
    return new CounterUpdate()
        .setStructuredNameAndMetadata(name)
        .setCumulative(true)
        .setDistribution(
            new DistributionUpdate()
                .setMax(DataflowCounterUpdateExtractor.longToSplitInt(data.max()))
                .setMin(DataflowCounterUpdateExtractor.longToSplitInt(data.min()))
                .setSum(DataflowCounterUpdateExtractor.longToSplitInt(data.sum()))
                .setCount(DataflowCounterUpdateExtractor.longToSplitInt(data.count())));
}
Also used: CounterMetadata (com.google.api.services.dataflow.model.CounterMetadata), DistributionData (org.apache.beam.runners.core.metrics.DistributionData), CounterStructuredName (com.google.api.services.dataflow.model.CounterStructuredName), DistributionUpdate (com.google.api.services.dataflow.model.DistributionUpdate), CounterStructuredNameAndMetadata (com.google.api.services.dataflow.model.CounterStructuredNameAndMetadata), DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext), CounterUpdate (com.google.api.services.dataflow.model.CounterUpdate), Nullable (org.checkerframework.checker.nullness.qual.Nullable)
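longToSplitInt converts a Java long into the Dataflow API's two-halves representation (a SplitInt64 with separate high and low 32-bit parts). A standalone analog of that split and its inverse, with SplitInt64Lite as an illustrative stand-in for the real model class:

```java
// Sketch of the long <-> split-int conversion used for the distribution's
// max/min/sum/count fields. SplitInt64Lite is a stand-in, not the Dataflow
// model class.
class SplitInt64Lite {
    final int highBits;   // signed high-order 32 bits
    final long lowBits;   // unsigned low-order 32 bits, stored in a long

    SplitInt64Lite(int highBits, long lowBits) {
        this.highBits = highBits;
        this.lowBits = lowBits;
    }

    static SplitInt64Lite fromLong(long value) {
        // Arithmetic shift keeps the sign in highBits; the mask keeps
        // lowBits as an unsigned 32-bit quantity.
        return new SplitInt64Lite((int) (value >> 32), value & 0xffffffffL);
    }

    long toLong() {
        return ((long) highBits << 32) | lowBits;
    }
}
```

The round trip is lossless for the full signed 64-bit range, including negative values.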

Example 18 with DataflowStepContext

Use of org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext in project beam by apache.

The class RegisterAndProcessBundleOperation, method handleBagUserState:

private CompletionStage<BeamFnApi.StateResponse.Builder> handleBagUserState(StateRequest stateRequest) {
    StateKey.BagUserState bagUserStateKey = stateRequest.getStateKey().getBagUserState();
    DataflowStepContext userStepContext = ptransformIdToUserStepContext.get(bagUserStateKey.getTransformId());
    checkState(userStepContext != null, String.format("Unknown PTransform id '%s'", bagUserStateKey.getTransformId()));
    // TODO: We should not be required to hold onto a pointer to the bag states for the
    // user. InMemoryStateInternals assumes that the Java garbage collector does the clean-up work
    // but instead StateInternals should hold its own references and write out any data and
    // clear references when the MapTask within Dataflow completes like how WindmillStateInternals
    // works.
    BagState<ByteString> state =
        userStateData.computeIfAbsent(
            stateRequest.getStateKey(),
            unused ->
                userStepContext
                    .stateInternals()
                    .state(
                        // window.
                        StateNamespaces.window(GlobalWindow.Coder.INSTANCE, GlobalWindow.INSTANCE),
                        StateTags.bag(bagUserStateKey.getUserStateId(), ByteStringCoder.of())));
    switch(stateRequest.getRequestCase()) {
        case GET:
            return CompletableFuture.completedFuture(StateResponse.newBuilder().setGet(StateGetResponse.newBuilder().setData(concat(state.read()))));
        case APPEND:
            state.add(stateRequest.getAppend().getData());
            return CompletableFuture.completedFuture(StateResponse.newBuilder().setAppend(StateAppendResponse.getDefaultInstance()));
        case CLEAR:
            state.clear();
            return CompletableFuture.completedFuture(StateResponse.newBuilder().setClear(StateClearResponse.getDefaultInstance()));
        default:
            throw new IllegalArgumentException(String.format("Unknown request type %s", stateRequest.getRequestCase()));
    }
}
Also used: StateKey (org.apache.beam.model.fnexecution.v1.BeamFnApi.StateKey), ByteString (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.ByteString), DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext)
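handleBagUserState combines two patterns worth noting: computeIfAbsent lazily creates the bag for a state key on first touch, and every branch answers with an already-completed future. A simplified, self-contained sketch (BagStateHandler and its string-based bags are illustrative, not Beam APIs):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Simplified analog of the GET/APPEND/CLEAR dispatch above.
class BagStateHandler {
    enum RequestCase { GET, APPEND, CLEAR }

    private final Map<String, List<String>> bags = new HashMap<>();

    CompletionStage<String> handle(String stateKey, RequestCase request, String data) {
        // Mirrors userStateData.computeIfAbsent(...): the bag is created on
        // first touch and reused for later requests against the same key.
        List<String> bag = bags.computeIfAbsent(stateKey, unused -> new ArrayList<>());
        switch (request) {
            case GET:
                return CompletableFuture.completedFuture(String.join(",", bag));
            case APPEND:
                bag.add(data);
                return CompletableFuture.completedFuture("appended");
            case CLEAR:
                bag.clear();
                return CompletableFuture.completedFuture("cleared");
            default:
                throw new IllegalArgumentException("Unknown request type " + request);
        }
    }
}
```

Because the state lives in local memory, each branch can respond with completedFuture rather than scheduling asynchronous work, while still satisfying a CompletionStage-based interface.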

Example 19 with DataflowStepContext

Use of org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext in project beam by apache.

The class BeamFnMapTaskExecutorFactory, method createOperationTransformForRegisterFnNodes:

private Function<Node, Node> createOperationTransformForRegisterFnNodes(final IdGenerator idGenerator, final InstructionRequestHandler instructionRequestHandler, final StateDelegator beamFnStateDelegator, final String stageName, final DataflowExecutionContext<?> executionContext) {
    return new TypeSafeNodeFunction<RegisterRequestNode>(RegisterRequestNode.class) {

        @Override
        public Node typedApply(RegisterRequestNode input) {
            ImmutableMap.Builder<String, DataflowOperationContext> ptransformIdToOperationContextBuilder = ImmutableMap.builder();
            ImmutableMap.Builder<String, DataflowStepContext> ptransformIdToStepContext = ImmutableMap.builder();
            for (Map.Entry<String, NameContext> entry : input.getPTransformIdToPartialNameContextMap().entrySet()) {
                NameContext fullNameContext = NameContext.create(stageName, entry.getValue().originalName(), entry.getValue().systemName(), entry.getValue().userName());
                DataflowOperationContext operationContext = executionContext.createOperationContext(fullNameContext);
                ptransformIdToOperationContextBuilder.put(entry.getKey(), operationContext);
                ptransformIdToStepContext.put(entry.getKey(), executionContext.getStepContext(operationContext));
            }
            ImmutableMap.Builder<String, NameContext> pcollectionIdToNameContext = ImmutableMap.builder();
            for (Map.Entry<String, NameContext> entry : input.getPCollectionToPartialNameContextMap().entrySet()) {
                pcollectionIdToNameContext.put(entry.getKey(), NameContext.create(stageName, entry.getValue().originalName(), entry.getValue().systemName(), entry.getValue().userName()));
            }
            ImmutableMap<String, DataflowOperationContext> ptransformIdToOperationContexts = ptransformIdToOperationContextBuilder.build();
            ImmutableMap<String, SideInputReader> ptransformIdToSideInputReaders = buildPTransformIdToSideInputReadersMap(executionContext, input, ptransformIdToOperationContexts);
            ImmutableTable<String, String, PCollectionView<?>> ptransformIdToSideInputIdToPCollectionView = buildPTransformIdToSideInputIdToPCollectionView(input);
            return OperationNode.create(
                new RegisterAndProcessBundleOperation(
                    idGenerator,
                    instructionRequestHandler,
                    beamFnStateDelegator,
                    input.getRegisterRequest(),
                    ptransformIdToOperationContexts,
                    ptransformIdToStepContext.build(),
                    ptransformIdToSideInputReaders,
                    ptransformIdToSideInputIdToPCollectionView,
                    pcollectionIdToNameContext.build(),
                    // TODO: Set NameContext properly for these operations.
                    executionContext.createOperationContext(
                        NameContext.create(stageName, stageName, stageName, stageName))));
        }
    };
}
Also used: NameContext (org.apache.beam.runners.dataflow.worker.counters.NameContext), SideInputReader (org.apache.beam.runners.core.SideInputReader), DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext), RegisterAndProcessBundleOperation (org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation), ImmutableMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap), RegisterRequestNode (org.apache.beam.runners.dataflow.worker.graph.Nodes.RegisterRequestNode), PCollectionView (org.apache.beam.sdk.values.PCollectionView), TypeSafeNodeFunction (org.apache.beam.runners.dataflow.worker.graph.Networks.TypeSafeNodeFunction), Map (java.util.Map), HashMap (java.util.HashMap)
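typedApply builds two parallel id-keyed maps in a single pass over the input, then freezes them. A simplified sketch of that pattern, with plain Map.copyOf standing in for Guava's ImmutableMap.Builder and string stand-ins for the context types:

```java
import java.util.HashMap;
import java.util.Map;

// One pass fills both maps so the operation context and its derived step
// context stay in sync for every transform id.
class ContextMapsBuilder {
    static Map<String, Map<String, String>> build(Map<String, String> idToName) {
        Map<String, String> idToOperationContext = new HashMap<>();
        Map<String, String> idToStepContext = new HashMap<>();
        for (Map.Entry<String, String> entry : idToName.entrySet()) {
            String operationContext = "op:" + entry.getValue();
            idToOperationContext.put(entry.getKey(), operationContext);
            // The step context is derived from the operation context, as in
            // executionContext.getStepContext(operationContext) above.
            idToStepContext.put(entry.getKey(), "step:" + operationContext);
        }
        // Map.copyOf freezes the results, like ImmutableMap.Builder.build().
        return Map.of(
            "operations", Map.copyOf(idToOperationContext),
            "steps", Map.copyOf(idToStepContext));
    }
}
```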

Example 20 with DataflowStepContext

Use of org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext in project beam by apache.

The class StreamingPCollectionViewWriterDoFnFactoryTest, method testConstruction:

@Test
public void testConstruction() throws Exception {
    DataflowOperationContext mockOperationContext = Mockito.mock(DataflowOperationContext.class);
    DataflowExecutionContext mockExecutionContext = Mockito.mock(DataflowExecutionContext.class);
    DataflowStepContext mockStepContext = Mockito.mock(StreamingModeExecutionContext.StepContext.class);
    when(mockExecutionContext.getStepContext(mockOperationContext)).thenReturn(mockStepContext);
    CloudObject coder =
        CloudObjects.asCloudObject(
            WindowedValue.getFullCoder(BigEndianIntegerCoder.of(), GlobalWindow.Coder.INSTANCE),
            /* sdkComponents= */ null);
    ParDoFn parDoFn =
        new StreamingPCollectionViewWriterDoFnFactory()
            .create(
                /* pipeline options */ null,
                CloudObject.fromSpec(
                    ImmutableMap.of(
                        PropertyNames.OBJECT_TYPE_NAME, "StreamingPCollectionViewWriterDoFn",
                        PropertyNames.ENCODING, coder,
                        WorkerPropertyNames.SIDE_INPUT_ID, "test-side-input-id")),
                /* side input infos */ null,
                /* main output tag */ null,
                /* output tag to receiver index */ null,
                mockExecutionContext,
                mockOperationContext);
    assertThat(parDoFn, instanceOf(StreamingPCollectionViewWriterParDoFn.class));
}
Also used: CloudObject (org.apache.beam.runners.dataflow.util.CloudObject), ParDoFn (org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn), DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext), Test (org.junit.Test)
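The test stubs getStepContext with Mockito. For readers unfamiliar with what when(...).thenReturn(...) produces, here is a hand-rolled equivalent using simplified illustrative interfaces (these are stand-ins, not Beam's ExecutionContext/StepContext):

```java
// A minimal fake that always returns a canned step context, mimicking
// when(mockExecutionContext.getStepContext(mockOperationContext))
//     .thenReturn(mockStepContext).
class FakeContexts {
    interface StepContext { String name(); }
    interface ExecutionContext { StepContext getStepContext(String operationContext); }

    static ExecutionContext stubbedExecutionContext(StepContext canned) {
        // Every call returns the canned value, regardless of argument.
        return operationContext -> canned;
    }
}
```

Mockito generates this kind of answer-recording proxy at runtime; hand-rolling it makes the stubbed behavior explicit when a test only needs one canned return value.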

Aggregations

DataflowStepContext (org.apache.beam.runners.dataflow.worker.DataflowExecutionContext.DataflowStepContext) 26
Test (org.junit.Test) 16
HashMap (java.util.HashMap) 12
MonitoringInfo (org.apache.beam.model.pipeline.v1.MetricsApi.MonitoringInfo) 11
CounterUpdate (com.google.api.services.dataflow.model.CounterUpdate) 7
NameContext (org.apache.beam.runners.dataflow.worker.counters.NameContext) 5
CounterMetadata (com.google.api.services.dataflow.model.CounterMetadata) 3
CounterStructuredName (com.google.api.services.dataflow.model.CounterStructuredName) 3
CounterStructuredNameAndMetadata (com.google.api.services.dataflow.model.CounterStructuredNameAndMetadata) 3
InstructionRequest (org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionRequest) 3
InstructionResponse (org.apache.beam.model.fnexecution.v1.BeamFnApi.InstructionResponse) 3
CloudObject (org.apache.beam.runners.dataflow.util.CloudObject) 3
ParDoFn (org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoFn) 3
InstructionRequestHandler (org.apache.beam.runners.fnexecution.control.InstructionRequestHandler) 3
Nullable (org.checkerframework.checker.nullness.qual.Nullable) 3
Instant (org.joda.time.Instant) 3
CompletableFuture (java.util.concurrent.CompletableFuture) 2
CountDownLatch (java.util.concurrent.CountDownLatch) 2
StateKey (org.apache.beam.model.fnexecution.v1.BeamFnApi.StateKey) 2
StateRequest (org.apache.beam.model.fnexecution.v1.BeamFnApi.StateRequest) 2