
Example 6 with JobInfo

Use of org.apache.beam.runners.fnexecution.provisioning.JobInfo in project beam by apache.

From the class ReferenceCountingExecutableStageContextFactoryTest, method testCatchThrowablesAndLogThem.

@Test
public void testCatchThrowablesAndLogThem() throws Exception {
    PrintStream oldErr = System.err;
    oldErr.flush();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PrintStream newErr = new PrintStream(baos);
    try {
        System.setErr(newErr);
        Creator creator = mock(Creator.class);
        ExecutableStageContext c1 = mock(ExecutableStageContext.class);
        when(creator.apply(any(JobInfo.class))).thenReturn(c1);
        // Throw a Throwable and ensure that it is caught and logged.
        doThrow(new NoClassDefFoundError()).when(c1).close();
        ReferenceCountingExecutableStageContextFactory factory = ReferenceCountingExecutableStageContextFactory.create(creator, (x) -> true);
        JobInfo jobA = mock(JobInfo.class);
        when(jobA.jobId()).thenReturn("jobA");
        ExecutableStageContext ac1A = factory.get(jobA);
        factory.release(ac1A);
        newErr.flush();
        String output = new String(baos.toByteArray(), Charsets.UTF_8);
        // Ensure that the error is logged
        assertTrue(output.contains("Unable to close ExecutableStageContext"));
    } finally {
        newErr.flush();
        System.setErr(oldErr);
    }
}
Also used : PrintStream(java.io.PrintStream) JobInfo(org.apache.beam.runners.fnexecution.provisioning.JobInfo) ByteArrayOutputStream(java.io.ByteArrayOutputStream) Creator(org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory.Creator) Test(org.junit.Test)
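The test above works by temporarily swapping System.err for an in-memory stream and inspecting what was printed. A minimal sketch of that capture pattern on its own, independent of Beam (the class name, helper method, and sample message are illustrative, not part of the Beam codebase):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class StderrCapture {

    /** Runs the given action while System.err is redirected, returning whatever it printed. */
    public static String captureStderr(Runnable action) {
        PrintStream oldErr = System.err;
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PrintStream newErr = new PrintStream(baos);
        try {
            System.setErr(newErr);
            action.run();
            newErr.flush();
            return new String(baos.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            // Always flush and restore the original stream, even if the action throws.
            newErr.flush();
            System.setErr(oldErr);
        }
    }

    public static void main(String[] args) {
        String output = captureStderr(
            () -> System.err.println("Unable to close ExecutableStageContext"));
        System.out.println(output.contains("Unable to close ExecutableStageContext"));
    }
}
```

The finally block mirrors the test's structure: restoring System.err unconditionally keeps one test's redirection from leaking into later tests.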

Example 7 with JobInfo

Use of org.apache.beam.runners.fnexecution.provisioning.JobInfo in project beam by apache.

From the class JobInvocationTest, method setup.

@Before
public void setup() {
    executorService = Executors.newFixedThreadPool(1);
    JobInfo jobInfo = JobInfo.create("jobid", "jobName", "retrievalToken", Struct.getDefaultInstance());
    ListeningExecutorService listeningExecutorService = MoreExecutors.listeningDecorator(executorService);
    Pipeline pipeline = Pipeline.create();
    runner = new ControllablePipelineRunner();
    jobInvocation = new JobInvocation(jobInfo, listeningExecutorService, PipelineTranslation.toProto(pipeline), runner);
}
Also used : JobInfo(org.apache.beam.runners.fnexecution.provisioning.JobInfo) ListeningExecutorService(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.ListeningExecutorService) Pipeline(org.apache.beam.sdk.Pipeline) Before(org.junit.Before)
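JobInvocation runs the pipeline asynchronously on the executor built in setup; Guava's listeningDecorator wraps the plain executor to add completion callbacks. A stripped-down sketch of the underlying submit-and-wait pattern using only java.util.concurrent (the task body standing in for the job is hypothetical):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncJobSketch {

    /** Submits a stand-in job body to a single-thread executor and blocks for its result. */
    public static String runJob() throws Exception {
        ExecutorService executorService = Executors.newFixedThreadPool(1);
        try {
            // The lambda is a Callable<String>; a real invocation would run the pipeline here.
            Future<String> result = executorService.submit(() -> "DONE");
            return result.get();
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runJob());
    }
}
```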

Example 8 with JobInfo

Use of org.apache.beam.runners.fnexecution.provisioning.JobInfo in project beam by apache.

From the class SamzaPipelineRunner, method run.

@Override
public PortablePipelineResult run(final RunnerApi.Pipeline pipeline, JobInfo jobInfo) {
    // Expand any splittable DoFns within the graph to enable sizing and splitting of bundles.
    RunnerApi.Pipeline pipelineWithSdfExpanded = ProtoOverrides.updateTransform(PTransformTranslation.PAR_DO_TRANSFORM_URN, pipeline, SplittableParDoExpander.createSizedReplacement());
    // Don't let the fuser fuse any subcomponents of native transforms.
    RunnerApi.Pipeline trimmedPipeline = TrivialNativeTransformExpander.forKnownUrns(pipelineWithSdfExpanded, SamzaPortablePipelineTranslator.knownUrns());
    // Fused pipeline proto.
    // TODO: Consider supporting partially-fused graphs.
    RunnerApi.Pipeline fusedPipeline = trimmedPipeline.getComponents().getTransformsMap().values().stream().anyMatch(proto -> ExecutableStage.URN.equals(proto.getSpec().getUrn())) ? trimmedPipeline : GreedyPipelineFuser.fuse(trimmedPipeline).toPipeline();
    LOG.info("Portable pipeline to run:");
    LOG.info(PipelineDotRenderer.toDotString(fusedPipeline));
    // The pipeline options coming from the SDK set an SDK-specific runner, which would break
    // serialization, so reset the runner here to a valid Java runner.
    options.setRunner(SamzaRunner.class);
    try {
        final SamzaRunner runner = SamzaRunner.fromOptions(options);
        final PortablePipelineResult result = runner.runPortablePipeline(fusedPipeline, jobInfo);
        final SamzaExecutionEnvironment exeEnv = options.getSamzaExecutionEnvironment();
        if (exeEnv == SamzaExecutionEnvironment.LOCAL || exeEnv == SamzaExecutionEnvironment.STANDALONE) {
            // Make run() sync for local mode
            result.waitUntilFinish();
        }
        return result;
    } catch (Exception e) {
        throw new RuntimeException("Failed to invoke samza job", e);
    }
}
Also used : RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) PTransformTranslation(org.apache.beam.runners.core.construction.PTransformTranslation) PipelineDotRenderer(org.apache.beam.runners.core.construction.renderer.PipelineDotRenderer) Logger(org.slf4j.Logger) LoggerFactory(org.slf4j.LoggerFactory) GreedyPipelineFuser(org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser) TrivialNativeTransformExpander(org.apache.beam.runners.core.construction.graph.TrivialNativeTransformExpander) ExecutableStage(org.apache.beam.runners.core.construction.graph.ExecutableStage) SamzaPortablePipelineTranslator(org.apache.beam.runners.samza.translation.SamzaPortablePipelineTranslator) PortablePipelineRunner(org.apache.beam.runners.jobsubmission.PortablePipelineRunner) SplittableParDoExpander(org.apache.beam.runners.core.construction.graph.SplittableParDoExpander) ProtoOverrides(org.apache.beam.runners.core.construction.graph.ProtoOverrides) JobInfo(org.apache.beam.runners.fnexecution.provisioning.JobInfo) PortablePipelineResult(org.apache.beam.runners.jobsubmission.PortablePipelineResult)
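The fusion decision in run boils down to a stream anyMatch over the URNs of the pipeline's transforms: if any transform is already an executable stage, greedy fusion is skipped. A simplified stand-in using a plain Map of transform name to URN (the URN constant mirrors Beam's ExecutableStage.URN but is hard-coded here, and the map replaces the pipeline proto):

```java
import java.util.Map;

public class FusionCheck {

    // Mirrors ExecutableStage.URN; hard-coded for this sketch.
    static final String EXECUTABLE_STAGE_URN = "beam:runner:executable_stage:v1";

    /** Returns true if any transform already carries the executable-stage URN. */
    public static boolean alreadyFused(Map<String, String> transformUrns) {
        return transformUrns.values().stream().anyMatch(EXECUTABLE_STAGE_URN::equals);
    }

    public static void main(String[] args) {
        System.out.println(alreadyFused(Map.of("read", "beam:transform:read:v1")));
        System.out.println(alreadyFused(Map.of("stage1", EXECUTABLE_STAGE_URN)));
    }
}
```

In the real method the same check runs over trimmedPipeline.getComponents().getTransformsMap(), comparing each transform spec's URN rather than a map value.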

Example 9 with JobInfo

Use of org.apache.beam.runners.fnexecution.provisioning.JobInfo in project beam by apache.

From the class SamzaRunner, method runPortablePipeline.

public PortablePipelineResult runPortablePipeline(RunnerApi.Pipeline pipeline, JobInfo jobInfo) {
    final String dotGraph = PipelineDotRenderer.toDotString(pipeline);
    LOG.info("Portable pipeline to run DOT graph:\n{}", dotGraph);
    final ConfigBuilder configBuilder = new ConfigBuilder(options);
    SamzaPortablePipelineTranslator.createConfig(pipeline, configBuilder, options);
    configBuilder.put(BEAM_DOT_GRAPH, dotGraph);
    final Config config = configBuilder.build();
    options.setConfigOverride(config);
    if (listener != null) {
        listener.onInit(config, options);
    }
    final SamzaExecutionContext executionContext = new SamzaExecutionContext(options);
    final Map<String, MetricsReporterFactory> reporterFactories = getMetricsReporters();
    final StreamApplication app = appDescriptor -> {
        appDescriptor.withApplicationContainerContextFactory(executionContext.new Factory()).withMetricsReporterFactories(reporterFactories);
        SamzaPortablePipelineTranslator.translate(pipeline, new PortableTranslationContext(appDescriptor, options, jobInfo));
    };
    ApplicationRunner runner = runSamzaApp(app, config);
    return new SamzaPortablePipelineResult(app, runner, executionContext, listener, config);
}
Also used : PViewToIdMapper(org.apache.beam.runners.samza.translation.PViewToIdMapper) PortableTranslationContext(org.apache.beam.runners.samza.translation.PortableTranslationContext) ExperimentalOptions(org.apache.beam.sdk.options.ExperimentalOptions) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) PipelineJsonRenderer(org.apache.beam.runners.samza.util.PipelineJsonRenderer) PipelineRunner(org.apache.beam.sdk.PipelineRunner) Map(java.util.Map) SamzaPipelineTranslator(org.apache.beam.runners.samza.translation.SamzaPipelineTranslator) MetricsReporter(org.apache.samza.metrics.MetricsReporter) MetricsReporterFactory(org.apache.samza.metrics.MetricsReporterFactory) JobInfo(org.apache.beam.runners.fnexecution.provisioning.JobInfo) PortablePipelineResult(org.apache.beam.runners.jobsubmission.PortablePipelineResult) Pipeline(org.apache.beam.sdk.Pipeline) Iterators(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterators) ApplicationRunners(org.apache.samza.runtime.ApplicationRunners) ExternalContext(org.apache.samza.context.ExternalContext) PipelineOptions(org.apache.beam.sdk.options.PipelineOptions) RunnerApi(org.apache.beam.model.pipeline.v1.RunnerApi) ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) PipelineDotRenderer(org.apache.beam.runners.core.construction.renderer.PipelineDotRenderer) Logger(org.slf4j.Logger) Iterator(java.util.Iterator) TranslationContext(org.apache.beam.runners.samza.translation.TranslationContext) ServiceLoader(java.util.ServiceLoader) SplittableParDo(org.apache.beam.runners.core.construction.SplittableParDo) SamzaPortablePipelineTranslator(org.apache.beam.runners.samza.translation.SamzaPortablePipelineTranslator) PipelineOptionsValidator(org.apache.beam.sdk.options.PipelineOptionsValidator) MetricsEnvironment(org.apache.beam.sdk.metrics.MetricsEnvironment) PValue(org.apache.beam.sdk.values.PValue) ConfigBuilder(org.apache.beam.runners.samza.translation.ConfigBuilder) Config(org.apache.samza.config.Config) SamzaTransformOverrides(org.apache.beam.runners.samza.translation.SamzaTransformOverrides) StreamApplication(org.apache.samza.application.StreamApplication) Collections(java.util.Collections)

Example 10 with JobInfo

Use of org.apache.beam.runners.fnexecution.provisioning.JobInfo in project beam by apache.

From the class SparkPipelineRunner, method main.

/**
 * Main method to be called only as the entry point to an executable jar with structure as defined
 * in {@link PortablePipelineJarUtils}.
 */
public static void main(String[] args) throws Exception {
    // Register standard file systems.
    FileSystems.setDefaultPipelineOptions(PipelineOptionsFactory.create());
    SparkPipelineRunnerConfiguration configuration = parseArgs(args);
    String baseJobName = configuration.baseJobName == null ? PortablePipelineJarUtils.getDefaultJobName() : configuration.baseJobName;
    Preconditions.checkArgument(baseJobName != null, "No default job name found. Job name must be set using --base-job-name.");
    Pipeline pipeline = PortablePipelineJarUtils.getPipelineFromClasspath(baseJobName);
    Struct originalOptions = PortablePipelineJarUtils.getPipelineOptionsFromClasspath(baseJobName);
    // The retrieval token is only required by the legacy artifact service, which the Spark runner
    // no longer uses.
    String retrievalToken = ArtifactApi.CommitManifestResponse.Constants.NO_ARTIFACTS_STAGED_TOKEN.getValueDescriptor().getOptions().getExtension(RunnerApi.beamConstant);
    SparkPipelineOptions sparkOptions = PipelineOptionsTranslation.fromProto(originalOptions).as(SparkPipelineOptions.class);
    String invocationId = String.format("%s_%s", sparkOptions.getJobName(), UUID.randomUUID().toString());
    if (sparkOptions.getAppName() == null) {
        LOG.debug("App name was null. Using invocationId {}", invocationId);
        sparkOptions.setAppName(invocationId);
    }
    SparkPipelineRunner runner = new SparkPipelineRunner(sparkOptions);
    JobInfo jobInfo = JobInfo.create(invocationId, sparkOptions.getJobName(), retrievalToken, PipelineOptionsTranslation.toProto(sparkOptions));
    try {
        runner.run(pipeline, jobInfo);
    } catch (Exception e) {
        throw new RuntimeException(String.format("Job %s failed.", invocationId), e);
    }
    LOG.info("Job {} finished successfully.", invocationId);
}
Also used : JobInfo(org.apache.beam.runners.fnexecution.provisioning.JobInfo) CmdLineException(org.kohsuke.args4j.CmdLineException) Pipeline(org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline) Struct(org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.Struct)
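The invocation id built in main follows a simple jobName_UUID convention before being used as both the Spark app name and the JobInfo job id. A standalone sketch of that naming scheme (only the format string comes from the example above; the class and method names are illustrative):

```java
import java.util.UUID;

public class InvocationId {

    /** Builds a unique invocation id of the form jobName_uuid. */
    public static String create(String jobName) {
        return String.format("%s_%s", jobName, UUID.randomUUID());
    }

    public static void main(String[] args) {
        String id = create("wordcount");
        // The random suffix makes the id unique across repeated submissions of the same job.
        System.out.println(id.startsWith("wordcount_"));
    }
}
```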

Aggregations

JobInfo (org.apache.beam.runners.fnexecution.provisioning.JobInfo) 11
PortablePipelineRunner (org.apache.beam.runners.jobsubmission.PortablePipelineRunner) 5
RunnerApi (org.apache.beam.model.pipeline.v1.RunnerApi) 4
Pipeline (org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline) 4
PortablePipelineResult (org.apache.beam.runners.jobsubmission.PortablePipelineResult) 4
PTransformTranslation (org.apache.beam.runners.core.construction.PTransformTranslation) 3
ExecutableStage (org.apache.beam.runners.core.construction.graph.ExecutableStage) 3
GreedyPipelineFuser (org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser) 3
ProtoOverrides (org.apache.beam.runners.core.construction.graph.ProtoOverrides) 3
SplittableParDoExpander (org.apache.beam.runners.core.construction.graph.SplittableParDoExpander) 3
TrivialNativeTransformExpander (org.apache.beam.runners.core.construction.graph.TrivialNativeTransformExpander) 3
MetricsEnvironment (org.apache.beam.sdk.metrics.MetricsEnvironment) 3
Struct (org.apache.beam.vendor.grpc.v1p43p2.com.google.protobuf.Struct) 3
CmdLineException (org.kohsuke.args4j.CmdLineException) 3
Logger (org.slf4j.Logger) 3
LoggerFactory (org.slf4j.LoggerFactory) 3
Map (java.util.Map) 2
UUID (java.util.UUID) 2
ArtifactApi (org.apache.beam.model.jobmanagement.v1.ArtifactApi) 2
PipelineOptionsTranslation (org.apache.beam.runners.core.construction.PipelineOptionsTranslation) 2