
Example 1 with Twister2PipelineOptions

Use of org.apache.beam.runners.twister2.Twister2PipelineOptions in project twister2 by DSC-SPIDAL.

Class WordCount, method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    Config config = env.getConfig();
    String input = config.getStringValue("input");
    String output = config.getStringValue("output");
    System.out.println("Rank " + env.getWorkerID());
    Twister2PipelineOptions options = PipelineOptionsFactory.as(Twister2PipelineOptions.class);
    options.setTSetEnvironment(env);
    options.setRunner(Twister2LegacyRunner.class);
    runWordCount(options, input, output);
}
Also used: Twister2PipelineOptions (org.apache.beam.runners.twister2.Twister2PipelineOptions), BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment), Config (edu.iu.dsc.tws.api.config.Config)

Example 2 with Twister2PipelineOptions

Use of org.apache.beam.runners.twister2.Twister2PipelineOptions in project twister2 by DSC-SPIDAL.

Class MinimalWordCount, method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    System.out.println("Rank " + env.getWorkerID());
    Twister2PipelineOptions options = PipelineOptionsFactory.as(Twister2PipelineOptions.class);
    options.setTSetEnvironment(env);
    options.setRunner(Twister2LegacyRunner.class);
    // Create the Pipeline object with the options we defined above
    Pipeline p = Pipeline.create(options);
    // Concept #1: Apply a root transform to the pipeline; in this case, TextIO.Read to read a set
    // of input text files. TextIO.Read returns a PCollection where each element is one line from
    // the input text (a set of Shakespeare's texts).
    // This example reads a public data set consisting of the complete works of Shakespeare.
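    p.apply(TextIO.read().from("gs://apache-beam-samples/shakespeare/*"))
        // Split each line into words on runs of non-letter characters
        .apply(FlatMapElements.into(TypeDescriptors.strings())
            .via((String line) -> Arrays.asList(line.split("[^\\p{L}]+"))))
        // Drop the empty strings that split() can produce
        .apply(Filter.by((String word) -> !word.isEmpty()))
        // Count occurrences of each distinct word
        .apply(Count.perElement())
        // Format each (word, count) pair as "word: count"
        .apply(MapElements.into(TypeDescriptors.strings())
            .via((KV<String, Long> wordCount) -> wordCount.getKey() + ": " + wordCount.getValue()))
        // Write the formatted counts to files prefixed with "wordcounts"
        .apply(TextIO.write().to("wordcounts"));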
    p.run().waitUntilFinish();
}
Also used: Twister2PipelineOptions (org.apache.beam.runners.twister2.Twister2PipelineOptions), BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment), KV (org.apache.beam.sdk.values.KV), Pipeline (org.apache.beam.sdk.Pipeline)
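
The transform chain in Example 2 can be hard to follow without a cluster to run it on. As a rough illustration only, the sketch below reproduces the same split, filter, count, and format steps on an in-memory list using plain Java streams. It does not use Beam or Twister2; the class name LocalWordCount, the helper countWords, and the sample input are hypothetical and not part of the twister2 project. The sorted output order is a choice made here for readability, whereas Beam gives no ordering guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class LocalWordCount {

    // Hypothetical helper: mirrors the FlatMapElements -> Filter -> Count -> MapElements
    // steps of the pipeline, but on a local list instead of a PCollection.
    static List<String> countWords(List<String> lines) {
        Map<String, Long> counts = lines.stream()
            // Same regex as the pipeline: split on runs of non-letter characters
            .flatMap(line -> Arrays.stream(line.split("[^\\p{L}]+")))
            // split() yields an empty first token when a line starts with a delimiter
            .filter(word -> !word.isEmpty())
            // Count occurrences per distinct word; TreeMap keeps keys sorted
            .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));

        // Format each entry as "word: count", as the MapElements step does
        return counts.entrySet().stream()
            .map(entry -> entry.getKey() + ": " + entry.getValue())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be, or not to be", "  that is the question");
        countWords(lines).forEach(System.out::println);
    }
}
```

The second sample line starts with spaces on purpose: it shows why the pipeline needs the Filter step, since splitting it produces a leading empty string.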

Example 3 with Twister2PipelineOptions

Use of org.apache.beam.runners.twister2.Twister2PipelineOptions in project twister2 by DSC-SPIDAL.

Class ReadSourceTest, method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    System.out.println("Rank " + env.getWorkerID());
    Twister2PipelineOptions options = PipelineOptionsFactory.as(Twister2PipelineOptions.class);
    options.setTSetEnvironment(env);
    options.setRunner(Twister2LegacyRunner.class);
    String resultPath = "/tmp/testdir";
    Pipeline p = Pipeline.create(options);
    PCollection<String> result = p
        .apply(GenerateSequence.from(0).to(10))
        .apply(ParDo.of(new DoFn<Long, String>() {
            @ProcessElement
            public void processElement(ProcessContext c) throws Exception {
                // Convert each generated Long to its string form
                c.output(c.element().toString());
            }
        }));
    try {
        result.apply(TextIO.write().to(new URI(resultPath).getPath() + "/part"));
    } catch (URISyntaxException e) {
        LOG.info(e.getMessage());
    }
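    // Block until the pipeline completes, as in the MinimalWordCount example
    p.run().waitUntilFinish();
    // Note: this prints the PCollection's description, not its elements
    System.out.println("Result " + result.toString());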
}
Also used: Twister2PipelineOptions (org.apache.beam.runners.twister2.Twister2PipelineOptions), DoFn (org.apache.beam.sdk.transforms.DoFn), BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment), URISyntaxException (java.net.URISyntaxException), URI (java.net.URI), Pipeline (org.apache.beam.sdk.Pipeline)

Aggregations

BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment): 3
Twister2PipelineOptions (org.apache.beam.runners.twister2.Twister2PipelineOptions): 3
Pipeline (org.apache.beam.sdk.Pipeline): 2
Config (edu.iu.dsc.tws.api.config.Config): 1
URI (java.net.URI): 1
URISyntaxException (java.net.URISyntaxException): 1
DoFn (org.apache.beam.sdk.transforms.DoFn): 1
KV (org.apache.beam.sdk.values.KV): 1