Search in sources :

Example 31 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class ReadSourceTest method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    System.out.println("Rank " + env.getWorkerID());
    Twister2PipelineOptions options = PipelineOptionsFactory.as(Twister2PipelineOptions.class);
    options.setTSetEnvironment(env);
    options.as(Twister2PipelineOptions.class).setRunner(Twister2LegacyRunner.class);
    String resultPath = "/tmp/testdir";
    Pipeline p = Pipeline.create(options);
    PCollection<String> result = p.apply(GenerateSequence.from(0).to(10)).apply(ParDo.of(new DoFn<Long, String>() {

        @ProcessElement
        public void processElement(ProcessContext c) throws Exception {
            c.output(c.element().toString());
        }
    }));
    try {
        result.apply(TextIO.write().to(new URI(resultPath).getPath() + "/part"));
    } catch (URISyntaxException e) {
        LOG.info(e.getMessage());
    }
    p.run();
    System.out.println("Result " + result.toString());
}
Also used : Twister2PipelineOptions(org.apache.beam.runners.twister2.Twister2PipelineOptions) DoFn(org.apache.beam.sdk.transforms.DoFn) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) URISyntaxException(java.net.URISyntaxException) URI(java.net.URI) Pipeline(org.apache.beam.sdk.Pipeline)

Example 32 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class UnionExample2 method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    int start = env.getWorkerID() * 100;
    SourceTSet<Integer> src1 = dummySource(env, start, COUNT, PARALLELISM).setName("src1");
    ComputeTSet<Integer> map = src1.direct().map(i -> i + 50);
    ComputeTSet<Integer> unionTSet = src1.union(map);
    LOG.info("test source union");
    unionTSet.direct().forEach(s -> LOG.info("union: " + s));
}
Also used : BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment)

Example 33 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class KeyedOperationsExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    KeyedSourceTSet<String, Integer> kSource = dummyKeyedSource(env, COUNT, PARALLELISM);
    KeyedDirectTLink<String, Integer> kDirect = kSource.keyedDirect();
    kDirect.forEach(i -> LOG.info("d_fe: " + i.toString()));
    KeyedCachedTSet<String, Integer> cache = kDirect.cache();
    cache.keyedDirect().forEach(i -> LOG.info("c_d_fe: " + i.toString()));
    cache.keyedReduce(Integer::sum).forEach(i -> LOG.info("c_r_fe: " + i.toString()));
    cache.keyedGather().forEach((ApplyFunc<Tuple<String, Iterator<Integer>>>) data -> {
        StringBuilder sb = new StringBuilder();
        sb.append("c_g_fe: key ").append(data.getKey()).append("->");
        while (data.getValue().hasNext()) {
            sb.append(" ").append(data.getValue().next());
        }
        LOG.info(sb.toString());
    });
}
Also used : Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) Iterator(java.util.Iterator) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) KeyedDirectTLink(edu.iu.dsc.tws.tset.links.batch.KeyedDirectTLink) Logger(java.util.logging.Logger) KeyedCachedTSet(edu.iu.dsc.tws.tset.sets.batch.KeyedCachedTSet) JobConfig(edu.iu.dsc.tws.api.JobConfig) KeyedSourceTSet(edu.iu.dsc.tws.tset.sets.batch.KeyedSourceTSet) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ApplyFunc(edu.iu.dsc.tws.api.tset.fn.ApplyFunc) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple)

Example 34 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class JoinExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    int para = 2;
    int workerID = env.getWorkerID();
    SourceTSet<Integer> src0 = dummySource(env, COUNT, para).setName("src0");
    KeyedTSet<Integer, Integer> left = src0.mapToTuple(i -> new Tuple<>(i % 2, i)).setName("left");
    left.keyedDirect().forEach(i -> LOG.info(workerID + "L " + i.toString()));
    SourceTSet<Integer> src1 = dummySource(env, COUNT, para).setName("src1");
    KeyedTSet<Integer, Integer> right = src1.mapToTuple(i -> new Tuple<>(i % 2, i)).setName("right");
    right.keyedDirect().forEach(i -> LOG.info(workerID + "R " + i.toString()));
    JoinTLink<Integer, Integer, Integer> join = left.join(right, CommunicationContext.JoinType.INNER, Integer::compareTo).setName("join");
    join.forEach(t -> LOG.info(workerID + "out: " + t.toString()));
}
Also used : CommunicationContext(edu.iu.dsc.tws.api.comms.CommunicationContext) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) Logger(java.util.logging.Logger) KeyedTSet(edu.iu.dsc.tws.tset.sets.batch.KeyedTSet) JobConfig(edu.iu.dsc.tws.api.JobConfig) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) JoinTLink(edu.iu.dsc.tws.tset.links.batch.JoinTLink) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple)

Example 35 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class GatherExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    int start = env.getWorkerID() * 100;
    SourceTSet<Integer> src = dummySource(env, start, COUNT, PARALLELISM);
    GatherTLink<Integer> gather = src.gather();
    LOG.info("test foreach");
    gather.forEach(i -> LOG.info("foreach: " + i));
    LOG.info("test map");
    gather.map(i -> i.toString() + "$$").withSchema(PrimitiveSchemas.STRING).direct().forEach(s -> LOG.info("map: " + s));
    LOG.info("test compute");
    gather.compute((ComputeFunc<Iterator<Tuple<Integer, Integer>>, String>) input -> {
        int sum = 0;
        while (input.hasNext()) {
            sum += input.next().getValue();
        }
        return "sum=" + sum;
    }).withSchema(PrimitiveSchemas.STRING).direct().forEach(s -> LOG.info("compute: " + s));
    LOG.info("test computec");
    gather.compute((ComputeCollectorFunc<Iterator<Tuple<Integer, Integer>>, String>) (input, output) -> {
        int sum = 0;
        while (input.hasNext()) {
            sum += input.next().getValue();
        }
        output.collect("sum=" + sum);
    }).withSchema(PrimitiveSchemas.STRING).direct().forEach(s -> LOG.info("computec: " + s));
    gather.mapToTuple(i -> new Tuple<>(i % 2, i)).keyedDirect().forEach(i -> LOG.info("mapToTuple: " + i.toString()));
}
Also used : Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) Iterator(java.util.Iterator) ComputeCollectorFunc(edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) GatherTLink(edu.iu.dsc.tws.tset.links.batch.GatherTLink) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) Logger(java.util.logging.Logger) JobConfig(edu.iu.dsc.tws.api.JobConfig) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ComputeFunc(edu.iu.dsc.tws.api.tset.fn.ComputeFunc) PrimitiveSchemas(edu.iu.dsc.tws.api.tset.schema.PrimitiveSchemas) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) ComputeCollectorFunc(edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc) ComputeFunc(edu.iu.dsc.tws.api.tset.fn.ComputeFunc) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple)

Aggregations

BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment)59 Config (edu.iu.dsc.tws.api.config.Config)24 TSetEnvironment (edu.iu.dsc.tws.tset.env.TSetEnvironment)24 JobConfig (edu.iu.dsc.tws.api.JobConfig)23 WorkerEnvironment (edu.iu.dsc.tws.api.resource.WorkerEnvironment)23 Logger (java.util.logging.Logger)23 SourceTSet (edu.iu.dsc.tws.tset.sets.batch.SourceTSet)22 HashMap (java.util.HashMap)22 ResourceAllocator (edu.iu.dsc.tws.rsched.core.ResourceAllocator)21 Iterator (java.util.Iterator)21 Tuple (edu.iu.dsc.tws.api.comms.structs.Tuple)18 ComputeCollectorFunc (edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc)12 ComputeFunc (edu.iu.dsc.tws.api.tset.fn.ComputeFunc)12 TSetContext (edu.iu.dsc.tws.api.tset.TSetContext)7 SinkTSet (edu.iu.dsc.tws.tset.sets.batch.SinkTSet)6 Twister2Job (edu.iu.dsc.tws.api.Twister2Job)5 MapFunc (edu.iu.dsc.tws.api.tset.fn.MapFunc)5 SinkFunc (edu.iu.dsc.tws.api.tset.fn.SinkFunc)5 Twister2Submitter (edu.iu.dsc.tws.rsched.job.Twister2Submitter)5 ComputeTSet (edu.iu.dsc.tws.tset.sets.batch.ComputeTSet)5