Example 11 with WorkerEnvironment

use of edu.iu.dsc.tws.api.resource.WorkerEnvironment in project twister2 by DSC-SPIDAL.

the class KGatherUngroupedExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    SourceTSet<Integer> src = dummySource(env, COUNT, PARALLELISM);
    KeyedGatherUngroupedTLink<Integer, Integer> klink = src.mapToTuple(i -> new Tuple<>(i % 4, i)).keyedGatherUngrouped();
    LOG.info("test foreach");
    klink.forEach((ApplyFunc<Tuple<Integer, Integer>>) data -> LOG.info(data.getKey() + " -> " + data.getValue()));
    LOG.info("test map");
    klink.map((MapFunc<Tuple<Integer, Integer>, String>) input -> input.getKey() + " -> " + input.getValue()).direct().forEach(s -> LOG.info("map: " + s));
    LOG.info("test compute");
    klink.compute((ComputeFunc<Iterator<Tuple<Integer, Integer>>, String>) input -> {
        StringBuilder sb = new StringBuilder();
        while (input.hasNext()) {
            Tuple<Integer, Integer> next = input.next();
            sb.append("[").append(next.getKey()).append("->").append(next.getValue()).append("]");
        }
        return sb.toString();
    }).direct().forEach(s -> LOG.info("compute: " + s));
    LOG.info("test computec");
    klink.compute((ComputeCollectorFunc<Iterator<Tuple<Integer, Integer>>, String>) (input, output) -> {
        while (input.hasNext()) {
            Tuple<Integer, Integer> next = input.next();
            output.collect(next.getKey() + " -> " + next.getValue() * 2);
        }
    }).direct().forEach(s -> LOG.info("computec: " + s));
}
Also used : Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) Iterator(java.util.Iterator) ComputeCollectorFunc(edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) MapFunc(edu.iu.dsc.tws.api.tset.fn.MapFunc) Logger(java.util.logging.Logger) JobConfig(edu.iu.dsc.tws.api.JobConfig) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ComputeFunc(edu.iu.dsc.tws.api.tset.fn.ComputeFunc) ApplyFunc(edu.iu.dsc.tws.api.tset.fn.ApplyFunc) KeyedGatherUngroupedTLink(edu.iu.dsc.tws.tset.links.batch.KeyedGatherUngroupedTLink)

Example 12 with WorkerEnvironment

use of edu.iu.dsc.tws.api.resource.WorkerEnvironment in project twister2 by DSC-SPIDAL.

the class KReduceExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    int start = env.getWorkerID() * 100;
    SourceTSet<Integer> src = dummySource(env, start, COUNT, PARALLELISM);
    KeyedReduceTLink<Integer, Integer> kreduce = src.mapToTuple(i -> new Tuple<>(i % 10, i)).keyedReduce(Integer::sum);
    LOG.info("test foreach");
    kreduce.forEach(t -> LOG.info("sum by key=" + t.getKey() + ", " + t.getValue()));
    LOG.info("test map");
    kreduce.map(i -> i.toString() + "$$").direct().forEach(s -> LOG.info("map: " + s));
    LOG.info("test compute");
    kreduce.compute((ComputeFunc<Iterator<Tuple<Integer, Integer>>, String>) input -> {
        StringBuilder s = new StringBuilder();
        while (input.hasNext()) {
            s.append(input.next().toString()).append(" ");
        }
        return s.toString();
    }).direct().forEach(s -> LOG.info("compute: concat " + s));
    LOG.info("test computec");
    kreduce.compute((ComputeCollectorFunc<Iterator<Tuple<Integer, Integer>>, String>) (input, output) -> {
        while (input.hasNext()) {
            output.collect(input.next().toString());
        }
    }).direct().forEach(s -> LOG.info("computec: " + s));
}
Also used : Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) Iterator(java.util.Iterator) ComputeCollectorFunc(edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) Logger(java.util.logging.Logger) KeyedReduceTLink(edu.iu.dsc.tws.tset.links.batch.KeyedReduceTLink) JobConfig(edu.iu.dsc.tws.api.JobConfig) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ComputeFunc(edu.iu.dsc.tws.api.tset.fn.ComputeFunc)
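
The keyed reduce in KReduceExample keys each integer by i % 10 and combines values that share a key with Integer::sum. The snippet below is a minimal plain-Java sketch of that grouping, not Twister2 code. The class name KeyedReduceSketch and the method sumByKey are hypothetical, and it assumes the source emits consecutive integers beginning at start, which the excerpt does not show.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of Twister2.
class KeyedReduceSketch {

    // Keys each integer in [start, start + count) by i % 10 and sums the values per key.
    // Assumption: the input is a run of consecutive integers.
    static Map<Integer, Integer> sumByKey(int start, int count) {
        return IntStream.range(start, start + count)
            .boxed()
            .collect(Collectors.toMap(i -> i % 10, i -> i, Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        // Mirrors the log format used by the forEach call in the example.
        sumByKey(0, 20).forEach((k, v) -> System.out.println("sum by key=" + k + ", " + v));
    }
}
```

With 20 consecutive integers starting at 0, each key receives two values, for example 0 and 10 for key 0.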

Example 13 with WorkerEnvironment

use of edu.iu.dsc.tws.api.resource.WorkerEnvironment in project twister2 by DSC-SPIDAL.

the class SReduceWindowExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    StreamingEnvironment env = TSetEnvironment.initStreaming(workerEnv);
    SSourceTSet<Integer> src = dummySource(env, ELEMENTS_IN_STREAM, PARALLELISM);
    SDirectTLink<Integer> link = src.direct();
    if (COUNT_WINDOWS) {
        if (PROCESS_WINDOW) {
            WindowComputeTSet<Iterator<Integer>> winTSet = link.countWindow(2);
            WindowComputeTSet<Iterator<Integer>> processedTSet = winTSet.process((WindowComputeFunc<Iterator<Integer>, Iterator<Integer>>) input -> {
                List<Integer> list = new ArrayList<>();
                while (input.hasNext()) {
                    list.add(input.next());
                }
                return list.iterator();
            });
            processedTSet.direct().forEach((ApplyFunc<Iterator<Integer>>) data -> {
                while (data.hasNext()) {
                    System.out.println(data.next());
                }
            });
        }
        if (REDUCE_WINDOW) {
            WindowComputeTSet<Integer> winTSet = link.countWindow(2);
            WindowComputeTSet<Integer> localReducedTSet = winTSet.aggregate((AggregateFunc<Integer>) Integer::sum);
            localReducedTSet.direct().forEach(x -> System.out.println(x));
        }
    }
    if (DURATION_WINDOWS) {
        if (PROCESS_WINDOW) {
            System.out.println("DURATION PROCESS WINDOW");
            WindowComputeTSet<Iterator<Integer>> winTSet = link.timeWindow(2, TimeUnit.MILLISECONDS);
            WindowComputeTSet<Iterator<Integer>> processedTSet = winTSet.process((WindowComputeFunc<Iterator<Integer>, Iterator<Integer>>) input -> {
                List<Integer> list = new ArrayList<>();
                while (input.hasNext()) {
                    list.add(input.next());
                }
                return list.iterator();
            });
            processedTSet.direct().forEach((ApplyFunc<Iterator<Integer>>) data -> {
                while (data.hasNext()) {
                    System.out.println(data.next());
                }
            });
        }
        if (REDUCE_WINDOW) {
            WindowComputeTSet<Integer> winTSet = link.timeWindow(2, TimeUnit.MILLISECONDS);
            WindowComputeTSet<Integer> localReducedTSet = winTSet.aggregate((AggregateFunc<Integer>) Integer::sum);
            localReducedTSet.direct().forEach(x -> System.out.println(x));
            // link.countWindow().reduce(a,b-> a + b)
        }
    }
    // Runs the entire TSet graph
    env.run();
}
Also used : SSourceTSet(edu.iu.dsc.tws.tset.sets.streaming.SSourceTSet) Iterator(java.util.Iterator) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) AggregateFunc(edu.iu.dsc.tws.tset.fn.AggregateFunc) Logger(java.util.logging.Logger) JobConfig(edu.iu.dsc.tws.api.JobConfig) ArrayList(java.util.ArrayList) StreamingEnvironment(edu.iu.dsc.tws.tset.env.StreamingEnvironment) TimeUnit(java.util.concurrent.TimeUnit) List(java.util.List) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) WindowComputeFunc(edu.iu.dsc.tws.tset.fn.WindowComputeFunc) SDirectTLink(edu.iu.dsc.tws.tset.links.streaming.SDirectTLink) WindowComputeTSet(edu.iu.dsc.tws.tset.sets.streaming.WindowComputeTSet) ApplyFunc(edu.iu.dsc.tws.api.tset.fn.ApplyFunc) BatchTsetExample(edu.iu.dsc.tws.examples.tset.batch.BatchTsetExample)
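
The REDUCE_WINDOW branch of SReduceWindowExample applies countWindow(2) and then aggregates each window with Integer::sum. The following is a rough plain-Java sketch of that idea, not Twister2 code. It assumes the windows are non-overlapping (tumbling) and that a trailing partial window is still emitted; the excerpt confirms neither, and the names CountWindowSketch and sumPerWindow are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Twister2.
class CountWindowSketch {

    // Splits the input into consecutive windows of windowSize elements and sums each window.
    // Assumption: tumbling windows; a shorter final window is summed as-is.
    static List<Integer> sumPerWindow(List<Integer> stream, int windowSize) {
        List<Integer> sums = new ArrayList<>();
        for (int i = 0; i < stream.size(); i += windowSize) {
            int end = Math.min(i + windowSize, stream.size());
            int sum = 0;
            for (int v : stream.subList(i, end)) {
                sum += v;
            }
            sums.add(sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // One line of output per window sum.
        sumPerWindow(Arrays.asList(1, 2, 3, 4, 5), 2).forEach(System.out::println);
    }
}
```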

Example 14 with WorkerEnvironment

use of edu.iu.dsc.tws.api.resource.WorkerEnvironment in project twister2 by DSC-SPIDAL.

the class TSetCheckptExample method execute.

@Override
public void execute(WorkerEnvironment workerEnvironment) {
    BatchChkPntEnvironment env = TSetEnvironment.initCheckpointing(workerEnvironment);
    LOG.info(String.format("Hello from worker %d", env.getWorkerID()));
    SourceTSet<Integer> sourceX = env.createSource(new SourceFunc<Integer>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            return count < 10000;
        }

        @Override
        public Integer next() {
            return count++;
        }
    }, 4);
    long t1 = System.currentTimeMillis();
    ComputeTSet<Object> twoComputes = sourceX.direct().compute((itr, c) -> {
        itr.forEachRemaining(i -> {
            c.collect(i * 5);
        });
    }).direct().compute((itr, c) -> {
        itr.forEachRemaining(i -> {
            c.collect((int) i + 2);
        });
    });
    LOG.info("Time for two computes (ms): " + (System.currentTimeMillis() - t1));
    t1 = System.currentTimeMillis();
    PersistedTSet<Object> persist = twoComputes.persist();
    LOG.info("Time for persist (s): " + (System.currentTimeMillis() - t1) / 1000);
    // When persist() is called, Twister2 performs all computation and communication
    // up to this point and persists the result to disk.
    // This makes the earlier data eligible for garbage collection and frees some memory.
    // If persist() is called in a checkpointing-enabled job, it creates a snapshot
    // here, and a restarted job resumes directly from this point.
    // Like CachedTSets, PersistedTSets can be used as inputs to other TSets and
    // operations.
    persist.reduce((i1, i2) -> {
        return (int) i1 + (int) i2;
    }).forEach(i -> {
        LOG.info("SUM=" + i);
    });
}
Also used : Twister2Job(edu.iu.dsc.tws.api.Twister2Job) ComputeTSet(edu.iu.dsc.tws.tset.sets.batch.ComputeTSet) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) Logger(java.util.logging.Logger) BatchChkPntEnvironment(edu.iu.dsc.tws.tset.env.BatchChkPntEnvironment) JobConfig(edu.iu.dsc.tws.api.JobConfig) Serializable(java.io.Serializable) PersistedTSet(edu.iu.dsc.tws.tset.sets.batch.PersistedTSet) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker)
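
The two chained compute steps in TSetCheckptExample multiply each element by 5 and then add 2, and the final reduce sums the results. As a sanity check on the logged SUM, here is a minimal plain-Java sketch of that arithmetic for a single source instance emitting 0 through count - 1. It is not Twister2 code, the names TwoComputesSketch and expectedSum are hypothetical, and it deliberately does not model how the four parallel source instances combine, since the excerpt does not show that.

```java
import java.util.stream.LongStream;

// Hypothetical helper, not part of Twister2.
class TwoComputesSketch {

    // Applies i * 5 and then + 2 to every i in [0, count) and sums the results.
    // Assumption: models one source instance only.
    static long expectedSum(int count) {
        return LongStream.range(0, count)
            .map(i -> i * 5)
            .map(i -> i + 2)
            .sum();
    }

    public static void main(String[] args) {
        // The source in the example stops once count reaches 10000.
        System.out.println("SUM=" + expectedSum(10000));
    }
}
```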

Example 15 with WorkerEnvironment

use of edu.iu.dsc.tws.api.resource.WorkerEnvironment in project twister2 by DSC-SPIDAL.

the class SDirectExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    StreamingEnvironment env = TSetEnvironment.initStreaming(workerEnv);
    SSourceTSet<Integer> src = dummySource(env, COUNT, PARALLELISM).setName("src");
    SDirectTLink<Integer> link = src.direct().setName("dir");
    link.map(i -> i * 2).setName("map").direct().forEach(i -> LOG.info("m" + i.toString()));
    link.flatmap((i, c) -> c.collect("fm" + i)).setName("flatmap").direct().forEach(i -> LOG.info(i.toString()));
    link.compute(i -> i + "C").setName("compute").direct().forEach(i -> LOG.info(i));
    link.mapToTuple(i -> new Tuple<>(i, i.toString())).keyedDirect().forEach(i -> LOG.info("mapToTuple: " + i.toString()));
    link.compute((input, output) -> output.collect(input + "DD")).setName("computec").direct().forEach(s -> LOG.info(s.toString()));
    // Runs the entire TSet graph
    env.run();
}
Also used : SSourceTSet(edu.iu.dsc.tws.tset.sets.streaming.SSourceTSet) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) SDirectTLink(edu.iu.dsc.tws.tset.links.streaming.SDirectTLink) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) Logger(java.util.logging.Logger) JobConfig(edu.iu.dsc.tws.api.JobConfig) BatchTsetExample(edu.iu.dsc.tws.examples.tset.batch.BatchTsetExample) StreamingEnvironment(edu.iu.dsc.tws.tset.env.StreamingEnvironment)

Aggregations

WorkerEnvironment (edu.iu.dsc.tws.api.resource.WorkerEnvironment) 49
Logger (java.util.logging.Logger) 46
Config (edu.iu.dsc.tws.api.config.Config) 42
Iterator (java.util.Iterator) 27
TSetEnvironment (edu.iu.dsc.tws.tset.env.TSetEnvironment) 26
JobConfig (edu.iu.dsc.tws.api.JobConfig) 25
Tuple (edu.iu.dsc.tws.api.comms.structs.Tuple) 24
ResourceAllocator (edu.iu.dsc.tws.rsched.core.ResourceAllocator) 23
BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment) 23
SourceTSet (edu.iu.dsc.tws.tset.sets.batch.SourceTSet) 23
HashMap (java.util.HashMap) 22
LogicalPlanBuilder (edu.iu.dsc.tws.comms.utils.LogicalPlanBuilder) 21
MessageTypes (edu.iu.dsc.tws.api.comms.messaging.types.MessageTypes) 20
Set (java.util.Set) 20
ResultsVerifier (edu.iu.dsc.tws.examples.verification.ResultsVerifier) 19
IntArrayComparator (edu.iu.dsc.tws.examples.verification.comparators.IntArrayComparator) 19
BenchmarkUtils (edu.iu.dsc.tws.examples.utils.bench.BenchmarkUtils) 18
Timing (edu.iu.dsc.tws.examples.utils.bench.Timing) 18
BenchWorker (edu.iu.dsc.tws.examples.comms.BenchWorker) 14
BenchmarkConstants (edu.iu.dsc.tws.examples.utils.bench.BenchmarkConstants) 13