
Example 1 with SourceFunc

Use of edu.iu.dsc.tws.api.tset.fn.SourceFunc in project twister2 by DSC-SPIDAL.

From class TSetCheckptExample, method execute.

@Override
public void execute(WorkerEnvironment workerEnvironment) {
    BatchChkPntEnvironment env = TSetEnvironment.initCheckpointing(workerEnvironment);
    LOG.info(String.format("Hello from worker %d", env.getWorkerID()));
    SourceTSet<Integer> sourceX = env.createSource(new SourceFunc<Integer>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            return count < 10000;
        }

        @Override
        public Integer next() {
            return count++;
        }
    }, 4);
    long t1 = System.currentTimeMillis();
    ComputeTSet<Object> twoComputes = sourceX.direct().compute((itr, c) -> {
        itr.forEachRemaining(i -> {
            c.collect(i * 5);
        });
    }).direct().compute((itr, c) -> {
        itr.forEachRemaining(i -> {
            c.collect((int) i + 2);
        });
    });
    LOG.info("Time for two computes : " + (System.currentTimeMillis() - t1));
    t1 = System.currentTimeMillis();
    PersistedTSet<Object> persist = twoComputes.persist();
    LOG.info("Time for persist : " + (System.currentTimeMillis() - t1) / 1000);
    // When persist() is called, Twister2 runs all computation and communication
    // up to this point and persists the result to disk.
    // The earlier data then becomes eligible for garbage collection, freeing memory.
    // If persist() is called in a checkpointing-enabled job, it creates a snapshot
    // here, and a restarted job resumes directly from this point.
    // Like CachedTSets, PersistedTSets can be used as inputs to other TSets and
    // operations.
    persist.reduce((i1, i2) -> {
        return (int) i1 + (int) i2;
    }).forEach(i -> {
        LOG.info("SUM=" + i);
    });
}
Also used : Twister2Job(edu.iu.dsc.tws.api.Twister2Job) ComputeTSet(edu.iu.dsc.tws.tset.sets.batch.ComputeTSet) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) Logger(java.util.logging.Logger) BatchChkPntEnvironment(edu.iu.dsc.tws.tset.env.BatchChkPntEnvironment) JobConfig(edu.iu.dsc.tws.api.JobConfig) Serializable(java.io.Serializable) PersistedTSet(edu.iu.dsc.tws.tset.sets.batch.PersistedTSet) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker)

Example 2 with SourceFunc

Use of edu.iu.dsc.tws.api.tset.fn.SourceFunc in project twister2 by DSC-SPIDAL.

From class PartitionExample, method execute.

@Override
public void execute(WorkerEnvironment workerEnvironment) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnvironment);
    List<TField> fieldList = new ArrayList<>();
    fieldList.add(new TField("first", MessageTypes.INTEGER));
    fieldList.add(new TField("second", MessageTypes.DOUBLE));
    RowSourceTSet src = env.createRowSource("row", new SourceFunc<Row>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            // Keep hasNext() free of side effects; next() advances the counter.
            return count < 1000;
        }

        @Override
        public Row next() {
            count++;
            return new TwoRow(1, 4.1);
        }
    }, 4).withSchema(new RowSchema(fieldList));
    BatchRowTLink partition = src.partition(new PartitionFunc<Row>() {

        private List<Integer> targets;

        private Random random;

        private int c = 0;

        private Map<Integer, Integer> counts = new HashMap<>();

        @Override
        public void prepare(Set<Integer> sources, Set<Integer> destinations) {
            targets = new ArrayList<>(destinations);
            random = new Random();
            for (int t : targets) {
                counts.put(t, 0);
            }
        }

        @Override
        public int partition(int sourceIndex, Row val) {
            // Pick a random destination and count by destination ID, which is
            // how counts was keyed in prepare().
            int target = targets.get(random.nextInt(targets.size()));
            counts.merge(target, 1, Integer::sum);
            c++;
            if (c == 1000) {
                LOG.info("COUNTS " + counts);
            }
            return target;
        }
    }, 4, 0);
    partition.forEach(new ApplyFunc<Row>() {

        private TSetContext ctx;

        private int count;

        @Override
        public void prepare(TSetContext context) {
            ctx = context;
        }

        @Override
        public void apply(Row data) {
            LOG.info(ctx.getIndex() + " Data " + data.get(0) + ", " + data.get(1) + ", count " + count++);
        }
    });
}
Also used : RowSchema(edu.iu.dsc.tws.api.tset.schema.RowSchema) RowSourceTSet(edu.iu.dsc.tws.tset.sets.batch.row.RowSourceTSet) TField(edu.iu.dsc.tws.common.table.TField) HashMap(java.util.HashMap) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) ArrayList(java.util.ArrayList) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) TSetContext(edu.iu.dsc.tws.api.tset.TSetContext) Random(java.util.Random) TwoRow(edu.iu.dsc.tws.common.table.TwoRow) BatchRowTLink(edu.iu.dsc.tws.api.tset.link.batch.BatchRowTLink) Row(edu.iu.dsc.tws.common.table.Row)
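
The random-target logic of the partition function in this example can be tried outside Twister2. The following is a minimal plain-Java sketch, not Twister2 code: the class name RandomTargetPicker, the seed parameter, and the destination IDs 10 to 13 are assumptions made for illustration. It keys the counters by destination ID, so it also works when the IDs are not 0..n-1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.TreeMap;

// Stand-alone sketch; RandomTargetPicker is a hypothetical name, not a Twister2 class.
final class RandomTargetPicker {
    private final List<Integer> targets;
    private final Random random;
    // Counters keyed by destination ID (not by position in the targets list).
    private final Map<Integer, Integer> counts = new TreeMap<>();

    RandomTargetPicker(Set<Integer> destinations, long seed) {
        this.targets = new ArrayList<>(destinations);
        this.random = new Random(seed);
        for (int t : targets) {
            counts.put(t, 0);
        }
    }

    // Chooses one destination uniformly at random and records the choice.
    int pick() {
        int target = targets.get(random.nextInt(targets.size()));
        counts.merge(target, 1, Integer::sum);
        return target;
    }

    Map<Integer, Integer> counts() {
        return counts;
    }

    public static void main(String[] args) {
        // Destination IDs are deliberately not 0..3 (assumed values).
        RandomTargetPicker picker = new RandomTargetPicker(Set.of(10, 11, 12, 13), 42L);
        for (int i = 0; i < 1000; i++) {
            picker.pick();
        }
        System.out.println("COUNTS " + picker.counts());
    }
}
```

The per-destination counts always add up to the number of calls to pick(); how evenly they spread depends on the random sequence.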

Example 3 with SourceFunc

Use of edu.iu.dsc.tws.api.tset.fn.SourceFunc in project twister2 by DSC-SPIDAL.

From class TSetCommunicationExample, method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    LOG.info(String.format("Hello from worker %d", env.getWorkerID()));
    SourceTSet<Integer> sourceX = env.createSource(new SourceFunc<Integer>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            return count < 10;
        }

        @Override
        public Integer next() {
            return count++;
        }
    }, 4);
    sourceX.direct().compute((itr, collector) -> {
        itr.forEachRemaining(i -> {
            collector.collect(i * 5);
        });
    }).direct().compute((itr, collector) -> {
        itr.forEachRemaining(i -> {
            collector.collect((int) i + 2);
        });
    }).reduce((i1, i2) -> {
        return (int) i1 + (int) i2;
    }).forEach(i -> {
        LOG.info("SUM=" + i);
    });
}
Also used : Twister2Job(edu.iu.dsc.tws.api.Twister2Job) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) Logger(java.util.logging.Logger) JobConfig(edu.iu.dsc.tws.api.JobConfig) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker) Serializable(java.io.Serializable)
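
The arithmetic of this pipeline (multiply by 5, add 2, then sum) can be checked in plain Java for the values a single source instance emits. This is a sketch with java.util.stream, not Twister2 code; the class name PipelineCheck and the method pipelineSum are hypothetical. What the job itself logs also depends on how Twister2 combines the four parallel source instances, which this sketch does not model.

```java
import java.util.stream.IntStream;

// Stand-alone sketch; PipelineCheck is a hypothetical name, not a Twister2 class.
final class PipelineCheck {

    // Mirrors the two compute steps and the reduce for the values 0..n-1
    // produced by one source instance.
    static int pipelineSum(int n) {
        return IntStream.range(0, n)
                .map(i -> i * 5)   // first compute step
                .map(i -> i + 2)   // second compute step
                .sum();            // reduce: pairwise addition
    }

    public static void main(String[] args) {
        // The source in this example stops after 10 values (count < 10).
        System.out.println("SUM=" + pipelineSum(10)); // prints SUM=245
    }
}
```

For 10 values the result is 5 * (0 + 1 + ... + 9) + 2 * 10 = 245; for the 10000 values used in Example 1 the same formula gives 249995000, which still fits in an int.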

Example 4 with SourceFunc

Use of edu.iu.dsc.tws.api.tset.fn.SourceFunc in project twister2 by DSC-SPIDAL.

From class HelloTSet, method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    LOG.info("Starting Hello TSet Example");
    int para = env.getConfig().getIntegerValue("para", 4);
    SourceTSet<int[]> source = env.createSource(new SourceFunc<int[]>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            return count < para;
        }

        @Override
        public int[] next() {
            count++;
            return new int[] { 1, 1, 1 };
        }
    }, para).setName("source");
    PartitionTLink<int[]> partitioned = source.partition(new LoadBalancePartitioner<>());
    ComputeTSet<int[]> mappedPartition = partitioned.map((MapFunc<int[], int[]>) input -> Arrays.stream(input).map(a -> a * 2).toArray());
    ReduceTLink<int[]> reduce = mappedPartition.reduce((t1, t2) -> {
        int[] ret = new int[t1.length];
        for (int i = 0; i < t1.length; i++) {
            ret[i] = t1[i] + t2[i];
        }
        return ret;
    });
    SinkTSet<int[]> sink = reduce.sink(value -> {
        LOG.info("Results " + Arrays.toString(value));
        return false;
    });
    env.run(sink);
    LOG.info("Ending Hello TSet Example");
}
Also used : Twister2Job(edu.iu.dsc.tws.api.Twister2Job) Arrays(java.util.Arrays) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) Options(org.apache.commons.cli.Options) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) MapFunc(edu.iu.dsc.tws.api.tset.fn.MapFunc) JobConfig(edu.iu.dsc.tws.api.JobConfig) DefaultParser(org.apache.commons.cli.DefaultParser) ReduceTLink(edu.iu.dsc.tws.tset.links.batch.ReduceTLink) CommandLine(org.apache.commons.cli.CommandLine) ComputeTSet(edu.iu.dsc.tws.tset.sets.batch.ComputeTSet) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) CommandLineParser(org.apache.commons.cli.CommandLineParser) SinkTSet(edu.iu.dsc.tws.tset.sets.batch.SinkTSet) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) LoadBalancePartitioner(edu.iu.dsc.tws.tset.fn.LoadBalancePartitioner) Logger(java.util.logging.Logger) Serializable(java.io.Serializable) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ParseException(org.apache.commons.cli.ParseException) PartitionTLink(edu.iu.dsc.tws.tset.links.batch.PartitionTLink) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker)
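
The map and reduce functions in this example are ordinary Java lambdas, so their combined effect can be checked without a cluster. The sketch below uses java.util.stream in place of the TSet API; the class name ElementwiseReduceCheck and its methods are hypothetical, and the input of n copies of {1, 1, 1} models what one source instance emits. The total number of arrays that reach the reduce in the real job depends on the parallelism, which is not modelled here.

```java
import java.util.Arrays;
import java.util.stream.Stream;

// Stand-alone sketch; ElementwiseReduceCheck is a hypothetical name, not a Twister2 class.
final class ElementwiseReduceCheck {

    // Same body as the map function in the example: doubles every element.
    static int[] doubleAll(int[] input) {
        return Arrays.stream(input).map(a -> a * 2).toArray();
    }

    // Same body as the reduce function in the example: element-wise sum.
    static int[] add(int[] t1, int[] t2) {
        int[] ret = new int[t1.length];
        for (int i = 0; i < t1.length; i++) {
            ret[i] = t1[i] + t2[i];
        }
        return ret;
    }

    // Applies map then reduce to n copies of {1, 1, 1}.
    static int[] run(int n) {
        return Stream.generate(() -> new int[] {1, 1, 1})
                .limit(n)
                .map(ElementwiseReduceCheck::doubleAll)
                .reduce(ElementwiseReduceCheck::add)
                .orElse(new int[0]);
    }

    public static void main(String[] args) {
        System.out.println("Results " + Arrays.toString(run(4))); // prints Results [8, 8, 8]
    }
}
```

Each input array becomes {2, 2, 2} after the map, so reducing n of them yields {2n, 2n, 2n}.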

Aggregations

SourceFunc (edu.iu.dsc.tws.api.tset.fn.SourceFunc): 4
JobConfig (edu.iu.dsc.tws.api.JobConfig): 3
Twister2Job (edu.iu.dsc.tws.api.Twister2Job): 3
Twister2Worker (edu.iu.dsc.tws.api.resource.Twister2Worker): 3
WorkerEnvironment (edu.iu.dsc.tws.api.resource.WorkerEnvironment): 3
Twister2Submitter (edu.iu.dsc.tws.rsched.job.Twister2Submitter): 3
BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment): 3
TSetEnvironment (edu.iu.dsc.tws.tset.env.TSetEnvironment): 3
SourceTSet (edu.iu.dsc.tws.tset.sets.batch.SourceTSet): 3
Serializable (java.io.Serializable): 3
Logger (java.util.logging.Logger): 3
ComputeTSet (edu.iu.dsc.tws.tset.sets.batch.ComputeTSet): 2
HashMap (java.util.HashMap): 2
Config (edu.iu.dsc.tws.api.config.Config): 1
TSetContext (edu.iu.dsc.tws.api.tset.TSetContext): 1
MapFunc (edu.iu.dsc.tws.api.tset.fn.MapFunc): 1
BatchRowTLink (edu.iu.dsc.tws.api.tset.link.batch.BatchRowTLink): 1
RowSchema (edu.iu.dsc.tws.api.tset.schema.RowSchema): 1
Row (edu.iu.dsc.tws.common.table.Row): 1
TField (edu.iu.dsc.tws.common.table.TField): 1