Example 21 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class HadoopTSet method execute.

@Override
public void execute(Config config, JobAPI.Job job, IWorkerController workerController, IPersistentVolume persistentVolume, IVolatileVolume volatileVolume) {
    int workerId = workerController.getWorkerInfo().getWorkerID();
    WorkerEnvironment workerEnv = WorkerEnvironment.init(config, job, workerController, persistentVolume, volatileVolume);
    BatchEnvironment tSetEnv = TSetEnvironment.initBatch(workerEnv);
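    // Build a Hadoop Configuration from the HDFS config directory and set the input path
    Configuration configuration = new Configuration();
    configuration.addResource(new Path(HdfsDataContext.getHdfsConfigDirectory(config)));
    configuration.set(TextInputFormat.INPUT_DIR, "/input4");
    // Read with TextInputFormat (parallelism 4) and map each (offset, line) record to a String
    SourceTSet<String> source = tSetEnv.createHadoopSource(configuration, TextInputFormat.class, 4, new MapFunc<Tuple<LongWritable, Text>, String>() {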

        @Override
        public String map(Tuple<LongWritable, Text> input) {
            return input.getKey().toString() + " : " + input.getValue().toString();
        }
    });
    SinkTSet<Iterator<String>> sink = source.direct().sink((SinkFunc<Iterator<String>>) value -> {
        while (value.hasNext()) {
            String next = value.next();
            LOG.info("Received value: " + next);
        }
        return true;
    });
    tSetEnv.run(sink);
}
Also used : Path(org.apache.hadoop.fs.Path) Twister2Job(edu.iu.dsc.tws.api.Twister2Job) HdfsDataContext(edu.iu.dsc.tws.data.utils.HdfsDataContext) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) Text(org.apache.hadoop.io.Text) IPersistentVolume(edu.iu.dsc.tws.api.resource.IPersistentVolume) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) MapFunc(edu.iu.dsc.tws.api.tset.fn.MapFunc) LongWritable(org.apache.hadoop.io.LongWritable) JobConfig(edu.iu.dsc.tws.api.JobConfig) TextInputFormat(org.apache.hadoop.mapreduce.lib.input.TextInputFormat) Configuration(org.apache.hadoop.conf.Configuration) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) Iterator(java.util.Iterator) IVolatileVolume(edu.iu.dsc.tws.api.resource.IVolatileVolume) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) SinkTSet(edu.iu.dsc.tws.tset.sets.batch.SinkTSet) JobAPI(edu.iu.dsc.tws.proto.system.job.JobAPI) Logger(java.util.logging.Logger) SinkFunc(edu.iu.dsc.tws.api.tset.fn.SinkFunc) Serializable(java.io.Serializable) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) IWorker(edu.iu.dsc.tws.api.resource.IWorker) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) IWorkerController(edu.iu.dsc.tws.api.resource.IWorkerController) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment)

Example 22 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class AddInputsExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    // source with 25..29
    SourceTSet<Integer> baseSrc = dummySourceOther(env, COUNT, PARALLELISM);
    // source with 0..4
    SourceTSet<Integer> src = dummySource(env, COUNT, PARALLELISM);
    CachedTSet<Integer> srcCache = src.direct().cache().setName("src");
    // cache baseSrc; srcCache becomes an input of the compute step below (see addInput)
    CachedTSet<Integer> baseSrcCache = baseSrc.direct().cache().setName("baseSrc");
    CachedTSet<Integer> out = baseSrcCache.direct().compute(new BaseComputeCollectorFunc<Iterator<Integer>, Integer>() {

        @Override
        public void compute(Iterator<Integer> input, RecordCollector<Integer> collector) {
            DataPartitionConsumer<Integer> c1 = (DataPartitionConsumer<Integer>) getInput("src-input").getConsumer();
            while (input.hasNext() && c1.hasNext()) {
                collector.collect(input.next() + c1.next());
            }
        }
    }).addInput("src-input", srcCache).lazyCache();
    for (int i = 0; i < 4; i++) {
        LOG.info("iter: " + i);
        env.evalAndUpdate(out, baseSrcCache);
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            // restore the interrupt flag instead of swallowing the interruption
            Thread.currentThread().interrupt();
        }
    }
    baseSrcCache.direct().forEach(l -> LOG.info(l.toString()));
}
Also used : BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) Iterator(java.util.Iterator) DataPartitionConsumer(edu.iu.dsc.tws.api.dataset.DataPartitionConsumer)
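
The loop in Example 22 repeatedly adds the cached "src" values into "baseSrc". As a rough illustration only, the following plain-Java sketch mimics that arithmetic without any Twister2 classes. It assumes a single partition, takes the values 25..29 and 0..4 from the comments in the example, and assumes that each evalAndUpdate call replaces the base data with the computed output; the class and method names (AddInputsSketch, iterate) are hypothetical.

```java
import java.util.Arrays;

public class AddInputsSketch {

    // Plain-Java stand-in for the evalAndUpdate loop; not the Twister2 API.
    // Each iteration pairs the current base values with src and sums them,
    // then uses the result as the base for the next iteration (assumed semantics).
    static int[] iterate(int[] base, int[] src, int iterations) {
        int[] current = base.clone();
        for (int iter = 0; iter < iterations; iter++) {
            // like the compute function, stop at the shorter of the two inputs
            int n = Math.min(current.length, src.length);
            int[] out = new int[n];
            for (int i = 0; i < n; i++) {
                out[i] = current[i] + src[i];
            }
            current = out;
        }
        return current;
    }

    public static void main(String[] args) {
        int[] base = {25, 26, 27, 28, 29}; // assumed content of baseSrc
        int[] src = {0, 1, 2, 3, 4};       // assumed content of src
        // prints [25, 30, 35, 40, 45]
        System.out.println(Arrays.toString(iterate(base, src, 4)));
    }
}
```

In the real job the data is split across PARALLELISM partitions, so the values logged per worker may differ from this single-list sketch.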

Example 23 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class BranchingExample method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    int para = 2;
    SourceTSet<Integer> src = dummySource(env, COUNT, para).setName("src0");
    KeyedTSet<Integer, Integer> left = src.mapToTuple(i -> new Tuple<>(i % 2, i)).setName("left");
    KeyedTSet<Integer, Integer> right = src.mapToTuple(i -> new Tuple<>(i % 2, i + 1)).setName("right");
    JoinTLink<Integer, Integer, Integer> join = left.join(right, CommunicationContext.JoinType.INNER, Integer::compareTo).setName("join");
    ComputeTSet<String> map = join.map(t -> "(" + t.getKey() + " " + t.getLeftValue() + " " + t.getRightValue() + ")").setName("map***");
    ComputeTSet<String> map1 = map.direct().map(s -> "###" + s).setName("map@@");
    ComputeTSet<String> union = map.union(map1).setName("union");
    union.direct().forEach(s -> LOG.info(s));
}
Also used : CommunicationContext(edu.iu.dsc.tws.api.comms.CommunicationContext) Tuple(edu.iu.dsc.tws.api.comms.structs.Tuple) ComputeTSet(edu.iu.dsc.tws.tset.sets.batch.ComputeTSet) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) Logger(java.util.logging.Logger) KeyedTSet(edu.iu.dsc.tws.tset.sets.batch.KeyedTSet) JobConfig(edu.iu.dsc.tws.api.JobConfig) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) JoinTLink(edu.iu.dsc.tws.tset.links.batch.JoinTLink)
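
To make the data flow of Example 23 easier to follow, here is a plain-Java sketch of the same shape: two keyed branches derived from one source, an inner join on the key, a second branch that prefixes each joined string, and a union of both. It does not use Twister2; it assumes the source yields 0..count-1 as a single list, and the names BranchingSketch and run are hypothetical. Ordering in a distributed join is not guaranteed, so only the set of results is comparable.

```java
import java.util.ArrayList;
import java.util.List;

public class BranchingSketch {

    // Plain-Java stand-in for the branching pipeline; not the Twister2 API.
    static List<String> run(int count) {
        // left branch: (i % 2, i); right branch: (i % 2, i + 1)
        List<int[]> left = new ArrayList<>();
        List<int[]> right = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            left.add(new int[] {i % 2, i});
            right.add(new int[] {i % 2, i + 1});
        }
        // inner join: pair every left value with every right value of the same key
        List<String> joined = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {
                if (l[0] == r[0]) {
                    joined.add("(" + l[0] + " " + l[1] + " " + r[1] + ")");
                }
            }
        }
        // union of the joined strings and their "###"-prefixed copies
        List<String> union = new ArrayList<>(joined);
        for (String s : joined) {
            union.add("###" + s);
        }
        return union;
    }

    public static void main(String[] args) {
        // with count = 4 there are 8 joined records, so 16 lines are printed
        run(4).forEach(System.out::println);
    }
}
```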

Example 24 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class BaseTSetBatchWorker method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    jobParameters = JobParameters.build(env.getConfig());
    experimentData = new ExperimentData();
    experimentData.setTaskStages(jobParameters.getTaskStages());
    if (jobParameters.isStream()) {
        throw new IllegalStateException("This worker does not support streaming. Please use TSetStreamingWorker instead");
    } else {
        experimentData.setOperationMode(OperationMode.BATCH);
        experimentData.setIterations(jobParameters.getIterations());
    }
}
Also used : BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) ExperimentData(edu.iu.dsc.tws.examples.verification.ExperimentData)

Example 25 with BatchEnvironment

use of edu.iu.dsc.tws.tset.env.BatchEnvironment in project twister2 by DSC-SPIDAL.

the class HelloTSet method execute.

@Override
public void execute(WorkerEnvironment workerEnv) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnv);
    LOG.info("Starting Hello TSet Example");
    int para = env.getConfig().getIntegerValue("para", 4);
    SourceTSet<int[]> source = env.createSource(new SourceFunc<int[]>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            return count < para;
        }

        @Override
        public int[] next() {
            count++;
            return new int[] { 1, 1, 1 };
        }
    }, para).setName("source");
    PartitionTLink<int[]> partitioned = source.partition(new LoadBalancePartitioner<>());
    ComputeTSet<int[]> mappedPartition = partitioned.map((MapFunc<int[], int[]>) input -> Arrays.stream(input).map(a -> a * 2).toArray());
    ReduceTLink<int[]> reduce = mappedPartition.reduce((t1, t2) -> {
        int[] ret = new int[t1.length];
        for (int i = 0; i < t1.length; i++) {
            ret[i] = t1[i] + t2[i];
        }
        return ret;
    });
    SinkTSet<int[]> sink = reduce.sink(value -> {
        LOG.info("Results " + Arrays.toString(value));
        return false;
    });
    env.run(sink);
    LOG.info("Ending Hello TSet Example");
}
Also used : Twister2Job(edu.iu.dsc.tws.api.Twister2Job) Arrays(java.util.Arrays) ResourceAllocator(edu.iu.dsc.tws.rsched.core.ResourceAllocator) Options(org.apache.commons.cli.Options) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) HashMap(java.util.HashMap) Config(edu.iu.dsc.tws.api.config.Config) MapFunc(edu.iu.dsc.tws.api.tset.fn.MapFunc) JobConfig(edu.iu.dsc.tws.api.JobConfig) DefaultParser(org.apache.commons.cli.DefaultParser) ReduceTLink(edu.iu.dsc.tws.tset.links.batch.ReduceTLink) CommandLine(org.apache.commons.cli.CommandLine) ComputeTSet(edu.iu.dsc.tws.tset.sets.batch.ComputeTSet) SourceTSet(edu.iu.dsc.tws.tset.sets.batch.SourceTSet) CommandLineParser(org.apache.commons.cli.CommandLineParser) SinkTSet(edu.iu.dsc.tws.tset.sets.batch.SinkTSet) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) LoadBalancePartitioner(edu.iu.dsc.tws.tset.fn.LoadBalancePartitioner) Logger(java.util.logging.Logger) Serializable(java.io.Serializable) Twister2Submitter(edu.iu.dsc.tws.rsched.job.Twister2Submitter) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) TSetEnvironment(edu.iu.dsc.tws.tset.env.TSetEnvironment) ParseException(org.apache.commons.cli.ParseException) PartitionTLink(edu.iu.dsc.tws.tset.links.batch.PartitionTLink) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker)
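
The HelloTSet pipeline in Example 25 doubles every element of each emitted array and then reduces all arrays by element-wise addition. The plain-Java sketch below reproduces only that arithmetic and is not the Twister2 API. It assumes that each of the "para" source instances emits "para" arrays of { 1, 1, 1 } and that the reduce combines the arrays from all instances; the names HelloTSetSketch and run are hypothetical.

```java
import java.util.Arrays;

public class HelloTSetSketch {

    // Plain-Java stand-in for source -> map (double) -> reduce (element-wise sum).
    static int[] run(int para) {
        int[] sum = null;
        for (int instance = 0; instance < para; instance++) {   // parallel source instances
            for (int count = 0; count < para; count++) {        // records emitted per instance
                int[] doubled = Arrays.stream(new int[] {1, 1, 1}).map(a -> a * 2).toArray();
                if (sum == null) {
                    sum = doubled;
                } else {
                    for (int i = 0; i < sum.length; i++) {
                        sum[i] += doubled[i];
                    }
                }
            }
        }
        return sum; // null when para is 0, since nothing was emitted
    }

    public static void main(String[] args) {
        // 4 instances x 4 records x 2 per element: prints Results [32, 32, 32]
        System.out.println("Results " + Arrays.toString(run(4)));
    }
}
```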

Aggregations

BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment): 59
Config (edu.iu.dsc.tws.api.config.Config): 24
TSetEnvironment (edu.iu.dsc.tws.tset.env.TSetEnvironment): 24
JobConfig (edu.iu.dsc.tws.api.JobConfig): 23
WorkerEnvironment (edu.iu.dsc.tws.api.resource.WorkerEnvironment): 23
Logger (java.util.logging.Logger): 23
SourceTSet (edu.iu.dsc.tws.tset.sets.batch.SourceTSet): 22
HashMap (java.util.HashMap): 22
ResourceAllocator (edu.iu.dsc.tws.rsched.core.ResourceAllocator): 21
Iterator (java.util.Iterator): 21
Tuple (edu.iu.dsc.tws.api.comms.structs.Tuple): 18
ComputeCollectorFunc (edu.iu.dsc.tws.api.tset.fn.ComputeCollectorFunc): 12
ComputeFunc (edu.iu.dsc.tws.api.tset.fn.ComputeFunc): 12
TSetContext (edu.iu.dsc.tws.api.tset.TSetContext): 7
SinkTSet (edu.iu.dsc.tws.tset.sets.batch.SinkTSet): 6
Twister2Job (edu.iu.dsc.tws.api.Twister2Job): 5
MapFunc (edu.iu.dsc.tws.api.tset.fn.MapFunc): 5
SinkFunc (edu.iu.dsc.tws.api.tset.fn.SinkFunc): 5
Twister2Submitter (edu.iu.dsc.tws.rsched.job.Twister2Submitter): 5
ComputeTSet (edu.iu.dsc.tws.tset.sets.batch.ComputeTSet): 5