Example 1 with Row

use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL.

the class STPartition method isComplete.

@Override
public boolean isComplete() {
    for (Map.Entry<Integer, Queue<Table>> e : inputs.entrySet()) {
        if (e.getValue().isEmpty()) {
            continue;
        }
        // partition the table, default
        Table t = e.getValue().poll();
        List<ArrowColumn> columns = t.getColumns();
        ArrowColumn col = columns.get(indexes[0]);
        for (int i = 0; i < col.getVector().getValueCount(); i++) {
            Row row = new OneRow(col.get(i));
            int target = selector.next(e.getKey(), row);
            TableBuilder builder = partitionedTables.get(target);
            if (builder == null) {
                builder = new ArrowTableBuilder(schema, allocator);
                this.partitionedTables.put(target, builder);
            }
            for (int j = 0; j < columns.size(); j++) {
                builder.getColumns().get(j).addValue(columns.get(j).get(i));
            }
        }
    }
    if (finished) {
        for (Map.Entry<Integer, TableBuilder> e : partitionedTables.entrySet()) {
            Table t = e.getValue().build();
            allToAll.insert(t, e.getKey());
        }
        // clear the tables, so we won't build the tables again
        partitionedTables.clear();
        for (int s : finishedSources) {
            allToAll.finish(s);
        }
        // clear, so we won't call finish again
        finishedSources.clear();
        return allToAll.isComplete();
    }
    return false;
}
Also used : Table(edu.iu.dsc.tws.common.table.Table) ArrowTableBuilder(edu.iu.dsc.tws.common.table.ArrowTableBuilder) TableBuilder(edu.iu.dsc.tws.common.table.TableBuilder) ArrowColumn(edu.iu.dsc.tws.common.table.ArrowColumn) OneRow(edu.iu.dsc.tws.common.table.OneRow) Row(edu.iu.dsc.tws.common.table.Row) HashMap(java.util.HashMap) Map(java.util.Map) Queue(java.util.Queue)
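
The loop in isComplete reads one key column, asks a selector for a target per row, and copies every column value of that row into a per-target builder. The following is a minimal standalone sketch of that pattern using plain Java collections instead of the Arrow-backed Table and TableBuilder. The class name ColumnPartitionSketch, the column-major List<List<Object>> layout, and the hash-modulo selector are all assumptions made for illustration; they are not part of twister2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the table partitioning step; not twister2 code.
public class ColumnPartitionSketch {

    /**
     * Splits a column-major table into one column-major table per target.
     * columns.get(c).get(r) is the value of column c at row r.
     */
    public static Map<Integer, List<List<Object>>> partition(
            List<List<Object>> columns, int keyColumn, int numTargets) {
        Map<Integer, List<List<Object>>> partitioned = new HashMap<>();
        List<Object> keys = columns.get(keyColumn);
        for (int i = 0; i < keys.size(); i++) {
            // Assumed selector: hash of the key value modulo the number of targets.
            int target = Math.floorMod(keys.get(i).hashCode(), numTargets);
            List<List<Object>> builder = partitioned.get(target);
            if (builder == null) {
                // Lazily create an empty set of columns for a target seen for the first time.
                builder = new ArrayList<>();
                for (int j = 0; j < columns.size(); j++) {
                    builder.add(new ArrayList<>());
                }
                partitioned.put(target, builder);
            }
            // Copy row i, column by column, into the target's builder.
            for (int j = 0; j < columns.size(); j++) {
                builder.get(j).add(columns.get(j).get(i));
            }
        }
        return partitioned;
    }

    public static void main(String[] args) {
        List<List<Object>> columns = Arrays.asList(
                Arrays.<Object>asList(0, 1, 2, 3),
                Arrays.<Object>asList("a", "b", "c", "d"));
        System.out.println(partition(columns, 0, 2));
    }
}
```

As in the original method, a builder is created only when the first row for a target arrives, so targets that receive no rows never produce an empty table.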

Example 2 with Row

use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL.

the class PartitionExample method execute.

@Override
public void execute(WorkerEnvironment workerEnvironment) {
    BatchEnvironment env = TSetEnvironment.initBatch(workerEnvironment);
    List<TField> fieldList = new ArrayList<>();
    fieldList.add(new TField("first", MessageTypes.INTEGER));
    fieldList.add(new TField("second", MessageTypes.DOUBLE));
    RowSourceTSet src = env.createRowSource("row", new SourceFunc<Row>() {

        private int count = 0;

        @Override
        public boolean hasNext() {
            // side-effect free, so repeated calls do not consume elements
            return count < 1000;
        }

        @Override
        public Row next() {
            count++;
            return new TwoRow(1, 4.1);
        }
    }, 4).withSchema(new RowSchema(fieldList));
    BatchRowTLink partition = src.partition(new PartitionFunc<Row>() {

        private List<Integer> targets;

        private Random random;

        private int c = 0;

        private Map<Integer, Integer> counts = new HashMap<>();

        @Override
        public void prepare(Set<Integer> sources, Set<Integer> destinations) {
            targets = new ArrayList<>(destinations);
            random = new Random();
            for (int t : targets) {
                counts.put(t, 0);
            }
        }

        @Override
        public int partition(int sourceIndex, Row val) {
            // counts is keyed by target id (see prepare), so resolve the target first
            int target = targets.get(random.nextInt(targets.size()));
            counts.put(target, counts.get(target) + 1);
            c++;
            if (c == 1000) {
                LOG.info("COUNTS " + counts);
            }
            return target;
        }
    }, 4, 0);
    partition.forEach(new ApplyFunc<Row>() {

        private TSetContext ctx;

        private int count;

        @Override
        public void prepare(TSetContext context) {
            ctx = context;
        }

        @Override
        public void apply(Row data) {
            LOG.info(ctx.getIndex() + " Data " + data.get(0) + ", " + data.get(1) + ", count " + count++);
        }
    });
}
Also used : RowSchema(edu.iu.dsc.tws.api.tset.schema.RowSchema) RowSourceTSet(edu.iu.dsc.tws.tset.sets.batch.row.RowSourceTSet) TField(edu.iu.dsc.tws.common.table.TField) HashMap(java.util.HashMap) BatchEnvironment(edu.iu.dsc.tws.tset.env.BatchEnvironment) ArrayList(java.util.ArrayList) SourceFunc(edu.iu.dsc.tws.api.tset.fn.SourceFunc) TSetContext(edu.iu.dsc.tws.api.tset.TSetContext) Random(java.util.Random) TwoRow(edu.iu.dsc.tws.common.table.TwoRow) BatchRowTLink(edu.iu.dsc.tws.api.tset.link.batch.BatchRowTLink) Row(edu.iu.dsc.tws.common.table.Row)
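
The partition function in this example picks a random destination for each row and keeps a per-destination tally. The sketch below isolates that logic without any twister2 types, with the tally keyed by destination id so it also works when ids are not 0-based. The class name RandomPartitionSketch, the seed parameter, and the counts() accessor are assumptions introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical standalone version of a random partitioner; not twister2 code.
public class RandomPartitionSketch {
    private final List<Integer> targets;
    private final Random random;
    private final Map<Integer, Integer> counts = new HashMap<>();

    public RandomPartitionSketch(Set<Integer> destinations, long seed) {
        this.targets = new ArrayList<>(destinations);
        this.random = new Random(seed);
        // Start every destination at zero so the tally lists idle destinations too.
        for (int t : targets) {
            counts.put(t, 0);
        }
    }

    /** Chooses a destination id uniformly at random and records the choice. */
    public int partition() {
        int target = targets.get(random.nextInt(targets.size()));
        counts.merge(target, 1, Integer::sum);
        return target;
    }

    public Map<Integer, Integer> counts() {
        return counts;
    }

    public static void main(String[] args) {
        Set<Integer> destinations = new TreeSet<>(Arrays.asList(10, 20, 30));
        RandomPartitionSketch p = new RandomPartitionSketch(destinations, 42L);
        for (int i = 0; i < 1000; i++) {
            p.partition();
        }
        System.out.println("COUNTS " + p.counts());
    }
}
```

The destination ids 10, 20 and 30 are deliberately not list positions, which makes the difference between a list index and a destination id visible.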

Example 3 with Row

use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL.

the class RowBatchTLinkImpl method lazyPersist.

public StorableTBase<Row> lazyPersist() {
    DiskPersistSingleSink<Row> diskPersistSingleSink = new DiskPersistSingleSink<>(this.getId());
    RowPersistedTSet persistedTSet = new RowPersistedTSet(getTSetEnv(), diskPersistSingleSink, getTargetParallelism(), (RowSchema) getSchema());
    addChildToGraph(persistedTSet);
    return persistedTSet;
}
Also used : DiskPersistSingleSink(edu.iu.dsc.tws.tset.sinks.DiskPersistSingleSink) RowPersistedTSet(edu.iu.dsc.tws.tset.sets.batch.row.RowPersistedTSet) Row(edu.iu.dsc.tws.common.table.Row)

Example 4 with Row

use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL.

the class RowMapCompute method compute.

@Override
public void compute(Iterator<Row> input, RecordCollector<Row> output) {
    while (input.hasNext()) {
        Row next = input.next();
        Row out = mapFn.map(next);
        output.collect(out);
    }
}
Also used : Row(edu.iu.dsc.tws.common.table.Row)
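
The compute method drains an input iterator, applies a map function to each element, and hands the result to a collector. A minimal generic sketch of the same shape follows. It uses java.util.function.Function and Consumer as stand-ins for the map function and RecordCollector, which is an assumption for illustration and not the twister2 API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical generic analogue of a map-style compute step; not twister2 code.
public class MapComputeSketch {

    /** Applies mapFn to every element of input and passes each result to output. */
    public static <T, R> void compute(
            Iterator<T> input, Function<T, R> mapFn, Consumer<R> output) {
        while (input.hasNext()) {
            T next = input.next();
            output.accept(mapFn.apply(next));
        }
    }

    public static void main(String[] args) {
        List<Integer> lengths = new ArrayList<>();
        // Map each string to its length and collect the results in order.
        compute(Arrays.asList("a", "bb", "ccc").iterator(), String::length, lengths::add);
        System.out.println(lengths);
    }
}
```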

Example 5 with Row

use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL.

the class RowBatchTLinkImpl method lazyCache.

public StorableTBase<Row> lazyCache() {
    RowCachedTSet cacheTSet = new RowCachedTSet(getTSetEnv(), new CacheSingleSink<Row>(), getTargetParallelism(), (RowSchema) getSchema());
    addChildToGraph(cacheTSet);
    return cacheTSet;
}
Also used : RowCachedTSet(edu.iu.dsc.tws.tset.sets.batch.row.RowCachedTSet) Row(edu.iu.dsc.tws.common.table.Row)

Aggregations

Row (edu.iu.dsc.tws.common.table.Row): 7
ArrowTableBuilder (edu.iu.dsc.tws.common.table.ArrowTableBuilder): 2
HashMap (java.util.HashMap): 2
TSetContext (edu.iu.dsc.tws.api.tset.TSetContext): 1
SourceFunc (edu.iu.dsc.tws.api.tset.fn.SourceFunc): 1
BatchRowTLink (edu.iu.dsc.tws.api.tset.link.batch.BatchRowTLink): 1
RowSchema (edu.iu.dsc.tws.api.tset.schema.RowSchema): 1
ArrowColumn (edu.iu.dsc.tws.common.table.ArrowColumn): 1
OneRow (edu.iu.dsc.tws.common.table.OneRow): 1
TField (edu.iu.dsc.tws.common.table.TField): 1
Table (edu.iu.dsc.tws.common.table.Table): 1
TableBuilder (edu.iu.dsc.tws.common.table.TableBuilder): 1
TwoRow (edu.iu.dsc.tws.common.table.TwoRow): 1
BatchChkPntEnvironment (edu.iu.dsc.tws.tset.env.BatchChkPntEnvironment): 1
BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment): 1
CheckpointedTSet (edu.iu.dsc.tws.tset.sets.batch.CheckpointedTSet): 1
RowCachedTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowCachedTSet): 1
RowPersistedTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowPersistedTSet): 1
RowSourceTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowSourceTSet): 1
DiskPersistSingleSink (edu.iu.dsc.tws.tset.sinks.DiskPersistSingleSink): 1