
Example 6 with Row

Use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL, in the persist method of class RowBatchTLinkImpl.

/**
 * Similar to cache, but stores data on disk rather than in memory.
 */
public StorableTBase<Row> persist() {
    // handling checkpointing
    if (getTSetEnv().isCheckpointingEnabled()) {
        String persistVariableName = this.getId() + "-persisted";
        BatchChkPntEnvironment chkEnv = (BatchChkPntEnvironment) getTSetEnv();
        Boolean persisted = chkEnv.initVariable(persistVariableName, false);
        if (persisted) {
            // create a source function with the capability to read from disk
            DiskPartitionBackedSource<Row> sourceFn = new DiskPartitionBackedSource<>(this.getId());
            // pass the source fn to the checkpointed tset, which creates a source tset from the
            // source function the same way a persisted tset would. This preserves the order in
            // which tsets are created in the checkpointed env
            CheckpointedTSet<Row> checkTSet = new CheckpointedTSet<>(getTSetEnv(), sourceFn, this.getTargetParallelism(), getSchema());
            // adding checkpointed tset to the graph, so that the IDs would not change
            addChildToGraph(checkTSet);
            // run only the checkpointed tset so that it would populate the inputs in the executor
            getTSetEnv().runOne(checkTSet);
            return checkTSet;
        } else {
            StorableTBase<Row> storable = this.doPersist();
            chkEnv.updateVariable(persistVariableName, true);
            chkEnv.commit();
            return storable;
        }
    }
    return doPersist();
}
Also used: DiskPartitionBackedSource (edu.iu.dsc.tws.tset.sources.DiskPartitionBackedSource), BatchChkPntEnvironment (edu.iu.dsc.tws.tset.env.BatchChkPntEnvironment), Row (edu.iu.dsc.tws.common.table.Row), CheckpointedTSet (edu.iu.dsc.tws.tset.sets.batch.CheckpointedTSet)
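
The persist method above guards an expensive step with a checkpointed flag: if the flag was committed in an earlier run, the result is restored from storage; otherwise it is computed, the flag is set, and the change is committed. A minimal, self-contained sketch of that control flow follows. CheckpointGuardSketch and persistOnce are hypothetical names, and the in-memory maps only stand in for a real checkpoint store; none of this is the Twister2 API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a checkpointing environment (assumption, not Twister2 code).
class CheckpointGuardSketch {
    // Values that survived a commit; a real system would keep these in durable storage.
    private final Map<String, Boolean> committed = new HashMap<>();
    // Updates that are not yet visible until commit() is called.
    private final Map<String, Boolean> pending = new HashMap<>();

    // Returns the committed value of the variable, or the default if none was committed.
    boolean initVariable(String name, boolean defaultValue) {
        return committed.getOrDefault(name, defaultValue);
    }

    // Stages a new value for the variable.
    void updateVariable(String name, boolean value) {
        pending.put(name, value);
    }

    // Makes all staged values visible to later initVariable calls.
    void commit() {
        committed.putAll(pending);
        pending.clear();
    }

    // Runs compute the first time and restore on every later call with the same id.
    <T> T persistOnce(String id, Supplier<T> compute, Supplier<T> restore) {
        String flag = id + "-persisted";
        if (initVariable(flag, false)) {
            return restore.get();
        }
        T result = compute.get();
        updateVariable(flag, true);
        commit();
        return result;
    }
}
```

In this sketch the flag is committed only after compute has returned, so a failure during the computation leaves the flag unset and the next attempt computes again.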

Example 7 with Row

Use of edu.iu.dsc.tws.common.table.Row in project twister2 by DSC-SPIDAL, in the execute method of class RowSourceOp.

@Override
public void execute() {
    if (source.hasNext()) {
        // todo:: change source function to accept a row, so we don't have to allocate new
        // row every time
        Row tuple = source.next();
        builder.add(tuple);
        if (builder.currentSize() > tableMaxSize) {
            multiEdgeOpAdapter.writeToEdges(builder.build());
            builder = new ArrowTableBuilder(schema.toArrowSchema(), runtime.getRootAllocator());
        }
    } else {
        multiEdgeOpAdapter.writeToEdges(builder.build());
        builder = null;
        multiEdgeOpAdapter.writeEndToEdges();
    }
}
Also used: ArrowTableBuilder (edu.iu.dsc.tws.common.table.ArrowTableBuilder), Row (edu.iu.dsc.tws.common.table.Row)
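
The execute method above buffers rows, emits a table once the buffer grows past a size limit, and emits whatever remains when the source is exhausted. The same batching logic can be sketched with plain lists. BatchingSourceSketch and drain are hypothetical names, a List stands in for the Arrow table, and the loop replaces the repeated execute calls of the original; this is an illustration under those assumptions, not the Twister2 implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of threshold-based batching (assumption, not Twister2 code).
class BatchingSourceSketch {
    // Reads every element from source and groups the elements into batches.
    // A batch is emitted as soon as its size exceeds maxSize, so full batches
    // hold maxSize + 1 elements. The remaining buffer is always emitted at the
    // end, even if it is empty, mirroring the unconditional final flush above.
    static <T> List<List<T>> drain(Iterator<T> source, int maxSize) {
        List<List<T>> emitted = new ArrayList<>();
        List<T> buffer = new ArrayList<>();
        while (source.hasNext()) {
            buffer.add(source.next());
            if (buffer.size() > maxSize) {
                emitted.add(buffer);
                buffer = new ArrayList<>(); // start a fresh buffer after each flush
            }
        }
        emitted.add(buffer); // final flush of whatever is left
        return emitted;
    }
}
```

Because the size check uses a strict greater-than comparison, a limit of 1 produces batches of two elements; a caller that wants batches of exactly maxSize elements would compare with >= instead.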

Aggregations

Row (edu.iu.dsc.tws.common.table.Row) 7
ArrowTableBuilder (edu.iu.dsc.tws.common.table.ArrowTableBuilder) 2
HashMap (java.util.HashMap) 2
TSetContext (edu.iu.dsc.tws.api.tset.TSetContext) 1
SourceFunc (edu.iu.dsc.tws.api.tset.fn.SourceFunc) 1
BatchRowTLink (edu.iu.dsc.tws.api.tset.link.batch.BatchRowTLink) 1
RowSchema (edu.iu.dsc.tws.api.tset.schema.RowSchema) 1
ArrowColumn (edu.iu.dsc.tws.common.table.ArrowColumn) 1
OneRow (edu.iu.dsc.tws.common.table.OneRow) 1
TField (edu.iu.dsc.tws.common.table.TField) 1
Table (edu.iu.dsc.tws.common.table.Table) 1
TableBuilder (edu.iu.dsc.tws.common.table.TableBuilder) 1
TwoRow (edu.iu.dsc.tws.common.table.TwoRow) 1
BatchChkPntEnvironment (edu.iu.dsc.tws.tset.env.BatchChkPntEnvironment) 1
BatchEnvironment (edu.iu.dsc.tws.tset.env.BatchEnvironment) 1
CheckpointedTSet (edu.iu.dsc.tws.tset.sets.batch.CheckpointedTSet) 1
RowCachedTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowCachedTSet) 1
RowPersistedTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowPersistedTSet) 1
RowSourceTSet (edu.iu.dsc.tws.tset.sets.batch.row.RowSourceTSet) 1
DiskPersistSingleSink (edu.iu.dsc.tws.tset.sinks.DiskPersistSingleSink) 1