Example 36 with ComputeConnection

use of edu.iu.dsc.tws.task.impl.ComputeConnection in project twister2 by DSC-SPIDAL.

the class SvmSgdAdvancedRunner method executeWeightVectorLoadingTaskGraph.

/**
 * Loads the weight vector data in distributed mode.
 * dataStreamerParallelism is the degree of parallelism used
 * when loading the data.
 *
 * @return twister2 DataObject containing the weight vector data
 */
public DataObject<Object> executeWeightVectorLoadingTaskGraph() {
    DataObject<Object> data = null;
    DataObjectSource sourceTask = new DataObjectSource(Context.TWISTER2_DIRECT_EDGE, this.svmJobParameters.getWeightVectorDataDir());
    DataObjectSink sinkTask = new DataObjectSink();
    trainingBuilder.addSource(Constants.SimpleGraphConfig.DATA_OBJECT_SOURCE, sourceTask, dataStreamerParallelism);
    ComputeConnection firstGraphComputeConnection = trainingBuilder.addCompute(Constants.SimpleGraphConfig.DATA_OBJECT_SINK, sinkTask, dataStreamerParallelism);
    firstGraphComputeConnection.direct(Constants.SimpleGraphConfig.DATA_OBJECT_SOURCE).viaEdge(Context.TWISTER2_DIRECT_EDGE).withDataType(MessageTypes.OBJECT);
    trainingBuilder.setMode(OperationMode.BATCH);
    ComputeGraph datapointsTaskGraph = trainingBuilder.build();
    datapointsTaskGraph.setGraphName("weight-vector-loading-graph");
    ExecutionPlan firstGraphExecutionPlan = taskExecutor.plan(datapointsTaskGraph);
    taskExecutor.execute(datapointsTaskGraph, firstGraphExecutionPlan);
    data = taskExecutor.getOutput(datapointsTaskGraph, firstGraphExecutionPlan, Constants.SimpleGraphConfig.DATA_OBJECT_SINK);
    if (data == null) {
        throw new NullPointerException("Something Went Wrong in Loading Weight Vector");
    } else {
        LOG.info("Weight Vector Total Partitions : " + data.getPartitions().length);
    }
    return data;
}
Also used : DataObjectSink(edu.iu.dsc.tws.task.dataobjects.DataObjectSink) ExecutionPlan(edu.iu.dsc.tws.api.compute.executor.ExecutionPlan) ComputeGraph(edu.iu.dsc.tws.api.compute.graph.ComputeGraph) DataObject(edu.iu.dsc.tws.api.dataset.DataObject) DataObjectSource(edu.iu.dsc.tws.task.dataobjects.DataObjectSource) ComputeConnection(edu.iu.dsc.tws.task.impl.ComputeConnection)
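
The source-to-sink wiring in the example above follows a builder pattern: tasks are registered under a name with a parallelism, and the connection object returned for a downstream task declares which upstream task feeds it. The following is a minimal, self-contained sketch of that pattern only. MiniGraphSketch, MiniGraphBuilder, Connection, and Edge are hypothetical stand-ins written for illustration; they are not twister2 classes and do not reproduce the behaviour of ComputeGraphBuilder or ComputeConnection.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in classes for illustration; NOT the twister2 API.
public class MiniGraphSketch {

    /** A named, directed edge between two registered tasks. */
    static final class Edge {
        final String from;
        final String to;
        final String name;

        Edge(String from, String to, String name) {
            this.from = from;
            this.to = to;
            this.name = name;
        }
    }

    /** Returned by addCompute so the caller can declare the task's input. */
    static final class Connection {
        private final MiniGraphBuilder builder;
        private final String target;

        Connection(MiniGraphBuilder builder, String target) {
            this.builder = builder;
            this.target = target;
        }

        /** Records an edge from the given upstream task to this task. */
        Connection direct(String source, String edgeName) {
            builder.edges.add(new Edge(source, target, edgeName));
            return this;
        }
    }

    /** Collects tasks and edges, then validates them in build(). */
    static final class MiniGraphBuilder {
        final Map<String, Integer> parallelism = new LinkedHashMap<>();
        final List<Edge> edges = new ArrayList<>();

        void addSource(String name, int tasks) {
            parallelism.put(name, tasks);
        }

        Connection addCompute(String name, int tasks) {
            parallelism.put(name, tasks);
            return new Connection(this, name);
        }

        /** Fails fast if an edge refers to an upstream task that was never added. */
        List<Edge> build() {
            for (Edge e : edges) {
                if (!parallelism.containsKey(e.from)) {
                    throw new IllegalStateException("Unknown source task: " + e.from);
                }
            }
            return new ArrayList<>(edges);
        }
    }

    public static void main(String[] args) {
        MiniGraphBuilder builder = new MiniGraphBuilder();
        builder.addSource("source", 4);
        builder.addCompute("sink", 4).direct("source", "direct-edge");
        List<Edge> edges = builder.build();
        Edge first = edges.get(0);
        System.out.println(first.from + " -> " + first.to + " via " + first.name);
    }
}
```

Validating edge endpoints in build() is a design choice of this sketch: a misspelled task name surfaces when the graph is built rather than later, at execution time.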

Example 37 with ComputeConnection

use of edu.iu.dsc.tws.task.impl.ComputeConnection in project twister2 by DSC-SPIDAL.

the class SvmSgdOnlineRunner method buildStreamingTrainingTG.

private ComputeGraph buildStreamingTrainingTG() {
    iterativeStreamingDataStreamer = new IterativeStreamingDataStreamer(this.svmJobParameters.getFeatures(), OperationMode.STREAMING, this.svmJobParameters.isDummy(), this.binaryBatchModel);
    BaseWindowedSink baseWindowedSink = getWindowSinkInstance();
    iterativeStreamingCompute = new IterativeStreamingCompute(OperationMode.STREAMING, new ReduceAggregator(), this.svmJobParameters);
    IterativeStreamingSinkEvaluator iterativeStreamingSinkEvaluator = new IterativeStreamingSinkEvaluator();
    trainingBuilder.addSource(Constants.SimpleGraphConfig.ITERATIVE_STREAMING_DATASTREAMER_SOURCE, iterativeStreamingDataStreamer, dataStreamerParallelism);
    ComputeConnection svmComputeConnection = trainingBuilder.addCompute(Constants.SimpleGraphConfig.ITERATIVE_STREAMING_SVM_COMPUTE, baseWindowedSink, dataStreamerParallelism);
    ComputeConnection svmReduceConnection = trainingBuilder.addCompute("window-sink", iterativeStreamingCompute, dataStreamerParallelism);
    ComputeConnection svmFinalEvaluationConnection = trainingBuilder.addCompute("window-evaluation-sink", iterativeStreamingSinkEvaluator, dataStreamerParallelism);
    svmComputeConnection.direct(Constants.SimpleGraphConfig.ITERATIVE_STREAMING_DATASTREAMER_SOURCE).viaEdge(Constants.SimpleGraphConfig.STREAMING_EDGE).withDataType(MessageTypes.DOUBLE_ARRAY);
    svmReduceConnection.allreduce(Constants.SimpleGraphConfig.ITERATIVE_STREAMING_SVM_COMPUTE).viaEdge("window-sink-edge").withReductionFunction(new ReduceAggregator()).withDataType(MessageTypes.DOUBLE_ARRAY);
    svmFinalEvaluationConnection.allreduce("window-sink").viaEdge("window-evaluation-edge").withReductionFunction(new IterativeAccuracyReduceFunction()).withDataType(MessageTypes.DOUBLE);
    trainingBuilder.setMode(OperationMode.STREAMING);
    trainingBuilder.setTaskGraphName(IterativeSVMConstants.ITERATIVE_STREAMING_TRAINING_TASK_GRAPH);
    return trainingBuilder.build();
}
Also used : IterativeStreamingDataStreamer(edu.iu.dsc.tws.examples.ml.svm.streamer.IterativeStreamingDataStreamer) BaseWindowedSink(edu.iu.dsc.tws.task.window.core.BaseWindowedSink) ReduceAggregator(edu.iu.dsc.tws.examples.ml.svm.aggregate.ReduceAggregator) IterativeStreamingCompute(edu.iu.dsc.tws.examples.ml.svm.compute.IterativeStreamingCompute) IterativeStreamingSinkEvaluator(edu.iu.dsc.tws.examples.ml.svm.compute.window.IterativeStreamingSinkEvaluator) ComputeConnection(edu.iu.dsc.tws.task.impl.ComputeConnection) IterativeAccuracyReduceFunction(edu.iu.dsc.tws.examples.ml.svm.aggregate.IterativeAccuracyReduceFunction)

Example 38 with ComputeConnection

use of edu.iu.dsc.tws.task.impl.ComputeConnection in project twister2 by DSC-SPIDAL.

the class SvmSgdIterativeRunner method buildWeightVectorTG.

private ComputeGraph buildWeightVectorTG() {
    DataFileReplicatedReadSource dataFileReplicatedReadSource = new DataFileReplicatedReadSource(Context.TWISTER2_DIRECT_EDGE, this.svmJobParameters.getWeightVectorDataDir(), 1);
    IterativeSVMWeightVectorObjectCompute weightVectorObjectCompute = new IterativeSVMWeightVectorObjectCompute(Context.TWISTER2_DIRECT_EDGE, 1, this.svmJobParameters.getFeatures());
    IterativeSVMWeightVectorObjectDirectSink weightVectorObjectSink = new IterativeSVMWeightVectorObjectDirectSink();
    ComputeGraphBuilder weightVectorComputeGraphBuilder = ComputeGraphBuilder.newBuilder(config);
    weightVectorComputeGraphBuilder.addSource(Constants.SimpleGraphConfig.WEIGHT_VECTOR_OBJECT_SOURCE, dataFileReplicatedReadSource, dataStreamerParallelism);
    ComputeConnection weightVectorComputeConnection = weightVectorComputeGraphBuilder.addCompute(Constants.SimpleGraphConfig.WEIGHT_VECTOR_OBJECT_COMPUTE, weightVectorObjectCompute, dataStreamerParallelism);
    ComputeConnection weightVectorSinkConnection = weightVectorComputeGraphBuilder.addCompute(Constants.SimpleGraphConfig.WEIGHT_VECTOR_OBJECT_SINK, weightVectorObjectSink, dataStreamerParallelism);
    weightVectorComputeConnection.direct(Constants.SimpleGraphConfig.WEIGHT_VECTOR_OBJECT_SOURCE).viaEdge(Context.TWISTER2_DIRECT_EDGE).withDataType(MessageTypes.OBJECT);
    weightVectorSinkConnection.direct(Constants.SimpleGraphConfig.WEIGHT_VECTOR_OBJECT_COMPUTE).viaEdge(Context.TWISTER2_DIRECT_EDGE).withDataType(MessageTypes.DOUBLE_ARRAY);
    weightVectorComputeGraphBuilder.setMode(operationMode);
    weightVectorComputeGraphBuilder.setTaskGraphName(IterativeSVMConstants.WEIGHT_VECTOR_LOADING_TASK_GRAPH);
    return weightVectorComputeGraphBuilder.build();
}
Also used : DataFileReplicatedReadSource(edu.iu.dsc.tws.task.dataobjects.DataFileReplicatedReadSource) IterativeSVMWeightVectorObjectCompute(edu.iu.dsc.tws.examples.ml.svm.data.IterativeSVMWeightVectorObjectCompute) IterativeSVMWeightVectorObjectDirectSink(edu.iu.dsc.tws.examples.ml.svm.data.IterativeSVMWeightVectorObjectDirectSink) ComputeGraphBuilder(edu.iu.dsc.tws.task.impl.ComputeGraphBuilder) ComputeConnection(edu.iu.dsc.tws.task.impl.ComputeConnection)

Example 39 with ComputeConnection

use of edu.iu.dsc.tws.task.impl.ComputeConnection in project twister2 by DSC-SPIDAL.

the class SvmSgdIterativeRunner method buildSvmSgdIterativeTrainingTG.

private ComputeGraph buildSvmSgdIterativeTrainingTG() {
    iterativeDataStream = new IterativeDataStream(this.svmJobParameters.getFeatures(), this.operationMode, this.svmJobParameters.isDummy(), this.binaryBatchModel);
    iterativeSVMRiterativeSVMWeightVectorReduce = new IterativeSVMWeightVectorReduce(this.operationMode);
    trainingBuilder.addSource(Constants.SimpleGraphConfig.ITERATIVE_DATASTREAMER_SOURCE, iterativeDataStream, dataStreamerParallelism);
    ComputeConnection svmComputeConnection = trainingBuilder.addCompute(Constants.SimpleGraphConfig.ITERATIVE_SVM_REDUCE, iterativeSVMRiterativeSVMWeightVectorReduce, dataStreamerParallelism);
    svmComputeConnection.allreduce(Constants.SimpleGraphConfig.ITERATIVE_DATASTREAMER_SOURCE).viaEdge(Constants.SimpleGraphConfig.REDUCE_EDGE).withReductionFunction(new IterativeWeightVectorReduceFunction()).withDataType(MessageTypes.DOUBLE_ARRAY);
    trainingBuilder.setMode(operationMode);
    trainingBuilder.setTaskGraphName(IterativeSVMConstants.ITERATIVE_TRAINING_TASK_GRAPH);
    return trainingBuilder.build();
}
Also used : IterativeDataStream(edu.iu.dsc.tws.examples.ml.svm.streamer.IterativeDataStream) IterativeWeightVectorReduceFunction(edu.iu.dsc.tws.examples.ml.svm.aggregate.IterativeWeightVectorReduceFunction) IterativeSVMWeightVectorReduce(edu.iu.dsc.tws.examples.ml.svm.aggregate.IterativeSVMWeightVectorReduce) ComputeConnection(edu.iu.dsc.tws.task.impl.ComputeConnection)

Example 40 with ComputeConnection

use of edu.iu.dsc.tws.task.impl.ComputeConnection in project twister2 by DSC-SPIDAL.

the class SvmSgdIterativeRunner method generateGenericDataPointLoader.

private ComputeGraph generateGenericDataPointLoader(int samples, int parallelism, int numOfFeatures, String dataSourcePathStr, String dataObjectSourceStr, String dataObjectComputeStr, String dataObjectSinkStr, String graphName) {
    SVMDataObjectSource<String, TextInputSplit> sourceTask = new SVMDataObjectSource(Context.TWISTER2_DIRECT_EDGE, dataSourcePathStr, samples);
    IterativeSVMDataObjectCompute dataObjectCompute = new IterativeSVMDataObjectCompute(Context.TWISTER2_DIRECT_EDGE, parallelism, samples, numOfFeatures, DELIMITER);
    IterativeSVMDataObjectDirectSink iterativeSVMPrimaryDataObjectDirectSink = new IterativeSVMDataObjectDirectSink();
    ComputeGraphBuilder datapointsComputeGraphBuilder = ComputeGraphBuilder.newBuilder(config);
    datapointsComputeGraphBuilder.addSource(dataObjectSourceStr, sourceTask, parallelism);
    ComputeConnection datapointComputeConnection = datapointsComputeGraphBuilder.addCompute(dataObjectComputeStr, dataObjectCompute, parallelism);
    ComputeConnection computeConnectionSink = datapointsComputeGraphBuilder.addCompute(dataObjectSinkStr, iterativeSVMPrimaryDataObjectDirectSink, parallelism);
    datapointComputeConnection.direct(dataObjectSourceStr).viaEdge(Context.TWISTER2_DIRECT_EDGE).withDataType(MessageTypes.OBJECT);
    computeConnectionSink.direct(dataObjectComputeStr).viaEdge(Context.TWISTER2_DIRECT_EDGE).withDataType(MessageTypes.OBJECT);
    datapointsComputeGraphBuilder.setMode(this.operationMode);
    datapointsComputeGraphBuilder.setTaskGraphName(graphName);
    // Build the data point loader task graph
    return datapointsComputeGraphBuilder.build();
}
Also used : TextInputSplit(edu.iu.dsc.tws.data.api.splits.TextInputSplit) IterativeSVMDataObjectDirectSink(edu.iu.dsc.tws.examples.ml.svm.data.IterativeSVMDataObjectDirectSink) IterativeSVMDataObjectCompute(edu.iu.dsc.tws.examples.ml.svm.data.IterativeSVMDataObjectCompute) ComputeGraphBuilder(edu.iu.dsc.tws.task.impl.ComputeGraphBuilder) SVMDataObjectSource(edu.iu.dsc.tws.examples.ml.svm.data.SVMDataObjectSource) ComputeConnection(edu.iu.dsc.tws.task.impl.ComputeConnection)

Aggregations

ComputeConnection (edu.iu.dsc.tws.task.impl.ComputeConnection) 65
ComputeGraphBuilder (edu.iu.dsc.tws.task.impl.ComputeGraphBuilder) 55
ComputeGraph (edu.iu.dsc.tws.api.compute.graph.ComputeGraph) 40
TaskSchedulerClassTest (edu.iu.dsc.tws.tsched.utils.TaskSchedulerClassTest) 16
ExecutionPlan (edu.iu.dsc.tws.api.compute.executor.ExecutionPlan) 13
DataFlowGraph (edu.iu.dsc.tws.task.cdfw.DataFlowGraph) 8
DataObject (edu.iu.dsc.tws.api.dataset.DataObject) 6
GraphDataSource (edu.iu.dsc.tws.graphapi.partition.GraphDataSource) 6
DataObjectSource (edu.iu.dsc.tws.task.dataobjects.DataObjectSource) 6
DataObjectSink (edu.iu.dsc.tws.task.dataobjects.DataObjectSink) 5
ReduceAggregator (edu.iu.dsc.tws.examples.ml.svm.aggregate.ReduceAggregator) 4
ConnectedSink (edu.iu.dsc.tws.task.cdfw.task.ConnectedSink) 4
SVMReduce (edu.iu.dsc.tws.examples.ml.svm.aggregate.SVMReduce) 3
DataFileReplicatedReadSource (edu.iu.dsc.tws.task.dataobjects.DataFileReplicatedReadSource) 3
IExecutor (edu.iu.dsc.tws.api.compute.executor.IExecutor) 2
Config (edu.iu.dsc.tws.api.config.Config) 2
TextInputSplit (edu.iu.dsc.tws.data.api.splits.TextInputSplit) 2
IterativeAccuracyReduceFunction (edu.iu.dsc.tws.examples.ml.svm.aggregate.IterativeAccuracyReduceFunction) 2
IterativeSVMCompute (edu.iu.dsc.tws.examples.ml.svm.compute.IterativeSVMCompute) 2
SVMCompute (edu.iu.dsc.tws.examples.ml.svm.compute.SVMCompute) 2