
Example 1 with ISenderToDriver

use of edu.iu.dsc.tws.api.resource.ISenderToDriver in project twister2 by DSC-SPIDAL.

the class TaskWorker method execute.

@Override
public void execute(Config cfg, JobAPI.Job job, IWorkerController wController, IPersistentVolume pVolume, IVolatileVolume vVolume) {
    this.config = cfg;
    this.workerId = wController.getWorkerInfo().getWorkerID();
    this.workerController = wController;
    this.persistentVolume = pVolume;
    this.volatileVolume = vVolume;
    ISenderToDriver senderToDriver = JMWorkerAgent.getJMWorkerAgent().getDriverAgent();
    workerEnvironment = WorkerEnvironment.init(config, job, workerController, pVolume, vVolume);
    computeEnvironment = ComputeEnvironment.init(workerEnvironment);
    // to keep backward compatibility
    taskExecutor = computeEnvironment.getTaskExecutor();
    // call execute
    execute();
    // wait for the sync
    try {
        workerEnvironment.getWorkerController().waitOnBarrier();
    } catch (TimeoutException timeoutException) {
        LOG.log(Level.SEVERE, timeoutException.getMessage(), timeoutException);
    }
    computeEnvironment.close();
    // let's terminate the network
    workerEnvironment.close();
    // we are done executing
    // If the execute returns without any errors we assume that the job completed properly
    JobExecutionState.WorkerJobState workerState = JobExecutionState.WorkerJobState.newBuilder()
        .setFailure(false)
        .setJobName(config.getStringValue(Context.JOB_ID))
        .setWorkerMessage("Worker Completed")
        .build();
    senderToDriver.sendToDriver(workerState);
    LOG.log(Level.FINE, String.format("%d Worker done", workerId));
}
Also used : ISenderToDriver(edu.iu.dsc.tws.api.resource.ISenderToDriver) JobExecutionState(edu.iu.dsc.tws.proto.system.JobExecutionState) TimeoutException(edu.iu.dsc.tws.api.exceptions.TimeoutException)
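
Example 1 ignores the value returned by sendToDriver, while Example 2 below checks it and logs a failure. A minimal sketch of the checked variant follows. It is not twister2 code: SenderToDriver is a hypothetical stand-in that takes a String so the sketch runs without the library, whereas the snippets on this page pass protobuf messages to the real interface. The reportCompletion helper and the message text are likewise assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

public class CompletionReportSketch {
    private static final Logger LOG =
        Logger.getLogger(CompletionReportSketch.class.getName());

    /** Hypothetical stand-in for ISenderToDriver (assumption: String payload). */
    public interface SenderToDriver {
        boolean sendToDriver(String message);
    }

    /** Sends a completion report and returns whether the sender accepted it. */
    public static boolean reportCompletion(SenderToDriver sender, String jobId) {
        String report = jobId + ": Worker Completed";
        boolean sent = sender.sendToDriver(report);
        if (!sent) {
            // Mirrors the check in Example 2: log instead of failing silently.
            LOG.severe("Unable to send the worker completion message: " + report);
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        SenderToDriver working = delivered::add;   // List.add returns true
        SenderToDriver broken = message -> false;  // simulates a failed send
        System.out.println(reportCompletion(working, "job-1"));
        System.out.println(reportCompletion(broken, "job-1"));
    }
}
```

Returning the boolean lets the caller decide whether a failed report should change the worker's final status; the sketch itself only logs.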

Example 2 with ISenderToDriver

use of edu.iu.dsc.tws.api.resource.ISenderToDriver in project twister2 by DSC-SPIDAL.

the class CDFWRuntime method handleExecuteMessage.

private boolean handleExecuteMessage(Any msg) {
    ISenderToDriver senderToDriver = JMWorkerAgent.getJMWorkerAgent().getDriverAgent();
    CDFWJobAPI.ExecuteMessage executeMessage;
    ExecutionPlan executionPlan;
    CDFWJobAPI.ExecuteCompletedMessage completedMessage = null;
    try {
        executeMessage = msg.unpack(CDFWJobAPI.ExecuteMessage.class);
        // get the subgraph from the execute message
        CDFWJobAPI.SubGraph subGraph = executeMessage.getGraph();
        ComputeGraph taskGraph = (ComputeGraph) serializer.deserialize(subGraph.getGraphSerialized().toByteArray());
        if (taskGraph == null) {
            LOG.severe(workerId + " Unable to find the subgraph " + subGraph.getName());
            return true;
        }
        // use the task executor to create the execution plan and run the graph
        executionPlan = taskExecutor.plan(taskGraph);
        taskExecutor.execute(taskGraph, executionPlan);
        // tell the driver that the subgraph has completed
        completedMessage = CDFWJobAPI.ExecuteCompletedMessage.newBuilder().setSubgraphName(subGraph.getName()).build();
        if (!senderToDriver.sendToDriver(completedMessage)) {
            LOG.severe("Unable to send the subgraph completed message :" + completedMessage);
        }
    } catch (InvalidProtocolBufferException e) {
        LOG.log(Level.SEVERE, "Unable to unpack received message ", e);
    }
    return false;
}
Also used : ISenderToDriver(edu.iu.dsc.tws.api.resource.ISenderToDriver) ExecutionPlan(edu.iu.dsc.tws.api.compute.executor.ExecutionPlan) ComputeGraph(edu.iu.dsc.tws.api.compute.graph.ComputeGraph) InvalidProtocolBufferException(com.google.protobuf.InvalidProtocolBufferException) CDFWJobAPI(edu.iu.dsc.tws.proto.system.job.CDFWJobAPI)

Example 3 with ISenderToDriver

use of edu.iu.dsc.tws.api.resource.ISenderToDriver in project twister2 by DSC-SPIDAL.

the class Twister2WorkerStarter method execute.

@Override
public void execute(Config config, JobAPI.Job job, IWorkerController workerController, IPersistentVolume persistentVolume, IVolatileVolume volatileVolume) {
    int workerID = workerController.getWorkerInfo().getWorkerID();
    WorkerEnvironment workerEnv = WorkerEnvironment.init(config, job, workerController, persistentVolume, volatileVolume);
    String workerClass = job.getWorkerClassName();
    Twister2Worker worker;
    try {
        Object object = ReflectionUtils.newInstance(workerClass);
        worker = (Twister2Worker) object;
        LOG.info("loaded worker class: " + workerClass);
        worker.execute(workerEnv);
    } catch (ClassNotFoundException | InstantiationException | IllegalAccessException e) {
        LOG.severe(String.format("failed to load the worker class %s", workerClass));
        throw new RuntimeException(e);
    }
    // If the execute returns without any errors we assume that the job completed properly
    if (JobMasterContext.isJobMasterUsed(config) && !job.getDriverClassName().isEmpty()) {
        ISenderToDriver senderToDriver = WorkerRuntime.getSenderToDriver();
        JobExecutionState.WorkerJobState workerState = JobExecutionState.WorkerJobState.newBuilder()
            .setFailure(false)
            .setJobName(config.getStringValue(Context.JOB_ID))
            .setWorkerMessage("Worker Completed")
            .build();
        senderToDriver.sendToDriver(workerState);
    }
}
Also used : ISenderToDriver(edu.iu.dsc.tws.api.resource.ISenderToDriver) JobExecutionState(edu.iu.dsc.tws.proto.system.JobExecutionState) WorkerEnvironment(edu.iu.dsc.tws.api.resource.WorkerEnvironment) Twister2Worker(edu.iu.dsc.tws.api.resource.Twister2Worker)
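
Example 3 reports to the driver only when a job master is in use and the job names a driver class. The guard can be isolated as a small predicate, sketched below. The class and method names are hypothetical, and plain values stand in for JobMasterContext.isJobMasterUsed(config) and job.getDriverClassName(). The null check is an extra precaution for this stand-in and is not part of the original code.

```java
public class DriverGuardSketch {

    /**
     * Returns true only when there is a driver to report to:
     * a job master is in use and a driver class name is configured.
     */
    public static boolean shouldReportToDriver(boolean jobMasterUsed,
                                               String driverClassName) {
        return jobMasterUsed
            && driverClassName != null   // defensive; an assumption of this sketch
            && !driverClassName.isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical inputs covering the three cases.
        System.out.println(shouldReportToDriver(true, "com.example.MyDriver"));
        System.out.println(shouldReportToDriver(true, ""));
        System.out.println(shouldReportToDriver(false, "com.example.MyDriver"));
    }
}
```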

Example 4 with ISenderToDriver

use of edu.iu.dsc.tws.api.resource.ISenderToDriver in project twister2 by DSC-SPIDAL.

the class JobMasterClientExample method simulateClient.

/**
 * a method to simulate JMWorkerAgent running in workers
 */
public static void simulateClient(Config config, JobAPI.Job job, int workerID) {
    String workerIP = JMWorkerController.convertStringToIP("localhost").getHostAddress();
    int workerPort = 10000 + (int) (Math.random() * 10000);
    JobMasterAPI.NodeInfo nodeInfo = NodeInfoUtils.createNodeInfo("node.ip", "rack01", null);
    JobAPI.ComputeResource computeResource = job.getComputeResource(0);
    Map<String, Integer> additionalPorts = generateAdditionalPorts(config, workerPort);
    JobMasterAPI.WorkerInfo workerInfo = WorkerInfoUtils.createWorkerInfo(workerID, workerIP, workerPort, nodeInfo, computeResource, additionalPorts);
    int restartCount = K8sWorkerUtils.getAndInitRestartCount(config, job.getJobId(), workerInfo);
    long start = System.currentTimeMillis();
    WorkerRuntime.init(config, job, workerInfo, restartCount);
    long delay = System.currentTimeMillis() - start;
    LOG.severe("worker-" + workerID + " startupDelay " + delay);
    IWorkerStatusUpdater statusUpdater = WorkerRuntime.getWorkerStatusUpdater();
    IWorkerController workerController = WorkerRuntime.getWorkerController();
    ISenderToDriver senderToDriver = WorkerRuntime.getSenderToDriver();
    WorkerRuntime.addReceiverFromDriver(new IReceiverFromDriver() {

        @Override
        public void driverMessageReceived(Any anyMessage) {
            LOG.info("Received message from IDriver: \n" + anyMessage);
            senderToDriver.sendToDriver(anyMessage);
        }
    });
    try {
        List<JobMasterAPI.WorkerInfo> workerList = workerController.getAllWorkers();
        LOG.info("All workers joined... IDs: " + getIDs(workerList));
    } catch (TimeoutException timeoutException) {
        LOG.log(Level.SEVERE, timeoutException.getMessage(), timeoutException);
        return;
    }
    // wait
    sleeeep(2 * 1000);
    try {
        workerController.waitOnBarrier();
        LOG.info("All workers reached the barrier. Proceeding.......");
    } catch (TimeoutException timeoutException) {
        LOG.log(Level.SEVERE, timeoutException.getMessage(), timeoutException);
    }
    // int id = job.getNumberOfWorkers() - 1;
    // JobMasterAPI.WorkerInfo info = workerController.getWorkerInfoForID(id);
    // LOG.info("WorkerInfo for " + id + ": \n" + info);
    // wait up to 10 sec
    sleeeep((long) (Math.random() * 10 * 1000));
    // start the worker
    try {
        throwException(workerID);
    } catch (Throwable t) {
        // update worker status to FAILED
        statusUpdater.updateWorkerStatus(JobMasterAPI.WorkerState.FAILED);
        WorkerRuntime.close();
        // System.exit(1);
        throw t;
    }
    statusUpdater.updateWorkerStatus(JobMasterAPI.WorkerState.COMPLETED);
    WorkerRuntime.close();
    System.out.println("Client has finished the computation. Client exiting.");
}
Also used : ISenderToDriver(edu.iu.dsc.tws.api.resource.ISenderToDriver) IWorkerStatusUpdater(edu.iu.dsc.tws.api.resource.IWorkerStatusUpdater) IWorkerController(edu.iu.dsc.tws.api.resource.IWorkerController) JobAPI(edu.iu.dsc.tws.proto.system.job.JobAPI) Any(com.google.protobuf.Any) JobMasterAPI(edu.iu.dsc.tws.proto.jobmaster.JobMasterAPI) IReceiverFromDriver(edu.iu.dsc.tws.api.resource.IReceiverFromDriver) TimeoutException(edu.iu.dsc.tws.api.exceptions.TimeoutException)
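
Example 4 registers a receiver that echoes every driver message back through the sender. The pattern can be tried without a cluster using the in-memory sketch below. All types here are hypothetical stand-ins, not twister2 classes: SenderToDriver and ReceiverFromDriver use String payloads instead of protobuf Any, and FakeRuntime imitates only the receiver registration seen in the example.

```java
import java.util.ArrayList;
import java.util.List;

public class EchoReceiverSketch {

    /** Hypothetical stand-in for ISenderToDriver. */
    public interface SenderToDriver {
        boolean sendToDriver(String message);
    }

    /** Hypothetical stand-in for IReceiverFromDriver. */
    public interface ReceiverFromDriver {
        void driverMessageReceived(String message);
    }

    /** In-memory stand-in for the receiver registration part of the runtime. */
    public static class FakeRuntime {
        private final List<ReceiverFromDriver> receivers = new ArrayList<>();

        public void addReceiverFromDriver(ReceiverFromDriver receiver) {
            receivers.add(receiver);
        }

        /** Simulates the driver pushing a message to this worker. */
        public void deliverFromDriver(String message) {
            for (ReceiverFromDriver receiver : receivers) {
                receiver.driverMessageReceived(message);
            }
        }
    }

    /** Wires an echo receiver and returns what was sent back to the driver. */
    public static List<String> runEcho(String... driverMessages) {
        List<String> sentBack = new ArrayList<>();
        SenderToDriver sender = sentBack::add;
        FakeRuntime runtime = new FakeRuntime();
        // Same shape as Example 4: forward each received message to the driver.
        runtime.addReceiverFromDriver(message -> sender.sendToDriver(message));
        for (String message : driverMessages) {
            runtime.deliverFromDriver(message);
        }
        return sentBack;
    }

    public static void main(String[] args) {
        System.out.println(runEcho("ping", "scale-up"));
    }
}
```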

Aggregations

ISenderToDriver (edu.iu.dsc.tws.api.resource.ISenderToDriver) 4
TimeoutException (edu.iu.dsc.tws.api.exceptions.TimeoutException) 2
JobExecutionState (edu.iu.dsc.tws.proto.system.JobExecutionState) 2
Any (com.google.protobuf.Any) 1
InvalidProtocolBufferException (com.google.protobuf.InvalidProtocolBufferException) 1
ExecutionPlan (edu.iu.dsc.tws.api.compute.executor.ExecutionPlan) 1
ComputeGraph (edu.iu.dsc.tws.api.compute.graph.ComputeGraph) 1
IReceiverFromDriver (edu.iu.dsc.tws.api.resource.IReceiverFromDriver) 1
IWorkerController (edu.iu.dsc.tws.api.resource.IWorkerController) 1
IWorkerStatusUpdater (edu.iu.dsc.tws.api.resource.IWorkerStatusUpdater) 1
Twister2Worker (edu.iu.dsc.tws.api.resource.Twister2Worker) 1
WorkerEnvironment (edu.iu.dsc.tws.api.resource.WorkerEnvironment) 1
JobMasterAPI (edu.iu.dsc.tws.proto.jobmaster.JobMasterAPI) 1
CDFWJobAPI (edu.iu.dsc.tws.proto.system.job.CDFWJobAPI) 1
JobAPI (edu.iu.dsc.tws.proto.system.job.JobAPI) 1