
Example 6 with IWorker

use of edu.iu.dsc.tws.api.resource.IWorker in project twister2 by DSC-SPIDAL.

the class MesosWorker method launchTask.

@Override
public void launchTask(ExecutorDriver executorDriver, Protos.TaskInfo taskInfo) {
    LOG.info("Task start time(ms):" + System.currentTimeMillis());
    Integer id = Integer.parseInt(taskInfo.getData().toStringUtf8());
    LOG.info("Task " + id + " has started");
    Protos.TaskStatus status = Protos.TaskStatus.newBuilder().setTaskId(taskInfo.getTaskId()).setState(Protos.TaskState.TASK_RUNNING).build();
    executorDriver.sendStatusUpdate(status);
    // jobID = SchedulerContext.jobID(config);
    // System.out.println("job name is " + jobID);
    long port = 0;
    for (Protos.Resource r : taskInfo.getResourcesList()) {
        if (r.getName().equals("ports")) {
            port = r.getRanges().getRange(0).getBegin();
            break;
        }
    }
    MesosWorkerController workerController;
    try {
        JobAPI.Job job = JobUtils.readJobFile("twister2-job/" + jobID + ".job");
        workerController = new MesosWorkerController(config, job, InetAddress.getLocalHost().getHostAddress(), toIntExact(port), id);
        LOG.info("Initializing with zookeeper");
        workerController.initializeWithZooKeeper();
        LOG.info("Waiting for all workers to join");
        workerController.getAllWorkers();
        LOG.info("Everyone has joined");
        IWorker worker = JobUtils.initializeIWorker(job);
        worker.execute(config, job, workerController, null, null);
        workerController.close();
    } catch (UnknownHostException e) {
        LOG.severe("Host unknown " + e.getMessage());
    } catch (TimeoutException timeoutException) {
        LOG.log(Level.SEVERE, timeoutException.getMessage(), timeoutException);
        return;
    }
    // The below two lines can be used to send a message to the framework
    // String reply = id.toString();
    // executorDriver.sendFrameworkMessage(reply.getBytes());
    LOG.info("Task " + id + " has finished");
    status = Protos.TaskStatus.newBuilder().setTaskId(taskInfo.getTaskId()).setState(Protos.TaskState.TASK_FINISHED).build();
    executorDriver.sendStatusUpdate(status);
}
Also used : UnknownHostException(java.net.UnknownHostException) Protos(org.apache.mesos.Protos) IWorker(edu.iu.dsc.tws.api.resource.IWorker) JobAPI(edu.iu.dsc.tws.proto.system.job.JobAPI) TimeoutException(edu.iu.dsc.tws.api.exceptions.TimeoutException)
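
Note on the listing above: only the TimeoutException handler returns early, so after an UnknownHostException the method still falls through and reports TASK_FINISHED. The following is a minimal sketch of one way to map the outcome of the worker body to a terminal state. It is not the project's code: TaskState is a stand-in enum rather than the real Protos.TaskState, and WorkerBody and runAndReport are hypothetical names introduced only for illustration.

```java
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class TaskStatusSketch {
    // Stand-in for the task states used in launchTask; NOT the real Protos.TaskState.
    enum TaskState { TASK_RUNNING, TASK_FINISHED, TASK_FAILED }

    // Hypothetical hook that wraps controller setup and worker.execute(...).
    interface WorkerBody {
        void run() throws Exception;
    }

    // Runs the body and returns the status updates that would be sent, in order.
    static List<TaskState> runAndReport(WorkerBody body) {
        List<TaskState> updates = new ArrayList<>();
        updates.add(TaskState.TASK_RUNNING);
        try {
            body.run();
            updates.add(TaskState.TASK_FINISHED);
        } catch (Exception e) {
            // Any failure in the body ends in a failed state, never in TASK_FINISHED.
            updates.add(TaskState.TASK_FAILED);
        }
        return updates;
    }

    public static void main(String[] args) {
        // Successful body: prints [TASK_RUNNING, TASK_FINISHED]
        System.out.println(runAndReport(() -> { }));
        // Failing body: prints [TASK_RUNNING, TASK_FAILED]
        System.out.println(runAndReport(() -> {
            throw new UnknownHostException("no such host");
        }));
    }
}
```

In a real executor the list entries would instead be sent through executorDriver.sendStatusUpdate, as the original method does for TASK_RUNNING and TASK_FINISHED.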

Example 7 with IWorker

use of edu.iu.dsc.tws.api.resource.IWorker in project twister2 by DSC-SPIDAL.

the class MesosMPIWorkerStarter method startWorker.

public static void startWorker(IWorkerController workerController, IPersistentVolume pv) {
    JobAPI.Job job = JobUtils.readJobFile("twister2-job/" + jobName + ".job");
    MesosVolatileVolume volatileVolume = null;
    // TODO method SchedulerContext.volatileDiskRequested deleted
    // volatileVolume needs to be checked from job object
    // if (SchedulerContext.volatileDiskRequested(config)) {
    // volatileVolume =
    // new MesosVolatileVolume(SchedulerContext.jobName(config), workerID);
    // }
    // lets create the resource plan
    // Map<Integer, JobMasterAPI.WorkerInfo> processNames =
    // MPIWorker.createResourcePlan(config, MPI.COMM_WORLD, null);
    // now create the resource plan
    // AllocatedResources resourcePlan = MPIWorker.addContainers(config, processNames);
    // AllocatedResources resourcePlan = MesosWorkerUtils.createAllocatedResources("mesos",
    // workerID, job);
    // resourcePlan = new AllocatedResources(SchedulerContext.clusterType(config), workerID);
    IWorker worker = JobUtils.initializeIWorker(job);
    worker.execute(config, job, workerController, pv, volatileVolume);
}
Also used : MesosVolatileVolume(edu.iu.dsc.tws.rsched.schedulers.mesos.MesosVolatileVolume) IWorker(edu.iu.dsc.tws.api.resource.IWorker) JobAPI(edu.iu.dsc.tws.proto.system.job.JobAPI)

Example 8 with IWorker

use of edu.iu.dsc.tws.api.resource.IWorker in project twister2 by DSC-SPIDAL.

the class MPIWorkerStarter method startWorker.

/**
 * Start the worker
 *
 * @param intracomm communication
 */
private void startWorker(Intracomm intracomm) {
    try {
        // initialize the logger
        initWorkerLogger(config, intracomm.getRank());
        // now create the worker
        IWorkerController wc = WorkerRuntime.getWorkerController();
        IPersistentVolume persistentVolume = initPersistenceVolume(config, globalRank);
        MPIContext.addRuntimeObject("comm", intracomm);
        IWorker worker = JobUtils.initializeIWorker(job);
        MPIWorkerManager workerManager = new MPIWorkerManager();
        workerManager.execute(config, job, wc, persistentVolume, null, worker);
    } catch (MPIException e) {
        LOG.log(Level.SEVERE, "Failed to synchronize the workers at the start", e);
        throw new RuntimeException(e);
    }
}
Also used : IPersistentVolume(edu.iu.dsc.tws.api.resource.IPersistentVolume) MPIException(mpi.MPIException) Twister2RuntimeException(edu.iu.dsc.tws.api.exceptions.Twister2RuntimeException) MPIWorkerManager(edu.iu.dsc.tws.rsched.worker.MPIWorkerManager) IWorkerController(edu.iu.dsc.tws.api.resource.IWorkerController) IWorker(edu.iu.dsc.tws.api.resource.IWorker)

Aggregations

IWorker (edu.iu.dsc.tws.api.resource.IWorker): 8
JobAPI (edu.iu.dsc.tws.proto.system.job.JobAPI): 3
MPIWorkerManager (edu.iu.dsc.tws.rsched.worker.MPIWorkerManager): 3
TimeoutException (edu.iu.dsc.tws.api.exceptions.TimeoutException): 2
UnknownHostException (java.net.UnknownHostException): 2
Twister2RuntimeException (edu.iu.dsc.tws.api.exceptions.Twister2RuntimeException): 1
IPersistentVolume (edu.iu.dsc.tws.api.resource.IPersistentVolume): 1
IWorkerController (edu.iu.dsc.tws.api.resource.IWorkerController): 1
JobMasterAPI (edu.iu.dsc.tws.proto.jobmaster.JobMasterAPI): 1
K8sVolatileVolume (edu.iu.dsc.tws.rsched.schedulers.k8s.worker.K8sVolatileVolume): 1
MesosVolatileVolume (edu.iu.dsc.tws.rsched.schedulers.mesos.MesosVolatileVolume): 1
WorkerManager (edu.iu.dsc.tws.rsched.worker.WorkerManager): 1
MPIException (mpi.MPIException): 1
Protos (org.apache.mesos.Protos): 1