
Example 6 with Task

Use of com.spotify.helios.common.descriptors.Task in project helios by Spotify.

Class ZooKeeperAgentModel, method getTasks.

/**
 * Returns the tasks for the current agent as a map from {@link JobId} to {@link Task}.
 */
@Override
public Map<JobId, Task> getTasks() {
    final Map<JobId, Task> tasks = Maps.newHashMap();
    for (final Map.Entry<String, Task> entry : this.tasks.getNodes().entrySet()) {
        final JobId id = jobIdFromTaskPath(entry.getKey());
        tasks.put(id, entry.getValue());
    }
    return tasks;
}
Also used: Task (com.spotify.helios.common.descriptors.Task), Map (java.util.Map), JobId (com.spotify.helios.common.descriptors.JobId)
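The getTasks example above re-keys a path-keyed snapshot by the ID parsed from each path. The following is a minimal, self-contained sketch of that pattern. It does not use the Helios API: `TaskSnapshot`, `idFromPath`, and the "last path segment is the ID" layout are assumptions made for illustration, and the real `jobIdFromTaskPath` (not shown in the source) may parse paths differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of Helios.
final class TaskSnapshot {

    // Assumed layout for illustration only: the ID is the last '/'-separated segment.
    static String idFromPath(final String path) {
        final int slash = path.lastIndexOf('/');
        // lastIndexOf returns -1 when there is no slash, so the whole string is used.
        return path.substring(slash + 1);
    }

    // Copies the entries into a new map keyed by the parsed ID instead of the full path.
    static <T> Map<String, T> byId(final Map<String, T> nodes) {
        final Map<String, T> result = new HashMap<>();
        for (final Map.Entry<String, T> entry : nodes.entrySet()) {
            result.put(idFromPath(entry.getKey()), entry.getValue());
        }
        return result;
    }
}
```

Returning a fresh map, as the original method does, keeps callers from mutating the underlying node cache.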

Example 7 with Task

Use of com.spotify.helios.common.descriptors.Task in project helios by Spotify.

Class ZooKeeperMasterModel, method deployJobRetry.

private void deployJobRetry(final ZooKeeperClient client, final String host,
        final Deployment deployment, int count, final String token)
        throws JobDoesNotExistException, JobAlreadyDeployedException, HostNotFoundException,
        JobPortAllocationConflictException, TokenVerificationException {
    if (count == 3) {
        throw new HeliosRuntimeException(
            "3 failures (possibly concurrent modifications) while deploying. Giving up.");
    }
    log.info("deploying {}: {} (retry={})", deployment, host, count);
    final JobId id = deployment.getJobId();
    final Job job = getJob(id);
    if (job == null) {
        throw new JobDoesNotExistException(id);
    }
    verifyToken(token, job);
    final UUID operationId = UUID.randomUUID();
    final String jobPath = Paths.configJob(id);
    // Building the task path can fail with IllegalArgumentException; report that as a missing host.
    final String taskPath;
    try {
        taskPath = Paths.configHostJob(host, id);
    } catch (IllegalArgumentException e) {
        throw new HostNotFoundException("Could not find Helios host '" + host + "'");
    }
    final String taskCreationPath = Paths.configHostJobCreation(host, id, operationId);
    final List<Integer> staticPorts = staticPorts(job);
    final Map<String, byte[]> portNodes = Maps.newHashMap();
    final byte[] idJson = id.toJsonBytes();
    for (final int port : staticPorts) {
        final String path = Paths.configHostPort(host, port);
        portNodes.put(path, idJson);
    }
    final Task task = new Task(job, deployment.getGoal(), deployment.getDeployerUser(), deployment.getDeployerMaster(), deployment.getDeploymentGroupName());
    final List<ZooKeeperOperation> operations = Lists.newArrayList(check(jobPath), create(portNodes), create(Paths.configJobHost(id, host)));
    // Attempt to read a task here.
    try {
        client.getNode(taskPath);
        // if we get here the node exists already
        throw new JobAlreadyDeployedException(host, id);
    } catch (NoNodeException e) {
        operations.add(create(taskPath, task));
        operations.add(create(taskCreationPath));
    } catch (KeeperException e) {
        throw new HeliosRuntimeException("reading existing task description failed", e);
    }
    // TODO (dano): Failure handling is racy wrt agent and job modifications.
    try {
        client.transaction(operations);
        log.info("deployed {}: {} (retry={})", deployment, host, count);
    } catch (NoNodeException e) {
        // Either the job, the host or the task went away
        assertJobExists(client, id);
        assertHostExists(client, host);
        // If the job and host still exist, we likely tried to redeploy a job that had an UNDEPLOY
        // goal and lost the race with the agent removing the task before we could set it. Retry.
        deployJobRetry(client, host, deployment, count + 1, token);
    } catch (NodeExistsException e) {
        // Check for conflict due to transaction retry
        try {
            if (client.exists(taskCreationPath) != null) {
                // Our creation operation node existed, we're done here
                return;
            }
        } catch (KeeperException ex) {
            throw new HeliosRuntimeException("checking job deployment failed", ex);
        }
        try {
            // Check if the job was already deployed
            if (client.stat(taskPath) != null) {
                throw new JobAlreadyDeployedException(host, id);
            }
        } catch (KeeperException ex) {
            throw new HeliosRuntimeException("checking job deployment failed", ex);
        }
        // Check for static port collisions
        for (final int port : staticPorts) {
            checkForPortConflicts(client, host, port, id);
        }
        // Catch all for logic and ephemeral issues
        throw new HeliosRuntimeException("deploying job failed", e);
    } catch (KeeperException e) {
        throw new HeliosRuntimeException("deploying job failed", e);
    }
}
Also used: Task (com.spotify.helios.common.descriptors.Task), RolloutTask (com.spotify.helios.common.descriptors.RolloutTask), NoNodeException (org.apache.zookeeper.KeeperException.NoNodeException), ZooKeeperOperation (com.spotify.helios.servicescommon.coordination.ZooKeeperOperation), NodeExistsException (org.apache.zookeeper.KeeperException.NodeExistsException), HeliosRuntimeException (com.spotify.helios.common.HeliosRuntimeException), Job (com.spotify.helios.common.descriptors.Job), UUID (java.util.UUID), JobId (com.spotify.helios.common.descriptors.JobId), KeeperException (org.apache.zookeeper.KeeperException)
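In deployJobRetry, the task node and a per-operation creation node (named with a random UUID) are created in one transaction. If the transaction later reports that a node already exists, finding the creation node tells the caller that its own earlier attempt went through. The sketch below illustrates only that idea. It uses an in-memory stand-in rather than ZooKeeper or the Helios classes; `MarkerDeploySketch`, `Store`, `createAll`, and the `/creation-<uuid>` path suffix are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration, not the Helios implementation.
final class MarkerDeploySketch {

    // In-memory stand-in for a store that supports an all-or-nothing multi-create.
    static final class Store {
        private final Set<String> nodes = new HashSet<>();

        boolean exists(final String path) {
            return nodes.contains(path);
        }

        // Creates every path, or none of them if any path already exists.
        void createAll(final List<String> paths) {
            for (final String path : paths) {
                if (nodes.contains(path)) {
                    throw new IllegalStateException("node exists: " + path);
                }
            }
            nodes.addAll(paths);
        }
    }

    // Returns true if this operation's nodes are in place,
    // false if the task node was created by a different operation.
    static boolean deploy(final Store store, final String taskPath, final UUID operationId) {
        // Assumed marker naming, for illustration only.
        final String marker = taskPath + "/creation-" + operationId;
        try {
            store.createAll(Arrays.asList(taskPath, marker));
            return true;
        } catch (IllegalStateException e) {
            // Our own marker being present means an earlier attempt of this operation succeeded.
            return store.exists(marker);
        }
    }
}
```

Calling `deploy` twice with the same `operationId` returns true both times, while a call with a different `operationId` against the same `taskPath` returns false. In the original method that second case leads to JobAlreadyDeployedException.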
