Example 31 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method removeDeploymentGroup.

/**
   * Remove a deployment group.
   *
   * <p>If successful, all ZK nodes associated with the DG will be deleted. Specifically these
   * nodes are guaranteed to be non-existent after a successful remove (not all of them might exist
   * before, though):
   * <ul>
   *   <li>/config/deployment-groups/[group-name]</li>
   *   <li>/status/deployment-groups/[group-name]</li>
   *   <li>/status/deployment-groups/[group-name]/hosts</li>
   *   <li>/status/deployment-groups/[group-name]/removed</li>
   *   <li>/status/deployment-group-tasks/[group-name]</li>
   * </ul>
   * If the operation fails, no ZK nodes will be removed.
   *
   * @throws DeploymentGroupDoesNotExistException If the DG does not exist.
   */
@Override
public void removeDeploymentGroup(final String name) throws DeploymentGroupDoesNotExistException {
    log.info("removing deployment-group: name={}", name);
    final ZooKeeperClient client = provider.get("removeDeploymentGroup");
    try {
        client.ensurePath(Paths.configDeploymentGroups());
        client.ensurePath(Paths.statusDeploymentGroups());
        client.ensurePath(Paths.statusDeploymentGroupTasks());
        final List<ZooKeeperOperation> operations = Lists.newArrayList();
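        final List<String> paths = ImmutableList.of(
                Paths.configDeploymentGroup(name),
                Paths.statusDeploymentGroup(name),
                Paths.statusDeploymentGroupHosts(name),
                Paths.statusDeploymentGroupRemovedHosts(name),
                Paths.statusDeploymentGroupTasks(name));
        // Any path that does not exist yet is created inside the transaction, so every delete
        // below has a node to remove and the transaction fails if the path appears concurrently.
        // [...] DGs to become slower and spam logs with errors so we want to avoid it.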
        for (final String path : paths) {
            if (client.exists(path) == null) {
                operations.add(create(path));
            }
        }
        for (final String path : Lists.reverse(paths)) {
            operations.add(delete(path));
        }
        client.transaction(operations);
    } catch (final NoNodeException e) {
        throw new DeploymentGroupDoesNotExistException(name);
    } catch (final KeeperException e) {
        throw new HeliosRuntimeException("removing deployment-group " + name + " failed", e);
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) KeeperException(org.apache.zookeeper.KeeperException)
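
The create-then-delete trick in removeDeploymentGroup relies on the transaction being all-or-nothing: a create fails if the node appeared after the existence check, so nothing is deleted in that case. The following is a minimal in-memory sketch of that idea, not the Helios or ZooKeeper API. TinyTxStore, Op, removalOps and put are hypothetical names introduced for illustration, and the model ignores parent/child rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for an all-or-nothing multi-op transaction.
// It is not the Helios ZooKeeperClient and ignores parent/child rules.
final class TinyTxStore {

    // One planned operation: create or delete a single path.
    static final class Op {
        final boolean create;
        final String path;

        Op(boolean create, String path) {
            this.create = create;
            this.path = path;
        }
    }

    private final TreeSet<String> nodes = new TreeSet<>();

    boolean exists(String path) {
        return nodes.contains(path);
    }

    void put(String path) {
        nodes.add(path);
    }

    // Plans the operations the way removeDeploymentGroup does: create whatever
    // is missing, then delete every path in reverse order.
    List<Op> removalOps(List<String> paths) {
        List<Op> ops = new ArrayList<>();
        for (String path : paths) {
            if (!exists(path)) {
                ops.add(new Op(true, path));
            }
        }
        for (int i = paths.size() - 1; i >= 0; i--) {
            ops.add(new Op(false, paths.get(i)));
        }
        return ops;
    }

    // Applies all operations to a scratch copy and commits only if each one
    // succeeds. Returns false and leaves the store untouched otherwise.
    boolean transaction(List<Op> ops) {
        TreeSet<String> scratch = new TreeSet<>(nodes);
        for (Op op : ops) {
            boolean ok = op.create ? scratch.add(op.path) : scratch.remove(op.path);
            if (!ok) {
                return false; // create hit an existing node, or delete hit a missing one
            }
        }
        nodes.clear();
        nodes.addAll(scratch);
        return true;
    }
}
```

If a path is added between removalOps and transaction, the planned create fails and the store keeps every node, which mirrors the "if the operation fails no ZK nodes will be removed" guarantee in the Javadoc.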

Example 32 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method addJob.

/**
   * Adds a job into the configuration.
   */
@Override
public void addJob(final Job job) throws JobExistsException {
    log.info("adding job: {}", job);
    final JobId id = job.getId();
    final UUID operationId = UUID.randomUUID();
    final String creationPath = Paths.configJobCreation(id, operationId);
    final ZooKeeperClient client = provider.get("addJob");
    try {
        try {
            client.ensurePath(Paths.historyJob(id));
            client.transaction(
                    create(Paths.configJob(id), job),
                    create(Paths.configJobRefShort(id), id),
                    create(Paths.configJobHosts(id)),
                    create(creationPath),
                    // Write fresh data to the jobs root node so that its version is bumped on every
                    // change down the tree. Effectively, make it that version == cVersion.
                    set(Paths.configJobs(), UUID.randomUUID().toString().getBytes()));
        } catch (final NodeExistsException e) {
            if (client.exists(creationPath) != null) {
                // The job was created, we're done here
                return;
            }
            throw new JobExistsException(id.toString());
        }
    } catch (NoNodeException e) {
        throw new HeliosRuntimeException("adding job " + job + " failed due to missing ZK path: " + e.getPath(), e);
    } catch (final KeeperException e) {
        throw new HeliosRuntimeException("adding job " + job + " failed", e);
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) NodeExistsException(org.apache.zookeeper.KeeperException.NodeExistsException) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) UUID(java.util.UUID) JobId(com.spotify.helios.common.descriptors.JobId) KeeperException(org.apache.zookeeper.KeeperException)
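
addJob distinguishes "the job already exists" from "this very call already created it" by writing a marker node whose path contains a fresh operation ID. A minimal in-memory sketch of that marker pattern follows. IdempotentCreate, Outcome and the "/jobs/..." path layout are hypothetical, and the sketch checks for the node up front instead of catching NodeExistsException as the original does.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch: a per-operation marker makes a create safe to repeat.
final class IdempotentCreate {

    enum Outcome { CREATED, ALREADY_CREATED_BY_US, EXISTS }

    private final Set<String> nodes = new HashSet<>();

    // Creates the job node together with a marker unique to this operation.
    Outcome addJob(String jobId, UUID operationId) {
        String jobPath = "/jobs/" + jobId; // assumed layout, for illustration only
        String markerPath = jobPath + "/creation-" + operationId;
        if (nodes.contains(jobPath)) {
            // Our own marker present: an earlier attempt of this operation succeeded.
            return nodes.contains(markerPath)
                    ? Outcome.ALREADY_CREATED_BY_US
                    : Outcome.EXISTS;
        }
        nodes.add(jobPath);
        nodes.add(markerPath);
        return Outcome.CREATED;
    }
}
```

Repeating the call with the same operation ID reports ALREADY_CREATED_BY_US, while a different operation ID for the same job reports EXISTS, which corresponds to the early return versus JobExistsException in the method above.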

Example 33 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method stopDeploymentGroup.

@Override
public void stopDeploymentGroup(final String deploymentGroupName) throws DeploymentGroupDoesNotExistException {
    checkNotNull(deploymentGroupName, "name");
    log.info("stop deployment-group: name={}", deploymentGroupName);
    final ZooKeeperClient client = provider.get("stopDeploymentGroup");
    // Delete deployment group tasks (if any) and set DG state to FAILED
    final DeploymentGroupStatus status = DeploymentGroupStatus.newBuilder().setState(FAILED).setError("Stopped by user").build();
    final String statusPath = Paths.statusDeploymentGroup(deploymentGroupName);
    final String tasksPath = Paths.statusDeploymentGroupTasks(deploymentGroupName);
    try {
        client.ensurePath(Paths.statusDeploymentGroupTasks());
        final List<ZooKeeperOperation> operations = Lists.newArrayList();
        // NOTE: This remove operation is racy. If tasks exist and the rollout finishes before the
        // delete() is executed then this will fail. Conversely, if it doesn't exist but is created
        // before the transaction is executed it will also fail. This is annoying for users, but at
        // least means we won't have inconsistent state.
        //
        // That the set() is first in the list of operations is important because of the
        // kludgy error checking we do below to disambiguate "doesn't exist" failures from the race
        // condition mentioned above.
        operations.add(set(statusPath, status));
        final Stat tasksStat = client.exists(tasksPath);
        if (tasksStat != null) {
            operations.add(delete(tasksPath));
        } else {
            // There doesn't seem to be a "check that node doesn't exist" operation so we
            // do a create and a delete on the same path to emulate it.
            operations.add(create(tasksPath));
            operations.add(delete(tasksPath));
        }
        client.transaction(operations);
    } catch (final NoNodeException e) {
        // The only way to tell which operation in a transaction failed is to inspect the
        // per-operation results, which is awkward.
        if (((OpResult.ErrorResult) e.getResults().get(0)).getErr() == KeeperException.Code.NONODE.intValue()) {
            throw new DeploymentGroupDoesNotExistException(deploymentGroupName);
        } else {
            throw new HeliosRuntimeException("stop deployment-group " + deploymentGroupName + " failed due to a race condition, please retry", e);
        }
    } catch (final KeeperException e) {
        throw new HeliosRuntimeException("stop deployment-group " + deploymentGroupName + " failed", e);
    }
}
Also used : Stat(org.apache.zookeeper.data.Stat) NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) DeploymentGroupStatus(com.spotify.helios.common.descriptors.DeploymentGroupStatus) KeeperException(org.apache.zookeeper.KeeperException)
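
stopDeploymentGroup decides between "group does not exist" and "race condition" by looking at the error code of the first per-operation result. A minimal sketch of that decision follows, assuming the results have already been reduced to a list of integer error codes. FailedOpFinder and its methods are hypothetical helpers, and the constant -101 is the integer value of KeeperException.Code.NONODE.

```java
import java.util.List;

// Hypothetical helper: locate the operation that failed with a given code.
final class FailedOpFinder {

    static final int OK = 0;
    static final int NONODE = -101; // value of KeeperException.Code.NONODE.intValue()

    // Returns the index of the first result carrying the given code, or -1.
    static int firstIndexWithCode(List<Integer> errorCodes, int code) {
        for (int i = 0; i < errorCodes.size(); i++) {
            if (errorCodes.get(i) == code) {
                return i;
            }
        }
        return -1;
    }

    // Mirrors the check in stopDeploymentGroup: the set() on the status path is
    // operation 0, so a NONODE there means the deployment group is missing.
    static boolean groupIsMissing(List<Integer> errorCodes) {
        return firstIndexWithCode(errorCodes, NONODE) == 0;
    }
}
```

With codes [NONODE, ...] the group is missing. With [OK, NONODE] the set() succeeded and the delete() failed, which is the race described in the comments.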

Example 34 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class JobHistoryReaper method processItem.

@Override
void processItem(final String jobId) {
    log.info("Deciding whether to reap job history for job {}", jobId);
    final JobId id = JobId.fromString(jobId);
    final Job job = masterModel.getJob(id);
    if (job == null) {
        try {
            client.deleteRecursive(Paths.historyJob(id));
            log.info("Reaped job history for job {}", jobId);
        } catch (NoNodeException ignored) {
            // Something deleted the history right before we got to it. Ignore and keep going.
        } catch (KeeperException e) {
            log.warn("error reaping job history for job {}", jobId, e);
        }
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) Job(com.spotify.helios.common.descriptors.Job) JobId(com.spotify.helios.common.descriptors.JobId) KeeperException(org.apache.zookeeper.KeeperException)

Example 35 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method getJobs.

/**
   * Returns a {@link Map} of {@link JobId} to {@link Job} objects for all of the jobs known.
   */
@Override
public Map<JobId, Job> getJobs() {
    log.debug("getting jobs");
    final String folder = Paths.configJobs();
    final ZooKeeperClient client = provider.get("getJobs");
    try {
        final List<String> ids;
        try {
            ids = client.getChildren(folder);
        } catch (NoNodeException e) {
            return Maps.newHashMap();
        }
        final Map<JobId, Job> descriptors = Maps.newHashMap();
        for (final String id : ids) {
            final JobId jobId = JobId.fromString(id);
            final String path = Paths.configJob(jobId);
            try {
                final byte[] data = client.getData(path);
                final Job descriptor = parse(data, Job.class);
                descriptors.put(descriptor.getId(), descriptor);
            } catch (NoNodeException e) {
                // Ignore, the job was deleted before we had a chance to read it.
                log.debug("Ignoring deleted job {}", jobId);
            }
        }
        return descriptors;
    } catch (KeeperException | IOException e) {
        throw new HeliosRuntimeException("getting jobs failed", e);
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) IOException(java.io.IOException) Job(com.spotify.helios.common.descriptors.Job) JobId(com.spotify.helios.common.descriptors.JobId) KeeperException(org.apache.zookeeper.KeeperException)
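
getJobs lists the child IDs first and reads each one afterwards, so an entry can vanish between the two steps and is simply skipped. A minimal in-memory sketch of that list-then-read tolerance follows. SnapshotReader and readAll are hypothetical names, and a plain Map stands in for the ZooKeeper tree (a null lookup plays the role of NoNodeException).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: read every ID from an earlier listing, skipping entries
// that disappeared in the meantime.
final class SnapshotReader {

    static Map<String, String> readAll(Map<String, String> store, List<String> listedIds) {
        Map<String, String> found = new HashMap<>();
        for (String id : listedIds) {
            String data = store.get(id);
            if (data == null) {
                continue; // deleted after the listing was taken; ignore it
            }
            found.put(id, data);
        }
        return found;
    }
}
```

The result only ever contains entries that were readable at the time of the read, at the cost of possibly missing entries created after the listing.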

Aggregations

NoNodeException (org.apache.zookeeper.KeeperException.NoNodeException) 44
KeeperException (org.apache.zookeeper.KeeperException) 30
HeliosRuntimeException (com.spotify.helios.common.HeliosRuntimeException) 16
IOException (java.io.IOException) 12
Stat (org.apache.zookeeper.data.Stat) 12
ZooKeeperClient (com.spotify.helios.servicescommon.coordination.ZooKeeperClient) 11
ZooKeeperOperation (com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) 9
Job (com.spotify.helios.common.descriptors.Job) 8
ConnectionLossException (org.apache.zookeeper.KeeperException.ConnectionLossException) 8
JobId (com.spotify.helios.common.descriptors.JobId) 6
UnsupportedEncodingException (java.io.UnsupportedEncodingException) 5
UnknownHostException (java.net.UnknownHostException) 5
HashMap (java.util.HashMap) 5
Map (java.util.Map) 5
ZooKeeperException (org.apache.solr.common.cloud.ZooKeeperException) 5
NodeExistsException (org.apache.zookeeper.KeeperException.NodeExistsException) 5
SessionExpiredException (org.apache.zookeeper.KeeperException.SessionExpiredException) 5
DeploymentGroup (com.spotify.helios.common.descriptors.DeploymentGroup) 4
UUID (java.util.UUID) 4
TimeoutException (java.util.concurrent.TimeoutException) 4