Example 16 with ZooKeeperClient

use of com.spotify.helios.servicescommon.coordination.ZooKeeperClient in project helios by spotify.

the class ZooKeeperMasterModel method getJob.

/**
 * Returns the job configuration for the job specified by {@code id} as a
 * {@link Job} object. A return value of null indicates the job doesn't exist.
 */
@Override
public Job getJob(final JobId id) {
    log.debug("getting job: {}", id);
    final ZooKeeperClient client = provider.get("getJobId");
    return getJob(client, id);
}
Also used : ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient)

Example 17 with ZooKeeperClient

use of com.spotify.helios.servicescommon.coordination.ZooKeeperClient in project helios by spotify.

the class ZooKeeperMasterModel method getJobHistory.

/**
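 * Given a jobId and host, returns the recorded events in the job's history on that host,
 * sorted with EVENT_COMPARATOR. If host is null or empty, events for every host that has
 * history for the job are returned.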
 */
@Override
public List<TaskStatusEvent> getJobHistory(final JobId jobId, final String host) throws JobDoesNotExistException {
    final Job descriptor = getJob(jobId);
    if (descriptor == null) {
        throw new JobDoesNotExistException(jobId);
    }
    final ZooKeeperClient client = provider.get("getJobHistory");
    final List<String> hosts;
    try {
        hosts = (!isNullOrEmpty(host)) ? singletonList(host) : client.getChildren(Paths.historyJobHosts(jobId));
    } catch (NoNodeException e) {
        return emptyList();
    } catch (KeeperException e) {
        throw new RuntimeException(e);
    }
    final List<TaskStatusEvent> jsEvents = Lists.newArrayList();
    for (final String h : hosts) {
        final List<String> events;
        try {
            events = client.getChildren(Paths.historyJobHostEvents(jobId, h));
        } catch (NoNodeException e) {
            continue;
        } catch (KeeperException e) {
            throw new RuntimeException(e);
        }
        for (final String event : events) {
            try {
                // The child node name is the event timestamp; parse it once and reuse it.
                final long timestamp = Long.parseLong(event);
                final byte[] data = client.getData(Paths.historyJobHostEventsTimestamp(jobId, h, timestamp));
                final TaskStatus status = Json.read(data, TaskStatus.class);
                jsEvents.add(new TaskStatusEvent(status, timestamp, h));
            } catch (NoNodeException e) {
                // ignore, it went away before we read it
            } catch (KeeperException | IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
    return Ordering.from(EVENT_COMPARATOR).sortedCopy(jsEvents);
}
Also used : TaskStatusEvent(com.spotify.helios.common.descriptors.TaskStatusEvent) NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) IOException(java.io.IOException) TaskStatus(com.spotify.helios.common.descriptors.TaskStatus) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) Job(com.spotify.helios.common.descriptors.Job) KeeperException(org.apache.zookeeper.KeeperException)
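
The read pattern in getJobHistory (list the children, skip nodes that vanished in the meantime, then sort by timestamp) can be sketched without ZooKeeper. This is a minimal illustration, not Helios code: the class HistorySketch, its Event type, and the nested map standing in for the ZooKeeper tree are all hypothetical, and it assumes the ordering is by ascending timestamp.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class HistorySketch {

    /** Hypothetical stand-in for TaskStatusEvent: a state seen on a host at a timestamp. */
    public static final class Event {
        public final long timestamp;
        public final String host;
        public final String state;

        public Event(final long timestamp, final String host, final String state) {
            this.timestamp = timestamp;
            this.host = host;
            this.state = state;
        }
    }

    /**
     * Collects the events of every requested host and returns them oldest first.
     * The store maps host -> (timestamp as string -> state), mimicking child node names.
     */
    public static List<Event> history(final Map<String, Map<String, String>> store,
                                      final List<String> hosts) {
        final List<Event> events = new ArrayList<>();
        for (final String host : hosts) {
            final Map<String, String> children = store.get(host);
            if (children == null) {
                // Plays the role of NoNodeException: the host has no history, so skip it.
                continue;
            }
            for (final Map.Entry<String, String> child : children.entrySet()) {
                // The child name is the timestamp, as in the example above.
                events.add(new Event(Long.parseLong(child.getKey()), host, child.getValue()));
            }
        }
        events.sort(Comparator.comparingLong((Event e) -> e.timestamp));
        return events;
    }
}
```

A missing host is treated as "no history" rather than as an error, which mirrors how the example continues past NoNodeException instead of failing the whole request.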

Example 18 with ZooKeeperClient

use of com.spotify.helios.servicescommon.coordination.ZooKeeperClient in project helios by spotify.

the class ZooKeeperMasterModel method rollingUpdate.

@Override
public void rollingUpdate(final DeploymentGroup deploymentGroup, final JobId jobId, final RolloutOptions options) throws DeploymentGroupDoesNotExistException, JobDoesNotExistException {
    checkNotNull(deploymentGroup, "deploymentGroup");
    final Job job = getJob(jobId);
    if (job == null) {
        throw new JobDoesNotExistException(jobId);
    }
    final RolloutOptions rolloutOptionsWithFallback = rolloutOptionsWithFallback(options, job);
    log.info("preparing to initiate rolling-update on deployment-group: name={}, jobId={}, options={}", deploymentGroup.getName(), jobId, rolloutOptionsWithFallback);
    final DeploymentGroup updated = deploymentGroup.toBuilder().setJobId(jobId).setRolloutOptions(rolloutOptionsWithFallback).setRollingUpdateReason(MANUAL).build();
    final List<ZooKeeperOperation> operations = Lists.newArrayList();
    final ZooKeeperClient client = provider.get("rollingUpdate");
    operations.add(set(Paths.configDeploymentGroup(updated.getName()), updated));
    try {
        final RollingUpdateOp op = getInitRollingUpdateOps(updated, client);
        operations.addAll(op.operations());
        log.info("starting zookeeper transaction for rolling-update on deployment-group name={} jobId={}. List of operations: {}", deploymentGroup.getName(), jobId, operations);
        client.transaction(operations);
        emitEvents(deploymentGroupEventTopic, op.events());
        log.info("initiated rolling-update on deployment-group: name={}, jobId={}", deploymentGroup.getName(), jobId);
    } catch (final NoNodeException e) {
        throw new DeploymentGroupDoesNotExistException(deploymentGroup.getName());
    } catch (final KeeperException e) {
        throw new HeliosRuntimeException("rolling-update on deployment-group " + deploymentGroup.getName() + " failed. " + e.getMessage(), e);
    }
}
Also used : RolloutOptions(com.spotify.helios.common.descriptors.RolloutOptions) RollingUpdateOp(com.spotify.helios.rollingupdate.RollingUpdateOp) NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) Job(com.spotify.helios.common.descriptors.Job) DeploymentGroup(com.spotify.helios.common.descriptors.DeploymentGroup) KeeperException(org.apache.zookeeper.KeeperException)
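
rollingUpdate gathers its writes into a list and submits them in a single transaction, so a missing node aborts every write. The sketch below shows that all-or-nothing behaviour against an in-memory map. It is an assumption-laden illustration, not the Helios or ZooKeeper API: TransactionSketch, SetOp, and the use of IllegalStateException in place of NoNodeException are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TransactionSketch {

    /** Hypothetical operation: overwrite the data at a path that must already exist. */
    public static final class SetOp {
        public final String path;
        public final String value;

        public SetOp(final String path, final String value) {
            this.path = path;
            this.value = value;
        }
    }

    /**
     * Applies every operation or none of them.
     * Throws IllegalStateException (standing in for NoNodeException) if a path is missing.
     */
    public static void transaction(final Map<String, String> store, final List<SetOp> ops) {
        // Stage the writes on a copy so a failure leaves the real store untouched.
        final Map<String, String> staged = new HashMap<>(store);
        for (final SetOp op : ops) {
            if (!staged.containsKey(op.path)) {
                throw new IllegalStateException("no node: " + op.path);
            }
            staged.put(op.path, op.value);
        }
        // Every operation succeeded, so publish the staged state.
        store.clear();
        store.putAll(staged);
    }
}
```

Building the list first and committing once is what lets the caller translate a single failure into one exception, as the example does when it maps NoNodeException to DeploymentGroupDoesNotExistException.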

Example 19 with ZooKeeperClient

use of com.spotify.helios.servicescommon.coordination.ZooKeeperClient in project helios by spotify.

the class DeploymentGroupTest method testUpdateDeploymentGroupHosts.

private void testUpdateDeploymentGroupHosts(final RolloutOptions rolloutOptions) throws Exception {
    final ZooKeeperClient client = spy(this.client);
    final ZooKeeperMasterModel masterModel = spy(newMasterModel(client));
    // Return a job so we can add a real deployment group.
    final Job job = Job.newBuilder().setCommand(ImmutableList.of("COMMAND")).setImage("IMAGE").setName("JOB_NAME").setVersion("VERSION").build();
    doReturn(job).when(masterModel).getJob(job.getId());
    // Add a real deployment group.
    final DeploymentGroup dg = DeploymentGroup.newBuilder().setName(GROUP_NAME).setHostSelectors(ImmutableList.of(HostSelector.parse("role=melmac"))).setJobId(job.getId()).setRolloutOptions(rolloutOptions).setRollingUpdateReason(MANUAL).build();
    masterModel.addDeploymentGroup(dg);
    // Setup some hosts
    final String oldHost = "host1";
    final String newHost = "host2";
    client.ensurePath(Paths.configHost(oldHost));
    client.ensurePath(Paths.configHost(newHost));
    client.ensurePath(Paths.statusHostUp(oldHost));
    client.ensurePath(Paths.statusHostUp(newHost));
    // Give the deployment group a host.
    client.setData(Paths.statusDeploymentGroupHosts(dg.getName()), Json.asBytes(ImmutableList.of(oldHost)));
    // And a status...
    client.setData(Paths.statusDeploymentGroup(dg.getName()), DeploymentGroupStatus.newBuilder().setState(DONE).build().toJsonBytes());
    // Switch out our host!
    // TODO(negz): Use an unchanged host, make sure ordering remains the same.
    masterModel.updateDeploymentGroupHosts(dg.getName(), ImmutableList.of(newHost));
    verify(client, times(2)).transaction(opCaptor.capture());
    final DeploymentGroup changed = dg.toBuilder().setRollingUpdateReason(HOSTS_CHANGED).build();
    // Ensure we set the DG status to HOSTS_CHANGED.
    // This means we triggered a rolling update.
    final ZooKeeperOperation setDeploymentGroupHostChanged = set(Paths.configDeploymentGroup(dg.getName()), changed);
    // Ensure ZK tasks are written to:
    // - Perform a rolling undeploy for the removed (old) host
    // - Perform a rolling update for the added (new) host and the unchanged host
    final List<RolloutTask> tasks = ImmutableList.<RolloutTask>builder().addAll(RollingUndeployPlanner.of(changed).plan(singletonList(oldHost))).addAll(RollingUpdatePlanner.of(changed).plan(singletonList(newHost))).build();
    final ZooKeeperOperation setDeploymentGroupTasks = set(Paths.statusDeploymentGroupTasks(dg.getName()), DeploymentGroupTasks.newBuilder().setRolloutTasks(tasks).setTaskIndex(0).setDeploymentGroup(changed).build());
    assertThat(opCaptor.getValue(), hasItems(setDeploymentGroupHostChanged, setDeploymentGroupTasks));
}
Also used : ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) DefaultZooKeeperClient(com.spotify.helios.servicescommon.coordination.DefaultZooKeeperClient) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) RolloutTask(com.spotify.helios.common.descriptors.RolloutTask) Job(com.spotify.helios.common.descriptors.Job) DeploymentGroup(com.spotify.helios.common.descriptors.DeploymentGroup)

Example 20 with ZooKeeperClient

use of com.spotify.helios.servicescommon.coordination.ZooKeeperClient in project helios by spotify.

the class DeploymentGroupTest method testUpdateFailedHostsChangedDeploymentGroupHosts.

// A test that ensures deployment groups that failed during a rolling update triggered by
// changing hosts will perform a new rolling update if the hosts change again.
@Test
public void testUpdateFailedHostsChangedDeploymentGroupHosts() throws Exception {
    final ZooKeeperClient client = spy(this.client);
    final ZooKeeperMasterModel masterModel = spy(newMasterModel(client));
    // Return a job so we can add a real deployment group.
    final Job job = Job.newBuilder().setCommand(ImmutableList.of("COMMAND")).setImage("IMAGE").setName("JOB_NAME").setVersion("VERSION").build();
    doReturn(job).when(masterModel).getJob(job.getId());
    // Add a real deployment group.
    final DeploymentGroup dg = DeploymentGroup.newBuilder().setName(GROUP_NAME).setHostSelectors(ImmutableList.of(HostSelector.parse("role=melmac"))).setJobId(job.getId()).setRolloutOptions(RolloutOptions.getDefault()).setRollingUpdateReason(HOSTS_CHANGED).build();
    masterModel.addDeploymentGroup(dg);
    // Give the deployment group a host.
    client.setData(Paths.statusDeploymentGroupHosts(dg.getName()), Json.asBytes(ImmutableList.of("host1")));
    // And a status...
    client.setData(Paths.statusDeploymentGroup(dg.getName()), DeploymentGroupStatus.newBuilder().setState(FAILED).build().toJsonBytes());
    // Pretend our new host is UP.
    final HostStatus statusUp = mock(HostStatus.class);
    doReturn(HostStatus.Status.UP).when(statusUp).getStatus();
    doReturn(statusUp).when(masterModel).getHostStatus("host2");
    // Switch out our host!
    masterModel.updateDeploymentGroupHosts(dg.getName(), ImmutableList.of("host2"));
    // Ensure we write the same DG status again.
    // This is a no-op, but it means we triggered a rolling update.
    final ZooKeeperOperation setDeploymentGroup = set(Paths.configDeploymentGroup(dg.getName()), dg);
    verify(client, times(2)).transaction(opCaptor.capture());
    assertThat(opCaptor.getValue(), hasItem(setDeploymentGroup));
}
Also used : ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) DefaultZooKeeperClient(com.spotify.helios.servicescommon.coordination.DefaultZooKeeperClient) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) HostStatus(com.spotify.helios.common.descriptors.HostStatus) Job(com.spotify.helios.common.descriptors.Job) DeploymentGroup(com.spotify.helios.common.descriptors.DeploymentGroup) Test(org.junit.Test)

Aggregations

ZooKeeperClient (com.spotify.helios.servicescommon.coordination.ZooKeeperClient): 43
HeliosRuntimeException (com.spotify.helios.common.HeliosRuntimeException): 20
KeeperException (org.apache.zookeeper.KeeperException): 18
NoNodeException (org.apache.zookeeper.KeeperException.NoNodeException): 15
DefaultZooKeeperClient (com.spotify.helios.servicescommon.coordination.DefaultZooKeeperClient): 13
Job (com.spotify.helios.common.descriptors.Job): 12
Test (org.junit.Test): 12
DeploymentGroup (com.spotify.helios.common.descriptors.DeploymentGroup): 11
ZooKeeperOperation (com.spotify.helios.servicescommon.coordination.ZooKeeperOperation): 11
IOException (java.io.IOException): 10
RolloutTask (com.spotify.helios.common.descriptors.RolloutTask): 8
DeploymentGroupTasks (com.spotify.helios.common.descriptors.DeploymentGroupTasks): 6
JobId (com.spotify.helios.common.descriptors.JobId): 6
Deployment (com.spotify.helios.common.descriptors.Deployment): 5
CuratorFramework (org.apache.curator.framework.CuratorFramework): 5
ExponentialBackoffRetry (org.apache.curator.retry.ExponentialBackoffRetry): 5
DeploymentGroupStatus (com.spotify.helios.common.descriptors.DeploymentGroupStatus): 4
SetData (com.spotify.helios.servicescommon.coordination.SetData): 4
RetryPolicy (org.apache.curator.RetryPolicy): 4
BadVersionException (org.apache.zookeeper.KeeperException.BadVersionException): 4