Example 6 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project zookeeper by apache.

the class LoadFromLogTest method testRestoreWithTransactionErrors.

/**
 * Test we can restore a snapshot that has errors and data ahead of the zxid
 * of the snapshot file.
 */
@Test
public void testRestoreWithTransactionErrors() throws Exception {
    final String hostPort = HOST + PortAssignment.unique();
    // setup a single server cluster
    File tmpDir = ClientBase.createTmpDir();
    ClientBase.setupTestEnv();
    ZooKeeperServer zks = new ZooKeeperServer(tmpDir, tmpDir, 3000);
    SyncRequestProcessor.setSnapCount(10000);
    final int PORT = Integer.parseInt(hostPort.split(":")[1]);
    ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
    f.startup(zks);
    Assert.assertTrue("waiting for server being up ", ClientBase.waitForServerUp(hostPort, CONNECTION_TIMEOUT));
    ZooKeeper zk = getConnectedZkClient(hostPort);
    // generate some transactions
    try {
        for (int i = 0; i < NUM_MESSAGES; i++) {
            try {
                zk.create("/invaliddir/test-", new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
            } catch (NoNodeException e) {
                // Expected: the parent node /invaliddir does not exist
            }
        }
    } finally {
        zk.close();
    }
    // force the zxid to be behind the content
    zks.getZKDatabase().setlastProcessedZxid(zks.getZKDatabase().getDataTreeLastProcessedZxid() - 10);
    LOG.info("Set lastProcessedZxid to " + zks.getZKDatabase().getDataTreeLastProcessedZxid());
    // Force snapshot and restore
    zks.takeSnapshot();
    zks.shutdown();
    f.shutdown();
    zks = new ZooKeeperServer(tmpDir, tmpDir, 3000);
    SyncRequestProcessor.setSnapCount(10000);
    f = ServerCnxnFactory.createFactory(PORT, -1);
    f.startup(zks);
    Assert.assertTrue("waiting for server being up ", ClientBase.waitForServerUp(hostPort, CONNECTION_TIMEOUT));
    f.shutdown();
    zks.shutdown();
}
Also used : ZooKeeper(org.apache.zookeeper.ZooKeeper) NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ServerCnxnFactory(org.apache.zookeeper.server.ServerCnxnFactory) File(java.io.File) ZooKeeperServer(org.apache.zookeeper.server.ZooKeeperServer) Test(org.junit.Test)

Example 7 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method removeJob.

/**
 * Deletes a job from ZooKeeper. Ensures that the job is not currently running anywhere.
 */
@Override
public Job removeJob(final JobId id, final String token) throws JobDoesNotExistException, JobStillDeployedException, TokenVerificationException {
    log.info("removing job: id={}", id);
    final ZooKeeperClient client = provider.get("removeJob");
    final Job job = getJob(client, id);
    if (job == null) {
        throw new JobDoesNotExistException(id);
    }
    verifyToken(token, job);
    // TODO (dano): handle retry failures
    try {
        final ImmutableList.Builder<ZooKeeperOperation> operations = ImmutableList.builder();
        final UUID jobCreationOperationId = getJobCreation(client, id);
        if (jobCreationOperationId != null) {
            operations.add(delete(Paths.configJobCreation(id, jobCreationOperationId)));
        }
        operations.add(
            delete(Paths.configJobHosts(id)),
            delete(Paths.configJobRefShort(id)),
            delete(Paths.configJob(id)),
            // change down the tree. Effectively, make it that version == cVersion.
            set(Paths.configJobs(), UUID.randomUUID().toString().getBytes()));
        client.transaction(operations.build());
    } catch (final NoNodeException e) {
        throw new JobDoesNotExistException(id);
    } catch (final NotEmptyException e) {
        throw new JobStillDeployedException(id, listJobHosts(client, id));
    } catch (final KeeperException e) {
        throw new HeliosRuntimeException("removing job " + id + " failed", e);
    }
    // Delete job history on a best effort basis
    try {
        client.deleteRecursive(Paths.historyJob(id));
    } catch (NoNodeException ignored) {
    // There's no history for this job
    } catch (KeeperException e) {
        log.warn("error removing job history for job {}", id, e);
    }
    return job;
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperClient(com.spotify.helios.servicescommon.coordination.ZooKeeperClient) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) ImmutableList(com.google.common.collect.ImmutableList) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) NotEmptyException(org.apache.zookeeper.KeeperException.NotEmptyException) Job(com.spotify.helios.common.descriptors.Job) UUID(java.util.UUID) KeeperException(org.apache.zookeeper.KeeperException)
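
The removeJob example above translates a storage-level NoNodeException into a domain exception, and ignores the same exception during best-effort history cleanup. The following is a minimal sketch of that pattern with no ZooKeeper dependency. The NoNodeException, JobNotFoundException, and Store classes here are hypothetical stand-ins written for illustration; they are not the ZooKeeper or Helios classes.

```java
import java.util.HashMap;
import java.util.Map;

public class ExceptionTranslationSketch {

    /** Hypothetical stand-in for KeeperException.NoNodeException. */
    static class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("no node: " + path);
        }
    }

    /** Hypothetical domain exception exposed to callers. */
    static class JobNotFoundException extends Exception {
        JobNotFoundException(String id) {
            super("job not found: " + id);
        }
    }

    /** Hypothetical in-memory store whose delete fails on a missing path. */
    static class Store {
        final Map<String, String> nodes = new HashMap<>();

        void delete(String path) throws NoNodeException {
            if (nodes.remove(path) == null) {
                throw new NoNodeException(path);
            }
        }
    }

    /** A missing config node is an error; a missing history node is not. */
    static void removeJob(Store store, String id) throws JobNotFoundException {
        try {
            store.delete("/config/jobs/" + id);
        } catch (NoNodeException e) {
            // Translate the storage-level failure into a domain-level one.
            throw new JobNotFoundException(id);
        }
        try {
            store.delete("/history/jobs/" + id);
        } catch (NoNodeException ignored) {
            // Best effort: having no history for this job is fine.
        }
    }

    public static void main(String[] args) throws Exception {
        Store store = new Store();
        store.nodes.put("/config/jobs/a", "spec");
        removeJob(store, "a"); // succeeds even though no history node exists
        try {
            removeJob(store, "a"); // the config node is gone now
        } catch (JobNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The two try blocks are kept separate on purpose: the same exception type means "the job does not exist" in the first and "nothing to clean up" in the second.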

Example 8 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperMasterModel method getDeploymentGroupTasks.

private Map<String, VersionedValue<DeploymentGroupTasks>> getDeploymentGroupTasks(final ZooKeeperClient client) {
    final String folder = Paths.statusDeploymentGroupTasks();
    try {
        final List<String> names;
        try {
            names = client.getChildren(folder);
        } catch (NoNodeException e) {
            return Collections.emptyMap();
        }
        final Map<String, VersionedValue<DeploymentGroupTasks>> ret = Maps.newHashMap();
        for (final String name : names) {
            final String path = Paths.statusDeploymentGroupTasks(name);
            try {
                final Node node = client.getNode(path);
                final byte[] data = node.getBytes();
                final int version = node.getStat().getVersion();
                if (data.length == 0) {
                    // This can happen because of ensurePath creates an empty node
                    log.debug("Ignoring empty deployment group tasks {}", name);
                } else {
                    final DeploymentGroupTasks val = parse(data, DeploymentGroupTasks.class);
                    ret.put(name, VersionedValue.of(val, version));
                }
            } catch (NoNodeException e) {
                // Ignore, the deployment group was deleted before we had a chance to read it.
                log.debug("Ignoring deleted deployment group tasks {}", name);
            }
        }
        return ret;
    } catch (KeeperException | IOException e) {
        throw new HeliosRuntimeException("getting deployment group tasks failed", e);
    }
}
Also used : VersionedValue(com.spotify.helios.servicescommon.VersionedValue) NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) Node(com.spotify.helios.servicescommon.coordination.Node) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) DeploymentGroupTasks(com.spotify.helios.common.descriptors.DeploymentGroupTasks) IOException(java.io.IOException) KeeperException(org.apache.zookeeper.KeeperException)
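
The getDeploymentGroupTasks example above lists children first and reads each child afterwards, so a child can be deleted between the two calls. It treats that case, and empty placeholder nodes, as "skip this entry" rather than as a failure. The following is a minimal sketch of the same idea, assuming a hypothetical in-memory Client and a local stand-in NoNodeException (neither is the real ZooKeeper or Helios API).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListThenReadSketch {

    /** Hypothetical stand-in for KeeperException.NoNodeException. */
    static class NoNodeException extends Exception {
        NoNodeException(String name) {
            super("no node: " + name);
        }
    }

    /** Hypothetical client backed by a map of child name to data. */
    static class Client {
        final Map<String, String> children = new HashMap<>();

        List<String> getChildren() {
            return new ArrayList<>(children.keySet());
        }

        String getData(String name) throws NoNodeException {
            String data = children.get(name);
            if (data == null) {
                throw new NoNodeException(name);
            }
            return data;
        }
    }

    /** Reads every listed child, skipping deleted and empty ones. */
    static Map<String, String> readAll(Client client, List<String> names) {
        Map<String, String> result = new HashMap<>();
        for (String name : names) {
            try {
                String data = client.getData(name);
                if (data.isEmpty()) {
                    continue; // empty placeholder node, nothing to parse
                }
                result.put(name, data);
            } catch (NoNodeException e) {
                // The child was deleted after it was listed; ignore it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Client client = new Client();
        client.children.put("a", "1");
        client.children.put("b", "2");
        client.children.put("c", "");
        List<String> names = client.getChildren();
        client.children.remove("b"); // simulate a concurrent delete
        System.out.println(readAll(client, names));
    }
}
```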

Example 9 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperRegistrarUtil method deregisterHost.

public static void deregisterHost(final ZooKeeperClient client, final String host) throws HostNotFoundException, HostStillInUseException {
    log.info("deregistering host: {}", host);
    // TODO (dano): handle retry failures
    try {
        final List<ZooKeeperOperation> operations = Lists.newArrayList();
        if (client.exists(Paths.configHost(host)) == null) {
            throw new HostNotFoundException("host [" + host + "] does not exist");
        }
        // Remove all jobs deployed to this host
        final List<String> jobs = safeGetChildren(client, Paths.configHostJobs(host));
        for (final String jobString : jobs) {
            final JobId job = JobId.fromString(jobString);
            final String hostJobPath = Paths.configHostJob(host, job);
            final List<String> nodes = safeListRecursive(client, hostJobPath);
            for (final String node : reverse(nodes)) {
                operations.add(delete(node));
            }
            if (client.exists(Paths.configJobHost(job, host)) != null) {
                operations.add(delete(Paths.configJobHost(job, host)));
            }
            // Clean out the history for each job
            final List<String> history = safeListRecursive(client, Paths.historyJobHost(job, host));
            for (final String s : reverse(history)) {
                operations.add(delete(s));
            }
        }
        operations.add(delete(Paths.configHostJobs(host)));
        // Remove the host status
        final List<String> nodes = safeListRecursive(client, Paths.statusHost(host));
        for (final String node : reverse(nodes)) {
            operations.add(delete(node));
        }
        // Remove port allocations
        final List<String> ports = safeGetChildren(client, Paths.configHostPorts(host));
        for (final String port : ports) {
            operations.add(delete(Paths.configHostPort(host, Integer.valueOf(port))));
        }
        operations.add(delete(Paths.configHostPorts(host)));
        // Remove host id
        final String idPath = Paths.configHostId(host);
        if (client.exists(idPath) != null) {
            operations.add(delete(idPath));
        }
        // Remove host config root
        operations.add(delete(Paths.configHost(host)));
        client.transaction(operations);
    } catch (NoNodeException e) {
        throw new HostNotFoundException(host);
    } catch (KeeperException e) {
        throw new HeliosRuntimeException(e);
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) HostNotFoundException(com.spotify.helios.master.HostNotFoundException) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) JobId(com.spotify.helios.common.descriptors.JobId) KeeperException(org.apache.zookeeper.KeeperException)
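
The deregisterHost example above iterates over reverse(nodes) when queuing deletes. ZooKeeper refuses to delete a node that still has children, so a subtree has to be removed leaves first. The following sketch illustrates that ordering on a hypothetical in-memory Tree. It assumes the recursive listing returns parents before children, which the use of reverse in the example suggests but the listing does not show; the Tree class and its methods are illustrative, not the ZooKeeper or Helios API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class ReverseDeleteSketch {

    /** Hypothetical in-memory tree of absolute paths. */
    static class Tree {
        final TreeSet<String> paths = new TreeSet<>();

        /** Lists root and all descendants, parents before children. */
        List<String> listRecursive(String root) {
            List<String> out = new ArrayList<>();
            for (String p : paths) {
                if (p.equals(root) || p.startsWith(root + "/")) {
                    out.add(p);
                }
            }
            return out;
        }

        /** Refuses to delete a node that still has children. */
        void delete(String path) {
            for (String p : paths) {
                if (p.startsWith(path + "/")) {
                    throw new IllegalStateException("not empty: " + path);
                }
            }
            paths.remove(path);
        }
    }

    /** Deletes a subtree by walking the parent-first listing backwards. */
    static void deleteSubtree(Tree tree, String root) {
        List<String> nodes = tree.listRecursive(root);
        Collections.reverse(nodes); // children now come before their parents
        for (String node : nodes) {
            tree.delete(node);
        }
    }

    public static void main(String[] args) {
        Tree tree = new Tree();
        tree.paths.add("/status/hosts/h1");
        tree.paths.add("/status/hosts/h1/jobs");
        tree.paths.add("/status/hosts/h1/jobs/j1");
        tree.paths.add("/status/hosts/h2");
        deleteSubtree(tree, "/status/hosts/h1");
        System.out.println(tree.paths);
    }
}
```

Deleting in listing order instead would fail on the first node, because the root of the subtree still has children at that point.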

Example 10 with NoNodeException

use of org.apache.zookeeper.KeeperException.NoNodeException in project helios by spotify.

the class ZooKeeperRegistrarUtil method reRegisterHost.

/**
 * Re-register an agent with a different host id. Will remove the existing status of the agent
 * but preserve any jobs deployed to the host and their history.
 * @param client ZooKeeperClient
 * @param host Host
 * @param hostId ID of the host
 * @throws HostNotFoundException If the hostname we are trying to re-register as doesn't exist.
 * @throws KeeperException If an unexpected zookeeper error occurs.
 */
public static void reRegisterHost(final ZooKeeperClient client, final String host, final String hostId) throws HostNotFoundException, KeeperException {
    // * Delete everything in the /status/hosts/<hostname> subtree
    // * Don't delete any history for the job (on the host)
    // * DON'T touch anything in the /config/hosts/<hostname> subtree, except updating the host id
    log.info("re-registering host: {}, new host id: {}", host, hostId);
    try {
        final List<ZooKeeperOperation> operations = Lists.newArrayList();
        // Check that the host exists in ZK
        operations.add(check(Paths.configHost(host)));
        // Remove the host status
        final List<String> nodes = safeListRecursive(client, Paths.statusHost(host));
        for (final String node : reverse(nodes)) {
            operations.add(delete(node));
        }
        // ...and re-create the /status/hosts/<host>/jobs node + parent
        operations.add(create(Paths.statusHost(host)));
        operations.add(create(Paths.statusHostJobs(host)));
        // Update the host ID
        // We don't have WRITE permissions to the node, so delete and re-create it.
        operations.add(delete(Paths.configHostId(host)));
        operations.add(create(Paths.configHostId(host), hostId.getBytes(UTF_8)));
        client.transaction(operations);
    } catch (NoNodeException e) {
        throw new HostNotFoundException(host);
    } catch (KeeperException e) {
        throw new HeliosRuntimeException(e);
    }
}
Also used : NoNodeException(org.apache.zookeeper.KeeperException.NoNodeException) ZooKeeperOperation(com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) HostNotFoundException(com.spotify.helios.master.HostNotFoundException) HeliosRuntimeException(com.spotify.helios.common.HeliosRuntimeException) KeeperException(org.apache.zookeeper.KeeperException)

Aggregations

NoNodeException (org.apache.zookeeper.KeeperException.NoNodeException) 44
KeeperException (org.apache.zookeeper.KeeperException) 30
HeliosRuntimeException (com.spotify.helios.common.HeliosRuntimeException) 16
IOException (java.io.IOException) 12
Stat (org.apache.zookeeper.data.Stat) 12
ZooKeeperClient (com.spotify.helios.servicescommon.coordination.ZooKeeperClient) 11
ZooKeeperOperation (com.spotify.helios.servicescommon.coordination.ZooKeeperOperation) 9
Job (com.spotify.helios.common.descriptors.Job) 8
ConnectionLossException (org.apache.zookeeper.KeeperException.ConnectionLossException) 8
JobId (com.spotify.helios.common.descriptors.JobId) 6
UnsupportedEncodingException (java.io.UnsupportedEncodingException) 5
UnknownHostException (java.net.UnknownHostException) 5
HashMap (java.util.HashMap) 5
Map (java.util.Map) 5
ZooKeeperException (org.apache.solr.common.cloud.ZooKeeperException) 5
NodeExistsException (org.apache.zookeeper.KeeperException.NodeExistsException) 5
SessionExpiredException (org.apache.zookeeper.KeeperException.SessionExpiredException) 5
DeploymentGroup (com.spotify.helios.common.descriptors.DeploymentGroup) 4
UUID (java.util.UUID) 4
TimeoutException (java.util.concurrent.TimeoutException) 4