
Example 6 with TokenSupplier

use of org.apache.cassandra.distributed.api.TokenSupplier in project cassandra by apache.

the class HostReplacementOfDownedClusterTest method hostReplacementOfDeadNodeAndOtherNodeStartsAfter.

/**
 * The cluster stops completely; the seed is then started and node2 is host replaced. Once both complete, node3 is
 * started to make sure it comes up correctly with the new host in the ring.
 */
@Test
public void hostReplacementOfDeadNodeAndOtherNodeStartsAfter() throws IOException {
    // start with 3 nodes, stop all nodes, start the seed, host replace the down node
    int numStartNodes = 3;
    TokenSupplier even = TokenSupplier.evenlyDistributedTokens(numStartNodes);
    try (Cluster cluster = Cluster.build(numStartNodes)
                                  .withConfig(c -> c.with(Feature.GOSSIP, Feature.NETWORK))
                                  .withTokenSupplier(node -> even.token(node == (numStartNodes + 1) ? 2 : node))
                                  .start()) {
        IInvokableInstance seed = cluster.get(1);
        IInvokableInstance nodeToRemove = cluster.get(2);
        IInvokableInstance nodeToStartAfterReplace = cluster.get(3);
        InetSocketAddress addressToReplace = nodeToRemove.broadcastAddress();
        setupCluster(cluster);
        // collect rows/tokens to detect issues later on if the state doesn't match
        SimpleQueryResult expectedState = nodeToRemove.coordinator().executeWithResult("SELECT * FROM " + KEYSPACE + ".tbl", ConsistencyLevel.ALL);
        List<String> beforeCrashTokens = getTokenMetadataTokens(seed);
        // now stop all nodes
        stopAll(cluster);
        // with all nodes down, now start the seed (should be first node)
        seed.startup();
        // at this point node2 should be known in gossip, but with generation/version of 0
        assertGossipInfo(seed, addressToReplace, 0, -1);
        // make sure node1 still has node2's tokens
        List<String> currentTokens = getTokenMetadataTokens(seed);
        Assertions.assertThat(currentTokens).as("Tokens no longer match after restarting").isEqualTo(beforeCrashTokens);
        // now create a new node to replace the other node
        IInvokableInstance replacingNode = replaceHostAndStart(cluster, nodeToRemove);
        // wait till the replacing node is in the ring
        awaitRingJoin(seed, replacingNode);
        awaitRingJoin(replacingNode, seed);
        // the replacing node is now properly in the ring, so let's add the other node back
        nodeToStartAfterReplace.startup();
        awaitRingJoin(seed, nodeToStartAfterReplace);
        awaitRingJoin(replacingNode, nodeToStartAfterReplace);
        // make sure all nodes are healthy
        awaitRingHealthy(seed);
        assertRingIs(seed, seed, replacingNode, nodeToStartAfterReplace);
        assertRingIs(replacingNode, seed, replacingNode, nodeToStartAfterReplace);
        logger.info("Current ring is {}", assertRingIs(nodeToStartAfterReplace, seed, replacingNode, nodeToStartAfterReplace));
        validateRows(seed.coordinator(), expectedState);
        validateRows(replacingNode.coordinator(), expectedState);
    }
}
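
The lambda passed to withTokenSupplier above appears to hand the replacement instance (node id numStartNodes + 1) the token of node 2, so the new host takes over the same range. A minimal, self-contained sketch of that remapping follows. Note that `evenTokens` and `withReplacement` are hypothetical helpers written for illustration, not part of the in-JVM dtest API, and the spacing formula is an assumption rather than the actual TokenSupplier implementation.

```java
import java.util.function.IntFunction;

final class TokenRemapSketch {
    // Hypothetical stand-in for an evenly distributed token supplier:
    // one token per node, spaced evenly over the signed 64-bit range (assumed formula).
    static IntFunction<Long> evenTokens(int numNodes) {
        long increment = (Long.MAX_VALUE / numNodes) * 2;
        return node -> Long.MIN_VALUE + 1 + (node - 1) * increment;
    }

    // Wraps a supplier so that the replacement node id receives the replaced node's token.
    static IntFunction<Long> withReplacement(IntFunction<Long> base, int replacementNode, int replacedNode) {
        return node -> base.apply(node == replacementNode ? replacedNode : node);
    }

    public static void main(String[] args) {
        int numStartNodes = 3;
        // Node 4 (the replacement) is mapped onto node 2's token, mirroring the lambda in the test.
        IntFunction<Long> tokens = withReplacement(evenTokens(numStartNodes), numStartNodes + 1, 2);
        for (int node = 1; node <= numStartNodes + 1; node++) {
            System.out.println("node" + node + " -> " + tokens.apply(node));
        }
    }
}
```

Running main should show node2 and node4 sharing a token while node1 and node3 keep their own, which is the property the host replacement relies on.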
Also used : HostReplacementTest.setupCluster(org.apache.cassandra.distributed.test.hostreplacement.HostReplacementTest.setupCluster) ClusterUtils.getTokenMetadataTokens(org.apache.cassandra.distributed.shared.ClusterUtils.getTokenMetadataTokens) LoggerFactory(org.slf4j.LoggerFactory) ClusterUtils.stopAll(org.apache.cassandra.distributed.shared.ClusterUtils.stopAll) TokenSupplier(org.apache.cassandra.distributed.api.TokenSupplier) ClusterUtils.replaceHostAndStart(org.apache.cassandra.distributed.shared.ClusterUtils.replaceHostAndStart) TestBaseImpl(org.apache.cassandra.distributed.test.TestBaseImpl) SimpleQueryResult(org.apache.cassandra.distributed.api.SimpleQueryResult) Assertions(org.assertj.core.api.Assertions) Feature(org.apache.cassandra.distributed.api.Feature) Logger(org.slf4j.Logger) GOSSIPER_QUARANTINE_DELAY(org.apache.cassandra.config.CassandraRelevantProperties.GOSSIPER_QUARANTINE_DELAY) ClusterUtils.awaitRingJoin(org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingJoin) HostReplacementTest.validateRows(org.apache.cassandra.distributed.test.hostreplacement.HostReplacementTest.validateRows) IOException(java.io.IOException) Test(org.junit.Test) ConsistencyLevel(org.apache.cassandra.distributed.api.ConsistencyLevel) InetSocketAddress(java.net.InetSocketAddress) List(java.util.List) IInvokableInstance(org.apache.cassandra.distributed.api.IInvokableInstance) ClusterUtils.assertNotInRing(org.apache.cassandra.distributed.shared.ClusterUtils.assertNotInRing) Cluster(org.apache.cassandra.distributed.Cluster) ClusterUtils.assertGossipInfo(org.apache.cassandra.distributed.shared.ClusterUtils.assertGossipInfo) ClusterUtils.assertRingIs(org.apache.cassandra.distributed.shared.ClusterUtils.assertRingIs) ClusterUtils.awaitRingHealthy(org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingHealthy)

Example 7 with TokenSupplier

use of org.apache.cassandra.distributed.api.TokenSupplier in project cassandra by apache.

the class HostReplacementOfDownedClusterTest method hostReplacementOfDeadNode.

/**
 * When the full cluster crashes, make sure that we can replace a dead node after recovery.  This can happen
 * with DC outages (assuming single DC setup) where the recovery isn't able to recover a specific node.
 */
@Test
public void hostReplacementOfDeadNode() throws IOException {
    // start with 2 nodes, stop both nodes, start the seed, host replace the down node
    TokenSupplier even = TokenSupplier.evenlyDistributedTokens(2);
    try (Cluster cluster = Cluster.build(2)
                                  .withConfig(c -> c.with(Feature.GOSSIP, Feature.NETWORK))
                                  .withTokenSupplier(node -> even.token(node == 3 ? 2 : node))
                                  .start()) {
        IInvokableInstance seed = cluster.get(1);
        IInvokableInstance nodeToRemove = cluster.get(2);
        InetSocketAddress addressToReplace = nodeToRemove.broadcastAddress();
        setupCluster(cluster);
        // collect rows/tokens to detect issues later on if the state doesn't match
        SimpleQueryResult expectedState = nodeToRemove.coordinator().executeWithResult("SELECT * FROM " + KEYSPACE + ".tbl", ConsistencyLevel.ALL);
        List<String> beforeCrashTokens = getTokenMetadataTokens(seed);
        // now stop all nodes
        stopAll(cluster);
        // with all nodes down, now start the seed (should be first node)
        seed.startup();
        // at this point node2 should be known in gossip, but with generation/version of 0
        assertGossipInfo(seed, addressToReplace, 0, -1);
        // make sure node1 still has node2's tokens
        List<String> currentTokens = getTokenMetadataTokens(seed);
        Assertions.assertThat(currentTokens).as("Tokens no longer match after restarting").isEqualTo(beforeCrashTokens);
        // now create a new node to replace the other node
        IInvokableInstance replacingNode = replaceHostAndStart(cluster, nodeToRemove);
        awaitRingJoin(seed, replacingNode);
        awaitRingJoin(replacingNode, seed);
        assertNotInRing(seed, nodeToRemove);
        logger.info("Current ring is {}", assertNotInRing(replacingNode, nodeToRemove));
        validateRows(seed.coordinator(), expectedState);
        validateRows(replacingNode.coordinator(), expectedState);
    }
}
Also used : HostReplacementTest.setupCluster(org.apache.cassandra.distributed.test.hostreplacement.HostReplacementTest.setupCluster) ClusterUtils.getTokenMetadataTokens(org.apache.cassandra.distributed.shared.ClusterUtils.getTokenMetadataTokens) LoggerFactory(org.slf4j.LoggerFactory) ClusterUtils.stopAll(org.apache.cassandra.distributed.shared.ClusterUtils.stopAll) TokenSupplier(org.apache.cassandra.distributed.api.TokenSupplier) ClusterUtils.replaceHostAndStart(org.apache.cassandra.distributed.shared.ClusterUtils.replaceHostAndStart) TestBaseImpl(org.apache.cassandra.distributed.test.TestBaseImpl) SimpleQueryResult(org.apache.cassandra.distributed.api.SimpleQueryResult) Assertions(org.assertj.core.api.Assertions) Feature(org.apache.cassandra.distributed.api.Feature) Logger(org.slf4j.Logger) GOSSIPER_QUARANTINE_DELAY(org.apache.cassandra.config.CassandraRelevantProperties.GOSSIPER_QUARANTINE_DELAY) ClusterUtils.awaitRingJoin(org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingJoin) HostReplacementTest.validateRows(org.apache.cassandra.distributed.test.hostreplacement.HostReplacementTest.validateRows) IOException(java.io.IOException) Test(org.junit.Test) ConsistencyLevel(org.apache.cassandra.distributed.api.ConsistencyLevel) InetSocketAddress(java.net.InetSocketAddress) List(java.util.List) IInvokableInstance(org.apache.cassandra.distributed.api.IInvokableInstance) ClusterUtils.assertNotInRing(org.apache.cassandra.distributed.shared.ClusterUtils.assertNotInRing) Cluster(org.apache.cassandra.distributed.Cluster) ClusterUtils.assertGossipInfo(org.apache.cassandra.distributed.shared.ClusterUtils.assertGossipInfo) ClusterUtils.assertRingIs(org.apache.cassandra.distributed.shared.ClusterUtils.assertRingIs) ClusterUtils.awaitRingHealthy(org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingHealthy)

Example 8 with TokenSupplier

use of org.apache.cassandra.distributed.api.TokenSupplier in project cassandra by apache.

the class NodeCannotJoinAsHibernatingNodeWithoutReplaceAddressTest method test.

@Test
public void test() throws IOException, InterruptedException {
    TokenSupplier even = TokenSupplier.evenlyDistributedTokens(2);
    try (Cluster cluster = init(Cluster.build(2)
                                       .withConfig(c -> c.with(Feature.values()).set(Constants.KEY_DTEST_API_STARTUP_FAILURE_AS_SHUTDOWN, false))
                                       .withInstanceInitializer(BBHelper::install)
                                       .withTokenSupplier(node -> even.token((node == 3 || node == 4) ? 2 : node))
                                       .start())) {
        final IInvokableInstance toReplace = cluster.get(2);
        final String toReplaceAddress = toReplace.broadcastAddress().getAddress().getHostAddress();
        SharedState.cluster = cluster;
        // ignore host replacement errors
        cluster.setUncaughtExceptionsFilter((nodeId, cause) -> nodeId > 2);
        fixDistributedSchemas(cluster);
        ClusterUtils.stopUnchecked(toReplace);
        try {
            ClusterUtils.replaceHostAndStart(cluster, toReplace, (inst, ignore) -> ClusterUtils.updateAddress(inst, toReplaceAddress));
            Assert.fail("Host replacement should exit with an error");
        } catch (Exception e) {
            // the instance is expected to fail, but it may not have finished shutting down yet, so wait for shutdown to complete
            SharedState.shutdownComplete.await(1, TimeUnit.MINUTES);
        }
        IInvokableInstance inst = ClusterUtils.addInstance(cluster, toReplace.config(), c -> c.set("auto_bootstrap", true));
        ClusterUtils.updateAddress(inst, toReplaceAddress);
        Assertions.assertThatThrownBy(() -> inst.startup()).hasMessageContaining("A node with address").hasMessageContaining("already exists, cancelling join");
    }
}
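
The try block above calls Assert.fail right after the operation that is expected to throw, and the catch clause only handles Exception. This works because JUnit's Assert.fail throws an AssertionError, which extends Error rather than Exception, so it escapes the catch and fails the test when the replacement unexpectedly succeeds. A small sketch of that control flow, using only the JDK; `operationExpectedToFail` and `run` are hypothetical names for illustration, and a plain AssertionError stands in for Assert.fail.

```java
final class FailInsideTrySketch {
    // Hypothetical operation standing in for a host replacement that must error out.
    static void operationExpectedToFail(boolean actuallyFails) {
        if (actuallyFails) {
            throw new IllegalStateException("replacement rejected");
        }
    }

    // Returns "expected-failure" when the operation throws; lets AssertionError escape when it does not.
    static String run(boolean actuallyFails) {
        try {
            operationExpectedToFail(actuallyFails);
            // Stand-in for Assert.fail(...), which also throws AssertionError.
            throw new AssertionError("Host replacement should exit with an error");
        } catch (Exception e) {
            // AssertionError extends Error, not Exception, so it is never caught here.
            return "expected-failure";
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        try {
            run(false);
        } catch (AssertionError e) {
            System.out.println("test would fail: " + e.getMessage());
        }
    }
}
```

Had the catch clause been written as catch (Throwable t), the AssertionError would be swallowed and the test would pass even when the replacement succeeded.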
Also used : IInvokableInstance(org.apache.cassandra.distributed.api.IInvokableInstance) TokenSupplier(org.apache.cassandra.distributed.api.TokenSupplier) ICluster(org.apache.cassandra.distributed.api.ICluster) Cluster(org.apache.cassandra.distributed.Cluster) IOException(java.io.IOException) Test(org.junit.Test)

Aggregations

IOException (java.io.IOException): 8
Cluster (org.apache.cassandra.distributed.Cluster): 8
IInvokableInstance (org.apache.cassandra.distributed.api.IInvokableInstance): 8
TokenSupplier (org.apache.cassandra.distributed.api.TokenSupplier): 8
Test (org.junit.Test): 8
Feature (org.apache.cassandra.distributed.api.Feature): 7
ClusterUtils.replaceHostAndStart (org.apache.cassandra.distributed.shared.ClusterUtils.replaceHostAndStart): 7
TestBaseImpl (org.apache.cassandra.distributed.test.TestBaseImpl): 7
List (java.util.List): 6
ConsistencyLevel (org.apache.cassandra.distributed.api.ConsistencyLevel): 6
SimpleQueryResult (org.apache.cassandra.distributed.api.SimpleQueryResult): 6
ClusterUtils.assertRingIs (org.apache.cassandra.distributed.shared.ClusterUtils.assertRingIs): 6
ClusterUtils.awaitRingHealthy (org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingHealthy): 6
ClusterUtils.awaitRingJoin (org.apache.cassandra.distributed.shared.ClusterUtils.awaitRingJoin): 6
Logger (org.slf4j.Logger): 6
LoggerFactory (org.slf4j.LoggerFactory): 6
BOOTSTRAP_SKIP_SCHEMA_CHECK (org.apache.cassandra.config.CassandraRelevantProperties.BOOTSTRAP_SKIP_SCHEMA_CHECK): 5
GOSSIPER_QUARANTINE_DELAY (org.apache.cassandra.config.CassandraRelevantProperties.GOSSIPER_QUARANTINE_DELAY): 5
ClusterUtils.getTokenMetadataTokens (org.apache.cassandra.distributed.shared.ClusterUtils.getTokenMetadataTokens): 5
Assertions (org.assertj.core.api.Assertions): 5