Example 1 with NodeToolResult

use of org.apache.cassandra.distributed.api.NodeToolResult in project cassandra by apache.

the class ClusterUtils method ring.

/**
 * Get the ring from the perspective of the instance.
 */
public static List<RingInstanceDetails> ring(IInstance inst) {
    NodeToolResult results = inst.nodetoolResult("ring");
    results.asserts().success();
    return parseRing(results.getStdout());
}
Also used : NodeToolResult(org.apache.cassandra.distributed.api.NodeToolResult)

Example 2 with NodeToolResult

use of org.apache.cassandra.distributed.api.NodeToolResult in project cassandra by apache.

the class ClientNetworkStopStartTest method assertNodetoolStdout.

private static void assertNodetoolStdout(IInvokableInstance node, String expectedStatus, String notExpected, String... nodetool) {
    NodeToolResult result = node.nodetoolResult(nodetool);
    result.asserts().success().stdoutContains(expectedStatus);
    if (notExpected != null)
        result.asserts().stdoutNotContains(notExpected);
}
Also used : NodeToolResult(org.apache.cassandra.distributed.api.NodeToolResult)
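
The chained `asserts()` calls in the examples above read as a fluent assertion style, where each check returns the same object so the next check can follow. The sketch below illustrates that general pattern only. `FluentAssertsSketch`, `ToolResult`, and its fields are hypothetical stand-ins invented for this illustration; this is not the implementation of NodeToolResult, and only the method names seen in the examples are mirrored.

```java
public final class FluentAssertsSketch {
    /** Hypothetical stand-in for a command result; NOT the real NodeToolResult. */
    public static final class ToolResult {
        private final int exitCode;
        private final String stdout;

        public ToolResult(int exitCode, String stdout) {
            this.exitCode = exitCode;
            this.stdout = stdout;
        }

        /** Entry point for chained checks, mirroring the call style in the examples. */
        public Asserts asserts() {
            return new Asserts();
        }

        public final class Asserts {
            public Asserts success() {
                if (exitCode != 0)
                    throw new AssertionError("expected exit code 0 but was " + exitCode);
                return this; // returning this is what allows chaining
            }

            public Asserts failure() {
                if (exitCode == 0)
                    throw new AssertionError("expected a non-zero exit code");
                return this;
            }

            public Asserts stdoutContains(String expected) {
                if (!stdout.contains(expected))
                    throw new AssertionError("stdout is missing: " + expected);
                return this;
            }

            public Asserts stdoutNotContains(String unexpected) {
                if (stdout.contains(unexpected))
                    throw new AssertionError("stdout unexpectedly contains: " + unexpected);
                return this;
            }
        }
    }

    public static void main(String[] args) {
        ToolResult result = new ToolResult(0, "running");
        // Each check throws AssertionError on mismatch, otherwise the chain continues.
        result.asserts().success().stdoutContains("running").stdoutNotContains("stopped");
        System.out.println("all checks passed");
    }
}
```

Because every check either throws or returns the same object, a failed chain stops at the first mismatch, which keeps the failure message specific to one condition.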

Example 3 with NodeToolResult

use of org.apache.cassandra.distributed.api.NodeToolResult in project cassandra by apache.

the class RepairCoordinatorFailingMessageTest method streamFailure.

@Test(timeout = 1 * 60 * 1000)
public void streamFailure() {
    String table = tableName("streamfailure");
    CLUSTER.schemaChange(format("CREATE TABLE %s.%s (key text, value text, PRIMARY KEY (key))", KEYSPACE, table));
    // there needs to be a difference to cause streaming to happen, so add to one node
    CLUSTER.get(2).executeInternal(format("INSERT INTO %s.%s (key) VALUES (?)", KEYSPACE, table), "some data");
    IMessageFilters.Filter filter = CLUSTER.verbs(Verb.SYNC_REQ).messagesMatching(of(m -> {
        throw new RuntimeException("stream fail");
    })).drop();
    try {
        NodeToolResult result = repair(1, KEYSPACE, table);
        result.asserts()
              .failure()
              .errorContains("Some repair failed")
              .notificationContains(NodeToolResult.ProgressEventType.ERROR, "Some repair failed")
              .notificationContains(NodeToolResult.ProgressEventType.COMPLETE, "finished with error");
    } finally {
        filter.off();
    }
}
Also used : IMessageFilters(org.apache.cassandra.distributed.api.IMessageFilters) NodeToolResult(org.apache.cassandra.distributed.api.NodeToolResult) Test(org.junit.Test)

Example 4 with NodeToolResult

use of org.apache.cassandra.distributed.api.NodeToolResult in project cassandra by apache.

the class RepairCoordinatorNeighbourDown method validationParticipentCrashesAndComesBack.

@Test
public void validationParticipentCrashesAndComesBack() {
    // Test what happens when a participant restarts in the middle of validation
    // Currently this isn't recoverable but could be.
    // TODO since this is a real restart, how would I test "long pause"? Can't send SIGSTOP since it is the same process
    String table = tableName("validationparticipentcrashesandcomesback");
    assertTimeoutPreemptively(Duration.ofMinutes(1), () -> {
        CLUSTER.schemaChange(format("CREATE TABLE %s.%s (key text, value text, PRIMARY KEY (key))", KEYSPACE, table));
        AtomicReference<Future<Void>> participantShutdown = new AtomicReference<>();
        CLUSTER.verbs(Verb.VALIDATION_REQ).to(2).messagesMatching(of(m -> {
            // the nice thing about this is that the lambda is "capturing" rather than "transfer":
            // it isn't serialized, and any object it holds isn't copied.
            participantShutdown.set(CLUSTER.get(2).shutdown());
            // drop it so this node doesn't reply before shutdown.
            return true;
        })).drop();
        // since nodetool is blocking, need to handle participantShutdown in the background
        CompletableFuture<Void> recovered = CompletableFuture.runAsync(() -> {
            try {
                while (participantShutdown.get() == null) {
                    // event not happened, wait for it
                    TimeUnit.MILLISECONDS.sleep(100);
                }
                Future<Void> f = participantShutdown.get();
                // wait for shutdown to complete
                f.get();
                CLUSTER.get(2).startup();
            } catch (Exception e) {
                if (e instanceof RuntimeException) {
                    throw (RuntimeException) e;
                }
                throw new RuntimeException(e);
            }
        });
        long repairExceptions = getRepairExceptions(CLUSTER, 1);
        NodeToolResult result = repair(1, KEYSPACE, table);
        // if recovery didn't happen then the results are not what are being tested, so block here first
        recovered.join();
        result.asserts().failure().errorContains("/127.0.0.2:7012 died");
        if (withNotifications) {
            result.asserts()
                  .notificationContains(NodeToolResult.ProgressEventType.ERROR, "/127.0.0.2:7012 died")
                  .notificationContains(NodeToolResult.ProgressEventType.COMPLETE, "finished with error");
        }
        Assert.assertEquals(repairExceptions + 1, getRepairExceptions(CLUSTER, 1));
        if (repairType != RepairType.PREVIEW) {
            assertParentRepairFailedWithMessageContains(CLUSTER, KEYSPACE, table, "/127.0.0.2:7012 died");
        } else {
            assertParentRepairNotExist(CLUSTER, KEYSPACE, table);
        }
    });
}
Also used : CompletableFuture(java.util.concurrent.CompletableFuture) Future(java.util.concurrent.Future) AtomicReference(java.util.concurrent.atomic.AtomicReference) NodeToolResult(org.apache.cassandra.distributed.api.NodeToolResult) UnknownHostException(java.net.UnknownHostException) Test(org.junit.Test)
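
The test above hands a future from a message-filter callback to a background task through an AtomicReference, then joins the background task before asserting. A minimal sketch of that hand-off, using only `java.util.concurrent`, might look like the following. The class name `BackgroundRecoverySketch`, the `run` method, and the event strings are hypothetical, and the "shutdown" and "restart" steps are simulated rather than touching a real cluster.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public final class BackgroundRecoverySketch {
    /** Runs the simulated scenario and returns the events in the order they occurred. */
    public static List<String> run() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        AtomicReference<Future<Void>> shutdown = new AtomicReference<>();

        // Background watcher: poll until the shutdown future is published,
        // wait for it to complete, then perform the simulated restart.
        CompletableFuture<Void> recovered = CompletableFuture.runAsync(() -> {
            try {
                while (shutdown.get() == null) {
                    TimeUnit.MILLISECONDS.sleep(10);
                }
                shutdown.get().get();
                events.add("restarted");
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });

        // Stand-in for the blocking call: partway through, a callback publishes
        // the shutdown future, and the shutdown later completes.
        CompletableFuture<Void> stop = new CompletableFuture<>();
        shutdown.set(stop);
        events.add("stopped");
        stop.complete(null);

        // Block until recovery finished, so later checks see the final state.
        recovered.join();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Joining `recovered` before any assertion matters for the same reason given in the test's own comment: if recovery has not happened, the results being checked are not the ones under test.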

Example 5 with NodeToolResult

use of org.apache.cassandra.distributed.api.NodeToolResult in project cassandra by apache.

the class RepairCoordinatorNeighbourDown method neighbourDown.

@Test
public void neighbourDown() {
    String table = tableName("neighbourdown");
    assertTimeoutPreemptively(Duration.ofMinutes(1), () -> {
        CLUSTER.schemaChange(format("CREATE TABLE %s.%s (key text, value text, PRIMARY KEY (key))", KEYSPACE, table));
        String downNodeAddress = CLUSTER.get(2).callOnInstance(() -> FBUtilities.getBroadcastAddressAndPort().getHostAddressAndPort());
        Future<Void> shutdownFuture = CLUSTER.get(2).shutdown();
        try {
            // wait for the node to stop
            shutdownFuture.get();
            // wait for the failure detector to detect this
            CLUSTER.get(1).runOnInstance(() -> {
                InetAddressAndPort neighbor;
                try {
                    neighbor = InetAddressAndPort.getByName(downNodeAddress);
                } catch (UnknownHostException e) {
                    throw new RuntimeException(e);
                }
                while (FailureDetector.instance.isAlive(neighbor)) {
                    Uninterruptibles.sleepUninterruptibly(500, TimeUnit.MILLISECONDS);
                }
            });
            long repairExceptions = getRepairExceptions(CLUSTER, 1);
            NodeToolResult result = repair(1, KEYSPACE, table);
            result.asserts().failure().errorContains("Endpoint not alive");
            if (withNotifications) {
                result.asserts()
                      .notificationContains(NodeToolResult.ProgressEventType.START, "Starting repair command")
                      .notificationContains(NodeToolResult.ProgressEventType.START, "repairing keyspace " + KEYSPACE + " with repair options")
                      .notificationContains(NodeToolResult.ProgressEventType.ERROR, "Endpoint not alive")
                      .notificationContains(NodeToolResult.ProgressEventType.COMPLETE, "finished with error");
            }
            Assert.assertEquals(repairExceptions + 1, getRepairExceptions(CLUSTER, 1));
        } finally {
            CLUSTER.get(2).startup();
        }
        // make sure to call outside of the try/finally so the node is up so we can actually query
        if (repairType != RepairType.PREVIEW) {
            assertParentRepairFailedWithMessageContains(CLUSTER, KEYSPACE, table, "Endpoint not alive");
        } else {
            assertParentRepairNotExist(CLUSTER, KEYSPACE, table);
        }
    });
}
Also used : InetAddressAndPort(org.apache.cassandra.locator.InetAddressAndPort) UnknownHostException(java.net.UnknownHostException) NodeToolResult(org.apache.cassandra.distributed.api.NodeToolResult) Test(org.junit.Test)

Aggregations

NodeToolResult (org.apache.cassandra.distributed.api.NodeToolResult): 29
Test (org.junit.Test): 24
Cluster (org.apache.cassandra.distributed.Cluster): 6
IMessageFilters (org.apache.cassandra.distributed.api.IMessageFilters): 5
UnknownHostException (java.net.UnknownHostException): 2
ExecutorService (java.util.concurrent.ExecutorService): 2
Token (org.apache.cassandra.dht.Token): 2
IInvokableInstance (org.apache.cassandra.distributed.api.IInvokableInstance): 2
IOException (java.io.IOException): 1
ArrayList (java.util.ArrayList): 1
Arrays (java.util.Arrays): 1
Random (java.util.Random): 1
UUID (java.util.UUID): 1
CompletableFuture (java.util.concurrent.CompletableFuture): 1
Future (java.util.concurrent.Future): 1
AtomicReference (java.util.concurrent.atomic.AtomicReference): 1
CassandraRelevantProperties (org.apache.cassandra.config.CassandraRelevantProperties): 1
ColumnFamilyStore (org.apache.cassandra.db.ColumnFamilyStore): 1
Range (org.apache.cassandra.dht.Range): 1
ConsistencyLevel (org.apache.cassandra.distributed.api.ConsistencyLevel): 1