
Example 1 with CasWriteUnknownResultException

use of org.apache.cassandra.exceptions.CasWriteUnknownResultException in project cassandra by apache.

the class StorageProxy method proposePaxos.

/**
 * Propose the {@param proposal} according to the {@param replicaPlan}.
 * When {@param backoffIfPartial} is true, the proposer backs off when it sees the proposal accepted by some replicas but not by a quorum.
 * The result of the corresponding CAS is uncertain, as the accepted proposal may or may not be spread to other nodes in later rounds.
 */
private static boolean proposePaxos(Commit proposal, ReplicaPlan.ForPaxosWrite replicaPlan, boolean backoffIfPartial, long queryStartNanoTime) throws WriteTimeoutException, CasWriteUnknownResultException {
    ProposeCallback callback = new ProposeCallback(replicaPlan.contacts().size(), replicaPlan.requiredParticipants(), !backoffIfPartial, replicaPlan.consistencyLevel(), queryStartNanoTime);
    Message<Commit> message = Message.out(PAXOS_PROPOSE_REQ, proposal);
    for (Replica replica : replicaPlan.contacts()) {
        if (replica.isSelf()) {
            PAXOS_PROPOSE_REQ.stage.execute(() -> {
                try {
                    Message<Boolean> response = message.responseWith(doPropose(proposal));
                    callback.onResponse(response);
                } catch (Exception ex) {
                    logger.error("Failed paxos propose locally", ex);
                }
            });
        } else {
            MessagingService.instance().sendWithCallback(message, replica.endpoint(), callback);
        }
    }
    callback.await();
    if (callback.isSuccessful())
        return true;
    if (backoffIfPartial && !callback.isFullyRefused())
        throw new CasWriteUnknownResultException(replicaPlan.consistencyLevel(), callback.getAcceptCount(), replicaPlan.requiredParticipants());
    return false;
}
Also used : Replica(org.apache.cassandra.locator.Replica) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) OverloadedException(org.apache.cassandra.exceptions.OverloadedException) ReadAbortException(org.apache.cassandra.exceptions.ReadAbortException) RejectException(org.apache.cassandra.db.RejectException) CasWriteTimeoutException(org.apache.cassandra.exceptions.CasWriteTimeoutException) WriteFailureException(org.apache.cassandra.exceptions.WriteFailureException) InvalidRequestException(org.apache.cassandra.exceptions.InvalidRequestException) RequestTimeoutException(org.apache.cassandra.exceptions.RequestTimeoutException) ReadTimeoutException(org.apache.cassandra.exceptions.ReadTimeoutException) TimeoutException(java.util.concurrent.TimeoutException) UnavailableException(org.apache.cassandra.exceptions.UnavailableException) WriteTimeoutException(org.apache.cassandra.exceptions.WriteTimeoutException) UncheckedInterruptedException(org.apache.cassandra.utils.concurrent.UncheckedInterruptedException) TombstoneOverwhelmingException(org.apache.cassandra.db.filter.TombstoneOverwhelmingException) RequestFailureException(org.apache.cassandra.exceptions.RequestFailureException) IsBootstrappingException(org.apache.cassandra.exceptions.IsBootstrappingException) ReadFailureException(org.apache.cassandra.exceptions.ReadFailureException)
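
The return/throw logic at the end of proposePaxos can be read as a three-way classification of the propose round. The sketch below is a simplified, standalone model and not Cassandra's code: `ProposeOutcomeSketch`, `Outcome`, and `classify` are hypothetical names, and it assumes that "fully refused" simply means zero accepts, which may not match how ProposeCallback actually tracks refusals.

```java
final class ProposeOutcomeSketch {
    enum Outcome { ACCEPTED, UNKNOWN, REJECTED }

    // Classify a finished propose round from the number of accepts received.
    static Outcome classify(int accepts, int required, boolean backoffIfPartial) {
        if (accepts >= required)
            return Outcome.ACCEPTED;            // a quorum accepted: proposePaxos returns true
        boolean fullyRefused = accepts == 0;    // assumption: no replica accepted at all
        if (backoffIfPartial && !fullyRefused)
            return Outcome.UNKNOWN;             // partial accept: CasWriteUnknownResultException
        return Outcome.REJECTED;                // proposePaxos returns false
    }

    public static void main(String[] args) {
        System.out.println(classify(2, 2, true));   // ACCEPTED
        System.out.println(classify(1, 2, true));   // UNKNOWN
        System.out.println(classify(1, 2, false));  // REJECTED
        System.out.println(classify(0, 2, true));   // REJECTED
    }
}
```

The UNKNOWN case is the interesting one: some replicas hold the accepted proposal, so a later Paxos round may still complete it even though this coordinator gave up.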

Example 2 with CasWriteUnknownResultException

use of org.apache.cassandra.exceptions.CasWriteUnknownResultException in project cassandra by apache.

the class StorageProxy method cas.

/**
 * Apply @param updates if and only if the current values in the row for @param key
 * match the provided @param conditions.  The algorithm is "raw" Paxos: that is, Paxos
 * minus leader election -- any node in the cluster may propose changes for any row,
 * which (that is, the row) is the unit of values being proposed, not single columns.
 *
 * The Paxos cohort is only the replicas for the given key, not the entire cluster.
 * So we expect performance to be reasonable, but CAS is still intended to be used
 * "when you really need it," not for all your updates.
 *
 * There are three phases to Paxos:
 *  1. Prepare: the coordinator generates a ballot (timeUUID in our case) and asks replicas to (a) promise
 *     not to accept updates from older ballots and (b) tell us about the most recent update they have already
 *     accepted.
 *  2. Accept: if a majority of replicas respond, the coordinator asks replicas to accept the value of the
 *     highest proposal ballot it heard about, or a new value if no in-progress proposals were reported.
 *  3. Commit (Learn): if a majority of replicas acknowledge the accept request, we can commit the new
 *     value.
 *
 *  Commit procedure is not covered in "Paxos Made Simple," and only briefly mentioned in "Paxos Made Live,"
 *  so here is our approach:
 *   3a. The coordinator sends a commit message to all replicas with the ballot and value.
 *   3b. Because of 1-2, this will be the highest-seen commit ballot.  The replicas will note that,
 *       and send it with subsequent promise replies.  This allows us to discard acceptance records
 *       for successfully committed replicas, without allowing incomplete proposals to commit erroneously
 *       later on.
 *
 *  Note that since we are performing a CAS rather than a simple update, we perform a read (of committed
 *  values) between the prepare and accept phases.  This gives us a slightly longer window for another
 *  coordinator to come along and trump our own promise with a newer one but is otherwise safe.
 *
 * @param keyspaceName the keyspace for the CAS
 * @param cfName the column family for the CAS
 * @param key the row key for the row to CAS
 * @param request the conditions for the CAS to apply as well as the update to perform if the conditions hold.
 * @param consistencyForPaxos the consistency for the paxos prepare and propose round. This can only be either SERIAL or LOCAL_SERIAL.
 * @param consistencyForCommit the consistency for write done during the commit phase. This can be anything, except SERIAL or LOCAL_SERIAL.
 *
 * @return null if the operation succeeds in updating the row, or the current values corresponding to the conditions
 * (since, if the CAS doesn't succeed, it means the current values do not match the conditions).
 */
public static RowIterator cas(String keyspaceName, String cfName, DecoratedKey key, CASRequest request, ConsistencyLevel consistencyForPaxos, ConsistencyLevel consistencyForCommit, ClientState state, int nowInSeconds, long queryStartNanoTime) throws UnavailableException, IsBootstrappingException, RequestFailureException, RequestTimeoutException, InvalidRequestException, CasWriteUnknownResultException {
    final long startTimeForMetrics = nanoTime();
    try {
        TableMetadata metadata = Schema.instance.validateTable(keyspaceName, cfName);
        if (DatabaseDescriptor.getPartitionDenylistEnabled() && DatabaseDescriptor.getDenylistWritesEnabled() && !partitionDenylist.isKeyPermitted(keyspaceName, cfName, key.getKey())) {
            denylistMetrics.incrementWritesRejected();
            throw new InvalidRequestException(String.format("Unable to CAS write to denylisted partition [0x%s] in %s/%s", key.toString(), keyspaceName, cfName));
        }
        Supplier<Pair<PartitionUpdate, RowIterator>> updateProposer = () -> {
            // read the current values and check they validate the conditions
            Tracing.trace("Reading existing values for CAS precondition");
            SinglePartitionReadCommand readCommand = (SinglePartitionReadCommand) request.readCommand(nowInSeconds);
            ConsistencyLevel readConsistency = consistencyForPaxos == ConsistencyLevel.LOCAL_SERIAL ? ConsistencyLevel.LOCAL_QUORUM : ConsistencyLevel.QUORUM;
            FilteredPartition current;
            try (RowIterator rowIter = readOne(readCommand, readConsistency, queryStartNanoTime)) {
                current = FilteredPartition.create(rowIter);
            }
            if (!request.appliesTo(current)) {
                Tracing.trace("CAS precondition does not match current values {}", current);
                casWriteMetrics.conditionNotMet.inc();
                return Pair.create(PartitionUpdate.emptyUpdate(metadata, key), current.rowIterator());
            }
            // Create the desired updates
            PartitionUpdate updates = request.makeUpdates(current, state);
            long size = updates.dataSize();
            casWriteMetrics.mutationSize.update(size);
            writeMetricsForLevel(consistencyForPaxos).mutationSize.update(size);
            // Apply triggers to cas updates. A consideration here is that
            // triggers emit Mutations, and so a given trigger implementation
            // may generate mutations for partitions other than the one this
            // paxos round is scoped for. In this case, TriggerExecutor will
            // validate that the generated mutations are targeted at the same
            // partition as the initial updates and reject (via an
            // InvalidRequestException) any which aren't.
            updates = TriggerExecutor.instance.execute(updates);
            return Pair.create(updates, null);
        };
        return doPaxos(metadata, key, consistencyForPaxos, consistencyForCommit, consistencyForCommit, queryStartNanoTime, casWriteMetrics, updateProposer);
    } catch (CasWriteUnknownResultException e) {
        casWriteMetrics.unknownResult.mark();
        throw e;
    } catch (CasWriteTimeoutException wte) {
        casWriteMetrics.timeouts.mark();
        writeMetricsForLevel(consistencyForPaxos).timeouts.mark();
        throw new CasWriteTimeoutException(wte.writeType, wte.consistency, wte.received, wte.blockFor, wte.contentions);
    } catch (ReadTimeoutException e) {
        casWriteMetrics.timeouts.mark();
        writeMetricsForLevel(consistencyForPaxos).timeouts.mark();
        throw e;
    } catch (ReadAbortException e) {
        casWriteMetrics.markAbort(e);
        writeMetricsForLevel(consistencyForPaxos).markAbort(e);
        throw e;
    } catch (WriteFailureException | ReadFailureException e) {
        casWriteMetrics.failures.mark();
        writeMetricsForLevel(consistencyForPaxos).failures.mark();
        throw e;
    } catch (UnavailableException e) {
        casWriteMetrics.unavailables.mark();
        writeMetricsForLevel(consistencyForPaxos).unavailables.mark();
        throw e;
    } finally {
        final long latency = nanoTime() - startTimeForMetrics;
        casWriteMetrics.addNano(latency);
        writeMetricsForLevel(consistencyForPaxos).addNano(latency);
    }
}
Also used : TableMetadata(org.apache.cassandra.schema.TableMetadata) ReadFailureException(org.apache.cassandra.exceptions.ReadFailureException) ReadTimeoutException(org.apache.cassandra.exceptions.ReadTimeoutException) SinglePartitionReadCommand(org.apache.cassandra.db.SinglePartitionReadCommand) UnavailableException(org.apache.cassandra.exceptions.UnavailableException) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition) ReadAbortException(org.apache.cassandra.exceptions.ReadAbortException) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) ConsistencyLevel(org.apache.cassandra.db.ConsistencyLevel) WriteFailureException(org.apache.cassandra.exceptions.WriteFailureException) RowIterator(org.apache.cassandra.db.rows.RowIterator) InvalidRequestException(org.apache.cassandra.exceptions.InvalidRequestException) CasWriteTimeoutException(org.apache.cassandra.exceptions.CasWriteTimeoutException) PartitionUpdate(org.apache.cassandra.db.partitions.PartitionUpdate) Pair(org.apache.cassandra.utils.Pair)
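
The prepare and accept phases described in the Javadoc above can be illustrated from a single replica's point of view. This is a toy, in-memory sketch under simplifying assumptions: ballots are plain longs rather than timeUUIDs, values are strings, and `ToyAcceptor`, `Promise`, `prepare`, and `accept` are hypothetical names that do not correspond to Cassandra's own Paxos state classes. It leaves out the commit phase and the read performed between prepare and accept.

```java
final class ToyAcceptor {
    // Reply to a prepare request: whether we promised, plus the latest accepted proposal (if any).
    static final class Promise {
        final boolean promised;
        final long acceptedBallot;    // -1 when nothing has been accepted yet
        final String acceptedValue;   // null when nothing has been accepted yet

        Promise(boolean promised, long acceptedBallot, String acceptedValue) {
            this.promised = promised;
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    private long promisedBallot = -1;
    private long acceptedBallot = -1;
    private String acceptedValue = null;

    // Phase 1: promise not to accept older ballots and report the most recent accepted update.
    synchronized Promise prepare(long ballot) {
        boolean promised = ballot > promisedBallot;
        if (promised)
            promisedBallot = ballot;
        return new Promise(promised, acceptedBallot, acceptedValue);
    }

    // Phase 2: accept the value unless a newer ballot has been promised in the meantime.
    synchronized boolean accept(long ballot, String value) {
        if (ballot < promisedBallot)
            return false;
        promisedBallot = ballot;
        acceptedBallot = ballot;
        acceptedValue = value;
        return true;
    }

    public static void main(String[] args) {
        ToyAcceptor replica = new ToyAcceptor();
        System.out.println(replica.prepare(1).promised);    // true
        System.out.println(replica.accept(1, "v1"));        // true
        System.out.println(replica.prepare(2).acceptedValue); // v1
        System.out.println(replica.accept(1, "stale"));     // false
    }
}
```

A coordinator that receives a promise reporting an already accepted value has to re-propose that value instead of its own, which is why the reply carries the accepted ballot and value.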

Example 3 with CasWriteUnknownResultException

use of org.apache.cassandra.exceptions.CasWriteUnknownResultException in project cassandra by apache.

the class ErrorMessageTest method testV4CasWriteResultUnknownSerDeser.

@Test
public void testV4CasWriteResultUnknownSerDeser() {
    int receivedBlockFor = 3;
    ConsistencyLevel consistencyLevel = ConsistencyLevel.SERIAL;
    CasWriteUnknownResultException ex = new CasWriteUnknownResultException(consistencyLevel, receivedBlockFor, receivedBlockFor);
    ErrorMessage deserialized = encodeThenDecode(ErrorMessage.fromException(ex), ProtocolVersion.V4);
    assertTrue(deserialized.error instanceof WriteTimeoutException);
    assertFalse(deserialized.error instanceof CasWriteUnknownResultException);
    WriteTimeoutException deserializedEx = (WriteTimeoutException) deserialized.error;
    assertEquals(consistencyLevel, deserializedEx.consistency);
    assertEquals(receivedBlockFor, deserializedEx.received);
    assertEquals(receivedBlockFor, deserializedEx.blockFor);
}
Also used : ConsistencyLevel(org.apache.cassandra.db.ConsistencyLevel) CasWriteTimeoutException(org.apache.cassandra.exceptions.CasWriteTimeoutException) WriteTimeoutException(org.apache.cassandra.exceptions.WriteTimeoutException) ErrorMessage(org.apache.cassandra.transport.messages.ErrorMessage) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) Test(org.junit.Test)

Example 4 with CasWriteUnknownResultException

use of org.apache.cassandra.exceptions.CasWriteUnknownResultException in project cassandra by apache.

the class ErrorMessageTest method testV5CasWriteResultUnknownSerDeser.

@Test
public void testV5CasWriteResultUnknownSerDeser() {
    int receivedBlockFor = 3;
    ConsistencyLevel consistencyLevel = ConsistencyLevel.SERIAL;
    CasWriteUnknownResultException ex = new CasWriteUnknownResultException(consistencyLevel, receivedBlockFor, receivedBlockFor);
    ErrorMessage deserialized = encodeThenDecode(ErrorMessage.fromException(ex), ProtocolVersion.V5);
    assertTrue(deserialized.error instanceof CasWriteUnknownResultException);
    CasWriteUnknownResultException deserializedEx = (CasWriteUnknownResultException) deserialized.error;
    assertEquals(consistencyLevel, deserializedEx.consistency);
    assertEquals(receivedBlockFor, deserializedEx.received);
    assertEquals(receivedBlockFor, deserializedEx.blockFor);
    assertEquals(ex.getMessage(), deserializedEx.getMessage());
    assertTrue(deserializedEx.getMessage().contains("CAS operation result is unknown"));
}
Also used : ConsistencyLevel(org.apache.cassandra.db.ConsistencyLevel) ErrorMessage(org.apache.cassandra.transport.messages.ErrorMessage) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) Test(org.junit.Test)
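
Read together, the V4 and V5 tests indicate that the dedicated unknown-result error only reaches clients on protocol V5, while V4 clients observe a WriteTimeoutException carrying the same consistency, received, and blockFor values. A minimal sketch of that kind of version-dependent mapping follows. `ErrorDowngradeSketch`, `Version`, `ErrorKind`, and `errorKindFor` are hypothetical names; the real mapping happens inside ErrorMessage's encoding, which is not shown on this page.

```java
final class ErrorDowngradeSketch {
    // Declared in ascending order so compareTo reflects protocol age.
    enum Version { V4, V5 }

    enum ErrorKind { WRITE_TIMEOUT, CAS_WRITE_UNKNOWN }

    // Pick the error a client would see for an unknown CAS result.
    static ErrorKind errorKindFor(Version clientVersion) {
        // Assumption: versions before V5 have no dedicated error, so they fall back to a write timeout.
        return clientVersion.compareTo(Version.V5) >= 0
               ? ErrorKind.CAS_WRITE_UNKNOWN
               : ErrorKind.WRITE_TIMEOUT;
    }

    public static void main(String[] args) {
        System.out.println(errorKindFor(Version.V4)); // WRITE_TIMEOUT
        System.out.println(errorKindFor(Version.V5)); // CAS_WRITE_UNKNOWN
    }
}
```

For client code this means a write timeout on a conditional statement over V4 may hide an unknown result, so it should be treated with the same caution.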

Example 5 with CasWriteUnknownResultException

use of org.apache.cassandra.exceptions.CasWriteUnknownResultException in project cassandra by apache.

the class CasWriteTest method testWriteUnknownResult.

@Test
public void testWriteUnknownResult() {
    cluster.filters().reset();
    int pk = pkGen.getAndIncrement();
    CountDownLatch ready = new CountDownLatch(1);
    cluster.filters().verbs(Verb.PAXOS_PROPOSE_REQ.id).from(1).to(2, 3).messagesMatching((from, to, msg) -> {
        if (to == 2) {
            // Inject a single CAS request in-between prepare and propose phases
            cluster.coordinator(2).execute(mkCasInsertQuery((a) -> pk, 1, 2), ConsistencyLevel.QUORUM);
            ready.countDown();
        } else {
            Uninterruptibles.awaitUninterruptibly(ready);
        }
        return false;
    }).drop();
    try {
        cluster.coordinator(1).execute(mkCasInsertQuery((a) -> pk, 1, 1), ConsistencyLevel.QUORUM);
    } catch (Throwable t) {
        Assert.assertEquals("Expecting cause to be CasWriteUnknownResultException", CasWriteUnknownResultException.class.getCanonicalName(), t.getClass().getCanonicalName());
        return;
    }
    Assert.fail("Expecting test to throw a CasWriteUnknownResultException");
}
Also used : Arrays(java.util.Arrays) BeforeClass(org.junit.BeforeClass) LoggerFactory(org.slf4j.LoggerFactory) AtomicReference(java.util.concurrent.atomic.AtomicReference) Function(java.util.function.Function) Supplier(java.util.function.Supplier) ArrayList(java.util.ArrayList) BaseMatcher(org.hamcrest.BaseMatcher) Future(java.util.concurrent.Future) AtomicInteger(java.util.concurrent.atomic.AtomicInteger) After(org.junit.After) ExpectedException(org.junit.rules.ExpectedException) ExecutorService(java.util.concurrent.ExecutorService) Before(org.junit.Before) AssertUtils(org.apache.cassandra.distributed.shared.AssertUtils) Description(org.hamcrest.Description) AfterClass(org.junit.AfterClass) Uninterruptibles(com.google.common.util.concurrent.Uninterruptibles) Logger(org.slf4j.Logger) FBUtilities(org.apache.cassandra.utils.FBUtilities) CoreMatchers.containsString(org.hamcrest.CoreMatchers.containsString) CasWriteTimeoutException(org.apache.cassandra.exceptions.CasWriteTimeoutException) ICluster(org.apache.cassandra.distributed.api.ICluster) Test(org.junit.Test) ConsistencyLevel(org.apache.cassandra.distributed.api.ConsistencyLevel) Verb(org.apache.cassandra.net.Verb) Executors(java.util.concurrent.Executors) TimeUnit(java.util.concurrent.TimeUnit) Consumer(java.util.function.Consumer) CountDownLatch(java.util.concurrent.CountDownLatch) List(java.util.List) Rule(org.junit.Rule) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) Cluster(org.apache.cassandra.distributed.Cluster) Assert(org.junit.Assert) InstanceClassLoader(org.apache.cassandra.distributed.shared.InstanceClassLoader)
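
The message filter in testWriteUnknownResult relies on a CountDownLatch: the branch handling the message to node 3 waits until the branch handling the message to node 2 has executed the competing CAS. The standalone sketch below shows only that ordering technique with two plain threads. `LatchOrderingSketch`, `run`, and the event strings are hypothetical, and no Cassandra classes are involved.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

final class LatchOrderingSketch {
    // Runs two threads and returns the events in the order they were recorded.
    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch ready = new CountDownLatch(1);

        // Plays the role of the "to == 3" branch: blocks until the latch is released.
        Thread waiter = new Thread(() -> {
            try {
                ready.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("released");
        });

        // Plays the role of the "to == 2" branch: does its work, then releases the latch.
        Thread injector = new Thread(() -> {
            events.add("injected");
            ready.countDown();
        });

        waiter.start();
        injector.start();
        try {
            waiter.join();
            injector.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sketch threads", e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [injected, released]
    }
}
```

Because the waiter records its event only after await() returns, and await() returns only after countDown(), the injected step is always recorded first regardless of thread scheduling.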

Aggregations

CasWriteUnknownResultException (org.apache.cassandra.exceptions.CasWriteUnknownResultException): 5
CasWriteTimeoutException (org.apache.cassandra.exceptions.CasWriteTimeoutException): 4
ConsistencyLevel (org.apache.cassandra.db.ConsistencyLevel): 3
Test (org.junit.Test): 3
InvalidRequestException (org.apache.cassandra.exceptions.InvalidRequestException): 2
ReadAbortException (org.apache.cassandra.exceptions.ReadAbortException): 2
ReadFailureException (org.apache.cassandra.exceptions.ReadFailureException): 2
ReadTimeoutException (org.apache.cassandra.exceptions.ReadTimeoutException): 2
UnavailableException (org.apache.cassandra.exceptions.UnavailableException): 2
WriteFailureException (org.apache.cassandra.exceptions.WriteFailureException): 2
WriteTimeoutException (org.apache.cassandra.exceptions.WriteTimeoutException): 2
ErrorMessage (org.apache.cassandra.transport.messages.ErrorMessage): 2
Uninterruptibles (com.google.common.util.concurrent.Uninterruptibles): 1
ArrayList (java.util.ArrayList): 1
Arrays (java.util.Arrays): 1
List (java.util.List): 1
CountDownLatch (java.util.concurrent.CountDownLatch): 1
ExecutorService (java.util.concurrent.ExecutorService): 1
Executors (java.util.concurrent.Executors): 1
Future (java.util.concurrent.Future): 1