Example 1 with BatchlogCleanup

Use of org.apache.cassandra.service.BatchlogResponseHandler.BatchlogCleanup in project cassandra by apache.

From the class StorageProxy, the method mutateAtomically:

/**
 * See mutate. Adds additional steps before and after writing a batch.
 * Before writing the batch (but after doing the availability check against the failure detector for the row replicas):
 *      write the entire batch to a batchlog elsewhere in the cluster.
 * After: remove the batchlog entry (after writing hints for the batch rows, if necessary).
 *
 * @param mutations the Mutations to be applied across the replicas
 * @param consistency_level the consistency level for the operation
 * @param requireQuorumForRemove if true, at least a quorum of nodes must see the update before the batchlog entry is deleted
 * @param queryStartNanoTime the value of nanoTime() when the query started to be processed
 */
public static void mutateAtomically(Collection<Mutation> mutations, ConsistencyLevel consistency_level, boolean requireQuorumForRemove, long queryStartNanoTime) throws UnavailableException, OverloadedException, WriteTimeoutException {
    Tracing.trace("Determining replicas for atomic batch");
    long startTime = nanoTime();
    List<WriteResponseHandlerWrapper> wrappers = new ArrayList<>(mutations.size());
    if (mutations.stream().anyMatch(mutation -> Keyspace.open(mutation.getKeyspaceName()).getReplicationStrategy().hasTransientReplicas()))
        throw new AssertionError("Logged batches are unsupported with transient replication");
    try {
        // If we are requiring quorum nodes for removal, we upgrade consistency level to QUORUM unless we already
        // require ALL, or EACH_QUORUM. This is so that *at least* QUORUM nodes see the update.
        ConsistencyLevel batchConsistencyLevel = requireQuorumForRemove ? ConsistencyLevel.QUORUM : consistency_level;
        switch(consistency_level) {
            case ALL:
            case EACH_QUORUM:
                batchConsistencyLevel = consistency_level;
        }
        ReplicaPlan.ForTokenWrite replicaPlan = ReplicaPlans.forBatchlogWrite(batchConsistencyLevel == ConsistencyLevel.ANY);
        final UUID batchUUID = UUIDGen.getTimeUUID();
        BatchlogCleanup cleanup = new BatchlogCleanup(mutations.size(), () -> asyncRemoveFromBatchlog(replicaPlan, batchUUID));
        // add a handler for each mutation - includes checking availability, but doesn't initiate any writes, yet
        for (Mutation mutation : mutations) {
            if (hasLocalMutation(mutation))
                writeMetrics.localRequests.mark();
            else
                writeMetrics.remoteRequests.mark();
            // this availability check may throw UnavailableException, exiting early if we can't fulfill the CL at this time
            WriteResponseHandlerWrapper wrapper = wrapBatchResponseHandler(mutation, consistency_level, batchConsistencyLevel, WriteType.BATCH, cleanup, queryStartNanoTime);
            wrappers.add(wrapper);
        }
        // write to the batchlog
        syncWriteToBatchlog(mutations, replicaPlan, batchUUID, queryStartNanoTime);
        // now actually perform the writes and wait for them to complete
        syncWriteBatchedMutations(wrappers, Stage.MUTATION);
    } catch (UnavailableException e) {
        writeMetrics.unavailables.mark();
        writeMetricsForLevel(consistency_level).unavailables.mark();
        Tracing.trace("Unavailable");
        throw e;
    } catch (WriteTimeoutException e) {
        writeMetrics.timeouts.mark();
        writeMetricsForLevel(consistency_level).timeouts.mark();
        Tracing.trace("Write timeout; received {} of {} required replies", e.received, e.blockFor);
        throw e;
    } catch (WriteFailureException e) {
        writeMetrics.failures.mark();
        writeMetricsForLevel(consistency_level).failures.mark();
        Tracing.trace("Write failure; received {} of {} required replies", e.received, e.blockFor);
        throw e;
    } finally {
        long latency = nanoTime() - startTime;
        writeMetrics.addNano(latency);
        writeMetricsForLevel(consistency_level).addNano(latency);
        updateCoordinatorWriteLatencyTableMetric(mutations, latency);
    }
}
Also used :

ReplicaPlan (org.apache.cassandra.locator.ReplicaPlan)
ArrayList (java.util.ArrayList)
UnavailableException (org.apache.cassandra.exceptions.UnavailableException)
ConsistencyLevel (org.apache.cassandra.db.ConsistencyLevel)
CasWriteTimeoutException (org.apache.cassandra.exceptions.CasWriteTimeoutException)
WriteTimeoutException (org.apache.cassandra.exceptions.WriteTimeoutException)
WriteFailureException (org.apache.cassandra.exceptions.WriteFailureException)
BatchlogCleanup (org.apache.cassandra.service.BatchlogResponseHandler.BatchlogCleanup)
Mutation (org.apache.cassandra.db.Mutation)
CounterMutation (org.apache.cassandra.db.CounterMutation)
IMutation (org.apache.cassandra.db.IMutation)
UUID (java.util.UUID)
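
The cleanup object created above behaves like a countdown latch around a callback: it is seeded with the number of mutations in the batch, each fully acknowledged mutation decrements it, and the thread that performs the final decrement removes the batchlog entry. Below is a minimal, self-contained sketch of that pattern; it mirrors the shape of BatchlogCleanup, but the field name pendingAcks and the demo main are illustrative assumptions, not the exact Cassandra internals.

import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the countdown-to-callback pattern behind BatchlogCleanup
// (the real class is a nested type of org.apache.cassandra.service.BatchlogResponseHandler).
public class BatchlogCleanupSketch {
    private final AtomicInteger pendingAcks; // one slot per mutation in the batch
    private final Runnable onAllAcked; // e.g. () -> asyncRemoveFromBatchlog(replicaPlan, batchUUID)

    public BatchlogCleanupSketch(int mutationCount, Runnable onAllAcked) {
        this.pendingAcks = new AtomicInteger(mutationCount);
        this.onAllAcked = onAllAcked;
    }

    // Called once per acknowledged mutation; decrementAndGet() makes the
    // count-to-zero transition atomic, so the callback fires exactly once.
    public void decrement() {
        if (pendingAcks.decrementAndGet() == 0)
            onAllAcked.run();
    }

    public static void main(String[] args) {
        BatchlogCleanupSketch cleanup = new BatchlogCleanupSketch(3, () -> System.out.println("batchlog entry removed"));
        cleanup.decrement();
        cleanup.decrement();
        cleanup.decrement(); // third ack drops the count to zero and runs the callback
    }
}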

Example 2 with BatchlogCleanup

Use of org.apache.cassandra.service.BatchlogResponseHandler.BatchlogCleanup in project cassandra by apache.

From the class StorageProxy, the method mutateMV:

/**
 * Use this method to have these Mutations applied
 * across all replicas.
 *
 * @param dataKey the partition key of the base mutation, used to find the paired view replica
 * @param mutations the mutations to be applied across the replicas
 * @param writeCommitLog if commitlog should be written
 * @param baseComplete time from epoch in ms that the local base mutation was (or will be) completed
 * @param queryStartNanoTime the value of nanoTime() when the query started to be processed
 */
public static void mutateMV(ByteBuffer dataKey, Collection<Mutation> mutations, boolean writeCommitLog, AtomicLong baseComplete, long queryStartNanoTime) throws UnavailableException, OverloadedException, WriteTimeoutException {
    Tracing.trace("Determining replicas for mutation");
    final String localDataCenter = DatabaseDescriptor.getEndpointSnitch().getLocalDatacenter();
    long startTime = nanoTime();
    try {
        // if we haven't joined the ring, write everything to batchlog because paired replicas may be stale
        final UUID batchUUID = UUIDGen.getTimeUUID();
        if (StorageService.instance.isStarting() || StorageService.instance.isJoining() || StorageService.instance.isMoving()) {
            BatchlogManager.store(Batch.createLocal(batchUUID, FBUtilities.timestampMicros(), mutations), writeCommitLog);
        } else {
            List<WriteResponseHandlerWrapper> wrappers = new ArrayList<>(mutations.size());
            // non-local mutations rely on the base mutation commit-log entry for eventual consistency
            Set<Mutation> nonLocalMutations = new HashSet<>(mutations);
            Token baseToken = StorageService.instance.getTokenMetadata().partitioner.getToken(dataKey);
            ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;
            // Since the base -> view replication is 1:1 we only need to store the batchlog locally
            ReplicaPlan.ForTokenWrite replicaPlan = ReplicaPlans.forLocalBatchlogWrite();
            BatchlogCleanup cleanup = new BatchlogCleanup(mutations.size(), () -> asyncRemoveFromBatchlog(replicaPlan, batchUUID));
            // add a handler for each mutation - includes checking availability, but doesn't initiate any writes, yet
            for (Mutation mutation : mutations) {
                if (hasLocalMutation(mutation))
                    writeMetrics.localRequests.mark();
                else
                    writeMetrics.remoteRequests.mark();
                String keyspaceName = mutation.getKeyspaceName();
                Token tk = mutation.key().getToken();
                AbstractReplicationStrategy replicationStrategy = Keyspace.open(keyspaceName).getReplicationStrategy();
                Optional<Replica> pairedEndpoint = ViewUtils.getViewNaturalEndpoint(replicationStrategy, baseToken, tk);
                EndpointsForToken pendingReplicas = StorageService.instance.getTokenMetadata().pendingEndpointsForToken(tk, keyspaceName);
                // if there are no paired endpoints there are probably range movements going on, so we write to the local batchlog to replay later
                if (!pairedEndpoint.isPresent()) {
                    if (pendingReplicas.isEmpty())
                        logger.warn("Received base materialized view mutation for key {} that does not belong " + "to this node. There is probably a range movement happening (move or decommission)," + "but this node hasn't updated its ring metadata yet. Adding mutation to " + "local batchlog to be replayed later.", mutation.key());
                    continue;
                }
                // if the paired endpoint is the local node we can apply the mutation locally, unless there are pending endpoints, in which case we do an ordinary write so the view mutation is also sent to the pending endpoint
                if (pairedEndpoint.get().isSelf() && StorageService.instance.isJoined() && pendingReplicas.isEmpty()) {
                    try {
                        mutation.apply(writeCommitLog);
                        nonLocalMutations.remove(mutation);
                        // won't trigger cleanup
                        cleanup.decrement();
                    } catch (Exception exc) {
                        logger.error("Error applying local view update: Mutation (keyspace {}, tables {}, partition key {})", mutation.getKeyspaceName(), mutation.getTableIds(), mutation.key());
                        throw exc;
                    }
                } else {
                    ReplicaLayout.ForTokenWrite liveAndDown = ReplicaLayout.forTokenWrite(replicationStrategy, EndpointsForToken.of(tk, pairedEndpoint.get()), pendingReplicas);
                    wrappers.add(wrapViewBatchResponseHandler(mutation, consistencyLevel, consistencyLevel, liveAndDown, baseComplete, WriteType.BATCH, cleanup, queryStartNanoTime));
                }
            }
            // Apply to local batchlog memtable in this thread
            if (!nonLocalMutations.isEmpty())
                BatchlogManager.store(Batch.createLocal(batchUUID, FBUtilities.timestampMicros(), nonLocalMutations), writeCommitLog);
            // Perform remote writes
            if (!wrappers.isEmpty())
                asyncWriteBatchedMutations(wrappers, localDataCenter, Stage.VIEW_MUTATION);
        }
    } finally {
        viewWriteMetrics.addNano(nanoTime() - startTime);
    }
}
Also used :

EndpointsForToken (org.apache.cassandra.locator.EndpointsForToken)
ReplicaPlan (org.apache.cassandra.locator.ReplicaPlan)
ArrayList (java.util.ArrayList)
Token (org.apache.cassandra.dht.Token)
Replica (org.apache.cassandra.locator.Replica)
OverloadedException (org.apache.cassandra.exceptions.OverloadedException)
ReadAbortException (org.apache.cassandra.exceptions.ReadAbortException)
RejectException (org.apache.cassandra.db.RejectException)
CasWriteTimeoutException (org.apache.cassandra.exceptions.CasWriteTimeoutException)
WriteFailureException (org.apache.cassandra.exceptions.WriteFailureException)
InvalidRequestException (org.apache.cassandra.exceptions.InvalidRequestException)
RequestTimeoutException (org.apache.cassandra.exceptions.RequestTimeoutException)
ReadTimeoutException (org.apache.cassandra.exceptions.ReadTimeoutException)
CasWriteUnknownResultException (org.apache.cassandra.exceptions.CasWriteUnknownResultException)
TimeoutException (java.util.concurrent.TimeoutException)
UnavailableException (org.apache.cassandra.exceptions.UnavailableException)
WriteTimeoutException (org.apache.cassandra.exceptions.WriteTimeoutException)
UncheckedInterruptedException (org.apache.cassandra.utils.concurrent.UncheckedInterruptedException)
TombstoneOverwhelmingException (org.apache.cassandra.db.filter.TombstoneOverwhelmingException)
RequestFailureException (org.apache.cassandra.exceptions.RequestFailureException)
IsBootstrappingException (org.apache.cassandra.exceptions.IsBootstrappingException)
ReadFailureException (org.apache.cassandra.exceptions.ReadFailureException)
ConsistencyLevel (org.apache.cassandra.db.ConsistencyLevel)
ReplicaLayout (org.apache.cassandra.locator.ReplicaLayout)
BatchlogCleanup (org.apache.cassandra.service.BatchlogResponseHandler.BatchlogCleanup)
AbstractReplicationStrategy (org.apache.cassandra.locator.AbstractReplicationStrategy)
Mutation (org.apache.cassandra.db.Mutation)
CounterMutation (org.apache.cassandra.db.CounterMutation)
IMutation (org.apache.cassandra.db.IMutation)
UUID (java.util.UUID)
HashSet (java.util.HashSet)
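
The per-mutation branching in mutateMV reduces to a three-way routing decision: no paired endpoint means the mutation stays in the local batchlog for later replay; a local paired endpoint with no pending replicas is applied directly (and the cleanup counter decremented); everything else becomes a remote batched write. The sketch below restates that decision as a standalone function; the enum, parameter names, and demo main are illustrative assumptions, not Cassandra's API.

import java.util.Optional;

// Sketch of mutateMV's per-mutation routing (names are illustrative, not Cassandra's API).
public class ViewWriteRouting {
    enum Route {
        // no paired endpoint: keep the mutation in the local batchlog and replay later
        BATCHLOG_REPLAY_ONLY,
        // we are the paired endpoint and nothing is pending: mutation.apply() + cleanup.decrement()
        APPLY_LOCALLY,
        // otherwise: wrap a view batch response handler and write remotely
        REMOTE_WRITE
    }

    static Route route(Optional<String> pairedEndpoint, boolean pairedEndpointIsSelf, boolean nodeIsJoined, boolean hasPendingReplicas) {
        if (!pairedEndpoint.isPresent())
            return Route.BATCHLOG_REPLAY_ONLY;
        // pending replicas force an ordinary write so they also receive the view mutation
        if (pairedEndpointIsSelf && nodeIsJoined && !hasPendingReplicas)
            return Route.APPLY_LOCALLY;
        return Route.REMOTE_WRITE;
    }

    public static void main(String[] args) {
        System.out.println(route(Optional.of("10.0.0.2"), false, true, false)); // REMOTE_WRITE
        System.out.println(route(Optional.empty(), false, true, false));        // BATCHLOG_REPLAY_ONLY
        System.out.println(route(Optional.of("10.0.0.1"), true, true, false));  // APPLY_LOCALLY
    }
}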

Aggregations

ArrayList (java.util.ArrayList) 2
UUID (java.util.UUID) 2
ConsistencyLevel (org.apache.cassandra.db.ConsistencyLevel) 2
CounterMutation (org.apache.cassandra.db.CounterMutation) 2
IMutation (org.apache.cassandra.db.IMutation) 2
Mutation (org.apache.cassandra.db.Mutation) 2
CasWriteTimeoutException (org.apache.cassandra.exceptions.CasWriteTimeoutException) 2
UnavailableException (org.apache.cassandra.exceptions.UnavailableException) 2
WriteFailureException (org.apache.cassandra.exceptions.WriteFailureException) 2
WriteTimeoutException (org.apache.cassandra.exceptions.WriteTimeoutException) 2
ReplicaPlan (org.apache.cassandra.locator.ReplicaPlan) 2
BatchlogCleanup (org.apache.cassandra.service.BatchlogResponseHandler.BatchlogCleanup) 2
HashSet (java.util.HashSet) 1
TimeoutException (java.util.concurrent.TimeoutException) 1
RejectException (org.apache.cassandra.db.RejectException) 1
TombstoneOverwhelmingException (org.apache.cassandra.db.filter.TombstoneOverwhelmingException) 1
Token (org.apache.cassandra.dht.Token) 1
CasWriteUnknownResultException (org.apache.cassandra.exceptions.CasWriteUnknownResultException) 1
InvalidRequestException (org.apache.cassandra.exceptions.InvalidRequestException) 1
IsBootstrappingException (org.apache.cassandra.exceptions.IsBootstrappingException) 1