Example 1 with FilteredPartition

use of org.apache.cassandra.db.partitions.FilteredPartition in project cassandra by apache.

the class QueryPagerTest method namesQueryTest.

@Test
public void namesQueryTest() throws Exception {
    QueryPager pager = namesQuery("k0", "c1", "c5", "c7", "c8").getPager(null, ProtocolVersion.CURRENT);
    assertFalse(pager.isExhausted());
    List<FilteredPartition> partition = query(pager, 5, 4);
    assertRow(partition.get(0), "k0", "c1", "c5", "c7", "c8");
    assertTrue(pager.isExhausted());
}
Also used : QueryPager(org.apache.cassandra.service.pager.QueryPager) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition) Test(org.junit.Test)

Example 2 with FilteredPartition

use of org.apache.cassandra.db.partitions.FilteredPartition in project cassandra by apache.

the class QueryPagerTest method SliceQueryWithTombstoneTest.

@Test
public void SliceQueryWithTombstoneTest() throws Exception {
    // Testing for the bug of #6748
    String keyspace = "cql_keyspace";
    String table = "table2";
    ColumnFamilyStore cfs = Keyspace.open(keyspace).getColumnFamilyStore(table);
    // Insert rows but with a tombstone as last cell
    for (int i = 0; i < 5; i++) {
        executeInternal(String.format("INSERT INTO %s.%s (k, c, v) VALUES ('k%d', 'c%d', null)", keyspace, table, 0, i));
    }
    ReadCommand command = SinglePartitionReadCommand.create(cfs.metadata(), nowInSec, Util.dk("k0"), Slice.ALL);
    QueryPager pager = command.getPager(null, ProtocolVersion.CURRENT);
    for (int i = 0; i < 5; i++) {
        List<FilteredPartition> partitions = query(pager, 1);
        // The only live cell we should have each time is the row marker
        assertRow(partitions.get(0), "k0", "c" + i);
    }
}
Also used : QueryPager(org.apache.cassandra.service.pager.QueryPager) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition) Test(org.junit.Test)

Example 3 with FilteredPartition

use of org.apache.cassandra.db.partitions.FilteredPartition in project cassandra by apache.

the class StorageProxy method cas.

/**
 * Apply @param updates if and only if the current values in the row for @param key
 * match the provided @param conditions.  The algorithm is "raw" Paxos: that is, Paxos
 * minus leader election -- any node in the cluster may propose changes for any row,
 * which (that is, the row) is the unit of values being proposed, not single columns.
 *
 * The Paxos cohort is only the replicas for the given key, not the entire cluster.
 * So we expect performance to be reasonable, but CAS is still intended to be used
 * "when you really need it," not for all your updates.
 *
 * There are three phases to Paxos:
 *  1. Prepare: the coordinator generates a ballot (timeUUID in our case) and asks replicas to (a) promise
 *     not to accept updates from older ballots and (b) tell us about the most recent update they have already
 *     accepted.
 *  2. Accept: if a majority of replicas respond, the coordinator asks replicas to accept the value of the
 *     highest proposal ballot it heard about, or a new value if no in-progress proposals were reported.
 *  3. Commit (Learn): if a majority of replicas acknowledge the accept request, we can commit the new
 *     value.
 *
 *  Commit procedure is not covered in "Paxos Made Simple," and only briefly mentioned in "Paxos Made Live,"
 *  so here is our approach:
 *   3a. The coordinator sends a commit message to all replicas with the ballot and value.
 *   3b. Because of 1-2, this will be the highest-seen commit ballot.  The replicas will note that,
 *       and send it with subsequent promise replies.  This allows us to discard acceptance records
 *       for successfully committed replicas, without allowing incomplete proposals to commit erroneously
 *       later on.
 *
 *  Note that since we are performing a CAS rather than a simple update, we perform a read (of committed
 *  values) between the prepare and accept phases.  This gives us a slightly longer window for another
 *  coordinator to come along and trump our own promise with a newer one but is otherwise safe.
 *
 * @param keyspaceName the keyspace for the CAS
 * @param cfName the column family for the CAS
 * @param key the row key for the row to CAS
 * @param request the conditions for the CAS to apply as well as the update to perform if the conditions hold.
 * @param consistencyForPaxos the consistency for the paxos prepare and propose round. This can only be either SERIAL or LOCAL_SERIAL.
 * @param consistencyForCommit the consistency for write done during the commit phase. This can be anything, except SERIAL or LOCAL_SERIAL.
 *
 * @return null if the operation succeeds in updating the row, or the current values corresponding to conditions
 * (since, if the CAS doesn't succeed, it means the current values do not match the conditions).
 */
public static RowIterator cas(String keyspaceName,
                              String cfName,
                              DecoratedKey key,
                              CASRequest request,
                              ConsistencyLevel consistencyForPaxos,
                              ConsistencyLevel consistencyForCommit,
                              ClientState state,
                              int nowInSeconds,
                              long queryStartNanoTime)
throws UnavailableException, IsBootstrappingException, RequestFailureException, RequestTimeoutException, InvalidRequestException, CasWriteUnknownResultException {
    final long startTimeForMetrics = nanoTime();
    try {
        TableMetadata metadata = Schema.instance.validateTable(keyspaceName, cfName);
        if (DatabaseDescriptor.getPartitionDenylistEnabled() && DatabaseDescriptor.getDenylistWritesEnabled() && !partitionDenylist.isKeyPermitted(keyspaceName, cfName, key.getKey())) {
            denylistMetrics.incrementWritesRejected();
            throw new InvalidRequestException(String.format("Unable to CAS write to denylisted partition [0x%s] in %s/%s", key.toString(), keyspaceName, cfName));
        }
        Supplier<Pair<PartitionUpdate, RowIterator>> updateProposer = () -> {
            // read the current values and check they validate the conditions
            Tracing.trace("Reading existing values for CAS precondition");
            SinglePartitionReadCommand readCommand = (SinglePartitionReadCommand) request.readCommand(nowInSeconds);
            ConsistencyLevel readConsistency = consistencyForPaxos == ConsistencyLevel.LOCAL_SERIAL ? ConsistencyLevel.LOCAL_QUORUM : ConsistencyLevel.QUORUM;
            FilteredPartition current;
            try (RowIterator rowIter = readOne(readCommand, readConsistency, queryStartNanoTime)) {
                current = FilteredPartition.create(rowIter);
            }
            if (!request.appliesTo(current)) {
                Tracing.trace("CAS precondition does not match current values {}", current);
                casWriteMetrics.conditionNotMet.inc();
                return Pair.create(PartitionUpdate.emptyUpdate(metadata, key), current.rowIterator());
            }
            // Create the desired updates
            PartitionUpdate updates = request.makeUpdates(current, state);
            long size = updates.dataSize();
            casWriteMetrics.mutationSize.update(size);
            writeMetricsForLevel(consistencyForPaxos).mutationSize.update(size);
            // Apply triggers to cas updates. A consideration here is that
            // triggers emit Mutations, and so a given trigger implementation
            // may generate mutations for partitions other than the one this
            // paxos round is scoped for. In this case, TriggerExecutor will
            // validate that the generated mutations are targeted at the same
            // partition as the initial updates and reject (via an
            // InvalidRequestException) any which aren't.
            updates = TriggerExecutor.instance.execute(updates);
            return Pair.create(updates, null);
        };
        return doPaxos(metadata, key, consistencyForPaxos, consistencyForCommit, consistencyForCommit, queryStartNanoTime, casWriteMetrics, updateProposer);
    } catch (CasWriteUnknownResultException e) {
        casWriteMetrics.unknownResult.mark();
        throw e;
    } catch (CasWriteTimeoutException wte) {
        casWriteMetrics.timeouts.mark();
        writeMetricsForLevel(consistencyForPaxos).timeouts.mark();
        throw new CasWriteTimeoutException(wte.writeType, wte.consistency, wte.received, wte.blockFor, wte.contentions);
    } catch (ReadTimeoutException e) {
        casWriteMetrics.timeouts.mark();
        writeMetricsForLevel(consistencyForPaxos).timeouts.mark();
        throw e;
    } catch (ReadAbortException e) {
        casWriteMetrics.markAbort(e);
        writeMetricsForLevel(consistencyForPaxos).markAbort(e);
        throw e;
    } catch (WriteFailureException | ReadFailureException e) {
        casWriteMetrics.failures.mark();
        writeMetricsForLevel(consistencyForPaxos).failures.mark();
        throw e;
    } catch (UnavailableException e) {
        casWriteMetrics.unavailables.mark();
        writeMetricsForLevel(consistencyForPaxos).unavailables.mark();
        throw e;
    } finally {
        final long latency = nanoTime() - startTimeForMetrics;
        casWriteMetrics.addNano(latency);
        writeMetricsForLevel(consistencyForPaxos).addNano(latency);
    }
}
Also used : TableMetadata(org.apache.cassandra.schema.TableMetadata) ReadFailureException(org.apache.cassandra.exceptions.ReadFailureException) ReadTimeoutException(org.apache.cassandra.exceptions.ReadTimeoutException) SinglePartitionReadCommand(org.apache.cassandra.db.SinglePartitionReadCommand) UnavailableException(org.apache.cassandra.exceptions.UnavailableException) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition) ReadAbortException(org.apache.cassandra.exceptions.ReadAbortException) CasWriteUnknownResultException(org.apache.cassandra.exceptions.CasWriteUnknownResultException) ConsistencyLevel(org.apache.cassandra.db.ConsistencyLevel) WriteFailureException(org.apache.cassandra.exceptions.WriteFailureException) RowIterator(org.apache.cassandra.db.rows.RowIterator) InvalidRequestException(org.apache.cassandra.exceptions.InvalidRequestException) CasWriteTimeoutException(org.apache.cassandra.exceptions.CasWriteTimeoutException) PartitionUpdate(org.apache.cassandra.db.partitions.PartitionUpdate) Pair(org.apache.cassandra.utils.Pair)
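The three phases described in the Javadoc above (prepare, accept, commit) can be illustrated with a toy, in-memory sketch. This is not Cassandra's implementation: the class PaxosSketch, its Replica and Promise types, and the use of a plain long as the ballot are all hypothetical, and the read of committed values that cas performs between prepare and accept is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy single-value Paxos round. All names are hypothetical, not Cassandra's. */
class PaxosSketch {
    /** Reply to a prepare request. */
    static final class Promise {
        final boolean ok;
        final long acceptedBallot;
        final String acceptedValue;

        Promise(boolean ok, long acceptedBallot, String acceptedValue) {
            this.ok = ok;
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    /** One in-memory replica tracking promised, accepted and committed state. */
    static final class Replica {
        long promised = -1;          // highest ballot this replica has promised
        long acceptedBallot = -1;    // ballot of the in-progress (accepted) proposal
        String acceptedValue = null; // value of the in-progress proposal
        String committed = null;     // last committed value

        /** Phase 1: promise to ignore older ballots and report any in-progress proposal. */
        Promise prepare(long ballot) {
            if (ballot <= promised) return new Promise(false, -1, null);
            promised = ballot;
            return new Promise(true, acceptedBallot, acceptedValue);
        }

        /** Phase 2: accept unless a newer ballot has been promised in the meantime. */
        boolean accept(long ballot, String value) {
            if (ballot < promised) return false;
            promised = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return true;
        }

        /** Phase 3: record the committed value and discard the acceptance record. */
        void commit(long ballot, String value) {
            committed = value;
            if (acceptedBallot <= ballot) {
                acceptedBallot = -1;
                acceptedValue = null;
            }
        }
    }

    /** Runs one round; returns the committed value, or null if no majority was reached. */
    static String propose(List<Replica> replicas, long ballot, String newValue) {
        int quorum = replicas.size() / 2 + 1;
        int promises = 0;
        long highest = -1;
        String value = newValue;
        for (Replica r : replicas) {
            Promise p = r.prepare(ballot);
            if (!p.ok) continue;
            promises++;
            // Finish the most recent in-progress proposal instead of our own value.
            if (p.acceptedValue != null && p.acceptedBallot > highest) {
                highest = p.acceptedBallot;
                value = p.acceptedValue;
            }
        }
        if (promises < quorum) return null;

        int accepts = 0;
        for (Replica r : replicas) {
            if (r.accept(ballot, value)) accepts++;
        }
        if (accepts < quorum) return null;

        for (Replica r : replicas) r.commit(ballot, value);
        return value;
    }

    public static void main(String[] args) {
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) replicas.add(new Replica());
        System.out.println(propose(replicas, 1, "a")); // fresh ballot, nothing in progress
        System.out.println(propose(replicas, 1, "b")); // stale ballot, no promises
    }
}
```

In this sketch a proposer that learns of an in-progress proposal during prepare completes that proposal rather than its own value, which mirrors step 2 of the comment.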

Example 4 with FilteredPartition

use of org.apache.cassandra.db.partitions.FilteredPartition in project cassandra by apache.

the class ReadMessageTest method testGetColumn.

@Test
public void testGetColumn() {
    ColumnFamilyStore cfs = Keyspace.open(KEYSPACE1).getColumnFamilyStore(CF);
    new RowUpdateBuilder(cfs.metadata(), 0, ByteBufferUtil.bytes("key1")).clustering("Column1").add("val", ByteBufferUtil.bytes("abcd")).build().apply();
    ColumnMetadata col = cfs.metadata().getColumn(ByteBufferUtil.bytes("val"));
    int found = 0;
    for (FilteredPartition partition : Util.getAll(Util.cmd(cfs).build())) {
        for (Row r : partition) {
            if (r.getCell(col).value().equals(ByteBufferUtil.bytes("abcd")))
                ++found;
        }
    }
    assertEquals(1, found);
}
Also used : ColumnMetadata(org.apache.cassandra.schema.ColumnMetadata) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition) Row(org.apache.cassandra.db.rows.Row) Test(org.junit.Test)
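The test above compares cell values with equals rather than ==. A minimal sketch of why that matters for java.nio.ByteBuffer follows; the bytes helper is a hypothetical stand-in for ByteBufferUtil.bytes(String), assumed here to wrap the UTF-8 encoding of the string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrates ByteBuffer content equality. The class and helper names are hypothetical. */
class ByteBufferEqualitySketch {
    /** Assumed equivalent of ByteBufferUtil.bytes(String): UTF-8 bytes wrapped in a buffer. */
    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteBuffer a = bytes("abcd");
        ByteBuffer b = bytes("abcd");
        System.out.println(a == b);      // false: two distinct buffer objects
        System.out.println(a.equals(b)); // true: same remaining bytes
        a.get();                         // consume one byte, advancing a's position
        System.out.println(a.equals(b)); // false: equals only compares remaining bytes
    }
}
```

Because ByteBuffer.equals only looks at the bytes between position and limit, reading from a buffer before comparing it changes the result.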

Example 5 with FilteredPartition

use of org.apache.cassandra.db.partitions.FilteredPartition in project cassandra by apache.

the class QueryPagerTest method rangeSliceQueryTest.

public void rangeSliceQueryTest(boolean testPagingState, ProtocolVersion protocolVersion) {
    ReadCommand command = rangeSliceQuery("k1", "k5", 100, "c1", "c7");
    QueryPager pager = command.getPager(null, protocolVersion);
    assertFalse(pager.isExhausted());
    List<FilteredPartition> partitions = query(pager, 5);
    assertRow(partitions.get(0), "k2", "c1", "c2", "c3", "c4", "c5");
    assertFalse(pager.isExhausted());
    pager = maybeRecreate(pager, command, testPagingState, protocolVersion);
    assertFalse(pager.isExhausted());
    partitions = query(pager, 4);
    assertRow(partitions.get(0), "k2", "c6", "c7");
    assertRow(partitions.get(1), "k3", "c1", "c2");
    assertFalse(pager.isExhausted());
    pager = maybeRecreate(pager, command, testPagingState, protocolVersion);
    assertFalse(pager.isExhausted());
    partitions = query(pager, 6);
    assertRow(partitions.get(0), "k3", "c3", "c4", "c5", "c6", "c7");
    assertRow(partitions.get(1), "k4", "c1");
    assertFalse(pager.isExhausted());
    pager = maybeRecreate(pager, command, testPagingState, protocolVersion);
    assertFalse(pager.isExhausted());
    partitions = query(pager, 5);
    assertRow(partitions.get(0), "k4", "c2", "c3", "c4", "c5", "c6");
    assertFalse(pager.isExhausted());
    pager = maybeRecreate(pager, command, testPagingState, protocolVersion);
    assertFalse(pager.isExhausted());
    partitions = query(pager, 5);
    assertRow(partitions.get(0), "k4", "c7");
    assertRow(partitions.get(1), "k5", "c1", "c2", "c3", "c4");
    assertFalse(pager.isExhausted());
    pager = maybeRecreate(pager, command, testPagingState, protocolVersion);
    assertFalse(pager.isExhausted());
    partitions = query(pager, 5, 3);
    assertRow(partitions.get(0), "k5", "c5", "c6", "c7");
    assertTrue(pager.isExhausted());
}
Also used : QueryPager(org.apache.cassandra.service.pager.QueryPager) FilteredPartition(org.apache.cassandra.db.partitions.FilteredPartition)
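The page boundaries asserted in this test can be reproduced with a toy pager over in-memory rows. The sketch below is not Cassandra's QueryPager: PagingSketch and fetchPage are hypothetical names, the data set (partitions k2 to k5, each with rows c1 to c7) is inferred from the assertions, and the rule that the pager becomes exhausted once a fetch returns fewer rows than requested is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy pager over in-memory rows. Hypothetical names, not Cassandra's QueryPager. */
class PagingSketch {
    private final List<String[]> rows = new ArrayList<>(); // each entry: {partitionKey, clustering}
    private int next = 0;
    private boolean exhausted = false;

    PagingSketch(List<String> keys, int rowsPerKey) {
        for (String k : keys) {
            for (int c = 1; c <= rowsPerKey; c++) {
                rows.add(new String[] { k, "c" + c });
            }
        }
    }

    /** Assumed rule: exhausted once a fetch returns fewer rows than requested. */
    boolean isExhausted() {
        return exhausted;
    }

    /** Returns up to pageSize rows, grouped by partition key in encounter order. */
    Map<String, List<String>> fetchPage(int pageSize) {
        Map<String, List<String>> page = new LinkedHashMap<>();
        int count = 0;
        while (count < pageSize && next < rows.size()) {
            String[] row = rows.get(next++);
            page.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            count++;
        }
        if (count < pageSize) exhausted = true;
        return page;
    }

    public static void main(String[] args) {
        PagingSketch pager = new PagingSketch(List.of("k2", "k3", "k4", "k5"), 7);
        // Same page sizes as the test: a page may end in the middle of a partition.
        for (int size : new int[] { 5, 4, 6, 5, 5, 5 }) {
            System.out.println(pager.fetchPage(size) + " exhausted=" + pager.isExhausted());
        }
    }
}
```

With these inputs the second page holds the last two rows of k2 followed by the first two rows of k3, which is the split the test checks with partitions.get(0) and partitions.get(1).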

Aggregations

FilteredPartition (org.apache.cassandra.db.partitions.FilteredPartition) 15
QueryPager (org.apache.cassandra.service.pager.QueryPager) 9
Test (org.junit.Test) 6
RowIterator (org.apache.cassandra.db.rows.RowIterator) 3
SinglePartitionReadCommand (org.apache.cassandra.db.SinglePartitionReadCommand) 2
PartitionIterator (org.apache.cassandra.db.partitions.PartitionIterator) 2
Row (org.apache.cassandra.db.rows.Row) 2
SSTableReader (org.apache.cassandra.io.sstable.format.SSTableReader) 2
ColumnMetadata (org.apache.cassandra.schema.ColumnMetadata) 2
TableMetadata (org.apache.cassandra.schema.TableMetadata) 2
ArrayList (java.util.ArrayList) 1
ColumnIdentifier (org.apache.cassandra.cql3.ColumnIdentifier) 1
ColumnFamilyStore (org.apache.cassandra.db.ColumnFamilyStore) 1
ConsistencyLevel (org.apache.cassandra.db.ConsistencyLevel) 1
DecoratedKey (org.apache.cassandra.db.DecoratedKey) 1
Directories (org.apache.cassandra.db.Directories) 1
Keyspace (org.apache.cassandra.db.Keyspace) 1
ReadExecutionController (org.apache.cassandra.db.ReadExecutionController) 1
ClusteringIndexSliceFilter (org.apache.cassandra.db.filter.ClusteringIndexSliceFilter) 1
PartitionUpdate (org.apache.cassandra.db.partitions.PartitionUpdate) 1