Example 6 with ReplicationResponse

use of org.elasticsearch.action.support.replication.ReplicationResponse in project crate by crate.

the class RetentionLeaseIT method testRetentionLeaseSyncedOnRemove.

@Test
public void testRetentionLeaseSyncedOnRemove() throws Exception {
    final int numberOfReplicas = 2 - scaledRandomIntBetween(0, 2);
    internalCluster().ensureAtLeastNumDataNodes(1 + numberOfReplicas);
    execute("create table doc.tbl (x int) clustered into 1 shards with (number_of_replicas = ?)", new Object[] { numberOfReplicas });
    ensureGreen("tbl");
    final String primaryShardNodeId = clusterService().state().routingTable().index("tbl").shard(0).primaryShard().currentNodeId();
    final String primaryShardNodeName = clusterService().state().nodes().get(primaryShardNodeId).getName();
    final IndexShard primary = internalCluster().getInstance(IndicesService.class, primaryShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
    final int length = randomIntBetween(1, 8);
    final Map<String, RetentionLease> currentRetentionLeases = new LinkedHashMap<>();
    for (int i = 0; i < length; i++) {
        final String id = randomValueOtherThanMany(currentRetentionLeases.keySet()::contains, () -> randomAlphaOfLength(8));
        final long retainingSequenceNumber = randomLongBetween(0, Long.MAX_VALUE);
        final String source = randomAlphaOfLength(8);
        final CountDownLatch latch = new CountDownLatch(1);
        final ActionListener<ReplicationResponse> listener = countDownLatchListener(latch);
        // simulate a peer recovery which locks the soft deletes policy on the primary
        final Closeable retentionLock = randomBoolean() ? primary.acquireHistoryRetentionLock(Engine.HistorySource.INDEX) : () -> {};
        currentRetentionLeases.put(id, primary.addRetentionLease(id, retainingSequenceNumber, source, listener));
        latch.await();
        retentionLock.close();
    }
    for (int i = 0; i < length; i++) {
        final String id = randomFrom(currentRetentionLeases.keySet());
        final CountDownLatch latch = new CountDownLatch(1);
        primary.removeRetentionLease(id, countDownLatchListener(latch));
        // simulate a peer recovery which locks the soft deletes policy on the primary
        final Closeable retentionLock = randomBoolean() ? primary.acquireHistoryRetentionLock(Engine.HistorySource.INDEX) : () -> {};
        currentRetentionLeases.remove(id);
        latch.await();
        retentionLock.close();
        // check retention leases have been written on the primary
        assertThat(currentRetentionLeases, equalTo(RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(primary.loadRetentionLeases())));
        // check current retention leases have been synced to all replicas
        for (final ShardRouting replicaShard : clusterService().state().routingTable().index("tbl").shard(0).replicaShards()) {
            final String replicaShardNodeId = replicaShard.currentNodeId();
            final String replicaShardNodeName = clusterService().state().nodes().get(replicaShardNodeId).getName();
            final IndexShard replica = internalCluster().getInstance(IndicesService.class, replicaShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
            final Map<String, RetentionLease> retentionLeasesOnReplica = RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(replica.getRetentionLeases());
            assertThat(retentionLeasesOnReplica, equalTo(currentRetentionLeases));
            // check retention leases have been written on the replica
            assertThat(currentRetentionLeases, equalTo(RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(replica.loadRetentionLeases())));
        }
    }
}
Also used : IndexShard(org.elasticsearch.index.shard.IndexShard) Closeable(java.io.Closeable) IndicesService(org.elasticsearch.indices.IndicesService) CountDownLatch(java.util.concurrent.CountDownLatch) LinkedHashMap(java.util.LinkedHashMap) ReplicationResponse(org.elasticsearch.action.support.replication.ReplicationResponse) ShardId(org.elasticsearch.index.shard.ShardId) ShardRouting(org.elasticsearch.cluster.routing.ShardRouting) Test(org.junit.Test)
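
The countDownLatchListener helper called above is not part of this listing. A minimal sketch of its likely shape, consistent with the inline ActionListener.wrap calls in Examples 8 and 9 below (the exact body in RetentionLeaseIT may differ):

// Hypothetical reconstruction of the countDownLatchListener helper used above;
// it mirrors the inline ActionListener.wrap(r -> latch.countDown(), e -> fail(e.toString()))
// pattern that appears verbatim elsewhere in this class.
private static ActionListener<ReplicationResponse> countDownLatchListener(final CountDownLatch latch) {
    return ActionListener.wrap(
        r -> latch.countDown(),      // release the waiting test thread once the sync completes
        e -> fail(e.toString()));    // any sync failure fails the test immediately
}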

Example 7 with ReplicationResponse

use of org.elasticsearch.action.support.replication.ReplicationResponse in project crate by crate.

the class RetentionLeaseIT method testCanRenewRetentionLeaseWithoutWaitingForShards.

@Test
public void testCanRenewRetentionLeaseWithoutWaitingForShards() throws InterruptedException {
    final String idForInitialRetentionLease = randomAlphaOfLength(8);
    final long initialRetainingSequenceNumber = randomLongBetween(0, Long.MAX_VALUE);
    final AtomicReference<RetentionLease> retentionLease = new AtomicReference<>();
    runWaitForShardsTest(idForInitialRetentionLease, initialRetainingSequenceNumber, (primary, listener) -> {
        final long nextRetainingSequenceNumber = randomLongBetween(initialRetainingSequenceNumber, Long.MAX_VALUE);
        final String nextSource = randomAlphaOfLength(8);
        retentionLease.set(primary.renewRetentionLease(idForInitialRetentionLease, nextRetainingSequenceNumber, nextSource));
        listener.onResponse(new ReplicationResponse());
    }, primary -> {
        try {
            /*
             * If the background renew was able to execute, then the retention leases were persisted to disk. There is no other
             * way for the current retention leases to end up written to disk so we assume that if they are written to disk, it
             * implies that the background sync was able to execute despite wait for shards being set on the index.
             */
            assertBusy(() -> assertThat(RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(primary.loadRetentionLeases()).values(), contains(retentionLease.get())));
        } catch (final Exception e) {
            fail(e.toString());
        }
    });
}
Also used : AtomicReference(java.util.concurrent.atomic.AtomicReference) ElasticsearchException(org.elasticsearch.ElasticsearchException) ReplicationResponse(org.elasticsearch.action.support.replication.ReplicationResponse) Test(org.junit.Test)
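
assertBusy, used in the afterSync callback above, comes from the Elasticsearch test framework: it retries an assertion until it passes or a time budget runs out. A simplified illustration of that polling pattern (the framework version uses incremental backoff; the 10-second budget here is an assumption):

// Illustration of the assertBusy retry pattern, not the framework source.
static void assertBusySketch(final Runnable assertion) throws InterruptedException {
    final long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
    while (true) {
        try {
            assertion.run();          // assertion passed, we are done
            return;
        } catch (AssertionError e) {
            if (System.nanoTime() > deadline) {
                throw e;              // budget exhausted, surface the last failure
            }
            Thread.sleep(100);        // brief pause before retrying
        }
    }
}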

Example 8 with ReplicationResponse

use of org.elasticsearch.action.support.replication.ReplicationResponse in project crate by crate.

the class RetentionLeaseIT method runWaitForShardsTest.

private void runWaitForShardsTest(final String idForInitialRetentionLease, final long initialRetainingSequenceNumber, final BiConsumer<IndexShard, ActionListener<ReplicationResponse>> primaryConsumer, final Consumer<IndexShard> afterSync) throws InterruptedException {
    final int numDataNodes = internalCluster().numDataNodes();
    execute("create table doc.tbl (x int) clustered into 1 shards " + "with (" + "   number_of_replicas = ?, " + "   \"soft_deletes.enabled\" = true," + "   \"soft_deletes.retention_lease.sync_interval\" = ?)", new Object[] { numDataNodes == 1 ? 0 : numDataNodes - 1, TimeValue.timeValueSeconds(1).getStringRep() });
    ensureYellowAndNoInitializingShards("tbl");
    assertFalse(client().admin().cluster().prepareHealth("tbl").setWaitForActiveShards(numDataNodes).get().isTimedOut());
    final String primaryShardNodeId = clusterService().state().routingTable().index("tbl").shard(0).primaryShard().currentNodeId();
    final String primaryShardNodeName = clusterService().state().nodes().get(primaryShardNodeId).getName();
    final IndexShard primary = internalCluster().getInstance(IndicesService.class, primaryShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
    final String source = randomAlphaOfLength(8);
    final CountDownLatch latch = new CountDownLatch(1);
    final ActionListener<ReplicationResponse> listener = ActionListener.wrap(r -> latch.countDown(), e -> fail(e.toString()));
    primary.addRetentionLease(idForInitialRetentionLease, initialRetainingSequenceNumber, source, listener);
    latch.await();
    final String waitForActiveValue = randomBoolean() ? "all" : Integer.toString(numDataNodes);
    execute("alter table doc.tbl set (\"write.wait_for_active_shards\" = ?)", new Object[] { waitForActiveValue });
    final CountDownLatch actionLatch = new CountDownLatch(1);
    final AtomicBoolean success = new AtomicBoolean();
    primaryConsumer.accept(primary, new ActionListener<ReplicationResponse>() {

        @Override
        public void onResponse(final ReplicationResponse replicationResponse) {
            success.set(true);
            actionLatch.countDown();
        }

        @Override
        public void onFailure(final Exception e) {
            fail(e.toString());
        }
    });
    actionLatch.await();
    assertTrue(success.get());
    afterSync.accept(primary);
}
Also used : ShardId(org.elasticsearch.index.shard.ShardId) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) IndexShard(org.elasticsearch.index.shard.IndexShard) IndicesService(org.elasticsearch.indices.IndicesService) CountDownLatch(java.util.concurrent.CountDownLatch) ElasticsearchException(org.elasticsearch.ElasticsearchException) ReplicationResponse(org.elasticsearch.action.support.replication.ReplicationResponse)
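
For contrast with the renew case in Example 7, a hedged sketch of how an add-lease variant could drive this helper; the test name, the second lease, and the size assertion below are illustrative and not taken from this listing:

// Hypothetical caller of runWaitForShardsTest, mirroring Example 7 but exercising
// addRetentionLease instead of renewRetentionLease.
@Test
public void testCanAddRetentionLeaseWithoutWaitingForShards() throws InterruptedException {
    final String idForInitialRetentionLease = randomAlphaOfLength(8);
    runWaitForShardsTest(idForInitialRetentionLease, randomLongBetween(0, Long.MAX_VALUE), (primary, listener) -> {
        // add a second, distinct lease while wait_for_active_shards cannot be satisfied
        final String nextId = randomValueOtherThan(idForInitialRetentionLease, () -> randomAlphaOfLength(8));
        primary.addRetentionLease(nextId, randomLongBetween(0, Long.MAX_VALUE), randomAlphaOfLength(8), listener);
    }, primary -> {
        try {
            // as in Example 7, persistence to disk signals that the sync executed
            assertBusy(() -> assertThat(
                RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(primary.loadRetentionLeases()).size(),
                equalTo(2)));
        } catch (final Exception e) {
            fail(e.toString());
        }
    });
}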

Example 9 with ReplicationResponse

use of org.elasticsearch.action.support.replication.ReplicationResponse in project crate by crate.

the class RetentionLeaseIT method testRetentionLeasesSyncOnRecovery.

@Test
public void testRetentionLeasesSyncOnRecovery() throws Exception {
    final int numberOfReplicas = 2 - scaledRandomIntBetween(0, 2);
    internalCluster().ensureAtLeastNumDataNodes(1 + numberOfReplicas);
    /*
     * We effectively disable the background sync to ensure that the retention leases are not synced in the background
     * so that the only source of retention leases on the replicas would be from recovery.
     */
    execute("create table doc.tbl (x int) clustered into 1 shards " + "with (" + "   number_of_replicas = 0, " + "   \"soft_deletes.enabled\" = true, " + "   \"soft_deletes.retention_lease.sync_interval\" = ?)", new Object[] { TimeValue.timeValueHours(24).getStringRep() });
    allowNodes("tbl", 1);
    ensureYellow("tbl");
    execute("alter table doc.tbl set (number_of_replicas = ?)", new Object[] { numberOfReplicas });
    final String primaryShardNodeId = clusterService().state().routingTable().index("tbl").shard(0).primaryShard().currentNodeId();
    final String primaryShardNodeName = clusterService().state().nodes().get(primaryShardNodeId).getName();
    final IndexShard primary = internalCluster().getInstance(IndicesService.class, primaryShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
    final int length = randomIntBetween(1, 8);
    final Map<String, RetentionLease> currentRetentionLeases = new HashMap<>();
    for (int i = 0; i < length; i++) {
        final String id = randomValueOtherThanMany(currentRetentionLeases.keySet()::contains, () -> randomAlphaOfLength(8));
        final long retainingSequenceNumber = randomLongBetween(0, Long.MAX_VALUE);
        final String source = randomAlphaOfLength(8);
        final CountDownLatch latch = new CountDownLatch(1);
        final ActionListener<ReplicationResponse> listener = ActionListener.wrap(r -> latch.countDown(), e -> fail(e.toString()));
        currentRetentionLeases.put(id, primary.addRetentionLease(id, retainingSequenceNumber, source, listener));
        latch.await();
    }
    // Cause some recoveries to fail to ensure that retention leases are handled properly when retrying a recovery
    execute("set global persistent \"indices.recovery.retry_delay_network\" = '100ms'");
    final Semaphore recoveriesToDisrupt = new Semaphore(scaledRandomIntBetween(0, 4));
    final MockTransportService primaryTransportService = (MockTransportService) internalCluster().getInstance(TransportService.class, primaryShardNodeName);
    primaryTransportService.addSendBehavior((connection, requestId, action, request, options) -> {
        if (action.equals(PeerRecoveryTargetService.Actions.FINALIZE) && recoveriesToDisrupt.tryAcquire()) {
            if (randomBoolean()) {
                // return a ConnectTransportException to the START_RECOVERY action
                final TransportService replicaTransportService = internalCluster().getInstance(TransportService.class, connection.getNode().getName());
                final DiscoveryNode primaryNode = primaryTransportService.getLocalNode();
                replicaTransportService.disconnectFromNode(primaryNode);
                AbstractSimpleTransportTestCase.connectToNode(replicaTransportService, primaryNode);
            } else {
                // return an exception to the FINALIZE action
                throw new ElasticsearchException("failing recovery for test purposes");
            }
        }
        connection.sendRequest(requestId, action, request, options);
    });
    // now allow the replicas to be allocated and wait for recovery to finalize
    allowNodes("tbl", 1 + numberOfReplicas);
    ensureGreen("tbl");
    // check current retention leases have been synced to all replicas
    for (final ShardRouting replicaShard : clusterService().state().routingTable().index("tbl").shard(0).replicaShards()) {
        final String replicaShardNodeId = replicaShard.currentNodeId();
        final String replicaShardNodeName = clusterService().state().nodes().get(replicaShardNodeId).getName();
        final IndexShard replica = internalCluster().getInstance(IndicesService.class, replicaShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
        final Map<String, RetentionLease> retentionLeasesOnReplica = RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(replica.getRetentionLeases());
        assertThat(retentionLeasesOnReplica, equalTo(currentRetentionLeases));
        // check retention leases have been written on the replica; see RecoveryTarget#finalizeRecovery
        assertThat(currentRetentionLeases, equalTo(RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(replica.loadRetentionLeases())));
    }
}
Also used : DiscoveryNode(org.elasticsearch.cluster.node.DiscoveryNode) MockTransportService(org.elasticsearch.test.transport.MockTransportService) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) IndexShard(org.elasticsearch.index.shard.IndexShard) IndicesService(org.elasticsearch.indices.IndicesService) Semaphore(java.util.concurrent.Semaphore) ElasticsearchException(org.elasticsearch.ElasticsearchException) CountDownLatch(java.util.concurrent.CountDownLatch) ReplicationResponse(org.elasticsearch.action.support.replication.ReplicationResponse) ShardId(org.elasticsearch.index.shard.ShardId) MockTransportService(org.elasticsearch.test.transport.MockTransportService) TransportService(org.elasticsearch.transport.TransportService) ShardRouting(org.elasticsearch.cluster.routing.ShardRouting) Test(org.junit.Test)
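
Two pieces of state in this test would outlive it if nothing cleans them up: the persistent recovery setting and the send behavior installed on the mock transport. A hedged teardown sketch, assuming CrateDB's RESET GLOBAL statement and MockTransportService#clearAllRules; the real suite may instead rely on the test framework to wipe this state between tests:

// Hypothetical cleanup hook; not part of the listing above.
@After
public void resetRecoveryDisruption() {
    // undo the persistent cluster setting changed in the test
    execute("reset global \"indices.recovery.retry_delay_network\"");
    // drop any send-behavior rules left on mock transport services
    for (TransportService ts : internalCluster().getInstances(TransportService.class)) {
        if (ts instanceof MockTransportService) {
            ((MockTransportService) ts).clearAllRules();
        }
    }
}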

Example 10 with ReplicationResponse

use of org.elasticsearch.action.support.replication.ReplicationResponse in project crate by crate.

the class RetentionLeaseIT method testCanRenewRetentionLeaseUnderBlock.

@Test
public void testCanRenewRetentionLeaseUnderBlock() throws InterruptedException {
    final String idForInitialRetentionLease = randomAlphaOfLength(8);
    final long initialRetainingSequenceNumber = randomLongBetween(0, Long.MAX_VALUE);
    final AtomicReference<RetentionLease> retentionLease = new AtomicReference<>();
    runUnderBlockTest(idForInitialRetentionLease, initialRetainingSequenceNumber, (primary, listener) -> {
        final long nextRetainingSequenceNumber = randomLongBetween(initialRetainingSequenceNumber, Long.MAX_VALUE);
        final String nextSource = randomAlphaOfLength(8);
        retentionLease.set(primary.renewRetentionLease(idForInitialRetentionLease, nextRetainingSequenceNumber, nextSource));
        listener.onResponse(new ReplicationResponse());
    }, primary -> {
        try {
            /*
             * If the background renew was able to execute, then the retention leases were persisted to disk. There is no other
             * way for the current retention leases to end up written to disk so we assume that if they are written to disk, it
             * implies that the background sync was able to execute under a block.
             */
            assertBusy(() -> assertThat(RetentionLeaseUtils.toMapExcludingPeerRecoveryRetentionLeases(primary.loadRetentionLeases()).values(), contains(retentionLease.get())));
        } catch (final Exception e) {
            fail(e.toString());
        }
    });
}
Also used : AtomicReference(java.util.concurrent.atomic.AtomicReference) ElasticsearchException(org.elasticsearch.ElasticsearchException) ReplicationResponse(org.elasticsearch.action.support.replication.ReplicationResponse) Test(org.junit.Test)
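
runUnderBlockTest is referenced here but not included in this listing. Based on its name and on the shape of runWaitForShardsTest in Example 8, a plausible reconstruction: create the table, add the initial lease, put the table under a block, drive the consumer, and run the after-sync check before lifting the block. The blocks.write setting is an assumption; the real helper may randomize the block type:

// Hypothetical reconstruction of runUnderBlockTest, modeled on runWaitForShardsTest above.
private void runUnderBlockTest(final String idForInitialRetentionLease, final long initialRetainingSequenceNumber, final BiConsumer<IndexShard, ActionListener<ReplicationResponse>> primaryConsumer, final Consumer<IndexShard> afterSync) throws InterruptedException {
    execute("create table doc.tbl (x int) clustered into 1 shards with (number_of_replicas = 0)");
    ensureGreen("tbl");
    final String primaryShardNodeId = clusterService().state().routingTable().index("tbl").shard(0).primaryShard().currentNodeId();
    final String primaryShardNodeName = clusterService().state().nodes().get(primaryShardNodeId).getName();
    final IndexShard primary = internalCluster().getInstance(IndicesService.class, primaryShardNodeName).getShardOrNull(new ShardId(resolveIndex("tbl"), 0));
    final CountDownLatch latch = new CountDownLatch(1);
    primary.addRetentionLease(idForInitialRetentionLease, initialRetainingSequenceNumber, randomAlphaOfLength(8), ActionListener.wrap(r -> latch.countDown(), e -> fail(e.toString())));
    latch.await();
    // put the table under a write block, then exercise the consumer while the block is active
    execute("alter table doc.tbl set (\"blocks.write\" = true)");
    try {
        final CountDownLatch blockedLatch = new CountDownLatch(1);
        primaryConsumer.accept(primary, ActionListener.wrap(r -> blockedLatch.countDown(), e -> fail(e.toString())));
        blockedLatch.await();
        afterSync.accept(primary);
    } finally {
        execute("alter table doc.tbl set (\"blocks.write\" = false)");
    }
}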

Aggregations

ReplicationResponse (org.elasticsearch.action.support.replication.ReplicationResponse) 24
Test (org.junit.Test) 10
CountDownLatch (java.util.concurrent.CountDownLatch) 9
IndexShard (org.elasticsearch.index.shard.IndexShard) 9
ShardId (org.elasticsearch.index.shard.ShardId) 8
ShardRouting (org.elasticsearch.cluster.routing.ShardRouting) 7
Collections (java.util.Collections) 6
List (java.util.List) 6
AtomicBoolean (java.util.concurrent.atomic.AtomicBoolean) 6
PlainActionFuture (org.elasticsearch.action.support.PlainActionFuture) 6
IndexShardRoutingTable (org.elasticsearch.cluster.routing.IndexShardRoutingTable) 6
IndicesService (org.elasticsearch.indices.IndicesService) 6
IOException (java.io.IOException) 5
ArrayList (java.util.ArrayList) 5
Arrays (java.util.Arrays) 5
BytesArray (org.elasticsearch.common.bytes.BytesArray) 5
ByteSizeValue (org.elasticsearch.common.unit.ByteSizeValue) 5
Engine (org.elasticsearch.index.engine.Engine) 5
SequenceNumbers (org.elasticsearch.index.seqno.SequenceNumbers) 5
Translog (org.elasticsearch.index.translog.Translog) 5