Search in sources :

Example 6 with RepositoryException

use of org.opensearch.repositories.RepositoryException in project OpenSearch by opensearch-project.

the class ConcurrentSnapshotsIT method testMasterFailOverWithQueuedDeletes.

public void testMasterFailOverWithQueuedDeletes() throws Exception {
    internalCluster().startMasterOnlyNodes(3);
    final String dataNode = internalCluster().startDataOnlyNode();
    final String repoName = "test-repo";
    createRepository(repoName, "mock");
    final String firstIndex = "index-one";
    createIndexWithContent(firstIndex);
    final String firstSnapshot = "snapshot-one";
    blockDataNode(repoName, dataNode);
    final ActionFuture<CreateSnapshotResponse> firstSnapshotResponse = startFullSnapshotFromNonMasterClient(repoName, firstSnapshot);
    waitForBlock(dataNode, repoName, TimeValue.timeValueSeconds(30L));
    final String dataNode2 = internalCluster().startDataOnlyNode();
    ensureStableCluster(5);
    final String secondIndex = "index-two";
    createIndexWithContent(secondIndex, dataNode2, dataNode);
    final String secondSnapshot = "snapshot-two";
    final ActionFuture<CreateSnapshotResponse> secondSnapshotResponse = startFullSnapshot(repoName, secondSnapshot);
    logger.info("--> wait for snapshot on second data node to finish");
    awaitClusterState(state -> {
        final SnapshotsInProgress snapshotsInProgress = state.custom(SnapshotsInProgress.TYPE, SnapshotsInProgress.EMPTY);
        return snapshotsInProgress.entries().size() == 2 && snapshotHasCompletedShard(secondSnapshot, snapshotsInProgress);
    });
    final ActionFuture<AcknowledgedResponse> firstDeleteFuture = startDeleteFromNonMasterClient(repoName, firstSnapshot);
    awaitNDeletionsInProgress(1);
    blockNodeOnAnyFiles(repoName, dataNode2);
    final ActionFuture<CreateSnapshotResponse> snapshotThreeFuture = startFullSnapshotFromNonMasterClient(repoName, "snapshot-three");
    waitForBlock(dataNode2, repoName, TimeValue.timeValueSeconds(30L));
    assertThat(firstSnapshotResponse.isDone(), is(false));
    assertThat(secondSnapshotResponse.isDone(), is(false));
    logger.info("--> waiting for all three snapshots to show up as in-progress");
    assertBusy(() -> assertThat(currentSnapshots(repoName), hasSize(3)), 30L, TimeUnit.SECONDS);
    final ActionFuture<AcknowledgedResponse> deleteAllSnapshots = startDeleteFromNonMasterClient(repoName, "*");
    logger.info("--> wait for delete to be enqueued in cluster state");
    awaitClusterState(state -> {
        final SnapshotDeletionsInProgress deletionsInProgress = state.custom(SnapshotDeletionsInProgress.TYPE);
        return deletionsInProgress.getEntries().size() == 1 && deletionsInProgress.getEntries().get(0).getSnapshots().size() == 3;
    });
    logger.info("--> waiting for second snapshot to finish and the other two snapshots to become aborted");
    assertBusy(() -> {
        assertThat(currentSnapshots(repoName), hasSize(2));
        for (SnapshotsInProgress.Entry entry : clusterService().state().custom(SnapshotsInProgress.TYPE, SnapshotsInProgress.EMPTY).entries()) {
            assertThat(entry.state(), is(SnapshotsInProgress.State.ABORTED));
            assertThat(entry.snapshot().getSnapshotId().getName(), not(secondSnapshot));
        }
    }, 30L, TimeUnit.SECONDS);
    logger.info("--> stopping current master node");
    internalCluster().stopCurrentMasterNode();
    unblockNode(repoName, dataNode);
    unblockNode(repoName, dataNode2);
    for (ActionFuture<AcknowledgedResponse> deleteFuture : Arrays.asList(firstDeleteFuture, deleteAllSnapshots)) {
        try {
            assertAcked(deleteFuture.actionGet());
        } catch (RepositoryException rex) {
            // rarely the master node fails over twice when shutting down the initial master and fails the transport listener
            assertThat(rex.repository(), is("_all"));
            assertThat(rex.getMessage(), endsWith("Failed to update cluster state during repository operation"));
        } catch (SnapshotMissingException sme) {
            // very rarely a master node fail-over happens at such a time that the client on the data-node sees a disconnect exception
            // after the master has already started the delete, leading to the delete retry to run into a situation where the
            // snapshot has already been deleted potentially
            assertThat(sme.getSnapshotName(), is(firstSnapshot));
        }
    }
    expectThrows(SnapshotException.class, snapshotThreeFuture::actionGet);
    logger.info("--> verify that all snapshots are gone and no more work is left in the cluster state");
    assertBusy(() -> {
        assertThat(client().admin().cluster().prepareGetSnapshots(repoName).get().getSnapshots(), empty());
        final ClusterState state = clusterService().state();
        final SnapshotsInProgress snapshotsInProgress = state.custom(SnapshotsInProgress.TYPE);
        assertThat(snapshotsInProgress.entries(), empty());
        final SnapshotDeletionsInProgress snapshotDeletionsInProgress = state.custom(SnapshotDeletionsInProgress.TYPE);
        assertThat(snapshotDeletionsInProgress.getEntries(), empty());
    }, 30L, TimeUnit.SECONDS);
}
Also used : ClusterState(org.opensearch.cluster.ClusterState) CreateSnapshotResponse(org.opensearch.action.admin.cluster.snapshots.create.CreateSnapshotResponse) AcknowledgedResponse(org.opensearch.action.support.master.AcknowledgedResponse) SnapshotsInProgress(org.opensearch.cluster.SnapshotsInProgress) RepositoryException(org.opensearch.repositories.RepositoryException) Matchers.containsString(org.hamcrest.Matchers.containsString) SnapshotDeletionsInProgress(org.opensearch.cluster.SnapshotDeletionsInProgress)

Example 7 with RepositoryException

use of org.opensearch.repositories.RepositoryException in project OpenSearch by opensearch-project.

the class SnapshotsService method beginSnapshot.

/**
 * Starts snapshot.
 * <p>
 * Creates snapshot in repository and updates snapshot metadata record with list of shards that needs to be processed.
 * Note: This method is only used in clusters that contain a node older than {@link #NO_REPO_INITIALIZE_VERSION} to ensure a backwards
 * compatible path for initializing the snapshot in the repository is executed.
 *
 * @param clusterState               cluster state
 * @param snapshot                   snapshot meta data
 * @param partial                    allow partial snapshots
 * @param userCreateSnapshotListener listener
 */
private void beginSnapshot(final ClusterState clusterState, final SnapshotsInProgress.Entry snapshot, final boolean partial, final List<String> indices, final Repository repository, final ActionListener<Snapshot> userCreateSnapshotListener) {
    threadPool.executor(ThreadPool.Names.SNAPSHOT).execute(new AbstractRunnable() {

        boolean hadAbortedInitializations;

        @Override
        protected void doRun() {
            assert initializingSnapshots.contains(snapshot.snapshot());
            if (repository.isReadOnly()) {
                throw new RepositoryException(repository.getMetadata().name(), "cannot create snapshot in a readonly repository");
            }
            final String snapshotName = snapshot.snapshot().getSnapshotId().getName();
            final StepListener<RepositoryData> repositoryDataListener = new StepListener<>();
            repository.getRepositoryData(repositoryDataListener);
            repositoryDataListener.whenComplete(repositoryData -> {
                // check if the snapshot name already exists in the repository
                if (repositoryData.getSnapshotIds().stream().anyMatch(s -> s.getName().equals(snapshotName))) {
                    throw new InvalidSnapshotNameException(repository.getMetadata().name(), snapshotName, "snapshot with the same name already exists");
                }
                if (clusterState.nodes().getMinNodeVersion().onOrAfter(NO_REPO_INITIALIZE_VERSION) == false) {
                    // In mixed version clusters we initialize the snapshot in the repository so that in case of a master failover to an
                    // older version master node snapshot finalization (that assumes initializeSnapshot was called) produces a valid
                    // snapshot.
                    repository.initializeSnapshot(snapshot.snapshot().getSnapshotId(), snapshot.indices(), metadataForSnapshot(snapshot, clusterState.metadata()));
                }
                logger.info("snapshot [{}] started", snapshot.snapshot());
                final Version version = minCompatibleVersion(clusterState.nodes().getMinNodeVersion(), repositoryData, null);
                if (indices.isEmpty()) {
                    // No indices in this snapshot - we are done
                    userCreateSnapshotListener.onResponse(snapshot.snapshot());
                    endSnapshot(SnapshotsInProgress.startedEntry(snapshot.snapshot(), snapshot.includeGlobalState(), snapshot.partial(), Collections.emptyList(), Collections.emptyList(), threadPool.absoluteTimeInMillis(), repositoryData.getGenId(), ImmutableOpenMap.of(), snapshot.userMetadata(), version), clusterState.metadata(), repositoryData);
                    return;
                }
                clusterService.submitStateUpdateTask("update_snapshot [" + snapshot.snapshot() + "]", new ClusterStateUpdateTask() {

                    @Override
                    public ClusterState execute(ClusterState currentState) {
                        SnapshotsInProgress snapshots = currentState.custom(SnapshotsInProgress.TYPE);
                        List<SnapshotsInProgress.Entry> entries = new ArrayList<>();
                        for (SnapshotsInProgress.Entry entry : snapshots.entries()) {
                            if (entry.snapshot().equals(snapshot.snapshot()) == false) {
                                entries.add(entry);
                                continue;
                            }
                            if (entry.state() == State.ABORTED) {
                                entries.add(entry);
                                assert entry.shards().isEmpty();
                                hadAbortedInitializations = true;
                            } else {
                                final List<IndexId> indexIds = repositoryData.resolveNewIndices(indices, Collections.emptyMap());
                                // Replace the snapshot that was just initialized
                                ImmutableOpenMap<ShardId, ShardSnapshotStatus> shards = shards(snapshots, currentState.custom(SnapshotDeletionsInProgress.TYPE, SnapshotDeletionsInProgress.EMPTY), currentState.metadata(), currentState.routingTable(), indexIds, useShardGenerations(version), repositoryData, entry.repository());
                                if (!partial) {
                                    Tuple<Set<String>, Set<String>> indicesWithMissingShards = indicesWithMissingShards(shards, currentState.metadata());
                                    Set<String> missing = indicesWithMissingShards.v1();
                                    Set<String> closed = indicesWithMissingShards.v2();
                                    if (missing.isEmpty() == false || closed.isEmpty() == false) {
                                        final StringBuilder failureMessage = new StringBuilder();
                                        if (missing.isEmpty() == false) {
                                            failureMessage.append("Indices don't have primary shards ");
                                            failureMessage.append(missing);
                                        }
                                        if (closed.isEmpty() == false) {
                                            if (failureMessage.length() > 0) {
                                                failureMessage.append("; ");
                                            }
                                            failureMessage.append("Indices are closed ");
                                            failureMessage.append(closed);
                                        }
                                        entries.add(new SnapshotsInProgress.Entry(entry, State.FAILED, indexIds, repositoryData.getGenId(), shards, version, failureMessage.toString()));
                                        continue;
                                    }
                                }
                                entries.add(new SnapshotsInProgress.Entry(entry, State.STARTED, indexIds, repositoryData.getGenId(), shards, version, null));
                            }
                        }
                        return ClusterState.builder(currentState).putCustom(SnapshotsInProgress.TYPE, SnapshotsInProgress.of(unmodifiableList(entries))).build();
                    }

                    @Override
                    public void onFailure(String source, Exception e) {
                        logger.warn(() -> new ParameterizedMessage("[{}] failed to create snapshot", snapshot.snapshot().getSnapshotId()), e);
                        removeFailedSnapshotFromClusterState(snapshot.snapshot(), e, null, new CleanupAfterErrorListener(userCreateSnapshotListener, e));
                    }

                    @Override
                    public void onNoLongerMaster(String source) {
                        // We are not longer a master - we shouldn't try to do any cleanup
                        // The new master will take care of it
                        logger.warn("[{}] failed to create snapshot - no longer a master", snapshot.snapshot().getSnapshotId());
                        userCreateSnapshotListener.onFailure(new SnapshotException(snapshot.snapshot(), "master changed during snapshot initialization"));
                    }

                    @Override
                    public void clusterStateProcessed(String source, ClusterState oldState, ClusterState newState) {
                        // The userCreateSnapshotListener.onResponse() notifies caller that the snapshot was accepted
                        // for processing. If client wants to wait for the snapshot completion, it can register snapshot
                        // completion listener in this method. For the snapshot completion to work properly, the snapshot
                        // should still exist when listener is registered.
                        userCreateSnapshotListener.onResponse(snapshot.snapshot());
                        if (hadAbortedInitializations) {
                            final SnapshotsInProgress snapshotsInProgress = newState.custom(SnapshotsInProgress.TYPE);
                            assert snapshotsInProgress != null;
                            final SnapshotsInProgress.Entry entry = snapshotsInProgress.snapshot(snapshot.snapshot());
                            assert entry != null;
                            endSnapshot(entry, newState.metadata(), repositoryData);
                        } else {
                            endCompletedSnapshots(newState);
                        }
                    }
                });
            }, this::onFailure);
        }

        @Override
        public void onFailure(Exception e) {
            logger.warn(() -> new ParameterizedMessage("failed to create snapshot [{}]", snapshot.snapshot().getSnapshotId()), e);
            removeFailedSnapshotFromClusterState(snapshot.snapshot(), e, null, new CleanupAfterErrorListener(userCreateSnapshotListener, e));
        }
    });
}
Also used : AbstractRunnable(org.opensearch.common.util.concurrent.AbstractRunnable) RepositoryMissingException(org.opensearch.repositories.RepositoryMissingException) ImmutableOpenMap(org.opensearch.common.collect.ImmutableOpenMap) Arrays(java.util.Arrays) Metadata(org.opensearch.cluster.metadata.Metadata) Collections.unmodifiableList(java.util.Collections.unmodifiableList) DataStream(org.opensearch.cluster.metadata.DataStream) Version(org.opensearch.Version) ClusterStateApplier(org.opensearch.cluster.ClusterStateApplier) Regex(org.opensearch.common.regex.Regex) Strings(org.opensearch.common.Strings) GroupedActionListener(org.opensearch.action.support.GroupedActionListener) Map(java.util.Map) ActionListener(org.opensearch.action.ActionListener) EnumSet(java.util.EnumSet) Repository(org.opensearch.repositories.Repository) TimeValue(org.opensearch.common.unit.TimeValue) Index(org.opensearch.index.Index) ExceptionsHelper(org.opensearch.ExceptionsHelper) Set(java.util.Set) ClusterStateTaskExecutor(org.opensearch.cluster.ClusterStateTaskExecutor) Settings(org.opensearch.common.settings.Settings) ObjectCursor(com.carrotsearch.hppc.cursors.ObjectCursor) TransportService(org.opensearch.transport.TransportService) FailedToCommitClusterStateException(org.opensearch.cluster.coordination.FailedToCommitClusterStateException) ActionFilters(org.opensearch.action.support.ActionFilters) AbstractLifecycleComponent(org.opensearch.common.component.AbstractLifecycleComponent) ShardState(org.opensearch.cluster.SnapshotsInProgress.ShardState) Logger(org.apache.logging.log4j.Logger) Stream(java.util.stream.Stream) ClusterStateUpdateTask(org.opensearch.cluster.ClusterStateUpdateTask) StepListener(org.opensearch.action.StepListener) State(org.opensearch.cluster.SnapshotsInProgress.State) IndexNameExpressionResolver(org.opensearch.cluster.metadata.IndexNameExpressionResolver) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) RepositoriesService(org.opensearch.repositories.RepositoriesService) ThreadPool(org.opensearch.threadpool.ThreadPool) Priority(org.opensearch.common.Priority) TransportMasterNodeAction(org.opensearch.action.support.master.TransportMasterNodeAction) ArrayList(java.util.ArrayList) ClusterState(org.opensearch.cluster.ClusterState) LegacyESVersion(org.opensearch.LegacyESVersion) ClusterStateTaskConfig(org.opensearch.cluster.ClusterStateTaskConfig) RepositoryCleanupInProgress(org.opensearch.cluster.RepositoryCleanupInProgress) RepositoriesMetadata(org.opensearch.cluster.metadata.RepositoriesMetadata) Executor(java.util.concurrent.Executor) IOException(java.io.IOException) DeleteSnapshotRequest(org.opensearch.action.admin.cluster.snapshots.delete.DeleteSnapshotRequest) ClusterService(org.opensearch.cluster.service.ClusterService) RestoreInProgress(org.opensearch.cluster.RestoreInProgress) RoutingTable(org.opensearch.cluster.routing.RoutingTable) SnapshotsInProgress.completed(org.opensearch.cluster.SnapshotsInProgress.completed) ClusterChangedEvent(org.opensearch.cluster.ClusterChangedEvent) ShardGenerations(org.opensearch.repositories.ShardGenerations) AbstractRunnable(org.opensearch.common.util.concurrent.AbstractRunnable) ObjectObjectCursor(com.carrotsearch.hppc.cursors.ObjectObjectCursor) DiscoveryNode(org.opensearch.cluster.node.DiscoveryNode) IndexId(org.opensearch.repositories.IndexId) Locale(java.util.Locale) NotMasterException(org.opensearch.cluster.NotMasterException) ShardSnapshotStatus(org.opensearch.cluster.SnapshotsInProgress.ShardSnapshotStatus) RepositoryException(org.opensearch.repositories.RepositoryException) IndexShardRoutingTable(org.opensearch.cluster.routing.IndexShardRoutingTable) Collection(java.util.Collection) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) ClusterBlockException(org.opensearch.cluster.block.ClusterBlockException) Collectors(java.util.stream.Collectors) Nullable(org.opensearch.common.Nullable) Tuple(org.opensearch.common.collect.Tuple) Objects(java.util.Objects) List(java.util.List) Optional(java.util.Optional) ClusterStateTaskListener(org.opensearch.cluster.ClusterStateTaskListener) DiscoveryNodes(org.opensearch.cluster.node.DiscoveryNodes) IndexMetadata(org.opensearch.cluster.metadata.IndexMetadata) ActionRunnable(org.opensearch.action.ActionRunnable) SnapshotsInProgress(org.opensearch.cluster.SnapshotsInProgress) CloneSnapshotRequest(org.opensearch.action.admin.cluster.snapshots.clone.CloneSnapshotRequest) HashMap(java.util.HashMap) SnapshotDeletionsInProgress(org.opensearch.cluster.SnapshotDeletionsInProgress) Deque(java.util.Deque) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) Function(java.util.function.Function) HashSet(java.util.HashSet) IndexRoutingTable(org.opensearch.cluster.routing.IndexRoutingTable) UUIDs(org.opensearch.common.UUIDs) LinkedList(java.util.LinkedList) StreamInput(org.opensearch.common.io.stream.StreamInput) RepositoryData(org.opensearch.repositories.RepositoryData) Setting(org.opensearch.common.settings.Setting) Iterator(java.util.Iterator) Collections.emptySet(java.util.Collections.emptySet) RepositoryShardId(org.opensearch.repositories.RepositoryShardId) ShardRouting(org.opensearch.cluster.routing.ShardRouting) ShardId(org.opensearch.index.shard.ShardId) Consumer(java.util.function.Consumer) CreateSnapshotRequest(org.opensearch.action.admin.cluster.snapshots.create.CreateSnapshotRequest) LogManager(org.apache.logging.log4j.LogManager) Collections(java.util.Collections) EnumSet(java.util.EnumSet) Set(java.util.Set) HashSet(java.util.HashSet) Collections.emptySet(java.util.Collections.emptySet) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) ArrayList(java.util.ArrayList) RepositoryShardId(org.opensearch.repositories.RepositoryShardId) ShardId(org.opensearch.index.shard.ShardId) Version(org.opensearch.Version) LegacyESVersion(org.opensearch.LegacyESVersion) ClusterState(org.opensearch.cluster.ClusterState) IndexId(org.opensearch.repositories.IndexId) ClusterStateUpdateTask(org.opensearch.cluster.ClusterStateUpdateTask) RepositoryException(org.opensearch.repositories.RepositoryException) RepositoryMissingException(org.opensearch.repositories.RepositoryMissingException) FailedToCommitClusterStateException(org.opensearch.cluster.coordination.FailedToCommitClusterStateException) IOException(java.io.IOException) NotMasterException(org.opensearch.cluster.NotMasterException) RepositoryException(org.opensearch.repositories.RepositoryException) ClusterBlockException(org.opensearch.cluster.block.ClusterBlockException) SnapshotsInProgress(org.opensearch.cluster.SnapshotsInProgress) StepListener(org.opensearch.action.StepListener) ShardSnapshotStatus(org.opensearch.cluster.SnapshotsInProgress.ShardSnapshotStatus) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage)

Example 8 with RepositoryException

use of org.opensearch.repositories.RepositoryException in project OpenSearch by opensearch-project.

the class SnapshotsService method createSnapshot.

/**
 * Initializes the snapshotting process.
 * <p>
 * This method is used by clients to start snapshot. It makes sure that there is no snapshots are currently running and
 * creates a snapshot record in cluster state metadata.
 *
 * @param request  snapshot request
 * @param listener snapshot creation listener
 */
public void createSnapshot(final CreateSnapshotRequest request, final ActionListener<Snapshot> listener) {
    final String repositoryName = request.repository();
    final String snapshotName = indexNameExpressionResolver.resolveDateMathExpression(request.snapshot());
    validate(repositoryName, snapshotName);
    // TODO: create snapshot UUID in CreateSnapshotRequest and make this operation idempotent to cleanly deal with transport layer
    // retries
    // new UUID for the snapshot
    final SnapshotId snapshotId = new SnapshotId(snapshotName, UUIDs.randomBase64UUID());
    Repository repository = repositoriesService.repository(request.repository());
    if (repository.isReadOnly()) {
        listener.onFailure(new RepositoryException(repository.getMetadata().name(), "cannot create snapshot in a readonly repository"));
        return;
    }
    final Snapshot snapshot = new Snapshot(repositoryName, snapshotId);
    final Map<String, Object> userMeta = repository.adaptUserMetadata(request.userMetadata());
    repository.executeConsistentStateUpdate(repositoryData -> new ClusterStateUpdateTask() {

        private SnapshotsInProgress.Entry newEntry;

        @Override
        public ClusterState execute(ClusterState currentState) {
            ensureSnapshotNameAvailableInRepo(repositoryData, snapshotName, repository);
            final SnapshotsInProgress snapshots = currentState.custom(SnapshotsInProgress.TYPE, SnapshotsInProgress.EMPTY);
            final List<SnapshotsInProgress.Entry> runningSnapshots = snapshots.entries();
            ensureSnapshotNameNotRunning(runningSnapshots, repositoryName, snapshotName);
            validate(repositoryName, snapshotName, currentState);
            final boolean concurrentOperationsAllowed = currentState.nodes().getMinNodeVersion().onOrAfter(FULL_CONCURRENCY_VERSION);
            final SnapshotDeletionsInProgress deletionsInProgress = currentState.custom(SnapshotDeletionsInProgress.TYPE, SnapshotDeletionsInProgress.EMPTY);
            if (deletionsInProgress.hasDeletionsInProgress() && concurrentOperationsAllowed == false) {
                throw new ConcurrentSnapshotExecutionException(repositoryName, snapshotName, "cannot snapshot while a snapshot deletion is in-progress in [" + deletionsInProgress + "]");
            }
            final RepositoryCleanupInProgress repositoryCleanupInProgress = currentState.custom(RepositoryCleanupInProgress.TYPE, RepositoryCleanupInProgress.EMPTY);
            if (repositoryCleanupInProgress.hasCleanupInProgress()) {
                throw new ConcurrentSnapshotExecutionException(repositoryName, snapshotName, "cannot snapshot while a repository cleanup is in-progress in [" + repositoryCleanupInProgress + "]");
            }
            // cluster state anyway in #applyClusterState.
            if (concurrentOperationsAllowed == false && runningSnapshots.stream().anyMatch(entry -> entry.state() != State.INIT)) {
                throw new ConcurrentSnapshotExecutionException(repositoryName, snapshotName, " a snapshot is already running");
            }
            ensureNoCleanupInProgress(currentState, repositoryName, snapshotName);
            ensureBelowConcurrencyLimit(repositoryName, snapshotName, snapshots, deletionsInProgress);
            // Store newSnapshot here to be processed in clusterStateProcessed
            List<String> indices = Arrays.asList(indexNameExpressionResolver.concreteIndexNames(currentState, request));
            final List<String> dataStreams = indexNameExpressionResolver.dataStreamNames(currentState, request.indicesOptions(), request.indices());
            logger.trace("[{}][{}] creating snapshot for indices [{}]", repositoryName, snapshotName, indices);
            final List<IndexId> indexIds = repositoryData.resolveNewIndices(indices, getInFlightIndexIds(runningSnapshots, repositoryName));
            final Version version = minCompatibleVersion(currentState.nodes().getMinNodeVersion(), repositoryData, null);
            ImmutableOpenMap<ShardId, ShardSnapshotStatus> shards = shards(snapshots, deletionsInProgress, currentState.metadata(), currentState.routingTable(), indexIds, useShardGenerations(version), repositoryData, repositoryName);
            if (request.partial() == false) {
                Set<String> missing = new HashSet<>();
                for (ObjectObjectCursor<ShardId, SnapshotsInProgress.ShardSnapshotStatus> entry : shards) {
                    if (entry.value.state() == ShardState.MISSING) {
                        missing.add(entry.key.getIndex().getName());
                    }
                }
                if (missing.isEmpty() == false) {
                    throw new SnapshotException(new Snapshot(repositoryName, snapshotId), "Indices don't have primary shards " + missing);
                }
            }
            newEntry = SnapshotsInProgress.startedEntry(new Snapshot(repositoryName, snapshotId), request.includeGlobalState(), request.partial(), indexIds, dataStreams, threadPool.absoluteTimeInMillis(), repositoryData.getGenId(), shards, userMeta, version);
            final List<SnapshotsInProgress.Entry> newEntries = new ArrayList<>(runningSnapshots);
            newEntries.add(newEntry);
            return ClusterState.builder(currentState).putCustom(SnapshotsInProgress.TYPE, SnapshotsInProgress.of(new ArrayList<>(newEntries))).build();
        }

        @Override
        public void onFailure(String source, Exception e) {
            logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to create snapshot", repositoryName, snapshotName), e);
            listener.onFailure(e);
        }

        @Override
        public void clusterStateProcessed(String source, ClusterState oldState, final ClusterState newState) {
            try {
                logger.info("snapshot [{}] started", snapshot);
                listener.onResponse(snapshot);
            } finally {
                if (newEntry.state().completed()) {
                    endSnapshot(newEntry, newState.metadata(), repositoryData);
                }
            }
        }

        @Override
        public TimeValue timeout() {
            return request.masterNodeTimeout();
        }
    }, "create_snapshot [" + snapshotName + ']', listener::onFailure);
}
Also used : EnumSet(java.util.EnumSet) Set(java.util.Set) HashSet(java.util.HashSet) Collections.emptySet(java.util.Collections.emptySet) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) ArrayList(java.util.ArrayList) RepositoryCleanupInProgress(org.opensearch.cluster.RepositoryCleanupInProgress) ImmutableOpenMap(org.opensearch.common.collect.ImmutableOpenMap) SnapshotDeletionsInProgress(org.opensearch.cluster.SnapshotDeletionsInProgress) Version(org.opensearch.Version) LegacyESVersion(org.opensearch.LegacyESVersion) Collections.unmodifiableList(java.util.Collections.unmodifiableList) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) ArrayList(java.util.ArrayList) List(java.util.List) LinkedList(java.util.LinkedList) TimeValue(org.opensearch.common.unit.TimeValue) ClusterState(org.opensearch.cluster.ClusterState) ClusterStateUpdateTask(org.opensearch.cluster.ClusterStateUpdateTask) RepositoryException(org.opensearch.repositories.RepositoryException) RepositoryMissingException(org.opensearch.repositories.RepositoryMissingException) FailedToCommitClusterStateException(org.opensearch.cluster.coordination.FailedToCommitClusterStateException) IOException(java.io.IOException) NotMasterException(org.opensearch.cluster.NotMasterException) RepositoryException(org.opensearch.repositories.RepositoryException) ClusterBlockException(org.opensearch.cluster.block.ClusterBlockException) Repository(org.opensearch.repositories.Repository) SnapshotsInProgress(org.opensearch.cluster.SnapshotsInProgress) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) ObjectObjectCursor(com.carrotsearch.hppc.cursors.ObjectObjectCursor)

Example 9 with RepositoryException

use of org.opensearch.repositories.RepositoryException in project OpenSearch by opensearch-project.

the class BlobStoreRepository method doGetRepositoryData.

private void doGetRepositoryData(ActionListener<RepositoryData> listener) {
    // Retry loading RepositoryData in a loop in case we run into concurrent modifications of the repository.
    // Keep track of the most recent generation we failed to load so we can break out of the loop if we fail to load the same
    // generation repeatedly.
    long lastFailedGeneration = RepositoryData.UNKNOWN_REPO_GEN;
    while (true) {
        final long genToLoad;
        if (bestEffortConsistency) {
            // We're only using #latestKnownRepoGen as a hint in this mode and listing repo contents as a secondary way of trying
            // to find a higher generation
            final long generation;
            try {
                generation = latestIndexBlobId();
            } catch (IOException ioe) {
                listener.onFailure(new RepositoryException(metadata.name(), "Could not determine repository generation from root blobs", ioe));
                return;
            }
            genToLoad = latestKnownRepoGen.updateAndGet(known -> Math.max(known, generation));
            if (genToLoad > generation) {
                logger.info("Determined repository generation [" + generation + "] from repository contents but correct generation must be at least [" + genToLoad + "]");
            }
        } else {
            // We only rely on the generation tracked in #latestKnownRepoGen which is exclusively updated from the cluster state
            genToLoad = latestKnownRepoGen.get();
        }
        try {
            final Tuple<Long, BytesReference> cached = latestKnownRepositoryData.get();
            final RepositoryData loaded;
            // Caching is not used with #bestEffortConsistency see docs on #cacheRepositoryData for details
            if (bestEffortConsistency == false && cached != null && cached.v1() == genToLoad) {
                loaded = repositoryDataFromCachedEntry(cached);
            } else {
                loaded = getRepositoryData(genToLoad);
                // We can cache serialized in the most recent version here without regard to the actual repository metadata version
                // since we're only caching the information that we just wrote and thus won't accidentally cache any information that
                // isn't safe
                cacheRepositoryData(BytesReference.bytes(loaded.snapshotsToXContent(XContentFactory.jsonBuilder(), Version.CURRENT)), genToLoad);
            }
            listener.onResponse(loaded);
            return;
        } catch (RepositoryException e) {
            // If the generation to load changed concurrently and we didn't just try loading the same generation before we retry
            if (genToLoad != latestKnownRepoGen.get() && genToLoad != lastFailedGeneration) {
                lastFailedGeneration = genToLoad;
                logger.warn("Failed to load repository data generation [" + genToLoad + "] because a concurrent operation moved the current generation to [" + latestKnownRepoGen.get() + "]", e);
                continue;
            }
            if (bestEffortConsistency == false && ExceptionsHelper.unwrap(e, NoSuchFileException.class) != null) {
                // We did not find the expected index-N even though the cluster state continues to point at the missing value
                // of N so we mark this repository as corrupted.
                markRepoCorrupted(genToLoad, e, ActionListener.wrap(v -> listener.onFailure(corruptedStateException(e)), listener::onFailure));
            } else {
                listener.onFailure(e);
            }
            return;
        } catch (Exception e) {
            listener.onFailure(new RepositoryException(metadata.name(), "Unexpected exception when loading repository data", e));
            return;
        }
    }
}
Also used : Metadata(org.opensearch.cluster.metadata.Metadata) IndexFormatTooNewException(org.apache.lucene.index.IndexFormatTooNewException) AllocationService(org.opensearch.cluster.routing.allocation.AllocationService) AlreadyClosedException(org.apache.lucene.store.AlreadyClosedException) Version(org.opensearch.Version) Strings(org.opensearch.common.Strings) AbortedSnapshotException(org.opensearch.snapshots.AbortedSnapshotException) GroupedActionListener(org.opensearch.action.support.GroupedActionListener) RecoveryState(org.opensearch.indices.recovery.RecoveryState) Map(java.util.Map) Lucene(org.opensearch.common.lucene.Lucene) ActionListener(org.opensearch.action.ActionListener) IOContext(org.apache.lucene.store.IOContext) Repository(org.opensearch.repositories.Repository) TimeValue(org.opensearch.common.unit.TimeValue) ExceptionsHelper(org.opensearch.ExceptionsHelper) Set(java.util.Set) Settings(org.opensearch.common.settings.Settings) BlobStoreIndexShardSnapshot(org.opensearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot) BlockingQueue(java.util.concurrent.BlockingQueue) AbstractLifecycleComponent(org.opensearch.common.component.AbstractLifecycleComponent) Logger(org.apache.logging.log4j.Logger) RepositoryOperation(org.opensearch.repositories.RepositoryOperation) Stream(java.util.stream.Stream) ClusterStateUpdateTask(org.opensearch.cluster.ClusterStateUpdateTask) BytesArray(org.opensearch.common.bytes.BytesArray) BlobStoreIndexShardSnapshots(org.opensearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshots) FsBlobContainer(org.opensearch.common.blobstore.fs.FsBlobContainer) StepListener(org.opensearch.action.StepListener) XContentType(org.opensearch.common.xcontent.XContentType) IndexCommit(org.apache.lucene.index.IndexCommit) ThreadPool(org.opensearch.threadpool.ThreadPool) BlobContainer(org.opensearch.common.blobstore.BlobContainer) Releasable(org.opensearch.common.lease.Releasable) Supplier(java.util.function.Supplier) ArrayList(java.util.ArrayList) ClusterState(org.opensearch.cluster.ClusterState) SnapshotMissingException(org.opensearch.snapshots.SnapshotMissingException) Numbers(org.opensearch.common.Numbers) SlicedInputStream(org.opensearch.index.snapshots.blobstore.SlicedInputStream) SnapshotException(org.opensearch.snapshots.SnapshotException) Streams(org.opensearch.common.io.Streams) CompressorFactory(org.opensearch.common.compress.CompressorFactory) RepositoryVerificationException(org.opensearch.repositories.RepositoryVerificationException) RepositoryCleanupInProgress(org.opensearch.cluster.RepositoryCleanupInProgress) InputStreamIndexInput(org.opensearch.common.lucene.store.InputStreamIndexInput) LongStream(java.util.stream.LongStream) IndexInput(org.apache.lucene.store.IndexInput) SetOnce(org.apache.lucene.util.SetOnce) RepositoriesMetadata(org.opensearch.cluster.metadata.RepositoriesMetadata) Executor(java.util.concurrent.Executor) SnapshotInfo(org.opensearch.snapshots.SnapshotInfo) RepositoryMetadata(org.opensearch.cluster.metadata.RepositoryMetadata) IOException(java.io.IOException) IndexShardSnapshotFailedException(org.opensearch.index.snapshots.IndexShardSnapshotFailedException) NotXContentException(org.opensearch.common.compress.NotXContentException) AtomicLong(java.util.concurrent.atomic.AtomicLong) RepositoryCleanupResult(org.opensearch.repositories.RepositoryCleanupResult) BlobPath(org.opensearch.common.blobstore.BlobPath) NamedXContentRegistry(org.opensearch.common.xcontent.NamedXContentRegistry) ClusterService(org.opensearch.cluster.service.ClusterService) CounterMetric(org.opensearch.common.metrics.CounterMetric) ShardGenerations(org.opensearch.repositories.ShardGenerations) NoSuchFileException(java.nio.file.NoSuchFileException) AbstractRunnable(org.opensearch.common.util.concurrent.AbstractRunnable) SnapshotCreationException(org.opensearch.snapshots.SnapshotCreationException) ByteSizeUnit(org.opensearch.common.unit.ByteSizeUnit) SnapshotFiles(org.opensearch.index.snapshots.blobstore.SnapshotFiles) SnapshotsService(org.opensearch.snapshots.SnapshotsService) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) ConcurrentCollections(org.opensearch.common.util.concurrent.ConcurrentCollections) XContentParser(org.opensearch.common.xcontent.XContentParser) DiscoveryNode(org.opensearch.cluster.node.DiscoveryNode) MapperService(org.opensearch.index.mapper.MapperService) IndexId(org.opensearch.repositories.IndexId) XContentFactory(org.opensearch.common.xcontent.XContentFactory) RepositoryStats(org.opensearch.repositories.RepositoryStats) BlobMetadata(org.opensearch.common.blobstore.BlobMetadata) RecoverySettings(org.opensearch.indices.recovery.RecoverySettings) RepositoryException(org.opensearch.repositories.RepositoryException) FileInfo.canonicalName(org.opensearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot.FileInfo.canonicalName) BytesRef(org.apache.lucene.util.BytesRef) SnapshotId(org.opensearch.snapshots.SnapshotId) Collection(java.util.Collection) LoggingDeprecationHandler(org.opensearch.common.xcontent.LoggingDeprecationHandler) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) Store(org.opensearch.index.store.Store) LinkedBlockingQueue(java.util.concurrent.LinkedBlockingQueue) Collectors(java.util.stream.Collectors) Nullable(org.opensearch.common.Nullable) Tuple(org.opensearch.common.collect.Tuple) BlobStore(org.opensearch.common.blobstore.BlobStore) List(java.util.List) Optional(java.util.Optional) BytesReference(org.opensearch.common.bytes.BytesReference) RateLimitingInputStream(org.opensearch.index.snapshots.blobstore.RateLimitingInputStream) IndexMetadata(org.opensearch.cluster.metadata.IndexMetadata) ActionRunnable(org.opensearch.action.ActionRunnable) SnapshotsInProgress(org.opensearch.cluster.SnapshotsInProgress) ByteSizeValue(org.opensearch.common.unit.ByteSizeValue) SnapshotDeletionsInProgress(org.opensearch.cluster.SnapshotDeletionsInProgress) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) AtomicReference(java.util.concurrent.atomic.AtomicReference) Function(java.util.function.Function) FilterInputStream(java.io.FilterInputStream) IndexShardSnapshotStatus(org.opensearch.index.snapshots.IndexShardSnapshotStatus) IndexMetaDataGenerations(org.opensearch.repositories.IndexMetaDataGenerations) UUIDs(org.opensearch.common.UUIDs) StoreFileMetadata(org.opensearch.index.store.StoreFileMetadata) IndexOutput(org.apache.lucene.store.IndexOutput) IndexShardRestoreFailedException(org.opensearch.index.snapshots.IndexShardRestoreFailedException) RepositoryData(org.opensearch.repositories.RepositoryData) Setting(org.opensearch.common.settings.Setting) RepositoryShardId(org.opensearch.repositories.RepositoryShardId) IndexFormatTooOldException(org.apache.lucene.index.IndexFormatTooOldException) ShardId(org.opensearch.index.shard.ShardId) TimeUnit(java.util.concurrent.TimeUnit) Consumer(java.util.function.Consumer) DeleteResult(org.opensearch.common.blobstore.DeleteResult) LogManager(org.apache.logging.log4j.LogManager) Collections(java.util.Collections) RateLimiter(org.apache.lucene.store.RateLimiter) InputStream(java.io.InputStream) BytesReference(org.opensearch.common.bytes.BytesReference) AtomicLong(java.util.concurrent.atomic.AtomicLong) NoSuchFileException(java.nio.file.NoSuchFileException) RepositoryException(org.opensearch.repositories.RepositoryException) IOException(java.io.IOException) IndexFormatTooNewException(org.apache.lucene.index.IndexFormatTooNewException) AlreadyClosedException(org.apache.lucene.store.AlreadyClosedException) AbortedSnapshotException(org.opensearch.snapshots.AbortedSnapshotException) SnapshotMissingException(org.opensearch.snapshots.SnapshotMissingException) SnapshotException(org.opensearch.snapshots.SnapshotException) RepositoryVerificationException(org.opensearch.repositories.RepositoryVerificationException) IOException(java.io.IOException) IndexShardSnapshotFailedException(org.opensearch.index.snapshots.IndexShardSnapshotFailedException) NotXContentException(org.opensearch.common.compress.NotXContentException) NoSuchFileException(java.nio.file.NoSuchFileException) SnapshotCreationException(org.opensearch.snapshots.SnapshotCreationException) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) RepositoryException(org.opensearch.repositories.RepositoryException) IndexShardRestoreFailedException(org.opensearch.index.snapshots.IndexShardRestoreFailedException) IndexFormatTooOldException(org.apache.lucene.index.IndexFormatTooOldException) RepositoryData(org.opensearch.repositories.RepositoryData)

Example 10 with RepositoryException

use of org.opensearch.repositories.RepositoryException in project OpenSearch by opensearch-project.

the class ExceptionSerializationTests method testRepositoryException.

public void testRepositoryException() throws IOException {
    RepositoryException ex = serialize(new RepositoryException("repo", "msg"));
    assertEquals("repo", ex.repository());
    assertEquals("[repo] msg", ex.getMessage());
    ex = serialize(new RepositoryException(null, "msg"));
    assertNull(ex.repository());
    assertEquals("[_na] msg", ex.getMessage());
}
Also used : RepositoryException(org.opensearch.repositories.RepositoryException)

Aggregations

RepositoryException (org.opensearch.repositories.RepositoryException)24 IOException (java.io.IOException)12 ArrayList (java.util.ArrayList)9 Settings (org.opensearch.common.settings.Settings)9 List (java.util.List)8 Set (java.util.Set)7 Executor (java.util.concurrent.Executor)7 ParameterizedMessage (org.apache.logging.log4j.message.ParameterizedMessage)7 Collection (java.util.Collection)6 Collections (java.util.Collections)6 Map (java.util.Map)6 Optional (java.util.Optional)6 ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap)6 Consumer (java.util.function.Consumer)6 Function (java.util.function.Function)6 Collectors (java.util.stream.Collectors)6 Stream (java.util.stream.Stream)6 LogManager (org.apache.logging.log4j.LogManager)6 Logger (org.apache.logging.log4j.Logger)6 NoSuchFileException (java.nio.file.NoSuchFileException)5