
Example 11 with FailedNodeException

Use of org.elasticsearch.action.FailedNodeException in project elasticsearch by elastic.

From the class BaseTasksResponse, method writeTo:

@Override
public void writeTo(StreamOutput out) throws IOException {
    super.writeTo(out);
    out.writeVInt(taskFailures.size());
    for (TaskOperationFailure exp : taskFailures) {
        exp.writeTo(out);
    }
    out.writeVInt(nodeFailures.size());
    for (FailedNodeException exp : nodeFailures) {
        exp.writeTo(out);
    }
}
Also used : TaskOperationFailure(org.elasticsearch.action.TaskOperationFailure) FailedNodeException(org.elasticsearch.action.FailedNodeException)
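
For context, the read side has to mirror this write order exactly: first the count and entries of the task failures, then the count and entries of the node failures. The sketch below is an assumption based on that order, not the actual Elasticsearch source; in particular, the StreamInput constructors of TaskOperationFailure and FailedNodeException are assumed here.

// Hypothetical read-side counterpart (assumption, not the actual Elasticsearch source).
// Entries must be read back in exactly the order writeTo emitted them.
public BaseTasksResponse(StreamInput in) throws IOException {
    super(in);
    int taskFailureCount = in.readVInt();
    List<TaskOperationFailure> taskFailures = new ArrayList<>(taskFailureCount);
    for (int i = 0; i < taskFailureCount; i++) {
        taskFailures.add(new TaskOperationFailure(in));   // assumed StreamInput constructor
    }
    int nodeFailureCount = in.readVInt();
    List<FailedNodeException> nodeFailures = new ArrayList<>(nodeFailureCount);
    for (int i = 0; i < nodeFailureCount; i++) {
        nodeFailures.add(new FailedNodeException(in));    // assumed StreamInput constructor
    }
    this.taskFailures = taskFailures;
    this.nodeFailures = nodeFailures;
}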

Example 12 with FailedNodeException

Use of org.elasticsearch.action.FailedNodeException in project elasticsearch by elastic.

From the class RestActions, method buildNodesHeader:

/**
     * Create the XContent header for any {@link BaseNodesResponse}. This looks like:
     * <code>
     * "_nodes" : {
     *   "total" : 3,
     *   "successful" : 1,
     *   "failed" : 2,
     *   "failures" : [ { ... }, { ... } ]
     * }
     * </code>
     * Prefer the overload that properly invokes this method to calling this directly.
     *
     * @param builder XContent builder.
     * @param params XContent parameters.
     * @param total The total number of nodes touched.
     * @param successful The successful number of responses received.
     * @param failed The number of failures (effectively {@code total - successful}).
     * @param failures The failure exceptions related to {@code failed}.
     * @see #buildNodesHeader(XContentBuilder, Params, BaseNodesResponse)
     */
public static void buildNodesHeader(final XContentBuilder builder, final Params params, final int total, final int successful, final int failed, final List<FailedNodeException> failures) throws IOException {
    builder.startObject("_nodes");
    builder.field("total", total);
    builder.field("successful", successful);
    builder.field("failed", failed);
    if (failures.isEmpty() == false) {
        builder.startArray("failures");
        for (FailedNodeException failure : failures) {
            builder.startObject();
            failure.toXContent(builder, params);
            builder.endObject();
        }
        builder.endArray();
    }
    builder.endObject();
}
Also used : FailedNodeException(org.elasticsearch.action.FailedNodeException)
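
A brief usage sketch follows. It assumes a nodes-level REST handler that already holds an XContentBuilder, the request Params, and the failures list of a BaseNodesResponse; the variable names and the literal counts are illustrative only.

// Illustrative only: emitting the "_nodes" header before the action-specific body.
// builder, params and nodesResponse are assumed to be in scope.
builder.startObject();
RestActions.buildNodesHeader(builder, params,
        3,                              // total nodes touched
        1,                              // successful responses received
        2,                              // failed responses
        nodesResponse.failures());      // List<FailedNodeException>
// ... action-specific fields would follow here ...
builder.endObject();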

Example 13 with FailedNodeException

Use of org.elasticsearch.action.FailedNodeException in project elasticsearch by elastic.

From the class AsyncShardFetch, method processAsyncFetch:

/**
     * Called by the response handler of the async action to fetch data. Verifies that it's still working
     * on the same cache generation, otherwise the results are discarded. It then goes and fills the relevant data for
     * the shard (response + failures), issuing a reroute at the end of it to make sure there will be another round
     * of allocations taking this new data into account.
     */
protected synchronized void processAsyncFetch(ShardId shardId, List<T> responses, List<FailedNodeException> failures) {
    if (closed) {
        // we are closed, no need to process this async fetch at all
        logger.trace("{} ignoring fetched [{}] results, already closed", shardId, type);
        return;
    }
    logger.trace("{} processing fetched [{}] results", shardId, type);
    if (responses != null) {
        for (T response : responses) {
            NodeEntry<T> nodeEntry = cache.get(response.getNode().getId());
            // if the entry is there, and not marked as failed already, process it
            if (nodeEntry == null) {
                continue;
            }
            if (nodeEntry.isFailed()) {
                logger.trace("{} node {} has failed for [{}] (failure [{}])", shardId, nodeEntry.getNodeId(), type, nodeEntry.getFailure());
            } else {
                logger.trace("{} marking {} as done for [{}], result is [{}]", shardId, nodeEntry.getNodeId(), type, response);
                nodeEntry.doneFetching(response);
            }
        }
    }
    if (failures != null) {
        for (FailedNodeException failure : failures) {
            logger.trace("{} processing failure {} for [{}]", shardId, failure, type);
            NodeEntry<T> nodeEntry = cache.get(failure.nodeId());
            // if the entry is there, and not marked as failed already, process it
            if (nodeEntry != null && nodeEntry.isFailed() == false) {
                Throwable unwrappedCause = ExceptionsHelper.unwrapCause(failure.getCause());
                // if the request got rejected or timed out, we need to try it again next time...
                if (unwrappedCause instanceof EsRejectedExecutionException || unwrappedCause instanceof ReceiveTimeoutTransportException || unwrappedCause instanceof ElasticsearchTimeoutException) {
                    nodeEntry.restartFetching();
                } else {
                    logger.warn((Supplier<?>) () -> new ParameterizedMessage("{}: failed to list shard for {} on node [{}]", shardId, type, failure.nodeId()), failure);
                    nodeEntry.doneFetching(failure.getCause());
                }
            }
        }
    }
    reroute(shardId, "post_response");
}
Also used : ReceiveTimeoutTransportException(org.elasticsearch.transport.ReceiveTimeoutTransportException) ElasticsearchTimeoutException(org.elasticsearch.ElasticsearchTimeoutException) FailedNodeException(org.elasticsearch.action.FailedNodeException) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) EsRejectedExecutionException(org.elasticsearch.common.util.concurrent.EsRejectedExecutionException)
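
The retry decision in the failure loop can be read in isolation. The helper below is not in the source; it merely restates the instanceof checks above: thread-pool rejections and timeouts are treated as transient and refetched on the next round, anything else marks the node entry as failed.

// Hypothetical helper (not in the source) restating the retry classification above.
private static boolean isTransientFetchFailure(FailedNodeException failure) {
    Throwable unwrappedCause = ExceptionsHelper.unwrapCause(failure.getCause());
    // rejected executions and timeouts are worth retrying on the next fetch round
    return unwrappedCause instanceof EsRejectedExecutionException
            || unwrappedCause instanceof ReceiveTimeoutTransportException
            || unwrappedCause instanceof ElasticsearchTimeoutException;
}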

Example 14 with FailedNodeException

Use of org.elasticsearch.action.FailedNodeException in project crate by crate.

From the class Gateway, method performStateRecovery:

public void performStateRecovery(final GatewayStateRecoveredListener listener) throws GatewayException {
    final DiscoveryNode[] nodes = clusterService.state().nodes().getMasterNodes().values().toArray(DiscoveryNode.class);
    if (LOGGER.isTraceEnabled()) {
        LOGGER.trace("performing state recovery from {}", Arrays.toString(nodes));
    }
    var request = new TransportNodesListGatewayMetaState.Request(nodes);
    PlainActionFuture<TransportNodesListGatewayMetaState.NodesGatewayMetaState> future = PlainActionFuture.newFuture();
    client.executeLocally(TransportNodesListGatewayMetaState.TYPE, request, future);
    final TransportNodesListGatewayMetaState.NodesGatewayMetaState nodesState = future.actionGet();
    final int requiredAllocation = 1;
    if (nodesState.hasFailures()) {
        for (final FailedNodeException failedNodeException : nodesState.failures()) {
            LOGGER.warn("failed to fetch state from node", failedNodeException);
        }
    }
    final ObjectFloatHashMap<Index> indices = new ObjectFloatHashMap<>();
    Metadata electedGlobalState = null;
    int found = 0;
    for (final TransportNodesListGatewayMetaState.NodeGatewayMetaState nodeState : nodesState.getNodes()) {
        if (nodeState.metadata() == null) {
            continue;
        }
        found++;
        if (electedGlobalState == null) {
            electedGlobalState = nodeState.metadata();
        } else if (nodeState.metadata().version() > electedGlobalState.version()) {
            electedGlobalState = nodeState.metadata();
        }
        for (final ObjectCursor<IndexMetadata> cursor : nodeState.metadata().indices().values()) {
            indices.addTo(cursor.value.getIndex(), 1);
        }
    }
    if (found < requiredAllocation) {
        listener.onFailure("found [" + found + "] metadata states, required [" + requiredAllocation + "]");
        return;
    }
    // update the global state, and clean the indices, we elect them in the next phase
    final Metadata.Builder metadataBuilder = Metadata.builder(electedGlobalState).removeAllIndices();
    assert !indices.containsKey(null);
    final Object[] keys = indices.keys;
    for (int i = 0; i < keys.length; i++) {
        if (keys[i] != null) {
            final Index index = (Index) keys[i];
            IndexMetadata electedIndexMetadata = null;
            int indexMetadataCount = 0;
            for (final TransportNodesListGatewayMetaState.NodeGatewayMetaState nodeState : nodesState.getNodes()) {
                if (nodeState.metadata() == null) {
                    continue;
                }
                final IndexMetadata indexMetadata = nodeState.metadata().index(index);
                if (indexMetadata == null) {
                    continue;
                }
                if (electedIndexMetadata == null) {
                    electedIndexMetadata = indexMetadata;
                } else if (indexMetadata.getVersion() > electedIndexMetadata.getVersion()) {
                    electedIndexMetadata = indexMetadata;
                }
                indexMetadataCount++;
            }
            if (electedIndexMetadata != null) {
                if (indexMetadataCount < requiredAllocation) {
                    LOGGER.debug("[{}] found [{}], required [{}], not adding", index, indexMetadataCount, requiredAllocation);
                }
                // TODO if this logging statement is correct then we are missing an else here
                metadataBuilder.put(electedIndexMetadata, false);
            }
        }
    }
    ClusterState recoveredState = Function.<ClusterState>identity().andThen(state -> ClusterStateUpdaters.upgradeAndArchiveUnknownOrInvalidSettings(state, clusterService.getClusterSettings())).apply(ClusterState.builder(clusterService.getClusterName()).metadata(metadataBuilder).build());
    listener.onSuccess(recoveredState);
}
Also used : Arrays(java.util.Arrays) FailedNodeException(org.elasticsearch.action.FailedNodeException) PlainActionFuture(org.elasticsearch.action.support.PlainActionFuture) IndexMetadata(org.elasticsearch.cluster.metadata.IndexMetadata) ClusterService(org.elasticsearch.cluster.service.ClusterService) Index(org.elasticsearch.index.Index) Function(java.util.function.Function) ObjectCursor(com.carrotsearch.hppc.cursors.ObjectCursor) ClusterState(org.elasticsearch.cluster.ClusterState) Metadata(org.elasticsearch.cluster.metadata.Metadata) DiscoveryNode(org.elasticsearch.cluster.node.DiscoveryNode) Logger(org.apache.logging.log4j.Logger) NodeClient(org.elasticsearch.client.node.NodeClient) LogManager(org.apache.logging.log4j.LogManager) ObjectFloatHashMap(com.carrotsearch.hppc.ObjectFloatHashMap)
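
The same election rule is applied twice above, once for the global Metadata and once per index: keep the candidate with the highest version, and only accept the result if at least requiredAllocation copies were seen. A minimal sketch of that rule in isolation, with an assumed helper name:

// Illustrative only (not in the Gateway source): highest-version election over the
// metadata copies reported by the master-eligible nodes. Null copies are skipped;
// the caller still has to enforce the requiredAllocation threshold separately.
static Metadata electByVersion(Iterable<Metadata> candidates) {
    Metadata elected = null;
    for (Metadata candidate : candidates) {
        if (candidate == null) {
            continue;
        }
        if (elected == null || candidate.version() > elected.version()) {
            elected = candidate;
        }
    }
    return elected;
}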

Example 15 with FailedNodeException

Use of org.elasticsearch.action.FailedNodeException in project crate by crate.

From the class AsyncShardFetch, method processAsyncFetch:

/**
 * Called by the response handler of the async action to fetch data. Verifies that it's still working
 * on the same cache generation, otherwise the results are discarded. It then goes and fills the relevant data for
 * the shard (response + failures), issuing a reroute at the end of it to make sure there will be another round
 * of allocations taking this new data into account.
 */
protected synchronized void processAsyncFetch(List<T> responses, List<FailedNodeException> failures, long fetchingRound) {
    if (closed) {
        // we are closed, no need to process this async fetch at all
        logger.trace("{} ignoring fetched [{}] results, already closed", shardId, type);
        return;
    }
    logger.trace("{} processing fetched [{}] results", shardId, type);
    if (responses != null) {
        for (T response : responses) {
            NodeEntry<T> nodeEntry = cache.get(response.getNode().getId());
            if (nodeEntry != null) {
                if (nodeEntry.getFetchingRound() != fetchingRound) {
                    assert nodeEntry.getFetchingRound() > fetchingRound : "node entries only replaced by newer rounds";
                    logger.trace("{} received response for [{}] from node {} for an older fetching round (expected: {} but was: {})", shardId, nodeEntry.getNodeId(), type, nodeEntry.getFetchingRound(), fetchingRound);
                } else if (nodeEntry.isFailed()) {
                    logger.trace("{} node {} has failed for [{}] (failure [{}])", shardId, nodeEntry.getNodeId(), type, nodeEntry.getFailure());
                } else {
                    // if the entry is there, for the right fetching round and not marked as failed already, process it
                    logger.trace("{} marking {} as done for [{}], result is [{}]", shardId, nodeEntry.getNodeId(), type, response);
                    nodeEntry.doneFetching(response);
                }
            }
        }
    }
    if (failures != null) {
        for (FailedNodeException failure : failures) {
            logger.trace("{} processing failure {} for [{}]", shardId, failure, type);
            NodeEntry<T> nodeEntry = cache.get(failure.nodeId());
            if (nodeEntry != null) {
                if (nodeEntry.getFetchingRound() != fetchingRound) {
                    assert nodeEntry.getFetchingRound() > fetchingRound : "node entries only replaced by newer rounds";
                    logger.trace("{} received failure for [{}] from node {} for an older fetching round (expected: {} but was: {})", shardId, nodeEntry.getNodeId(), type, nodeEntry.getFetchingRound(), fetchingRound);
                } else if (nodeEntry.isFailed() == false) {
                    // if the entry is there, for the right fetching round and not marked as failed already, process it
                    Throwable unwrappedCause = SQLExceptions.unwrap(failure.getCause());
                    // if the request got rejected or timed out, we need to try it again next time...
                    if (unwrappedCause instanceof EsRejectedExecutionException || unwrappedCause instanceof ReceiveTimeoutTransportException || unwrappedCause instanceof ElasticsearchTimeoutException) {
                        nodeEntry.restartFetching();
                    } else {
                        logger.warn(() -> new ParameterizedMessage("{}: failed to list shard for {} on node [{}]", shardId, type, failure.nodeId()), failure);
                        nodeEntry.doneFetching(failure.getCause());
                    }
                }
            }
        }
    }
    reroute(shardId, "post_response");
}
Also used : ReceiveTimeoutTransportException(org.elasticsearch.transport.ReceiveTimeoutTransportException) ElasticsearchTimeoutException(org.elasticsearch.ElasticsearchTimeoutException) FailedNodeException(org.elasticsearch.action.FailedNodeException) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) EsRejectedExecutionException(org.elasticsearch.common.util.concurrent.EsRejectedExecutionException)
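
Compared to Example 13, this variant also tracks a fetching round. A response or failure is only applied when it belongs to the round the cache entry is currently on; results from older rounds are logged and dropped. The guard below is a hypothetical extraction of that check, not a method in the source.

// Hypothetical guard (not in the source) restating the round check above.
private boolean isCurrentRound(NodeEntry<T> nodeEntry, long fetchingRound) {
    if (nodeEntry.getFetchingRound() != fetchingRound) {
        assert nodeEntry.getFetchingRound() > fetchingRound : "node entries only replaced by newer rounds";
        return false;   // stale round: the caller should ignore this response or failure
    }
    return true;
}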

Aggregations

FailedNodeException (org.elasticsearch.action.FailedNodeException): 15
ArrayList (java.util.ArrayList): 5
ParameterizedMessage (org.apache.logging.log4j.message.ParameterizedMessage): 3
TaskOperationFailure (org.elasticsearch.action.TaskOperationFailure): 3
ObjectFloatHashMap (com.carrotsearch.hppc.ObjectFloatHashMap): 2
ObjectCursor (com.carrotsearch.hppc.cursors.ObjectCursor): 2
Arrays (java.util.Arrays): 2
ElasticsearchTimeoutException (org.elasticsearch.ElasticsearchTimeoutException): 2
DefaultShardOperationFailedException (org.elasticsearch.action.support.DefaultShardOperationFailedException): 2
BroadcastShardOperationFailedException (org.elasticsearch.action.support.broadcast.BroadcastShardOperationFailedException): 2
ClusterState (org.elasticsearch.cluster.ClusterState): 2
ShardRouting (org.elasticsearch.cluster.routing.ShardRouting): 2
ClusterService (org.elasticsearch.cluster.service.ClusterService): 2
EsRejectedExecutionException (org.elasticsearch.common.util.concurrent.EsRejectedExecutionException): 2
Index (org.elasticsearch.index.Index): 2
ReceiveTimeoutTransportException (org.elasticsearch.transport.ReceiveTimeoutTransportException): 2
LTRStatsNodeResponse (com.o19s.es.ltr.action.LTRStatsAction.LTRStatsNodeResponse): 1
LTRStatsNodesRequest (com.o19s.es.ltr.action.LTRStatsAction.LTRStatsNodesRequest): 1
LTRStatsNodesResponse (com.o19s.es.ltr.action.LTRStatsAction.LTRStatsNodesResponse): 1
Map (java.util.Map): 1