
Example 1 with NodesUnreachableException

Use of com.sequenceiq.cloudbreak.util.NodesUnreachableException in project cloudbreak by hortonworks.

In class SaltCheckerConclusionStep, method checkUnreachableNodes.

private Conclusion checkUnreachableNodes(Long resourceId) {
    Stack stack = stackService.getByIdWithListsInTransaction(resourceId);
    Set<String> allNodes = stackUtil.collectNodes(stack).stream().map(Node::getHostname).collect(Collectors.toSet());
    try {
        stackUtil.collectAndCheckReachableNodes(stack, allNodes);
    } catch (NodesUnreachableException e) {
        Set<String> unreachableNodes = e.getUnreachableNodes();
        String conclusion = String.format("Unreachable nodes: %s. We detected that cluster members can’t communicate with each other. " + "Please validate if all cluster members are available and healthy through your cloud provider.", unreachableNodes);
        String details = String.format("Unreachable salt minions: %s", unreachableNodes);
        LOGGER.warn(details);
        return failed(conclusion, details);
    }
    return succeeded();
}
Also used : Set(java.util.Set) NodesUnreachableException(com.sequenceiq.cloudbreak.util.NodesUnreachableException) Stack(com.sequenceiq.cloudbreak.domain.stack.Stack)
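
The snippet above relies only on a constructor taking a message and a set of host names, and on getUnreachableNodes(). As a rough illustration of that contract, here is a minimal sketch that does not depend on Cloudbreak. NodesUnreachableExceptionSketch, ReachabilityCheckSketch and the "responding" parameter are hypothetical names invented for this example; the real class in com.sequenceiq.cloudbreak.util may be shaped differently.

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for NodesUnreachableException, inferred only from the
// constructor and getter calls visible in the examples on this page.
class NodesUnreachableExceptionSketch extends Exception {
    private final TreeSet<String> unreachableNodes;

    NodesUnreachableExceptionSketch(String message, Set<String> unreachableNodes) {
        super(message);
        // Sorted defensive copy so the rendered host list is deterministic.
        this.unreachableNodes = new TreeSet<>(unreachableNodes);
    }

    Set<String> getUnreachableNodes() {
        return unreachableNodes;
    }
}

class ReachabilityCheckSketch {
    // Assumption: every node that is not listed in `responding` counts as unreachable.
    static Set<String> collectAndCheckReachableNodes(Set<String> allNodes, Set<String> responding)
            throws NodesUnreachableExceptionSketch {
        Set<String> unreachable = new TreeSet<>(allNodes);
        unreachable.removeAll(responding);
        if (!unreachable.isEmpty()) {
            throw new NodesUnreachableExceptionSketch("Some nodes did not respond", unreachable);
        }
        return allNodes;
    }

    // Same try/catch shape as checkUnreachableNodes: an empty Optional means
    // success, otherwise the Optional holds the failure details.
    static Optional<String> check(Set<String> allNodes, Set<String> responding) {
        try {
            collectAndCheckReachableNodes(allNodes, responding);
            return Optional.empty();
        } catch (NodesUnreachableExceptionSketch e) {
            return Optional.of(String.format("Unreachable salt minions: %s", e.getUnreachableNodes()));
        }
    }
}
```

Calling ReachabilityCheckSketch.check with all nodes responding yields an empty Optional; leaving a host out of the responding set yields a details string that lists that host in brackets, because %s formats the set through its toString().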

Example 2 with NodesUnreachableException

Use of com.sequenceiq.cloudbreak.util.NodesUnreachableException in project cloudbreak by hortonworks.

In class ClusterHostServiceRunnerTest, method collectAndCheckReachableNodesThrowsException.

@Test
void collectAndCheckReachableNodesThrowsException() throws NodesUnreachableException {
    Set<String> unreachableNodes = new HashSet<>();
    unreachableNodes.add("node1.example.com");
    when(stackUtil.collectAndCheckReachableNodes(eq(stack), any())).thenThrow(new NodesUnreachableException("error", unreachableNodes));
    CloudbreakServiceException cloudbreakServiceException = Assertions.assertThrows(CloudbreakServiceException.class, () -> underTest.runClusterServices(stack, cluster, Map.of()));
    assertEquals("Can not run cluster services on new nodes because the configuration management service is not responding on these nodes: " + "[node1.example.com]", cloudbreakServiceException.getMessage());
}
Also used : CloudbreakServiceException(com.sequenceiq.cloudbreak.common.exception.CloudbreakServiceException) NodesUnreachableException(com.sequenceiq.cloudbreak.util.NodesUnreachableException) HashSet(java.util.HashSet) Test(org.junit.jupiter.api.Test)

Example 3 with NodesUnreachableException

Use of com.sequenceiq.cloudbreak.util.NodesUnreachableException in project cloudbreak by hortonworks.

In class SaltCheckerConclusionStepTest, method checkShouldFallbackForOldImageVersionsAndReturnConclusionIfUnreachableNodeFound.

@Test
public void checkShouldFallbackForOldImageVersionsAndReturnConclusionIfUnreachableNodeFound() throws NodesUnreachableException {
    RPCResponse<SaltHealthReport> response = new RPCResponse<>();
    RPCMessage message = new RPCMessage();
    message.setMessage("rpc response");
    response.setMessages(List.of(message));
    when(nodeStatusService.saltPing(eq(1L))).thenReturn(response);
    when(stackService.getByIdWithListsInTransaction(eq(1L))).thenReturn(new Stack());
    when(stackUtil.collectNodes(any())).thenReturn(Set.of(createNode("host1"), createNode("host2")));
    when(stackUtil.collectAndCheckReachableNodes(any(), anyCollection())).thenThrow(new NodesUnreachableException("error", Set.of("host1")));
    Conclusion stepResult = underTest.check(1L);
    assertTrue(stepResult.isFailureFound());
    assertEquals("Unreachable nodes: [host1]. We detected that cluster members can’t communicate with each other. " + "Please validate if all cluster members are available and healthy through your cloud provider.", stepResult.getConclusion());
    assertEquals("Unreachable salt minions: [host1]", stepResult.getDetails());
    assertEquals(SaltCheckerConclusionStep.class, stepResult.getConclusionStepClass());
    verify(nodeStatusService, times(1)).saltPing(eq(1L));
    verify(stackService, times(1)).getByIdWithListsInTransaction(eq(1L));
    verify(stackUtil, times(1)).collectNodes(any());
    verify(stackUtil, times(1)).collectAndCheckReachableNodes(any(), any());
}
Also used : SaltHealthReport(com.cloudera.thunderhead.telemetry.nodestatus.NodeStatusProto.SaltHealthReport) RPCResponse(com.sequenceiq.cloudbreak.client.RPCResponse) RPCMessage(com.sequenceiq.cloudbreak.client.RPCMessage) NodesUnreachableException(com.sequenceiq.cloudbreak.util.NodesUnreachableException) Stack(com.sequenceiq.cloudbreak.domain.stack.Stack) Test(org.junit.jupiter.api.Test)

Example 4 with NodesUnreachableException

Use of com.sequenceiq.cloudbreak.util.NodesUnreachableException in project cloudbreak by hortonworks.

In class ClusterManagerUpgradeService, method upgradeClusterManager.

private void upgradeClusterManager(Stack stack) throws CloudbreakOrchestratorException {
    Cluster cluster = stack.getCluster();
    InstanceMetaData gatewayInstance = stack.getPrimaryGatewayInstance();
    GatewayConfig primaryGatewayConfig = gatewayConfigService.getGatewayConfig(stack, gatewayInstance, cluster.getGateway() != null);
    Set<String> gatewayFQDN = Collections.singleton(gatewayInstance.getDiscoveryFQDN());
    ExitCriteriaModel exitCriteriaModel = clusterDeletionBasedModel(stack.getId(), cluster.getId());
    SaltConfig pillar = createSaltConfig(stack, cluster.getId(), primaryGatewayConfig);
    Set<String> allNode = stackUtil.collectNodes(stack).stream().map(Node::getHostname).collect(Collectors.toSet());
    try {
        Set<Node> reachableNodes = stackUtil.collectAndCheckReachableNodes(stack, allNode);
        hostOrchestrator.upgradeClusterManager(primaryGatewayConfig, gatewayFQDN, reachableNodes, pillar, exitCriteriaModel);
    } catch (NodesUnreachableException e) {
        String errorMessage = "Can not upgrade cluster manager because the configuration management service is not responding on these nodes: " + e.getUnreachableNodes();
        LOGGER.error(errorMessage);
        throw new CloudbreakRuntimeException(errorMessage, e);
    }
}
Also used : InstanceMetaData(com.sequenceiq.cloudbreak.domain.stack.instance.InstanceMetaData) ExitCriteriaModel(com.sequenceiq.cloudbreak.orchestrator.state.ExitCriteriaModel) Node(com.sequenceiq.cloudbreak.common.orchestration.Node) CloudbreakRuntimeException(com.sequenceiq.cloudbreak.service.CloudbreakRuntimeException) Cluster(com.sequenceiq.cloudbreak.domain.stack.cluster.Cluster) SaltConfig(com.sequenceiq.cloudbreak.orchestrator.model.SaltConfig) NodesUnreachableException(com.sequenceiq.cloudbreak.util.NodesUnreachableException) GatewayConfig(com.sequenceiq.cloudbreak.orchestrator.model.GatewayConfig)

Example 5 with NodesUnreachableException

Use of com.sequenceiq.cloudbreak.util.NodesUnreachableException in project cloudbreak by hortonworks.

In class ClusterHostServiceRunner, method runClusterServices.

public NodeReachabilityResult runClusterServices(@Nonnull Stack stack, @Nonnull Cluster cluster, Map<String, String> candidateAddresses) {
    try {
        Set<Node> allNodes = stackUtil.collectNodes(stack);
        Set<Node> reachableNodes = stackUtil.collectAndCheckReachableNodes(stack, candidateAddresses.keySet());
        List<GatewayConfig> gatewayConfigs = gatewayConfigService.getAllGatewayConfigs(stack);
        List<GrainProperties> grainsProperties = grainPropertiesService.createGrainProperties(gatewayConfigs, cluster, reachableNodes);
        executeRunClusterServices(stack, cluster, candidateAddresses, allNodes, reachableNodes, gatewayConfigs, grainsProperties);
        return new NodeReachabilityResult(reachableNodes, Set.of());
    } catch (CloudbreakOrchestratorCancelledException e) {
        throw new CancellationException(e.getMessage());
    } catch (CloudbreakOrchestratorException | IOException | CloudbreakException e) {
        throw new CloudbreakServiceException(e.getMessage(), e);
    } catch (NodesUnreachableException e) {
        String errorMessage = "Can not run cluster services on new nodes because the configuration management service is not responding on these nodes: " + e.getUnreachableNodes();
        LOGGER.error(errorMessage);
        throw new CloudbreakServiceException(errorMessage, e);
    }
}
Also used : CloudbreakServiceException(com.sequenceiq.cloudbreak.common.exception.CloudbreakServiceException) Node(com.sequenceiq.cloudbreak.common.orchestration.Node) GrainProperties(com.sequenceiq.cloudbreak.orchestrator.model.GrainProperties) IOException(java.io.IOException) NodesUnreachableException(com.sequenceiq.cloudbreak.util.NodesUnreachableException) CloudbreakOrchestratorException(com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorException) CloudbreakOrchestratorCancelledException(com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorCancelledException) CancellationException(com.sequenceiq.cloudbreak.cloud.scheduler.CancellationException) CloudbreakException(com.sequenceiq.cloudbreak.service.CloudbreakException) NodeReachabilityResult(com.sequenceiq.cloudbreak.orchestrator.model.NodeReachabilityResult) GatewayConfig(com.sequenceiq.cloudbreak.orchestrator.model.GatewayConfig)
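
Examples 4 and 5 both translate the checked NodesUnreachableException into an unchecked exception, appending the unreachable host set to the message and keeping the original exception as the cause. The sketch below isolates that pattern with hypothetical types (UnreachableException, ServiceException, NodeCheck and runOrWrap are invented for illustration and are not Cloudbreak classes). It also shows why the test in Example 2 expects the message to end with "[node1.example.com]": string concatenation renders the set through toString().

```java
import java.util.Set;

class WrapUnreachableSketch {
    // Hypothetical checked exception carrying the hosts that did not respond.
    static class UnreachableException extends Exception {
        private final Set<String> nodes;

        UnreachableException(String message, Set<String> nodes) {
            super(message);
            this.nodes = nodes;
        }

        Set<String> getUnreachableNodes() {
            return nodes;
        }
    }

    // Hypothetical unchecked wrapper, playing the role of CloudbreakServiceException.
    static class ServiceException extends RuntimeException {
        ServiceException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical callback type for an operation that may find unreachable nodes.
    interface NodeCheck {
        void run() throws UnreachableException;
    }

    static void runOrWrap(NodeCheck check) {
        try {
            check.run();
        } catch (UnreachableException e) {
            // Concatenating a Set calls its toString(), which renders "[a, b]".
            String errorMessage = "Configuration management service is not responding on these nodes: "
                    + e.getUnreachableNodes();
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new ServiceException(errorMessage, e);
        }
    }
}
```

Passing the caught exception as the cause means callers only have to handle one unchecked type, while logs still show where the reachability check failed.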

Aggregations

NodesUnreachableException (com.sequenceiq.cloudbreak.util.NodesUnreachableException): 5
CloudbreakServiceException (com.sequenceiq.cloudbreak.common.exception.CloudbreakServiceException): 2
Node (com.sequenceiq.cloudbreak.common.orchestration.Node): 2
Stack (com.sequenceiq.cloudbreak.domain.stack.Stack): 2
GatewayConfig (com.sequenceiq.cloudbreak.orchestrator.model.GatewayConfig): 2
Test (org.junit.jupiter.api.Test): 2
SaltHealthReport (com.cloudera.thunderhead.telemetry.nodestatus.NodeStatusProto.SaltHealthReport): 1
RPCMessage (com.sequenceiq.cloudbreak.client.RPCMessage): 1
RPCResponse (com.sequenceiq.cloudbreak.client.RPCResponse): 1
CancellationException (com.sequenceiq.cloudbreak.cloud.scheduler.CancellationException): 1
Cluster (com.sequenceiq.cloudbreak.domain.stack.cluster.Cluster): 1
InstanceMetaData (com.sequenceiq.cloudbreak.domain.stack.instance.InstanceMetaData): 1
CloudbreakOrchestratorCancelledException (com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorCancelledException): 1
CloudbreakOrchestratorException (com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorException): 1
GrainProperties (com.sequenceiq.cloudbreak.orchestrator.model.GrainProperties): 1
NodeReachabilityResult (com.sequenceiq.cloudbreak.orchestrator.model.NodeReachabilityResult): 1
SaltConfig (com.sequenceiq.cloudbreak.orchestrator.model.SaltConfig): 1
ExitCriteriaModel (com.sequenceiq.cloudbreak.orchestrator.state.ExitCriteriaModel): 1
CloudbreakException (com.sequenceiq.cloudbreak.service.CloudbreakException): 1
CloudbreakRuntimeException (com.sequenceiq.cloudbreak.service.CloudbreakRuntimeException): 1