Example 1 with Replica

Use of com.linkedin.kafka.cruisecontrol.model.Replica in project cruise-control by linkedin.

From the class ReplicaDistributionGoal, method rebalanceByMovingReplicasOut.

private boolean rebalanceByMovingReplicasOut(Broker broker, ClusterModel clusterModel, Set<Goal> optimizedGoals, Set<String> excludedTopics) {
    // Get the eligible brokers.
    SortedSet<Broker> candidateBrokers = new TreeSet<>(Comparator.comparingInt((Broker b) -> b.replicas().size()).thenComparingInt(Broker::id));
    candidateBrokers.addAll(_selfHealingDeadBrokersOnly ? clusterModel.healthyBrokers() : clusterModel.healthyBrokers().stream().filter(b -> b.replicas().size() < _balanceUpperLimit).collect(Collectors.toSet()));
    // Get the replicas to rebalance. Replicas are sorted from smallest to largest disk usage.
    List<Replica> replicasToMove = broker.sortedReplicas(Resource.DISK, true);
    // Now let's move things around.
    for (Replica replica : replicasToMove) {
        if (shouldExclude(replica, excludedTopics)) {
            continue;
        }
        Broker b = maybeApplyBalancingAction(clusterModel, replica, candidateBrokers, ActionType.REPLICA_MOVEMENT, optimizedGoals);
        // Only check if we successfully moved something.
        if (b != null) {
            if (broker.replicas().size() <= (broker.isAlive() ? _balanceUpperLimit : 0)) {
                return false;
            }
            // Remove and reinsert the broker so the order is correct.
            candidateBrokers.remove(b);
            if (b.replicas().size() < _balanceUpperLimit || _selfHealingDeadBrokersOnly) {
                candidateBrokers.add(b);
            }
        }
    }
    // Every candidate move has been tried; report whether any replicas remain on the broker.
    return !broker.replicas().isEmpty();
}
Also used : Replica(com.linkedin.kafka.cruisecontrol.model.Replica) SortedSet(java.util.SortedSet) REPLICA_REJECT(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.REPLICA_REJECT) PriorityQueue(java.util.PriorityQueue) ClusterModel(com.linkedin.kafka.cruisecontrol.model.ClusterModel) LoggerFactory(org.slf4j.LoggerFactory) TreeSet(java.util.TreeSet) HashSet(java.util.HashSet) OptimizationFailureException(com.linkedin.kafka.cruisecontrol.exception.OptimizationFailureException) ActionAcceptance(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance) Logger(org.slf4j.Logger) BalancingConstraint(com.linkedin.kafka.cruisecontrol.analyzer.BalancingConstraint) Set(java.util.Set) AnalyzerUtils(com.linkedin.kafka.cruisecontrol.analyzer.AnalyzerUtils) ACCEPT(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.ACCEPT) ActionType(com.linkedin.kafka.cruisecontrol.analyzer.ActionType) Collectors(java.util.stream.Collectors) Broker(com.linkedin.kafka.cruisecontrol.model.Broker) List(java.util.List) Statistic(com.linkedin.kafka.cruisecontrol.common.Statistic) BalancingAction(com.linkedin.kafka.cruisecontrol.analyzer.BalancingAction) Resource(com.linkedin.kafka.cruisecontrol.common.Resource) ClusterModelStats(com.linkedin.kafka.cruisecontrol.model.ClusterModelStats) ADD(com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal.ChangeType.ADD) REMOVE(com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal.ChangeType.REMOVE) Comparator(java.util.Comparator) Collections(java.util.Collections) ModelCompletenessRequirements(com.linkedin.kafka.cruisecontrol.monitor.ModelCompletenessRequirements)
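
The "remove and reinsert" step in the method above exists because a TreeSet does not re-sort an element whose sort key changes while it is stored. The following is a minimal sketch of that idea, not the project's code: Node, newCandidateSet and addReplica are hypothetical names, and the sketch assumes the element is removed before its key is changed so the tree can still locate it.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

final class ReorderSketch {
    /** Hypothetical stand-in for a broker: an id plus a mutable replica count. */
    static final class Node {
        final int id;
        int replicaCount;

        Node(int id, int replicaCount) {
            this.id = id;
            this.replicaCount = replicaCount;
        }
    }

    /** Orders nodes by replica count, breaking ties by id so distinct nodes never compare equal. */
    static SortedSet<Node> newCandidateSet() {
        return new TreeSet<>(Comparator.comparingInt((Node n) -> n.replicaCount).thenComparingInt(n -> n.id));
    }

    /** Adds one replica to the node while keeping the set's ordering valid. */
    static void addReplica(SortedSet<Node> candidates, Node node, int upperLimit) {
        // Remove first: the tree finds elements through the comparator, so the key must still match.
        candidates.remove(node);
        node.replicaCount++;
        // Reinsert only while the node can still accept replicas.
        if (node.replicaCount < upperLimit) {
            candidates.add(node);
        }
    }

    public static void main(String[] args) {
        SortedSet<Node> candidates = newCandidateSet();
        Node a = new Node(1, 2);
        Node b = new Node(2, 3);
        candidates.add(a);
        candidates.add(b);
        addReplica(candidates, a, 5); // a has 3 replicas, ties with b, sorts first by id
        addReplica(candidates, a, 5); // a has 4 replicas and now sorts after b
        System.out.println(candidates.first().id); // prints 2
    }
}
```

Once a node reaches the upper limit it is simply not reinserted, which mirrors how the method above drops brokers that can no longer accept replicas.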

Example 2 with Replica

Use of com.linkedin.kafka.cruisecontrol.model.Replica in project cruise-control by linkedin.

From the class ResourceDistributionGoal, method actionAcceptance.

/**
 * Check whether given action is acceptable by this goal. An action is acceptable by this goal if it satisfies the
 * following: (1) if both source and destination brokers were within the limit before the action, the corresponding
 * limits cannot be violated after the action, (2) otherwise, the action cannot increase the utilization difference
 * between brokers.
 *
 * @param action Action to be checked for acceptance.
 * @param clusterModel The state of the cluster.
 * @return {@link ActionAcceptance#ACCEPT} if the action is acceptable by this goal,
 * {@link ActionAcceptance#REPLICA_REJECT} otherwise.
 */
@Override
public ActionAcceptance actionAcceptance(BalancingAction action, ClusterModel clusterModel) {
    Replica sourceReplica = clusterModel.broker(action.sourceBrokerId()).replica(action.topicPartition());
    Broker destinationBroker = clusterModel.broker(action.destinationBrokerId());
    switch(action.balancingAction()) {
        case REPLICA_SWAP:
            Replica destinationReplica = destinationBroker.replica(action.destinationTopicPartition());
            double sourceUtilizationDelta = destinationReplica.load().expectedUtilizationFor(resource()) - sourceReplica.load().expectedUtilizationFor(resource());
            if (sourceUtilizationDelta == 0) {
                // No change in terms of load.
                return ACCEPT;
            }
            // Check if both the source and the destination broker are within the balance limit before applying a swap that
            // could potentially make a balanced broker unbalanced -- i.e. never make a balanced broker unbalanced.
            // Note that (1) if both source and destination brokers were within the limit before the swap, the corresponding
            // limits cannot be violated after the swap, (2) otherwise, the swap is guaranteed not to increase the utilization
            // difference between brokers.
            boolean bothBrokersCurrentlyWithinLimit = sourceUtilizationDelta > 0 ? (isLoadAboveBalanceLowerLimit(destinationBroker) && isLoadUnderBalanceUpperLimit(sourceReplica.broker())) : (isLoadAboveBalanceLowerLimit(sourceReplica.broker()) && isLoadUnderBalanceUpperLimit(destinationBroker));
            if (bothBrokersCurrentlyWithinLimit) {
                // Ensure that the resource utilization on balanced broker(s) does not go out of limits after the swap.
                return isSwapViolatingLimit(sourceReplica, destinationReplica) ? REPLICA_REJECT : ACCEPT;
            }
            // Ensure that the swap does not increase the utilization difference between brokers.
            return isSelfSatisfiedAfterSwap(sourceReplica, destinationReplica) ? ACCEPT : REPLICA_REJECT;
        case REPLICA_MOVEMENT:
        case LEADERSHIP_MOVEMENT:
            if (isLoadAboveBalanceLowerLimit(sourceReplica.broker()) && isLoadUnderBalanceUpperLimit(destinationBroker)) {
                // Already satisfied balance limits cannot be violated due to balancing action.
                return (isLoadUnderBalanceUpperLimitAfterChange(sourceReplica.load(), destinationBroker, ADD) && isLoadAboveBalanceLowerLimitAfterChange(sourceReplica.load(), sourceReplica.broker(), REMOVE)) ? ACCEPT : REPLICA_REJECT;
            }
            // Check that current destination would not become more unbalanced.
            return isAcceptableAfterReplicaMove(sourceReplica, destinationBroker) ? ACCEPT : REPLICA_REJECT;
        default:
            throw new IllegalArgumentException("Unsupported balancing action " + action.balancingAction() + " is provided.");
    }
}
Also used : Broker(com.linkedin.kafka.cruisecontrol.model.Broker) Replica(com.linkedin.kafka.cruisecontrol.model.Replica)
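
The two-part rule in the Javadoc above can be illustrated with plain numbers. This is a simplified sketch under stated assumptions, not the goal's actual logic: it checks both limits on both brokers (the method above checks only the lower limit on the source and the upper limit on the destination), and AcceptanceSketch, Verdict, withinLimits and check are hypothetical names.

```java
final class AcceptanceSketch {
    enum Verdict { ACCEPT, REJECT }

    /** True if the load lies inside the inclusive [lower, upper] band. */
    static boolean withinLimits(double load, double lower, double upper) {
        return load >= lower && load <= upper;
    }

    /** Decides whether moving delta units of load from source to destination is acceptable. */
    static Verdict check(double source, double destination, double delta, double lower, double upper) {
        double sourceAfter = source - delta;
        double destinationAfter = destination + delta;
        if (withinLimits(source, lower, upper) && withinLimits(destination, lower, upper)) {
            // Rule (1): brokers that were balanced before the move must stay balanced after it.
            boolean stillBalanced = withinLimits(sourceAfter, lower, upper)
                && withinLimits(destinationAfter, lower, upper);
            return stillBalanced ? Verdict.ACCEPT : Verdict.REJECT;
        }
        // Rule (2): otherwise the move must not widen the gap between the two brokers.
        boolean gapNotWider = Math.abs(sourceAfter - destinationAfter) <= Math.abs(source - destination);
        return gapNotWider ? Verdict.ACCEPT : Verdict.REJECT;
    }

    public static void main(String[] args) {
        System.out.println(check(60, 40, 5, 30, 70));  // both balanced, stay balanced
        System.out.println(check(60, 40, 35, 30, 70)); // both balanced, move breaks the limits
        System.out.println(check(90, 40, 20, 30, 70)); // source overloaded, gap shrinks
    }
}
```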

Example 3 with Replica

Use of com.linkedin.kafka.cruisecontrol.model.Replica in project cruise-control by linkedin.

From the class ResourceDistributionGoal, method updateGoalState.

/**
 * Update the current resource that is being balanced if there are still resources to be balanced, finish otherwise.
 *
 * @param clusterModel The state of the cluster.
 * @param excludedTopics The topics that should be excluded from the optimization action.
 */
@Override
protected void updateGoalState(ClusterModel clusterModel, Set<String> excludedTopics) throws OptimizationFailureException {
    Set<Integer> brokerIdsAboveBalanceUpperLimit = new HashSet<>();
    Set<Integer> brokerIdsUnderBalanceLowerLimit = new HashSet<>();
    // While proposals exclude the excludedTopics, the balance still considers utilization of the excludedTopic replicas.
    for (Broker broker : clusterModel.healthyBrokers()) {
        if (!isLoadUnderBalanceUpperLimit(broker)) {
            brokerIdsAboveBalanceUpperLimit.add(broker.id());
        }
        if (!isLoadAboveBalanceLowerLimit(broker)) {
            brokerIdsUnderBalanceLowerLimit.add(broker.id());
        }
    }
    if (!brokerIdsAboveBalanceUpperLimit.isEmpty()) {
        LOG.warn("Utilization for broker ids:{} {} above the balance limit for:{} after {}.", brokerIdsAboveBalanceUpperLimit, (brokerIdsAboveBalanceUpperLimit.size() > 1) ? "are" : "is", resource(), (clusterModel.selfHealingEligibleReplicas().isEmpty()) ? "rebalance" : "self-healing");
        _succeeded = false;
    }
    if (!brokerIdsUnderBalanceLowerLimit.isEmpty()) {
        LOG.warn("Utilization for broker ids:{} {} under the balance limit for:{} after {}.", brokerIdsUnderBalanceLowerLimit, (brokerIdsUnderBalanceLowerLimit.size() > 1) ? "are" : "is", resource(), (clusterModel.selfHealingEligibleReplicas().isEmpty()) ? "rebalance" : "self-healing");
        _succeeded = false;
    }
    // Sanity check: No self-healing eligible replica should remain at a decommissioned broker.
    for (Replica replica : clusterModel.selfHealingEligibleReplicas()) {
        if (replica.broker().isAlive()) {
            continue;
        }
        if (_selfHealingDeadBrokersOnly) {
            throw new OptimizationFailureException("Self healing failed to move the replica away from decommissioned brokers.");
        }
        _selfHealingDeadBrokersOnly = true;
        LOG.warn("Omitting resource balance limit to relocate remaining replicas from dead brokers to healthy ones.");
        return;
    }
    // No dead broker contains a replica.
    _selfHealingDeadBrokersOnly = false;
    // Sanity check: No self-healing eligible replica should remain at a decommissioned broker.
    for (Replica replica : clusterModel.selfHealingEligibleReplicas()) {
        if (!replica.broker().isAlive()) {
            throw new OptimizationFailureException("Self healing failed to move the replica away from decommissioned broker.");
        }
    }
    finish();
}
Also used : Broker(com.linkedin.kafka.cruisecontrol.model.Broker) OptimizationFailureException(com.linkedin.kafka.cruisecontrol.exception.OptimizationFailureException) Replica(com.linkedin.kafka.cruisecontrol.model.Replica) HashSet(java.util.HashSet)
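
The handling of _selfHealingDeadBrokersOnly above amounts to a two-round retry: a first round that respects the balance limits, then one more round that ignores them, then failure. A minimal sketch of that state machine follows. SelfHealingRetrySketch, Outcome and update are hypothetical names, and a plain count of stranded replicas stands in for the cluster model.

```java
final class SelfHealingRetrySketch {
    enum Outcome { FINISHED, RETRY_WITHOUT_LIMITS }

    // Plays the role of the _selfHealingDeadBrokersOnly flag in the method above.
    private boolean relaxLimits = false;

    boolean relaxLimits() {
        return relaxLimits;
    }

    /** Called after each optimization round with the number of replicas still on dead brokers. */
    Outcome update(int replicasOnDeadBrokers) {
        if (replicasOnDeadBrokers > 0) {
            if (relaxLimits) {
                // The relaxed round also failed, so give up.
                throw new IllegalStateException("Self healing failed to move replicas away from dead brokers.");
            }
            // First failure: request one more round that ignores the balance limits.
            relaxLimits = true;
            return Outcome.RETRY_WITHOUT_LIMITS;
        }
        // Nothing is stranded: reset the flag and finish.
        relaxLimits = false;
        return Outcome.FINISHED;
    }

    public static void main(String[] args) {
        SelfHealingRetrySketch state = new SelfHealingRetrySketch();
        System.out.println(state.update(2)); // RETRY_WITHOUT_LIMITS
        System.out.println(state.update(0)); // FINISHED
    }
}
```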

Example 4 with Replica

Use of com.linkedin.kafka.cruisecontrol.model.Replica in project cruise-control by linkedin.

From the class ResourceDistributionGoal, method rebalanceByMovingLoadIn.

private boolean rebalanceByMovingLoadIn(Broker broker, ClusterModel clusterModel, Set<Goal> optimizedGoals, ActionType actionType, Set<String> excludedTopics) {
    if (!clusterModel.newBrokers().isEmpty() && !broker.isNew()) {
        // We have new brokers and the current broker is not a new broker.
        return true;
    }
    PriorityQueue<CandidateBroker> candidateBrokerPQ = new PriorityQueue<>();
    // Sort the replicas initially to avoid sorting them every time.
    double clusterUtilization = clusterModel.load().expectedUtilizationFor(resource()) / clusterModel.capacityFor(resource());
    for (Broker candidate : clusterModel.healthyBrokers()) {
        // Get candidate replicas on candidate broker to try moving load from -- sorted in the order of trial (descending load).
        if (utilizationPercentage(candidate) > clusterUtilization) {
            SortedSet<Replica> replicasToMoveIn = sortedCandidateReplicas(candidate, excludedTopics, 0, false);
            CandidateBroker candidateBroker = new CandidateBroker(candidate, replicasToMoveIn, false);
            candidateBrokerPQ.add(candidateBroker);
        }
    }
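    // Keep going while candidate brokers remain. For leadership movement, also stop once every replica on the
    // broker is already a leader; replica movement continues until the candidate queue is empty.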
    while (!candidateBrokerPQ.isEmpty() && (actionType == REPLICA_MOVEMENT || (actionType == LEADERSHIP_MOVEMENT && broker.leaderReplicas().size() != broker.replicas().size()))) {
        CandidateBroker cb = candidateBrokerPQ.poll();
        SortedSet<Replica> candidateReplicasToReceive = cb.replicas();
        for (Iterator<Replica> iterator = candidateReplicasToReceive.iterator(); iterator.hasNext(); ) {
            Replica replica = iterator.next();
            Broker b = maybeApplyBalancingAction(clusterModel, replica, Collections.singletonList(broker), actionType, optimizedGoals);
            // Only check the balance if an action was applied. A candidate broker with nothing to move in never
            // gets past this check, so it is never re-enqueued.
            if (b != null) {
                if (isLoadAboveBalanceLowerLimit(broker)) {
                    return false;
                }
                // Remove the replica from its source broker if it was a replica movement.
                if (actionType == REPLICA_MOVEMENT) {
                    iterator.remove();
                }
                // If the candidate broker is now less utilized than the next broker in the queue,
                // re-enqueue it and switch to the next broker.
                if (!candidateBrokerPQ.isEmpty() && utilizationPercentage(cb.broker()) < utilizationPercentage(candidateBrokerPQ.peek().broker())) {
                    candidateBrokerPQ.add(cb);
                    break;
                }
            }
        }
    }
    return true;
}
Also used : Broker(com.linkedin.kafka.cruisecontrol.model.Broker) PriorityQueue(java.util.PriorityQueue) Replica(com.linkedin.kafka.cruisecontrol.model.Replica)
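
The poll, drain and re-enqueue loop above can be sketched on integers. This is an illustration only: the ordering of CandidateBroker is not shown on this page, so the sketch assumes the queue yields the most loaded donor first. Donor, drain and unitsNeeded are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class DonorQueueSketch {
    /** Hypothetical donor broker holding an integer amount of movable load. */
    static final class Donor {
        final String name;
        int load;

        Donor(String name, int load) {
            this.name = name;
            this.load = load;
        }
    }

    /** Takes load one unit at a time and returns the donor names in the order they were used. */
    static List<String> drain(List<Donor> donors, int unitsNeeded) {
        // Assumption: the most loaded donor is polled first.
        PriorityQueue<Donor> queue = new PriorityQueue<>(Comparator.comparingInt((Donor d) -> d.load).reversed());
        queue.addAll(donors);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty() && order.size() < unitsNeeded) {
            // The donor is outside the queue while its load changes, so the heap order stays valid.
            Donor current = queue.poll();
            while (current.load > 0 && order.size() < unitsNeeded) {
                current.load--;
                order.add(current.name);
                // Once this donor is less loaded than the next one, re-enqueue it and switch.
                if (!queue.isEmpty() && current.load < queue.peek().load) {
                    queue.add(current);
                    break;
                }
            }
            // A donor that runs out of load falls out of the loop and is never re-enqueued.
        }
        return order;
    }

    public static void main(String[] args) {
        List<Donor> donors = new ArrayList<>();
        donors.add(new Donor("A", 5));
        donors.add(new Donor("B", 4));
        System.out.println(drain(donors, 4)); // prints [A, A, B, B]
    }
}
```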

Example 5 with Replica

Use of com.linkedin.kafka.cruisecontrol.model.Replica in project cruise-control by linkedin.

From the class ResourceDistributionGoal, method rebalanceBySwappingLoadIn.

private boolean rebalanceBySwappingLoadIn(Broker broker, ClusterModel clusterModel, Set<Goal> optimizedGoals, Set<String> excludedTopics) {
    if (!broker.isAlive() || broker.replicas().isEmpty()) {
        // Source broker is dead or has no replicas to swap.
        return true;
    }
    // Get the replicas to rebalance.
    SortedSet<Replica> sourceReplicas = new TreeSet<>(Comparator.comparingDouble((Replica r) -> r.load().expectedUtilizationFor(resource())).thenComparing(r -> r.topicPartition().toString()));
    sourceReplicas.addAll(broker.replicas());
    // Sort the replicas initially to avoid sorting them every time.
    PriorityQueue<CandidateBroker> candidateBrokerPQ = new PriorityQueue<>();
    for (Broker candidate : clusterModel.healthyBrokersOverThreshold(resource(), _balanceLowerThreshold)) {
        // Get candidate replicas on candidate broker to try swapping with -- sorted in the order of trial (descending load).
        double minSourceReplicaLoad = sourceReplicas.first().load().expectedUtilizationFor(resource());
        SortedSet<Replica> replicasToSwapWith = sortedCandidateReplicas(candidate, excludedTopics, minSourceReplicaLoad, false);
        CandidateBroker candidateBroker = new CandidateBroker(candidate, replicasToSwapWith, false);
        candidateBrokerPQ.add(candidateBroker);
    }
    while (!candidateBrokerPQ.isEmpty()) {
        CandidateBroker cb = candidateBrokerPQ.poll();
        SortedSet<Replica> candidateReplicasToSwapWith = cb.replicas();
        Replica swappedInReplica = null;
        Replica swappedOutReplica = null;
        for (Replica sourceReplica : sourceReplicas) {
            if (shouldExclude(sourceReplica, excludedTopics)) {
                continue;
            }
            // It does not make sense to swap replicas without utilization from a live broker.
            double sourceReplicaUtilization = sourceReplica.load().expectedUtilizationFor(resource());
            if (sourceReplicaUtilization == 0.0) {
                break;
            }
            // Try swapping the source with the candidate replicas. Get the swapped in replica if successful, null otherwise.
            Replica swappedIn = maybeApplySwapAction(clusterModel, sourceReplica, candidateReplicasToSwapWith, optimizedGoals);
            if (swappedIn != null) {
                if (isLoadAboveBalanceLowerLimit(broker)) {
                    // Successfully balanced this broker by swapping in.
                    return false;
                }
                // Add swapped in/out replica for updating the list of replicas in source broker.
                swappedInReplica = swappedIn;
                swappedOutReplica = sourceReplica;
                break;
            }
        }
        swapUpdate(swappedInReplica, swappedOutReplica, sourceReplicas, candidateReplicasToSwapWith, candidateBrokerPQ, cb);
    }
    return true;
}
Also used : Replica(com.linkedin.kafka.cruisecontrol.model.Replica) SortedSet(java.util.SortedSet) REPLICA_REJECT(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.REPLICA_REJECT) PriorityQueue(java.util.PriorityQueue) ClusterModel(com.linkedin.kafka.cruisecontrol.model.ClusterModel) LoggerFactory(org.slf4j.LoggerFactory) LEADERSHIP_MOVEMENT(com.linkedin.kafka.cruisecontrol.analyzer.ActionType.LEADERSHIP_MOVEMENT) Function(java.util.function.Function) TreeSet(java.util.TreeSet) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) REPLICA_SWAP(com.linkedin.kafka.cruisecontrol.analyzer.ActionType.REPLICA_SWAP) OptimizationFailureException(com.linkedin.kafka.cruisecontrol.exception.OptimizationFailureException) Load(com.linkedin.kafka.cruisecontrol.model.Load) REMOVE(com.linkedin.kafka.cruisecontrol.analyzer.goals.ResourceDistributionGoal.ChangeType.REMOVE) ADD(com.linkedin.kafka.cruisecontrol.analyzer.goals.ResourceDistributionGoal.ChangeType.ADD) REPLICA_MOVEMENT(com.linkedin.kafka.cruisecontrol.analyzer.ActionType.REPLICA_MOVEMENT) ActionAcceptance(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance) Logger(org.slf4j.Logger) Iterator(java.util.Iterator) BalancingConstraint(com.linkedin.kafka.cruisecontrol.analyzer.BalancingConstraint) Set(java.util.Set) ACCEPT(com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.ACCEPT) ActionType(com.linkedin.kafka.cruisecontrol.analyzer.ActionType) Collectors(java.util.stream.Collectors) Broker(com.linkedin.kafka.cruisecontrol.model.Broker) List(java.util.List) Statistic(com.linkedin.kafka.cruisecontrol.common.Statistic) BalancingAction(com.linkedin.kafka.cruisecontrol.analyzer.BalancingAction) Resource(com.linkedin.kafka.cruisecontrol.common.Resource) ClusterModelStats(com.linkedin.kafka.cruisecontrol.model.ClusterModelStats) Comparator(java.util.Comparator) Collections(java.util.Collections) ModelCompletenessRequirements(com.linkedin.kafka.cruisecontrol.monitor.ModelCompletenessRequirements) 
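
The sourceReplicas comparator above sorts by utilization and then by the topic partition string. The second key matters because a TreeSet treats elements that compare as equal as duplicates and keeps only one of them. A small sketch of the difference follows, with a hypothetical Item class standing in for Replica.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

final class TieBreakSketch {
    /** Hypothetical stand-in for a replica: a partition name and its load. */
    static final class Item {
        final String partition;
        final double load;

        Item(String partition, double load) {
            this.partition = partition;
            this.load = load;
        }
    }

    /** Sorted by load only: items with equal load collapse into one entry. */
    static SortedSet<Item> byLoadOnly() {
        return new TreeSet<>(Comparator.comparingDouble((Item i) -> i.load));
    }

    /** Sorted by load, then by partition name: items with equal load are all kept. */
    static SortedSet<Item> byLoadThenPartition() {
        return new TreeSet<>(Comparator.comparingDouble((Item i) -> i.load).thenComparing(i -> i.partition));
    }

    public static void main(String[] args) {
        SortedSet<Item> lossy = byLoadOnly();
        SortedSet<Item> safe = byLoadThenPartition();
        for (SortedSet<Item> set : java.util.List.of(lossy, safe)) {
            set.add(new Item("topic-0", 1.0));
            set.add(new Item("topic-1", 1.0));
        }
        System.out.println(lossy.size()); // prints 1
        System.out.println(safe.size());  // prints 2
    }
}
```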

Aggregations

Replica (com.linkedin.kafka.cruisecontrol.model.Replica): 40
Broker (com.linkedin.kafka.cruisecontrol.model.Broker): 26
BalancingConstraint (com.linkedin.kafka.cruisecontrol.analyzer.BalancingConstraint): 13
OptimizationFailureException (com.linkedin.kafka.cruisecontrol.exception.OptimizationFailureException): 12
ClusterModel (com.linkedin.kafka.cruisecontrol.model.ClusterModel): 9
HashSet (java.util.HashSet): 9
TreeSet (java.util.TreeSet): 8
Resource (com.linkedin.kafka.cruisecontrol.common.Resource): 7
ActionAcceptance (com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance): 6
ActionType (com.linkedin.kafka.cruisecontrol.analyzer.ActionType): 6
BalancingAction (com.linkedin.kafka.cruisecontrol.analyzer.BalancingAction): 6
ArrayList (java.util.ArrayList): 6
List (java.util.List): 6
ACCEPT (com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.ACCEPT): 5
REPLICA_REJECT (com.linkedin.kafka.cruisecontrol.analyzer.ActionAcceptance.REPLICA_REJECT): 5
ClusterModelStats (com.linkedin.kafka.cruisecontrol.model.ClusterModelStats): 5
ModelCompletenessRequirements (com.linkedin.kafka.cruisecontrol.monitor.ModelCompletenessRequirements): 5
Set (java.util.Set): 5
SortedSet (java.util.SortedSet): 5
Collectors (java.util.stream.Collectors): 5