Search in sources:

Example 61 with DegraderLoadBalancerStrategyConfig

Use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.

From the class DegraderLoadBalancerStrategyV3, method doUpdatePartitionState.

/**
 * updatePartitionState
 *
 * We have two mechanisms to influence the health and traffic patterns of the client. They are
 * by load balancing (switching traffic from one host to another) and by degrading service
 * (dropping calls). We load balance by allocating points in a consistent hash ring based on the
 * computedDropRate of the individual TrackerClients, which takes into account the latency
 * seen by that TrackerClient's requests. Alternatively, if the cluster is
 * unhealthy (as determined by a high latency watermark), we can drop a portion of
 * traffic across all tracker clients corresponding to this cluster.
 *
 * Currently only 500-level return codes are counted into the error rate when adjusting the hash ring.
 * The reason we do not consider other errors is that there are legitimate errors that servers can
 * send back for clients to handle, such as 400 return codes.
 *
 * We don't want to both reduce hash points and allow clients to manage their own drop rates,
 * because the clients do not have the global view that the load balancing strategy has. Without
 * a global view, a client won't know whether it already has a reduced number of hash points. If the
 * client continues to drop at the same rate as before its points were reduced, then
 * the client would have its outbound requests reduced by both the reduction in points and the client's
 * drop rate. To avoid this, the drop rate is managed globally by the load balancing strategy and
 * provided to each client. The strategy will alternate between adjusting the hash ring points or
 * the global drop rate in order to avoid double penalizing a client.
 *
 * We also have a mechanism for recovery if the number of points in the hash ring is not
 * enough to receive traffic. The initialRecoveryLevel is a number between 0.0 and 1.0, and
 * corresponds to a fraction of the tracker client's full hash points.
 * The reason for using a fraction is to allow an initialRecoveryLevel that corresponds to
 * less than one hash point. This is useful if a "cooling off" period is desirable for
 * misbehaving tracker clients; e.g., given a full weight of 100 hash points, 0.005 means that
 * there will be one cooling-off period before the client is reintroduced into the hash ring.
 *
 * The second configuration, rampFactor, will geometrically increase the
 * previous recoveryLevel if traffic still hasn't been seen for that tracker client.
 *
 * For example, given initialRecoveryLevel = 0.01, rampFactor = 2, and default tracker client hash
 * points of 100, the recovery level will increase in this pattern on successive update states:
 * 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, etc. (i.e., 1, 2, 4, 8, 16, 32 hash points),
 * aborting as soon as calls are recorded for that tracker client.
 *
 * We also have highWaterMark and lowWaterMark as properties of the DegraderLoadBalancer strategy
 * so that the strategy can make decisions on whether to start dropping traffic globally across
 * all tracker clients for this cluster. The amount of traffic to drop is controlled by the
 * globalStepUp and globalStepDown properties, where globalStepUp controls how much the global
 * drop rate increases per interval, and globalStepDown controls how much the global drop rate
 * decreases per interval. We only step up the global drop rate when the average cluster latency
 * is higher than the highWaterMark, and only step down the global drop rate when the average
 * cluster latency is lower than the lowWaterMark.
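 *
 * For example (illustrative numbers, not rest.li defaults): with highWaterMark = 3000 ms
 * and globalStepUp = 0.2, an average cluster latency of 3500 ms would raise the global
 * drop rate from 0.0 to 0.2 on the next call-dropping update.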
 *
 * This code is thread-reentrant. Multiple threads can potentially call this concurrently, and so
 * callers must pass in the DegraderLoadBalancerState that they based their shouldUpdate() call on.
 * The multiple threads may have different views of the trackerClients' latency, but this is
 * okay, as the new state in the end will reflect only one action (either load balancing or
 * call dropping with at most one step). Currently we will not call this concurrently, as
 * checkUpdatePartitionState will control entry to a single thread.
 */
private static PartitionDegraderLoadBalancerState doUpdatePartitionState(long clusterGenerationId, int partitionId,
        PartitionDegraderLoadBalancerState oldState, DegraderLoadBalancerStrategyConfig config,
        List<DegraderTrackerClientUpdater> degraderTrackerClientUpdaters, boolean isQuarantineEnabled) {
    debug(_log, "updating state for: ", degraderTrackerClientUpdaters);
    double sumOfClusterLatencies = 0.0;
    long totalClusterCallCount = 0;
    boolean hashRingChanges = false;
    boolean clientDegraded = false;
    boolean recoveryMapChanges = false;
    boolean quarantineMapChanged = false;
    PartitionDegraderLoadBalancerState.Strategy strategy = oldState.getStrategy();
    Map<DegraderTrackerClient, Double> oldRecoveryMap = oldState.getRecoveryMap();
    Map<DegraderTrackerClient, Double> newRecoveryMap = new HashMap<>(oldRecoveryMap);
    double currentOverrideDropRate = oldState.getCurrentOverrideDropRate();
    double initialRecoveryLevel = config.getInitialRecoveryLevel();
    double ringRampFactor = config.getRingRampFactor();
    int pointsPerWeight = config.getPointsPerWeight();
    PartitionDegraderLoadBalancerState newState;
    Map<DegraderTrackerClient, LoadBalancerQuarantine> quarantineMap = oldState.getQuarantineMap();
    Map<DegraderTrackerClient, LoadBalancerQuarantine> quarantineHistory = oldState.getQuarantineHistory();
    Set<DegraderTrackerClient> activeClients = new HashSet<>();
    long clk = config.getClock().currentTimeMillis();
    long clusterErrorCount = 0;
    long clusterDropCount = 0;
    for (DegraderTrackerClientUpdater clientUpdater : degraderTrackerClientUpdaters) {
        DegraderTrackerClient client = clientUpdater.getTrackerClient();
        DegraderControl degraderControl = client.getDegraderControl(partitionId);
        double averageLatency = degraderControl.getLatency();
        long callCount = degraderControl.getCallCount();
        clusterDropCount += (int) (degraderControl.getCurrentDropRate() * callCount);
        clusterErrorCount += (int) (degraderControl.getErrorRate() * callCount);
        oldState.getPreviousMaxDropRate().put(client, clientUpdater.getMaxDropRate());
        sumOfClusterLatencies += averageLatency * callCount;
        totalClusterCallCount += callCount;
        activeClients.add(client);
        if (isQuarantineEnabled) {
            // Check/update quarantine state if current client is already under quarantine
            LoadBalancerQuarantine quarantine = quarantineMap.get(client);
            if (quarantine != null && quarantine.checkUpdateQuarantineState()) {
                // Evict client from quarantine
                quarantineMap.remove(client);
                quarantineHistory.put(client, quarantine);
                _log.info("TrackerClient {} evicted from quarantine @ {}", client.getUri(), clk);
                // Next need to put the client to slow-start/recovery mode to gradually pick up traffic.
                // For now simply force the weight to the initialRecoveryLevel so the client can gradually recover
                // RecoveryMap is used here to track the clients that just evicted from quarantine
                // They'll not be quarantined again in the recovery phase even though the effective
                // weight is within the range.
                newRecoveryMap.put(client, degraderControl.getMaxDropRate());
                clientUpdater.setMaxDropRate(1.0 - initialRecoveryLevel);
                quarantineMapChanged = true;
            }
        }
        if (newRecoveryMap.containsKey(client)) {
            recoveryMapChanges = handleClientInRecoveryMap(degraderControl, clientUpdater, initialRecoveryLevel, ringRampFactor, callCount, newRecoveryMap, strategy);
        }
    }
    // Remove clients that are no longer in degraderTrackerClientUpdaters from the
    // quarantine maps -- those URIs were removed from zookeeper
    if (isQuarantineEnabled) {
        quarantineMap.entrySet().removeIf(e -> !activeClients.contains(e.getKey()));
        quarantineHistory.entrySet().removeIf(e -> !activeClients.contains(e.getKey()));
    }
    // Also remove the clients from recoveryMap if they are gone
    newRecoveryMap.entrySet().removeIf(e -> !activeClients.contains(e.getKey()));
    boolean trackerClientInconsistency = degraderTrackerClientUpdaters.size() != oldState.getPointsMap().size();
    if (oldState.getClusterGenerationId() == clusterGenerationId && totalClusterCallCount <= 0 && !recoveryMapChanges && !quarantineMapChanged && !trackerClientInconsistency) {
        // if the cluster has not been called recently (total cluster call count is <= 0)
        // and we already have a state with the same set of URIs (same cluster generation),
        // and no clients are in rehab or evicted from quarantine, then don't change anything.
        debug(_log, "New state is the same as the old state so we're not changing anything. Old state = ", oldState, ", config= ", config);
        return new PartitionDegraderLoadBalancerState(oldState, clusterGenerationId, config.getClock().currentTimeMillis());
    }
    // update our overrides.
    double newCurrentAvgClusterLatency = -1;
    if (totalClusterCallCount > 0) {
        newCurrentAvgClusterLatency = sumOfClusterLatencies / totalClusterCallCount;
    }
    debug(_log, "average cluster latency: ", newCurrentAvgClusterLatency);
    // This points map stores how many hash ring points to allocate for each tracker client.
    Map<URI, Integer> points = new HashMap<>();
    Map<URI, Integer> oldPointsMap = oldState.getPointsMap();
    for (DegraderTrackerClientUpdater clientUpdater : degraderTrackerClientUpdaters) {
        DegraderTrackerClient client = clientUpdater.getTrackerClient();
        URI clientUri = client.getUri();
        // Don't take into account cluster health when calculating the number of points
        // for each client. This is because the individual clients already take into account
        // latency and errors, and a successfulTransmissionWeight can and should be made
        // independent of other nodes in the cluster. Otherwise, one unhealthy client in a small
        // cluster can take down the entire cluster if the avg latency is too high.
        // The global drop rate will take into account the cluster latency. High cluster-wide error
        // rates are not something d2 can address.
        // 
        // this client's maxDropRate and currentComputedDropRate may have been adjusted if it's in the
        // rehab program (to gradually send traffic it's way).
        DegraderControl degraderControl = client.getDegraderControl(partitionId);
        double dropRate = Math.min(degraderControl.getCurrentComputedDropRate(), clientUpdater.getMaxDropRate());
        // calculate the weight as the probability of a successful transmission to this
        // node multiplied by the client's self-defined weight (partition weight times
        // subset weight). Thus, the node's final weight takes into account both the
        // self-defined weight (to account for different hardware in the same cluster)
        // and the performance of the node (as defined by the node's degrader).
        double clientWeight = client.getPartitionWeight(partitionId) * client.getSubsetWeight(partitionId);
        double successfulTransmissionWeight = clientWeight * (1.0 - dropRate);
        debug(_log, "computed new weight for uri ", clientUri, ": ", successfulTransmissionWeight);
        // keep track if we're making actual changes to the Hash Ring in this updatePartitionState.
        int newPoints = (int) (successfulTransmissionWeight * pointsPerWeight);
        boolean quarantineEffect = false;
        if (isQuarantineEnabled) {
            if (quarantineMap.containsKey(client)) {
                // If the client is still in quarantine, keep the points to 0 so no real traffic will be used
                newPoints = 0;
                quarantineEffect = true;
            } else if (successfulTransmissionWeight <= 0.0 && clientWeight > EPSILON && degraderControl.isHigh()) {
                // The client's effective weight has dropped to zero and its degrader is high.
                // Quarantine it only if the quarantined fraction of clients stays under the
                // configured cap (quarantineMaxPercent, i.e. HTTP_LB_QUARANTINE_MAX_PERCENT)
                if (1.0 * quarantineMap.size() < Math.ceil(degraderTrackerClientUpdaters.size() * config.getQuarantineMaxPercent())) {
                    // Put the client into quarantine
                    LoadBalancerQuarantine quarantine = quarantineHistory.remove(client);
                    if (quarantine == null) {
                        quarantine = new LoadBalancerQuarantine(clientUpdater.getTrackerClient(), config, oldState.getServiceName());
                    }
                    quarantine.reset(clk);
                    quarantineMap.put(client, quarantine);
                    // reduce the points to 0 so no real traffic will be used
                    newPoints = 0;
                    _log.warn("TrackerClient {} is put into quarantine {}. OverrideDropRate = {}, callCount = {}, latency = {}," + " errorRate = {}", new Object[] { client.getUri(), quarantine, degraderControl.getMaxDropRate(), degraderControl.getCallCount(), degraderControl.getLatency(), degraderControl.getErrorRate() });
                    quarantineEffect = true;
                } else {
                    _log.error("Quarantine for service {} is full! Could not add {}", oldState.getServiceName(), client);
                }
            }
        }
    // If the client is in quarantine, skip enrolling it into the recovery program,
    // because we don't want this tracker client to get any traffic.
        if (!quarantineEffect && newPoints == 0 && clientWeight > EPSILON) {
            // We are choking off traffic to this tracker client.
            // Enroll this tracker client in the recovery program so that
            // we can make sure it still gets some traffic
            Double oldMaxDropRate = clientUpdater.getMaxDropRate();
            // set the default recovery level.
            newPoints = (int) (initialRecoveryLevel * pointsPerWeight);
            // The points map is only rewritten during the LOAD_BALANCE phase, so we only
            // add this client to the recovery map when the strategy is in that phase.
            if (!newRecoveryMap.containsKey(client) && strategy == PartitionDegraderLoadBalancerState.Strategy.LOAD_BALANCE) {
                // keep track of this client,
                newRecoveryMap.put(client, oldMaxDropRate);
                clientUpdater.setMaxDropRate(1.0 - initialRecoveryLevel);
            }
        }
        // also enroll new client into the recoveryMap if possible
        enrollNewClientInRecoveryMap(newRecoveryMap, oldState, config, degraderControl, clientUpdater);
        points.put(clientUri, newPoints);
        if (!oldPointsMap.containsKey(clientUri) || oldPointsMap.get(clientUri) != newPoints) {
            hashRingChanges = true;
            clientDegraded |= oldPointsMap.containsKey(clientUri) && (newPoints < oldPointsMap.get(clientUri));
        }
    }
    // if there were changes to the members of the cluster
    if ((strategy == PartitionDegraderLoadBalancerState.Strategy.LOAD_BALANCE && hashRingChanges) || oldState.getClusterGenerationId() != clusterGenerationId) {
        // atomic overwrite
        // try Call Dropping next time we updatePartitionState.
        List<DegraderTrackerClient> unHealthyClients = getUnhealthyTrackerClients(degraderTrackerClientUpdaters, points, quarantineMap, config, partitionId);
        newState = new PartitionDegraderLoadBalancerState(clusterGenerationId, config.getClock().currentTimeMillis(), true,
                oldState.getRingFactory(), points, PartitionDegraderLoadBalancerState.Strategy.CALL_DROPPING,
                currentOverrideDropRate, newCurrentAvgClusterLatency, newRecoveryMap, oldState.getServiceName(),
                oldState.getDegraderProperties(), totalClusterCallCount, clusterDropCount, clusterErrorCount,
                quarantineMap, quarantineHistory, activeClients, unHealthyClients.size());
        logState(oldState, newState, partitionId, config, unHealthyClients, clientDegraded);
    } else {
        // time to try call dropping strategy, if necessary.
        double newDropLevel = calculateNewDropLevel(config, currentOverrideDropRate, newCurrentAvgClusterLatency, totalClusterCallCount);
        if (newDropLevel != currentOverrideDropRate) {
            overrideClusterDropRate(partitionId, newDropLevel, degraderTrackerClientUpdaters);
        }
        // don't change the points map, but try load balancing strategy next time.
        // recoveryMap needs updating if quarantine or fastRecovery is enabled, because the client will not
        // have a chance to get in during the next interval (it was already evicted from quarantine, or it is
        // no longer a new client).
        List<DegraderTrackerClient> unHealthyClients = getUnhealthyTrackerClients(degraderTrackerClientUpdaters, oldPointsMap, quarantineMap, config, partitionId);
        newState = new PartitionDegraderLoadBalancerState(clusterGenerationId, config.getClock().currentTimeMillis(), true,
                oldState.getRingFactory(), oldPointsMap, PartitionDegraderLoadBalancerState.Strategy.LOAD_BALANCE,
                newDropLevel, newCurrentAvgClusterLatency, newRecoveryMap, oldState.getServiceName(),
                oldState.getDegraderProperties(), totalClusterCallCount, clusterDropCount, clusterErrorCount,
                quarantineMap, quarantineHistory, activeClients, unHealthyClients.size());
        logState(oldState, newState, partitionId, config, unHealthyClients, clientDegraded);
        points = oldPointsMap;
    }
    // adjust the min call count for each client based on the hash ring reduction and call dropping
    // fraction.
    overrideMinCallCount(partitionId, currentOverrideDropRate, degraderTrackerClientUpdaters, points, pointsPerWeight);
    return newState;
}
Also used: DegraderTrackerClient (com.linkedin.d2.balancer.clients.DegraderTrackerClient), HashMap (java.util.HashMap), DegraderControl (com.linkedin.util.degrader.DegraderControl), URI (java.net.URI), LoadBalancerQuarantine (com.linkedin.d2.balancer.strategies.LoadBalancerQuarantine), HashSet (java.util.HashSet)
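The Javadoc above describes two numeric mechanisms: a geometric recovery ramp (the previous recovery level multiplied by rampFactor each interval) and a watermark-driven global drop rate (stepped by globalStepUp/globalStepDown around highWaterMark/lowWaterMark). Below is a minimal, self-contained sketch of both mechanisms with illustrative constants; it is not the rest.li implementation, and the class and helper names are hypothetical.

public final class DegraderMechanicsSketch {

    // (a) Geometric recovery ramp: multiply the previous recovery level by rampFactor
    // on every interval in which the client saw no calls, capped at full weight (1.0).
    static double nextRecoveryLevel(double currentLevel, double rampFactor) {
        return Math.min(1.0, currentLevel * rampFactor);
    }

    // (b) Watermark-driven global drop rate: step up above highWaterMark, step down
    // below lowWaterMark, hold steady in between; clamp to [0.0, 1.0].
    static double nextGlobalDropRate(double currentDropRate, double avgLatencyMs,
                                     double highWaterMarkMs, double lowWaterMarkMs,
                                     double globalStepUp, double globalStepDown) {
        if (avgLatencyMs >= highWaterMarkMs) {
            return Math.min(1.0, currentDropRate + globalStepUp);
        }
        if (avgLatencyMs <= lowWaterMarkMs) {
            return Math.max(0.0, currentDropRate - globalStepDown);
        }
        return currentDropRate;
    }

    public static void main(String[] args) {
        // Recovery ramp from the Javadoc example: initialRecoveryLevel = 0.01,
        // rampFactor = 2, 100 hash points per full weight -> 1, 2, 4, 8, 16, 32 points.
        double level = 0.01;
        int pointsPerWeight = 100;
        for (int interval = 0; interval < 6; interval++) {
            System.out.printf("interval %d: recoveryLevel=%.2f -> %d hash points%n",
                    interval, level, (int) (level * pointsPerWeight));
            level = nextRecoveryLevel(level, 2.0);
        }

        // One step up (latency above the high watermark), then one step down (below the low).
        double dropRate = 0.0;
        dropRate = nextGlobalDropRate(dropRate, 3500, 3000, 500, 0.20, 0.05); // -> 0.20
        dropRate = nextGlobalDropRate(dropRate, 400, 3000, 500, 0.20, 0.05);  // -> 0.15
        System.out.printf("global drop rate after one step up and one step down: %.2f%n", dropRate);
    }
}

Running the sketch prints recovery levels of 0.01 through 0.32 (1, 2, 4, 8, 16, 32 hash points), matching the ramp pattern described in the Javadoc, followed by one step up and one step down of the global drop rate.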

Example 62 with DegraderLoadBalancerStrategyConfig

Use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.

From the class DegraderLoadBalancerState, method shutdown.

public void shutdown(DegraderLoadBalancerStrategyConfig config) {
    // Need to shutdown quarantine and release the related transport client
    if (config.getQuarantineMaxPercent() <= 0.0 || !_quarantineEnabled.get()) {
        return;
    }
    for (Partition par : _partitions.values()) {
        Lock lock = par.getLock();
        lock.lock();
        try {
            PartitionDegraderLoadBalancerState curState = par.getState();
            curState.getQuarantineMap().values().forEach(LoadBalancerQuarantine::shutdown);
        } finally {
            lock.unlock();
        }
    }
}
Also used: LoadBalancerQuarantine (com.linkedin.d2.balancer.strategies.LoadBalancerQuarantine), ReentrantLock (java.util.concurrent.locks.ReentrantLock), Lock (java.util.concurrent.locks.Lock)
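The shutdown path above takes each partition's lock and releases it in a finally block, so a failure while shutting down one partition's quarantines cannot leak the lock and stall the other partitions. A minimal sketch of that idiom, with a hypothetical Partition stand-in rather than the rest.li class:

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class PartitionShutdownSketch {

    // Hypothetical stand-in for a partition guarding some per-partition state.
    static final class Partition {
        private final Lock lock = new ReentrantLock();
        Lock getLock() { return lock; }
        void shutdownState() { /* release per-partition resources here */ }
    }

    static void shutdownAll(List<Partition> partitions) {
        for (Partition par : partitions) {
            Lock lock = par.getLock();
            lock.lock();
            try {
                par.shutdownState();
            } finally {
                // always release, even if shutdownState() throws
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        shutdownAll(Arrays.asList(new Partition(), new Partition()));
        System.out.println("all partitions shut down");
    }
}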

Aggregations

Test (org.testng.annotations.Test): 35
DegraderTrackerClient (com.linkedin.d2.balancer.clients.DegraderTrackerClient): 31
DegraderTrackerClientTest (com.linkedin.d2.balancer.clients.DegraderTrackerClientTest): 28
ArrayList (java.util.ArrayList): 26
URI (java.net.URI): 22
AtomicLong (java.util.concurrent.atomic.AtomicLong): 21
TrackerClient (com.linkedin.d2.balancer.clients.TrackerClient): 19
HashMap (java.util.HashMap): 16
DegraderImpl (com.linkedin.util.degrader.DegraderImpl): 13
RequestContext (com.linkedin.r2.message.RequestContext): 12
DegraderControl (com.linkedin.util.degrader.DegraderControl): 9
DegraderLoadBalancerStrategyConfig (com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig): 8
URIRequest (com.linkedin.d2.balancer.util.URIRequest): 7
CallCompletion (com.linkedin.util.degrader.CallCompletion): 7
DegraderTrackerClientImpl (com.linkedin.d2.balancer.clients.DegraderTrackerClientImpl): 5
DelegatingRingFactory (com.linkedin.d2.balancer.strategies.DelegatingRingFactory): 5
LoadBalancerQuarantine (com.linkedin.d2.balancer.strategies.LoadBalancerQuarantine): 5
List (java.util.List): 4
AtomicInteger (java.util.concurrent.atomic.AtomicInteger): 4
LoadBalancerState (com.linkedin.d2.balancer.LoadBalancerState): 3