use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.
the class DegraderLoadBalancerStrategyV2 method updateState.
/**
* updateState
*
* We have two mechanisms to influence the health and traffic patterns of the client:
* load balancing (switching traffic from one host to another) and degrading service
* (dropping calls). We load balance by allocating points in a consistent hash ring based on the
* computedDropRate of the individual TrackerClients, which takes into account the latency
* seen by that TrackerClient's requests. Alternatively, if the cluster is
* unhealthy (as determined by a high latency watermark), we can drop a portion of traffic across all
* tracker clients corresponding to this cluster.
*
* The reason we do not currently consider error rate when adjusting the hash ring is that
* there are legitimate errors that servers can send back for clients to handle, such as
* 400 return codes. A potential improvement would be to catch transport-level exceptions and
* 500-level return codes, but the implications of that would need to be carefully understood and documented.
*
* We don't want to both reduce hash points and allow clients to manage their own drop rates,
* because the clients do not have the global view that the load balancing strategy does. Without
* a global view, a client won't know whether it already has a reduced number of hash points. If the
* client kept dropping at the same rate after its points had been reduced, then
* its outbound requests would be reduced both by the reduction in points and by the client's
* drop rate. To avoid this, the drop rate is managed globally by the load balancing strategy and
* provided to each client. The strategy will ALTERNATE between adjusting the hash ring points and
* the global drop rate in order to avoid double penalizing a client. See below:
*
* Period 1
* We find the average latency is greater than the high water mark.
* Then we increase the global drop rate for this cluster (say, from 0% to 20%)
* so 20% of all calls get dropped.
* .
* .
* Period 2
* The average latency is still higher than the high water mark, and we find
* it is especially high for a few specific clients in the cluster.
* Then we reduce the number of hash points for those clients in the hash ring, with the hope that we'll
* redirect the traffic to "healthier" clients and reduce the average latency.
* .
* .
* Period 3
* The average latency is still higher than the high water mark.
* Then we alternate strategy again by increasing the global drop rate for the whole cluster.
* .
* .
* We repeat this until the latency falls below the high water mark but stays above the low water
* mark, at which point we maintain the state. If the latency drops below the low water mark, the
* cluster is getting healthier and can serve more traffic, so we start recovery as explained below.
*
* We also have a mechanism for recovery if the number of points in the hash ring is not
* enough to receive traffic. The initialRecoveryLevel is a number between 0.0 and 1.0, and
* corresponds to a fraction of the tracker client's full hash points, e.g. if a client
* has a default of 100 hash points in a ring, 0.0 means there are 0 points for the client in the ring
* and 1.0 means there are 100 points in the ring for the client.
* The second configuration, rampFactor, will geometrically increase the
* previous recoveryLevel if traffic still hasn't been seen for that tracker client.
*
* The reason for using a weight instead of real points is to allow an initialRecoveryLevel that corresponds to
* less than one hash point. This is useful if a "cooling off" period is desirable for
* misbehaving tracker clients, i.e. given a full weight of 100 hash points, an initialRecoveryLevel of 0.005
* (0 hash points at the start) and a rampFactor of 2 mean that there will be one cooling-off period before the
* client is reintroduced into the hash ring (see below).
*
* Period 1
* 100 * 0.005 = 0.5 points -> so nothing in the hash ring
*
* Period 2
* 100 * (0.005 * 2, because of the rampFactor) = 1 point -> so we'll add one point to the hash ring
*
* Another example: given initialRecoveryLevel = 0.01, rampFactor = 2, and a default tracker client hash
* point count of 100, we will increase the hash points in this pattern on successive updateStates:
* 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, etc. -> 1, 2, 4, 8, 16, 32 points in the hash ring, stopping the ramp
* as soon as calls are recorded for that tracker client.
*
* We also have highWaterMark and lowWaterMark as properties of the DegraderLoadBalancer strategy
* so that the strategy can make decisions on whether to start dropping traffic GLOBALLY across
* all tracker clients for this cluster. The amount of traffic to drop is controlled by the
* globalStepUp and globalStepDown properties, where globalStepUp controls how much the global
* drop rate increases per interval, and globalStepDown controls how much the global drop rate
* decreases per interval. We only step up the global drop rate when the average cluster latency
* is higher than the highWaterMark, and only step down the global drop rate when the average
* cluster latency is lower than the lowWaterMark.
*
* This code is thread reentrant. Multiple threads can potentially call this concurrently, and so
* callers must pass in the DegraderLoadBalancerState that they based their shouldUpdate() call on.
* The multiple threads may have different views of the trackerClients' latency, but this is
* ok as the new state in the end will have only taken one action (either load balancing or
* call dropping with at most one step). Currently we will not call this concurrently, as
* checkUpdateState will control entry to a single thread.
*
* @param clusterGenerationId
* @param trackerClients
* @param oldState
* @param config
*/
private static DegraderLoadBalancerState updateState(long clusterGenerationId, List<TrackerClient> trackerClients, DegraderLoadBalancerState oldState, DegraderLoadBalancerStrategyConfig config) {
debug(_log, "updating state for: ", trackerClients);
double sumOfClusterLatencies = 0.0;
double computedClusterDropSum = 0.0;
double computedClusterWeight = 0.0;
long totalClusterCallCount = 0;
boolean hashRingChanges = false;
boolean recoveryMapChanges = false;
DegraderLoadBalancerState.Strategy strategy = oldState.getStrategy();
Map<TrackerClient, Double> oldRecoveryMap = oldState.getRecoveryMap();
Map<TrackerClient, Double> newRecoveryMap = new HashMap<TrackerClient, Double>(oldRecoveryMap);
double currentOverrideDropRate = oldState.getCurrentOverrideDropRate();
double initialRecoveryLevel = config.getInitialRecoveryLevel();
double ringRampFactor = config.getRingRampFactor();
int pointsPerWeight = config.getPointsPerWeight();
DegraderLoadBalancerState newState;
for (TrackerClient client : trackerClients) {
double averageLatency = client.getDegraderControl(DEFAULT_PARTITION_ID).getLatency();
long callCount = client.getDegraderControl(DEFAULT_PARTITION_ID).getCallCount();
oldState.getPreviousMaxDropRate().put(client, client.getDegraderControl(DEFAULT_PARTITION_ID).getMaxDropRate());
sumOfClusterLatencies += averageLatency * callCount;
totalClusterCallCount += callCount;
double clientDropRate = client.getDegraderControl(DEFAULT_PARTITION_ID).getCurrentComputedDropRate();
computedClusterDropSum += client.getPartitionWeight(DEFAULT_PARTITION_ID) * clientDropRate;
computedClusterWeight += client.getPartitionWeight(DEFAULT_PARTITION_ID);
boolean recoveryMapContainsClient = newRecoveryMap.containsKey(client);
// The following block gradually restores points in the hash ring for clients that are
// receiving no traffic (the recovery program).
if (callCount == 0) {
// No calls were recorded for this client; that may be because it was choked off, or
// due solely to low volume.
if (recoveryMapContainsClient) {
// Only adjust the maxDropRate on the load balancing step; on the call dropping step
// we leave the client alone, even if that step ends up doing nothing for it.
if (strategy == DegraderLoadBalancerState.Strategy.LOAD_BALANCE) {
double oldMaxDropRate = client.getDegraderControl(DEFAULT_PARTITION_ID).getMaxDropRate();
double transmissionRate = 1.0 - oldMaxDropRate;
if (transmissionRate <= 0.0) {
// We use the initialRecoveryLevel to indicate how many points to initially set
// the tracker client to when traffic has stopped flowing to this node.
transmissionRate = initialRecoveryLevel;
} else {
transmissionRate *= ringRampFactor;
transmissionRate = Math.min(transmissionRate, 1.0);
}
double newMaxDropRate = 1.0 - transmissionRate;
client.getDegraderControl(DEFAULT_PARTITION_ID).setMaxDropRate(newMaxDropRate);
}
recoveryMapChanges = true;
}
} else if (recoveryMapContainsClient) {
// The call count was > 0 and the recovery map contains the client (otherwise we
// don't really need to change the client's maxDropRate).
// Tough love here: once the rehab clients start taking traffic, we
// restore their maxDropRate to its original value, and unenroll them
// from the program.
// This is safe because the hash ring points are controlled by the
// computedDropRate variable, and the call dropping rate is controlled by
// the overrideDropRate. The maxDropRate only serves to cap the computedDropRate and
// overrideDropRate.
// We store the maxDropRate and restore it here because the initialRecoveryLevel could
// potentially be higher than what the default maxDropRate allowed. (the maxDropRate doesn't
// necessarily have to be 1.0). For instance, if the maxDropRate was 0.99, and the
// initialRecoveryLevel was 0.05 then we need to store the old maxDropRate.
client.getDegraderControl(DEFAULT_PARTITION_ID).setMaxDropRate(newRecoveryMap.get(client));
newRecoveryMap.remove(client);
recoveryMapChanges = true;
}
}
double computedClusterDropRate = computedClusterDropSum / computedClusterWeight;
debug(_log, "total cluster call count: ", totalClusterCallCount);
debug(_log, "computed cluster drop rate for ", trackerClients.size(), " nodes: ", computedClusterDropRate);
if (oldState.getClusterGenerationId() == clusterGenerationId && totalClusterCallCount <= 0 && !recoveryMapChanges) {
// if the cluster has not been called recently (total cluster call count is <= 0)
// and we already have a state with the same set of URIs (same cluster generation),
// and no clients are in rehab, then don't change anything.
debug(_log, "New state is the same as the old state so we're not changing anything. Old state = ", oldState, ", config=", config);
return new DegraderLoadBalancerState(oldState, clusterGenerationId, config.getUpdateIntervalMs(), config.getClock().currentTimeMillis());
}
// update our overrides.
double newCurrentAvgClusterLatency = -1;
if (totalClusterCallCount > 0) {
newCurrentAvgClusterLatency = sumOfClusterLatencies / totalClusterCallCount;
}
debug(_log, "average cluster latency: ", newCurrentAvgClusterLatency);
// This points map stores how many hash ring points to allocate for each tracker client.
Map<URI, Integer> points = new HashMap<URI, Integer>();
Map<URI, Integer> oldPointsMap = oldState.getPointsMap();
for (TrackerClient client : trackerClients) {
double successfulTransmissionWeight;
URI clientUri = client.getUri();
// Don't take into account cluster health when calculating the number of points
// for each client. This is because the individual clients already take into account
// latency, and a successfulTransmissionWeight can and should be made
// independent of other nodes in the cluster. Otherwise, one unhealthy client in a small
// cluster can take down the entire cluster if the avg latency is too high.
// The global drop rate will take into account the cluster latency. High cluster-wide error
// rates are not something d2 can address.
//
// this client's maxDropRate and currentComputedDropRate may have been adjusted if it's in the
// rehab program (to gradually send traffic it's way).
double dropRate = Math.min(client.getDegraderControl(DEFAULT_PARTITION_ID).getCurrentComputedDropRate(), client.getDegraderControl(DEFAULT_PARTITION_ID).getMaxDropRate());
// calculate the weight as the probability of a successful transmission to this node
// multiplied by the client's self-defined weight. Thus, the node's final weight
// takes into account both the self-defined weight (to account for different
// hardware in the same cluster) and the performance of the node (as defined by the
// node's degrader).
successfulTransmissionWeight = client.getPartitionWeight(DEFAULT_PARTITION_ID) * (1.0 - dropRate);
debug(_log, "computed new weight for uri ", clientUri, ": ", successfulTransmissionWeight);
// keep track of whether we're making actual changes to the hash ring in this updateState.
int newPoints = (int) (successfulTransmissionWeight * pointsPerWeight);
if (newPoints == 0) {
// We are choking off traffic to this tracker client.
// Enroll this tracker client in the recovery program so that
// we can make sure it still gets some traffic
Double oldMaxDropRate = client.getDegraderControl(DEFAULT_PARTITION_ID).getMaxDropRate();
// set the default recovery level.
newPoints = (int) (initialRecoveryLevel * pointsPerWeight);
// Keep track of the original maxDropRate
if (!newRecoveryMap.containsKey(client)) {
// keep track of this client so we can restore its maxDropRate once it takes traffic again.
newRecoveryMap.put(client, oldMaxDropRate);
client.getDegraderControl(DEFAULT_PARTITION_ID).setMaxDropRate(1.0 - initialRecoveryLevel);
}
}
points.put(clientUri, newPoints);
if (!oldPointsMap.containsKey(clientUri) || oldPointsMap.get(clientUri) != newPoints) {
hashRingChanges = true;
}
}
// if there were changes to the members of the cluster
if ((strategy == DegraderLoadBalancerState.Strategy.LOAD_BALANCE && hashRingChanges) ||
oldState.getClusterGenerationId() != clusterGenerationId) {
// atomic overwrite
// try Call Dropping next time we updateState.
newState = new DegraderLoadBalancerState(config.getUpdateIntervalMs(), clusterGenerationId, points, config.getClock().currentTimeMillis(), DegraderLoadBalancerState.Strategy.CALL_DROPPING, currentOverrideDropRate, newCurrentAvgClusterLatency, true, newRecoveryMap, oldState.getServiceName(), oldState.getDegraderProperties(), totalClusterCallCount);
logState(oldState, newState, config, trackerClients);
} else {
// time to try call dropping strategy, if necessary.
// we are explicitly setting the override drop rate to a number between 0 and 1, inclusive.
double newDropLevel = Math.max(0.0, currentOverrideDropRate);
// Step the drop level up or down to get the cluster latency stabilized.
if (newCurrentAvgClusterLatency > 0 && totalClusterCallCount >= config.getMinClusterCallCountHighWaterMark()) {
// The call volume is high enough for the latency measurement to be statistically significant.
if (newCurrentAvgClusterLatency >= config.getHighWaterMark() && currentOverrideDropRate != 1.0) {
// if the cluster latency is too high and we can drop more traffic
newDropLevel = Math.min(1.0, newDropLevel + config.getGlobalStepUp());
} else if (newCurrentAvgClusterLatency <= config.getLowWaterMark() && currentOverrideDropRate != 0.0) {
// else if the cluster latency is good and we can reduce the override drop rate
newDropLevel = Math.max(0.0, newDropLevel - config.getGlobalStepDown());
}
// else the averageClusterLatency is between Low and High, or we can't change anything more,
// then do not change anything.
} else if (newCurrentAvgClusterLatency > 0 && totalClusterCallCount >= config.getMinClusterCallCountLowWaterMark()) {
// The call volume is too low to justify degrading further, but we might recover a bit if the latency is healthy.
if (newCurrentAvgClusterLatency <= config.getLowWaterMark() && currentOverrideDropRate != 0.0) {
// the cluster latency is good and we can reduce the override drop rate
newDropLevel = Math.max(0.0, newDropLevel - config.getGlobalStepDown());
}
// else the averageClusterLatency is somewhat high but since the qps is not that high, we shouldn't degrade
} else {
// if we enter here that means we have very low traffic. We should reduce the overrideDropRate, if possible.
// when we have below 1 QPS traffic, we should be pretty confident that the cluster can handle very low
// traffic. Of course this is depending on the MinClusterCallCountLowWaterMark that the service owner sets.
// Another possible cause for this is if we had somehow choked off all traffic to the cluster, most
// likely in a one node/small cluster scenario. Obviously, we can't check latency here,
// we'll have to rely on the metric in the next updateState. If the cluster is still having
// latency problems, then we will oscillate between off and letting a little traffic through,
// and that is acceptable. If the latency, though high, is deemed acceptable, then the
// watermarks can be adjusted to let more traffic through.
newDropLevel = Math.max(0.0, newDropLevel - config.getGlobalStepDown());
}
if (newDropLevel != currentOverrideDropRate) {
overrideClusterDropRate(newDropLevel, trackerClients);
}
// don't change the points map or the recoveryMap, but try load balancing strategy next time.
newState = new DegraderLoadBalancerState(config.getUpdateIntervalMs(), clusterGenerationId, oldPointsMap, config.getClock().currentTimeMillis(), DegraderLoadBalancerState.Strategy.LOAD_BALANCE, newDropLevel, newCurrentAvgClusterLatency, true, oldRecoveryMap, oldState.getServiceName(), oldState.getDegraderProperties(), totalClusterCallCount);
logState(oldState, newState, config, trackerClients);
points = oldPointsMap;
}
// adjust the min call count for each client based on the hash ring reduction and call dropping
// fraction.
overrideMinCallCount(currentOverrideDropRate, trackerClients, points, pointsPerWeight);
return newState;
}
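To make the recovery ramp described in the javadoc concrete, here is a minimal, standalone sketch of the arithmetic the loop above applies to a client that is receiving no traffic: the transmission rate (1.0 - maxDropRate) is seeded with initialRecoveryLevel and multiplied by ringRampFactor on every update interval in which the client still sees no calls, and the hash points are the transmission rate times pointsPerWeight. The class below is illustrative only and is not part of rest.li; the parameter values are the ones used in the javadoc example.

public final class RecoveryRampSketch {
    public static void main(String[] args) {
        // Values taken from the javadoc example above; illustrative, not read from a real config.
        double initialRecoveryLevel = 0.01;
        double ringRampFactor = 2.0;
        int pointsPerWeight = 100;
        // 1.0 - maxDropRate; 0.0 means the client is fully choked off.
        double transmissionRate = 0.0;
        for (int interval = 1; interval <= 8; interval++) {
            if (transmissionRate <= 0.0) {
                // first quiet interval: seed the ramp with the initial recovery level
                transmissionRate = initialRecoveryLevel;
            } else {
                // subsequent quiet intervals: grow geometrically, capped at full weight
                transmissionRate = Math.min(1.0, transmissionRate * ringRampFactor);
            }
            int points = (int) (transmissionRate * pointsPerWeight);
            System.out.printf("interval %d: transmissionRate=%.2f -> %d hash points%n", interval, transmissionRate, points);
        }
        // Prints 1, 2, 4, 8, 16, 32, 64, 100 points. In the real strategy the ramp stops as soon
        // as calls are recorded for the client, and the stored maxDropRate is then restored.
    }
}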
use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.
the class DegraderLoadBalancerStrategyV2 method setConfig.
public void setConfig(DegraderLoadBalancerStrategyConfig config) {
_config = config;
String hashMethod = _config.getHashMethod();
Map<String, Object> hashConfig = _config.getHashConfig();
if (hashMethod == null || hashMethod.equals(HASH_METHOD_NONE)) {
_hashFunction = new RandomHash();
} else if (HASH_METHOD_URI_REGEX.equals(hashMethod)) {
_hashFunction = new URIRegexHash(hashConfig);
} else {
_log.warn("Unknown hash method {}, falling back to random", hashMethod);
_hashFunction = new RandomHash();
}
}
use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.
the class DegraderLoadBalancerStrategyV2 method isNewStateHealthy.
static boolean isNewStateHealthy(DegraderLoadBalancerState newState, DegraderLoadBalancerStrategyConfig config, List<TrackerClient> trackerClients) {
if (newState.getCurrentAvgClusterLatency() > config.getLowWaterMark()) {
return false;
}
Map<URI, Integer> pointsMap = newState.getPointsMap();
for (TrackerClient client : trackerClients) {
int perfectHealth = (int) (client.getPartitionWeight(DEFAULT_PARTITION_ID) * config.getPointsPerWeight());
Integer point = pointsMap.get(client.getUri());
if (point < perfectHealth) {
return false;
}
}
return true;
}
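In other words, a state is only considered healthy when the average cluster latency is at or below the low water mark and every client already has its full complement of hash points. A small standalone illustration of that per-client check, using made-up hosts, weights, and points (the class and values below are illustrative, not rest.li types):

import java.util.HashMap;
import java.util.Map;

public final class PerfectHealthSketch {
    public static void main(String[] args) {
        int pointsPerWeight = 100;
        // partition weight per host and the points currently in the hash ring (illustrative values)
        Map<String, Double> weights = new HashMap<String, Double>();
        Map<String, Integer> pointsMap = new HashMap<String, Integer>();
        weights.put("http://host-a", 1.0);
        pointsMap.put("http://host-a", 100); // full health
        weights.put("http://host-b", 0.5);
        pointsMap.put("http://host-b", 40);  // degraded: perfect health would be 50
        boolean healthy = true;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            int perfectHealth = (int) (entry.getValue() * pointsPerWeight);
            if (pointsMap.get(entry.getKey()) < perfectHealth) {
                healthy = false;
            }
        }
        System.out.println("new state healthy? " + healthy); // false because of host-b
    }
}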
use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.
the class DegraderLoadBalancerTest method testDegraderLoadBalancerHandlingExceptionInUpdate.
@Test(groups = { "small", "back-end" })
public void testDegraderLoadBalancerHandlingExceptionInUpdate() {
Map<String, Object> myMap = new HashMap<String, Object>();
Long timeInterval = 5000L;
TestClock clock = new TestClock();
myMap.put(PropertyKeys.CLOCK, clock);
myMap.put(PropertyKeys.HTTP_LB_STRATEGY_PROPERTIES_UPDATE_INTERVAL_MS, timeInterval);
Map<String, String> degraderProperties = new HashMap<String, String>();
degraderProperties.put(PropertyKeys.DEGRADER_HIGH_ERROR_RATE, "0.5");
degraderProperties.put(PropertyKeys.DEGRADER_LOW_ERROR_RATE, "0.2");
DegraderImpl.Config degraderConfig = DegraderConfigFactory.toDegraderConfig(degraderProperties);
final List<TrackerClient> clients = createTrackerClient(3, clock, degraderConfig);
DegraderLoadBalancerStrategyConfig unbrokenConfig = DegraderLoadBalancerStrategyConfig.createHttpConfigFromMap(myMap);
DegraderLoadBalancerStrategyConfig brokenConfig = new MockDegraderLoadBalancerStrategyConfig(unbrokenConfig);
URI uri4 = URI.create("http://test.linkedin.com:10010/abc4");
//this client will throw an exception when getDegraderControl is called, hence triggering a failed state update
BrokenTrackerClient brokenClient = new BrokenTrackerClient(uri4, getDefaultPartitionData(1d), new TestLoadBalancerClient(uri4), clock, null);
clients.add(brokenClient);
//test DegraderLoadBalancerStrategyV2_1 when the strategy is LOAD_BALANCE
final DegraderLoadBalancerStrategyV2_1 strategyV2 = new DegraderLoadBalancerStrategyV2_1(brokenConfig, "testStrategyV2", null);
DegraderLoadBalancerStrategyAdapter strategyAdapterV2 = new DegraderLoadBalancerStrategyAdapter(strategyV2);
//simulate 100 threads trying to get a client at the same time. Make sure that they won't be blocked if an exception
//occurs during updateState()
runMultiThreadedTest(strategyAdapterV2, clients, 100, true);
DegraderLoadBalancerStrategyV2_1.DegraderLoadBalancerState stateV2 = strategyV2.getState();
// only one exception would occur and the other threads would succeed in initializing immediately after
assertTrue(stateV2.isInitialized());
assertEquals(stateV2.getStrategy(), DegraderLoadBalancerStrategyV2_1.DegraderLoadBalancerState.Strategy.CALL_DROPPING);
brokenClient.reset();
//test DegraderLoadBalancerStrategyV3 when the strategy is LOAD_BALANCE
DegraderLoadBalancerStrategyV3 strategyV3 = new DegraderLoadBalancerStrategyV3(brokenConfig, "testStrategyV3", null);
DegraderLoadBalancerStrategyAdapter strategyAdapterV3 = new DegraderLoadBalancerStrategyAdapter(strategyV3);
//simulate 100 threads trying to get a client at the same time. Make sure that they won't be blocked if an exception
//occurs during updateState()
runMultiThreadedTest(strategyAdapterV3, clients, 100, true);
DegraderLoadBalancerStrategyV3.PartitionDegraderLoadBalancerState stateV3 = strategyV3.getState().getPartitionState(0);
// only one exception would occur and the other threads would succeed in initializing immediately after
assertTrue(stateV3.isInitialized());
assertEquals(stateV3.getStrategy(), DegraderLoadBalancerStrategyV3.PartitionDegraderLoadBalancerState.Strategy.CALL_DROPPING);
brokenClient.reset();
// test DegraderLoadBalancerStrategy when the strategy is CALL_DROPPING. We have to prepare the
// environment by simulating lots of high-latency calls to the tracker clients
int numberOfCallsPerClient = 10;
List<CallCompletion> callCompletions = new ArrayList<CallCompletion>();
for (TrackerClient client : clients) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
callCompletions.add(client.getCallTracker().startCall());
}
}
clock.addMs(brokenConfig.getUpdateIntervalMs() - 1000);
for (CallCompletion cc : callCompletions) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
cc.endCall();
}
}
clock.addMs(1000);
Map<TrackerClient, TrackerClientMetrics> beforeStateUpdate = getTrackerClientMetrics(clients);
//test DegraderLoadBalancerStrategyV2_1 when the strategy is CALL_DROPPING
strategyV2.setStrategy(DegraderLoadBalancerStrategyV2_1.DegraderLoadBalancerState.Strategy.CALL_DROPPING);
strategyV3.setStrategy(DEFAULT_PARTITION_ID, DegraderLoadBalancerStrategyV3.PartitionDegraderLoadBalancerState.Strategy.CALL_DROPPING);
runMultiThreadedTest(strategyAdapterV2, clients, 100, true);
stateV2 = strategyV2.getState();
//MockDegraderLoadBalancerStrategyConfig's getHighWaterMark should have been called and thrown an exception every time, so the update would fail for every thread
// no side-effects on state when update fails
assertEquals(stateV2.getStrategy(), DegraderLoadBalancerStrategyV2_1.DegraderLoadBalancerState.Strategy.CALL_DROPPING);
// no side-effects on tracker clients when update fails
Map<TrackerClient, TrackerClientMetrics> afterFailedV2StateUpdate = getTrackerClientMetrics(clients);
for (TrackerClient client : clients) {
assertEquals(beforeStateUpdate.get(client), afterFailedV2StateUpdate.get(client));
}
runMultiThreadedTest(strategyAdapterV3, clients, 100, true);
stateV3 = strategyV3.getState().getPartitionState(0);
// no side-effects on state when update fails
assertEquals(stateV3.getStrategy(), DegraderLoadBalancerStrategyV3.PartitionDegraderLoadBalancerState.Strategy.CALL_DROPPING);
// no side-effects on tracker clients when update fails
Map<TrackerClient, TrackerClientMetrics> afterFailedV3StateUpdate = getTrackerClientMetrics(clients);
for (TrackerClient client : clients) {
assertEquals(beforeStateUpdate.get(client), afterFailedV3StateUpdate.get(client));
}
brokenClient.reset();
//this time we'll change the config to the correct one so it won't throw an exception when the strategy is CALL_DROPPING
// update would succeed and state and trackerclients are expected to be mutated
callCompletions.clear();
for (TrackerClient client : clients) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
callCompletions.add(client.getCallTracker().startCall());
}
}
clock.addMs(brokenConfig.getUpdateIntervalMs() - 1000);
for (CallCompletion cc : callCompletions) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
cc.endCall();
}
}
clock.addMs(1000);
strategyV2.setConfig(unbrokenConfig);
beforeStateUpdate = getTrackerClientMetrics(clients);
// when we run this, the strategy is CALL_DROPPING, and our clients' latency is 4000 ms, so our current override
// drop rate is going to be 0.2. That means occasionally the tracker client we get will be null
runMultiThreadedTest(strategyAdapterV2, clients, 100, false);
stateV2 = strategyV2.getState();
// This time the update should succeed, and both the state and the tracker clients are updated
Map<TrackerClient, TrackerClientMetrics> afterV2StateUpdate = getTrackerClientMetrics(clients);
for (TrackerClient client : clients) {
assertNotEquals(beforeStateUpdate.get(client), afterV2StateUpdate.get(client));
}
assertEquals(stateV2.getStrategy(), DegraderLoadBalancerStrategyV2_1.DegraderLoadBalancerState.Strategy.LOAD_BALANCE);
brokenClient.reset();
// reset metrics on tracker client's degrader control
for (TrackerClient client : clients) {
TrackerClientMetrics originalMetrics = beforeStateUpdate.get(client);
DegraderControl degraderControl = client.getDegraderControl(DEFAULT_PARTITION_ID);
degraderControl.setOverrideDropRate(originalMetrics._overrideDropRate);
degraderControl.setMaxDropRate(originalMetrics._maxDropRate);
degraderControl.setOverrideMinCallCount(originalMetrics._overrideMinCallCount);
}
callCompletions.clear();
for (TrackerClient client : clients) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
callCompletions.add(client.getCallTracker().startCall());
}
}
clock.addMs(brokenConfig.getUpdateIntervalMs() - 1000);
for (CallCompletion cc : callCompletions) {
for (int i = 0; i < numberOfCallsPerClient; i++) {
cc.endCall();
}
}
clock.addMs(1000);
strategyV3.setConfig(unbrokenConfig);
beforeStateUpdate = getTrackerClientMetrics(clients);
runMultiThreadedTest(strategyAdapterV3, clients, 100, false);
stateV3 = strategyV3.getState().getPartitionState(0);
// This time the update should succeed, and both the state and the tracker clients are updated
Map<TrackerClient, TrackerClientMetrics> afterV3StateUpdate = getTrackerClientMetrics(clients);
for (TrackerClient client : clients) {
assertNotEquals(beforeStateUpdate.get(client), afterV3StateUpdate.get(client));
}
assertEquals(stateV3.getStrategy(), DegraderLoadBalancerStrategyV3.PartitionDegraderLoadBalancerState.Strategy.LOAD_BALANCE);
}
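The 0.2 override drop rate mentioned in the comments above follows directly from the global stepping logic in updateState: with a statistically significant call count and an average latency above the high water mark, the strategy raises the override drop rate by one globalStepUp per CALL_DROPPING update. The sketch below reproduces that single step; the water marks and step sizes are assumptions about the library defaults, not values read from DegraderLoadBalancerStrategyConfig, so treat them as illustrative.

public final class DropRateStepSketch {
    public static void main(String[] args) {
        // Assumed defaults; verify against DegraderLoadBalancerStrategyConfig before relying on them.
        double highWaterMark = 3000.0; // ms
        double lowWaterMark = 500.0;   // ms
        double globalStepUp = 0.2;
        double globalStepDown = 0.2;
        double overrideDropRate = 0.0;
        double avgClusterLatency = 4000.0; // the simulated latency from the test above
        // One CALL_DROPPING update with a statistically significant call count:
        if (avgClusterLatency >= highWaterMark && overrideDropRate != 1.0) {
            overrideDropRate = Math.min(1.0, overrideDropRate + globalStepUp);
        } else if (avgClusterLatency <= lowWaterMark && overrideDropRate != 0.0) {
            overrideDropRate = Math.max(0.0, overrideDropRate - globalStepDown);
        }
        System.out.println("override drop rate after one update: " + overrideDropRate); // 0.2
    }
}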
use of com.linkedin.d2.balancer.strategies.degrader.DegraderLoadBalancerStrategyConfig in project rest.li by linkedin.
the class DegraderLoadBalancerTest method testMediumTrafficHighLatency1Client.
@Test(groups = { "small", "back-end" })
public void testMediumTrafficHighLatency1Client() {
Map<String, Object> myMap = new HashMap<String, Object>();
Long timeInterval = 5000L;
TestClock clock = new TestClock();
myMap.put(PropertyKeys.CLOCK, clock);
myMap.put(PropertyKeys.HTTP_LB_STRATEGY_PROPERTIES_UPDATE_INTERVAL_MS, timeInterval);
Map<String, String> degraderProperties = new HashMap<String, String>();
degraderProperties.put(PropertyKeys.DEGRADER_HIGH_ERROR_RATE, "0.5");
degraderProperties.put(PropertyKeys.DEGRADER_LOW_ERROR_RATE, "0.2");
DegraderImpl.Config degraderConfig = DegraderConfigFactory.toDegraderConfig(degraderProperties);
double qps = 5.7;
//test Strategy V3
List<TrackerClient> clients = createTrackerClient(1, clock, degraderConfig);
DegraderLoadBalancerStrategyConfig config = DegraderLoadBalancerStrategyConfig.createHttpConfigFromMap(myMap);
DegraderLoadBalancerStrategyV3 strategyV3 = new DegraderLoadBalancerStrategyV3(config, "DegraderLoadBalancerTest", null);
DegraderLoadBalancerStrategyAdapter strategy = new DegraderLoadBalancerStrategyAdapter(strategyV3);
testDegraderLoadBalancerSimulator(strategy, clock, timeInterval, clients, qps, degraderConfig);
//test Strategy V2
clients = createTrackerClient(1, clock, degraderConfig);
config = DegraderLoadBalancerStrategyConfig.createHttpConfigFromMap(myMap);
DegraderLoadBalancerStrategyV2_1 strategyV2 = new DegraderLoadBalancerStrategyV2_1(config, "DegraderLoadBalancerTest", null);
strategy = new DegraderLoadBalancerStrategyAdapter(strategyV2);
testDegraderLoadBalancerSimulator(strategy, clock, timeInterval, clients, qps, degraderConfig);
}
Aggregations