36 Load Balancing Fair and Proper Use of Load Balancing Strategies Available Here Below

36 Load Balancing Fair and Proper Use of Load Balancing Strategies Available Here Below #

In the previous lesson, we learned about the definition of the LoadBalance interface and the content of the AbstractLoadBalance abstract class. We also provided a detailed explanation of the core principles and rough implementation of the ConsistentHashLoadBalance and RandomLoadBalance classes. In this lesson, we will continue introducing the remaining three implementations of LoadBalance.

LeastActiveLoadBalance #

LeastActiveLoadBalance uses the Least Active Request Load Balancing Algorithm. It believes that the fewer active requests a Provider node has, the more processing capacity remains, and the more efficient it is in handling requests. Therefore, we should prioritize assigning requests to such Provider nodes.

LeastActiveLoadBalance needs to be used together with ActiveLimitFilter. ActiveLimitFilter records the number of active requests for each method of the interface. When LeastActiveLoadBalance performs load balancing, it only selects Invokers from the set of Invokers with the least active requests.

In the implementation of LeastActiveLoadBalance, the first step is to select all the Invoker objects with the least active requests. The subsequent logic is exactly the same as RandomLoadBalance, which is to select the final Invoker object based on the weight of these Invoker objects. The following is the specific implementation of the LeastActiveLoadBalance.doSelect() method:

protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {

    // Initialize the number of Invokers
    int length = invokers.size();

    // Record the minimum active request count
    int leastActive = -1;

    // Record the number of Invokers with the minimum active request count
    int leastCount = 0;

    // Record the index positions of Invokers with the minimum active request count in the invokers array
    int[] leastIndexes = new int[length];

    // Record the weights of each Invoker in the Invoker set with the minimum active request count
    int[] weights = new int[length];

    // Record the sum of the weights of all Invokers in the Invoker set with the minimum active request count
    int totalWeight = 0;

    // Record the weight of the first Invoker in the Invoker set with the minimum active request count
    int firstWeight = 0;

    // Record whether all Invokers in the Invoker set with the minimum active request count have the same weight
    boolean sameWeight = true;

    for (int i = 0; i < length; i++) { // Iterate through all Invokers to get the Invoker set with the minimum active request count
        Invoker<T> invoker = invokers.get(i);

        // Get the active request count of the Invoker
        int active = RpcStatus.getStatus(invoker.getUrl(), invocation.getMethodName()).getActive();

        // Get the weight of the Invoker
        int afterWarmup = getWeight(invoker, invocation);

        weights[i] = afterWarmup;

        // Compare the active request counts
        if (leastActive == -1 || active < leastActive) {
            // The current Invoker is the first Invoker with the minimum active request count, so record the following information.
            leastActive = active; // Update the minimum active request count
            leastCount = 1; // Update the number of Invokers with the minimum active request count
            leastIndexes[0] = i; // Update the Invoker

            totalWeight = afterWarmup; // Update the total weight
            firstWeight = afterWarmup; // Record the weight value of the first Invoker
            sameWeight = true; // Update whether the weight values are equal
        } else if (active == leastActive) {
            // The current Invoker belongs to the Invoker set with the minimum active request count
            leastIndexes[leastCount++] = i; // Record the index of this Invoker

            totalWeight += afterWarmup; // Update the total weight

            if (sameWeight && afterWarmup != firstWeight) {
                sameWeight = false; // Update whether the weight values are equal
            }
        }
    }

    // If there is only one Invoker object with the minimum active request count, return it directly
    if (leastCount == 1) {
        return invokers.get(leastIndexes[0]);
...

} }

// According to the logic of RandomLoadBalance, select an Invoker object randomly from the collection of Invokers with the minimum active request count

if (!sameWeight && totalWeight > 0) {

int offsetWeight = ThreadLocalRandom.current().nextInt(totalWeight);

for (int i = 0; i < leastCount; i++) {

    int leastIndex = leastIndexes[i];

    offsetWeight -= weights[leastIndex];

    if (offsetWeight < 0) {

        return invokers.get(leastIndex);

    }

}

}

return invokers.get(leastIndexes[ThreadLocalRandom.current().nextInt(leastCount)]);

}

The specific principles of ActiveLimitFilter and the underlying RpcStatus for recording the count of active requests have been analyzed in detail in the previous [Lesson 30]. I won’t repeat it here. If there are any unclear points, you can review the relevant content of the previous lesson.

RoundRobinLoadBalance #

RoundRobinLoadBalance implements the Weighted Round-Robin Load Balancing Algorithm.

Round-robin means that each request is assigned to each provider in turn. For example, if there are three provider nodes A, B, and C, using the ordinary round-robin approach, the first request will be assigned to Provider A, the second request to Provider B, the third request to Provider C, and the fourth request to Provider A again…and so on.

Round-robin is a stateless load balancing algorithm that is simple to implement and is suitable for scenarios where all provider nodes in the cluster have similar performance. However, in reality, it is difficult to guarantee this, because it is easy to have the best and worst performing provider nodes in the cluster handling the same amount of traffic. This can cause the poorly performing provider nodes to be heavily loaded and even unable to respond in a timely manner, while the well-performing provider nodes have idle resource utilization. In this case, we can use weighted round-robin to reduce the load on the poorly performing provider nodes.

After weighting, the ratio of traffic assigned to each provider node will be close to or equal to their weight ratio. For example, if the weight ratio of provider nodes A, B, and C is 5:1:1, then in 7 requests, node A will receive 5 requests, node B will receive 1 request, and node C will also receive 1 request.

In versions prior to Dubbo 2.6.5, the implementation of RoundRobinLoadBalance had some issues, such as the performance issue in selecting Invokers and not being smooth enough in load balancing. These issues have been fixed in Dubbo 2.6.5 and later versions. So, here we will introduce the latest implementation of RoundRobinLoadBalance.

Each provider node has two weights: one is the configured weight, which does not change during load balancing; the other is the current weight, which is dynamically adjusted during load balancing and has an initial value of 0.

When a new request comes in, RoundRobinLoadBalance will iterate through the Invoker list and add its configured weight to its corresponding current weight. After the iteration is completed, the maximum current weight will be found, and the weight sum will be subtracted from it, and then the corresponding Invoker object will be returned.

Next, we will illustrate the execution process of RoundRobinLoadBalance with an example. Here, we still assume that the weight ratio of nodes A, B, and C is 5:1:1.

Handling the first request: add the weights in the currentWeight array to the configured weights, changing it from [0, 0, 0] to [5, 1, 1]. Then select the Invoker with the highest weight as the result, which is node A. Finally, subtract the total weight from the current weight of node A, resulting in the currentWeight array becoming [-2, 1, 1].
Handling the second request: add the weights in the currentWeight array to the configured weights, changing it from [-2, 1, 1] to [3, 2, 2]. Then select the Invoker with the highest weight as the result, which is node A. Finally, subtract the total weight from the current weight of node A, resulting in the currentWeight array becoming [-4, 2, 2].
Handling the third request: add the weights in the currentWeight array to the configured weights, changing it from [-4, 2, 2] to [1, 3, 3]. Then select the Invoker with the highest weight as the result, which is node B. Finally, subtract the total weight from the current weight of node B, resulting in the currentWeight array becoming [1, -4, 3].
Handling the fourth request: add the weights in the currentWeight array to the configured weights, changing it from [1, -4, 3] to [6, -3, 4]. Then select the Invoker with the highest weight as the result, which is node A. Finally, subtract the total weight from the current weight of node A, resulting in the currentWeight array becoming [-1, -3, 4].
Handling the fifth request: add the weights in the currentWeight array to the configured weights, changing it from [-1, -3, 4] to [4, -2, 5]. Then select the Invoker with the highest weight as the result, which is node C. Finally, subtract the total weight from the current weight of node C, resulting in the currentWeight array becoming [4, -2, -2].
Handling the sixth request: add the weights in the currentWeight array to the configured weights, changing it from [4, -2, -2] to [9, -1, -1]. Then select the Invoker with the highest weight as the result, which is node A. Finally, subtract the total weight from the current weight of node A, resulting in the currentWeight array becoming [2, -1, -1].
Handling the seventh request: add the weights in the currentWeight array to the configured weights, changing it from [2, -1, -1] to [7, 0, 0]. Then select the Invoker with the highest weight as the result, which is node A. Finally, subtract the total weight from the current weight of node A, resulting in the currentWeight array becoming [0, 0, 0].

At this point, a round-robin cycle is completed.

In version 2.6.4 of Dubbo, the result of a single round-robin cycle in the above example is [A, A, A, A, A, B, C], which means that the first 5 requests are all routed to node A. This will cause node A to receive a large number of requests in a short period of time, increasing its load significantly, while nodes B and C do not receive any requests and remain completely idle. This situation of “instantaneous imbalance in assignment” is what we referred to as the “non-smooth problem” earlier.

In the RoundRobinLoadBalance class, we create a corresponding WeightedRoundRobin object for each Invoker object to record the configured weight (weight field) and the current weight that changes with each load balancing algorithm execution (current field).

With an understanding of the WeightedRoundRobin inner class, let’s take a look at the specific implementation of the RoundRobinLoadBalance.doSelect() method:

protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {

    String key = invokers.get(0).getUrl().getServiceKey() + "." + invocation.getMethodName();

    // Get the mapping of WeightedRoundRobin corresponding to the entire Invoker list, create a new WeightedRoundRobin mapping if it is empty
    ConcurrentMap<String, WeightedRoundRobin> map = methodWeightMap.computeIfAbsent(key, k -> new ConcurrentHashMap<>());

    int totalWeight = 0;

    long maxCurrent = Long.MIN_VALUE;

    long now = System.currentTimeMillis(); // Get the current time

    Invoker<T> selectedInvoker = null;

    WeightedRoundRobin selectedWRR = null;

    for (Invoker<T> invoker : invokers) {

        String identifyString = invoker.getUrl().toIdentityString();

        int weight = getWeight(invoker, invocation);

        // Check if the current Invoker has a corresponding WeightedRoundRobin object, create one if it doesn't exist
        WeightedRoundRobin weightedRoundRobin = map.computeIfAbsent(identifyString, k -> {
            WeightedRoundRobin wrr = new WeightedRoundRobin();
            wrr.setWeight(weight);
```java

return wrr;

});

// Check if Invoker weight has changed, if so, update the weight field of WeightedRoundRobin

if (weight != weightedRoundRobin.getWeight()) {

    weightedRoundRobin.setWeight(weight);

}

// Increase currentWeight with the configured Weight

long cur = weightedRoundRobin.increaseCurrent();

// Set the lastUpdate field

weightedRoundRobin.setLastUpdate(now);

// Find the Invoker with the maximum currentWeight and its corresponding WeightedRoundRobin

if (cur > maxCurrent) {

    maxCurrent = cur;

    selectedInvoker = invoker;

    selectedWRR = weightedRoundRobin;

}

totalWeight += weight; // Calculate the sum of weights

}

if (invokers.size() != map.size()) {

map.entrySet().removeIf(item -> now - item.getValue().getLastUpdate() > RECYCLE_PERIOD);

}

if (selectedInvoker != null) {

// Subtract totalWeight from currentWeight

selectedWRR.sel(totalWeight);

// Return the selected Invoker object

return selectedInvoker;

}

return invokers.get(0);

}

ShortestResponseLoadBalance #

ShortestResponseLoadBalance is a LoadBalance implementation introduced in Dubbo 2.7 or later. It implements the algorithm of selecting the provider node with the shortest response time, which means selecting the provider node that has a successful invocation and the shortest response time from multiple provider nodes. However, there may be multiple provider nodes that meet this condition, so another random algorithm is used to select the final provider node to call.

After understanding the core principle of ShortestResponseLoadBalance, let’s take a look at the core implementation of the ShortestResponseLoadBalance.doSelect() method:


protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {

// Record the number of Invoker objects in the collection

int length = invokers.size();

// Record the shortest response time for all Invoker objects

long shortestResponse = Long.MAX_VALUE;

// Number of Invoker objects with the shortest response time

int shortestCount = 0;

// Indexes of all Invoker objects with the shortest response time

int[] shortestIndexes = new int[length];

// Store the weights of each Invoker object

int[] weights = new int[length];

// Store the total weight

int totalWeight = 0;

// Record the weight of the first Invoker object

int firstWeight = 0;

// Whether the weights of the Invoker objects in the shortest response time Invoker collection are the same

boolean sameWeight = true;

for (int i = 0; i < length; i++) {

    Invoker<T> invoker = invokers.get(i);

    RpcStatus rpcStatus = RpcStatus.getStatus(invoker.getUrl(), invocation.getMethodName());

    // Get the average elapsed time of successful invocations, the specific calculation method is: total elapsed time of successful invocations / total number of successful invocations = average elapsed time of successful invocations

    // The content of RpcStatus has been introduced in previous lessons, so I won't repeat it here

    long succeededAverageElapsed = rpcStatus.getSucceededAverageElapsed();

    // Get the current active requests of the provider, which is the number of requests currently being processed

    int active = rpcStatus.getActive();

    // Calculate the estimated time to process a new request, which is the approximate time it takes to process it if the current request is sent to this provider

    long estimateResponse = succeededAverageElapsed * active;

    // Calculate the weight of this Invoker object (mainly for preheating)

    int afterWarmup = getWeight(invoker, invocation);

    weights[i] = afterWarmup;

    if (estimateResponse < shortestResponse) {

        // The first Invoker object with the shortest response time is found, record its related information

        shortestResponse = estimateResponse;

        shortestCount = 1;

        shortestIndexes[0] = i;

        totalWeight = afterWarmup;

        firstWeight = afterWarmup;

        sameWeight = true;

    } else if (estimateResponse == shortestResponse) {

        // Multiple Invoker objects with the shortest response time

        shortestIndexes[shortestCount++] = i;

        totalWeight += afterWarmup;

        if (sameWeight && i > 0

                && afterWarmup != firstWeight) {

            sameWeight = false;

        }

    }

}

if (shortestCount == 1) {

    return invokers.get(shortestIndexes[0]);

}

// If the weights of all Invoker objects with the shortest response time are different, select an Invoker randomly using weighted random load balancing

if (!sameWeight && totalWeight > 0) {

    int offsetWeight = ThreadLocalRandom.current().nextInt(totalWeight);

    for (int i = 0; i < shortestCount; i++) {

        int shortestIndex = shortestIndexes[i];

        offsetWeight -= weights[shortestIndex];

        if (offsetWeight < 0) {

            return invokers.get(shortestIndex);

        }

    }

}

// If the weights of all Invoker objects with the shortest response time are the same, randomly return one

return invokers.get(shortestIndexes[ThreadLocalRandom.current().nextInt(shortestCount)]);

}

Summary #

Today we continue to introduce the remaining three implementations of the LoadBalance interface after the last lesson.

We first introduced the LeastActiveLoadBalance implementation, which uses the least active number load balancing algorithm to select the Provider node that is currently processing the fewest requests to handle the latest request. Then we introduced the RoundRobinLoadBalance implementation, which uses the weighted round-robin load balancing algorithm to address the issues caused by the pure round-robin load balancing algorithm. With the upgrade of Dubbo version, the problem of its own lack of smoothness has also been optimized. Finally, we introduced the ShortestResponseLoadBalance implementation, which selects a Provider node from the provider nodes with the shortest response time to handle new requests.