Load Balancing with a Dynamic Set of Balls and Bins
Three BARC researchers have reached a new result yielding the right understanding of load-balancing in a dynamic environment and hereby presenting a key to drastically cutting server based CO2 emissions in the future.
Distributing balls into bins is a common problem in computer systems. For example, the balls can model clients while bins model servers and we want to distribute the clients to the servers. This problem is essential for streaming services such as Vimeo and Youtube, or cloud services such as Google Cloud. The problem is very dynamic in that servers can both be added and removed and clients can come and go. At any given point, we want to be able to efficiently find the server of a given client.
The classic solution to the problem is called Consistent Hashing (presented at STOC ’97 by Karger et al.) which can distribute clients to servers in the above dynamic environment. However, classic Consistent Hashing does not guarantee load balancing between the servers: Some servers may serve very few clients while others may end up vastly overloaded. At SODA ’18, Mirrokni, Thorup, and Zadimoghaddam showed how to enforce load-balancing. For a user specified parameter x, their system guarantees that no server gets load more than 1+x times the average. For example, with x=0.1 this means that the maximal load of any server is at most 10% higher than the average. The price for this guarantee is that clients have to be moved up to 1/x2 steps away from their initial server. With x=0.1, this means that clients may have to be moved 100 steps away. At first sight this may sound like a lot. However, it is important to remember that we need a scalable system that can work with millions if not billions of clients and servers, and the upper bound 1/x2=100 on the number of steps holds regardless of the number of clients and servers in the system. The load balancing system got adopted by Vimeo which reported an 8-fold reduction in the resources needed for their streaming services (see https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed and the figure).
The code is open source and now used by other companies such as Google Cloud, and it is thus saving resources for companies all over the world. This is not only beneficial to the companies but also to the climate. Indeed, server farms around the world are using more energy than all air traffic combined.
In a very recent BARC paper by Anders Aamand, Jakob Bæk Tejs Knudsen, and Mikkel Thorup (accepted at the prestigious conference STOC ’21), we finally understand the cost of load balancing. We show that a balancing parameter of 1+x can be obtained only having to move clients at most 1/x steps away from their initial server. This is much better than the bound of 1/x2 from the previous paper. For example, with x=0.1 we reduce the displacement from 100 steps to 10 steps, and we expect this to become highly appreciated by industry. Our new result is even the best possible, yielding the right understanding of load-balancing in a dynamic environment. The a full version of the paper can be found at https://arxiv.org/abs/2104.05093.