11 How to Eliminate Cold Keys and Expired Keys in MC

11 How to Eliminate Cold Keys and Expired Keys in MC #

Hello, I am your caching course teacher Chen Bo. Welcome to Lesson 11 “Memcached Eviction Strategies”.

Eviction Strategies #

As a caching component, Memcached (MC) can only store the most frequently accessed hot data. Once the stored data exceeds the memory limit, it is necessary to evict the cold keys in MC. Most keys in MC have expiration times. After a key expires, for performance reasons, MC does not immediately delete the expired key. Instead, it is gradually cleaned by a maintenance thread. Only when this expired key is accessed, it will be deleted to reclaim storage space. Therefore, MC manages the lifecycle of keys, including both expiration and deletion for reclaiming, as shown in the following diagram.

img

Key expiration includes two scenarios: expiration after the expire time and expiration of all keys after flush_all is called.

MC has three ways to delete and reclaim key/value pairs:

  1. The first way is lazy deletion upon retrieval. After a key expires, it is not immediately deleted. Instead, when the key is retrieved, the status of the key is checked. If it has expired, it is truly deleted and the storage space is reclaimed.

  2. The second way is synchronous scanning of the LRU (Least Recently Used) tail. When memory allocation is requested and the memory is fully used, and there are no available chunks in the slabclass associated with the Item, it will scan the tail of the LRU to reclaim expired and invalidated keys. If no such keys are found, one key will be forcefully deleted.

  3. The third way is the LRU maintenance thread that periodically scans the four LRU queues to asynchronously evict expired key/value pairs.

flush_all #

In MC, besides regular expiration based on the expire time, there is also a way to immediately expire all keys using flush_all. If there is an abnormality in caching data writing, such as a large amount of dirty data without a simple way to quickly identify them, flush_all can be used to immediately invalidate all data. This ensures the correctness of data by reloading them from the database using the keys. Flush_all can cause all keys on an MC node to be invalidated immediately. However, in some scenarios, it is necessary to have multiple MC nodes invalidate their data simultaneously at a specific time. In such cases, the delayed invalidation instruction of flush_all can be used. This instruction appends an expiretime parameter to the flush_all instruction, which allows multiple MC nodes to invalidate all keys at the same time.

If there are no parameters after flush_all, it is equivalent to flush_all 0, which means all keys are immediately invalidated. When MC receives the flush_all instruction, if it is a delayed invalidation, it will set the oldest_live in the global setting to the timestamp N seconds after the specified time, which means it will expire after N seconds. If it is an immediate invalidation, it will set the oldest_cas in the global setting to the current maximum global cas value. After setting this global variable value, it immediately returns. Therefore, when MC invalidates all keys through flush_all, no keys are actually deleted. These keys will be deleted asynchronously by user requests or by the LRU maintenance thread to complete the actual deletion action.

Lazy Deletion #

In MC, lazy active deletion of expired keys refers to the process where, during the handling of instructions such as touch, get, and gets, the key is first queried to find the related Item. Then, the key’s expiration and whether it has been flushed are verified. If it is expired or flushed, the key is directly deleted and reclaimed. It is easy to verify if a key has expired by simply checking its expiration time. To check if a key has been flushed, the logic is as follows: first, check if the key’s most recent access time is smaller than the oldest_live value in the global configuration. If it is smaller, it means the key has been flushed. Otherwise, check the key’s unique id value (CAS) and see if it is smaller than the oldest_cas value in the global configuration. If it is, it means the key has also been flushed.

Memory Allocation Failure and LRU Synchronous Eviction #

When Memcached inserts or updates a key, it first tries to allocate a free space for the new key/value in the appropriate slab class. If the allocation fails, it synchronously evicts the tail element of the COLD LRU for that slab class. If the eviction is successful, the slab class will have one more free space that can be used by the previously mentioned key. If there is no data in the COLD LRU queue, the eviction fails. In this case, Memcached starts polling the tail of the HOT LRU. If a key has expired, it is evicted; otherwise, it is migrated.

LRU Maintenance Thread and Asynchronous Eviction #

Performing eviction synchronously during key read, write, or update operations is not efficient because eviction is a heavyweight operation that can significantly degrade Memcached’s performance compared to request processing. Therefore, Memcached has an additional LRU maintenance thread that performs garbage collection on expired keys, freeing up the memory occupied by them without adding additional request burden.

As mentioned earlier, Memcached has 64 slab classes, with slab classes numbered from 1 to 63 used for storing and retrieving item data. In order to manage expired data, each of these slab classes correspond to 4 LRU queues named TEMP, HOT, WARM, and COLD LRUs. So in total, there are 63 * 4 = 252 LRU queues. The LRU maintenance thread intermittently sleeps and then starts cleaning the tail of the 4 LRUs based on a certain strategy.

When a new key is written, if its expiration time is less than 61 seconds, it is inserted directly into the TEMP LRU as shown in the image below. TEMP LRU has no length limit and can continue to insert keys. Since the expiration time is short, the TEMP LRU does not perform internal movement or inter-queue migration, ensuring optimal performance. After the sleep, the LRU maintenance thread first performs 500 rounds of polling on the tail of the TEMP LRU and in each round, it performs 5 iterations. During each iteration, it checks if the key has expired and if it has, it is evicted and the next iteration continues. If an unexpired key is encountered, it evicts that key and exits the cleaning process of the TEMP LRU. If all keys in the tail of the TEMP LRU have expired, the maintenance thread can reclaim a total of 2500 expired keys in one run(500 rounds x 5 iterations).

img

In the image below, when a new key is written and its expiration time is greater than 61 seconds, it is inserted directly into the HOT LRU. The HOT LRU has a memory limit and the memory occupied by each HOT LRU cannot exceed 20% of the total actual used memory of the slabclass it belongs to. When performing routine maintenance, the LRU maintenance thread first cleans the TEMP LRU and then maintains the HOT LRU. The maintenance of HOT LRU also starts with 500 rounds of polling, with 5 iterations in each round. During each iteration, it checks if the key has expired and if it has, it is evicted. The iteration continues until an unexpired key is encountered. If the status of this key is ACTIVE, it is moved to the WARM LRU. For keys with non-ACTIVE status, if the memory usage of the HOT LRU exceeds the limit, it is moved to the COLD LRU; otherwise, it is simply evicted. It is worth noting that this evict operation is not common; once it happens, although the evicted key is cleared, the operation function considers the number of keys recycled and cleaned in this operation to be still zero.

img

In the image below, if the number of keys recycled or migrated in the HOT LRU is zero, the LRU maintenance thread starts polling the WARM LRU. The WARM LRU also has a memory limit, with each WARM LRU being allowed to occupy a maximum of 40% of the total actual used memory of the slabclass it belongs to. The WARM LRU maintenance starts with 500 rounds of polling, with 5 iterations in each round. During each iteration, it checks if the key has expired and if it has, it is evicted. The iteration continues until an unexpired key is encountered. If the status of this key is ACTIVE, it is moved to the head of the LRU queue. For keys with non-ACTIVE status, if the memory usage of the WARM LRU exceeds the limit, they are moved to the COLD LRU; otherwise, they are simply evicted. It is worth noting that this evict operation is not common; once it happens, although the evicted key is cleared, the operation function considers the number of keys recycled and cleaned in this operation to be still zero.

img

The LRU maintenance thread finally performs maintenance on the COLD LRU, as shown in the image below. Similar to the TEMP LRU, the COLD LRU has no length limit and can continuously store data. The COLD LRU maintenance starts with 500 rounds of polling, with 5 iterations in each round. During each iteration, it checks if the key has expired and if it has, it is evicted. The iteration continues until an unexpired key is encountered. If the status of this key is ACTIVE, it is moved to the head of the WARM LRU; otherwise, it is simply ignored.

img

When the LRU maintenance thread is running, the TEMP LRU has its own loop while the other three LRUs run in a separate loop. If the number of keys cleaned or moved in the HOT, WARM, and COLD LRUs is zero, the 500-round loop is immediately stopped.