15 How to Deeply Understand and Extend Twemproxy #

Hello, I am your caching course teacher, Chen Bo. Welcome to Lesson 15, “Twemproxy Framework, Application, and Expansion.”

Twemproxy Architecture and Application #

Twemproxy is an open-source proxy developed by Twitter for accessing sharded resources. As shown in the figure below, it encapsulates the distribution and hash rules of the resource pool, handles detecting and reconnecting to partially faulty backend nodes, keeps client access as simple as possible, and makes resource changes lightweight: only Twemproxy needs to be updated, not tens of thousands of clients. Finally, Twemproxy accesses each backend node through a single long-lived connection, which greatly reduces the connection pressure on backend resources.

System Architecture #

Next, let’s analyze the application system architecture based on Twemproxy and the internal architecture of Twemproxy components.

As shown in the figure below, in the application system, Twemproxy is an intermediate layer between the client and the resource. Its backend supports shard access to both Memcached resource pools and Redis resource pools. Twemproxy supports modulo distribution, consistent hash distribution, and random distribution, although the usage of random distribution is less common.

img

When the application frontend requests cached data, it directly accesses the corresponding port of Twemproxy. Twemproxy then parses the command, calculates the key through the hash function, and routes the key to the shard of the backend resource according to the distribution strategy. After the backend resource responds, the response is returned to the corresponding client.
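
To make the routing step concrete, here is a minimal sketch (with a hypothetical helper `shard_for_key`; this is not Twemproxy's actual code) that hashes a key with FNV-1a, one of the hash functions Twemproxy supports, and picks a backend shard by modulo distribution:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a 64-bit hash, one of the hash functions Twemproxy supports. */
static uint64_t fnv1a_64(const char *key, size_t len) {
    uint64_t hash = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= (uint8_t)key[i];
        hash *= 0x100000001b3ULL;            /* FNV prime */
    }
    return hash;
}

/* Hypothetical helper: modulo distribution over nserver backend shards.
 * Consistent hashing (ketama) replaces the modulo with a lookup on a hash ring. */
static uint32_t shard_for_key(const char *key, uint32_t nserver) {
    return (uint32_t)(fnv1a_64(key, strlen(key)) % nserver);
}

int main(void) {
    const char *key = "user:12345";
    /* With 4 backend shards, this key always lands on the same shard. */
    printf("key %s -> shard %u\n", key, shard_for_key(key, 4));
    return 0;
}
```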

During system operation, Twemproxy automatically maintains the status of backend resource services. If a backend resource service is abnormal, it will be automatically excluded and probed at regular intervals. When the backend resource recovers, the cache node will be restored to normal use.

Component Architecture #

Twemproxy is developed based on the epoll event-driven model, as shown in the figure below. It is a single-process, single-threaded component: one core process handles all events, including network IO, protocol parsing, and message routing. Twemproxy can listen on multiple ports, with each port serving the cache requests of one business. Twemproxy supports the Redis and Memcached protocols and three distribution schemes: consistent hash distribution, modulo distribution, and random distribution. Twemproxy is configured via a YAML file, which is simple, clear, and easy to read and write by hand.

img
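
For intuition, the skeleton below shows what a single-process, single-threaded epoll loop of this kind looks like. It is an illustrative sketch rather than Twemproxy's actual event code; `handle_readable` is a hypothetical stand-in for the real parse/route/forward work, and the port is a placeholder:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for Twemproxy's real work on a readable connection:
 * read the data, parse the protocol, route the message, and so on. */
static void handle_readable(int fd) {
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n <= 0) { close(fd); return; }
    (void)write(fd, buf, (size_t)n);           /* just echo back for the demo */
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(22121);              /* placeholder listen port */
    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 128);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
    epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

    /* One process, one thread, one loop: every event is handled here. */
    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(ep, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == lfd) {                   /* new client connection */
                int cfd = accept(lfd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = cfd };
                epoll_ctl(ep, EPOLL_CTL_ADD, cfd, &cev);
            } else {
                handle_readable(fd);           /* parse, route, forward */
            }
        }
    }
}
```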

Twemproxy accesses each backend resource through a single long-lived connection. When it receives a large number of concurrent requests, it pipelines multiple requests to the backend in batches. If a backend resource keeps failing, Twemproxy removes it from the normal list and probes it continuously; once the resource recovers, requests are routed to it again.

During operation, Twemproxy handles a massive stream of request and response messages, so its developers carefully designed a memory management mechanism to minimize memory allocation and copying and maximize performance. Inside Twemproxy, requests and responses are treated as messages. Message structures and the buffers that hold message data are reused to avoid the overhead of repeated allocation and deallocation, which improves message-processing performance. Twemproxy also reuses connection structures, so even when facing short-connection clients such as PHP, previously allocated connection objects can be reused, improving connection performance.
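
The reuse idea can be sketched roughly as follows (an illustrative free list under my own simplifications, not Twemproxy's actual msg/mbuf code): a finished message is pushed onto a free list instead of being freed, and the next request pops it off instead of calling malloc.

```c
#include <stdio.h>
#include <stdlib.h>

#define MBUF_SIZE 16384                /* illustrative buffer size */

/* Illustrative message object: a reusable buffer plus a free-list link. */
struct msg {
    char        body[MBUF_SIZE];
    size_t      len;
    struct msg *next;                  /* link in the free list */
};

static struct msg *free_msgq = NULL;   /* recycled messages ready for reuse */

/* Get a message: reuse one from the free list when possible. */
static struct msg *msg_get(void) {
    struct msg *m = free_msgq;
    if (m != NULL) {
        free_msgq = m->next;           /* pop a recycled message */
    } else {
        m = malloc(sizeof(*m));        /* allocate only when the list is empty */
    }
    if (m != NULL) { m->len = 0; m->next = NULL; }
    return m;
}

/* Put a message back: push it onto the free list instead of freeing it. */
static void msg_put(struct msg *m) {
    m->next = free_msgq;
    free_msgq = m;
}

int main(void) {
    struct msg *m = msg_get();
    m->len = (size_t)snprintf(m->body, sizeof(m->body), "get user:12345\r\n");
    msg_put(m);                        /* recycled, not freed */
    return 0;
}
```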

In addition, Twemproxy adopts a zero-copy approach for messages. A request is read into its message buffer only once, when it is received from the client, and no copying is performed during subsequent parsing, processing, and forwarding. The same applies to backend responses: a response is read into its message buffer when it is received from the backend, and no copying is performed while parsing, processing, and replying to the client. By sharing message bodies and message buffers in this way, even though Twemproxy is single-process and single-threaded, it can reach a QPS of 60,000 to 80,000.

Twemproxy Request and Response #

Next, let’s take a look at how Twemproxy routes and responds to requests.

Twemproxy listens on a port and accepts new connections from clients. It initializes a client_conn object for each connection. When data is sent by the client, the client_conn receives a network read event and reads the data from the network card, storing it in the buffer for the request message. After reading, the message is parsed according to the configured protocol. Once parsing is successful, the request message is placed in the out queue of the client_conn. Next, the command key is hashed and the corresponding server shard connection, represented by a server_conn structure, is found using a distribution algorithm, as shown in the diagram below.

img

If the in queue of the server_conn is empty, a write event is triggered for the server_conn. The request message is then stored in the in queue of the server_conn. When the server_conn handles the write event, the request messages in the in queue are aggregated and sent to the backend resources in batches using a pipeline method. Once the sending is completed, the sent request message is removed from the in queue of the server_conn and inserted into the out queue. After the backend resources have finished processing the request, they send the response back to Twemproxy. When the response reaches Twemproxy, the corresponding server_conn receives an epoll read event and starts reading the response message. After reading and parsing the response, the first request message in the out queue of the server_conn is deleted, and this request message is paired with the latest received response message. Once the request and response are matched, a write event is triggered for the client_conn, as shown in the diagram below.

img

Then, when the client_conn handles the epoll write event, it sends the responses to the client in the order of the requests. After sending, the request message is removed from the out queue of the client_conn. Finally, the message buffer and message structures are recycled for reuse in subsequent request processing. At this point, the processing of a request is complete.
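
To make the queue choreography easier to follow, here is a rough sketch of the data-structure shapes involved, using BSD-style TAILQ macros. The field names are illustrative rather than Twemproxy's exact ones; the key point is that the out queue of a server_conn is strictly FIFO, so responses can be paired with their requests purely by order:

```c
#include <sys/queue.h>                 /* BSD-style intrusive queue macros */

struct msg {
    TAILQ_ENTRY(msg) link;             /* queue linkage */
    struct msg *peer;                  /* a request and its response point at each other */
    /* ... parsed command, key, message buffers ... */
};
TAILQ_HEAD(msg_tqh, msg);

/* Illustrative connection shapes. */
struct client_conn {
    struct msg_tqh out_q;              /* requests awaiting responses, in arrival order */
};

struct server_conn {
    struct msg_tqh in_q;               /* requests waiting to be pipelined to the backend */
    struct msg_tqh out_q;              /* requests already sent, awaiting responses */
};

/* On a backend response: pair it with the oldest outstanding request (FIFO).
 * Assumes out_q is non-empty, i.e. every response matches a sent request. */
void pair_response(struct server_conn *s, struct msg *rsp) {
    struct msg *req = TAILQ_FIRST(&s->out_q);
    TAILQ_REMOVE(&s->out_q, req, link);
    req->peer = rsp;
    rsp->peer = req;
}
```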

Installation and Use of Twemproxy #

Installing and using Twemproxy is relatively simple. First, clone Twemproxy from GitHub to the target server using Git. Then, navigate to the Twemproxy directory and execute $ autoreconf -fvi, followed by ./configure, and finally make (or optionally make install) to compile and install Twemproxy. After that, Twemproxy can be started by running src/nutcracker -c /xxx/conf/nutcracker.yml.

Twemproxy proxies access to backend resources, and the deployment information and access policies for these resources are configured in a YAML file, so let’s take a brief look at the Twemproxy configuration. As shown in the diagram, this configuration proxies access to two business data caches: “alpha” and “beta”. For each business, there is first a listen item that sets the port to listen on, followed by the hash algorithm and the distribution algorithm. auto_eject_hosts decides whether a failing backend server should be ejected from the pool and its keys rehashed onto the remaining servers; by default no ejection is performed. The redis item indicates the type of backend resource, i.e. whether it is Redis or Memcached. The last item, servers, specifies the list of resources in the pool.
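
For reference, a nutcracker.yml along these lines might look like the sketch below. The ports, server addresses, and timeouts are placeholders, not the values in the diagram; here alpha proxies a Redis pool and beta proxies a Memcached pool:

```yaml
alpha:
  listen: 127.0.0.1:22121         # port the business client connects to
  hash: fnv1a_64                  # hash function applied to the key
  distribution: ketama            # consistent hash distribution
  auto_eject_hosts: false         # do not eject failing servers (the default)
  redis: true                     # backend pool is Redis
  servers:                        # resource pool, one ip:port:weight per line
   - 127.0.0.1:6379:1
   - 127.0.0.1:6380:1

beta:
  listen: 127.0.0.1:22122
  hash: fnv1a_64
  distribution: modula            # modulo distribution
  auto_eject_hosts: true          # eject a failing server and rehash its keys
  server_failure_limit: 3         # eject after 3 consecutive failures
  server_retry_timeout: 30000     # probe the ejected server every 30s (in ms)
  redis: false                    # backend pool is Memcached
  servers:
   - 127.0.0.1:11211:1
   - 127.0.0.1:11212:1
```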

Taking Memcached access as an example, after deploying the Memcached resources for the business, the list of Mc resources and access methods are configured in the YAML file, and then Twemproxy is started. The business can access the data of the backend resources by accessing Twemproxy. If there are any changes to the Mc resources in the future, no modifications need to be made on the business side, and operations can directly modify the configuration of Twemproxy.

In actual production use, Twemproxy still has some issues. First, it is a single-process/single-threaded model, where one event_base handles all events, including reading client requests, forwarding requests to backend servers, receiving responses from servers, and sending responses to clients. A single instance of Twemproxy can handle a maximum of around 80,000 QPS under load testing, but for the sake of stability in production, it can support at most 30,000 to 40,000 QPS. In contrast, the online QPS of Memcached can generally reach 100,000 to 200,000, so there needs to be 3 to 5 Twemproxy instances in front of each Mc instance. Having too many instances can cause problems such as complexity in management and high costs.

Second, for the sake of performance and preventing single points of failure, Twemproxy needs to be deployed in multiple instances, and new instances need to be added or redundant instances need to be taken offline based on changes in business access volume. If multiple Twemproxy instances are accessed simultaneously and the client access strategy is not proper, some Twemproxy instances may be under heavy load while others are idle, causing access imbalance.

Third, the backend resources are centrally configured in the YAML file of Twemproxy, simplifying the maintenance of resource changes compared to maintaining them in all business client endpoints. However, making changes to these configurations across multiple Twemproxy instances to make them effective at the same time is a complex task.

Finally, Twemproxy does not support multiple replicas or multi-level architectures for Mc access strategies, nor does it support read-write separation access for the Master-Slave architecture of Redis.

To address these issues, you can extend Twemproxy to better meet the needs of the business and operations.

Twemproxy Extension #

Multi-Process Transformation #

Performance is the top priority. First of all, we can modify Twemproxy’s single-process/single-threaded model to a parallel processing model. The parallel solution can be implemented using either multithreading or multiprocessing. Since Twemproxy is just a message routing middleware and does not require extra data sharing, the multiprocessing solution would be more concise and suitable.

In the multiprocessing transformation, we can create a master process and multiple worker processes for task processing, as shown in the diagram below. Each process maintains its own independent epoll event-driven mechanism. The master process is mainly responsible for listening to ports, accepting new connections, and dispatching connections to worker processes.

img

The worker processes, based on their own independent event_base, manage all client connections dispatched by the master. When a client sends a network request, the worker process reads and parses the commands, transfers them through the IO queue within the process, and finally packages and pipelines the requests to the backend servers.

When the servers have finished processing the requests and send back the responses, the corresponding worker process reads and parses the responses, and then replies to the clients in batches.
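
One common way to implement the dispatch step is for the master to hand an accepted connection fd to a worker over a pre-created Unix-domain socketpair using SCM_RIGHTS; the worker then registers the fd with its own epoll instance. The sketch below shows only that handoff, under my own assumptions rather than any particular patch:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Master side: send an accepted client fd to one worker over the
 * AF_UNIX socketpair created before fork(), via SCM_RIGHTS. */
int send_fd(int channel, int fd) {
    char byte = 'c';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;              /* ensures proper alignment */
    } u;
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return (int)sendmsg(channel, &msg, 0);
}

/* Worker side: receive the fd, then add it to the worker's own epoll loop. */
int recv_fd(int channel) {
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(channel, &msg, 0) <= 0) return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;                             /* caller registers it with epoll */
}
```

On newer kernels, an alternative is to let every worker listen on the same port with SO_REUSEPORT, so that the kernel spreads new connections across workers and no fd passing is needed.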

Through multiprocessing transformation, Twemproxy’s QPS can be increased from 80,000 to over 400,000. When accessing the service, the number of Twemproxy instances that need to be deployed will be significantly reduced, making operations and maintenance simpler.

Adding Load Balancing #

There are generally three solutions to achieve load balancing for multiple Twemproxy instances.

The first solution is to add a group of LVS (Linux Virtual Server) between Twemproxy and the business access side as a load balancing layer. With LVS, you can easily add or remove Twemproxy instances, and LVS is responsible for load balancing and request distribution, as shown in the diagram below.

img

The second solution is to add the list of Twemproxy IPs to the DNS. Business clients access Twemproxy through the domain name, and each time a connection is made, the DNS randomly returns an IP to achieve load balancing as much as possible.

The third solution is for the business client to customize the load balancing strategy. The business client retrieves the list of Twemproxy IPs from a configuration center or DNS, and then accesses Twemproxy instances in a balanced manner, achieving load balancing.

Solution one can make use of mature LVS solutions, which support load balancing strategies efficiently and stably, but it adds another layer and increases the complexity and cost of operation and maintenance. Solution two can only achieve connection balancing; whether the access requests are balanced cannot be guaranteed. Solution three has the lowest cost and higher performance compared to the previous two solutions. It is recommended to use solution three, which is also what we use internally at Weibo.
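
A minimal sketch of how a business client might implement solution three (the addresses and the round-robin policy here are my own placeholders, not Weibo's actual client code):

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical Twemproxy instance list, fetched from DNS or a config center. */
static const char *proxies[] = {
    "10.0.0.1:22121", "10.0.0.2:22121", "10.0.0.3:22121",
};
#define NPROXY (sizeof(proxies) / sizeof(proxies[0]))

static atomic_uint next_idx;

/* Round-robin pick: each new connection goes to the next instance in turn,
 * so connections and requests spread evenly across all Twemproxy instances. */
static const char *pick_proxy(void) {
    unsigned i = atomic_fetch_add(&next_idx, 1);
    return proxies[i % NPROXY];
}

int main(void) {
    for (int i = 0; i < 6; i++) {
        printf("connect to %s\n", pick_proxy());
    }
    return 0;
}
```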

Adding a Configuration Center #

To maintain Twemproxy’s configuration, we can add a configuration center service. All configuration information from the YAML configuration file, including the deployment and access information of backend resources, can be stored in the configuration center as configurations, as shown in the diagram below.

img

When Twemproxy starts, it first subscribes to and fetches configuration from the configuration center, then parses it and starts normally. Twemproxy registers its own IP and listening port information in the configuration center. The business client obtains the deployment information of Twemproxy from the configuration center and then performs balanced access.

When there are changes in the backend resources, the configuration in the configuration center is updated directly. The configuration center notifies all Twemproxy instances, and upon receiving the event notification, Twemproxy can fetch the latest configuration and adjust the access to the backend resources to achieve online changes. The whole process is automatically completed, making it more efficient and reliable.

Support for M-S-L1 Multi-Level Access #

Previously, in order to deal with sudden flood traffic and avoid the impact of local hardware failures, a Master-Slave-L1 architecture was adopted for Mc access. The access strategy of this caching architecture can be encapsulated within Twemproxy. The implementation solution is also relatively simple. First, add the Master, Slave, and L1 layers in the servers configuration, as shown in the figure below.

img

When Twemproxy starts, each worker process pre-connects to all Mc backends. When a client request is received, Twemproxy can use different access strategies based on the parsed command.

  • For get requests, first randomly select one L1 pool to read. On a miss, continue to the Master and then the Slave. If the data is found at a later layer, it is written back to the layers that missed (see the sketch after this list).
  • For gets requests, read from the Master first and, if that fails, from the Slave. If the value came from the Slave, write it to the Master and then issue gets against the Master again, so that the cas unique id comes from the Master.
  • For add/cas and similar requests, request the Master first; once it succeeds, write the key/value to the Slave and all L1 pools with set commands.
  • set requests are the simplest: simply set the key in all resource pools.
  • For the stats command, the response can either be computed by Twemproxy itself or aggregated from the backend Mc nodes.
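
As a rough illustration of the get strategy above, the sketch below uses hypothetical helpers mc_get/mc_set standing in for per-pool Memcached access; it is not Twemproxy code, just the fallback-and-write-back logic:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helpers standing in for per-pool Mc access:
 * mc_get returns true on a hit and fills *value; mc_set stores key/value. */
struct pool;                                    /* one Mc resource pool */
bool mc_get(struct pool *p, const char *key, char **value);
void mc_set(struct pool *p, const char *key, const char *value);

/* Fallback-and-write-back logic for get: random L1, then Master, then Slave. */
static bool msl1_get(struct pool **l1, int nl1, struct pool *master,
                     struct pool *slave, const char *key, char **value) {
    struct pool *l1_pool = l1[rand() % nl1];    /* randomly pick one L1 pool */

    if (mc_get(l1_pool, key, value)) {
        return true;                            /* hit in L1, done */
    }
    if (mc_get(master, key, value)) {
        mc_set(l1_pool, key, *value);           /* write back to the missed L1 */
        return true;
    }
    if (mc_get(slave, key, value)) {
        mc_set(master, key, *value);            /* write back to the Master ... */
        mc_set(l1_pool, key, *value);           /* ... and to the missed L1 */
        return true;
    }
    return false;                               /* miss at every layer */
}
```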

Redis Master-Slave Access #

Redis supports master-slave replication. To support more concurrent access and reduce pressure on the master, multiple slaves are generally deployed: write operations go directly to the Redis master, while read operations go to a randomly selected Redis slave. This logic can also be encapsulated within Twemproxy. The Redis master-slave configuration can be recorded in the configuration center as domain names or IP:port entries, and Twemproxy subscribes to it and updates in real time, so that backend access can be adjusted promptly when slaves are added or removed or when a master-slave failover occurs.
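
A minimal sketch of the read/write split (the command list and handle types are illustrative; a real implementation would classify the full Redis command table):

```c
#include <stdbool.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical backend handles: one Redis master plus several slaves. */
struct redis_server;
extern struct redis_server *redis_master;
extern struct redis_server *redis_slaves[];
extern int nslave;

/* Minimal classification: treat a few common read commands as reads and
 * everything else as writes. */
static bool is_read_command(const char *cmd) {
    static const char *reads[] = { "GET", "MGET", "EXISTS", "HGET", "ZRANGE" };
    for (size_t i = 0; i < sizeof(reads) / sizeof(reads[0]); i++) {
        if (strcasecmp(cmd, reads[i]) == 0) return true;
    }
    return false;
}

/* Route a parsed command: writes go to the master, reads to a random slave. */
static struct redis_server *route(const char *cmd) {
    if (is_read_command(cmd) && nslave > 0) {
        return redis_slaves[rand() % nslave];   /* spread reads across the slaves */
    }
    return redis_master;                        /* writes always hit the master */
}
```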

In the previous lessons, we covered the characteristics of large and medium-sized internet systems in the era of big data, the classic problems and solutions when accessing Memcached, and how to solve Memcached’s capacity, performance, connection, and local-failure problems by splitting the cache pool and adopting a Master-Slave dual-layer architecture, as well as a Master-Slave-L1 three-layer architecture to better handle sudden flood traffic and local failures.

This lesson focused on the application system architecture built around Twemproxy, the internal architecture and key technologies of Twemproxy, and its deployment and configuration. Finally, we learned how to extend Twemproxy to achieve better performance, availability, and operability.

You can refer to the mind map below to review and organize these knowledge points.

img