17 RocketMQ Cluster Performance Optimization #
Introduction #
In this article, we will optimize a RocketMQ cluster along two dimensions: system parameters and cluster parameters. The goal is to make RocketMQ run more smoothly; stability is often more important than simply improving TPS (transactions per second). Drawing on a real production environment, this article also gives examples of parameter settings that destabilized clusters and affected business operations.
System Parameter Tuning #
After extracting the RocketMQ installation package, there is a file called `os.sh` in the `bin` directory. This file provides the system parameter configuration recommended by the RocketMQ project. Usually these parameters meet the system requirements, but they can also be adjusted to fit the actual situation. It is important to note that Linux kernel version 2.6 or below should not be used; kernel version 3.10 or above is recommended. If you are using CentOS, choose CentOS 7 or above. Problems that occur with Linux kernel 2.6 will be discussed in the later section on pitfalls.
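As a quick pre-flight check, the kernel and distribution versions can be confirmed before installing (a minimal sketch; the release file path applies to CentOS/RHEL only):

```shell
# Print the running kernel version; 3.10 or above is recommended for RocketMQ
uname -r

# On CentOS/RHEL, also show the distribution release (file absent on other distros)
cat /etc/redhat-release 2>/dev/null || echo "not a CentOS/RHEL system"
```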
Maximum File Count #
Set the maximum number of files opened by the user:
vim /etc/security/limits.conf
# End of file
baseuser soft nofile 655360
baseuser hard nofile 655360
* soft nofile 655360
* hard nofile 655360
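The new limits apply only to sessions started after the change. A quick way to verify them for the broker user (here the example account `baseuser` from above) is:

```shell
# Re-login (or: su - baseuser) so the new limits.conf values are loaded, then:
ulimit -n    # soft open-file limit; should print 655360 after the change
ulimit -Hn   # hard open-file limit; should print 655360 after the change
```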
System Parameter Settings #
The adjustment of system parameters follows the recommendations of the official documentation; each parameter is explained below. You can directly execute `sh os.sh` to complete the system parameter configuration, or manually add the following content to `/etc/sysctl.conf` (for example with `vim /etc/sysctl.conf`) and then execute `sysctl -p` to make it take effect.
vm.overcommit_memory=1
vm.drop_caches=1
vm.zone_reclaim_mode=0
vm.max_map_count=655360
vm.dirty_background_ratio=50
vm.dirty_ratio=50
vm.dirty_writeback_centisecs=360000
vm.page-cluster=3
vm.swappiness=1
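Individual values can be read back before and after applying the file, which helps confirm that `sysctl -p` took effect (a sketch; these procfs paths are standard on Linux):

```shell
# Read current values without changing anything
sysctl vm.swappiness vm.max_map_count 2>/dev/null || \
  cat /proc/sys/vm/swappiness /proc/sys/vm/max_map_count

# Apply everything in /etc/sysctl.conf and print the keys that were set
# (requires root, so it is left commented out here)
# sysctl -p
```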
Parameter explanations:
Parameter | Meaning |
---|---|
overcommit_memory | Determines whether the kernel allows memory overcommit. overcommit_memory=0: the kernel checks if there is enough memory space when a user requests memory. overcommit_memory=1: the kernel always believes that there is enough memory space until it is used up. overcommit_memory=2: the kernel prohibits any form of memory overcommitment. |
drop_caches | Writing to this parameter causes the kernel to drop caches in order to free memory. drop_caches=1: drops the page cache (file data). drop_caches=2: drops reclaimable slab objects such as inodes and dentries. drop_caches=3: drops both. |
zone_reclaim_mode | zone_reclaim_mode=0: the system tends to allocate memory from other nodes. zone_reclaim_mode=1: the system tends to reclaim cache memory from the local node. |
max_map_count | Defines the maximum number of memory areas a process can have. The default value is 65536. |
dirty_background_ratio/dirty_ratio | When the proportion of dirty page cache reaches dirty_background_ratio, the background flusher (pdflush) threads start writing dirty pages back to disk; when it reaches dirty_ratio, processes generating dirty data are blocked until writeback catches up. If dirty_background_bytes/dirty_bytes are set, they take precedence and the corresponding ratio values are calculated automatically. |
dirty_writeback_centisecs | Specifies the interval at which pdflush automatically runs (in hundredths of a second). |
page-cluster | Specifies the number of memory pages (exponent of 2) involved in each swap in or swap out operation. page-cluster=0: 1 page. page-cluster=1: 2 pages. page-cluster=2: 4 pages. page-cluster=3: 8 pages. |
swappiness | swappiness=0: swap space is used only when there is a shortage of memory, and the remaining free memory falls below the vm.min_free_kbytes limit. swappiness=1: a minimal amount of swapping is performed and swapping is not disabled. Only applicable for kernel versions 3.5 and above, or Red Hat kernel version 2.6.32-303 and above. swappiness=10: recommended when there is sufficient memory in the system in order to improve performance. swappiness=60: default value. swappiness=100: the kernel will actively use swap space. |
Cluster Parameter Tuning #
Production Environment Configuration #
Here is a configuration file used in a production environment, along with an explanation of its parameters. With slight modifications, such as the cluster name and addresses, it can be used as the configuration file for your own production environment.
Example configuration:
brokerClusterName=testClusterA
brokerName=broker-a
brokerId=0
listenPort=10911
namesrvAddr=x.x.x.x:9876;x.x.x.x:9876
defaultTopicQueueNums=16
autoCreateTopicEnable=false
autoCreateSubscriptionGroup=false
deleteWhen=04
fileReservedTime=48
mapedFileSizeCommitLog=1073741824
mapedFileSizeConsumeQueue=50000000
destroyMapedFileIntervalForcibly=120000
redeleteHangedFileInterval=120000
diskMaxUsedSpaceRatio=88
storePathRootDir=/data/rocketmq/store
storePathCommitLog=/data/rocketmq/store/commitlog
storePathConsumeQueue=/data/rocketmq/store/consumequeue
storePathIndex=/data/rocketmq/store/index
storeCheckpoint=/data/rocketmq/store/checkpoint
abortFile=/data/rocketmq/store/abort
maxMessageSize=65536
flushCommitLogLeastPages=4
flushConsumeQueueLeastPages=2
flushCommitLogThoroughInterval=10000
flushConsumeQueueThoroughInterval=60000
brokerRole=ASYNC_MASTER
flushDiskType=ASYNC_FLUSH
maxTransferCountOnMessageInMemory=1000
transientStorePoolEnable=false
warmMapedFileEnable=false
pullMessageThreadPoolNums=128
slaveReadEnable=true
transferMsgByHeap=true
waitTimeMillsInSendQueue=1000
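Assuming the file above is saved as `conf/broker-a.properties` under the RocketMQ installation directory (the path and the `x.x.x.x` addresses are placeholders), the broker can be started against it and its registration verified with `mqadmin`:

```shell
# Start the broker with the custom configuration file
nohup sh bin/mqbroker -c conf/broker-a.properties > /dev/null 2>&1 &

# Verify the broker registered with the nameserver
sh bin/mqadmin clusterList -n x.x.x.x:9876

# Inspect the configuration values the running broker actually uses
sh bin/mqadmin getBrokerConfig -b x.x.x.x:10911 -n x.x.x.x:9876
```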
Parameter explanations:
Parameter | Meaning |
---|---|
brokerClusterName | Cluster name. |
brokerName | Broker name. |
brokerId | 0 represents the master node |
listenPort | The port on which the broker listens |
namesrvAddr | The address of the nameserver |
defaultTopicQueueNums | The default number of queues when creating a topic |
autoCreateTopicEnable | Whether to allow automatic creation of topics. It is recommended to disable in production environments and enable in non-production environments |
autoCreateSubscriptionGroup | Whether to allow automatic creation of consumer groups. It is recommended to disable in production environments and enable in non-production environments |
deleteWhen | The time to clean up expired logs. “04” indicates cleaning starting at 4 AM |
fileReservedTime | The time (in hours) to keep log files. 48 means 48 hours, i.e., 2 days |
mapedFileSizeCommitLog | The size of the commitlog file |
mapedFileSizeConsumeQueue | The size of the ConsumeQueue file |
destroyMapedFileIntervalForcibly | The interval (in milliseconds) after which a mapped file scheduled for deletion but still referenced is forcibly destroyed; 120000 = 2 minutes
redeleteHangedFileInterval | The interval (in milliseconds) at which the broker retries deleting mapped files whose previous deletion attempt failed
diskMaxUsedSpaceRatio | When the disk usage exceeds this ratio, log cleaning operations will be triggered |
storePathRootDir | The root directory for storing RocketMQ logs and other data |
storePathCommitLog | The directory for storing the CommitLog |
storePathConsumeQueue | The directory for storing the ConsumeQueue |
storePathIndex | The directory for storing index files |
storeCheckpoint | The directory for storing checkpoint files |
abortFile | The directory for storing abort files |
maxMessageSize | The maximum size (in bytes) allowed for a single message |
flushCommitLogLeastPages | The number of pages (each page is 4KB) that need to be flushed if the size of unflushed messages exceeds this setting |
flushConsumeQueueLeastPages | The number of pages (each page is 4KB) that need to be flushed if the size of unflushed consume queue exceeds this setting |
flushCommitLogThoroughInterval | The maximum interval (in milliseconds) between two consecutive commitlog flush operations. Default: 10000 (10 seconds)
flushConsumeQueueThoroughInterval | The maximum interval (in milliseconds) between two consecutive consume queue flush operations. Default: 60000 (60 seconds)
brokerRole | The role of the broker. ASYNC_MASTER represents an asynchronously replicated master node, SYNC_MASTER represents a synchronously replicated master node, and SLAVE represents a slave node |
flushDiskType | The disk flushing type. ASYNC_FLUSH represents asynchronous flushing, and SYNC_FLUSH represents synchronous flushing |
maxTransferCountOnMessageInMemory | The maximum number of messages that can be pulled in one request when consuming messages |
transientStorePoolEnable | Whether to enable off-heap memory transfer |
warmMapedFileEnable | Whether to enable file warming |
pullMessageThreadPoolNums | The size of the thread pool for pulling messages |
slaveReadEnable | Whether to allow reading messages from the slave node. When the messages to be consumed occupy more memory than the accessMessageInMemoryMaxRatio limit (default 40%), the broker suggests that subsequent pulls go to a slave node
transferMsgByHeap | Whether to read messages from heap memory when consuming messages |
waitTimeMillsInSendQueue | The waiting time (in milliseconds) in the sending queue. If it exceeds this setting, a timeout error will be thrown |
Optimization Recommendations #
Several broker properties can affect the performance and stability of the cluster. They are explained below.
1. Enable asynchronous flushing
Except for certain payment scenarios or low TPS scenarios (e.g., TPS below 2000), it is recommended to enable asynchronous flushing to improve cluster throughput.
flushDiskType=ASYNC_FLUSH
2. Enable slave read permission
The ratio of physical memory that messages may occupy is configured by `accessMessageInMemoryMaxRatio`, with a default value of 40%. If the messages to be consumed are no longer in memory, enabling `slaveReadEnable` lets consumers read them from the slave node, thereby improving memory utilization on the master.
slaveReadEnable=true
3. Number of messages pulled in one request
The number of messages pulled in one request during consumption is determined jointly by the broker and the consumer client, with a default value of 32. On the broker side it is limited by `maxTransferCountOnMessageInMemory`; on the consumer side it is set by `pullBatchSize`. It is recommended to set a larger value on the broker side, such as 1000, to leave more room for adjustment on the consumer side.
maxTransferCountOnMessageInMemory=1000
4. Waiting time in the sending queue
When a message is sent to the broker, the time it may wait in the send queue is set by the `waitTimeMillsInSendQueue` parameter, with a default value of 200 ms. It is recommended to set a larger value, such as 1000 ms to 5000 ms. If it is set too short, sending clients will experience timeout errors.
waitTimeMillsInSendQueue=1000
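This parameter can also be changed on a running broker without a restart via `mqadmin updateBrokerConfig` (a sketch; the broker and nameserver addresses are placeholders):

```shell
# Raise the send-queue wait time to 1000 ms on a live broker
sh bin/mqadmin updateBrokerConfig -b x.x.x.x:10911 -k waitTimeMillsInSendQueue -v 1000

# Confirm the new value took effect
sh bin/mqadmin getBrokerConfig -b x.x.x.x:10911 -n x.x.x.x:9876
```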
5. Master-Slave asynchronous replication
To improve cluster performance, in production environments, it is recommended to set the role to ASYNC_MASTER for master-slave asynchronous replication, as the performance of master-slave synchronous replication is relatively low in stress testing.
brokerRole=ASYNC_MASTER
6. Increase cluster stability
To improve cluster stability, the following three parameters are particularly important. They will also be mentioned in the troubleshooting cases.
Disable off-heap memory:
transientStorePoolEnable=false
Disable file warming:
warmMapedFileEnable=false
Enable heap memory transfer:
transferMsgByHeap=true