Go 1.24: Mutex Spin Optimization Significantly Enhances Performance
An overview of the mutex spin optimization introduced in Go 1.24 and its impact on performance.

Background
In 2024, Rhys Hiltner proposed a performance optimization for the Go runtime's internal mutex. The change has been merged for the upcoming Go 1.24 release and may improve performance by up to 70% in scenarios with high lock contention.

In the ChanContended benchmark, the author observed a significant decline in mutex performance as GOMAXPROCS increased.
Intel i7-13700H (linux/amd64):
- With 4 threads allowed, the overall throughput is half that of a single thread.
- With 8 threads allowed, the throughput is halved again.
- With 12 threads allowed, the throughput is halved once more.
- At GOMAXPROCS=20, 200 channel operations took an average of 44 microseconds, with an average of 220 nanoseconds per unlock2 call, each having the opportunity to wake a sleeping thread.
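A similar workload can be approximated outside the runtime's own test suite. Channel operations take a runtime-internal lock, which is why a channel benchmark exercises lock2 and unlock2. The following is a minimal sketch, not the runtime's benchmark itself: the names runContended and opsPerRound are made up for illustration, and the shape of each round (100 sends followed by 100 receives on one shared buffered channel, 200 operations in total) is an assumption based on the figure quoted above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// opsPerRound is a hypothetical constant: 100 sends plus 100 receives.
const opsPerRound = 200

// runContended starts `workers` goroutines that all use one shared
// buffered channel, and returns the number of channel operations
// they actually performed.
func runContended(workers, rounds int) int {
	const half = opsPerRound / 2
	// The buffer holds every worker's outstanding sends, so a send
	// never blocks and a receive always finds an element.
	shared := make(chan int, half*workers)
	counts := make([]int, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			n := 0 // local counter, written back once at the end
			for r := 0; r < rounds; r++ {
				for i := 0; i < half; i++ {
					shared <- 0
					n++
				}
				for i := 0; i < half; i++ {
					<-shared
					n++
				}
			}
			counts[id] = n
		}(w)
	}
	wg.Wait()

	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

func main() {
	const rounds = 500
	for _, procs := range []int{1, 4, 8, 12, 20} {
		prev := runtime.GOMAXPROCS(procs) // returns the previous setting
		start := time.Now()
		ops := runContended(procs, rounds)
		elapsed := time.Since(start)
		runtime.GOMAXPROCS(prev)

		nsPerOp := float64(elapsed.Nanoseconds()) / float64(ops)
		fmt.Printf("GOMAXPROCS=%d: %d ops in %v (%.1f ns/op)\n",
			procs, ops, elapsed, nsPerOp)
	}
}
```

On a machine with fewer than 20 hardware threads, the larger GOMAXPROCS settings oversubscribe the CPU, so the timings are only indicative and will not match the numbers above.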
Another perspective is to consider the CPU time used by the process. The data shows that within 1.78 seconds of wall-clock time, the process's 20 threads spent 27.74 seconds of CPU time spinning in lock2 calls.

These lock2-related threads did not sleep but continuously spun, consuming significant CPU resources.
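As a rough sanity check on these figures, the CPU time spent in lock2 can be compared with the CPU time the process could have used. This assumes all 20 threads were runnable for the whole 1.78-second window, which the source does not state:

```latex
\[
\frac{27.74\ \text{s}}{1.78\ \text{s}} \approx 15.6\ \text{threads' worth of CPU},
\qquad
\frac{27.74\ \text{s}}{20 \times 1.78\ \text{s}} = \frac{27.74}{35.6} \approx 0.78
\]
```

Under that assumption, roughly 78% of the available CPU time went to spinning in lock2 rather than to useful work.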
New Proposal: Adding Spinning State
Through the analysis above, the author found that in the current lock2 implementation, threads that could in principle sleep instead spin, which slows lock handoffs and costs considerable performance. This led to a new design proposal, "Proposal: Improve scalability of runtime.lock2".