mirror of
https://git.haproxy.org/git/haproxy.git/
synced 2025-09-20 21:31:28 +02:00
OPTIM: ring: check the queue's owner using a CAS on x86
In the loop where the queue's leader tries to get the tail lock, we also need to check whether another thread took ownership of the queue the current thread is working for. This is currently done using an atomic load.

Tests show that on x86, using a CAS for this check is much more efficient: it keeps the cache line in exclusive state for a few more cycles, which lets the queue release that follows the loop complete without having to wait again. The measured gain is +5% with 128 threads on a 64-core AMD system (11.08M msg/s vs 10.56M).

However, ARM loses about 1% with this approach, and we cannot afford that on machines without a fast CAS anyway, so the load is performed using a CAS only on x86_64. It might not be as efficient on low-end models, but we don't care since they are not the ones dealing with high contention.
This commit is contained in:
parent d25099b359
commit a727c6eaa5

src/ring.c (13 lines changed)
@@ -275,7 +275,18 @@ ssize_t ring_write(struct ring *ring, size_t maxlen, const struct ist pfx[], siz
 	 */
 
 	while (1) {
-		if ((curr_cell = HA_ATOMIC_LOAD(ring_queue_ptr)) != &cell)
+#if defined(__x86_64__)
+		/* read using a CAS on x86, as it will keep the cache line
+		 * in exclusive state for a few more cycles that will allow
+		 * us to release the queue without waiting after the loop.
+		 */
+		curr_cell = &cell;
+		HA_ATOMIC_CAS(ring_queue_ptr, &curr_cell, curr_cell);
+#else
+		curr_cell = HA_ATOMIC_LOAD(ring_queue_ptr);
+#endif
+		/* give up if another thread took the leadership of the queue */
+		if (curr_cell != &cell)
 			goto wait_for_flush;
 
 		/* OK the queue is locked, let's attempt to get the tail lock.