Willy Tarreau 30a659c355 MEDIUM: ring: significant boost in the loop by checking the ring queue ptr first
Checking the ring queue pointer first and placing cpu_relax at the right
places lets the ARM reach 6.0M/s on 80 threads. On x86_64, the EPYC sees
a small increase from 4.45M to 4.57M at 3C6T, but a drop from 3.82M to
3.33M at 24C48T. The drop is due to the write contention hidden behind
the CAS that implements the FETCH_OR(), which we'll address next.
2024-03-25 17:34:19 +00:00
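
The idea in the commit message above (read the queue pointer before attempting the atomic write) can be sketched as follows. This is only a minimal illustration using C11 atomics, not the actual HAProxy ring code: the names `RING_LOCK_BIT`, `cpu_relax_hint()`, `try_lock_queue()`, `lock_queue()` and `unlock_queue()` are hypothetical, and the choice of relax instruction per architecture is an assumption. On x86, a fetch-or whose previous value is used is typically compiled to a compare-and-swap loop, so each attempt is a write to the shared cache line. A plain load beforehand lets waiting threads back off without writing.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag stored in the low bit of the queue pointer word. */
#define RING_LOCK_BIT ((uintptr_t)1)

/* Hypothetical CPU relax hint; the instruction choice is an assumption. */
static inline void cpu_relax_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#elif defined(__aarch64__)
    __asm__ volatile("isb" ::: "memory");
#endif
}

/* Try to set the lock bit. Returns 1 on success, 0 if it was already set. */
static int try_lock_queue(_Atomic uintptr_t *queue)
{
    /* Read-only check first: it does not require exclusive ownership
     * of the cache line, so waiters do not disturb the current owner.
     */
    if (atomic_load_explicit(queue, memory_order_relaxed) & RING_LOCK_BIT)
        return 0;

    /* Only now attempt the write; we won if the bit was previously clear. */
    return !(atomic_fetch_or_explicit(queue, RING_LOCK_BIT,
                                      memory_order_acquire) & RING_LOCK_BIT);
}

/* Spin until the lock bit is obtained, relaxing between attempts. */
static void lock_queue(_Atomic uintptr_t *queue)
{
    while (!try_lock_queue(queue))
        cpu_relax_hint();
}

/* Clear the lock bit, leaving the other bits of the word untouched. */
static void unlock_queue(_Atomic uintptr_t *queue)
{
    atomic_fetch_and_explicit(queue, ~RING_LOCK_BIT, memory_order_release);
}
```

In this sketch the fetch-or is still reached whenever the load sees the bit clear, so several threads may issue the write at the same moment. That is consistent with the remaining contention at high thread counts that the message says will be addressed next.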