mirror of
https://git.haproxy.org/git/haproxy.git/
synced 2025-10-26 22:20:59 +01:00
BUG/MEDIUM: build: limit excessive and counter-productive gcc-15 vectorization
In https://bugs.gentoo.org/964719, Dan Goodliffe reported that using CFLAGS="-O3 -march=westmere" creates a binary that segfaults on startup with gcc-15. This could be reproduced here, is isolated to gcc-15 and -O3, and is caused by gcc emitting "movdqa" instructions (which require 16-byte aligned operands) to read unaligned longs taken from chars, in code that was carefully isolated within ifdefs checking for support for unaligned integers on the platform...

While some experiments showed that changing all casts all over the code, using either a typedef-enforced align(1) or the packed union trick, does the job, this needs a more in-depth validation since it obviously doesn't produce the same code at all (at least on more modern machines).

However, the offending optimization option could be isolated: it's "-fvect-cost-model=dynamic", which is used at -O3, while -O2 uses "-fvect-cost-model=very-cheap". Turning it back to very-cheap solves the issue, reduces the code size, and yields an extra 5% performance increase on the http-request rate (181k vs 172k on a single core)! This could at least partially explain why it has been observed several times over the last few years that -O3 yields bigger and slower code than -O2. It was also verified that the option doesn't change the emitted code at -O0..-O2,-Os,-Oz, but only at -O3.

This patch detects whether the compiler supports this option and, if so, sets it to "very-cheap" to address the problem that some distros are facing after an upgrade to gcc-15. As such it should be backported to recent LTS and stable branches. Here, 3.1 was used, so it seems legit to at least target the last two LTS branches (i.e. go as far as 3.0).

Thanks to Dan Goodliffe for sharing a working reproducer, Sam James for starting the investigations and Christian Ruppert for bringing the issue to us.
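For readers unfamiliar with the two source-level workarounds mentioned above (typedef-enforced align(1) and the packed union trick), here is a minimal sketch of what they could look like, next to a portable memcpy() variant. All type and function names are hypothetical and do not come from the HAProxy tree; the attributes are GCC/Clang extensions, and may_alias is added here only as an assumption so that the sketch does not rely on -fno-strict-aliasing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not HAProxy code. */

/* 1. Packed union trick: "packed" lowers the member's alignment to 1, so the
 *    compiler may not assume the pointer is naturally aligned. "may_alias"
 *    permits reading bytes that were stored through a char pointer.
 */
union unaligned_ulong {
	unsigned long v;
} __attribute__((packed, may_alias));

static inline unsigned long read_ulong_packed(const void *p)
{
	return ((const union unaligned_ulong *)p)->v;
}

/* 2. Typedef-enforced alignment: on a typedef, aligned(1) is allowed to
 *    reduce the alignment of the underlying type.
 */
typedef unsigned long ulong_a1 __attribute__((aligned(1), may_alias));

static inline unsigned long read_ulong_align1(const void *p)
{
	return *(const ulong_a1 *)p;
}

/* 3. Portable reference: memcpy() makes no alignment assumption at all and
 *    is usually compiled to a single load on platforms that support it.
 */
static inline unsigned long read_ulong_memcpy(const void *p)
{
	unsigned long v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Calling any of these on a pointer such as buf + 1 inside a char array should return the same value as the memcpy() variant. As the message notes, switching the whole code base to such accessors changes the generated code and needs its own validation, which is why the patch takes the compiler-flag route instead.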
parent d30b88a6cc
commit 871c80505c
3
Makefile
@@ -213,7 +213,8 @@ UNIT_TEST_SCRIPT=./scripts/run-unittests.sh
 # undefined behavior to silently produce invalid code. For this reason we have
 # to use -fwrapv or -fno-strict-overflow to guarantee the intended behavior.
 # It is preferable not to change this option in order to avoid breakage.
-STD_CFLAGS := $(call cc-opt-alt,-fwrapv,-fno-strict-overflow)
+STD_CFLAGS := $(call cc-opt-alt,-fwrapv,-fno-strict-overflow) \
+              $(call cc-opt,-fvect-cost-model=very-cheap)
 
 #### Compiler-specific flags to enable certain classes of warnings.
 # Some are hard-coded, others are enabled only if supported.