The lib compatibility checks introduced in 2.8-dev6 with commit c3b297d5a
("MEDIUM: tools: further relax dlopen() checks too consider grouped
symbols") were partially incorrect in that they check at the same time
libcrypto and libssl. But if loading a library that only depends on
libcrypto, the ssl-only symbols will be missing and this might present
an inconsistency. This is what is observed on FreeBSD 13.1 when
libcrypto is being loaded, where it sees two symbols having disappeared.
The fix consists in splitting the checks for libcrypto and libssl.
No backport is needed, unless the patch above finally gets backported.
There's a recurring issue regarding shared library loading from Lua. If
the imported library is linked with a different version of openssl but
doesn't use it, the check will trigger and emit a warning. In practise
it's not necessarily a problem as long as the API is the same, because
all symbols are replaced and the library will use the included ssl lib.
It's only a problem if the library comes with a different API because
the dynamic linker will only replace known symbols with ours, and not
all. Thus the loaded lib may call (via a static inline or a macro) a
few different symbols that will allocate or preinitialize structures,
and which will then pass them to the common symbols coming from a
different and incompatible lib, exactly what happens to users of Lua's
luaossl when building haproxy with quictls and without rebuilding
luaossl.
In order to better address this situation, we now define groups of
symbols that must always appear/disappear in a consistent way. It's OK
if they're all absent from either haproxy or the lib, it means that one
of them doesn't use them so there's no problem. But if any of them is
defined on any side, all of them must be in the exact same state on the
two sides. The symbols are represented using a bit in a mask, and the
mask of the group of other symbols they're related to. This allows to
check 64 symbols, this should be OK for a while. The first ones that
are tested for now are symbols whose combination differs between
openssl versions 1.0, 1.1, and 3.0 as well as quictls. Thus a difference
there will indicate upcoming trouble, but no error will mean that we're
running on a seemingly compatible API and that all symbols should be
replaced at once.
The same mechanism could possibly be used for pcre/pcre2, zlib and the
few other optional libs that may occasionally cause runtime issues when
used by dependencies, provided that discriminatory symbols are found to
distinguish them. But in practice such issues are pretty rare, mainly
because loading standard libs via Lua on a custom build of haproxy is
not pretty common.
In the event that further symbol compatibility issues would be reported
in the future, backporting this patch as well as the following series
might be an acceptable solution given that the scope of changes is very
narrow (the malloc stuff is needed so that the malloc/free tests can be
dropped):
BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used
MINOR: pools: make sure 'no-memory-trimming' is always used
MINOR: pools: intercept malloc_trim() instead of trying to plug holes
MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim()
MINOR: pools: export trim_all_pools()
MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim()
MINOR: tools: relax dlopen() on malloc/free checks
Now that we can provide a safe malloc_trim() we don't need to detect
anymore that some dependencies use a different set of malloc/free
functions than ours because they will use the same as those we're
seeing, and we control their use of malloc_trim(). The comment about
the incompatibility with DEBUG_MEM_STATS is not true anymore either
since the feature relies on macros so we're now OK.
This will stop catching libraries linked against glibc's allocator
when haproxy is natively built with jemalloc. This was especially
annoying since dlopen() on a lib depending on jemalloc() tends to
fail on TLS issues.
This one is printed as the iocb in the "show fd" output, and arguably
this wasn't very convenient as-is:
293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90)
Let's unstatify it and export it so that the symbol can now be resolved
from the various points that need it.
While we do support quic4@ and quic6@ for listening addresses, it was
not possible to specify that we want to use an FD inherited from the
parent with QUIC. It's just a matter of making it possible to enable
a dgram-type socket and a stream-type transport, so let's add this.
Now it becomes possible to write "quic+fd@12", "quic+ipv4@addr" etc.
Complete ipcmp() function with a new argument <check_port>. If this
argument is true, the function will compare port values besides IP
addresses and return true only if both are identical.
This commit will simplify QUIC connection migration detection. As such,
it should be backported to 2.7.
These includes brought by commit 9c76637ff ("MINOR: anon: add new macros
and functions to anonymize contents") resulted in an increase of exactly
20% of the number of lines to build. These include are not needed there,
only tools.c needs xxhash.h.
Previous commit 8a6767d26 ("BUG/MINOR: config: don't count trailing spaces
as empty arg (v2)") was still not enough. As reported by ClusterFuzz in
issue 52049 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52049),
there remains a case where for the sake of reporting the correct argument
count, the function may produce virtual args that span beyond the end of
the output buffer if that one is too short. That's what's happening with
a config file of one empty line followed by a large number of args.
This means that what args[] points to cannot be relied on and that a
different approach is needed. Since no output is produced for spaces and
comments, we know that args[arg] continues to point to out+outpos as long
as only comments or spaces are found, which is what we're interested in.
As such it's safe to check the last arg's pointer against the one before
the trailing zero was emitted, in order to decide to count one final arg.
No backport is needed, unless the commit above is backported.
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.
This is the second attempt for this problem, here the explanation :
When called for the first line, no <out> was allocated yet so it's NULL,
letting the caller realloc a larger line if needed. However the words are
parsed and their respective args[arg] are filled with out+position, which
means that while the first arg is NULL, the other ones are no and fail the
test that was meant to avoid dereferencing a NULL. Let's simply check <out>
instead of <args> since the latter is always derived from the former and
cannot be NULL without the former also being.
This may need to be backported to stable versions.
"stdout" and "stderr" are not hashed. In the same spirit, "fd@" and
"sockpair@" prefixes are not hashed too. There is no reason to hash such
address and it may be useful to diagnose bugs.
No backport needed, except if anonymization mechanism is backported.
Add PA_O_DGRAM and PA_O_PORT_OFS options when str2sa_range() is called. This
way dgram sockets and addresses with port offsets are supported.
No backport needed, except if anonymization mechanism is backported.
Add a parameter hasport to return a simple hash or ipstring when
ipstring has no port. Doesn't hash if scramble is null. Add
option PA_O_PORT_RESOLVE to str2sa_range. Add a case UNIX.
Those modification permit to use hash_ipanon in cli section
in order to dump the same anonymization of address in the
configuration file and with CLI.
No backport needed, except if anonymization mechanism is backported.
This reverts commit 5529424ef1.
Since this patch, HAProxy crashes when the first line of the configuration
file contains more than one parameter because, on the first call of
parse_line(), the output line is not allocated. Thus elements in the
arguments array may point on invalid memory area.
It may be considered as a bug to reference invalid memory area and should be
fixed. But for now, it is safer to revert this patch
If the reverted commit is backported, this one must be backported too.
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.
This may need to be backported to stable versions.
chipitsine reported in github issue #1872 that in function hash_anon and
hash_ipanon, index_hash can be equal to NB_L_HASH_WORD and can reach an
inexisting line table, the table is initialized hash_word[NB_L_HASH_WORD][20];
so hash_word[NB_L_HASH_WORD] doesn't exist.
No backport needed, except if anonymization mechanism is backported.
Patrick Hemmer reported an improper log behavior when using
log-format to escape log data (+E option):
Some bytes were truncated from the output:
- escape_string() function now takes an extra parameter that
allow the caller to specify input string stop pointer in
case the input string is not guaranteed to be zero-terminated.
- Minors checks were added into lf_text_len() to make sure dst
string will not overflow.
- lf_text_len() now makes proper use of escape_string() function.
This should be backported as far as 1.8.
These macros and functions will be used to anonymize strings by producing
a short hash. This will allow to match config elements against dump elements
without revealing the original data. This will later be used to anonymize
configuration parts and CLI commands output. For now only string, identifiers
and addresses are supported, but the model is easily extensible.
It's convenient for debugging IP trees. However we're not dumping the
full keys, for the sake of simplicity, only the 4 first bytes are dumped
as a u32 hex value. In practice this is sufficient for debugging. As a
reminder since it seems difficult to recover the command each time it's
needed, the output is converted to an image using dot from Graphviz:
dot -o a.png -Tpng dump.txt
The first approach in commit 288dc1d8e ("BUG/MEDIUM: tools: avoid calling
dlsym() in static builds") relied on dlopen() but on certain configs (at
least gcc-4.8+ld-2.27+glibc-2.17) it used to catch situations where it
ought not fail.
Let's have a second try on this using dladdr() instead. The variable was
renamed "build_is_static" as it's exactly what's being detected there.
We could even take it for reporting in -vv though that doesn't seem very
useful. At least the variable was made global to ease inspection via the
debugger, or in case it's useful later.
Now it properly detects a static build even with gcc-4.4+glibc-2.11.1 and
doesn't crash anymore.
Since 2.4 with commit 64192392c ("MINOR: tools: add functions to retrieve
the address of a symbol"), we can resolve symbols. However some old glibc
crash in dlsym() when the program is statically built.
Fortunately even on these old libs we can detect lack of support by
calling dlopen(NULL). Normally it returns a handle to the current
program, but on a static build it returns NULL. This is sufficient to
refrain from calling dlsym() (which will be of very limited use anyway),
so we check this once at boot and use the result when needed.
This may be backported to 2.4. On stable versions, be careful to place
the init code inside an if/endif guard that checks for DL support.
Sometimes we need to be able to signal one thread among a mask, without
caring much about which bit will be picked. At the moment we use ffsl()
for this but this sometimes results in imbalance at certain specific
places where the same first thread in a set is always the same one that
is selected.
Another approach would consist in using the rank finding function but it
requires a popcount and a setup phase, and possibly a modulo operation
depending on the popcount, which starts to be very expensive.
Here we take a different approach. The idea is an input bit value is
passed, from 0 to LONGBITS-1, and that as much as possible we try to
pick the bit matching it if it is set. Otherwise we look at a mirror
position based on a decreasing power of two, and jump to the side
that still has bits left. In 6 iterations it ends up spotting one bit
among 64 and the operations are very cheap and optimizable. This method
has the benefit that we don't care where the holes are located in the
mask, thus it shows a good distribution of output bits based on the
input ones. A long-time test shows an average of 16 cycles, or ~4ns
per lookup at 3.8 GHz, which is about twice as fast as using the rank
finding function.
Just like for that one, the code was stored into tools.c since we don't
have a C file for intops.
In order to better detect the danger caused by extra shared libraries
which replace some symbols, upon dlopen() we now compare a few critical
symbols such as malloc(), free(), and some OpenSSL symbols, to see if
the loaded library comes with its own version. If this happens, a
warning is emitted and TAINTED_REDEFINITION is set. This is important
because some external libs might be linked against different libraries
than the ones haproxy was linked with, and most often this will end
very badly (e.g. an OpenSSL object is allocated by haproxy and freed
by such libs).
Since the main source of dlopen() calls is the Lua lib, via a "require"
statement, it's worth trying to show a Lua call trace when detecting a
symbol redefinition during dlopen(). As such we emit a Lua backtrace if
Lua is detected as being in use.
Several bug reports were caused by shared libraries being loaded by other
libraries or some Lua code. Such libraries could define alternate symbols
or include dependencies to alternate versions of a library used by haproxy,
making it very hard to understand backtraces and analyze the issue.
Let's intercept dlopen() and set a new TAINTED_SHARED_LIBS flag when it
succeeds, so that "show info" makes it visible that some external libs
were added.
The redefinition is based on the definition of RTLD_DEFAULT or RTLD_NEXT
that were previously used to detect that dlsym() is usable, since we need
it as well. This should be sufficient to catch most situations.
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.
The alphabetical ordering of include files was preserved.
The following functions which act on a connection-based stream connector
were renamed to sc_conn_* (~60 places):
cs_conn_drain_and_shut
cs_conn_process
cs_conn_read0
cs_conn_ready
cs_conn_recv
cs_conn_send
cs_conn_shut
cs_conn_shutr
cs_conn_shutw
The given size must be the size of the destination buffer, not the size of the
(binary) address representation.
This fixes GitHub issue #1599.
The bug was introduced in 92149f9a82 which is in
2.4+. The fix must be backported there.
If QUIC support is enabled both branches of the ternary conditional are
identical, upsetting Coverity. Move the full conditional into the non-QUIC
preprocessor branch to make the code more clear.
This resolves GitHub issue #1710.
Just trying "quic4@:4433" with USE_QUIC not set rsults in such a cryptic
error:
[ALERT] (14610) : config : parsing [quic-mini.cfg:44] : 'bind' : unsupported protocol family 2 for address 'quic4@:4433'
Let's at least add the stream and datagram statuses to indicate what was
being looked for:
[ALERT] (15252) : config : parsing [quic-mini.cfg:44] : 'bind' : unsupported stream protocol for datagram family 2 address 'quic4@:4433'
Still not very pretty but gives a little bit more info.
The error message when mixing stream and dgram protocols in an
address speaks about sockets while it ought to speak about addresses,
let's fix this as in some contexts it can be a bit confusing.
cs_conn_io_cb(), cs_conn_sync_recv() and cs_conn_sync_send() are moved in
conn_stream.c. Associated functions are moved too (cs_notify, cs_conn_read0,
cs_conn_recv, cs_conn_send and cs_conn_process).
When trying to sort sets of strings, it's often needed to required to
compare 3 strings to see if the chosen one fits well between the two
others. That's what this function does, in addition to being able to
ignore extremities when they're NULL (typically for the first iteration
for example).
url2sa() still have an unfortunate case where it reads 1 byte too far,
it happens when no port or path are specified in the URL, and could
crash if the byte after the URL is not allocated (mostly with ASAN).
This case is never triggered in old versions of haproxy because url2sa
is used with buffers which are way bigger than the URL. It is only
triggered with the httpclient.
Should be bacported in every stable branches.
Fix 8a91374 ("BUG/MINOR: tools: url2sa reads ipv4 too far") introduced a
regression in the value returned when parsing an ipv4 host.
Tthe consumed length is supposed to be as far as the first character of
the path, only its not computed correctly anymore and return the length
minus the size of the scheme.
Fixed the issue by reverting 'curr' and 'url' as they were before the
patch.
Must be backported in every stable branch where the 8a91374 patch was
backported.
kFreeBSD needs to be treated as a distinct target from FreeBSD
since the underlying system libc is the GNU one. Thus, relying
only on __GLIBC__ no longer suffice.
- freebsd-glibc new target, key difference is including crypt.h
and linking to libdl like linux.
- cpu affinity available but the api is still the FreeBSD's.
- enabling auxiliary data access only for Linux.
Patch based on preliminary work done by @bigon.
closes#1555
The url2sa implementation is inconsitent when parsing an IPv4, indeed
url2sa() takes a <ulen> as a parameter where the call to url2ipv4() takes
a null terminated string. Which means url2ipv4 could try to read more
that it is supposed to.
This function is only used from a buffer so it never reach a unallocated
space. It can only cause an issue when used from the httpclient which
uses it with an ist.
This patch fixes the issue by copying everything in the trash and
null-terminated it.
Must be backported in all supported version.
dladdr1() is used on glibc and takes a void**, but we pass it a
const ElfW(Sym)** and some compilers complain that we're aliasing.
Let's just set a may_alias attribute on the local variable to
address this. There's no need to backport this unless warnings are
reported on older distros or uncommon compilers.
Many times core dumps reported by users who experience trouble are
difficult to exploit due to missing system libraries. Sometimes,
having just a list of loaded libraries and their respective addresses
can already provide some hints about some problems.
This patch makes a step in that direction by adding a new "show libs"
command that will try to enumerate the list of object files that are
loaded in memory, relying on the dynamic linker for this. It may also
be used to detect that some foreign code embarks other undesired libs
(e.g. some external Lua modules).
At the moment it's only supported on glibc when USE_DL is set, but it's
implemented in a way that ought to make it reasonably easy to be extended
to other platforms.
Sometimes it is really useful to be able to specify a default value for
an optional environment variable, like the ${name-value} construct in
shell. In fact we're really missing this for a number of settings in
reg tests, starting with timeouts.
This commit simply adds support for the common syntax above. Other
common forms like '+' to replace existing variables, or ':-' and ':+'
to act on empty variables, were not implemented at this stage, as they
are less commonly needed.
Instead of using sock_type and ctrl_type to select a protocol, let's
make use of the new protocol type. For now they always match so there
is no change. This is applied to address parsing and to socket retrieval
from older processes.
This new ssllib_name_startswith precondition check can be used to
distinguish application linked with OpenSSL from the ones linked with
other SSL libraries (LibreSSL or BoringSSL namely). This check takes a
string as input and returns 1 when the SSL library's name starts with
the given string. It is based on the OpenSSL_version function which
returns the same output as the "openssl version" command.