There currently is a large inconsistency in how binding parameters are
split between bind_conf and listeners. It happens that for historical
reasons some parameters are available at the listener level but cannot
be configured per-listener but only for a bind_conf, and thus, need to
be replicated. In addition, some of the bind_conf parameters are in fact
for the listening socket itself while others are for the instanciated
sockets.
A previous attempt at splitting listeners into receivers failed because
the boundary between all these settings is not well defined.
This patch introduces a level of listening socket settings in the
bind_conf, that will be detachable later. Such settings that are solely
for the listening socket are:
- unix socket permissions (used only during binding)
- interface (used for binding)
- network namespace (used for binding)
- process mask and thread mask (used during startup)
The rest seems to be used only to initialize the resulting sockets, or
to control the accept rate. For now, only the unix params (bind_conf->ux)
were moved there.
Just like with previous commit, DNS nameservers are affected as well with
addresses starting in "udp@", but here it's different, because due to
another bug in the DNS parser, the address is rejected, indicating that
it doesn't have a ->connect() method. Similarly, the DNS code believes
it's working on top of TCP at this point and this used to work because of
this. The same fix is applied to remap the protocol and the ->connect test
was dropped.
No backport is needed, as the ->connect() test will never strike in 2.2
or below.
Commit 3835c0dcb ("MEDIUM: udp: adds minimal proto udp support for
message listeners.") introduced a problematic side effect in log server
address parser: if "udp@", "udp4@" or "udp6@" prefixes a log server's
address, the adress is passed as-is to the log server with a non-existing
family and fails like this when trying to send:
[ALERT] 259/195708 (3474) : socket() failed in logger #1: Address family not supported by protocol (errno=97)
The problem is that till now there was no UDP family, so logs expect an
AF_INET family to be passed for UDP there.
This patch manually remaps AF_CUST_UDP4 and AF_CUST_UDP6 to their "tcp"
equivalent that the log server parser expects. No backport is needed.
Remove the last utility functions for handling the multi-cert bundles
and remove the multi-variable from the ckch structure.
With this patch, the bundles are completely removed.
The multi variable is not useful anymore since the removal of the
multi-certificates bundle support. It can be removed safely from the CLI
functions and suppose that every ckch contains a single certificate.
Since the removal of the multi-certificates bundle support, this
variable is not useful anymore, we can remove all tests for this
variable and suppose that every ckch contains a single certificate.
Like the previous commit, this one emulates the bundling by loading each
certificate separately and storing it in a separate SSL_CTX.
This patch does it for the standard certificate loading, which means
outside directories or crt-list.
The multi-certificates bundle was the common way of offering multiple
certificates of different types (ecdsa and rsa) for a same SSL_CTX.
This was implemented with OpenSSL 1.0.2 before the client_hello callback
was available.
Now that all versions which does not support this callback are
deprecated (< 1.1.0), we can safely removes the support for the bundle
which was inconvenient and complexify too much the code.
The multi-certificates bundle was the common way of offering multiple
certificates of different types (ecdsa and rsa) for a same SSL_CTX.
This was implemented with OpenSSL 1.0.2 before the client_hello callback
was available.
Now that all versions which does not support this callback are
depracated (< 1.1.0), we can safely removes the support for the bundle
which was inconvenient and complexify too much the code.
This patch emulates the bundle loading by looking for the bundle files
when the specified file in the configuration does not exist. It then
creates new entries in the crtlist, so they will appear as new line if
they are dumped from the CLI.
Remove the support for multi-certificates bundle in the CLI. There is
nothing to replace here, it will use the standard codepath with the
"bundle emulation" in the future.
The multi-cert certificates bundle is the former way, implemented with
openssl 1.0.2, of doing multi-certificate (RSA, ECDSA and DSA) for the
same SNI host. Remove this support temporarely so it is replaced by
the loading of each certificate in a separate SSL_CTX.
The use of "bind" wasn't that wise but was temporary. The problem is that
it will not allow to coexist with tcp. Let's explicitly call it "dgram-bind"
so that datagram listeners are expected here, leaving some room for stream
listeners later. This is the only change.
Since the refactoring of the crt-list, the same function is used to
parse a crt-list file and a crt-list line on the CLI.
The assumption that a line on the CLI and a line in a file is finished
by a \n was made. However that is potentialy not the case with a file
which does not finish by a \n.
This patch fixes issue #860 and must be backported in 2.2.
In the SSL code, when we were waiting for the availability of the crypto
engine, once it is ready and its fd's I/O handler is called, don't call
ssl_sock_io_cb() directly, instead, call tasklet_wakeup() on the
ssl_sock_ctx's tasklet. We were calling ssl_sock_io_cb() with NULL as
a tasklet, which used to be fine, but it is no longer true since the
fd takeover changes. We could just provide the tasklet, but let's just
wake the tasklet, as is done for other FDs, for fairness.
This should fix github issue #856.
This should be backported into 2.2.
The socks4 keyword parser was a bit too much copy-pasted, it only checks
for a null port and reports "invalid range". Let's properly check for the
1-65535 range and report the correct error.
It may be backported everywhere "socks4" is present (2.0).
In bug #835, @arjenzorgdoc reported that the verifyhost option on the
server line is case-sensitive, that shouldn't be the case.
This patch fixes the issue by replacing memcmp by strncasecmp and strcmp
by strcasecmp. The patch was suggested by @arjenzorgdoc.
This must be backported in all versions supporting the verifyhost
option.
Changes performed using the following coccinelle patch:
@@
type T;
expression E;
expression t;
@@
(
t = calloc(E, sizeof(*t))
|
- t = calloc(E, sizeof(T))
+ t = calloc(E, sizeof(*t))
)
Looking through the commit history, grepping for coccinelle shows that the same
replacement with a different patch was already performed in the past in commit
02779b6263a177b1e462e53db6eaf57bcda574bc.
newsrv->curr_idle_thr is of type `unsigned int`, not `int`. Fix this issue
by simply passing the dereferenced pointer to sizeof, which is the preferred
style anyway.
This bug was introduced in commit dc2f2753e97ecfe94827de56ee9efd2cd6d39ad3.
It first appeared in 2.2-dev5. The patch must be backported to 2.2+.
It is notable that the `calloc` call was not introduced within the commit in
question. The allocation was already happening before that commit and it
already looked like it does after applying the patch. Apparently the
argument for the `sizeof` managed to get broken during the rearrangement
that happened in that commit:
for (i = 0; i < global.nbthread; i++)
- MT_LIST_INIT(&newsrv->idle_orphan_conns[i]);
- newsrv->curr_idle_thr = calloc(global.nbthread, sizeof(*newsrv->curr_idle_thr));
+ MT_LIST_INIT(&newsrv->safe_conns[i]);
+
+ newsrv->curr_idle_thr = calloc(global.nbthread, sizeof(int));
Even more notable is that I previously fixed that *exact same* allocation in
commit 017484c80f2fd265281853fdf0bc816b19a751da.
So apparently it was managed to break this single line twice in the same
way for whatever reason there might be.
iif() takes a boolean as input and returns one of the two argument
strings depending on whether the boolean is true.
This converter most likely is most useful to return the proper scheme
depending on the value returned by the `ssl_fc` fetch, e.g. for use within
the `x-forwarded-proto` request header.
However it can also be useful for use within a template that is sent to
the client using `http-request return` with a `lf-file`. It allows the
administrator to implement a simple condition, without needing to prefill
variables within the regular configuration using `http-request
set-var(req.foo)`.
It must be done to expire patterns cached in the LRU cache. Otherwise it is
possible to retrieve an already freed pattern, attached to a released pattern
expression.
When a specific pattern is deleted (->delete() callback), the pattern expression
revision is already renewed. Thus it is not affected by this bug. Only prune
action on the pattern expression is concerned.
In addition, for a pattern expression, in ->prune() callbacks when the pattern
list is released, a missing LIST_DEL() has been added. It is not a real issue
because the list is reinitialized at the end and all elements are released and
should never be reused. But it is less confusing this way.
This bug may be triggered when a map is cleared from the cli socket. A
workaround is to set the pattern cache size (tune.pattern.cache-size) to 0 to
disable it.
This patch should fix the issue #844. It must be backported to all supported
versions.
This allocation technically is always reachable and cannot leak, however other
global variables such as `oldpids` are already being freed. This is in an
attempt to get HAProxy to a state where there are zero live allocations after a
clean exit.
Given the following example configuration:
listen http
bind *:80
mode http
stats scope .
Running a configuration check with valgrind reports:
==16341== 26 (24 direct, 2 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 13
==16341== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16341== by 0x571C2E: stats_add_scope (uri_auth.c:296)
==16341== by 0x46CE29: cfg_parse_listen (cfgparse-listen.c:1901)
==16341== by 0x45A112: readcfgfile (cfgparse.c:2078)
==16341== by 0x50A0F5: init (haproxy.c:1828)
==16341== by 0x418248: main (haproxy.c:3012)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
It initially looked appealing to be able to call traces with ",,," for
unused arguments, but tcc doesn't like empty macro arguments, and quite
frankly, adding a zero between the few remaining ones is no big deal.
Let's do so now.
Commit 77b98220e ("BUG/MINOR: threads: work around a libgcc_s issue with
chrooting") tried to address an issue with libgcc_s being loaded too late.
But it turns out that the symbol used there isn't present on armhf, thus
it breaks the build.
Given that the issue manifests itself during pthread_exit(), the safest
and most portable way to test this is to call pthread_exit(). For this
we create a dummy thread which exits, during the early boot. This results
in the relevant library to be loaded if needed, making sure that a later
call to pthread_exit() will still work. It was tested to work fine under
linux on the following platforms:
glibc:
- armhf
- aarch64
- x86_64
- sparc64
- ppc64le
musl:
- mipsel
Just running the code under strace easily shows the call in the dummy
thread, for example here on armhf:
$ strace -fe trace=file ./haproxy -v 2>&1 | grep gcc_s
[pid 23055] open("/lib/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
The code was isolated so that it's easy to #ifdef it out if needed.
This should be backported where the patch above is backported (likely
2.0).
The condition in h1_refresh_timeout() seems insufficient to properly
take care of the half-closed timeout, because depending on the ordering
of operations when performing the last send() to a client, the stream
may or may not still be there and we may fail to shrink the client
timeout on our last opportunity to do so.
Here we want to make sure that the timeout is always reduced when the
last chunk was sent and the shutdown completed, regardless of the
presence of a stream or not. This is what this patch does.
This should be backported as far as 2.0, and should fix the issue
reported in #541.
Since 1.8 with commit e8692b41e ("CLEANUP: auth: use the build options list
to report its support"), crypt(3) is always reported as being supported in
"haproxy -vv" because no test on USE_LIBCRYPT is made anymore when
producing the output.
This reintroduces the distinction between with and without USE_LIBCRYPT
in the output by indicating "yes" or "no". It may be backported as far
as 1.8, though the code differs due to a number of include files cleanups.
When the server address is set for the first time, the log message is a bit ugly
because there is no old ip address to report. Thus in the log, we can see :
PX/SRV changed its IP from to A.B.C.D by DNS additional record.
Now, when this happens, "(none)" is reported :
PX/SRV changed its IP from (none) to A.B.C.D by DNS additional record.
This patch may be backported to 2.2.
When a SRV record for an already known server is processed, only the weight is
updated, if not configured to be ignored. It is a problem if the IP address
carried by the associated additional record changes. Because the server IP
address is never renewed.
To fix this bug, If there is an addition record attached to a SRV record, we
always try to set the IP address. If it is the same, no change is
performed. This way, IP changes are always handled.
This patch should fix the issue #841. It must be backported to 2.2.
A SRV record keeps a reference on the corresponding additional record, if
any. But this additional record is also inserted in a separate linked-list into
the dns response. The problems arise when obsolete additional records are
released. The additional records list is purged but the SRV records always
reference these objects, leading to an undefined behavior. Worst, this happens
very quickly because additional records are never renewed. Thus, once received,
an additional record will always expire.
Now, the addtional record are only associated to a SRV record or simply
ignored. And the last version is always used.
This patch helps to fix the issue #841. It must be backported to 2.2.
The pathq sample fetch extract the relative URI of a request, i.e the path with
the query-string, excluding the scheme and the authority, if any. It is pretty
handy to always get a relative URI independently on the HTTP version. Indeed,
while relative URIs are common in HTTP/1.1, in HTTP/2, most of time clients use
absolute URIs.
This patch may be backported to 2.2.
These actions do the same as corresponding "-path" versions except the
query-string is included to the manipulated request path. This means set-pathq
action replaces the path and the query-string and replace-pathq action matches
and replace the path including the query-string.
This patch may be backported to 2.2.
This reverts commit 4b9c0d1fc08388bf44c6ebbd88f786032dd010fc.
Actually, the "replace-path" action is ambiguous. "set-path" action preserves
the query-string. The "path" sample fetch does not contain the query-string. But
"replace-path" action is documented to handle the query-string. It is probably
not the expected behavior. So instead of fixing the code, we will fix the
documentation to make "replace-path" action consistent with other parts of the
code. In addition actions and sample fetches to handle the path with the
query-string will be added.
If the commit above is ever backported, this one must be as well.
It was reported in bug #837 that haproxy -s causes a 100% CPU.
However this option does not exist and haproxy must exit with the
usage message.
The parser was not handling the case where -s is not followed by 't' or
'f' which are the only two valid cases.
This bug was introduced by df6c5a ("BUG/MEDIUM: mworker: fix the copy of
options in copy_argv()") which was backported as far as 1.8.
This fix must be backported as far as 1.8.
Ever since the protocols were added in 1.3.13, listeners used to be
started twice:
- once by start_proxies(), which iteratees over all proxies then all
listeners ;
- once by protocol_bind_all() which iterates over all protocols then
all listeners ;
It's a real mess because error reporting is not even consistent, and
more importantly now that some protocols do not appear in regular
proxies (peers, logs), there is no way to retry their binding should
it fail on the last step.
What this patch does is to make sure that listeners are exclusively
started by protocols. The failure to start a listener now causes the
emission of an error indicating the proxy's name (as it used to be
the case per proxy), and retryable failures are silently ignored
during all but last attempts.
The start_proxies() function was kept solely for setting the proxy's
state to READY and emitting the "Proxy started" message and log that
some have likely got used to seeking in their logs.
Similarly to previous commit about ->bind_all(), we have the same
construct for ->unbind_all() which ought not to be used either. Let's
make protocol_unbind_all() iterate over all listeners and directly
call unbind_listener() instead.
It's worth noting that for uxst there was originally a specific
->unbind_all() function but the simplifications that came over the
years have resulted in a locally reimplemented version of the same
function: the test on (state > LI_ASSIGNED) there is equivalent to
the one on (state >= LI_PAUSED) that is used in do_unbind_listener(),
and it seems these have been equivalent since at least commit dabf2e264
("[MAJOR] added a new state to listeners")) (1.3.14).
All protocols only iterate over their own listeners list and start
the listeners using a direct call to their ->bind() function. This
code duplication doesn't make sense and prevents us from centralizing
the startup error handling. Worse, it's not even symmetric because
there's an unbind_all_listeners() function common to all protocols
without any equivalent for binding. Let's start by directly calling
each protocol's bind() function from protocol_bind_all().
Previous commit 77b98220e ("BUG/MINOR: threads: work around a libgcc_s
issue with chrooting") broke the build on cygwin. I didn't even know we
supported threads on cygwin. But the point is that it's actually the
glibc-based libpthread which requires libgcc_s, so in absence of other
reports we should not apply the workaround on other libraries.
This should be backported along with the aforementioned patch.
Sander Hoentjen reported another issue related to libgcc_s in issue #671.
What happens is that when the old process quits, pthread_exit() calls
something from libgcc_s.so after the process was chrooted, and this is
the first call to that library, causing an attempt to load it. In a
chroot, this fails, thus libthread aborts. The behavior widely differs
between operating systems because some decided to use a static build for
this library.
In 2.2 this was resolved as a side effect of a workaround for the same issue
with the backtrace() call, which is also in libgcc_s. This was in commit
0214b45 ("MINOR: debug: call backtrace() once upon startup"). But backtraces
are not necessarily enabled, and we need something for older versions.
By inspecting a significant number of ligcc_s on various gcc versions and
platforms, it appears that a few functions have been present since gcc 3.0,
one of which, _Unwind_Find_FDE() has no side effect (it only returns a
pointer). What this patch does is that in the thread initialization code,
if built with gcc >= 3.0, a call to this function is made in order to make
sure that libgcc_s is loaded at start up time and that there will be no
need to load it upon exit.
An easy way to check which libs are loaded under Linux is :
$ strace -e trace=openat ./haproxy -v
With this patch applied, libgcc_s now appears during init.
Sander confirmed that this patch was enough to put an end to the core
dumps on exit in 2.0, so this patch should be backported there, and maybe
as far as 1.8.
In issue #777, cppcheck wrongly assumes a useless null pointer check
in the expression below while it's obvious that in a 3G/1G split on
32-bit, len can become positive if p is NULL:
p = memchr(ctx.value.ptr, ' ', ctx.value.len);
len = p - ctx.value.ptr;
if (!p || len <= 0)
return 0;
In addition, on 64 bits you never know given that len is a 32-bit signed
int thus the sign of the result in case of a null p will always be the
opposite of the 32th bit of ctx.value.ptr. Admittedly the test is ugly.
Tim proposed this fix consisting in checking for p == ctx.value.ptr
instead when checking for first character only, which Ilya confirmed is
enough to shut cppcheck up. No backport is needed.
When calling the http_replace_res_status() function, an optional reason may now
be set. It is ignored if it points to NULL and the original reason is
preserved. Only the response status is replaced. Otherwise both the status and
the reason are replaced.
It simplifies the API and most of time, avoids an extra call to
http_replace_res_reason().
The documentation stated the "replace-path" action replaces the path, including
the query-string if any is present. But in the code, only the path is
replaced. The query-string is preserved. So, now, instead of relying on the same
action code than "set-uri" action (1), a new action code (4) is used for
"replace-path" action. In http_req_replace_stline() function, when the action
code is 4, we call http_replace_req_path() setting the last argument (with_qs)
to 1. This way, the query-string is not skipped but included to the path to be
replaced.
This patch relies on the commit b8ce505c6 ("MINOR: http-htx: Add an option to
eval query-string when the path is replaced"). Both must be backported as far as
2.0. It should fix the issue #829.
The http_replace_req_path() function now takes a third argument to evaluate the
query-string as part of the path or to preserve it. If <with_qs> is set, the
query-string is replaced with the path. Otherwise, only the path is replaced.
This patch is mandatory to fix issue #829. The next commit depends on it. So be
carefull during backports.