The 32-bits version field of the Retry paquet was inversed by the code. As this
field must be in the network byte order on the wire, this code has supposed that
the sender of the Retry packet will always be little endian. Hopefully this is
often the case on our Intel machines ;)
Must be backported as far as 2.6.
The 4 bits least significant bits of the first byte in a Retry packet must be
random. There are generated calling statistical_prng_range() with 16 as argument.
Must be backported as far as 2.6.
When a stick-table is defined within a peers section, the name is
prefixed with the peers section name. However when checking for
duplicate table names, the check was using the table name without
the prefix, and would thus never match.
Must be backported as far as 2.6.
This patch introduces the "client-sigalgs" keyword for the server line,
which allows to configure the list of server signature algorithms
negociated during the handshake. Also available as
"ssl-default-server-client-sigalgs" in the global section.
This patch introduces the "sigalgs" keyword for the server line, which
allows to configure the list of server signature algorithms negociated
during the handshake. Also available as "ssl-default-server-sigalgs" in
the global section.
This warning happened in 2.9-dev with commit 723c73f8a ("MEDIUM: mux-h1:
Split h1_process_mux() to make code more readable"). It's the usual gcc
crap that relies on comments to disable the warning but which drops these
comments between the preprocessor and the compiler, so using any split
build system (distcc, ccache etc) reintroduces the warning. Use the more
reliable and portable __fallthrough instead. No backport needed.
Return a more acurate error than the previous patch, CO_ER_SSL_EMPTY is
the code for "Connection closed during SSL handshake" which is more
precise than CO_ER_SSL_ABORT ("Connection error during SSL handshake").
No backport needed.
During a SSL_do_handshake(), SSL_ERROR_ZERO_RETURN can be returned in case
the remote peer sent a close_notify alert. Previously this would set the
connection error to CO_ER_SSL_HANDSHAKE, this patch sets it to
CO_ER_SSL_ABORT to have a more acurate error.
This bug was introduced by this commit which was not sufficient:
BUG/MINOR: quic: Possible endless loop in quic_lstnr_dghdlr()
It was revealed by the blackhole interop runner test with neqo as client.
qc_conn_release() could be called after having locke the CID tree when two different
threads was creating the same connection at the same time. Indeed in this case
the last thread which tried to create a new connection for the same an already existing
CID could not manage to insert an already inserted CID in the connection CID tree.
This was expected. It had to destroy the newly created for nothing connection calling
qc_conn_release(). But this function also locks this tree calling free_quic_conn_cids() leading to a deadlock.
A solution would have been to delete the new CID created from its tree before
calling qc_conn_release().
A better solution is to stop inserting the first CID from qc_new_conn(), and to
insert it into the CID tree only if there was not an already created connection.
This is whas is implemented by this patch.
Must be backported as far as 2.7.
Aurelien Darragon found a case of leak when working on ticket #2184.
When a reexec_on_failure() happens *BEFORE* protocol_bind_all(), the
worker is not fork and the mworker_proc struct is still there with
its 2 socketpairs.
The socketpair that is supposed to be in the master is already closed in
mworker_cleanup_proc(), the one for the worker was suppposed to
be cleaned up in mworker_cleanlisteners().
However, since the fd is not bound during this failure, the fd is never
closed.
This patch fixes the problem by setting the fd to -1 in the mworker_proc
after the fork, so we ensure that this it won't be close if everything
was done right, and then we try to close it in mworker_cleanup_proc()
when it's not set to -1.
This could be triggered with the script in ticket #2184 and a `ulimit -H
-n 300`. This will fail before the protocol_bind_all() when trying to
increase the nofile setrlimit.
In recent version of haproxy, there is a BUG_ON() in fd_insert() that
could be triggered by this bug because of the global.maxsock check.
Must be backported as far as 2.6.
The problem could exist in previous version but the code is different
and this won't be triggered easily without other consequences in the
master.
In GH #2187 it was mentioned that the ifnone-forwardfor regtest
did not cover the case where forwardfor ifnone is explicitly set in
the frontend but forwardfor option is not used in the backend.
Expected behavior in this case is that the frontend takes the precedence
because the backend did not specify the option.
Adding this missing case to prevent regressions in the future.
A regression was introduced in 730b983 ("MINOR: proxy: move 'forwardfor'
option to http_ext")
Indeed, when the forwardfor if-none option is specified on the frontend
but forwardfor is not specified at all on the backend: if-none from the
frontend is ignored.
But this behavior conflicts with the historical one, if-none should only
be ignored if forwardfor is also enabled on the backend and if-none is
not set there.
It should fix GH #2187.
This should be backported in 2.8 with 730b983 ("MINOR: proxy: move
'forwardfor' option to http_ext")
h1_append_chunk_size() and h1_prepend_chunk_crlf() functions were marked as
possibly unused to be able to add them in standalone commits. Now these
functions are used, the __maybe_unused statement can be removed.
When the HTX was introduced, we have lost the support for the kernel
splicing for chunked messages. Thanks to this patch set, it is possible
again. Of course, we still need to keep the H1 parser synchronized. Thus
only the chunk content can be spliced. We still need to read the chunk
envelope using a buffer.
There is no reason to backport this feature. But, just in case, this patch
depends on following patches:
* "MEDIUM: filters/htx: Don't rely on HTX extra field if payload is filtered"
* "MINOR: mux-h1: Add function to prepend the chunk crlf to the output buffer"
* "MINOR: mux-h1: Add function to append the chunk size to the output buffer"
* "REORG: mux-h1: Rename functions to emit chunk size/crlf in the output buffer"
* "MEDIUM: mux-h1: Split h1_process_mux() to make code more readable"
If an HTTP data filter is registered on a channel, we must not rely on the
HTX extra field because the payload may be changed and we cannot predict if
this value will change or not. It is too errorprone to let filters deal with
this reponsibility. So we set it to 0 when payload filtering is performed,
but only if the payload length can be determined. It is important because
this field may be used when data are forwarded. In fact, it will be used by
the H1 multiplexer to be able to splice chunk-encoded payload.
h1_process_mux() function was pretty huge a quite hard to debug. So, the
funcion is split in sub-functions. Each sub-function is responsible to a
part of the message (start-line, headers, payload, trailers...). We are
still relying on a HTTP parser to format the message to be sure to detect
errors. Functionnaly speaking, there is no change. But the code is now more
readable.
Depending on the timing, time to time, the log messages can be mixed. A
client can start and be fully handled by HAProxy (including its log message)
before the log message of the previous client was emitted or received. To
fix the issue, a barrier was added to be sure to eval the "expect" rule on
logs before starting the next client.
It appears that dconv dislikes the "see also" part being on the same line as
the regular paragraph. The beginning of the line does not show up in the
rendered version.
Attempt to fix this by inserting an additional newline which is consistent with
other options.
This bug arrived with this commit:
MINOR: quic: Remove pool_zalloc() from qc_new_conn()
Missing initialization of largest packet number received during a keyupdate phase.
This prevented the keyupdate feature from working and made the keyupdate interop
tests to fail for all the clients.
Furthermore, ->flags from quic_tls_ctx was also not initialized. This could
also impact the keyupdate feature at least.
No backport needed.
Replace a "less than" comparison between two tick variable by a call to tick_is_lt()
in quic_loss_pktns(). This bug could lead to a wrong packet loss detection
when the loss time computed values could wrap. This is the case 20 seconds after
haproxy has started.
Must be backported as far as 2.6.
In ticket #2184, HAProxy is crashing in a BUG_ON() after a lot of reload
when the previous processes did not exit.
Each worker has a socketpair which is a FD in the master, when reloading
this FD still exists until the process leaves. But the global.maxconn
value is not incremented for each of these FD. So when there is too much
workers and the number of FD reaches maxsock, the next FD inserted in
the poller will crash the process.
This patch fixes the issue by increasing the maxsock for each remaining
worker.
Must be backported in every maintained version.
This bug was introduced by this commit:
MINOR: quic: Remove pool_zalloc() from qc_new_conn()
The transport parameters was not initialized. This leaded to a crash when
dumping the received ones from TRACE()s.
Also reset the lengths of the CIDs attached to a quic_conn struct to 0 value
to prevent them from being dumped from traces when not already initialized.
No backport needed.
Replace a call to pool_zalloc() by a call to pool_malloc() into quic_dgram_parse
to allocate quic_rx_packet struct objects.
Initialize almost all the members of quic_rx_packet struct.
->saddr is initialized by quic_rx_pkt_retrieve_conn().
->pnl and ->pn are initialized by qc_do_rm_hp().
->dcid and ->scid are initialized by quic_rx_pkt_parse() which calls
quic_packet_read_long_header() for a long packet. For a short packet,
only ->dcid will be initialized.
pool_zalloc() is replaced by pool_alloc() into qc_conn_alloc_ssl_ctx() to allocate
a ssl_sock_ctx struct. ssl_sock_ctx struct member are all initiliazed to null values
excepted ->ssl which is initialized by the next statement: a call to qc_ssl_sess_init().
qc_new_conn() is ued to initialize QUIC connections with quic_conn struct objects.
This function calls quic_conn_release() when it fails to initialize a connection.
quic_conn_release() is also called to release the memory allocated by a QUIC
connection.
Replace pool_zalloc() by pool_alloc() in this function and initialize
all quic_conn struct members which are referenced by quic_conn_release() to
prevent use of non initialized variables in this fonction.
The ebtrees, the lists attached to quic_conn struct must be initialized.
The tasks must be reset to their NULL default values to be safely destroyed
by task_destroy(). This is all the case for all the TLS cipher contexts
of the encryption levels (struct quic_enc_level) and those for the keyupdate.
The packet number spaces (struct quic_pktns) must also be initialized.
->prx_counters pointer must be initialized to prevent quic_conn_prx_cntrs_update()
from dereferencing this pointer.
->latest_rtt member of quic_loss struct must also be initialized. This is done
by quic_loss_init() called by quic_path_init().
This may happen when the initilization of a new QUIC conn fails with qc_new_conn()
when receiving an Initial paquet. This is done after having allocated a CID with
new_quic_cid() called by quic_rx_pkt_retrieve_conn() which stays in the listener
connections tree without a QUIC connection attached to. Then when the listener
receives another Initial packet for the same CID, quic_rx_pkt_retrieve_conn()
returns NULL again (no QUIC connection) but with an thread ID already bound to the
connection, leading the datagram to be requeued in the same datagram handler thread
queue. And so on.
To fix this, the connection is created after having created the connection ID.
If this fails, the connection is deallocated.
During the race condition, when two different threads handle two datagrams for
the same connection, in addition to releasing the newer created connection ID,
the newer QUIC connection must also be released.
Must be backported as far as 2.7.
quic_conn_prx_cntrs_update() may be called from quic_conn_release() with
NULL as value for ->prx_counters member. This is the case when qc_new_conn() fails
when allocating <buf_area>. In this case quic_conn_prx_cntrs_update() BUG_ON().
Must be backported as far as 2.7.
->tx.pto_probe member of quic_pktns struct was not initialized by quic_pktns_init().
This bug never occured because all quic_pktns structs are attached to quic_conn
structs which are always pool_zalloc()'ed.
Must be backported as far as 2.6.
On soft-stop, netns_sig_stop() function is called to purge the shared
namespace tree by iterating over each entries to free them.
However, once an entry is cleaned up and removed from the tree, the entry
itself isn't freed and this results into a minor leak when soft-stopping
because entry was allocated using calloc() in netns_store_insert() when
parsing the configuration.
This could be backported in every stable versions.
When support for 'namespace' keyword was added for the 'default-server'
directive in 22f41a2 ("MINOR: server: Make 'default-server' support
'namespace' keyword."), we forgot to copy the attribute from the parent
to the newly created server.
This resulted in the 'namespace' keyword being parsed without errors when
used from a 'default-server' directive, but in practise the option was
simply ignored.
There's no need to duplicate the netns struct because it is stored in
a shared list, so copying the pointer does the job.
This patch partially fixes GH #2038 and should be backported to all
stable versions.
The local address was dumped as "from" address by dump_quic_full() and
the peer address as "to" address. This patch fixes this issue.
Furthermore, to support the server side (QUIC client) to come, it is preferable
to stop using "from" and "to" labels to dump the local and peer addresses which
is confusing for a QUIC client which uses its local address as "from" address.
To mimic netstat, this is "Local Address" and "Foreign Address" which will
be displayed by "show quic" CLI command and "local_addr" and "foreign_addr"
for "show quic full" command to mention the local addresses and the peer
addresses.
Must be backported as far as 2.7.
This bug arrived with this commit which was supposed to fix another one:
BUG/MINOR: quic: Wrong Application encryption level selection when probing
The aim of this patch was to prevent the Application encryption to be selected
when probing leading to ACK only packets to be sent if the ack delay timer
had fired in the meantime, leading to crashes when no 01-RTT had been sent
because the ack range tree is empty in this case.
This statement is not correct (qc->pktns->flags & QUIC_FL_PKTNS_PROBE_NEEDED)
because qc->pktns is an array of packet number space. But it is equivalent
to (qc->pktns[QUIC_TLS_PKTNS_INITIAL].flags & QUIC_FL_PKTNS_PROBE_NEEDED).
That said, the patch mentionned above is not more useful since this following
which disable the ack time during the handshakes:
BUG/MINOR: quic: Do not use ack delay during the handshakes
This commit revert the first patch mentionned above.
Must be backported as far as 2.6.
It was reported in issue #2181, strange behavior during the new SSL
hanshake failure logs.
Errors were logged with the code 0, which is unknown to OpenSSL.
This patch mades 2 changes:
- It stops using ERR_error_string() when the SSL error code is 0
- It uses ERR_error_string_n() to be thread-safe
Must be backported to 2.8.
When an HTTP applet tries to get request data, we must take care to properly
detect the end of the message. It an empty HTX message with the SC_FL_EOI
flag set on the front SC. However, an issue was introduced during the SC
refactoring performed in the 2.8. The backend SC is tested instead of the
frontend one.
Because of this bug, the receive functions hang because the test on
SC_FL_EOI flag never succeeds. Of course, by checking the frontend SC (the
opposite SC to the one attached to the appctx), it works.
This patch should fix the issue #2180. It must be backported to the 2.8.
proxy default-server is a specific type of server that is not allocated
using new_server(): it is directly stored within the parent proxy
structure. However, since it may contain some default config options that
may be inherited by regular servers, it is also subject to dynamic members
(strings, structures..) that needs to be deallocated when the parent proxy
is cleaned up.
Unfortunately, srv_drop() may not be used directly from p->defsrv since
this function is meant to be used on regular servers only (those created
using new_server()).
To circumvent this, we're splitting srv_drop() to make a new function
called srv_free_params() that takes care of the member cleaning which
originally takes place in srv_drop(). This function is exposed through
server.h, so it may be called from outside server.c.
Thanks to this, calling srv_free_params(&p->defsrv) from free_proxy()
prevents any memory leaks due to dynamic parameters allocated when
parsing a default-server line from a proxy section.
This partially fixes GH #2173 and may be backported to 2.8.
[While it could also be relevant for other stable versions, the patch
won't apply due to architectural changes / name changes between 2.4 => 2.6
and then 2.6 => 2.8. Considering this is a minor fix that only makes
memory analyzers happy during deinit paths (at least for <= 2.8), it might
not be worth the trouble to backport them any further?]
bind->settings.interface hint is allocated when "interface" keyword
is specified on a bind line, but the string isn't explicitly freed in
proxy_free, resulting in minor memory leak on deinit paths when the
keyword is being used.
It partially fixes GH #2173 and may be backported to all stable versions.
[in 2.2 free_proxy did not exist so the patch must be applied directly
in deinit() function from haproxy.c]
When interface keyword is used multiple times within the same bind line,
the previous value isn't checked and is rewritten as-is, resulting in a
small memory leak.
Ensuring the interface name is first freed before assigning it to a new
value.
This may be backported to every stable versions.
[Note for 2.2, the fix must be performed in bind_parse_interface() from
proto_tcp.c, directly within the listener's loop, also ha_free() was
not available so free() must be used instead]
Complementary fix to ac456ab ("DOC: config: fix rfc7239 converter examples")
since somehow I managed to overlook one example..
This needs to be backported in 2.8 with ac456ab.
To prevent bogus matches, var() does not default to string type anymore
since 44c5ff6 ("MEDIUM: vars: make the var() sample fetch function really
return type ANY).
Thanks to the above fix, haproxy now returns an error if var() is used
within an ACL or IF condition and the matching type is not explicitly
set.
However, the documentation was not updated to reflect this change.
This partially fixes GH #2087 and must be backported up to 2.6.
Commit 511ddd5 introduced tune.quic.socket-owner parameter related to
QUIC socket behaviour. However it was misspelled in configuration.txt in
'bind' section as tune.quic.conn-owner.
Because of the commit 5cb8d7b8f ("BUG/MINOR: peers: Improve detection of
config errors in peers sections"), 2 scripts now report errors during
startup because some variables are not set and the remote peer server is
thus malformed. To perform a peer synchro between 2 haproxys in these
scripts, the startup must be delayed to properly resolve addresses.
In addidiotn, we must wait (2s) to be sure the connection between peers is
properly established. These scripts are now flagged as slow.
There are several misuses in peers sections that are not detected during the
configuration parsing and that could lead to undefined behaviors or crashes.
First, only one listener is expected for a peers section. If several bind
lines or local peer definitions are used, an error is triggered. However, if
multiple addresses are set on the same bind line, there is no error while
only the last listener is properly configured. On the 2.8, there is no crash
but side effects are hardly predictable. On older version, HAProxy crashes
if an unconfigured listener is used.
Then, there is no check on remote peers name. It is unexpected to have same
name for several remote peers. There is now a test, performed during the
post-parsing, to verify all remote peer names are unique.
Finally, server parsing options for the peers sections are changed to be
sure a port is always defined, and not a port range or a port offset.
This patch fixes the issue #2066. It could be backported to all stable
versions.