Commit Graph

17551 Commits

Author SHA1 Message Date
Remi Tricot-Le Breton
5c25c577a0 BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP update task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.

These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.

Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.
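
A minimal sketch of the idea follows; the structure layout and function
names are illustrative only, not the actual ones from the patch:

  #include <stdbool.h>

  /* simplified stand-in for the real certificate_ocsp structure */
  struct ocsp_entry {
      bool updating;    /* set while the update task owns the entry */
      /* ... key, response data, update tree node ... */
  };

  /* illustrative CLI handler: refuse to touch an entry that the update
   * task currently owns, instead of removing it from the tree blindly */
  static int cli_force_ocsp_update(struct ocsp_entry *ocsp)
  {
      if (ocsp->updating)
          return -1;            /* update in progress, try again later */
      ocsp->updating = true;    /* claim the entry */
      /* ... remove it from the update tree and reinsert it with an
       * immediate update date ... */
      ocsp->updating = false;
      return 0;
  }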

An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.

This patch can be backported up to 2.8. Wait a little bit before
backporting.
2024-03-20 16:12:10 +01:00
Remi Tricot-Le Breton
69071490ff BUG/MAJOR: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each SSL_CTX that references it increments its
refcount. The reference to the certificate_ocsp is kept in the SSL_CTX
linked to each ckch_inst, in an ex_data entry that gets freed when the
context is freed.
One of the downsides of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets destroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), and if all
the corresponding SSL_CTXs were destroyed (no ongoing connection using
them), the OCSP response would be destroyed even if the certificate
remains in the system (as an unused certificate).
In such a case, we would want the OCSP response not to be "usable",
since it is not used by any ckch_inst, but still remain in the OCSP
response tree so that if the certificate gets reused (via an "add ssl
crt-list" command for instance), its OCSP response is still known as
well.
But we would also like such an entry not to be updated automatically
anymore once no instance uses it. An easy way to do it could have been
to keep a reference to the certificate_ocsp structure in the ckch_store
as well, on top of all the ones in the ckch_instances, and to remove the
ocsp response from the update tree once the refcount falls to 1, but it
would not work because of the way the ocsp response tree keys are
calculated. They are decorrelated from the ckch_store and are the actual
OCSP_CERTIDs, each a combination of the issuer's name hash and key
hash, plus the certificate's serial number. So two copies of the same
certificate but with different names would still point to the same ocsp
response tree entry.

The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
actual reference counter corresponding to the number of "live" pointers
on the certificate_ocsp structure, incremented for every SSL_CTX using
it, and one for the ckch stores.
If the ckch_store reference counter falls to 0, the corresponding
certificate must have been removed via CLI calls ('set ssl cert' for
instance).
If the actual refcount falls to 0, then no live SSL_CTX uses the
response anymore. It could happen if all the corresponding crt-list
lines were removed and there are no live SSL sessions using the
certificate anymore.
If any of the two refcounts becomes 0, we will always remove the
response from the auto update tree, because there's no point in spending
time updating an OCSP response that no new SSL connection will be able
to use. But the certificate_ocsp object won't be removed from the tree
unless both refcounts are 0.
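
The release rules described above can be summarized with the following
sketch (names and layout are simplified for illustration):

  #include <stdbool.h>

  /* simplified model of the two counters */
  struct ocsp_response {
      int refcount;         /* "live" pointers: one per SSL_CTX using it */
      int refcount_store;   /* one per ckch_store owning the certificate */
  };

  /* illustrative release logic: leave the auto-update tree as soon as
   * either counter reaches zero, but only free the entry once both do */
  static void ocsp_release(struct ocsp_response *ocsp, bool from_store)
  {
      if (from_store)
          ocsp->refcount_store--;
      else
          ocsp->refcount--;

      if (ocsp->refcount == 0 || ocsp->refcount_store == 0) {
          /* ... remove from the auto-update tree ... */
      }
      if (ocsp->refcount == 0 && ocsp->refcount_store == 0) {
          /* ... remove from the response tree and free the entry ... */
      }
  }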

Must be backported up to 2.8. Wait a little bit before backporting.
2024-03-20 16:12:10 +01:00
Amaury Denoyelle
87b96cf3a5 BUG/MAJOR: connection: fix server used_conns with H2 + reuse safe
By default, backend connections are accounted by the server. This makes
it possible to determine the number of idle connections to keep. A backend
connection can also be marked as private to prevent its reuse. It is
thus moved from the server lists into the session list. As such, a private
connection is not accounted by the server: conn_set_private() uses
srv_release_conn() to ensure this.

When using HTTP/2 on the backend side with the default http-reuse safe, the
above principles are mixed. Indeed, when a connection is first used, or
switches from idle to used, it is moved into the session list but is
not flagged as private. This is done to prevent its sharing by different
clients and thus avoid head-of-line blocking issues. When all streams are
closed, the connection becomes idle again and is reinserted in the
server list. This was introduced by the following patch:

  0d21deaded
  MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking

When freeing a backend connection, special care is taken to ensure
server used counter is decremented. This is implemented into
conn_backend_deinit(). However, this function does this only if the
connection is not present in a session list. This is valid for private
connections. However, if a connection is non-private and present only
temporarily in a session list, the decrement operation won't be
executed despite the connection being accounted by the server.
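
A rough sketch of the accounting rule being fixed (types and fields below
are simplified, not the real haproxy ones):

  #include <stdbool.h>

  struct server { int curr_used_conns; };

  struct connection {
      struct server *srv;       /* server that accounted this connection */
      bool is_private;          /* truly private, not accounted by the server */
      bool in_session_list;     /* temporarily attached to a session */
  };

  /* illustrative deinit: the used counter must be decremented for every
   * connection the server accounted, even when the connection currently
   * sits in a session list without being private */
  static void backend_conn_deinit(struct connection *conn)
  {
      if (conn->srv && !conn->is_private)
          conn->srv->curr_used_conns--;
      /* ... detach from session/server lists and release resources ... */
  }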

This bug has several impacts. The server used counter won't be able to
reach its initial zero value, even when all its connections are closed.
This can result in a wrong estimation of the necessary idle connections,
which may cause unnecessary new connections. It will also definitely
prevent the server from being removed via the "delete server" CLI
command.

This should be backported up to 2.4. Note that conn_backend_deinit() was
introduced in 2.9. For older versions, the change should be done
directly into conn_free().
2024-03-20 14:26:57 +01:00
Amaury Denoyelle
fd3ce173aa BUG/MEDIUM: http_ana: ignore NTLM for reuse aggressive/always and no H1
Backend connections can be marked as private to prevent their sharing by
multiple clients. This has now become an exception, as only two causes
can trigger it for data traffic (checks are ignored here):
* http-reuse never
* HTTP response with NTLM header

The first case is easy to manage as the connection is flagged as private
since its inception. However, the second case is dynamic as the
connection can be flagged anytime during its lifetime. When using a
backend protocol such as HTTP/2 with reuse mode aggressive or always, we
face a design issue as the connection would be marked as private,
despite potentially being shared by several clients at the same time.

This is conceptually invalid, but worse, it can trigger crashes on MUX
stream detach callback depending on the order of release of the streams,
by calling session_check_idle_conn() with a NULL session. It could also
be possible to have several NTLM responses on a single connection for
different sessions. In this case, the connection owner is still being
updated without attaching the connection to its correct session, which
ultimately would cause a crash on session_check_idle_conn with an
invalid session.

Here are two backtrace examples from GDB for such cases:

Thread 1 (Thread 0x7ff73e9fc700 (LWP 648859)):
 #0  session_check_idle_conn (conn=0x7ff72f597800, sess=0x0) at include/haproxy/session.h:209
 #1  h2_detach (sd=<optimized out>) at src/mux_h2.c:4520
 #2  0x000056151742be24 in sc_detach_endp (scp=scp@entry=0x7ff73e9f0f18) at src/stconn.c:376
 #3  0x000056151742c208 in sc_destroy (sc=<optimized out>) at src/stconn.c:444
 #4  0x0000561517370871 in stream_free (s=s@entry=0x7ff72a2dbd80) at src/stream.c:728
 #5  0x000056151737541f in process_stream (t=t@entry=0x7ff72d5e2620, context=0x7ff72a2dbd80, state=<optimized out>) at src/stream.c:2645
 #6  0x0000561517456cbb in run_tasks_from_lists (budgets=budgets@entry=0x7ff73e9f10d0) at src/task.c:632
 #7  0x00005615174576b9 in process_runnable_tasks () at src/task.c:876
 #8  0x000056151742275a in run_poll_loop () at src/haproxy.c:2996
 #9  0x0000561517422db1 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3195
 #10 0x00007ff789e081ca in start_thread () from /lib64/libpthread.so.0
 #11 0x00007ff789a39e73 in clone () from /lib64/libc.so.6

(gdb)
Thread 1 (Thread 0x7ff52e7fc700 (LWP 681458)):
 #0  0x0000556ebd6e7e69 in session_check_idle_conn (conn=0x7ff5787ff100, sess=0x7ff51d2539a0) at include/haproxy/session.h:209
 #1  h2_detach (sd=<optimized out>) at src/mux_h2.c:4520
 #2  0x0000556ebd7f3e24 in sc_detach_endp (scp=scp@entry=0x7ff52e7f0f18) at src/stconn.c:376
 #3  0x0000556ebd7f4208 in sc_destroy (sc=<optimized out>) at src/stconn.c:444
 #4  0x0000556ebd738871 in stream_free (s=s@entry=0x7ff520e28200) at src/stream.c:728
 #5  0x0000556ebd73d41f in process_stream (t=t@entry=0x7ff565783700, context=0x7ff520e28200, state=<optimized out>) at src/stream.c:2645
 #6  0x0000556ebd81ecbb in run_tasks_from_lists (budgets=budgets@entry=0x7ff52e7f10d0) at src/task.c:632
 #7  0x0000556ebd81f6b9 in process_runnable_tasks () at src/task.c:876
 #8  0x0000556ebd7ea75a in run_poll_loop () at src/haproxy.c:2996
 #9  0x0000556ebd7eadb1 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3195
 #10 0x00007ff5752081ca in start_thread () from /lib64/libpthread.so.0
 #11 0x00007ff574e39e73 in clone () from /lib64/libc.so.6
(gdb)

To solve this issue, simply ignore NTLM responses when the multiplexer
supports streams and the connection is not already
attached to the session. The connection is not marked as private and
will continue to be shared freely across clients. This is considered
conceptually valid as NTLM usage (rfc 4559) with HTTP is broken and was
designed only with HTTP/1.1 in mind. A side-effect of the change is that
SESS_FL_PREFER_LAST is also not set anymore on NTLM detection, which
allows following requests to be load-balanced across several server
instances.
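
The decision can be pictured as follows; the names are simplified
stand-ins for the real haproxy fields and helpers:

  #include <stdbool.h>

  #define SESS_FL_PREFER_LAST 0x1

  struct session    { unsigned int flags; };
  struct connection { bool mux_has_streams; bool attached_to_session; bool is_private; };

  /* illustrative handling of an NTLM response: only mark the connection
   * private when the mux cannot multiplex streams (H1) or when the
   * connection already belongs to the session; otherwise keep sharing it */
  static void handle_ntlm_response(struct session *sess, struct connection *conn)
  {
      if (!conn->mux_has_streams || conn->attached_to_session) {
          conn->is_private = true;
          sess->flags |= SESS_FL_PREFER_LAST;
      }
      /* else: the NTLM header is ignored and the connection stays shareable */
  }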

The original behavior is kept for HTTP/1 or if the connection is already
attached to the session. This last case happens when using HTTP/2 with
default http-reuse safe mode since the following patch:

  0d21deaded
  MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking

This should be backported to all stable releases. Down to 2.4, it
can be taken as-is. For older versions, the above patch is not present. In
this case the condition should be restricted to HTTP/1 usage only:

  if (srv_conn && strcmp(srv_conn->mux->name, "H1") == 0) {
2024-03-20 14:26:57 +01:00
Christopher Faulet
eb89e4f3e0 BUG/MEDIUM: spoe: Return an invalid frame on recv if size is too small
Frames that are too small must be detected on receipt and an error must
be triggered. It is especially important for frames of size 0. Otherwise,
because the frame length is used as the return value, the frame is ignored (0 is
the return value stating the frame must be ignored). This is an issue because
in this case the outgoing data, i.e. the 4 bytes representing the frame size, are
never consumed. If the agent also closes the connection, this leads to a
wakeup loop because outgoing data are stuck and a shutdown is pending.
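
A small sketch of the receive-side check (the minimum size below is
illustrative, not the actual SPOP value):

  #include <stdint.h>

  #define MIN_FRAME_SIZE 1  /* illustrative minimum, not the real protocol value */

  /* returning 0 means "ignore the frame" and would leave the 4-byte size
   * prefix unconsumed, so a too-small frame must instead be reported as an
   * error (negative value) to abort the processing */
  static int check_frame_size(uint32_t framesz)
  {
      if (framesz < MIN_FRAME_SIZE)
          return -1;            /* invalid frame: trigger an error */
      return (int)framesz;      /* number of payload bytes to process */
  }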

In addition, all pending outgoing data are systematically skipped when the
applet is in SPOE_APPCTX_ST_END state.

This patch should fix issue #2490. It must be backported to all stable
versions.
2024-03-19 07:54:25 +01:00
Ilia Shipitsin
5fe02c33bc CLEANUP: assorted typo fixes in the code and comments
This is the 40th iteration of typo fixes.
2024-03-18 19:54:33 +01:00
Christopher Faulet
885e40494c MINOR: spoe: Add SPOE filters in the exposed deprecated directives
This is the first deprecated directive exposed via the
'expose-deprecated-directives' global option. This way, it is possible to
silence the warning about SPOE usage.
2024-03-15 11:31:48 +01:00
Christopher Faulet
189f74d4ff MINOR: cfgparse: Add a global option to expose deprecated directives
Similarly to the "expose-experimental-directives" option, there is now a global
option to expose some deprecated directives. The idea is to have a way to silence
warnings about deprecated directives when there is no alternative solution.

Of course, deprecated directives covered by this option are not listed and
may change. It is only a best effort to let users upgrade smoothly.
2024-03-15 11:31:48 +01:00
Christopher Faulet
dff9807188 MAJOR: spoe: Deprecate the SPOE filter
As announced on the ML a few weeks (months?) ago and in several GH issues,
the SPOE is now deprecated. Sadly, this filter would need to be refactored to work
properly. It was implemented as a functional PoC for 1.7 and since
then, no time was invested to improve it and make it truly maintainable over
time. Worse, other parts of HAProxy evolve, especially the applets part, making
maintenance ever more expensive.

Instead of keeping the SPOE filter in this state and always replying to users
encountering issues or limitations that it is far from perfect but that we cannot
work on it for now, we decided to deprecate it.

We can still change our mind before the 3.0.0 release if the situation
evolves. Otherwise the filter will be removed or marked as unmaintained in
3.1. If the situation does not change, it means 3.0 will be the last
version with true SPOE support.
2024-03-15 11:29:39 +01:00
Christopher Faulet
6547b14292 BUG/MINOR: spoe: Be sure to be able to quickly close IDLE applets on soft-stop
On soft-stop, we try, as far as possible, to process all pending messages
before closing SPOE applets. However, in sync mode, when an applet waiting
for a response receives the ACK frame, it is switched to IDLE state without
checking whether it may be closed. In this case, we will wait for the idle
timeout before closing the applet, delaying the soft-stop.

To reduce this delay, on soft-stop, IDLE applets are woken up. On the next
wakeup, the applet will try to process pending messages or will be
closed.

This patch should be backported to all stable versions.
2024-03-15 09:09:22 +01:00
Christopher Faulet
3c066b1e34 BUG/MEDIUM: spoe: Don't rely on stream's expiration to detect processing timeout
On the stream side, the SPOE filter relied on the stream's expiration date to be
woken up and be able to detect a processing timeout. However, the stream
expiration date must not be updated this way, mainly because it may be
overwritten at the end of process_stream(). In the worst case, it is set to
TICK_ETERNITY for any reason. In this case, it is impossible to detect that the
SPOE filter must time out and abort the processing.

The right way to do it is to set an analysis expiration date on the
corresponding channel, depending on the direction. This expiration date will
be used to compute the stream's expiration date at the end of
process_stream().

This patch may be related to issue #2478. It must be backported to all
stable versions.
2024-03-15 09:09:22 +01:00
Amaury Denoyelle
7dae3ceaa0 BUG/MAJOR: server: do not delete srv referenced by session
A server can only be deleted if there are no elements referencing it.
This is taken care of via srv_check_for_deletion(), most notably for active
and idle connections.

A special case occurs for connections directly managed by a session.
This concerns so-called private connections, when using http-reuse never
or H2 + http-reuse safe for example. In this case, the server does not
account these connections in its idle lists. This caused a bug, as the
server could be deleted despite the session still being able to access it.

To properly fix this, add a new referencing element into the server for
these session connections. An mt_list has been chosen for this. On
default http-reuse, private connections are typically not used so it
won't make any difference. If using H2 servers, or more generally when
dealing with private connections, insert/delete should typically occur
only once per session lifetime so impact on performance should be
minimal.

This should be backported up to 2.4. Note that srv_check_for_deletion()
was introduced in 3.0 dev tree. On backport, the extra condition in it
should be placed in cli_parse_delete_server() instead.
2024-03-14 15:21:07 +01:00
Amaury Denoyelle
5ad801c058 MINOR: session: rename private conns elements
By default, backend connections are attached to a server instance. This
allows connection reuse to be implemented. However, in some particular cases,
a connection cannot be shared across several clients. These connections
are considered private and are attached to the session instance
instead.

These private connections are also indexed by the target server so that
they are not mixed. All of this is implemented via a dedicated structure
previously named struct sess_srv_list.

Rename it to better reflect its usage to struct sess_priv_conns. Also
rename its internal members and all of the associated functions.

This commit is only a renaming, thus no functional impact is expected.
2024-03-14 15:21:02 +01:00
Christopher Faulet
f31a4e302e BUG/MINOR: listener: Don't schedule frontend without task in listener_release()
A null pointer dereference was reported by Coverity in the listener_release()
function. Indeed, we must not try to schedule a frontend without a task when a
limit is still blocking the frontend. This issue was introduced by commit
65ae1347c7 ("BUG/MINOR: listener: Wake proxy's mngmt task up if necessary on
session release").

This patch should fix issue #2488. It must be backported to all stable
versions with the commit above.
2024-03-14 09:34:36 +01:00
Christopher Faulet
65ae1347c7 BUG/MINOR: listener: Wake proxy's mngmt task up if necessary on session release
When a session is released, the listener_release() function is called to notify
the listener. It is an opportunity to resume limited/full listeners. We
first try to resume the listener owning the released session, then all
limited listeners in the global queue and finally all limited listeners in
the frontend's waiting queue. This last step is only performed if there is
no limit applied on the frontend. Nothing is performed if the session rate is
still limited. And that is an issue, because if this happens for the last
listener's session, there is no other event to wake the frontend's management
task up and the listener remains in the limited state.

To fix the issue, when a limit is still applied on the frontend, we must
compute the new wake-up date from the session rate and schedule the
frontend's management task.

It is easy to reproduce the issue in SSL by setting a maxconn and a rate
limit on sessions.

This patch should fix the issue #2476. It must be backported to all stable
versions.
2024-03-13 15:20:06 +01:00
William Lallemand
70be894e41 MINOR: debug: enable insecure fork on the command line
-dI allows enabling "insecure-fork-wanted" directly from the command line,
which is useful when you want to run ASAN with addr2line on a lot of
configuration files without editing them.
2024-03-13 11:23:14 +01:00
Aurelien DARRAGON
07b2e84bce BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread (2nd try)
While trying to reproduce another crash case involving lua filters
reported by @bgrooot on GH #2467, we found out that mixing filters loaded
from different contexts ('lua-load' vs 'lua-load-per-thread') for the same
stream isn't supported and may even cause the process to crash.

Historically, mixing lua-load and lua-load-per-threads for a stream wasn't
supported, but this changed thanks to 0913386 ("BUG/MEDIUM: hlua: streams
don't support mixing lua-load with lua-load-per-thread").

However, the above fix didn't consider the lua filters' use-case properly:
unlike lua fetches, actions or even services, lua filters don't simply
use the stream hlua context as a "temporary" hlua running context to
process some hlua code. For fetches, actions, etc., hlua executions are
processed sequentially, so we simply reuse the hlua context from the
previous action/fetch to run the next one (this avoids memory
allocations and initialization, thus increasing performance), unless
we need to run on a different hlua state-id, in which case we perform a
reset of the hlua context.

But this cannot work with filters: indeed, once registered, a filter will
last for the whole stream duration. It means that the filter will rely
on the stream hlua context from ->attach() to ->detach(). And here is the
catch: if for the same stream we register 2 lua filters from different
contexts ('lua-load' + 'lua-load-per-thread'), then we have an issue,
because the hlua stream will be re-created each time we switch between
runtime contexts, that is, each time we switch between the filters (which may
happen for each stream processing step). And since lua filters rely on the
stream hlua to carry context between filtering steps, this context will be
lost upon a switch. Given that the lua filters code was not designed with that
in mind, it would confuse the code and cause unexpected behaviors ranging
from lua errors to a crashing process.

So here we take another approach: instead of re-creating the stream hlua
context each time we switch between "global" and "per-thread" runtime
context, let's have both of them inside the stream directly, as initially
suggested by Christopher back when we talked about the original issue.

For this we leverage hlua_stream_ctx_prepare() and hlua_stream_ctx_get()
helper functions which return the proper hlua context for a given stream
and state_id combination.
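
Conceptually, the storage and accessor look like the sketch below
(simplified; the real types and member names differ):

  struct hlua { int state_id; /* ... lua state, flags, task ... */ };

  /* both runtime contexts live side by side in the stream: one slot for
   * the shared 'lua-load' state, one for the 'lua-load-per-thread' state */
  struct stream_hlua_store {
      struct hlua *ctx[2];
  };

  /* simplified accessor: state_id 0 designates the shared global state,
   * any other value designates the current thread's own state */
  static struct hlua *stream_hlua_get(struct stream_hlua_store *s, int state_id)
  {
      return s->ctx[state_id == 0 ? 0 : 1];
  }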

As for the debugging info reported after ha_panic(), we check both hlua
runtime contexts to see if one of them was active when the panic occurred
(only 1 runtime ctx per stream may be active at a given time).

This should be backported to all stable versions with 0913386
("BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread")

This commit depends on:
 - "DEBUG: lua: precisely identify if stream is stuck inside lua or not"
   [for versions < 2.9 the ha_thread_dump_one() part should be skipped]
 - "MINOR: hlua: use accessors for stream hlua ctx"

For 2.4, the filters API didn't exist. However it may be a good idea to
backport it anyway because ->set_priv()/->get_priv() from tcp/http lua
applets may also be affected by this bug, plus it will ease code
maintenance. Of course, filters-related parts should be skipped in this
case.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
aa554be69c MINOR: hlua: use accessors for stream hlua ctx
Change hlua_stream_ctx_prepare() prototype so that it now returns the
proper hlua ctx on success instead of returning a boolean.

Add hlua_stream_ctx_get() to retrieve hlua ctx out of a given stream.

This way we may easily change the storage mechanism for hlua stream in
the future without extensive code changes.

No backport needed unless a commit depends on it.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
1a2cdf64c9 DEBUG: lua: precisely identify if stream is stuck inside lua or not
When ha_panic() is called by the watchdog, we try to guess from
ha_task_dump() and ha_thread_dump_one() if the thread was stuck while
executing lua from the stream context. However we consider this to be the
case by simply checking if the stream hlua context was set, but this is
not very precise, because if the hlua context is set, it simply means
that at least one lua instruction was executed at the stream level, not
that the thread was currently executing lua when the panic occurred.

This is especially true with filters: one could simply register a lua
filter that does nothing, but this will still end up initializing the
stream hlua context for each stream. If the thread ends up being stuck
during the stream handling, then the debug dumping functions will report
that the stream was stuck while handling lua, which is not necessarily
true, and could in fact confuse us even more.

So here we take another approach: we add the BUSY flag to the hlua context.
This flag is set by hlua_ctx_resume() around the lua_resume() call. This way
we can precisely tell if the thread was handling lua when it was
interrupted, and we rely on this flag in the debug functions to check whether
the thread was effectively stuck inside lua or not while processing the stream.
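
In other words, the flag brackets the lua execution itself, roughly as in
this sketch (flag value and names are illustrative):

  #define HLUA_BUSY 0x01  /* illustrative flag value */

  struct hlua { unsigned int flags; /* ... lua state, stack, ... */ };

  /* the flag is only set while lua code is actually running, so a
   * watchdog-triggered dump can tell "lua was really executing" apart
   * from "a lua context merely exists for this stream" */
  static void ctx_resume(struct hlua *lua)
  {
      lua->flags |= HLUA_BUSY;
      /* ... lua_resume() runs the lua code here ... */
      lua->flags &= ~HLUA_BUSY;
  }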

No backport needed unless a commit depends on it.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
85d81e4d0a BUG/MINOR: hlua: fix missing lock in hlua_filter_delete()
hlua_filter_delete() calls hlua_unref() on the stream hlua stack, but
we should own the lock prior to manipulating the stack.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
ecd8f3bfd7 BUG/MINOR: hlua: missing lock in hlua_filter_new()
This is a complementary patch to 8670db7 ("BUG/MAJOR: hlua: improper lock
usage with hlua_ctx_resume()") for hlua_filter_new().

Indeed, the HLUA_E_ERRMSG case still relies on the lua stack but didn't
take the lock to do so.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
4aefffc38c BUG/MINOR: hlua: segfault when loading the same filter from different contexts
Trying to register the same lua filter from global and per-thread context
(using 'lua-load' + 'lua-load-per-thread') causes a segmentation fault in
hlua_post_init().

This is due to a simple copy-paste error, as we try to print the function
name in the error message (like we do when loading the same lua function
from different contexts) instead of the filter name.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
William Lallemand
501d9fdb86 MEDIUM: ssl: allow to change the OpenSSL security level from global section
The new "ssl-security-level" option allows one to change the OpenSSL
security level without having to change the openssl.cnf global file of
your distribution. This directives applies on every SSL_CTX context.

People sometimes change their security level directly in the ciphers
directive, however there are some cases when the security level change
is not applied in the right order (for example when applying a DH
param).

Before this patch, it was possible to work around this by using a specific
openssl.cnf file and starting haproxy this way:

    OPENSSL_CONF=./openssl.cnf ./haproxy -f bug-2468.cfg

Values for the security level can be found there:

https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_security_level.html

This was discussed in github issue #2468.
2024-03-12 17:37:11 +01:00
William Lallemand
7e9e4a8f50 MEDIUM: ssl: initialize the SSL stack explicitely
In issue #2448, users are complaining that FIPS is not working correctly
since the removal of SSL_library_init().

This was removed because SSL_library_init() is deprecated with OpenSSL
3.x and emits a warning. But the initialization was not needed anymore
because it is done at the first openssl API call.

However in some cases it is needed. SSL_library_init() is now a define
to OPENSSL_init_ssl(0, NULL). This patch adds OPENSSL_init_ssl(0, NULL)
to the init.

This could be backported to every stable branch, however let's wait
before backporting it.
2024-03-12 12:03:07 +01:00
Willy Tarreau
7223296092 BUG/MINOR: server: fix first server template not being indexed
3.0-dev1 introduced a small regression with commit b4db3be86e ("BUG/MINOR:
server: fix server_find_by_name() usage during parsing"). By changing the
way servers are indexed and moving it into the server template loop, the
first one is no longer indexed because the loop starts at low+1 since it
focuses on duplication. Let's index the first one explicitly now.

This should not be backported, unless the commit above is backported.
2024-03-12 08:23:03 +01:00
Dragan Dosen
0091692d97 BUG/MINOR: ssl: do not set the aead_tag flags in sample_conv_aes_gcm()
This was not useful and was using an uninitialized value. Introduced with
the commit 08ac28237 ("MINOR: Add aes_gcm_enc converter").

Must be backported wherever the commit 08ac28237 was backported.
2024-03-11 19:20:44 +01:00
Dragan Dosen
d7610e6dde BUG/MINOR: ssl: fix possible ctx memory leak in sample_conv_aes_gcm()
The issue was introduced with the commit c31499d74 ("MINOR: ssl: Add
aes_gcm_dec converter").

This must be backported to all stable branches where the above converter
is present, but it may need to be adjusted for older branches because of
code refactoring.
2024-03-11 19:20:31 +01:00
Brooks Davis
c03a023882 MINOR: tools: use public interface for FreeBSD get_exec_path()
Where possible (FreeBSD 13+), use the public, documented interface to
the ELF auxiliary argument vector: elf_aux_info().

__elf_aux_vector is a private interface exported so that the runtime
linker can set its value during process startup; it is not intended for
public consumption. In FreeBSD 15 it has been removed from libc and
moved to libsys.
2024-03-11 19:00:37 +01:00
Amaury Denoyelle
c499d66f37 MINOR: quic: remove qc_treat_rx_crypto_frms()
This commit removes qc_treat_rx_crypto_frms(). This function was used in
a single place inside qc_ssl_provide_all_quic_data(). Besides, its
naming was confusing as conceptually it is directly linked to quic_ssl
module instead of quic_rx.

Thus, body of qc_treat_rx_crypto_frms() is inlined directly inside
qc_ssl_provide_all_quic_data(). Also, qc_ssl_provide_quic_data() is now
only used inside quic_ssl so its scope is set to static. Overall, the API
for CRYPTO frame handling is now cleaner.
2024-03-11 14:27:51 +01:00
Amaury Denoyelle
b068e758fb MINOR: quic: simplify rescheduling for handshake
On CRYPTO frames reception, tasklet is rescheduled with TASK_HEAVY to
limit CPU consumption. This commit slightly simplifies this by regrouping
TASK_HEAVY setting and tasklet_wakeup() instructions in a single
location in qc_handle_crypto_frm(). All other unnecessary
tasklet_wakeup() are removed.
2024-03-11 14:15:36 +01:00
Willy Tarreau
6770259083 MEDIUM: mux-h2: allow to set the glitches threshold to kill a connection
Till now it was still necessary to write rules to eliminate badly behaving
H2 clients, while most of the time it would be desirable to just be able
to set a threshold on the level of anomalies on a connection.

This is what this patch does. By setting a glitches threshold for frontends
and backends, it allows a connection to be automatically switched to the error
state when the threshold is reached, so that the connection dies by itself
without having to write possibly complex rules.

One subtlety is that the error state is still exclusive to the parser's
state, so this requires the h2c_report_glitches() function to return a
status indicating whether the threshold was reached, so that processing
can instantly stop and bypass the state update; otherwise the state could
be turned back to a valid one (e.g. after parsing CONTINUATION). We should
really contemplate the possibility of using H2_CF_ERROR for this. Fortunately
there were very few places where a glitch was reported outside of an error
path so the changes are quite minor.
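
The resulting helper behaves roughly like this sketch (names and fields
are simplified):

  #include <stdbool.h>

  struct h2_conn {
      int glitches;       /* anomaly counter for this connection */
      int glitch_limit;   /* configured threshold, 0 means no limit */
  };

  /* illustrative reporting helper: it now tells the caller whether the
   * threshold was crossed so parsing can stop immediately instead of
   * overwriting the error state later */
  static bool report_glitch(struct h2_conn *h2c, int increment)
  {
      h2c->glitches += increment;
      if (h2c->glitch_limit && h2c->glitches >= h2c->glitch_limit) {
          /* ... switch the connection to the error state ... */
          return true;   /* caller must stop processing right away */
      }
      return false;
  }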

Now by setting the front value to 1000, a client flooding with short
CONTINUATION frames is instantly stopped.
2024-03-11 08:25:08 +01:00
Willy Tarreau
e6e7e1587e MINOR: mux-h2: always use h2c_report_glitch()
The function aims at centralizing counter measures, but since it only
increments the counter by one unit, it was sometimes not
used and the value was calculated directly. Let's pass the increment as an
argument so that it can be used everywhere.
2024-03-11 07:36:56 +01:00
matthias sweertvaegher
062ea3a3d4 BUILD: solaris: fix compilation errors
Compilation on Solaris fails because of the usage of names reserved on that
platform, i.e. 'queue' and 's_addr'.

This patch redefines 'queue' as '_queue' and renames 's_addr' to
'srv_addr' which fixes compilation for now.

Future plan: rename 'queue' in the code base so the define can be removed again.

Backporting: 2.9, 2.8
2024-03-09 11:24:54 +01:00
Willy Tarreau
758cb450a2 OPTIM: sink: drop the sink lock used to count drops
The sink lock was made to prevent event producers from passing while
there were other threads trying to print a "dropped" message, in order
to guarantee the absence of reordering. It has a serious impact however,
which is that all threads need to take the read lock when producing a
regular trace even when there's no reader.

This patch takes a different approach. The drop counter is shifted left
by one so that the lowest bit is used to indicate that one thread is
already taking care of trying to dump the counter. Threads only read
this value normally, and will only try to change it if it's non-null,
in which case they'll first check if they are the first ones trying to
dump it, otherwise will simply count another drop and leave. This has
a large benefit. First, it will avoid the locking that causes stalls
as soon as a slow reader is present. Second, it avoids any write on the
fast path as long as there's no drop. And it remains very lightweight
since we just need to add +2 or subtract 2*dropped in operations, while
offering the guarantee that the sink_write() has succeeded before
unlocking the counter.
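
The counter layout can be modelled as below, using plain C11 atomics
instead of haproxy's own atomic macros; this is only a sketch of the
principle:

  #include <stdatomic.h>
  #include <stdbool.h>

  /* the value holds 2 * <dropped messages>, and bit 0 means "one thread
   * is already busy trying to emit the 'dropped' message" */
  static _Atomic unsigned long dropped;

  /* a producer that had to drop one event simply adds 2 */
  static void count_drop(void)
  {
      atomic_fetch_add(&dropped, 2);
  }

  /* fast path check: returns true if this thread just became responsible
   * for emitting the "dropped N messages" warning */
  static bool try_claim_drop_report(void)
  {
      unsigned long v = atomic_load(&dropped);

      while (v && !(v & 1)) {
          /* non-null and nobody reporting yet: try to grab the low bit */
          if (atomic_compare_exchange_weak(&dropped, &v, v | 1))
              return true;
      }
      return false;   /* nothing dropped, or another thread already reports */
  }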

While a reader was previously limiting the traffic to 11k RPS under
4C/8T, now we reach 36k RPS vs 14k with no reader, so readers will no
longer slow the traffic down and will instead even speed it up due to
avoiding the contention down the chain in the ring. The locking cost
dropped from ~75% to ~60% now (it's in ring_write now).
2024-03-09 11:23:52 +01:00
Willy Tarreau
eb7b2ec83a OPTIM: sink: try to merge "dropped" messages faster
When a reader doesn't read fast enough and causes drops, subsequent
threads try to produce a "dropped" message. But it takes time to
produce and emit this message, in part due to the use of chunk_printf()
that relies on vfprintf() which has to parse the printf format, and
during this time other threads may continue to increment the counter.
This is the reason why this is currently performed in a loop. When
reading what is received, it's common to see a large count followed
by one or two single-digit counts, indicating that we could possibly
have improved that by writing faster.

Let's improve the situation a little bit. First we're now using a
static message prefixed with enough space to write the digits, and a
call to ultoa_r() fills these digits from right to left so that we
don't have to process a format string nor perform a copy of the message.

Second, we now re-check the counter immediately after having prepared
the message so that we still get an opportunity for updating it. In
order to avoid too long loops, this is limited to 10 iterations.
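
The formatting trick can be sketched as follows (a simplified stand-in
for the static message and the ultoa_r() call used by the patch):

  #include <stddef.h>

  /* static message with room reserved on the left for the digits; the
   * counter is written right to left just before the text, so there is no
   * printf format parsing and no copy on this path */
  static char drop_msg[] = "XXXXXXXXXXXXXXXXXXXX events dropped\n";

  static char *format_drop_msg(unsigned long count, size_t *len)
  {
      char *end = drop_msg + sizeof(drop_msg) - 1;
      char *p = drop_msg + 20;          /* first char after the digit area */

      do {
          *--p = '0' + count % 10;      /* fill digits from right to left */
          count /= 10;
      } while (count);

      *len = (size_t)(end - p);
      return p;                         /* message starts at the first digit */
  }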

Tests show that the number of single-digit "dropped" counters on output
now dropped roughly by 15-30%. Also, it was observed that with 8 threads,
there's almost never more than one retry.
2024-03-09 11:23:52 +01:00
Amaury Denoyelle
1ee7bf5bd9 MINOR: quic: always use ncbuf for rx CRYPTO
The previous patch fixed the handling of in-order CRYPTO frames, which
requires the use of a new buffer for these data as their handling is
delayed to run under TASK_HEAVY.

In fact, as all CRYPTO frames handling must now be delayed, their
handling can be unified. This is the purpose of this commit, which removes
the just-introduced new buffer. Now, all CRYPTO frames are buffered
inside the ncbuf. Unused elements such as the crypto_frms member of the
encryption level are also removed.

This commit is not a bugfix but is a direct follow-up to the last one.
As such, it can probably be backported with it to 2.9 to reduce code
differences between these versions.
2024-03-08 17:22:48 +01:00
Amaury Denoyelle
81f118cec0 BUG/MEDIUM: quic: fix handshake freeze under high traffic
QUIC relies on SSL_do_handshake() to be able to validate the handshake. As
this function is computation heavy, it has, since 2.9, been called only under
TASK_HEAVY. This has been implemented by the following patch:
  94d20be138
  MEDIUM: quic: Heavy task mode during handshake

Instead of handling CRYPTO frames immediately during reception, this
patch delays the process to run under TASK_HEAVY tasklet. A frame copy
is stored in qel.rx.crypto_frms list. However, this frame still
references the receive buffer. If the receive buffer is cleared before
the tasklet is rescheduled, it will point to garbage data, resulting in
haproxy decryption error. This happens if a fair amount of data is
received constantly to preempt the quic_conn tasklet execution.

This bug can be reproduced with a fair number of clients. It is
exhibited by 'show quic full', which can report connections blocked on
handshake. Using the following command results in h2load not being able to
complete the last connections.

$ h2load --alpn-list h3 -t 8 -c 800 -m 10 -w 10 -n 8000 "https://127.0.0.1:20443/?s=10k"

Also, the haproxy QUIC listener socket mode was active to trigger the issue.
This forces several connections to share the same reception buffer,
making the bug even more likely to occur. It should be possible to
reproduce it with connection socket mode by increasing the number of clients.

To fix this bug, define a new buffer under quic_cstream. It is used
exclusively to copy CRYPTO data for in-order frame if ncbuf is empty.
This ensures data remains accessible even if receive buffer is cleared.

Note that this fix is only a temporary step. Indeed, a ncbuf is also
already used for out-of-order data. It should be possible to unify its
usage for both in and out-of-order data, rendering this new buffer
instance unnecessary. In this case, several unneeded elements will
become obsolete such as qel.rx.crypto_frms list. This will be done in a
future refactoring patch.

This must be backported up to 2.9.
2024-03-08 17:22:48 +01:00
Nenad Merdanovic
08ac282375 MINOR: Add aes_gcm_enc converter
The converter can be used to encrypt the raw byte input using the
AES-GCM algorithm, using the provided nonce and key.

Co-authored-by: Dragan Dosen (ddosen@haproxy.com)
2024-03-08 17:20:43 +01:00
Nenad Merdanovic
e225e04ba7 MINOR: vars: export var_set and var_unset functions
Co-authored-by: Dragan Dosen <ddosen@haproxy.com>
2024-03-08 17:20:43 +01:00
Aurelien DARRAGON
cf37e4cc1b BUG/MINOR: cfgparse: report proper location for log-format-sd errors
When a parsing error occurs inside a log-format-sd expression, we report
the location of the log-format directive (which may not be set) instead
of reporting the proper log-format-sd directive location where the parsing
error occurred.

 1|listen test
 2|  log-format "%B"      # no error
 3|  log-format-sd "%bad" # error

 | [ALERT]    (322261) : config : Parsing [empty.conf:2]: failed to parse log-format-sd : no such format variable 'bad'. If you wanted to emit the '%' character verbatim, you need to use '%%'.

The fix consists in using the config hints dedicated to log-format-sd
directive instead of the log-format one.

The bug was introduced in 8a4e4420 ("MEDIUM: log-format: Use standard
HAProxy log system to report errors").

This should be backported to every stable versions.
2024-03-07 11:48:17 +01:00
Aurelien DARRAGON
59f08f65fd CLEANUP: tree-wide: use proper ERR_* return values for PRE_CHECK fcts
httpclient_precheck(), ssl_ocsp_update_precheck(), and
resolvers_create_default() functions are registered through
REGISTER_PRE_CHECK() macro to be called by haproxy during init from the
pre_check_list list. When calling functions registered in pre_check_list,
haproxy expects ERR_* return values. However those 3 functions currently
use raw return values, so we better use explicit ERR_* macros to prevent
breakage in the future if ERR_* values mapping were to change.
2024-03-07 11:48:08 +01:00
Aurelien DARRAGON
2df7e077c7 CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list()
Since 833cc794 ("MEDIUM: sample: handle comma-delimited converter list")
logformat expressions now support having a comma-delimited converter list
right after the fetch. Let's remove a leftover comment from the initial
implementation that says otherwise.
2024-03-07 11:47:56 +01:00
Amaury Denoyelle
b0dd4810e7 BUG/MINOR: mux-quic: fix crash on aborting uni remote stream
A remote unidirectional stream can be aborted prematurely if application
layers cannot identify its type. In this case, a STOP_SENDING frame is
emitted.

Since QUIC MUX refactoring, a crash would occur in this scenario due to
2 specific characteristics of remote uni streams :
* qcs.tx.fctl was not initialized completely. This causes a crash due to
  a BUG_ON() statement inside qcs_destroy().
* qcs.stream is never allocated. This caused qcs_prep_bytes() to crash
  inside qcc_io_send().

This bug is considered minor as it happens only on very specific QUIC
clients. It was detected when using s2n-quic over interop.

This does not need to be backported.
2024-03-06 10:41:01 +01:00
Amaury Denoyelle
d8f1ff8648 BUG/MEDIUM: quic: fix connection freeze on post handshake
After handshake completion, the QUIC server is responsible for emitting the
HANDSHAKE_DONE frame. Some clients wait for it to begin STREAM
transfers.

Previously, there was no explicit tasklet_wakeup() after handshake
completion, which is necessary to emit post-handshake frames. In most
cases, this went undetected as most clients continue emission, which
reschedules the tasklet. However, as there is no tasklet_wakeup(), this
is not a consistent behavior. If this bug occurs, it causes a connection
freeze, preventing the client from emitting any request. The connection is
finally closed on idle timeout.

To fix this, add an explicit tasklet_wakeup() after handshake
completion. It sounds simple enough but in fact it's difficult to find
the correct location for the tasklet_wakeup() invocation, as post-handshake
is directly linked to connection accept, with different orderings.
Notably, if 0-RTT is used, the connection can be accepted prior to handshake
completion. Another major point is that along with the HANDSHAKE_DONE frame, a
series of NEW_CONNECTION_ID frames is emitted. However, these new CID
allocations must occur after the connection is migrated to its new thread as
these CIDs are tied to it. A BUG_ON() is present to check this in
qc_set_tid_affinity().

With all this in mind, 2 locations were selected for the necessary
tasklet_wakeup():
* on qc_xprt_start(): this is useful for the standard case without 0-RTT.
  This ensures that this is done only after connection thread migration.
* on qc_ssl_provide_all_quic_data(): this is done on handshake
  completion when 0-RTT is used. In this case only, the connection is already
  accepted and migrated, so tasklet_wakeup() is safe.

Note that as a side-change, quic_accept_push_qc() API has evolved to
better reflect differences between standard and 0-RTT usages. It is now
forbidden to call it multiple times on a single quic_conn instance. A
BUG_ON() has been added.

This issue is labelled as medium even though it seems pretty rare. It
was only reproducible using the QUIC interop runner, with haproxy compiled
with LibreSSL and quic-go as the client. However, the affected code parts are
pretty sensitive, which justifies the chosen severity.

This should fix github issue #2418.

It should be backported up to 2.6, after a brief period of observation.
Note that the extra comment added in qc_set_tid_affinity() can be
removed in 2.6 as thread migration is not implemented for this version.
Other parts should apply without conflict.
2024-03-06 10:39:57 +01:00
William Lallemand
3a3c2b2695 BUG/MINOR: ssl/cli: typo in new ssl crl-file CLI description
The `new ssl crl-file` option description on the CLI lacks the dash.

Must be backported as far as 2.6.
2024-03-05 14:49:17 +01:00
Ilya Shipitsin
96cd04f8db CLEANUP: fix typo in naming for variable "unused"
In resolvers.c:rslv_promex_next_ts() and in
stick-tables.c:stk_promex_next_ts(), an unused argument was mistakenly
called "unsued" instead of "unused". Let's fix this in a separate patch
so that it can be omitted from backports if this causes build problems.
2024-03-05 11:50:34 +01:00
Ilya Shipitsin
da3b12ade1 CLEANUP: assorted typo fixes in the code and comments
This is the 39th iteration of typo fixes.
The naming issue on the argument called "unsued" instead of "unused"
in two functions from resolvers and stick-tables was put into a second
patch so that it can be omitted if it were to cause backport issues.
2024-03-05 11:50:34 +01:00
Willy Tarreau
962c129dc1 BUG/MINOR: sink: fix a race condition in the TCP log forwarding code
That's exactly the same as commit 53bfab080c ("BUG/MINOR: sink: fix a race
condition between the writer and the reader") that went into 2.7 and was
backported as far as 2.4, except that since the code was duplicated, the
second instance was not noticed, leaving the race present. The race has
a limited impact, if a forwarder reaches the end of the logs and a new
message arrives before it leaves, the forwarder will only wake up after
yet another new message will be sent. In practice it remains unnoticeable
because for the race to trigger, one needs to have a steady flow of logs,
which means the wakeup will happen anyway.

This should be backported, but no need to insist on it if it resists.
2024-03-05 11:48:44 +01:00
Aurelien DARRAGON
75c8a1bc2d CLEANUP: hlua: txn class functions may LJMP
Clarify that some txn related class functions may LJMP by adding the
__LJMP tag to their prototype.
2024-03-04 16:48:51 +01:00
Aurelien DARRAGON
f364f4670b MINOR: hlua: use SEND_ERR to report errors in hlua_event_runner()
Instead of reporting lua errors using ha_alert(), let's use SEND_ERR()
helper which will also try to generate a log message according to lua
log settings.
2024-03-04 16:48:48 +01:00
Aurelien DARRAGON
e1b0031650 BUG/MINOR: hlua: don't call ha_alert() in hlua_event_subscribe()
hlua_event_subscribe() is meant to be called from a protected lua env
during init and/or runtime. As such, only hlua_event_sub() makes use
of it: when an error happens hlua_event_sub() will already raise a Lua
exception. Thus it's not relevant to use ha_alert() there as it could
generate log pollution (the error is relevant from the Lua script's point of
view, not from haproxy's).

This could be backported in 2.8.
2024-03-04 16:48:42 +01:00
Aurelien DARRAGON
8670db7a89 BUG/MAJOR: hlua: improper lock usage with hlua_ctx_resume()
hlua_ctx_resume() itself can safely be used as-is in a multithreading
context because it takes care of taking the lua lock.

However, when hlua_ctx_resume() returns, the lock is released and it is
thus the caller's responsibility to ensure it owns the lock prior to
performing additional manipulations on the Lua stack. Unfortunately, since
early haproxy lua implementation, we used to do it wrong:

The most common hlua_ctx_resume() pattern we can find in the code (because
it was duplicated over and over over time) is the following:

  |ret = hlua_ctx_resume()
  |switch (ret) {
  |        case HLUA_E_OK:
  |        break;
  |        case HLUA_E_ERRMSG:
  |        break;
  |        [...]
  |}

The problem is that for some of the switch cases, we still perform lua stack
manipulations. This is the case for HLUA_E_ERRMSG for instance, where
we often use lua_tostring() to retrieve the last lua error message at the top
of the stack, or sometimes for the HLUA_E_OK case, when we need to perform
some lua cleanup logic once the resume ended. But all of this is done
WITHOUT the lua lock, so this means that the main lua stack could be
accessed simultaneously by concurrent threads when a script was loaded
using 'lua-load'.

While it is not critical for switch-cases dedicated to error handling
(those are not supposed to happen very often), it can be very problematic
for stack manipulations occurring in the HLUA_E_OK case under heavy load
for instance. In this case, main lua stack corruptions will eventually
happen. This is especially true inside hlua_filter_new(), where this bug
was known to cause lua stack corruptions under load, leading to lua errors
and even crashing the process as reported by @bgrooot in GH #2467.

The fix is relatively simple: once hlua_ctx_resume() returns, we should
consider that ANY lua stack access should be lua-lock protected. If the
related lua calls may raise lua errors, then the (RE)SET_SAFE_LJMP
combination should be used as usual (it locks the lua stack and
catches lua exceptions at the same time), else hlua_{lock,unlock} may be
used if no exceptions are expected.
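
Schematically, the corrected pattern becomes (using the locking helpers
named above; details simplified):

  |ret = hlua_ctx_resume()
  |switch (ret) {
  |        case HLUA_E_OK:
  |        hlua_lock(hlua);      /* cleanup is a stack access: lock it */
  |        /* ... lua_settop(), etc ... */
  |        hlua_unlock(hlua);
  |        break;
  |        case HLUA_E_ERRMSG:
  |        hlua_lock(hlua);      /* reading the error message too */
  |        /* ... lua_tostring(hlua->T, -1) ... */
  |        hlua_unlock(hlua);
  |        break;
  |        [...]
  |}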

This patch should fix GH #2467.

It should be backported to all stable versions.

[ada: some ctx adj will be required for older versions as event_hdl
 doesn't exist prior to 2.8 and filters were implemented in 2.5, thus
 some chunks won't apply]
2024-03-04 16:48:31 +01:00
Aurelien DARRAGON
19b016f9f8 BUG/MEDIUM: hlua: improper lock usage with SET_SAFE_LJMP()
When we want to perform some unsafe lua stack manipulations from an
unprotected lua environment, we use SET_SAFE_LJMP() RESET_SAFE_LJMP()
combination to lock lua stack and catch potential lua exceptions that
may occur between the two.

Hence, we regularly find this pattern (duplicated over and over):

  |if (!SET_SAFE_LJMP(hlua)) {
  |        const char *error;
  |
  |        if (lua_type(hlua->T, -1) == LUA_TSTRING)
  |                error = hlua_tostring_safe(hlua->T, -1);
  |         else
  |                error = "critical error";
  |        SEND_ERR(NULL, "*: %s.\n", error);
  |}

This is wrong because when SET_SAFE_LJMP() returns false (meaning that an
exception was caught), then the lua lock was released already, thus the
caller is not expected to perform lua stack manipulations (because the
main lua stack may be shared between multiple threads). In the pattern
above we only want to retrieve the lua exception message which may be
found at the top of the stack, to do so we now explicitly take the lua
lock before accessing the lua stack. Note that hlua_lock() doesn't catch
lua exceptions so only safe lua functions are expected to be used there
(lua functions that may NOT raise exceptions).
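
With the fix, the pattern above schematically becomes (simplified):

  |if (!SET_SAFE_LJMP(hlua)) {
  |        const char *error;
  |
  |        hlua_lock(hlua);      /* the lock was already released: retake it */
  |        if (lua_type(hlua->T, -1) == LUA_TSTRING)
  |                error = hlua_tostring_safe(hlua->T, -1);
  |        else
  |                error = "critical error";
  |        hlua_unlock(hlua);
  |        SEND_ERR(NULL, "*: %s.\n", error);
  |}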

It should be backported to every stable versions.

[ada: some ctx adj will be required for older versions as event_hdl
 doesn't exist prior to 2.8 and filters were implemented in 2.5, thus
 some chunks won't apply, but other fixes should stay relevant]
2024-03-04 16:47:20 +01:00
Aurelien DARRAGON
d81c2205a3 BUG/MINOR: hlua: improper lock usage in hlua_filter_new()
In hlua_filter_new(), after each hlua resume, we systematically try to
empty the stack by calling lua_settop(). However we're doing this without
locking the lua context, so it is unsafe in multithreading context if the
script is loaded using 'lua-load'. To fix the issue, we protect the call
with hlua_{lock,unlock}() helpers.

This should be backported up to 2.6.
2024-03-04 16:47:18 +01:00
Aurelien DARRAGON
51f291c795 BUG/MINOR: hlua: improper lock usage in hlua_filter_callback()
In hlua_filter_callback(), some lua stack work is performed under
SET_SAFE_LJMP() guard which also takes care of locking the hlua context
when needed. However, a lua_gettop() call is performed out of the guard,
thus it is unsafe in multithreading context if the script is loaded using
'lua-load' because in this case the main lua stack is shared between
threads and each access to a lua stack must be performed under the lock,
thus we move lua_gettop() call under the lock.

It should be backported up to 2.6.
2024-03-04 16:47:17 +01:00
Aurelien DARRAGON
9578524091 BUG/MINOR: hlua: fix possible crash in hlua_filter_new() under load
hlua_filter_new() handles memory allocation errors by jumping to the
"end:" cleanup label in case of errors. Such errors may happen when the
system is heavily loaded for instance.

In hlua_filter_new(), we try to allocate two hlua contexts in a row before
checking if one of them failed (in which case we jump to the cleanup part
of the function), and only then we initialize them both.

If a memory allocation failure happens for only one out of the two
flt_ctx->hlua[] contexts pair, we still jump to the cleanup part.
It means that the hlua context that was successfully allocated and wasn't
initialized yet will be passed to hlua_ctx_destroy(), resulting in invalid
reads in the cleanup function, which may ultimately cause the process to
crash.

To fix the issue: we make sure flt_ctx hlua contexts are initialized right
after they are allocated, that is before any error handling condition that
may force the cleanup.

This bug was discovered when trying to reproduce GH #2467 with haproxy
started with "-dMfail" argument.

It should be backported up to 2.6.
2024-03-04 16:47:03 +01:00
Aurelien DARRAGON
369bfa0b50 BUG/MINOR: hlua: don't use lua_tostring() from unprotected contexts
As per lua documentation, lua_tostring() may raise a memory error.
However, we're often using it to fetch the error message at the top of
the stack (ie: after a failing lua call) from unprotected environments.
In practice, lua_tostring() rarely fails, but still, if
it happens, it could crash the process and we had better not
risk it.

So here, we add hlua_tostring_safe() function, which works exactly as
lua_tostring(), but the function cannot LJMP as it will catch
lua_tostring() exceptions to return NULL instead.

Everywhere lua_tostring() was used to retrieve error string from such
unprotected contexts, we now rely on hlua_tostring_safe().

This should be backported to all stable versions.

[ada: ctx adj will be required, for versions prior to 2.8 event_hdl
 API didn't exist so some chunks won't apply, and prior to 2.5 filters
 API didn't exist either, so again, some chunks should be ignored]
2024-03-04 16:46:55 +01:00
Aurelien DARRAGON
5508db9a20 BUG/MINOR: hlua: fix unsafe lua_tostring() usage with empty stack
Lua documentation says that lua_tostring() returns a pointer that remains
valid as long as the object is not removed from the stack.

However there are some places where we use the returned string AFTER the
corresponding object is removed from the stack. In practice this doesn't
seem to cause visible bugs (probably because the pointer remains valid
waiting for a GC cycle), but let's fix that to comply with the
documentation and avoid undefined behavior.

It should be backported in all stable versions.
2024-03-04 16:46:53 +01:00
Willy Tarreau
7151076522 BUG/MINOR: tools: seed the statistical PRNG slightly better
Thomas Baroux reported a very interesting issue. "balance random" would
systematically assign the same server first upon restart. That comes from
its use of statistical_prng() which is only seeded with the thread number,
and since at low loads threads are assigned to incoming connections in
round robin order, practically speaking, the same thread always gets the
same request and will produce the same random number.

We already have a much better RNG that's also way more expensive, but we
can use it at boot time to seed the PRNG instead of using the thread ID
only.
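
The idea boils down to something like the sketch below; the strong random
value would come from the better RNG mentioned above, and the names and
state layout here are illustrative:

  #include <stdint.h>

  /* per-thread state of the cheap statistical PRNG (xorshift-style) */
  static _Thread_local uint64_t prng_state;

  /* illustrative boot-time seeding: mix a value coming from the strong RNG
   * with the thread id, instead of using the thread id alone, so that two
   * runs do not start from the same sequence */
  static void seed_statistical_prng(int thread_id, uint64_t strong_random)
  {
      prng_state = strong_random ^ (uint64_t)(thread_id + 1);
      if (!prng_state)
          prng_state = 1;    /* an xorshift state must never be all-zero */
  }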

This needs to be backported to 2.4.
2024-03-01 16:25:39 +01:00
Christopher Faulet
31ec9f18bb MINOR: hlua: Be able to disable logging from lua
Add the core.silent (-1) value to be able to disable logging via the
TXN:set_loglevel() call. Otherwise, there is no way to do so, and it may be
handy. This special value cannot be used with the TXN:log() function.

This patch may be backported if necessary.
2024-03-01 15:01:18 +01:00
Christopher Faulet
75fb0afde4 BUG/MINOR: hlua: Fix log level to the right value when set via TXN:set_loglevel
When the log level is changed in lua by calling the TXN:set_loglevel function,
it must be incremented by one because it is decremented in the strm_log()
function.

This patch must be backported to all stable versions.
2024-03-01 15:01:18 +01:00
Christopher Faulet
573ed242e3 BUG/MINOR: config/quic: Alert about PROXY protocol use on a QUIC listener
The PROXY protocol is not supported on QUIC for now. Thus return an error during
configuration parsing if the 'accept-proxy' option is used for a QUIC listener.

This patch should fix issue #2186. It should be backported as far as 2.6.
2024-03-01 15:01:18 +01:00
Christopher Faulet
69f15b9a40 CLEANUP: mux-h2: Fix h2s_make_data() comment about the return value
Two return values are specified in the h2s_make_data() function comment. Both
are more or less equivalent but the latter is probably more accurate. So,
keep the right one and remove the other one.

This patch should fix the issue #2175.
2024-02-29 13:57:44 +01:00
Amaury Denoyelle
f913d42aaf MINOR: quic: add MUX output for show quic
Extend "show quic" to be able to dump MUX related information. This is
done via the new function qcc_show_quic(). This replaces the old streams
dumping list which was incomplete.

This info is displayed on full output or by specifying the "mux" field.
2024-02-29 10:03:36 +01:00
Amaury Denoyelle
dda3a0d8fc MINOR: quic: specify show quic output fields
Add the possibility to customize the "show quic" full output with only a
specific set of printed fields. This is specified as a comma-separated
list. Here are the currently supported values:
* tp: transport parameters
* sock: connection addresses and socket FD
* pktns: packet number space with ack ranges and in flight bytes
* cc: congestion controller and loss information

Note that the streams output is not filtered by this mechanism. This is
because it will soon be replaced by an output generated from the MUX, which
will use its own field name.
2024-02-29 10:03:36 +01:00
Amaury Denoyelle
c4f5ff8369 MINOR: quic: filter show quic by address
Add the possibility to restrict the show quic output to only a single
connection. This is done by specifying a quic_conn address pointer.

Default format selection has evolved with it. Indeed, it seems more
fitting to use the full format by default when filtering on a connection.
However, it's still possible to revert to the original oneline format
with it by specifying it explicitly.
2024-02-29 10:03:33 +01:00
Christopher Faulet
60fcc27577 MEDIUM: htx/http-ana: No longer close connection on early HAProxy response
When a response was returned by HAProxy, a dedicated HTX flag was
set. Thanks to this flag, it was possible to add a "connection: close"
header to the response if the request was not fully received and to close
the connection. In the same way, when a redirect rule was applied,
keep-alive was forcefully disabled for unfinished requests.

All these mechanisms are now useless because the H1 mux is able to drain the
request. So the HTX_FL_PROXY_RESP flag is removed and no special processing is
performed on the HAProxy response when the request is unfinished.
2024-02-28 16:02:33 +01:00
Christopher Faulet
077906da14 MAJOR: mux-h1: Drain requests on client side before shut a stream down
Unlike H2 and H3, there is no mechanism in H1 to notify the client that it
must stop uploading data when a response is sent before the end of the
request, other than closing the connection. There is no RST_STREAM frame
equivalent.

Thus, there are only two ways to deal with this situation: closing the
connection or draining the request. Until now, HAProxy didn't support
draining H1 messages. Closing the connection in this case however has a
major drawback: it leads to sending a TCP reset, thereby dropping all
in-flight data. There is no guarantee the client has fully received the
response.

Draining H1 messages was never implemented because in old versions it was a
bit tricky to implement. However, it is now far simpler to support this
feature because it is possible to have an H1 stream without any applicative
stream. That is the purpose of this patch. Now, when a shutdown is requested
and the stream is detached from the connection, if the request is unfinished
while the response was fully sent, the request is drained.

To do so, in this case the shutdown and the detach are delayed. From the
upper layer point of view, there are no changes. The endpoint is shut down
and detached as usual. But from the H1 mux point of view, the H1 stream is
still alive and able to drain data. However the stream-endpoint descriptor
is orphaned. Once the request is fully received (and drained), the
connection is shut down if it cannot be reused for a new transaction and the
H1 stream is destroyed.
2024-02-28 16:02:33 +01:00
Christopher Faulet
14db433db9 MINOR: mux-h1: Move all stuff to detach a stream in an internal function
All code from the h1_detach() function was moved into an internal function,
h1s_finish_detach(). It will be used to defer the detach and be able to
drain the request payload.
2024-02-28 15:31:07 +01:00
Christopher Faulet
7ae280d091 MINOR: mux-h1: Move checks performed before a shutdown in a dedicated function
Checks performed in h1_shutw() to determine if the connection must be shut
down now or not were moved into a dedicated function. This will be used to
be able to drain the request payload.
2024-02-28 15:31:07 +01:00
Christopher Faulet
81f75d32b2 BUG/MINOR: mux-h1: Properly report when mux is blocked during a nego
During a zero-copy forwarding negotiation, if the H1 mux is blocked for any
reason, the IOBUF_FL_FF_BLOCKED flag must be set on its iobuf to notify the
producer it must wait. However, there were two places where it was not
performed: when the output buffer allocation failed and when the chunk
formatting failed.

This patch fixes the issue. It must be backported to 2.9.
2024-02-28 15:31:07 +01:00
Christopher Faulet
489e583ac5 BUG/MEDIUM: mux-h1: Fix again 0-copy forwarding of chunks with an unknown size
There is still an issue with zero-copy forwarding of chunks with an unknown
size. It is possible for a producer to fill the space reserved for the CRLF
at the end of the chunk. The root cause is that this space is not accounted
for in the iobuf offset. So, from the producer's point of view, the space may
be used. We can also argue the current design for iobuf is not well suited
for this case. Instead of using a pointer on the consumer's buffer, it could
be easier to use a custom buffer built on top of the consumer's one, via a
call to b_make(), with the size, head and data fields reflecting the
available space the producer can use.

By the way, because of this bug, it is possible to trigger a BUG_ON() when
we try to write the CRLF at the end of the chunk because the buffer is
full. It is unexpected. Only the stats applet may hit this bug.

To fix the issue, instead of writing this CRLF when the current chunk is
consumed, it is written before consuming the next one. This way, all space
reserved for the chunk formatting is always placed before forwarding
data.

No backport needed.
2024-02-28 15:31:07 +01:00
Aurelien DARRAGON
60edfabc7b LICENSE: http_ext: fix GPL license version
This is a followup of the previous commit: GH user @songliumeng initially
reported an issue with the GPL license version for event_hdl source file
which was fixed by the previous commit. It turns out the same mistake was
made in http_ext source file: due to a mixup between LGPL and GPL, GPL
version '2.1' was referenced instead of '2'.

Again, clarify that this is indeed GPL by making use of the banner
provided in doc/gpl.txt

This should be backported in 2.8 with b2bb925 ("MINOR: proxy/http_ext:
introduce proxy forwarded option")
2024-02-28 15:13:35 +01:00
Aurelien DARRAGON
e9261fff29 LICENSE: event_hdl: fix GPL license version
As spotted by user @songliumeng in GH #2463, there was a mixup between
LGPL and GPL in event_hdl source file: GPL version '2.1' was referenced
instead of '2'. Clarify that this is indeed GPL by making use of the
banner provided in doc/gpl.txt.

This should be backported in 2.8 with 68e692d ("MINOR: event_hdl: add
event handler base api")
2024-02-28 15:13:27 +01:00
William Lallemand
bb7af8b2f1 BUG/MINOR: ssl/cli: duplicate cleaning code in cli_parse_del_crtlist
Since 23cab33 ("BUG/MINOR: ssl: Clear the ckch instance when deleting a
crt-list line"), LIST_DELETE is done twice, one time in
cli_parse_del_crtlist() and another time in ckch_inst_free().

It could trigger a crash with -DDEBUG_LIST.

This isn't a major problem since the ptr is not freed in the meantime so
it will only trigger with the debug.

This patch removes the LIST_DELETE as well as the loop done on link_ref
which is also done in ckch_inst_free().

Could be backported as far as 2.4. 2.4 version does not have a link_ref
loop.
2024-02-27 18:10:43 +01:00
Amaury Denoyelle
8a31783b64 BUG/MEDIUM: server: fix dynamic servers initial settings
Contrary to static servers, dynamic servers do not initialize their
settings from a default server instance. As such, _srv_parse_init() was
responsible for setting a set of minimal values to get a correct behavior.

However, some settings were not properly initialized. This caused
dynamic servers to not behave as static ones without explicit
parameters.

Currently, the main issue detected is connection reuse which was
completely impossible. This is due to incorrect pool_purge_delay and
max_reuse settings incompatible with srv_add_to_idle_list().

To fix the connection reuse, but also more generally to ensure dynamic
servers are aligned with other server instances, define a new function
srv_settings_init(). This is used to set initial values for both default
servers and dynamic servers. For static servers, srv_settings_cpy() is
kept instead, using their default server as reference.

This patch could have unexpected effects on dynamic servers behavior as
it restored proper initial settings. Previously, they were set to 0 via
calloc() invocation from new_server().

This should be backported up to 2.6, after a brief period of
observation.
2024-02-27 17:02:20 +01:00
William Lallemand
4895fdac5a BUG/MAJOR: ssl/ocsp: crash with ocsp when old process exit or using ocsp CLI
This patch reverts 2 fixes that were made in an attempt to fix the
ocsp-update feature used with the 'commit ssl cert' command.

The patches crash the worker when doing a soft-stop when the 'set ssl
ocsp-response' command was used, or during runtime if the ocsp-update
was used.

This was reported in issue #2462 and #2442.

The last patch reverted is the associated reg-test.

Revert "BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing"
This reverts commit 5e66bf26ec.

Revert "BUG/MEDIUM: ocsp: Separate refcount per instance and per store"
This reverts commit 04b77f84d1b52185fc64735d7d81137479d68b00.

Revert "REGTESTS: ssl: Add OCSP related tests"
This reverts commit acd1b85d3442fc58164bd0fb96e72f3d4b501d15.
2024-02-26 18:04:25 +01:00
Christopher Faulet
19559d4447 BUG/MEDIUM: applet: Fix HTX .rcv_buf callback function to release outbuf buffer
In appctx_htx_rcv_buf(), HTX blocks found in the appctx output buffer are
copied into the channel buffer. At the end, the state of the underlying
buffer must be updated. If everything was copied, the buffer is reset. This
way, it will be released later, at the end of the applet process function.

However, here there was a typo. We did it on the input buffer instead of the
output buffer. As a side effect, an empty HTX message remained stuck in the
appctx output buffer, blocking the applet and leading to a blocked session
with no expiration date.

No backport needed.
2024-02-26 16:40:13 +01:00
Miroslav Zagorac
3f771f5118 MINOR: ssl: Call callback function after loading SSL CRL data
Due to the possibility of calling a control process after adding CRLs, the
ssl_commit_crlfile_cb variable was added.  It is actually a pointer to the
callback function, which is called if defined after initial loading of CRL
data from disk and after committing CRL data via CLI command
'commit ssl crl-file ..'.

If the callback function returns an error, then the CLI commit operation
is terminated.

Also, one case was added to the CLI context used by "commit cafile" and
"commit crlfile": CACRL_ST_CRLCB in which the callback function is called.

Signed-off-by: William Lallemand <wlallemand@haproxy.com>
2024-02-23 18:12:27 +01:00
Amaury Denoyelle
ba9f905da9 BUG/MINOR: quic: fix output of show quic
Output of 'show quic' is messed up since the introduction of reordered
packets counter in the following commit. The new counter is mixed up
with the first stream line. This is due to the wrong placement of the
newline delimiter.

  167e38e0e0
  MINOR: quic: Add a counter for reordered packets

This should be backported up to 2.6.
2024-02-23 17:32:24 +01:00
Christopher Faulet
3d93ecc132 BUG/MAJOR: cli: Restore non-interactive mode behavior with pipelined commands
The issue was described in commit "BUG/MEDIUM: cli: Warn if pipelined commands
are delimited by a \n". In non-interactive mode, it was possible to use a
newline character as delimiter for pipelined commands. As a consequence, it was
possible to stop command processing in the middle.

With the above commit, a warning is emitted to notify users. With this one,
we restore the expected behavior, as documented in the management guide.
Only the first line of commands is parsed. This commit will not be
backported to avoid breaking changes on stable versions.

This commit has of course some visible effects. All scripts using a newline
character as a delimiter to pipeline commands in non-interactive mode will
stop working. Only the first command will be evaluated, all others will be
ignored. Pipelined commands MUST now be separated by a semi-colon.

But there is a more subtle and probably more annoying change. It is no
longer possible to pipeline commands with a payload! A command with a
payload will always be the last one evaluated because it must be finished by
a newline (possibly preceded by a custom pattern).

It is really annoying to introduce such a breaking change. But, in the long
term, it is mandatory. 2.8 will be the last LTS version supporting the
old behavior (with some warning however). This gives users 4 years to
adapt their scripts.

No backport needed.
2024-02-23 15:19:49 +01:00
Christopher Faulet
598c7f164c BUG/MEDIUM: cli: Warn if pipelined commands are delimited by a \n
This was broken since commit 0011c25144 ("BUG/MINOR: cli: avoid O(bufsize)
parsing cost on pipelined commands"). It is not really a bug fix but it is
labelled as such to make it more visible.

Before, a full line was first retrieved from the request buffer before
extracting the first command to evaluate it. Now, only one command is
retrieved. But we rely on the request buffer state to interrupt processing in
non-interactive mode. After processing a command, if the output side of the
request buffer is empty, we leave. Before the above commit, this was not a
problem. But since then, it is obviously a bad assumption. First because some
input data may still be there. It is not true today, but it might change.
Then, there is no guarantee that all commands are received at the same time.
For a small list of commands, it will be the case most of the time, but it is
a dangerous assumption. For a long list of commands, it is almost always
false.

For this to be an issue, the input must be chunked exactly between two
commands. But in this case, the remaining commands are skipped. A good way to
reproduce the issue is to wait a bit between two commands, for instance:

    (printf "show info;"; sleep 2; printf "show stat\n") | socat ...

In fact, to properly fix the issue, we should exit on the first command
finished by a newline. Indeed, as stated in the documentation, in
non-interactive mode, a single line is processed. To pipeline commands,
commands must be separated by a semi-colon. Unfortunately, the above commit
introduced another change. It is possible to pipeline commands delimited by
a newline. It was pushed 2 years ago and backported to all stable versions.
Several scripts may rely on this behavior.

So, on stable versions, the bug will not be fixed. However a warning will be
emitted to notify users their scripts don't respect the documentation and
they must adapt them, mainly because the CLI behavior on this point will be
changed in 3.0 to stick to the doc. This warning will only be emitted once
over the whole worker process life. The idea is to not flood the logs with
the same warning for every offending command.

This commit should probably be backported to all stable versions, but with
some caution because the CLI was often modified.
2024-02-23 15:19:49 +01:00
Christopher Faulet
e018e8a419 MINOR: cli: Remove useless loop on commands to find unescaped semi-colon
This loop was added to detect pipelined commands when only co_getline() was
used to get commands. Now, co_getdelim() is used and the semi-colon is also
considered as a command delimiter.

As a side effect, the last semi-colon, if any, is no longer replaced by a
newline. Thus, we must take care to adapt the test to detect partial
commands.
2024-02-23 15:19:49 +01:00
Amaury Denoyelle
73806f0675 BUG/MEDIUM: mux-quic: do not crash on qcs_destroy for connection error
On qcs_destroy(), a BUG_ON() statement checks that the QCS does not have any
prepared data left. This is to ensure connection flow control is always
coherent and to prevent transfer freezes.

However, this BUG_ON() may cause a spurious crash in case the QCC is
considered in error. Indeed, in this case, all transfers are interrupted
and qmux_strm_detach() will proceed to an immediate QCS free before
releasing the connection. In this situation, connection flow control is
irrelevant so the BUG_ON() should be ignored.

This crash occurs since the MUX refactoring via the following patch.
Previously, a similar BUG_ON() was used but it was incorrectly
implemented, rendering it immune even to its targeted cause.

  3fe3251593
  MEDIUM: mux-quic: simplify sending API

This should fix github issue #2456.

This does not need to be backported.
2024-02-23 11:41:33 +01:00
Amaury Denoyelle
1b8c5abeeb BUG/MAJOR: server: fix stream crash due to deleted server
Before a dynamic server can be deleted, a set of preconditions must be
validated to ensure it is not referenced anymore by a stream or a
connection. This is implemented in srv_check_for_deletion().

The various criteria specified were incomplete. This allows a server
instance to be deleted while still being referenced by a stream and a
connection.

This bug was reproduced by using ASAN compilation. A script was used to
add and delete a server every second, while using h2load to generate
traffic with download of 1k objects. Here is the ASAN error.

==140916==ERROR: AddressSanitizer: heap-use-after-free on address 0x520000020080 at pc 0x63cb25679537 bp 0x701529ff5070 sp 0x701529ff5060
READ of size 1 at 0x520000020080 thread T7
    #0 0x63cb25679536 in objt_server include/haproxy/obj_type.h:99
    #1 0x63cb2568f465 in process_stream src/stream.c:1823
    #2 0x63cb25a4a4a2 in run_tasks_from_lists src/task.c:632
    #3 0x63cb25a4bf62 in process_runnable_tasks src/task.c:876
    #4 0x63cb2596a220 in run_poll_loop src/haproxy.c:3050
    #5 0x63cb2596b192 in run_thread_poll_loop src/haproxy.c:3252
    #6 0x701539aa9559  (/usr/lib/libc.so.6+0x8b559) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af)
    #7 0x701539b26a3b  (/usr/lib/libc.so.6+0x108a3b) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af)

To fix this, add <curr_used_conns> to the counters checked in
srv_check_for_deletion().

Outside of this bug, one case which remains sensitive is for SF_DIRECT
streams which reference a server instance early in process_stream()
before connect_server(). This occurs with the use-server directive,
force-persist rule or cookie persistence. However, after code
reexamination, the code is considered reliable as process_stream() is
not rescheduled before connect_server() invocation. These observations
have been saved in sess_change_server() documentation to ensure it
remains valid in the future.

This must be backported up to 2.6.
2024-02-22 18:36:54 +01:00
Amaury Denoyelle
4adf2c9f00 BUG/MINOR: stats: drop srv refcount on early release
Server refcount is used to protect from server deletion while dumping a
server instance, for stats dump on both CLI and HTTP applet. However,
dump can be aborted prematurely before reaching the end. In this case,
server refcount is never decremented.

This bug can cause an inconsistency on server refcounts, preventing servers
from being deleted even after "del server" success.

To fix this, implement release handler for both stats CLI and HTTP
applet. Drop server reference if dump was interrupted during servers
loop.

This should be backported up to 2.6.
2024-02-22 18:24:35 +01:00
Aurelien DARRAGON
2462e5bcca BUG/MINOR: log: fix potential lf->name memory leak
Recent commit 2ed6068 ("MINOR: log: custom name for logformat node")
introduced a potential memory leak because when custom name is provided,
lf->name value is allocated using strdup(), thus is expected to be freed
alongside the node when the node is released.

However lf->name was only freed in some common places within log.c
cleanups and helper functions, but in reality there are still cases where
lf nodes are manually freed without making use of the freeing helpers.

So this is what this patch does, it makes sure all lf freeing places now
leverage the free_logformat_node() helper function that takes care of
freeing all known allocated elements within the node, including custom
name.

This commit depends on:
 - "MINOR: log: add free_logformat_node() helper function"

No backport needed unless 2ed6068 gets backported.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
1c2e16ba8a MINOR: log: add free_logformat_node() helper function
Function may be used to free a single logformat node.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
62121d5b90 CLEANUP: log: use free_logformat_list() in parse_logformat_string()
This is a follow up for 24a5e42db6 ("CLEANUP: log: deinitialization of
the log buffer in one function") as there was another opportunity to
make use of the new cleanup function.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
e7aee6edd5 CLEANUP: log: fix process_send_log() indentation
Fix bad indentation for process_send_log() prototype (tab was used instead
of spaces)
2024-02-22 15:32:42 +01:00
Christopher Faulet
1a2a196fcf BUG/MEDIUM: mux-h1: Don't emit 0-CRLF chunk in h1_done_ff() when iobuf is empty
A chunk message transferred via zero-copy forwarding in H1 may be
corrupted. This only happens when the chunk size is not known during the
nego stage and when there is nothing to forward when h1_done_ff() is
called. In this case, we always emit a chunk. Because there is nothing to
forward, a 0-CRLF is emitted in the middle of the message.

The issue occurred with the HTTP stats applet only.

A simple fix is to check the size of data in the iobuf before emitting a new
chunk in h1_done_ff(). However, we still try to send outgoing data because
when this happens, it is most of the time because the H1 output buffer is
almost full.

This patch should fix the issue #2453. No backport needed.
2024-02-21 11:49:58 +01:00
Amaury Denoyelle
a17eaf7763 BUG/MINOR: quic: initialize msg_flags before sendmsg
Previously, the msghdr struct used for sendmsg was memset to 0. This was
updated for performance reasons, with each member individually defined.
This was done by the following commit:

  commit 107d6d7546
  OPTIM: quic: improve slightly qc_snd_buf() internal

msg_flags is the only member left unset, as the sendmsg manual page reports
that it is unused. However, this caused a coverity report. In the end, it is
better to explicitly set it to 0 to avoid any future questions, compiler
warnings or even portability issues.

This should fix coverity report from github issue #2455.

No need to backport unless above patch is.
2024-02-21 10:13:53 +01:00
Willy Tarreau
9d572952a2 BUILD: applet: fix build on some 32-bit archs
The to_forward field was added to the debugging output of applets with commit
62a81cb6a ("MINOR: applet: Add callback function to deal with zero-copy
forwarding"), though it's a size_t printed as %lu, which causes complaints
on 32-bit archs. Let's just cast it so it matches %lu.

No backport is needed.
2024-02-21 04:18:32 +01:00
Amaury Denoyelle
8b950f40fa MINOR: quic: only use sendmsg() syscall variant
This patch is the direct followup of the previous one :
  MINOR: quic: remove sendto() usage variant

This finalizes qc_snd_buf() simplification by removing send() syscall
usage for quic-conn owned socket. Syscall invocation is merged in a
single code location to the sendmsg() variant.

The only difference for owned socket is that destination address for
sendmsg() is set to NULL. This usage is documented in man 2 sendmsg as
valid for connected sockets. This allows maximum performance by avoiding
unnecessary lookups on kernel socket address tables.

As the previous patch, no functional change should happen here. However,
it will be simpler to extend qc_snd_buf() for GSO usage.
2024-02-20 16:42:05 +01:00
Amaury Denoyelle
8de9f8f193 MINOR: quic: remove sendto() usage variant
qc_snd_buf() is a wrapper around emission syscalls. Depending on the QUIC
configuration, a different variant is used. When using a connection
socket, send() is the only one used. For listener sockets, sendmsg() and
sendto() are possible. The first one is used only if the local address has
been retrieved beforehand. This allows it to be fixed on sending to guarantee
the source address selection. Finally, sendto() is used for systems which do
not support local address retrieval.

All of these variants render the code too complex. As such, this patch
simplifies this by removing sendto() alternative. Now, sendmsg() is
always used for listener sockets. Source address is then specified only
if supported by the system.

This patch should not exhibit functional behavior changes. It will be
useful when implementing GSO as the code is now simpler.
2024-02-20 16:42:05 +01:00
Amaury Denoyelle
ea90c39302 MINOR: quic: move IP_PKTINFO on send on a dedicated function
When using a listener socket, the source address for emission is explicitly
set using ancillary data for sendmsg(). This is useful to guarantee the
correct address is used when binding on a non-explicit address.

This code was implemented directly under qc_snd_buf(). However, it is
quite complex due to portability issues. For IPv4, two parallel
implementations coexist, defined under IP_PKTINFO or IP_RECVDSTADDR. For
IPv6, another option is defined under IPV6_RECVPKTINFO. Each variant
uses its own distinct name, which increases the code complexity.

Extract the ancillary data filling into a dedicated function named
cmsg_set_saddr(). This greatly reduces the body of qc_snd_buf(). Such
functions can be replicated when other ancillary data types are
implemented. This will notably be useful for the GSO implementation.
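
For illustration, here is a simplified sketch of what filling the IPv4
source address through ancillary data looks like (assumed names, not the
actual cmsg_set_saddr() code; the caller is assumed to have pointed
msg_control/msg_controllen at a CMSG_SPACE()-sized buffer):

  #define _GNU_SOURCE
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>

  /* Set the IPv4 source address of an outgoing datagram via an IP_PKTINFO
   * control message (IPv6 uses IPV6_PKTINFO with struct in6_pktinfo).
   */
  static void cmsg_set_saddr_v4(struct msghdr *msg, const struct in_addr *saddr)
  {
      struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg);
      struct in_pktinfo *pktinfo;

      cmsg->cmsg_level = IPPROTO_IP;
      cmsg->cmsg_type  = IP_PKTINFO;
      cmsg->cmsg_len   = CMSG_LEN(sizeof(*pktinfo));

      pktinfo = (struct in_pktinfo *)CMSG_DATA(cmsg);
      memset(pktinfo, 0, sizeof(*pktinfo));
      pktinfo->ipi_spec_dst = *saddr; /* source address for this datagram */
  }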
2024-02-20 16:42:05 +01:00
Amaury Denoyelle
107d6d7546 OPTIM: quic: improve slightly qc_snd_buf() internal
qc_snd_buf() is a wrapper for the sendmsg() syscall (or its derivatives)
used for all QUIC emissions. This patch aims at removing several
non-optimal code sections:

* fd_send_ready() for connected sockets is only checked in the function
  preamble instead of inside the emission loop

* zero-ing msghdr structure for unconnected sockets is removed. This is
  unnecessary as all fields are properly initialized then.

* extra memcpy/memset invocations when using IP_PKTINFO/IPV6_RECVPKTINFO
  are removed by setting directly the address value into cmsg buffer
2024-02-20 16:42:05 +01:00
Amaury Denoyelle
9b806550b7 MINOR: quic: warn on bind on multiple addresses if no IP_PKTINFO support
Binding on multiple addresses for QUIC is safe only if IP_PKTINFO or an
equivalent is available. Otherwise, the behavior may be undefined as the
system is responsible for choosing the network interface and source address
on response.

This commit adds a warning on boot if no or partial support for
IP_PKTINFO or equivalent is detected and configuration contains UDP
binding on multiple addresses.

This should be backported up to 2.6. Special backport recommendations:
* change ha_warning() to ha_diag_warning() to ensure no spurious
  warnings will be triggered on stable releases
* IP_PKTINFO usage was introduced in 2.7. For 2.6, multiple addresses
  QUIC binding is always unreliable. As such, the preprocessor condition
  must simply be removed so that the warning is always active regardless
  of the system. The warning message should also be truncated to suppress
  the IP_PKTINFO reference.
2024-02-20 16:40:14 +01:00
Aurelien DARRAGON
ee88c4418f MINOR: log: automate string array construction in sess_build_logline()
make it so string array construction is performed by dedicated macro
helpers instead of manual char insertion between string members.

The goal is to easily be able to support multiple forms of array
construction depending on the data encoding format (raw, json..).

Only %hrl and %hsl logformats are concerned.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
8d2b9e2acd MINOR: log: print metadata prefixes separately in sess_build_logline()
Some log variables may be prefixed with specific chars that represent
extra information that is relevant to them but is not directly part of
the "raw" value.

ie: '+' char is prepended before some values when "option logasap" is
used to indicate that the value has not yet reached its final value.

However, as those "metadata" are printed using the general purpose
LOGCHAR() printing helper, it's not easy to tell if they are part of the
base value or not.

In this patch we add the LOGMETACHAR() helper that is a wrapper for
LOGCHAR(). The goal is to prepare for adding some logic to prevent such
additional infos from being generated when not relevant or needed.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
a2fc40bc28 MINOR: log: simplify quotes handling in sess_build_logline()
Quote building for some log formats is directly performed under each
switch case statement, so it would become painful to add other conditions
to prevent the quotes from being generated when it's not supported by the
data encoding format, for instance (i.e.: JSON).

Let's centralize and simplify quotes handling by adding LOGQUOTE_START()
and LOGQUOTE_END() helper macros. If a quotation is started and not
explicitly ended, it will be automatically ended at the end of the current
logformat node:

LOGQUOTE_START() sets 'quote' variable to 1, this way LOGQUOTE_END() only
prints the ending quote when needed. LOGQUOTE_END() is systematically
called after each node switch-case (after each value). LOGQUOTE_START()
does nothing if LOG_OPT_QUOTE isn't set, so does LOGQUOTE_END().
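
As a rough, simplified sketch (the actual HAProxy macros may differ;
'tmp->options' and 'quote' are assumed to be the current node's options and
a local flag of sess_build_logline(), and LOGCHAR() is the existing
one-character emitter):

  #define LOGQUOTE_START() do {                    \
          if (tmp->options & LOG_OPT_QUOTE) {      \
                  LOGCHAR('"');                    \
                  quote = 1;                       \
          }                                        \
  } while (0)

  #define LOGQUOTE_END() do {                      \
          if (quote) {                             \
                  LOGCHAR('"');                    \
                  quote = 0;                       \
          }                                        \
  } while (0)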

Some rare cases such as %hsl (list of captured headers) required special
handling: in this case multiple quoted texts are generated for the same
field value so explicit LOGQUOTE_START() + LOGQUOTE_END() combination was
needed.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
c6a7138420 MINOR: log: simplify last_isspace in sess_build_logline()
last_isspace variable is explicitly set to 0 in all cases except
LOG_FMT_SEPARATOR case. So we can actually simplify the code by setting
last_isspace to 0 by default and skipping the assignment for the
LOG_FMT_SEPARATOR case.
2024-02-20 15:49:55 +01:00
Aurelien DARRAGON
1448478d62 MINOR: log: explicit typecasting for logformat nodes
Add the ability to manually specify desired output type after a custom
field name for logformat nodes. Forcing the type can be useful to ensure
value is stored with the proper type representation. (i.e.: forcing
numerical to string to work around the limited resolution of JS number
types)

By default, type is set to SMP_T_SAME, which means the original type will
be preserved.

Currently supported types are: bool, str, sint
2024-02-20 15:49:54 +01:00
Aurelien DARRAGON
0cfcc64b79 MINOR: sample: add type_to_smp() helper function
type_to_smp(type) does the reverse operation of smp_to_type[smp]: it takes
a type name as input string and tries to return the corresponding SMP_T_*
smp type or SMP_TYPES if not found.
2024-02-20 15:18:39 +01:00
Aurelien DARRAGON
2ed6068f2a MINOR: log: custom name for logformat node
Add the ability to specify a custom name (which will be used for
representation in verbose output types such as JSON) for logformat nodes.

For now, a custom name should be composed of characters [a-zA-Z0-9-_]*
2024-02-20 15:18:39 +01:00
Amaury Denoyelle
5b31989a3f BUG/MEDIUM: quic: fix transient send error with listener socket
Transient send errors are handled differently when using a connection or a
listener socket for QUIC transfers. In the first case, proper poller
subscription is used via fd_cant_send()/fd_want_send(). For the listener
socket case, the error is ignored by the qc_snd_buf() caller and the
retransmission mechanism will allow the data to be re-emitted.

For the listener socket, transient error code handling is buggy. It blindly
uses fd_cant_send() with the <qc.fd> member which is set to -1 for listener
socket usage. This results in an invalid fdtab access, with a possible
crash or a modification of a totally unrelated FD.

This bug is simply fixed by using qc_test_fd() before using
fd_cant_send()/fd_want_send(). This ensures <qc.fd> is used only if
initialized which is only the case when using connection socket.

No crash has been reported yet for this bug. However, it is reproducible by
using an ASAN compilation and the following strace sendmsg() errno command
injection:

 # strace -qq -yy -p $(pgrep haproxy) -f -e trace=%network \
     -e inject=sendto,sendmsg:error=EAGAIN:when=20+20

This must be backported up to 2.7.
2024-02-19 17:56:51 +01:00
Christopher Faulet
56e73df37d BUG/MEDIUM: hlua: Don't loop if a lua socket does not consume received data
If some data are received for a lua socket while the lua script responsible
for consuming these data is not ready to do so, for instance because it is
sleeping, the applet is woken up in a loop because it never states it will
not consume these data yet.

To fix the issue, in the applet I/O handler, when there are outgoing data, we
always pretend the applet will not consume them. It is the responsibility of
the lua script to reactivate receives by calling the Socket.receive() function.

This patch must be backported to every stable version. For 2.4 and older,
si_want_get()/si_cant_get() must be used instead of
applet_will_consume()/applet_wont_consume().
2024-02-16 15:48:08 +01:00
Christopher Faulet
38534d344b BUG/MEDIUM: hlua: Be able to garbage collect uninitialized lua sockets
It is possible to create a lua socket without performing any connect. In
this case, the lua socket is released by the garbage collector.
However, the garbage collector does not release the applet, it wakes it
up. Since commit 751b59c40b ("BUG/MEDIUM: hlua: Initialize appctx used by a
lua socket on connect only"), the applet initialization is performed on
connect. So, here, it is possible to wake an uninitialized applet. It is an
unexpected case for the applet's I/O handler, leading to a segfault because
some resources are not initialized (the stream's target in this case).

So, now, in the lua socket GC function, we take care to immediately release
uninitialized applets. At worst, the release itself is delayed. But it is
safe because we are sure the applet's I/O handler will never be executed.

In addition, we take care to increment the GC counter when the lua socket is
created. This way, uninitialized lua sockets are released more quickly.

This patch should fix the issue #2451. It must be backported as far as 2.6.
2024-02-16 15:48:08 +01:00
Christopher Faulet
cd7e73efae BUG/MEDIUM: applet: Immediately free appctx on early error
When an error is triggered during the applet initialization, a dedicated
function is called to release it. Indeed, in this case, because the applet
was not initialized, the ->release callback must not be called. However,
because the init stage may be delayed to be performed during the first
applet wakeup, we must also take care to not rely on the default
appctx_free() function, to immediately release the applet. Otherwise, if the
error happens in a delayed init stage, the applet is never released.

This patch partially fixes the issue #2451. It must be backported as far as
2.6.
2024-02-16 15:48:08 +01:00
Amaury Denoyelle
f8df9bd6a5 BUG/MINOR: qpack: reject invalid dynamic table capacity
Currently haproxy does not implement dynamic table support for QPACK. As
such, the dynamic table capacity advertised via H3 SETTINGS is 0. When
receiving a non-null Set Dynamic Table Capacity instruction, immediately
close the connection using QPACK_ENCODER_STREAM_ERROR.

Prior to this patch, such instructions were simply ignored. This does not
conform to the QUIC specification.

This should be backported up to 2.6. Note that on 2.6 qcc_set_error()
must be replaced by function qcc_emit_cc_app().
2024-02-15 17:46:53 +01:00
Amaury Denoyelle
bd71212ea9 BUG/MINOR: qpack: reject invalid increment count decoding
Close the connection using QPACK_DECODER_STREAM_ERROR when receiving an
invalid insert count increment. As haproxy does not use dynamic table,
this instruction must never be emitted by the peer.

Prior to this patch, haproxy silently ignored such instructions, which does
not conform to the QUIC specification.

This should be backported up to 2.6. Note that on 2.6 qcc_set_error()
must be replaced by function qcc_emit_cc_app().
2024-02-15 17:46:19 +01:00
Amaury Denoyelle
cc29ab437e BUG/MINOR: quic: reject HANDSHAKE_DONE as server
As specified in RFC 9000, a client must never emit a HANDSHAKE_DONE
frame. If this happens, the server must close the connection with error
PROTOCOL VIOLATION.

Previously, such a frame was silently discarded on the server side. The
connection remained open, which does not conform to the specification.

This should be backported up to 2.6.
2024-02-15 17:07:24 +01:00
Amaury Denoyelle
80b82c2192 MINOR: quic: handle all frame types on reception
Ensure every frame type is handled in qc_parse_pkt_frms. Add an
ABORT_NOW on the default case. This is safe as an unknown frame must have
been rejected earlier via qc_parse_frm().
2024-02-15 17:07:24 +01:00
Amaury Denoyelle
5a2aa8c161 BUG/MINOR: quic: reject unknown frame type
As specified by RFC 9000, connection is closed on error if an unknown
QUIC frame type is received.

Previously, a frame with an unknown type was silently discarded. The
connection remained open, which does not conform to the specification.

This should be backported up to 2.6.
2024-02-15 17:04:17 +01:00
Christopher Faulet
081022a0c5 MINOR: muxes/applet: Simplify checks on options to disable zero-copy forwarding
Global options to disable zero-copy forwarding are now tested outside the
callbacks responsible for performing the forwarding itself. It is cleaner
this way because we don't try zero-copy forwarding at all if at least one
side does not support it. It is equivalent to what was performed before,
but it is simpler this way.
2024-02-14 15:41:04 +01:00
Christopher Faulet
ddf6b7539c BUG/MAJOR: stconn: Check support for zero-copy forwarding on both sides
There is a nego stage when a producer is ready to forward data to the other
side. At this stage, the zero-copy forwarding may be disabled if the
consumer does not support it. However, there is a flaw with this way of
proceeding. If the channel buffer is not empty, we delay the zero-copy
forwarding to flush all data from the channel first. During this delay,
receives on the endpoint (at the connection level for muxes) are blocked to
be sure to have the opportunity to switch to zero-copy forwarding. It is a
problem if the consumer cannot flush data from the channel's buffer, for
instance while waiting for more data.

It is especially annoying with the CLI applet, because this scenario can
happen if a command is partially received, for instance without the LF at
the end. In this case, the CLI applet is blocked because it is waiting for
more data. The frontend connection is also blocked because the channel's
data must be flushed before trying to receive more data. Worse, this happens
where no timeout is armed. Thus the session is stuck infinitely, client
aborts cannot be detected because receives are blocked, and the applet
cannot abort on its side because there are pending outgoing data. It is
clearly a situation where it is easy to consume all CLI slots.

To fix the issue, thanks to previous commits, we now check zero-copy
forwarding support on both sides before proceeding.

This patch relies on the following commits:

  * MINOR: muxes: Announce support for zero-copy forwarding on consumer side
  * MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
  * MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
  * CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield

All the series must be backported to 2.9.
2024-02-14 15:41:02 +01:00
Christopher Faulet
e2921ffad1 MINOR: muxes: Announce support for zero-copy forwarding on consumer side
It is unused for now, but the muxes announce their support of zero-copy
forwarding on the consumer side. All muxes, except the fcgi one, support
it.
2024-02-14 15:15:10 +01:00
Christopher Faulet
7598c0ba69 MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
To fix a bug, a flag to announce the capability to support zero-copy
forwarding on the consumer side will be added to the SE descriptor. So the
old flag SE_FL_MAY_FASTFWD is renamed to indicate it concerns the producer
side. It is now SE_FL_MAY_FASTFWD_PROD. And to prepare the addition of the
new flag, the bitfield is slightly reordered.
2024-02-14 15:00:32 +01:00
Christopher Faulet
40d98176ba BUG/MEDIUM: stconn: Don't check pending shutdown to wake an applet up
This reverts commits 0b93ff8c87 ("BUG/MEDIUM: stconn: Wake applets on
sending path if there is a pending shutdown") and 9e394d34e0 ("BUG/MINOR:
stconn: Don't report blocked sends during connection establishment") because
they were not the right fixes.

We must not wake an applet up when a shutdown is pending because it means
some output data are still blocked in the channel buffer. The applet does
not necessarily consume these data. In this case, the applet may be woken up
infinitely, except if it explicitly reports it won't consume data yet.

This patch must be backported as far as 2.8. For older versions, as far as
2.2, it may be backported. If so, a previous fix must be pushed to prevent
an HTTP applet from getting stuck. In http_ana.c, in http_end_request() and
http_end_response(), the call to channel_htx_truncate() on the request
channel in case of MSG_ERROR must be replaced by a call to
channel_htx_erase().
2024-02-14 14:22:36 +01:00
Christopher Faulet
5895fa8ce0 MINOR: cli: No longer check SC for shutdown to interrupt wait command
Thanks to the previous patch ("MEDIUM: applet: Add notion of shutdown for
write for applets"), it is no longer necessary to check SC flags to detect
shutdowns to interrupt the wait command. It is possible to remove this ugly
workaround. In addition, we only test the SE for shutdown because end of
stream and error are already checked by the CLI I/O handler. And it is no
longer necessary to remove output data from the channel's buffer because
shutdown are not reported if there are remaining outgoing data.

Of course, if the "wait" command is backported, the commit above and this
one must be backported too.
2024-02-14 14:22:36 +01:00
Christopher Faulet
4a78f766ff MEDIUM: applet: Add notion of shutdown for write for applets
In fact there are already flags on the SE to state that a shutdown for reads
or writes was performed. But for applets, this notion does not exist. Both
flags are set at the same time when the applet is released. But at the SC
level, there are functions to perform a shutdown (formerly the shutw) and an
abort (formerly the shutr). For applets, when a shutdown is performed on the
SC, if the applet is not immediately released, nothing is acknowledged at the
SE level.

With the old way to implement applets, this was not a real issue until
recently because applets accessed the channel/SC flags. It was thus possible
to catch the shutdowns. But the "wait" command on the CLI reveals the
flaw. Indeed, when this command is executed, nothing is read or sent. So, it
is not possible to detect the shutdowns. As a workaround, a dedicated test
on the SC flags was added at the end of the wait command I/O handler. But it
is pretty ugly.

With the new way to implement applets, there is no longer access to the
channel or SC. So we must add a way to acknowledge shutdowns in the SE.

This patch solves both sides of the issue. The shutw notion is added for
applets. Its only purpose is to set the SE_FL_SHWN flag. This flag is tested
by all applets, so it solves the issue quite simply.

Note that it is described as a bug fix but there is no real issue, just a
design flaw. However, if the "wait" command is backported, this patch must
be backported too. Unfortunately it will require an adaptation because there
are no appctx flags on older versions.
2024-02-14 14:22:36 +01:00
Christopher Faulet
dcd917d972 MINOR: applet: Remove useless test on SE_FL_SHR/SHW flags
Both of these flags are set after releasing the applet, in
appctx_shut(). Concretely, it means the applet is shut down for reads and
writes. Once set, the applet's I/O handler is no longer called. Tests on
these flags are useless. There is no chance to match them.
2024-02-14 14:22:36 +01:00
Christopher Faulet
5df45cff8f BUG/MEDIUM: stconn/applet: Block 0-copy forwarding if producer needs more room
This case does not exist yet with the H1 multiplexer, but applets may decide
not to produce data if there is not enough room in the destination buffer
(the applet's outbuf or the opposite SE buffer). It is true for the stats
applet for instance. However this case is not properly handled when
zero-copy forwarding is in use.

To fix the issue, the se_done_ff() function was modified to return the number
of bytes really forwarded and to subscribe for sends if nothing was forwarded
while the zero-copy forwarding was blocked by the producer. On the applet
side, we take care to block the zero-copy forwarding if the applet requests
more room. At the end, zero-copy forwarding is unblocked if something was
forwarded.

This way, it is now possible for the stats applet to report a full buffer and
block the zero-copy forwarding, even if the buffer is not really full, by
requesting more room.

No backport needed.
2024-02-14 14:22:36 +01:00
Christopher Faulet
ece002af1d BUG/MEDIUM: applet: Add a flag to state an applet is using zero-copy forwarding
An issue was introduced when zero-copy forwarding was added to the stats and
cache applets. There is no test to be sure the upper layer is ready to use
the zero-copy forwarding. So these applets refuse to deliver the response
into the applet's output buffer if the zero-copy forwarding is supported by
the opposite endpoint. It is especially an issue when a filter, like the
compression, is in-use on the response channel.

Because of this bug, the response is not delivered and the applet is woken
up in loop to produce data.

To fix the issue, an appctx flag was added, APPCTX_FL_FASTFWD, to know when
the zero-copy forwarding is in-use. We rely on this flag to not fill the
outbuf in the applet's I/O handler.

No backport needed.
2024-02-14 14:22:36 +01:00
Christopher Faulet
1465eb570b MINOR: stats: Use a dedicated function to check if output is almost full
This simplifies a bit the stats applet. Because the CLI part was not
refactored yet to use the applet's buffers, there are 3 ways to produce
data:

  * the HTX message for the HTTP stats when zero-copy forwarding is not
    used
  * raw data in the opposite endpoint buffer for the HTTP stats when
    zero-copy forwarding is used
  * the channel buffer when the CLI "show stat" command is evaluated

There is already a dedicated function to take care of copying data to the
right place. There is now also a dedicated function to check whether the
output buffer is almost full.
2024-02-14 14:22:36 +01:00
Christopher Faulet
3ee3a7937a BUG/MAJOR: mux-h1: Fix zero-copy forwarding when sending chunks of unknown size
Commit 91b77c1632 ("MEDIUM: mux-h1: Support zero-copy forwarding for chunks with
an unknown size") was recently pushed but it contains 3 bugs. The first one is
during the nego. The extra size reserved for the CRLF at the end of the chunk
must not be added to the offset value. Indeed, the CRLF will be appended after
the data and not prepended to them.

The second one, still during the nego, is an integer overflow when the available
room in the output buffer is computed.

Finally, the last one is when the chunk itself is formatted. This part was
totally buggy if the output buffer was not empty at the beginning.

No backport needed.
2024-02-14 14:22:36 +01:00
Frederic Lecaille
167e38e0e0 MINOR: quic: Add a counter for reordered packets
A packet is considered as reordered when it is detected as lost because its
packet number is below the largest acknowledged packet number by at least the
packet reordering threshold value.

Add a new ->nb_reordered_pkt quic_loss struct member, at the same location as
the number of lost packets, to count such packets.

Should be backported to 2.6.
2024-02-14 11:32:29 +01:00
Frederic Lecaille
eeeb81bb49 MINOR: quic: Dynamic packet reordering threshold
Let's say that the largest packet number acknowledged by the peer is #10. When
inspecting the not yet acknowledged packets to detect if they are lost or not,
this is the case at least if the difference between this largest packet number
and their packet numbers is greater than or equal to the packet reordering
threshold as defined by RFC 9002. The latter must not be less than
QUIC_LOSS_PACKET_THRESHOLD(3). With such a value, packets #7 and older are
detected as lost if not acknowledged, contrary to packets #8 or #9.

So, the packet loss detection is very sensitive to such a network
characteristic where non-acknowledged packets are distant from each other by
their packet number differences.

Do not use this static value anymore for the packet reordering threshold
which is used as a criterion to detect packet loss. Instead, make it depend on
the difference between the number of the last transmitted packet and the
number of the oldest one among the packets which are still in flight, before
they are inspected to be deemed lost.

Add new tune.quic.reorder-ratio setting to apply a ratio in percent to this
dynamic packet reorder threshold.
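
As a hypothetical illustration of the resulting computation (the exact
HAProxy code may differ):

  #include <stdint.h>

  #define QUIC_LOSS_PACKET_THRESHOLD 3

  /* Sketch: the dynamic threshold is a percentage (tune.quic.reorder-ratio)
   * of the packet number distance between the last transmitted packet and
   * the oldest packet still in flight, floored at the static minimum.
   */
  static uint64_t pkt_reorder_threshold(uint64_t last_tx_pn,
                                        uint64_t oldest_inflight_pn,
                                        unsigned int reorder_ratio_pct)
  {
      uint64_t dyn = (last_tx_pn - oldest_inflight_pn) * reorder_ratio_pct / 100;

      return dyn < QUIC_LOSS_PACKET_THRESHOLD ? QUIC_LOSS_PACKET_THRESHOLD : dyn;
  }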

Should be backported to 2.6.
2024-02-14 11:32:29 +01:00
Frederic Lecaille
2ed53ae4a0 MINOR: quic: Update K CUBIC calculation (RFC 9438)
The new formula for K CUBIC which arrives with RFC 9438 is as follows:

       K = cubic_root((W_max - cwnd_epoch) / C)

Note that W_max is c->last_w_max, and cwnd_epoch is c->cwnd when entering
quic_cubic_update() just after a congestion event.
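
Expressed in floating point for illustration only (HAProxy's implementation
uses its own arithmetic), with W_max and cwnd_epoch in bytes and C in
bytes/s^3, K comes out in seconds:

  #include <math.h>

  /* RFC 9438: K = cubic_root((W_max - cwnd_epoch) / C) */
  static double cubic_k(double w_max, double cwnd_epoch, double c)
  {
      return cbrt((w_max - cwnd_epoch) / c); /* seconds */
  }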

Must be backported as far as 2.6.
2024-02-12 13:44:42 +01:00
Frederic Lecaille
406c63ba44 BUG/MEDIUM: quic: Wrong K CUBIC calculation.
The formula for K CUBIC calculation is as follows:

		K = cubic_root(W_max * (1 - beta_quic) / C).

Note that this does not match the comment. But the aim of this patch is to not
hide a bug inside another patch to update this K CUBIC calculation.

The unit of C is bytes/s^3 (or segments/s^3). And we want to store K as
milliseconds. So, the conversion inside the cubic_root() to convert seconds in
milliseconds is wrong. The unit used here is bytes/(ms/1000)^3 or
bytes*1000^3/ms^3. That said, it is preferable to compute K as seconds, then
convert to milliseconds as done by this patch.

Must be backported as far as 2.6.
2024-02-12 13:44:42 +01:00
Remi Tricot-Le Breton
5e66bf26ec BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP upate task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.

These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.

Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.

An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.

This patch can be backported up to 2.8.
2024-02-12 11:15:45 +01:00
Willy Tarreau
b746af9990 BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()
As reported by github user @JB0925 in issue #2427, there is a possible
crash in pool_flush(). The problem is that if the free_list is not empty
in the first test, and is empty at the moment the xchg() is performed,
for example because another thread called it in parallel, we place a
POOL_BUSY there that is never removed later, causing the next thread to
wait forever.

This was introduced in 2.5 with commit 2a4523f6f ("BUG/MAJOR: pools: fix
possible race with free() in the lockless variant"). It has probably
very rarely been detected, because:
  - pool_flush() is only called when stopping is set
  - the function does nothing if global pools are disabled, which is
    the case on most modern systems with a fast memory allocator.

It's possible to reproduce it by modifying __task_free() to call
pool_flush() on 1% of the calls instead of only when stopping.

The fix is quite simple, it consists in moving the zeroing of the
entry in the break path after verifying that the entry was not already
busy.

This must be backported wherever commit 2a4523f6f is.
2024-02-10 12:38:40 +01:00
Willy Tarreau
ab8928b9db BUILD: address a few remaining calloc(size, n) cases
In issue #2427 Ilya reports that gcc-14 rightfully complains about
sizeof() being placed in the left term of calloc(). There's no impact
but it's a bad pattern that gets copy-pasted over time. Let's fix the
few remaining occurrences (debug.c, halog, udp-perturb).
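
The pattern in question, on a trivial hypothetical example:

  #include <stdlib.h>

  struct entry { int value; };

  /* Keep the element count first and the element size second;
   * calloc(sizeof(...), n) is the reversed pattern being fixed.
   */
  static struct entry *alloc_entries(size_t n)
  {
      return calloc(n, sizeof(struct entry));
  }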

This can be backported to all branches, and the irrelevant parts dropped.
2024-02-10 11:37:27 +01:00
Willy Tarreau
613e959c7b MINOR: cli/wait: add a condition to wait on a server to become unused
The "wait" command now supports a condition, "srv-unused", which waits
for the designated server to become totally unused, indicating that it
is removable. Upon each wakeup it calls srv_check_for_deletion() to
verify if conditions are met, if not if it's recoverable, or if it's
not recoverable, and proceeds according to this, never waiting for a
final decision longer than the configured delay.

The purpose is to make it possible to remove servers from the CLI after
waiting for their sessions to be terminated:

  $ socat -t5 /path/to/socket - <<< "
        disable server px/srv1
        shutdown sessions server px/srv1
        wait 2s srv-unused px/srv1
        del server px/srv1"

Or even wait for connections to terminate themselves:

  $ socat -t70 /path/to/socket - <<< "
        disable server px/srv1
        wait 1m srv-unused px/srv1
        del server px/srv1"
2024-02-09 20:38:08 +01:00
Willy Tarreau
66989ff426 MINOR: cli/wait: also pass up to 4 arguments to the external conditions
Conditions will need to have context, arguments etc from the command line.
Since these will vary with time (otherwise we wouldn't wait), let's just
pass them as text (possibly pre-processed). We're starting with 4 strings
that are expected to be allocated by strdup() and are always sent to free()
upon release.
2024-02-09 20:38:08 +01:00
Willy Tarreau
2673f8be82 MINOR: cli/wait: also support an unrecoverable failure status
Since we'll support waiting for an action to succeed or permanently
fail, we need the ability to return an unrecoverable failure. Let's
add CLI_WAIT_ERR_FAIL for this. A static error message may be placed
into ctx->msg to report to the user why the failure is unrecoverable.
2024-02-09 20:38:08 +01:00
Willy Tarreau
d8731c6680 MINOR: cli/wait: make the wait command support a more detailed help message
We'll want to add some waiting conditions, so let's support -h to show
the available list, and also print this usage on unknown options.
2024-02-09 20:38:08 +01:00
Willy Tarreau
9b680d7411 MINOR: server: split the server deletion code in two parts
We'll need to be able to verify whether or not a server may be deleted.
For now, both the verification and the action are performed in the same
function, at once under thread isolation. The goal here is to extract
the verification code into a new function that will perform these checks,
return a status between success/recoverable/non-recoverable failure, and
will also return a message for the caller.
2024-02-09 20:38:08 +01:00
Christopher Faulet
17cc4e4684 BUG/MINOR: applet: Always release empty appctx buffers after processing
When an applet is using its own buffers, it is important to release them, if
empty, after processing to recycle unused buffers. It is not a leak because
these buffers are necessarily released when the applet is released. But this
leads to an excess of buffer allocations.

No need to backport.
2024-02-09 15:14:38 +01:00
Willy Tarreau
1d2255a78a MINOR: cli: add a new "wait" command to wait for a certain delay
This allows delays to be inserted between commands, e.g. to collect the same
set of metrics at a fixed interval. E.g:

  $ socat -t20 /path/to/socket <<< "show activity; wait 10s; show activity"

The goal will be to extend the feature to optionally support waiting on
certain conditions. For this reason the struct definitions and enums were
placed into cli-t.h.
2024-02-08 21:54:54 +01:00
Willy Tarreau
02b31fa003 MINOR: cli: always reset the applet task's timeout
The CLI applet doesn't make use of its timeout at all, only the stream
does. That's convenient because it means any command's I/O handler can
trivially set a wakeup timer by simply touching the task's ->expire
field, and the I/O handler will automatically be woken up again. The
only condition for this is that we properly take care of clearing that
timeout whenever we finish processing a command and switch back to the
PROMPT state. That's what this patch does.
2024-02-08 20:53:31 +01:00
Willy Tarreau
3d91ffdaff MINOR: cli: make sure to always print a pending message after release()
If a release handler produces a final message, it's currently left
pending in the CLI context and needs another I/O event to be dumped
because immediately after calling ->release, we check for states
OUTPUT and above and we wait until more data arrives.

This patch adds a continue statement to go back to the loop immediately
after leaving the release handler in order to attempt to emit the
output message.

At this point it's not certain whether any release handlers are producing
messages, so it's probably not needed to backport this.
2024-02-08 18:22:35 +01:00
Willy Tarreau
6219a58d28 BUG/MEDIUM: cli: fix once for all the problem of missing trailing LFs
Some commands are still missing their trailing LF, and very few were even
already spotted in the past emitting more than one. The risk of missing
this LF is particularly high, especially when tests are run in non-
interactive mode where the output looks good at first glance. The problem
is that once run in interactive mode, the missing empty line makes the
command appear incomplete, and scripts can wait forever.

Let's tackle the problem at its root: messages emitted at the end must
always end with an LF and we know some miss it. Thus, in cli_output_msg()
we now start by removing the trailing LFs from the string, and we always
add exactly one. This way the trailing LFs added by correct functions are
silently ignored and all functions are now correct.
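
As an illustration of the principle, here is a minimal standalone sketch
(not the actual cli_output_msg() code): strip any trailing LFs already
present, then append exactly one.

  #include <string.h>

  /* Normalize <msg> (a writable NUL-terminated string stored in a buffer
   * of <size> bytes) so that it ends with exactly one LF.
   */
  static void normalize_trailing_lf(char *msg, size_t size)
  {
      size_t len = strlen(msg);

      /* drop every trailing LF that may already be there */
      while (len && msg[len - 1] == '\n')
          msg[--len] = '\0';

      /* then append exactly one, if room permits */
      if (len + 1 < size) {
          msg[len++] = '\n';
          msg[len] = '\0';
      }
  }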

This would need to be progressively backported to all supported versions
in order to address them all at once, though the risk of breaking a legacy
script relying on the wrong output is never zero. At first it should at
least go as far as the latest LTS (2.8), and maybe another one depending
on user demands. Note that it also requires previous patch ("BUG/MINOR:
vars/cli: fix missing LF after "get var" output") because it fixes a test
for a bogus output for "get var" in a VTC.
2024-02-08 18:22:15 +01:00
Willy Tarreau
5d0dd88ac6 BUG/MINOR: vars/cli: fix missing LF after "get var" output
"get var" on the CLI was also missing an LF, and the vtest as well, so
that fixing only the code breaks the vtest. This must be backported to
2.4 as the issue was brought with commit c35eb38f1d ("MINOR: vars/cli:
add a "get var" CLI command to retrieve global variables").
2024-02-08 18:22:01 +01:00
Willy Tarreau
eaeb67bdb4 BUG/MINOR: server/cli: add missing LF at the end of certain notice/error lines
Some cli_err(), cli_msg() or even ha_error() etc are missing the trailing
LF, which breaks the continuity of the CLI parsing: the extra LF that serves
to mark the end of the command is in fact taken as the missing LF and no
extra one is added.

This patch adds the missing LF on identified messages. It might be worth
trying to proceed in a more generic way with this, given the amount of
code that is possibly at risk.
2024-02-08 18:21:52 +01:00
Willy Tarreau
870e2d3f1f MEDIUM: mux-h2: update session trackers with number of glitches
We now update the session's tracked counters with the observed glitches.
In order to avoid incurring a high cost, e.g. if many small frames contain
issues, we batch the updates around h2_process_demux() by directly passing
the difference. Indeed, for now all functions that increment glitches are
called from h2_process_demux(). If that were to change, we'd just need to
keep the value of the last synced counter in the h2c struct instead of the
stack.

The regtest was updated to verify that the 3rd client, which does not cause
any issue, still sees the counter resulting from client 2's mistakes. The rate
is also verified, considering it shouldn't fail since the period is very
long (1m).
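
A minimal sketch of the batching idea with hypothetical names (not the
actual h2_process_demux() code): glitches are counted locally while
demuxing and the tracked counters only receive the difference once per
pass.

  #include <stdint.h>

  struct conn_ctx {
      uint32_t glitches;          /* total glitches on this connection */
  };

  /* stand-in for the per-session tracked counters update */
  static void track_glitches(uint32_t inc)
  {
      (void)inc;
  }

  static void demux_pass(struct conn_ctx *ctx, int nbframes)
  {
      uint32_t before = ctx->glitches;

      for (int i = 0; i < nbframes; i++) {
          /* ... demux one frame, possibly doing ctx->glitches++ ... */
      }

      /* single batched update with the observed difference */
      if (ctx->glitches != before)
          track_glitches(ctx->glitches - before);
  }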
2024-02-08 15:51:49 +01:00
Willy Tarreau
8581d62daf MINOR: session: add the necessary functions to update the per-session glitches
This provides a new function session_add_glitch_ctr() that will update
the glitch counter and rate for the session, if tracked at all.
2024-02-08 15:51:49 +01:00
Willy Tarreau
c9c6b683fb MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
This adds a new pair of stored types in the stick-tables:
  - glitch_cnt
  - glitch_rate

These keep count of the number of glitches reported on a front connection,
in order to decide how to act with a badly defective client or a potential
attacker. For now nothing updates these counters, but all the infrastructure
needed to configure, update and retrieve them was added, including the doc.

No regtest was added yet since they're not filled yet.
2024-02-08 15:51:49 +01:00
Willy Tarreau
9f3a0834d8 MINOR: mux-h2: count late reduction of INITIAL_WINDOW_SIZE as a glitch
It's quite uncommon for a client to decide to change the connection's
initial window size after the settings exchange phase, unless it tries
to increase it. One of the impacts is that it updates all
streams, so it can be expensive depending on the stack, and may even
be used to construct an attack. For this reason, we now count a glitch
when this happens.

A test with h2spec shows that it triggers 9 across a full test.
2024-02-08 15:51:49 +01:00
Willy Tarreau
28dfd006ca MINOR: mux-h2: count excess of CONTINUATION frames as a glitch
Here we consider that if a HEADERS frame is made of more than 4 fragments
whose average size is lower than 1kB, that's very likely an abuse so we
count a glitch per 16 fragments, which means 1 glitch per 1kB frame in a
16kB buffer. This means that an abuser sending 1600 1-byte frames would
increase the counter by 100, and that sending 100 headers per request in
individual frames each results in a count of ~7 to be added per request.

A test consisting in sending 100M requests made of 101 frames each over
a connection resulted in ~695M glitches to be counted for this connection.

Note that no special care is taken to avoid wrapping since it already takes
a very long time to reach 100M and there's no particular impact of wrapping
here (roughly 1M/s).
2024-02-08 15:51:49 +01:00
Willy Tarreau
eeacca75d1 BUG/MINOR: mux-h2: count rejected DATA frames against the connection's flow control
RFC9113 clarified a point regarding the payload from DATA frames sent to
closed streams. It must always be counted against the connection's flow
control. In practice it should have no visible effect, but if
repeated upload attempts are aborted, this might cause the client's
window to progressively shrink since the data are never ACKed.

It's probably not necessary to backport this, unless another patch
depends on it.
2024-02-08 15:51:49 +01:00
Aurelien DARRAGON
0c437b2dfc MINOR: sample: implement bc_{be,srv}_queue samples
%[bc_be_queue] and %[bc_srv_queue] are equivalent to %bq and %sq tags in
log-format.
2024-02-08 09:39:23 +01:00
Aurelien DARRAGON
16014bc5b3 MINOR: stream: rename "txn.redispatch" to "txn.redispatched"
The fetch will return true if the stream was redispatched: this is a
past action, thus we rename the fetch to better reflect its true
meaning and prevent confusion.

Documentation was updated.
While at it, the fetch was moved from internal states section to Layer 4
section, which is where it belongs.

No backport needed unless 92b2edb (" MINOR: stream: add "txn.redispatch"
fetch") gets backported.
2024-02-08 09:39:14 +01:00
Remi Tricot-Le Breton
e29ec2e649 BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
If a certificate that has an OCSP uri is unused and gets added to a
crt-list with the ocsp auto update option "on", it would not have been
inserted into the auto update tree because this insertion was only
working on the first call of the ssl_sock_load_ocsp function.
If the configuration used a crt-list like the following:
    cert1.pem *
    cert2.pem [ocsp-update on] *

Then calling "del ssl crt-list" on the second line and then reverting
the delete by calling "add ssl crt-list" with the same line, then the
cert2.pem would not appear in the ocsp update list (can be checked
thanks to "show ssl ocsp-updates" command).

This patch ensures that in such a case we still perform the insertion in
the update tree.

This patch can be backported up to branch 2.8.
2024-02-07 17:10:49 +01:00
Remi Tricot-Le Breton
a290db5706 BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
The ckch_store's free'ing function might end up calling
'ssl_sock_free_ocsp' if the corresponding certificate had ocsp data.
This ocsp cleanup function expects the 'refcount_instance' member of
the certificate_ocsp structure to be 0, meaning that no live
ckch instance kept a reference on this certificate_ocsp structure.
But since in ckch_store_free we were destroying the ckch_data before
destroying the linked instances, the BUG_ON would trigger during a standard
deinit. Reversing the cleanup order fixes the problem.

Must be backported to 2.8.
2024-02-07 17:10:31 +01:00
Remi Tricot-Le Breton
befebf8b51 BUG/MEDIUM: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when the context is freed.
One of the downsides of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets destroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are decorrelated from the ckch_store and
are the actual OCSP_CERTIDs, which is a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.

The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This allows chaining some "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.
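
As an illustration, a standalone sketch of this dual-refcount logic with
hypothetical names (not the actual certificate_ocsp structure): the
instance counter controls the presence in the auto-update tree, while the
store counter controls the lifetime of the entry itself.

  struct ocsp_entry {
      int refcount_instance;      /* references held by ckch instances */
      int refcount_store;         /* references held by ckch stores */
      int in_update_tree;         /* 1 if eligible for auto-update */
      int in_response_tree;       /* 1 while the response is still known */
  };

  static void ocsp_drop_instance_ref(struct ocsp_entry *o)
  {
      if (--o->refcount_instance == 0)
          o->in_update_tree = 0;  /* stop auto-updating, keep the entry */
  }

  static void ocsp_drop_store_ref(struct ocsp_entry *o)
  {
      if (--o->refcount_store == 0)
          o->in_response_tree = 0; /* last owner gone, entry can be freed */
  }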

Must be backported to 2.8.
2024-02-07 17:10:05 +01:00
Remi Tricot-Le Breton
23cab33b67 BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
When deleting a crt-list line through a "del ssl crt-list" call on the
CLI, we ended up free'ing the corresponding ckch instances without fully
clearing their contents. It left some dangling references on other
objects because the attached SSL_CTX was not deleted, as well as all the
ex_data referenced by it (OCSP responses for instance).

This patch can be backported up to branch 2.4.
2024-02-07 17:10:00 +01:00
Remi Tricot-Le Breton
28e78a0a74 MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We then don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
2024-02-07 17:09:39 +01:00
Remi Tricot-Le Breton
1fda0a5202 BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
When calling ckchs_dup (during a "set ssl cert" CLI command), if the
modified store had OCSP auto update enabled then the new certificate
would not keep the previous update mode and would not appear in the auto
update list.

This patch can be backported to 2.8.
2024-02-07 17:09:34 +01:00
Christopher Faulet
d7467cd495 MINOR: applet: Identify applets using their own buffers via a flag
These applets can now be identified by testing the APPCTX_FL_INOUT_BUFS
flag. This will be useful to distinguish between the kinds of applets in
helper functions.
2024-02-07 15:05:05 +01:00
Christopher Faulet
a9301c96f1 MINOR: applet: Use an option to disable zero-copy forwarding for all applets
At the beginning of the 3.0-dev cycle, the zero-copy forwarding support was
added only for the cache applet with an option to disable it. This was a
hack, waiting for a better integration with applets. It is now possible to
implement the zero-copy forwarding for any applets. So the specific option
for the cache applet was renamed to be used for all applets. And this option
is now also checked for the stats applet.

Concretely, 'tune.cache.zero-copy-forwarding' was renamed to
'tune.applet.zero-copy-forwarding'.
2024-02-07 15:05:01 +01:00
Christopher Faulet
00152bad85 MINOR: cache: Remove unused .data_sent field from the cache applet context
This field was introduced when the first implementation of the zero-copy
forwarding was added. It is now useless. However, we must still save the
body-size of the object in the cache.
2024-02-07 15:04:57 +01:00
Christopher Faulet
ee53d8421f MEDIUM: applet: Simplify a bit API to exchange data with applets
Default .rcv_buf and .snd_buf functions that applets can use are now
specialized to manipulate raw buffers or HTX buffers.

Thus a TCP applet should use appctx_raw_rcv_buf() and appctx_raw_snd_buf()
while HTTP applet should use appctx_htx_rcv_buf() and appctx_htx_snd_buf().

Note that the appctx is now directly passed to these functions instead of
the SC.
2024-02-07 15:04:52 +01:00
Christopher Faulet
868205943c MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
Just like for the cache applet, it is now possible to send the response to the
opposite side using the zero-copy forwarding. Internal functions were
slightly updated but there is nothing special to say, except that the requested
size during the nego stage is not exact.
2024-02-07 15:04:48 +01:00
Christopher Faulet
91b77c1632 MEDIUM: mux-h1: Support zero-copy forwarding for chunks with an unknown size
Till now, for chunked messages, the H1 mux used the size requested during
the zero-copy forwarding negotiation as the chunk size. And till now, this
was accurate because the requested size was indeed the chunk size on the
producer side.

But this will be a problem to implement the zero-copy forwarding on some
applets because the content size is not known during the nego but only when
it is produced. Thanks to previous patches, it is now possible to know the
requested size is not exact and we are able to reserve a larger space to
write the chunk size later, in h1_done_ff(), with some padding.
2024-02-07 15:04:44 +01:00
Christopher Faulet
dcb964f8db MINOR: mux-h1: Stop zero-copy forwarding during nego for too big requested size
Now, during the zero-copy forwarding negotiation, when the requested size is
exact, we are able to check whether it is bigger than the expected one or
not. If it is indeed bigger than expected, the zero-copy forwarding is
disabled and the error will be triggered later on the normal sending path.
2024-02-07 15:04:41 +01:00
Christopher Faulet
1c18d32a0d MEDIUM: stconn: Notify requested size during zero-copy forwarding nego is exact
It is now possible to use a flag during zero-copy forwarding negotiation to
specify the requested size is exact, it means the producer really expects to
receive at least this amount of data.

It can be used by consumer to prepare some processing at this stage, based
on the requested size. For instance, in the H1 mux, it is used to write the
next chunk size.
2024-02-07 15:04:38 +01:00
Christopher Faulet
fe506d7aaa MINOR: mux-h1: Be able to define the length of a chunk size when it is prepended
It is now possible to impose the length used to represent the chunk size in
the function that prepends the chunk size in a buffer (so before the chunk
itself). It is thus possible to reserve a specific space for an unknown
chunk size and to pad it with leading '0' to use all the space and avoid
holes.
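
A minimal sketch under these assumptions (hypothetical helper, not the
actual H1 function): write the chunk size as hex into a field of a fixed
number of digits, padded with leading '0', followed by CRLF.

  #include <stdio.h>

  /* Writes <size> as a chunk-size line of exactly <digits> hex digits
   * (padded with leading '0') followed by CRLF into <dst>. Returns the
   * number of bytes written or -1 if <dst> is too small.
   */
  static int write_padded_chunk_size(char *dst, size_t dstsize,
                                     int digits, size_t size)
  {
      int ret = snprintf(dst, dstsize, "%0*zx\r\n", digits, size);

      return (ret < 0 || (size_t)ret >= dstsize) ? -1 : ret;
  }

Reserving, say, 8 digits during the nego stage leaves enough room to patch
in any size up to 0xffffffff later without moving the data that follows.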
2024-02-07 15:04:34 +01:00
Christopher Faulet
2297f52734 MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
During zero-copy forwarding negotiation, a pseudo flag was already used to
notify the consumer whether the producer is able to use kernel splicing or not. But
this was not extensible. So, now we use a true bitfield to be able to pass flags
during the negotiation. NEGO_FF_FL_* flags may be used now.

Of course, for now, there is only one flag, the kernel splicing support on
producer side (NEGO_FF_FL_MAY_SPLICE).
2024-02-07 15:04:29 +01:00
Christopher Faulet
c061ba30f7 MAJOR: cache: Send cached objects using zero-copy forwarding
The zero-copy forwarding is reintroduced implementing the .fastfwd callback
function. Otherwise, there is nothing special to say.
2024-02-07 15:04:24 +01:00
Christopher Faulet
863417292b MAJOR: cache: Update HTTP cache applet to handle its own buffers
Just like the HTTP stats applet, the cache applet was refactored to use its
own buffers. Changes are pretty similar.
2024-02-07 15:04:21 +01:00
Christopher Faulet
6d5dd23dbc MEDIUM: cache: Temporarily remove zero-copy forwarding support
The cache applet will be refactored to use its own buffer. Thus, for now,
the zero-copy forwarding support is removed and it will be reintroduced
later.
2024-02-07 15:04:17 +01:00
Christopher Faulet
18845a0624 MAJOR: stats: Update HTTP stats applet to handle its own buffers
The HTTP stats applet and all internal functions were adapted to use its own
buffers instead of the channels' ones. The CLI part was not refactored yet,
thus there are still some accesses to channels in the file. But for the HTTP
part, we no longer use the channels at all.

To do so, the HTTP stats applet now uses default .rcv_buf and .snd_buf
callback function. In addition, it sets appctx flags instead of SE ones.
2024-02-07 15:04:13 +01:00
Christopher Faulet
a4dcd3e54b MEDIUM: stats: Don't interrupt processing on partial post
We no longer test the opposite stream-connector to detect aborted partial
post. Applets must not try to access info outside their scope. This makes
the code more sensitive to changes and it is a common source of bugs.

Tests on the sedesc flags at the beginning of the I/O handler should be
enough.
2024-02-07 15:04:09 +01:00
Christopher Faulet
6ac119ba2d MINOR: applet: Automatically handle applets having more data for the stream
This should simplify applets implementation. Of course an applet should
still do it by itself if conditions to set this flag differ.
2024-02-07 15:04:06 +01:00
Christopher Faulet
39b6f5b04c MEDIUM: applet: Add support for zero-copy forwarding from an applet
Thanks to this patch, it is possible for an applet to directly send data to
the opposite endpoint. To do so, it must implement <fastfwd> appctx callback
function and set SE_FL_MAY_FASTFWD flag.

Everything will be handled by appctx_fastfwd() function. The applet is only
responsible for transferring data. If it sets the <to_forward> value, it is used to
limit the amount of data to forward.
2024-02-07 15:04:01 +01:00
Christopher Faulet
62a81cb6a6 MINOR: applet: Add callback function to deal with zero-copy forwarding
This patch introduces the support for the callback function responsible for
producing data via the zero-copy forwarding mechanism. There is no
implementation for now. But a <to_forward> field was added in the appctx
structure to let an applet report how much data it wants to forward. It is
not mandatory but it will be used during the zero-copy forwarding
negotiation.
2024-02-07 15:03:57 +01:00
Christopher Faulet
7ec544b217 MEDIUM: applet: Use appctx flags to report EOS/EOI/ERROR to SE
We have introduced flags to deal with end of input, end of stream and errors
at the applet level. With this patch we make the link with the endpoint
descriptor.

In appctx_rcv_buf(), applet flags are converted to SE flags.
2024-02-07 15:03:54 +01:00
Christopher Faulet
cc7b141e1c MINOR: applet: Add an appctx flag to report shutdown to applets
There is no shutdown for reads and sends with applets. Both are performed
when the appctx is released. So instead of 2 flags, like for
muxes/connections, only one flag is used. But the idea is the same:
acknowledge the event at the applet level.
2024-02-07 15:03:50 +01:00
Christopher Faulet
14bd091fd7 MINOR: applet: Remove appctx state field to only use the flags
The appctx state was never really used as a state. It is only used to know
when an applet should be freed on the next wakeup. This can be converted to
a flag and the state can be removed. This is what this patch does.
2024-02-07 15:03:46 +01:00
Christopher Faulet
e8655546b7 MINOR: applet: Add flags on the appctx and stop abusing its state
Till now, we've extended the appctx state to add some flags. However, the
field name is misleading. So a bitfield was added to handle real flags. And
helper functions to manipulate this bitfield were added.
2024-02-07 15:03:34 +01:00
Christopher Faulet
c0527261cf MINOR: applet: Show IN/OUT buffers in trace messages when used
The function dumping applet trace messages was updated to dump info about
in/out buffers instead of channel buffers when it is relevant.
2024-02-07 15:03:30 +01:00
Christopher Faulet
4ad8192ce4 MEDIUM: applet: Add the applet handler based on IN/OUT buffers
A dedicated function to run applets was introduced, in addition to the old
one, to deal with applets that use their own buffers. The main difference
here is that this handler does not use channels at all. It performs a
synchronous send before calling the applet and performs a synchronous
receive just after.

No applets are plugged on this handler for now.
2024-02-07 15:03:26 +01:00
Christopher Faulet
f81b704d01 MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
There is no tasklet to handle I/O subscriptions for applets, but functions
to deal with receives and sends from the SC layer were added. It means a
function to retrieve data from an applet in a synchronous way and a
function to push data to an applet in a synchronous way.

It is pretty similar to the functions used for muxes but there are some
differences. So for now, we keep them separated.

Zero-copy forwarding is not supported for now. In addition, there is no
subscription mechanism.
2024-02-07 15:03:23 +01:00
Christopher Faulet
525ec12305 MINOR: applet: Implement default functions to exchange data with channels
In this patch, we add default functions to copy data from a channel to the
<inbuf> buffer of an applet (appctx_rcv_buf) and another one to copy data
from <outbuf> buffer of an applet to a channel (appctx_snd_buf).

These functions are not used for now, but they will be used by applets to
define their <rcv_buf> and <snd_buf> callback functions. Of course, it will
be possible for a specific applet to implement its own functions but these
ones should be good enough for most applets. HTX and RAW buffers are
supported.
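
A simplified standalone sketch of the copy direction only (hypothetical
types, not the actual appctx_rcv_buf()/appctx_snd_buf() nor their HTX
handling): move as many bytes as both sides and the caller's limit allow.

  #include <stddef.h>
  #include <string.h>

  struct sbuf {
      char  *area;                /* storage */
      size_t size;                /* total capacity */
      size_t data;                /* bytes currently used */
  };

  /* copy up to <max> bytes from <src> to <dst>, return the amount moved */
  static size_t sbuf_xfer(struct sbuf *dst, struct sbuf *src, size_t max)
  {
      size_t room = dst->size - dst->data;
      size_t len  = src->data < room ? src->data : room;

      if (len > max)
          len = max;
      memcpy(dst->area + dst->data, src->area, len);
      dst->data += len;
      memmove(src->area, src->area + len, src->data - len);
      src->data -= len;
      return len;
  }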
2024-02-07 15:03:18 +01:00
Christopher Faulet
04eca50f49 MINOR: applet: Add traces to debug receive/send and block/wake events
New traces events are added to be able to debug receives and sends.
2024-02-07 15:03:09 +01:00
Christopher Faulet
ab9d2c6ca8 MINOR: applet: Add dedicated IN/OUT buffers for appctx
It is the first patch of a series aimed at aligning applets with connections.
Here, dedicated buffers are added for applets. For now, buffers are
initialized and helper functions to deal with allocation are added. In
addition, flags to report allocation failures or full buffers are also
introduced. <inbuf> will be used to push data to the applet from the stream
and <outbuf> will be used to push data from the applet to the stream.
2024-02-07 15:03:01 +01:00
Christopher Faulet
45ca9dadcd MINOR: stconn: Be prepared to handle error when a SC is attached to an applet
sc_attach_applet() was changed to be able to fail and callers were updated
accordingly. For now it cannot fail but if this changes, callers will be
prepared to handle errors.
2024-02-07 15:02:27 +01:00
Christopher Faulet
ad937372f3 MINOR: stconn: Explicitly use an appctx to attach a stconn on it
In sc_attach_applet, an untyped pointer (void *) was used to attach a SC on
an applet. There is no reason to not use the right type here. So now a
pointer on an appctx is explicitly used.
2024-02-07 15:02:22 +01:00
Frederic Lecaille
c977b9aa15 MINOR: quic: Stop using 1024th of a second.
Use milliseconds in place of 1024th of a second.

Should be backported as far as 2.6.
2024-02-07 08:44:31 +01:00
Frederic Lecaille
19a66b290e BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
Avoid loss of precision when computing the K cubic value.
The same issue exists when computing the congestion window value from the cubic
increase function formula, with a possible integer variable wrap around.
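
As an illustration of the precision concern, a standalone sketch (not the
HAProxy code) evaluating W_cubic(t) = C*(t - K)^3 + W_max from RFC 9438,
with C = 0.4 kept as 2/5, t and K in milliseconds, and 64-bit
intermediates so that the cube cannot wrap a 32-bit variable:

  #include <stdint.h>

  /* window expressed in segments; t_ms and k_ms in milliseconds */
  static int64_t w_cubic(int64_t w_max, int64_t t_ms, int64_t k_ms)
  {
      int64_t d = t_ms - k_ms;    /* may be negative before reaching K */
      int64_t cube = d * d * d;   /* ms^3, safe for |d| up to ~20 minutes */

      /* C*(t-K)^3 with C = 2/5, converting ms^3 to s^3 in one division */
      return w_max + (2 * cube) / (5 * 1000000000LL);
  }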

Depends on this commit:

	MINOR: quic: Code clarifications for QUIC CUBIC (RFC 9438)

Must be backported as far as 2.6.
2024-02-07 08:44:31 +01:00
Frederic Lecaille
88d13caa38 CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
The first version of our QUIC CUBIC implementation is confusing because it relies on
the TCP CUBIC Linux kernel implementation and refers to RFC 8312, which was
obsoleted by RFC 9438 (August 2023) after our implementation was written. RFC 8312 is
a little bit hard to understand. RFC 9438 arrived with many more clarifications.

So, RFC 9438 is about "CUBIC for Fast Long-Distance Networks". Our implementation
for QUIC is not very well documented. As it was difficult to reread this
code, this patch only adds some comments at complicated locations and renames
some macros and variables, without any logic modification at all.

So, the aim of this patch is to first add some comments and some variable/macro
renaming to avoid embedding too many code modifications in the same big patch.

Some code modifications will come to adapt this CUBIC implementation to this new
RFC 9438.

Rename some macros:
  CUBIC_BETA -> CUBIC_BETA_SCALED
  CUBIC_C    -> CUBIC_C_SCALED
  CUBIC_BETA_SCALE_SHIFT -> CUBIC_SCALE_FACTOR_SHIFT (this is the scaling factor
which is used only for CUBIC_BETA_SCALED)
  CUBIC_DIFF_TIME_LIMIT -> CUBIC_TIME_LIMIT

CUBIC_ONE_SCALED was added (scaled value of 1).

These cubic struct members were renamed:
	->tcp_wnd -> ->W_est
	->origin_point -> ->W_target
	->epoch_start -> ->t_epoch
	->remaining_tcp_inc -> remaining_W_est_inc

Local variables to quic_cubic_update() were renamed:
    t -> elapsed_time
    diff -> t
    delta -> W_cubic_t

Add a graphic curve about the CUBIC Increase function.
Add big copied & pasted RFC 9438 extracts in relation with the 3 different increase
function regions.
Same thing for the fast convergence.
Fix a typo about the reference to QUIC RFC 9002.

Must be backported as far as 2.6 to ease any further modifications to come.
2024-02-07 08:44:31 +01:00
Willy Tarreau
7cba015c85 DEBUG: make the "debug dev {debug|warn|check}" command print a message
In order to test the new message output capability, these commands will
now explicitly mention that the bug was triggered from the CLI.
2024-02-05 17:09:00 +01:00
Remi Tricot-Le Breton
73705ac701 BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
If we were to enable 'ocsp-update' on a certificate that does not have
an OCSP URI, we would exit ssl_sock_load_ocsp with a negative error code
which would raise a misleading error message ("<cert> has an OCSP URI
and OCSP auto-update is set to 'on' ..."). This patch simply fixes the
error message but an error is still raised.

This issue was raised in GitHub #2432.
It can be backported up to branch 2.8.
2024-02-05 15:59:16 +01:00
Aurelien DARRAGON
92b2edb42e MINOR: stream: add "txn.redispatch" fetch
Fetch will return true if the stream underwent a redispatch according to
"option redispatch" setting upon retries.

Documentation was added, and the "%rc" logformat alternative now mentions
the new fetch to properly emulate the logformat behavior.
2024-02-05 14:54:37 +01:00
Frederic Lecaille
59acb27001 BUILD: quic: Variable name typo inside a BUG_ON().
This build issue was introduced by this previous commit which is a bugfix:

   BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.

A BUG_ON() referenced <fist> variable in place of <first>.

Must be backported as far as 2.6 as the previous commit.
2024-02-05 14:31:21 +01:00
Frederic Lecaille
0ce61d2f6d BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
Acknowledgement ranges are used to build ACK frames. To avoid allocating too
many such objects, a limit was set to 32 (QUIC_MAX_ACK_RANGES) by this commit:

	MINOR: quic: Do not allocate too much ack ranges

But there is an inversion when removing the oldest range from its tree.
eb64_first() must be used in place of eb64_last(). Note that this patch
only does this modification in addition to rename <last> variable to <first>.

This bug leads such an h2load command to block when a request ends up not
being acknowledged by haproxy even though it is correctly served:

	/opt/nghttp2/build/bin/h2load --alpn-list h3 -t 1 -c 1 -m 1 -n 100 \
		https://127.0.0.1/?s=5m

There is a remaining question to be answered. In such a case, haproxy refuses to
reopen the stream; this is a good thing, but shouldn't haproxy acknowledge the
request (because it is correctly parsed again)?

Note that to be easily reproduced, this setting had to be applied to the client
network interface:

    tc qdisc add dev eth1 root netem delay 100ms 1s loss random

Must be backported as far as 2.6.
2024-02-05 14:26:52 +01:00
Willy Tarreau
52cc45dfa5 MINOR: acl: add extra diagnostics about suspicious string patterns
As noticed in this thread, some bogus configurations are not always easy
to spot: https://www.mail-archive.com/haproxy@formilux.org/msg44558.html
Here it was about config keywords being used in ACL patterns where strings
were expected, hence they're always valid.

Since we have the diag mode (-dD) we can perform some extra checks when
it's used, and emit them to suggest to the user that there might be an issue.

Here we detect a few common words (logic such as "and"/"or"/"||" etc),
C++/JS comments mistakenly used to try to isolate final args, and words
that have the exact name of a sample fetch or an ACL keyword. These checks
are only done in diag mode of course.
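
A minimal sketch of the kind of check involved (hypothetical helper, not
the actual parser code): a few tokens that are almost never meant as
literal string patterns get flagged when running in diagnostic mode.

  #include <string.h>

  /* returns 1 if <arg> looks like a misplaced logic/comment token */
  static int pattern_looks_suspicious(const char *arg)
  {
      static const char *const susp[] = { "and", "or", "||", "&&", "//", NULL };

      for (int i = 0; susp[i]; i++)
          if (strcmp(arg, susp[i]) == 0)
              return 1;
      return 0;
  }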
2024-02-03 12:08:11 +01:00
Willy Tarreau
75d64c0d4c BUG/MINOR: diag: run the final diags before quitting when using -c
Final diags were added in 2.4 by commit 5a6926dcf ("MINOR: diag: create
cfgdiag module"), but it's called too late in the startup process,
because when "-c" is passed, the call is not made, while it's its primary
use case. Let's just move the call earlier.

Note that currently the check in this function is limited to verifying
unicity of server cookies in a backend, so it can be backported as far
as 2.4, but there is little value in insisting if it doesn't backport
easily.
2024-02-03 12:08:11 +01:00
Willy Tarreau
ced4148401 BUG/MINOR: diag: always show the version before dumping a diag warning
Diag warnings were added in 2.4 by commit 7b01a8dbd ("MINOR: global:
define diagnostic mode of execution") but probably due to the split
function that checks for the mode, they did not reuse the emission of
the version string before the first warning, as was brought in 2.2 by
commit bebd21206 ("MINOR: init: report in "haproxy -c" whether there
were warnings or not"). The effet is that diag warnings are emitted
before the version string if there is no other warning nor error. Let's
just proceed like for the two other ones.

This can be backported to 2.4, though this is of very low importance.
2024-02-03 12:08:11 +01:00
Christopher Faulet
ca6f0ca82b MEDIUM: promex/resolvers: Dump resolvers metrics via a promex module
Just like for stick-tables, this patch adds a promex module to dump
resolvers metrics. It adds the "resolver" scope and for now, it dumps
the following metrics:

  * haproxy_resolver_sent
  * haproxy_resolver_send_error
  * haproxy_resolver_valid
  * haproxy_resolver_update
  * haproxy_resolver_cname
  * haproxy_resolver_cname_error
  * haproxy_resolver_any_err
  * haproxy_resolver_nx
  * haproxy_resolver_timeout
  * haproxy_resolver_refused
  * haproxy_resolver_other
  * haproxy_resolver_invalid
  * haproxy_resolver_too_big
  * haproxy_resolver_outdated
2024-02-02 09:11:34 +01:00
Christopher Faulet
3e55b3da30 MEDIUM: promex/stick-table: Dump stick-table metrics via a promex module
Create a promex module to dump stick-table metrics. Thanks to this patch,
all references to stick tables were removed from the promex service.
2024-02-02 09:11:34 +01:00
Christopher Faulet
3246f863d6 MEDIUM: stats: Be able to access a specific field into a stats module
It is now possible to selectively retrieve extra counters from stats
modules. H1, H2, QUIC and H3 fill_stats() callback functions are updated to
return a specific counter.
2024-02-01 12:00:53 +01:00
Christopher Faulet
fd366a106b MINOR: stats: Be able to access to registered stats modules from anywhere
The list of modules registered on the stats to expose extra counters is now
public. It is required to export these counters into the Prometheus
exporter.
2024-02-01 12:00:53 +01:00
Aurelien DARRAGON
42a97d9feb MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
set-bc-{mark,tos} actions are pretty similar to set-fc-{mark,tos} to set
mark/tos on packets sent from haproxy to server: set-bc-{mark,tos} actions
act on the whole backend/srv connection: from connect() to connection
teardown, thus they may only be used before the connection to the server
is instantiated, meaning that they are only relevant for request-oriented
rules such as tcp-request or http-request rules. For now their use is
limited to content request rules, because tos and mark information is
stored directly within the stream, thus it is required that the stream
already exists.

stream flags are used in combination with dedicated stream struct member
variables to pass 'tos' and 'mark' information so that they are correctly
considered during stream connection assignment logic (prior to actually
connecting to the server)

'tos' and 'mark' fd sockopts are taken into account in conn hash
parameters for connection reuse mechanism.

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
b4ee7b044e MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
In this patch we add the possibility to use a sample expression as argument
for set-fc-{mark,tos} actions. To make it backward compatible with the
previous behavior, during parsing we first try to parse the value as
an integer (decimal or hex notation), and then fall back to expr parsing
in case of failure.

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
03cb782bcb MINOR: hlua: Rename set_{tos, mark} to set_fc_{tos, mark}
This is a complementary patch to "MINOR: tcp-act: Rename "set-{mark,tos}"
to "set-fc-{mark,tos}"", but for the Lua API.

set_mark and set_tos were kept as aliases for set_fc_mark and set_fc_tos
but they were marked as deprecated.

Using this opportunity to reorder set_mark and set_tos by alphabetical
order.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
acf6383076 MINOR: tcp-act: Rename "set-{mark,tos}" to "set-fc-{mark,tos}"
"set-mark" and "set-tos" only alter packets from haproxy to client
(frontend connection). Since we may add support for equivalent keywords
on server side, we rename them with an explicit name to prevent
confusions.

Thus, we rename:
 - "set-mark" to "set-fc-mark"
 - "set-tos" to "set-fc-tos"

"set-mark" and "set-tos" were kept as aliases (to "set-fc-mark" and
"set-fc-tos" respectively) for now to prevent config breakage, but they
have been marked as deprecated so they can be removed in future version.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
eea3b94514 MINOR: tcp_act: fix alphabetical ordering of tcp request content actions
"set-src" and "set-src-port" were misplaced and incorrectly ordered in
tcp_req_cont_actions keyword list.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
ea09075f59 OPTIM: connection: progressive hash for conn_calculate_hash()
Some CPU time is needlessly wasted in conn_calculate_hash(), because all
params are first copied into a temporary buffer before computing the
hash on the whole buffer. Instead, let's leverage the XXH progressive
hash update functions to avoid expensive memcpys.
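
A minimal sketch of the progressive approach (not the actual
conn_calculate_hash() code), feeding each parameter to the xxHash
streaming API instead of memcpy'ing everything into a temporary buffer
first:

  #include <stdint.h>
  #include <xxhash.h>

  static uint64_t hash_params(const void *const *params,
                              const size_t *lens, int nb)
  {
      XXH64_state_t *st = XXH64_createState();
      uint64_t hash = 0;

      if (!st)
          return 0;
      XXH64_reset(st, 0);                 /* seed = 0 */
      for (int i = 0; i < nb; i++)
          XXH64_update(st, params[i], lens[i]);
      hash = XXH64_digest(st);
      XXH64_freeState(st);
      return hash;
  }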
2024-02-01 10:58:30 +01:00
Amaury Denoyelle
4b5f557283 MINOR: mux-quic: realign Tx buffer if possible
A major reorganization of QUIC MUX sending has been implemented. Now
data transfers occur over a single QCS buffer. This has improved
performance but at the cost of restrictions on snd_buf. Indeed, buffer
instances are now shared from stream callback snd_buf up to quic-conn
layer.

As such, snd_buf cannot freely manipulate an already present data buffer.
In particular, realign has been completely removed by the previous
patches.

This commit reintroduces a partial realign support. This is only done if
the buffer contains only unsent data, via a new MUX function
qcc_realign_stream_txbuf() which is called during snd_buf.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
4513787d0d MEDIUM: mux-quic: properly handle conn Tx buf exhaustion
This commit is a direct follow-up on the major rearchitecture of send
buffering. This patch implements the proper handling of temporary
exhaustion of the connection buffer pool.

The first step is to be able to differentiate a fatal allocation error
from a temporary pool exhaustion. This is done via a new output argument
on qcc_get_stream_txbuf(). For a fatal error, application protocol layer
will schedule the immediate connection closing. For a pool exhaustion,
QCC is flagged with QC_CF_CONN_FULL and stream sending process is
interrupted. QCS instance is also registered in a new list
<qcc.buf_wait_list>.

A new connection buffer can become available when all ACKs are received
for an older buffer. This process is taken in charge by quic-conn layer.
It uses qcc_notify_buf() function to clear QC_CF_CONN_FULL and to wake
up every stream registered on buf_wait_list to resume the sending process.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
cd22200d23 MEDIUM: mux-quic: release Tx buf on too small room
This commit is a direct follow-up on the major rearchitecture of send
buffering. It allows application protocol to react if current QCS
sending buffer space is too small. In this case, the buffer can be
released to the quic-conn layer. This allows to allocate a new QCS
buffer and retry HTX parsing, unless connection buffer pool is already
depleted.

A new function qcc_release_stream_txbuf() serves as API for app protocol
to release the QCS sending buffer. This operation fails if there is
unsent data in it. In this case, MUX has to keep it to finalize transfer
of unsent data to quic-conn layer. QCS is thus flagged with
QC_SF_BLK_MROOM to interrupt snd_buf operation.

When all data are sent to the quic-conn layer, QC_SF_BLK_MROOM is
cleared via qcc_streams_sent_done() and stream layer is woken up to
restart snd_buf.

Note that a new function qcc_stream_can_send() has been defined. It
allows app proto to check if sending is currently blocked for the
current QCS. For now, it checks QC_SF_BLK_MROOM flag. However, it will
be extended to other conditions with the following patches.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
3fe3251593 MEDIUM: mux-quic: simplify sending API
The previous commit was a major rework for QUIC MUX sending process.
Following this, this patch cleans up a few elements that remain but can
be removed as they are duplicated.

Of notable changes, offset fields from QCS and QCC are removed. They are
both equivalent to flow control soft offsets.

A new function qcs_prep_bytes() is implemented. Its purpose is to return
the count of prepared data bytes not yet sent. It also replaces
qcs_need_sending().
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
00a3e5f786 MAJOR: mux-quic: remove intermediary Tx buffer
Previously, QUIC MUX sending was implemented with data transferred along
two different buffer instances per stream.

The first QCS buffer was used for HTX blocks conversion into H3 (or
other application protocol) during snd_buf stream callback. QCS instance
is then registered for sending via qcc_io_cb().

For each sending QCS, data memcpy is performed from the first to a
secondary buffer. A STREAM frame is produced for each QCS based on the
content of their secondary buffer.

This model is useful for QUIC MUX which has a major difference with
other muxes : data must be preserved longer, even after sent to the
lower layer. Data references are shared with the quic-conn layer which
implements retransmission and data deletion on ACK reception.

This double buffering stage was the first model implemented and remains
active until today. One of its major drawbacks is that it requires
memcpy invocation for every data transferred between the two buffers.
Another important drawback is that the first buffer is allocated by
each QCS individually without restriction. On the other hand, secondary
buffers are accounted for the connection. A bottleneck can appear if
secondary buffer pool is exhausted, causing unnecessary haproxy
buffering.

The purpose of this commit is to completely break this model. The first
buffer instance is removed. Now, application protocols will directly
allocate buffer from qc_stream_desc layer. This removes completely the
memcpy invocation.

This commit has a lot of code modifications. The most obvious one is the
removal of <qcs.tx.buf> field. Now, qcc_get_stream_txbuf() returns a
buffer instance from qc_stream_desc layer. qcs_xfer_data() which was
responsible for the memcpy between the two buffers is also completely
removed. Offset fields of QCS and QCC are now incremented directly by
qcc_send_stream(). These values are used as boundary with flow control
real offset to delimit the STREAM frames built.

As this change has a big impact on the code, this commit is only the
first part to fully support single buffer emission. For the moment, some
limitations are reintroduced and will be fixed in the next patches :

* on snd_buf if the QCS sent buffer in use has room but not enough for the
  application protocol to store its content
* on snd_buf if the QCS sent buffer is NULL and allocation cannot succeed
  due to connection pool exhaustion

One final important aspect is that extra care is necessary now in
snd_buf callback. The same buffer instance is referenced by both the
stream and quic-conn layer. As such, some operation such as realign
cannot be done anymore freely.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
7dd6ed6321 MINOR: mux-quic: check fctl during STREAM frame build
qcs_build_stream_frm() is responsible to generate a STREAM frame
pointing to the content of QCS TX buffer.

This patch moves send flow control overflow check from qcs_xfer_data()
to qcs_build_stream_frm(), i.e. from transfer between internal
QCS buffer and qc_stream_desc, to STREAM frame generation.

Flow control is checked at both stream and connection level. For
connection flow control, as several frames are built before emission, an
accumulator is used as extra arguments to functions to account the total
length of already built frames.

This patch should not provide any functional changes. Its main purpose
is to prepare for the removal of QCS internal buffer.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
c6ef55407c MINOR: mux-quic: remove unneeded sent-offset fields
Both QCS and QCC have their own sent offset field. These fields store
the newest offset sent to the quic-conn layer. It is similar to QCS/QCC
flow control real offset. This patch removes them and replaces them by
the latter for code clarification.

MINOR: mux-quic: remove unneeded qcc.tx.sent_offsets field

This commit has a similar purpose as the previous one, except that it removes the QCC
<sent_offsets> field, now equivalent to connection flow control real
offset.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
d4bf6f0526 MEDIUM: mux-quic: limit conn flow control on snd_buf
This commit is a direct follow-up on the previous one. This time, it
deals with connection level flow control. Process is similar to stream
level : soft offset is incremented during snd_buf and real offset during
STREAM frame emission.

On MAX_DATA reception, both stream layer and QMUX is woken up if
necessary. One extra feature for conn level is the introduction of a new
QCC list to reference QCS instances. It will store instances for which
snd_buf callback has been interrupted because the QCC soft offset was reached.
Every stream instance is woken up on MAX_DATA reception if soft_offset is
unblocked.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
c44692356d MEDIUM: mux-quic: limit stream flow control on snd_buf
This patch is the first of two to reimplement flow control emission
limits check. The objective is to account flow control earlier during
snd_buf stream callback. This should smooth transfers and prevent over
buffering on haproxy side if flow control limit is reached.

The current patch deals with stream level flow control. It reuses the
newly defined flow control type. Soft offset is incremented after HTX to
data conversion. If limit is reached, snd_buf is interrupted and stream
layer will subscribe on QCS.

On qcc_io_cb(), generation of STREAM frames is restricted as previously
to ensure to never surpass peer limits. Finally, flow control real
offset is incremented on lower layer send notification. Thus, it will
serve as a base offset for built STREAM frames. If limit is reached,
STREAM frames generation is suspended.

Each time QCS data flow control limit is reached, soft and real offsets
are reconsidered.

Finally, special care is used when flow control limit is incremented via
MAX_STREAM_DATA reception. If soft value is unblocked, stream layer
snd_buf is woken up. If real value is unblocked, qcc_io_cb() is
rescheduled.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
25493ca036 MINOR: mux-quic: define a flow control related type
Create a new module dedicated to flow control handling. It will be used
to implement earlier flow control update on snd_buf stream callback.

For the moment, only Tx part is implemented (i.e. limit set by the peer
that haproxy must respect for sending). A type quic_fctl is defined to
count emitted data bytes. Two offsets are used : a real one and a soft
one. The difference is that soft offset can be incremented beyond limit
unless it is already in excess.

Soft offset will be used for HTX to H3 parsing. As size of generated H3
is unknown before parsing, it allows to surpass the limit one time. Real
offset will be used during STREAM frame generation : this time the limit
must not be exceeded to prevent protocol violation.
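
An illustrative standalone sketch of the two-offset idea with
hypothetical names (not the actual quic_fctl type): the soft offset may
cross the limit once, the real offset never may.

  #include <stdbool.h>
  #include <stdint.h>

  struct fctl {
      uint64_t limit;             /* peer-advertised sending limit */
      uint64_t soft;              /* advanced at snd_buf/HTX parsing time */
      uint64_t real;              /* advanced when STREAM frames are built */
  };

  /* account <len> bytes on the soft offset; a single overshoot of the
   * limit is tolerated, but nothing more once already in excess */
  static bool fctl_soft_add(struct fctl *f, uint64_t len)
  {
      if (f->soft >= f->limit)
          return false;           /* blocked: caller must subscribe */
      f->soft += len;
      return true;
  }

  /* account bytes on the real offset, which must never cross the limit;
   * returns the amount that may actually go into a STREAM frame */
  static uint64_t fctl_real_add(struct fctl *f, uint64_t len)
  {
      uint64_t room = f->limit - f->real;

      if (len > room)
          len = room;
      f->real += len;
      return len;
  }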
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
f32c08be34 MINOR: mux-quic: prepare for earlier flow control update
Add a new argument to qcc_send_stream() to specify the count of sent
bytes.

For the moment this argument is unused. This commit is in fact a step to
implement earlier flow control update during stream layer snd_buf.
2024-01-31 16:28:54 +01:00
Willy Tarreau
fadabc430f CLEANUP: h1: remove unused function h1_measure_trailers()
This one stopped being used in 2.1 when HTX became mandatory,
let's drop it.
2024-01-31 15:22:12 +01:00
Willy Tarreau
0d76a284b6 BUG/MEDIUM: h1: always reject the NUL character in header values
Ben Kallus kindly reported that we still hadn't blocked the NUL
character from header values as clarified in RFC9110 and that, even
though there's no known issue related to this, it may one day be
used to construct an attack involving another component.

Actually, both Christopher and I sincerely believed we had done it
prior to releasing 2.9, shame on us for missing that one and thanks
to Ben for the reminder!

The change was applied, it was confirmed to properly reject this NUL
byte from both header and trailer values, and it's still possible to
force it to continue to be supported using the usual pair of unsafe
"option accept-invalid-http-{request|response}" for those who would
like to keep it for whatever reason that wouldn't make sense.

This was tagged medium so that distros also remember to apply it as
a preventive measure.

It should progressively be backported to all versions down to 2.0.
2024-01-31 15:22:12 +01:00
Willy Tarreau
44f02d26f0 BUG/MINOR: h1-htx: properly initialize the err_pos field
Trailers are parsed using a temporary h1m struct, likely due to using
distinct h1 parser states. However, the err_pos field that's used to
decide whether or not to enforce option accept-invalid-http-request (or
response) was not initialized in this struct, resulting in using a
random value that may randomly accept or reject a few bad chars. The
impact is very limited in trailers (e.g. no message size is transmitted
there) but we must make sure that the option is respected, at least for
users facing the need for this option there.

The issue was introduced in 2.0 by commit 2d7c5395ed ("MEDIUM: htx:
Add the parsing of trailers of chunked messages"), and the code moved
from mux_h1.c to h1_htx.c in 2.1 with commit 4f0f88a9d0 ("MEDIUM:
mux-h1/h1-htx: move HTX convertion of H1 messages in dedicated file")
so the patch needs to be backported to all stable versions, and the
file adjusted for 2.0.
2024-01-31 15:22:12 +01:00
William Lallemand
5c45199347 MEDIUM: ssl/quic: always compile the ssl_conf.early_data test
Always compile the test of the early_data variable in
"ssl_quic_initial_ctx", this way we can emit a warning about its support
or not.

The test was moved into a simpler preprocessor check which only checks
the new HAVE_SSL_0RTT_QUIC constant.

Could be backported to 2.9 with the 2 previous commits.
However AWS-LC must be excluded from HAVE_SSL_0RTT_QUIC in this version.
2024-01-31 11:57:54 +01:00
William Lallemand
025f5105ee MINOR: ssl: rename HA_OPENSSL_HAVE_0RTT_SUPPORT constant to HAVE_SSL_0RTT_QUIC
Rename the constant to be more comprehensible.
2024-01-31 11:57:54 +01:00
Christopher Faulet
4837e99892 BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
It is similar to the previous fix but for the chunk size parsing. But this
one is more annoying because a poorly coded application in front of haproxy
may ignore the last digit before the LF thinking it should be a CR. In this
case it may be out of sync with HAProxy and that could be exploited to
perform some sort of request smuggling attack.

While it seems unlikely, it is safer to forbid LF without CR at the end of a
chunk size.

This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
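
A minimal sketch of the stricter check (hypothetical helper, not the H1
parser itself): the two bytes ending a chunk size must be exactly CR then
LF, and a lone LF is now rejected instead of being silently accepted.

  /* returns the number of bytes consumed (2 for CRLF), 0 if more data is
   * needed, or -1 if the terminator is invalid (e.g. a lone LF) */
  static int check_chunk_crlf(const char *p, const char *end)
  {
      if (end - p < 2)
          return 0;               /* need more input */
      if (p[0] == '\r' && p[1] == '\n')
          return 2;               /* well-formed CRLF */
      return -1;                  /* LF alone or garbage: reject */
  }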
2024-01-30 15:00:14 +01:00
Christopher Faulet
7b737da825 BUG/MINOR: h1: Don't support LF only at the end of chunks
When the message is chunked, all chunks must end with a CRLF. However, on
old versions, to support bad client or server implementations, the LF only
was also accepted. Nowadays, it seems useless and can even be considered as
an issue. Just forbid LF only at the end of chunks, it seems reasonable.

This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
2024-01-30 14:58:59 +01:00
Miroslav Zagorac
24a5e42db6 CLEANUP: log: deinitialization of the log buffer in one function
In several places in the source, there was the same block of code that was
used to deinitialize the log buffer.  There were even two functions that
did this, but they were called only from the code that is in the same
source file (free_tcpcheck_fmt() in src/tcpcheck.c and free_logformat_list()
in src/proxy.c - they were both static functions).

The function free_logformat_list() was moved from the file src/proxy.c to
src/log.c, and a check of the list before freeing the memory was added to
that function.
2024-01-30 08:27:26 +01:00
Amaury Denoyelle
a13989f109 BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
A recent fix was introduced to ensure unsent data are deleted when a
QUIC MUX stream releases its qc_stream_desc instance. This is necessary
to ensure all used buffers will be liberated once all ACKs are received.
This is implemented by the following patch :

  commit ad6b13d317 (quic-dev/qns)
  BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf

Before this patch, buffer removal was done only on ACK reception. ACK
handling is only done in order from the oldest one. A BUG_ON() statement
is present to ensure this assertion remains valid.

This is however not true anymore since the above patch. Indeed, after
unsent data removal, the current buffer may be empty if it did not
contain yet any sent data. In this case, it is not the oldest buffer,
thus the BUG_ON() statement will be triggered.

To fix this, simply remove this BUG_ON() statement. It should not have
any impact as it is safe to remove buffers in any order.

Note that several conditions must be met to trigger this BUG_ON crash :
* a QUIC MUX stream is destroyed before transmitting all of its data
* several buffers must have been previously allocated for this stream so
  it happens only for transfers bigger than bufsize
* latency should be high enough to delay ACK reception

This must be backported wherever the above patch is (currently targeted
to 2.6).
2024-01-29 15:44:49 +01:00
Amaury Denoyelle
7d22c4956c BUG/MEDIUM: qpack: allow 6xx..9xx status codes
HTTP status codes outside of 100..599 are considered invalid in HTTP
RFC9110. However, it is explicitly stated that range 600..999 is often
used for internal communication so in practice haproxy must be lenient
with it.

Before this patch, QPACK encoder rejected these values. This resulted in
a connection error. Fix this by extending the range of allowed values
from 100 to 999.

This is linked to github issue #2422. Once again, thanks to @yokim-git
for his help here.

This must be backported up to 2.6.
2024-01-29 15:40:19 +01:00
Amaury Denoyelle
5d2fe1871a BUG/MEDIUM: h3: do not crash on invalid response status code
A crash occurs in h3_resp_headers_send() if an invalid response code is
received from the backend side. Fix this by properly flagging the
connection on error. This will cause a CONNECTION_CLOSE.

This should fix github issue #2422.

Big thanks to ygkim (@yokim-git) for his help and reactivity. Initially,
GDB reported an invalid code source location due to heavy functions
inlining inside h3_snd_buf(). The issue was found after using -Og flag.

This must be backported up to 2.6.
2024-01-29 15:38:51 +01:00
Amaury Denoyelle
df5cf9123f MINOR: h3: add traces for stream sending function
Replace h3_debug_printf() by real trace for functions used by stream
layer snd_buf callback. A new event type H3_EV_STRM_SEND is created for
the occasion.

This should be backported up to 2.6 to help investigate H3 issues on
stable releases. Note that h3_nego_ff/h3_done_ff definition are not
available from 2.8.
2024-01-29 15:38:24 +01:00
Olivier Houchard
1ad1991721 BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
It has been found that under some rare error circumstances,
SSL_do_handshake() could return with SSL_ERROR_WANT_READ without
even trying to call the read function, causing permanent wakeups
that prevent the process from sleeping.

It was established that this only happens if the retry flags are
not systematically cleared in both directions upon any I/O attempt.
Given the lack of documentation on this topic, it is hard to say
whether this rather strange behavior is expected or not; otherwise,
why wouldn't the library always clear the flags by itself before
proceeding?

In addition, this only seems to affect OpenSSL 1.1.0 and above,
and does not affect wolfSSL nor aws-lc.

A bisection on haproxy showed that this issue was first triggered by
commit a8955d57ed ("MEDIUM: ssl: provide our own BIO."), which means
that OpenSSL's socket BIO does not have this problem. And this one
does always clear the flags before proceeding. So let's just proceed
the same way. It was verified that it properly fixes the problem,
does not affect other implementations, and doesn't cause any freeze
nor spurious wakeups either.
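
For illustration, a custom BIO read callback following this principle
could look like the sketch below; transport_read() and the BIO data
pointer handling are assumptions, not the actual haproxy code:

  #include <errno.h>
  #include <openssl/bio.h>

  /* Hypothetical transport read function standing in for the real one. */
  extern int transport_read(void *ctx, char *buf, int len);

  static int example_bio_read(BIO *bio, char *buf, int len)
  {
          int ret;

          /* always clear the retry flags before any I/O attempt */
          BIO_clear_retry_flags(bio);
          ret = transport_read(BIO_get_data(bio), buf, len);
          if (ret < 0 && errno == EAGAIN)
                  BIO_set_retry_read(bio); /* re-arm only on a real EAGAIN */
          return ret;
  }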

Many thanks to Valentín Gutiérrez for providing a network capture
showing the incident as well as a reproducer. This is GH issue #2403.

This patch needs to be backported to all versions that include the
commit above, i.e. as far as 2.0.
2024-01-29 15:10:24 +01:00
Amaury Denoyelle
ad6b13d317 BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
QCS instances use qc_stream_desc for data buffering on emission. On
stream reset, the Tx channel is closed earlier than expected. This may
leave unsent data in the qc_stream_desc.

Before this patch, these unsent data would remain after the QCS was
freed. This prevents the buffer from being released, as no ACK reception
will remove them. The buffer is only freed when the whole connection is
closed. As qc_stream_desc buffers are limited per connection, this
reduces the buffer pool available to the other streams of the same
connection. In the worst case, if several streams are reset, this may
completely freeze the transfer of the remaining connection streams.

This bug was reproduced by reducing the connection buffer pool to a
single buffer instance by using the following global statement :

  tune.quic.frontend.conn-tx-buffers.limit 1.

Then a QUIC client is used which opens a stream for a large enough
object to ensure data are buffered. The client then emits a STOP_SENDING
before reading all data, which forces the corresponding QCS instance to
be reset. The client then opens a new request but the transfer is
frozen due to this bug.

To fix this, adjust the qc_stream_desc API. Add a new argument
<final_size> to the qc_stream_desc_release() function. Its value is
compared to the currently buffered offset in the latest qc_stream_desc
buffer. If <final_size> is lower, it means unsent data are present in
the buffer. As such, qc_stream_desc_release() removes them to ensure the
buffer will finally be freed when all ACKs are received. It is also
possible that no data remain at all, indicating that the ACKs were
already received. In that case, the buffer instance is immediately
removed by qc_stream_buf_free().

This must be backported up to 2.6. As this code section is prone to
regressions, a period of observation could be reserved before
distributing it on LTS releases.
2024-01-26 16:02:05 +01:00
Amaury Denoyelle
1da5787db4 MINOR: quic: extract qc_stream_buf free in a dedicated function
On ACK reception, data are removed from the buffer via
qc_stream_desc_ack(). The buffer can be freed if no more data are left.
A new slot is also accounted in the connection buffer pool. Extract this
operation into a dedicated private function, qc_stream_buf_free().

This patch brings no functional change. However it will be useful for
the next patch, which needs to remove a buffer from another function.

This patch is necessary for the following bugfix. As such, it must be
backported with it up to 2.6.
2024-01-26 16:00:53 +01:00
Frederic Lecaille
96385f40b5 MINOR: quic: Stop hardcoding a scale shifting value (CUBIC_BETA_SCALE_FACTOR_SHIFT)
Very minor modification to replace a hardcoded value in a statement with a macro.

Should be backported as far as 2.6 to ease any further modification to come.
2024-01-25 08:02:41 +01:00
Frederic Lecaille
574cf3fe00 CLEANUP: quic: Remove unused CUBIC_BETA_SCALE_FACTOR_SHIFT macro.
This macro is not used and has a confusing name.

Should be backported as far as 2.6.
2024-01-25 08:02:41 +01:00
Frederic Lecaille
b9a163e7e1 BUG/MINOR: quic: newreno QUIC congestion control algorithm no more available
There is a typo in the statement initializing this variable when selecting
newreno as cc algo:
      const char *newreno = "newrno";
This would not have happened if #defines had been used in place of several
hardcoded const char * values.

Take the opportunity of this patch to use #defines for all the available cc algorithms.

Must be backported to 2.9.
2024-01-25 08:02:41 +01:00
Remi Tricot-Le Breton
6b69512332 BUG/MEDIUM: cache: Fix crash when deleting secondary entry
When a cache is "cold" and multiple clients simultaneously try to access
the same resource we must forward all the requests to the server. Next,
every "duplicated" response will be processed in http_action_store_cache
and we will try to cache every one of them regardless of whether this
response was already cached. In order to avoid having multiple entries
for a same primary key, the logic is then to first delete any
preexisting entry from the cache tree before storing the current one.
The previous response's content will not be deleted yet though: if the
corresponding row is detached from the "avail" list, it might still be
used by a cache applet that performed a lookup in the cache tree before
the new response could be received.

This all means that we can end up using a valid row that references a
cache_entry that was already removed from the cache tree. This does not
pose any problem in regular caches (no 'vary' mechanism enabled) because
the applet only works on the data and not the 'cache_entry' information,
but in the "vary" context, when calling 'http_cache_applet_release' we
might call 'delete_entry' on the given entry which in turn tries to
iterate over all the secondary entries to find the right one in which
the secondary entry counter can be updated. We would then call
eb32_next_dup on an entry that was not in the tree anymore which ended
up crashing.

This crash was introduced by "48f81ec09 : MAJOR: cache: Delay cache
entry delete in reserve_hot function", which added the call to
"release_entry" in "http_cache_applet_release".

This issue was raised in GitHub #2417.
This patch must be backported to branch 2.9.
2024-01-24 18:01:30 +01:00
Aurelien DARRAGON
f41402ab29 CLEANUP: hlua: fix indent, remove extra return in hlua_core_get_var()
This is a cleanup patch to address cosmetic issues introduced in f034139bc0
("MINOR: lua: Allow reading "proc." scoped vars from LUA core.")

Also taking this opportunity to prefix the function with __LJMP to
indicate that it may longjump.

No backport needed.
2024-01-24 16:27:47 +01:00
Aurelien DARRAGON
564addcb72 BUG/MINOR: hlua: fix uninitialized var in hlua_core_get_var()
As raised by Coverity in GH #2223, f034139bc0 ("MINOR: lua: Allow reading
"proc." scoped vars from LUA core.") causes uninitialized reads due to
smp being passed to vars_get_by_name() without being initialized first.

Indeed, vars_get_by_name() tries to read smp->sess and smp->strm pointers.
As we're only interested in the PROC var scope, it is safe to call
vars_get_by_name() with sess and strm pointers set to NULL, thus we
simply memset smp prior to calling vars_get_by_name() to fix the issue.
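
The fix boils down to the classic pattern sketched below; the types and
the lookup function are illustrative stand-ins, not the real haproxy API:

  #include <stddef.h>
  #include <string.h>

  /* Minimal stand-ins for the real haproxy types and functions. */
  struct sample_example {
          void *sess;
          void *strm;
          /* ... sample data union ... */
  };

  extern int lookup_proc_var(const char *name, size_t len,
                             struct sample_example *smp);

  static int get_proc_var_example(const char *name, size_t len,
                                  struct sample_example *smp)
  {
          /* zero the sample so smp->sess and smp->strm are NULL before
           * the lookup, since only the PROC scope is of interest here
           */
          memset(smp, 0, sizeof(*smp));
          return lookup_proc_var(name, len, smp);
  }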

This should be backported in 2.9 with f034139bc0.
2024-01-24 16:27:38 +01:00
Frederic Lecaille
ab75d89e07 BUILD: quic: Fix build error when building QUIC against libressl.
The previous commit was not sufficient to completely fix the build issue
related to the TLS stack 0-RTT support. LibreSSL was the last TLS stack
to refuse to compile because of an undefined QUIC-specific function for
0-RTT: SSL_set_quic_early_data_enabled().

To get rid of such compilation issues, define HA_OPENSSL_HAVE_0RTT_SUPPORT
only when building against TLS stack with 0-RTT support.

No need to backport.
2024-01-24 15:37:40 +01:00
Frederic Lecaille
40f9902388 BUILD: quic: Fix build error when building QUIC against wolfssl.
This commit:

    "MINOR: quic: Enable early data at SSL session level (aws-lc)

introduced a build error when using wolfssl as TLS stack
because it references unknown function wolfSSL_set_quic_early_data_enabled()
which is not defined in qc_set_quic_early_data_context() that must not be used
in this case. The compilation of this fonction was enabled for wolfssl when
it should not have by the mentionned commit.

No backport is needed.
2024-01-24 14:36:41 +01:00
Willy Tarreau
59e9b6c204 BUILD: quic: fix build error when using the compatibility layer
Commit f783dd959b ("MINOR: quic: Enable early data at SSL session level
(aws-lc)") introduced a build error when using the openssl compat layer
because it references unknown function SSL_set_quic_early_data_context()
in qc_set_quic_early_data_context() that is not used in this case.

No backport is needed.
2024-01-24 10:49:24 +01:00
Willy Tarreau
e41638af33 BUG/MINOR: jwt: fix jwt_verify crash on 32-bit archs
The jwt_verify converter was added in 2.5 with commit 130e142ee2
("MEDIUM: jwt: Add jwt_verify converter to verify JWT integrity"). It
takes a string on input and returns an integer. It turns out that
presetting the return value to zero before processing the contents
overwrites the beginning of the buffer struct passed on input, since the
sample data is a union. On a 64-bit arch it's not an issue because it's
where the allocated size is stored and it's not used in the operation,
which explains why the regtest works. But on 32-bit, both the size and
the pointer are overwritten, causing a NULL pointer to be passed to
jwt_tokenize() which is not designed to support this, hence crashes.

Let's just use a temporary variable to hold the result and move the
output sample initialization to the end of the function.
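
The pattern of the fix, in a simplified form (names are illustrative;
the point is that the union is only overwritten once the work on the
input buffer is done):

  #include <stddef.h>

  /* Simplified stand-in for the sample struct: the integer result and
   * the input buffer description share the same union.
   */
  struct smp_example {
          union {
                  long long sint;
                  struct { char *area; size_t data; size_t size; } str;
          } u;
  };

  extern int verify_token(const char *tok, size_t len); /* stand-in */

  static void jwt_verify_example(struct smp_example *smp)
  {
          /* work on the input view of the union first */
          int ret = verify_token(smp->u.str.area, smp->u.str.data);

          /* only now overwrite the union with the integer result */
          smp->u.sint = ret;
  }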

This should be backported as far as 2.5.
2024-01-24 10:35:22 +01:00
Emeric Brun
ef02dba7bc BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
The initial purpose of CSV stats through the CLI was to make them easily
parsable by scripts. But in some specific cases, error or warning
message strings containing LFs were dumped into cells of this CSV.

This caused parsing failures in several tools. In addition, if a
warning or message contains two successive LFs, they will be dumped
directly, but a double LF marks the end of the response on the CLI and
the client may consider the response truncated.

This patch extends the 'csv_enc_append' and 'csv_enc' functions used
to format quoted string content according to the RFC with an additional
parameter to convert multi-line strings to one line: CRs are skipped,
and LFs are replaced with spaces. In addition and optionally, it is
also possible to remove the resulting trailing spaces.
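
The flattening step itself is straightforward; a standalone sketch of it
(not the actual csv_enc code) could be:

  /* Copy <in> to <out> (assumed large enough), skipping CRs, turning
   * LFs into spaces, and optionally dropping trailing spaces.
   */
  static void flatten_multiline(const char *in, char *out, int trim)
  {
          char *p = out;

          for (; *in; in++) {
                  if (*in == '\r')
                          continue;                 /* CRs are skipped */
                  *p++ = (*in == '\n') ? ' ' : *in; /* LFs become spaces */
          }
          if (trim)
                  while (p > out && p[-1] == ' ')
                          p--;                      /* drop trailing spaces */
          *p = '\0';
  }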

The calls to these functions used to fill strings in the stats CSV
output are updated to force this conversion.

This patch should be backported to all supported branches (the issue was
already present in v2.0).
2024-01-24 08:38:59 +01:00
Frederic Lecaille
5c88b9fcfb MINOR: quic: Correctly wait for the completion of handshakes with early data (aws-lc)
This patch impacts only the haproxy builds against aws-lc TLS stack (USE_OPENSSL_AWSLC).

As mentioned by the boringssl documentation, SSL_do_handshake() completes as soon
as the ClientHello is processed and the server flight is sent (from the TLS stack
to the server endpoint I guess). In QUIC, this completion has the side effect of
discarding the Handshake packet number space. If this handshake completion is not
deferred, the Handshake level CRYPTO data will not be sent to the peer (because of
the associated packet number space discarding). According to the documentation,
SSL_in_early_data() may be used to defer it. If it returns 1, this means that
the handshake is still in progress but has progressed enough to send half-RTT
data.

This patch is required to make the haproxy builds against aws-lc TLS stack support 0-RTT.
2024-01-23 16:03:29 +01:00
Frederic Lecaille
fcc825501c MINOR: ssl_sock: Early data disabled during SSL_CTX switching (aws-lc)
This patch impacts only haproxy when built against aws-lc TLS stack (OPENSSL_IS_AWSLC).

During the SSL_CTX switching from the ssl_sock_switchctx_cbk() callback,
ssl_sock_switchctx_set() is called. The latter calls SSL_set_SSL_CTX(),
whose aim is to change the SSL_CTX attached to an SSL object (TLS session).
But the aws-lc (or boringssl) implementation of this function copies the
"early data enabled" setting value (a boolean) from the SSL_CTX object
into the SSL object. So, if it is not set in the SSL_CTX object, this setting
disables the one which was set by configuration on the SSL object
(see qc_set_quic_early_data_enabled(), which calls SSL_set_early_data_enabled()
with an SSL object as parameter).

Fix this by enabling the "early data enabled" setting on the SSL_CTX before
attaching the latter to the SSL object.

This patch is required to make QUIC 0-RTT work with haproxy built against
aws-lc.

Note that this patch should also help with early data support for TCP connections.
2024-01-23 16:03:29 +01:00
Frederic Lecaille
f783dd959b MINOR: quic: Enable early data at SSL session level (aws-lc)
This patch impacts only the haproxy build against aws-lc TLS stack (USE_OPENSSL_AWSLC).

Implement a new function, qc_set_quic_early_data_enabled(), to enable
early data at the SSL session level. To make QUIC 0-RTT work, a context
string must be set by calling SSL_set_quic_early_data_context(). A
subset of the encoded transport parameters is used for this.
Note that some application level settings should also be added (TODO).

This patch is required to make 0-RTT work for haproxy builds against aws-lc.
2024-01-23 16:03:29 +01:00
Frederic Lecaille
1cddd2637b MINOR: quic: Transport parameters encoding without version_information
Encode the version_information parameter only if the chosen version is provided
to quic_transport_params_encode() whose aim is to encode into a buffer all the
transport parameters passed as parameter (struct quic_params *p) in addition
to the version_information parameter.

This enables support for encoding the transport parameters without the
version_information transport parameter. This is useful for builds against
TLS stacks such as boringssl or aws-lc, where a subset of the listener
transport parameters without version_information must be set as the context
string for accepting early data (see https://commondatastorage.googleapis.com/chromium-boringssl-docs/ssl.h.html#SSL_set_quic_early_data_context).

This patch is required to make haproxy builds against the aws-lc TLS stack
(USE_OPENSSL_AWSLC) support 0-RTT. It does not impact the other builds.
2024-01-23 16:03:29 +01:00
Willy Tarreau
cdc993b19e BUILD: stick-table: fix build error on 32-bit platforms
Commit 9b2717e7b ("MINOR: stktable: use {show,set,clear} table with ptr")
stores a pointer in a long long (64-bit), which fails the cast to void* on
32-bit platforms:

  src/stick_table.c: In function 'table_process_entry_per_ptr':
  src/stick_table.c:5136:37: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
   5136 |         ts = stktable_lookup_ptr(t, (void *)ptr);

On all our supported platforms, longs and pointers are of the same size,
so let's just turn this into a ulong instead.
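
A tiny illustration of the difference (purely illustrative):

  /* On all supported platforms sizeof(long) == sizeof(void *), so the
   * unsigned long round-trip below is lossless, while casting a 64-bit
   * 'long long' back to a 32-bit pointer is what triggered the
   * -Werror=int-to-pointer-cast error above.
   */
  static void *roundtrip(void *p)
  {
          unsigned long key = (unsigned long)p; /* what the CLI code stores */
          return (void *)key;                   /* safe cast back */
  }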
2024-01-21 08:21:35 +01:00
Willy Tarreau
6e5aa16145 MINOR: connection: add sample fetches to report per-connection glitches
Now with fc_glitches and bc_glitches we can retrieve the number of
detected glitches on a front or back connection. On the backend it
can indicate a bug in a server that may induce frequent reconnections
hence CPU usage in TLS reconnections, and on the frontend it may
indicate an abusive client that may be trying to attack the stack
or to fingerprint it. Small non-zero values are definitely expected
and can be caused by network glitches for example, as well as rare
bugs in the other component (or maybe even in haproxy). These should
never be considered as alarming as long as they remain low (i.e.
much less than one per request). A reg-test is provided.
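
For example, a configuration could act on these fetches roughly as in
the sketch below (paths, threshold and rules are hypothetical):

  frontend fe
      bind :443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1
      mode http
      # expose the counter in the logs and deny clearly abusive clients
      http-request set-var(txn.glitches) fc_glitches
      http-request deny deny_status 400 if { fc_glitches gt 1000 }
      log-format "%ci:%cp glitches=%[var(txn.glitches)]"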
2024-01-18 17:21:44 +01:00
Willy Tarreau
d2b44fd730 MINOR: mux-h2: implement MUX_CTL_GET_GLITCHES
This reports the number of glitches on a connection.
2024-01-18 17:21:44 +01:00
Willy Tarreau
3d4438484a MINOR: mux-h2: add a counter of "glitches" on a connection
There are a lot of H2 events which are not invalid from a protocol
perspective but which are yet anomalies, especially when repeated. They
can come from bogus or really poorly implemented clients, as well as
purposely built attacks, as we've seen in the past with various waves
of attempts at abusing H2 stacks.

In order to better deal with such situations, it would be nice to be
able to sort out what is correct and what is not. There's already the
HTTP error counter that may even be updated on a tracked connection,
but HTTP errors are something clearly defined while there's an entire
scope of gray area around it that should not fall into it.

This patch introduces the notion of "glitches", which normally correspond
to unexpected and temporary malfunction. And this is exactly what we'd
like to monitor. For example a peer is not misbehaving if a request it
sends fails to decode because, due to HPACK compression, it's larger than
a buffer, and for this reason such an event is reported as a stream error
and not a connection error. But this causes trouble nonetheless and should
be accounted for, especially to detect if it's repeated. Similarly, a
truncated preamble or settings frame may very well be caused by a network
hiccup but how do we know that in the logs? For such events, a glitch
counter is incremented on the connection.

For now a total of 41 locations were instrumented with this and the
counter is reported in the traces when not null, as well as in
"show sess" and "show fd". This was done using a new function,
"h2c_report_glitch()" so that it becomes easier to extend to more
advanced processing (applying thresholds, producing logs, escalating
to connection error, tracking etc).
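
Conceptually the helper is little more than a per-connection counter
with room for future policy, along these lines (a sketch, not the real
h2c_report_glitch()):

  /* Sketch only: a per-connection glitch counter with room to later
   * plug thresholds, logging or escalation to a connection error.
   */
  struct conn_example {
          unsigned int glitches;
  };

  static inline void report_glitch_example(struct conn_example *conn)
  {
          conn->glitches++;
          /* future: if (conn->glitches > threshold) escalate(conn); */
  }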

A test with h2spec shows it reported in 8545 trace lines for 147 tests,
with some reaching value 3 in a single test (e.g. HPACK errors).

Some places were not instrumented, typically anything that can be
triggered on perfectly valid activity (received data after RST being
emitted, timeouts, etc). Some types of events were thought about,
such as INITIAL_WINDOW_SIZE after the first SETTINGS frame, too small
window update increments, etc. It just sounds too early to know if
those are currently being triggered by perfectly legit clients. Also
it's currently not incremented on timeouts so that we don't do that
repeatedly on short keep-alive timeouts, though it could make sense.
This may change in the future depending on how it's used. For now
this is not exposed outside of traces and debugging.
2024-01-18 17:21:44 +01:00
Willy Tarreau
87b74697cd MINOR: mux-h2/traces: add a missing trace on connection WU with negative inc
The test was performed but no trace emitted, which can complicate certain
diagnostics, so let's just add the trace for this rare case. It may safely
be backported though this is really not important.
2024-01-18 17:21:44 +01:00
Willy Tarreau
e1c8bfd0ed BUG/MEDIUM: mux-h2: refine connection vs stream error on headers
Commit 7021a8c4d8 ("BUG/MINOR: mux-h2: also count streams for refused
ones") addressed stream counting issues on some error cases but not
completely correctly regarding the conn_err vs stream_err case. Indeed,
contrary to the initial analysis, h2c_dec_hdrs() can set H2_CS_ERROR
when facing some unrecoverable protocol errors, and it's not correct
to send it to strm_err, which will only send the RST_STREAM frame; the
subsequent GOAWAY frame is in fact the result of the read timeout.

The difficulty behind this lies on the sequence of output validations
because h2c_dec_hdrs() returns two results at once:
  - frame processing status (done/incomplete/failed)
  - connection error status

The original ordering requires writing two copies of the exact same
error handling code arranged differently, which the patch above tried
to factor into one. After careful inspection of h2c_dec_hdrs()
and its comments, it's clear that it always returns -1 on failure,
*including* connection errors. This means we can rearrange the test
to get rid of the missing data first, and immediately enter the
no-return zone where both the stream and connection errors can be
checked at the same place, making sure to consistently maintain
error counters. This is way better because we don't have to update
stream counters on the error path anymore. h2spec now passes the
test much faster.

This will need to be backported to the same branches as the commit
above, which was already backported to 2.9.
2024-01-18 17:21:02 +01:00
Aurelien DARRAGON
52f0b6edbe MINOR: vars: fix indentation in var_clear_buffer()
Fix indentation in var_clear_buffer() since it is exclusively using
spaces.

Could be backported if a fix depends on it.
2024-01-18 16:31:55 +01:00
Frederic Lecaille
0eaf42a2a4 BUG/MEDIUM: quic: keylog callback not called (USE_OPENSSL_COMPAT)
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT)
and it was introduced by this commit:

    BUG/MINOR: quic: Wrong keylog callback setting.

The quic_tls_compat_keylog_callback() callback was no longer set when the SSL keylog
was enabled by the tune.ssl.keylog setting. This is the callback which sets the TLS secrets
into haproxy.

Set it again when the SSL keylog is not enabled by configuration.

Thank you to @Greg57070 for having reported this issue in GH #2412.

Must be backported as far as 2.8.
2024-01-16 10:17:27 +01:00
Willy Tarreau
7021a8c4d8 BUG/MINOR: mux-h2: also count streams for refused ones
There are a few places where we can reject an incoming stream based on
technical errors such as decoded headers that are too large for the
internal buffers, or memory allocation errors. In this case we send
an RST_STREAM to abort the request, but the total stream counter was
not incremented. That's not really a problem, until one starts to try
to enforce a total stream limit using tune.h2.fe.max-total-streams,
which will then not count such faulty streams. Typically a client that
learns too large cookies and tries to replay them in a way that
overflows the maximum buffer size would be rejected and, depending on
how it is implemented, it might retry forever.

This patch removes the stream count increment from h2s_new() and moves
it instead to the calling functions, so that it translates the decision
to process a new stream instead of a successfully decoded stream. The
result is that such a bogus client will now be blocked after reaching
the total stream limit.

This can be validated this way:

  global
        tune.h2.fe.max-total-streams 128
        expose-experimental-directives
        trace h2 sink stdout
        trace h2 level developer
        trace h2 verbosity complete
        trace h2 start now

  frontend h
        bind :8080
        mode http
        redirect location /

Sending this will fill frames with 15972 bytes of cookie headers that
expand to 16500 for storage+index once decoded, causing "message too large"
events:

  (dev/h2/mkhdr.sh -t p;dev/h2/mkhdr.sh -t s;
   for sid in {0..1000}; do
     dev/h2/mkhdr.sh  -t h -i $((sid*2+1)) -f es,eh \
       -R "828684410f7777772e6578616d706c652e636f6d \
           $(for i in {1..66}; do
             echo -n 60 7F 73 433d $(for j in {1..24}; do
               echo -n 2e313233343536373839; done);
            done) ";
   done) | nc 0 8080

Now it properly stops after sending 128 streams.

This may be backported wherever commit 983ac4397 ("MINOR: mux-h2:
support limiting the total number of H2 streams per connection") is
present, since without it, that commit is less effective.
2024-01-12 18:59:59 +01:00
William Lallemand
97832ab823 MEDIUM: ssl: implements 'default-crt' keyword for bind Lines
The 'default-crt' bind keyword allows specifying multiple
default/fallback certificates, allowing one to have an RSA as well as an
ECDSA default.
2024-01-12 17:40:42 +01:00
William Lallemand
83a0cde207 REORG: ssl: move 'generate-certificates' code to ssl_gencert.c
A lot of code specific to the 'generate-certificates' option was left in
ssl_sock.c.

Move the code to 'ssl_gencert.c' and 'ssl_gencert.h'
2024-01-12 17:40:42 +01:00
William Lallemand
b80635a7e0 MEDIUM: ssl: does not use default_ctx for 'generate-certificate' option
The 'generate-certificates' option does not need its dedicated SSL_CTX
*, it only needs the default SSL_CTX.

Use the default SSL_CTX found in the sni_ctx to generate certificates.

This allows removing all the specific default_ctx initialization, as
well as the default_ssl_conf and 'default_inst'.
2024-01-12 17:40:42 +01:00
William Lallemand
0bf9d122a9 MEDIUM: ssl: generate '*' SNI filters for default certificates
This patch follows the previous one about default certificate selection
("MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA
selection").

This patch generates '*' SNI filters for the first certificate of a
bind line; they will be used to match default certificates, instead of
setting the default_ctx pointer in the bind line.

Since the filters are in the SNI tree, this allows having multiple
default certificates and restores the ecdsa/rsa selection with a
multi-cert bundle.

This configuration:
   # foobar.pem.ecdsa and foobar.pem.rsa
   bind *:8443 ssl crt foobar.pem crt next.pem

will use "foobar.pem.ecdsa" and "foobar.pem.rsa" as default
certificates.

Note: there is still cleanup needed around default_ctx.

This was discussed in github issue #2392.
2024-01-12 17:40:42 +01:00
William Lallemand
30592168e5 MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA selection
This patch changes the default certificate mechanism.

Since the beginning of SSL in HAProxy, the default certificate was the first
certificate of a bind line. This allowed to fallback on this certificate
when no servername extension was sent by the server, or when no SAN nor
CN was available in the certificate.

When using a multi-certificate bundle (ecdsa+rsa), it was possible to
have both certificates as the fallback one, letting openssl choose the
right one. This was possible because a multi-certificate bundle
was generating a unique SSL_CTX for both certificates.

When the haproxy and openssl architecture evolved, we decided to
use multiple SSL_CTX for a multi-cert bundle, in order to simplify the
code and allow updates over the CLI.

However only one default_ctx was allowed, so we lost the ability to
choose between ECDSA and RSA for the default certificate.

This patch allows using a '*' filter for a certificate, which makes it
possible to look up multiple '*' filters and have one in RSA and another
one in ECDSA. It replaces the default_ctx mechanism in the ClientHello
callback and uses the standard algorithm to look for a default cert and
choose between ECDSA and RSA.

/!\ This patch breaks the automatic setting of the default certificate, which
will be reintroduced in the next patch. So the first certificate of a bind
line won't be used as a default anymore.

To use this feature, one could use crt-list with '*' filters:

$ cat foo.crtlist
foobar.pem.rsa   *
foobar.pem.ecdsa *

In order to test the feature, it's easy to send a request without
the servername extension and use ECDSA or RSA compatible ciphers:

$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-RSA-AES256-GCM-SHA384
$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-ECDSA-AES256-GCM-SHA384
2024-01-12 17:40:42 +01:00
Amaury Denoyelle
333f2cabab BUG/MINOR: mux-quic: do not prevent non-STREAM sending on flow control
Data emitted by QUIC MUX is restrained by the peer flow control. This is
checked on stream and connection level inside qcc_io_send().

The connection level check was placed early in the qcc_io_send() preamble.
However, this also prevents the emission of other frames such as STOP_SENDING
and RESET_STREAM until the flow control limit is increased by a received
MAX_DATA. Note that local flow control frames are emitted earlier in
qcc_io_send() and so are not impacted.

In the worst case, if no MAX_DATA is received for some time, this could
significantly delay stream closure and resource release. However, this
should be rare as peers should anticipate the emission of MAX_DATA before
the flow control limit is reached. In the end, this is also covered by the
MUX timeout, so the impact should be minimal.

To fix this, move the connection level check directly inside QCS sending
loop. Note that this could cause unnecessary looping when connection
flow control level is reached and no STOP_SENDING/RESET_STREAM are
needed.

This should be backported up to 2.6.
2024-01-12 16:53:41 +01:00
Ilya Shipitsin
671f6cf36a CLEANUP: fix spelling of "occured" in src/h3.c 2024-01-12 08:34:53 +01:00
Willy Tarreau
4cc25f26f9 MEDIUM: http: add the ability to redefine http-err-codes and http-fail-codes
The new global keywords "http-err-codes" and "http-fail-codes" allow
redefining which HTTP status codes indicate a client-induced error or a
server error, as tracked by stick-table counters. This is only done
globally, though everything was done so that it could easily be extended
to a per-proxy mechanism if there was a real need for this (but it would
use quite a bit more RAM then).

A simple reg-test was added (http-err-fail.vtc).
2024-01-11 15:10:08 +01:00
Willy Tarreau
9d827e1049 MEDIUM: http_act: check status codes against the bit fields for err/fail
This drops the hard-coded 4xx and 5xx status codes for err_cnt and
fail_cnt, in favor of the new bit fields that will soon be configurable.
There should be no difference at all since the bit fields are initialized
to the exact same sets (400-499 for err, 500-599 minus 501 and 505 for
fail).
2024-01-11 15:10:08 +01:00
Willy Tarreau
3c135569c5 MINOR: http: add infrastructure to choose status codes for err / fail
At the moment, http_err_cnt and http_fail_cnt are incremented on a
well-defined set of status codes, which are checked at various places.
Over time, there have been some complaints about 404, 401 or 407
triggering errors, or 500 triggering failures in SOAP environments
for example. With a small bit field that fits in a cache line we
can match the presence of a status code from 100 to 599, so that
remains cheap.

This patch adds two such bit fields, one per code class, and the
accompanying functions to set/clear/test the codes. The arrays are
preset at boot time. For now they are not used and it's not possible
to adjust them.
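
Such a bit field and its helpers can be sketched as follows (illustrative
names, not the actual haproxy API):

  #include <stdint.h>

  /* One bit per status code from 100 to 599 (500 codes), packed into a
   * small array that fits in a cache line.
   */
  #define STATUS_MIN 100
  #define STATUS_MAX 599

  typedef uint64_t status_codes_bf[(STATUS_MAX - STATUS_MIN + 64) / 64];

  static inline void bf_set(status_codes_bf bf, int status)
  {
          bf[(status - STATUS_MIN) / 64] |= 1ULL << ((status - STATUS_MIN) % 64);
  }

  static inline void bf_clr(status_codes_bf bf, int status)
  {
          bf[(status - STATUS_MIN) / 64] &= ~(1ULL << ((status - STATUS_MIN) % 64));
  }

  static inline int bf_test(const uint64_t *bf, int status)
  {
          if (status < STATUS_MIN || status > STATUS_MAX)
                  return 0;
          return !!(bf[(status - STATUS_MIN) / 64] &
                    (1ULL << ((status - STATUS_MIN) % 64)));
  }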
2024-01-11 15:10:08 +01:00
Willy Tarreau
59c01f1091 CLEANUP: http: avoid duplicating literals in find_http_meth()
The function does the inverse of http_known_methods[], better rely on
that array with its indices, that makes the code clearer. Note that
we purposely don't use a loop because the compiler is able to build
an evaluation tree of the size checks and content checks that's very
efficient for the most common methods. Moving a few unimportant
entries even simplified the output code a little bit (they're now
grouped by size without changing anything for the first ones).
2024-01-11 15:10:08 +01:00
Willy Tarreau
19def65228 OPTIM: http: simplify http_get_status_idx() using a hash
This function uses a large switch/case, but the problem is that due to the
numerous holes in the range, the compiler implemented a large jump table.
With a bit of experimentation, some trivial perfect-hash code works, and
since the number of entries is 19, it was enlarged to match the nearest
next power of two to avoid a large modulo operation, and fills the holes
with the default return value (HTTP_ERR_500). Jumping to 32 also results
in a lot of valid keys and allows us to pick small values, resulting in
very fast and compact code. The new function, despite keeping a 32-bytes
table, saves slightly more than 800 bytes of code+data compared to the
previous code, and avoids table jumps that affect the CPU's branch history.

Note that another simple hash worked fine and produced exactly 19 codes
(hence no need to pad holes): ((status * 8675725) >> 13) % 19

But it's still about 24 bytes larger in code to save 13 bytes of data
that are aligned anyway, and it was a bit more expensive so that was
definitely not worth it.

The validity of the table was verified with this test code added just after
it:

  __attribute__((constructor)) void http_hash_test(void)
  {
	int i;
	for (i = 0; i <= 600; i++)
		printf("code %d => %d\n", i, http_get_status_idx(i));
	exit(0);
  }

And starting haproxy |grep -vw 14 correctly shows all ordered values
(except 500 of course which is 14).

In case new codes would be added, just play again with dev/phash to
update the table. As long as there are fewer than 32 effective entries
it will remain easy to update without having to modify phash.
2024-01-11 15:10:08 +01:00
Aurelien DARRAGON
3b0bf5097b MINOR: map: mapfile ordering also matters for tree-based match types
Willy made me realize that tree-based matching may also suffer from
out-of-order mapfile loading, as opposed to what's being said in
b546bb6d ("BUG/MINOR: map: list-based matching potential ordering
regression") and the associated REGTEST.

Indeed, in case of duplicated keys, we want to be sure that only the key
that was first seen in the file will be returned (as long as it is not
removed). The above fix is still valid, and the list-based match regtest
will also prevent regressions for tree-based match since mapfile loading
logic is currently match-type agnostic.

But let's clarify that by making both the code comment and the regtest
more precise.
2024-01-11 11:13:54 +01:00
Aurelien DARRAGON
b546bb6d67 BUG/MINOR: map: list-based matching potential ordering regression
An unexpected side-effect was introduced by 5fea597 ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")

The above commit tried to use eb tree API to manipulate elements as much
as possible in the hope to accelerate some functions.

Prior to 5fea597, pattern_read_from_file() used to iterate over all
elements from the map file in the same order they were seen in the file
(using list_for_each_entry) to push them in the pattern expression.

Now, since eb api is used to iterate over elements, the ordering is lost
very early.

This is known to cause behavior changes with existing setups (same conf
and map file) when compared with previous versions for some list-based
matching methods as described in GH #2400. For instance, the map_dom()
converter may return a different matching key from the one that was
returned by older haproxy versions.

For IP or STR matching, matching is based on tree lookups for better
efficiency, so in this case the ordering is lost at the name of
performance. The order in which they are loaded doesn't matter because
tree ordering is based on the content, it is not positional.

But with some other types, matching is based on list lookups (e.g.: dom),
and the order in which elements are pushed into the list can affect the
matching element that will be returned (in case of multiple matches, since
only the first matching element in the list will be returned).

Despite the documentation not officially stating that the file ordering
should be preserved for list-based matching methods, it's probably best
to be conservative here and stick to historical behavior. Moreover, there
was no performance benefit from using the eb tree api to iterate over
elements in pattern_read_from_file() since all elements are visited
anyway.

This should be backported to 2.9.
2024-01-10 18:02:13 +01:00
William Lallemand
0773826645 CLEANUP: ssl: fix indentation in smp_fetch_ssl_fc_ec() (part 2)
Fix indentation in smp_fetch_ssl_fc_ec() since it is using exclusively
spaces.

This should have been in previous 9a21b4b43 patch but was missed by
accident.

Could be backported if a fix depends on it.
2024-01-09 17:27:31 +01:00
William Lallemand
9a21b4b435 CLEANUP: ssl: fix indentation in smp_fetch_ssl_fc_ec()
Fix indentation in smp_fetch_ssl_fc_ec() since it is using exclusively
spaces.

Could be backported if a fix depends on it.
2024-01-09 11:53:21 +01:00
Mariam John
25da2174c6 MINOR: ssl: Update ssl_fc_curve/ssl_bc_curve to use SSL_get0_group_name
The function `smp_fetch_ssl_fc_ec` gets the curve name used during key
exchange. It currently uses `SSL_get_negotiated_group`, available
since OpenSSL 3.0, to get the nid and derive the short name of the curve
from the nid. In OpenSSL 3.2, a new function, `SSL_get0_group_name`, was
added that directly gives the curve name.

The function `smp_fetch_ssl_fc_ec` has been updated to use
`SSL_get0_group_name` when building with OpenSSL >= 3.2, and to keep
using the old SSL_get_negotiated_group for versions >= 3.0 and < 3.2 to
get the curve name. Another change is to normalize the return value, so
that `smp_fetch_ssl_fc_ec` returns the curve name in uppercase.
(`SSL_get0_group_name` returns the curve name in lowercase and
`SSL_get_negotiated_group` + `OBJ_nid2sn` returns curve name in
uppercase).
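
A version-guarded sketch of that selection (simplified, with the haproxy
glue and error handling omitted):

  #include <ctype.h>
  #include <stddef.h>
  #include <openssl/objects.h>
  #include <openssl/ssl.h>

  /* Copy the negotiated group/curve name, uppercased, into <buf>;
   * returns NULL when the information is not available.
   */
  static const char *curve_name_example(SSL *ssl, char *buf, size_t len)
  {
          const char *name = NULL;
          size_t i;

  #if (OPENSSL_VERSION_NUMBER >= 0x30200000L)
          name = SSL_get0_group_name(ssl);                  /* lowercase */
  #elif (OPENSSL_VERSION_NUMBER >= 0x30000000L)
          name = OBJ_nid2sn(SSL_get_negotiated_group(ssl)); /* uppercase */
  #endif
          if (!name || !len)
                  return NULL;
          for (i = 0; i + 1 < len && name[i]; i++)
                  buf[i] = toupper((unsigned char)name[i]);
          buf[i] = '\0';
          return buf;
  }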

Can be backported to 2.8.
2024-01-09 11:53:21 +01:00
Willy Tarreau
2b930aa7c3 [RELEASE] Released version 3.0-dev1
Released version 3.0-dev1 with the following main changes :
    - MINOR: channel: Use dedicated functions to deal with STREAMER flags
    - MEDIUM: applet: Handle channel's STREAMER flags on applets size
    - MINOR: applets: Use channel's field to compute amount of data received
    - MEDIUM: cache: Save body size of cached objects and track it on delivery
    - MEDIUM: cache: Add support for endp-to-endp fast-forwarding
    - MINOR: cache: Add global option to enable/disable zero-copy forwarding
    - MINOR: pattern: Use reference name as filename to read patterns from a file
    - MEDIUM: pattern: Add support for virtual and optional files for patterns
    - DOC: config: Add section about name format for maps and ACLs
    - DOC: management/lua: Update commands about map and acl
    - MINOR: promex: Add support for specialized front/back/li/srv metric names
    - MINOR: promex: Export active/backup metrics per-server
    - BUG/MINOR: ssl: Double free of OCSP Certificate ID
    - MINOR: ssl/cli: Add ha_(warning|alert) msgs to CLI ckch callback
    - BUG/MINOR: ssl: Wrong OCSP CID after modifying an SSL certficate
    - BUG/MINOR: lua: Wrong OCSP CID after modifying an SSL certficate (LUA)
    - DOC: configuration: typo req.ssl_hello_type
    - MINOR: hq-interop: add fastfwd support
    - CLEANUP: mux_quic: rename ffwd function with prefix qmux_strm_
    - MINOR: mux-quic: add traces for 0-copy/fast-forward
    - BUG/MINOR: mworker/cli: fix set severity-output support
    - CLEANUP: mworker/cli: add comments about pcli_find_and_exec_kw()
    - BUG/MEDIUM: quic: Possible buffer overflow when building TLS records
    - BUILD: ssl: update types in wolfssl cert selection callback
    - MINOR: ssl: activate the certificate selection callback for WolfSSL
    - CI: github: switch to wolfssl git-c4b77ad for new PR
    - BUG/MEDIUM: map/acl: pat_ref_{set,delete}_by_id regressions
    - BUG/MINOR: ext-check: cannot use without preserve-env
    - CLEANUP: mux-quic: remove unused prototype
    - MINOR: mux-quic: clean up qcs Rx buffer allocation API
    - MINOR: mux-quic: clean up qcs Tx buffer allocation API
    - CLEANUP: mux-quic: clean up app ops callback definitions
    - MINOR: mux-quic: factorize QC_SF_UNKNOWN_PL_LENGTH set
    - MINOR: h3: complete traces for sending
    - MINOR: h3: adjust zero-copy sending related code
    - MINOR: hq-interop: use zero-copy to transfer single HTX data block
    - BUG/MEDIUM: quic: QUIC CID removed from tree without locking
    - BUG/MEDIUM: stconn: Block zero-copy forwarding if EOS/ERROR on consumer side
    - BUG/MEDIUM: mux-h1: Cound data from input buf during zero-copy forwarding
    - BUG/MEDIUM: mux-h1: Explicitly skip request's C-L header if not set originally
    - CLEANUP: mux-h1: Fix a trace message about C-L header addition
    - BUG/MEDIUM: mux-h2: Report too large HEADERS frame only when rxbuf is empty
    - BUG/MEDIUM: mux-quic: report early error on stream
    - DOC: config: add arguments to sample fetch methods in the table
    - DOC: config: also add arguments to the converters in the table
    - BUG/MINOR: resolvers: default resolvers fails when network not configured
    - SCRIPTS: mk-patch-list: produce a list of patches
    - DEV: patchbot: add the AI-based bot to pre-select candidate patches to backport
    - BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
    - BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
    - BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
    - BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
    - DOC: config: Update documentation about local haproxy response
    - DEV: patchbot: use checked buttons as reference instead of internal table
    - DEV: patchbot: allow to show/hide backported patches
    - MINOR: h3: remove quic_conn only reference
    - BUG/MINOR: server: Use the configured address family for the initial resolution
    - MINOR: mux-quic: remove qcc_shutdown() from qcc_release()
    - MINOR: mux-quic: use qcc_release in case of init failure
    - MINOR: mux-quic: adjust error code in init failure
    - MINOR: h3: add traces for connection init stage
    - BUG/MINOR: h3: properly handle alloc failure on finalize
    - MINOR: h3: use INTERNAL_ERROR code for init failure
    - BUG/MAJOR: stconn: Disable zero-copy forwarding if consumer is shut or in error
    - MINOR: stats: store the parent proxy in stats ctx (http)
    - BUG/MEDIUM: stats: unhandled switching rules with TCP frontend
    - MEDIUM: proxy: set PR_O_HTTP_UPG on implicit upgrades
    - MINOR: proxy: monitor-uri works with tcp->http upgrades
    - OPTIM: server: eb lookup for server_find_by_name()
    - OPTIM: server: ebtree lookups for findserver_unique_* functions
    - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
    - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype
    - BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
    - MINOR: server: ensure connection cleanup on server addr changes
    - CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event
    - MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic
    - CLEANUP: server: remove unused server_parse_addr_change_request() function
    - CLEANUP: resolvers: remove duplicate func prototype
    - MINOR: resolvers: add unique numeric id to nameservers
    - MEDIUM: server: make server_set_inetaddr() updater serializable
    - MINOR: server/event_hdl: expose updater info through INETADDR event
    - MINOR: server: add dns hint in server_inetaddr_updater struct
    - MEDIUM: server/dns: clear RMAINT when addr resolves again
    - BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS
    - BUG/MEDIUM: server/dns: perform svc_port updates atomically from SRV records
    - MEDIUM: peers: use server as stream target
    - CLEANUP: peers: remove unused sock_init_arg struct member
    - CLEANUP: peers: remove unused "proto" and "xprt" struct members
    - MINOR: peers: rely on srv->addr and remove peer->addr
    - DOC: config: add context hint for server keywords
    - MINOR: stktable: add table_process_entry helper function
    - MINOR: stktable: use {show,set,clear} table with ptr
    - MINOR: map: add map_*_key converters to provide the matching key
    - DOC: fix typo for fastfwd QUIC option
    - BUG/MINOR: mux-quic: always report error to SC on RESET_STREAM emission
    - MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS
    - BUG/MINOR: mux-quic: disable fast-fwd if connection on error
    - BUG/MINOR: quic: Wrong keylog callback setting.
    - BUG/MINOR: quic: Missing call to TLS message callbacks
    - MINOR: h3: check connection error during sending
    - BUG/MINOR: h3: close connection on header list too big
    - BUG/MINOR: h3: close connection on sending alloc errors
    - BUG/MINOR: h3: disable fast-forward on buffer alloc failure
    - Revert "MINOR: mux-quic: Disable zero-copy forwarding for send by default"
    - MINOR: stktable: stktable_data_ptr() cannot fail in table_process_entry()
    - CLEANUP: assorted typo fixes in the code and comments
    - CI: use semantic version compare for determing "latest" OpenSSL
    - CLEANUP: server: remove ambiguous check in srv_update_addr_port()
    - CLEANUP: resolvers: remove unused RSLV_UPD_OBSOLETE_IP flag
    - CLEANUP: resolvers: remove some more unused RSLV_UDP flags
    - MEDIUM: server: simplify snr_set_srv_down() to prevent confusions
    - MINOR: backend: export get_server_*() functions
    - MINOR: tcpcheck: export proxy_parse_tcpcheck()
    - MEDIUM: udp: allow to retrieve the frontend destination address
    - MINOR: global: export a way to list build options
    - MINOR: debug: add features and build options to "show dev"
    - BUG/MINOR: server: fix server_find_by_name() usage during parsing
    - REGTESTS: check attach-srv out of order declaration
    - CLEANUP: quic: Remaining useless code into server part
    - BUILD: quic: Missing quic_ssl.h header protection
    - BUG/MEDIUM: h3: fix incorrect snd_buf return value
    - MINOR: h3: do not consider missing buf room as error on trailers
    - BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
    - BUG/MEDIUM: stconn: Set fsb date if zero-copy forwarding is blocked during nego
    - BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
    - MINOR: mux-h2: support limiting the total number of H2 streams per connection
    - CLEANUP: mux-h2: remove the printfs from previous commit on h2 streams limit.
    - DEV: h2: add the ability to emit literals in mkhdr
    - DEV: h2: add the preface as well in supported output types
    - DEV: h2: support passing raw data for a frame
    - IMPORT: ebtree: implement and use flsnz_long() to count bits
    - IMPORT: ebtree: switch the sizes and offsets to size_t and ssize_t
    - IMPORT: ebtree: rework the fls macros to better deal with arch-specific ones
    - IMPORT: ebtree: make string_equal_bits turn back to unsigned char
    - IMPORT: ebtree: use unsigned ints for flznz()
    - IMPORT: ebtree: make string_equal_bits() return an unsigned
2024-01-06 14:09:35 +01:00
Willy Tarreau
e19334a343 CLEANUP: mux-h2: remove the printfs from previous commit on h2 streams limit.
After thinking about them the whole time, in the end I managed to remove
them while editing the commit and forgot to push the result :-(
2024-01-05 19:19:10 +01:00
Willy Tarreau
983ac4397d MINOR: mux-h2: support limiting the total number of H2 streams per connection
This patch introduces a new setting: tune.h2.fe.max-total-streams. It
sets the HTTP/2 maximum number of total streams processed per incoming
connection. Once this limit is reached, HAProxy will send a graceful GOAWAY
frame informing the client that it will close the connection after all
pending streams have been closed. In practice, clients tend to close as fast
as possible when receiving this, and to establish a new connection for next
requests. Doing this is sometimes useful and desired in situations where
clients stay connected for a very long time and cause some imbalance inside a
farm. For example, in some highly dynamic environments, it is possible that
new load balancers are instantiated on the fly to adapt to a load increase,
and that once the load goes down they should be stopped without breaking
established connections. By setting a limit here, the connections will have
a limited lifetime and will be frequently renewed, with some possibly being
established to other nodes, so that existing resources are quickly released.

The default value is zero, which enforces no limit beyond those implied by
the protocol (2^30 ~= 1.07 billion). Values around 1000 were found to
already cause frequent enough connection renewal without causing any
perceptible latency to most clients. One notable exception here is h2load
which reports errors for all requests that were expected to be sent over
a given connection after it receives a GOAWAY. This is an already known
limitation: https://github.com/nghttp2/nghttp2/issues/981

The patch was made in two parts inside h2_frt_handle_headers():
  - the first one, at the end of the function, which verifies if the
    configured limit was reached and if it's needed to emit a GOAWAY ;

  - the second, just before decoding the stream frame, which verifies if
    a previously configured limit was ignored by the client, and closes
    the connection if this happens. Indeed, one reason for a connection
    to stay alive for too long definitely comes from a stupid bot that
    periodically fetches the same resource, scans lots of URLs or tries
    to brute-force something. These ones are more likely to just ignore
    the last stream ID advertised in GOAWAY than a regular browser, or
    a well-behaving client such as curl which respects it. So in order
    to make sure we can close the connection we need to enforce the
    advertised limit.

Note that a regular client will not face a problem with that because in
the worst case it will have max_concurrent_streams in flight and this
limit is taken into account when calculating the advertised last
acceptable stream ID.

Just a note: it may also be possible to move the first part above to
h2s_frt_stream_new() instead so that it's not processed for trailers,
though it doesn't seem to be more interesting, first because it has
two return points.

This is something that may be backported to 2.9 and 2.8 to offer more
control to those dealing with dynamic infrastructures, especially since
for now we cannot force a connection to be cleanly closed using rules
(e.g. github issues #946, #2146).
2024-01-05 18:49:11 +01:00
Christopher Faulet
72c23bd4cd BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
This test was already performed when a new message is queued into the
sending queue, but not when the last applet is released, in
spoe_release_appctx(). It is quite an old bug. It was introduced by commit
6f1296b5c7 ("BUG/MEDIUM: spoe: Create a SPOE applet if necessary when the
last one is released").

Because of this bug, new SPOE applets may be created and quickly released
because there is no server up, in a loop and as long as there is at least
one message in the sending queue, consuming all the CPU. It is pretty
visible if the processing timeout is high.

To fix the bug, the conditions for creating a SPOE applet or not are now
centralized in spoe_create_appctx(). The tests on the max connections per
second and on the number of active servers are moved into this function.

This patch must be backported to all stable versions.
2024-01-05 17:28:50 +01:00
Christopher Faulet
7eb7ae2835 BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
The commit b9c87f8082 ("BUG/MEDIUM: stconn/stream: Forward shutdown on write
timeout") introduced a regression. In sc_cond_forward_shut(), the write
timeout is considered too early to forward the shutdown. In fact, it is
always considered, even if the shutdown is not forwardable yet. It is of
course unexpected. It is especially an issue when a write timeout is
encountered on the server side during connection establishment. In this
case, if the shutdown is forwarded too early on the client side, the
connection is closed before the 503 error is sent.

So the write timeout must indeed be considered to forward the shutdown to
the underlying layer, but only if the shutdown is forwardable. Otherwise, we
should do nothing.

This patch should fix the issue #2404. It must be backported as far as 2.2.
2024-01-05 17:28:06 +01:00
Amaury Denoyelle
8df47442d2 MINOR: h3: do not consider missing buf room as error on trailers
Improve the h3_resp_trailers_send() return value to be similar to
h3_resp_data_send(). In particular, if the QCS Tx buffer does not have
enough space for trailer encoding, 0 is returned instead of an error
value, with QC_SF_BLK_MROOM set.

This unifies the HTTP/3 headers/data/trailers encoding functions. Negative
error codes are limited to fatal errors which should cause a connection
closure. Not enough output buffer space is only a transient condition
which is reflected by the QC_SF_BLK_MROOM flag.
2024-01-04 15:37:49 +01:00
Amaury Denoyelle
14673fe54d BUG/MEDIUM: h3: fix incorrect snd_buf return value
h3_resp_data_send() is used to transcode HTX data into H3 data frames.
If QCS Tx buffer is not aligned when first invoked, two separate frames
may be built, first until buffer end, then with remaining space in
front.

If buffer space is not enough for at least the H3 frame header, -1 is
returned with the flag QC_SF_BLK_MROOM set to await for more room. An
issue arises if this occurs for the second frame : -1 is returned even
though HTX data were properly transcoded and removed on the first step.
This causes snd_buf callback to return an incorrect value to the stream
layer, which in the end will corrupt the channel output buffer.

To fix this, stop considering that not enough remaining space is an
error case. Instead, return 0 if this is encountered for the first frame
or the HTX removed block size for the second one. As QC_SF_BLK_MROOM is
set, this will correctly interrupt H3 encoding. The err label is thus
properly limited to fatal errors which should cause a connection closure.
A new BUG_ON() has been added which should prevent similar issues in the
future.

This issue was detected using the following client :
 $ ngtcp2-client --no-quic-dump --no-http-dump --exit-on-all-streams-close \
   127.0.0.1 20443 -n2 "http://127.0.0.1:20443/?s=50k"

This triggers the following CHECK_IF statement. Note that it may be
necessary to disable fast forwarding to enforce snd_buf usage.

Thread 1 "haproxy" received signal SIGILL, Illegal instruction.
0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
130             CHECK_IF_HOT(c->output > c_data(c));
[ ## gdb ## ] bt
 #0  0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
 #1  0x00005555558c1d69 in sc_conn_send (sc=0x5555561f92d0) at src/stconn.c:1637
 #2  0x00005555558c2683 in sc_conn_io_cb (t=0x5555561f7f10, ctx=0x5555561f92d0, state=32832) at src/stconn.c:1824
 #3  0x000055555590c48f in run_tasks_from_lists (budgets=0x7fffffffdaa0) at src/task.c:596
 #4  0x000055555590cf88 in process_runnable_tasks () at src/task.c:876
 #5  0x00005555558aae3b in run_poll_loop () at src/haproxy.c:3049
 #6  0x00005555558ab57e in run_thread_poll_loop (data=0x555555d9fa00 <ha_thread_info>) at src/haproxy.c:3251
 #7  0x00005555558ad053 in main (argc=6, argv=0x7fffffffddd8) at src/haproxy.c:3948

In case CHECK_IF statements are not activated, this may cause crashes or
incorrect transfers.

This was introduced by the following commit
  commit 2144d24186
  BUG/MINOR: h3: close connection on sending alloc errors

This must be backported wherever the above patch is.
2024-01-04 15:36:58 +01:00
Frédéric Lécaille
860028db47 CLEANUP: quic: Remaining useless code into server part
Remove some QUIC member definitions from the server structure, as the haproxy QUIC
stack does not support the server part (QUIC client) at all at this time.
Remove the statements related to their initialization.

This patch should be backported as far as 2.6 to save memory.
2024-01-04 11:16:06 +01:00
Amaury Denoyelle
b4db3be86e BUG/MINOR: server: fix server_find_by_name() usage during parsing
Since the below commit, server_find_by_name() now searches using the
proxy backend 'used_server_id' tree :
  4bcfe30414
  OPTIM: server: eb lookup for server_find_by_name()

This introduces a regression if server_find_by_name() is used via
check_config_validity() during post-parsing. Indeed, the used_server_id tree
is populated at the same stage, so it's possible to not find an existing
server. This can cause incorrect rejection of a previously valid
configuration file.

To fix this, servers are now inserted in used_server_id tree during
parsing via parse_server(). This guarantees that server instances can be
retrieved during post parsing.

A known feature which uses server_find_by_name() during post parsing is
attach-srv tcp-rule used for reverse HTTP. Prior to the current fix, a
config was wrongly rejected if the rule was declared before the server
line.

This should not be backported unless the mentioned commit is.
2024-01-02 15:52:47 +01:00
Willy Tarreau
9d869b10de MINOR: debug: add features and build options to "show dev"
The "show dev" CLI command is still missing useful elements such as the
build options, SSL version etc. Let's just add the build features and
the build options there so that it's possible to collect all of this
from a running process without having to start the executable with -vv.

This is still dumped all at once from the parsing function since the
output is small. If it were to grow, this would possibly need to be
reworked to support a context.

It might be helpful to backport this to 2.9 since it can help narrow
down certain issues.
2024-01-02 11:44:42 +01:00
Willy Tarreau
afba58f21e MINOR: global: export a way to list build options
The new function hap_get_next_build_opt() will iterate over the list of
build options. This will be used for debugging, so that the build options
can be retrieved from the CLI.
2024-01-02 11:44:42 +01:00
Dragan Dosen
96c1a61136 MEDIUM: udp: allow to retrieve the frontend destination address
A new flag RX_F_PASS_PKTINFO is now available, whose purpose is to mark
that the destination address is about to be retrieved on some listeners.

The address can be retrieved from the first received datagram, and
relies on the IP_PKTINFO, IP_RECVDSTADDR and IPV6_RECVPKTINFO support.
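
For illustration only, here is a minimal, generic Linux sockets sketch (with
hypothetical names, not haproxy's actual receive path) of how the destination
address of a datagram can be read through IP_PKTINFO, assuming the option was
enabled on the socket beforehand:

  #include <netinet/in.h>
  #include <stddef.h>
  #include <sys/socket.h>

  /* Assumes setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &one, sizeof(one)) was
   * already performed on the receiving socket.
   */
  static ssize_t recv_with_dst(int fd, void *data, size_t len, struct in_addr *dst)
  {
      union {
          char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
          struct cmsghdr align;
      } ctl;
      struct iovec iov = { .iov_base = data, .iov_len = len };
      struct msghdr msg = {
          .msg_iov = &iov, .msg_iovlen = 1,
          .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
      };
      struct cmsghdr *cmsg;
      ssize_t ret = recvmsg(fd, &msg, 0);

      if (ret < 0)
          return ret;
      for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
          if (cmsg->cmsg_level == IPPROTO_IP && cmsg->cmsg_type == IP_PKTINFO) {
              struct in_pktinfo *pi = (struct in_pktinfo *)CMSG_DATA(cmsg);

              *dst = pi->ipi_addr; /* destination (frontend) address */
          }
      }
      return ret;
  }

The IPv6 and BSD variants follow the same cmsg pattern with IPV6_RECVPKTINFO
and IP_RECVDSTADDR respectively, as listed above.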
2024-01-02 11:44:42 +01:00
Dragan Dosen
1582ccf9d3 MINOR: tcpcheck: export proxy_parse_tcpcheck()
Export proxy_parse_tcpcheck() in tcpcheck.h
2024-01-02 11:44:42 +01:00
Dragan Dosen
5b1609f9da MINOR: backend: export get_server_*() functions
This is in preparation for exposing more of the LB internals.
2024-01-02 11:44:42 +01:00
Aurelien DARRAGON
bdecff511c MEDIUM: server: simplify snr_set_srv_down() to prevent confusions
snr_set_srv_down() (formerly known as snr_update_srv_status()) is
still too ambiguous because it's not clear whether we will be putting
the server under maintenance or not. This is mainly due to the fact that
the function behaves differently depending on whether has_no_ip is set.

By reviewing the function callers, it has now become clear that
snr_resolution_cb() is always calling the function with a valid resolution
so we only want to put the server under maintenance if we don't have a
valid IP address. On the other hand snr_resolution_error_cb() always
calls the function on error, with either no resolution (for SRV requests)
or with a failing resolution (all cases except RSLV_STATUS_VALID), so in
this case we decide whether to put the server under maintenance on a
case-by-case basis (i.e. expired? timeout?).

As a result, let's simplify snr_set_srv_down() so that it is only called
when the caller really thinks that the server should be put under
maintenance, which means always for snr_resolution_error_cb(), and only
if the resolution didn't yield usable ip for snr_resolution_cb().
2024-01-02 10:29:50 +01:00
Aurelien DARRAGON
689784ed91 CLEANUP: resolvers: remove some more unused RSLV_UDP flags
RSLV_UPD_CNAME and RSLV_UPD_NAME_ERROR flags have now become useless since
3cf7f987 ("MINOR: dns: proper domain name validation when receiving DNS
response") as they are never set, but we forgot to remove them.
2024-01-02 10:29:41 +01:00
Aurelien DARRAGON
3ebe7bef8d CLEANUP: server: remove ambiguous check in srv_update_addr_port()
A leftover check remained from the recent patch series about server
addr:svc_port propagation: srv_update_addr_port() checked whether (msg)
was set, but msg is always set, so the check is not needed and confuses
Coverity (see GH #2399).
2024-01-02 10:29:24 +01:00
Ilya Shipitsin
8705e45964 CLEANUP: assorted typo fixes in the code and comments
This is 38th iteration of typo fixes
2024-01-02 10:19:48 +01:00
Aurelien DARRAGON
41b7193e3c MINOR: stktable: stktable_data_ptr() cannot fail in table_process_entry()
In table_process_entry(), stktable_data_ptr() result is dereferenced
without checking if it's NULL first, which may happen when bad inputs
are provided to the function.

However, the data_type and ts arguments were already checked prior to calling
the function, so we know for sure that stktable_data_ptr() will never
return NULL in this case.

Yet some static code analyzers such as Coverity are confused
because they think that the result might possibly be NULL.
(See GH #2398)

To make it explicit that we always provide good inputs and expect valid
result, let's switch to the __stktable_data_ptr() unsafe function.
2024-01-02 08:51:51 +01:00
Amaury Denoyelle
b7274e69ef Revert "MINOR: mux-quic: Disable zero-copy forwarding for send by default"
This reverts commit 18f2ccd244.

Found issues related to QUIC fast-forward were resolved (see github
issue #2372). Re-enable it by default. If any issue arises, it can be
disabled using the global statement:
  tune.quic.zero-copy-fwd-send off

This can be backported to 2.9, but only after a sensible period of
observation.
2023-12-22 16:30:37 +01:00
Amaury Denoyelle
cfa6d4cdd0 BUG/MINOR: h3: disable fast-forward on buffer alloc failure
If QCS Tx buffer cannot be allocated in nego_ff callback, disable
fast-forward for this connection and return immediately. If snd_buf is
later used but still no buffer can be allocated, the
connection will be closed on error.

This should fix the Coverity report in github issue #2390.

This should be backported up to 2.9.
2023-12-22 16:14:23 +01:00
Amaury Denoyelle
2144d24186 BUG/MINOR: h3: close connection on sending alloc errors
When encoding new HTTP/3 frames, QCS Tx buffer must be allocated if
currently NULL. Previously, allocation failure was not properly checked,
leaving the connection in an unspecified state, or worse risking a
crash.

Fix this by setting <h3c.err> to H3_INTERNAL_ERROR each time the
allocation fails. This will stop sending and close the connection. In
the future, it may be better to put the connection on pause waiting for
allocation to succeed but this is too complicated to implement for now
in a reliable way.

Along with the current change, the return value of all HTX parsing functions
(h3_resp_*_send) was set to a negative value in case of error. A new
BUG_ON() in h3_snd_buf() ensures that if such a value is returned,
either a connection error is registered (via <h3c.err>) or the buffer is
temporarily full (flag QC_SF_BLK_MROOM).
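
As a purely illustrative, self-contained sketch of that invariant (hypothetical
names and simplified types, not the actual haproxy code), the sending loop and
the new BUG_ON() boil down to something like:

  #include <assert.h>
  #include <stddef.h>

  #define BUG_ON(cond) assert(!(cond))

  struct fake_conn {
      int err;        /* connection-level error code, 0 = none */
      int blk_mroom;  /* output buffer temporarily full */
  };

  /* pretend-encoder: returns the encoded size, or a negative value when it
   * cannot make progress (blocked or fatal error)
   */
  static int encode_frame(struct fake_conn *c, size_t len)
  {
      if (len > 4096) {          /* not enough room: non-fatal, retry later */
          c->blk_mroom = 1;
          return -1;
      }
      if (!len) {                /* pretend this is a fatal encoding error */
          c->err = 0x102;        /* e.g. an "internal error" code */
          return -1;
      }
      return (int)len;
  }

  static void snd_loop(struct fake_conn *c, const size_t *frames, int nb)
  {
      int i;

      for (i = 0; i < nb; i++) {
          int ret = encode_frame(c, frames[i]);

          if (ret < 0) {
              /* a negative return is only legal if a connection error was
               * registered or the buffer is temporarily full
               */
              BUG_ON(!c->err && !c->blk_mroom);
              break;
          }
      }
  }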

This should fix github issue #2389.

This should be backported up to 2.6. Note that qcc_get_stream_txbuf()
does not exist in 2.9 and below. mux_get_buf() is its equivalent. An
explicit check b_is_null(&qcs.tx.buf) should be used there.
2023-12-22 16:02:49 +01:00
Amaury Denoyelle
d077f7ccf4 BUG/MINOR: h3: close connection on header list too big
When parsing a HTX response, if too many headers are present, stop
sending and close the connection with error code H3_INTERNAL_ERROR.

Previously, no error was reported despite the interruption of header
parsing. This caused an infinite loop. However, this is considered
minor as it happens on the response path, on the backend side.

This should be backported up to 2.6.
It relies on previous commit
  "MINOR: h3: check connection error during sending".
2023-12-22 15:43:39 +01:00
Amaury Denoyelle
642016ce03 MINOR: h3: check connection error during sending
If an error occurs during HTX to H3 encoding, h3_snd_buf() should be
interrupted. This commit adds this possibility by checking the <h3c.err>
member value. If non-null, the sending loop is stopped and an error is
reported using qcc_set_error().

This commit does not change any behavior for now, as <h3c.err> is never
set during sending. However, this will change in future commits, most
notably to reject too many headers or handle buffer allocation failure.
As such, this commit should be backported along the following fixes.
Note that in 2.6 qcc_set_error() does not exist and must be replaced by
qcc_emit_cc_app().
2023-12-22 15:40:11 +01:00
Frédéric Lécaille
10e96fcd17 BUG/MINOR: quic: Missing call to TLS message callbacks
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).

The TLS capture of information from client hello enabled by
tune.ssl.capture-buffer-size could not work with USE_QUIC_OPENSSL_COMPAT. This
is due to the fact that the callback set for this feature was replaced by
quic_tls_compat_msg_callback(). In fact this callback must be registered by
ssl_sock_register_msg_callback(), as is done for the TLS client hello capture.
A call to this function appends the function passed as parameter to a list of
callbacks to be called when the TLS stack parses a TLS message.
quic_tls_compat_msg_callback() had to be modified to return early if it is
called for a non-QUIC TLS session.

Must be backported to 2.8.
2023-12-21 16:33:06 +01:00
Frédéric Lécaille
b26f6fb0cb BUG/MINOR: quic: Wrong keylog callback setting.
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).

To make this module work, the quic_tls_compat_keylog_callback() function must be
set as the keylog callback, or at least be called by another keylog callback.
This is what SSL_CTX_keylog() was supposed to do. In addition to exporting the TLS
secrets via sample fetches, the latter also calls quic_tls_compat_keylog_callback()
when compiled with USE_QUIC_OPENSSL_COMPAT defined.

Before this patch, SSL_CTX_keylog() was replaced by quic_tls_compat_keylog_callback()
and the TLS secrets were no longer exported by sample fetches.

Must be backported to 2.8.
2023-12-21 16:26:31 +01:00
Amaury Denoyelle
19f4f4d890 BUG/MINOR: mux-quic: disable fast-fwd if connection on error
Add a check on nego_ff to ensure the connection is not on error. If this is
the case, fast-forward is disabled to prevent unnecessary sending. If
snd_buf is later called, stconn will be notified of the error to
interrupt the stream.

This check is necessary to ensure snd_buf and nego_ff are consistent.
Note that previously, if fast-forward was conducted even on connection
error, no sending would occur as qcc_io_send() also checks these flags.
However, there is a risk that stconn is never notified of the error
status, thus it is considered a bug.

Its impact is minimal for now as fast-forward is disabled by default on
QUIC. By fixing it, it should be possible to reactivate it soon.

This should be backported up to 2.9.
2023-12-21 15:42:08 +01:00
Amaury Denoyelle
235e8f1afd MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS
Previously, if a snd_buf operation was conducted despite the QCS being
already locally closed, the input buffer was silently dropped. This situation
could happen if a RESET_STREAM was emitted but its emission was not reported
to the stream layer. Silently resetting the buffer ensured the QUIC MUX
remained compliant with RFC 9000, which forbids emission after RESET_STREAM.

Since previous commit, it is now ensured that RESET_STREAM sending will
always be reported to stream-layer. Thus, there is no need anymore to
silently reset the buffer. A BUG_ON() statement is added to ensure this
assumption will remain valid.

The new code is deemed cleaner as it does not hide a missing error
notification on the stconn-layer. Previously, if an error was missing,
sending would continue unnecessarily with a false success status
reported for the stream.

Note that the BUG_ON() statement was also added into nego_ff callback.
This is necessary to ensure both sending path remains consistent.

This patch is labelled as MEDIUM as issues were already encountered in
the snd_buf/nego_ff implementation and it's not easy to cover all occurrences
during tests. If the BUG_ON() is triggered without any apparent
stream-layer issue, this commit should be reverted.
2023-12-21 15:42:08 +01:00
Amaury Denoyelle
0a69750a98 BUG/MINOR: mux-quic: always report error to SC on RESET_STREAM emission
On RESET_STREAM emission, the stream Tx channel is closed. This event
must be reported to stream-conn layer to interrupt future send
operations.

Previously, se_fl_set_error() was manually invoked before/after
qcc_reset_stream(). Change this by moving the se_fl_set_error() invocation
into the latter. This ensures that the notification won't be forgotten, most
notably in the HTTP/3 layer.

In most cases, the behavior should be identical as both functions were
called together except when not necessary. However, there is one exception
which could cause a RESET_STREAM emission without error notification:
this happens on an H3 trailers parsing error. All other H3 errors happen
before the stream-layer creation and thus the error is notified on
stream creation. This regression was caused by the following patch:

  152beeec34
  MINOR: mux-quic: report error on stream-endpoint earlier

Thus it should be backported up to 2.7.

Note that the case described above did not cause any crash or protocol
error. This is because currently MUX QUIC snd_buf operation silently
reset buffer on transmission if QCS is already closed locally. This will
however be removed in a future commit so the current patch is necessary
to prevent an invalid behavior.
2023-12-21 15:42:08 +01:00
Aurelien DARRAGON
ca47583787 MINOR: map: add map_*_key converters to provide the matching key
All map_*_ converters now have an additional output type: key. Such
converters will return the matched entry's key (as found in the map file)
as a string instead of the value.

Consider this example map file:
 |example.com value1
 |haproxy value2

With the above map file:

str(test.example.com/url),map_dom_key(file.map) will return "example.com"
str(running haproxy),map_sub_key(file.map) will return "haproxy"

This should address GH #1446.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
9b2717e7bb MINOR: stktable: use {show,set,clear} table with ptr
This patch adds support for an optional ptr (0xffff form) instead of the key
argument to match against existing sticktable entries, ie: if the key is
empty or cannot be matched on the cli due to incompatible characters.
Lookup is performed using a linear search so it will be slower than key
search which relies on eb tree lookup.

Example:

set table mytable key mykey data.gpc0 1

show table mytable
> 0x7fbd00032bd8: key=mykey use=0 exp=86373242 shard=0 gpc0=1

clear table mytable ptr 0x7fbd00032bd8

This patch depends on:
 - "MINOR: stktable: add table_process_entry helper function"

It should solve GH #2118
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
6ee3923c52 MINOR: stktable: add table_process_entry helper function
Only keep key-related logic in table_process_entry_per_key() function,
and then use table_process_entry() function that takes an entry pointer
as argument to process the entry.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
f6ae25858d MINOR: peers: rely on srv->addr and remove peer->addr
Similarly to the previous commit, we get rid of unused peer member.

peer->addr was only used to save a copy of the server's addr at parsing
time. But instead of relying on an intermediate variable, we can actually
use the server's address directly when initiating the peer session.

As with other streams created from server's settings (tcp/http, log, ring),
we should rely on srv->svc_port for the port part of the address. This
shouldn't change anything for peers since the address is fully resolved
at parsing time and runtime changes are not supported, but this should
help to make the code future-proof.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
372d3e2934 CLEANUP: peers: remove unused "proto" and "xprt" struct members
peer->proto and peer->xprt struct members are now pure legacy: they are
only set during parsing but never used afterwards.

This is due to commit 02efedac ("MINOR: peers: now remove the remote
connection setup code") which made some cleanup in the past, but the
unused proto and xprt members were probably left behind by mistake.

Since we don't have valid uses for them, we remove them.

Also, peer_xprt() helper function was removed since it was related to
peer->xprt struct member.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334caefaaa CLEANUP: peers: remove unused sock_init_arg struct member
Since be0688c6 ("MEDIUM: stream_interface: remove the si->init"),
sock_init_arg is completely useless (set but never used later), thus
we remove it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
c5cace3100 BUG/MEDIUM: server/dns: perform svc_port updates atomically from SRV records
This was the last missing bit from cd994407a ("BUG/MAJOR: server/addr:
fix a race during server addr:svc_port updates")

Indeed, despite the fix, svc_port updates from resolvers were still
directly performed on the server's struct.

Now they make proper use of the server_set_inetaddr() function so the port
change (+ optional addr change with AR) will be propagated atomically.

This patch depends on:
 - "MINOR: server: ensure connection cleanup on server addr changes"
 - "CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event"
 - "MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic"
 - "MEDIUM: server: make server_set_inetaddr() updater serializable"
 - "MINOR: server/event_hdl: expose updater info through INETADDR event"
 - "MINOR: server: add dns hint in server_inetaddr_updater struct"
 - "MEDIUM: server/dns: clear RMAINT when addr resolves again"

While it could be backported in 2.9 with cd994407a ("BUG/MAJOR:
server/addr: fix a race during server addr:svc_port updates") to ensure
addr and svc_port updates performed by resolver's code comply with the
API taking care of pushing the update (and thus avoid any race), some
patch dependencies are quite sensitive so it's probably best to avoid
backporting for no good reason, or at least wait for it to be considered
stable to prevent any breakages.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
64c9c8ef39 BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS
As seen before, server's addr and svc_port should not be updated directly
during runtime, because even if the update is performed under the lock,
some competing threads might be reading ->addr and ->svc_port without
the lock because they simply cannot afford it.

To prevent races with such competing threads, server's addr and port
should only be updated using server_set_inetaddr() function or similar.

This patch depends on:
 - "MINOR: server: ensure connection cleanup on server addr changes"
 - "CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event"
 - "MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic"
 - "MEDIUM: server: make server_set_inetaddr() updater serializable"
 - "MINOR: server/event_hdl: expose updater info through INETADDR event"
 - "MINOR: server: add dns hint in server_inetaddr_updater struct"
 - "MEDIUM: server/dns: clear RMAINT when addr resolves again"

While it could be backported in 2.9 with cd994407a ("BUG/MAJOR:
server/addr: fix a race during server addr:svc_port updates") to ensure
addr and svc_port reset performed by resolver's code comply with the
API taking care of pushing the update (and thus avoid any race), some
patch dependencies are quite sensitive so it's probably best to avoid
backporting for no good reason, or at least wait for it to be considered
stable to prevent any breakages.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334ebfa1a2 MEDIUM: server/dns: clear RMAINT when addr resolves again
snr_update_srv_status() and srvrq_update_srv_status() will both set or
clear the server RMAINT state depending on the result of the current dns
resolution.

This used to work pretty well in the past, but now that addr:svc_port
changes are applied atomically through a dedicated task, the change is
performed asynchronously, so this can cause some flapping issues if the
server is put out of maintenance while the server's address is still
unassigned.

To prevent errors, the resolver's code is now only allowed to put the
server under maintenance but not to remove it from maintenance:

the decision to remove a server from maintenance is performed by the task
responsible for updating the server's addr: if the addr resolves again
thanks to a valid DNS resolution and the server was previously under
RMAINT, then it is cleared from the RMAINT state.

srvrq_update_srv_status() was renamed srvrq_set_srv_down(), since it is
only called to put the server in maintenance as a result of a failing
SRV entry.

snr_update_srv_status() was renamed snr_set_srv_down() and slightly
modified so that it only takes care of putting the server under
maintenance when needed.

The cli command "set server x/y addr" does not need to remove the RMAINT
flag anymore.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
33cd676e9e MINOR: server/event_hdl: expose updater info through INETADDR event
Thanks to the previous commit, we can now expose updater info through
INETADDR event.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
3ac79b504a MEDIUM: server: make server_set_inetaddr() updater serializable
The server_set_inetaddr() updater argument is a simple char * string
containing information about the caller responsible for the update.

In this patch, we try to make this argument serializable, that is, make
it so that we can easily export it without having to keep the original
pointer passed by the caller or having to work with strings of variable
lengths.

This was a prerequisite for exposing more updater information through
SERVER_INETADDR event (upcoming patch).

Static strings were simply mapped to a fixed ID that can be converted back
to a string when needed using server_inetaddr_updater_by_to_str(). One
special case was made for the SERVER_INETADDR_UPDATER_DNS_RESOLVER updater
since in this case the updater hint has to be generated from the
corresponding resolver id / nameserver id combination. This was achieved
by saving the nameserver id within the updater struct. Knowing that the
resolver id can be guessed from the server struct directly, it was not
exposed through the updater struct.
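
As a rough, hypothetical illustration of the idea (simplified stand-in types,
not the actual server_inetaddr_updater definition), a serializable updater can
be reduced to a fixed numeric ID plus the nameserver id:

  #include <stdio.h>
  #include <stddef.h>

  enum example_updater_id {
      EXAMPLE_UPDATER_NONE = 0,
      EXAMPLE_UPDATER_CLI,          /* static string mapped to a fixed ID */
      EXAMPLE_UPDATER_DNS_RESOLVER, /* needs the nameserver id for its hint */
  };

  struct example_updater {
      enum example_updater_id id;   /* fixed numeric ID instead of a char * */
      unsigned int ns_puid;         /* only meaningful for the DNS resolver case */
  };

  static const char *example_updater_to_str(const struct example_updater *u,
                                            char *buf, size_t len)
  {
      switch (u->id) {
      case EXAMPLE_UPDATER_CLI:
          return "cli";
      case EXAMPLE_UPDATER_DNS_RESOLVER:
          snprintf(buf, len, "DNS resolver (nameserver #%u)", u->ns_puid);
          return buf;
      default:
          return "unknown";
      }
  }

Because such a struct only contains plain integers, it can be copied or
embedded into an event payload without worrying about the lifetime of a
caller-provided string.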

This patch depends on:
 - "MINOR: resolvers: add unique numeric id to nameservers"

No functional change should be expected.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2f6120d6d4 MINOR: resolvers: add unique numeric id to nameservers
When we want to avoid keeping pointers to a nameserver struct, it's not
always convenient to refer to a nameserver using its text-based unique
identifier, since it is not limited in length and thus cannot be serialized
and deserialized safely.

To address this limitation, we add a new ->puid member in dns_nameserver
struct which is a parent-unique numeric value that can be used to refer
to the dns nameserver within its parent resolver context.

To achieve this, we reused the resolver->nb_nameserver member that wasn't
used. Each time we add a new nameserver to a resolver: we set ns->puid to
the current number of nameservers within the resolver and we increment
this number right away.
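
For illustration, the assignment scheme described above is essentially the
following (hypothetical, simplified structures, not the actual dns_nameserver
and resolvers definitions):

  /* Hypothetical simplified structures, for illustration only. */
  struct example_ns {
      unsigned int puid;            /* parent-unique id within the resolver */
      struct example_ns *next;
  };

  struct example_resolver {
      unsigned int nb_nameservers;  /* doubles as the next puid to hand out */
      struct example_ns *ns_list;
  };

  static void resolver_add_ns(struct example_resolver *r, struct example_ns *ns)
  {
      ns->puid = r->nb_nameservers++; /* puid = current count, then increment */
      ns->next = r->ns_list;
      r->ns_list = ns;
  }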

Public helper function find_nameserver_by_resolvers_and_id() was added to
help retrieve nameserver pointer from (resolver X nameserver puid)
combination.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
ab6fef4882 CLEANUP: server: remove unused server_parse_addr_change_request() function
server_parse_addr_change_request() was completely replaced by the newer
srv_update_addr_port() function. Considering the function doesn't offer
anything that srv_update_addr_port() cannot do, we simply
remove it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
f1f4b93a67 MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic
Both functions perform similar tasks, except that the _port()
version does a bit more work.

In this patch, we add the server_set_inetaddr() function that works like
the srv_update_addr_port() but it takes parsed inputs instead of raw
strings as arguments.

Then, server_set_inetaddr() is used as underlying helper function for
both srv_update_addr() and srv_update_addr_port() to make them easier
to maintain.

Also, helper functions were added:
 - server_set_inetaddr_warn() -> same as server_set_inetaddr() but report
   a warning on updates.
 - server_get_inetaddr() -> fills a struct server_inetaddr from srv

Since the feedback message generation part was slightly reworked, some
minor changes in the way addr:svc_port updates are reported in the logs
or cli messages should be expected (no loss of information though).
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2d0c7f5935 CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event
Now that the purge_conn hint is ignored thanks to the previous commit,
we can simply get rid of it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2e3a163e47 MINOR: server: ensure connection cleanup on server addr changes
Previously, in srv_update_addr_port(), we forced connection cleanup on
server changes.
This was done in 6318d33ce ("BUG/MEDIUM: connections: force connections
cleanup on server changes").

However, there is no reason we shouldn't have done the same in
srv_update_addr() function, because the end goal is the same: perform
runtime changes on server's address.

The purge_conn hint propagated through the INETADDR server event was
simply there to keep the original behavior (only purge the connection
for events originating from srv_update_addr_port()), but to ensure the
address change is handled the same way for both code paths, we simply
ignore this hint.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
545e72546c BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
server addr:svc_port updates during runtime might set or clear the
SRV_F_MAPPORTS flag. Unfortunately, the flag update is still directly
performed by srv_update_addr_port() function while the addr:svc_port
update is being scheduled for atomic update. Given that existing readers
don't take server's lock to read addr:svc_port, they also check the
SRV_F_MAPPORTS flag right after without the lock.

So we could cause the readers to incorrectly interpret the svc_port from
the server struct because the mapport information is not published
atomically, resulting in inconsistencies between svc_port and the mapport
flag. (The MAPPORTS flag causes svc_port to be used differently by the reader.)

To fix this, we publish the mapport information within the INETADDR server
event and we let the task responsible for updating the server's addr and port
set or clear the flag depending on the mapport hint.
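
The haproxy fix itself goes through the dedicated server inetaddr update task,
but as a generic illustration (hypothetical names, not the actual code) of why
related fields must be published together for lockless readers, the principle
is to swap in a single consistent snapshot:

  #include <stdatomic.h>
  #include <stdlib.h>

  /* A consistent (svc_port, mapports) snapshot is swapped in atomically so
   * lockless readers never observe a torn pair.
   */
  struct addr_state {
      unsigned short svc_port;
      int mapports;                 /* changes how svc_port is interpreted */
  };

  static struct addr_state initial_state;
  static _Atomic(struct addr_state *) cur_state = &initial_state;

  static int publish_addr(unsigned short port, int mapports)
  {
      struct addr_state *st = malloc(sizeof(*st));

      if (!st)
          return -1;
      st->svc_port = port;
      st->mapports = mapports;
      atomic_store(&cur_state, st); /* old snapshot deliberately leaked here */
      return 0;
  }

  static void read_addr(unsigned short *port, int *mapports)
  {
      struct addr_state *st = atomic_load(&cur_state);

      *port = st->svc_port;         /* both fields come from the same snapshot */
      *mapports = st->mapports;
  }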

This patch depends on:
 - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
 - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype

This should be backported in 2.9 with 683b2ae01 ("MINOR: server/event_hdl:
add SERVER_INETADDR event")
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
4e50c31eab MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype
Slightly change _srv_event_hdl_prepare_inetaddr() function prototype to
reduce the input arguments by learning some settings directly from the
server. Also taking this opportunity to make the function static inline
since it's relatively simple and not meant to be used directly.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
835263047e OPTIM: server: ebtree lookups for findserver_unique_* functions
4e5e2664 ("MINOR: proxy: add findserver_unique_id() and findserver_unique_name()")
added the findserver_unique_id() and findserver_unique_name() functions that
were inspired by the historical findserver() function, so unfortunately
they don't perform well when used on large backend farms because they scan
the whole server list linearly.

I was about to provide a patch to optimize such functions when I stumbled
on Baptiste's work:
  19a106d24 ("MINOR: server: server_find functions: id, name, best_match")

It turns out Baptiste already implemented helper functions to supersede
the unoptimized findserver() function (at least at runtime when servers
have been assigned their final IDs and inserted in the lookup trees): they
offer more matching options and rely on eb lookups so they are much more
suitable for fast queries. I don't know how I missed that, but they are a
perfect base for the server rid matching functions.

So in this patch, we essentially revert 4e5e2664 to provide the optimized
equivalent functions named server_find_by_id_unique() and
server_find_by_name_unique(), then we force existing findserver_unique_*()
callers to switch to the new functions.

This patch depends on:
 - "OPTIM: server: eb lookup for server_find_by_name()"

This could be backported up to 2.8.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
4bcfe30414 OPTIM: server: eb lookup for server_find_by_name()
server_find_by_name() function was added in 19a106d24 ("MINOR: server:
server_find functions: id, name, best_match").

At that time, only the used_server_id proxy tree was available, thus the
name lookup was performed as a linear search.

However, used_server_name proxy tree was added in 84d6046a ("MINOR: proxy:
Add a "server by name" tree to proxy."), so we may safely rely on it to
perform server name lookups now. This will hopefully make the function
quite faster, especially when performing lookups in huge backend farms.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
e35fa36360 MINOR: proxy: monitor-uri works with tcp->http upgrades
Currently, we have a check in proxy_cfg_ensure_no_http() that generates a
warning if the monitor-uri is configured on a proxy that doesn't have
mode HTTP enabled.

However, when we take a look at the monitor-uri implementation, it's not
100% correct. Indeed, despite the warning message, the directive will
still be evaluated when an HTTP upgrade occurs from a TCP frontend.
Thus the warning is misleading.

To make the error message comply with the actual behavior, the check was
moved alongside other checks that accept both native HTTP mode or HTTP
upgrades in cfgparse.c.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
8a6cc6e3ea MEDIUM: proxy: set PR_O_HTTP_UPG on implicit upgrades
When a TCP frontend uses an HTTP backend, the stream is automatically
upgraded and it results in a similar behavior as if a switch-mode http
rule was evaluated since stream_set_http_mode() gets called in both
situations and minimal HTTP analyzers are set.

In the current implementation, some postparsing checks are generating
errors or warnings when the frontend is in TCP mode with some HTTP options
set and no upgrade is expected (no switch-mode http rule). But as you can guess,
unfortunately this leads to issues when such HTTP-only options are used
in a frontend that has implicit switching rules (that is, when the
frontend uses an HTTP backend for example), because in this case the
PR_O_HTTP_UPG will not be set, so the postparsing checks will consider
that some options are not relevant and will raise some warnings.

Consider the following example:

  backend back
    mode http
    server s1 git.haproxy.org:80
  frontend front
    mode tcp
    bind localhost:8080
    http-request set-var(txn.test) str(TRUE),debug(WORKING,stderr)
    use_backend back

By starting an haproxy instance with the above example conf, we end up
having this warning:

  [WARNING]  (400280) : config : 'http-request' rules ignored for frontend 'front' as they require HTTP mode.

However, by making a request on the frontend, we notice that the request
rules are still executed, and that's because the stream is effectively
upgraded as a result of an implicit upgrade:

  [debug] WORKING: type=str <TRUE>

So this confirms the previous description: since implicit and explicit
upgrades result in approximately the same behavior on the frontend side,
we should consider them both when doing postparsing checks.

This is what we try to address in the following commit: PR_O_HTTP_UPG
flag is now more generic in the sense that it refers to either implicit
(through default_backend or use_backend rules) or explicit (switch-mode
rules) upgrades. Indeed, every time an HTTP or dynamic backend (where the
mode cannot be assumed during parsing) is encountered in a default_backend
directive or use_backend rules, we explicitly set the upgrade flag
so that further checks that depend on the proxy being in HTTP context
don't report false warnings.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
64b7d8e173 BUG/MEDIUM: stats: unhandled switching rules with TCP frontend
Consider the following configuration:

  backend back
    mode http

  frontend front
    mode tcp
    bind localhost:8080
    stats enable
    stats uri /stats
    tcp-request content switch-mode http if FALSE
    use_backend back

Firing a request to /stats on the haproxy process started with the above
configuration will cause a segfault in http_handle_stats().

The cause of the crash is that in this case, the upgrade doesn't simply
switch to HTTP mode, but also changes the stream backend (changing from
the frontend itself to the targeted HTTP backend).

However, there is an inconsistency in the stats logic between the check
for the stats URI and the actual handling of the stats page.

Indeed, http_stats_check_uri() checks uri parameters from the proxy
undergoing the http analyzers, whereas http_handle_stats() uses s->be
instead.

During stream analysis, from the frontend perspective: s->be defaults to
the frontend. But if the frontend is in TCP mode and the stream is
upgraded to HTTP via backend switching rules, then s->be will be assigned
to the actual HTTP-capable backend in stream_set_backend().

What this means is that when the http analyzer first checks if the current
URI matches the one from the "stats uri" directive, it will check against
the "stats uri" directive from the frontend, but later since the stats
handlers reads the uri from s->be it wil actually use the value from the
backend and the previous safety checks are thus garbage, resulting in
unexpected behavior. (In our test case since the backend didn't define
"stats uri" it is set to NULL, and http_handle_stats() dereferences it)

To fix this, we should ensure that prechecks and actual stats processing
always rely on the same proxy source for stats config directives.

This is what is done in this patch, thanks to the previous commit, since
we can make sure that the stat applet will use ->http_px as its parent
proxy. So here we simply propagate the current proxy being analyzed
through all the stats processing functions.

This patch depends on:
 - MINOR: stats: store the parent proxy in stats ctx (http)

It should be backported up to 2.4.
For 2.4: the fix is less trivial since stats ctx was directly stored
within the applet struct at that time, so this alternative patch must be
used instead (without "MINOR: stats: store the parent proxy in stats ctx
(http)" dependency):

    diff --git a/include/haproxy/applet-t.h b/include/haproxy/applet-t.h
    index 014e01ed9..1d9a63359 100644
    --- a/include/haproxy/applet-t.h
    +++ b/include/haproxy/applet-t.h
    @@ -121,6 +121,7 @@ struct appctx {
     		 * keep the grouped together and avoid adding new ones.
     		 */
     		struct {
    +			struct proxy *http_px;  /* parent proxy of the current applet (only relevant for HTTP applet) */
     			void *obj1;             /* context pointer used in stats dump */
     			void *obj2;             /* context pointer used in stats dump */
     			uint32_t domain;        /* set the stats to used, for now only proxy stats are supported */
    diff --git a/src/http_ana.c b/src/http_ana.c
    index b557da89d..1025d7711 100644
    --- a/src/http_ana.c
    +++ b/src/http_ana.c
    @@ -63,8 +63,8 @@ static enum rule_result http_req_restrict_header_names(struct stream *s, struct
     static void http_manage_client_side_cookies(struct stream *s, struct channel *req);
     static void http_manage_server_side_cookies(struct stream *s, struct channel *res);

    -static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *backend);
    -static int http_handle_stats(struct stream *s, struct channel *req);
    +static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *px);
    +static int http_handle_stats(struct stream *s, struct channel *req, struct proxy *px);

     static int http_handle_expect_hdr(struct stream *s, struct htx *htx, struct http_msg *msg);
     static int http_reply_100_continue(struct stream *s);
    @@ -428,7 +428,7 @@ int http_process_req_common(struct stream *s, struct channel *req, int an_bit, s
     		}

     		/* parse the whole stats request and extract the relevant information */
    -		http_handle_stats(s, req);
    +		http_handle_stats(s, req, px);
     		verdict = http_req_get_intercept_rule(px, &px->uri_auth->http_req_rules, s);
     		/* not all actions implemented: deny, allow, auth */

    @@ -3959,16 +3959,16 @@ void http_check_response_for_cacheability(struct stream *s, struct channel *res)

     /*
      * In a GET, HEAD or POST request, check if the requested URI matches the stats uri
    - * for the current backend.
    + * for the current proxy.
      *
      * It is assumed that the request is either a HEAD, GET, or POST and that the
      * uri_auth field is valid.
      *
      * Returns 1 if stats should be provided, otherwise 0.
      */
    -static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *backend)
    +static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *px)
     {
    -	struct uri_auth *uri_auth = backend->uri_auth;
    +	struct uri_auth *uri_auth = px->uri_auth;
     	struct htx *htx;
     	struct htx_sl *sl;
     	struct ist uri;
    @@ -4003,14 +4003,14 @@ static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct p
      * s->target which is supposed to already point to the stats applet. The caller
      * is expected to have already assigned an appctx to the stream.
      */
    -static int http_handle_stats(struct stream *s, struct channel *req)
    +static int http_handle_stats(struct stream *s, struct channel *req, struct proxy *px)
     {
     	struct stats_admin_rule *stats_admin_rule;
     	struct stream_interface *si = &s->si[1];
     	struct session *sess = s->sess;
     	struct http_txn *txn = s->txn;
     	struct http_msg *msg = &txn->req;
    -	struct uri_auth *uri_auth = s->be->uri_auth;
    +	struct uri_auth *uri_auth = px->uri_auth;
     	const char *h, *lookup, *end;
     	struct appctx *appctx;
     	struct htx *htx;
    @@ -4020,6 +4020,7 @@ static int http_handle_stats(struct stream *s, struct channel *req)
     	memset(&appctx->ctx.stats, 0, sizeof(appctx->ctx.stats));
     	appctx->st1 = appctx->st2 = 0;
     	appctx->ctx.stats.st_code = STAT_STATUS_INIT;
    +	appctx->ctx.stats.http_px = px;
     	appctx->ctx.stats.flags |= uri_auth->flags;
     	appctx->ctx.stats.flags |= STAT_FMT_HTML; /* assume HTML mode by default */
     	if ((msg->flags & HTTP_MSGF_VER_11) && (txn->meth != HTTP_METH_HEAD))
    diff --git a/src/stats.c b/src/stats.c
    index d1f3daa98..1f0b2bff7 100644
    --- a/src/stats.c
    +++ b/src/stats.c
    @@ -2863,9 +2863,9 @@ static int stats_dump_be_stats(struct stream_interface *si, struct proxy *px)
     	return stats_dump_one_line(stats, stats_count, appctx);
     }

    -/* Dumps the HTML table header for proxy <px> to the trash for and uses the state from
    - * stream interface <si> and per-uri parameters <uri>. The caller is responsible
    - * for clearing the trash if needed.
    +/* Dumps the HTML table header for proxy <px> to the trash and uses the state from
    + * stream interface <si>. The caller is responsible for clearing the trash if
    + * needed.
      */
     static void stats_dump_html_px_hdr(struct stream_interface *si, struct proxy *px)
     {
    @@ -3015,17 +3015,19 @@ static void stats_dump_html_px_end(struct stream_interface *si, struct proxy *px
      * input buffer. Returns 0 if it had to stop dumping data because of lack of
      * buffer space, or non-zero if everything completed. This function is used
      * both by the CLI and the HTTP entry points, and is able to dump the output
    - * in HTML or CSV formats. If the later, <uri> must be NULL.
    + * in HTML or CSV formats.
      */
     int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
    -			       struct proxy *px, struct uri_auth *uri)
    +			       struct proxy *px)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
    -	struct stream *s = si_strm(si);
     	struct channel *rep = si_ic(si);
     	struct server *sv, *svs;	/* server and server-state, server-state=server or server->track */
     	struct listener *l;
    +	struct uri_auth *uri = NULL;

    +	if (appctx->ctx.stats.http_px)
    +		uri = appctx->ctx.stats.http_px->uri_auth;
     	chunk_reset(&trash);

     	switch (appctx->ctx.stats.px_st) {
    @@ -3045,7 +3047,7 @@ int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
     					break;

     				/* match '.' which means 'self' proxy */
    -				if (strcmp(scope->px_id, ".") == 0 && px == s->be)
    +				if (strcmp(scope->px_id, ".") == 0 && px == appctx->ctx.stats.http_px)
     					break;
     				scope = scope->next;
     			}
    @@ -3227,10 +3229,16 @@ int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
     }

     /* Dumps the HTTP stats head block to the trash for and uses the per-uri
    - * parameters <uri>. The caller is responsible for clearing the trash if needed.
    + * parameters from the parent proxy. The caller is responsible for clearing
    + * the trash if needed.
      */
    -static void stats_dump_html_head(struct appctx *appctx, struct uri_auth *uri)
    +static void stats_dump_html_head(struct appctx *appctx)
     {
    +	struct uri_auth *uri;
    +
    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* WARNING! This must fit in the first buffer !!! */
     	chunk_appendf(&trash,
     	              "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"\n"
    @@ -3345,17 +3353,21 @@ static void stats_dump_html_head(struct appctx *appctx, struct uri_auth *uri)
     }

     /* Dumps the HTML stats information block to the trash for and uses the state from
    - * stream interface <si> and per-uri parameters <uri>. The caller is responsible
    - * for clearing the trash if needed.
    + * stream interface <si> and per-uri parameters from the parent proxy. The caller
    + * is responsible for clearing the trash if needed.
      */
    -static void stats_dump_html_info(struct stream_interface *si, struct uri_auth *uri)
    +static void stats_dump_html_info(struct stream_interface *si)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	unsigned int up = (now.tv_sec - start_date.tv_sec);
     	char scope_txt[STAT_SCOPE_TXT_MAXLEN + sizeof STAT_SCOPE_PATTERN];
     	const char *scope_ptr = stats_scope_ptr(appctx, si);
    +	struct uri_auth *uri;
     	unsigned long long bps = (unsigned long long)read_freq_ctr(&global.out_32bps) * 32;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* Turn the bytes per second to bits per second and take care of the
     	 * usual ethernet overhead in order to help figure how far we are from
     	 * interface saturation since it's the only case which usually matters.
    @@ -3629,8 +3641,7 @@ static void stats_dump_json_end()
      * a pointer to the current server/listener.
      */
     static int stats_dump_proxies(struct stream_interface *si,
    -                              struct htx *htx,
    -                              struct uri_auth *uri)
    +                              struct htx *htx)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct channel *rep = si_ic(si);
    @@ -3650,7 +3661,7 @@ static int stats_dump_proxies(struct stream_interface *si,
     		px = appctx->ctx.stats.obj1;
     		/* skip the disabled proxies, global frontend and non-networked ones */
     		if (!px->disabled && px->uuid > 0 && (px->cap & (PR_CAP_FE | PR_CAP_BE))) {
    -			if (stats_dump_proxy_to_buffer(si, htx, px, uri) == 0)
    +			if (stats_dump_proxy_to_buffer(si, htx, px) == 0)
     				return 0;
     		}

    @@ -3666,14 +3677,12 @@ static int stats_dump_proxies(struct stream_interface *si,
     }

     /* This function dumps statistics onto the stream interface's read buffer in
    - * either CSV or HTML format. <uri> contains some HTML-specific parameters that
    - * are ignored for CSV format (hence <uri> may be NULL there). It returns 0 if
    - * it had to stop writing data and an I/O is needed, 1 if the dump is finished
    - * and the stream must be closed, or -1 in case of any error. This function is
    - * used by both the CLI and the HTTP handlers.
    + * either CSV or HTML format. It returns 0 if it had to stop writing data and
    + * an I/O is needed, 1 if the dump is finished and the stream must be closed,
    + * or -1 in case of any error. This function is used by both the CLI and the
    + * HTTP handlers.
      */
    -static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *htx,
    -				     struct uri_auth *uri)
    +static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *htx)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct channel *rep = si_ic(si);
    @@ -3688,7 +3697,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht

     	case STAT_ST_HEAD:
     		if (appctx->ctx.stats.flags & STAT_FMT_HTML)
    -			stats_dump_html_head(appctx, uri);
    +			stats_dump_html_head(appctx);
     		else if (appctx->ctx.stats.flags & STAT_JSON_SCHM)
     			stats_dump_json_schema(&trash);
     		else if (appctx->ctx.stats.flags & STAT_FMT_JSON)
    @@ -3708,7 +3717,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht

     	case STAT_ST_INFO:
     		if (appctx->ctx.stats.flags & STAT_FMT_HTML) {
    -			stats_dump_html_info(si, uri);
    +			stats_dump_html_info(si);
     			if (!stats_putchk(rep, htx, &trash))
     				goto full;
     		}
    @@ -3733,7 +3742,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht
     		case STATS_DOMAIN_PROXY:
     		default:
     			/* dump proxies */
    -			if (!stats_dump_proxies(si, htx, uri))
    +			if (!stats_dump_proxies(si, htx))
     				return 0;
     			break;
     		}
    @@ -4112,11 +4121,14 @@ static int stats_process_http_post(struct stream_interface *si)
     static int stats_send_http_headers(struct stream_interface *si, struct htx *htx)
     {
     	struct stream *s = si_strm(si);
    -	struct uri_auth *uri = s->be->uri_auth;
    +	struct uri_auth *uri;
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct htx_sl *sl;
     	unsigned int flags;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	flags = (HTX_SL_F_IS_RESP|HTX_SL_F_VER_11|HTX_SL_F_XFER_ENC|HTX_SL_F_XFER_LEN|HTX_SL_F_CHNK);
     	sl = htx_add_stline(htx, HTX_BLK_RES_SL, flags, ist("HTTP/1.1"), ist("200"), ist("OK"));
     	if (!sl)
    @@ -4166,11 +4178,14 @@ static int stats_send_http_redirect(struct stream_interface *si, struct htx *htx
     {
     	char scope_txt[STAT_SCOPE_TXT_MAXLEN + sizeof STAT_SCOPE_PATTERN];
     	struct stream *s = si_strm(si);
    -	struct uri_auth *uri = s->be->uri_auth;
    +	struct uri_auth *uri;
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct htx_sl *sl;
     	unsigned int flags;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* scope_txt = search pattern + search query, appctx->ctx.stats.scope_len is always <= STAT_SCOPE_TXT_MAXLEN */
     	scope_txt[0] = 0;
     	if (appctx->ctx.stats.scope_len) {
    @@ -4263,7 +4278,7 @@ static void http_stats_io_handler(struct appctx *appctx)
     	}

     	if (appctx->st0 == STAT_HTTP_DUMP) {
    -		if (stats_dump_stat_to_buffer(si, res_htx, s->be->uri_auth))
    +		if (stats_dump_stat_to_buffer(si, res_htx))
     			appctx->st0 = STAT_HTTP_DONE;
     	}

    @@ -4888,6 +4903,7 @@ static int cli_parse_show_stat(char **args, char *payload, struct appctx *appctx

     	appctx->ctx.stats.scope_str = 0;
     	appctx->ctx.stats.scope_len = 0;
    +	appctx->ctx.stats.http_px = NULL; // not under http context
     	appctx->ctx.stats.flags = STAT_SHNODE | STAT_SHDESC;

     	if ((strm_li(si_strm(appctx->owner))->bind_conf->level & ACCESS_LVL_MASK) >= ACCESS_LVL_OPER)
    @@ -4954,7 +4970,7 @@ static int cli_io_handler_dump_info(struct appctx *appctx)
      */
     static int cli_io_handler_dump_stat(struct appctx *appctx)
     {
    -	return stats_dump_stat_to_buffer(appctx->owner, NULL, NULL);
    +	return stats_dump_stat_to_buffer(appctx->owner, NULL);
     }

     static int cli_io_handler_dump_json_schema(struct appctx *appctx)
2023-12-21 14:21:53 +01:00
Aurelien DARRAGON
ef9d692544 MINOR: stats: store the parent proxy in stats ctx (http)
Some HTTP related stats functions need to know the parent proxy, mainly
to get a pointer on the related uri_auth set by the proxy or to check
scope settings.

The current design (probably historical as only the http context existed
by then) took the other approach: it propagates the uri pointer from the
http context deep down the calling stack up to the relevant functions.
For non-http contexts (cli), the pointer is set to NULL.

Doing so is not very pretty and not easy to maintain. Moreover, there were
still some places in the code where the uri pointer was learned directly
from the stream proxy because it was not available as an argument
in those functions. This is error-prone, because if one day we decide to
change the source proxy in the parent function, we might still have some
functions down the stack that ignore the topmost argument and still do
it on their own, and we'll probably end up with inconsistencies.

So in this patch, we take a safer approach: the caller responsible for
creating the stats applet should set the http_px pointer so that any stats
function running under the applet that needs to know if it's running in
http context or needs to access parent proxy info may do so thanks to
the dedicated ctx->http_px pointer.
2023-12-21 14:20:03 +01:00
Amaury Denoyelle
9ab107b84b MINOR: h3: use INTERNAL_ERROR code for init failure
Consider that the application layer is responsible for setting a proper error
code on init or finalize operation failure. In case of H3, use INTERNAL_ERROR
application error code. This allows to remove qcc_set_error() invocation
from qmux_init().

In case the application layer does not specify any error code, the fallback
INTERNAL_ERROR transport error code is used thanks to the recent
change introduced for error management in qmux_init().
2023-12-20 15:40:02 +01:00
Amaury Denoyelle
7a3602a1f5 BUG/MINOR: h3: properly handle alloc failure on finalize
If the H3 control stream Tx buffer cannot be allocated, return a proper
error through h3_finalize(). This will cause the emission of a
CONNECTION_CLOSE with error H3_INTERNAL_ERROR and closure of the whole
connection.

This should be backported up to 2.6. Note that 2.9 has some difference
which will cause conflict. The main one is that qcc_get_stream_txbuf()
does not exist in this version. Instead the check in h3_control_send()
should be made after mux_get_buf(). Finally, it may be useful to first
pick previous commit (MINOR: h3: add traces for connection init stage)
to improve context similarity.
2023-12-20 15:39:51 +01:00
Amaury Denoyelle
a2dbd6d916 MINOR: h3: add traces for connection init stage
Add H3_EV_H3C_NEW traces. These are used in the h3_init() and h3_finalize()
functions.
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
403492af8e MINOR: mux-quic: adjust error code in init failure
If QUIC MUX cannot be initialized for any reason, the connection is shut
down with a CONNECTION_CLOSE frame. Previously, no error code was
explicitely specified, resulting in "no error" code.

Change this by always set error code in case of QUIC MUX failure. Use
the already defined QUIC MUX error code or "internal error" if unset.
Call quic_set_connection_close() on error label to register it to the
quic_conn layer.

This should help to improve error reporting in case of MUX
initialization failure.
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
bcade776c2 MINOR: mux-quic: use qcc_release in case of init failure
qmux_init() may fail at different stages. In this case, an error is
returned and QCC-allocated elements are freed. Previously, extra care was
taken using different labels to only free already allocated elements.

This patch removes the multiple labels and uses qcc_release(). This makes it
simpler to ensure a QCC is always properly freed. The only important
thing is to ensure that mandatory fields are properly initialized to
NULL or equivalent so that qcc_release() can be used safely.
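
In generic terms (an illustrative pattern only, not the actual QCC code), the
approach looks like this:

  #include <stdlib.h>

  /* Initialize every resource pointer to NULL up front so that a single
   * release function can safely be called from any failure point.
   */
  struct ctx {
      void *buf_a;
      void *buf_b;
  };

  static void ctx_release(struct ctx *c)
  {
      free(c->buf_a);   /* free(NULL) is a no-op, so partial init is fine */
      free(c->buf_b);
      free(c);
  }

  static struct ctx *ctx_init(void)
  {
      struct ctx *c = calloc(1, sizeof(*c)); /* all fields start as NULL */

      if (!c)
          return NULL;
      if (!(c->buf_a = malloc(4096)))
          goto fail;
      if (!(c->buf_b = malloc(4096)))
          goto fail;
      return c;
   fail:
      ctx_release(c);   /* single error path instead of one label per field */
      return NULL;
  }

The fact that free(NULL) is a no-op is what makes the single release path safe
regardless of how far the initialization got.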
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
3c38bb7ee1 MINOR: mux-quic: remove qcc_shutdown() from qcc_release()
Render qcc_release() more generic by removing qcc_shutdown(). This
prevents systematic graceful shutdown/CONNECTION_CLOSE emission if only
QCC resource deallocation is necessary.

For now, qcc_shutdown() is used before every qcc_release() invocation.
The only exception is on qmux_destroy stream layer callback.

This commit will be useful to reuse qcc_release() in other contexts to
simply deallocate a QCC instance.
2023-12-20 15:27:11 +01:00
Christopher Faulet
3811c1de25 BUG/MINOR: server: Use the configured address family for the initial resolution
A regression was introduced by the commit c886fb58eb ("MINOR: server/ip:
centralize server ip updates"). The configured address family is lost when the
server address is initialized during startup, for the resolution based on
the libc or on the server state-file. Thus, the "ipv4@" and "ipv6@" prefixes
are ignored.

To fix the bug, we take care to use the configured address family before calling
str2ip2() in srv_apply_lastaddr() and srv_apply_via_libc() functions.

This patch should fix the issue #2393. It must be backported to 2.9.
2023-12-20 12:21:59 +01:00
Amaury Denoyelle
d2540b2f72 MINOR: h3: remove quic_conn only reference
H3 uses a direct reference to quic_conn to access the listener instance.
This can be replaced by using qcc->conn->target. This allows removing the
quic_conn-t.h header include from it.
2023-12-20 10:38:30 +01:00
Christopher Faulet
d9eb6d6680 BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
An error on the H2 connection was always reported as an error to the
stream-endpoint descriptor, independently of the H2 stream state. But it is
a bug to do so for closed streams. And indeed, it leads to reporting an "SD--"
termination state for some streams while the response was fully received and
forwarded to the client, at least from the backend side's point of view.

Now, errors are no longer reported for H2 streams in closed state.

This patch is related to the three previous ones:

 * "BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams"
 * "BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C"
 * "BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty"

The series should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). The series should be backported to 2.9 but
only after a period of observation. In theory, older versions are also
affected but this part is pretty sensitive. So don't backport it further
except if someone ask for it.
2023-12-18 21:15:32 +01:00
Christopher Faulet
580ffd6123 BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
In h2s_wake_one_stream(), we must not report an error on the stream-endpoint
descriptor if the error is not definitive on the H2 connection. A pending
error on the H2 connection means there are potentially remaining data to be
demuxed. It is important not to truncate a message for a stream.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
Christopher Faulet
19fb19976f BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
It is similar to the previous fix ("BUG/MEDIUM: mux-h2: Don't report H2C
error on read error if dmux buffer is not empty"), but on the receive side. If
the demux buffer is not empty, an error on the TCP connection must not be
immediately reported as an error on the H2 connection. We must be sure to
have tried to demux all data first. Otherwise, messages for one or more
streams may be truncated while all data were already received and are
waiting to be demuxed.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
Christopher Faulet
5b78cbae77 BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
When an error on the H2 connection is detected when sending data, only a
pending error is reported, waiting for an error or a shutdown on the read
side. However if a shutdown was already received, the pending error is
switched to a definitive error.

At this stage, we must also wait to have flushed the demux
buffer. Otherwise, if some data must still be demuxed, messages for one or
more streams may be truncated. There is already the H2_CF_END_REACHED flag
to know that a shutdown was received and that we no longer progress on the
demux side (buffer empty or data truncated). On the sending side, we should
use this flag instead to report a definitive error.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
William Lallemand
0d2ebb53f7 BUG/MINOR: resolvers: default resolvers fails when network not configured
Bug #1740 was opened again, this time because a user is complaining about the
"can't create socket for nameserver" error. This can happen if the resolv.conf
file contains a class of address which is not configured on the
machine, for example IPv6.

The fix does the same as b10b1196b ("MINOR: resolvers: shut the warning
when "default" resolvers is implicit"), and uses the
"resolvers->conf.implicit" variable to emit the error.

However, there is no need to downgrade the explicit behavior to
ERR_WARN, because this is supposed to be an unrecoverable error, unlike
the connect() one.

Should fix issue #1740.

Must be backported wherever b10b1196b was backported (as far as 2.6).
2023-12-18 15:50:07 +01:00
Amaury Denoyelle
af297f19f6 BUG/MEDIUM: mux-quic: report early error on stream
On STOP_SENDING reception, an error is notified to the stream layer as
no more data can be responded. However, this is not done if the stream
instance is not allocated (already freed for example).

The issue occurs if STOP_SENDING is received and the stream instance is
instantiated after it. It happens if a STREAM frame with H3 HEADERS is
received afterwards, which is valid in the QUIC protocol due to UDP packet
reordering. In this case, the stream layer is never notified about the
underlying error. Instead, response buffers are silently purged by the
MUX in qmux_strm_snd_buf().

This is suboptimal as there is no point in exchanging data from the
server if it cannot be eventually transferred back to the client.
Aside from this consideration, no other issue occurred. However,
this is not the case with the QUIC mux-to-mux implementation. Now, if
mux-to-mux is used, qmux_strm_snd_buf() is bypassed and the response is
transferred via the nego_ff/done_ff callbacks. However, these functions did
not check if the QCS is already locally closed. This causes a crash when
qcc_send_stream() is called via done_ff.

To fix this crash, there are several approaches; one of them would be to
adjust the nego_ff/done_ff QUIC callbacks. However, another method has been
chosen. Now the stream layer is flagged on error just after its
instantiation if the stream is already locally closed. This ensures that
mux-to-mux won't try to emit data, as se_nego_ff() checks that the opposite
SD is not on error before continuing.

Note that an alternative solution could be to not instantiate the stream
layer at all if the QCS is already locally closed. This is the most optimal
solution as it reduces unnecessary allocations and task processing.
However, it's not easy to implement, so the simpler bug fix has been
chosen for the moment.

This patch is labelled as MEDIUM as it can change the behavior of all QCS
instances, whether mux-to-mux is used or not, and thus could reveal other
architecture issues.

This should fix the latest crash occurrence reported in GitHub issue #2392.

It should be backported up to 2.6, after a necessary period of
observation.
2023-12-14 11:15:46 +01:00
Christopher Faulet
682f73b4fa BUG/MEDIUM: mux-h2: Report too large HEADERS frame only when rxbuf is empty
During HEADERS frames decoding, if a frame is too large to fit in a buffer,
an internal error is reported and a RST_STREAM is emitted. On the other
hand, we wait to have an empty rxbuf to decode the frame because we cannot
retry a failed HPACK decompression.

When we are decoding headers, it is valid to return an error if dbuf buffer
is full because no data can be blocked in the rxbuf (which hosts the HTX
message).

However, during the trailers decoding, it is possible to have some data not
sent yet for the current stream in the rxbuf and data for another stream
fully filling the dbuf buffer. In this case, we don't decode the trailers
but we must not return an error. We must wait to empty the rxbuf first.

Now, a HEADERS frame is considered as too large if the dbuf buffer is full
and if the rxbuf is empty (the HTX message to be accurate).

This patch should fix the issue #2382. It must be backported to all stable
versions.
2023-12-13 16:45:29 +01:00
Christopher Faulet
65ca444240 CLEANUP: mux-h1: Fix a trace message about C-L header addition
This fixes a copy-paste error in a trace message notifying that a 'Content-Length'
header was added during the HTTP message formatting.
2023-12-13 16:45:29 +01:00
Christopher Faulet
966a18e2b4 BUG/MEDIUM: mux-h1: Explicitly skip request's C-L header if not set originally
Commit f89ba27caa ("BUG/MEDIUM: mux-h1; Ignore headers modifications about
payload representation") introduced a regression. The Content-Length is no
longer sent to the server for requests without payload but with a
'Content-Length' header explicitly set to 0, like a POST request with no
payload. It is of course unexpected. In some cases, depending on the server,
such requests are considered invalid and a 411-Length-Required is returned.

The above commit is not directly responsible for the bug, it only reveals an
overly lax condition to skip the 'Content-Length' header of bodyless requests.
We must only skip this header if none was originally found during the parsing.

This patch should fix the issue #2386. It must be backported to 2.9.
2023-12-13 16:45:29 +01:00
Christopher Faulet
eed1e8733c BUG/MEDIUM: mux-h1: Count data from input buf during zero-copy forwarding
During zero-copy forwarding, we first try to forward data found in the input
buffer before trying to receive more data. These data must be removed from
the amount of data to forward (the count variable).

Otherwise, on an internal retry, in h1_fastfwd(), we can be led to read
more data than expected. It is especially a problem at the end of a
chunk. An error is erroneously reported because more data than announced are
received.

This patch should fix the issue #2382. It must be backported to 2.9.
2023-12-13 16:45:29 +01:00
Frédéric Lécaille
dd58dff1e6 BUG/MEDIUM: quic: QUIC CID removed from tree without locking
This bug arrived with this commit:

   BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check

All connection ID manipulations on the by-thread trees used to store the
connection IDs must be done under the trees' locks. These trees are accessed by
the low level connection identification code.

When receiving a RETIRE_CONNECTION_ID frame, the concerned connection ID must
be deleted from its underlying by-thread tree, but not without locking!
Add a WR lock around the ebmb_delete() call to do so.

Must be backported as far as 2.7.
2023-12-13 14:42:50 +01:00
Amaury Denoyelle
f8e095b058 MINOR: hq-interop: use zero-copy to transfer single HTX data block
Similarly to H3, hq-interop now uses zero-copy when dealing with an HTX
message with only a single data block. Exchange the HTX and QCS buffers, and
use the HTX data block for the HTTP payload. This is only possible if the QCS
buffer is empty. Contrary to HTTP/3, no extra frame header is needed
before transferring the HTTP payload.

hq-interop is only implemented for testing purposes so this change should
not be noticeable by users. However, it will be useful to be able to
test zero-copy transfers during QUIC interop testing.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
d3987b69c3 MINOR: h3: adjust zero-copy sending related code
Adjust HTTP/3 data emission. First, add the HTX as argument to the function,
as is done for the other frame emission functions. Keep the buffer
argument as this is mandatory for zero-copy. Extend comments related to
this, in particular to explain the purposes of both the HTX and buffer
arguments.

No functional change here. This should however be useful to port equivalent
code to the hq-interop protocol.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
0e632fc9b4 MINOR: h3: complete traces for sending
Add data level traces for each encoded H3 frame. Of notable interest,
traces will be useful to detect if standard emission, zero-copy or fast
forward is used. Also add the generic filter H3_EV_TX_FRAME to be able
to filter these messages.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
1adadc4d3f MINOR: mux-quic: factorize QC_SF_UNKNOWN_PL_LENGTH set
When dealing with HTTP/1 responses without Content-Length nor chunked
encoding, the QC_SF_UNKNOWN_PL_LENGTH flag is set on the QCS. This prevents the
emission of a RESET_STREAM on shutw, instead resorting to a proper FIN
emission.

This code was duplicated both in H3 and hq-interop. Move it into the common
qcs_http_snd_buf() to factorize it.
2023-12-12 10:14:22 +01:00
Amaury Denoyelle
e772d3f40f CLEANUP: mux-quic: clean up app ops callback definitions
qcc_app_ops is a set of callbacks used to unify application protocol
running over QUIC. This commit introduces some changes to clarify its
API :
* write simple comment to reflect each callback purpose
* rename decode_qcs to rcv_buf as this name is more common and is
  similar to already existing snd_buf
* finalize is moved up as it is used during connection init stage

All these changes are ported to HTTP/3 layer. Also function comments
have been extended to highlight HTTP/3 special characteristics.
2023-12-11 16:15:13 +01:00
Amaury Denoyelle
f496c7469b MINOR: mux-quic: clean up qcs Tx buffer allocation API
This function is similar to the previous one, but this time for the QCS
sending buffer.

Previously, each application layer redefined its own version of
mux_get_buf() which was used to allocate <qcs.tx.buf>. Unify it under a
single function renamed qcc_get_stream_txbuf().
2023-12-11 16:08:51 +01:00
Amaury Denoyelle
b526ffbfb9 MINOR: mux-quic: clean up qcs Rx buffer allocation API
Replace the qcs_get_buf() function, whose naming does not reflect its
purpose. Add a new function qcc_get_stream_rxbuf() which allocates
<qcs.rx.app_buf> if needed and returns the buffer pointer. This function is
reserved for the application protocol layer. This buffer is then accessed by
the stconn layer.

For the other qcs_get_buf() invocations which were in effect used for a local
buffer, replace them with a plain b_alloc().
2023-12-11 16:02:30 +01:00
Aurelien DARRAGON
63282f3bfb BUG/MINOR: ext-check: cannot use without preserve-env
Since 1de44da ("MINOR: ext-check: add an option to preserve environment
variables"), it is now possible to provide an extra argument to the
"external-check" directive. This allows supporting the "preserve-env"
option, which differs from the default behavior.

However a mistake was made, because the config parser doesn't allow the
default configuration anymore: using external-check without an argument will
trigger an error:
  'external-check' only supports 'preserve-env' as an argument, found ''.

This is due to a small mistake in the code that makes the check
systematically report an error if the first argument is not equal to
"preserve-env". The check was modified so that the error is only reported
if an argument is actually provided, which restores the default behavior.

This should fix GH #2380 and should be backported to 2.9 and potentially
further (anywhere 1de44da is, because a note about an optional backport
up to 2.6 was left in the original commit message).
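
For illustration, a minimal global section sketch of both forms (only one is
needed; "preserve-env" is the new optional argument):

    global
        # default: run the external check agent with a clean environment
        external-check
        # or keep HAProxy's environment variables for the agent:
        # external-check preserve-env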
2023-12-08 14:26:06 +01:00
Aurelien DARRAGON
d7964c52ce BUG/MEDIUM: map/acl: pat_ref_{set,delete}_by_id regressions
Some regressions were introduced by 5fea59754b ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")

pat_ref_delete_by_id() fails to properly unlink and free the removed
reference because it bypasses the pat_ref_delete_by_ptr() made for
that purpose. This function is normally used everywhere the target
reference is set for removal, such as the pat_ref_delete() function
that matches pattern against a string. The call was probably skipped
by accident during the rewrite of the function.

With the above commit also comes another undesirable change:
both pat_ref_delete_by_id() and pat_ref_set_by_id() directly use the
<refelt> argument as a valid pointer (they do dereference it).

This is wrong, because <refelt> is unsafe and should be handled as an
ID, not a pointer (hence the function name). Indeed, the calling function
may directly pass user input from the CLI as <refelt> argument, so we must
first ensure that it points to a valid element before using it, else it is
probably invalid and we shouldn't touch it.

What this patch essentially does, is that it reverts pat_ref_set_by_id()
and pat_ref_delete_by_id() to pre 5fea59754b behavior. This seems like
it was the only optimization from the patch that doesn't apply.

Fortunately, after reviewing the changes with Fred, it seems that the 2
functions are only involved in commands for manipulating maps or
ACLs on the CLI, so the "missed" opportunity to improve their performance
shouldn't matter much. Nonetheless, if we wanted to speed up the reference
lookup by ID, we could consider adding an eb64 tree for that specific
purpose that contains all pattern reference IDs (ie: pointers) so that
eb lookup functions may be used instead of a linear list search.

The issue was raised by Marko Juraga as he failed to perform an ACL
removal by reference on the CLI on 2.9, while it was known to work properly
on other versions.

It should be backported on 2.9.

Co-Authored-by: Frédéric Lécaille <flecaille@haproxy.com>
2023-12-08 14:26:06 +01:00
William Lallemand
86376f591e MINOR: ssl: activate the certificate selection callback for WolfSSL
The PR which allows choosing a certificate depending on the ciphers and
the signature algorithms was merged in WolfSSL. Let's activate this
code.

This could be backported to 2.9, but only once the next WolfSSL release
(5.6.5) is available. It will also need a check on the version.
2023-12-08 12:08:01 +01:00
William Lallemand
dbe9cea35b BUILD: ssl: update types in wolfssl cert selection callback
The types have changed in the PR for the wolfSSL_get_sigalg_info()
function, let's update them.

Must be backported in 2.9.
2023-12-08 12:03:11 +01:00
Frédéric Lécaille
c075e4f2fc BUG/MEDIUM: quic: Possible buffer overflow when building TLS records
This bug impacts only the OpenSSL QUIC compatibility module (USE_QUIC_OPENSSL_COMPAT).

This may happen only when the TLS stack has to be provided with more than 1024+1+5+16
bytes of CRYPTO data. In this case several TLS records have to be built in one
call to SSL_provide_quic_data(). A 5-byte header is created at the head
of these records. This header is used as AAD to cipher the record. But
the length of this AAD was counted twice. Once here in
quic_tls_compat_create_record() (initialization):

	 adlen = quic_tls_compat_create_header(qc, rec, ad, 0);

and a second time here in the same function after quic_tls_tls_seal() returns:

     ret = aad_len + outlen;

This addition is useless. Note that this bug could be reproduced when haproxy has
to authenticate the client.

Thank you to @vifino for having reported this issue in GH #2381.

Must be backported to 2.8.
2023-12-08 10:03:33 +01:00
William Lallemand
75a51dfc3f CLEANUP: mworker/cli: add comments about pcli_find_and_exec_kw()
Add a comment about the pcli_find_and_exec_kw().
2023-12-07 18:04:41 +01:00
William Lallemand
1c1bb8ef2a BUG/MINOR: mworker/cli: fix set severity-output support
"set severity-output" is one of these commands that change the appctx
state so that the next commands are affected.

Unfortunately the master CLI works with pipelining and server close
mode, which means the connection between the master and the worker is
closed after each response, so for the next command this is a new appctx
state.

To fix the problem, 2 new flags are added ACCESS_MCLI_SEVERITY_STR and
ACCESS_MCLI_SEVERITY_NB which are used to prefix each command sent to
the worker with the right "set severity-output" command.

This patch fixes issue #2350.

It could be backported as far as 2.6.
2023-12-07 17:37:23 +01:00
Amaury Denoyelle
0338778c41 MINOR: mux-quic: add traces for 0-copy/fast-forward
Complete qmux traces :
* add a trace when 0-copy is used for DATA transfer
* mark the FIN as detected when using fast forward
2023-12-07 17:06:55 +01:00
Amaury Denoyelle
f5b2870eab CLEANUP: mux_quic: rename ffwd function with prefix qmux_strm_
All QUIC MUX functions which are callbacks for stream layer use the
prefix qmux_strm_*. This was not the case for fast forward related
callback which only used qmux_* prefix.

Fix this by reusing the standard prefix to respect QUIC MUX code
convention.
2023-12-07 17:06:55 +01:00
Amaury Denoyelle
de765a0058 MINOR: hq-interop: add fastfwd support
Implement callback for fast forwarding for hq-interop.

This change should not be considered as functionally important. Indeed,
HTTP/0.9 is reserved for QUIC interop testing and should not be used
outside of it. However, implementing fast forwarding in this context is
useful as it will allow testing the MUX fast-forward code sections via
QUIC interop.
2023-12-07 17:06:52 +01:00
Frédéric Lécaille
917f7c74d3 BUG/MINOR: lua: Wrong OCSP CID after modifying an SSL certficate (LUA)
This bugfix is the same as the following one:
    "BUG/MINOR: ssl_ckch: Wrong OCSP CID after modifying an SSL certficate"
where the OCSP CID had to be reset when updating a certificate.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
75f5977ff4 BUG/MINOR: ssl: Wrong OCSP CID after modifying an SSL certficate
This bug could be reproduced with the "set ssl cert" CLI command to update
a certificate. The OCSP CID is duplicated by ckchs_dup() which calls
ssl_sock_copy_cert_key_and_chain(). It should be computed again by
ssl_sock_load_ocsp(). This may be accomplished by resetting the new ckch OCSP CID
returned by ckchs_dup().

This bug may be related to GH #2319.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
456ba6e95f MINOR: ssl/cli: Add ha_(warning|alert) msgs to CLI ckch callback
This patch allows the cli_io_handler_commit_cert() callback, called upon
a "commit ssl cert ..." command, to prefix the messages returned on the CLI
with the ones built by ha_warning() and ha_alert().

It could be interesting to backport this commit to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
7dab3e8266 BUG/MINOR: ssl: Double free of OCSP Certificate ID
This bug could be reproduced by loading several certificates from a "bind" line,
with "server_ocsp.pem" as argument to the "crt" setting, and updating
the ECDSA certificate with the RSA one as follows:

echo -e "set ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa \
	     <<\n$(cat reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.rsa)\n" | socat - /tmp/stats
followed by a "commit ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa"
command. This could be detected by libasan as follows:

=================================================================
==507223==ERROR: AddressSanitizer: attempting double-free on 0x60200007afb0 in thread T3:
    #0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
    #1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
    #2 0x7fabc6af54e9 in ossl_asn1_primitive_free (/opt/quictls/lib/libcrypto.so.81.3+0xe14e9)
    #3 0x7fabc6af5960 in ossl_asn1_template_free (/opt/quictls/lib/libcrypto.so.81.3+0xe1960)
    #4 0x7fabc6af569f in ossl_asn1_item_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xe169f)
    #5 0x7fabc6af58a4 in ASN1_item_free (/opt/quictls/lib/libcrypto.so.81.3+0xe18a4)
    #6 0x46a159 in ssl_sock_free_cert_key_and_chain_contents src/ssl_ckch.c:723
    #7 0x46aa92 in ckch_store_free src/ssl_ckch.c:869
    #8 0x4704ad in cli_release_commit_cert src/ssl_ckch.c:1981
    #9 0x962e83 in cli_io_handler src/cli.c:1140
    #10 0xc1edff in task_run_applet src/applet.c:454
    #11 0xaf8be9 in run_tasks_from_lists src/task.c:634
    #12 0xafa2ed in process_runnable_tasks src/task.c:876
    #13 0xa23c72 in run_poll_loop src/haproxy.c:3024
    #14 0xa24aa3 in run_thread_poll_loop src/haproxy.c:3226
    #15 0x7fabc69e7ea6 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7ea6)
    #16 0x7fabc6907a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)

0x60200007afb0 is located 0 bytes inside of 3-byte region [0x60200007afb0,0x60200007afb3)
freed by thread T3 here:
    #0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
    #1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)

previously allocated by thread T2 here:
    #0 0x7fabc6fb573f in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5473f)
    #1 0x7fabc6ae8d77 in ASN1_STRING_set (/opt/quictls/lib/libcrypto.so.81.3+0xd4d77)

Thread T3 created by T0 here:
    #0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0xc04f36 in setup_extra_threads src/thread.c:252
    #2 0xa2761f in main src/haproxy.c:3917
    #3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)

Thread T2 created by T0 here:
    #0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0xc04f36 in setup_extra_threads src/thread.c:252
    #2 0xa2761f in main src/haproxy.c:3917
    #3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)

SUMMARY: AddressSanitizer: double-free ??:0 __interceptor_free
==507223==ABORTING
Aborted

The OCSP CID stored in the impacted ckch data was freed but not reset to NULL,
leading to a subsequent double free.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Christopher Faulet
67c03508d6 MEDIUM: pattern: Add support for virtual and optional files for patterns
Before this patch, it was not possible to use a list of patterns, a map or a
list of ACLs without an existing file.  However, it could be handy to just
use an ID, with no file on the disk. It is pretty useful for everyone
managing these lists dynamically. It could also be handy to try to load a
list from a file if it exists, without failing if not. This way, it could be
possible to make a cold start without any file (instead of an empty file),
dynamically add and delete patterns, and dump the list to the file periodically
to reuse it on reload (via an external process).

In this patch, we use some prefixes to be able to use virtual or optional
files.

The default case remains unchanged: regular files are used. A filename, with
no prefix, is used as reference, and it must exist on the disk. With the
prefix "file@", the same is performed. Internally this prefix is
skipped. Thus the same file, with or without the "file@" prefix, references the
same list of patterns.

To use a virtual map, the "virt@" prefix must be used. No file is read, even if
the following name looks like a file. It is just an ID. The prefix is part
of the ID and must always be used.

To use an optional file, i.e. a file that may or may not exist on disk at
startup, the "opt@" prefix must be used. If the file exists, its content is
loaded, but HAProxy doesn't complain if not. The prefix is not part of the
ID. For a given file, optional files and regular files reference the same
list of patterns.

This patch should fix the issue #2202.
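
For illustration, a minimal sketch of the three prefixes on ACL/map file
arguments (the file names and the map usage are hypothetical):

    frontend fe
        mode http
        bind :8080
        # regular file, must exist on disk ("file@/etc/haproxy/deny.acl" is equivalent)
        http-request deny if { src -f /etc/haproxy/deny.acl }
        # purely virtual list, no file is read; "virt@" stays part of the ID
        http-request deny if { src -f virt@deny-list }
        # optional file: loaded if present at startup, silently skipped otherwise
        http-request set-header X-Group %[src,map(opt@/etc/haproxy/groups.map)]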
2023-12-06 10:24:41 +01:00
Christopher Faulet
660e4185e1 MINOR: pattern: Use reference name as filename to read patterns from a file
It is only a small API refactoring. The filename is no longer used when
pat_ref_read_from_file_smp() or pat_ref_read_from_file() functions are
called. The filename was already used to create the reference on the list of
patterns. Thus, we now directly use info from this reference.
2023-12-06 10:24:41 +01:00
Christopher Faulet
533121a56e MINOR: cache: Add global option to enable/disable zero-copy forwarding
tune.cache.zero-copy-forwarding parameter can now be used to enable or
disable the zero-copy fast-forwarding for the cache applet only. It is
enabled ('on') by default. It can be disabled by setting the parameter to
'off'.
2023-12-06 10:24:41 +01:00
Christopher Faulet
ebead3c0a1 MEDIUM: cache: Add support for endp-to-endp fast-forwarding
It is now possible to directly forward data to the opposite side from the
cache applet. To do so, dedicated functions were added to fast-forward the
payload part of the cached objects. Of course headers and trailers are still
sent via the channel's buffer, using the HTX.

When an object is delivered from the cache, once the applet reaches the
HTX_CACHE_DATA state, it declares it can fast-forward data. From this point,
all data are directly transferred to the opposite side.
2023-12-06 10:24:41 +01:00
Christopher Faulet
5baa9ea168 MEDIUM: cache: Save body size of cached objects and track it on delivery
We now save the body size of cached objects in the cache entry structure. In
addition, the cache applet tracks the body part already sent.

This will be mandatory to add support of endpoint-to-endpoint
fast-forwarding in the cache applet.
2023-12-06 10:24:41 +01:00
Christopher Faulet
5f99a37ae6 MINOR: applets: Use channel's field to compute amount of data received
To be able to support endpoint-to-endpoint fast-forwarding (formerly called
mux-to-mux fast-forwarding), we cannot rely on data in the input channel to
compute the amount of data the applet has produced.

The applet API is not really designed to know how many bytes are produced or
received at each call. Till now, it was not a problem because data always
passed through the channels. With E2E fast-forwarding, input data may be
immediately consumed. From the caller's point of view (task_run_applet), there
is only the total field of the input channel that will change. So let's use
it now.
2023-12-06 10:24:41 +01:00
Christopher Faulet
52c84ab0e0 MEDIUM: applet: Handle channel's STREAMER flags on applets side
Till now, it was not possible to notify that a producing applet is streaming
data. It means it was not possible to set CF_STREAMER and CF_STREAMER_FLAGS
on the input channel of an applet streaming data.

While it is not a big deal for most applets, it is interesting for the
cache. Because there are now dedicated functions to deal with these flags,
we can use them in task_run_applet() to set/unset these flags on the input
channel.

This patch relies on "MINOR: channel: Use dedicated functions to deal with
STREAMER flags".
2023-12-06 10:24:41 +01:00
Christopher Faulet
a40321eb3b MINOR: channel: Use dedicated functions to deal with STREAMER flags
For now, the CF_STREAMER and CF_STREAMER_FAST flags are set in the sc_conn_recv()
function. The logic is moved into dedicated functions.

First, the channel_check_idletimer() function is now responsible for checking the
channel's last read date against the idle timer value to be sure the
producer is still streaming data. Otherwise, it removes the STREAMER flags.

Then, the channel_check_xfer() function is responsible for checking the amount of
data transferred during a receive, to eventually update the STREAMER flags.

In sc_conn_recv(), we now use these functions.
2023-12-06 10:24:41 +01:00
Christopher Faulet
a7777bbf79 BUG/MEDIUM: peers: fix partial message decoding
peer_recv_msg() may return because the message is incomplete without
checking if a shutdown is pending for the SC. The function relies on
co_getblk() to detect shutdowns. However, the message length decoding may be
interrupted if the multi-byte integer is incomplete. In this case, the SC
is not checked for shutdowns.

When this happens, this leads to an appctx spinning loop.

This patch should fix the issue #2373. It must be backported to 2.8.
2023-12-05 09:28:53 +01:00
Christopher Faulet
18f2ccd244 MINOR: mux-quic: Disable zero-copy forwarding for send by default
There is at least one bug for now in this part and it is still unstable. Thus
it is better to disable it by default for now. It can be enabled by setting
tune.quic.zero-copy-fwd-send to 'on'.
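
For illustration, a minimal global sketch to turn the send-side forwarding
back on once it is considered stable enough:

    global
        # re-enable QUIC send-side zero-copy fast-forwarding (off by default for now)
        tune.quic.zero-copy-fwd-send on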
2023-12-04 15:36:02 +01:00
Christopher Faulet
5c959336fd MINOR: mux-quic: Add global option to enable/disable zero-copy forwarding
tune.quic.zero-copy-fwd-send can now be used to enable or disable the
zero-copy fast-forwarding for the QUIC mux only, for sends. For now, there
is no option to disable it for receives because it is not supported yet.

It is enabled ('on') by default.
2023-12-04 15:33:52 +01:00
Christopher Faulet
6da0429e75 MINOR: mux-h2: Add global option to enable/disable zero-copy forwarding
tune.h2.zero-copy-fwd-send can now be used to enable or disable the
zero-copy fast-forwarding for the H2 mux only, for sends. For now, there is
no option to disable it for receives because it is not supported yet.

It is enabled ('on') by default.
2023-12-04 15:33:34 +01:00
Christopher Faulet
f5e73024e9 MINOR: mux-h1: Add global option to enable/disable zero-copy forwarding
tune.h1.zero-copy-fwd-recv and tune.h1.zero-copy-fwd-send can now be used to
enable or disable the zero-copy fast-forwarding for the H1 mux only, for
receives or sends. Unlike the PT mux, there are 2 options here because
client and server sides can use different muxes.

Both are enabled ('on') by default.
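
For illustration, a minimal global sketch combining the two new switches (the
values are only examples; both default to 'on'):

    global
        # keep H1 zero-copy forwarding on the receive side but disable it on the send side
        tune.h1.zero-copy-fwd-recv on
        tune.h1.zero-copy-fwd-send off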
2023-12-04 15:33:07 +01:00
Christopher Faulet
eccef69137 MINOR: mux-pt: Add global option to enable/disable zero-copy forwarding
tune.pt.zero-copy-forwarding parameter can now be used to enable or disable
the zero-copy fast-forwarding for the PT mux only. It is enabled ('on') by
default. It can be disabled by setting the parameter to 'off'. In this case,
both the receive and send sides are disabled.
2023-12-04 15:32:32 +01:00
Christopher Faulet
7732323cf3 MINOR: global: Use a dedicated bitfield to customize zero-copy fast-forwarding
The zero-copy fast-forwarding feature is quite new and a bit sensitive.
There is an option to disable it globally. However, not all protocols have
the same maturity. For instance, for the PT multiplexer, there is nothing
really new: zero-copy fast-forwarding is only another name for kernel
splicing. However, for QUIC/H3, it is pretty new, not really optimized,
and it will evolve. And soon, support will be added for the cache
applet.

In this context, it is useful to be able to enable/disable zero-copy
fast-forwarding per protocol and per applet, and, when applicable, on sends
or receives separately. So, instead of having one flag to disable it
globally, there is now a dedicated bitfield, global.tune.no_zero_copy_fwd.
2023-12-04 15:31:47 +01:00
Willy Tarreau
db812f73af BUILD: http_htx: silence uninitialized warning on some gcc versions
Building on gcc 4.4 reports "start may be used uninitialized". This is
a classical case of dependency between two variables where the compiler
lost track of their initialization and doesn't know that if one is not
set, the other is. Just moving the second test into the else clause
of the assignment both fixes it and makes the code more efficient, and
it can be simplified as a ternary operator.

It's probably not needed to backport this, unless anyone reports build
warnings with more recent compilers (intermediary optimization levels
such as -O1 can sometimes trigger such warnings).
2023-12-01 20:46:24 +01:00
Aurelien DARRAGON
c2cd6a419c BUG/MINOR: server/event_hdl: properly handle AF_UNSPEC for INETADDR event
It is possible that a server's addr family is temporarily set to AF_UNSPEC
even if we're certain to be in INET context (ipv4, ipv6).

Indeed, as soon as IP address resolving is involved, srv->addr family will
be set to AF_UNSPEC when the resolution fails (could happen at anytime).

However, _srv_event_hdl_prepare_inetaddr() wrongly assumed that it would
only be called with AF_INET or AF_INET6 families. Because of that, the
function will handle an AF_UNSPEC address as an IPv6 address: not only
could we risk reading from an uninitialized area, but we would also
propagate false information when publishing the event.

In this patch we make sure to properly handle the AF_UNSPEC family in
both the "prev" and the "next" parts of the SERVER_INETADDR event and that
every member is explicitly initialized.

This bug was introduced by 6fde37e046 ("MINOR: server/event_hdl: add
SERVER_INETADDR event"), no backport needed.
2023-12-01 20:43:42 +01:00
Tim Duesterhus
1dcc6a8a96 BUG/MINOR: sample: Make the word converter compatible with -m found
Previously an expression like:

    path,word(2,/) -m found

always returned `true`.

The bug has existed since the `word` converter was introduced, that is:
c9a0f6d023

The same bug was previously fixed for the `field` converter in commit
4381d26edc.

The fix should be backported to 1.6+.
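
For illustration, a minimal sketch of the expected behavior after the fix
(the ACL name is arbitrary):

    # only matches when the path really contains a second '/'-separated word,
    # e.g. "/img/logo.png" matches while "/favicon.ico" does not
    acl has_second_seg path,word(2,/) -m found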
2023-12-01 14:35:47 +01:00
Christopher Faulet
084db70ad1 DEBUG: stream: Report lra/fsb values for front and back SC in stream dump
REX and WEX dates are already reported. But if the corresponding SC cannot
expire on read or write, "<NEVER>" is reported instead. The same is reported
if no expiration date is set. It is not really convenient because we cannot
distinguish the two cases.

So, now, for each SC, the read and write timers (rto/wto) are also reported in
the dump, based on .lra/.fsb dates and the current I/O timeout. The SC I/O
timeout is also reported.
2023-12-01 11:25:49 +01:00
Aurelien DARRAGON
eec3911e64 BUG/MINOR: cfgparse-listen: fix warning being reported as an alert
Since b40542000d ("MEDIUM: proxy: Warn about ambiguous use of named
defaults sections") we introduced a new error to prevent users from
having an ambiguous named default section in the config which is both
inherited explicitly using "from" and implicitly by another proxy due to
the default section being the last one defined.

However, despite the error message being presented as a warning by the
err_code, the commit message and the documentation, it is actually
reported as a fatal error because ha_alert() was used in place of
ha_warning().

In this patch we make the code comply with the documentation and the
intended behavior by using ha_warning() to report the error message.

This should be backported up to 2.6.
2023-12-01 09:09:45 +01:00
Willy Tarreau
822d45678f BUILD: server: shut a bogus gcc warning on certain ubuntu
On ubuntu 20.04 and 22.04 with gcc 9.4 and 11.4 respectively, we get
the following warning:

  src/server.c: In function 'srv_update_addr_port':
  src/server.c:4027:3: warning: 'new_port' may be used uninitialized in this function [-Wmaybe-uninitialized]
   4027 |   _srv_event_hdl_prepare_inetaddr(&cb_data.addr, &s->addr, s->svc_port,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4028 |                                   ((ip_change) ? &sa : &s->addr),
        |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4029 |                                   ((port_change) ? new_port : s->svc_port),
        |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4030 |                                   1);
        |                                   ~~

It's clearly wrong, port_change only changes from 0 to anything else
*after* assigning new_port. Let's just preset new_port to zero instead
of trying to play smart with the compiler.
2023-11-30 17:48:03 +01:00
Willy Tarreau
7f58e9f1e0 DEBUG: unstatify a few functions that are often present in backtraces
It's useful to be able to recognize certain functions that are often
present in backtraces as they call lower level functions, and for this
they must not be static. Let's remove "static" in front of these
functions:

  sc_notify, sc_conn_recv, sc_conn_send, sc_conn_process,
  sc_applet_process, back_establish, stream_update_both_sc,
  httpclient_applet_io_handler, httpclient_applet_init,
  httpclient_applet_release
2023-11-30 17:15:54 +01:00
Frédéric Lécaille
ff8db5a85d BUG/MINOR: config: Stopped parsing upon unmatched environment variables
When an environment variable could not be matched by getenv(), the
current character to be parsed by parse_line() from the <in> variable
is the trailing double quote. If nothing is done in such a case,
this character is skipped by parse_line(), then the following spaces
are parsed as an empty argument.

To fix this, skip the double quote character and the following spaces
to make the <in> variable point to the next argument to be parsed.

Thank you to @sigint2 for having reported this issue in GH #2367.

Must be backported as far as 2.4.
2023-11-30 16:48:41 +01:00
Amaury Denoyelle
0ce213d246 MINOR: quic_tp: use in_addr/in6_addr for preferred_address
preferred_address is a transport parameter specified by the server. It
specifies both an IPv4 and an IPv6 address. These addresses were defined as
plain arrays in <struct tp_preferred_address>.

Convert these addresses to use the common types in_addr/in6_addr. With
this change, dumping of preferred_address is extended. It now displays
the addresses using inet_ntop() and the CID value.
2023-11-30 15:59:45 +01:00
Amaury Denoyelle
a9ad68aa74 BUG/MINOR: quic_tp: fix preferred_address decoding
quic_transport_param_dec_pref_addr() is responsible for decoding the
preferred_address from the received transport parameters. There were two
issues with this function:
* the address and port locations as defined in the RFC were inverted for both
  IPv4 and IPv6 during decoding
* an invalid check was done to ensure the decoded CID length corresponds to
  the remaining buffer size. It did not take into account the final field
  for the stateless reset token.

These issues were never encountered as only a server can emit the
preferred_address transport parameter, so the impact of this bug is
invisible.

This should be backported up to 2.6.
2023-11-30 15:49:10 +01:00
Amaury Denoyelle
f31719edae CLEANUP: quic_cid: remove unused listener arg
retrieve_qc_conn_from_cid() requires a listener argument whereas it is
unused. This is an artifact from the old architecture where CID trees
were stored on listener instances instead of globally. Remove it to
better reflect this change.
2023-11-30 15:04:27 +01:00
Amaury Denoyelle
86e5c607d1 MINOR: rhttp: mark reverse HTTP as experimental
Mark the reverse HTTP feature as experimental. This will allow adjusting
the configuration mechanism if needed with future developments, without
having to maintain backward compatibility.

Concretely, each config directive linked to it now requires the global
expose-experimental-directives directive to be specified first. This is the case for
the following directives :
- rhttp@ prefix uses in bind and server lines
- nbconn bind keyword
- attach-srv tcp rule

Each documentation section referring to these keywords is updated to
highlight this new requirement.

Note that this commit has duplicated in several places the code from the
global function check_kw_experimental(). This is because the latter only
works with the cfg_keyword type, which is not suited to the bind_kw or
action_kw types. This should be improved in a future patch.
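
For illustration, a minimal global sketch of the new requirement (the
rhttp@/nbconn/attach-srv syntax itself is unchanged):

    global
        # required before any rhttp@ address, the "nbconn" bind keyword
        # or the "attach-srv" tcp rule can be used
        expose-experimental-directives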
2023-11-30 15:04:27 +01:00
Christopher Faulet
c9418366b4 BUG/MEDIUM: cli: Don't look for payload pattern on empty commands
A regression was introduced by commit 9431aa0bdf ("BUG/MEDIUM: cli: Don't
look for payload pattern on empty commands").

On empty commands (really empty or containing only spaces and tabs), the number of
arguments is set to 0. However we look for the payload pattern without checking
it. The result is an access at index -1 in the argument array. It is of
course invalid.

To fix the issue, we just skip this part when there is no argument. Note that
the empty command is still sent to the worker.

This patch should solve the issue #2365. No backport needed.
2023-11-29 15:09:29 +01:00
Christopher Faulet
24059615a7 MINOR: Add sample fetches to get the frontend and backend stream ID
"fc.id" and "bc.id" sample fetches can now be used to get, respectively, the
frontend or the backend stream ID. They rely on the ->sctl() callback function
of the mux attached to the corresponding SC.

It means these sample fetches work only for connections, not applets, and
only from the time a multiplexer is installed.
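
For illustration, a hypothetical log-format line using both fetches (the
frontend and the format are arbitrary):

    frontend fe
        mode http
        bind :8080
        # log the per-side stream IDs; they are only meaningful once a mux is installed
        log-format "%ci:%cp %ft %b/%s fc_sid=%[fc.id] bc_sid=%[bc.id]"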
2023-11-29 11:11:12 +01:00
Christopher Faulet
fd8ce788a5 MINOR: muxes: Implement ->sctl() callback for muxes and return the stream id
All muxes now implement the ->sctl() callback function and are able to
return the stream ID. For the PT multiplexer, it is always 0. For the H1
multiplexer it is the request count for the current H1 connection (added for
this purpose). For the FCGI, H2 and QUIC muxes, the stream ID is returned.

The stream ID is returned as a signed 64 bits integer.
2023-11-29 11:11:12 +01:00
Christopher Faulet
d982a37e4c MINOR: muxes: Rename mux_ctl_type values to use MUX_CTL_ prefix
Instead of the generic MUX_ prefix, we now use the MUX_CTL_ prefix for all
mux_ctl_type values. This will avoid any ambiguities with other enums, especially with a
new one that will be added to get information on mux streams.
2023-11-29 11:11:12 +01:00
Christopher Faulet
0b8e7d666e MINOR: stream: Expose the stream's uniq_id via a new sample fetch
"txn.id32" may now be used to get the stream's uniq_id. It is equivalent to
%rt in logs.
2023-11-29 11:11:12 +01:00
Christopher Faulet
b1eb3bc9a2 MINOR: stream: add a sample fetch to get the number of connection retries
"txn.conn_retries" can now be used to get the number of connection
retries. This value is only stable once the connection is fully
established. For HTTP sessions, L7-retries must also be passed.
2023-11-29 11:11:12 +01:00
Christopher Faulet
8f56552862 MINOR: stream: Expose session terminate state via a new sample fetch
It is now possible to retrieve the session termination state, using
"txn.sess_term_state". The sample fetch returns the 2-character session
termination state.

Of course, the result of this sample fetch is volatile. It is subject to
change. It is also most of the time useless because no termination state is set
except at the end. It should only be useful in http-after-response rule
sets. It may also be used to customize the logs using a log-format
directive.

This patch should fix the issue #2221.
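
For illustration, a hypothetical http-after-response rule exposing the state
(the header name is arbitrary, for debugging only):

    # add the 2-character termination state to every response
    http-after-response set-header X-Term-State %[txn.sess_term_state]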
2023-11-29 11:11:12 +01:00
Christopher Faulet
0fd25514d6 MEDIUM: http-ana: Set termination state before returning haproxy response
When, for any reason and at any step, HAProxy decides to interrupt an HTTP
transaction, it returns a dedicated response to the client (possibly empty)
and it sets the stream flags used to produce the session termination state.
Both of these operations were performed in any order, depending on the code
path. Most of the time, the HAProxy response was produced first.

With this patch, the stream flags for the termination state are now set
first. This way, these flags become visible from http-after-response rule
sets. Only errors occurring while the HAProxy response is generated are reported later.
2023-11-29 11:11:12 +01:00
Christopher Faulet
2de9e3ae24 MINOR: http-fetch: Add a sample to get the transaction status code
It was already possible to get the status code of the HTTP response and the one
received from the server. Thanks to 'txn.status', it is now possible to get
the transaction status code. It is equivalent to '%ST' in log-format.

Most of the time, it is the same as 'status', except if the status code of the
HTTP reply does not match the one used to interrupt the transaction. For
instance, an errorfile mapped on 400 but containing a 404 response.
2023-11-29 11:11:12 +01:00
Christopher Faulet
b2f82b2b51 MINOR: http-fetch: Add a sample to retrieve the server status code
The code returned by the "status" sample fetch is the one in the HTTP
response at the moment the sample is evaluated. It may be the status code in
the server response or the one of the HAProxy reply in case of error, deny,
redirect...

However, it could be handy to retrieve the status code returned by the
server, when an HTTP response was really received from it. It is the purpose
of the "server_status" sample fetch. The server status code itself is stored
in the HTTP txn.
2023-11-29 11:11:12 +01:00
Amaury Denoyelle
263f4e3d9c MINOR: h3: use correct error code for missing SETTINGS
Each received HTTP/3 frame is checked to ensure it is valid given the
type of stream and its current status. This was implemented via
h3_is_frame_valid().

Previously, no distinction was made for the error code, so every failure
triggered a CONNECTION_CLOSE_APP with code H3_FRAME_UNEXPECTED. However,
this function also ensures that the first frame received on the control
stream is of type SETTINGS. If not, the error code to use is
H3_MISSING_SETTINGS.

To support this, adjust the function prototype. Instead of returning a
boolean, 0 is returned for success, or an HTTP/3 error code. The function
is renamed h3_check_frame_valid() to reflect the return type change.

This is not considered a bug as previously the connection was
correctly closed on a missing SETTINGS, albeit with a non-conformant error
code. It's not deemed sufficient to be backported.
2023-11-29 09:24:20 +01:00
Amaury Denoyelle
74ba22b1ee BUG/MINOR: h3: always reject PUSH_PROMISE
The condition for checking PUSH_PROMISE was not correctly interpreted
from the RFC. Initially, it rejected such a frame for every stream
initiated from the client side.

In fact, the RFC indicates that PUSH_PROMISE frames are never sent by a client.
Thus, they can be rejected in any case until HTTP/3 is implemented on
the backend side.

This should be backported up to 2.6.
2023-11-29 09:24:20 +01:00
Amaury Denoyelle
81a4cc666d BUG/MINOR: h3: fix TRAILERS encoding
HTTP/3 trailers encoding was never working as intended. It's because
h3_trailers_to_htx() manipulates a newly allocated buffer instead of the
already existing channel one. Thus, the HTX message handled by the stream
was incomplete as it lacked trailers and EOM.

Fix this by reusing the already allocated channel buffer in
h3_trailers_to_htx().

This bug was detected by simulating TRAILERS emission which generates a
CL--- state due to a missing request-side termination signal. Its impact
is deemed minimal as trailers are pretty infrequent for now in
HTTP/3.

This must be backported up to 2.7.
2023-11-29 09:24:19 +01:00
Christopher Faulet
07691a2e7c CLEANUP: log: Fix %rc comment in sess_build_logline()
%rq was used instead of %rc.
2023-11-29 08:59:27 +01:00
Christopher Faulet
61749d7cb7 BUG/MEDIUM: mux-quic: Stop zero-copy FF during nego if input is not empty
When the producer negotiates with the QUIC mux to perform a zero-copy
fast-forward, data in the input buffer are first transferred into the H3
buffer. However, after the transfer, if the input buffer is not empty, the
data fast-forwarding must be stopped. In this case, qmux_nego_ff() must
return 0.

No backport needed.
2023-11-29 08:59:27 +01:00
Christopher Faulet
a053512a7f BUG/MEDIUM: master/cli: Properly pin the master CLI on thread 1 / group 1
A previous fix was pushed for that (13fb7170be "BUG/MEDIUM: master/cli: Pin
the master CLI on the first thread of the group 1" ). Unfortunately, instead
of the master CLI, it is the sockpairs between the master and the workers
that were pinned to the first thread of the group 1. So the crash is still
there.

So, again, to fix the bug the master CLI is now pinned on the first thread
of the first group.

This patch should fix the issue #2259 and must be backported to 2.8.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
d3cbd36950 BUG/MINOR: compression: possible NULL dereferences in comp_prepare_compress_request()
This bug was introduced in ead43fe4f2 ("MEDIUM: compression: Make it so
we can compress requests as well.")

2 cases were not properly handled, resulting in 2 possible NULL
dereferences leading to crashes in the function at runtime:
 - when the backend didn't define any compression options so its comp
   pointer is NULL (ie: if only the frontend defines some comp options)
 - when both the frontend and the backend didn't set a compression algo
   but at least one of the two defined some other comp options (comp
   pointer set)

For the first case, we added the missing checks to make sure we don't
read ->comp pointer if it is NULL.
For the second case, we properly return from the function if no
compression algo is defined, because there is no default value that could
be used as a fallback.

This should be backported to 2.8.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
2f2cb6d082 MEDIUM: log/balance: support FQDN for UDP log servers
In the previous log backend implementation, we created a pseudo log target
for each declared log server, and we made the log target's address point
to the actual server address to save some time and prevent unnecessary
copies.

But this was done without knowing that when an FQDN is involved (more broadly
when dns/resolution is involved), the "port" part of the server addr should
not be relied upon, and we should explicitly use ->svc_port for that
purpose.

With that in mind and thanks to the previous commit, some changes were
required: we allocate a dedicated addr within the log target when target
is in DGRAM mode. The addr is first initialized with known values and it
is then updated automatically by _srv_set_inetaddr() during runtime.
(the change is atomic so readers don't need to worry about it)

The addr of a server "log target" (INET/DGRAM mode) is made of the combination
of the server's address (lacking the port part) and the server's svc_port.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
cd994407a9 BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates
For inet families (IP4/IP6), it is expected that server's addr/port might
be updated at runtime from DNS, cli or lua for instance.

Such updates were performed under the server's lock.

Unfortunately, most readers such as backend.c or sink.c perform the read
without taking server's lock because they can't afford slowing down their
processing for a type of event which is normally rare. But this could
result in bad values being read for the server addr:svc_port tuple (ie:
during connection establishment) as a result of concurrent updates from
external components, which can obviously cause some undesirable effects.

Instead of slowing the readers down, as we consider server's addr changes
are relatively rare, we take another approach and try to update the
addr:port atomically by performing changes under full thread isolation
when a new change is requested. The changes are performed by a dedicated
task which takes care of isolating the current thread and doesn't depend
on other threads (independent code path) to protect against deadlocks.

As such, server's addr:port changes will now be performed atomically, but
they will not be processed instantly, they will be translated to events
that the dedicated task will pick up from time to time to apply the
pending changes.

This bug existed for a very long time and has never been reported so
far. It was discovered by reading the code during the implementation
of log backend ("mode log" in backends). As it involves changes in
sensitive areas as well as thread isolation, it is probably not
worth considering backporting it for now, unless it is proven that it
will help to solve bugs that are actually encountered in the field.

This patch depends on:
 - 24da4d3 ("MINOR: tools: use const for read only pointers in ip{cmp,cpy}")
 - c886fb5 ("MINOR: server/ip: centralize server ip updates")
 - event_hdl API (which was first seen on 2.8) +
   683b2ae ("MINOR: server/event_hdl: add SERVER_INETADDR event") +
   BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr() +
   "MINOR: event_hdl: add global tunables"

   Note that the patch may be reworked so that it doesn't depend on
   event_hdl API for older versions, the approach would remain the same:
   this would result in a larger patch due to the need to manually
   implement a global queue of pending updates with its dedicated task
   responsible for picking updates and comitting them. An alternative
   approach could consist in per-server, lock-protected, temporary
   addr:svc_port storage dedicated to "updaters" were only the most
   recent values would be kept. The sync task would then use them as
   source values to atomically update the addr:svc_port members that the
   runtime readers are actually using.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
cb3ec978fd MINOR: event_hdl: add global tunables
The local variable "event_hdl_async_max_notif_at_once" which was
introduced with the event_hdl API was left as is but with a TODO note
telling that we should make it a global tunable.

Well, we're doing this now. To prepare for upcoming tunables related to
event_hdl API, we add a dedicated struct named event_hdl_tune which is
globally exposed through the event_hdl header file so that it may be used
from everywhere. The struct is automatically initialized in
event_hdl_init() according to defaults.h.

"event_hdl_async_max_notif_at_once" now becomes
"event_hdl_tune.max_events_at_once" with its dedicated
configuration keyword: "tune.events.max-events-at-once".

We're also taking this opportunity to raise the default value from 10
to 100 since it seems quite reasonable given existing async event_hdl
users.

The documentation was updated accordingly.
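
For illustration, a minimal global sketch using the new keyword (100 is the
new default, shown here only as an example value):

    global
        # cap the number of events an async event_hdl subscriber handles per wakeup
        tune.events.max-events-at-once 100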
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
f638d4b1bc BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr()
As reported in GH #2358, #2359, #2360, #2361 and #2362: ipv6 address
handling may cause memory overrun due to struct in6_addr being handled
as sockaddr_in6 which is larger. Moreover, the source variable wasn't properly
read since its raw value was used as a pointer instead of the actual
variable's address.

This bug was introduced by 6fde37e046
("MINOR: server/event_hdl: add SERVER_INETADDR event")

Unfortunately for us, gcc didn't catch this, and it actually used to
"work" by accident since the in6_addr struct is made of an array, so not passing
the pointer explicitly still resolved to the proper starting address.
Fortunately this was caught by coverity, so thanks to Ilya for that.

The fix is simple: we simply copy the whole in6_addr struct by accessing
it using a pointer and using the proper struct size for the copy.
2023-11-29 08:59:27 +01:00
William Lallemand
08f1e2bea2 MINOR: mworker/cli: implements the customized payload pattern for master CLI
Implements the customized payload pattern for the master CLI.

The pattern is stored in the stream in char pcli_payload_pat[8].

The principle is basically the same as the CLI one: it looks for '<<',
then stores what's between '<<' and '\n', and looks for it to exit
payload mode.
2023-11-28 19:13:49 +01:00
William Lallemand
dd38c37777 CLEANUP: mworker/cli: use a label to return errors
Remove the returns in the function to end directly at the end label.
2023-11-28 19:12:32 +01:00
William Lallemand
e3557c7d45 MEDIUM: cli: allow custom pattern for payload
The CLI payload syntax has a limitation: it can't handle payloads
containing empty lines, which is a common problem when uploading a PEM file
over the CLI.

This patch implements a way to customize the ending pattern of the CLI,
so we can look for something other than empty lines.

A char cli_payload_pat[8] is used in the appctx to store the customized
pattern. The pattern can't be more than 7 characters and can still be empty
to match an empty line.

The cli_io_handler() identifies the pattern and stores it, and
cli_parse_request() identifies the end of the payload.

If the customized pattern between "<<" and "\n" is more than 7
characters, it is not considered as a pattern.

This patch only implements the parser for the 'stats socket', another
patch is needed for the 'master CLI'.
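
For illustration, a minimal sketch of uploading a PEM containing empty lines
with a custom end marker (the socket path, certificate name and the "MYEOF"
marker are only examples):

    echo -e "set ssl cert site1.pem <<MYEOF\n$(cat site1.pem.new)\nMYEOF\n" | \
        socat - /var/run/haproxy.sock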
2023-11-28 19:12:32 +01:00
Remi Tricot-Le Breton
23c810d042 BUG/MINOR: cache: Remove incomplete entries from the cache when stream is closed
When a stream is interrupted by the client before the full answer is
stored in the cache, we end up with an incomplete entry in the cache
that cannot be overwritten until it "naturally" expires. In such a case,
we call the cache filter's cache_store_strm_deinit callback without ever
calling cache_store_http_end which means that the 'complete' flag is
never set on the concerned cache_entry.
This patch adds a check on the 'complete' flag in the strm_deinit
callback and removes the entry from the cache if it is incomplete.

A way to exhibit this bug is to try to get the same "big" response on
multiple clients at the same time thanks to h2load for instance, and to
interrupt the client side before the answer can be fully stored in the
cache.

This patch can be backported up to 2.4 but it will need some rework
starting with branch 2.8 because of the latest cache changes.
2023-11-28 17:18:48 +01:00
Frédéric Lécaille
ad61a5dde3 REORG: quic: Move quic_increment_curr_handshake() to quic_sock
Move quic_increment_curr_handshake() from quic_conn.c to quic_sock.h to be inlined.
Also move all the inlined functions at the end of this header.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
3e16784dfc REORG: quic: Remove qc_pkt_insert() implementation
As this function does only a few things and has a not very well chosen name,
remove it and replace it by its statements at the unique location where it
is called.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
95e9033fd2 REORG: quic: Add a new module for retransmissions
Move several functions in relation with the retransmissions from TX part
(quic_tx.c) to quic_retransmit.c new C file.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
714d1096bc REORG: quic: Move qc_notify_send() to quic_conn
Move qc_notify_send() from quic_tx.c to quic_conn.c. Note that it was already
exported from both quic_conn.h and quic_tx.h. Modify this latter header
to fix the duplication.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
b5970967ca REORG: quic: Add a new module for QUIC retry
Add quic_retry.c new C file for the QUIC retry feature:
   quic_saddr_cpy() moved from quic_tx.c,
   quic_generate_retry_token_aad() moved from
   quic_generate_retry_token() moved from
   parse_retry_token() moved from
   quic_retry_token_check() moved from
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
43fbea0f38 REORG: quic: Move ncbuf related function from quic_rx to quic_conn
Move quic_get_ncbuf() and quic_free_ncbuf() from quic_rx.c to quic_conn.h
as static inlined functions.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
e0d3eb496b REORG: quic: Move NEW_CONNECTION_ID frame builder to quic_cid
Move qc_build_new_connection_id_frm() from quic_conn.c to quic_cid.c.
Also move quic_connection_id_to_frm_cpy() from quic_conn.h to quic_cid.h.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
795d1a57bf REORG: quic: Rename some (quic|qc)_conn* objects to quic_conn_closed
These objects could be confused with the ones defined by the congestion control
part (quic_cc.c).
2023-11-28 15:47:16 +01:00
Frédéric Lécaille
0b872e24cd REORG: quic: Move qc_may_probe_ipktns() to quic_tls.h
This function is in relation with the Initial packet number space which is
more linked to the QUIC TLS specifications. Let's move it to quic_tls.h
to be inlined.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
c93ebcc59b REORG: quic: Move quic_build_post_handshake_frames() to quic_conn module
Move quic_build_post_handshake_frames() from quic_rx.c to quic_conn.c. This
is a function which is also called from the TX part (quic_tx.c).
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
3482455ddd REORG: quic: Move qc_handle_conn_migration() to quic_conn.c
This function manipulates only quic_conn objects. It definitely belongs
in quic_conn.c.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
581549851c REORG: quic: Move QUIC path definitions/declarations to quic_cc module
Move quic_path struct from quic_conn-t.h to quic_cc-t.h and rename it to quic_cc_path.
Update the code accordingly.
Also move some inlined functions related to the QUIC path to quic_cc.h.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f32fc26b62 REORG: quic: Rename some functions used upon ACK receipt
Rename some functions to better reflect their purpose.
Move qc_release_lost_pkts() to quic_loss.c
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f74d882ef0 REORG: quic: Move the QUIC DCID parser to quic_sock.c
Move quic_get_dgram_dcid() from quic_conn.c to quic_sock.c because it is
only used in this file, and define it as static.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
3b91756ebe REORG: quic: Move QUIC SSL BIO method related functions to quic_ssl.c
Move __quic_conn_init() and __quic_conn_deinit() from quic_conn.c to quic_ssl.c.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
09ab48472c REORG: quic: Move several inlined functions from quic_conn.h
Move quic_pkt_type(), quic_saddr_cpy(), quic_write_uint32(), max_available_room(),
max_stream_data_size(), quic_packet_number_length(), quic_packet_number_encode()
and quic_compute_ack_delay_us() to quic_tx.c because they are only used in this file.
Also move quic_ack_delay_ms() and quic_read_uint32() to quic_tx.c because they
are only used in this file.

Move quic_rx_packet_refinc() and quic_rx_packet_refdec() to quic_rx.h header.
Move qc_el_rx_pkts(), qc_el_rx_pkts_del() and qc_list_qel_rx_pkts() to quic_tls.h
header.
2023-11-28 15:37:47 +01:00
Frédéric Lécaille
831764641f REORG: quic: Move QUIC CRYPTO stream definitions/declarations to QUIC TLS
Move quic_cstream struct definition from quic_conn-t.h to quic_tls-t.h.
Its pool is also moved from quic_conn module to quic_tls. Same thing for
quic_cstream_new() and quic_cstream_free().
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
ae885b9b68 REORG: quic: Move CRYPTO data buffer definitions to QUIC TLS module
Move quic_crypto_buf struct definition from quic_conn-t.h to quic_tls-t.h.
Also move its pool definition/declaration to quic_tls-t.h/quic_tls.c.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
0fc0d45745 REORG: quic: Add a new module to handle QUIC connection IDs
Move quic_cid and quic_connection_id from quic_conn-t.h to a new quic_cid-t.h header.
Move definitions of quic_stateless_reset_token_init(), quic_derive_cid(),
new_quic_cid(), quic_get_cid_tid() and retrieve_qc_conn_from_cid() to a new
quic_cid.c C file.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
1564ec0a93 REORG: quic: Move some QUIC CLI code to its C file
Move init_quic() from quic_conn.c to quic_cli.c and rename it to cli_quic_init().
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
21615d4376 CLEANUP: quic: Remove dead definitions/declarations
Remove useless definitions and declarations.
2023-11-28 15:37:22 +01:00
Christopher Faulet
af733ef6e4 BUG/MEDIUM: mux-h2: Remove H2_SF_NOTIFIED flag for H2S blocked on fast-forward
When an H2 stream is blocked during data fast-forwarding, we must take care
to remove the H2_SF_NOTIFIED flag. This was only performed when data
fast-forwarding was attempted. However, if the H2 stream was blocked for any
reason, this flag was not removed. During our tests, we found it was
possible to block a connection indefinitely because one of its streams was in
the send_list with the flag set. In this case, the stream was no longer
woken up to resume the sends, blocking all other streams.

No backport needed.
2023-11-28 14:01:56 +01:00
Amaury Denoyelle
fe3726cb76 BUG/MINOR: quic: fix CONNECTION_CLOSE_APP encoding
CONNECTION_CLOSE_APP encoding is broken, which prevents the sending of
every packet carrying such a frame. This bug was always present in
haproxy's QUIC code. However, it was slightly hidden by the previous
code, which always initialized all frame members to zero; that was
sufficient to ensure CONNECTION_CLOSE_APP encoding was ok. The below
patch changed this behavior by removing this costly initialization step.

  4cf784f38e
  MINOR: quic: Avoid zeroing frame structures

Now, frame members must always be initialized individually according to
the type of frame used. However, for CONNECTION_CLOSE_APP this was not
done, as qc_cc_build_frm() accessed the wrong union member, referring to
a CONNECTION_CLOSE instead.

This bug was detected when trying to generate an HTTP/3 error. The
CONNECTION_CLOSE_APP frame encoding failed due to a non-initialized
<reason_phrase_len> which was too big. This was reported by the
following trace:
  "frame building error : qc@0x5555561b86c0 idle_timer_task@0x5555561e5050 flags=0x86038058 CONNECTION_CLOSE_APP"

This must be backported up to 2.6. This is necessary even if the above
commit is not, as the previous code is also buggy, albeit with a
different behavior.
2023-11-28 11:40:01 +01:00
Willy Tarreau
d656ac7e13 OPTIM: mux-h2/zero-copy: don't allocate more buffers per connections than streams
It's the exact same as commit 0a7ab7067 ("OPTIM: mux-h2: don't allocate
more buffers per connections than streams"), but for the zero-copy case
this time. Previously it was only done on the regular snd_buf() path, but
this one is needed as well. A transfer on 16 parallel streams now consumes
half of the memory, and a single stream consumes much less.

An alternate approach would be worth investigating in the future, based
on the same principle as the CF_STREAMER_FAST at the higher level: in
short, by monitoring how many mux buffers we write at once before refilling
them, we would get an idea of how much is worth keeping in buffers max,
given that anything beyond would just waste memory. Some tests show that
a single buffer already seems almost as good, except for single-stream
transfers, which is why it's worth spending more time on this.
2023-11-28 09:15:26 +01:00
Amaury Denoyelle
e97489a526 MINOR: trace: support -dt optional format
Add an optional argument for "-dt". This argument is interpreted as a
list of several trace statements separated by commas. For each statement,
a specific trace name can be specified, or none to act on all sources.
Using the double-colon separator, it is possible to add specifications on
the wanted level and verbosity.
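
For the record, a hypothetical invocation following this syntax could
look like the line below (the trace names, level and verbosity are only
examples):

    $ haproxy -f /etc/haproxy/haproxy.cfg -dt h2:developer:complete,quic:error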
2023-11-27 17:15:14 +01:00
Amaury Denoyelle
670520cff8 MINOR: trace: parse verbosity in a function
This patch is similar to the previous one except that it handles trace
verbosity. Trace source must be specified unless "quiet" is used.
2023-11-27 17:11:14 +01:00
Amaury Denoyelle
ed9fbeed78 MINOR: trace: parse level in a function
Extract the conversion of the level string argument to an integer value
into a dedicated internal function trace_parse_level(). This function is
used for CLI trace parsing and will also be useful for the "-dt" process
argument.
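
As an illustration only (the real trace_parse_level() maps the string
onto haproxy's internal TRACE_LEVEL_* values and its signature may
differ), such a helper essentially does:

    #include <string.h>

    /* map a level name to an index, -1 when unknown */
    static int parse_level_sketch(const char *arg)
    {
        static const char * const levels[] = { "error", "user", "proto",
                                               "state", "data", "developer" };
        for (int i = 0; i < (int)(sizeof(levels) / sizeof(*levels)); i++)
            if (strcmp(arg, levels[i]) == 0)
                return i;
        return -1;
    }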
2023-11-27 17:11:14 +01:00
Amaury Denoyelle
cef29d3708 MINOR: trace: define simple -dt argument
Add '-dt' haproxy process argument. This will automatically activate all
trace sources on stderr with the error level. This could be useful to
troubleshoot issues such as protocol violations.
2023-11-27 17:10:18 +01:00