Commit Graph

7693 Commits

Author SHA1 Message Date
Willy Tarreau
613e959c7b MINOR: cli/wait: add a condition to wait on a server to become unused
The "wait" command now supports a condition, "srv-unused", which waits
for the designated server to become totally unused, indicating that it
is removable. Upon each wakeup it calls srv_check_for_deletion() to
determine whether the conditions are met, not yet met but recoverable,
or unrecoverable, and proceeds accordingly, never waiting for a final
decision longer than the configured delay.
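
As a rough illustration of the three-way decision described above, here is a
minimal sketch of a polling loop. It is not HAProxy's code: every name in it
(toy_wait_status, toy_wait_for, toy_srv_unused) is hypothetical, and the
"configured delay" is modeled as a maximum number of wakeups.

```c
#include <stddef.h>

/* Hypothetical three-way result of a wait condition (illustrative names). */
enum toy_wait_status {
    TOY_WAIT_DONE,   /* condition met */
    TOY_WAIT_RETRY,  /* not met yet, but may still become true */
    TOY_WAIT_FAIL    /* can never become true */
};

typedef enum toy_wait_status (*toy_wait_cond_fn)(void *ctx, const char **msg);

/* Calls cond() once per wakeup until it settles or max_ticks wakeups have
 * elapsed. TOY_WAIT_RETRY as a return value means the delay expired first. */
enum toy_wait_status toy_wait_for(toy_wait_cond_fn cond, void *ctx,
                                  unsigned max_ticks, const char **msg)
{
    enum toy_wait_status st = TOY_WAIT_RETRY;
    unsigned tick;

    for (tick = 0; tick < max_ticks; tick++) {
        st = cond(ctx, msg);
        if (st != TOY_WAIT_RETRY)
            break;  /* final decision reached */
    }
    return st;
}

/* Toy condition: the "server" is unused once its connection count is 0.
 * Each call simulates one connection finishing. A negative count stands
 * for a server that can never be removed. */
enum toy_wait_status toy_srv_unused(void *ctx, const char **msg)
{
    int *conns = ctx;

    if (*conns < 0) {
        *msg = "server cannot be removed";
        return TOY_WAIT_FAIL;
    }
    if (*conns == 0) {
        *msg = "server is unused";
        return TOY_WAIT_DONE;
    }
    (*conns)--;
    *msg = "server still has connections";
    return TOY_WAIT_RETRY;
}
```

With two pending connections and a budget of five wakeups, the loop settles
on the third call; with a budget too small it returns the retry status, which
the caller would report as a timeout.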

The purpose is to make it possible to remove servers from the CLI after
waiting for their sessions to be terminated:

  $ socat -t5 /path/to/socket - <<< "
        disable server px/srv1
        shutdown sessions server px/srv1
        wait 2s srv-unused px/srv1
        del server px/srv1"

Or even wait for connections to terminate themselves:

  $ socat -t70 /path/to/socket - <<< "
        disable server px/srv1
        wait 1m srv-unused px/srv1
        del server px/srv1"
2024-02-09 20:38:08 +01:00
Willy Tarreau
66989ff426 MINOR: cli/wait: also pass up to 4 arguments to the external conditions
Conditions will need to have context, arguments etc from the command line.
Since these will vary with time (otherwise we wouldn't wait), let's just
pass them as text (possibly pre-processed). We're starting with 4 strings
that are expected to be allocated by strdup() and are always sent to free()
upon release.
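
The ownership rule above (strings allocated by strdup(), always passed to
free() on release) could look roughly like the following sketch. The struct
and function names are hypothetical and only the "4 strings" figure comes
from the commit message.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <stdlib.h>
#include <string.h>

#define TOY_WAIT_MAX_ARGS 4  /* "up to 4 arguments" */

/* Hypothetical wait context; field names are illustrative. */
struct toy_wait_ctx {
    char *args[TOY_WAIT_MAX_ARGS];  /* owned copies, NULL when unused */
};

/* Frees every argument. free(NULL) is a no-op, so unused slots are fine. */
void toy_wait_ctx_release(struct toy_wait_ctx *ctx)
{
    int i;

    for (i = 0; i < TOY_WAIT_MAX_ARGS; i++) {
        free(ctx->args[i]);
        ctx->args[i] = NULL;
    }
}

/* Copies up to TOY_WAIT_MAX_ARGS strings into the context.
 * Returns 0 on success, -1 on allocation failure (context left empty). */
int toy_wait_ctx_set_args(struct toy_wait_ctx *ctx,
                          const char *const *argv, int argc)
{
    int i;

    memset(ctx, 0, sizeof(*ctx));
    for (i = 0; i < argc && i < TOY_WAIT_MAX_ARGS; i++) {
        ctx->args[i] = strdup(argv[i]);
        if (!ctx->args[i]) {
            toy_wait_ctx_release(ctx);
            return -1;
        }
    }
    return 0;
}
```

Because release always walks all four slots, the caller does not need to
remember how many arguments were actually set.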
2024-02-09 20:38:08 +01:00
Willy Tarreau
2673f8be82 MINOR: cli/wait: also support an unrecoverable failure status
Since we'll support waiting for an action to succeed or permanently
fail, we need the ability to return an unrecoverable failure. Let's
add CLI_WAIT_ERR_FAIL for this. A static error message may be placed
into ctx->msg to report to the user why the failure is unrecoverable.
2024-02-09 20:38:08 +01:00
Willy Tarreau
9b680d7411 MINOR: server: split the server deletion code in two parts
We'll need to be able to verify whether or not a server may be deleted.
For now, both the verification and the action are performed in the same
function, at once under thread isolation. The goal here is to extract
the verification code into a new function that will perform these checks,
return a status between success/recoverable/non-recoverable failure, and
will also return a message for the caller.
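
The split between verification and action can be sketched as follows. This
is only an illustration: the toy_server fields and the two rules checked here
are assumptions made for the example, not the actual deletion conditions.

```c
#include <stddef.h>

/* Hypothetical status codes: success / recoverable / non-recoverable. */
enum toy_del_status { TOY_DEL_OK, TOY_DEL_RETRY, TOY_DEL_FATAL };

/* Toy server; the fields and rules below are assumptions for the example. */
struct toy_server {
    int is_dynamic;  /* assumed: only dynamically added servers may go */
    int cur_conns;   /* connections still attached */
    int deleted;
};

/* Verification only: reports whether deletion is possible and why not. */
enum toy_del_status toy_srv_check_deletable(const struct toy_server *s,
                                            const char **msg)
{
    if (!s->is_dynamic) {
        *msg = "only dynamic servers can be deleted";
        return TOY_DEL_FATAL;   /* waiting will not help */
    }
    if (s->cur_conns > 0) {
        *msg = "server still has connections";
        return TOY_DEL_RETRY;   /* may succeed later */
    }
    *msg = NULL;
    return TOY_DEL_OK;
}

/* Action: runs the same verification, then deletes only on success. */
enum toy_del_status toy_srv_delete(struct toy_server *s, const char **msg)
{
    enum toy_del_status st = toy_srv_check_deletable(s, msg);

    if (st == TOY_DEL_OK)
        s->deleted = 1;
    return st;
}
```

Keeping the check free of side effects is what lets another caller, such as
a waiting command, invoke it repeatedly without deleting anything.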
2024-02-09 20:38:08 +01:00
Willy Tarreau
1d2255a78a MINOR: cli: add a new "wait" command to wait for a certain delay
This makes it possible to insert delays between commands, e.g. to collect
the same set of metrics at a fixed interval. For example:

  $ socat -t20 /path/to/socket <<< "show activity; wait 10s; show activity"

The goal will be to extend the feature to optionally support waiting on
certain conditions. For this reason the struct definitions and enums were
placed into cli-t.h.
2024-02-08 21:54:54 +01:00
Willy Tarreau
8581d62daf MINOR: session: add the necessary functions to update the per-session glitches
This provides a new function session_add_glitch_ctr() that will update
the glitch counter and rate for the session, if tracked at all.
2024-02-08 15:51:49 +01:00
Willy Tarreau
c9c6b683fb MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
This adds a new pair of stored types in the stick-tables:
  - glitch_cnt
  - glitch_rate

These keep count of the number of glitches reported on a front connection,
in order to decide how to act with a badly defective client or a potential
attacker. For now nothing updates these counters, but all the infrastructure
needed to configure, update and retrieve them was added, including the doc.

No regtest was added yet since they're not filled yet.
2024-02-08 15:51:49 +01:00
Remi Tricot-Le Breton
befebf8b51 BUG/MEDIUM: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when the context is freed.
One downside of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets destroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are decorrelated from the ckch_store and
are the actual OCSP_CERTIDs, which is a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.

The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This makes it possible to chain some "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.
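
The two-counter scheme can be sketched as below. This is a simplified model
under stated assumptions: the struct, its fields and the helper names are
hypothetical, and the two trees are reduced to membership flags.

```c
/* Simplified model of an OCSP entry with two reference counters.
 * All names are illustrative; the trees are reduced to flags. */
struct toy_ocsp {
    int refcount_store;     /* references held by certificate stores */
    int refcount_instance;  /* references held by live instances */
    int in_update_tree;     /* 1 while scheduled for automatic updates */
    int in_response_tree;   /* 1 while the response is still known */
};

/* Entry created when a certificate is loaded: one store reference. */
void toy_ocsp_init(struct toy_ocsp *o)
{
    o->refcount_store = 1;
    o->refcount_instance = 0;
    o->in_update_tree = 0;
    o->in_response_tree = 1;
}

/* An instance starts using the response: (re)enable automatic updates. */
void toy_ocsp_get_instance(struct toy_ocsp *o)
{
    if (o->refcount_instance++ == 0)
        o->in_update_tree = 1;
}

/* Last instance gone: stop updating, but keep the response around. */
void toy_ocsp_put_instance(struct toy_ocsp *o)
{
    if (--o->refcount_instance == 0)
        o->in_update_tree = 0;
}

/* Last store gone: the response can finally leave the response tree. */
void toy_ocsp_put_store(struct toy_ocsp *o)
{
    if (--o->refcount_store == 0)
        o->in_response_tree = 0;
}
```

Dropping every instance and then adding one back mirrors a "del ssl
crt-list" followed by an "add ssl crt-list": the response survives in between
and automatic updates resume.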

Must be backported to 2.8.
2024-02-07 17:10:05 +01:00
Remi Tricot-Le Breton
28e78a0a74 MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We then don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
2024-02-07 17:09:39 +01:00
Christopher Faulet
d7467cd495 MINOR: applet: Identify applets using their own buffers via a flag
These applets can now be identified by testing APPCTX_FL_INOUT_BUFS
flag. This will be useful to distinguish between the kinds of applets in helper functions.
2024-02-07 15:05:05 +01:00
Christopher Faulet
a9301c96f1 MINOR: applet: Use an option to disable zero-copy forwarding for all applets
At the beginning of the 3.0-dev cycle, the zero-copy forwarding support was
added only for the cache applet with an option to disable it. This was a
hack, waiting for a better integration with applets. It is now possible to
implement the zero-copy forwarding for any applets. So the specific option
for the cache applet was renamed to be used for all applets. And this option
is now also checked for the stats applet.

Concretely, 'tune.cache.zero-copy-forwarding' was renamed to
'tune.applet.zero-copy-forwarding'.
2024-02-07 15:05:01 +01:00
Christopher Faulet
ee53d8421f MEDIUM: applet: Simplify a bit API to exchange data with applets
Default .rcv_buf and .snd_buf functions that applets can use are now
specialized to manipulate raw buffers or HTX buffers.

Thus a TCP applet should use appctx_raw_rcv_buf() and appctx_raw_snd_buf()
while HTTP applet should use appctx_htx_rcv_buf() and appctx_htx_snd_buf().

Note that the appctx is now directly passed to these functions instead of
the SC.
2024-02-07 15:04:52 +01:00
Christopher Faulet
868205943c MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
Just like for the cache applet, it is now possible to send response to the
opposite side using the zero-copy forwarding. Internal functions were
slightly updated but there is nothing special to say. Except the requested
size during the nego stage is not exact.
2024-02-07 15:04:48 +01:00
Christopher Faulet
1c18d32a0d MEDIUM: stconn: Notify requested size during zero-copy forwarding nego is exact
It is now possible to use a flag during zero-copy forwarding negotiation to
specify that the requested size is exact, meaning the producer really expects to
receive at least this amount of data.

It can be used by the consumer to prepare some processing at this stage, based
on the requested size. For instance, in the H1 mux, it is used to write the
next chunk size.
2024-02-07 15:04:38 +01:00
Christopher Faulet
2297f52734 MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
During zero-copy forwarding negotiation, a pseudo flag was already used to
notify the consumer if the producer is able to use kernel splicing or not. But
this was not extensible. So, now we use a true bitfield to be able to pass flags
during the negotiation. NEGO_FF_FL_* flags may be used now.

Of course, for now, there is only one flag, the kernel splicing support on
producer side (NEGO_FF_FL_MAY_SPLICE).
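
To illustrate why a bitfield is more extensible than a single pseudo flag,
here is a minimal sketch of a negotiation helper. The TOY_FF_FL_* names, their
values and the function signature are assumptions for the example; only the
idea of a "may splice" bit comes from the commit.

```c
#include <stddef.h>

/* Illustrative flag values, modeled on the NEGO_FF_FL_* idea. */
#define TOY_FF_FL_NONE        0x00000000u
#define TOY_FF_FL_MAY_SPLICE  0x00000001u  /* producer can use kernel splicing */
/* Further bits can be added here without changing any prototype. */

/* Consumer side of a toy negotiation: grants at most <room> bytes out of the
 * <requested> ones and reports through <use_splice> whether both sides can
 * splice. */
size_t toy_nego_ff(size_t room, size_t requested, unsigned int flags,
                   int consumer_can_splice, int *use_splice)
{
    *use_splice = (flags & TOY_FF_FL_MAY_SPLICE) && consumer_can_splice;
    return requested < room ? requested : room;
}
```

A new capability then only needs a new bit and a test on the consumer side,
rather than an extra argument on every call site.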
2024-02-07 15:04:29 +01:00
Christopher Faulet
39b6f5b04c MEDIUM: applet: Add support for zero-copy forwarding from an applet
Thanks to this patch, it is possible for an applet to directly send data to
the opposite endpoint. To do so, it must implement <fastfwd> appctx callback
function and set SE_FL_MAY_FASTFWD flag.

Everything will be handled by appctx_fastfwd() function. The applet is only
responsible to transfer data. If it sets <to_forward> value, it is used to
limit the amount of data to forward.
2024-02-07 15:04:01 +01:00
Christopher Faulet
62a81cb6a6 MINOR: applet: Add callback function to deal with zero-copy forwarding
This patch introduces the support for the callback function responsible to
produce data via the zero-copy forwarding mechanism. There is no
implementation for now. But <to_forward> field was added in the appctx
structure to let an applet inform how much data it wants to forward. It is
not mandatory but it will be used during the zero-copy forwarding
negotiation.
2024-02-07 15:03:57 +01:00
Christopher Faulet
cc7b141e1c MINOR: applet: Add an appctx flag to report shutdown to applets
There is no shutdown for reads and sends with applets. Both are performed
when the appctx is released. So instead of 2 flags, like for
muxes/connections, only one flag is used. But the idea is the same:
acknowledge the event at the applet level.
2024-02-07 15:03:50 +01:00
Christopher Faulet
14bd091fd7 MINOR: applet: Remove appctx state field to only use the flags
The appctx state was never really used as a state. It is only used to know
when an applet should be freed on the next wakeup. This can be converted to
a flag and the state can be removed. This is what this patch does.
2024-02-07 15:03:46 +01:00
Christopher Faulet
4434b03358 MINOR: applet: Add flags to deal with ends of input, ends of stream and errors
Dedicated appctx flags to report EOI, EOS and errors (pending or terminal) were
added with the functions to set these flags. It is pretty similar to what is
done in most muxes.
2024-02-07 15:03:42 +01:00
Christopher Faulet
e8655546b7 MINOR: applet: Add flags on the appctx and stop abusing its state
Till now, we've extended the appctx state to add some flags. However, the
field name is misleading. So a bitfield was added to handle real flags. And
helper functions to manipulate this bitfield were added.
2024-02-07 15:03:34 +01:00
Christopher Faulet
4ad8192ce4 MEDIUM: applet: Add the applet handler based on IN/OUT buffers
A dedicated function to run applets was introduced, in addition to the old
one, to deal with applets that use their own buffers. The main difference
here is that this handler does not use channels at all. It performs a
synchronous send before calling the applet and performs a synchronous
receive just after.

No applets are plugged on this handler for now.
2024-02-07 15:03:26 +01:00
Christopher Faulet
f81b704d01 MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
There is no tasklet to handle I/O subscriptions for applets, but functions
to deal with receives and sends from the SC layer were added. This means a
function to retrieve data from an applet with this synchronous version and a
function to push data to an applet with this synchronous version.

It is pretty similar to the functions used for muxes but there are some
differences. So for now, we keep them separated.

Zero-copy forwarding is not supported for now. In addition, there is no
subscription mechanism.
2024-02-07 15:03:23 +01:00
Christopher Faulet
525ec12305 MINOR: applet: Implement default functions to exchange data with channels
In this patch, we add default functions to copy data from a channel to the
<inbuf> buffer of an applet (appctx_rcv_buf) and another one to copy data
from <outbuf> buffer of an applet to a channel (appctx_snd_buf).

These functions are not used for now, but they will be used by applets to
define their <rcv_buf> and <snd_buf> callback functions. Of course, it will
be possible for a specific applet to implement its own functions but these
ones should be good enough for most applets. HTX and RAW buffers are
supported.
2024-02-07 15:03:18 +01:00
Christopher Faulet
361b81bfca MINOR: applet: Add support for callback functions to exchange data with channels
For now, it is not usable, but this patch introduces the support of callback
functions, in the applet structure, to exchange data between channels and
applets. It is pretty similar to callback functions defined by muxes.
2024-02-07 15:03:14 +01:00
Christopher Faulet
ab9d2c6ca8 MINOR: applet: Add dedicated IN/OUT buffers for appctx
It is the first patch of a series aimed to align applets on connections.
Here, dedicated buffers are added for applets. For now, buffers are
initialized and helper functions to deal with allocation are added. In
addition, flags to report allocation failures or full buffers are also
introduced. <inbuf> will be used to push data to the applet from the stream
and <outbuf> will be used to push data from the applet to the stream.
2024-02-07 15:03:01 +01:00
Christopher Faulet
0dd7ff0d67 MINOR: stconn: Be able to detect applets using HTX
IS_HTX_SC() macro is only usable if the stream-connector is attached to a
connection. It is a bit restrictive because this cannot work if the SC is
attached to an applet. So let's fix that by adding the support of applets
too.
2024-02-07 15:02:19 +01:00
Christopher Faulet
6734e56514 MINOR: task: Move wait_event in the task header file
wait_event structure was in connection header file because it is only used
by connections and muxes. But, this may change. For instance applets may be
good candidates to use it too. So, the structure is moved to the task header
file instead.
2024-02-07 15:02:13 +01:00
Willy Tarreau
25968c186a MINOR: debug: add an optional message argument to the BUG_ON() family
This commit adds support for an optional second argument to BUG_ON(),
WARN_ON(), CHECK_IF(), that can be a constant string. When such an
argument is given, it will be printed on a second line after the
existing first message that contains the condition.

This can be used to provide more human-readable explanations about
what happened, such as "too low on memory" or "memory corruption
detected" that may help a user resolve the incident by themselves.
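
One portable way to accept an optional second argument in a macro is to
append default arguments and let a helper macro pick the first two. The sketch
below shows that technique only; TOY_CHECK, toy_check() and the message format
are hypothetical and this is not necessarily how BUG_ON() is implemented. To
stay testable, the report is written into a buffer instead of crashing.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the report for a failed check into <out>: the condition on a first
 * line, then the optional message on a second line. Returns the number of
 * lines written (0 when the check did not fail). */
int toy_check(int failed, const char *cond, const char *msg,
              char *out, size_t len)
{
    if (len)
        out[0] = '\0';
    if (!failed)
        return 0;
    if (msg[0] == '\0') {
        snprintf(out, len, "check failed: %s\n", cond);
        return 1;
    }
    snprintf(out, len, "check failed: %s\n%s\n", cond, msg);
    return 2;
}

/* TOY_CHECK(out, len, cond) or TOY_CHECK(out, len, cond, "message").
 * Appending "" and 0 guarantees that the helper always finds a message
 * argument and a non-empty variadic tail, which keeps this valid C99. */
#define TOY_CHECK(out, len, ...) \
    TOY_CHECK_(out, len, __VA_ARGS__, "", 0)
#define TOY_CHECK_(out, len, cond, msg, ...) \
    toy_check(!!(cond), #cond, msg, out, len)
```

A call such as TOY_CHECK(buf, sizeof(buf), n > 4, "too low on memory")
produces the two-line form, while TOY_CHECK(buf, sizeof(buf), n > 4) keeps
the single-line one.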
2024-02-05 17:09:00 +01:00
Willy Tarreau
d417863828 MINOR: debug: support passing an optional message in ABORT_NOW()
The ABORT_NOW() macro is not much used since we have BUG_ON(), but
there are situations where it makes sense, typically if the program
must always die regardless of DEBUG_STRICT, or if the condition must
always be evaluated (e.g. decompress something and check it).

It's not convenient not to have any hint about what happened there. But
providing too much info also results in wiping some registers, making
the trace less exploitable, so a compromise must be found.

What this patch does is to provide the support for an optional argument
to ABORT_NOW(). When an argument is passed (a string), then a message
will be emitted with the file name, line number, the message and a
trailing LF, before the stack dump and the crash. It should be used
reasonably, for example in functions that have multiple calls that need
to be more easily distinguished.
2024-02-05 17:09:00 +01:00
Willy Tarreau
bc70b385fd MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
As seen in previous commit 59acb27001 ("BUILD: quic: Variable name typo
inside a BUG_ON()."), it can sometimes happen that with DEBUG forced
without DEBUG_STRICT, BUG_ON() statements are ignored. Sadly, it means
that typos there are not even build-tested.

This patch makes these statements reference sizeof(cond) to make sure
the condition is parsed. This doesn't result in any code being emitted,
but makes sure the expression is correct so that an issue such as the one
above will fail to build (which was verified).
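
The sizeof() trick can be sketched as follows. TOY_BUG_ON and
TOY_DEBUG_STRICT are hypothetical stand-ins; the point is only that sizeof
parses and type-checks its operand without evaluating it (variable-length
array types aside), so a typo in the condition still breaks the build while
no code is emitted.

```c
#ifdef TOY_DEBUG_STRICT
#  include <stdlib.h>
/* Enabled: evaluate the condition and crash when it matches. */
#  define TOY_BUG_ON(cond) do { if (cond) abort(); } while (0)
#else
/* Disabled: the condition is compiled but never evaluated. */
#  define TOY_BUG_ON(cond) do { (void)sizeof(cond); } while (0)
#endif

static int toy_bumps;

/* Helper with a visible side effect, to show whether it was evaluated. */
int toy_bump(void)
{
    return ++toy_bumps;
}

/* Returns how many times toy_bump() ran: 0 when checks are disabled. */
int toy_bumps_after_check(void)
{
    toy_bumps = 0;
    TOY_BUG_ON(toy_bump() > 100);
    return toy_bumps;
}
```

Misspelling toy_bump in the condition fails to compile in both modes, which
is the property the patch is after.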

This may be backported as it can help spot failed backports as well.
2024-02-05 15:09:37 +01:00
Aurelien DARRAGON
be0165b249 BUILD: debug: remove leftover parentheses in ABORT_NOW()
Since d480b7b ("MINOR: debug: make ABORT_NOW() store the caller's line
number when using abort"), building with 'DEBUG_USE_ABORT' fails with:

  |In file included from include/haproxy/api.h:35,
  |                 from include/haproxy/activity.h:26,
  |                 from src/ev_poll.c:20:
  |include/haproxy/thread.h: In function ‘ha_set_thread’:
  |include/haproxy/bug.h:107:47: error: expected ‘;’ before ‘_with_line’
  |  107 | #define ABORT_NOW() do { DUMP_TRACE(); abort()_with_line(__LINE__); } while (0)
  |      |                                               ^~~~~~~~~~
  |include/haproxy/bug.h:129:25: note: in expansion of macro ‘ABORT_NOW’
  |  129 |                         ABORT_NOW();                                    \
  |      |                         ^~~~~~~~~
  |include/haproxy/bug.h:123:9: note: in expansion of macro ‘__BUG_ON’
  |  123 |         __BUG_ON(cond, file, line, crash, pfx, sfx)
  |      |         ^~~~~~~~
  |include/haproxy/bug.h:174:30: note: in expansion of macro ‘_BUG_ON’
  |  174 | #  define BUG_ON(cond)       _BUG_ON     (cond, __FILE__, __LINE__, 3, "FATAL: bug ",     "")
  |      |                              ^~~~~~~
  |include/haproxy/thread.h:201:17: note: in expansion of macro ‘BUG_ON’
  |  201 |                 BUG_ON(!thr->ltid_bit);
  |      |                 ^~~~~~
  |compilation terminated due to -Wfatal-errors.
  |make: *** [Makefile:1006: src/ev_poll.o] Error 1

This is because of a leftover: abort()_with_line(__LINE__);
                                    ^^
Fixing it by removing the extra parentheses after 'abort' since the
abort() call is now performed under abort_with_line() helper function.

This was raised by Ilya in GH #2440.

No backport is needed, unless the above commit gets backported.
2024-02-05 14:55:04 +01:00
Willy Tarreau
d480b7be96 MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
Placing DO_NOT_FOLD() before abort() only works in -O2 but not in -Os which
continues to place only 5 calls to abort() in h3.o for call places. The
approach taken here is to replace abort() with a new function that wraps
it and stores the line number in the stack. This slightly increases the
code size (+0.1%) but when unwinding a crash, the line number remains
present now. This is a very low cost, especially if we consider that
DEBUG_USE_ABORT is almost only used by code coverage tools and occasional
debugging sessions.
2024-02-02 17:12:06 +01:00
Willy Tarreau
2bb192ba91 MINOR: debug: make sure calls to ha_crash_now() are never merged
As indicated in previous commit, we don't want calls to ha_crash_now()
to be merged, since it will make gdb return a wrong line number. This
was found to happen with gcc 4.7 and 4.8 in h3.c where 26 calls end up
as only 5 to 18 "ud2" instructions depending on optimizations. By
calling DO_NOT_FOLD() just before provoking the trap, we can reliably
avoid this folding problem. Note that this does not address the case
where abort() is used instead (DEBUG_USE_ABORT).
2024-02-02 17:12:06 +01:00
Willy Tarreau
e06e8a2390 MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
Modern compilers sometimes perform function tail merging and identical
code folding, which consist in merging identical occurrences of same
code paths, generally final ones (e.g. before a return, a jump or an
unreachable statement). In the case of ABORT_NOW(), it can happen that
the compiler merges all of them into a single one in a function,
defeating the purpose of the check which initially was to figure where
the bug occurred.

Here we're creating a DO_NOT_FOLD() macro which makes use of the line
number and passes it as an integer argument to an empty asm() statement.
The effect is a code position dependency which prevents the compiler
from merging the code till that point (though it may still merge the
following code). In practice it's efficient at stopping the compilers
from merging calls to ha_crash_now(), which was the initial purpose.
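
A minimal sketch of such a macro, assuming a GCC- or Clang-compatible
compiler with extended asm support. TOY_DO_NOT_FOLD() and toy_classify() are
hypothetical names, and the real definition may differ in its details.

```c
/* Empty asm statement taking the line number as an immediate operand.
 * Each expansion differs from the others, so the compiler has no identical
 * code sequences left to merge up to that point. */
#define TOY_DO_NOT_FOLD() \
    do { __asm__ __volatile__("" :: "i"(__LINE__)); } while (0)

/* Two failure paths with identical code. Without the macro the compiler is
 * free to merge them, which would hide which test actually matched. */
int toy_classify(int v)
{
    if (v < 0) {
        TOY_DO_NOT_FOLD();
        return -1;
    }
    if (v > 100) {
        TOY_DO_NOT_FOLD();
        return -1;
    }
    return 0;
}
```

The macro emits no instruction, so the function behaves the same with or
without it; only the compiler's freedom to merge the two paths changes.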

It may also be used to force certain optimization constructs since it
gives more control to the developer.
2024-02-02 17:12:06 +01:00
Christopher Faulet
3246f863d6 MEDIUM: stats: Be able to access a specific field into a stats module
It is now possible to selectively retrieve extra counters from stats
modules. H1, H2, QUIC and H3 fill_stats() callback functions are updated to
return a specific counter.
2024-02-01 12:00:53 +01:00
Christopher Faulet
fd366a106b MINOR: stats: Be able to access to registered stats modules from anywhere
The list of modules registered on the stats to expose extra counters is now
public. It is required to export these counters into the Prometheus
exporter.
2024-02-01 12:00:53 +01:00
Aurelien DARRAGON
42a97d9feb MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
set-bc-{mark,tos} actions are pretty similar to set-fc-{mark,tos} to set
mark/tos on packets sent from haproxy to server: set-bc-{mark,tos} actions
act on the whole backend/srv connection: from connect() to connection
teardown, thus they may only be used before the connection to the server
is instantiated, meaning that they are only relevant for request-oriented
rules such as tcp-request or http-request rules. For now their use is
limited to content request rules, because tos and mark information is
stored directly within the stream, thus it is required that the stream
already exists.

stream flags are used in combination with dedicated stream struct member
variables to pass 'tos' and 'mark' information so that it is correctly
considered during stream connection assignment logic (prior to
actually connecting to the server)

'tos' and 'mark' fd sockopts are taken into account in conn hash
parameters for connection reuse mechanism.

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
b4ee7b044e MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
In this patch we add the possibility to use sample expression as argument
for set-fc-{mark,tos} actions. To make it backward compatible with
previous behavior, during parsing we first try to parse the value as
an integer (decimal or hex notation), and then fall back to expr parsing
in case of failure.
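
The "integer first, expression otherwise" parsing order can be sketched like
this. The function and enum names are hypothetical, and the expression parser
itself is left out: the sketch only decides which path an argument takes.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum toy_arg_kind { TOY_ARG_INT, TOY_ARG_EXPR };

/* Tries to read <arg> as a decimal or 0x-prefixed hexadecimal integer.
 * On success stores it in <val> and returns TOY_ARG_INT. Otherwise returns
 * TOY_ARG_EXPR so that the caller can hand <arg> to its expression parser. */
enum toy_arg_kind toy_parse_value(const char *arg, unsigned long *val)
{
    char *end;
    unsigned long v;
    int base = 10;

    /* Reject signs, spaces and names up front: a literal starts with a digit. */
    if (!isdigit((unsigned char)arg[0]))
        return TOY_ARG_EXPR;
    if (arg[0] == '0' && (arg[1] == 'x' || arg[1] == 'X'))
        base = 16;

    errno = 0;
    v = strtoul(arg, &end, base);
    if (errno || end == arg || *end != '\0')
        return TOY_ARG_EXPR;  /* trailing garbage or overflow: not a literal */

    *val = v;
    return TOY_ARG_INT;
}
```

Existing configurations using plain numbers keep taking the integer path,
while anything else, such as a sample fetch name, falls through to the
expression parser.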

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
ea09075f59 OPTIM: connection: progressive hash for conn_calculate_hash()
Some CPU time is needlessly wasted in conn_calculate_hash(), because all
params are first copied into a temporary buffer before computing the
hash on the whole buffer. Instead, let's leverage the XXH progressive
hash update functions to avoid expensive memcpys.
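
The difference between the two approaches can be illustrated with any hash
that supports incremental updates. The sketch below deliberately uses FNV-1a
instead of XXH to stay self-contained, and the parameter struct is
hypothetical; it only shows that feeding the fields one by one gives the same
result as hashing a temporary copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FNV64_INIT  0xcbf29ce484222325ULL
#define TOY_FNV64_PRIME 0x100000001b3ULL

/* Incremental FNV-1a: feeds <len> bytes into the running hash <h>. */
uint64_t toy_fnv64_update(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--) {
        h ^= *p++;
        h *= TOY_FNV64_PRIME;
    }
    return h;
}

/* Hypothetical subset of connection parameters. */
struct toy_conn_params {
    uint32_t mark;
    uint8_t  tos;
    char     name[16];  /* NUL-terminated */
};

/* Old style: copy every field into a temporary buffer, then hash it. */
uint64_t toy_hash_with_copy(const struct toy_conn_params *p)
{
    unsigned char buf[sizeof(p->mark) + sizeof(p->tos) + sizeof(p->name)];
    size_t len = 0, nlen = strlen(p->name);

    memcpy(buf + len, &p->mark, sizeof(p->mark)); len += sizeof(p->mark);
    memcpy(buf + len, &p->tos, sizeof(p->tos));   len += sizeof(p->tos);
    memcpy(buf + len, p->name, nlen);             len += nlen;
    return toy_fnv64_update(TOY_FNV64_INIT, buf, len);
}

/* Progressive style: feed each field directly, no intermediate copy. */
uint64_t toy_hash_progressive(const struct toy_conn_params *p)
{
    uint64_t h = TOY_FNV64_INIT;

    h = toy_fnv64_update(h, &p->mark, sizeof(p->mark));
    h = toy_fnv64_update(h, &p->tos, sizeof(p->tos));
    h = toy_fnv64_update(h, p->name, strlen(p->name));
    return h;
}
```

Both functions return the same value for the same parameters, so switching
from one to the other does not change which connections are considered
equivalent.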
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
1de149fb6d CLEANUP: connection: remove obsolete comment in header file
0x00000008 bit for CO_FL_* flags is no longer unused since 8cc3fc73f1
("MINOR: connection: update rhttp flags usage"). Removing the comment
that says otherwise.
2024-02-01 10:58:30 +01:00
Amaury Denoyelle
4b5f557283 MINOR: mux-quic: realign Tx buffer if possible
A major reorganization of QUIC MUX sending has been implemented. Now
data transfer occurs over a single QCS buffer. This has improved
performance but at the cost of restrictions on snd_buf. Indeed, buffer
instances are now shared from stream callback snd_buf up to quic-conn
layer.

As such, snd_buf cannot freely manipulate data already present in the buffer.
In particular, realign has been completely removed by the previous
patches.

This commit reintroduces a partial realign support. This is only done if
the buffer contains only unsent data, via a new MUX function
qcc_realign_stream_txbuf() which is called during snd_buf.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
4513787d0d MEDIUM: mux-quic: properly handle conn Tx buf exhaustion
This commit is a direct follow-up on the major rearchitecture of send
buffering. This patch implements the proper handling of connection pool
buffer temporary exhaustion.

The first step is to be able to differentiate a fatal allocation error
from a temporary pool exhaustion. This is done via a new output argument
on qcc_get_stream_txbuf(). For a fatal error, application protocol layer
will schedule the immediate connection closing. For a pool exhaustion,
QCC is flagged with QC_CF_CONN_FULL and stream sending process is
interrupted. QCS instance is also registered in a new list
<qcc.buf_wait_list>.

A new connection buffer can become available when all ACKs are received
for an older buffer. This process is taken in charge by quic-conn layer.
It uses qcc_notify_buf() function to clear QC_CF_CONN_FULL and to wake
up every streams registered on buf_wait_list to resume sending process.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
cd22200d23 MEDIUM: mux-quic: release Tx buf on too small room
This commit is a direct follow-up on the major rearchitecture of send
buffering. It allows application protocol to react if current QCS
sending buffer space is too small. In this case, the buffer can be
released to the quic-conn layer. This makes it possible to allocate a new QCS
buffer and retry HTX parsing, unless connection buffer pool is already
depleted.

A new function qcc_release_stream_txbuf() serves as API for app protocol
to release the QCS sending buffer. This operation fails if there is
unsent data in it. In this case, MUX has to keep it to finalize transfer
of unsent data to quic-conn layer. QCS is thus flagged with
QC_SF_BLK_MROOM to interrupt snd_buf operation.

When all data are sent to the quic-conn layer, QC_SF_BLK_MROOM is
cleared via qcc_streams_sent_done() and stream layer is woken up to
restart snd_buf.

Note that a new function qcc_stream_can_send() has been defined. It
allows app proto to check if sending is currently blocked for the
current QCS. For now, it checks QC_SF_BLK_MROOM flag. However, it will
be extended to other conditions with the following patches.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
3fe3251593 MEDIUM: mux-quic: simplify sending API
The previous commit was a major rework for QUIC MUX sending process.
Following this, this patch cleans up a few elements that remain but can
be removed as they are duplicated.

Of notable changes, offset fields from QCS and QCC are removed. They are
both equivalent to flow control soft offsets.

A new function qcs_prep_bytes() is implemented. Its purpose is to return
the count of prepared data bytes not yet sent. It also replaces
qcs_need_sending().
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
00a3e5f786 MAJOR: mux-quic: remove intermediary Tx buffer
Previously, QUIC MUX sending was implemented with data transferred along
two different buffer instances per stream.

The first QCS buffer was used for HTX blocks conversion into H3 (or
other application protocol) during snd_buf stream callback. QCS instance
is then registered for sending via qcc_io_cb().

For each sending QCS, data memcpy is performed from the first to a
secondary buffer. A STREAM frame is produced for each QCS based on the
content of their secondary buffer.

This model is useful for QUIC MUX which has a major difference with
other muxes : data must be preserved longer, even after sent to the
lower layer. Data references are shared with quic-conn layer which
implements retransmission and data deletion on ACK reception.

This double-buffering stage was the first model implemented and remains
active until today. One of its major drawbacks is that it requires
memcpy invocation for every data transferred between the two buffers.
Another important drawback is that the first buffer is allocated by
each QCS individually without restriction. On the other hand, secondary
buffers are accounted for the connection. A bottleneck can appear if
secondary buffer pool is exhausted, causing unnecessary haproxy
buffering.

The purpose of this commit is to completely break this model. The first
buffer instance is removed. Now, application protocols will directly
allocate buffer from qc_stream_desc layer. This removes completely the
memcpy invocation.

This commit has a lot of code modifications. The most obvious one is the
removal of <qcs.tx.buf> field. Now, qcc_get_stream_txbuf() returns a
buffer instance from qc_stream_desc layer. qcs_xfer_data() which was
responsible for the memcpy between the two buffers is also completely
removed. Offset fields of QCS and QCC are now incremented directly by
qcc_send_stream(). These values are used as boundary with flow control
real offset to delimit the STREAM frames built.

As this change has a big impact on the code, this commit is only the
first part to fully support single buffer emission. For the moment, some
limitations are reintroduced and will be fixed in the next patches :

* on snd_buf if QCS sent buffer in use has room but not enough for the
  application protocol to store its content
* on snd_buf if QCS sent buffer is NULL and allocation cannot succeed
  due to connection pool exhaustion

One final important aspect is that extra care is necessary now in
snd_buf callback. The same buffer instance is referenced by both the
stream and quic-conn layer. As such, some operations such as realign
cannot be done freely anymore.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
c6ef55407c MINOR: mux-quic: remove unneeded sent-offset fields
Both QCS and QCC have their own sent offset field. These fields store
the newest offset sent to the quic-conn layer. It is similar to QCS/QCC
flow control real offset. This patch removes them and replaces them by
the latter for code clarification.

MINOR: mux-quic: remove unneeded qcc.tx.sent_offsets field

This commit has a similar purpose as the previous one, except that it removes QCC
<sent_offsets> field, now equivalent to connection flow control real
offset.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
d4bf6f0526 MEDIUM: mux-quic: limit conn flow control on snd_buf
This commit is a direct follow-up on the previous one. This time, it
deals with connection level flow control. Process is similar to stream
level : soft offset is incremented during snd_buf and real offset during
STREAM frame emission.

On MAX_DATA reception, both stream layer and QMUX are woken up if
necessary. One extra feature for conn level is the introduction of a new
QCC list to reference QCS instances. It will store instances for which
snd_buf callback has been interrupted on QCC soft offset reached. Every
stream instance is woken up on MAX_DATA reception if soft_offset is
unblocked.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
c44692356d MEDIUM: mux-quic: limit stream flow control on snd_buf
This patch is the first of two to reimplement flow control emission
limits check. The objective is to account for flow control earlier during
snd_buf stream callback. This should smooth transfers and prevent over
buffering on haproxy side if flow control limit is reached.

The current patch deals with stream level flow control. It reuses the
newly defined flow control type. Soft offset is incremented after HTX to
data conversion. If limit is reached, snd_buf is interrupted and stream
layer will subscribe on QCS.

On qcc_io_cb(), generation of STREAM frames is restricted as previously
to ensure to never surpass peer limits. Finally, flow control real
offset is incremented on lower layer send notification. Thus, it will
serve as a base offset for built STREAM frames. If limit is reached,
STREAM frames generation is suspended.

Each time QCS data flow control limit is reached, soft and real offsets
are reconsidered.

Finally, special care is used when flow control limit is incremented via
MAX_STREAM_DATA reception. If soft value is unblocked, stream layer
snd_buf is woken up. If real value is unblocked, qcc_io_cb() is
rescheduled.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
25493ca036 MINOR: mux-quic: define a flow control related type
Create a new module dedicated to flow control handling. It will be used
to implement earlier flow control update on snd_buf stream callback.

For the moment, only Tx part is implemented (i.e. limit set by the peer
that haproxy must respect for sending). A type quic_fctl is defined to
count emitted data bytes. Two offsets are used : a real one and a soft
one. The difference is that soft offset can be incremented beyond limit
unless it is already in excess.

Soft offset will be used for HTX to H3 parsing. As size of generated H3
is unknown before parsing, it allows to surpass the limit one time. Real
offset will be used during STREAM frame generation : this time the limit
must not be exceeded to prevent protocol violation.
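
The difference between the two offsets can be sketched as follows. This is
only an illustration under stated assumptions: the struct and function names
are hypothetical and do not reproduce the actual quic_fctl API. The soft
offset may go past the limit once, the real offset may never exceed it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type, not the haproxy quic_fctl definition. */
struct fctl_sketch {
    uint64_t limit;    /* maximum offset allowed by the peer */
    uint64_t off_real; /* bytes accounted on STREAM frame emission */
    uint64_t off_soft; /* bytes accounted during snd_buf (HTX to H3) */
};

/* Returns 1 if the soft offset has reached or passed the limit. */
static inline int fctl_soft_blocked(const struct fctl_sketch *fc)
{
    return fc->off_soft >= fc->limit;
}

/* Returns 1 if the real offset has reached the limit. */
static inline int fctl_real_blocked(const struct fctl_sketch *fc)
{
    return fc->off_real >= fc->limit;
}

/* Accounts <diff> bytes on the soft offset. It may go beyond the limit,
 * but only when the counter is not already blocked. Returns 1 if the
 * counter is blocked after the increment.
 */
static inline int fctl_soft_inc(struct fctl_sketch *fc, uint64_t diff)
{
    assert(!fctl_soft_blocked(fc));
    fc->off_soft += diff;
    return fctl_soft_blocked(fc);
}

/* Accounts <diff> bytes on the real offset, which must never exceed the
 * limit. Returns 1 if the counter is blocked after the increment.
 */
static inline int fctl_real_inc(struct fctl_sketch *fc, uint64_t diff)
{
    assert(diff <= fc->limit - fc->off_real);
    fc->off_real += diff;
    return fctl_real_blocked(fc);
}

/* Raises the limit (a lower value is ignored) and reports which of the
 * two offsets got unblocked by the update.
 */
static inline void fctl_set_limit(struct fctl_sketch *fc, uint64_t limit,
                                  int *unblocked_soft, int *unblocked_real)
{
    int was_soft = fctl_soft_blocked(fc);
    int was_real = fctl_real_blocked(fc);

    if (limit > fc->limit)
        fc->limit = limit;
    *unblocked_soft = was_soft && !fctl_soft_blocked(fc);
    *unblocked_real = was_real && !fctl_real_blocked(fc);
}
```

In such a design, an unblocked soft offset would be the signal to wake up the
stream layer, and an unblocked real offset the signal to resume STREAM frame
generation.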
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
f32c08be34 MINOR: mux-quic: prepare for earlier flow control update
Add a new argument to qcc_send_stream() to specify the count of sent
bytes.

For the moment this argument is unused. This commit is in fact a step to
implement earlier flow control update during stream layer snd_buf.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
220386ae40 BUG/MINOR: ssl/quic: fix 0RTT define
Previous patches have reorganized the define definitions for SSL 0RTT
support. However a typo was introduced. This caused haproxy to disable
0RTT support announcement and to report an erroneous warning for no
support on the SSL library side when using quictls/openssl compat layer.

This was detected by using ngtcp2-client. No 0RTT packets were emitted by
the client due to haproxy missing support advertisement.

The faulty commit is the following one :
  commit 5c45199347
  MEDIUM: ssl/quic: always compile the ssl_conf.early_data test

This must be backported wherever the above patch is.
2024-01-31 16:28:32 +01:00
Willy Tarreau
fadabc430f CLEANUP: h1: remove unused function h1_measure_trailers()
This one stopped being used in 2.1 when HTX became mandatory,
let's drop it.
2024-01-31 15:22:12 +01:00
William Lallemand
025f5105ee MINOR: ssl: rename HA_OPENSSL_HAVE_0RTT_SUPPORT constant to HAVE_SSL_0RTT_QUIC
Rename the constant to be more comprehensive.
2024-01-31 11:57:54 +01:00
William Lallemand
f5353f2c45 MINOR: ssl: add HAVE_SSL_0RTT constant
Add the HAVE_SSL_0RTT constant which defines whether the SSL library
supports 0RTT. It is different from HA_OPENSSL_HAVE_0RTT_SUPPORT, which was
used only in the context of QUIC.
2024-01-31 11:57:54 +01:00
Christopher Faulet
4837e99892 BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
It is similar to the previous fix but for the chunk size parsing. But this
one is more annoying because a poorly coded application in front of haproxy
may ignore the last digit before the LF thinking it should be a CR. In this
case it may be out of sync with HAProxy and that could be exploited to
perform some sort of request smuggling attack.

While it seems unlikely, it is safer to forbid LF without CR at the end of a
chunk size.
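
A strict chunk size parser along these lines could look like the sketch
below. It is not the haproxy h1 parser: the function name and return
convention are hypothetical, and chunk extensions are deliberately left out.
The point is only that a bare LF after the hexadecimal digits is an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. Parses a hexadecimal chunk size at the start of
 * <buf> (<len> bytes). Returns the number of bytes consumed including the
 * CRLF, 0 if more data is needed, or -1 on a malformed size line.
 */
static inline int parse_chunk_size_sketch(const char *buf, size_t len,
                                          uint64_t *size)
{
    uint64_t val = 0;
    size_t i = 0;
    int digits = 0;

    assert(buf != NULL && size != NULL);

    while (i < len) {
        unsigned char c = (unsigned char)buf[i];
        int d;

        if (c >= '0' && c <= '9')
            d = c - '0';
        else if (c >= 'a' && c <= 'f')
            d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            break;

        if (digits >= 16)
            return -1;          /* more than 64 bits of size */
        val = (val << 4) | (uint64_t)d;
        digits++;
        i++;
    }

    if (i == len)
        return 0;               /* size line not complete yet */
    if (!digits)
        return -1;              /* no digit at all */
    if (buf[i] != '\r')
        return -1;              /* bare LF (or garbage) is rejected */
    if (i + 1 == len)
        return 0;               /* CR seen, LF not received yet */
    if (buf[i + 1] != '\n')
        return -1;              /* CR must be followed by LF */

    *size = val;
    return (int)(i + 2);
}
```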

This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
2024-01-30 15:00:14 +01:00
Christopher Faulet
7b737da825 BUG/MINOR: h1: Don't support LF only at the end of chunks
When the message is chunked, all chunks must end with a CRLF. However, on
old versions, to support bad client or server implementations, the LF only
was also accepted. Nowadays, it seems useless and can even be considered as
an issue. Just forbid LF only at the end of chunks, it seems reasonable.

This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
2024-01-30 14:58:59 +01:00
Miroslav Zagorac
24a5e42db6 CLEANUP: log: deinitialization of the log buffer in one function
In several places in the source, there was the same block of code that was
used to deinitialize the log buffer.  There were even two functions that
did this, but they were called only from the code that is in the same
source file (free_tcpcheck_fmt() in src/tcpcheck.c and free_logformat_list()
in src/proxy.c - they were both static functions).

The function free_logformat_list() was moved from the file src/proxy.c to
src/log.c, and a check of the list before freeing the memory was added to
that function.
2024-01-30 08:27:26 +01:00
Willy Tarreau
e5ac9fc98b BUILD: makefile: also define cmd_CXX to pretty-print C++ build commands
Device Atlas' dummy lib will use a C++ file when built with cache
support, so for completeness we'll have to pretty-print it as well.
Let's define cmd_CXX.
2024-01-26 18:54:23 +01:00
Ilya Shipitsin
558d385c85 CLEANUP: fix spelling of "elemt" 2024-01-26 17:29:27 +01:00
Amaury Denoyelle
ad6b13d317 BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
QCS instances use qc_stream_desc for data buffering on emission. On
stream reset, its Tx channel is closed earlier than expected. This may
leave unsent data into qc_stream_desc.

Before this patch, these unsent data would remain after QCS freeing.
This prevents the buffer from being released as no ACK reception will
remove them. The buffer is only freed when the whole connection is closed.
As qc_stream_desc buffer is limited per connection, this reduces the buffer
pool for other streams of the same connection. In the worst case if
several streams are reset, this may completely freeze the transfer of
the remaining connection streams.

This bug was reproduced by reducing the connection buffer pool to a
single buffer instance by using the following global statement :

  tune.quic.frontend.conn-tx-buffers.limit 1.

Then a QUIC client is used which opens a stream for a large enough
object to ensure data are buffered. The client then emits a STOP_SENDING
before reading all data, which forces the corresponding QCS instance to
be reset. The client then opens a new request but the transfer is
frozen due to this bug.

To fix this, adjust qc_stream_desc API. Add a new argument <final_size>
on qc_stream_desc_release() function. Its value is compared to the
currently buffered offset in latest qc_stream_desc buffer. If
<final_size> is inferior, it means unsent data are present in the
buffer. As such, qc_stream_desc_release() removes them to ensure the
buffer will finally be freed when all ACKs are received. It is also
possible that no data remains immediately, indicating that ACKs were
already received. As such, buffer instance is immediately removed by
qc_stream_buf_free().

This must be backported up to 2.6. As this code section is known to be
prone to regressions, a period of observation could be reserved before
distributing it on LTS releases.
2024-01-26 16:02:05 +01:00
Frederic Lecaille
ab75d89e07 BUILD: quic: Fix build error when building QUIC against libressl.
The previous commit was not sufficient to completely fix the build issue
related to the TLS stack 0-RTT support. LibreSSL was the last TLS
stack to refuse to compile because of an undefined QUIC-specific function
for 0-RTT: SSL_set_quic_early_data_enabled().

To get rid of such compilation issues, define HA_OPENSSL_HAVE_0RTT_SUPPORT
only when building against TLS stack with 0-RTT support.

No need to backport.
2024-01-24 15:37:40 +01:00
Emeric Brun
ef02dba7bc BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
The initial purpose of CSV stats through CLI was to make it easily
parsable by scripts. But in some specific cases some error or warning
message strings containing LF were dumped into cells of this CSV.

This caused parsing failures in several tools. In addition, if a
warning or message contains two successive LFs, they will be dumped
directly but double LFs tag the end of the response on CLI and the
client may consider the response truncated.

This patch extends the 'csv_enc_append' and 'csv_enc' functions used
to format quoted string content according to RFC with an additional
parameter to convert multi-lines strings to one line: CRs are skipped,
and LFs are replaced with spaces. In addition and optionally, it is
also possible to remove resulting trailing spaces.
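
The conversion itself can be sketched as below. This is a simplified,
hypothetical helper and not the csv_enc_append() signature: it only shows the
one-line conversion (CR skipped, LF turned into a space, optional trimming of
trailing spaces) and leaves out the CSV quoting done by the real functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. Copies <src> into <dst> (capacity <size>, trailing
 * NUL included), skipping CRs and replacing LFs with spaces. When <strip>
 * is non-zero, trailing spaces are removed from the result. Returns the
 * length of the resulting string. Input that does not fit is truncated.
 */
static inline size_t csv_oneline_sketch(char *dst, size_t size,
                                        const char *src, int strip)
{
    size_t len = 0;

    assert(dst != NULL && src != NULL && size > 0);

    for (; *src && len + 1 < size; src++) {
        if (*src == '\r')
            continue;                       /* CRs are skipped */
        dst[len++] = (*src == '\n') ? ' ' : *src;
    }

    if (strip) {
        while (len > 0 && dst[len - 1] == ' ')
            len--;                          /* drop trailing spaces */
    }

    dst[len] = '\0';
    return len;
}
```

With such a helper, a message ending with two LFs can no longer be mistaken
for the end of the CLI response, since no LF is left in the cell.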

The call of this function to fill strings into stat's CSV output is
updated to force this conversion.

This patch should be backported on all supported branches (issue was
already present in v2.0)
2024-01-24 08:38:59 +01:00
Willy Tarreau
a3d6af6a0f MINOR: connection: add a new mux_ctl to report number of connection glitches
MUX_CTL_GET_GLITCHES will report the non-negative number of glitches
observed on a connection, or -1 if not supported.
2024-01-18 17:21:44 +01:00
William Lallemand
97832ab823 MEDIUM: ssl: implements 'default-crt' keyword for bind Lines
The 'default-crt' bind keyword allows specifying multiple
default/fallback certificates, allowing one to have an RSA as well as an
ECDSA default.
2024-01-12 17:40:42 +01:00
William Lallemand
83a0cde207 REORG: ssl: move 'generate-certificates' code to ssl_gencert.c
A lot of code specific to the 'generate-certificates' option was left in
ssl_sock.c.

Move the code to 'ssl_gencert.c' and 'ssl_gencert.h'
2024-01-12 17:40:42 +01:00
William Lallemand
b80635a7e0 MEDIUM: ssl: does not use default_ctx for 'generate-certificate' option
The 'generate-certificates' option does not need its dedicated SSL_CTX
*, it only needs the default SSL_CTX.

Use the default SSL_CTX found in the sni_ctx to generate certificates.

It allows to remove all the specific default_ctx initialization, as
well as the default_ssl_conf and 'default_inst'.
2024-01-12 17:40:42 +01:00
William Lallemand
0bf9d122a9 MEDIUM: ssl: generate '*' SNI filters for default certificates
This patch follows the previous one about default certificate selection
("MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA
selection").

This patch generates '*' SNI filters for the first certificate of a
bind line, which will be used to match default certificates, instead of
setting the default_ctx pointer in the bind line.

Since the filters are in the SNI tree, it allows to have multiple
default certificates and restores the ecdsa/rsa selection with a
multi-cert bundle.

This configuration:
   # foobar.pem.ecdsa and foobar.pem.rsa
   bind *:8443 ssl crt foobar.pem crt next.pem

will use "foobar.pem.ecdsa" and "foobar.pem.rsa" as default
certificates.

Note: there is still cleanup needed around default_ctx.

This was discussed in github issue #2392.
2024-01-12 17:40:42 +01:00
Amaury Denoyelle
c121fcef30 BUILD: quic: missing include for quic_tp
Add missing netinet/in.h required for in_addr/in6_addr types.

This should be backported up to 2.9.
2024-01-12 16:08:36 +01:00
Willy Tarreau
3c135569c5 MINOR: http: add infrastructure to choose status codes for err / fail
At the moment, http_err_cnt and http_fail_cnt are incremented on a
well-defined set of status codes, which are checked at various places.
Over time, there have been some complaints about 404, 401 or 407
triggering errors, or 500 triggering failures in SOAP environments
for example. With a small bit field that fits in a cache line we
can match the presence of a status code from 100 to 599, so that
remains cheap.

This patch adds two such bit fields, one per code class, and the
accompanying functions to set/clear/test the codes. The arrays are
preset at boot time. For now they are not used and it's not possible
to adjust them.
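
A bit field of this kind can be sketched as follows. The names are
hypothetical and this is not the code added by the patch. It only shows why
the approach is cheap: codes 100 to 599 need 500 bits, which round up to
64 bytes when stored in unsigned long words.

```c
#include <assert.h>
#include <limits.h>

#define SC_MIN        100
#define SC_MAX        599
#define SC_BITS       (SC_MAX - SC_MIN + 1)
#define SC_WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define SC_WORDS      ((SC_BITS + SC_WORD_BITS - 1) / SC_WORD_BITS)

/* Hypothetical set of HTTP status codes, one bit per code. */
struct sc_set_sketch {
    unsigned long bits[SC_WORDS];
};

/* Returns 1 if <status> is in the range covered by the bit field. */
static inline int sc_valid(int status)
{
    return status >= SC_MIN && status <= SC_MAX;
}

/* Adds <status> to the set. */
static inline void sc_set(struct sc_set_sketch *s, int status)
{
    unsigned int idx;

    assert(sc_valid(status));
    idx = (unsigned int)(status - SC_MIN);
    s->bits[idx / SC_WORD_BITS] |= 1UL << (idx % SC_WORD_BITS);
}

/* Removes <status> from the set. */
static inline void sc_clear(struct sc_set_sketch *s, int status)
{
    unsigned int idx;

    assert(sc_valid(status));
    idx = (unsigned int)(status - SC_MIN);
    s->bits[idx / SC_WORD_BITS] &= ~(1UL << (idx % SC_WORD_BITS));
}

/* Returns 1 if <status> is in the set, 0 otherwise (also when the code is
 * outside of the 100..599 range).
 */
static inline int sc_test(const struct sc_set_sketch *s, int status)
{
    unsigned int idx;

    if (!sc_valid(status))
        return 0;
    idx = (unsigned int)(status - SC_MIN);
    return (int)((s->bits[idx / SC_WORD_BITS] >> (idx % SC_WORD_BITS)) & 1UL);
}
```

Two such sets, one for errors and one for failures, would be preset at boot
and then consulted with a single test per response.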
2024-01-11 15:10:08 +01:00
Frédéric Lécaille
37d5a26cc5 CLEANUP: quic: Double quic_dgram_parse() prototype declaration.
This function is defined in the RX part (quic_rx.c) and declared in quic_rx.h
header. This is its correct place.

Remove the useless declaration of this function in quic_conn.h.

Should be backported in 2.9 where this double declaration was introduced when
moving quic_dgram_parse() from quic_conn.c to quic_rx.c.
2024-01-10 17:22:24 +01:00
Willy Tarreau
5c0128d942 IMPORT: ebtree: make string_equal_bits() return an unsigned
It used to return ssize_t for -1 but in fact we're using this -1 as
the largest possible value and the result is generally cast to signed
to check if the end was reached, so better make it clearly return an
unsigned value here.

This is cbtree commit e1e58a2b2ced2560d4544abaefde595273089704.
This is ebtree commit d7531a7475f8ba8e592342ef1240df3330d0ab47.
2024-01-06 13:35:42 +01:00
Willy Tarreau
b7068b3152 IMPORT: ebtree: use unsigned ints for flznz()
There's no reason to return signed values there. And it turns out that
the compiler manages to improve the performance by ~2%.

This is cbtree commit ab3fd53b8d6bbe15c196dfb4f47d552c3441d602.
This is ebtree commit 0ebb1d7411d947de55fa5913d3ab17d089ea865c.
2024-01-06 13:35:42 +01:00
Willy Tarreau
2a14f99dbb IMPORT: ebtree: make string_equal_bits turn back to unsigned char
With flsnz() instead of flsnz_long() we're now getting a better
performance on both x86 and ARM. The difference is that previously
we were relying on a function that was forcing the use of register
%eax for the 8-bit version and that was preventing the compiler
from keeping the code optimized. The gain is roughly 5% on ARM and
1% on x86.

This is cbtree commit 19cf39b2514bea79fed94d85e421e293be097a0e.
This is ebtree commit a9aaf2d94e2c92fa37aa3152c2ad8220a9533ead.
2024-01-06 13:35:42 +01:00
Willy Tarreau
1c46a07460 IMPORT: ebtree: rework the fls macros to better deal with arch-specific ones
The definitions were a bit of a mess and there wasn't even a fall back to
__builtin_clz() on compilers supporting it. Now we instead define a macro
for each implementation that is set on an arch-dependent case by case,
and add the fall back ones only when not defined. This also allows the
flsnz8() to automatically fall back to the 32-bit arch-specific version
if available. This shows a consistent 33% speedup on arm for strings.
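
The macro layering described above can be sketched like this. It is an
illustration only, with hypothetical names, and assumes the convention that
the result is the 1-based position of the highest set bit of a non-zero
value. A compiler-specific macro is defined first when available, and the
generic fallbacks are added only when nothing was defined.

```c
#include <assert.h>
#include <limits.h>

/* Portable implementation, always available. <x> must not be zero. */
static inline unsigned int flsnz32_generic(unsigned int x)
{
    unsigned int pos = 0;

    assert(x != 0);
    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Compiler-specific version, defined only where the builtin exists. */
#if defined(__GNUC__) || defined(__clang__)
#define flsnz32_sketch(x) \
    ((unsigned int)(sizeof(unsigned int) * CHAR_BIT) - \
     (unsigned int)__builtin_clz(x))
#endif

/* Fallback, used only when no specific version was defined above. */
#ifndef flsnz32_sketch
#define flsnz32_sketch(x) flsnz32_generic(x)
#endif

/* The 8-bit variant reuses whatever 32-bit version was selected. */
#ifndef flsnz8_sketch
#define flsnz8_sketch(x) flsnz32_sketch((unsigned int)(unsigned char)(x))
#endif
```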

This is cbtree commit c6075742e8d0a6924e7183d44bd93dec20ca8049.
This is ebtree commit f452d0f83eca72f6c3484ccb138d341ed6fd27ed.
2024-01-06 13:35:42 +01:00
Willy Tarreau
fc421e5b3d IMPORT: ebtree: switch the sizes and offsets to size_t and ssize_t
Let's use these in order to avoid 32-64 bit casts on 64 bit platforms.

This is cbtree commit e4f4c10fcb5719b626a1ed4f8e4e94d175468c34.
This is ebtree commit cc10507385c784d9a9e74ea9595493317d3da99e.
2024-01-06 13:35:13 +01:00
Willy Tarreau
9afe3b59a7 IMPORT: ebtree: implement and use flsnz_long() to count bits
The asm code shows multiple conversions. Gcc has always been terribly
bad at dealing with chars, which are constantly converted to ints for
every operation and zero-extended after each operation. But here in
addition there are conversions before and after the flsnz(). Let's
just mark the variables as long and use flsnz_long() to process them
without any conversion. This shortens the code and makes it slightly
faster.

Note that the fls operations could make use of __builtin_clz() on
gcc 4.6 and above, and it would be useful to implement native support
for ARM as well.

This is cbtree commit 1f0f83ba26f2279c8bba0080a2e09a803dddde47.
This is ebtree commit 9c38dcae22a84f0b0d9c5a56facce1ca2ad0aaef.
2024-01-06 13:35:13 +01:00
Christopher Faulet
7cc4151422 BUG/MEDIUM: stconn: Set fsb date if zero-copy forwarding is blocked during nego
During the zero-copy forwarding, if the consumer side reports it is blocked,
it means it is blocked on send. At the stream-connector level, the event
must be reported to be sure to set/update the fsb date. Otherwise, write
timeouts cannot be properly reported. If this happens when no other timeout
is armed, this freezes the stream.

This patch must be backported to 2.9.
2024-01-05 17:28:06 +01:00
Frédéric Lécaille
fd178ccdb0 BUILD: quic: Missing quic_ssl.h header protection
Such "#ifdef USE_QUIC" preprocessor statements are used by QUIC C headers
to avoid inclusion of QUIC headers when the QUIC support is not enabled
(by USE_QUIC make variable). Furthermore, this allows inclusion of QUIC
headers from C files without having to protect them with other "#ifdef USE_QUIC"
statements as follows:

   #ifdef USE_QUIC
   #include <a QUIC header>
   #include <another one QUIC header>
   #endif /* USE_QUIC */

So, here if this quic_ssl.h header was included by a C file, and compiled without
QUIC support, this will lead to build errors as follows:

 In file included from <a C file...>:
        include/haproxy/quic_ssl.h:35:35: warning: ‘enum ssl_encryption_level_t’
        declared inside parameter list will not be visible outside of this
        definition or declaration

Should be backported to 2.9 to avoid such building issues to come.
2024-01-04 13:56:44 +01:00
Frédéric Lécaille
860028db47 CLEANUP: quic: Remaining useless code into server part
Remove some QUIC definitions of members from server structure as the haproxy QUIC
stack does not support at all the server part (QUIC client) at this time.
Remove the statements in relation with their initializations.

This patch should be backported as far as 2.6 to save memory.
2024-01-04 11:16:06 +01:00
Willy Tarreau
afba58f21e MINOR: global: export a way to list build options
The new function hap_get_next_build_opt() will iterate over the list of
build options. This will be used for debugging, so that the build options
can be retrieved from the CLI.
2024-01-02 11:44:42 +01:00
Dragan Dosen
96c1a61136 MEDIUM: udp: allow to retrieve the frontend destination address
A new flag RX_F_PASS_PKTINFO is now available, whose purpose is to mark
that the destination address is about to be retrieved on some listeners.

The address can be retrieved from the first received datagram, and
relies on the IP_PKTINFO, IP_RECVDSTADDR and IPV6_RECVPKTINFO support.
2024-01-02 11:44:42 +01:00
Dragan Dosen
1582ccf9d3 MINOR: tcpcheck: export proxy_parse_tcpcheck()
Export proxy_parse_tcpcheck() in tcpcheck.h
2024-01-02 11:44:42 +01:00
Dragan Dosen
5b1609f9da MINOR: backend: export get_server_*() functions
This is in preparation for exposing more of the LB internals.
2024-01-02 11:44:42 +01:00
Aurelien DARRAGON
689784ed91 CLEANUP: resolvers: remove some more unused RSLV_UDP flags
RSLV_UPD_CNAME and RSLV_UPD_NAME_ERROR flags have now become useless since
3cf7f987 ("MINOR: dns: proper domain name validation when receiving DNS
response") as they are never set, but we forgot to remove them.
2024-01-02 10:29:41 +01:00
Aurelien DARRAGON
299501845d CLEANUP: resolvers: remove unused RSLV_UPD_OBSOLETE_IP flag
RSLV_UPD_OBSOLETE_IP was introduced with commit a8c6db8d2 ("MINOR: dns:
Cache previous DNS answers.") but the commit didn't make any use of it,
and today the flag is still unused. Since we have no valid use for it,
better remove it to prevent confusion.
2024-01-02 10:29:33 +01:00
Ilya Shipitsin
8705e45964 CLEANUP: assorted typo fixes in the code and comments
This is the 38th iteration of typo fixes
2024-01-02 10:19:48 +01:00
Frédéric Lécaille
10e96fcd17 BUG/MINOR: quic: Missing call to TLS message callbacks
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).

The TLS capture of information from client hello enabled by
tune.ssl.capture-buffer-size could not work with USE_QUIC_OPENSSL_COMPAT. This
is due to the fact the callback set for this feature was replaced by
quic_tls_compat_msg_callback(). In fact this callback must be registered by
ssl_sock_register_msg_callback() as is done for the TLS client hello capture.
A call to this function appends the function passed as parameter to a list of
callbacks to be called when the TLS stack parses a TLS message.
quic_tls_compat_msg_callback() had to be modified to return if it is called
for a non-QUIC TLS session.

Must be backported to 2.8.
2023-12-21 16:33:06 +01:00
Amaury Denoyelle
235e8f1afd MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS
Previously, if snd_buf operation was conducted despite QCS already
locally closed, the input buffer was silently dropped. This situation
could happen if a RESET_STREAM was emitted but its emission was not reported
to the stream layer. Silently resetting the buffer ensured that QUIC MUX
remained compliant with RFC 9000 which forbids emission after RESET_STREAM.

Since previous commit, it is now ensured that RESET_STREAM sending will
always be reported to stream-layer. Thus, there is no need anymore to
silently reset the buffer. A BUG_ON() statement is added to ensure this
assumption will remain valid.

The new code is deemed cleaner as it does not hide a missing error
notification on the stconn-layer. Previously, if an error was missing,
sending would continue unnecessarily with a false success status
reported for the stream.

Note that the BUG_ON() statement was also added into nego_ff callback.
This is necessary to ensure both sending paths remain consistent.

This patch is labelled as MEDIUM as issues were already encountered in
snd_buf/nego_ff implementation and it's not easy to cover all occurrences
during test. If the BUG_ON() is triggered without any apparent
stream-layer issue, this commit should be reverted.
2023-12-21 15:42:08 +01:00
Aurelien DARRAGON
f6ae25858d MINOR: peers: rely on srv->addr and remove peer->addr
Similarly to the previous commit, we get rid of unused peer member.

peer->addr was only used to save a copy of the server's addr at parsing
time. But instead of relying on an intermediate variable, we can actually
use server's address directly when initiating the peer session.

As with other streams created from server's settings (tcp/http, log, ring),
we should rely on srv->svc_port for the port part of the address. This
shouldn't change anything for peers since the address is fully resolved
at parsing time and runtime changes are not supported, but this should
help to make the code future-proof.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
372d3e2934 CLEANUP: peers: remove unused "proto" and "xprt" struct members
peer->proto and peer->xprt struct members are now pure legacy: they are
only set during parsing but never used afterwards.

This is due to commit 02efedac ("MINOR: peers: now remove the remote
connection setup code") which made some cleanup in the past, but the
unused proto and xprt members were probably left unused by mistake.

Since we don't have valid uses for them, we remove them.

Also, peer_xprt() helper function was removed since it was related to
peer->xprt struct member.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334caefaaa CLEANUP: peers: remove unused sock_init_arg struct member
Since be0688c6 ("MEDIUM: stream_interface: remove the si->init"),
sock_init_arg is completely useless (set but never used later), thus
we remove it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
7293eb68ff MEDIUM: peers: use server as stream target
Historically, we used the internal peer proxy as stream target, because
then we only cared about initiating a basic tcp connection with the
endpoint, and relying on parent proxy settings was enough.

But later, we introduced the possibility to connect to an SSL peer by
taking server's SSL parameters into account. This was done in commit
1055e687 ("MINOR: peers: Make outgoing connection to SSL/TLS peers work.")

However, the above commit introduced an ambiguity:

peer_session_target() function was introduced, and the function will
either return the peers proxy's object or the current server's object
depending if ssl is configured or not.

While this works fine to ensure proper SSL handling while being
conservative with historical behavior, this causes other server transport
related settings to only work when ssl settings are provided, which is
quite debatable.

Indeed, while we're there, why not always use the server's object as
a stream target, to ensure all transport related options are properly
handled? Moreover, the peers documentation tells this:

   ... "support for all "server" parameters found in 5.2 paragraph that
   are related to transport settings" ...

To remove the ambiguity and fully comply with the documentation, we make
peer_session_target() always return the server's object.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334ebfa1a2 MEDIUM: server/dns: clear RMAINT when addr resolves again
snr_update_srv_status() and srvrq_update_srv_status() will both set or
clear the server RMAINT state depending on the result of the current dns
resolution.

This used to work pretty well in the past, but now that addr:svc_port
changes are applied atomically through a dedicated task, the change is
performed asynchronously, so this can cause some flapping issues if the
server is put out of maintenance while the server's address is still
unassigned.

To prevent errors, the resolver's code is now only allowed to put the
server under maintenance but not to remove it from maintenance:

the decision to remove a server from maintenance is performed by the task
responsible for updating the server's addr: if the addr resolves again
thanks to a valid DNS resolution and the server was previously under
RMAINT, then it is cleared from RMAINT state.

srvrq_update_srv_status() was renamed srvrq_set_srv_down(), since it is
only called to put the server in maintenance as a result of a failing
SRV entry.

snr_update_srv_status() was renamed srv_set_srv_down() and slightly
modified so that it only takes care of putting the server under
maintenance when needed.

The cli command "set server x/y addr" does not need to remove the RMAINT
flag anymore.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
72e2c8db3e MINOR: server: add dns hint in server_inetaddr_updater struct
This will allow event consumers to know if the update was triggered by
dns/resolver code by checking the ->dns boolean.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
33cd676e9e MINOR: server/event_hdl: expose updater info through INETADDR event
Thanks to the previous commit, we can now expose updater info through
INETADDR event.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
3ac79b504a MEDIUM: server: make server_set_inetaddr() updater serializable
server_set_inetaddr() updater argument is a simple char * string
containing infos about the caller responsible for the update.

In this patch, we try to make this argument serializable, that is, make
it so that we can easily export it without having to keep the original
pointer passed by the caller or having to work with strings of variable
lengths.

This was a prerequisite for exposing more updater information through
SERVER_INETADDR event (upcoming patch).

Static strings were simply mapped to a fixed ID that can be converted back
to a string when needed using server_inetaddr_updater_by_to_str(). One
special case was made for the SERVER_INETADDR_UPDATER_DNS_RESOLVER updater
since in this case the updater hint has to be generated from the
corresponding resolver id / nameserver id combination. This was achieved
by saving the nameserver id within the updater struct. Knowing that the
resolver id can be guessed from the server struct directly, it was not
exposed through the updater struct.

This patch depends on:
 - "MINOR: resolvers: add unique numeric id to nameservers"

No functional change should be expected.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2f6120d6d4 MINOR: resolvers: add unique numeric id to nameservers
When we want to avoid keeping pointers on a nameserver struct, it's not
always convenient to refer to a nameserver using its text-based unique
identifier since it's not limited in length thus it cannot be serialized
and deserialized safely.

To address this limitation, we add a new ->puid member in dns_nameserver
struct which is a parent-unique numeric value that can be used to refer
to the dns nameserver within its parent resolver context.

To achieve this, we reused the resolver->nb_nameserver member that wasn't
used. Each time we add a new nameserver to a resolver: we set ns->puid to
the current number of nameservers within the resolver and we increment
this number right away.

Public helper function find_nameserver_by_resolvers_and_id() was added to
help retrieve nameserver pointer from (resolver X nameserver puid)
combination.
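
The numbering scheme can be sketched as below. The types and function names
are hypothetical stand-ins for dns_nameserver, resolvers and
find_nameserver_by_resolvers_and_id(), and a fixed array replaces the list
used by the real code. The only point shown is that the puid is the value of
the per-resolver counter at insertion time.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NS 8

/* Hypothetical stand-in for a nameserver entry. */
struct ns_sketch {
    unsigned int puid;  /* parent-unique numeric id */
    const char *id;     /* text-based identifier, unbounded length */
};

/* Hypothetical stand-in for a resolvers section. */
struct resolvers_sketch {
    unsigned int nb_nameservers;
    struct ns_sketch ns[SKETCH_MAX_NS];
};

/* Appends a nameserver: its puid is the current number of nameservers,
 * which is incremented right away. Returns NULL when the array is full.
 */
static inline struct ns_sketch *resolvers_add_ns(struct resolvers_sketch *r,
                                                 const char *id)
{
    struct ns_sketch *ns;

    assert(r != NULL && id != NULL);
    if (r->nb_nameservers >= SKETCH_MAX_NS)
        return NULL;

    ns = &r->ns[r->nb_nameservers];
    ns->id = id;
    ns->puid = r->nb_nameservers++;
    return ns;
}

/* Looks up a nameserver from its (resolvers, puid) combination. */
static inline struct ns_sketch *find_ns_by_puid(struct resolvers_sketch *r,
                                                unsigned int puid)
{
    unsigned int i;

    assert(r != NULL);
    for (i = 0; i < r->nb_nameservers; i++) {
        if (r->ns[i].puid == puid)
            return &r->ns[i];
    }
    return NULL;
}
```

A fixed-size integer like this can be copied into an event payload, unlike a
pointer or a string of arbitrary length.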
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
4fe0cca305 CLEANUP: resolvers: remove duplicate func prototype
dns_dgram_init() function prototype was found in both resolvers and dns
header files, but it should belong to the dns header file, so the
duplicate entry was simply removed.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
ab6fef4882 CLEANUP: server: remove unused server_parse_addr_change_request() function
server_parse_addr_change_request() was completely replaced by the newer
srv_update_addr_port() function. Considering the function doesn't offer
useful features that srv_update_addr_port() couldn't do, we simply
remove the function.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
f1f4b93a67 MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic
Both functions perform similar tasks, except that the _port()
version is doing a bit more work.

In this patch, we add the server_set_inetaddr() function that works like
the srv_update_addr_port() but it takes parsed inputs instead of raw
strings as arguments.

Then, server_set_inetaddr() is used as underlying helper function for
both srv_update_addr() and srv_update_addr_port() to make them easier
to maintain.

Also, helper functions were added:
 - server_set_inetaddr_warn() -> same as server_set_inetaddr() but report
   a warning on updates.
 - server_get_inetaddr() -> fills a struct server_inetaddr from srv

Since the feedback message generation part was slightly reworked, some
minor changes in the way addr:svc_port updates are reported in the logs
or cli messages should be expected (no loss of information though).
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2d0c7f5935 CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event
Now that the purge_conn hint is ignored thanks to the previous commit,
we can simply get rid of it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
545e72546c BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
server addr:svc_port updates during runtime might set or clear the
SRV_F_MAPPORTS flag. Unfortunately, the flag update is still directly
performed by srv_update_addr_port() function while the addr:svc_port
update is being scheduled for atomic update. Given that existing readers
don't take server's lock to read addr:svc_port, they also check the
SRV_F_MAPPORTS flag right after without the lock.

So we could cause the readers to incorrectly interpret the svc_port from
the server struct because the mapport information is not published
atomically, resulting in inconsistencies between svc_port / mapport flag.
(MAPPORTS flag causes svc_port to be used differently by the reader)

To fix this, we publish the mapport information within the INETADDR server
event and we let the task responsible for updating server's addr and port
set or clear the flag depending on the mapport hint.

This patch depends on:
 - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
 - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype

This should be backported in 2.9 with 683b2ae01 ("MINOR: server/event_hdl:
add SERVER_INETADDR event")
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
14893a6a00 MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
event_hdl_cb_data_server_inetaddr struct had some anonymous structs
defined in it, making it impossible to pass as a function argument and
harder to maintain since changes must be performed at multiple places
at once. So instead we define a storage struct named server_inetaddr
that helps to save addr:port server information in INET context.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
835263047e OPTIM: server: ebtree lookups for findserver_unique_* functions
4e5e2664 ("MINOR: proxy: add findserver_unique_id() and findserver_unique_name()")
added findserver_unique_id() and findserver_unique_name() functions that
were inspired from the historical findserver() function, so unfortunately
they don't perform well when used on large backend farms because they scan
the whole server list linearly.

I was about to provide a patch to optimize such functions when I stumbled
on Baptiste's work:
  19a106d24 ("MINOR: server: server_find functions: id, name, best_match")

It turns out Baptiste already implemented helper functions to supersede
the unoptimized findserver() function (at least at runtime when servers
have been assigned their final IDs and inserted in the lookup trees): they
offer more matching options and rely on eb lookups so they are much more
suitable for fast queries. I don't know how I missed that, but they are a
perfect base for the server rid matching functions.

So in this patch, we essentially revert 4e5e2664 to provide the optimized
equivalent functions named server_find_by_id_unique() and
server_find_by_name_unique(), then we force existing findserver_unique_*()
callers to switch to the new functions.

This patch depends on:
 - "OPTIM: server: eb lookup for server_find_by_name()"

This could be backported up to 2.8.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
8a6cc6e3ea MEDIUM: proxy: set PR_O_HTTP_UPG on implicit upgrades
When a TCP frontend uses an HTTP backend, the stream is automatically
upgraded and it results in a similar behavior as if a switch-mode http
rule was evaluated since stream_set_http_mode() gets called in both
situations and minimal HTTP analyzers are set.

In the current implementation, some postparsing checks are generating
errors or warnings when the frontend is in TCP mode with some HTTP options
set and no upgrade is expected (no switch-rule http). But as you can guess,
unfortunately this leads to issues when such "HTTP" only options are used
in a frontend that has implicit switching rules (that is, when the
frontend uses an HTTP backend for example), because in this case the
PR_O_HTTP_UPG will not be set, so the postparsing checks will consider
that some options are not relevant and will raise some warnings.

Consider the following example:

  backend back
    mode http
    server s1 git.haproxy.org:80
  frontend front
    mode tcp
    bind localhost:8080
    http-request set-var(txn.test) str(TRUE),debug(WORKING,stderr)
    use_backend back

By starting an haproxy instance with the above example conf, we end up
having this warning:

  [WARNING]  (400280) : config : 'http-request' rules ignored for frontend 'front' as they require HTTP mode.

However, by making a request on the frontend, we notice that the request
rules are still executed, and that's because the stream is effectively
upgraded as a result of an implicit upgrade:

  [debug] WORKING: type=str <TRUE>

So this confirms the previous description: since implicit and explicit
upgrades result in approximately the same behavior on the frontend side,
we should consider them both when doing postparsing checks.

This is what we try to address in the following commit: PR_O_HTTP_UPG
flag is now more generic in the sense that it refers to either implicit
(through default_backend or use_backend rules) or explicit (switch-mode
rules) upgrades. Indeed, every time an HTTP or dynamic backend (where the
mode cannot be assumed during parsing) is encountered in default_backend
directive or use_backend rules, we explicitly position the upgrade flag
so that further checks that depend on the proxy being in HTTP context
don't report false warnings.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
ef9d692544 MINOR: stats: store the parent proxy in stats ctx (http)
Some HTTP related stats functions need to know the parent proxy, mainly
to get a pointer on the related uri_auth set by the proxy or to check
scope settings.

The current design (probably historical as only the http context existed
by then) took the other approach: it propagates the uri pointer from the
http context deep down the calling stack up to the relevant functions.
For non-http contexts (cli), the pointer is set to NULL.

Doing so is not very pretty and not easy to maintain. Moreover, there were
still some places in the code where the uri pointer was learned directly
from the stream proxy because it was not available as an argument
of those functions. This is error-prone, because if one day we decide to
change the source proxy in the parent function, we might still have some
functions down the stack that ignore the topmost argument and still look
it up on their own, and we'll probably end up with inconsistencies.

So in this patch, we take a safer approach: the caller responsible for
creating the stats applet should set the http_px pointer so that any stats
function running under the applet that needs to know if it's running in
http context or needs to access parent proxy info may do so thanks to
the dedicated ctx->http_px pointer.
2023-12-21 14:20:03 +01:00
Christopher Faulet
123a9e7d83 BUG/MAJOR: stconn: Disable zero-copy forwarding if consumer is shut or in error
A regression was introduced by commit 2421c6fa7d ("BUG/MEDIUM: stconn: Block
zero-copy forwarding if EOS/ERROR on consumer side"). When zero-copy
forwarding is in use and the consumer side is shut or in error, we declare it
as blocked and it is woken up. The idea is to handle this state at the
stream-connector level. However this definitely blocks receives on the
producer side. So if the mux is unable to close by itself, but instead waits
for the peer to shut, this can lead to a wake up loop. And indeed, with the
passthrough multiplexer this may happen.

To fix the issue and prevent any loop, instead of blocking the zero-copy
forwarding, we now disable it. This way, the stream-connector on the producer
side will fall back on classical receives and will be able to handle peer
shutdown properly. In addition, the wakeup of the consumer side was
removed. This will be handled, if necessary, by sc_notify().

This patch should fix the issue #2395. It must be backported to 2.9.
2023-12-21 11:00:57 +01:00
Christopher Faulet
2421c6fa7d BUG/MEDIUM: stconn: Block zero-copy forwarding if EOS/ERROR on consumer side
When the producer side (h1 for now) negotiates with the consumer side to
perform a zero-copy forwarding, we now consider the consumer side as blocked
if it is closed and this was reported to the SE via an end-of-stream or a
(pending) error.

It is performed before calling ->nego_ff callback function, in se_nego_ff().
This way, all consumers are concerned automatically. The aim of this patch is
to fix an issue with the QUIC mux. Indeed, it is unexpected to send a frame
on a closed stream. This triggers a BUG_ON(). Other muxes are not affected
but it remains useless to try to send data if the stream is closed.

This patch should fix the issue #2372. It must be backported to 2.9.
2023-12-13 16:45:29 +01:00
Amaury Denoyelle
e772d3f40f CLEANUP: mux-quic: clean up app ops callback definitions
qcc_app_ops is a set of callbacks used to unify application protocol
running over QUIC. This commit introduces some changes to clarify its
API :
* write simple comment to reflect each callback purpose
* rename decode_qcs to rcv_buf as this name is more common and is
  similar to already existing snd_buf
* finalize is moved up as it is used during connection init stage

All these changes are ported to HTTP/3 layer. Also function comments
have been extended to highlight HTTP/3 special characteristics.
2023-12-11 16:15:13 +01:00
Amaury Denoyelle
f496c7469b MINOR: mux-quic: clean up qcs Tx buffer allocation API
This function is similar to the previous one, but this time for QCS
sending buffer.

Previously, each application layer redefined its own version of
mux_get_buf() which was used to allocate <qcs.tx.buf>. Unify it under a
single function renamed qcc_get_stream_txbuf().
2023-12-11 16:08:51 +01:00
Amaury Denoyelle
b526ffbfb9 MINOR: mux-quic: clean up qcs Rx buffer allocation API
Replace the qcs_get_buf() function, whose name does not reflect its
purpose. Add a new function qcc_get_stream_rxbuf() which allocates
<qcs.rx.app_buf> if needed and returns the buffer pointer. This function is
reserved for the application protocol layer. This buffer is then accessed by
the stconn layer.

For other qcs_get_buf() invocations which were in effect used for a local
buffer, replace these by a plain b_alloc().
2023-12-11 16:02:30 +01:00
Amaury Denoyelle
14d968f2f2 CLEANUP: mux-quic: remove unused prototype
Remove qcc_emit_cc_app() prototype from header file. This function was
removed by a previous commit and does not exist anymore.
2023-12-11 15:12:57 +01:00
William Lallemand
1c1bb8ef2a BUG/MINOR: mworker/cli: fix set severity-output support
"set severity-output" is one of those commands that change the appctx
state so the next commands are affected.

Unfortunately the master CLI works with pipelining and server close
mode, which means the connection between the master and the worker is
closed after each response, so for the next command this is a new appctx
state.

To fix the problem, 2 new flags are added ACCESS_MCLI_SEVERITY_STR and
ACCESS_MCLI_SEVERITY_NB which are used to prefix each command sent to
the worker with the right "set severity-output" command.

This patch fixes issue #2350.

It could be backported as far as 2.6.
2023-12-07 17:37:23 +01:00
Christopher Faulet
67c03508d6 MEDIUM: pattern: Add support for virtual and optional files for patterns
Before this patch, it was not possible to use a list of patterns, map or a
list of acls, without an existing file.  However, it could be handy to just
use an ID, with no file on the disk. It is pretty useful for everyone
managing dynamically these lists. It could also be handy to try to load a
list from a file if it exists without failing if not. This way, it could be
possible to make a cold start without any file (instead of empty file),
dynamically add and del patterns, dump the list to the file periodically to
reuse it on reload (via an external process).

In this patch, we use some prefixes to be able to use virtual or optional
files.

The default case remains unchanged. Regular files are used. A filename, with
no prefix, is used as reference, and it must exist on the disk. With the
prefix "file@", the same is performed. Internally this prefix is
skipped. Thus the same file, with or without "file@" prefix, references the
same list of patterns.

To use a virtual map, "virt@" prefix must be used. No file is read, even if
the following name looks like a file. It is just an ID. The prefix is part
of ID and must always be used.

To use an optional file, ie a file that may or may not exist on a disk at
startup, "opt@" prefix must be used. If the file exists, its content is
loaded. But HAProxy doesn't complain if not. The prefix is not part of
ID. For a given file, optional files and regular files reference the same
list of patterns.
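
The prefix rules above can be sketched as follows. This is only an
illustration of the described behavior, not the code from the patch: the
enum, its values and pat_ref_classify() are hypothetical names chosen for
the example.

```c
#include <string.h>

/* Hypothetical classification of a pattern reference (illustrative names). */
enum pat_ref_kind {
	PAT_REF_FILE, /* regular file: must exist on disk */
	PAT_REF_VIRT, /* virtual: pure ID, nothing is read from disk */
	PAT_REF_OPT,  /* optional file: loaded only if it exists */
};

/* Classify <name> and set <*id> to the identifier referencing the list.
 * "file@" and "opt@" are skipped, so they share their ID with the bare
 * filename. "virt@" is kept as part of the ID.
 */
static enum pat_ref_kind pat_ref_classify(const char *name, const char **id)
{
	if (strncmp(name, "virt@", 5) == 0) {
		*id = name;
		return PAT_REF_VIRT;
	}
	if (strncmp(name, "opt@", 4) == 0) {
		*id = name + 4;
		return PAT_REF_OPT;
	}
	if (strncmp(name, "file@", 5) == 0)
		name += 5;
	*id = name;
	return PAT_REF_FILE;
}
```

With this sketch, "file@/etc/a.map", "opt@/etc/a.map" and "/etc/a.map" all
yield the ID "/etc/a.map", while "virt@mylist" keeps its prefix.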

This patch should fix the issue #2202.
2023-12-06 10:24:41 +01:00
Christopher Faulet
533121a56e MINOR: cache: Add global option to enable/disable zero-copy forwarding
tune.cache.zero-copy-forwarding parameter can now be used to enable or
disable the zero-copy fast-forwarding for the cache applet only. It is
enabled ('on') by default. It can be disabled by setting the parameter to
'off'.
2023-12-06 10:24:41 +01:00
Christopher Faulet
a40321eb3b MINOR: channel: Use dedicated functions to deal with STREAMER flags
For now, CF_STREAMER and CF_STREAMER_FAST flags are set in sc_conn_recv()
function. The logic is moved in dedicated functions.

First, channel_check_idletimer() function is now responsible to check the
channel's last read date against the idle timer value to be sure the
producer is still streaming data. Otherwise, it removes STREAMER flags.

Then, channel_check_xfer() function is responsible to check the amount of data
transferred with a receive, to eventually update STREAMER flags.

In sc_conn_recv(), we now use these functions.
2023-12-06 10:24:41 +01:00
Willy Tarreau
eb67d63456 [RELEASE] Released version 3.0-dev0
Released version 3.0-dev0 with the following main changes :
    - exact copy of 2.9.0
2023-12-05 16:19:35 +01:00
Christopher Faulet
7732323cf3 MINOR: global: Use a dedicated bitfield to customize zero-copy fast-forwarding
Zero-copy fast-forwarding feature is quite new and is a bit sensitive.
There is an option to disable it globally. However, not all protocols have
the same maturity. For instance, for the PT multiplexer, there is nothing
really new. The zero-copy fast-forwarding is only another name for the kernel
splicing. However, for the QUIC/H3, it is pretty new, not really optimized
and it will evolve. And soon, the support will be added for the cache
applet.

In this context, it is useful to be able to enable/disable zero-copy
fast-forwarding per-protocol and applet. And when it is applicable, on sends
or receives separately. So, instead of having one flag to disable it
globally, there is now a dedicated bitfield, global.tune.no_zero_copy_fwd.
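
As a rough sketch of the idea, a per-protocol bitfield could be tested as
shown below. The NOZC_* flag names and zero_copy_allowed() are hypothetical
and only stand in for whatever bits global.tune.no_zero_copy_fwd really
holds.

```c
/* Hypothetical flags: one bit per protocol/applet and direction. */
#define NOZC_ALL      0x0001u /* zero-copy forwarding disabled everywhere */
#define NOZC_PT       0x0002u /* disabled for the PT mux */
#define NOZC_H1_RCV   0x0004u /* disabled for H1 receives */
#define NOZC_H1_SND   0x0008u /* disabled for H1 sends */
#define NOZC_QUIC_SND 0x0010u /* disabled for QUIC sends */

/* Return non-zero if zero-copy forwarding may be used for <what>, given
 * the <flags> bitfield. The global bit always wins.
 */
static int zero_copy_allowed(unsigned int flags, unsigned int what)
{
	return !(flags & (NOZC_ALL | what));
}
```

A single global flag cannot express "off for QUIC sends, on for H1", which
is what the dedicated bits make possible.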
2023-12-04 15:31:47 +01:00
Aurelien DARRAGON
c2cd6a419c BUG/MINOR: server/event_hdl: properly handle AF_UNSPEC for INETADDR event
It is possible that a server's addr family is temporarily set to AF_UNSPEC
even if we're certain to be in INET context (ipv4, ipv6).

Indeed, as soon as IP address resolving is involved, srv->addr family will
be set to AF_UNSPEC when the resolution fails (could happen at anytime).

However, _srv_event_hdl_prepare_inetaddr() wrongly assumed that it would
only be called with AF_INET or AF_INET6 families. Because of that, the
function will handle AF_UNSPEC address as an IPV6 address: not only
we could risk reading from an uninitialized area, but we would then
propagate false information when publishing the event.

In this patch we make sure to properly handle the AF_UNSPEC family in
both the "prev" and the "next" part for SERVER_INETADDR event and that
all members are explicitly initialized.

This bug was introduced by 6fde37e046 ("MINOR: server/event_hdl: add
SERVER_INETADDR event"), no backport needed.
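
The kind of handling described above could look like the following sketch.
struct addr_snapshot and addr_snapshot_fill() are hypothetical names used
for illustration, they are not the actual event data or helper.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical snapshot of a server address for event publishing. */
struct addr_snapshot {
	int family; /* AF_INET, AF_INET6 or AF_UNSPEC */
	union {
		struct in_addr v4;
		struct in6_addr v6;
	} addr;
};

/* Fill <dst> from <src>. Any family other than AF_INET/AF_INET6 is
 * reported as AF_UNSPEC and no address bytes are read from <src>.
 */
static void addr_snapshot_fill(struct addr_snapshot *dst,
                               const struct sockaddr_storage *src)
{
	memset(dst, 0, sizeof(*dst)); /* every member starts initialized */

	switch (src->ss_family) {
	case AF_INET:
		dst->family = AF_INET;
		dst->addr.v4 = ((const struct sockaddr_in *)src)->sin_addr;
		break;
	case AF_INET6:
		dst->family = AF_INET6;
		dst->addr.v6 = ((const struct sockaddr_in6 *)src)->sin6_addr;
		break;
	default:
		dst->family = AF_UNSPEC;
		break;
	}
}
```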
2023-12-01 20:43:42 +01:00
Amaury Denoyelle
0ce213d246 MINOR: quic_tp: use in_addr/in6_addr for preferred_address
preferred_address is a transport parameter specified by the server. It
specifies both an IPv4 and IPv6 address. These addresses were defined as
plain arrays in <struct tp_preferred_address>.

Convert these addresses to use the common types in_addr/in6_addr. With
this change, dumping of preferred_address is extended. It now displays
the addresses using inet_ntop() and CID value.
2023-11-30 15:59:45 +01:00
Amaury Denoyelle
f31719edae CLEANUP: quic_cid: remove unused listener arg
retrieve_qc_conn_from_cid() requires listener as argument whereas it is
unused. This is an artifact from the old architecture where CID trees
were stored on listener instances instead of globally. Remove it to
better reflect this change.
2023-11-30 15:04:27 +01:00
Christopher Faulet
0f15dcd9a7 MINOR: muxes: Add a callback function to send commands to mux streams
Just like the ->ctl() callback function, used to send commands to mux
connections, the ->sctl() callback function can now be used to send commands
to mux streams. The first command, MUX_SCTL_SID, is a way to request the mux
stream ID.

It will be implemented later for each mux.
2023-11-29 11:11:12 +01:00
Christopher Faulet
d982a37e4c MINOR: muxes: Rename mux_ctl_type values to use MUX_CTL_ prefix
Instead of the generic MUX_, we now use MUX_CTL_ prefix for all mux_ctl_type
value. This will avoid any ambiguities with other enums, especially with a
new one that will be added to get information on mux streams.
2023-11-29 11:11:12 +01:00
Christopher Faulet
8f56552862 MINOR: stream: Expose session terminate state via a new sample fetch
It is now possible to retrieve the session terminate state, using
"txn.sess_term_state". The sample fetch returns the 2-character session
termination state.

Of course, the result of this sample fetch is volatile. It is subject to
change. It is also useless most of the time because no termination state is set
except at the end. It should only be useful in http-after-response rule
sets. It may also be used to customize the logs using a log-format
directive.

This patch should fix the issue #2221.
2023-11-29 11:11:12 +01:00
Christopher Faulet
b2f82b2b51 MINOR: http-fetch: Add a sample to retrieve the server status code
The code returned by the "status" sample fetch is the one in the HTTP
response at the moment the sample is evaluated. It may be the status code in
the server response or the one of the HAProxy reply in case of error, deny,
redirect...

However, it could be handy to retrieve the status code returned by the
server, when a HTTP response was really received from it. It is the purpose
of the "server_status" sample fetch. The server status code itself is stored
in the HTTP txn.
2023-11-29 11:11:12 +01:00
Aurelien DARRAGON
2f2cb6d082 MEDIUM: log/balance: support FQDN for UDP log servers
In previous log backend implementation, we created a pseudo log target
for each declared log server, and we made the log target's address point
to the actual server address to save some time and prevent unnecessary
copies.

But this was done without knowing that when FQDN is involved (more broadly
when dns/resolution is involved), the "port" part of server addr should
not be relied upon, and we should explicitly use ->svc_port for that
purpose.

With that in mind and thanks to the previous commit, some changes were
required: we allocate a dedicated addr within the log target when target
is in DGRAM mode. The addr is first initialized with known values and it
is then updated automatically by _srv_set_inetaddr() during runtime.
(the change is atomic so readers don't need to worry about it)

addr from server "log target" (INET/DGRAM mode) is made of the combination
of server's address (lacking the port part) and server's svc_port.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
cb3ec978fd MINOR: event_hdl: add global tunables
The local variable "event_hdl_async_max_notif_at_once" which was
introduced with the event_hdl API was left as is but with a TODO note
telling that we should make it a global tunable.

Well, we're doing this now. To prepare for upcoming tunables related to
event_hdl API, we add a dedicated struct named event_hdl_tune which is
globally exposed through the event_hdl header file so that it may be used
from everywhere. The struct is automatically initialized in
event_hdl_init() according to defaults.h.

"event_hdl_async_max_notif_at_once" now becomes
"event_hdl_tune.max_events_at_once" with its dedicated
configuration keyword: "tune.events.max-events-at-once".

We're also taking this opportunity to raise the default value from 10
to 100 since it seems quite reasonable given existing async event_hdl
users.

The documentation was updated accordingly.
2023-11-29 08:59:27 +01:00
William Lallemand
08f1e2bea2 MINOR: mworker/cli: implements the customized payload pattern for master CLI
Implements the customized payload pattern for the master CLI.

The pattern is stored in the stream in char pcli_payload_pat[8].

The principle is basically the same as the CLI one, it looks for '<<'
then stores what's between '<<' and '\n', and looks for it to exit the
payload mode.
2023-11-28 19:13:49 +01:00
William Lallemand
e3557c7d45 MEDIUM: cli: allow custom pattern for payload
The CLI payload syntax has some limitations: it can't handle payloads
with empty lines, which is a common problem when uploading a PEM file
over the CLI.

This patch implements a way to customize the ending pattern of the CLI,
so we can look for things other than empty lines.

A char cli_payload_pat[8] is used in the appctx to store the customized
pattern. The pattern can't be more than 7 characters and can still be empty
to match an empty line.

The cli_io_handler() identifies the pattern and stores it, and
cli_parse_request() identifies the end of the payload.

If the customized pattern between "<<" and "\n" is more than 7
characters, it is not considered as a pattern.

This patch only implements the parser for the 'stats socket', another
patch is needed for the 'master CLI'.
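
A simplified sketch of the pattern handling described above is shown below.
It is not the code of cli_io_handler() or cli_parse_request(): the two
helpers are hypothetical, they work on lines already stripped of their
trailing '\n', and they take the first "<<" found on the line.

```c
#include <string.h>

#define PAYLOAD_PAT_MAX 7 /* the buffer is char[8], NUL included */

/* Look for "<<" in <line> and copy what follows into <pat> if it fits in
 * 7 characters (an empty pattern is valid). Return 1 when a pattern was
 * stored, 0 when there is no "<<" or the candidate is too long.
 */
static int payload_pat_extract(const char *line, char pat[PAYLOAD_PAT_MAX + 1])
{
	const char *mark = strstr(line, "<<");
	size_t len;

	if (!mark)
		return 0;
	mark += 2;
	len = strlen(mark);
	if (len > PAYLOAD_PAT_MAX)
		return 0;
	memcpy(pat, mark, len + 1); /* includes the trailing NUL */
	return 1;
}

/* Return 1 if <line> ends the payload, i.e. it equals the stored pattern.
 * With an empty pattern, this matches an empty line as before.
 */
static int payload_is_end(const char *line, const char *pat)
{
	return strcmp(line, pat) == 0;
}
```

For instance a command ending with "<<EOF" stores "EOF", so empty lines
inside a PEM payload no longer terminate it, only a line made of "EOF" does.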
2023-11-28 19:12:32 +01:00
Frédéric Lécaille
ad61a5dde3 REORG: quic: Move quic_increment_curr_handshake() to quic_sock
Move quic_increment_curr_handshake() from quic_conn.c to quic_sock.h to be inlined.
Also move all the inlined functions at the end of this header.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
95e9033fd2 REORG: quic: Add a new module for retransmissions
Move several functions in relation with the retransmissions from TX part
(quic_tx.c) to quic_retransmit.c new C file.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
714d1096bc REORG: quic: Move qc_notify_send() to quic_conn
Move qc_notify_send() from quic_tx.c to quic_conn.c. Note that it was already
exported from both quic_conn.h and quic_tx.h. Modify this latter header
to fix the duplication.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
b39362070d BUILD: quic: Several compiler warns fixes after retry module creation
Such a warning appeared after having added quic_retry.h which includes only
headers for types (quic_cid-t.h, clock-t.h...)

In file included from include/haproxy/quic_retry.h:12,
                 from src/quic_retry.c:5:
include/haproxy/quic_cid-t.h:26:26: error: field ‘seq_num’ has incomplete type
   26 |         struct eb64_node seq_num;
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
b5970967ca REORG: quic: Add a new module for QUIC retry
Add quic_retry.c new C file for the QUIC retry feature:
   quic_saddr_cpy() moved from quic_tx.c,
   quic_generate_retry_token_aad() moved from
   quic_generate_retry_token() moved from
   parse_retry_token() moved from
   quic_retry_token_check() moved from
   quic_retry_token_check() moved from
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
43fbea0f38 REORG: quic: Move ncbuf related function from quic_rx to quic_conn
Move quic_get_ncbuf() and quic_free_ncbuf() from quic_rx.c to quic_conn.h
as static inlined functions.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
e0d3eb496b REORG: quic: Move NEW_CONNECTION_ID frame builder to quic_cid
Move qc_build_new_connection_id_frm() from quic_conn.c to quic_cid.c.
Also move quic_connection_id_to_frm_cpy() from quic_conn.h to quic_cid.h.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
795d1a57bf REORG: quic: Rename some (quic|qc)_conn* objects to quic_conn_closed
These objects could be confused with the ones defined by the congestion control
part (quic_cc.c).
2023-11-28 15:47:16 +01:00
Frédéric Lécaille
d7a5fa24dc REORG: quic: Move qc_pkt_long() to quic_rx.h
This inlined function takes a quic_rx_packet struct as its unique argument.
Let's move it to QUIC RX part.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
0b872e24cd REORG: quic: Move qc_may_probe_ipktns() to quic_tls.h
This function is in relation with the Initial packet number space which is
more linked to the QUIC TLS specifications. Let's move it to quic_tls.h
to be inlined.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
c93ebcc59b REORG: quic: Move quic_build_post_handshake_frames() to quic_conn module
Move quic_build_post_handshake_frames() from quic_rx.c to quic_conn.c. This
is a function which is also called from the TX part (quic_tx.c).
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
3482455ddd REORG: quic: Move qc_handle_conn_migration() to quic_conn.c
This function manipulates only quic_conn objects. It definitely belongs
in quic_conn.c.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
581549851c REORG: quic: Move QUIC path definitions/declarations to quic_cc module
Move quic_path struct from quic_conn-t.h to quic_cc-t.h and rename it to quic_cc_path.
Update the code accordingly.
Also move some inlined functions related to the QUIC path to quic_cc.h.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f32fc26b62 REORG: quic: Rename some functions used upon ACK receipt
Rename some functions to better reflect their jobs.
Move qc_release_lost_pkts() to quic_loss.c
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f74d882ef0 REORG: quic: Move the QUIC DCID parser to quic_sock.c
Move quic_get_dgram_dcid() from quic_conn.c to quic_sock.c because
it is only used in this file, and define it as static.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
09ab48472c REORG: quic: Move several inlined functions from quic_conn.h
Move quic_pkt_type(), quic_saddr_cpy(), quic_write_uint32(), max_available_room(),
max_stream_data_size(), quic_packet_number_length(), quic_packet_number_encode()
and quic_compute_ack_delay_us() to quic_tx.c because only used in this file.
Also move quic_ack_delay_ms() and quic_read_uint32() to quic_tx.c because they
are used only in this file.

Move quic_rx_packet_refinc() and quic_rx_packet_refdec() to quic_rx.h header.
Move qc_el_rx_pkts(), qc_el_rx_pkts_del() and qc_list_qel_rx_pkts() to quic_tls.h
header.
2023-11-28 15:37:47 +01:00
Frédéric Lécaille
831764641f REORG: quic: Move QUIC CRYPTO stream definitions/declarations to QUIC TLS
Move quic_cstream struct definition from quic_conn-t.h to quic_tls-t.h.
Its pool is also moved from quic_conn module to quic_tls. Same thing for
quic_cstream_new() and quic_cstream_free().
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
ae885b9b68 REORG: quic: Move CRYPTO data buffer defintions to QUIC TLS module
Move quic_crypto_buf struct definition from quic_conn-t.h to quic_tls-t.h.
Also move its pool definition/declaration to quic_tls-t.h/quic_tls.c.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
5f9bd6bbce BUILD: quic: Missing RX header inclusions
Fix such building issues:
   In file included from src/quic_tx.c:15:
        include/haproxy/quic_tx.h:51:23: warning: ‘struct quic_rx_packet’

Do not know why the compiler warns about such missing header inclusions
just now. It should have complained a long time ago during the big QUIC
source code split.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
f949f7df83 REORG: quic: QUIC connection types header cleaning
Move UDP datagram definitions from quic_conn-t.h to quic_sock-t.h
Move debug quic_rx_crypto_frm struct from quic_conn-t.h to quic_trace-t.h
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
0fc0d45745 REORG: quic: Add a new module to handle QUIC connection IDs
Move quic_cid and quic_connection_id from quic_conn-t.h to new quic_cid-t.h header.
Move definitions of quic_stateless_reset_token_init(), quic_derive_cid(),
new_quic_cid(), quic_get_cid_tid() and retrieve_qc_conn_from_cid() to quic_cid.c
new C file.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
21615d4376 CLEANUP: quic: Remove dead definitions/declarations
Remove useless definitions and declarations.
2023-11-28 15:37:22 +01:00
Christopher Faulet
2a307d273a BUG/MEDIUM: stconn: Don't perform zero-copy FF if opposite SC is blocked
When zero-copy data fast-forwarding is in use, if the opposite SC is blocked,
there is no reason to try to fast-forward more data. Worse, in some cases,
this can lead to a receive loop of the producer side while the consumer side
is blocked.

No backport needed.
2023-11-28 14:01:56 +01:00
Amaury Denoyelle
e97489a526 MINOR: trace: support -dt optional format
Add an optional argument for "-dt". This argument is interpreted as a
list of several trace statements separated by commas. For each statement,
a specific trace name can be specified, or none to act on all sources.
Using double-colon separator, it is possible to add specifications on
the wanted level and verbosity.
2023-11-27 17:15:14 +01:00
Amaury Denoyelle
cef29d3708 MINOR: trace: define simple -dt argument
Add '-dt' haproxy process argument. This will automatically activate all
trace sources on stderr with the error level. This could be useful to
troubleshoot issues such as protocol violations.
2023-11-27 17:10:18 +01:00
Willy Tarreau
3ac9912837 OPTIM: pattern: save memory and time using ebst instead of ebis
In the pat_ref_elt struct, the pattern string is stored outside of the
node element, using a pointer to an strdup(). Not only this needlessly
wastes at least 16-24 bytes per entry (8 for the pointer, 8-16 for the
allocator), it also makes the tree descent less efficient since both
the node and the string have to be visited for each layer (hence at least
two cache lines). Let's use an ebmb storage and place the pattern right
at the end of the pat_ref_elt, making it a variable-sized element instead.

The set-map test below jumps from 173 to 182 kreq/s/core, and the memory
usage drops from 356 MB to 324 MB:

  http-request set-map(/dev/null) %[rand(1000000)] 1

This is even more visible with large maps: after loading 16M IP addresses
into a map, the process uses this amount of memory:

  - 3.15 GB with haproxy-2.8
  - 4.21 GB with haproxy-2.9-dev11
  - 3.68 GB with this patch

So that's a net saving of 32 bytes per entry here, which cuts in half the
extra cost of the tree, and loading a large map takes about 20% less time.
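
The layout change can be illustrated with a C99 flexible array member. The
structs below are deliberately simplified stand-ins (a void pointer replaces
the tree node) and the names are hypothetical, this is not the real
pat_ref_elt definition.

```c
#include <stdlib.h>
#include <string.h>

/* Before: the key lives in its own allocation, reached through a pointer. */
struct elt_indirect {
	void *node;    /* stand-in for the tree node */
	char *pattern; /* strdup()'ed key: one more pointer and allocation */
};

/* After: the key is stored inline, right at the end of the element. */
struct elt_inline {
	void *node;     /* stand-in for the tree node */
	char pattern[]; /* flexible array member, NUL-terminated key */
};

/* Allocate a variable-sized element holding a copy of <pattern>.
 * Return NULL on allocation failure. The caller releases it with free().
 */
static struct elt_inline *elt_inline_new(const char *pattern)
{
	size_t len = strlen(pattern) + 1;
	struct elt_inline *elt = malloc(sizeof(*elt) + len);

	if (!elt)
		return NULL;
	elt->node = NULL;
	memcpy(elt->pattern, pattern, len);
	return elt;
}
```

A single allocation then holds both the node and the key, so a lookup
touches one memory area per visited element instead of two.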
2023-11-27 11:25:07 +01:00
Willy Tarreau
fc800b6cb7 MINOR: task/profiling: do not record task_drop_running() as a caller
task_drop_running() is used to remove the RUNNING bit and check if
while the task was running it got a new wakeup from itself. Thus
each time task_drop_running() marks itself as a caller, it in fact
removes the previous caller that woke up the task, such as below:

Tasks activity over 10.439 sec till 0.000 sec ago:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet            57895273   6.396m    6.628us   2.733h    170.0us <- run_tasks_from_lists@src/task.c:658 task_drop_running

Better not mark this function as a caller and keep the original one:

Tasks activity over 13.834 sec till 0.000 sec ago:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet            62424582   5.825m    5.599us   5.717h    329.7us <- sc_app_chk_rcv_applet@src/stconn.c:952 appctx_wakeup
2023-11-27 11:24:52 +01:00
William Lallemand
3dd55fa132 MINOR: mworker/cli: implement hard-reload over the master CLI
The mworker mode never had a proper 'hard-stop' (-st) for the reload,
this is a mode which was commonly used with the daemon mode, but it was
never implemented in mworker mode.

This patch fixes the problem by implementing a "hard-reload" command
over the master CLI. It does the same as the "reload" command, but
instead of waiting for the connections to stop in the previous process,
it immediately quits the previous process after binding.
2023-11-24 21:44:25 +01:00
Aurelien DARRAGON
f2629ebd4e MINOR: proxy: add free_server_rules() helper function
Take the px->server_rules freeing part out of free_proxy() and make it
a dedicated helper function so that it becomes possible to use it from
anywhere.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
24da4d3ee7 MINOR: tools: use const for read only pointers in ip{cmp,cpy}
In this patch we fix the prototype for ipcmp() and ipcpy() functions so
that input pointers that are used exclusively for reads are used as const
pointers. This way, the compiler can safely assume that those variables
won't be altered by the function.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
683b2ae013 MINOR: server/event_hdl: add SERVER_INETADDR event
In this patch we add the support for a new SERVER event in the
event_hdl API.

SERVER_INETADDR is implemented as an advanced server event.
It is published each time the server's ip address or port is
about to change. (ie: from the cli, dns, lua...)

SERVER_INETADDR data is an event_hdl_cb_data_server_inetaddr struct
that provides additional info related to the server inet addr change,
but can be cast to a regular event_hdl_cb_data_server struct if
additional info is not needed.
2023-11-24 16:27:55 +01:00
Christopher Faulet
8d46a2c973 MAJOR: h3: Implement zero-copy support to send DATA frame
When possible, we try to send DATA frames without copying data. To do so, we
swap the input buffer with the QCS tx buffer. It is only possible if:

 * There is only one HTX block of data at the beginning of the message
 * Amount of data to send is equal to the size of the HTX data block
 * The QCS tx buffer is empty

In this case, both buffers are swapped. The frame metadata are written at
the beginning of the buffer, before data and where the HTX structure is
stored.
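
For illustration only, the three conditions listed above can be read as a
single predicate. This is a minimal sketch: the ff_view_sketch struct, its
fields and can_swap_buffers_sketch() are hypothetical names that do not
exist in the tree, and the real check works on HTX and QCS structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of what is checked before swapping buffers. */
struct ff_view_sketch {
	bool   first_blk_is_data; /* the message starts with a single DATA block */
	size_t first_blk_len;     /* size of that HTX data block */
	size_t to_send;           /* amount of data to send */
	size_t tx_buf_data;       /* bytes already pending in the QCS tx buffer */
};

/* Returns true only when the three conditions for a buffer swap are met. */
static bool can_swap_buffers_sketch(const struct ff_view_sketch *v)
{
	assert(v != NULL);
	return v->first_blk_is_data &&
	       v->to_send == v->first_blk_len &&
	       v->tx_buf_data == 0;
}
```

When any of the conditions is not met, the regular copy path is expected to
be used instead.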
2023-11-24 07:42:43 +01:00
Christopher Faulet
1bcc0f8892 MEDIUM: mux-quic: Add consumer-side fast-forwarding support
The QUIC multiplexer now implements callbacks to consume fast-forwarded
data. It relies on the H3 stack to acquire the buffer and format the frame.
2023-11-24 07:42:43 +01:00
Amaury Denoyelle
a3187fe06c MINOR: rhttp: add count of active conns per thread
Add a new member <nb_rhttp_conns> in thread_ctx structure. Its purpose
is to count the current number of opened reverse HTTP connections
regardless of their listeners membership.

This patch will be useful to support multi-thread for active reverse
HTTP, in order to select the least loaded thread.

Note that although accesses to <nb_rhttp_conns> are only done by the
current thread, atomic operations are used. This is because once
multi-thread support is added, external threads will also retrieve
values from others.
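
As a rough sketch of the intent, assuming C11 atomics as a stand-in for
the project's own atomic macros: each thread updates only its own counter,
while any thread may read all of them to pick the least loaded one. The
thread_ctx_sketch struct, the fixed thread count and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS_SKETCH 4 /* hypothetical fixed thread count */

/* Hypothetical stand-in for the per-thread context. */
struct thread_ctx_sketch {
	atomic_uint nb_rhttp_conns;
};

static struct thread_ctx_sketch ctx_sketch[MAX_THREADS_SKETCH];

/* Called by thread <tid> when it opens a reverse HTTP connection. */
static void rhttp_conn_opened_sketch(size_t tid)
{
	assert(tid < MAX_THREADS_SKETCH);
	atomic_fetch_add_explicit(&ctx_sketch[tid].nb_rhttp_conns, 1,
	                          memory_order_relaxed);
}

/* Called by thread <tid> when it closes a reverse HTTP connection. */
static void rhttp_conn_closed_sketch(size_t tid)
{
	assert(tid < MAX_THREADS_SKETCH);
	atomic_fetch_sub_explicit(&ctx_sketch[tid].nb_rhttp_conns, 1,
	                          memory_order_relaxed);
}

/* May be called from any thread: returns the index of the thread with the
 * fewest connections, the lowest index winning on ties.
 */
static size_t rhttp_least_loaded_sketch(void)
{
	size_t best = 0;
	unsigned int best_val = atomic_load_explicit(&ctx_sketch[0].nb_rhttp_conns,
	                                             memory_order_relaxed);

	for (size_t i = 1; i < MAX_THREADS_SKETCH; i++) {
		unsigned int v = atomic_load_explicit(&ctx_sketch[i].nb_rhttp_conns,
		                                      memory_order_relaxed);
		if (v < best_val) {
			best = i;
			best_val = v;
		}
	}
	return best;
}
```

The value read from another thread is only a hint since it may change right
after being loaded, which is acceptable for load distribution.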
2023-11-23 17:43:01 +01:00
Amaury Denoyelle
55e78ff7e1 MINOR: rhttp: large renaming to use rhttp prefix
The previous commit renamed the 'proto_reverse_connect' module to 'proto_rhttp'.
This commit follows up by replacing the various custom prefixes with 'rhttp_'
to make the code uniform.

Note that the 'reverse_' prefix was kept in the connection module. This is
because if a new reversible protocol not based on HTTP is implemented,
it may be necessary to reuse the same connection functions, which are
protocol agnostic.
2023-11-23 17:40:01 +01:00
Amaury Denoyelle
e09af499b4 MINOR: rhttp: rename proto_reverse_connect
This commit renames the proto_reverse_connect module to proto_rhttp.
This name is selected as it is shorter and more precise.
2023-11-23 17:38:58 +01:00
Willy Tarreau
1de44daf7d MINOR: ext-check: add an option to preserve environment variables
In Github issue #2128, @jvincze84 explained the complexity of using
external checks in some advanced setups due to the systematic purge of
environment variables, and expressed the desire to preserve the
existing environment. During the discussion an agreement was found
around having an option to "external-check" to do that and that
solution was tested and confirmed to work by user @nyxi.

This patch just cleans this up, implements the option as
"preserve-env" and documents it. The default behavior does not change,
the environment is still purged, unless "preserve-env" is passed. The
choice of not using "import-env" instead was made so that we could
later use it to name specific variables that have to be imported
instead of keeping the whole environment.

The patch is simple enough that it could be backported if needed (and
was in fact tested on 2.6 first).
2023-11-23 16:53:57 +01:00
Ilya Shipitsin
80813cdd2a CLEANUP: assorted typo fixes in the code and comments
This is the 37th iteration of typo fixes
2023-11-23 16:23:14 +01:00
Willy Tarreau
6455fd5024 MINOR: debug: add the ability to enter components in the post_mortem struct
Here the idea is to collect components' versions and build options. The
main component is haproxy, but the API is made so that any sub-system
can easily add a component there (for example the detailed version of a
device detection lib, or some info about a lib loaded from Lua).

The elements are stored as a pointer to an array of structs and its count
so that it's sufficient to issue this in gdb to list them all at once:

  print *post_mortem.components@post_mortem.nb_components

For now we collect name, version, toolchain, toolchain options, build
options and path. Maybe more could be useful in the future.
2023-11-23 15:39:21 +01:00
Willy Tarreau
2268f10dd6 DEBUG: tinfo: store the pthread ID and the stack pointer in tinfo
When debugging a core, it's difficult to match a given gdb thread number
against an internal thread. Let's just store the pthread ID and the stack
pointer in each tinfo. This could help in the future by allowing to just
glance over them and pick the right one depending what info is found
first.
2023-11-23 14:32:55 +01:00
Amaury Denoyelle
54c94c60d2 DEBUG: connection/flags: update flags for reverse HTTP
Add missing CO_FL_REVERSED and CO_FL_ACT_REVERSING flag definitions in
conn_show_flags(). These flags were introduced in this release with
reverse HTTP support.

No need to backport
2023-11-20 18:10:12 +01:00
Amaury Denoyelle
decf29d06d MINOR: quic: remove unneeded QUIC specific stopping function
On CONNECTION_CLOSE reception/emission, QUIC connections enter CLOSING
state. At this stage, only CONNECTION_CLOSE can be reemitted and all
other exchanges are stopped.

Previously, on haproxy process stopping, if all QUIC connections were in
CLOSING state, they were released before their closing timer expiration
to not block the process shutdown. However, since a recent commit, the
closing timer has been shortened to a more reasonable delay. It is now
considered viable to respect connections closing state even on process
shutdown. As such, stopping specific code in QUIC connections idle timer
task was removed.

A specific function quic_handle_stopping() was implemented to notify
QUIC connections on shutdown from main() function. It should have been
deleted along the removal in QUIC idle timer task. This patch just does
this.
2023-11-20 17:59:52 +01:00
Willy Tarreau
445fc1fe3a BUG/MINOR: sock: mark abns sockets as non-suspendable and always unbind them
In 2.3, we started to get a cleaner socket unbinding mechanism with
commit f58b8db47 ("MEDIUM: receivers: add an rx_unbind() method in
the protocols"). This mechanism rightfully refrains from unbinding
when sockets are expected to be transferable to another worker via
"expose-fd listeners", but this is not compatible with ABNS sockets,
which do not support reuseport, unbinding nor being renamed: in short
they will always prevent a new process from binding.

It turns out that this is not much visible because by pure accident,
GTUNE_SOCKET_TRANSFER is only set in the code dealing with master mode
and daemons, so it's never set in foreground mode nor in tests even if
present on the stats socket. However with master mode, it is now always
set even when not present on the stats socket, and will always conflict.

The only reasonable approach seems to consist in marking these abns
sockets as non-suspendable so that the generic sock_unbind() code can
decide to just unbind them regardless of GTUNE_SOCKET_TRANSFER.

This should carefully be backported as far as 2.4.
2023-11-20 11:38:26 +01:00
Aurelien DARRAGON
4b2616f784 MINOR: log/backend: prevent stick table and stick rules with LOG mode
Report a warning and prevent errors if user tries to declare a stick table
or use stick rules within a log backend.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
6a29888f60 MINOR: log/backend: ensure log exclusive params are not used in other modes
Add proxy_cfg_ensure_no_log() function (similar to
proxy_cfg_ensure_no_http()) to ensure at the end of proxy parsing that
no log exclusive options are found if the proxy is not in log mode.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
b61147fd2a MEDIUM: log/balance: merge tcp/http algo with log ones
"log-balance" directive was recently introduced to configure the
balancing algorithm to use when in a log backend. However, it is
confusing and it causes issues when used in default section.

In this patch, we take another approach: first we remove the
"log-balance" directive, and instead we rely on existing "balance"
directive to configure log load balancing in log backend.

Some algorithms such as roundrobin can be used as-is in a log backend,
and for log-only algorithms, they are implemented as "log-$name" inside
the "backend" directive.

The documentation was updated accordingly.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
f42dfaa214 MEDIUM: lbprm: store algo params on 32bits
Make sure lbprm.algo can store 32bits by declaring it as uint32_t

Then, use all 32 available bits to offer 4 extra bits for the BE_LB_NEED
inputs. This will allow new required inputs to be easily added (up to 4
new ones, plus one that wasn't used yet if we keep them exclusive)

This required some cleanup: all ALGO bitfields were rewritten in the
32bits format and the high ones were shifted to make room for the
new BE_LB_NEED bits.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
a327b80f1f CLEANUP: backend: removing unused LB param
BE_LB_HASH_RND was introduced with 760e81d35 ("MINOR: backend: implement
random-based load balancing") but was never used since. Removing it
to regain an extra slot for future types.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
e10cf61099 MINOR: stktable: add stktable_deinit function
Adding stktable_deinit() helper function to properly clean up a sticktable
that was initialized using stktable_init().
2023-11-18 11:16:21 +01:00
Willy Tarreau
f592a0d5dd MINOR: rhttp: remove the unused outgoing connect() function
A dummy connect() function previously had to be installed for the log
server so that a reverse-http address could be referenced on a "server"
line, but after the recent rework of the server line parsing, this is
no longer needed, and this is actually annoying as it makes one believe
there is a way to connect outside, which is not true. Let's now get rid
of this function.
2023-11-17 18:10:16 +01:00
Frédéric Lécaille
888d1dc3dc MINOR: quic: Rename "handshake" timeout to "client-hs"
Use a more specific name for this timeout to distinguish it from a possible
future one on the server side.
Also update the documentation.
2023-11-17 18:09:41 +01:00
Frédéric Lécaille
e3e0bb90ce MEDIUM: quic: Add support for "handshake" timeout setting.
The idle timer task may be used to trigger the client handshake timeout.
The handshake timeout expiration date (qc->hs_expire) is initialized when the
connection is allocated. Obviously, this timeout is taken into account only
during the handshake by qc_idle_timer_do_rearm() whose job is to rearm the idle timer.

The idle timer expiration date could be initialized only one time, then
never updated until the handshake completes. But this only works if the
handshake timeout is smaller than the idle timer task timeout. If the handshake
timeout is set greater than the idle timeout, this latter may expire before the
handshake timeout.

This patch may have an impact on the L1/C1 interop tests (with heavy packet loss
or corruption). This is why I guess some implementations with handshake timeout
support set a big timeout during this test. This is at least the case for ngtcp2
which sets a 180s handshake timeout! haproxy will certainly have to proceed the
same way if it wants to have a chance to pass this test as it did before this
handshake timeout.
2023-11-17 17:31:42 +01:00
Frédéric Lécaille
b33eacc523 MINOR: proxy: Add "handshake" new timeout (frontend side)
Add a new timeout for the handshake, on the frontend side only. Such a handshake
timeout will typically be used for TLS handshakes during client connections to
TLS/TCP or QUIC frontends.
2023-11-17 17:31:42 +01:00
Christopher Faulet
5ed101e09c BUG/MINOR: stconn: Report read activity on non-indep streams for partial sends
A partial send is an activity, not a full blocking. Thus a read activity must
be reported for non-independent streams. It is especially important for very
congested streams where full sends are uncommon.

This patch must be backported to 2.8.
2023-11-17 15:36:43 +01:00
Christopher Faulet
020231ea79 MINOR: channel: Add functions to get info on buffers and deal with HTX streams
This patch adds HTX-aware versions of the functions c_data(), ci_data() and
c_empty(). channel_data() function returns the amount of data in the
channel, channel_input_data() returns the amount of input data and
channel_empty() returns true if the channel's buffer is empty. These
functions handle HTX buffers.

In addition, channel_data_limit() function, still HTX-aware, can be used to
get the maximum absolute amount of data that can be copied in a buffer,
independently on data already present in the buffer.
2023-11-17 15:08:15 +01:00
Christopher Faulet
7393bf7e42 MINOR: htx: Use a macro for overhead induced by HTX
The overhead induced by the HTX format was set to the HTX structure itself
and two HTX blocks. It was set this way to optimize zero-copy during
transfers. This value may (and will) be used at different places. Thus we
now use a macro, called HTX_BUF_OVERHEAD.
2023-11-17 12:13:00 +01:00
Christopher Faulet
b68c579eda BUG/MEDIUM: stconn: Update fsb date on partial sends
The first-send-blocked date was originally designed to save the date of the
first send of a series where some data remain blocked. It was relaxed
recently (3083fd90e "BUG/MEDIUM: stconn: Report a send activity everytime
data were sent") to save the date of the first full blocked send. However,
it is not accurate.

When all data are sent, the fsb value must be reset to TICK_ETERNITY. When
nothing is sent and if it is not already set, it must be set. But when data
are partially sent, the value must be updated and not reset. Otherwise the
write timeout may be ignored because fsb date is never set.
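
The three cases above can be sketched as follows. This is only an
illustration: sedesc_sketch and report_send_sketch() are hypothetical
names, and TICK_ETERNITY_SKETCH stands in for TICK_ETERNITY, assumed here
to be 0. The current date is assumed to never be equal to that value.

```c
#include <assert.h>
#include <stdbool.h>

#define TICK_ETERNITY_SKETCH 0u /* stand-in for TICK_ETERNITY: no date set */

/* Hypothetical holder for the first-send-blocked date. */
struct sedesc_sketch {
	unsigned int fsb;
};

/* Updates <fsb> after a send attempt performed at date <now_ms>.
 * <sent_some> is true if at least some data were sent, <data_left> is true
 * if some data remain blocked.
 */
static void report_send_sketch(struct sedesc_sketch *se, unsigned int now_ms,
                               bool sent_some, bool data_left)
{
	assert(now_ms != TICK_ETERNITY_SKETCH);

	if (!data_left)
		se->fsb = TICK_ETERNITY_SKETCH; /* all data sent: reset */
	else if (sent_some)
		se->fsb = now_ms;               /* partial send: update, don't reset */
	else if (se->fsb == TICK_ETERNITY_SKETCH)
		se->fsb = now_ms;               /* fully blocked: set only once */
}
```

With this logic a fully blocked send keeps the oldest date, so the write
timeout is still computed from the first blocked attempt.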

So, changes brought by the patch above are reverted and
sc_ep_report_blocked_send() was changed to know if some data were sent or
not. This way we are able to update fsb value.

This patch must be backported to 2.8.
2023-11-17 12:13:00 +01:00
Remi Tricot-Le Breton
45a2ff0f4a MINOR: shctx: Remove 'use_shared_mem' variable
This global variable was used to avoid using locks on shared_contexts in
the unlikely case of nbthread==1. Since the locks do not do anything
when USE_THREAD is not defined, it will be more beneficial to simply
remove this variable and the systematic test on its value in the shared
context locking functions.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
4fe6c1365d MINOR: shctx: Remove redundant arg from free_block callback
The free_block callback does not get called on blocks that are not row
heads anymore so we don't need two shared_block parameters.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
48f81ec09d MAJOR: cache: Delay cache entry delete in reserve_hot function
A reference counter on the cache_entry was added in a previous commit.
Its value is atomically increased and decreased via the retain_entry and
release_entry functions.

This is needed because of the latest cache and shared_context
modifications that introduced two separate locks instead of the
preexisting single shctx_lock one.
With the new logic, we have two main blocks competing for the two locks:
- the one in the http_action_req_cache_use that performs a lookup in the
  cache tree (locked by the cache lock) and then tries to remove the
  corresponding blocks from the shared_context's 'avail' list until the
  response is sent to the client by the cache applet,
- the shctx_row_reserve_hot that traverses the 'avail' list and gives
  them back to the caller, while removing previous row heads from the
  cache tree
Those two blocks require the two locks but one of them would take the
cache lock first, and the other one the shctx_lock first, which would
end in a deadlock without the current patch.

The way this conflict is resolved in this patch is by ensuring that at
least one of those uses works without taking the two locks at the same
time.
The solution found was to keep taking the two locks in the cache_use
case. We first lock the cache to lookup for an entry and we then take
the shctx lock as well to detach the corresponding blocks from the
'avail' list. The subtlety is that between the cache lookup and the
actual locking of the shctx, another thread might have called the
reserve_hot function in which we only take the shctx lock.
In this function we traverse the 'avail' list to remove blocks that are
then given to the caller. If one of those blocks corresponds to a
previous row head, we call the 'free_blocks' callback that used to
delete the cache entry from the tree.
We now avoid deleting directly the cache entries in reserve_hot and we
rather set the cache entries 'complete' param to 0 so that no other
thread tries to work with this entry. This way, when we release the
shctx lock in reserve_hot, the first thread that had performed the cache
lookup and had found an entry that we just gave to another thread will
see that the 'complete' field is 0 and it won't try to work with this
response.

The actual removal of entries from the cache tree will now be performed
in the new 'reserve_finish' callback called at the end of the
shctx_row_reserve_hot function. It will iterate on all the row head that
were inserted in a dedicated list in the 'free_block' callback and
perform the actual delete.
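
A much simplified sketch of this two-step removal is given below. Locks
are deliberately left out and all names (entry_sketch, pending_head,
free_block_sketch(), reserve_finish_sketch(), entry_usable_sketch()) are
hypothetical. The 'in_tree' flag merely stands in for membership in the
cache tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced cache entry. */
struct entry_sketch {
	bool complete;                     /* false: other threads must ignore it */
	bool in_tree;                      /* stand-in for cache tree membership */
	struct entry_sketch *next_pending; /* link in the deferred delete list */
};

/* Entries whose delete was deferred; would be local to one reserve call. */
static struct entry_sketch *pending_head;

/* Step 1, run while only the shctx lock is held: do not touch the tree,
 * just invalidate the entry and remember it for later.
 */
static void free_block_sketch(struct entry_sketch *e)
{
	assert(e != NULL);
	e->complete = false;
	e->next_pending = pending_head;
	pending_head = e;
}

/* Step 2, run once the shctx lock is released: the cache lock may now be
 * taken alone to remove the entries. Returns the number of removed entries.
 */
static size_t reserve_finish_sketch(void)
{
	size_t removed = 0;

	while (pending_head) {
		struct entry_sketch *e = pending_head;

		pending_head = e->next_pending;
		e->next_pending = NULL;
		e->in_tree = false;
		removed++;
	}
	return removed;
}

/* Lookup side: an entry found in the tree is only usable if still complete. */
static bool entry_usable_sketch(const struct entry_sketch *e)
{
	return e->in_tree && e->complete;
}
```

Between the two steps an entry can still be found by a lookup, but it is
rejected because its 'complete' field is 0.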
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
1cd91b4f2a MINOR: shctx: Add new reserve_finish callback call to shctx_row_reserve_hot
This patch adds a reserve_finish callback that can be defined by the
subsystems that require a shared_context. It is called at the end of
shctx_row_reserve_hot after the shared_context lock is released.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
ed35b9411a MEDIUM: cache: Switch shctx spinlock to rwlock and restrict its scope
Since a lock on the cache tree was added in the latest cache changes, we
do not need to use the shared_context's lock to lock more than pure
shared_context related data anymore. This already existing lock will now
only cover the 'avail' list from the shared_context. It can then be
changed to a rwlock instead of a spinlock because we might want to only
run through the avail list sometimes.

Apart from changing the type of the shctx lock, the main modification
introduced by this patch is to limit the amount of code covered by the
shctx lock. This lock does not need to cover any code strictly related
to the cache tree anymore.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
3831d8454f MEDIUM: shctx: Remove 'hot' list from shared_context
The "hot" list stored in a shared_context was used to keep a reference
to shared blocks that were currently being used and were thus removed
from the available list (so that they don't get reused for another cache
response). This 'hot' list does not ever need to be shared across
threads since every one of them only works on their current row.

The main need behind this 'hot' list was to detach the corresponding
blocks from the 'avail' list and to have a known list root when calling
list_for_each_entry_from in shctx_row_data_append (for instance).

Since we actually never need to iterate over all members of the 'hot'
list, we can remove it and replace the inc_hot/dec_hot logic by a
detach/reattach one.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
ac9c49b40d MEDIUM: cache: Use dedicated cache tree lock alongside shctx lock
Every use of the cache tree was covered by the shctx lock even when no
operations were performed on the shared_context lists (avail and hot).
This patch adds a dedicated RW lock for the cache so that blocks of code
that work on the cache tree only can use this lock instead of the
superseding shctx one. This is useful for operations during which the
concerned blocks are already in the hot list.
When the two locks need to be taken at the same time, in
http_action_req_cache_use and in shctx_row_reserve_hot, the shctx one
must be taken first.
A new parameter needed to be added to the shared_context's free_block
callback prototype so that cache_free_block can take the cache lock and
release it afterwards.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
81d8014af8 MINOR: shctx: Remove explicit 'from' param from shctx_row_data_append
This parameter is not necessary since the first element of a row always
has a pointer to the row's tail.
2023-11-16 19:35:10 +01:00
Amaury Denoyelle
8cc3fc73f1 MINOR: connection: update rhttp flags usage
Change the flags used for reversed connections :
* CO_FL_REVERSED is now set after reversal for passive connect. For
  active connect, it is delayed until accept is completed after reversal.
* CO_FL_ACT_REVERSING replaces the old CO_FL_REVERSED. It is set only for
  active connect on reversal and removed once accept is done.

This allows to identify a connection as reversed during its whole
lifetime. This should be useful to extend reverse connect.
2023-11-16 17:53:31 +01:00
Christopher Faulet
e5cffa8ace MINOR: connection: Add a CTL flag to notify mux it should wait for reads again
MUX_SUBS_RECV ctl flag is added to instruct the mux it should wait for read
events. This flag will be pretty useful to handle abortonclose option.
2023-11-14 11:01:51 +01:00
Frédéric Lécaille
9021e8935e MINOR: quic: Maximum congestion control window for each algo
Make all the congestion control algorithms support the maximum congestion
control window set by configuration. There is nothing special to explain. For
each algo, each time the window is incremented it is also bounded.
2023-11-13 17:53:18 +01:00
Frédéric Lécaille
028a55a1d0 MINOR: quic: Add a max window parameter to congestion control algorithms
Add a new ->max_cwnd member to bind_conf struct to store the maximum
congestion control window value for each QUIC binding.
Modify the "quic-cc-algo" keyword parsing to add an optional parameter
to its value: the maximum congestion window value between parentheses
as follows:

      ex: quic-cc-algo cubic(10m)

This value must be bounded, greater than 10k and smaller than 1g.
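
To illustrate the syntax only, here is a minimal standalone sketch of how
such a value could be extracted and checked. It is not the parser from this
patch: parse_max_cwnd_sketch() is a hypothetical name, the k/m/g suffixes
are assumed to be powers of 1024, and both bounds are assumed to be
exclusive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parses the optional "(<size>)" part of e.g. "cubic(10m)" into <max_cwnd>.
 * Returns 0 on success, -1 if the parameter is absent, malformed or out of
 * the ]10k, 1g[ range.
 */
static int parse_max_cwnd_sketch(const char *arg, uint64_t *max_cwnd)
{
	const char *open;
	char *end;
	unsigned long long v;

	assert(arg != NULL && max_cwnd != NULL);

	open = strchr(arg, '(');
	if (!open || open[1] < '0' || open[1] > '9')
		return -1;

	v = strtoull(open + 1, &end, 10);
	if (v > (1ULL << 30))
		return -1; /* already too large, and avoids overflow below */

	switch (*end) {
	case 'k': v <<= 10; end++; break;
	case 'm': v <<= 20; end++; break;
	case 'g': v <<= 30; end++; break;
	default: break;
	}

	/* the value must be immediately followed by the closing parenthesis */
	if (end[0] != ')' || end[1] != '\0')
		return -1;

	if (v <= 10ULL * 1024 || v >= 1024ULL * 1024 * 1024)
		return -1;

	*max_cwnd = v;
	return 0;
}
```

With these assumptions, "cubic(10m)" yields 10485760 while "cubic(1k)" and
"cubic(2g)" are rejected.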
2023-11-13 17:53:18 +01:00
Aurelien DARRAGON
64e0b63442 BUG/MEDIUM: server: invalid address (post)parsing checks
This bug was introduced with 29b76ca ("BUG/MEDIUM: server/log: "mode log"
after server keyword causes crash ")

Indeed, we cannot safely rely on addr_proto being set when str2sa_range()
returns in parse_server() (even if SRV_PARSE_PARSE_ADDR is set), because
proto lookup might be bypassed when FQDN addresses are involved.

Unfortunately, the above patch wrongly assumed that proto would always
be set when SRV_PARSE_PARSE_ADDR was passed to parse_server() (so when
str2sa_range() was called), resulting in invalid postparsing checks being
performed, which could as well lead to crashes with log backends
("mode log" set) because some postparsing init was skipped as a result of
proto not being set and this wasn't expected later in the init code.

To fix this, we now make use of the previous patch to perform server's
address compatibility checks on hints that are always set when
str2sa_range() successfully returns.

For log backend, we're also adding a complementary test to check if the
address family is of expected type, else we report an error, plus we're
moving the postinit logic in log api since _srv_check_proxy_mode() is
only meant to check proxy mode compatibility and we were abusing it.

This patch depends on:
 - "MINOR: tools: make str2sa_range() directly return type hints"

No backport required unless 29b76ca gets backported.
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
12582eb8e5 MINOR: tools: make str2sa_range() directly return type hints
str2sa_range() already allows the caller to provide <proto> in order to
get a pointer on the protocol matching with the string input thanks to
5fc9328a ("MINOR: tools: make str2sa_range() directly return the protocol")

However, as stated into the commit message, there is a trick:
   "we can fail to return a protocol in case the caller
    accepts an fqdn for use later. This is what servers do and in this
    case it is valid to return no protocol"

In this case, we're unable to return protocol because the protocol lookup
depends on both the [proto type + xprt type] and the [family type] to be
known.

While family type might not be directly resolved when fqdn is involved
(because family type might be discovered using DNS queries), proto type
and xprt type are already known. As such, the caller might be interested
in knowing those address related hints even if the address family type is
not yet resolved and thus the matching protocol cannot be looked up.

Thus in this patch we add the optional net_addr_type (custom type)
argument to str2sa_range to enable the caller to check the protocol type
and transport type when the function succeeds.
2023-11-10 17:49:57 +01:00
Willy Tarreau
a13f8425f0 MINOR: task/debug: make task_queue() and task_schedule() possible callers
It's common to see process_stream() being woken up by wake_expired_tasks
in the profiling output, without knowing which timeout was set to cause
this. By making it possible to record the call places of task_queue()
and task_schedule(), and by making wake_expired_tasks() explicitly not
replace it, we'll be able to know which task_queue() or task_schedule()
was triggered for a given wakeup.

For example below:
  process_stream                51200   311.4ms   6.081us   34.59s    675.6us <- run_tasks_from_lists@src/task.c:659 task_queue
  process_stream                19227   70.00ms   3.640us   9.813m    30.62ms <- sc_notify@src/stconn.c:1136 task_wakeup
  process_stream                 6414   102.3ms   15.95us   8.093m    75.70ms <- stream_new@src/stream.c:578 task_wakeup

It's visible that it's the run_tasks_from_lists() which in fact applies
on the task->expire returned by the ->process() function itself.
2023-11-09 17:24:00 +01:00
Willy Tarreau
0eb0914dba MINOR: task/debug: explicitly support passing a null caller to wakeup functions
This is used for tracing and profiling. By permitting to have a NULL
caller, we allow a caller to explicitly pass zero to state that the
current caller must not be replaced. This will soon be used by
wake_expired_tasks() to avoid replacing a caller in the expire loop.
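
The principle can be sketched like this, with hypothetical types and names
(caller_sketch, task_sketch, task_set_caller_sketch()) that only mimic the
idea and are not the actual wakeup code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a call place. */
struct caller_sketch {
	const char *func;
	const char *file;
	int line;
};

/* Hypothetical task keeping track of the last recorded caller. */
struct task_sketch {
	const struct caller_sketch *caller;
};

/* Records <caller> as the one which woke the task up, unless it is NULL,
 * in which case the previously recorded caller is preserved.
 */
static void task_set_caller_sketch(struct task_sketch *t,
                                   const struct caller_sketch *caller)
{
	assert(t != NULL);
	if (caller)
		t->caller = caller;
}
```

A wakeup function passing NULL then leaves the place which originally
scheduled the task visible in the profiling output.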
2023-11-09 17:24:00 +01:00
Amaury Denoyelle
bb28215d9b MEDIUM: quic: define an accept queue limit
QUIC connections are pushed manually into a dedicated listener queue
when they are ready to be accepted. This happens after handshake
finalization or on 0-RTT packet reception. Listener is then woken up to
dequeue them with listener_accept().

This patch accounts for the number of connections currently stored in
the accept queue. If reaching a certain limit, INITIAL packets are
dropped on reception to prevent further QUIC connections allocation.
This should help to preserve system resources.

This limit is automatically derived from the listener backlog. Half of
its value is reserved for handshakes and the other half for accept
queues. By default, backlog is equal to maxconn, which guarantees that
there can be no more than maxconn connections in handshake or waiting
to be accepted.
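
As an illustration of the split, assuming INITIAL packets are dropped as
soon as either half is exhausted: the li_sketch struct, its counters and
must_drop_initial_sketch() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced listener state. */
struct li_sketch {
	unsigned int backlog;      /* defaults to maxconn */
	unsigned int hs_count;     /* connections still in handshake */
	unsigned int accept_count; /* connections waiting in the accept queue */
};

/* Returns true if an INITIAL packet for a new connection must be dropped
 * because one of the two halves of the backlog is already full.
 */
static bool must_drop_initial_sketch(const struct li_sketch *li)
{
	unsigned int half;

	assert(li != NULL);
	half = li->backlog / 2;
	return li->hs_count >= half || li->accept_count >= half;
}
```

With a backlog of 100, new connections are refused once 50 handshakes are
in progress or once 50 connections are waiting to be accepted.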
2023-11-09 16:24:00 +01:00
Amaury Denoyelle
3df6a60113 MEDIUM: quic: limit handshake per listener
Implement a limit per listener for concurrent number of QUIC
connections. When reached, INITIAL packets for new connections are
automatically dropped until the number of handshakes is reduced.

The limit value is automatically based on listener backlog, which itself
defaults to maxconn.

This feature is important to ensure CPU and memory resources are not
consumed if too many handshake attempts are started in parallel.

Special care is taken if a connection is released before handshake
completion. In this case, counter must be decremented. This forces to
ensure that member <qc.state> is set early in qc_new_conn() before any
quic_conn_release() invocation.
2023-11-09 16:23:52 +01:00
Amaury Denoyelle
278808915b MINOR: quic: reduce half open counters scope
Accounting is implemented for half open connections which represent QUIC
connections waiting for handshake completion. When reaching a certain
limit, Retry mechanism is automatically activated prior to instantiate
new connections.

The issue with this behavior is that two notions are mixed : QUIC
connection handshake phase and Retry which is mechanism against
amplification attacks. As such, only peer address validation should be
taken into account to activate Retry protection.

This patch chooses to reduce the scope of half_open_conn. Now only
connections waiting to validate the peer address are accounted for.
Most notably, connections instantiated with a validated Retry token
check are not accounted.

One impact of this patch is that it should prevent the Retry mechanism
from being activated too early, in particular if multiple handshakes are
too slow. Another limitation should be implemented to protect against
this scenario.
2023-11-09 16:23:52 +01:00
Amaury Denoyelle
d38bb7f8a7 MEDIUM: quic: adjust address validation
When a new QUIC connection is created, server considers peer address as
not yet validated. The server must limit its sending up to 3 times the
content already received. This is a defensive measure to avoid flooding
a remote host victim of address spoofing.

This patch adjusts the condition to consider the peer address as
validated. Two conditions are now considered :
* successful handling of a received HANDSHAKE packet. This was already
  done before although implemented in a different way.
* validation of a Retry token. This was not considered prior to this patch
  despite the RFC recommendation.

This patch also adjusts how a connection is internally labelled as using
a validated peer address. Before, above conditions were checked via
quic_peer_validated_addr(). Now, a flag QUIC_FL_CONN_PEER_VALIDATED_ADDR
is set to label this. It already existed prior to this patch but was
only used for quic_cc_conn. This should now be more explicit.
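
For reference, the sending limit which applies until the peer address is
validated can be sketched as below. The qc_sketch struct, its fields and
sendable_bytes_sketch() are hypothetical names. Only the "3 times the
content already received" rule comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced connection state. */
struct qc_sketch {
	uint64_t rx_bytes;    /* bytes received from the peer so far */
	uint64_t tx_bytes;    /* bytes already sent to the peer */
	bool peer_validated;  /* stands in for QUIC_FL_CONN_PEER_VALIDATED_ADDR */
};

/* Returns how many of the <wanted> bytes may be sent right now. */
static uint64_t sendable_bytes_sketch(const struct qc_sketch *qc, uint64_t wanted)
{
	uint64_t budget;

	assert(qc != NULL);
	if (qc->peer_validated)
		return wanted; /* no amplification limit anymore */

	budget = 3 * qc->rx_bytes;
	if (qc->tx_bytes >= budget)
		return 0;
	budget -= qc->tx_bytes;
	return wanted < budget ? wanted : budget;
}
```

Once a HANDSHAKE packet is successfully handled or a Retry token is
validated, the flag is set and the limit no longer applies.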
2023-11-09 16:23:52 +01:00
Frédéric Lécaille
0016dbaef4 BUG/MEDIUM: quic: Possible crashes during secrets allocations (heavy load)
This bug could be reproduced with -dMfail option and detected by libasan.
During the TLS secrets allocations, when failed, quic_tls_ctx_secs_free()
is called. It resets the already initialized secrets. Some were detected
as initialized when not, or with a non initialized length, which leads
to big "memset(0)" detected by libasan.

Ensure that all the secrets are really initialized with correct lengths.

No need to be backported.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
4cfae3ac01 MINOR: quic: release the TLS context asap from quic_conn_release()
There was no reason not to release as soon as possible the TLS/SSL QUIC connection
context from quic_conn_release() before allocating a "closing connection" connection
(quic_cc_conn struct).
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
3a8dd48e30 MEDIUM: quic: Heavy task mode with non contiguously bufferized CRYPTO data
This patch sets the handshake task in heavy task mode when receiving out-of-order
CRYPTO data which results in in-order bufferized CRYPTO data. This is done
thanks to a non-contiguous buffer and from qc_handle_crypto_frm() after having
potentially bufferized CRYPTO data in this buffer.
qc_treat_rx_crypto_frms() is no longer called from qc_treat_rx_pkts() but instead
this is where the task is set in heavy task mode. Consequently,
this is the job of qc_ssl_provide_all_quic_data() to call directly
qc_treat_rx_crypto_frms() to provide the in order bufferized CRYPTO data to the
TLS stack. As this function releases the non-contiguous buffer for the CRYPTO
data, if possible, there is no need to do that from qc_treat_rx_crypto_frms()
anymore.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
94d20be138 MEDIUM: quic: Heavy task mode during handshake
Add a new pool for the CRYPTO data frames received in order.
Add ->rx.crypto_frms list to each encryption level to store such frames
when they are received in order from qc_handle_crypto_frm().
Also set the handshake task (qc_conn_io_cb()) in heavy task mode from
this function after having received such frames. When this task
detects that it is set in heavy mode, it calls the newly implemented
qc_ssl_provide_all_quic_data() function to provide the CRYPTO data to the TLS task.
Modify quic_conn_enc_level_uninit() to release these CRYPTO frames
when releasing the encryption level they are in relation with.
2023-11-09 10:32:31 +01:00
Christopher Faulet
84d26bcf3f MINOR: stconn/mux-h2: Use a iobuf flag to report EOI to consumer side during FF
IOBUF_FL_EOI iobuf flag is now set by the producer to notify the consumer
that the end of input was reached. Thanks to this flag, we can remove the
ugly hack in h2_done_ff() to test the opposite SE flags.

Of course, for now, it works and it is good enough. But we must keep in mind
that EOI is always forwarded from the producer side to the consumer side in
this case. But if this changes, a new CO_RFL_ flag will have to be added to
instruct the producer whether it can forward EOI or not.
2023-11-08 21:14:07 +01:00
Christopher Faulet
4be0c7c655 MEDIUM: stconn/muxes: Loop on data fast-forwarding to forward at least a buffer
In the mux-to-mux data forwarding, we now try, as far as possible, to send at
least a buffer. Of course, if the consumer side is congested or if nothing
more can be received, we leave. But the idea is to retry to fast-forward
data if less than a buffer was forwarded. It is only performed for buffer
fast-forwarding, not splicing.

The idea behind this patch is to optimise the forwarding, when a first
forward was performed to complete a buffer with some existing data. In this
case, the amount of data forwarded is artificially limited because we are
using a non-empty buffer. But without this limitation, it is highly probable
that a full buffer could have been sent. And indeed, with an H2 client, a
significant improvement was observed during our tests.

To do so, .done_fastfwd() callback function must be able to deal with
interim forwards. Especially for the H2 mux, to remove H2_SF_NOTIFIED flags
on the H2S on the last call only. Otherwise, the H2 stream can be blocked by
itself because it is in the send_list. IOBUF_FL_INTERIM_FF iobuf flag is
used to notify the consumer it is not the last call. This flag is then
removed on the last call.
2023-11-08 21:14:07 +01:00
Willy Tarreau
89c6b67a82 BUG/MEDIUM: pool: fix releasable pool calculation when overloaded
In 2.6-dev1, the method used to decide how many pool entries could be
released at once was revisited to support releases in batches. This was
done with commits 91a8e28f9 ("MINOR: pool: add a function to estimate
how many may be released at once") and 361e31e3f ("MEDIUM: pool: compute
the number of evictable entries once per pool").

The first commit takes care of the possible inconsistency between the
moment the allocated count and the used count are read, but unfortunately
fixed it the wrong way, by adjusting "used" to match "alloc" whenever it
was lower (i.e. almost always). This results in a nasty case which is that
as soon as the allocated value becomes higher than the estimated count of
needed entries, we end up returning pool->minavail, which causes very
small batches to be released, starting from commit 1513c5479 ("MEDIUM:
pools: release cached objects in batches").

The problem was further amplified in 2.9-dev3 with commit 7bf829ace
("MAJOR: pools: move the shared pool's free_list over multiple buckets")
because it now becomes possible for a thread to allocate from one bucket
and release into a few other different ones, causing an accumulation of
entries in that bucket.

The fix is trivial, simply adjust the alloc counter if the used one is
higher, before performing operations.

This must be backported to 2.6.
2023-11-08 17:12:49 +01:00
Amaury Denoyelle
6f9b65f952 BUG/MEDIUM: quic: fix sslconns on quic_conn alloc failure
QUIC connections are accounted inside global sslconns. As with QUIC
actconn, it suffered from a similar issue if an intermediary allocation
failed inside qc_new_conn().

Fix this similarly by moving increment operation inside qc_new_conn().
Increment and error path are now centralized and much easier to
validate.

The consequences are similar to the actconn fix: on memory allocation
failure, global sslconns may wrap, this time blocking any future QUIC or SSL
connections on the process.

This must be backported up to 2.6.
2023-11-07 14:06:02 +01:00
Christopher Faulet
62812b2a1d DOC: stconn: Improve comments about lra and fsb usage
Recent fixes have shown that <lra> and <fsb> uses were not pretty clear. So let's
try to improve documentation about these values, especially when <lra> is
updated and how to use it.
2023-11-07 10:41:11 +01:00
Christopher Faulet
d247152ec2 BUG/MEDIUM: Don't apply a max value on room_needed in sc_need_room()
In sc_need_room(), we compute the maximum room that can be requested to
restart reading, to be sure to be able to unblock the SC, at worst when the
buffer is emptied. Here, the buffer reserve is taken into account, but this
is an issue.

Counting the reserve can lead to a wicked bug with the H1 multiplexer, when
a small amount of data is found at the end of the HTX buffer. In this case,
to not wrap, the H1 mux requests more room. It is an optimization to be able to
resync the buffer with the consumer side and to be able to perform zero-copy
transfers. However, if this amount of data is smaller than the reserve and
if the consumer is congested, we fall in a loop because the wrong value is
used to request more room. The H1 mux continues to pretend there is not
enough space in the buffer, while the effective requested value is lower
than the free space in the buffer. While the consumer is congested and does
not consume these data, there is no way to stop the loop.

We can fix the function by removing the buffer reserve from the
computation. But it remains a dangerous decision to apply a max value on
room_needed. It is safer to require that the caller set a correct value. For
now, this is the case. But in the end, it is totally unexpected to wait for more
room than an empty buffer can contain.

This patch must be backported to 2.8.
2023-11-07 10:35:38 +01:00
Christopher Faulet
4a2660aa45 BUG/MEDIUM: stconn: Don't report rcv/snd expiration date if SC cannot epxire
When receive or send expiration date of a stream-connector is retrieved, we
now automatically check if it may expire. If not, TICK_ETERNITY is returned.

The expiration dates of the frontend and backend stream-connectors are used
to compute the stream expiration date. This operation is performed at 2
places: at the end of process_stream() and in sc_notify() if the stream is
not woken up.

With this patch, there are no special changes for process_stream() because it
was already handled. It makes things a little simpler. However, it fixes
sc_notify() by avoiding erroneously computing an expiration date in the
past. This greatly reduces the stream wakeups when there is contention on the
consumer side.

The bug was introduced with the commit 8073094bf ("BUG/MEDIUM: stconn:
Always update stream's expiration date after I/O"). It was an error to
unconditionally set the stream expiration date, without testing blocking
conditions on both SCs.

This patch must be backported to 2.8.
2023-11-07 10:30:01 +01:00
Christopher Faulet
141b489291 BUG/MEDIUM: stconn: Report send activity during mux-to-mux fast-forward
When data are directly forwarded from a mux to the opposite one, we must not
forget to report send activity when data are successfully sent or report a
blocked send when data are blocked. It is important because otherwise, if
the transfer is quite long, longer than the client or server timeout, an
error may be triggered because the write timeout is reached.

H1, H2 and PT muxes are concerned. To fix the issue, the done_fastfwd()
callback now returns the amount of data consumed. This way it is possible
to update/reset the FSB data accordingly.

No backport needed.
2023-11-07 10:30:01 +01:00
Alexander Stephan
6f4bfed3a2 MINOR: server: Add parser support for set-proxy-v2-tlv-fmt
This commit introduces a generic server-side parsing of type-value pair
arguments and allocation of a TLV list via a new keyword called
set-proxy-v2-tlv-fmt.

This makes it possible to 1) forward any TLV type with the help of fc_pp_tlv,
and 2) more generally, send out any TLV type and value via a log format
expression. To have this fully working, the connection will need to be updated
in a follow-up commit to actually respect the new server TLV list.

default-server support has also been implemented.
2023-11-04 04:56:59 +01:00
Aurelien DARRAGON
5158c0ff69 MEDIUM: stktable/peers: "write-to" local table on peer updates
In this patch, we add the possibility to declare on a table definition
("table" in peer section, or "stick-table" in proxy section) that we
want the remote/peer updates on that table to be pushed on a local
haproxy table in addition to the source table.

Consider this example:

  |peers mypeers
  |        peer local 127.0.0.1:3334
  |        peer clust 127.0.0.1:3333
  |        table t1.local type string size 10m store server_id,server_key expire 30s
  |        table t1.clust type string size 10m store server_id,server_key write-to mypeers/t1.local expire 30s

With this setup, we consider haproxy uses t1.local as cache/local table
for read and write operations, and that t1.clust is a remote table
containing data processed from t1.local and similar tables from other
haproxy peers in a cluster setup. The t1.clust table will be used to
refresh the local/cache one via the "write-to" statement.

What happens is that every time haproxy sees entry updates for
the t1.clust table, it overwrites the t1.local table with fresh data and
updates the entry expiration timer. If the t1.local entry doesn't exist
yet (key doesn't exist), it is automatically created. Note that only
types that cannot be used for arithmetic ops are handled, in order
to prevent processed values from the remote table from interfering with
computations based on values from the local table (i.e. to prevent
cumulative counters from growing indefinitely).

"write-to" will only push supported types if they both exist in the source
and the target table. Be careful with server_id and server_key storage
because they are often declared implicitly when referencing a table in
sticking rules but it is required to declare them explicitly for them to
be pushed between a remote and a local table through "write-to" option.

Also note that the "write-to" target table should have the same type as
the source one, and that the key length should be strictly equal,
otherwise haproxy will raise an error due to the tables being
incompatibles. A table that is already being written to cannot be used
as a source table for a "write-to" target.

Thanks to this patch, it will now be possible to use sticking rules in
peer cluster context by using a local table as a local cache which
will be automatically refreshed by one or multiple remote table(s).
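
The "write-to" behaviour described above can be modelled roughly as follows.
This is only a Python sketch of the semantics, not HAProxy's C code: the
Table class, apply_peer_update() and the AS_IS_TYPES set are hypothetical
names introduced for illustration, and the set of non-arithmetic types is an
assumption based on the example above.

```python
# Illustrative subset of data types assumed to be "as-is" (non-arithmetic).
AS_IS_TYPES = {"server_id", "server_key"}


class Table:
    """Hypothetical in-memory model of a stick-table."""

    def __init__(self, name, stored_types, expire):
        self.name = name
        self.stored_types = set(stored_types)
        self.expire = expire      # entry lifetime, in seconds
        self.entries = {}         # key -> {"data": {...}, "expires_at": t}


def apply_peer_update(source, target, key, data, now):
    """Apply a remote update to `source`, then push it to its write-to `target`."""
    source.entries[key] = {"data": dict(data), "expires_at": now + source.expire}
    # Only as-is types stored by both tables are pushed.
    pushed = {
        t: v for t, v in data.items()
        if t in AS_IS_TYPES and t in source.stored_types and t in target.stored_types
    }
    # Create the target entry if the key does not exist yet.
    entry = target.entries.setdefault(key, {"data": {}, "expires_at": 0})
    entry["data"].update(pushed)                 # overwrite with fresh values
    entry["expires_at"] = now + target.expire    # refresh the expiration timer
    return pushed
```

In this model a counter such as conn_cnt held in the local table is left
untouched by remote updates, which is the point of restricting the push to
non-arithmetic types.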

This commit depends on:
 - "MINOR: stktable: stktable_init() sets err_msg on error"
 - "MINOR: stktable: check if a type should be used as-is"
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
db0cb54f81 MINOR: stktable: check if a type should be used as-is
stick table types now have an extra bit named 'as_is' that allows us to
check if such type should be used as-is or if it may be involved in
arithmetic operations such as counters. This can be useful since those
types are not common and may require specific handling.

e.g.: stktable_data_types[data_type].as_is will be set to 1 if the type
cannot be used in arithmetic operations.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
b8c19f877a MINOR: stktable: stktable_init() sets err_msg on error
stktable_init() now sets err_msg when an error occurs so that the caller is able
to precisely report the cause of the failure.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
b9c0b039c8 MINOR: proxy/stktable: add resolve_stick_rule helper function
Simplify stick and store sticktable proxy rules postparsing by adding
a sticking rule entry resolve (postparsing) function.

This will ease code maintenance.
2023-11-03 17:30:30 +01:00
Ruei-Bang Chen
7a1ec235cd MINOR: sample: Add fetcher for getting all cookie names
This new fetcher can be used to extract the list of cookie names from
Cookie request header or from Set-Cookie response header depending on
the stream direction. There is an optional argument that can be used
as the delimiter (which is assumed to be the first character of the
argument) between cookie names. The default delimiter is comma (,).

Note that we will treat the Cookie request header as a semi-colon
separated list of cookies and each Set-Cookie response header as
a single cookie and extract the cookie names accordingly.
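
The extraction rules above can be illustrated with a short Python sketch. It
is not the fetcher's implementation: both function names are hypothetical,
and the parsing is deliberately simplified (no quoting or malformed-header
handling).

```python
def _separator(arg):
    """The delimiter is the first character of the argument, comma by default."""
    return arg[:1] if arg else ","


def cookie_names_from_request(cookie_header, arg=""):
    """Cookie request header: a semicolon-separated list of name=value pairs."""
    names = []
    for pair in cookie_header.split(";"):
        name = pair.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return _separator(arg).join(names)


def cookie_names_from_response(set_cookie_headers, arg=""):
    """Each Set-Cookie response header carries a single cookie."""
    names = []
    for header in set_cookie_headers:
        first = header.split(";", 1)[0]   # drop attributes such as Path or Secure
        name = first.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return _separator(arg).join(names)
```

For instance, under these assumptions the request header "a=1; b=2" yields
"a,b", and the same header with the argument "|" yields "a|b".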
2023-11-03 09:57:06 +01:00
Amaury Denoyelle
4a89dba6d5 MEDIUM: quic: count quic_conn for global sslconns
Similar to the previous commit which checks for maxconn before allocating
a QUIC connection, this patch checks for maxsslconn at the same step.
This is necessary as a QUIC connection cannot run without an SSL context.

This should be backported up to 2.6. It relies on the following patch :
  "BUG/MINOR: ssl: use a thread-safe sslconns increment"
2023-10-26 15:35:58 +02:00
Amaury Denoyelle
7735cf3854 MEDIUM: quic: count quic_conn instance for maxconn
Increment actconn and check maxconn limit when a quic_conn is
instantiated. This is necessary because prior to this patch, quic_conn
instances were not counted. Global actconn was only incremented after
the handshake has been completed and the connection structure is
allocated.

The increment is done using increment_actconn() on INITIAL packet
parsing if a new connection is about to be created. If the limit is
reached, the allocation is cancelled and the INITIAL packet is dropped.

The decrement is done under quic_conn_release(). This means that
quic_cc_conn instances are not taken into account. This seems safe
enough because quic_cc_conn are only used for minimal usage.

The counterpart of this change is that maxconn must not be checked a
second time when listener_accept() is done over a QUIC connection. For
this, a new bind_conf flag BC_O_XPRT_MAXCONN is set for listeners when
maxconn is already counted by the lower layer. For the moment, it is
set only for QUIC listeners.

Without this patch, the haproxy process could suffer from heavy memory/CPU
load if the number of concurrent handshakes is high.

This patch is not considered a bug fix per se. However, it has the major
benefit of protecting against too many QUIC handshakes. As such, it should
be backported up to 2.6. For this, it relies on the following patch :
  "MINOR: frontend: implement a dedicated actconn increment function"
2023-10-26 15:35:56 +02:00
Amaury Denoyelle
350f8b0c07 BUG/MINOR: ssl: use a thread-safe sslconns increment
Each time a new SSL context is allocated, global.sslconns is
incremented. If global.maxsslconn is reached, the allocation is
cancelled.

This procedure was not entirely thread-safe due to the check and
increment operations being conducted at different stages. This could lead to
global.maxsslconn being slightly exceeded when several threads allocate SSL
contexts while sslconns is near the limit.

To fix this, use a CAS operation in a do/while loop. This code is
similar to the actconn/maxconn increment for connection.

A new function increment_sslconn() is defined for this operation. For
the moment, only SSL code is using it. However, it is expected that QUIC
will also use it to count QUIC connections as SSL ones.
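
The check-and-increment loop described above can be modelled as follows. This
is a minimal Python sketch, not HAProxy's C code: AtomicInt and its
compare_and_swap() method are hypothetical stand-ins for the atomic
primitives, the function below only borrows the increment_sslconn() name, and
treating a limit of 0 as "no limit" is an assumption of the sketch.

```python
import threading


class AtomicInt:
    """Hypothetical stand-in for an atomic integer offering a CAS primitive."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the current value still equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def increment_sslconn(counter, maxsslconn):
    """Reserve one slot; return False if the limit is already reached."""
    while True:
        cur = counter.load()
        if maxsslconn and cur >= maxsslconn:
            return False          # limit reached, nothing was incremented
        if counter.compare_and_swap(cur, cur + 1):
            return True           # our increment won the race
        # Another thread changed the counter in between: read it again and retry.


def run_demo(limit=100, attempts_per_thread=50, nb_threads=8):
    """Let several threads race for slots; return (accepted, final count)."""
    counter = AtomicInt()
    results = []                  # list.append is thread-safe in CPython

    def worker():
        for _ in range(attempts_per_thread):
            results.append(increment_sslconn(counter, limit))

    threads = [threading.Thread(target=worker) for _ in range(nb_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), counter.load()
```

Because the limit check and the increment are validated together by the CAS,
the counter in this model can never exceed the limit, whatever the thread
interleaving.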

This should be backported to all stable releases. Note that prior to
2.6, sslconns was outside of the global struct, so this commit should be
slightly adjusted.
2023-10-26 15:25:07 +02:00
Amaury Denoyelle
fffd435bbd MINOR: frontend: implement a dedicated actconn increment function
When a new frontend connection is instantiated, actconn global counter
is incremented. If global maxconn value is reached, the connection is
cancelled. This ensures that system limits are under control.

Prior to this patch, the atomic check/increment operations were done
directly into listener_accept(). Move them in a dedicated function
increment_actconn() in frontend module. This will be useful when QUIC
connections will be counted in actconn counter.
2023-10-26 15:18:48 +02:00
Willy Tarreau
96bb99a87d DEBUG: pools: detect that malloc_trim() is in progress
Now when calling ha_panic() with a thread still under malloc_trim(),
we'll set a new tainted flag to easily report it, and the output
trace will report that this condition happened and will suggest
using no-memory-trimming to avoid it in the future.
2023-10-25 15:48:02 +02:00
Willy Tarreau
26a6481f00 DEBUG: lua: add tainted flags for stuck Lua contexts
William suggested that since we can detect the presence of Lua in the
stack, let's combine it with stuck detection to set a new pair of flags
indicating a stuck Lua context and a stuck Lua shared context.

Now, executing an infinite loop in a Lua sample fetch function with
yield disabled crashes with tainted=0xe40 if loaded from a lua-load
statement, or tainted=0x640 from a lua-load-per-thread statement.

In addition, at the end of the panic dump, we can check if Lua was
seen stuck and emit recommendations about lua-load-per-thread and
the choice of dependencies depending on the presence of threads
and/or shared context.
2023-10-25 15:48:02 +02:00
Willy Tarreau
46bbb3a33b DEBUG: add a tainted flag when ha_panic() is called
This will make it easier to know that the panic function was called,
for the occasional case where the dump crashes and/or the stack is
corrupted and not very exploitable. Now at least it will be sufficient
to check the tainted value to know that someone called ha_panic(), and
it will also be usable to condition extra analysis.
2023-10-25 15:48:02 +02:00
Aurelien DARRAGON
66795bd721 MINOR: connection: add conn_pr_mode_to_proto_mode() helper func
This function safely maps a proxy mode to the corresponding proto_mode.

This will allow for easier code maintenance and prevent mixups between
proxy mode and proto mode.
2023-10-25 11:59:27 +02:00
Aurelien DARRAGON
29b76cae47 BUG/MEDIUM: server/log: "mode log" after server keyword causes crash
In 9a74a6c ("MAJOR: log: introduce log backends"), a mistake was made:
it was assumed that the proxy mode was already known during server
keyword parsing in parse_server() function, but this is wrong.

Indeed, "mode log" can be declared late in the proxy section. Due to this,
a simple config like this will cause the process to crash:

   |backend test
   |
   |  server name 127.0.0.1:8080
   |  mode log

In order to fix this, we relax some checks in _srv_parse_init() and store
the address protocol from str2sa_range() in server struct, then we set-up
a postparsing function that is to be called after config parsing to
finish the server checks/initialization that depend on the proxy mode
to be known. We achieve this by checking the PR_CAP_LB capability from
the parent proxy to know if we're in such case where the effective proxy
mode is not yet known (it is assumed that other proxies which are implicit
ones don't provide this possibility and thus don't suffer from this
constraint).

Only then, if the capability is not found, we immediately perform the
server checks that depend on the proxy mode, else the check is postponed
and it will automatically be performed during postparsing thanks to the
REGISTER_POST_SERVER_CHECK() hook.

Note that we remove the SRV_PARSE_IN_LOG_BE flag because it was introduced
in the above commit and it is no longer relevant.

No backport needed unless 9a74a6c gets backported.
2023-10-25 11:59:27 +02:00
Willy Tarreau
55d2fc0c02 DEBUG: mux-h2/flags: fix list of h2c flags used by the flags decoder
The two recent commits below each added one flag to h2c but omitted to
update the __APPEND_FLAG macro used by dev/flags so they are not
properly decoded:

  3dd963b35 ("BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again")
  68d02e5fa ("BUG/MINOR: mux-h2: make up other blocked streams upon removal from list")

This can be backported along with these commits.
2023-10-25 11:44:54 +02:00
Amaury Denoyelle
f76e94d231 MINOR: backend: refactor insertion in avail conns tree
Define a new function srv_add_to_avail_list(). This function is used to
centralize connection insertion in available tree. It reuses a BUG_ON()
statement to ensure the connection is not present in the idle list.
2023-10-25 10:33:06 +02:00
Amaury Denoyelle
f70cf28539 MINOR: listener: forbid most keywords for reverse HTTP bind
Reverse HTTP bind is very specific in that it relies on a server to
initiate connection. All connection settings are defined on the server
line and ignored from the bind line.

Before this patch, most keywords were silently ignored. This could
result in a configuration doing unexpected things from the user's
point of view. To improve this situation, add a new 'rhttp_ok' field in
the bind_kw structure. If not set, the keyword is forbidden on a reverse
bind line and will cause a fatal config error.

For the moment, only the following keywords are usable with reverse bind:
'id', 'name' and 'nbconn'.

This change is safe as it's already forbidden to mix reverse and
standard addresses on the same bind line.
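
The per-keyword flag can be pictured with the following Python sketch. The
table layout and check_bind_keywords() are hypothetical and only model the
rule stated above; only the 'rhttp_ok' name and the three allowed keywords
come from this commit, the other entries are illustrative.

```python
# Illustrative keyword table: only 'id', 'name' and 'nbconn' are flagged rhttp_ok.
BIND_KEYWORDS = {
    "id": {"rhttp_ok": True},
    "name": {"rhttp_ok": True},
    "nbconn": {"rhttp_ok": True},
    "ssl": {"rhttp_ok": False},
    "accept-proxy": {"rhttp_ok": False},
}


def check_bind_keywords(used, is_rhttp):
    """Return the list of fatal errors for the keywords of one bind line."""
    errors = []
    for kw in used:
        if kw not in BIND_KEYWORDS:
            errors.append(f"unknown keyword '{kw}'")
        elif is_rhttp and not BIND_KEYWORDS[kw]["rhttp_ok"]:
            errors.append(f"'{kw}' is not supported on a reverse HTTP bind line")
    return errors
```

With this model, a reverse bind line using 'name' and 'ssl' reports one error
for 'ssl', while the same keywords on a standard bind line report none.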
2023-10-20 17:28:08 +02:00
Amaury Denoyelle
e05edf71df MINOR: cfgparse: rename "rev@" prefix to "rhttp@"
'rev@' was used to specify a bind/server used with reverse HTTP
transport. This notation was deemed not explicit enough. Rename it
'rhttp@' instead.
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
9d4c7c1151 MINOR: server: convert @reverse to rev@ standard format
Remove the recently introduced '@reverse' notation for HTTP reverse
servers. Instead, reuse the 'rev@' prefix already defined for bind
lines.
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
3222047a14 MINOR: listener: add nbconn kw for reverse connect
Previously, the maxconn keyword was reused for a specific usage on reverse
HTTP binds to specify the number of active connects to perform. To avoid
confusion, introduce a new dedicated keyword 'nbconn' which is specific
to reverse HTTP bind.

This new keyword is forbidden for non-reverse listener. A fatal error is
emitted during config parsing if this rule is not respected. It's safe
because it's also forbidden to mix standard and reverse addresses on the
same bind line.

Internally, nbconn value will be reassigned to 'maxconn' member of
bind_conf structure. This ensures that listener layer will automatically
reenable the preconnect task each time a connection is closed.
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
37d7e52cc6 MINOR: cfgparse: forbid mixing reverse and standard listeners
Reverse HTTP listeners are very specific and share only a very limited
subset of keywords with other listeners. As such, it is probably
meaningless to mix standard and reverse addresses on the same bind line.
This patch emits a fatal error during configuration parsing if this is
the case.
2023-10-20 14:44:37 +02:00
Christopher Faulet
60e7116be0 BUG/MEDIUM: peers: Fix synchro for huge number of tables
The number of updates sent at once was limited to not loop too long to emit
updates when the buffer size is huge or when the number of sync tables is
huge. The limit can be configured and is set to 200 by default. However,
this fix introduced a bug. It is impossible to synchronize two peers if the
number of tables is higher than this limit. Thus by default, it is not
possible to sync two peers if there are more than 200 tables to sync.

Technically speaking, a teaching process is finished if we loop on all tables
with no new update messages sent. Because we are limited at each call, the loop
is split over several calls. However the restart point for the next loop is
always the last table for which we emitted an update message. Thus with more
tables than the limit, the loop never reaches the end point.

Worse, in conjunction with the bug fixed by "BUG/MEDIUM: peers: Be sure to
always refresh recconnect timer in sync task", it is possible to trigger the
watchdog because the applets may be woken up in a loop and leave requesting
more room while their buffer is empty.

To fix the issue, restart conditions for a teaching loop were changed. If
the teach process is interrupted, we now save the restart point, called
stop_local_table. It is the last evaluated table on the previous loop. This
restart point is reset when the teach process is finished.

In addition, the updates_sent variable in peer_send_msgs() was renamed to
updates to avoid ambiguities. Indeed, the variable is incremented, whether
messages were sent or not.
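
The restart-point mechanism can be sketched as below. This Python model is a
simplification and not the peers code: TeachState and teach_step() are
hypothetical names, tables are plain indexes, and the per-call limit counts
evaluated tables rather than update messages. Only the stop_local_table name
comes from this commit.

```python
class TeachState:
    """Hypothetical per-peer state for one teaching pass."""

    def __init__(self, nb_tables):
        self.nb_tables = nb_tables
        self.stop_local_table = None   # last table evaluated by the previous call


def teach_step(state, limit):
    """Evaluate at most `limit` tables; return (evaluated tables, finished)."""
    start = 0 if state.stop_local_table is None else state.stop_local_table + 1
    end = min(start + limit, state.nb_tables)
    evaluated = list(range(start, end))
    if end == state.nb_tables:
        state.stop_local_table = None   # pass complete: reset the restart point
        return evaluated, True
    state.stop_local_table = end - 1    # interrupted: remember where we stopped
    return evaluated, False
```

Since each call resumes right after the last evaluated table, a pass over more
tables than the limit always makes progress and eventually reaches the end.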

This patch must be backported as far as 2.6.
2023-10-20 14:32:12 +02:00
Willy Tarreau
3dd963b35f BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again
Stefan Behte reported that since commit f279a2f14 ("BUG/MINOR: mux-h2:
refresh the idle_timer when the mux is empty"), the http-request and
http-keep-alive timeouts don't work anymore on H2. Before this patch,
and since 3e448b9b64 ("BUG/MEDIUM: mux-h2: make sure control frames do
not refresh the idle timeout"), they would only be refreshed after stream
frames were sent (HEADERS or DATA) but the patch above that adds more
refresh points broke these so they don't expire anymore as long as
there's some activity.

We cannot just revert the fix since it also addressed an issue by which
sometimes the timeout would trigger too early and provoke truncated
responses. The right approach here is in fact to only refresh the
idle timer when the mux buffer was flushed from any such stream frames.

In order to achieve this, we're now setting a flag on the connection
whenever we write a stream frame, and we consider that flag when deciding
to refresh the timer after the buffer is emptied. This way we'll only clear that
flag once the buffer is empty and there were stream data in it, not if
there were no such stream data. In theory it remains possible to leave
the flag on if some control data is appended after the buffer and it's
never cleared, but in practice it's not a problem as a buffer will always
get sent in large blocks when the window opens. Even a large buffer should
be emptied once in a while as control frames will not fill it as much as
data frames could.
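
The flag logic can be summarised with the following Python sketch. It is a toy
model, not the mux-h2 code: MuxConn, stream_frame_queued and idle_refreshes
are hypothetical names, and the whole buffer is assumed to be flushed at once.

```python
STREAM_FRAMES = {"HEADERS", "DATA"}


class MuxConn:
    """Toy model of an H2 connection's output buffer and idle timer."""

    def __init__(self):
        self.mbuf = []                     # frames waiting to be sent
        self.stream_frame_queued = False   # hypothetical name for the new flag
        self.idle_refreshes = 0            # number of idle timer refreshes

    def write_frame(self, kind):
        self.mbuf.append(kind)
        if kind in STREAM_FRAMES:
            self.stream_frame_queued = True    # a stream frame is pending

    def flush(self):
        """Empty the buffer; refresh the timer only if stream frames were sent."""
        self.mbuf.clear()
        if self.stream_frame_queued:
            self.idle_refreshes += 1
            self.stream_frame_queued = False
```

In this model, flushing a buffer that only held control frames such as PING or
SETTINGS leaves the idle timer untouched, so the timeouts can still expire.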

Given the patch above was backported as far as 2.6, this patch should
also be backported as far as 2.6.
2023-10-18 17:17:58 +02:00
Willy Tarreau
91ed52976c MINOR: dgram: allow to set rcv/sndbuf for dgram sockets as well
tune.rcvbuf.client and tune.rcvbuf.server are not suitable for shared
dgram sockets because they're per connection so their units are not the
same. However, QUIC's listener and log servers are not connected and
take per-thread or per-process traffic where a socket log buffer might
be too small, causing undesirable packet losses and retransmits in the
case of QUIC. This essentially manifests in listener mode with new
connections taking a lot of time to set up under heavy traffic due to
the small queues causing delays. Let's add a few new settings allowing
to set these shared socket sizes on the frontend and backend side (as a
reminder that these are per-front/back and not per client/server, hence
not per connection).
2023-10-18 17:01:19 +02:00
Christopher Faulet
203211f4cb REORG: stconn/muxes: Rename init step in fast-forwarding
Instead of speaking of an initialisation stage for each data
fast-forwarding, we now use the term "negotiate". Thus init_ff/init_fastfwd
functions were renamed nego_ff/nego_fastfwd.
2023-10-18 12:46:55 +02:00
Christopher Faulet
023564b685 MINOR: global: Add an option to disable the zero-copy forwarding
The zero-copy forwarding or the mux-to-mux forwarding is a way to
fast-forward data without using the channels buffers. Data are transferred
from a mux to the other one. The kernel splicing is an optimization of the
zero-copy forwarding. But it can also use normal buffers (but not channels
ones). This way, it could be possible to fast-forward data with muxes not
supporting the kernel splicing (H2 and H3 muxes) but also with applets.

However, this mode can introduce regressions or bugs in the future (just like
the kernel splicing). Thus, it could be useful to disable this optimization. To
do so, in configuration, the global tune setting
'tune.disable-zero-copy-forwarding' may be set in a global section or the
'-dZ' command line parameter may be used to start HAProxy. Of course, this
also disables the kernel splicing.
2023-10-17 18:51:13 +02:00
Christopher Faulet
322d660d08 MINOR: tree-wide: Only rely on co_data() to check channel emptyness
Because the channel_is_empty() function now only checks the channel's
buffer, we can remove it and rely on co_data() instead. Of course, all tests
must be inverted.

channel_is_empty() is thus removed.
2023-10-17 18:51:13 +02:00
Christopher Faulet
20c463955d MEDIUM: channel: don't look at iobuf to report an empty channel
It is important to split channels and I/O buffers. When data are pushed in
an I/O buffer, we consider them as forwarded. The channel never sees
them. Fast-forwarded data are now handled in the SE only.
2023-10-17 18:51:13 +02:00
Christopher Faulet
2d80eb5b7a MEDIUM: mux-h1: Add fast-forwarding support
The H1 multiplexer now implements callback functions to produce and consume
fast-forwarded data.
2023-10-17 18:51:13 +02:00
Christopher Faulet
91f1c5519a MEDIUM: raw-sock: Specifiy amount of data to send via snd_pipe callback
When data were sent using the kernel splicing, we tried to send all data
with no restriction. Most of the time it is valid. However, because the payload
representation may differ between the producer and the consumer, it is
important to be able to specify how much data to send via the splicing.

Of course, for performance reasons, it is important to maximize the amount of
data sent via splicing at each call. However, in edge cases, this can now be
limited.
2023-10-17 18:51:13 +02:00
Christopher Faulet
7ffb7624fe MINOR: connection: Remove mux callbacks about splicing
The kernel splicing support was totally removed pending the mux-to-mux
fast-forward implementation. So the corresponding mux callbacks can be removed
now.
2023-10-17 18:51:13 +02:00
Christopher Faulet
8b89fe3d8f MINOR: stconn: Temporarily remove kernel splicing support
mux-to-mux fast-forwarding will be added. To avoid mixing it with the splicing
and to simplify the commits, the kernel splicing support is removed from the
stconn. CF_KERN_SPLICING flag is removed and the support is no longer tested
in process_stream().

In the stconn part, rcv_pipe() callback function is no longer called.

Reg-tests scripts testing the kernel splicing are temporarily marked as
broken.
2023-10-17 18:51:13 +02:00
Christopher Faulet
242c6f0ded MINOR: connection: Add new mux callbacks to perform data fast-forwarding
To perform the mux-to-mux data fast-forwarding, 4 new callbacks were added
into the mux_ops structure. 2 callbacks will be used from the stconn to
fast-forward data. The 2 other callbacks will be used by the endpoint to
request an iobuf from the opposite endpoint.

 * fastfwd() callback function is used by a producer to forward data

 * resume_fastfwd() callback function is used by a consumer if some data are
   blocked in the iobuf, to resume the data forwarding.

 * init_fastfwd() must be used by an endpoint (the producer one), inside the
   fastfwd() callback to request an iobuf to the opposite side (the consumer
   one).

 * done_fastfwd() must be used by an endpoint (the producer one) at the end
   of fastfwd() to notify the opposite endpoint (the consumer one) if data
   were forwarded or not.

This API is still under development, so it may evolve, especially when the
fast-forward is extended to applets.

2 helper functions were also added into the SE api to wrap the init_fastfwd()
and done_fastfwd() callback functions of the underlying endpoint.

For now, this API is unused and not implemented at all in muxes.
2023-10-17 18:51:13 +02:00
Christopher Faulet
1d68bebb70 MINOR: stconn: Extend iobuf to handle a buffer in addition to a pipe
It is unused for now, but the iobuf structure now owns a pointer to a
buffer. This buffer will be used to perform mux-to-mux fast-forwarding when
splicing is not supported or unusable. This pointer should be filled by an
endpoint to let the opposite one forward data.

Extra fields, in addition to the buffer, are mandatory because the buffer
may already contain some data. The ".offset" field may be used as the
position at which to start copying data. Finally, the amount of data copied
into this buffer must be saved in the ".data" field.

Some flags are also added to prepare the next changes. And helper stconn
functions are updated to also count data in the buffer. For a first
implementation, it is not planned to handle data in the buffer and in the
pipe at the same time. But it will be possible to do so.
2023-10-17 18:51:13 +02:00
Christopher Faulet
e52519ac83 MINOR: stconn: Start to introduce mux-to-mux fast-forwarding notion
Instead of talking about kernel splicing at stconn/sedesc level, we now try
to talk about mux-to-mux fast-forwarding. To do so, 2 functions were added
to know if there are fast-forwarded data and to retrieve this amount of
data. Of course, for now, there is only data in a pipe.

In addition, some flags were renamed to reflect this notion. Note the
channel's documentation was not updated yet.
2023-10-17 18:51:13 +02:00
Christopher Faulet
8bee0dcd7d MEDIUM: stconn/channel: Move pipes used for the splicing in the SE descriptors
The pipes used to put data when the kernel splicing is in use are moved into
the SE descriptors. For now, it is just a simple replacement but there is a
major difference with the pipes in the channel. The data are now pushed into
the consumer's pipe while they were previously pushed into the producer's
pipe. So it means the request data are now pushed into the pipe of the backend
SE descriptor and response data are pushed into the pipe of the frontend SE
descriptor.

The idea is to hide the pipe from the channel/SC side and to be able to
handle fast-forwarding in a pipe but also in a buffer. To do so, the pipe is
inside a new entity, called iobuf. This entity will be extended.
2023-10-17 18:51:13 +02:00
Willy Tarreau
68d02e5fa9 BUG/MINOR: mux-h2: wake up other blocked streams upon removal from list
An interesting issue was met when testing the mux-to-mux forwarding code.

In order to preserve fairness, in h2_snd_buf() if other streams are waiting
in send_list or fctl_list, the stream that is attempting to send also goes
to its list, and will be woken up by h2_process_mux() or h2_send() when
some space is released. But on rare occasions, there are only a few (or
even a single) streams waiting in this list, and these streams are just
quickly removed because of a timeout or a quick h2_detach() that calls
h2s_destroy(). In this case there's no event to wake up the other waiting
stream in its list, and this will possibly resume processing after some
client WINDOW_UPDATE frames or even new streams, so usually it doesn't
last too long and is not very noticeable, which is why it was left that
long. In addition, measurements have shown that in heavy network-bound
benchmarks, this exact situation happens on less than 1% of the streams
(reached 4% with mux-mux).

The fix here consists in replacing these LIST_DEL_INIT() calls on
h2s->list with a function call that checks if other streams were queued
to the send_list recently, and if so, which also tries to resume them
by calling h2_resume_each_sending_h2s(). The detection of late additions
is made via a new flag on the connection, H2_CF_WAIT_INLIST, which is set
when a stream is queued due to other streams being present, and which is
cleared when this function is called.

It is particularly difficult to reproduce this case which is particularly
timing-dependent, but in a constrained environment, a test involving 32
conns of 20 streams each, all downloading a 10 MB object previously
showed a limitation of 17 Gbps with lots of idle CPU time, and now
filled the cable at 25 Gbps.

This should be backported to all versions where it applies.
2023-10-17 16:43:44 +02:00
Aurelien DARRAGON
94d0f77deb MINOR: server: introduce "log-bufsize" kw
"log-bufsize" may now be used for a log server (in a log backend) to
configure the bufsize of implicit ring associated to the server (which
defaults to BUFSIZE).
2023-10-13 10:05:07 +02:00
Aurelien DARRAGON
b30bd7adba MEDIUM: log/balance: support for the "hash" lb algorithm
hash lb algorithm can be configured with the "log-balance hash <cnv_list>"
directive. With this algorithm, the user specifies a converter list with
<cnv_list>.

The produced log message will be passed as-is to the provided converter
list, and the resulting hash will be used to select the log server that
will receive the log message.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
7251344748 MINOR: sample: add sample_process_cnv() function
split sample_process() in 2 parts in order to be able to only process
the converter part of a sample expression from an existing input sample
struct passed as parameter.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
a7563158f7 MINOR: lbprm: support for the "none" hash-type function
Allow the use of the "none" hash-type function so that the key resulting
from the sample expression is directly used as the hash.

This can be useful to do the hashing manually using available hashing
converters, or even custom ones, and then inform haproxy that it can
directly rely on the sample expression result which is explicitly handled
as an integer in this case.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
9a74a6cb17 MAJOR: log: introduce log backends
Using "mode log" in a backend section turns the proxy into a log backend
which can be used to log-balance logs between multiple log targets
(udp or tcp servers).

log backends can be used as regular log targets using the log directive
with "backend@be_name" prefix, like so:

  | log backend@mybackend local0

A log backend will distribute log messages to servers according to the
log load-balancing algorithm that can be set using the "log-balance"
option from the log backend section. For now, only the roundrobin
algorithm is supported and set by default.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
e58a9b4baf MINOR: sink: add sink_new_from_srv() function
This helper function can be used to create a new sink from an existing
server struct (and thus existing proxy as well), in order to spare some
resources when possible.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
5c0d1c1a74 MEDIUM: sink: inherit from caller fmt in ring_write() when rings didn't set one
implicit rings were automatically forced to the parent logger format, but
this was done upon ring creation.

This is quite restrictive because we might want to choose the desired
format right before generating the log header (ie: when producing the
log message), depending on the logger (log directive) that is
responsible for the log message, and with current logic this is not
possible. (To this day, we still have dedicated implicit ring per log
directive, but this might change)

In ring_write(), we check if the sink->fmt is specified:
 - defined: we use it since it is the most precise format
   (ie: for named rings)
 - undefined: then we fallback to the format from the logger

With this change, implicit rings' format is now set to UNSPEC upon
creation. This is safe because the log header building function
automatically enforces the "raw" format when UNSPEC is set. And since
logger->format also defaults to "raw", no change of default behavior
should be expected.
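
The selection described above can be sketched as follows. This is only an
illustration under stated assumptions: the enum and function names are
hypothetical stand-ins, not the actual haproxy definitions.

```c
/* Hypothetical stand-ins for the log formats discussed above. */
enum fmt_sketch { FMT_UNSPEC = 0, FMT_RAW, FMT_RFC5424 };

/* Pick the format used to build the log header:
 *  - the sink's own format when it defines one (e.g. named rings);
 *  - otherwise the format of the logger that emitted the message.
 * UNSPEC is finally treated as "raw", as the header builder would do.
 */
static enum fmt_sketch effective_fmt(enum fmt_sketch sink_fmt,
                                     enum fmt_sketch logger_fmt)
{
	enum fmt_sketch fmt = (sink_fmt != FMT_UNSPEC) ? sink_fmt : logger_fmt;

	return (fmt == FMT_UNSPEC) ? FMT_RAW : fmt;
}
```

With an implicit ring left at UNSPEC, the logger's format wins; when both
are unset, the result is "raw", which matches the previous default.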
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
6dad0549a5 MEDIUM: log/sink: simplify log header handling
Introduce log_header struct to easily pass log header data between
functions and use that to simplify the logic around log header
handling.

While at it, some outdated comments were updated as well.

No change in behavior should be expected.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
a9b185f34e MEDIUM: log: introduce log target
log targets were immediately embedded in logger struct (previously
named logsrv) and could not be used outside of this context.

In this patch, we're introducing log_target type with the associated
helper functions so that it becomes possible to declare and use log
targets outside of loggers scope.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
18da35c123 MEDIUM: tree-wide: logsrv struct becomes logger
When 'log' directive was implemented, the internal representation was
named 'struct logsrv', because the 'log' directive would directly point
to the log target, which used to be a (UDP) log server exclusively at
that time, hence the name.

But things have become more complex, since today 'log' directive can point
to ring targets (implicit, or named) for example.

Indeed, a 'log' directive no longer references the "final" server to
which the log will be sent, but instead it describes which log API and
parameters to use for transporting the log messages to the proper log
destination.

So now the term 'logsrv' is rather confusing and prevents us from
introducing a new level of abstraction because they would be mixed
with logsrv.

So in order to better designate this 'log' directive, and make it more
generic, we chose the word 'logger' which now replaces logsrv everywhere
it was used in the code (including related comments).

This is internal rewording, so no functional change should be expected
on user-side.
2023-10-13 10:05:06 +02:00
Amaury Denoyelle
7d76ffb2a4 BUG/MINOR: quic: fix qc.cids access on quic-conn fail alloc
CIDs tree is now allocated dynamically since the following commit :
  276697438d
  MINOR: quic: Use a pool for the connection ID tree.

This can cause a crash if qc_new_conn() is interrupted due to an
intermediary failed allocation. When freeing all connection members,
free_quic_conn_cids() is used. However, this function does not support a
NULL cids.

To fix this, simply check whether cids is NULL in the free_quic_conn_cids()
prologue.

This bug was reproduced using -dMfail.

No need to backport.
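
A minimal sketch of this kind of guard, assuming a simplified connection
whose CID storage is allocated separately. The struct and function names
below are hypothetical and do not reproduce the real quic-conn layout:

```c
#include <stdlib.h>

/* Hypothetical, simplified connection: 'cids' is allocated separately and
 * may still be NULL if an earlier allocation failed during setup.
 */
struct qc_sketch {
	int *cids;
};

/* Release the CID storage; safe to call on a partially built connection.
 * Returns 1 if something was released, 0 if there was nothing to free.
 */
static int free_cids_sketch(struct qc_sketch *qc)
{
	if (!qc->cids)
		return 0;   /* nothing was allocated: do not touch it */

	free(qc->cids);
	qc->cids = NULL;
	return 1;
}
```

The early return is what lets a common cleanup path be reused when the
constructor is interrupted halfway.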
2023-10-13 08:52:16 +02:00
Willy Tarreau
5798b5bb14 BUG/MAJOR: connection: make sure to always remove a connection from the tree
Since commit 5afcb686b ("MAJOR: connection: purge idle conn by last usage")
in 2.9-dev4, the test on conn->toremove_list added to conn_get_idle_flag()
in 2.8 by commit 3a7b539b1 ("BUG/MEDIUM: connection: Preserve flags when a
conn is removed from an idle list") becomes misleading. Indeed, now both
toremove_list and idle_list are shared by a union since the presence in
these lists is mutually exclusive. However, in conn_get_idle_flag() we
check for the presence in the toremove_list to decide whether or not to
delete the connection from the tree. This test now fails because instead
it sees the presence in the idle or safe list via the union, and concludes
the element must not be removed. Thus the element remains in the tree and
can be found later after the connection is released, causing crashes that
Tristan reported in issue #2292.

The following config is sufficient to reproduce it with 2 threads:

   defaults
        mode http
        timeout client 5s
        timeout server 5s
        timeout connect 1s

   listen front
        bind :8001
        server next 127.0.0.1:8002

   frontend next
        bind :8002
        timeout http-keep-alive 1
        http-request redirect location /

Sending traffic with a few concurrent connections and some short timeouts
suffices to instantly crash it after ~10k reqs:

   $ h2load -t 4 -c 16 -n 10000 -m 1 -w 1 http://0:8001/

With Amaury we analyzed the conditions in which the function is called
in order to figure a better condition for the test and concluded that
->toremove_list is never filled there so we can safely remove that part
from the test and just move the flag retrieval back to what it was prior
to the 2.8 patch above. Note that the patch is not reverted though, as
the parts that would drop the unexpected flags removal are unchanged.

This patch must NOT be backported. The code in 2.8 works correctly, it's
only the change in 2.9 that makes it misbehave.
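
The effect of the union can be illustrated with a small sketch. The types
and helpers below are hypothetical simplifications written for this
illustration, not haproxy's own list macros:

```c
/* Minimal doubly-linked list element. */
struct list_sketch {
	struct list_sketch *n, *p;
};

/* Hypothetical connection: both list elements share the same storage. */
struct conn_u_sketch {
	union {
		struct list_sketch idle_list;
		struct list_sketch toremove_list;
	} u;
};

/* An element is considered "in a list" when it does not point to itself. */
static int in_list(const struct list_sketch *el)
{
	return el->n != el;
}

static void list_init(struct list_sketch *el)
{
	el->n = el->p = el;
}

/* Append <el> at the end of the list headed by <head>. */
static void list_append(struct list_sketch *head, struct list_sketch *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}
```

Once the element is appended through u.idle_list, in_list() on
u.toremove_list also reports true since both names designate the same
memory, which is why a test on one member cannot tell the two lists apart.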
2023-10-12 14:20:03 +02:00
Amaury Denoyelle
f59f8326f9 REORG: quic: cleanup traces definition
Move all QUIC trace definitions from quic_conn.h to quic_trace-t.h. Also
gather the multiple definitions of the trace_quic macro into
quic_trace.h. This forces all QUIC source files which rely on traces to
include it while reducing the size of quic_conn.h.
2023-10-11 14:15:31 +02:00
Frédéric Lécaille
bd83b6effb BUG/MINOR: quic: Avoid crashing with unsupported cryptographic algos
This bug was detected when compiling haproxy against the aws-lc TLS stack
during QUIC interop runner tests. Some algorithms could be negotiated by haproxy
through the TLS stack but were not fully supported by the haproxy QUIC
implementation. This led tls_aead() to return NULL (same thing for tls_md(),
tls_hp()). As the values returned by these functions were never checked, they
could trigger segfaults.

To fix this, one closes the connection as soon as possible with a
handshake_failure(40) TLS alert. Note that as the TLS stack successfully
negotiates an algorithm, it provides haproxy with CRYPTO data before entering
->set_encryption_secrets() callback. This is why this callback
(ha_set_encryption_secrets() on haproxy side) is modified to release all
the CRYPTO frames before triggering a CONNECTION_CLOSE with a TLS alert. This is
done calling qc_release_pktns_frms() for all the packet number spaces.
Modify some quic_tls_keys_hexdump to avoid crashes when the ->aead or ->hp EVP_CIPHER
are NULL.
Modify qc_release_pktns_frms() to do nothing if the packet number space passed
as parameter is not initialized.

This bug does not impact the QUIC TLS compatibility mode (USE_QUIC_OPENSSL_COMPAT).

Thank you to @ilia-shipitsin for having reported this issue in GH #2309.

Must be backported as far as 2.6.
2023-10-11 11:52:22 +02:00
William Lallemand
deed2b6077 BUILD: ssl: enable keylog for WolfSSL
Enable the keylog feature when linking against a WolfSSL library which
has the 'HAVE_SECRET_CALLBACK' define.

Only supports <= TLSv1.2 secret dump.
2023-10-09 21:34:25 +02:00
William Lallemand
9a4c53d96c CLEANUP: ssl: remove compat functions for openssl < 1.0.0
The openssl-compat.h file has some functions which were implemented in
order to provide compatibility with openssl < 1.0.0. Most of them were
to support the 0.9.8 version, but we don't support this version anymore.

This patch removes the deprecated code from openssl-compat.h
2023-10-09 17:27:53 +02:00
William Lallemand
1918bcbc12 BUILD: ssl: enable keylog for awslc
AWSLC supports SSL_CTX_set_keylog_callback(), this patch enables the
build with the keylog feature for this library.
2023-10-09 16:17:30 +02:00
William Lallemand
4428ac4f70 BUILD: ssl: add 'secure_memcmp' converter for WolfSSL and awslc
CRYPTO_memcmp is supported by both awslc and wolfssl, let's add the
support for the 'secure_memcmp' converter into the build.
2023-10-09 15:44:50 +02:00
William Lallemand
bf426eecd7 BUILD: ssl: add 'ssl_c_r_dn' fetch for WolfSSL
WolfSSL supports SSL_get0_verified_chain() so we can activate this
feature.
2023-10-09 15:09:47 +02:00
William Lallemand
d75bc06bdc BUILD: ssl: enable 'ciphersuites' for WolfSSL
WolfSSL supports setting the 'ciphersuites', let's enable the keyword for
it.
2023-10-09 14:56:43 +02:00
Willy Tarreau
1e3422e6b0 BUG/MEDIUM: actions: always apply a longest match on prefix lookup
Many actions take arguments after a parenthesis. When this happens, they
have to be tagged in the parser with KWF_MATCH_PREFIX so that a sub-word
is sufficient (since by default the whole block including the parenthesis
is taken).

The problem with this is that the parser stops on the first match. This
was OK years ago when there were very few actions, but over time new ones
were added and many actions are the prefix of another one (e.g. "set-var"
is the prefix of "set-var-fmt"). And what happens in this case is that the
first word is picked. Most often that doesn't cause trouble because such
similar-looking actions involve the same custom parser so actually the
wrong selection of the first entry results in the correct parser to be
used anyway and the error to be silently hidden.

But it's getting worse when accidentally declaring prefixes in multiple
files, because in this case it will solely depend on the object file link
order: if the longest name appears first, it will be properly detected,
but if it appears last, its other prefix will be detected and might very
well not be related at all and use a distinct parser. And this is random
enough to make some actions succeed or fail depending on the build options
that affect the linkage order. Worse: what if a keyword is the prefix of
another one, with a different parser but a compatible syntax ? It could
seem to work by accident but not do the expected operations.

The correct solution is to always look for the longest matching name.
This way the correct keyword will always be matched and used and there
will be no risk of randomly picking the wrong one anymore.

This fix must be backported to the relevant stable releases.
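
The difference between both lookups can be sketched as below. This is an
illustration only: the function names and the flat keyword array are
assumptions made for the example, not the parser's real data structures.

```c
#include <stddef.h>
#include <string.h>

/* Return the first keyword that is a prefix of <word> (the old behaviour),
 * or NULL if none matches. The result depends on the registration order.
 */
static const char *match_first(const char *const *kws, size_t nb,
                               const char *word)
{
	size_t i;

	for (i = 0; i < nb; i++)
		if (strncmp(word, kws[i], strlen(kws[i])) == 0)
			return kws[i];
	return NULL;
}

/* Return the longest keyword that is a prefix of <word>, regardless of the
 * order in which keywords were registered, or NULL if none matches.
 */
static const char *match_longest(const char *const *kws, size_t nb,
                                 const char *word)
{
	const char *best = NULL;
	size_t best_len = 0;
	size_t i;

	for (i = 0; i < nb; i++) {
		size_t len = strlen(kws[i]);

		if (len > best_len && strncmp(word, kws[i], len) == 0) {
			best = kws[i];
			best_len = len;
		}
	}
	return best;
}
```

For the word "set-var-fmt(txn.x)", match_first() returns "set-var" or
"set-var-fmt" depending on which one comes first in the array, while
match_longest() returns "set-var-fmt" in both cases.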
2023-10-06 17:06:44 +02:00
Christopher Faulet
a633338b55 BUG/MEDIUM: stconn: Fix comparison sign in sc_need_room()
sc_need_room() function may be called with a negative value. In this case,
the intent is to be notified if any space was made in the channel buffer. In
the function, we get the min between the requested room and the maximum
possible room in the buffer, considering it may be an HTX buffer.

However this max value is unsigned and leads to an unsigned comparison,
casting the negative value to an unsigned value. Of course, in this case,
this always leads to the wrong result. This bug seems to have no effect but
it is hard to be sure.

To fix the issue, we take care to respect the requested room sign by casting
the max value to a signed integer.

This patch must be backported to 2.8.
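
The pitfall can be reproduced in isolation with the sketch below. The
function names are hypothetical and the code is not the actual
sc_need_room() implementation; it only shows the comparison issue.

```c
/* Buggy variant: the comparison is done on unsigned values. The explicit
 * cast below only spells out what the usual arithmetic conversions do
 * implicitly when an int is compared with an unsigned int.
 */
static int min_room_unsigned(int room, unsigned int max)
{
	return ((unsigned int)room < max) ? room : (int)max;
}

/* Fixed variant: the maximum is cast to a signed integer so that the sign
 * of the requested room is respected.
 */
static int min_room_signed(int room, unsigned int max)
{
	return (room < (int)max) ? room : (int)max;
}
```

With room = -1 and max = 16384, the unsigned variant returns 16384 because
-1 is converted to a huge unsigned value, while the signed variant returns
-1 as intended. Both variants agree for non-negative requests.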
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
205d480d9f MINOR: sink: refine forward_px usage
now forward_px only serves as a hint to know if a proxy was created
specifically for the sink, in which case the sink is responsible for it.

Everywhere forward_px was used in appctx context: get the parent proxy from
the sft->srv instead.

This finally makes it possible to get rid of the double link dependency
between sink and proxy.
2023-10-06 15:34:31 +02:00
Willy Tarreau
90fa2eaa15 MINOR: haproxy: permit to register features during boot
The regtests are using the "feature()" predicate but this one can only
rely on build-time options. It would be nice if some runtime-specific
options could be detected at boot time so that regtests could more
flexibly adapt to what is supported (capabilities, splicing, etc).

Similarly, certain features that are currently enabled with USE_XXX
could also be automatically detected at build time using ifdefs and
would simplify the configuration, but then we'd lose the feature
report in the feature list which is convenient for regtests.

This patch makes sure that haproxy -vv shows the variable's contents
and not the macro's contents, and adds a new hap_register_feature()
to allow the code to register a new keyword.
2023-10-06 11:40:02 +02:00
Remi Tricot-Le Breton
a5e96425a2 MEDIUM: cache: Add "Origin" header to secondary cache key
This patch adds a hash of the Origin header to the cache's secondary key.
This makes it possible to store responses that have a "Vary: Origin" header
in the cache when vary is enabled.
This cannot be considered as a means to manage CORS requests though, it
only processes the Origin header and hashes the presented value without
any form of URI normalization.

This need was expressed by Philipp Hossner in GitHub issue #251.

Co-Authored-by: Philipp Hossner <philipp.hossner@posteo.de>
2023-10-05 10:53:54 +02:00
William Lallemand
45174e4fdc BUILD: quic: allow USE_QUIC to work with AWSLC
This patch fixes the build with AWSLC and USE_QUIC=1, this is only meant
to be able to build for now and it's not feature complete.

The set_encryption_secrets callback has been split in set_read_secret
and set_write_secret.

Missing features:

- 0RTT was disabled.
- TLS1_3_CK_CHACHA20_POLY1305_SHA256, TLS1_3_CK_AES_128_CCM_SHA256 were disabled
- clienthello callback is missing, certificate selection could be
  limited (RSA + ECDSA at the same time)
2023-10-04 16:55:19 +02:00
Christopher Faulet
f32e28eddc MINOR: mux-h1: Add flags if outgoing msg contains a header about its payload
If a "Content-length" or "Transfer-Encoding: chunked" header is found or
inserted in an outgoing message, a specific flag is now set on the H1
stream. H1S_F_HAVE_CLEN is set for the "Content-length" header and
H1S_F_HAVE_CHNK for "Transfer-Encoding: chunked".

This will be useful to properly format outgoing messages, even if one of
these headers was removed by hand (with no update of the message meta-data).
2023-10-04 15:34:18 +02:00
Amaury Denoyelle
bd001ff346 MINOR: backend: refactor specific source address allocation
Refactor alloc_bind_address() function which is used to allocate a
sockaddr if a connection to a target server relies on a specific source
address setting.

The main objective of this change is to be able to use this function
outside of backend module, namely for preconnections using a reverse
server. As such, this function is now exported globally.

For reverse connect, there is no stream instance. As such, the function
parts which relied on it were reduced to the minimum. Now, the stream is
only used if a non-static address is configured, which is useful for
usesrc client|clientip|hdr_ip. These options make no sense for reverse
connect so it should be safe to use the same function.
2023-10-03 17:49:12 +02:00
Amaury Denoyelle
2ac5d9a657 MINOR: quic: handle perm error on bind during runtime
Improve handling of EACCES permission errors encountered when using a QUIC
connection socket at runtime :

* The first occurrence of the error on the process will generate a log
  warning. This should prevent users from using a privileged port
  without mandatory access rights.

* Socket mode will automatically fall back to listener socket for the
  receiver instance. This requires duplicating the settings from the
  bind_conf to the receiver instance to support configurations with
  multiple addresses on the same bind line.
2023-10-03 16:52:02 +02:00
Amaury Denoyelle
3ef6df7387 MINOR: quic: define quic-socket bind setting
Define a new bind option quic-socket :
  quic-socket [ connection | listener ]

This new setting works in conjunction with the existing global
configuration setting tune.quic.socket-owner and reuses the same semantics.

The purpose of this setting is to allow disabling connection socket
usage on listener instances individually. This will notably be useful
to deactivate it when a fatal permission error is encountered on bind()
at runtime.
2023-10-03 16:49:26 +02:00
Willy Tarreau
7c69c9b51f BUG/MAJOR: plock: fix major bug in pl_take_w() introduced with EBO
When EBO was brought to pl_take_w() by plock commit 60d750d ("plock: use
EBO when waiting for readers to leave in take_w() and stow()"), a mistake
was made: the mask against which the current value of the lock is tested
excludes the first reader like in stow(), but it must not because it was
just obtained via an ldadd() which means that it doesn't count itself.

The problem this causes is that if there is exactly one reader when a
writer grabs the lock, the writer will not wait for it to leave before
starting its operations.

The solution consists in checking for any reader in the IF. However the
mask passed to pl_wait_unlock_*() must still exclude the lowest bit as
it's verified after a subsequent load.

Kudos to Remi Tricot-Le Breton for reporting and bisecting this issue
with a reproducer.

No backport is needed since this was brought in 2.9-dev3 with commit
8178a5211 ("MAJOR: threads/plock: update the embedded library again").
The code is now on par with plock commit ada70fe.
2023-10-03 08:28:12 +02:00
Amaury Denoyelle
337c71423f MINOR: connection: define mux flag for reverse support
Add a new MUX flag MX_FL_REVERSABLE. This value is used to indicate that
MUX instance supports connection reversal. For the moment, only HTTP/2
multiplexer is flagged with it.

This makes it possible to dynamically check if reversal can be completed
during MUX installation. This will allow relaxing the requirement on config
writing for 'tcp-request session attach-srv' which currently cannot be mixed
with non-HTTP/2 listener instances, even if used conditionally with an
ACL.
2023-09-29 18:09:08 +02:00
Amaury Denoyelle
ac1164de7c MINOR: connection: define error for reverse connect
Define a new error code for connection CO_ER_REVERSE. This will be used
to report an issue which happens on a connection targeted for reversal
before the reverse process is completed.
2023-09-29 18:08:26 +02:00
Emeric Brun
3c250cb847 Revert "BUG/MEDIUM: quic: missing check of dcid for init pkt including a token"
This reverts commit 072e774939.

Doing h2load with h3 tests we notice this behavior:

Client ---- INIT no token SCID = a , DCID = A ---> Server (1)
Client <--- RETRY+TOKEN DCID = a, SCID = B    ---- Server (2)
Client ---- INIT+TOKEN SCID = a , DCID = B    ---> Server (3)
Client <--- INIT DCID = a, SCID = C           ---- Server (4)
Client ---- INIT+TOKEN SCID = a, DCID = C     ---> Server (5)

With (5) dropped by haproxy due to token validation.

Indeed the previous patch adds the SCID of the retry packet sent to the
aad used for the token ciphering. It was useful to validate that the next
INIT packets including the token are sent by the client using the newly
provided SCID as DCID, as mentioned in RFC 9000.
But this stateless information is lost on INIT packets received after
the first outgoing INIT packet from the server, because the client is
also supposed to re-use a second time the latest received SCID as its
new DCID. This will break the token validation on those last packets
and they will be dropped by haproxy.

It was discussed there:
https://mailarchive.ietf.org/arch/msg/quic/7kXVvzhNCpgPk6FwtyPuIC6tRk0/

To summarize: it is not the role of the server to verify the re-use of
the retry's SCID as DCID in further client INIT packets.

The previous patch must be reverted in all versions where it was
backported (supposedly as far as 2.6).
2023-09-29 09:27:22 +02:00
Willy Tarreau
d956db6638 CLEANUP: stream: remove the now unused stream_dump() function
It was superseded by strm_dump_to_buffer() which provides much more
complete information and supports anonymizing.
2023-09-29 09:20:27 +02:00
Willy Tarreau
c185bc4656 MEDIUM: stream: now provide full stream dumps in case of loops
When a stream is caught looping, we produce some output to help figure out
its internal state and explain why it's looping. The problem is that this
debug output is quite old and the info it provides is quite insufficient
to debug a modern process, and since such bugs happen only once or twice
a year the situation doesn't improve.

On the other hand the output of "show sess all" is extremely detailed
and kept up to date with code evolutions since it's a heavily used
debugging tool.

This commit replaces the call to the totally outdated stream_dump() with
a call to strm_dump_to_buffer(), and removes the filters dump since they
are already emitted there, and it now produces much more exploitable
output:

  [ALERT]    (5936) : A bogus STREAM [0x7fa8dc02f660] is spinning at 5653514 calls per second and refuses to die, aborting now! Please report this error to developers:
  0x7fa8dc02f660: [28/Sep/2023:09:53:08.811818] id=2 proto=tcpv4 source=127.0.0.1:58306
     flags=0xc4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x133f220, pend_pos=(nil) waiting=0 epoch=0x1
     frontend=public (id=2 mode=http), listener=? (id=1) addr=127.0.0.1:4080
     backend=public (id=2 mode=http) addr=127.0.0.1:61932
     server=s1 (id=1) addr=127.0.0.1:7443
     task=0x7fa8dc02fa40 (state=0x01 nice=0 calls=5749559 rate=5653514 exp=3s tid=1(1/1) age=1s)
     txn=0x7fa8dc02fbf0 flags=0x3000 meth=1 status=-1 req.st=MSG_DONE rsp.st=MSG_RPBEFORE req.f=0x4c rsp.f=0x00
     scf=0x7fa8dc02f5f0 flags=0x00000482 state=EST endp=CONN,0x7fa8dc02b4b0,0x05004001 sub=1 rex=58s wex=<NEVER>
         h1s=0x7fa8dc02b4b0 h1s.flg=0x100010 .sd.flg=0x5004001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE
          .meth=GET status=0 .sd.flg=0x05004001 .sc.flg=0x00000482 .sc.app=0x7fa8dc02f660
          .subs=0x7fa8dc02f608(ev=1 tl=0x7fa8dc02fae0 tl.calls=0 tl.ctx=0x7fa8dc02f5f0 tl.fct=sc_conn_io_cb)
          h1c=0x7fa8dc0272d0 h1c.flg=0x0 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc0273f0 .exp=<NEVER>
         co0=0x7fa8dc027040 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=LISTENER:0x12840c0
         flags=0x00000300 fd=32 fd.state=20 updt=0 fd.tmask=0x2
     scb=0x7fa8dc02fb30 flags=0x00001411 state=EST endp=CONN,0x7fa8dc0300c0,0x05000001 sub=1 rex=58s wex=<NEVER>
         h1s=0x7fa8dc0300c0 h1s.flg=0x4010 .sd.flg=0x5000001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE
          .meth=GET status=0 .sd.flg=0x05000001 .sc.flg=0x00001411 .sc.app=0x7fa8dc02f660
          .subs=0x7fa8dc02fb48(ev=1 tl=0x7fa8dc02feb0 tl.calls=2 tl.ctx=0x7fa8dc02fb30 tl.fct=sc_conn_io_cb)
          h1c=0x7fa8dc02ff00 h1c.flg=0x80000000 .sub=1 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc030020 .exp=<NEVER>
         co1=0x7fa8dc02fcd0 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x133f220
         flags=0x10000300 fd=33 fd.state=10421 updt=0 fd.tmask=0x2
     req=0x7fa8dc02f680 (f=0x1840000 an=0x8000 pipe=0 tofwd=0 total=79)
         an_exp=<NEVER> buf=0x7fa8dc02f688 data=(nil) o=0 p=0 i=0 size=0
         htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
     res=0x7fa8dc02f6d0 (f=0x80000000 an=0x1400000 pipe=0 tofwd=0 total=0)
         an_exp=<NEVER> buf=0x7fa8dc02f6d8 data=(nil) o=0 p=0 i=0 size=0
         htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
    call trace(10):
    |       0x59f2b7 [0f 0b 0f 1f 80 00 00 00]: stream_dump_and_crash+0x1f7/0x2bf
    |       0x5a0d71 [e9 af e6 ff ff ba 40 00]: process_stream+0x19f1/0x3a56
    |       0x68d7bb [49 89 c7 4d 85 ff 74 77]: run_tasks_from_lists+0x3ab/0x924
    |       0x68e0b4 [29 44 24 14 8b 4c 24 14]: process_runnable_tasks+0x374/0x6d6
    |       0x656f67 [83 3d f2 75 84 00 01 0f]: run_poll_loop+0x127/0x5a8
    |       0x6575d7 [48 8b 1d 42 50 5c 00 48]: main+0x1b22f7
    | 0x7fa8e0f35e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45
    | 0x7fa8e0e5a4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

Note that the output is subject to the global anon key so that IPs and
object names can be anonymized if required. It could make sense to
backport this and the few related previous patches next time such an
issue is reported.
2023-09-29 09:20:27 +02:00
Willy Tarreau
5743eeea88 MINOR: stream: make stream_dump() always multi-line
There used to be two working modes for this function, a single-line one
and a multi-line one, the difference being made on the "eol" argument
which could contain either a space or an LF (and with the prefix being
adjusted accordingly). Let's get rid of the single-line mode as it's
what limits the output contents because it's difficult to produce
exploitable structured data this way. It was only used in the rare case
of spinning streams and applets and these are the ones lacking info. Now
a spinning stream produces:

[ALERT]    (3511) : A bogus STREAM [0x227e7b0] is spinning at 5581202 calls per second and refuses to die, aborting now! Please report this error to developers:
  strm=0x227e7b0,c4a src=127.0.0.1 fe=public be=public dst=s1
  txn=0x2041650,3000 txn.req=MSG_DONE,4c txn.rsp=MSG_RPBEFORE,0
  rqf=1840000 rqa=8000 rpf=80000000 rpa=1400000
  scf=0x24af280,EST,482 scb=0x24af430,EST,1411
  af=(nil),0 sab=(nil),0
  cof=0x7fdb28026630,300:H1(0x24a6f60)/RAW((nil))/tcpv4(33)
  cob=0x23199f0,10000300:H1(0x24af630)/RAW((nil))/tcpv4(32)
  filters={}
  call trace(11):
  (...)
2023-09-29 09:20:27 +02:00
Willy Tarreau
48b2233d36 CLEANUP: freq_ctr: make all freq_ctr readers take a const
Since 2.4-dev18 with commit b4476c6a8 ("CLEANUP: freq_ctr: make
arguments of freq_ctr_total() const"), most of the freq_ctr readers
should be fine with a const, except that they were not updated to
reflect this and they still force non-const variables on some of the
functions that call them. Let's update this. This could even be
backported if needed.
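
A minimal sketch of what a const-qualified reader buys the callers. The
struct layout and the demo_* names are hypothetical and simplified; this is
not HAProxy's actual freq_ctr API:

```c
/* Hypothetical, simplified counter; not HAProxy's real struct freq_ctr. */
struct demo_ctr {
    unsigned int curr_ctr; /* events counted in the current period */
    unsigned int prev_ctr; /* events counted in the previous period */
};

/* The reader takes a const pointer: it promises not to modify the counter. */
static unsigned int demo_read_total(const struct demo_ctr *ctr)
{
    return ctr->curr_ctr + ctr->prev_ctr;
}

/* A caller that only holds a const pointer can call the reader directly,
 * without casting the qualifier away or dropping const from its own
 * prototype.
 */
static unsigned int demo_report(const struct demo_ctr *ctr)
{
    return demo_read_total(ctr);
}
```

If demo_read_total() took a non-const pointer instead, demo_report() would
have to give up its own const qualifier, which is the propagation problem
described above.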
2023-09-29 09:20:27 +02:00
Vladimir Vdovin
f8b81f6eb7 MINOR: support for http-request set-timeout client
Added set-timeout for the frontend side of the session, so that it can be
used to set custom per-client timeouts if needed. Added cur_client_timeout
to fetch client timeout samples.
2023-09-28 08:49:22 +02:00
Amaury Denoyelle
b9bb3b932c MINOR: proto_reverse_connect: emit log for preconnect
Add reporting using send_log() for the preconnect operation. This is
minimal to ensure we understand the current status of a listener in active
reverse connect.

To limit logging quantity, only important transitions are considered.
This requires implementing a minimal state machine as a new field in
the receiver structure.

Here are the logs produced :
* Initiating : first time preconnect is enabled on a listener
* Error : last preconnect attempt interrupted on a connection error
* Reaching maxconn : all necessary connections were reversed and are
  operational on a listener
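
The idea of logging only on state transitions can be sketched as follows.
This is an illustration under assumptions: the demo_* names, the state
values and the counter are hypothetical, and fprintf() stands in for
send_log():

```c
#include <stdio.h>

/* Hypothetical states; the real field in the receiver structure may use
 * different names and values.
 */
enum demo_preconn_state {
    DEMO_ST_INIT = 0, /* preconnect not started yet */
    DEMO_ST_ACTIVE,   /* "Initiating" was reported */
    DEMO_ST_ERROR,    /* last attempt failed on a connection error */
    DEMO_ST_FULL,     /* maxconn reached, all connections reversed */
};

struct demo_receiver {
    enum demo_preconn_state state;
    int logs_emitted; /* only here to make the behavior observable */
};

/* Switch to <next> and emit <msg> only when the state really changes.
 * Returns 1 if a log was emitted, 0 if the event was a repeat.
 */
static int demo_preconn_transition(struct demo_receiver *rx,
                                   enum demo_preconn_state next,
                                   const char *msg)
{
    if (rx->state == next)
        return 0;
    rx->state = next;
    rx->logs_emitted++;
    fprintf(stderr, "preconnect: %s\n", msg);
    return 1;
}
```

With this shape, a series of consecutive failed attempts produces a single
"Error" line until the listener leaves the error state.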
2023-09-22 17:21:53 +02:00
Amaury Denoyelle
1f43fb71be MINOR: proto_reverse_connect: refactor preconnect failure
When a connection is freed during preconnect before reversal, the error
must be notified to the listener to remove any connection reference and
rearm a new preconnect attempt. Currently, this can occur through 2 code
paths :
* conn_free() called directly by H2 mux
* error during conn_create_mux(). For this case, connection is flagged
  with CO_FL_ERROR and reverse_connect task is woken up. The process
  task handler is then responsible to call conn_free() for such
  connection.

Duplicated steps were done both in conn_free() and in the process task
handler. These are now removed. To facilitate code maintenance, the
dedicated operations have been centralized in a new function
rev_notify_preconn_err() which is called by conn_free().
2023-09-22 16:43:36 +02:00
Emeric Brun
27b2fd2e06 MINOR: quic: handle external extra CIDs generator.
This patch adds the ability to externalize and customize the code
of the computation of extra CIDs after the first one was derived from
the ODCID.

This is to prepare interoperability with extra components such as
different QUIC proxies or routers for instance.

To proceed, the patch defines two function callbacks:
- the first one computes a 64-bit hash from the first generated CID
  (which itself continues to be derived from the ODCID). The resulting
  hash is stored into the 'quic_conn', and 64 bits was chosen as it is
  large enough to store an entire haproxy CID.
- the second callback re-uses the previously computed hash to derive
  an extra CID using the custom algorithm. If not set, haproxy will
  continue to choose a randomized CID value.

Both functions are also passed the 'cluster_secret' as an argument:
this way, it is usable for obfuscation or ciphering.
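
A rough sketch of what such a pair of callbacks could look like. All the
demo_* names, the signatures and the 8-byte CID length are assumptions
made for illustration, and FNV-1a is only a stand-in for the custom
algorithm (it is not cryptographically secure):

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CID_LEN 8 /* assumed CID length for this sketch */

/* Hypothetical callback types; the real prototypes may differ. */
typedef uint64_t (*demo_cid_hash_fn)(const unsigned char *cid, size_t cidlen,
                                     const unsigned char *secret, size_t secretlen);
typedef int (*demo_cid_derive_fn)(unsigned char *out, size_t outlen,
                                  uint64_t hash, uint64_t seq,
                                  const unsigned char *secret, size_t secretlen);

/* FNV-1a 64-bit mixing step, used as a placeholder algorithm. */
static uint64_t demo_fnv1a(uint64_t h, const unsigned char *p, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* First callback: 64-bit hash of the first CID, keyed by the secret. */
static uint64_t demo_cid_hash(const unsigned char *cid, size_t cidlen,
                              const unsigned char *secret, size_t secretlen)
{
    uint64_t h = demo_fnv1a(0xcbf29ce484222325ULL, secret, secretlen);

    return demo_fnv1a(h, cid, cidlen);
}

/* Second callback: derive extra CID number <seq> from the stored hash.
 * Returns 1 on success, 0 if <out> is too small.
 */
static int demo_cid_derive(unsigned char *out, size_t outlen,
                           uint64_t hash, uint64_t seq,
                           const unsigned char *secret, size_t secretlen)
{
    unsigned char seqbuf[8];
    uint64_t h;
    size_t i;

    if (outlen < DEMO_CID_LEN)
        return 0;
    for (i = 0; i < sizeof(seqbuf); i++)
        seqbuf[i] = (unsigned char)(seq >> (8 * i));
    h = demo_fnv1a(hash, secret, secretlen);
    h = demo_fnv1a(h, seqbuf, sizeof(seqbuf));
    for (i = 0; i < DEMO_CID_LEN; i++)
        out[i] = (unsigned char)(h >> (8 * i));
    return 1;
}
```

Because the derivation is deterministic, any external component that knows
the secret and the hash of the first CID can recompute the same extra CIDs,
which is what makes this kind of scheme usable for interoperability.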
2023-09-22 10:32:14 +02:00
Aurelien DARRAGON
acb7d8a89c MINOR: pattern: fix pat_{parse,match}_ip() function comments
Function comments were outdated, probably because they have not been
updated during the previous refactors.

Fixing comments to better reflect the current behavior.

This may be backported up to 2.2, or even 2.0 by slightly adapting the
patch (in 2.0, such functions are documented in proto/pattern.h)
2023-09-21 09:50:55 +02:00