8523 Commits

Author SHA1 Message Date
Christopher Faulet
52a3d807fc BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter
When a filter is registered on the data, it means it may change the payload
length by rewritting data. It means consumers of the message cannot trust the
expected length of payload as announced by the producer. The commit 8bd835b2d2
("MEDIUM: filters/htx: Don't rely on HTX extra field if payload is filtered")
was pushed to solve this issue. When the HTTP payload of a message is filtered,
the extra field is set to 0 to be sure it will never be used by error by any
consumer. However, it is not enough.

Indeed, the filters must be called before fowarding some data. They cannot be
by-passed. But if a consumer is unable to flush the HTX message, some outgoing
data can remain blocked in the channel's buffer. If some new data are then
pushed because there is some room in the channel's buffe, the producer will set
the HTX extra field. At this stage, if the consumer is unblocked and can send
again data, it is possible to call it to forward outgoing data blocked in the
channel's buffer before waking the stream up to filter new input data. It is the
purpose of the data fast-forwarding. In this case, the HTX extra field will be
seen by the consumer. It is unexpected and leads to undefined behavior.

One consequence of this bug is to perform a wrong chunking on compressed
messages, leading to processing errors at the end of the message, reported as
"ID--" in logs.

To fix the bug, a HTX flag is added to state the payload of the current HTX
message is altered. When this flag is set (HTX_FL_ALTERED_PAYLOAD), the HTX
extra field must not be trusted. And to keep things simple, when this flag is
set, the HTX extra field is automatically set to 0 when the HTX message is
loaded, in htxbuf() function.

It is probably the less intrusive way to fix the bug for now. But this part must
be reviewed to save meta-info of the HTX message outside of the message itself.

This commit should solve the issue #2741. It must be backported as far as 2.9.
2024-10-17 13:54:54 +02:00
Valentine Krasnobaeva
b73a278df4 MINOR: mworker/cli: add _send_status to support state transition
In the new master-worker architecture, when a worker process is forked and
successfully initialized it needs somehow to communicate its "READY" state to
the master, in order to terminate the previous worker and workers, that might
exceeded max_reloads counter.

So, let's implement for this a new master CLI _send_status command. A new
worker can send its status string "READY" to the master, when it's about
entering to the run poll loop, thus it can start to receive data.

In _send_status() in the master context we update the status of the new worker:
PROC_O_INIT flag is withdrawn.

When TERM signal is sent to a worker, worker terminates and this triggers the
mworker_catch_sigchld() handler in master. This handler deletes the exiting
process entry from the processes list.

In _send_status() we loop over the processes list twice. At the first time, in
order to stop workers that exceeded the max_reloads counter. At the second time,
in order to stop the worker forked before the last reload. In the corner case,
when max_reloads=1, we avoid to send SIGTERM twice to the same worker by
setting sigterm_sent flag during the first loop.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
2bb07b913d MINOR: startup: rename and adapt reexec_on_failure
Previously reexec_on_failure() was called in cases when the process has failed
after reload, while it was parsing its configuration or it was trying to apply
it. reexec_on_failure() has called mworker_reexec() and the master process has
been reexecuted.

With the new architecture in such cases there is no longer need to reexecute
the master process after its reload again. It simply keeps the previous worker,
forked before the reload, and it lets the new one to exit with an error. But we
still need the code, which increments the number of failed reloads and which
notifies systemd with new "Reload failed!" status. So, let's reuse and adapt
for this reexec_on_failure() and let's rename it to on_new_child_failure().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
646299fc95 MINOR: mworker: add and set state PROC_O_INIT for new worker
Here, to distinguish between the new worker and the previous one let's add a
new process state PROC_O_INIT and let's set it, when the memory is allocated
for the new worker in the processes list.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
cc1a631beb MINOR: mworker/cli: rename and clean mworker_cli_sockpair_new
Let's rename mworker_cli_sockpair_new() to
mworker_cli_global_proxy_new_listener() to outline that this function creates
the GLOBAL proxy, allocates the listener with "master-socket" bind conf and
attaches this listener to this GLOBAL proxy. Listener is bound to ipc_fd[1] of
the sockpair inherited in master and in worker (master CLI sockpair).
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
0fbf1973ad MINOR: mworker/cli: rename mworker_cli_proxy_new_listener
This is the first commit in a series to add the support of 4 primary reload
use-cases for the new master-worker architecture:

1. Newly forked worker process dies before any reload, due to some errors in
   the configuration. Newly forked worker process crashes before any reload
   after sending its "READY" state to master.

2. Newly forked worker process dies due to some errors in the new
   configuration. This happens after reload, when this new configuration was
   supplied, so the previous worker process is still here.

3. Newly forked worker process crashes after sending its "READY" state to
   master due to some bugs. This happens after reload, so the previous worker
   process is still here.

4. Newly forked worker process has sent its "READY" state to master and starts
   to receive traffic. This happens after reload, the old worker hasn't
   terminated yet, as it is waiting on some idle connection and it crashes.

Let's rename in this commit mworker_cli_proxy_new_listener() to
mworker_cli_master_proxy_new_listener() to outline, that this function creates
"master-socket" bind conf and allocates a listener. This listener is attached
to the MASTER proxy and it's bound to the ipc_fd[0] of the sockpair,
inherited in master and in worker processes (master CLI sockpair).
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
f9123e2183 MEDIUM: cfgparse: add KWF_DISCOVERY keyword flag
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

So, let's add here KWF_DISCOVERY flag to distinguish the keywords,
which should be parsed in "discovery" mode and which are needed for master
process, from all others. Keywords, that should be parsed in "discovery" mode
have its dedicated parser funtions. Let's tag these functions with
KWF_DISCOVERY flag in keywords list. Like this, only these keyword parsers
might be called during the first configuration read in discovery mode.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
6769745fe5 MINOR: global: add MODE_DISCOVERY flag
This is the first commit from a series to add a support of discovery mode
in the configuration parser and in initialization sequence.

Discovery mode is the mode, when we read the configuration at the first time
and we parse and set runtime modes: daemon, zero-warning, master-worker. In
this mode we also parse some parameters needed for the master process to start,
in case if we are in the master-worker mode. Like this the master process
doesn't allocate any additional resources, which it doesn't use and it quickly
finishes its initialization and enters to its polling loop. The worker process
after its fork reads the rest of the configuration.

So, let's add in this commit MODE_DISCOVERY flag to check it in
configuration parser functions.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
fe75c1e12d MEDIUM: startup: remove MODE_MWORKER_WAIT
MODE_MWORKER_WAIT becames redundant with MODE_MWORKER, due to moving
master-worker fork in init(). This change allows master no longer perform
reexec just after forking in order to free additional memory.

As after the fork in the master process we set 'master' variable, we can
replace now MODE_MWORKER_WAIT in some 'if' statements by simple check of this
'master' variable.

Let's also continue to get rid of HAPROXY_MWORKER_WAIT_ONLY environment
variable, as it's no longer needed as well.

In cfg_program_postparser(), which is used to check if cmdline is defined to
launch a program, we completely remove the check of mode for now, because
the master process does not parse the configuration for the moment. 'program'
section parsing will be reintroduced in master later in the next commits.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
fb7bef781d MINOR: defaults: update MASTER_MAXCONN description
This is a one of the commits to prepare the removal of MODE_MWORKER_WAIT
support, as it became redundant with MODE_MWORKER due to moving master-worker
fork in init().
2024-10-16 22:02:39 +02:00
Willy Tarreau
2c2dac77aa DEBUG: mux-h2/flags: add H2_CF_DEM_RXBUF & H2_SF_EXPECT_RXDATA for the decoder
Both flags were recently added but missing from the decoders flags, so
they appeared in hex in dev/flags/flags output. No backport needed.
2024-10-16 18:32:52 +02:00
Aurelien DARRAGON
85298189bf BUG/MEDIUM: server: server stuck in maintenance after FQDN change
Pierre Bonnat reported that SRV-based server-template recently stopped
to work properly.

After reviewing the changes, it was found that the regression was caused
by a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part of MAINT")

Indeed, HMAINT is not a regular maintenance flag. It was implemented in
b418c122 a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part
of MAINT"). This flag is only set (and never removed) when the server FQDN
is changed from its initial config-time value. This can happen with "set
server fqdn" command as well as SRV records updates from the DNS. This
flag should ideally belong to server flags.. but it was stored under
srv_admin enum because cur_admin is properly exported/imported via server
state-file while regular server's flags are not.

Due to a4d04c6, when a server FQDN changes, the server is considered in
maintenance, and since the HMAINT flag is never removed, the server is
stuck in maintenance.

To fix the issue, we partially revert a4d04c6. But this latter commit is
right on one point: HMAINT flag was way too confusing and mixed-up between
regular MAINT flags, thus there's nothing to blame about a4d04c6 as it was
error-prone anyway.. To prevent such kind of bugs from happening again,
let's rename HMAINT to something more explicit (SRV_ADMF_FQDN_CHANGED) and
make it stand out under srv_admin enum so we're not tempted to mix it with
regular maintenance flags anymore.

Since a4d04c6 was set to be backported in all versions, this patch must
be backported there as well.
2024-10-16 14:26:57 +02:00
Amaury Denoyelle
0918c41ef6 BUG/MEDIUM: quic: support wait-for-handshake
wait-for-handshake http-request action was completely ineffective with
QUIC protocol. This commit implements its support for QUIC.

QUIC MUX layer is extended to support wait-for-handshake. A new function
qcc_handle_wait_for_hs() is executed during qcc_io_process(). It detects
if MUX processing occurs after underlying QUIC handshake completion. If
this is the case, it indicates that early data may be received. As such,
connection is flagged with CO_FL_EARLY_SSL_HS, which is necessary to
block stream processing on wait-for-handshake action.

After this, qcc subscribs on quic_conn layer for RECV notification. This
is used to detect QUIC handshake completion. Thus,
qcc_handle_wait_for_hs() can be reexecuted one last time, to remove
CO_FL_EARLY_SSL_HS and notify every streams flagged as
SE_FL_WAIT_FOR_HS.

This patch must be backported up to 2.6, after a mandatory period of
observation. Note that it relies on the backport of the two previous
patches :
- MINOR: quic: notify connection layer on handshake completion
- BUG/MINOR: stream: unblock stream on wait-for-handshake completion
2024-10-16 11:51:35 +02:00
Willy Tarreau
4eb3ff1d3b MAJOR: mux-h2: make streams use the connection's buffers
For now it seems to work as before, and even when artificially inflating
the number of allocatable buffers per stream. The number of allocated
slots is always the same as the max number of streams, which guarantees
that each stream will find one buffer. we only grant one buffer per
stream at this point, since the goal was to replace the existing single
rxbuf.

A new demux blocking flag, H2_CF_DEM_RXBUF, was added to indicate
a failure to get an rxbuf slot from the connection. It was lightly
tested (by forcing bl_init() to a lower number of buffers). It is not
yet certain whether it's more useful to have a new flag or to reuse
the existing H2_CF_DEM_SFULL which indicates the rxbuf is full,
but at least the new flag more accurately translates the condition,
that may make a difference in the future. However, given that when
RXBUF is set, most of the time it results in a failure to find more
room to demux and it sets SFULL, for now we have to always clear
SFULL when clearing RXBUF as well. This means that most of the time
we'll see 3 combinations:
  - none: everything's OK
  - SFULL: the unique rx buffer is full
  - RXBUF || (RXBUF|SFULL): cannot allocate more entries

Note that we need to be super careful in h2_frt_transfer_data() because
the htx_free_data_space() function doesn't guarantee that the room is
usable, so htx_add_data() may still fail despite an apparent room. For
this reason, h2_frt_transfer_data() maintains a "full" flag to indicate
that a transfer attempt failed and that a new buffer is required.
2024-10-12 16:29:16 +02:00
Willy Tarreau
3b5ac2b553 MINOR: mux-h2: move H2_CF_WAIT_IN_LIST flag away from the demux flags
It's not convenient to have this flag in the middle of the demux flags,
it easily hides other ones that need to be added. Let's move it after
the other ones.
2024-10-12 16:29:16 +02:00
Willy Tarreau
721ea5b06c MINOR: mux-h2: count within a connection, how many streams are receiving data
A stream is receiving data from after the HEADERS frame missing END_STREAM,
to the end of the stream or HREM (the presence of END_STREAM). We're now
adding a flag to the stream that indicates this state, as well as a counter
in the connection of streams currently receiving data. The purpose will be
to gauge at any instant the number of streams that might have to share the
available bandwidth and buffers count in order not to allocate too much flow
control to any single stream. For now the counter is kept up to date, and is
reported in "show fd".
2024-10-12 16:29:16 +02:00
Willy Tarreau
8f09bdce10 MINOR: buffer: add a buffer list type with functions
The buffer ring is problematic in multiple aspects, one of which being
that it is only usable by one entity. With multiplexed protocols, we need
to have shared buffers used by many entities (streams and connection),
and the only way to use the buffer ring model in this case is to have
each entity store its own array, and keep a shared counter on allocated
entries. But even with the default 32 buf and 100 streams per HTTP/2
connection, we're speaking about 32*101*32 bytes = 103424 bytes per H2
connection, just to store up to 32 shared buffers, spread randomly in
these tables. Some users might want to achieve much higher than default
rates over high speed links (e.g. 30-50 MB/s at 100ms), which is 3 to 5
MB storage per connection, hence 180 to 300 buffers. There it starts to
cost a lot, up to 1 MB per connection, just to store buffer indexes.

Instead this patch introduces a variant which we call a buffer list.
That's basically just a free list encoded in an array. Each cell
contains a buffer structure, a next index, and a few flags. The index
could be reduced to 16 bits if needed, in order to make room for a new
struct member. The design permits initializing a whole freelist at once
using memset(0).

The list pointer is stored at a single location (e.g. the connection)
and all users (the streams) will just have indexes referencing their
first and last assigned entries (head and tail). This means that with
a single table we can now have all our buffers shared between multiple
streams, irrelevant to the number of potential streams which would want
to use them. Now the 180 to 300 entries array only costs 7.2 to 12 kB,
or 80 times less.

Two large functions (bl_deinit() & bl_get()) were implemented in buf.c.
A basic doc was added to explain how it works.
2024-10-12 16:29:15 +02:00
Willy Tarreau
ac66df4e2e REORG: buffers: move some of the heavy functions from buf.h to buf.c
Over time, some of the buffer management functions grew quite a bit,
and were still forced to remain inlined since all defined in buf.h.
Let's create buf.c and move the heaviest ones there. All those moved
here were above 200 bytes.
2024-10-12 16:29:15 +02:00
Aurelien DARRAGON
1bdf6e884a MEDIUM: sink: implement sink_find_early()
sink_find_early() is a convenient function that can be used instead of
sink_find() during parsing time in order to try to find a matching
sink even if the sink is not defined yet.

Indeed, if the sink is not defined, sink_find_early() will try to create
it and mark it as forward-declared. It will also save informations from
the caller to better identify it in case of errors.

If the sink happens to be found in the config, it will transition from
forward-declared type to its final type. Else, it means that the sink
was not found in the config, in this case, during postresolve, we raise
an error to indicate that the sink was not found in the configuration.

It should help solve postresolving issue with rings, because for now only
log targets implement proper ring postresolving.. but rings may be used
at different places in the code, such as debug() converter or in "traces"
section.
2024-10-10 16:55:15 +02:00
Aurelien DARRAGON
0e271f1d2a MINOR: log: add do_log_parse_act() helper func
Function may be used from places where per-context actions are usually
registered (tcp_act.c, http_act.c, quic_rules.c.. to name a few) in
order to expose the do_log() action.
2024-10-04 21:38:08 +02:00
Aurelien DARRAGON
e63c7da508 MINOR: log: add do_log() logging helper
do_log() is quite similar to sess_log() or strm_log(), excepts that it
may be called at any time during session handling in an opportunistic
way as long as the session exists (the stream may or may not exist).

Also, it will try to emit the log as INFO by default, unless set-log-level
is used on the stream, or error origin flag is set.
2024-10-04 21:38:02 +02:00
Amaury Denoyelle
f6599cf5a6 MEDIUM: quic: decount out-of-order ACK data range for MUX txbuf window
This commit is the last one of a serie whose objective is to restore
QUIC transfer throughput performance to the state prior to the recent
QUIC MUX buffer allocator rework.

This gain is obtained by reporting received out-of-order ACK data range
to the QUIC MUX which can then decount room in its txbuf window. This is
implemented in QUIC streamdesc layer by adding a new invokation of
notify_room callback. This is done into qc_stream_buf_store_ack() which
handle out-of-order ACK data range.

Previous commit has introduced merging of overlapping ACK data range. As
such, it's easy to only report the newly acknowledged data range.

As with in-order ACKs, this new notification is only performed on
released streambuf. As such, when a streambuf instance is released,
notify_room notification now also reports the total length of
out-of-order ACK data range currently stored. This value is stored in a
new streambuf member <room> to avoid unnecessary tree lookup.

This <room> member also serves on in-order ACK notification to reduce
the notified room. This prevents to report invalid values when overlap
ranges are treated first out-of-order and then in-order, which would
cause an invalid QUIC MUX txbuf window value.

After this change has been implemented, performance has been
significantly improved, both with ngtcp2-client rate usage and on
interop goodput test. These values are now similar to the rate observed
on older haproxy version before QUIC MUX buffer allocator rework.
2024-10-04 18:09:51 +02:00
Amaury Denoyelle
e7578084b0 MINOR: quic: implement dedicated type for out-of-order stream ACK
QUIC streamdesc layer is responsible to handle reception of ACK for
streams. It removes stream data from the underlying buffers on ACK
reception.

Streamdesc layer treats ACK in order at the stream level. Out of order
ACKs are buffered in a tree until they can be handled on older data
acknowledgement reception. Previously, qf_stream instance which comes
from the quic_tx_packet was used as tree node to buffer such ranges.

Introduce a new type dedicated to represent out of order stream ack data
range. This type is named qc_stream_ack. It contains minimal infos only
relative to the acknowledged stream data range.

This allows to reduce size of frequently used quic_frame with the
removal of tree node from qf_stream. Another side effect of this change
is that now quic_frame are always released immediately on ACK reception,
both in-order and out-of-order. This allows to also release the
quic_tx_packet instance which should reduce memory consumption.

The drawback of this change is that qc_stream_ack instance must be
allocated on out-of-order ACK reception. As such, qc_stream_desc_ack()
may fail if an error happens on allocation. For the moment, such error
is silenly recovered up to qc_treat_rx_pkts() with the dropping of the
received packet containing the ACK frame. In the future, it may be
useful to close the connection as this error may only happens on low
memory usage.
2024-10-04 17:56:45 +02:00
Christopher Faulet
15a520d474 MINOR: config/trace: Add a 'traces' section to declare debug traces
It is no longer supported to declare debug traces, via 'trace' directive, in
a global section. A 'traces' directive must be used instead. The syntax of
the 'trace' directive in these sections remains the same. But it is no
longer experimental.

The main reason for this change is to avoid to have a ring section defined
before a global one. Indeed, for now, forward declarations of ring sections
are not supported. So to configure traces, you had to add a ring section
before the global one defining the traces. Most of time, that meant to have
two global sections :

  global
    [...] # global settings

  ring <name>
    [...]

  global
    [...] # trace config

In addition, it will be possible to easily extend the traces section by
adding some new directives.
2024-10-02 10:22:51 +02:00
Amaury Denoyelle
cc4384aeb7 MEDIUM: quic: handle out-of-order ACK at streamdesc layer
qc_stream_desc_ack() is the entrypoint for streamdesc layer to handle a
new acknowledgement of previously emitted STREAM data.

Previously, it was only able to deal with in-order ACK offset. The
caller was responsible to buffer out-of-order ACKs. Change this by
dealing with the latter case directly in qc_stream_desc_ack(). This
notably simplify ACK handling in quic_rx module.
2024-10-01 16:22:20 +02:00
Amaury Denoyelle
62558a9285 MINOR: quic: move buffered ACK to streambuf
QUIC streamdesc layer is used to manage QUIC MUX stream txbuf data
storage until acknowledgment. Currently, it only supports in-order
acknowledgment at the stream level. This requires to be able to buffer
out-of-order ACKs until they can be handled.

Previously, these ACKs were stored in a tree to the streamdesc instance.
Move this indexed storage at the streambuf instance.

This commit is purely an architecture change. However, it will allow to
extend ACK management in future patches, such as the ability to merge
overlapping out-of-order ACKs.
2024-10-01 16:19:42 +02:00
Amaury Denoyelle
943e48dadd MINOR: quic: store streambuf in a streamdesc tree
qc_stream_desc layer is used by QUIC MUX to store emitted STREAM data
until their acknowledgement. Each stream with Tx capability can allocate
its own qc_stream_desc. In turn, each stream desc can have one or
multiple data buffers. This is useful when a MUX stream releases a
buffer and allocate a new one, to preserve bandwith without waiting to
receive all acknowledgement of the previous buffer.

Each buffer is encapsulated in a qc_stream_buf structure. Previously, it
was stored as a list into qc_stream_desc. Change this storage to use a
tree instead. Each buffer is indexed by their offset.

This commit does not introduce functional changes. However, this
rearchitecture will be necessary for future commit to extend ACK
management which require fetching individual buffer instance, not just
the first or last element of a streamdesc, by their offset.
2024-10-01 16:19:41 +02:00
Amaury Denoyelle
f4a83fbb14 MINOR: quic: do not remove qc_stream_desc automatically on ACK handling
qc_stream_desc_ack() is used to handle ACK received for STREAM frame. It
removes acknowledged data from their underlying buffer.

If all data were removed after ACK handling, qc_stream_desc instance
would automatically be freed at the end of qc_stream_desc_ack().
However, this renders the function complicated to use. Simplify this by
removing this automatic removal. Now, caller is responsible to check
after ACK handling if qc_stream_desc instance can be removed. This is
easily done using qc_stream_desc_done() helper.
2024-10-01 16:19:25 +02:00
Amaury Denoyelle
db68f8ed86 MINOR: quic: refactor STREAM room notification
qc_stream_desc is an intermediary layer between QUIC MUX and quic_conn.
It is a facility which permits to store data to emit and keep them for
retransmission until acknowledgment. This layer is responsible to notify
QUIC MUX each time a buffer is freed. This is necessary as MUX buffer
allocation is limited by the underlying congestion window size.

Refactor this to use a mechanism similar to send notification. A new
callback notify_room can now be registered to qc_stream_desc instance.
This is set by QUIC MUX to qmux_ctrl_room(). On MUX QUIC free, special
care is now taken to reset notify_room callback to NULL.

Thanks to this refactoring, further adjustment have been made to refine
the architecture. One of them is the removal of qc_stream_desc
QC_SD_FL_OOB_BUF, which is now converted to a MUX layer flag
QC_SF_TXBUF_OOB.
2024-10-01 16:19:25 +02:00
Amaury Denoyelle
d7f4e5abf0 MEDIUM: quic: strengthen MUX send notification
Previous commit implement a refactor of MUX send notification from
quic_conn layer. With this new architecture, a proper callback is
defined for each qc_stream_desc instance.

This architecture change allows to simplify notification from quic_conn
layer. First, ensure the MUX callback to properly ignore retransmission
of an already emitted frame. Luckily, this can be handled easily by
comparing offsets and FIN status. Also, each QCS instance can now be
unregistered from send notification just prior qc_stream_desc releasing.
This ensures a QCS is never manipulated from quic_conn after its
emission ending. Both these changes render the send notification more
robust. As a nice effect, flag QUIC_FL_CONN_TX_MUX_CONTEXT can be
removed as it is now unneeded.
2024-10-01 16:19:25 +02:00
Amaury Denoyelle
6ad99af0a9 MINOR: quic: refactor MUX send notification
For STREAM emission, MUX QUIC generates one or several frames and emit
them via qc_send_mux(). Lower layer may use them as-is, or split them to
lower chunk to fit in a QUIC packet. It is then responsible to notify
the MUX to report the amount of data sent.

Previously, this was done via a direct call from quic_conn to MUX using
qcc_streams_sent_done(). Modify this to have a better isolation accross
layers. Define a send callback handled by the qc_stream_desc instance.
This allows the MUX to register each QCS instance individually to the
renamved qmux_ctrl_send() which replaces qcc_streams_sent_done().

At quic_conn layer, qc_stream_desc_send() can be used now. This is a
wrapper to qc_stream_desc layer to invoke the send callback if
registered.

This mechanism of qc_stream_desc callback should be extended later to
implement other notifications accross the QUIC stack.
2024-10-01 16:19:25 +02:00
Christopher Faulet
273d322b6f MINOR: stream/stats: Expose the total number of streams ever created in stats
A shared counter is added in the thread context to track the total number of
streams created on the thread. This number is then reported in stats. It
will be a useful information to diagnose some bugs.
2024-09-30 16:55:53 +02:00
Christopher Faulet
18ee22ff76 MINOR: stream/stats: Expose the current number of streams in stats
A shared counter is added in the thread context to track the current number
of streams. This number is then reported in stats. It will be a useful
information to diagnose some bugs.
2024-09-30 16:55:53 +02:00
Christopher Faulet
6a94b7419e MINOR: stream: Support dynamic changes of the number of connection retries
Thanks to the previous patch, it is now possible to add an action to
dynamically change the maxumum number of connection retires for a stream.
"set-retries" action may now be used to do so, from a "tcp-request content"
or a "http-request" rule. This action accepts an expression or an integer
between 0 and 100. The integer value is checked during the configuration
parsing and leads to an error if it is not in the expected range. However,
for the expression, the value is retrieve at runtime. So, invalid value are
just ignored.

Too high value is forbidden to avoid any trouble. 100 retries seems already
be an amazingly hight value. In addition, the option is only available on
backend or listen sections.

Because the max retries is limited to 100 at most, it can be stored as a
unsigned short. This save some space in the stream structure.
2024-09-30 16:55:53 +02:00
Christopher Faulet
91e785edc9 MINOR: stream: Rely on a per-stream max connection retries value
Instead of directly relying on the backend parameter to limit the number of
connection retries, we now use a per-stream value. This value is by default
inherited from the backend value when it is set. So for now, there is no
change except the stream value is used instead of the backend value. But
thanks to this change, it will be possible to dynamically change this value.
2024-09-30 16:55:53 +02:00
Christopher Faulet
0d91de2be4 MINOR: action: Export release_expr_int_action() release function
This function was only used by TCP actions and was private to tcp_act.c
file. However, it make sense to make it public to be used by any action
relying on an int-or-expression argument.
2024-09-30 16:55:53 +02:00
Willy Tarreau
7caf073faa MINOR: tools: do not attempt to use backtrace() on linux without glibc
The function is provided by glibc. Nothing prevents us from using our
own outside of glibc there (tested on aarch64 with musl). We still do
not enable it by default as we don't yet know if all archs work well,
but it's sufficient to pass USE_BACKTRACE=1 when building with musl to
verify it's OK.
2024-09-29 09:52:23 +02:00
Willy Tarreau
1c4776dbc3 BUILD: tools: only include execinfo.h for the real backtrace() function
No need to include this possibly non-existing file when using our own
backtrace() implementation, it's only needed for the libc-provided one.
Because of this it's currently not possible to build musl with backtrace
enabled.
2024-09-29 09:52:23 +02:00
Willy Tarreau
a4d04c649a BUG/MINOR: server: make sure the HMAINT state is part of MAINT
In 1.8 when adding "set server fqdn" with commit b418c1228c ("MINOR:
server: cli: Add server FQDNs to server-state file and stats socket."),
the HMAINT flag was not made part of the MAINT ones, so technically
speaking when changing the FQDN, the server is not completely considered
as in maintenance mode.

In its defense, the code location around that was completely messy, with
the aggregator flag being hidden between other values and purposely but
discretely ignoring one of the flags, so the comments were updated to
make the intent clearer (particularly regarding CMAINT which looked like
it was also forgotten while it was on purpose).

This can be backported anywhere.
2024-09-27 18:40:15 +02:00
Willy Tarreau
b8e3b0a18d BUG/MEDIUM: stream: make stream_shutdown() async-safe
The solution found in commit b500e84e24 ("BUG/MINOR: server: shut down
streams under thread isolation") to deal with inter-thread stream
shutdown doesn't work fine because there exists code paths involving
a server lock which can then deadlock on thread_isolate(). A better
solution then consists in deferring the shutdown to the stream itself
and just wake it up for that.

The only thing is that TASK_WOKEN_OTHER is a bit too generic and we
need to pass at least 2 types of events (SF_ERR_DOWN and SF_ERR_KILLED),
so we're now leveraging the new TASK_F_UEVT1 and _UEVT2 flags on the
task's state to convey these info. The caller only needs to wake the
task up with these flags set, and the stream handler will then finish
the job locally using stream_shutdown_self().

This needs to be carefully backported to all branches affected by the
dequeuing issue and containing any of the 5541d4995d ("BUG/MEDIUM:
queue: deal with a rare TOCTOU in assign_server_and_queue()"), and/or
b11495652e ("BUG/MEDIUM: queue: implement a flag to check for the
dequeuing").
2024-09-27 12:15:41 +02:00
Willy Tarreau
b5281283bb MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG
TASK_WOKEN_MSG only says "someone sent you a message" but doesn't convey
any info about the message. TASK_WOKEN_OTHER says "you're woken for another
reason" but doesn't tell which one. Most often they're used as-is by the
task handlers to report very specific situations.

For some important control notifications, having the ability to modulate
the message a little bit is useful, so let's define two user event types
UEVT1 and UEVT2 to be used in conjunction with TASK_WOKEN_MSG or _OTHER
so that the application can know that a specific condition was explicitly
requested. It will be used this way:

  task_wakeup(s->task, TASK_WOKEN_MSG | TASK_F_UEVT1);
or:
  task_wakeup(s->task, TASK_WOKEN_OTHER | TASK_F_UEVT2);

Since events are cumulative, keep in mind not to consider a 3rd value
as the combination of EVT1+EVT2; these really mean that the two events
appeared (though in unspecified order).
2024-09-27 11:56:10 +02:00
Aurelien DARRAGON
4189eb7aca MINOR: log: add log_orig_proxy() helper function
Function may be used on proxy where log-steps are used to check if a given
log origin should be handled or not.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
c043d5d372 MINOR: log: introduce "log-steps" proxy keyword
For now it is only available for proxies with frontend capability because
log-steps are only evaluated under sess_log() or strm_log() which
essentially focus on the frontend side when it comes to log settings so
it's better to keep it this way for better consistency, at least for now.

For now the setting does nothing (it is not considered during runtime),
it will be implemented and documented in upcoming commits.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
9341792baf MINOR: proxy: add log_steps struct member
add proxy->conf.log_steps eb32 root tree which will be used to store the
log origin identifiers that should result in haproxy emitting a log as
configured by the user using upcoming "log-steps" proxy keyword.

It was chosen to use eb32 tree instead of simple bitfield because despite
the slight overhead it is more future-proof given that we already
implemented the prerequisites for seamless custom log origins registration
that will also be usable from "log-steps" proxy keyword.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
b882402a29 MINOR: log: support extra log origins for '%OG' alias
Following previous commits, let's improve log_orig_to_str() so that
extra log origins (registered through log_orig_register()) can be
translated to string from origin ID.

For that, it is required to add eb_32 tree node to log_origin struct in
order to enable quick integer lookup during runtime. Slow name lookup
using the list is acceptable for config parsing, but it is not the case
during runtime when log_orig_to_str() is expected to be used. Also, to
prevent duplicated info, get rid of ->id field and use ->tree.key instead
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
f8bb9d5c57 MINOR: log: explicitly handle extra log origins as error when relevant
Thanks to previous commit, we can know check for log_orig optional flags
in functions taking struct log_orig as parameter. Let's take this
opportunity to add the LOG_ORIG_FL_ERROR flag and check this flag at a
few places to handle the log message differently because if the flag is
set then the caller expects the log to be handled as an error explicitly.

e.g.: in _process_send_log_override(), if the flag is set, use the error
log format instead of the dedicated one.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
3c15ee05e9 MINOR: log: introduce log_orig flags
Rename 'enum log_orig' to 'enum log_orig_id', since this enum specifically
contains the log origin ids.

Add 'struct log_orig' which wraps 'enum log_orig' with optional flags
(no flags defined for now).

Add log_orig() helper func that takes id and flags as parameter and
returns log_orig struct initialized with input arguments.

Update functions taking log origin as parameter so they explicitly take
log orig id or log orig wrapper as argument depending on the level of
context expected by the function.
2024-09-26 16:53:07 +02:00
Aurelien DARRAGON
818475c5cc MINOR: log: introduce extra log profile steps
add a way to register additional log origins using log_origin_register()
that may be used as log profile steps from log profile sections.

For now this does nothing as no extra origins are registered and extra log
origins are not yet considered for runtime logging paths.

When specifying an extra logging step for on <step> under log-profile
section, the logging step is stored within a binary tree for efficient
lookup during runtime. No performance impact should be expected if extra
log origins are not being used, and slight performance impact if extra
log origins are used.

Don't forget to update the documentation when new log origins are added
(both %OG log alias and on <step> log-profile keyword are concerned.
2024-09-26 16:53:07 +02:00
Oliver Dala
a889413f5e BUG/MEDIUM: cli: Deadlock when setting frontend maxconn
The proxy lock state isn't passed down to relax_listener
through dequeue_proxy_listeners, which causes a deadlock
in relax_listener when it tries to get that lock.

Backporting: Older versions didn't have relax_listener and directly called
resume_listener in dequeue_proxy_listeners. lpx should just be passed directly
to resume_listener then.

The bug was introduced in commit 001328873c352e5e4b1df0dcc8facaf2fc1408aa

[cf: This patch should fix the issue #2726. It must be backported as far as
2.4]
2024-09-25 17:12:11 +02:00
Christopher Faulet
96edacc546 DEV: flags/applet: decode appctx flags
Decode APPCTX flags via appctx_show_flags() function.
2024-09-24 18:26:36 +02:00