Commit Graph

16085 Commits

Author SHA1 Message Date
Aurelien DARRAGON
bd8a94a759 BUG/MINOR: hlua_fcn/queue: fix broken pop_wait()
queue:pop_wait() was broken during late refactor prior to merge.
(Due to small modifications to ensure that pop() returns nil on empty
queue instead of nothing)

Because of this, pop_wait() currently behaves exactly as pop(), resulting
in 100% active CPU when used in a while loop.

Indeed, _hlua_queue_pop() should explicitly return 0 when the queue is
empty since pop_wait logic relies on this and the pushnil should be
handled directly in queue:pop() function instead.

Adding some comments as well to document this.
2023-05-11 09:23:14 +02:00
Christopher Faulet
0fda8d2c8e BUG/MEDIUM: filters: Don't deinit filters for disabled proxies during startup
During the startup stage, if a proxy was disabled in config, all filters
were released and removed. But it may be an issue if some info are shared
between filters of the same type. Resources may be released too early.

It happens with ACLs defined in SPOE configurations. Pattern expressions can
be shared between filters. To fix the issue, filters for disabled proxies
are no longer released during the startup stage but only when HAProxy is
stopped.

This commit depends on the previous one ("MINOR: spoe: Don't stop disabled
proxies"). Both must be backported to all stable versions.
2023-05-11 09:22:46 +02:00
Christopher Faulet
7f4ffad46e MINOR: spoe: Don't stop disabled proxies
SPOE register a signal handler to be able to stop SPOE applets ASAP during
soft-stop. Disabled proxies must be ignored at this staged because they are
not fully configured.

For now, it is useless but this change is mandatory to fix a bug.
2023-05-11 09:22:46 +02:00
Christopher Faulet
16e314150a BUILD: mjson: Fix warning about unused variables
clang 15 reports unused variables in src/mjson.c:

  src/mjson.c:196:21: fatal error: expected ';' at end of declaration
    int __maybe_unused n = 0;

and

  src/mjson.c:727:17: fatal error: variable 'n' set but not used [-Wunused-but-set-variable]
    int sign = 1, n = 0;

An issue was created on the project, but it was not fixed for now:

   https://github.com/cesanta/mjson/issues/51

So for now, to fix the build issue, these variables are declared as unused.
Of course, if there is any update on this library, be careful to review this
patch first to be sure it is always required.

This patch should fix the issue #1868. It be backported as far as 2.4.
2023-05-11 09:22:46 +02:00
Ilya Shipitsin
83f54b9aef CLEANUP: src/listener.c: remove redundant NULL check
fixes #2031

quoting Willy Tarreau:

"Originally the listeners were intended to work without a bind_conf
(e.g. for FTP processing) hence these tests, but over time the
bind_conf has become omnipresent"
2023-05-11 05:30:03 +02:00
Christopher Faulet
bd90a16564 MEDIUM: stream: Resync analyzers at the end of process_stream() on change
At the end of process_stream(), if there was any change on request/response
analyzers, we now trigger a resync. It is performed if any analyzer is added
but also removed. It should help to catch internal changes on a stream and
eventually avoid it to be frozen.

There is no reason to backport this patch. But it may be good to keep an eye
on it, just in case.
2023-05-10 16:45:36 +02:00
Christopher Faulet
b1368adcc7 BUG/MEDIUM: stream: Forward shutdowns when unhandled errors are caught
In process_stream(), after request and response analyzers evaluation,
unhandled errors are processed, if any. In this case, depending on the case,
remaining request or response analyzers may be removed, unlesse the last one
about end of filters. However, auto-close is not reenabled in same
time. Thus it is possible to not forward the shutdown for a side to the
other one while no analyzer is there to do so or at least to make evolved
the situation.

In theory, it is thus possible to freeze a stream if no wakeup happens. And
it seems possible because it explain a freeze we've oberseved.

This patch could be backported to every stable versions but only after a
period of observation and if it may match an unexplained bug. It should not
lead to any loop but at worst and eventually to truncated messages.
2023-05-10 16:45:36 +02:00
Willy Tarreau
862588a4b5 BUG/MINOR: config: make compression work again in defaults section
When commit ead43fe4f ("MEDIUM: compression: Make it so we can compress
requests as well.") added the test for the direction flags to select the
compression, it implicitly broke compression defined in defaults sections
because the flags from the default proxy were not recopied, hence the
compression was enabled but in no direction.

No backport is needed, that's 2.8 only.
2023-05-10 16:41:21 +02:00
Frédéric Lécaille
b971696296 BUG/MINOR: quic: Possible crash when dumping version information
->others member of tp_version_information structure pointed to a buffer in the
TLS stack used to parse the transport parameters. There is no garantee that this
buffer is available until the connection is released.

Do not dump the available versions selected by the client anymore, but displayed the
chosen one (selected by the client for this connection) and the negotiated one.

Must be backported to 2.7 and 2.6.
2023-05-10 13:26:37 +02:00
Amaury Denoyelle
da24bcfad3 BUG/MEDIUM: mux-quic: wakeup tasklet to close on error
A recent series of commit have been introduced to rework error
generation on QUIC MUX side. Now, all MUX/APP functions uses
qcc_set_error() to set the flag QC_CF_ERRL on error. Then, this flag is
converted to QC_CF_ERRL_DONE with a CONNECTION_CLOSE emission by
qc_send().

This has the advantage of centralizing the CONNECTION_CLOSE generation
in one place and reduces the link between MUX and quic-conn layer.
However, we must now ensure that every qcc_set_error() call is followed
by a QUIC MUX tasklet to invoke qc_send(). This was not the case, thus
when there is no active transfer, no CONNECTION_CLOSE frame is emitted
and the connection remains opened.

To fix this, add a tasklet_wakeup() directly in qcc_set_error(). This is
a brute force solution as this may be unneeded when already in the MUX
tasklet context. However, it is the simplest solution as it is too
tedious for the moment to list all qcc_set_error() invocation outside of
the tasklet.

This must be backported up to 2.7.
2023-05-09 18:42:34 +02:00
Amaury Denoyelle
58721f2192 BUG/MINOR: mux-quic: fix transport VS app CONNECTION_CLOSE
A recent series of patch were introduced to streamline error generation
by QUIC MUX. However, a regression was introduced : every error
generated by the MUX was built as CONNECTION_CLOSE_APP frame, whereas it
should be only for H3/QPACK errors.

Fix this by adding an argument <app> in qcc_set_error. When false, a
standard CONNECTION_CLOSE is used as error.

This bug was detected by QUIC tracker with the following tests
"stop_sending" and "server_flow_control" which requires a
CONNECTION_CLOSE frame.

This must be backported up to 2.7.
2023-05-09 18:42:34 +02:00
Christopher Faulet
a236c58223 BUG/MEDIUM: stats: Require more room if buffer is almost full
This was lost with commit f4258bdf3 ("MINOR: stats: Use the applet API to
write data"). When the buffer is almost full, the stats applet gives up.
When this happens, the applet must require more room. Otherwise, data in the
channel buffer are sent to the client but the applet is not woken up in
return.

It is a 2.8-specific bug, no backport needed.
2023-05-09 16:36:45 +02:00
William Lallemand
930afdf614 BUILD: ssl: buggy -Werror=dangling-pointer since gcc 13.0
GCC complains about swapping 2 heads list, one local and one global.

    gcc -Iinclude  -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment -Werror     -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS -DUSE_EPOLL  -DUSE_NETFILTER -DUSE_POLL -DUSE_THREAD  -DUSE_BACKTRACE -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H  -DUSE_GETADDRINFO -DUSE_OPENSSL  -DUSE_SSL -DUSE_LUA -DUSE_ACCEPT4  -DUSE_ZLIB  -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT  -DUSE_MATH    -DUSE_SYSTEMD  -DUSE_PRCTL  -DUSE_THREAD_DUMP   -DUSE_QUIC   -DUSE_SHM_OPEN   -DUSE_PCRE -DUSE_PCRE_JIT   -I/github/home/opt/include -I/usr/include  -DCONFIG_HAPROXY_VERSION=\"2.8-dev8-7d23e8d1a6db\" -DCONFIG_HAPROXY_DATE=\"2023/04/24\" -c -o src/ssl_sample.o src/ssl_sample.c
    In file included from include/haproxy/pool.h:29,
                     from include/haproxy/chunk.h:31,
                     from include/haproxy/dynbuf.h:33,
                     from include/haproxy/channel.h:27,
                     from include/haproxy/applet.h:29,
                     from src/ssl_sock.c:47:
    src/ssl_sock.c: In function 'tlskeys_finalize_config':
    include/haproxy/list.h:48:88: error: storing the address of local variable 'tkr' in 'tlskeys_reference.p' [-Werror=dangling-pointer=]
       48 | #define LIST_INSERT(lh, el) ({ (el)->n = (lh)->n; (el)->n->p = (lh)->n = (el); (el)->p = (lh); (el); })
          |                                                                                ~~~~~~~~^~~~~~
    src/ssl_sock.c:1086:9: note: in expansion of macro 'LIST_INSERT'
     1086 |         LIST_INSERT(&tkr, &tlskeys_reference);
          |         ^~~~~~~~~~~
    compilation terminated due to -Wfatal-errors.

This appears with gcc 13.0.

The fix uses LIST_SPLICE() instead of inserting the head of the local
list in the global list.

Should fix issue #2136 .
2023-05-09 14:25:10 +02:00
Christopher Faulet
d6f0557deb BUG/MEDIUM: cache: Don't request more room than the max allowed
Since a recent change on the SC API, a producer must specify the amount of
free space it needs to progress when it is blocked. But, it must take care
to never exceed the maximum size allowed in the buffer. Otherwise, the
stream is freezed because it cannot reach the condition to unblock the
producer.

In this context, there is a bug in the cache applet when it fails to dump a
message. It may request more space than allowed. It happens when the cached
object is too big.

It is a 2.8-specific bug. No backport needed.
2023-05-09 11:53:28 +02:00
Frédéric Lécaille
7a01ff7921 BUG/MINOR: quic: Wrong key update cipher context initialization for encryption
As noticed by Miroslav, there was a typo in quic_tls_key_update() which lead
a cipher context for decryption to be initialized and used in place of a cipher
context for encryption. Surprisingly, this did not prevent the key update
from working. Perhaps this is due to the fact that the underlying cryptographic
algorithms used by QUIC are all symetric algorithms.

Also modify incorrect traces.

Must be backported in 2.6 and 2.7.
2023-05-09 11:03:26 +02:00
Frédéric Lécaille
a94612522d CLEANUP: quic: Typo fix for quic_connection_id pool
Remove a "n" extra letter.

Should be backported to 2.7.
2023-05-09 10:48:40 +02:00
Frédéric Lécaille
1bc6e318f0 CLEANUP: quic: Rename several <buf> variables in quic_frame.(c|h)
Most of the function in quic_frame.c and quic_frame.h manipulate <buf> buffer
position variables which have nothing to see with struct buffer variables.
Rename them to <pos>

Should be backported to 2.7.
2023-05-09 10:48:40 +02:00
Willy Tarreau
95e6c9999a BUILD: debug: do not check the isolated_thread variable in non-threaded builds
The build without thread support was broken by commit b30ced3d8 ("BUG/MINOR:
debug: fix incorrect profiling status reporting in show threads") because
it accesses the isolated_thread variable that is not defined when threads
are disabled. In fact both the test on harmless and this one make no sense
without threads, so let's comment out the block and mark the related
variables as unused.

This may have to be backported to 2.7 if the commit above is.
2023-05-07 15:02:30 +02:00
Willy Tarreau
dd9f921b3a CLEANUP: fix a few reported typos in code comments
These are only the few relevant changes among those reported here:

  https://github.com/haproxy/haproxy/actions/runs/4856148287/jobs/8655397661
2023-05-07 07:07:44 +02:00
Willy Tarreau
615c301db4 MINOR: config: allow cpu-map to take commas in lists of ranges
The function that cpu-map uses to parse CPU sets, parse_cpu_set(), was
etended in 2.4 with commit a80823543 ("MINOR: cfgparse: support the
comma separator on parse_cpu_set") to support commas between ranges.
But since it was quite late in the development cycle, by then it was
decided not to add a last-minute surprise and not to magically support
commas in cpu-map, hence the "comma_allowed" argument.

Since then we know that it was not the best choice, because the comma
is silently ignored in the cpu-map syntax, causing all sorts of
surprises in field with threads running on a single node for example.
In addition it's quite common to copy-paste a taskset line and put it
directly into the haproxy configuration.

This commit relaxes this rule an finally allows cpu-map to support
commas between ranges. It simply consists in removing the comma_allowed
argument in the parse_cpu_set() function. The doc was updated to
reflect this.
2023-05-05 18:41:52 +02:00
Amaury Denoyelle
2273af11e0 MINOR: quic: implement oneline format for "show quic"
Add a new output format "oneline" for "show quic" command. This prints
one connection per line with minimal information. The objective is to
have an equivalent of the netstat/ss tools with just enough information
to quickly find connection which are misbehaving.

A legend is printed on the first line to describe the field columns
starting with a dash character.

This should be backported up to 2.7.
2023-05-05 18:08:37 +02:00
Amaury Denoyelle
bc1f5fed72 MINOR: quic: add format argument for "show quic"
Add an extra optional argument for "show quic" to specify desired output
format. Its objective is to control the verbosity per connections. For
the moment, only "full" is supported, which is the already implemented
output with maximum information.

This should be backported up to 2.7.
2023-05-05 18:06:51 +02:00
Aurelien DARRAGON
86fb22c557 MINOR: hlua_fcn: add Queue class
Adding a new lua class: Queue.

This class provides a generic FIFO storage mechanism that may be shared
between multiple lua contexts to easily pass data between them, as stock
Lua doesn't provide easy methods for passing data between multiple
coroutines.

New Queue object may be obtained using core.queue()
(it works like core.concat() for a concat Class)

Lua documentation was updated (including some usage examples)
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
40cd44f52c MINOR: hlua: declare hlua_gethlua() function
Declaring hlua_gethlua() function to make it usable from hlua_fcn.c.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
e0b16355ce CLEANUP: hlua: hlua_register_task() may longjmp
Adding __LJMP prefix to hlua_register_task() to indicate that the
function may longjmp when executed.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
977688bd57 MINOR: server: fix message report when IDRAIN is set and MAINT is cleared
Remaining in drain mode after removing one of server admins flags leads
to this message being generated:

"Server name/backend is leaving forced drain but remains in drain mode."

However this is not necessarily true: the server might just be leaving
MAINT with the IDRAIN flag set, so the report is incorrect in this case.
(FDRAIN was not set so it cannot be cleared)

To prevent confusion around this message and to comply with the code
comment above it: we remove the "leaving forced drain" precision to
make the report suitable for multiple transitions.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
a2c5321045 BUG/MINOR: hlua: spinning loop in hlua_socket_handler()
Since 3157222 ("MEDIUM: hlua/applet: Use the sedesc to report and detect
end of processing"), hlua_socket_handler() might spin loop if the hlua
socket is destroyed and some data was left unconsumed in the applet.

Prior to the above commit, the stream was explicitly KILLED
(when ctx->die == 1) so the app couldn't spinloop on unconsumed data.
But since the refactor this is no longer the case.

To prevent unconsumed data from waking the applet indefinitely, we consume
pending data when either one of EOS|ERROR|SHR|SHW flags are set, as it is
done everywhere else this check is performed in the code. Hence it was
probably overlooked in the first place during the refacto.

This bug is 2.8 specific only, so no backport needed.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
717a38d135 MINOR: hlua: expose proxy mailers
Proxy mailers, which are configured using "email-alert" directives
in proxy sections from the configuration, are now being exposed
directly in lua thanks to the proxy:get_mailers() method which
returns a class containing the various mailers settings if email
alerts are configured for the given proxy (else nil is returned).

Both the class and the proxy method were marked as LEGACY since this
feature relies on mailers configuration, and the goal here is to provide
the last missing bits of information to lua scripts in order to make
them capable of sending email alerts instead of relying on the soon-to-
be deprecated mailers implementation based on checks (see src/mailers.c)

Lua documentation was updated accordingly.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
5bed48fec8 MINOR: mailers/hlua: disable email sending from lua
Exposing a new hlua function, available from body or init contexts, that
forcefully disables the sending of email alerts even if the mailers are
defined in haproxy configuration.

This will help for sending email directly from lua.
(prevent legacy email sending from intefering with lua)
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
0bd53b2152 MINOR: hlua/event_hdl: expose SERVER_CHECK event
Exposing SERVER_CHECK event through the lua API.

New lua class named ServerEventCheck was added to provide additional
data for SERVER_CHECK event.

Lua documentation was updated accordingly.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
dcbc2d2cac MINOR: checks/event_hdl: SERVER_CHECK event
Adding a new event type: SERVER_CHECK.

This event is published when a server's check state ought to be reported.
(check status change or check result)

SERVER_CHECK event is provided as a server event with additional data
carrying relevant check's context such as check's result and health.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
948dd3ddfb MINOR: hlua: expose SERVER_ADMIN event
Exposing SERVER_ADMIN event in lua and updating the documentation.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
a163d65254 MINOR: server/event_hdl: add SERVER_ADMIN event
Adding a new SERVER event in the event_hdl API.

SERVER_ADMIN is implemented as an advanced server event.
It is published each time the administrative state changes.
(when s->cur_admin changes)

SERVER_ADMIN data is an event_hdl_cb_data_server_admin struct that
provides additional info related to the admin state change, but can
be casted as a regular event_hdl_cb_data_server struct if additional
info is not needed.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
c99f3adf10 MINOR: hlua: expose SERVER_STATE event
Exposing SERVER_STATE event in lua and updating the documentation.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
c249f6d964 OPTIM: server: publish UP/DOWN events from STATE change
Reuse cb_data from STATE event to publish UP and DOWN events.
This saves some CPU time since the event is only constructed
once to publish STATE, STATE+UP or STATE+DOWN depending on the
state change.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
e3eea29f48 MINOR: server/event_hdl: add SERVER_STATE event
Adding a new SERVER event in the event_hdl API.

SERVER_STATE is implemented as an advanced server event.
It is published each time the server's effective state changes.
(when s->cur_state changes)

SERVER_STATE data is an event_hdl_cb_data_server_state struct that
provides additional info related to the server state change, but can
be casted as a regular event_hdl_cb_data_server struct if additional
info is not needed.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
306a5fc987 MINOR: server/event_hdl: publish macro helper
add a macro helper to help publish server events to global and
per-server subscription list at once since all server events
support both subscription modes.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
fc84553df8 MINOR: hlua_fcn: add Proxy.get_srv_act() and Proxy.get_srv_bck()
Proxy.get_srv_act: number of active servers that are eligible for LB
Proxy.get_srv_bck: number of backup servers that are eligible for LB
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
fc759b4ac2 MINOR: hlua_fcn: add Server.get_pend_conn() and Server.get_cur_sess()
Server.get_pend_conn: number of pending connections to the server
Server.get_cur_sess: number of current sessions handled by the server

Lua documentation was updated accordingly.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
3889efa8e4 MINOR: hlua_fcn: add Server.get_proxy()
Server.get_proxy(): get the proxy to which the server belongs
(or nil if not available)
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
4be36a1337 MINOR: hlua_fcn: add Server.get_trackers()
This function returns an array of servers who are currently tracking
the server.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
406511a2df MINOR: hlua_fcn: add Server.tracking()
This function returns the currently tracked server, if any.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
7a03dee36f MINOR: hlua_fcn: add Server.is_dynamic()
This function returns true if the current server is dynamic,
meaning that it was instantiated at runtime (ie: from the cli)
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
c72051d53a MINOR: hlua_fcn: add Server.is_backup()
This function returns true if the current server is a backup server.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
862a0fe75a MINOR: hlua_fcn: fix Server.is_draining() return type
Adjusting Server.is_draining() return type from integer to boolean
to comply with the documentation.
2023-05-05 16:28:32 +02:00
Christopher Faulet
e7405d4124 MEDIUM: stconn: Check room needed to unblock opposite SC when data was sent
After a sending attempt, we check the opposite SC to see if it is waiting
for a minimum free space to receive more data. If the condition is
respected, it is unblocked. 0 is special case where the SC is
unconditionally unblocked.
2023-05-05 15:44:23 +02:00
Christopher Faulet
18b3309f38 MEDIUM: stconn: Check room needed to unblock SC on fast-forward
During fast-forward, if the SC is waiting for a minimum free space to
receive more data and some data was sent, it is only unblock is the
condition is respected. 0 is special case where the SC is unconditionally
unblocked.
2023-05-05 15:44:23 +02:00
Christopher Faulet
c184b11b1a MEDIUM: applet: Check room needed to unblock opposite SC when data was consumed
If the opposite SC is waiting for a minimum free space to receive more data,
it is only unblock is the condition is respected. 0 is a special cases where
the opposite SC is always unblocked.
2023-05-05 15:44:23 +02:00
Christopher Faulet
fab82bfd55 BUG/MEDIUM: stconn: Unblock SC from stream if there is enough room to progrees
At the end of process_stream(), in sc_update_rx(), the SC is now unblocked
if it was waiting for room and the free space in the input buffer is large
enough. This patch should fix an issue with the compression filter that can
leave the channel's buffer empty while the endpoint is waiting for room to
progress. Indeed, in this case, because the buffer is empty, there is no
send attempt and no other way to unblock the SE.

This commit depends on following commits:

  * MEDIUM: tree-wide: Change sc API to specify required free space to progress
  * MINOR: stconn: Add a field to specify the room needed by the SC to progress
  * MINOR: peers: Use the applet API to send message
  * MINOR: stats: Use the applet API to write data
  * MINOR: cli: Use applet API to write output message

It should fix a regression introduced with the commit 341a5783b
("BUG/MEDIUM: stconn: stop to enable/disable reads from streams via
si_update_rx").

It must be backported iff the commit above is also backported. It was not
backported yet and it is thus probably a good idea to not do so to avoid to
backport too many change..
2023-05-05 15:44:23 +02:00
Christopher Faulet
7b3d38a633 MEDIUM: tree-wide: Change sc API to specify required free space to progress
sc_need_room() now takes the required free space to receive more data as
parameter. All calls to this function are updated accordingly. For now, this
value is set but not used. When we are waiting for a buffer, 0 is used. So
we expect to be unblocked ASAP. However this must be reviewed because
SC_FL_NEED_BUF is probably enough in this case and this flag is already set
if the input buffer allocation fails.
2023-05-05 15:44:23 +02:00
Christopher Faulet
9aed1124ed MINOR: stconn: Add a field to specify the room needed by the SC to progress
When the SC is blocked because it is waiting for room in the input buffer,
it will be responsible to specify the minimum free space required to
progress. In this commit, we only introduce the field in the stconn
structure that will be used to store this value. It is a signed value with
the following meaning:

  * -1: The SC is waiting for room but not based on the buffer state. It
        will be typically used during splicing when the pipe is full. In
        this case, only a successful send can unblock the SC.

  * >= 0; The minimum free space in the input buffer to unblock the SC. 0 is
          a special value to specify the SC must be unblocked ASAP, by the
          stream, at the end of process_stream() or when output data are
          consumed on the opposite side.
2023-05-05 15:41:30 +02:00
Christopher Faulet
7a48b72d39 MINOR: peers: Use the applet API to send message
The peers applet now use the applet API to send message instead of the
channel API. This way, it does not need to take care to request more room if
it fails to put data into the channel's buffer.
2023-05-05 15:41:30 +02:00
Christopher Faulet
f4258bdf3b MINOR: stats: Use the applet API to write data
stats_putchk() is updated to use the applet API instead of the channel API
to write data. To do so, the appctx is passed as parameter instead of the
channel. This way, the applet does not need to take care to request more
room it it fails to put data into the channel's buffer.
2023-05-05 15:41:29 +02:00
Christopher Faulet
e8ee27b0fd MINOR: cli: Use applet API to write output message
Instead of using the channel API to to write output message from the CLI
applet, we use the applet API. This way, the applet does not need to take
care to request more room it it fails to put its message into the channel's
buffer.
2023-05-05 15:41:19 +02:00
William Lallemand
b6ae2aafde MINOR: ssl: allow to change the signature algorithm for client authentication
This commit introduces the keyword "client-sigalgs" for the bind line,
which does the same as "sigalgs" but for the client authentication.

"ssl-default-bind-client-sigalgs" allows to set the default parameter
for all the bind lines.

This patch should fix issue #2081.
2023-05-05 00:05:46 +02:00
William Lallemand
1d3c822300 MINOR: ssl: allow to change the server signature algorithm
This patch introduces the "sigalgs" keyword for the bind line, which
allows to configure the list of server signature algorithms negociated
during the handshake. Also available as "ssl-default-bind-sigalgs" in
the default section.

This patch was originally written by Bruno Henc.
2023-05-04 22:43:18 +02:00
Willy Tarreau
e69919d1ba CLEANUP: debug: remove the now unused ha_thread_dump_all_to_trash()
The function isn't used anymore since each call place performs its
own loop. Let's get rid of it.
2023-05-04 19:19:04 +02:00
Willy Tarreau
009b5519e6 MINOR: debug: make "show threads" properly iterate over all threads
Previously it would re-dump all threads to the same trash if  the output
buffer was full, which it never was since the trash is of the same size.
Now it dumps one thread, copies it to the buffer and yields until it can
continue. Showing 256 threads works as expected.
2023-05-04 19:15:50 +02:00
Willy Tarreau
880d1684a7 MINOR: debug: write panic dump to stderr one thread at a time
Currently large setups cannot dump all their threads because they're
first dumped to the trash buffer, then copied to stderr. Here we can
now change this, instead we dump one thread at a time into the trash
and immediately send it to stderr. We also keep a copy into a local
trash chunk that's assigned to thread_dump_buffer so that a core file
still contains a copy of a large number of threads, which is generally
sufficient for the vast majority of situations.

It was verified that dumping 256 threads now produces ~55kB of output
and all of them are properly dumped.
2023-05-04 19:15:50 +02:00
Willy Tarreau
9a6ecbd590 MEDIUM: debug: simplify the thread dump mechanism
The thread dump mechanism that is used by "show threads" and by the
panic dump is overly complicated due to an initial misdesign. It
firsts wakes all threads, then serializes their dumps, then releases
them, while taking extreme care not to face colliding dumps. In fact
this is not what we need and it reached a limit where big machines
cannot dump all their threads anymore due to buffer size limitations.

What is needed instead is to be able to dump *one* thread, and to let
the requester iterate on all threads.

That's what this patch does. It adds the thread_dump_buffer to the
struct thread_ctx so that the requester offers the buffer to the
thread that is about to be dumped. This buffer also serves as a lock.
A thread at rest has a NULL, a valid pointer indicates the thread is
using it, and 0x1 (NULL+1) is used by the dumped thread to tell the
requester it's done. This makes sure that a given thread is dumped
once at a time. In addition to this, the calling thread decides
whether it accesses the thread by itself or via the debug signal
handler, in order to get a backtrace. This is much saner because the
calling thread is free to do whatever it wants with the buffer after
each thread is dumped, and there is no dependency between threads,
once they've dumped, they're free to continue (and possibly to dump
for another requester if needed). Finally, when the THREAD_DUMP
feature is disabled and the debug signal is not used, the requester
accesses the thread by itself like before.

For now we still have the buffer size limitation but it will be
addressed in future patches.
2023-05-04 19:15:44 +02:00
Christopher Faulet
34f81d5815 BUG/MINOR: mux-h2: Also expect data when waiting for a tunnel establishment
When a client H2 stream is waiting for a tunnel establishment, it must state
it expects data from server. It is the second fix that should fix
regressions of the commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from
server as long as request is unfinished")

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:58:33 +02:00
Willy Tarreau
cb01f5daa7 BUG/MINOR: debug: do not emit empty lines in thread dumps
In 2.3, commit 471425f51 ("BUG/MINOR: debug: Don't dump the lua stack
if it is not initialized") introduced the possibility to emit an empty
line when there's no Lua info to dump. The problem is that doing this
on the CLI in "show threads" marks the end of the output, and it may
affect some external tools. We need to make sure that LFs are only
emitted if there's something on the line and that all lines properly
start with the prefix.

This may be backported as far as 2.0 since the commit above was
backported there.
2023-05-04 16:51:50 +02:00
Amaury Denoyelle
d4af04198b MINOR: mux-quic: close connection asap on local error
With the change for QUIC MUX local error API, the new flag QC_CF_ERRL is
now checked on qc_detach(). If set, qcs instance is freed even though
transfer is not finished. This should help to quickly release qcs and
eventually all MUX instance resources.

To further accelerate this, a specific check has been added in
qc_shutw(). It is skipped if local error flag is set to prevent noisy
reset stream invocation. In the same way, QUIC MUX is not rescheduled on
qc_recv_buf() operation if local error flag set.

This should be backported up to 2.7.
2023-05-04 16:36:51 +02:00
Amaury Denoyelle
35542ce7bf MINOR: mux-quic: report local error on stream endpoint asap
If an error a detected at the MUX layer, all remaining stream endpoints
should be closed asap with error set. This is now done by checking for
QC_CF_ERRL flag on qc_wake_some_streams() and qc_send_buf(). To complete
this, qc_wake_some_streams() is called by qc_process() if needed.

This should help to quickly release streams as soon as a new error is
detected locally by the MUX or APP layer. This allows to in turn free
the MUX instance itself. Previously, error would not have been
automatically reported until the transport layer closure would occur
on CONNECTION_CLOSE emission.

This should be backported up to 2.7.
2023-05-04 16:36:51 +02:00
Amaury Denoyelle
51f116d65e MINOR: mux-quic: adjust local error API
When a fatal error is detected by the QUIC MUX or H3 layer, the
connection should be closed with a CONNECTION_CLOSE with an error code
as the reason.

Previously, a direct call was used to the quic_conn layer to try to
close the connection. This API was adjusted to be more flexible. Now,
when an error is detected, the function qcc_set_error() is called. This
set the flag QC_CF_ERRL with the error code stored by the MUX. The
connection will be closed soon so most of the operations are not
conducted anymore. Connection is then finally closed during qc_send()
via quic_conn layer if QC_CF_ERRL is set. This will set the flag
QC_CF_ERRL_DONE which indicates that the MUX instance can be freed.

This model is cleaner and brings the following improvments :
- interaction with quic_conn layer for closure is centralized on a
  single function
- CO_FL_ERROR is not set anymore. This was incorrect as this should be
  reserved to errors reported by the transport layer to be similar with
  other haproxy components. As a consequence, qcc_is_dead() has been
  adjusted to check for QC_CF_ERRL_DONE to release the MUX instance.

This should be backported up to 2.7.
2023-05-04 16:36:51 +02:00
Amaury Denoyelle
b8901d2c86 MINOR: mux-quic: wake up after recv only if avail data
When HTX content is transferred from qcs instance to upper stream
endpoint, a wakeup is conducted for MUX tasklet. However, this is only
necessary if demux was interrupted due to a full QCS HTX buffer.

This should be backported up to 2.7.
2023-05-04 16:36:51 +02:00
Amaury Denoyelle
8d44bfaf0b MINOR: mux-quic: add trace event for local error
Add a dedicated trace event QMUX_EV_QCC_ERR. This is used for locally
detected error when a CONNECTION_CLOSE should be emitted.

This should be backported up to 2.7.
2023-05-04 16:36:51 +02:00
Amaury Denoyelle
b737f95009 BUG/MINOR: mux-quic: prevent quic_conn error code to be overwritten
When MUX performs a graceful shutdown, quic_conn error code is set to a
"no error" code which depends on the application layer used. However,
this may overwrite a previous error code if quic_conn layer has detected
an error on its side.

In practice, this behavior has not been seen on production. In fact, it
may have undesirable effect only if this error code modification happens
between the quic_conn error detection and the emission of the
CONNECTION_CLOSE, so it should be pretty rare. However, there is still a
tiny possibility it may happen.

To prevent this, first check that quic_conn error code is not set before
setting it. Ideally, transport layer API should be adjusted to be able
to set this without fiddling with the quic_conn directly.

This should be backported up to 2.6.
2023-05-04 16:36:51 +02:00
Christopher Faulet
4403cdf653 BUG/MEDIUM: mux-h2: Properly handle end of request to expect data from server
The commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from server as long
as request is unfinished") introduced a regression in the H2 multiplexer.
The end of the request is not systematically handled to state a H2 stream on
client side now expexts data from the server.

Indeed, while the client is uploading its request, the H2 stream warns it
does not expect data from the server. This way, no server timeout is applied
at this stage. When end of the request is detected, the H2 stream must state
it now expects the server response. This enables the server timeout.

However, it was only performed at one place while the end of the request can
be handled at different places. First, during a zero-copy in
h2_rcv_buf(). Then, when the SC is created with the full request. Because of
this bug, it is possible to totally disable the server timeout for H2
streams.

In h2_rcv_buf(), we now rely on h2s flags to detect the end of the request,
but only when the rxbuf was emptied.

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:29:27 +02:00
Willy Tarreau
e5e62231d8 MINOR: debug: permit the "debug dev loop" to run under isolation
Sometimes it's convenient to test the effect of tasks running under
isolation, e.g. to validate the contents of the crash dumps. Let's
add an optional "isolated" keyword to "debug dev loop" for this.
2023-05-04 11:50:26 +02:00
Willy Tarreau
b30ced3d88 BUG/MINOR: debug: fix incorrect profiling status reporting in show threads
Thread dumps include a field "prof" for each thread that reports whether
task profiling is currently active or not. It turns out that in 2.7-dev1,
commit 680ed5f28 ("MINOR: task: move profiling bit to per-thread")
mistakenly replaced it with a check for the current thread's bit in the
thread dumps, which basically is the only place where another thread is
being watched. The same mistake was done a few lines later by confusing
threads_want_rdv_mask with the profiling mask. This mask disappeared
in 2.7-dev2 with commit 598cf3f22 ("MAJOR: threads: change thread_isolate
to support inter-group synchronization"), though instead we know the ID
of the isolated thread. This commit fixes this and now reports "isolated"
instead of "wantrdv".

This can be backported to 2.7.
2023-05-04 11:41:33 +02:00
Willy Tarreau
8b3e39e37b MINOR: activity: allow "show activity" to restart in the middle of a line
16kB buffers are not enough to dump 4096 threads with up to 10 bytes value
on each line. By storing the column number in the applet's context, we can
now restart from the last attempted column. This requires to dump all values
as they are produced, but it doesn't cost that much: a 4096-thread output
from a fesh process produces 300kB of output in ~8ms, or ~400us per call
(19*16kB), most of which are spent in vfprintf(). Given that we don't print
more than needed, it doesn't really change anything.

The main caveat is that when interrupted on such large lines, there's a
great possibility that the total or average on the first column doesn't
match anymore the sum or average of all dumped values. In order to avoid
this whenever possible (typically less than ~1500 threads), we first try
to dump entire lines and only proceed one column at a time when we have
to retry a failed dump. This is already the same for other stats that are
dumped in an interruptible way anyway and there's little that can be done
about it at this point (and not much immediately perceived benefit in
doing this with extreme accuracy for >1500 threads).
2023-05-03 17:26:11 +02:00
Willy Tarreau
6ed0b9885d MINOR: activity: allow "show activity" to restart dumping on any line
When using many threads, it's difficult to see the end of "show activity"
due to the numerous columns which fill the buffer. For example a dump of
a 256-thread, freshly booted process yields around 15kB.

Here by arranging the dump in a loop around a switch/case block where
each case checks the code line number against the current dump position,
we have a restartable counter for free with a granularity of the line of
code, without having to maintain a matching between states and specific
lines. It just requires to reset the trash buffer for each line and to
try to dump it after each line.

Now dumping 256 threads after a few seconds of traffic happily emits 20kB.
2023-05-03 17:24:54 +02:00
Willy Tarreau
8ee0d11cb8 MINOR: activity: iterate over all fields in a main loop for dumping
Now each line of "show activity" will iterate over n+2 fields, one for
the line header, one for the total, and one per thread. This will soon
allow us to save the current state in a restartable way.
2023-05-03 17:24:54 +02:00
Willy Tarreau
a465b21516 MINOR: activity: show the line header inside the SHOW_VAL macro
Doing so will allow us to drop the extra chunk_appendf() dedicated to
the line header and simplify iteration over restartable columns.
2023-05-03 17:24:54 +02:00
Willy Tarreau
5ddf9bea09 MINOR: activity: use a single macro to iterate over all fields
Instead of having SHOW_AVG() and SHOW_TOT(), let's just have SHOW_VAL()
which iterates over all values.
2023-05-03 17:24:54 +02:00
Willy Tarreau
ff508f12c6 BUILD: cli: fix build on Windows due to isalnum() implemented as a macro
Commit 986798718 ("DEBUG: cli: add "debug dev task" to show/wake/expire/kill
tasks and tasklets") broke the build on windows due to this:

  src/debug.c:940:95: error: array subscript has type char [-Werror=char-subscripts]
    940 |  caller && may_access(caller) && may_access(caller->func) && isalnum(*caller->func) ? caller->func : "0",
        |                                                                      ^~~~~~~~~~~~~

It's classical on platforms which implement ctype.h as macros instead of
functions, let's cast it as uchar. No backport is needed.
2023-05-03 16:32:50 +02:00
William Lallemand
117c7fde06 BUG/MINOR: ssl/sample: x509_v_err_str converter output when not found
The x509_v_err_str converter now outputs the numerical value as a string
when the corresponding constant name was not found.

Must be backported as far as 2.7.
2023-05-03 15:19:38 +02:00
Willy Tarreau
9867987182 DEBUG: cli: add "debug dev task" to show/wake/expire/kill tasks and tasklets
When analyzing certain types of bugs in field, sometimes it would be
nice to be able to wake up a task or tasklet to see how events progress
(e.g. to detect a missing wakeup condition), or expire or kill such a
task. This restricted command shows hte current state of a task or tasklet
and allows to manipulate it like this. However it must be used with extreme
care because while it does verify that the pointers are mapped, it cannot
know if they point to a real task, and performing such actions on something
not a task will easily lead to a crash. In addition, performing a "kill"
on a task has great chances of provoking a deferred crash due to a double
free and/or another kill that is not idempotent. Use with extreme care!
2023-05-03 11:47:44 +02:00
Willy Tarreau
dd01448953 MINOR: debug: clarify "debug dev stream" help message
The help message was insufficient to figure how to use it and specify
the stream pointer and changes to operate.
2023-05-03 11:47:44 +02:00
Willy Tarreau
65efd33c06 BUG/MINOR: stream/cli: fix stream age calculation in "show sess"
The "show sess" command displays the stream's age in synthetic form,
and also makes it appear in the long version (show sess all). But that
last one uses the wrong origin, it uses accept_date.tv_sec instead of
accept_ts (formerly known as tv_accept). This was introduced in 1.4.2
with the long format, with commit 66dc20a17 ("[MINOR] stats socket: add
show sess <id> to dump details about a session"), while the code that
split the two variables was introduced in 1.3.16 with commit b7f694f20
("[MEDIUM] implement a monotonic internal clock"). This problem was
revealed by recent change ad5a5f677 ("MEDIUM: tree-wide: replace timeval
with nanoseconds in tv_accept and tv_request") that made this value report
random garbage, and generally emphasized by the fact that in 2.8 the two
clocks have sufficiently large an offset for such mistakes to be noticeable
early.

Arguably a difference between date and accept_date could also make sense,
to indicate if the stream had been there for more than 49 days, but this
would introduce instabilities for most sockets (including negative times)
for extremely rare cases while the goal is essentially to see how much
longer than a configured timeout a stream has been there. And that's what
other locations (including the short form) provide.

This patch could be backported but most users will never notice. In case
of backport, tv_accept.tv_sec should be used instead of accept_date.tv_sec.
2023-05-03 11:47:44 +02:00
William Lallemand
64a77e3ea5 MINOR: ssl: disable CRL checks with WolfSSL when no CRL file
WolfSSL is enabling by default the CRL checks even if a CRL file wasn't
provided. This patch resets the default X509_STORE flags so this is
not checked by default.
2023-05-02 18:30:11 +02:00
Tim Duesterhus
0ababda701 BUG/MINOR: stats: fix typo in TotalSplicedBytesOut field name
An additional `d` slipped in there.

This likely should not be backported, because scripts might rely on the
typoed name.

Public discussion on this topic here:
   https://www.mail-archive.com/haproxy@formilux.org/msg43359.html
2023-05-02 11:15:49 +02:00
Amaury Denoyelle
bc0adfa334 MINOR: proxy: factorize send rate measurement
Implement a new dedicated function increment_send_rate() which can be
call anywhere new bytes must be accounted for global total sent.
2023-04-28 16:53:44 +02:00
Amaury Denoyelle
1bcb695a05 MINOR: quic: use real sending rate measurement
Before this patch, global sending rate was measured on the QUIC lower
layer just after sendto(). This meant that all QUIC frames were
accounted for, including non STREAM frames and also retransmission.

To have a better reflection of the application data transferred, move
the incrementation into the MUX layer. This allows to account only for
STREAM frames payload on their first emission.

This should be backported up to 2.6.
2023-04-28 16:52:26 +02:00
Aleksandar Lazic
5529c9985e MINOR: sample: Add bc_rtt and bc_rttvar
This Patch adds fetch samples for backends round trip time.
2023-04-28 16:31:08 +02:00
Willy Tarreau
c05d30e9d8 MINOR: clock: replace the timeval start_time with start_time_ns
Now that "now" is no more a timeval, there's no point keeping a copy
of it as a timeval, let's also switch start_time to nanoseconds, it
simplifies operations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never nul and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really  is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One interrogation concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anyhting for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
eed5da1037 MINOR: clock: do not use now.tv_sec anymore
Instead we're using ns_to_sec(tv_to_ns(&now)) which allows the tv_sec
part to disappear. At this point, "now" is only used as a timeval in
clock.c where it is updated.
2023-04-28 16:08:08 +02:00
Willy Tarreau
e8e4712771 MINOR: checks: use a nanosecond counters instead of timeval for checks->start
Now we store the checks start date as a nanosecond timestamps instead
of a timeval, this will simplify the operations with "now" in the near
future.
2023-04-28 16:08:08 +02:00
Willy Tarreau
b68d308aec MINOR: activity: use nanoseconds, not timeval to compute uptime
Now that we have the required functions, let's get rid of the timeval
in intermediary calculations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
563efe62e9 MINOR: stats: use nanoseconds, not timeval to compute uptime
Now that we have the required functions, let's get rid of the timeval
in intermediary calculations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
ad5a5f6779 MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request
Let's get rid of timeval in storage of internal timestamps so that they
are no longer mistaken for wall clock time. These were exclusively used
subtracted from each other or to/from "now" after being converted to ns,
so this patch removes the tv_to_ns() conversion to use them natively. Two
occurrences of tv_isge() were turned to a regular wrapping subtract.
2023-04-28 16:08:08 +02:00
Willy Tarreau
aaebcae58b MINOR: spoe: switch the timeval-based timestamps to nanosecond timestamps
Various points were collected during a request/response and were stored
using timeval. Let's now switch them to nanosecond based timestamps.
2023-04-28 16:08:08 +02:00
Willy Tarreau
76d343d3d3 MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract
Instead of operating on {sec, usec} now we convert both operands to
ns then subtract them and convert to ms. This is a first step towards
dropping timeval from these timestamps.

Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at
all and could be removed.
2023-04-28 16:08:08 +02:00
Willy Tarreau
7222db7b84 BUG/MINOR: stats: report the correct start date in "show info"
The "show info" help for "Start_time_sec" says "Start time in seconds"
so it's definitely the start date in human format, not the internal one
that is solely used to compute uptime. Since commit 28360dc ("MEDIUM:
clock: force internal time to wrap early after boot"), both are split
apart since the start time takes into account the offset needed to cause
the early wraparound, so we must only use start_date here.

No backport is needed.
2023-04-28 16:08:08 +02:00
Christopher Faulet
2ebac6a320 BUG/MEDIUM: tcpcheck: Don't eval custom expect rule on an empty buffer
The commit a664aa6a6 ("BUG/MINOR: tcpcheck: Be able to expect an empty
response") instroduced a regression for expect rules relying on a custom
function. Indeed, there is no check on the buffer to be sure it is not empty
before calling the custom function. But some of these functions expect to
have data and don't perform any test on the buffer emptiness.

So instead of fixing all custom functions, we just don't eval them if the
buffer is empty.

This patch must be backported but only if the commit above was backported
first.
2023-04-28 15:01:10 +02:00
Christopher Faulet
89aeabff5b BUG/MINOR: resolvers: Use sc_need_room() to wait more room when dumping stats
It was a cut/paste typo during stream-interface to conn-stream
refactoring. sc_have_room() was used instead of sc_need_room().

This patch must be backported as far as 2.6.
2023-04-28 08:51:34 +02:00
Christopher Faulet
e99c43907c BUG/MEDIUM: spoe: Don't start new applet if there are enough idle ones
It is possible to start too many applets on sporadic burst of events after
an inactivity period. It is due to the way we estimate if a new applet must
be created or not. It is based on a frequency counter. We compare the events
processing rate against the number of events currently processed (in
progress or waiting to be processed). But we should also take care of the
number of idle applets.

We already track the number of idle applets, but it is global and not
per-thread. Thus we now also track the number of idle applets per-thread. It
is not a big deal because this fills a hole in the spoe_agent structure.
Thanks to this counter, we can refrain applets creation if there is enough
idle applets to handle currently processed events.

This patch should be backported to every stable versions.
2023-04-28 08:51:34 +02:00
Willy Tarreau
d2f61de8c2 BUG/MINOR: hlua: return wall-clock date, not internal date in core.now()
That's hopefully the last one affected by this. It was a bit trickier
because there's the promise in the doc that the date is monotonous, so
we continue to use now-start_time as the uptime value and add it to
start_date to get the current date. It was also emphasized by commit
28360dc ("MEDIUM: clock: force internal time to wrap early after boot"),
causing core.now() to return a date of Mar 20 on Apr 27. No backport is
needed.
2023-04-27 18:44:14 +02:00
Willy Tarreau
bc3c4e85f0 BUG/MINOR: trace: show wall-clock date, not internal date in show activity
Yet another case where "now" was used instead of "date" for a publicly
visible date that was already incorrect and became worse after commit
28360dc ("MEDIUM: clock: force internal time to wrap early after boot").
No backport is needed.
2023-04-27 18:22:34 +02:00
Willy Tarreau
22b6d26c57 BUG/MINOR: calltrace: fix 'now' being used in place of 'date'
Since commit 28360dc ("MEDIUM: clock: force internal time to wrap early
after boot") we have a much clearer distinction between 'now' (the internal,
drifting clock) and 'date' (the wall clock time). The calltrace code was
using "now" instead of "date" since the value is displayed to humans.

No backport is needed.
2023-04-27 18:14:57 +02:00
Willy Tarreau
fe1b3b8777 Revert "BUG/MINOR: clock: fix a few occurrences of 'now' being used in place of 'date'"
This reverts commit aadcfc9ea6.

The parts affecting the DeviceAtlas addon were wrong actually, the
"now" variable was a local time_t in a file that's not compiled with
the haproxy binary (dadwsch). Only the fix to the calltrace is correct,
so better revert and fix the only one in a separate commit. No backport
is needed.
2023-04-27 18:14:57 +02:00
Willy Tarreau
82bde18aa4 BUG/MINOR: activity: show wall-clock date, not internal date in show activity
Another case where "now" was used instead of "date" for a publicly visible
date that was already incorrect and became worse after commit 28360dc
("MEDIUM: clock: force internal time to wrap early after boot"). No
backport is needed.
2023-04-27 14:47:50 +02:00
Willy Tarreau
a5f0e6cfc0 BUG/MINOR: spoe: use "date" not "now" in debug messages
The debug messages were still emitted with a date taken from "now" instead
of "date", which was not correct a long time ago but which became worse in
2.8 since commit 28360dc ("MEDIUM: clock: force internal time to wrap early
after boot"). Let's fix it. No backport is needed.
2023-04-27 11:57:53 +02:00
Willy Tarreau
aadcfc9ea6 BUG/MINOR: clock: fix a few occurrences of 'now' being used in place of 'date'
Since commit 28360dc ("MEDIUM: clock: force internal time to wrap early
after boot") we have a much clearer distinction between 'now' (the internal,
drifting clock) and 'date' (the wall clock time). There were still a few
places where 'now' was being used for human consumption.

No backport is needed.
2023-04-26 19:21:25 +02:00
Amaury Denoyelle
7b516d3732 BUG/MINOR: quic: fix race on quic_conns list during affinity rebind
Each quic_conn are attached in a global thread-local quic_conns list
used for "show quic" command. During thread rebinding, a connection is
detached from its local list instance and moved to its new thread list.
However this operation is not thread-safe and may cause a race
condition.

To fix this, only remove the connection from its list inside
qc_set_tid_affinity(). The connection is inserted only after in
qc_finalize_affinity_rebind() on the new thread instance thus prevented
a race condition. One impact of this is that a connection will be
invisible during rebinding for "show quic".

A connection must not transition to closing state in between this two
steps or else cleanup via quic_handle_stopping() may not miss it. To
ensure this, this patch relies on the previous commit :
  commit d6646dddcc
  MINOR: quic: finalize affinity change as soon as possible

This should be backported up to 2.7.
2023-04-26 17:50:22 +02:00
Amaury Denoyelle
d6646dddcc MINOR: quic: finalize affinity change as soon as possible
During accept, a quic-conn is rebind to a new thread. This process is
done in two times :
* first on the original thread via qc_set_tid_affinity()
* then on the newly assigned thread via qc_finalize_affinity_rebind()

Most quic_conn operations (I/O tasklet, task and quic_conn FD socket
read) are reactivated ony after the second step. However, there is a
possibility that datagrams are handled before it via quic_dgram_parse()
when using listener sockets. This does not seem to cause any issue but
this may cause unexpected behavior in the future.

To simplify this, qc_finalize_affinity_rebind() will be called both by
qc_xprt_start() and quic_dgram_parse(). Only one invocation will be
performed thanks to the new flag QUIC_FL_CONN_AFFINITY_CHANGED.

This should be backported up to 2.7.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
a57ab0fabe MINOR: mux-quic: do not allocate Tx buf for empty STREAM frame
Sometimes it may be necessary to send an empty STREAM frame to signal
clean stream closure with FIN bit set. Prior to this change, a Tx buffer
was allocated unconditionnally even if no data is transferred.

Most of the times, allocation was not performed due to an older buffer
reused. But if data were already acknowledge, a new buffer is allocated.
No memory leak occurs as the buffer is properly released when the empty
frame acknowledge is received. But this allocation is unnecessary and it
consumes a connexion Tx buffer for nothing.

Improve this by skipping buffer allocation if no data to transfer.
qcs_build_stream_frm() is now able to deal with a NULL out argument.

This should be backported up to 2.6.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
42c5b75cac MINOR: mux-quic: do not set buffer for empty STREAM frame
Previous patch fixes an issue occurring with empty STREAM frames without
payload. The crash was hidden in part because buf/data fields of
qf_stream were set even if no payload is referenced. This was not the
true cause of the crash but to ease future debugging, a STREAM frame
built with no payload now has its buf and data fields set to NULL.

This should be backported up to 2.6.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
19eaf88fda BUG/MINOR: quic: prevent buggy memcpy for empty STREAM
Sometimes it may be necessary to send empty STREAM frames with only the
FIN bit set. For these frames, memcpy is thus unnecessary as their
payload is empty. However, we did not prevent its invocation inside
quic_build_stream_frame().

Normally, memcpy invocation with length==0 is safe. However, there is an
extra condition in our function to handle data wrapping. For an empty
STREAM frame in the context of MUX emission, this is safe as the frame
points to a valid buffer which causes the wrapping condition to be false
and resulting in a memcpy with 0 length.

However, in the context of retransmission, this may lead to a crash.
Consider the following scenario : two STREAM frames A and B are
produced, one with payload and one empty with FIN set, pointing to the
same stream_desc buffer. If A is acknowledged by the peer, its buffer is
released as no more data is left in it. If B needs to be resent, the
wrapping condition will be messed up to a reuse of a freed buffer. Most
of the times, <wrap> will be a negative number, which results in a
memcpy invocation causing a buffer overflow.

To fix this, simply add an extra condition to skip memcpy and wrapping
check if STREAM frame length is null inside quic_build_stream_frame().

This crash is pretty rare as it relies on a lot of conditions difficult
to reproduce. It seems to be the cause for the latest crashes reported
under github issue #2120. In all the inspected dumps, the segfault
occurred during retransmission with an empty STREAM frame being used as
input. Thanks again to Tristan from Mangadex for his help and
investigation on it.

This should be backported up to 2.6.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
7c5591facb BUG/MEDIUM: mux-quic: improve streams fairness to prevent early timeout
Since the following mentioned patch, a send-list mechanism was
implemented to improve streams priorization on sending.
  commit 20f2a425ff
  MAJOR: mux-quic: rework stream sending priorization

This is done to prevent the same streams to always be used as first ones
on emission. However there is still a flaw on the algorithm. Once put in
the send-list, a streams is not removed until it has sent all of its
content. When a stream transfers a large object, it will remain in the
send-list during all the transfer and will soon monopolize the first
place. the stream does never leave its position until the transfer is
finished and will monopolize the first place. Other streams behind won't
have the opportunity to advance on their own transfers due to a Tx
buffer exhaustion.

This situation is especially problematic if a small timeout client is
used. As some streams won't advance on their transfer for a long period
of time, they will be aborted due to a stream layer timeout client
causing a RESET_STREAM emission.

To fix this, during sending each stream with at least some bytes
transferred from its tx.buf to qc_stream_desc out buffer is put at the
end of the send-list. This ensures that on the next iteration streams
that cannot transfer anything will be used in priority.

This patch improves significantly h2load benchmarks for large objects
with several streams opened in parallel on a single connection. Without
it, errors may be reported by h2load for aborted streams. For example,
this improved the following scenario on a 10mbit/s link with a 10s
timeout client :
  $ ./build/bin/h2load --npn-list h3 -t 1 -c 1 -m 30 -n 30 https://198.18.10.11:20443/?s=500k

This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.

This should be backported up to 2.7.
2023-04-26 17:50:16 +02:00
Amaury Denoyelle
24962dd178 BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length
Some HTX responses may not always contain a EOM block. For example this
is the case if content-length header is missing from the HTTP server
response. Stream termination is thus signaled to QUIC mux via shutw
callback. However, this is interpreted inconditionnally as an early
close by the mux with a RESET_STREAM emission. Most of the times, QUIC
clients report this as an error.

To fix this, check if htx.extra is set to HTX_UNKOWN_PAYLOAD_LENGTH for
a qcs instance. If true, shutw will never be used to emit a
RESET_STREAM. Instead, the stream will be closed properly with a FIN
STREAM frame. If all data were already transfered, an empty STREAM frame
is sent.

This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.

This issue was reported by Vladimir Zakharychev. Thanks to him for his
help and testing. It was also reproduced locally using httpterm with the
query string "/?s=1k&b=0&C=1".

This should be backported up to 2.7.
2023-04-26 17:50:09 +02:00
Frédéric Lécaille
7d23e8d1a6 CLEANUP: quic: Rename several <buf> variables into quic_sock.c
Rename some variables which are not struct buffer variables.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
bb426aa5f1 CLEANUP: quic: Rename <buf> variable into qc_parse_hd_form()
There is no struct buffer variable manipulated by this function.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
6ff52f9ce5 CLEANUP: quic: Rename <buf> variable into quic_packet_read_long_header()
Make this function be more readable: there is no struct buffer variable passed
as parameter to this function.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
81a02b59f5 CLEANUP: quic: Rename several <buf> variables at low level
Make quic_stateless_reset_token_cpy(), quic_derive_cid() and quic_get_cid_tid()
be more readable: there is no struct buffer variable manipulated by these
functions.

Should be backported to 2.7.
2023-04-24 15:53:27 +02:00
Frédéric Lécaille
182934d80b CLEANUP: quic: Rename quic_get_dgram_dcid() <buf> variable
quic_get_dgram_dcid() does not manipulate any struct buffer variable.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
1e0f8255a1 CLEANUP: quic: Make qc_build_pkt() be more readable
There is no <buf> variable passed to this function.
Also rename <buf_end> to <end> to mimic others functions.
Rename <beg> to <first_byte> and <end> to <last_byte>.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
3adb9e85a1 CLEANUP: quic: Rename <buf> variable for several low level functions
Make quic_build_packet_long_header(), quic_build_packet_short_header() and
quic_apply_header_protection() be more readable: there is no struct buffer
variables used by these functions.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
bef3098d33 CLEANUP: quic: Rename <buf> variable into quic_rx_pkt_parse()
Make this function be more readable: there is no struct buffer variable used
by this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
7f0b1c7016 CLEANUP: quic: Rename <buf> variable into quic_padding_check()
Make quic_padding_check() be more readable: there is not struct buffer variable
used by this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
dad0ede28a CLEANUP: quic: Rename <buf> variable to <token> in quic_generate_retry_token()
Make quic_generate_retry_token() be more readable: there is no struct buffer
variable used in this function.

Should be backported to 2.7.
2023-04-24 15:53:26 +02:00
Frédéric Lécaille
e66d67a1ae CLEANUP: quic: Remove useless parameters passes to qc_purge_tx_buf()
Remove the pointer to the connection passed as parameters to qc_purge_tx_buf()
and other similar function which came with qc_purge_tx_buf() implementation.
They were there do track the connection during tests.

Must be backported to 2.7.
2023-04-24 15:53:26 +02:00
Amaury Denoyelle
d5f03cd576 CLEANUP: quic: rename frame variables
Rename all frame variables with the suffix _frm. This helps to
differentiate frame instances from other internal objects.

This should be backported up to 2.7.
2023-04-24 15:35:22 +02:00
Amaury Denoyelle
888c5f283a CLEANUP: quic: rename frame types with an explicit prefix
Each frame type used in quic_frame union has been renamed with the
following prefix "qf_". This helps to differentiate frame instances from
other internal objects.

This should be backported up to 2.7.
2023-04-24 15:35:03 +02:00
Frédéric Lécaille
b73762ad78 BUG/MINOR: quic: Useless I/O handler task wakeups (draining, killing state)
From the idle_timer_task(), the I/O handler must be woken up to send ack. But
there is no reason to do that in draining state or killing state. In draining
state this is even forbidden.

Must be backported to 2.7.
2023-04-24 11:47:11 +02:00
Frédéric Lécaille
d21c628ffd BUG/MINOR: quic: Useless probing retransmission in draining or killing state
The timer task responsible of triggering probing retransmission did not inspect
the state of the connection before doing its job. But there is no need to
probe the peer when the connection is in draining or killing state. About the
draining state, this is even forbidden.

Must be backported to 2.7 and 2.6.
2023-04-24 11:46:33 +02:00
Frédéric Lécaille
c6bec2a3af BUG/MINOR: quic: Possible leak during probing retransmissions
qc_dgrams_retransmit() prepares two list of frames to be retransmitted into
two datagrams. If the first datagram could not be sent, the TX buffer will
be purged with the prepared packet and its frames, but this was not the case for
the second list of frames.

Must be backported in 2.7.
2023-04-24 11:38:28 +02:00
Frédéric Lécaille
ce0bb338c6 BUG/MINOR: quic: Possible memory leak from TX packets
This bug arrived with this commit which was not sufficient:

     BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()

Indeed, there were also remaining allocated TX packets to be released and
their TX frames.
Implement qc_purge_tx_buf() to do so which depends on qc_free_tx_coalesced_pkts()
and qc_free_frm_list().

Must be backported to 2.7.
2023-04-24 11:38:28 +02:00
Frédéric Lécaille
e95e00e305 MINOR: quic: Move traces at proto level
These traces has already been useful to debug issues.

Must be backported to 2.7 and 2.6.
2023-04-24 11:38:16 +02:00
Willy Tarreau
0e875cf291 MEDIUM: listener: switch the default sharding to by-group
Sharding by-group is exactly identical to by-process for a single
group, and will use the same number of file descriptors for more than
one group, while significantly lowering the kernel's locking overhead.

Now that all special listeners (cli, peers) are properly handled, and
that support for SO_REUSEPORT is detected at runtime per protocol, there
should be no more reason for now switching to by-group by default.

That's what this patch does. It does only this and nothing else so that
it's easy to revert, should any issue be raised.

Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed
into 8 core complexes of 3 cores each, shows that configuring 8 groups
(one per CCX) is sufficient to simply double the forwarded connection
rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.
2023-04-23 10:18:16 +02:00
Willy Tarreau
7310164b2c MINOR: listener: add a new global tune.listener.default-shards setting
This new setting accepts "by-process", "by-group" and "by-thread" and
will dictate how listeners will be sharded by default when nothing is
specified. While the default remains "by-process", "by-group" should be
much more efficient with many threads, while not changing anything for
single-group setups.
2023-04-23 09:46:15 +02:00
Willy Tarreau
c38499ceae MINOR: listener: do not restrict CLI to first group anymore
Now that we're able to run listeners on any set of groups, we don't need
to maintain a special case about the stats socket anymore. It used to be
forced to group 1 only so as to avoid startup failures in case several
groups were configured, but if it's done now, it will automatically bind
the needed FDs to have one per group so this is no more an issue.
2023-04-23 09:46:15 +02:00
Willy Tarreau
f1003ea7fa MINOR: protocol: perform a live check for SO_REUSEPORT support
When testing if a protocol supports SO_REUSEPORT, we're now able to
verify if the OS does really support it. While it may be supported at
build time, it may possibly have been blocked in a container for
example so we'd rather know what it's like.
2023-04-23 09:46:15 +02:00
Willy Tarreau
b073573c10 MINOR: sock: add a function to check for SO_REUSEPORT support at runtime
The new function _sock_supports_reuseport() will be used to check if a
protocol type supports SO_REUSEPORT or not. This will be useful to verify
that shards can really work.
2023-04-23 09:46:15 +02:00
Willy Tarreau
8a5e6f4cca MINOR: protocol: add a function to check if some features are supported
The new function protocol_supports_flag() checks the protocol flags
to verify if some features are supported, but will support being
extended to refine the tests. Let's use it to check for REUSEPORT.
2023-04-23 09:46:15 +02:00
Willy Tarreau
c1fbdd6397 MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT
Now if multiple shards are explicitly requested, and the listener's
protocol doesn't support SO_REUSEPORT, sharding is disabled, which will
result in the socket being automatically duped if needed. A warning is
emitted when this happens. If "shards by-group" or "shards by-thread"
are used, these will automatically be turned down to 1 since we want
this to be possible easily using -dR on the command line without having
to djust the config. For "by-thread", a diag warning will be emitted to
help troubleshoot possible performance issues.
2023-04-23 09:46:15 +02:00
Willy Tarreau
785b89f551 MINOR: protocol: move the global reuseport flag to the protocols
Some protocol support SO_REUSEPORT and others not. Some have such a
limitation in the kernel, and others in haproxy itself (e.g. sock_unix
cannot support multiple bindings since each one will unbind the previous
one). Also it's really protocol-dependent and not just family-dependent
because on Linux for some time it was supported for TCP and not UDP.

Let's move the definition to the protocols instead. Now it's preset in
tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset.
The enabled() config condition test validates IPv4 (generally sufficient),
and -dR / noreuseport all protocols at once.
2023-04-23 09:46:15 +02:00
Willy Tarreau
65df7e028d MINOR: protocol: add a flags field to store info about protocols
We'll use these flags to know if some protocols are supported, and if
so, with what options/extensions. Reuseport will move there for example.
Two functions were added to globally set/clear a flag.
2023-04-23 09:46:15 +02:00
Willy Tarreau
a22db6567f MEDIUM: peers: call bind_complete_thread_setup() to finish the config
The listeners in peers sections were still not handing the thread
groups fine. Shards were silently ignored and if a listener was bound
to more than one group, it would simply fail. Now we can call the
dedicated function to resolve all this and possibly create the missing
extra listeners.

bind_complete_thread_setup() was adjusted to use the proxy_type_str()
instead of writing "proxy" at the only place where this word was still
hard-coded so that we continue to speak about peers sections when
relevant.
2023-04-23 09:46:15 +02:00
Willy Tarreau
f6a8444f55 REORG: listener: move the bind_conf's thread setup code to listener.c
What used to be only two lines to apply a mask in a loop in
check_config_validity() grew into a 130-line block that performs deeply
listener-specific operations that do not have their place there anymore.
In addition it's worth noting that the peers code still doesn't support
shards nor being bound to more than one group, which is a second reason
for moving that code to its own function. Nothing was changed except
recreating the missing variables from the bind_conf itself (the fe only).
2023-04-23 09:46:15 +02:00
Willy Tarreau
e1a0107f9c BUG/MINOR: config: fix NUMA topology detection on FreeBSD
In 2.6-dev1, NUMA topology detection was enabled on FreeBSD with commit
f5d48f8b3 ("MEDIUM: cfgparse: numa detect topology on FreeBSD."). But
it suffers from a minor bug which is that it forgets to check for the
number of domains and always emits a confusing warning indicating that
multiple sockets were found while it's not the case.

This can be backported to 2.6.
2023-04-23 09:46:15 +02:00
Willy Tarreau
997ad155fe BUG/MINOR: tools: check libssl and libcrypto separately
The lib compatibility checks introduced in 2.8-dev6 with commit c3b297d5a
("MEDIUM: tools: further relax dlopen() checks too consider grouped
symbols") were partially incorrect in that they check at the same time
libcrypto and libssl. But if loading a library that only depends on
libcrypto, the ssl-only symbols will be missing and this might present
an inconsistency. This is what is observed on FreeBSD 13.1 when
libcrypto is being loaded, where it sees two symbols having disappeared.

The fix consists in splitting the checks for libcrypto and libssl.

No backport is needed, unless the patch above finally gets backported.
2023-04-23 09:46:15 +02:00
Willy Tarreau
9f53b7b41a BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available
On FreeBSD 13.1 I noticed that thread balancing using shards was not
always working. Sometimes several threads would work, but most of the
time a single one was taking all the traffic. This is related to how
SO_REUSEPORT works on FreeBSD since version 12, as it seems there is
no guarantee that multiple sockets will receive the traffic. However
there is SO_REUSEPORT_LB that is designed exactly for this, so we'd
rather use it when available.

This patch may possibly be backported, but nobody complained and it's
not sure that many users rely on shards. So better wait for some feedback
before backporting this.
2023-04-23 09:46:15 +02:00
Ilya Shipitsin
ccf8012f28 CLEANUP: assorted typo fixes in the code and comments
This is 36th iteration of typo fixes
2023-04-23 09:44:53 +02:00
Willy Tarreau
023c311d70 BUG/MINOR: cli: clarify error message about stats bind-process
In 2.7-dev2, "stats bind-process" was removed by commit 94f763b5e
("MEDIUM: config: remove deprecated "bind-process" directives from
frontends") and an error message indicates that it's no more supported.
However it says "stats" is not supported instead of "stats bind-process",
making it a bit confusing.

This should be backported to 2.7.
2023-04-23 09:40:56 +02:00
Tim Duesterhus
1307cd42d2 CLEANUP: Stop checking the pointer before calling ring_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
    	ring_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
    	ring_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
    	ring_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
    	ring_free(e);
2023-04-23 00:28:25 +02:00
Tim Duesterhus
fe83f58906 CLEANUP: Stop checking the pointer before calling task_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
    	task_destroy(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
    	task_destroy(e);
    - }

    @@
    expression e;
    @@

    - if (e)
    	task_destroy(e);

    @@
    expression e;
    @@

    - if (e != NULL)
    	task_destroy(e);
2023-04-23 00:28:25 +02:00
Tim Duesterhus
c18e244515 CLEANUP: Stop checking the pointer before calling pool_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    expression p;
    @@

    - if (e != NULL) {
    	pool_free(p, e);
    - }

    @@
    expression e;
    expression p;
    @@

    - if (e) {
    	pool_free(p, e);
    - }

    @@
    expression e;
    expression p;
    @@

    - if (e)
    	pool_free(p, e);

    @@
    expression e;
    expression p;
    @@

    - if (e != NULL)
    	pool_free(p, e);
2023-04-23 00:28:25 +02:00
Tim Duesterhus
b1ec21d259 CLEANUP: Stop checking the pointer before calling tasklet_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
    	tasklet_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
    	tasklet_free(e);

See GitHub Issue #2126
2023-04-23 00:28:25 +02:00
Willy Tarreau
8adffaa899 MINOR: listener: always compare the local thread as well
By comparing the local thread's load with the least loaded thread's
load, we can further improve the fairness and at the same time also
improve locality since it allows a small ratio of connections not to
be migrated. This is visible on CPU usage with long connections on
very large thread counts (224) and high bandwidth (200G). The cost
of checking the local thread's load remains fairly low so there's no
reason not to do this. We continue to update the index if we select
the local thread, because it means that the two other threads were
both more loaded so we'd rather find better ones.
2023-04-21 17:41:26 +02:00
Willy Tarreau
ff18504d73 MINOR: listener: make sure to avoid ABA updates in per-thread index
One limitation of the current thread index mechanism is that if the
values are assigned multiple times to the same thread and the index
loops, it can match again the old value, which will not prevent a
competing thread from finishing its CAS and assigning traffic to a
thread that's not the optimal one. The probability is low but the
solution is simple enough and consists in implementing an update
counter in the high bits of the index to force a mismatch in this
case (assuming we don't try to cover for extremely unlikely cases
where the update counter loops while the index remains equal). So
let's do that. In order to improve the situation a little bit, we
now set the index to a ulong so that in 32 bits we have 8 bits of
counter and in 64 bits we have 40 bits.
2023-04-21 17:41:26 +02:00
Willy Tarreau
77e33509c8 MINOR: listener: resync with the thread index before heavy calculations
During heavy accept competition, the CAS will occasionally fail and
we'll have to go through all the calculation again. While the first
two loops look heavy, they're almost never taken so they're quite
cheap. However the rest of the operation is heavy because we have to
consult connection counts and queue indexes for other threads, so
better double-check if the index is still valid before continuing.
Tests show that it's more efficient do retry half-way like this.
2023-04-21 17:41:26 +02:00
Willy Tarreau
b657492680 MINOR: listener: use a common thr_idx from the reference listener
Instead of seeing each listener use its own thr_idx, let's use the same
for all those from a shard. It should provide more accurate and smoother
thread allocation.
2023-04-21 17:41:26 +02:00
Willy Tarreau
9d360604bd MEDIUM: listener: rework thread assignment to consider all groups
Till now threads were assigned in listener_accept() to other threads of
the same group only, using a single group mask. Now that we have all the
relevant info (array of listeners of the same shard), we can spread the
thr_idx to cover all assigned groups. The thread indexes now contain the
group number in their upper bits, and the indexes run over te whole list
of threads, all groups included.

One particular subtlety here is that switching to a thread from another
group also means switching the group, hence the listener. As such, when
changing the group we need to update the connection's owner to point to
the listener of the same shard that is bound to the target group.
2023-04-21 17:41:26 +02:00
Willy Tarreau
e6f5ab5afa MINOR: listener: make accept_queue index atomic
There has always been a race when checking the length of an accept queue
to determine which one is more loaded that another, because the head and
tail are read at two different moments. This is not required, we can merge
them as two 16 bit numbers inside a single 32-bit index that is always
accessed atomically. This way we read both values at once and always have
a consistent measurement.
2023-04-21 17:41:26 +02:00
Willy Tarreau
09b52d1c3d MEDIUM: config: permit to start a bind on multiple groups at once
Now it's possible for a bind line to span multiple thread groups. When
this happens, the first one will become the reference and will be entirely
set up, and the subsequent ones will be duplicated from this reference,
so that they can be registered in distinct groups. The reference is
always setup and started first so it is always available when the other
ones are started.

The doc was updated to reflect this new possibility with its limitations
and impacts, and the differences with the "shards" option.
2023-04-21 17:41:26 +02:00
Willy Tarreau
09e266e6f5 MINOR: proto: skip socket setup for duped FDs
It's not strictly necessary, but it's still better to avoid setting
up the same socket multiple times when it's being duplicated to a few
FDs. We don't change that for inherited ones however since they may
really need to be set up, so we only skip duplicated ones.
2023-04-21 17:41:26 +02:00
Willy Tarreau
0e1aaf4e78 MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP
The different protocol's ->bind() function will now check the receiver's
RX_F_MUST_DUP flag to decide whether to bind a fresh new listener from
scratch or reuse an existing one and just duplicate it. It turns out
that the existing code already supports reusing FDs since that was done
as part of the FD passing and inheriting mechanism. Here it's not much
different, we pass the FD of the reference receiver, it gets duplicated
and becomes the new receiver's FD.

These FDs are also marked RX_F_INHERITED so that they are not exported
and avoid being touched directly (only the reference should be touched).
2023-04-21 17:41:26 +02:00
Willy Tarreau
aae1810b4d MINOR: receiver: add a struct shard_info to store info about each shard
In order to create multiple receivers for one multi-group shard, we'll
need some more info about the shard. Here we store:
  - the number of groups (= number of receivers)
  - the number of threads (will be used for accept LB)
  - pointer to the reference rx (to get the FD and to find all threads)
  - pointers to the other members (to iterate over all threads)

For now since there's only one group per shard it remains simple. The
listener deletion code already takes care of removing the current
member from its shards list and moving others' reference to the last
one if it was their reference (so as to avoid o(n^2) updates during
ordered deletes).

Since the vast majority of setups will not use multi-group shards, we
try to save memory usage by only allocating the shard_info when it is
needed, so the principle here is that a receiver shard_info==NULL is
alone and doesn't share its socket with another group.

Various approaches were considered and tests show that the management
of the listeners during boot makes it easier to just attach to or
detach from a shard_info and automatically allocate it if it does not
exist, which is what is being done here.

For now the attach code is not called, but detach is already called
on delete.
2023-04-21 17:41:26 +02:00
Willy Tarreau
84fe1f479b MINOR: listener: support another thread dispatch mode: "fair"
This new algorithm for rebalancing incoming connections to multiple
threads is simpler and instead of considering the threads load, it will
only cycle through all of them, offering a fair share of the traffic to
each thread. It may be well suited for short-lived connections but is
also convenient for very large thread counts where it's not always certain
that the least loaded thread will always be found.
2023-04-21 17:41:26 +02:00
Willy Tarreau
6a4d48b736 MINOR: quic_sock: index li->per_thr[] on local thread id, not global one
There's a li_per_thread array in each listener for use with QUIC
listeners. Since thread groups were introduced, this array can be
allocated too large because global.nbthread is allocated for each
listener, while only no more than MIN(nbthread,MAX_THREADS_PER_GROUP)
may be used by a single listener. This was because the global thread
ID is used as the index instead of the local ID (since a listener may
only be used by a single group). Let's just switch to local ID and
reduce the allocated size.
2023-04-21 17:41:26 +02:00
Willy Tarreau
77d37b07b1 MINOR: quic: support migrating the listener as well
When migrating a quic_conn to another thread, we may need to also
switch the listener if the thread belongs to another group. When
this happens, the freshly created connection will already have the
target listener, so let's just pick it from the connection and use
it in qc_set_tid_affinity(). Note that it will be the caller's
responsibility to guarantee this.
2023-04-21 17:41:26 +02:00
Aurelien DARRAGON
23f352f7d0 MINOR: server/event_hdl: prepare for server event data wrapper
Adding the possibility to publish an event using a struct wrapper
around existing SERVER events to provide additional contextual info.

Using the specific struct wrapper is not required: it is supported
to cast event data as a regular server event data struct so
that we don't break the existing API.

However, casting event data with a more explicit data type allows
to fetch event-only relevant hints.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
f71e0645c1 MEDIUM: server: split srv_update_status() in two functions
Considering that srv_update_status() is now synchronous again since
3ff577e1 ("MAJOR: server: make server state changes synchronous again"),
and that we can easily identify if the update is from an operational
or administrative context thanks to "MINOR: server: pass adm and op cause
to srv_update_status()".

And given that administrative and operational updates cannot be cumulated
(since srv_update_status() is called synchronously and independently for
admin updates and state/operational updates, and the function directly
consumes the changes).

We split srv_update_status() in 2 distinct parts:

Either <type> is 0, meaning the update is an operational update which
is handled by directly looking at cur_state and next_state to apply the
proper transition.
Also, the check to prevent operational state from being applied
if MAINT admin flag is set is no longer needed given that the calling
functions already ensure this (ie: srv_set_{running,stopping,stopped)

Or <type> is 1, meaning the update is an administrative update, where
cur_admin and next_admin are evaluated to apply the proper transition and
deduct the resulting server state (next_state is updated implicitly).

Once this is done, both operations share a common code path in
srv_update_status() to update proxy and servers stats if required.

Thanks to this change, the function's behavior is much more predictable,
it is not an all-in-one function anymore. Either we apply an operational
change, else it is an administrative change. That's it, we cannot mix
the 2 since both code paths are now properly separated.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
76e255520f MINOR: server: pass adm and op cause to srv_update_status()
Operational and administrative state change causes are not propagated
through srv_update_status(), instead they are directly consumed within
the function to provide additional info during the call when required.

Thus, there is no valid reason for keeping adm and op causes within
server struct. We are wasting space and keeping uneeded complexity.

We now exlicitly pass change type (operational or administrative) and
associated cause to srv_update_status() so that no extra storage is
needed since those values are only relevant from srv_update_status().
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
10518c0d59 CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment
Fixing function comments for the server state changing function since they
still refer to asynchonous propagation of server state which is no longer
in play.
Moreover, there were some mixups between running/stopping.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
c54b98ac9a CLEANUP: server: remove unused variables in srv_update_status()
check and px local variable aliases are not very useful.
Let's remove them and use s->check and s->proxy instead.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
1746b56e68 MINOR: server: change srv_op_st_chg_cause storage type
This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type".

While looking at current srv_op_st_chg_cause usage, it was clear that
the struct needed some cleanup since some leftovers from asynchronous server
state change updates were left behind and resulted in some useless code
duplication, and making the whole thing harder to maintain.

Two observations were made:

- by tracking down srv_set_{running, stopped, stopping} usage,
  we can see that the <reason> argument is always a fixed statically
  allocated string.
- check-related state change context (duration, status, code...) is
  not used anymore since srv_append_status() directly extracts the
  values from the server->check. This is pure legacy from when
  the state changes were applied asynchronously.

To prevent code duplication, useless string copies and make the reason/cause
more exportable, we store it as an enum now, and we provide
srv_op_st_chg_cause() function to fetch the related description string.
HEALTH and AGENT causes (check related) are now explicitly identified to
make consumers like srv_append_op_chg_cause() able to fetch checks info
from the server itself if they need to.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
f3b48a808e MINOR: server: srv_append_status refacto
srv_append_status() has become a swiss-knife function over time.
It is used from server code and also from checks code, with various
inputs and distincts code paths, making it very hard to guess the
actual behavior of the function (resulting string output).

To simplify the logic behind it, we're dividing it in multiple contextual
functions that take simple inputs and do explicit things, making them
more predictable and easier to maintain.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
9b1ccd7325 MINOR: server: change adm_st_chg_cause storage type
Even though it doesn't look like it at first glance, this is more like
a cleanup than an actual code improvement:

Given that srv->adm_st_chg_cause has been used to exclusively store
static strings ever since it was implemented, we make the choice to
store it as an enum instead of a fixed-size string within server
struct.

This will allow to save some space in server struct, and will make
it more easily exportable (ie: event handlers) because of the
reduced memory footprint during handling and the ability to later get
the corresponding human-readable message when it's explicitly needed.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
85b91375bf MINOR: server: propagate lb changes through srv_lb_propagate()
Now that we have a generic srv_lb_propagate(s) function, let's
use it each time we explicitly wan't to set the status down as
well.

Indeed, it is tricky to try to handle "down" case explicitly,
instead we use srv_lb_propagate() which will call the proper
function that will handle the new server state.

This will allow some code cleanup and will prevent any logic
error.

This commit depends on:
- "MINOR: server: propagate server state change to lb through single function"
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
8bbe643acc MINOR: server: propagate server state change to lb through single function
Use a dedicated helper function to propagate server state change to
lb algorithms, since it is performed at multiple places within
srv_update_status() function.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
5f80f8bbc5 MINOR: server: central update for server counters on state change
Based on "BUG/MINOR: server: don't miss server stats update on server
state transitions", we're also taking advantage of the new centralized
logic to update down_trans server counter directly from there instead
of multiple places.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
9c21ff0208 BUG/MINOR: server: don't use date when restoring last_change from state file
When restoring from a state file: the server "Status" reports weird values on
the html stats page:

	"5s UP" becomes -> "? UP" after the restore

This is due to a bug in srv_state_srv_update(): when restoring the states
from a state file, we rely on date.tv_sec to compute the process-relative
server last_change timestamp.

This is wrong because everywhere else we use now.tv_sec when dealing
with last_change, for instance in srv_update_status().

date (which is Wall clock time) deviates from now (monotonic time) in the
long run.

They should not be mixed, and given that last_change is an internal time value,
we should rely on now.tv_sec instead.

last_change export through "show servers state" cli is safe since we export
a delta and not the raw time value in dump_servers_state():

	srv_time_since_last_change = now.tv_sec - srv->last_change

--

While this bug affects all stable versions, it was revealed in 2.8 thanks
to 28360dc ("MEDIUM: clock: force internal time to wrap early after boot")
This is due to the fact that "now" immediately deviates from "date",
whereas in the past they had the same value when starting.

Thus prior to 2.8 the bug is trickier since it could take some time for
date and now to deviate sufficiently for the issue to arise, and instead
of reporting absurd values that are easy to spot it could just result in
last_change becoming inconsistent over time.

As such, the fix should be backported to all stable versions.
[for 2.2 the patch needs to be applied manually since
srv_state_srv_update() was named srv_update_state() and can be found in
server.c instead of server_state.c]
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
9f5853fa38 BUG/MINOR: server: don't miss server stats update on server state transitions
s->last_change and s->down_time updates were manually updated for each
effective server state change within srv_update_status().

This is rather error-prone, and as a result there were still some state
transitions that were not handled properly since at least 1.8.

ie:
- when transitionning from DRAIN to READY: downtime was updated
  (which is wrong since a server in DRAIN state should not be
   considered as DOWN)
- when transitionning from MAINT to READY: downtime was not updated
  (this can be easily seen in the html stats page)

To fix these all at once, and prevent similar bugs from being introduced,
we centralize the server last_change and down_time stats logic at the end
of srv_update_status():

If the server state changed during the call, then it means that
last_change must be updated, with a special case when changing from
STOPPED state which means the server was previously DOWN and thus
downtime should be updated.

This patch depends on:

- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to every stable versions.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
e80ddb18a8 BUG/MINOR: server: don't miss proxy stats update on server state transitions
backend "down" stats logic has been duplicated multiple times in
srv_update_status(), resulting in the logic now being error-prone.

For example, the following bugfix was needed to compensate for a
copy-paste introduced bug: d332f139
("BUG/MINOR: server: update last_change on maint->ready transitions too")

While the above patch works great, we actually forgot to update the
proxy downtime like it is done for other down->up transitions...
This is simply illustrating that the current design is error-prone,
it is very easy to miss something in this area.

To properly update the proxy downtime stats on the maint->ready transition,
to cleanup srv_update_status() and to prevent similar bugs from being
introduced in the future, proxy/backend stats update are now automatically
performed at the end of the server state change if needed.

Thus we can remove existing updates that were performed at various places
within the function, this simplifies things a bit.

This patch depends on:
- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to all stable versions.

Backport notes:

2.2:

Replace
        struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
        struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
22151c70bb MINOR: server: explicitly commit state change in srv_update_status()
As shown in 8f29829 ("BUG/MEDIUM: checks: a down server going to
maint remains definitely stucked on down state."), state changes
that don't result in explicit lb state change, require us to perform
an explicit server state commit to make sure the next state is
applied before returning from the function.

This is the case for server state changes that don't trigger lb logic
and only perform some logging.

This is quite error prone, we could easily forget a state change
combination that could result in next_state, next_admin or next_eweight
not being applied. (cur_state, cur_admin and cur_eweight would be left
with unexpected values)

To fix this, we explicitly call srv_lb_commit_status() at the end
of srv_update_status() to enforce the new values, even if they were
already applied. (when a state changes requires lb state update
an implicit commit is already performed)

Applying the state change multiple times is safe (since the next value
always points to the current value).

Backport notes:

2.2:

Replace
	struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
	struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
9a1df02ccb BUG/MINOR: server: incorrect report for tracking servers leaving drain
Report message for tracking servers completely leaving drain is wrong:

The check for "leaving drain .. via" never evaluates because the
condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the
current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN).

For tracking servers that leave inherited drain mode, this results in the
following message being emitted:
  "Server x/b is UP (leaving forced drain)"

Instead of:
  "Server x/b is UP (leaving drain) via x/a"

To this fix: we check if FDRAIN is currently set, else it means that the
drain status is inherited from the tracked server (IDRAIN)

This regression was introduced with 64cc49cf ("MAJOR: servers: propagate server
status changes asynchronously."), thus it may be backported to every stable
versions.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
096b383e16 MINOR: hlua/event_hdl: timestamp for events
'when' optional argument is provided to lua event handlers.

It is an integer representing the number of seconds elapsed since Epoch
and may be used in conjunction with lua `os.date()` function to provide
a custom format string.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
e9314fb7a7 MINOR: event_hdl: provide event->when for advanced handlers
For advanced async handlers only
(Registered using EVENT_HDL_ASYNC_TASK() macro):

event->when is provided as a struct timeval and fetched from 'date'
haproxy global variable.

Thanks to 'when', related event consumers will be able to timestamp
events, even if they don't work in real-time or near real-time.
Indeed, unlike sync or normal async handlers, advanced async handlers
could purposely delay the consumption of pending events, which means
that the date wouldn't be accurate if computed directly from within
the handler.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
ebf58e991a MINOR: event_hdl: dynamically allocated event data members
Add the ability to provide a cleanup function for event data passed
via the publishing function.

One use case could be the need to provide valid pointers in the safe
section of the data struct.
Cleanup function will be automatically called with data (or copy of data)
as argument when all handlers consumed the event, which provides an easy
way to release some memory or decrement refcounts to ressources that were
provided through the data struct.
data in itself may not be freed by the cleanup function, it is handled
by the API.

This would allow passing large (allocated) data blocks through the data
struct while keeping data struct size under the EVENT_HDL_ASYNC_EVENT_DATA
size limit.

To do so, when publishing an event, where we would currently do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
	 * may not use pointers to short-living resources
	 */
        event_data.safe.my_custom_data = x;

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA(&event_data));

We could do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
	 * may not use pointers to short-living resources,
	 * unless EVENT_HDL_CB_DATA_DM is used to ensure pointer
	 * consistency (ie: refcount)
	 */
        event_data.safe.my_custom_static_data = x;
	event_data.safe.my_custom_dynamic_data = malloc(1);

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA_DM(&event_data, data_new_family_cleanup));

With data_new_family_cleanup func which would look like this:

      void data_new_family_cleanup(const void *data)
      {
      	const struct event_hdl_cb_data_new_family *event_data = ptr;

	/* some data members require specific cleanup once the event
	 * is consumed
	 */
      	free(event_data.safe.my_custom_dynamic_data);
	/* don't ever free data! it is not ours */
      }

Not sure if this feature will become relevant in the future, so I prefer not
to mention it in the doc for now.

But given that the implementation is trivial and does not put a burden
on the existing API, it's a good thing to have it there, just in case.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
a63f4903c9 MINOR: server/event_hdl: prepare for upcoming refactors
This commit does nothing that ought to be mentioned, except that
it adds missing comments and slighty moves some function calls
out of "sensitive" code in preparation of some server code refactors.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
2f6a07dce8 MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args
Changing hlua_event_hdl_cb_data_push_args() return type to void since it
does not return anything useful.
Also changing its name to hlua_event_hdl_cb_push_args() since it does more
than just pushing cb data argument (it also handles event type and mgmt).

Errors catched by the function are reported as lua errors.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
55f84c7cab MINOR: hlua/event_hdl: expose proxy_uuid variable in server events
Adding proxy_uuid to ServerEvent class.
proxy_uuid contains the uuid of the proxy to which the server belongs
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
3d9bf4e1a5 MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups
Since "MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server"
we may now use proxy_uuid variable to perform proxy lookups when
handling a server event.

It is more reliable since proxy_uuid isn't subject to any size limitation
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
d714213862 MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server
Expose proxy_uuid variable in event_hdl_cb_data_server struct to
overcome proxy_name fixed length limitation.

proxy_uuid may be used by the handler to perform proxy lookups.
This should be preferred over lookups relying proxy_name.
(proxy_name is suitable for printing / logging purposes but not for
ID lookups since it has a maximum fixed length)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
0ddf052972 CLEANUP: server: fix update_status() function comment
srv_update_status() function comment says that the function "is designed
to be called asynchronously".

While this used to be true back then with 64cc49cf
("MAJOR: servers: propagate server status changes asynchronously.")

This is not true anymore since 3ff577e ("MAJOR: server: make server state changes
synchronous again")

Fixing the comment in order to better reflect current behavior.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
88687f0980 CLEANUP: errors: fix obsolete function comments
Since 9f903af5 ("MEDIUM: log: slightly refine the output format of
alerts/warnings/etc"), messages generated by ha_{alert,warning,notice}
don't embed date/time information anymore.

Updating some old function comments that kept saying otherwise.
2023-04-21 14:36:45 +02:00
Amaury Denoyelle
a65dd3a2c8 BUG/MINOR: quic: consume Rx datagram even on error
A BUG_ON crash can occur on qc_rcv_buf() if a Rx packet allocation
failed.

To fix this, datagram are marked as consumed even if a fatal error
occured during parsing. For the moment, only a Rx packet allocation
failure could provoke this. At this stage, it's unknown if the datagram
were partially parsed or not at all so it's better to discard it
completely.

This bug was detected using -dMfail argument.

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
d537ca79dc BUG/MINOR: quic: prevent crash on qc_new_conn() failure
Properly initialize el_th_ctx member first on qc_new_conn(). This
prevents a segfault if release should be called later due to memory
allocation failure in the function on qc_detach_th_ctx_list().

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
9bbfa72b67 BUG/MINOR: h3: fix crash on h3s alloc failure
Do not emit a CONNECTION_CLOSE on h3s allocation failure. Indeed, this
causes a crash as the calling function qcs_new() will also try to emit a
CONNECTION_CLOSE which triggers a BUG_ON() on qcc_emit_cc().

This was reproduced using -dMfail.

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
93d2ebe9f3 BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure
Previously, if a STREAM frame cannot be allocated for emission, a crash
would occurs due to an ABORT_NOW() statement in _qc_send_qcs().

Replace this by proper error code handling. Each stream were sending
fails are removed temporarily from qcc::send_list to a list local to
_qc_send_qcs(). Once emission has been conducted for all streams,
reinsert failed stream to qcc::send_list. This avoids to reloop on
failed streams on the second while loop at the end of _qc_send_qcs().

This crash was reproduced using -dMfail.

This should be backported up to 2.6.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
ed820823f0 BUG/MINOR: mux-quic: fix crash with app ops install failure
On MUX initialization, the application layer is setup via
qcc_install_app_ops(). If this function fails MUX is deallocated and an
error is returned.

This code path causes a crash before connection has been registered
prior into the mux_stopping_data::list for stopping idle frontend conns.
To fix this, insert the connection later in qc_init() once no error can
occured.

The crash was seen on the process closing with SUGUSR1 with a segfault
on mux_stopping_process(). This was reproduced using -dMfail.

This regression was introduced by the following patch :
  commit b4d119f0c7
  BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Frédéric Lécaille
d07421331f BUG/MINOR: quic: Wrong Retry token generation timestamp computing
Again a now_ms variable value used without the ticks API. It is used
to store the generation time of the Retry token to be received back
from the client.

Must be backported to 2.6 and 2.7.
2023-04-19 17:31:28 +02:00
Frédéric Lécaille
45662efb2f BUG/MINOR: quic: Unchecked buffer length when building the token
As server, an Initial does not contain a token but only the token length field
with zero as value. The remaining room was not checked before writting this field.

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Frédéric Lécaille
0ed94032b2 MINOR: quic: Do not allocate too much ack ranges
Limit the maximum number of ack ranges to QUIC_MAX_ACK_RANGES(32).

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Frédéric Lécaille
4b2627beae BUG/MINOR: quic: Stop removing ACK ranges when building packets
Since this commit:

    BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.

There are more chances that ack ranges may be removed from their trees when
building a packet. It is preferable to impose a limit to these trees. This
will be the subject of the a next commit to come.

For now on, it is sufficient to stop deleting ack range from their trees.
Remove quic_ack_frm_reduce_sz() and quic_rm_last_ack_ranges() which were
there to do that.
Make qc_frm_len() support ACK frames and calls it to ensure an ACK frame
may be added to a packet before building it.

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Aurelien DARRAGON
8cd620b46f MINOR: hlua: safe coroutine.create()
Overriding global coroutine.create() function in order to link the
newly created subroutine with the parent hlua ctx.
(hlua_gethlua() function from a subroutine will return hlua ctx from the
hlua ctx on which the coroutine.create() was performed, instead of NULL)

Doing so allows hlua_hook() function to support being called from
subroutines created using coroutine.create() within user lua scripts.

That is: the related subroutine will be immune to the forced-yield,
but it will still be checked against hlua timeouts. If the subroutine
fails to yield or finish before the timeout, the related lua handler will
be aborted (instead of going rogue unnoticed like it would be the case prior
to this commit)
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
cf0f792490 MINOR: hlua: hook yield on known lua state
When forcing a yield attempt from hlua_hook(), we should perform it on
the known hlua state, not on a potential substate created using
coroutine.create() from an existing hlua state from lua script.

Indeed, only true hlua couroutines will properly handle the yield and
perform the required timeout checks when returning in hlua_ctx_resume().

So far, this was not a concern because hlua_gethlua() would return NULL
if hlua_hook() is not directly being called from a hlua coroutine anyway.

But with this we're trying to make hlua_hook() ready for being called
from a subcoroutine which inherits from a parent hlua ctx.
In this case, no yield attempt will be performed, we will simply check
for hlua timeouts.

Not doing so would result in the timeout checks not being performed since
hlua_ctx_resume() is completely bypassed when yielding from the subroutine,
resulting in a user-defined coroutine potentially going rogue unnoticed.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
2a9764baae CLEANUP: hlua: avoid confusion between internal timers and tick based timers
Not all hlua "time" variables use the same time logic.

hlua->wake_time relies on ticks since its meant to be used in conjunction
with task scheduling. Thus, it should be stored as a signed int and
manipulated using the tick api.
Adding a few comments about that to prevent mixups with hlua internal
timer api which doesn't rely on the ticks api.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
58e36e5b14 MEDIUM: hlua: introduce tune.lua.burst-timeout
The "burst" execution timeout applies to any Lua handler.
If the handler fails to finish or yield before timeout is reached,
handler will be aborted to prevent thread contention, to prevent
traffic from not being served for too long, and ultimately to prevent
the process from crashing because of the watchdog kicking in.

Default value is 1000ms.
Combined with forced-yield default value of 10000 lua instructions, it
should be high enough to prevent any existing script breakage, while
still being able to catch slow lua converters or sample fetches
doing thread contention and risking the process stability.

Setting value to 0 completely bypasses this check. (not recommended but
could be required to restore original behavior if this feature breaks
existing setups somehow...)

No backport needed, although it could be used to prevent watchdog crashes
due to poorly coded (slow/cpu consuming) lua sample fetches/converters.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
da9503ca9a MEDIUM: hlua: reliable timeout detection
For non yieldable lua handlers (converters, fetches or yield
incompatible lua functions), current timeout detection relies on now_ms
thread local variable.

But within non-yieldable contexts, now_ms won't be updated if not by us
(because we're momentarily stuck in lua context so we won't
re-enter the polling loop, which is responsible for clock updates).

To circumvent this, clock_update_date(0, 1) was manually performed right
before now_ms is being read for the timeout checks.

But this fails to work consistently, because if no other concurrent
threads periodically run clock_update_global_date(), which do happen if
we're the only active thread (nbthread=1 or low traffic), our
clock_update_date() call won't reliably update our local now_ms variable

Moreover, clock_update_date() is not the right tool for this anyway, as
it was initially meant to be used from the polling context.
Using it could have negative impact on other threads relying on now_ms
to be stable. (because clock_update_date() performs global clock update
from time to time)

-> Introducing hlua multipurpose timer, which is internally based on
now_cpu_time_fast() that provides per-thread consistent clock readings.

Thanks to this new hlua timer API, hlua timeout logic is less error-prone
and more robust.

This allows the timeout detection to work as expected for both yieldable
and non-yieldable lua handlers.

This patch depends on commit "MINOR: clock: add now_cpu_time_fast() function"

While this could theorically be backported to all stable versions,
it is advisable to avoid backports unless we're confident enough
since it could cause slight behavior changes (timing related) in
existing setups.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
df188f145b MINOR: clock: add now_cpu_time_fast() function
Same as now_cpu_time(), but for fast queries (less accurate)
Relies on now_cpu_time() and now_mono_time_fast() is used
as a cache expiration hint to prevent now_cpu_time() from being
called too often since it is known to be quite expensive.

Depends on commit "MINOR: clock: add now_mono_time_fast() function"
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
07cbd8e074 MINOR: clock: add now_mono_time_fast() function
Same as now_mono_time(), but for fast queries (less accurate)
Relies on coarse clock source (also known as fast clock source on
some systems).

Fallback to now_mono_time() if coarse source is not supported on the system.
2023-04-19 11:03:31 +02:00
Willy Tarreau
be336620b7 BUG/MINOR: cfgparse: make sure to include openssl-compat
Commit 5003ac7fe ("MEDIUM: config: set useful ALPN defaults for HTTPS
and QUIC") revealed a build dependency bug: if QUIC is not enabled,
cfgparse doesn't have any dependency on the SSL stack, so the various
ifdefs that try to check special conditions such as rejecting an H2
config with too small a bufsize, are silently ignored. This was
detected because the default ALPN string was not set and caused the
alpn regtest to fail without QUIC support. Adding openssl-compat to
the list of includes seems to be sufficient to have what we need.

It's unclear when this dependency was broken, it seems that even 2.2
didn't have an explicit dependency on anything SSL-related, though it
could have been inherited through other files (as happens with QUIC
here). It would be safe to backport it to all stable branches. The
impact is very low anyway.
2023-04-19 10:46:21 +02:00
Amaury Denoyelle
89e48ff92f BUG/MEDIUM: quic: prevent crash on Retry sending
The following commit introduced a regression :
  commit 1a5cc19cec
  MINOR: quic: adjust Rx packet type parsing

Since this commit, qv variable was left to NULL as version is stored
directly in quic_rx_packet instance. In most cases, this only causes
traces to skip version printing. However, qv is dereferenced when
sending a Retry which causes a segfault.

To fix this, simply remove qv variable and use pkt->version instead,
both for traces and send_retry() invocation.

This bug was detected thanks to QUIC interop runner. It can easily be
reproduced by using quic-force-retry on the bind line.

This must be backported up to 2.7.
2023-04-19 10:18:58 +02:00
Willy Tarreau
5003ac7fe9 MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC
This commit makes sure that if three is no "alpn", "npn" nor "no-alpn"
setting on a "bind" line which corresponds to an HTTPS or QUIC frontend,
we automatically turn on "h2,http/1.1" as an ALPN default for an HTTP
listener, and "h3" for a QUIC listener. This simplifies the configuration
for end users since they won't have to explicitly configure the ALPN
string to enable H2, considering that at the time of writing, HTTP/1.1
represents less than 7% of the traffic on large infrastructures. The
doc and regtests were updated. For more info, refer to the following
thread:

  https://www.mail-archive.com/haproxy@formilux.org/msg43410.html
2023-04-19 09:52:20 +02:00
Willy Tarreau
de85de69ec MINOR: ssl_crtlist: dump "no-alpn" on "show crtlist" when "no-alpn" was set
Instead of dumping "alpn " better show "no-alpn" as configured.
2023-04-19 09:12:43 +02:00
Willy Tarreau
a2a095536a MINOR: ssl: do not set ALPN callback with the empty string
While it does not have any effect, it's better not to try to setup an
ALPN callback nor to try to lookup algorithms when the configured ALPN
string is empty as a result of "no-alpn" being used.
2023-04-19 09:12:43 +02:00
Willy Tarreau
158c18e85a MINOR: config: add "no-alpn" support for bind lines
It's possible to replace a previously set ALPN but not to disable ALPN
if it was previously set. The new "no-alpn" setting allows to disable
a previously set ALPN setting by preparing an empty one that will be
replaced and freed when the config is validated.
2023-04-19 08:38:06 +02:00
Christopher Faulet
d0c57d3d33 BUG/MEDIUM: stconn: Propagate error on the SC on sending path
On sending path, a pending error can be promoted to a terminal error at the
endpoint level (SE_FL_ERR_PENDING to SE_FL_ERROR). When this happens, we
must propagate the error on the SC to be able to handle it at the stream
level and eventually forward it to the other side.

Because of this bug, it is possible to freeze sessions, for instance on the
CLI.

It is a 2.8-specific issue. No backport needed.
2023-04-18 18:57:04 +02:00
Christopher Faulet
845f7c4708 CLEANUP: cli: Remove useless debug message in cli_io_handler()
When compiled in debug mode, HAProxy prints a debug message at the end of
the cli I/O handle. It is pretty annoying and useless because, we can active
applet traces. Thus, just remove it.
2023-04-18 18:57:04 +02:00
Christopher Faulet
cbfcb02e21 CLEANUP: backend: Remove useless debug message in assign_server()
When compiled in debug mode, HAProxy prints a debug message at the beginning
of assign_server(). It is pretty annoying and useless because, in debug
mode, we can active stream traces. Thus, just remove it.
2023-04-18 18:57:04 +02:00
Christopher Faulet
27c17d1ca5 BUG/MINOR: http-ana: Update analyzers on both sides when switching in TUNNEL mode
The commit 9704797fa ("BUG/MEDIUM: http-ana: Properly switch the request in
tunnel mode on upgrade") fixes the switch in TUNNEL mode, but only
partially. Because both channels are switch in TUNNEL mode in same time on
one side, the channel's analyzers on the opposite side are not updated
accordingly. This prevents the tunnel timeout to be applied.

So instead of updating both sides in same time, we only force the analysis
on the other side by setting CF_WAKE_ONCE flag when a channel is switched in
TUNNEL mode. In addition, we must take care to forward all data if there is
no DATAa TCP filters registered.

This patch is related to the issue #2125. It is 2.8-specific. No backport
needed.
2023-04-18 18:57:04 +02:00
Amaury Denoyelle
0783a7b08e MINOR: listener: remove unneeded local accept flag
Remove the receiver RX_F_LOCAL_ACCEPT flag. This was used by QUIC
protocol before thread rebinding was supported by the quic_conn layer.

This should be backported up to 2.7 after the previous patch has also
been taken.
2023-04-18 17:09:34 +02:00
Amaury Denoyelle
1acbbca171 MAJOR: quic: support thread balancing on accept
Before this patch, QUIC protocol used a custom add_listener callback.
This was because a quic_conn instance was allocated before accept. Its
thread affinity was fixed and could not be changed after. The thread was
derived itself from the CID selected by the client which prevent an even
repartition of QUIC connections on multiple threads.

A series of patches was introduced with a lot of changes. The most
important ones :
* removal of affinity between an encoded CID and a thread
* possibility to rebind a quic_conn on a new thread

Thanks to this, it's possible to suppress the custom add_listener
callback. Accept is conducted for QUIC protocol as with the others. A
less loaded thread is selected on listener_accept() and the connection
stack is bind on it. This operation implies that quic_conn instance is
moved to the new thread using the set_affinity QUIC protocol callback.

To reactivate quic_conn instance after thread rebind,
qc_finalize_affinity_rebind() is called after accept on the new thread
by qc_xprt_start() through accept_queue_process() / session_accept_fd().

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:09:34 +02:00
Amaury Denoyelle
739de3f119 MINOR: quic: properly finalize thread rebinding
When a quic_conn instance is rebinded on a new thread its tasks and
tasklet are destroyed and new ones created. Its socket is also migrated
to a new thread which stop reception on it.

To properly reactivate a quic_conn after rebind, wake up its tasks and
tasklet if they were active before thread rebind. Also reactivate
reading on the socket FD. These operations are implemented on a new
function qc_finalize_affinity_rebind().

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:09:02 +02:00
Amaury Denoyelle
5f8704152a BUG/MINOR: quic: transform qc_set_timer() as a reentrant function
qc_set_timer() function is used to rearm the timer for loss detection
and probing. Previously, timer was always rearm when congestion window
was free due to a wrong interpretation of the RFC which mandates the
client to rearm the timer before handshake completion to avoid a
deadlock related to anti-amplification.

Fix this by removing this code from quic_pto_pktns(). This allows
qc_set_timer() to be reentrant and only activate the timer if needed.

The impact of this bug seems limited. It can probably caused the timer
task to be processed too frequently which could caused too frequent
probing.

This change will allow to reuse easily qc_set_timer() after quic_conn
thread migration. As such, the new timer task will be scheduled only if
needed.

This should be backported up to 2.6.
2023-04-18 17:09:02 +02:00
Amaury Denoyelle
25174d51ef MEDIUM: quic: implement thread affinity rebinding
Implement a new function qc_set_tid_affinity(). This function is
responsible to rebind a quic_conn instance to a new thread.

This operation consists mostly of releasing existing tasks and tasklet
and allocating new instances on the new thread. If the quic_conn uses
its owned socket, it is also migrated to the new thread. The migration
is finally completed with updated the CID TID to the new thread. After
this step, the connection is thus accessible to the new thread and
cannot be access anymore on the old one without risking race condition.

To ensure rebinding is either done completely or not at all, tasks and
tasklet are pre-allocated before all operations. If this fails, an error
is returned and rebiding is not done.

To destroy the older tasklet, its context is set to NULL before wake up.
In I/O callbacks, a new function qc_process() is used to check context
and free the tasklet if NULL.

The thread rebinding can cause a race condition if the older thread
quic_dghdlrs::dgrams list contains datagram for the connection after
rebinding is done. To prevent this, quic_rx_pkt_retrieve_conn() always
check if the packet CID is still associated to the current thread or
not. In the latter case, no connection is returned and the new thread is
returned to allow to redispatch the datagram to the new thread in a
thread-safe way.

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:08:34 +02:00
Amaury Denoyelle
1304d19dee MINOR: quic: delay post handshake frames after accept
When QUIC handshake is completed on our side, some frames are prepared
to be sent :
* HANDSHAKE_DONE
* several NEW_CONNECTION_ID with CIDs allocated

This step was previously executed in quic_conn_io_cb() directly after
CRYPTO frames parsing. This patch delays it to be completed after
accept. Special care have been taken to ensure it is still functional
with 0-RTT activated.

For the moment, this patch should have no impact. However, when
quic_conn thread migration on accept will be implemented, it will be
easier to remap only one CID to the new thread. New CIDs will be
allocated after migration on the new thread.

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:08:28 +02:00
Amaury Denoyelle
a66e04338e MINOR: protocol: define new callback set_affinity
Define a new protocol callback set_affinity. This function is used
during listener_accept() to notify about a rebind on a new thread just
before pushing the connection on the selected thread queue. If the
callback fails, accept is done locally.

This change will be useful for protocols with state allocated before
accept is done. For the moment, only QUIC protocol is concerned. This
will allow to rebind the quic_conn to a new thread depending on its
load.

This should be backported up to 2.7 after a period of observation.
2023-04-18 16:54:52 +02:00
Amaury Denoyelle
987812b190 MINOR: quic: do not proceed to accept for closing conn
Each quic_conn is inserted in an accept queue to allocate the upper
layers. This is done through a listener tasklet in
quic_sock_accept_conn().

This patch interrupts the accept process for a quic_conn in
closing/draining state. Indeed, this connection will soon be closed so
it's unnecessary to allocate a complete stack for it.

This patch will become necessary when thread migration is implemented.
Indeed, it won't be allowed to proceed to thread migration for a closing
quic_conn.

This should be backported up to 2.7 after a period of observation.
2023-04-18 16:54:48 +02:00
Amaury Denoyelle
f16ec344d5 MEDIUM: quic: handle conn bootstrap/handshake on a random thread
TID encoding in CID was removed by a recent change. It is now possible
to access to the <tid> member stored in quic_connection_id instance.

For unknown CID, a quick solution was to redispatch to the thread
corresponding to the first CID byte. This ensures that an identical CID
will always be handled by the same thread to avoid creating multiple
same connection. However, this forces an uneven load repartition which
can be critical for QUIC handshake operation.

To improve this, remove the above constraint. An unknown CID is now
handled by its receiving thread. However, this means that if multiple
packets are received with the same unknown CID, several threads will try
to allocate the same connection.

To prevent this race condition, CID insertion in global tree is now
conducted first before creating the connection. This is a thread-safe
operation which can only be executed by a single thread. The thread
which have inserted the CID will then proceed to quic_conn allocation.
Other threads won't be able to insert the same CID : this will stop the
treatment of the current packet which is redispatch to the now owning
thread.

This should be backported up to 2.7 after a period of observation.
2023-04-18 16:54:44 +02:00
Amaury Denoyelle
1e959ad522 MINOR: quic: remove TID encoding in CID
CIDs were moved from a per-thread list to a global list instance. The
TID-encoded is thus non needed anymore.

This should be backported up to 2.7 after a period of observation.
2023-04-18 16:54:31 +02:00
Amaury Denoyelle
e83f937cc1 MEDIUM: quic: use a global CID trees list
Previously, quic_connection_id were stored in a per-thread tree list.
Datagram were first dispatched to the correct thread using the encoded
TID before a tree lookup was done.

Remove these trees and replace it with a global trees list of 256
entries. A CID is using the list index corresponding to its first byte.
On datagram dispatch, CID is lookup on its tree and TID is retrieved
using new member quic_connection_id.tid. As such, a read-write lock
protects each list instances. With 256 entries, it is expected that
contention should be reduced.

A new structure quic_cid_tree served as a tree container associated with
its read-write lock. An API is implemented to ensure lock safety for
insert/lookup/delete operation.

This patch is a step forward to be able to break the affinity between a
CID and a TID encoded thread. This is required to be able to migrate a
quic_conn after accept to select thread based on their load.

This should be backported up to 2.7 after a period of observation.
2023-04-18 16:54:17 +02:00
Amaury Denoyelle
66947283ba MINOR: quic: remove TID ref from quic_conn
Remove <tid> member in quic_conn. This is moved to quic_connection_id
instance.

For the moment, this change has no impact. Indeed, qc.tid reference
could easily be replaced by tid as all of this work was already done on
the connection thread. However, it is planified to support quic_conn
thread migration in the future, so removal of qc.tid will simplify this.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
c2a9264f34 MINOR: quic: adjust quic CID derive API
ODCID are never stored in the CID tree. Instead, we store our generated
CID which is directly derived from the CID using a hash function. This
operation is done via quic_derive_cid().

Previously, generated CID was returned as a 64-bits integer. However,
this is cumbersome to convert as an array of bytes which is the most
common CID representation. Adjust this by modifying return type to a
quic_cid struct.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
1a5cc19cec MINOR: quic: adjust Rx packet type parsing
qc_parse_hd_form() is the function used to parse the first byte of a
packet and return its type and version. Its API has been simplified with
the following changes :
* extra out paremeters are removed (long_header and version). All infos
  are now stored directly in quic_rx_packet instance
* a new dummy version is declared in quic_versions array with a 0 number
  code. This can be used to match Version negotiation packets.
* a new default packet type is defined QUIC_PACKET_TYPE_UNKNOWN to be
  used as an initial value.

Also, the function has been exported to an include file. This will be
useful to be able to reuse on quic-sock to parse the first packet of a
datagram.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
6ac0fb0f13 MINOR: quic: remove uneeded tasklet_wakeup after accept
No need to explicitely wakeup quic-conn tasklet after accept is done.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
591e7981d9 CLEANUP: quic: rename quic_connection_id vars
Two different structs exists for QUIC connection ID :
* quic_connection_id which represents a full CID with its sequence
  number
* quic_cid which is just a buffer with a length. It is contained in the
  above structure.

To better differentiate them, rename all quic_connection_id variable
instances to "conn_id" by contrast to "cid" which is used for quic_cid.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
9b68b64572 CLEANUP: quic: remove unused qc param on stateless reset token
Remove quic_conn instance as first parameter of
quic_stateless_reset_token_init() and quic_stateless_reset_token_cpy()
functions. It was only used for trace purpose.

The main advantage is that it will be possible to allocate a QUIC CID
without a quic_conn instance using new_quic_cid() which is requires to
first check if a CID is existing before allocating a connection.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
90e5027e46 CLEANUP: quic: remove unused scid_node
Remove unused scid_node member for quic_conn structure. It was prepared
for QUIC backend support.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
22a368ce58 CLEANUP: quic: remove unused QUIC_LOCK label
QUIC_LOCK label is never used. Indeed, lock usage is minimal on QUIC as
every connection is pinned to its owned thread.

This should be backported up to 2.7.
2023-04-18 16:20:47 +02:00
Amaury Denoyelle
c361937d51 BUG/MINOR: task: allow to use tasklet_wakeup_after with tid -1
Adjust BUG_ON() statement to allow tasklet_wakeup_after() for tasklets
with tid pinned to -1 (the current thread). This is similar to
tasklet_wakeup().

This should be backported up to 2.6.
2023-04-18 16:20:47 +02:00
Willy Tarreau
ca1027c22f MINOR: mux-h2: make the max number of concurrent streams configurable per side
For a long time the maximum number of concurrent streams was set once for
both sides (front and back) while the impacts are different. This commit
allows it to be configured separately for each side. The older settings
remains the fallback choice when other ones are not set.
2023-04-18 15:58:55 +02:00
Willy Tarreau
9d7abda787 MINOR: mux-h2: make the initial window size configurable per side
For a long time the initial window size (per-stream size) was set once
for both directions, frontend and backend, resulting in a tradeoff between
upload speed and download fairness. This commit allows it to be configured
separately for each side. The older settings remains the fallback choice
when other ones are not set.
2023-04-18 15:58:55 +02:00
Christopher Faulet
b36e512bd0 MINOR: stconn: Propagate EOS from an applet to the attached stream-connector
In the same way than for a stream-connector attached to a mux, an EOS is now
propagated from an applet to its stream-connector. To do so, sc_applet_eos()
function is added.
2023-04-17 17:41:28 +02:00
Christopher Faulet
1aec6c92cb MINOR: stconn: Propagate EOS from a mux to the attached stream-connector
Now there is a SC flag to state the endpoint has reported an end-of-stream,
it is possible to distinguish an EOS from an abort at the stream-connector
level.

sc_conn_read0() function is renamed to sc_conn_eos() and it propagates an
EOS by setting SC_FL_EOS instead of SC_FL_ABRT_DONE. It only concernes
stream-connectors attached to a mux.
2023-04-17 17:41:28 +02:00
Christopher Faulet
ca5309a9a3 MINOR: stconn: Add a flag to report EOS at the stream-connector level
SC_FL_EOS flag is added to report the end-of-stream at the SC level. It will
be used to distinguish end of stream reported by the endoint, via the
SE_FL_EOS flag, and the abort triggered by the stream, via the
SC_FL_ABRT_DONE flag.

In this patch, the flag is defined and is systematically tested everywhere
SC_FL_ABRT_DONE is tested. It should be safe because it is never set.
2023-04-17 17:41:28 +02:00
Christopher Faulet
285aa40d35 BUG/MEDIUM: log: Properly handle client aborts in syslog applet
In the syslog applet, when there is no output data, nothing is performed and
the applet leaves by requesting more data. But it is an issue because a
client abort is only handled if it reported with the last bytes of the
message. If the abort occurs after the message was handled, it is ignored.
The session remains opened and inactive until the client timeout is being
triggered. It no such timeout is configured, given that the default maxconn
is 10, all slots can be quickly busy and make the applet unresponsive.

To fix the issue, the best is to always try to read a message when the I/O
handle is called. This way, the abort can be handled. And if there is no
data, we leave as usual.

This patch should fix the issue #2112. It must be backported as far as 2.4.
2023-04-17 16:50:30 +02:00
Christopher Faulet
9704797fa2 BUG/MEDIUM: http-ana: Properly switch the request in tunnel mode on upgrade
Since the commit f2b02cfd9 ("MAJOR: http-ana: Review error handling during
HTTP payload forwarding"), during the payload forwarding, we are analyzing a
side, we stop to test the opposite side. It means when the HTTP request
forwarding analyzer is called, we no longer check the response side and vice
versa.

Unfortunately, since then, the HTTP tunneling is broken after a protocol
upgrade. On the response is switch in TUNNEL mode. The request remains in
DONE state. As a consequence, data received from the server are forwarded to
the client but not data received from the client.

To fix the bug, when both sides are in DONE state, both are switched in same
time in TUNNEL mode if it was requested. It is performed in the same way in
http_end_request() and http_end_response().

This patch should fix the issue #2125. It is 2.8-specific. No backport
needed.
2023-04-17 16:17:35 +02:00
William Lallemand
a21ca74e83 MINOR: ssl: remove OpenSSL 1.0.2 mention into certificate loading error
Remove the mention to OpenSSL 1.0.2 in the certificate chain loading
error, which is not relevant.

Could be backported in 2.7.
2023-04-17 14:45:40 +02:00
Ilya Shipitsin
2ca01589a0 CLEANUP: use "offsetof" where appropriate
let's use the C library macro "offsetof"
2023-04-16 09:58:49 +02:00
Frdric Lcaille
b5efe7901d BUG/MINOR: quic: Do not use ack delay during the handshakes
As revealed by GH #2120 opened by @Tristan971, there are cases where ACKs
have to be sent without packet to acknowledge because the ACK timer has
been triggered and the connection needs to probe the peer at the same time.
Indeed

Thank you to @Tristan971 for having reported this issue.

Must be backported to 2.6 and 2.7.
2023-04-14 21:09:13 +02:00
Christopher Faulet
75b954fea4 BUG/MINOR: stconn: Don't set SE_FL_ERROR at the end of sc_conn_send()
When I reworked my series, this code was first removed and reinserted by
error. So let's remove it again.
2023-04-14 17:32:44 +02:00
Christopher Faulet
25d9fe50f5 MEDIUM: stconn: Rely on SC flags to handle errors instead of SE flags
It is the last commit on this subject. we stop to use SE_FL_ERROR flag from
the SC, except at the I/O level. Otherwise, we rely on SC_FL_ERROR
flag. Now, there should be a real separation between SE flags and SC flags.
2023-04-14 17:05:54 +02:00
Christopher Faulet
e182a8e651 MEDIUM: stream: Stop to use SE flags to detect endpoint errors
Here again, we stop to use SE_FL_ERROR flag from process_stream() and
sub-functions and we fully rely on SC_FL_ERROR to do so.
2023-04-14 17:05:54 +02:00
Christopher Faulet
d7bac88427 MEDIUM: stream: Stop to use SE flags to detect read errors from analyzers
In the same way the previous commit, we stop to use SE_FL_ERROR flag from
analyzers and their sub-functions. We now fully rely on SC_FL_ERROR to do so.
2023-04-14 17:05:54 +02:00
Christopher Faulet
725170eee6 MEDIUM: backend: Stop to use SE flags to detect connection errors
SE_FL_ERROR flag is no longer set when an error is detected durign the
connection establishment. SC_FL_ERROR flag is set instead. So it is safe to
remove test on SE_FL_ERROR to detect connection establishment error.
2023-04-14 17:05:54 +02:00
Christopher Faulet
88d05a0f3b MEDIUM: tree-wide: Stop to set SE_FL_ERROR from upper layer
We can now fully rely on SC_FL_ERROR flag from the stream. The first step is
to stop to set the SE_FL_ERROR flag. Only endpoints are responsible to set
this flag. It was a design limitation. It is now fixed.
2023-04-14 17:05:54 +02:00
Christopher Faulet
ad46e52814 MINOR: tree-wide: Test SC_FL_ERROR with SE_FL_ERROR from upper layer
From the stream, when SE_FL_ERROR flag is tested, we now also test the
SC_FL_ERROR flag. Idea is to stop to rely on the SE descriptor to detect
errors.
2023-04-14 17:05:54 +02:00
Christopher Faulet
340021b89f MINOR: stream: Set SC_FL_ERROR on channels' buffer allocation error
Set SC_FL_ERROR flag when we fail to allocate a buffer for a stream.
2023-04-14 17:05:54 +02:00
Christopher Faulet
38656f406c MINOR: backend: Set SC_FL_ERROR on connection error
During connection establishement, if an error occurred, the SC_FL_ERROR flag
is now set. Concretely, it is set when SE_FL_ERROR is also set.
2023-04-14 17:05:53 +02:00
Christopher Faulet
a1d14a7c7f MINOR: stconn: Add a flag to ack endpoint errors at SC level
The flag SC_FL_ERROR is added to ack errors on the endpoint. When
SE_FL_ERROR flag is detected on the SE descriptor, the corresponding is set
on the SC. Idea is to avoid, as far as possible, to manipulated the SE
descriptor in upper layers and know when an error in the endpoint is handled
by the SC.

For now, this flag is only set and cleared but never tested.
2023-04-14 17:05:53 +02:00
Christopher Faulet
638fe6ab0f MINOR: stconn: Don't clear SE_FL_ERROR when endpoint is reset
There is no reason to remove this flag. When the SC endpoint is reset, it is
replaced by a new one. The old one is released. It was useful when the new
endpoint inherited some flags from the old one.  But it is no longer
performed. Thus there is no reason still unset this flag.
2023-04-14 17:05:53 +02:00
Christopher Faulet
e8bcef5f22 MEDIUM: stconn: Forbid applets with more to deliver if EOI was reached
When an applet is woken up, before calling its io_handler, we pretend it has
no more data to deliver. So, after the io_handler execution, it is a bug if
an applet states it has more data to deliver while the end of input is
reached.

So a BUG_ON() is added to be sure it never happens.
2023-04-14 17:05:53 +02:00
Christopher Faulet
56a2b608b0 MINOR: stconn: Stop to set SE_FL_ERROR on sending path
It is not the SC responsibility to report errors on the SE descriptor. It is
the endpoint responsibility. It must switch SE_FL_ERR_PENDING into
SE_FL_ERROR if the end of stream was detected. It can even be considered as
a bug if it is not done by he endpoint.

So now, on sending path, a BUG_ON() is added to abort if SE_FL_EOS and
SE_FL_ERR_PENDING flags are set but not SE_FL_ERROR. It is trully important
to handle this case in the endpoint to be able to properly shut the endpoint
down.
2023-04-14 17:05:53 +02:00
Christopher Faulet
d3bc340e7e BUG/MINOR: cli: Don't close when SE_FL_ERR_PENDING is set in cli analyzer
SE_FL_ERR_PENDING is used to report an error on the write side. But it is
not a terminal error. Some incoming data may still be available. In the cli
analyzers, it is important to not close the stream when this flag is
set. Otherwise the response to a command can be truncated. It is probably
hard to observe. But it remains a bug.

While this patch could be backported to 2.7, there is no real reason to do
so, except if someone reports a bug about truncated responses.
2023-04-14 16:49:04 +02:00
Christopher Faulet
214f1b5c16 MINOR: tree-wide: Replace several chn_prod() by the corresponding SC
At many places, call to chn_prod() can be easily replaced by the
corresponding SC. It is a bit easier to understand which side is
manipulated.
2023-04-14 15:06:04 +02:00
Christopher Faulet
64350bbf05 MINOR: tree-wide: Replace several chn_cons() by the corresponding SC
At many places, call to chn_cons() can be easily replaced by the
corresponding SC. It is a bit easier to understand which side is
manipulated.
2023-04-14 15:04:03 +02:00
Christopher Faulet
b2b1c3a6ea MINOR: channel/stconn: Replace sc_shutw() by sc_shutdown()
All reference to a shutw is replaced by an abort. So sc_shutw() is renamed
sc_shutdown(). SC app ops functions are renamed accordingly.
2023-04-14 15:02:57 +02:00
Christopher Faulet
208c712b40 MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE
Here again, it is just a flag renaming. In SC flags, there is no longer
shutdown for writes but shutdowns.
2023-04-14 15:01:21 +02:00
Christopher Faulet
cfc11c0eae MINOR: channel/stconn: Replace sc_shutr() by sc_abort()
All reference to a shutr is replaced by an abort. So sc_shutr() is renamed
sc_abort(). SC app ops functions are renamed accordingly.
2023-04-14 14:54:35 +02:00
Christopher Faulet
0c370eee6d MINOR: stconn: Rename SC_FL_SHUTR in SC_FL_ABRT_DONE
Here again, it is just a flag renaming. In SC flags, there is no longer
shutdown for reads but aborts. For now this flag is set when a read0 is
detected. It is of couse not accurate. This will be changed later.
2023-04-14 14:51:22 +02:00
Christopher Faulet
df7cd710a8 MINOR: channel/stconn: Replace channel_shutw_now() by sc_schedule_shutdown()
After the flag renaming, it is now the turn for the channel function to be
renamed and moved in the SC scope. channel_shutw_now() is replaced by
sc_schedule_shutdown(). The request channel is replaced by the front SC and
the response is replace by the back SC.
2023-04-14 14:49:45 +02:00
Christopher Faulet
e38534cbd0 MINOR: stconn: Rename SC_FL_SHUTW_NOW in SC_FL_SHUT_WANTED
Because shutowns for reads are now considered as aborts, the shudowns for
writes can now be considered as shutdowns. Here it is just a flag
renaming. SC_FL_SHUTW_NOW is renamed SC_FL_SHUT_WANTED.
2023-04-14 14:46:07 +02:00
Christopher Faulet
12762f09a5 MINOR: channel/stconn: Replace channel_shutr_now() by sc_schedule_abort()
After the flag renaming, it is now the turn for the channel function to be
renamed and moved in the SC scope. channel_shutr_now() is replaced by
sc_schedule_abort(). The request channel is replaced by the front SC and the
response is replace by the back SC.
2023-04-14 14:08:49 +02:00
Christopher Faulet
573ead1e68 MINOR: stconn: Rename SC_FL_SHUTR_NOW in SC_FL_ABRT_WANTED
It is the first step to transform shutdown for reads for the upper layer
into aborts. This patch is quite simple, it is just a flag renaming.
2023-04-14 14:06:01 +02:00
Christopher Faulet
7eb837df4a MINOR: stream: Introduce stream_abort() to abort on both sides in same time
The function stream_abort() should now be called when an abort is performed
on the both channels in same time.
2023-04-14 14:04:59 +02:00
Christopher Faulet
3db538ac2f MINOR: channel: Forwad close to other side on abort
Most of calls to channel_abort() are associated to a call to
channel_auto_close(). Others are in areas where the auto close is the
default. So, it is now systematically enabled when an abort is performed on
a channel, as part of channel_abort() function.
2023-04-14 13:56:28 +02:00
Christopher Faulet
0adffb62c1 MINOR: filters: Review and simplify errors handling
First, it is useless to abort the both channel explicitly. For HTTP streams,
http_reply_and_close() is called. This function already take care to abort
processing. For TCP streams, we can rely on stream_retnclose().

To set termination flags, we can also rely on http_set_term_flags() for HTTP
streams and sess_set_term_flags() for TCP streams. Thus no reason to handle
them by hand.

At the end, the error handling after filters evaluation is now quite simple.
2023-04-14 12:13:09 +02:00
Christopher Faulet
dbad8ec787 MINOR: stream: Uninline and export sess_set_term_flags() function
This function will be used to set termination flags on TCP streams from
outside of process_stream(). Thus, it must be uninlined and exported.
2023-04-14 12:13:09 +02:00
Christopher Faulet
95125886ee BUG/MEDIUM: stconn: Do nothing in sc_conn_recv() when the SC needs more room
We erroneously though that an attempt to receive data was not possible if the SC
was waiting for more room in the channel buffer. A BUG_ON() was added to detect
bugs. And in fact, it is possible.

The regression was added in commit 341a5783b ("BUG/MEDIUM: stconn: stop to
enable/disable reads from streams via si_update_rx").

This patch should fix the issue #2115. It must be backported if the commit
above is backported.
2023-04-14 12:13:09 +02:00
Christopher Faulet
915ba08b57 BUG/MEDIUM: stream: Report write timeouts before testing the flags
A regression was introduced when stream's timeouts were refactored. Write
timeouts are not testing is the right order. When timeous of the front SC
are handled, we must then test the read timeout on the request channel and
the write timeout on the response channel. But write timeout is tested on
the request channel instead. On the back SC, the same mix-up is performed.

We must be careful to handle timeouts before checking channel flags. To
avoid any confusions, all timeuts are handled first, on front and back SCs.
Then flags of the both channels are tested.

It is a 2.8-specific issue. No backport needed.
2023-04-14 12:13:09 +02:00
Christopher Faulet
925279ccf2 BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity
There is a bug at begining of process_stream(). The SE_FL_ERROR flag is
tested against backend stream-connector's flags instead of its SE
descriptor's flags. It is an old typo, introduced when the stream-interfaces
were replaced by the conn-streams.

This patch must be backported as far as 2.6.
2023-04-14 12:13:09 +02:00
Frédéric Lécaille
895700bd32 BUG/MINOR: quic: Wrong Application encryption level selection when probing
This bug arrived with this commit:

    MEDIUM: quic: Ack delay implementation

After having probed the Handshake packet number space, one must not select the
Application encryption level to continue trying building packets as this is done
when the connection is not probing. Indeed, if the ACK timer has been triggered
in the meantime, the packet builder will try to build a packet at the Application
encryption level to acknowledge the received packet. But there is very often
no 01RTT packet to acknowledge when the connection is probing before the
handshake is completed. This triggers a BUG_ON() in qc_do_build_pkt() which
checks that the tree of ACK ranges to be used is not empty.

Thank you to @Tristan971 for having reported this issue in GH #2109.

Must be backported to 2.6 and 2.7.
2023-04-13 19:20:09 +02:00
Frédéric Lécaille
a576c1b0c6 MINOR: quic: Remove a useless test about probing in qc_prep_pkts()
qel->pktns->tx.pto_probe is set to 0 after having prepared a probing
datagram. There is no reason to check this parameter. Furthermore
it is always 0 when the connection does not probe the peer.

Must be backported to 2.6 and 2.7.
2023-04-13 19:20:09 +02:00
Frédéric Lécaille
91369cfcd0 MINOR: quic: Display the packet number space flags in traces
Display this information when the encryption level is also displayed.

Must be backported to 2.6 and 2.7.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
595251f22e BUG/MINOR: quic: SIGFPE in quic_cubic_update()
As reported by @Tristan971 in GH #2116, the congestion control window could be zero
due to an inversion in the code about the reduction factor to be applied.
On a new loss event, it must be applied to the slow start threshold and the window
should never be below ->min_cwnd (2*max_udp_payload_sz).

Same issue in both newReno and cubic algorithm. Furthermore in newReno, only the
threshold was decremented.

Must be backported to 2.6 and 2.7.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
9d68c6aaf6 BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.
Add two missing checks not to substract too big values from another too little
one. In this case the resulted wrapped huge values could be passed to the function
which has to remove the last range of a tree of ACK ranges as encoded limit size
not to go below, cancelling the ACK ranges deletion. The consequence could be that
no ACK were sent.

Must be backported to 2.6 and 2.7.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
45bf1a82f1 BUG/MEDIUM: quic: Code sanitization about acknowledgements requirements
qc_may_build_pkt() has been modified several times regardless of the conditions
the functions it is supposed to allow to send packets (qc_build_pkt()/qc_do_build_pkt())
really use to finally send packets just after having received others, leading
to contraditions and possible very long loops sending empty packets (PADDING only packets)
because qc_may_build_pkt() could allow qc_build_pkt()/qc_do_build_pkt to build packet,
and the latter did nothing except sending PADDING frames, because from its point
of view they had nothing to send.

For now on, this is the job of qc_may_build_pkt() to decide to if there is
packets to send just after having received others AND to provide this information
to the qc_build_pkt()/qc_do_build_pkt()

Note that the unique case where the acknowledgements are completely ignored is
when the endpoint must probe. But at least this is when sending at most two datagrams!

This commit also fixes the issue reported by Willy about a very low throughput
performance when the client serialized its requests.

Must be backported to 2.7 and 2.6.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
eb3e5171ed MINOR: quic: Add connection flags to traces
This should help in diagnosing issues.

Some adjustments have to be done to avoid deferencing a quic_conn objects from
TRACE_*() calls.

Must be backported to 2.7 and 2.6.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
809bd9fed1 BUG/MINOR: quic: Ignored less than 1ms RTTs
Do not ignore very short RTTs (less than 1ms) before computing the smoothed
RTT initializing it to an "infinite" value (UINT_MAX).

Must be backported to 2.7 and 2.6.
2023-04-13 19:20:08 +02:00
Frédéric Lécaille
fad0e6cf73 MINOR: quic: Add packet loss and maximum cc window to "show quic"
Add the number of packet losts and the maximum congestion control window computed
by the algorithms to "show quic".
Same thing for the traces of existent congestion control algorithms.

Must be backported to 2.7 and 2.6.
2023-04-13 19:20:08 +02:00
Olivier Houchard
f98a8c317e BUG/MEDIUM: fd: don't wait for tmask to stabilize if we're not in it.
In fd_update_events(), we loop until there's no bit in the running_mask
that is not in the thread_mask. Problem is, the thread sets its
running_mask bit before that loop, and so if 2 threads do the same, and
a 3rd one just closes the FD and sets the thread_mask to 0, then
running_mask will always be non-zero, and we will loop forever. This is
trivial to reproduce when using a DNS resolver that will just answer
"port unreachable", but could theoretically happen with other types of
file descriptors too.

To fix that, just don't bother looping if we're no longer in the
thread_mask, if that happens we know we won't have to take care of the
FD, anyway.

This should be backported to 2.7, 2.6 and 2.5.
2023-04-13 18:04:46 +02:00
Willy Tarreau
a07635ead5 MINOR: bind-conf: support a new shards value: "by-group"
Setting "shards by-group" will create one shard per thread group. This
can often be a reasonable tradeoff between a single one that can be
suboptimal on CPUs with many cores, and too many that will eat a lot
of file descriptors. It was shown to provide good results on a 224
thread machine, with a distribution that was even smoother than the
system's since here it can take into account the number of connections
per thread in the group. Depending on how popular it becomes, it could
even become the default setting in a future version.
2023-04-13 17:38:31 +02:00
Willy Tarreau
d30e82b9f0 MINOR: receiver: reserve special values for "shards"
Instead of artificially setting the shards count to MAX_THREAD when
"by-thread" is used, let's reserve special values for symbolic names
so that we can add more in the future. For now we use value -1 for
"by-thread", which requires to turn the type to signed int but it was
already used as such everywhere anyway.
2023-04-13 17:12:50 +02:00
Amaury Denoyelle
53fc98c3bc MINOR: fd: implement fd_migrate_on() to migrate on a non-local thread
fd_migrate_on() can be used to migrate an existing FD to any thread, even
one belonging to a different group from the current one and from the
caller's. All that is needed is to make sure the FD is still valid when
the operation is performed (which is the case when such operations happen).

This is potentially slightly expensive since it locks the tgid during the
delicate operation, but it is normally performed only from an owning
thread to offer the FD to another one (e.g. reassign a better thread upon
accept()).
2023-04-13 16:57:51 +02:00
Willy Tarreau
97da942ba6 MINOR: thread: keep a bitmask of enabled groups in thread_set
We're only checking for 0, 1, or >1 groups enabled there, and we'll soon
need to be more precise and know quickly which groups are non-empty.
Let's just replace the count with a mask of enabled groups. This will
allow to quickly spot the presence of any such group in a set.
2023-04-13 16:57:51 +02:00
William Lallemand
3f210970bf BUG/MINOR: stick_table: alert when type len has incorrect characters
Alert when the len argument of a stick table type contains incorrect
characters.

Replace atol by strtol.

Could be backported in every maintained versions.
2023-04-13 14:46:08 +02:00
Willy Tarreau
28f2a590f6 MINOR: activity: add a line reporting the average CPU usage to "show activity"
It was missing from the output but is sometimes convenient to observe
and understand how incoming connections are distributed. The CPU usage
is reported as the instant measurement of 100-idle_pct for each thread,
and the average value is shown for the aggregated value.

This could be backported as it's helpful in certain troublehsooting
sessions.
2023-04-12 08:42:52 +02:00
Frédéric Lécaille
6fd2576d5e MINOR: quic: Add a trace for packet with an ACK frame
As the ACK frames are not added to the packet list of ack-eliciting frames,
it could not be traced. But there is a flag to identify such packet.
Let's use it to add this information to the traces of TX packets.

Must be backported to 2.6 and 2.7.
2023-04-11 10:47:19 +02:00
Frédéric Lécaille
e47adca432 MINOR: quic: Dump more information at proto level when building packets
This should be helpful to debug issues at without too much traces.

Must be backported to 2.7 and 2.6.
2023-04-11 10:47:19 +02:00
Frédéric Lécaille
c0aaa07aa3 MINOR: quic: Modify qc_try_rm_hp() traces
Dump at proto level the packet information when its header protection was removed.
Remove no more use qpkt_trace variable.

Must be backported to 2.7 and 2.6.
2023-04-11 10:47:19 +02:00
Frédéric Lécaille
68737316ea BUG/MINOR: quic: Wrong packet number space probing before confirmed handshake
It is possible that the handshake was not confirmed and there was no more
packet in flight to probe with. It this case the server must wait for
the client to be unblocked without probing any packet number space contrary
to what was revealed by interop tests as follows:

	[01|quic|2|uic_loss.c:65] TX loss pktns : qc@0x7fac301cd390 pktns=I pp=0
	[01|quic|2|uic_loss.c:67] TX loss pktns : qc@0x7fac301cd390 pktns=H pp=0 tole=-102ms
	[01|quic|2|uic_loss.c:67] TX loss pktns : qc@0x7fac301cd390 pktns=01RTT pp=0 if=1054 tole=-1987ms
	[01|quic|5|uic_loss.c:73] quic_loss_pktns(): leaving : qc@0x7fac301cd390
	[01|quic|5|uic_loss.c:91] quic_pto_pktns(): entering : qc@0x7fac301cd390
	[01|quic|3|ic_loss.c:121] TX PTO handshake not already completed : qc@0x7fac301cd390
	[01|quic|2|ic_loss.c:141] TX PTO : qc@0x7fac301cd390 pktns=I pp=0 dur=83ms
	[01|quic|5|ic_loss.c:142] quic_pto_pktns(): leaving : qc@0x7fac301cd390
	[01|quic|3|c_conn.c:5179] needs to probe Initial packet number space : qc@0x7fac301cd390

This bug was not visible before this commit:
     BUG/MINOR: quic: wake up MUX on probing only for 01RTT

This means that before it, one could do bad things (probing the 01RTT packet number
space before the handshake was confirmed).

Must be backported to 2.7 and 2.6.
2023-04-11 10:47:19 +02:00
Frédéric Lécaille
2513b1dd7b MINOR: quic: Trace fix in quic_pto_pktns() (handshaske status)
The handshake must be confirmed before probing the 01RTT packet number space.

Must be backported to 2.7 and 2.6.
2023-04-11 10:47:19 +02:00
Christopher Faulet
c202c740b5 BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR
When end-of-stream is reported by a H2 stream, we must take care to also
report an error is end-of-input was not reported. Indeed, it is now
mandatory to set SE_FL_EOI or SE_FL_ERROR flags when SE_FL_EOS is set.

It is a 2.8-specific issue. No backport needed.
2023-04-11 08:59:10 +02:00
Christopher Faulet
c393c9e388 BUG/MEDIUM: mux-h1: Report EOI when a TCP connection is upgraded to H2
When TCP connection is first upgrade to H1 then to H2, the stream-connector,
created by the PT mux, must be destroyed because the H2 mux cannot inherit
from it. When it is performed, the SE_FL_EOS flag is set but SE_FL_EOI must
also be set. It is now required to never set SE_FL_EOS without SE_FL_EOI or
SE_FL_ERROR.

It is a 2.8-specific issue. No backport needed.
2023-04-11 08:45:18 +02:00