7859 Commits

Author SHA1 Message Date
William Lallemand
0a0512f76d MINOR: mworker/cli: the mcli_reload bind_conf only send the reload status
Upon a reload with the master CLI, the FD of the master CLI session is
received by the internal socketpair listener.

This session is used to display the status of the reload and then will
close.
2022-09-24 16:35:23 +02:00
William Lallemand
56f73b21a5 MINOR: mworker: stores the mcli_reload bind_conf
Stores the mcli_reload bind_conf in order to identify it later.
2022-09-24 15:56:25 +02:00
William Lallemand
21623b5949 MINOR: mworker: mworker_cli_proxy_new_listener() returns a bind_conf
mworker_cli_proxy_new_listener() now returns a bind_conf * or NULL upon
failure.
2022-09-24 15:51:27 +02:00
Christopher Faulet
4558437211 CLEANUP: list: Fix mt_list_for_each_entry_safe indentation
It makes the macro easier to read.
2022-09-21 16:02:40 +02:00
Aurelien DARRAGON
60cffbaca5 MINOR: list: documenting mt_list_for_each_entry_safe() macro
- Adding some comments in mt_list_for_each_entry_safe() macro to make it
  somehow understandable.
  The macro is performing critical stuff but was not documented at all.
  Moreover, nested loops with conditional tricks are used,
  making it even harder to understand the steps performed in it.

- Updating mt_list_for_each_entry_safe usage example.

- Added a "FIXME:" comment in a specific condition that seems to
  never be reached even when deeply stress-testing mt_lists
  (using test_list binary provided in the repository).
2022-09-21 16:02:40 +02:00
Willy Tarreau
a700420671 MINOR: clock: split local and global date updates
Pollers that support busy polling spend a lot of time (and cause
contention) updating the global date when they're looping over themselves
while it serves no purpose: what's needed is only an update on the local
date to know when to stop looping.

This patch splits clock_pudate_date() into a pair of local and global
update functions, so that pollers can be easily improved.
2022-09-21 09:06:28 +02:00
Aurelien DARRAGON
ae1e14d65b CLEANUP: tools: removing escape_chunk() function
Func is not used anymore. See e3bde807d.
2022-09-20 16:25:30 +02:00
Aurelien DARRAGON
c5bff8e550 BUG/MINOR: log: improper behavior when escaping log data
Patrick Hemmer reported an improper log behavior when using
log-format to escape log data (+E option):
Some bytes were truncated from the output:

- escape_string() function now takes an extra parameter that
  allow the caller to specify input string stop pointer in
  case the input string is not guaranteed to be zero-terminated.
- Minors checks were added into lf_text_len() to make sure dst
  string will not overflow.
- lf_text_len() now makes proper use of escape_string() function.

This should be backported as far as 1.8.
2022-09-20 16:25:30 +02:00
Amaury Denoyelle
0ed617ac2f BUG/MEDIUM: mux-quic: properly trim HTX buffer on snd_buf reset
MUX QUIC snd_buf operation whill return early if a qcs instance is
resetted. In this case, HTX is left untouched and the callback returns
the whole bufer size. This lead to an undefined behavior as the stream
layer is notified about a transfer but does not see its HTX buffer
emptied. In the end, the transfer may stall which will lead to a leak on
session.

To fix this, HTX buffer is now resetted when snd_buf is short-circuited.
This should fix the issue as now the stream layer can continue the
transfer until its completion.

This patch has already been tested by Tristan and is reported to solve
the github issue #1801.

This should be backported up to 2.6.
2022-09-20 15:35:33 +02:00
Amaury Denoyelle
9534e59bb9 MINOR: mux-quic: refactor snd_buf
Factorize common code between h3 and hq-interop snd_buf operation. This
is inserted in MUX QUIC snd_buf own callback.

The h3/hq-interop API has been adjusted to directly receive a HTX
message instead of a plain buf. This led to extracting part of MUX QUIC
snd_buf in qmux_http module.

This should be backported up to 2.6.
2022-09-20 15:35:29 +02:00
Amaury Denoyelle
d80fbcaca2 REORG: mux-quic: export HTTP related function in a dedicated file
Extract function dealing with HTX outside of MUX QUIC. For the moment,
only rcv_buf stream operation is concerned.

The main objective is to be able to support both TCP and HTTP proxy mode
with a common base and add specialized modules on top of it.

This should be backported up to 2.6.
2022-09-20 15:35:23 +02:00
Amaury Denoyelle
36d50bff22 REORG: mux-quic: extract traces in a dedicated source file
QUIC MUX implements several APIs to interface with stream, quic-conn and
app-ops layers. It is planified to better separate this roles, possibly
by using several files.

The first step is to extract QUIC MUX traces in a dedicated source
files. This will allow to reuse traces in multiple files.

The main objective is to be
able to support both TCP and HTTP proxy mode with a common base and add
specialized modules on top of it.

This should be backported up to 2.6.
2022-09-20 15:35:09 +02:00
Amaury Denoyelle
afb7b9d8e5 BUG/MEDIUM: mux-quic: fix nb_hreq decrement
nb_hreq is a counter on qcc for active HTTP requests. It is incremented
for each qcs where a full HTTP request was received. It is decremented
when the stream is closed locally :
- on HTTP response fully transmitted
- on stream reset

A bug will occur if a stream is resetted without having processed a full
HTTP request. nb_hreq will be decremented whereas it was not
incremented. This will lead to a crash when building with
DEBUG_STRICT=2. If BUG_ON_HOT are not active, nb_hreq counter will wrap
which may break the timeout logic for the connection.

This bug was triggered on haproxy.org. It can be reproduced by
simulating the reception of a STOP_SENDING frame instead of a STREAM one
by patching qc_handle_strm_frm() :

+       if (quic_stream_is_bidi(strm_frm->id))
+               qcc_recv_stop_sending(qc->qcc, strm_frm->id, 0);
+       //ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+       //               strm_frm->offset.key, strm_frm->fin,
+       //               (char *)strm_frm->data);

To fix this bug, a qcs is now flagged with a new QC_SF_HREQ_RECV. This
is set when the full HTTP request is received. When the stream is closed
locally, nb_hreq will be decremented only if this flag was set.

This must be backported up to 2.6.
2022-09-19 12:12:21 +02:00
Erwan Le Goas
b0c0501516 MINOR: config: add command-line -dC to dump the configuration file
This commit adds a new command line option -dC to dump the configuration
file. An optional key may be appended to -dC in order to produce an
anonymized dump using this key. The anonymizing process uses the same
algorithm as the CLI so that the same key will produce the same hashes
for the same identifiers. This way an admin may share an anonymized
extract of a configuration to match against live dumps. Note that key 0
will not anonymize the output. However, in any case, the configuration
is dumped after tokenizing, thus comments are lost.
2022-09-17 11:27:09 +02:00
Erwan Le Goas
54966dffda MINOR: anon: store the anonymizing key in the CLI's appctx
In order to allow users to dump internal states using a specific key
without changing the global one, we're introducing a key in the CLI's
appctx. This key is preloaded from the global one when "set anon on"
is used (and if none exists, a random one is assigned). And the key
can optionally be assigned manually for the whole CLI session.

A "show anon" command was also added to show the anon state, and the
current key if the users has sufficient permissions. In addition, a
"debug dev hash" command was added to test the feature.
2022-09-17 11:27:09 +02:00
Erwan Le Goas
fad9da83da MINOR: anon: store the anonymizing key in the global structure
Add a uint32_t key in global to hash words with it. A new CLI command
'set global-key <key>' was added to change the global anonymizing key.
The global may also be set in the configuration using the global
"anonkey" directive. For now this key is not used.
2022-09-17 11:24:53 +02:00
Erwan Le Goas
9c76637fff MINOR: anon: add new macros and functions to anonymize contents
These macros and functions will be used to anonymize strings by producing
a short hash. This will allow to match config elements against dump elements
without revealing the original data. This will later be used to anonymize
configuration parts and CLI commands output. For now only string, identifiers
and addresses are supported, but the model is easily extensible.
2022-09-17 11:24:53 +02:00
Amaury Denoyelle
8d4ac48d3d CLEANUP: mux-quic: remove stconn usage in h3/hq
Small cleanup on snd_buf for application protocol layer.
* do not export h3_snd_buf
* replace stconn by a qcs argument. This is better as h3/hq-interop only
  uses the qcs instance.

This should be backported up to 2.6.
2022-09-16 13:53:30 +02:00
Christopher Faulet
7c4b2ec09d MINOR: flags/mux-h1: decode H1C and H1S flags
The new functions h1c_show_flags() and h1s_show_flags() decode the flags
state into a string, and are used by dev/flags:

$ /dev/flags/flags h1c 0x2200
h1c->flags = H1C_F_ST_READY | H1C_F_ST_ATTACHED

./dev/flags/flags h1s 0x190
h1s->flags = H1S_F_BODYLESS_RESP | H1S_F_NOT_FIRST | H1S_F_WANT_KAL
2022-09-15 11:01:59 +02:00
Christopher Faulet
18ad15f5c4 REORG: mux-h1: extract flags and enums into mux_h1-t.h
The same was performed for the H2 multiplexer. H1C and H1S flags are moved
in a dedicated header file. It will be mainly used to be able to decode
mux-h1 flags from the flags utility.

In this patch, we only move the flags to mux_h1-t.h.
2022-09-15 11:01:59 +02:00
Amaury Denoyelle
f8aaf8bdfa BUG/MEDIUM: mux-quic: fix crash on early app-ops release
H3 SETTINGS emission has recently been delayed. The idea is to send it
with the first STREAM to reduce sendto syscall invocation. This was
implemented in the following patch :
  3dd79d378c86b3ebf60e029f518add5f1ed54815
  MINOR: h3: Send the h3 settings with others streams (requests)

This patch works fine under nominal conditions. However, it will cause a
crash if a HTTP/3 connection is released before having sent any data,
for example when receiving an invalid first request. In this case,
qc_release will first free qcc.app_ops HTTP/3 application protocol layer
via release callback. Then qc_send is called to emit any closing frames
built by app_ops release invocation. However, in qc_send, as no data has
been sent, it will try to complete application layer protocol
intialization, with a SETTINGS emission for HTTP/3. Thus, qcc.app_ops is
reused, which is invalid as it has been just freed. This will cause a
crash with h3_finalize in the call stack.

This bug can be reproduced artificially by generating incomplete HTTP/3
requests. This will in time trigger http-request timeout without any
data send. This is done by editing qc_handle_strm_frm function.

-       ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+       ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len - 1,
                       strm_frm->offset.key, strm_frm->fin,
                       (char *)strm_frm->data);

To fix this, application layer closing API has been adjusted to be done
in two-steps. A new shutdown callback is implemented : it is used by the
HTTP/3 layer to generate GOAWAY frame in qc_release prologue.
Application layer context qcc.app_ops is then freed later in qc_release
via the release operation which is now only used to liberate app layer
ressources. This fixes the problem as the intermediary qc_send
invocation will be able to reuse app_ops before it is freed.

This patch fixes the crash, but it would be better to adjust H3 SETTINGS
emission in case of early connection closing : in this case, there is no
need to send it. This should be implemented in a future patch.

This should fix the crash recently experienced by Tristan in github
issue #1801.

This must be backported up to 2.6.
2022-09-15 10:41:44 +02:00
William Lallemand
95fc737fc6 MEDIUM: quic: separate path for rx and tx with set_encryption_secrets
With quicTLS the set_encruption_secrets callback is always called with
the read_secret and the write_secret.

However this is not the case with libreSSL, which uses the
set_read_secret()/set_write_secret() mecanism. It still provides the
set_encryption_secrets() callback, which is called with a NULL
parameter for the write_secret during the read, and for the read_secret
during the write.

The exchange key was not designed in haproxy to be called separately for
read and write, so this patch allow calls with read or write key to
NULL.
2022-09-14 18:16:37 +02:00
William Lallemand
1c8f3b386d MINOR: httpclient: export httpclient_create_proxy()
Export httpclient_create_proxy() in http_client.h
2022-09-14 14:34:39 +02:00
William Lallemand
992ad62e3c MEDIUM: httpclient: allow to use another proxy
httpclient_new_from_proxy() is a variant of httpclient_new() which
allows to create the requests from a different proxy.

The proxy and its 2 servers are now stored in the httpclient structure.

The proxy must have been created with httpclient_create_proxy() to be
used.

The httpclient_postcheck() callback will finish the initialization of
all proxies created with PR_CAP_HTTPCLIENT.
2022-09-13 17:12:38 +02:00
William Lallemand
54aec5f678 MEDIUM: httpclient: httpclient_create_proxy() creates a proxy for httpclient
httpclient_create_proxy() is a function which creates a proxy that could
be used for the httpclient. It will allocate a proxy, a raw server and
an ssl server.

This patch moves most of the code from httpclient_precheck() into a
generic function httpclient_create_proxy().

The proxy will have the PR_CAP_HTTPCLIENT capability.

This could be used for specifics httpclient instances that needs
different proxy settings.
2022-09-13 17:12:38 +02:00
Emeric Brun
d6e581de4b BUG/MEDIUM: sink: bad init sequence on tcp sink from a ring.
The init of tcp sink, particularly for SSL, was done
too early in the code, during parsing, and this can cause
a crash specially if nbthread was not configured.

This was detected by William using ASAN on a new regtest
on log forward.

This patch adds the 'struct proxy' created for a sink
to a list and this list is now submitted to the same init
code than the main proxies list or the log_forward's proxies
list. Doing this, we are assured to use the right init sequence.
It also removes the ini code for ssl from post section parsing.

This patch should be backported as far as v2.2

Note: this fix uses 'goto' labels created by commit
'BUG/MAJOR: log-forward: Fix log-forward proxies not fully initialized'
but this code didn't exist before v2.3 so this patch needs to be
adapted for v2.2.
2022-09-13 17:03:30 +02:00
Willy Tarreau
439be5838d MINOR: flags/mux-h2: decode H2C and H2S flags
The new functions h2c_show_flags() and h2s_show_flags() decode the flags
state into a string, and are used by dev/flags:

  $ ./dev/flags/flags h2c 0x0600
  h2c->flags = H2_CF_DEM_IN_PROGRESS | H2_CF_DEM_SHORT_READ

  $ ./dev/flags/flags h2s 0x7003
  h2s->flags = H2_SF_HEADERS_RCVD | H2_SF_OUTGOING_DATA | H2_SF_HEADERS_SENT \
             | H2_SF_ES_SENT | H2_SF_ES_RCVD
2022-09-12 19:33:07 +02:00
Willy Tarreau
6c0fadfb7d REORG: mux-h2: extract flags and enums into mux_h2-t.h
Originally in 1.8 we wanted to have an independent mux that could possibly
be disabled and would not impose dependencies on the outside. Everything
would fit into a single C file and that was fine.

Nowadays muxes are unavoidable, and not being able to easily inspect them
from outside is sometimes a bit of a pain. In particular, the flags utility
still cannot be used to decode their flags.

As a first step towards this, this patch moves the flags and enums to
mux_h2-t.h, as well as the two state decoding inline functions. It also
dropped the H2_SS_*_BIT defines that nobody uses. The mux_h2.c file remains
the only one to include that for now.
2022-09-12 19:33:07 +02:00
Willy Tarreau
799e5410b4 MINOR: flags/fd: decode FD flags states
The new function is fd_show_flags() and it reports known FD flags:

  $ ./dev/flags/flags fd 0x000121
  fd->flags = FD_POLL_IN | FD_EV_READY_W | FD_EV_ACTIVE_R
2022-09-12 19:33:07 +02:00
Willy Tarreau
62bde43779 BUILD: flags: fix the fallback macros for missing stdio
The fallback macros for when stdio is not there didn't have the "..."
and were causing build issues on platforms with stricter dependencies
between includes.
2022-09-09 17:46:45 +02:00
Willy Tarreau
233c0a586d BUILD: flags: fix build warning in some macros used by show_flags
Some gcc versions seem to be upset by the use of enums as booleans,
so OK, let's cast all of them as uint, that's no big deal.
2022-09-09 17:36:27 +02:00
Aurelien DARRAGON
d46f437de6 MINOR: proxy/listener: support for additional PAUSED state
This patch is a prerequisite for #1626.
Adding PAUSED state to the list of available proxy states.
The flag is set when the proxy is paused at runtime (pause_listener()).
It is cleared when the proxy is resumed (resume_listener()).

It should be backported to 2.6, 2.5 and 2.4
2022-09-09 17:23:01 +02:00
Aurelien DARRAGON
001328873c MINOR: listener: small API change
A minor API change was performed in listener(.c/.h) to restore consistency
between stop_listener() and (resume/pause)_listener() functions.

LISTENER_LOCK was never locked prior to calling stop_listener():
lli variable hint is thus not useful anymore.

Added PROXY_LOCK locking in (resume/pause)_listener() functions
with related lpx variable hint (prerequisite for #1626).

It should be backported to 2.6, 2.5 and 2.4
2022-09-09 17:23:01 +02:00
Willy Tarreau
6edae6ff48 MINOR: flags/http_ana: use flag dumping to show http msg states
The function is hmsg_show_flags(). It shows the HTTP_MSGF_* flags.
2022-09-09 17:18:57 +02:00
Willy Tarreau
5349779e40 MINOR: flags/htx: use flag dumping to show htx and start-line flags
The function are respectively htx_show_flags() and hsl_show_flags().
2022-09-09 16:59:29 +02:00
Willy Tarreau
e2afad0af4 MINOR: flags/http_ana: use flag dumping for txn flags
The new function is txn_show_flags(). It dumps the TXN flags
as well as the client and server cookie types.
2022-09-09 16:52:09 +02:00
Willy Tarreau
92a2d3c02b MINOR: flags/task: use flag dumping for task state
The new function is task_show_state().
2022-09-09 16:52:09 +02:00
Willy Tarreau
e9d1283cc5 MINOR: flags/stream: use flag dumping for stream flags
The new function is strm_show_flags(). It dumps the stream flags
as well as the err type under SF_ERR_MASK and the final state under
SF_FINST_MASK.
2022-09-09 16:52:09 +02:00
Willy Tarreau
f4cb98ce56 MINOR: flags/stream: use flag dumping for stream error type
The new function is strm_et_show_flags(). Only the error type is
handled at the moment, as a bit more complex logic is needed to
mix the values and enums present in some fields.
2022-09-09 16:52:09 +02:00
Willy Tarreau
4bab7d81b6 MINOR: flags/stconn: use flag dumping for stconn and sedesc flags
The two new functions are se_show_flags() and sc_show_flags().
Maybe something could be done for SC_ST_* values but as it's a
small enum, a simple switch/case should work fine.
2022-09-09 16:52:08 +02:00
Willy Tarreau
9d9e101689 MINOR: flags/connection: use flag dumping for connection flags
The new function is conn_show_flags(), it only deals with flags. Nothing
is planned for connection error types at the moment.
2022-09-09 16:15:10 +02:00
Willy Tarreau
cdc9ddc8cf MINOR: flags/channel: use flag dumping for channel flags and analysers
The two new functions are chn_show_analysers() and chn_show_flags().
They work on an existing buffer so one was declared in flags.c for this
purpose. File flags.c does not have to know about channel flags anymore.
2022-09-09 16:15:10 +02:00
Willy Tarreau
7a955b5d73 MINOR: flags: implement a macro used to dump enums inside masks
Some of our flags have enums inside a mask. The new macro __APPEND_ENUM
is able to deal with that by comparing the flag's value against an exact
one under the mask. One needs to take care of eliminating the zero value
though, otherwise delimiters will not always be properly placed (e.g. if
some flags were dumped before and what remains is exactly zero). The
bits of the mask are cleared only upon exact matches.
2022-09-09 16:15:10 +02:00
Willy Tarreau
77acaf5af5 MINOR: flags: add a new file to host flag dumping macros
The "flags" utility is useful but painful to maintain up to date. This
commit aims at providing a low-maintenance solution to keep flags up to
date, by proposing some macros that build a string from a set of flags
in a way that requires the least possible verbosity.

The idea will be to add an inline function dedicated to this just after
the flags declaration, and enumerate the flags one is interested in, and
that function will fill a string based on them.

Placing this inside the type files allows both haproxy and external tools
like "flags" to use it, but comes with a few constraints. First, the
files will be slightly less readable if these functions are huge, so they
need to stay as compact as possible. Second, the function will need
anprintf() and we don't want to include stdio.h in type files as it
proved to be particularly heavy and to cause definition headaches in
the past.

As such the file here only contains a macro enclosed in #ifdef EOF (that
is defined in stdio), and provides an alternate empty one when no stdio
is defined. This way it's the caller that has to include stdio first or
it won't get anything back, and in practice the locations relying on
this always have it.

The macro has to be used in 3 steps:
  - prologue: dumps 0 and exits if the value is zero
  - flags: the macro can be recursively called and it will push the
    flag from bottom to top so that they appear in the same order as
    today without requiring to be declared the other way around
  - epilogue: dump remaining flags that were not identified

The macro was arranged so that a single character can be used with no
other argument to declare all flags at once. Example:

  #define _(n, ...) __APPEND_FLAG(buf, len, del, flg, n, #n, __VA_ARGS__)
     _(0);
     _(X_FLAG1, _(X_FLAG2, _(X_FLAG3, _(X_FLAG4))));
     _(~0);
  #undef _

Existing files will have to be updated to rely on it, and more files
could come soon.
2022-09-09 14:47:31 +02:00
Frédéric Lécaille
3dd79d378c MINOR: h3: Send the h3 settings with others streams (requests)
This is the ->finalize application callback which prepares the unidirectional STREAM
frames for h3 settings and wakeup the mux I/O handler to send them. As haproxy is
at the same time always waiting for the client request, this makes haproxy
call sendto() to send only about 20 bytes of stream data. Furthermore in case
of heavy loss, this give less chances to short h3 requests to succeed.

Drawback: as at this time the mux sends its streams by their IDs ascending order
the stream 0 is always embedded before the unidirectional stream 3 for h3 settings.
Nevertheless, as these settings may be lost and received after other h3 request
streams, this is permitted by the RFC.

Perhaps there is a better way to do. This will have to be checked with Amaury.

Must be backported to 2.6.
2022-09-08 18:04:58 +02:00
Frédéric Lécaille
bb995eafc7 BUG/MINOR: quic: Speed up the handshake completion only one time
It is possible to speed up the handshake completion but only one time
by connection as mentionned in RFC 9002 "6.2.3. Speeding up Handshake Completion".
Add a flag to prevent this process to be run several times
(see https://www.rfc-editor.org/rfc/rfc9002#name-speeding-up-handshake-compl).

Must be backported to 2.6.
2022-09-08 18:04:58 +02:00
Willy Tarreau
3d4cdb198c MEDIUM: tasks/activity: combine the called function with the caller
Now instead of getting aggregate stats per called function, we have
them per function AND per call place. The "byaddr" sort considers
the function pointer first, then the call count, so that dominant
callers of a given callee are instantly spotted. This allows to get
sorted outputs like this:

Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  h1_io_cb                   17357952   40.91s    2.357us   4.849m    16.76us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  sc_conn_io_cb              10357182   6.297s    607.0ns   27.93m    161.8us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  process_stream              9891131   1.809m    10.97us   53.61m    325.2us <- sc_notify@src/stconn.c:1209 task_wakeup
  process_stream              9823934   1.887m    11.52us   48.31m    295.1us <- stream_new@src/stream.c:563 task_wakeup
  sc_conn_io_cb               9347863   16.59s    1.774us   6.143m    39.43us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  h1_io_cb                     501344   1.848s    3.686us   6.544m    783.2us <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  sc_conn_io_cb                239717   492.3ms   2.053us   3.213m    804.3us <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  h2_io_cb                     173019   4.204s    24.30us   40.95s    236.7us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                     149487   424.3ms   2.838us   14.63s    97.87us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  other                        101893   4.626s    45.40us   14.84s    145.7us
  quic_lstnr_dghdlr             94389   614.0ms   6.504us   30.54s    323.6us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  quic_conn_app_io_cb           92205   3.735s    40.51us   390.9ms   4.239us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                      50355   19.01s    377.5us   10.65s    211.4us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  h1_io_cb                      44427   155.0ms   3.489us   21.50s    484.0us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  qc_io_cb                       9018   4.924s    546.0us   3.084s    342.0us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  h1_timeout_task                3236   1.172ms   362.0ns   1.119s    345.9us <- h1_release@src/mux_h1.c:1087 task_wakeup
  h1_io_cb                       2804   7.974ms   2.843us   1.980s    706.0us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  sc_conn_io_cb                  2804   33.44ms   11.92us   2.597s    926.2us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  qc_io_cb                       2623   2.669s    1.017ms   1.347s    513.5us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  qc_process_timer                662   526.4us   795.0ns   1.081s    1.633ms <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_conn_app_io_cb             648   12.62ms   19.47us   225.7ms   348.2us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  accept_queue_process            286   1.571ms   5.494us   72.55ms   253.7us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  process_resolvers               176   157.8us   896.0ns   7.835ms   44.52us <- wake_expired_tasks@src/task.c:429 task_drop_running
  qc_io_cb                        167   10.71ms   64.12us   32.47ms   194.4us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  sc_conn_io_cb                   123   80.05us   650.0ns   50.35ms   409.4us <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  h2_timeout_task                  32   30.69us   958.0ns   9.038ms   282.4us <- h2_release@src/mux_h2.c:1191 task_wakeup
  task_run_applet                  24   33.79ms   1.408ms   5.838ms   243.3us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  accept_queue_process             17   56.34us   3.314us   7.505ms   441.5us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  srv_cleanup_toremove_conns       16   1.133ms   70.81us   5.685ms   355.3us <- srv_cleanup_idle_conns@src/server.c:5948 task_wakeup
  srv_cleanup_idle_conns           16   74.57us   4.660us   2.797ms   174.8us <- wake_expired_tasks@src/task.c:429 task_drop_running
  quic_conn_app_io_cb              12   786.9us   65.58us   2.042ms   170.1us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup
  sc_conn_io_cb                     9   20.55us   2.283us   2.475ms   275.0us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h2_io_cb                          8   34.12us   4.265us   1.784ms   223.0us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  task_run_applet                   4   6.615ms   1.654ms   2.306us   576.0ns <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  quic_conn_io_cb                   4   4.278ms   1.069ms   6.469us   1.617us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  qc_io_cb                          2   20.81us   10.40us   4.943us   2.471us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  quic_conn_app_io_cb               2   752.9us   376.4us   63.97us   31.99us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  quic_accept_run                   2   13.84us   6.920us   172.8us   86.42us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
  qc_idle_timer_task                2   295.0us   147.5us   8.761us   4.380us <- wake_expired_tasks@src/task.c:344 task_wakeup
  qc_io_cb                          1   867.1us   867.1us   812.8us   812.8us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup

... and calls sorted by address like this:

Tasks activity:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet                  23   32.73ms   1.423ms   5.837ms   253.8us <- sc_applet_create@src/stconn.c:489 appctx_wakeup
  task_run_applet                   4   6.615ms   1.654ms   2.306us   576.0ns <- sc_app_chk_snd_applet@src/stconn.c:996 appctx_wakeup
  accept_queue_process            285   1.566ms   5.495us   72.49ms   254.3us <- listener_accept@src/listener.c:1099 tasklet_wakeup
  accept_queue_process             17   56.34us   3.314us   7.505ms   441.5us <- accept_queue_process@src/listener.c:165 tasklet_wakeup
  sc_conn_io_cb              10357182   6.297s    607.0ns   27.93m    161.8us <- sc_app_chk_rcv_conn@src/stconn.c:762 tasklet_wakeup
  sc_conn_io_cb               9347863   16.59s    1.774us   6.143m    39.43us <- h1_wake_stream_for_recv@src/mux_h1.c:2600 tasklet_wakeup
  sc_conn_io_cb                239717   492.3ms   2.053us   3.213m    804.3us <- qcs_notify_send@src/mux_quic.c:529 tasklet_wakeup
  sc_conn_io_cb                  2804   33.44ms   11.92us   2.597s    926.2us <- h1_wake_stream_for_send@src/mux_h1.c:2610 tasklet_wakeup
  sc_conn_io_cb                   123   80.05us   650.0ns   50.35ms   409.4us <- qcs_notify_recv@src/mux_quic.c:519 tasklet_wakeup
  sc_conn_io_cb                     9   20.55us   2.283us   2.475ms   275.0us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  process_resolvers               159   145.9us   917.0ns   7.823ms   49.20us <- wake_expired_tasks@src/task.c:429 task_drop_running
  srv_cleanup_idle_conns           16   74.57us   4.660us   2.797ms   174.8us <- wake_expired_tasks@src/task.c:429 task_drop_running
  srv_cleanup_toremove_conns       16   1.133ms   70.81us   5.685ms   355.3us <- srv_cleanup_idle_conns@src/server.c:5948 task_wakeup
  process_stream              9891130   1.809m    10.97us   53.61m    325.2us <- sc_notify@src/stconn.c:1209 task_wakeup
  process_stream              9823933   1.887m    11.52us   48.31m    295.1us <- stream_new@src/stream.c:563 task_wakeup
  h1_io_cb                   17357952   40.91s    2.357us   4.849m    16.76us <- sock_conn_iocb@src/sock.c:869 tasklet_wakeup
  h1_io_cb                     501344   1.848s    3.686us   6.544m    783.2us <- conn_subscribe@src/connection.c:732 tasklet_wakeup
  h1_io_cb                      44427   155.0ms   3.489us   21.50s    484.0us <- h1_takeover@src/mux_h1.c:4085 tasklet_wakeup
  h1_io_cb                       2804   7.974ms   2.843us   1.980s    706.0us <- sock_conn_iocb@src/sock.c:849 tasklet_wakeup
  h1_timeout_task                3236   1.172ms   362.0ns   1.119s    345.9us <- h1_release@src/mux_h1.c:1087 task_wakeup
  h2_timeout_task                  32   30.69us   958.0ns   9.038ms   282.4us <- h2_release@src/mux_h2.c:1191 task_wakeup
  h2_io_cb                     173019   4.204s    24.30us   40.95s    236.7us <- h2_snd_buf@src/mux_h2.c:6712 tasklet_wakeup
  h2_io_cb                     149487   424.3ms   2.838us   14.63s    97.87us <- h2c_restart_reading@src/mux_h2.c:856 tasklet_wakeup
  h2_io_cb                          8   34.12us   4.265us   1.784ms   223.0us <- h2_do_shutw@src/mux_h2.c:4656 tasklet_wakeup
  qc_io_cb                      50355   19.01s    377.5us   10.65s    211.4us <- qc_treat_acked_tx_frm@src/xprt_quic.c:1695 tasklet_wakeup
  qc_io_cb                       9018   4.924s    546.0us   3.084s    342.0us <- qc_stream_desc_ack@src/quic_stream.c:128 tasklet_wakeup
  qc_io_cb                       2623   2.669s    1.017ms   1.347s    513.5us <- h3_snd_buf@src/h3.c:1084 tasklet_wakeup
  qc_io_cb                        167   10.71ms   64.12us   32.47ms   194.4us <- qc_process_timer@src/xprt_quic.c:4602 tasklet_wakeup
  qc_io_cb                          2   20.81us   10.40us   4.943us   2.471us <- qc_init@src/mux_quic.c:2057 tasklet_wakeup
  qc_io_cb                          1   867.1us   867.1us   812.8us   812.8us <- qcs_consume@src/mux_quic.c:800 tasklet_wakeup
  qc_idle_timer_task                2   295.0us   147.5us   8.761us   4.380us <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_conn_io_cb                   4   4.278ms   1.069ms   6.469us   1.617us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb           92205   3.735s    40.51us   390.9ms   4.239us <- qc_lstnr_pkt_rcv@src/xprt_quic.c:6184 tasklet_wakeup_after
  quic_conn_app_io_cb             648   12.62ms   19.47us   225.7ms   348.2us <- qc_process_timer@src/xprt_quic.c:4635 tasklet_wakeup
  quic_conn_app_io_cb              12   786.9us   65.58us   2.042ms   170.1us <- qc_process_timer@src/xprt_quic.c:4589 tasklet_wakeup
  quic_conn_app_io_cb               2   752.9us   376.4us   63.97us   31.99us <- qc_xprt_start@src/xprt_quic.c:7122 tasklet_wakeup
  quic_lstnr_dghdlr             94389   614.0ms   6.504us   30.54s    323.6us <- quic_lstnr_dgram_dispatch@src/quic_sock.c:255 tasklet_wakeup
  qc_process_timer                662   526.4us   795.0ns   1.081s    1.633ms <- wake_expired_tasks@src/task.c:344 task_wakeup
  quic_accept_run                   2   13.84us   6.920us   172.8us   86.42us <- quic_accept_push_qc@src/quic_sock.c:458 tasklet_wakeup
  other                        101892   4.626s    45.40us   14.84s    145.7us

It already becomes visible that some tasks have different very costs
depending where they're called (e.g. process_stream). The method used
to wake them up is also shown. Applets are handled specially and shown
as appctx_wakeup.
2022-09-08 16:21:22 +02:00
Willy Tarreau
a3423873fe CLEANUP: activity: make the number of sched activity entries more configurable
This removes all the hard-coded 8-bit and 256 entries to use a pair of
macros instead so that we can more easily experiment with larger table
sizes if needed.
2022-09-08 14:55:09 +02:00
Willy Tarreau
e0e6d81460 CLEANUP: task: move tid and wake_date into the common part
There used to be one tid for tasklets and a thread_mask for tasks. Since
2.7, both tasks and tasklets now use a tid (albeit with a very slight
semantic difference for the negative value), to in order to limit code
duplication and to ease debugging it makes sense to move tid into the
common part. One limitation is that it will leave a hole in the structure,
but we now have the wake_date that is always present and can move there as
well to plug the hole.

This results in something overall pretty clean (and cleaner than before),
with the low-level stuff (state,tid,process,context) appearing first, then
the caller stuff (caller,wake_date,calls,debug) next, and finally the
type-specific stuff (rq/wq/expire/nice).
2022-09-08 14:30:38 +02:00
Willy Tarreau
2830d282e5 DEBUG: task: simplify the caller recording in DEBUG_TASK
Instead of storing an index that's swapped at every call, let's use the
two pointers as a shifting history. Now we have a permanent "caller"
field that records the last caller, and an optional prev_caller in the
debug section enabled by DEBUG_TASK that keeps a copy of the previous
caller one. This way, not only it's much easier to follow what's
happening during debugging, but it saves 8 bytes in the struct task in
debug mode and still keeps it under 2 cache lines in nominal mode, and
this will finally be usable everywhere and later in profiling.

The caller_idx was also used as a hint that the entry was freed, in order
to detect wakeup-after-free. This was changed by setting caller to -1
instead and preserving its value in caller[1].

Finally, the operations were made atomic. That's not critical but since
it's used for debugging and race conditions represent a significant part
of the issues in multi-threaded mode, it seems wise to at least eliminate
some possible factors of faulty analysis.
2022-09-08 14:30:38 +02:00