Commit Graph

6329 Commits

Author SHA1 Message Date
Willy Tarreau
e7866b1ff7 MEDIUM: stconn: always rely on CF_SHUTR in addition to cs_rx_blocked()
One flag (RXBLK_SHUT) is always set with CF_SHUTR, so in order to remove
it, we first need to make sure we always check for CF_SHUTR where
cs_rx_blocked() is being used.
2022-05-27 19:33:35 +02:00
Willy Tarreau
516621bbe6 MINOR: stconn: remove calls to cs_done_get()
It was only called after setting SHUTW on the output channel, and since
it's now handled by sc_is_send_allowed() we don't need it anymore.
2022-05-27 19:33:34 +02:00
Willy Tarreau
a1547ce0a0 MINOR: stconn: consider CF_SHUTW for sc_is_send_allowed()
When a shutdown(WR) is performed, send is no longer allowed, and that is
currently handled by the explicit cs_done_get() in the various shutw()
calls. That's a bit ugly and complicated for no reason, let's simply
integrate the test of SHUTW in sc_is_send_allowed().

Note that the test could also be added wherever sc_is_send_allowed() is
used but for now proceeding like this limits the changes.
2022-05-27 19:33:34 +02:00
Willy Tarreau
902ba7e2bc CLEANUP: stconn: use a single function to know if SC may send to SE
sc_is_send_allowed() is now used everywhere instead of the combination
of cs_tx_endp_ready() && !cs_tx_blocked(). There's no place where we
need them individually thus it's simpler. The test was placed in cs_util
as we'll complete it later.
2022-05-27 19:33:34 +02:00
Willy Tarreau
6001c9217c CLEANUP: stconn: make a few functions take a const argument
A number of functions in cs_utils.h are not usable from functions taking
a const because they're not declared as using const, despite never
modifying the stconn. Let's address this for the following ones:

  sc_ic(), sc_oc(), sc_ib(), sc_ob(), sc_strm_task(),
  cs_opposite(), sc_conn_ready(), cs_src(), cs_dst(),
2022-05-27 19:33:34 +02:00
Willy Tarreau
967955b156 CLEANUP: stconn: rename cs_ep_set_error() to se_fl_set_error()
First it applies to the stream endpoint and not the conn_stream, and
second it only tests and touches the flags so it makes sense to call
it se_fl_ like other functions which only manipulate the flags, as
it's just a special case of flags.
2022-05-27 19:33:34 +02:00
Willy Tarreau
108423819c CLEANUP: stconn: rename cs_conn_get_first() to conn_get_first_sc()
It returns an stconn from a connection and not the opposite, so the name
change was more appropriate. In addition it was moved to connection.h
which manipulates the connection stuff, and it happens that only
connection.c uses it.
2022-05-27 19:33:34 +02:00
Willy Tarreau
462b989d4c CLEANUP: stconn: rename cs_conn_*() to sc_conn_*()
The following functions which act on a connection-based stream connector
were renamed to sc_conn_* (~60 places):

  cs_conn_drain_and_shut
  cs_conn_process
  cs_conn_read0
  cs_conn_ready
  cs_conn_recv
  cs_conn_send
  cs_conn_shut
  cs_conn_shutr
  cs_conn_shutw
2022-05-27 19:33:34 +02:00
Willy Tarreau
f8d0ab54ec CLEANUP: stconn: rename cs_get_data_name() to sc_get_data_name()
Only used twice to dump stream debug info.
2022-05-27 19:33:34 +02:00
Willy Tarreau
fa57cc7b20 CLEANUP: stconn: rename __cs_endp_target() to __sc_endp()
The function returns the real stream endpoint so since there's no more
confusion around the terminology, let's drop "target".
2022-05-27 19:33:34 +02:00
Willy Tarreau
8e7c6e6907 CLEANUP: stconn: rename cs_appctx() to sc_appctx()
Nothing special, just s/cs/sc/, roughly 50-60 entries.
2022-05-27 19:33:34 +02:00
Willy Tarreau
417a31bb55 CLEANUP: stconn: rename cs_conn_mux() to sc_mux_ops()
This effectively returns the mux_ops from the connection when it exists
on an stconn.
2022-05-27 19:33:34 +02:00
Willy Tarreau
6fe2b42e45 CLEANUP: stconn: rename cs_mux() to sc_mux_strm()
The function doesn't return a pointer to the mux but to the mux stream
(h1s, h2s etc). Let's adjust its name to reflect this. It's rarely used,
the name can be enlarged a bit. And of course s/cs/sc to accommodate for
the updated name.
2022-05-27 19:33:34 +02:00
Willy Tarreau
fd9417ba3f CLEANUP: stconn: rename cs_conn() to sc_conn()
It's mostly used from upper layers. Both the checked and unchecked
functions were updated, or ~150 entries.
2022-05-27 19:33:34 +02:00
Willy Tarreau
ea27f48c5a CLEANUP: stconn: rename cs_{check,strm,strm_task} to sc_strm_*
These functions return the app-layer associated with an stconn, which
is a check, a stream or a stream's task. They're used a lot to access
channels, flags and for waking up tasks. Let's just name them
appropriately for the stream connector.
2022-05-27 19:33:34 +02:00
Willy Tarreau
40a9c32e3a CLEANUP: stconn: rename cs_{i,o}{b,c} to sc_{i,o}{b,c}
We're starting to propagate the stream connector's new name through the
API. Most call places of these functions that retrieve the channel or its
buffer are in applets. The local variable names are not changed in order
to keep the changes small and reviewable. There were ~92 uses of cs_ic(),
~96 of cs_oc() (due to co_get*() being less factorizable than ci_put*),
and ~5 accesses to the buffer itself.
2022-05-27 19:33:34 +02:00
Willy Tarreau
15c25d5e1d MINOR: applet: add new wrappers to put chk/blk/str/chr to channel from appctx
The vast majority of calls to ci_putchk() etc are performed from applets
which directly know an endpoint. Figuring the correct API (writing into
input channel etc) isn't trivial for newcomers, and knowing that they
must mark the flag indicating a buffer full condition isn't trivial
either.

Here we're adding wrappers to these functions but to be used directly
from the appctx. That's already what is being done in multiple steps in
the applet code, where the endp is derived from the appctx, then the cs
from the endp, then the stream from the cs, then the channel from the
stream, and so on. But this time the function doesn't require to know
much of the internals, applet_putchr() writes a char from the appctx,
and marks the buffer full if needed. Period. This will allow to remove
a significant amount of obscure ci_putchk() and cs_ic() calls from the
code, hence a significant number of possible mistakes.
2022-05-27 19:33:34 +02:00
Willy Tarreau
2f2318df87 MEDIUM: stconn: merge the app_ops and the data_cb fields
For historical reasons (stream-interface and connections), we used to
require two independent fields for the application level callbacks and
the transport-level functions. Over time the distinction faded away so
much that the low-level functions became specific to the application
and conversely. For example, applets may only work with streams on top
since they rely on the channels, and the stream-level functions differ
between applets and connections. Right now the application level only
contains a wake() callback and the low-level ones contain the functions
that act at the lower level to perform the shutr/shutw and at the upper
level to notify about readability and writability. Let's just merge them
together into a single set and get rid of this confusing distinction.
Note that the check ops do not define any app-level function since these
are only called by streams.
2022-05-27 19:33:34 +02:00
Willy Tarreau
c086960a03 MINOR: conn_stream: test the various ops functions before calling them
We currently call all ->shutr, ->chk_snd etc from ->ops unconditionally,
while the ->wake() call from data_cb is checked. Better check ops as
well for consistency, this will help get them merged.
2022-05-27 19:33:34 +02:00
Willy Tarreau
f3ae34b67d MINOR: check: export wake_srv_chk()
We'll need it to centralize the stream connectors definitions.
2022-05-27 19:33:34 +02:00
Willy Tarreau
026e8fb290 CLEANUP: stconn: tree-wide rename stconn states CS_ST/SB_* to SC_ST/SB_*
This also follows the natural naming. There are roughly 238 changes, all
totally trivial. conn_stream-t.h has become completely void of any
"conn_stream" related stuff now (except its name).
2022-05-27 19:33:34 +02:00
Willy Tarreau
cb04166525 CLEANUP: stconn: tree-wide rename stream connector flags CS_FL_* to SC_FL_*
This follows the natural naming. There are roughly 100 changes, all
totally trivial.
2022-05-27 19:33:34 +02:00
Willy Tarreau
7cb9e6c6ba CLEANUP: stream: rename "csf" and "csb" to "scf" and "scb"
These are the stream connectors, let's give them consistent names. The
patch is large (405 locations) but totally trivial.
2022-05-27 19:33:34 +02:00
Willy Tarreau
c105492bf5 CLEANUP: stdesc: rename the stream connector ->cs field to ->sc
This is a rename of this field. Most of the places were in muxes, but
were already factored with the previous series adding *_sc().
2022-05-27 19:33:34 +02:00
Willy Tarreau
4596fe20d9 CLEANUP: conn_stream: tree-wide rename to stconn (stream connector)
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
2022-05-27 19:33:34 +02:00
Willy Tarreau
3a3f480d15 CLEANUP: conn_stream: rename cs_app_* to sc_app_*
Let's start to introduce the stream connector at the app_ops level.
This is entirely self-contained into conn_stream.c. The functions
were also updated to reflect the new name, and the comments were
updated.
2022-05-27 19:33:34 +02:00
Willy Tarreau
798465b02c CLEANUP: conn_stream: rename the conn_stream's endp to sedesc
Just like for the appctx, this is a pointer to a stream endpoint descriptor,
so let's make this explicit and not confuse it with the full endpoint. There
are very few changes thanks to the preliminary refactoring of the flags
manipulation.
2022-05-27 19:33:34 +02:00
Willy Tarreau
d869e13ed8 CLEANUP: applet: rename the sedesc pointer from "endp" to "sedesc"
Now at least it makes it obvious that it's the stream endpoint descriptor
and not an endpoint. There were few changes thanks to the previous refactor
of the flags.
2022-05-27 19:33:34 +02:00
Willy Tarreau
ea59b0201c CLEANUP: conn_stream: rename cs_endpoint to sedesc (stream endpoint descriptor)
After some discussion we found that the cs_endpoint was precisely the
descriptor for a stream endpoint, hence the naturally coming name,
stream endpoint constructor.

This patch renames only the type everywhere and the new/init/free functions
to remain consistent with it. Future patches will address field names and
argument names in various code areas.
2022-05-27 19:33:34 +02:00
Willy Tarreau
65d0597b2b CLEANUP: conn_stream: rename the cs_endpoint's target to "se"
That's the "stream endpoint" pointer. Let's change it now while it's
not much spread. The function __cs_endp_target() wasn't yet renamed
because that will change more globally soon.
2022-05-27 19:33:34 +02:00
Willy Tarreau
b605c4213f CLEANUP: conn_stream: rename the stream endpoint flags CS_EP_* to SE_FL_*
Let's now use the new flag names for the stream endpoint.
2022-05-27 19:33:34 +02:00
Willy Tarreau
d56377c5eb CLEANUP: conn_stream: apply endp_flags.cocci tree-wide
This changes all main uses of endp->flags to the se_fl_*() equivalent
by applying coccinelle script endp_flags.cocci. The se_fl_*() functions
themselves were manually excluded from the change, of course.

Note: 144 locations were touched, manually reviewed and found to be OK.

The script was applied with all includes:

  spatch --in-place --recursive-includes -I include --sp-file $script $files
2022-05-27 19:33:34 +02:00
Willy Tarreau
0cfcc40812 CLEANUP: conn_stream: apply cs_endp_flags.cocci tree-wide
This changes all main uses of cs->endp->flags to the sc_ep_*() equivalent
by applying coccinelle script cs_endp_flags.cocci.

Note: 143 locations were touched, manually reviewed and found to be OK,
except a single one that was adjusted in cs_reset_endp() where the flags
are read and filtered to be used as-is and not as a boolean, hence was
replaced with sc_ep_get() & $FLAGS.

The script was applied with all includes:

  spatch --in-place --recursive-includes -I include --sp-file $script $files
2022-05-27 19:33:34 +02:00
Willy Tarreau
cd1d585e53 MINOR: conn_stream: add new sets of functions to set/get endpoint flags
At plenty of places we need to manipulate the conn_stream's endpoint just
to set or clear a flag. This patch adds a handful of functions to perform
the common operations (clr/set/get etc) on these flags at both the endpoint
and at the conn_stream level.

The functions were named after the target names, i.e. se_fl_*() to act on
the stream endpoint flags, and sc_ep_* to manipulate the endpoint flags
from the stream connector (currently conn_stream).

For now they're not used.
2022-05-27 19:33:34 +02:00
Willy Tarreau
24d15b1891 CLEANUP: conn_stream: rename the cs_endpoint's context to "conn"
This one is exclusively used by the connection, regardless its generic
name "ctx" is rather confusing. Let's make it a struct connection* and
call it "conn". This way there's no doubt about what it is and there's
no way it will be used by accident by being taken for something else.
2022-05-27 19:33:34 +02:00
Christopher Faulet
a45403f965 Revert "BUG/MINOR: task: Don't defer tasks release when HAProxy is stopping"
This reverts commit d9404b464f.

In fact, there is a BUG_ON() in __task_free() function to be sure the task
is no longer in the wait-queue or the run-queue. Because the patch tries to
fix a "leak" on deinit, it is safer to revert it. there is no reason to
introduce potential bug for this kind of issues. And there is no reason to
impact the normal use-cases at runtime with additionnal conditions to only
remove a task on deinit.
2022-05-25 16:41:52 +02:00
Amaury Denoyelle
8c6176b8db MINOR: h3: refactor SETTINGS parsing/error reporting
Bring some improvment to h3_parse_settings_frm() function. The first one
is the parsing which now manipulates a buffer instead of a plain char*.
This is more to unify with other parsing functions rather than dealing
with data wrapping : it's unlikely to happen as SETTINGS is only
received as the first frame on the control STREAM.

Various errors are now properly reported as connection error :
* on incomplete frame payload
* on a duplicated settings in the same frame
* on reserved settings receive
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
849b24f15b MINOR: h3: abort read on unknown uni stream
As specified by HTTP/3 draft, an unknown unidirectional stream can be
aborted. To do this, use a new flag QC_SF_READ_ABORTED. When the MUX
detects this flag, QCS instance is automatically freed.

Previously, such streams were instead automatically drained. By aborting
them, we economize some useless memcpy instruction. On future data
reception, QCS instance is not found in the tree and considered as
already closed. The frame payload is thus deleted without copying it.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
9cc475182c CLEANUP: h3: remove h3 uni tasklet
Remove all unnecessary bits of code for H3 unidirectional streams. Most
notable, an individual tasklet is not require anymore for each stream.
This is useless since the merge of RX/TX uni streams handling with
bidirectional streams code.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
f8db5aaf78 MEDIUM: quic: refactor uni streams RX
The whole QUIC stack is impacted by this change :
* at quic-conn level, a single function is now used to handle uni and
  bidirectional streams. It uses qcc_recv() function from MUX.
* at MUX level, qc_recv() io-handler function does not skip uni streams
* most changes are conducted at app layer. Most notably, all received
  data is handle by decode_qcs operation.

Now that decode_qcs is the single app read function, the H3 layer can be
simplified. Uni streams parsing was extracted from h3_attach_ruqs() to
h3_decode_qcs().

h3_decode_qcs() is able to deal with all HTTP/3 frame types. It first
check if the frame is valid for the H3 stream type. Most notably,
SETTINGS parsing was moved from h3_control_recv() into h3_decode_qcs().

This commit has some major benefits besides removing duplicated code.
Mainly, QUIC flow control is now enforced for uni streams as with bidi
streams. Also, an unknown frame received on control stream does not set
an error : it is now silently ignored as required by the specification.

Some cleaning in H3 code is already done with this patch :
h3_control_recv() and h3_attach_ruqs() are removed as they are now
unused. A final patch should clean up the unneeded remaining bit.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
3236a8e85c MINOR: h3: define stream type
Define a new enum h3s_t. This is used to differentiate between the
different stream types used in a HTTP/3 connection, including the QPACK
encoder/decoder streams.

For the moment, only bidirectional streams is positioned. This patch
will be useful to unify reception of uni streams with bidirectional
ones.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
6b92394973 MINOR: h3/qpack: use qcs as type in decode callbacks
Replace h3_uqs type by qcs in stream callbacks. This change is done in
the context of unification between bidi and uni-streams. h3_uqs type
will be unneeded when this is achieved.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
f9e190e49a MINOR: quic: support CONNECTION_CLOSE_APP emission
Complete quic-conn API for error reporting. A new parameter <app> is
defined in the function quic_set_connection_close(). This will transform
the frame into a CONNECTION_CLOSE_APP type.

This type of frame will be generated by the applicative layer, h3 or
hq-interop for the moment. A new function qcc_emit_cc_app() is exported
by the MUX layer for them.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
081479df92 CLEANUP: h3: rename uni stream type constants
Cosmetic fix which reduce the name of unidirectional stream constants.
No impact on the code.
2022-05-25 15:41:25 +02:00
Amaury Denoyelle
1c25b18e17 MINOR: mux-quic: delay cs_endpoint allocation
Do not allocate cs_endpoint for every QCS instances in qcs_new().
Instead, this is delayed to qc_attach_cs() function.

In effect, with H3 as app protocol, cs_endpoint will be allocated on
HEADERS parsing. Thus, no cs_endpoint is allocated for H3 unidirectional
streams which do not convey any HTTP data.
2022-05-25 15:41:25 +02:00
Christopher Faulet
d9404b464f BUG/MINOR: task: Don't defer tasks release when HAProxy is stopping
A running or queued task is not released when task_destroy() is called,
except if it is the current task. Its process function is set to NULL and we
let the scheduler to release the task. However, when HAProxy is stopping, it
never happens and some tasks may leak. To fix the issue, we now also rely on
the global MODE_STOPPING flag. When this flag is set, the task is always
immediately released.

This patch should fix the issue #1714. It could be backported as far as 2.4
but it's not a real problem in practice because it only happens on
deinit. The leak exists on previous versions but not MODE_STOPPING flag.
2022-05-25 15:31:21 +02:00
David CARLIER
842e4a6617 BUILD/MINOR: cpuset fix build for FreeBSD 13.1
the cpuset api changes done fir the future 14 release had been
backported to the 13.1 release so changing the cpuset api of choice
condition change accordingly.
2022-05-20 23:06:03 +02:00
Willy Tarreau
b5821e12ce MINOR: connection: add flag MX_FL_FRAMED to mark muxes relying on framed xprt
In order to be able to check compatibility between muxes and transport
layers, we'll need a new flag to tag muxes that work on framed transport
layers like QUIC. Only QUIC has this flag now.
2022-05-20 18:41:55 +02:00
Willy Tarreau
91b780a455 CLEANUP: listener: store stream vs dgram at the bind_conf level
Let's collect the set of xprt-level and sock-level dgram/stream protocols
seen on a bind line and store that in the bind_conf itself while they're
being parsed. This will make it much easier to detect incompatibilities
later than the current approch which consists in scanning all listeners
in post-parsing.
2022-05-20 18:41:55 +02:00
Willy Tarreau
787e92a4fb CLEANUP: listener: replace bind_conf->quic_force_retry with BC_O_QUIC_FORCE_RETRY
It was only set and used once, let's replace it now and take it out of
the ifdef.
2022-05-20 18:41:51 +02:00
Willy Tarreau
1ea6e6a17f CLEANUP: listener: replace bind_conf->generate_cers with BC_O_GENERATE_CERTS
The new flag will now replace this boolean variable.
2022-05-20 18:39:43 +02:00
Willy Tarreau
11ba404c6b CLEANUP: listener: replace all uses of bind_conf->is_ssl with BC_O_USE_SSL
The new flag will now replace this boolean variable that was only set and
tested.
2022-05-20 18:39:43 +02:00
Willy Tarreau
c694471b21 MINOR: listener: add a new "options" entry in bind_conf
There is no way to store useful info there, yet there's about one entry
per boolean. Let's add an "options" attribute which will collect various
options.

In practice, even the BC_O_SSL_* flags and a few info such as strict_sni
could move there.
2022-05-20 18:39:43 +02:00
Willy Tarreau
fca044bda5 CLEANUP: listener: add a comment about what the BC_SSL_O_* flags are for
They're for ->ssl_options but it wasn't obvious.
2022-05-20 18:39:43 +02:00
Willy Tarreau
3882d2a96c MINOR: listener: provide a function to process all of a bind_conf's arguments
The "bind" parsing code was duplicated for the peers section and as a
result it wasn't kept updated, resulting in slightly different error
behavior (e.g. errors were not freed, warnings were emitted as alerts)
Let's first unify it into a new dedicated function that properly reports
and frees the error.
2022-05-20 18:39:43 +02:00
Willy Tarreau
91b47263f7 MINOR: protocol: replace ctrl_type with xprt_type and clarify it
There's been some great confusion between proto_type, ctrl_type and
sock_type. It turns out that ctrl_type was improperly chosen because
it's not the control layer that is of this or that type, but the
transport layer, and it turns out that the transport layer doesn't
(normally) denaturate the underlying control layer, except for QUIC
which turns dgrams to streams. The fact that the SOCK_{DGRAM|STREAM}
set of values was used added to the confusion.

Let's replace it with xprt_type which reuses the later introduced
PROTO_TYPE_* values, and update the comments to explain which one
works at what level.
2022-05-20 18:39:43 +02:00
Amaury Denoyelle
d46b0f52ae MINOR: mux-quic: emit FLOW_CONTROL_ERROR
Send a CONNECTION_CLOSE if the peer emits more data than authorized by
our flow-control. This is implemented for both stream and connection
level.

Fields have been added in qcc/qcs structures to differentiate received
offsets for limit enforcing with consumed offsets for sending of
MAX_DATA/MAX_STREAM_DATA frames.
2022-05-20 17:47:09 +02:00
Amaury Denoyelle
9fab9fd7e5 MINOR: quic/mux-quic: define CONNECTION_CLOSE send API
Define an API to easily set a CONNECTION_CLOSE. This will mainly be
useful for the MUX when an error is detected which require to close the
whole connection.

On the MUX side, a new flag is added when a CONNECTION_CLOSE has been
prepared. This will disable add future send operations.
2022-05-20 17:26:56 +02:00
Frédéric Lécaille
9286210aa8 MINOR: quic: Add tune.quic.retry-threshold keyword
This QUIC specific keyword may be used to set the theshold, in number of
connection openings, beyond which QUIC Retry feature will be automatically
enabled. Its default value is 100.
2022-05-20 17:11:13 +02:00
Frédéric Lécaille
cbd59c7ab6 MINOR: quic: QUIC stats counters handling
First commit to handle the QUIC stats counters. There is nothing special to say
except perhaps for ->conn_openings which is a gauge to count the number of
connection openings. It is incremented after having instantiated a quic_conn
struct, then decremented when the handshake was successful (handshake completed
state) or failed or when the connection timed out without reaching the handshake
completed state.
2022-05-20 17:11:13 +02:00
Frédéric Lécaille
a89659a752 MINOR: quic: Attach proxy QUIC stats counters to the QUIC connection
Make usage of EXTRA_COUNTERS_GET() do to so from qc_new_conn().
2022-05-20 17:11:13 +02:00
Frédéric Lécaille
a58cafeb89 MINOR: quic_stats: Add a new stats module for QUIC
This is a very minimalist frontend only stats module with only one gauge for the
QUIC establishing connections count.
2022-05-20 17:11:13 +02:00
Frédéric Lécaille
2822593a12 BUILD: stats: Missing headers inclusions from stats.h
If we add a new stats module to C source files including only
stats.h we get these errors:

    include/haproxy/stats.h:39:31: error: array type has incomplete element type
    ‘struct name_desc’
       39 | extern const struct name_desc stat_fields[];

    include/haproxy/stats.h:55:50: warning: ‘struct listener’ declared inside
    parameter list will not be visible outside of this definition or declaration
       55 | int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,

name_desc struct is defined in tools-t.h and listener struct in listner-t.h.
2022-05-20 16:57:12 +02:00
Frédéric Lécaille
6492e66e41 MINOR: quic: Move quic_lstnr_dgram_dispatch() out of xprt_quic.c
Remove this function from xprt_quic.c which for now implements only
"by thread attached to a connection" code.
2022-05-20 16:57:12 +02:00
Frédéric Lécaille
3f3ff47998 MINOR: quic: Retry implementation
Here is the format of a token:
        - format (1 byte)
        - ODCID (from 9 up 21 bytes)
        - creation timestamp (4 bytes)
        - salt (16 bytes)

A format byte is required to distinguish the Retry token from others sent in
NEW_TOKEN frames.

The Retry token is ciphered after having derived a strong secret from the cluster secret
and generated the AEAD AAD, as well as a 16 bytes long salt. This salt is
added to the token. Obviously it is not ciphered. The format byte is not
ciphered too.

The AAD are built by quic_generate_retry_token_aad() which concatenates the version,
the client SCID and the IP address and port. We had to implement quic_saddr_cpy()
to copy the IP address and port to the AAD buffer. Only the Retry SCID is generated
on our side to build a Retry packet, the others fields come from the first packet
received by the client. It must reuse this Retry SCID in response to our Retry packet.
So, we have not to store it on our side. Everything is offloaded to the client (stateless).
quic_generate_retry_token() must be used to generate a Retry packet. It calls
quic_pkt_encrypt() to cipher the token.

quic_generate_retry_check() must be used to check the validity of a Retry token.
It is able to decipher a token which arrives into an Initial packet in response
to a Retry packet. It calls parse_retry_token() after having deciphered the token
to store the ODCID into a local quic_cid struct variable. Finally this ODCID may
be stored into the transport parameter thanks to qc_lstnr_params_init().
The Retry token lifetime is 10 seconds. This lifetime is also checked by
quic_generate_retry_check(). If quic_generate_retry_check() fails, the received
packet is dropped without anymore packet processing at this time.
2022-05-20 16:57:12 +02:00
Frédéric Lécaille
55367c8679 MINOR: quic_tls: Add quic_tls_decrypt2() implementation
This function does exactly the same thing as quic_tls_decrypt(), except that
it does reuse its input buffer as output buffer. This is needed
to decrypt the Retry token without modifying the packet buffer which
contains this token. Indeed, this would prevent us from decryption
the packet itself as the token belong to the AEAD AAD for the packet.
2022-05-20 16:57:12 +02:00
Frédéric Lécaille
a9c5d8da58 MINOR: quic_tls: Add quic_tls_derive_retry_token_secret()
This function must be used to derive strong secrets from a non pseudo-random
secret (cluster-secret setting in our case) and an IV. First it call
quic_hkdf_extract_and_expand() to do that for a temporary strong secret (tmpkey)
then two calls to quic_hkdf_expand() reusing this strong temporary secret
to derive the final strong secret and IV.
2022-05-20 16:57:12 +02:00
Frédéric Lécaille
359d877f73 MINOR: quic: Dump initial derived secrets
It seems <qc> parameters was removed for an unknown reason preventing
these secrets to dumped by the traces.
2022-05-20 16:57:12 +02:00
Amaury Denoyelle
fe1c785bcc CLEANUP: quic: adjust comment/coding style for TPs init
Fix typo in comment and adjust code alignment for better readability.
2022-05-19 17:40:09 +02:00
Amaury Denoyelle
0daef007e4 BUG/MEDIUM: quic: fix initialization for local/remote TPs
The local and remote TPs were both processed through the same function
quic_transport_params_init(). This caused the remote TPs to be
overwritten with values configured for our local usage.

Change this by reserving quic_transport_params_init() only for our local
TPs. Remote TPs are simply initialized via
quic_dflt_transport_params_cpy().

This bug could result in a connection closed in error by the client due
to a violation of its TPs. For example, curl client closed the
connection after receiving too many CONNECTION_ID due to an invalid
active_connection_id value used.
2022-05-19 17:40:09 +02:00
Christopher Faulet
c95eaefbfd MEDIUM: check: Use the CS to handle subscriptions for read/write events
Instead of using the health-check to subscribe to read/write events, we now
rely on the conn-stream. Indeed, on the server side, the conn-stream's
endpoint is a multiplexer. Thus it seems appropriate to handle subscriptions
for read/write events the same way than for the streams. Of course, the I/O
callback function is not the same. We use srv_chk_io_cb() instead of
cs_conn_io_cb().
2022-05-19 10:12:38 +02:00
Christopher Faulet
361417f9b4 REORG: check: Rename and export I/O callback function
event_srv_chk_io() function is renamed srv_chk_io_cb() to be consistant with
the I/O callback function of connections. In addition, this function is
exported. It will be required to use the conn-stream's subscriptions.
2022-05-19 10:12:38 +02:00
Amaury Denoyelle
c830e1e904 MINOR: mux-quic: implement MAX_DATA emission
This commit is similar to the previous one but deals with MAX_DATA for
connection-level data flow control. It uses the same function
qcc_consume_qcs() to update flow control level and generate a MAX_DATA
frame if needed.
2022-05-18 16:25:07 +02:00
Amaury Denoyelle
a977355aa1 MINOR: mux-quic: implement MAX_STREAM_DATA emission
Send MAX_STREAM_DATA frames when at least half of the allocated
flow-control has been demuxed, frame and cleared. This is necessary to
support QUIC STREAM with received data greater than a buffer.

Transcoders must use the new function qcc_consume_qcs() to empty the QCS
buffer. This will allow to monitor current flow-control level and
generate a MAX_STREAM_DATA frame if required. This frame will be emitted
via qc_io_cb().
2022-05-18 16:25:07 +02:00
Amaury Denoyelle
c985cb167d MINOR: mux-quic: reorganize flow-control frames emission
Adjust the mechanism for MAX_STREAMS_BIDI emission. When a bidirectional
stream is removed, current flow-control level is checked. If needed, a
MAX_STREAMS_BIDI frame is generated and inserted in a new list in the
QCS instance. The new frames will be emitted at the start of qc_send().

This has no impact on the current MAX_STREAMS_BIDI behavior. However,
this mechanism is more flexible and will allow to implement quickly
MAX_STREAM_DATA/MAX_DATA emission.
2022-05-18 15:52:44 +02:00
Amaury Denoyelle
3a0864067a MINOR: mux-quic: remove qcc_decode_qcs() call in XPRT
Slightly change the interface for qcc_recv() between MUX and XPRT. The
MUX is now responsible to call qcc_decode_qcs(). This is cleaner as now
the XPRT does not have to deal with an extra QCS parameter and the MUX
will call qcc_decode_qcs() only if really needed.

This change is possible since there is no extra buffering for
out-of-order STREAM frames and the XPRT does not have to handle buffered
frames.
2022-05-18 15:50:57 +02:00
Amaury Denoyelle
03dcf560ae BUG/MINOR: mux-quic: update session's idle delay before stream creation
This commit is an adaptation from the following patch :
  commit d0de677682
  Author: Willy Tarreau <w@1wt.eu>
  Date:   Fri Feb 4 09:05:37 2022 +0100
  BUG/MINOR: mux-h2: update the session's idle delay before creating the stream

This should fix the incorrect timeouts present in httplog format for
QUIC requests.
2022-05-18 15:30:13 +02:00
Amaury Denoyelle
ca21c768b9 MINOR: ncbuf: refactor ncb_advance()
First adjusted some typos in comments inside the function. Second,
change the naming of some variable to reduce confusion.

A special case has been inserted when advance is done inside a GAP block
and this block is the last of the buffer. In this case, the whole buffer
will be emptied, equivalent to a ncb_init() operation.
2022-05-18 15:30:13 +02:00
Amaury Denoyelle
82c51b561e OPTIM: quic: realign empty Rx buffer
quic_rx_pkts_del() function removes packets from QUIC RX buffer. In most
cases, the buffer will be emptied after it. In this case, it's useful to
realign it. This will avoid future data wrapping and use of an
unnecessary junk to fill a too small contiguous space.
2022-05-18 15:16:26 +02:00
Maciej Zdeb
d01be2ab13 MINOR: peers: Track number of applets run by thread
Maintain number of peers applets run on all threads. It will be used
in next patch for least loaded thread selection.
2022-05-17 16:13:22 +02:00
Christopher Faulet
d9c1d33fa1 MEDIUM: applet: Add support for async appctx startup on a thread subset
It is now possible to start an appctx on a thread subset. Some controls were
added here and there. It is forbidden to start a backend appctx on another
thread than the local one. If a frontend appctx is started on another thread
or a thread subset, the applet .init callback function must be defined. This
callback function is responsible to finalize the appctx startup. It can be
performed synchornously. In this case, the appctx is started on the local
thread. It is not really useful but it is valid. Or it can be performed
asynchronously. In this case, .init callback function is called when the
appctx is woken up for the first time. When this happens, the appctx
affinity is set to the current thread to be able to start the session and
the stream.
2022-05-17 16:13:22 +02:00
Christopher Faulet
6095d57701 MINOR: applet: Add API to start applet on a thread subset
In the same way than for the tasks, the applets api was changed to be able
to start a new appctx on a thread subset. For now the feature is
disabled. Only appctx_new_here() is working. But it will be possible to
start an appctx on a specific thread or a subset via a mask.
2022-05-17 16:13:22 +02:00
Christopher Faulet
387e79727c MINOR: peers: Add a ref to peers section in the peer structure
This change is required to handle asynchrone init of the appctx. It is now
possible to directly get the peers section associated to a peer.
2022-05-17 16:13:22 +02:00
Christopher Faulet
2ae25ea24b MINOR: sink: Add a ref to sink in the sink_forward_target structure
This change is required to be able to refactor the init stage of appctx. It
is now possible to directly get the sink from a forward target.
2022-05-17 16:13:22 +02:00
Christopher Faulet
d0c4ec04b8 MINOR: applet: Add function to release appctx on error during init stage
appctx_free_on_early_error() must be used to release a freshly created
frontend appctx if an error occurred during the init stage. It takes care to
release the stream instead of the appctx if it exists. For a backend appctx,
it just calls appctx_free().
2022-05-17 16:13:21 +02:00
Christopher Faulet
8718c95c0a MINOR: applet: Add a function to finalize frontend appctx startup
appctx_finalize_startup() may be used to finalize the frontend appctx
startup. It is responsible to create the appctx's session and the frontend
conn-stream. On error, it is the caller responsibility to release the
appctx. However, the session is released if it was created. On success, if
an error is encountered in the caller function, the stream must be released
instead of the appctx.

This function should ease the init stage when new appctx is created.
2022-05-17 16:13:21 +02:00
Christopher Faulet
16c0d9cda0 MINOR: applet: Add appctx_init() helper fnuction
It is just a helper function that call the .init applet callback function,
if it exists. This will simplify a bit the init stage when a new applet is
started. For now, this callback function is only used when a new service is
started.
2022-05-17 16:13:21 +02:00
Christopher Faulet
ab5d1dceed MINOR: stream: Export stream_free()
The stream_free() function is now public. It is mandatory to properly handle
errors when a new applet is started.
2022-05-17 16:13:21 +02:00
Christopher Faulet
c9929380a4 MINOR: applet: Change return value for .init callback function
0 is now returned on success and -1 on error.
2022-05-17 16:13:21 +02:00
Christopher Faulet
ac57bb527a MINOR: applet: Prepare appctx to own the session on frontend side
Applets were moved at the same level than multiplexers. Thus, gradually,
applets code is changed to be less dependent from the stream. With this
commit, the frontend appctx are ready to own the session. It means a
frontend appctx will be responsible to release the session.
2022-05-17 16:13:21 +02:00
Christopher Faulet
ef5e1bb4cf CLEANUP: conn-stream: Remove cs_applet_shut declaration from header file
This function was renamed and moved in applet code. cs_applet_shut() does
not exist anymore. Its declaration must be removed.
2022-05-17 16:13:21 +02:00
David Carlier
135c1ec139 BUILD: fix build warning on solaris based systems with __maybe_unused.
__maybe_unused is already defined there.
2022-05-17 11:42:20 +02:00
Remi Tricot-Le Breton
1746a388c5 MINOR: ssl: Add 'ssl-provider' global option
When HAProxy is linked to an OpenSSLv3 library, this option can be used
to load a provider during init. You can specify multiple ssl-provider
options, which will be loaded in the order they appear. This does not
prevent OpenSSL from parsing its own configuration file in which some
other providers might be specified.
A linked list of the providers loaded from the configuration file is
kept so that all those providers can be unloaded during cleanup. The
providers loaded directly by OpenSSL will be freed by OpenSSL.
2022-05-17 10:56:05 +02:00
Christopher Faulet
18c13d3bd8 MEDIUM: http-ana: Add a proxy option to restrict chars in request header names
The "http-restrict-req-hdr-names" option can now be set to restrict allowed
characters in the request header names to the "[a-zA-Z0-9-]" charset.

Idea of this option is to not send header names with non-alphanumeric or
hyphen character. It is especially important for FastCGI application because
all those characters are converted to underscore. For instance,
"X-Forwarded-For" and "X_Forwarded_For" are both converted to
"HTTP_X_FORWARDED_FOR". So, header names can be mixed up by FastCGI
applications. And some HAProxy rules may be bypassed by mangling header
names. In addition, some non-HTTP compliant servers may incorrectly handle
requests when header names contain characters ouside the "[a-zA-Z0-9-]"
charset.

When this option is set, the policy must be specify:

  * preserve: It disables the filtering. It is the default mode for HTTP
              proxies with no FastCGI application configured.

  * delete: It removes request headers with a name containing a character
            outside the "[a-zA-Z0-9-]" charset. It is the default mode for
            HTTP backends with a configured FastCGI application.

  * reject: It rejects the request with a 403-Forbidden response if it
            contains a header name with a character outside the
            "[a-zA-Z0-9-]" charset.

The option is evaluated per-proxy and after http-request rules evaluation.

This patch may be backported to avoid any secuirty issue with FastCGI
application (so as far as 2.2).
2022-05-16 16:00:26 +02:00
Amaury Denoyelle
45fce8fcb5 CLEANUP: quic: remove unused quic_rx_strm_frm
quic_rx_strm_frm type was used to buffered STREAM frames received out of
order. Now the MUX is able to deal directly with these frames and
buffered it inside its ncbuf.
2022-05-13 17:29:52 +02:00
Amaury Denoyelle
00f87bbaa3 CLEANUP: mux-quic: remove unused fields for Rx
Rx has been simplified since the conversion of buffer to a ncbuf. The
old buffer can now be removed. The frms tree is also removed. It was
used previously to stored out-of-order received STREAM frames. Now the
MUX is able to buffer them directly into the ncbuf.
2022-05-13 17:29:52 +02:00
Amaury Denoyelle
1290f1ebfb MEDIUM: mux-quic/h3/hq-interop: use ncbuf for bidir streams
Add a ncbuf for data reception on qcs. Thanks to this, the MUX is able
to buffered all received frame directly into the buffer. Flow control
parameters will be used to ensure there is never an overflow.

This change will simplify Rx path with the future deletion of acked
frames tree previously used for frames out of order.
2022-05-13 17:28:46 +02:00
Amaury Denoyelle
06749f3d6f MINOR: xprt_quic: adjust flow-control according to bufsize
Redefine the initial local flow-control to enforce by us. Use bufsize as
the maximum offset allowed to be received.

This change is part of an adjustement on the Rx path. Mux buffer will be
converted to a ncbuf. Flow-control parameters must ensure that we never
receive a frame larger than the buffer. With this, all received frames
will be stored in the MUX buffer.
2022-05-13 17:22:19 +02:00
Willy Tarreau
6796a06278 CLEANUP: conn_stream: merge cs_new_from_{mux,applet} into cs_new_from_endp()
The two functions became exact copies since there's no more special case
for the appctx owner. Let's merge them into a single one, that simplifies
the code.
2022-05-13 14:28:48 +02:00
Willy Tarreau
0698c80a58 CLEANUP: applet: remove the unneeded appctx->owner
This one is the pointer to the conn_stream which is always in the
endpoint that is always present in the appctx, thus it's not needed.
This patch removes it and replaces it with appctx_cs() instead. A
few occurences that were using __cs_strm(appctx->owner) were moved
directly to appctx_strm() which does the equivalent.
2022-05-13 14:28:48 +02:00
Willy Tarreau
c1b8d77805 MINOR: applet: add appctx_strm() and appctx_cs() to access common fields
It's very common to have to access a stream or a conn_stream from the
appctx, let's add trivial accessors for that.
2022-05-13 14:28:48 +02:00
Willy Tarreau
1c3ead45a4 MINOR: applet: replace cs_applet_shut() with appctx_shut()
The former takes a conn_stream still attached to a valid appctx,
which also complicates the termination of the applet. Instead, let's
pass the appctx which already points to the endpoint, this allows us
to properly detach the conn_stream before the call, which is cleaner
and safer.
2022-05-13 14:28:48 +02:00
Willy Tarreau
4201ab791d CLEANUP: muxes: make mux->attach/detach take a conn_stream endpoint
The mux ->detach() function currently takes a conn_stream. This causes
an awkward situation where the caller cs_detach_endp() has to partially
mark it as released but not completely so that ->detach() finds its
endpoint and context, and it cannot be done later since it's possible
that ->detach() deletes the endpoint. As such the endpoint link between
the conn_stream and the mux's stream is in a transient situation while
we'd like it to be clean so that the mux's ->detach() code can call any
regular function it wants that knows the regular semantics of the
relation between the CS and the endpoint.

A better approach consists in slightly modifying the detach() API to
better match the reality, which is that the endpoint is detached but
still alive and that it's the only part the function is interested in.

As such, this patch modifies the function to take an endpoint there,
and by analogy (or simplicity) does the same for ->attach(), even
though it looks less important there since we're always attaching an
endpoint to a conn_stream anyway. It is possible that in the future
the API could evolve to use more endpoints that provide a bit more
flexibility in the API, but at this point we don't need to go further.
2022-05-13 14:28:48 +02:00
Willy Tarreau
01c2a4a86f MINOR: mux-quic: remove the now unneeded conn_stream from the qcs
Since we always have a valid endpoint we can safely use it to access
the conn_stream and stop using qcs->cs. That's one less pointer to
care about.
2022-05-13 14:28:48 +02:00
Willy Tarreau
efb4618c6e MINOR: conn_stream: add a pointer back to the cs from the endpoint
Muxes and applets need to have both a pointer to the endpoint and to the
conn_stream. It would seem more natural that they only have a pointer to
the endpoint (that is always there) and that this one has an optional
pointer to the conn_stream. This would reduce the number of elements to
manipulate in lower level code. In addition, the conn_stream is not much
used from the lower layers (wake and exceptional events mostly).
2022-05-13 14:28:48 +02:00
Willy Tarreau
386346f5eb MINOR: conn_stream: make cs_set_error() work on the endpoint instead
Wherever we need to report an error, we have an even easier access to
the endpoint than the conn_stream. Let's first adjust the API to use
the endpoint and rename the function accordingly to cs_ep_set_error().
2022-05-13 14:27:57 +02:00
Amaury Denoyelle
df25acf47f MINOR: ncbuf: implement advance
A new function ncb_advance() is implemented. This is used to advance the
buffer head pointer. This will consume the front data while forming a
new gap at the end for future data.

On success NCB_RET_OK is returned. The operation can be rejected if a
too small new gap is formed in front of the buffer.
2022-05-12 18:29:55 +02:00
Amaury Denoyelle
b830f0d8d9 MINOR: ncbuf: define various insertion modes
Define three different ways to proceed insertion. This configures how
overlapping data is treated.
- NCB_ADD_PRESERVE : in this mode, old data are kept during insertion.
- NCB_ADD_OVERWRT : new data will overwrite old ones.
- NCB_ADD_COMPARE : this mode adds a new test in check stage. The
  overlapping old and new data must be identical or else the insertion
  is not conducted. An error NCB_RET_DATA_REJ is used in this case.

The mode is specified with a new argument to ncb_add() function.
2022-05-12 18:27:05 +02:00
Amaury Denoyelle
077e096b30 MINOR: ncbuf: implement insertion
Implement a new function ncb_add() to insert data in ncbuf. This
operation is conducted in two stages. First, a simulation will be run to
ensure that insertion can be proceeded. If a gap is formed, either
before or after the new data, it must be big enough to store its header,
or else the insertion is aborted.

After this check stage, the insertion is conducted block by block with
the function pair ncb_fill_data_blk()/ncb_fill_gap_blk().

A new type ncb_ret is used as a return value. For the moment, only
success or gap-size error is used. It is planned to add new error types
in the future when insertion will be extended.
2022-05-12 18:27:05 +02:00
Amaury Denoyelle
edeb0a61a2 MINOR: ncbuf: optimize storage for the last gap
Relax the constraint for gap storage when this is the last block.
ncb_blk API functions will consider that if a gap is stored near the end
of the buffer, without the space to store its header, the gap will cover
entirely the buffer end.

For these special cases, the gap size/data size are not write/read
inside the gap to prevent an overflow. Such a gap is designed in
functions as "reduced gap" and will be flagged with the value
NCB_BK_F_FIN.

This should reduce the rejection on future add operation when receiving
data in-order. Without reduced gap handling, an insertion would be
rejected if it covers only partially the last buffer bytes, which can be
a very common case.
2022-05-12 18:18:47 +02:00
Amaury Denoyelle
d5d2ed90f0 MINOR: ncbuf: complete API and define block interal abstraction
Implement two new functions to report the total data stored accross the
whole buffer and the data stored at a specific offset until the next gap
or the buffer end.

To facilitate implementation of these new functions and also future
add/delete operations, a new abstraction is introduced : ncb_blk. This
structure represents a block of either data or gap in the buffer. It
simplifies operation when moving forward in the buffer. The first buffer
block can be retrieved via ncb_blk_first(buf). The block at a specific
offset is accessed via ncb_blk_find(buf, off).

This abstraction is purely used in functions but not stored in the ncbuf
structure per-se. This is necessary to keep the minimal memory
footprint.
2022-05-12 18:18:47 +02:00
Amaury Denoyelle
1b5f77fc18 MINOR: ncbuf: define non-contiguous buffer
Define the new type ncbuf. It can be used as a buffer with
non-contiguous data and wrapping support.

To reduce as much as possible the memory footprint, size of data and
gaps are stored in the gaps themselves. This put some limitation on the
buffer usage. A reserved space is present just before the head to store
the size of the first data block. Also, add and delete operations will
be constrained to ensure minimal gap sizes are preserved.

The sizes stored in the gaps are represented by a custom type named
ncb_sz_t. This type is a typedef to easily change it : this has a
direct impact on the maximum buffer size (MAX(ncb_sz_t) - sizeof(ncb_sz_t))
and the minimal gap sizes (sizeof(ncb_sz_t) * 2)).
Currently, it is set to uint32_t.
2022-05-12 18:13:21 +02:00
Frédéric Lécaille
31fe308acc CLEANUP: quic_tls: QUIC_TLS_IV_LEN defined two times
Hopefully with the same value!
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
a54e49d0b1 CLEANUP: quic: wrong use of eb*entry() macro
This wrong use has no consequence because the ->node member fields of
eb*node structs are the first.
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
e2fb1bf487 MINOR: quic: Send stateless reset tokens
Add send_stateless_reset() to send a stateless reset packet. It prepares
a packet to build a 1-RTT packet with quic_stateless_reset_token_cpy()
to copy a stateless reset token derived from the cluster secret with
the destination connection ID received as salt.
Also add QUIC_EV_STATELESS_RST new trace event to at least to have a trace
of the connection which are reset.
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
0226c521b0 MINOR: quic: new_quic_cid() code moving
This function will have to call another one from quic_tls.[ch] soon.
As we do not want to include quic_tls.h from xprt_quic.h because
quic_tls.h already includes xprt_quic.h, let's moving it into
xprt_quic.c.
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
7b92c81e43 MINOR: quic-tls: Add quic_hkdf_extract_and_expand() for HKDF
This is a wrapper function around OpenSSL HKDF API functions to
use the "extract-then-expand" HKDF mode as defined by rfc5869.
This function will be used to derived stateless reset tokens
from secrets ("cluster-secret" conf. keyword) and CIDs (as salts).
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
372508cc42 MINOR: config: Add "cluster-secret" new global keyword
It could be usefull to set a ASCII secret which could be used for different
usages. For instance, it will be used to derive QUIC stateless reset tokens.
2022-05-12 17:48:35 +02:00
Frédéric Lécaille
7cc8b3166a MINOR: quic: Add correct ack delay values to ACK frames
A ->time_received new member is added to quic_rx_packet to store the time the
packet are received. ->largest_time_received is added the the packet number
space structure to store this timestamp for the packet with a new largest
packet number to be acknowledged. QUIC_FL_PKTNS_NEW_LARGEST_PN new flag is
added to mark a packet number space as having to acknowledged a packet wih a
new largest packet number. In this case, the packet number space ack delay
must be recalculated.
Add quic_compute_ack_delay_us() function to compute the ack delay from the value
of the time a packet was received. Used only when a packet with a new largest
packet number.
2022-05-12 15:30:14 +02:00
Frédéric Lécaille
9475d890ee MINOR: quic: Congestion controller event trace fix (loss)
Missing event type (loss).
2022-05-12 15:30:14 +02:00
Frédéric Lécaille
f6e8594469 BUG/MINOR: quic: Wrong unit for ack delay for incoming ACK frames
This ACK frame field value is in microseconds. Everything is interpreted
and stored in milliseconds in our QUIC implementation.
2022-05-12 15:30:14 +02:00
Frédéric Lécaille
5b988ebed1 BUG/MINOR: quic: Dropped peer transport parameters
The call to quic_dflt_transport_params_cpy() is already first done by
quic_transport_params_init() which is a good thing. But this function was also
called each time we parsed a transport parameters with quic_transport_param_decode(),
re-initializing to default values some of them. The transport parameters concerned
by this bug are the following:
   - max_udp_payload_size
   - ack_delay_exponent
   - max_ack_delay
   - active_connection_id_limit
So, let's remove this call to quic_dflt_transport_params_cpy() which has nothing
to do here!
2022-05-12 15:26:10 +02:00
Frédéric Lécaille
8726d633d4 MINOR: quic: Add a debug counter for sendto() errors
As we do not have any task to be wake up by the poller after sendto() error,
we add an sendto() error counter to the quic_conn struct.
Dump its values from qc_send_ppkts().
2022-05-12 15:11:53 +02:00
Emeric Brun
314e6ec822 BUG/MAJOR: dns: multi-thread concurrency issue on UDP socket
This patch adds a lock on the struct dgram_conn to ensure
that an other thread cannot trash a fd or alter its status
while the current thread processing it on for send/receive/connect
operations.

Starting with the 2.4 version this could cause a crash when a DNS
request is failing, setting the FD of the dgram structure to -1. If the
dgram structure is reused after that, a read access to fdtab[-1] is
attempted. The crash was only triggered when compiled with ASAN.

In previous versions the concurrency issue also exists but is less
likely to crash.

This patch must be backported until v2.4 and should be
adapt for v < 2.4.
2022-05-11 15:20:10 +02:00
vigneshsp
47a4c61d63 BUG/MINOR: server: Make SRV_STATE_LINE_MAXLEN value from 512 to 2kB (2000 bytes).
The statefile before this patch can only parse lines within 512
characters, now as we made the value to 2000, it can support a
line of length of 2kB.

This patch fixes GitHub issue #1530.
It should be backported to all stable releases.
2022-05-11 11:39:06 +02:00
Willy Tarreau
8a0fd3a36c BUILD: debug: work around gcc-12 excessive -Warray-bounds warnings
As was first reported by Ilya in issue #1513, compiling with gcc-12
adds warnings about size 0 around each BUG_ON() call due to the
ABORT_NOW() macro that tries to dereference pointer value 1.

The problem is known, seems to be complex inside gcc and could only
be worked around for now by adjusting a pointer limit so that the
warning still catches NULL derefs in the first page but not other
values commonly used in kernels and boot loaders:
   https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=91f7d7e1b

It's described in more details here:
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104657
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103768

And some projects had to work around it using various approaches,
some of which are described in the bugs reports above, plus another
one here:
   https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/HLK3BHP2T3FN6FZ46BIPIK3VD5FOU74Z/

In haproxy we can hide it by hiding the pointer in a DISGUISE() macro,
but this forces the pointer to be loaded into a register, so that
register is lost precisely where we want to get the maximum of them.

In our case we purposely use a low-value non-null pointer because:
  - it's mandatory that this value fits within an unmapped page and
    only the lowest one has this property
  - we really want to avoid register loads for the address, as these
    will be lost and will complicate the bug analysis, and they tend
    to be used for large addresses (i.e. instruction length limit).
  - the compiler may decide to optimize away the null deref when it
    sees it (seen in the past already)

As such, the current workaround merged in gcc-12 is not effective for
us.

Another approach consists in using pragmas to silently disable
-Warray-bounds and -Wnull-dereference only for this part. The problem
is that pragmas cannot be placed into macros.

The resulting solution consists in defining a forced-inlined function
only to trigger the crash, and surround the dereference with pragmas,
themselves conditionned to gcc >= 5 since older versions don't
understand them (but they don't complain on the dereference at least).

This way the code remains the same even at -O0, without the stack
pointer being modified nor any address register being modified on
common archs (x86 at least). A variation could have been to rely on
__builtin_trap() but it's not everywhere and it behaves differently
on different platforms (undefined opcode or a nasty abort()) while
the segv remains uniform and effective.

This may need to be backported to older releases once users start to
complain about gcc-12 breakage.
2022-05-09 20:32:11 +02:00
Willy Tarreau
ecab71fbac BUILD: stats: conditionally mark obsolete stats states as deprecated
The obsolete stats states STAT_ST_* were marked as deprecated with recent
commit 6ef1648dc ("CLEANUP: stats: rename the stats state values an mark
the old ones deprecated"), except that this feature requires gcc 6 and
above. Let's use the macro that depends on this condition instead.

The issue appeared on 2.6-dev9 so no backport is needed.
2022-05-09 20:32:11 +02:00
Willy Tarreau
4fc2cd7c8e MINOR: compiler: add a new macro to set an attribute on an enum when possible
Gcc 6 and above support placing an attribute on an enum's value. This
is convenient for marking some values as deprecated. We just need the
macro because older versions fail to parse __attribute__() there.
2022-05-09 20:32:11 +02:00
David CARLIER
4aed40e6c7 MINOR: tcp: socket translate TCP_KEEPIDLE for macOs equivalent
On Linux the interval before starting to send TCP keep-alive packets
is defined by TCP_KEEPIDLE. MacOS has an equivalent with TCP_KEEPIDLE,
which also uses seconds as a unit, so it's possible to simply remap the
definition of TCP_KEEPIDLE to TCP_KEEPALIVE there and get it to seamlessly
work. The other settings (interval and count) are not present, though.
2022-05-08 10:35:39 +02:00
Willy Tarreau
6ef1648dc2 CLEANUP: stats: rename the stats state values an mark the old ones deprecated
The STAT_ST_* values have been abused by virtually every applet and CLI
keyword handler, and this must not continue as it's a source of bugs and
of overly complicated code.

This patch renames the states to STAT_STATE_*, and keeps the previous
enum while marking each entry as deprecated. This should be sufficient to
catch out-of-tree code that might rely on them and to let them know what
to do with that.
2022-05-06 18:33:49 +02:00
Willy Tarreau
1c0715b12a CLEANUP: cli: move the status print context into its own context
Now that the CLI's print context is alone in the appctx, it's possible
to refine the appctx's ctx layout so that the cli part matches exactly
a regular svcctx, and as such move the CLI context into an svcctx like
other applets. External code will still build and work because the
struct cli perfectly maps onto the struct cli_print_ctx that's located
into svc.storage. This is of course only to make a smooth transition
during 2.6 and will disappear immediately after.

A tiny change had to be applied to the opentracing addon which performs
direct accesses to the CLI's err pointer in its own print function. The
rest uses the standard cli_print_* which were the only ones that needed
a small change.

The whole "ctx.cli" struct could be tagged as deprecated so that any
possibly existing external code that relies on it will get build
warnings, and the comments in the struct are pretty clear about the
way to fix it, and the lack of future of this old API.
2022-05-06 18:33:22 +02:00
Willy Tarreau
aa229ccc4c MINOR: lua: move the http service context out of appctx.ctx
Just like for the TCP service, let's move the context away from
appctx.ctx. A new struct hlua_http_ctx was defined, reserved in
hlua_applet_http_init() and used everywhere else. Similarly, the
task dump code will no more report decoded stack traces in case
these services would be involved. That may be solved later.
2022-05-06 18:13:36 +02:00
Willy Tarreau
e23f33bbfe MINOR: lua: move the tcp service storage outside of appctx.ctx
The use-service mechanism for Lua in TCP mode relies on the
hlua_tcp storage in appctx->ctx. We can move its definition to
hlua.c and simply use appctx_reserve_svcctx() to reserve and access
the stoage. One tiny side effect is that the task dump used in panics
will not show anymore the Lua call stack in its trace. For this a
better API is needed from the Lua code to expose a function that does
the job from an appctx.
2022-05-06 18:13:36 +02:00
Willy Tarreau
5321da9df0 MEDIUM: lua: move the cosocket storage outside of appctx.ctx
The Lua cosockets were using appctx.ctx.hlua_cosocket. Let's move this
to a local definition of "hlua_csk_ctx" in hlua.c, which is allocated
from the appctx by hlua_socket_new(). There's a notable change which is
that, while previously the xref link with the peer was established with
the appctx, it's now in the hlua_csk_ctx. This one must then hold a
pointer to the appctx. The code was adjusted accordingly, and now that
part of the code doesn't use the appctx.ctx anymore.
2022-05-06 18:13:36 +02:00
Willy Tarreau
f61494c708 CLEANUP: cache: take the context out of appctx.ctx
The context was moved to a local definition in the cache code, and
there's nothing specific to the cache anymore in the appctx. The
struct is stored into the appctx's storage area via the svcctx.
2022-05-06 18:13:36 +02:00
Willy Tarreau
c7afedc140 BUILD: applet: mark the appctx's st2 variable as deprecated
This one has been misused for a while as well, it's time to deprecate it
since we don't use it anymore. It will be removed in 2.7 and for now is
only marked as deprecated. Since we need to guarantee that it's zeroed
before starting any applet or CLI command, it was moved into an anonymous
union where its sibling is not marked as deprecated so that we can
continue to initialize it without triggering a warning.

If you found this commit after a bisect session you initiated to figure
why you got some build warnings and don't know what to do, have a look
at the code that deals with the "show fd", "show sess" or "show servers"
commands, as it's supposed to be self-explanatory about the tiny changes
to apply to your code to port it. If you find APPLET_MAX_SVCCTX to be
too small for your use case, either kindly ask for a tiny extension
(and try to get your code merged), or just use a pool.
2022-05-06 18:13:36 +02:00
Willy Tarreau
f50da2c320 BUILD: applet: mark the CLI's generic variables as deprecated
The generic context variables p0/p1/p2, i0/i1, o0/o1 have been abused
and causing trouble for too long, it's time to remove them now that
they are not used anymore.

However the risk that external code still uses them is not nul and we
had not warned before about their removal. Let's mark them deprecated
in 2.6 and removed in 2.7. This will let external code continue to work
(as well as it could if it misuses them), with a strong encouragement
on updating it.

If you found this commit after a bisect session you initiated to figure
why you got some build warnings and don't know what to do, have a look
at the code that deals with the "show fd", "show env" or "show servers"
commands, as it's supposed to be self-explanatory about the tiny changes
to apply to your code to port it. If you find APPLET_MAX_SVCCTX to be
too small for your use case, either kindly ask for a tiny extension
(and try to get your code merged), or just use a pool.
2022-05-06 18:13:36 +02:00
Willy Tarreau
23a2407843 CLEANUP: spoe: do not use appctx.ctx anymore
The spoe code already uses its own generic pointer, let's move it to
svcctx instead of keeping a struct spoe in the appctx union.
2022-05-06 18:13:36 +02:00
Willy Tarreau
455caef642 CLEANUP: peers: do not use appctx.ctx anymore
The peers code already uses its own generic pointer, let's move it to
svcctx instead of keeping a struct peers in the appctx union.
2022-05-06 18:13:36 +02:00
Willy Tarreau
1eea6657fb CLEANUP: httpclient: do not use the appctx.ctx anymore
The httpclient already uses its own pointer and only used to store this
single pointer into the appctx.ctx field. Let's just move it to the
svcctx and remove this entry from the appctx union.
2022-05-06 18:13:36 +02:00
Willy Tarreau
cba8838e59 CLEANUP: ring: pass the ring watch flags to ring_attach_cli(), not in ctx.cli
The ring watch flags (wait, seek end) were dangerously passed via ctx.cli.i0
from "show buf" in sink.c:cli_parse_show_events(), or implicitly reset in
"show errors". That's very unconvenient, difficult to follow, and prone to
short-term breakage.

Let's pass an extra argument to ring_attach_cli() to take these flags, now
defined in ring-t.h as RING_WF_*, and let the function set them itself
where appropriate (still ctx.cli.i0 for now).
2022-05-06 18:13:36 +02:00
Willy Tarreau
42cc831abf CLEANUP: sink: use the generic context to store the forwarder's context
Instead of having a struct that contains a single pointer in the appctx
context, let's directly use the generic context pointer and get rid of
the now unused sft.ptr entry.
2022-05-06 18:13:36 +02:00
Willy Tarreau
dec23dc43f CLEANUP: ssl/cli: use a local context for "commit ssl {ca|crl}file"
These two commands use distinct parse/release functions but a common
iohandler, thus they need to keep the same context. It was created
under the name "commit_cacrlfile_ctx" and holds a large part of the
pointers (6) and the ca_type field that helps distinguish between
the two commands for the I/O handler. It looks like some of these
fields could have been merged since apparently the CA part only
uses *cafile* and the CRL part *crlfile*, while both old and new
are of type cafile_entry and set only for each type. This could
probably even simplify some parts of the code that tries to use
the correct field.

These fields were the last ones to be migrated thus the appctx's
ssl context could finally be removed.
2022-05-06 18:13:36 +02:00
Willy Tarreau
329f4b4f2f CLEANUP: ssl/cli: use a local context for "set ssl cert"
The command doesn't really need any storage since there's only a parser,
but since it used this context, there might have been plans for extension,
so better continue with a persistent one. Only old_ckchs, new_ckchs, and
path were being used from the appctx's ssl context. There ones moved to
the local definition, and the two former ones were removed from the appctx
since not used anymore.
2022-05-06 18:13:36 +02:00
Willy Tarreau
96c9a6c752 CLEANUP: ssl/cli: use a local context for "show ssl cert"
This command only really uses old_ckchs, cur_ckchs and the index
in which the transaction was stored. The new structure "show_cert_ctx"
only has these 3 fields, and the now unused "cur_ckchs" and "index"
could be removed from the shared ssl context.
2022-05-06 18:13:36 +02:00
Willy Tarreau
f3e8b3e877 CLEANUP: ssl/cli: use a local context for "show crlfile"
Now this command doesn't share any context anymore with "show cafile"
nor with the other commands. The previous "cur_cafile_entry" field from
the applet's ssl context was removed as not used anymore. Everything was
moved to show_crlfile_ctx which only has 3 fields.
2022-05-06 18:13:36 +02:00
Willy Tarreau
50c2f1e0cd CLEANUP: ssl/cli: use a local context for "show cafile"
Saying that the layout and usage of the various variables in the ssl
applet context is a mess would be an understatement. It's very hard
to know what command uses what fields, even after having moved away
from the mix of cli and ssl.

Let's extract the parts used by "show cafile" into their own structure.
Only the "show_all" field would be removed from the ssl ctx, the other
fields are still shared with other commands.
2022-05-06 18:13:35 +02:00
Willy Tarreau
bcda5f6bcd CLEANUP: hlua/cli: take the hlua_cli context definition out of the appctx
This context is used by CLI keywords registered by Lua. We can take
it out of the appctx and use the generic command context allocation so
that the appctx doesn't have to declare a specific one anymore. The
context is created during parsing.
2022-05-06 18:13:35 +02:00
Willy Tarreau
41f885241e CLEANUP: stats/cli: stop using appctx->st2
Instead, let's have the state as an enum inside the context. It's much
cleaner and safer as we know nobody else touches it.
2022-05-06 18:13:35 +02:00
Willy Tarreau
91cefcaba4 CLEANUP: stats/cli: take the "show stat" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing (both in the CLI and HTTP).

The change looks large but it's particularly mechanical. The context
initialization appears in stats.c and http_ana.c. The context is used
in stats.c and resolvers.c since "show stat resolvers" points there.
That's the reason why the definition moved to stats.h. "show info"
and "show stat" continue to share the same state definition for now.

Nothing else was modified.
2022-05-06 18:13:35 +02:00
Willy Tarreau
cb8bf17900 CLEANUP: peers/cli: take the "show peers" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing. The code also uses st2 which deserves being
addressed in separate commit.
2022-05-06 18:13:35 +02:00
Willy Tarreau
0fcecc63c8 CLEANUP: map/cli: take the "show map" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing. Many commands, including pure parsers, use this
context but that's not a problem as it's designed to be used this way.
Due to this, many lines are changed but that's in fact a replacement of
"appctx->ctx.map" with "ctx->". Note that the code also uses st2 which
deserves being addressed in separate commit.
2022-05-06 18:13:35 +02:00
Willy Tarreau
3c69e08e96 CLEANUP: stick-table/cli: take the "show table" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing. The code also uses st2 which deserves being
addressed in separate commit.
2022-05-06 18:13:35 +02:00
Willy Tarreau
0fd8f0e236 CLEANUP: proxy/cli: take the "show errors" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing.

The code still has room for improvement, such as in the "flags" field
where bits are hard-coded, but they weren't modified.
2022-05-06 18:13:35 +02:00
Willy Tarreau
39f097d965 CLEANUP: stream/cli: take the "show sess" context definition out of the appctx
This makes use of the generic command context allocation so that the
appctx doesn't have to declare a specific one anymore. The context is
created during parsing.
2022-05-06 18:13:35 +02:00
Willy Tarreau
f12f32a0fa MINOR: applet: reserve some generic storage in the applet's context
Instead of using existing fields and having to put keyword-specific
contexts in the applet definition, let's have the appctx provide a
generic storage area that's currently large enough for existing CLI
commands and small applets, and a function to allocate that storage.

The function will be responsible for verifying that the requested size
fits in the area so that the caller doesn't need to add specific checks;
it is validated during development as this size is static and will
not change at runtime. In addition the caller doesn't even need to
free() the area since it's part of an existing context. For the
caller's convenience, a context pointer "svcctx" for the command is
also provided so that the allocated area can be placed there (or
possibly any other one in case a larger area is needed).

The struct's layout has been temporarily complicated by adding one
level of anonymous union on top of the "ctx" one. This will allow us
to preserve "ctx" during 2.6 for compatibility with possible external
code and get rid of it in 2.7. This explains why the diff extends to
the whole "ctx" union, but a "git show -b" shows that only one extra
layer was added. In order to make both the svcctx pointer and its
storage accessible without further enlarging the appctx structure,
both svcctx and the storage share the same storage as the ctx part.
This is done by having them placed in the union with a protected
overlapping area for svcctx, for which a shadow member is also
present in the storage area:

    union {
       void* svcctx;         // variable accessed by services
       struct {
           void *shadow;     // shadow of svcctx;
           char storage[];   // where most services store their data
       };
       union {               // older commands store here and ignore svcctx
          ...
       } ctx;
    };

I.e. new applications will use appctx->svcctx while older ones will be
able to continue to use appctx->ctx.*

The whole area (including the pointer's context) is zeroed before any
applet is initialized, and before CLI keyword processor's first invocation,
as it is an important part of the existing keyword processors, which makes
CLI keywords effectively behave like applets.
2022-05-06 18:13:35 +02:00
Willy Tarreau
4fd9b4ddf0 BUG/MINOR: ssl/cli: fix "show ssl cert" not to mix cli+ssl contexts
The "show ssl cert" command mixes some generic pointers from the
"ctx.cli" struct with context-specific ones from "ctx.ssl" while both
are in a union. Amazingly, despite the use of both p0 and i0 to store
respectively a pointer to the current ckchs and a transaction id, there
was no overlap with the other pointers used during these operations,
but should these fields be reordered or slightly updated this will break.
Comments were added above the faulty functions to indicate which fields
they are using.

This needs to be backported to 2.5.
2022-05-06 18:13:35 +02:00
Willy Tarreau
06305798f7 BUG/MINOR: ssl/cli: fix "show ssl ca-file <name>" not to mix cli+ssl contexts
The "show ssl ca-file <name>" command mixes some generic pointers from
the "ctx.cli" struct and context-specific ones from "ctx.ssl" while both
are in a union. The i0 integer used to store the current ca_index overlaps
with new_crlfile_entry which is thus harmless for now but is at the mercy
of any reordering or addition of these fields. Let's add dedicated fields
into the ssl structure for this.

Comments were added on top of the affected functions to indicate what they
use.

This needs to be backported to 2.5.
2022-05-06 18:13:35 +02:00
Willy Tarreau
821c3b0b5e BUG/MINOR: ssl/cli: fix "show ssl ca-file/crl-file" not to mix cli+ssl contexts
The "show ca-file" and "show crl-file" commands mix some generic pointers
from the "ctx.cli" struct and context-specific ones from "ctx.ssl" while
both are in a union. It's fortunate that the p0 pointer in use is located
immediately before the first one used (it overlaps with next_ckchi_link,
and old_cafile_entry is safe). But should these fields be reordered or
slightly updated this will break.

Comments were added on top of the affected functions to indicate what they
use.

This needs to be backported to 2.5.
2022-05-06 18:13:35 +02:00
William Lallemand
7867f63313 MEDIUM: resolvers: create a "default" resolvers section at startup
Try to create a "default" resolvers section at startup, but does not
display any error nor warning. This section is initialized using the
/etc/resolv.conf of the system.

This is opportunistic and with no guarantee that it will work (but it should
on most systems).

This is useful for the httpclient as it allows to use the DNS resolver
without any configuration in most of the cases.

The function is called from the httpclient_pre_check() function to
ensure than we tried to create the section before trying to initiate the
httpclient. But it is also called from the resolvers.c to ensure the
section is created when the httpclient init was disabled.
2022-05-06 17:02:15 +02:00
Christopher Faulet
fa24379aeb MINOR: conn-stream: Add mask from flags set by endpoint or app layer
In flags set on the endpoints, some are set by endpoints itself and some are
set by the app layer. To help flags manipulations, 2 masks have been
added. The first one, CS_EP_ENDP_MASK, for all flags that an endpoint may
set. The other one, CS_EP_APP_MASK, for flags that the app layer may set.

This patch is mandatory for the next commit.
2022-05-05 09:23:35 +02:00
Frédéric Lécaille
664741e1c5 MINOR: quic: Make the quic_conn be aware of the number of streams
This is required when the retransmitted frame types when the mux is released.
We add a counter for the number of streams which were opened or closed by the mux.
After the mux has been released, we can rely on this counter to know if the STREAM
frames are retransmitted ones or not.
2022-05-03 10:13:40 +02:00
Frédéric Lécaille
b074966634 CLEANUP: mux: Useless xprt_quic-t.h inclusion
This inclusion is useless for mux_quic-t.h. Furthermore this fixes compilation
issues when we need to refer to mux_quic-t.h from xprt_quic-t.h.
2022-05-03 10:13:40 +02:00
Willy Tarreau
0367b4cf63 MINOR: session: get rid of the now unused SESS_FL_ADDR_*_SET flags
That's similar to what was done for conn_streams and connections. The
flags were only set exactly when the relevant pointers were allocated,
so better test the pointer than the flag and stop setting the flag.
2022-05-02 17:51:51 +02:00
Willy Tarreau
030b3e6bcc MINOR: connection: get rid of the CO_FL_ADDR_*_SET flags
Just like for the conn_stream, now that these addresses are dynamically
allocated, there is no single case where the pointer is set without the
corresponding flag, and the flag is used as a permission to dereference
the pointer. Let's just replace the test of the flag with a test of the
pointer and remove all flag assignment. This makes the code clearer
(especially in "if" conditions) and saves the need for future code to
think about properly setting the flag after setting the pointer.
2022-05-02 17:47:46 +02:00
Willy Tarreau
04e9acaef4 MINOR: conn_stream: remove the now unused CS_FL_ADDR_*_SET flags
These flags indicate that the ->src or ->dst field in the conn_stream
is not null, which is something the caller already sees (and even tests
from the two sets of functions that set them). They maintain some burden
because an agent trying to set a source or destination has to manually
set the flags in addition to setting the pointer, so they provide no
value anymore, let's drop them.
2022-05-02 17:43:51 +02:00
Willy Tarreau
03bd3952a6 MEDIUM: stream: remove the confusing SF_ADDR_SET flag
This flag is no longer needed now that it must always match the presence
of a destination address on the backend conn_stream. Worse, before previous
patch, if it were to be accidently removed while the address is present, it
could result in a leak of that address since alloc_dst_address() would first
be called to flush it.

Its usage has a long history where addresses were stored in an area shared
with the connection, but as this is no longer the case, there's no reason
for putting this burden onto application-level code that should not focus
on setting obscure flags.

The only place where that made a small difference is in the dequeuing code
in case of queue redistribution, because previously the code would first
clear the flag, and only later when trying to deal with the queue, would
release the address. It's not even certain whether there would exist a
code path going to connect_server() without calling pendconn_dequeue()
first (e.g. retries on queue timeout maybe?).

Now the pendconn_dequeue() code will rely on SF_ASSIGNED to decide to
clear and release the address, since that flag is always set while in
a server's queue, and its clearance implies that we don't want to keep
the address. At least it remains consistent and there's no more risk of
leaking it.
2022-05-02 16:56:01 +02:00
Amaury Denoyelle
f1fc0b393b MINOR: mux-quic: support full request channel buffer
If the request channel buffer is full, H3 demuxing must be interrupted
on the stream until some read is performed. This condition is reported
if the HTX stream buffer qcs.rx.app_buf is full.

In this case, qcs instance is marked with a new flag QC_SF_DEM_FULL.
This flag cause the H3 demuxing to be interrupted. It is cleared when
the HTX buffer is read by the conn-stream layer through rcv_buf
operation.

When the flag is cleared, the MUX tasklet is woken up. However, as MUX
iocb does not treat Rx for the moment, this is useless. It must be fix
to prevent possible freeze on POST transfers.

In practice, for the moment the HTX buffer is never full as the current
Rx code is limited by the quic-conn receive buffer size and the
incomplete flow-control implementation. So for now this patch is not
testable under the current conditions.
2022-05-02 11:19:02 +02:00
Frédéric Lécaille
c40e19d711 BUG/MINOR: quic: Missing time threshold multiplifier for loss delay computation
It seems this multiplier ended up in oblivion. Indeed a multiplier must be
applied to the loss delay expressed as an RTT multiplier: 9/8.

So, some packets were detected as lost too soon, leading to be retransmitted too
early!
2022-04-29 16:46:56 +02:00
Frédéric Lécaille
1601395063 MINOR: quic: moving code for QUIC loss detection
qc_qc_packet_loss_lookup() is definitively a QUIC loss detection function.
2022-04-29 16:46:56 +02:00
Frédéric Lécaille
77cb38d22d BUG/MEDIUM: quic: Possible crash on STREAM frame loss
A crash is possible under such circumtances:
    - The congestion window is drastically reduced to its miniaml value
    when a quic listener is experiencing extreme packet loss ;
    - we enqueue several STREAM frames to be resent and some of them could not be
    transmitted ;
    - some of the still in flight are acknowledged and trigger the
    stream memory releasing ;
    - when we come back to send the remaing STREAM frames, haproxy
    crashes when it tries to build them.

To fix this issue, we mark the STREAM frame as lost when detected as lost.
Then a lookup if performed for the stream the STREAM frames are attached
to before building them. They are released if the stream is no more available
or the data range of the frame is consumed.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
da342556c3 MEDIUM: quic: Mark copies of acknowledged frames as acknowledged
We call qc_release_frm() to do so from this function everywhere a frame
is released.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
3e3a621447 MINOR: quic: old data distinction for qc_send_app_pkt()
Modify qc_send_app_pkt() to distinguish the case where it sends new data
against the case where it sends old data during probing retransmissions.
We add <old_data> boolean parameter to this function to do so. The mux
never directly send old data when probing retransmissions are needed by
the connection.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
a9568411e4 MEDIUM: quic: New functions for probing rework
We want to be able to resend frames from list of frames during handshakes to
resend datagrams with the same frames as during the first transmissions.
This leads to decrease drasctically the chances of frame fragmentation due to
variable lengths of the packet fields. Furthermore the frames were not duplicated
when retransmitted from a packet to another one. This must be the case only during
packet loss dectection.

qc_dup_pkt_frms() is there to duplicate the frames from an input list to an output
list. A distinction is made between STREAM frames and the other ones because we
can rely on the "acknowledged offset" the aim of which is to track the number
of bytes which were acknowledged from TX STREAM frames.

qc_release_frm() in addition to release the frame passed as parameter, also mark
the duplicate STREAM frames as acknowledeged.

qc_send_hdshk_pkts() is the qc_send_app_pkts() counterpart to send datagrams from
at most two list of frames to be able to coalesced packets from two different
packet number spaces

qc_dgrams_retransmit() is there to probe the peer with datagrams depending on the
need of the packet number spaces which must be flag with QUIC_FL_PKTNS_PROBE_NEEDED
by the PTO timer task (qc_process_timer()).
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
3ef729a643 MINOR: quic: process_timer() rework
Add QUIC_FL_CONN_RETRANS_NEEDED connection flag definition to mark a quic_conn
struct as needing a retranmission.
Add QUIC_FL_PKTNS_PROBE_NEEDED to mark a packet number space as needing a
datagram probing.
Set these flags from process_timer() to trigger datagram probings.
Do not initiate anymore datagrams probing from any quic encryption level.
This will be done from the I/O handlers (quic_conn_io_cb() during handshakes and
quic_conn_app_io_cb() after handshakes).
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
b44cbc68a6 MINOR: quic: Do not retransmit frames from coalesced packets
Add QUIC_FL_TX_PACKET_COALESCED flag to mark a TX packet as coalesced with others
to build a datagram.
Ensure we do not directly retransmit frames from such coalesced packets. They must
be retransmitted from their packet number spaces to avoid duplications.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
b917191817 MINOR: quic: Prepare quic_frame struct duplication
We want to track the frames which have been duplicated during retransmissions so
that to avoid uselessly retransmitting frames which would already have been
acknowledged. ->origin new member is there to store the frame from which a copy
was done, ->reflist is a list to store the frames which are copies.
Also ensure all the frames are zeroed and that their ->reflist list member is
initialized.
Add QUIC_FL_TX_FRAME_ACKED flag definition to mark a TX frame as acknowledged.
2022-04-28 16:22:40 +02:00
Amaury Denoyelle
47447af1ef MINOR: mux-quic: add a app-layer context in qcs
Define 2 new callback for qcc_app_ops : attach and detach. They are
called when a qcs instance is respectively allocated and freed. If
implemented, they can allocate a custom context stored in the new
abstract field ctx of qcs.

For now, h3 and hq-interop does not use these new callbacks. They will
be soon implemented by the h3 layer to allocate a context used for
stateful demuxing.

This change is required to support the demuxing of H3 frames bigger than
a buffer.
2022-04-28 15:44:19 +02:00
Amaury Denoyelle
3df8ca0a4d MINOR: mux-quic: partially copy Rx frame if almost full buf
Improve the reception for STREAM frames. In qcc_recv(), if the frame is
bigger than the remaining space in rx buffer, do not reject it wholly.
Instead, copy as much data as possible. The rest of the data is
buffered.

This is necessary to handle H3 frames bigger than a buffer. The H3 code
does not demux until the frame is complete or the buffer is full.
Without this, the transfer on payload larger than the Rx buffer can
rapidly freeze.
2022-04-28 15:42:21 +02:00
Amaury Denoyelle
44d0912f7b MINOR: mux-quic: count local flow-control stream limit on reception
Add new qcs fields to count the sum of bytes received for each stream.
This is necessary to enforce flow-control for reception on the peer.

For the moment, the implementation is partial. No MAX_STREAM_DATA or
FLOW_CONTROL_ERROR are emitted. BUG_ON statements are here as a
remainder.

This means that for the moment we do not support POST payloads greater
that the initial max-stream-data announced (256k currently).

At least, we now ensure that we never buffer a frame which overflows the
flow-control limit : this ensures that the memory consumption per stream
should stay under control.
2022-04-28 14:56:14 +02:00
Amaury Denoyelle
408d226aa1 MINOR: mux-quic: remove unused bogus qcc_get_stream()
qcc_get_stream() was used when qcs and qc_stream_desc shared the same
node-tree. This is not the case anymore since
  e4301da5ed
  MINOR: quic-stream: use distinct tree nodes for quic stream and qcs

Now this function is broken as the qcc tree only contains qcs.
Thankfully it is unused so it can be removed without impact.
2022-04-28 14:47:53 +02:00
Willy Tarreau
226866e1bb CLEANUP: deinit: release the config postparsers
These ones were not released either, it just requires to export the list
("postparsers") and it makes valgrind happy.
2022-04-27 18:07:24 +02:00
Thomas Prckl
10243938db MINOR: ssl: add a new global option "tune.ssl.hard-maxrecord"
Low footprint client machines may not have enough memory to download a
complete 16KB TLS record at once. With the new option the maximum
record size can be defined on the server side.

Note: Before limiting the the record size on the server side, a client should
consider using the TLS Maximum Fragment Length Negotiation Extension defined
in RFC6066.

This patch fixes GitHub issue #1679.
2022-04-27 16:53:43 +02:00
Remi Tricot-Le Breton
4d7fdc65d4 MINOR: connection: Add way to disable active connection closing during soft-stop
If the "close-spread-time" option is set to "infinite", active
connection closing during a soft-stop can be disabled. The 'connection:
close' header or the GOAWAY frame will not be added anymore to the
server's response and active connections will only be closed once the
clients disconnect. Idle connections will not be closed all at once when
the soft-stop starts anymore, and each idle connection will follow its
own timeout based on the multiple timeouts set in the configuration (as
is the case during regular execution).

This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on 'MEDIUM: global:
Add a "close-spread-time" option to spread soft-stop on time window'.
2022-04-26 19:56:47 +02:00
Willy Tarreau
65d9f83794 BUILD: compiler: properly distinguish weak and global symbols
While weak symbols were finally fixed with commit fb1b6f5bc ("BUILD:
compiler: use a more portable set of asm(".weak") statements"), it
was an error to think that initcall symbols were also weak. They must
not be and they're only global. The reason is that any externally
linked code loaded as a .so would drop its weak symbols when being
loaded, hence its initcalls that may contain various function
registration calls.

The ambiguity came from the fact that we initially reused the initcall's
HA_GLOBL macro for OSX then generalized it, then turned it to a choice
between .globl and .weak based on the OS, while in fact we needed a
macro to define weak symbols.

Let's rename the macro to HA_WEAK() to make it clear it's only for weak
symbols, and redefine HA_GLOBL() that initcall needs.

This will need to be backported wherever the commit above is backported
(at least 2.5 for now).
2022-04-26 19:49:33 +02:00
Willy Tarreau
a80e4a3546 MINOR: fd: add functions to set O_NONBLOCK and FD_CLOEXEC
Instead of seeing each location manipulate the fcntl() themselves and
often forget to check previous flags, let's centralize the functions to
do this. It also allows to drop fcntl.h from most call places and will
ease the adoption of different OS-specific mechanisms if needed. Note
that the fd_set_nonblock() function purposely doesn't check the previous
flags as it's meant to be used on new FDs only.
2022-04-26 10:59:48 +02:00
Willy Tarreau
197715ae21 CLEANUP: compression: move the default setting of maxzlibmem to defaults
__comp_fetch_init() only presets the maxzlibmem, and only when both
USE_ZLIB and DEFAULT_MAXZLIBMEM are set. The intent is to preset a
default value to protect the system against excessive memory usage
when no setting is set by the user.

Nowadays the entry in the global struct is always there so there's no
point anymore in passing via a constructor to possibly set this value.
Let's go the cleaner way by always presetting DEFAULT_MAXZLIBMEM to 0
in defaults.h unless these conditions are met, and always assigning it
instead of pre-setting the entry to zero. This is more straightforward
and removes some ifdefs and the last constructor. In addition, now the
setting has a chance of being found.
2022-04-25 19:42:43 +02:00
Willy Tarreau
2df1fbf816 MINOR: init: add global setting "fd-hard-limit" to bound system limits
On some systems, the hard limit for ulimit -n may be huge, in the order
of 1 billion, and using this to automatically compute maxconn doesn't
work as it requires way too much memory. Users tend to hard-code maxconn
but that's not convenient to manage deployments on heterogenous systems,
nor when porting configs to developers' machines. The ulimit-n parameter
doesn't work either because it forces the limit. What most users seem to
want (and it makes sense) is to respect the system imposed limits up to
a certain value and cap this value. This is exactly what fd-hard-limit
does.

This addresses github issue #1622.
2022-04-25 18:04:49 +02:00
Willy Tarreau
7c9a0fe2a6 MEDIUM: backend: add new "balance hash <expr>" algorithm
Almost all of our hash-based LB algorithms are implemented as special
cases of something that can now be achieved using sample expressions,
and some of them have adopted some options to adapt their behavior in
ways that could also be achieved using converters.

There are users who want to hash other parameters that are combined
into variables, and who set headers from these values and use
"balance hdr(name)" for this.

Instead of constantly implementing specific options and having users
hack around when they want a real hash, let's implement a native hash
mode that applies to a standard sample expression. This way, any
fetchable element (including variables) may be used to construct the
hash, even modified by any converter if desired.
2022-04-25 16:09:26 +02:00
Willy Tarreau
a4e39890f3 MINOR: task: add a new task_instant_wakeup() function
This function's purpose is to wake up either a local or remote task,
bypassing the tree-based run queue. It is meant for fast wakeups that
are supposed to be equivalent to those used with tasklets, i.e. a task
had to pause some processing and can complete (typically a resource
becomes available again). In all cases, it's important to keep in mind
that the task must have gone through the regular scheduling path before
being blocked, otherwise the task priorities would be ignored.

The reason for this is that some wakeups are massively inter-thread
(e.g. server queues), that these inter-thread wakeups cause a huge
contention on the shared runqueue lock. A user reported 47% CPU spent
in process_runnable_tasks with only 32 threads and 80k requests in
queues. With this mechanism, purely one-to-one wakeups can avoid
taking the lock thanks to the mt_list used for the shared tasklet
queue.

Right now the shared tasklet queue moves everything to the TL_URGENT
queue. It's not dramatic but it would seem better to have a new shared
list dedicated to tasks, and that would deliver into TL_NORMAL, for an
even better fairness. This could be improved in the future.
2022-04-22 19:11:59 +02:00
William Lallemand
b53eb8790e MINOR: init: add the pre-check callback
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create things like it would have been done during the configuration
parsing and where the initialization should continue in the configuration
check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
2022-04-22 15:45:47 +02:00
Christopher Faulet
eb50c01fef MINOR: conn-stream: Make cs_detach_* private and use cs_destroy() from outside
A conn-stream is never detached from an endpoint or an application alone,
except on a reset. Thus, to avoid any error, these functions are now
private. And cs_destroy() function is added to destroy a conn-stream. This
function is called when a stream is released, on the front and back
conn-streams, and when a health-check is finished.
2022-04-22 14:32:30 +02:00
Christopher Faulet
ca6c9bba82 CLEANUP: conn-stream: Rename cs_applet_release()
This function does not release the applet but only call the applet release
callback. It is equivalent to cs_conn_shut() but for applets. Thus the
function is renamed cs_applet_shut().
2022-04-22 14:14:27 +02:00
Christopher Faulet
ff022a2b8c CLEANUP: conn-stream: Rename cs_conn_close() and cs_conn_drain_and_close()
These functions don't close the connection but only perform shutdown for
reads and writes at the mux level. It is a bit ambiguous. Thus,
cs_conn_close() is renamed cs_conn_shut() and cs_conn_drain_and_close() is
renamed cs_conn_drain_and_shut(). These both functions rely on
cs_conn_shutw() and cs_conn_shutr().
2022-04-22 14:14:27 +02:00
Remi Tricot-Le Breton
f87c67e5e4 MINOR: ssl: Add 'show ssl providers' cli command and providers list in -vv option
Starting from OpenSSLv3, providers are at the core of cryptography
functions. Depending on the provider used, the way the SSL
functionalities work could change. This new 'show ssl providers' CLI
command allows to show what providers were loaded by the SSL library.
This is required because the provider configuration is exclusively done
in the OpenSSL configuration file (/usr/local/ssl/openssl.cnf for
instance).
A new line is also added to the 'haproxy -vv' output containing the same
information.
2022-04-21 14:54:45 +02:00
Amaury Denoyelle
97e84c6c69 MINOR: cfg-quic: define tune.quic.conn-buf-limit
Add a new global configuration option to set the limit of buffers per
QUIC connection. By default, this value is set to 30.
2022-04-21 12:04:04 +02:00
Amaury Denoyelle
1b2dba531d MINOR: mux-quic: implement immediate send retry
Complete qc_send function. After having processed each qcs emission, it
will now retry send on qcs where transfer can continue. This is useful
when qc_stream_desc buffer is full and there is still data present in
qcs buf.

To implement this, each eligible qcs is inserted in a new list
<qcc.send_retry_list>. This is done on send notification from the
transport layer through qcc_streams_sent_done(). Retry emission until
send_retry_list is empty or the transport layer cannot proceed more
data.

Several send operations are now called on two different places. Thus a
new _qc_send_qcs() function is defined to factorize the code.

This change should maximize the throughput during QUIC transfers.
2022-04-21 12:04:04 +02:00
Amaury Denoyelle
d2f80a2e63 MINOR: quic: limit total stream buffers per connection
MUX streams can now allocate multiple buffers for sending. quic-conn is
responsible to limit the total count of allowed allocated buffers. A
counter is stored in the new field <stream_buf_count>.

For the moment, the value is hardcoded to 30.

On stream buffer allocation failure, the qcc MUX is flagged with
QC_CF_CONN_FULL. The MUX is then woken up as soon as a buffer is freed,
most notably on ACK reception.
2022-04-21 12:04:04 +02:00
Amaury Denoyelle
1b81dda3e0 MINOR: quic-stream: refactor ack management
Acknowledge of STREAM has been complexified with the introduction of
stream multi buffers. Two functions are executing roughly the same set
of instructions in xprt_quic.c.

To simplify this, move the code complexity in a new function
qc_stream_desc_ack(). It will handle offset calculation, removal of
data, freeing oldest buffer and freeing stream instance if required.
The qc_stream_desc API is cleaner as qc_stream_desc_free_buf() ambiguous
function has been removed.
2022-04-21 12:04:04 +02:00
Amaury Denoyelle
a456920491 MEDIUM: quic: implement multi-buffered Tx streams
Complete the qc_stream_desc type to support multiple buffers on
emission. The main objective is to increase the transfer throughput.
The MUX is now able to transfer more data without having to wait ACKs.

To implement this feature, a new type qc_stream_buf is declared. it
encapsulates a buffer with a list element. New functions are defined to
retrieve the current buffer, release it or allocate a new one. Each
buffer is kept in the qc_stream_desc list until all of its data is
acknowledged.

On the MUX side, a qcs uses the current stream buffer to transfer data.
Once the buffer is full, it is released and a new one will be allocated
on a future qc_send() invocation.
2022-04-21 12:03:20 +02:00