Commit Graph

6766 Commits

Author SHA1 Message Date
Christopher Faulet
2f7c82bfdf BUG/MINOR: haproxy: Fix option to disable the fast-forward
The option was renamed to only permit to disable the fast-forward. First
there is no reason to enable it because it is the default behavior. Then it
introduced a bug because there is no way to be sure the command line has
precedence over the configuration this way. So, the option is now named
"tune.disable-fast-forward" and does not support any argument. And of
course, the commande line option "-dF" has now precedence over the
configuration.

No backport needed.
2023-02-21 11:44:55 +01:00
Amaury Denoyelle
77ed63106d MEDIUM: quic: trigger fast connection closing on process stopping
With previous commit, quic-conn are now handled as jobs to prevent the
termination of haproxy process. This ensures that QUIC connections are
closed when all data are acknowledged by the client and there is no more
active streams.

The quic-conn layer emits a CONNECTION_CLOSE once the MUX has been
released and all streams are acknowledged. Then, the timer is scheduled
to definitely free the connection after the idle timeout period. This
allows to treat late-arriving packets.

Adjust this procedure to deactivate this timer when process stopping is
in progress. In this case, quic-conn timer is set to expire immediately
to free the quic-conn instance as soon as possible. This allows to
quickly close haproxy process.

This should be backported up to 2.7.
2023-02-20 11:20:18 +01:00
Amaury Denoyelle
eb7d320d25 MINOR: mux-quic: implement client-fin timeout
Implement client-fin timeout for MUX quic. This timeout is used once an
applicative layer shutdown has been called. In HTTP/3, this corresponds
to the emission of a GOAWAY.

This should be backported up to 2.7.
2023-02-20 11:20:18 +01:00
Amaury Denoyelle
b30247b16c MINOR: mux-quic: define qc_shutdown()
Factorize shutdown operation in a dedicated function qc_shutdown(). This
will allow to call it from multiple places. A new flag QC_CF_APP_SHUT is
also defined to ensure it will only be executed once even if called
multiple times per connection.

This commit will be useful to properly support haproxy soft stop.
This should be backported up to 2.7.
2023-02-20 11:18:58 +01:00
Frédéric Lécaille
2f531116ed MINOR: quic: Add traces to qc_kill_conn()
Very minor modification to help in debugging issues.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
a2c62c3141 MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt
The send*() syscall which are responsible of such ICMP packets reception
fails with ECONNREFUSED as errno.

  man(7) udp
  ECONNREFUSED
      No receiver was associated with the destination address.
      This might be caused by a previous packet sent over the socket.

We must kill asap the underlying connection.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
75c8ad5490 MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
This code was there because the timer task was not running on the same thread
as the one which parse the QUIC packets. Now that this is no more the case,
we can wake up this task directly.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
1dbeb35f80 MINOR: quic: Add new traces about by connection RX buffer handling
Move quic_rx_pkts_del() out of quic_conn.h to make it benefit from the TRACE API.
Add traces which already already helped in diagnosing an issue encountered with
ngtcp2 which sent too much 1RTT packets before the handshake completion. This
has been fixed here after having discussed with Tasuhiro on QUIC dev slack:

https://github.com/ngtcp2/ngtcp2/pull/663

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Amaury Denoyelle
14037bf26f MINOR: h3: add traces on decode_qcs callback
Add traces inside h3_decode_qcs(). Every error path has now its
dedicated trace which should simplify debugging. Each early returns has
been converted to a goto invocation.

To complete the demux tracing, demux frame type and length are now
printed using the h3s instance whenever its possible on trace
invocation. A new internal value H3_FT_UNINIT is used as a frame type to
mark demuxing as inactive.

This should be backported up to 2.7.
2023-02-17 17:31:52 +01:00
Amaury Denoyelle
381d8137e3 MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
Properly handle a STREAM frame with no data but the FIN bit set at the
application layer. H3 and hq-interop decode_qcs() callback have been
adjusted to not return early in this case.

If the FIN bit is accepted, a HTX EOM must be inserted for the upper
stream layer. If the FIN is rejected because the stream cannot be
closed, a proper CONNECTION_CLOSE error will be triggered.

A new utility function qcs_http_handle_standalone_fin() has been
implemented in the qmux_http module. This allows to simply add the HTX
EOM on qcs HTX buffer. If the HTX buffer is empty, a EOT is first added
to ensure it will be transmitted above.

This commit will allow to properly handle FIN notify through an empty
STREAM frame. However, it is not sufficient as currently qcc_recv() skip
the decode_qcs() invocation when the offset is already received. This
will be fixed in the next commit.

This should be backported up to 2.6 along with the next patch.
2023-02-17 16:25:00 +01:00
Willy Tarreau
3e820a1056 MINOR: threads: add flags to know if a thread is started and/or running
Several times during debugging it has been difficult to find a way to
reliably indicate if a thread had been started and if it was still
running. It's really not easy because the elements we look at are not
necessarily reliable (e.g. harmless bit or idle bit might not reflect
what we think during a signal). And such notions can be subjective
anyway.

Here we define two thread flags, TH_FL_STARTED which is set as soon as
a thread enters run_thread_poll_loop() and drops the idle bit, and
another one, TH_FL_IN_LOOP, which is set when entering run_poll_loop()
and cleared when leaving it. This should help init/deinit code know
whether it's called from a non-initialized thread (i.e. tid must not
be trusted), or shared functions know if they're being called from a
running thread or from init/deinit code outside of the polling loop.
2023-02-17 16:01:34 +01:00
Christopher Faulet
d4eaa8af6b MINOR: global: Add an option to disable the data fast-forward
The new global option "tune.fast-forward" can be set to "off" to disable the
data fast-forward. It is an debug option, thus it is internally marked as
experimental. The directive "expose-experimental-directives" must be set
first to use this one. By default, the data fast-forward is enable.

It could be usefull to force to wake the stream up when data are
received. To be sure, evreything works fine in this case. The data
fast-forward is an optim. It must work without it. But some code may rely on
the fact the stream will not be woken up. With this option, it is possible
to spot some hidden bugs.
2023-02-17 10:17:02 +01:00
William Lallemand
44979ad680 BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
This patch fixes an issue in the "-dK" keywords dumper, which was
mistakenly displaying the "crt-list" keywords for "bind ssl" keywords.

The patch fixes the issue by dumping the "crt-list" keywords in its own
section, and dumping the "bind" keywords which are in the "SSL" scope
with a "bind ssl" prefix.

This commit depends on the previous "MINOR: ssl: rename confusing
ssl_bind_kws" commit.

Must be backported in 2.6.

Diff of the `./haproxy -dKall -q -c -f /dev/null` output before and
after the patch in 2.8-dev4:

     | @@ -190,30 +190,9 @@ listen
     |  	use-fcgi-app
     |  	bind <addr> accept-netscaler-cip +1
     |  	bind <addr> accept-proxy
     | -	bind <addr> allow-0rtt
     | -	bind <addr> alpn +1
     |  	bind <addr> backlog +1
     | -	bind <addr> ca-file +1
     | -	bind <addr> ca-ignore-err +1
     | -	bind <addr> ca-sign-file +1
     | -	bind <addr> ca-sign-pass +1
     | -	bind <addr> ca-verify-file +1
     | -	bind <addr> ciphers +1
     | -	bind <addr> ciphersuites +1
     | -	bind <addr> crl-file +1
     | -	bind <addr> crt +1
     | -	bind <addr> crt-ignore-err +1
     | -	bind <addr> crt-list +1
     | -	bind <addr> curves +1
     |  	bind <addr> defer-accept
     | -	bind <addr> ecdhe +1
     |  	bind <addr> expose-fd +1
     | -	bind <addr> force-sslv3
     | -	bind <addr> force-tlsv10
     | -	bind <addr> force-tlsv11
     | -	bind <addr> force-tlsv12
     | -	bind <addr> force-tlsv13
     | -	bind <addr> generate-certificates
     |  	bind <addr> gid +1
     |  	bind <addr> group +1
     |  	bind <addr> id +1
     | @@ -225,48 +204,52 @@ listen
     |  	bind <addr> name +1
     |  	bind <addr> namespace +1
     |  	bind <addr> nice +1
     | -	bind <addr> no-ca-names
     | -	bind <addr> no-sslv3
     | -	bind <addr> no-tls-tickets
     | -	bind <addr> no-tlsv10
     | -	bind <addr> no-tlsv11
     | -	bind <addr> no-tlsv12
     | -	bind <addr> no-tlsv13
     | -	bind <addr> npn +1
     | -	bind <addr> prefer-client-ciphers
     |  	bind <addr> process +1
     |  	bind <addr> proto +1
     |  	bind <addr> severity-output +1
     |  	bind <addr> shards +1
     | -	bind <addr> ssl
     | -	bind <addr> ssl-max-ver +1
     | -	bind <addr> ssl-min-ver +1
     | -	bind <addr> strict-sni
     |  	bind <addr> tcp-ut +1
     |  	bind <addr> tfo
     |  	bind <addr> thread +1
     | -	bind <addr> tls-ticket-keys +1
     |  	bind <addr> transparent
     |  	bind <addr> uid +1
     |  	bind <addr> user +1
     |  	bind <addr> v4v6
     |  	bind <addr> v6only
     | -	bind <addr> verify +1
     |  	bind <addr> ssl allow-0rtt
     |  	bind <addr> ssl alpn +1
     |  	bind <addr> ssl ca-file +1
     | +	bind <addr> ssl ca-ignore-err +1
     | +	bind <addr> ssl ca-sign-file +1
     | +	bind <addr> ssl ca-sign-pass +1
     |  	bind <addr> ssl ca-verify-file +1
     |  	bind <addr> ssl ciphers +1
     |  	bind <addr> ssl ciphersuites +1
     |  	bind <addr> ssl crl-file +1
     | +	bind <addr> ssl crt +1
     | +	bind <addr> ssl crt-ignore-err +1
     | +	bind <addr> ssl crt-list +1
     |  	bind <addr> ssl curves +1
     |  	bind <addr> ssl ecdhe +1
     | +	bind <addr> ssl force-sslv3
     | +	bind <addr> ssl force-tlsv10
     | +	bind <addr> ssl force-tlsv11
     | +	bind <addr> ssl force-tlsv12
     | +	bind <addr> ssl force-tlsv13
     | +	bind <addr> ssl generate-certificates
     |  	bind <addr> ssl no-ca-names
     | +	bind <addr> ssl no-sslv3
     | +	bind <addr> ssl no-tls-tickets
     | +	bind <addr> ssl no-tlsv10
     | +	bind <addr> ssl no-tlsv11
     | +	bind <addr> ssl no-tlsv12
     | +	bind <addr> ssl no-tlsv13
     |  	bind <addr> ssl npn +1
     | -	bind <addr> ssl ocsp-update +1
     | +	bind <addr> ssl prefer-client-ciphers
     |  	bind <addr> ssl ssl-max-ver +1
     |  	bind <addr> ssl ssl-min-ver +1
     | +	bind <addr> ssl strict-sni
     | +	bind <addr> ssl tls-ticket-keys +1
     |  	bind <addr> ssl verify +1
     |  	server <name> <addr> addr +1
     |  	server <name> <addr> agent-addr +1
     | @@ -591,6 +574,23 @@ listen
     |  	http-after-response unset-var*
     |  userlist
     |  peers
     | +crt-list
     | +	allow-0rtt
     | +	alpn +1
     | +	ca-file +1
     | +	ca-verify-file +1
     | +	ciphers +1
     | +	ciphersuites +1
     | +	crl-file +1
     | +	curves +1
     | +	ecdhe +1
     | +	no-ca-names
     | +	npn +1
     | +	ocsp-update +1
     | +	ssl-max-ver +1
     | +	ssl-min-ver +1
     | +	verify +1
     |  # List of registered CLI keywords:
     |  @!<pid> [MASTER]
     |  @<relative pid> [MASTER]
2023-02-16 16:14:37 +01:00
William Lallemand
af67806651 MINOR: ssl: rename confusing ssl_bind_kws
The ssl_bind_kw structure is exclusively used for crt-list keyword, it
must be named otherwise to remove the confusion.

The structure was renamed ssl_crtlist_kws.
2023-02-16 16:03:45 +01:00
Amaury Denoyelle
15c74702d5 MINOR: quic: implement a basic "show quic" CLI handler
Implement a basic "show quic" CLI handler. This command will be useful
to display various information on all the active QUIC frontend
connections.

This work is heavily inspired by "show sess". Most notably, a global
list of quic_conn has been introduced to be able to loop over them. This
list is stored per thread in ha_thread_ctx.

Also add three CLI handlers for "show quic" in order to allocate and
free the command context. The dump handler runs on thread isolation.
Each quic_conn is referenced using a back-ref to handle deletion during
handler yielding.

For the moment, only a list of raw quic_conn pointers is displayed. The
handler will be completed over time with more information as needed.

This should be backported up to 2.7.
2023-02-09 18:11:00 +01:00
Aurelien DARRAGON
3e7a0bb70b MINOR: cfgparse/server: move (min/max)conn postparsing logic into dedicated function
In check_config_validity() function, we performed some consistency checks to
adjust minconn/maxconn attributes for each declared server.

We move this logic into a dedicated function named srv_minmax_conn_apply()
to be able to perform those checks later in the process life when needed
(ie: dynamic servers)
2023-02-08 14:48:21 +01:00
William Lallemand
a14686d096 MINOR: ssl/ocsp: add a function to check the OCSP update configuration
Deduplicate the code which checks the OCSP update in the ckch_store and
in the crtlist_entry.

Also, jump immediatly to error handling when the ERR_FATAL is catched.
2023-02-08 11:40:31 +01:00
William Lallemand
b4b9caa65f BUILD: ssl/ocsp: ssl_ocsp-t.h depends on ssl_sock-t.h
ssl_ocsp-t.h uses SSL_SOCK_NUM_KEYTYPES which is defined in
ssl_sock-t.h.

No backport needed.
2023-02-08 11:31:03 +01:00
Willy Tarreau
28360dc53f MEDIUM: clock: force internal time to wrap early after boot
GH issue #2034 clearly indicates yet another case of time roll-over
that went badly. Issues that happen only once every 50 days are hard
to detect and debug, and are usually reported more or less synchronized
from multiple sources. This patch finally does what had long been planned
but never done yet, which is to force the time to wrap early after boot
so that any such remaining issue can be spotted quicker. The margin delay
here is 20s (it may be changed by setting BOOT_TIME_WRAP_SEC to another
value). This value seems sufficient to permit failed health checks to
succeed and traffic to come in and possibly start to update some time
stamps (accept dates in logs, freq counters, stick-tables expiration
dates etc).

It could theoretically be helpful to have this in 2.7, but as can be
seen with the two patches below, we've already had incorrect use cases
of the internal monotonic time when the wall-clock one was needed, so
we could expect to detect other ones in the future. Note that this will
*not* induce bugs, it will only make them happen much faster (i.e. no
need to wait for 50 days before seeing them). If it were to eventually
be backported, these two previous patches must also be backported:

    BUG/MINOR: clock: use distinct wall-clock and monotonic start dates
    BUG/MEDIUM: cache: use the correct time reference when comparing dates
2023-02-08 11:10:33 +01:00
Willy Tarreau
6093ba47c0 BUG/MINOR: clock: do not mix wall-clock and monotonic time in uptime calculation
We've had a start date even before the internal monotonic clock existed,
but once the monotonic clock was added, the start date was not updated
to distinguish the wall clock time units and the internal monotonic time
units. The distinction is important because both clocks do not necessarily
progress at the same speed. The very rare occurrences of the wall-clock
date are essentially for human consumption and communication with third
parties (e.g. report the start date in "show info" for monitoring
purposes). However currently this one is also used to measure the distance
to "now" as being the process' uptime. This is actually not correct. It
only works because for now the two dates are initialized at the exact
same instant at boot but could still be wrong if the system's date shows
a big jump backwards during startup for example. In addition the current
situation prevents us from enforcing an abritrary offset at boot to reveal
some heisenbugs.

This patch adds a new "start_time" at boot that is set from "now" and is
used in uptime calculations. "start_date" instead is now set from "date"
and will always reflect the system date for human consumption (e.g. in
"show info"). This way we're now sure that any drift of the internal
clock relative to the system date will not impact the reported uptime.

This could possibly be backported though it's unlikely that anyone has
ever noticed the problem.
2023-02-08 11:06:55 +01:00
Frédéric Lécaille
b7a406ac34 MINOR: quic: Update version_information transport parameter to draft-14
This is necessary to make our stack negotiate the QUIC versions with clients.
(See https://author-tools.ietf.org/iddiff?url1=draft-ietf-quic-version-negotiation-13&url2=draft-ietf-quic-version-negotiation-14&difftype=--html)

Must be backported to 2.7.
2023-02-06 11:54:07 +01:00
Aurelien DARRAGON
e5958d0292 BUG/MEDIUM: stats: fix resolvers dump
In ("BUG/MEDIUM: stats: Rely on a local trash buffer to dump the stats"),
we forgot to apply the patch in resolvers.c which provides the
stats_dump_resolvers() function that is involved when dumping with "resolvers"
domain.

As a consequence, resolvers dump was broken because stats_dump_one_line(),
which is used in stats_dump_resolv_to_buffer(), implicitely uses trash_chunk
from stats.c to prepare the dump, and stats_putchk() is then called with
global trash (currently empty) as output data.

Given that trash_dump variable is static and thus only available within stats.c
we change stats_putchk() function prototype so that the function does not take
the output buffer as an argument. Instead, stats_putchk() will implicitly use
the local trash_dump variable declared in stats.c.

It will also prevent further mixups between stats_dump_* functions and
stats_putchk().

This needs to be backported with ("BUG/MEDIUM: stats: Rely on a local trash
buffer to dump the stats")
2023-02-06 07:53:03 +01:00
Willy Tarreau
f2988e1447 CLEANUP: listener/thread: remove now unused bind_conf's bind_tgroup/bind_thread
Not needed anymore since last commit, let's get rid of it.
2023-02-03 18:00:21 +01:00
Willy Tarreau
f0de8cacc4 MEDIUM: listener/config: make the "thread" parser rely on thread_sets
Instead of reading and storing a single group and a single mask for a
"thread" directive on a bind line, we now store the complete range in
a thread set that's stored in the bind_conf. The bind_parse_thread()
function now just calls parse_thread_set() to complete the current set,
which starts empty, and thread_resolve_group_mask() was updated to
support retrieving thread group numbers or absolute thread numbers
directly from the pre-filled thread_set, and continue to feed bind_tgroup
and bind_thread. The CLI parsers which were pre-initialized to set the
bind_tgroup to 1 cannot do it anymore as it would prevent one from
restricting the thread set. Instead check_config_validity() now detects
the CLI frontend and passes the info down to thread_resolve_group_mask()
that will automatically use only the group 1's threads for these
listeners. The same is done for the peers listeners for now.

At this step it's already possible to start with all previous valid
configs as well as extended ones supporting comma-delimited thread
sets. In addition the parser already accepts large ranges spanning
multiple groups, but since the underlying listeners infrastructure
is not read, for now we're maintaining a specific check against this
at the higher level of the config validity check.

The patch is a bit large because thread resolution is performed in
multiple steps, so we need to adjust all of them at once to preserve
functional and technical consistency.
2023-02-03 18:00:21 +01:00
Willy Tarreau
bef43dfa60 MINOR: thread: add a simple thread_set API
The purpose is to be able to store large thread sets, defined by ranges
that may cross group boundaries, as well as define lists of groups and
masks. The thread_set struct implements the storage, and the parser is
in parse_thread_set(), with a focus on "bind" lines, but not only.
2023-02-03 18:00:21 +01:00
Willy Tarreau
9e2682afed MINOR: listener: remove the now useless LI_F_QUIC_LISTENER flag
This flag is only used to tag a QUIC listener, which we now know by
its bind_conf's xprt as well. It's only used to decide whether or not
to perform an extra initialization step on the listener. Let's drop it
as well as the flags field.

With the various fields and options moved, the listener struct reduced
by 48 bytes total.
2023-02-03 18:00:20 +01:00
Willy Tarreau
b25634d23e CLEANUP: listener: remove the now unused options field
All options that made sense were moved to the bind_conf, and remaining
ones were removed. This field isn't used at all anymore. The thr_idx
field was moved there to plug the hole.
2023-02-03 18:00:20 +01:00
Willy Tarreau
4c1d3a953d MINOR: listener: get rid of LI_O_TCP_L4_RULES and LI_O_TCP_L5_RULES
LI_O_TCP_L4_RULES and LI_O_TCP_L5_RULES are only set by from the proxy
based on the presence or absence of tcp_req l4/l5 rules. It's basically
as cheap to check the list as it is to check the flag, except that there
is no need to maintain a copy. Let's get rid of them, and this may ease
addition of more dynamic stuff later.
2023-02-03 18:00:20 +01:00
Willy Tarreau
1714680cec MINOR: listener: move LI_O_UNLIMITED and LI_O_NOSTOP to bind_conf
These two flags are entirely for internal use and are even per proxy
in practice since they're used for peers and CLI to indicate (for the
first one) that the listener(s) are not subject to connection limits,
and for the second that the listener(s) should not be stopped on
soft-stop. No need to keep them in the listeners, let's move them to
the bind_conf under names BC_O_UNLIMITED and BC_O_NOSTOP.
2023-02-03 18:00:20 +01:00
Willy Tarreau
f1b4730f7d MINOR: listener: move the ACC_PROXY and ACC_CIP options to bind_conf
These are only set per bind line and used when creating a sessions,
we can move them to the bind_conf under the names BC_O_ACC_PROXY and
BC_O_ACC_CIP respectively.
2023-02-03 18:00:20 +01:00
Willy Tarreau
c492f1b17f MINOR: listener: move TCP_FO to bind_conf
It's set per bind line ("tfo") and only used in tcp_bind_listener() so
there's no point keeping the address family tests, let's just store the
flag in the bind_conf under the name BC_O_TCP_FO.
2023-02-03 18:00:20 +01:00
Willy Tarreau
d9b4d21248 MINOR: listener: move the DEF_ACCEPT option to the bind_conf
This option is set per bind line, and was only set stored when the
address family is AF_INET4 or AF_INET6. That's pointless since it's
used only in tcp_bind_listener() which is only used for such families
as well, so it can now be moved to the bind_conf under the name
BC_O_DEF_ACCEPT.
2023-02-03 18:00:20 +01:00
Willy Tarreau
9bdcf42922 MINOR: listener: move the NOQUICKACK option to the bind_conf
It solely depends on the bind line so let's move it there under the
name BC_O_NOQUICKACK.
2023-02-03 18:00:20 +01:00
Willy Tarreau
cfb7c2f515 MINOR: listener: move the NOLINGER option to the bind_conf
It's currently declared per-frontend, though it would make sense to
support it per-line but in no case per-listener. Let's move the option
to a bind_conf option BC_O_NOLINGER.
2023-02-03 18:00:20 +01:00
Willy Tarreau
7dbd4187dc MINOR: listener: move the nice field to the bind_conf
This is another bind line setting which can move to the bind_conf.
Note that it leaves a 2-byte hole in the listener struct.
2023-02-03 18:00:20 +01:00
Willy Tarreau
d5983cef80 MINOR: listener: remove the useless ->default_target field
This field is used by stream_new() to optionally set the applet the
stream will connect to for simple proxies like the CLI for example.
But it has never been configurable to anything and is always strictly
equal to the frontend's ->default_target. Let's just drop it and make
stream_new() only use the frontend's. It makes more sense anyway as
we don't want the proxy to work differently based on the "bind" line.
This idea was brought in 1.6 hoping that the h2 implementation would
use applets for decoding (which was dropped after the very first
attempt in 1.8).
2023-02-03 18:00:20 +01:00
Willy Tarreau
3083615410 MINOR: listener: move the ->accept callback to the bind_conf
The accept callback directly derives from the upper layer, generally
it's session_accept_fd(). As such it's also defined per bind line
so it makes sense to move it there.
2023-02-03 18:00:20 +01:00
Willy Tarreau
758c69d951 MINOR: listener: move the maxconn parameter to the bind_conf
The maxconn is set per bind line so let's move it there. This might
possibly even slightly reduce inter-thread contention since this one
is read-mostly and it was stored next to nbconn which changes for
each connection setup or teardown.
2023-02-03 18:00:20 +01:00
Willy Tarreau
1920f897d8 MINOR: listener: move the backlog setting from listener to bind_conf
The backlog setting is also defined by the bind_conf, so let's move
it there.
2023-02-03 18:00:20 +01:00
Willy Tarreau
882f2485a1 MINOR: listener: move maxaccept from listener to bind_conf
Like for previous values, maxaccept is really per-bind_conf, so let's
move it there. Some frontends (peers, log) set it to 1 so the assignment
was slightly moved.
2023-02-03 18:00:20 +01:00
Willy Tarreau
ee378165fb MINOR: listener: move maxseg and tcp_ut to bind_conf
These two arguments were only set and only used with tcpv4/tcpv6. Let's
just store them into the bind_conf instead of duplicating them for all
listeners since they're fixed per "bind" line.
2023-02-03 18:00:20 +01:00
Willy Tarreau
7866e8e50d MEDIUM: listener: move the analysers mask to the bind_conf
When bind_conf were created, some elements such as the analysers mask
ought to have moved there but that wasn't the case. Now that it's
getting clearer that bind_conf provides all binding parameters and
the listener is essentially a listener on an address, it's starting
to get really confusing to keep such parameters in the listener, so
let's move the mask to the bind_conf. We also take this opportunity
for pre-setting the mask to the frontend's upon initalization. Now
several loops have one less argument to take care of.
2023-02-03 18:00:20 +01:00
Frdric Lcaille
0aa79953c9 BUG/MINOR: quic: Unchecked source connection ID
The SCID (source connection ID) used by a peer (client or server) is sent into the
long header of a QUIC packet in clear. But it is also sent into the transport
parameters (initial_source_connection_id). As these latter are encrypted into the
packet, one must check that these two pieces of information do not differ
due to a packet header corruption. Furthermore as such a connection is unusuable
it must be killed and must stop as soon as possible processing RX/TX packets.

Implement qc_kill_con() to flag a connection as unusable and to kille it asap
waking up the idle timer task to release the connection.

Add a check to quic_transport_params_store() to detect that the SCIDs do not
match and make it call qc_kill_con().

Add several tests about connection to be killed at several critial locations,
especially in the TLS stack callback to receive CRYPTO data from or derive secrets,
and before preparing packet after having received others.

Must be backported to 2.6 and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
af25a69c8b MEDIUM: quic: Remove qc_conn_finalize() from the ClientHello TLS callbacks
This is a bad idea to make the TLS ClientHello callback call qc_conn_finalize().
If this latter fails, this would generate a TLS alert and make the connection
send packet whereas it is not functional. But qc_conn_finalize() job was to
install the transport parameters sent by the QUIC listener. This installation
cannot be done at any time. This must be done after having possibly negotiated
the QUIC version and before sending the first Handshake packets. It seems
the better moment to do that in when the Handshake TX secrets are derived. This
has been found inspecting the ngtcp2 code. Calling SSL_set_quic_transport_params()
too late would make the ServerHello to be sent without the transport parameters.

The code for the connection update which was done from qc_conn_finalize() has
been moved to quic_transport_params_store(). So, this update is done as soon as
possible.

Add QUIC_FL_CONN_TX_TP_RECEIVED to flag the connection as having received the
peer transport parameters. Indeed this is required when the ClientHello message
is splitted between packets.

Add QUIC_FL_CONN_FINALIZED to protect the connection from calling qc_conn_finalize()
more than one time. This latter is called only when the connection has received
the transport parameters and after returning from SSL_do_hanshake() which is the
function which trigger the TLS ClientHello callback call.

Remove the calls to qc_conn_finalize() from from the TLS ClientHello callbacks.

Must be backported to 2.6. and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
9969adbcdc MINOR: stats: add by HTTP version cumulated number of sessions and requests
Add cum_sess_ver[] new array of counters to count the number of cumulated
HTTP sessions by version (h1, h2 or h3).
Implement proxy_inc_fe_cum_sess_ver_ctr() to increment these counter.
This function is called each a HTTP mux is correctly initialized. The QUIC
must before verify the application operations for the mux is for h3 before
calling proxy_inc_fe_cum_sess_ver_ctr().
ST_F_SESS_OTHER stat field for the cumulated of sessions others than
HTTP sessions is deduced from ->cum_sess_ver counter (for all the session,
not only HTTP sessions) from which the HTTP sessions counters are substracted.

Add cum_req[] new array of counters to count the number of cumulated HTTP
requests by version and others than HTTP requests. This new member replace ->cum_req.
Modify proxy_inc_fe_req_ctr() which increments these counters to pass an HTTP
version, 0 special values meaning "other than an HTTP request". This is the case
for instance for syslog.c from which proxy_inc_fe_req_ctr() is called with 0
as version parameter.
ST_F_REQ_TOT stat field compputing for the cumulated number of requests is modified
to count the sum of all the cum_req[] counters.

As this patch is useful for QUIC, it must be backported to 2.7.
2023-02-03 17:55:49 +01:00
Willy Tarreau
1ea5f410ff CLEANUP: quic: no need for atomics on packet refcnt
This is a leftover from the implementation's history, but the
quic_rx_packet and quic_tx_packet ref counts were still atomically
updated. It was found in perf top that the cost of the atomic inc
in quic_tx_packet_refinc() alone was responsible for 1% of the CPU
usage at 135 Gbps. Given that packets are only processed on their
assigned thread we don't need that anymore and this can be replaced
with regular non-atomic operations.

Doing this alone has reduced the CPU usage of qc_do_build_pkt()
from 3.6 to 2.5% and increased the overall bit rate by about 1%.
2023-02-03 13:39:20 +01:00
Amaury Denoyelle
24d5b72ca9 MINOR: quic: add config for retransmit limit
Define a new configuration option "tune.quic.max-frame-loss". This is
used to specify the limit for which a single frame instance can be
detected as lost. If exceeded, the connection is closed.

This should be backported up to 2.7.
2023-02-03 11:56:46 +01:00
Amaury Denoyelle
e4abb1f2da MEDIUM: quic: implement a retransmit limit per frame
Add a <loss_count> new field in quic_frame structure. This field is set
to 0 and incremented each time a sent packet is declared lost. If
<loss_count> reached a hard-coded limit, the connection is deemed as
failing and is closed immediately with a CONNECTION_CLOSE using
INTERNAL_ERROR.

By default, limit is set to 10. This should ensure that overall memory
usage is limited if a peer behaves incorrectly.

This should be backported up to 2.7.
2023-02-03 11:56:42 +01:00
Amaury Denoyelle
57b3eaa793 MINOR: quic: refactor frame deallocation
Define a new function qc_frm_free() to handle frame deallocation. New
BUG_ON() statements ensure that the deallocated frame is not referenced
by other frame. To support this, all LIST_DELETE() have been replaced by
LIST_DEL_INIT(). This should enforce that frame deallocation is robust.

As a complement, qc_frm_unref() has been moved into quic_frame module.
It is justified as this is a utility function related to frame
deallocation. It allows to use it in quic_pktns_tx_pkts_release() before
calling qc_frm_free().

This should be backported up to 2.7.
2023-02-03 11:55:41 +01:00
Amaury Denoyelle
40c24f1a10 MINOR: quic: define new functions for frame alloc
Define two utility functions for quic_frame allocation :
* qc_frm_alloc() is used to allocate a new frame
* qc_frm_dup() is used to allocate a new frame by duplicating an
  existing one

Theses functions are useful to centralize quic_frame initialization.
Note that pool_zalloc() is replaced by a proper pool_alloc() + explicit
initialization code.

This commit will simplify implementation of the per frame retransmission
limitation. Indeed, a new counter will be added in quic_frame structure
which must be initialized to 0.

This should be backported up to 2.7.
2023-02-03 10:44:26 +01:00
Amaury Denoyelle
2216b0866e MINOR: quic: remove fin from quic_stream frame type
A dedicated <fin> field was used in quic_stream structure. However, this
info is already encoded in the frame type field as specified by QUIC
protocol.

In fact, only code for packet reception used the <fin> field. On the
sending side, we only checked for the FIN bit. To align both sides,
remove the <fin> field and only used the FIN bit.

This should be backported up to 2.7.
2023-02-03 09:46:55 +01:00
Amaury Denoyelle
1e340ba6bc MINOR: mux-quic/h3: define stream close callback
Define a new qcc_app_ops callback named close(). This will be used to
notify app-layer about the closure of a stream by the remote peer. Its
main usage is to ensure that the closure is allowed by the application
protocol specification.

For the moment, close is not implemented by H3 layer. However, this
function will be mandatory to properly reject a STOP_SENDING on the
control stream and preventing a later crash. As such, this commit must
be backported with the next one on 2.6.

This is related to github issue #2006.
2023-01-30 15:56:25 +01:00
Aurelien DARRAGON
b2e2ec51b3 MEDIUM: proxy/http_ext: implement dynamic http_ext
proxy http-only options implemented in http_ext were statically stored
within proxy struct.

We're making some changes so that http_ext are now stored in a dynamically
allocated structs.
http_ext related structs are only allocated when needed to save some space
whenever possible, and they are automatically freed upon proxy deletion.

Related PX_O_HTTP{7239,XFF,XOT) option flags were removed because we're now
considering an http_ext option as 'active' if it is allocated (ptr is not NULL)

A few checks (and BUG_ON) were added to make these changes safe because
it adds some (acceptable) complexity to the previous design.

Also, proxy.http was renamed to proxy.http_ext to make things more explicit.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
9ded834adc OPTIM: http_ext/7239: introduce c_mode to save some space
forwarded header option (rfc7239) deals with sample expressions in two
steps: first a sample expression string is extracted from the config file
and later in startup sequence this string is converted into the resulting
sample_expr.

We need to perform these two steps because we cannot compile the expr
too early in the parsing sequence. (or we would miss some context)
Because of this, we have two dinstinct structure members (expr and expr_s)
for each 7239 field supporting sample expressions.
This is not cool, because we're bloating the http forwarded config structure,
and thus, bloating proxy config structure.

To address this, we now merge both expr and expr_s members inside a single
union to regain some space. This forces us to perform some additional logic
to make sure to use the proper structure member at different parsing steps.

Thanks to this, we're also able to free/release related config hints and
sample expression strings as soon as the sample expression
compilation is finished.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
f958341610 MINOR: proxy: move 'originalto' option to http_ext
Just like forwarded (7239) header and forwardfor header, move parsing,
logic and management of 'originalto' option into http_ext dedicated class.

We're only doing this to standardize proxy http options management.
Existing behavior remains untouched.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
730b9836a6 MINOR: proxy: move 'forwardfor' option to http_ext
Just like forwarded (7239) header, move parsing, logic and management
of 'forwardfor' option into http_ext dedicated class.

We're only doing this to standardize proxy http options management.
Existing behavior remains untouched.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
b2bb9257d2 MINOR: proxy/http_ext: introduce proxy forwarded option
Introducing http_ext class for http extension related work that
doesn't fit into existing http classes.

HTTP extension "forwarded", introduced with 7239 RFC is now supported
by haproxy.

The option supports various modes from simple to complex usages involving
custom sample expressions.

  Examples :

    # Those servers want the ip address and protocol of the client request
    # Resulting header would look like this:
    #   forwarded: proto=http;for=127.0.0.1
    backend www_default
        mode http
        option forwarded
        #equivalent to: option forwarded proto for

    # Those servers want the requested host and hashed client ip address
    # as well as client source port (you should use seed for xxh32 if ensuring
    # ip privacy is a concern)
    # Resulting header would look like this:
    #   forwarded: host="haproxy.org";for="_000000007F2F367E:60138"
    backend www_host
        mode http
        option forwarded host for-expr src,xxh32,hex for_port

    # Those servers want custom data in host, for and by parameters
    # Resulting header would look like this:
    #   forwarded: host="host.com";by=_haproxy;for="[::1]:10"
    backend www_custom
        mode http
        option forwarded host-expr str(host.com) by-expr str(_haproxy) for for_port-expr int(10)

    # Those servers want random 'for' obfuscated identifiers for request
    # tracing purposes while protecting sensitive IP information
    # Resulting header would look like this:
    #   forwarded: for=_000000002B1F4D63
    backend www_for_hide
        mode http
        option forwarded for-expr rand,hex

By default (no argument provided), forwarded option will try to mimic
x-forward-for common setups (source client ip address + source protocol)

The option is not available for frontends.
no option forwarded is supported.

More info about 7239 RFC here: https://www.rfc-editor.org/rfc/rfc7239.html

More info about the feature in doc/configuration.txt

This should address feature request GH #575

Depends on:
  - "MINOR: http_htx: add http_append_header() to append value to header"
  - "MINOR: sample: add ARGC_OPT"
  - "MINOR: proxy: introduce http only options"
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
832e9f4119 MINOR: proxy: introduce http only options
This commit is innoffensive but will allow to do some code refactors in
existing proxy http options. Newly created http related proxy options
will also benefit from this.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
5f7f5fe76a MINOR: sample: add ARGC_OPT
Add ARGC_OPT enum to provide more context for upcoming sample parse errors
involving proxy "option" config directives.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
38ebffaf10 MINOR: http_htx: add http_prepend_header() to prepend value to header
Just like http_append_header(), but this time to insert new value before
an existing one.

If the header already contains one or multiple values, ',' is automatically
inserted after the new value.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
a5a8552cab MINOR: http_htx: add http_append_header() to append value to header
Calling this function as an alternative to http_replace_header_value()
to append a new value to existing header instead of replacing the whole
header content.

If the header already contains one or multiple values: a ',' is automatically
appended before the new value.

This function is not meant for prepending (providing empty ctx value),
in which case we should consider implementing dedicated prepend
alternative function.
2023-01-27 15:18:59 +01:00
Willy Tarreau
271c440392 MINOR: h2: add h2_phdr_to_ist() to make ISTs from pseudo headers
Till now pseudo headers were passed as const strings, but having them as
ISTs will be more convenient for traces. This doesn't change anything for
strings which are derived from them (and being constants they're still
zero-terminated).
2023-01-26 15:49:43 +01:00
Willy Tarreau
b8b243ac6a MINOR: trace: add the long awaited TRACE_PRINTF()
TRACE_PRINTF() can be used to produce arbitrary trace contents at any
trace level. It uses the exact same arguments as other TRACE_* macros,
but here they are mandatory since they are followed by the format-string,
though they may be filled with zeroes. The reason for the arguments is to
match tracking or filtering and not pollute other non-inspected objects.

It will probably be used inside loops, in which case there are two points
to be careful about:
  - output atomicity is only per-message, so competing threads may see
    their messages interleaved. As such, it is recommended that the
    caller places a recognizable unique context at the beginning of the
    message such as a connection pointer.
  - iterating over arrays or lists for all requests could be very
    expensive. In order to avoid this it is best to condition the call
    via TRACE_ENABLED() with the same arguments, which will return the
    same decision.
  - messages longer than TRACE_MAX_MSG-1 (1023 by default) will be
    truncated.

For example, in order to dump the list of HTTP headers between hpack
and h2:

  if (outlen > 0 &&
      TRACE_ENABLED(TRACE_LEVEL_DEVELOPER,
      H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0)) {
          int i;
          for (i = 0; list[i].n.len; i++)
              TRACE_PRINTF(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR,
                           h2c->conn, 0, 0, 0, "h2c=%p hdr[%d]=%s:%s", h2c, i,
                           list[i].n.ptr, list[i].v.ptr);
  }

In addition, a lower-level TRACE_PRINTF_LOC() macro is provided, that takes
two extra arguments, the caller's location and the caller's function name.
This will allow to emit composite traces from central functions on the
behalf of another one.
2023-01-26 15:49:43 +01:00
Willy Tarreau
4b36d5e8de MINOR: trace: add a trace_no_cb() dummy callback for when to use no callback
By default, passing a NULL cb to the trace functions will result in the
source's default one to be used. For some cases we won't want to use any
callback at all, not event the default one. Let's define a trace_no_cb()
function for this, that does absolutely nothing.
2023-01-26 15:49:43 +01:00
Willy Tarreau
8f9a9704bb MINOR: trace: add a TRACE_ENABLED() macro to determine if a trace is active
Sometimes it would be necessary to prepare some messages, pre-process some
blocks or maybe duplicate some contents before they vanish for the purpose
of tracing them. However we don't want to do that for everything that is
submitted to the traces, it's important to do it only for what will really
be traced.

The __trace() function has all the knowledge for this, to the point of even
checking the lockon pointers. This commit splits the function in two, one
with the trace decision logic, and the other one for the trace production.
The first one is now usable through wrappers such as _trace_enabled() and
TRACE_ENABLED() which will indicate whether traces are going to be produced
for the current source, level, event mask, parameters and tracking.
2023-01-26 15:49:43 +01:00
Willy Tarreau
80f36b2ac2 CLEANUP: trace: remove the QUIC-specific ifdefs
There are ifdefs at several places to only define TRC_ARGS_QCON when
QUIC is defined, but nothing prevents this code from building without.
Let's just remove those ifdefs, the single "if" they avoid is not worth
the extra maintenance burden.
2023-01-26 15:49:43 +01:00
Amaury Denoyelle
71fd03632f MINOR: mux-quic/h3: send SETTINGS as soon as transport is ready
As specified by HTTP3 RFC, SETTINGS frame should be sent as soon as
possible. Before this patch, this was only done on the first qc_send()
invocation. This delay significantly SETTINGS emission until the first
H3 response is ready to be transferred.

This patch fixes this by ensuring SETTINGS is emitted when MUX-QUIC is
being setup.

As a side point, return value of finalize operation is checked. This
means that an error during SETTINGS emission will cause the connection
init to fail.

This should be backported up to 2.7.
2023-01-25 16:01:55 +01:00
Willy Tarreau
7e70bfc8cb MINOR: threads: add a thread_harmless_end() version that doesn't wait
thread_harmless_end() needs to wait for rdv_requests to disappear so
that we're certain to respect a harmless promise that possibly allowed
another thread to proceed under isolation. But this doesn't work in a
signal handler because a thread could be interrupted by the debug
handler while already waiting for isolation and with rdv_request>0.
As such this function could cause a deadlock in such a signal handler.

Let's implement a specific variant for this, thread_harmless_end_sig(),
that just resets the thread's bit and doesn't wait. It must of course
not be used past a check point that would allow the isolation requester
to return and see the thread as temporarily harmless then turning back
on its promise.

This will be needed to fix a race in the debug handler.
2023-01-19 19:22:17 +01:00
Willy Tarreau
b2f38c13d1 BUG/MINOR: thread: always reload threads_enabled in loops
A few loops waiting for threads to synchronize such as thread_isolate()
rightfully filter the thread masks via the threads_enabled field that
contains the list of enabled threads. However, it doesn't use an atomic
load on it. Before 2.7, the equivalent variables were marked as volatile
and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and
the risk that the compiler keeps them in a register inside a loop is not
null at all. In practice when ha_thread_relax() calls sched_yield() or
an x86 PAUSE instruction, it could be verified that the variable is
always reloaded. If these are avoided (e.g. architecture providing
neither solution), it's visible in asm code that the variables are not
reloaded. In this case, if a thread exists just between the moment the
two values are read, the loop could spin forever.

This patch adds the required _HA_ATOMIC_LOAD() on the relevant
threads_enabled fields. It must be backported to 2.7.
2023-01-19 19:22:17 +01:00
Amaury Denoyelle
7d78eff889 MINOR: h3: extend function for QUIC varint encoding
Slighty adjust b_quic_enc_int(). This function is used to encode an
integer as a QUIC varint in a struct buffer.

A new parameter is added to the function API to specify the width of the
encoded integer. By default, 0 should be use to ensure that the minimum
space is used. Other valid values are 1, 2, 4 or 8. An error is reported
if the width is not large enough.

This new parameter will be useful when buffer space is reserved prior to
encode an unknown integer value. The maximum size of 8 bytes will be
reserved and some data can be put after. When finally encoding the
integer, the width can be requested to be 8 bytes.

With this new parameter, a small refactoring of the function has been
conducted to remove some useless internal variables.

This should be backported up to 2.7. It will be mostly useful to
implement H3 trailers encoding.
2023-01-19 15:09:01 +01:00
Remi Tricot-Le Breton
bb35e1f5aa BUG/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 (missing ECDSA_SIG_set0)
This function was introduced in OpenSSL 1.1.0. Prior to that, the
ECDSA_SIG structure was public.
This function was used in commit 5a8f02ae "BUG/MEDIUM: jwt: Properly
process ecdsa signatures (concatenated R and S params)".

This patch needs to be backported up to branch 2.5 alongside commit
5a8f02ae.
2023-01-19 11:13:51 +01:00
William Lallemand
2edc6d0301 Revert "BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7"
This reverts commit d65791e26c.

Conflict with the patch which was originally written and lacks the
BN_clear_free() and the NULL check.
2023-01-19 11:13:24 +01:00
Willy Tarreau
d65791e26c BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7
Commit 5a8f02ae6 ("BUG/MEDIUM: jwt: Properly process ecdsa signatures
(concatenated R and S params)") makes use of ECDSA_SIG_set0() which only
appeared in openssl-1.1.0 and libressl 2.7, and breaks the build before.
Let's just do what it minimally does (only assigns the two fields to the
destination).

This will need to be backported where the commit above is, likely 2.5.
2023-01-19 10:57:00 +01:00
Frdric Lcaille
21c4c9b854 MINOR: quic: Replace v2 draft definitions by those of the final 2 version
This should finalize the support for the QUIC version 2.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Frdric Lcaille
12a0317fed MINOR: quic: Add "no-quic" global option
Add "no-quic" to "global" section to disable the use of QUIC transport protocol
by all configured QUIC listeners. This is listeners with QUIC addresses on their
"bind" lines. Internally, the socket addresses binding is skipped by
protocol_bind_all() for receivers with <proto_quic4> or <proto_quic6> as
protocol (see protocol struct).
Add information about "no-quic" global option to the documentation.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Willy Tarreau
e77f4306ba BUG/MEDIUM: stconn: also consider SE_FL_EOI to switch to SE_FL_ERROR
In se_fl_set_error() we used to switch to SE_FL_ERROR only when there
is already SE_FL_EOS indicating that the read side is closed. But that
is not sufficient, we need to consider all cases where no more reads
will be performed on the connection, and as such also include SE_FL_EOI.

Without this, some aborted connections during a transfer sometimes only
stop after the timeout, because the ERR_PENDING is never promoted to
ERROR.

This must be backported to 2.7 and requires previous patch "CLEANUP:
stconn: always use se_fl_set_error() to set the pending error".
2023-01-17 16:27:35 +01:00
Christopher Faulet
2e47e3a1cf MINOR: htx: Add an HTX value for the extra field is payload length is unknown
When the payload length cannot be determined, the htx extra field is set to
the magical vlaue ULLONG_MAX. It is not obvious. This a dedicated HTX value
is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case,
instead of ULLONG_MAX.
2023-01-13 11:51:11 +01:00
Christopher Faulet
4da82395d8 CLEANUP: http-ana: Remove HTTP_MSG_ERROR state
This state is now unused. Thus it can be removed.
2023-01-13 11:22:13 +01:00
Christopher Faulet
71236dedb9 MINOR: http-ana: Add a function to set HTTP termination flags
There is already a function to set termination flags but it is not well
suited for HTTP streams. So a function, dedicated to the HTTP analysis, was
added. This way, this new function will be called for HTTP analysers on
error. And if the error is not caugth at this stage, the generic function
will still be called from process_stream().

Here, by default a PRXCOND error is reported and depending on the stream
state, the reson will be set accordingly:

  * If the backend SC is in INI state, SF_FINST_T is reported on tarpit and
    SF_FINST_R otherwise.

  * SF_FINST_Q is the server connection is queued

  * SF_FINST_C in any connection attempt state (REQ/TAR/ASS/CONN/CER/RDY).
    Except for applets, a SF_FINST_R is reported.

  * Once the server connection is established, SF_FINST_H is reported while
    HTTP_MSG_DATA state on the response side.

  * SF_FINST_L is reported if the response is in HTTP_MSG_DONE state or
    higher and a client error/timeout was reported.

  * Otherwise SF_FINST_D is reported.
2023-01-13 09:45:23 +01:00
Willy Tarreau
6be8d09a61 OPTIM: global: move byte counts out of global and per-thread
During multiple tests we've already noticed that shared stats counters
have become a real bottleneck under large thread counts. With QUIC it's
pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread
machine at only 25 Gbps, and this CPU is entirely spent in the atomic
increment of the byte count and byte rate. It's also visible in H1/H2
but slightly less since we're working with larger buffers, hence less
frequent updates. These counters are exclusively used to report the
byte count in "show info" and the byte rate in the stats.

Let's move them to the thread_ctx struct and make the stats reader
just collect each thread's stats when requested. That's way more
efficient than competing on a single cache line.

After this, qc_snd_buf has totally disappeared from the perf profile
and tests made in h1 show roughly 1% performance increase on small
objects.
2023-01-12 16:37:45 +01:00
Amaury Denoyelle
0a1154afb5 MINOR: mux-quic: use send-list for STOP_SENDING/RESET_STREAM emission
When a STOP_SENDING or RESET_STREAM must be send, its corresponding qcs
is inserted into <qcc.send_list> via qcc_reset_stream() or
qcc_abort_stream_read().

This allows to remove the iteration on full qcs tree in qc_send().
Instead, STOP_SENDING and RESET_STREAM is done in the loop over
<qcc.send_list> as with STREAM frames. This should improve slightly the
performance, most notably when large number of streams are opened.

This must be backported up to 2.7.
2023-01-10 17:49:50 +01:00
Amaury Denoyelle
f9b03265f0 MEDIUM: h3: send SETTINGS before STREAM frames
Complete qcc_send_stream() function to allow to specify if the stream
should be handled in priority. Internally this will insert the qcs
instance in front of <qcc.send_list> to be able to treat it before other
streams.

This functionality is useful when some QUIC streams should be sent
before others. Most notably, this is used to guarantee that H3 SETTINGS
is done first via the control stream.

This must be backported up to 2.7.
2023-01-10 17:49:50 +01:00
Amaury Denoyelle
20f2a425ff MAJOR: mux-quic: rework stream sending priorization
Implement a mechanism to register streams ready to send data in new
STREAM frames. Internally, this is implemented with a new list
<qcc.send_list> which contains qcs instances.

A qcs can be registered safely using the new function qcc_send_stream().
This is done automatically in qc_send_buf() which covers most cases.
Also, application layer is free to use it for internal usage streams.
This is currently the case for H3 control stream with SETTINGS sending.

The main point of this patch is to handle stream sending fairly. This is
in stark contrast with previous code where streams with lower ID were
always prioritized. This could cause other streams to be indefinitely
blocked behind a stream which has a lot of data to transfer. Now,
streams are handled in an order scheduled by se_desc layer.

This commit is the first one of a serie which will bring other
improvments which also relied on the send_list implementation.

This must be backported up to 2.7 when deemed sufficiently stable.
2023-01-10 17:49:50 +01:00
Christopher Faulet
da89e9b95b MINOR: channel/applets: Stop to test CF_WRITE_ERROR flag if CF_SHUTW is enough
In applets, we stop processing when a write error (CF_WRITE_ERROR) or a shutdown
for writes (CF_SHUTW) is detected. However, any write error leads to an
immediate shutdown for writes. Thus, it is enough to only test if CF_SHUTW is
set.
2023-01-09 18:41:08 +01:00
Christopher Faulet
4b490b7517 MINOR: channel: Stop to test CF_READ_ERROR flag if CF_SHUTR is enough
When a read error (CF_READ_ERROR) is reported, a shutdown for reads is
always performed (CF_SHUTR). Thus, there is no reason to check if
CF_READ_ERROR is set if CF_SHUTR is also checked.
2023-01-09 18:41:08 +01:00
Christopher Faulet
2357718217 MEDIUM: channel: Remove CF_READ_ATTACHED and report CF_READ_EVENT instead
CF_READ_ATTACHED flag is only used in input events for stream analyzers,
CF_MASK_ANALYSER. A read event can be reported instead and this flag can be
removed. We must only take care to report a read event when the client
connection is upgraded from TCP to HTTP.
2023-01-09 18:41:08 +01:00
Christopher Faulet
049fbcd36a MINOR: channel: Remove CF_ANA_TIMEOUT and report CF_READ_EVENT instead
It appears CF_ANA_TIMEOUT is flag only used in CF_MASK_ANALYSER. All
analyzer timeout relies on the analysis expiration date (chn->analyse_exp).
Worst, once set, this flag is never removed. Thus this flag can be removed
and replaced by a read event (CF_READ_EVENT).
2023-01-09 18:41:08 +01:00
Christopher Faulet
a63f8f379f MINOR: channel: Remove CF_WRITE_ACTIVITY
Thanks to previous changes, CF_WRITE_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_WRITE_EVENT|CF_WRITE_ERROR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
33e03cec5f MINOR: channel: Remove CF_READ_ACTIVITY
Thanks to previous changes, CF_READ_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_READ_EVENT|CF_READ_ERROR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
d898841530 MEDIUM: channel: Use CF_WRITE_EVENT instead of CF_WRITE_PARTIAL
Just like CF_READ_PARTIAL, CF_WRITE_PARTIAL is now merged with
CF_WRITE_EVENT. There a subtlety in sc_notify(). The "connect" event
(formely CF_WRITE_NULL) is now detected with
(CF_WRITE_EVENT + sc->state < SC_ST_EST).
2023-01-09 18:41:08 +01:00
Christopher Faulet
285f7616ee MEDIUM: channel: Use CF_READ_EVENT instead of CF_READ_PARTIAL
CF_READ_PARTIAL flag is now merged with CF_READ_EVENT. It means
CF_READ_EVENT is set when a read0 is received (formely CF_READ_NULL) or when
data are received (formely CF_READ_ACTIVITY).

There is nothing special here, except conditions to wake the stream up in
sc_notify(). Indeed, the test was a bit changed to reflect recent
change. read0 event is now formalized by (CF_READ_EVENT + CF_SHUTR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
b96f2aa380 REORG: channel: Rename CF_WRITE_NULL to CF_WRITE_EVENT
As for CF_READ_NULL, it appears CF_WRITE_NULL and other write events on a
channel are mainly used to wake up the stream and may be replace by on write
event.

In this patch, we introduce CF_WRITE_EVENT flag as a replacement to
CF_WRITE_EVENT_NULL. There is no breaking change for now, it is just a
rename. Gradually, other write events will be merged with this one.
2023-01-09 18:41:08 +01:00
Christopher Faulet
6e1bbc446b REORG: channel: Rename CF_READ_NULL to CF_READ_EVENT
CF_READ_NULL flag is not really useful and used. It is a transient event
used to wakeup the stream. As we will see, all read events on a channel may
be resumed to only one and are all used to wake up the stream.

In this patch, we introduce CF_READ_EVENT flag as a replacement to
CF_READ_NULL. There is no breaking change for now, it is just a
rename. Gradually, other read events will be merged with this one.
2023-01-09 18:41:08 +01:00
Willy Tarreau
5a72d03a58 MINOR: stick-table: implement the sc-add-gpc() action
This action increments the General Purpose Counter at the index <idx> of
the array associated to the sticky counter designated by <sc-id> by the
value of either integer <int> or the integer evaluation of expression
<expr>. Integers and expressions are limited to unsigned 32-bit values.
If an error occurs, this action silently fails and the actions evaluation
continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer
between 0 and 2. It also silently fails if the there is no GPC stored at
this index. The entry in the table is refreshed even if the value is zero.
The 'gpc_rate' is automatically adjusted to reflect the average growth
rate of the gpc value.

The main use of this action is to count scores or total volumes (e.g.
estimated danger per source IP reported by the server or a WAF, total
uploaded bytes, etc).
2023-01-07 09:11:22 +01:00
Willy Tarreau
6c0117168e MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters
The number of stick-counter entries usable by track-sc rules is currently
set at build time. There is no good value for this since the vast majority
of users don't need any, most need only a few and rare users need more.
Adding more counters for everyone increases memory and CPU usages for no
reason.

This patch moves the per-session and per-stream arrays to a pool of a size
defined at boot time. This way it becomes possible to set the number of
entries at boot time via a new global setting "tune.stick-counters" that
sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR
value still applies, or 3 if not set, as before.

It is also possible to lower the value to 0 to save a bit of memory if
not used at all.

Note that a few low-level sample-fetch functions had to be protected due
to the ability to use sample-fetches in the global section to set some
variables.
2023-01-06 18:08:49 +01:00
Christopher Faulet
61aded057d BUG/MAJOR: buf: Fix copy of wrapping output data when a buffer is realigned
There is a bug in b_slow_realign() function when wrapping output data are
copied in the swap buffer. block1 and block2 sizes are inverted. Thus blocks
with a wrong size are copied. It leads to data mixin if the first block is
in reality larger than the second one or to a copy of data outside the
buffer is the first block is smaller than the second one.

The bug was introduced when the buffer API was refactored in 1.9. It was
found by a code review and seems never to have been triggered in almost 5
years. However, we cannot exclude it is responsible of some unresolved bugs.

This patch should fix issue #1978. It must be backported as far as 2.0.
2023-01-05 09:34:49 +01:00
Willy Tarreau
6e70a3986c BUILD: makefile: only consider settings from enabled options
Due to the previous SSL exception we coudln't restrict the collected
CFLAGS/LDFLAGS to those of enabled options, so all of them were
considered if set. The problem is that it would prevent simply
disabling a build option without unsetting its xxx_CFLAGS or _LDFLAGS
values if those had incompatible values (e.g. -lfoo).

Now that only existing options are listed in collect_opts_flags, we
can safely check that the option is set and only consider its settings
in this case. Thus OT_LDFLAGS will not be used if USE_OT is not set
for example.
2022-12-23 17:01:55 +01:00
Willy Tarreau
6a2cd33509 BUILD: makefile: remove the special case of the SSL option
By creating USE_SSL and enabling it when USE_OPENSSL is set, we can
get rid of the special case that was made with it regarding cflags
collect and when resetting options. The option doesn't need to be
manually set, though in the future it might prove useful if other
non-openssl API are supported.
2022-12-23 16:53:35 +01:00
Willy Tarreau
2b8d0978f3 BUILD: makefile: make all OpenSSL variants use the same settings
It's getting complicated to configure includes and lib dirs for
OpenSSL API variants such as WolfSSL, because some settings are
common and others are specific but carry a prefix that doesn't
match the USE_* rule scheme.

This patch simplifies everything by considering that all SSL libs
will use SSL_INC, SSL_LIB, SSL_CFLAGS and SSL_LDFLAGS. That's much
more convenient. This works thanks to the settings collector which
explicitly checks the SSL_* settings. When USE_OPENSSL_WOLFSSL is
set, then USE_OPENSSL is implied, so that there's no need to
duplicate maintenance effort.
2022-12-23 16:53:35 +01:00
Willy Tarreau
8fa2f49f24 BUILD: makefile: add a function to collect all options' CFLAGS/LDFLAGS
The new function collect_opts_flags now scans all USE_* options defined
in use_opts and appends the corresponding *_CFLAGS and *_LDFLAGS to
OPTIONS_{C,LD}FLAGS respectively. This will be useful to get rid of all
the individual concatenations to these variables.
2022-12-23 16:53:35 +01:00