Commit Graph

13949 Commits

Author SHA1 Message Date
Willy Tarreau
e22267971b MINOR: connection: skip FD-based syscalls for FD-less connections
Some syscalls at the TCP level act directly on the FD. Some of them
are used by TCP actions like set-tos, set-mark, silent-drop, others
try to retrieve TCP info, get the source or destination address. These
ones must not be called with an invalid FD coming from an FD-less
connection, so let's add the relevant tests for this. It's worth
noting that all these ones already have fall back plans (do nothing,
error, or switch to alternate implementation).
2022-04-11 19:31:47 +02:00
Willy Tarreau
0e9c264ca0 MINOR: connection: use conn_fd() when displaying connection errors
The SSL connection errors and socks4 proxy errors used to blindly dump
the FD, now it's sanitized via conn_fd().
2022-04-11 19:31:47 +02:00
Willy Tarreau
a57f34523e MINOR: stream: only dump connections' FDs when they are valid
The "show sess" output and the debugger outputs will now use conn_fd()
to retrieve the file descriptor instead of dumping incorrect data.
2022-04-11 19:31:47 +02:00
Willy Tarreau
c78a9698ef MINOR: connection: add a new flag CO_FL_FDLESS on fd-less connections
QUIC connections do not use a file descriptor, instead they use the
quic equivalent which is the quic_conn. A number of our historical
functions at the connection level continue to unconditionally touch
the file descriptor and this may have consequences once QUIC starts
to be used.

This patch adds a new flag on QUIC connections, CO_FL_FDLESS, to
mention that the connection doesn't have a file descriptor, hence the
FD-based API must never be used on them.

From now on it will be possible to intrument existing functions to
panic when this flag is present.
2022-04-11 19:31:47 +02:00
Willy Tarreau
e4d09cedb6 MINOR: sock: check configured limits at the sock layer, not the listener's
listener_accept() used to continue to enforce the FD limits relative to
global.maxsock by itself while it's the last FD-specific test in the
whole file. This test has nothing to do there, it ought to be placed in
sock_accept_conn() which is the one in charge of FD allocation and tests.
Similar tests are already located there by the way. The only tiny
difference is that listener_accept() used to pause for one second when
this limit was reached, while other similar conditions were pausing only
100ms, so now the same 100ms will apply. But that's not important and
could even be considered as an improvement.
2022-04-11 19:31:47 +02:00
Willy Tarreau
325fc63f5a BUILD: xprt-quic: replace ERR_func_error_string() with ERR_peek_error_func()
OpenSSL 3.0 warns that ERR_func_error_string() is deprecated. Using
ERR_peek_error_func() solves it instead, and this function was added to
the compat layer by commit 1effd9aa0 ("MINOR: ssl: Remove call to
ERR_func_error_string with OpenSSLv3").
2022-04-11 18:54:46 +02:00
William Lallemand
d7bfbe2333 BUILD: ssl: add USE_ENGINE and disable the openssl engine by default
The OpenSSL engine API is deprecated starting with OpenSSL 3.0.

In order to have a clean build this feature is now disabled by default.
It can be reactivated with USE_ENGINE=1 on the build line.
2022-04-11 18:41:24 +02:00
Willy Tarreau
00147f7244 BUG/MINOR: stats: define the description' background color in dark color scheme
Shawn Heisey reported that the proxy's description was unreadable in dark
color scheme. This is because the text color is changed in the table but
not the cell's background.

This should be backported to 2.5.
2022-04-11 08:01:43 +02:00
Willy Tarreau
dddfec5428 CLEANUP: connection: reduce the with of the mux dump output
In "haproxy -vv" we produce a list of available muxes with their
capabilities, but that list is often quite large for terminals due
to excess of spaces, so let's reduce them a bit to make the output
more readable.
2022-04-09 11:43:16 +02:00
Remi Tricot-Le Breton
b5d968d9b2 MEDIUM: global: Add a "close-spread-time" option to spread soft-stop on time window
The new 'close-spread-time' global option can be used to spread idle and
active HTTP connction closing after a SIGUSR1 signal is received. This
allows to limit bursts of reconnections when too many idle connections
are closed at once. Indeed, without this new mechanism, in case of
soft-stop, all the idle connections would be closed at once (after the
grace period is over), and all active HTTP connections would be closed
by appending a "Connection: close" header to the next response that goes
over it (or via a GOAWAY frame in case of HTTP2).

This patch adds the support of this new option for HTTP as well as HTTP2
connections. It works differently on active and idle connections.

On active connections, instead of sending systematically the GOAWAY
frame or adding the 'Connection: close' header like before once the
soft-stop has started, a random based on the remainder of the close
window is calculated, and depending on its result we could decide to
keep the connection alive. The random will be recalculated for any
subsequent request/response on this connection so the GOAWAY will still
end up being sent, but we might wait a few more round trips. This will
ensure that goaways are distributed along a longer time window than
before.

On idle connections, a random factor is used when determining the expire
field of the connection's task, which should naturally spread connection
closings on the time window (see h2c_update_timeout).

This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on "BUG/MEDIUM:
mux-h2: make use of http-request and keep-alive timeouts" which
refactorized the timeout management of HTTP2 connections.
2022-04-08 18:15:21 +02:00
Frédéric Lécaille
8c7927c6dd MINOR: quic_tls: Make key update use of reusable cipher contexts
We modify the key update feature implementation to support reusable cipher contexts
as this is done for the other cipher contexts for packet decryption and encryption.
To do so we attach a context to the quic_tls_kp struct and initialize it each time
the underlying secret key is updated. Same thing when we rotate the secrets keys,
we rotate the contexts as the same time.
2022-04-08 15:38:29 +02:00
Frédéric Lécaille
3dfd4c4b0d MINOR: quic: Add short packet key phase bit values to traces
This is useful to diagnose key update related issues.
2022-04-08 15:38:29 +02:00
Frédéric Lécaille
9688a8df49 CLEANUP: quic: Do not set any cipher/group from ssl_quic_initial_ctx()
These settings are potentially cancelled by others setting initialization shared
with SSL sock bindings. This will have to be clarified when we will adapt the
QUIC bindings configuration.
2022-04-08 15:38:29 +02:00
Frédéric Lécaille
f2f4a4eee5 MINOR: quic_tls: Stop hardcoding cipher IV lengths
For QUIC AEAD usage, the number of bytes for the IVs is always 12.
2022-04-08 15:38:29 +02:00
Frédéric Lécaille
f4605748f4 MINOR: quic_tls: Add reusable cipher contexts to QUIC TLS contexts
Add ->ctx new member field to quic_tls_secrets struct to store the cipher context
for each QUIC TLS context TX/RX parts.
Add quic_tls_rx_ctx_init() and quic_tls_tx_ctx_init() functions to initialize
these cipher context for RX and TX parts respectively.
Make qc_new_isecs() call these two functions to initialize the cipher contexts
of the Initial secrets. Same thing for ha_quic_set_encryption_secrets() to
initialize the cipher contexts of the subsequent derived secrets (ORTT, Handshake,
1RTT).
Modify quic_tls_decrypt() and quic_tls_encrypt() to always use the same cipher
context without allocating it each time they are called.
2022-04-08 15:38:29 +02:00
Frédéric Lécaille
82851bd3cb BUG/MEDIUM: quic: Possible crash from quic_free_arngs()
All quic_arng_node objects are allocated from "pool_head_quic_arng" memory pool.
They must be deallocated calling pool_free().
2022-04-08 15:38:29 +02:00
Willy Tarreau
9cc88c3075 BUG/MINOR: quic: set the source not the destination address on accept()
When a QUIC connection is accepted, the wrong field is set from the
client's source address, it's the destination instead of the source.
No backport needed.
2022-04-08 14:43:27 +02:00
Amaury Denoyelle
8038821c88 BUG/MEDIUM: mux-quic: properly release conn-stream on detach
On qc_detach(), the qcs must cleared the conn-stream context and set its
cs pointer to NULL. This prevents the qcs to point to a dangling
reference.

Without this, a SEGFAULT may occurs in qc_wake_some_streams() when
accessing an already detached conn-stream instance through a qcs.

Here is the SEGFAULT observed on haproxy.org.
 Program terminated with signal 11, Segmentation fault.
 1234                            else if (qcs->cs->data_cb->wake) {
 (gdb) p qcs.cs.data_cb
 $1 = (const struct data_cb *) 0x0

This can happens since the following patch :
 commit fe035eca3a
 MEDIUM: mux-quic: report errors on conn-streams
2022-04-08 12:09:23 +02:00
Amaury Denoyelle
9c3955c98c CLEANUP: mux-quic: remove uneeded TODO in qc_detach
The stream mux buffering has been reworked since the introduction of the
struct qc_stream_desc. A qcs is now able to quickly release its buffer
to the quic-conn.
2022-04-08 12:08:50 +02:00
Christopher Faulet
114e759d5d BUG/MEDIUM: http-act: Don't replace URI if path is not found or invalid
For replace-path, replace-pathq and replace-uri actions, we must take care
to not match on the selected element if it is not defined.

regex_exec_match2() function expects to be called with a defined
subject. However, if the request path is invalid or not found, the function
is called with a NULL subject, leading to a crash when compiled without the
PRCE/PCRE2 support.

For instance the following rules crashes HAProxy on a CONNECT request:

  http-request replace-path /short/(.) /\1

This patch must be backported as far as 2.0.
2022-04-08 10:45:31 +02:00
Christopher Faulet
21ac0eec28 BUG/MEDIUM: http-conv: Fix url_enc() to not crush const samples
url_enc() encodes an input string by calling encode_string(). To do so, it
adds a trailing '\0' to the sample string. However it never restores the
sample string at the end. It is a problem for const samples. The sample
string may be in the middle of a buffer. For instance, the HTTP headers
values are concerned.

However, instead of modifying the sample string, it is easier to rely on
encode_chunk() function. It does the same but on a buffer.

This patch must be backported as far as 2.2.
2022-04-08 10:12:59 +02:00
Christopher Faulet
dca3b5b2c6 BUG/MINOR: http_client: Don't add input data on an empty request buffer
When compiled in debug mode, a BUG_ON triggers an error when the payload is
fully transfered from the http-client buffer to the request channel
buffer. In fact, when channel_add_input() is called, the request buffer is
empty. So an error is reported when those data are directly forwarded,
because we try to add some output data on a buffer with no data.

To fix the bug, we must be sure to call channel_add_input() after the data
transfer.

The bug was introduced by the commit ccc7ee45f ("MINOR: httpclient: enable
request buffering"). So, this patch must be backported if the above commit
is backported.
2022-04-07 11:04:39 +02:00
Christopher Faulet
2eb5243e7f BUG/MEDIUM: mux-h1: Set outgoing message to DONE when payload length is reached
When a message is sent, we can switch it state to MSG_DONE if all the
announced payload was processed. This way, if the EOM flag is not set on the
message when the last expected data block is processed, the message can
still be set to MSG_DONE state.

This bug is related to the previous ones. There is a design issue with the
HTX since the 2.4. When the EOM HTX block was replaced by a flag, I tried
hard to be sure the flag is always set with the last HTX block on a
message. It works pretty well for all messages received from a client or a
server. But for internal messages, it is not so simple. Because applets
cannot always properly handle the end of messages. So, there are some cases
where the EOM flag is set on an empty message.

As a workaround, for chunked messages, we can add an EOT HTX block. It does
the trick. But for messages with a content-length, there is no empty DATA
block. Thus, the only way to be sure the end of the message was reached in
this case is to detect it in the H1 multiplexr.

We already count the amount of data processed when the payload length is
announced. Thus, we must only switch the message in DONE state when last
bytes of the payload are received. Or when the EOM flag is received of
course.

This patch must be backported as far as 2.4.
2022-04-07 11:04:07 +02:00
Christopher Faulet
f89af9c205 BUG/MEDIUM: hlua: Don't set EOM flag on an empty HTX message in HTTP applet
In a lua HTTP applet, when the script is finished, we must be sure to not
set the EOM on an empty message. Otherwise, because there is no data to
send, the mux on the client side may miss the end of the message and
consider any shutdown as an abort.

See "UG/MEDIUM: stats: Be sure to never set EOM flag on an empty HTX
message" for details.

This patch must be backported as far as 2.4. On previous version there is
still the EOM HTX block.
2022-04-07 11:04:07 +02:00
Christopher Faulet
08b45cb8fd BUG/MEDIUM: stats: Be sure to never set EOM flag on an empty HTX message
During the last call to the stats I/O handle, it is possible to have nothing
to dump. It may happen for many reasons. For instance, all remaining proxies
are disabled or they don't match the specified scope. In HTML or in JSON, it
is not really an issue because there is a footer. So there are still some
data to push in the response channel buffer. In CSV, it is a problem because
there is no footer. It means it is possible to finish the response with no
payload at all in the HTX message. Thus, the EOM flag may be added on an
empty message. When this happens, a shutdown is performed on an empty HTX
message. Because there is nothing to send, the mux on the client side is not
notified that the message was properly finished and interprets the shutdown
as an abort.

The response is chunked. So an abort at this stage means the last CRLF is
never sent to the client. All data were sent but the message is invalid
because the response chunking is not finished. If the reponse is compressed,
because of a similar bug in the comppression filter, the compression is also
aborted and the content is truncated because some data a lost in the
compression filter.

It is design issue with the HTX. It must be addressed. And there is an
opportunity to do so with the recent conn-stream refactoring. It must be
carefully evaluated first. But it is possible. In the means time and to also
fix stable versions, to workaround the bug, a end-of-trailer HTX block is
systematically added at the end of the message when the EOM flag is set if
the HTX message is empty. This way, there are always some data to send when
the EOM flag is set.

Note that with a H2 client, it is only a problem when the response is
compressed.

This patch should fix the issue #1478. It must be backported as far as
2.4. On previous versions there is still the EOM block.
2022-04-07 11:04:07 +02:00
Christopher Faulet
d057960769 BUG/MINOR: fcgi-app: Don't add C-L header on response to HEAD requests
In the FCGI app, when a full response is received, if there is no
content-length and transfer-encoding headers, a content-length header is
automatically added. This avoid, as far as possible to chunk the
response. This trick was added because, most of time, scripts don"t add
those headers.

But this should not be performed for response to HEAD requests. Indeed, in
this case, there is no payload. If the payload size is not specified, we
must not added it by hand. Otherwise, a "content-length: 0" will always be
added while it is not the real payload size (unknown at this stage).

This patch should solve issue #1639. It must be backported as far as 2.2.
2022-04-07 11:04:07 +02:00
Amaury Denoyelle
b515b0af1d MEDIUM: quic: report closing state for the MUX
Define a new API to notify the MUX from the quic-conn when the
connection is about to be closed. This happens in the following cases :
- on idle timeout
- on CONNECTION_CLOSE emission or reception

The MUX wake callback is called on these conditions. The quic-conn
QUIC_FL_NOTIFY_CLOSE is set to only report once. On the MUX side,
connection flags CO_FL_SOCK_RD_SH|CO_FL_SOCK_WR_SH are set to interrupt
future emission/reception.

This patch is the counterpart to
  "MEDIUM: mux-quic: report CO_FL_ERROR on send".
Now the quic-conn is able to report its closing, which may be translated
by the MUX into a CO_FL_ERROR on the connection for the upper layer.
This allows the MUX to properly react to the QUIC closing mechanism for
both idle-timeout and closing/draining states.
2022-04-07 10:37:45 +02:00
Amaury Denoyelle
fe035eca3a MEDIUM: mux-quic: report errors on conn-streams
Complete the error reporting. For each attached streams, if CO_FL_ERROR
is set, mark them with CS_FL_ERR_PENDING|CS_FL_ERROR. This will notify
the upper layer to trigger streams detach and release of the MUX.

This reporting is implemented in a new function qc_wake_some_streams(),
called by qc_wake(). This ensures that a lower-layer error is quickly
reported to the individual streams.
2022-04-07 10:37:45 +02:00
Amaury Denoyelle
d97fc804f9 MEDIUM: mux-quic: report CO_FL_ERROR on send
Mark the connection with CO_FL_ERROR on qc_send() if the socket Tx is
closed. This flag is used by the upper layer to order a close on the
MUX. This requires to check CO_FL_ERROR in qcc_is_dead() to process to
immediate MUX free when set.

The qc_wake() callback has been completed. Most notably, it now calls
qc_send() to report a possible CO_FL_ERROR. This is useful because
qc_wake() is called by the quic-conn on imminent closing.

Note that for the moment the error flag can never be set because the
quic-conn does not report when the Tx socket is closed. This will be
implemented in a following patch.
2022-04-07 10:35:34 +02:00
Amaury Denoyelle
c933780f1e MINOR: mux-quic: centralize send operations in qc_send
Regroup all features related to sending in qc_send(). This will be
useful when qc_send() will be called outside of the io-cb.

Currently, flow-control frames generation is now automatically
integrated in qc_send().
2022-04-07 10:23:10 +02:00
Amaury Denoyelle
198d35f9c6 MINOR: mux-quic: define is_active app-ops
Add a new app layer operation is_active. This can be used by the MUX to
check if the connection can be considered as active or not. This is used
inside qcc_is_dead as a first check.

For example on HTTP/3, if there is at least one bidir client stream
opened the connection is active. This explicitly ignore the uni streams
used for control and qpack as they can never be closed during the
connection lifetime.
2022-04-07 10:23:10 +02:00
Amaury Denoyelle
06890aaa91 MINOR: mux-quic: adjust timeout to accelerate closing
Improve timeout handling on the MUX. When releasing a stream, first
check if the connection can be considered as dead and should be freed
immediatly. This allows to liberate resources faster when possible.

If the connection is still active, ensure there is no attached
conn-stream before scheduling the timeout. To do this, add a nb_cs field
in the qcc structure.
2022-04-07 10:23:10 +02:00
Amaury Denoyelle
846cc046ae MINOR: mux-quic: factorize conn-stream attach
Provide a new function qc_attach_cs. This must be used by the app layer
when a conn-stream can be instantiated. This will simplify future
development.
2022-04-07 10:10:23 +02:00
Amaury Denoyelle
c9acc31018 BUG/MINOR: fix memleak on quic-conn streams cleaning
When freeing a quic-conn, the streams resources attached to it must be
cleared. This code is already implemented but the streams buffer was not
deallocated.

Fix this by using the function qc_stream_desc_free. This existing
function centralize all operations to properly free all streams
elements, attached both to the MUX and the quic-conn.

This fixes a memory leak which can happen for each released connection.
2022-04-07 10:10:23 +02:00
Amaury Denoyelle
6057b4090e CLEANUP: mux-quic: remove unused QC_CF_CC_RECV
This flag was used to notify the MUX about a CONNECTION_CLOSE frame
reception. It is now unused on the MUX side and can be removed. A new
mechanism to detect quic-conn closing will be soon implemented.
2022-04-07 10:10:23 +02:00
Amaury Denoyelle
e0be573c1b CLEANUP: quic: use static qualifer on quic_close
quic_close can be used through xprt-ops and can thus be kept as a static
symbol.
2022-04-07 10:10:22 +02:00
Amaury Denoyelle
db71e3bd09 BUG/MEDIUM: quic: ensure quic-conn survives to the MUX
Rationalize the lifetime of the quic-conn regarding with the MUX. The
quic-conn must not be freed if the MUX is still allocated.

This simplify the MUX code when accessing the quic-conn and removed
possible segfaults.

To implement this, if the quic-conn timer expired, the quic-conn is
released only if the MUX is not allocated. Else, the quic-conn is
flagged with QUIC_FL_CONN_EXP_TIMER. The MUX is then responsible
to call quic_close() which will free the flagged quic-conn.
2022-04-07 10:10:22 +02:00
Frédéric Lécaille
59bf255806 MINOR: quic: Add closing connection state
New received packets after sending CONNECTION_CLOSE frame trigger a new
CONNECTION_CLOSE frame to be sent. Each time such a frame is sent we
increase the number of packet required to send another CONNECTION_CLOSE
frame.
Rearm only one time the idle timer when sending a CONNECTION_CLOSE frame.
2022-04-06 15:52:35 +02:00
Frédéric Lécaille
47756809fb MINOR: quic: Add draining connection state.
As soon as we receive a CONNECTION_CLOSE frame, we must stop sending packets.
We add QUIC_FL_CONN_DRAINING connection flag to do so.
2022-04-06 15:52:35 +02:00
William Lallemand
eb0d4c40ac BUG/MINOR: httpclient: end callback in applet release
In case an error provokes the release of the applet, we will never call
the end callback of the httpclient.

In the case of a lua script, it would mean that the lua task will never
be waked up after a yield, letting the lua script stuck forever.

Fix the issue by moving the callback from the end of the iohandler to
the applet release function.

Must be backported in 2.5.
2022-04-06 14:19:36 +02:00
William Lallemand
71abad050a MEDIUM: httpclient: enable l7-retry
Enable the layer-7 retry in the httpclient. This way the client will
retry upon a connection error or a timeout.
2022-04-06 11:50:10 +02:00
William Lallemand
ccc7ee45f9 MINOR: httpclient: enable request buffering
The request buffering is required for doing l7 retry. The IO handler of
the httpclient need to be rework for that.

This patch change the IO handler so it copies partially the data instead
of swapping buffer. This is needed because the b_xfer won't never work
if the destination buffer is not empty, which is the case when
buffering.
2022-04-06 11:43:01 +02:00
Remi Tricot-Le Breton
e8041fe8bc BUG/MINOR: ssl/cli: Remove empty lines from CLI output
There were empty lines in the output of "show ssl ca-file <cafile>" and
"show ssl crl-file <crlfile>" commands when an empty line should only
mark the end of the output. This patch adds a space to those lines.

This patch should be backported to 2.5.
2022-04-05 16:53:37 +02:00
William Lallemand
80296b4bd5 BUG/MINOR: ssl: handle X509_get_default_cert_dir() returning NULL
ssl_store_load_locations_file() is using X509_get_default_cert_dir()
when using '@system-ca' as a parameter.

This function could return a NULL if OpenSSL was built with a
X509_CERT_DIR set to NULL, this is uncommon but let's fix this.

No backport needed, 2.6 only.

Fix issue #1637.
2022-04-05 10:19:30 +02:00
Nikola Sale
0dbf03871f MINOR: sample: converter: Add add_item convertor
This new converter is similar to the concat converter and can be used to
build new variables made of a succession of other variables but the main
difference is that it does the checks if adding a delimiter makes sense as
wouldn't be the case if e.g the current input sample is empty. That
situation would require 2 separate rules using concat converter where the
first rule would have to check if the current sample string is empty before
adding a delimiter. This resolves GitHub Issue #1621.
2022-04-04 07:30:58 +02:00
William Lallemand
c6b1763dcd MINOR: ssl: ca-file @system-ca loads the system trusted CA
The new parameter "@system-ca" to the ca-file directives loads the
trusted CA in the directory returned by X509_get_default_cert_dir().
2022-04-01 23:52:50 +02:00
William Lallemand
4f6ca32217 BUG/MINOR: ssl: continue upon error when opening a directory w/ ca-file
Previous patch was accidentaly breaking upon an error when itarating
through a CA directory. This is not the expected behavior, the function
must start processing the other files after the warning.
2022-04-01 23:52:50 +02:00
William Lallemand
87fd994727 MEDIUM: ssl: allow loading of a directory with the ca-file directive
This patch implements the ability to load a certificate directory with
the "ca-file" directive.

The X509_STORE_load_locations() API does not allow to cache a directory
in memory at startup, it only references the directory to allow a lookup
of the files when needed. But that is not compatible with the way
HAProxy works, without any access to the filesystem.

The current implementation loads every ".pem", ".crt", ".cer", and
".crl" available in the directory which is what is done when using
c_rehash and X509_STORE_load_locations(). Those files are cached in the
same X509_STORE referenced by the directory name. When looking at "show ssl
ca-file", everything will be shown in the same entry.

This will eventually allow to load more easily the CA of the system,
which could already be done with "ca-file /etc/ssl/certs" in the
configuration.

Loading failure intentionally emit a warning instead of an alert,
letting HAProxy starts when one of the files can't be loaded.

Known limitations:

- There is a bug in "show ssl ca-file", once the buffer is full, the
iohandler is not called again to output the next entries.

- The CLI API is kind of limited with this, since it does not allow to
  add or remove a entry in a particular ca-file. And with a lot of
  CAs you can't push them all in a buffer. It probably needs a "add ssl
  ca-file" like its done with the crt-list.

Fix issue #1476.
2022-04-01 20:36:38 +02:00
Willy Tarreau
fcce0e1e09 OPTIM: hpack: read 32 bits at once when possible.
As suggested in the comment, it's possible to read 32 bits at once in
big-endian order, and now we have the functions to do this, so let's
do it. This reduces the code on the fast path by 31 bytes on x86, and
more importantly performs single-operation 32-bit reads instead of
playing with shifts and additions.
2022-04-01 17:29:06 +02:00
Willy Tarreau
17b4687a89 CLEANUP: hpack: be careful about integer promotion from uint8_t
As reported in issue #1635, there's a subtle sign change when shifting
a uint8_t value to the left because integer promotion first turns any
smaller type to signed int *even if it was unsigned*. A warning was
reported about uint8_t shifted left 24 bits that couldn't fit in int
due to this.

It was verified that the emitted code didn't change, as expected, but
at least this allows to silence the code checkers. There's no need to
backport this.
2022-04-01 17:29:06 +02:00
Frédéric Lécaille
eb2a2da67c BUG/MINOR: quic: Missing TX packet deallocations
Ensure all TX packets are deallocated. There may be remaining ones which
will never be acknowledged or deemed lost.
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
64670884ba BUG/MINOR: quic: Missing ACK range deallocations
free_quic_arngs() was implemented but not used. Let's call it from
quic_conn_release().
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
96fd1633e1 BUG/MINOR: quic: QUIC TLS secrets memory leak
We deallocate these secrets from quic_conn_release().
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
000162e1d6 BUG/MINOR: h3: Missing wait event struct field initialization
This one has been detected by valgrind:
==2179331== Conditional jump or move depends on uninitialised value(s)
==2179331==    at 0x1B6EDE: qcs_notify_recv (mux_quic.c:201)
==2179331==    by 0x1A17C5: qc_handle_uni_strm_frm (xprt_quic.c:2254)
==2179331==    by 0x1A1982: qc_handle_strm_frm (xprt_quic.c:2286)
==2179331==    by 0x1A2CDB: qc_parse_pkt_frms (xprt_quic.c:2550)
==2179331==    by 0x1A6068: qc_treat_rx_pkts (xprt_quic.c:3463)
==2179331==    by 0x1A6C3D: quic_conn_app_io_cb (xprt_quic.c:3589)
==2179331==    by 0x3AA566: run_tasks_from_lists (task.c:580)
==2179331==    by 0x3AB197: process_runnable_tasks (task.c:883)
==2179331==    by 0x357E56: run_poll_loop (haproxy.c:2750)
==2179331==    by 0x358366: run_thread_poll_loop (haproxy.c:2921)
==2179331==    by 0x3598D2: main (haproxy.c:3538)
==2179331==
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
b823bb7f7f MINOR: quic: Add traces about list of frames
This should be useful to have an idea of the list of frames which could be built
towards the list of available frames when building packets.
Same thing about the frames which could not be built because of a lack of room
in the TX buffer.
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
6c01b74ffa MINOR: quic: Useless call to SSL_CTX_set_default_verify_paths()
This call to SSL_CTX_set_default_verify_paths() is useless for haproxy.
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
12fd259363 BUG/MINOR: quic: Too much prepared retransmissions due to anti-amplification
We must not re-enqueue frames if we can detect in advance they will not be
transmitted due to the anti-amplification limit.
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
009016c0cd BUG/MINOR: quic: Non duplicated frames upon fast retransmission
We must duplicate the frames to be sent again from packets which are not deemed
lost.
2022-04-01 16:26:06 +02:00
Frédéric Lécaille
5cfb4edca7 BUG/MINOR: quic: Do not probe from an already probing packet number space
During a handshake, after having prepared a probe upon a PTO expiration from
process_timer(), we wake up the I/O handler to make it send probing packets.
This handler first treat incoming packets  which trigger a fast retransmission
leading to send too much probing (duplicated) packets. In this cas we cancel
the fast retranmission.
2022-04-01 16:26:05 +02:00
Frédéric Lécaille
03235d78ae MINOR: quic: Do not display any timer value from process_timer()
This is confusing to display the connection timer from this function as it is not
supposed to update it. Only qc_set_timer() should do that.
2022-04-01 16:22:52 +02:00
Frédéric Lécaille
05bd92bbc5 BUG/MINOR: quic: Discard Initial packet number space only one time
When discarding a packet number space, we at least reset the PTO backoff counter.
Doing this several times have an impact on the PTO duration calculation.
We must not discard a packet number space several times (this is already the case
for the handshake packet number space).
2022-04-01 16:22:52 +02:00
Frédéric Lécaille
d6570e1789 BUG/MINOR: quic: Missing probing packets when coalescing
Before having a look at the next encryption level to build packets if there is
no more ack-eliciting frames to send we must check we have not to probe from
the current encryption level anymore. If not, we only send one datagram instead
of sending two datagrams giving less chance to recover from packet loss.
2022-04-01 16:22:52 +02:00
Frédéric Lécaille
b002145e9f MEDIUM: quic: Send ACK frames asap
Due to a erroneous interpretation of the RFC 9000 (quic-transport), ACKs frames
were always sent only after having received two ack-eliciting packets.
This could trigger useless retransmissions for tail packets on the peer side.
For now on, we send as soon as possible ACK frames as soon as we have ACK to send,
in the same packets as the ack-eliciting frame packets, and we also send ACK
frames after having received 2 ack-eliciting packets since the last time we sent
an ACK frame with other ack-eliciting frames.
2022-04-01 16:22:52 +02:00
Frédéric Lécaille
205e4f359e CLEANUP: quic: Remove all atomic operations on packet number spaces
As such variables are handled by the QUIC connection I/O handler which runs
always on the thread, there is no need to continue to use such atomic operations
2022-04-01 16:22:47 +02:00
Frédéric Lécaille
fc79006c92 CLEANUP: quic: Remove all atomic operations on quic_conn struct
As the QUIC connections are always handled by the same thread there is no need
anymore to continue to use atomic operations on such variables.
2022-04-01 16:22:44 +02:00
Frédéric Lécaille
f44d19eb91 BUG/MEDIUM: quic: Possible crash in ha_quic_set_encryption_secrets()
This bug has come with this commit:
   1fc5e16c4 MINOR: quic: More accurate immediately close
As mentionned in this commit we do not want to derive anymore secret when in closing
state. But the flag which denote secrets were derived was set. Add a label at
the correct flag to skip the secrets derivation without setting this flag.
2022-04-01 16:22:40 +02:00
Willy Tarreau
413713f02a BUG/MAJOR: mux_pt: always report the connection error to the conn_stream
Over time we've tried hard to abstract connection errors from the upper
layers so that they're reported per stream and not per connection. As
early as 1.8-rc1, commit 4ff3b8964 ("MINOR: connection: make conn_stream
users also check for per-stream error flag") did precisely this, but
strangely only for rx, not for tx (probably that by then send errors
were not imagined to be reported that way).

And this lack of Tx error check was just revealed in 2.6 by recent commit
d1480cc8a ("BUG/MEDIUM: stream-int: do not rely on the connection error
once established") that causes wakeup loops between si_cs_send() failing
to send via mux_pt_snd_buf() and subscribing against si_cs_io_cb() in
loops because the function now rightfully only checks for CS_FL_ERROR
and not CO_FL_ERROR.

As found by Amaury, this causes aborted "show events -w" to cause
haproxy to loop at 100% CPU.

This fix theoretically needs to be backported to all versions, though
it will be necessary and sufficient to backport it wherever 4ff3b8964
gets backported.
2022-03-31 16:55:52 +02:00
Willy Tarreau
c40c407268 BUG/MINOR: cli/stream: fix "shutdown session" to iterate over all threads
The list of streams was modified in 2.4 to become per-thread with commit
a698eb673 ("MINOR: streams: use one list per stream instead of a global
one"). However the change applied to cli_parse_shutdown_session() is
wrong, as it uses the nullity of the stream pointer to continue on next
threads, but this one is not null once the list_for_each_entry() loop
is finished, it points to the list's head again, so the loop doesn't
check other threads, and no message is printed either to say that the
stream was not found.

Instead we should check if the stream is equal to the requested pointer
since this is the condition to break out of the loop.

Thus must be backported to 2.4. Thanks to Maciej Zdeb for reporting this.
2022-03-31 15:00:45 +02:00
Amaury Denoyelle
d8e680cbaf MEDIUM: mux-quic: remove qcs tree node
The new qc_stream_desc type has a tree node for storage. Thus, we can
remove the node in the qcs structure.

When initializing a new stream, it is stored into the qcc streams_by_id
tree. When the MUX releases it, it will freed as soon as its buffer is
emptied. Before this, the quic-conn is responsible to store it inside
its own streams_by_id tree.
2022-03-30 16:26:59 +02:00
Amaury Denoyelle
7272cd76fc MEDIUM: quic: move transport fields from qcs to qc_conn_stream
Move the xprt-buf and ack related fields from qcs to the qc_stream_desc
structure. In exchange, qcs has a pointer to the low-level stream. For
each new qcs, a qc_stream_desc is automatically allocated.

This simplify the transport layer by removing qcs/mux manipulation
during ACK frame parsing. An additional check is done to not notify the
MUX on sending if the stream is already released : this case may now
happen on retransmission.

To complete this change, the quic_stream frame now references the
quic_stream instance instead of a qcs.
2022-03-30 16:19:48 +02:00
Amaury Denoyelle
5c3859c509 MINOR: quic: implement stream descriptor for transport layer
Currently, the mux qcs streams manage the Tx buffering, even after
sending it to the transport layer. Buffers are emptied when
acknowledgement are treated by the transport layer. This complicates the
MUX liberation and we may loose some data after the MUX free.

Change this paradigm by moving the buffering on the transport layer. For
this goal, a new type is implemented as low-level stream at the
transport layer, as a counterpart of qcs mux instances. This structure
is called qc_stream_desc. This will allow to free the qcs/qcc instances
without having to wait for acknowledge reception.

For the moment, the quic-conn is responsible to store the qc_stream_desc
in a new tree named streams_by_id. This will sligthly change in the next
commits to remove the qcs node which has a similar purpose :
qc_stream_desc instances will be shared between the qcc MUX and the
quic-conn.

This patch only introduces the new type definition and the function to
manipulate it. The following commit will bring the rearchitecture in the
qcs structure.
2022-03-30 16:16:07 +02:00
Amaury Denoyelle
95e50fbeff CLEANUP: quic: complete comment on qcs_try_to_consume
Specify the return value usage.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
f89094510c BUG/MINOR: mux-quic: ensure to free all qcs on MUX release
Remove qcs instances left during qcc MUX release. This can happen when
the MUX is closed before the completion of all the transfers, such as on
a timeout or process termination.

This may free some memory leaks on the connection.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
8347f27221 BUG/MINOR: h3: release resources on close
Implement the release app-ops ops for H3 layer. This is used to clean up
uni-directional streams and the h3 context.

This prevents a memory leak on H3 resources for each connection.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
cbc13b71c6 MINOR: mux-quic: define release app-ops
Define a new callback release inside qcc_app_ops. It is called when the
qcc MUX is freed via qc_release. This will allows to implement cleaning
on the app layer.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
dccbd733f0 MINOR: mux-quic: reorganize qcs free
Regroup some cleaning operations inside a new function qcs_free. This
can be used for all streams, both through qcs_destroy and with
uni-directional streams.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
50742294f5 MINOR: mux-quic: return qcs instance from qcc_get_qcs
Refactoring on qcc_get_qcs : return the qcs instance instead of the tree
node. This is useful to hide some eb64_entry macros for better
readability.
2022-03-30 16:12:18 +02:00
Amaury Denoyelle
8d5def0bab BUG/MEDIUM: quic: do not use qcs from quic_stream on ACK parsing
The quic_stream frame stores the qcs instance. On ACK parsing, qcs is
accessed to clear the stream buffer. This can cause a segfault if the
MUX or the qcs is already released.

Consider the following scenario :

1. a STREAM frame is generated by the MUX
   transport layer emits the frame with PKN=1
   upper layer has finished the transfer so related qcs is detached

2. transport layer reemits the frame with PKN=2 because ACK was not
   received

3. ACK for PKN=1 is received, stream buffer is cleared
   at this stage, qcs may be freed by the MUX as it is detached

4. ACK for PKN=2 is received
   qcs for STREAM frame is dereferenced which will lead to a crash

To prevent this, qcs is never accessed from the quic_stream during ACK
parsing. Instead, a lookup is done on the MUX streams tree. If the MUX
is already released, no lookup is done. These checks prevents a possible
segfault.

This change may have an impact on the perf as now we are forced to use a
tree lookup operation. If this is the case, an alternative solution may
be to implement a refcount on qcs instances.
2022-03-30 16:12:18 +02:00
William Lallemand
5213918fe2 BUILD: ssl/lua: CacheCert needs OpenSSL
Return an lua error when trying to use CacheCert.set() and OpenSSL was
not used to build HAProxy.
2022-03-30 15:05:42 +02:00
William Lallemand
30fcca18a5 MINOR: ssl/lua: CertCache.set() allows to update an SSL certificate file
The CertCache.set() function allows to update an SSL certificate file
stored in the memory of the HAProxy process. This function does the same
as "set ssl cert" + "commit ssl cert" over the CLI.

This could be used to update the crt and key, as well as the OCSP, the
SCTL, and the OSCP issuer.

The implementation does yield every 10 ckch instances, the same way the
"commit ssl cert" do.
2022-03-30 14:56:10 +02:00
William Lallemand
26654e7a59 MINOR: ssl: add "crt" in the cert_exts array
The cert_exts array does handle "crt" the default way, however
you might stil want to look for these extensions in the array.
2022-03-30 14:55:53 +02:00
William Lallemand
e60c7d6e59 MINOR: ssl: export ckch_inst_rebuild()
ckch_inst_rebuild() will be needed to regenerate the ckch instances from
the lua code, we need to export it.
2022-03-30 12:18:16 +02:00
William Lallemand
ff8bf988b9 MINOR: ssl: simplify the certificate extensions array
Simplify the "cert_exts" array which is used for the selection of the
parsing function depending on the extension.

It now uses a pointer to an array element instead of an index, which is
simplier for the declaration of the array.

This way also allows to have multiple extension using the same type.
2022-03-30 12:18:16 +02:00
William Lallemand
aaacc7e8ad MINOR: ssl: move the cert_exts and the CERT_TYPE enum
Move the cert_exts declaration and the CERT_TYPE enum in the .h in order
to reuse them in another file.
2022-03-30 12:18:16 +02:00
William Lallemand
3b5a3a6c03 MINOR: ssl: split the cert commit io handler
Extract the code that replace the ckch_store and its dependencies into
the ckch_store_replace() function.

This function must be used under the global ckch lock.
It frees everything related to the old ckch_store.
2022-03-30 12:18:16 +02:00
William Lallemand
7177a95f89 MEDIUM: httpclient/lua: be stricter with httpclient parameters
Checks the argument passed to the httpclient send functions so we don't
add a mispelled argument.
2022-03-30 12:18:16 +02:00
Willy Tarreau
3f0b2e85f3 MINOR: services: alphabetically sort service names
Note that we cannot reuse dump_act_rules() because the output format
may be adjusted depending on the call place (this is also used from
haproxy -vv). The principle is the same however.
2022-03-30 12:12:44 +02:00
Willy Tarreau
0f99637353 MINOR: filters: alphabetically sort the list of filter names
There are very few but they're registered from constructors, hence
in a random order. The scope had to be copied when retrieving the
next keyword. Note that this also has the effect of listing them
sorted in haproxy -vv.
2022-03-30 12:08:00 +02:00
Willy Tarreau
4465171593 MINOR: cli: alphanumerically sort the dump of supported commands
Like for previous keyword classes, we're sorting the output. But this
time as it's not trivial to do it with multiple words, instead we're
proceeding like the help command, we sort them on their usage message
when present, and fall back to the first word of the command when there
is no usage message (e.g. "help" command).
2022-03-30 12:02:35 +02:00
Willy Tarreau
800bd78944 MINOR: acl: alphanumerically sort the ACL dump
The mechanism is similar to others. We take care of sorting on the
keyword only and not the fetch_kw which is not unique.
2022-03-30 11:49:59 +02:00
Willy Tarreau
5e0f90a930 MINOR: sample: alphanumerically sort sample & conv keyword dumps
It's much more convenient to sort these keywords on output to detect
changes, and it's easy to do. The patch looks big but most of it is
only caused by an indent change in the loop, as "git diff -b" shows.
2022-03-30 11:30:36 +02:00
Willy Tarreau
d905826000 MINOR: config: alphanumerically sort config keywords output
The output produced by dump_registered_keywords() really deserves to be
sorted in order to ease comparisons. The function now implements a tiny
sorting mechanism that's suitable for each two-level list, and makes
use of dump_act_rules() to dump rulesets. The code is not significantly
more complicated and some parts (e.g options) could even be factored.
The output is much more exploitable to detect differences now.
2022-03-30 11:21:32 +02:00
Willy Tarreau
2100b383ab MINOR: action: add a function to dump the list of actions for a ruleset
The new function dump_act_rules() now dumps the list of actions supported
by a ruleset. These actions are alphanumerically sorted first so that the
produced output is easy to compare.
2022-03-30 11:19:22 +02:00
Willy Tarreau
3ff476e9ef MINOR: tools: add strordered() to check whether strings are ordered
When trying to sort sets of strings, it's often needed to required to
compare 3 strings to see if the chosen one fits well between the two
others. That's what this function does, in addition to being able to
ignore extremities when they're NULL (typically for the first iteration
for example).
2022-03-30 10:02:56 +02:00
Willy Tarreau
29d799d591 MINOR: sample: list registered sample converter functions
Similar to the sample fetch keywords, let's also list the converter
keywords. They're much simpler since there's no compatibility matrix.
Instead the input and output types are listed. This is called by
dump_registered_keywords() for the "cnv" keywords class.
2022-03-29 18:01:37 +02:00
Willy Tarreau
f78813f74f MINOR: samples: add a function to list register sample fetch keywords
New function smp_dump_fetch_kw lists registered sample fetch keywords
with their compatibility matrix, mandatory and optional argument types,
and output types. It's called from dump_registered_keywords() with class
"smp".
2022-03-29 18:01:37 +02:00
Willy Tarreau
6ff7d1b9a5 MINOR: acl: add a function to dump the list of known ACL keywords
New function acl_dump_kwd() dumps the registered ACL keywords and their
sample-fetch equivalent to stdout. It's called by dump_registered_keywords()
for keyword class "acl".
2022-03-29 18:01:37 +02:00
Willy Tarreau
06d0e2e034 MINOR: cli: add a new keyword dump function
New function cli_list_keywords() scans the list of registered CLI keywords
and dumps them on stdout. It's now called from dump_registered_keywords()
for the class "cli".

Some keywords are valid for the master, they'll be suffixed with
"[MASTER]". Others are valid for the worker, they'll have "[WORKER]".
Those accessible only in expert mode will show "[EXPERT]" and the
experimental ones will show "[EXPERIM]".
2022-03-29 18:01:37 +02:00
Willy Tarreau
5fcc100d91 MINOR: services: extend list_services() to dump to stdout
When no output stream is passed, stdout is used with one entry per line,
and this is called from dump_registered_services() when passed the class
"svc".
2022-03-29 18:01:37 +02:00
Willy Tarreau
3b65e14842 MINOR: filters: extend flt_dump_kws() to dump to stdout
When passing a NULL output buffer the function will now dump to stdout
with a more compact format that is more suitable for machine processing.

An entry was added to dump_registered_keyword() to call it when the
keyword class "flt" is requested.
2022-03-29 18:01:37 +02:00
Willy Tarreau
ca1acd6080 MINOR: config: add a function to dump all known config keywords
All registered config keywords that are valid in the config parser are
dumped to stdout organized like the regular sections (global, listen,
etc). Some keywords that are known to only be valid in frontends or
backends will be suffixed with [FE] or [BE].

All regularly registered "bind" and "server" keywords are also dumped,
one per "bind" or "server" line. Those depending on ssl are listed after
the "ssl" keyword. Doing so required to export the listener and server
keyword lists that were static.

The function is called from dump_registered_keywords() for keyword
class "cfg".
2022-03-29 18:01:32 +02:00
Willy Tarreau
76871a4f8c MINOR: management: add some basic keyword dump infrastructure
It's difficult from outside haproxy to detect the supported keywords
and syntax. Interestingly, many of our modern keywords are enumerated
since they're registered from constructors, so it's not very hard to
enumerate most of them.

This patch creates some basic infrastructure to support dumping existing
keywords from different classes on stdout. The format will differ depending
on the classes, but the idea is that the output could easily be passed to
a script that generates some simple syntax highlighting rules, completion
rules for editors, syntax checkers or config parsers.

The principle chosen here is that if "-dK" is passed on the command-line,
at the end of the parsing the registered keywords will be dumped for the
requested classes passed after "-dK". Special name "help" will show known
classes, while "all" will execute all of them. The reason for doing that
after the end of the config processor is that it will also enumerate
internally-generated keywords, Lua or even those loaded from external
code (e.g. if an add-on is loaded using LD_PRELOAD). A typical way to
call this with a valid config would be:

    ./haproxy -dKall -q -c -f /path/to/config

If there's no config available, feeding /dev/null will also do the job,
though it will not be able to detect dynamically created keywords, of
course.

This patch also updates the management doc.

For now nothing but the help is listed, various subsystems will follow
in subsequent patches.
2022-03-29 17:55:54 +02:00
Willy Tarreau
0d71d2f4fa BUG/MINOR: samples: add missing context names for sample fetch functions
In 2.4, two commits added support for supporting sample fetch calls from
new config and CLI contexts, but these were not added to the visibile
names, which may possibly cause "(null)" to appear in some error messages.
The commit in question were:
  db5e0dbea ("MINOR: sample: add a new CLI_PARSER context for samples")
  f9a7a8fd8 ("MINOR: sample: add a new CFG_PARSER context for samples")

This patch needs to be backported where these are present (2.4 and above).
2022-03-29 17:55:54 +02:00
Christopher Faulet
b4f96eda56 BUG/MINOR: log: Initialize the list element when allocating a new log server
211ea252d ("BUG/MINOR: logs: fix logsrv leaks on clean exit") introduced a
regression because the list element of a new log server is not intialized. Thus
HAProxy crashes on error path when an invalid log server is released.

This patch shoud fix the issue #1636. It must be backported if the above commit
is backported. For now, it is 2.6-specific and no backport is needed.
2022-03-29 14:17:10 +02:00
Christopher Faulet
744451c7c4 BUG/MEDIUM: mux-h1: Properly detect full buffer cases during message parsing
When the destination buffer is full while there are still data to parse, the
h1s must be marked as congested to be able to restart the parsing
later. This work on headers and data parsing. But on trailers parsing, we
fail to do so when the buffer is full before to parse the trailers. In this
case, we skip the trailers parsing but the h1s is not marked as
congested. This is important to be sure to wake up the mux to restart the
parsing when some room is made in the buffer.

Because of this bug, the message processing may hang till a timeout is
triggered. Note that for 2.3 and 2.2, the EOM processing is buggy too, for
the same reason. It should be fixed too on these versions. On the 2.0, only
trailers parsing is affected.

This patch must be backported as far as 2.0. On 2.3 and 2.2, the EOM parsing
must be fixed too.
2022-03-28 16:19:04 +02:00
Christopher Faulet
d9fc12818e BUG/MEDIUM: mux-fcgi: Properly handle return value of headers/trailers parsing
h1_parse_msg_hdrs() and h1_parse_msg_tlrs() may return negative values if
the parsing fails or if more space is needed in the destination buffer. When
h1-htx was changed, The H1 mux was updated accordingly but not the FCGI
mux. Thus if a negative value is returned, it is ignored and it is casted to
a size_t, leading to an integer overflow on the <ofs> value, used to know
the position in the RX buffer.

This patch must be backported as far as 2.2.
2022-03-28 15:56:17 +02:00
William Lallemand
3d7a9186dd BUG/MINOR: tools: url2sa reads too far when no port nor path
url2sa() still have an unfortunate case where it reads 1 byte too far,
it happens when no port or path are specified in the URL, and could
crash if the byte after the URL is not allocated (mostly with ASAN).

This case is never triggered in old versions of haproxy because url2sa
is used with buffers which are way bigger than the URL. It is only
triggered with the httpclient.

Should be bacported in every stable branches.
2022-03-25 17:48:28 +01:00
Amaury Denoyelle
d8769d1d87 CLEANUP: h3: suppress by default stdout traces
H3_DEBUG definition is removed from h3.c similarly to the commit
  d96361b270
  CLEANUP: qpack: suppress by default stdout traces

Also, a plain fprintf in h3_snd_buf has been replaced to be conditional
to the H3_DEBUG definition.

These changes reduces the default output on stdout with QUIC traffic.
2022-03-25 15:30:23 +01:00
Amaury Denoyelle
d96361b270 CLEANUP: qpack: suppress by default stdout traces
Remove the definition of DEBUG_HPACK on qpack-dec.c which forces the
QPACK decoding traces on stderr. Also change the name to use a dedicated
one for QPACK decoding as DEBUG_QPACK.
2022-03-25 15:22:40 +01:00
Amaury Denoyelle
18a10d07b6 BUILD: qpack: fix unused value when not using DEBUG_HPACK
If the macro is not defined, some local variables are flagged as unused
by the compiler. Fix this by using the __maybe_unused attribute.

For now, the macro is defined in the qpack-dec.c. However, this will
change to not mess up the stderr output of haproxy with QUIC traffic.
2022-03-25 15:21:45 +01:00
Amaury Denoyelle
251eadfce5 MINOR: mux-quic: activate qmux traces on stdout via macro
This commit is similar to the following one :
  commit 118b2cbf84
  MINOR: quic: activate QUIC traces at compilation

If the macro ENABLE_QUIC_STDOUT_TRACES is defined, qmux traces are
outputted automatically on stdout. This is useful for the haproxy-qns
interop docker image.
2022-03-25 14:51:14 +01:00
Amaury Denoyelle
fdcec3644a MINOR: mux-quic: add trace event for qcs_push_frame
Add a new qmux trace event QMUX_EV_QCS_PUSH_FRM. Its only purpose is to
display the meaningful result of a qcs_push_frame invocation.

A dedicated struct qcs_push_frm_trace_arg is defined to pass a series of
extra args for the trace output.
2022-03-25 14:51:14 +01:00
Amaury Denoyelle
fa29f33f2c MINOR: mux-quic: add trace event for frame sending
Define a new qmux event QMUX_EV_SEND_FRM. This allows to pass a
quic_frame as an extra argument. Depending on the frame type, a special
format can be used to log the frame content.

Currently this event is only used in qc_send_max_streams. Thus the
handler is only able to handle MAX_STREAMS frames.
2022-03-25 14:51:14 +01:00
Amaury Denoyelle
4f137577d7 MINOR: mux-quic: replace printfs by traces
Convert all printfs in the mux-quic code with traces.

Note that some meaningul printfs were not converted because they use
extra args in a format-string. This is the case inside qcs_push_frame
and qc_send_max_streams. A dedicated trace event should be implemented
for them to be able to display the extra arguments.
2022-03-25 14:51:14 +01:00
Amaury Denoyelle
dd4fbfb5b4 MINOR: mux-quic: declare the qmux trace module
Declare a new trace module for mux-quic named qmux. It will be used to
convert all printf to regular traces. The handler qmux_trace can uses a
connection and a qcs instance as extra arguments.
2022-03-25 14:51:10 +01:00
Amaury Denoyelle
0c2d964280 REORG: quic: use a dedicated quic_loss.c
Move all inline functions with trace from quic_loss.h to a dedicated
object file. This let to remove the TRACE_SOURCE macro definition
outside of the include file.

This change is required to be able to define another TRACE_SOUCE inside
the mux_quic.c for a dedicated trace module.
2022-03-25 14:45:45 +01:00
Amaury Denoyelle
777969c163 BUILD: quic: add missing includes
Complete the include list for the files quic_loss.h and quic_sock.c.
2022-03-25 14:45:45 +01:00
Amaury Denoyelle
9296091cf7 MINOR: mux-quic: convert fin on push-frame as boolean
This is only useful to display a clear 0/1 value in the traces. This has
no impact beyond this cosmetic change.
2022-03-25 14:45:45 +01:00
William Lallemand
b938b77ade BUG/MINOR: tools: fix url2sa return value with IPv4
Fix 8a91374 ("BUG/MINOR: tools: url2sa reads ipv4 too far") introduced a
regression in the value returned when parsing an ipv4 host.

Tthe consumed length is supposed to be as far as the first character of
the path, only its not computed correctly anymore and return the length
minus the size of the scheme.

Fixed the issue by reverting 'curr' and 'url' as they were before the
patch.

Must be backported in every stable branch where the 8a91374 patch was
backported.
2022-03-25 11:49:27 +01:00
Frédéric Lécaille
cc2764e7fe BUG/MINOR: quic: Wrong buffer length passed to generate_retry_token()
After having consumed <i> bytes from <buf>, the remaining available room to be
passed to generate_retry_token() is sizeof(buf) - i.
This bug could be easily reproduced with quic-qo as client which chooses a random
value as ODCID length.
2022-03-23 17:16:20 +01:00
Willy Tarreau
0c3205a541 BUILD: stream-int: avoid a build warning when DEBUG is empty
When no DEBUG_STRICT is enabled, we get this build warning:

  src/stream_interface.c: In function 'stream_int_chk_snd_conn':
  src/stream_interface.c:1198:28: warning: unused variable 'conn' [-Wunused-variable]
   1198 |         struct connection *conn = cs_conn(cs);
        |                            ^~~~

This was the result of the simplification of the code in commit
d1480cc8a ("BUG/MEDIUM: stream-int: do not rely on the connection error
once established") which removed the last user of this variable outside
of a BUG_ON().

If the patch above is backported, this one should be backported as well.
2022-03-23 11:15:49 +01:00
Amaury Denoyelle
1e5e5136ee MINOR: mux-quic: support MAX_DATA frame parsing
This commit is similar to the previous one but with MAX_DATA frames.
This allows to increase the connection level flow-control limit. If the
connection was blocked due to QC_CF_BLK_MFCTL flag, the flag is reseted.
2022-03-23 10:14:14 +01:00
Amaury Denoyelle
8727ff4668 MINOR: mux-quic: support MAX_STREAM_DATA frame parsing
Implement a MUX method to parse MAX_STREAM_DATA. If the limit is greater
than the previous one and the stream was blocked, the flag
QC_SF_BLK_SFCTL is removed.
2022-03-23 10:09:39 +01:00
Amaury Denoyelle
05ce55e582 MEDIUM: mux-quic: respect peer connection data limit
This commit is similar to the previous one, but this time on the
connection level instead of the stream.

When the connection limit is reached, the connection is flagged with
QC_CF_BLK_MFCTL. This flag is checked in qc_send.

qcs_push_frame uses a new parameter which is used to not exceed the
connection flow-limit while calling it repeatdly over multiple streams
instance before transfering data to the transport layer.
2022-03-23 10:05:29 +01:00
Amaury Denoyelle
6ea781919a MEDIUM: mux-quic: respect peer bidirectional stream data limit
Implement the flow-control max-streams-data limit on emission. We ensure
that we never push more than the offset limit set by the peer. When the
limit is reached, the stream is marked as blocked with a new flag
QC_SF_BLK_SFCTL to disable emission.

Currently, this is only implemented for bidirectional streams. It's
required to unify the sending for unidirectional streams via
qcs_push_frame from the H3 layer to respect the flow-control limit for
them.
2022-03-23 10:05:29 +01:00
Amaury Denoyelle
78396e5ee8 MINOR: mux-quic: use shorter name for flow-control fields
Rename the fields used for flow-control in the qcc structure. The
objective is to have shorter name for better readability while keeping
their purpose clear. It will be useful when the flow-control will be
extended with new fields.
2022-03-23 10:05:29 +01:00
Amaury Denoyelle
75d14ad5cb MINOR: mux-quic: add comments for send functions
Add comments on qc_send and qcs_push_frame. Also adjust the return of
qc_send to reflect the total bytes sent. This has no impact as currently
the return value is not checked by the caller.
2022-03-23 10:05:29 +01:00
Amaury Denoyelle
ac74aa531d MINOR: mux-quic: complete trace when stream is not found
Display the ID of the stream not found. This will help to detect when we
received retransmitted frames for an already closed stream.
2022-03-23 09:49:08 +01:00
Amaury Denoyelle
e0320b8aa6 CLEANUP: mux-quic: change comment style to not mess with git conflict
Remove "=======" symbols from the MUX buffer diagram. This is useful to
not mess with git conflict markers when resolving a conflict.
2022-03-23 09:48:43 +01:00
Frédéric Lécaille
aaf1f19e8b MINOR: quic: Add traces in qc_set_timer() (scheduling)
This should be helpful to diagnose some issues: timer task not
run when it should run.
2022-03-23 09:01:45 +01:00
Frédéric Lécaille
ce69cbc520 MINOR: quic: Add traces about stream TX buffer consumption
This will be helpful to diagnose STREAM blocking states.
2022-03-23 09:01:45 +01:00
Dhruv Jain
1295798139 MEDIUM: mqtt: support mqtt_is_valid and mqtt_field_value converters for MQTTv3.1
In MQTTv3.1, protocol name is "MQIsdp" and protocol level is 3. The mqtt
converters(mqtt_is_valid and mqtt_field_value) did not work for clients on
mqttv3.1 because the mqtt_parse_connect() marked the CONNECT message invalid
if either the protocol name is not "MQTT" or the protocol version is other than
v3.1.1 or v5.0. To fix it, we have added the mqttv3.1 protocol name and version
as part of the checks.

This patch fixes the mqtt converters to support mqttv3.1 clients as well (issue #1600).
It must be backported to 2.4.
2022-03-22 09:25:52 +01:00
Frédéric Lécaille
411aa6daf5 BUG/MINOR: quic: Non initialized variable in quic_build_post_handshake_frames()
<cid> could be accessed before being initialized.
2022-03-21 14:30:23 +01:00
Frédéric Lécaille
44ae75220a BUG/MINOR: quic: Incorrect peer address validation
We must consider the peer address as validated as soon as we received an
handshake packet. An ACK frame in handshake packet was too restrictive.
Rename the concerned flag to reflect this situation.
2022-03-21 14:27:09 +01:00
Frédéric Lécaille
12aa26b6fd BUG/MINOR: quic: 1RTT packets ignored after mux was released
We must be able to handle 1RTT packets after the mux has terminated its job
(qc->mux_state == QC_MUX_RELEASED). So the condition (qc->mux_state != QC_MUX_READY)
in qc_qel_may_rm_hp() is not correct when we want to wait for the mux to be started.
Add a check in qc_parse_pkt_frms() to ensure is started before calling it. All
the STREAM frames will be ignored when the mux will be released.
2022-03-21 14:27:09 +01:00
Frédéric Lécaille
2899fe2460 BUG/MINOR: quic: Missing TX packet initializations
The most important one is the ->flags member which leads to an erratic xprt behavior.
For instance a non ack-eliciting packet could be seen as ack-eliciting leading the
xprt to try to retransmit a packet which are not ack-eliciting. In this case, the
xprt does nothing and remains indefinitively in a blocking state.
2022-03-21 14:27:09 +01:00
Frédéric Lécaille
f27b66faee BUG/MINOR: mux-quic: Missing I/O handler events initialization
This could lead to a mux erratic behavior. Sometimes the application layer could
not wakeup the mux I/O handler because it estimated it had already subscribed
to write events (see h3_snd_buf() end of implementation).
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
4e22f28feb BUG/MINOR: mux-quic: Access to empty frame list from qc_send_frames()
This was revealed by libasan when each time qc_send_frames() is run at the first
time:

=================================================================
==84177==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fbaaca2b3c8 at pc 0x560a4fdb7c2e bp 0x7fbaaca2b300 sp 0x7fbaaca2b2f8
READ of size 1 at 0x7fbaaca2b3c8 thread T6
    #0 0x560a4fdb7c2d in qc_send_frames src/mux_quic.c:473
    #1 0x560a4fdb83be in qc_send src/mux_quic.c:563
    #2 0x560a4fdb8a6e in qc_io_cb src/mux_quic.c:638
    #3 0x560a502ab574 in run_tasks_from_lists src/task.c:580
    #4 0x560a502ad589 in process_runnable_tasks src/task.c:883
    #5 0x560a501e3c88 in run_poll_loop src/haproxy.c:2675
    #6 0x560a501e4519 in run_thread_poll_loop src/haproxy.c:2846
    #7 0x7fbabd120ea6 in start_thread nptl/pthread_create.c:477
    #8 0x7fbabcb19dee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfddee)

Address 0x7fbaaca2b3c8 is located in stack of thread T6 at offset 56 in frame
    #0 0x560a4fdb7f00 in qc_send src/mux_quic.c:514

  This frame has 1 object(s):
    [32, 48) 'frms' (line 515) <== Memory access at offset 56 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T6 created by T0 here:
    #0 0x7fbabd1bd2a2 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:214
    #1 0x560a5036f9b8 in setup_extra_threads src/thread.c:221
    #2 0x560a501e70fd in main src/haproxy.c:3457
    #3 0x7fbabca42d09 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: stack-buffer-overflow src/mux_quic.c:473 in qc_send_frames
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
dcc74ff792 BUG/MINOR: quic: Unsent frame because of qc_build_frms()
There are non already identified rare cases where qc_build_frms() does not manage
to size frames to be encoded in a packet leading qc_build_frm() to fail to add
such frame to the packet to be built. In such cases we must move back such
frames to their origin frame list passed as parameter to qc_build_frms(): <frms>.
because they were added to the packet frame list (but not built). If this
this packet is not retransmitted, the frame is lost for ever! Furthermore we must
not modify the buffer.
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
d64f68fb0a BUG/MINOR: quic: Possible leak in quic_build_post_handshake_frames()
Rework this function to leave the connection passed as parameter in the same state
it was before entering this function.
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
f1f812bfdb BUG/MINOR: quic: Possible crash in parse_retry_token()
We must check the decoded length of this incoming data before copying into our
internal structure. This could lead to crashes.
Reproduced with such a packet captured from QUIC interop.
    {
	    0xc5, 0x00, 0x00, 0x00, 0x01, 0x12, 0xf2, 0x65,
		0x4d, 0x9d, 0x58, 0x90, 0x23, 0x7e, 0x67, 0xef,
		0xf8, 0xef, 0x5b, 0x87, 0x48, 0xbe, 0xde, 0x7a, /* corrupted byte: 0x11, */
		0x01, 0xdc, 0x41, 0xbf, 0xfb, 0x07, 0x39, 0x9f,
		0xfd, 0x96, 0x67, 0x5f, 0x58, 0x03, 0x57, 0x74,
		0xc7, 0x26, 0x00, 0x45, 0x25, 0xdc, 0x7f, 0xf1,
		0x22, 0x1d,
	}
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
e2a1c1b372 MEDIUM: quic: Rework of the TX packets memory handling
The TX packet refcounting had come with the multithreading support but not only.
It is very useful to ease the management of the memory allocated for TX packets
with TX frames attached to. At some locations of the code we have to move TX
frames from a packet to a new one during retranmission when the packet has been
deemed as lost or not. When deemed lost the memory allocated for the paquet must
be released contrary to when its frames are retransmitted when probing (PTO).

For now on, thanks to this patch we handle the TX packets memory this way. We
increment the packet refcount when:
  - we insert it in its packet number space tree,
  - we attache an ack-eliciting frame to it.
And reciprocally we decrement this refcount when:
  - we remove an ack-eliciting frame from the packet,
  - we delete the packet from its packet number space tree.

Note that an optimization WOULD NOT be to fully reuse (without releasing its
memorya TX packet to retransmit its contents (its ack-eliciting frames). Its
information (timestamp, in flight length) to be processed by packet loss detection
and the congestion control.
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
141982a4e1 MEDIUM: quic: Limit the number of ACK ranges
When building a packet with an ACK frame, we store the largest acknowledged
packet number sent in this frame in the packet (quic_tx_packet struc).
When receiving an ack for such a packet we can purge the tree of acknowledged
packet number ranges from the range sent before this largest acknowledged
packet number.
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
8f3ae0272f CLEANUP: quic: "largest_acked_pn" pktns struc member moving
This struct member stores the largest acked packet number which was received. It
is used to build (TX) packet. But this is confusing to store it in the tx packet
of the packet number space structure even if it is used to build and transmit
packets.
2022-03-21 11:29:40 +01:00
Frédéric Lécaille
302c2b1120 MINOR: quic: Code factorization (TX buffer reuse)
Add qc_may_reuse_cbuf() function used by qc_prep_pkts() and qc_prep_app_pkts().
Simplification of the factorized section code: there is no need to check there
is enough room to mark the end of the data in the TX buf. This is done by
the callers (qc_prep_pkts() and qc_prep_app_pkts()). Add a diagram to explain
the conditions which must be verified to be able to reuse a cbuf struct.

This should improve the QUIC stack implementation maintenability.
2022-03-21 11:29:40 +01:00
Tim Duesterhus
f4f6c0f6bb CLEANUP: Reapply ist.cocci
This makes use of the newly added:

    - i.ptr = p;
    - i.len = strlen(i.ptr);
    + i = ist(p);

patch.
2022-03-21 08:30:47 +01:00
Tim Duesterhus
7750850594 CLEANUP: Reapply ist.cocci with --include-headers-for-types --recursive-includes
Previous uses of `ist.cocci` did not add `--include-headers-for-types` and
`--recursive-includes` preventing Coccinelle seeing `struct ist` members of
other structs.

Reapply the patch with proper flags to further clean up the use of the ist API.

The command used was:

    spatch -sp_file dev/coccinelle/ist.cocci -in_place --include-headers --include-headers-for-types --recursive-includes --dir src/
2022-03-21 08:30:47 +01:00
Christopher Faulet
ab398d8ff9 BUG/MINOR: http-rules: Don't free new rule on allocation failure
If allocation of a new HTTP rule fails, we must not release it calling
free_act_rule(). The regression was introduced by the commit dd7e6c6dc
("BUG/MINOR: http-rules: completely free incorrect TCP rules on error").

This patch must only be backported if the commit above is backported. It should
fix the issues #1627, #1628 and #1629.
2022-03-21 08:24:17 +01:00
Christopher Faulet
9075dbdd84 BUG/MINOR: rules: Initialize the list element when allocating a new rule
dd7e6c6dc ("BUG/MINOR: http-rules: completely free incorrect TCP rules on
error") and 388c0f2a6 ("BUG/MINOR: tcp-rules: completely free incorrect TCP
rules on error") introduced a regression because the list element of a new
rule is not intialized. Thus HAProxy crashes when an incorrect rule is
released.

This patch must be backported if above commits are backported. Note that
new_act_rule() only exists since the 2.5. It relies on the commit d535f807b
("MINOR: rules: add a new function new_act_rule() to allocate act_rules").
2022-03-21 07:55:37 +01:00
Willy Tarreau
15a4733d5d BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts
Christian Ruppert reported an issue explaining that it's not possible to
forcefully close H2 connections which do not receive requests anymore if
they continue to send control traffic (window updates, ping etc). This
will indeed refresh the timeout. In H1 we don't have this problem because
any single byte is part of the stream, so the control frames in H2 would
be equivalent to TCP acks in H1, that would not contribute to the timeout
being refreshed.

What misses from H2 is the use of http-request and keep-alive timeouts.
These were not implemented because initially it was hard to see how they
could map to H2. But if we consider the real use of the keep-alive timeout,
that is, how long do we keep a connection alive with no request, then it's
pretty obvious that it does apply to H2 as well. Similarly, http-request
may definitely be honored as soon as a HEADERS frame starts to appear
while there is no stream. This will also allow to deal with too long
CONTINUATION frames.

This patch moves the timeout update to a new function, h2c_update_timeout(),
which is in charge of this. It also adds an "idle_start" timestamp in the
connection, which is set when nb_cs reaches zero or when a headers frame
start to arrive, so that it cannot be delayed too long.

This patch should be backported to recent stable releases after some
observation time. It depends on previous patch "MEDIUM: mux-h2: slightly
relax timeout management rules".
2022-03-18 17:43:34 +01:00
Willy Tarreau
3439583dd6 MEDIUM: mux-h2: slightly relax timeout management rules
The H2 timeout rules were arranged to cover complex situations In 2.1
with commit c2ea47fb1 ("BUG/MEDIUM: mux-h2: do not enforce timeout on
long connections").

It turns out that such rules while complex, do not perfectly cover all
use cases. The real intent is to say that as long as there are attached
streams, the connection must not timeout. Then once all these streams
have quit (possibly for timeout reasons) then the mux should take over
the management of timeouts.

We do have this nb_cs field which indicates the number of attached
streams, and it's updated even when leaving orphaned streams. So
checking it alone is sufficient to know whether it's the mux or the
streams that are in charge of the timeouts.

In its current state, this doesn't cause visible effects except that
it makes it impossible to implement more subtle parsing timeouts.

This would need to be backported as far as 2.0 along with the next
commit that will depend on it.
2022-03-18 17:43:34 +01:00
Willy Tarreau
6e805dab2a BUG/MEDIUM: trace: avoid race condition when retrieving session from conn->owner
There's a rare race condition possible when trying to retrieve session from
a back connection's owner, that was fixed in 2.4 and described in commit
3aab17bd5 ("BUG/MAJOR: connection: reset conn->owner when detaching from
session list").

It also affects the trace code which does the same, so the same fix is
needed, i.e. check from conn->session_list that the connection is still
enlisted. It's visible when sending a few tens to hundreds of parallel
requests to an h2 backend and enabling traces in parallel.

This should be backported as far as 2.2 which is the oldest version
supporting traces.
2022-03-18 17:43:28 +01:00
Willy Tarreau
d1480cc8a4 BUG/MEDIUM: stream-int: do not rely on the connection error once established
Historically the stream-interface code used to check for connection
errors by itself. Later this was partially deferred to muxes, but
only once the mux is installed or the connection is at least in the
established state. But probably as a safety practice the connection
error tests remained.

The problem is that they are causing trouble on when a response received
from a mux is mixed with an error report. The typical case is an upload
that is interrupted by the server sending an error or redirect without
draining all data, causing an RST to be queued just after the data. In
this case the mux has the data, the CO_FL_ERROR flag is present on the
connection, and unfortunately the stream-interface refuses to retrieve
the data due to this flag, and return an error to the client.

It's about time to only rely on CS_FL_ERROR which is set by the mux, but
the stream-interface is still responsible for the connection during its
setup. However everywhere the CO_FL_ERROR is checked, CS_FL_ERROR is
also checked.

This commit addresses this by:
  - adding a new function si_is_conn_error() that checks the SI state
    and only reports the status of CO_FL_ERROR for states before
    SI_ST_EST.

  - eliminating all checks for CO_FL_ERORR in places where CS_FL_ERROR
    is already checked and either the presence of a mux was already
    validated or the stream-int's state was already checked as being
    SI_ST_EST or higher.

CO_FL_ERROR tests on the send() direction are also inappropriate as they
may cause the loss of pending data. Now this doesn't happen anymore and
such events are only converted to CS_FL_ERROR by the mux once notified of
the problem. As such, this must not cause the loss of any error event.

Now an early error reported on a backend mux doesn't prevent the queued
response from being read and forwarded to the client (the list of syscalls
below was trimmed and epoll_ctl is not represented):

  recvfrom(10, "POST / HTTP/1.1\r\nConnection: clo"..., 16320, 0, NULL, NULL) = 66
  sendto(11, "POST / HTTP/1.1\r\ntransfer-encodi"..., 47, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 47
  epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=11, u64=11}}], 200, 15001) = 1
  recvfrom(11, "HTTP/1.1 200 OK\r\ncontent-length:"..., 16320, 0, NULL, NULL) = 57
  sendto(10, "HTTP/1.1 200 OK\r\ncontent-length:"..., 57, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 57
  epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=11, u64=11}}], 200, 13001) = 1
  epoll_wait(3, [{events=EPOLLIN, data={u32=10, u64=10}}], 200, 13001) = 1
  recvfrom(10, "A\n0123456789\r\n0\r\n\r\n", 16320, 0, NULL, NULL) = 19
  shutdown(10, SHUT_WR)                   = 0
  close(11)                               = 0
  close(10)                               = 0

Above the server is an haproxy configured with the following:

   listen blah
        bind :8002
        mode http
        timeout connect 5s
        timeout client  5s
        timeout server  5s
        option httpclose
        option nolinger
        http-request return status 200 hdr connection close

And the client takes care of sending requests and data in two distinct
parts:

   while :; do
     ./dev/tcploop/tcploop 8001 C T S:"POST / HTTP/1.1\r\nConnection: close\r\nTransfer-encoding: chunked\r\n\r\n" P1 S:"A\n0123456789\r\n0\r\n\r\n" P R F;
   done

With this, a small percentage of the requests will reproduce the behavior
above. Note that this fix requires the following patch to be applied for
the test above to work:

  BUG/MEDIUM: mux-h1: only turn CO_FL_ERROR to CS_FL_ERROR with empty ibuf

This should be backported with after a few weeks of observation, and
likely one version at a time. During the backports, the patch might
need to be adjusted at each check of CO_FL_ERORR to follow the
principles explained above.
2022-03-18 17:00:19 +01:00
Willy Tarreau
99bbdbcc21 BUG/MEDIUM: mux-h1: only turn CO_FL_ERROR to CS_FL_ERROR with empty ibuf
A connection-level error must not be turned to a stream-level error if there
are still pending data for that stream, otherwise it can cause the truncation
of the last pending data.

This must be backported to affected releases, at least as far as 2.4,
maybe further.
2022-03-18 17:00:17 +01:00
William Lallemand
58a81aeb91 BUG/MINOR: httpclient: CF_SHUTW_NOW should be tested with channel_is_empty()
CF_SHUTW_NOW shouldn't be a condition alone to exit the io handler, it
must be tested with the emptiness of the response channel.

Must be backported to 2.5.
2022-03-18 11:34:10 +01:00
William Lallemand
1eca894321 BUG/MINOR: httpclient: process the response when received before the end of the request
A server could reply a response with a shut before the end of the htx
transfer, in this case the httpclient would leave before computing the
received response.

This patch fixes the issue by calling the "process_data" label instead of
the "more" label which don't do the si_shut.

Must be bacported in 2.5.
2022-03-18 11:34:10 +01:00
William Lallemand
a625b03e83 BUG/MINOR: httpclient: only check co_data() instead of HTTP_MSG_DATA
Checking msg >= HTTP_MSG_DATA was useful to check if we received all the
data. However it does not work correctly in case of errors because we
don't reach this state, preventing to catch the error in the httpclient.

The consequence of this problem is that we don't get the status code of
the error response upon an error.

Fix the issue by only checking co_data().

Must be backported to 2.5.
2022-03-18 11:34:10 +01:00
Willy Tarreau
dd7e6c6dc7 BUG/MINOR: http-rules: completely free incorrect TCP rules on error
When a http-request or http-response rule fails to parse, we currently
free only the rule without its contents, which makes ASAN complain.
Now that we have a new function for this, let's completely free the
rule. This relies on this commit:

  MINOR: actions: add new function free_act_rule() to free a single rule

It's probably not needed to backport this since we're on the exit path
anyway.
2022-03-17 20:29:06 +01:00
Willy Tarreau
388c0f2a63 BUG/MINOR: tcp-rules: completely free incorrect TCP rules on error
When a tcp-request or tcp-response rule fails to parse, we currently
free only the rule without its contents, which makes ASAN complain.
Now that we have a new function for this, let's completely free the
rule. Reg-tests are now completely OK with ASAN. This relies on this
commit:

  MINOR: actions: add new function free_act_rule() to free a single rule

It's probably not needed to backport this since we're on the exit path
anyway.
2022-03-17 20:26:54 +01:00
Willy Tarreau
6a783e499c MINOR: actions: add new function free_act_rule() to free a single rule
There was free_act_rules() that frees all rules from a head but nothing
to free a single rule. Currently some rulesets partially free their own
rules on parsing error, and we're seeing some regtests emit errors under
ASAN because of this.

Let's first extract the code to free a rule into its own function so
that it becomes possible to use it on a single rule.
2022-03-17 20:26:19 +01:00
Willy Tarreau
211ea252d9 BUG/MINOR: logs: fix logsrv leaks on clean exit
Log servers are a real mess because:
  - entries are duplicated using memcpy() without their strings being
    reallocated, which results in these ones not being freeable every
    time.

  - a new field, ring_name, was added in 2.2 by commit 99c453df9
    ("MEDIUM: ring: new section ring to declare custom ring buffers.")
    but it's never initialized during copies, causing the same issue

  - no attempt is made at freeing all that.

Of course, running "haproxy -c" under ASAN quickly notices that and
dumps a core.

This patch adds the missing strdup() and initialization where required,
adds a new free_logsrv() function to cleanly free() such a structure,
calls it from the proxy when iterating over logsrvs instead of silently
leaking their file names and ring names, and adds the same logsrv loop
to the proxy_free_defaults() function so that we don't leak defaults
sections on exit.

It looks a bit entangled, but it comes as a whole because all this stuff
is inter-dependent and was missing.

It's probably preferable not to backport this in the foreseable future
as it may reveal other jokes if some obscure parts continue to memcpy()
the logsrv struct.
2022-03-17 19:53:46 +01:00
William Lallemand
43c2ce4d81 BUG/MINOR: server/ssl: free the SNI sample expression
ASAN complains about the SNI expression not being free upon an haproxy
-c. Indeed the httpclient is now initialized with a sni expression and
this one is never free in the server release code.

Must be backported in 2.5 and could be backported in every stable
versions.
2022-03-16 18:03:15 +01:00
William Lallemand
715c101a19 BUILD: httpclient: fix build without SSL
src/http_client.c: In function ‘httpclient_cfg_postparser’:
src/http_client.c:1065:8: error: unused variable ‘errmsg’ [-Werror=unused-variable]
 1065 |  char *errmsg = NULL;
      |        ^~~~~~
src/http_client.c:1064:6: error: unused variable ‘err_code’ [-Werror=unused-variable]
 1064 |  int err_code = 0;
      |      ^~~~~~~~

Fix the build of the httpclient without SSL, the problem was introduced
with previous patch 71e3158 ("BUG/MINOR: httpclient: send the SNI using
the host header")

Must be backported in 2.5 as well.
2022-03-16 16:39:23 +01:00
William Lallemand
71e3158395 BUG/MINOR: httpclient: send the SNI using the host header
Generate an SNI expression which uses the Host header of the request.
This is mandatory for most of the SSL servers nowadays.

Must be backported in 2.5 with the previous patch which export
server_parse_sni_expr().
2022-03-16 15:55:30 +01:00
William Lallemand
0d05867e78 MINOR: server: export server_parse_sni_expr() function
Export the server_parse_sni_expr() function in order to create a SNI
expression in a server which was not parsed from the configuration.
2022-03-16 15:55:30 +01:00
Christopher Faulet
53fa787a07 BUG/MEDIUM: sink: Properly get the stream-int in appctx callback functions
The appctx owner is not a stream-interface anymore. It is now a
conn-stream. However, sink code was not updated accordingly. It is now
fixed.

It is 2.6-specific, no backport is needed.
2022-03-16 10:01:30 +01:00
Christopher Faulet
fe14af30ec BUG/MEDIUM: cli/debug: Properly get the stream-int in all debug I/O handlers
The appctx owner is not a stream-interface anymore. It is now a conn-stream.
In the cli I/O handler for the command "debug dev fd", we still handle it as
a stream-interface. It is now fixed.

It is 2.6-specific, no backport is needed.
2022-03-16 09:52:13 +01:00
Christopher Faulet
9affa931cd BUG/MEDIUM: applet: Don't call .release callback function twice
Since the CS/SI refactoring, the .release callback function may be called
twice. The first call when a shutdown for read or for write is performed.
The second one when the applet is detached from its conn-stream. The second
call must be guarded, just like the first one, to only be performed is the
stream-interface is not the in disconnected (SI_ST_DIS) or closed
(SI_ST_CLO) state.

To simplify the fix, we now always rely on si_applet_release() function.

It is 2.6-specific, no backport is needed.
2022-03-15 11:47:53 +01:00
William Lallemand
8f170c7fca BUG/MINOR: httpclient/lua: stuck when closing without data
The httpclient lua code is lacking the end callback, which means it
won't be able to wake up the lua code after a longjmp if the connection
was closed without any data.

Must be backported to 2.5.
2022-03-15 11:42:38 +01:00
Frdric Lcaille
e9a974a37a BUG/MAJOR: quic: Possible crash with full congestion control window
This commit reverts this one:
  "d5066dd9d BUG/MEDIUM: quic: qc_prep_app_pkts() retries on qc_build_pkt() failures"

After having filled the congestion control window, qc_build_pkt() always fails.
Then depending on the relative position of the writer and  reader indexes for the
TX buffer, this could lead this function to try to reuse the buffer even if not full.
In such case, we do not always mark the end of the data in this TX buffer. This
is something the reader cannot understand: it reads a false datagram length,
then a wrong packet address from the TX buffer, leading to an invalid pointer
dereferencing.
2022-03-15 10:38:48 +01:00
Frdric Lcaille
2ee5c8b3dd BUG/MEDIUM: quic: Blocked STREAM when retransmitted
STREAM frames which are not acknowledged in order are inserted in ->tx.acked_frms
tree ordered by the STREAM frame offset values. Then, they are consumed in order
by qcs_try_to_consume(). But, when we retransmit frames, we possibly have to
insert the same STREAM frame node (with the same offset) in this tree.
The problem is when they have different lengths. Unfortunately the restransmitted
frames are not inserted because of the tree nature (EB_ROOT_UNIQUE). If the STREAM
frame which has been successfully inserted has a smaller length than the
retransmitted ones, when it is consumed they are tailing bytes in the STREAM
(retransmitted ones) which indefinitively remains in the STREAM TX buffer
which will never properly be consumed, leading to a blocking state.

At this time this may happen because we sometimes build STREAM frames
with null lengths. But this is another issue.

The solution is to use an EB_ROOT tree to support the insertion of STREAM frames
with the same offset but with different lengths. As qcs_try_to_consume() support
the STREAM frames retransmission this modification should not have any impact.
2022-03-15 10:38:48 +01:00
William Lallemand
97f69c6fb5 BUG/MEDIUM: httpclient: must manipulate head, not first
The httpclient mistakenly use the htx_get_first{_blk}() functions instead
of the htx_get_head{_blk}() functions. Which could stop the httpclient
because it will be without the start line, waiting for data that won't never
come.

Must be backported in 2.5.
2022-03-14 15:10:12 +01:00
William Lallemand
c020b2505d BUG/MINOR: httpclient: remove the UNUSED block when parsing headers
Remove the UNUSED blocks when iterating on headers, we should not stop
when encountering one. We should only stop iterating once we found the
EOH block. It doesn't provoke a problem, since we don't manipulates
the headers before treating them, but it could evolve in the future.

Must be backported to 2.5.
2022-03-14 15:10:12 +01:00
William Lallemand
c8f1eb99b4 BUG/MINOR: httpclient: consume partly the blocks when necessary
Consume partly the blocks in the httpclient I/O handler when there is
not enough room in the destination buffer for the whole block or when
the block is not contained entirely in the channel's output.

It prevents the I/O handler to be stuck in cases when we need to modify
the buffer with a filter for exemple.

Must be backported in 2.5.
2022-03-14 15:10:12 +01:00
William Lallemand
2b7dc4edb0 BUG/MEDIUM: httpclient: don't consume data before it was analyzed
In httpclient_applet_io_handler(), on the response path, we don't check
if the data are in the output part of the channel, and could consume
them before they were analyzed.

To fix this issue, this patch checks for the stline and the headers if
the msg_state is >= HTTP_MSG_DATA which means the stline and headers
were analyzed. For the data part, it checks if each htx blocks is in the
output before copying it.

Must be backported in 2.5.
2022-03-14 15:10:12 +01:00
Amaury Denoyelle
76e8b70e43 MEDIUM: server: remove experimental-mode for dynamic servers
Dynamic servers feature is now judged to be stable enough. Remove the
experimental-mode requirement for "add/del server" commands. This should
facilitate dynamic servers adoption.
2022-03-11 14:28:28 +01:00
Amaury Denoyelle
7d098bea2b MEDIUM: check: do not auto configure SSL/PROXY for dynamic servers
For server checks, SSL and PROXY is automatically inherited from the
server settings if no specific check port is specified. Change this
behavior for dynamic servers : explicit "check-ssl"/"check-send-proxy"
are required for them.

Without this change, it is impossible to add a dynamic server with
SSL/PROXY settings and checks without, if the check port is not
explicit. This is because "no-check-ssl"/"no-check-send-proxy" keywords
are not available for dynamic servers.

This change respects the principle that dynamic servers on the CLI
should not reuse the same shortcuts used during the config file parsing.
Mostly because we expect this feature to be manipulated by automated
tools, contrary to the config file which should aim to be the shortest
possible for human readability.

Update the documentation of the "check" keyword to reflect this change.
2022-03-11 14:28:28 +01:00
Amaury Denoyelle
6ccfa3c40f MEDIUM: mux-quic: improve bidir STREAM frames sending
The current implementation of STREAM frames emission has some
limitation. Most notably when we cannot sent all frames in a single
qc_send run.

In this case, frames are left in front of the MUX list. It will be
re-send individually before other frames, possibly another frame from
the same STREAM with new data. An opportunity to merge the frames is
lost here.

This method is now improved. If a frame cannot be send entirely, it is
discarded. On the next qc_send run, we retry to send to this position. A
new field qcs.sent_offset is used to remember this. A new frame list is
used for each qc_send.

The impact of this change is not precisely known. The most notable point
is that it is a more logical method of emission. It might also improve
performance as we do not keep old STREAM frames which might delay other
streams.
2022-03-11 11:37:31 +01:00
Amaury Denoyelle
54445d04e4 MINOR: quic: implement sending confirmation
Implement a new MUX function qcc_notify_send. This function must be
called by the transport layer to confirm the sending of STREAM data to
the MUX.

For the moment, the function has no real purpose. However, it will be
useful to solve limitations on push frame and implement the flow
control.
2022-03-11 11:37:31 +01:00
Amaury Denoyelle
db5d1a1b19 MINOR: mux-quic: improve opportunistic retry sending for STREAM frames
For the moment, the transport layer function qc_send_app_pkts lacks
features. Most notably, it only send up to a single Tx buffer and won't
retry even if there is frames left and its Tx buffer is now empty.

To overcome this limitation, the MUX implements an opportunistic retry
sending mechanism. qc_send_app_pkts is repeatedly called until the
transport layer is blocked on an external condition (such as congestion
control or a sendto syscall error).

The blocking was detected by inspecting the frame list before and after
qc_send_app_pkts. If no frame has been poped by the function, we
considered the transport layer to be blocked and we stop to send. The
MUX is subscribed on the lower layer to send the frames left.

However, in case of STREAM frames, qc_send_app_pkts might use only a
portion of the data and update the frame offset. So, for STREAM frames,
a new mechanism is implemented : if the offset field of the first frame
has not been incremented, it means the transport layer is blocked.

This should improve transfers execution. Before this change, there is a
possibility of interrupted transfer if the mux has not sent everything
possible and is waiting on a transport signaling which will never
happen.

In the future, qc_send_app_pkts should be extended to retry sending by
itself. All this code burden will be removed from the MUX.
2022-03-11 11:37:31 +01:00
Amaury Denoyelle
e2ec9421ea MINOR: mux-quic: prevent push frame for unidir streams
For the moment, unidirectional streams handling is not identical to
bidirectional ones in MUX/H3 layer, both in Rx and Tx path. As a safety,
skip over uni streams in qc_send.

In fact, this change has no impact because qcs.tx.buf is emptied before
we start using qcs_push_frame, which prevents the call to
qcs_push_frame. However, this condition will soon change to improve
bidir streams emission, so an explicit check on stream type must be
done.

It is planified to unify uni and bidir streams handling in a future
stage. When implemented, the check will be removed.
2022-03-11 11:37:31 +01:00
Frédéric Lécaille
728b30d750 CLEANUP: quic: Comments fix for qc_prep_(app)pkts() functions
Fix the comments for these two functions about their returned values.
2022-03-11 11:37:31 +01:00
Frédéric Lécaille
d5066dd9dd BUG/MEDIUM: quic: qc_prep_app_pkts() retries on qc_build_pkt() failures
The "stop_build" label aim is to try to reuse the TX buffer when there is not
enough contiguous room to build a packet. It was defined but not used!
2022-03-11 11:37:31 +01:00
Frédéric Lécaille
530601cd84 MEDIUM: quic: Implement the idle timeout feature
The aim of the idle timeout is to silently closed the connection after a period
of inactivity depending on the "max_idle_timeout" transport parameters advertised
by the endpoints. We add a new task to implement this timer. Its expiry is
updated each time we received an ack-eliciting packet, and each time we send
an ack-eliciting packet if no other such packet was sent since we received
the last ack-eliciting packet. Such conditions may be implemented thanks
to QUIC_FL_CONN_IDLE_TIMER_RESTARTED_AFTER_READ new flag.
2022-03-11 11:37:30 +01:00
Frédéric Lécaille
676b849d37 BUG/MINOR: quic: Missing check when setting the anti-amplification limit as reached
Ensure the peer address is not validated before setting the anti-amplication
limit as reached.
2022-03-11 11:37:30 +01:00
Frédéric Lécaille
f293b69521 MEDIUM: quic: Remove the QUIC connection reference counter
There is no need to use such a reference counter anymore since the QUIC
connections are always handled by the same thread.
quic_conn_drop() is removed. Its code is merged into quic_conn_release().
2022-03-11 11:37:30 +01:00
Willy Tarreau
d2985f3cec BUG/MINOR: session: fix theoretical risk of memleak in session_accept_fd()
Andrew Suffield reported in issue #1596 that we've had a bug in
session_accept_fd() since 2.4 with commit 1b3c931bf ("MEDIUM:
connections: Introduce a new XPRT method, start().") where an error
label is wrong and may cause the leak of the freshly allocated session
in case conn_xprt_start() returns < 0.

The code was checked there and the only two transport layers available
at this point are raw_sock and ssl_sock. The former doesn't provide a
->start() method hence conn_xprt_start() will always return zero. The
second does provide such a function, but it may only return <0 if the
underlying transport (raw_sock) has such a method and fails, which is
thus not the case.

So fortunately it is not possible to trigger this leak.

The patch above also touched the accept code in quic_sock() which was
mostly a plain copy of the session code, but there the move didn't
have this impact, and since then it was simplified and the next change
moved it to its final destination with the proper error label.

This should be backported as far as 2.4 as a long-term safety measure
(e.g. if in the future we have a reason for making conn_xprt_start()
to start failing), but will not have any positive nor negative effect
in the short term.
2022-03-11 07:25:11 +01:00
Willy Tarreau
0657b93385 MINOR: stream: add "last_rule_file" and "last_rule_line" samples
These two sample fetch methods report respectively the file name and the
line number where was located the last rule that was final. This is aimed
at being used on log-format lines to help admins figure what rule in the
configuration gave a final verdict, and help understand the condition
that led to the action.

For example, it's now possible to log the last matched rule by adding
this to the log-format:

  ... lr=%[last_rule_file]:%[last_rule_line]

A regtest is provided to test various combinations of final rules, some
even on top of each other from different rulesets.
2022-03-10 11:51:34 +01:00
Willy Tarreau
c6dae869ca MINOR: rules: record the last http/tcp rule that gave a final verdict
When a tcp-{request,response} content or http-request/http-response
rule delivers a final verdict (deny, accept, redirect etc), the last
evaluated one will now be recorded in the stream. The purpose is to
permit to log the last one that performed a final action. For now
the log is not produced.
2022-03-10 11:51:34 +01:00
Christopher Faulet
fbff854250 BUG/MAJOR: mux-pt: Always destroy the backend connection on detach
In TCP, when a conn-stream is detached from a backend connection, the
connection must be always closed. It was only performed if an error or a
shutdown occurred or if there was no connection owner. But it is a problem,
because, since the 2.3, backend connections are always owned by a
session. This way it is possible to have idle connections attached to a
session instead of a server. But there is no idle connections in TCP. In
addition, when a session owns a connection it is responsible to close it
when it is released. But it only works for idle connections. And it only
works if the session is released.

Thus there is the place for bugs here. And indeed, a connection leak may
occur if a connection retry is performed because of a timeout. In this case,
the underlying connection is still alive and is waiting to be fully
established. Thus, when the conn-stream is detached from the connection, the
connection is not closed. Because the PT multiplexer is quite simple, there
is no timeout at this stage. We depend on the kenerl to be notified and
finally close the connection. With an unreachable server, orphan backend
connections may be accumulated for a while. It may be perceived as a leak.

Because there is no reason to keep such backend connections, we just close
it now. Frontend connections are still closed by the session or when an
error or a shutdown occurs.

This patch should fix the issue #1522. It must be backported as far as
2.0. Note that the 2.2 and 2.0 are not affected by this bug because there is
no owner for backend TCP connections. But it is probably a good idea to
backport the patch on these versions to avoid any future bugs.
2022-03-09 15:56:00 +01:00
Tim Duesterhus
a6a3279188 CLEANUP: fcgi: Use istadv() in fcgi_strm_send_params
Found manually, while creating the previous commits to turn `struct proxy`
members into ists.

There is an existing Coccinelle rule to replace this pattern by `istadv()` in
`ist.cocci`:

    @@
    struct ist i;
    expression e;
    @@

    - i.ptr += e;
    - i.len -= e;
    + i = istadv(i, e);

But apparently it is not smart enough to match ists that are stored in another
struct. It would be useful to make the existing rule more generic, so that it
might catch similar cases in the future.
2022-03-09 07:51:27 +01:00
Tim Duesterhus
98f05f6a38 CLEANUP: fcgi: Replace memcpy() on ist by istcat()
This is a little cleaner, because the length of the resulting string does not
need to be calculated manually.
2022-03-09 07:51:27 +01:00
Tim Duesterhus
b4b03779d0 MEDIUM: proxy: Store server_id_hdr_name as a struct ist
The server_id_hdr_name is already processed as an ist in various locations lets
also just store it as such.

see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
2022-03-09 07:51:27 +01:00
Tim Duesterhus
e502c3e793 MINOR: proxy: Store orgto_hdr_name as a struct ist
The orgto_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.

see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
2022-03-09 07:51:27 +01:00
Tim Duesterhus
b50ab8489e MINOR: proxy: Store fwdfor_hdr_name as a struct ist
The fwdfor_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.

see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
2022-03-09 07:51:27 +01:00
Tim Duesterhus
4b1fcaaee3 MINOR: proxy: Store monitor_uri as a struct ist
The monitor_uri is already processed as an ist in `http_wait_for_request`, lets
also just store it as such.

see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
2022-03-09 07:51:27 +01:00
Christopher Faulet
5ce1299c64 DEBUG: stream: Fix stream trace message to print response buffer state
Channels buffer state is displayed in the strem trace messages. However,
because of a typo, the request buffer was used instead of the response one.

This patch should be backported as far as 2.2.
2022-03-08 18:31:44 +01:00
Christopher Faulet
5001913033 DEBUG: stream: Add the missing descriptions for stream trace events
The description for STRM_EV_FLT_ANA and STRM_EV_FLT_ERR was missing.

This patch should be backported as far as 2.2.
2022-03-08 18:31:44 +01:00
Christopher Faulet
e8cefacfa9 BUG/MEDIUM: mcli: Properly handle errors and timeouts during reponse processing
The response analyzer of the master CLI only handles read errors. So if
there is a write error, the session remains stuck because some outgoing data
are blocked in the channel and the response analyzer waits everything to be
sent. Because the maxconn is set to 10 for the master CLI, it may be
unresponsive if this happens to many times.

Now read and write errors, timeouts and client aborts are handled.

This patch should solve the issue #1512. It must be backported as far as
2.0.
2022-03-08 18:31:44 +01:00
Christopher Faulet
8b1eed16d0 DEBUG: cache: Update underlying buffer when loading HTX message in cache applet
In the I/O handler of the cache applet, we must update the underlying buffer
when the HTX message is loaded, using htx_from_buf() function instead of
htxbuf(). It is important because the applet will update the message by
adding new HTX blocks. This way, the state of the underlying buffer remains
consistant with the state of the HTX message.

It is especially important if HAProxy is compiled with "DEBUG_STRICT=2"
mode. Without this patch, channel_add_input() call crashed if the channel
was empty at the begining of the I/O handler.

Note that it is more a build/debug issue than a bug. But this patch may
prevent future bugs. For now it is safe because htx_to_buf() function is
systematically called, updating accordingly the underlying buffer.

This patch may be backported as far as 2.0.
2022-03-08 18:29:20 +01:00
Christopher Faulet
e9382e0afe BUG/MEDIUM: stream: Use the front analyzers for new listener-less streams
For now, for a stream, request analyzers are set at 2 stages. The first one
is when the stream is created. The session's listener analyzers, if any, are
set on the request channel. In addition, some HTTP analyzers are set for HTX
streams (AN_REQ_WAIT_HTTP and AN_REQ_HTTP_PROCESS_FE). The second one is
when the backend is set on the stream. At the stage, request analyzers are
updated using the backend settings.

It is an issue for client applets because there is no listener attached to
the stream. In addtion, it may have no specific/dedicated backend. Thus,
several request analyzers are missing. Among others, the HTTP analyzers for
HTTP applets. The HTTP client is the only one affected for now.

To fix the bug, when a stream is created without a listener, we use the
frontend to set the request analyzers. Note that there is no issue with the
response channel because its analyzers are set when the server connection is
established.

This patch may be backported to all stable versions. Because only the HTTP
client is affected, it must at least be backported to 2.5. It is related to
the issue #1593.
2022-03-08 18:27:47 +01:00
Christopher Faulet
dbf1e88e87 BUG/MINOR: cache: Set conn-stream/channel EOI flags at the end of request
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.

Note that because a filter is always attached to the stream when the cache
is used, there is no issue because there is no direct forwarding in this
case. Thus the stream analyzers are able to see the HTX_FL_EOM flag on the
HTX messge.

This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
2022-03-08 18:24:16 +01:00
Christopher Faulet
3fa5d19d14 BUG/MINOR: stats: Set conn-stream/channel EOI flags at the end of request
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.

This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
2022-03-08 18:24:16 +01:00
Christopher Faulet
d8d2708cfe BUG/MINOR: hlua: Set conn-stream/channel EOI flags at the end of request
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.

This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
2022-03-08 18:24:16 +01:00
Christopher Faulet
3d4332419c BUG/MINOR: httpclient: Set conn-stream/channel EOI flags at the end of request
In HTX, HTX_FL_EOM flag is added on the message to notifiy the end of the
message was received. In addition, the producer must set CS_FL_EOI flag on
the conn-stream. If it is a mux, the stream-interface is responsible to set
CF_EOI flag on the input channel. But, for now, if the producer is an
applet, in addition to the conn-stream flag, it must also set the channel
one.

These flags are used to notify the stream that the message is finished and
no more data are expected. It is especially important when the message
itself it directly forwarded from one side to the other. Because in this
case, the stream has no way to see the HTX_FL_EOM flag on the
message. Otherwise, the stream will detect a client or a server abort,
depending on the side.

For the HTTP client, it is not really easy to diagnose this error because
there is also another bug hiding this one. All HTTP request analyzers are
not set on the input channel. This will be fixed by another patch.

This patch must be backported to 2.5. It is related to the issue #1593.
2022-03-08 16:33:56 +01:00
Marno Krahmer
a690b73fba MINOR: stats: Add dark mode support for socket rows
In commit e9ed63e548 dark mode support was added to the stats page. The
initial commit does not include  dark mode color overwrites for the
.socket CSS class. This commit colors socket rows the same way as
backends that acre active but do not have a health check defined.

This fixes an issue where reading information from socket lines became
really hard in dark mode due to suboptimal coloring of the cell
background and the font in it.
2022-03-08 14:47:23 +01:00
Amaury Denoyelle
20f89cac95 BUG/MEDIUM: quic: do not drop packet on duplicate stream/decoding error
Change the return value to success in qc_handle_bidi_strm_frm for two
specific cases :
* if STREAM frame is an already received offset
* if application decoding failed

This ensures that the packet is not dropped and properly acknowledged.
Previous to this fix, the return code was set to error which prevented
the ACK to be generated.

The impact of the bug might be noticeable in environment with packet
loss and retransmission. Due to haproxy not generating ACK for packets
containing STREAM frames with already received offset, the client will
probably retransmit them again, which will worsen the network
transmission.
2022-03-08 14:36:32 +01:00
William Lallemand
b0dfd099c5 BUG/MINOR: cli: shows correct mode in "show sess"
The "show sess" cli command only handles "http" or "tcp" as a fallback
mode, replace this by a call to proxy_mode_str() to show all the modes.

Could be backported in every maintained versions.
2022-03-08 12:21:36 +01:00
William Lallemand
06715af9e5 BUG/MINOR: add missing modes in proxy_mode_str()
Add the missing PR_MODE_SYSLOG and PR_MODE_PEERS in proxy_mode_str().

Could be backported in every maintained versions.
2022-03-08 12:21:36 +01:00
Willy Tarreau
c4e56dc58c MINOR: pools: add a new global option "no-memory-trimming"
Some users with very large numbers of connections have been facing
extremely long malloc_trim() calls on reload that managed to trigger
the watchdog! That's a bit counter-productive. It's even possible
that some implementations are not perfectly reliable or that their
trimming time grows quadratically with the memory used. Instead of
constantly trying to work around these issues, let's offer an option
to disable this mechanism, since nobody had been complaining in the
past, and this was only meant to be an improvement.

This should be backported to 2.4 where trimming on reload started to
appear.
2022-03-08 10:45:03 +01:00
Frédéric Lécaille
9777ead2ed CLEANUP: quic: Remove window redundant variable from NewReno algorithm state struct
We use the window variable which is stored in the path struct.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
0e7c9a7143 MINOR: quic: More precise window update calculation
When in congestion avoidance state and when acknowledging an <acked> number bytes
we must increase the congestion window by at most one datagram (<path->mtu>)
by congestion window. So thanks to this patch we apply a ratio to the current
number of acked bytes : <acked> * <path->mtu> / <cwnd>.
So, when <cwnd> bytes are acked we precisely increment <cwnd> by <path->mtu>.
Furthermore we take into an account the number of remaining acknowledged bytes
each time we increment the window by <acked> storing their values in the algorithm
struct state (->remain_acked) so that it might be take into an account at the
next ACK event.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
5f6783094d CLEANUP: quic: Remove useless definitions from quic_cc_event struct
Since the persistent congestion detection is done out of the congestion
controllers, there is no need to pass them information through quic_cc_event struct.
We remove its useless members. Also remove qc_cc_loss_event() which is no more used.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
a5ee0ae6a2 MINOR: quic: Persistent congestion detection outside of controllers
We establish the persistent congestion out of any congestion controller
to improve the algorithms genericity. This path characteristic detection may
be implemented regarless of the underlying congestion control algorithm.

Send congestion (loss) event using directly quic_cc_event(), so without
qc_cc_loss_event() wrapper function around quic_cc_event().

Take the opportunity of this patch to shorten "newest_time_sent" member field
of quic_cc_event to "time_sent".
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
83bfca6c71 MINOR: quic: Add a "slow start" callback to congestion controller
We want to be able to make the congestion controllers re-enter the slow
start state outside of the congestion controllers themselves. So,
we add a callback ->slow_start() to do so.
Define this callback for NewReno algorithm.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
ba9db40b07 CLEANUP: quic: Remove QUIC path manipulations out of the congestion controller
QUIC connection path in flight bytes is a variable which should not be manipulated
by the congestion controller. This latter aim is to compute the congestion window.
So, we pass it as less as parameters as possible to do so.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
4d3d36b670 BUG/MINOR: quic: Missing recovery start timer reset
The recovery start time must be reset after a persistent congestion has been
detected.
2022-03-04 17:47:32 +01:00
Frédéric Lécaille
05e30ee7d5 MINOR: quic: Retry on qc_build_pkt() failures
This is done going to stop_build label when qc_build_pkt() fails
because of a lack of buffer room (returns -1).
2022-03-04 17:47:32 +01:00
David Carlier
43a568575f BUILD: fix kFreeBSD build.
kFreeBSD needs to be treated as a distinct target from FreeBSD
since the underlying system libc is the GNU one. Thus, relying
only on __GLIBC__ no longer suffice.

- freebsd-glibc new target, key difference is including crypt.h
  and linking to libdl like linux.
- cpu affinity available but the api is still the FreeBSD's.
- enabling auxiliary data access only for Linux.

Patch based on preliminary work done by @bigon.

closes #1555
2022-03-04 17:19:12 +01:00
Amaury Denoyelle
c055e30176 MEDIUM: mux-quic: implement MAX_STREAMS emission for bidir streams
Implement the locally flow-control streams limit for opened
bidirectional streams. Add a counter which is used to count the total
number of closed streams. If this number is big enough, emit a
MAX_STREAMS frame to increase the limit of remotely opened bidirectional
streams.

This is the first commit to implement QUIC flow-control. A series of
patches should follow to complete this.

This is required to be able to handle more than 100 client requests.
This should help to validate the Multiplexing interop test.
2022-03-04 17:00:12 +01:00
Amaury Denoyelle
e9c4cc13fc MINOR: mux-quic: retry send opportunistically for remaining frames
This commit should fix the possible transfer interruption caused by the
previous commit. The MUX always retry to send frames if there is
remaining data after a send call on the transport layer. This is useful
if the transport layer is not blocked on the sending path.

In the future, the transport layer should retry by itself the send
operation if no blocking condition exists. The MUX layer will always
subscribe to retry later if remaining frames are reported which indicate
a blocking on the transport layer.
2022-03-04 17:00:12 +01:00
Amaury Denoyelle
2c71fe58f0 MEDIUM: mux-quic: use direct send transport API for STREAMs
Modify the STREAM emission in qc_send. Use the new transport function
qc_send_app_pkts to directly send the list of constructed frames. This
allows to remove the tasklet wakeup on the quic_conn and should reduce
the latency.

If not all frames are send after the transport call, subscribe the MUX
on the lower layer to be able to retry. Currently there is a bug because
the transport layer does not retry to send frames in excess after a
successful sendto. This might cause the transfer to be interrupted.
2022-03-04 17:00:12 +01:00
Amaury Denoyelle
0dc40f06d1 MINOR: mux-quic: complete functions to detect stream type
Improve the functions used to detect the stream characteristics :
uni/bidirectional and local/remote initiated.

Most notably, these functions are now designed to work transparently for
a MUX in the frontend or backend side. For this, we use the connection
to determine the current MUX side. This will be useful if QUIC is
implemented on the server side.
2022-03-04 17:00:12 +01:00
Amaury Denoyelle
749cb647b1 MINOR: mux-quic: refactor transport parameters init
Since QUIC accept handling has been improved, the MUX is initialized
after the handshake completion. Thus its safe to access transport
parameters in qc_init via the quic_conn.

Remove quic_mux_transport_params_update which was called by the
transport for the MUX. This improves the architecture by removing a
direct call from the transport to the MUX.

The deleted function body is not transfered to qc_init because this part
will change heavily in the near future when implementing the
flow-control.
2022-03-04 17:00:12 +01:00
Frédéric Lécaille
c2f561ce1e MINOR: quic: Export qc_send_app_pkts()
This is at least to make this function be callable by the mux.
2022-03-04 17:00:12 +01:00
Frédéric Lécaille
edc81469a8 MINOR: quic: Make qc_build_frms() build ack-eliciting frames from a list
We want to be able to build ack-eliciting frames to be embedded into QUIC packets
from a prebuilt list of ack-eliciting frames. This will be helpful for the mux
which would like to send STREAM frames asap after having builts its own prebuilt
list.
To do so, we only add a parameter as struct list to this function to handle
such a prebuilt list.
2022-03-04 17:00:12 +01:00
Frédéric Lécaille
28c7ea3725 MINOR: quic: Send short packet from a frame list
We want to be able to send ack-elicting packets from a list of ack-eliciting
frames. So, this patch adds such a paramaters to the function responsible of
building 1RTT packets. The entry point function is qc_send_app_pkts() which
is used with the underlying packet number space TX frame list as parameter.
2022-03-04 17:00:12 +01:00
Frédéric Lécaille
1c5968b275 MINOR: quic: qc_prep_app_pkts() implementation
We want to get rid of the code used during the handshake step. qc_prep_app_pkts()
aim is to build short packets which are also datagrams.
Make quic_conn_app_io_cb() call this new function to prepare short packets.
2022-03-04 17:00:12 +01:00
Amaury Denoyelle
1455113e93 CLEANUP: quic: complete ABORT_NOW with a TODO comment
Add a TODO comment to not forget to properly implement error returned by
qcs_push_frame.
2022-03-04 16:56:51 +01:00
Willy Tarreau
3dfb7da04b CLEANUP: tree-wide: remove a few rare non-ASCII chars
As reported by Tim in issue #1428, our sources are clean, there are
just a few files with a few rare non-ASCII chars for the paragraph
symbol, a few typos, or in Fred's name. Given that Fred already uses
the non-accentuated form at other places like on the public list,
let's uniformize all this and make sure the code displays equally
everywhere.
2022-03-04 08:58:32 +01:00
Willy Tarreau
f9eba78fb8 BUG/MEDIUM: pools: fix ha_free() on area in the process of being freed
Commit e81248c0c ("BUG/MINOR: pool: always align pool_heads to 64 bytes")
added a free of the allocated pool in pool_destroy() using ha_free(), but
it added a subtle bug by which once the pool is released, setting its
address to NULL inside the structure itself cannot work because the area
has just been freed.

This will need to be backported wherever the patch above is backported.
2022-03-03 18:42:49 +01:00
Amaury Denoyelle
2d0f873cd8 BUG/MINOR: quic: fix segfault on CC if mux uninitialized
A segfault happens when receiving a CONNECTION_CLOSE during handshake.
This is because the mux is not initialized at this stage but the
transport layer dereferences it.

Fix this by ensuring that the MUX is initialized before. Thanks to Willy
for his help on this one. Welcome in the QUIC-men team !
2022-03-03 18:09:37 +01:00
Willy Tarreau
e81248c0c8 BUG/MINOR: pool: always align pool_heads to 64 bytes
This is the pool equivalent of commit 97ea9c49f ("BUG/MEDIUM: fd: always
align fdtab[] to 64 bytes"). After a careful code review, it happens that
the pool heads are the other structures allocated with malloc/calloc that
claim to be aligned to a size larger than what the allocator can offer.
While no issue was reported on them, no memset() is performed and no type
is large, this is a problem waiting to happen, so better fix it. In
addition, it's relatively easy to do by storing the allocation address
inside the pool_head itself and use it at free() time. Finally, threads
might benefit from the fact that the caches will really be aligned and
that there will be no false sharing.

This should be backported to all versions where it applies easily.
2022-03-02 18:22:08 +01:00
William Lallemand
10a37360c8 BUG/MEDIUM: httpclient/lua: infinite appctx loop with POST
When POSTing a request with a payload, and reusing the same httpclient
lua instance, one could encounter a spinning of the httpclient appctx.

Indeed the sent counter is not reset between 2 POSTs and the condition
for sending the EOM flag is never met.

Must fixed issue #1593.

To be backported in 2.5.
2022-03-02 16:32:47 +01:00
Willy Tarreau
06e66c84fc DEBUG: reduce the footprint of BUG_ON() calls
Many inline functions involve some BUG_ON() calls and because of the
partial complexity of the functions, they're not inlined anymore (e.g.
co_data()). The reason is that the expression instantiates the message,
its size, sometimes a counter, then the atomic OR to taint the process,
and the back trace. That can be a lot for an inline function and most
of it is always the same.

This commit modifies this by delegating the common parts to a dedicated
function "complain()" that takes care of updating the counter if needed,
writing the message and measuring its length, and tainting the process.
This way the caller only has to check a condition, pass a pointer to the
preset message, and the info about the type (bug or warn) for the tainting,
then decide whether to dump or crash. Note that this part could also be
moved to the function but resulted in complain() always being at the top
of the stack, which didn't seem like an improvement.

Thanks to these changes, the BUG_ON() calls do not result in uninlining
functions anymore and the overall code size was reduced by 60 to 120 kB
depending on the build options.
2022-03-02 16:00:42 +01:00
Willy Tarreau
a631b86523 BUILD: tcpcheck: do not declare tcp_check_keywords_register() inline
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
2022-03-02 14:54:44 +01:00
Willy Tarreau
4de2cda104 BUILD: trace: do not declare trace_registre_source() inline
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
2022-03-02 14:53:00 +01:00
Willy Tarreau
368479c3fc BUILD: http_rules: do not declare http_*_keywords_registre() inline
The 3 functions http_{req,res,after_res}_keywords_register() are
referenced in initcalls by their pointer, it makes no sense to declare
them inline. At best it causes function duplication, at worst it doesn't
build on older compilers.
2022-03-02 14:50:38 +01:00
Willy Tarreau
d318e4e022 BUILD: connection: do not declare register_mux_proto() inline
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
2022-03-02 14:46:45 +01:00
Frédéric Lécaille
bd24208673 MINOR: quic: Assemble QUIC TLS flags at the same level
Do not distinguish the direction (TX/RX) when settings TLS secrets flags.
There is not such a distinction in the RFC 9001.
Assemble them at the same level: at the upper context level.
2022-03-01 16:34:03 +01:00
Frédéric Lécaille
9355d50f73 CLEANUP: quic: Indentation fix in qc_prep_pkts()
Non-invasive modification.
2022-03-01 16:22:35 +01:00
Frédéric Lécaille
7d845f15fd CLEANUP: quic: Useless tests in qc_try_rm_hp()
There is no need to test <qel>. Furthermore the packet type has already checked
by the caller.
2022-03-01 16:22:35 +01:00
Frédéric Lécaille
51c9065f66 MINOR: quic: Drop the packets of discarded packet number spaces
This is required since this previous commit:
   "MINOR: quic: Post handshake I/O callback switching"
If not, such packets remain endlessly in the RX buffer and cannot be parsed
by the new I/O callback used after the handshake has been confirmed.
2022-03-01 16:22:35 +01:00
Frédéric Lécaille
00e2400fa6 MINOR: quic: Post handshake I/O callback switching
Implement a simple task quic_conn_app_io_cb() to be used after the handshakes
have completed.
2022-03-01 16:22:35 +01:00
Frédéric Lécaille
5757b4a50e MINOR: quic: Ensure PTO timer is not set in the past
Wakeup asap the timer task when setting its timer in the past.
Take also the opportunity of this patch to make simplify quic_pto_pktns():
calling tick_first() is useless here to compare <lpto> with <tmp_pto>.
2022-03-01 16:22:35 +01:00
Christopher Faulet
10c9c74cd1 CLEANUP: stream: Remove useless tests on conn-stream in stream_dump()
Since the recent refactoring on the conn-streams, a stream has always a
defined frontend and backend conn-streams. Thus, in stream_dump(), there is
no reason to still test if these conn-streams are defined.

In addition, still in stream_dump(), get the stream-interfaces using the
conn-streams and not the opposite.

This patch should fix issue #1589 and #1590.
2022-03-01 15:22:05 +01:00
Amaury Denoyelle
0e3010b1bb MEDIUM: quic: rearchitecture Rx path for bidirectional STREAM frames
Reorganize the Rx path for STREAM frames on bidirectional streams. A new
function qcc_recv is implemented on the MUX. It will handle the STREAM
frames copy and offset calculation from transport to MUX.

Another function named qcc_decode_qcs from the MUX can be called by
transport each time new STREAM data has been copied.

The architecture is now cleaner with the MUX layer in charge of parsing
the STREAM frames offsets. This is required to be able to implement the
flow-control on the MUX layer.

Note that as a convenience, a STREAM frame is not partially copied to
the MUX buffer. This simplify the implementation for the moment but it
may change in the future to optimize the STREAM frames handling.

For the moment, only bidirectional streams benefit from this change. In
the future, it may be extended to unidirectional streams to unify the
STREAM frames processing.
2022-03-01 11:07:27 +01:00
Amaury Denoyelle
3c4303998f BUG/MINOR: quic: support FIN on Rx-buffered STREAM frames
FIN flag on a STREAM frame was not detected if the frame was previously
buffered on qcs.rx.frms before being handled.

To fix this, copy the fin field from the quic_stream instance to
quic_rx_strm_frm. This is required to properly notify the FIN flag on
qc_treat_rx_strm_frms for the MUX layer.

Without this fix, the request channel might be left opened after the
last STREAM frame reception if there is out-of-order frames on the Rx
path.
2022-03-01 11:07:06 +01:00
Amaury Denoyelle
3bf06093dc MINOR: mux-quic: define flag for last received frame
This flag is set when the STREAM frame with FIN set has been received on
a qcs instance. For now, this is only used as a BUG_ON guard to prevent
against multiple frames with FIN set. It will also be useful when
reorganize the RX path and move some of its code in the mux.
2022-03-01 10:52:31 +01:00
Amaury Denoyelle
f77e3435a9 MINOR: quic: handle partially received buffered stream frame
Adjust the function to handle buffered STREAM frames. If the offset of
the frame was already fully received, discard the frame. If only
partially received, compute the difference and copy only the newly
offset.

Before this change, a buffered frame representing a fully or partially
received offset caused the loop to be interrupted. The frame was
preserved, thus preventing frames with greater offset to be handled.

This may fix some occurences of stalled transfer on the request channel
if there is out-of-order STREAM frames on the Rx path.
2022-03-01 10:52:31 +01:00
Amaury Denoyelle
2d2d030522 MINOR: quic: simplify copy of STREAM frames to RX buffer
qc_strm_cpy can be simplified by simply using b_putblk which already
handle wrapping of the destination buffer. The function is kept to
update the frame length and offset fields.
2022-03-01 10:52:31 +01:00
Amaury Denoyelle
850695ab1f CLEANUP: adjust indentation in bidir STREAM handling function
Fix indentation in qc_handle_bidi_strm_frm in if condition.
2022-03-01 10:52:31 +01:00
Tim Duesterhus
cc8348fbc1 MINOR: queue: Replace if() + abort() with BUG_ON()
see 5cd4bbd7a ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management")
2022-03-01 10:14:56 +01:00
Tim Duesterhus
17e6b737d7 MINOR: connection: Transform safety check in PROXYv2 parsing into BUG_ON()
With BUG_ON() being enabled by default it is more useful to use a BUG_ON()
instead of an effectively never-taken if, as any incorrect assumptions will
become much more visible.

see 488ee7fb6 ("BUG/MAJOR: proxy_protocol: Properly validate TLV lengths")
2022-02-28 17:59:28 +01:00
Tim Duesterhus
f09af57df5 CLEANUP: connection: Indicate unreachability to the compiler in conn_recv_proxy
Transform the unreachability comment into a call to `my_unreachable()` to allow
the compiler from benefitting from it.

see d1b15b6e9 ("MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections")
see 615f81eb5 ("MINOR: connection: Use a `struct ist` to store proxy_authority")
2022-02-28 17:59:28 +01:00
Christopher Faulet
8bc1759f60 DEBUG: stream-int: Fix BUG_ON used to test appctx in si_applet_ops callbacks
693b23bb1 ("MEDIUM: tree-wide: Use unsafe conn-stream API when it is
relevant") introduced a regression in DEBUG_STRICT mode because some BUG_ON
conditions were inverted. It should ok now.

In addition, ALREADY_CHECKED macro was removed from appctx_wakeup() function
because it is useless now.
2022-02-28 17:29:11 +01:00
Christopher Faulet
234a10aa9b BUG/MEDIUM: htx: Fix a possible null derefs in htx_xfer_blks()
In htx_xfer_blks() function, when headers or trailers are partially
transferred, we rollback the copy by removing copied blocks. Internally, all
blocks between <dstref> and <dstblk> are removed. But if the transfer was
stopped because we failed to reserve a block, the variable <dstblk> is
NULL. Thus, we must not try to remove it. It is unexpected to call
htx_remove_blk() in this case.

htx_remove_blk() was updated to test <blk> variable inside the existing
BUG_ON(). The block must be defined.

For now, this bug may only be encountered when H2 trailers are copied. On H2
headers, the destination buffer is empty. Thus a swap is performed.

This patch should fix the issue #1578. It must be backported as far as 2.4.
2022-02-28 17:16:55 +01:00
Christopher Faulet
4ab8438362 BUG/MEDIUM: mux-fcgi: Don't rely on SI src/dst addresses for FCGI health-checks
When an HTTP health-check is performed in FCGI, we must not rely on the SI
source and destination addresses to set default parameters
(REMOTE_ADDR/REMOTE_PORT and SERVER_NAME/SERVER_PORT) because the backend
conn-stream is not attached to a stream but to a healt-check. Thus, there is
no stream-interface. In addition, there is no client connection because it
is an "internal" session.

Thus, for now, in this case, there is only the server connection that can be
used. So src/dst addresses are retrieved from the server connection when the
CS application is a health-check.

This patch should solve issue #1572. It must be backported to 2.5. Note than
the CS api has changed. Thus, on HAProxy 2.5, we should test the session's
origin instead:

    const struct sockaddr_storage *src = (cs_check(fstrm->cs) ? ...);
    const struct sockaddr_storage *dst = (cs_check(fstrm->cs) ? ...);
2022-02-28 17:16:55 +01:00
Christopher Faulet
9936dc6577 REORG: stream-int: Uninline si_sync_recv() and make si_cs_recv() private
This way si_*_recv() and si_*_sned() API are defined the same
way. si_sync_snd/si_sync_recv are both exported and defined in the C
file. And si_cs_send/si_cs_recv are private and only used by
stream-interface internals.
2022-02-28 17:16:47 +01:00
Christopher Faulet
494162381e CLEANUP: stream-int: Make si_cs_send() function static
This function was not exported and is only used in stream_interface.c. So
make it static.
2022-02-28 17:13:36 +01:00
Christopher Faulet
693b23bb10 MEDIUM: tree-wide: Use unsafe conn-stream API when it is relevant
The unsafe conn-stream API (__cs_*) is now used when we are sure the good
endpoint or application is attached to the conn-stream. This avoids compiler
warnings about possible null derefs. It also simplify the code and clear up
any ambiguity about manipulated entities.
2022-02-28 17:13:36 +01:00
Willy Tarreau
84240044f0 MINOR: channel: don't use co_set_data() to decrement output
The use of co_set_data() should be strictly limited to setting the amount
of existing data to be transmitted. It ought not be used to decrement the
output after the data have left the buffer, because doing so involves
performing incorrect calculations using co_data() that still comprises
data that are not in the buffer anymore. Let's use c_rew() for this, which
is made exactly for this purpose, i.e. decrement c->output by as much as
requested. This is cleaner, faster, and will permit stricter checks.
2022-02-28 16:51:23 +01:00
Willy Tarreau
6d3f1e322e DEBUG: rename WARN_ON_ONCE() to CHECK_IF()
The only reason for warning once is to check if a condition really
happens. Let's use a term that better translates the intent, that's
important when reading the code.
2022-02-28 11:51:23 +01:00
Amaury Denoyelle
7b4c9d6e8c MINOR: quic: add a TODO for a memleak frame on ACK consume
The quic_frame instance containing the quic_stream must be freed when
the corresponding ACK has been received. However when implementing this
on qcs_try_to_consume, some data transfers are interrupted and cannot
complete (DC test from interop test suite).
2022-02-25 15:06:17 +01:00
Amaury Denoyelle
0c7679dd86 MINOR: quic: liberate the TX stream buffer after ACK processing
The sending buffer of each stream is cleared when processing ACKs
corresponding to STREAM emitted frames. If the buffer is empty, free it
and offer it as with other dynamic buffers usage.

This should reduce memory consumption as before an opened stream
confiscate a buffer during its whole lifetime even if there is no more
data to transmit.
2022-02-25 15:06:17 +01:00
Amaury Denoyelle
642ab06313 MINOR: quic: adjust buffer handling for STREAM transmission
Simplify the data manipulation of STREAM frames on TX. Only stream data
and len field are used to generate a valid STREAM frames from the
buffer. Do not use the offset field, which required that a single buffer
instance should be shared for every frames on a single stream.
2022-02-25 15:06:17 +01:00
Willy Tarreau
4e0a8b1224 DEBUG: add a new WARN_ON_ONCE() macro
This one will maintain a static counter per call place and will only
emit the warning on the first call. It may be used to invite users to
report an unexpected event without spamming them with messages.
2022-02-25 11:55:47 +01:00
Willy Tarreau
305cfbde43 DBEUG: add a new WARN_ON() macro
This is the same as BUG_ON() except that it never crashes and only emits
a warning and a backtrace, inviting users to report the problem. This will
be usable for non-fatal issues that should not happen and need to be fixed.
This way the BUG_ON() when using DEBUG_STRICT_NOCRASH is effectively an
equivalent of WARN_ON().
2022-02-25 11:55:47 +01:00
Willy Tarreau
edd426871f DEBUG: move the tainted stuff to bug.h for easier inclusion
The functions needed to manipulate the "tainted" flags were located in
too high a level to be callable from the lower code layers. Let's move
them to bug.h.
2022-02-25 11:55:38 +01:00
Willy Tarreau
9b4a0e6bac BUG/MINOR: debug: fix get_tainted() to properly read an atomic value
get_tainted() was using an atomic store from the atomic value to a
local one instead of using an atomic load. In practice it has no effect
given the relatively rare updates of this field and the fact that it's
read only when dumping "show info" output, but better fix it.

There's probably no need to backport this.
2022-02-25 11:54:30 +01:00
Willy Tarreau
c72d2c7e5b BUILD: stream: fix build warning with older compilers
GCC 6 was not very good at value propagation and is often mislead about
risks of null derefs. Since 2.6-dev commit 13a35e575 ("MAJOR: conn_stream/
stream-int: move the appctx to the conn-stream"), it sees a risk of null-
deref in stream_upgrade_from_cs() after checking cs_conn_mux(cs). Let's
disguise the result so that it doesn't complain anymore. The output code
is exactly the same. The same method could be used to shut warnings at
-O1 that affect the same compiler by the way.
2022-02-24 19:43:15 +01:00
Amaury Denoyelle
119965f15e BUG/MEDIUM: quic: fix received ACK stream calculation
Adjust the handling of ACK for STREAM frames. When receiving a ACK, the
corresponding frames from the acknowledged packet are retrieved. If a
frame is of type STREAM, we compare the frame STREAM offset with the
last offset known of the qcs instance.

The comparison was incomplete as it did not treat a acked offset smaller
than the known offset. Previously, the acked frame was incorrectly
buffered in the qcs.tx.acked_frms. On reception of future ACKs, when
trying to process the buffered acks via qcs_try_to_consume, the loop is
interrupted on the smallest offset different from the qcs known offset :
in this case it will be the previous smaller range. This is a real bug
as it prevents all buffered ACKs to be processed, eventually filling the
qcs sending buffer and cause the transfer to stall.

Fix this by properly properly handle smaller acked offset. First check
if the offset length is greater than the qcs offset and mark as
acknowledged the difference on the qcs. If not, the frame is not
buffered and simply ignored.
2022-02-24 18:37:39 +01:00
Willy Tarreau
282b6a7539 BUG/MINOR: proxy: preset the error message pointer to NULL in parse_new_proxy()
As reported by Coverity in issue #1568, a missing initialization of the
error message pointer in parse_new_proxy() may result in displaying garbage
or crashing in case of memory allocation error when trying to create a new
proxy on startup.

This should be backported to 2.4.
2022-02-24 16:40:04 +01:00
Christopher Faulet
2da02ae8b2 BUILD: tree-wide: Avoid warnings about undefined entities retrieved from a CS
Since recent changes related to the conn-stream/stream-interface
refactoring, GCC reports potential null pointer dereferences when we get the
appctx, the stream or the stream-interface from the conn-strem. Of course,
depending on the time, these entities may be null. But at many places, we
know they are defined and it is safe to get them without any check. Thus, we
use ALREADY_CHECKED() macro to silent these warnings.

Note that the refactoring is unfinished, so it is not a real issue for now.
2022-02-24 13:56:52 +01:00
Christopher Faulet
9264a2c0e8 BUG/MINOR: h3/hq_interop: Fix CS and stream creation
Some recent API changes about conn-stream and stream creation were not fully
applied to the H3 part.

It is 2.6-DEV specific, no backport is needed.
2022-02-24 11:13:59 +01:00
Christopher Faulet
c983b2114d CLEANUP: backend: Don't export connect_server anymore
connect_server() function is only called from backend.c. So make it static.
2022-02-24 11:00:03 +01:00
Christopher Faulet
54e85cbfc7 MAJOR: check: Use a persistent conn-stream for health-checks
In the same way a stream has always valid conn-streams, when a health-checks
is created, a conn-stream is now created and the health-check is attached on
it, as an app. This simplify a bit the connect part when a health-check is
running.
2022-02-24 11:00:03 +01:00
Christopher Faulet
14fd99a20c MINOR: stream: Don't destroy conn-streams but detach app and endp
Don't call cs_destroy() anymore when a stream is released. Instead the
endpoint and the app are detached from the conn-stream.
2022-02-24 11:00:03 +01:00
Christopher Faulet
c36de9dc93 MINOR: conn-stream: Release a CS when both app and endp are detached
cs_detach_app() function is added to detach an app from a conn-stream. And
now, both cs_detach_app() and cs_detach_endp() release the conn-stream when
both the app and the endpoint are detached.
2022-02-24 11:00:03 +01:00
Christopher Faulet
014ac35eb2 CLEANUP: stream-int: rename si_reset() to si_init()
si_reset() function is only used when a stream-interface is allocated. Thus
rename it to si_init() insteaad.
2022-02-24 11:00:03 +01:00
Christopher Faulet
cda94accb1 MAJOR: stream/conn_stream: Move the stream-interface into the conn-stream
Thanks to all previous changes, it is now possible to move the
stream-interface into the conn-stream. To do so, some SI functions are
removed and their conn-stream counterparts are added. In addition, the
conn-stream is now responsible to create and release the
stream-interface. While the stream-interfaces were inlined in the stream
structure, there is now a pointer in the conn-stream. stream-interfaces are
now dynamically allocated. Thus a dedicated pool is added. It is a temporary
change because, at the end, the stream-interface structure will most
probably disappear.
2022-02-24 11:00:03 +01:00
Christopher Faulet
108ce5a70b MINOR: sink: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the sink part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
0de82720e7 MINOR: tcp-act: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the tcp-act part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
b91afea91c MINOR: httpclient: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the httpclient part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
e1ede302c3 MINOR: http-act: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the http-act part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
8f8f35b2b0 MINOR: dns: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the dns part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
7a58d79dd2 MINOR: cache: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the cache part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
436811f4a8 MINOR: hlua: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the hlua part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
5d3c8aa154 MINOR: debug: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the debug part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
56489e2e31 MINOR: peers: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the peers part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
4d056bcb70 MINOR: proxy: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the proxy part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
503d26428d MINOR: frontend: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the frontend part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
02fc86e8f6 MINOR: log: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the log part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
0c247df38b MINOR: cli: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the cli part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
a629447d02 MINOR: http-ana: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream, all
access to the SI is done via the conn-stream. This patch is limited to the
http-ana part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
5c8b47f665 MINOR: stream: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the stream part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
4a0114b298 MINOR: backend: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the backend part.
2022-02-24 11:00:02 +01:00
Christopher Faulet
0dd566b42e MINOR: stream: Slightly rework stream_new to separate CS/SI initialization
It is just a minor reforctoring of stream_new() function to ease next
changes. Especially to move the SI from the stream to the conn-stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
95a61e8a0e MINOR: stream: Add pointer to front/back conn-streams into stream struct
frontend and backend conn-streams are now directly accesible from the
stream. This way, and with some other changes, it will be possible to remove
the stream-interfaces from the stream structure.
2022-02-24 11:00:02 +01:00
Christopher Faulet
f835dea939 MEDIUM: conn_stream: Add a pointer to the app object into the conn-stream
In the same way the conn-stream has a pointer to the stream endpoint , this
patch adds a pointer to the application entity in the conn-stream
structure. For now, it is a stream or a health-check. It is mandatory to
merge the stream-interface with the conn-stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
86e1c3381b MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
2022-02-24 11:00:02 +01:00
Christopher Faulet
13a35e5752 MAJOR: conn_stream/stream-int: move the appctx to the conn-stream
Thanks to previous changes, it is now possible to set an appctx as endpoint
for a conn-stream. This means the appctx is no longer linked to the
stream-interface but to the conn-stream. Thus, a pointer to the conn-stream
is explicitly stored in the stream-interface. The endpoint (connection or
appctx) can be retrieved via the conn-stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
dd2d0d8b80 MEDIUM: conn-stream: Be prepared to use an appctx as conn-stream endpoint
To be able to use an appctx as conn-stream endpoint, the connection is no
longer stored as is in the conn-stream. The obj-type is used instead.
2022-02-24 11:00:02 +01:00
Christopher Faulet
897d612d68 MEDIUM: conn-stream: No longer access connection field directly
To be able to handle applets as a conn-stream endpoint, we must be prepared
to handle different types of endpoints. First of all, the conn-strream's
connection must no longer be used directly.
2022-02-24 11:00:02 +01:00
Christopher Faulet
1329f2a12a REORG: conn_stream: move conn-stream stuff in dedicated files
Move code dealing with the conn-streams in dedicated files.
2022-02-24 11:00:02 +01:00
Christopher Faulet
e2b38b31bb MEDIUM: stream: Allocate backend CS when the stream is created
Because the backend conn-stream is no longer released during connection
retry and because it is valid to have conn-stream with no connection, it is
possible to allocated it when the stream is created. This means, from now, a
stream has always valid frontend and backend conn-streams. It is the first
step to merge the SI and the CS.
2022-02-24 11:00:02 +01:00
Christopher Faulet
e00ad358c9 MEDIUM: stream: No longer release backend conn-stream on connection retry
The backend conn-stream is no longer released on connection retry. This
means the conn-stream is detached from the underlying connection but not
released. Thus, during connection retries, the stream has always an
allocated conn-stream with no connection. All previous changes were made to
make this possible.

Note that .attach() mux callback function was changed to get the conn-stream
as argument. The muxes are no longer responsible to create the conn-stream
when a server connection is attached to a stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
a742293ec9 MINOR: stream: Handle appctx case first when creating a new stream
In the same way the previous commit, when a stream is created, the appctx
case is now handled before the conn-stream one. The purpose of this change
is to limit bugs during the SI/CS refactoring.
2022-02-24 11:00:01 +01:00
Christopher Faulet
0256da14a5 MINOR: connection: Be prepared to handle conn-stream with no connection
The conn-stream will progressively replace the stream-interface. Thus, a
stream will have to allocate the backend conn-stream during its
creation. This means it will be possible to have a conn-stream with no
connection. To prepare this change, we test the conn-stream's connection
when we retrieve it.
2022-02-24 11:00:01 +01:00
Willy Tarreau
f4b79c4a01 MINOR: pools: support setting debugging options using -dM
The 9 currently available debugging options may now be checked, set, or
cleared using -dM. The directive now takes a comma-delimited list of
options after the optional poisonning byte. With "help", the list of
available options is displayed with a short help and their current
status.

The management doc was updated.
2022-02-23 17:28:41 +01:00
Willy Tarreau
1408b1f8be MINOR: pools: delegate parsing of command line option -dM to a new function
New function pool_parse_debugging() is now dedicated to parsing options
of -dM. For now it only handles the optional memory poisonning byte, but
the function may already return an informative message to be printed for
help, a warning or an error. This way we'll reuse it for the settings
that will be needed for configurable debugging options.
2022-02-23 17:28:41 +01:00
Willy Tarreau
18f96d02d3 MEDIUM: init: handle arguments earlier
The argument parser runs too late, we'll soon need it before creating
pools, hence just after init_early(). No visible change is expected but
this part is sensitive enough to be placed into its own commit for easier
bisection later if needed.
2022-02-23 17:28:41 +01:00
Willy Tarreau
392524d222 MINOR: init: extract args parsing to their own function
The cmdline argument parsing was performed quite late, which prevents
from retrieving elements that can be used to initialize the pools and
certain sensitive areas. The goal is to improve this by parsing command
line arguments right after the early init stage. This is possible
because the cmdline parser already does very little beyond retrieving
config elements that are used later.

Doing so requires to move the parser code to a separate function and
to externalize a few variables out of the function as they're used
later in the boot process, in the original function.

This patch creates init_args() but doesn't move it upfront yet, it's
still executed just before init(), which essentially corresponds to
what was done before (only the trash buffers, ACLs and Lua were
initialized earlier and are not needed for this).

The rest is not modified and as expected no change is observed.

Note that the diff doesn't to justice to the change as it makes it
look like the early init() code was moved to a new function after
the function was renamed, while in fact it's clearly the parser
itself which moved.
2022-02-23 17:11:33 +01:00
Willy Tarreau
34527d5354 MEDIUM: init: split the early initialization in its own function
There are some delicate chicken-and-egg situations in the initialization
code, because the init() function currently does way too much (it goes
as far as parsing the config) and due to this it must be started very
late. But it's also in charge of initializing a number of variables that
are needed in early boot (e.g. hostname/pid for error reporting, or
entropy for random generators).

This patch carefully extracts all the early code that depends on
absolutely nothing, and places it immediately after the STG_LOCK init
stage. The only possible failures at this stage are only allocation
errors and they continue to provoke an immediate exit().

Some environment variables, hostname, date, pid etc are retrieved at
this stage. The program's arguments are also copied there since they're
needed to be kept intact for the master process.
2022-02-23 17:11:33 +01:00
Willy Tarreau
3ebe4d989c MEDIUM: initcall: move STG_REGISTER earlier
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.

This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.
2022-02-23 17:11:33 +01:00
Willy Tarreau
ef301b7556 MINOR: pools: add a debugging flag for memory poisonning option
Now -dM will set POOL_DBG_POISON for consistency with the rest of the
pool debugging options. As such now we only check for the new flag,
which allows the default value to be preset.
2022-02-23 17:11:33 +01:00
Willy Tarreau
13d7775b06 MINOR: pools: replace DEBUG_MEMORY_POOLS with runtime POOL_DBG_TAG
This option used to allow to store a marker at the end of the area, which
was used as a canary and detection against wrong freeing while the object
is used, and as a pointer to the last pool_free() caller when back in cache.
Now that we can compute the offsets at runtime, let's check it at run time
and continue the code simplification.
2022-02-23 17:11:33 +01:00
Willy Tarreau
0271822f17 MINOR: pools: replace DEBUG_POOL_TRACING with runtime POOL_DBG_CALLER
This option used to allow to store a pointer to the caller of the last
pool_alloc() or pool_free() at the end of the area. Now that we can
compute the offsets at runtime, let's check it at run time and continue
the code simplification. In __pool_alloc() we now always calculate the
return address (which is quite cheap), and the POOL_DEBUG_TRACE_CALLER()
calls are conditionned on the status of debugging option.
2022-02-23 17:11:33 +01:00
Willy Tarreau
42705d06b7 MINOR: pools: get rid of POOL_EXTRA
This macro is build-time dependent and is almost unused, yet where it
cannot easily be avoided. Now that we store the distinction between
pool->size and pool->alloc_sz, we don't need to maintain it and we
can instead compute it on the fly when creating a pool. This is what
this patch does. The variables are for now pretty static, but this is
sufficient to kill the macro and will allow to set them more dynamically.
2022-02-23 17:11:33 +01:00
Willy Tarreau
96d5bc7379 MINOR: pools: store the allocated size for each pool
The allocated size is the visible size plus the extra storage. Since
for now we can store up to two extra elements (mark and tracer), it's
convenient because now we know that the mark is always stored at
->size, and the tracer is always before ->alloc_sz.
2022-02-23 17:11:33 +01:00
Willy Tarreau
e981631d27 MEDIUM: pools: replace CONFIG_HAP_POOLS with a runtime "NO_CACHE" flag.
Like previous patches, this replaces the build-time code paths that were
conditionned by CONFIG_HAP_POOLS with runtime paths conditionned by
!POOL_DBG_NO_CACHE. One trivial test had to be added in the hot path in
__pool_alloc() to refrain from calling pool_get_from_cache(), and another
one in __pool_free() to avoid calling pool_put_to_cache().

All cache-specific functions were instrumented with a BUG_ON() to make
sure we never call them with cache disabled. Additionally the cache[]
array was not initialized (remains NULL) so that we can later drop it
if not needed. It's particularly huge and should be turned to dynamic
with a pointer to a per-thread area where all the objects are located.
This will solve the memory usage issue and will improve locality, or
even help better deal with NUMA machines once each thread uses its own
arena.
2022-02-23 17:11:33 +01:00
Willy Tarreau
dff3b0627d MINOR: pools: make the global pools a runtime option.
There were very few functions left that were specific to global pools,
and even the checks they used to participate to are not directly on the
most critical path so they can suffer an extra "if".

What's done now is that pool_releasable() always returns 0 when global
pools are disabled (like the one before) so that pool_evict_last_items()
never tries to place evicted objects there. As such there will never be
any object in the free list. However pool_refill_local_from_shared() is
bypassed when global pools are disabled so that we even avoid the atomic
loads from this function.

The default global setting is still adjusted based on the original
CONFIG_NO_GLOBAL_POOLS that is set depending on threads and the allocator.
The global executable only grew by 1.1kB by keeping this code enabled,
and the code is simplified and will later support runtime options.
2022-02-23 17:11:33 +01:00
Willy Tarreau
6f3c7f6e6a MINOR: pools: add a new debugging flag POOL_DBG_INTEGRITY
The test to decide whether or not to enforce integrity checks on cached
objects is now enabled at runtime and conditionned by this new debugging
flag. While previously it was not a concern to inflate the code size by
keeping the two functions static, they were moved to pool.c to limit the
impact. In pool_get_from_cache(), the fast code path remains fast by
having both flags tested at once to open a slower branch when either
POOL_DBG_COLD_FIRST or POOL_DBG_INTEGRITY are set.
2022-02-23 17:11:33 +01:00
Willy Tarreau
d3470e1ce8 MINOR: pools: add a new debugging flag POOL_DBG_COLD_FIRST
When enabling pools integrity checks, we usually prefer to allocate cold
objects first in order to maximize the time the objects spend in the
cache. In order to make this configurable at runtime, let's introduce
a new debugging flag to control this allocation order. It is currently
preset by the DEBUG_POOL_INTEGRITY build-time setting.
2022-02-23 17:11:33 +01:00
Willy Tarreau
fd8b737e2c MINOR: pools: switch DEBUG_DONT_SHARE_POOLS to runtime
This test used to appear at a single location in create_pool() to
enable a check on the pool name or unconditionally merge similarly
sized pools.

This patch introduces POOL_DBG_DONT_MERGE and conditions the test on
this new runtime flag, that is preset according to the aforementioned
debugging option.
2022-02-23 17:11:33 +01:00
Willy Tarreau
8d0273ed88 MINOR: pools: switch the fail-alloc test to runtime only
The fail-alloc test used to be enabled/disabled at build time using
the DEBUG_FAIL_ALLOC macro, but it happens that the cost of the test
is quite cheap and that it can be enabled as one of the pool_debugging
options.

This patch thus introduces the first POOL_DBG_FAIL_ALLOC option, whose
default value depends on DEBUG_FAIL_ALLOC. The mem_should_fail() function
is now always built, but it was made static since it's never used outside.
2022-02-23 17:11:33 +01:00
Willy Tarreau
605629b008 MINOR: pools: introduce a new pool_debugging global variable
This read-mostly variable will be used at runtime to enable/disable
certain pool-debugging features and will be set by the command-line
parser. A future option -dP will take a number of debugging features
as arguments to configure this variable's contents.
2022-02-23 17:11:33 +01:00
Willy Tarreau
af580f659c MINOR: pools: disable redundant poisonning on pool_free()
The poisonning performed on pool_free() used to help a little bit with
use-after-free detection, but usually did more harm than good in that
it was never possible to perform post-mortem analysis on released
objects once poisonning was enabled on allocation. Now that there is
a dedicated DEBUG_POOL_INTEGRITY, let's get rid of this annoyance
which is not even documented in the management manual.
2022-02-23 17:11:33 +01:00
Willy Tarreau
b61fccdc3f CLEANUP: init: remove the ifdef on HAPROXY_MEMMAX
It's ugly, let's move it to defaults.h with all other ones and preset
it to zero if not defined.
2022-02-23 17:11:33 +01:00
Willy Tarreau
cc0d554e5f CLEANUP: vars: move the per-process variables initialization to vars.c
There's no point keeping the vars_init_head() call in init() when we
already have a vars_init() registered at the right time to do that,
and it complexifies the boot sequence, so let's move it there.
2022-02-23 17:11:33 +01:00
Willy Tarreau
add4306231 CLEANUP: muxes: do not use a dynamic trash in list_mux_protos()
Let's not use a trash there anymore. The function is called at very
early boot (for "haproxy -vv"), and the need for a trash prevents the
arguments from being parsed earlier. Moreover, the function only uses
a FILE* on output with fprintf(), so there's not even any benefit in
using chunk_printf() on an intermediary variable, emitting the output
directly is both clearer and safer.
2022-02-23 17:11:33 +01:00
Willy Tarreau
5b4b6ca823 CLEANUP: httpclient: initialize the client in stage INIT not REGISTER
REGISTER is meant to only assemble static lists, not to initialize
code that may depend on some elements possibly initialized at this
level. For example the init code currently looks up transport protocols
such as XPRT_RAW and XPRT_SSL which ought to be themselves registered
from at REGISTER stage, and which currently work only because they're
still registered directly from a constructor. INIT is perfectly suited
for this level.
2022-02-23 17:11:33 +01:00
William Lallemand
ab90ee80d9 BUG/MINOR: httpclient/lua: missing pop for new timeout parameter
The lua timeout server lacks a lua_pop(), breaking the lua stack.

No backported needed.
2022-02-23 15:16:08 +01:00
William Lallemand
b4a4ef6a29 MINOR: httpclient/lua: ability to set a server timeout
Add the ability to set a "server timeout" on the httpclient with either
the httpclient_set_timeout() API or the timeout argument in a request.

Issue #1470.
2022-02-23 15:11:11 +01:00
Christopher Faulet
686501cb1c BUG/MEDIUM: stream: Abort processing if response buffer allocation fails
In process_stream(), we force the response buffer allocation before any
processing to be able to return an error message. It is important because,
when an error is triggered, the stream is immediately closed. Thus we cannot
wait for the response buffer allocation.

When the allocation fails, the stream analysis is stopped and the expiration
date of the stream's task is updated before exiting process_stream(). But if
the stream was woken up because of a connection or an analysis timeout, the
expiration date remains blocked in the past. This means the stream is woken
up in loop as long as the response buffer is not properly allocated.

Alone, this behavior is already a bug. But because the mechanism to handle
buffer allocation failures is totally broken since a while, this bug becomes
more problematic. Because, most of time, the watchdog will kill HAProxy in
this case because it will detect a spinning loop.

To fix it, at least temporarily, an allocation failure at this stage is now
reported as an error and the processing is aborted. It's not satisfying but
it is better than nothing. If the buffers allocation mechanism is
refactored, this part will be reviewed.

This patch must be backported, probably as far as 2.0. It may be perceived
as a regression, but the actual behavior is probably even worse. And
because it was not reported, it is probably not a common situation.
2022-02-23 09:26:32 +01:00
Willy Tarreau
9f699958dc MINOR: pools: mark most static pool configuration variables as read-mostly
The mem_poison_byte, mem_fail_rate, using_default_allocator and the
pools list are all only set once at boot time and never changed later,
while they're heavily used at run time. Let's optimize their usage from
all threads by marking them read-mostly so that them reside in a shared
cache line.
2022-02-21 20:44:26 +01:00
Amaury Denoyelle
4323567650 MINOR: quic: fix handling of out-of-order received STREAM frames
The recent changes was not complete.
  d1c76f24fd
  MINOR: quic: do not modify offset node if quic_rx_strm_frm in tree

The frame length and data pointer should incremented after the data
copy. A BUG_ON statement has been added to detect an incorrect decrement
operaiton.
2022-02-21 19:14:09 +01:00
Amaury Denoyelle
c0b66ca73c MINOR: mux-quic: fix uninitialized return on qc_send
This should fix the github issue #1562.
2022-02-21 18:46:58 +01:00
Amaury Denoyelle
ff191de1ca MINOR: h3: fix compiler warning variable set but not used
Some variables were only checked via BUG_ON macro. If compiling without
DEBUG_STRICT, this instruction is a noop. Fix this by using an explicit
condition + ABORT_NOW.

This should fix the github issue #1549.
2022-02-21 18:46:58 +01:00
Amaury Denoyelle
d1c76f24fd MINOR: quic: do not modify offset node if quic_rx_strm_frm in tree
qc_rx_strm_frm_cpy is unsafe because it updates the offset field of the
frame. This is not safe as the frame is inserted in the tree when
calling this function and offset serves as the key node.

To fix this, the API is modified so that qc_rx_strm_frm_cpy does not
update the frame parameter. The caller is responsible to update
offset/length in case of a partial copy.

The impact of this bug is not known. It can only happened with received
STREAM frames out-of-order. This might be triggered with large h3 POST
requests.
2022-02-21 18:46:58 +01:00
Christopher Faulet
ae17925b87 DEBUG: stream-int: Check CS_FL_WANT_ROOM is not set with an empty input buffer
In si_cs_recv(), the mux must never set CS_FL_WANT_ROOM flag on the
conn-stream if the input buffer is empty and nothing was copied. It is
important because, there is nothing the app layer can do in this case to
make some room. If this happens, this will most probably lead to a ping-pong
loop between the mux and the stream.

With this BUG_ON(), it will be easier to spot such bugs.
2022-02-21 16:29:00 +01:00
Christopher Faulet
ec361bbd84 BUG/MAJOR: mux-h2: Be sure to always report HTX parsing error to the app layer
If a parsing error is detected and the corresponding HTX flag is set
(HTX_FL_PARSING_ERROR), we must be sure to always report it to the app
layer. It is especially important when the error occurs during the response
parsing, on the server side. In this case, the RX buffer contains an empty
HTX message to carry the flag. And it remains in this state till the info is
reported to the app layer. This must be done otherwise, on the conn-stream,
the CS_FL_ERR_PENDING flag cannot be switched to CS_FL_ERROR and the
CS_FL_WANT_ROOM flag is always set when h2_rcv_buf() is called. The result
is a ping-pong loop between the mux and the stream.

Note that this patch fixes a bug. But it also reveals a design issue. The
error must not be reported at the HTX level. The error is already carried by
the conn-stream. There is no reason to duplicate it. In addition, it is
errorprone to have an empty HTX message only to report the error to the app
layer.

This patch should fix the issue #1561. It must be backported as far as 2.0
but the bug only affects HAProxy >= 2.4.
2022-02-21 16:05:47 +01:00
Christopher Faulet
c17c31c822 BUG/MEDIUM: mux-h1: Don't wake h1s if mux is blocked on lack of output buffer
After sending some data, we try to wake the H1 stream to resume data
processing at the stream level, except if the output buffer is still
full. However we must also be sure the mux is not blocked because of an
allocation failure on this buffer. Otherwise, it may lead to a ping-pong
loop between the stream and the mux to send more data with an unallocated
output buffer.

Note there is a mechanism to queue buffers allocations when a failure
happens. However this mechanism is totally broken since the filters were
introducted in HAProxy 1.7. And it is worse now with the multiplexers. So
this patch fixes a possible loop needlessly consuming all the CPU. But
buffer allocation failures must remain pretty rare.

This patch must be backported as far as 2.0.
2022-02-21 16:05:47 +01:00
Amaury Denoyelle
ea3e0355da MINOR: mux-quic: fix a possible null dereference in qc_timeout_task
The qcc instance should be tested as it is implied by a previous test
that it may be NULL. In this case, qc_timeout_task can be stopped.

This should fix github issue #1559.
2022-02-21 10:05:16 +01:00
Willy Tarreau
11adb1d8fc BUG/MEDIUM: httpclient: limit transfers to the maximum available room
A bug was uncovered by commit fc5912914 ("MINOR: httpclient: Don't limit
data transfer to 1024 bytes"), it happens that callers of b_xfer() and
b_force_xfer() are expected to check for available room in the target
buffer. Previously it was unlikely to be full but now with full buffer-
sized transfers, it happens more often and in practice it is possible
to crash the process with the debug command "httpclient" on the CLI by
going beyond a the max buffer size. Other call places ought to be
rechecked by now and it might be time to rethink this API if it tends
to generalize.

This must be backported to 2.5.
2022-02-18 17:32:12 +01:00
William Lallemand
8a91374487 BUG/MINOR: tools: url2sa reads ipv4 too far
The url2sa implementation is inconsitent when parsing an IPv4, indeed
url2sa() takes a <ulen> as a parameter where the call to url2ipv4() takes
a null terminated string. Which means url2ipv4 could try to read more
that it is supposed to.

This function is only used from a buffer so it never reach a unallocated
space. It can only cause an issue when used from the httpclient which
uses it with an ist.

This patch fixes the issue by copying everything in the trash and
null-terminated it.

Must be backported in all supported version.
2022-02-18 16:32:04 +01:00
Willy Tarreau
2c8f984441 CLEANUP: httpclient/cli: fix indentation alignment of the help message
The output was not aligned with other commands, let's fix it.
2022-02-18 16:29:50 +01:00
Remi Tricot-Le Breton
1b01b7f2ef BUG/MINOR: ssl: Missing return value check in ssl_ocsp_response_print
When calling ssl_ocsp_response_print which is used to display an OCSP
response's details when calling the "show ssl ocsp-response" on the CLI,
we use the BIO_read function that copies an OpenSSL BIO into a trash.
The return value was not checked though, which could lead to some
crashes since BIO_read can return a negative value in case of error.

This patch should be backported to 2.5.
2022-02-18 09:58:04 +01:00
Remi Tricot-Le Breton
8081b67699 BUG/MINOR: ssl: Fix leak in "show ssl ocsp-response" CLI command
When calling the "show ssl ocsp-response" CLI command some OpenSSL
objects need to be created in order to get some information related to
the OCSP response and some of them were not freed.

It should be backported to 2.5.
2022-02-18 09:57:57 +01:00
Remi Tricot-Le Breton
a9a591ab3d BUG/MINOR: ssl: Add missing return value check in ssl_ocsp_response_print
The b_istput function called to append the last data block to the end of
an OCSP response's detailed output was not checked in
ssl_ocsp_response_print. The ssl_ocsp_response_print return value checks
were added as well since some of them were missing.
This error was raised by Coverity (CID 1469513).

This patch fixes GitHub issue #1541.
It can be backported to 2.5.
2022-02-18 09:57:51 +01:00
William Lallemand
4f4f2b7b5f MINOR: httpclient/lua: add 'dst' optionnal field
The 'dst' optionnal field on a httpclient request can be used to set an
alternative server address in the haproxy address format. Which means it
could be use with unix@, ipv6@ etc.

Should fix issue #1471.
2022-02-17 20:07:00 +01:00
William Lallemand
7b2e0ee1c1 MINOR: httpclient: sets an alternative destination
httpclient_set_dst() allows to set an alternative destination address
using HAProxy addres format. This will ignore the address within the
URL.
2022-02-17 20:07:00 +01:00
Lukas Tribus
1a16e4ebcb BUG/MINOR: mailers: negotiate SMTP, not ESMTP
As per issue #1552 the mailer code currently breaks on ESMTP multiline
responses. Let's negotiate SMTP instead.

Should be backported to 2.0.
2022-02-17 15:45:59 +01:00
William Lallemand
5085bc3103 BUG/MINOR: httpclient: reinit flags in httpclient_start()
When starting for the 2nd time a request from the same httpclient *hc
context, the flags are not reinitialized and the httpclient will stop
after the first call to the IO handler, because the END flag is always
present.

This patch also add a test before httpclient_start() to ensure we don't
start a client already started.

Must be backported in 2.5.
2022-02-17 12:59:52 +01:00
Willy Tarreau
d0de677682 BUG/MINOR: mux-h2: update the session's idle delay before creating the stream
The idle connection delay calculation before a request is a bit tricky,
especially for multiplexed protocols. It changed between 2.3 and 2.4 by
the integration of the idle delay inside the session itself with these
commits:

  dd78921c6 ("MINOR: logs: Use session idle duration when no stream is provided")
  7a6c51324 ("MINOR: stream: Always get idle duration from the session")

and by then it was only set by the H1 mux. But over multiple changes, what
used to be a zero idle delay + a request delay for H2 became a bit odd, with
the idle time slipping into the request time measurement. The effect is that,
as reported in GH issue #1395, some H2 request times look huge.

This patch introduces the calculation of the session's idle time on the
H2 mux before creating the stream. This is made possible because the
stream_new() code immediately copies this value into the stream for use
at log time. Thus we don't care about changing something that will be
touched by every single request. The idle time is calculated as documented,
i.e. the delay from the previous request to the current one. This also
means that when a single stream is present on a connection, a part of
the server's response time may appear in the %Ti measurement, but this
reflects the reality since nothing would prevent the client from using
the connection to fetch more objects. In addition this shows how long
it takes a client to find references to objects in an HTML page and
start to fetch them.

A different approach could have consisted in counting from the last time
the connection was left without any request (i.e. really idle), but this
would at least require a documentation change and it's not certain this
would provide a more useful information.

Thanks to Bart Butler and Luke Seelenbinder for reporting enough elements
to diagnose this issue.

This should be backported to 2.4.
2022-02-16 14:42:30 +01:00
Willy Tarreau
c7d85485a0 BUG/MEDIUM: h2/hpack: fix emission of HPACK DTSU after settings change
Sadly, despite particular care, commit 39a0a1e12 ("MEDIUM: h2/hpack: emit
a Dynamic Table Size Update after settings change") broke H2 when sending
DTSU. A missing negation on the flag caused the DTSU_EMITTED flag to be
lost and the DTSU to be sent again on the next stream, and possibly to
break flow control or a few other internal states.

This will have to be backported wherever the patch above was backported.

Thanks to Yves Lafon for notifying us with elements to reproduce the
issue!
2022-02-16 14:42:13 +01:00
Willy Tarreau
b042e4f6f7 BUG/MAJOR: spoe: properly detach all agents when releasing the applet
There's a bug in spoe_release_appctx() which checks the presence of items
in the wrong list rt[tid].agents to run over rt[tid].waiting_queue and
zero their spoe_appctx. The effect is that these contexts are not zeroed
and if spoe_stop_processing() is called, "sa->cur_fpa--" will be applied
to one of these recently freed contexts and will corrupt random memory
locations, as found at least in bugs #1494 and #1525.

This must be backported to all stable versions.

Many thanks to Christian Ruppert from Babiel for exchanging so many
useful traces over the last two months, testing debugging code and
helping set up a similar environment to reproduce it!
2022-02-16 14:42:13 +01:00
Andrew McDermott
bfb15ab34e BUG/MAJOR: http/htx: prevent unbounded loop in http_manage_server_side_cookies
Ensure calls to http_find_header() terminate. If a "Set-Cookie2"
header is found then the while(1) loop in
http_manage_server_side_cookies() will never terminate, resulting in
the watchdog firing and the process terminating via SIGABRT.

The while(1) loop becomes unbounded because an unmatched call to
http_find_header("Set-Cookie") will leave ctx->blk=NULL. Subsequent
calls to check for "Set-Cookie2" will now enumerate from the beginning
of all the blocks and will once again match on subsequent
passes (assuming a match first time around), hence the loop becoming
unbounded.

This issue was introduced with HTX and this fix should be backported
to all versions supporting HTX.

Many thanks to Grant Spence (gspence@redhat.com) for working through
this issue with me.
2022-02-16 14:42:13 +01:00
Amaury Denoyelle
1d5fdc526b MINOR: h3: remove unused return value on decode_qcs
This should fix 1470806 coverity report from github issue #1550.
2022-02-16 14:37:56 +01:00
William Lallemand
de6ecc3ace BUG/MINOR: httpclient/cli: display junk characters in vsn
ist are not ended by '\0', leading to junk characters being displayed
when using %s for printing the HTTP start line.

Fix the issue by replacing %s by %.*s + istlen.

Must be backported in 2.5.
2022-02-16 11:37:02 +01:00
Remi Tricot-Le Breton
d544d33e10 BUG/MINOR: jwt: Memory leak if same key is used in multiple jwt_verify calls
If the same filename was specified in multiple calls of the jwt_verify
converter, we would have parsed the contents of the file every time it
was used instead of checking if the entry already existed in the tree.
This lead to memory leaks because we would not insert the duplicated
entry and we would not free it (as well as the EVP_PKEY it referenced).
We now check the return value of ebst_insert and free the current entry
if it is a duplicate of an existing entry.
The order in which the tree insert and the pkey parsing happen was also
switched in order to avoid parsing key files in case of duplicates.

Should be backported to 2.5.
2022-02-15 20:08:20 +01:00
Remi Tricot-Le Breton
2b5a655946 BUG/MINOR: jwt: Missing pkey free during cleanup
When emptying the jwt_cert_tree during deinit, the entries are freed but
not the EVP_PKEY reference they kept, leading in a memory leak.

Should be backported in 2.5.
2022-02-15 20:08:20 +01:00
Remi Tricot-Le Breton
4930c6c869 BUG/MINOR: jwt: Double free in deinit function
The node pointer was not moving properly along the jwt_cert_tree during
the deinit which ended in a double free during cleanup (or when checking
a configuration that used the jwt_verify converter with an explicit
certificate specified).

This patch fixes GitHub issue #1533.
It should be backported to 2.5.
2022-02-15 20:08:20 +01:00
Amaury Denoyelle
31e4f6e149 MINOR: h3: report error on HEADERS/DATA parsing
Inspect return code of HEADERS/DATA parsing functions and use a BUG_ON
to signal an error. The stream should be closed to handle the error
in a more clean fashion.
2022-02-15 17:33:21 +01:00
Frédéric Lécaille
71f3abbb52 MINOR: quic: Move quic_rxbuf_pool pool out of xprt part
This pool could be confuse with that of the RX buffer pool for the connection
(quic_conn_rxbuf).
2022-02-15 17:33:21 +01:00
Frédéric Lécaille
53c7d8db56 MINOR: quic: Do not retransmit too much packets.
We retranmist at most one datagram and possibly one more with only PING frame
as ack-eliciting frame.
2022-02-15 17:33:21 +01:00
Frédéric Lécaille
0c80e69470 MINOR: quic: Possible frame parsers array overrun
This should fix CID 1469663 for GH #1546.
2022-02-15 17:33:21 +01:00
Frédéric Lécaille
59509b5187 MINOR: quic: Non checked returned value for cs_new() in h3_decode_qcs()
This should fix CID 1469664 for GH #1546
2022-02-15 17:33:21 +01:00
Frédéric Lécaille
3c08cb4948 MINOR: h3: Dead code in h3_uqs_init()
This should fix CID 1469657 for GH #1546.
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
1e1fb5db45 MINOR: quic: Non checked returned value for cs_new() in hq_interop_decode_qcs()
This should fix CID 1469657 for GH #1546
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
498e992c1c MINOR: quic: Useless test in quic_lstnr_dghdlr()
This statement is useless. This should fix CID 1469651 for GH #1546.
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
e1c3546efa MINOR: quic: Avoid warning about NULL pointer dereferences
This is the same fixe as for this commit:
    "BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types"
Should fix CID 1469649 for GH #1546
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
ee4508da4f MINOR: quic: ha_quic_set_encryption_secrets without server specific code
Remove this server specific code section. It is useless, not tested. Furthermore
this is really not the good place to retrieve the peer transport parameters.
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
16de9f7dbf MINOR: quic: Code never reached in qc_ssl_sess_init()
There was a remaining useless statement in this code block.
This fixes CID 1469648 for GH #1546
2022-02-15 17:23:44 +01:00
Frédéric Lécaille
21db6f962b MINOR: quic: Wrong loss delay computation
I really do not know where does this statement come from even after having
checked several drafts.
2022-02-15 17:23:44 +01:00
Amaury Denoyelle
91379f79f8 MINOR: h3: implement DATA parsing
Add a new function h3_data_to_htx. This function is used to parse a H3
DATA frame and copy it in the mux stream HTX buffer. This is required to
support HTTP POST data.

Note that partial transfers if the HTX buffer is fulled is not properly
handle. This causes large DATA transfer to fail at the moment.
2022-02-15 17:17:00 +01:00
Amaury Denoyelle
7b0f1220d4 MINOR: h3: extract HEADERS parsing in a dedicated function
Move the HEADERS parsing code outside of generic h3_decode_qcs to a new
dedicated function h3_headers_to_htx. The benefit will be visible when
other H3 frames parsing will be implemented such as DATA.
2022-02-15 17:12:27 +01:00
Amaury Denoyelle
0484f92656 MINOR: h3: report frames bigger than rx buffer
If a frame is bigger than the qcs buffer, it can not be parsed at the
moment. Add a TODO comment to signal that a fix is required.
2022-02-15 17:11:59 +01:00
Amaury Denoyelle
bb56530470 MINOR: h3: set CS_FL_NOT_FIRST
When creating a new conn-stream on H3 HEADERS parsing, the flag
CS_FL_NOT_FIRST must be set. This is identical to the mux-h2.
2022-02-15 17:10:51 +01:00
Amaury Denoyelle
eb53e5baa1 MINOR: mux-quic: set EOS on rcv_buf
Flags EOI/EOS must be set on conn-stream when transfering the last data
of a stream in rcv_buf. This is activated if qcs HTX buffer has the EOM
flag and has been fully transfered.
2022-02-15 17:10:51 +01:00
Amaury Denoyelle
9a327a7c3f MINOR: mux-quic: implement rcv_buf
Implement the stream rcv_buf operation on QUIC mux.

A new buffer is stored in qcs structure named app_buf. This new buffer
will contains HTX and will be filled for example on H3 DATA frame
parsing.

The rcv_buf operation transfer as much as possible data from the HTX
from app_buf to the conn-stream buffer. This is mainly identical to
mux-h2. This is required to support HTTP POST data.
2022-02-15 17:10:51 +01:00
Amaury Denoyelle
95b93a3a93 MINOR: h3: set properly HTX EOM/BODYLESS on HEADERS parsing
Adjust the method to detect that a H3 HEADERS frame is the last one of
the stream. If this is true, the flags EOM and BODYLESS must be set on
the HTX message.
2022-02-15 17:08:48 +01:00
Amaury Denoyelle
a04724af29 MINOR: h3: add documentation on h3_decode_qcs
Specify the purpose of the fin argument on h3_decode_qcs.
2022-02-15 17:08:32 +01:00
Amaury Denoyelle
ffafb3d2c2 MINOR: h3: remove transfer-encoding header
According to HTTP/3 specification, transfer-encoding header must not be
used in HTTP/3 messages. Remove it when converting HTX responses to
HTTP/3.
2022-02-15 17:08:22 +01:00
Amaury Denoyelle
4ac6d37333 BUG/MINOR: h3: fix the header length for QPACK decoding
Pass the H3 frame length to QPACK decoding instead of the length of the
whole buffer.

Without this fix, if there is multiple H3 frames starting with a
HEADERS, QPACK decoding will be erroneously applied over all of them,
most probably leading to a decoding error.
2022-02-15 17:06:22 +01:00
Amaury Denoyelle
6a2c2f4910 BUG/MINOR: quic: fix FIN stream signaling
If the last frame is not entirely copied and must be buffered, FIN
must not be signaled to the upper layer.

This might fix a rare bug which could cause the request channel to be
closed too early leading to an incomplete request.
2022-02-15 17:01:14 +01:00
Amaury Denoyelle
ab9cec7ce1 MINOR: qpack: fix typo in trace
hanme -> hname
2022-02-15 11:08:17 +01:00
Amaury Denoyelle
4af6595d41 BUG/MEDIUM: quic: fix crash on CC if mux not present
If a CONNECTION_CLOSE is received during handshake or after mux release,
a segfault happens due to invalid dereferencement of qc->qcc. Check
mux_state first to prevent this.
2022-02-15 11:08:17 +01:00
Amaury Denoyelle
8524f0f779 MINOR: quic: use a global dghlrs for each thread
Move the QUIC datagram handlers oustide of the receivers. Use a global
handler per-thread which is allocated on post-config. Implement a free
function on process deinit to avoid a memory leak.
2022-02-15 10:13:20 +01:00
Willy Tarreau
6c8babf6c4 BUG/MAJOR: sched: prevent rare concurrent wakeup of multi-threaded tasks
Since the relaxation of the run-queue locks in 2.0 there has been a
very small but existing race between expired tasks and running tasks:
a task might be expiring and being woken up at the same time, on
different threads. This is protected against via the TASK_QUEUED and
TASK_RUNNING flags, but just after the task finishes executing, it
releases it TASK_RUNNING bit an only then it may go to task_queue().
This one will do nothing if the task's ->expire field is zero, but
if the field turns to zero between this test and the call to
__task_queue() then three things may happen:
  - the task may remain in the WQ until the 24 next days if it's in
    the future;
  - the task may prevent any other task after it from expiring during
    the 24 next days once it's queued
  - if DEBUG_STRICT is set on 2.4 and above, an abort may happen
  - since 2.2, if the task got killed in between, then we may
    even requeue a freed task, causing random behaviour next time
    it's found there, or possibly corrupting the tree if it gets
    reinserted later.

The peers code is one call path that easily reproduces the case with
the ->expire field being reset, because it starts by setting it to
TICK_ETERNITY as the first thing when entering the task handler. But
other code parts also use multi-threaded tasks and rightfully expect
to be able to touch their expire field without causing trouble. No
trivial code path was found that would destroy such a shared task at
runtime, which already limits the risks.

This must be backported to 2.0.
2022-02-14 20:10:43 +01:00
Willy Tarreau
27c8da1fd5 DEBUG: pools: replace the link pointer with the caller's address on pool_free()
Along recent evolutions of the pools, we've lost the ability to reliably
detect double-frees because while in the past the same pointer was being
used to chain the objects in the cache and to store the pool's address,
since 2.0 they're different so the pool's address is never overwritten on
free() and a double-free will rarely be detected.

This patch sets the caller's return address there. It can never be equal
to a pool's address and will help guess what was the previous call path.
It will not work on exotic architectures nor with very old compilers but
these are not the environments where we're trying to get detailed bug
reports, and this is not done by default anyway so we don't care about
this limitation. Note that depending on the inlining status of the
function, the result may differ but that's no big deal either.

A test by placing a double free of an appctx inside the release handler
itself successfully reported the trouble during appctx_free() and showed
that the return address was in stream_int_shutw_applet() (this one calls
the release handler).
2022-02-14 20:10:43 +01:00
Willy Tarreau
49bb5d4268 DEBUG: pools: let's add reverse mapping from cache heads to thread and pool
During global eviction we're visiting nodes from the LRU tail and we
determine their pool cache head and their pool. In order to make sure
we never mess up, let's add some backwards pointer to the thread number
and pool from the pool_cache_head. It's 64-byte aligned anyway so we're
not wasting space and it helps for debugging and will prevent memory
corruption the earliest possible.
2022-02-14 20:10:43 +01:00
Willy Tarreau
e2830addda DEBUG: pools: add extra sanity checks when picking objects from a local cache
These few checks are added to make sure we never try to pick an object from
an empty list, which would have a devastating effect.
2022-02-14 20:10:43 +01:00
Willy Tarreau
ceabc5ca8c CLEANUP: pools: don't needlessly set a call mark during refilling of caches
When refilling caches from the shared cache, it's pointless to set the
pointer to the local pool since it may be overwritten immediately after
by the LIST_INSERT(). This is a leftover from the pre-2.4 code in fact.
It didn't hurt, though.
2022-02-14 20:10:43 +01:00
Willy Tarreau
c895c441c7 BUG/MINOR: pools: always flush pools about to be destroyed
When destroying a pool (e.g. at exit or when resizing buffers), it's
important to try to free all their local objects otherwise we can leave
some in the cache. This is particularly visible when changing "bufsize",
because "show pools" will then show two "trash" pools, one of which
contains a single object in cache (which is fortunately not reachable).
In all cases this happens while single-threaded so that's easy to do,
we just have to do it on the current thread.

The easiest way to do this is to pass an extra argument to function
pool_evict_from_local_cache() to force a full flush instead of a
partial one.

This can probably be backported to about all branches where this
applies, but at least 2.4 needs it.
2022-02-14 20:10:43 +01:00
Willy Tarreau
b5ba09ed58 BUG/MEDIUM: pools: ensure items are always large enough for the pool_cache_item
With the introduction of DEBUG_POOL_TRACING in 2.6-dev with commit
add43fa43 ("DEBUG: pools: add new build option DEBUG_POOL_TRACING"), small
pools might be too short to store both the pool_cache_item struct and the
caller location, resulting in memory corruption and crashes when this debug
option is used.

What happens here is that the way the size is calculated is by considering
that the POOL_EXTRA part is only used while the object is in use, but this
is not true anymore for the caller's pointer which must absolutely be placed
after the pool_cache_item.

This patch makes sure that the caller part will always start after the
pool_cache_item and that the allocation will always be sufficent. This is
only tagged medium because the debug option is new and unlikely to be used
unless requested by a developer.

No backport is needed.
2022-02-14 20:10:43 +01:00
Frédéric Lécaille
547aa0e95e MINOR: quic: Useless statement in quic_crypto_data_cpy()
This should fix Coverity CID 375057 in GH #1526 where a useless assignment
was detected.
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
c0b481f87b MINOR: quic: Possible memleak in qc_new_conn()
This should fix Coverity CID 375047 in GH #1536 where <buf_area> could leak because
not always freed by by quic_conn_drop(), especially when not stored in <qc> variable.
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
225c31fc9f CLEANUP: h3: Unreachable target in h3_uqs_init()
This should fix Coverity CID 375045 in GH #1536 which detects a no more
use "err" target in h3_uqs_init()
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
6842485a84 MINOR: quic: Possible overflow in qpack_get_varint()
This should fix CID 375051 in GH 1536 where a signed integer expression (1 << bit)
which could overflow was compared to a uint64_t.
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
ce2ecc9643 MINOR: quic: Potential overflow expression in qc_parse_frm()
This should fix Coverity CID 375056 where an unsigned char was used
to store a 32bit mask.
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
439c464250 MINOR: quic: EINTR error ignored
This should fix Coverity CID 375050 in GH #1536 where EINTR errno was ignored due
to wrong do...while() loop usage.
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
3916ca197e MINOR: quic: Variable used before being checked in ha_quic_add_handshake_data()
This should fix Coverity CID 375058 in GH issue #1536
2022-02-14 15:20:54 +01:00
Frédéric Lécaille
83cd51e87a MINOR: quic: Remove an RX buffer useless lock
This lock is no more useful: the RX buffer for a connection is always handled
by the same thread.
2022-02-14 15:20:54 +01:00
Remi Tricot-Le Breton
88c5695c67 MINOR: ssl: Remove calls to SSL_CTX_set_tmp_dh_callback on OpenSSLv3
The SSL_CTX_set_tmp_dh_callback function was marked as deprecated in
OpenSSLv3 so this patch replaces this callback mechanism by a direct set
of DH parameters during init.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
c76c3c4e59 MEDIUM: ssl: Replace all DH objects by EVP_PKEY on OpenSSLv3 (via HASSL_DH type)
DH structure is a low-level one that should not be used anymore with
OpenSSLv3. All functions working on DH were marked as deprecated and
this patch replaces the ones we used with new APIs recommended in
OpenSSLv3, be it in the migration guide or the multiple new manpages
they created.
This patch replaces all mentions of the DH type by the HASSL_DH one,
which will be replaced by EVP_PKEY with OpenSSLv3 and will remain DH on
older versions. It also uses all the newly created helper functions that
enable for instance to load DH parameters from a file into an EVP_PKEY,
or to set DH parameters into an SSL_CTX for use in a DHE negotiation.

The following deprecated functions will effectively disappear when
building with OpenSSLv3 : DH_set0_pqg, PEM_read_bio_DHparams, DH_new,
DH_free, DH_up_ref, SSL_CTX_set_tmp_dh.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
55d7e782ee MINOR: ssl: Set default dh size to 2048
Starting from OpenSSLv3, we won't rely on the
SSL_CTX_set_tmp_dh_callback mechanism so we will need to know the DH
size we want to use during init. In order for the default DH param size
to be used when no RSA or DSA private key can be found for a given bind
line, we will need to know the default size we want to use (which was
not possible the way the code was built, since the global default dh
size was set too late.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
bed72631f9 MINOR: ssl: Build local DH of right size when needed
The current way the local DH structures are built relies on the fact
that the ssl_get_tmp_dh function would only be called as a callback
during a DHE negotiation, so after all the SSL contexts are built and
the init is over. With OpenSSLv3, this function will now be called
during init, so before those objects are curretly built.
This patch ensures that when calling ssl_get_tmp_dh and trying to use
one of or hard-coded DH parameters, it will be created if it did not
exist yet.
The current DH parameter creation is also kept so that with versions
before OpenSSLv3 we don't end up creating this DH object during a
handshake.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
7f6425a130 MINOR: ssl: Add ssl_new_dh_fromdata helper function
Starting from OpenSSLv3, the DH_set0_pqg function is deprecated and the
use of DH objects directly is advised against so this new helper
function will be used to convert our hard-coded DH parameters into an
EVP_PKEY. It relies on the new OSSL_PARAM mechanism, as described in the
EVP_PKEY-DH manpage.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
5f17930572 MINOR: ssl: Add ssl_sock_set_tmp_dh_from_pkey helper function
This helper function will only be used with OpenSSLv3. It simply sets in
an SSL_CTX a set of DH parameters of the same size as a certificate's
private key. This logic is the same as the one used with older versions,
it simply relies on new APIs.
If no pkey can be found the SSL_CTX_set_dh_auto function wll be called,
making the SSL_CTX rely on DH parameters provided by OpenSSL in case of
DHE negotiation.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
846eda91ba MINOR: ssl: Add ssl_sock_set_tmp_dh helper function
Starting from OpenSSLv3, the SSL_CTX_set_tmp_dh function is deprecated
and it should be replaced by SSL_CTX_set0_tmp_dh_pkey, which takes an
EVP_PKEY instead of a DH parameter. Since this function is new to
OpenSSLv3 and its use requires an extra EVP_PKEY_up_ref call, we will
keep the two versions side by side, otherwise it would require to get
rid of all DH references in older OpenSSL versions as well.
This helper function is not used yet so this commit should be strictly
iso-functional, regardless of the OpenSSL version.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
292a88ce94 MINOR: ssl: Factorize ssl_get_tmp_dh and append a cbk to its name
In the upcoming OpenSSLv3 specific patches, we will make use of the
newly created ssl_get_tmp_dh that returns an EVP_PKEY containing DH
parameters of the same size as a bind line's RSA or DSA private key.
The previously named ssl_get_tmp_dh function was renamed
ssl_get_tmp_dh_cbk because it is only used as a callback passed to
OpenSSL through SSL_CTX_set_tmp_dh_callback calls.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
09ebb3359a MINOR: ssl: Add ssl_sock_get_dh_from_bio helper function
This new function makes use of the new OpenSSLv3 APIs that should be
used to load DH parameters from a file (or a BIO in this case) and that
should replace the deprecated PEM_read_bio_DHparams function.
Note that this function returns an EVP_PKEY when using OpenSSLv3 since
they now advise against using low level structures such as DH ones.
This helper function is not used yet so this commit should be stricly
iso-functional, regardless of the OpenSSL version.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
78a36e3344 MINOR: ssl: Remove call to ERR_load_SSL_strings with OpenSSLv3
Starting from OpenSSLv3, error strings are loaded automatically so
ERR_load_SSL_strings is not needed anymore and was marked as deprecated.
2022-02-14 10:07:14 +01:00
Remi Tricot-Le Breton
1effd9aa09 MINOR: ssl: Remove call to ERR_func_error_string with OpenSSLv3
ERR_func_error_string does not return anything anymore with OpenSSLv3,
it can be replaced by ERR_peek_error_func which did not exist on
previous versions.
2022-02-14 10:07:14 +01:00
William Lallemand
7b820a6191 BUG/MINOR: mworker: does not erase the pidfile upon reload
When started in master-worker mode combined with daemon mode, HAProxy
will open() with O_TRUNC the pidfile when switching to wait mode.

In 2.5, it happens  everytime after trying to load the configuration,
since we switch to wait mode.

In previous version this happens upon a failure of the configuration
loading.

Fixes bug #1545.

Must be backported in every supported branches.
2022-02-14 09:28:13 +01:00
Amaury Denoyelle
58a7704d54 MINOR: quic: take out xprt snd_buf operation
Rename quic_conn_to_buf to qc_snd_buf and remove it from xprt ops. This
is done to reflect the true usage of this function which is only a
wrapper around sendto but cannot be called by the upper layer.

qc_snd_buf is moved in quic-sock because to mark its link with
quic_sock_fd_iocb which is the recvfrom counterpart.
2022-02-09 15:57:46 +01:00
Amaury Denoyelle
80bd837aaf MINOR: quic: remove unused xprt rcv_buf operation
rcv_buf and the implement quic_conn_to_buf is not used. recvfrom instead
is called on the listener polling task via quic_sock_fd_iocb.
2022-02-09 15:57:41 +01:00
Amaury Denoyelle
f6dcbce53e MINOR: quic: rename local tid variable
Rename a local variable tid to cid_tid. This ensures there is no
confusion with the global tid. It is now more explicit that we are
manipulating a quic datagram handlers from another thread in
quic_lstnr_dgram_dispatch.
2022-02-09 15:05:23 +01:00
Amaury Denoyelle
b78805488b MINOR: h3: hardcode the stream id of control stream
Use the value of 0x3 for the stream-id of H3 control-stream. As a
consequence, qcs_get_next_id is now unused and is thus removed.
2022-02-09 15:05:23 +01:00
Remi Tricot-Le Breton
c9414e25c4 MINOR: ssl: Remove call to HMAC_Init_ex with OpenSSLv3
HMAC_Init_ex being a function that acts on a low-level HMAC_CTX
structure was marked as deprecated in OpenSSLv3.
This patch replaces this call by EVP_MAC_CTX_set_params, as advised in
the migration_guide, and uses the new OSSL_PARAM mechanism to configure
the MAC context, as described in the EVP_MAC and EVP_MAC-HMAC manpages.
2022-02-09 12:11:31 +01:00
Remi Tricot-Le Breton
8ea1f5f6cd MINOR: ssl: Remove call to SSL_CTX_set_tlsext_ticket_key_cb with OpenSSLv3
SSL_CTX_set_tlsext_ticket_key_cb was deprecated on OpenSSLv3 because it
uses an HMAC_pointer which is deprecated as well. According to the v3's
manpage it should be replaced by SSL_CTX_set_tlsext_ticket_key_evp_cb
which uses a EVP_MAC_CTX pointer.
This new callback was introduced in OpenSSLv3 so we need to keep the two
calls in the source base and to split the usage depending on the OpenSSL
version.
2022-02-09 12:11:31 +01:00
Remi Tricot-Le Breton
c11e7e1d94 MINOR: ssl: Remove EC_KEY related calls when creating a certificate
In the context of the 'generate-certificates' bind line option, if an
'ecdhe' option is present on the bind line as well, we use the
SSL_CTX_set_tmp_ecdh function which was marked as deprecated in
OpenSSLv3. As advised in the SSL_CTX_set_tmp_ecdh manpage, this function
should be replaced by the SSL_CTX_set1_groups one (or the
SSL_CTX_set1_curves one in our case which does the same but existed on
older OpenSSL versions as well).

The ECDHE behaviour with OpenSSL 1.0.2 is not the same when using the
SSL_CTX_set1_curves function as the one we have on newer versions.
Instead of looking for a code that would work exactly the same
regardless of the OpenSSL version, we will keep the original code on
1.0.2 and use newer APIs for other versions.

This patch should be strictly isofunctional.
2022-02-09 11:15:44 +01:00
Remi Tricot-Le Breton
ff4c3c4c9e MINOR: ssl: Remove EC_KEY related calls when preparing SSL context
The ecdhe option relies on the SSL_CTX_set_tmp_ecdh function which has
been marked as deprecated in OpenSSLv3. As advised in the
SSL_CTX_set_tmp_ecdh manpage, this function should be replaced by the
SSL_CTX_set1_groups one (or the SSL_CTX_set1_curves one in our case
which does the same but existed on older OpenSSL versions as well).

When using the "curves" option we have a different behaviour with
OpenSSL1.0.2 compared to later versions. On this early version an SSL
backend using a P-256 ECDSA certificate manages to connect to an SSL
frontend having a "curves P-384" option (when it fails with later
versions).
Even if the API used for later version than OpenSSL 1.0.2 already
existed then, for some reason the behaviour is not the same on the older
version which explains why the original code with the deprecated API is
kept for this version (otherwise we would risk breaking everything on a
version that might still be used by some people despite being pretty old).

This patch should be strictly isofunctional.
2022-02-09 11:15:44 +01:00
Remi Tricot-Le Breton
2559bc8318 MINOR: ssl: Use high level OpenSSL APIs in sha2 converter
The sha2 converter's implementation used low level interfaces such as
SHA256_Update which are flagged as deprecated starting from OpenSSLv3.
This patch replaces those calls by EVP ones which already existed on
older versions. It should be fully isofunctional.
2022-02-09 11:15:44 +01:00
Remi Tricot-Le Breton
36f80f6e0b CLEANUP: ssl: Remove unused ssl_sock_create_cert function
This function is not used anymore, it can be removed.
2022-02-09 11:15:44 +01:00
Remi Tricot-Le Breton
2e7d1eb2a7 BUG/MINOR: ssl: Remove empty lines from "show ssl ocsp-response <id>" output
There were empty lines in the output of the CLI's "show ssl
ocsp-response <id>" command. The plain "show ssl ocsp-response" command
(without parameter) was already managed in commit
cc750efbc5. This patch adds an extra space
to those lines so that the only existing empty lines actually mark the
end of the output. This requires to post-process the buffer filled by
OpenSSL's OCSP_RESPONSE_print function (which produces the output of the
"openssl ocsp -respin <ocsp.pem>" command). This way the output of our
command still looks the same as openssl's one.

Must be backported in 2.5.
2022-02-03 09:57:24 +01:00
Frédéric Lécaille
bfa3236c6c MINOR: quic: Remove a useless test in quic_get_dgram_dcid()
This test is already done when entering quic_get_dgram_dcid().
2022-02-02 18:24:43 +01:00
Frédéric Lécaille
f6f7520b9b MINOR: quic: Wrong datagram buffer passed to quic_lstnr_dgram_dispatch()
The same datagram could be passed to quic_lstnr_dgram_dispatch() before
being consumed by qc_lstnr_pkt_rcv() leading to a wrong decryption for the packet
number decryption, then a decryption error for the data. This was due to
a wrong datagram buffer passed to quic_lstnr_dgram_dispatch(). The datagram data
which must be passed to quic_lstnr_dgram_dispatch() are the same as the one
passed to recvfrom().
2022-02-02 18:24:21 +01:00
Frédéric Lécaille
841bf5e7f4 MINOR: quic: Do not modify a marked as consumed datagram
Mark the datagrams as consumed at the very last time.
2022-02-02 18:24:21 +01:00
Christopher Faulet
fc5912914b MINOR: httpclient: Don't limit data transfer to 1024 bytes
For debug purpose, no more 1024 bytes were copied at a time. But there is no
reason to keep this limitation. Thus, it is removed.

This patch may be backported to 2.5.
2022-02-02 16:19:19 +01:00
Christopher Faulet
6ced61dd0a BUG/MEDIUM: httpclient: Xfer the request when the stream is created
Since the HTTP legacy mode was removed, it is unexpected to create an HTTP
stream without a valid request. Thanks to this change, the wait_for_request
analyzer was significatly simplified. And it is possible because HTTP
multiplexers already take care to have a valid request to create a stream.

But it means that any HTTP applet on the client side must do the same. The
httpclient client is one of them. And it is not a problem because the
request is generated before starting the applet. We must just take care to
set the right state.

For now it works "by chance", because the applet seems to be scheduled
before the stream itself. But if this change, this will lead to crash
because the stream expects to have a request when wait_for_request analyzer.

This patch should be backported to 2.5.
2022-02-02 16:19:17 +01:00
Christopher Faulet
600985df41 BUG/MINOR: httpclient: Revisit HC request and response buffers allocation
For now, these buffers are allocated when the httpclient is created and
freed when it is released. Usually, we try to avoid to keep buffer allocated
if it is not required. Empty buffers should be released ASAP. Apart for
that, there is no issue with the response side because a copy is always
performed. However, for the request side, a swap with the channel's buffer
is always performed. And there is no guarantee the channel's buffer is
allocated. Thus, after the swap, the httpclient can retrieve a null
buffer. In practice, this never happens. But this may change. And it will be
required for a futur fix.

So, now, we systematically take care to have an allocated buffer when we
want to write in it. And it is released as soon as it becomes empty.

This patch should be backported to 2.5.
2022-02-02 16:19:16 +01:00
William Lallemand
dae12c7553 MINOR: mworker/cli: add flags in the prompt
The master CLI prompt is now able to show flags in its prompt depending
on the mode used: experimental (x), expert (e), mcli-debug (d).
2022-02-02 15:51:24 +01:00
William Lallemand
2a17191e91 MINOR: mworker/cli: mcli-debug-mode enables every command
"mcli-debug-mode on" enables every command that were meant for a worker,
on the CLI of the master. Which mean you can issue, "show fd", show
stat" in order to debug the MASTER proxy.

You can also combine it with "expert-mode on" or "experimental-mode on"
to access to more commands.
2022-02-02 15:51:24 +01:00
William Lallemand
d9c28070c1 BUG/MINOR: mworker/cli: don't display help on master applet
When in expert or experimental mode on the master CLI, and issuing a
command for the master process, all commands are prefixed by
"mode-experimental -" or/and "mode-expert on -", however these commands
were not available in the master applet, so the help was issued for
each one.
2022-02-02 15:51:24 +01:00
William Lallemand
fe618fbd0c CLEANUP: cleanup a commentary in pcli_parse_request()
Remove '1' from a commentary in pcli_parse_request()
2022-02-02 15:51:12 +01:00
Willy Tarreau
2454d6ef5b [RELEASE] Released version 2.6-dev1
Released version 2.6-dev1 with the following main changes :
    - BUG/MINOR: cache: Fix loop on cache entries in "show cache"
    - BUG/MINOR: httpclient: allow to replace the host header
    - BUG/MINOR: lua: don't expose internal proxies
    - MEDIUM: mworker: seamless reload use the internal sockpairs
    - BUG/MINOR: lua: remove loop initial declarations
    - BUG/MINOR: mworker: does not add the -sf in wait mode
    - BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
    - MINOR: quic: do not reject PADDING followed by other frames
    - REORG: quic: add comment on rare thread concurrence during CID alloc
    - CLEANUP: quic: add comments on CID code
    - MEDIUM: quic: handle CIDs to rattach received packets to connection
    - MINOR: qpack: support litteral field line with non-huff name
    - MINOR: quic: activate QUIC traces at compilation
    - MINOR: quic: use more verbose QUIC traces set at compile-time
    - MEDIUM: pool: refactor malloc_trim/glibc and jemalloc api addition detections.
    - MEDIUM: pool: support purging jemalloc arenas in trim_all_pools()
    - BUG/MINOR: mworker: deinit of thread poller was called when not initialized
    - BUILD: pools: only detect link-time jemalloc on ELF platforms
    - CI: github actions: add the output of $CC -dM -E-
    - BUG/MEDIUM: cli: Properly set stream analyzers to process one command at a time
    - BUILD: evports: remove a leftover from the dead_fd cleanup
    - MINOR: quic: Set "no_application_protocol" alert
    - MINOR: quic: More accurate immediately close.
    - MINOR: quic: Immediately close if no transport parameters extension found
    - MINOR: quic: Rename qc_prep_hdshk_pkts() to qc_prep_pkts()
    - MINOR: quic: Possible crash when inspecting the xprt context
    - MINOR: quic: Dynamically allocate the secrete keys
    - MINOR: quic: Add a function to derive the key update secrets
    - MINOR: quic: Add structures to maintain key phase information
    - MINOR: quic: Optional header protection key for quic_tls_derive_keys()
    - MINOR: quic: Add quic_tls_key_update() function for Key Update
    - MINOR: quic: Enable the Key Update process
    - MINOR: quic: Delete the ODCIDs asap
    - BUG/MINOR: vars: Fix the set-var and unset-var converters
    - MEDIUM: pool: Following up on previous pool trimming update.
    - BUG/MEDIUM: mux-h1: Fix splicing by properly detecting end of message
    - BUG/MINOR: mux-h1: Fix splicing for messages with unknown length
    - MINOR: mux-h1: Improve H1 traces by adding info about http parsers
    - MINOR: mux-h1: register a stats module
    - MINOR: mux-h1: add counters instance to h1c
    - MINOR: mux-h1: count open connections/streams on stats
    - MINOR: mux-h1: add stat for total count of connections/streams
    - MINOR: mux-h1: add stat for total amount of bytes received and sent
    - REGTESTS: h1: Add a script to validate H1 splicing support
    - BUG/MINOR: server: Don't rely on last default-server to init server SSL context
    - BUG/MEDIUM: resolvers: Detach query item on response error
    - MEDIUM: resolvers: No longer store query items in a list into the response
    - BUG/MAJOR: segfault using multiple log forward sections.
    - BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted
    - BUG/MINOR: resolvers: Don't overwrite the error for invalid query domain name
    - BUILD: bug: Fix error when compiling with -DDEBUG_STRICT_NOCRASH
    - BUG/MEDIUM: sample: Fix memory leak in sample_conv_jwt_member_query
    - DOC: spoe: Clarify use of the event directive in spoe-message section
    - DOC: config: Specify %Ta is only available in HTTP mode
    - BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types
    - IMPORT: slz: use the correct CRC32 instruction when running in 32-bit mode
    - MINOR: quic: fix segfault on CONNECTION_CLOSE parsing
    - MINOR: h3: add BUG_ON on control receive function
    - MEDIUM: xprt-quic: finalize app layer initialization after ALPN nego
    - MINOR: h3: remove duplicated FIN flag position
    - MAJOR: mux-quic: implement a simplified mux version
    - MEDIUM: mux-quic: implement release mux operation
    - MEDIUM: quic: detect the stream FIN
    - MINOR: mux-quic: implement subscribe on stream
    - MEDIUM: mux-quic: subscribe on xprt if remaining data after send
    - MEDIUM: mux-quic: wake up xprt on data transferred
    - MEDIUM: mux-quic: handle when sending buffer is full
    - MINOR: quic: RX buffer full due to wrong CRYPTO data handling
    - MINOR: quic: Race issue when consuming RX packets buffer
    - MINOR: quic: QUIC encryption level RX packets race issue
    - MINOR: quic: Delete remaining RX handshake packets
    - MINOR: quic: Remove QUIC TX packet length evaluation function
    - MINOR: hq-interop: fix tx buffering
    - MINOR: mux-quic: remove uneeded code to check fin on TX
    - MINOR: quic: add HTX EOM on request end
    - BUILD: mux-quic: fix compilation with DEBUG_MEM_STATS
    - MINOR: http-rules: Add capture action to http-after-response ruleset
    - BUG/MINOR: cli/server: Don't crash when a server is added with a custom id
    - MINOR: mux-quic: do not release qcs if there is remaining data to send
    - MINOR: quic: notify the mux on CONNECTION_CLOSE
    - BUG/MINOR: mux-quic: properly initialize flow control
    - MINOR: quic: Compilation fix for quic_rx_packet_refinc()
    - MINOR: h3: fix possible invalid dereference on htx parsing
    - DOC: config: retry-on list is space-delimited
    - DOC: config: fix error-log-format example
    - BUG/MEDIUM: mworker/cli: crash when trying to access an old PID in prompt mode
    - MINOR: hq-interop: refix tx buffering
    - REGTESTS: ssl: use X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY for cert check
    - MINOR: cli: "show version" displays the current process version
    - CLEANUP: cfgparse: modify preprocessor guards around numa detection code
    - MEDIUM: cfgparse: numa detect topology on FreeBSD.
    - BUILD: ssl: unbreak the build with newer libressl
    - MINOR: vars: Move UPDATEONLY flag test to vars_set_ifexist
    - MINOR: vars: Set variable type to ANY upon creation
    - MINOR: vars: Delay variable content freeing in var_set function
    - MINOR: vars: Parse optional conditions passed to the set-var converter
    - MINOR: vars: Parse optional conditions passed to the set-var actions
    - MEDIUM: vars: Enable optional conditions to set-var converter and actions
    - DOC: vars: Add documentation about the set-var conditions
    - REGTESTS: vars: Add new test for conditional set-var
    - MINOR: quic: Attach timer task to thread for the connection.
    - CLEANUP: quic_frame: Remove a useless suffix to STOP_SENDING
    - MINOR: quic: Add traces for STOP_SENDING frame and modify others
    - CLEANUP: quic: Remove cdata_len from quic_tx_packet struct
    - MINOR: quic: Enable TLS 0-RTT if needed
    - MINOR: quic: No TX secret at EARLY_DATA encryption level
    - MINOR: quic: Add quic_set_app_ops() function
    - MINOR: ssl_sock: Set the QUIC application from ssl_sock_advertise_alpn_protos.
    - MINOR: quic: Make xprt support 0-RTT.
    - MINOR: qpack: Missing check for truncated QPACK fields
    - CLEANUP: quic: Comment fix for qc_strm_cpy()
    - MINOR: hq_interop: Stop BUG_ON() truncated streams
    - MINOR: quic: Do not mix packet number space and connection flags
    - CLEANUP: quic: Shorten a litte bit the traces in lstnr_rcv_pkt()
    - MINOR: mux-quic: fix trace on stream creation
    - CLEANUP: quic: fix spelling mistake in a trace
    - CLEANUP: quic: rename quic_conn conn to qc in quic_conn_free
    - MINOR: quic: add missing lock on cid tree
    - MINOR: quic: rename constant for haproxy CIDs length
    - MINOR: quic: refactor concat DCID with address for Initial packets
    - MINOR: quic: compare coalesced packets by DCID
    - MINOR: quic: refactor DCID lookup
    - MINOR: quic: simplify the removal from ODCID tree
    - REGTESTS: vars: Remove useless ssl tunes from conditional set-var test
    - MINOR: ssl: Remove empty lines from "show ssl ocsp-response" output
    - MINOR: quic: Increase the RX buffer for each connection
    - MINOR: quic: Add a function to list remaining RX packets by encryption level
    - MINOR: quic: Stop emptying the RX buffer asap.
    - MINOR: quic: Do not expect to receive only one O-RTT packet
    - MINOR: quic: Do not forget STREAM frames received in disorder
    - MINOR: quic: Wrong packet refcount handling in qc_pkt_insert()
    - DOC: fix misspelled keyword "resolve_retries" in resolvers
    - CLEANUP: quic: rename quic_conn instances to qc
    - REORG: quic: move mux function outside of xprt
    - MINOR: quic: add reference to quic_conn in ssl context
    - MINOR: quic: add const qualifier for traces function
    - MINOR: trace: add quic_conn argument definition
    - MINOR: quic: use quic_conn as argument to traces
    - MINOR: quic: add quic_conn instance in traces for qc_new_conn
    - MINOR: quic: Add stream IDs to qcs_push_frame() traces
    - MINOR: quic: unchecked qc_retrieve_conn_from_cid() returned value
    - MINOR: quic: Wrong dropped packet skipping
    - MINOR: quic: Handle the cases of overlapping STREAM frames
    - MINOR: quic: xprt traces fixes
    - MINOR: quic: Drop asap Retry or Version Negotiation packets
    - MINOR: pools: work around possibly slow malloc_trim() during gc
    - DEBUG: ssl: make sure we never change a servername on established connections
    - MINOR: quic: Add traces for RX frames (flow control related)
    - MINOR: quic: Add CONNECTION_CLOSE phrase to trace
    - REORG: quic: remove qc_ prefix on functions which not used it directly
    - BUG/MINOR: quic: upgrade rdlock to wrlock for ODCID removal
    - MINOR: quic: remove unnecessary call to free_quic_conn_cids()
    - MINOR: quic: store ssl_sock_ctx reference into quic_conn
    - MINOR: quic: remove unnecessary if in qc_pkt_may_rm_hp()
    - MINOR: quic: replace usage of ssl_sock_ctx by quic_conn
    - MINOR: quic: delete timer task on quic_close()
    - MEDIUM: quic: implement refcount for quic_conn
    - BUG/MINOR: quic: fix potential null dereference
    - BUG/MINOR: quic: fix potential use of uninit pointer
    - BUG/MEDIUM: backend: fix possible sockaddr leak on redispatch
    - BUG/MEDIUM: peers: properly skip conn_cur from incoming messages
    - CI: Github Actions: do not show VTest failures if build failed
    - BUILD: opentracing: display warning in case of using OT_USE_VARS at compile time
    - MINOR: compat: detect support for dl_iterate_phdr()
    - MINOR: debug: add ability to dump loaded shared libraries
    - MINOR: debug: add support for -dL to dump library names at boot
    - BUG/MEDIUM: ssl: initialize correctly ssl w/ default-server
    - REGTESTS: ssl: fix ssl_default_server.vtc
    - BUG/MINOR: ssl: free the fields in srv->ssl_ctx
    - BUG/MEDIUM: ssl: free the ckch instance linked to a server
    - REGTESTS: ssl: update of a crt with server deletion
    - BUILD/MINOR: cpuset FreeBSD 14 build fix.
    - MINOR: pools: always evict oldest objects first in pool_evict_from_local_cache()
    - DOC: pool: document the purpose of various structures in the code
    - CLEANUP: pools: do not use the extra pointer to link shared elements
    - CLEANUP: pools: get rid of the POOL_LINK macro
    - MINOR: pool: allocate from the shared cache through the local caches
    - CLEANUP: pools: group list updates in pool_get_from_cache()
    - MINOR: pool: rely on pool_free_nocache() in pool_put_to_shared_cache()
    - MINOR: pool: make pool_is_crowded() always true when no shared pools are used
    - MINOR: pool: check for pool's fullness outside of pool_put_to_shared_cache()
    - MINOR: pool: introduce pool_item to represent shared pool items
    - MINOR: pool: add a function to estimate how many may be released at once
    - MEDIUM: pool: compute the number of evictable entries once per pool
    - MINOR: pools: prepare pool_item to support chained clusters
    - MINOR: pools: pass the objects count to pool_put_to_shared_cache()
    - MEDIUM: pools: centralize cache eviction in a common function
    - MEDIUM: pools: start to batch eviction from local caches
    - MEDIUM: pools: release cached objects in batches
    - OPTIM: pools: reduce local pool cache size to 512kB
    - CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes
    - CI: github actions: update OpenSSL to 3.0.1
    - BUILD/MINOR: tools: solaris build fix on dladdr.
    - BUG/MINOR: cli: fix _getsocks with musl libc
    - BUG/MEDIUM: http-ana: Preserve response's FLT_END analyser on L7 retry
    - MINOR: quic: Wrong traces after rework
    - MINOR: quic: Add trace about in flight bytes by packet number space
    - MINOR: quic: Wrong first packet number space computation
    - MINOR: quic: Wrong packet number space computation for PTO
    - MINOR: quic: Wrong loss time computation in qc_packet_loss_lookup()
    - MINOR: quic: Wrong ack_delay compution before calling quic_loss_srtt_update()
    - MINOR: quic: Remove nb_pto_dgrams quic_conn struct member
    - MINOR: quic: Wrong packet number space trace in qc_prep_pkts()
    - MINOR: quic: Useless test in qc_prep_pkts()
    - MINOR: quic: qc_prep_pkts() code moving
    - MINOR: quic: Speeding up Handshake Completion
    - MINOR: quic: Probe Initial packet number space more often
    - MINOR: quic: Probe several packet number space upon timer expiration
    - MINOR: quic: Comment fix.
    - MINOR: quic: Improve qc_prep_pkts() flexibility
    - MINOR: quic: Do not drop secret key but drop the CRYPTO data
    - MINOR: quic: Prepare Handshake packets asap after completed handshake
    - MINOR: quic: Flag asap the connection having reached the anti-amplification limit
    - MINOR: quic: PTO timer too often reset
    - MINOR: quic: Re-arm the PTO timer upon datagram receipt
    - MINOR: proxy: add option idle-close-on-response
    - MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above.
    - CI: refactor spelling check
    - CLEANUP: assorted typo fixes in the code and comments
    - BUILD: makefile: add -Wno-atomic-alignment to work around clang abusive warning
    - MINOR: quic: Only one CRYPTO frame by encryption level
    - MINOR: quic: Missing retransmission from qc_prep_fast_retrans()
    - MINOR: quic: Non-optimal use of a TX buffer
    - BUG/MEDIUM: mworker: don't use _getsocks in wait mode
    - BUG/MINOR: ssl: Store client SNI in SSL context in case of ClientHello error
    - BUG/MAJOR: mux-h1: Don't decrement .curr_len for unsent data
    - DOC: internals: document the pools architecture and API
    - CI: github actions: clean default step conditions
    - BUILD: cpuset: fix build issue on macos introduced by previous change
    - MINOR: quic: Remaining TRACEs with connection as firt arg
    - MINOR: quic: Reset ->conn quic_conn struct member when calling qc_release()
    - MINOR: quic: Flag the connection as being attached to a listener
    - MINOR: quic: Wrong CRYPTO frame concatenation
    - MINOR: quid: Add traces quic_close() and quic_conn_io_cb()
    - REGTESTS: ssl: Fix ssl_errors regtest with OpenSSL 1.0.2
    - MINOR: quic: Do not dereference ->conn quic_conn struct member
    - MINOR: quic: fix return of quic_dgram_read
    - MINOR: quic: add config parse source file
    - MINOR: quic: implement Retry TLS AEAD tag generation
    - MEDIUM: quic: implement Initial token parsing
    - MINOR: quic: define retry_source_connection_id TP
    - MEDIUM: quic: implement Retry emission
    - MINOR: quic: free xprt tasklet on its thread
    - BUG/MEDIUM: connection: properly leave stopping list on error
    - MINOR: pools: enable pools with DEBUG_FAIL_ALLOC as well
    - MINOR: quic: As server, skip 0-RTT packet number space
    - MINOR: quic: Do not wakeup the I/O handler before the mux is started
    - BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
    - CI: github actions: use cache for OpenTracing
    - BUG/MINOR: httpclient: don't send an empty body
    - BUG/MINOR: httpclient: set default Accept and User-Agent headers
    - BUG/MINOR: httpclient/lua: don't pop the lua stack when getting headers
    - BUILD/MINOR: fix solaris build with clang.
    - BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
    - CI: refactor OpenTracing build script
    - DOC: management: mark "set server ssl" as deprecated
    - MEDIUM: cli: yield between each pipelined command
    - MINOR: channel: add new function co_getdelim() to support multiple delimiters
    - BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
    - MEDIUM: h2/hpack: emit a Dynamic Table Size Update after settings change
    - MINOR: quic: Retransmit the TX frames in the same order
    - MINOR: quic: Remove the packet number space TX MT_LIST
    - MINOR: quic: Splice the frames which could not be added to packets
    - MINOR: quic: Add the number of TX bytes to traces
    - CLEANUP: quic: Replace <nb_pto_dgrams> by <probe>
    - MINOR: quic: Send two ack-eliciting packets when probing packet number spaces
    - MINOR: quic: Probe regardless of the congestion control
    - MINOR: quic: Speeding up handshake completion
    - MINOR: quic: Release RX Initial packets asap
    - MINOR: quic: Release asap TX frames to be transmitted
    - MINOR: quic: Probe even if coalescing
    - BUG/MEDIUM: cli: Never wait for more data on client shutdown
    - BUG/MEDIUM: mcli: do not try to parse empty buffers
    - BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
    - BUG/MINOR: stream: make the call_rate only count the no-progress calls
    - MINOR: quic: do not use quic_conn after dropping it
    - MINOR: quic: adjust quic_conn refcount decrement
    - MINOR: quic: fix race-condition on xprt tasklet free
    - MINOR: quic: free SSL context on quic_conn free
    - MINOR: quic: Add QUIC_FT_RETIRE_CONNECTION_ID parsing case
    - MINOR: quic: Wrong packet number space selection
    - DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
    - MINOR: quic: add missing include in quic_sock
    - MINOR: quic: fix indentation in qc_send_ppkts
    - MINOR: quic: remove dereferencement of connection when possible
    - MINOR: quic: set listener accept cb on parsing
    - MEDIUM: quic/ssl: add new ex data for quic_conn
    - MINOR: quic: initialize ssl_sock_ctx alongside the quic_conn
    - MINOR: ssl: fix build in release mode
    - MINOR: pools: partially uninline pool_free()
    - MINOR: pools: partially uninline pool_alloc()
    - MINOR: pools: prepare POOL_EXTRA to be split into multiple extra fields
    - MINOR: pools: extend pool_cache API to pass a pointer to a caller
    - DEBUG: pools: add new build option DEBUG_POOL_TRACING
    - DEBUG: cli: add a new "debug dev fd" expert command
    - MINOR: fd: register the write side of the poller pipe as well
    - CI: github actions: use cache for SSL libs
    - BUILD: debug/cli: condition test of O_ASYNC to its existence
    - BUILD: pools: fix build error on DEBUG_POOL_TRACING
    - MINOR: quic: refactor header protection removal
    - MINOR: quic: handle app data according to mux/connection layer status
    - MINOR: quic: refactor app-ops initialization
    - MINOR: receiver: define a flag for local accept
    - MEDIUM: quic: flag listener for local accept
    - MINOR: quic: do not manage connection in xprt snd_buf
    - MINOR: quic: remove wait handshake/L6 flags on init connection
    - MINOR: listener: add flags field
    - MINOR: quic: define QUIC flag on listener
    - MINOR: quic: create accept queue for QUIC connections
    - MINOR: listener: define per-thr struct
    - MAJOR: quic: implement accept queue
    - CLEANUP: mworker: simplify mworker_free_child()
    - BUILD/DEBUG: lru: update the standalone code to support the revision
    - DEBUG: lru: use a xorshift generator in the testing code
    - BUG/MAJOR: compiler: relax alignment constraints on certain structures
    - BUG/MEDIUM: fd: always align fdtab[] to 64 bytes
    - MINOR: quic: No DCID length for datagram context
    - MINOR: quic: Comment fix about the token found in Initial packets
    - MINOR: quic: Get rid of a struct buffer in quic_lstnr_dgram_read()
    - MINOR: quic: Remove the QUIC haproxy server packet parser
    - MINOR: quic: Add new defintion about DCIDs offsets
    - MINOR: quic: Add a list to QUIC sock I/O handler RX buffer
    - MINOR: quic: Allocate QUIC datagrams from sock I/O handler
    - MINOR: proto_quic: Allocate datagram handlers
    - MINOR: quic: Pass CID as a buffer to quic_get_cid_tid()
    - MINOR: quic: Convert quic_dgram_read() into a task
    - CLEANUP: quic: Remove useless definition
    - MINOR: proto_quic: Wrong allocations for TX rings and RX bufs
    - MINOR: quic: Do not consume the RX buffer on QUIC sock i/o handler side
    - MINOR: quic: Do not reset a full RX buffer
    - MINOR: quic: Attach all the CIDs to the same connection
    - MINOR: quic: Make usage of by datagram handler trees
    - MEDIUM: da: new optional data file download scheduler service.
    - MEDIUM: da: update doc and build for new scheduler mode service.
    - MEDIUM: da: update module to handle schedule mode.
    - MINOR: quic: Drop Initial packets with wrong ODCID
    - MINOR: quic: Wrong RX buffer tail handling when no more contiguous data
    - MINOR: quic: Iterate over all received datagrams
    - MINOR: quic: refactor quic CID association with threads
    - BUG/MEDIUM: resolvers: Really ignore trailing dot in domain names
    - DEV: flags: Add missing flags
    - BUG/MINOR: sink: Use the right field in appctx context in release callback
    - MINOR: sock: move the unused socket cleaning code into its own function
    - BUG/MEDIUM: mworker: close unused transferred FDs on load failure
    - BUILD: atomic: make the old HA_ATOMIC_LOAD() support const pointers
    - BUILD: cpuset: do not use const on the source of CPU_AND/CPU_ASSIGN
    - BUILD: checks: fix inlining issue on set_srv_agent_[addr,port}
    - BUILD: vars: avoid overlapping field initialization
    - BUILD: server-state: avoid using not-so-portable isblank()
    - BUILD: mux_fcgi: avoid aliasing of a const struct in traces
    - BUILD: tree-wide: mark a few numeric constants as explicitly long long
    - BUILD: tools: fix warning about incorrect cast with dladdr1()
    - BUILD: task: use list_to_mt_list() instead of casting list to mt_list
    - BUILD: mworker: include tools.h for platforms without unsetenv()
    - BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
    - MINOR: mworker: set the master side of ipc_fd in the worker to -1
    - MINOR: mworker: allocate and initialize a mworker_proc
    - CI: Consistently use actions/checkout@v2
    - REGTESTS: Remove REQUIRE_VERSION=1.8 from all tests
    - MINOR: mworker: sets used or closed worker FDs to -1
    - MINOR: quic: Try to accept 0-RTT connections
    - MINOR: quic: Do not try to treat 0-RTT packets without started mux
    - MINOR: quic: Do not try to accept a connection more than one time
    - MINOR: quic: Initialize the connection timer asap
    - MINOR: quic: Do not use connection struct xprt_ctx too soon
    - Revert "MINOR: mworker: sets used or closed worker FDs to -1"
    - BUILD: makefile: avoid testing all -Wno-* options when not needed
    - BUILD: makefile: validate support for extra warnings by batches
    - BUILD: makefile: only compute alternative options if required
    - DEBUG: fd: make sure we never try to insert/delete an impossible FD number
    - MINOR: mux-quic: add comment
    - MINOR: mux-quic: properly initialize qcc flags
    - MINOR: mux-quic: do not consider CONNECTION_CLOSE for the moment
    - MINOR: mux-quic: create a timeout task
    - MEDIUM: mux-quic: delay the closing with the timeout
    - MINOR: mux-quic: release idle conns on process stopping
    - MINOR: listener: replace the listener's spinlock with an rwlock
    - BUG/MEDIUM: listener: read-lock the listener during accept()
    - MINOR: mworker/cli: set expert/experimental mode from the CLI
2022-02-01 18:06:59 +01:00
William Lallemand
7267f78ebe MINOR: mworker/cli: set expert/experimental mode from the CLI
Allow to set the master CLI in expert or experimental mode. No command
within the master are unlocked yet, but it gives the ability to send
expert or experimental commands to the workers.

    echo "@1; experimental-mode on; del server be1/s2" | socat /var/run/haproxy.master -
    echo "experimental-mode on; @1 del server be1/s2" | socat /var/run/haproxy.master -
2022-02-01 17:33:06 +01:00
Willy Tarreau
fed93d367c BUG/MEDIUM: listener: read-lock the listener during accept()
Listeners might be disabled by other threads while running in
listener_accept() due to a stopping condition or possibly a rebinding
error after a failed stop/start. When this happens, the listener's FD
is -1 and accesses made by the lower layers to fdtab[-1] do not end up
well. This can occasionally be noticed if running at high connection
rates in master-worker mode when compiled with ASAN and hammered with
10 reloads per second. From time to time an out-of-bounds error will
be reported.

One approach could consist in keeping a copy of critical information
such as the FD before proceeding but that's not correct since in case of
close() the FD might be reassigned to another connection for example.

In fact what is needed is to read-lock the listener during this operation
so that it cannot change while we're touching it.

Tests have shown that using a spinlock only does generally work well but
it doesn't scale much with threads and we can see listener_accept() eat
10-15% CPU on a 24 thread machine at 300k conn/s. For this reason the
lock was turned to an rwlock by previous commit and this patch only takes
the read lock to make sure other operations do not change the listener's
state while threads are accepting connections. With this approach, no
performance loss was noticed at all and listener_accept() doesn't appear
in perf top.

This ought to be backported to about all branches that make use of the
unlocked listeners, but in practice it seems to mostly concern 2.3 and
above, since 2.2 and older will take the FD in the argument (and the
race exists there, this FD could end up being reassigned in parallel
but there's not much that can be done there to prevent that race; at
least a permanent error will be reported).

For backports, the current approach is preferred, with a preliminary
backport of previous commit "MINOR: listener: replace the listener's
spinlock with an rwlock". However if for any reason this commit cannot
be backported, the current patch can be modified to simply take a
spinlock (tested and works), it will just impact high performance
workloads (like DDoS protection).
2022-02-01 16:51:55 +01:00
Willy Tarreau
08b6f96452 MINOR: listener: replace the listener's spinlock with an rwlock
We'll need to lock the listener a little bit more during accept() and
tests show that a spinlock is a massive performance killer, so let's
first switch to an rwlock for this lock.

This patch might have to be backported for the next patch to work, and
if so, the change is almost mechanical (look for LISTENER_LOCK), but do
not forget about the few HA_SPIN_INIT() in the file. There's no reference
to this lock outside of listener.c nor listener-t.h.
2022-02-01 16:51:55 +01:00
Amaury Denoyelle
0e0969d6cf MINOR: mux-quic: release idle conns on process stopping
Implement the idle frontend connection cleanup for QUIC mux. Each
connection is registered on the mux_stopping_list. On process closing,
the mux is notified via a new function qc_wake. This function immediatly
release the connection if the parent proxy is stopped.

This allows to quickly close the process even if there is QUIC
connection stucked on timeout.
2022-02-01 15:42:32 +01:00
Amaury Denoyelle
1136e9243a MEDIUM: mux-quic: delay the closing with the timeout
Do not close immediatly the connection if there is no bidirectional
stream opened. Schedule instead the mux timeout when this condition is
verified. On the timer expiration, the mux/connection can be freed.
2022-02-01 15:19:35 +01:00
Amaury Denoyelle
aebe26f8ba MINOR: mux-quic: create a timeout task
This task will be used to schedule a timer when there is no activity on
the mux. The timeout is set via the "timeout client" from the
configuration file.

The timeout task process schedule the timeout only on specific
conditions. Currently, it's done if there is no opened bidirectional
stream.

For now this task is not used. This will be implemented in the following
commit.
2022-02-01 15:19:35 +01:00
Amaury Denoyelle
d975148776 MINOR: mux-quic: do not consider CONNECTION_CLOSE for the moment
Remove the condition on CONNECTION_CLOSE reception to close immediately
streams. It can cause some crash as the QUIC xprt layer still access the
qcs to send data and handle ACK.

The whole interface and buffering between QUIC xprt and mux must be
properly reorganized to better handle this case. Once this is done, it
may have some sense to free the qcs streams on CONNECTION_CLOSE
reception.
2022-02-01 15:19:35 +01:00
Amaury Denoyelle
ce1f30dac8 MINOR: mux-quic: properly initialize qcc flags
Set qcc.flags to 0 on qc_init.
2022-02-01 15:19:35 +01:00
Amaury Denoyelle
6a4aebfbfc MINOR: mux-quic: add comment
Explain the qc_release_detached_streams function purpose and interface.
Most notably the return code which is the count of released streams.
2022-02-01 10:56:43 +01:00
Willy Tarreau
9aa324de2d DEBUG: fd: make sure we never try to insert/delete an impossible FD number
It's among the cases that would provoke memory corruption, let's add
some tests against negative FDs and those larger than the table. This
must never ever happen and would currently result in silent corruption
or a crash. Better have a noticeable one exhibiting the call chain if
that were to happen.
2022-01-31 21:00:35 +01:00
William Lallemand
ce672844dd Revert "MINOR: mworker: sets used or closed worker FDs to -1"
This reverts commit ea7371e934.

This can't work correctly as we need this FD in the worker to be
inserted in the fdtab. The correct way to do it would be to cleanup the
mworker_proc in the master after the fork().
2022-01-31 19:06:07 +01:00
Frédéric Lécaille
7fbb94da8d MINOR: quic: Do not use connection struct xprt_ctx too soon
In fact the xprt_ctx of the connection is first stored into quic_conn
struct as soon as it is initialized from qc_conn_alloc_ssl_ctx().
As quic_conn_init_timer() is run after this function, we can associate
the timer context of the timer to the one from the quic_conn struct.
2022-01-31 16:40:23 +01:00
Frédéric Lécaille
789413caf0 MINOR: quic: Initialize the connection timer asap
We must move this initialization from xprt_start() callback, which
comes too late (after handshake completion for 1RTT session). This timer must be
usable as soon as we have packets to send/receive. Let's initialize it after
the TLS context is initialized in qc_conn_alloc_ssl_ctx(). This latter function
initializes I/O handler task (quic_conn_io_cb) to send/receive packets.
2022-01-31 16:40:23 +01:00
Frédéric Lécaille
91f083a365 MINOR: quic: Do not try to accept a connection more than one time
We add a new flag to mark a connection as already enqueued for acception.
This is useful for 0-RTT session where a connection is first enqueued for
acception as soon as 0-RTT RX secrets could be derived. Then as for any other
connection, we could accept one more time this connection after handshake
completion which lead to very bad side effects.

Thank you to Amaury for this nice patch.
2022-01-31 16:40:23 +01:00
Frédéric Lécaille
298931d177 MINOR: quic: Do not try to treat 0-RTT packets without started mux
We proceed the same was as for 1-RTT packets: we do not try to treat
them until the mux is started.
2022-01-31 16:40:23 +01:00
Frédéric Lécaille
61b851d748 MINOR: quic: Try to accept 0-RTT connections
When a listener managed to derive 0-RTT RX secrets we consider it accepted
the early data. So we enqueue the connection into the accept queue.
2022-01-31 16:40:23 +01:00
William Lallemand
ea7371e934 MINOR: mworker: sets used or closed worker FDs to -1
mworker_cli_sockpair_new() is used to create the socketpair CLI listener of
the worker. Its FD is referenced in the mworker_proc structure, however,
once it's assigned to the listener the reference should be removed so we
don't use it accidentally.

The same must be done in case of errors if the FDs were already closed.
2022-01-31 11:10:34 +01:00
William Lallemand
56be0e0146 MINOR: mworker: allocate and initialize a mworker_proc
mworker_proc_new() allocates and initializes correctly a mworker_proc
structure.
2022-01-28 23:52:36 +01:00
William Lallemand
7e01878e45 MINOR: mworker: set the master side of ipc_fd in the worker to -1
Once the child->ipc_fd[0] is closed in the worker, set the value to -1
so we don't reference a closed FD anymore.
2022-01-28 23:52:26 +01:00
William Lallemand
55a921c914 BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
When starting HAProxy in master-worker, the master pre-allocate a struct
mworker_proc and do a socketpair() before the configuration parsing. If
the configuration loading failed, the FD are never closed because they
aren't part of listener, they are not even in the fdtab.

This patch fixes the issue by cleaning the mworker_proc structure that
were not asssigned a process, and closing its FDs.

Must be backported as far as 2.0, the srv_drop() only frees the memory
and could be dropped since it's done before an exec().
2022-01-28 23:47:43 +01:00
Willy Tarreau
4c943fd60b BUILD: mworker: include tools.h for platforms without unsetenv()
In this case we fall back to my_unsetenv() thus we need tools.h
to avoid a warning.
2022-01-28 19:04:02 +01:00
Willy Tarreau
cc5cd5b8d8 BUILD: task: use list_to_mt_list() instead of casting list to mt_list
There were a few casts of list* to mt_list* that were upsetting some
old compilers (not sure about the effect on others). We had created
list_to_mt_list() purposely for this, let's use it instead of applying
this cast.
2022-01-28 19:04:02 +01:00
Willy Tarreau
f3d5c4b032 BUILD: tools: fix warning about incorrect cast with dladdr1()
dladdr1() is used on glibc and takes a void**, but we pass it a
const ElfW(Sym)** and some compilers complain that we're aliasing.
Let's just set a may_alias attribute on the local variable to
address this. There's no need to backport this unless warnings are
reported on older distros or uncommon compilers.
2022-01-28 19:04:02 +01:00
Willy Tarreau
8f0b4e97e7 BUILD: tree-wide: mark a few numeric constants as explicitly long long
At a few places in the code the switch/case ond flags are tested against
64-bit constants without explicitly being marked as long long. Some
32-bit compilers complain that the constant is too large for a long, and
other likely always use long long there. Better fix that as it's uncertain
what others which do not complain do. It may be backported to avoid doubts
on uncommon platforms if needed, as it touches very few areas.
2022-01-28 19:04:02 +01:00
Willy Tarreau
31a8306b93 BUILD: mux_fcgi: avoid aliasing of a const struct in traces
fcgi_trace() declares fconn as a const and casts its mbuf array to
(struct buffer*), which rightfully upsets some older compilers. Better
just declare it as a writable variable and get rid of the cast. It's
harmless anyway. This has been there since 2.1 with commit 5c0f859c2
("MINOR: mux-fcgi/trace: Register a new trace source with its events")
and doens't need to be backported though it would not harm either.
2022-01-28 19:04:02 +01:00
Willy Tarreau
74bc991600 BUILD: server-state: avoid using not-so-portable isblank()
Once in a while we get rid of this one. isblank() is missing on old
C libraries and only matches two values, so let's just replace it.
It was brought with this commit in 2.4:

  0bf268e18 ("MINOR: server: Be more strict on the server-state line parsing")

It may be backported though it's really not important.
2022-01-28 19:04:02 +01:00
Willy Tarreau
e90dde1edf BUILD: vars: avoid overlapping field initialization
Compiling vars.c with gcc 4.2 shows that we're initializing some local
structs field members in a not really portable way:

src/vars.c: In function 'vars_parse_cli_set_var':
src/vars.c:1195: warning: initialized field overwritten
src/vars.c:1195: warning: (near initialization for 'px.conf.args')
src/vars.c:1195: warning: initialized field overwritten
src/vars.c:1195: warning: (near initialization for 'px.conf')
src/vars.c:1201: warning: initialized field overwritten
src/vars.c:1201: warning: (near initialization for 'rule.conf')

It's totally harmless anyway, but better clean this up.
2022-01-28 19:04:02 +01:00
Willy Tarreau
95d3eaff36 BUILD: checks: fix inlining issue on set_srv_agent_[addr,port}
These functions are declared as external functions in check.h and
as inline functions in check.c. Let's move them as static inline in
check.h. This appeared in 2.4 with the following commits:

  4858fb2e1 ("MEDIUM: check: align agentaddr and agentport behaviour")
  1c921cd74 ("BUG/MINOR: check: consitent way to set agentaddr")

While harmless (it only triggers build warnings with some gcc 4.x),
it should probably be backported where the paches above are present
to keep the code consistent.
2022-01-28 19:04:02 +01:00
Willy Tarreau
a65b4933ba BUILD: cpuset: do not use const on the source of CPU_AND/CPU_ASSIGN
The man page indicates that CPU_AND() and CPU_ASSIGN() take a variable,
not a const on the source, even though it doesn't make much sense. But
with older libcs, this triggers a build warning:

  src/cpuset.c: In function 'ha_cpuset_and':
  src/cpuset.c:53: warning: initialization discards qualifiers from pointer target type
  src/cpuset.c: In function 'ha_cpuset_assign':
  src/cpuset.c:101: warning: initialization discards qualifiers from pointer target type

Better stick stricter to the documented API as this is really harmless
here. There's no need to backport it (unless build issues are reported,
which is quite unlikely).
2022-01-28 19:04:02 +01:00
Willy Tarreau
e08acaed19 BUG/MEDIUM: mworker: close unused transferred FDs on load failure
When the master process is reloaded on a new config, it will try to
connect to the previous process' socket to retrieve all known
listening FDs to be reused by the new listeners. If listeners were
removed, their unused FDs are simply closed.

However there's a catch. In case a socket fails to bind, the master
will cancel its startup and swithc to wait mode for a new operation
to happen. In this case it didn't close the possibly remaining FDs
that were left unused.

It is very hard to hit this case, but it can happen during a
troubleshooting session with fat fingers. For example, let's say
a config runs like this:

   frontend ftp
        bind 1.2.3.4:20000-29999

The admin wants to extend the port range down to 10000-29999 and
by mistake ends up with:

   frontend ftp
        bind 1.2.3.41:20000-29999

Upon restart the bind will fail if the address is not present, and the
master will then switch to wait mode without releasing the previous FDs
for 1.2.3.4:20000-29999 since they're now apparently unused. Then once
the admin fixes the config and does:

   frontend ftp
        bind 1.2.3.4:10000-29999

The service will start, but will bind new sockets, half of them
overlapping with the previous ones that were not properly closed. This
may result in a startup error (if SO_REUSEPORT is not enabled or not
available), in a FD number exhaustion (if the error is repeated many
times), or in connections being randomly accepted by the process if
they sometimes land on the old FD that nobody listens on.

This patch will need to be backported as far as 1.8, and depends on
previous patch:

   MINOR: sock: move the unused socket cleaning code into its own function

Note that before 2.3 most of the code was located inside haproxy.c, so
the patch above should probably relocate the function there instead of
sock.c.
2022-01-28 19:04:02 +01:00
Willy Tarreau
b510116fd2 MINOR: sock: move the unused socket cleaning code into its own function
The startup code used to scan the list of unused sockets retrieved from
an older process, and to close them one by one. This also required that
the knowledge of the internal storage of these temporary sockets was
known from outside sock.c and that the code was copy-pasted at every
call place.

This patch moves this into sock.c under the name
sock_drop_unused_old_sockets(), and removes the xfer_sock_list
definition from sock.h since the rest of the code doesn't need to know
this.

This cleanup is minimal and preliminary to a future fix that will need
to be backported to all versions featuring FD transfers over the CLI.
2022-01-28 19:04:02 +01:00
Christopher Faulet
dd0b144c3a BUG/MINOR: sink: Use the right field in appctx context in release callback
In the release callback, ctx.peers was used instead of ctx.sft. Concretly,
it is not an issue because the appctx context is an union and these both
fields are structures with a unique pointer. But it will be a problem if
that changes.

This patch must be backported as far as 2.2.
2022-01-28 17:56:18 +01:00
Christopher Faulet
0a82cf4c16 BUG/MEDIUM: resolvers: Really ignore trailing dot in domain names
When a string is converted to a domain name label, the trailing dot must be
ignored. In resolv_str_to_dn_label(), there is a test to do so. However, the
trailing dot is not really ignored. The character itself is not copied but
the string index is still moved to the next char. Thus, this trailing dot is
counted in the length of the last encoded part of the domain name. Worst,
because the copy is skipped, a garbage character is included in the domain
name.

This patch should fix the issue #1528. It must be backported as far as 2.0.
2022-01-28 17:56:18 +01:00
Amaury Denoyelle
0442efd214 MINOR: quic: refactor quic CID association with threads
Do not use an extra DCID parameter on new_quic_cid to be able to
associated a new generated CID to a thread ID. Simply do the computation
inside the function. The API is cleaner this way.

This also has the effects to improve the apparent randomness of CIDs.
With the previous version the first byte of all CIDs are identical for a
connection which could lead to privacy issue. This version may not be
totally perfect on this aspect but it improves the situation.
2022-01-28 16:29:27 +01:00
Frédéric Lécaille
df1c7c78c1 MINOR: quic: Iterate over all received datagrams
Make the listener datagram handler iterate over all received datagrams
2022-01-28 16:08:07 +01:00
Frédéric Lécaille
1712b1df59 MINOR: quic: Wrong RX buffer tail handling when no more contiguous data
The producer must know where is the tailing hole in the RX buffer
when it purges it from consumed datagram. This is done allocating
a fake datagram with the remaining number of bytes which cannot be produced
at the tail of the RX buffer as length.
2022-01-28 16:08:07 +01:00
Frédéric Lécaille
dc36404c36 MINOR: quic: Drop Initial packets with wrong ODCID
According to the RFC 9000, the client ODCID must have a minimal length of 8 bytes.
2022-01-28 16:08:07 +01:00
Frédéric Lécaille
74904a4792 MINOR: quic: Make usage of by datagram handler trees
The CID trees are no more attached to the listener receiver but to the
underlying datagram handlers (one by thread) which run always on the same thread.
So, any operation on these trees do not require any locking.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
9ea9463d47 MINOR: quic: Attach all the CIDs to the same connection
We copy the first octet of the original destination connection ID to any CID for
the connection calling new_quic_cid(). So this patch modifies only this function
to take a dcid as passed parameter.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
320744b53d MINOR: quic: Do not reset a full RX buffer
As the RX buffer is not consumed by the sock i/o handler as soon as a datagram
is produced, when full an RX buffer must not be reset. The remaining room is
consumed without modifying it. The consumer has a represention of its contents:
a list of datagrams.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
37ae505c21 MINOR: quic: Do not consume the RX buffer on QUIC sock i/o handler side
Rename quic_lstnr_dgram_read() to quic_lstnr_dgram_dispatch() to reflect its new role.
After calling this latter, the sock i/o handler must consume the buffer only if
the datagram it received is detected as wrong by quic_lstnr_dgram_dispatch().
The datagram handler task mark the datagram as consumed atomically setting ->buf
to NULL value. The sock i/o handler is responsible of flushing its RX buffer
before using it. It also keeps a datagram among the consumed ones so that
to pass it to quic_lstnr_dgram_dispatch() and prevent it from allocating a new one.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
794d068d8f MINOR: proto_quic: Wrong allocations for TX rings and RX bufs
As mentionned in the comment, the tx_qrings and rxbufs members of
receiver struct must be pointers to pointers!
Modify the functions responsible of their allocations consequently.
Note that this code could work because sizeof rxbuf and sizeof tx_qrings
are greater than the size of pointer!
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
25bc8875d7 MINOR: quic: Convert quic_dgram_read() into a task
quic_dgram_read() parses all the QUIC packets from a UDP datagram. It is the best
candidate to be converted into a task, because is processing data unit is the UDP
datagram received by the QUIC sock i/o handler. If correct, this datagram is
added to the context of a task, quic_lstnr_dghdlr(), a conversion of quic_dgram_read()
into a task. This task pop a datagram from an mt_list and passes it among to
the packet handler (quic_lstnr_pkt_rcv()).
Modify the quic_dgram struct to play the role of the old quic_dgram_ctx struct when
passed to quic_lstnr_pkt_rcv().
Modify the datagram handlers allocation to set their tasks to quic_lstnr_dghdlr().
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
220894a5d6 MINOR: quic: Pass CID as a buffer to quic_get_cid_tid()
Very minor modification so that this function might be used for a context
without CID (at datagram level).
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
69dd5e6a0b MINOR: proto_quic: Allocate datagram handlers
Add quic_dghdlr new struct do define datagram handler tasks, one by thread.
Allocate them and attach them to the listener receiver part calling
quic_alloc_dghdlrs_listener() newly implemented function.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
3d4bfe708a MINOR: quic: Allocate QUIC datagrams from sock I/O handler
Add quic_dgram new structure to store information about datagrams received
by the sock I/O handler (quic_sock_fd_iocb) and its associated pool.
Implement quic_get_dgram_dcid() to retrieve the datagram DCID which must
be the same for all the packets in the datagram.
Modify quic_lstnr_dgram_read() called by the sock I/O handler to allocate
a quic_dgram each time a correct datagram is found and add it to the sock I/O
handler rxbuf dgram list.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
53898bba81 MINOR: quic: Add a list to QUIC sock I/O handler RX buffer
This list will be used to store datagrams in the rxbuf struct used
by the quic_sock_fd_iocb() QUIC sock I/O handler with one rxbuf by thread.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
9cc64e2dba MINOR: quic: Remove the QUIC haproxy server packet parser
This function is no more used anymore, broken and uses code shared with the
listener packet parser. This is becoming anoying to continue to modify
it without testing each time we modify the code it shares with the
listener packet parser.
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
3d55462654 MINOR: quic: Get rid of a struct buffer in quic_lstnr_dgram_read()
This is to be sure xprt functions do not manipulate the buffer struct
passed as parameter to quic_lstnr_dgram_read() from low level datagram
I/O callback in quic_sock.c (quic_sock_fd_iocb()).
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
055ee6c14b MINOR: quic: Comment fix about the token found in Initial packets
Mention that the token is sent only by servers in both server and listener
packet parsers.
Remove a "TO DO" section in listener packet parser because there is nothing
more to do in this function about the token
2022-01-27 16:37:55 +01:00
Frédéric Lécaille
4852101fd2 MINOR: quic: No DCID length for datagram context
This quic_dgram_ctx struct member is used to denote if we are parsing a new
datagram (null value), or a coalesced packet into the current datagram (non null
value). But it was never set.
2022-01-27 16:37:55 +01:00
Willy Tarreau
97ea9c49f1 BUG/MEDIUM: fd: always align fdtab[] to 64 bytes
There's a risk that fdtab is not 64-byte aligned. The first effect is that
it may cause false sharing between cache lines resulting in contention
when adjacent FDs are used by different threads. The second is related
to what is explained in commit "BUG/MAJOR: compiler: relax alignment
constraints on certain structures", i.e. that modern compilers might
make use of aligned vector operations to zero some entries, and would
crash. We do not use any memset() or so on fdtab, so the risk is almost
inexistent, but that's not a reason for violating some valid assumptions.

This patch addresses this by allocating 64 extra bytes and aligning the
structure manually (this is an extremely cheap solution for this specific
case). The original address is stored in a new variable "fdtab_addr" and
is the one that gets freed. This remains extremely simple and should be
easily backportable. A dedicated aligned allocator later would help, of
course.

This needs to be backported as far as 2.2. No issue related to this was
reported yet, but it could very well happen as compilers evolve. In
addition this should preserve high performance across restarts (i.e.
no more dependency on allocator's alignment).
2022-01-27 16:28:10 +01:00
Willy Tarreau
8e92738ffd DEBUG: lru: use a xorshift generator in the testing code
The standalone testing code used to rely on rand(), but switching to a
xorshift generator speeds up the test by 7% which is important to
accurately measure the real impact of the LRU code itself.
2022-01-27 16:28:10 +01:00
Willy Tarreau
bf9c07fd91 BUILD/DEBUG: lru: update the standalone code to support the revision
The standalone testing code didn't implement the revision and didn't
build anymore, let's fix that.
2022-01-27 16:28:10 +01:00
William Lallemand
08cb945a9b CLEANUP: mworker: simplify mworker_free_child()
Remove useless checks and simplify the function.
2022-01-27 15:33:40 +01:00
Amaury Denoyelle
cfa2d5648f MAJOR: quic: implement accept queue
Do not proceed to direct accept when creating a new quic_conn. Wait for
the QUIC handshake to succeeds to insert the quic_conn in the accept
queue. A tasklet is then woken up to call listener_accept to accept the
quic_conn.

The most important effect is that the connection/mux layers are not
instantiated at the same time as the quic_conn. This forces to delay
some process to be sure that the mux is allocated :
* initialization of mux transport parameters
* installation of the app-ops

Also, the mux instance is not checked now to wake up the quic_conn
tasklet. This is safe because the xprt-quic code is now ready to handle
the absence of the connection/mux layers.

Note that this commit has a deep impact as it changes significantly the
lower QUIC architecture. Most notably, it breaks the 0-RTT feature.
2022-01-26 16:13:54 +01:00
Amaury Denoyelle
f68b2cb816 MINOR: listener: define per-thr struct
Create a new structure li_per_thread. This is uses as an array in the
listener structure, with an entry allocated per thread. The new function
li_init_per_thr is responsible of the allocation.

For now, li_per_thread contains fields only useful for QUIC listeners.
As such, it is only allocated for QUIC listeners.
2022-01-26 16:13:54 +01:00
Amaury Denoyelle
2ce99fe4bf MINOR: quic: create accept queue for QUIC connections
Create a new type quic_accept_queue to handle QUIC connections accept.
A queue will be allocated for each thread. It contains a list of
listeners which contains at least one quic_conn ready to be accepted and
the tasklet to run listener_accept for these listeners.
2022-01-26 16:13:51 +01:00
Amaury Denoyelle
b59b88950a MINOR: quic: define QUIC flag on listener
Mark QUIC listeners with the flag LI_F_QUIC_LISTENER. It is set by the
proto-quic layer on the add listener callback. This allows to override
more clearly the accept callback on quic_session_accept.
2022-01-26 15:25:45 +01:00
Amaury Denoyelle
cbe090d42f MINOR: quic: remove wait handshake/L6 flags on init connection
The connection is allocated after finishing the QUIC handshake. Remove
handshake/L6 flags when initializing the connection as handshake is
finished with success at this stage.
2022-01-26 15:25:45 +01:00
Amaury Denoyelle
9fa15e5413 MINOR: quic: do not manage connection in xprt snd_buf
Remove usage of connection in quic_conn_from_buf. As connection and
quic_conn are decorrelated, it is not logical to check connection flags
when using sendto.

This require to store the L4 peer address in quic_conn to be able to use
sendto.

This change is required to delay allocation of connection.
2022-01-26 15:25:38 +01:00
Amaury Denoyelle
683b5fc7b8 MEDIUM: quic: flag listener for local accept
QUIC connections are distributed accross threads by xprt-quic according
to their CIDs. As such disable the thread selection in listener_accept
for QUIC listeners.

This prevents connection from migrating to another threads after its
allocation which can results in unexpected side-effects.
2022-01-26 11:59:12 +01:00
Amaury Denoyelle
7f7713d6ef MINOR: receiver: define a flag for local accept
This flag is named RX_F_LOCAL_ACCEPT. It will be activated for special
receivers where connection balancing to threads is already handle
outside of listener_accept, such as with QUIC listeners.
2022-01-26 11:22:20 +01:00
Amaury Denoyelle
4b40f19f92 MINOR: quic: refactor app-ops initialization
Add a new function in mux-quic to install app-ops. For now this
functions is called during the ALPN negotiation of the QUIC handshake.

This change will be useful when the connection accept queue will be
implemented. It will be thus required to delay the app-ops
initialization because the mux won't be allocated anymore during the
QUIC handshake.
2022-01-26 10:59:33 +01:00
Amaury Denoyelle
0b1f93127f MINOR: quic: handle app data according to mux/connection layer status
Define a new enum to represent the status of the mux/connection layer
above a quic_conn. This is important to know if it's possible to handle
application data, or if it should be buffered or dropped.
2022-01-26 10:57:17 +01:00
Amaury Denoyelle
8ae28077b9 MINOR: quic: refactor header protection removal
Adjust the function to check if header protection can be removed. It can
now be used both for a single packet in qc_lstnr_pkt_rcv and in the
quic_conn handler to handle buffered packets for a specific encryption
level.
2022-01-26 10:51:16 +01:00
Willy Tarreau
f70fdde591 BUILD: pools: fix build error on DEBUG_POOL_TRACING
When squashing commit add43fa43 ("DEBUG: pools: add new build option
DEBUG_POOL_TRACING") I managed to break the build and to fail to detect
it even after the rebase and a full rebuild :-(
2022-01-25 15:59:18 +01:00
Willy Tarreau
410942b92a BUILD: debug/cli: condition test of O_ASYNC to its existence
David Carlier reported a build breakage on Haiku since commit
5be7c198e ("DEBUG: cli: add a new "debug dev fd" expert command")
due to O_ASYNC not being defined. Ilya also reported it broke the
build on Cygwin. It's not that portable and sometimes defined as
O_NONBLOCK for portability. But here we don't even need that, as
we already condition other flags, let's just ignore it if it does
not exist.
2022-01-25 14:51:53 +01:00
Willy Tarreau
3a6af1e5e8 MINOR: fd: register the write side of the poller pipe as well
The poller's pipe was only registered on the read side since we don't
need to poll to write on it. But this leaves some known FDs so it's
better to also register the write side with no event. This will allow
to show them in "show fd" and to avoid dumping them as unhandled FDs.

Note that the only other type of unhandled FDs left are:
  - stdin/stdout/stderr
  - epoll FDs

The later can be registered upon startup though but at least a dummy
handler would be needed to keep the fdtab clean.
2022-01-24 20:41:25 +01:00
Willy Tarreau
5be7c198e5 DEBUG: cli: add a new "debug dev fd" expert command
This command will scan the whole file descriptors space to look for
existing FDs that are unknown to haproxy's fdtab, and will try to dump
a maximum number of information about them (including type, mode, device,
size, uid/gid, cloexec, O_* flags, socket types and addresses when
relevant). The goal is to help detecting inherited FDs from parent
processes as well as potential leaks.

Some of those listed are actually known but handled so deep into some
systems that they're not in the fdtab (such as epoll FDs or inter-
thread pipes). This might be refined in the future so that these ones
become known and do not appear.

Example of output:

 $ socat - /tmp/sock1 <<< "expert-mode on;debug dev fd"

    0 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
    1 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
    2 type=tty. mod=0620 dev=0x8803 siz=0 uid=1000 gid=5 fs=0x16 ino=0x6 getfd=+0 getfl=O_RDONLY,O_APPEND
    3 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x18112348 getfd=+0
    4 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
   33 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8251 getfd=+0 getfl=O_RDONLY
   34 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
   36 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8d1b getfd=+0 getfl=O_RDONLY
   37 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
   39 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24afa04f getfd=+0 getfl=O_RDONLY
   41 type=pipe mod=0600 dev=0 siz=0 uid=1000 gid=100 fs=0xc ino=0x24af8252 getfd=+0 getfl=O_RDONLY
   42 type=epol mod=0600 dev=0 siz=0 uid=0 gid=0 fs=0xd ino=0x3674 getfd=+0 getfl=O_RDONLY
2022-01-24 20:26:09 +01:00
Willy Tarreau
add43fa43e DEBUG: pools: add new build option DEBUG_POOL_TRACING
This new option, when set, will cause the callers of pool_alloc() and
pool_free() to be recorded into an extra area in the pool that is expected
to be helpful for later inspection (e.g. in core dumps). For example it
may help figure that an object was released to a pool with some sub-fields
not yet released or that a use-after-free happened after releasing it,
with an immediate indication about the exact line of code that released
it (possibly an error path).

This only works with the per-thread cache, and even objects refilled from
the shared pool directly into the thread-local cache will have a NULL
there. That's not an issue since these objects have not yet been freed.
It's worth noting that pool_alloc_nocache() continues not to set any
caller pointer (e.g. when the cache is empty) because that would require
a possibly undesirable API change.

The extra cost is minimal (one pointer per object) and this completes
well with DEBUG_POOL_INTEGRITY.
2022-01-24 16:40:48 +01:00
Willy Tarreau
0e2a5b4b61 MINOR: pools: extend pool_cache API to pass a pointer to a caller
This adds a caller to pool_put_to_cache() and pool_get_from_cache()
which will optionally be used to pass a pointer to their callers. For
now it's not used, only the API is extended to support this pointer.
2022-01-24 16:40:48 +01:00
Willy Tarreau
d392973dcc MINOR: pools: partially uninline pool_alloc()
The pool_alloc() function was already a wrapper to __pool_alloc() which
was also inlined but took a set of flags. This latter was uninlined and
moved to pool.c, and pool_alloc()/pool_zalloc() turned to macros so that
they can more easily evolve to support debugging options.

The number of call places made this code grow over time and doing only
this change saved ~1% of the whole executable's size.
2022-01-24 16:40:48 +01:00
Willy Tarreau
15c322c413 MINOR: pools: partially uninline pool_free()
The pool_free() function has become a bit big over time due to the
extra consistency checks. It used to remain inline only to deal
cleanly with the NULL pointer free that's quite present on some
structures (e.g. in stream_free()).

Here we're splitting the function in two:
  - __pool_free() does the inner block without the pointer test and
    becomes a function ;

  - pool_free() is now a macro that only checks the pointer and calls
    __pool_free() if needed.

The use of a macro versus an inline function is only motivated by an
easier intrumentation of the code later.

With this change, the code size reduces by ~1%, which means that at
this point all pool_free() call places used to represent more than
1% of the total code size.
2022-01-24 16:40:48 +01:00
Amaury Denoyelle
7c564bfdd3 MINOR: ssl: fix build in release mode
Fix potential null pointer dereference. In fact, this case is not
possible, only a mistake in SSL ex-data initialization may cause it :
either connection is set or quic_conn, which allows to retrieve
the bind_conf.

A BUG_ON was already present but this does not cover release build.
2022-01-24 11:15:48 +01:00
Amaury Denoyelle
33ac346ba8 MINOR: quic: initialize ssl_sock_ctx alongside the quic_conn
Extract the allocation of ssl_sock_ctx from qc_conn_init to a dedicated
function qc_conn_alloc_ssl_ctx. This function is called just after
allocating a new quic_conn, without waiting for the initialization of
the connection. It allocates the ssl_sock_ctx and the quic_conn tasklet.

This change is now possible because the SSL callbacks are dealing with a
quic_conn instance.

This change is required to be able to delay the connection allocation
and handle handshake packets without it.
2022-01-24 10:30:49 +01:00
Amaury Denoyelle
9320dd5385 MEDIUM: quic/ssl: add new ex data for quic_conn
Allow to register quic_conn as ex-data in SSL callbacks. A new index is
used to identify it as ssl_qc_app_data_index.

Replace connection by quic_conn as SSL ex-data when initializing the QUIC
SSL session. When using SSL callbacks in QUIC context, the connection is
now NULL. Used quic_conn instead to retrieve the required parameters.
Also clean up

The same changes are conducted inside the QUIC SSL methods of xprt-quic
: connection instance usage is replaced by quic_conn.
2022-01-24 10:30:49 +01:00
Amaury Denoyelle
57af069571 MINOR: quic: set listener accept cb on parsing
Define a special accept cb for QUIC listeners to quic_session_accept().
This operation is conducted during the proto.add callback when creating
listeners.

A special care is now taken care when setting the standard callback
session_accept_fd() to not overwrite if already defined by the proto
layer.
2022-01-24 10:30:49 +01:00
Amaury Denoyelle
29632b8b10 MINOR: quic: remove dereferencement of connection when possible
Some functions of xprt-quic were still using connection instead of
quic_conn. This must be removed as the two are decorrelated : a
quic_conn can exist without a connection.
2022-01-24 10:30:49 +01:00
Amaury Denoyelle
74f2292557 MINOR: quic: fix indentation in qc_send_ppkts
Adjust wrong mixing of tabs/spaces.
2022-01-24 10:30:49 +01:00
Amaury Denoyelle
4d29504c58 MINOR: quic: add missing include in quic_sock
Add quic_sock.h include in corresponding source file quic_sock.c.
2022-01-24 10:30:49 +01:00
Willy Tarreau
0575d8fd76 DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
When enabled, objects picked from the cache are checked for corruption
by comparing their contents against a pattern that was placed when they
were inserted into the cache. Objects are also allocated in the reverse
order, from the oldest one to the most recent, so as to maximize the
ability to detect such a corruption. The goal is to detect writes after
free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF
this cannot detect reads after free, but may possibly detect later
corruptions and will not consume extra memory. The CPU usage will
increase a bit due to the cost of filling/checking the area and for the
preference for cold cache instead of hot cache, though not as much as
with DEBUG_UAF. This option is meant to be usable in production.
2022-01-21 19:07:48 +01:00
Frédéric Lécaille
39ba1c3e12 MINOR: quic: Wrong packet number space selection
It is possible that the listener is in INITIAL state, but have to probe
with Handshake packets. In this case, when entering qc_prep_pkts() there
is nothing to do. We must select the next packet number space (or encryption
level) to be able to probe with such packet type.
2022-01-21 17:38:11 +01:00
Frédéric Lécaille
2cca241780 MINOR: quic: Add QUIC_FT_RETIRE_CONNECTION_ID parsing case
At this time, we do not do anything. This is only to prevent a packet
from being parsed and to pass some test irrespective of the CIDs management.
2022-01-21 17:38:11 +01:00
Amaury Denoyelle
2d9794b03a MINOR: quic: free SSL context on quic_conn free
Free the SSL context attached to the quic_conn when freeing the
connection. This fixes a memory leak for every QUIC connection.
2022-01-21 15:20:07 +01:00
Amaury Denoyelle
760da3be57 MINOR: quic: fix race-condition on xprt tasklet free
Remove the unsafe call to tasklet_free in quic_close. At this stage the
tasklet may already be scheduled by an other threads even after if the
quic_conn refcount is now null. It will probably cause a crash on the
next tasklet processing.

Use tasklet_kill instead to ensure that the tasklet is freed in a
thread-safe way. Note that quic_conn_io_cb is not protected by the
refcount so only the quic_conn pinned thread must kill the tasklet.
2022-01-21 15:19:31 +01:00
Amaury Denoyelle
2eb7b30715 MINOR: quic: adjust quic_conn refcount decrement
Adjust slightly refcount code decrement on quic_conn close. A new
function named quic_conn_release is implemented. This function is
responsible to remove the quic_conn from CIDs trees and decrement the
refcount to free the quic_conn once all threads have finished to work
with it.

For now, quic_close is responsible to call it so the quic_conn is
scheduled to be free by upper layers. In the future, it may be useful to
delay it to be able to send remaining data or waiting for missing ACKs
for example.

This simplify quic_conn_drop which do not require the lock anymore.
Also, this can help to free the connection more quickly in some cases.
2022-01-21 15:03:17 +01:00
Amaury Denoyelle
9c4da93796 MINOR: quic: do not use quic_conn after dropping it
quic_conn_drop decrement the refcount and may free the quic_conn if
reaching 0. The quic_conn should not be dereferenced again after it in
any case even for traces.
2022-01-21 15:02:56 +01:00
Willy Tarreau
6c539c4b8c BUG/MINOR: stream: make the call_rate only count the no-progress calls
We have an anti-looping protection in process_stream() that detects bugs
that used to affect a few filters like compression in the past which
sometimes forgot to handle a read0 or a particular error, leaving a
thread looping at 100% CPU forever. When such a condition is detected,
an alert it emitted and the process is killed so that it can be replaced
by a sane one:

  [ALERT]    (19061) : A bogus STREAM [0x274abe0] is spinning at 2057156
             calls per second and refuses to die, aborting now! Please
             report this error to developers [strm=0x274abe0,3 src=unix
             fe=MASTER be=MASTER dst=<MCLI> txn=(nil),0 txn.req=-,0
             txn.rsp=-,0 rqf=c02000 rqa=10000 rpf=88000021 rpa=8000000
             sif=EST,40008 sib=DIS,84018 af=(nil),0 csf=0x274ab90,8600
             ab=0x272fd40,1 csb=(nil),0
             cof=0x25d5d80,1300:PASS(0x274aaf0)/RAW((nil))/unix_stream(9)
             cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0) filters={}]
    call trace(11):
    |       0x4dbaab [c7 04 25 01 00 00 00 00]: stream_dump_and_crash+0x17b/0x1b4
    |       0x4df31f [e9 bd c8 ff ff 49 83 7c]: process_stream+0x382f/0x53a3
    (...)

One problem with this detection is that it used to only count the call
rate because we weren't sure how to make it more accurate, but the
threshold was high enough to prevent accidental false positives.

There is actually one case that manages to trigger it, which is when
sending huge amounts of requests pipelined on the master CLI. Some
short requests such as "show version" are sufficient to be handled
extremely fast and to cause a wake up of an analyser to parse the
next request, then an applet to handle it, back and forth. But this
condition is not an error, since some data are being forwarded by
the stream, and it's easy to detect it.

This patch modifies the detection so that update_freq_ctr() only
applies to calls made without CF_READ_PARTIAL nor CF_WRITE_PARTIAL
set on any of the channels, which really indicates that nothing is
happening at all.

This is greatly sufficient and extremely effective, as the call above
is still caught (shutr being ignored by an analyser) while a loop on
the master CLI now has no effect. The "call_rate" field in the detailed
"show sess" output will now be much lower, except for bogus streams,
which may help spot them. This field is only there for developers
anyway so it's pretty fine to slightly adjust its meaning.

This patch could be backported to stable versions in case of reports
of such an issue, but as that's unlikely, it's not really needed.
2022-01-20 18:56:57 +01:00
Willy Tarreau
a4e4d66f70 BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
Pipelined commands easily result in request buffers to wrap, and the
master-cli parser only deals with linear buffers since it needs contiguous
keywords to look for in a list. As soon as a buffer wraps, some commands
are ignored and the parser is called in loops because the wrapped data
do not leave the buffer.

Let's take the easiest path that's already used at the HTTP layer, we
simply realign the buffer if its input wraps. This rarely happens anyway
(typically once per buffer), remains reasonably cheap and guarantees this
cannot happen anymore.

This needs to be backported as far as 2.0.
2022-01-20 18:56:57 +01:00
Willy Tarreau
6cd93f52e9 BUG/MEDIUM: mcli: do not try to parse empty buffers
When pcli_parse_request() is called with an empty buffer, it still tries
to parse it and can go on believing it finds an empty request if the last
char before the beginning of the buffer is a '\n'. In this case it overwrites
it with a zero and processes it as an empty command, doing nothing but not
making the buffer progress. This results in an infinite loop that is stopped
by the watchdog. For a reason related to another issue (yet to be fixed),
this can easily be reproduced by pipelining lots of commands such as
"show version".

Let's add a length check after the search for a '\n'.

This needs to be backported as far as 2.0.
2022-01-20 18:56:57 +01:00
Christopher Faulet
0f727dabf5 BUG/MEDIUM: cli: Never wait for more data on client shutdown
When a shutdown is detected on the cli, we try to execute all pending
commands first before closing the connection. It is required because
commands execution is serialized. However, when the last part is a partial
command, the cli connection is not closed, waiting for more data. Because
there is no timeout for now on the cli socket, the connection remains
infinitely in this state. And because the maxconn is set to 10, if it
happens several times, the cli socket quickly becomes unresponsive because
all its slots are waiting for more data on a closed connections.

This patch should fix the issue #1512. It must be backported as far as 2.0.
2022-01-20 18:56:39 +01:00
Frédéric Lécaille
94fca87f6a MINOR: quic: Probe even if coalescing
Again, we fix a reminiscence of the way we probed before probing by packet.
When we were probing by datagram we inspected <prv_pkt> to know if we were
coalescing several packets. There is no need to do that at all when probing by packet.
Furthermore this could lead to blocking situations where we want to probe but
are limited by the congestion control (<cwnd> path variable). This must not be
the case. When probing we must do it regardless of the congestion control.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
e87524d41c MINOR: quic: Release asap TX frames to be transmitted
This is done only for ack-eliciting frames to be sent from Initial and Handshake
packet number space when discarding them.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
a6255f53e8 MINOR: quic: Release RX Initial packets asap
This is to free up some space in the RX buffer as soon as possible.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
04e63aa6ef MINOR: quic: Speeding up handshake completion
If a client resend Initial CRYPTO data, this is because it did not receive all
the server Initial CRYPTO data. With this patch we prepare a fast retransmission
without waiting for the PTO timer expiration sending old Initial CRYPTO data,
coalescing them with Handshake CRYPTO if present in the same datagram. Furthermore
we send also a datagram made of previously sent Hanshashke CRYPTO data if any.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
f4e5a7c644 MINOR: quic: Probe regardless of the congestion control
When probing, we must not take into an account the congestion control window.
This was not completely correctly implemented: qc_build_frms() could fail
because of this limit when comparing the head of the packet againts the
congestion control window. With this patch we make it fail only when
we are not probing.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
0fa553d0c2 MINOR: quic: Send two ack-eliciting packets when probing packet number spaces
This is to avoid too much PTO timer expirations for 01RTT and Handshake packet
number spaces. Furthermore we are not limited by the anti-amplication for 01RTT
packet number space. According to the RFC we can send up to two packets.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
ce6602d887 CLEANUP: quic: Replace <nb_pto_dgrams> by <probe>
This modification should have come with this commit:
    "MINOR: quic: Remove nb_pto_dgrams quic_conn struct member"
where the nb_pto_dgrams quic_conn struct member was removed.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
8b6ea17105 MINOR: quic: Add the number of TX bytes to traces
This should be helpful to diagnose some issues regarding packet loss
and recovery issues.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
cba4cd427e MINOR: quic: Splice the frames which could not be added to packets
When building packets to send, we build frames computing their sizes
to have more chance to be added to new packets. There are rare cases
where this packet coult not be built because of the congestion control
which may for instance prevent us from building a packet with padding
(retransmitted Initial packets). In such a case, the pre-built frames
were lost because added to the packet frame list but not move packet
to the packet number space they come from.

With this patch we add the frames to the packet only if it could be built
and move them back to the packet number space if not.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
82468ea98e MINOR: quic: Remove the packet number space TX MT_LIST
There is no need to use an MT_LIST to store frames to send from a packet
number space. This is a reminiscence for multi-threading support for the TX part.
2022-01-20 16:43:06 +01:00
Frédéric Lécaille
7065dd0895 MINOR: quic: Retransmit the TX frames in the same order
This is only to please the peer. We resend the TX frames in the
same order they have been sent.
2022-01-20 16:35:43 +01:00
Willy Tarreau
39a0a1e120 MEDIUM: h2/hpack: emit a Dynamic Table Size Update after settings change
As reported by @jinsubsim in github issue #1498, there is an
interoperability issue between nghttp2 as a client and a few servers
among which haproxy (in fact likely all those which do not make use
of the dynamic headers table in responses or which do not intend to
use a larger table), when reducing the header table size below 4096.
These are easily testable this way:

  nghttp -v -H":method: HEAD" --header-table-size=0 https://$SITE

It will result in a compression error for those which do not start
with an HPACK dynamic table size update opcode.

There is a possible interpretation of the H2 and HPACK specs that
says that an HPACK encoder must send an HPACK headers table update
confirming the new size it will be using after having acknowledged
it, because since it's possible for a decoder to advertise a late
SETTINGS and change it after transfers have begun, the initially
advertised value might very well be seen as a first change from the
initial setting, and the HPACK spec doesn't specify the side which
causes the change that triggers a DTSU update, which was essentially
summed up in this question from nghttp2's author when this issue
was already raised 6 years ago, but which didn't really find a solid
response by then:

  https://lists.w3.org/Archives/Public/ietf-http-wg/2015OctDec/0107.html

The ongoing consensus based on what some servers are doing and that aims
at limiting interoperability issues seems to be that a DTSU is expected
for each reduction from the current size, which should be reflected in
the next revision of the H2 spec:

  https://github.com/httpwg/http2-spec/pull/1005

Given that we do not make use of this table we can emit a DTSU of zero
before encoding any HPACK frame. However, some clients do not support
receiving DTSU with such values (e.g. VTest) so we cannot do it
inconditionnally!

The current patch aims at sticking as close to the spec as possible by
proceeding this way:
  - when a SETTINGS_HEADER_TABLE_SIZE is received, a flag is set
    indicating that the value changed
  - before sending any HPACK frame, this flag is checked to see if
    an update is wanted and if none was sent
  - in this case a DTSU of size zero is emitted and a flag is set
    to mention it was emitted so that it never has to be sent again

This addresses the problem with nghttp2 without affecting VTest.

More context is available here:
  https://github.com/nghttp2/nghttp2/issues/1660
  https://lists.w3.org/Archives/Public/ietf-http-wg/2021OctDec/0235.html

Many thanks to @jinsubsim for this report and participating to the issue
that led to an improvement of the H2 spec.

This should be backported to stable releases in a timely manner, ideally
as far as 2.4 once the h2spec update is merged, then to other versions
after a few months of observation or in case an issue around this is
reported.
2022-01-20 05:01:03 +01:00
Willy Tarreau
0011c25144 BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
Sending pipelined commands on the CLI using a semi-colon as a delimiter
has a cost that grows linearly with the buffer size, because co_getline()
is called for each word and looks up a '\n' in the whole buffer while
copying its contents into a temporary buffer.

This causes huge parsing delays, for example 3s for 100k "show version"
versus 110ms if parsed only once for a default 16k buffer.

This patch makes use of the new co_getdelim() function to support both
an LF and a semi-colon as delimiters so that it's no more needed to parse
the whole buffer, and that commands are instantly retrieved. We still
need to rely on co_getline() in payload mode as escapes and semi-colons
are not used there.

It should likely be backported where CLI processing speed matters, but
will require to also backport previous patch "MINOR: channel: add new
function co_getdelim() to support multiple delimiters". It's worth noting
that backporting it without "MEDIUM: cli: yield between each pipelined
command" would significantly increase the ratio of disconnections caused
by empty request buffers, for the sole reason that the currently slow
parsing grants more time to request data to come in. As such it would
be better to backport the patch above before taking this one.
2022-01-19 19:16:47 +01:00
Willy Tarreau
c514365317 MINOR: channel: add new function co_getdelim() to support multiple delimiters
For now we have co_getline() which reads a buffer and stops on LF, and
co_getword() which reads a buffer and stops on one arbitrary delimiter.
But sometimes we'd need to stop on a set of delimiters (CR and LF, etc).

This patch adds a new function co_getdelim() which takes a set of delimiters
as a string, and constructs a small map (32 bytes) that's looked up during
parsing to stop after the first delimiter found within the set. It also
supports an optional escape character that skips a delimiter (typically a
backslash). For the rest it works exactly like the two other variants.
2022-01-19 19:16:47 +01:00
Willy Tarreau
fa7b4f6691 MEDIUM: cli: yield between each pipelined command
Pipelining commands on the CLI is sometimes needed for batched operations
such as map deletion etc, but it causes two problems:
  - some possibly long-running commands will be run in series without
    yielding, possibly causing extremely long latencies that will affect
    quality of service and even trigger the watchdog, as seen in github
    issue #1515.

  - short commands that end on a buffer size boundary, when not run in
    interactive mode, will often cause the socket to be closed when
    the last command is parsed, because the buffer is empty.

This patch proposes a small change to this: by yielding in the CLI applet
after processing a command when there are data left, we significantly
reduce the latency, since only one command is executed per call, and
we leave an opportunity for the I/O layers to refill the request buffer
with more commands, hence to execute all of them much more often.

With this change there's no more watchdog triggered on long series of
"del map" on large map files, and the operations are much less disturbed.
It would be desirable to backport this patch to stable versions after some
period of observation in recent versions.
2022-01-19 19:16:47 +01:00
William Dauchy
a087f87875 BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
While giving a fresh try to `set server ssl` (which I wrote), I realised
the behavior is a bit inconsistent. Indeed when using this command over
a server with ssl enabled for the data path but also for the health
check path we have:

- data and health check done using tls
- emit `set server be_foo/srv0 ssl off`
- data path and health check path becomes plain text
- emit `set server be_foo/srv0 ssl on`
- data path becomes tls and health check path remains plain text

while I thought the end result would be:
- data path and health check path comes back in tls

In the current code we indeed erase all connections while deactivating,
but restore only the data path while activating.  I made this mistake in
the past because I was testing with a case where the health check plain
text by default.

There are several ways to solve this issue. The cleanest one would
probably be to avoid changing the health check connection when we use
`set server ssl` command, and create a new command `set server
ssl-check` to change this. For now I assumed this would be ok to simply
avoid changing the health check path and be more consistent.

This patch tries to address that and also update the documentation. It
should not break the existing usage with health check on plain text, as
in this case they should have `no-check-ssl` in defaults.  Without this
patch, it makes the command unusable in an env where you have a list of
server to add along the way with initial `server-template`, and all
using tls for data and healthcheck path.

For 2.6 we should probably reconsider and add `set server ssl-check`
command for better granularity of cases.

If this solution is accepted, this patch should be backported up to >=
2.4.

The alternative solution was to restore the previous state, but I
believe this will create even more confusion in the future.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2022-01-18 12:05:17 +01:00
William Lallemand
01e2be84d7 BUG/MINOR: httpclient/lua: don't pop the lua stack when getting headers
hlua_httpclient_table_to_hdrs() does a lua_pop(L, 1) at the end of the
function, this is supposed to be done in the caller and it is already be
done in hlua_httpclient_send().

This call has the consequence of poping the next parameter of the
httpclient, ignoring it.

This patch fixes the issue by removing the lua_pop(L, 1).

Must be backported in 2.5.
2022-01-14 20:51:31 +01:00
William Lallemand
bad9c8cac4 BUG/MINOR: httpclient: set default Accept and User-Agent headers
Some servers require at least an Accept and a User-Agent header in the
request. This patch sets some default value.

Must be backported in 2.5.
2022-01-14 20:46:21 +01:00
William Lallemand
e1e045f4d7 BUG/MINOR: httpclient: don't send an empty body
Forbid the httpclient to send an empty chunked client when there is no
data to send. It does happen when doing a simple GET too.

Must be backported in 2.5.
2022-01-14 18:10:43 +01:00
Christopher Faulet
28e7ba8688 BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
htx_add_data() is able to partially consume data. However there is a bug
when the HTX buffer is empty.  The data length is not properly
adjusted. Thus, if it exceeds the HTX buffer size, no block is added. To fix
the issue, the length is now adjusted first.

This patch must be backported as far as 2.0.
2022-01-13 09:34:22 +01:00
Frédéric Lécaille
b80b20c6ff MINOR: quic: Do not wakeup the I/O handler before the mux is started
If we wakeup the I/O handler before the mux is started, it is possible
it has enough time to parse the ClientHello TLS message and update the
mux transport parameters, leading to a crash.
So, we initialize ->qcc quic_conn struct member at the very last time,
when the mux if fully initialized. The condition to wakeup the I/O handler
from lstnr_rcv_pkt() is: xprt context and mux both initialized.
Note that if the xprt context is initialized, it implies its tasklet is
initialized. So, we do not check anymore this latter condition.
2022-01-12 18:08:29 +01:00
Frédéric Lécaille
bec186dde5 MINOR: quic: As server, skip 0-RTT packet number space
This is true only when we are building packets. A QUIC server never
sends 0-RTT packets. So let't skip the associated TLS encryption level.
2022-01-12 18:08:29 +01:00
Willy Tarreau
3b990fe0be BUG/MEDIUM: connection: properly leave stopping list on error
The stopping-list management introduced by commit d3a88c1c3 ("MEDIUM:
connection: close front idling connection on soft-stop") missed two
error paths in the H1 and H2 muxes. The effect is that if a stream
or HPACK table couldn't be allocated for these incoming connections,
we would leave with the connection freed still attached to the
stopping_list and it would never leave it, resulting in use-after-free
hence either a crash or a data corruption.

This is marked as medium as it only happens under extreme memory pressure
or when playing with tune.fail-alloc. Other stability issues remain in
such a case so that abnormal behaviors cannot be explained by this bug
alone.

This must be backported to 2.4.
2022-01-12 17:31:01 +01:00
Amaury Denoyelle
9ab2fb3921 MINOR: quic: free xprt tasklet on its thread
Free the ssl_sock_ctx tasklet in quic_close() instead of
quic_conn_drop(). This ensures that the tasklet is destroyed safely by
the same thread.

This has no impact as the free operation was previously conducted with
care and should not be responsible of any crash.
2022-01-12 15:21:27 +01:00
Amaury Denoyelle
b76ae69513 MEDIUM: quic: implement Retry emission
Implement the emission of Retry packets. These packets are emitted in
response to Initial from clients without token. The token from the Retry
packet contains the ODCID from the Initial packet.

By default, Retry packet emission is disabled and the handshake can
continue without address validation. To enable Retry, a new bind option
has been defined named "quic-force-retry". If set, the handshake must be
conducted only after receiving a token in the Initial packet.
2022-01-12 11:08:48 +01:00
Amaury Denoyelle
c3b6f4d484 MINOR: quic: define retry_source_connection_id TP
Define a new QUIC transport parameter retry_source_connection_id. This
parameter is set only by server, after issuing a Retry packet.
2022-01-12 11:08:48 +01:00
Amaury Denoyelle
5ff1c9778c MEDIUM: quic: implement Initial token parsing
Implement the parsing of token from Initial packets. It is expected that
the token contains a CID which is the DCID from the Initial packet
received from the client without token which triggers a Retry packet.
This CID is then used for transport parameters.

Note that at the moment Retry packet emission is not implemented. This
will be achieved in a following commit.
2022-01-12 11:08:48 +01:00
Amaury Denoyelle
6efec292ef MINOR: quic: implement Retry TLS AEAD tag generation
Implement a new QUIC TLS related function
quic_tls_generate_retry_integrity_tag(). This function can be used to
calculate the AEAD tag of a Retry packet.
2022-01-12 11:08:48 +01:00
Amaury Denoyelle
c8b4ce4a47 MINOR: quic: add config parse source file
Create a new dedicated source file for QUIC related options parsing on
the bind line.
2022-01-12 11:08:48 +01:00
Amaury Denoyelle
ce340fe4a7 MINOR: quic: fix return of quic_dgram_read
It is expected that quic_dgram_read() returns the total number of bytes
read. Fix the return value when the read has been successful. This bug
has no impact as in the end the return value is not checked by the
caller.
2022-01-12 11:08:48 +01:00
Frédéric Lécaille
1aa57d32bb MINOR: quic: Do not dereference ->conn quic_conn struct member
->conn quic_conn struct member is a connection struct object which may be
released from several places. With this patch we do our best to stop dereferencing
this member as much as we can.
2022-01-12 09:49:49 +01:00
Frédéric Lécaille
ba85acdc70 MINOR: quid: Add traces quic_close() and quic_conn_io_cb()
This is to have an idea of possible remaining issues regarding the
connection terminations.
2022-01-11 16:56:04 +01:00
Frédéric Lécaille
81cd3c8eed MINOR: quic: Wrong CRYPTO frame concatenation
This commit was not correct:
	 "MINOR: quic: Only one CRYPTO frame by encryption level"
Indeed, when receiving CRYPTO data from TLS stack for a packet number space,
there are rare cases where there is already other frames than CRYPTO data frames
in the packet number space, especially for 01RTT packet number space.  This is
very often with quant as client.
2022-01-11 16:12:31 +01:00
Frédéric Lécaille
2fe8b3be20 MINOR: quic: Flag the connection as being attached to a listener
We do not rely on connection objects to know if we are a listener or not.
2022-01-11 16:12:31 +01:00
Frédéric Lécaille
19cd46e6e5 MINOR: quic: Reset ->conn quic_conn struct member when calling qc_release()
There may be remaining locations where ->conn quic_conn struct member
is used. So let's reset this.
Add a trace to have an idead when this connection is released.
2022-01-11 16:12:31 +01:00
Frédéric Lécaille
5f7f118b31 MINOR: quic: Remaining TRACEs with connection as firt arg
This is a quic_conn struct which is expected by TRACE_()* macros
2022-01-11 16:12:31 +01:00
David CARLIER
bb10dad5a8 BUILD: cpuset: fix build issue on macos introduced by previous change
The build on macos was broken by recent commit df91cbd58 ("MINOR: cpuset:
switch to sched_setaffinity for FreeBSD 14 and above."), let's move the
variable declaration inside the ifdef.
2022-01-11 15:09:49 +01:00
Christopher Faulet
b4eca0e908 BUG/MAJOR: mux-h1: Don't decrement .curr_len for unsent data
A regression was introduced by commit 140f1a58 ("BUG/MEDIUM: mux-h1: Fix
splicing by properly detecting end of message"). To detect end of the
outgoing message, when the content-length is announced, we count amount of
data already sent. But only data really sent must be counted.

If the output buffer is full, we can fail to send data (fully or
partially). In this case, we must take care to only count sent
data. Otherwise we may think too much data were sent and an internal error
may be erroneously reported.

This patch should fix issues #1510 and #1511. It must be backported as far
as 2.4.
2022-01-11 09:15:13 +01:00
Remi Tricot-Le Breton
a996763619 BUG/MINOR: ssl: Store client SNI in SSL context in case of ClientHello error
If an error is raised during the ClientHello callback on the server side
(ssl_sock_switchctx_cbk), the servername callback won't be called and
the client's SNI will not be saved in the SSL context. But since we use
the SSL_get_servername function to return this SNI in the ssl_fc_sni
sample fetch, that means that in case of error, such as an SNI mismatch
with a frontend having the strict-sni option enabled, the sample fetch
would not work (making strict-sni related errors hard to debug).

This patch fixes that by storing the SNI as an ex_data in the SSL
context in case the ClientHello callback returns an error. This way the
sample fetch can fallback to getting the SNI this way. It will still
first call the SSL_get_servername function first since it is the proper
way of getting a client's SNI when the handshake succeeded.

In order to avoid memory allocations are runtime into this highly used
runtime function, a new memory pool was created to store those client
SNIs. Its entry size is set to 256 bytes since SNIs can't be longer than
255 characters.

This fixes GitHub #1484.

It can be backported in 2.5.
2022-01-10 16:31:22 +01:00
William Lallemand
f82afbb9cd BUG/MEDIUM: mworker: don't use _getsocks in wait mode
Since version 2.5 the master is automatically re-executed in wait-mode
when the config is successfully loaded, puting corner cases of the wait
mode in plain sight.

When using the -x argument and with the right timing, the master will
try to get the FDs again in wait mode even through it's not needed
anymore, which will harm the worker by removing its listeners.

However, if it fails, (and it's suppose to, sometimes), the
master will exit with EXIT_FAILURE because it does not have the
MODE_MWORKER flag, but only the MODE_MWORKER_WAIT flag. With the
consequence of killing the workers.

This patch fixes the issue by restricting the use of _getsocks to some
modes.

This patch must be backported in every version supported, even through
the impact should me more harmless in version prior to 2.5.
2022-01-07 18:44:27 +01:00
Frédéric Lécaille
99942d6f4c MINOR: quic: Non-optimal use of a TX buffer
When full, after having reset the writer index, let's reuse the TX buffer in
any case.
2022-01-07 17:58:26 +01:00
Frédéric Lécaille
f010f0aaf2 MINOR: quic: Missing retransmission from qc_prep_fast_retrans()
In fact we must look for the first packet with some ack-elicting frame to
in the packet number space tree to retransmit from. Obviously there
may be already retransmit packets which are not deemed as lost and
still present in the packet number space tree for TX packets.
2022-01-07 17:58:26 +01:00
Frédéric Lécaille
d4ecf94827 MINOR: quic: Only one CRYPTO frame by encryption level
When receiving CRYPTO data from the TLS stack, concatenate the CRYPTO data
to the first allocated CRYPTO frame if present. This reduces by one the number
of handshake packets built for a connection with a standard size certificate.
2022-01-07 17:58:26 +01:00
Ilya Shipitsin
37d3e38130 CLEANUP: assorted typo fixes in the code and comments
This is 30th iteration of typo fixes
2022-01-07 14:42:54 +01:00
David CARLIER
df91cbd584 MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above.
Following up previous update on cpuset-t.h. Ultimately, at some point
 the cpuset_setaffinity code path could be removed.
2022-01-07 06:53:51 +01:00
William Dauchy
a9dd901143 MINOR: proxy: add option idle-close-on-response
Avoid closing idle connections if a soft stop is in progress.

By default, idle connections will be closed during a soft stop. In some
environments, a client talking to the proxy may have prepared some idle
connections in order to send requests later. If there is no proper retry
on write errors, this can result in errors while haproxy is reloading.
Even though a proper implementation should retry on connection/write
errors, this option was introduced to support back compat with haproxy <
v2.4. Indeed before v2.4, we were waiting for a last request to be able
to add a "connection: close" header and advice the client to close the
connection.

In a real life example, this behavior was seen in AWS using the ALB in
front of a haproxy. The end result was ALB sending 502 during haproxy
reloads.
This patch was tested on haproxy v2.4, with a regular reload on the
process, and a constant trend of requests coming in. Before the patch,
we see regular 502 returned to the client; when activating the option,
the 502 disappear.

This patch should help fixing github issue #1506.
In order to unblock some v2.3 to v2.4 migraton, this patch should be
backported up to v2.4 branch.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
[wt: minor edits to the doc to mention other options to care about]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2022-01-06 09:09:51 +01:00
Frédéric Lécaille
6b6631593f MINOR: quic: Re-arm the PTO timer upon datagram receipt
When block by the anti-amplification limit, this is the responsability of the
client to unblock it sending new datagrams. On the server side, even if not
well parsed, such datagrams must trigger the PTO timer arming.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
078634d126 MINOR: quic: PTO timer too often reset
It must be reset when the anti-amplication was reached but only if the
peer address was not validated.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
41a076087b MINOR: quic: Flag asap the connection having reached the anti-amplification limit
The best location to flag the connection is just after having built the packet
which reached the anti-amplication limit.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
de6f7c503e MINOR: quic: Prepare Handshake packets asap after completed handshake
Switch back to QUIC_HS_ST_SERVER_HANDSHAKE state after a completed handshake
if acks must be send.
Also ensure we build post handshake frames only one time without using prev_st
variable and ensure we discard the Handshake packet number space only one time.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
917a7dbdc7 MINOR: quic: Do not drop secret key but drop the CRYPTO data
We need to be able to decrypt late Handshake packets after the TLS secret
keys have been discarded. If not the peer send Handshake packet which have
not been acknowledged. But for such packets, we discard the CRYPTO data.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
ee2b8b377f MINOR: quic: Improve qc_prep_pkts() flexibility
We want to be able to choose the encryption levels to be used
by qc_prep_pkts() outside of it.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
f7ef97698a MINOR: quic: Comment fix.
When we drop a packet with unknown length, this is the entire datagram which
must be skipped.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
a56054e438 MINOR: quic: Probe several packet number space upon timer expiration
When the loss detection timer expires, we SHOULD include new data in
our probing packets (RFC 9002 par 6.2.4. Sending Probe Packets).
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
db6a4727cf MINOR: quic: Probe Initial packet number space more often
Especially when the PTO expires for Handshake packet number space and when
Initial packets are still flying (for QUIC servers).
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
3bb457c4ba MINOR: quic: Speeding up Handshake Completion
According to RFC 9002 par. 6.2.3. when receving duplicate Initial CRYPTO
data a server may a packet containing non unacknowledged before the PTO
expiry.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
63556772cc MINOR: quic: qc_prep_pkts() code moving
Move the switch default case code out of the switch to improve the readibily.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
5732062cd2 MINOR: quic: Useless test in qc_prep_pkts()
These tests were there to initiate PTO probing but they are not correct.
Furthermore they may break the PTO probing process and lead to useless packet
building.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
dd51da599e MINOR: quic: Wrong packet number space trace in qc_prep_pkts()
It was always the first packet number space information which was dumped.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
466e9da145 MINOR: quic: Remove nb_pto_dgrams quic_conn struct member
For now on we rely on tx->pto_probe pktns struct member to inform
the packet building function we want to probe.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
22576a2e55 MINOR: quic: Wrong ack_delay compution before calling quic_loss_srtt_update()
RFC 9002 5.3. Estimating smoothed_rtt and rttvar:
MUST use the lesser of the acknowledgment delay and the peer's max_ack_delay
after the handshake is confirmed.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
dc90c07715 MINOR: quic: Wrong loss time computation in qc_packet_loss_lookup()
This part as been modified by the RFC since our first implementation.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
22cfd83890 MINOR: quic: Add trace about in flight bytes by packet number space
This parameter is useful to diagnose packet loss detection issues.
2022-01-04 17:30:00 +01:00
Frédéric Lécaille
fde2a98dd1 MINOR: quic: Wrong traces after rework
TRACE_*() macros must take a quic_conn struct as first argument.
2022-01-04 17:30:00 +01:00
Christopher Faulet
7bf46bb9a9 BUG/MEDIUM: http-ana: Preserve response's FLT_END analyser on L7 retry
When a filter is attached on a stream, the FLT_END analyser must not be
removed from the response channel on L7 retry. It is especially important
because CF_FLT_ANALYZE flag is still set. This means the synchronization
between the two sides when the filter ends can be blocked. Depending on the
timing, this can freeze the stream infinitely or lead to a spinning loop.

Note that the synchronization between the two sides at the end of the
analysis was introduced because the stream was reused in HTTP between two
transactions. But, since the HTX was introduced, a new stream is created for
each transaction. So it is probably possible to remove this step for 2.2 and
higher.

This patch must be backported as far as 2.0.
2022-01-04 10:56:04 +01:00
David Carlier
ae5c42f4d0 BUILD/MINOR: tools: solaris build fix on dladdr.
dladdr takes a mutable address on this platform.
2022-01-03 14:43:51 +01:00
Ilya Shipitsin
5e87bcf870 CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes 2022-01-03 14:40:58 +01:00
Willy Tarreau
1513c5479a MEDIUM: pools: release cached objects in batches
With this patch pool_evict_last_items builds clusters of up to
CONFIG_HAP_POOL_CLUSTER_SIZE entries so that accesses to the shared
pools are reduced by CONFIG_HAP_POOL_CLUSTER_SIZE and the inter-
thread contention is reduced by as much..
2022-01-02 19:35:26 +01:00
Willy Tarreau
43937e920f MEDIUM: pools: start to batch eviction from local caches
Since previous patch we can forcefully evict multiple objects from the
local cache, even when evicting basd on the LRU entries. Let's define
a compile-time configurable setting to batch releasing of objects. For
now we set this value to 8 items per round.

This is marked medium because eviction from the LRU will slightly change
in order to group the last items that are freed within a single cache
instead of accurately scanning only the oldest ones exactly in their
order of appearance. But this is required in order to evolve towards
batched removals.
2022-01-02 19:35:26 +01:00
Willy Tarreau
a0b5831eed MEDIUM: pools: centralize cache eviction in a common function
We currently have two functions to evict cold objects from local caches:
pool_evict_from_local_cache() to evict from a single cache, and
pool_evict_from_local_caches() to evict oldest objects from all caches.

The new function pool_evict_last_items() focuses on scanning oldest
objects from a pool and releasing a predefined number of them, either
to the shared pool or to the system. For now they're evicted one at a
time, but the next step will consist in creating clusters.
2022-01-02 19:35:26 +01:00
Willy Tarreau
337410c5a4 MINOR: pools: pass the objects count to pool_put_to_shared_cache()
This is in order to let the caller build the cluster of items to be
released. For now single items are released hence the count is always
1.
2022-01-02 19:35:26 +01:00
Willy Tarreau
148160b027 MINOR: pools: prepare pool_item to support chained clusters
In order to support batched allocations and releases, we'll need to
prepare chains of items linked together and that can be atomically
attached and detached at once. For this we implement a "down" pointer
in each pool_item that points to the other items belonging to the same
group. For now it's always NULL though freeing functions already check
them when trying to release everything.
2022-01-02 19:35:26 +01:00
Willy Tarreau
361e31e3fe MEDIUM: pool: compute the number of evictable entries once per pool
In pool_evict_from_local_cache() we used to check for room left in the
pool for each and every object. Now we compute the value before entering
the loop and keep into a local list what has to be released, and call
the OS-specific functions for the other ones.

It should already save some cycles since it's not needed anymore to
recheck for the pool's filling status. But the main expected benefit
comes from the ability to pre-construct a list of all releasable
objects, that will later help with grouping them.
2022-01-02 19:35:26 +01:00
Willy Tarreau
c16ed3b090 MINOR: pool: introduce pool_item to represent shared pool items
In order to support batch allocation from/to shared pools, we'll have to
support a specific representation for pool objects. The new pool_item
structure will be used for this. For now it only contains a "next"
pointer that matches exactly the current storage model. The few functions
that deal with the shared pool entries were adapted to use the new type.
There is no functionality difference at this point.
2022-01-02 19:35:26 +01:00
Willy Tarreau
b46674a283 MINOR: pool: check for pool's fullness outside of pool_put_to_shared_cache()
Instead of letting pool_put_to_shared_cache() pass the object to the
underlying OS layer when there's no more room, let's have the caller
check if the pool is full and either call pool_put_to_shared_cache()
or call pool_free_nocache().

Doing this sensibly simplifies the code as this function now only has
to deal with a pool and an item and only for cases where there are
local caches and shared caches. As the code was simplified and the
calls more isolated, the function was moved to pool.c.

Note that it's only called from pool_evict_from_local_cache{,s}() and
that a part of its logic might very well move there when dealing with
batches.
2022-01-02 19:35:26 +01:00
Willy Tarreau
afe2c4a1fc MINOR: pool: allocate from the shared cache through the local caches
One of the thread scaling challenges nowadays for the pools is the
contention on the shared caches. There's never any situation where we
have a shared cache and no local cache anymore, so we can technically
afford to transfer objects from the shared cache to the local cache
before returning them to the user via the regular path. This adds a
little bit more work per object per miss, but will permit batch
processing later.

This patch simply moves pool_get_from_shared_cache() to pool.c under
the new name pool_refill_local_from_shared(), and this function does
not return anything but it places the allocated object at the head of
the local cache.
2022-01-02 19:27:57 +01:00
Willy Tarreau
8c4927098e CLEANUP: pools: get rid of the POOL_LINK macro
The POOL_LINK macro is now only used for debugging, and it still requires
ifdefs around, which needlessly complicates the code. Let's replace it
and the calling code with a new pair of macros: POOL_DEBUG_SET_MARK()
and POOL_DEBUG_CHECK_MARK(), that respectively store and check the pool
pointer in the extra location at the end of the pool. This removes 4
pairs of ifdefs in the middle of the code.
2022-01-02 12:44:19 +01:00
Willy Tarreau
799f6143ca CLEANUP: pools: do not use the extra pointer to link shared elements
This practice relying on POOL_LINK() dates from the era where there were
no pool caches, but given that the structures are a bit more complex now
and that pool caches do not make use of this feature, it is totally
useless since released elements have already been overwritten, and yet
it complicates the architecture and prevents from making simplifications
and optimizations. Let's just get rid of this feature. The pointer to
the origin pool is preserved though, as it helps detect incorrect frees
and serves as a canary for overflows.
2022-01-02 12:44:19 +01:00
Willy Tarreau
d5ec100661 MINOR: pools: always evict oldest objects first in pool_evict_from_local_cache()
For an unknown reason, despite the comment stating that we were evicting
oldest objects first from the local caches, due to the use of LIST_NEXT,
the newest were evicted, since pool_put_to_cache() uses LIST_INSERT().

Some tests on 16 threads show that evicting oldest objects instead can
improve performance by 0.5-1% especially when using shared pools.
2022-01-02 12:40:14 +01:00
William Lallemand
e69563fd8e BUG/MEDIUM: ssl: free the ckch instance linked to a server
This patch unlinks and frees the ckch instance linked to a server during
the free of this server.

This could have locked certificates in a "Used" state when removing
servers dynamically from the CLI. And could provoke a segfault once we
try to dynamically update the certificate after that.

This must be backported as far as 2.4.
2021-12-30 16:56:52 +01:00
William Lallemand
231610ad9c BUG/MINOR: ssl: free the fields in srv->ssl_ctx
A lot of free are missing in ssl_sock_free_srv_ctx(), this could result
in memory leaking when removing dynamically a server via the CLI.

This must be backported in every branches, by removing the fields that
does not exist in the previous branches.
2021-12-30 13:43:04 +01:00
William Lallemand
2c776f1c30 BUG/MEDIUM: ssl: initialize correctly ssl w/ default-server
This bug was introduced by d817dc73 ("MEDIUM: ssl: Load client
certificates in a ckch for backend servers") in which the creation of
the SSL_CTX for a server was moved to the configuration parser when
using a "crt" keyword instead of being done in ssl_sock_prepare_srv_ctx().

The patch 0498fa40 ("BUG/MINOR: ssl: Default-server configuration ignored by
server") made it worse by setting the same SSL_CTX for every servers
using a default-server. Resulting in any SSL option on a server applied
to every server in its backend.

This patch fixes the issue by reintroducing a string which store the
path of certificate inside the server structure, and loading the
certificate in ssl_sock_prepare_srv_ctx() again.

This is a quick fix to backport, a cleaner way can be achieve by always
creating the SSL_CTX in ssl_sock_prepare_srv_ctx() and splitting
properly the ssl_sock_load_srv_cert() function.

This patch fixes issue #1488.

Must be backported as far as 2.4.
2021-12-29 14:42:16 +01:00
Willy Tarreau
654726db5a MINOR: debug: add support for -dL to dump library names at boot
This is a second help to dump loaded library names late at boot, once
external code has already been initialized. The purpose is to provide
a format that makes it easy to pass to "tar" to produce an archive
containing the executable and the list of dependencies. For example
if haproxy is started as "haproxy -f foo.cfg", a config check only
will suffice to quit before starting, "-q" will be used to disable
undesired output messages, and -dL will be use to dump libraries.
This will result in such a command to trivially produce a tarball
of loaded libraries:

   ./haproxy -q -c -dL -f foo.cfg | tar -T - -hzcf archive.tgz
2021-12-28 17:07:13 +01:00
Willy Tarreau
6ab7b21a11 MINOR: debug: add ability to dump loaded shared libraries
Many times core dumps reported by users who experience trouble are
difficult to exploit due to missing system libraries. Sometimes,
having just a list of loaded libraries and their respective addresses
can already provide some hints about some problems.

This patch makes a step in that direction by adding a new "show libs"
command that will try to enumerate the list of object files that are
loaded in memory, relying on the dynamic linker for this. It may also
be used to detect that some foreign code embarks other undesired libs
(e.g. some external Lua modules).

At the moment it's only supported on glibc when USE_DL is set, but it's
implemented in a way that ought to make it reasonably easy to be extended
to other platforms.
2021-12-28 16:59:00 +01:00
Willy Tarreau
b4ff6f4ae9 BUG/MEDIUM: peers: properly skip conn_cur from incoming messages
The approach used for skipping conn_cur in commit db2ab8218 ("MEDIUM:
stick-table: never learn the "conn_cur" value from peers") was wrong,
it only works with simple tables but as soon as frequency counters or
arrays are exchanged after conn_cur, the stream is desynchronized and
incorrect values are read. This is because the fields have a variable
length depending on their types and cannot simply be skipped by a
"continue" statement.

Let's change the approach to make sure we continue to completely parse
these local-only fields, and only drop the value at the moment we're
about to store them, since this is exactly the intent.

A simpler approach could consist in having two sets of stktable_data_ptr()
functions, one for retrieval and one for storage, and to make the store
function return a NULL pointer for local types. For now this doesn't
seem worth the trouble.

This fixes github issue #1497. Thanks to @brenc for the reproducer.

This must be backported to 2.5.
2021-12-24 13:48:39 +01:00
Willy Tarreau
266d540549 BUG/MEDIUM: backend: fix possible sockaddr leak on redispatch
A subtle change of target address allocation was introduced with commit
68cf3959b ("MINOR: backend: rewrite alloc of stream target address") in
2.4. Prior to this patch, a target address was allocated by function
assign_server_address() only if none was previously allocated. After
the change, the allocation became unconditional. Most of the time it
makes no difference, except when we pass multiple times through
connect_server() with SF_ADDR_SET cleared.

The most obvious fix would be to avoid allocating that address there
when already set, but the root cause is that since introduction of
dynamically allocated addresses, the SF_ADDR_SET flag lies. It can
be cleared during redispatch or during a queue redistribution without
the address being released.

This patch instead gives back all its correct meaning to SF_ADDR_SET
and guarantees that when not set no address is allocated, by freeing
that address at the few places the flag is cleared. The flag could
even be removed so that only the address is checked but that would
require to touch many areas for no benefit.

The easiest way to test it is to send requests to a proxy with l7
retries enabled, which forwards to a server returning 500:

  defaults
    mode http
    timeout client 1s
    timeout server 1s
    timeout connect 1s
    retry-on all-retryable-errors
    retries 1
    option redispatch

  listen proxy
    bind *:5000
    server app 0.0.0.0:5001

  frontend dummy-app
    bind :5001
    http-request return status 500

Issuing "show pools" on the CLI will show that pool "sockaddr" grows
as requests are redispatched, and remains stable with the fix. Even
"ps" will show that the process' RSS grows by ~160B per request.

This fix will need to be backported to 2.4. Note that before 2.5,
there's no strm->si[1].dst, strm->target_addr must be used instead.

This addresses github issue #1499. Special thanks to Daniil Leontiev
for providing a well-documented reproducer.
2021-12-24 11:50:01 +01:00
Amaury Denoyelle
9979d0d1ea BUG/MINOR: quic: fix potential use of uninit pointer
Properly initialized the ssl_sock_ctx pointer in qc_conn_init. This is
required to avoid to set an undefined pointer in qc.xprt_ctx if argument
*xprt_ctx is NULL.
2021-12-23 16:33:47 +01:00
Amaury Denoyelle
c6fab98f9b BUG/MINOR: quic: fix potential null dereference
This is not a real issue because found_in_dcid can not be set if qc is
NULL.
2021-12-23 16:32:19 +01:00
Amaury Denoyelle
76f47caacc MEDIUM: quic: implement refcount for quic_conn
Implement a refcount on quic_conn instance. By default, the refcount is
0. Two functions are implemented to manipulate it.
* qc_conn_take() which increments the refcount
* qc_conn_drop() which decrements it. If the refcount is 0 *BEFORE*
  the substraction, the instance is freed.

The refcount is incremented on retrieve_qc_conn_from_cid() or when
allocating a new quic_conn in qc_lstnr_pkt_rcv(). It is substracted most
notably by the xprt.close operation and at the end of
qc_lstnr_pkt_rcv(). The increments/decrements should be conducted under
the CID lock to guarantee thread-safety.
2021-12-23 16:06:07 +01:00
Amaury Denoyelle
0a29e13835 MINOR: quic: delete timer task on quic_close()
The timer task is attached to the connection-pinned thread. Only this
thread can delete it. With the future refcount implementation of
quic_conn, every thread can be responsible to remove the quic_conn via
quic_conn_free(). Thus, the timer task deletion is moved from the
calling function quic_close().
2021-12-23 16:06:07 +01:00
Amaury Denoyelle
e81fed9a54 MINOR: quic: replace usage of ssl_sock_ctx by quic_conn
Big refactoring on xprt-quic. A lot of functions were using the
ssl_sock_ctx as argument to only access the related quic_conn. All these
arguments are replaced by a quic_conn parameter.

As a convention, the quic_conn instance is always the first parameter of
these functions.

This commit is part of the rearchitecture of xprt-quic layers and the
separation between xprt and connection instances.
2021-12-23 16:06:06 +01:00
Amaury Denoyelle
741eacca47 MINOR: quic: remove unnecessary if in qc_pkt_may_rm_hp()
Remove the shortcut to use the INITIAL encryption level when removing
header protection on first connection packet.

This change is useful for the following change which removes
ssl_sock_ctx in argument lists in favor of the quic_conn instance.
2021-12-23 16:02:24 +01:00
Amaury Denoyelle
7ca7c84fb8 MINOR: quic: store ssl_sock_ctx reference into quic_conn
Add a pointer in quic_conn to its related ssl_sock_ctx. This change is
required to avoid to use the connection instance to access it.

This commit is part of the rearchitecture of xprt-quic layers and the
separation between xprt and connection instances. It will be notably
useful when the connection allocation will be delayed.
2021-12-23 15:51:00 +01:00
Amaury Denoyelle
a83729e9e6 MINOR: quic: remove unnecessary call to free_quic_conn_cids()
free_quic_conn_cids() was called in quic_build_post_handshake_frames()
if an error occured. However, the only error is an allocation failure of
the CID which does not required to call it.

This change is required for future refcount implementation. The CID lock
will be removed from the free_quic_conn_cids() and to the caller.
2021-12-23 15:51:00 +01:00
Amaury Denoyelle
250ac42754 BUG/MINOR: quic: upgrade rdlock to wrlock for ODCID removal
When a quic_conn is found in the DCID tree, it can be removed from the
first ODCID tree. However, this operation must absolutely be run under a
write-lock to avoid race condition. To avoid to use the lock too
frequently, node.leaf_p is checked. This value is set to NULL after
ebmb_delete.
2021-12-23 15:51:00 +01:00
Amaury Denoyelle
d6b166787c REORG: quic: remove qc_ prefix on functions which not used it directly
The qc_* prefix should be reserved to functions which used a specific
quic_conn instance and are expected to be pinned on the connection
thread.
2021-12-23 15:51:00 +01:00
Frédéric Lécaille
010e532e81 MINOR: quic: Add CONNECTION_CLOSE phrase to trace
Some applications may send some information about the reason why they decided
to close a connection. Add them to CONNECTION_CLOSE frame traces.
Take the opportunity of this patch to shorten some too long variable names
without any impact.
2021-12-23 15:48:25 +01:00
Frédéric Lécaille
1ede823d6b MINOR: quic: Add traces for RX frames (flow control related)
Add traces about important frame types to chunk_tx_frm_appendf()
and call this function for any type of frame when parsing a packet.
Move it to quic_frame.c
2021-12-23 15:48:25 +01:00
Willy Tarreau
77bfa66124 DEBUG: ssl: make sure we never change a servername on established connections
Since this case was already met previously with commit 655dec81b
("BUG/MINOR: backend: do not set sni on connection reuse"), let's make
sure that we don't change reused connection settings. This could be
generalized to most settings that are only in effect before the handshake
in fact (like set_alpn and a few other ones).
2021-12-23 15:44:06 +01:00
Willy Tarreau
0d93a81863 MINOR: pools: work around possibly slow malloc_trim() during gc
During 2.4-dev, support for malloc_trim() was implemented to ease
release of memory in a stopping process. This was found to be quite
effective and later backported to 2.3.7.

Then it was found that sometimes malloc_trim() could take a huge time
to complete it if was competing with other threads still allocating and
releasing memory, reason why it was decided in 2.5-dev to move
malloc_trim() under the thread isolation that was already in place in
the shared pool version of pool_gc() (this was commit 26ed1835).

However, other instances of pool_gc() that used to call malloc_trim()
were not updated since they were not using thread isolation. Currently
we have two other such instances, one for when there is absolutely no
pool and one for when there are only thread-local pools.

Christian Ruppert reported in GH issue #1490 that he's sometimes seeing
and old process die upon reload when upgrading from 2.3 to 2.4, and
that this happens inside malloc_trim(). The problem is that since
2.4-dev11 with commit 0bae07592 we detect modern libc that provide a
faster thread-aware allocator and do not maintain shared pools anymore.
As such we're using again the simpler pool_gc() implementations that do
not use thread isolation around the malloc_trim() call.

All this code was cleaned up recently and the call moved to a new
function trim_all_pools(). This patch implements explicit thread isolation
inside that function so that callers do not have to care about this
anymore. The thread isolation is conditional so that this doesn't affect
the one already in place in the larger version of pool_gc(). This way it
will solve the problem for all callers.

This patch must be backported as far as 2.3. It may possibly require
some adaptations. If trim_all_pools() is not present, copy-pasting the
tests in each version of pool_gc() will have the same effect.

Thanks to Christian for his detailed report and his testing.
2021-12-23 15:44:06 +01:00
Frédéric Lécaille
2c15a66b61 MINOR: quic: Drop asap Retry or Version Negotiation packets
These packet are only sent by servers. We drop them as soon as possible
when we are an haproxy listener.
2021-12-22 20:43:22 +01:00
Frédéric Lécaille
e7ff2b265a MINOR: quic: xprt traces fixes
Empty parameters are permitted with TRACE_*() macros. If removed, must
be replaced by NULL.
2021-12-22 20:43:22 +01:00
Frédéric Lécaille
10250b2e93 MINOR: quic: Handle the cases of overlapping STREAM frames
This is the same treatment for bidi and uni STREAM frames. This is a duplication
code which should me remove building a function for both these types of streams.
2021-12-22 20:43:22 +01:00
Frédéric Lécaille
01cfec74f5 MINOR: quic: Wrong dropped packet skipping
There were cases where some dropped packets were not well skipped. This led
the low level QUIC packet parser to continue from wrong packet boundaries.
2021-12-22 20:43:22 +01:00
Frédéric Lécaille
4d118d6a8e MINOR: quic: unchecked qc_retrieve_conn_from_cid() returned value
If qc_retrieve_conn_from_cid() did not manage to retrieve the connection
from packet CIDs, we must drop them.
2021-12-22 17:27:51 +01:00
Frédéric Lécaille
677b99dca7 MINOR: quic: Add stream IDs to qcs_push_frame() traces
This is only for debug purpose.
2021-12-21 16:06:03 +01:00
Amaury Denoyelle
e770ce3980 MINOR: quic: add quic_conn instance in traces for qc_new_conn
The connection instance has been replaced by a quic_conn as first
argument to QUIC traces. It is possible to report the quic_conn instance
in the qc_new_conn(), contrary to the connection which is not
initialized at this stage.
2021-12-21 15:53:19 +01:00
Amaury Denoyelle
7aaeb5b567 MINOR: quic: use quic_conn as argument to traces
Replace the connection instance for first argument of trace callback by
a quic_conn instance. The QUIC trace module is properly initialized with
the first argument refering to a quic_conn.

Replace every connection instances in TRACE_* macros invocation in
xprt-quic by its related quic_conn. In some case, the connection is
still used to access the quic_conn. It may cause some problem on the
future when the connection will be completly separated from the xprt
layer.

This commit is part of the rearchitecture of xprt-quic layers and the
separation between xprt and connection instances.
2021-12-21 15:53:19 +01:00
Amaury Denoyelle
baea96400f MINOR: trace: add quic_conn argument definition
Prepare trace support for quic_conn instances as argument. This will be
used by the xprt-quic layer in replacement of the connection.

This commit is part of the rearchitecture of xprt-quic layers and the
separation between xprt and connection instances.
2021-12-21 15:53:19 +01:00
Amaury Denoyelle
4fd53d772f MINOR: quic: add const qualifier for traces function
Add const qualifier on arguments of several dump functions used in the
trace callback. This is required to be able to replace the first trace
argument by a quic_conn instance. The first argument is a const pointer
and so the members accessed through it must also be const.
2021-12-21 15:53:19 +01:00
Amaury Denoyelle
c15dd9214b MINOR: quic: add reference to quic_conn in ssl context
Add a new member in ssl_sock_ctx structure to reference the quic_conn
instance if used in the QUIC stack. This member is initialized during
qc_conn_init().

This is needed to be able to access to the quic_conn without relying on
the connection instance. This commit is part of the rearchitecture of
xprt-quic layers and the separation between xprt and connection
instances.
2021-12-21 15:53:19 +01:00
Amaury Denoyelle
8a5b27a9b9 REORG: quic: move mux function outside of xprt
Move qcc_get_qcs() function from xprt_quic.c to mux_quic.c. This
function is used to retrieve the qcs instance from a qcc with a stream
id. This clearly belongs to the mux-quic layer.
2021-12-21 15:51:40 +01:00
Amaury Denoyelle
17a741693c CLEANUP: quic: rename quic_conn instances to qc
Use the convention of naming quic_conn instance as qc to not confuse it
with a connection instance. The changes occured for qc_parse_pkt_frms(),
qc_build_frms() and qc_do_build_pkt().
2021-12-21 15:51:30 +01:00
Frédéric Lécaille
2ce5acf7ed MINOR: quic: Wrong packet refcount handling in qc_pkt_insert()
The QUIC connection I/O handler qc_conn_io_cb() could be called just after
qc_pkt_insert() have inserted a packet in a its tree, and before qc_pkt_insert()
have incremented the reference counter to this packet. As qc_conn_io_cb()
decrement this counter, the packet could be released before qc_pkt_insert()
might increment the counter, leading to possible crashes when trying to do so.
So, let's make qc_pkt_insert() increment this counter before inserting the packet
it is tree. No need to lock anything for that.
2021-12-20 17:33:51 +01:00
Frédéric Lécaille
f1d38cbe15 MINOR: quic: Do not forget STREAM frames received in disorder
Add a function to process all STREAM frames received and ordered
by their offset (qc_treat_rx_strm_frms()) and modify
qc_handle_bidi_strm_frm() consequently.
2021-12-20 17:33:51 +01:00
Frédéric Lécaille
4137b2d316 MINOR: quic: Do not expect to receive only one O-RTT packet
There is nothing about this in the RFC. We must support to receive
several 0-RTT packets before the handshake has completed.
2021-12-20 17:33:51 +01:00
Frédéric Lécaille
91ac6c3a8a MINOR: quic: Add a function to list remaining RX packets by encryption level
This is only to debug some issues which cause the RX buffer saturation
with "Too big packet" traces.
2021-12-20 17:33:51 +01:00
Remi Tricot-Le Breton
cc750efbc5 MINOR: ssl: Remove empty lines from "show ssl ocsp-response" output
There were empty lines in the output of the CLI's "show ssl
ocsp-response" command (after the certificate ID and between two
certificates). This patch removes them since an empty line should mark
the end of the output.

Must be backported in 2.5.
2021-12-20 12:02:17 +01:00
Amaury Denoyelle
dbef985b74 MINOR: quic: simplify the removal from ODCID tree
With the DCID refactoring, the locking is more centralized. It is
possible to simplify the code for removal of a quic_conn from the ODCID
tree.

This operation can be conducted as soon as the connection has been
retrieved from the DCID tree, meaning that the peer now uses the final
DCID. Remove the bit to flag a connection for removal and just uses
ebmb_delete() on each sucessful lookup on the DCID tree. If the
quic_conn has already been removed, it is just a noop thanks to
eb_delete() implementation.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
8efe032bba MINOR: quic: refactor DCID lookup
A new function named qc_retrieve_conn_from_cid() now contains all the
code to retrieve a connection from a DCID. It handle all type of packets
and centralize the locking on the ODCID/DCID trees.

This simplify the qc_lstnr_pkt_rcv() function.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
adb2276524 MINOR: quic: compare coalesced packets by DCID
If an UDP datagram contains multiple QUIC packets, they must all use the
same DCID. The datagram context is used partly for this.

To ensure this, a comparison was made on the dcid_node of DCID tree. As
this is a comparison based on pointer address, it can be faulty when
nodes are removed/readded on the same pointer address.

Replace this comparison by a proper comparison on the DCID data itself.
To this end, the dgram_ctx structure contains now a quic_cid member.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
c92cbfc014 MINOR: quic: refactor concat DCID with address for Initial packets
For first Initial packets, the socket source dest address is
concatenated to the DCID. This is used to be able to differentiate
possible collision between several clients which used the same ODCID.

Refactor the code to manage DCID and the concatenation with the address.
Before this, the concatenation was done on the quic_cid struct and its
<len> field incremented. In the code it is difficult to differentiate a
normal DCID with a DCID + address concatenated.

A new field <addrlen> has been added in the quic_cid struct. The <len>
field now only contains the size of the QUIC DCID. the <addrlen> is
first initialized to 0. If the address is concatenated, it will be
updated with the size of the concatenated address. This now means we
have to explicitely used either cid.len or cid.len + cid.addrlen to
access the DCID or the DCID + the address. The code should be clearer
thanks to this.

The field <odcid_len> in quic_rx_packet struct is now useless and has
been removed. However, a new parameter must be added to the
qc_new_conn() function to specify the size of the ODCID addrlen.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
d496251cde MINOR: quic: rename constant for haproxy CIDs length
On haproxy implementation, generated DCID are on 8 bytes, the minimal
value allowed by the specification. Rename the constant representing
this size to inform that this is haproxy specific.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
260e5e6c24 MINOR: quic: add missing lock on cid tree
All operation on the ODCID/DCID trees must be conducted under a
read-write lock. Add a missing read-lock on the lookup operation inside
listener handler.
2021-12-17 10:59:36 +01:00
Amaury Denoyelle
67e6cd50ef CLEANUP: quic: rename quic_conn conn to qc in quic_conn_free
Rename quic_conn from conn to qc to differentiate it from a struct
connection instance. This convention is already used in the majority of
the code.
2021-12-17 10:59:35 +01:00
Amaury Denoyelle
47e1f6d4e2 CLEANUP: quic: fix spelling mistake in a trace
Initiial -> Initial
2021-12-17 10:59:35 +01:00
Amaury Denoyelle
fdbf63e86e MINOR: mux-quic: fix trace on stream creation
Replace non-initialized qcs.by_id.key by the id to report the proper
stream ID on stream creation.
2021-12-17 09:55:01 +01:00
Frédéric Lécaille
8678eb0d19 CLEANUP: quic: Shorten a litte bit the traces in lstnr_rcv_pkt()
Some traces were too long and confusing when displaying 0 for a non-already
parsed packet number.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
25eeebe293 MINOR: quic: Do not mix packet number space and connection flags
The packet number space flags were mixed with the connection level flags.
This leaded to ACK to be sent at the connection level without regard to
the underlying packet number space. But we want to be able to acknowleged
packets for a specific packet number space.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
afd373c232 MINOR: hq_interop: Stop BUG_ON() truncated streams
This is required if we do not want to make haproxy crash during zerortt
interop runner test which makes a client open multiple streams with
long request paths.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
3fe7df877d CLEANUP: quic: Comment fix for qc_strm_cpy()
This function never returns a negative value... hopefully because it returns
a size_t!!!
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
e629cfd96a MINOR: qpack: Missing check for truncated QPACK fields
Decrementing <len> variable without checking could make haproxy crash (on abort)
when printing a huge buffer (with negative length).
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
a5da31d186 MINOR: quic: Make xprt support 0-RTT.
A client sends a 0-RTT data packet after an Initial one in the same datagram.
We must be able to parse such packets just after having parsed the Initial packets.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
1761fdf0c6 MINOR: ssl_sock: Set the QUIC application from ssl_sock_advertise_alpn_protos.
Make this function call quic_set_app_ops() if the protocol could be negotiated
by the TLS stack.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
b0bd62db23 MINOR: quic: Add quic_set_app_ops() function
Export the code responsible which set the ->app_ops structure into
quic_set_app_ops() function. It must be called by  the TLS callback which
selects the application (ssl_sock_advertise_alpn_protos) so that
to be able to build application packets after having received 0-RTT data.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
4015cbb723 MINOR: quic: No TX secret at EARLY_DATA encryption level
The TLS does not provide us with TX secrets after we have provided it
with 0-RTT data. This is logic: the server does not need to send 0-RTT
data. We must skip the section where such secrets are derived if we do not
want to close the connection with a TLS alert.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
ad3c07ae81 MINOR: quic: Enable TLS 0-RTT if needed
Enable 0-RTT at the TLS context level:
    RFC 9001 4.6.1. Enabling 0-RTT
    Accordingly, the max_early_data_size parameter is repurposed to hold a
    sentinel value 0xffffffff to indicate that the server is willing to accept
    QUIC 0-RTT data.
At the SSL connection level, we must call SSL_set_quic_early_data_enabled().
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
0371cd54d0 CLEANUP: quic: Remove cdata_len from quic_tx_packet struct
This field is no more useful. Modify the traces consequently.
Also initialize ->pn_node.key value to -1, which is an illegal value
for QUIC packet number, and display it in traces if different from -1.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
d8b8443047 MINOR: quic: Add traces for STOP_SENDING frame and modify others
If not handled by qc_parse_pkt_frms(), the packet which contains it is dropped.
Add only a trace when parsing this frame at this time.
Also modify others to reduce the traces size and have more information about streams.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
1d2faa24d2 CLEANUP: quic_frame: Remove a useless suffix to STOP_SENDING
This is to be consistent with the other frame names. Adding a _frame
suffixe to STOP_SENDING is useless. We know this is a frame.
2021-12-17 08:38:43 +01:00
Frédéric Lécaille
f57c333ac1 MINOR: quic: Attach timer task to thread for the connection.
This is to avoid races between the connection I/O handler and this task
which share too much variables.
2021-12-17 08:38:43 +01:00
Remi Tricot-Le Breton
0b9e190028 MEDIUM: vars: Enable optional conditions to set-var converter and actions
This patch adds the possibility to add a set of conditions to a set-var
call, be it a converter or an action (http-request or http-response
action for instance). The conditions must all be true for the given
set-var call for the variable to actually be set. If any of the
conditions is false, the variable is left untouched.
The managed conditions are the following : "ifexists", "ifnotexists",
"ifempty", "ifnotempty", "ifset", "ifnotset", "ifgt", "iflt".  It is
possible to combine multiple conditions in a single set-var call since
some of them apply to the variable itself, and some others to the input.

This patch does not change the fact that variables of scope proc are
still created during configuration parsing, regardless of the conditions
that might be added to the set-var calls in which they are mentioned.
For instance, such a line :
    http-request set-var(proc.foo,ifexists) int(5)
would not prevent the creation of the variable during init, and when
actually reaching this line during runtime, the proc.foo variable would
already exist. This is specific to the proc scope.

These new conditions mean that a set-var could "fail" for other reasons
than memory allocation failures but without clearing the contents of the
variable.
2021-12-16 17:31:57 +01:00
Remi Tricot-Le Breton
bb6bc95b1e MINOR: vars: Parse optional conditions passed to the set-var actions
This patch adds the parsing of the optional condition parameters that
can be passed to the set-var and set-var-fmt actions (http as well as
tcp). Those conditions will not be taken into account yet in the var_set
function so conditions passed as parameters will not have any effect.
Since actions do not benefit from the parameter preparsing that
converters have, parsing conditions needed to be done by hand.
2021-12-16 17:31:57 +01:00
Remi Tricot-Le Breton
51899d251c MINOR: vars: Parse optional conditions passed to the set-var converter
This patch adds the parsing of the optional condition parameters that
can be passed to the set-var converter. Those conditions will not be
taken into account yet in the var_set function so conditions passed as
parameters will not have any effect. This is true for any condition
apart from the "ifexists" one that is also used to replace the
VF_UPDATEONLY flag that was used to prevent proc scope variable creation
from a LUA module.
2021-12-16 17:31:55 +01:00
Remi Tricot-Le Breton
25fccd52ac MINOR: vars: Delay variable content freeing in var_set function
When calling var_set on a variable of type string (SMP_T_STR, SMP_T_BIN
or SMP_T_METH), the contents of the variable were freed directly. When
adding conditions to set-var calls we might have cases in which the
contents of an existing variable should be kept unchanged so the freeing
of the internal buffers is delayed in the var_set function (so that we
can bypass it later).
2021-12-16 17:31:31 +01:00
Remi Tricot-Le Breton
1bd9805085 MINOR: vars: Set variable type to ANY upon creation
The type of a newly created variable was not initialized. This patch
sets it to SMP_T_ANY by default. This will be required when conditions
can be added to a set-var call because we might end up creating a
variable without setting it yet.
2021-12-16 17:31:31 +01:00
Remi Tricot-Le Breton
7055301934 MINOR: vars: Move UPDATEONLY flag test to vars_set_ifexist
The vars_set_by_name_ifexist function was created to avoid creating too
many variables from a LUA module. This was made thanks to the
VF_UPDATEONLY flags which prevented variable creation in the var_set
function. Since commit 3a4bedccc ("MEDIUM: vars: replace the global name
index with a hash") this limitation was restricted to 'proc' scope
variables only.
This patch simply moves the scope test to the vars_set_by_name_ifexist
function instead of the var_set function.
2021-12-16 17:31:27 +01:00
David CARLIER
f5d48f8b3b MEDIUM: cfgparse: numa detect topology on FreeBSD.
allowing for all platforms supporting cpu affinity to have a chance
 to detect the cpu topology from a given valid node (e.g.
 DragonflyBSD seems to be NUMA aware from a kernel's perspective
 and seems to be willing start to provide userland means to get
 proper info).
2021-12-15 11:05:51 +01:00
Amaury Denoyelle
b09f4477f4 CLEANUP: cfgparse: modify preprocessor guards around numa detection code
numa_detect_topology() is always define now if USE_CPU_AFFINITY is
activated. For the moment, only on Linux an actual implementation is
provided. For other platforms, it always return 0.

This change has been made to easily add implementation of NUMA detection
for other platforms. The phrasing of the documentation has also been
edited to removed the mention of Linux-only on numa-cpu-mapping
configuration option.
2021-12-15 11:05:51 +01:00
William Lallemand
740629e296 MINOR: cli: "show version" displays the current process version
This patch implements a simple "show version" command which returns
the version of the current process.

It's available from the master and the worker processes, so it is easy
to check if the master and the workers have the same version.

This is a minor patch that really improve compatibility checks
for scripts.

Could be backported in haproxy version as far as 2.0.
2021-12-14 15:40:06 +01:00
Amaury Denoyelle
1ac95445e6 MINOR: hq-interop: refix tx buffering
Incorrect usage of the buffer API : b_room() replaces b_size() to ensure
that we have enough size for http data copy.
2021-12-10 15:14:58 +01:00
William Lallemand
dcbe7b91d6 BUG/MEDIUM: mworker/cli: crash when trying to access an old PID in prompt mode
The master process encounter a crash when trying to access an old
process which left from the master CLI.

To reproduce the problem, you need a prompt to a previous worker, then
wait for this worker to leave, once it left launch a command from this
prompt. The s->target is then filled with a NULL which is dereferenced
when trying to connect().

This patch fixes the problem by checking if s->target is NULL.

Must be backported as far as 2.0.
2021-12-10 14:30:18 +01:00
Amaury Denoyelle
7059ebc095 MINOR: h3: fix possible invalid dereference on htx parsing
The htx variable is only initialized if we have received a HTTP/3
HEADERS frame. Else it must not be dereferenced.

This should fix the compilation on CI with gcc.
  src/h3.c: In function ‘h3_decode_qcs’:
  src/h3.c:224:14: error: ‘htx’ may be used uninitialized in this function
  [-Werror=maybe-uninitialized]
    224 |   htx->flags |= HTX_FL_EOM
2021-12-08 15:52:59 +01:00
Amaury Denoyelle
f3b0ba7dc9 BUG/MINOR: mux-quic: properly initialize flow control
Initialize all flow control members on the qcc instance. Without this,
the value are undefined and it may be possible to have errors about
reached streams limit.
2021-12-08 15:26:16 +01:00
Amaury Denoyelle
5154e7a252 MINOR: quic: notify the mux on CONNECTION_CLOSE
The xprt layer is reponsible to notify the mux of a CONNECTION_CLOSE
reception. In this case the flag QC_CF_CC_RECV is positionned on the
qcc and the mux tasklet is waken up.

One of the notable effect of the QC_CF_CC_RECV is that each qcs will be
released even if they have remaining data in their send buffers.
2021-12-08 15:26:16 +01:00
Amaury Denoyelle
2873a31c81 MINOR: mux-quic: do not release qcs if there is remaining data to send
A qcs is not freed if there is remaining data in its buffer. In this
case, the flag QC_SF_DETACH is positionned.

The qcc io handler is responsible to remove the qcs if the QC_SF_DETACH
is set and their buffers are empty.
2021-12-08 15:26:16 +01:00
Christopher Faulet
70f8948364 BUG/MINOR: cli/server: Don't crash when a server is added with a custom id
When a server is dynamically added via the CLI with a custom id, the key
used to insert it in the backend's tree of used names is not initialized.
The server id must be used but it is only used when no custom id is
provided. Thus, with a custom id, HAProxy crashes.

Now, the server id is always used to init this key, to be able to insert the
server in the corresponding tree.

This patch should fix the issue #1481. It must be backported as far as 2.4.
2021-12-07 19:04:33 +01:00
Christopher Faulet
ba8f06304e MINOR: http-rules: Add capture action to http-after-response ruleset
It is now possible to perform captures on the response when
http-after-response rules are evaluated. It may be handy to capture headers
from responses generated by HAProxy.

This patch is trivial, it may be backported if necessary.
2021-12-07 19:04:33 +01:00
Amaury Denoyelle
db44338473 MINOR: quic: add HTX EOM on request end
Set the HTX EOM flag on RX the app layer. This is required to notify
about the end of the request for the stream analyzers, else the request
channel never goes to MSG_DONE state.
2021-12-07 17:11:22 +01:00
Amaury Denoyelle
fecfa0d822 MINOR: mux-quic: remove uneeded code to check fin on TX
Remove a wrong comparaison with the same buffer on both sides. In any
cases, the FIN is properly set by qcs_push_frame only when the payload
has been totally emptied.
2021-12-07 17:11:22 +01:00
Amaury Denoyelle
5ede40be67 MINOR: hq-interop: fix tx buffering
On h09 app layer, if there is not enought size in the tx buffer, the
transfer is interrupted and the flag QC_SF_BLK_MROOM is positionned.
The transfer is woken up by the mux when new buffer size becomes
available.

This ensure that no data is silently discarded during transfer. Without
this, once the buffer is full the data were removed and thus not send to
the client resulting in a truncating payload.
2021-12-07 17:08:52 +01:00
Frédéric Lécaille
73dcc6ee62 MINOR: quic: Remove QUIC TX packet length evaluation function
Remove qc_eval_pkt() which has come with the multithreading support. It
was there to evaluate the length of a TX packet before building. We could
build from several thread TX packets without consuming a packet number for nothing (when
the building failed). But as the TX packet building functions are always
executed by the same thread, the one attached to the connection, this does
not make sense to continue to use such a function. Furthermore it is buggy
since we had to recently pad the TX packet under certain circumstances.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
fee7ba673f MINOR: quic: Delete remaining RX handshake packets
After the handshake has succeeded, we must delete any remaining
Initial or Handshake packets from the RX buffer. This cannot be
done depending on the state the connection (->st quic_conn struct
member value) as the packet are not received/treated in order.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
7d807c93f4 MINOR: quic: QUIC encryption level RX packets race issue
The tree containing RX packets must be protected from concurrent accesses.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
d61bc8db59 MINOR: quic: Race issue when consuming RX packets buffer
Add a null byte to the end of the RX buffer to notify the consumer there is no
more data to treat.
Modify quic_rx_packet_pool_purge() which is the function which remove the
RX packet from the buffer.
Also rename this function to quic_rx_pkts_del().
As the RX packets may be accessed by the QUIC connection handler (quic_conn_io_cb())
the function responsible of decrementing their reference counters must not
access other information than these reference counters! It was a very bad idea
to try to purge the RX buffer asap when executing this function.
2021-12-07 15:53:56 +01:00
Frédéric Lécaille
f9cb3a9b0e MINOR: quic: RX buffer full due to wrong CRYPTO data handling
Do not leave in the RX buffer packets with CRYPTO data which were
already received. We do this when parsing CRYPTO frame. If already
received we must not consider such frames as if they were not received
in order! This had as side effect to interrupt the transfer of long streams
(ACK frames not parsed).
2021-12-07 15:53:56 +01:00
Amaury Denoyelle
84ea8dcbc4 MEDIUM: mux-quic: handle when sending buffer is full
Handle the case when the app layer sending buffer is full. A new flag
QC_SF_BLK_MROOM is set in this case and the transfer is interrupted. It
is expected that then the conn-stream layer will subscribe to SEND.

The MROOM flag is reset each time the muxer transfer data from the app
layer to its own buffer. If the app layer has been subscribed on SEND it
is woken up.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
e257d9e8ec MEDIUM: mux-quic: wake up xprt on data transferred
On qc_send, data are transferred for each stream from their qcs.buf to
the qcs.xprt_buf. Wake up the xprt to warn about new data available for
transmission.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
a2c58a7c8d MEDIUM: mux-quic: subscribe on xprt if remaining data after send
The streams data are transferred from the qcs.buf to the qcs.xprt_buf
during qc_send. If the xprt_buf is not empty and not all data can be
transferred, subscribe the connection on the xprt for sending.

The mux will be woken up by the xprt when the xprt_buf will be cleared.
This happens on ACK reception.
2021-12-07 15:44:45 +01:00
Amaury Denoyelle
a3f222dc1e MINOR: mux-quic: implement subscribe on stream
Implement the subscription in the mux on the qcs instance.

Subscribe is now used by the h3 layer when receiving an incomplete frame
on the H3 control stream. It is also used when attaching the remote
uni-directional streams on the h3 layer.

In the qc_send, the mux wakes up the qcs for each new transfer executed.
This is done via the method qcs_notify_send().

The xprt wakes up the qcs when receiving data on unidirectional streams.
This is done via the method qcs_notify_recv().
2021-12-07 15:44:45 +01:00