The MUX instance is released before its quic-conn counterpart. On
termination, a H3 GOAWAY is emitted to prevent the client to open new
streams for this connection.
The quic-conn instance will stay alive until all opened streams data are
acknowledged. If the client tries to open a new stream during this
interval despite the GOAWAY, quic-conn is responsible to request its
immediate closure with a STOP_SENDING + RESET_STREAM.
This behavior was already implemented but the received packet with the
new STREAM was never acknowledged. This was fixed with the following
commit :
commit 156a89aef8
BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
However, this patch introduces a regression as it did not skip the call
to qc_handle_strm_frm() despite the MUX instance being released. This
can cause a segfault when using qcc_get_qcs() on a released MUX
instance. To fix this, add a missing break statement which will skip
qc_handle_strm_frm() when the MUX instance is not initialized.
This commit was reproduced using a short timeout client and sending
several requests with delay between them by using a modified aioquic. It
produces a crash with the following backtrace :
#0 0x000055555594d261 in __eb64_lookup (x=4, root=0x7ffff4091f60) at include/import/eb64tree.h:132
#1 eb64_lookup (root=0x7ffff4091f60, x=4) at src/eb64tree.c:37
#2 0x000055555563fc66 in qcc_get_qcs (qcc=0x7ffff4091dc0, id=4, receive_only=1, send_only=0, out=0x7ffff780ca70) at src/mux_quic.c:668
#3 0x0000555555641e1a in qcc_recv (qcc=0x7ffff4091dc0, id=4, len=40, offset=0, fin=1 '\001', data=0x7ffff40c4fef "\001&") at src/mux_quic.c:974
#4 0x0000555555619d28 in qc_handle_strm_frm (pkt=0x7ffff4088e60, strm_frm=0x7ffff780cf50, qc=0x7ffff7cef000, fin=1 '\001') at src/quic_conn.c:2515
#5 0x000055555561d677 in qc_parse_pkt_frms (qc=0x7ffff7cef000, pkt=0x7ffff4088e60, qel=0x7ffff7cef6c0) at src/quic_conn.c:3050
#6 0x00005555556230aa in qc_treat_rx_pkts (qc=0x7ffff7cef000, cur_el=0x7ffff7cef6c0, next_el=0x0) at src/quic_conn.c:4214
#7 0x0000555555625fee in quic_conn_app_io_cb (t=0x7ffff40c1fa0, context=0x7ffff7cef000, state=32848) at src/quic_conn.c:4640
#8 0x00005555558a676d in run_tasks_from_lists (budgets=0x7ffff780d470) at src/task.c:596
#9 0x00005555558a725b in process_runnable_tasks () at src/task.c:876
#10 0x00005555558522ba in run_poll_loop () at src/haproxy.c:2945
#11 0x00005555558529ac in run_thread_poll_loop (data=0x555555d14440 <ha_thread_info+64>) at src/haproxy.c:3141
#12 0x00007ffff789ebb5 in ?? () from /usr/lib/libc.so.6
#13 0x00007ffff7920d90 in ?? () from /usr/lib/libc.so.6
This should fix github issue #2067.
This must be backported up to 2.6.
In very very rare cases, it is possible the Initial packet number space
must be probed even if it there is no more in flight CRYPTO frames.
In such cases, a PING frame is sent into an Initial packet. As this
packet is ack-eliciting, it must be padded by the server. qc_do_build_pkt()
is modified to do so.
Take the opportunity of this patch to modify the trace for TX frames to
easily distinguished them from other frame relative traces.
Must be backported to 2.7.
Mark the connection as limited by the anti-amplification limit when trying to
probe the peer.
Wakeup the connection PTO/dectection loss timer as soon as a datagram is
received. This was done only when the datagram was dropped.
This fixes deadlock issues revealed by some interop runner tests.
Must be backported to 2.7 and 2.6.
Some frames are marked as already acknowledged from duplicated packets
whose the original packet has been acknowledged. There is no need
to resend such packets or frames.
Implement qc_pkt_with_only_acked_frms() to detect packet with only
already acknowledged frames inside and use it from qc_prep_fast_retrans()
which selects the packet to be retransmitted.
Must be backported to 2.6 and 2.7.
Even if there is a check in callers of qc_prep_hdshk_fast_retrans() and
qc_prep_fast_retrans() to prevent retransmissions of packets with no ack-eliciting
frames, these two functions should pay attention not do to that especially if
someone decides to modify their implementations in the future.
Must be backported to 2.6 and 2.7.
This is an old bug which arrived in this commit due to a misinterpretation
of the RFC I guess where the desired effect was to acknowledge all the
handshake packets:
77ac6f566 BUG/MINOR: quic: Missing acknowledgments for trailing packets
This had as bad effect to acknowledge all the handshake packets even the
ones which are not ack-eliciting.
Must be backported to 2.7 and 2.6.
Dump the secret used to derive the next one during a key update initiated by the
client and dump the resulted new secret and the new key and iv to be used to
decryption Application level packets.
Also add a trace when the key update is supposed to be initiated on haproxy side.
This has already helped in diagnosing an issue evealed by the key update interop
test with xquic as client.
Must be backported to 2.7.
v2 interop runner test revealed this bug as follows:
[01|quic|4|c_conn.c:4087] new packet : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080 rel=H
[01|quic|5|c_conn.c:1509] qc_pkt_decrypt(): entering : qc@0x7f62ec026e30
[01|quic|0|c_conn.c:1553] quic_tls_decrypt() failed : qc@0x7f62ec026e30
[01|quic|5|c_conn.c:1575] qc_pkt_decrypt(): leaving : qc@0x7f62ec026e30
[01|quic|0|c_conn.c:4091] packet decryption failed -> dropped : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080
Only v2 Initial packets decryption received by the clients were impacted. There
is no issue to encrypt v2 Initial packets. This is due to the fact that when
negotiated the client may send two versions of Initial packets (currently v1,
then v2). The selection was done for the TX path but not on the RX path.
Implement qc_select_tls_ctx() to select the correct TLS cipher context for all
types of packets and call this function before removing the header protection
and before deciphering the packet.
Must be backported to 2.7.
When retransmitting datagrams with two coalesced packets inside, the second
packet was not taken into consideration when checking there is enough space
into the network for the datagram, especially when limited by the anti-amplification.
Must be backported to 2.6 and 2.7.
Before building a packet into a datagram, ensure there is sufficient space for at
least 1200 bytes. Also pad datagrams with only one ack-eliciting Initial packet
inside.
Must be backported to 2.7 and 2.6.
Making http_7239_valid_obfsc() inline because it is only called by inline
functions.
Removing dead comment and documenting proxy_http_parse_{7239,xff,xot} functions.
No backport needed.
Anonymization mode has two CLI handlers "set anon <on|off>" and "set
anon global-key". The last one only requires admin level. However, as
cli_find_kw() is implemented, only the first handler will be retrieved
as they both start with the same prefix "set anon".
This has the effect to execute the wrong handler for "set anon
global-key" with an error message about an invalid keyword. To fix this,
handlers definition have been separated for both "set anon on" and "set
anon off" commands. This allows to have minimal changes while keeping
the same "set anon" prefix for each commands.
Also take this opportunity to fix a reference to a non-existing "set
global-key" CLI handler in the documentation.
This must be backported up to 2.7.
When a STREAM frame is re-emitted, it will point to the same stream
buffer as the original one. If an ACK is received for either one of
these frame, the underlying buffer may be freed. Thus, if the second
frame is declared as lost and schedule for retransmission, we must
ensure that the underlying buffer is still allocated or interrupt the
retransmission.
Stream buffer is stored as an eb_tree indexed by the stream ID. To avoid
to lookup over a tree each time a STREAM frame is re-emitted, a lost
STREAM frame is flagged as QUIC_FL_TX_FRAME_LOST.
In most cases, this code is functional. However, there is several
potential issues which may cause a segfault :
- when explicitely probing with a STREAM frame, the frame won't be
flagged as lost
- when splitting a STREAM frame during retransmission, the flag is not
copied
To fix both these cases, QUIC_FL_TX_FRAME_LOST flag has been converted
to a <dup> field in quic_stream structure. This field is now properly
copied when splitting a STREAM frame. Also, as this is now an inner
quic_frame field, it will be copied automatically on qc_frm_dup()
invocation thus ensuring that it will be set on probing.
This issue was encounted randomly with the following backtrace :
#0 __memmove_avx512_unaligned_erms ()
#1 0x000055f4d5a48c01 in memcpy (__len=18446698486215405173, __src=<optimized out>,
#2 quic_build_stream_frame (buf=0x7f6ac3fcb400, end=<optimized out>, frm=0x7f6a00556620,
#3 0x000055f4d5a4a147 in qc_build_frm (buf=buf@entry=0x7f6ac3fcb5d8,
#4 0x000055f4d5a23300 in qc_do_build_pkt (pos=<optimized out>, end=<optimized out>,
#5 0x000055f4d5a25976 in qc_build_pkt (pos=0x7f6ac3fcba10,
#6 0x000055f4d5a30c7e in qc_prep_app_pkts (frms=0x7f6a0032bc50, buf=0x7f6a0032bf30,
#7 qc_send_app_pkts (qc=0x7f6a0032b310, frms=0x7f6a0032bc50) at src/quic_conn.c:4184
#8 0x000055f4d5a35f42 in quic_conn_app_io_cb (t=0x7f6a0009c660, context=0x7f6a0032b310,
This should fix github issue #2051.
This should be backported up to 2.6.
In the OCSP response callback, instead of using the actual date of the
system, the scheduler's 'now' timer is used when checking a response's
validity.
This patch can be backported to all stable versions.
When adding a new certificate through the CLI and appending it to a
crt-list with the 'ocsp-update' option set, the new certificate would
not be added to the OCSP response update list.
The only thing that was missing was the copy of the ocsp_update mode
from the ssl_bind_conf into the ckch_store's object.
An extra wakeup of the update task also needed to happen in case the
newly inserted entry needs to be updated before the next wakeup of the
task.
This patch does not need to be backported.
Add tests for the "show ssl ocsp-updates" cli command as well as the new
'base64' parameter that can be passed to the "show ssl ocsp-response"
command.
The options were after the filters which does not work well and now
raises a warning. It did not break the regtest because the crt-lists
were not actually used by clients.
The minimum and maximum delays between two automatic updates of a given
OCSP response can now be set via global options. It allows to limit the
update rate of OCSP responses for configurations that use many frontend
certificates with the ocsp-update option set if the updates are deemed
too costly.
A new format option can be passed to the "show ssl ocsp-response" CLI
command to dump the contents of an OCSP response in base64. This is
needed because thanks to the new OCSP auto update mechanism, we could
end up using an OCSP response internally that was never provided by the
user.
In case of successive OCSP update errors for a given OCSP response, the
retry delay will be multiplied by 2 for every new failure in order to
avoid retrying too often to update responses for which the responder is
unresponsive (for instance). The maximum delay will still be taken into
account so the OCSP update requests will wtill be sent at least every
hour.
Instead of using the same proxy as other http client calls (through lua
for instance), the OCSP update will use a dedicated proxy which will
enable it to change the log format and log conditions (for instance).
This proxy will have the NOLOGNORM option and regular logging will be
managed by the update task itself because in order to dump information
related to OCSP updates, we need to control the moment when the logs are
emitted (instead or relying on the stream's life which is decorrelated
from the update itself).
The update task then calls sess_log directly, which uses a dedicated
ocsp logformat that fetches specific OCSP data. Sess_log was preferred
to the more low level app_log because it offers the strength of
"regular" sample fetches and allows to add generic information alongside
OCSP ones in the log line.
In case of connection error (unreachable server for instance), a regular
httpclient log line will also be emitted. This line will have some extra
HTTP related info that can't be provided by the ocsp update logging
mechanism.
This patch adds a series of sample fetches that rely on the specified
OCSP update context structure. They will then be of use only in the
context of an ongoing OCSP update.
They cannot be used directly in the configuration so they won't be made
public. They will be used in the OCSP update's specific log format which
should be emitted by the update task itself in a future patch.
This command can be used to dump information about the entries contained
in the ocsp update tree. It will display one line per concerned OCSP
response and will contain the expected next update time as well as the
time of the last successful update, and the number of successful and
failed attempts.
In order to have some information about the frontend certificate when
dumping the contents of the ocsp update tree from the cli, we could
either keep a reference to a ckch_store in the certificate_ocsp
structure, which might cause some dangling reference problems, or
simply copy the path to the certificate in the ocsp response structure.
This latter solution was chosen because of its simplicity.
Those new specific error codes will enable to know a bit better what
went wrong during and OCSP update process. They will come to use in
future sample fetches as well as in debugging means (via the cli or
future traces).
In case of allocation error during the construction of an OCSP request
for instance, we would have ended reinserting the ocsp entry at the same
place in the ocsp update tree which could potentially lead to an
"endless" loop of errors in ssl_ocsp_update_responses. In such a case,
entries are now reinserted further in the tree (1 minute later) in order
to avoid such a chain of alloc failure.
When an abort is detected before all headers were received, and if there are
pending incoming data, we must report a parsing error instead of a
connection abort. This way it will be able to be handled as an invalid
message by HTTP analyzers instead of an early abort with no message.
It is especially important to be accurate on L7 retry. Indeed, without this
fix, this case will be handle by the "empty-response" retries policy while a
retry on "junk-response" is more accurate.
This patch must be backported to 2.7.
A recent fix (af124360e "BUG/MEDIUM: http-ana: Detect closed SC on opposite side
during body forwarding") was pushed to handle to sync a side when the opposite
one is in closing state. However, sometimes, the synchro is performed too early,
preventing a L7 retry to be performed.
Indeed, while the above fix is valid on the reponse side. On the request side,
if the response was not yet received, we must wait before closing.
So, to fix the fix, on the request side, we at least wait the response was
received before finishing the request analysis. Of course, if there is an error,
an abort or anything wrong on the server side, the response analyser should
handle it.
This patch is related to #2061. No backport needed.
A regression about "empty-response" L7 retry was introduced with the commit
dd6496f591 ("CLEANUP: http-ana: Remove useless if statement about L7
retries").
The if statetement was removed on a wrong assumption. Indeed, L7 retries on
status is now handled in the HTTP analysers. Thus, the stream-connector
(formely the conn-stream, and before again the stream-interface) no longer
report a read error to force a retry. But it is still possible to get a read
error with no response. In this case, we must perform a retry is
"empty-response" is enabled.
So the if statement is re-introduced, reverting the cleanup.
This patch should fix the issue #2061. It must be backported as far as 2.4.
When we are about to perform a L7 retry, we deal with the conn_retries
counter, to be sure we can retry. However, there is an issue here because
the counter is incremented before it is checked against the backend
limit. So, we can miss a connection retry.
Of course, we must invert both operation. The conn_retries counter must be
incremented after the check agains the backend limit.
This patch must be backported as far as 2.6.
This patch completes the previous one with poller subscribe of quic-conn
owned socket on sendto() error. This ensures that mux-quic is notified
if waiting on sending when a transient sendto() error is cleared. As
such, qc_notify_send() is called directly inside socket I/O callback.
qc_notify_send() internal condition have been thus completed. This will
prevent to notify upper layer until all sending condition are fulfilled:
room in congestion window and no transient error on socket FD.
This should be backported up to 2.7.
On sendto() transient error, prior to this patch sending was simulated
and we relied on retransmission to retry sending. This could hurt
significantly the performance.
Thanks to quic-conn owned socket support, it is now possible to improve
this. On transient error, sending is interrupted and quic-conn socket FD
is subscribed on the poller for sending. When send is possible,
quic_conn_sock_fd_iocb() will be in charge of restart sending.
A consequence of this change is on the return value of qc_send_ppkts().
This function will now return 0 on transient error if quic-conn has its
owned socket. This is used to interrupt sending in the calling function.
The flag QUIC_FL_CONN_TO_KILL must be checked to differentiate a fatal
error from a transient one.
This should be backported up to 2.7.
Sending is implemented in two parts on quic-conn module. First, QUIC
packets are prepared in a buffer and then sendto() is called with this
buffer as input.
qc.tx.buf is used as the input buffer. It must always be empty before
starting to prepare new packets in it. Currently, this is guarantee by
the fact that either sendto() is completed, a fatal error is encountered
which prevent future send, or a transient error is encountered and we
rely on retransmission to send the remaining data.
This will change when poller subscribe of socket FD on sendto()
transient error will be implemented. In this case, qc.tx.buf will not be
emptied to resume sending when the transient error is cleared. To allow
the current sending process to work as expected, a new function
qc_purge_txbuf() is implemented. It will try to send remaining data
before preparing new packets for sending. If successful, txbuf will be
emptied and sending can continue. If not, sending will be interrupted.
This should be backported up to 2.7.
Implement qc_notify_send(). This function is responsible to notify the
upper layer subscribed on SUB_RETRY_SEND if sending condition are back
to normal.
For the moment, this patch has no functional change as only congestion
window room is checked before notifying the upper layer. However, this
will be extended when poller subscribe of socket on sendto() error will
be implemented. qc_notify_send() will thus be responsible to ensure that
all condition are met before wake up the upper layer.
This should be backported up to 2.7.
This patch simply clean up return paths used in various send function of
quic-conn module. This will simplify the implementation of poller
subscribing on sendto() error which add another error handling path.
This should be backported up to 2.7.
Added new testcases for all 4 branches of smp_fetch_hdr_ip():
- a plain IPv4 address
- an IPv4 address with an port number
- a plain IPv6 address
- an IPv6 address wrapped in [] brackets
The Content-Length header is always added into the request for an HTTP
health-check. However, when there is no payload, this header may be skipped
for OPTIONS, GET, HEAD and DELETE methods. In fact, it is a "SHOULD NOT" in
the RCF 9110 (#8.6).
It is not really an issue in itself but it seems to be an issue for AWS
ELB. It returns a 400-Bad-Request if a HEAD/GET request with no payload
contains a Content-Length header.
So, it is better to skip this header when possible.
This patch should fix the issue #2026. It could be backported as far as 2.2.
When the HTTP request of a health-check is forged, we must not pretend there
is no payload, by setting HTX_SL_F_BODYLESS, if a log-format body was
configured.
Indeed, a test on the body length was used but it is only valid for a plain
string. For A log-format string, a list is used. Note it an bug with no
consequence for now.
This patch must be backported as far as 2.2.
If the response is closed before any data was received, we must not report
an error to the SE descriptor. It is important to be able to retry on an
empty response.
This patch should fix the issue #2061. It must be backported to 2.7.
When a connection is removed from the safe list or the idle list,
CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags must be cleared. It is performed
when the connection is reused. But not when it is moved into the
toremove_conns list. It may be an issue because the multiplexer owning the
connection may be woken up before the connection is really removed. If the
connection flags are not sanitized, it may think the connection is idle and
reinsert it in the corresponding list. From this point, we can imagine
several bugs. An UAF or a connection reused with an invalid state for
instance.
To avoid any issue, the connection flags are sanitized when an idle
connection is moved into the toremove_conns list. The same is performed at
right places in the multiplexers. Especially because the connection release
may be delayed (for h2 and fcgi connections).
This patch shoudld fix the issue #2057. It must carefully be backported as
far as 2.2. Especially on the 2.2 where the code is really different. But
some conflicts should be expected on the 2.4 too.
EBADF on sendto() is considered as a fatal error. As such, it is removed
from the list of the transient errors. The connection will be killed
when encountered.
For the record, EBADF can be encountered on process termination with the
listener socket.
This should be backported up to 2.7.
Send is conducted through qc_send_ppkts() for a QUIC connection. There
is two types of error which can be encountered on sendto() or affiliated
syscalls :
* transient error. In this case, sending is simulated with the remaining
data and retransmission process is used to have the opportunity to
retry emission
* fatal error. If this happens, the connection should be closed as soon
as possible. This is done via qc_kill_conn() function. Until this
patch, only ECONNREFUSED errno was considered as fatal.
Modify the QUIC send API to be able to differentiate transient and fatal
errors more easily. This is done by fixing the return value of the
sendto() wrapper qc_snd_buf() :
* on fatal error, a negative error code is returned. This is now the
case for every errno except EAGAIN, EWOULDBLOCK, ENOTCONN, EINPROGRESS
and EBADF.
* on a transient error, 0 is returned. This is the case for the listed
errno values above and also if a partial send has been conducted by
the kernel.
* on success, the return value of sendto() syscall is returned.
This commit will be useful to be able to handle transient error with a
quic-conn owned socket. In this case, the socket should be subscribed to
the poller and no simulated send will be conducted.
This commit allows errno management to be confined in the quic-sock
module which is a nice cleanup.
On a final note, EBADF should be considered as fatal. This will be the
subject of a next commit.
This should be backported up to 2.7.
thread_set_first_group() and thread_set_first_tmask() were modified
and renamed to instead return the number and mask of the nth group.
Passing zero continues to return the first one, but it will be more
convenient to use this way when building shards.