Commit Graph

335 Commits

Author SHA1 Message Date
Frédéric Lécaille
495968ed51 MINOR: quic: Add trace to debug idle timer task issues
Add TRACE_PROTO() call where this is relevant to debug issues about
qc_idle_timer_task() issues.

Must be backported to 2.6 and 2.7.
2023-04-04 18:24:28 +02:00
Ilya Shipitsin
07be66d21b CLEANUP: assorted typo fixes in the code and comments
This is 35th iteration of typo fixes
2023-04-01 18:33:40 +02:00
Frédéric Lécaille
d721571d26 MEDIUM: quic: Ack delay implementation
Reuse the idle timeout task to delay the acknowledgments. The time of the
idle timer expiration is for now on stored in ->idle_expire. The one to
trigger the acknowledgements is stored in ->ack_expire.
Add QUIC_FL_CONN_ACK_TIMER_FIRED new connection flag to mark a connection
as having its acknowledgement timer been triggered.
Modify qc_may_build_pkt() to prevent the sending of "ack only" packets and
allows the connection to send packet when the ack timer has fired.
It is possible that acks are sent before the ack timer has triggered. In
this case it is cancelled only if ACK frames are really sent.
The idle timer expiration must be set again when the ack timer has been
triggered or when it is cancelled.

Must be backported to 2.7.
2023-03-31 13:41:17 +02:00
Frédéric Lécaille
8f991948f5 MINOR: quic: Traces adjustments at proto level.
Dump variables displayed by TRACE_ENTER() or TRACE_LEAVE() by calls to TRACE_PROTO().
No more variables are displayed by the two former macros. For now on, these information
are accessible from proto level.
Add new calls to TRACE_PROTO() at important locations in relation whith QUIC transport
protocol.
When relevant, try to prefix such traces with TX or RX keyword to identify the
concerned subpart (transmission or reception) of the protocol.

Must be backported to 2.7.
2023-03-31 09:54:59 +02:00
Frédéric Lécaille
deb978149a BUG/MINOR: quic: Missing max_idle_timeout initialization for the connection
This bug was revealed by handshakeloss interop tests (often with quiceh) where one
could see haproxy an Initial packet without TLS ClientHello message (only a padded
PING frame). In this case, as the ->max_idle_timeout was not initialized, the
connection was closed about three seconds later, and haproxy opened a new one with
a new source connection ID upon receipt of the next client Initial packet. As the
interop runner count the number of source connection ID used by the server to check
there were exactly 50 such IDs used by the server, it considered the test as failed.

So, the ->max_idle_timeout of the connection must be at least initialized
to the local "max_idle_timeout" transport parameter value to avoid such
a situation (closing connections too soon) until it is "negotiated" with the
client when receiving its TLS ClientHello message.

Must be backported to 2.7 and 2.6.
2023-03-31 09:54:59 +02:00
Frédéric Lécaille
a3772e1134 MINOR: quic: Add recovery related information to "show quic"
Add ->srtt, ->rtt_var, ->rtt_min and ->pto_count values from ->path->loss
struct to "show quic". Same thing for ->cwnd from ->path struct.

Also take the opportunity of this patch to dump the packet number
space information directly from ->pktns[] array in place of ->els[]
array. Indeed, ->els[QUIC_TLS_ENC_LEVEL_EARLY_DATA] and ->els[QUIC_TLS_ENC_LEVEL_APP]
have the same packet number space.

Must be backported to 2.7 where "show quic" implementation has alredy been
backported.
2023-03-31 09:54:59 +02:00
Frdric Lcaille
9c317b1d35 BUG/MINOR: quic: Missing padding in very short probe packets
This bug arrived with this commit:
   MINOR: quic: Send PING frames when probing Initial packet number space

This may happen when haproxy needs to probe the peer with very short packets
(only one PING frame). In this case, the packet must be padded. There was clearly
a case which was removed by the mentionned commit above. That said, there was
an extra byte which was added to the PADDING frame before the mentionned commit
above. This is no more the case with this patch.

Thank you to @tatsuhiro-t (ngtcp2 manager) for having reported this issue which
was revealed by the keyupdate test (on client side).

Must be backported to 2.7 and 2.6.
2023-03-28 18:26:57 +02:00
Frédéric Lécaille
c425e03b28 BUG/MINOR: quic: Missing STREAM frame type updated
This patch follows this commit which was not sufficient:
  BUG/MINOR: quic: Missing STREAM frame data pointer updates

Indeed, after updating the ->offset field, the bit which informs the
frame builder of its presence must be systematically set.

This bug was revealed by the following BUG_ON() from
quic_build_stream_frame() :
  bug condition "!!(frm->type & 0x04) != !!stream->offset.key" matched at src/quic_frame.c:515

This should fix the last crash occured on github issue #2074.

Must be backported to 2.6 and 2.7.
2023-03-27 16:01:44 +02:00
Amaury Denoyelle
8afe4b88c4 BUG/MINOR: quic: ignore congestion window on probing for MUX wakeup
qc_notify_send() is used to wake up the MUX layer for sending. This
function first ensures that all sending condition are met to avoid to
wake up the MUX for unnecessarily.

One of this condition is to check if there is room in the congestion
window. However, when probe packets must be sent due to a PTO
expiration, RFC 9002 explicitely mentions that the congestion window
must be ignored which was not the case prior to this patch.

This commit fixes this by first setting <pto_probe> of 01RTT packet
space before invoking qc_notify_send(). This ensures that congestion
window won't be checked anymore to wake up the MUX layer until probing
packets are sent.

This commit replaces the following one which was not sufficient :
  commit e25fce03eb
  BUG/MINOR: quic: Dysfunctional 01RTT packet number space probing

This should be backported up to 2.7.
2023-03-21 14:52:02 +01:00
Amaury Denoyelle
2a19b6e564 BUG/MINOR: quic: wake up MUX on probing only for 01RTT
On PTO probe timeout expiration, a probe packet must be emitted.
quic_pto_pktns() is used to determine for which packet space the timer
has expired. However, if MUX is already subscribed for sending, it is
woken up without checking first if this happened for the 01RTT packet
space.

It is unsure that this is really a bug as in most cases, MUX is
established only after Initial and Handshake packet spaces are removed.
However, the situation is not se clear when 0-RTT is used. For this
reason, adjust the code to explicitely check for the 01RTT packet space
before waking up the MUX layer.

This should be backported up to 2.6. Note that qc_notify_send() does not
exists in 2.6 so it should be replaced by the explicit block checking
(qc->subs && qc->subs->events & SUB_RETRY_SEND).
2023-03-21 14:09:50 +01:00
Frédéric Lécaille
e25fce03eb BUG/MINOR: quic: Dysfunctional 01RTT packet number space probing
This bug arrived with this commit:
   "MINOR: quic: implement qc_notify_send()".
The ->tx.pto_probe variable was no more set when qc_processt_timer() the timer
task for the connection responsible of detecting packet loss and probing upon
PTO expiration leading to interrupted stream transfers. This was revealed by
blackhole interop failed tests where one could see that qc_process_timer()
was wakeup without traces as follows in the log file:
   "needs to probe 01RTT packet number space"

Must be backported to 2.7 and to 2.6 if the commit mentionned above
is backported to 2.6 in the meantime.
2023-03-20 17:50:36 +01:00
Frédéric Lécaille
c664e644eb MINOR: quic: Stop stressing the acknowledgments process (RX ACK frames)
The ACK frame range of packets were handled from the largest to the smallest
packet number, leading to big number of ebtree insertions when the packet are
handled in the inverse way they are sent. This was detected a long time ago
but left in the code to stress our implementation. It is time to be more
efficient and process the packet so that to avoid useless ebtree insertions.

Modify qc_ackrng_pkts() responsible of handling the acknowledged packets from an
ACK frame range of acknowledged packets.

Must be backported to 2.7.
2023-03-20 17:47:12 +01:00
Frdric Lcaille
ca07979b97 BUG/MINOR: quic: Missing STREAM frame data pointer updates
This patch follows this one which was not sufficient:
    "BUG/MINOR: quic: Missing STREAM frame length updates"
Indeed, it is not sufficient to update the ->len and ->offset member
of a STREAM frame to move it forward. The data pointer must also be updated.
This is not done by the STREAM frame builder.

Must be backported to 2.6 and 2.7.
2023-03-17 09:21:18 +01:00
Frdric Lcaille
fc546ab6a7 BUG/MINOR: quic: Missing STREAM frame length updates
Some STREAM frame lengths were not updated before being duplicated, built
of requeued contrary to their ack offsets. This leads haproxy to crash when
receiving acknowledgements for such frames with this thread #1 backtrace:

 Thread 1 (Thread 0x7211b6ffd640 (LWP 986141)):
 #0  ha_crash_now () at include/haproxy/bug.h:52
 No locals.
 #1  b_del (b=<optimized out>, del=<optimized out>) at include/haproxy/buf.h:436
 No locals.
 #2  qc_stream_desc_ack (stream=stream@entry=0x7211b6fd9bc8, offset=offset@entry=53176, len=len@entry=1122) at src/quic_stream.c:111

Thank you to @Tristan971 for having provided such traces which reveal this issue:

[04|quic|5|c_conn.c:1865] qc_requeue_nacked_pkt_tx_frms(): entering : qc@0x72119c22cfe0
[04|quic|5|_frame.c:1179] qc_frm_unref(): entering : qc@0x72119c22cfe0
[04|quic|5|_frame.c:1186] qc_frm_unref(): remove frame reference : qc@0x72119c22cfe0 frm@0x72118863d260 STREAM_F uni=0 fin=1 id=460 off=52957 len=1122 3244
[04|quic|5|_frame.c:1194] qc_frm_unref(): leaving : qc@0x72119c22cfe0
[04|quic|5|c_conn.c:1902] qc_requeue_nacked_pkt_tx_frms(): updated partially acked frame : qc@0x72119c22cfe0 frm@0x72119c472290 STREAM_F uni=0 fin=1 id=460 off=53176 len=1122

Note that haproxy has much more chance to crash if this frame is the last one
(fin bit set). But another condition must be fullfilled to update the ack offset.
A previous STREAM frame from the same stream with the same offset but with less
data must be acknowledged by the peer. This is the condition to update the ack offset.

For others frames without fin bit in the same conditions, I guess the stream may be
truncated because too much data are removed from the stream when they are
acknowledged.

Must be backported to 2.6 and 2.7.
2023-03-16 14:35:19 +01:00
Frédéric Lécaille
be795ceb91 MINOR: quic: Do not stress the peer during retransmissions of lost packets
This issue was revealed by "Multiple streams" QUIC tracker test which very often
fails (locally) with a file of about 1Mbytes (x4 streams). The log of QUIC tracker
revealed that from its point of view, the 4 files were never all received entirely:

 "results" : {
      "stream_0_rec_closed" : true,
      "stream_0_rec_offset" : 1024250,
      "stream_0_snd_closed" : true,
      "stream_0_snd_offset" : 15,
      "stream_12_rec_closed" : false,
      "stream_12_rec_offset" : 72689,
      "stream_12_snd_closed" : true,
      "stream_12_snd_offset" : 15,
      "stream_4_rec_closed" : true,
      "stream_4_rec_offset" : 1024250,
      "stream_4_snd_closed" : true,
      "stream_4_snd_offset" : 15,
      "stream_8_rec_closed" : true,
      "stream_8_rec_offset" : 1024250,
      "stream_8_snd_closed" : true,
      "stream_8_snd_offset" : 15
   },

But this in contradiction with others QUIC tracker logs which confirms that haproxy
has really (re)sent the stream at the suspected offset(stream_12_rec_offset):

       1152085,
       "transport",
       "packet_received",
       {
          "frames" : [
             {
                "frame_type" : "stream",
                "length" : "155",
                "offset" : "72689",
                "stream_id" : "12"
             }
          ],
          "header" : {
             "dcid" : "a14479169ebb9dba",
             "dcil" : "8",
             "packet_number" : "466",
             "packet_size" : 190
          },
          "packet_type" : "1RTT"
       }

When detected as losts, the packets are enlisted, then their frames are
requeued in their packet number space by qc_requeue_nacked_pkt_tx_frms().
This was done using a local list which was spliced to the packet number
frame list. This had as bad effect to retransmit the frames in the inverse
order they have been sent. This is something the QUIC tracker go client
does not like at all!

Removing the frame splicing fixes this issue and allows haproxy to pass the
"Multiple streams" test.

Must be backported to 2.7.
2023-03-08 18:50:45 +01:00
Frédéric Lécaille
cc101cd2aa BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
This bug arrived with this commit:
     b5a8020e9 MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
and was revealed by h3 interop tests with clients like s2n-quic and quic-go
as noticed by Amaury.

Indeed, one must check that the CID matching the sequence number provided by a received
RETIRE_CONNECTION_ID frame does not match the DCID of the packet.
Remove useless ->curr_cid_seq_num member from quic_conn struct.
The sequence number lookup must be done in qc_handle_retire_connection_id_frm()
to check the validity of the RETIRE_CONNECTION_ID frame, it returns the CID to be
retired into <cid_to_retire> variable passed as parameter to this function if
the frame is valid and if the CID was not already retired

Must be backported to 2.7.
2023-03-08 14:53:12 +01:00
Amaury Denoyelle
2d37629222 MINOR: quic: handle new closing list in show quic
A new global quic-conn list has been added by the previous patch. It will
contain every quic-conn in closing or draining state.

Thus, it is now easier to include or skip them on a "show quic" output :
when the default list on the current thread has been browsed entirely,
either we skip to the next thread or we look at the closing list on the
current thread.

This should be backported up to 2.7.
2023-03-08 14:39:48 +01:00
Amaury Denoyelle
efed86c973 MINOR: quic: create a global list dedicated for closing QUIC conns
When a CONNECTION_CLOSE is emitted or received, a QUIC connection enters
respectively in draining or closing state. These states are a loose
equivalent of TCP TIME_WAIT. No data can be exchanged anymore but the
connection is maintained during a certain timer to handle packet
reordering or loss.

A new global list has been defined for QUIC connections in
closing/draining state inside thread_ctx structure. Each time a
connection enters in one of this state, it will be moved from the
default global list to the new closing list.

The objective of this patch is to quickly filter connections on
closing/draining. Most notably, this will be used to wake up these
connections and avoid that haproxy process stopping is delayed by them.

A dedicated function qc_detach_th_ctx_list() has been implemented to
transfer a quic-conn from one list instance to the other. This takes
care of back-references attach to a quic-conn instance in case of a
running "show quic".

This should be backported up to 2.7.
2023-03-08 14:39:48 +01:00
Frédéric Lécaille
5e3201ea77 MINOR: quic: Add transport parameters to "show quic"
Modify quic_transport_params_dump() and others function relative to the
transport parameters value dump from TRACE() to make their output more
compact.
Add call to quic_transport_params_dump() to dump the transport parameters
from "show quic" CLI command.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
ece86e64c4 MINOR: quic: Add spin bit support
Add QUIC_FL_RX_PACKET_SPIN_BIT new RX packet flag to mark an RX packet as having
the spin bit set. Idem for the connection with QUIC_FL_CONN_SPIN_BIT flag.
Implement qc_handle_spin_bit() to set/unset QUIC_FL_CONN_SPIN_BIT for the connection
as soon as a packet number could be deciphered.
Modify quic_build_packet_short_header() to set the spin bit when building
a short packet header.

Validated by quic-tracker spin bit test.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
433af7fad9 MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
These allocations are definitively useless.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
8ac8a8778d MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
Add ->curr_cid_seq_num new quic_conn struct frame to store the connection
ID sequence number currently used by the connection.
Implement qc_handle_retire_connection_id_frm() to handle this RX frame.
Implement qc_retire_connection_seq_num() to remove a connection ID from its
sequence number.
Implement qc_build_new_connection_id_frm to allocate a new NEW_CONNECTION_ID
frame from a CID.
Modify qc_parse_pkt_frms() which parses the frames of an RX packet to handle
the case of the RETIRE_CONNECTION_ID frame.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
b4c5471425 MINOR: quic: Store the next connection IDs sequence number in the connection
Add ->next_cid_seq_num new member to quic_conn struct to store the next
connection ID to be used to alloacated a connection ID.
It is initialized to 0 from qc_new_conn() which initializes a connection.
Modify new_quic_cid() to use this variable each time it is called without
giving the possibility to the caller to pass the sequence number for the
connection to be allocated.

Modify quic_build_post_handshake_frames() to use ->next_cid_seq_num
when building NEW_CONNECTION_ID frames after the hanshake has been completed.
Limit the number of connection IDs provided to the peer to the minimum
between 4 and the value it sent with active_connection_id_limit transport
parameter. This includes the connection ID used by the connection to send
this new connection IDs.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Amaury Denoyelle
315a4f6ae5 BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
The MUX instance is released before its quic-conn counterpart. On
termination, a H3 GOAWAY is emitted to prevent the client to open new
streams for this connection.

The quic-conn instance will stay alive until all opened streams data are
acknowledged. If the client tries to open a new stream during this
interval despite the GOAWAY, quic-conn is responsible to request its
immediate closure with a STOP_SENDING + RESET_STREAM.

This behavior was already implemented but the received packet with the
new STREAM was never acknowledged. This was fixed with the following
commit :
  commit 156a89aef8
  BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released

However, this patch introduces a regression as it did not skip the call
to qc_handle_strm_frm() despite the MUX instance being released. This
can cause a segfault when using qcc_get_qcs() on a released MUX
instance. To fix this, add a missing break statement which will skip
qc_handle_strm_frm() when the MUX instance is not initialized.

This commit was reproduced using a short timeout client and sending
several requests with delay between them by using a modified aioquic. It
produces a crash with the following backtrace :
 #0  0x000055555594d261 in __eb64_lookup (x=4, root=0x7ffff4091f60) at include/import/eb64tree.h:132
 #1  eb64_lookup (root=0x7ffff4091f60, x=4) at src/eb64tree.c:37
 #2  0x000055555563fc66 in qcc_get_qcs (qcc=0x7ffff4091dc0, id=4, receive_only=1, send_only=0, out=0x7ffff780ca70) at src/mux_quic.c:668
 #3  0x0000555555641e1a in qcc_recv (qcc=0x7ffff4091dc0, id=4, len=40, offset=0, fin=1 '\001', data=0x7ffff40c4fef "\001&") at src/mux_quic.c:974
 #4  0x0000555555619d28 in qc_handle_strm_frm (pkt=0x7ffff4088e60, strm_frm=0x7ffff780cf50, qc=0x7ffff7cef000, fin=1 '\001') at src/quic_conn.c:2515
 #5  0x000055555561d677 in qc_parse_pkt_frms (qc=0x7ffff7cef000, pkt=0x7ffff4088e60, qel=0x7ffff7cef6c0) at src/quic_conn.c:3050
 #6  0x00005555556230aa in qc_treat_rx_pkts (qc=0x7ffff7cef000, cur_el=0x7ffff7cef6c0, next_el=0x0) at src/quic_conn.c:4214
 #7  0x0000555555625fee in quic_conn_app_io_cb (t=0x7ffff40c1fa0, context=0x7ffff7cef000, state=32848) at src/quic_conn.c:4640
 #8  0x00005555558a676d in run_tasks_from_lists (budgets=0x7ffff780d470) at src/task.c:596
 #9  0x00005555558a725b in process_runnable_tasks () at src/task.c:876
 #10 0x00005555558522ba in run_poll_loop () at src/haproxy.c:2945
 #11 0x00005555558529ac in run_thread_poll_loop (data=0x555555d14440 <ha_thread_info+64>) at src/haproxy.c:3141
 #12 0x00007ffff789ebb5 in ?? () from /usr/lib/libc.so.6
 #13 0x00007ffff7920d90 in ?? () from /usr/lib/libc.so.6

This should fix github issue #2067.

This must be backported up to 2.6.
2023-03-06 13:39:40 +01:00
Frédéric Lécaille
ec93721fb0 MINOR: quic: Send PING frames when probing Initial packet number space
In very very rare cases, it is possible the Initial packet number space
must be probed even if it there is no more in flight CRYPTO frames.
In such cases, a PING frame is sent into an Initial packet. As this
packet is ack-eliciting, it must be padded by the server. qc_do_build_pkt()
is modified to do so.

Take the opportunity of this patch to modify the trace for TX frames to
easily distinguished them from other frame relative traces.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
a65b71f89f BUG/MINOR: quic: Missing detections of amplification limit reached
Mark the connection as limited by the anti-amplification limit when trying to
probe the peer.
Wakeup the connection PTO/dectection loss timer as soon as a datagram is
received. This was done only when the datagram was dropped.
This fixes deadlock issues revealed by some interop runner tests.

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
e6359b649b BUG/MINOR: quic: Do not resend already acked frames
Some frames are marked as already acknowledged from duplicated packets
whose the original packet has been acknowledged. There is no need
to resend such packets or frames.

Implement qc_pkt_with_only_acked_frms() to detect packet with only
already acknowledged frames inside and use it from qc_prep_fast_retrans()
which selects the packet to be retransmitted.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
21564be4a2 BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
Even if there is a check in callers of qc_prep_hdshk_fast_retrans() and
qc_prep_fast_retrans() to prevent retransmissions of packets with no ack-eliciting
frames, these two functions should pay attention not do to that especially if
someone decides to modify their implementations in the future.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
b3562a3815 BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
This is an old bug which arrived in this commit due to a misinterpretation
of the RFC I guess where the desired effect was to acknowledge all the
handshake packets:

    77ac6f566 BUG/MINOR: quic: Missing acknowledgments for trailing packets

This had as bad effect to acknowledge all the handshake packets even the
ones which are not ack-eliciting.

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
51a7caf921 MINOR: quic: Add traces about QUIC TLS key update
Dump the secret used to derive the next one during a key update initiated by the
client and dump the resulted new secret and the new key and iv to be used to
decryption Application level packets.

Also add a trace when the key update is supposed to be initiated on haproxy side.

This has already helped in diagnosing an issue evealed by the key update interop
test with xquic as client.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
720277843b BUG/MINOR: quic: v2 Initial packets decryption failed
v2 interop runner test revealed this bug as follows:

     [01|quic|4|c_conn.c:4087] new packet : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080 rel=H
     [01|quic|5|c_conn.c:1509] qc_pkt_decrypt(): entering : qc@0x7f62ec026e30
     [01|quic|0|c_conn.c:1553] quic_tls_decrypt() failed : qc@0x7f62ec026e30
     [01|quic|5|c_conn.c:1575] qc_pkt_decrypt(): leaving : qc@0x7f62ec026e30
     [01|quic|0|c_conn.c:4091] packet decryption failed -> dropped : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080

Only v2 Initial packets decryption received by the clients were impacted. There
is no issue to encrypt v2 Initial packets. This is due to the fact that when
negotiated the client may send two versions of Initial packets (currently v1,
then v2). The selection was done for the TX path but not on the RX path.

Implement qc_select_tls_ctx() to select the correct TLS cipher context for all
types of packets and call this function before removing the header protection
and before deciphering the packet.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
d30a04a4bb BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
When retransmitting datagrams with two coalesced packets inside, the second
packet was not taken into consideration when checking there is enough space
into the network for the datagram, especially when limited by the anti-amplification.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
ceb88b8f46 MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
This should be helpful to detect too small datagrams: datagrams
smaller than 1200 bytes, with Initial packets inside.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
69e7118fe9 BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
Before building a packet into a datagram, ensure there is sufficient space for at
least 1200 bytes. Also pad datagrams with only one ack-eliciting Initial packet
inside.

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Amaury Denoyelle
c8a0efbda8 BUG/MEDIUM: quic: properly handle duplicated STREAM frames
When a STREAM frame is re-emitted, it will point to the same stream
buffer as the original one. If an ACK is received for either one of
these frame, the underlying buffer may be freed. Thus, if the second
frame is declared as lost and schedule for retransmission, we must
ensure that the underlying buffer is still allocated or interrupt the
retransmission.

Stream buffer is stored as an eb_tree indexed by the stream ID. To avoid
to lookup over a tree each time a STREAM frame is re-emitted, a lost
STREAM frame is flagged as QUIC_FL_TX_FRAME_LOST.

In most cases, this code is functional. However, there is several
potential issues which may cause a segfault :
- when explicitely probing with a STREAM frame, the frame won't be
  flagged as lost
- when splitting a STREAM frame during retransmission, the flag is not
  copied

To fix both these cases, QUIC_FL_TX_FRAME_LOST flag has been converted
to a <dup> field in quic_stream structure. This field is now properly
copied when splitting a STREAM frame. Also, as this is now an inner
quic_frame field, it will be copied automatically on qc_frm_dup()
invocation thus ensuring that it will be set on probing.

This issue was encounted randomly with the following backtrace :
 #0  __memmove_avx512_unaligned_erms ()
 #1  0x000055f4d5a48c01 in memcpy (__len=18446698486215405173, __src=<optimized out>,
 #2  quic_build_stream_frame (buf=0x7f6ac3fcb400, end=<optimized out>, frm=0x7f6a00556620,
 #3  0x000055f4d5a4a147 in qc_build_frm (buf=buf@entry=0x7f6ac3fcb5d8,
 #4  0x000055f4d5a23300 in qc_do_build_pkt (pos=<optimized out>, end=<optimized out>,
 #5  0x000055f4d5a25976 in qc_build_pkt (pos=0x7f6ac3fcba10,
 #6  0x000055f4d5a30c7e in qc_prep_app_pkts (frms=0x7f6a0032bc50, buf=0x7f6a0032bf30,
 #7  qc_send_app_pkts (qc=0x7f6a0032b310, frms=0x7f6a0032bc50) at src/quic_conn.c:4184
 #8  0x000055f4d5a35f42 in quic_conn_app_io_cb (t=0x7f6a0009c660, context=0x7f6a0032b310,

This should fix github issue #2051.

This should be backported up to 2.6.
2023-03-03 15:08:02 +01:00
Amaury Denoyelle
caa16549b8 MINOR: quic: notify on send ready
This patch completes the previous one with poller subscribe of quic-conn
owned socket on sendto() error. This ensures that mux-quic is notified
if waiting on sending when a transient sendto() error is cleared. As
such, qc_notify_send() is called directly inside socket I/O callback.

qc_notify_send() internal condition have been thus completed. This will
prevent to notify upper layer until all sending condition are fulfilled:
room in congestion window and no transient error on socket FD.

This should be backported up to 2.7.
2023-03-01 14:32:37 +01:00
Amaury Denoyelle
e1a0ee3cf6 MEDIUM: quic: implement poller subscribe on sendto error
On sendto() transient error, prior to this patch sending was simulated
and we relied on retransmission to retry sending. This could hurt
significantly the performance.

Thanks to quic-conn owned socket support, it is now possible to improve
this. On transient error, sending is interrupted and quic-conn socket FD
is subscribed on the poller for sending. When send is possible,
quic_conn_sock_fd_iocb() will be in charge of restart sending.

A consequence of this change is on the return value of qc_send_ppkts().
This function will now return 0 on transient error if quic-conn has its
owned socket. This is used to interrupt sending in the calling function.
The flag QUIC_FL_CONN_TO_KILL must be checked to differentiate a fatal
error from a transient one.

This should be backported up to 2.7.
2023-03-01 14:32:37 +01:00
Amaury Denoyelle
147862de61 MINOR: quic: purge txbuf before preparing new packets
Sending is implemented in two parts on quic-conn module. First, QUIC
packets are prepared in a buffer and then sendto() is called with this
buffer as input.

qc.tx.buf is used as the input buffer. It must always be empty before
starting to prepare new packets in it. Currently, this is guarantee by
the fact that either sendto() is completed, a fatal error is encountered
which prevent future send, or a transient error is encountered and we
rely on retransmission to send the remaining data.

This will change when poller subscribe of socket FD on sendto()
transient error will be implemented. In this case, qc.tx.buf will not be
emptied to resume sending when the transient error is cleared. To allow
the current sending process to work as expected, a new function
qc_purge_txbuf() is implemented. It will try to send remaining data
before preparing new packets for sending. If successful, txbuf will be
emptied and sending can continue. If not, sending will be interrupted.

This should be backported up to 2.7.
2023-03-01 14:29:16 +01:00
Amaury Denoyelle
e0fe118dad MINOR: quic: implement qc_notify_send()
Implement qc_notify_send(). This function is responsible to notify the
upper layer subscribed on SUB_RETRY_SEND if sending condition are back
to normal.

For the moment, this patch has no functional change as only congestion
window room is checked before notifying the upper layer. However, this
will be extended when poller subscribe of socket on sendto() error will
be implemented. qc_notify_send() will thus be responsible to ensure that
all condition are met before wake up the upper layer.

This should be backported up to 2.7.
2023-03-01 14:29:16 +01:00
Amaury Denoyelle
37333864ef MINOR: quic: simplify return path in send functions
This patch simply clean up return paths used in various send function of
quic-conn module. This will simplify the implementation of poller
subscribing on sendto() error which add another error handling path.

This should be backported up to 2.7.
2023-03-01 14:29:16 +01:00
Amaury Denoyelle
1febc2d316 MEDIUM: quic: improve fatal error handling on send
Send is conducted through qc_send_ppkts() for a QUIC connection. There
is two types of error which can be encountered on sendto() or affiliated
syscalls :
* transient error. In this case, sending is simulated with the remaining
  data and retransmission process is used to have the opportunity to
  retry emission
* fatal error. If this happens, the connection should be closed as soon
  as possible. This is done via qc_kill_conn() function. Until this
  patch, only ECONNREFUSED errno was considered as fatal.

Modify the QUIC send API to be able to differentiate transient and fatal
errors more easily. This is done by fixing the return value of the
sendto() wrapper qc_snd_buf() :
* on fatal error, a negative error code is returned. This is now the
  case for every errno except EAGAIN, EWOULDBLOCK, ENOTCONN, EINPROGRESS
  and EBADF.
* on a transient error, 0 is returned. This is the case for the listed
  errno values above and also if a partial send has been conducted by
  the kernel.
* on success, the return value of sendto() syscall is returned.

This commit will be useful to be able to handle transient error with a
quic-conn owned socket. In this case, the socket should be subscribed to
the poller and no simulated send will be conducted.

This commit allows errno management to be confined in the quic-sock
module which is a nice cleanup.

On a final note, EBADF should be considered as fatal. This will be the
subject of a next commit.

This should be backported up to 2.7.
2023-02-28 10:51:25 +01:00
Frdric Lcaille
b7a13be6cd BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
This issue arrived with this commit:
    1dbeb35f8 MINOR: quic: Add new traces about by connection RX buffer handling

and revealed by the GH CI as follows:

   src/quic_conn.c: In function ‘quic_rx_pkts_del’:
    include/haproxy/trace.h:134:65: error: format ‘%zu’ expects argument of type ‘size_t’,
    but argument 6 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=]
    _msg_len = snprintf(_msg, sizeof(_msg), (fmt), ##args);

Replace all %zu printf integer format by %llu.

Must be backported to 2.7 where the previous is supposed to be backported.
2023-02-24 09:23:07 +01:00
Frdric Lcaille
bbf86be996 BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
If the TX buffer (->tx.buf) attached to the connection is not drained, there
are chances that this will be detected by qc_txb_release() which triggers
a BUG_ON_HOT() when this is the case as follows

[00|quic|2|c_conn.c:3477] UDP port unreachable : qc@0x5584f18d6d50 pto_count=0 cwnd=6816 ppif=1046 pif=1046
[00|quic|5|ic_conn.c:749] qc_kill_conn(): entering : qc@0x5584f18d6d50
[00|quic|5|ic_conn.c:752] qc_kill_conn(): leaving : qc@0x5584f18d6d50
[00|quic|5|c_conn.c:3532] qc_send_ppkts(): leaving : qc@0x5584f18d6d50 pto_count=0 cwnd=6816 ppif=1046 pif=1046

FATAL: bug condition "buf && b_data(buf)" matched at src/quic_conn.c:3098

Consume the remaining data in the TX buffer calling b_del().

This bug arrived with this commit:
    a2c62c314 MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt

Takes also the opportunity of this patch to modify the comments for qc_send_ppkts()
which should have arrived with a2c62c314 commit.

Must be backported to 2.7 where this latter commit is supposed to be backported.
2023-02-21 11:06:10 +01:00
Amaury Denoyelle
77ed63106d MEDIUM: quic: trigger fast connection closing on process stopping
With previous commit, quic-conn are now handled as jobs to prevent the
termination of haproxy process. This ensures that QUIC connections are
closed when all data are acknowledged by the client and there is no more
active streams.

The quic-conn layer emits a CONNECTION_CLOSE once the MUX has been
released and all streams are acknowledged. Then, the timer is scheduled
to definitely free the connection after the idle timeout period. This
allows to treat late-arriving packets.

Adjust this procedure to deactivate this timer when process stopping is
in progress. In this case, quic-conn timer is set to expire immediately
to free the quic-conn instance as soon as possible. This allows to
quickly close haproxy process.

This should be backported up to 2.7.
2023-02-20 11:20:18 +01:00
Amaury Denoyelle
fb375574f9 MINOR: quic: mark quic-conn as jobs on socket allocation
To prevent data loss for QUIC connections, haproxy global variable jobs
is incremented each time a quic-conn socket is allocated. This allows
the QUIC connection to terminate all its transfer operation during proxy
soft-stop. Without this patch, the process will be terminated without
waiting for QUIC connections.

Note that this is done in qc_alloc_fd(). This means only QUIC connection
with their owned socket will properly support soft-stop. In the other
case, the connection will be interrupted abruptly as before. Similarly,
jobs decrement is conducted in qc_release_fd().

This should be backported up to 2.7.
2023-02-20 11:20:18 +01:00
Amaury Denoyelle
156a89aef8 BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
When the MUX is freed, the quic-conn layer may stay active until all
streams acknowledgment are processed. In this interval, if a new stream
is opened by the client, the quic-conn is thus now responsible to handle
it. This is done by the emission of a STOP_SENDING + RESET_STREAM.

Prior to this patch, the received packet was not acknowledged. This is
undesirable if the quic-conn is able to properly reject the request as
this can lead to unneeded retransmission from the client.

This must be backported up to 2.6.
2023-02-20 10:54:27 +01:00
Amaury Denoyelle
7546301712 BUG/MINOR: quic: also send RESET_STREAM if MUX released
When the MUX is freed, the quic-conn layer may stay active until all
streams acknowledgment are processed. In this interval, if a new stream
is opened by the client, the quic-conn is thus now responsible to handle
it. This is done by the emission of a STOP_SENDING.

This process has been completed to also emit a RESET_STREAM with the
same error code H3_REQUEST_REJECTED. This is done to conform with the H3
specification to invite the client to retry its request on a new
connection.

This should be backported up to 2.6.
2023-02-20 10:52:51 +01:00
Amaury Denoyelle
38836b6b3d MINOR: quic: adjust request reject when MUX is already freed
When the MUX is freed, the quic-conn layer may stay active until all
streams acknowledgment are processed. In this interval, if a new stream
is opened by the client, the quic-conn is thus now responsible to handle
it. This is done by the emission of a STOP_SENDING.

This process is closely related to HTTP/3 protocol despite being handled
by the quic-conn layer. This highlights a flaw in our QUIC architecture
which should be adjusted. To reflect this situation, the function
qc_stop_sending_frm_enqueue() is renamed qc_h3_request_reject(). Also,
internal H3 treatment such as uni-directional bypass has been moved
inside the function.

This commit is only a refactor. However, bug fix on next patches will
rely on it so it should be backported up to 2.6.
2023-02-20 10:52:05 +01:00
Frédéric Lécaille
5faf577997 BUG/MINOR: quic: Missing padding for short packets
This was revealed by Amaury when setting tune.quic.frontend.max-streams-bidi to 8
and asking a client to open 12 streams. haproxy has to send short packets
with little MAX_STREAMS frames encoded with 2 bytes. In addition to a packet number
encoded with only one byte. In the case <len_frms> is the length of the encoded
frames to be added to the packet plus the length of the packet number.

Ensure the length of the packet is at least QUIC_PACKET_PN_MAXLEN adding a PADDING
frame wich (QUIC_PACKET_PN_MAXLEN - <len_frms>) as size. For instance with
a two bytes MAX_STREAMS frames and a one byte packet number length, this adds
one byte of padding.

See https://datatracker.ietf.org/doc/html/rfc9001#name-header-protection-sample.

Must be backported to 2.7 and 2.6.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
35218c6357 BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
When receiving an Initial packet a peer must drop it if the datagram is smaller
than 1200. Before this patch, this is the entire datagram which was dropped.

In such a case, drop the packet after having parsed its length.

Must be backported to 2.6 and 2.7
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
8f7d22406c BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
This bug arrives with this commit:
    982896961 MINOR: quic: split and rename qc_lstnr_pkt_rcv()
The first block of code consists in possibly setting this variable to true.
But it was already initialized to true before entering this code section.
Should be initialized to false.

Also take the opportunity to remove an unused "err" label.

Must be backported to 2.6 and 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
7c6d8f88df BUG/MINOR: quic: Do not probe with too little Initial packets
Before probing the Initial packet number space, verify that we can at least
sent 1200 bytes by datagram. This may not be the case due to the amplification limit.

Must be backported to 2.6 and 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
4540053fa6 MINOR: quic: Add <pto_count> to the traces
This may be useful to diagnose issues in relation with QUIC recovery.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
bc09f745e6 MINOR: quic: Add a trace to identify connections which sent Initial packet.
This should help in diagnosing issues revealed by the interop runner which counts
the number of handshakes from the number of Initial packets sent by the server.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
1e8ef1bed6 BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
The aim of this function is to rearm the idle timer. The ->expire
field of the timer task was updated without being requeued.
Some connection could be unexpectedly terminated.

Must be backported to 2.6 and 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
e1738df468 MINOR: quic: Make qc_dgrams_retransmit() return a status.
This is very helpful during retranmission when receiving ICMP port unreachable
errors after the peer has left. This is the unique case at prevent where
qc_send_hdshk_pkts() or qc_send_app_probing() may fail (when they call
qc_send_ppkts() which fails with ECONNREFUSED as errno).

Also make the callers qc_dgrams_retransmit() stop their packet process. This
is the case of quic_conn_app_io_cb() and quic_conn_io_cb().

This modifications stops definitively any packet processing when receiving
ICMP port unreachable errors.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
2f531116ed MINOR: quic: Add traces to qc_kill_conn()
Very minor modification to help in debugging issues.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
a2c62c3141 MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt
The send*() syscall which are responsible of such ICMP packets reception
fails with ECONNREFUSED as errno.

  man(7) udp
  ECONNREFUSED
      No receiver was associated with the destination address.
      This might be caused by a previous packet sent over the socket.

We must kill asap the underlying connection.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
dd41a45014 MINOR: quic: Simplication for qc_set_timer()
There is no reason to run code for nothing if the timer task has been
released when entering qc_set_timer().

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
dea3298282 BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
The ->expire field of the timer task to be cancelled was not reset
to TICK_ETERNITY.

Must be backported to 2.6 and 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
75c8ad5490 MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
This code was there because the timer task was not running on the same thread
as the one which parse the QUIC packets. Now that this is no more the case,
we can wake up this task directly.

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
1dbeb35f80 MINOR: quic: Add new traces about by connection RX buffer handling
Move quic_rx_pkts_del() out of quic_conn.h to make it benefit from the TRACE API.
Add traces which already already helped in diagnosing an issue encountered with
ngtcp2 which sent too much 1RTT packets before the handshake completion. This
has been fixed here after having discussed with Tasuhiro on QUIC dev slack:

https://github.com/ngtcp2/ngtcp2/pull/663

Must be backported to 2.7.
2023-02-17 17:36:30 +01:00
Frédéric Lécaille
07846cbda8 BUG/MINOR: quic: Wrong datagram dispatch because of qc_check_dcid()
There was a parenthesis placed in the wrong place for a memcmp().
As a consequence, clients could not reuse a UDP address for a new connection.

Must be backported to 2.7.
2023-02-13 16:14:24 +01:00
Frdric Lcaille
91376d6134 BUG/MEDIUM: quic: Buffer overflow when looking through QUIC CLI keyword list
This has been detected by libasan as follows:

=================================================================
==3170559==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55cf77faad08 at pc 0x55cf77a87370 bp 0x7ffc01bdba70 sp 0x7ffc01bdba68
READ of size 8 at 0x55cf77faad08 thread T0
    #0 0x55cf77a8736f in cli_find_kw src/cli.c:335
    #1 0x55cf77a8a9bb in cli_parse_request src/cli.c:792
    #2 0x55cf77a8c385 in cli_io_handler src/cli.c:1024
    #3 0x55cf77d19ca1 in task_run_applet src/applet.c:245
    #4 0x55cf77c0b6ba in run_tasks_from_lists src/task.c:634
    #5 0x55cf77c0cf16 in process_runnable_tasks src/task.c:861
    #6 0x55cf77b48425 in run_poll_loop src/haproxy.c:2934
    #7 0x55cf77b491cf in run_thread_poll_loop src/haproxy.c:3127
    #8 0x55cf77b4bef2 in main src/haproxy.c:3783
    #9 0x7fb8b0693d09 in __libc_start_main ../csu/libc-start.c:308
    #10 0x55cf7764f4c9 in _start (/home/flecaille/src/haproxy-untouched/haproxy+0x1914c9)

0x55cf77faad08 is located 0 bytes to the right of global variable 'cli_kws' defined in 'src/quic_conn.c:7834:27' (0x55cf77faaca0) of size 104
SUMMARY: AddressSanitizer: global-buffer-overflow src/cli.c:335 in cli_find_kw
Shadow bytes around the buggy address:

According to cli_find_kw() code and cli_kw_list struct definition, the second
member of this structure ->kw[] must be a null-terminated array.
Add a last element with default initializers to <cli_kws> global variable which
is impacted by this bug.

This bug arrived with this commit:
   15c74702d MINOR: quic: implement a basic "show quic" CLI handler

Must be backported to 2.7 where this previous commit has been already
backported.
2023-02-11 21:08:34 +01:00
Amaury Denoyelle
a9de25a559 BUG/MINOR: quic: fix type bug on "show quic" for 32-bits arch
Incorrect printf format specifier "%lu" was used on "show quic" handler
for uint64_t. This breaks build on 32-bits architecture. To fix this
portability issue, force an explicit cast to unsigned long long with
"%llu" specifier.

This must be backported up to 2.7.
2023-02-10 09:29:37 +01:00
Amaury Denoyelle
10a46de620 BUG/MINOR: quic: fix filtering of closing connections on "show quic"
Filtering of closing/draining connections on "show quic" was not
properly implemented. This causes the extra argument "all" to display
all connections to be without effect. This patch fixes this and restores
the output of all connections.

This must be backported up to 2.7.
2023-02-09 18:30:14 +01:00
Amaury Denoyelle
3f9758ecab MINOR: quic: filter closing conn on "show quic"
Reduce default "show quic" output by masking connection on
closing/draing state due to a CONNECTION_CLOSE emission/reception. These
connections can still be displayed using the special argument "all".

This should be backported up to 2.7.
2023-02-09 18:14:40 +01:00
Amaury Denoyelle
2eda63b447 MINOR: quic: display Tx stream info on "show quic"
Complete "show quic" handler by displaying information about
quic_stream_desc entries. These structures are used to emit stream data
and store them until acknowledgment is received.

This should be backported up to 2.7.
2023-02-09 18:14:40 +01:00
Amaury Denoyelle
1b0fc437f3 MINOR: quic: display infos about various encryption level on "show quic"
Complete "show quic" handler by displaying various information related
to each encryption level and packet number space. Most notably, ack
ranges and bytes in flight are present to help debug retransmission
issues.

This should be backported up to 2.7.
2023-02-09 18:14:40 +01:00
Amaury Denoyelle
b89c0e243a MINOR: quic: display socket info on "show quic"
Complete "show quic" handler by displaying information related to the
quic_conn owned socket. First, the FD is printed, followed by the
address of the local and remote endpoint.

This should be backported up to 2.7.
2023-02-09 18:14:40 +01:00
Amaury Denoyelle
58d9d5d160 MINOR: quic: display CIDs and state in "show quic"
Complete "show quic" handler. Source and destination CIDs are printed
for every connection. This is complete by a state info to reflect if
handshake is completed and if a CONNECTION_CLOSE has been emitted or
received and the allocation status of the attached MUX. Finally the idle
timer expiration is also printed.

This should be backported up to 2.7.
2023-02-09 18:14:40 +01:00
Amaury Denoyelle
15c74702d5 MINOR: quic: implement a basic "show quic" CLI handler
Implement a basic "show quic" CLI handler. This command will be useful
to display various information on all the active QUIC frontend
connections.

This work is heavily inspired by "show sess". Most notably, a global
list of quic_conn has been introduced to be able to loop over them. This
list is stored per thread in ha_thread_ctx.

Also add three CLI handlers for "show quic" in order to allocate and
free the command context. The dump handler runs on thread isolation.
Each quic_conn is referenced using a back-ref to handle deletion during
handler yielding.

For the moment, only a list of raw quic_conn pointers is displayed. The
handler will be completed over time with more information as needed.

This should be backported up to 2.7.
2023-02-09 18:11:00 +01:00
Amaury Denoyelle
f2f08f88ef BUG/MEDIUM: quic: do not split STREAM frames if no space
When building STREAM frames in a packet buffer, if a frame is too large
it will be splitted in two. A shorten version will be used and the
original frame will be modified to represent the remaining space.

To ensure there is enough space to store the frame data length encoded
as a QUIC integer, we use the function max_available_room(). This
function can return 0 if there not only a small space left which is
insufficient for the frame header and the shorten data. Prior to this
patch, this wasn't check and an empty unneeded STREAM frame was built
and sent for nothing.

Change this by checking the value return by max_available_room(). If 0,
do not try to split this frame and continue to the next ones in the
packet.

On 2.6, this patch serves as an optimization which will prevent the building
of unneeded empty STREAM frames.

On 2.7, this behavior has the side-effect of triggering a BUG_ON()
statement on quic_build_stream_frame(). This BUG_ON() ensures that we do
not use quic_frame with OFF bit set if its offset is 0. This can happens
if the condition defined above is reproduced for a STREAM frame at
offset 0. An empty unneeded frame is built as descibed. The problem is
that the original frame is modified with its OFF bit set even if the
offset is still 0.

This must be backported up to 2.6.
2023-02-03 19:19:50 +01:00
Frdric Lcaille
0aa79953c9 BUG/MINOR: quic: Unchecked source connection ID
The SCID (source connection ID) used by a peer (client or server) is sent into the
long header of a QUIC packet in clear. But it is also sent into the transport
parameters (initial_source_connection_id). As these latter are encrypted into the
packet, one must check that these two pieces of information do not differ
due to a packet header corruption. Furthermore as such a connection is unusuable
it must be killed and must stop as soon as possible processing RX/TX packets.

Implement qc_kill_con() to flag a connection as unusable and to kille it asap
waking up the idle timer task to release the connection.

Add a check to quic_transport_params_store() to detect that the SCIDs do not
match and make it call qc_kill_con().

Add several tests about connection to be killed at several critial locations,
especially in the TLS stack callback to receive CRYPTO data from or derive secrets,
and before preparing packet after having received others.

Must be backported to 2.6 and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
af25a69c8b MEDIUM: quic: Remove qc_conn_finalize() from the ClientHello TLS callbacks
This is a bad idea to make the TLS ClientHello callback call qc_conn_finalize().
If this latter fails, this would generate a TLS alert and make the connection
send packet whereas it is not functional. But qc_conn_finalize() job was to
install the transport parameters sent by the QUIC listener. This installation
cannot be done at any time. This must be done after having possibly negotiated
the QUIC version and before sending the first Handshake packets. It seems
the better moment to do that in when the Handshake TX secrets are derived. This
has been found inspecting the ngtcp2 code. Calling SSL_set_quic_transport_params()
too late would make the ServerHello to be sent without the transport parameters.

The code for the connection update which was done from qc_conn_finalize() has
been moved to quic_transport_params_store(). So, this update is done as soon as
possible.

Add QUIC_FL_CONN_TX_TP_RECEIVED to flag the connection as having received the
peer transport parameters. Indeed this is required when the ClientHello message
is splitted between packets.

Add QUIC_FL_CONN_FINALIZED to protect the connection from calling qc_conn_finalize()
more than one time. This latter is called only when the connection has received
the transport parameters and after returning from SSL_do_hanshake() which is the
function which trigger the TLS ClientHello callback call.

Remove the calls to qc_conn_finalize() from from the TLS ClientHello callbacks.

Must be backported to 2.6. and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
8417beb7da BUG/MAJOR: quic: Possible crash when processing 1-RTT during 0-RTT session
This bug was revealed by some C1 interop tests (heavy hanshake packet
corruption) when receiving 1-RTT packets with a key phase update.
This lead the packet to be decrypted with the next key phase secrets.
But this latter is initialized only after the handshake is complete.

In fact, 1-RTT must never be processed before the handshake is complete.
Relying on the "qc->mux_state == QC_MUX_NULL" condition to check the
handshake is complete is wrong during 0-RTT sessions when the mux
is initialized before the handshake is complete.

Must be backported to 2.7 and 2.6.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
37ed4a3842 MINOR: quic: When probing Handshake packet number space, also probe the Initial one
This is not really a bug fix but an improvement. When the Handshake packet number
space has been detected as needed to be probed, we should also try to probe the
Initial packet number space if there are still packets in flight. Furthermore
we should also try to send up to two datagrams.

Must be backported to 2.6 and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
055e82657e BUG/MINOR: quic: Do not ignore coalesced packets in qc_prep_fast_retrans()
This function is called only when probing only one packet number space
(Handshake) or two times the same one (Application). So, there is no risk
to prepare two times the same frame when uneeded because we wanted to
probe two packet number spaces. The condition "ignore the packets which
has been coalesced to another one" is not necessary. More importantly
the bug is when we want to prepare a Application packet which has
been coalesced to an Handshake packet. This is always the case when
the first Application packet is sent. It is always coalesced to
an Handshake packet with an ACK frame. So, when lost, this first
application packet was never resent. It contains the HANDSHAKE_DONE
frame to confirm the completion of the handshake to the client.

Must be backported to 2.6 and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
6dead91b8a MINOR: quic: Add a trace about variable states in qc_prep_fast_retrans()
This has already been very useful to diagnose retransmission issues.

Must be backported to 2.6 and 2.7.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
b75eecc874 BUG/MINOR: quic: Too big PTO during handshakes
During the handshake and when the handshake has not been confirmed
the acknowledgement delays reported by the peer may be larger
than max_ack_delay. max_ack_delay SHOULD be ignored before the
handshake is completed when computing the PTO. But the current code considered
the wrong condition "before the hanshake is completed".

Replace the enum value QUIC_HS_ST_COMPLETED by QUIC_HS_ST_CONFIRMED to
fix this issue. In quic_loss.c, the parameter passed to quic_pto_pktns()
is renamed to avoid any possible confusion.

Must be backported to 2.7 and 2.6.
2023-02-03 17:55:55 +01:00
Frdric Lcaille
dd419461ef BUG/MINOR: quic: Possible stream truncations under heavy loss
This may happen during retransmission of frames which can be splitted
(CRYPTO, or STREAM frames). One may have to split a frame to be
retransmitted due to the QUIC protocol properties (packet size limitation
and packet field encoding sizes). The remaining part of a frame which
cannot be retransmitted must be detached from the original frame it is
copied from. If not, when the really sent part will be acknowledged
the remaining part will be acknowledged too but not sent!

Must be backported to 2.7 and 2.6.
2023-02-03 17:55:55 +01:00
Amaury Denoyelle
24d5b72ca9 MINOR: quic: add config for retransmit limit
Define a new configuration option "tune.quic.max-frame-loss". This is
used to specify the limit for which a single frame instance can be
detected as lost. If exceeded, the connection is closed.

This should be backported up to 2.7.
2023-02-03 11:56:46 +01:00
Amaury Denoyelle
e4abb1f2da MEDIUM: quic: implement a retransmit limit per frame
Add a <loss_count> new field in quic_frame structure. This field is set
to 0 and incremented each time a sent packet is declared lost. If
<loss_count> reached a hard-coded limit, the connection is deemed as
failing and is closed immediately with a CONNECTION_CLOSE using
INTERNAL_ERROR.

By default, limit is set to 10. This should ensure that overall memory
usage is limited if a peer behaves incorrectly.

This should be backported up to 2.7.
2023-02-03 11:56:42 +01:00
Amaury Denoyelle
57b3eaa793 MINOR: quic: refactor frame deallocation
Define a new function qc_frm_free() to handle frame deallocation. New
BUG_ON() statements ensure that the deallocated frame is not referenced
by other frame. To support this, all LIST_DELETE() have been replaced by
LIST_DEL_INIT(). This should enforce that frame deallocation is robust.

As a complement, qc_frm_unref() has been moved into quic_frame module.
It is justified as this is a utility function related to frame
deallocation. It allows to use it in quic_pktns_tx_pkts_release() before
calling qc_frm_free().

This should be backported up to 2.7.
2023-02-03 11:55:41 +01:00
Amaury Denoyelle
40c24f1a10 MINOR: quic: define new functions for frame alloc
Define two utility functions for quic_frame allocation :
* qc_frm_alloc() is used to allocate a new frame
* qc_frm_dup() is used to allocate a new frame by duplicating an
  existing one

Theses functions are useful to centralize quic_frame initialization.
Note that pool_zalloc() is replaced by a proper pool_alloc() + explicit
initialization code.

This commit will simplify implementation of the per frame retransmission
limitation. Indeed, a new counter will be added in quic_frame structure
which must be initialized to 0.

This should be backported up to 2.7.
2023-02-03 10:44:26 +01:00
Amaury Denoyelle
1dac018d9f MINOR: quic: ensure offset is properly set for STREAM frames
Care must be taken when reading/writing offset for STREAM frames. A
special OFF bit is set in the frame type to indicate that the field is
present. If not set, it is assumed that offset is 0.

To represent this, offset field of quic_stream structure must always be
initialized with a valid value in regards with its frame type OFF bit.

The previous code has no bug in part because pool_zalloc() is used to
allocate quic_frame instances. To be able to use pool_alloc(), offset is
always explicitely set to 0. If a non-null value is used, OFF bit is set
at the same occasion. A new BUG_ON() statement is added on frame builder
to ensure that the caller has set OFF bit if offset is non null.

This should be backported up to 2.7.
2023-02-03 09:46:55 +01:00
Amaury Denoyelle
2216b0866e MINOR: quic: remove fin from quic_stream frame type
A dedicated <fin> field was used in quic_stream structure. However, this
info is already encoded in the frame type field as specified by QUIC
protocol.

In fact, only code for packet reception used the <fin> field. On the
sending side, we only checked for the FIN bit. To align both sides,
remove the <fin> field and only used the FIN bit.

This should be backported up to 2.7.
2023-02-03 09:46:55 +01:00
Frédéric Lécaille
d18025eeef BUG/MINOR: quic: Do not request h3 clients to close its unidirection streams
It is forbidden to request h3 clients to close its Control and QPACK unidirection
streams. If not, the client closes the connection with H3_CLOSED_CRITICAL_STREAM(0x104).
Perhaps this could prevent some clients as Chrome to come back for a while.

But at quic_conn level there is no mean to identify the streams for which we cannot
send STOP_SENDING frame. Such a possibility is even not mentionned in RFC 9000.
At this time there is no choice than stopping sending STOP_SENDING frames for
all the h3 unidirectional streams inspecting the ->app_opps quic_conn value.

Must be backported to 2.7 and 2.6.
2023-01-20 15:49:52 +01:00
Frdric Lcaille
21c4c9b854 MINOR: quic: Replace v2 draft definitions by those of the final 2 version
This should finalize the support for the QUIC version 2.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Frdric Lcaille
6fc86974cf MINOR: quic: Disable the active connection migrations
Set "disable_active_migration" transport parameter to inform the peer
haproxy listeners does not the connection migration feature.
Also drop all received datagrams with a modified source address.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Frdric Lcaille
f676954f72 MINOR: quic: Useless test about datagram destination addresses
There is no reason to check if the peer has modified the destination
address of its connection.

May be backported to 2.7.
2023-01-17 16:35:20 +01:00
Amaury Denoyelle
5854fc08cc MINOR: mux-quic: handle RESET_STREAM reception
Implement RESET_STREAM reception by mux-quic. On reception, qcs instance
will be mark as remotely closed and its Rx buffer released. The stream
layer will be flagged on error if still attached.

This commit is part of implementing H3 errors at the stream level.
Indeed, on H3 stream errors, STOP_SENDING + RESET_STREAM should be
emitted. The STOP_SENDING will in turn generate a RESET_STREAM by the
remote peer which will be handled thanks to this patch.

This should be backported up to 2.7.
2022-12-22 16:38:04 +01:00
Willy Tarreau
eed7826529 BUG/MEDIUM: quic: properly take shards into account on bind lines
Shards were completely forgotten in commit f5a0c8abf ("MEDIUM: quic:
respect the threads assigned to a bind line"). The thread mask is
taken from the bind_conf, but since shards were introduced in 2.5,
the per-listener mask is held by the receiver and can be smaller
than the bind_conf's mask.

The effect here is that the traffic is not distributed to the
appropriate thread. At first glance it's not dramatic since it remains
one of the threads eligible by the bind_conf, but it still means that
in some contexts such as "shards by-thread", some concurrency may
persist on listeners while they're expected to be alone. One identified
impact is that it requires more rxbufs than necessary, but there may
possibly be other not yet identified side effects.

This must be backported to 2.7 and everywhere the commit above is
backported.
2022-12-21 09:27:26 +01:00
Amaury Denoyelle
5ac6b3b125 BUG/MINOR: quic: fix crash on PTO rearm if anti-amplification reset
There is a possible segfault when accessing qc->timer_task in
quic_conn_io_cb() without testing it. It seems however very rare as it
requires several condition to be encounter.
* quic_conn must be in CLOSING state after having sent a
  CONNECTION_CLOSE which free the qc.timer_task
* quic_conn handshake must still be in progress : in fact, qc.timer_task
  is accessed on this path because of the anti-amplification limit
  lifted.

I was unable thus far to trigger it but benchmarking tests seems to have
fire it with the following backtrace as a result :

  #0  _task_wakeup (f=4096, caller=0x5620ed004a40 <_.46868>, t=0x0) at include/haproxy/task.h:195
  195             state = _HA_ATOMIC_OR_FETCH(&t->state, f);
  [Current thread is 1 (Thread 0x7fc714ff1700 (LWP 14305))]
  (gdb) bt
  #0  _task_wakeup (f=4096, caller=0x5620ed004a40 <_.46868>, t=0x0) at include/haproxy/task.h:195
  #1  quic_conn_io_cb (t=0x7fc5d0e07060, context=0x7fc5d0df49c0, state=<optimized out>) at src/quic_conn.c:4393
  #2  0x00005620ecedab6e in run_tasks_from_lists (budgets=<optimized out>) at src/task.c:596
  #3  0x00005620ecedb63c in process_runnable_tasks () at src/task.c:861
  #4  0x00005620ecea971a in run_poll_loop () at src/haproxy.c:2913
  #5  0x00005620ecea9cf9 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3102
  #6  0x00007fc773c3f609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
  #7  0x00007fc77372d133 in clone () from /lib/x86_64-linux-gnu/libc.so.6
  (gdb) up
  #1  quic_conn_io_cb (t=0x7fc5d0e07060, context=0x7fc5d0df49c0, state=<optimized out>) at src/quic_conn.c:4393
  4393                            task_wakeup(qc->timer_task, TASK_WOKEN_MSG);
  (gdb) p qc
  $1 = (struct quic_conn *) 0x7fc5d0df49c0
  (gdb) p qc->timer_task
  $2 = (struct task *) 0x0

This fix should be backported up to 2.6.
2022-12-15 17:02:19 +01:00
Amaury Denoyelle
4244833c5f BUG/MINOR: quic: handle alloc failure on qc_new_conn() for owned socket
This patch is the follow up of previous fix :
  BUG/MINOR: quic: properly handle alloc failure in qc_new_conn()

quic_conn owned socket FD is initialized as soon as possible in
qc_new_conn(). This guarantees that we can safely call
quic_conn_release() on allocation failure. This function uses internally
qc_release_fd() to free the socket FD unless it has been initialized to
an invalid FD value.

Without this patch, a segfault will occur if one inner allocation of
qc_new_conn() fails before qc.fd is initialized.

This change is linked to quic-conn owned socket implementation.
This should be backported up to 2.7.
2022-12-12 14:53:55 +01:00
Amaury Denoyelle
dbf6ad470b BUG/MINOR: quic: properly handle alloc failure in qc_new_conn()
qc_new_conn() is used to allocate a quic_conn instance and its various
internal members. If one allocation fails, quic_conn_release() is used
to cleanup things.

For the moment, pool_zalloc() is used which ensures that all content is
null. However, some members must be initialized to a special values
to be able to use quic_conn_release() safely. This is the case for
quic_conn lists and its tasklet.

Also, some quic_conn internal allocation functions were doing their own
cleanup on failure without reset to NULL. This caused an issue with
quic_conn_release() which also frees this members. To fix this, these
functions now only return an error without cleanup. It is the caller
responsibility to free the allocated content, which is done via
quic_conn_release().

Without this patch, allocation failure in qc_new_conn() would often
result in segfault. This was reproduced easily using fail-alloc at 10%.

This should be backported up to 2.6.
2022-12-12 11:44:34 +01:00
Ilya Shipitsin
5fa29b8a74 CLEANUP: assorted typo fixes in the code and comments
This is 34th iteration of typo fixes
2022-12-07 09:08:18 +01:00
Amaury Denoyelle
d3083c9df9 MINOR: quic: reconnect quic-conn socket on address migration
UDP addresses may change over time for a QUIC connection. When using
quic-conn owned socket, we have to detect address change to break the
bind/connect association on the socket.

For the moment, on change detected, QUIC connection socket is closed and
a new one is opened. In the future, we may improve this by trying to
keep the original socket and reexecute only bind/connect syscalls.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
7c9fdd9c3a MEDIUM: quic: move receive out of FD handler to quic-conn io-cb
This change is the second part for reception on QUIC connection socket.
All operations inside the FD handler has been delayed to quic-conn
tasklet via the new function qc_rcv_buf().

With this change, buffer management on reception has been simplified. It
is now possible to use a local buffer inside qc_rcv_buf() instead of
quic_receiver_buf().

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
5b41486b7f MEDIUM: quic: use quic-conn socket for reception
Try to use the quic-conn socket for reception if it is allocated. For
this, the socket is inserted in the fdtab. This will call the new
handler quic_conn_io_cb() which is responsible to process the recv()
system call. It will reuse datagram dispatch for simplicity. However,
this is guaranteed to be called on the quic-conn thread, so it will be
more efficient to use a dedicated buffer. This will be implemented in
another commit.

This patch should improve performance by reducing contention on the
receiver socket. However, more gain can be obtained when the datagram
dispatch operation will be skipped.

Older quic_sock_fd_iocb() is renamed to quic_lstnr_sock_fd_iocb() to
emphasize its usage for the receiver socket.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
40909dfec5 MINOR: quic: allocate a socket per quic-conn
Allocate quic-conn owned socket if possible. This requires that this is
activated in haproxy configuration. Also, this is done only if local
address is known so it depends on the support of IP_PKTINFO.

For the moment this socket is not used. This causes QUIC support to be
broken as received datagram are not read. This commit will be completed
by a following patch to support recv operation on the newly allocated
socket.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
eb6be98a65 MINOR: quic: ignore address migration during handshake
QUIC protocol support address migration which allows to maintain the
connection even if client has changed its network address. This is done
through address migration.

RFC 9000 stipulates that address migration is forbidden before handshake
has been completed. Add a check for this : drop silently every datagram
if client network address has changed until handshake completion.

This commit is one of the first steps towards QUIC connection migration
support.

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
eec0b3c1bd MINOR: quic: detect connection migration
Detect connection migration attempted by the client. This is done by
comparing addresses stored in quic-conn with src/dest addresses of the
UDP datagram.

A new function qc_handle_conn_migration() has been added. For the
moment, no operation is conducted and the function will be completed
during connection migration implementation. The only notable things is
the increment of a new counter "quic_conn_migration_done".

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
8687b63c69 MINOR: quic: extract datagram parsing code
Extract individual datagram parsing code outside of datagrams list loop
in quic_lstnr_dghdlr(). This is moved in a new function named
quic_dgram_parse().

To complete this change, quic_lstnr_dghdlr() has been moved into
quic_sock source file : it belongs to QUIC socket lower layer and is
directly called by quic_sock_fd_iocb().

This commit will ease implementation of quic-conn owned socket.
New function quic_dgram_parse() will be easily usable after a receive
operation done on quic-conn IO-cb.

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
3f474e64c8 MINOR: quic: complete traces in qc_rx_pkt_handle()
Add missing ENTER trace for qc_rx_pkt_handle() function. LEAVE traces
are already present.

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
518c98f150 MINOR: quic: remove qc from quic_rx_packet
quic_rx_packet struct had a reference to the quic_conn instance. This is
useless as qc instance is always passed through function argument. In
fact, pkt.qc is used only in qc_pkt_decrypt() on key update, even though
qc is also passed as argument.

Simplify this by removing qc field from quic_rx_packet structure
definition. Also clean up qc_pkt_decrypt() documentation and interface
to align it with other quic-conn related functions.

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Frédéric Lécaille
7b5d9b1f03 BUG/MINOR: quic: Endless loop during retransmissions
qc_dgrams_retransmit() could reuse the same local list and could splice it two
times to the packet number space list of frame to be send/resend. This creates a
loop in this list and makes qc_build_frms() possibly endlessly loop when trying
to build frames from the packet number space list of frames. Then haproxy aborts.

This issue could be easily reproduced patching qc_build_frms() function to set <dlen>
variable value to 0 after having built at least 10 CRYPTO frames and using ngtcp2
as client with 30% packet loss in both direction.

Thank you to @gabrieltz for having reported this issue in GH #1903.

Must be backported to 2.6.
2022-11-29 15:19:16 +01:00
Willy Tarreau
33a6870fea BUILD: quic: silence two invalid build warnings at -O1 with gcc-6.5
Gcc 6.5 is now well known for triggering plenty of false "may be used
uninitialized", particularly at -O1, and two of them happen in quic,
quic_tp and quic_conn. Both of them were reviewed and easily confirmed
as wrong (gcc seems to ignore the control flow after the function
returns and believes error conditions are not met). Let's just preset
the variables that bothers it. In quic_tp the initialization was moved
out of the loop since there's no point inflating the code just to
silence a stupid warning.
2022-11-24 09:16:41 +01:00
Frédéric Lécaille
74b5f7b31b BUG/MAJOR: quic: Crash after discarding packet number spaces
This previous patch was not sufficient to prevent haproxy from
crashing when some Handshake packets had to be inspected before being possibly
retransmitted:

     "BUG/MAJOR: quic: Crash upon retransmission of dgrams with several packets"

This patch introduced another issue: access to packets which have been
released because still attached to others (in the same datagram). This was
the case for instance when discarding the Initial packet number space before
inspecting an Handshake packet in the same datagram through its ->prev or
member in our case.

This patch implements quic_tx_packet_dgram_detach() which detaches a packet
from the adjacent ones in the same datagram to be called when ackwowledging
a packet (as done in the previous commit) and when releasing its memory. This
was, we are sure the released packets will not be accessed during retransmissions.

Thank you to @gabrieltz for having reported this issue in GH #1903.

Must be backported to 2.6.
2022-11-20 18:35:46 +01:00
Frdric Lcaille
814645f42f BUG/MAJOR: quic: Crash upon retransmission of dgrams with several packets
As revealed by some traces provided by @gabrieltz in GH #1903 issue,
there are clients (chrome I guess) which acknowledge only one packet among others
in the same datagram. This is the case for the first datagram sent by a QUIC haproxy
listener made an Initial packet followed by an Handshake one. In this identified
case, this is the Handshake packet only which is acknowledged. But if the
client is able to respond with an Handshake packet (ACK frame) this is because
it has successfully parsed the Initial packet. So, why not also acknowledging it?
AFAIK, this is mandatory. On our side, when restransmitting this datagram, the
Handshake packet was accessed from the Initial packet after having being released.

Anyway. There is an issue on our side. Obviously, we must not expect an
implementation to respect the RFC especially when it want to build an attack ;)

With this simple patch for each TX packet we send, we also set the previous one
in addition to the next one. When a packet is acknowledged, we detach the next one
and the next one in the same datagram from this packet, so that it cannot be
resent when resending these packets (the previous one, in our case).

Thank you to @gabrieltz for having reported this issue.

Must be backported to 2.6.
2022-11-19 04:56:55 +01:00
Amaury Denoyelle
2f668f0e60 MINOR: quic: complete traces/debug for handshake
Add more traces to follow CRYPTO data buffering in ncbuf. Offset for
quic_enc_level is now reported for event QUIC_EV_CONN_PRHPKTS. Also
ncb_advance() must never fail so a BUG_ON() statement is here to
guarantee it.

This was useful to track handshake failure reported too often. This is
related to github issue #1903.

This should be backported up to 2.6.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
bc174b2101 BUG/MEDIUM: quic: fix memleak for out-of-order crypto data
Liberate quic_enc_level ncbuf in quic_stream_free(). In most cases, this
will already be done when handshake is completed via
qc_treat_rx_crypto_frms(). However, if a connection is released before
handshake completion, a leak was present without this patch.

Under normal situation, this leak should have been limited due to the
majority of QUIC connection success on handshake. However, another bug
caused handshakes to fail too frequently, especially with chrome client.
This had the side-effect to dramatically increase this memory leak.

This should fix in part github issue #1903.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
ff95f2d447 BUG/MEDIUM: quic: fix unsuccessful handshakes on ncb_advance error
QUIC handshakes were frequently in error due to haproxy misuse of
ncbuf. This resulted in one of the following scenario :
- handshake rejected with CONNECTION_CLOSE due to overlapping data
  rejected
- CRYPTO data fully received by haproxy but handshake completion signal
  not reported causing the client to emit PING repeatedly before timeout

This was produced because ncb_advance() result was not checked after
providing crypto data to the SSL stack in qc_provide_cdata(). However,
this operation can fail if a too small gap is formed. In the meantime,
quic_enc_level offset was always incremented. In some cases, this caused
next ncb_add() to report rejected overlapping data. In other cases, no
error was reported but SSL stack never received the end of CRYPTO data.

Change slightly the handling of new CRYPTO frames to avoid this bug :
when receiving a CRYPTO frame for the current offset, handle them
directly as previously done only if quic_enc_level ncbuf is empty. In
the other case, copy them to the buffer before treating contiguous data
via qc_treat_rx_crypto_frms().

This change ensures that ncb_advance() operation is now conducted only
in a data block : thus this is guaranteed to never fail.

This bug was easily reproduced with chromium as it fragments CRYPTO
frames randomly in several frames out of order.

This commit has two drawbacks :
- it is slightly less worst on performance because as sometimes even
  data at the current offset will be memcpy
- if a client uses too many fragmented CRYPTO frames, this can cause
  repeated ncb_add() error on gap size. This can be reproduced with
  chrome, albeit with a slighly less frequent rate than the fixed issue.

This change should fix in part github issue #1903.

This must be backported up to 2.6.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
3a72ba2aed BUILD: quic: fix dubious 0-byte overflow on qc_release_lost_pkts
With GCC 12.2.0 and O2 optimization activated, compiler reports the
following warning for qc_release_lost_pkts().

In function ‘quic_tx_packet_refdec’,
    inlined from ‘qc_release_lost_pkts.constprop’ at src/quic_conn.c:2056:3:
include/haproxy/atomic.h:320:41: error: ‘__atomic_sub_fetch_4’ writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
  320 | #define HA_ATOMIC_SUB_FETCH(val, i)     __atomic_sub_fetch(val, i, __ATOMIC_SEQ_CST)
      |                                         ^~~~~~~~~~~~~~~~~~
include/haproxy/quic_conn.h:499:14: note: in expansion of macro ‘HA_ATOMIC_SUB_FETCH’
  499 |         if (!HA_ATOMIC_SUB_FETCH(&pkt->refcnt, 1)) {
      |              ^~~~~~~~~~~~~~~~~~~

GCC thinks that quic_tx_packet_refdec() can be called with a NULL
argument from qc_release_lost_pkts() with <oldest_lost> as arg.

This warning is a false positive as <oldest_lost> cannot be NULL in
qc_release_lost_pkts() at this stage. This is due to the previous check
to ensure that <pkts> list is not empty.

This warning is silenced by using ALREADY_CHECKED() macro.

This should be backported up to 2.6.

This should fix github issue #1852.
2022-11-17 10:34:47 +01:00
Ilya Shipitsin
4a689dad03 CLEANUP: assorted typo fixes in the code and comments
This is 32nd iteration of typo fixes
2022-10-30 17:17:56 +01:00
Amaury Denoyelle
bbb1c68508 BUG/MINOR: quic: fix subscribe operation
Subscribing was not properly designed between quic-conn and quic MUX
layers. Align this as with in other haproxy components : <subs> field is
moved from the MUX to the quic-conn structure. All mention of qcc MUX is
cleaned up in quic_conn_subscribe()/quic_conn_unsubscribe().

Thanks to this change, ACK reception notification has been simplified.
It's now unnecessary to check for the MUX existence before waking it.
Instead, if <subs> quic-conn field is set, just wake-up the upper layer
tasklet without mentionning MUX. This should probably be extended to
other part in quic-conn code.

This should be backported up to 2.6.
2022-10-26 18:18:26 +02:00
Amaury Denoyelle
9e3026c58d MINOR: quic: extend Retry token check function
On Initial packet reception, token is checked for validity through
quic_retry_token_check() function. However, some related parts were left
in the parent function quic_rx_pkt_retrieve_conn(). Move this code
directly into quic_retry_token_check() to facilitate its call in various
context.

The API of quic_retry_token_check() has also been refactored. Instead of
working on a plain char* buffer, it now uses a quic_rx_packet instance.
This helps to reduce the number of parameters.

This change will allow to check Retry token even if data were received
with a FD-owned quic-conn socket. Indeed, in this case,
quic_rx_pkt_retrieve_conn() call will probably be skipped.

This should be backported up to 2.6.
2022-10-19 18:45:58 +02:00
Amaury Denoyelle
6e56a9e055 MINOR: quic: refactor packet drop on reception
Sometimes, a packet is dropped on reception. Several goto statements are
used, mostly to increment a proxy drop counter or drop silently the
packet. However, this labels are interleaved. Re-arrang goto labels to
simplify this process :
* drop label is used to drop a packet with counter incrementation. This
  is the default method.
* drop_silent is the next label which does the same thing but skip the
  counter incrementation. This is useful when we do not need to report
  the packet dropping operation.

This should be backported up to 2.6.
2022-10-19 18:45:58 +02:00
Amaury Denoyelle
982896961c MINOR: quic: split and rename qc_lstnr_pkt_rcv()
This change is the following of qc_lstnr_pkt_rcv() refactoring. This
function has finally been split into several ones.

The first half is renamed quic_rx_pkt_parse(). This function is
responsible to parse a QUIC packet header and calculate the packet
length.

QUIC connection retrieval has been extracted and is now called directly
by quic_lstnr_dghdlr().

The second half of qc_lstnr_pkt_rcv() is renamed to qc_rx_pkt_handle().
This function is responsible to copy a QUIC packet content to a
quic-conn receive buffer.

A third function named qc_rx_check_closing() is responsible to detect if
the connection is already in closing state. As this requires to drop the
whole datagram, it seems justified to be in a separate function.

This change has no functional impact. It is part of a refactoring series
on qc_lstnr_pkt_rcv(). The objective is to facilitate the integration of
FD-owned quic-conn socket patches.

This should be backported up to 2.6.
2022-10-19 18:45:55 +02:00
Amaury Denoyelle
449b1a8f55 MINOR: quic: extract connection retrieval
Simplify qc_lstnr_pkt_rcv() by extracting code responsible to retrieve
the quic-conn instance. This code is put in a dedicated function named
quic_rx_pkt_retrieve_conn(). This new function could be skipped if a
FD-owned quic-conn socket is used.

The first traces of qc_lstnr_pkt_rcv() have been clean up as qc instance
is always NULL here : thus qc parameter can be removed without any
change.

This change has no functional impact. It is a part of a refactoring
series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of
FD-owned socket patches.

This should be backported up to 2.6.
2022-10-19 18:12:56 +02:00
Amaury Denoyelle
deb7c87f55 MINOR: quic: define first packet flag
Received packets treatment has some difference regarding if this is the
first one or not of the encapsulating datagram. Previously, this was set
via a function argument. Simplify this by defining a new Rx packet flag
named QUIC_FL_RX_PACKET_DGRAM_FIRST.

This change does not have functional impact. It will simplify API when
qc_lstnr_pkt_rcv() is broken into several functions : their number of
arguments will be reduced thanks to this patch.

This should be backported up to 2.6.
2022-10-19 18:12:56 +02:00
Amaury Denoyelle
845169da58 MINOR: quic: extend pn_offset field from quic_rx_packet
pn_offset field was only set if header protection cannot be removed.
Extend the usage of this field : it is now set everytime on packet
parsing in qc_lstnr_pkt_rcv().

This change helps to clean up API of Rx functions by removing
unnecessary variables and function argument.

This change has no functional impact. It is a part of a refactoring
series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of
FD-owned socket patches.

This should be backported up to 2.6.
2022-10-19 18:12:56 +02:00
Amaury Denoyelle
0eae57273b MINOR: quic: add version field on quic_rx_packet
Add a new field version on quic_rx_packet structure. This is set on
header parsing in qc_lstnr_pkt_rcv() function.

This change has no functional impact. It is a part of a refactoring
series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of
FD-owned socket patches.

This should be backported up to 2.6.
2022-10-19 18:12:56 +02:00
Amaury Denoyelle
6c940569f6 BUG/MINOR: quic: fix buffer overflow on retry token generation
When generating a Retry token, client CID is used as encryption input.
The client must reuse the same CID when emitting the token in a new
Initial packet.

A memory overflow can occur on quic_generate_retry_token() depending on
the size of client CID. This is because space reserved for <aad> only
accounted for QUIC_HAP_CID_LEN (size of haproxy owned generated CID).
However, the client CID size only depends on client parameter and is
instead limited to QUIC_CID_MAXLEN as specified in RFC9000.

This was reproduced with ngtcp2 and haproxy built with ASAN. Here is the error
log :
  ==14964==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffee228cee at pc 0x7ffff785f427 bp 0x7fffee2289e0 sp 0x7fffee228188
  WRITE of size 17 at 0x7fffee228cee thread T5
      #0 0x7ffff785f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
      #1 0x555555906ea7 in quic_generate_retry_token_aad src/quic_conn.c:5452
      #2 0x555555907e72 in quic_retry_token_check src/quic_conn.c:5577
      #3 0x55555590d01e in qc_lstnr_pkt_rcv src/quic_conn.c:6103
      #4 0x5555559190fa in quic_lstnr_dghdlr src/quic_conn.c:7179
      #5 0x555555eb0abf in run_tasks_from_lists src/task.c:590
      #6 0x555555eb285f in process_runnable_tasks src/task.c:855
      #7 0x555555d9118f in run_poll_loop src/haproxy.c:2853
      #8 0x555555d91f88 in run_thread_poll_loop src/haproxy.c:3042
      #9 0x7ffff709f8fc  (/usr/lib/libc.so.6+0x868fc)
      #10 0x7ffff7121a5f  (/usr/lib/libc.so.6+0x108a5f)

This must be backported up to 2.6.
2022-10-18 14:36:47 +02:00
Willy Tarreau
f5a0c8abf5 MEDIUM: quic: respect the threads assigned to a bind line
Right now the QUIC thread mapping derives the thread ID from the CID
by dividing by global.nbthread. This is a problem because this makes
QUIC work on all threads and ignores the "thread" directive on the
bind lines. In addition, only 8 bits are used, which is no more
compatible with the up to 4096 threads we may have in a configuration.

Let's modify it this way:
  - the CID now dedicates 12 bits to the thread ID
  - on output we continue to place the TID directly there.
  - on input, the value is extracted. If it corresponds to a valid
    thread number of the bind_conf, it's used as-is.
  - otherwise it's used as a rank within the current bind_conf's
    thread mask so that in the end we still get a valid thread ID
    for this bind_conf.

The extraction function now requires a bind_conf in order to get the
group and thread mask. It was better to use bind_confs now as the goal
is to make them support multiple listeners sooner or later.
2022-10-13 18:08:05 +02:00
Amaury Denoyelle
1cba8d60f3 CLEANUP: quic: improve naming for rxbuf/datagrams handling
QUIC datagrams are read from a random thread. They are then redispatch
to the connection thread according to the first packet DCID. These
operations are implemented through a special buffer designed to avoid
locking.

Refactor this code with the following changes :
* <rxbuf> type is renamed <quic_receiver_buf>. Its list element is also
  renamed to highligh its attach point to a receiver.
* <quic_dgram> and <quic_receiver_buf> definition are moved to
  quic_sock-t.h. This helps to reduce the size of quic_conn-t.h.
* <quic_dgram> list elements are renamed to highlight their attach point
  into a <quic_receiver_buf> and a <quic_dghdlr>.

This should be backported up to 2.6.
2022-10-13 11:06:48 +02:00
Frédéric Lécaille
e1a49cfd4d MINOR: quic: Split the secrets key allocation in two parts
Implement quic_tls_secrets_keys_alloc()/quic_tls_secrets_keys_free() to allocate
the memory for only one direction (RX or TX).
Modify ha_quic_set_encryption_secrets() to call these functions for one of this
direction (or both). So, for now on we can rely on the value of the secret keys
to know if it was derived.
Remove QUIC_FL_TLS_SECRETS_SET flag which is no more useful.
Consequently, the secrets are dumped by the traces only if derived.

Must be backported to 2.6.
2022-10-13 10:12:03 +02:00
Frédéric Lécaille
4aa7d8197a BUG/MINOR: quic: Stalled 0RTT connections with big ClientHello TLS message
This issue was reproduced with -Q picoquic client option to split a big ClientHello
message into two Initial packets and haproxy as server without any knowledged of
any previous ORTT session (restarted after a firt 0RTT session). The ORTT received
packets were removed from their queue when the second Initial packet was parsed,
and the QUIC handshake state never progressed and remained at Initial state.

To avoid such situations, after having treated some Initial packets we always
check if there are ORTT packets to parse and we never remove them from their
queue. This will be done after the hanshake is completed or upon idle timeout
expiration.

Also add more traces to be able to analize the handshake progression.

Tested with ngtcp2 and picoquic

Must be backported to 2.6.
2022-10-13 10:12:03 +02:00
Frédéric Lécaille
9f9263ed13 MINOR: quic: Use a non-contiguous buffer for RX CRYPTO data
Implement quic_get_ncbuf() to dynamically allocate a new ncbuf to be attached to
any quic_cstream struct which needs such a buffer. Note that there is no quic_cstream
for 0RTT encryption level. quic_free_ncbuf() is added to release the memory
allocated for a non-contiguous buffer.

Modify qc_handle_crypto_frm() to call this function and allocate an ncbuf for
crypto data which are not received in order. The crypto data which are received in
order are not buffered but provide to the TLS stack (calling qc_provide_cdata()).

Modify qc_treat_rx_crypto_frms() which is called after having provided the
in order received crypto data to the TLS stack to provide again the remaining
crypto data which has been buffered, if possible (if they are in order). Each time
buffered CRYPTO data were consumed, we try to release the memory allocated for
the non-contiguous buffer (ncbuf).
Also move rx.crypto.offset quic_enc_level struct member to rx.offset quic_cstream
struct member.

Must be backported to 2.6.
2022-10-13 10:12:03 +02:00
Frédéric Lécaille
a20c93e6e2 MINOR: quic: Extract CRYPTO frame parsing from qc_parse_pkt_frms()
Implement qc_handle_crypto_frm() to parse a CRYPTO frame.

Must be backported to 2.6.
2022-10-13 10:12:03 +02:00
Frédéric Lécaille
7e3f7c47e9 MINOR: quic: New quic_cstream object implementation
Add new quic_cstream struct definition to implement the CRYPTO data stream.
This is a simplication of the qcs object (QUIC streams) for the CRYPTO data
without any information about the flow control. They are not attached to any
tree, but to a QUIC encryption level, one by encryption level except for
the early data encryption level (for 0RTT). A stream descriptor is also allocated
for each CRYPTO data stream.

Must be backported to 2.6
2022-10-13 10:12:03 +02:00
Amaury Denoyelle
97ecc7a8ea MEDIUM: quic: retrieve frontend destination address
Retrieve the frontend destination address for a QUIC connection. This
address is retrieve from the first received datagram and then stored in
the associated quic-conn.

This feature relies on IP_PKTINFO or affiliated flags support on the
socket. This flag is set for each QUIC listeners in
sock_inet_bind_receiver(). To retrieve the destination address,
recvfrom() has been replaced by recvmsg() syscall. This operation and
parsing of msghdr structure has been extracted in a wrapper quic_recv().

This change is useful to finalize the implementation of 'dst' sample
fetch. As such, quic_sock_get_dst() has been edited to return local
address from the quic-conn. As a best effort, if local address is not
available due to kernel non-support of IP_PKTINFO, address of the
listener is returned instead.

This should be backported up to 2.6.
2022-10-10 11:48:27 +02:00
Amaury Denoyelle
90121b3321 CLEANUP: quic: fix indentation
Fix some indentation in qc_lstnr_pkt_rcv().

This should be backported up to 2.6.
2022-10-05 11:08:32 +02:00
Amaury Denoyelle
2ed840015f MINOR: quic: limit usage of ssl_sock_ctx in favor of quic_conn
Continue on the cleanup of QUIC stack and components.

quic_conn uses internally a ssl_sock_ctx to handle mandatory TLS QUIC
integration. However, this is merely as a convenience, and it is not
equivalent to stackable ssl xprt layer in the context of HTTP1 or 2.

To better emphasize this, ssl_sock_ctx usage in quic_conn has been
removed wherever it is not necessary : namely in functions not related
to TLS. quic_conn struct now contains its own wait_event for tasklet
quic_conn_io_cb().

This should be backported up to 2.6.
2022-10-05 11:08:32 +02:00
Amaury Denoyelle
92fa63f735 CLEANUP: quic: create a dedicated quic_conn module
xprt_quic module was too large and did not reflect the true architecture
by contrast to the other protocols in haproxy.

Extract code related to XPRT layer and keep it under xprt_quic module.
This code should only contains a simple API to communicate between QUIC
lower layer and connection/MUX.

The vast majority of the code has been moved into a new module named
quic_conn. This module is responsible to the implementation of QUIC
lower layer. Conceptually, it overlaps with TCP kernel implementation
when comparing QUIC and HTTP1/2 stacks of haproxy.

This should be backported up to 2.6.
2022-10-03 16:25:17 +02:00