Commit Graph

20158 Commits

Author SHA1 Message Date
Aurelien DARRAGON
bce0c0c37a BUG/MINOR: dns: fix ring offset calculation in dns_resolve_send()
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") relative offset calculation was fixed in dns_session_io_handler()
and dns_process_req() functions.

But if we compare with the changes performed in the patch that introduced
the bug: d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail
instead of absolute"), we can see that dns_resolve_send() is missing from
the patch.

Applying both 737d10f + ("BUG/MINOR: dns: fix ring offset calculation on
first read") to dns_resolve_send() function.
With this last commit, we should be back at pre d9c7188 behavior.

No backport needed.
2023-03-08 08:57:13 +01:00
Aurelien DARRAGON
5a43db2c5d BUG/MINOR: dns: fix ring offset calculation on first read
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") ring offset is now properly re-adjusted in dns_session_io_handler()
and dns_process_req().

But the previous patch does not cope well if the first read is performed
on a non-empty ring since relative ofs will be computed from ds->ofs=0 or
dss->ofs_req=0.
In this case: relative offset could become invalid since we mix up relative
offsets with absolute offsets.

To fix this, we apply the same logic performed in d9c7188 ("MEDIUM: ring:
make the offset relative to the head/tail instead of absolute") for the
cli_io_handler_show_ring() function: that is using b_peek_ofs(buf, 0) to
set the contextual offset instead of hard-coding it to 0.

This should be considered as a minor bugfix since this bug was discovered by
reading the code: 737d10f already survived a good amount of stress-tests as
shown in GH #2068.

No backport needed as 737d10f is not marked for backports.
2023-03-08 08:56:30 +01:00
Aurelien DARRAGON
2c98867187 BUG/MEDIUM: sink/forwarder: ensure ring offset is properly readjusted to head
Since d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail instead
of absolute"), ring offset calculation has changed: we don't rely on ring->ofs
absolute offset anymore.

But with the above patch, relative offset is not properly calculated in
sink_forward_oc_io_handler() and sink_forward_io_handler().

The issue here is the same as 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head") since dns and sink_forward share the same
ring logic:

When the ring is becoming full, ring_write() will try to regain some space to
insert new data by calling b_del() on older messages. Here b_del() moves
buffer's head under the hood, and since ring->ofs cannot be used to "correct"
the relative offset, both sink_forward_oc_io_handler() and
sink_forward_io_handler() start to get invalid offset.
At this point, we will suffer from ring data corruption resulting in unexpected
behavior or process crashes.
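
The effect of a moving head on a reader's saved position can be illustrated with a toy model. This is only a sketch, not HAProxy's ring code: the helper names `to_relative` and `to_storage`, the 16-byte size and the modular arithmetic are assumptions made for illustration.

```python
def to_relative(storage_pos: int, head: int, size: int) -> int:
    """Offset of a saved storage position, counted from the current head."""
    return (storage_pos - head) % size


def to_storage(rel_ofs: int, head: int, size: int) -> int:
    """Storage position designated by an offset relative to the current head."""
    return (head + rel_ofs) % size


SIZE = 16  # hypothetical, tiny ring for the example

# The reader stops 6 bytes after the head, while the head is at position 0.
saved_pos = to_storage(6, head=0, size=SIZE)

# The writer then purges 4 bytes of old messages: the head moves to position 4.
new_head = 4

# Reusing the old relative offset (6) now points 4 bytes too far.
stale_pos = to_storage(6, head=new_head, size=SIZE)

# Re-adjusting against the new head recovers the place where the reader stopped.
fixed_ofs = to_relative(saved_pos, head=new_head, size=SIZE)
fixed_pos = to_storage(fixed_ofs, head=new_head, size=SIZE)
```

In this model the reader has to recompute its offset against the head every time it resumes, which is the kind of re-adjustment the fix adds to both handlers.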

This can be easily demonstrated with the following test:

    |log-forward syslog
    |  dgram-bind 127.0.0.1:5114
    |  log ring@logbuffer local0
    |
    |ring logbuffer
    |  format rfc5424
    |  size 16384
    |  server logserver 127.0.0.1:5114

Haproxy will forward incoming logs on udp@127.0.0.1:5114 to
tcp@127.0.0.1:5114

Then use the following tcp server:
  nc -l -p 5114

With the following udp log sender:
    |while [ 1 ]
    |do
    |  logger --udp  --server 127.0.0.1 -P 5114 -p user.warn "Test 7"
    |done

Once the ring buffer is full (it takes less than a second to fill the 16k
buffer) haproxy starts to misbehave and the log forwarding stops.

We apply the same fix as in 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head").
Please note the ~0 case that is handled slightly differently in this patch:
this is required to properly start reading from a non-empty ring. This case
will be fixed in dns related code in the following patch.

This does not need to be backported as d9c7188 was not marked for backports.
2023-03-08 08:54:43 +01:00
Frédéric Lécaille
5e3201ea77 MINOR: quic: Add transport parameters to "show quic"
Modify quic_transport_params_dump() and other functions related to the
transport parameters value dump from TRACE() to make their output more
compact.
Add call to quic_transport_params_dump() to dump the transport parameters
from "show quic" CLI command.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
ece86e64c4 MINOR: quic: Add spin bit support
Add QUIC_FL_RX_PACKET_SPIN_BIT new RX packet flag to mark an RX packet as having
the spin bit set. Idem for the connection with QUIC_FL_CONN_SPIN_BIT flag.
Implement qc_handle_spin_bit() to set/unset QUIC_FL_CONN_SPIN_BIT for the connection
as soon as a packet number could be deciphered.
Modify quic_build_packet_short_header() to set the spin bit when building
a short packet header.
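
As a rough illustration of the server-side behaviour, here is a minimal sketch based on the spin bit rules of RFC 9000, not on the HAProxy functions named above. The class `ServerSpinState` and its methods are hypothetical names; only the 0x20 position of the spin bit in the first byte of a short header comes from the RFC.

```python
SPIN_BIT = 0x20  # spin bit in the first byte of a QUIC short header (RFC 9000)


class ServerSpinState:
    """Hypothetical per-connection state: the server reflects the client's spin bit."""

    def __init__(self) -> None:
        self.largest_pn = -1  # highest packet number seen so far
        self.spin = False     # spin value to set on outgoing short headers

    def on_rx(self, pn: int, first_byte: int) -> None:
        # Only a packet that raises the highest packet number updates the spin value.
        if pn > self.largest_pn:
            self.largest_pn = pn
            self.spin = bool(first_byte & SPIN_BIT)

    def apply(self, first_byte: int) -> int:
        # Set or clear the spin bit in the first byte of a short header being built.
        if self.spin:
            return first_byte | SPIN_BIT
        return first_byte & ~SPIN_BIT & 0xFF
```

A reordered packet carrying an older packet number leaves the stored value untouched, so the spin value can only be updated once the packet number is known.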

Validated by quic-tracker spin bit test.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
433af7fad9 MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
These allocations are definitively useless.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
8ac8a8778d MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
Add ->curr_cid_seq_num new quic_conn struct frame to store the connection
ID sequence number currently used by the connection.
Implement qc_handle_retire_connection_id_frm() to handle this RX frame.
Implement qc_retire_connection_seq_num() to remove a connection ID from its
sequence number.
Implement qc_build_new_connection_id_frm to allocate a new NEW_CONNECTION_ID
frame from a CID.
Modify qc_parse_pkt_frms() which parses the frames of an RX packet to handle
the case of the RETIRE_CONNECTION_ID frame.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
904caac3e4 MINOR: quic: Typo fix for ACK_ECN frame
Wrong name displayed by TRACE().

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
b4c5471425 MINOR: quic: Store the next connection IDs sequence number in the connection
Add ->next_cid_seq_num new member to quic_conn struct to store the next
sequence number to be used to allocate a connection ID.
It is initialized to 0 from qc_new_conn() which initializes a connection.
Modify new_quic_cid() to use this variable each time it is called without
giving the possibility to the caller to pass the sequence number for the
connection to be allocated.

Modify quic_build_post_handshake_frames() to use ->next_cid_seq_num
when building NEW_CONNECTION_ID frames after the handshake has been completed.
Limit the number of connection IDs provided to the peer to the minimum
between 4 and the value it sent with active_connection_id_limit transport
parameter. This includes the connection ID used by the connection to send
these new connection IDs.
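
The limit described above can be sketched as follows. This is an illustration only: `new_cids_to_issue` and `MAX_CIDS` are hypothetical names, and the subtraction of one assumes that the connection ID already in use counts against the total, as the paragraph states.

```python
MAX_CIDS = 4  # cap mentioned in the commit message


def new_cids_to_issue(peer_active_cid_limit: int) -> int:
    """Number of NEW_CONNECTION_ID frames to build after the handshake."""
    total = min(MAX_CIDS, peer_active_cid_limit)
    # One connection ID is already used by the connection itself.
    return total - 1
```

With a peer advertising the minimum value of 2, a single new connection ID would be issued; with any value of 4 or more, three.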

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
4afbca611f MINOR: quic: Do not accept wrong active_connection_id_limit values
A peer must not send active_connection_id_limit values smaller than 2
which is also the minimum value when not sent.

Make the transport parameters decoding fail in this case.
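
A minimal sketch of such a check, assuming a decoder that receives the value as an optional integer. The function name and the use of `ValueError` are hypothetical; the minimum and default of 2 are the ones stated above.

```python
from typing import Optional

MIN_ACTIVE_CID_LIMIT = 2  # also the value assumed when the parameter is absent


def decode_active_connection_id_limit(value: Optional[int]) -> int:
    """Return the effective limit, or fail if the peer sent a value below 2."""
    if value is None:
        return MIN_ACTIVE_CID_LIMIT
    if value < MIN_ACTIVE_CID_LIMIT:
        raise ValueError("active_connection_id_limit must be at least 2")
    return value
```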

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Amaury Denoyelle
ebfafc212a BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
STREAM frame retransmission has been recently fixed. A new boolean field
<dup> was created for quic_stream frame type. It is set for duplicated
STREAM frame to ensure extra checks on the underlying buffer are
conducted before sending the frame. All of this has been implemented by
this commit :
  315a4f6ae5
  BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX

However, the above commit is incomplete. In the MUX code, when a new
STREAM frame is created, <dup> is left uninitialized. In most cases this
is harmless as it will only add extra unneeded checks before sending the
frame. So this is mainly a performance issue.

There is however one case where this bug will lead to a crash : when the
response consists only of an empty STREAM frame. In this case, the empty
frame will be silently removed as it is incorrectly assimilated to an
already acked frame range in qc_build_frms(). This can trigger a
BUG_ON() on the MUX code as a qcs instance is still in the send list
after qc_send_frames() invocation.

Note that this is extremely rare to have only an empty STREAM frame. It
was reproduced with HTTP/0.9 where no HTTP status line exists on an
empty body. I do not know if this is possible on HTTP/3 as a status line
should be present each time in a HEADERS frame.

Properly initialize <dup> field to 0 on each STREAM frames generated by
the QUIC MUX to fix this issue.

This crash may be linked to github issue #2049.

This should be backported up to 2.6.
2023-03-07 18:39:49 +01:00
Amaury Denoyelle
737d10fac1 BUG/MEDIUM: dns: ensure ring offset is properly reajusted to head
Since the below patch, ring offset calculation for readers has changed.
  commit d9c7188633
  MEDIUM: ring: make the offset relative to the head/tail instead of absolute

For readers, this requires to adjust their offsets to be relative to the
ring head each time read is resumed. Indeed, buffer head can change any
time a ring_write() is performed after older entries were purged.
This operation was not performed on the DNS code which causes the offset
to become invalid. In most cases, the following BUG_ON() was triggered :

  FATAL: bug condition "msg_len + ofs + cnt + 1 > b_data(buf)" matched
  at src/dns.c:522

Fix this by adjusting DNS reader offsets when entering
dns_session_io_handler() and dns_process_req().

This bug was reproduced by using a backend with 10 servers using SRV
record resolution on a single resolvers section. A BUG_ON() crash would
occur after less than 5 minutes of process execution.

This does not need to be backported as the above patch is not.

This should fix github issue #2068.
2023-03-07 15:51:58 +01:00
Willy Tarreau
237e6a0d65 BUG/MAJOR: fd/thread: fix race between updates and closing FD
While running some L7 retries tests, Christopher and I stumbled upon a
very strange behavior showing some occasional server timeouts when the
server closes keep-alive connections quickly. The issue can be
reproduced with the following config:

    global
        expose-experimental-directives
        #tune.fd.edge-triggered on   # can speed up the issue

    defaults
        mode http
        timeout client 5s
        timeout server 10s
        timeout connect 2s

    listen f
        bind :8001
        http-reuse always
        retry-on all-retryable-errors
        server next 127.0.0.1:8002

    frontend b
        bind :8002
        timeout http-keep-alive 1  # one ms
        redirect location /

Sending fast requests without reusing the client connection on port 8001
with a single connection and at least 3 threads on haproxy occasionally
shows some pauses (below with timeout server 2s):

  $ taskset -c 2,3 h1load  -e -t 1 -r 1 -c 1 http://127.0.0.1:8001/
  #     time conns tot_conn  tot_req      tot_bytes    err  cps  rps  bps   ttfb
           1     1     9794     9793         959714      0 9k79 9k79 7M67 42.94u
           2     1     9794     9793         959714      0 0.00 0.00 0.00    -
           3     1     9794     9793         959714      0 0.00 0.00 0.00    -
           4     0    16015    16015        1569470      0 6k22 6k22 4M87 522.9u
           5     0    18657    18656        1828190      2 2k63 2k63 2M06 39.22u

If this doesn't happen, limiting to a request rate close to 1/timeout
may help.

What is happening is that after several migrations, a late report
via fd_update_events() may detect that the thread is not welcome, and
will want to program an update so that the current thread's poller
disables its polling on it. It is allowed to do so because it used
fd_grab_tgid(). But what if _fd_delete_orphan() was just starting to
be called and already reset the update_mask ? We'll end up with a bit
present in the update mask, then _fd_delete_orphan() resets the tgid,
which will prevent the poller from consuming that update. The update
is not needed anymore since the FD was closed, but in this case nobody
will clear this bit until the same FD is reused again and cleared. And
as long as the thread's bit remains in the update_mask, no new updates
will be programmed for the next use of this FD on the same thread since
due to the bit being present, fd_nbupdt will not be changed. This is
what is causing this timeout.

The fix consists in making sure _fd_delete_orphan() waits for the
occasional watchers to leave, and to do this before clearing the
update_mask. This will be either fd_update_events() trying to check
its thread_mask, or the poller checking its updates, so that's pretty
short. But it definitely closes this race.

This fix is needed since the introduction of fd_grab_tgid(), hence 2.7.

Note that while testing the fix, another related issue concerning the
atomicity of running_mask vs thread_mask popped up and will have to be
fixed till 2.5 as part of another patch. It may make the tests for this
fix occasionally trigger a few BUG_ON() or face a null conn->subs in
sock_conn_iocb(), though these ones are much more difficult to trigger.
This is not caused by this fix.
2023-03-07 07:09:59 +01:00
Amaury Denoyelle
315a4f6ae5 BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
The MUX instance is released before its quic-conn counterpart. On
termination, a H3 GOAWAY is emitted to prevent the client from opening new
streams for this connection.

The quic-conn instance will stay alive until all opened streams data are
acknowledged. If the client tries to open a new stream during this
interval despite the GOAWAY, quic-conn is responsible to request its
immediate closure with a STOP_SENDING + RESET_STREAM.

This behavior was already implemented but the received packet with the
new STREAM was never acknowledged. This was fixed with the following
commit :
  commit 156a89aef8
  BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released

However, this patch introduces a regression as it did not skip the call
to qc_handle_strm_frm() despite the MUX instance being released. This
can cause a segfault when using qcc_get_qcs() on a released MUX
instance. To fix this, add a missing break statement which will skip
qc_handle_strm_frm() when the MUX instance is not initialized.

This bug was reproduced using a short timeout client and sending
several requests with delay between them by using a modified aioquic. It
produces a crash with the following backtrace :
 #0  0x000055555594d261 in __eb64_lookup (x=4, root=0x7ffff4091f60) at include/import/eb64tree.h:132
 #1  eb64_lookup (root=0x7ffff4091f60, x=4) at src/eb64tree.c:37
 #2  0x000055555563fc66 in qcc_get_qcs (qcc=0x7ffff4091dc0, id=4, receive_only=1, send_only=0, out=0x7ffff780ca70) at src/mux_quic.c:668
 #3  0x0000555555641e1a in qcc_recv (qcc=0x7ffff4091dc0, id=4, len=40, offset=0, fin=1 '\001', data=0x7ffff40c4fef "\001&") at src/mux_quic.c:974
 #4  0x0000555555619d28 in qc_handle_strm_frm (pkt=0x7ffff4088e60, strm_frm=0x7ffff780cf50, qc=0x7ffff7cef000, fin=1 '\001') at src/quic_conn.c:2515
 #5  0x000055555561d677 in qc_parse_pkt_frms (qc=0x7ffff7cef000, pkt=0x7ffff4088e60, qel=0x7ffff7cef6c0) at src/quic_conn.c:3050
 #6  0x00005555556230aa in qc_treat_rx_pkts (qc=0x7ffff7cef000, cur_el=0x7ffff7cef6c0, next_el=0x0) at src/quic_conn.c:4214
 #7  0x0000555555625fee in quic_conn_app_io_cb (t=0x7ffff40c1fa0, context=0x7ffff7cef000, state=32848) at src/quic_conn.c:4640
 #8  0x00005555558a676d in run_tasks_from_lists (budgets=0x7ffff780d470) at src/task.c:596
 #9  0x00005555558a725b in process_runnable_tasks () at src/task.c:876
 #10 0x00005555558522ba in run_poll_loop () at src/haproxy.c:2945
 #11 0x00005555558529ac in run_thread_poll_loop (data=0x555555d14440 <ha_thread_info+64>) at src/haproxy.c:3141
 #12 0x00007ffff789ebb5 in ?? () from /usr/lib/libc.so.6
 #13 0x00007ffff7920d90 in ?? () from /usr/lib/libc.so.6

This should fix github issue #2067.

This must be backported up to 2.6.
2023-03-06 13:39:40 +01:00
Frédéric Lécaille
ec93721fb0 MINOR: quic: Send PING frames when probing Initial packet number space
In very rare cases, it is possible that the Initial packet number space
must be probed even if there are no more in-flight CRYPTO frames.
In such cases, a PING frame is sent into an Initial packet. As this
packet is ack-eliciting, it must be padded by the server. qc_do_build_pkt()
is modified to do so.

Take the opportunity of this patch to modify the trace for TX frames to
easily distinguish them from other frame-related traces.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
a65b71f89f BUG/MINOR: quic: Missing detections of amplification limit reached
Mark the connection as limited by the anti-amplification limit when trying to
probe the peer.
Wakeup the connection PTO/loss detection timer as soon as a datagram is
received. This was done only when the datagram was dropped.
This fixes deadlock issues revealed by some interop runner tests.

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
e6359b649b BUG/MINOR: quic: Do not resend already acked frames
Some frames are marked as already acknowledged from duplicated packets
whose original packet has been acknowledged. There is no need
to resend such packets or frames.

Implement qc_pkt_with_only_acked_frms() to detect packet with only
already acknowledged frames inside and use it from qc_prep_fast_retrans()
which selects the packet to be retransmitted.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
21564be4a2 BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
Even if there is a check in callers of qc_prep_hdshk_fast_retrans() and
qc_prep_fast_retrans() to prevent retransmissions of packets with no ack-eliciting
frames, these two functions should pay attention not to do that especially if
someone decides to modify their implementations in the future.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
b3562a3815 BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
This is an old bug which arrived in this commit due to a misinterpretation
of the RFC I guess where the desired effect was to acknowledge all the
handshake packets:

    77ac6f566 BUG/MINOR: quic: Missing acknowledgments for trailing packets

This had the bad effect of acknowledging all the handshake packets, even the
ones which are not ack-eliciting.

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
51a7caf921 MINOR: quic: Add traces about QUIC TLS key update
Dump the secret used to derive the next one during a key update initiated by the
client and dump the resulting new secret and the new key and iv to be used to
decrypt Application level packets.

Also add a trace when the key update is supposed to be initiated on haproxy side.

This has already helped in diagnosing an issue revealed by the key update interop
test with xquic as client.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
720277843b BUG/MINOR: quic: v2 Initial packets decryption failed
v2 interop runner test revealed this bug as follows:

     [01|quic|4|c_conn.c:4087] new packet : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080 rel=H
     [01|quic|5|c_conn.c:1509] qc_pkt_decrypt(): entering : qc@0x7f62ec026e30
     [01|quic|0|c_conn.c:1553] quic_tls_decrypt() failed : qc@0x7f62ec026e30
     [01|quic|5|c_conn.c:1575] qc_pkt_decrypt(): leaving : qc@0x7f62ec026e30
     [01|quic|0|c_conn.c:4091] packet decryption failed -> dropped : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080

Only v2 Initial packets decryption received by the clients were impacted. There
is no issue to encrypt v2 Initial packets. This is due to the fact that when
negotiated the client may send two versions of Initial packets (currently v1,
then v2). The selection was done for the TX path but not on the RX path.

Implement qc_select_tls_ctx() to select the correct TLS cipher context for all
types of packets and call this function before removing the header protection
and before deciphering the packet.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
d30a04a4bb BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
When retransmitting datagrams with two coalesced packets inside, the second
packet was not taken into consideration when checking there is enough space
into the network for the datagram, especially when limited by the anti-amplification.

Must be backported to 2.6 and 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
ceb88b8f46 MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
This should be helpful to detect too small datagrams: datagrams
smaller than 1200 bytes, with Initial packets inside.

Must be backported to 2.7.
2023-03-03 19:12:26 +01:00
Frédéric Lécaille
69e7118fe9 BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
Before building a packet into a datagram, ensure there is sufficient space for at
least 1200 bytes. Also pad datagrams with only one ack-eliciting Initial packet
inside.
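
The padding rule can be sketched as below. This is an illustration under assumptions, not the code of the patch: `padding_needed` and its parameters are hypothetical names; only the 1200-byte minimum comes from the text above.

```python
MIN_INITIAL_DGRAM_LEN = 1200  # minimum size of a datagram carrying an Initial packet


def padding_needed(dgram_len: int, has_ack_eliciting_initial: bool) -> int:
    """Number of padding bytes to add so the datagram reaches the minimum size."""
    if not has_ack_eliciting_initial:
        return 0
    return max(0, MIN_INITIAL_DGRAM_LEN - dgram_len)
```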

Must be backported to 2.7 and 2.6.
2023-03-03 19:12:26 +01:00
Aurelien DARRAGON
39254cac47 MINOR: http_ext: adding some documentation, forgot to inline function
Making http_7239_valid_obfsc() inline because it is only called by inline
functions.

Removing dead comment and documenting proxy_http_parse_{7239,xff,xot} functions.

No backport needed.
2023-03-03 18:22:59 +01:00
Amaury Denoyelle
dd3a33f863 BUG/MINOR: cli: fix CLI handler "set anon global-key" call
Anonymization mode has two CLI handlers "set anon <on|off>" and "set
anon global-key". The last one only requires admin level. However, as
cli_find_kw() is implemented, only the first handler will be retrieved
as they both start with the same prefix "set anon".

This has the effect of executing the wrong handler for "set anon
global-key" with an error message about an invalid keyword. To fix this,
handler definitions have been separated for both "set anon on" and "set
anon off" commands. This allows to have minimal changes while keeping
the same "set anon" prefix for each command.

Also take this opportunity to fix a reference to a non-existing "set
global-key" CLI handler in the documentation.

This must be backported up to 2.7.
2023-03-03 18:05:58 +01:00
Amaury Denoyelle
c8a0efbda8 BUG/MEDIUM: quic: properly handle duplicated STREAM frames
When a STREAM frame is re-emitted, it will point to the same stream
buffer as the original one. If an ACK is received for either one of
these frame, the underlying buffer may be freed. Thus, if the second
frame is declared as lost and scheduled for retransmission, we must
ensure that the underlying buffer is still allocated or interrupt the
retransmission.

Stream buffer is stored as an eb_tree indexed by the stream ID. To avoid
to lookup over a tree each time a STREAM frame is re-emitted, a lost
STREAM frame is flagged as QUIC_FL_TX_FRAME_LOST.

In most cases, this code is functional. However, there are several
potential issues which may cause a segfault :
- when explicitly probing with a STREAM frame, the frame won't be
  flagged as lost
- when splitting a STREAM frame during retransmission, the flag is not
  copied

To fix both these cases, QUIC_FL_TX_FRAME_LOST flag has been converted
to a <dup> field in quic_stream structure. This field is now properly
copied when splitting a STREAM frame. Also, as this is now an inner
quic_frame field, it will be copied automatically on qc_frm_dup()
invocation thus ensuring that it will be set on probing.

This issue was encountered randomly with the following backtrace :
 #0  __memmove_avx512_unaligned_erms ()
 #1  0x000055f4d5a48c01 in memcpy (__len=18446698486215405173, __src=<optimized out>,
 #2  quic_build_stream_frame (buf=0x7f6ac3fcb400, end=<optimized out>, frm=0x7f6a00556620,
 #3  0x000055f4d5a4a147 in qc_build_frm (buf=buf@entry=0x7f6ac3fcb5d8,
 #4  0x000055f4d5a23300 in qc_do_build_pkt (pos=<optimized out>, end=<optimized out>,
 #5  0x000055f4d5a25976 in qc_build_pkt (pos=0x7f6ac3fcba10,
 #6  0x000055f4d5a30c7e in qc_prep_app_pkts (frms=0x7f6a0032bc50, buf=0x7f6a0032bf30,
 #7  qc_send_app_pkts (qc=0x7f6a0032b310, frms=0x7f6a0032bc50) at src/quic_conn.c:4184
 #8  0x000055f4d5a35f42 in quic_conn_app_io_cb (t=0x7f6a0009c660, context=0x7f6a0032b310,

This should fix github issue #2051.

This should be backported up to 2.6.
2023-03-03 15:08:02 +01:00
Remi Tricot-Le Breton
8c20a74c90 BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback
In the OCSP response callback, instead of using the actual date of the
system, the scheduler's 'now' timer is used when checking a response's
validity.

This patch can be backported to all stable versions.
2023-03-02 15:57:56 +01:00
Remi Tricot-Le Breton
56ab607c40 MINOR: ssl: Replace now.tv_sec with date.tv_sec in ocsp update task
Instead of relying on the scheduler's timer in the main ocsp update
task, we use the actual system's date.
2023-03-02 15:57:56 +01:00
Remi Tricot-Le Breton
86d1e0b163 BUG/MINOR: ssl: Fix ocsp-update when using "add ssl crt-list"
When adding a new certificate through the CLI and appending it to a
crt-list with the 'ocsp-update' option set, the new certificate would
not be added to the OCSP response update list.
The only thing that was missing was the copy of the ocsp_update mode
from the ssl_bind_conf into the ckch_store's object.
An extra wakeup of the update task also needed to happen in case the
newly inserted entry needs to be updated before the next wakeup of the
task.

This patch does not need to be backported.
2023-03-02 15:57:56 +01:00
Remi Tricot-Le Breton
ca0c84a509 MINOR: ssl: Add ocsp-update information to "show ssl crt-list"
The "show ssl crt-list <list>" CLI command did not manage the new
ocsp-update option yet.
2023-03-02 15:57:55 +01:00
Remi Tricot-Le Breton
5ab54c61b0 REGTESTS: ssl: Add test for new ocsp update cli commands
Add tests for the "show ssl ocsp-updates" cli command as well as the new
'base64' parameter that can be passed to the "show ssl ocsp-response"
command.
2023-03-02 15:57:55 +01:00
Remi Tricot-Le Breton
780504ae4d REGTESTS: ssl: Fix ocsp update crt-lists
The options were after the filters which does not work well and now
raises a warning. It did not break the regtest because the crt-lists
were not actually used by clients.
2023-03-02 15:37:23 +01:00
Remi Tricot-Le Breton
5843237993 MINOR: ssl: Add global options to modify ocsp update min/max delay
The minimum and maximum delays between two automatic updates of a given
OCSP response can now be set via global options. It allows to limit the
update rate of OCSP responses for configurations that use many frontend
certificates with the ocsp-update option set if the updates are deemed
too costly.
2023-03-02 15:37:23 +01:00
Remi Tricot-Le Breton
9c4437d024 MINOR: ssl: Add way to dump ocsp response in base64
A new format option can be passed to the "show ssl ocsp-response" CLI
command to dump the contents of an OCSP response in base64. This is
needed because thanks to the new OCSP auto update mechanism, we could
end up using an OCSP response internally that was never provided by the
user.
2023-03-02 15:37:22 +01:00
Remi Tricot-Le Breton
7e1a62e2b4 MINOR: ssl: Increment OCSP update replay delay in case of failure
In case of successive OCSP update errors for a given OCSP response, the
retry delay will be multiplied by 2 for every new failure in order to
avoid retrying too often to update responses for which the responder is
unresponsive (for instance). The maximum delay will still be taken into
account so the OCSP update requests will still be sent at least every
hour.
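
The doubling with a cap can be sketched as follows. This is only an illustration: `next_retry_delay` is a hypothetical name, and the 60-second starting delay in the usage note is an assumption; the one-hour maximum reflects the sentence above.

```python
def next_retry_delay(current_delay: int, max_delay: int) -> int:
    """Delay (in seconds) to use after one more consecutive failure."""
    return min(current_delay * 2, max_delay)
```

Starting from a hypothetical 60 seconds with a 3600-second maximum, successive failures would give 120, 240, 480 and so on, until the delay stays at 3600.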
2023-03-02 15:37:21 +01:00
Remi Tricot-Le Breton
07b7c15bce MINOR: ssl: Reorder struct certificate_ocsp members
Just swapping those two 'refcount' and 'response' members allows filling
two 4-byte holes in the structure.
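
How swapping two members can remove padding holes may be seen with a hypothetical layout. The structures below are not the real certificate_ocsp definition: the `key_length` and `uri` fields and all field types are assumptions chosen only to reproduce the effect with Python's ctypes.

```python
import ctypes


class Before(ctypes.Structure):
    # Each int is followed by a pointer, so padding may be inserted after each int.
    _fields_ = [
        ("key_length", ctypes.c_int),
        ("response", ctypes.c_void_p),
        ("refcount", ctypes.c_int),
        ("uri", ctypes.c_void_p),
    ]


class After(ctypes.Structure):
    # 'refcount' and 'response' swapped: the two ints are now adjacent.
    _fields_ = [
        ("key_length", ctypes.c_int),
        ("refcount", ctypes.c_int),
        ("response", ctypes.c_void_p),
        ("uri", ctypes.c_void_p),
    ]


size_before = ctypes.sizeof(Before)
size_after = ctypes.sizeof(After)
```

On a platform with 4-byte ints and 8-byte aligned pointers, the second layout is expected to be smaller because the two ints share one 8-byte slot; on other platforms the sizes may be equal.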
2023-03-02 15:37:20 +01:00
Remi Tricot-Le Breton
b33fe2f4a2 MINOR: ssl: Use dedicated proxy and log-format for OCSP update
Instead of using the same proxy as other http client calls (through lua
for instance), the OCSP update will use a dedicated proxy which will
enable it to change the log format and log conditions (for instance).
This proxy will have the NOLOGNORM option and regular logging will be
managed by the update task itself because in order to dump information
related to OCSP updates, we need to control the moment when the logs are
emitted (instead of relying on the stream's life which is decorrelated
from the update itself).
The update task then calls sess_log directly, which uses a dedicated
ocsp logformat that fetches specific OCSP data. Sess_log was preferred
to the more low level app_log because it offers the strength of
"regular" sample fetches and allows to add generic information alongside
OCSP ones in the log line.
In case of connection error (unreachable server for instance), a regular
httpclient log line will also be emitted. This line will have some extra
HTTP related info that can't be provided by the ocsp update logging
mechanism.
2023-03-02 15:37:19 +01:00
Remi Tricot-Le Breton
d42c896216 MINOR: ssl: Add sample fetches related to OCSP update
This patch adds a series of sample fetches that rely on the specified
OCSP update context structure. They will then be of use only in the
context of an ongoing OCSP update.
They cannot be used directly in the configuration so they won't be made
public. They will be used in the OCSP update's specific log format which
should be emitted by the update task itself in a future patch.
2023-03-02 15:37:18 +01:00
Remi Tricot-Le Breton
d14fc51613 MINOR: ssl: Add 'show ssl ocsp-updates' CLI command
This command can be used to dump information about the entries contained
in the ocsp update tree. It will display one line per concerned OCSP
response and will contain the expected next update time as well as the
time of the last successful update, and the number of successful and
failed attempts.
2023-03-02 15:37:17 +01:00
Remi Tricot-Le Breton
0c96ee48b4 MINOR: ssl: Add certificate's path to certificate_ocsp structure
In order to have some information about the frontend certificate when
dumping the contents of the ocsp update tree from the cli, we could
either keep a reference to a ckch_store in the certificate_ocsp
structure, which might cause some dangling reference problems, or
simply copy the path to the certificate in the ocsp response structure.
This latter solution was chosen because of its simplicity.
2023-03-02 15:37:15 +01:00
Remi Tricot-Le Breton
ad6cba83a4 MINOR: ssl: Store specific ocsp update errors in response and update ctx
Those new specific error codes will enable to know a bit better what
went wrong during an OCSP update process. They will come to use in
future sample fetches as well as in debugging means (via the cli or
future traces).
2023-03-02 15:37:12 +01:00
Remi Tricot-Le Breton
9e94df3e55 MINOR: ssl: Add ocsp update success/failure counters
Those counters will be used for debugging purposes and will be dumped
via a cli command.
2023-03-02 15:37:11 +01:00
Remi Tricot-Le Breton
6de7b78c9f MINOR: ssl: Reinsert ocsp update entries later in case of unknown error
In case of allocation error during the construction of an OCSP request
for instance, we would have ended reinserting the ocsp entry at the same
place in the ocsp update tree which could potentially lead to an
"endless" loop of errors in ssl_ocsp_update_responses. In such a case,
entries are now reinserted further in the tree (1 minute later) in order
to avoid such a chain of alloc failure.
2023-03-02 15:37:10 +01:00
Remi Tricot-Le Breton
926f34bc36 MINOR: ssl: Destroy ocsp update http_client during cleanup
If a deinit is started while an OCSP update is in progress we might end
up with a dangling http_client instance that should be destroyed
properly.
2023-03-02 15:37:07 +01:00
Christopher Faulet
91ff709542 BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data
When an abort is detected before all headers were received, and if there are
pending incoming data, we must report a parsing error instead of a
connection abort. This way it will be able to be handled as an invalid
message by HTTP analyzers instead of an early abort with no message.

It is especially important to be accurate on L7 retry. Indeed, without this
fix, this case will be handled by the "empty-response" retries policy while a
retry on "junk-response" is more accurate.

This patch must be backported to 2.7.
2023-03-01 17:35:16 +01:00
Christopher Faulet
c2fba3f77f BUG/MEDIUM: http-ana: Don't close request side when waiting for response
A recent fix (af124360e "BUG/MEDIUM: http-ana: Detect closed SC on opposite side
during body forwarding") was pushed to sync a side when the opposite
one is in closing state. However, sometimes, the synchro is performed too early,
preventing a L7 retry to be performed.

Indeed, while the above fix is valid on the response side, on the request side,
if the response was not yet received, we must wait before closing.

So, to fix the fix, on the request side, we at least wait until the response is
received before finishing the request analysis. Of course, if there is an error,
an abort or anything wrong on the server side, the response analyser should
handle it.

This patch is related to #2061. No backport needed.
2023-03-01 17:35:16 +01:00
Christopher Faulet
6f78ac5605 BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
A regression about "empty-response" L7 retry was introduced with the commit
dd6496f591 ("CLEANUP: http-ana: Remove useless if statement about L7
retries").

The if statement was removed on a wrong assumption. Indeed, L7 retries on
status is now handled in the HTTP analysers. Thus, the stream-connector
(formerly the conn-stream, and before again the stream-interface) no longer
reports a read error to force a retry. But it is still possible to get a read
error with no response. In this case, we must perform a retry if
"empty-response" is enabled.

So the if statement is re-introduced, reverting the cleanup.

This patch should fix the issue #2061. It must be backported as far as 2.4.
2023-03-01 17:35:16 +01:00
Christopher Faulet
41ade746c7 BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7 retry
When we are about to perform a L7 retry, we deal with the conn_retries
counter, to be sure we can retry. However, there is an issue here because
the counter is incremented before it is checked against the backend
limit. So, we can miss a connection retry.

Of course, we must invert both operations. The conn_retries counter must be
incremented after the check against the backend limit.
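
The off-by-one can be reproduced with a small simulation. It is a sketch under assumptions, not the HTTP analyser code: the function names are hypothetical and the `>=` comparison against the limit is assumed.

```python
def retries_when_incrementing_first(limit: int) -> int:
    """Count the retries performed when the counter is bumped before the check."""
    conn_retries = 0
    performed = 0
    while True:
        conn_retries += 1
        if conn_retries >= limit:
            break
        performed += 1
    return performed


def retries_when_checking_first(limit: int) -> int:
    """Count the retries performed when the check comes before the increment."""
    conn_retries = 0
    performed = 0
    while True:
        if conn_retries >= limit:
            break
        conn_retries += 1
        performed += 1
    return performed
```

With a limit of 3, the first ordering performs only 2 retries while the second performs all 3, which is the missed retry the patch addresses.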

This patch must be backported as far as 2.6.
2023-03-01 17:35:16 +01:00
Amaury Denoyelle
caa16549b8 MINOR: quic: notify on send ready
This patch completes the previous one with poller subscribe of quic-conn
owned socket on sendto() error. This ensures that mux-quic is notified
if waiting on sending when a transient sendto() error is cleared. As
such, qc_notify_send() is called directly inside socket I/O callback.

qc_notify_send() internal conditions have thus been completed. This will
prevent notifying the upper layer until all sending conditions are fulfilled:
room in congestion window and no transient error on socket FD.

This should be backported up to 2.7.
2023-03-01 14:32:37 +01:00