Commit Graph

19548 Commits

Author SHA1 Message Date
Aurelien DARRAGON
e2907c7ee3 MINOR: stick-table: add sc-add-gpc() to http-after-response
sc-add-gpc() was implemented in 5a72d03 ("MINOR: stick-table:
implement the sc-add-gpc() action")

This new action was exposed everywhere sc-inc-gpc() is available,
except for http-after-response.
But there doesn't seem to be a technical constraint that prevents us from
exposing it in http-after-response.
It was probably overlooked, let's add it.

No backport needed, unless 5a72d03 ("MINOR: stick-table: implement the
sc-add-gpc() action") is being backported.
2023-03-17 13:09:09 +01:00
Frédéric Lécaille
ca07979b97 BUG/MINOR: quic: Missing STREAM frame data pointer updates
This patch follows this one which was not sufficient:
    "BUG/MINOR: quic: Missing STREAM frame length updates"
Indeed, it is not sufficient to update the ->len and ->offset member
of a STREAM frame to move it forward. The data pointer must also be updated.
This is not done by the STREAM frame builder.

Must be backported to 2.6 and 2.7.
2023-03-17 09:21:18 +01:00
Willy Tarreau
14ea98af73 BUG/MINOR: mux-h2: set CO_SFL_STREAMER when sending lots of data
Emeric noticed that h2 bit-rate performance was always slightly lower
than h1 when the CPU is saturated. Strace showed that we were always
data in 2kB chunks, corresponding to the max_record size. What's
happening is that when this mechanism of dynamic record size was
introduced, the STREAMER flag at the stream level was relied upon.
Since all this was moved to the muxes, the flag has to be passed as
an argument to the snd_buf() function, but the mux h2 did not use it
despite a comment mentioning it, probably because before the multi-buf
it was not easy to figure the status of the buffer.

The solution here consists in checking if the mbuf is congested or not,
by checking if it has more than one buffer allocated. If so we set the
CO_SFL_STREAMER flag, otherwise we don't. This way moderate size
exchanges continue to be made over small chunks, but downloads will
be able to use the large ones.

While it could be backported to all supported versions, it would be
better to limit it to the last LTS, so let's do it for 2.7 and 2.6 only.
This patch requires previous commit "MINOR: buffer: add br_single() to
check if a buffer ring has more than one buf".
2023-03-16 18:45:46 +01:00
Willy Tarreau
93c5511af8 BUG/MEDIUM: mux-h2: only restart sending when mux buffer is decongested
During performance tests, Emeric faced a case where the wakeups of
sc_conn_io_cb() caused by h2_resume_each_sending_h2s() was multiplied
by 5-50 and a lot of CPU was being spent doing this for apparently no
reason.

The culprit is h2_send() not behaving well with congested buffers and
small SSL records. What happens when the output is congested is that
all buffers are full, and data are emitted in 2kB chunks, which are
sufficient to wake all streams up again to ask them to send data again,
something that will obviously only work for one of them at best, and
waste a lot of CPU in wakeups and memcpy() due to the small buffers.
When this happens, the performance can be divided by 2-2.5 on large
objects.

Here the chosen solution against this is to keep in mind that as long
as there are still at least two buffers in the ring after calling
xprt->snd_buf(), it means that the output is congested and there's
no point trying again, because these data will just be placed into
such buffers and will wait there. Instead we only mark the buffer
decongested once we're back to a single allocated buffer in the ring.

By doing so we preserve the ability to deal with large concurrent
bursts while not causing a thundering herd by waking all streams for
almost nothing.

This needs to be backported to 2.7 and 2.6. Other versions could
benefit from it as well but it's not strictly necessary, and we can
reconsider this option if some excess calls to sc_conn_io_cb() are
faced.

Note that this fix depends on this recent commit:
    MINOR: buffer: add br_single() to check if a buffer ring has more than one buf
2023-03-16 18:45:46 +01:00
Willy Tarreau
9824f8c890 MINOR: buffer: add br_single() to check if a buffer ring has more than one buf
It's cheaper and cleaner than using br_count()==1 given that it just compares
two indexes, and that a ring having a single buffer is in a special case where
it is between empty and used up-to-1. In other words it's not congested.
2023-03-16 18:45:46 +01:00
Willy Tarreau
e5a26eb2de MINOR: buffer: add br_count() to return the number of allocated bufs
We have no way to know how many buffers are currently allocated in a
buffer ring. Let's add br_count() for this.
2023-03-16 18:45:46 +01:00
Willy Tarreau
3fb2c6d5b4 BUG/MINOR: mux-h2: make sure the h2c task exists before refreshing it
When detaching a stream, if it's the last one and the mbuf is blocked,
we leave without freeing the stream yet. We also refresh the h2c task's
timeout, except that it's possible that there's no such task in case
there is no client timeout, causing a crash. The fix just consists in
doing this when the task exists.

This bug has always been there and is extremely hard to meet even
without a client timeout. This fix has to be backported to all
branches, but it's unlikely anyone has ever met it anyay.
2023-03-16 18:45:46 +01:00
Christopher Faulet
3a7b539b12 BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list
The commit 5e1b0e7bf ("BUG/MEDIUM: connection: Clear flags when a conn is
removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and
CO_FL_IDLE_LIST flags are used when the connection is released to properly
decrement used/idle connection counters. if a connection is idle, these
flags must be preserved till the connection is really released. It may be
removed from the list but not immediately released. If these flags are lost
when it is finally released, the current number of used connections is
erroneously decremented. If means this counter may become negative and the
counters tracking the number of idle connecitons is not decremented,
suggesting a leak.

So, the above commit is reverted and instead we improve a bit the way to
detect an idle connection. The function conn_get_idle_flag() must now be
used to know if a connection is in an idle list. It returns the connection
flag corresponding to the idle list if the connection is idle
(CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection
is scheduled to be removed, 0 is also returned, regardless the connection
flags.

This new function is used when the connection is temporarily removed from
the list to be used, mainly in muxes.

This patch should fix #2078 and #2057. It must be backported as far as 2.2.
2023-03-16 15:34:20 +01:00
Frédéric Lécaille
fc546ab6a7 BUG/MINOR: quic: Missing STREAM frame length updates
Some STREAM frame lengths were not updated before being duplicated, built
of requeued contrary to their ack offsets. This leads haproxy to crash when
receiving acknowledgements for such frames with this thread #1 backtrace:

 Thread 1 (Thread 0x7211b6ffd640 (LWP 986141)):
 #0  ha_crash_now () at include/haproxy/bug.h:52
 No locals.
 #1  b_del (b=<optimized out>, del=<optimized out>) at include/haproxy/buf.h:436
 No locals.
 #2  qc_stream_desc_ack (stream=stream@entry=0x7211b6fd9bc8, offset=offset@entry=53176, len=len@entry=1122) at src/quic_stream.c:111

Thank you to @Tristan971 for having provided such traces which reveal this issue:

[04|quic|5|c_conn.c:1865] qc_requeue_nacked_pkt_tx_frms(): entering : qc@0x72119c22cfe0
[04|quic|5|_frame.c:1179] qc_frm_unref(): entering : qc@0x72119c22cfe0
[04|quic|5|_frame.c:1186] qc_frm_unref(): remove frame reference : qc@0x72119c22cfe0 frm@0x72118863d260 STREAM_F uni=0 fin=1 id=460 off=52957 len=1122 3244
[04|quic|5|_frame.c:1194] qc_frm_unref(): leaving : qc@0x72119c22cfe0
[04|quic|5|c_conn.c:1902] qc_requeue_nacked_pkt_tx_frms(): updated partially acked frame : qc@0x72119c22cfe0 frm@0x72119c472290 STREAM_F uni=0 fin=1 id=460 off=53176 len=1122

Note that haproxy has much more chance to crash if this frame is the last one
(fin bit set). But another condition must be fullfilled to update the ack offset.
A previous STREAM frame from the same stream with the same offset but with less
data must be acknowledged by the peer. This is the condition to update the ack offset.

For others frames without fin bit in the same conditions, I guess the stream may be
truncated because too much data are removed from the stream when they are
acknowledged.

Must be backported to 2.6 and 2.7.
2023-03-16 14:35:19 +01:00
Aurelien DARRAGON
819817fc5e BUG/MINOR: tcp_sample: fix a bug in fc_dst_port and fc_dst_is_local sample fetches
There is a bug in the smp_fetch_dport() function which affects the 'f' case,
also known as 'fc_dst_port' sample fetch.

conn_get_src() is used to retrieve the address prior to calling conn_dst().
But this is wrong: conn_get_dst() should be used instead.

Because of that, conn_dst() may return unexpected results since the dst
address is not guaranteed to be set depending on the conn state at the time
the sample fetch is used.

This was reported by Corin Langosch on the ML:

during his tests he noticed that using fc_dst_port in a log-format string
resulted in the correct value being printed in the logs but when he used it
in an ACL, the ACL did not evaluate properly.

This can be easily reproduced with the following test conf:
    |frontend test-http
    |  bind 127.0.0.1:8080
    |  mode http
    |
    |  acl test fc_dst_port eq 8080
    |  http-request return status 200 if test
    |  http-request return status 500 if !test

A request on 127.0.0.1:8080 should normally return 200 OK, but here it
will return a 500.

The same bug was also found in smp_fetch_dst_is_local() (fc_dst_is_local
sample fetch) by reading the code: the fix was applied twice.

This needs to be backported up to 2.5
[both sample fetches were introduced in 2.5 with 888cd70 ("MINOR:
tcp-sample: Add samples to get original info about client connection")]
2023-03-16 11:26:53 +01:00
Christopher Faulet
a4bd7602f3 BUG/MEDIUM: mux-h1: Don't block SE_FL_ERROR if EOS is not reported on H1C
When a connection error is encountered during a receive, the error is not
immediatly reported to the SE descriptor. We take care to process all
pending input data first. However, when the error is finally reported, a
fatal error is reported only if a read0 was also received. Otherwise, only a
pending error is reported.

With a raw socket, it is not an issue because shutdowns for read and write
are systematically reported too when a connection error is detected. So in
this case, the fatal error is always reported to the SE descriptor. But with
a SSL socket, in case of pure SSL error, only the connection error is
reported. This prevent the fatal error to go up. And because the connection
is in error, no more receive or send are preformed on the socket. The mux is
blocked till a timeout is triggered at the stream level, looping infinitly
to progress.

To fix the bug, during the demux stage, when there is no longer pending
data, the read error is reported to the SE descriptor, regardless the
shutdown for reads was received or not. No more data are expected, so it is
safe.

This patch should fix the issue #2046. It must be backported to 2.7.
2023-03-16 08:54:51 +01:00
Christopher Faulet
f19c639787 DEBUG: ssl-sock/show_fd: Display SSL error code
Like for connection error code, when FD are dumps, the ssl error code is now
displayed. This may help to diagnose why a connection error occurred.

This patch may be backported to help debugging.
2023-03-14 15:51:34 +01:00
Christopher Faulet
d52f2ad6ee DEBUG: cli/show_fd: Display connection error code
When FD are dumps, the connection error code is now displayed. This may help
to diagnose why a connection error occurred.

This patch may be backported to help debugging.
2023-03-14 15:48:07 +01:00
Christopher Faulet
52ec6f14c4 BUG/MEDIUM: resolvers: Properly stop server resolutions on soft-stop
When HAproxy is stopping, the DNS resolutions must be stopped, except those
triggered from a "do-resolve" action. To do so, the resolutions themselves
cannot be destroyed, the current design is too complex. However, it is
possible to mute the resolvers tasks. The same is already performed with the
health-checks. On soft-stop, the tasks are still running periodically but
nothing if performed.

For the resolvers, when the process is stopping, before running a
resolution, we check all the requesters attached to this resolution. If t
least a request is a stream or if there is a requester attached to a running
proxy, a new resolution is triggered. Otherwise, we ignored the
resolution. It will be evaluated again on the next wakeup. This way,
"do-resolv" action are still working during soft-stop but other resoluation
are stopped.

Of course, it may be see as a feature and not a bug because it was never
performed. But it is in fact not expected at all to still performing
resolutions when HAProxy is stopping. In addution, a proxy option will be
added to change this behavior.

This patch partially fixes the issue #1874. It could be backported to 2.7
and maybe to 2.6. But no further.
2023-03-14 15:23:55 +01:00
Christopher Faulet
48678e483f BUG/MEDIUM: proxy: properly stop backends on soft-stop
On soft-stop, we must properlu stop backends and not only proxies with at
least a listener. This is mandatory in order to stop the health checks. A
previous fix was provided to do so (ba29687bc1 "BUG/MEDIUM: proxy: properly
stop backends"). However, only stop_proxy() function was fixed. When HAproxy
is stopped, this function is no longer used. So the same kind of fix must be
done on do_soft_stop_now().

This patch partially fixes the issue #1874. It must be backported as far as
2.4.
2023-03-14 15:23:55 +01:00
Remi Tricot-Le Breton
7716f27736 MINOR: ssl: Add certificate path to 'show ssl ocsp-response' output
The ocsp-related CLI commands tend to work with OCSP_CERTIDs as well as
certificate paths so the path should also be added to the output of the
"show ssl ocsp-response" command when no certid or path is provided.
2023-03-14 11:07:32 +01:00
Remi Tricot-Le Breton
dafc068f12 MINOR: ssl: Accept certpath as param in "show ssl ocsp-response" CLI command
In order to increase usability, the "show ssl ocsp-response" also takes
a frontend certificate path as parameter. In such a case, it behaves the
same way as "show ssl cert foo.pem.ocsp".
2023-03-14 11:07:32 +01:00
Remi Tricot-Le Breton
f64a05979d BUG/MINOR: ssl: Fix double free in ocsp update deinit
If the last update before a deinit happens was successful, the pointer
to the httpclient in the ocsp update context was not reset while the
httpclient instance was already destroyed.
2023-03-14 11:07:32 +01:00
Remi Tricot-Le Breton
a6c0a59e9a MINOR: ssl: Use ocsp update task for "update ssl ocsp-response" command
Instead of having a dedicated httpclient instance and its own code
decorrelated from the actual auto update one, the "update ssl
ocsp-response" will now use the update task in order to perform updates.

Since the cli command allows to update responses that were never
included in the auto update tree, a new flag was added to the
certificate_ocsp structure so that the said entry can be inserted into
the tree "by hand" and it won't be reinserted back into the tree after
the update process is performed. The 'update_once' flag "stole" a bit
from the 'fail_count' counter since it is the one less likely to reach
UINT_MAX among the ocsp counters of the certificate_ocsp structure.

This new logic required that every certificate_ocsp entry contained all
the ocsp-related information at all time since entries that are not
supposed to be configured automatically can still be updated through the
cli. The logic of the ssl_sock_load_ocsp was changed accordingly.
2023-03-14 11:07:32 +01:00
Remi Tricot-Le Breton
c9bfe32b71 MINOR: ssl: Change the ocsp update log-format
The dedicated proxy used for OCSP auto update is renamed OCSP-UPDATE
which should be more explicit than the previous HC_OCSP name. The
reference to the underlying httpclient is simply kept in the
documentation.
The certid is removed from the log line since it is not really
comprehensible and is replaced by the path to the corresponding frontend
certificate.
2023-03-14 11:07:32 +01:00
Christopher Faulet
e5d02c3d46 BUG/MEDIUM: mux-pt: Set EOS on error on sending path if read0 was received
It is more a less a revert of the commit b65af26e1 ("MEDIUM: mux-pt: Don't
always set a final error on SE on the sending path"). The PT multiplexer is
so simple that an error on the sending path is terminal. Unlike other muxes,
there is no connection level here. However, instead of reporting an final
error by setting SE_FL_ERROR, we set SE_FL_EOS flag instead if a read0 was
received on the underlying connection. Concretely, it is always true with
the current design of the raw socket layer. But it is cleaner this way.

Without this patch, it is possible to block a TCP socket if a connection
error is triggered when data are sent (for instance a broken pipe) while the
upper stream does not expect to receive more data.

Note the patch above introduced a regression because errors handling at the
connection level is quite simple. All errors are final. But we must keep in
mind it may change. And if so, this will require to move back on a 2-step
errors handling in the mux-pt.

This patch must be backported to 2.7.
2023-03-13 11:22:13 +01:00
Willy Tarreau
fc0ad29c29 [RELEASE] Released version 2.8-dev5
Released version 2.8-dev5 with the following main changes :
    - MINOR: ssl: rename confusing ssl_bind_kws
    - BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
    - BUG/MEDIUM: http-ana: Detect closed SC on opposite side during body forwarding
    - BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was reached
    - MINOR: global: Add an option to disable the data fast-forward
    - MINOR: haproxy: Add an command option to disable data fast-forward
    - REGTESTS: Remove unsupported feature command in http_splicing.vtc
    - BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping
    - BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
    - BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
    - MINOR: threads: add flags to know if a thread is started and/or running
    - MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
    - BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
    - BUG/MINOR: mworker: prevent incorrect values in uptime
    - MINOR: h3: add traces on decode_qcs callback
    - BUG/MINOR: quic: Possible unexpected counter incrementation on send*() errors
    - MINOR: quic: Add new traces about by connection RX buffer handling
    - MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
    - BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
    - MINOR: quic: Simplication for qc_set_timer()
    - MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt
    - MINOR: quic: Add traces to qc_kill_conn()
    - MINOR: quic: Make qc_dgrams_retransmit() return a status.
    - BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
    - MINOR: quic: Add a trace to identify connections which sent Initial packet.
    - MINOR: quic: Add <pto_count> to the traces
    - BUG/MINOR: quic: Do not probe with too little Initial packets
    - BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
    - BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
    - BUG/MINOR: quic: Missing padding for short packets
    - MINOR: quic: adjust request reject when MUX is already freed
    - BUG/MINOR: quic: also send RESET_STREAM if MUX released
    - BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
    - BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
    - MEDIUM: h3: enforce GOAWAY by resetting higher unhandled stream
    - MINOR: mux-quic: define qc_shutdown()
    - MINOR: mux-quic: define qc_process()
    - MINOR: mux-quic: implement client-fin timeout
    - MEDIUM: mux-quic: properly implement soft-stop
    - MINOR: quic: mark quic-conn as jobs on socket allocation
    - MEDIUM: quic: trigger fast connection closing on process stopping
    - MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
    - MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
    - BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
    - DEBUG: stream: Add a BUG_ON to never exit process_stream with an expired task
    - DOC: config: Fix description of options about HTTP connection modes
    - MINOR: proxy: Only consider backend httpclose option for server connections
    - BUG/MINOR: haproxy: Fix option to disable the fast-forward
    - DOC: config: Add the missing tune.fail-alloc option from global listing
    - MINOR: cfgcond: Implement strstr condition expression
    - MINOR: cfgcond: Implement enabled condition expression
    - REGTESTS: Skip http_splicing.vtc script if fast-forward is disabled
    - REGTESTS: Fix ssl_errors.vtc script to wait for connections close
    - BUG/MINOR: mworker: stop doing strtok directly from the env
    - BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old versions
    - BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master FD is wrong
    - MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
    - BUG/MINOR: cache: Cache response even if request has "no-cache" directive
    - BUG/MINOR: cache: Check cache entry is complete in case of Vary
    - MINOR: compiler: add a TOSTR() macro to turn a value into a string
    - BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
    - BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy
    - MEDIUM: channel: Remove CF_READ_NOEXP flag
    - MAJOR: channel: Remove flags to report READ or WRITE errors
    - DEBUG: stream/trace: Add sedesc flags in trace messages
    - MINOR: channel/stconn: Move rto/wto from the channel to the stconn
    - MEDIUM: channel/stconn: Move rex/wex timer from the channel to the sedesc
    - MEDIUM: stconn: Don't requeue the stream's task after I/O
    - MEDIUM: stconn: Replace read and write timeouts by a unique I/O timeout
    - MEDIUM: stconn: Add two date to track successful reads and blocked sends
    - MINOR: applet/stconn: Add a SE flag to specify an endpoint does not expect data
    - MAJOR: stream: Use SE descriptor date to detect read/write timeouts
    - MINOR: stream: Dump the task expiration date in trace messages
    - MINOR: stream: Report rex/wex value using the sedesc date in trace messages
    - MINOR: stream: Use relative expiration date in trace messages
    - MINOR: stconn: Always report READ/WRITE event on shutr/shutw
    - CLEANUP: stconn: Remove old read and write expiration dates
    - MINOR: stconn: Set half-close timeout using proxy settings
    - MINOR: stconn: Remove half-closed timeout
    - REGTESTS: cache: Use rxresphdrs to only get headers for 304 responses
    - MINOR: stconn: Add functions to set/clear SE_FL_EXP_NO_DATA flag from endpoint
    - BUG/MINOR: proto_ux: report correct error when bind_listener fails
    - BUG/MINOR: protocol: fix minor memory leak in protocol_bind_all()
    - MINOR: proto_uxst: add resume method
    - MINOR: listener/api: add lli hint to listener functions
    - MINOR: listener: add relax_listener() function
    - MINOR: listener: workaround for closing a tiny race between resume_listener() and stopping
    - MINOR: listener: make sure we don't pause/resume bypassed listeners
    - BUG/MEDIUM: listener: fix pause_listener() suspend return value handling
    - BUG/MINOR: listener: fix resume_listener() resume return value handling
    - BUG/MEDIUM: resume from LI_ASSIGNED in default_resume_listener()
    - MINOR: listener: pause_listener() becomes suspend_listener()
    - BUG/MEDIUM: listener/proxy: fix listeners notify for proxy resume
    - BUG/MINOR: sock_unix: match finalname with tempname in sock_unix_addrcmp()
    - MEDIUM: proto_ux: properly suspend named UNIX listeners
    - MINOR: proto_ux: ability to dump ABNS names in error messages
    - MINOR: haproxy: always protocol unbind on startup error path
    - BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
    - BUG/MINOR: ring: do not realign ring contents on resize
    - MEDIUM: ring: make the offset relative to the head/tail instead of absolute
    - CLEANUP: ring: remove the now unused ring's offset
    - MINOR: config: add HAPROXY_BRANCH environment variable
    - BUILD: thead: Fix several 32 bits compilation issues with uint64_t variables
    - BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and fd_rm_from_fd_list
    - BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during parsing
    - BUG/MINOR: stream: Remove BUG_ON about the task expiration in process_stream()
    - MINOR: stream: Handle stream's timeouts in a dedicated function
    - MEDIUM: stream: Eventually handle stream timeouts when exiting process_stream()
    - MINOR: stconn: Report a send activity when endpoint is willing to consume data
    - BUG/MEDIUM: stconn: Report a blocked send if some output data are not consumed
    - MEDIUM: mux-h1: Don't expect data from server as long as request is unfinished
    - MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished
    - MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished
    - DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
    - DOC: config: Replace TABs by spaces
    - BUG/MINOR: fd: used the update list from the fd's group instead of tgid
    - BUG/MEDIUM: fd: make fd_delete() support being called from a different group
    - CLEANUP: listener: only store conn counts for local threads
    - MINOR: tinfo: make thread_set functions return nth group/mask instead of first
    - MEDIUM: quic: improve fatal error handling on send
    - MINOR: quic: consider EBADF as critical on send()
    - BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
    - BUG/MINOR: mux-h1: Don't report an error on an early response close
    - BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format body
    - BUG/MINOR: http-check: Skip C-L header for empty body when it's not mandatory
    - BUG/MINOR: http-fetch: recognize IPv6 addresses in square brackets in req.hdr_ip()
    - REGTEST: added tests covering smp_fetch_hdr_ip()
    - MINOR: quic: simplify return path in send functions
    - MINOR: quic: implement qc_notify_send()
    - MINOR: quic: purge txbuf before preparing new packets
    - MEDIUM: quic: implement poller subscribe on sendto error
    - MINOR: quic: notify on send ready
    - BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7 retry
    - BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
    - BUG/MEDIUM: http-ana: Don't close request side when waiting for response
    - BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data
    - MINOR: ssl: Destroy ocsp update http_client during cleanup
    - MINOR: ssl: Reinsert ocsp update entries later in case of unknown error
    - MINOR: ssl: Add ocsp update success/failure counters
    - MINOR: ssl: Store specific ocsp update errors in response and update ctx
    - MINOR: ssl: Add certificate's path to certificate_ocsp structure
    - MINOR: ssl: Add 'show ssl ocsp-updates' CLI command
    - MINOR: ssl: Add sample fetches related to OCSP update
    - MINOR: ssl: Use dedicated proxy and log-format for OCSP update
    - MINOR: ssl: Reorder struct certificate_ocsp members
    - MINOR: ssl: Increment OCSP update replay delay in case of failure
    - MINOR: ssl: Add way to dump ocsp response in base64
    - MINOR: ssl: Add global options to modify ocsp update min/max delay
    - REGTESTS: ssl: Fix ocsp update crt-lists
    - REGTESTS: ssl: Add test for new ocsp update cli commands
    - MINOR: ssl: Add ocsp-update information to "show ssl crt-list"
    - BUG/MINOR: ssl: Fix ocsp-update when using "add ssl crt-list"
    - MINOR: ssl: Replace now.tv_sec with date.tv_sec in ocsp update task
    - BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback
    - BUG/MEDIUM: quic: properly handle duplicated STREAM frames
    - BUG/MINOR: cli: fix CLI handler "set anon global-key" call
    - MINOR: http_ext: adding some documentation, forgot to inline function
    - BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
    - MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
    - BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
    - BUG/MINOR: quic: v2 Initial packets decryption failed
    - MINOR: quic: Add traces about QUIC TLS key update
    - BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
    - BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
    - BUG/MINOR: quic: Do not resend already acked frames
    - BUG/MINOR: quic: Missing detections of amplification limit reached
    - MINOR: quic: Send PING frames when probing Initial packet number space
    - BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
    - BUG/MAJOR: fd/thread: fix race between updates and closing FD
    - BUG/MEDIUM: dns: ensure ring offset is properly reajusted to head
    - BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
    - MINOR: quic: Do not accept wrong active_connection_id_limit values
    - MINOR: quic: Store the next connection IDs sequence number in the connection
    - MINOR: quic: Typo fix for ACK_ECN frame
    - MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
    - MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
    - MINOR: quic: Add spin bit support
    - MINOR: quic: Add transport parameters to "show quic"
    - BUG/MEDIUM: sink/forwarder: ensure ring offset is properly readjusted to head
    - BUG/MINOR: dns: fix ring offset calculation on first read
    - BUG/MINOR: dns: fix ring offset calculation in dns_resolve_send()
    - MINOR: jwt: Add support for RSA-PSS signatures (PS256 algorithm)
    - MINOR: h3: add traces on h3_init_uni_stream() error paths
    - MINOR: quic: create a global list dedicated for closing QUIC conns
    - MINOR: quic: handle new closing list in show quic
    - MEDIUM: quic: release closing connections on stopping
    - BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
    - MINOR: fd/cli: report the polling mask in "show fd"
    - CLEANUP: sock: always perform last connection updates before wakeup
    - MINOR: quic: Do not stress the peer during retransmissions of lost packets
    - BUG/MINOR: init: properly detect NUMA bindings on large systems
    - BUG/MINOR: thread: report thread and group counts in the correct order
    - BUG/MAJOR: fd/threads: close a race on closing connections after takeover
    - MINOR: debug: add random delay injection with "debug dev delay-inj"
    - BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
    - BUG/MINOR: quic: Missing listener accept queue tasklet wakeups
    - MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
    - DOC: config: fix typo "dependeing" in bind thread description
    - DOC/CLEANUP: fix typos
2023-03-10 16:28:37 +01:00
Michael Prokop
9a62e35e37 DOC/CLEANUP: fix typos
s/algorithmm/algorithm/
s/an other/another/
s/certicates/certificates/
s/exemples/examples/
s/informations/information/
s/optionnal/optional/
2023-03-10 16:19:31 +01:00
Willy Tarreau
7fd8756b26 DOC: config: fix typo "dependeing" in bind thread description
"dependeing" -> "depending".
2023-03-10 14:30:01 +01:00
Willy Tarreau
8f6da64641 MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
This one is printed as the iocb in the "show fd" output, and arguably
this wasn't very convenient as-is:
    293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90)

Let's unstatify it and export it so that the symbol can now be resolved
from the various points that need it.
2023-03-10 14:30:01 +01:00
Frédéric Lécaille
4377dbd756 BUG/MINOR: quic: Missing listener accept queue tasklet wakeups
This bug was revealed by h2load tests run as follows:

      h2load -t 4 --npn-list h3 -c 64 -m 16 -n 16384  -v https://127.0.0.1:4443/

This open (-c) 64 QUIC connections (-n) 16384 h3 requets from (-t) 4 threads, i.e.
256 requests by connection. Such tests could not always pass and often ended with
such results displays by h2load:

finished in 53.74s, 38.11 req/s, 493.78KB/s
requests: 16384 total, 2944 started, 2048 done, 2048 succeeded, 14336
failed, 14336 errored, 0 timeout
status codes: 2048 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 25.92MB (27174537) total, 102.00KB (104448) headers (space
savings 1.92%), 25.80MB (27053569) data
UDP datagram: 3883 sent, 24330 received
                     min         max         mean         sd        ± sd
time for request:    48.75ms    502.86ms    134.12ms     75.80ms    92.68%
time for connect:    20.94ms    331.24ms    189.59ms     84.81ms    59.38%
time to 1st byte:   394.36ms    417.01ms    406.72ms      9.14ms    75.00%
req/s           :       0.00      115.45       14.30       38.13    87.50%

The number of successful requests was always a multiple of 256.
Activating the traces also shew that some connections were blocked after having
successfully completed their handshakes due to the fact that the mux. The mux
is started upon the acceptation of the connection.

Under heavy load, some connections were never accepted. From the moment where
more than 4 (MAXACCEPT) connections were enqueued before a listener could be
woken up to accept at most 4 connections, the remaining connections were not
accepted ore lately at the second listener tasklet  wakeup.

Add a call to tasklet_wakeup() to the accept list tasklet of the listeners to
wake up it if there are remaining connections to accept after having called
listener_accept(). In this case the listener must not be removed of this
accept list, if not at the next call it will not accept anything more.

Must be backported to 2.7 and 2.6.
2023-03-10 14:05:24 +01:00
William Lallemand
2078d4b1f7 BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
In environments where SYSTEM_MAXCONN is defined when compiling, the
master will use this value instead of the original minimal value which
was set to 100. When this happens, the master process could allocate
RAM excessively since it does not need to have an high maxconn. (For
example if SYSTEM_MAXCONN was set to 100000 or more)

This patch fixes the issue by using the new define MASTER_MAXCONN which
define a default maxconn of 100 for the master process.

Must be backported as far as 2.5.
2023-03-09 14:28:44 +01:00
Willy Tarreau
bd3b44edff MINOR: debug: add random delay injection with "debug dev delay-inj"
The goal is to send signals to random threads at random instants so that
they spin for a random delay in a relax() loop, trying to give back the
CPU to another competing hardware thread, in hope that from time to time
this can trigger in critical areas and increase the chances to provoke a
latent concurrency bug. For now none were observed.

For example, this command starts 64 such tasks waking after random delays
of 0-1ms and delivering signals to trigger such loops on 3 random threads:

  for i in {1..64}; do
    socat - /tmp/sock1 <<< "expert-mode on;debug dev delay-inj 2 3"
  done

This command is only enabled when DEBUG_DEV is set at build time.
2023-03-09 14:01:58 +01:00
Willy Tarreau
cd8914bc52 BUG/MAJOR: fd/threads: close a race on closing connections after takeover
As mentioned in commit 237e6a0d6 ("BUG/MAJOR: fd/thread: fix race between
updates and closing FD"), a race was found during stress tests involving
heavy backend connection reuse with many competing closes.

Here the problem is complex. The analysis in commit f69fea64e ("MAJOR:
fd: get rid of the DWCAS when setting the running_mask") that removed
the DWCAS in 2.5 overlooked a few races.

First, a takeover from thread1 could happen just after fd_update_events()
in thread2 validates it holds the tmask bit in the CAS loop. Since thread1
releases running_mask after the operation, thread2 will succeed the CAS
and both will believe the FD is theirs. This does explain the occasional
crashes seen with h1_io_cb() being called on a bad context, or
sock_conn_iocb() seeing conn->subs vanish after checking it. This issue
can be addressed using a DWCAS in both fd_takeover() and fd_update_events()
as it was before the patch above but this is not portable to all archs and
is not easy to adapt for those lacking it, due to some operations still
happening only on individual masks after the thread groups were added.

Second, the checks after fd_clr_running() for the current thread being
the last one is not sufficient: at the exact moment the operation
completes, another thread may also set and drop the running bit and see
itself as alone, and both can call _fd_close_orphan() in parallel. In
order to prevent this from happening, we cannot rely on the absence of
others, we need an explicit flag indicating that the FD must be closed.
One approach that was attempted consisted in playing with the thread_mask
but that was not reliable since it could still match between the late
deletion and the early insertion that follows. Instead, a new FD flag
was added, FD_MUST_CLOSE, that exactly indicates that the call to
_fd_delete_orphan() must be done. It is set by fd_delete(), and
atomically cleared by the first one which checks it, and which is the
only one to call _fd_delete_orphan().

With both points addressed, there's no more visible race left:

- takeover() only happens under the connection list's lock and cannot
  compete with fd_delete() since fd_delete() must first remove the
  connection from the list before deleting the FD. That's also why it
  doesn't need to call _fd_delete_orphan() when dropping its running
  bit.

- takeover() sets its running bit then atomically replaces the thread
  mask, so that until that's done, it doesn't validate the condition
  to end the synchonization loop in fd_update_events(). Once it's OK,
  the previous thread's bit is lost, and this is checked for in
  fd_update_events()

- fd_update_events() can compete with fd_delete() at various places
  which are explained above. Since fd_delete() clears the thread mask
  as after setting its running bit and after setting the FD_MUST_CLOSE
  bit, the synchronization loop guarantees that the thread mask is seen
  before going further, and that once it's seen, the FD_MUST_CLOSE flag
  is already present.

- fd_delete() may start while fd_update_events() has already started,
  but fd_delete() must hold a bit in thread_mask before starting, and
  that is checked by the first test in fd_update_events() before setting
  the running_mask.

- the poller's _update_fd() will not compete against _fd_delete_orphan()
  nor fd_insert() thanks to the fd_grab_tgid() that's always done before
  updating the polled_mask, and guarantees that we never pretend that a
  polled_mask has a bit before the FD is added.

The issue is very hard to reproduce and is extremely time-sensitive.
Some tests were required with a 1-ms timeout with request rates
closely matching 1 kHz per server, though certain tests sometimes
benefitted from saturation. It was found that adding the following
slowdown at a few key places helped a lot and managed to trigger the
bug in 0.5 to 5 seconds instead of tens of minutes on a 20-thread
setup:

    { volatile int i = 10000; while (i--); }

Particularly, placing it at key places where only one of running_mask
or thread_mask is set and not the other one yet (e.g. after the
synchronization loop in fd_update_events or after dropping the
running bit) did yield great results.

Many thanks to Olivier Houchard for this expert help analysing these
races and reviewing candidate fixes.

The patch must be backported to 2.5. Note that 2.6 does not have tgid
in FDs, and that it requires a change of output on fd_clr_running() as
we need the previous bit. This is provided by carefully backporting
commit d6e1987612 ("MINOR: fd: make fd_clr_running() return the previous
value instead"). Tests have shown that the lack of tgid is a showstopper
for 2.6 and that unless a better workaround is found, it could still be
preferable to backport the minimum pieces required for fd_grab_tgid()
to 2.6 so that it stays stable long.
2023-03-09 14:01:48 +01:00
Willy Tarreau
cf0d0eedc7 BUG/MINOR: thread: report thread and group counts in the correct order
In case too many thread groups are needed for the threads, we emit
an error indicating the problem. Unfortunately the threads and groups
counts were reversed.

This can be backported to 2.6.
2023-03-09 11:40:56 +01:00
Willy Tarreau
f5b63277f4 BUG/MINOR: init: properly detect NUMA bindings on large systems
The NUMA detection code tries not to interfer with any taskset the user
could have specified in init scripts. For this it compares the number of
CPUs available with the number the process is bound to. However, the CPU
count is retrieved after being applied an upper bound of MAX_THREADS, so
if the machine has more than 64 CPUs, the comparison always fails and
makes haproxy think the user has already enforced a binding, and it does
not pin it anymore to a single NUMA node.

This can be verified by issuing:

  $ socat /path/to/sock - <<< "show info" | grep thread

On a dual 48-CPU machine it reports 64, implying that threads are allowed
to run on the second socket:

  Nbthread: 64

With this fix, the function properly reports 96, and the output shows 48,
indicating that a single NUMA node was used:

  Nbthread: 48

Of course nothing is changed when "no numa-cpu-mapping" is specified:

  Nbthread: 64

This can be backported to 2.4.
2023-03-09 10:17:37 +01:00
Frédéric Lécaille
be795ceb91 MINOR: quic: Do not stress the peer during retransmissions of lost packets
This issue was revealed by "Multiple streams" QUIC tracker test which very often
fails (locally) with a file of about 1Mbytes (x4 streams). The log of QUIC tracker
revealed that from its point of view, the 4 files were never all received entirely:

 "results" : {
      "stream_0_rec_closed" : true,
      "stream_0_rec_offset" : 1024250,
      "stream_0_snd_closed" : true,
      "stream_0_snd_offset" : 15,
      "stream_12_rec_closed" : false,
      "stream_12_rec_offset" : 72689,
      "stream_12_snd_closed" : true,
      "stream_12_snd_offset" : 15,
      "stream_4_rec_closed" : true,
      "stream_4_rec_offset" : 1024250,
      "stream_4_snd_closed" : true,
      "stream_4_snd_offset" : 15,
      "stream_8_rec_closed" : true,
      "stream_8_rec_offset" : 1024250,
      "stream_8_snd_closed" : true,
      "stream_8_snd_offset" : 15
   },

But this in contradiction with others QUIC tracker logs which confirms that haproxy
has really (re)sent the stream at the suspected offset(stream_12_rec_offset):

       1152085,
       "transport",
       "packet_received",
       {
          "frames" : [
             {
                "frame_type" : "stream",
                "length" : "155",
                "offset" : "72689",
                "stream_id" : "12"
             }
          ],
          "header" : {
             "dcid" : "a14479169ebb9dba",
             "dcil" : "8",
             "packet_number" : "466",
             "packet_size" : 190
          },
          "packet_type" : "1RTT"
       }

When detected as losts, the packets are enlisted, then their frames are
requeued in their packet number space by qc_requeue_nacked_pkt_tx_frms().
This was done using a local list which was spliced to the packet number
frame list. This had as bad effect to retransmit the frames in the inverse
order they have been sent. This is something the QUIC tracker go client
does not like at all!

Removing the frame splicing fixes this issue and allows haproxy to pass the
"Multiple streams" test.

Must be backported to 2.7.
2023-03-08 18:50:45 +01:00
Willy Tarreau
9b773ec118 CLEANUP: sock: always perform last connection updates before wakeup
Normally the task_wakeup() in sock_conn_io_cb() is expected to
happen on the same thread the FD is attached to. But due to the
way the code was arranged in the past (with synchronous callbacks)
we continue to update connections after the wakeup, which always
makes the reader have to think deeply whether it's possible or not
to call another thread there. Let's just move the tasklet_wakeup()
at the end to make sure there's no problem with that.
2023-03-08 16:07:32 +01:00
Willy Tarreau
677c006c5c MINOR: fd/cli: report the polling mask in "show fd"
It's missing and often needed when trying to debug a situation, let's
report the polling mask as well in "show fd".
2023-03-08 16:07:32 +01:00
Frédéric Lécaille
cc101cd2aa BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
This bug arrived with this commit:
     b5a8020e9 MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
and was revealed by h3 interop tests with clients like s2n-quic and quic-go
as noticed by Amaury.

Indeed, one must check that the CID matching the sequence number provided by a received
RETIRE_CONNECTION_ID frame does not match the DCID of the packet.
Remove useless ->curr_cid_seq_num member from quic_conn struct.
The sequence number lookup must be done in qc_handle_retire_connection_id_frm()
to check the validity of the RETIRE_CONNECTION_ID frame, it returns the CID to be
retired into <cid_to_retire> variable passed as parameter to this function if
the frame is valid and if the CID was not already retired

Must be backported to 2.7.
2023-03-08 14:53:12 +01:00
Amaury Denoyelle
5907fede87 MEDIUM: quic: release closing connections on stopping
Since the following commit :
  commit fb375574f9
  MINOR: quic: mark quic-conn as jobs on socket allocation

quic-conn instances are marked as jobs. This prevent haproxy process to
stop while there is transfer in progress. To not delay process
termination, idle connections are woken up through their MUX instances
to be able to release them immediately.

However, there is no mechanism to wake up quic connections left on
closing or draining state. This means that haproxy process termination
is delayed until every closing quic connections timer has expired.

To improve this, a new function quic_handle_stopping() is called when
haproxy process is stopping. It simply wakes up the idle timer task of
all connections in the global closing list. These connections will thus
be released immediately to not interrupt haproxy process stopping.

This should be backported up to 2.7.
2023-03-08 14:41:28 +01:00
Amaury Denoyelle
2d37629222 MINOR: quic: handle new closing list in show quic
A new global quic-conn list has been added by the previous patch. It will
contain every quic-conn in closing or draining state.

Thus, it is now easier to include or skip them on a "show quic" output :
when the default list on the current thread has been browsed entirely,
either we skip to the next thread or we look at the closing list on the
current thread.

This should be backported up to 2.7.
2023-03-08 14:39:48 +01:00
Amaury Denoyelle
efed86c973 MINOR: quic: create a global list dedicated for closing QUIC conns
When a CONNECTION_CLOSE is emitted or received, a QUIC connection enters
respectively in draining or closing state. These states are a loose
equivalent of TCP TIME_WAIT. No data can be exchanged anymore but the
connection is maintained during a certain timer to handle packet
reordering or loss.

A new global list has been defined for QUIC connections in
closing/draining state inside thread_ctx structure. Each time a
connection enters in one of this state, it will be moved from the
default global list to the new closing list.

The objective of this patch is to quickly filter connections on
closing/draining. Most notably, this will be used to wake up these
connections and avoid that haproxy process stopping is delayed by them.

A dedicated function qc_detach_th_ctx_list() has been implemented to
transfer a quic-conn from one list instance to the other. This takes
care of back-references attach to a quic-conn instance in case of a
running "show quic".

This should be backported up to 2.7.
2023-03-08 14:39:48 +01:00
Amaury Denoyelle
815c8ce210 MINOR: h3: add traces on h3_init_uni_stream() error paths
Complete traces on h3_init_uni_stream(). This ensures there is always a
dedicated trace for each error paths.

This should be backported up to 2.7.
2023-03-08 14:32:30 +01:00
Remi Tricot-Le Breton
447a38f387 MINOR: jwt: Add support for RSA-PSS signatures (PS256 algorithm)
This patch adds the support for the PS algorithms when verifying JWT
signatures (rsa-pss). It was not managed during the first implementation
and previously raised an "Unmanaged algorithm" error.
The tests use the same rsa signature as the plain rsa tests (RS256 ...)
and the implementation simply adds a call to
EVP_PKEY_CTX_set_rsa_padding in the function that manages rsa and ecdsa
signatures.
The signatures in the reg-test were built thanks to the PyJWT python
library once again.
2023-03-08 10:43:04 +01:00
Aurelien DARRAGON
bce0c0c37a BUG/MINOR: dns: fix ring offset calculation in dns_resolve_send()
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") relative offset calculation was fixed in dns_session_io_handler()
and dns_process_req() functions.

But if we compare with the changes performed in the patch that introduced
the bug: d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail
instead of absolute"), we can see that dns_resolve_send() is missing from
the patch.

Applying both 737d10f + ("BUG/MINOR: dns: fix ring offset calculation on
first read") to dns_resolve_send() function.
With this last commit, we should be back at pre d9c7188 behavior.

No backport needed.
2023-03-08 08:57:13 +01:00
Aurelien DARRAGON
5a43db2c5d BUG/MINOR: dns: fix ring offset calculation on first read
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") ring offset is now properly re-adjusted in dns_session_io_handler()
and dns_process_req().

But the previous patch does not cope well if the first read is performed
on a non-empty ring since relative ofs will be computed from ds->ofs=0 or
dss->ofs_req=0.
In this case: relative offset could become invalid since we mix up relative
offsets with absolute offsets.

To fix this, we apply the same logic performed in d9c7188 ("MEDIUM: ring:
make the offset relative to the head/tail instead of absolute") for the
cli_io_handler_show_ring() function: that is using b_peek_ofs(buf, 0) to
set the contextual offset instead of hard-coding it to 0.

This should be considered as a minor bugfix since this bug was discovered by
reading the code: 737d10f already survived a good amount of stress-tests as
shown in GH #2068.

No backport needed as 737d10f is not marked for backports.
2023-03-08 08:56:30 +01:00
Aurelien DARRAGON
2c98867187 BUG/MEDIUM: sink/forwarder: ensure ring offset is properly readjusted to head
Since d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail instead
of absolute"), ring offset calculation has changed: we don't rely on ring->ofs
absolute offset anymore.

But with the above patch, relative offset is not properly calculated in
sink_forward_oc_io_handler() and sink_forward_io_handler().

The issue here is the same as 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head") since dns and sink_forward share the same
ring logic:

When the ring is becoming full, ring_write() will try to regain some space to
insert new data by calling b_del() on older messages. Here b_del() moves
buffer's head under the hood, and since ring->ofs cannot be used to "correct"
the relative offset, both sink_forward_oc_io_handler() and
sink_forward_io_handler() start to get invalid offset.
At this point, we will suffer from ring data corruption resulting in unexpected
behavior or process crashes.

This can be easily demonstrated with the following test:

    |log-forward syslog
    |  dgram-bind 127.0.0.1:5114
    |  log ring@logbuffer local0
    |
    |ring logbuffer
    |  format rfc5424
    |  size 16384
    |  server logserver 127.0.0.1:5114

Haproxy will forward incoming logs on udp@127.0.0.1:5114 to
tcp@127.0.0.1:5114

Then use the following tcp server:
  nc -l -p 5114

With the following udp log sender:
    |while [ 1 ]
    |do
    |  logger --udp  --server 127.0.0.1 -P 5114 -p user.warn "Test 7"
    |done

Once the ring buffer is full (it takes less that a second to fill the 16k
buffer) haproxy starts to misbehave and the log forwarding stops.

We apply the same fix as in 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head").
Please note the ~0 case that is handled slightly differently in this patch:
this is required to properly start reading from a non-empty ring. This case
will be fixed in dns related code in the following patch.

This does not need to be backported as d9c7188 was not marked for backports.
2023-03-08 08:54:43 +01:00
Frédéric Lécaille
5e3201ea77 MINOR: quic: Add transport parameters to "show quic"
Modify quic_transport_params_dump() and others function relative to the
transport parameters value dump from TRACE() to make their output more
compact.
Add call to quic_transport_params_dump() to dump the transport parameters
from "show quic" CLI command.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
ece86e64c4 MINOR: quic: Add spin bit support
Add QUIC_FL_RX_PACKET_SPIN_BIT new RX packet flag to mark an RX packet as having
the spin bit set. Idem for the connection with QUIC_FL_CONN_SPIN_BIT flag.
Implement qc_handle_spin_bit() to set/unset QUIC_FL_CONN_SPIN_BIT for the connection
as soon as a packet number could be deciphered.
Modify quic_build_packet_short_header() to set the spin bit when building
a short packet header.

Validated by quic-tracker spin bit test.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
433af7fad9 MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
These allocations are definitively useless.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
8ac8a8778d MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
Add ->curr_cid_seq_num new quic_conn struct frame to store the connection
ID sequence number currently used by the connection.
Implement qc_handle_retire_connection_id_frm() to handle this RX frame.
Implement qc_retire_connection_seq_num() to remove a connection ID from its
sequence number.
Implement qc_build_new_connection_id_frm to allocate a new NEW_CONNECTION_ID
frame from a CID.
Modify qc_parse_pkt_frms() which parses the frames of an RX packet to handle
the case of the RETIRE_CONNECTION_ID frame.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
904caac3e4 MINOR: quic: Typo fix for ACK_ECN frame
Wrong name displayed by TRACE().

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
b4c5471425 MINOR: quic: Store the next connection IDs sequence number in the connection
Add ->next_cid_seq_num new member to quic_conn struct to store the next
connection ID to be used to alloacated a connection ID.
It is initialized to 0 from qc_new_conn() which initializes a connection.
Modify new_quic_cid() to use this variable each time it is called without
giving the possibility to the caller to pass the sequence number for the
connection to be allocated.

Modify quic_build_post_handshake_frames() to use ->next_cid_seq_num
when building NEW_CONNECTION_ID frames after the hanshake has been completed.
Limit the number of connection IDs provided to the peer to the minimum
between 4 and the value it sent with active_connection_id_limit transport
parameter. This includes the connection ID used by the connection to send
this new connection IDs.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00
Frédéric Lécaille
4afbca611f MINOR: quic: Do not accept wrong active_connection_id_limit values
A peer must not send active_connection_id_limit values smaller than 2
which is also the minimum value when not sent.

Make the transport parameters decoding fail in this case.

Must be backported to 2.7.
2023-03-08 08:50:54 +01:00