From cb91ccd8a8b548d85bf5155b1a8ce759f1480a4e Mon Sep 17 00:00:00 2001
From: Amaury Denoyelle <adenoyelle@haproxy.com>
Date: Wed, 22 Jan 2025 17:26:13 +0100
Subject: [PATCH] MEDIUM: quic: use dynamic credit for pacing

Major improvements have been introduced in pacing recently. Most
notably, QMUX schedules emission on a millisecond resolution, which
allows the use of passive waiting and is much more CPU friendly.

However, an issue remains with the pacing max credit. Unless BBR is
used, it is fixed to the value configured in the quic-cc-algo bind
statement. This is not practical: if the value is too low, it may
drastically reduce performance due to the 1ms sleep resolution. If it
is too high, some clients will suffer from too much packet loss.

This commit fixes the issue by implementing a dynamic maximum credit
value based on the network conditions specific to each client. The
maximum is calculated so that the current QMUX tasklet context can emit
enough data to cover the delay until the next tasklet invocation. As
such, avg_loop_us is used to detect the process load. If it is too
small, 1.5ms is used as a minimal value, to cover the extra delay
incurred by the system on top of the default 1ms sleep.

This should be backported up to 3.1.
---
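Note: the max credit calculation described in the commit message can be
sketched in isolation as below. This is only an illustrative sketch, not
the code from the patch: the function name quic_credit_max_sketch() and
its two plain integer parameters are hypothetical stand-ins for
swrate_avg(activity[tid].avg_loop_us, TIME_STATS_SAMPLES) and for the
result of pacing_inter(), and it assumes a non-zero packet interval.

```c
#include <stdint.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical standalone helper, for illustration only.
 *   avg_loop_us: average scheduler loop time, in microseconds
 *   inter_ns:    interval between two packets, in nanoseconds (assumed != 0)
 * Returns the maximum pacing credit, in packets.
 */
static uint64_t quic_credit_max_sketch(uint64_t avg_loop_us, uint64_t inter_ns)
{
	uint64_t wakeup_delay, credit_max;

	/* Expected delay until the next wakeup: at least the 1ms sleep. */
	wakeup_delay = MAX(avg_loop_us, UINT64_C(1000));
	/* Convert us to ns (x1000) and apply the 1.5 tolerance factor. */
	wakeup_delay *= 1500;
	/* Number of packets which fit into that delay at the pacing rate. */
	credit_max = wakeup_delay / inter_ns;
	/* Never go below 2 packets. */
	return MAX(credit_max, UINT64_C(2));
}
```

With a lightly loaded thread (200us average loop) and one packet every
100us, the delay is clamped to 1ms, scaled to 1.5ms, and the max credit
is 15 packets. With a 4ms average loop and the same interval, it grows
to 60 packets. With a very slow rate (one packet every 2ms), the floor
of 2 packets applies.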
 doc/configuration.txt         | 20 +++++++-------------
 include/haproxy/quic_pacing.h |  2 +-
 src/quic_pacing.c             | 19 +++++++++++++++++--
 3 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7b133d637..56864003b 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -17267,7 +17267,7 @@ quic-cc-algo { cubic | newreno | bbr | nocc }[(<args,...>)]
   Default value: cubic
 
   It is possible to enable pacing if the algorithm is compatible. This is done
-  by specifying an optional burst argument as described in the next paragraph.
+  by setting an optional integer argument as described in the next paragraph.
   The purpose of pacing is to smooth emission of data to reduce network losses.
   In most scenario, it can significantly improve network throughput by avoiding
   retransmissions. Pacing support is still experimental, as such it requires
@@ -17283,24 +17283,18 @@ quic-cc-algo { cubic | newreno | bbr | nocc }[(<args,...>)]
   mandatory order of each parameters :
   - maximum window size in bytes. It must be greater than 10k and smaller than
     4g. By default "tune.quic.frontend.default-max-window-size" value is used.
-  - burst size in datagrams. By default, it is set to 0, which means unlimited.
-    A positive value up to 1024 can be specified to smooth emission using
-    pacing. Lower values provide a smoother traffic (hence less losses) at the
-    expense of a higher CPU usage, while higher values will reduce CPU usage
-    and provide a slightly more bursty traffic. Note that a datagram is usually
-    around 1252 bytes, and that a typical receive buffer is 208kB or 170
-    datagrams, so in order to keep the traffic smooth, bursts should only
-    represent a small fraction of this value (between a few units to a few tens
-    at most). See above paragraph for more explanation. This parameter is
-    ignored by BBR.
+  - pacing activation. By default, it is set to 0, which means pacing is not
+    used. To activate it, specify a positive value. Burst size will be
+    dynamically adjusted to adapt to the network conditions. This parameter is
+    ignored by BBR as pacing is automatically activated for this algorithm.
 
   Example:
       # newreno congestion control algorithm
       quic-cc-algo newreno
       # cubic congestion control algorithm with one megabytes as window
       quic-cc-algo cubic(1m)
-      # cubic with pacing on top of it, with burst limited to 12 datagrams
-      quic-cc-algo cubic(,12)
+      # cubic with pacing activated on top of it
+      quic-cc-algo cubic(,1)
 
   A special value "nocc" may be used to force a fixed congestion window always
   set at the maximum size. It is reserved for debugging scenarios to remove any
diff --git a/include/haproxy/quic_pacing.h b/include/haproxy/quic_pacing.h
index 582a489b8..83251a74f 100644
--- a/include/haproxy/quic_pacing.h
+++ b/include/haproxy/quic_pacing.h
@@ -11,7 +11,7 @@ static inline void quic_pacing_init(struct quic_pacer *pacer,
 {
 	pacer->cc = cc;
 	pacer->cur = 0;
-	pacer->credit = cc->algo->pacing_burst(cc);
+	pacer->credit = 0;
 }
 
 void quic_pacing_sent_done(struct quic_pacer *pacer, int sent);
diff --git a/src/quic_pacing.c b/src/quic_pacing.c
index 427c3b406..000c957b2 100644
--- a/src/quic_pacing.c
+++ b/src/quic_pacing.c
@@ -21,14 +21,29 @@ int quic_pacing_reload(struct quic_pacer *pacer)
 {
 	const uint64_t task_now_ns = task_mono_time();
 	const uint64_t inter = pacer->cc->algo->pacing_inter(pacer->cc);
-	uint64_t inc;
+	uint64_t inc, wakeup_delay;
 	uint credit_max;
 
 	if (task_now_ns > pacer->cur) {
 		/* Calculate number of packets which could have been emitted since last emission sequence. Result is rounded up. */
 		inc = (task_now_ns - pacer->cur + inter - 1) / inter;
 
-		credit_max = pacer->cc->algo->pacing_burst(pacer->cc);
+		/* Credit must not exceed a maximal value to guarantee a
+		 * smooth emission. This max value represents the number of
+		 * packets based on congestion window and RTT which can be sent
+		 * to cover the sleep until the next wakeup. This delay is
+		 * roughly the max of the scheduler delay and 1ms.
+		 */
+
+		/* Calculate wakeup_delay to determine max credit value. */
+		wakeup_delay = MAX(swrate_avg(activity[tid].avg_loop_us, TIME_STATS_SAMPLES), 1000);
+		/* Convert it to nanoseconds. Use 1.5 factor tolerance to try to cover the imponderable extra system delay until the next wakeup. */
+		wakeup_delay *= 1500;
+		/* Determine max credit from wakeup_delay and packet rate emission. */
+		credit_max = wakeup_delay / inter;
+		/* Ensure max credit will never be smaller than 2. */
+		credit_max = MAX(credit_max, 2);
+		/* Apply max credit on the new value. */
 		pacer->credit = MIN(pacer->credit + inc, credit_max);
 
 		/* Refresh pacing reload timer. */