MEDIUM: quic: use dynamic credit for pacing

Major improvements have been introduced in pacing recently. Most
notably, QMUX now schedules emission with a millisecond resolution,
which allows the use of passive wait and is much more CPU friendly.

However, an issue remains with the pacing max credit. Unless BBR is
used, it is fixed to the value configured via the quic-cc-algo bind
statement. This is not practical: if too low, it may drastically
reduce performance due to the 1ms sleep resolution; if too high, some
clients will suffer from too much packet loss.

This commit fixes the issue by implementing a dynamic maximum credit
value based on the network conditions specific to each client. The
maximum is calculated so that the current QMUX tasklet context can emit
enough data to cover the delay until the next tasklet invocation. As
such, avg_loop_us is used to detect the process load. If it is too
small, 1.5ms is used as the minimal value, to cover the extra delay
incurred by the system for a default 1ms sleep.

This should be backported up to 3.1.
Author: Amaury Denoyelle 2025-01-22 17:26:13 +01:00
parent 8098be1fdc
commit cb91ccd8a8
3 changed files with 25 additions and 16 deletions


@@ -17267,7 +17267,7 @@ quic-cc-algo { cubic | newreno | bbr | nocc }[(<args,...>)]
Default value: cubic
It is possible to enable pacing if the algorithm is compatible. This is done
-by specifying an optional burst argument as described in the next paragraph.
+by setting an optional integer argument as described in the next paragraph.
The purpose of pacing is to smooth emission of data to reduce network losses.
In most scenario, it can significantly improve network throughput by avoiding
retransmissions. Pacing support is still experimental, as such it requires
@@ -17283,24 +17283,18 @@ quic-cc-algo { cubic | newreno | bbr | nocc }[(<args,...>)]
mandatory order of each parameters :
  - maximum window size in bytes. It must be greater than 10k and smaller than
    4g. By default "tune.quic.frontend.default-max-window-size" value is used.
-  - burst size in datagrams. By default, it is set to 0, which means unlimited.
-    A positive value up to 1024 can be specified to smooth emission using
-    pacing. Lower values provide a smoother traffic (hence less losses) at the
-    expense of a higher CPU usage, while higher values will reduce CPU usage
-    and provide a slightly more bursty traffic. Note that a datagram is usually
-    around 1252 bytes, and that a typical receive buffer is 208kB or 170
-    datagrams, so in order to keep the traffic smooth, bursts should only
-    represent a small fraction of this value (between a few units to a few tens
-    at most). See above paragraph for more explanation. This parameter is
-    ignored by BBR.
+  - pacing activation. By default, it is set to 0, which means pacing is not
+    used. To activate it, specify a positive value. Burst size will be
+    dynamically adjusted to adapt to the network conditions. This parameter is
+    ignored by BBR as pacing is automatically activated for this algorithm.
Example:
# newreno congestion control algorithm
quic-cc-algo newreno
# cubic congestion control algorithm with one megabytes as window
quic-cc-algo cubic(1m)
-  # cubic with pacing on top of it, with burst limited to 12 datagrams
-  quic-cc-algo cubic(,12)
+  # cubic with pacing activated on top of it
+  quic-cc-algo cubic(,1)
A special value "nocc" may be used to force a fixed congestion window always
set at the maximum size. It is reserved for debugging scenarios to remove any


@@ -11,7 +11,7 @@ static inline void quic_pacing_init(struct quic_pacer *pacer,
{
	pacer->cc = cc;
	pacer->cur = 0;
-	pacer->credit = cc->algo->pacing_burst(cc);
+	pacer->credit = 0;
}

void quic_pacing_sent_done(struct quic_pacer *pacer, int sent);


@@ -21,14 +21,29 @@ int quic_pacing_reload(struct quic_pacer *pacer)
{
	const uint64_t task_now_ns = task_mono_time();
	const uint64_t inter = pacer->cc->algo->pacing_inter(pacer->cc);
-	uint64_t inc;
+	uint64_t inc, wakeup_delay;
	uint credit_max;

	if (task_now_ns > pacer->cur) {
		/* Calculate number of packets which could have been emitted
		 * since last emission sequence. Result is rounded up.
		 */
		inc = (task_now_ns - pacer->cur + inter - 1) / inter;
-		credit_max = pacer->cc->algo->pacing_burst(pacer->cc);
+		/* Credit must not exceed a maximal value to guarantee a
+		 * smooth emission. This max value represents the number of
+		 * packets based on congestion window and RTT which can be
+		 * sent to cover the sleep until the next wakeup. This delay
+		 * is roughly the max between the scheduler delay or 1ms.
+		 */
+
+		/* Calculate wakeup_delay to determine max credit value. */
+		wakeup_delay = MAX(swrate_avg(activity[tid].avg_loop_us, TIME_STATS_SAMPLES), 1000);
+		/* Convert it to nanoseconds. Use 1.5 factor tolerance to try
+		 * to cover the imponderable extra system delay until the
+		 * next wakeup.
+		 */
+		wakeup_delay *= 1500;
+		/* Determine max credit from wakeup_delay and packet rate emission. */
+		credit_max = wakeup_delay / inter;
+		/* Ensure max credit will never be smaller than 2. */
+		credit_max = MAX(credit_max, 2);
+
+		/* Apply max credit on the new value. */
		pacer->credit = MIN(pacer->credit + inc, credit_max);

		/* Refresh pacing reload timer. */