BUG/MINOR: freq_ctr: fix possible negative rate with the scaled API

In 1.9, commit 627505d36 ("MINOR: freq_ctr: add swrate_add_scaled()
to work with large samples") brought the ability to indicate, when
adding some values, that they represent a number of samples. However
there is an issue in the calculation: the number of samples that is
added to the sum before the division, in order to avoid fading away
too fast, is multiplied by the scale. Since this happens in the
negative part of the expression, the subtracted term ends up being s
too large, and as soon as the sum of old_sum and v*s is too small
(e.g. zero), we end up with a negative value of -s.
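
To illustrate with concrete numbers, here is a standalone sketch (not
the real HAProxy code: div64_32() is approximated with a plain 64-bit
division and the CAS loop is omitted):

#include <stdio.h>

/* stand-in for HAProxy's div64_32(): 64-bit dividend, 32-bit divisor */
static unsigned int div64_32(unsigned long long o1, unsigned int o2)
{
	return o1 / o2;
}

int main(void)
{
	unsigned int old_sum = 0, v = 0, s = 512, n = 1024;

	/* old formula: the rounding term n is multiplied by the scale s,
	 * so the subtracted part equals old_sum*s/n + s; with old_sum and
	 * v both zero the result underflows to (unsigned int)-s.
	 */
	unsigned int bad  = old_sum + v * s - div64_32((unsigned long long)(old_sum + n) * s, n);

	/* fixed formula: round the division up without adding the extra s */
	unsigned int good = old_sum + v * s - div64_32((unsigned long long)old_sum * s + n - 1, n);

	printf("old: %u (as int: %d)\n", bad, (int)bad); /* 4294966784, i.e. -512 */
	printf("new: %u\n", good);                       /* 0 */
	return 0;
}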

This is visible in "show pools", which has occasionally reported a very
large value in "needed_avg" since 2.9, though the bug has been there for
longer. Indeed, since 2.9 these counters are hashed into buckets, so it
suffices that any thread hit this error once for the total sum to be
wrong. One possible impact is memory usage not shrinking after a short
burst, because pools refrain from releasing objects, believing they
don't have enough of them.
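
For instance (a hypothetical sketch of the per-bucket summation, not
the actual pool accounting code):

#include <stdio.h>

int main(void)
{
	/* three sane per-thread buckets plus one that underflowed to
	 * (unsigned int)-512: summing them yields an absurdly large
	 * "needed_avg"-style total.
	 */
	unsigned int buckets[4] = { 100, 250, (unsigned int)-512, 80 };
	unsigned long long total = 0;

	for (int i = 0; i < 4; i++)
		total += buckets[i];

	printf("total: %llu\n", total); /* 4294967214 instead of 430 */
	return 0;
}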

This must be backported to all versions. Note that the hunk touching the
opportunistic version can be dropped when backporting to versions before
2.8.
Author: Willy Tarreau
Date:   2023-09-14 11:00:07 +02:00
Parent: 148f145d32
Commit: e3b2704e26

@@ -343,7 +343,7 @@ static inline unsigned int swrate_add_scaled(unsigned int *sum, unsigned int n,
 	old_sum = *sum;
 	do {
-		new_sum = old_sum + v * s - div64_32((unsigned long long)(old_sum + n) * s, n);
+		new_sum = old_sum + v * s - div64_32((unsigned long long)old_sum * s + n - 1, n);
 	} while (!HA_ATOMIC_CAS(sum, &old_sum, new_sum) && __ha_cpu_relax());
 	return new_sum;
 }
@@ -378,7 +378,7 @@ static inline uint swrate_add_scaled_opportunistic(uint *sum, uint n, uint v, ui
 	uint new_sum, old_sum;
 	old_sum = *sum;
-	new_sum = old_sum + v * s - div64_32((unsigned long long)(old_sum + n) * s, n);
+	new_sum = old_sum + v * s - div64_32((unsigned long long)old_sum * s + n - 1, n);
 	HA_ATOMIC_CAS(sum, &old_sum, new_sum);
 	return new_sum;
 }
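
As a sanity check, here is a standalone single-threaded simulation of
the fixed formula (again with div64_32() approximated by a plain 64-bit
division, and the atomic CAS replaced by a direct store): fed a constant
value it converges near v * n, and fed zeroes afterwards it fades back
to 0 without ever wrapping negative.

#include <stdio.h>

static unsigned int div64_32(unsigned long long o1, unsigned int o2)
{
	return o1 / o2;
}

/* fixed formula, without the atomicity of the real function */
static unsigned int swrate_add_scaled(unsigned int *sum, unsigned int n,
                                      unsigned int v, unsigned int s)
{
	*sum = *sum + v * s - div64_32((unsigned long long)*sum * s + n - 1, n);
	return *sum;
}

int main(void)
{
	unsigned int sum = 0, n = 1024;
	int i;

	/* constant load: value 100, 8 samples at a time -> sum close to 100 * n */
	for (i = 0; i < 4096; i++)
		swrate_add_scaled(&sum, n, 100, 8);
	printf("busy: sum=%u avg~%u\n", sum, (sum + n / 2) / n);

	/* idle period: the sum fades to zero and stays non-negative */
	for (i = 0; i < 4096; i++)
		swrate_add_scaled(&sum, n, 0, 8);
	printf("idle: sum=%u\n", sum);
	return 0;
}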