MINOR: mux-h1: perform a graceful close at 75% glitches threshold

This avoids hitting the hard wall for connections with non-compliant
peers that are accumulating errors. The connection is recycled early
enough for the glitch counter to be reset before it reaches the
threshold. Example below with a threshold set to 100:

Before, 1% errors:
  $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445
  #     time conns tot_conn  tot_req      tot_bytes    err  cps  rps  bps   ttfb
           1     1     1039   103872        6763365   1038 1k03 103k 54M1 9.426u
           2     1     2128   212793       14086140   2127 1k08 108k 58M5 8.963u
           3     1     3215   321465       21392137   3214 1k08 108k 58M3 8.982u
           4     1     4307   430684       28735013   4306 1k09 109k 58M6 8.935u
           5     1     5390   538989       36016294   5389 1k08 108k 58M1 9.021u

After, no more errors:
  $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445
  #     time conns tot_conn  tot_req      tot_bytes    err  cps  rps  bps   ttfb
           1     1     1509   113161        7487809      0 1k50 113k 59M9 8.482u
           2     1     3002   225101       15114659      0 1k49 111k 60M9 8.582u
           3     1     4508   338045       22809911      0 1k50 112k 61M5 8.523u
           4     1     5971   447785       30286861      0 1k46 109k 59M7 8.772u
           5     1     7472   560335       37955271      0 1k49 112k 61M2 8.537u
2026-03-15 12:01:37 +01:00 · 2025-12-20 16:48:15 +01:00 · 2025-12-20 16:48:15 +01:00 · 5904f8279b
commit 5904f8279b
parent 05b457002b
2 changed files with 23 additions and 4 deletions
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -4211,6 +4211,10 @@ tune.h1.be.glitches-threshold <number>
  probably be in the hundreds or thousands to be effective without affecting
  slightly bogus servers. It is also possible to only kill connections when the
  CPU usage crosses a certain level, by using "tune.glitches.kill.cpu-usage".
+  Note that a graceful close is attempted at 75% of the configured threshold by
+  advertising a GOAWAY for a future stream. This ensures that a slightly faulty
+  connection will stop being used after some time without risking to interrupt
+  ongoing transfers.

  See also: tune.h1.fe.glitches-threshold, bc_glitches, and
            tune.glitches.kill.cpu-usage
@@ -4226,6 +4230,11 @@ tune.h1.fe.glitches-threshold <number>
  probably be in the hundreds or thousands to be effective without affecting
  slightly bogus clients. It is also possible to only kill connections when the
  CPU usage crosses a certain level, by using "tune.glitches.kill.cpu-usage".
+  Note that a graceful close is attempted at 75% of the configured threshold by
+  advertising a GOAWAY for a future stream. This ensures that a slightly non-
+  compliant client will have the opportunity to create a new connection and
+  continue to work unaffected without ever triggering the hard close thus
+  risking to interrupt ongoing transfers.

  See also: tune.h1.be.glitches-threshold, fc_glitches, and
            tune.glitches.kill.cpu-usage
--- a/src/mux_h1.c
+++ b/src/mux_h1.c
@@ -525,10 +525,20 @@ static inline int _h1_report_glitch(struct h1c *h1c, int increment)
 		h1_be_glitches_threshold : h1_fe_glitches_threshold;

 	h1c->glitches += increment;
-	if (thres && h1c->glitches >= thres &&
-	    (th_ctx->idle_pct <= global.tune.glitch_kill_maxidle)) {
-		h1c->flags |= H1C_F_ERROR;
-		return 1;
+	if (unlikely(thres && h1c->glitches >= (thres * 3 + 1) / 4)) {
+		/* at 75% of the threshold, we switch to close mode
+		 * to force clients to periodically reconnect.
+		 */
+		h1c->h1s->flags = (h1c->h1s->flags & ~H1S_F_WANT_MSK) | H1S_F_WANT_CLO;
+
+		/* at 100% of the threshold and excess of CPU usage we also
+		 * actively kill the connection.
+		 */
+		if (h1c->glitches >= thres &&
+		    (th_ctx->idle_pct <= global.tune.glitch_kill_maxidle)) {
+			h1c->flags |= H1C_F_ERROR;
+			return 1;
+		}
 	}
 	return 0;
 }
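
The two-stage decision in the hunk above can be sketched in isolation to check the arithmetic. This is only an illustrative model, not HAProxy code: the enum, `soft_threshold()` and `classify()` are hypothetical names, and the idle percentage and kill limit are passed as plain integers instead of being read from `th_ctx` and `global.tune`. The only thing taken from the patch is the expression `(thres * 3 + 1) / 4` and the order of the two checks.

```c
/* Hypothetical model of the decision made in _h1_report_glitch().
 * None of these names exist in HAProxy; they only mirror the patch logic.
 */
enum glitch_action {
	GLITCH_NONE,            /* below 75%: nothing to do */
	GLITCH_GRACEFUL_CLOSE,  /* 75% reached: switch the stream to close mode */
	GLITCH_KILL             /* 100% reached and CPU busy: flag the connection in error */
};

/* Integer approximation of 75% of the threshold, using the same expression
 * as the patch. For any threshold >= 1 the result is at least 1.
 */
static int soft_threshold(int thres)
{
	return (thres * 3 + 1) / 4;
}

/* idle_pct and kill_maxidle stand in for th_ctx->idle_pct and
 * global.tune.glitch_kill_maxidle (assumed here to be 0..100).
 */
static enum glitch_action classify(int glitches, int thres,
                                   int idle_pct, int kill_maxidle)
{
	/* a zero threshold disables the mechanism entirely */
	if (!thres || glitches < soft_threshold(thres))
		return GLITCH_NONE;

	/* the hard kill also requires the thread to be busy enough */
	if (glitches >= thres && idle_pct <= kill_maxidle)
		return GLITCH_KILL;

	return GLITCH_GRACEFUL_CLOSE;
}
```

With a threshold of 100, as in the commit message example, `soft_threshold(100)` is 75, so the connection enters close mode 25 glitches before the hard limit. Note that in the patch the close mode is also set on the path that ends in the hard kill; the model simply reports the strongest action taken.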