From b9bf16b3827c58e61bf4cf27b30e14484fd2b6e5 Mon Sep 17 00:00:00 2001
From: Willy Tarreau
Date: Wed, 24 Apr 2024 11:37:06 +0200
Subject: [PATCH] BUG/MINOR: h1: fix detection of upper bytes in the URI

In 1.7 with commit 5f10ea30f4 ("OPTIM: http: improve parsing performance
of long URIs") we improved the URI parser's performance on platforms
supporting unaligned accesses by reading 4 chars at a time in a 32-bit
word. However, as reported in GH issue #2545, there's a bug in the way
the top bytes are checked, as the parser will stop when all 4 of them
are above 7e instead of when one of them is, so certain patterns can be
let through if the last ones are all valid. The fix requires negating
the value, but on the other hand it allows some of the tests to be
parallelized and the masks to be fused, which could even end up slightly
faster.

This needs to be backported to all stable versions, but be careful, this
code moved a lot over time, from proto_http.c to h1.c, to http_msg.c, to
h1.c again. Better just grep for "24242424" or "21212121" in each version
to find it.

Big kudos to Martijn van Oosterhout (@kleptog) for spotting this problem
while analyzing that piece of code, and reporting it.
---
 src/h1.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/h1.c b/src/h1.c
index dab638787..0a548937a 100644
--- a/src/h1.c
+++ b/src/h1.c
@@ -575,12 +575,9 @@ int h1_headers_to_hdr_list(char *start, const char *stop,
 #ifdef HA_UNALIGNED_LE
 		/* speedup: skip bytes not between 0x24 and 0x7e inclusive */
 		while (ptr <= end - sizeof(int)) {
-			int x = *(int *)ptr - 0x24242424;
-			if (x & 0x80808080)
-				break;
+			uint x = *(uint *)ptr;
 
-			x -= 0x5b5b5b5b;
-			if (!(x & 0x80808080))
+			if (((x - 0x24242424) | (0x7e7e7e7e - x)) & 0x80808080U)
 				break;
 
 			ptr += sizeof(int);
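
For reviewers who want to compare the two tests outside of the parser, here
is a minimal standalone sketch. The function names old_check_stops() and
new_check_stops() are hypothetical and do not exist in HAProxy, and uint32_t
is used instead of int/uint so that the wrap-around is well defined. Each
function returns non-zero when the fast loop would stop on the given 32-bit
word and leave the remaining bytes to the byte-by-byte path.

```c
#include <stdint.h>

/* Old test (hypothetical wrapper): stops when a byte is below 0x24, but for
 * the upper bound it only stops when no byte has its top bit set after the
 * second subtraction, i.e. roughly when all four bytes are above 0x7e.
 */
static int old_check_stops(uint32_t w)
{
	uint32_t x = w - 0x24242424U;

	if (x & 0x80808080U)
		return 1;

	x -= 0x5b5b5b5bU;
	if (!(x & 0x80808080U))
		return 1;

	return 0;
}

/* Fixed test (hypothetical wrapper): both bounds are evaluated on the same
 * word and the results are merged with a single mask, so one byte outside
 * of 0x24..0x7e is enough to stop.
 */
static int new_check_stops(uint32_t w)
{
	return (((w - 0x24242424U) | (0x7e7e7e7eU - w)) & 0x80808080U) != 0;
}
```

With the word 0x61806161 (three 'a' bytes and one 0x80 byte), the old test
does not stop while the fixed one does. Both let 0x61616161 through, and both
stop on 0x80808080.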