Building with clang 16 on MIPS64 yields this warning:
src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function]
static inline uint32_t crc32_uint32(uint32_t data)
^
Let's guard it with UNALIGNED_LE_OK, which is the only case where it's
used. This saves us from introducing a possibly non-portable attribute.
This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.
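A minimal sketch of the guard, assuming only the existing
UNALIGNED_LE_OK build knob; the function body is a simplified
bit-at-a-time stand-in, not the actual implementation:

    #include <stdint.h>

    #if defined(UNALIGNED_LE_OK)
    static inline uint32_t crc32_uint32(uint32_t data)
    {
            /* CRC32 of the 4 little-endian bytes of <data>, bit at a time */
            uint32_t crc = ~0U ^ data;
            int bit;

            for (bit = 0; bit < 32; bit++)
                    crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320U : 0);
            return ~crc;
    }
    #endif /* only referenced by the UNALIGNED_LE_OK code path */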
Calling slz_rfc1950_finish() without emitting any data would result in
incorrectly emitting a gzip header (rfc1952) instead of a zlib header
(rfc1950) due to a copy-paste error between the two wrappers. The
impact is almost nonexistent since the zlib format is almost never used
in this context, and compressing totally empty messages is quite rare
as well. Let's take this opportunity to fix another mistaken RFC number
in a comment.
This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.
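For reference, the two headers differ a lot, which is why emitting the
wrong one on an empty stream matters; below is a purely illustrative
helper (not the slz implementation) showing both:

    #include <stdint.h>
    #include <string.h>

    /* emit either a 2-byte zlib (RFC1950) or a 10-byte gzip (RFC1952)
     * header and return its length; illustrative only
     */
    static size_t emit_header(uint8_t *out, int zlib_format)
    {
            static const uint8_t zlib_hdr[2]  = { 0x78, 0x01 };
            static const uint8_t gzip_hdr[10] = { 0x1f, 0x8b, 0x08, 0, 0, 0, 0, 0, 0, 0x03 };

            if (zlib_format) {
                    memcpy(out, zlib_hdr, sizeof(zlib_hdr));
                    return sizeof(zlib_hdr);
            }
            memcpy(out, gzip_hdr, sizeof(gzip_hdr));
            return sizeof(gzip_hdr);
    }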
The current hash involves 3 simple shifts and additions so that it can
be mapped to a multiply on architectures having a fast multiply. This is
indeed what the compiler does on x86_64. A large range of values was
scanned to try to find more optimal factors on machines supporting such
a fast multiply, and it turned out that the new factor 0x1af42f resulted in
smoother hashes that provided on average 0.4% better compression on both
the Silesia corpus and an mbox file composed of very compressible emails
and uncompressible attachments. It's even slightly better than CRC32C
while being faster on Skylake. This patch enables this factor on archs
with a fast multiply.
This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.
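A minimal sketch of what the multiply-based form looks like; the
function name and HASH_BITS value are assumptions for illustration,
only the 0x1af42f factor comes from the change described above:

    #include <stdint.h>

    #define HASH_BITS 13   /* assumed table size, for illustration */

    static inline uint32_t slz_hash(uint32_t a)
    {
            /* one short multiply, keeping the top HASH_BITS bits */
            return (a * 0x1af42fU) >> (32 - HASH_BITS);
    }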
If building for SSE4 and USE_CRC32C_HASH is defined, then we can use
crc32c to calculate the lookup hash. By default we don't do it because
even on Skylake it's slower (~5%) than the current hash, which only
involves a short multiply, and the compression gains it brings are
marginal (0.3%).
This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850.
Note: this is not used by default and only merged in order to avoid
divergence between the code bases.
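A hedged sketch of the crc32c variant using the SSE4.2 intrinsic; the
function name and HASH_BITS are again illustrative assumptions:

    #include <stdint.h>

    #if defined(USE_CRC32C_HASH) && defined(__SSE4_2__)
    #include <nmmintrin.h>

    #define HASH_BITS 13   /* assumed table size, for illustration */

    static inline uint32_t slz_hash(uint32_t a)
    {
            /* the crc32 instruction mixes well but was measured ~5%
             * slower than the short multiply on Skylake
             */
            return _mm_crc32_u32(0, a) >> (32 - HASH_BITS);
    }
    #endif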
On 64-bit platforms, disassembling the code shows that send_huff() performs
a left shift followed by a right one, which are the result of integer
truncation and zero-extension caused solely by using different types at
different levels in the call chain. By making encode24() take a 64-bit
int on input and send_huff() take one optionally, we can remove one shift
in the hot path and gain 1% performance without affecting other platforms.
This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.
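A toy illustration of the type-width effect; the names and bodies are
made up for the example and are not the actual slz code:

    #include <stdint.h>

    struct bitq {
            uint64_t queue;   /* pending bits, LSB first */
            uint32_t qbits;   /* number of pending bits */
    };

    /* before: a 32-bit parameter may force the compiler to truncate and
     * zero-extend the value before the shift
     */
    static inline void send_huff_narrow(struct bitq *q, uint32_t code, uint32_t bits)
    {
            q->queue |= (uint64_t)code << q->qbits;
            q->qbits += bits;
    }

    /* after: a 64-bit parameter lets the value flow straight through */
    static inline void send_huff_wide(struct bitq *q, uint64_t code, uint32_t bits)
    {
            q->queue |= code << q->qbits;
            q->qbits += bits;
    }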
In some cases it may be desirable for latency reasons to forcefully
flush the queue even if it results in suboptimal compression. In our
case the queue might contain up to almost 4 bytes, which need an EOB
and a switch to literal mode, followed by 4 bytes to encode an empty
message. This means that each call can add 5 extra bytes in the output
stream. And the flush may also result in the header being produced for
the first time, which can amount to 2 or 10 bytes (zlib or gzip). In
the worst case, with 31 pending bits and a gzip header, a total of
4 + 5 + 10 = 19 bytes may thus be emitted at once upon a flush.
This is libslz upstream commit cf8c4668e4b4216e930b56338847d8d46a6bfda9.
Building with -DFIND_OPTIMAL_MATCH would fail on undeclared "len".
This one likely vanished in some cleanup.
This is libslz upstream commit 1ea20360715e1ad0cd81db83fa4361310716b8cc
Many ARMv8 processors also support AArch32 and can run armv7 and even
thumb2 code. While armv8 compilers will not emit these instructions,
armv7 compilers that are aware of these processors will. For example,
using gcc built for an armv7 target and passing it "-mcpu=cortex-a72"
or "-march=armv8-a+crc" will result in the CRC32 instruction being
used.
In this case the current assembly code fails because the ARM and
Thumb2 instruction sets have no "%wX" half-registers. We need to use
"%X" instead as the native 32-bit register when running with a 32-bit
instruction set, and use "%wX" when using the 64-bit instruction set
(A64).
This is slz upstream commit fab83248612a1e8ee942963fe916a9cdbf085097
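A sketch of the operand selection involved; the exact predicate macro
and surrounding code in slz may differ:

    #include <stdint.h>

    #if defined(__ARM_FEATURE_CRC32)
    static inline uint32_t crc32_char(uint32_t crc, uint8_t x)
    {
    #if defined(__aarch64__)
            /* A64: "%w" selects the 32-bit (W) view of the register */
            __asm__ volatile("crc32b %w0,%w0,%w1" : "+r"(crc) : "r"(x));
    #else
            /* A32/T32: registers are natively 32-bit, plain operands */
            __asm__ volatile("crc32b %0,%0,%1" : "+r"(crc) : "r"(x));
    #endif
            return crc;
    }
    #endif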
The test on FIND_OPTIMAL_MATCH for the experimental code can yield a
build warning when using -Wundef, so let's turn it into a regular ifdef.
This is slz upstream commit 05630ae8f22b71022803809eb1e7deb707bb30fb
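A minimal before/after sketch; the inner define is just a placeholder:

    /* before: -Wundef warns when FIND_OPTIMAL_MATCH is not defined at all */
    #if FIND_OPTIMAL_MATCH
    #  define WITH_OPTIMAL_MATCH 1
    #endif

    /* after: a plain ifdef only tests for definition and stays silent */
    #ifdef FIND_OPTIMAL_MATCH
    #  define WITH_OPTIMAL_MATCH 1
    #endif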
stdint.h is not as portable as inttypes.h. It doesn't exist at least
on AIX 5.1 and Solaris 7, while inttypes.h is present there and does
include stdint.h on platforms supporting it.
This is equivalent to libslz upstream commit e36710a ("slz: use
inttypes.h instead of stdint.h")
On ARM with native CRC support, no need to inflate the executable with
a 4kB CRC table, let's just drop it.
This is slz upstream commit d8715db20b2968d1f3012a734021c0978758f911.
This is the only place where we conditionally use the crc32_fast
table, so it's better to call the crc32_char inline function here. This
should also reduce the L1 cache footprint of the compression by ~1kB
when dealing with small blocks, and at least shows a consistent 0.5%
perf improvement.
This is slz upstream commit 075351b6c2513b548bac37d6582e46855bc7b36f.
As we now embed the library we don't need to support the older 1.0 API
any more, so we can remove the explicit calls to slz_make_crc_table()
and slz_prepare_dist_table().
SLZ is rarely packaged by distros and there have been complaints about
the CPU and memory usage of ZLIB, leading to some suggestions to better
address the issue by simply integrating SLZ into the tree (just 3 files).
See discussions below:
https://www.mail-archive.com/haproxy@formilux.org/msg38037.html
https://www.mail-archive.com/haproxy@formilux.org/msg40079.html
https://www.mail-archive.com/haproxy@formilux.org/msg40365.html
This patch does just this, after minor adjustments to these files:
- tables.h was renamed to slz-tables.h
- tables.h had the precomputed tables removed since they are not used here
- slz.c includes <import/slz*> instead of "slz*.h"
The slz commit imported here was b06c172 ("slz: avoid a build warning
with -Wimplicit-fallthrough"). No other change was performed either to
SLZ nor to haproxy at this point so that this operation may be replicated
if needed for a future version.