16 Commits

Author SHA1 Message Date
Willy Tarreau
ccc65012d3 IMPORT: slz: silence a build warning on non-x86 non-arm
Building with clang 16 on MIPS64 yields this warning:

  src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function]
  static inline uint32_t crc32_uint32(uint32_t data)
                         ^

Let's guard it using UNALIGNED_LE_OK which is the only case where it's
used. This saves us from introducing a possibly non-portable attribute.

This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.
2025-05-16 16:43:53 +02:00
Willy Tarreau
31ca29eee1 IMPORT: slz: fix header used for empty zlib message
Calling slz_rfc1950_finish() without emitting any data would result in
incorrectly emitting a gzip header (rfc1952) instead of a zlib header
(rfc1950) due to a copy-paste between the two wrappers. The impact is
almost inexistent since the zlib format is almost never used in this
context, and compressing totally empty messages is quite rare as well.
Let's take this opportunity for fixing another mistake on an RFC number
in a comment.

This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.
2025-05-16 16:43:53 +02:00
Willy Tarreau
411b04c7d3 IMPORT: slz: use a better hash for machines with a fast multiply
The current hash involves 3 simple shifts and additions so that it can
be mapped to a multiply on architecures having a fast multiply. This is
indeed what the compiler does on x86_64. A large range of values was
scanned to try to find more optimal factors on machines supporting such
a fast multiply, and it turned out that new factor 0x1af42f resulted in
smoother hashes that provided on average 0.4% better compression on both
the Silesia corpus and an mbox file composed of very compressible emails
and uncompressible attachments. It's even slightly better than CRC32C
while being faster on Skylake. This patch enables this factor on archs
with a fast multiply.

This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.
2025-05-16 16:43:53 +02:00
Willy Tarreau
248bbec83c IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested
If building for sse4 and USE_CRC32C_HASH is defined, then we can use
crc32c to calculate the lookup hash. By default we don't do it because
even on skylake it's slower than the current hash, which only involves
a short multiply (~5% slower). But the gains are marginal (0.3%).

This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850.

Note: this is not used by default and only merged in order to avoid
divergence between the code bases.
2025-05-16 16:43:53 +02:00
Willy Tarreau
ea1b70900f IMPORT: slz: avoid multiple shifts on 64-bits
On 64-bit platforms, disassembling the code shows that send_huff() performs
a left shift followed by a right one, which are the result of integer
truncation and zero-extension caused solely by using different types at
different levels in the call chain. By making encode24() take a 64-bit
int on input and send_huff() take one optionally, we can remove one shift
in the hot path and gain 1% performance without affecting other platforms.

This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.
2025-05-16 16:43:53 +02:00
Willy Tarreau
90d18e2006 IMPORT: slz: implement a synchronous flush() operation
In some cases it may be desirable for latency reasons to forcefully
flush the queue even if it results in suboptimal compression. In our
case the queue might contain up to almost 4 bytes, which need an EOB
and a switch to literal mode, followed by 4 bytes to encode an empty
message. This means that each call can add 5 extra bytes in the ouput
stream. And the flush may also result in the header being produced for
the first time, which can amount to 2 or 10 bytes (zlib or gzip). In
the worst case, a total of 19 bytes may be emitted at once upon a flush
with 31 pending bits and a gzip header.

This is libslz upstream commit cf8c4668e4b4216e930b56338847d8d46a6bfda9.
2023-06-30 16:12:36 +02:00
Willy Tarreau
eda36f1c23 IMPORT: slz: declare len to fix debug build when optimal match is enabled
Building with -DFIND_OPTIMAL_MATCH would fail on undeclared "len".
This one likely vanished in some cleanup.

This is libslz upstream commit 1ea20360715e1ad0cd81db83fa4361310716b8cc
2022-11-14 11:14:02 +01:00
Willy Tarreau
b154422db1 IMPORT: slz: use the correct CRC32 instruction when running in 32-bit mode
Many ARMv8 processors also support Aarch32 and can run armv7 and even
thumb2 code. While armv8 compilers will not emit these instructions,
armv7 compilers that are aware of these processors will do. For
example, using gcc built for an armv7 target and passing it
"-mcpu=cortex-a72" or "-march=armv8-a+crc" will result in the CRC32
instruction to be used.

In this case the current assembly code fails because with the ARM and
Thumb2 instruction sets there is no such "%wX" half-registers. We need
to use "%X" instead as the native 32-bit register when running with a
32-bit instruction set, and use "%wX" when using the 64-bit instruction
set (A64).

This is slz upstream commit fab83248612a1e8ee942963fe916a9cdbf085097
2021-12-06 09:14:20 +01:00
Tim Duesterhus
eaf16fcb53 CLEANUP: slz: Mark reset_refs as static
This function has no prototype and is not used outside of slz.c.
2021-09-24 15:07:50 +02:00
Willy Tarreau
fc89c3fd2b IMPORT: slz: silence a build warning with -Wundef
The test on FIND_OPTIMAL_MATCH for the experimental code can yield a
build warning when using -Wundef, let's turn it into a regular ifdef.

This is slz upstream commit 05630ae8f22b71022803809eb1e7deb707bb30fb
2021-08-28 12:47:57 +02:00
Willy Tarreau
388fc25915 IMPORT: slz: use inttypes.h instead of stdint.h
stdint.h is not as portable as inttypes.h. It doesn't exist at least
on AIX 5.1 and Solaris 7, while inttypes.h is present there and does
include stdint.h on platforms supporting it.

This is equivalent to libslz upstream commit e36710a ("slz: use
inttypes.h instead of stdint.h")
2021-05-14 08:44:52 +02:00
Willy Tarreau
9e274280a4 IMPORT: slz: do not produce the crc32_fast table when CRC is natively supported
On ARM with native CRC support, no need to inflate the executable with
a 4kB CRC table, let's just drop it.

This is slz upstream commit d8715db20b2968d1f3012a734021c0978758f911.
2021-05-12 09:29:33 +02:00
Willy Tarreau
027fdcb168 IMPORT: slz: use the generic function for the last bytes of the crc32
This is the only place where we conditionally use the crc32_fast table,
better call the crc32_char inline function for this. This should also
reduce by ~1kB the L1 cache footprint of the compression when dealing
with small blocks, and at least shows a consistent 0.5% perf improvement.

This is slz upstream commit 075351b6c2513b548bac37d6582e46855bc7b36f.
2021-05-12 09:29:29 +02:00
Ilya Shipitsin
b2be9a1ea9 CLEANUP: assorted typo fixes in the code and comments
This is 22nd iteration of typo fixes
2021-04-26 10:42:58 +02:00
Willy Tarreau
5e65f4276b CLEANUP: compression: remove calls to SLZ init functions
As we now embed the library we don't need to support the older 1.0 API
any more, so we can remove the explicit calls to slz_make_crc_table()
and slz_prepare_dist_table().
2021-04-22 16:11:19 +02:00
Willy Tarreau
ab2b7828e2 IMPORT: slz: import slz into the tree
SLZ is rarely packaged by distros and there have been complaints about
the CPU and memory usage of ZLIB, leading to some suggestions to better
address the issue by simply integrating SLZ into the tree (just 3 files).
See discussions below:

   https://www.mail-archive.com/haproxy@formilux.org/msg38037.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40079.html
   https://www.mail-archive.com/haproxy@formilux.org/msg40365.html

This patch does just this, after minor adjustments to these files:
  - tables.h was renamed to slz-tables.h
  - tables.h had the precomputed tables removed since not used here
  - slz.c uses includes <import/slz*> instead of "slz*.h"

The slz commit imported here was b06c172 ("slz: avoid a build warning
with -Wimplicit-fallthrough"). No other change was performed either to
SLZ nor to haproxy at this point so that this operation may be replicated
if needed for a future version.
2021-04-22 15:50:41 +02:00