haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-11-20 02:11:00 +01:00

Author	SHA1	Message	Date
Amaury Denoyelle	b40ce97ecc	BUG/MEDIUM: server: fix crash after duplicate GUID insertion On "add server", if a GUID is defined, guid_insert() is used to add the entry into the global GUID tree. If a similar entry already exists, GUID insertion fails and the server creation is eventually aborted. A crash could occur in this case because of an invalid memory access via guid_remove(). The latter is called via free_server() as the server insertion is rejected. The invalid access occurs on the GUID key. The issue occurs because of guid_insert(). The function properly deallocates the GUID key on duplicate insertion, but it failed to reset <guid.node.key> to NULL. This caused the invalid memory access on guid_remove(). To fix this, ensure that the key member is properly reset on the guid_insert() error path. This must be backported up to 3.0.	2025-05-22 17:59:37 +02:00
Amaury Denoyelle	5e088e3f8e	MINOR: server: use stress mode for "add server help" Implement stress mode on "add server help". This ensures that the command is fully reentrant on full output buffer. For testing, it requires compilation with USE_STRESS and global setting "stress-level 1".	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	4de5090976	MINOR: server: implement "add server help" Implement "help" as a sub-command for "add server" CLI. The objective is to list all the keywords that are supported for dynamic servers. CLI IO handler and add_srv_ctx are used to support reentrancy on full output buffer. Now that this command is implemented, the outdated keyword list on "add server" from management documentation can be removed.	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	2570892c41	MINOR: server: define CLI I/O handler for "add server" Extend "add server" to support an IO handler function named cli_io_handler_add_server(). A context object is also defined whose usage will depend on IO handler capabilities. IO handler is skipped when "add server" is run in default mode, i.e. on a dynamic server creation. Thus, currently IO handler is unneeded. However, it will become useful to support sub-commands for "add server". Note that return value of "add server" parser has been changed on server creation success. Previously, it was used incorrectly to report if server was inserted or not. In fact, parser return value is used by CLI generic code to detect if command processing has been completed, or should continue to the IO handler. Now, "add server" always returns 1 to signal that CLI processing is completed. This is necessary to preserve CLI output emitted by parser, even now that IO handler is defined for the command. Previously, output was emitted in every situation because no IO handler was defined. See below code snippet from cli.c for a better overview: if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) { ret = 1; goto fail; } /* kw->parse could set its own io_handler or io_release handler */ if (!appctx->cli_ctx.io_handler) { ret = 1; goto fail; } appctx->st0 = CLI_ST_CALLBACK; ret = 1; goto end;	2025-05-22 17:40:05 +02:00
Willy Tarreau	1c0f2e62ad	MINOR: ssl: also provide the "tls-tickets" bind option Currently there is "no-tls-tickets" that is also supported in the ssl-default-bind-options directive, but there's no way to re-enable them on a specific "bind" line. This patch simply provides the option to re-enable them. Note that the flag is inverted because tickets are enabled by default and the no-tls-tickets option sets the flag to disable them.	2025-05-22 15:31:54 +02:00
Willy Tarreau	3494775a1f	MINOR: ssl: support strict-sni in ssl-default-bind-options Several users already reported that it would be nice to support strict-sni in ssl-default-bind-options. However, in order to support it, we also need an option to disable it. This patch moves the setting of the option from the strict_sni field to a flag in the ssl_options field so that it can be inherited from the default bind options, and adds a new "no-strict-sni" directive to allow to disable it on a specific "bind" line. The test file "del_ssl_crt-list.vtc" which already tests both options was updated to make use of the default option and the no- variant to confirm everything continues to work.	2025-05-22 15:31:54 +02:00
Christopher Faulet	7244f16ac4	MINOR: promex: Add agent check status/code/duration metrics In the Prometheus exporter, the last health check status is already exposed, with its code and duration in seconds. The server status is also exposed. But the information about the agent check is not available. It is not really handy because when a server status is changed because of the agent, it is not obvious by looking at the Prometheus metrics. Indeed, the server may be reported as DOWN for instance, while the health check status still reports a success. Being able to get the agent status in that case could be valuable. So now, the last agent check status is exposed, with its code and duration in seconds. The following metrics can now be retrieved: * haproxy_server_agent_status * haproxy_server_agent_code * haproxy_server_agent_duration_seconds Note that unlike the other metrics, no per-backend aggregated metric is exposed. This patch is related to issue #2983.	2025-05-22 09:50:10 +02:00
Willy Tarreau	a1577a89a0	MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage" It was mentioned during the development of glitches that it would be nice to support not killing misbehaving connections below a certain CPU usage so that poor implementations that routinely misbehave without impact are not killed. This is now possible by setting a CPU usage threshold under which we don't kill them via this parameter. It defaults to zero so that we continue to kill them by default.	2025-05-21 15:47:42 +02:00
Willy Tarreau	eee57b4d3f	CLEANUP: cfgparse: alphabetically sort the global keywords The global keywords table was no longer sorted at all, let's fix it to ease spotting the searched ones.	2025-05-21 15:47:42 +02:00
Amaury Denoyelle	01e3b2119a	MINOR: quic: add some missing includes Insert some missing include statements in QUIC source files. This was detected after the next commit, which adjusts the include list used in the quic_conn-t.h file.	2025-05-21 14:44:27 +02:00
Amaury Denoyelle	f286288471	MINOR: quic: refactor handling of streams after MUX release quic-conn layer has to handle itself STREAM frames after MUX release. If the stream was already seen, it is probably only a retransmitted frame which can be safely ignored. For other streams, an active closure may be needed. Thus it's necessary that quic-conn layer knows the highest stream ID already handled by the MUX after its release. Previously, this was done via <nb_streams> member array in quic-conn structure. Refactor this by replacing <nb_streams> by two members called <stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for quic-conn layer to monitor locally opened uni streams, as the peer cannot by definition emit a STREAM frame on it. Also, bidirectional streams are always opened by the remote side. Previously, <nb_streams> were set by quic-stream layer. Now, <stream_max_uni>/<stream_max_bidi> members are only set one time, just prior to QUIC MUX release. This is sufficient as quic-conn do not use them if the MUX is available. Note that previously, IDs were used relatively to their type, thus incremented by 1, after shifting the original value. For simplification, use the plain stream ID, which is incremented by 4.	2025-05-21 14:26:45 +02:00
Amaury Denoyelle	07d41a043c	MINOR: quic: move function to check stream type in utils Move general function to check if a stream is uni or bidirectional from QUIC MUX to quic_utils module. This should prevent unnecessary include of QUIC MUX header file in other sources.	2025-05-21 14:17:41 +02:00
Amaury Denoyelle	cf45bf1ad8	CLEANUP: quic: remove unused cbuf module Cbuf are not used anymore. Remove the related source and header files, as well as include statements in the rest of QUIC source files.	2025-05-21 14:16:37 +02:00
William Lallemand	8b121ab6f7	BUG/MINOR: acme: fix formatting issue in error and logs Stop emitting \n in errmsg for intermediate error messages, this was emitting multiline logs and was returning to a new line in the middle of sentences. We don't need to emit them in acme_start_task() since the errmsg is output in a send_log which already contains a \n, or on the CLI which also emits it.	2025-05-21 11:41:28 +02:00
William Lallemand	156f4bd7a6	BUG/MEDIUM: acme: check if acme domains are configured When starting the ACME task with a ckch_conf which does not contain the domains, the ACME task would segfault because it will try to dereference a NULL in this case. The patch fixes the issue by emitting a warning when no domains are configured. It's not done at configuration parsing because it is not easy to emit the warning there: there is no callback system which gives access to the whole ckch_conf once a line is parsed. No backport needed.	2025-05-21 11:41:28 +02:00
Amaury Denoyelle	e399daa67e	BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error RX buffer allocation has been reworked in current dev tree. The objective is to support multiple buffers per QCS to improve upload throughput. RX buffer allocation failure is handled simply: the whole connection is closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error code. This function contains a BUG_ON() to ensure it is called only one time per connection instance. On RX buffer alloc failure, the aforementioned BUG_ON() crashes due to a double invocation of qcc_set_error(). First by qcs_get_rxbuf(), and immediately after it by qcc_recv(), which is the caller of the previous one. This regression was introduced by the following commit. 60f64449fbba7bb6e351e8343741bb3c960a2e6d MAJOR: mux-quic: support multiple QCS RX buffers To fix this, simply remove qcc_set_error() invocation in qcs_get_rxbuf(). On buffer alloc failure, qcc_recv() is responsible to set the error. This does not need to be backported.	2025-05-21 11:33:00 +02:00
Willy Tarreau	4b52d5e406	BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t The build failed on mips32 with a 64-bit time_t here: https://github.com/haproxy/haproxy/actions/runs/15150389164/job/42595310111 Let's just turn the "remain" variable used to show the remaining time into a more portable ullong and use %llu for all format specifiers, since long remains limited to 32-bit on 32-bit archs. No backport needed.	2025-05-21 10:18:47 +02:00
Willy Tarreau	09d4c9519e	BUILD: ssl: avoid possible printf format warning in traces When building on MIPS-32 with gcc-9.5 and glibc-2.31, I got this: src/ssl_trace.c: In function 'ssl_trace': src/ssl_trace.c:118:42: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'const int'} [-Wformat=] 118 \| chunk_appendf(&trace_buf, " : size=%ld", *size); \| ~~^ ~~~~~ \| \| \| \| \| ssize_t {aka const int} \| long int \| %d Let's just cast the type. No backport needed.	2025-05-21 10:01:14 +02:00
Willy Tarreau	3b2fb5cc15	CLEANUP: wdt: clarify the comments on the common exit path The conditions in which we reach the check for ha_panic() and ha_stuck_warning() are not super clear, let's reformulate them.	2025-05-20 16:37:06 +02:00
Willy Tarreau	0a8bfb5b90	BUG/MEDIUM: wdt: always ignore the first watchdog wakeup With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread report its own warnings"), when the TH_FL_STUCK flag was flipped on, we'd then go to the panic code instead of giving a second chance like before the commit. This can trigger rare cases that only happen with moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM: wdt: fix the stuck detection for warnings"). This is in fact due to the loss of the common "goto update_and_leave" that used to serve both the warning code and the flag setting for probation, and it's apparently what hit Christian in issue #2980. Let's make sure we exit naturally when turning the bit on for the first time. Let's also update the confusing comment at the end of the check that was left over by latest change. Since the first commit was backported to 3.1, this commit should be backported there as well.	2025-05-20 16:37:03 +02:00
Frederic Lecaille	08eee0d9cf	MINOR: quic: OpenSSL 3.5 trick to support 0-RTT For an unidentified reason, SSL_do_handshake() succeeds at its first call when 0-RTT is enabled for the connection. This behavior looks very similar to the one encountered with the AWS-LC stack. That said, it was documented by AWS-LC. This issue leads the connection to stop sending handshake packets after having released the handshake encryption level. In fact, no handshake packets could even be sent, leading the handshake to always fail. To fix this, this patch simulates a "handshake in progress" state waiting for the application level read secret to be established by the TLS stack. This may happen only after the QUIC listener has completed/confirmed the handshake upon handshake CRYPTO data receipt from the peer.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	849a3af14e	MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset A QUIC endpoint must send its transport parameters using a TLS custom extension. This extension is reset by SSL_set_SSL_CTX(). It can be restored by calling quic_ssl_set_tls_cbs() (which calls SSL_set_quic_tls_cbs()).	2025-05-20 15:00:06 +02:00
Frederic Lecaille	b3ac1a636c	MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API The quic_conn struct is modified for two reasons. The first one is to store the encoded version of the local transport parameters as this is done for USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameters "should remain valid until after the parameters have been sent" as mentioned by the SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer attached to the quic_conn object. The role of the qc_ssl_set_quic_transport_params() function is to call SSL_set_quic_tls_transport_params() (aliased by SSL_set_quic_transport_params()) to set these local transport parameters into the TLS stack from the buffer attached to the quic_conn struct. The second quic_conn struct modification is the addition of the new ->prot_level (SSL protection level) member to store "the most recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn callback (if it has been called)" as mentioned by the SSL_set_quic_tls_cbs(3) manual. This patch finally implements the five remaining callbacks to make the haproxy QUIC implementation work. OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send()) is easy to implement. It calls ha_quic_add_handshake_data() after having converted the qc->prot_level TLS protection level value to the correct ssl_encryption_level_t (boringSSL API/quictls) value. OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd()) provides the non-contiguous addresses to the TLS stack, without releasing them. OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd()) releases these non-contiguous buffers, relying on the fact that the list of encryption levels (qc->qel_list) is correctly ordered by SSL protection level secret establishment order (by the TLS stack). OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_yield_secret()) is a simple wrapping function over ha_quic_set_encryption_secrets() which is used by the boringSSL/quictls API. The role of OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params()) is to store the transport parameters received from the peer. It simply calls quic_transport_params_store() and sets them into the TLS stack by calling qc_ssl_set_quic_transport_params(). Also add some comments for all the OpenSSL 3.5 QUIC API callbacks. This patch has no impact on the other uses of the QUIC API provided by the other TLS stacks.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	dc6a3c329a	MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed) This patch allows the use of the new OpenSSL 3.5.0 QUIC TLS API when it is available and detected at compilation time. The detection relies on the presence of the OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND macro from openssl-compat.h. Indeed this macro is defined by OpenSSL since version 3.5.0. It is not defined by quictls. This helps in distinguishing these two TLS stacks. When the detection succeeds, HAVE_OPENSSL_QUIC is also defined by openssl-compat.h. It is then this new macro which is used to detect the availability of the new OpenSSL 3.5.0 QUIC TLS API. Note that this detection is done only if USE_QUIC_OPENSSL_COMPAT is not requested. So, USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are exclusive. At the same location, from openssl-compat.h, the ssl_encryption_level_t enum is defined. This enum was defined by quictls and extensively used by the haproxy QUIC implementation. SSL_set_quic_transport_params() is replaced by SSL_set_quic_tls_transport_params(). SSL_set_quic_early_data_enabled() (quictls) is also replaced by SSL_set_quic_tls_early_data_enabled() (OpenSSL). SSL_quic_read_level() (quictls) is not defined by OpenSSL. It is only used by the traces to log the current TLS stack decryption level (read). A macro makes it return -1, which is an unused value. Most of the differences between the quictls and OpenSSL QUIC APIs are in quic_ssl.c where some callbacks must be defined for these two APIs. This is why this patch modifies quic_ssl.c to define an array of OSSL_DISPATCH structs: <ha_quic_dispatch>. Each element of this array defines a callback. So, this patch implements these six callbacks: - ha_quic_ossl_crypto_send() - ha_quic_ossl_crypto_recv_rcd() - ha_quic_ossl_crypto_release_rcd() - ha_quic_ossl_yield_secret() - ha_quic_ossl_got_transport_params() and - ha_quic_ossl_alert(). But at this time, these implementations, which must return an int, return 0, interpreted as a failure by the OpenSSL QUIC API, except for ha_quic_ossl_alert() which is implemented the same way as for quictls. The five remaining functions above will be implemented by the next patches to come. ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() have been moved to be defined for both quictls and OpenSSL QUIC API. These callbacks are attached to the SSL objects (sessions) by calling the new qc_ssl_set_cbs() function. The latter calls the correct function to attach the correct callbacks to the SSL objects (defined by <ha_quic_method> for quictls, and <ha_quic_dispatch> for OpenSSL). The calls to SSL_provide_quic_data() and SSL_process_quic_post_handshake() have also been disabled. These functions are not defined by the OpenSSL QUIC API. At this time, the functions which call them are still defined when HAVE_OPENSSL_QUIC is defined.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	894595b711	MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces. This patch adds calls to TRACE_DEVEL() into qc_ssl_sess_init() and its caller (qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API. Should be easily backported as far as 2.6.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	a2822b1776	CLEANUP: quic: Useless BIO_METHOD initialization This code is there from QUIC implementation start. It was supposed to initialize <ha_quic_meth> as a BIO_METHOD static object. But this BIO_METHOD is not used at all! Should be backported as far as 2.6 to help integrate the next patches to come.	2025-05-20 15:00:06 +02:00
William Lallemand	e803385a6e	MINOR: acme: renewal notification over the dpapi sink Output a sink message when the certificate was renewed by the ACME client. The message is emitted on the "dpapi" sink, and ends by \n\0. Since the message contains this binary character, the right -0 parameter must be used when consulting the sink over the CLI: Example: $ echo "show events dpapi -nw -0" \| socat -t9999 /tmp/haproxy.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0 When used with the master CLI, @@1 should be used instead of @1 in order to keep the connection to the worker. Example: $ echo "@@1 show events dpapi -nw -0" \| socat -t9999 /tmp/master.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0	2025-05-19 16:07:25 +02:00
Willy Tarreau	99d6c889d0	BUG/MAJOR: leastconn: never reuse the node after dropping the lock On ARM with 80 cores and a single server, it's sometimes possible to see a segfault in fwlc_get_next_server() around 600-700k RPS. It seldom happens as well on x86 with 128 threads with the same config around 1M rps. It turns out that in fwlc_get_next_server(), before calling fwlc_srv_reposition(), we have to drop the lock and that one takes it back again. The problem is that anything can happen to our node during this time, and it can be freed. Then when continuing our work, we later iterate over it and its next to find a node with an acceptable key, and by doing so we can visit either uninitialized memory or simply nodes that are no longer in the tree. A first attempt at fixing this consisted in artificially incrementing the elements count before dropping the lock, but that turned out to be even worse because other threads could loop forever on such an element looking for an entry that does not exist. Maintaining a separate refcount didn't work well either, and it required to deal with the memory release while dropping it, which is really not convenient. Here we're taking a different approach consisting in simply not trusting this node anymore and going back to the beginning of the loop, as is done at a few other places as well. This way we can safely ignore the possibly released node, and the test runs reliably both on the arm and the x86 platforms mentioned above. No performance regression was observed either, likely because this operation is quite rare. No backport is needed since this appeared with the leastconn rework in 3.2.	2025-05-19 16:05:03 +02:00
Amaury Denoyelle	d358da4d83	BUG/MINOR: quic: fix crash on quic_conn alloc failure If there is an alloc failure during qc_new_conn(), cleaning is done via quic_conn_release(). However, since the below commit, an unchecked dereferencing of <qc.path> is performed in the latter. e841164a4402118bd7b2e2dc2b5068f21de5d9d2 MINOR: quic: account for global congestion window To fix this, simply check <qc.path> before dereferencing it in quic_conn_release(). This is safe as it is properly initialized to NULL on qc_new_conn() first stage. This does not need to be backported.	2025-05-19 11:03:48 +02:00
Willy Tarreau	099c1b2442	BUG/MAJOR: queue: properly keep count of the queue length The queue length was moved to its own variable in commit 583303c48 ("MINOR: proxies/servers: Calculate queueslength and use it."), however a few places were missed in pendconn_unlink() and assign_server_and_queue() resulting in never decreasing counts on aborted streams. This was reproduced when injecting more connections than the total backend could stand in TCP mode and letting some of them time out in the queue. No backport is needed, this is only 3.2.	2025-05-17 10:46:10 +02:00
Willy Tarreau	6be02d1c6e	BUG/MAJOR: leastconn: do not loop forever when facing saturated servers Since commit 9fe72bba3 ("MAJOR: leastconn; Revamp the way servers are ordered."), there's no way to escape the loop visiting the mt_list heads in fwlc_get_next_server if all servers in the list are saturated, resulting in a watchdog panic. It can be reproduced with this config and injecting with more than 2 concurrent conns: balance leastconn server s1 127.0.0.1:8000 maxconn 1 server s2 127.0.0.1:8000 maxconn 1 Here we count the number of saturated servers that were encountered, and escape the loop once the number of remaining servers exceeds the number of saturated ones. No backport is needed since this arrived in 3.2.	2025-05-17 10:44:36 +02:00
Willy Tarreau	ccc65012d3	IMPORT: slz: silence a build warning on non-x86 non-arm Building with clang 16 on MIPS64 yields this warning: src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function] static inline uint32_t crc32_uint32(uint32_t data) ^ Let's guard it using UNALIGNED_LE_OK which is the only case where it's used. This saves us from introducing a possibly non-portable attribute. This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.	2025-05-16 16:43:53 +02:00
Willy Tarreau	31ca29eee1	IMPORT: slz: fix header used for empty zlib message Calling slz_rfc1950_finish() without emitting any data would result in incorrectly emitting a gzip header (rfc1952) instead of a zlib header (rfc1950) due to a copy-paste between the two wrappers. The impact is almost inexistent since the zlib format is almost never used in this context, and compressing totally empty messages is quite rare as well. Let's take this opportunity for fixing another mistake on an RFC number in a comment. This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.	2025-05-16 16:43:53 +02:00
Willy Tarreau	411b04c7d3	IMPORT: slz: use a better hash for machines with a fast multiply The current hash involves 3 simple shifts and additions so that it can be mapped to a multiply on architectures having a fast multiply. This is indeed what the compiler does on x86_64. A large range of values was scanned to try to find more optimal factors on machines supporting such a fast multiply, and it turned out that new factor 0x1af42f resulted in smoother hashes that provided on average 0.4% better compression on both the Silesia corpus and an mbox file composed of very compressible emails and uncompressible attachments. It's even slightly better than CRC32C while being faster on Skylake. This patch enables this factor on archs with a fast multiply. This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.	2025-05-16 16:43:53 +02:00
Willy Tarreau	248bbec83c	IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested If building for sse4 and USE_CRC32C_HASH is defined, then we can use crc32c to calculate the lookup hash. By default we don't do it because even on skylake it's slower than the current hash, which only involves a short multiply (~5% slower). But the gains are marginal (0.3%). This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850. Note: this is not used by default and only merged in order to avoid divergence between the code bases.	2025-05-16 16:43:53 +02:00
Willy Tarreau	ea1b70900f	IMPORT: slz: avoid multiple shifts on 64-bits On 64-bit platforms, disassembling the code shows that send_huff() performs a left shift followed by a right one, which are the result of integer truncation and zero-extension caused solely by using different types at different levels in the call chain. By making encode24() take a 64-bit int on input and send_huff() take one optionally, we can remove one shift in the hot path and gain 1% performance without affecting other platforms. This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.	2025-05-16 16:43:53 +02:00
Willy Tarreau	df00164fdd	BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly") and the discussion in issue #2941, @DemiMarie rightfully suggested that Host should also be sanitized, because it is sometimes used in concatenation, such as this: http-request set-url https://%[req.hdr(host)]%[pathq] which was proposed as a workaround for h2 upstream servers that require :authority here: https://www.mail-archive.com/haproxy@formilux.org/msg43261.html The current patch then adds the same check for forbidden chars in the Host header, using the same function as for the patch above, since in both cases we validate the host:port part of the authority. This way we won't reconstruct ambiguous URIs by concatenating Host and path. Just like the patch above, this can be backported after a period of observation.	2025-05-16 15:13:17 +02:00
Willy Tarreau	b84762b3e0	BUG/MINOR: h3: don't insert more than one Host header Let's make sure we drop extraneous Host headers after having compared them. That also works when :authority was already present. This way, like for h1 and h2, we only keep one copy of it, while still making sure that Host matches :authority. This way, if a request has both :authority and Host, only one Host header will be produced (from :authority). Note that due to the different organization of the code and wording along the evolving RFCs, here we also check that all duplicates are identical, while h2 ignores them as per RFC7540, but this will be re-unified later. This should be backported to stable versions, at least 2.8, though thanks to the existing checks the impact is probably nil.	2025-05-16 15:13:17 +02:00
Christopher Faulet	f45a632bad	BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload It is especially a problem with Lua filters, but it is important to disable the 0-copy forwarding if a filter alters the payload, or at least to be able to disable it. While the filter is registered on the data filtering, it is not an issue (and it is the common case) because there is no way to fast-forward data at all. But it may be an issue if a filter decides to alter the payload and to unregister from data filtering. In that case, the 0-copy forwarding can be re-enabled in a hardly predictable state. To fix the issue, an SC flag was added to do so. The HTTP compression filter sets it, and Lua filters do too if the body length is changed (via HTTPMessage.set_body_len()). Note that it is an issue because of a bad design about the HTX. Many info about the message are stored in the HTX structure itself. It must be refactored to move several info to the stream-endpoint descriptor. This should ease modifications at the stream level, from a filter or TCP/HTTP rules. This should be backported as far as 3.0. If necessary, it may be backported on lower versions, as far as 2.6. In that case, it must be reviewed and adapted.	2025-05-16 15:11:37 +02:00
Christopher Faulet	94055a5e73	MEDIUM: hlua: Add function to change the body length of an HTTP Message There was no function for a lua filter to change the body length of an HTTP Message. But it is mandatory in order to be able to alter the message payload. It is not possible to directly update the message headers because the internal state of the message must also be updated accordingly. It is the purpose of the HTTPMessage.set_body_len() function. The new body length must be passed as argument. If it is an integer, the right "Content-Length" header is set. If the "chunked" string is used, it forces the message to be chunked-encoded and in that case the "Transfer-Encoding" header is set. This patch should fix the issue #2837. It could be backported as far as 2.6.	2025-05-16 14:34:12 +02:00
Willy Tarreau	f2d7aa8406	BUG/MEDIUM: peers: also limit the number of incoming updates There's a configurable limit to the number of messages sent to a peer (tune.peers.max-updates-at-once), but this one is not applied to the receive side. While it can usually be OK with default settings, setups involving a large tune.bufsize (1MB and above) regularly experience high latencies and even watchdogs during reloads because the full learning process sends a lot of data that manages to fill the entire buffer, and due to the compactness of the protocol, 1MB of buffer can contain more than 100k updates, meaning taking locks etc during this time, which is not workable. Let's make sure the receiving side also respects the max-updates-at-once setting. For this it counts incoming updates, and refrains from continuing once the limit is reached. It's a bit tricky to do because after receiving updates we still have to send ours (and possibly some ACKs) so we cannot just leave the loop. This issue was reported on 3.1 but it should progressively be backported to all versions having the max-updates-at-once option available.	2025-05-15 16:57:21 +02:00
Aurelien DARRAGON	098a5e5c0b	BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers Using the "send-proxy" or "send-proxy-v2" option on a ring server is neither relevant nor supported. Worse, on 2.4 it causes the haproxy process to crash as reported in GH #2965. Let's be more explicit about the fact that this keyword is not supported under "ring" context by ignoring the option and emitting a warning message to inform the user about that. Ideally, we should do the same for peers and log servers. The proper way would be to check server options during postparsing but we currently lack proper cross-type server postparsing hooks. This will come later and thus will give us a chance to perform the compatibility checks for server options depending on proxy type. But for now let's simply fix the "ring" case since it is the only one that's known to cause a crash. It may be backported to all stable versions.	2025-05-15 16:18:31 +02:00
Christopher Faulet	e2ae8a74e8	DEBUG: mux-spop: Review some trace messages to adjust the message or the level Some trace messages were not really accurate, reporting a CLOSED connection while only an error was reported on it. In addition, a TRACE_ERROR() was used to report a short read on HELLO/DISCONNECT frame headers. But it is not an error. A TRACE_DEVEL() should be used instead. This patch could be backported to 3.1 to ease future backports.	2025-05-14 11:52:10 +02:00
Christopher Faulet	6e46f0bf93	BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data When a read error is detected, no error must be reported on the SPOP connection if there are still some data to parse. It is important to be sure to process all data before reporting the error and to be sure not to truncate received frames. However, we must also take care to handle the short read case so as not to wait for data that will never be received. This patch must be backported to 3.1.	2025-05-14 11:51:58 +02:00
Christopher Faulet	16314bb93c	BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error There was no test in the demux part to detect truncated frames and to report an error at the connection level. The SPOP streams were properly switched to half-closed state. But while waiting for the associated SPOE applets to be woken up and released, the SPOP connection could be woken up several times for nothing. I never triggered the watchdog in that case, but it is not excluded. Now, at the end of the demux function, a specific test was added to detect truncated frames, report an error and close the connection. This patch must be backported to 3.1.	2025-05-14 11:47:41 +02:00
Christopher Faulet	71feb49a9f	BUG/MEDIUM: spop-conn: Report short read for partial frames payload When a frame was not fully received, a short read must be reported on the SPOP connection to help the demux to handle truncated frames. This was performed for frames truncated on the header part but not on the payload part. It is now properly detected. This patch must be backported to 3.1.	2025-05-14 09:20:10 +02:00
Christopher Faulet	ddc5f8d92e	BUG/MEDIUM: mux-spop: Properly handle CLOSING state The CLOSING state was not handled at all by the SPOP multiplexer while it is mandatory when a DISCONNECT frame was sent and the mux should wait for the DISCONNECT frame in reply from the agent. Thanks to this patch, it should be fixed. In addition, if an error occurs during the AGENT HELLO frame parsing, the SPOP connection is no longer switched to CLOSED state and remains in ERROR state instead. It is important to be able to send the DISCONNECT frame to the agent instead of closing the TCP connection immediately. This patch depends on the following commits: * BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer All the series must be backported to 3.1.	2025-05-14 09:14:12 +02:00
Christopher Faulet	a3940614c2	BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state SPOP_CS_FRAME_H and SPOP_CS_FRAME_P states, that were used to handle frame parsing, were removed. The demux process now relies on the demux stream ID to know if it is waiting for the frame header or the frame payload. Concretely, when the demux stream ID is not set (dsi == -1), the demuxer is waiting for the next frame header. Otherwise (dsi >= 0), it is waiting for the frame payload. It is especially important to be able to properly handle DISCONNECT frames sent by the agents. SPOP_CS_RUNNING state is introduced to know that the hello handshake was finished and the SPOP connection is able to open SPOP streams and exchange NOTIFY/ACK frames with the agents. It depends on the following fixes: * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer This change will be mandatory for the next fix. It must be backported to 3.1 with the commits above.	2025-05-13 19:51:40 +02:00
Christopher Faulet	6b0f7de4e3	MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing After the ACK frame was parsed, it is useless to set the SPOP connection state to SPOP_CS_FRAME_H state because this will be automatically handled by the demux function. It is not an issue, but this will simplify changes for the next commit.	2025-05-13 19:51:40 +02:00
Christopher Faulet	197eaaadfd	BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error Till now, only SPOP connections fully closed or those with a TCP connection on error were concerned. But available streams could be reported for SPOP connections in error or closing state. But in these states, no NOTIFY frames will be sent and no ACK frames will be parsed. So, no new SPOP streams should be opened. This patch should be backported to 3.1.	2025-05-13 19:51:40 +02:00
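
The rule described in commit a3940614c2 (dsi == -1 means "waiting for a frame header", dsi >= 0 means "waiting for a frame payload") can be illustrated with a minimal sketch. This is not HAProxy's code: the struct, enum and function names below are hypothetical and carry a demo_ prefix; only the dsi convention comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a SPOP connection. Only the
 * demux stream ID is modeled; it is signed so that -1 can mean "unset"
 * while 0 remains a usable stream ID. */
struct demo_spop_conn {
    int32_t dsi; /* -1: no frame in progress, >= 0: frame being demuxed */
};

/* What the demuxer is waiting for next (names are illustrative). */
enum demo_demux_step {
    DEMO_WAIT_FRAME_HEADER,
    DEMO_WAIT_FRAME_PAYLOAD,
};

/* Derive the parsing step from dsi alone, instead of storing it in a
 * dedicated connection state. */
static enum demo_demux_step demo_next_step(const struct demo_spop_conn *c)
{
    return (c->dsi == -1) ? DEMO_WAIT_FRAME_HEADER : DEMO_WAIT_FRAME_PAYLOAD;
}

/* A frame header was parsed: remember which stream the payload is for. */
static void demo_on_header_parsed(struct demo_spop_conn *c, int32_t stream_id)
{
    c->dsi = stream_id;
}

/* The payload was fully consumed: go back to waiting for a header. */
static void demo_on_payload_done(struct demo_spop_conn *c)
{
    c->dsi = -1;
}
```

Because the parsing step is derived from dsi, the connection state itself stays free to describe the connection lifecycle (for example running, closing or error), which is the separation the series above relies on.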

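The receive-side limit described in commit f2d7aa8406 (count incoming updates and stop once tune.peers.max-updates-at-once is reached) can be sketched as follows. This is an assumption-laden illustration, not the peers code: demo_rx_result and demo_receive_updates are hypothetical names, updates are reduced to a simple counter, and the send phase that the commit says must still run afterwards is not modeled.

```c
#include <stddef.h>

/* Hypothetical result of one receive pass. */
struct demo_rx_result {
    size_t processed; /* updates applied during this pass */
    int must_yield;   /* 1 if the limit was hit with updates left over */
};

/* Apply at most max_updates_at_once of the pending updates. The caller
 * is expected to proceed with its own sends and come back later for the
 * remainder when must_yield is set. */
static struct demo_rx_result demo_receive_updates(size_t pending,
                                                  size_t max_updates_at_once)
{
    struct demo_rx_result r = { 0, 0 };

    while (r.processed < pending) {
        if (r.processed >= max_updates_at_once) {
            r.must_yield = 1; /* leave the rest in the buffer */
            break;
        }
        r.processed++; /* stand-in for decoding and applying one update */
    }
    return r;
}
```

With a buffer holding 100000 pending updates and a limit of 200, one pass of this sketch applies 200 updates and asks to yield, so the work done in a single pass is bounded by the limit rather than by the buffer size.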

19601 Commits