haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-20 21:31:28 +02:00

Author	SHA1	Message	Date
Remi Tricot-Le Breton	90441e9bfe	BUG/MAJOR: cache: Crash because of wrong cache entry deleted When "vary" is enabled, we can have multiple entries for a given primary key in the cache tree. There is a limit to how many secondary entries can be inserted for a given key. When we try to insert a new secondary entry, if the limit is already reached, we can try to find expired entries with the same primary key, and if the limit is still reached we want to abort the current insertion and to remove the node that was just inserted. In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though, a regression was introduced. Instead of removing the entry just inserted as the comments suggested, we removed the second to last entry and returned NULL. We then reset the eb.key of the cache_entry in the caller because we assumed that the entry was already removed from the tree. This means that some entries with an empty key were wrongly kept in the tree and the last secondary entry, which keeps the number of secondary entries of a given key was removed. This ended up causing some crashes later on when we tried to iterate over the elements of this given key. The crash could occur in multiple places, either when trying to retrieve an entry or to add some new ones. This crash was raised in GitHub issue #2950. The fix should be backported up to 3.0.	2025-05-23 22:38:54 +02:00
Willy Tarreau	84ffb3d0a9	MINOR: config: list recently added sections with -dKcfg Newly added sections (crt-store, traces, acme) were not listed in -dKcfg, let's add them. For now they have to be manually enumerated.	2025-05-23 10:49:33 +02:00
Willy Tarreau	28c7a22790	BUG/MEDIUM: server: fix potential null-deref after previous fix A valid build warning was reported in the CI with latest commit b40ce97ecc ("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed, if the first test in the function fails, we branch to the err label with guid==NULL and will crash there. Let's just test guid before dereferencing it for freeing. This needs to be backported to 3.0 as well since the commit above was meant to go there.	2025-05-22 18:09:12 +02:00
Amaury Denoyelle	b40ce97ecc	BUG/MEDIUM: server: fix crash after duplicate GUID insertion On "add server", if a GUID is defined, guid_insert() is used to add the entry into the global GUID tree. If a similar entry already exists, GUID insertion fails and the server creation is eventually aborted. A crash could occur in this case because of an invalid memory access via guid_remove(). The latter is called via free_server() as the server insertion is rejected. The invalid access occurs on the GUID key. The issue occurs because of guid_insert(). The function properly deallocates the GUID key on duplicate insertion, but it failed to reset <guid.node.key> to NULL. This caused the invalid memory access on guid_remove(). To fix this, ensure that the key member is properly reset on the guid_insert() error path. This must be backported up to 3.0.	2025-05-22 17:59:37 +02:00
Amaury Denoyelle	5e088e3f8e	MINOR: server: use stress mode for "add server help" Implement stress mode on "add server help". This ensures that the command is fully reentrant on full output buffer. For testing, it requires compilation with USE_STRESS and global setting "stress-level 1".	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	4de5090976	MINOR: server: implement "add server help" Implement "help" as a sub-command for "add server" CLI. The objective is to list all the keywords that are supported for dynamic servers. CLI IO handler and add_srv_ctx are used to support reentrancy on full output buffer. Now that this command is implemented, the outdated keyword list on "add server" from management documentation can be removed.	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	2570892c41	MINOR: server: define CLI I/O handler for "add server" Extend "add server" to support an IO handler function named cli_io_handler_add_server(). A context object is also defined whose usage will depend on IO handler capabilities. The IO handler is skipped when "add server" is run in default mode, i.e. on a dynamic server creation. Thus, the IO handler is currently unneeded. However, it will become useful to support sub-commands for "add server". Note that the return value of the "add server" parser has been changed on server creation success. Previously, it was used incorrectly to report whether the server was inserted or not. In fact, the parser return value is used by the CLI generic code to detect whether command processing has been completed or should continue to the IO handler. Now, "add server" always returns 1 to signal that CLI processing is completed. This is necessary to preserve the CLI output emitted by the parser, even now that an IO handler is defined for the command. Previously, output was emitted in every situation because no IO handler was defined. See the code snippet below from cli.c for a better overview: if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) { ret = 1; goto fail; } /* kw->parse could set its own io_handler or io_release handler */ if (!appctx->cli_ctx.io_handler) { ret = 1; goto fail; } appctx->st0 = CLI_ST_CALLBACK; ret = 1; goto end;	2025-05-22 17:40:05 +02:00
Willy Tarreau	1c0f2e62ad	MINOR: ssl: also provide the "tls-tickets" bind option Currently there is "no-tls-tickets" that is also supported in the ssl-default-bind-options directive, but there's no way to re-enable them on a specific "bind" line. This patch simply provides the option to re-enable them. Note that the flag is inverted because tickets are enabled by default and the no-tls-tickets option sets the flag to disable them.	2025-05-22 15:31:54 +02:00
Willy Tarreau	3494775a1f	MINOR: ssl: support strict-sni in ssl-default-bind-options Several users already reported that it would be nice to support strict-sni in ssl-default-bind-options. However, in order to support it, we also need an option to disable it. This patch moves the setting of the option from the strict_sni field to a flag in the ssl_options field so that it can be inherited from the default bind options, and adds a new "no-strict-sni" directive to allow to disable it on a specific "bind" line. The test file "del_ssl_crt-list.vtc" which already tests both options was updated to make use of the default option and the no- variant to confirm everything continues to work.	2025-05-22 15:31:54 +02:00
Christopher Faulet	7244f16ac4	MINOR: promex: Add agent check status/code/duration metrics In the Prometheus exporter, the last health check status is already exposed, with its code and duration in seconds. The server status is also exposed. But the information about the agent check is not available. This is not really handy because when a server status is changed because of the agent, it is not obvious when looking at the Prometheus metrics. Indeed, the server may be reported as DOWN for instance, while the health check status still reports a success. Being able to get the agent status in that case could be valuable. So now, the last agent check status is exposed, with its code and duration in seconds. The following metrics can now be grabbed: * haproxy_server_agent_status * haproxy_server_agent_code * haproxy_server_agent_duration_seconds Note that unlike the other metrics, no per-backend aggregated metric is exposed. This patch is related to issue #2983.	2025-05-22 09:50:10 +02:00
Willy Tarreau	0ac41ff97e	[RELEASE] Released version 3.2-dev17 Released version 3.2-dev17 with the following main changes : - DOC: configuration: explicit multi-choice on bind shards option - BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers - BUG/MEDIUM: peers: also limit the number of incoming updates - MEDIUM: hlua: Add function to change the body length of an HTTP Message - BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload - BUG/MINOR: h3: don't insert more than one Host header - BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field - DOC: config: properly index "table and "stick-table" in their section - DOC: management: change reference to configuration manual - BUILD: debug: mark ha_crash_now() as attribute(noreturn) - IMPORT: slz: avoid multiple shifts on 64-bits - IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested - IMPORT: slz: use a better hash for machines with a fast multiply - IMPORT: slz: fix header used for empty zlib message - IMPORT: slz: silence a build warning on non-x86 non-arm - BUG/MAJOR: leastconn: do not loop forever when facing saturated servers - BUG/MAJOR: queue: properly keep count of the queue length - BUG/MINOR: quic: fix crash on quic_conn alloc failure - BUG/MAJOR: leastconn: never reuse the node after dropping the lock - MINOR: acme: renewal notification over the dpapi sink - CLEANUP: quic: Useless BIO_METHOD initialization - MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures - MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed) - MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API - MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset - MINOR: quic: OpenSSL 3.5 trick to support 0-RTT - DOC: update INSTALL for QUIC with OpenSSL 3.5 usages - DOC: management: update 'acme status' - BUG/MEDIUM: wdt: always ignore the first watchdog 
wakeup - CLEANUP: wdt: clarify the comments on the common exit path - BUILD: ssl: avoid possible printf format warning in traces - BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t - DOC: management: precise some of the fields of "show servers conn" - BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error - DOC: watchdog: update the doc to reflect the recent changes - BUG/MEDIUM: acme: check if acme domains are configured - BUG/MINOR: acme: fix formatting issue in error and logs - EXAMPLES: lua: avoid screen refresh effect in "trisdemo" - CLEANUP: quic: remove unused cbuf module - MINOR: quic: move function to check stream type in utils - MINOR: quic: refactor handling of streams after MUX release - MINOR: quic: add some missing includes - MINOR: quic: adjust quic_conn-t.h include list - CLEANUP: cfgparse: alphabetically sort the global keywords - MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage" v3.2-dev17	2025-05-21 15:56:06 +02:00
Willy Tarreau	a1577a89a0	MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage" It was mentioned during the development of glitches that it would be nice to support not killing misbehaving connections below a certain CPU usage so that poor implementations that routinely misbehave without impact are not killed. This is now possible by setting a CPU usage threshold under which we don't kill them via this parameter. It defaults to zero so that we continue to kill them by default.	2025-05-21 15:47:42 +02:00
Willy Tarreau	eee57b4d3f	CLEANUP: cfgparse: alphabetically sort the global keywords The global keywords table was no longer sorted at all, let's fix it to ease spotting the searched ones.	2025-05-21 15:47:42 +02:00
Amaury Denoyelle	00d90e8839	MINOR: quic: adjust quic_conn-t.h include list Adjust include list in quic_conn-t.h. This file is included in many QUIC sources, so it is useful to keep it as lightweight as possible. Note that connection/QUIC MUX are transformed into forward declarations for better layer separation.	2025-05-21 14:44:27 +02:00
Amaury Denoyelle	01e3b2119a	MINOR: quic: add some missing includes Insert some missing include statements in QUIC source files. This was detected after the next commit, which adjusts the include list used in the quic_conn-t.h file.	2025-05-21 14:44:27 +02:00
Amaury Denoyelle	f286288471	MINOR: quic: refactor handling of streams after MUX release The quic-conn layer has to handle STREAM frames itself after MUX release. If the stream was already seen, it is probably only a retransmitted frame which can be safely ignored. For other streams, an active closure may be needed. Thus it's necessary that the quic-conn layer knows the highest stream ID already handled by the MUX after its release. Previously, this was done via the <nb_streams> member array in the quic-conn structure. Refactor this by replacing <nb_streams> with two members called <stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for the quic-conn layer to monitor locally opened uni streams, as the peer cannot by definition emit a STREAM frame on them. Also, bidirectional streams are always opened by the remote side. Previously, <nb_streams> was set by the quic-stream layer. Now, the <stream_max_uni>/<stream_max_bidi> members are only set once, just prior to QUIC MUX release. This is sufficient as quic-conn does not use them if the MUX is available. Note that previously, IDs were used relative to their type, thus incremented by 1, after shifting the original value. For simplification, use the plain stream ID, which is incremented by 4.	2025-05-21 14:26:45 +02:00
Amaury Denoyelle	07d41a043c	MINOR: quic: move function to check stream type in utils Move general function to check if a stream is uni or bidirectional from QUIC MUX to quic_utils module. This should prevent unnecessary include of QUIC MUX header file in other sources.	2025-05-21 14:17:41 +02:00
Amaury Denoyelle	cf45bf1ad8	CLEANUP: quic: remove unused cbuf module Cbuf are not used anymore. Remove the related source and header files, as well as include statements in the rest of QUIC source files.	2025-05-21 14:16:37 +02:00
Baptiste Assmann	b437094853	EXAMPLES: lua: avoid screen refresh effect in "trisdemo" In current version of the game, there is a "screen refresh" effect: the screen is cleared before being re-drawn. I moved the clear right after the connection is opened and removed it from rendering time.	2025-05-21 12:00:53 +02:00
William Lallemand	8b121ab6f7	BUG/MINOR: acme: fix formatting issue in error and logs Stop emitting \n in errmsg for intermediate error messages; this was emitting multiline logs and was returning to a new line in the middle of sentences. We don't need to emit them in acme_start_task() since the errmsg is output in a send_log which already contains a \n, or on the CLI which also emits it.	2025-05-21 11:41:28 +02:00
William Lallemand	156f4bd7a6	BUG/MEDIUM: acme: check if acme domains are configured When starting the ACME task with a ckch_conf which does not contain the domains, the ACME task would segfault because it would try to dereference a NULL in this case. The patch fixes the issue by emitting a warning when no domains are configured. It's not done at configuration parsing because it is not easy to emit the warning there: there is no callback system which gives access to the whole ckch_conf once a line is parsed. No backport needed.	2025-05-21 11:41:28 +02:00
Willy Tarreau	f5ed309449	DOC: watchdog: update the doc to reflect the recent changes The watchdog was improved and fixed a few months ago, but the doc had not been updated to reflect this. That's now done.	2025-05-21 11:34:55 +02:00
Amaury Denoyelle	e399daa67e	BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error RX buffer allocation has been reworked in the current dev tree. The objective is to support multiple buffers per QCS to improve upload throughput. RX buffer allocation failure is handled simply: the whole connection is closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error code. This function contains a BUG_ON() to ensure it is called only once per connection instance. On RX buffer alloc failure, the aforementioned BUG_ON() crashes due to a double invocation of qcc_set_error(): first by qcs_get_rxbuf(), and immediately after it by qcc_recv(), which is the caller of the previous one. This regression was introduced by the following commit. 60f64449fbba7bb6e351e8343741bb3c960a2e6d MAJOR: mux-quic: support multiple QCS RX buffers To fix this, simply remove the qcc_set_error() invocation in qcs_get_rxbuf(). On buffer alloc failure, qcc_recv() is responsible for setting the error. This does not need to be backported.	2025-05-21 11:33:00 +02:00
Willy Tarreau	5c628d4e09	DOC: management: precise some of the fields of "show servers conn" As reported in issue #2970, the output of "show servers conn" is not clear. It was essentially meant as a debugging tool during some changes to idle connections management, but if some users want to monitor or graph them, more info is needed. The doc mentions the currently known list of fields, and reminds that this output is not meant to be stable over time, but as long as it does not change, it can provide some useful metrics to some users.	2025-05-21 10:45:07 +02:00
Willy Tarreau	4b52d5e406	BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t The build failed on mips32 with a 64-bit time_t here: https://github.com/haproxy/haproxy/actions/runs/15150389164/job/42595310111 Let's just turn the "remain" variable used to show the remaining time into a more portable ullong and use %llu for all format specifiers, since long remains limited to 32-bit on 32-bit archs. No backport needed.	2025-05-21 10:18:47 +02:00
Willy Tarreau	09d4c9519e	BUILD: ssl: avoid possible printf format warning in traces When building on MIPS-32 with gcc-9.5 and glibc-2.31, I got this: src/ssl_trace.c: In function 'ssl_trace': src/ssl_trace.c:118:42: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'const int'} [-Wformat=] 118 | chunk_appendf(&trace_buf, " : size=%ld", *size); | ~~^ ~~~~~ | | | | | ssize_t {aka const int} | long int | %d Let's just cast the type. No backport needed.	2025-05-21 10:01:14 +02:00
Willy Tarreau	3b2fb5cc15	CLEANUP: wdt: clarify the comments on the common exit path The condition in which we reach the check for ha_panic() and ha_stuck_warning() are not super clear, let's reformulate them.	2025-05-20 16:37:06 +02:00
Willy Tarreau	0a8bfb5b90	BUG/MEDIUM: wdt: always ignore the first watchdog wakeup With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread report its own warnings"), when the TH_FL_STUCK flag was flipped on, we'd then go to the panic code instead of giving a second chance like before the commit. This can trigger rare cases that only happen with moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM: wdt: fix the stuck detection for warnings"). This is in fact due to the loss of the common "goto update_and_leave" that used to serve both the warning code and the flag setting for probation, and it's apparently what hit Christian in issue #2980. Let's make sure we exit naturally when turning the bit on for the first time. Let's also update the confusing comment at the end of the check that was left over by latest change. Since the first commit was backported to 3.1, this commit should be backported there as well.	2025-05-20 16:37:03 +02:00
William Lallemand	dcdf27af70	DOC: management: update 'acme status' Update the 'acme status' section with the "Stopped" status and fix the description.	2025-05-20 16:08:57 +02:00
Frederic Lecaille	bbe302087c	DOC: update INSTALL for QUIC with OpenSSL 3.5 usages Update the QUIC sections which mention the OpenSSL library use cases.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	08eee0d9cf	MINOR: quic: OpenSSL 3.5 trick to support 0-RTT For an unidentified reason, SSL_do_handshake() succeeds at its first call when 0-RTT is enabled for the connection. This behavior looks very similar to the one encountered with the AWS-LC stack. That said, it was documented by AWS-LC. This issue leads the connection to stop sending handshake packets after having released the handshake encryption level. In fact, no handshake packets could even be sent, leading the handshake to always fail. To fix this, this patch simulates a "handshake in progress" state waiting for the application level read secret to be established by the TLS stack. This may happen only after the QUIC listener has completed/confirmed the handshake upon handshake CRYPTO data receipt from the peer.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	849a3af14e	MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset A QUIC endpoint must send its transport parameters using a TLS custom extension. This extension is reset by SSL_set_SSL_CTX(). It can be restored by calling quic_ssl_set_tls_cbs() (which calls SSL_set_quic_tls_cbs()).	2025-05-20 15:00:06 +02:00
Frederic Lecaille	b3ac1a636c	MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API The quic_conn struct is modified for two reasons. The first one is to store the encoded version of the local transport parameters, as is done for USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameters "should remain valid until after the parameters have been sent" as mentioned by the SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer attached to the quic_conn object. The role of the qc_ssl_set_quic_transport_params() function is to call SSL_set_quic_tls_transport_params() (aliased by SSL_set_quic_transport_params()) to set these local transport parameters into the TLS stack from the buffer attached to the quic_conn struct. The second quic_conn struct modification is the addition of the new ->prot_level (SSL protection level) member to store "the most recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn callback (if it has been called)" as mentioned by the SSL_set_quic_tls_cbs(3) manual. This patch finally implements the five remaining callbacks to make the haproxy QUIC implementation work. OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send()) is easy to implement. It calls ha_quic_add_handshake_data() after having converted the qc->prot_level TLS protection level value to the correct ssl_encryption_level_t (boringSSL API/quictls) value. OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd()) provides the non-contiguous addresses to the TLS stack, without releasing them. OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd()) releases these non-contiguous buffers, relying on the fact that the list of encryption levels (qc->qel_list) is correctly ordered by the order in which the TLS stack establishes the SSL protection level secrets. OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_yield_secret()) is a simple wrapping function over ha_quic_set_encryption_secrets(), which is used by the boringSSL/quictls API. The role of OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params()) is to store the transport parameters received from the peer. It simply calls quic_transport_params_store() and sets them into the TLS stack by calling qc_ssl_set_quic_transport_params(). Also add some comments for all the OpenSSL 3.5 QUIC API callbacks. This patch has no impact on the other uses of the QUIC API provided by the other TLS stacks.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	dc6a3c329a	MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed) This patch allows the use of the new OpenSSL 3.5.0 QUIC TLS API when it is available and detected at compilation time. The detection relies on the presence of the OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND macro from openssl-compat.h. Indeed this macro is defined by OpenSSL since version 3.5.0. It is not defined by quictls. This helps in distinguishing these two TLS stacks. When the detection succeeds, HAVE_OPENSSL_QUIC is also defined by openssl-compat.h. It is then this new macro which is used to detect the availability of the new OpenSSL 3.5.0 QUIC TLS API. Note that this detection is done only if USE_QUIC_OPENSSL_COMPAT is not requested. So, USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are mutually exclusive. At the same location, in openssl-compat.h, the ssl_encryption_level_t enum is defined. This enum was defined by quictls and is extensively used by the haproxy QUIC implementation. SSL_set_quic_transport_params() is replaced by SSL_set_quic_tls_transport_params(). SSL_set_quic_early_data_enabled() (quictls) is also replaced by SSL_set_quic_tls_early_data_enabled() (OpenSSL). SSL_quic_read_level() (quictls) is not defined by OpenSSL. It is only used by the traces to log the current TLS stack decryption level (read). A macro makes it return -1, which is an unused value. Most of the differences between the quictls and OpenSSL QUIC APIs are in quic_ssl.c, where some callbacks must be defined for these two APIs. This is why this patch modifies quic_ssl.c to define an array of OSSL_DISPATCH structs: <ha_quic_dispatch>. Each element of this array defines a callback. So, this patch implements these six callbacks: - ha_quic_ossl_crypto_send() - ha_quic_ossl_crypto_recv_rcd() - ha_quic_ossl_crypto_release_rcd() - ha_quic_ossl_yield_secret() - ha_quic_ossl_got_transport_params() and - ha_quic_ossl_alert(). But at this time, these implementations, which must return an int, return 0, which is interpreted as a failure by the OpenSSL QUIC API, except for ha_quic_ossl_alert() which is implemented the same way as for quictls. The five remaining functions above will be implemented by the next patches to come. ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() have been moved so that they are defined for both the quictls and OpenSSL QUIC APIs. These callbacks are attached to the SSL objects (sessions) by calling the new qc_ssl_set_cbs() function. The latter calls the correct function to attach the correct callbacks to the SSL objects (defined by <ha_quic_method> for quictls, and <ha_quic_dispatch> for OpenSSL). The calls to SSL_provide_quic_data() and SSL_process_quic_post_handshake() have also been disabled. These functions are not defined by the OpenSSL QUIC API. At this time, the functions which call them are still defined when HAVE_OPENSSL_QUIC is defined.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	894595b711	MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces. This patch adds calls to TRACE_DEVEL() in qc_ssl_sess_init() and its caller (qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API. Should be easily backported as far as 2.6.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	a2822b1776	CLEANUP: quic: Useless BIO_METHOD initialization This code is there from QUIC implementation start. It was supposed to initialize <ha_quic_meth> as a BIO_METHOD static object. But this BIO_METHOD is not used at all! Should be backported as far as 2.6 to help integrate the next patches to come.	2025-05-20 15:00:06 +02:00
William Lallemand	e803385a6e	MINOR: acme: renewal notification over the dpapi sink Output a sink message when the certificate was renewed by the ACME client. The message is emitted on the "dpapi" sink, and ends with \n\0. Since the message contains this binary character, the right -0 parameter must be used when consulting the sink over the CLI: Example: $ echo "show events dpapi -nw -0" | socat -t9999 /tmp/haproxy.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0 When used with the master CLI, @@1 should be used instead of @1 in order to keep the connection to the worker. Example: $ echo "@@1 show events dpapi -nw -0" | socat -t9999 /tmp/master.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0	2025-05-19 16:07:25 +02:00
Willy Tarreau	99d6c889d0	BUG/MAJOR: leastconn: never reuse the node after dropping the lock On ARM with 80 cores and a single server, it's sometimes possible to see a segfault in fwlc_get_next_server() around 600-700k RPS. It seldom happens as well on x86 with 128 threads with the same config around 1M rps. It turns out that in fwlc_get_next_server(), before calling fwlc_srv_reposition(), we have to drop the lock and that one takes it back again. The problem is that anything can happen to our node during this time, and it can be freed. Then when continuing our work, we later iterate over it and its next to find a node with an acceptable key, and by doing so we can visit either uninitialized memory or simply nodes that are no longer in the tree. A first attempt at fixing this consisted in artificially incrementing the elements count before dropping the lock, but that turned out to be even worse because other threads could loop forever on such an element looking for an entry that does not exist. Maintaining a separate refcount didn't work well either, and it required to deal with the memory release while dropping it, which is really not convenient. Here we're taking a different approach consisting in simply not trusting this node anymore and going back to the beginning of the loop, as is done at a few other places as well. This way we can safely ignore the possibly released node, and the test runs reliably both on the arm and the x86 platforms mentioned above. No performance regression was observed either, likely because this operation is quite rare. No backport is needed since this appeared with the leastconn rework in 3.2.	2025-05-19 16:05:03 +02:00
Amaury Denoyelle	d358da4d83	BUG/MINOR: quic: fix crash on quic_conn alloc failure If there is an alloc failure during qc_new_conn(), cleaning is done via quic_conn_release(). However, since the below commit, an unchecked dereferencing of <qc.path> is performed in the latter. e841164a4402118bd7b2e2dc2b5068f21de5d9d2 MINOR: quic: account for global congestion window To fix this, simply check <qc.path> before dereferencing it in quic_conn_release(). This is safe as it is properly initialized to NULL on qc_new_conn() first stage. This does not need to be backported.	2025-05-19 11:03:48 +02:00
Willy Tarreau	099c1b2442	BUG/MAJOR: queue: properly keep count of the queue length The queue length was moved to its own variable in commit 583303c48 ("MINOR: proxies/servers: Calculate queueslength and use it."), however a few places were missed in pendconn_unlink() and assign_server_and_queue() resulting in never decreasing counts on aborted streams. This was reproduced when injecting more connections than the total backend could stand in TCP mode and letting some of them time out in the queue. No backport is needed, this is only 3.2.	2025-05-17 10:46:10 +02:00
Willy Tarreau	6be02d1c6e	BUG/MAJOR: leastconn: do not loop forever when facing saturated servers Since commit 9fe72bba3 ("MAJOR: leastconn; Revamp the way servers are ordered."), there's no way to escape the loop visiting the mt_list heads in fwlc_get_next_server if all servers in the list are saturated, resulting in a watchdog panic. It can be reproduced with this config and injecting with more than 2 concurrent conns: balance leastconn server s1 127.0.0.1:8000 maxconn 1 server s2 127.0.0.1:8000 maxconn 1 Here we count the number of saturated servers that were encountered, and escape the loop once the number of remaining servers exceeds the number of saturated ones. No backport is needed since this arrived in 3.2.	2025-05-17 10:44:36 +02:00
Willy Tarreau	ccc65012d3	IMPORT: slz: silence a build warning on non-x86 non-arm Building with clang 16 on MIPS64 yields this warning: src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function] static inline uint32_t crc32_uint32(uint32_t data) ^ Let's guard it using UNALIGNED_LE_OK which is the only case where it's used. This saves us from introducing a possibly non-portable attribute. This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.	2025-05-16 16:43:53 +02:00
Willy Tarreau	31ca29eee1	IMPORT: slz: fix header used for empty zlib message Calling slz_rfc1950_finish() without emitting any data would result in incorrectly emitting a gzip header (rfc1952) instead of a zlib header (rfc1950) due to a copy-paste between the two wrappers. The impact is almost inexistent since the zlib format is almost never used in this context, and compressing totally empty messages is quite rare as well. Let's take this opportunity for fixing another mistake on an RFC number in a comment. This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.	2025-05-16 16:43:53 +02:00
Willy Tarreau	411b04c7d3	IMPORT: slz: use a better hash for machines with a fast multiply The current hash involves 3 simple shifts and additions so that it can be mapped to a multiply on architectures having a fast multiply. This is indeed what the compiler does on x86_64. A large range of values was scanned to try to find more optimal factors on machines supporting such a fast multiply, and it turned out that new factor 0x1af42f resulted in smoother hashes that provided on average 0.4% better compression on both the Silesia corpus and an mbox file composed of very compressible emails and uncompressible attachments. It's even slightly better than CRC32C while being faster on Skylake. This patch enables this factor on archs with a fast multiply. This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.	2025-05-16 16:43:53 +02:00
Willy Tarreau	248bbec83c	IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested If building for sse4 and USE_CRC32C_HASH is defined, then we can use crc32c to calculate the lookup hash. By default we don't do it because even on skylake it's slower than the current hash, which only involves a short multiply (~5% slower). But the gains are marginal (0.3%). This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850. Note: this is not used by default and only merged in order to avoid divergence between the code bases.	2025-05-16 16:43:53 +02:00
Willy Tarreau	ea1b70900f	IMPORT: slz: avoid multiple shifts on 64-bits On 64-bit platforms, disassembling the code shows that send_huff() performs a left shift followed by a right one, which are the result of integer truncation and zero-extension caused solely by using different types at different levels in the call chain. By making encode24() take a 64-bit int on input and send_huff() take one optionally, we can remove one shift in the hot path and gain 1% performance without affecting other platforms. This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.	2025-05-16 16:43:53 +02:00
Willy Tarreau	0a91c6dcae	BUILD: debug: mark ha_crash_now() as attribute(noreturn) Building on MIPS64 with clang16 incorrectly reports some uninitialized value warnings in stats-proxy.c due to some calls to ABORT_NOW() where the compiler didn't know the code wouldn't return. Let's properly mark the function as noreturn, and take this opportunity for also marking it unused to avoid possible warnings depending on the build options (if ABORT_NOW is not used). No backport needed though it will not harm.	2025-05-16 16:43:53 +02:00
William Lallemand	1eebf98952	DOC: management: change reference to configuration manual Since e24b77e7 ('DOC: config: move the extraneous sections out of the "global" definition') the ACME section of the configuration manual was moved from 3.13 to 12.8. Change the reference to that section in "acme renew".	2025-05-16 16:01:43 +02:00
Willy Tarreau	81e46be026	DOC: config: properly index "table" and "stick-table" in their section Tim reported in issue #2953 that "stick-table" and "table" were not indexed as keywords. The issue was the indent level. Also let's make sure to put a box around the "store" arguments as well.	2025-05-16 15:37:03 +02:00
Willy Tarreau	df00164fdd	BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly") and the discussion in issue #2941, @DemiMarie rightfully suggested that Host should also be sanitized, because it is sometimes used in concatenation, such as this: http-request set-url https://%[req.hdr(host)]%[pathq] which was proposed as a workaround for h2 upstream servers that require :authority here: https://www.mail-archive.com/haproxy@formilux.org/msg43261.html The current patch then adds the same check for forbidden chars in the Host header, using the same function as for the patch above, since in both cases we validate the host:port part of the authority. This way we won't reconstruct ambiguous URIs by concatenating Host and path. Just like the patch above, this can be backported after a period of observation.	2025-05-16 15:13:17 +02:00


24682 Commits