haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-20 21:31:28 +02:00

Author	SHA1	Message	Date
Christopher Faulet	e56e718c82	MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors It is just a small patch to clean up mux/demux functions. Instead of listing the H1S errors that must be handled during demux or mux operations, masks of flags are used. It is more readable.	2025-01-31 10:41:49 +01:00
Willy Tarreau	8235a24782	MEDIUM: epoll: skip reports of stale file descriptors Now that we can see that some events are reported for older instances of a file descriptor, let's skip these ones instead of reporting dangerous events on them. It might possibly qualify as a bug if it helps fixing strange issues in certain environments, in which case it can make sense to backport it along with the following recent patches: DEBUG: fd: add a counter of takeovers of an FD since it was last opened MINOR: fd: add a generation number to file descriptors DEBUG: epoll: store and compare the FD's generation count with reported event	2025-01-30 19:45:34 +01:00
Willy Tarreau	5012b6c6d9	DEBUG: epoll: store and compare the FD's generation count with reported event There have been some reported cases where races between threads in epoll were causing wrong reports of close or error events. Since the epoll_event data is 64 bits, we can store the FD's generation counter in the upper bits to verify if we're speaking about the same instance of the FD as the current one or a stale one. If the generation number does not match, then we classify these into 3 conditions and increment the relevant COUNT_IF() counters (stale report for closed FD, stale report of harmless event on reopened FD, stale report of HUP/ERR on reopened FD). Tests have shown that with heavy concurrency, a very small maxconn (typically 1 per thread), http-reuse always and a server closing connections first but randomly (httpterm with /C=2r), such events can happen at a pace of a few per second for the closed FDs, and a few per minute for the other ones, so there's value in leaving this accessible for troubleshooting. 
E.g. after a few minutes: Count Type Location function(): "condition" [comment] 5541 CNT ev_epoll.c:296 _do_poll(): "1" [epoll report of event on a just closed fd (harmless)] 10 CNT ev_epoll.c:294 _do_poll(): "1" [epoll report of event on a closed recycled fd (rare)] 42 CNT ev_epoll.c:289 _do_poll(): "1" [epoll report of HUP on a stale fd reopened on the same thread (suspicious)] 212 CNT ev_epoll.c:279 _do_poll(): "1" [epoll report of HUP/ERR on a stale fd reopened on another thread (harmless)] 1 CNT mux_h1.c:3911 h1_send(): "b_data(&h1c->obuf)" [connection error (send) with pending output data] This one was obtained with the following setup, which abuses thread contention by starting 64 threads on two cores: - config: global nbthread 64 stats socket /tmp/sock1 level admin stats timeout 1h defaults timeout client 5s timeout server 5s timeout connect 5s mode http listen p2 bind :8002 http-reuse always server s1 127.0.0.1:8000 maxconn 4 - haproxy forcefully started on 2C4T: $ taskset -c 0,1,4,5 ./haproxy -db -f epoll-dbg.cfg - httpterm on port 8000, cpus 2,3,6,7 (2C4T) - h1load with responses larger than a single buffer, and randomly closing/keeping alive: $ taskset -c 2,3,6,7 h1load -e -t 4 -c 256 -r 1 0:8002/?s=19k/C=2r	2025-01-30 19:45:34 +01:00
Willy Tarreau	d155924efe	MINOR: fd: add a generation number to file descriptors This patch adds a counter of close() on file descriptors in the fdtab. The goal is to better detect if reported events concern the current or a previous file descriptor. For now the counter is only added, and is shown in "show fd" as "gen". We're reusing unused space at the end of the struct. If it's needed for something more important later, this patch can be reverted.	2025-01-30 19:45:34 +01:00
Willy Tarreau	44ac7a7e73	DEBUG: fd: add a counter of takeovers of an FD since it was last opened That's essentially in order to help with debugging strange cases like the occasional epoll issues/races, by keeping a counter of how many times an FD was taken over since last inserted. The room is available so let's use it. If it's needed later, this patch can easily be reverted. The counter is also reported in "show fd" as "tkov".	2025-01-30 19:45:34 +01:00
Amaury Denoyelle	b849ee5fa3	BUILD: quic: fix overflow in global tune A new global option was recently introduced to disable pacing. However, the value used (1<<31) caused issues with some compilers, as the options field used for storage is declared as int. Move the pacing deactivation flag into the newly defined quic_tune to fix this. This should be backported up to 3.1 after a period of observation. Note that it relies on the previous patch, which defined the new quic_tune type.	2025-01-30 18:12:53 +01:00
Amaury Denoyelle	09e9c7d5b7	MINOR: quic: define quic_tune Define a new structure quic_tune. It will be useful to regroup various configuration settings and tunable related to QUIC, instead of defining them into the global structure.	2025-01-30 18:12:40 +01:00
Amaury Denoyelle	2fc63cb186	MINOR: quic: mark BBR as stable Pacing has recently been moved out of experimental status and is activated by default. This is a mandatory requirement for BBR. Furthermore, BBR is now considered stable. As such, removes its experimental status with this commit.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	a19d9b0486	MAJOR: quic: mark pacing as stable and enable it by default Remove pacing experimental status, so it's not required anymore to use expose-experimental-directives to enable it. Along this change, pacing is now activated by default. As such, pacing configuration is transformed into its final form. The global on/off setting is turned into a disable setting without argument.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	0c8b54b2d1	MINOR: quic: transform pacing settings into a global option Pacing support was previously activated on each bind line individually, via an optional argument of quic-cc-algo keyword. Remove this optional argument and introduce a global setting to enable/disable pacing. Pacing activation is still flagged as experimental. One important change is that previously BBR usage automatically activated pacing support. This is not the case anymore, so users should now always explicitly activate pacing if BBR is selected. A new warning message will be displayed if this is not the case. Another consequence of this change is that now pacing_inter callback is always defined for every quic_cc_algo type. As such, QUIC MUX uses global.tune.options to determine if pacing is required. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:19:38 +01:00
Amaury Denoyelle	d04e93bc2e	MINOR: quic: allow BBR testing without pacing Pacing is activated per bind line via an optional boolean argument of quic-cc-algo keyword. Contrary to the default usage, pacing is automatically activated when BBR is chosen. This is because this algorithm is expected to run on top of pacing, else its behavior is undefined. Previously, pacing argument was thus ignored when BBR was selected. Change this to support explicit deactivation of pacing with it. This could be useful to test BBR without pacing when debugging some issues. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:18:02 +01:00
Amaury Denoyelle	6acf391e89	MINOR: quic: remove references to burst in quic-cc-algo parsing Pacing activation configuration has been recently revamped. Previously, pacing related quic-cc-algo argument was used to specify a burst size. It evolved into a boolean value as burst size is dynamically calculated now. As such, removes any references to the old burst value in config parsing code for cleaner code. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:02:59 +01:00
Willy Tarreau	bd7a688b8b	BUG/MEDIUM: chunk: make sure to flush the trash pool before resizing Late in 3.1 we've added an integrity check to make sure we didn't keep trash objects allocated before resizing the trash with commit 0bfd36e7b8 ("MINOR: chunk: add a BUG_ON upon the next init_trash_buffer()"), but it turns out that the counter that is being checked includes the number of objects left in local thread caches. As such it can trigger despite no object being allocated. This precisely happens when setting tune.memory.hot-size to a few megabytes because some temporarily used trash objects will remain in cache. In order to address this, let's first flush the pool before running the check. That was previously done by pool_destroy() but the check had to be inserted before it. So now we first flush the trash pool, then verify it's no longer used, and finally we can destroy it. This needs to be backported to 3.1. Thanks to Christian Ruppert for reporting this bug.	2025-01-29 17:55:18 +01:00
William Lallemand	b43e5d8c16	BUILD: ssl: more cleaner approach to WolfSSL without renegotiation Patch discussed in https://github.com/wolfSSL/wolfssl/issues/6834 When building Wolfssl without renegotiation options, WolfSSL still defines the macros about it, which warns during the build. This patch completes the previous one by undefining the macros so haproxy could build without any warning.	2025-01-28 20:55:20 +01:00
William Lallemand	c6a8279cdf	BUILD: ssl: allow to build without the renegotiation API of WolfSSL In ticket https://github.com/wolfSSL/wolfssl/issues/6834, it was suggested to push --enable-haproxy within --enable-distro. WolfSSL does not want to include the renegotiation support in --enable-distro. To achieve this, let haproxy build without SSL_renegotiate_pending() when wolfssl does not define HAVE_SECURE_RENEGOCIATION or HAVE_SERVER_RENEGOCIATION_INFO.	2025-01-28 18:31:32 +01:00
Olivier Houchard	9253146b90	BUILD: queues: Use unsigned int when needed Use unsigned int instead of int when calculating which thread group we should dequeue from next, as the difference in signedness makes clang unhappy.	2025-01-28 17:44:54 +01:00
Olivier Houchard	b74ec1efc2	MINOR: queues: use __ha_cpu_relax() on failed CAS. Make sure we call __ha_cpu_relax() if we fail a CAS, to help with contention.	2025-01-28 16:00:19 +01:00
Willy Tarreau	f17b0a994b	BUILD: tools: fix build on BSD by dropping the ETIME check Commit 44537379fc ("MINOR: tools: add errname to print errno macro name") brought a facility to report errno using a symbolic string when known instead of showing only the value. However, among the listed options, ETIME is mentioned but is unknown from FreeBSD where it breaks the build. Let's simply drop it, we don't use ETIME anyway and even if it would be reported, the default code path still reports the numeric value so there's no harm. If other ones fail to build in the future, they could be handled the same way.	2025-01-28 15:58:57 +01:00
Christopher Faulet	36d151dc10	MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down Thanks to the previous patch, it is now possible to explicitly rely on stream's events to shut it down. The right event is set in stream_shutdown(), before waking up the stream, via an atomic operation. In process_stream(), this event will be handled as expected. Thus, TASK_F_UEVT* are no longer used, but not removed since still usable for other tasks. This patch depends on "MEDIUM: stream: Map task wake up reasons to dedicated stream events".	2025-01-28 14:53:37 +01:00
Christopher Faulet	6048460102	MEDIUM: stream: Map task wake up reasons to dedicated stream events To fix thread-safety issues when a stream must be shut, three new task states were added. These states are generic (UEVT1, UEVT2 and UEVT3), and the task callback function is responsible for knowing what to do with them. However, it is not really scalable. The best is to use an atomic field in the stream structure itself to deal with these dedicated events. There is already the "pending_events" field that saves wake up reasons (TASK_WOKEN_) to not lose them if process_stream() is interrupted before it had a chance to handle them. So the idea is to introduce a new field to handle stream-dedicated events and merge them with the task's wake up reasons used by the stream. This means a mapping must be performed between some task wake up reasons and stream events. Note that not all task wake up reasons will be mapped. In this patch, the "new_events" field is introduced. It is an atomic bit-field. Stream events (STRM_EVT_) are also introduced to map the task wake up reasons used by process_stream(). Only TASK_WOKEN_TIMER and TASK_WOKEN_MSG are mapped, in addition to TASK_F_UEVT* flags. In process_stream(), "pending_events" field is now filled with new stream events and the mapping of the wake up reasons.	2025-01-28 14:53:37 +01:00
Christopher Faulet	0a52a75ef7	BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions" shutdown-backup-sessions action for on-marked-up directive does not work anymore since the stream_shutdown() function was modified to be async-safe. When stream_shutdown() was modified to be async-safe, dedicated task events were added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to TASK_F_UEVT1 and SF_ERR_KILLED was mapped to TASK_F_UEVT2. The reverse mapping was performed by process_stream() to shut the stream with the appropriate reason. However, SF_ERR_UP reason, used by shutdown-backup-sessions action to shut a stream down because a preferred server became available, was not mapped in the same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make stream_shutdown() async-safe"), this action is ignored and does not work anymore. To fix the issue, and to be able to backport the fix, a third task event was added. TASK_F_UEVT3 is now mapped on SF_ERR_UP. This patch should fix the issue #2848. It must be backported as far as 2.6.	2025-01-28 14:53:37 +01:00
Olivier Houchard	26b3e5236f	MEDIUM: servers/proxies: Switch to using per-tgroup queues. For both servers and proxies, use one connection queue per thread-group, instead of only one. Having only one can lead to severe performance issues on NUMA machines, it is actually trivial to get the watchdog to trigger on an AMD machine, having a server with a maxconn of 96, and an injector that uses 160 concurrent connections. We now have one queue per thread-group, however when dequeueing, we're dequeuing MAX_SELF_USE_QUEUE (currently 9) pendconns from our own queue, before dequeueing one from another thread group, if available, to make sure everybody is still running.	2025-01-28 12:49:41 +01:00
Olivier Houchard	583303c48b	MINOR: proxies/servers: Calculate queueslength and use it. For both proxies and servers, properly calculate queueslength, which is the total number of elements across all queues (as they currently use only one queue, it is equivalent to the number of elements of that queue), and use it instead of the queue's length.	2025-01-28 12:49:41 +01:00
Olivier Houchard	59eddabe16	MINOR: Add fields to the per-thread group field in struct server. Add a per-thread group queue and associated fields in per-thread group field in struct server, as well as a new field, queues length. This is currently unused, so should change nothing.	2025-01-28 12:49:41 +01:00
Olivier Houchard	f879b9a18a	MINOR: proxies: Add a per-thread group field to struct proxy. Add a per-thread group field to struct proxy, that will contain a struct queue, as well as a new field, "queueslength". This is currently unused, so should change nothing. Please note that proxy_init_per_thr() must now be called for each proxy once the thread groups number is known.	2025-01-28 12:49:41 +01:00
Willy Tarreau	7fa70da06d	MINOR: epoll: permit to mask certain specific events A few times in the past we've seen cases where epoll was caught reporting a wrong event that caused trouble (e.g. spuriously reporting HUP or RDHUP after a successful connect()). The new tune.epoll.mask-events directive permits to mask events such as ERR, HUP and RDHUP and convert them to IN events that are processed by the regular receive path. This should help better diagnose and troubleshoot issues such as this one, as well as rule out such a cause when similar issues are reported: https://github.com/haproxy/haproxy/issues/2368 https://www.spinics.net/lists/netdev/msg876470.html It should be harmless to backport this if necessary.	2025-01-27 15:47:46 +01:00
Aurelien DARRAGON	e768a531b7	CLEANUP: tree-wide: define and use acl_match_cond() helper acl_match_cond() combines acl_exec_cond() + acl_pass() and a check on the condition->pol (to check if the cond is inverted) in order to return either 0 if the cond doesn't match or 1 if it matches (or NULL). Thanks to this we can actually simplify some redundant constructs that iterate over rules and evaluate if the condition matches or not. Conditions for tcp-request inspect-content and tcp-response inspect-content couldn't be simplified because they perform an extra check for missing data, and thus still need to leverage acl_exec_cond() It's best to display the patch using "-w", like "git show xxxx -w", because some blocks had to be re-indented after the cleanup, which makes the patch hard to review by default.	2025-01-27 11:11:43 +01:00
Valentine Krasnobaeva	94d3b7375a	CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h As ssl_sock_gencert_load_ca and ssl_sock_gencert_free_ca are compiled only if SSL_NO_GENERATE_CERTIFICATES is not defined, let's align it and move these declarations in ssl_gencert.h.	2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva	846819b316	CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca ssl_sock_load_ca is defined in ssl_gencert.c and compiled only if SSL_NO_GENERATE_CERTIFICATES is not defined. Its name is a bit confusing, as we may think at first glance that it's a generic function, which is also used to load the CA file provided via the 'ca-file' keyword. ssl_set_verify_locations_file is used in this case. So let's rename ssl_sock_load_ca into ssl_sock_gencert_load_ca. The same is applied to ssl_sock_free_ca.	2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva	c987f30245	BUG/MINOR: ssl: put ssl_sock_load_ca under SSL_NO_GENERATE_CERTIFICATES ssl_sock_load_ca and ssl_sock_free_ca definitions are compiled only, if SSL_NO_GENERATE_CERTIFICATES is not set. In case, when we set this define and build haproxy, linker throws an error. So, let's fix this. This should be backported in all stable versions.	2025-01-24 12:31:07 +01:00
Willy Tarreau	670182bc9e	[RELEASE] Released version 3.2-dev4 Released version 3.2-dev4 with the following main changes : - BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey() - MINOR: stktable: add stkey_to_smp() helper - MINOR: stktable: add stksess_getkey() helper - MINOR: stktable: add sc[0-2]_key fetches - BUG/MEDIUM: queues: Adjust the proxy counters when appropriate - MINOR: trace: add help message for -dt argument - MINOR: trace: ensure -dt priority over traces config section - MINOR: trace: support all source alias on -dt - BUG/MINOR: quic: reject NEW_TOKEN frames from clients - MINOR: stktable: fix potential build issue in smp_to_stkey - BUG/MEDIUM: stktable: fix missing lock on some table converters - BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters - MINOR: stktable: fix potential build issue in smp_to_stkey (2nd try) - MINOR: stktable: add smp_fetch_stksess() helper function - MEDIUM: stktable: split src-based key smp_fetch_sc functions - MEDIUM: stktable: split sc_ and src_ fetch lookup logics - MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv - DOC: config: unify sample conv\|fetches optional arguments syntax - DOC: config: stick-table converters support implicit <table> argument - DOC: config: stick-table converter do accept ANY-typed input - DOC: config: clarify return type for some stick-table converters - DOC: config: refer to canonical sticktable converters for src_* fetches - CLEANUP: stktable: move sample_conv_table_bytes_out_rate() - MINOR: stktable: add table_{inc,clr}_gpc* converters - BUG/MAJOR: quic: reject too large CRYPTO frames - BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() - BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro - REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c - BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours - MINOR: quic: Add a BUG_ON() on 
quic_tx_packet refcount - BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null - BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data - CLEANUP: quic: remove unused prototype - MINOR: quic: rename pacing_rate cb to pacing_inter - BUG/MINOR: quic: do not increase congestion window if app limited - MINOR: mux-quic: increment pacing retry counter on expired - MEDIUM: quic: implement credit based pacing - MEDIUM: mux-quic: reduce pacing CPU usage with passive wait - MEDIUM: quic: use dynamic credit for pacing - MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path - MINOR: quic: adapt credit based pacing to BBR - MINOR: tools: add errname to print errno macro name - MINOR: debug: debug_parse_cli_show_dev: use errname - MINOR: debug: show boot and runtime process settings in table v3.2-dev4	2025-01-24 11:01:06 +01:00
Valentine Krasnobaeva	8620ae7962	MINOR: debug: show boot and runtime process settings in table Let's reformat output of "show dev" in order to show some boot and runtime process settings in a table. This makes the output less crowded.	2025-01-24 09:54:57 +01:00
Valentine Krasnobaeva	df7f16d960	MINOR: debug: debug_parse_cli_show_dev: use errname Let's use errname, introduced in the previous commit, in the output of "show dev". This output is intended for engineers, so there is no need to provide the long descriptions of errnos given by strerror.	2025-01-24 09:54:57 +01:00
Valentine Krasnobaeva	44537379fc	MINOR: tools: add errname to print errno macro name Add helper to print the name of errno's corresponding macro, for example "EINVAL" for errno=22. This may be helpful for debugging and for using in some CLI commands output. The switch-case in errname() contains only the errnos currently used in the code. So, it needs to be extended, if one starts to use new syscalls.	2025-01-24 09:54:57 +01:00
Amaury Denoyelle	42bac9339c	MINOR: quic: adapt credit based pacing to BBR Credit based pacing has been further refined to be able to calculate dynamically burst size based on congestion parameter. However, BBR algorithm already provides pacing rate and burst size (labelled as send_quantum) for 1ms of emission. Adapt quic_pacing_reload() to use BBR values to compute pacing credit. This is done via pacing_burst callback which is now only defined for BBR. For other algorithms, determine the burst size over 1ms with the congestion window size and RTT. This should be backported up to 3.1.	2025-01-23 17:41:07 +01:00
Amaury Denoyelle	7896edccdc	MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path Pacing burst size is now dynamic. As such, configuration value has been removed and related fields in bind_conf and quic_cc_path structures can be safely removed. This should be backported up to 3.1.	2025-01-23 17:40:48 +01:00
Amaury Denoyelle	cb91ccd8a8	MEDIUM: quic: use dynamic credit for pacing Major improvements have been introduced in pacing recently. Most notably, QMUX schedules emission on a millisecond resolution, which allows the use of passive wait and is much more CPU friendly. However, an issue remains with the pacing max credit. Unless BBR is used, it is fixed to the configured value from quic-cc-algo bind statement. This is not practical: if too low, it may drastically reduce performance due to 1ms sleep resolution. If too high, some clients will suffer from too much packet loss. This commit fixes the issue by implementing a dynamic maximum credit value based on the network condition specific to each client. Calculation is done to fix a maximum value which should allow QMUX current tasklet context to emit enough data to cover the delay until the next tasklet invocation. As such, avg_loop_us is used to detect the process load. If too small, 1.5ms is used as minimal value, to cover the extra delay incurred by the system which will happen for a default 1ms sleep. This should be backported up to 3.1.	2025-01-23 17:40:48 +01:00
Amaury Denoyelle	8098be1fdc	MEDIUM: mux-quic: reduce pacing CPU usage with passive wait Pacing algorithm has been revamped in the previous commit to implement a credit based solution. This is a far more adaptive solution, which in particular allows catching up in case the pause between paced emissions was longer than expected. This allows QMUX to remove the active loop based on tasklet wake-up. Instead, a new task is used when emission should be paced. The main advantage is that CPU usage is drastically reduced. New pacing task timer is reset each time qcc_io_send() is invoked. Timer will be set only if pacing engine reports that emission must be interrupted. In this case timer is set via qcc_wakeup_pacing() to the delay reported by congestion algorithm, or 1ms if delay is too short. At the end of qcc_io_cb(), pacing task is queued if timer has been set. Pacing task execution is simple enough: it immediately wakes up QCC I/O handler. Note that to have decent performance, it requires a large enough burst defined in configuration of quic-cc-algo. However, this value is common to every listener's clients, which may cause too much loss under some network conditions. This will be addressed in a future patch. This should be backported up to 3.1.	2025-01-23 17:40:22 +01:00
Amaury Denoyelle	4489a61585	MEDIUM: quic: implement credit based pacing Implement a new method for QUIC pacing emission based on credit. This represents the number of packets which can be emitted in a single burst. After emission, decrement from the credit the number of emitted packets. Several emissions can be conducted in the same sequence until the credit is completely decremented. When a new emission sequence is initiated (i.e. under a new QMUX tasklet invocation), credit is refilled according to the delay which occurred between the last and current emission context. The main advantage of this new mechanism is that it allows several emissions to be conducted in the same task context without having to wait between each invocation. Wait is only forced if pacing is expired, which is now equivalent to having a null credit. Furthermore, if the delay between two emission sequences was smaller than expected, credit is only partially refilled. This allows to restart emission without having to wait for the whole credit to be available. On the implementation side, a new field <credit> is available in quic_pacer structure. It is automatically decremented on quic_pacing_sent_done() invocation. Also, a new function quic_pacing_reload() must be used by QUIC MUX when a new emission sequence is initiated to refill credit. <next> field from quic_pacer has been removed. For the moment, credit is based on the burst configured via quic-cc-algo keyword, or directly reported by BBR. This should be backported up to 3.1.	2025-01-23 17:40:20 +01:00
Amaury Denoyelle	9d8589f0de	MINOR: mux-quic: increment pacing retry counter on expired A field <paced_sent_ctr> from quic_pacer structure is used to report the number of occurrences where emission has been interrupted due to pacing. However, it was not incremented when QUIC MUX had to immediately pause emission as pacing was still not yet expired. Fix this by incrementing <paced_sent_ctr> in qcc_io_send() prior to emission if pacing is expired. Note that incrementation is only done once if the tasklet is then repeatedly woken up until the timer is expired. This should be backported up to 3.1.	2025-01-23 17:29:14 +01:00
Amaury Denoyelle	bbaa7aef7b	BUG/MINOR: quic: do not increase congestion window if app limited Previously, congestion window was increased each time a new acknowledgment was received. However, it did not take into account the window filling level. In a network condition with negligible loss, this will cause the window to be incremented until the maximum value (by default 480k), even though the application does not have enough data to fill it. In most cases, this issue is not noticeable. However, it may lead to excessive memory consumption when a QUIC connection is suddenly interrupted, as in this case haproxy will fill the window with retransmissions. It even has caused OOM crash when thousands of clients were interrupted at once on a local network benchmark. Fix this by first checking window level prior to every incrementation via a new helper function quic_cwnd_may_increase(). It was arbitrarily decided that the window must be at least 50% full when the ACK is handled before it is incremented. This value is a good compromise to keep window in check while still allowing fast increment when needed. Note that this patch only concerns cubic and newreno algorithms. BBR already has its own notion of application limited which ensures the window is only incremented when necessary. This should be backported up to 2.6.	2025-01-23 14:49:35 +01:00
Amaury Denoyelle	7c0820892f	MINOR: quic: rename pacing_rate cb to pacing_inter Rename one of the congestion algorithms pacing callback from pacing_rate to pacing_inter. This better reflects that this function returns a delay (in nanoseconds) which should be applied between each packet emission to fill the congestion window with a perfectly smoothed emission. This should be backported up to 3.1.	2025-01-23 14:49:35 +01:00
Amaury Denoyelle	2178bf1192	CLEANUP: quic: remove unused prototype Remove undefined quic_pacing_send() function prototype from quic_pacing module. This should be backported up to 3.1.	2025-01-23 14:49:35 +01:00
Christopher Faulet	b18e988e0d	BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data It is possible to have front H1 connections waiting for the client timeout while they should be closed because a connection error was reported before sending an error message to the client. It is not a leak because the connections are closed when the timeout expires but it is a waste of resources, especially if the client timeout is high. When an early error message must be sent to the client, if an error was already detected, no data are sent and the output buffer is released. At this stage, the H1 connection is in CLOSING state and it must be released. But because of a bug, this is not performed. The client timeout is rearmed and the H1 connection is only closed when it expires. To fix the issue, the condition to close a H1C must also be evaluated when an error is detected before sending data. It is only an issue with idle client connections, because there is no H1 stream in that case and the error message is generated by the mux itself. This patch must be backported as far as 2.8.	2025-01-23 11:05:48 +01:00
Frederic Lecaille	1f099db7e2	BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null Some new compilers warn that <oldest_lost> variable can be null even though this cannot be the case, as mentioned by the comment about an already present ASSUME_NONNULL() call. The warning is as follows: src/quic_loss.c: In function ‘qc_release_lost_pkts’: src/quic_loss.c:307:86: error: potential null pointer dereference [-Werror=null-dereference] 307 \| unsigned int period = newest_lost->time_sent_ms - oldest_lost->time_sent_ms; \| ~~~~~~~~~~~^~~~~~~~~~~~~~ Move up this ASSUME_NONNULL() statement to please these compilers. Must be backported as far as 2.6 to ease any further backport around this code part.	2025-01-21 22:01:34 +01:00
Frederic Lecaille	4f38c4bfd8	MINOR: quic: Add a BUG_ON() on quic_tx_packet refcount It is definitely a bug to call quic_tx_packet_refdec() to decrement the reference counter of a TX packet, and possibly to release its memory, when this counter is already negative or null. This counter is incremented when a TX frame is attached to it with some allocated memory and when the packet is inserted into a data structure, if needed (list or tree). Should be easily backported as far as 2.6 to ease any further backport around this code part.	2025-01-21 22:01:34 +01:00
Frederic Lecaille	cb729fb64d	BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours Reset the ->prev and ->next fields of a coalesced TX packet to ensure it cannot access its neighbours several times after it is supposed to have been detached from them by calling quic_tx_packet_dgram_detach(). There are two cases where a packet can be coalesced to a previously built one: when it is built into the same datagram without GSO (and flagged with QUIC_FL_TX_PACKET_COALESCED) or when it is sent from the same sendto() syscall with GSO (not flagged with QUIC_FL_TX_PACKET_COALESCED). This fix may be related to GH #2839. Must be backported as far as 2.6.	2025-01-21 22:01:34 +01:00
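The idea behind the fix in cb729fb64d can be illustrated with a minimal sketch. It is not the actual quic_tx_packet_dgram_detach() code: `struct demo_pkt`, `demo_pkt_attach()` and `demo_pkt_detach()` are hypothetical names used only for illustration, assuming a plain doubly linked chain of packets. The point is that once the detached packet's own links are cleared, a second detach can no longer touch its former neighbours.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a coalesced TX packet; only the links matter. */
struct demo_pkt {
	struct demo_pkt *prev;
	struct demo_pkt *next;
};

/* Chain <pkt> right after <prev>, as if built into the same datagram. */
static void demo_pkt_attach(struct demo_pkt *prev, struct demo_pkt *pkt)
{
	assert(prev && pkt);
	pkt->prev = prev;
	pkt->next = prev->next;
	if (prev->next)
		prev->next->prev = pkt;
	prev->next = pkt;
}

/* Unlink <pkt> from its neighbours, then clear its own links so that
 * any later call on the same packet is a no-op.
 */
static void demo_pkt_detach(struct demo_pkt *pkt)
{
	if (pkt->prev)
		pkt->prev->next = pkt->next;
	if (pkt->next)
		pkt->next->prev = pkt->prev;
	pkt->prev = NULL;
	pkt->next = NULL;
}
```

Without the two final assignments, a packet detached twice would rewrite the links of packets it no longer belongs with, which is the kind of stale access the commit message describes.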
Willy Tarreau	b066c0affb	REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c version.c tries to centralize all variables conveying version information, but there's still an issue with the BUILD_* variables which are only passed to haproxy.o and are only updated when that one is rebuilt. This is not very logical given that we can end up with values there which contradict info from version.c. Better move all of these to version.c which is systematically rebuilt. Most of these variables only end up as string concatenation at the moment. Some of them are even duplicated. In version.c we now have one variable (or constant) for each of them and haproxy.c references them in messages. This is much more logical and easier to maintain in a consistent state. The patch looks a bit large but it really only moves the ifdefed string assignment from one file to another, placing them into variables.	2025-01-20 17:53:55 +01:00
Willy Tarreau	9e61cf6790	BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro This environment variable was added by commit d4c0be6b20 ("MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start"). However, it's set from the macro that is passed during the build process instead of being set from the variable that's kept up to date in version.c. The difference is visible only during debugging/bisecting because only changed files and version.o are rebuilt, but not necessarily haproxy.o, which is where the environment variable is set. This means that the version exposed in the environment is not necessarily the same as the one presented in "haproxy -v" during such debugging sessions. This should be backported to 2.8. It has no impact at all on regularly built binaries.	2025-01-20 17:53:55 +01:00
Aurelien DARRAGON	bfa493d4be	BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() sink_new_from_srv() leverages sink_new_buf() with the server id as name, sink_new_buf() then calls __sink_new() with the provided name. Unfortunately sink_new() is designed in such a way that it will first look up in the list of existing sinks to check if a sink already exists with given name, in which case the existing sink is returned. While this behavior may be error-prone, it is actually up to the caller to ensure that the provided name is unique if it really expects a unique sink pointer. Due to this bug in sink_new_from_srv(), multiple tcp servers with the same name defined in distinct log backends would end up sharing the same sink, which means messages sent to one of the servers would also be forwarded to all servers with the same name across all log backend sections defined in the config, which is obviously an issue and could even raise security concerns. Example: defaults log backend@log-1 local0 backend log-1 mode log server s1 127.0.0.1:514 backend log-2 mode log server s1 127.0.0.1:5114 With the above config, logs sent to log-1/s1 would also end up being sent to log-2/s1 due to server id "s1" being used for tcp servers in distinct log backends. To fix the issue, we now prefix the sink name with the backend name: the back_name/srv_id combination is known to be unique (backend name serves as a namespace). This bug was reported by GH user @landon-lengyel under #2846. UDP servers (with udp@ prefix before the address) are not affected as they don't make use of the sink facility. As a workaround, one should manually ensure that all tcp servers across different log backends (backend with "mode log" enabled) use unique names. This bug was introduced in e58a9b4 ("MINOR: sink: add sink_new_from_srv() function") thus it exists since the introduction of log backends in 2.9, which means this patch should be backported up to 2.9.	2025-01-20 12:33:20 +01:00
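The collision described in bfa493d4be can be reproduced with a small, self-contained sketch. It does not use HAProxy's real sink API: `struct demo_sink`, `demo_sink_new()`, `demo_sink_from_srv_old()` and `demo_sink_from_srv_new()` are hypothetical names, and the registry is assumed to be a fixed-size array looked up by name. It only shows why a "find or create by name" helper returns a shared object when the server id alone is used, and why a "backend/server" name avoids that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_SINKS 8

/* Hypothetical, simplified sink: only its name is tracked. */
struct demo_sink {
	char name[64];
};

static struct demo_sink demo_sinks[DEMO_MAX_SINKS];
static int demo_nb_sinks;

/* Return the sink already registered under <name>, or create it.
 * Returns NULL when the demo registry is full.
 */
static struct demo_sink *demo_sink_new(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < demo_nb_sinks; i++)
		if (strcmp(demo_sinks[i].name, name) == 0)
			return &demo_sinks[i];

	if (demo_nb_sinks == DEMO_MAX_SINKS)
		return NULL;

	snprintf(demo_sinks[demo_nb_sinks].name,
	         sizeof(demo_sinks[demo_nb_sinks].name), "%s", name);
	return &demo_sinks[demo_nb_sinks++];
}

/* Colliding variant: the server id alone is used as the sink name. */
static struct demo_sink *demo_sink_from_srv_old(const char *be, const char *srv)
{
	(void)be;
	return demo_sink_new(srv);
}

/* Namespaced variant: the sink name is "backend/server". */
static struct demo_sink *demo_sink_from_srv_new(const char *be, const char *srv)
{
	char name[64];

	snprintf(name, sizeof(name), "%s/%s", be, srv);
	return demo_sink_new(name);
}
```

With the example configuration from the commit message, the first variant hands the same sink to log-1/s1 and log-2/s1, while the second one gives each server its own sink.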

23840 Commits