haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-20 21:31:28 +02:00

Author	SHA1	Message	Date
Willy Tarreau	ded2110ec6	MEDIUM: peers: move process_peer_sync() to a single thread The remaining half of the task_queue() and task_wakeup() contention is caused by this function when peers are in use, because just like process_table_expire(), it's created using task_new_anywhere() and is woken up for local updates. Let's turn it to single thread by rotating the assigned threads during initialization so that a table only runs on one thread at a time. Here we go backwards to assign the threads, so that on small setups they don't end up on the same CPUs as the ones used by the stick-tables. This way this will make an even better use of large machines. The performance remains the same as with previous patch, even slightly better (1-3% on avg). At this point there's almost no multi-threaded task activity anymore (only srv_cleanup_idle_server once in a while). This should improve the situation described by Felipe in issues #3084 and #3101. This should be backported to 3.2 after some extended checks.	2025-09-10 19:14:05 +02:00
Willy Tarreau	e05afda249	MEDIUM: stick-table: move process_table_expire() to a single thread A great deal of the task_queue() contention is caused by this function because it's created using task_new_anywhere() and is subject to heavy updates. Let's turn it to single thread by rotating the assigned threads during initialization so that a table only runs on one thread at a time. However there's a trick: the function used to call task_queue() to requeue the task if it had advanced its timer (may only happen when learning an entry from a peer). We can't do that anymore since we can't queue another thread's task. Thus when the task needs to be scheduled earlier than previously planned, we simply perform a wakeup instead. It will likely do nothing and will self-adjust its next wakeup timer. Doing so halves the number of multi-thread task wakeups. In addition the request rate at saturation increased by 12% with 16 peers and 40 tables on 16 8-thread processes. This should improve the situation described by Felipe in issues #3084 and #3101. This should be backported to 3.2 after some extended checks.	2025-09-10 19:13:33 +02:00
Willy Tarreau	2831cb104f	BUG/MINOR: stick-table: make sure never to miss a process_table_expire update In stktable_requeue_exp(), there's a tiny race at the beginning during which we check the task's expiration date to decide whether or not to wake process_table_expire() up. During this race, the task might just have finished running on its owner thread and we can miss a task_queue() opportunity, which probably explains why during testing it seldom happens that a few entries are left at the end. Let's perform a CAS to confirm the value is still the same before leaving. This way we're certain that our value has been seen at least once. This should be backported to 3.2.	2025-09-10 18:45:01 +02:00
Willy Tarreau	2ce5e0edcc	MEDIUM: resolvers: make the process_resolvers() task single-threaded This task is sometimes caught triggering the watchdog while waiting for the infamous resolvers lock, or the scheduler's wait queue lock in task_queue(). Both are caused by its multi-threaded capability. The task may indeed start on a thread that's different from the one that is currently receiving a response and that holds the resolvers lock, and when being queued back, it requires to lock the wait queue. Both problems disappear when sticking it to a single thread. But for configs running multiple resolvers sections, it would be suboptimal to run them all on the same thread. In order to avoid this, we implement a counter in the resolvers_finalize_config() section that rotates the thread for each resolvers section. This was sufficient to further improve the performance here, making the CPU usage drop to about 7% (from 11 previously or 38 initially) and not showing any resolvers lock contention anymore in perf top output. The change was kept fairly minimal to permit a backport once enough testing is conducted on it. It could address a significant part of the trouble reported by Felipe in GH issue #3101.	2025-09-10 16:51:14 +02:00
Willy Tarreau	d624aceaef	MEDIUM: dns: bind the nameserver sockets to the initiating thread There's still a big architectural limitation in the dns/resolvers code regarding threads: resolvers run as a task that is scheduled to run anywhere, and each NS dgram socket is bound to any thread of the same thread group as the initiating thread. This becomes a big problem when dealing with multiple nameservers because responses arrive on any thread, start by locking the resolvers section, and other threads dealing with responses are just stuck waiting for the lock to disappear. This means that most of the time is exclusively spent causing contention. The process_resolvers() function also suffers from this contention but apparently less often. It turns out that the nameserver sockets are created during emission of the first packet, triggered from the resolvers task. The present patch exploits this to stick all sockets to the calling thread instead of any thread. This way there is no longer any contention between multiple nameservers of a same resolvers section. Tests with a section having 10 name servers showed that the CPU usage dropped from 38 to about 10%, or almost by a factor of 4. Note that TCP resolvers do not offer this possibility because the tasks that manage the applets are created earlier to run anywhere during config parsing. This might possibly be refined later, e.g. by changing the task's affinity when it first runs. The change was kept fairly minimal to permit a backport once enough testing is conducted on it. It could address a significant part of the trouble reported by Felipe in GH issue #3101.	2025-09-10 16:48:09 +02:00
Olivier Houchard	07c10ec2f1	BUG/MEDIUM: ssl: Fix a crash if we failed to create the mux In ssl_sock_io_cb(), if we failed to create the mux, we may have destroyed the connection, so only attempt to access it to get the ALPN if conn_create_mux() was successful. This fixes crashes that may happen when using ssl.	2025-09-10 12:02:53 +02:00
Olivier Houchard	1759c97255	BUG/MEDIUM: ssl: Fix a crash when using QUIC Commit 5ab9954faa9c815425fa39171ad33e75f4f7d56f introduced a new flag in ssl_sock_ctx, to know that an ALPN was negotiated. However, the way to get the ssl_sock_ctx was wrong for QUIC. If we're using QUIC, get it from the quic_conn. This should fix crashes when attempting to use QUIC.	2025-09-10 11:45:03 +02:00
Willy Tarreau	be86a69fe8	DEBUG: stick-tables: export stktable_add_pend_updates() for better reporting This function is a tasklet handler used to send peers updates, and it can happen quite a bit in "show tasks" and "show profiling tasks", so let's export it so that we don't face a cryptic symbol name: $ socat - /tmp/haproxy-n10.stat <<< "show tasks" Running tasks: 43 (8 threads) function places % lat_tot lat_avg calls_tot calls_avg calls% process_table_expire 16 37.2 1.072m 4.021s 115831 7239 15.4 task_process_applet 15 34.8 1.072m 4.287s 486299 32419 65.0 stktable_add_pend_updates 8 18.6 - - 89725 11215 12.0 sc_conn_io_cb 3 6.9 - - 5007 1669 0.6 process_peer_sync 1 2.3 4.293s 4.293s 50765 50765 6.7 This should be backported to 3.2 as it participates to debugging the table+peers processing overhead.	2025-09-10 11:34:51 +02:00
Willy Tarreau	993c09438b	BUG/MEDIUM: stick-tables: don't loop on non-expirable entries The stick-table expiration of ref-counted entries was insufficiently addressed by commit 324f0a60ab ("BUG/MINOR: stick-tables: never leave used entries without expiration"), because now entries are just requeued where they were, so they're visited over and over for long sessions, causing process_table_expire() to loop, eating CPU and causing lock contention. Here we take care of refreshing their timer when they are met, so that we don't meet them more than once per stick-table lifetime. It should address at least a part of the recent degradation that Felipe noticed in GH #3084. Since the fix above was marked for backporting to 3.2, this one should be backported there as well.	2025-09-10 11:27:27 +02:00
Willy Tarreau	997d217dee	MINOR: tools: don't emit "+0" for symbol names which exactly match known ones resolve_sym_name() knows a number of symbols, but when one exactly matches (e.g. a task's handler), it systematically displays the offset behind it ("+0"). Let's only show the offset when non-zero. This can be backported as this is helpful for debugging.	2025-09-10 10:44:33 +02:00
Willy Tarreau	9eb35563a6	MINOR: activity: indicate the number of calls on "show tasks" The "show tasks" command can be useful to inspect run queues for active tasks, but currently it's difficult to distinguish an occasional running task from a heavily active one. Let's collect the number of calls for each of them, and report them averaged over the number of instances of each task as well as a percentage of the total used. This way it even becomes possible to get a hint about how CPU usage is distributed.	2025-09-10 10:44:33 +02:00
Willy Tarreau	17d3392348	BUG/MINOR: activity: fix reporting of task latency In 2.4, "show tasks" was introduced by commit 7eff06e162 ("MINOR: activity: add a new "show tasks" command to list currently active tasks") to expose some info about running tasks. The latency is not correct because it's a u32 subtracted from a u64. It ought to have been cast to u32 for the operation, which is what this patch does. This can be backported to 2.4.	2025-09-10 10:44:33 +02:00
Willy Tarreau	bdff394195	BUILD: ssl: address a recent build warning when QUIC is enabled Since commit 5ab9954faa ("MINOR: ssl: Add a flag to let it known we have an ALPN negociated"), when building with QUIC we get this warning: src/ssl_sock.c: In function 'ssl_sock_advertise_alpn_protos': src/ssl_sock.c:2189:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] Let's just move the instructions after the optional declaration. No backport is needed.	2025-09-10 10:44:33 +02:00
Olivier Houchard	d4c51a4f57	MEDIUM: server: Make use of the stored ALPN stored in the server Now that we know which ALPN gets negotiated for a given server, use that to decide if we can create the mux right away in connect_server(), and use it in conn_install_mux_be(). That way, we may create the mux soon enough for early data to be sent, before the handshake has been completed. This commit depends on several previous commits, and it has not been deemed important enough to backport.	2025-09-09 19:01:24 +02:00
Willy Tarreau	6a2b3269f9	CLEANUP: backend: clarify the cases where we want to use early data The conditions to use early data on output are super tricky and detected later, so that it's difficult to figure how this works. This patch splits the condition in two parts, the one that can be performed early that is based on config/client/etc. It is used to clear a variable that allows early data to be used in case any condition is not satisfied. It was purposely split into multiple independent and reviewable tests. The second part remains where it was at the end, and is used to temporarily clear the handshake flags to let the data layer use early data. This one being tricky, a large comment explaining the principle was added. The logic was not changed at all, only the code was made more readable.	2025-09-09 19:01:24 +02:00
Willy Tarreau	9b9d0720e1	CLEANUP: backend: simplify the complex ifdef related to 0RTT in connect_server() Since 3.0 we have HAVE_SSL_0RTT precisely to avoid checking horribly complicated and unmaintainable conditions to detect support for 0RTT. Let's just drop the complex condition and use the macro instead.	2025-09-09 19:01:24 +02:00
Willy Tarreau	4aaf0bfbce	CLEANUP: backend: invert the condition to start the mux in connect_server() Instead of trying to switch from delayed start to instant start based on a single condition, let's do the opposite and preset the condition to instant start and detect what could cause it to be delayed, thus falling back to the slow mode. The condition remains exactly the inverted one and better matches the comment about ALPN being the only cause of such a delay.	2025-09-09 19:01:24 +02:00
Willy Tarreau	7b4a7f92b5	CLEANUP: backend: clarify the role of the init_mux variable in connect_server() The init_mux variable is currently used in a way that's not super easy to grasp. It's set a bit too late and requires to know a lot of info at once. Let's first rename it to "may_start_mux_now" to clarify its role, as the purpose is not to force the mux to be initialized now but to permit it to do it.	2025-09-09 19:01:24 +02:00
Olivier Houchard	ff47ae60f3	MEDIUM: server: Introduce the concept of path parameters Add a new field in struct server, path parameters. It will contain connection information for the server that is not expected to change. For now, just store the ALPN negotiated with the server. Each time a handshake is done, we'll update it, even though it is not supposed to change. This will be useful when trying to send early data, that way we'll know which mux to use. Each time the server goes down or is disabled, that information is erased, as we can't be sure those parameters will be the same once the server is back up.	2025-09-09 19:01:24 +02:00
Olivier Houchard	9d65f5cd4d	MINOR: ssl: Use the new flag to know when the ALPN has been set. Now that we have a flag to let us know the ALPN has been set, we no longer have to call ssl_sock_get_alpn() to know if the ALPN has been negotiated already. Remove the call to conn_create_mux() from ssl_sock_handshake(), and just reuse the one already present in ssl_sock_io_cb() if we have received early data, and if the flag is set.	2025-09-09 19:01:24 +02:00
Olivier Houchard	5ab9954faa	MINOR: ssl: Add a flag to let it known we have an ALPN negociated Add a new flag to the ssl_sock_ctx, to be set as soon as the ALPN has been negotiated. This happens before the handshake has been completed, and that information will let us know that, when we receive early data, if the ALPN has been negotiated, then we can immediately create a mux, as the ALPN will tell us which mux to use.	2025-09-09 19:01:24 +02:00
Olivier Houchard	6b78af837d	BUG/MEDIUM: ssl: create the mux immediately on early data If we received early data, and an ALPN has been negotiated, then immediately try to create a mux if we did not have one already. Generally, at this point we would not have one, as the mux is decided by the ALPN. However at this point, even if the handshake is not done yet, we have enough to determine the ALPN, so we can immediately create the mux. Doing so lets us process the request immediately, without waiting for the handshake to be done. This should be backported up to 2.8.	2025-09-09 19:01:24 +02:00
Olivier Houchard	aa25ddb773	BUG/MEDIUM: h1: Allow reception if we have early data In h1_recv_allowed(), do not forbid the reception if we are yet to complete the connection, if we have received early data on it. That way, we can deal with them right away, instead of waiting for the handshake to be done. This should be backported up to 2.8.	2025-09-09 19:01:24 +02:00
Willy Tarreau	d7696d11e1	MEDIUM: peers: don't even try to process updates under contention Recent fix 2421c3769a ("BUG/MEDIUM: peers: don't fail twice to grab the update lock") improved the situation a lot for peers under locking contention but still not enough for situations with many peers and many entries to expire fast. It's indeed still possible to trigger warnings at end of injection sessions for 16 peers at 100k req/s each doing 10 random track-sc when process_table_expire() runs and holds the update lock if compiled with a high value of STKTABLE_MAX_UPDATES_AT_ONCE (1000). Better just not insist in this case and postpone the update. At this point, under load only ebmb_lookup() consumes CPU, other functions are in the few percent, indicating reasonable contention, and peers remain updated. This should be backported to 3.2 after a bit of testing.	2025-09-09 17:56:37 +02:00
Willy Tarreau	d5e7fba5c0	MEDIUM: stick-tables: don't wait indefinitely in stktable_add_pend_updates() This one doesn't need to wait forever, if it cannot work it can postpone it. When building with a high value of STKTABLE_MAX_UPDATES_AT_ONCE (1000), it's still possible to trigger warnings in this function on the write lock that is contended by peers and expiration. Changing it for a trylock resolves the issue. This should be backported to 3.2 after a bit of testing.	2025-09-09 17:56:37 +02:00
Willy Tarreau	a771b14541	MEDIUM: stick-tables: give up on lock contention in process_table_expire() process_table_expire() can take quite a lot of time running over all shards. During this time it will hinder track-sc rules and peers, which will experience an increased latency to do their work, especially peers where each message will cause a lock, whose cumulated time can exceed the watchdog's patience. Here, we proceed just like in stktable_trash_oldest(), which is that we're using a trylock to detect contention. The first time it happens, if we hadn't purged anything, we switch to a regular lock to perform the operation, and next time it happens we abort. This guarantees that some entries will be expired and that contention will be reduced when detected. With this change, various tests didn't manage to produce any warning, including at the end of the load generation session. This should be backported to 3.2 after a bit more testing.	2025-09-09 17:56:37 +02:00
Willy Tarreau	f87cf8b76e	MEDIUM: stick-tables: relax stktable_trash_oldest() to only purge what is needed stktable_trash_oldest() does insist a lot on purging what was requested, only limited by STKTABLE_MAX_UPDATES_AT_ONCE. This is called in two conditions, one to allocate a new stksess, and the other one to purge entries of a stopping process. The cost of iterating over all shards is huge, and a shard lock is taken each time before looking up entries. Moreover, multiple threads can end up doing the same and looking hard for many entries to purge when only one is needed. Furthermore, all threads start from the same shard, hence synchronize their locks. All of this costs a lot to other operations such as access from peers. This commit simplifies the approach by ignoring the budget, starting from a random shard number, and using a trylock so as to be able to give up early in case of contention. The approach chosen here consists in trying hard to flush at least one entry, but once at least one is evicted or at least one trylock failed, then a failure on the trylock will result in finishing. The function now returns a success as long as one entry was freed. With this, tests no longer show watchdog warnings during tests, though a few still remain when stopping the tests (which are not related to this function but to the contention from process_table_expire()). With this change, under high contention some entries' purge might be postponed and the table may occasionally contain slightly more entries than their size (though this already happens since stksess_new() first increments ->current before decrementing it). Measures were made on a 64-core system with 8 peers of 16 threads each, at CPU saturation (350k req/s each doing 10 track-sc) for 10M req, with 3 different approaches: - this one resulted in 1500 failures to find an entry (0.015% size overhead), with the lowest contention and the fairest peers distribution. - leaving only after a success resulted in 229 failures (0.0029% size overhead) but doubled the time spent in the function (on the write lock precisely). - leaving only when both a success and a failed lock were met resulted in 31 failures (0.00031% overhead) but the contention was high enough again so that peers were not all up to date. Considering that a saturated machine might exceed its entries by 0.015% is pretty minimal, the mechanism is kept. This should be backported to 3.2 after a bit more testing as it resolves some watchdog warnings and panics. It requires precedent commit "MINOR: stick-table: permit stksess_new() to temporarily allocate more entries" to over-allocate instead of failing in case of contention.	2025-09-09 17:56:37 +02:00
Willy Tarreau	b119280f60	MINOR: stick-table: permit stksess_new() to temporarily allocate more entries stksess_new() calls stktable_trash_oldest() to release some entries. If it fails however, it will fail to allocate an entry. This is a problem because it doesn't permit stktable_trash_oldest() to be used in best effort mode, which forces it to impose high contention. There's no problem with allocating slightly more in practice. In the worst case if all entries are in use, it's not shocking to temporarily exceed the number of entries by a few units. Let's relax this problematic rule. This patch might need to be backported to 3.2 after a bit more testing in order to support locking relaxation.	2025-09-09 17:56:37 +02:00
Willy Tarreau	0f33a55171	DEBUG: peers: export functions that use locks The following functions take locks and are often involved in warnings but are currently not resolved, so let's export them so that they are properly decoded: peer_prepare_updatemsg(), peer_send_teachmsgs(), peer_treat_updatemsg(), peer_send_msgs(), peer_io_handler() This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	25195ba1e7	MINOR: debug: report the time since last wakeup and call When task profiling is enabled, the current thread knows when the currently running task was woken up and called, so we can calculate how long ago it was woken up and called. This is convenient to figure whether or not a warning or panic is caused by this task or by a previous one, so let's report this info in thread outputs when known. It would be useful to backport this to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	12bc4f9c44	MINOR: debug: report the number of loops and ctxsw for each thread When multiple similar warnings are emitted, it can be difficult to know whether only one task is looping slowly or if many are sharing the CPU. Let's report the number of context switches and polling loop turns in thread dumps so that warnings are easier to understand. This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	c3f94fbd9b	DEBUG: stream: count the number of passes in the connect loop Normally the connect loop cannot loop, but some recent traces can easily convince one of the opposite. Let's add a counter, including in panic dumps, in order to avoid the repeated long head scratching sessions starting with "and what if...". In addition, if it's found to loop, this time it will be certain and will indicate what to zoom in. This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	8153cf1e51	MINOR: debug: report the process id in warnings and panics Warning and panic messages currently do not report the PID. This is annoying when trying to reproduce problems because warnings do not allow one to know which process to attach to in order to debug, and panics do not permit to know which core dump corresponds to which dump. Let's add them in both messages. This should probably be backported at least to 3.2.	2025-09-09 17:56:14 +02:00
Amaury Denoyelle	0678d0a69b	MINOR: check: reject invalid check config on a QUIC server QUIC is now supported on the backend side. The previous commit ensures that simple checks can be activated on QUIC servers without any issue. The current patch ensures that check server settings remain compatible with a QUIC server. Thus, configuration is now invalid if check specifies an explicit MUX proto other than QUIC, disables SSL or tries to use PROXY protocol.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	cd3027a7ee	BUG/MINOR: check: ensure checks are compatible with QUIC servers Previously, checks were only performed on TCP. However, QUIC is now supported on backend. Prior to this patch, check activation for QUIC servers would result in a crash. To ensure compatibility between QUIC servers and checks, adjust protocol_lookup() performed during check connect step. Instead of using a hardcoded PROTO_TYPE_STREAM, the value is now derived from server settings. This does not need to be backported.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	c6d33c09fc	BUG/MEDIUM: checks: fix ALPN inheritance from server If no specific check settings are defined on a server line, it is expected that these checks will be performed with the same parameters as normal connections on the same server. ALPN must be carefully taken into account for checks. Most notably, MUX initialization is delayed so that it is performed only after SSL handshake. Prior to this patch, MUX init delay was only performed if ALPN was defined via check settings. Thus, with the following settings, checks would be performed on HTTP/1.1 without consulting ALPN negotiation result from the server : server s1 127.0.0.1:443 ssl crt <...> alpn h2 check This bug may result in checks reporting failure, for example in case of a server answering HTTP/2 to ALPN negotiation to the configuration above. Besides, there is incoherency between normal and check connections, which is not what the documentation specifies. This patch fixes this code. Now server parameters are also taken into account. This ensures that checks and normal connections by default use the same connection method. This must be backported up to 2.4.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	fee3bd48b4	OPTIM: check: do not delay MUX for ALPN if SSL not active To ensure ALPN is properly applied on checks, MUX initialization is delayed so that it is created on SSL handshake completion. However, this does not check if SSL is really active for the connection. This patch adjusts the condition so that MUX init is not delayed if SSL is not active for the check connection. A similar process is already conducted for normal connections via connect_server(). This must be backported up to 2.4. Despite not being a bug, it must be backported for the following patch which fixes check ALPN inheritance from server settings.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	536d2aafa3	BUG/MINOR: hq-interop: adjust parsing/encoding on backend side HTTP/0.9 is available on top of QUIC. This protocol is reserved for internal use, mostly interop purpose. This patch adjusts HTTP/0.9 layer with the following changes : * version is not emitted anymore on the status line. This is performed as some servers do not parse it correctly. * status line is set explicitly on HTX status-line. This ensures the correct HTTP status code is reported to the upper stream layer. This does not need to be backported.	2025-09-09 16:55:09 +02:00
Christopher Faulet	b901e56acd	BUG/MEDIUM: mux-h2: Reinforce conditions to report an error to app-layer stream This patch relies on the previous one ("BUG/MEDIUM: mux-h2: Report RST/error to app-layer stream during 0-copy fwding"). When the end of the connection is detected, so when the H2_CF_END_REACHED flag is set after the shutdown was received and all incoming data were processed, if a stream is blocked by the flow control (the stream one or the connection one), an error must be reported to the app-layer stream. Otherwise, outgoing data won't be sent and the opposite side will handle this as a lack of room. So the stream will be blocked until the write timeout is triggered. By reporting the error early, the stream can be immediately closed. This patch should be backported to 3.2. For older versions, it is probably a good idea to wait for a bug report.	2025-09-09 16:30:54 +02:00
Christopher Faulet	22e14f7b54	BUG/MEDIUM: mux-h2: Report RST/error to app-layer stream during 0-copy fwding In h2_nego_ff(), it is important to report reset and error to app-layer stream and to send the RST-STREAM frame accordingly. It is not clear if it is an issue or not. But it is clearly a difference with the classical forwarding via h2_snd_buf. And it is mandatory for the next fix. This patch should be backported to 3.2. But it is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:30:21 +02:00
Christopher Faulet	3b7112aa1d	BUG/MINOR: mux-h2: Remove H2_CF_DEM_DFULL flags when the demux buffer is reset This only happens when a connection error is detected or when the H2 connection is in ERR/ERR2 state. The demux buffer is explicitly reset. In that case, it is important to remove the flag reporting this buffer as full. It is probably worth to backport this patch to 3.2. But it is not mandatory on older versions because it does not fix any known issue.	2025-09-09 16:29:14 +02:00
Christopher Faulet	12edcccc82	BUG/MEDIUM: mux-h2: Restart reading when mbuf ring is no longer full When the mbuf ring buffer is full, the flag H2_CF_DEM_MROOM is set on the H2 connection to block any demux. It is important to properly handle ACK frames. However, we must take care to restart reading when some data were removed from the mbuf. Otherwise, we may block the demux for no reason. It is especially an issue if the demux buffer is full. In that case, the H2 connection is blocked, waiting for the timeout. This patch should be backported to 3.2. But it is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:07:20 +02:00
Christopher Faulet	c6e4584d2b	BUG/MEDIUM: mux-h2; Don't block reveives in H2_CS_ERROR and H2_CS_ERROR2 states The H2 connection is switched to ERR when a GOAWAY must be sent and in ERR2 when it is sent. In these states, no more data can be emitted by the mux. But there is no reason to not try to process incoming data or to not try to receive data. It is especially important to be able to get the shutdown from the TCP connection when an SSL connection was previously detected. Otherwise, it is possible to block an H2 connection until its timeout expiration to be able to close it. This patch should be backported to 3.2. But it is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:07:20 +02:00
Christopher Faulet	626d7934cf	BUG/MEDIUM: mux-h2: Reset MUX blocking flags when a send error is caught When a send error is detected on the underlying connection, a pending error is reported to the H2 connection by setting H2_CF_ERR_PENDING flag. When this happens the tail of the mux ring buffer is reset. However some blocking flags remain set and have no chance to be removed later because of the pending error. Especially the flag H2_CF_DEM_MROOM which blocks data demultiplexing. Thus, it is possible to block an H2 connection with unparsed incoming data. Worse, if a read event is received, it could lead to a wakeup loop between the H2 connection and the underlying SSL connection. The H2 connection is unable to convert the pending error to a fatal error because the demultiplexing is blocked. In the meantime, it tries to receive more data because of the not-consumed read event. On the underlying connection side, the error detected earlier blocks the read, but the H2 connection is woken up to handle the error. To fix the issue, blocking flags must be removed when a send error is caught, H2_CF_MUX_MFULL and H2_CF_DEM_MROOM flags. But, it is not necessary to only release the tail of the mbuf ring. When a send error is detected, all outgoing data can be flushed. So, now, in h2_send(), h2_release_mbuf() function is called on pending error. The mbuf ring is fully released and H2_CF_MUX_MFULL and H2_CF_DEM_MROOM flags are removed. Many thanks to Krzysztof Kozłowski for his help to spot this issue. This patch could be backported at least as far as 2.8. But it is a bit sensitive. So, it is probably a good idea to backport it to 3.2 for now and wait for bug report on older versions.	2025-09-09 16:07:20 +02:00
Amaury Denoyelle	0b6908385e	BUG/MINOR: quic: properly support GSO on backend side Previously, GSO emission was explicitly disabled on backend side. This is not true since the following patch, thus GSO can be used, for example when transferring large POST requests to a HTTP/3 backend. commit e064e5d46171d32097a84b8f84ccc510a5c211db MINOR: quic: duplicate GSO unsupp status from listener to conn However, GSO on the backend side may cause a crash when handling EIO. In this case, GSO must be completely disabled. Previously, this was performed by flagging listener instance. On backend side, this would cause a crash as listener is NULL. This patch fixes it by supporting GSO disable flag for servers. Thus, in qc_send_ppkts(), EIO can be converted either to a listener or server flag depending on the quic_conn proxy side. On backend side, server instance is retrieved via <qc.conn.target>. This is enough to guarantee that server is not deleted. This does not need to be backported.	2025-09-08 16:18:05 +02:00
Christopher Faulet	e653dc304e	MINOR: pools: Don't dump anymore info about pools when purge is forced Historically, when the purge of pools was forced by sending a SIGQUIT to haproxy, information about the pools was first dumped. It is now totally pointless because this info can be retrieved via the CLI. It is even less relevant now because the purge is typically forced when there are memory issues, and to dump pools information, data must be allocated. The dump_pools_info() function was simplified because it is now called only from an applet. There is no reason to still try to dump info on stderr.	2025-09-08 16:04:40 +02:00
Christopher Faulet	982805e6a3	BUG/MINOR: pools: Fix the dump of pools info to deal with buffers limitations The "show pools" CLI command was not designed to dump information exceeding the size of a buffer. But there are now many more pools than a few years ago, and when detailed information is dumped, we exceed the buffer limit and the output is truncated. To fix the issue, the command must be refactored to be able to stream the result. To do so, the array containing pools info is now part of the command context and it is dynamically allocated. A dedicated function was created to fill all info. In addition, the index of the next pool to dump is saved in the command context too, to properly handle resumption cases. Finally, global information about pools is also stored in the command context for convenience. This patch should fix issue #3067. It must be backported to 3.2. On older releases, the buffer limit is never reached.	2025-09-08 16:01:51 +02:00
Christopher Faulet	d75718af14	REGTESTS: ssl: Fix the script about automatic SNI selection First, the barrier to delay the client execution was moved before the client definition. Otherwise, the connection is established too early and, with short timeouts, it could be closed before the requests are sent. The main purpose of the barrier was to work around slow health-checks. This is also the reason why the script was flagged as slow. But it can be significantly sped up by setting a low "inter" value. It is now set to 100ms and the script is no longer slow.	2025-09-08 15:55:56 +02:00
Amaury Denoyelle	f645cd3c74	MINOR: quic: restore QUIC_HP_SAMPLE_LEN constant The below patch fixes padding emission for small packets, which is required to ensure that header protection removal can be performed by the recipient. commit d7dea408c64c327cab6aebf4ccad93405b675565 BUG/MINOR: quic: too short PADDING frame for too short packets In addition to the proper fix, the constant QUIC_HP_SAMPLE_LEN was removed and replaced by QUIC_TLS_TAG_LEN. However, it still makes sense to have a dedicated constant which represents the size of the sample used for header protection. Thus, this patch restores it. Special instructions for backport: the above patch mentions that no backport is needed. However, this is incorrect, as the bug is introduced by another patch scheduled for backport up to 2.6. Thus, it is first mandatory to schedule d7dea408c64c327cab6aebf4ccad93405b675565 after it. Then, this patch can also be used for the sake of code clarity.	2025-09-08 14:49:03 +02:00
Amaury Denoyelle	c20c71a079	TESTS: quic: add unit-tests for QUIC TX part Define a new "quic_tx" unit-test which is used to test QUIC TX module. For the moment, a single test is performed on qc_do_build_pkt(). It checks that PADDING is correctly added for HP sampling in case of a small packet.	2025-09-08 14:49:03 +02:00

25347 Commits