haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2026-02-17 15:21:57 +01:00

Author	SHA1	Message	Date
Willy Tarreau	561319bd1c	BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED The crappy epoll API struck again with reloads and transferred FDs. Indeed, when listening sockets are retrieved by a new worker from a previous one, and the old one finally stops listening on them, it closes the FDs. But in this case, since the sockets themselves were not closed, epoll will not unregister them and will continue to report new activity for these in the old process, which can only observe, count an fd_poll_drop event and not unregister them since they're not reachable anymore. The unfortunate effect is that long-lasting old processes are woken up at the same rate as the new process when accepting new connections, and can waste a lot of CPU. Accept rates divided by 8 were observed on a small test involving a slow transfer on 10 connections facing a reload every second so that 10 processes were busy dealing with them while another process was hammering the service with new connections. Fortunately, years ago we implemented a flag FD_CLONED exactly for similar purposes. Let's simply mark transferred FDs with FD_CLONED so that the process knows that these ones require special treatment and have to be manually unregistered before being closed. This does the job fine, now old processes correctly unregister the FD before closing it and no longer receive accept events for the new process. This needs to be backported to all stable versions. It only affects epoll, as usual, and this time in combination with transferred FDs (typically reloads in master-worker mode). Thanks to Damien Claisse for providing all the detailed measurements and statistics that made it possible to understand and reproduce the problem.	2025-02-12 16:35:01 +01:00
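The epoll behaviour described in the commit above can be illustrated with a small Linux-only sketch. It is not HAProxy code: `events_after_close()` is a hypothetical helper, and a `dup()`ed pipe end stands in for a listening socket whose open file description is still held by another process. It assumes the behaviour documented in epoll(7), where closing a descriptor only removes it from the interest list once every descriptor referring to the same open file description is closed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns the number of events epoll still reports for a descriptor that was
 * closed while a duplicate kept the underlying open file description alive.
 * When <unregister_first> is non-zero, EPOLL_CTL_DEL is issued before close(),
 * which is the "manual unregistration" the commit message refers to.
 * Returns -1 on setup failure.
 */
int events_after_close(int unregister_first)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event out;
    int p[2], clone, ep, added, n = -1;

    if (pipe(p) < 0)
        return -1;

    ep = epoll_create1(0);
    clone = dup(p[0]);          /* stands in for the copy owned by the new worker */
    assert(ep >= 0 && clone >= 0);

    ev.data.fd = p[0];
    added = epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0;

    if (added && unregister_first)
        epoll_ctl(ep, EPOLL_CTL_DEL, p[0], NULL);

    close(p[0]);                /* the "old process" stops using its FD */

    /* new activity on the shared object, then a non-blocking poll */
    if (added && write(p[1], "x", 1) == 1)
        n = epoll_wait(ep, &out, 1, 0);

    close(clone);
    close(p[1]);
    close(ep);
    return n;
}
```

Without the explicit `EPOLL_CTL_DEL`, the poller keeps reporting activity for a descriptor number the caller can no longer act on, which matches the wasted wake-ups described above.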
Amaury Denoyelle	e2744d23be	MINOR: quic: refactor CRYPTO encoding and splitting This patch is the direct follow-up of the previous one which refactored STREAM frame encoding. Reuse the newly defined quic_strm_frm_fillbuf() and quic_strm_frm_split() functions for CRYPTO frame encoding. The code for CRYPTO and STREAM frames encoding should now be clearer as it is mostly identical.	2025-02-12 15:10:54 +01:00
Amaury Denoyelle	f96af8e463	MINOR: quic: refactor STREAM encoding and splitting CRYPTO and STREAM frames encoding is similar. If the payload is too large, the frame will be split and only the first payload part will be written in the output QUIC packet. This process is complicated by the presence of a variable-length integer Length field prior to the payload. This commit aims to refactor these operations. Define two functions to simplify the code: * quic_strm_frm_fillbuf() which is used to calculate the optimal frame length of a STREAM/CRYPTO frame with its payload in a buffer * quic_strm_frm_split() which is used to split the frame payload if the buffer is too small With this patch, both functions are now implemented for STREAM encoding.	2025-02-12 15:10:03 +01:00
William Lallemand	4de86bbbfc	MEDIUM: initcall: allow to register mutiple post_section_parser per section Before this patch, REGISTER_CONFIG_SECTION() allowed to register one and only one callback (<post>) called after the parsing of a section. It was limiting because you couldn't register a post callback from anywhere else in the code. This patch introduces the new REGISTER_CONFIG_SECTION_POST() macro, which allows to register a new post callback for a section keyword from anywhere. This patch introduces the feature by allowing `struct cfg_section` entries that do not have a `section_parser`, and then iterating on all cfg_section with a post_section_parser for a keyword.	2025-02-12 12:52:41 +01:00
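The registration scheme described in the commit above (entries that may lack a `section_parser`, and a walk over every entry carrying a `post_section_parser` for a keyword) might look roughly like the sketch below. This is not the HAProxy implementation: the fixed-size table, `register_section()`, `run_post_parsers()` and `sample_post()` are hypothetical names used only for illustration, and the callback signatures are simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a config section entry. */
struct cfg_section {
    const char *name;                 /* section keyword, e.g. "resolvers" */
    int (*section_parser)(void);      /* may be NULL for post-only entries */
    int (*post_section_parser)(void); /* may be NULL */
};

#define MAX_SECTIONS 16
struct cfg_section sections[MAX_SECTIONS];
int nb_sections;

/* Register an entry; several entries may share the same keyword. */
int register_section(const char *name, int (*parser)(void), int (*post)(void))
{
    if (nb_sections >= MAX_SECTIONS)
        return -1;
    sections[nb_sections].name = name;
    sections[nb_sections].section_parser = parser;
    sections[nb_sections].post_section_parser = post;
    nb_sections++;
    return 0;
}

/* Call every post callback registered for <name>; returns how many ran. */
int run_post_parsers(const char *name)
{
    int i, called = 0;

    for (i = 0; i < nb_sections; i++) {
        if (strcmp(sections[i].name, name) != 0)
            continue;
        if (!sections[i].post_section_parser)
            continue;
        sections[i].post_section_parser();
        called++;
    }
    return called;
}

/* Example callback that only counts its invocations. */
int post_calls;
int sample_post(void)
{
    post_calls++;
    return 0;
}
```

Registering `sample_post` twice for the same keyword and then calling `run_post_parsers()` on it runs both callbacks, which is the property the new macro is meant to provide.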
William Lallemand	5c2039b5b8	CLEANUP: mworker: "program" section does not have a post_section_parser anymore The "program" section does not have a post_section_parser anymore so no need to make an exception for it.	2025-02-12 12:37:01 +01:00
William Lallemand	313eeae7db	BUG/MINOR: mworker: post_section_parser for the last section in discovery Previous patch 2c270a05f ("BUG/MINOR: mworker: section ignored in discovery after a post_section_parser") needs an adjustment for the last section of the file. Indeed the post_section_parser of the last section must not be called in discovery mode. Must be backported in 3.1.	2025-02-12 12:34:57 +01:00
William Lallemand	2c270a05f0	BUG/MINOR: mworker: section ignored in discovery after a post_section_parser When a new section is discovered, the post_section_parser of the previous section is called. However in the new master-worker mode the discovery mode will skip the post_section_parser. But instead of trying to parse the current section keyword after that, it would completely skip the current line. This is a minor bug since there aren't many sections with a post_section_parser, and not many sections to parse in discovery mode. But this could be reproduced like this:

    global
        expose-deprecated-directives

    resolvers res
        parse-resolv-conf

    program foo
        command sleep 10

    program bar
        command sleep 10

The 'resolvers' section has a post_section_parser which will be ignored in discovery mode with the consequence of ignoring the first program section. This must be backported in 3.1.	2025-02-12 12:18:17 +01:00
Amaury Denoyelle	731340afbd	MINOR: quic: simplify length calculation for STREAM/CRYPTO frames STREAM and CRYPTO frames have a similar encoding format. In particular, both of them have a variable-length integer Length field just before the frame payload. It is complex to determine the optimal Length value before copying the payload data in the remaining buffer space. As such, helper functions were implemented to calculate this. However, CRYPTO and STREAM frames encoding implementation were not completely aligned, which renders the code harder to follow. The purpose of this commit is to simplify CRYPTO and STREAM frames encoding. First, a new helper quic_int_cap_length() is defined which is useful to determine the optimal buffer room available if prefixed by a variable-length integer as Length field. Then, processing of both CRYPTO and STREAM frames is now nearly identical, based on this new helper function. Functions max_available_room() and max_stream_data_size() are now unused and are removed.	2025-02-12 11:51:09 +01:00
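To make the Length-field problem from the commit above concrete, here is a minimal sketch of how the largest payload fitting in a given room can be computed when it must be prefixed by its own variable-length integer length. It only relies on the QUIC varint size classes from RFC 9000 section 16. `varint_size()` and `payload_cap()` are hypothetical names, and this is not the actual quic_int_cap_length() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode <v> as a QUIC variable-length integer
 * (RFC 9000, section 16). Returns 0 if <v> is too large to encode.
 */
size_t varint_size(uint64_t v)
{
    if (v < (1ULL << 6))
        return 1;
    if (v < (1ULL << 14))
        return 2;
    if (v < (1ULL << 30))
        return 4;
    if (v < (1ULL << 62))
        return 8;
    return 0;
}

/* Largest payload length L such that varint_size(L) + L <= room.
 * Returns 0 when not even one payload byte fits.
 */
size_t payload_cap(size_t room)
{
    size_t len = room;

    /* this sketch does not handle rooms beyond the varint range */
    assert((uint64_t)room < (1ULL << 62));

    /* the Length field is at most 8 bytes, so this loop is short */
    while (len > 0 && varint_size(len) + len > room)
        len--;
    return len;
}
```

Note the discontinuity around the size-class boundaries: a room of 65 bytes still only carries 63 payload bytes, because a 64-byte payload needs a 2-byte Length field. This is the kind of off-by-one-or-two that the neighbouring CRYPTO fix is about.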
Amaury Denoyelle	e6a223542a	BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding Function max_stream_data_size() is used to determine the payload length of a CRYPTO frame. It takes into account that the CRYPTO length field is a variable length integer. The implemented calculation was incorrect as it reserved too much space as a frame header. This error is mostly because max_stream_data_size() reuses max_available_room() which also reserves space for a variable length integer. This results in CRYPTO frames 1 to 2 bytes shorter than the maximum achievable value, which in the end produces datagrams shorter than the MTU. Fix max_stream_data_size() implementation. It is now merely a wrapper on max_available_room(). This ensures that CRYPTO frame encoding is now properly optimized to use the MTU available. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Amaury Denoyelle	63747452a3	BUG/MINOR: quic: reserve length field for long header encoding Long header packets have a mandatory Length field, which contains the size of Packet number and payload, encoded as a variable-length integer. Its value can thus only be determined after the payload size is known, which depends on the remaining buffer space after this variable-length field. Packet payloads are encoded in two steps. First, a list of input frames is processed until the packet buffer is full. CRYPTO and STREAM frames payload can be split if needed to fill the buffer. Real encoding is then performed as a second stage operation, first with Length field, then with the selected frames themselves. Before this patch, no space was reserved in the buffer for Length field when attaching the frames to the packet. This could result in an error as the packet payload would be too large for the remaining space. In practice, this issue was rarely encountered, mostly as a side-effect from another issue linked to CRYPTO frame encoding. Indeed, a wrong calculation is performed on CRYPTO splitting, which results in frame payload shorter by a few bytes than expected. This however ensured there would be always enough room for the Length field and payload during encoding. As CRYPTO frames are the only big enough content emitted with a Long header packet, this renders the current issue mostly non reproducible. Fix the original issue by reserving some space for Length field prior to frame payload calculation, using a maximum value based on the remaining room space. Packet length is then reduced if needed when encoding is performed, which ensures there is always enough room for the selected frames. Note that the other issue impacting CRYPTO frame encoding is not yet fixed. This could result in datagrams with Long header packets not completely extended to the full MTU. The issue will be addressed in another patch. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Willy Tarreau	627280e15f	MAJOR: leastconn: postpone the server's repositioning under contention

When leastconn is used under many threads, there can be a lot of contention on leastconn, because the same node has to be moved around all the time (when picking it and when releasing it). In GH issue #2861 it was noticed that 46 threads out of 64 were waiting on the same lock in fwlc_srv_reposition(). In such a case, the accuracy of the server's key becomes quite irrelevant because nobody cares if the same server is picked twice in a row and the next one twice again. While other approaches in the past considered using a floating key to avoid moving the server each time (which was not compatible with the round-robin rule for equal keys), here a more drastic solution is needed.

What we're doing instead is that we turn this lock into a trylock. If we can grab it, we do the job. If we can't, then we just wake up a server's tasklet dedicated to this. That tasklet will then try again slightly later, knowing that during this short time frame, the server's position in the queue is slightly inaccurate. Note that any thread touching the same server will also reposition it and save that work for next time. Also if multiple threads wake the tasklet up, then that's fine, their calls will be merged and a single lock will be taken in the end.

Testing this on a 24-core EPYC 74F3 showed a significant performance boost from 382krps to 610krps. The performance profile reported by perf top dropped from 43% to 2.5%:

Before:
  Overhead  Shared Object             Symbol
    43.46%  haproxy-master-inlineebo  [.] fwlc_srv_reposition
    21.20%  haproxy-master-inlineebo  [.] fwlc_get_next_server
     0.91%  haproxy-master-inlineebo  [.] process_stream
     0.75%  [kernel]                  [k] ice_napi_poll
     0.51%  [kernel]                  [k] tcp_recvmsg
     0.50%  [kernel]                  [k] ice_start_xmit
     0.50%  [kernel]                  [k] tcp_ack

After:
  Overhead  Shared Object  Symbol
    30.37%  haproxy        [.] fwlc_get_next_server
     2.51%  haproxy        [.] fwlc_srv_reposition
     1.91%  haproxy        [.] process_stream
     1.46%  [kernel]       [k] ice_napi_poll
     1.36%  [kernel]       [k] tcp_recvmsg
     1.04%  [kernel]       [k] tcp_ack
     1.00%  [kernel]       [k] skb_release_data
     0.96%  [kernel]       [k] ice_start_xmit
     0.91%  haproxy        [.] conn_backend_get
     0.82%  haproxy        [.] connect_server
     0.82%  haproxy        [.] run_tasks_from_lists

Tested on an Ampere Altra with 64 aarch64 cores dedicated to haproxy, the gain is even more visible (3.6x):

Before: 311-323k rps, 3.16-3.25ms, 6400% CPU
  Overhead  Shared Object   Symbol
    55.69%  haproxy-master  [.] fwlc_srv_reposition
    33.30%  haproxy-master  [.] fwlc_get_next_server
     0.89%  haproxy-master  [.] process_stream
     0.45%  haproxy-master  [.] h1_snd_buf
     0.34%  haproxy-master  [.] run_tasks_from_lists
     0.32%  haproxy-master  [.] connect_server
     0.31%  haproxy-master  [.] conn_backend_get
     0.31%  haproxy-master  [.] h1_headers_to_hdr_list
     0.24%  haproxy-master  [.] srv_add_to_idle_list
     0.23%  haproxy-master  [.] http_request_forward_body
     0.22%  haproxy-master  [.] __pool_alloc
     0.21%  haproxy-master  [.] http_wait_for_response
     0.21%  haproxy-master  [.] h1_send

After: 1.21M rps, 0.842ms, 6400% CPU
  Overhead  Shared Object  Symbol
    17.44%  haproxy        [.] fwlc_get_next_server
     6.33%  haproxy        [.] process_stream
     4.40%  haproxy        [.] fwlc_srv_reposition
     3.64%  haproxy        [.] conn_backend_get
     2.75%  haproxy        [.] connect_server
     2.71%  haproxy        [.] h1_snd_buf
     2.66%  haproxy        [.] srv_add_to_idle_list
     2.33%  haproxy        [.] run_tasks_from_lists
     2.14%  haproxy        [.] h1_headers_to_hdr_list
     1.56%  haproxy        [.] stream_set_backend
     1.37%  haproxy        [.] http_request_forward_body
     1.35%  haproxy        [.] http_wait_for_response
     1.34%  haproxy        [.] h1_send

And at similar loads, the CPU usage considerably drops (3.55x), as well as the response time (10x):

After: 320k rps, 0.322ms, 1800% CPU
  Overhead  Shared Object  Symbol
     7.62%  haproxy        [.] process_stream
     4.64%  haproxy        [.] h1_headers_to_hdr_list
     3.09%  haproxy        [.] h1_snd_buf
     3.08%  haproxy        [.] h1_process_demux
     2.22%  haproxy        [.] __pool_alloc
     2.14%  haproxy        [.] connect_server
     1.87%  haproxy        [.] h1_send
  >  1.84%  haproxy        [.] fwlc_srv_reposition
     1.84%  haproxy        [.] run_tasks_from_lists
     1.77%  haproxy        [.] sock_conn_iocb
     1.75%  haproxy        [.] srv_add_to_idle_list
     1.66%  haproxy        [.] http_request_forward_body
     1.65%  haproxy        [.] wake_expired_tasks
     1.59%  haproxy        [.] h1_parse_msg_hdrs
     1.51%  haproxy        [.] http_wait_for_response
  >  1.50%  haproxy        [.] fwlc_get_next_server

The cost of fwlc_get_next_server() naturally increases as the server count increases, but now has no visible effect on updates. The load distribution remains unchanged compared to the previous approach, the weight still being respected.

For further improvements to the fwlc algo, please consult github issue #881 which centralizes everything related to this algorithm.	2025-02-12 11:48:10 +01:00
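The "trylock, otherwise defer" idea from the leastconn commit above can be sketched with plain pthreads. This is a simplified illustration under stated assumptions, not the fwlc code: `struct server`, `server_try_reposition()` and `server_requeue_task()` are hypothetical names, a pending flag replaces the tasklet wake-up, and the deferred path takes a blocking lock where the real tasklet is said to simply try again later.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct server {
    pthread_mutex_t tree_lock;   /* protects tree_key (the queue position) */
    atomic_int load;             /* current number of connections */
    int tree_key;                /* key last written into the "tree" */
    atomic_bool requeue_pending; /* stands in for a tasklet wake-up */
};

void server_init(struct server *s)
{
    int rc = pthread_mutex_init(&s->tree_lock, NULL);

    assert(rc == 0);
    (void)rc;
    atomic_init(&s->load, 0);
    atomic_init(&s->requeue_pending, false);
    s->tree_key = 0;
}

/* Fast path: reposition only if the lock is free, otherwise record that
 * someone has to do it later. Returns true if the work was done now.
 */
bool server_try_reposition(struct server *s)
{
    if (pthread_mutex_trylock(&s->tree_lock) != 0) {
        /* contended: several callers setting the flag get merged */
        atomic_store(&s->requeue_pending, true);
        return false;
    }
    s->tree_key = atomic_load(&s->load);
    pthread_mutex_unlock(&s->tree_lock);
    return true;
}

/* Deferred path: performs at most one repositioning for any number of
 * missed attempts.
 */
void server_requeue_task(struct server *s)
{
    if (!atomic_exchange(&s->requeue_pending, false))
        return;
    pthread_mutex_lock(&s->tree_lock);
    s->tree_key = atomic_load(&s->load);
    pthread_mutex_unlock(&s->tree_lock);
}
```

The trade-off is the one stated in the commit message: between a failed trylock and the deferred run, the stored key is stale, which is accepted in exchange for not queuing threads on the lock.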
Willy Tarreau	b6a8318cc2	MEDIUM: server: allocate a tasklet for asyncronous requeuing This creates a tasklet that only expects to be called when the LB algorithm is under contention when trying to reposition the server in its tree. Indeed, that's one of the operations that usually requires to take a write lock on a highly contended area, often for very little benefits under contention; indeed, under load, if a server keeps its previous position for a few extra microseconds, usually there's no harm. Thus this new tasklet can be woken up by the LB algo to ask the server to later call lbprm.server_requeue(). It does nothing else.	2025-02-11 17:24:09 +01:00
Willy Tarreau	20b8c4ddba	MINOR: lbprm: add a new callback ->server_requeue to the lbprm This callback will be used to reposition a server to its expected position regardless of the fact that it was taken or dropped. It will only be used by supporting LB algos. For now, only fwlc defines it and assigns it to fwlc_srv_reposition(). At the moment it's not used yet.	2025-02-11 17:16:14 +01:00
Willy Tarreau	eced1d6d8a	DEBUG: thread: reduce the struct lock_stat to store only 30 buckets Storing only 30 buckets means we only keep 256 bytes per label. This further simplifies address calculation and reduces the memory used without complicating the locking code. It means we won't measure wait times larger than a second but we're not supposed to face this as it would trigger the watchdog anyway. It may become a little bit tight if measuring with rdtsc() instead of now_mono_time() though (typically the limit would be around 350ms for a 3 GHz CPU).	2025-02-10 18:34:43 +01:00
Willy Tarreau	c2f2d6fd3c	DEBUG: thread: make lock_stat per operation instead of for all operations It's more convenient (and more readable) to have the lock stats arranged by operation type (read, seek, write). It will also allow to later simplify the structure format and the bucket address calculation. Now lock_stat[] got split into lock_stats_rd[], lock_stats_sk[], lock_stats_wr[].	2025-02-10 18:34:43 +01:00
Willy Tarreau	4168d1278c	DEBUG: thread: don't keep the redundant _locked counter Now that we have our sums by bucket, the _locked counter is redundant since it's always equal to the sum of all entries. Let's just get rid of it and replace its consumption with a loop over all buckets, this will reduce the overhead of taking each lock at the expense of a tiny extra effort when dumping all locks, which we don't care about.	2025-02-10 18:34:43 +01:00
Willy Tarreau	a22550fbd7	DEBUG: thread: report the wait time buckets for lock classes In addition to the total/average wait time, we now also store the wait time in 2^N buckets. There are 32 buckets for each type (read, seek, write), allowing to store wait times from 1-2ns to 2.1-4.3s, which is quite sufficient, even if we'd want to switch from NS to CPU cycles in the future. The counters are only reported for non-zero buckets so as not to visually pollute the output. This significantly inflates the lock_stat struct, which is now aligned to 256 bytes and rounded up to 1kB. But that's not really a problem, given that there's only one per lock label.	2025-02-10 18:34:43 +01:00
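As an illustration of the 2^N bucketing described in the commit above, the sketch below maps a wait time in nanoseconds to a bucket index so that bucket N covers [2^N, 2^(N+1)) ns, which gives 1-2ns for bucket 0 and roughly 2.1-4.3s for bucket 31. `wait_bucket()` and `WAIT_BUCKETS` are hypothetical names. How HAProxy treats a zero wait or values beyond the last bucket is not stated in the message, so the handling of both here is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define WAIT_BUCKETS 32   /* assumed from the commit message */

/* Returns floor(log2(ns)) clamped to [0, WAIT_BUCKETS - 1].
 * A wait of 0 ns is placed in bucket 0 (assumption of this sketch).
 */
unsigned int wait_bucket(uint64_t ns)
{
    unsigned int b = 0;

    while (ns > 1 && b < WAIT_BUCKETS - 1) {
        ns >>= 1;
        b++;
    }
    return b;
}
```

A per-lock array of `WAIT_BUCKETS` counters indexed by this value is enough to print a coarse histogram, skipping the zero entries as the commit describes.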
Willy Tarreau	0b849c59fb	DEBUG: thread: make lock time computation more consistent The lock time computation was a bit inconsistent between functions, particularly those using a try_lock. Some of them would count the lock as taken without counting the time, others would simply not count it. This is essentially due to the way the time is retrieved, as it was done inside the atomic increment. Let's instead always use start_time to carry the elapsed time, by presetting it to the negative time before the event and adding the positive time after, so that it finally contains the duration. Then depending on the try lock's success, we add the result or not. This was generalized to all lock functions for consistency, and because this will be handy for future changes.	2025-02-10 18:34:43 +01:00
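The "preset to the negative time, then add the positive time" technique from the commit above can be sketched as follows. This is an assumption-laden illustration rather than the HAProxy code: `now_ns()`, `struct lock_stats` and `timed_trylock()` are hypothetical, and a pthread mutex stands in for HAProxy's own locks. It relies on unsigned arithmetic wrapping modulo 2^64, so the two steps leave the elapsed duration in `start_time`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

struct lock_stats {
    uint64_t nsec_wait;   /* total time spent acquiring the lock */
    uint64_t num_locked;  /* number of successful acquisitions */
};

/* Try to take <m>; the time and the counter are only accounted when the
 * attempt succeeds, so both always stay consistent with each other.
 */
int timed_trylock(pthread_mutex_t *m, struct lock_stats *st)
{
    uint64_t start_time = -now_ns();   /* negative time before the event */
    int ret = pthread_mutex_trylock(m);

    start_time += now_ns();            /* now holds the elapsed duration */

    if (ret == 0) {
        st->nsec_wait += start_time;
        st->num_locked++;
    }
    return ret;
}
```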
Willy Tarreau	99a88ee904	DEBUG: thread: report the spin lock counters as seek locks Technically speaking, spin locks use a seek lock, not a write lock, so better count them appropriately for consistency (lock time, or function calls count).	2025-02-10 18:34:43 +01:00
Willy Tarreau	7ddcdff33f	BUG/MEDIUM: debug: close a possible race between thread dump and panic() The rework of the thread dumping mechanism in 2.8 with commit 9a6ecbd590 ("MEDIUM: debug: simplify the thread dump mechanism") opened a small race, which is that a thread in the process of dumping other ones may block the other one from panicking while it's looping at the end of ha_thread_dump_fill(), or any other sequence involving the currently dumped one. This was emphasized in 3.1 with commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") that allowed to emit warnings about long-stuck threads, because in this case, what happens is that sometimes a thread starts to emit a warning (or a set of warnings), and while the warning is being waited for, a panic finally happens and interrupts either the dumping thread, which never finishes and waits for the target's pointer to become NULL which will never happen since it was supposed to do it itself, or the currently dumped thread which could wait for the dumping thread to become ready while this one has not released the former. In order to address this, first we now make sure never to dump a thread that is already in the process of dumping another one. We're adding a new thread flag to know this situation, that is set in ha_thread_dump_fill() and cleared in ha_thread_dump_done(). And similarly, we don't trigger the watchdog on a thread waiting for another one to finish its dump, as it's likely a case of warning (and maybe even a panic) that makes them wait for each other and we don't want such cases to be reentrant. Finally, we check in the main polling loop that the flag never accidentally leaked (e.g. wrong flag manipulation) as this would be difficult to spot with bad consequences. This should be backported at least to 2.8, and should resolve github issue #2860. Thanks to Chris Staite for the very informative backtrace that exhibited the problem.	2025-02-10 18:34:26 +01:00
William Lallemand	3912780b1e	BUG/MEDIUM: ssl: chosing correct certificate using RSA-PSS with TLSv1.3 The clienthello callback was written when TLSv1.3 was not yet out, and signature algorithms changed since then. With TLSv1.2, the least significant byte was used to determine the SignatureAlgorithm, which could be rsa(1), dsa(2), ecdsa(3). https://datatracker.ietf.org/doc/html/rfc5246#section-7.4.1.4.1 This was used to choose which type of certificate to push to the client. But TLSv1.3 changed that, and introduced new RSA-PSS algorithms that do not have the least significant byte set to 1. https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.3 This would result in choosing the wrong certificate when both an RSA and an ECDSA one are in the configuration for the same SNI or default entry. This patch fixes the issue by parsing both the hash and signature fields to check the RSA-PSS signature scheme. This must fix issue #2852. This must be backported in every stable version. The code was moved from ssl_sock.c to ssl_clienthello in recent versions.	2025-02-07 20:56:42 +01:00
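The difference between the two checks described in the commit above can be shown with a small classifier over the two bytes of a signature algorithm entry. The code points come from RFC 5246 section 7.4.1.4.1 and RFC 8446 section 4.2.3. The function names and the `enum key_type` are hypothetical, and this is only a sketch of the idea, not the HAProxy ClientHello callback. In particular, leaving the rsa_pss_pss_* schemes (0x0809 to 0x080b) unclassified is a choice of this sketch, since those require a certificate with an RSASSA-PSS key.

```c
#include <assert.h>
#include <stdint.h>

enum key_type { KEY_OTHER, KEY_RSA, KEY_ECDSA };

/* Old-style check: only looks at the least significant byte, as defined
 * for TLSv1.2 SignatureAlgorithm: rsa(1), dsa(2), ecdsa(3).
 */
enum key_type sigalg_key_type_lsb_only(uint8_t sign)
{
    switch (sign) {
    case 1:  return KEY_RSA;
    case 3:  return KEY_ECDSA;
    default: return KEY_OTHER;
    }
}

/* Check using both bytes: <hash> is the most significant byte and <sign>
 * the least significant one of the 16-bit value on the wire.
 */
enum key_type sigalg_key_type(uint8_t hash, uint8_t sign)
{
    if (hash == 0x08) {
        /* rsa_pss_rsae_sha256/384/512 = 0x0804..0x0806: usable with a
         * regular RSA certificate.
         */
        if (sign >= 0x04 && sign <= 0x06)
            return KEY_RSA;
        /* ed25519 (0x0807), ed448 (0x0808), rsa_pss_pss_* (0x0809..0x080b)
         * are not classified by this sketch.
         */
        return KEY_OTHER;
    }
    return sigalg_key_type_lsb_only(sign);
}
```

With only the low byte, `rsa_pss_rsae_sha256` (0x0804) is not recognised as RSA, so a client offering only such schemes next to ECDSA ones would look ECDSA-only.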
Willy Tarreau	8d63dc50ab	BUG/MINOR: debug: make sure the "debug dev sched" tasks don't block stopping When "debug dev sched" is used to pop up background tasks, these tasks are never stopped, so we must be careful to stop them when the stopping flag is set, otherwise they can prevent the process from stopping when sufficiently numerous (tests went as far as 100 million tasks, leading the run queue never being completely purged in one poll round). No backport is needed since this is only used when debugging and tuning the scheduler.	2025-02-07 18:04:29 +01:00
Willy Tarreau	6765a32eb4	BUG/MINOR: debug: make "debug dev sched" accept a negative TID The TID passed to "debug dev sched" is used to pin the task to a given thread. A negative value normally means the task is unpinned and goes to the shared wait queue and run queue. However due to the type of the variable, negative values were mapped as highly positive values and were set to the current thread. Let's add the proper cast to fix this. No backport is needed since this is only used to experiment with the scheduler and measure its performance.	2025-02-07 18:04:29 +01:00
Christopher Faulet	d48b5add88	BUG/MINOR: stats-json: Define JSON_INT_MAX as a signed integer A JSON integer is defined in the range [-(2^53)+1, (2^53)-1]. Macros are used to define the minimum and the maximum values. The minimum one is defined using the maximum one. So JSON_INT_MAX must be defined as a signed integer value to avoid a wrong cast of JSON_INT_MIN. It was reported by Coverity in #2841: CID 1587769. This patch could be backported to all stable versions.	2025-02-06 17:19:49 +01:00
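Why the signedness of the maximum matters when the minimum is derived from it can be seen in the sketch below. The macro spellings are hypothetical and are not claimed to match the ones in HAProxy. The point is only that negating an unsigned constant wraps around to a huge positive value, so a range check built on it rejects every value.

```c
#include <assert.h>

/* Unsigned variant: the negation wraps modulo 2^64. */
#define JSON_INT_MAX_U ((1ULL << 53) - 1)
#define JSON_INT_MIN_U (-(JSON_INT_MAX_U))

/* Signed variant: the negation gives -(2^53) + 1 as intended. */
#define JSON_INT_MAX_S ((1LL << 53) - 1)
#define JSON_INT_MIN_S (-(JSON_INT_MAX_S))

/* Range check built on the unsigned macros: <v> is converted to unsigned
 * for both comparisons, so no value can satisfy them at the same time.
 */
int json_in_range_unsigned(long long v)
{
    return v >= JSON_INT_MIN_U && v <= JSON_INT_MAX_U;
}

/* Range check built on the signed macros. */
int json_in_range_signed(long long v)
{
    return v >= JSON_INT_MIN_S && v <= JSON_INT_MAX_S;
}
```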
Christopher Faulet	bc487afc85	MINOR: filters: Improve errors formating during filters parsing The error messages reported by a filter during parsing are displayed between quotes, which is not really user-friendly. So let's remove the quotes here.	2025-02-06 17:03:40 +01:00
Christopher Faulet	b20e2c96cf	BUG/MINOR: flt-trace: Support only one name option When a trace filter is defined, only one 'name' option is expected. But it was not tested. Thus it was possible to set several names leading to a memory leak. It is now tested, and it is not allowed to redefine the trace filter name. It was reported by Coverity in #2841: CID 1587768. This patch could be backported to all stable versions.	2025-02-06 17:01:15 +01:00
Christopher Faulet	a7f513af91	BUG/MINOR: auth: Fix a leak on error path when parsing user's groups In a userlist section, when a user is parsed, if a specified group is not found, an error is reported. In this case we must take care to release the already built groups list. It was reported by Coverity in #2841: CID 1587770. This patch could be backported to all stable versions.	2025-02-06 16:55:37 +01:00
Christopher Faulet	a1e14d2a82	BUG/MINOR: config/userlist: Support one 'users' option for 'group' directive When a group is defined in a userlist section, only one 'users' option is expected. But it was not tested. Thus it was possible to set several options leading to a memory leak. It is now tested, and it is not allowed to redefine the users option. It was reported by Coverity in #2841: CID 1587771. This patch could be backported to all stable versions.	2025-02-06 16:55:29 +01:00
Christopher Faulet	75e8c8ed33	BUG/MINOR: cli: Fix a possible infinite loop in _getsocks() In the _getsocks() function, when we failed to set the unix socket in non-blocking mode, a goto to the "out" label led to an infinite loop. To fix the issue, we must only let the function exit. This patch should be backported to all stable versions.	2025-02-06 15:44:21 +01:00
Christopher Faulet	372cc696d4	BUG/MINOR: cli: Fix memory leak on error for _getsocks command Some errors in the parse function of the _getsocks command were not properly handled and immediately returned, leading to a memory leak on cmsgbuf and tmpbuf buffers. To fix the issue, instead of immediately returning -1, we jump to the "out" label. Returning 1 instead of -1 in that case is valid. This was reported by Coverity in #2841: CIDs 1587773 and 1587772. This patch should be backported as far as 2.4.	2025-02-06 15:43:04 +01:00
Christopher Faulet	7e927243b9	BUG/MINOR: cli: Don't set SE flags from the cli applet Since the CLI was updated to use the new applet API, it should no longer directly set the SE flags. Instead, the corresponding applet flags must be set, using the applet API (applet_set_*). It is true for the CLI I/O handler but also for the commands parse function and I/O callback function. This patch should be backported as far as 3.0.	2025-02-06 15:23:20 +01:00
Christopher Faulet	0aa69e7865	MINOR: mux-spop/mux-fcgi: Add support of the debug string for logs Now it is possible to have debug info about FCGI and SPOP multiplexers. To do so, the support for the MUX_SCTL_DBG_STR command was implemented for these muxes. To have this log message, the log-format must be set to: log-format "$HAPROXY_HTTP_LOG_FMT bs=<%[bs.debug_str]>"	2025-02-06 11:19:32 +01:00
Christopher Faulet	456cfa450a	MINOR: mux-fcgi: Dump info about connections and streams in dedicated functions The fcgi_show_fd() function was split to dump the info about the FCGI connections and the FCGI streams in dedicated functions, duplicating this way what is performed in other muxes. In addition, the FCGI multiplexer now implements the .show_sd callback function called by "show sess" CLI command.	2025-02-06 11:19:32 +01:00
Christopher Faulet	bbc8c98a54	MINOR: tevt/mux-fcgi: Report termination events for the FCGI connect/stream Termination events are now reported for the FCGI connections and the FCGI streams. In addition, all available termination events logs are reported in the "show-fd" callback function. The .ctl and .sctl callback functions were also updated to support, respectively, MUX_CTL_TEVTS and MUX_SCTL_TEVTS commands.	2025-02-06 11:19:32 +01:00
Christopher Faulet	5b1c2277ae	BUG/MEDIUM: mux-fcgi: Propagate flags to SE in fcgi_strm_wake_one_stream The commit is flagged as a bug because the same fix on the H2 multiplexer was reported as a bug. But no issue was reported. When a stream is explicitly woken up by the FCGI connection, if an error condition is detected, the corresponding error flag is set on the SE. So SE_FL_ERROR or SE_FL_ERR_PENDING, depending on whether the end of stream was reported or not. However, there is no attempt to propagate other termination flags. We must be sure to properly set SE_FL_EOI and SE_FL_EOS when appropriate to be able to switch a pending error to a fatal error. Because of this bug, the SE could remain with a pending error and no end of stream, preventing the application stream from truly aborting it. It means that in some abort scenarios, it seems to be possible to block a stream indefinitely. This patch depends on: * MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE * BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records This patch could be backported at least as far as 2.8 after a period of observation. However no bug was reported so there is no rush.	2025-02-06 11:19:32 +01:00
Christopher Faulet	ccdca4bb77	MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE The function fcgi_strm_propagate_term_flags() was added to check the FSTRM state and evaluate when EOI/EOS/ERR_PENDING/ERROR flags must be set on the SE. It is not the only place where those flags are set. But it centralizes the synchro between the FCGI stream and the SC. For now, this function is only used at the end of fcgi_rcv_buf(). But it will be used to fix a potential bug.	2025-02-06 11:19:32 +01:00
Christopher Faulet	7b638eb1a6	MINOR: mux-spop: Implement .show_sd callback function The SPOP multiplexer now implements the .show_sd callback function called by "show sess" CLI command.	2025-02-06 11:19:32 +01:00
Christopher Faulet	5aeb678762	MINOR: mux-spop: Dump info about connections and streams in dedicated functions The spop_show_fd() function was split to dump the info about the SPOP connections and the SPOP streams in dedicated functions, duplicating this way what is performed in other muxes.	2025-02-06 11:19:32 +01:00
Christopher Faulet	eb4e517489	CLEANUP: mux-spop: Remove useless comments Just a small cleanup to remove some comments added during the development of the mux.	2025-02-06 11:19:32 +01:00
Christopher Faulet	4f8ae5b1f6	MINOR: tevt/mux-spop: Report termination events for the SPOP connect/stream Termination events are now reported for the SPOP connections and the SPOP streams. In addition, all available termination events logs are reported in the "show-fd" callback function. The .ctl and .sctl callback functions were also updated to support, respectively, MUX_CTL_TEVTS and MUX_SCTL_TEVTS commands.	2025-02-06 11:19:32 +01:00
Christopher Faulet	514a912a4d	MINOR: mux-spop: Set SPOP_CF_ERROR flag on connection error only The SPOP_CF_ERROR flag is now set on connection error only. It was also set on some demux failures. But it is not mandatory because the connection is closed anyway. And it is handy to have a flag dedicated to tcp connection error. It was the original purpose of this flag. This patch could be backported to 3.1 to ease future backports.	2025-02-06 11:19:32 +01:00
Christopher Faulet	d16c534511	MINOR: mux-spop: Report EOI on the SE when a ACK is received for a stream The spop stream now reports the end of input when the ACK is transferred to the SPOE applet. To do so, the flag SPOP_SF_ACK_RCVD was added. It is set on the SPOP stream when its ACK is received by the SPOP connection. In addition when SPOP stream flags are propagated to the SE, the error is now reported if end of input was not reached instead of testing the connection error code. It is more accurate. This patch should be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	f7e5718596	MINOR: flt-spoe: Report end of input immediately after applet init The SPOE applet forwards the message that must be sent to agent during its init stage. So just after it is created. When it is performed, the end of input must be reported because no more data will be forwarded. However, it was performed after receiving the ACK response. It is harmless, but there is no reason to delay the EOI. It is now fixed. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	38aac2c7bc	BUG/MEDIUM: flt-spoe: Properly handle end of stream from the SPOE applet The previous fix ("BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR") revealed an issue with the way the SPOE applet was reporting the end of stream, leading to never shut the applet down. In fact, there is two bug in one. The first one is about the applet shutdown. Since the fix above, the applet is no longer closed. Before, it was closed because it was reported in error. But now, it is just delayed because the applet and the SPOP stream are declared to support half close connections. So the applet is only closed when the SPOP connection is closed. To fix this bug, both side are now stating that half close connections are not supported. The second bug is about the way the end of stream is reported. It is reported when the ACK response is received. But it is too early, because the parent stream must process the response first. So now, we take care to have processed the ACK from the parent applet before reporting an end of stream. This patch must be backported with the commit above to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	7214dcd52d	BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR The way appctx EOI/EOS/ERROR flags were reported for applets using the new API was to state the applet had more data to deliver. But it was not correct and for APPCTX_FL_EOS, this led to report an error on the SE because it is not expected. More data to deliver and an end of stream is an impossible situation. This was added as a fix by commit b8ca114031 ("BUG/MEDIUM: applet: State appctx have more data if its EOI/EOS/ERROR flag is set"), mainly to make the SPOE applet work. When an applet sets one of these flags, it really means it has no more data to deliver. So we must not try to trigger a new receive to handle these flags. Instead we must handle them directly in the task_process_applet() function and only if the corresponding SE flags were not already set. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	db504fbdbe	BUG/MEDIUM: flt-spoe: Set/test applet flags instead of SE flags from I/O handler The SPOE applet is using the new applet API. Thus end of input, end of stream and errors must be reported using the applet flags, not the SE flags. This was not the case. So let's fix it. It seems this bug is harmless for now. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	54a09dfe0f	BUG/MINOR: tevt/mux-h2: Set truncated receive/eos events at SE level on error When receive or EOS termination events are reported at the SE level, a truncation was erroneously reported when no error was detected. Of course, it must be the opposite. No backport needed.	2025-02-06 11:19:32 +01:00
Christopher Faulet	fad68cb16d	BUG/MINOR: http-check: Don't pretend a C-L header is set before adding it When a GET/HEAD/OPTIONS/DELETE healthcheck request was formatted, we claimed there was a "content-length" header set even when there was no payload, which led to actually sending a "content-length: 0" header to the server. It was unexpected and could be rejected by servers. When a healthcheck request is sent, we must take care to state there is a "content-length" header only when it is explicitly added. This patch should fix the issue #2851. It must be backported as far as 2.9.	2025-02-03 18:46:41 +01:00
Aurelien DARRAGON	0846638f7f	MEDIUM: stream: interrupt costly rulesets after too many evaluations It is not rare to see configurations with a large number of "tcp-request content" or "http-request" rules for instance. A large number of rules combined with cpu-demanding actions (e.g.: actions that work on content) may create thread contention as all the rules from a given ruleset are evaluated under the same polling loop if the evaluation is not interrupted. Thus, in this patch we add extra logic around "tcp-request content", "tcp-response content", "http-request" and "http-response" rulesets, so that when a certain number of rules are evaluated under the single polling loop, we force the evaluating function to yield. As such, the rule which was about to be evaluated is saved, and the function starts evaluating rules from the saved pointer when it returns (in the next polling loop). We use task_wakeup(task, TASK_WOKEN_MSG) to explicitly wake the task so that no time is wasted and the processing is resumed ASAP. TASK_WOKEN_MSG is mandatory here because process_stream() expects TASK_WOKEN_MSG for explicit analyzers re-evaluation. rules_bcount stream's attribute was added to count how many rules were evaluated since last interruption (yield). Also, SF_RULE_FYIELD flag was added to know that the s->current_rule was assigned due to forced yield and not regular yield. By default haproxy will enforce a yield every 50 rules, this behavior can be configured using the "tune.max-rules-at-once" global keyword. There is a limitation though: for now, if the ACT_OPT_FINAL flag is set on act_opts, we consider it is not safe to yield (as it is already the case for automatic yield). In this case instead of yielding and taking the risk of not being called back, we skip the yield and hope it will not create contention. This is something we should ideally try to improve in order to yield in all conditions.	2025-02-03 17:09:48 +01:00
Christopher Faulet	04bbfa4354	BUG/MINOR: tcp-rules: Don't forward close during tcp-response content rules eval When the tcp-response content ruleset evaluation is delayed because of an ACL condition, the close forwarding on the client side is not explicitly blocked. So it is possible to close the client side before the end of the response evaluation. To fix the issue, this is now done in all cases where some data are missing. Concretely, channel_dont_close() is called in the "missing_data" goto label. Note it is only a theoretical bug (or pending bug). It is not possible to trigger it for now because an ACL cannot wait for more data when a close was received. But the code remains a bit weak. It is safer this way. It is especially mandatory for the "force yield" option that should be added soon. This patch could be backported to all stable versions.	2025-02-03 15:31:59 +01:00
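
The forced yield described in the "MEDIUM: stream: interrupt costly rulesets after too many evaluations" entry above can be sketched roughly as follows. This is only an illustration of the idea (a per-pass budget plus a saved resume position), not HAProxy's actual code: `struct ruleset_eval`, `eval_rules_once()`, `eval_rules_all()` and `MAX_RULES_AT_ONCE` are hypothetical names, and the real implementation relies on the stream's `rules_bcount`, `s->current_rule`, the `SF_RULE_FYIELD` flag and `task_wakeup()` as the commit message explains.

```c
#include <stddef.h>

/* Hypothetical sketch, not HAProxy code. The default of 50 is the one
 * quoted in the commit message for "tune.max-rules-at-once".
 */
#define MAX_RULES_AT_ONCE 50

struct ruleset_eval {
    size_t nb_rules;   /* total number of rules in the ruleset */
    size_t next_rule;  /* saved position: first rule not yet evaluated */
    size_t evaluated;  /* how many rules were evaluated overall */
};

/* Evaluate rules starting from the saved position. Returns 1 once the
 * whole ruleset was evaluated, or 0 if the budget was exhausted and the
 * caller must come back later (forced yield). budget must be > 0.
 */
static int eval_rules_once(struct ruleset_eval *ev, size_t budget)
{
    size_t count = 0;

    while (ev->next_rule < ev->nb_rules) {
        if (count >= budget)
            return 0;      /* yield: next_rule already points to the resume rule */
        ev->evaluated++;   /* stand-in for the real rule evaluation */
        ev->next_rule++;
        count++;
    }
    return 1;
}

/* Stand-in for successive polling loops: call the evaluator again until
 * it reports completion and return the number of passes that were needed.
 */
static size_t eval_rules_all(struct ruleset_eval *ev, size_t budget)
{
    size_t passes = 1;

    while (!eval_rules_once(ev, budget))
        passes++;
    return passes;
}
```

In this sketch a ruleset of 120 rules with a budget of 50 needs three passes (50, 50, then 20 rules). In HAProxy itself the resumption is driven by an explicit task wakeup, whereas here a plain loop plays that role.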

18795 Commits