haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-30 08:00:59 +01:00

Author	SHA1	Message	Date
Aurelien DARRAGON	89b04f2191	CLEANUP: sink: remove useless cleanup in sink_new_from_logger() As reported by Ilya in GH #2994, some cleanup parts in sink_new_from_logger() function are not used. We can actually simplify the cleanup logic to remove dead code, let's do that by renaming "error_final" label to "error" and only making use of the "error" label, because sink_free() already takes care of proper cleanup for all sink members.	2025-06-05 09:58:50 +02:00
Aurelien DARRAGON	368d01361a	MEDIUM: server: add and use srv_init() function rename _srv_postparse() internal function to srv_init() function and group srv_init_per_thr() plus idle conns list init inside it. This way we can perform some simplifications as srv_init() performs multiple server init steps after parsing. SRV_F_CHECKED flag was added, it is automatically set when srv_init() runs successfully. If the flag is already set and srv_init() is called again, nothing is done. This permits manually calling srv_init() earlier than the default POST_CHECK hook when needed without risking doing things twice.	2025-06-02 17:51:33 +02:00
Aurelien DARRAGON	889ef6f67b	MEDIUM: server: automatically add server to proxy list in new_server() while new_server() takes the parent proxy as argument and even assigns srv->proxy to the parent proxy, it didn't actually insert the server into the parent proxy server list on success. The result is that sometimes we add the server to the list after new_server() is called, and sometimes we don't. This is really error-prone and because of that hooks such as REGISTER_POST_SERVER_CHECK() which are run for all servers listed in all proxies may not be relied upon for servers which are not actually inserted in their parent proxy server list. Plus it feels very strange to have a server that points to a proxy, but then the proxy doesn't know about it because it cannot find it in its server list. To prevent errors and make proxy->srv list reliable, we move the insertion logic directly under new_server(). This requires knowing whether we are called during parsing or during runtime to either insert or append the server to the parent proxy list. For that we use PR_FL_CHECKED flag from the parent proxy (if the flag is set, then the proxy was checked so we are past the init phase, thus we assume we are called during runtime). This implies that during startup if new_server() has to be cancelled on error paths we need to call srv_detach() (which is now exposed in server.h) before srv_drop(). The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should now run reliably on all servers created using new_server() (without having to manually loop on global servers_list).	2025-06-02 17:51:30 +02:00
Aurelien DARRAGON	098a5e5c0b	BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers using "send-proxy" or "send-proxy-v2" option on a ring server is not relevant nor supported. Worse, on 2.4 it causes haproxy process to crash as reported in GH #2965. Let's be more explicit about the fact that this keyword is not supported under "ring" context by ignoring the option and emitting a warning message to inform the user about that. Ideally, we should do the same for peers and log servers. The proper way would be to check servers options during postparsing but we currently lack proper cross-type server postparsing hooks. This will come later and thus will give us a chance to perform the compatibility checks for server options depending on proxy type. But for now let's simply fix the "ring" case since it is the only one that's known to cause a crash. It may be backported to all stable versions.	2025-05-15 16:18:31 +02:00
Aurelien DARRAGON	bd48e26a74	CLEANUP: proxy: mention that px->conn_retries isn't relevant in some cases Since 91e785edc ("MINOR: stream: Rely on a per-stream max connection retries value"), px->conn_retries may be ignored in the following cases: * proxy not part of a list which gets properly post-init (ie: main proxy list, log-forward list, sink list) * proxy lacking the CAP_FE capability Documenting such cases where the px->conn_retries is set but effectively ignored, so that we either remove ignored statements or fix them in the future if they are really needed. In fact all cases affected here are autonomous applets that already handle the retries themselves so the fact that 91e785edc made ->conn_retries ineffective should not be a big deal anyway.	2025-04-29 21:21:19 +02:00
William Lallemand	bea6235629	MEDIUM: sink: add a new dpapi ring buffer Add a 1MB ring buffer called "dpapi" for communication with the dataplane API. It would first be used to transmit ACME information to the dataplane API but could be used for more.	2025-04-16 13:56:12 +02:00
Aurelien DARRAGON	4194f756de	MEDIUM: tree-wide: avoid manually initializing proxies In this patch we try to use the proxy API init functions as much as possible to avoid code redundancy and prevent proxy initialization errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy() instead of manually allocating the proxy pointer and performing the base init ourselves.	2025-04-10 22:10:31 +02:00
Willy Tarreau	f4634e5a38	MINOR: ring/cli: support delimiting events with a trailing \0 on "show events" At the moment it is not supported to produce multi-line events on the "show events" output, simply because the LF character is used as the default end-of-event mark. However it could be convenient to produce well-formatted multi-line events, e.g. in JSON or other formats. UNIX utilities have already faced similar needs in the past and added "-print0" to "find" and "-0" to "xargs" to mention that the delimiter is the NUL character. This makes perfect sense since it's never present in contents, so let's do exactly the same here. Thus from now on, "show events <ring> -0" will delimit messages using a \0 instead of a \n, permitting a better and safer encapsulation.	2025-04-08 14:36:35 +02:00
Willy Tarreau	0be6d73e88	MINOR: ring: support arbitrary delimiters through ring_dispatch_messages() In order to support delimiting output events with other characters than just the LF, let's pass the delimiter through the API. The default remains the LF, used by applet_append_line(), and ignored by the log forwarder.	2025-04-08 14:36:35 +02:00
Ilia Shipitsin	27a6353ceb	CLEANUP: assorted typo fixes in the code, commits and doc	2025-04-03 11:37:25 +02:00
Ilia Shipitsin	78b849b839	CLEANUP: assorted typo fixes in the code and comments code, comments and doc actually.	2025-04-02 11:12:20 +02:00
Aurelien DARRAGON	9561b9fb69	BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers When the connection for sink_forward_{oc}_applet fails or a previous one is destroyed, the sft->appctx is instantly released. However process_sink_forward_task(), which may run at any time, iterates over all known sfts and tries to create sessions for orphan ones. It means that instantly after sft->appctx is destroyed, a new one will be created, thus a new connection attempt will be made. It can be an issue with tcp log-servers or sink servers, because if the server is unavailable, process_sink_forward() will keep looping without any temporisation until the applet survives (ie: connection succeeds), which results in unexpected CPU usage on the threads responsible for that task. Instead, we add a tempo logic so that a delay of 1 second is applied between two retries. Of course the initial attempt is not delayed. This could be backported to all stable versions.	2025-02-21 11:22:35 +01:00
Aurelien DARRAGON	bfa493d4be	BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() sink_new_from_srv() leverages sink_new_buf() with the server id as name, sink_new_buf() then calls __sink_new() with the provided name. Unfortunately __sink_new() is designed in such a way that it will first look up in the list of existing sinks to check if a sink already exists with given name, in which case the existing sink is returned. While this behavior may be error-prone, it is actually up to the caller to ensure that the provided name is unique if it really expects a unique sink pointer. Due to this bug in sink_new_from_srv(), multiple tcp servers with the same name defined in distinct log backends would end up sharing the same sink, which means messages sent to one of the servers would also be forwarded to all servers with the same name across all log backend sections defined in the config, which is obviously an issue and could even raise security concerns. Example: defaults log backend@log-1 local0 backend log-1 mode log server s1 127.0.0.1:514 backend log-2 mode log server s1 127.0.0.1:5114 With the above config, logs sent to log-1/s1 would also end up being sent to log-2/s1 due to server id "s1" being used for tcp servers in distinct log backends. To fix the issue, we now prefix the sink name with the backend name: back_name/srv_id combination is known to be unique (backend name serves as a namespace). This bug was reported by GH user @landon-lengyel under #2846. UDP servers (with udp@ prefix before the address) are not affected as they don't make use of the sink facility. As a workaround, one should manually ensure that all tcp servers across different log backends (backend with "mode log" enabled) use unique names. This bug was introduced in e58a9b4 ("MINOR: sink: add sink_new_from_srv() function") thus it exists since the introduction of log backends in 2.9, which means this patch should be backported up to 2.9.	2025-01-20 12:33:20 +01:00
Willy Tarreau	f8d3d2e4cf	MINOR: ring: support unit suffixes in the size The ring size used to take only numbers and silently ignore letters (due to atol()), resulting in tiny buffers when trying to collect traces and using e.g. "size 10g". Let's make use of parse_size_err() to properly parse units.	2024-11-19 10:56:45 +01:00
Aurelien DARRAGON	1bdf6e884a	MEDIUM: sink: implement sink_find_early() sink_find_early() is a convenient function that can be used instead of sink_find() during parsing time in order to try to find a matching sink even if the sink is not defined yet. Indeed, if the sink is not defined, sink_find_early() will try to create it and mark it as forward-declared. It will also save information from the caller to better identify it in case of errors. If the sink happens to be found in the config, it will transition from forward-declared type to its final type. Else, it means that the sink was not found in the config, in this case, during postresolve, we raise an error to indicate that the sink was not found in the configuration. It should help solve postresolving issues with rings, because for now only log targets implement proper ring postresolving, but rings may be used at different places in the code, such as debug() converter or in "traces" section.	2024-10-10 16:55:15 +02:00
Willy Tarreau	fdf38ed7fc	BUG/MINOR: proxy: also make the cli and resolvers use the global name As detected by ASAN on the CI, two places still using strdup() on the proxy names were left by commit b325453c3 ("MINOR: proxy: use the global file names for conf->file"). No backport is needed.	2024-09-21 20:08:06 +02:00
Aurelien DARRAGON	e328056ddc	MEDIUM: sink: assume sft appctx stickiness As mentioned in b40d804 ("MINOR: sink: add some comments about sft->appctx usage in applet handlers"), there are few places in the code where it looks like we assumed that the applet callbacks such as sink_forward_session_init() or sink_forward_io_handler() could be executing an appctx whose sft is detached from the appctx (appctx != sft->appctx). In practise this should not be happening since an appctx sticks to the same thread its entire lifetime, and the only times sft->appctx is effectively assigned is during the session/appctx creation (in process_sink_forward()) or release. Thus if sft->appctx wouldn't point to the appctx that the sft was bound to after appctx creation, it would probably indicate a bug rather than an expected condition. To further emphasize that and prevent the confusion, and since 3.1-dev4 was released, let's remove such checks and instead add a BUG_ON to ensure this never happens. In _sink_forward_io_handler(), the "hard_close" label was removed since there are no more uses for it (no hard errors may be caught from the function for now)	2024-07-25 14:56:19 +02:00
Aurelien DARRAGON	2513bd257f	OPTIM: sink: consider threads' current load when rebalancing applets In c454296f0 ("OPTIM: sink: balance applets accross threads"), we already made sure to balance applets across threads by picking a random thread to spawn the new applet. Also, thanks to the previous commit, we also have the ability to destroy the applet when a certain amount of messages were processed to help distribute the load during runtime. Let's improve that by trying up to 3 different threads in the hope to pick a non-overloaded one in the best scenario, and the least overloaded one in the worst case. This should help to better distribute the load over multiple threads when high loads are expected. Logic was greatly inspired from thread migration logic used by server health checks, but it was simplified for sink's use case.	2024-07-24 17:59:18 +02:00
Aurelien DARRAGON	237849c911	MEDIUM: sink: "max-reuse" support for sink servers Thanks to the previous commit, it is now possible to know how many events were processed for a given sft/server sink pair. As mentioned in commit c454296 ("OPTIM: sink: balance applets accross threads"), let's provide the ability to restart a server connection when a certain amount of events were processed to help better balance the load over multiple threads. For this, we make use of the "max-reuse" server keyword which was only relevant under "http" context so far. Under sink context, "max-reuse" corresponds to the number of times the tcp connection can be reused for sending messages, which in fact means that "max-reuse + 1" is the number of events (ie: messages) that are allowed to be sent using the same tcp server connection: when this threshold is met, the connection will be destroyed and a new one will be created on a random thread. The value is not strict: it is the minimum value above which the connection may be destroyed since the value is checked after ring_dispatch_messages() which may process multiple messages at once. By default, no limit is enforced (the connection will be reused for as long as it is available). The documentation was updated accordingly.	2024-07-24 17:59:14 +02:00
Aurelien DARRAGON	709b3db941	MINOR: sink: add processed events counter in sft Add a new struct member to sft structure named e_processed in order to track the total number of events processed by sft applets. sink_forward_oc_io_handler() and sink_forward_io_handler() now make use of ring_dispatch_messages() optional value added in the previous commit in order to increase the number of processed events.	2024-07-24 17:59:08 +02:00
Aurelien DARRAGON	47323e64ad	MINOR: ring: count processed messages in ring_dispatch_messages() ring_dispatch_messages() now takes an optional argument <processed> which must point to a size_t counter when provided. When provided, the value is updated to the number of messages processed by the function.	2024-07-24 17:59:03 +02:00
Aurelien DARRAGON	0821460e3f	MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface Given that sink applets are responsible for conveying messages from the ring to the tcp server endpoint, there are no protocol timeout or errors expected there, it is a unidirectional flow of data over TCP. As such, NOLINGER flag which was inherited from peers applet, see dbd026792 ("BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface") is not desirable under sink context: The reason why we have the NOLINGER flag set is to ensure the connection is closed right away and avoid 60s TIME_WAIT delay on closed sockets. The downside is that messages sent right before closing the socket are not guaranteed to make it to the server because closing with NOLINGER flag set will result in RST packet being emitted right away, which could prevent in-flight messages from being properly delivered. Unlike peers applets, the only cases where sink applets are expected to close the connection are upon unexpected error or upon stopping, which are relatively rare events. Thanks to previous commit, ERROR flag is already set in case of error, so the use of NOLINGER is not mandatory for the RST to be sent. Now for the stopping case, it only happens once in the process lifetime so it's acceptable to close the socket using EOS+EOI flags without the NOLINGER option set. So in our case, it is preferable to ensure messages get properly delivered knowing that closed sockets should be piling up in TIME_WAIT, this means removing the NOLINGER flag on the outgoing stream interface for sink applets. It is a prerequisite for upcoming patches in order to cleanly shut the applet during runtime without risking to send the RST packet before all pending messages were sent to the endpoint.	2024-07-24 17:58:58 +02:00
Aurelien DARRAGON	c6ab0e14e2	MINOR: sink: distinguish between hard and soft close in _sink_forward_io_handler() Aborting the socket on soft-stop is not the same as aborting it due to unexpected error. As such, let's leverage the granularity offered by sedesc flags to better reflect the situation: abort during soft-stop is handled as a soft close thanks to EOI+EOS flags, while abort due to unexpected error is handled as hard error thanks to ERROR+EOS flags. Thanks to this change, hard error will always emit RST packet even if the NOLINGER option wasn't set on the socket.	2024-07-24 17:58:52 +02:00
Aurelien DARRAGON	b40d804c7f	MINOR: sink: add some comments about sft->appctx usage in applet handlers There seems to be an ambiguity in the code where sft->appctx would differ from the appctx that was assigned to it upon appctx creation. In practice, it doesn't seem this could be happening. Adding a few notes to come back to this later and try to see if we can remove this ambiguity.	2024-07-24 17:58:47 +02:00
Aurelien DARRAGON	10811fdfd6	MINOR: sink: merge sink_forward_io_handler() with sink_forward_oc_io_handler() Now that sink_forward_oc_io_handler() and sink_forward_io_handler() were unified again thanks to the previous commit, let's take a chance to merge code that is common to both functions in order to ease code maintenance. Let's add _sink_forward_io_handler() internal function which takes the applet and a message handler as argument: sink_forward_io_handler() and sink_forward_oc_io_handler() leverage this internal function by passing the correct message handler for the desired format.	2024-07-24 17:58:41 +02:00
Aurelien DARRAGON	f2848e6146	MINOR: sink: Remove useless test on SE_FL_SHR/SHW flags Re-apply dcd917d972 ("MINOR: applet: Remove uselelss test on SE_FL_SHR/SHW flags") for sink_forward_oc_io_handler() function as it was probably overlooked given that sink_forward_oc_io_handler() and sink_forward_io_handler() follow the same logic.	2024-07-24 17:58:35 +02:00
Aurelien DARRAGON	901a66b3fc	MINOR: sink: unify and sink_forward_io_handler() and sink_forward_oc_io_handler() In a739dc2 ("MEDIUM: sink: Use the sedesc to report and detect end of processing"), we added a drain after close in sink_forward_oc_io_handler() by the use of "goto out". However, since we perform a close, there is no reason to drain data from the socket. Moreover, before the patch there was no drain and nothing mentioned the fact that the drain was added on purpose. Lastly, sink_forward_io_handler() and sink_forward_oc_io_handler() functions are strictly identical when it comes to processing logic, and the drain was only added in sink_forward_oc_io_handler() and not in sink_forward_io_handler(). As such, it's pretty safe to assume that the drain is not needed here and was added by accident. So in this patch we remove it in an attempt to unify sink_forward_io_handler() and sink_forward_oc_io_handler() functions like it was already the case before.	2024-07-24 17:58:30 +02:00
Aurelien DARRAGON	c81b8ee480	BUG/MEDIUM: sink: properly init applet under sft lock Since 09d69eacf8 ("MEDIUM: sink: start applets asynchronously") the applet is no longer initialized under the sft lock while it was the case before. At first it doesn't seem to be an issue, but if we look closer at sink_forward_session_init(), we can see that sft->appctx is assigned while it can be accessed at the same time from sink_init_forward(). Let's restore the old guarantees by performing the .init under the sft lock. No backport needed unless 09d69eacf8 is.	2024-07-24 17:58:24 +02:00
Aurelien DARRAGON	c454296f07	OPTIM: sink: balance applets accross threads Most of the time all sink applets (which are responsible for relaying messages from the ring to the tcp server endpoints) would end up being assigned to the first available thread (tid:0), resulting in excessive CPU usage on a single thread when multiple sink servers were defined (no matter if they were defined over multiple "ring" sections) and significant message load was pushed through them over the ring API. This patch is similar to 34e4085f ("MEDIUM: peers: Balance applets across threads") but for sinks. We use a slightly different approach, which is to elect a random thread instead of picking the one with the fewest applets. This proves to be already sufficient to alleviate the issue. In the case we want to have a better load distribution we should consider breaking existing connections to reestablish them on a new thread when we find out that they start monopolizing a cpu thread (ie: after a certain amount of messages for instance). Also check tcpchecks migrating model for inspiration. This patch depends on the previous one ("MEDIUM: sink: start applets asynchronously").	2024-07-17 16:45:49 +02:00
Aurelien DARRAGON	09d69eacf8	MEDIUM: sink: start applets asynchronously Since d9c1d33fa1 ("MEDIUM: applet: Add support for async appctx startup on a thread subset"), it is now possible to delay appctx's init: for that it is required that the .init callback is defined on the applet. When the applet is processed on its first run, the applet API will automatically finish the applet initialization. Thus we explicitly call appctx_wakeup() on the applet to schedule it for initial run instead of calling appctx_init() ourselves. This is done in anticipation of the next patch in order to be able to schedule the applet on a different thread from the one executing sink_forward_session_create() function. Note: 'out_free_appctx' label was removed since it is no longer used.	2024-07-17 16:45:43 +02:00
Aurelien DARRAGON	6c5869f846	DEBUG: sink: add name hint for memory area used by memory-backed sinks Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint for user created memory-backed sinks (ring sections without backing-file) so that they can be easily identified in /proc/<pid>/maps. Depending on malloc() implementation, such memory areas will normally be merged on the heap under MMAP_THRESHOLD (128 kB by default) and will have a dedicated memory area once the threshold is exceeded. As such, when large enough, they will appear like this in /proc/<pid>/maps: 7b8e8ac00000-7b8e8bf13000 rw-p 00000000 00:00 0 [anon:ring:myring]	2024-05-21 17:55:09 +02:00
Aurelien DARRAGON	0cfbeb1ae8	BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps Since 40d1c84bf0 ("BUG/MAJOR: ring: free the ring storage not the ring itself when using maps"), munmap() call for startup_logs's ring and file-backed rings fails to work (EINVAL) and causes memory leaks during process cleanup. munmap() fails because it is called with the ring's usable area pointer which is an offset from the underlying original memory block allocated using mmap(). Indeed, ring_area() helper function was misused because it didn't explicitly mention that the returned address corresponds to the usable storage's area, not the allocated one. To fix the issue, we add an explicit ring_allocated_area() helper to return the allocated area for the ring, just like we already have ring_allocated_size() for the allocated size, and we properly use both the allocated size and allocated area to manipulate them using munmap() and msync(). No backport needed.	2024-05-21 11:42:35 +02:00
Amaury Denoyelle	634cc2a5d8	MINOR: counters: move last_change into counters struct last_change was a member present in both proxy and server struct. It is used as an age statistic to report the last update of the object. Move last_change into fe_counters/be_counters. This is necessary to be able to manipulate it through generic stat column and report it into stats-file. Note that there is a change for proxy structure with now 2 different last_change values, on frontend and backend side. Special care was taken to ensure that the value is initialized only on the proxy side. The other value is set to 0 unless a listen proxy is instantiated. For the moment, only backend counter is reported in stats. However, with now two distinct values, stats could be extended to report it on both sides.	2024-05-02 10:55:25 +02:00
Willy Tarreau	40d1c84bf0	BUG/MAJOR: ring: free the ring storage not the ring itself when using maps A recent issue was uncovered by the CI which started to randomly report segfaults on a few tests, and more systematically on FreeBSD. It turns out that it was introduced by recent commit 03816ccfa9 ("MAJOR: ring: insert an intermediary ring_storage level"), which overlooked the munmap() path of the sink and startup logs: once the ring and its storage were split, it was no longer correct to munmap() the ring, only its storage area needs to be unmapped, and the ring must always be freed separately. Thanks to Christopher and William for their help at trying to reproduce it and figure the circumstances that trigger it. No backport is needed.	2024-03-26 15:15:59 +01:00
Willy Tarreau	9e99cfbeb6	MAJOR: ring: drop the now unneeded lock It was only used to protect the list which is now an mt_list so it doesn't provide any required protection anymore. It obviously also used to provide strict ordering between the writer and the reader when the writer started to update the messages, but that's now covered by the ordered tail updates and updates to the readers count to protect the area. The message rate on small thread counts (up to 12) saw a boost of roughly 5% while for large counts it lost about 2% due to some contention now becoming visible elsewhere. Typical measures are 6.13M -> 6.61M at 3C6T, and 1.88 -> 1.92M at 24C48T on the EPYC.	2024-03-25 17:34:19 +00:00
Willy Tarreau	a2d2dbf210	MEDIUM: ring/applet: turn the wait_entry list to an mt_list instead Rings are keeping a lock only for the list, which apparently doesn't need anything more than an mt_list, so let's first turn it into that before dropping the lock. There should be no visible effect.	2024-03-25 17:34:19 +00:00
Willy Tarreau	eb3d5f464d	MEDIUM: ring: use the topmost bit of the tail as a lock We're now locking the tail while looking for some room in the ring. In fact it's still while writing to it, but the goal definitely is to get rid of the lock ASAP. For this we reserve the topmost bit of the tail as a lock, which may have as a possible visible effect that buffers will be limited to 2GB instead of 4GB on 32-bit machines (though in practice, good luck for allocating more than 2GB contiguous on 32-bit), but in practice since the size is read with atol() and some operating systems limit it to LONG_MAX unless passing negative numbers, the limit is already there. For now the impact on x86_64 is significant (drop from 2.35 to 1.4M/s on 48 threads on EPYC 24 cores) but this situation is only temporary so that changes can be reviewable and bisectable. Other approaches were attempted, such as using XCHG instead, which is slightly faster on x86 with low thread counts (but causes more write contention), and forces readers to stall under heavy traffic because they can't access a valid value for the queue anymore. A CAS requires preloading the value and is less good on ARMv8.1. XADD could also be considered with 12-13 upper bits of the offset dedicated to locking, but that looks overkill.	2024-03-25 17:34:19 +00:00
Willy Tarreau	bf3dead20c	MEDIUM: ring: remove the struct buffer from the ring The purpose is to store a head and a tail that are independent so that we can further improve the API to update them independently from each other. The struct was arranged like the original one so that as long as a ring has its head set to zero (i.e. no recycling) it will continue to work. The new format is already detectable thanks to the "rsvd" field which indicates the number of reserved bytes at the beginning. It's located where the buffer's area pointer previously was, so that older versions of haring can continue to open the ring in repair mode, and newer ones can use the fact that the upper bits of that variable are zero to guess that it's working with the new format instead of the old one. Also let's keep in mind that the layout will further change to place some alignment constraints. The haring tool was thus updated based on this and it detects that the rsvd field is smaller than a page and that the sum of it with the size equals the mapped size, in which case it uses the new dump_v2() function instead of dump_v1(). The new function also creates a buffer from the ring's area, size, head and tail and calls the generic one so that no other code had to be adapted.	2024-03-25 17:34:19 +00:00
Willy Tarreau	4e6de42b27	MINOR: ring: allow to reduce a ring size In ring_resize() we used to check if the new ring was at least as large as the previous one before resizing it, but what counts is that it's as large as the previous one's contents. Initially it was thought this would not really matter, but given that rings are initially created as BUFSIZE, it's currently not possible to shrink them for debugging purposes. Now with this change it is.	2024-03-25 17:34:19 +00:00
Willy Tarreau	03816ccfa9	MAJOR: ring: insert an intermediary ring_storage level We'll need to add more complex structures in the ring, such as wait queues. That's far too much to be stored into the area in case of file-backed contents, so let's split the ring definition and its storage once and for all. This patch introduces a struct ring_storage which is assigned to ring->storage, which contains minimal information to represent the storage layout, i.e. for now only the buffer, and all the rest remains in the ring itself. The storage is appended immediately after it and the buffer's pointer always points to that area. It has the benefit of remaining 100% compatible with the existing file-backed layout. In memory, the allocation loses the size of a struct buffer. It's not even certain it's worth placing the size there, given that it's constant and that a dump of a ring wouldn't really need it (the file size is sufficient). But for now everything comes with the struct buffer, and later this will change once split into head and tail. Also this area may be completed with more information in the future (e.g. storage version, format, endianness, word size etc).	2024-03-25 17:34:19 +00:00
Willy Tarreau	80441a6983	MINOR: ring: use ring_size(), ring_area(), ring_head() and ring_tail() Some open-coded constructs were updated to make use of the ring accessors instead. This allows to remove some direct dependencies on the buffers API a bit more.	2024-03-25 17:34:19 +00:00
Willy Tarreau	8f3edf2ac6	MEDIUM: log/sink: make the log forwarder code use ring_dispatch_messages() This code becomes even simpler and almost does not need any knowledge of the structure of the ring anymore. It even highlighted that an old race had not been fixed due to code duplication, but that's now done.	2024-03-25 17:34:19 +00:00
Willy Tarreau	c262442b1a	MEDIUM: sink: move the generic ring forwarder code use ring_dispatch_messages() Now the code is much simpler and the ring forwarding function almost does not need any knowledge of the structure of the ring anymore.	2024-03-25 17:34:19 +00:00
Willy Tarreau	8022ae326c	MEDIUM: ring/sink: use applet_append_line()/syslog_applet_append_event() for readers The rink reader code was duplicated as-is in 2.2 for the ring forwarding code in commits 494c505703 ("MEDIUM: ring: add server statement to forward messages from a ring") and 975564784f ("MEDIUM: ring: add new srv statement to support octet counting forward") (which only differs by using a prefix instead of a suffix to delimit messages). Unfortunately, that makes it almost impossible to rework the core ring code because all these parts rely on it. This first commit aims at restoring a common structure for the core loop by just calling a distinct function based on the use case. The functions are either applet_append_line() when a whole line is to be emitted followed by an LF character, or syslog_applet_appent_event() when trying to send a TCP syslog line prepended with its size in decimal. There is no functional change beyond this.	2024-03-25 17:34:19 +00:00
Willy Tarreau	758cb450a2	OPTIM: sink: drop the sink lock used to count drops The sink lock was made to prevent event producers from passing while there were other threads trying to print a "dropped" message, in order to guarantee the absence of reordering. It has a serious impact however, which is that all threads need to take the read lock when producing a regular trace even when there's no reader. This patch takes a different approach. The drop counter is shifted left by one so that the lowest bit is used to indicate that one thread is already taking care of trying to dump the counter. Threads only read this value normally, and will only try to change it if it's non-null, in which case they'll first check if they are the first ones trying to dump it, otherwise will simply count another drop and leave. This has a large benefit. First, it will avoid the locking that causes stalls as soon as a slow reader is present. Second, it avoids any write on the fast path as long as there's no drop. And it remains very lightweight since we just need to add +2 or subtract 2*dropped in operations, while offering the guarantee that the sink_write() has succeeded before unlocking the counter. While a reader was previously limiting the traffic to 11k RPS under 4C/8T, now we reach 36k RPS vs 14k with no reader, so readers will no longer slow the traffic down and will instead even speed it up due to avoiding the contention down the chain in the ring. The locking cost dropped from ~75% to ~60% now (it's in ring_write now).	2024-03-09 11:23:52 +01:00
Willy Tarreau	eb7b2ec83a	OPTIM: sink: try to merge "dropped" messages faster When a reader doesn't read fast enough and causes drops, subsequent threads try to produce a "dropped" message. But it takes time to produce and emit this message, in part due to the use of chunk_printf() that relies on vfprintf() which has to parse the printf format, and during this time other threads may continue to increment the counter. This is the reason why this is currently performed in a loop. When reading what is received, it's common to see a large count followed by one or two single-digit counts, indicating that we could possibly have improved that by writing faster. Let's improve the situation a little bit. First we're now using a static message prefixed with enough space to write the digits, and a call to ultoa_r() fills these digits from right to left so that we don't have to process a format string nor perform a copy of the message. Second, we now re-check the counter immediately after having prepared the message so that we still get an opportunity for updating it. In order to avoid too long loops, this is limited to 10 iterations. Tests show that the number of single-digit "dropped" counters on output now dropped roughly by 15-30%. Also, it was observed that with 8 threads, there's almost never more than one retry.	2024-03-09 11:23:52 +01:00
Willy Tarreau	962c129dc1	BUG/MINOR: sink: fix a race condition in the TCP log forwarding code That's exactly the same as commit 53bfab080c ("BUG/MINOR: sink: fix a race condition between the writer and the reader") that went into 2.7 and was backported as far as 2.4, except that since the code was duplicated, the second instance was not noticed, leaving the race present. The race has a limited impact: if a forwarder reaches the end of the logs and a new message arrives before it leaves, the forwarder will only wake up after yet another new message is sent. In practice it remains unnoticeable because for the race to trigger, one needs to have a steady flow of logs, which means the wakeup will happen anyway. This should be backported, but no need to insist on it if it resists.	2024-03-05 11:48:44 +01:00
Christopher Faulet	dcd917d972	MINOR: applet: Remove useless test on SE_FL_SHR/SHW flags Both of these flags are set after releasing the applet, in appctx_shut(). Concretely, it means the applet is shut down for reads and writes. Once they are set, the applet's I/O handler is no longer called. Tests on these flags are useless: there is no chance to match them.	2024-02-14 14:22:36 +01:00
Ilya Shipitsin	80813cdd2a	CLEANUP: assorted typo fixes in the code and comments This is the 37th iteration of typo fixes.	2023-11-23 16:23:14 +01:00
Aurelien DARRAGON	078ebde870	CLEANUP: sink: useless leftover in sink_add_srv() Removing a useless leftover which has been introduced with 31e8a003a5 ("MINOR: sink: function to add new sink servers")	2023-11-10 17:49:57 +01:00
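
The counter scheme described in commit 758cb450a2 above (drop counter shifted left by one, lowest bit marking that one thread is already dumping it) can be illustrated with a minimal sketch. This is not the code from the HAProxy tree: the names drop_ctr, count_drop(), try_claim_dump() and finish_dump() are hypothetical, and C11 atomics are assumed in place of HAProxy's own atomic macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: bit 0 = "one thread is dumping the counter",
 * bits 1 and above = number of dropped events (so one drop == +2).
 */
static atomic_uint drop_ctr;

/* Count one dropped event without touching the flag bit. */
static void count_drop(void)
{
    atomic_fetch_add(&drop_ctr, 2);
}

/* Try to become the thread that reports the drops.
 * Returns the number of drops to report, or 0 when there is nothing
 * to report or another thread already holds the flag.
 */
static unsigned try_claim_dump(void)
{
    unsigned v = atomic_load(&drop_ctr);

    /* Only attempt a write when the counter is non-zero and unclaimed;
     * the common no-drop path performs a plain read.
     */
    while (v != 0 && !(v & 1)) {
        /* On failure, v is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&drop_ctr, &v, v | 1))
            return v >> 1;
    }
    return 0;
}

/* Release the flag. When the "dropped" message was written, also
 * subtract the reported drops (2 * reported); drops counted by other
 * threads in the meantime stay in the counter.
 */
static void finish_dump(unsigned reported, bool written)
{
    assert(atomic_load(&drop_ctr) & 1); /* caller must hold the flag */
    atomic_fetch_sub(&drop_ctr, written ? 2 * reported + 1 : 1);
}
```

In this sketch a producer that sees the flag already set would simply call count_drop() and leave, which matches the behaviour the commit message describes; how the real code retries and merges messages is not shown here.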

222 Commits