haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-12-22 18:11:21 +01:00

Author	SHA1	Message	Date
Christopher Faulet	bca5e14235	OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR Doing some benchs on the 3.0, we encountered a small loss on requests/sec on small objects compared to the 2.8 . After bisecting the issue, it appeared that this was introduced when the mux-to-mux zero-copy data forwarding was implemented in 2.9-dev8. Extra subscribes on receives at the end of the message were responsible of the loss. A basic configuration, sending H2 requests to a H1 server returning responses without payload is enough to observe the issue. With the following command, we can observe a huge increase of epoll_ctl calls on 2.9/3.x: h2load -c 100 -m 10 -n 100000 http://... On 2.8 we have around 3200 calls to epoll_ctl against more than 20k on 3.1. The fix seems obvious. After a receive, there is no reason to state a mux have more data to deliver if EOI/EOS/ERROR flag was set on the stream-endpoint descriptor. With this change, extra calls to epoll_ctl disappear. However it is a sensitive part so it is important to keep an eye on it and to not backport it. Thanks to Willy and Emeric to have spot the issue.	2024-09-30 16:55:48 +02:00
Willy Tarreau	11051ed9c7	OPTIM: channel: speed up co_getline()'s search of the end of line Previously, co_getline() was essentially used for occasional parsing in peers's banner or Lua, so it could afford to read one character at a time. However now it's also used on the TCP log path, where it can consume up to 40% CPU as mentioned in GH issue #2731. Let's speed it up by using memchr() to look for the LF, and copying the data at once using memcpy(). Previously it would take 2.44s to consume 1 GB of log on a single thread of a Core i7-8650U, now it takes 1.56s (-36%).	2024-09-30 11:36:39 +02:00
Willy Tarreau	1d403caf8a	MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute() When shutting down server sessions, the queue was not considered, which is a problem if some element reached the queue at the moment the server was going down, because there will be no more requests to kick them out of it. Let's always make sure we scan the queue to kick these streams out of it and that they can possibly find a more suitable server. This may make a difference in the time it takes to shut down a server on the CLI when lots of servers are in the queue. It might be interesting to backport this to 3.0 but probably not much further.	2024-09-27 19:01:38 +02:00
Willy Tarreau	1385e33eb0	BUG/MINOR: queue: make sure that maintenance redispatches server queue Turning a server to maintenance currently doesn't redispatch the server queue unless there's an explicit "option redispatch" and no "option persist", while the former has never really been the purpose of this test. Better refine this so that forced maintenance also causes the queue to be flushed, and possibly redispatched unless the proxy has option persist. This way now when turning a server to maintenance, the queue is immediately flushed and streams can decide what to do. This can be backported, though there's no need to go far since it was never directly reported and only noticed as part of debugging some rare "shutdown sessions" strangeness, which it might participate to.	2024-09-27 18:54:07 +02:00
Willy Tarreau	b8e3b0a18d	BUG/MEDIUM: stream: make stream_shutdown() async-safe The solution found in commit b500e84e24 ("BUG/MINOR: server: shut down streams under thread isolation") to deal with inter-thread stream shutdown doesn't work fine because there exists code paths involving a server lock which can then deadlock on thread_isolate(). A better solution then consists in deferring the shutdown to the stream itself and just wake it up for that. The only thing is that TASK_WOKEN_OTHER is a bit too generic and we need to pass at least 2 types of events (SF_ERR_DOWN and SF_ERR_KILLED), so we're now leveraging the new TASK_F_UEVT1 and _UEVT2 flags on the task's state to convey these info. The caller only needs to wake the task up with these flags set, and the stream handler will then finish the job locally using stream_shutdown_self(). This needs to be carefully backported to all branches affected by the dequeuing issue and containing any of the 5541d4995d ("BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()"), and/or b11495652e ("BUG/MEDIUM: queue: implement a flag to check for the dequeuing").	2024-09-27 12:15:41 +02:00
Willy Tarreau	d1c398b786	Revert "BUG/MINOR: server: shut down streams under thread isolation" This reverts commit b500e84e24fd19ccbcdf4fae5165aeb07e46bd67. Thread isolation does not work well for this, there exists code paths which already hold the server's lock and result in a deadlock. Let's revert that and address it better without isolation.	2024-09-27 10:17:31 +02:00
Aurelien DARRAGON	e3eb6a9035	MEDIUM: log: consider log-steps proxy setting for existing log origins During tcp/http transaction processing, haproxy may produce logs at different steps during the processing (accept, connect, request, response, close). But the behavior is hardly configurable because haproxy will only emit a single log per transaction, and by default it will try to produce the log once all log aliases or fetches used in the logformat could be satisfied, which means the log is often emitted during connection teardown, unless "option logasap" is used. We were often asked to have a way to emit multiple logs for a single transaction, like for instance emit log during accept, then request, response and close for instance, see GH #401 for more context. Thanks to "log-steps" keyword introduced by commit "MINOR: log: introduce "log-steps" proxy keyword", it is now possible to explictly configure when logs should be generated by haproxy when processing a transaction. This commit adds the required checks so that log-steps proxy option is properly considered for existing logs generated by haproxy. If "log-steps" is not specified on the proxy, the old behavior is preserved. Note: a slight cpu overhead should only be visible when "log-steps" keyword will be used due to the implementation relying on eb32 lookup instead of basic bitfield check as described in "MINOR: proxy: add log_steps struct member". However, the default behavior shouldn't be affected. When combining log-steps with log-profiles, user has the ability to explicitly control how and when haproxy should generate logs during requests handling.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	4189eb7aca	MINOR: log: add log_orig_proxy() helper function Function may be used on proxy where log-steps are used to check if a given log origin should be handled or not.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	c043d5d372	MINOR: log: introduce "log-steps" proxy keyword For now it is only available for proxies with frontend capability because log-steps are only evaluated under sess_log() or strm_log() which essentially focus on the frontend side when it comes to log settings so it's better to keep it this way for better consistency, at least for now. For now the setting does nothing (it is not considered during runtime), it will be implemented and documented in upcoming commits.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	9341792baf	MINOR: proxy: add log_steps struct member add proxy->conf.log_steps eb32 root tree which will be used to store the log origin identifiers that should result in haproxy emitting a log as configured by the user using upcoming "log-steps" proxy keyword. It was chosen to use eb32 tree instead of simple bitfield because despite the slight overhead it is more future-proof given that we already implemented the prerequisites for seamless custom log origins registration that will also be usable from "log-steps" proxy keyword.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	b882402a29	MINOR: log: support extra log origins for '%OG' alias Following previous commits, let's improve log_orig_to_str() so that extra log origins (registered through log_orig_register()) can be translated to string from origin ID. For that, it is required to add eb_32 tree node to log_origin struct in order to enable quick integer lookup during runtime. Slow name lookup using the list is acceptable for config parsing, but it is not the case during runtime when log_orig_to_str() is expected to be used. Also, to prevent duplicated info, get rid of ->id field and use ->tree.key instead	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	f8bb9d5c57	MINOR: log: explicitly handle extra log origins as error when relevant Thanks to previous commit, we can know check for log_orig optional flags in functions taking struct log_orig as parameter. Let's take this opportunity to add the LOG_ORIG_FL_ERROR flag and check this flag at a few places to handle the log message differently because if the flag is set then the caller expects the log to be handled as an error explicitly. e.g.: in _process_send_log_override(), if the flag is set, use the error log format instead of the dedicated one.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	3c15ee05e9	MINOR: log: introduce log_orig flags Rename 'enum log_orig' to 'enum log_orig_id', since this enum specifically contains the log origin ids. Add 'struct log_orig' which wraps 'enum log_orig' with optional flags (no flags defined for now). Add log_orig() helper func that takes id and flags as parameter and returns log_orig struct initialized with input arguments. Update functions taking log origin as parameter so they explicitly take log orig id or log orig wrapper as argument depending on the level of context expected by the function.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	6567e37680	MINOR: log: handle extra log origins in _process_send_log_override() Thanks to the previous commit, it is now possible to register additional log origins that may be used from log-profile section as 'on' steps. As such, let's make _process_send_log_override() function aware of them by trying to lookup in the tree of extra logging steps in the default switch-case catchall. If the log origin id matches with the id of the extra logging step, we use the associated log format instead of the "any" log format.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	818475c5cc	MINOR: log: introduce extra log profile steps add a way to register additional log origins using log_origin_register() that may be used as log profile steps from log profile sections. For now this does nothing as no extra origins are registered and extra log origins are not yet considered for runtime logging paths. When specifying an extra logging step for on <step> under log-profile section, the logging step is stored within a binary tree for efficient lookup during runtime. No performance impact should be expected if extra log origins are not being used, and slight performance impact if extra log origins are used. Don't forget to update the documentation when new log origins are added (both %OG log alias and on <step> log-profile keyword are concerned.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	facf259d88	MINOR: log: fix indent in strm_log() 8f34320e15 ("MINOR: log: provide log origin in logformat expressions using '%OG'") caused wrong indent in strm_log()	2024-09-26 16:53:07 +02:00
Oliver Dala	a889413f5e	BUG/MEDIUM: cli: Deadlock when setting frontend maxconn The proxy lock state isn't passed down to relax_listener through dequeue_proxy_listeners, which causes a deadlock in relax_listener when it tries to get that lock. Backporting: Older versions didn't have relax_listener and directly called resume_listener in dequeue_proxy_listeners. lpx should just be passed directly to resume_listener then. The bug was introduced in commit 001328873c352e5e4b1df0dcc8facaf2fc1408aa [cf: This patch should fix the issue #2726. It must be backported as far as 2.4]	2024-09-25 17:12:11 +02:00
Christopher Faulet	14a413033c	BUG/MEDIUM: cli: Be sure to catch immediate client abort A client abort while nothing was sent is properly handled except when this immediately happens after the connection was accepted. The read0 event is caught before the CLI applet is created. In that case, the shutdown is not handled and the applet is no longer wakeup. In that case, the stream remains blocked and no timeout are armed. The bug was due to the fact that when the applet I/O handler was called for the first time, the applet context was initialized and nothing more was performed. A shutdown, if any, would be handled on the next call. In that case, it was too late. Now, afet the init step, we loop to eval the first command. There is no command here but the shutdown will be tested. This patch should fix the issue #2727. It must be backported to 3.0.	2024-09-24 18:01:38 +02:00
Aurelien DARRAGON	d622f9d5b6	MEDIUM: mailers: warn about deprecated legacy mailers As mentioned in 2.8 announce on the mailing list [1] and on the wiki [2], use of legacy mailers is now deprecated and will not be supported anymore starting with version 3.3. Use of Lua script (AKA Lua mailers) is now encouraged (and fully supported since 2.8) for this purpose, as it offers more flexibility (e.g: alerts can be customized) and is more future-proof. Configurations relying on legacy mailers will now raise a warning. Users willing to keep their existing mailers config in a working state should simply add the following line to their global section: # mailers.lua file as provided in the git repository # adjust path as needed lua-load examples/lua/mailers.lua [1]: https://www.mail-archive.com/haproxy@formilux.org/msg43600.html [2]: https://github.com/haproxy/wiki/wiki/Breaking-changes	2024-09-23 20:16:27 +02:00
Willy Tarreau	fdf38ed7fc	BUG/MINOR: proxy: also make the cli and resolvers use the global name As detected by ASAN on the CI, two places still using strdup() on the proxy names were left by commit b325453c3 ("MINOR: proxy: use the global file names for conf->file"). No backport is needed.	2024-09-21 20:08:06 +02:00
Willy Tarreau	b500e84e24	BUG/MINOR: server: shut down streams under thread isolation Since the beginning of thread support, the shutdown of streams attached to a server was run under the server's lock, but that's not sufficient. It indeed turns out that shutting down streams (either from the CLI using "shutdown sessions server XXX" or due to "on-error shutdown-sessions") iterates over all the streams to shut them down, but stream_shutdown() has no way to protect its actions against concurrent actions from the stream itself on another thread, and streams offer no such provisions anyway. The impact is some rare but possible crashes when shutting down streams from the CLI in cmopetition with high server traffic. The probability is low enough to mark it minor, though it was observed in the field. At least since 2.4 the streams are arranged in per-thread lists, so it likely would be possible using the event subsystem to delegate these events to dedicated per-thread tasks which would address the problem. But server streams don't get killed often enough to justify such extra complexity, so better just run the loop under thread isolation. It also shows that the internal API could probably be improved to support a lighter thread exclusion instead of full isolation: various places want to only exclude one thread and here it could work. But again there's no point doing this for now. This patch should be backported to all stable branches. It's important to carefully check that this srv_shutdowns_streams() function is never called itself under isolation in older versions (though at first glance it looks OK).	2024-09-21 19:35:35 +02:00
Willy Tarreau	e77c73316a	MEDIUM: cfgparse: warn about deprecated use of duplicate server names As discussed below, there are too many problems and limitations caused by still supporting duplicate server names. That's already particularly complicated and dissuasive to use since it requires these servers to have explicit IDs to be accept. Let's now warn on any duplicate, even with explicit IDs and remind that this will become forbidden in 3.3. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-20 17:15:11 +02:00
Willy Tarreau	029d75df1e	OPTIM: cfgparse: speed up duplicate server detection Surprisingly, the duplicate server name detection has never made use of the names tree, so lookups were still in O(N^2). It took 1 second to validate 50k servers spread into 25 backends at 2k per backend. By simply using the tree (and since the current server already is in the tree), we just have to walk using ebpt_prev_dup to visit previous servers with the same name. We can then detect which ones conflict without having an ID set and error. The config check time is now 1/4 of the previous one for 2k servers per backend, and more importantly it will make it simpler to check for any duplicates later.	2024-09-20 17:14:50 +02:00
Willy Tarreau	ccd1ecba1d	MEDIUM: cfgparse: drop duplicate named defaults sections after use It has never been permitted to explicitly reference named defaults sections for which there are duplicate names. This means that when a duplicate defaults section is found, there's no point in keeping it since it will never be used for lookups, so it can be dropped. However, some such defaults sections might have some rules in them that are implicitly referenced by proxies placed after them. In this case they cannot be removed. What is done here is that upon each new named section creation, if another one is found with the same name, its config location is stored into the new proxy's {prev_file,prev_line} pair, and the old section is either destroyed if its refcount is null, or just unindexed. The dup check when creating a new proxy now consists in checking the prev_line instead of performing a dup lookup on the defaults section. This will guarantee that we can't find duplicate defaults sections in their tree anymore, while still keeping track of what's allocated and releasing everything upon exit. Beyond the consistency gain, there are nice savings for large configs involving many defaults sections: a test with 300k sections saved about 1.9 GB of RAM, and started 25% faster likely thanks to spending less time allocating memory.	2024-09-20 16:35:32 +02:00
Willy Tarreau	c8b813771d	MINOR: proxy: add a list of orphaned defaults sections We'll soon delete unreferenced and duplicated named defaults sections from the list of proxies. The problem with this is that this list (in fact a name-based tree) is used to release all of them at the end. Let's add a list of orphaned defaults sections, typically those containing "http-check send" statements or various other rules, and that are implicitly inherited by a proxy hence have a non-zero refcount while also having a name. These now makes it possible to remove them from the name index while still keeping their memory around for the lifetime of the process, and cleaning it at the end.	2024-09-20 15:59:04 +02:00
Willy Tarreau	cb4c236fac	BUG/MINOR: cfgparse: detect another uncaught case of duplicate defaults The following sequence was not properly caught: defaults def backend back from def defaults def But this one was: defaults def defaults def backend back from def Let's check when defaults are declared that they're not already referenced. Better not backport this. While it will catch broken configs (possibly some with backends pasted after the wrong defaults), these might still work by accident. It may be reported as a diag warning though.	2024-09-20 15:58:10 +02:00
Willy Tarreau	5b221d1e41	CLEANUP: cfgparse: factor proxy vs log-forward collisions This simplifies the check added in 1a38684fbc ("MEDIUM: cfgparse: detect collisions between defaults and log-forward"), by factoring it with the other existing one. The tests are ugly in that code because a first block tests pure proxies, a second one proxies or defaults and inside that one we have special cases for defaults. Let's just move the tests to the "any proxy type" block.	2024-09-20 14:13:14 +02:00
Willy Tarreau	b325453c36	MINOR: proxy: use the global file names for conf->file Proxy file names are assigned a bit everywhere (resolvers, peers, cli, logs, proxy). All these elements were enumerated and now use copy_file_name(). The only ha_free() call was turned to drop_file_name(). As a bonus side effect, a 300k backend config saved 14 MB of RAM.	2024-09-19 15:38:19 +02:00
Willy Tarreau	9ab21a3c2d	CLEANUP: stick-table: make the file location point to a global file name The file name used to point to the calling function's stack for stick tables, which was OK during parsing but remained dangling afterwards. At least it was already marked const so as not to accidentally free it. Let's make it point to a file_name_node now.	2024-09-19 15:38:19 +02:00
Willy Tarreau	d6c060c5ae	MINOR: tools: add minimal file name management In proxies, stick-tables, servers, etc... at plenty of places we store a file name and a line number. Some file names are the result of strdup() (e.g. in proxies), others not (e.g. stick-tables) and leave dangling pointers at the end of parsing. The risk of double-free is not null either. In order to stop this, let's first add a simple tool that allows to register short strings inside a global list, these strings happening to be server names. The strings are either duplicated and stored upon failure to find them, or just added to this storage. Since file names are not expected to disappear before the end of the process, for now we don't even implement refcounting, and we free them all at the end. There's already a drop_file_name() function to reset the pointer like ha_free() used to do, and even if not strictly needed it's a good habit to get used to doing it. The strings are returned as const so that they're stored as-is in structs, and that nasty free() calls are easily caught. The pointer points to the char[] storage inside the node itself. This way later if we want to implement refcounting, it will be trivial to just look up a string and change its associated node's refcount. If needed, comparisons can also be made on pointers. For now they're not used yet and are released on deinit().	2024-09-19 15:36:58 +02:00
Willy Tarreau	1a38684fbc	MEDIUM: cfgparse: detect collisions between defaults and log-forward Sadly, when log-forward were introduced they took great care of avoiding collision with regular proxies but defaults were missed (they need to be explicitly checked for). So now we have to move them to a warning for 3.1 instead of rejecting them.	2024-09-18 18:08:15 +02:00
Willy Tarreau	d8f4b07e40	MEDIUM: cfgparse: warn about colliding names between defaults and proxies In order to complete the checks added in 303a66573d ("MEDIUM: cfgparse: warn about proxies having the same names"), we also need to warn about regular proxies having the same name as defaults sections as well as defaults sections having the same name as proxies, since defaults sections are inherently proxies, albeit stored in a separate list for now.	2024-09-18 18:08:06 +02:00
Amaury Denoyelle	fcd6d29acf	BUG/MINOR: mux-quic: report glitches to session Glitch counter was implemented for QUIC/HTTP3. The counter is stored in the QCC MUX connection instance. However, this is never reported at the session level which is necessary if glitch counter is tracked via a stick-table. To fix this, use session_add_glitch_ctr() in various QUIC MUX functions which may increment glitch counter. This should be backported up to 3.0.	2024-09-18 16:11:03 +02:00
Willy Tarreau	303a66573d	MEDIUM: cfgparse: warn about proxies having the same names As discussed below, there are too many problems and uncaught bugs in the parser when trying to support proxies having similar names but different types. There's specific code to detect the presence of stick-tables in a pair of such proxies for example. It's even possible that certain combinations of backend+listen that were not previously detected have some nasty side effects. According to the proposal in the discussion, this is now deprecated in 3.1 (thus we emit a warning) and will become forbidden in 3.3. A backport might be useful, but reporting a diag_warning only, not a classical warning, so as not to break setups running in zero-warning mode. It was verified with a config involving all 9 combinations of (frontend,backend,listen) followed by one of the same three that all collisions are now properly blocked and that only back+front are kept and emit a warning. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-17 19:55:00 +02:00
Willy Tarreau	c70906c8a1	BUG/MINOR: cfgparse: detect incorrect overlap of same backend names As reported below, it's possible to declare a backend then a proxy with the same name, because for the proxy we check a frontend capability (the first one to be tested): backend b listen b bind :8888 Let's check the two capabilities in this case and not just the frontend. Better not backport this, as there's a risk of breakage of existing setups that work by accident. It might make sense to report them as diag warnings though. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-17 19:55:00 +02:00
Aurelien DARRAGON	17e52c922b	BUG/MINOR: cfgparse-listen: fix option httpslog override warning message "option httpslog" override warning messaged used to be reported as "option httplog", probably as a result of copy paste without adjusting the context. Let's fix that to prevent emitting confusing warning messages The issue exists since 98b930d ("MINOR: ssl: Define a default https log format"), thus it should be backported up to 2.6	2024-09-17 15:40:02 +02:00
Aurelien DARRAGON	bc4bf5779f	BUG/MINOR: fix missing "'option httpslog' overrides previous 'option tcplog clf'..." detection Same as b85edd44db0 ("BUG/MINOR: fix missing "log-format overrides previous 'option tcplog clf'..." detection") but for "option httpslog" keyword. No backport needed unless fd48b28 ("MINOR: Implements new log format of option tcplog clf") is.	2024-09-17 15:40:02 +02:00
Aurelien DARRAGON	607b9adc9b	BUG/MINOR: fix missing "log-format overrides previous 'option tcplog clf'..." detection In commit fd48b28315 ("MINOR: Implements new log format of option tcplog clf") "option tcplog clf" detection was correcly added for "option tcplog" and "option httplog", but "log-format" case was overlooked. Thus, this config would report erroneous warning message: defaults option tcplog clf log-format "ok" [WARNING] (727893) : config : parsing [test.conf:3]: 'log-format' overrides previous 'log-format' in 'defaults' section. No backport needed unless fd48b28315 is.	2024-09-17 14:41:58 +02:00
Willy Tarreau	499e057644	MEDIUM: clock: don't compute before_poll when using monotonic clock There's no point keeping both clocks up to date; if the monotonic clock is ticking, let's just refrain from updating the wall clock one before polling since we won't use it. We still do it after polling however as we need a wall clock time to communicate with outside. This saves one gettimeofday() call per loop and two timeval comparisons.	2024-09-17 09:08:10 +02:00
Willy Tarreau	24496803d1	MEDIUM: clock: use the monotonic clock for idle time calculation By just keeping a copy of the last known value before entering polling, we can apply the same algorithm as we're currently using, except that it's now applied to the monotonic clock instead of the wall clock, when it's detected that it's ticking. This improves idle time calculation accuracy by making it independent on the wall clock.	2024-09-17 09:08:10 +02:00
Willy Tarreau	4150851ce5	MEDIUM: clock: opportunistically use CLOCK_MONOTONIC for the internal time We already collect CLOCK_MONOTONIC when it's available when leaving the poller, but it's only used for profiling. The functions that return it set the value to zero when it's not available, so we can use that to detect if it works or not. The idea is that if the monotonic time is non-zero, it is ticking and usable, then we use if for now_ns, otherwise we use the corrected date. We continue to apply the now_offset to the returned value because it helps forcing an early time wrap-around. Proceeding like this presents two benefits: - on systems supporting this, the time is much more robust against time changes - when it works, it saves us from having to go through the time correction code, which is usually cheap, but better avoided anyway. Note that idle time calculation continues to rely on the wall-clock time.	2024-09-17 09:08:10 +02:00
Willy Tarreau	f793845f4a	MEDIUM: clock: collect the monotonic time in clock_local_update_date() Now we collect this clock in clock_local_update_date(), the closest from the poller, which is also used when busy-polling, and the values is set into the thread's curr_mono_time which did not exist before. Later, clock_leaving_poll() just sets the prev_mono_time value from the curr_ one instead of retrieving the time at this specific point. It also means that the monotonic time will now also cover the time needed to update the global time, which should be negligible. Note that we don't collect the CPU time in the clock_local_update_date() function even though it's tempting, because when doing busy-polling, it would be collected on each round while being useless. Doing so will make sure that the local time always knows the monotonic time when it is available.	2024-09-17 09:08:10 +02:00
Willy Tarreau	42e699903e	MINOR: clock: test all clock_gettime() return values Till now we were only using clock_gettime() for profiling, so if it would fail it was no big deal. We intend to use it as the main clock as well now, so we need to more reliably detect its absence or failure and gracefully fall back to other options. Without the test we would return anything present in the stack, which is neither clean nor easy to detect.	2024-09-17 09:08:10 +02:00
Christopher Faulet	afc50f2445	BUG/MEDIUM: cache/stats: Wait to have the request before sending the response It seems obvious. On a classical workflow, the request headers analysis is finished when these applets are woken up for the first time. So they don't take care to really have the request to start to process it and to send the response. But with a filter, it is possible to stop the request analysis after the applet creation. If this happens for the stats applet, this leads to a crash because we retrieve the request start-line without checking if it is available. For the cache applet, the response is just immediatly sent. And here it is a problem if the compression is enabled. In that case too, this may lead to a crash because the compression may be enabled but not initialized. For a true server, there is no issue because the connection cannot be established. The server is chosen only after the request analysis. The issue with applets is that once created, an applet is quickly switched to the established state. So it is probably a point that must be carefully reviewed and probably reworked. In the mean time, as a fix, in the cache and the stats applet, we just take care to have the request before sending the response. This will do the trick. The patch must be backported as far as 2.6. On 2.6, the patch must be adapted.	2024-09-16 22:55:40 +02:00
Christopher Faulet	4de6632693	MINOR: proxy: Rename accept-invalid-http-* options With these options, it is possible to accept some invalid messages that may considered as unsafe and may result as vulnerabilities. The naming is not explicit enough on this point. These option must really be considered as dangerous and only used as a temporary workaround. Unfortunately, when used, it is probably because there are some legacy and unsupported applications in place. Nevermind. The documentation warns about the use of these options. Now the name of the options itself is a warning. So now, "accept-invalid-http-request" and "accept-invalid-http-response" options are deprecated and replaced by "accept-unsafe-violations-in-http-request" and "accept-unsafe-violations-in-http-response" options.	2024-09-16 22:55:25 +02:00
Aurelien DARRAGON	1e0920f855	BUG/MINOR: peers: local entries updates may not be advertised after resync Since commit 864ac3117 ("OPTIM: stick-tables: check the stksess without taking the read lock"), when entries for a local table are learned from another peer upon resynchro, and this is the only peer haproxy speaks to, local updates on such entries are not advertised to the peer anymore, until they eventually expire and can be recreated upon local updates. This is due to the fact that ts->seen is always set to 0 when creating new entry, and also when touch_remote is performed on the entry. Indeed, while 864ac3117 attempts to avoid useless updates, it didn't consider entries learned from a remote peer. Such entries are exclusively learned in peer_treat_updatemsg(): once the entry is created (or updated) with new data, touch_remote is used to commit the change. However, unlike touch_local, entries committed using touch_remote will not be advertised to the peer from which the entry was just learned (otherwise we would enter a looping situation). Due to the above patch, once an entry is learned from the (unique) remote peer, 'seen' will be stuck to 0 so it will never be advertised for its whole lifetime. Instead, when entries are learned from a peer, we should consider that the peer that taught us the entry has seen it. To do this, let's set seen=1 in peer_treat_updatemsg() after calling touch_remote(). This way, if we happen to perform updates on this entry, it will be properly advertized to relevant peers. This patch should not affect the performance gain documented in 864ac3117 given that the test scenario didn't involved entries learned by remote peers, but solely locally created entries advertised to remote peers upon updates. This should be backported in 3.0 with 864ac3117.	2024-09-16 14:06:39 +02:00
Willy Tarreau	5d350d1e50	OPTIM: vars: use multiple name heads in the vars struct Given that the original list-based version was using a list head as the root of the variables, while the tree is using a single pointer, it made sense to reuse that space to place multiple roots, indexed on the lower bits of the name hash. Two roots slightly increase the performance level, but the best gain is obtained with 4 roots. The performance is now always above that of the list, even with small counts, and with 100 vars, it's 21% higher than before, or 67% higher than with the list. We keep the same lock (it could have made sense to use one lock per head), because most of the variables in large configs are attached to a stream or a session, hence are not shared between threads. Thus there's no point in sharding the pointer.	2024-09-15 23:51:51 +02:00
Willy Tarreau	47ec7c681e	OPTIM: vars: use a cebtree instead of a list for variable names Configs involving many variables can start to eat a lot of CPU in name lookups. The reason is that the names themselves are dynamic in that they are relative to dynamic objects (sessions, streams, etc), so there's no fixed index for example. The current implementation relies on a standard linked list, and in order to speed up lookups and avoid comparing strings, only a 64-bit hash of the variable's name is stored and compared everywhere. But with just 100 variables and 1000 accesses in a config, it's clearly visible that variable name lookup can reach 56% CPU with a config generated this way: for i in {0..100}; do printf "\thttp-request set-var(txn.var%04d) int(%d)" $i $i; for j in {1..10}; do [ $i -lt $j ] \|\| printf ",add(txn.var%04d)" $((i-j)); done; echo; done The performance and a 4-core skylake 4.4 GHz reaches 85k RPS with a perf profile showing: Samples: 170K of event 'cycles', Event count (approx.): 142378815419 Overhead Shared Object Symbol 56.39% haproxy [.] var_to_smp 6.65% haproxy [.] var_set.part.0 5.76% haproxy [.] sample_process_cnv 3.23% haproxy [.] sample_conv_var2smp 2.88% haproxy [.] sample_conv_arith_add 2.33% haproxy [.] __pool_alloc 2.19% haproxy [.] action_store 2.13% haproxy [.] vars_get_by_desc 1.87% haproxy [.] smp_dup [above, var_to_smp() calls var_get() under the read lock]. By switching to a binary tree, the cost is significantly lower, the performance reaches 117k RPS (+37%) with this profile: Samples: 170K of event 'cycles', Event count (approx.): 142323631229 Overhead Shared Object Symbol 40.22% haproxy [.] cebu64_lookup 7.12% haproxy [.] sample_process_cnv 6.15% haproxy [.] var_to_smp 4.75% haproxy [.] cebu64_insert 3.79% haproxy [.] sample_conv_var2smp 3.40% haproxy [.] cebu64_delete 3.10% haproxy [.] sample_conv_arith_add 2.36% haproxy [.] action_store 2.32% haproxy [.] __pool_alloc 2.08% haproxy [.] vars_get_by_desc 1.96% haproxy [.] smp_dup 1.75% haproxy [.] var_set.part.0 1.74% haproxy [.] cebu64_first 1.07% [kernel] [k] aq_hw_read_reg 1.03% haproxy [.] pool_put_to_cache 1.00% haproxy [.] sample_process The performance lowers a bit earlier than with the list however. What can be seen is that the performance maintains a plateau till 25 vars, starts degrading a little bit for the tree while it remains stable till 28 vars for the list. Then both cross at 42 vars and the list continues to degrade doing a hyperbole while the tree resists better. The biggest loss is at around 32 variables where the list stays 10% higher. Regardless, given the extremely narrow band where the list is better, it looks relevant to switch to this in order to preserve the almost linear performance of large setups. For example at 1000 variables and 10k lookups, the tree is 18 times faster than the list. In addition this reduces the size of the struct vars by 8 bytes since there's a single pointer, though it could make sense to re-invest them into a secondary head for example.	2024-09-15 23:49:01 +02:00
Willy Tarreau	a0205f9de4	IMPORT: import cebtree (compact elastic binary trees) This is an import of the compact elastic binary trees at commit a9cd84a ("OPTIM: descent: better prefetch less and for writes when deleting") These will be used to replace certain lists (and possibly certain tree nodes as well). They're as fast (or even faster) than ebtrees for lookups, as fast for insertion and slower for deletion, and a node only uses 2 pointers (like a list). The only changes were cebtree.h where common/tools.h was replaced with ebtree.h which we already have and already provides the needed functions and macros, and the addition of a wrapper cebtree-prv.h in src/ to redirect to import/cebtree-prv.h.	2024-09-15 23:44:59 +02:00
Willy Tarreau	6e92988e20	MINOR: vars: remove the emptiness tests in callers before pruning All callers of vars_prune_* currently check the list for emptiness. Let's leave that to vars_prune() itself, it will ease some changes in the code. Thanks to the previous inlining of the vars_prune() function, there's no performance loss, and even a very tiny 0.1% gain.	2024-09-15 23:44:16 +02:00

... 40 41 42 43 44 ...

20122 Commits