haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2026-03-15 20:12:08 +01:00

Author	SHA1	Message	Date
Amaury Denoyelle	82907d5621	MINOR: stats: report BE unpublished status A previous patch defines a new proxy status : unpublished backends. This patch extends this by changing proxy status reported in stats. If unpublished is set, an extra "(UNPUB)" is added to the field. Also, HTML stats is also slightly updated. If a backend is up but unpublished, its status will be reported in orange color.	2026-01-15 09:08:18 +01:00
Olivier Houchard	5495c88441	MEDIUM: counters: Dynamically allocate per-thread group counters Instead of statically allocating the per-thread group counters, based on the max number of thread groups available, allocate them dynamically, based on the number of thread groups actually used. That way we can increase the maximum number of thread groups without using an unreasonable amount of memory.	2026-01-13 11:12:34 +01:00
Aurelien DARRAGON	a287841578	MINOR: stats-proxy: ensure future-proof FN_AGE manipulation in me_generate_field() Commit ad1bdc33 ("BUG/MAJOR: stats-file: fix crash on non-x86 platform caused by unaligned cast") revealed an ambiguity in me_generate_field() around FN_AGE manipulation. For now FN_AGE can only be stored as u32 or s32, but in the future we could also support 64bit FN_AGES, and the current code assumes 32bits types and performs an explicit unsigned int cast. Instead we group current 32 bits operations for FF_U32 and FF_S32 formats, and leave room for potential future formats for FN_AGE. Commit ad1bdc33 also suggested that the fix was temporary and the approach must change, but after a code review it turns out the current approach (generic types manipulation under me_generate_field()) is legit. The introduction of shm-stats-file feature didn't change the logic which was initially implemented in 3.0. It only extended it and since shared stats are now spread over thread-groups since 3.3, the use of atomic operations made typecasting errors more visible, and structure mapping change from d655ed5f14 ("BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt)") was in fact the only change to blame for the crash on non-x86 platforms. With ambiguities removed in me_generate_field(), let's hope we don't face similar bugs in the future. Indeed, with generic counters, and more specifically shared ones (which leverage atomic ops), great care must be taken when changing their underlying types as me_generate_field() solely relies on stat_col descriptor to know how to read the stat from a generic pointer, so any breaking change must be reflected in that function as well. No backport needed.	2025-11-10 21:32:22 +01:00
Christopher Faulet	7d1787ba8e	MINOR: sample/stats: Add "bytes" in req_{in,out} and res_{in,out} names The number of bytes received or sent by a client or a server is now saved. Sample fetches and stats fields to retrieve this information are renamed to add "bytes" in names to avoid any ambiguity with number of requests and responses.	2025-11-07 14:09:48 +01:00
Christopher Faulet	4991a51208	MINOR: stats: Add stats about request and response bytes received and sent In previous patches, these counters were added per frontend, backend, server and listener. With this patch, these counters are reported on stats, including promex. Note that the stats file minor version was incremented by one because the shm_stats_file_object struct size has changed. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	0084baa6ba	MINOR: counters: Remove bytes_in and bytes_out counter from fe/be/srv/li bytes_in and bytes_out counters per frontend, backend, listener and server were removed and we now rely on req_in and res_in counters, respectively. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Willy Tarreau	ad1bdc3364	BUG/MAJOR: stats-file: fix crash on non-x86 platform caused by unaligned cast Since commit d655ed5f14 ("BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt)"), the last_state_change field in the counters is a uint (to match how it's reported). However, it happens that there are explicit casts in function me_generate_field() to retrieve the value, and which cause crashes on aarch64 and likely other non-x86 64-bit platforms due to atomically reading an unaligned 64-bit value, and may even randomly crash other 64-bit platforms when reading past the end of the structure. The fix for now adapts the cast to match the one used by the accessed type (i.e. unsigned int), but the approach must change, as there's nothing there which allows to figure whether or not the type is correct by just reading the code. At minima a typeof() on a named field is needed, but this requires more invasive changes, hence this temporary fix. No backport is needed, as stats-file is only in 3.3.	2025-11-03 07:33:11 +01:00
Amaury Denoyelle	fac1de935a	MINOR: stats: display new curr_sess_idle_conns server counter Add a new stats column in proxy stats to display server counter for private idle connections. This counter has been introduced recently. The value is displayed on CSV output on the last column before modules. It is also displayed on HTML page alongside other idle server counters.	2025-08-28 18:58:11 +02:00
Aurelien DARRAGON	75e480d107	MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct Between 3.2 and 3.3-dev we observed a noticeable performance regression due to stats handling. After bisecting, Willy found out that recent work to split stats computing across multiple thread groups (stats sharding) was responsible for that performance regression. We're looking at roughly 20% performance loss. More precisely, it is the added indirections, multiplied by the number of statistics that are updated for each request, which in the end causes a significant amount of time being spent resolving pointers. We noticed that the fe_counters_shared and be_counters_shared structures which are currently allocated in dedicated memory since a0dcab5c ("MAJOR: counters: add shared counters base infrastructure") are no longer huge since 16eb0fab31 ("MAJOR: counters: dispatch counters over thread groups") because they now essentially hold flags plus the per-thread group id pointer mapping, not the counters themselves. As such we decided to try merging fe_counters_shared and be_counters_shared in their parent structures. The cost is slight memory overhead for the parent structure, but it gets rid of one pointer indirection. This patch alone yields visible performance gains and almost restores 3.2 stats performance. counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and now returns either failure or success instead of a pointer because we don't need to retrieve a shared pointer anymore, the function takes care of initializing the existing pointer.	2025-07-25 16:46:10 +02:00
Aurelien DARRAGON	4fcc9b5572	MINOR: counters: rename last_change counter to last_state_change Since proxy and server struct already have an internal last_change variable and we cannot merge it with the shared counter one, let's rename the last_change counter to be more specific and prevent the mixup between the two. last_change counter is renamed to last_state_change, and unlike the internal last_change, this one is a shared counter so it is expected to be updated by other processes in our back. However, when updating last_state_change counter, we use the value of the server/proxy last_change as reference value.	2025-06-30 16:26:38 +02:00
Aurelien DARRAGON	16eb0fab31	MAJOR: counters: dispatch counters over thread groups Most fe and be counters are good candidates for being shared between processes. They are now grouped inside "shared" struct sub member under be_counters and fe_counters. Now they are properly identified, they would greatly benefit from being shared over thread groups to reduce the cost of atomic operations when updating them. For this, we take the current tgid into account so each thread group only updates its own counters. For this to work, it is mandatory that the "shared" member from {fe,be}_counters is initialized AFTER global.nbtgroups is known, because each shared counter causes the stat to be allocated global.nbtgroups times. When updating a counter without concurrency, the first counter from the array may be updated. To consult the shared counters (which requires aggregation of per-tgid individual counters), some helper functions were added to counter.h to ease code maintenance and avoid computing errors.	2025-06-05 09:59:38 +02:00
Aurelien DARRAGON	a0dcab5c45	MAJOR: counters: add shared counters base infrastructure Shareable counters are now tagged as shared counters and are dynamically allocated in separate memory area as a prerequisite for being stored in shared memory area. For now, GUID and threads groups are not taken into account, this is only a first step. Also, we ensure all counters are now manipulated using atomic operations, namely, "last_change" counter is now read from and written to using atomic ops. Despite the numerous changes caused by the counters being moved away from counters struct, no change of behavior should be expected.	2025-06-05 09:58:58 +02:00
Aurelien DARRAGON	c7c017ec3c	MINOR: stats: add ME_NEW_COMMON() helper Split ME_NEW_* helper into COMMON part and specific part so it becomes easier to add alternative helpers without code duplication.	2025-06-02 17:51:12 +02:00
Aurelien DARRAGON	d04843167c	MINOR: stats: add stat_col flags Add stat_col flags member to store .generic bit and prepare for upcoming flags. No functional change expected.	2025-06-02 17:51:08 +02:00
Christopher Faulet	7244f16ac4	MINOR: promex: Add agent check status/code/duration metrics In the Prometheus exporter, the last health check status is already exposed, with its code and duration in seconds. The server status is also exposed. But the information about the agent check is not available. It is not really handy because when a server status is changed because of the agent, it is not obvious by looking at the Prometheus metrics. Indeed, the server may be reported as DOWN for instance, while the health check status still reports a success. Being able to get the agent status in that case could be valuable. So now, the last agent check status is exposed, with its code and duration in seconds. The following metrics can be grabbed now: * haproxy_server_agent_status * haproxy_server_agent_code * haproxy_server_agent_duration_seconds Note that unlike the other metrics, no per-backend aggregated metric is exposed. This patch is related to issue #2983.	2025-05-22 09:50:10 +02:00
Aurelien DARRAGON	dc95a3ed61	MINOR: promex: expose ST_I_PX_RATE (current_session_rate) It has been requested to have the current_session_rate exposed at the frontend level. For now only the per-process value was exposed (ST_I_INF_SESS_RATE). Thanks to the work done lately to merge promex and stat_cols_px[] array, let's simply defined an .alt_name for the ST_I_PX_RATE metric in order to have promex exposing it as current_session_rate for relevant contexts.	2025-04-28 12:23:20 +02:00
Ilia Shipitsin	78b849b839	CLEANUP: assorted typo fixes in the code and comments code, comments and doc actually.	2025-04-02 11:12:20 +02:00
Aurelien DARRAGON	276491dc22	MINOR: stats-proxy: add alt name info to stat_cols_px where relevant For all metrics defined under promex_st_metrics array, add the corresponding .alt_name field in the general purpose stat_cols_px array.	2025-03-21 17:05:26 +01:00
Aurelien DARRAGON	7f9d8c1327	MINOR: stats-proxy: add alt_name field for ME_NEW_{FE,BE,PX} helpers For now alt_name is systematically set to NULL. Thanks to this change we may easily add an altname to existing metrics. Also by requiring explicit value it offers more visibility for this field.	2025-03-21 17:05:19 +01:00
Aurelien DARRAGON	2ab82124ec	MINOR: stats: explicitly add frontend cap for ST_I_PX_REQ_TOT While being a generic metric, ST_I_PX_REQ_TOT is handled specifically for the frontend case. But the frontend capability isn't set for that metric. It is actually quite misleading, because the capability may be checked to see whether the metric is relevant for a given scope, yet it is relevant for frontend scope. In this patch we also add the frontend capability for the metric.	2025-03-20 11:42:43 +01:00
Aurelien DARRAGON	8aa8626d12	MINOR: stats: add .cap for some static metrics Goal is to merge promex metrics definition into the main one. Promex metrics will use the metric capability to know available scopes, thus only metrics relevant for prometheus were updated.	2025-03-20 11:38:17 +01:00
Aurelien DARRAGON	3c1b00b127	MINOR: stats: add .generic explicit field in stat_col struct Further extend logic implemented in 65624876 ("MINOR: stats: introduce a more expressive stat definition method") and 4e9e8418 ("MINOR: stats: prepare stats-file support for values other than FN_COUNTER"): we don't rely anymore on the presence of the capability to know if the metric is generic or not. This is because it prevents us from setting a capability on static statistics. Yet it could be useful to set the capability even on static metrics, thus we add a dedicated .generic bit to tell haproxy that the metric is generic and can be handled automatically by the API. Also, ME_NEW_* helpers are not explicitly associated to generic metric definition (as it was already the case before) to avoid ambiguities. It may change in the future as we may need to use the new definition method to define static metrics (without the generic bit set). But for now it isn't the case as this need definition was implemented for generic metrics support in the first place. If we want to define static metrics using the API, we could add a new set of helpers for instance.	2025-03-20 11:37:21 +01:00
Aurelien DARRAGON	8311be5ac6	BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics Performing a diff on stats output before vs after commit 66152526 ("MEDIUM: stats: convert counters to new column definition") revealed that some metrics were not properly ported to the new API. Namely, "lbtot", "cli_abrt" and "srv_abrt" are now exposed on frontend and listeners while it was not the case before. Also, "hrsp_other" is exposed even when "mode http" wasn't set on the proxy. In this patch we restore original behavior by fixing the capabilities and hide settings. As this could be considered as a minor regression (looking at the commit message it doesn't seem intended), better tag this as a bug. It should be backported in 3.0 with 66152526.	2025-03-13 11:49:18 +01:00
Olivier Houchard	583303c48b	MINOR: proxies/servers: Calculate queueslength and use it. For both proxies and servers, properly calculate queueslength, which is the total number of elements in each queue (as they currently are only using one queue, it is equivalent to the number of elements of that queue), and use it instead of the queue's length.	2025-01-28 12:49:41 +01:00
Amaury Denoyelle	071ae8ce3d	BUG/MEDIUM: stats/server: use watcher to track server during stats dump If a server A is deleted while a stats dump is currently on it, deletion is delayed thanks to reference counting. Server A is nonetheless removed from the proxy list. However, this list is a single linked list. If the next server B is deleted and freed immediately, server A would still point to it. This problem has been solved by the prev_deleted list in servers. This model seems correct, but it is difficult to fully ensure its validity. In particular, it implies that when a stats dump is resumed, server A elements will be accessed despite the server being in a half-deleted state. Thus, it has been decided to completely ditch the refcount mechanism for stats dump. Instead, use the watcher element to register every stats dump currently tracking a server instance. Each time a server is deleted on the CLI, each stats dump element which may point to it is updated to access the next server instance, or NULL if this is the last server. This ensures that a server which was deleted via CLI but not completely freed is never accessed on stats dump resumption. Currently, no race condition related to dynamic servers and stats dump is known. However, as described above, the previous model is deemed too fragile, as such this patch is labelled as bug-fix. It should be backported up to 2.6, after a reasonable period of observation. It relies on the following patch : MINOR: list: define a watcher type	2024-12-10 16:19:33 +01:00
Willy Tarreau	d24768ab44	MINOR: protocol: create abnsz socket address family For now it's the same as abns. We'll need to modify sock_unix_addrcmp(), and a few other ones to support effective path length when dealing with the \0. Let's check with Tristan's patch for this (upcoming patch). Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2024-10-29 12:14:50 +01:00
Willy Tarreau	78ac312bbd	MEDIUM: protocol: make abns a custom unix socket address family This is a pre-requisite to adding the abnsz socket address family: in this patch we make use of protocol API rework started by 732913f ("MINOR: protocol: properly assign the sock_domain and sock_family") in order to implement a dedicated address family for ABNS sockets (based on UNIX parent family). Thanks to this, it will become trivial to implement a new ABNSZ (for abns zero) family which is essentially the same as ABNS but with a slight difference when it comes to path handling (ABNS uses the whole sun_path length, while ABNSZ's path is zero terminated and evaluation stops at 0) It was verified that this patch doesn't break reg-tests and behaves properly (tests performed on the CLI with show sess and show fd). Anywhere relevant, AF_CUST_ABNS is handled alongside AF_UNIX. If no distinction needs to be made, real_family() is used to fetch the proper real family type to handle it properly. Both stream and dgram were converted, so no functional change should be expected for this "internal" rework, except that proto will be displayed as "abns_{stream,dgram}" instead of "unix_{stream,dgram}". Before ("show sess" output): 0x64c35528aab0: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=21,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp= After: 0x619da7ad74c0: proto=abns_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=22,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp= Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2024-10-29 12:14:25 +01:00
Ilya Shipitsin	1f6e5f7a61	CLEANUP: assorted typo fixes in the code and comments This is 43rd iteration of typo fixes	2024-09-03 17:49:21 +02:00
Christopher Faulet	6b9daec93d	MINOR: stats-html: Display reuse ratio for spop connections Now SPOP connections can be reused, it could be pretty useful to know the reuse rate. The corresponding backend and server counters are already incremented, but not displayed on the stats HTML page. Thanks to this patch, it is now possible to get it, just like for HTTP proxies. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Amaury Denoyelle	53782b9ea5	MINOR: stats: extract proxy clear-counter in a dedicated function Split code related to proxies list looping in cli_parse_clear_counters() to a new dedicated function. This function is placed in the new module stats-proxy.	2024-05-02 16:43:26 +02:00
Amaury Denoyelle	f0644d1bd7	REORG: stats: define stats-proxy source module Create a new module stats-proxy. Move stats functions related to proxies list looping in it. This allows to reduce stats source file dividing its size by half.	2024-05-02 16:42:36 +02:00

31 Commits