Commit Graph

133 Commits

Amaury Denoyelle
83281303f6 MINOR: stats: define stats-file output format support
Prepare the stats functions to handle a new format labelled "stats-file". Its
purpose is to generate a statistics dump with a format close to the CSV
output. Such output will then be used to preload haproxy internal
counters on process startup.

The stats-file output differs from a standard CSV on several points. First,
only an excerpt of all statistics is output. All values that do not
make sense to preload are excluded. For the moment, the stats-file only
lists stats fully defined via the "struct stat_col" method. Contrary to a
CSV, all columns of a stats-file will be filled. As such, an empty field
value is used to mark stats which should not be output.

Some adaptations specific to the stats-file are necessary in
me_generate_field(). First, the stats-file outputs values from the
frontend and backend sides separately, each with its own respective set
of columns. As such, an empty field value is returned if a stat is not
defined for frontend/listener, or for backend/server, when outputting
the other side. Also, as the stats-file does not support empty columns,
stcol_hide() is not used for it.

A minor adjustment was necessary for stats_fill_fe_line() to pass
context flags. This is needed to detect the stat output format. All other
corresponding listener/server/backend functions already have it.
2024-04-26 10:20:57 +02:00
Amaury Denoyelle
861370a6d4 MINOR: stats: update ambiguous "metrics" naming to "stat_cols"
The name "metrics" was chosen to represent the various list of haproxy
exposed statistics. However, it is deemed as ambiguous as some stats are
indeed metric in the true sense, but some are not, as highlighted by
various "enum field_origin" values.

Replace it by the new name "stat_cols" for statistic columns. Along with
the already existing notion of stat lines it should better reflect its
purpose.
2024-04-26 10:20:57 +02:00
Amaury Denoyelle
341bf913d4 MINOR: stats: use STAT_F_* prefix for flags
Some flags are defined during statistics generation and output. They use
the prefix STAT_* which is also used for other purposes. Rename them
with the new prefix STAT_F_* to differentiate them from the other
usages.
2024-04-22 16:25:18 +02:00
Amaury Denoyelle
e97375dcab MINOR: stats: use stricter naming stats/field/line
Several different names were used for various purposes in the statistics
implementation. This made the code difficult to understand.

* the stat/stats name is removed wherever a more specific name can be used
* "field" usage is restricted to purely refer to <struct field>, which
  represents a raw stat value
* the "line" naming is used to represent an array of <struct field>
2024-04-22 16:25:18 +02:00
Amaury Denoyelle
8dbb74542f MINOR: stats: rename info stats
Info is used to expose haproxy global metrics, similarly to proxy
statistics and any other module. As such, rename the info indexes using
the SI_I_INF_* prefix. The info variable is also renamed stat_line_info.

Thanks to this, naming is now consistent between info and the other
statistics. It will help integrate it as a "global" statistics
module.
2024-04-22 16:25:18 +02:00
Amaury Denoyelle
8fc0b18087 MINOR: stats: rename proxy stats
This commit is the first of a series which adjusts the naming convention
for the stats module. The objective is to remove ambiguity and better
reflect how stats are implemented, especially since the introduction of
the stats module.

This patch renames elements related to proxy statistics. One of the
main changes is to rename the ST_F_* statistics index prefix to the new
ST_I_PX_* name. This removes the reference to "field", which represents
another concept in the stats module. In the same vein, the global
stat_fields variable is renamed metrics_px.
2024-04-22 16:25:18 +02:00
Willy Tarreau
1a088da7c2 MAJOR: stktable: split the keys across multiple shards to reduce contention
In order to reduce the contention on the table when keys expire quickly,
we're spreading the load over multiple trees. This applies to both keys
and expiration dates. The shard number is calculated from the key value
itself, both when looking up and when setting it.

The "show table" dump on the CLI iterates over all shards, so the
output is not fully sorted; it's only sorted within each shard. The Lua
table dump just does the same. It was verified, using a Lua program that
counts stick-table entries, that it works as intended (the test case is
reproduced here as it's clearly not easy to automate as a vtc):

  function dump_stk()
    local dmp = core.proxies['tbl'].stktable:dump({});
    local count = 0
    for _, __ in pairs(dmp) do
        count = count + 1
    end
    core.Info('Total entries: ' .. count)
  end

  core.register_action("dump_stk", {'tcp-req', 'http-req'}, dump_stk, 0);

  ##
  global
    tune.lua.log.stderr on
    lua-load-per-thread lua-cnttbl.lua

  listen front
    bind :8001
    http-request lua.dump_stk if { path_beg /stk }
    http-request track-sc1 rand(),upper,hex table tbl
    http-request redirect location /

  backend tbl
    stick-table size 100k type string len 12 store http_req_cnt

  ##
  $ h2load -c 16 -n 10000 0:8001/
  $ curl 0:8001/stk

  ## A count close to 100k appears on haproxy's stderr
  ## On the CLI, "show table tbl" | wc will show the same.

Some large parts were reindented only to add a top-level loop to iterate
over shards (e.g. process_table_expire()). Better check the diff using
git show -b.

The number of shards is decided just like for the pools, at build time
based on the max number of threads, so that we can keep a constant. Maybe
this should be done differently. For now CONFIG_HAP_TBL_BUCKETS is used,
and defaults to CONFIG_HAP_POOL_BUCKETS to keep the benefits of all the
measurements made for the pools. It turns out that this value seems to
be the most reasonable one without inflating the struct stktable too
much. By default for 1024 threads the value is 32 and delivers 980k RPS
in a test involving 80 threads, while adding 1kB to the struct stktable
(roughly doubling it). The same test at 64 gives 1008 kRPS and at 128
it gives 1040 kRPS for 8 times the initial size. 16 would be too low
however, with 675k RPS.

The stksess already has a shard number; it's the one used to decide which
peer connection to send the entry to. Maybe we should also store the one
associated with the entry itself instead of recalculating it, though it
does not happen that often. The operation is done by hashing the key using
XXH32().

The peers also take and release the table's lock, but the way it's used
is not very clear yet, so at this point it's certain this will not work.

At this point, this allowed the performance to be completely unlocked on
an 80-thread setup:

 before: 5.4 Gbps, 150k RPS, 80 cores
  52.71%  haproxy    [.] stktable_lookup_key
  36.90%  haproxy    [.] stktable_get_entry.part.0
   0.86%  haproxy    [.] ebmb_lookup
   0.18%  haproxy    [.] process_stream
   0.12%  haproxy    [.] process_table_expire
   0.11%  haproxy    [.] fwrr_get_next_server
   0.10%  haproxy    [.] eb32_insert
   0.10%  haproxy    [.] run_tasks_from_lists

 after: 36 Gbps, 980k RPS, 80 cores
  44.92%  haproxy    [.] stktable_get_entry
   5.47%  haproxy    [.] ebmb_lookup
   2.50%  haproxy    [.] fwrr_get_next_server
   0.97%  haproxy    [.] eb32_insert
   0.92%  haproxy    [.] process_stream
   0.52%  haproxy    [.] run_tasks_from_lists
   0.45%  haproxy    [.] conn_backend_get
   0.44%  haproxy    [.] __pool_alloc
   0.35%  haproxy    [.] process_table_expire
   0.35%  haproxy    [.] connect_server
   0.35%  haproxy    [.] h1_headers_to_hdr_list
   0.34%  haproxy    [.] eb_delete
   0.31%  haproxy    [.] srv_add_to_idle_list
   0.30%  haproxy    [.] h1_snd_buf

WIP: uint64_t -> long

WIP: ulong -> uint

code is much smaller
2024-04-03 17:34:47 +02:00
Tim Duesterhus
f88ea5949c CLEANUP: Reapply strcmp.cocci (2)
This reapplies strcmp.cocci across the whole src/ tree.
2024-04-02 07:27:33 +02:00
Aurelien DARRAGON
3ac79b504a MEDIUM: server: make server_set_inetaddr() updater serializable
The server_set_inetaddr() updater argument is a simple char * string
containing info about the caller responsible for the update.

In this patch, we try to make this argument serializable, that is, make
it so that we can easily export it without having to keep the original
pointer passed by the caller or having to work with strings of variable
lengths.

This was a prerequisite for exposing more updater information through
SERVER_INETADDR event (upcoming patch).

Static strings were simply mapped to a fixed ID that can be converted back
to a string when needed using server_inetaddr_updater_by_to_str(). One
special case was made for the SERVER_INETADDR_UPDATER_DNS_RESOLVER updater
since in this case the updater hint has to be generated from the
corresponding resolver id / nameserver id combination. This was achieved
by saving the nameserver id within the updater struct. Knowing that the
resolver id can be guessed from the server struct directly, it was not
exposed through the updater struct.

This patch depends on:
 - "MINOR: resolvers: add unique numeric id to nameservers"

No functional change should be expected.
2023-12-21 14:22:27 +01:00
Ilya Shipitsin
80813cdd2a CLEANUP: assorted typo fixes in the code and comments
This is the 37th iteration of typo fixes
2023-11-23 16:23:14 +01:00
Willy Tarreau
6cbb5a057b Revert "MAJOR: import: update mt_list to support exponential back-off"
This reverts commit c618ed5ff4.

The list iterator is broken. As found by Fred, running QUIC single-
threaded shows that only the first connection is accepted, because the
accepter relies on the element being initialized once detached (which
is expected and matches what MT_LIST_DELETE_SAFE() used to do before).
However, while doing this in the quic_sock code seems to work, doing it
inside the macro shows total breakage and the unit test doesn't work
anymore (random crashes). Thus it looks like the fix is not trivial;
let's roll this back for the time it will take to fix the loop.
2023-09-15 17:13:43 +02:00
Willy Tarreau
c618ed5ff4 MAJOR: import: update mt_list to support exponential back-off
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:

  - mt_list_for_each_entry_safe() is now in upper case to explicitly
    show that it is a macro, and only uses the back element, doesn't
    require a secondary pointer for deletes anymore.

  - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has
    to set the list iterator to NULL so that it is not re-inserted
    into the list and the list is spliced there. One must be careful
    because it was usually performed before freeing the element. Now
    instead the element must be nulled before the continue/break.

  - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been
    unclear. They were replaced by mt_list_cut_around() and
    mt_list_connect_elem() which more explicitly detach the element
    and reconnect it into the list.

  - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is
    in list.h. It may however possibly benefit from being upstreamed.

This required tiny adaptations to event_hdl.c and quic_sock.c. The
test case was updated and the API doc added. Note that in order to
keep include files small, the struct mt_list definition remains in
list-t.h (part of the internal API) and was ifdef'd out in mt_list.h.

A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, while 3.1 is only slightly above 1.1.1:

  - before: 35 Gbps, 3.5 Mpps, 7800% CPU
  - after:  41 Gbps, 4.2 Mpps, 2900% CPU
2023-09-13 11:50:33 +02:00
Aurelien DARRAGON
ee1891ccbe BUG/MINOR: hlua_fcn: potentially unsafe stktable_data_ptr usage
As reported by Coverity in GH #2253, stktable_data_ptr() usage in
hlua_stktable_dump() func is potentially unsafe because
stktable_data_ptr() may return NULL and the returned value is
dereferenced as-is without precautions.

In practice, this should not happen because some error checking was
already performed prior to calling stktable_data_ptr(). But since we're
using the safe stktable_data_ptr() function, all the error checking is
already done within the function, thus all we need to do is check ptr
against NULL instead to protect against NULL dereferences.

This should be backported to every stable version.
2023-08-25 11:52:43 +02:00
Willy Tarreau
7968fe3889 MEDIUM: stick-table: change the ref_cnt atomically
Due to the ts->ref_cnt being manipulated and checked inside wrlocks,
we continue to have it updated under plenty of read locks, which have
an important cost on many-thread machines.

This patch turns them all to atomic ops and carefully moves them outside
of locks every time this is possible:
  - the ref_cnt is incremented before write-unlocking on creation, otherwise
    the element could vanish before we can do it
  - the ref_cnt is decremented after write-locking on release
  - for all other cases it's updated out of locks since it's guaranteed by
    the sequence that it cannot vanish
  - checks are done before locking every time it's used to decide
    whether we're going to release the element (saves several write locks)
  - expiration tests are just done using atomic loads, since there's no
    particular ordering constraint there; we just want consistent values.

For Lua, the loop that is used to dump stick-tables could switch to read
locks only, but this was not done.

For peers, the loop that builds updates in peer_send_teachmsgs is extremely
expensive in write locks, and it doesn't seem this is really needed since
the only updated variables are last_pushed and commitupdate, the first
one being on the shared table (thus not used by other threads), while
commitupdate could likely be changed using a CAS. Thus all of this could
theoretically move under a read lock, but that was not done here.

On a 80-thread machine with a peers section enabled, the request rate
increased from 415 to 520k rps.
2023-08-11 19:03:35 +02:00
Aurelien DARRAGON
70e10ee5bc BUG/MEDIUM: hlua_fcn/queue: bad pop_wait sequencing
I assumed that the hlua_yieldk() function used in the queue:pop_wait()
function would eventually return when the continuation function
returned.

But this is wrong: the continuation function is simply called back by the
resume after the hlua_yieldk(), which does not return in this case. The
caller is no longer the initial calling function but Lua, so when the
continuation function eventually returns, it does not hand control back
to the C calling function (queue:pop_wait()), like we're used to, but
directly to Lua, which will continue the normal execution of the (Lua)
function that triggered the C function, effectively bypassing the end
of the C calling function.

Because of this, the queue waiting list cleanup never occurs!

This causes some undesirable effects:
 - pop_wait() will slowly leak over time, because the allocated queue
   waiting entry never gets deallocated when the function is finished
 - queue:push() will become slower and slower because the wait list will
   keep growing indefinitely as a result of the previous leak
 - the task that performed at least 1 pop_wait() could suffer from
   useless wakeups because it will stay indefinitely in the queue waiting
   list, so every queue:push() will try to wake the task, even if the
   task is not waiting for new queue items.
 - last but not least, if the task that performed at least 1 pop_wait ends
   or crashes, the next queue:push() will lead to invalid reads and a
   process crash, because it will try to wake up a ghost task that doesn't
   exist anymore.

To fix this, the pop_wait function was reworked with the assumption that
the hlua_yieldk() with continuation function never returns. Indeed, it is
now the continuation function that will take care of the cleanup, instead
of the parent function.

This must be backported in 2.8 with 86fb22c5 ("MINOR: hlua_fcn: add Queue class")
2023-07-17 07:42:52 +02:00
Aurelien DARRAGON
33a8c2842b BUG/MINOR: hlua_fcn/queue: use atomic load to fetch queue size
In hlua_queue_size(), the queue size is loaded as a regular int, but the
queue might be shared by multiple threads that could perform some
atomic pushing or popping attempts in parallel, so we'd better use an
atomic load operation to guarantee consistent readings.

This could be backported in 2.8.
2023-07-11 16:04:39 +02:00
Aurelien DARRAGON
b58bd9794f MINOR: hlua_fcn/mailers: handle timeout mail from mailers section
As the example/lua/mailers.lua script does its best to mimic the c-mailer
implementation, it should support the "timeout mail" directive as well.
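
For reference, a minimal mailers section sketch using this directive
(section and mailer names/addresses are placeholders):

  mailers mymailers
    mailer smtp1 192.0.2.1:25
    timeout mail 20s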

This could be backported in 2.8.
2023-07-10 18:28:08 +02:00
Aurelien DARRAGON
d7d507aa8a CLEANUP: hlua_fcn/queue: make queue:push() easier to read
Adding some spaces and code comments in queue:push() function to make
it easier to read.
2023-05-11 09:23:14 +02:00
Aurelien DARRAGON
c0af7cdba2 BUG/MINOR: hlua_fcn/queue: fix reference leak
When pushing a lua object through the lua Queue class, a new reference is
created from the object so that it can be safely restored when needed.

Likewise, when popping an object from the lua Queue class, the object is
restored at the top of the stack via its reference id.

However, once the object is restored, the related queue entry is removed,
thus the object reference must be dropped to prevent a reference leak.
2023-05-11 09:23:14 +02:00
Aurelien DARRAGON
bd8a94a759 BUG/MINOR: hlua_fcn/queue: fix broken pop_wait()
queue:pop_wait() was broken during a late refactor prior to merge
(due to small modifications to ensure that pop() returns nil on an empty
queue instead of nothing).

Because of this, pop_wait() currently behaves exactly like pop(), resulting
in 100% active CPU when used in a while loop.

Indeed, _hlua_queue_pop() should explicitly return 0 when the queue is
empty, since the pop_wait logic relies on this, and the pushnil should be
handled directly in the queue:pop() function instead.

Adding some comments as well to document this.
2023-05-11 09:23:14 +02:00
Aurelien DARRAGON
86fb22c557 MINOR: hlua_fcn: add Queue class
Adding a new lua class: Queue.

This class provides a generic FIFO storage mechanism that may be shared
between multiple lua contexts to easily pass data between them, as stock
Lua doesn't provide easy methods for passing data between multiple
coroutines.

A new Queue object may be obtained using core.queue()
(it works like core.concat() for the Concat class).

Lua documentation was updated (including some usage examples)
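
A minimal usage sketch (illustrative only; a producer task and a consumer
task sharing one queue):

  local q = core.queue()

  -- producer: push one item per second
  core.register_task(function()
    local i = 0
    while true do
      i = i + 1
      q:push("item-" .. i)
      core.sleep(1)
    end
  end)

  -- consumer: pop_wait() sleeps until an item is available
  core.register_task(function()
    while true do
      local item = q:pop_wait()
      core.Info("popped " .. item)
    end
  end)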
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
717a38d135 MINOR: hlua: expose proxy mailers
Proxy mailers, which are configured using "email-alert" directives
in proxy sections of the configuration, are now exposed
directly in lua thanks to the proxy:get_mailers() method, which
returns a class containing the various mailers settings if email
alerts are configured for the given proxy (else nil is returned).

Both the class and the proxy method were marked as LEGACY since this
feature relies on the mailers configuration, and the goal here is to provide
the last missing bits of information to lua scripts in order to make
them capable of sending email alerts instead of relying on the soon-to-
be deprecated mailers implementation based on checks (see src/mailers.c).

Lua documentation was updated accordingly.
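
An illustrative sketch (the proxy name "mybackend" is an assumption):

  core.register_task(function()
    local px = core.proxies["mybackend"]
    if px ~= nil and px:get_mailers() ~= nil then
      core.Info("email alerts are configured for " .. px:get_name())
    end
  end)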
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
fc84553df8 MINOR: hlua_fcn: add Proxy.get_srv_act() and Proxy.get_srv_bck()
Proxy.get_srv_act: number of active servers that are eligible for LB
Proxy.get_srv_bck: number of backup servers that are eligible for LB
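
A short usage sketch (the proxy name is a placeholder):

  local px = core.proxies["mybackend"]
  core.Info(px:get_name() .. ": " .. px:get_srv_act() .. " active, "
            .. px:get_srv_bck() .. " backup servers eligible for LB")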
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
fc759b4ac2 MINOR: hlua_fcn: add Server.get_pend_conn() and Server.get_cur_sess()
Server.get_pend_conn: number of pending connections to the server
Server.get_cur_sess: number of current sessions handled by the server

Lua documentation was updated accordingly.
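
For illustration (the "servers" attribute of the Proxy class is used to
reach the Server objects; the proxy name is a placeholder):

  local px = core.proxies["mybackend"]
  for name, srv in pairs(px.servers) do
    core.Info(name .. ": " .. srv:get_cur_sess() .. " sessions, "
              .. srv:get_pend_conn() .. " pending connections")
  end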
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
3889efa8e4 MINOR: hlua_fcn: add Server.get_proxy()
Server.get_proxy(): get the proxy to which the server belongs
(or nil if not available)
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
4be36a1337 MINOR: hlua_fcn: add Server.get_trackers()
This function returns an array of the servers that are currently tracking
the server.
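
A small sketch (assuming srv is a Server object and that the returned
array is integer-indexed):

  for _, tracker in ipairs(srv:get_trackers()) do
    core.Info(tracker:get_name() .. " is tracking " .. srv:get_name())
  end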
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
406511a2df MINOR: hlua_fcn: add Server.tracking()
This function returns the currently tracked server, if any.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
7a03dee36f MINOR: hlua_fcn: add Server.is_dynamic()
This function returns true if the current server is dynamic,
meaning that it was instantiated at runtime (i.e. from the cli).
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
c72051d53a MINOR: hlua_fcn: add Server.is_backup()
This function returns true if the current server is a backup server.
2023-05-05 16:28:32 +02:00
Aurelien DARRAGON
862a0fe75a MINOR: hlua_fcn: fix Server.is_draining() return type
Adjusting Server.is_draining() return type from integer to boolean
to comply with the documentation.
2023-05-05 16:28:32 +02:00
Willy Tarreau
c05d30e9d8 MINOR: clock: replace the timeval start_time with start_time_ns
Now that "now" is no more a timeval, there's no point keeping a copy
of it as a timeval, let's also switch start_time to nanoseconds, it
simplifies operations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never null and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One open question concerns global_now_ms, which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today,
or in just using global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems; the difference might come from
smaller ones. Better not change anything for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
d2f61de8c2 BUG/MINOR: hlua: return wall-clock date, not internal date in core.now()
That's hopefully the last one affected by this. It was a bit trickier
because there's the promise in the doc that the date is monotonic, so
we continue to use now-start_time as the uptime value and add it to
start_date to get the current date. It was also emphasized by commit
28360dc ("MEDIUM: clock: force internal time to wrap early after boot"),
causing core.now() to return a date of Mar 20 on Apr 27. No backport is
needed.
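
A quick sketch of the expected behaviour (core.now() is documented to
return a table with "sec" and "usec" fields):

  core.register_task(function()
    local t = core.now()
    core.Info("wall-clock seconds since the epoch: " .. t.sec)
  end)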
2023-04-27 18:44:14 +02:00
Ilya Shipitsin
ccf8012f28 CLEANUP: assorted typo fixes in the code and comments
This is the 36th iteration of typo fixes
2023-04-23 09:44:53 +02:00
Aurelien DARRAGON
1746b56e68 MINOR: server: change srv_op_st_chg_cause storage type
This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type".

While looking at the current srv_op_st_chg_cause usage, it was clear that
the struct needed some cleanup, since some leftovers from asynchronous server
state change updates were left behind and resulted in some useless code
duplication, making the whole thing harder to maintain.

Two observations were made:

- by tracking down srv_set_{running, stopped, stopping} usage,
  we can see that the <reason> argument is always a fixed statically
  allocated string.
- check-related state change context (duration, status, code...) is
  not used anymore since srv_append_status() directly extracts the
  values from the server->check. This is pure legacy from when
  the state changes were applied asynchronously.

To prevent code duplication and useless string copies, and to make the
reason/cause more exportable, we now store it as an enum, and we provide
the srv_op_st_chg_cause() function to fetch the related description string.
HEALTH and AGENT causes (check-related) are now explicitly identified so
that consumers like srv_append_op_chg_cause() are able to fetch check info
from the server itself if they need to.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
223770ddca MINOR: hlua/event_hdl: per-server event subscription
Now that the event_hdl api is properly implemented in hlua, we may add
per-server event subscription in addition to the global event
subscription.

Per-server subscription allows being notified of events related to a
single server. It is useful to track a server's UP/DOWN and DEL events.

It works exactly like core.event_sub(), except that the subscription
will be performed within the server's dedicated subscription list instead
of the global one.
The callback function will only be called for server events affecting
the server from which the subscription was performed.

Regarding the implementation, it is pretty trivial at this point; we add
more doc than code this time.

Usage examples have been added to the (lua) documentation.
2023-04-05 08:58:17 +02:00
Aurelien DARRAGON
c84899c636 MEDIUM: hlua/event_hdl: initial support for event handlers
Now that the event handler API is pretty mature, we can expose it in
the lua API.

Introducing the core.event_sub(<event_types>, <cb>) lua function, which
takes an array of event types <event_types> as well as a callback
function <cb> as arguments.

The function returns a subscription <sub> on success.
Subscription <sub> allows you to manage the subscription from anywhere
in the script.
To this day only the sub->unsub method is implemented.

The following event types are currently supported:
  - "SERVER_ADD": when a server is added
  - "SERVER_DEL": when a server is removed from haproxy
  - "SERVER_DOWN": server states goes from up to down
  - "SERVER_UP": server states goes from down to up

As for the <cb> function: it will be called when one of the registered
event types occur. The function will be called with 3 arguments:
  cb(<event>,<data>,<sub>)

<event>: event type (string) that triggered the function.
(could be any of the types used in <event_types> when registering
the subscription)

<data>: data associated with the event (specific to each event family).

For "SERVER_" family events, server details such as server name/id/proxy
will be provided.
If the server still exists (not yet deleted), a reference to the live
server is provided to spare you an additional lookup if you need
to have direct access to the server from lua.

<sub> refers to the subscription, in case you need to manage it from
within an event handler.
(It refers to the same subscription as the one returned from
core.event_sub().)

Subscriptions are per-thread: the thread that will be handling the
event is the one that performed the subscription using the
core.event_sub() function.

Each thread treats events sequentially: it means that if you have,
let's say, SERVER_UP then SERVER_DOWN in a short time lapse, your
cb function will first be called with SERVER_UP, and once you're done
handling the event, your function will be called again with SERVER_DOWN.

This is to ensure event consistency when it comes to logging / triggering
logic from lua.

Your lua cb function may yield if needed, but you're advised to process
the event as fast as possible to prevent the event queue from growing.

To prevent abuses, if the event queue for the current subscription goes
over 100 unconsumed events, the subscription will pause itself
automatically for as long as it takes for your handler to catch up.
This would lead to events being missed, so a warning will be emitted in
the logs to inform you about that. This is not something you want to let
happen too often; it may indicate that you subscribed to an event that
occurs too frequently and/or that your callback function is too
slow to keep up the pace, in which case you should review it.

If you want to do some parallel processing because your callback
functions are slow, you might want to create subtasks from lua using
core.register_task() from within your callback function to perform the
heavy job in a dedicated task and allow remaining events to be processed
more quickly.

Please check the lua documentation for more information.
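
A minimal subscription sketch based on the API described above:

  local function on_srv_event(event, data, sub)
    -- <data> carries the server details (name/id/proxy) for SERVER_ events
    core.Info("got event: " .. event)
    if event == "SERVER_DEL" then
      sub:unsub()  -- we no longer care once the server is gone
    end
  end

  core.event_sub({"SERVER_UP", "SERVER_DOWN", "SERVER_DEL"}, on_srv_event)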
2023-04-05 08:58:17 +02:00
Aurelien DARRAGON
94ee6632ee MINOR: hlua_fcn: add server->get_rid() method
Server revision ID was recently added to haproxy with 61e3894
("MINOR: server: add srv->rid (revision id) value")

Let's add it to the hlua server class.
2023-04-05 08:58:17 +02:00
Aurelien DARRAGON
c4b2437037 MEDIUM: hlua_fcn/api: remove some old server and proxy attributes
Since ("MINOR: hlua_fcn: alternative to old proxy and server attributes"):
 - s->name(), s->puid() are superseded by s->get_name() and s->get_puid()
 - px->name(), px->uuid() are superseded by px->get_name() and
   px->get_uuid()

And considering this is now the proper way to retrieve proxy name/uuid
and server name/puid from lua:

We're now removing these legacy attributes, but for backward-compatibility
purposes we will be emulating them and warning the user for some time
before completely dropping their support.

To do this, we first remove old legacy code.
Then we move server and proxy methods out of the metatable to allow
direct elements access without systematically involving the "__index"
metamethod.

This allows us to involve the "__index" metamethod only when the requested
key is missing from the table.

Then we define the relevant hlua_proxy_index and hlua_server_index functions
that will be used as the "__index" metamethod to respectively handle the
"name, uuid" (proxy) or "name, puid" (server) keys, in which case we
warn the user about the need to use the new getter function instead of the
legacy attribute (to prepare for the potential upcoming removal), and we
call the getter function to return the value as if the getter function
was directly called from the script.

Note: Using the legacy variables instead of the getter functions results
in a slight overhead due to the "__index" metamethod indirection, thus
it is recommended to switch to the getter functions right away.

With this commit we're also adding a deprecation notice about legacy
attributes.
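
A simplified Lua rendition of the pattern (the real implementation lives
in C; the warning text is illustrative):

  -- getters are stored directly in the object, so __index only runs
  -- for missing keys such as the legacy "name"/"uuid" attributes
  local function proxy_index(px, key)
    if key == "name" then
      print("Proxy.name is deprecated, please use Proxy.get_name()")
      return px:get_name()
    elseif key == "uuid" then
      print("Proxy.uuid is deprecated, please use Proxy.get_uuid()")
      return px:get_uuid()
    end
  end
  -- setmetatable(proxy_object, { __index = proxy_index })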
2023-04-05 08:58:16 +02:00
Thierry Fournier
1edf36a369 MEDIUM: hlua_fcn: dynamic server iteration and indexing
This patch proposes to enumerate servers using the internal HAProxy list.
Also, remove the SRV_F_NON_PURGEABLE flag which makes the server
non-purgeable each time Lua uses the server.

Removing reg-tests/cli_delete_server_lua.vtc since this test is no
longer relevant (we don't set the SRV_F_NON_PURGEABLE flag anymore)
and we already have a more generic test:
  reg-tests/server/cli_delete_server.vtc

Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
2023-04-05 08:58:16 +02:00
Thierry Fournier
b0467730a0 MINOR: hlua_fcn: alternative to old proxy and server attributes
This patch adds new lua methods:
- "Proxy.get_uuid()"
- "Proxy.get_name()"
- "Server.get_puid()"
- "Server.get_name()"

These methods are equivalent to their old analog Proxy.{uuid,name}
and Server.{puid,name} attributes, but this will be the new preferred
way to fetch such info, as it duplicates memory only when necessary and
thus reduces the overall memory footprint of the lua Server/Proxy objects.

Legacy attributes (now superseded by the explicit getters) are expected
to be removed some day.
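
Usage sketch (px and srv obtained from core.proxies, as elsewhere):

  core.Info("proxy "  .. px:get_name()  .. " (uuid " .. px:get_uuid() .. ")")
  core.Info("server " .. srv:get_name() .. " (puid " .. srv:get_puid() .. ")")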

Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
2023-04-05 08:58:16 +02:00
Thierry Fournier
467913c84e MEDIUM: hlua: Dynamic list of frontend/backend in Lua
When HAProxy is loaded with a lot of frontends/backends (tested with 300k),
it is slow to start and it uses a lot of memory just for indexing backends
in the lua tables.

This patch uses the internal frontend/backend index of HAProxy in place of
the lua tables.

HAProxy startup is now quicker as each frontend/backend object is created
on demand and not at init.
This comes with some cost: the execution of Lua will be a little bit
slower.
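
An iteration sketch (assuming the lists remain iterable with pairs() as
before; objects are simply instantiated on access):

  core.register_task(function()
    local count = 0
    for _ in pairs(core.backends) do
      count = count + 1
    end
    core.Info("backends: " .. count)
  end)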
2023-04-05 08:58:16 +02:00
Thierry Fournier
599f2311a8 MINOR: hlua: Fix two functions that return nothing useful
Two lua init functions seem to return something useful, but that
is not the case. The function "hlua_concat_init" seems to return
a failure status, but the function never fails. The function
"hlua_fcn_reg_core_fcn" seems to return the number of elements in
the stack, but that is not the case.
2023-04-05 08:58:16 +02:00
Willy Tarreau
76642223f0 MEDIUM: stick-table: switch the table lock to rwlock
Right now a spinlock is used, but most accesses are for reads, so let's
switch the lock to an rwlock and switch all accesses to exclusive locks
for now. There should be no visible difference at this point.
2022-10-12 14:19:05 +02:00
Aurelien DARRAGON
7d00077fd5 BUG/MEDIUM: proxy: ensure pause_proxy() and resume_proxy() own PROXY_LOCK
There was a race involving hlua_proxy_* functions
and some proxy management functions.

pause_proxy() and resume_proxy() can be used directly from lua code,
but that could lead to a race, as lua code didn't make sure the PROXY_LOCK
was owned before calling the proxy functions.

This patch makes sure it won't happen again elsewhere in the code
by locking the PROXY_LOCK directly in the resume and pause proxy functions
so that it's not the caller's responsibility anymore
(based on the stop_proxy() behavior that was already safe prior to the patch).

This should be backported to the stable series.
Note that the API will likely differ in versions < 2.4.
2022-09-09 17:23:01 +02:00
Tim Duesterhus
6eded62f6a CLEANUP: Add missing header to hlua_fcn.c
Found with -Wmissing-prototypes:

    src/hlua_fcn.c:53:5: fatal error: no previous prototype for function 'hlua_checkboolean' [-Wmissing-prototypes]
    int hlua_checkboolean(lua_State *L, int index)
        ^
    src/hlua_fcn.c:53:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
    int hlua_checkboolean(lua_State *L, int index)
    ^
    static
    1 error generated.
2022-05-17 11:40:33 +02:00
William Lallemand
82d5f013f9 BUG/MINOR: lua: don't expose internal proxies
Since internal proxies are now in the global proxy list, they are now
reachable from core.proxies, core.backends, core.frontends.

This patch fixes the issue by checking the PR_CAP_INT flag before
exposing them in lua, so the user can't have access to them.

This patch must be backported in 2.5.
2021-11-24 16:14:24 +01:00
Willy Tarreau
63617dbec6 BUILD: idleconns: include missing ebmbtree.h at several places
backend.c and all the muxes started manipulating ebmb_nodes with
the introduction of idle conns, but the types were inherited through
other includes. Let's add ebmbtree.h there.
2021-10-07 01:36:51 +02:00
Willy Tarreau
27539409fd BUILD: hlua: needs to include stream-t.h
It uses the SF_ERR_* error codes and currently gets them via
intermediary includes.
2021-10-07 01:36:51 +02:00
Amaury Denoyelle
86f3707d14 MINOR: server: mark servers referenced by LUA script as non purgeable
Each server that is retrieved by a LUA script is marked as
non-purgeable. Note that for this to work, the script must already
have been executed once.
2021-08-25 15:53:54 +02:00