A better name was found for the option implemented in ec74438
("MINOR: hlua: add option to preserve bool type from smp to lua")
Indeed, "tune.lua.preserve-smp-bool {on | off}" wasn't explicit enough
nor did it encourage the adoption of the new "fixed" behavior (vs
historical behavior which is now considered a bug).
Thus it becomes "tune.lua.bool-sample-conversion { normal | pre-3.1-bug }"
which actively encourages users to switch to the new behavior after
having patched in-use Lua scripts if needed. From a technical point of view,
the logic remains the same, as the option currently defaults to
"pre-3.1-bug" to prevent script breakage, and a warning is emitted if
the option isn't set explicitly and Lua is used.
Documentation and regtests were updated.
Must be backported to 3.1 with ec74438 and f2838f5 ("REGTESTS: fix
lua-based regtests using tune.lua.smp-preserve-bool")
As discussed in GH #2814, there is an ambiguity in hlua implementation
that causes haproxy smp boolean type to be pushed as an integer on the
Lua stack. On the other hand, when doing Lua to haproxy smp conversion,
the boolean type is properly preserved. Of course this situation is not
desirable and can lead to unexpected results. However we cannot simply
fix the behavior because, in Lua, boolean and integer are completely
distinct types and cannot be used interchangeably. So in order to
prevent breaking existing scripts' logic, in this patch we add a
dedicated lua tunable named "tune.lua.smp-preserve-bool" which can take
the following values:
- "on" : when converting haproxy smp to lua, boolean type is preserved
- "off": when converting haproxy smp to lua, boolean is converted to
integer (legacy behavior)
For now, the tunable defaults to "off" to preserve historical behavior.
However, when the option isn't set explicitly and lua is used, a warning
will be emitted in order to raise users' awareness about this ambiguity.
It is expected that the tunable could default to "on" in future versions,
thus it is recommended to avoid setting it to "off" except when using
existing Lua scripts that still rely on the old behavior regarding boolean
smp to Lua conversion, and that cannot be fixed easily.
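To illustrate the difference between the two settings, here is a minimal
C sketch of the conversion branch; the struct and helper names below are
stand-ins for illustration only, not the actual haproxy code:

    #include <lua.h>

    /* stand-in sample type for illustration; haproxy's real struct sample
     * and SMP_T_* constants differ
     */
    enum smpx_type { SMPX_T_BOOL, SMPX_T_SINT };
    struct smpx { enum smpx_type type; long long sint; };

    /* push a sample on the Lua stack; <preserve_bool> corresponds to
     * "on" (1) vs "off" (0) as described above
     */
    static void smpx_to_lua(lua_State *L, const struct smpx *smp, int preserve_bool)
    {
        switch (smp->type) {
        case SMPX_T_BOOL:
            if (preserve_bool) {
                lua_pushboolean(L, smp->sint != 0); /* keep the boolean type */
                break;
            }
            /* fall through: legacy behavior pushes booleans as integers */
        case SMPX_T_SINT:
            lua_pushinteger(L, (lua_Integer)smp->sint);
            break;
        }
    }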
This should solve issue GH #2814. It may be relevant to backport this in
haproxy 3.1.
Allow haproxy to build correctly without OCSP. OCSP can easily be
disabled in an OpenSSL build with OPENSSL_NO_OCSP, or even with
DEFINE="-DOPENSSL_NO_OCSP" on the haproxy make command line.
In GH #2804, @Bbulatov reported that the result of hlua_stream_ctx_get()
was used and de-referenced without checking if it's NULL in
hlua_filter_delete() while other functions used to check for NULL before
de-referencing it.
In fact hlua_stream_ctx_get() can only return NULL if
hlua_stream_ctx_prepare() failed or was not called on the current stream.
Now because of the filter's API, since hlua_filter_delete() is mapped as
the detach method and hlua_filter_new() as the attach method, and since
hlua_filter_new() is responsible for calling hlua_stream_ctx_prepare(),
there's no reason hlua_filter_delete() should be called if
hlua_filter_new() failed or wasn't called. Thus we can assume that hlua
can never be NULL in hlua_filter_delete(), so we add a BUG_ON() to ensure
it is always the case and remove the ambiguity.
The core.get_patref() method may be used to get a reference to a pattern
object (pat_ref struct which is used for maps and acl storage) from
Lua by providing the reference name (filename for files, or prefix+name
for opt or virtual pattern references).
Lua documentation was updated.
Split the pat_ref_set() function into 2 distinct functions. Indeed, since
0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for
"set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated
to include an extra <elt> argument. But the logic behind it is not explicit
because the function will not only try to set <elt>, but also its
duplicate (unlike pat_ref_set_elt() which only tries to update <elt>).
Thus, to make it clearer and better distinguish between the key-based
lookup version and the elt-based one, restore pat_ref_set()'s previous
prototype and add a dedicated pat_ref_set_elt_duplicate() that takes
<elt> as argument and tries to update <elt> and all duplicates.
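As a rough illustration, the resulting split could look like the
prototypes below (reconstructed from the description above, not copied
from the patch):

    struct pat_ref;
    struct pat_ref_elt;

    /* key-based lookup: find the entry matching <key> and update its value */
    int pat_ref_set(struct pat_ref *ref, const char *key,
                    const char *value, char **err);

    /* elt-based: update <elt> itself as well as all its duplicates */
    int pat_ref_set_elt_duplicate(struct pat_ref *ref, struct pat_ref_elt *elt,
                                  const char *value, char **err);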
0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for
"set-map", "add-acl" action perfs)") improved lookup efficiency for
set-map http action, but the core.set_map() lua method which is built
on the same construct was overlooked. Let's also benefit from this optim
as it easily applies.
Now that we can easily distinguish regular UNIX sockets from ABNS sockets
by simply looking at the address family, stop looking at the first byte
from addr->sun_path to guess if the socket is an ABNS one or not. Looking
at the family is straightforward and will allow us to differentiate between
upcoming ABNSZ and ABNS (where looking at the first byte from path won't
help anymore).
This is a pre-requisite to adding the abnsz socket address family:
in this patch we make use of protocol API rework started by 732913f
("MINOR: protocol: properly assign the sock_domain and sock_family") in
order to implement a dedicated address family for ABNS sockets (based on
UNIX parent family).
Thanks to this, it will become trivial to implement a new ABNSZ (for abns
zero) family which is essentially the same as ABNS but with a slight
difference when it comes to path handling (ABNS uses the whole sun_path
length, while ABNSZ's path is zero-terminated and evaluation stops at 0).
It was verified that this patch doesn't break reg-tests and behaves
properly (tests performed on the CLI with show sess and show fd).
Anywhere relevant, AF_CUST_ABNS is handled alongside AF_UNIX. If no
distinction needs to be made, real_family() is used to fetch the proper
real family type to handle it properly.
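As a minimal sketch of that pattern (the AF_CUST_ABNS value and the
surrounding code are illustrative; only the idea of mapping the custom
family back to AF_UNIX comes from the description above):

    #include <sys/socket.h>
    #include <sys/un.h>

    /* illustrative custom family value living outside the kernel's AF_* range */
    #define AF_CUST_ABNS  (AF_MAX + 1)

    /* return the real (kernel) family behind a possibly custom one */
    static inline int real_family(int family)
    {
        return (family == AF_CUST_ABNS) ? AF_UNIX : family;
    }

    /* ABNS detection now relies on the family instead of sun_path[0] == '\0' */
    static inline int is_abns(int family)
    {
        return family == AF_CUST_ABNS;
    }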
Both stream and dgram were converted, so no functional change should be
expected for this "internal" rework, except that proto will be displayed
as "abns_{stream,dgram}" instead of "unix_{stream,dgram}".
Before ("show sess" output):
0x64c35528aab0: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=21,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp=
After:
0x619da7ad74c0: proto=abns_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=22,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp=
Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>
To execute sample fetches and converters from Lua, the hlua API leverages
the sample API. Prior to executing the sample function, the arg checker is
called from hlua_run_sample_{fetch,conv}() to detect potential errors.
However, hlua_run_sample_{fetch,conv}() both pass NULL as <err> argument,
but it is wrong for two reasons. First we miss an opportunity to report
precise error messages to help the user know what went wrong during the
check.. and more importantly, some val check functions consider that the
<err> pointer is never NULL. This is the case for example with
check_crypto_hmac(). Because of this, when such val check functions
encounter an error, they will crash the process because they will try
to de-reference NULL.
This bug was discovered and reported by GH user @JB0925 on #2745.
Perhaps val check functions should make sure that the provided <err>
pointer is != NULL prior to de-referencing it. But since there are
multiple occurrences found in the code and the API isn't clear about that,
it is easier to fix the hlua part (caller) for now.
To fix the issue, let's always provide a valid <err> pointer when
leveraging the val_arg() check function pointer, and make use of it in
case of error to report a relevant message to the user before freeing it.
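The fix pattern roughly boils down to the sketch below (hypothetical
helper and callback shape, not the actual hlua code):

    #include <stdio.h>
    #include <stdlib.h>

    /* always hand a real <err> pointer to the argument checker so that it
     * may be dereferenced safely and its message reported to the user
     */
    static int run_arg_check(int (*val_args)(void *args, char **err), void *args)
    {
        char *err = NULL;

        if (!val_args(args, &err)) {
            fprintf(stderr, "argument check failed: %s\n",
                    err ? err : "unknown reason");
            free(err);   /* the message, if any, was allocated by the checker */
            return 0;
        }
        return 1;
    }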
It should be backported to all stable versions.
hlua_ctx_renew() is called from unsafe places where the caller doesn't
expect it to LJMP.. however hlua_ctx_renew() makes use of Lua library
functions that could potentially raise errors, such as lua_newthread(),
and it does nothing to catch errors. Because of this, haproxy could
unexpectedly crash. This was discovered and reported by GH user
@JB0925 on #2745.
To fix the issue, let's simply make hlua_ctx_renew() safe by applying
the same logic implemented for hlua_ctx_init() or hlua_ctx_destroy(),
which is catching Lua errors by leveraging SET_SAFE_LJMP_PARENT() helper.
It should be backported to all stable versions.
Add a new parameter "alt" that will store wether this configuration
use an alternate protocol.
This alt pointer will contain a value that can be transparently
passed to protocol_lookup to obtain an appropriate protocol structure.
This change is needed to allow, for example, the servers to know if they
need to use an alternate protocol or not.
In 3.0, the CLI applet was rewritten to use its own buffers. However, the
lua part, used to register CLI commands at runtime, was not updated
accordingly. It means the lua CLI commands still try to write in the channel
buffers. This is of course totally unexepected and not supported. Because of
this bug, the applet hangs intead of returning the command result.
The registration of lua CLI commands relies on the lua TCP applets. So the
send and receive functions were fixed to use the applet's buffer when it is
required and still use the channel buffers otherwise. This way, other lua
TCP applets can still run in the legacy mode, without the applet's buffers.
This patch must be backported to 3.0.
As a result of copy pasting, hlua_cli_io_handler_fct() used to report lua
exceptions like E_ETMOUT as "Lua converter" instead of "Lua cli".
Let's fix that.
It could be backported to all stable versions.
[ada: for older versions, HLUA_E_BTMOUT case didn't exist so it has to be
skipped]
hlua burst timeout was introduced in 58e36e5b1 ("MEDIUM: hlua: introduce
tune.lua.burst-timeout").
It is a safety measure that allows to detect when too much time is spent
on a single lua execution (between 2 interruptions/yields), meaning that
the current thread is not able to perform other tasks. Such scenario
should be avoided because it will cause thread contention which may have
negative performance impact and could cause the watchdog to trigger. When
the burst timeout is exceeded, the current Lua execution is aborted and a
timeout error is reported to the user.
Unfortunately, the same error is currently being reported for cumulative
(AKA execution) timeout and for burst timeout, which may be confusing to
the user.
Indeed, "execution timeout" error historically results from the current
hlua context exceeding the total (cumulative) time it's allowed to run.
It is set per lua context using the dedicated tunables:
- tune.lua.session-timeout
- tune.lua.task-timeout
- tune.lua.service-timeout
We've already faced a user report where the user was able to trigger the
burst timeout and got "Lua task: execution timeout." error while the user
didn't set cumulative timeout. Thus the error was actually confusing
because it was indeed the burst timeout which was causing it due to the
use of cpu-intensive call from within the task without sufficient manual
"yield" keypoints around the cpu-intensive call to ensure it runs on a
dedicated scheduler cycle.
In this patch we make it so burst timeout related errors are reported as
"burst timeout" errors instead of "execution timeout" errors (which
in fact became the generic timeout errors catchall with 58e36e5b1).
To do this, hlua_timer_check() now returns a different value depending if
the exceeded timeout is the burst one or the cumulative one, which allows
us to return either HLUA_E_ETMOUT or HLUA_E_BTMOUT in hlua_ctx_resume().
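A simplified sketch of the idea (return values and helper shape are
illustrative, not the exact patch):

    /* illustrative return codes for the timer check */
    enum lua_tmr { TMR_OK, TMR_EXEC_EXCEEDED, TMR_BURST_EXCEEDED };

    static enum lua_tmr timer_check(unsigned int burst_spent, unsigned int burst_max,
                                    unsigned int total_spent, unsigned int total_max)
    {
        if (burst_max && burst_spent > burst_max)
            return TMR_BURST_EXCEEDED;  /* single uninterrupted run took too long */
        if (total_max && total_spent > total_max)
            return TMR_EXEC_EXCEEDED;   /* cumulative execution budget exceeded */
        return TMR_OK;
    }

The resume path can then map TMR_BURST_EXCEEDED to HLUA_E_BTMOUT and
TMR_EXEC_EXCEEDED to HLUA_E_ETMOUT.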
It should improve the situation described in GH #2356 and may possibly be
backported with 58e36e5b1 to improve error reporting if it applies without
resistance.
'lua_insert(lua->T, -lua_gettop(lua->T))' is actually used to rotate the
top value with the bottom one, thus the code was overkill and the comment
was actually misleading, let's fix that by using the explicit equivalent
form (absolute index).
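For reference, with n elements on the stack, the relative index -n
designates the bottom slot, so the explicit absolute form is simply
(minimal illustration):

    #include <lua.h>

    static void move_top_to_bottom(lua_State *L)
    {
        /* equivalent to lua_insert(L, -lua_gettop(L)): move the value on
         * top of the stack down to the bottom slot, shifting the rest up
         */
        lua_insert(L, 1);
    }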
It may be backported with 5508db9a2 ("BUG/MINOR: hlua: fix unsafe
lua_tostring() usage with empty stack") to all stable versions to ease
code maintenance.
In hlua_ckch_commit_yield() and hlua_ckch_set(), when an error occurs,
we enter the error path and try to raise an error from the <err> msg
pointer which must be freed afterwards.
However, the fact that luaL_error() never returns was overlooked, because
of that, the <err> msg is never freed in such a case.
To fix the issue, let's use hlua_pushfstring_safe() helper to push the
err on the lua stack and then free it before throwing the error using
lua_error().
It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
Thanks to the previous commit, we may now assume that hlua_traceback()
won't LJMP, so it's safe to use it from an unprotected environment without
any precautions.
The function is often used on error paths where no precaution is taken
against LJMP. Since the function is used on error paths (which include
out-of-memory error paths) the function lua_getinfo() could also raise
a memory exception, causing the process to crash or improper error
handling if the caller isn't prepared for that eventuality. Since the
function is only used on rare events (error handling) and is lacking the
__LJMP prototype prefix, let's make it safe by protecting the lua_getinfo()
call so that hlua_traceback() callers may use it safely now (the function
will always succeed, output will be truncated in case of error).
This could be backported to all stable versions.
Following previous commit's logic: hlua_pusherror() is mainly used
from cleanup paths where the caller isn't protected against LJMPs.
Callers were tempted to think that the function was safe because the func
prototype was lacking the __LJMP prefix.
Let's make the function really LJMP-safe by wrapping the sensitive calls
under lua_pcall().
This may be backported to all stable versions.
lua_pushfstring() is used in multiple cleanup paths (upon error) to
push the error message that will be raised by lua_error(). However this
is often done from an unprotected environment, or in the middle of a
cleanup sequence, thus we don't want the function to LJMP! (it may cause
various issues ranging from memory leaks to crashing the process..)
Fortunately, this has very little chance of happening, but since the use of
lua_pushfstring() is limited to error reporting here, it's ok to use our
own hlua_pushfstring_safe() implementation with a little overhead to
ensure that the function will never LJMP.
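A minimal sketch of how such a non-LJMP formatting helper can be built on
top of lua_pcall() (illustrative only; haproxy's actual implementation may
differ, and the fixed-size buffer is a simplification):

    #include <lua.h>
    #include <stdarg.h>
    #include <stdio.h>

    /* protected worker: pushes the C string passed as lightuserdata */
    static int do_pushstring(lua_State *L)
    {
        lua_pushstring(L, lua_touserdata(L, 1));
        return 1;
    }

    /* format into a local buffer, then push it under lua_pcall() protection
     * so a memory error cannot longjmp out of the caller; returns 0 on failure
     */
    static int pushfstring_safe(lua_State *L, const char *fmt, ...)
    {
        char buf[256];
        va_list ap;

        va_start(ap, fmt);
        vsnprintf(buf, sizeof(buf), fmt, ap);
        va_end(ap);

        lua_pushcfunction(L, do_pushstring);
        lua_pushlightuserdata(L, buf);
        return lua_pcall(L, 1, 1, 0) == LUA_OK; /* string is on top on success */
    }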
This could be backported to all stable versions.
In hlua_map_new(), when an error occurs we use a combination of luaL_where,
lua_pushfstring and lua_concat to build the error string before calling
lua_error().
It turns out that we already have the hlua_pusherror() macro which is
exactly made for that purpose so let's use it.
It could be backported to all stable versions to ease code maintenance.
CLASS_CERTCACHE is used to declare the CertCache global object, not the
Regex one. This copy-paste typo was introduced in 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
Using CertCache.set() from init context wasn't explicitly supported and
caused the process to crash:
crash.lua:
core.register_init(function()
CertCache.set{filename="reg-tests/ssl/set_cafile_client.pem", ocsp=""}
end)
crash.conf:
global
lua-load crash.lua
listen front
bind localhost:9090 ssl crt reg-tests/ssl/set_cafile_client.pem ca-file reg-tests/ssl/set_cafile_interCA1.crt verify none
./haproxy -f crash.conf
[NOTICE] (267993) : haproxy version is 3.0-dev2-640ff6-910
[NOTICE] (267993) : path to executable is ./haproxy
[WARNING] (267993) : config : missing timeouts for proxy 'front'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[1] 267993 segmentation fault (core dumped) ./haproxy -f crash.conf
This is because in hlua_ckch_set/hlua_ckch_commit_yield, we always
consider that we're being called from a yield-capable runtime context.
As such, hlua_gethlua() is never checked for NULL and we systematically
try to wake hlua->task and yield every 10 instances.
In fact, if we're called from the body or init context (that is, during
haproxy startup), hlua_gethlua() will return NULL, and in this case we
shouldn't care about yielding because it is ok to commit all instances
at once since haproxy is still starting up.
Also, when calling CertCache.set() from a non-yield capable runtime
context (such as hlua fetch context), we kept doing as if the yield
succeeded, resulting in unexpected function termination (operation
would be aborted and the CertCache lock wouldn't be released). Instead,
now we explicitly state in the doc that CertCache.set() cannot be used
from a non-yield capable runtime context, and we raise a runtime error
if it is used that way.
These bugs were discovered by reading the code when trying to address the
Svace report documented by @Bbulatov in GH #2586.
It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
Based on Willy's idea (from 3.0-dev6 announcement message): in this patch
we try to reduce the max latency that can be caused by running lua scripts
with default settings.
Indeed, by default, hlua engine is allowed to process up to 10k
instructions per batch. While this value was found to be the optimal one
for a single thread, it turns out that keeping a thread busy for 10k lua
instructions could increase thread contention. This is especially true
when the script is loaded with 'lua-load', because in that case the
current thread owns the main lua lock and prevents other threads from
making any progress if they're also waiting on the main lock.
Thanks to Thierry Fournier's work, we know that performance-wise we can
reach optimal performance by sticking between 500 and 10k instructions
per batch. Given that, when the script is loaded using 'lua-load', if no
"tune.lua.forced-yield" was set by the user, we automatically divide the
default value (10K) by the number of threads haproxy can use to reduce
thread contention (given that all threads could compete for the main lua
lock), however we make sure not to return a value below 500, because
Thierry's work showed that this would come with a significant performance
loss.
The historical behavior may still be enforced by setting
"tune.lua.forced-yield" to 10000 in the global config section.
While trying to reproduce another crash case involving lua filters
reported by @bgrooot on GH #2467, we found out that mixing filters loaded
from different contexts ('lua-load' vs 'lua-load-per-thread') for the same
stream isn't supported and may even cause the process to crash.
Historically, mixing lua-load and lua-load-per-thread for a stream wasn't
supported, but this changed thanks to 0913386 ("BUG/MEDIUM: hlua: streams
don't support mixing lua-load with lua-load-per-thread").
However, the above fix didn't consider lua filters' use-case properly:
unlike lua fetches, actions or even services, lua filters don't simply
use the stream hlua context as a "temporary" hlua running context to
process some hlua code. For fetches, actions.. hlua executions are
processed sequentially, so we simply reuse the hlua context from the
previous action/fetch to run the next one (this allows to bypass memory
allocations and initialization, thus it increases performance), unless
we need to run on a different hlua state-id, in which case we perform a
reset of the hlua context.
But this cannot work with filters: indeed, once registered, a filter will
last for the whole stream duration. It means that the filter will rely
on the stream hlua context from ->attach() to ->detach(). And here is the
catch, if for the same stream we register 2 lua filters from different
contexts ('lua-load' + 'lua-load-per-thread'), then we have an issue,
because the hlua stream will be re-created each time we switch between
runtime contexts, which means each time we switch between the filters (may
happen for each stream processing step), and since lua filters rely on the
stream hlua to carry context between filtering steps, this context will be
lost upon a switch. Given that lua filters code was not designed with that
in mind, it would confuse the code and cause unexpected behaviors ranging
from lua errors to crashing the process.
So here we take another approach: instead of re-creating the stream hlua
context each time we switch between "global" and "per-thread" runtime
context, let's have both of them inside the stream directly as initially
suggested by Christopher back when we talked about the original issue.
For this we leverage hlua_stream_ctx_prepare() and hlua_stream_ctx_get()
helper functions which return the proper hlua context for a given stream
and state_id combination.
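Schematically (stand-in types for illustration, not the actual
structures), the storage and accessor could look like:

    struct hlua;

    /* the stream now carries one hlua context per runtime context:
     * one for the shared 'lua-load' state and one for the per-thread state
     */
    struct stream_hlua_store {
        struct hlua *ctx[2];
    };

    /* return the context matching the requested lua state (0 = global) */
    static struct hlua *stream_hlua_get(struct stream_hlua_store *s, int state_id)
    {
        return s->ctx[state_id ? 1 : 0];
    }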
As for debugging infos reported after ha_panic(), we check for both hlua
runtime contexts to check if one of them was active when the panic occurred
(only 1 runtime ctx per stream may be active at a given time).
This should be backported to all stable versions with 0913386
("BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread")
This commit depends on:
- "DEBUG: lua: precisely identify if stream is stuck inside lua or not"
[for versions < 2.9 the ha_thread_dump_one() part should be skipped]
- "MINOR: hlua: use accessors for stream hlua ctx"
For 2.4, the filters API didn't exist. However it may be a good idea to
backport it anyway because ->set_priv()/->get_priv() from tcp/http lua
applets may also be affected by this bug, plus it will ease code
maintenance. Of course, filters-related parts should be skipped in this
case.
Change hlua_stream_ctx_prepare() prototype so that it now returns the
proper hlua ctx on success instead of returning a boolean.
Add hlua_stream_ctx_get() to retrieve hlua ctx out of a given stream.
This way we may easily change the storage mechanism for hlua stream in
the future without extensive code changes.
No backport needed unless a commit depends on it.
When ha_panic() is called by the watchdog, we try to guess from
ha_task_dump() and ha_thread_dump_one() if the thread was stuck while
executing lua from the stream context. However we consider this is the
case by simply checking if the stream hlua context was set, but this is
not very precise because if the hlua context is set, then it simply means
that at least one lua instruction was executed at the stream level, not
that the thread was currently executing lua when the panic occurred.
This is especially true with filters: one could simply register a lua
filter that does nothing, but this will still end up initializing the
stream hlua context for each stream. If the thread ends up being stuck
during the stream handling, then debug dumping functions will report
that the stream was stuck while handling lua, which is not necessarily
true, and could in fact confuse us even more.
So here we take another approach, we add the BUSY flag to hlua context:
this flag is set by hlua_ctx_resume() around lua_resume() call, this way
we can precisely tell if the thread was handling lua when it was
interrupted, and we rely on this flag in debug functions to check if the
thread was effectively stuck inside lua or not while processing the stream.
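A simplified sketch of the flag handling (stand-in names, Lua 5.4 resume
signature; not the actual hlua code):

    #include <lua.h>

    /* stand-in for haproxy's hlua context, for illustration only */
    struct ctx { lua_State *T; unsigned int flags; };
    #define CTX_F_BUSY 0x01

    static int resume_marked_busy(struct ctx *c, int nargs)
    {
        int ret, nres;

        c->flags |= CTX_F_BUSY;                      /* we are now inside Lua */
        ret = lua_resume(c->T, NULL, nargs, &nres);  /* Lua 5.4 signature */
        c->flags &= ~CTX_F_BUSY;                     /* back from Lua */
        return ret;
    }

Debug dump functions can then test the BUSY flag instead of merely
checking whether the stream hlua context exists.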
No backport needed unless a commit depends on it.
hlua_filter_delete() calls hlua_unref() on the stream hlua stack, but
we should own the lock prior to manipulating the stack.
This should be backported up to 2.6.
This is a complementary patch to 8670db7 ("BUG/MAJOR: hlua: improper lock
usage with hlua_ctx_resume()") for hlua_filter_new().
Indeed, the HLUA_E_ERRMSG case still relies on the lua stack but didn't
take the lock to do so.
This should be backported up to 2.6.
Trying to register the same lua filter from global and per-thread context
(using 'lua-load' + 'lua-load-per-thread') causes a segmentation fault in
hlua_post_init().
This is due to a simple copy-paste error as we try to print the function
name in the error message (like we do when loading the same lua function
from different contexts) instead of the filter name.
This should be backported up to 2.6.
Instead of reporting lua errors using ha_alert(), let's use SEND_ERR()
helper which will also try to generate a log message according to lua
log settings.
hlua_event_subscribe() is meant to be called from a protected lua env
during init and/or runtime. As such, only hlua_event_sub() makes use
of it: when an error happens hlua_event_sub() will already raise a Lua
exception. Thus it's not relevant to use ha_alert() there as it could
generate log pollution (error is relevant from Lua script point of view,
not from haproxy one).
This could be backported in 2.8.
hlua_ctx_resume() itself can safely be used as-is in a multithreading
context because it takes care of taking the lua lock.
However, when hlua_ctx_resume() returns, the lock is released and it is
thus the caller's responsibility to ensure it owns the lock prior to
performing additional manipulations on the Lua stack. Unfortunately, since
early haproxy lua implementation, we used to do it wrong:
The most common hlua_ctx_resume() pattern we can find in the code (because
it was duplicated over and over over time) is the following:
|ret = hlua_ctx_resume()
|switch (ret) {
| case HLUA_E_OK:
| break;
| case HLUA_E_ERRMSG:
| break;
| [...]
|}
Problem is: for some of the switch cases, we still perform lua stack
manipulations. This is the case for the HLUA_E_ERRMSG for instance where
we often use lua_tostring() to retrieve last lua error message on the top
of the stack, or sometimes for the HLUA_E_OK case, when we need to perform
some lua cleanup logic once the resume ended. But all of this is done
WITHOUT the lua lock, so this means that the main lua stack could be
accessed simultaneously by concurrent threads when a script was loaded
using 'lua-load'.
While it is not critical for switch-cases dedicated to error handling,
(those are not supposed to happen very often), it can be very problematic
for stack manipulations occurring in the HLUA_E_OK case under heavy load
for instance. In this case, main lua stack corruptions will eventually
happen. This is especially true inside hlua_filter_new(), where this bug
was known to cause lua stack corruptions under load, leading to lua errors
and even crashing the process as reported by @bgrooot in GH #2467.
The fix is relatively simple, once hlua_ctx_resume() returns: we should
consider that ANY lua stack access should be lua-lock protected. If the
related lua calls may raise lua errors, then (RE)SET_SAFE_LJMP
combination should be used as usual (it allows to lock the lua stack and
catch lua exceptions at the same time), else hlua_{lock,unlock} may be
used if no exceptions are expected.
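Concretely, once fixed, the common pattern above roughly becomes (a
simplified sketch based on the description; hlua_lock()/hlua_unlock() and
hlua_tostring_safe() are the helpers mentioned in this series):
|ret = hlua_ctx_resume()
|switch (ret) {
| case HLUA_E_OK:
|         hlua_lock(hlua);   /* any post-resume stack work is done under the lock */
|         /* ... non-raising lua cleanup ... */
|         hlua_unlock(hlua);
|         break;
| case HLUA_E_ERRMSG:
|         hlua_lock(hlua);
|         error = hlua_tostring_safe(hlua->T, -1);
|         hlua_unlock(hlua);
|         break;
| [...]
|}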
This patch should fix GH #2467.
It should be backported to all stable versions.
[ada: some ctx adj will be required for older versions as event_hdl
doesn't exist prior to 2.8 and filters were implemented in 2.5, thus
some chunks won't apply]
When we want to perform some unsafe lua stack manipulations from an
unprotected lua environment, we use SET_SAFE_LJMP() RESET_SAFE_LJMP()
combination to lock lua stack and catch potential lua exceptions that
may occur between the two.
Hence, we regularly find this pattern (duplicated over and over):
|if (!SET_SAFE_LJMP(hlua)) {
| const char *error;
|
| if (lua_type(hlua->T, -1) == LUA_TSTRING)
| error = hlua_tostring_safe(hlua->T, -1);
| else
| error = "critical error";
| SEND_ERR(NULL, "*: %s.\n", error);
|}
This is wrong because when SET_SAFE_LJMP() returns false (meaning that an
exception was caught), then the lua lock was released already, thus the
caller is not expected to perform lua stack manipulations (because the
main lua stack may be shared between multiple threads). In the pattern
above we only want to retrieve the lua exception message which may be
found at the top of the stack, to do so we now explicitly take the lua
lock before accessing the lua stack. Note that hlua_lock() doesn't catch
lua exceptions so only safe lua functions are expected to be used there
(lua functions that may NOT raise exceptions).
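The fixed pattern thus becomes, roughly (sketch following the description
above):
|if (!SET_SAFE_LJMP(hlua)) {
|        const char *error;
|
|        hlua_lock(hlua);   /* the lock was released when the exception was caught */
|        if (lua_type(hlua->T, -1) == LUA_TSTRING)
|                error = hlua_tostring_safe(hlua->T, -1);
|        else
|                error = "critical error";
|        SEND_ERR(NULL, "*: %s.\n", error);
|        hlua_unlock(hlua);
|}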
It should be backported to all stable versions.
[ada: some ctx adj will be required for older versions as event_hdl
doesn't exist prior to 2.8 and filters were implemented in 2.5, thus
some chunks won't apply, but other fixes should stay relevant]
In hlua_filter_new(), after each hlua resume, we systematically try to
empty the stack by calling lua_settop(). However we're doing this without
locking the lua context, so it is unsafe in multithreading context if the
script is loaded using 'lua-load'. To fix the issue, we protect the call
with hlua_{lock,unlock}() helpers.
This should be backported up to 2.6.
In hlua_filter_callback(), some lua stack work is performed under
SET_SAFE_LJMP() guard which also takes care of locking the hlua context
when needed. However, a lua_gettop() call is performed out of the guard,
thus it is unsafe in multithreading context if the script is loaded using
'lua-load' because in this case the main lua stack is shared between
threads and each access to a lua stack must be performed under the lock,
thus we move lua_gettop() call under the lock.
It should be backported up to 2.6.
hlua_filter_new() handles memory allocation errors by jumping to the
"end:" cleanup label in case of errors. Such errors may happen when the
system is heavily loaded for instance.
In hlua_filter_new(), we try to allocate two hlua contexts in a row before
checking if one of them failed (in which case we jump to the cleanup part
of the function), and only then we initialize them both.
If a memory allocation failure happens for only one out of the two
flt_ctx->hlua[] contexts pair, we still jump to the cleanup part.
It means that the hlua context that was successfully allocated and wasn't
initialized yet will be passed to hlua_ctx_destroy(), resulting in invalid
reads in the cleanup function, which may ultimately cause the process to
crash.
To fix the issue: we make sure flt_ctx hlua contexts are initialized right
after they are allocated, that is before any error handling condition that
may force the cleanup.
This bug was discovered when trying to reproduce GH #2467 with haproxy
started with "-dMfail" argument.
It should be backported up to 2.6.
As per lua documentation, lua_tostring() may raise a memory error.
However, we're often using it to fetch the error message at the top of
the stack (ie: after a failing lua call) from unprotected environments.
In practice, lua_tostring() has little chance of failing, but still, if
it happens to be the case, it could crash the process and we better not
risk it.
So here, we add hlua_tostring_safe() function, which works exactly as
lua_tostring(), but the function cannot LJMP as it will catch
lua_tostring() exceptions to return NULL instead.
Everywhere lua_tostring() was used to retrieve error string from such
unprotected contexts, we now rely on hlua_tostring_safe().
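A minimal illustration of how such a wrapper can be built (simplified,
intended for values that are already strings, such as error messages, and
not necessarily haproxy's exact implementation):

    #include <lua.h>

    /* protected worker: converts the value at index 2 and stores the result */
    static int do_tostring(lua_State *L)
    {
        const char **out = lua_touserdata(L, 1);

        *out = lua_tostring(L, 2);
        return 0;
    }

    /* like lua_tostring(L, idx), but never longjmps: a memory error raised
     * during the conversion is caught by lua_pcall() and NULL is returned
     */
    static const char *tostring_safe(lua_State *L, int idx)
    {
        const char *out = NULL;

        idx = lua_absindex(L, idx);   /* keep pointing at the same slot */
        if (!lua_checkstack(L, 3))
            return NULL;
        lua_pushcfunction(L, do_tostring);
        lua_pushlightuserdata(L, &out);
        lua_pushvalue(L, idx);        /* copy of the value to convert */
        lua_pcall(L, 2, 0, 0);        /* on error, <out> simply stays NULL */
        return out;
    }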
This should be backported to all stable versions.
[ada: ctx adj will be required, for versions prior to 2.8 event_hdl
API didn't exist so some chunks won't apply, and prior to 2.5 filters
API didn't exist either, so again, some chunks should be ignored]
Lua documentation says that lua_tostring() returns a pointer that remains
valid as long as the object is not removed from the stack.
However there are some places where we use the returned string AFTER the
corresponding object is removed from the stack. In practice this doesn't
seem to cause visible bugs (probably because the pointer remains valid
waiting for a GC cycle), but let's fix that to comply with the
documentation and avoid undefined behavior.
It should be backported in all stable versions.
Add core.silent (-1) value to be able to disable logging via
TXN:set_loglevel() call. Otherwise, there is no way to do so and it may be
handy. This special value cannot be used with TXN:log() function.
This patch may be backported if necessary.
When the log level is changed in lua, by calling TXN:set_loglevel function,
it must be incremented by one because it is decremented in strm_log()
function.
This patch must be backported to all stable versions.
If some data are received for a lua socket while the lua script responsible
for consuming these data is not ready to do so, for instance because it is
sleeping, the applet is woken up in a loop because it never states that it
will not consume these data yet.
To fix the issue, in the applet I/O handler, when there are outgoing data, we
always pretend the applet will not consume them. It is the responsibility of
the lua script to reactivate receives by calling the Socket.receive() function.
This patch must be backported to every stable version. For 2.4 and older,
si_want_get()/si_cant_get() must be used instead of
applet_will_consume()/applet_wont_consume().
It is possible to create a lua socket without performing any connect. In
this case, the lua socket is released because of the garbage collector.
However, the garbage collector does not release the applet, it wakes it
up. Since commit 751b59c40b ("BUG/MEDIUM: hlua: Initialize appctx used by a
lua socket on connect only"), the applet initialization is performed on
connect. So, here, it is possible to wake an uninitialized applet. It is an
unexpected case for the applet's I/O handler, leading to a segfault because
some resources are not initialized (the stream's target in this case).
So, now, in the lua socket GC function, we take care to immediately release
uninitialized applets. At worst, the release itself is delayed. But it is
safe because we are sure the applet's I/O handler will never be executed.
In addition, we take care to increment the GC counter when the lua socket is
created. This way, uninitialized lua sockets are released more quickly.
This patch should fix the issue #2451. It must be backported as far as 2.6.
Both of these flags are set after releasing the applet, in
appctx_shut(). Concretely, it means the applet is shut down for reads and
writes. Once set, the applet's I/O handler is no longer called. Tests on
these flags are useless. There is no chance to match them.
This is a complementary patch to "MINOR: tcp-act: Rename "set-{mark,tos}"
to "set-fc-{mark,tos}"", but for the Lua API.
set_mark and set_tos were kept as aliases for set_fc_mark and set_fc_tos
but they were marked as deprecated.
Using this opportunity to reorder set_mark and set_tos by alphabetical
order.
This is a cleanup patch to address cosmetic issues introduced in f034139bc0
("MINOR: lua: Allow reading "proc." scoped vars from LUA core.")
Also taking this opportunity to prefix the function with __LJMP to
indicate that it may longjump.
No backport needed.