haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-21 06:31:18 +02:00

Author	SHA1	Message	Date
Willy Tarreau	bc1b820606	BUILD: watchdog: condition it to USE_RT It's needed on Linux to have access to timerfd_*, and on FreeBSD this lib is needed as well, though not enabled in our default build. We can see later if it's OK to enable it, for now let's fix the build issues.	2019-05-23 10:20:55 +02:00
Willy Tarreau	02255b24df	BUILD: watchdog: use si_value.sival_int, not si_int for the timer's value Bah, the linux manpage suggests to use si_int but it's a fake, it's only a define on sigval.sival_int where sigval is defined as si_value. Let's use si_value.sival_int, at least it builds on both Linux and FreeBSD. It's likely that this code will have to be limited to a small subset of OSes if it causes difficulties like this.	2019-05-23 08:36:29 +02:00
Willy Tarreau	96d5195862	MEDIUM: config: deprecate the antique req* and rsp* commands These commands don't follow the same flow as the rest of the commands, each of them iterates over all header lines before switching to the next directive. In addition they make no distinction between start line and headers and can lead to unparsable rewrites which are very difficult to deal with internally. Most of them are still occasionally found in configurations, mainly because of the usual "we've always done this way". By marking them deprecated and emitting a warning and recommendation on first use of each of them, we will raise users' awareness regarding the cleaner, faster and more reliable alternatives. Some use cases of "reqrep" still appear from time to time for URL rewriting that is not so convenient with other rules. But at least users facing this requirement will explain their use case so that we can best serve them. Some discussion started on this subject in a thread linked to from github issue #100. The goal is to remove them in 2.1 since they require reparsing the result before indexing it and we don't want this hack to live long. The following directives were marked deprecated: reqadd, reqallow, reqdel, reqdeny, reqiallow, reqidel, reqideny, reqipass, reqirep, reqitarpit, reqpass, reqrep, reqtarpit, rspadd, rspdel, rspdeny, rspidel, rspideny, rspirep, rsprep.	2019-05-22 20:43:45 +02:00
Willy Tarreau	3844747536	CLEANUP: raw_sock: remove support for very old linux splice bug workaround We've been dealing with a workaround for a bug in splice that used to affect version 2.6.25 to 2.6.27.12 and which was fixed 10 years ago in kernel versions which are not supported anymore. Given that people who would use a kernel in such a range would face much more serious stability and security issues, it's about time to get rid of this workaround and of the ASSUME_SPLICE_WORKS build option used to disable it.	2019-05-22 20:02:15 +02:00
Willy Tarreau	e5733234f6	CLEANUP: build: rename some build macros to use the USE_* ones We still have quite a number of build macros which are mapped 1:1 to a USE_something setting in the makefile but which have a different name. This patch cleans this up by renaming them to use the USE_something one, allowing to clean up the makefile and make it more obvious when reading the code what build option needs to be added. The following renames were done: ENABLE_POLL -> USE_POLL, ENABLE_EPOLL -> USE_EPOLL, ENABLE_KQUEUE -> USE_KQUEUE, ENABLE_EVPORTS -> USE_EVPORTS, TPROXY -> USE_TPROXY, NETFILTER -> USE_NETFILTER, NEED_CRYPT_H -> USE_CRYPT_H, CONFIG_HAP_CRYPT -> USE_LIBCRYPT, CONFIG_HAP_NS -> USE_NS, CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE, CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY, CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL.	2019-05-22 19:47:57 +02:00
Willy Tarreau	823bda0eb7	BUILD: time: remove the test on _POSIX_C_SOURCE It seems it's not defined on FreeBSD while it's mentioned on Linux that clock_gettime() can be detected using this. Given that we also have the test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks on other systems, we'll see. Report was here : https://github.com/haproxy/haproxy/runs/133866993	2019-05-22 19:14:59 +02:00
Willy Tarreau	082b62828d	BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks We currently have the ability to register functions to be called early on thread creation and at thread deinitialization. It turns out this is not sufficient because certain such functions may use resources that are being allocated by the other ones, thus creating a race condition depending only on the linking order. For example the mworker needs to register a file descriptor while the pollers will reallocate the fd_updt[] array. Similarly logs and trashes may be used by some init functions while it's unclear whether they have been deduplicated. The same issue happens on deinit, if the fd_updt[] or trash is released before some functions finish to use them, we'll get into trouble. This patch creates a couple of early and late callbacks for per-thread allocation/freeing of resources. A few init functions were moved there, and the fd init code was split between the two (since it used to both allocate and initialize at once). This way the init/deinit sequence is expected to be safe now. This patch should be backported to 1.9 as at least the trash/log issue seems to be present. The run_thread_poll_loop() code is a bit different there as the mworker is not a callback, but it will have no effect and it's enough to drop the mworker changes. This bug was reported by Ilya Shipitsin in github issue #104.	2019-05-22 14:59:08 +02:00
Willy Tarreau	aabbe6a3bb	MINOR: WURFL: do not emit warnings when not configured At the moment the WURFL module emits 3 lines of warnings upon startup when it is not referenced in the configuration file, which is quite confusing. Let's make sure to keep it silent when not configured, as detected by the absence of the wurfl-data-file statement.	2019-05-22 14:01:22 +02:00
mbellomi	ae4fcf1e67	MINOR: WURFL: module version bump to 2.0 Make it version 2.0.	2019-05-22 12:06:42 +02:00
mbellomi	2c07700098	MEDIUM: WURFL: HTX awareness. Now wurfl fetch process is fully HTX aware.	2019-05-22 12:06:38 +02:00
mbellomi	9896981675	MINOR: WURFL: wurfl_get() and wurfl_get_all() now return an empty string if device detection fails	2019-05-22 12:06:38 +02:00
mbellomi	e9fedf560a	MINOR: WURFL: removes leading wurfl-information-separator from wurfl-get-all() and wurfl-get() results	2019-05-22 12:06:38 +02:00
mbellomi	4304e30af1	MINOR: WURFL: shows log messages during module initialization Now some useful startup information is logged to stderr. Previously they were lost because logs were not yet enabled.	2019-05-22 12:06:34 +02:00
mbellomi	f9ea1e2fd4	MINOR: WURFL: fixed Engine load failed error when wurfl-information-list contains wurfl_root_id	2019-05-22 12:06:07 +02:00
mbellomi	d173e93aa7	BUG/MEDIUM: WURFL: segfault in wurfl-get() with missing info. A segfault may happen in ha_wurfl_get() when dereferencing information not present in wurfl-information-list. Check the node retrieved from the tree, not its container. This fix must be backported to 1.9.	2019-05-22 12:06:02 +02:00
Willy Tarreau	0a7a4fbbc8	CLEANUP: mux-h1: use "H1" and not "h1" as the mux's name The mux's name is the only one reported in lower case in "show sess" or "haproxy -vv" while the other ones are upper case, so it loses and the other ones win :-)	2019-05-22 11:50:48 +02:00
Willy Tarreau	b106ce1c3d	MINOR: stream: remove the cpu time detection from process_stream() It was not as efficient as the watchdog in that it would only trigger after the problem resolved by itself, and still required a huge margin to make sure we didn't trigger for an invalid reason. This used to leave little indication about the cause. Better use the watchdog now and improve it if needed. The detector of unkillable tasks remains active though.	2019-05-22 11:50:48 +02:00
Willy Tarreau	2bfefdbaef	MAJOR: watchdog: implement a thread lockup detection mechanism Since threads were introduced, we've naturally had a number of bugs related to locking issues. In addition we've also got some issues with corrupted lists in certain rare cases not necessarily involving threads. Not only do these events cause a lot of trouble in production, as it is very hard to detect that the process is stuck in a loop and doesn't deliver the service anymore, but it's often difficult (or too late) to collect more debugging information. The patch presented here implements a lockup detection mechanism, also known as "watchdog". The principle is that (on systems supporting it), each thread will have its own CPU timer which progresses as the thread consumes CPU cycles, and when a deadline is met, a signal is delivered (SIGALRM here since it doesn't interrupt gdb by default). The thread handling this signal (which is not necessarily the one which triggered the timer) figures out the thread ID from the signal arguments and checks if it's really stuck by looking at the time spent since last exit from poll() and by checking that the thread's scheduler is still alive (so that even when dealing with configuration issues resulting in an insane amount of tasks being called in turn, it is not possible to accidentally trigger it). Checking the scheduler's activity will usually result in a second chance, thus doubling the detection time. In order not to incorrectly flag a thread as being the cause of the lockup, the thread_harmless_mask is checked : a thread could very well be spinning on itself waiting for all other threads to join (typically what happens when issuing "show sess"). In this case, once all threads but one (or two) have joined, all the innocent ones are marked harmless and will not trigger the timer. Only the ones not reacting will.
The deadline is set to one second, which already appears impossible to reach, especially since it's 1 second of CPU usage, not elapsed time with the CPU being preempted by other threads/processes/hypervisor. In practice due to the scheduler's health verification it takes up to two seconds to decide to panic. Once all conditions are met, the goal is to crash from the offending thread. So if it's the current one, we call ha_panic() otherwise the signal is bounced to the offending thread which deals with it. This will result in all threads being woken up in turn to dump their context, the whole state is emitted on stderr in hope that it can be logged, and the process aborts, leaving a chance for a core to be dumped and for a service manager to restart it. An alternative mechanism could be implemented for systems unable to wake up a thread once its CPU clock reaches a deadline (e.g. FreeBSD). Instead of waking the timer each and every deadline, it is possible to use a standard timer which is reset each time we leave poll(). Since the signal handler rechecks the CPU consumption this will also work. However a totally idle process may trigger it from time to time which may or may not confuse some debugging sessions. The same is true for alarm() which could be another option for systems not having such a broad choice of timers (but it seems that in this case they will not have per-thread CPU measurements available either). The feature is currently implemented only when threads are enabled in order to keep the code clean, since the main purpose is to detect and address inter-thread deadlocks. But if it proves useful for other situations this condition might be relaxed.	2019-05-22 11:50:48 +02:00
Willy Tarreau	e6a02fa65a	MINOR: threads: add a "stuck" flag to the thread_info struct This flag is constantly cleared by the scheduler and will be set by the watchdog timer to detect stuck threads. It is also set by the "show threads" command so that it is easy to spot if the situation has evolved between two subsequent calls : if the first "show threads" shows no stuck thread and the second one shows such a stuck thread, it indicates that this thread didn't manage to make any forward progress since the previous call, which is extremely suspicious.	2019-05-22 11:50:48 +02:00
Willy Tarreau	578ea8be55	MINOR: debug: dump streams when an applet, iocb or stream is known Whenever we can retrieve a valid stream pointer, we now call stream_dump() to get a detailed dump of the stream currently running on the processor. This is used by "show threads" and by ha_panic().	2019-05-22 11:50:48 +02:00
Willy Tarreau	5484d58a17	MINOR: stream: introduce a stream_dump() function and use it in stream_dump_and_crash() This function dumps a lot of information about a stream into the provided buffer. It is now used by stream_dump_and_crash() and will be used by the debugger as well.	2019-05-22 11:50:48 +02:00
Willy Tarreau	fade80d162	CLEANUP: debug: make use of ha_tkill() and remove ifdefs This way we always signal the threads the same way.	2019-05-22 11:50:48 +02:00
Willy Tarreau	2beaaf7d46	MINOR: threads: implement ha_tkill() and ha_tkillall() These functions are used respectively to signal one thread or all threads. When multithreading is disabled, it's always the current thread which is signaled.	2019-05-22 11:50:48 +02:00
Willy Tarreau	8b35ba54bc	CLEANUP: debug: always report harmless/want_rdv even without threads This way we have a more consistent output and we can remove annoying ifdefs.	2019-05-22 11:50:48 +02:00
Willy Tarreau	05ed14cfc4	CLEANUP: threads: really move thread_info to hathreads.c Commit 5a6e2245f ("REORG: threads: move the struct thread_info from global.h to hathreads.h") didn't hold its promise well, as the thread_info struct was still declared and initialized in haproxy.c in addition to being in hathreads.c. Let's move it for real now.	2019-05-22 11:50:48 +02:00
Willy Tarreau	ddd8533f1b	MINOR: debug: switch to SIGURG for thread dumps The current choice of SIGPWR has the adverse effect of stopping gdb each time it is triggered using "show threads" for example, which is not really convenient. Let's switch to SIGURG instead, which we don't use either.	2019-05-22 11:50:48 +02:00
Tim Duesterhus	9b7a976cd6	BUG/MINOR: mworker: Fix memory leak of mworker_proc members The struct mworker_proc is not uniformly freed everywhere, sometimes leading to leaks of the `id` string (and possibly the other strings). Introduce a mworker_free_child function instead of duplicating the freeing logic everywhere to prevent this kind of issues. This leak was reported in issue #96. It looks like the leaks have been introduced in commit 9a1ee7ac31c56fd7d881adf2ef4659f336e50c9f, which is specific to 2.0-dev. Backporting `mworker_free_child` might be helpful to ease backporting other fixes, though.	2019-05-22 11:29:18 +02:00
Willy Tarreau	f61782418c	CLEANUP: time: refine the test on _POSIX_TIMERS The clock_gettime() man page says we must check that _POSIX_TIMERS is defined to a value greater than zero, not just that it's simply defined so let's fix this right now.	2019-05-21 20:03:03 +02:00
Olivier Houchard	aacc405c1f	BUG/MEDIUM: streams: Don't switch from SI_ST_CON to SI_ST_DIS on read0. When we receive a read0, and we're still in SI_ST_CON state (so on an outgoing connection), don't immediately switch to SI_ST_DIS, or we would never call sess_establish(), and so the analysers will never run. Instead, let sess_establish() handle that case, and switch to SI_ST_DIS if we already have CF_SHUTR on the channel. This should be backported to 1.9.	2019-05-21 19:05:09 +02:00
Emmanuel Hocdet	0ba4f483d2	MAJOR: polling: add event ports support (Solaris) Event ports are kqueue/epoll polling class for Solaris. Code is based on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8. Event ports are available only on SunOS systems derived from Solaris 10 and later (including illumos systems).	2019-05-21 15:16:45 +02:00
Willy Tarreau	663fda4c90	BUILD: threads: only assign the clock_id when supported I took extreme care to always check for _POSIX_THREAD_CPUTIME before manipulating clock_id, except at one place (run_thread_poll_loop) as found by Manu, breaking Solaris. Now fixed, no backport needed.	2019-05-21 15:14:08 +02:00
Willy Tarreau	9c8800af3b	MINOR: debug: report each thread's cpu usage in "show threads" Now we can report each thread's CPU time, both at wake up (poll) and retrieved while dumping (now), then the difference, which directly indicates how long the thread has been running uninterrupted. A very high value for the diff could indicate a deadlock, especially if it happens between two threads. Note that it may occasionally happen that a wrong value is displayed since nothing guarantees that the date is read atomically.	2019-05-20 21:14:14 +02:00
Willy Tarreau	81036f2738	MINOR: time: move the cpu, mono, and idle time to thread_info These ones are useful across all threads and would be better placed in struct thread_info than thread-local. There are very few users.	2019-05-20 21:14:14 +02:00
Willy Tarreau	8323a375bc	MINOR: threads: add a thread-local thread_info pointer "ti" Since we're likely to access this thread_info struct more frequently in the future, let's reserve the thread-local symbol to access it directly and avoid always having to combine thread_info and tid. This pointer is set when tid is set.	2019-05-20 21:14:12 +02:00
Willy Tarreau	624dcbf41e	MINOR: threads: always place the clockid in the struct thread_info It will be easier to deal with the internal API to always have it.	2019-05-20 21:13:01 +02:00
Willy Tarreau	5a6e2245fa	REORG: threads: move the struct thread_info from global.h to hathreads.h It doesn't make sense to keep this struct thread_info in global.h, it causes difficulties to access its contents from hathreads.h, let's move it to the threads where it ought to have been created.	2019-05-20 20:00:25 +02:00
Willy Tarreau	a9f9fc9e5b	MINOR: debug: make ha_panic() report threads starting at 1 Internally they start at zero but everywhere (config, dumps) we show them starting at 1, so let's fix the confusion.	2019-05-20 17:46:14 +02:00
Willy Tarreau	3710105945	MINOR: tools: provide a may_access() function and make dump_hex() use it It's a bit too easy to crash by accident when using dump_hex() on any area. Let's have a function to check if the memory may safely be read first. This one abuses the stat() syscall checking if it returns EFAULT or not, in which case it means we're not allowed to read from there. In other situations it may return other codes or even a success if the area pointed to by the file exists. It's important not to abuse it though and as such it's tested only once per output line.	2019-05-20 16:59:37 +02:00
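Sketch for commit 3710105945: the stat() trick described above can be approximated as below. This is a hedged re-creation of the idea, not the actual may_access() from HAProxy. The name `may_access_sketch` is hypothetical. The check only tells whether the kernel could read a path string starting at the address, so it is a heuristic and not a guarantee for a whole memory range.

```c
#include <errno.h>
#include <sys/stat.h>

/* Returns 0 when the kernel reports EFAULT while reading <ptr> as a path,
 * 1 otherwise (including when stat() fails for any other reason).
 */
static int may_access_sketch(const void *ptr)
{
	struct stat st;

	if (stat((const char *)ptr, &st) == 0)
		return 1; /* the bytes happened to name an existing file */
	return errno != EFAULT; /* ENOENT and friends still prove readability */
}
```

A caller such as a hex dumper would run this check once per output line before touching the memory, which keeps the syscall cost bounded.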
Willy Tarreau	6bdf3e9b11	MINOR: debug/cli: add some debugging commands for developers When haproxy is built with DEBUG_DEV, the following commands are added to the CLI : debug dev close <fd> : close this file descriptor debug dev delay [ms] : sleep this long debug dev exec [cmd] ... : show this command's output debug dev exit [code] : immediately exit the process debug dev hex <addr> [len]: dump a memory area debug dev log [msg] ... : send this msg to global logs debug dev loop [ms] : loop this long debug dev panic : immediately trigger a panic debug dev tkill [thr] [sig] : send signal to thread These are essentially aimed at helping developers trigger certain conditions and are expected to be complemented over time.	2019-05-20 16:59:30 +02:00
Willy Tarreau	56131ca58e	MINOR: debug: implement ha_panic() This function dumps all existing threads using the thread dump mechanism then aborts. This will be used by the lockup detection and by debugging tools.	2019-05-20 16:51:30 +02:00
Willy Tarreau	9fc5dcbd71	MINOR: tools: add dump_hex() This is used to dump a memory area into a buffer for debugging purposes.	2019-05-20 16:51:30 +02:00
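Sketch for commit 9fc5dcbd71: dumping a memory area into a buffer can look like the following. This is a simplified illustration and not HAProxy's dump_hex(). The function name, the output format (space-separated two-digit hex bytes) and the plain `char` buffer are assumptions, and the sketch does not check that the source memory is readable.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes up to <len> bytes from <addr> into <out> as two-digit hex values
 * separated by spaces. Stops early when <out> is full. Returns the number
 * of characters written, not counting the trailing NUL.
 */
static size_t hex_dump_sketch(char *out, size_t size, const void *addr, size_t len)
{
	const unsigned char *p = addr;
	size_t pos = 0;
	size_t i;

	if (size == 0)
		return 0;
	out[0] = '\0';
	for (i = 0; i < len; i++) {
		/* room for a separator, two hex digits and the NUL */
		if (pos + 4 > size)
			break;
		pos += (size_t)snprintf(out + pos, size - pos, "%s%02x",
		                        i ? " " : "", p[i]);
	}
	return pos;
}
```

With the bytes 0x01, 0xab, 0xff and a 32-byte buffer, the output is the 8-character string `01 ab ff`.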
Willy Tarreau	da5a63f8f1	CLEANUP: stream: remove an obsolete debugging test The test consisted in checking that there was always a timeout on a stream's task and was only enabled when built in development mode, but 1) it is never tested and 2) if it had been tested it would have been noticed that it triggers a bit too easily on the CLI. Let's get rid of this old one.	2019-05-20 16:19:40 +02:00
Willy Tarreau	91e6df01fa	MINOR: threads: add each thread's clockid into the global thread_info This is the per-thread CPU runtime clock, it will be used to measure the CPU usage of each thread and by the lockup detection mechanism. It must only be retrieved at the beginning of run_thread_poll_loop() since the thread must already have been started for this. But it must be done before performing any per-thread initcall so that all thread init functions have access to the clock ID. Note that it could make sense to always have this clockid available even in non-threaded situations and place the process' clock there instead. But it would add portability issues which are currently easy to deal with by disabling threads so it may not be worth it for now.	2019-05-20 11:42:25 +02:00
Willy Tarreau	522cfbc1ea	MINOR: init/threads: make the global threads an array of structs This way we'll be able to store more per-thread information than just the pthread pointer. The storage became an array of struct instead of an allocated array since it's very small (typically 512 bytes) and not worth the hassle of dealing with memory allocation on this. The array was also renamed thread_info to make its intended usage more explicit.	2019-05-20 11:37:57 +02:00
Willy Tarreau	64a47b943c	CLEANUP: memory: make the fault injection code use the OTHER_LOCK label The mem_should_fail() function sets a lock while it's building its messages, and when this was done there was no relevant label available hence the confusing use of START_LOCK. Now OTHER_LOCK is available for such use cases, so let's switch to this one instead as START_LOCK is going to disappear.	2019-05-20 11:26:12 +02:00
Willy Tarreau	619a95f5ad	MEDIUM: init/mworker: make the pipe register function a regular initcall Now that we have the guarantee that init calls happen before any other thread starts, we don't need anymore the workaround installed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we can instead rely on a regular per-thread initcall for this function. It will only be performed on worker thread #0, the other ones and the master have nothing to do, just like in the original code that was only moved to the function.	2019-05-20 11:26:12 +02:00
Willy Tarreau	3078e9f8e2	MINOR: threads/init: synchronize the threads startup It's a bit dangerous to let threads initialize at different speeds on startup. Some are still in their init functions while others are already running. It was even subject to some race condition bugs like the one fixed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup"). Here in order to secure all this, we take a very simplistic approach consisting in using half of the rendez-vous point, which is made exactly for this purpose : we first initialize the mask of the threads requesting a rendez-vous to the mask of all threads, and we simply call thread_release() once the init is complete. This guarantees that no thread will go further than the initialization code during this time. This could even safely be backported if any other issue related to an init race was discovered in a stable release.	2019-05-20 11:26:12 +02:00
William Lallemand	7b302d8dd5	MINOR: init: setenv HAPROXY_CFGFILES Set the HAPROXY_CFGFILES environment variable which contains the list of configuration files used to start haproxy, separated by semicolon.	2019-05-20 11:21:00 +02:00
Willy Tarreau	c7091d89ae	MEDIUM: debug/threads: implement an advanced thread dump system The current "show threads" command was too limited as it was not possible to dump other threads' detailed states (e.g. their tasks). This patch goes further by using thread signals so that each thread can dump its own state in turn into a shared buffer provided by the caller. Threads are synchronized using a mechanism very similar to the rendez-vous point and using this method, each thread can safely dump any of its contents and the caller can finally report the aggregated ones from the buffer. It is important to keep in mind that the list of signal-safe functions is limited, so we take care of only using chunk_printf() to write to a pre-allocated buffer. This mechanism is enabled by USE_THREAD_DUMP and is enabled by default on Linux 2.6.28+. On other platforms it falls back to the previous solution using the loop and the less precise dump.	2019-05-17 17:16:20 +02:00
Willy Tarreau	0ad46fa6f5	MINOR: stream: detach the stream from its own task on stream_free() This makes sure that the stream is not visible from its own task just before starting to free some of its components. This way we have the guarantee that a stream found in a task list is totally valid and can safely be dereferenced.	2019-05-17 17:16:20 +02:00

10974 Commits