haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-12-04 01:01:00 +01:00

Author	SHA1	Message	Date
Willy Tarreau	205f1cbf4c	BUG/MEDIUM: wdt: improve stuck task detection accuracy The fact that the watchdog timer measures the execution time from the last return from the poller tends to amplify the impact of multiple bad tasks, and may explain some of the panics reported by Felipe and Ricardo in GH issues #3084, #3092 and #3101. The problem is that we check the time if we see that the scheduler appears not to be moving anymore, but one situation may still arise and catch a bad task: - one slow task takes so long a time that it triggers the watchdog twice, emitting a warning the second time (~200ms). The scheduler is rightfully marked as stuck. - then it completes and the scheduler is no longer stuck. Many other tasks run in turn, they all take quite some time but not enough to trigger a warning. But collectively their cost adds up. - then a task takes more than the warning time (100ms), and causes the total execution time to cross the second. The watchdog is called, sees that we've spent more than 1 second since we left the poller, and marks the thread as stuck. - the task is not finished, the watchdog is called again, sees more than one second with a stuck thread and panics 100ms later. The total time away from the poller is indeed more than one second, which is very bad, but no single task caused this individually, and while the warnings are OK, the watchdog should not panic in this case. This patch revisits the approach to store the moment the scheduler was marked as stuck in the wdt context. The idea is that this date will be used to detect warnings and panics. And by doing so and exploiting the new is_sched_alive(thr), we can greatly simplify the mechanism so that the signal handling thread does the strict minimum (mark the scheduler as possibly stuck and update the stuck_start date), and only bounces to the reporting thread if the scheduler made no progress since last call. 
This means that without even doing computations in the handling thread, we can continue to avoid all bounces unless a warning is required. Then when the reporting thread is signaled, it will check the dates from the last moment the scheduler was marked, and will decide to warn or panic. The panic decision continues to pass via a TH_FL_STUCK flag to probe the code so that exceptionally slow code (e.g. live cert generation etc) can still find a way to avoid the panic if absolutely certain that things are still moving. This means that now we have the guarantee that panics will only happen if a given task spends more than one full second not moving, and that warnings will be issued for other calls crossing the warn delay boundary. This was tested using artificially slow operations, and all combinations which individually took less than a second only resulted in floods of warnings even if the total reported time in the warning was much higher, while those above one second provoked the panic. One improvement could consist in reporting the time since last stuck in the thread dumps to differentiate the individual task from the whole set. This needs to be backported to 3.2 along with the two previous patches: MINOR: sched: let's permit to share the local ctx between threads MINOR: sched: pass the thread number to is_sched_alive()	2025-10-01 10:18:53 +02:00
Willy Tarreau	25f5f357cc	MINOR: sched: pass the thread number to is_sched_alive() Now it will be possible to query any thread's scheduler state, not only the current one. This aims at simplifying the watchdog checks for reported threads. The operation is now a simple atomic xchg.	2025-10-01 10:18:53 +02:00
Willy Tarreau	3b2fb5cc15	CLEANUP: wdt: clarify the comments on the common exit path The conditions under which we reach the check for ha_panic() and ha_stuck_warning() are not super clear, let's reformulate them.	2025-05-20 16:37:06 +02:00
Willy Tarreau	0a8bfb5b90	BUG/MEDIUM: wdt: always ignore the first watchdog wakeup With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread report its own warnings"), when the TH_FL_STUCK flag was flipped on, we'd then go to the panic code instead of giving a second chance like before the commit. This can trigger rare cases that only happen with moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM: wdt: fix the stuck detection for warnings"). This is in fact due to the loss of the common "goto update_and_leave" that used to serve both the warning code and the flag setting for probation, and it's apparently what hit Christian in issue #2980. Let's make sure we exit naturally when turning the bit on for the first time. Let's also update the confusing comment at the end of the check that was left over by latest change. Since the first commit was backported to 3.1, this commit should be backported there as well.	2025-05-20 16:37:03 +02:00
Willy Tarreau	5901164789	MINOR: wdt: use is_sched_alive() instead of keeping a local ctxsw copy Now we can simply call is_sched_alive() on the local thread to verify that the scheduler is still ticking instead of having to keep a copy of the ctxsw and comparing it. It's cleaner, doesn't require to maintain a local copy, doesn't rely on activity[] (whose purpose is mainly for observation and debugging), and shows how this could be extended later to cover other use cases. Practically speaking this doesn't change anything however, the algorithm is still the same.	2025-04-17 16:25:47 +02:00
Willy Tarreau	874ba2afed	CLEANUP: debug: no longer set nor use TH_FL_DUMPING_OTHERS TH_FL_DUMPING_OTHERS was being used to try to perform exclusion between threads running "show threads" and those producing warnings. Now that it is much more cleanly handled, we don't need that type of protection anymore, which was adding to the complexity of the solution. Let's just get rid of it.	2025-04-17 16:25:47 +02:00
Willy Tarreau	c16d5415a8	MINOR: debug: make ha_stuck_warning() only work for the current thread Since we no longer call it with a foreign thread, let's simplify its code and get rid of the special cases that were relying on ha_thread_dump_fill() and synchronization with a remote thread. We're now only dumping the current thread so ha_thread_dump_one() is sufficient.	2025-04-17 16:25:47 +02:00
Willy Tarreau	a06c215f08	MEDIUM: wdt: always make the faulty thread report its own warnings Warnings remain tricky to deal with, especially for other threads as they require some inter-thread synchronization that doesn't cope very well with other parallel activities such as "show threads" for example. However there is nothing that forces us to handle them this way. The panic for example is already handled by bouncing the WDT signal to the faulty thread. This commit rearranges the WDT handler to make better use of this existing signal bouncing feature of the WDT handler so that it's no longer limited to panics but can also deal with warnings. In order not to bounce on all wakeups, we only bounce when there is a suspicion, that is, when the warning timer has been crossed. We'll let the target thread verify the stuck flag and context switch count by itself to decide whether or not to panic, warn, or just do nothing and update the counters. As a bonus, now all warning traces look the same regardless of the reporting thread: call trace(16): \| 0x6bc733 <01 00 00 e8 6d e6 de ff]: ha_dump_backtrace+0x73/0x309 > main-0x2570 \| 0x6bd37a <00 00 00 e8 d6 fb ff ff]: ha_thread_dump_fill+0xda/0x104 > ha_thread_dump_one \| 0x6bd625 <00 00 00 e8 7b fc ff ff]: ha_stuck_warning+0xc5/0x19e > ha_thread_dump_fill \| 0x7b2b60 <64 8b 3b e8 00 aa f0 ff]: wdt_handler+0x1f0/0x212 > ha_stuck_warning \| 0x7fd7e2cef3a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0 \| 0x7ffc6af9e634 <85 a6 00 00 00 0f 01 f9]: linux-vdso:__vdso_gettimeofday+0x34/0x2b0 \| 0x6bad74 <7c 24 10 e8 9c 01 df ff]: sc_conn_io_cb+0x9fa4 > main-0x2400 \| 0x67c457 <89 f2 4c 89 e6 41 ff d0]: main+0x1cf147 \| 0x67d401 <48 89 df e8 8f ed ff ff]: cli_io_handler+0x191/0xb38 > main+0x1cee80 \| 0x6dd605 <40 48 8b 45 60 ff 50 18]: task_process_applet+0x275/0xce9	2025-04-17 16:25:47 +02:00
Willy Tarreau	ebf1757dc2	BUG/MINOR: wdt/debug: avoid signal re-entrance between debugger and watchdog As seen in issue #2860, there are some situations where a watchdog could trigger during the debug signal handler, and where similarly the debug signal handler may trigger during the wdt handler. This is really bad because it could trigger some deadlocks inside inner libc code such as dladdr() or backtrace() since the code will not protect against re-entrance but only against concurrent accesses. A first attempt was made using ha_sigmask() but that's not always very convenient because the second handler is called immediately after unblocking the signal and before returning, leaving signal cascades in backtrace. Instead, let's mark which signals to block at registration time. Here we're blocking wdt/dbg for both signals, and optionally SIGRTMAX if DEBUG_DEV is used as that one may also be used in this case. This should be backported at least to 3.1.	2025-04-17 16:25:47 +02:00
Willy Tarreau	fb7874c286	MINOR: tinfo: split the signal handler report flags into 3 While signals are not recursive, one signal (e.g. wdt) may interrupt another one (e.g. debug). The problem this causes is that when leaving the inner handler, it removes the outer's flag, hence the protection that comes with it. Let's just have 3 distinct flags for regular signals, debug signal and watchdog signal. We add a 4th definition which is an aggregate of the 3 to ease testing.	2025-02-24 13:37:52 +01:00
Willy Tarreau	ddd173355c	MINOR: tinfo: add a new thread flag to indicate a call from a sig handler Signal handlers must absolutely not change anything, but some long and complex call chains may look innocuous at first glance, yet result in some subtle write accesses (e.g. pools) that can conflict with a running thread being interrupted. Let's add a new thread flag TH_FL_IN_SIG_HANDLER that is only set when entering a signal handler and cleared when leaving them. Note, we're speaking about real signal handlers (synchronous ones), not deferred ones. This will allow some sensitive call places to act differently when detecting such a condition, and possibly even to place a few new BUG_ON().	2025-02-21 17:41:38 +01:00
Willy Tarreau	7ddcdff33f	BUG/MEDIUM: debug: close a possible race between thread dump and panic() The rework of the thread dumping mechanism in 2.8 with commit 9a6ecbd590 ("MEDIUM: debug: simplify the thread dump mechanism") opened a small race, which is that a thread in the process of dumping other ones may block the other one from panicking while it's looping at the end of ha_thread_dump_fill(), or any other sequence involving the currently dumped one. This was emphasized in 3.1 with commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") that allowed to emit warnings about long-stuck threads, because in this case, what happens is that sometimes a thread starts to emit a warning (or a set of warnings), and while the warning is being awaited for, a panic finally happens and interrupts either the dumping thread, which never finishes and waits for the target's pointer to become NULL which will never happen since it was supposed to do it itself, or the currently dumped thread which could wait for the dumping thread to become ready while this one has not released the former. In order to address this, first we now make sure never to dump a thread that is already in the process of dumping another one. We're adding a new thread flag to know this situation, that is set in ha_thread_dump_fill() and cleared in ha_thread_dump_done(). And similarly, we don't trigger the watchdog on a thread waiting for another one to finish its dump, as it's likely a case of warning (and maybe even a panic) that makes them wait for each other and we don't want such cases to be reentrant. Finally, we check in the main polling loop that the flag never accidentally leaked (e.g. wrong flag manipulation) as this would be difficult to spot with bad consequences. This should be backported at least to 2.8, and should resolve github issue #2860. Thanks to Chris Staite for the very informative backtrace that exhibited the problem.	2025-02-10 18:34:26 +01:00
Willy Tarreau	24ce001771	BUG/MEDIUM: wdt: fix the stuck detection for warnings If two slow tasks trigger one warning even a few seconds apart, the watchdog code will mistakenly take this for a definite stuck task and kill the process. The reason is that since commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") the updated ctxsw count is not the correct one, instead of updating the private counter it resets the public one, preventing it from making progress and making the wdt believe that no progress was made. In addition the initial value was read from [tid] instead of [thr]. Please note that another fix is needed in debug_handler() otherwise the watchdog will fire early after the first warning or thread dump. A simple test for this is to issue several of these commands back-to-back on the CLI, which crashes an unfixed 3.1 very quickly: $ socat /tmp/sock1 - <<< "expert-mode on; debug dev loop 1000" This needs to be backported to 2.9 since the fix above was backported there. The impact on 3.0 and 2.9 is almost inexistent since the watchdog there doesn't apply the shorter warning delay, so the first call already indicates that the thread is stuck.	2024-11-21 19:58:05 +01:00
Willy Tarreau	5f4fe20116	DEBUG: wdt: set the default blocked task delay to 100 ms The warn-blocked-traffic-after can be significantly lowered. In any case, in order to be usable it must be well below the limit to have a chance to emit exploitable traces before the watchdog finally fires. Even configured at 1ms it looks very difficult to trigger it on a laptop doing SSL and compression, so applying a 100-fold factor to cover for large configs and small machines sounds sane for 3.1. In any case, even at 100ms, the service degradation becomes quite visible.	2024-11-06 18:35:42 +01:00
Willy Tarreau	6127e5a4e9	DEBUG: wdt: make the blocked traffic warning delay configurable The new global "warn-blocked-traffic-after" allows one to configure after how much time a warning should be emitted when traffic is blocked.	2024-11-06 18:35:42 +01:00
Willy Tarreau	148eb5875f	DEBUG: wdt: better detect apparently locked up threads and warn about them In order to help users detect when threads are behaving abnormally, let's try to emit a warning when one is no longer making any progress. This will allow to catch faulty situations more accurately, instead of occasionally triggering just after the long task. It will also let users know that there is something wrong with their configuration, and inspect the call trace to figure whether they're using excessively long rules or Lua for example (the usual warnings about lua-load vs lua-load-per-thread are still reported). The warning will only be emitted for threads not yet marked as stuck so as not to interfere with panic dumps and avoid sending a warning just before a panic. A tainted flag is set when this happens however (0x2000).	2024-11-06 18:35:42 +01:00
Willy Tarreau	3f4d646849	MINOR: wdt: move the local timers to a struct Better have a local struct for per-thread timers, as this will allow us to store extra info that are useful to improve accurate reporting.	2024-11-06 18:35:42 +01:00
Willy Tarreau	5405c9cdf3	BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping In 2.7, the method used to check for a sleeping thread changed with commit e7475c8e7 ("MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag"). Previously there was a global sleeping mask and now there is a flag per thread. The commit above partially broke the watchdog by looking at the current thread's flags via th_ctx instead of the reported thread's flags, and using an AND condition instead of an OR to update and leave. This can cause a wrong thread to be killed when the load is uneven. For example, when enabling busy polling and sending traffic over a single connection, all threads have their run time grow, and if the one receiving the signal is also processing some traffic, it will not match the sleeping/harmless condition and will set the stuck flag, then die upon next invocation. While it's reproducible in tests, it's unlikely to be met in field. This fix should be backported to 2.7.	2023-02-17 16:01:34 +01:00
Willy Tarreau	1229ef312d	MINOR: wdt: do not rely on threads_to_dump anymore This flag is not needed anymore as we're already marking the waiting threads as harmless, thus the thread's bit is already covered by this information. The variable was unexported.	2022-07-01 19:26:35 +02:00
Willy Tarreau	03f9b35114	MEDIUM: tinfo: add a dynamic thread-group context The thread group info is not sufficient to represent a thread group's current state as it's read-only. We also need something comparable to the thread context to represent the aggregate state of the threads in that group. This patch introduces ha_tgroup_ctx[] and tg_ctx for this. It's indexed on the group id and must be cache-line aligned. The thread masks that were global and that do not need to remain global were moved there (want_rdv, harmless, idle). Given that all the masks placed there now become group-specific, the associated thread mask (tid_bit) now switches to the thread's local bit (ltid_bit). Both are the same for nbtgroups 1 but will differ for other values. There's also a tg_ctx pointer in the thread so that it can be reached from other threads.	2022-07-01 19:15:15 +02:00
Willy Tarreau	adc1f52c92	MINOR: wdt: use ltid_bit in wdt_handler() Since commit cc7a11ee3 ("MINOR: threads: set the tid, ltid and their bit in thread_cfg") we ought not use (1UL << thr) to get the group mask for thread <thr>, but (ha_thread_info[thr].ltid_bit). wdt_handler() needs this.	2022-07-01 19:15:14 +02:00
Willy Tarreau	e7475c8e79	MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag Every single place where sleeping_thread_mask was still used was to test or set a single thread. We can now add a per-thread flag to indicate a thread is sleeping, and remove this shared mask. The wake_thread() function now always performs an atomic fetch-and-or instead of a first load then an atomic OR. That's cleaner and more reliable. This is not easy to test, as broadcast FD events are rare. The good way to test for this is to run a very low rate-limited frontend with a listener that listens to the fewest possible threads (2), and to send it only 1 connection at a time. The listener will periodically pause and the wakeup task will sometimes wake up on a random thread and will call wake_thread(): frontend test bind :8888 maxconn 10 thread 1-2 rate-limit sessions 5 Alternately, disabling/enabling a frontend in loops via the CLI also broadcasts such events, but they're more difficult to observe since this is causing connection failures.	2022-07-01 19:15:14 +02:00
Willy Tarreau	bdcd32598f	MINOR: thread: only use atomic ops to touch the flags The thread flags are touched a little bit by other threads, e.g. the STUCK flag may be set by other ones, and they're watched a little bit. As such we need to use atomic ops only to manipulate them. Most places were already using them, but here we generalize the practice. Only ha_thread_dump() does not change because it's run under isolation.	2022-07-01 19:15:14 +02:00
William Lallemand	ae053b30da	BUG/MEDIUM: wdt: don't trigger the watchdog when p is uninitialized In wdt_handler(), do not try to trigger the watchdog if the prev_cpu_time wasn't initialized. This prevents an unexpected trigger of the watchdog when it wasn't initialized yet. This case could happen in the master just after loading the configuration. This would show a trace where the <diff> value is equal to the <now> value in the trace, and the <poll> value would be 0. For example: Thread 1 is about to kill the process. *>Thread 1 : id=0x0 act=1 glob=1 wq=0 rq=0 tl=0 tlsz=0 rqsz=0 stuck=1 prof=0 harmless=0 wantrdv=0 cpu_ns: poll=0 now=6005541706 diff=6005541706 curr_task=0 Thanks to Christian Ruppert for reporting the problem. Could be backported to every stable version.	2022-05-13 11:28:08 +02:00
Willy Tarreau	a0b99536c8	REORG: thread/sched: move the thread_info flags to the thread_ctx The TI_FL_STUCK flag is manipulated by the watchdog and scheduler and describes the apparent life/death of a thread so it changes all the time and it makes sense to move it to the thread's context for an active thread.	2021-10-08 17:22:26 +02:00
Willy Tarreau	45c38e22bf	REORG: thread/clock: move the clock parts of thread_info to thread_ctx The "thread_info" name was initially chosen to store all info about threads but since we now have a separate per-thread context, there is no point keeping some of its elements in the thread_info struct. As such, this patch moves prev_cpu_time, prev_mono_time and idle_pct to thread_ctx, into the thread context, with the scheduler parts. Instead of accessing them via "ti->" we now access them via "th_ctx->", which makes more sense as they're totally dynamic, and will be required for future evolutions. There's no room problem for now, the structure still has 84 bytes available at the end.	2021-10-08 17:22:26 +02:00
Willy Tarreau	6414e4423c	CLEANUP: wdt: do not remap SI_TKILL to SI_LWP, test the values directly We used to remap SI_TKILL to SI_LWP when SI_TKILL was not available (e.g. FreeBSD) but that's ugly and since we need this only in a single switch/case block in wdt.c it's even simpler and cleaner to perform the two tests there, so let's do this.	2021-10-08 17:22:26 +02:00
Willy Tarreau	b474f43816	MINOR: wdt: move wd_timer to wdt.c The watchdog timer had no more reason for being shared with the struct thread_info since the watchdog is the only user now. Let's remove it from the struct and move it to a static array in wdt.c. This removes some ifdefs and the need for the ugly mapping to empty_t that might be subject to a cast to a long when compared to TIMER_INVALID. Now timer_t is not known outside of wdt.c and clock.c anymore.	2021-10-08 17:22:26 +02:00
Willy Tarreau	2169498941	MINOR: clock: move the clock_ids to clock.c This removes the knowledge of clockid_t from anywhere but clock.c, thus eliminating a source of includes burden. The unused clock_id field was removed from thread_info, and the definition setting of clockid_t was removed from compat.h. The most visible change is that the function now_cpu_time_thread() now takes the thread number instead of a tinfo pointer.	2021-10-08 17:22:26 +02:00
Willy Tarreau	6cb0c391e7	REORG: clock/wdt: move wdt timer initialization to clock.c The code that deals with timer creation for the WDT was moved to clock.c and is called with the few relevant arguments. This removes the need for awareness of clock_id from wdt.c and as such saves us from having to share it outside. The timer_t is also known only from both ends but not from the public API so that we don't have to create a fake timer_t anymore on systems which do not support it (e.g. macos).	2021-10-08 17:22:26 +02:00
Willy Tarreau	5554264f31	REORG: time: move time-keeping code and variables to clock.c There is currently a problem related to time keeping. We're mixing the functions to perform calculations with the os-dependent code needed to retrieve and adjust the local time. This patch extracts from time.{c,h} the parts that are solely dedicated to time keeping. These are the "now" or "before_poll" variables for example, as well as the various now_() functions that make use of gettimeofday() and clock_gettime() to retrieve the current time. The "tv_" functions moved there were also more appropriately renamed to "clock_*". Other parts used to compute stolen time are in other files, they will have to be picked next.	2021-10-08 17:22:26 +02:00
Willy Tarreau	19b18ad552	CLEANUP: wdt: use ha_tkill() instead of accessing pthread directly Instead of calling pthread_kill() directly on the pthread_t let's call ha_tkill() which does the same by itself. This will help isolate pthread_t.	2021-10-07 01:41:14 +02:00
Willy Tarreau	9310f481ce	CLEANUP: tree-wide: remove unneeded include time.h in ~20 files 20 files used to have haproxy/time.h included only for now_ms, and two were missing it for other things but used to inherit from it via other files.	2021-10-07 01:41:14 +02:00
Willy Tarreau	7f673c2cde	BUILD: wdt: include signal-t.h WDT_SIG is used there, thus signal-t.h is required. Currently it's retrieved by accident through global.h.	2021-05-08 12:29:01 +02:00
Christopher Faulet	fc633b6eff	CLEANUP: config: Return ERR_NONE from config callbacks instead of 0 Return ERR_NONE instead of 0 on success for all config callbacks that should return ERR_* codes. There is no change because ERR_NONE is a macro equal to 0. But this makes the return value more explicit.	2020-11-13 16:26:10 +01:00
Willy Tarreau	36979d9ad5	REORG: include: move the error reporting functions from log.h to errors.h Most of the files dealing with error reports have to include log.h in order to access ha_alert(), ha_warning() etc. But while these functions don't depend on anything, log.h depends on a lot of stuff because it deals with log-formats and samples. As a result it's impossible not to embark long dependencies when using ha_warning() or qfprintf(). This patch moves these low-level functions to errors.h, which already defines the error codes used at the same places. About half of the users of log.h could be adjusted, sometimes revealing other issues such as missing tools.h. Interestingly the total preprocessed size shrunk by 4%.	2020-06-11 10:18:59 +02:00
Willy Tarreau	aeed4a85d6	REORG: include: move log.h to haproxy/log{,-t}.h The current state of the logging is a real mess. The main problem is that almost all files include log.h just in order to have access to the alert/warning functions like ha_alert() etc, and don't care about logs. But log.h also deals with real logging as well as log-format and depends on stream.h and various other things. As such it forces a few heavy files like stream.h to be loaded early and to hide missing dependencies depending where it's loaded. Among the missing ones is syslog.h which was often automatically included resulting in no less than 3 users missing it. Among 76 users, only 5 could be removed, and probably 70 don't need the full set of dependencies. A good approach would consist in splitting that file in 3 parts: - one for error output ("errors" ?). - one for log_format processing - and one for actual logging.	2020-06-11 10:18:58 +02:00
Willy Tarreau	3727a8a083	REORG: include: move signal.h to haproxy/signal{,-t}.h No change was necessary. Include from wdt.c was dropped since unneeded.	2020-06-11 10:18:58 +02:00
Willy Tarreau	f268ee8795	REORG: include: split global.h into haproxy/global{,-t}.h global.h was one of the messiest files, it has accumulated tons of implicit dependencies and declares many globals that make almost all other files include it. It managed to silence a dependency loop between server.h and proxy.h by being well placed to pre-define the required structs, forcing struct proxy and struct server to be forward-declared in a significant number of files. It was split in two, one which is the global struct definition and the few macros and flags, and the rest containing the functions prototypes. The UNIX_MAX_PATH definition was moved to compat.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	48fbcae07c	REORG: tools: split common/standard.h into haproxy/tools{,-t}.h And also rename standard.c to tools.c. The original split between tools.h and standard.h dates from version 1.3-dev and was mostly an accident. This patch moves the files back to what they were expected to be, and takes care of not changing anything else. However this time tools.h was split between functions and types, because it contains a small number of commonly used macros and structures (e.g. name_desc) which in turn cause the massive list of includes of tools.h to conflict with the callers. They remain the ugliest files of the whole project and definitely need to be cleaned and split apart. A few types are defined there only for functions provided there, and some parts are even OS-specific and should move somewhere else, such as the symbol resolution code.	2020-06-11 10:18:57 +02:00
Willy Tarreau	3f567e4949	REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h This splits the hathreads.h file into types+macros and functions. Given that most users of this file used to include it only to get the definition of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h (i.e. types and macros). All the thread management was left to haproxy/thread.h. It's worth noting the drop of the trailing "s" in the name, to remove the permanent confusion that arises between this one and the system implementation (no "s") and the makefile's option (no "s"). For consistency, src/hathreads.c was also renamed thread.c. A number of files were updated to only include thread-t which is the one they really needed. Some future improvements are possible like replacing empty inlined functions with macros for the thread-less case, as building at -O0 disables inlining and causes these ones to be emitted. But this really is cosmetic.	2020-06-11 10:18:56 +02:00
Willy Tarreau	2a83d60662	REORG: include: move debug.h from common/ to haproxy/ The debug file is cleaner now and does not depend on much anymore.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Olivier Houchard	de01ea9878	MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h. Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into types/signal.h, so that we can access them in another file. We need those definitions to avoid blocking those signals when running __signal_process_queue(). This should be backported to 2.1, 2.0 and 1.9.	2020-03-18 13:07:19 +01:00
Willy Tarreau	0627815f70	BUILD: wdt: only test for SI_TKILL when compiled with thread support SI_TKILL is not necessarily defined on older systems and is used only with the pthread_kill() call a few lines below, so it should also be subject to the USE_THREAD condition.	2020-03-10 09:26:17 +01:00
Willy Tarreau	e58114e0e5	MINOR: wdt: do not depend on USE_THREAD There is no reason for restricting the use of the watchdog to threads anymore, as it works perfectly without threads as well.	2020-03-04 12:02:27 +01:00
Willy Tarreau	d6f1966543	MEDIUM: wdt: fall back to CLOCK_REALTIME if CLOCK_THREAD_CPUTIME is not available At least FreeBSD has a fully functional CLOCK_THREAD_CPUTIME but it cannot create a timer on it. This is not a problem since our timer is only used to measure each thread's usage using now_cpu_time_thread(). So by just replacing this clock with CLOCK_REALTIME we allow such platforms to periodically call the wdt and check the thread's CPU usage. The consequence is that even on a totally idle system there will still be a few extra periodic wakeups, but the watchdog becomes usable there as well.	2020-03-04 12:02:27 +01:00
Willy Tarreau	7259fa2b89	BUG/MINOR: wdt: do not return an error when the watchdog couldn't be enabled On operating systems that do not support creating a timer on POSIX_THREAD_CPUTIME we emit a warning but we return an error so the process fails to start, which is absurd. Let's return a success once the warning is emitted instead. This may be backported to 2.1 and 2.0.	2020-03-04 12:02:27 +01:00
Willy Tarreau	c1563e5474	MINOR: wdt: always clear sigev_value to make valgrind happy In issue #471 it was reported that valgrind sometimes complains about timer_create() being called with uninitialized bytes. These are in fact the bits from sigev_value.sival_ptr that are not part of sival_int that are tagged as such, as valgrind has no way to know we're using the int instead of the ptr in the union. It's cheap to initialize the field so let's do it.	2020-02-26 14:05:20 +01:00
David Carlier	a92c5cec2d	BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info On Darwin, the thread_info name exists as a standard function thus we need to rename our array to ha_thread_info to fix this conflict.	2019-10-17 07:15:17 +02:00
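
The warn/panic decision described in commit 205f1cbf4c above can be modeled roughly as follows. This is a simplified Python sketch of the idea, not HAProxy's C code: ThreadState, wdt_tick() and simulate() are hypothetical names, and the fixed 100 ms timer period and the single-tick probation are assumptions made for illustration. Only is_sched_alive(), stuck_start, TH_FL_STUCK, the 100 ms warning delay and the one-second panic threshold are taken from the commit messages.

```python
from dataclasses import dataclass

WARN_DELAY_MS = 100    # default warning delay mentioned in the log
PANIC_DELAY_MS = 1000  # "one full second not moving"


@dataclass
class ThreadState:
    """Hypothetical per-thread watchdog state (illustration only)."""
    sched_alive: bool = False  # set by the scheduler when it makes progress
    stuck_start: int = 0       # date the scheduler was last marked as possibly stuck
    stuck_flag: bool = False   # models TH_FL_STUCK (probation before a panic)


def is_sched_alive(th: ThreadState) -> bool:
    """Model of the exchange: return the previous value and clear it."""
    alive, th.sched_alive = th.sched_alive, False
    return alive


def wdt_tick(th: ThreadState, now_ms: int) -> str:
    """Decide what one watchdog wakeup does: 'ok', 'warn', 'probation' or 'panic'."""
    if is_sched_alive(th):
        # Progress since the last wakeup: restart the measurement from here.
        th.stuck_start = now_ms
        th.stuck_flag = False
        return "ok"
    stuck_for = now_ms - th.stuck_start
    if stuck_for >= PANIC_DELAY_MS:
        if th.stuck_flag:
            return "panic"
        th.stuck_flag = True  # give the code one last chance to clear the flag
        return "probation"
    if stuck_for >= WARN_DELAY_MS:
        return "warn"
    return "ok"


def simulate(task_durations_ms, tick_ms=100):
    """Run tasks back to back, waking the watchdog every tick_ms (assumed period)."""
    th = ThreadState()
    now = 0
    events = []
    for duration in task_durations_ms:
        th.sched_alive = True  # the scheduler switches to a new task
        end = now + duration
        next_tick = (now // tick_ms + 1) * tick_ms
        while next_tick <= end:
            event = wdt_tick(th, next_tick)
            events.append(event)
            if event == "panic":
                return events  # the process would be killed here
            next_tick += tick_ms
        now = end
    return events


if __name__ == "__main__":
    # Five 300 ms tasks: 1.5 s in total, yet no single task reaches one second.
    print(simulate([300] * 5))
    # One 1.5 s task: warnings first, then probation, then panic.
    print(simulate([1500]))
```

In this model, the five slow tasks only produce warnings because stuck_start is refreshed each time the scheduler is seen alive, while the single long task is the only case that reaches the panic.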

55 Commits