Commit Graph

88 Commits

Willy Tarreau
fc800b6cb7 MINOR: task/profiling: do not record task_drop_running() as a caller
task_drop_running() is used to remove the RUNNING bit and to check
whether the task got a new wakeup from itself while it was running. Thus
each time task_drop_running() marks itself as a caller, it in fact
overwrites the previous caller that actually woke up the task, such as below:

Tasks activity over 10.439 sec till 0.000 sec ago:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet            57895273   6.396m    6.628us   2.733h    170.0us <- run_tasks_from_lists@src/task.c:658 task_drop_running

It's better not to mark this function as a caller, and to keep the original one:

Tasks activity over 13.834 sec till 0.000 sec ago:
  function                      calls   cpu_tot   cpu_avg   lat_tot   lat_avg
  task_run_applet            62424582   5.825m    5.599us   5.717h    329.7us <- sc_app_chk_rcv_applet@src/stconn.c:952 appctx_wakeup
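
A hedged illustration of the mechanism (the structures below are
simplified stand-ins, not HAProxy's exact definitions): the fix amounts
to letting a wakeup path skip the caller update.

  struct ha_caller {
          const char *func;   /* name of the function at the call place */
          const char *file;   /* source file of the call place */
          int line;           /* source line of the call place */
  };

  struct task_dbg {
          const struct ha_caller *caller;   /* last recorded waker */
  };

  /* Record the waker only when one is provided; passing NULL (what
   * task_drop_running() now effectively does) preserves the previous
   * caller so the profile keeps pointing at the real wakeup site. */
  static inline void task_record_caller(struct task_dbg *t,
                                        const struct ha_caller *where)
  {
          if (where)
                  t->caller = where;
  }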
2023-11-27 11:24:52 +01:00
Willy Tarreau
a13f8425f0 MINOR: task/debug: make task_queue() and task_schedule() possible callers
It's common to see process_stream() being woken up by wake_expired_tasks
in the profiling output, without knowing which timeout caused this. By
making it possible to record the call places of task_queue() and
task_schedule(), and by making wake_expired_tasks() explicitly not
replace the recorded caller, we'll be able to know which task_queue() or
task_schedule() triggered a given wakeup.

For example below:
  process_stream                51200   311.4ms   6.081us   34.59s    675.6us <- run_tasks_from_lists@src/task.c:659 task_queue
  process_stream                19227   70.00ms   3.640us   9.813m    30.62ms <- sc_notify@src/stconn.c:1136 task_wakeup
  process_stream                 6414   102.3ms   15.95us   8.093m    75.70ms <- stream_new@src/stream.c:578 task_wakeup

This makes it visible that it's in fact run_tasks_from_lists() which
requeues the task, applying the task->expire returned by the ->process()
function itself.
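
A hedged sketch of that requeue step with simplified types (the real
run_tasks_from_lists() is far more involved):

  struct task {
          struct task *(*process)(struct task *t, void *ctx, unsigned int state);
          void *context;
          unsigned int state;
          int expire;                    /* next expiration tick, 0 = none */
  };

  extern void task_queue(struct task *t);  /* insert into the wait queue */

  static void run_one_task(struct task *t)
  {
          /* the handler may update ->expire before returning the task */
          t = t->process(t, t->context, t->state);
          if (t && t->expire)
                  task_queue(t);  /* the call place the profile reports */
  }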
2023-11-09 17:24:00 +01:00
Willy Tarreau
0eb0914dba MINOR: task/debug: explicitly support passing a null caller to wakeup functions
This is used for tracing and profiling. By permitting a NULL caller, we
allow a call site to explicitly pass zero to state that the currently
recorded caller must not be replaced. This will soon be used by
wake_expired_tasks() to avoid replacing the caller in the expire loop.
2023-11-09 17:24:00 +01:00
Willy Tarreau
28ff1a5d56 MINOR: tasks/stats: report the number of niced tasks in "show info"
We currently know the number of niced tasks in the run queue, but we
don't expose it. That's a pity because it can give a hint about what
share of the load is relevant. For example, if one runs a Lua script
that was purposely reniced, or if a stats page or the CLI is hammered
with slow operations, seeing them appear there can help identify what
part of the load is not caused by the traffic, and improve monitoring
systems or autoscalers.
2023-09-06 17:44:44 +02:00
Tim Duesterhus
3a8c63d48d MINOR: Make tasklet_free() safe to be called with NULL
Make this freeing function safe to call with NULL, like the other
freeing functions, as discussed in GitHub issue #2126.
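
The change follows the usual NULL-safe freeing pattern, sketched here
(simplified: the real function also unlinks the tasklet first, and the
helper name below is hypothetical):

  struct tasklet;
  extern void __release_tasklet(struct tasklet *tl);  /* hypothetical */

  static inline void tasklet_free(struct tasklet *tl)
  {
          if (!tl)
                  return;   /* calling with NULL is now a harmless no-op */
          __release_tasklet(tl);
  }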
2023-04-23 00:28:25 +02:00
Willy Tarreau
fc50b9dd14 BUG/MAJOR: sched: protect task during removal from wait queue
The issue addressed by commit fbb934da9 ("BUG/MEDIUM: stick-table: fix
a race condition when updating the expiration task") is still present
when thread groups are enabled, but this time it lies in the scheduler.

What happens is that a task configured to run anywhere might already
have been queued into one group's wait queue. When updating a stick
table entry, sometimes the task will have to be dequeued and requeued.

For this a lock is taken on the current thread group's wait queue lock,
but while this is necessary for the queuing, it's not sufficient for
dequeuing since another thread might be in the process of expiring this
task under its own group's lock which is different. This is easy to test
using 3 stick tables with 1ms expiration, 3 track-sc rules and 4 thread
groups. The process crashes almost instantly under heavy traffic.

One approach could consist in storing in the task's descriptor the
group number it was queued under (we don't need 32 bits to store the
thread id, so it's possible to use one short for the tid and another
one for the tgrp). Sadly, no safe way to do this was found, because the
race remains at the moment the thread group number is checked, as it
might be in the process of being changed by another thread. It seems
that a working approach could consist in always keeping the task
associated with one group, and only allowing that to change under this
group's lock, so that any code trying to change it would have to
iteratively read it and lock its group until the value matches,
confirming it really holds the correct lock. But this seems a bit
complicated, particularly with wake_expired_tasks() which already uses
upgradable locks to switch from a read state to a write state.

Given that the shared tasks are not that common (stick-table expirations,
rate-limited listeners, maybe resolvers), it doesn't seem worth the extra
complexity for now. This patch takes a simpler and safer approach
consisting in switching back to a single wq_lock, but still keeping
separate wait queues. Given that shared wait queues are almost always
empty and that otherwise they're scanned under a read lock, the
contention remains manageable and most of the time the lock doesn't
even need to be taken since such tasks are not present in a group's
queue. In essence, this patch reverts half of the aforementioned
patch. This was tested and confirmed to work fine, without observing
any performance degradation under any workload. The performance with
8 groups on an EPYC 74F3 and 3 tables remains twice the one of a
single group, with the contention remaining on the table's lock first.

No backport is needed.
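
A hedged sketch of the retained scheme, with POSIX rwlocks standing in
for HAProxy's own lock wrappers: a single process-wide lock covers all
per-group wait queues, taken shared for scanning and exclusive for
queuing and dequeuing.

  #include <pthread.h>

  #define MAX_TGROUPS 16                   /* illustrative bound */

  struct eb_root { void *b[2]; };          /* placeholder ebtree root */
  struct task;

  static pthread_rwlock_t wq_lock = PTHREAD_RWLOCK_INITIALIZER;
  static struct eb_root shared_wq[MAX_TGROUPS];  /* one WQ per group */

  /* Dequeuing is now always safe: whichever group's tree holds the
   * task, the same single lock serializes it against expiration. */
  static void task_unlink_wq_safe(struct task *t)
  {
          pthread_rwlock_wrlock(&wq_lock);
          /* ... locate and remove t from its group's tree ... */
          pthread_rwlock_unlock(&wq_lock);
  }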
2022-11-22 09:10:08 +01:00
Willy Tarreau
2830d282e5 DEBUG: task: simplify the caller recording in DEBUG_TASK
Instead of storing an index that's swapped at every call, let's use the
two pointers as a shifting history. Now we have a permanent "caller"
field that records the last caller, and an optional prev_caller field in
the debug section enabled by DEBUG_TASK that keeps a copy of the
previous caller. This way, not only is it much easier to follow what's
happening during debugging, but it saves 8 bytes in the struct task in
debug mode, still keeps it under 2 cache lines in nominal mode, and
will finally be usable everywhere and later in profiling.

The caller_idx was also used as a hint that the entry was freed, in order
to detect wakeup-after-free. This was changed by setting caller to -1
instead and preserving its value in caller[1].

Finally, the operations were made atomic. That's not critical but since
it's used for debugging and race conditions represent a significant part
of the issues in multi-threaded mode, it seems wise to at least eliminate
some possible factors of faulty analysis.
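
A hedged sketch of the shifting history (field names follow the commit
text; atomics shown with GCC builtins):

  struct ha_caller;                     /* func/file/line of a call place */

  struct task_dbg {
          const struct ha_caller *caller;      /* last waker, always there */
  #ifdef DEBUG_TASK
          const struct ha_caller *prev_caller; /* one step of history */
  #endif
  };

  static inline void task_update_caller(struct task_dbg *t,
                                        const struct ha_caller *where)
  {
  #ifdef DEBUG_TASK
          /* shift the current caller into the history slot first */
          __atomic_store_n(&t->prev_caller, t->caller, __ATOMIC_RELAXED);
  #endif
          __atomic_store_n(&t->caller, where, __ATOMIC_RELAXED);
  }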
2022-09-08 14:30:38 +02:00
Willy Tarreau
e08af9a0f4 DEBUG: task: use struct ha_caller instead of arrays of file:line
This reduces the task struct by 8 bytes, reduces the code size a little
bit by simplifying the calling convention (one argument dropped), and
as a bonus provides the function name in the caller.
2022-09-08 14:30:38 +02:00
Willy Tarreau
d2b2ad902b DEBUG: task: define a series of wakeup types for tasks and tasklets
The WAKEUP_* values will be used to report how a task/tasklet was woken
up, and task_wakeup_type_str() will report the associated function name.
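
A hedged sketch of what such a mapping can look like (the actual
WAKEUP_* list may differ):

  enum {
          WAKEUP_TYPE_TASK_WAKEUP = 0,
          WAKEUP_TYPE_TASK_INSTANT_WAKEUP,
          WAKEUP_TYPE_TASKLET_WAKEUP,
          WAKEUP_TYPE_TASK_QUEUE,
  };

  static const char *task_wakeup_type_str(unsigned int t)
  {
          switch (t) {
          case WAKEUP_TYPE_TASK_WAKEUP:         return "task_wakeup";
          case WAKEUP_TYPE_TASK_INSTANT_WAKEUP: return "task_instant_wakeup";
          case WAKEUP_TYPE_TASKLET_WAKEUP:      return "tasklet_wakeup";
          case WAKEUP_TYPE_TASK_QUEUE:          return "task_queue";
          default:                              return "?";
          }
  }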
2022-09-08 14:30:16 +02:00
Willy Tarreau
6a28a30efa MINOR: tasks: do not keep cpu and latency times in struct task
It was a mistake to put these two fields in the struct task. This
was added in 1.9 via commit 9efd7456e ("MEDIUM: tasks: collect per-task
CPU time and latency"). These fields are used solely by streams in
order to report the measurements via the lat_ns* and cpu_ns* sample
fetch functions when task profiling is enabled. For the rest of the
tasks, this is pure CPU waste when profiling is enabled, and memory
waste 100% of the time, as the point where these latencies and usages
are measured is in the profiling array.

Let's move the fields to the stream instead, and have process_stream()
retrieve the relevant info from the thread's context.

The struct task is now back to 120 bytes, i.e. almost two cache lines,
with 32 bits still available.
2022-09-08 14:19:15 +02:00
Willy Tarreau
04e50b3d32 CLEANUP: task: rename ->call_date to ->wake_date
This field is misnamed because its real and important content is the
date the task was woken up, not the date it was called. It temporarily
holds the call date during execution, but this remains confusing. In
fact, before the latency measurements were possible, it was indeed a
call date. Thus it will now be called wake_date.

This change is necessary because a subsequent fix will require the
introduction of the real call date in the thread ctx.
2022-09-08 14:19:15 +02:00
Willy Tarreau
768c2c5678 MINOR: task: permanently enable latency measurement on tasklets
When tasklet latency measurement was enabled in 2.4 with commit b2285de04
("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"),
the feature was conditioned on DEBUG_TASK because the field would add 8
bytes to the struct tasklet.

This approach was not a very good idea because the struct ends on an int
anyway, thus it finishes with a 32-bit hole regardless of the presence
of this field. What is true however is that adding it turned a 64-byte
struct into a 72-byte one when caller debugging is enabled.

This patch revisits this with a minor change. Now only the lowest 32
bits of the call date are stored, so they always fit in the remaining
hole, and this makes it possible to remove the dependency on DEBUG_TASK.
With debugging off, we're now seeing a 48-byte struct, and with debugging
on it's exactly 64 bytes, thus still exactly one cache line. 32 bits
allow a latency of about 4 seconds on a tasklet, which already indicates
a completely dead process, so there's no point in storing the upper bits
at all. And even if it happened once in a while, the lost upper bits
would not add any real value to the debug reports. Also, one tasklet
wakeup out of every 4 billion will now not be sampled due to the test on
the value itself; here too we just don't care, it's statistics and the
measurements are not 9-digit accurate anyway.
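
The arithmetic behind the 4-second figure: 2^32 ns is about 4.29 s, and
unsigned subtraction wraps correctly within that window. A hedged
sketch:

  #include <stdint.h>
  #include <time.h>

  static uint32_t mono_now_ns32(void)
  {
          struct timespec ts;

          clock_gettime(CLOCK_MONOTONIC, &ts);
          return (uint32_t)((uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec);
  }

  /* wake_date is sampled at wakeup; at call time the latency is: */
  static uint32_t tasklet_latency_ns(uint32_t wake_date)
  {
          return mono_now_ns32() - wake_date;  /* wraps safely below ~4.29s */
  }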
2022-09-08 14:19:15 +02:00
Willy Tarreau
0fae3a0360 BUG/MINOR: task: make task_instant_wakeup() work on a task not a tasklet
There's a subtle (harmless) bug in task_instant_wakeup(): as it uses
some tasklet code instead of task code, the debug part also acts on the
tasklet equivalent, and the call_date is only set when DEBUG_TASK is
set, instead of unconditionally as with tasks. As such, without this
debugging macro, call dates are not updated for tasks woken this way.

There isn't any impact yet because this function was introduced in 2.6 to
solve certain classes of issues and is not used yet, and in the worst case
it would only affect the reported latency time.

This may be backported to 2.6 in case a future fix would depend on it but
currently will not fix existing code.
2022-09-08 14:19:15 +02:00
Willy Tarreau
f27acd961e BUG/MINOR: task: always reset a new tasklet's call date
The tasklet's call date was not reset, so if profiling was enabled while
some tasklets were in the run queue, their initial random value could be
used to preload a bogus initial latency value into the task profiling bin.
Let's just zero the initial value.

This should be backported to 2.4 as it was brought with initial commit
b2285de04 ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK
is set"). The impact is very low though.
2022-09-08 14:19:15 +02:00
Willy Tarreau
341ac99f4d BUG/MEDIUM: task: relax one thread consistency check in task_unlink_wq()
While testing the fix for the previous issue related to reloads with
hard_stop_after, I've met another one which could spuriously produce:

  FATAL: bug condition "t->tid >= 0 && t->tid != tid" matched at include/haproxy/task.h:266

In 2.3-dev2, we've added more consistency checks for a number of bug-
inducing programming errors related to the tasks, via commit e5d79bccc
("MINOR: tasks/debug: add a few BUG_ON() to detect use of wrong timer
queue"), and this check comes from there.

The problem that happens here is that when hard-stop-after is set, we
can abort the current thread even if there are still ongoing checks
(or connections in fact). In this case some tasks are present in a
thread's wait queue and are thus bound exclusively to this thread.

During deinit(), the collect and cleanup of all memory areas also
stops servers and kills their check tasks. And calling task_destroy()
does in turn call task_unlink_wq()... except that it's called from
thread 0 which doesn't match the initially planned thread number.

Several approaches are possible. One of them would consist in letting
threads perform their own cleanup (tasks, pools, FDs, etc). This would
possibly be even faster since done in parallel, but some corner cases
might be way more complicated (e.g. who will kill a check's task, or
what to do with a task found in a local wait queue or run queue, and
what about other consistency checks this could violate?).

Thus for now this patch takes an easier and more conservative approach,
consisting in admitting that when the process is stopping, this rule is
not necessarily valid, and letting thread 0 collect all other threads'
garbage.

As such this patch can be backported to 2.4.
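
A hedged sketch of the relaxed condition (the exact form in the patch
may differ): the consistency check is skipped once the process is
stopping.

  #include <stdlib.h>

  #define MODE_STOPPING 0x01             /* illustrative flag value */
  #define BUG_ON(cond) do { if (cond) abort(); } while (0)

  extern unsigned int global_mode;       /* stand-in for global.mode */
  extern int tid;                        /* current thread's ID */

  static void check_task_tid(int task_tid)
  {
          /* during deinit, thread 0 legitimately frees other threads'
           * tasks, so the check must not trip anymore */
          BUG_ON(task_tid >= 0 && task_tid != tid &&
                 !(global_mode & MODE_STOPPING));
  }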
2022-08-10 18:03:11 +02:00
Ilya Shipitsin
3b64a28e15 CLEANUP: assorted typo fixes in the code and comments
This is the 31st iteration of typo fixes.
2022-08-06 17:12:51 +02:00
Willy Tarreau
91a7c164b4 MINOR: task: move the niced_tasks counter to the thread group context
This one is only used as a hint to improve scheduling latency, so there
is no more point in keeping it global since each thread group handles
its own run queue.
2022-07-15 19:43:10 +02:00
Willy Tarreau
b0e7712fb2 MEDIUM: task/thread: move the task shared wait queues per thread group
Their migration was postponed for convenience only but now's time for
having the shared wait queues per thread group and not just per process,
otherwise the WQ lock uses a huge amount of CPU alone.
2022-07-15 19:43:10 +02:00
Willy Tarreau
bdcd32598f MINOR: thread: only use atomic ops to touch the flags
The thread flags are touched a little bit by other threads, e.g. the STUCK
flag may be set by other ones, and they're watched a little bit. As such
we need to use only atomic ops to manipulate them. Most places were already
using them, but here we generalize the practice. Only ha_thread_dump() does
not change because it's run under isolation.
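
A hedged sketch of the generalized practice, with GCC builtins standing
in for HAProxy's HA_ATOMIC_* wrappers (flag name/value illustrative):

  #define TH_FL_STUCK 0x1

  struct thread_flags {
          unsigned int flags;
  };

  static inline void th_set_stuck(struct thread_flags *th)
  {
          __atomic_or_fetch(&th->flags, TH_FL_STUCK, __ATOMIC_RELAXED);
  }

  static inline int th_is_stuck(const struct thread_flags *th)
  {
          return __atomic_load_n(&th->flags, __ATOMIC_RELAXED) & TH_FL_STUCK;
  }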
2022-07-01 19:15:14 +02:00
Willy Tarreau
319d136ff9 MEDIUM: task: use regular eb32 trees for the run queues
Since we don't mix tasks from different threads in the run queues
anymore, we don't need to use the eb32sc_ trees and we can switch
to the regular eb32 ones. This uses cheaper lookup and insert code,
and a 16-thread test on the queues shows a performance increase
from 570k RPS to 585k RPS.
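
A hedged sketch of run-queue access with the plain eb32 API from the
bundled ebtree (simplified; the real queue key mixes niceness and a
tick counter):

  #include <import/eb32tree.h>

  static struct eb_root rqueue;          /* zero-initialized = empty tree */

  static void rq_insert(struct eb32_node *node, unsigned int key)
  {
          node->key = key;               /* priority-adjusted position */
          eb32_insert(&rqueue, node);    /* cheaper than eb32sc_insert() */
  }

  static struct eb32_node *rq_pick_next(void)
  {
          return eb32_first(&rqueue);    /* lowest key runs first */
  }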
2022-07-01 19:15:14 +02:00
Willy Tarreau
c958c70ec8 MINOR: task: replace global_tasks_mask with a check for tree's emptiness
This bit field used to be a per-thread cache of the result of the last
lookup for the presence of a task for each thread in the shared run
queue. Since we now know that each thread has its own shared run queue,
a test of emptiness is now sufficient to decide whether or not the
shared tree has a task for the current thread. Let's just remove this
mask.
2022-07-01 19:15:14 +02:00
Willy Tarreau
da195e8aab MINOR: task: remove grq_total and use rq_total instead
grq_total was only used to know how many tasks were being queued in the
global runqueue for stats purposes, and that was transferred to the per
thread rq_total counter once assigned. We don't need this anymore since
we know where they are, so let's just directly update rq_total and drop
that one.
2022-07-01 19:15:14 +02:00
Willy Tarreau
b17dd6cc19 MEDIUM: task: replace the global rq_lock with a per-rq one
There's no point having a global rq_lock now that we have one shared RQ
per thread, let's have one lock per runqueue instead.
2022-07-01 19:15:14 +02:00
Willy Tarreau
6f78038d72 MEDIUM: task: move the shared runqueue to one per thread
Since the shared runqueue is only used to hold tasks assigned to known
threads, let's move that runqueue to each of these threads. The goal
will be to arrange an N*(N-1) mesh instead of a central contention
point.

The global_rqueue_ticks had to be dropped (for good) since we'll now
use the per-thread rqueue_ticks counter for both trees.

A few points to note:
  - the rq_lock still remains the global one for now so there should not
    be any gain in doing this, but should this trigger any regression, it
    is important to detect whether it's related to the lock or to the tree.

  - there's no more reason for using the scope-based version of the ebtree
    now, we could switch back to the regular eb32_tree.

  - it's worth checking if we still need TASK_GLOBAL (probably only to
    delete a task in one's own shared queue).
2022-07-01 19:15:14 +02:00
Willy Tarreau
3961608f63 CLEANUP: task: remove the unused task_unlink_rq()
This function stopped being used before 2.4 because either the task is
dequeued by the scheduler itself, which knows where to find it, or it's
killed by any thread, in which case task_kill() must be used as only
that one is safe.

It's difficult to say whether task_unlink_rq() is still safe, but once
the lock moves to a thread declared in the task itself, it will be even
more difficult to keep it safe.

Let's just remove it now before someone reuses it and causes trouble.
2022-07-01 19:15:14 +02:00
Willy Tarreau
eed3911a54 MINOR: task: replace task_set_affinity() with task_set_thread()
The latter passes a thread ID instead of a mask, making the code simpler.
2022-07-01 19:15:14 +02:00
Willy Tarreau
159e3acf5d MEDIUM: task: remove TASK_SHARED_WQ and only use t->tid
TASK_SHARED_WQ was set upon task creation and never changed afterwards.
Thus if a task was created to run anywhere (e.g. a check or a Lua task),
all its timers would always pass through the shared timers queue with a
lock. Now we know that tid<0 indicates a shared task, so we can use that
to decide whether or not to use the shared queue. The task might be
migrated using task_set_affinity() but it's always dequeued first so
the check will still be valid.

Not only this removes a flag that's difficult to keep synchronized with
the thread ID, but it should significantly lower the load on systems with
many checks. A quick test with 5000 servers and fast checks that were
saturating the CPU shows that the check rate increased by 20% (hence the
CPU usage dropped by 17%). It's worth noting that run_task_lists() almost
no longer appears in perf top now.
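
The resulting test is trivial; a hedged sketch (helper name
hypothetical):

  struct task_hdr {
          int tid;   /* >= 0: bound to that thread; < 0: runs anywhere */
  };

  /* shared tasks are exactly those with tid < 0, and only those go
   * through the locked shared timers queue */
  static inline int task_is_shared(const struct task_hdr *t)
  {
          return t->tid < 0;
  }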
2022-07-01 19:15:14 +02:00
Willy Tarreau
1f4bf7215a MEDIUM: task: only keep task_new_*() and drop task_new()
As previously advertised in comments, the mask-based task_new() is now
gone. The low-level function now is task_new_on(), which takes a thread
number or a negative value for "any thread"; the latter is turned to
zero for thread-less builds since there's no shared WQ in this case. The
task_new_here() and task_new_anywhere() functions were adjusted
accordingly.
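
A hedged usage sketch of the split constructors (declarations follow
the commit message; error handling elided):

  struct task;
  extern struct task *task_new_on(int thr);     /* one specific thread */
  extern struct task *task_new_here(void);      /* the calling thread */
  extern struct task *task_new_anywhere(void);  /* tid < 0, shared WQ */

  static struct task *new_check_task(int thr)
  {
          /* binding a check to one thread keeps it off the shared WQ */
          return thr >= 0 ? task_new_on(thr) : task_new_anywhere();
  }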
2022-07-01 19:15:14 +02:00
Willy Tarreau
0ad00befc1 CLEANUP: task: remove thread_mask from the struct task
It was not used anymore since everything moved to ->tid, so let's
remove it.
2022-07-01 19:15:14 +02:00
Willy Tarreau
29ffe26733 MAJOR: task: use t->tid instead of ffsl(t->thread_mask) to take the thread ID
At several places we need to figure out the ID of the first thread
allowed to run a task. Till now this was performed using
my_ffsl(t->thread_mask), but since we now have the thread ID stored in
the task, let's use it instead. This is tagged major because it starts
to assume that tid<0 is strictly equivalent to atleast2(thread_mask),
and that, as such, the current thread is always among the allowed ones.
2022-07-01 19:15:14 +02:00
Willy Tarreau
5b8e054732 MEDIUM: task/debug: move the ->thread_mask integrity checks to ->tid
Let's make sure the new ->tid field is always correct instead of checking
the thread mask.
2022-07-01 19:15:14 +02:00
Willy Tarreau
6ef52f4479 MEDIUM: task: add and preset a thread ID in the task struct
The tasks currently rely on a mask but do not have an assigned thread ID,
contrary to tasklets. However, in practice they're either running on a
single thread or on any thread, so that it will be worth simplifying all
this in order to ease the transition to the thread groups.

This patch introduces a "tid" field in the task struct, that's either
the number of the thread the task is attached to, or a negative value
if the task is not bound to a thread (i.e. its mask is all_threads_mask).

The new ID is only set and updated but not used yet.
2022-07-01 19:15:14 +02:00
Frédéric Lécaille
ad548b54a7 MINOR: task: Add tasklet_wakeup_after()
We want to be able to schedule a tasklet onto a thread after the current
tasklet is done. What we have to do is to insert this tasklet at the head
of the thread's task list. Furthermore, we would like to serialize the
tasklets: they must be run in the same order as the one in which they
were scheduled. This is implemented by passing a list of tasklets as a
parameter (see the <head> parameters), which must be reused for
subsequent calls. _tasklet_wakeup_after_on() is implemented to accomplish
this job. tasklet_wakeup_after_on() and tasklet_wakeup_after() are only
wrapper macros around _tasklet_wakeup_after_on(). tasklet_wakeup_after_on()
does exactly the same thing as _tasklet_wakeup_after_on() without having
to pass the filename and line number as parameters (useful when
DEBUG_TASK is enabled). tasklet_wakeup_after() also hides the thread
parameter, which is the <tl> tasklet's thread ID.
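
A rough, heavily hedged usage sketch; the real signatures may differ,
the point being only that the same <head> list is threaded through
successive calls so the wakeups stay ordered:

  struct list;
  struct tasklet;

  /* assumed shape only: returns the updated insertion point */
  extern struct list *tasklet_wakeup_after(struct list *head,
                                           struct tasklet *tl);

  static void wake_in_order(struct tasklet *a, struct tasklet *b)
  {
          struct list *head = NULL;              /* no insertion point yet */

          head = tasklet_wakeup_after(head, a);  /* a runs right after us */
          head = tasklet_wakeup_after(head, b);  /* then b, in order */
  }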
2022-06-30 14:24:04 +02:00
Willy Tarreau
3ccb14d60d MINOR: thread: get rid of MAX_THREADS_MASK
This macro was used both for binding and for lookups. When binding tasks
or FDs, using all_threads_mask instead is better as it will later be
per-group. For lookups, ~0UL always does the job. Thus in practice the
macro was already almost unused, since the rest of the code could run
fine with a constant of all ones there.
2022-06-14 11:18:40 +02:00
Willy Tarreau
680ed5f28b MINOR: task: move profiling bit to per-thread
Instead of having a global mask of all the profiled threads, let's have
one flag per thread in each thread's flags. They are never accessed more
than one at a time and are better located inside the threads' contexts
for both performance and scalability.
2022-06-14 10:38:03 +02:00
Christopher Faulet
a45403f965 Revert "BUG/MINOR: task: Don't defer tasks release when HAProxy is stopping"
This reverts commit d9404b464f.

In fact, there is a BUG_ON() in the __task_free() function to make sure
the task is no longer in the wait-queue or the run-queue. Because the
patch tries to fix a "leak" on deinit, it is safer to revert it. There
is no reason to introduce a potential bug for this kind of issue, and
there is no reason to impact the normal use-cases at runtime with
additional conditions just to remove a task on deinit.
2022-05-25 16:41:52 +02:00
Christopher Faulet
d9404b464f BUG/MINOR: task: Don't defer tasks release when HAProxy is stopping
A running or queued task is not released when task_destroy() is called,
except if it is the current task. Its process function is set to NULL and
we let the scheduler release the task. However, when HAProxy is stopping,
this never happens and some tasks may leak. To fix the issue, we now also
rely on the global MODE_STOPPING flag. When this flag is set, the task is
always immediately released.

This patch should fix issue #1714. It could be backported as far as 2.4,
but it's not a real problem in practice because it only happens on
deinit. The leak exists in previous versions, but the MODE_STOPPING flag
does not.
2022-05-25 15:31:21 +02:00
Willy Tarreau
a4e39890f3 MINOR: task: add a new task_instant_wakeup() function
This function's purpose is to wake up either a local or remote task,
bypassing the tree-based run queue. It is meant for fast wakeups that
are supposed to be equivalent to those used with tasklets, i.e. a task
had to pause some processing and can complete (typically a resource
becomes available again). In all cases, it's important to keep in mind
that the task must have gone through the regular scheduling path before
being blocked, otherwise the task priorities would be ignored.

The reason for this is that some wakeups are massively inter-thread
(e.g. server queues), and these inter-thread wakeups cause a huge
contention on the shared runqueue lock. A user reported 47% CPU spent
in process_runnable_tasks with only 32 threads and 80k requests in
queues. With this mechanism, purely one-to-one wakeups can avoid
taking the lock thanks to the mt_list used for the shared tasklet
queue.

Right now the shared tasklet queue moves everything to the TL_URGENT
queue. It's not dramatic but it would seem better to have a new shared
list dedicated to tasks, and that would deliver into TL_NORMAL, for an
even better fairness. This could be improved in the future.
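
Usage mirrors task_wakeup(); a hedged sketch (the flag value shown is
illustrative):

  struct task;
  extern void task_instant_wakeup(struct task *t, unsigned int state);

  #define TASK_WOKEN_RES 0x00000400   /* illustrative: resource available */

  static void resume_waiter(struct task *waiter)
  {
          /* bypasses the tree-based run queue and its lock; the wakeup
           * goes through the lock-free shared tasklet list instead */
          task_instant_wakeup(waiter, TASK_WOKEN_RES);
  }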
2022-04-22 19:11:59 +02:00
Willy Tarreau
e1efd2a2d7 BUILD: sched: workaround crazy and dangerous warning in Clang 14
Ilya reported in issue #1638 that Clang 14 has invented a new warning
that encourages modifying the code in a way that is not always
equivalent, by turning "|" into "||" between some logical operators,
except that the former guarantees that all members of the expression
will always be evaluated while the latter stops at the first one that
is true!

This warning triggers in thread_has_tasks(), which is not sensitive to
such a change of behavior but which is built this way because it results
in branchless code for something that most often evaluates to false for
all terms. As such it was out of the question to turn this into less
efficient compare-and-jump code that needlessly pollutes the branch
predictor, so the workaround consists in casting each expression to
(int). It was verified that the generated code is the same.

Yet another example of how-to-introduce-bugs-by-fixing-valid-code
through warnings invented around a beer without thinking longer!

This may need to be backported to a few older branches in case this
compiler lands in recent distros or if gcc finds it wise to imitate it.
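
The workaround pattern, sketched on a reduced thread_has_tasks()-like
predicate:

  /* keep the branchless bitwise OR of boolean terms, but cast each one
   * to (int) so Clang 14 no longer suggests "||", which would
   * short-circuit and change the evaluation guarantees */
  static inline int has_work(unsigned int a, unsigned int b, unsigned int c)
  {
          return !!((int)!!a | (int)!!b | (int)!!c);
  }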
2022-04-14 15:11:12 +02:00
Willy Tarreau
6c8babf6c4 BUG/MAJOR: sched: prevent rare concurrent wakeup of multi-threaded tasks
Since the relaxation of the run-queue locks in 2.0 there has been a
very small but existing race between expired tasks and running tasks:
a task might be expiring and being woken up at the same time, on
different threads. This is protected against via the TASK_QUEUED and
TASK_RUNNING flags, but just after the task finishes executing, it
releases its TASK_RUNNING bit and only then may it go to task_queue().
This one will do nothing if the task's ->expire field is zero, but
if the field turns to zero between this test and the call to
__task_queue() then several things may happen:
  - the task may remain in the WQ for the next 24 days if its date is
    in the future;
  - the task may prevent any other task after it from expiring during
    the next 24 days once it's queued;
  - if DEBUG_STRICT is set on 2.4 and above, an abort may happen;
  - since 2.2, if the task got killed in between, then we may
    even requeue a freed task, causing random behaviour next time
    it's found there, or possibly corrupting the tree if it gets
    reinserted later.

The peers code is one call path that easily reproduces the case with
the ->expire field being reset, because it starts by setting it to
TICK_ETERNITY as the first thing when entering the task handler. But
other code parts also use multi-threaded tasks and rightfully expect
to be able to touch their expire field without causing trouble. No
trivial code path was found that would destroy such a shared task at
runtime, which already limits the risks.

This must be backported to 2.0.
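
A hedged sketch of the vulnerable window (simplified; TICK_ETERNITY
shown as 0):

  #include <stdatomic.h>

  #define TASK_RUNNING 0x1u

  struct task_sk {
          _Atomic unsigned int state;
          int expire;                /* 0 plays the role of TICK_ETERNITY */
  };

  extern void __task_queue(struct task_sk *t);

  static void after_run(struct task_sk *t)
  {
          atomic_fetch_and(&t->state, ~TASK_RUNNING);  /* bit released */
          if (t->expire) {
                  /* window: another thread's handler may reset
                   * t->expire to 0 right here, between the test and
                   * the queuing below */
                  __task_queue(t);   /* may insert with a stale expiry */
          }
  }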
2022-02-14 20:10:43 +01:00
Willy Tarreau
cc5cd5b8d8 BUILD: task: use list_to_mt_list() instead of casting list to mt_list
There were a few casts of list* to mt_list* that were upsetting some
old compilers (not sure about the effect on others). We had created
list_to_mt_list() purposely for this, let's use it instead of applying
this cast.
2022-01-28 19:04:02 +01:00
Willy Tarreau
1a9c922b53 REORG: thread/sched: move the task_per_thread stuff to thread_ctx
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.

In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.

The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(), now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.

The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.

Since the tasklet array has TL_CLASSES elements, the TL_* definitions
were moved there as well, but it's not a problem.

The vast majority of the change in this patch is caused by the
renaming of the structures.
2021-10-08 17:22:26 +02:00
Willy Tarreau
f9d5e1079c REORG: clock: move the updates of cpu/mono time to clock.c
The entering_poll/leaving_poll/measure_idle functions, which were hard
to classify and kept being moved to various locations, have now been
placed into clock.c since they're precisely about time-keeping. The
functions were renamed to clock_*. The samp_time and idle_time values
are now static since there is no reason for them to be read from
outside.
2021-10-08 17:22:26 +02:00
Willy Tarreau
5554264f31 REORG: time: move time-keeping code and variables to clock.c
There is currently a problem related to time keeping. We're mixing
the functions that perform calculations with the OS-dependent code
needed to retrieve and adjust the local time.

This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.

The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".

Other parts used to compute stolen time are in other files, they will
have to be picked next.
2021-10-08 17:22:26 +02:00
Amaury Denoyelle
1a9b8a6122 BUG/MINOR: task: fix missing include with DEBUG_TASK
Following the include reorganization, some include files are missing
for task.h when compiling with DEBUG_TASK:
- activity.h for task_profiling_mask
- time.h for now_mono_time()

This is present since the following commit
  d8b325c748
  REORG: task: uninline the loop time measurement code

No need to backport this.
2021-10-07 16:44:49 +02:00
Willy Tarreau
d8b325c748 REORG: task: uninline the loop time measurement code
It's pointless to inline this, it's called exactly once per poll loop,
and it depends on time.h which is quite deep. Let's move that to task.c
along with sched_report_idle().
2021-10-07 01:41:14 +02:00
Willy Tarreau
9310f481ce CLEANUP: tree-wide: remove unneeded include time.h in ~20 files
20 files used to have haproxy/time.h included only for now_ms, and two
were missing it for other things but used to inherit from it via other
files.
2021-10-07 01:41:14 +02:00
Willy Tarreau
078c2573c2 REORG: sched: moved samp_time and idle_time to task.c as well
The idle time calculation stuff was moved to task.h by commit 6dfab112e
("REORG: sched: move idle time calculation from time.h to task.h") but
these two variables that are only maintained by task.{c,h} were still
left in time.{c,h}. They have to move as well.
2021-10-07 01:41:14 +02:00
Willy Tarreau
1cdb531ec8 REORG: sched: move the stolen CPU time detection to sched_entering_poll()
That's where that code initially was, but it had been moved to
activity_count_runtime() for pure reasons of dependency loops. These
are no longer true, so we can move that code back to the scheduler
and keep it where the information is updated and checked.
2021-10-01 18:37:51 +02:00
Willy Tarreau
6dfab112e1 REORG: sched: move idle time calculation from time.h to task.h
time.h is a horrible place to put activity calculation, it's a
historical mistake because the functions were there. We already have
most of the parts in sched.{c,h} and these ones make an exception in
the middle, forcing time.h to include some thread stuff and to access
the before/after_poll and idle_pct values.

Let's move these 3 functions to task.h with the other ones. They were
prefixed with "sched_" instead of the historical "tv_", which no longer
made any sense.
2021-10-01 18:37:51 +02:00