This patch modifies epoll, kqueue and evports (the 3 pollers that support
busy polling) to only update the local date in the inner polling loop,
the global one being done when leaving the loop. Testing with epoll on
a 24c/48t machine showed a boost from 53M to 352M loops/s, indicating
that the loop was spending 85% of its time updating the global date or
causing side effects (which was confirmed with perf top showing 67% in
clock_update_global_date() alone).
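For illustration, the intended pattern looks roughly like this (a sketch only:
clock_update_local_date()/clock_update_global_date() are the names used in
clock.c, but the signatures and the surrounding loop are simplified here):

    do {
        status = epoll_wait(epoll_fd[tid], epoll_events, maxevents, 0);
        clock_update_local_date(0, status);   /* thread-local, cheap */
        /* process events, check for wakeup/stop conditions ... */
    } while (busy_polling && !stopping);
    clock_update_global_date();               /* shared date, written once on exit */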
As reported by Ilya and Coverity in issue #1858, since recent commit
eea152ee6 ("BUG/MINOR: signals/poller: ensure wakeup from signals")
which removed the test for the global signal flag from the pollers'
loop, the remaining "wake" flag doesn't need to be tested since it
already participates in zeroing the wait_time and will be caught
on the previous line.
Let's just remove that test now.
Add self-wake in signal_handler() to fix a race condition with a signal
coming in between checking signal_queue_len and entering polling sleep.
The changes in commit 43c891dda ("BUG/MINOR: signals/poller: set the
poller timeout to 0 when there are signals") were insufficient.
Move the signal_queue_len check from the poll implementations to
run_poll_loop() to keep that logic in one place.
The poll loops are terminated either by the "wake" parameter being set, or
by a wakeup caused by a write to their poller_wr_pipe performed by
wake_thread() in signal_handler().
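A minimal sketch of that self-wake, assuming the existing wake_thread()
helper which writes to the target thread's poller_wr_pipe (the real
signal_handler() obviously does more than this):

    void signal_handler(int sig)
    {
        /* ... enqueue the signal and update signal_queue_len ... */
        wake_thread(tid);  /* writes a byte to poller_wr_pipe[tid] so a poller
                            * already sleeping, or about to sleep, returns */
    }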
This fixes issue #1841.
Must be backported in every stable version.
When receiving a signal before entering the poller, and without any
activity in the process, the poller will be entered with a timeout
calculated without checking the signals.
Since commit 4f59d3 ("MINOR: time: increase the minimum wakeup interval
to 60s") the issue is much more visible because it could be stuck for
60s.
When in mworker mode, if a worker quits and the SIGCHLD signal is delivered
to the master at just the right time, the master could be stuck for the
duration of the timeout.
This should fix issue #1841.
Must be backported in every stable version.
In 2.2, commit 5d7dcc2a8 ("OPTIM: epoll: always poll for recv if neither
active nor ready") was added to compensate for the fact that our iocbs
are almost always asynchronous now and do not have the opportunity to
update the FD correctly. As such, they just perform a wakeup, the FD is
turned to inactive, the tasklet wakes up, performs the I/O, updates the
FD, most of the time this is done within the same polling loop, and the
update cancels itself in the poller without having to switch the FD off
then on.
The issue was that when deciding to claim an FD was active for reads
if it was active for writes, we forgot one situation that unfortunately
causes excessive wakeups: dealing with errors. Indeed, errors are
reported and keep ringing as long as the FD is active for sending even
if the consumer disabled the FD for receiving. Usually this only causes
one extra wakeup for the time it takes to consider a potential write
subscriber and to call it, though with many tasks in a run queue, it
can last a bit longer and be reported more often.
The fix consists in checking that we really want to get more receive
events on this FD, that is:
- that no previous EPOLLERR was reported
- that the FD doesn't carry a sticky error
- that the FD is not shut for reads
With this, after the last epoll_wait() reports EPOLLERR, one last recv()
is performed to flush pending data and the FD is immediately unregistered.
It's probably not needed to backport this as its effects are not much
visible, though it should not harm.
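A hedged sketch of the added condition in ev_epoll.c (the flag names below
are only indicative of the FD state bits involved; the real test is written
against that version's flags):

    /* only keep claiming Rx interest when more receive events may be useful */
    if ((en & FD_EV_ACTIVE_W) &&
        !(fdtab[fd].state & (FD_POLL_ERR | FD_EV_ERR_RW | FD_EV_SHUT_R)))
        en |= FD_EV_ACTIVE_R;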
Before, EPOLLERR was seen twice:
accept4(4, {sa_family=AF_INET, sin_port=htons(22314), sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_NONBLOCK) = 8
accept4(4, 0x261b160, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(8, "POST / HTTP/1.1\r\nConnection: close\r\nTransfer-encoding: chunk"..., 16320, 0, NULL, NULL) = 66
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 9
connect(9, {sa_family=AF_INET, sin_port=htons(8002), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(3, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLRDHUP, data={u32=8, u64=8}}) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLOUT, data={u32=9, u64=9}}], 200, 355) = 1
recvfrom(9, 0x25cfb30, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
sendto(9, "POST / HTTP/1.1\r\ntransfer-encoding: chunked\r\n\r\n", 47, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 47
epoll_ctl(3, EPOLL_CTL_MOD, 9, {events=EPOLLIN|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 354) = 1
recvfrom(9, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 16320, 0, NULL, NULL) = 57
sendto(8, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 57, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 57
->epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 354) = 1
epoll_ctl(3, EPOLL_CTL_DEL, 9, 0x7ffe0b65fb24) = 0
epoll_wait(3, [{events=EPOLLIN, data={u32=8, u64=8}}], 200, 354) = 1
recvfrom(8, "A\n0123456789\r\n0\r\n\r\n", 16320, 0, NULL, NULL) = 19
close(9) = 0
close(8) = 0
After, EPOLLERR is seen only once, with one less call to epoll_wait():
accept4(4, {sa_family=AF_INET, sin_port=htons(22362), sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_NONBLOCK) = 8
accept4(4, 0x20d0160, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(8, "POST / HTTP/1.1\r\nConnection: close\r\nTransfer-encoding: chunk"..., 16320, 0, NULL, NULL) = 66
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 9
connect(9, {sa_family=AF_INET, sin_port=htons(8002), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(3, EPOLL_CTL_ADD, 8, {events=EPOLLIN|EPOLLRDHUP, data={u32=8, u64=8}}) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLOUT, data={u32=9, u64=9}}], 200, 411) = 1
recvfrom(9, 0x2084b30, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
sendto(9, "POST / HTTP/1.1\r\ntransfer-encoding: chunked\r\n\r\n", 47, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 47
epoll_ctl(3, EPOLL_CTL_MOD, 9, {events=EPOLLIN|EPOLLRDHUP, data={u32=9, u64=9}}) = 0
epoll_wait(3, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=9, u64=9}}], 200, 411) = 1
recvfrom(9, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 16320, 0, NULL, NULL) = 57
sendto(8, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\nconnection: close\r\n\r\n", 57, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 57
epoll_ctl(3, EPOLL_CTL_DEL, 9, 0x7ffc95d46f04) = 0
epoll_wait(3, [{events=EPOLLIN, data={u32=8, u64=8}}], 200, 411) = 1
recvfrom(8, "A\n0123456789\r\n0\r\n\r\n", 16320, 0, NULL, NULL) = 19
close(9) = 0
close(8) = 0
A bug was introduced in 2.7-dev2 by commit 1f947cb39 ("MAJOR: poller:
only touch/inspect the update_mask under tgid protection"): once the
FD's tgid is held, we would forget to drop it in case the update mask
doesn't match, resulting in random watchdog panics of older processes
on successive reloads.
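The corrected sequence, sketched with the fd_grab_tgid()/fd_drop_tgid()
helpers (simplified; the bit tested against update_mask is shown as the
group-local one):

    if (!fd_grab_tgid(fd, tgid))
        continue;                       /* FD was reassigned to another group */

    if (!(fdtab[fd].update_mask & ti->ltid_bit)) {
        fd_drop_tgid(fd);               /* the release that was missing */
        continue;
    }
    /* ... apply the polling state changes ... */
    fd_drop_tgid(fd);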
This should fix issue #1798. Thanks to Christian for the report and
to Christopher for the reproducer.
No backport is needed.
In src/ev_epoll.c, a CHECK_IF() is guarded by an if statement. So, when the
macro expands to nothing, GCC (at least 11.3.1) complains because the if
statement ends up with an empty, unbraced body; this is caught by the
"-Wempty-body" warning.
So, braces were added and GCC is now happy.
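For illustration, the shape of the warning and of the fix (the condition
here is made up):

    if (fd_updt_idx >= 0)
        CHECK_IF(fdtab[fd].owner);      /* empty body when CHECK_IF() expands to nothing */

    if (fd_updt_idx >= 0) {
        CHECK_IF(fdtab[fd].owner);      /* braces keep -Wempty-body quiet */
    }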
No backport needed.
With thread groups and group-local masks, the update_mask can no longer be
touched nor even checked, because it may change under us. In order to avoid
this, we have to grab a reference to the FD's tgid before checking the
update mask. The operations are cheap enough so that we don't notice it
in performance tests. This is expected because the risk of meeting a
reassigned FD during an update remains very low.
It's worth noting that the tgid cannot be trusted during startup nor
during soft-stop since that may come from anywhere at the moment. Since
soft-stop runs under thread isolation we use that hint to decide whether
or not to check that the FD's tgid matches the current one.
The modification is applied to the 3 thread-aware pollers, i.e. epoll,
kqueue, and evports. Also one poll_drop counter was missing for shared
updates, though it might be hard to trigger it.
With this change applied, thread groups are usable in benchmarks.
The poller-specific thread init code now uses that new function to
safely register boot events. This ensures that we don't register an
event for another group and that we properly deal with parallel
thread startup.
It's only done for the thread-aware pollers; there's no point in using it
in poll/select, though it would work there as well.
With the change that was started on other masks, the thread mask was
still not fully converted, sometimes being used as a global mask and
sometimes as a local one. This finishes the code modifications so that
the mask is always considered as a group-local mask. This doesn't
change anything as long as there's a single group, but is necessary
for groups 2 and above since it's used against running_mask and so on.
From now on, the FD's update_mask only refers to local thread IDs. However,
there remains a limitation: in updt_fd_polling(), we temporarily have to
check and set shared FDs against .thread_mask, which still contains global
ones. As such, nbtgroups > 1 may break (but this is not yet supported without
special build options).
This changes the signification of each bit in the polled_mask so that
now each bit represents a local thread ID for the current group instead
of a global thread ID. As such, all tests now apply to ltid_bit instead
of tid_bit.
No particular check was made to verify that the FD's tgid matches the
current one because there should be no case where this is not true. A
check was added in epoll's __fd_clo() to confirm it never differs unless
expected (soft stop under thread isolation, or master in starting mode
going to exec mode), but that doesn't prevent from doing the job: it
only consists in checking in the group's threads those that are still
polling this FD and to remove them.
Some atomic loads were added at the various locations, and most repetitive
references to polled_mask[fd].xx were turned into a local copy instead,
making the code much clearer.
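A typical change, sketched (the exact call sites differ, but the pattern is
the same: atomic load into a local copy, then test against the group-local
bit instead of the global one):

    /* before: tested against the global thread bit */
    do_unpoll = !!(polled_mask[fd].poll_recv & tid_bit);

    /* after: local copy, group-local bit */
    unsigned long pr = HA_ATOMIC_LOAD(&polled_mask[fd].poll_recv);
    do_unpoll = !!(pr & ti->ltid_bit);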
The update-list needs to be per-group because its inspection is based
on a mask and we need to be certain when scanning it if a mask is for
the same thread or another one. Once per-group there's no doubt about
it, even if the FD's polling changes, the entry remains valid. It will
be needed to check the tgid though.
Note that a soft-stop or pause/resume might not necessarily work here
with tgroups>1, because the operation might be delivered to a thread
that doesn't belong to the group and whose update mask will not reflect
one that is interesting here. We can't do better at this stage.
When an FD is migrated, all pollers program an update. That's useless
code duplication, and when thread groups will be supported, this will
require an extra round of locking just to verify the update_mask on
return. Let's just program the update direction from fd_update_events()
as it already does for closed FDs, this becomes more logical.
Between 1.8 and 1.9 commit d9e7e36c6 ("BUG/MEDIUM: epoll/threads: use one
epoll_fd per thread") split the epoll poller to use one poller per thread
(and this was backported to 1.8). This patch added a call to epoll_ctl(DEL)
on return from the I/O handler as a safe way to deal with a detected thread
migration when that code was still quite fragile. One aspect of this choice
was that by then we wanted to maintain support for the rare old bogus epoll
implementations that failed to remove events on close(), so risking to lose
the event was not an option.
Later in 2.5, commit 200bd50b7 ("MEDIUM: fd: rely more on fd_update_events()
to detect changes") changed the code to perform most of the operations
inside fd_update_events(), but it maintained that oddity, to the point that
strictly all pollers except epoll now just add an update to be dealt with at
the next round.
This approach is much more efficient because, under load and with server-side
connection reuse, a thread may perfectly well see the same FD several times
during one poll loop: first to relinquish it after a migration; then, while
the other thread makes a request and gets its response, the first thread may
grab an idle connection to send a request and wait for a response, which
programs a new update on this FD. By using a synchronous epoll_ctl(DEL), we
effectively lose the opportunity to aggregate such changes into a single
update.
Some tests performed locally with 8 threads and one server show that on
average, by using an update instead of a synchronous call, we reduce the
number of epoll_ctl() calls by 25-30% (under low loads it will probably
not change anything).
So this patch implements the same method for all pollers and replaces the
synchronous epoll_ctl() with an update.
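The principle, sketched with the existing updt_fd_polling() mechanism (the
real code in fd_update_events() is more involved; fd_migrated below is just
a placeholder for the detection already performed there):

    if (fd_migrated) {
        updt_fd_polling(fd);    /* queued, aggregated with other changes and
                                 * applied at the next polling round instead
                                 * of an immediate epoll_ctl(EPOLL_CTL_DEL) */
        return FD_UPDT_MIGRATED;
    }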
When returning from the polling syscall, all pollers have a certain
dance to follow, made of wall clock updates, thread harmless updates,
idle time management and sleeping mask updates. Let's have a centralized
function to deal with all of this boring stuff: fd_leaving_poll(), and
make all the pollers use it.
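Roughly, the shared helper gathers the pollers' former epilogue (a sketch;
details may differ):

    void fd_leaving_poll(int wait_time, int status)
    {
        clock_leaving_poll(wait_time, status);           /* idle time accounting     */
        thread_harmless_end();                           /* leave the harmless state */
        _HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit); /* not sleeping anymore     */
    }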
Pollers are among the few remaining blocks still using constructors
to register themselves. That's not needed anymore since the introduction of
initcalls, so better switch them to initcalls as well.
Instead of fiddling with before_poll and after_poll in
activity_count_runtime(), the function is now called by
clock_entering_poll() which passes it the number of microseconds
spent working. This makes it possible to remove all calls to
activity_count_runtime() from the pollers.
The entering_poll/leaving_poll/measure_idle functions, which were hard to
classify and kept being moved to various locations, have now been placed
into clock.c since it's precisely about time-keeping. The functions
were renamed to clock_*. The samp_time and idle_time values are now
static since there is no reason for them to be read from outside.
There is currently a problem related to time keeping. We're mixing
the functions to perform calculations with the os-dependent code
needed to retrieve and adjust the local time.
This patch extracts from time.{c,h} the parts that are solely dedicated
to time keeping. These are the "now" or "before_poll" variables for
example, as well as the various now_*() functions that make use of
gettimeofday() and clock_gettime() to retrieve the current time.
The "tv_*" functions moved there were also more appropriately renamed
to "clock_*".
Other parts used to compute stolen time are in other files, they will
have to be picked next.
time.h is a horrible place to put activity calculation, it's a
historical mistake because the functions were there. We already have
most of the parts in sched.{c,h} and these ones make an exception in
the middle, forcing time.h to include some thread stuff and to access
the before/after_poll and idle_pct values.
Let's move these 3 functions to task.h with the other ones. They were
prefixed with "sched_" instead of the historical "tv_" which already
made no sense anymore.
The current principle of running under isolation was made to access
sensitive data while being certain that no other thread was using them
in parallel, without necessarily having to place locks everywhere. The
main use case are "show sess" and "show fd" which run over long chains
of pointers.
The thread_isolate() call relies on the "harmless" bit that indicates
for a given thread that it's not currently doing such sensitive things,
which is advertised using thread_harmless_now() and which ends using
thread_harmless_end(), which also waits for possibly concurrent threads
to complete their work if they took this opportunity for starting
something tricky.
As some system calls were notoriously slow (e.g. mmap()), a bunch of
thread_harmless_now() / thread_harmless_end() were placed around them
to let waiting threads do their work while such other threads were not
able to modify memory contents.
But this is not sufficient for performing memory modifications. One such
example is the server deletion code. By modifying memory, it requires not
only that other threads are not playing with it, but also that they are not
in the process of touching it. The fact that a pool_alloc() or pool_free()
on some structure may call thread_harmless_now() and let another thread
start to release the same object's memory is not acceptable.
This patch introduces the concept of "idle threads". Threads entering
the polling loop are idle, as well as those that are waiting for all
others to become idle via the new function thread_isolate_full(). Once
thread_isolate_full() is granted, the thread is not idle anymore, and
it is released using thread_release() just like regular isolation. Its
users have to keep in mind that across this call nothing is granted as
another thread might have performed shared memory modifications. But
such users are extremely rare and are actually expecting this from their
peers as well.
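Typical intended usage (sketch):

    thread_isolate_full();   /* wait for every other thread to be truly idle,
                              * not merely harmless */
    /* ... modify or free shared structures, e.g. delete a server ... */
    thread_release();        /* let the other threads resume */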
Note that in case of backport, this patch depends on the previous patch:
MINOR: threads: make thread_release() not wait for other ones to complete
This function already performs a number of checks prior to calling the
IOCB, and detects the change of thread (FD migration). Half of the
controls are still in each poller, and these pollers also maintain
activity counters for various cases.
Note that the unreliable test on thread_mask was removed so that only
the one performed by fd_set_running() is now used, since this one is
reliable.
Let's centralize all that fd-specific logic into the function and make
it return a status among:
FD_UPDT_DONE, // update done, nothing else to be done
FD_UPDT_DEAD, // FD was already dead, ignore it
FD_UPDT_CLOSED, // FD was closed
FD_UPDT_MIGRATED, // FD was migrated, ignore it now
Some pollers already used to call it last and have nothing to do after
it, regardless of the result. epoll has to delete the FD in case a
migration is detected. Overall this removes more code than it adds.
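A hedged sketch of how a poller's loop consumes that status (simplified from
the epoll case; events_to_fd_state() is a hypothetical helper standing for
the conversion of EPOLL* bits to FD_EV_* bits):

    for (count = 0; count < status; count++) {
        int fd = epoll_events[count].data.fd;
        unsigned int n = events_to_fd_state(epoll_events[count].events);

        if (fd_update_events(fd, n) == FD_UPDT_MIGRATED) {
            /* epoll is the only poller that must delete the FD here */
            epoll_ctl(epoll_fd[tid], EPOLL_CTL_DEL, fd, &ev);
        }
    }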
This one only contains the list of per-thread epoll FDs, and is used
a lot during updates. Let's mark it read_mostly to avoid false sharing
of FDs placed at the extremities.
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
This makes the code more readable and less prone to copy-paste errors.
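A typical conversion (the counter shown is only an example site):

    - HA_ATOMIC_ADD(&activity[tid].poll_drop, 1);
    + HA_ATOMIC_INC(&activity[tid].poll_drop);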
In addition, it makes it possible to place some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce compile-time error if trying to
free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
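A hedged sketch of the macro's shape (the real one also hooks into
DEBUG_MEM_STATS and uses a link-time error rather than a runtime abort):

    #define ha_free(x) do {                                               \
            typeof(x) __x = (x);                                          \
            if (__builtin_constant_p((x)) || __builtin_constant_p(*(x)))  \
                    ABORT_NOW();    /* constant or not a plain pointer */ \
            free(*__x);                                                   \
            *__x = NULL;                                                  \
    } while (0)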
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, give or take a handful
of cases where the compiler optimized the temporary variable holding the
copy of the pointer slightly differently.
A non-negligible number of {free(str);str=NULL;str_len=0;} sequences are still
present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
When DEBUG_FD is set at build time, we'll keep a counter of per-FD events
in the fdtab. This counter is reported in "show fd" even for closed FDs if
not zero. The purpose is to help spot situations where an apparently closed
FD continues to be reported in loops, or where some events are dismissed.
Some of the recent optimizations around the polling to save a few
epoll_ctl() calls have shown that they could also cause some trouble.
However, over time our code base has become totally asynchronous with
I/Os always attempted from the upper layers and only retried at the
bottom, making it look like we're getting closer to EPOLLET support.
There are showstoppers there such as the listeners which cannot support
this. But given that most of the epoll_ctl() dance comes from the
connections, we can try to enable edge-triggered polling on connections.
What this patch does is to add a new global tunable "tune.fd.edge-triggered",
that makes fd_insert() automatically set an et_possible bit on the fd if
the I/O callback is conn_fd_handler. When the epoll code sees an update
for such an FD, it immediately registers it in both directions the first
time and doesn't update it anymore.
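Sketch of the epoll side (hedged: already_polled is a placeholder for the
state the poller keeps about this FD, and the et_possible test is written as
a direct field access for readability):

    if (fdtab[fd].et_possible && !already_polled) {
        ev.events  = EPOLLIN | EPOLLOUT | EPOLLRDHUP | EPOLLET;
        ev.data.fd = fd;
        epoll_ctl(epoll_fd[tid], EPOLL_CTL_ADD, fd, &ev);
        /* no further EPOLL_CTL_MOD for this FD: edge-triggered from now on */
    }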
On a few tests it proved quite useful with a 14% request rate increase in
a H2->H1 scenario, reducing the epoll_ctl() calls from 2 per request to
2 per connection.
The option is obviously disabled by default as bugs are still expected,
particularly around the subscribe() code where it is possible that some
layers do not always re-attempt reading data after being woken up.
We have poll_drop, poll_dead and poll_skip which are confusingly named
like their poll_io and poll_exp counterparts except that they are not
per poll() call but per-fd. This patch renames them to poll_drop_fd(),
poll_dead_fd() and poll_skip_fd() for this reason.
The "show activity" output mentions a number of indicators to explain
wake up reasons but doesn't have the number of times poll() sees some
I/O. And given that multiple events can happen simultaneously, it's
not always possible to deduce this metric by subtracting.
This patch adds a new "poll_io" counter that allows one to see how
often poll() returns with at least one active FD. This should help
detect stuck events and measure various ratios of poll sub-metrics.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
Since these are used as type attributes or conditional clauses, they
are used about everywhere and should not require a dependency on
thread.h. Moving them to compiler.h along with other similar statements
like ALIGN() etc looks more logical; this way they become part of the
base API. This made it possible to remove thread-t.h from ~12 files; one was
found to only require thread-t and not thread, and dict.c was found to
require thread.h.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in two: one part with the global struct definition and the
few macros and flags, and the rest containing the function prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
A few includes were missing in each file. A definition of
struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was
moved to defaults.h.
Stdio used to be silently inherited from whatever path but it's needed
for list_pollers() which takes a FILE* and which can thus not be
forward-declared.
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.
They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
This moves types/activity.h to haproxy/activity-t.h and
proto/activity.h to haproxy/activity.h.
The macros defining the bit field values for the profiling variable
were moved to the type file to be more future-proof.
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).
All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").
For consistency, src/hathreads.c was also renamed thread.c.
A number of files were updated to only include thread-t which is the one
they really needed.
Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
This one used to be stored into debug.h but the debug tools got larger
and require a lot of other includes, which can't use BUG_ON() anymore
because of this. It does not make sense and instead this macro should
be placed into the lower includes and given its omnipresence, the best
solution is to create a new bug.h with the few surrounding macros needed
to trigger bugs and place assertions anywhere.
Another benefit is that it won't be required to add include <debug.h>
anymore to use BUG_ON, it will automatically be covered by api.h. No
less than 32 occurrences were dropped.
The FSM_PRINTF macro was dropped since not used at all anymore (probably
since 1.6 or so).
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
This was made to support epoll on patched 2.4 kernels, and on early 2.6
using alternative libcs thanks to the arch-specific syscall definitions.
All the features we support have been around since 2.6.2 and present in
glibc since 2.3.2, neither of which is found in the field anymore. Let's
simply drop this and use epoll normally.
valgrind complains that epoll_ctl() uses an epoll_event in which we
have only set the part we use from the data field (i.e. the fd). Tests
show that pre-initializing the struct in the stack doesn't have a
measurable impact so let's do it.
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers
so better get rid of this once for all.
Historically we used to have a global epoll_event for various
manipulations involving epoll_ctl() and when threads were added,
this was turned to a thread_local, which is needlessly expensive
since it's just a temporary variable. Let's move it to a local
variable wherever it's called instead.
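The resulting call sites look roughly like this (a local, zero-initialized
event on the stack; names simplified):

    struct epoll_event ev = { 0 };     /* fully initialized, valgrind-clean */

    ev.events  = EPOLLIN | EPOLLRDHUP;
    ev.data.fd = fd;
    epoll_ctl(epoll_fd[tid], EPOLL_CTL_ADD, fd, &ev);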
The cost of enabling polling in one direction with epoll is very high
because it requires one syscall per FD and per direction change. In
addition we don't know about input readiness until we either try to
receive() or enable polling and watch the result. With HTTP keep-alive,
both are equally expensive as it's very uncommon to see the server
instantly respond (unless it's a second stage of the same process on
localhost, which has become much less common with threads).
But when a connection is established it's also quite usual to have to
poll for sending (except on localhost or UNIX sockets where it almost
always instantly works). So this cost of polling could be factored out
with the second step if both were enabled together.
This is the idea behind this patch. What it does is to always enable
polling for Rx if it's not ready and at least one direction is active.
This means that unless it was explicitly disabled, or was disabled in a
state that loses the information (rx readiness cannot be guessed), any
polling change is taken as an opportunity to enable Rx at the same time,
and to learn about rx readiness for free.
In addition the FD never gets unregistered for Rx unless it's ready
and was blocked (buffer full). This avoids a lot of the flip-flop
behaviour at beginning and end of requests.
On a test with 10k requests in keep-alive, the difference is quite
noticeable:
Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.67    0.010847           0     20078           epoll_ctl
 16.33    0.002117           0      2231           epoll_wait
  0.00    0.000000           0        20        20 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.012964                 22329        20 total
After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.35    0.003351           1      2644           epoll_wait
  2.36    0.000082           4        20        20 connect
  1.29    0.000045           0        66           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.003478                  2730        20 total
It may also save a recvfrom() after connect() by changing the following
sequence, effectively saving one epoll_ctl() and one recvfrom():
           before            |            after
-----------------------------+----------------------------
 - connect()                 | - connect()
 - epoll_ctl(add,out)        | - epoll_ctl(add, in|out)
 - sendto()                  | - epoll_wait() = out
 - epoll_ctl(mod,in|out)     | - send()
 - epoll_wait() = out        | - epoll_wait() = in|out
 - recvfrom() = EAGAIN       | - recvfrom() = OK
 - epoll_ctl(mod,in)         | - recvfrom() = EAGAIN
 - epoll_wait() = in         | - epoll_ctl(mod, in)
 - recvfrom() = OK           | - epoll_wait()
 - recvfrom() = EAGAIN       |
 - epoll_wait()              |
   (...)
Now on a 10M req test on 16 threads with 2k concurrent conns and 415kreq/s,
we see 190k updates total and 14k epoll_ctl() only.