haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-28 23:20:59 +01:00

Author	SHA1	Message	Date
Willy Tarreau	974358954b	BUILD: fd: disguise the fd_set_nonblock/cloexec result We thought that we could get rid of some DISGUISE() with commit a80e4a354 ("MINOR: fd: add functions to set O_NONBLOCK and FD_CLOEXEC") thanks to the calls being in a function but that was without counting on Coverity. Let's put it directly in the function since most if not all callers don't care about this result.	2022-04-27 10:52:21 +02:00
Willy Tarreau	382474348c	CLEANUP: tree-wide: use fd_set_nonblock() and fd_set_cloexec() This gets rid of most open-coded fcntl() calls, some of which were passed through DISGUISE() to avoid a useless test. The FD_CLOEXEC was most often set without preserving previous flags, which could become a problem once new flags are created. Now this will not happen anymore.	2022-04-26 10:59:48 +02:00
Willy Tarreau	a80e4a3546	MINOR: fd: add functions to set O_NONBLOCK and FD_CLOEXEC Instead of seeing each location manipulate the fcntl() themselves and often forget to check previous flags, let's centralize the functions to do this. It also allows to drop fcntl.h from most call places and will ease the adoption of different OS-specific mechanisms if needed. Note that the fd_set_nonblock() function purposely doesn't check the previous flags as it's meant to be used on new FDs only.	2022-04-26 10:59:48 +02:00
Willy Tarreau	acef5e27b0	MINOR: tree-wide: always consider EWOULDBLOCK in addition to EAGAIN Some older systems may routinely return EWOULDBLOCK for some syscalls while we tend to check only for EAGAIN nowadays. Modern systems define EWOULDBLOCK as EAGAIN so that solves it, but on a few older ones (AIX, VMS etc) both are different, and for portability we'd need to test for both or we never know if we risk to confuse some status codes with plain errors. There were few entries, the most annoying ones are the switch/case because they require to only add the entry when it differs, but the other ones are really trivial.	2022-04-25 20:32:15 +02:00
Willy Tarreau	9aa324de2d	DEBUG: fd: make sure we never try to insert/delete an impossible FD number It's among the cases that would provoke memory corruption, let's add some tests against negative FDs and those larger than the table. This must never ever happen and would currently result in silent corruption or a crash. Better have a noticeable one exhibiting the call chain if that were to happen.	2022-01-31 21:00:35 +01:00
Willy Tarreau	97ea9c49f1	BUG/MEDIUM: fd: always align fdtab[] to 64 bytes There's a risk that fdtab is not 64-byte aligned. The first effect is that it may cause false sharing between cache lines resulting in contention when adjacent FDs are used by different threads. The second is related to what is explained in commit "BUG/MAJOR: compiler: relax alignment constraints on certain structures", i.e. that modern compilers might make use of aligned vector operations to zero some entries, and would crash. We do not use any memset() or so on fdtab, so the risk is almost inexistent, but that's not a reason for violating some valid assumptions. This patch addresses this by allocating 64 extra bytes and aligning the structure manually (this is an extremely cheap solution for this specific case). The original address is stored in a new variable "fdtab_addr" and is the one that gets freed. This remains extremely simple and should be easily backportable. A dedicated aligned allocator later would help, of course. This needs to be backported as far as 2.2. No issue related to this was reported yet, but it could very well happen as compilers evolve. In addition this should preserve high performance across restarts (i.e. no more dependency on allocator's alignment).	2022-01-27 16:28:10 +01:00
Willy Tarreau	3a6af1e5e8	MINOR: fd: register the write side of the poller pipe as well The poller's pipe was only registered on the read side since we don't need to poll to write on it. But this leaves some known FDs so it's better to also register the write side with no event. This will allow to show them in "show fd" and to avoid dumping them as unhandled FDs. Note that the only other type of unhandled FDs left are: - stdin/stdout/stderr - epoll FDs The later can be registered upon startup though but at least a dummy handler would be needed to keep the fdtab clean.	2022-01-24 20:41:25 +01:00
Willy Tarreau	a0b99536c8	REORG: thread/sched: move the thread_info flags to the thread_ctx The TI_FL_STUCK flag is manipulated by the watchdog and scheduler and describes the apparent life/death of a thread so it changes all the time and it makes sense to move it to the thread's context for an active thread.	2021-10-08 17:22:26 +02:00
Willy Tarreau	b63888c67c	REORG: fd: uninline compute_poll_timeout() It's not needed to inline it at all (one call per loop) and it introduces dependencies, let's move it to fd.c. Removing the few remaining includes that came with it further reduced by ~0.2% the LoC and the build time is now below 6s.	2021-10-07 01:41:14 +02:00
Willy Tarreau	5d9ddc5442	BUILD: tree-wide: add several missing activity.h A number of files currently access activity counters but rely on their definitions to be inherited from other files (task.c, backend.c hlua.c, sock.c, pool.c, stats.c, fd.c).	2021-10-07 01:36:51 +02:00
Willy Tarreau	87063a7da1	BUILD: fd: remove unused variable totlen in fd_write_frag_line() Ilya reports in GH #1392 that clang 13 complains about totlen being calculated and not used in fd_write_frag_line(), which is true. It's a leftover of some older code.	2021-09-17 12:00:27 +02:00
Willy Tarreau	7b2ac29a92	CLEANUP: fd: remove the now unneeded fd_mig_lock This is not needed anymore since we don't use it when setting the running mask anymore.	2021-08-04 16:03:36 +02:00
Willy Tarreau	f69fea64e0	MAJOR: fd: get rid of the DWCAS when setting the running_mask Right now we're using a DWCAS to atomically set the running_mask while being constrained by the thread_mask. This DWCAS is annoying because we may seriously need it later when adding support for thread groups, for checking that the running_mask applies to the correct group. It turns out that the DWCAS is not strictly necessary because we never need it to set the thread_mask based on the running_mask, only the other way around. And in fact, the running_mask is always cleared alone, and the thread_mask is changed alone as well. The running_mask is only relevant to indicate a takeover when the thread_mask matches it. Any bit set in running and not present in thread_mask indicates a transition in progress. As such, it is possible to re-arrange this by using a regular CAS around a consistency check between running_mask and thread_mask in fd_update_events and by making a CAS on running_mask then an atomic store on the thread_mask in fd_takeover(). The only other case is fd_delete() but that one already sets the running_mask before clearing the thread_mask, which is compatible with the consistency check above. This change has happily survived 10 billion takeovers on a 16-thread machine at 800k requests/s. The fd-migration doc was updated to reflect this change.	2021-08-04 16:03:36 +02:00
Willy Tarreau	b1f29bc625	MINOR: activity/fd: remove the dead_fd counter This one is set whenever an FD is reported by a poller with a null owner, regardless of the thread_mask. It has become totally meaningless because it only indicates a migrated FD that was not yet reassigned to a thread, but as soon as a thread uses it, the status will change to skip_fd. Thus there is no reason to distinguish between the two, it adds more confusion than it helps. Let's simply drop it.	2021-08-04 16:03:36 +02:00
Willy Tarreau	200bd50b73	MEDIUM: fd: rely more on fd_update_events() to detect changes This function already performs a number of checks prior to calling the IOCB, and detects the change of thread (FD migration). Half of the controls are still in each poller, and these pollers also maintain activity counters for various cases. Note that the unreliable test on thread_mask was removed so that only the one performed by fd_set_running() is now used, since this one is reliable. Let's centralize all that fd-specific logic into the function and make it return a status among: FD_UPDT_DONE, // update done, nothing else to be done FD_UPDT_DEAD, // FD was already dead, ignore it FD_UPDT_CLOSED, // FD was closed FD_UPDT_MIGRATED, // FD was migrated, ignore it now Some pollers already used to call it last and have nothing to do after it, regardless of the result. epoll has to delete the FD in case a migration is detected. Overall this removes more code than it adds.	2021-07-30 17:45:18 +02:00
Willy Tarreau	84c7922c52	REORG: fd: uninline fd_update_events() This function has become a monster (80 lines and 2/3 of a kB), it doesn't benefit from being static nor inline anymore, let's move it to fd.c.	2021-07-30 17:41:55 +02:00
Willy Tarreau	c28aab05d8	BUILD: fd: include log.h from fd.c It's needed for ha_alert() and the header was missing.	2021-05-08 20:35:39 +02:00
Willy Tarreau	9eec7e206e	MINOR: config: mark tune.fd.edge-triggered as experimental This one is stated as experimental in the doc but could still be used by accidental copy-paste. Let's mark it with KWF_EXPERIMENTAL so that users have to opt-in to use it.	2021-05-08 11:06:32 +02:00
Ilya Shipitsin	b2be9a1ea9	CLEANUP: assorted typo fixes in the code and comments This is 22nd iteration of typo fixes	2021-04-26 10:42:58 +02:00
Willy Tarreau	a1090a5b61	MINOR: fd: move a few read-mostly variables to their own section Some pointer to arrays such as fdtab, fdinfo, polled_mask etc are never written to at run time but are used a lot. fdtab accesses appear a lot in perf top because ha_used_fds is in the same cache line and is modified all the time. This patch moves all these read-mostly variables to the read_mostly section when defined. This way their cache lines will be able to remain in shared state in all CPU caches.	2021-04-10 19:27:41 +02:00
Willy Tarreau	4781b1521a	CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1) or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1db427399c	CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or Currently our atomic ops return a value but it's never known whether the fetch is done before or after the operation, which causes some confusion each time the value is desired. Let's create an explicit variant of these operations suffixed with _FETCH to explicitly mention that the fetch occurs after the operation, and make use of it at the few call places.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1673c4a883	MINOR: fd: implement an exclusive syscall bit to remove the ugly "log" lock There is a function called fd_write_frag_line() that's essentially used by loggers and that is used to write an atomic message line over a file descriptor using writev(). However a lock is required around the writev() call to prevent messages from multiple threads from being interleaved. Till now a SPIN_TRYLOCK was used on a dedicated lock that was common to all FDs. This is quite not pretty as if there are multiple output pipes to collect logs, there will be quite some contention. Now that there are empty flags left in the FD state and that we can finally use atomic ops on them, let's add a flag to indicate the FD is locked for exclusive access by a syscall. At least the locking will now be on an FD basis and not the whole process, so we can remove the log_lock.	2021-04-07 18:18:37 +02:00
Willy Tarreau	9063a660cc	MINOR: fd: move .exported into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state.	2021-04-07 18:10:36 +02:00
Willy Tarreau	0cc612818d	MINOR: fd: move .initialized into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state. The bit was not cleared in fd_insert() because the only user is the function used to create and atomically send a log message to a pipe FD, which never registers the fd. Here we clear it nevertheless for the sake of clarity. Note that with an extra cleaning pass we could have a bit number here and simply use a BTS to test and set it.	2021-04-07 18:09:08 +02:00
Willy Tarreau	030dae13a0	MINOR: fd: move .cloned into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state.	2021-04-07 18:08:29 +02:00
Willy Tarreau	b41a6e9101	MINOR: fd: move .linger_risk into fdtab[].state No need to keep this flag apart any more, let's merge it into the global state. The CLI's output state was extended to 6 digits and the linger/cloned flags moved inside the parenthesis.	2021-04-07 18:07:49 +02:00
Olivier Houchard	c23b33764e	BUG/MEDIUM: fd: Take the fd_mig_lock when closing if no DWCAS is available. In fd_delete(), if we're running with no double-width cas, take the fd_mig_lock before setting thread_mask to 0 to make sure that another thread calling fd_set_running() won't miss the new value of thread_mask and set its bit in running_mask after we checked it. This should be backported to 2.2 as part of the series fixing fd_delete().	2021-03-25 07:34:35 +01:00
Willy Tarreau	2d4232901c	CLEANUP: fd: slightly simplify up _fd_delete_orphan() Let's release the port range earlier so that all zeroes are grouped together and that the compiler can slightly simplify the code.	2021-03-24 17:17:21 +01:00
Willy Tarreau	2c3f9818e8	BUG/MEDIUM: fd: do not wait on FD removal in fd_delete() Christopher discovered an issue mostly affecting 2.2 and to a less extent 2.3 and above, which is that it's possible to deadlock a soft-stop when several threads are using a same listener: thread1 thread2 unbind_listener() fd_set_running() lock(listener) listener_accept() fd_delete() lock(listener) while (running_mask); -----> deadlock unlock(listener) This simple case disappeared from 2.3 due to the removal of some locked operations at the end of listener_accept() on the regular path, but the architectural problem is still here and caused by a lock inversion built around the loop on running_mask in fd_clr_running_excl(), because there are situations where the caller of fd_delete() may hold a lock that is preventing other threads from dropping their bit in running_mask. The real need here is to make sure the last user deletes the FD. We have all we need to know the last one, it's the one calling fd_clr_running() last, or entering fd_delete() last, both of which can be summed up as the last one calling fd_clr_running() if fd_delete() calls fd_clr_running() at the end. And we can prevent new threads from appearing in running_mask by removing their bits in thread_mask. So what this patch does is that it sets the running_mask for the thread in fd_delete(), clears the thread_mask, thus marking the FD as orphaned, then clears the running mask again, and completes the deletion if it was the last one. If it was not, another thread will pass through fd_clr_running and will complete the deletion of the FD. The bug is easily reproducible in 2.2 under high connection rates during soft close. When the old process stops its listener, occasionally two threads will deadlock and the old process will then be killed by the watchdog. It's strongly believed that similar situations do exist in 2.3 and 2.4 (e.g. if the removal attempt happens during resume_listener() called from listener_accept()) but if so, they should be much harder to trigger. This should be backported to 2.2 as the issue appeared with the FD migration. It requires previous patches "fd: make fd_clr_running() return the remaining running mask" and "MINOR: fd: remove the unneeded running bit from fd_insert()". Notes for backport: in 2.2, the fd_dodelete() function requires an extra argument "do_close" indicating whether we want to remove and close the FD (fd_delete) or just delete it (fd_remove). While this information is not conveyed along the chain, we know that late calls always imply do_close=1 become do_close=0 exclusively results from fd_remove() which is only used by the config parser and the master, both of which are single-threaded, hence are always the last ones in the running_mask. Thus it is safe to assume that a postponed FD deletion always implies do_close=1. Thanks to Olivier for his help in designing this optimal solution.	2021-03-24 17:17:21 +01:00
Willy Tarreau	018251667e	CLEANUP: config: make the cfg_keyword parsers take a const for the defproxy The default proxy was passed as a variable to all parsers instead of a const, which is not without risk, especially when some timeout parsers used to make some int pointers point to the default values for comparisons. We want to be certain that none of these parsers will modify the defaults sections by accident, so it's important to mark this proxy as const. This patch touches all occurrences found (89).	2021-03-09 10:09:43 +01:00
Willy Tarreau	61cfdf4fd8	CLEANUP: tree-wide: replace free(x);x=NULL with ha_free(&x) This makes the code more readable and less prone to copy-paste errors. In addition, it allows to place some __builtin_constant_p() predicates to trigger a link-time error in case the compiler knows that the freed area is constant. It will also produce compile-time error if trying to free something that is not a regular pointer (e.g. a function). The DEBUG_MEM_STATS macro now also defines an instance for ha_free() so that all these calls can be checked. 178 occurrences were converted. The vast majority of them were handled by the following Coccinelle script, some slightly refined to better deal with "&*x" or with long lines: @ rule @ expression E; @@ - free(E); - E = NULL; + ha_free(&E); It was verified that the resulting code is the same, more or less a handful of cases where the compiler optimized slightly differently the temporary variable that holds the copy of the pointer. A non-negligible amount of {free(str);str=NULL;str_len=0;} are still present in the config part (mostly header names in proxies). These ones should also be cleaned for the same reasons, and probably be turned into ist strings.	2021-02-26 21:21:09 +01:00
Thayne McCombs	8f0cc5c4ba	CLEANUP: Fix spelling errors in comments This is from the output of codespell. It's done at once over a bunch of files and only affects comments, so there is nothing user-visible. No backport needed.	2021-01-08 14:56:32 +01:00
Ilya Shipitsin	b8888ab557	CLEANUP: assorted typo fixes in the code and comments This is 15th iteration of typo fixes	2021-01-06 17:32:03 +01:00
Willy Tarreau	5a7d6ebf2c	MINOR: fd/threads: silence a build warning with threads disabled Building with gcc-9.3.0 without threads may result in this warning: In file included from include/haproxy/api-t.h:36, from include/haproxy/api.h:33, from src/fd.c:90: src/fd.c: In function 'updt_fd_polling': include/haproxy/fd.h:507:11: warning: array subscript 63 is above array bounds of 'int[1]' [-Warray-bounds] 507 \| DISGUISE(write(poller_wr_pipe[tid], &c, 1)); include/haproxy/compiler.h:92:41: note: in definition of macro 'DISGUISE' 92 \| #define DISGUISE(v) ({ typeof(v) __v = (v); ALREADY_CHECKED(__v); __v; }) \| ^ src/fd.c:113:5: note: while referencing 'poller_wr_pipe' 113 \| int poller_wr_pipe[MAX_THREADS]; // Pipe to wake the threads \| ^~~~~~~~~~~~~~ gcc is wrong but this time it cannot be blamed because it doesn't know that the FD's thread_mask always has at least one bit set. Let's add the test for all_threads_mask there. It will also remove that test and drop the else block.	2020-11-26 22:28:41 +01:00
Willy Tarreau	7c9f756dcc	MINOR: fd: report an error message when failing initial allocations When starting with a huge maxconn (say 1 billion), the only error seen is "No polling mechanism available". This doesn't help at all to resolve the problem. Let's add specific alerts for the failed mallocs. Now we can get this instead: [ALERT] 286/154439 (23408) : Not enough memory to allocate 2000000033 entries for fdtab! This may be backported as far as 2.0 as it helps debugging bad configurations.	2020-10-13 18:15:33 +02:00
Willy Tarreau	f015887444	MEDIUM: fd: always wake up one thread when enabling a foreing FD Since 2.2 it's safe to enable/disable another thread's FD but the fd_wake calls will not immediately be considered because nothing wakes the other threads up. This will have an impact on listeners when deciding to resume them after they were paused, so at minima we want to wake up one of their threads, just like the scheduler does on task_kill(). This is what this patch does.	2020-10-09 11:27:29 +02:00
Tim Duesterhus	e52b6e5456	CLEANUP: Do not use a fixed type for 'sizeof' in 'calloc' Changes performed using the following coccinelle patch: @@ type T; expression E; expression t; @@ ( t = calloc(E, sizeof(t)) \| - t = calloc(E, sizeof(T)) + t = calloc(E, sizeof(t)) ) Looking through the commit history, grepping for coccinelle shows that the same replacement with a different patch was already performed in the past in commit 02779b6263a177b1e462e53db6eaf57bcda574bc.	2020-09-12 20:31:25 +02:00
Willy Tarreau	bb1caff70f	MINOR: fd: add a new "exported" flag and use it for all regular listeners This new flag will be used to mark FDs that must be passed to any future process across the CLI's "_getsocks" command. The scheme here is quite complex and full of special cases: - FDs inherited from parent processes are not exported this way, as they are supposed to instead be passed by the master process itself across reloads. However such FDs ought never to be paused otherwise this would disrupt the socket in the parent process as well; - FDs resulting from a "bind" performed over a socket pair, which are in fact one side of a socket pair passed inside another control socket pair must not be passed either. Since all of them are used the same way, for now it's enough never to put this "exported" flag to FDs bound by the socketpair code. - FDs belonging to temporary listeners (e.g. a passive FTP data port) must not be passed either. Fortunately we don't have such FDs yet. - the rest of the listeners for now are made of TCP, UNIX stream, ABNS sockets and are exportable, so they get the flag. - UDP listeners were wrongly created as listeners and are not suitable here. Their FDs should be passed but for now they are not since the client doesn't even distinguish the SO_TYPE of the retrieved sockets. In addition, it's important to keep in mind that: - inherited FDs may never be closed in master process but may be closed in worker processes if the service is shut down (useless since still bound, but technically possible) ; - inherited FDs may not be disabled ; - exported FDs may be disabled because the caller will perform the subsequent listen() on them. However that might not work for all OSes - exported FDs may be closed, it just means the service was shut down from the worker, and will be rebound in the new process. This implies that we have to disable exported on close(). => as such, contrary to an apparently obvious equivalence, the "exported" status doesn't imply anything regarding the ability to close a listener's FD or not.	2020-08-26 18:33:52 +02:00
Willy Tarreau	63d8b6009b	CLEANUP: fd: remove fd_remove() and rename fd_dodelete() to fd_delete() This essentially undoes what we did in fd.c in 1.8 to support seamless reload. Since we don't need to remove an fd anymore we can turn fd_delete() to the simple function it used to be.	2020-08-26 18:33:52 +02:00
Willy Tarreau	38e8a1c7b8	MINOR: debug: add a new DEBUG_FD build option When DEBUG_FD is set at build time, we'll keep a counter of per-FD events in the fdtab. This counter is reported in "show fd" even for closed FDs if not zero. The purpose is to help spot situations where an apparently closed FD continues to be reported in loops, or where some events are dismissed.	2020-06-23 10:04:54 +02:00
Willy Tarreau	bc52bec163	MEDIUM: fd: add experimental support for edge-triggered polling Some of the recent optimizations around the polling to save a few epoll_ctl() calls have shown that they could also cause some trouble. However, over time our code base has become totally asynchronous with I/Os always attempted from the upper layers and only retried at the bottom, making it look like we're getting closer to EPOLLET support. There are showstoppers there such as the listeners which cannot support this. But given that most of the epoll_ctl() dance comes from the connections, we can try to enable edge-triggered polling on connections. What this patch does is to add a new global tunable "tune.fd.edge-triggered", that makes fd_insert() automatically set an et_possible bit on the fd if the I/O callback is conn_fd_handler. When the epoll code sees an update for such an FD, it immediately registers it in both directions the first time and doesn't update it anymore. On a few tests it proved quite useful with a 14% request rate increase in a H2->H1 scenario, reducing the epoll_ctl() calls from 2 per request to 2 per connection. The option is obviously disabled by default as bugs are still expected, particularly around the subscribe() code where it is possible that some layers do not always re-attempt reading data after being woken up.	2020-06-19 14:21:46 +02:00
Willy Tarreau	f1cad38281	MINOR: fd: factorize the fd_takeover() exit path to make it safer Since there was a risk of leaving fd_takeover() without properly stopping the fd, let's take this opportunity for factoring the code around a commont exit point that's common to both double-cas and locked modes. This means using the "ret" variable inside the double-CAS code, and inverting the loop to first test the old values. Doing do also produces cleaner code because the compiler cannot factorize common exit paths using asm statements that are present in some atomic ops.	2020-06-18 08:25:42 +02:00
Willy Tarreau	4297363de3	MINOR: fd: slightly optimize the fd_takeover double-CAS loop The loop in fd_takeover() around the double-CAS is conditionned on a previous value of old_masks[0] that always matches tid_bit on the first iteration because it does not result from the atomic op but from a pre-loaded value. Let's set the result of the atomic op there instead so that the conflict between threads can be detected earlier and before performing the double-word CAS.	2020-06-18 08:08:50 +02:00
Willy Tarreau	c460c91633	MEDIUM: fd: refine the fd_takeover() migration lock When haproxy is compiled without double-word CAS, we use a migration lock in fd_takeover(). This lock was covering the atomic OR on the running_mask before checking its value, while it is not needed since this atomic op already returns the result. Let's just refine the code to avoid grabbing the lock in the event another thread has already stolen the FD, this may reduce contention in high reuse rate scenarios.	2020-06-18 07:28:09 +02:00
Olivier Houchard	ddc874c46c	BUG/MEDIUM: fd: Call fd_stop_recv() when we just got a fd. In fd_takeover(), when a double-width compare-and-swap is implemented, make sure, if we managed to get the fd, to call fd_stop_recv() on it, so that the thread that used to own it will know it has to stop polling it.	2020-06-17 20:36:28 +02:00
Olivier Houchard	8d7b517824	BUG/MEDIUM: fd: Don't fd_stop_recv() a fd we don't own. In fd_takeover(), if we failed to grab the fd, when a double-width compare-and-swap is not implemented, do not call fd_stop_recv() on the fd, it is not ours and may be used by another thread.	2020-06-17 20:36:28 +02:00
Olivier Houchard	f86a106f68	MINOR: fd: Fix a typo in a coment. The function si called fd_takeover, not fd_takeother.	2020-06-17 20:36:28 +02:00
Willy Tarreau	df187875da	BUG/MEDIUM: log: don't hold the log lock during writev() on a file descriptor In issue #648 a second problem was reported, indicating that some users mistakenly send the log to an FD mapped on a file. This situation doesn't even enable O_NONBLOCK and results in huge access times in the order of milliseconds with the lock held and other threads waiting till the watchdog fires to unblock the situation. The problem with files is that O_NONBLOCK is ignored, and we still need to lock otherwise we can end up with interleaved log messages. What this patch does is different. Instead of locking all writers, it uses a trylock so that there's always at most one logger and that other candidates can simply give up and report a failure, just as would happen if writev() returned -1 due to a pipe full condition. This solution is elegant because it gives back the control to haproxy to decide to give up when it takes too much time, while previously it was the kernel that used to block the syscall. However at high log rates (500000 req/s) there was up to 50% dropped logs due to the contention on the lock. In order to address this, we try to grab the lock up to 200 times and call ha_thread_relax() on failure. This results in almost no failure (no more than previously with O_NONBLOCK). A typical test with 6 competing threads writing to stdout chained to a pipe to a single process shows around 1000 drops for 10 million logs at 480000 lines per second. Please note that this doesn't mean that writing to a blocking FD is a good idea, and it might only be temporarily done on testing environments for debugging. A file or a terminal will continue to block the writing thread while others spin a little bit and lose their logs, but the writing thread will still experience performance-killing latencies. This patch should be backported to 2.1 and 2.0. The code is in log.c in 2.0, but the principle is the same.	2020-06-11 14:38:24 +02:00
Willy Tarreau	b2551057af	CLEANUP: include: tree-wide alphabetical sort of include files This patch fixes all the leftovers from the include cleanup campaign. There were not that many (~400 entries in ~150 files) but it was definitely worth doing it as it revealed a few duplicates.	2020-06-11 10:18:59 +02:00

1 2 3 4 5

210 Commits