haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-11 01:26:58 +02:00

Author	SHA1	Message	Date
Christopher Faulet	9dcf9b6f03	MINOR: threads: Use __decl_hathreads to declare locks This macro should be used to declare variables or struct members depending on the USE_THREAD compile option. It avoids the encapsulation of such declarations between #ifdef/#endif. It is used to declare all lock variables.	2017-11-13 11:38:17 +01:00
Christopher Faulet	2a944ee16b	BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix This removes any name conflicts, especially on Solaris.	2017-11-07 11:10:24 +01:00
Willy Tarreau	f65610a83d	CLEANUP: threads: rename process_mask to thread_mask It was a leftover from the last cleaning session; this mask applies to threads and calling it process_mask is a bit confusing. It's the same in fd, task and applets.	2017-10-31 16:06:06 +01:00
Christopher Faulet	cd7879adc2	BUG/MEDIUM: threads: Run the poll loop on the main thread too There was a flaw in the way the threads were created. The main one was just used to create all the others and then wait to exit. Now, it is used to run a poll loop. So we only create nbthread-1 threads. This also fixes a bug about the compression filter when there is only 1 thread (nbthread == 1 or no threads support). The bug was in the way thread-local resources were initialized. Per-thread init/deinit callbacks were never called for the main process. So, with nbthread set to 1, some buffers remained uninitialized.	2017-10-31 13:58:33 +01:00
Christopher Faulet	8aae8b1d61	MINOR: threads/fd: Process cached events of FDs depending on the process mask	2017-10-31 13:58:30 +01:00
Christopher Faulet	a7c5d43085	MINOR: threads/fd: Add a mask of threads allowed to process on each fd in fdtab array	2017-10-31 13:58:30 +01:00
Christopher Faulet	d4604adeaa	MAJOR: threads/fd: Make fd stuffs thread-safe Many changes have been made to do so. First, the fd_updt array, where all pending FDs for polling are stored, is now a thread-local array. Then 3 locks have been added to protect, respectively, the fdtab array, the fd_cache array and poll information. In addition, a lock for each entry in the fdtab array has been added to protect all accesses to a specific FD or its information. For pollers, the way to manage concurrency depends on the poller. There is a poller loop on each thread, so the set of monitored FDs may need to be protected. epoll and kqueue are thread-safe per se, so there are few things to do to protect these pollers. This is not possible with select and poll, so there is no sharing between the threads: the poller on each thread is independent of the others. Finally, per-thread init/deinit functions are used for each poller and for the FD part to manage thread-local resources. Now, you must be careful when an FD is created during the HAProxy startup. All updates on the FD state must be made in the threads' context and never before their creation. This is mandatory because the fd_updt array is thread-local and initialized only for threads. Because there is no poller for the main one, this array remains uninitialized in this context. For this reason, listeners are now enabled in the run_thread_poll_loop function, just like the worker pipe.	2017-10-31 13:58:30 +01:00
Christopher Faulet	63fe65277a	MINOR: fd: Move (de)allocation of fdtab and fdinfo in (de)init_pollers This will be useful for the threads support integration.	2017-09-05 10:49:45 +02:00
Christopher Faulet	d531f88622	MINOR: fd: Don't forget to reset fdtab[fd].update when a fd is added/removed It used to be guaranteed by the polling functions on a later call but with concurrent accesses it cannot be guaranteed anymore.	2017-09-05 10:16:42 +02:00
Olivier Houchard	1fc0516516	MINOR: proxy: Don't close FDs if not our proxy. When running with multiple processes, if some proxies are just assigned to some processes, the other processes will just close the file descriptors for the listening sockets. However, we may still have to provide those sockets when reloading, so instead we just try hard to pretend those proxies are dead, while keeping the sockets opened. A new global option, "no-reused-socket", has been added, to restore the old behavior of closing the sockets not bound to this process.	2017-04-13 19:15:17 +02:00
Vincent Bernat	3c2f2f207f	CLEANUP: remove unneeded casts In C89, "void *" is automatically promoted to any pointer type. Casting the result of malloc/calloc to the type of the LHS variable is therefore unneeded. Most of this patch was built using this Coccinelle patch: @@ type T; @@ - (T *) (\(lua_touserdata\|malloc\|calloc\|SSL_get_app_data\|hlua_checkudata\|lua_newuserdata\)(...)) @@ type T; T *x; void *data; @@ x = - (T *) data @@ type T; T *x; T *data; @@ x = - (T *) data Unfortunately, either Coccinelle or I is too limited to detect situation where a complex RHS expression is of type "void *" and therefore casting is not needed. Those cases were manually examined and corrected.	2016-04-03 14:17:42 +02:00
Willy Tarreau	5be2f35231	MAJOR: polling: centralize calls to I/O callbacks In order for HTTP/2 not to eat too much memory, we'll have to support on-the-fly buffer allocation, since most streams will have an empty request buffer at some point. Supporting allocation on the fly means being able to sleep inside I/O callbacks if a buffer is not available. Till now, the I/O callbacks were called from two locations : - when processing the cached events - when processing the polled events from the poller This change cleans up the design a bit further than what was started in 1.5. It now ensures that we never call any iocb from the poller itself and that instead, events learned by the poller are put into the cache. The benefit is important in terms of stability : we don't have to care anymore about the risk that new events are added into the poller while processing its events, and we're certain that updates are processed at a single location. To achieve this, we now modify all the fd_* functions so that instead of creating updates, they add/remove the fd to/from the cache depending on its state, and only create an update when the polling status reaches a state where it will have to change. Since the pollers make use of these functions to notify readiness (using fd_may_recv/fd_may_send), the cache is always up to date with the poller. Creating updates only when the polling status needs to change saves a significant amount of work for the pollers : a benchmark showed that on a typical TCP proxy test, the amount of updates per connection dropped from 11 to 1 on average. This also means that the update list is smaller and has more chances of not thrashing too many CPU cache lines. The first observed benefit is a net 2% performance gain on the connection rate. 
A second benefit is that when a connection is accepted, it's only when we're processing the cache, and the recv event is automatically added into the cache after the current one, resulting in this event being processed immediately during the same loop. Previously we used to have a second run over the updates to detect if new events were added to catch them before waking up tasks. The next gain will be offered by the next steps on this subject consisting in implementing an I/O queue containing all cached events ordered by priority just like the run queue, and to be able to leave some events pending there as long as needed. That will allow us not to perform some FD processing if it's not the proper time for this (typically keep waiting for a buffer to be allocated if none is available for a recv()). And by only processing a small bunch of them, we'll allow priorities to take place even at the I/O level. As a result of this change, functions fd_alloc_or_release_cache_entry() and fd_process_polled_events() have disappeared, and the code dedicated to checking for new fd events after the callback during the poll() loop was removed as well. Despite the patch looking large, it's mostly a change of what function is called upon fd_*() and almost nothing was added.	2014-11-21 20:37:32 +01:00
Conrad Hoffmann	041751c13a	BUG/MEDIUM: polling: fix possible CPU hogging of worker processes after receiving SIGUSR1. When run in daemon mode (i.e. with at least one forked process) and using the epoll poller, sending USR1 (graceful shutdown) to the worker processes can cause some workers to start running at 100% CPU. Precondition is having an established HTTP keep-alive connection when the signal is received. The cloned (during fork) listening sockets do not get closed in the parent process, thus they do not get removed from the epoll set automatically (see man 7 epoll). This can lead to the process receiving epoll events that it doesn't feel responsible for, resulting in an endless loop around epoll_wait() delivering these events. The solution is to explicitly remove these file descriptors from the epoll set. To not degrade performance, care was taken to only do this when necessary, i.e. when the file descriptor was cloned during fork. Signed-off-by: Conrad Hoffmann <conrad@soundcloud.com> [wt: a backport to 1.4 could be studied though chances to catch the bug are low]	2014-05-20 14:57:36 +02:00
Willy Tarreau	e852545594	MEDIUM: polling: centralize polled events processing Currently, each poll loop handles the polled events the same way, resulting in a lot of duplicated, complex code. Additionally, epoll was the only one to handle newly created FDs immediately. So instead, let's move that code to fd.c in a new function dedicated to this task : fd_process_polled_events(). All pollers now use this function.	2014-01-26 00:42:32 +01:00
Willy Tarreau	f817e9f473	MAJOR: polling: rework the whole polling system This commit heavily changes the polling system in order to definitely fix the frequent breakage of SSL which needs to remember the last EAGAIN before deciding whether to poll or not. Now we have a state per direction for each FD, as opposed to a previous and current state previously. An FD can have up to 8 different states for each direction, each of which being the result of a 3-bit combination. These 3 bits indicate a wish to access the FD, the readiness of the FD and the subscription of the FD to the polling system. This means that it will now be possible to remember the state of a file descriptor across disable/enable sequences that generally happen during forwarding, where enabling reading on a previously disabled FD would result in forgetting the EAGAIN flag it met last time. Several new state manipulation functions have been introduced or adapted : - fd_want_{recv,send} : enable receiving/sending on the FD regardless of its state (sets the ACTIVE flag) ; - fd_stop_{recv,send} : stop receiving/sending on the FD regardless of its state (clears the ACTIVE flag) ; - fd_cant_{recv,send} : report a failure to receive/send on the FD corresponding to EAGAIN (clears the READY flag) ; - fd_may_{recv,send} : report the ability to receive/send on the FD as reported by poll() (sets the READY flag) ; Some functions are used to report the current FD status : - fd_{recv,send}_active - fd_{recv,send}_ready - fd_{recv,send}_polled Some functions were removed : - fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai() The POLLHUP/POLLERR flags are now reported as ready so that the I/O layer knows it can try to access the file descriptor to get this information. In order to simplify the conditions to add/remove cache entries, a new function fd_alloc_or_release_cache_entry() was created to be used from pollers while scanning for updates. 
The following pollers have been updated : ev_select() : done, built, tested on Linux 3.10 ev_poll() : done, built, tested on Linux 3.10 ev_epoll() : done, built, tested on Linux 3.10 & 3.13 ev_kqueue() : done, built, tested on OpenBSD 5.2	2014-01-26 00:42:30 +01:00
Willy Tarreau	033cd9d78c	REORG: polling: rename "fd_process_spec_events()" to "fd_process_cached_events()" This is in order to be coherent with the rest.	2014-01-26 00:42:29 +01:00
Willy Tarreau	899d95757e	REORG: polling: rename the cache allocation functions - alloc_spec_entry() becomes fd_alloc_cache_entry() - release_spec_entry() becomes fd_release_cache_entry()	2014-01-26 00:42:29 +01:00
Willy Tarreau	16f649c82c	REORG: polling: rename "fd_spec" to "fd_cache" So fd_spec was renamed "fd_cache" as it's becoming an event cache, and fd_nbspec becomes fd_cache_num.	2014-01-26 00:42:29 +01:00
Willy Tarreau	69a41fa8a3	CLEANUP: polling: rename "spec_e" to "state" We're completely changing the way FDs will be polled. First, let's fix a few field names which become confusing. "spec_e" was used to store a speculative I/O event state. Now we'll store the whole R/W states for the FD there.	2014-01-26 00:42:28 +01:00
Willy Tarreau	fa7fc95e16	BUG/MEDIUM: polling: ensure we update FD status when there's no more activity Some rare unexplained busy loops were observed on versions up to 1.5-dev19. It happens that if a file descriptor happens to be disabled for both read and write while it was speculatively enabled for both and this without creating a new update entry, there will be no way to remove it from the speculative I/O list until some other changes occur. It is suspected that a double sequence such as enable_both/disable_both could have led to this situation where an update cancels itself and does not clear the spec list in the poll loop. While it is unclear what I/O sequence may cause this situation to arise, it is safer to always add the FD to the update list if nothing could be done on it so that the next poll round will automatically take care of it. This is 1.5-specific, no backport is needed.	2014-01-20 20:57:02 +01:00
Willy Tarreau	ad38acedaa	MEDIUM: connection: centralize handling of nolinger in fd management Right now we see many places doing their own setsockopt(SO_LINGER). Better only do it just before the close() in fd_delete(). For this we add a new flag on the file descriptor, indicating if it's safe or not to linger. If not (eg: after a connect()), then the setsockopt() call is automatically performed before a close(). The flag automatically turns to safe when receiving a read0.	2013-12-16 02:23:52 +01:00
Willy Tarreau	70d0ad560c	BUG: polling: don't skip polled events in the spec list Commit 09f245 came with a bug : if we don't process events from the spec list that are also being polled, we can end up with some stuck events that nobody processes. We must process all events from the spec list even if they're being polled in parallel.	2012-11-12 01:57:14 +01:00
Willy Tarreau	09f24569d4	REORG: fd: centralize the processing of speculative events Speculative events are independent of the poller, so they can be centralized in fd.c.	2012-11-11 17:45:39 +01:00
Willy Tarreau	6ea20b1acb	REORG: fd: move the fd state management from ev_sepoll ev_sepoll already provides everything needed to manage FD events by only manipulating the speculative I/O list. Nothing there is sepoll-specific so move all this to fd.	2012-11-11 17:45:39 +01:00
Willy Tarreau	7be79a41e1	REORG: fd: move the speculative I/O management from ev_sepoll The speculative I/O will need to be ported to all pollers, so move this to fd.c.	2012-11-11 17:45:39 +01:00
Willy Tarreau	1720abd727	MEDIUM: fd: don't unset fdtab[].updated upon delete We must not remove the .updated flag otherwise we risk having to reallocate a new updt entry if the same fd is reused.	2012-11-11 17:45:39 +01:00
Willy Tarreau	037d2c1f8f	MAJOR: sepoll: make the poller totally event-driven At the moment sepoll is not 100% event-driven, because a call to fd_set() on an event which is already being polled will not change its state. This causes issues with OpenSSL because if some I/O processing is interrupted after clearing the I/O event (eg: read all data from a socket, can't put it all into the buffer), then there is no way to call the SSL_read() again once the buffer releases some space. The only real solution is to go 100% event-driven. The principle is to use the spec list as an event cache and that each time an I/O event is reported by epoll_wait(), this event is automatically scheduled for addition to the spec list for future calls until the consumer explicitly asks for polling or stopping. Doing this is a bit tricky because sepoll used to provide a substantial number of optimizations such as event merging. These optimizations have been maintained : a dedicated update list is affected when events change, but not the event list, so that updates may cancel themselves without any side effect such as displacing events. A specific case was considered for handling newly created FDs as soon as they are detected from within the poll loop. This ensures that their read or write operation will always be attempted as soon as possible, thus reducing the number of poll loops and process_session wakeups. This is especially true for newly accepted fds which immediately perform their first recv() call. Two new flags were added to the fdtab[] struct to tag the fact that a file descriptor already exists in the update list. One flag indicates that a file descriptor is new and has just been created (fdtab[].new) and the other one indicates that a file descriptor is already referenced by the update list (fdtab[].updated). 
Even if the FD state changes during operations or if the fd is closed and replaced, it's not an issue because the update flag remains and is easily spotted during list walks. The flag must absolutely reflect the presence of the fd in the update list in order to avoid overflowing the update list with more events than there are distinct fds. Note that this change also recovers the small performance loss introduced by its connection counter-part and goes even beyond.	2012-11-10 00:17:27 +01:00
Willy Tarreau	49b046dddf	MAJOR: fd: replace all EV_FD_* macros with new fd_*_* inline calls These functions have a more explicit meaning and will offer provisions for explicit polling. EV_FD_ISSET() has been left for now as it is still in use in checks.	2012-09-02 21:53:11 +02:00
Willy Tarreau	db3b32610f	REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab In an attempt to get rid of fdtab[].state, and to move the relevant parts to the connection struct, we remove the FD_STCLOSE state which can easily be deduced from the <owner> pointer as there is a 1:1 match.	2012-09-02 21:51:25 +02:00
Willy Tarreau	e79c3b24fb	[BUG] debug: report the correct poller list in verbose mode When running with -vv or -V -d, the list of usable polling systems is reported. The final selection did not take into account the possible failures during the tests, which is misleading and could make one think that a non-working poller will be used, while it is not the case. Fix that to really report the correct ones. (cherry picked from commit 6d0e354e0171f08b7b3868ad2882c3663bd068a7)	2010-11-19 13:25:10 +01:00
Willy Tarreau	8d5d77efc3	[OPTIM] move some rarely used fields out of fdtab Some rarely used information is stored in fdtab, making it larger for no reason (source port ranges, remote address, ...). Such information lies there because the checks can't find it anywhere else. The goal will be to move this information to the stream interface once the checks make use of it. For now, we move it to an fdinfo array. This simple change might have improved the cache hit ratio a little bit because a 0.5% performance increase has been measured.	2009-10-18 08:17:33 +02:00
Willy Tarreau	c6f4ce8fc4	[MEDIUM] add support for binding to source port ranges during connect Some users are already hitting the 64k source port limit when connecting to servers. The system usually maintains a list of unused source ports, regardless of the source IP they're bound to. So in order to go beyond the 64k concurrent connections, we have to manage the source ip:port lists ourselves. The solution consists in assigning a source port range to each server and use a free port in that range when connecting to that server, either for a proxied connection or for a health check. The port must then be put back into the server's range when the connection is closed. This mechanism is used only when a port range is specified on a server. It makes it possible to reach 64k connections per server, possibly all from the same IP address. Right now it should be more than enough even for huge deployments.	2009-06-10 12:23:32 +02:00
Willy Tarreau	43b78999ec	[MEDIUM] move global tuning options to the global structure The global tuning options right now only concern the polling mechanisms, and they are not in the global struct itself. It's not very practical to add other options so let's move them to the global struct and remove types/polling.h which was not used for anything else.	2009-01-25 15:42:27 +01:00
Willy Tarreau	3eba98aa57	[MEDIUM] splice: make use of pipe pools Using pipe pools makes pipe management a lot easier. It also allows to remove quite a bunch of #ifdefs in areas which depended on the presence or not of support for kernel splicing. The buffer now holds a pointer to a pipe structure which is always NULL except if there are still data in the pipe. When it needs to use that pipe, it dynamically allocates it from the pipe pool. When the data is consumed, the pipe is immediately released. That way, there is no need anymore to care about pipe closure upon session termination, nor about pipe creation when trying to use splice(). Another immediate advantage of this method is that it considerably reduces the number of pipes needed to use splice(). Tests have shown that even with 0.2 pipe per connection, almost all sessions can use splice(), because the same pipe may be used by several consecutive calls to splice().	2009-01-25 13:56:13 +01:00
Willy Tarreau	3ec79b9c42	[MINOR] global.maxpipes: add the ability to reserve file descriptors for pipes This will be needed to use linux's splice() syscall.	2009-01-18 20:39:42 +01:00
Willy Tarreau	ec6c5df018	[CLEANUP] remove many #include <types/xxx> from C files It should be stated as a rule that a C file should never include types/xxx.h when proto/xxx.h exists, as it gives less exposure to declaration conflicts (one of which was caught and fixed here) and it complicates the file headers for nothing. Only types/global.h, types/capture.h and types/polling.h have been found to be valid includes from C files.	2008-07-16 10:30:42 +02:00
Krzysztof Piotr Oledzki	a643baf091	[MEDIUM] Fix memory freeing at exit New functions implemented: - deinit_pollers: called at the end of deinit() - prune_acl: called via list_for_each_entry_safe Add missing pool_destroy2 calls: - p->hdr_idx_pool - pool2_tree64 Implement all task stopping: - health-check: needs new "struct task" in the struct server - queue processing: queue_mgt - appsess_refresh: appsession_refresh before (idle system): ==6079== LEAK SUMMARY: ==6079== definitely lost: 1,112 bytes in 75 blocks. ==6079== indirectly lost: 53,356 bytes in 2,090 blocks. ==6079== possibly lost: 52 bytes in 1 blocks. ==6079== still reachable: 150,996 bytes in 504 blocks. ==6079== suppressed: 0 bytes in 0 blocks. after (idle system): ==6945== LEAK SUMMARY: ==6945== definitely lost: 7,644 bytes in 137 blocks. ==6945== indirectly lost: 9,913 bytes in 587 blocks. ==6945== possibly lost: 0 bytes in 0 blocks. ==6945== still reachable: 0 bytes in 0 blocks. ==6945== suppressed: 0 bytes in 0 blocks. before (running system for ~2m): ==9343== LEAK SUMMARY: ==9343== definitely lost: 1,112 bytes in 75 blocks. ==9343== indirectly lost: 54,199 bytes in 2,122 blocks. ==9343== possibly lost: 52 bytes in 1 blocks. ==9343== still reachable: 151,128 bytes in 509 blocks. ==9343== suppressed: 0 bytes in 0 blocks. after (running system for ~2m): ==11616== LEAK SUMMARY: ==11616== definitely lost: 7,644 bytes in 137 blocks. ==11616== indirectly lost: 9,981 bytes in 591 blocks. ==11616== possibly lost: 0 bytes in 0 blocks. ==11616== still reachable: 4 bytes in 1 blocks. ==11616== suppressed: 0 bytes in 0 blocks. Still not perfect but significant improvement.	2008-05-30 07:07:19 +02:00
Willy Tarreau	ef1d1f859b	[MAJOR] auto-registering of pollers at load time Gcc provides __attribute__((constructor)) which is very convenient to execute functions at startup right before main(). All the pollers have been converted to have their register() function declared like this, so that it is not necessary anymore to call them from a centralized file.	2007-04-16 00:25:25 +02:00
Willy Tarreau	2ff7622c0c	[MAJOR] delay registering of listener sockets at startup Some pollers such as kqueue lose their FD across fork(), meaning that the registered file descriptors are lost too. Now when the proxies are started by start_proxies(), the file descriptors are not registered yet, leaving enough time for the fork() to take place and to get a new pollfd. It will be the first call to maintain_proxies that will register them.	2007-04-09 19:29:56 +02:00
Willy Tarreau	e1a7a2f0d8	[MAJOR] kqueue was not initialized during startup	2007-04-09 16:11:49 +02:00
Willy Tarreau	4f60f16dd3	[MAJOR] modularize the polling mechanisms select, poll and epoll now have their dedicated functions and have been split into distinct files. Several FD manipulation primitives have been provided with each poller. The rest of the code needs to be cleaned to remove traces of StaticReadEvent/StaticWriteEvent. A trick involving a macro has temporarily been used right now. Some work needs to be done to factorize tests and sets everywhere.	2007-04-08 16:39:58 +02:00
Willy Tarreau	b3107b9383	[MINOR] pollers should not use MY_FD_*	2007-04-08 09:32:47 +02:00
Willy Tarreau	1001b949ee	[CLEANUP] fd.c : regparm was hardcoded too.	2006-10-15 23:10:10 +02:00
Willy Tarreau	2a429503e0	[MINOR] turn every FD_* into functions On recent CPUs, functions are about twice as fast as inline FD_*, so there is now a #define CONFIG_HAP_INLINE_FD_SET to choose between the two modes.	2006-10-15 14:53:07 +02:00
Willy Tarreau	f8306d5391	[MEDIUM] got rid of event_{cli,srv}_write() in favor of stream_sock_write() The timeouts, expiration timers and results are now stored in the buffers. The timers will have to change a bit to become more flexible, and when the I/O completion functions will be written, the connect_complete() will have to be extracted from the write() function.	2006-07-29 19:01:31 +02:00
Willy Tarreau	d797128d6e	[MEDIUM] got rid of event_{cli,srv}_read() in favor of stream_sock_read()	2006-07-29 18:36:34 +02:00
Willy Tarreau	5446940e37	[MEDIUM] started the changes towards I/O completion callbacks Now the event_* functions find their buffer in the fdtab itself.	2006-07-29 16:59:06 +02:00
Willy Tarreau	2dd0d4799e	[CLEANUP] renamed include/haproxy to include/common	2006-06-29 17:53:05 +02:00
Willy Tarreau	baaee00406	[BIGMOVE] exploded the monolithic haproxy.c file into multiple files. The files are now stored under : - include/haproxy for the generic includes - include/types.h for the structures needed within prototypes - include/proto.h for function prototypes and inline functions - src/*.c for the C files Most include files are now covered by LGPL. A last move still needs to be done to put inline functions under GPL and not LGPL. Version has been set to 1.3.0 in the code but some control still needs to be done before releasing.	2006-06-26 02:48:02 +02:00


199 Commits