haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-21 22:51:23 +02:00

Author	SHA1	Message	Date
Willy Tarreau	3cfaa8d1e0	BUG/MEDIUM: task: bound the number of tasks picked from the wait queue at once There is a theorical problem in the wait queue, which is that with many threads, one could spend a lot of time looping on the newly expired tasks, causing a lot of contention on the global wq_lock and on the global rq_lock. This initially sounds bening, but if another thread does just a task_schedule() or task_queue(), it might end up waiting for a long time on this lock, and this wait time will count on its execution budget, degrading the end user's experience and possibly risking to trigger the watchdog if that lasts too long. The simplest (and backportable) solution here consists in bounding the number of expired tasks that may be picked from the global wait queue at once by a thread, given that all other ones will do it as well anyway. We don't need to pick more than global.tune.runqueue_depth tasks at once as we won't process more, so this counter is updated for both the local and the global queues: threads with more local expired tasks will pick less global tasks and conversely, keeping the load balanced between all threads. This will guarantee a much lower latency if/when wakeup storms happen (e.g. hundreds of thousands of synchronized health checks). Note that some crashes have been witnessed with 1/4 of the threads in wake_expired_tasks() and, while the issue might or might not be related, not having reasonable bounds here definitely justifies why we can spend so much time there. This patch should be backported, probably as far as 2.0 (maybe with some adaptations).	2020-10-16 15:18:48 +02:00
Willy Tarreau	ba29687bc1	BUG/MEDIUM: proxy: properly stop backends The proxy stopping mechanism was changed with commit 322b9b94e ("MEDIUM: proxy: make stop_proxy() now use stop_listener()") so that it's now entirely driven by the listeners. One thing was forgotten though, which is that pure backends will not stop anymore since they don't have any listener, and that it's necessary to stop them in order to stop the health checks. No backport is needed.	2020-10-16 15:16:17 +02:00
Willy Tarreau	233ad288cd	CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind() We don't need to specify the handler anymore since it's set in the receiver. Let's remove this argument from the function and clean up the remains of code that were still setting it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	a74cb38e7c	MINOR: protocol: register the receiver's I/O handler and not the protocol's Now we define a new sock_accept_iocb() for socket-based stream protocols and use it as a wrapper for listener_accept() which now takes a listener and not an FD anymore. This will allow the receiver's I/O cb to be redefined during registration, and more specifically to get rid of the hard-coded hacks in protocol_bind_all() made for syslog. The previous ->accept() callback in the protocol was removed since it doesn't have anything to do with accept() anymore but is more generic. A few places where listener_accept() was compared against the FD's IO callback for debugging purposes on the CLI were updated.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e140a6921f	MINOR: log: set the UDP receiver's I/O handler in the receiver The I/O handler is syslog_fd_handler(), let's set it when creating the receivers.	2020-10-15 21:47:56 +02:00
Willy Tarreau	d2fb99f9d5	MINOR: protocol: add a default I/O callback and put it into the receiver For now we're still using the protocol's default accept() function as the I/O callback registered by the receiver into the poller. While this is usable for most TCP connections where a listener is needed, this is not suitable for UDP where a different handler is needed. Let's make this configurable in the receiver just like the upper layer is configurable for listeners. In order to ease stream protocols handling, the protocols will now provide a default I/O callback which will be preset into the receivers upon allocation so that almost none of them has to deal with it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	caa91de718	MEDIUM: listener: remove the second pass of fd manipulation at the end The receiver FDs must not be manipulated by the listener_accept() function anymore, it must exclusively rely on the job performed by its listeners, as it is also the only way to keep the receivers working for established connections regardless of the listener's state (typically for multiplexed protocols like QUIC). This used to be necessary when the FDs were adjusted at once only but now that fd_done() is gone and the need for polling enabled by the accept_conn() function which detects the EAGAIN, we have nothing to do there to fixup any possible previous bad decision anymore. Interestingly, as a side effect of making the code not depend on the FD anymore, it also removes the need for a second lock, which increase the accept rate by about 1% on 8 threads.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9378bbe0be	MEDIUM: listener: use protocol->accept_conn() to accept a connection Now listener_accept() doesn't have to deal with the incoming FD anymore (except for a little bit of side band stuff). It directly retrieves a valid connection from the protocol layer, or receives a well-defined error code that helps it decide how to proceed. This removes a lot of hardly maintainable low-level code and opens the function to receive new protocol stacks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	344b8fcf87	MINOR: sockpair: implement sockpair_accept_conn() to accept a connection This is the same as previous commit, but this time for the sockpair- specific stuff, relying on recv_fd_uxst() instead of accept(), so the code is simpler. The various errno cases are handled like for regular sockets, though some of them will probably never happen, but this does not hurt.	2020-10-15 21:47:56 +02:00
Willy Tarreau	f1dc9f2f17	MINOR: sock: implement sock_accept_conn() to accept a connection The socket-specific accept() code in listener_accept() has nothing to do there. Let's move it to sock.c where it can be significantly cleaned up. It will now directly return an accepted connection and provide a status code instead of letting listener_accept() deal with various errno values. Note that this doesn't support the sockpair specific code. The function is now responsible for dealing with its own receiver's polling state and calling fd_cant_recv() when facing EAGAIN. One tiny change from the previous implementation is that the connection's sockaddr is now allocated before trying accept(), which saves a memcpy() of the resulting address for each accept at the expense of a cheap pool_alloc/pool_free on the final accept returning EAGAIN. This still apparently slightly improves accept performance in microbencharks.	2020-10-15 21:47:56 +02:00
Willy Tarreau	7d053e4211	MINOR: sock: rename sock_accept_conn() to sock_accepting_conn() This call was introduced by commit 5ced3e887 ("MINOR: sock: add sock_accept_conn() to test a listening socket") but is actually quite confusing because it makes one think the socket will accept a connection (which is what we want to have in a new function) while it only tells whether it's configured to accept connections. Let's call it sock_accepting_conn() instead. The same change was applied to sockpair which had the same issue.	2020-10-15 21:47:56 +02:00
Willy Tarreau	01ca149047	MINOR: session: simplify error path in session_accept_fd() Now that this function is always called with an initialized connection and that the control layer is always initialized, we don't need to play games with fdtab[] to decide how to close, we can simply rely on the regular close path using conn_ctrl_close(), which can be fused with conn_xprt_close() into conn_full_close(). The code is cleaner because the FD is now used only for some protocol-specific setup (that will eventually have to move) and to try to send a hard-coded HTTP 500 error message on raw sockets.	2020-10-15 21:47:56 +02:00
Willy Tarreau	83efc320aa	MEDIUM: listener: allocate the connection before queuing a new connection Till now we would keep a per-thread queue of pending incoming connections for which we would store: - the listener - the accepted FD - the source address - the source address' length And these elements were first used in session_accept_fd() running on the target thread to allocate a connection and duplicate them again. Doing this induces various problems. The first one is that session_accept_fd() may only run on file descriptors and cannot be reused for QUIC. The second issue is that it induces lots of memory copies and that the listerner queue thrashes a lot of cache, consuming 64 bytes per entry. This patch changes this by allocating the connection before queueing it, and by only placing the connection's pointer into the queue. Indeed, the first two calls used to initialize the connection already store all the information above, which can be retrieved from the connection pointer alone. So we just have to pop one pointer from the target thread, and pass it to session_accept_fd() which only needs the FD for the final settings. This starts to make the accept path a bit more transport-agnostic, and saves memory and CPU cycles at the same time (1% connection rate increase was noticed with 4 threads). Thanks to dividing the accept-queue entry size from 64 to 8 bytes, its size could be increased from 256 to 1024 connections while still dividing the overall size by two. No single queue full condition was met. One minor drawback is that connection may be allocated from one thread's pool to be used into another one. But this already happens a lot with connection reuse so there is really nothing new here.	2020-10-15 21:47:56 +02:00
Willy Tarreau	9b7587a6af	MINOR: connection: make sockaddr_alloc() take the address to be copied Roughly half of the calls to sockadr_alloc() are made to copy an already known address. Let's optionally pass it in argument so that the function can handle the copy at the same time, this slightly simplifies its usage.	2020-10-15 21:47:56 +02:00
Willy Tarreau	0138f51f93	CLEANUP: fd: finally get rid of fd_done_recv() fd_done_recv() used to be useful with the FD cache because it used to allow to keep a file descriptor active in the poller without being marked as ready in the cache, saving it from ringing immediately, without incurring any system call. It was a way to make it yield to wait for new events leaving a bit of time for others. The only user left was the connection accepter (listen_accept()). We used to suspect that with the FD cache removal it had become totally useless since changing its readiness or not wouldn't change its status regarding the poller itself, which would be the only one deciding to report it again. Careful tests showed that it indeed has exactly zero effect nowadays, the syscall numbers are exactly the same with and without, including when enabling edge-triggered polling. Given that there's no more API available to manipulate it and that it was directly called as an optimization from listener_accept(), it's about time to remove it.	2020-10-15 21:47:56 +02:00
Willy Tarreau	e53e7ec9d9	CLEANUP: protocol: remove the ->drain() function No protocol defines it anymore. The last user used to be the monitor-net stuff that got partially broken already when the tcp_drain() function moved to conn_sock_drain() with commit e215bba95 ("MINOR: connection: make conn_sock_drain() work for all socket families") in 1.9-dev2. A part of this will surely move back later when non-socket connections arrive with QUIC but better keep the API clean and implement what's needed in time instead.	2020-10-15 21:47:04 +02:00
Willy Tarreau	9e9919dd8b	MEDIUM: proxy: remove obsolete "monitor-net" As discussed here during 2.1-dev, "monitor-net" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200 if ..." and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Willy Tarreau	77e0daef9f	MEDIUM: proxy: remove obsolete "mode health" As discussed here during 2.1-dev, "mode health" is totally obsolete: https://www.mail-archive.com/haproxy@formilux.org/msg35204.html It's fundamentally incompatible with usage of SSL, doesn't support source filtering, and imposes the presence of file descriptors with hard-coded syscalls directly in the generic accept path. It's very unlikely that anyone has used it in the last 10 years for anything beyond testing. In the worst case if anyone would depend on it, replacing it with "http-request return status 200" and "mode http" would certainly do the trick. The keyword is still detected as special by the config parser to help users update their configurations appropriately.	2020-10-15 21:47:04 +02:00
Amaury Denoyelle	46f041d7f8	MEDIUM: fcgi: remove conn from session on detach FCGI mux is marked with HOL blocking. On safe reuse mode, the connection using it are placed on the sessions instead of the available lists to avoid sharing it with several clients. On detach, if they are no more streams, remove the connection from the session before adding it to the idle list. If there is still used streams, do not add it to available list as it should be already on the session list.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	6b8daef56b	MEDIUM: h2: remove conn from session on detach H2 mux is marked with HOL blocking. On safe reuse mode, the connection using it are placed on the sessions instead of the available lists to avoid sharing it with several clients. On detach, if they are no more streams, remove the connection from the session before adding it to the idle list. If there is still used streams, do not add it to available list as it should be already on the session list.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	0d21deaded	MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking If a connection is using a mux protocol subject to HOL blocking, add it to the session instead of the available list to avoid sharing it with other clients on connection reuse.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	00464ab8f4	MEDIUM: backend: add new conn to session if mux marked as HOL blocking When allocating a new session on connect_server, if the mux protocol is marked as subject of HOL blocking, add it into session instead of available list to avoid sharing it with other clients.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	3d3c0918dc	MINOR: mux/connection: add a new mux flag for HOL risk This flag is used to indicate if the mux protocol is subject to head-of-line blocking problem.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	9c13b62b47	BUG/MEDIUM: connection: fix srv idle count on conn takeover On server connection migration from one thread to another, the wrong idle thread-specific counter is decremented. This bug was introduced since commit 3d52f0f1f828acb2a74b3f13fcc3fa069106d09f due to the factorization with srv_use_idle_conn. However, this statement is only executed from conn_backend_get. Extract the decrement from srv_use_idle_conn in conn_backend_get and use the correct thread-specific counter. Rename the function to srv_use_conn to better reflect its purpose as it is also used with a newly initialized connection not in the idle list. As a side change, the connection insertion to available list has also been extracted to conn_backend_get. This will be useful to be able to specify an alternative list for protocol subject to HOL risk that should not be shared between several clients. This bug is only present in this release and thus do not need a backport.	2020-10-15 15:19:34 +02:00
Amaury Denoyelle	5f1ded5629	BUG/MINOR: connection: fix loop iter on connection takeover The loop always missed one iteration due to the incrementation done on the for check. Move the incrementation on the loop last statement to fix this behaviour. This bug has a very limited impact, not at all visible to the user, but could be backported to 2.2.	2020-10-15 15:19:25 +02:00
Willy Tarreau	1a3770cbc7	BUG/MEDIUM: deinit: check fdtab before fdtab[fd].owner When running a pure config check (haproxy -c) we go through the deinit phase without having allocated fdtab, so we can't blindly dereference it. The issue was added by recent commit ae7bc4a23 ("MEDIUM: deinit: close all receivers/listeners before scanning proxies"), no backport is needed.	2020-10-14 12:13:51 +02:00
Willy Tarreau	2f6f362756	CLEANUP: protocol: intitialize all of the sockaddr when disconnecting In issue #894, Coverity suspects uninitialized values for a socket's address whose family is AF_UNSPEC but it doesn't know that the address is not used in this case. It's not on a critical path and working around it is trivial, let's fully declare the address. We're doing it for both TCP and UDP, because the same principle appears at two places.	2020-10-14 10:54:15 +02:00
Willy Tarreau	258b351704	BUG/MINOR: listener: detect and handle shared sockets stopped in other processes It may happen that during a temporary listener pause resulting from a SIGTTOU, one process gets one of its sockets disabled by another process and will not be able to recover from this situation by itself. For the protocols supporting this (TCPv4 and TCPv6 at the moment) this situation is detectable, so when this happens, let's put the listener into the PAUSED state so that it remains consistent with the real socket state. One nice effect is that just sending the SIGTTIN signal to the process is enough to recover the socket in this case. There is no need to backport this, this behavior has been there forever and the fix requires to reimplement the getsockopt() call there.	2020-10-13 18:15:33 +02:00
Willy Tarreau	85d2ba6b78	CLEANUP: unix: make use of sock_accept_conn() where relevant This allows to get rid of one getsockopt(SO_ACCEPTCONN) in the binding code.	2020-10-13 18:15:33 +02:00
Willy Tarreau	3e12de2cc6	CLEANUP: tcp: make use of sock_accept_conn() where relevant This allows to get rid of two getsockopt(SO_ACCEPTCONN).	2020-10-13 18:15:33 +02:00
Willy Tarreau	cc8b653483	MINOR: sockpair: implement the .rx_listening function For socket pairs we don't rely on a real listening socket but we need to have a properly connected UNIX stream socket. This is what the new sockpair_accept_conn() tries to report. Some corner cases like half shutdown will still not be detected but that should be sufficient for most cases we really care about.	2020-10-13 18:15:33 +02:00
Willy Tarreau	29185140db	MINOR: protocol: make proto_tcp & proto_uxst report listening sockets Now we introdce a new .rx_listening() function to report if a receiver is actually a listening socket. The reason for this is to help detect shared sockets that might have been broken by sibling processes.	2020-10-13 18:15:33 +02:00
Willy Tarreau	5ced3e8879	MINOR: sock: add sock_accept_conn() to test a listening socket At several places we need to check if a socket is still valid and still willing to accept connections. Instead of open-coding this, each time, let's add a new function for this.	2020-10-13 18:15:33 +02:00
Willy Tarreau	8b6fc3d10e	MINOR: proto-tcp: make use of connect(AF_UNSPEC) for the pause Currently the suspend/resume mechanism for listeners only works on Linux and we resort to a number of tricks involving shutdown+listen+shutdown to try to detect failures on other operating systems that do not support it. But on Linux connect(AF_UNSPEC) also works pretty well and is much cleaner. It still doesn't work on other operating systems but the error is easier to detect and appears safer. So let's switch to this.	2020-10-13 18:15:33 +02:00
Willy Tarreau	7c9f756dcc	MINOR: fd: report an error message when failing initial allocations When starting with a huge maxconn (say 1 billion), the only error seen is "No polling mechanism available". This doesn't help at all to resolve the problem. Let's add specific alerts for the failed mallocs. Now we can get this instead: [ALERT] 286/154439 (23408) : Not enough memory to allocate 2000000033 entries for fdtab! This may be backported as far as 2.0 as it helps debugging bad configurations.	2020-10-13 18:15:33 +02:00
Willy Tarreau	b1e600c9c5	BUG/MINOR: mux-h2: do not stop outgoing connections on stopping There are reports of a few "SC" in logs during reloads when H2 is used on the backend side. Christopher analysed this as being caused by the proxy disabled test in h2_process(). As the comment says, this was done for frontends only, and must absolutely not send a GOAWAY to the backend, as all it will result in is to make newly queued streams fail. The fix consists in simply testing the connection side before deciding to send the GOAWAY. This may be backported as far as 2.0, though for whatever reason it seems to manifest itself only since 2.2 (probably due to changes in the outgoing connection setup sequence).	2020-10-13 18:15:33 +02:00
Willy Tarreau	2bd0f8147b	BUG/MINOR: init: only keep rlim_fd_cur if max is unlimited On some operating systems, RLIM_INFINITY is set to -1 so that when the hard limit on the number of FDs is set to unlimited, taking the MAX of both values keeps rlim_fd_cur and everything works. But on other systems this values is defined as the highest positive integer. This is what was observed on a 32-bit AIX 5.1. The effect is that maxsock becomes 2^31-1 and that fdtab allocation fails. Note that a simple workaround consists in manually setting maxconn in the global section. Let's ignore unlimited as soon as we retrieve rlim_fd_max so that all systems behave consistently. This may be backported as far as 2.0, though it doesn't seem like it has annoyed anyone.	2020-10-13 15:36:08 +02:00
Fr�d�ric L�caille	3fc0fe05fd	MINOR: peers: heartbeat, collisions and handshake information for "show peers" command. This patch adds "coll" new counter and the heartbeat timer values to "show peers" command. It also adds the elapsed time since the last handshake to new "last_hdshk" new peer dump field.	2020-10-09 20:59:58 +02:00
Willy Tarreau	0a002df2c2	BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen When factoring out the pause/resume error messages in commit 775e00158 ("MAJOR: signals: use protocol_pause_all() and protocol_resume_all()") I forgot that ha_warning() and send_log() take a format string and not just a const string. No backport is needed, this is 2.3-dev.	2020-10-09 19:26:27 +02:00
Willy Tarreau	ccf429960b	MEDIUM: config: remove the deprecated and dangerous global "debug" directive This one was scheduled for removal in 2.3 since 2.2-dev3 by commit 1b85785bc ("MINOR: config: mark global.debug as deprecated"). Let's remove it now. It remains totally possible to use -d on the command line though.	2020-10-09 19:18:45 +02:00
Willy Tarreau	ab0a5192a8	MEDIUM: config: mark "grace" as deprecated This was introduced 15 years ago or so to delay the stopping of some services so that a monitoring device could detect its port being down before services were stopped. Since then, clean reloads were implemented and this doesn't cope well with reload at all, preventing the new process from seamlessly binding, and forcing processes to coexist with half-baked configurations. Now it has become a real problem because there's a significant code portion in the proxies that is solely dedicated to this obsolete feature, and dealing with its special cases eases the introduction of bugs in other places so it's about time that it goes. We could tentatively schedule its removal for 2.4 with a hard deadline for 2.5 in any case.	2020-10-09 19:07:01 +02:00
Willy Tarreau	e03204c8e1	MEDIUM: listeners: implement protocol level ->suspend/resume() calls Now we have ->suspend() and ->resume() for listeners at the protocol level. This means that it now becomes possible for a protocol to redefine its own way to suspend and resume. The default functions are provided for TCP, UDP and unix, and they are pass-through to the receiver equivalent as it used to be till now. Nothing was defined for sockpair since it does not need to suspend/resume during reloads, hence it will succeed.	2020-10-09 18:44:37 +02:00
Willy Tarreau	7b2febde1d	MINOR: listeners: split do_unbind_listener() in two The inner part now goes into the protocol and is used to decide how to unbind a given protocol's listener. The existing code which is able to also unbind the receiver was provided as a default function that we currently use everywhere. Some complex listeners like QUIC will use this to decide how to unbind without impacting existing connections, possibly by setting up other incoming paths for the traffic.	2020-10-09 18:44:37 +02:00
Willy Tarreau	f58b8db47b	MEDIUM: receivers: add an rx_unbind() method in the protocols This is used as a generic way to unbind a receiver at the end of do_unbind_listener(). This allows to considerably simplify that function since we can now let the protocol perform the cleanup. The generic code was moved to sock.c, along with the conditional rx_disable() call. Now the code also supports that the ->disable() function of the protocol which acts on the listener performs the close itself and adjusts the RX_F_BUOND flag accordingly.	2020-10-09 18:44:36 +02:00
Willy Tarreau	18c20d28d7	MINOR: listeners: move the LI_O_MWORKER flag to the receiver This listener flag indicates whether the receiver part of the listener is specific to the master or to the workers. In practice it's only used by the master's CLI right now. It's used to know whether or not the FD must be closed before forking the workers. For this reason it's way more of a receiver's property than a listener's property, so let's move it there under the name RX_F_MWORKER. The rest of the code remains unchanged.	2020-10-09 18:43:05 +02:00
Willy Tarreau	75c98d166e	CLEANUP: listeners: remove the do_close argument to unbind_listener() And also remove it from its callers. This subtle distinction was added as sort of a hack for the seamless reload feature but is not needed anymore since the do_close turned unused since commit previous commit ("MEDIUM: listener: let do_unbind_listener() decide whether to close or not"). This also removes the unbind_listener_no_close() function.	2020-10-09 18:41:56 +02:00
Willy Tarreau	374e9af358	MEDIUM: listener: let do_unbind_listener() decide whether to close or not The listener contains all the information needed to decide to close on unbind or not. The rule is the following (when we're not stopping): - worker process unbinding from a worker's FD with socket transfer enabled => keep - master process unbinding from a master's inherited FD => keep - master process unbinding from a master's FD => close - master process unbinding from a worker's FD => close - worker process unbinding from a master's FD => close - worker process unbinding from a worker's FD => close Let's translate that into the function and stop using the do_close argument that is a bit obscure for callers. It was not yet removed to ease code testing.	2020-10-09 18:41:48 +02:00
Willy Tarreau	87acd4e848	BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent BROKEN: the failure rate on reg-tests/seamless-reload/abns_socket.vtc has significantly increased for no obvious reason. It fails 99% of the time vs 10% before. do_unbind_listener() is not logical and is not even idempotent. It must not touch the fd if already -1, which also means not touch the receiver. In addition, when performing a partial stop on a socket (not closing), we know the socket remains in the listening state yet it's marked as LI_ASSIGNED, which is confusing as it doesn't translate its real state. With this change, we make sure that FDs marked for close end up in ASSIGNED state and that those which are really bound and on which a listen() was made (i.e. not pause) remain in LISTEN state. This is what is closest to reality. Ideally this function should become a default proto->unbind() one but it may still keep a bit too much state logic to become generalized to other protocols (e.g. QUIC).	2020-10-09 18:29:04 +02:00
Willy Tarreau	d6afb53bdc	MEDIUM: listeners: always close master vs worker listeners Right now in enable_listener(), we used to start all enabled listeners then kill from the workers those that were for the master. But this is incomplete. We must also close from the master the listeners that are solely for workers, and do it before we even start them. Otherwise we end up with a master responding to the worker CLI connections if the listener remains in listen mode to translate the socket's real state. It doesn't seem like it could have caused bugs in the past because we used to aggressively mark disabled listeners as LI_ASSIGNED despite the fact that they were still bound and listening. If this patch were ever seen as a candidate solution for any obscure bug, be careful in that it subtly relies on the fact that fd_delete() doesn't close inherited FDs anymore, otherwise that could break the master's ability to pass inherited FDs on reloads.	2020-10-09 18:29:04 +02:00
Willy Tarreau	95a3460739	MINOR: listener: add a few BUG_ON() statements to detect inconsistencies We must not have an fd==-1 when switching to certain states. This will later disappear but for now it helps detecting inconsistencies.	2020-10-09 18:29:04 +02:00

... 47 48 49 50 51 ...

12588 Commits