haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-19 05:31:26 +02:00

Author	SHA1	Message	Date
Willy Tarreau	08fa16e397	MINOR: raw_sock: make sure to disable polling once everything is sent Analysing traces revealed a rare but surprising pattern: connect() = -1 EAGAIN send() = success epoll_ctl(ADD, EPOLLOUT) epoll_wait() recvfrom() = success close() What happens is that the failed connect() creates an FD update for pollout, but the successful synchronous send() doesn't disable it because polling was only disabled in the FD handler. But a successful synchronous connect() cancellation is a good opportunity to disable polling before it's effectively enabled in the next loop, so better disable it when reaching the end. The cost is very low if it was already disabled anyway (one atomic op). This only affects local connections but with this the typical number of epoll_ctl() calls per connection dropped from ~4.2 to ~3.8 for plain TCP and 10k transfers.	2020-01-08 09:59:40 +01:00
Willy Tarreau	0eae6323bf	MEDIUM: dns: implement synchronous send In dns_send_query(), there's no point in first waking up the FD, to get called back by the poller to send the request and sleep. Instead let's simply send the request as soon as it's known and only subscribe to the poller when the socket buffers are full and it's required to poll (i.e. almost never). This significantly reduces the number of calls to the poller. A large config sees the number of epoll_ctl() calls reduced from 577 to 7 over 10 seconds, the number of recvfrom() from 1533 to 582 and the number of sendto() from 369 to 162. It also has the extra benefit of building each request only once per resolution and sending it to multiple resolvers instead of rebuilding it for each and every resolver. This will reduce the risk of seeing situations similar to bug #416 in the future.	2020-01-08 06:10:38 +01:00
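The "send first, poll only when needed" idea from the commit above can be sketched roughly as follows. This is an illustrative Python model, not HAProxy's C code: `send_or_subscribe` is a hypothetical helper, and a local datagram socket pair stands in for a resolver socket.

```python
import selectors
import socket

def send_or_subscribe(sock, payload, selector):
    """Try to send right away; register for writability only if the kernel refuses.

    Hypothetical helper: it models the pattern, not dns_send_query() itself.
    """
    try:
        sock.send(payload)
        return True  # sent synchronously, the poller was never involved
    except BlockingIOError:
        # Socket buffer full: only now ask the poller to tell us when to retry.
        selector.register(sock, selectors.EVENT_WRITE, data=payload)
        return False

# Usage: the socket buffer is empty, so the send succeeds immediately.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.setblocking(False)
sel = selectors.DefaultSelector()
sent = send_or_subscribe(a, b"query", sel)
received = b.recv(64)
```

In the common case nothing is ever registered with the selector, which is the effect the commit message describes as a drop in epoll_ctl() calls.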
Willy Tarreau	e5891ca6c1	BUG/MEDIUM: session: do not report a failure when rejecting a session In session_accept_fd() we can perform a synchronous call to conn_complete_session() and if it succeeds the connection is accepted and turned into a session. If it fails we take it as an error while it is not, in this case, it's just that a tcp-request rule has decided to reject the incoming connection. The problem with reporting such an event as an error is that the failed status is passed down to the listener code which decides to disable accept() for 100ms in order to leave some time for transient issues to vanish, and that's not what we want to do here. This fix must be backported as far as 1.7. In 1.7 the code is a bit different as tcp_exec_l5_rules() is called directly from within session_new_fd() and ret=0 must be assigned there.	2020-01-07 18:15:32 +01:00
Christopher Faulet	584348be63	BUG/MINOR: channel: inject output data at the end of output In co_inject(), data must be inserted at the end of output, not the end of input. For the record, this function does not take care of input data which are supposed to not exist. But the caller may reset input data after or before the call. It is its own choice. This bug, among other effects, is visible when a redirect is performed on the response path, on legacy HTTP mode (so for HAProxy < 2.1). The redirect response is appended after the server response when it should overwrite it. Thanks to Kevin Zhu <ip0tcp@gmail.com> for reporting the bug. It must be backported as far as 1.9.	2020-01-07 10:51:15 +01:00
Kevin Zhu	96b363963f	BUG/MEDIUM: http-ana: Truncate the response when a redirect rule is applied When a redirect rule is executed on the response path, we must truncate the received response. Otherwise, the redirect is appended after the response, which is sent to the client. So it is obviously a bug because the redirect is not performed. With bodyless responses, it is the "only" bug. But if the response has a body, the result may be invalid. If the payload is not fully received yet when the redirect is performed, an internal error is reported. It must be backported as far as 1.9.	2020-01-07 10:50:28 +01:00
Christopher Faulet	47a7210b9d	BUG/MINOR: proxy: Fix input data copy when an error is captured In proxy_capture_error(), input data are copied in the error snapshot. The copy must take care of the data wrapping. But the length of the first block is wrong. It should be the amount of contiguous input data that can be copied starting from the input's beginning. But the minimum between the input length and the buffer size minus the input length is used instead. So it is a problem if input data are wrapping or if more than half of the buffer is used by input data. This patch must be backported as far as 1.9.	2020-01-06 13:58:30 +01:00
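A small model of the wrapping copy may make the two formulas easier to compare. This is a Python sketch under assumptions: the function names and the `head` offset parameter are hypothetical and do not mirror the actual C buffer API.

```python
def first_block_len_buggy(size, head, length):
    # Formula the commit message describes as wrong: it ignores where input starts.
    return min(length, size - length)

def first_block_len_fixed(size, head, length):
    # Contiguous bytes from the start of input up to the end of the storage area.
    return min(length, size - head)

def snapshot(buf, head, length):
    """Copy `length` bytes starting at offset `head` out of circular storage `buf`."""
    first = first_block_len_fixed(len(buf), head, length)
    # First block runs to the end of storage, the remainder wraps to offset 0.
    return buf[head:head + first] + buf[:length - first]

# 8-byte storage whose input "ABCDE" starts at offset 6 and wraps around.
buf = bytearray(b"CDE...AB")
data = snapshot(buf, head=6, length=5)
```

With these values the buggy formula yields a first block of 3 bytes although only 2 are contiguous, which is the wrapping case the commit message calls out.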
Christopher Faulet	1703478e2d	BUG/MINOR: h1: Report the right error position when a header value is invalid During H1 messages parsing, when the parser has finished parsing a full header line, some tests are performed on its value, depending on its name, to be sure it is valid. The content-length is checked and converted to an integer and the host header is also checked. If an error occurs during this step, the error position must point to the header value. But from the parser's point of view, we are already at the start of the next header. Thus the effective reported position in the error capture is the beginning of the unparsed header line. It is a bit confusing when we try to figure out why a message is rejected. Now, the parser state is updated to point to the invalid value. This way, the error position really points to the right position. This patch must be backported as far as 1.9.	2020-01-06 13:58:21 +01:00
Olivier Houchard	7f4f7f140f	MINOR: ssl: Remove unused variable "need_out". The "need_out" variable was used to let the ssl code know we're done reading early data, and we should start the handshake. Now that the handshake function is responsible for taking care of reading early data, all that logic has been removed from ssl_sock_to_buf(), but need_out was forgotten, and left. Remove it now. This patch was submitted by William Dauchy <w.dauchy@criteo.com>, and should fix github issue #434. This should be backported to 2.0 and 2.1.	2020-01-05 16:45:14 +01:00
William Dauchy	3894d97fb8	MINOR: config: disable busy polling on old processes In the context of seamless reload and busy polling, older processes will create unnecessary CPU conflicts; we can assume there is no need for busy polling for old processes which are waiting to be terminated. This patch is not a bug fix itself but might be a good stability improvement when you are in the context of frequent seamless reloads with a high "hard-stop-after" value; for that reason I think this patch should be backported in all 2.x versions. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2020-01-02 10:29:49 +01:00
Olivier Houchard	140237471e	BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection. In connect_server(), when we decide we want to kill the connection of another thread because there are too many idle connections, hold the toremove_lock of the corresponding thread, otherwise, there's a small race condition where we could try to add the connection to the toremove_connections list while it has already been free'd. This should be backported to 2.0 and 2.1.	2019-12-30 18:18:28 +01:00
Olivier Houchard	37d7897aaf	BUG/MEDIUM: checks: Only attempt to do handshakes if the connection is ready. When creating a new check connection, only attempt to add a handshake connection if the connection has fully been initialized. It cannot be the case if a DNS resolution is still pending, and thus we don't yet have the address for the server, as the handshake code assumes the connection is fully initialized and would otherwise crash. This is not ideal, the check probably shouldn't run until we have an address, as it leads to check failures with "Socket error". While I'm there, also add an xprt handshake if we're using socks4, otherwise checks wouldn't be able to use socks4 properly. This should fix github issue #430. This should be backported to 2.0 and 2.1.	2019-12-30 15:18:16 +01:00
Willy Tarreau	5d7dcc2a8e	OPTIM: epoll: always poll for recv if neither active nor ready The cost of enabling polling in one direction with epoll is very high because it requires one syscall per FD and per direction change. In addition we don't know about input readiness until we either try to receive() or enable polling and watch the result. With HTTP keep-alive, both are equally expensive as it's very uncommon to see the server instantly respond (unless it's a second stage of the same process on localhost, which has become much less common with threads). But when a connection is established it's also quite usual to have to poll for sending (except on localhost or UNIX sockets where it almost always instantly works). So this cost of polling could be factored out with the second step if both were enabled together. This is the idea behind this patch. What it does is to always enable polling for Rx if it's not ready and at least one direction is active. This means that if it's not explicitly disabled, or if it was but in a state that causes the loss of the information (rx ready cannot be guessed), then let's take any opportunity for a polling change to enable it at the same time, and learn about rx readiness for free. In addition the FD never gets unregistered for Rx unless it's ready and was blocked (buffer full). This avoids a lot of the flip-flop behaviour at beginning and end of requests. 
On a test with 10k requests in keep-alive, the difference is quite noticeable:

Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.67    0.010847           0     20078           epoll_ctl
 16.33    0.002117           0      2231           epoll_wait
  0.00    0.000000           0        20        20 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.012964                 22329        20 total

After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.35    0.003351           1      2644           epoll_wait
  2.36    0.000082           4        20        20 connect
  1.29    0.000045           0        66           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.003478                  2730        20 total

It may also save a recvfrom() after connect() by changing the following sequence, effectively saving one epoll_ctl() and one recvfrom():

before                       | after
-----------------------------+----------------------------
- connect()                  | - connect()
- epoll_ctl(add,out)         | - epoll_ctl(add, in|out)
- sendto()                   | - epoll_wait() = out
- epoll_ctl(mod,in|out)      | - send()
- epoll_wait() = out         | - epoll_wait() = in|out
- recvfrom() = EAGAIN        | - recvfrom() = OK
- epoll_ctl(mod,in)          | - recvfrom() = EAGAIN
- epoll_wait() = in          | - epoll_ctl(mod, in)
- recvfrom() = OK            | - epoll_wait()
- recvfrom() = EAGAIN        |
- epoll_wait()               |
(...)

Now on a 10M req test on 16 threads with 2k concurrent conns and 415kreq/s, we see 190k updates total and 14k epoll_ctl() only.	2019-12-27 16:38:47 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit 82967bf9 ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	cbcf77edb7	MINOR: connection: remove the double test on xprt_done_cb() The conn_fd_handler used to have one possible call to this function to notify about end of handshakes, and another one to notify about connection setup or error. But given that we're now only performing wakeup calls after connection validation, we don't need to keep two places to run this test since the conditions do not change in between. This patch merges the two tests into a single one and moves the CO_FL_CONNECTED test appropriately as well so that it's called even on the error path if needed.	2019-12-27 16:38:47 +01:00
Willy Tarreau	b2a7ab08a8	MINOR: connection: check for connection validation earlier In conn_fd_handler() we used to first give a chance to the send() callback to try to send data and validate the connection at the same time. But since 1.9 we do not call this callback anymore inline, it's scheduled. So let's validate the connection earlier so that all other decisions can be taken based on this confirmation. This may notably be useful to the xprt_done_cb() to know that the connection was properly validated.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	7deff246ce	MEDIUM: tcp: make tcp_connect_probe() consider ERR/HUP Now that we know what pollers can return ERR/HUP, we can take this into account to save one syscall: with such a poller, if neither are reported, then we know the connection succeeded and we don't need to go with getsockopt() nor connect() to validate this. In addition, for the remaining cases (select() or suspected errors), we'll always go through the extra connect() attempt and enumerate possible "in progress", "connected" or "failed" status codes and take action solely based on this. This results in one saved syscall on modern pollers, only a second connect() still being used on select() and the server's address never being needed anymore. Note that we cannot safely replace connect() with getsockopt() as the latter clears the error on the socket without saving it, and health checks rely on it for their reporting. This would be OK if the error was saved in the connection itself.	2019-12-27 16:38:04 +01:00
Willy Tarreau	11ef0837af	MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP In practice it's all pollers except select(). It turns out that we're keeping some legacy code only for select and enforcing it on all pollers, let's offer the pollers the ability to declare that they do not need that.	2019-12-27 14:04:33 +01:00
Willy Tarreau	8081abe26a	CLEANUP: connection: conn->xprt is never NULL Let's remove this outdated test that's been there since 1.5. For quite some time now xprt hasn't been NULL anymore on an initialized connection.	2019-12-27 14:04:33 +01:00
Willy Tarreau	70ccb2cddf	BUG/MINOR: connection: only wake send/recv callbacks if the FD is active Since commit c3df4507fa ("MEDIUM: connections: Wake the upper layer even if sending/receiving is disabled.") the send/recv callbacks are called on I/O if the FD is ready and not just if it's active. This means that in some situations (e.g. send ready but nothing to send) we may needlessly enter the if() block, notice we're not subscribed, set io_available=1 and call the wake() callback even if we're just called for read activity. Better make sure we only do this when the FD is active in that direction. This may be backported as far as 2.0 though it should remain under observation for a few weeks first as the risk of harm by a mistake is higher than the trouble it should cause.	2019-12-27 14:04:33 +01:00
Willy Tarreau	c8dc20a825	BUG/MINOR: checks: refine which errno values are really errors. Two regtests regularly fail in a random fashion depending on the machine's load (one could really wonder if it's really worth keeping such unreproducible tests): - tcp-check_multiple_ports.vtc - 4be_1srv_smtpchk_httpchk_layer47errors.vtc It happens that one of the reasons is the time it takes to connect to the local socket (hence the load-dependent aspect): if connect() on the loopback returns EINPROGRESS then this status is reported instead of a real error. Normally such a test is expected to see the error cleaned by tcp_connect_probe() but it really depends on the timing and instead we may very well send() first and see this error. The problem is that everything is collected based on errno, hoping it won't get molested in the way from the last unsuccessful syscall to wake_srv_chk(), which obviously is hard to guarantee. This patch at least makes sure that a few non-errors are reported as zero just like EAGAIN. It doesn't fix the root cause but makes it less likely to report incorrect failures. This fix could be backported as far as 1.9.	2019-12-27 14:04:33 +01:00
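The filtering described above amounts to mapping a handful of "not really an error" errno values to zero before they are reported. A minimal Python sketch follows. The exact set is an assumption: the commit message only names EAGAIN and EINPROGRESS, so EWOULDBLOCK and EINTR are added here purely for illustration, and `clean_errno` is a hypothetical name.

```python
import errno

# EAGAIN and EINPROGRESS come from the commit message; the other two are assumed.
NON_ERRORS = {errno.EAGAIN, errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EINTR}

def clean_errno(e):
    """Return 0 for transient statuses, and the original errno for real errors."""
    return 0 if e in NON_ERRORS else e

in_progress = clean_errno(errno.EINPROGRESS)   # transient: reported as 0
refused = clean_errno(errno.ECONNREFUSED)      # real failure: kept as-is
```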
Lukas Tribus	a26d1e1324	BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled with the no-deprecated option. Remove existing, incomplete guards and add a compatibility macro in openssl-compat.h, just as OpenSSL does: `bf4006a6f9/include/openssl/ssl.h (L1486)` This should be backported as far as 2.0 and probably even 1.9.	2019-12-21 06:46:55 +01:00
Christopher Faulet	eec7f8ac01	BUG/MEDIUM: stream: Be sure to never assign a TCP backend to an HTX stream With a TCP frontend, it is possible to upgrade a connection to HTTP when the backend is in HTTP mode. Concretely, the upgrade installs a new mux. So, once it is done, the downgrade to TCP is no longer possible. So we must take care to never assign a TCP backend to a stream on this connection. Otherwise, HAProxy crashes because raw data from the server are handled as structured data on the client side. This patch fixes the issue #420. It must be backported to all versions supporting the HTX.	2019-12-20 18:09:49 +01:00
Christopher Faulet	6716cc2b93	BUG/MAJOR: mux-h1: Don't pretend the input channel's buffer is full if empty A regression was introduced by the commit 76014fd1 ("MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state"). When nothing is copied in the channel's buffer when the input message is parsed, we erroneously pretend it is because there is not enough room by setting the CS_FL_WANT_ROOM flag on the conn-stream. This happens when a partial request is parsed. Because of this flag, we never try anymore to get input data from the mux because we first wait for more room in the channel's buffer, which is empty. Because of this bug, it is pretty easy to freeze a h1 connection. To fix the bug, we must obviously set the CS_FL_WANT_ROOM flag only when there are still data to transfer while the channel's buffer is not empty. This patch must be backported if the patch 76014fd1 is backported too. So for now, no backport needed.	2019-12-20 18:09:19 +01:00
Willy Tarreau	ca7a5af664	BUG/MINOR: state-file: do not leak memory on parse errors Issue #417 reports a possible memory leak in the state-file loading code. There's one such place in the loop which corresponds to parsing errors where the currently allocated line is not freed when dropped. In any case this is very minor in that no more than the file's length may be lost in the worst case, considering that the whole file is kept anyway in case of success. This fix addresses this. It should be backported to 2.1.	2019-12-20 17:33:05 +01:00
Willy Tarreau	fd1aa01f72	BUG/MINOR: state-file: do not store duplicates in the global tree The global state file tree isn't configured for unique keys, so if an entry appears multiple times, e.g. due to a bogus script that concatenates entries multiple times, this will needlessly eat memory. Let's just drop duplicates. This should be backported to 2.1.	2019-12-20 17:23:40 +01:00
Willy Tarreau	7d6a1fa311	BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry Starting haproxy with a state file of 700k servers eats 11.2 GB of RAM due to a mistake in the function that loads the strings into a tree: it allocates a full buffer for each backend+server name instead of allocating just the required string. By just fixing this we're down to 80 MB. This should be backported to 2.1.	2019-12-20 17:18:13 +01:00
Olivier Houchard	fc51f0f588	BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd There's a very hard-to-trigger bug in the FD list code where the fd_add_to_fd_list() function assumes that if the FD it's trying to add is already locked, it's in the process of being added. Unfortunately, it can also be in the process of being removed. It is very hard to trigger because it requires that one thread is removing the FD while another one is adding it. First very few FDs run on multiple threads (listeners and DNS), and second, it does not make sense to add and remove the FD at the same time. In practice the DNS code built on the older callback-only model does perform bursts of fd_want_send() for all resolvers at once when it wants to send a new query (dns_send_query()). And this is more likely to happen when there are lots of resolutions in parallel and many resolvers, because the dns_response_recv() callback can also trigger a series of queries on all resolvers for each invalid response it receives. This means that it really is perfectly possible to both stop and start in parallel during short periods of time there. This issue was not reported before 2.1, but 2.1 had the FD cache, built on the exact same code base. It's very possible that the issue caused exactly the opposite situation, where an event was occasionally lost, causing a DNS retry that worked, and nobody noticing the problem in the end. In 2.1 the lost entries are the updates asking for not polling for writes anymore, and the effect is that the poller continuously reports writability on the socket when the issue happens. This patch fixes bug #416 and must be backported as far as 1.8, and absolutely requires that previous commit "MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute" is backported as well otherwise it will make the issue worse. Special thanks to Julien Pivotto for setting up a reliable reproducer for this difficult issue.	2019-12-20 08:09:28 +01:00
Willy Tarreau	337fb719ee	MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute These macros are either used between atomic ops which cause the volatile to be implicit, or with an explicit volatile cast. However not having it in the macro causes some traps in the code because certain loop paths cannot safely be used without risking infinite loops if one isn't careful enough. Let's place the volatile attribute inside the macros and remove them from the explicit places to avoid this. It was verified that the output executable remains exactly the same byte-wise.	2019-12-20 08:09:28 +01:00
Olivier Houchard	54907bb848	BUG/MEDIUM: ssl: Revamp the way early data are handled. Instead of attempting to read the early data only when the upper layer asks for data, allocate a temporary buffer, stored in the ssl_sock_ctx, and put all the early data in there. Requiring that the upper layer takes care of it means that if for some reason the upper layer wants to emit data before it has totally read the early data, we will be stuck forever. This should be backported to 2.1 and 2.0. This may fix github issue #411.	2019-12-19 15:22:04 +01:00
Willy Tarreau	dd0e89a084	BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing Since 1.9 with commit b20aa9eef3 ("MAJOR: tasks: create per-thread wait queues") a task bound to a single thread will not use locks when being queued or dequeued because the wait queue is assumed to be the owner thread's. But there exists a rare situation where this is not true: the health check tasks may be running on one thread waiting for a response, and may in parallel be requeued by another thread calling health_adjust() after detecting a response error in traffic when "observe l7" is set, and "fastinter" is lower than "inter", requiring to shorten the running check's timeout. In this case, the task being requeued was present in another thread's wait queue, thus opening a race during task_unlink_wq(), and gets requeued into the calling thread's wait queue instead of the running one's, opening a second race here. This patch aims at protecting against the risk of calling task_unlink_wq() from one thread while the task is queued on another thread, hence unlocked, by introducing a new TASK_SHARED_WQ flag. This new flag indicates that a task's position in the wait queue may be adjusted by other threads than the one currently executing it. This means that such WQ manipulations must be performed under a lock. There are two types of such tasks: - the global ones, using the global wait queue (technically speaking, those whose thread_mask has at least 2 bits set). - some local ones, which for now will be placed into the global wait queue as well in order to benefit from its lock. The flag is automatically set on initialization if the task's thread mask indicates more than one thread. The caller must also set it if it intends to let other threads update the task's expiration delay (e.g. delegated I/Os), or if it intends to change the task's affinity over time as this could lead to the same situation. 
Right now only the situation described above seems to be affected by this issue, and it is very difficult to trigger, and even then, will often have no visible effect beyond stopping the checks for example once the race is met. On my laptop it is feasible with the following config, chained to httpterm:

    global
        maxconn 400 # provoke FD errors, calling health_adjust()

    defaults
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s

    listen px
        bind :8001
        option httpchk /?t=50
        server sback 127.0.0.1:8000 backup
        server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7

This patch will automatically address the case for the checks because check tasks are created with multiple threads bound and will get the TASK_SHARED_WQ flag set. If in the future more tasks need to rely on this (multi-threaded muxes for example) and the use of the global wait queue becomes a bottleneck again, then it should not be too difficult to place locks on the local wait queues and queue the task on its bound thread. This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task". Many thanks to William Dauchy for providing detailed traces allowing to spot the problem.	2019-12-19 14:42:22 +01:00
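The rule "the flag is automatically set on initialization if the task's thread mask indicates more than one thread" can be modelled in a few lines. This Python sketch uses a made-up bit value for TASK_SHARED_WQ and a hypothetical `initial_task_state` helper, so it illustrates the rule only and does not reproduce the C implementation.

```python
TASK_SHARED_WQ = 0x0008  # hypothetical bit value, for illustration only

def initial_task_state(thread_mask, state=0):
    """Set the shared-wait-queue flag when more than one thread bit is present."""
    if bin(thread_mask).count("1") > 1:
        # Several threads may adjust this task's wait-queue position: needs a lock.
        state |= TASK_SHARED_WQ
    return state

single = initial_task_state(0b0001)   # bound to one thread: no flag
shared = initial_task_state(0b0110)   # bound to two threads: flag set
```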
Willy Tarreau	8fe4253bf6	MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task After processing a task, its RUNNING bit is cleared and at the same time we check for other bits to decide whether to requeue the task or not. It happens that we only want to check the TASK_WOKEN_* bits, because : - TASK_RUNNING was just cleared - TASK_GLOBAL and TASK_QUEUE cannot be set yet as the task was running, preventing it from being requeued It's important not to catch yet undefined flags there because it would prevent addition of new task flags. This also shows more clearly that waking a task up with flags 0 is not something safe to do as the task will not be woken up if it's already running.	2019-12-19 14:42:22 +01:00
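The requeue decision described above is a plain bit-mask test. The following Python sketch uses invented flag values (the real ones live in HAProxy's C headers and are not given in this log), and TASK_FUTURE stands in for any flag that might be added later.

```python
# All bit values below are hypothetical and chosen only for this example.
TASK_RUNNING = 0x0001
TASK_FUTURE = 0x0008      # placeholder for a flag introduced by a later patch
TASK_WOKEN_IO = 0x0100
TASK_WOKEN_MSG = 0x0200
TASK_WOKEN_ANY = TASK_WOKEN_IO | TASK_WOKEN_MSG

def must_requeue(state):
    """Clear RUNNING, then look only at the WOKEN bits to decide on requeuing."""
    state &= ~TASK_RUNNING
    return bool(state & TASK_WOKEN_ANY)

woken = must_requeue(TASK_RUNNING | TASK_WOKEN_IO)      # woken while running
unrelated = must_requeue(TASK_RUNNING | TASK_FUTURE)    # new flag is ignored
```

Testing against the mask rather than against "any non-zero bit" is what keeps an unrelated new flag from triggering a requeue.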
Willy Tarreau	262c3f1a00	MINOR: http: add a new "replace-path" action This action is very similar to "replace-uri" except that it only acts on the path component. This is assumed to better match users' expectations when they used to rely on "replace-uri" in HTTP/1 because mostly origin forms were used in H1 while mostly absolute URI form is used in H2, and their rules very often start with a '/', and as such do not match. It could help users to get this backported to 2.0 and 2.1.	2019-12-19 09:24:57 +01:00
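The mismatch between origin-form and absolute-form URIs can be reproduced with a small model. This is a Python sketch, not HAProxy configuration: `replace_uri` and `replace_path` are hypothetical functions that only mimic "apply the regex to the whole URI" versus "apply it to the path component only", and the example URLs are invented.

```python
import re
from urllib.parse import urlsplit, urlunsplit

def replace_uri(uri, pattern, repl):
    # Model of a rule matched against the whole request URI.
    return re.sub(pattern, repl, uri, count=1)

def replace_path(uri, pattern, repl):
    # Model of a rule matched against the path component only.
    parts = urlsplit(uri)
    new_path = re.sub(pattern, repl, parts.path, count=1)
    return urlunsplit(parts._replace(path=new_path))

rule = (r"^/foo/(.*)", r"/bar/\1")
h1_uri = "/foo/index.html"                      # origin form, typical of H1
h2_uri = "https://example.com/foo/index.html"   # absolute form, typical of H2

uri_on_h1 = replace_uri(h1_uri, *rule)
uri_on_h2 = replace_uri(h2_uri, *rule)    # pattern starts with '/', no match
path_on_h2 = replace_path(h2_uri, *rule)
```

The rule anchored on a leading '/' rewrites the origin-form URI but leaves the absolute-form one untouched, whereas the path-only variant rewrites both.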
Willy Tarreau	0851fd5eef	MINOR: debug: support logging to various sinks As discussed in the thread below [1], the debug converter is currently not of much use given that it's only built when DEBUG_EXPR is set, and it is limited to stderr only. This patch changes this to make it take an optional prefix and an optional target sink so that it can log to stdout, stderr or a ring buffer. The default output is the "buf0" ring buffer, that can be consulted from the CLI. [1] https://www.mail-archive.com/haproxy@formilux.org/msg35671.html Note: if this patch is backported, it also requires the following commit to work: 46dfd78cbf ("BUG/MINOR: sample: always check converters' arguments").	2019-12-19 09:19:13 +01:00
William Lallemand	ba22e901b3	BUG/MINOR: ssl/cli: fix build for openssl < 1.0.2 Commit d4f946c ("MINOR: ssl/cli: 'show ssl cert' give information on the certificates") introduced a build issue with openssl version < 1.0.2 because it uses the certificate bundles.	2019-12-18 20:40:20 +01:00
William Lallemand	d4f946c469	MINOR: ssl/cli: 'show ssl cert' give information on the certificates Implement the 'show ssl cert' command on the CLI which lists the frontend certificates. With a certificate name as a parameter it will show more details.	2019-12-18 18:16:34 +01:00
Olivier Houchard	545989f37f	BUG/MEDIUM: ssl: Don't set the max early data we can receive too early. When accepting the max early data, don't set it on the SSL_CTX while parsing the configuration, as at this point global.tune.maxrewrite may still be -1, either because it was not set, or because it hasn't been set yet. Instead, set it for each connection, just after we created the new SSL. Not doing so meant that we could pretend to accept early data bigger than one of our buffers. This should be backported to 2.1, 2.0, 1.9 and 1.8.	2019-12-17 15:45:38 +01:00
Tim Duesterhus	cd3732456b	MINOR: sample: Validate the number of bits for the sha2 converter Instead of failing the conversion when an invalid number of bits is given the sha2 converter now fails with an appropriate error message during startup. The sha2 converter was introduced in d4376302377e4f51f43a183c2c91d929b27e1ae3, which is in 2.1 and higher.	2019-12-17 13:28:00 +01:00
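Moving the check from conversion time to startup can be sketched as follows. This is a Python model with a hypothetical `check_sha2_bits` helper. The accepted sizes (224, 256, 384, 512) are the standard SHA-2 digest lengths and are assumed here, since this log does not list them.

```python
import hashlib

VALID_SHA2_BITS = (224, 256, 384, 512)  # assumed set of accepted sizes

def check_sha2_bits(bits):
    """Startup-time check: reject a bad size before any traffic is processed."""
    if bits not in VALID_SHA2_BITS:
        raise ValueError(f"invalid number of bits: {bits}")
    # Resolve the hash constructor once, so later conversions cannot fail on it.
    return getattr(hashlib, f"sha{bits}")

sha256 = check_sha2_bits(256)
digest = sha256(b"haproxy").hexdigest()
```

A call such as `check_sha2_bits(100)` raises immediately, which corresponds to failing at startup with an error message instead of failing each conversion at runtime.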
Willy Tarreau	46dfd78cbf	BUG/MINOR: sample: always check converters' arguments In 1.5-dev20, sample-fetch arguments parsing was addressed by commit 689a1df0a1 ("BUG/MEDIUM: sample: simplify and fix the argument parsing"). The issue was that argument checks were not run for sample-fetches if parentheses were not present. Surprisingly, the fix was made only for sample-fetches and not for converters which suffer from the exact same problem. There are even a few comments in the code mentioning that some argument validation functions are not called when arguments are missing. This fix applies the exact same method as the one above. The impact of this bug is limited because over the years the code has learned to work around this issue instead of fixing it. This may be backported to all maintained versions.	2019-12-17 10:44:49 +01:00
Willy Tarreau	5060326798	BUG/MINOR: sample: fix the closing bracket and LF in the debug converter The closing bracket was emitted for the "debug" converter even when the opening one was not sent, and the new line was not always emitted. Let's fix this. This is harmless since this converter is not built by default.	2019-12-17 09:04:38 +01:00
Christopher Faulet	29f7284333	MINOR: http-htx: Add some htx sample fetches for debugging purpose These sample fetches are internal and must be used for debugging purpose. Idea is to have a way to add some checks on the HTX content from http rules. The main purpose is to ease reg-tests writing.	2019-12-11 16:46:16 +01:00
Christopher Faulet	76014fd118	MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state During H1 parsing, the HTX EOM block is added before switching the message state to H1_MSG_DONE. It is an exception in the way to convert an H1 message to HTX. Except for this block, the message is first switched to the right state before starting to add the corresponding HTX blocks. For instance, the message is switched in H1_MSG_DATA state and then the HTX DATA blocks are added. With this patch, the message is switched to the H1_MSG_DONE state when all data blocks or trailers were processed. It is the caller responsibility to call h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far easier to catch failures when the HTX buffer is full. The H1 and FCGI muxes have been updated accordingly. This patch may eventually be backported to 2.1 if it helps other backports.	2019-12-11 16:46:16 +01:00
Willy Tarreau	719e07c989	BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy() Apparently gcc developers decided that strncpy() semantics are no longer valid and now deserve a warning, especially if used exactly as designed. This results in issue #304. Let's just remove one from the target size to please her majesty gcc, the God of C Compilers, who tries hard to make users completely eliminate any use of string.h and reimplement it by themselves at much higher risks. Pfff.... This can be backported to stable versions, the fix is harmless since it ignores the last zero that is already set on next line.	2019-12-11 16:29:10 +01:00
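A minimal sketch of the pattern described in 719e07c989, assuming a hypothetical helper named `set_unix_path()` (this is not HAProxy's actual code): strncpy() is given the destination size minus one, and the last byte is then set to zero explicitly, so the copy never touches the byte that the next line writes anyway.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper, for illustration only: fill a sockaddr_un from a
 * path. Paths that are too long are silently truncated here, which is an
 * assumption of this sketch and not something the commit states.
 */
static void set_unix_path(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    /* copy at most size-1 bytes, leaving the last byte alone */
    strncpy(addr->sun_path, path, sizeof(addr->sun_path) - 1);

    /* the trailing zero is set unconditionally on the next line */
    addr->sun_path[sizeof(addr->sun_path) - 1] = 0;
}
```

Calling `set_unix_path(&addr, "/tmp/test.sock")` leaves a NUL-terminated path in `addr.sun_path` whatever the length of the input.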
Willy Tarreau	2444108f16	BUG/MINOR: server: make "agent-addr" work on default-server line As reported in issue #408, "agent-addr" doesn't work on default-server lines. This is due to the transcription of the old "addr" option in commit 6e5e0d8f9e ("MINOR: server: Make 'default-server' support 'addr' keyword.") which correctly assigns it to the check.addr and agent.addr fields, but which also copies the default check.addr into both the check's and the agent's addr fields. Thus the default agent's address is never used. This fix makes sure to copy the check from the check and the agent from the agent. However it's worth noting that if "addr" is specified on the server line, it will still overwrite both the check and the agent's addresses. This must be backported as far as 1.8.	2019-12-11 15:43:45 +01:00
Willy Tarreau	cdcba115b8	BUG/MINOR: listener: do not immediately resume on transient error The listener supports a "transient error" situation, which corresponds to those situations where accept fails badly but poll() reports an event. This happens for example when a listener is paused, or on out of FD. The same mechanism is used when facing a maxconn or maxsessrate limitation. When this happens, the listener is disabled for up to 100ms and put back into the global listener queue so that it automatically wakes up again as soon as the conditions change from an existing connection releasing one resource, or the system recovers from a transient issue. The listener_accept() function has a bug in its exit path causing a freshly limited listener to be immediately enabled again because all the conditions are met (connection count < max). It doesn't take into account the fact that the listener might have been queued and must first wait for the timeout to expire before doing so. The impact is that upon certain errors, the faulty process will busy loop on the accept code without sleeping. This is the scenario reported and diagnosed by @hedong0411 in issue #382. This commit fixes it by verifying that the global queue's delay is at least expired before deciding to resume the listener. Another approach could consist in having an extra state like LI_DELAY for situations where only a delay is acceptable, but this would probably not bring anything except more complex code. This issue was introduced with the lock-free listener accept code (commits 3f0d02b and 82c9789a) that were backported to 1.8.20+ and 1.9.7+, so this fix must be backported to the relevant branches.	2019-12-11 15:06:30 +01:00
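The exit-path check described in cdcba115b8 can be sketched as follows. The structure, field names and `may_resume()` function are hypothetical and only illustrate the idea: a listener that was queued after a limitation may only be resumed once its delay has expired, even if the connection count is already below the limit. Timer wrap-around is ignored in this sketch.

```c
/* Hypothetical, simplified listener state (not HAProxy's struct listener) */
struct limited_listener {
    int conn_count;    /* current number of connections */
    int max_conn;      /* configured limit */
    int queued;        /* non-zero if placed in the global queue */
    int resume_at_ms;  /* date at which the queue delay expires */
};

/* Returns non-zero if the listener may be enabled again at <now_ms> */
static int may_resume(const struct limited_listener *l, int now_ms)
{
    if (l->conn_count >= l->max_conn)
        return 0;   /* still limited */

    if (l->queued && now_ms < l->resume_at_ms)
        return 0;   /* queued: wait for the delay to expire first */

    return 1;
}
```

Without the second test, a freshly queued listener with `conn_count < max_conn` would be resumed immediately, which is the busy-loop scenario the commit message describes.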
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
Willy Tarreau	51013e82d4	BUG/MINOR: log: fix minor resource leaks on logformat error path As reported by Ilya in issue #392, Coverity found that we're leaking allocated strings on error paths in parse_logformat(). Let's use a proper exit label for failures instead of seeding return 0 everywhere. This should be backported to all supported versions.	2019-12-11 12:05:39 +01:00
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called:
  listen test
    bind *:8001
    server s1 127.0.0.1:1111 check
  09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0
  09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0
  09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0
  >> nothing run here, as the expired task was not woken up yet.
  09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0
  09:37:39.202505 epoll_wait(3, [], 200, 0) = 0
  09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0
  >> now the expired task was woken up
  09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0
  09:37:39.202683 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0
  09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:37:39.202715 close(7) = 0
Let's instead split the function in two parts:
  - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout.
  - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread.
Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date:
  09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0
  09:41:16.270457 epoll_wait(3, [], 200, 999) = 0
  09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0
  09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0
  09:41:17.270323 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0
  09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:41:17.270367 close(7) = 0
This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
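The two-part split described in c49ba52524 can be sketched with a toy timer array. The names `struct toy_timer`, `wake_expired()` and `next_pending_delay()` are hypothetical stand-ins, not the real HAProxy scheduler: the first function only wakes expired entries and computes no timeout, and the second only computes the delay to hand to the poller.

```c
#include <limits.h>

/* Hypothetical timer entry; the real scheduler uses a timer tree */
struct toy_timer {
    int expire_ms;  /* absolute expiration date */
    int woken;      /* non-zero once the task was woken up */
};

/* Part 1: wake every expired timer, return how many were woken.
 * Meant to run just before the runnable tasks are processed.
 */
static int wake_expired(struct toy_timer *t, int n, int now_ms)
{
    int i, woken = 0;

    for (i = 0; i < n; i++) {
        if (!t[i].woken && t[i].expire_ms <= now_ms) {
            t[i].woken = 1;
            woken++;
        }
    }
    return woken;
}

/* Part 2: compute the delay until the next pending timer, or -1 if
 * there is none. Meant to run just before polling.
 */
static int next_pending_delay(const struct toy_timer *t, int n, int now_ms)
{
    int i, best = INT_MAX;

    for (i = 0; i < n; i++) {
        if (!t[i].woken && t[i].expire_ms < best)
            best = t[i].expire_ms;
    }
    if (best == INT_MAX)
        return -1;
    return best > now_ms ? best - now_ms : 0;
}
```

Because waking happens right after the poller returns, an expired task runs in the same loop iteration, and no second zero-delay poll is needed.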
Willy Tarreau	d7f76a0a50	BUG/MEDIUM: proto_udp/threads: recv() and send() must not be exclusive. This is a complement to the previous fix for bug #399. The exclusion between the recv() and send() calls prevents send handlers from being called if rx readiness is reported. The DNS code can trigger this situation with threads where the fd_recv_ready() flag disappears between the test in dgram_fd_handler() and the second test in dns_resolve_recv() while a thread calls fd_cant_recv(), and this situation can sustain itself for a while. With 8 threads and an error in the socket queue, placing a printf on the return statement in dns_resolve_recv() scrolls very fast. Simply removing the "else" in dgram_fd_handler() addresses the issue. This fix must be backported as far as 1.6.	2019-12-10 19:09:15 +01:00
Willy Tarreau	1c75995611	BUG/MAJOR: dns: add minimalist error processing on the Rx path
It was reported in bug #399 that the DNS sometimes enters endless loops after hours working fine. The issue is caused by a lack of error processing in the DNS's recv() path combined with an exclusive recv OR send in the UDP layer, resulting in some errors causing CPU loops that will never stop until the process is restarted. The basic cause is that the FD_POLL_ERR and FD_POLL_HUP flags are sticky on the FD, and contrary to a stream socket, receiving an error on a datagram socket doesn't indicate that this socket cannot be used anymore. Thus the Rx code must at least handle this situation and flush the error otherwise it will constantly be reported. In theory this should not be a big issue but in practise it is due to another bug in the UDP datagram handler which prevents the send() callback from being called when Rx readiness was reported, so the situation cannot go away. It happens way more easily with threads enabled, so that there is no dead time between the moment the FD is disabled and another recv() is called, such as in the example below where the request was sent to a closed port on the loopback provoking an ICMP unreachable to be sent back:
  [pid 20888] 18:26:57.826408 sendto(29, ";\340\1\0\0\1\0\0\0\0\0\1\0031wt\2eu\0\0\34\0\1\0\0)\2\0\0\0\0\0\0\0", 35, 0, NULL, >
  [pid 20893] 18:26:57.826566 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 ECONNREFUSED (Connection refused)
  [pid 20889] 18:26:57.826601 recvfrom(29, 0x7f97c76182f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20892] 18:26:57.826630 recvfrom(29, 0x7f97c5cf02f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20891] 18:26:57.826684 recvfrom(29, 0x7f97c66162f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20895] 18:26:57.826716 recvfrom(29, 0x7f97bffda2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20894] 18:26:57.826747 recvfrom(29, 0x7f97c4cee2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20888] 18:26:58.419838 recvfrom(29, 0x7ffcc8712c20, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20893] 18:26:58.419900 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  (... hundreds before next sendto() ...)
This situation was handled by clearing HUP and ERR when recv() returns <0.
A second case was handled, there was a control for a missing dgram handler, but it does nothing, causing the FD to ring again if this situation ever happens. After looking at the rest of the code, it doesn't seem possible to face such a situation because these handlers are registered during startup, but at least we need to handle it properly.
A third case was handled, that's mainly a small optimization. With threads and massive responses, due to the large lock around the loop, it's likely that some threads will have seen fd_recv_ready() and will wait at the lock(). But if they wait here, chances are that other threads will have eliminated pending data and issued fd_cant_recv(). In this case, better re-check fd_recv_ready() before performing the recv() call to avoid the huge amounts of syscalls that happen on massively threaded setups.
This patch must be backported as far as 1.6 (the atomic AND just needs to be turned to a regular AND).	2019-12-10 19:09:15 +01:00

10974 Commits