haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-28 07:01:00 +01:00

Author	SHA1	Message	Date
Christopher Faulet	5cd4bbd7ab	BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management The management of the servers and the proxies queues was not thread-safe at all. First, the accesses to <strm>->pend_pos were not protected. So it was possible to release it on a thread (for instance because the stream is released) and to use it at the same time on another one (because we redispatch pending connections for a server). Second, accessing the stream's information (flags and target) from anywhere is forbidden. To be safe, the stream's state must always be updated in the context of process_stream. So to fix these issues, the queue module has been refactored. A lock has been added in the pendconn structure. And now, when we try to dequeue a pending connection, we start by unlinking it from the server/proxy queue and we wake up the stream. Then, it is the stream's responsibility to really dequeue it (or release it). This way, we are sure that only the stream can create and release its <pend_pos> field. However, be careful. This new implementation should be thread-safe (hopefully...). But it is not optimal and in some situations, it could be really slower in multi-threaded mode than in single-threaded one. The problem is that, when we try to dequeue pending connections, we process them from the oldest to the newest regardless of thread affinity. So we need to wait for the other threads to wake up to really process them. If threads are blocked in the poller, this will add a significant latency. This problem happens when maxconn values are very low. This patch must be backported in 1.8.	2018-03-19 10:03:06 +01:00
Christopher Faulet	510c0d67ef	BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled When a listener is temporarily disabled, we start by locking it and then we call the .pause callback of the underlying protocol (tcp/unix). For TCP listeners, this is not a problem. But listeners bound on a unix socket are in fact closed instead. So the .pause callback relies on the unbind_listener function to do its job. Unfortunately, unbind_listener holds the listener's lock and then calls an internal function to unbind it. So, there is a deadlock here. This happens during a reload. To fix the problem, the function do_unbind_listener, which is lockless, is now exported and is called when a listener bound on a unix socket is temporarily disabled. This patch must be backported in 1.8.	2018-03-16 11:19:07 +01:00
Cyril Bonté	4288c5a9d8	BUG/MINOR: force-persist and ignore-persist only apply to backends From the very first day of force-persist and ignore-persist features, they only applied to backends, except that the documentation stated it could also be applied to frontends. In order to make it clear, the documentation is updated and the parser will raise a warning if the keywords are used in a frontend section. This patch should be backported up to the 1.5 branch.	2018-03-12 22:52:24 +01:00
Cyril Bonté	d400ab3a36	BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc Krishna Kumar reported a 100% cpu usage with a configuration using cpu-map and a high number of threads. Indeed, this minimal configuration is enough to reproduce the issue: global nbthread 40 cpu-map auto:1/1-40 0-39 frontend test bind :8000 This is due to a wrong type in a shift operator (int vs unsigned long int), causing an endless loop while applying the cpu affinity on threads. The same issue may also occur with nbproc under FreeBSD. This commit addresses both cases. This patch must be backported to 1.8.	2018-03-12 22:52:24 +01:00
Aurélien Nephtali	b53e20826e	BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage The correct keyword is 'ssl-sessions' (vs. 'ssl-session'). The typo was introduced in 45c742be05 ('REORG: cli: move the "set rate-limit" functions to their own parser'). Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>	2018-03-12 07:49:08 +01:00
Aurélien Nephtali	bca08762d2	CLEANUP: cli: Remove a leftover debug message This printf() was added in f886e3478d ("MINOR: cli: Add a command to send listening sockets."). Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>	2018-03-12 07:49:05 +01:00
Aurélien Nephtali	76de95a4c0	CLEANUP: ssl: Remove a duplicated #include openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl: add support for SNI and wildcard certificates"). Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>	2018-03-12 07:49:01 +01:00
Aurélien Nephtali	498a115727	BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd" This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd" command"). This should be backported to 1.8. Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>	2018-03-12 07:47:26 +01:00
Willy Tarreau	84b118f312	BUG/MEDIUM: h2: also arm the h2 timeout when sending Right now the h2 idle timeout is only set when there is no stream. If we fail to send because the socket buffers are full (generally indicating the client has left), we also need to arm it so that we can properly expire such connections, otherwise some failed transfers might leave H2 connections pending forever. Thanks to Thierry Fournier for the diag and the traces. This patch needs to be backported to 1.8.	2018-03-08 18:43:56 +01:00
Willy Tarreau	c41b3e8dff	DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers() This one is only used to compare pointers and NULL is permitted though this is far from being clear.	2018-03-08 18:33:48 +01:00
Olivier Houchard	ec9516a6dc	BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list. When removing the socket from the xfer_sock_list, we want to set next->prev to prev, not to next->prev, which is useless. This should be backported to 1.8.	2018-03-08 18:33:11 +01:00
Emeric Brun	1738e86771	BUG/MINOR: session: Fix tcp-request session failure if handshake. Some sample fetches check if session is established using the flag CO_FL_CONNECTED. But in some cases, when a handshake is performed this flag is set too late, after the process of the tcp-request session rules. This fix moves the raising of the flag to the beginning of the conn_complete_session function which processes the tcp-request session rules. This fix must be backported to 1.8 (and perhaps 1.7)	2018-03-06 14:04:45 +01:00
Willy Tarreau	44e973f508	MEDIUM: h2: use a single buffer allocator We used to have one buffer allocator per direction while we can never block on two buffers at once. Let's have a single one and rely on the connection's flags to know which one we're waiting for.	2018-03-01 17:58:15 +01:00
Willy Tarreau	0a10de6066	MINOR: h2: provide and use h2s_detach() and h2s_free() These ones save us from open-coding the cleanup functions on each and every error path. The code was updated to use them with no functional change.	2018-03-01 16:35:01 +01:00
Willy Tarreau	00dd07895a	CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close() This function takes an h2c and an h2s but it never uses the h2c, which is a bit confusing at some places in the code. Let's make it clear that it only operates on the h2s instead by renaming it and removing the unused h2c argument.	2018-03-01 16:31:34 +01:00
Emmanuel Hocdet	253c3b7516	MINOR: connection: add proxy-v2-options authority This patch adds option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS connection was negotiated. In this case, authority corresponds to the sni.	2018-03-01 11:38:32 +01:00
Emmanuel Hocdet	fa8d0f1875	MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key This patch implements proxy protocol v2 options related to crypto information: ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and cert-key (PP2_SUBTYPE_SSL_KEY_ALG).	2018-03-01 11:38:28 +01:00
Emmanuel Hocdet	283e004a85	MINOR: ssl: add ssl_sock_get_cert_sig function ssl_sock_get_cert_sig can be used to report cert signature short name to log and ppv2 (RSA-SHA256).	2018-03-01 11:34:08 +01:00
Emmanuel Hocdet	96b7834e98	MINOR: ssl: add ssl_sock_get_pkey_algo function ssl_sock_get_pkey_algo can be used to report pkey algorithm to log and ppv2 (RSA2048, EC256,...). Extracting pkey information is not free in the ssl api (lock/alloc/free): haproxy can use the pkey information computed in load_certificate. Store and use this information in a SSL ex_data when available, compute it if not (SSL multicert bundled and generated cert).	2018-03-01 11:34:05 +01:00
Emmanuel Hocdet	ddc090bc55	MINOR: ssl: extract full pkey info in load_certificate Private key information is used in switchctx to implement native multicert selection (ecdsa/rsa/anonymous). This patch extracts and stores full pkey information: dsa type and pkey size in bits. This can be used for switchctx or to report pkey information in ppv2 and log.	2018-03-01 11:33:18 +01:00
Emmanuel Hocdet	8c0c34b6e7	Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')" This reverts commit 82913e4f79a1f1fb25aec84a2ce2f5f0e5ce1959. TLV string value should not be null-terminated. This should be backported to 1.8.	2018-03-01 06:48:05 +01:00
Christopher Faulet	7d9f1ba246	BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping In the SPOE applet's handler, when an applet is switched from the state IDLE to PROCESSING, it is removed from the list of idle applets. But when HAProxy is stopping, this applet can be switched to DISCONNECT. In this case, we also need to remove it from the list of idle applets. Else the applet is removed but still present in the list. It could lead to a segmentation fault or an infinite loop, depending on the code path.	2018-02-28 16:20:33 +01:00
Willy Tarreau	35a62705df	BUG/MEDIUM: h2: always consume any trailing data after end of output buffers In case a stream tries to emit more data than advertised by the chunks or content-length headers, the extra data remains in the channel's output buffer until the channel's timeout expires. It can easily happen when sending malformed error files making use of a wrong content-length or having extra CRLFs after the empty chunk. It may also be possible to forge such a bad response using Lua. The H1 to H2 encoder must protect itself against this by marking the data presented to it as consumed if it decides to discard them, so that the sending stream doesn't wait for the timeout to trigger. The visible effect of this problem is a huge memory usage and a high concurrent connection count during benchmarks when using such bad data (a typical place where this easily happens). This fix must be backported to 1.8.	2018-02-27 15:37:25 +01:00
Christopher Faulet	929b52d8a1	BUG/MINOR: h2: Set the target of dbuf_wait to h2c In h2_get_dbuf, when the buffer allocation was failing, dbuf_wait.target was erroneously set to the connection (h2c->conn) instead of the h2 connection descriptor (h2c). This patch must be backported to 1.8.	2018-02-26 17:33:16 +01:00
Yves Lafon	95317289e9	MINOR: stats: display the number of threads in the statistics. Add the nbthread global variable to the output, matching nbproc. This may be backported to 1.8	2018-02-26 11:53:46 +01:00
Willy Tarreau	f161d0f51e	BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory pools."), we support lockless pools. However the parts dedicated to detecting use-after-free are not present in this part, making DEBUG_UAF useless in this situation. The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such a compatible architecture is detected, and when pool debugging is not requested, then makes use of this everywhere in pools and buffers functions. This way enabling DEBUG_UAF will automatically disable the lockless version. No backport is needed as this is purely 1.9-dev.	2018-02-22 14:18:45 +01:00
Tim Duesterhus	5e64286bab	CLEANUP: standard: Fix typo in IPv6 mask example IPv6 addresses with two double colons are invalid. This typo was introduced in commit 471851713af20d84b67b8966471ea758dc8c12b9.	2018-02-21 05:07:35 +01:00
Tim Duesterhus	66888f907c	CLEANUP: h2: Remove unused labels from mux_h2.c This removes the unused next_header_block and try_again labels from mux_h2.c. try_again is unused as of a76e4c21839cafd036fbe755416569206502c1d9, which first appeared in haproxy 1.8.0. next_header_block is unused as of 872855998bd03d5224e0e5cd6aef9b91e2a6de1d, which was backported to haproxy 1.8.0 as 59fcb216085a7aa9744cffe39567c80de4ebd6bf.	2018-02-20 08:30:13 +01:00
Tim Duesterhus	932bb289dd	CLEANUP: spoe: Remove unused label retry This removes the retry labels from spoe_send_frame and spoe_recv_frame which are unused since d5216d474d69856a282e4443f180af2093a80d6c, which is unreleased, but was backported to haproxy 1.8 as f13f3a4babdb1ce23a7e982c765704bca728111a.	2018-02-20 08:30:12 +01:00
Tim Duesterhus	9619e72c6b	CLEANUP: cfgparse: Remove unused label end This removes the end label from parse_process_number() which is unused since 5ab51775e736511b7e54f42e080dcef76a284da9, which first was released in haproxy 1.8.0.	2018-02-20 08:30:12 +01:00
Emeric Brun	74f7ffa229	MINOR: ssl/sample: adds ssl_bc_is_resumed fetch keyword. Returns true when the back connection was made over an SSL/TLS transport layer and the newly created SSL session was resumed using a cached session or a TLS ticket.	2018-02-19 16:50:20 +01:00
Emeric Brun	eb8def9f34	BUG/MEDIUM: ssl/sample: ssl_bc_* fetch keywords are broken. Since the split between connections and conn-stream objects, these keywords are broken. This patch must be backported in 1.8	2018-02-19 16:50:05 +01:00
Christopher Faulet	fd04fcf5ed	BUG/MEDIUM: http: Switch the HTTP response in tunnel mode as earlier as possible When the body length is undefined (no Content-Length or Transfer-Encoding headers), the response remains in ending mode, waiting for the request to be done. So, most of the time this is not a problem because the request is done before the response. But when a client sends data to a server that replies without waiting for all the data, it is really not desirable to wait for the end of the request to finish the response. This bug was introduced when the tunneling of the request and the response was refactored, in commit 4be980391 ("MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flag"). This patch should be backported in 1.8 and 1.7.	2018-02-19 16:47:12 +01:00
Christopher Faulet	4ac77a98cd	BUG/MEDIUM: ssl: Shutdown the connection for reading on SSL_ERROR_SYSCALL When SSL_read returns SSL_ERROR_SYSCALL and errno is unset or set to EAGAIN, the connection must be shut down for reading. Else, the connection loops infinitely, consuming all the CPU. The bug was introduced in the commit 7e2e50500 ("BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable."). This patch must be backported in 1.8 too.	2018-02-19 15:37:47 +01:00
Willy Tarreau	280f42b99e	MINOR: sample: add a new "concat" converter It's always a pain not to be able to combine variables. This commit introduces the "concat" converter, which appends a delimiter, a variable's contents and another delimiter to an existing string. The result is a string. This makes it easier to build composite variables made of other variables.	2018-02-19 15:34:12 +01:00
Christopher Faulet	16f45c87d5	BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe A TLS ticket keys file can be updated on the CLI and used at the same time. So we need to protect it to be sure all accesses are thread-safe. Because updates are infrequent, a R/W lock has been used. This patch must be backported in 1.8	2018-02-19 14:15:38 +01:00
Tim Duesterhus	9ad9f3517e	DOC: cfgparse: Warn on option (tcp|http)log in backend The option does not seem to have any effect since at least haproxy 1.3. Also the `log-format` directive already warns when being used in a backend.	2018-02-19 13:57:32 +01:00
Aurélien Nephtali	39b89889e7	BUG/MINOR: init: Add missing brackets in the code parsing -sf/-st The code tries to strip trailing spaces of arguments but due to missing brackets, it will always exit. It can be reproduced with this (silly) example: $ haproxy -f /etc/haproxy/haproxy.cfg -sf 1234 "1235 " 1236 $ echo $? 1 This was introduced in commit 236062f7c ("MINOR: init: emit warning when -sf/-sd cannot parse argument") Signed-off-by: Aurélien Nephtali <aurelien.nephtali@gmail.com>	2018-02-19 08:02:21 +01:00
Olivier Houchard	7e2e505006	BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable. Bart Geesink reported some random errors appearing under the form of termination flags SD in the logs for connections involving SSL traffic to reach the servers. Tomek Gacek and Mateusz Malek finally narrowed down the problem to commit c2aae74 ("MEDIUM: ssl: Handle early data with OpenSSL 1.1.1"). It happens that the special case of SSL_ERROR_SYSCALL isn't handled anymore since this commit. SSL_read() might return <= 0, and SSL_get_error() return SSL_ERROR_SYSCALL, without meaning the connection is gone. Before flagging the connection as in error, check the errno value. This should be backported to 1.8.	2018-02-14 18:44:28 +01:00
Olivier Houchard	1ff9104117	BUG/MINOR: fd/threads: properly lock the FD before adding it to the fd cache. It was believed that it was useless to lock the "prev" field when adding a FD. However, if there's only one element in the FD cache, and that element removes itself from the fd cache, and another FD is added before the first add completed, there's a risk of losing elements. To prevent that, lock the "prev" field, so that such a removal will wait until the add completed.	2018-02-08 17:24:06 +01:00
Willy Tarreau	58aa5ccd76	BUG/MINOR: config: don't emit a warning when global stats is incompletely configured Martin Brauer reported an unexpected warning when some parts of the global stats are defined but not the listening address, like below : global #stats socket run/admin.sock mode 660 level admin stats timeout 30s Then haproxy complains : [WARNING] 334/150131 (23086) : config : frontend 'GLOBAL' has no 'bind' directive. Please declare it as a backend if this was intended. This is because of the check for a bind-less frontend (the global section creates a frontend for the stats). There's no clean fix for this one, so here we're simply checking that the frontend is not the global stats one before emitting the warning. This patch should be backported to all stable versions.	2018-02-08 09:55:09 +01:00
Willy Tarreau	821069832e	BUILD: fd/threads: fix breakage build breakage without threads The last fix for the volatile dereference made use of pl_deref_int() which is unknown when building without threads. Let's simply open-code it instead. No backport needed.	2018-02-06 12:00:27 +01:00
Chris Lane	236062f7ce	MINOR: init: emit warning when -sf/-sd cannot parse argument Previously, -sf and -sd command line parsing used atol which cannot detect errors. I had a problem where I was doing -sf "$pid1 $pid2 $pid" and it was sending the graceful termination signal only to the first pid. The change uses strtol and checks endptr and errno to see if the parsing worked. It will exit when the pid list is not parsed. [wt: this should be backported to 1.8]	2018-02-06 07:23:32 +01:00
Tim Duesterhus	7d58b4d156	BUG/MEDIUM: standard: Fix memory leak in str2ip2() An haproxy compiled with: > make -j4 all TARGET=linux2628 USE_GETADDRINFO=1 And running with a configuration like this: defaults log global mode http option httplog option dontlognull timeout connect 5000 timeout client 50000 timeout server 50000 frontend fe bind :::8080 v4v6 default_backend be backend be server s example.com:80 check Will leak memory inside `str2ip2()`, because the list `result` is not properly freed in success cases: ==18875== 140 (76 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 87 of 111 ==18875== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==18875== by 0x537A565: gaih_inet (getaddrinfo.c:1223) ==18875== by 0x537DD5D: getaddrinfo (getaddrinfo.c:2425) ==18875== by 0x4868E5: str2ip2 (standard.c:733) ==18875== by 0x43F28B: srv_set_addr_via_libc (server.c:3767) ==18875== by 0x43F50A: srv_iterate_initaddr (server.c:3879) ==18875== by 0x43F50A: srv_init_addr (server.c:3944) ==18875== by 0x475B30: init (haproxy.c:1595) ==18875== by 0x40406D: main (haproxy.c:2479) The leak exists as long as the usage of getaddrinfo in that function exists, it was introduced in commit: d5f4328efd5f4eaa7c89cad9773124959195430a v1.5-dev8 is the first tag containing this commit, the fix should be backported to haproxy 1.5 and newer.	2018-02-05 21:04:15 +01:00
Willy Tarreau	a331544c33	BUG/MINOR: time/threads: ensure the adjusted time is always correct In the time offset calculation loop, we ensure we only commit the new date once it's further in the future than the current one. However there is a small issue here on 32-bit platforms : if global_now is written in two cycles by another thread, starting with the tv_sec part, and the current thread reads it in the middle of a change, it may compute a wrong "adjusted" value on the first round, with the new (larger) tv_sec and the old (large) tv_usec. This will be detected as the CAS will fail, and another attempt will be made, but this time possibly with too large an adjusted value, pushing the date further than needed (at worst almost one second). This patch addresses this by using a temporary adjusted time in the loop that always restarts from the last known one, and by assigning the result to the final value only once the CAS succeeds. The impact is very limited, it may cause the time to advance in small jumps on 32 bit platforms and in the worst case some timeouts might expire 1 second too early. This fix should be backported to 1.8.	2018-02-05 20:11:38 +01:00
Willy Tarreau	11559a7530	MINOR: fd: reorder fd_add_to_fd_list() The function was cleaned up a bit from duplicated parts inherited from the initial attempt at getting it to work. It's a bit smaller and cleaner this way.	2018-02-05 19:45:41 +01:00
Willy Tarreau	3a8263f86b	MINOR: fd: remove the unneeded last CAS when adding an fd to the list This was a leftover from the initial code where two threads could fight for the list's tail.	2018-02-05 19:45:39 +01:00
Willy Tarreau	abeaff2d54	BUG/MINOR: fd/threads: properly dereference fdcache as volatile In fd_rm_from_fd_list(), we have loops waiting for another change to complete, in case we don't have support for a double CAS. But these ones fail to place a compiler barrier or to dereference the fdcache as a volatile, resulting in an endless loop on the first collision, which is visible when run on MIPS32. No backport needed.	2018-02-05 19:45:31 +01:00
Willy Tarreau	4cc67a2782	MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c There's no point inlining these huge functions, better move them to real functions in fd.c.	2018-02-05 17:19:40 +01:00
Willy Tarreau	62a627ac19	MEDIUM: poller: use atomic ops to update the fdtab mask We don't need to lock the fdtab[].lock anymore since we only have one modification left (update update_mask). Let's use an atomic AND instead.	2018-02-05 16:02:22 +01:00
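
Commits 236062f7ce and 39b89889e7 above both concern parsing the PID arguments given to -sf/-st. The block below is a minimal sketch, not HAProxy's actual code, of the strtol-with-endptr-and-errno check that the first commit message describes. The helper name parse_pid_arg, the tolerance for trailing spaces, and the rejection of non-positive values are assumptions made for illustration only.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not taken from HAProxy): parse one PID argument.
 * Returns 0 and stores the value in *pid on success, -1 on any parse error.
 * Unlike atol(), strtol() reports where parsing stopped (endptr) and
 * signals overflow through errno, so errors can be detected.
 */
int parse_pid_arg(const char *arg, long *pid)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(arg, &end, 10);

	if (errno == ERANGE)   /* value does not fit in a long */
		return -1;
	if (end == arg)        /* no digits were consumed at all */
		return -1;

	/* Assumption: trailing spaces are tolerated, anything else is an error. */
	while (*end == ' ')
		end++;
	if (*end != '\0')
		return -1;

	if (val <= 0)          /* assumption: a PID must be strictly positive */
		return -1;

	*pid = val;
	return 0;
}
```

With this policy, an argument such as "1235 " is accepted, while "$pid1 $pid2" passed as a single quoted string is rejected instead of being silently truncated to the first number as it would be with atol().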


6022 Commits