haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-28 23:20:59 +01:00

Author	SHA1	Message	Date
Willy Tarreau	9d4b56b88e	MINOR: tasks: only visit filled task slots after processing them process_runnable_tasks() needs to requeue or wake up tasks after processing them in batches. By only refilling the existing ones, we avoid revisiting all the queue. The performance gain is measurable starting with two threads, where the request rate climbs to 657k/s compared to 644k.	2017-11-06 11:20:11 +01:00
Willy Tarreau	ce4e0aa7f3	MEDIUM: task: change the construction of the loop in process_runnable_tasks() This patch slightly rearranges the loop to pack the locked code a little bit, and to try to concentrate accesses to the tree together to benefit more from the cache. It also fixes how the loop handles the right margin : now that it is guaranteed that the retrieved nodes are filtered to only match the current thread, we don't need to rewind every 16 entries. Instead we can rewind each time we reach the right margin again. With this change, we now achieve the following performance for 10 H2 conns each containing 100 streams : 1 thread : 550kreq/s, 2 threads : 644kreq/s, 3 threads : 598kreq/s	2017-11-06 11:20:11 +01:00
Willy Tarreau	b992ba16ef	MINOR: task: simplify wake_expired_tasks() to avoid unlocking in the loop This function is sensitive, let's make it shorter by factoring out the unlock and leave code. This reduced the function's size by a few tens of bytes and increased the overall performance by about 1%.	2017-11-06 11:20:11 +01:00
Willy Tarreau	8d38805d3d	MAJOR: task: make use of the scope-aware ebtree functions Currently the task scheduler suffers from an O(n) lookup when skipping tasks that are not for the current thread. The reason is that eb32_lookup_ge() has no information about the current thread so it always revisits many tasks for other threads before finding its own tasks. This is particularly visible with HTTP/2 since the number of concurrent streams created at once causes long series of tasks for the same stream in the scheduler. With only 10 connections and 100 streams each, by running on two threads, the performance drops from 640kreq/s to 11.2kreq/s! Lookup metrics show that for only 200000 task lookups, 430 million skips had to be performed, which means that on average, each lookup leads to 2150 nodes to be visited. This commit backports the principle of scope lookups for ebtrees from the ebtree_v7 development tree. The idea is that each node contains a mask indicating the union of the scopes for the nodes below it, which is fed during insertion, and used during lookups. Then during lookups, branches that do not contain any leaf matching the requested scope are simply ignored. This perfectly matches a thread mask, allowing a thread to only extract the tasks it cares about from the run queue, and to always find them in O(log(n)) instead of O(n). Thus the scheduler uses tid_bit and task->thread_mask as the ebtree scope here. Doing this has recovered most of the performance, as can be seen on the test below with two threads, 10 connections, 100 streams each, and 1 million requests total :
                           Before     After    Gain
  test duration          : 89.6s      4.73s     x19
  HTTP requests/s (DEBUG): 11200     211300     x19
  HTTP requests/s (PROD) : 15900     447000     x28
  spin_lock time         : 85.2s      0.46s    /185
  time per lookup        : 13us       40ns     /325
Even when going to 6 threads (on 3 hyperthreaded CPU cores), the performance stays around 284000 req/s, showing that the contention is much lower.
A test showed that there's no benefit in using this for the wait queue though.	2017-11-06 11:20:11 +01:00
William Lallemand	92159b2901	MINOR: mworker: do not store child pid anymore in the pidfile The parent process supervises the children itself, so we don't need to store the children pids anymore in the pidfile in master-worker mode.	2017-11-06 11:19:53 +01:00
William Lallemand	deed780a22	MINOR: mworker: write parent pid in the pidfile The first pid in the pidfile is now the parent, which is more convenient for supervising the process. You can now reload haproxy in master-worker mode with a convenient command like: kill -USR2 $(head -1 /tmp/haproxy.pid)	2017-11-06 11:08:38 +01:00
William Lallemand	8029300df6	MINOR: mworker: allow pidfile in mworker + foreground This patch allows the use of the pidfile in master-worker mode without using the background option.	2017-11-06 11:08:38 +01:00
William Lallemand	cc113822a7	MINOR: add master-worker in the warning about nbproc	2017-11-06 11:08:38 +01:00
Willy Tarreau	6dbd3e963b	BUG/MEDIUM: threads: don't try to free build option message on exit Commit 0493149 ("MINOR: thread: report multi-thread support in haproxy -vv") added information about thread support in haproxy -vv output but accidentally marked the message as "must_free" while it's a constant. This causes a segv on the old process on clean exit if threads are enabled. It doesn't affect the stability during operations however.	2017-11-05 11:51:48 +01:00
Willy Tarreau	bbd09b9306	BUG/MAJOR: thread/listeners: enable_listener must not call unbind_listener() unbind_listener() takes the listener lock, which is already held by enable_listener(). This situation happens when starting with nbproc > 1 with some bind lines limited to a certain process, because in this case enable_listener() tries to stop unneeded listeners. This commit introduces __do_unbind_listeners() which must be called with the lock held, and makes enable_listener() use this one. Given that the only return code has never been used and that it starts to make the code more complicated to propagate it before throwing it to the trash, the function's return type was changed to void.	2017-11-05 11:38:44 +01:00
Willy Tarreau	3340029b97	BUG/MAJOR: h2: set the connection's task to NULL when no client timeout is set If "timeout client" is missing from the frontend, the task is not initialized, causing a crash on connection teardown.	2017-11-05 11:23:40 +01:00
Willy Tarreau	4d5f13cab3	BUG/MEDIUM: threads/stick-tables: close a race condition on stktable_trash_expired() The spin_unlock() was called just before setting the expiry to TICK_ETERNITY, so if another thread has the time to perform its update and set a timeout, this would clear it.	2017-11-05 11:04:47 +01:00
Willy Tarreau	03071f6937	BUG/MAJOR: threads/lb: fix missing unlock on map-based hash LB We often left the function with the lock held on success.	2017-11-05 10:59:12 +01:00
Willy Tarreau	1ed90ac377	BUG/MAJOR: threads/lb: fix missing unlock on consistent hash LB If no matching node was found, the function was left without unlocking the tree.	2017-11-05 10:54:50 +01:00
Willy Tarreau	5ec84574c7	BUG/MAJOR: threads/dns: add missing unlock on allocation failure path An unlock was missing when a memory allocation failure is detected.	2017-11-05 10:35:57 +01:00
Willy Tarreau	70124ce3e1	BUG/MAJOR: cli/streams: missing unlock on exit "show sess" An unlock was missing on the situation where the session disappeared while watching it.	2017-11-05 10:31:10 +01:00
Willy Tarreau	6ce38f3eab	CLEANUP: server: get rid of return statements in the CLI parser There were too many return statements, some of them missing a spin_unlock call, let's use a goto to a central place instead.	2017-11-05 10:19:23 +01:00
Willy Tarreau	a075258a2c	BUG/MINOR: cli: add severity in "set server addr" parser Commit c3680ec ("MINOR: add severity information to cli feedback messages") introduced a severity level to CLI messages, but one of them was missed on "set server addr". No backport is needed.	2017-11-05 10:17:49 +01:00
Willy Tarreau	62ac84f843	CLEANUP: checks: remove return statements in locked functions Given that all spinning loops we've had since 1.8-rc1 were caused by unbalanced lock/unlock, let's get rid of all return statements in the locked check functions and only exit via a single unlock place.	2017-11-05 10:13:38 +01:00
Willy Tarreau	73247e0757	BUG/MAJOR: threads/checks: wrong use of SPIN_LOCK instead of SPIN_UNLOCK Must unlock on exit, copy-paste error.	2017-11-05 10:13:37 +01:00
Willy Tarreau	1c8980f9b5	BUG/MINOR: cli: do not perform an invalid action on "set server check-port" The "set server <srv> check-port" CLI handler forgot to return after detecting an error on the port number, and still proceeds with the action. This needs to be backported to 1.7.	2017-11-05 10:13:37 +01:00
Willy Tarreau	2a858a82ec	BUG/MAJOR: threads/server: missing unlock in CLI fqdn parser This one didn't properly unlock before returning an error message.	2017-11-05 10:13:37 +01:00
Willy Tarreau	1cd153aa89	BUG/MAJOR: threads/checks: add 4 missing spin_unlock() in various functions Some unlocks were missing, resulting in deadlocks even with a single thread. We really need to make these functions safer by getting rid of all those remaining "return" calls and only leave using a goto!	2017-11-05 10:13:28 +01:00
Olivier Houchard	f143b8040b	BUILD: use MAXPATHLEN instead of NAME_MAX. This fixes building on at least Solaris, where NAME_MAX doesn't exist.	2017-11-04 17:09:23 +01:00
Willy Tarreau	0493149ac3	MINOR: thread: report multi-thread support in haproxy -vv Otherwise it's hard to know if it was enabled or not.	2017-11-03 23:39:25 +01:00
Willy Tarreau	ed339a375c	BUG/MAJOR: mux_pt: don't dereference a connstream after ->wake() The wake() callback may destroy a connstream, so it must not be dereferenced in case wake() returns negative. No backport needed, this is 1.8-only.	2017-11-03 15:55:24 +01:00
Emeric Brun	8c4954c5c2	BUG/MINOR: lua: fix missing lock protection on server. To avoid inconsistencies server's attributes must be read or updated under lock.	2017-11-03 15:17:59 +01:00
Emeric Brun	e9fd6b5916	BUG/MINOR: dns: fix missing lock protection on server. To avoid inconsistencies server's attributes must be read or updated under lock.	2017-11-03 15:17:55 +01:00
Emeric Brun	f2fc1fda80	BUG/MINOR: freq: fix infinite loop on freq_ctr_period. Using peers or stick tables, we could update a freq_ctr using a tick value with the first bit set, but this bit has been reserved for the lock since multithreading support was added.	2017-11-02 18:09:58 +01:00
William Lallemand	9c54c53f2f	BUG/MEDIUM: cache: don't try to resolve wrong filters Don't try to resolve wrong filters which are not cache filters during the post configuration callback.	2017-11-02 16:58:25 +01:00
Emeric Brun	f6ba17da20	BUG/MAJOR: fix deadlock on healthchecks. Fix bugs due to missing unlock and recursive lock performing http health check. The server's lock scope was enlarged to protect all callers of 'set_server_check_status' and 'chk_report_conn_err'. This fix also protects tcpcheck against concurrency.	2017-11-02 16:24:37 +01:00
Willy Tarreau	16257f648f	BUG/MEDIUM: checks/mux: always enable send-polling after connecting Before introducing the mux layer, tcp_connect() would poll for sending to detect the connection establishment. It happens that the health checks have apparently never explicitly enabled this polling and have been relying on this implicit one. Now that there's the mux layer, the conn_stream needs to be enabled for polling as well and since it's not done in the checks, it's never done and the check's request doesn't leave the machine, as can be noticed with http checks. The solution simply consists in going back to the well-known case where we enable polling after connecting using cs_want_send() if we have anything but just a plain connect(). The regular data path is not affected because the stream interface code automatically computes the polling needs based on buffer contents.	2017-11-02 15:53:04 +01:00
Willy Tarreau	f13ef96e70	BUG/MEDIUM: h2: don't try to parse incomplete H1 responses This situation which must not happen does in fact happen when feeding artificial responses using errorfiles, Lua or an applet. For now it causes the H1 response parser to loop forever trying to get a more complete response. Since it cannot progress, let's return an error.	2017-11-02 15:53:04 +01:00
Olivier Houchard	fccf840cdf	MINOR: cache: Don't confuse act_return and act_parse_ret.	2017-11-01 15:10:51 +01:00
Olivier Houchard	cd2867a012	MINOR: cache: Remove useless test for nonzero. Don't bother testing if len is nonzero, we know it is, as we're in the "else" part of a if (!len), and testing it confuses clang into thinking ret may be left uninitialized.	2017-11-01 15:10:51 +01:00
Olivier Houchard	7da120bb0e	MINOR: mux: Only define pipe functions on linux. Only define mux_pt_snd_pipe() and mux_pt_rcv_pipe() if splicing is available.	2017-11-01 15:10:51 +01:00
Emmanuel Hocdet	82913e4f79	BUG/MINOR: send-proxy-v2: string size must include ('\0') strlen() excludes the terminating null byte ('\0'), so add it.	2017-11-01 07:58:20 +01:00
Emmanuel Hocdet	571c7ac0a5	BUG/MINOR: send-proxy-v2: fix dest_len in make_tlv call Subtract already allocated size from buf_len.	2017-11-01 07:57:42 +01:00
William Lallemand	77c1197bfb	MEDIUM: cache: deliver objects from cache Lookup objects in the cache and deliver them using the http-request action "cache-use".	2017-10-31 21:17:19 +01:00
William Lallemand	4da3f8a1f2	MEDIUM: cache: store objects in cache Store objects in the cache. The cache uses an shctx for storage. It uses an http-response action to store the headers and a filter to store the body. The http-response action is used in order to allow modifications by other actions before caching.	2017-10-31 21:17:19 +01:00
William Lallemand	41db46035e	MEDIUM: cache: configuration parsing and initialization Parse a configuration section "cache" and the http-{response,request} actions. Example:
    listen frt
        mode http
        http-response cache-store foobar
        http-request cache-use foobar
    cache foobar
        total-max-size 4   # size in megabytes	2017-10-31 21:17:19 +01:00
William Lallemand	7217c46dfe	MEDIUM: shctx: forbid shctx to read more than expected Forbid shctx to read more than expected, it allows you to use a greater value as a len with shctx_row_data_get(), the size of the destination buffer for example.	2017-10-31 21:17:19 +01:00
Willy Tarreau	3f133570b8	BUG/MEDIUM: h2: fix incorrect timeout handling on the connection Previous commit ea3928 (MEDIUM: h2: apply a timeout to h2 connections) was wrong for two reasons. The first one is that if the client timeout is not set, it's used as zero, preventing connections from establishing. The second reason is that if the timeout triggers with active streams (normally it should not since the task is supposed to be disabled), the task is removed (h2c->task=NULL), and the last quitting stream might try to dereference it. Instead of doing this, we simply do not register the task if there's no timeout (it's useless) and we always control its presence in the streams.	2017-10-31 19:21:06 +01:00
Willy Tarreau	ea39282e85	MEDIUM: h2: apply a timeout to h2 connections Till now there was no way to deal with a dead H2 connection. Now each connection creates a task that wakes up to kill the connection. Its timeout is constantly refreshed when there's some activity. In case the timeout triggers, the best effort attempts are made at sending a clean GOAWAY message before closing and signaling the streams. The timeout is automatically disabled when there's an active stream on the connection, and restarted when the last stream finishes. This way it should not affect long sessions.	2017-10-31 18:16:19 +01:00
Willy Tarreau	a1349f0207	MEDIUM: h2: send a GOAWAY frame when dealing with an empty response Given that we're processing data produced by haproxy, we know that the situations where haproxy doesn't return anything are :
  - request timeout with option http-ignore-probes : there's no reason to hit this since we're creating the stream with the request into it ;
  - tcp-request content reject : this definitely means we want to kill the connection and abort keep-alive and any further processing ;
  - using /dev/null as the error file to hide an error
In practice it appears that using the abort on empty response as a hint to trigger a connection close is very appropriate to continue to give the control over the connection management. This patch thus tries to send a GOAWAY frame with the max_id presented as the last stream ID, then sends an RST_STREAM for the current stream. For the client, this means that the connection must be shut down immediately after processing the last pending streams and that the current stream is aborted. This way it's still possible to force connections to be closed using tcp-request rules.	2017-10-31 18:16:19 +01:00
Willy Tarreau	af1e4f5167	MEDIUM: h2: perform a graceful shutdown on "Connection: close" After some long brainstorming sessions, it appears that "Connection: close" seems to be the best signal from the L7 layer to indicate the need to close the connection. Indeed, in H1 it is only present in very rare cases (eg: certain unrecoverable errors, some of which could remove it now by the way). It will also be added when the L7 layer wants to force the connection to terminate. By default when running in keep-alive mode it is not present. It's worth mentioning that in H1 with persistent connections, we have sort of a concurrency-1 mux and this header field is used the same way. Thus here this patch detects "Connection: close" in response headers and if seen, sends a GOAWAY frame with the highest possible ID so that the client knows that it can quit whenever it wants to. If more aggressive closures are needed in the future, we may decide to advertise the max_id to abort after the current requests and better honor "http-request deny".	2017-10-31 18:16:19 +01:00
Willy Tarreau	1c661986a8	MINOR: h2: properly reject PUSH_PROMISE frames coming from the client These ones deserve a connection error as per 5.1.	2017-10-31 18:16:19 +01:00
Willy Tarreau	c0da1964ba	MEDIUM: h2: silently ignore frames higher than last_id after GOAWAY For a graceful shutdown, the spec requires discarding frames with a stream ID higher than the advertised last_id (RFC7540#6.8). Well, finally for now the code is disabled (see last page of #6.8). Some frames need to be processed anyway to maintain the compression state and the flow control window state, but we don't have any trivial way to do this and ignore them at the same time. For the headers it's the worst case where we can't parse headers frames without coming from the streams, and we don't want to create such streams as we'd have to abort them, and aborting would cause errors to flow back. Possibly a long-term solution might involve using some dummy streams and dummy buffers for this and calling the parsers directly.	2017-10-31 18:16:19 +01:00
Willy Tarreau	f182a9a8b4	MINOR: h2: centralize the check for the half-closed(remote) streams RFC7540#5.1 is pretty clear : "any frame other than WINDOW_UPDATE, PRIORITY, or RST_STREAM in this state MUST be treated as a connection error of type STREAM_CLOSED". Instead of dealing with this for each and every frame type, let's do it once for all in the main demux loop.	2017-10-31 18:16:19 +01:00
Willy Tarreau	f65b80dd47	MINOR: h2: centralize the check for the idle streams RFC7540#5.1 is pretty clear : "any frame other than HEADERS or PRIORITY in this state MUST be treated as a connection error". Instead of dealing with this for each and every frame type, let's do it once for all in the main demux loop.	2017-10-31 18:16:19 +01:00
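
Several of the lock fixes listed above (6ce38f3eab, 62ac84f843, 1cd153aa89) replace early return statements in locked functions with a goto to a single unlock point. A minimal sketch of that pattern follows, assuming a POSIX mutex in place of HAProxy's spinlock macros. The names struct demo_server and demo_set_check_port() are hypothetical and chosen for illustration only; they are not HAProxy code.

```c
#include <pthread.h>

/* Hypothetical stand-in for a lock-protected object; not HAProxy's struct server. */
struct demo_server {
    pthread_mutex_t lock;
    int check_port;
};

/* Returns 0 on success, -1 on an invalid port.
 * Every path leaves through the "out" label, so the lock taken at the
 * top is released exactly once, whatever the outcome. */
int demo_set_check_port(struct demo_server *srv, int port)
{
    int ret = -1;

    pthread_mutex_lock(&srv->lock);

    if (port <= 0 || port > 65535)
        goto out;               /* error path: skip the update, still unlock */

    srv->check_port = port;
    ret = 0;
out:
    pthread_mutex_unlock(&srv->lock);
    return ret;
}
```

With a single exit label, adding a new error check later cannot forget the unlock, which is the failure mode the BUG/MAJOR "missing unlock" entries describe.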


5659 Commits