Commit Graph

4367 Commits

William Lallemand
7d42ef5b22 WIP/MINOR: ssl: add sample fetches for keylog in frontend
OpenSSL 1.1.1 provides a callback registering function,
SSL_CTX_set_keylog_callback, which allows one to receive a string
containing the keys to decipher TLSv1.3 traffic.

Unfortunately it is not possible to store this data in binary form and
we can only get this information using the callback, which means that we
need to store it until the connection is closed.

This patch adds 2 pools: the first one, pool_head_ssl_keylog, is used to
store a struct ssl_keylog which will be inserted as ex_data in an SSL *.
The second one is pool_head_ssl_keylog_str, which will be used to store
the hexadecimal strings.

To enable the capture of the keys, you need to set "tune.ssl.keylog on"
in your configuration.

The following fetches were implemented:

ssl_fc_client_early_traffic_secret,
ssl_fc_client_handshake_traffic_secret,
ssl_fc_server_handshake_traffic_secret,
ssl_fc_client_traffic_secret_0,
ssl_fc_server_traffic_secret_0,
ssl_fc_exporter_secret,
ssl_fc_early_exporter_secret
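
A minimal configuration sketch for this (the log-format line is an
illustrative assumption; ssl_fc_client_random is assumed available to
correlate the secrets, as in an NSS key log):

  global
      tune.ssl.keylog on

  frontend fe_tls
      bind :443 ssl crt /etc/haproxy/site.pem
      # one NSS-style key log line per TLSv1.3 connection (illustrative only)
      log-format "CLIENT_TRAFFIC_SECRET_0 %[ssl_fc_client_random,hex] %[ssl_fc_client_traffic_secret_0]"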
2020-07-06 19:08:03 +02:00
Ilya Shipitsin
46a030cdda CLEANUP: assorted typo fixes in the code and comments
This is the 11th iteration of typo fixes.
2020-07-06 14:34:32 +02:00
Willy Tarreau
b0be8ae2a8 CLEANUP: auth: fix useless self-include of auth-t.h
Since recent include cleanups auth-t.h ended up including itself.
2020-07-05 21:32:47 +02:00
Willy Tarreau
0c439d8956 BUILD: tools: make resolve_sym_name() return a const
Originally it was made to return a void* because some comparisons in the
code where it was used required a lot of casts. But now we don't need
that anymore, and having it non-const breaks the build on NetBSD 9 as
reported in issue #728.

So let's switch to const and adjust debug.c to accommodate this.
2020-07-05 20:26:04 +02:00
Olivier Houchard
a74bb7e26e BUG/MEDIUM: connections: Let the xprt layer know a takeover happened.
When we take over a connection, let the xprt layer know. If it has its own
tasklet, and it is already scheduled, then it has to be destroyed, otherwise
it may run the new mux tasklet on the old thread.

Note that we only do this for the ssl xprt for now, because the only other
one that might wake the mux up is the handshake one, which is supposed to
disappear before idle connections exist.

No backport is needed, this is for 2.2.
2020-07-03 17:49:33 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which randomly rejects ~1% of the incoming
connections and randomly kills some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s; otherwise it's necessary to restart to
reset the idle connection lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Willy Tarreau
43079e0731 MINOR: sched: split tasklet_wakeup() into tasklet_wakeup_on()
tasklet_wakeup() only checks tl->tid to know whether the task is
programmed to run on the current thread or on a specific thread. We'll
have to ease this selection in a subsequent patch, preferably without
modifying tl->tid, so let's have a new tasklet_wakeup_on() function
to specify the thread number to run on. Note that the logic has not
changed at all.
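
A sketch of the resulting split (simplified, not the exact source):

  /* new: wake the tasklet on an explicitly chosen thread, leaving tl->tid untouched */
  static inline void tasklet_wakeup_on(struct tasklet *tl, int thr)
  {
          /* ... queue <tl> onto thread <thr>'s tasklet list ... */
  }

  /* old entry point: unchanged behavior, now delegating */
  static inline void tasklet_wakeup(struct tasklet *tl)
  {
          tasklet_wakeup_on(tl, tl->tid);
  }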
2020-07-03 17:19:47 +02:00
Emeric Brun
9f9b22c4f1 MINOR: log: add time second fraction field to rfc5424 log timestamp.
This patch adds the second fraction of the time, in microseconds,
as supported by the RFC.
2020-07-02 17:56:06 +02:00
Willy Tarreau
dab586c3a8 BUILD: debug: avoid build warnings with DEBUG_MEM_STATS
Some libcs define strdup() as a macro and cause redefine warnings to
be emitted, so let's first undefine all functions we redefine.
2020-07-02 10:25:01 +02:00
Dragan Dosen
1e3b16f74f MINOR: log-format: allow to preserve spacing in log format strings
Now it's possible to preserve spacing everywhere except in "log-format",
"log-format-sd" and "unique-id-format" directives, where spaces are
delimiters and are merged. That may be useful when the response payload
is specified as a log format string by "lf-file" or "lf-string", or even
for headers or anything else.

In order to merge spaces, a new option LOG_OPT_MERGE_SPACES is applied
exclusively on options passed to function parse_logformat_string().

This patch fixes issue #701 ("http-request return log-format file
evaluation altering spacing of ASCII output/art").
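
For example (an illustrative combination, following the issue above), an
ASCII-art payload referenced this way now keeps its spacing:

  http-request return status 200 content-type text/plain lf-file /etc/haproxy/ascii-art.txt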
2020-07-02 10:11:44 +02:00
Willy Tarreau
a6026a0c92 MINOR: debug: add a new "debug dev memstats" command
Now when building with -DDEBUG_MEM_STATS, some malloc/calloc/strdup/realloc
stats are kept per file+line number and may be displayed and even reset on
the CLI using "debug dev memstats". This makes it easy to track potential
leakers or abnormal usages.
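
A usage sketch on the CLI (the socket path is an assumption, and "debug dev"
commands are assumed to require expert mode):

  $ echo "expert-mode on; debug dev memstats" | socat stdio /var/run/haproxy.sock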
2020-07-02 09:14:48 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') the sharing of idle connection pools between
threads for the same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging, or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads by setting
this option to "off". The default is "on".

This would have been nice to have during the idle connections debugging,
but it's not too late to add it!
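
For example, to disable the sharing while chasing a reuse issue:

  global
      tune.idle-pool.shared off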
2020-07-01 19:07:37 +02:00
Olivier Houchard
ff1d0929b8 MEDIUM: connections: Don't use a lock when moving connections to remove.
Make it so we don't have to take a lock while moving a connection from
the idle list to the toremove_list by taking advantage of the MT_LIST.
2020-07-01 17:09:19 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover,
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Olivier Houchard
bbee1f7e78 MINOR: list: Add MT_LIST_DEL_SAFE_NOINIT() and MT_LIST_ADDQ_NOCHECK()
Add two new macros: MT_LIST_DEL_SAFE_NOINIT makes sure we remove the
element from the list without reinitializing its next and prev, and
MT_LIST_ADDQ_NOCHECK is similar to MT_LIST_ADDQ(), except it doesn't check
whether the element is already in a list.
The goal is to be able to move an element from a list we're currently
parsing to another one, keeping it locked in the meantime.
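
A rough sketch of that pattern (the exact macro calling convention is an
assumption here):

  struct mt_list *elt1, elt2;
  struct connection *conn;

  mt_list_for_each_entry_safe(conn, &idle_list, list, elt1, elt2) {
          /* unlink without resetting next/prev: the element stays locked */
          MT_LIST_DEL_SAFE_NOINIT(elt1);
          /* append to the other list, skipping the "already in a list" check */
          MT_LIST_ADDQ_NOCHECK(&toremove_list, (struct mt_list *)&conn->list);
  }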
2020-07-01 17:04:00 +02:00
Willy Tarreau
eb8c2c69fa MEDIUM: sched: implement task_kill() to kill a task
task_kill() may be used by any thread to kill any task with less overhead
than a regular wakeup. In order to achieve this, it bypasses the priority
tree and inserts the task directly into the shared tasklets list, cast as
a tasklet. The task_list_size is updated to make sure it is properly
decremented after execution of this task. The task will thus be picked by
process_runnable_tasks() after checking the tree and sent to the TL_URGENT
list, where it will be processed and killed.

If the task is bound to more than one thread, its first thread will be the
one notified.

If the task was already queued or running, nothing is done; only the flag
is added so that it gets killed before or after execution. Of course it's
the caller's responsibility to make sure any resources allocated by this
task were already cleaned up or taken over.
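
A usage sketch under those constraints (the context handling is
illustrative only):

  /* caller already released or took over everything <t> owns */
  t->context = NULL;  /* illustrative: leave nothing for the dying task */
  task_kill(t);       /* callable from any thread, cheaper than a wakeup */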
2020-07-01 16:35:53 +02:00
Willy Tarreau
8a6049c268 MEDIUM: sched: create a new TASK_KILLED task flag
This flag, when set, will be used to indicate that the task must die.
At the moment this may only be placed by the task itself or by the
scheduler when placing it into the TL_NORMAL queue.
2020-07-01 16:35:49 +02:00
Willy Tarreau
364f25a688 MINOR: backend: don't always takeover from the same threads
The next-thread walking algorithm in commit 566df309c ("MEDIUM:
connections: Attempt to get idle connections from other threads.")
proved to be sufficient for most cases, but it still has some rough
edges when threads are unevenly loaded. If one thread wakes up with
10 streams to process in a burst, it will mainly take over connections
from the next thread until that one has none left.

This patch implements a rotating index that is stored into the server
list and that any thread taking over a connection is responsible for
updating. This way it starts mostly random and avoids always picking
from the same place. This results in a smoother distribution overall
and a slightly lower takeover rate.
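
A simplified sketch of the idea (the field name is an assumption):

  /* start the scan at a rotating per-server point instead of a fixed one */
  i = HA_ATOMIC_LOAD(&srv->next_takeover) % global.nbthread;
  /* ... try to take over an idle connection from threads i, i+1, ... */
  HA_ATOMIC_STORE(&srv->next_takeover, i + 1);  /* updated by the taker */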
2020-07-01 16:07:43 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then have another thread do the same, and so on. This happens even
more easily due to scheduling latencies, or to merged events inside the
same poll loop, which, when dealing with a fast server responding in
sub-millisecond delays, can really result in only one thread being fully
at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing more extra connections to be kept if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
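
For example, matching the test above (16 threads, so twice nbthread):

   backend app
       server srv1 192.0.2.10:8080 pool-low-conn 32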
2020-07-01 15:23:15 +02:00
Willy Tarreau
35e30c9670 BUG/MINOR: server: fix the connection release logic regarding nearly full conditions
There was a logic bug in commit ddfe0743d ("MEDIUM: server: use the two
thresholds for the connection release algorithm"): instead of keeping
only our first idle connection when FDs become scarce, the condition was
inverted, enforcing this constraint unless FDs are scarce. This resulted
in fewer idle connections than permitted being kept under normal
conditions.

No backport needed.
2020-07-01 14:14:29 +02:00
Willy Tarreau
daf8aa62a8 MINOR: pools: increase MAX_BASE_POOLS to 64
When not sharing pools (i.e. when building with -DDEBUG_DONT_SHARE_POOLS)
we have about 47 pools right now, while MAX_BASE_POOLS is only 32, meaning
that only the first 32 ones will benefit from a per-thread cache entry.
This totally kills performance when pools are not shared (roughly -20%).

Let's double the limit to gain some margin, and make it possible to set
it as a build option.

It might be useful to backport this to stable versions as they're likely
to be affected as well.
2020-06-30 14:29:02 +02:00
Willy Tarreau
ddfe0743d8 MEDIUM: server: use the two thresholds for the connection release algorithm
The algorithm improvement in bdb86bd ("MEDIUM: server: improve estimate
of the need for idle connections") is still not enough because there's
a hard limit between being below and above the FD count, so it continues
to end up with many killed connections.

Here we proceed differently. Given that there are two configured
limits, a low and a high one, what we do is drop connections
when the high limit is reached (which the killing task already does
anyway); when we're between the low and the high thresholds, we only keep
the connection if our idle entries are empty (with a preference for safe
ones); and below the low threshold, we keep any connection so as to give
them a chance of being reused or taken over by another thread.

Proceeding like this results in far fewer dropped connections; we
typically see a 99.3% reuse rate (76k conns for 10M requests over 200
servers and 4 threads, with 335k takeovers or 3%), and much smaller CPU
usage variations because there are no more bursts trying to kill extra
connections.

It should be possible to further improve this by counting the number
of threads exploiting a server and trying to optimize the amount of
per-thread idle connections so that it is approximately balanced among
the threads.
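
In pseudo-C, the release decision roughly becomes (a sketch of the
description above, not the actual code):

  if (idle_count >= high_limit)
          keep = 0;                          /* above high: drop the connection */
  else if (idle_count >= low_limit)
          keep = own_idle_lists_empty(thr);  /* in between: keep only our first one */
  else
          keep = 1;                          /* below low: always keep it */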
2020-06-29 21:54:38 +02:00
Willy Tarreau
e69282a03f BUG/MINOR: server: always count one idle slot for current thread
The idle server connection estimates brought in commit bdb86bd ("MEDIUM:
server: improve estimate of the need for idle connections") were committed
without the minimum of 1 idle conn needed for the current thread. The net
effect is that there are bursts of dropped connections when the load varies
because there's no provision for the last connection.

No backport needed, this is 2.2-dev.
2020-06-29 21:54:38 +02:00
Willy Tarreau
d59946e673 Revert "BUG/MEDIUM: lists: Lock the element while we check if it is in a list."
This reverts previous commit 347bbf79d20e1cff57075a8a378355dfac2475e2.

The original code was correct. This patch resulted from a mistaken analysis
and breaks the scheduler:

 ########################## Starting vtest ##########################
 Testing with haproxy version: 2.2-dev11-90b7d9-23
 #    top  TEST reg-tests/lua/close_wait_lf.vtc TIMED OUT (kill -9)
 #    top  TEST reg-tests/lua/close_wait_lf.vtc FAILED (10.008) signal=9
 1 tests failed, 0 tests skipped, 88 tests passed

 Program terminated with signal SIGABRT, Aborted.
 [Current thread is 1 (Thread 0x7fb0dac2c700 (LWP 11292))]
 (gdb) bt
 #0  0x00007fb0e7c143f8 in raise () from /lib64/libc.so.6
 #1  0x00007fb0e7c15ffa in abort () from /lib64/libc.so.6
 #2  0x000000000053f5d6 in ha_panic () at src/debug.c:269
 #3  0x00000000005a6248 in wdt_handler (sig=14, si=<optimized out>, arg=<optimized out>) at src/wdt.c:119
 #4  <signal handler called>
 #5  0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
 #6  listener_accept (fd=<optimized out>) at src/listener.c:999
 #7  0x00000000004262df in fd_update_events (evts=<optimized out>, fd=6) at include/haproxy/fd.h:418
 #8  _do_poll (p=<optimized out>, exp=<optimized out>, wake=<optimized out>) at src/ev_epoll.c:251
 #9  0x0000000000548d0f in run_poll_loop () at src/haproxy.c:2949
 #10 0x000000000054908b in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3067
 #11 0x00007fb0e902b684 in start_thread () from /lib64/libpthread.so.0
 #12 0x00007fb0e7ce5eed in clone () from /lib64/libc.so.6
 (gdb) up
 #5  0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
 351                     if (MT_LIST_ADDQ(&task_per_thread[tl->tid].shared_tasklet_list, (struct mt_list *)&tl->list) == 1) {

If the commit above is ever backported, this one must be as well!
2020-06-29 21:54:37 +02:00
Olivier Houchard
347bbf79d2 BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
In MT_LIST_ADDQ() and MT_LIST_ADD() we can't just check if the element is
already in a list, because there's a small race condition: it could be added
between the time we checked and the time we actually set its next and prev.
So we have to lock it first.

This should be backported to 2.1.
2020-06-29 19:59:06 +02:00
Willy Tarreau
a9fcecbdf3 MINOR: stats: add the estimated need of concurrent connections per server
The max_used_conns value is used as an estimate of the needed number of
connections on a server to know how many to keep open. But this one is
not reported, making it hard to troubleshoot reuse issues. Let's export
it in the sessions/current column.
2020-06-29 16:29:11 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or to drop it when releasing it was not updated. This often results in
storms of closes when certain thresholds are met, and in long series of
takeover() calls when there aren't enough connections left for a thread
on a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over the previous purge period, or a
    max of what is seen over the current period; it differs from max_used_conns
    in that the latter is a counter that's reset on each purge period;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
b159132ea3 MINOR: activity: add per-thread statistics on FD takeover
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
2020-06-29 14:26:05 +02:00
Willy Tarreau
3bb617cfe0 MINOR: stats: add 3 new output values for the per-server idle conn state
The servers have internal states describing the status of idle connections,
unfortunately these were not exported in the stats. This patch adds the 3
following gauges:

 - idle_conn_cur : Current number of unsafe idle connections
 - safe_conn_cur : Current number of safe idle connections
 - used_conn_cur : Current number of connections in use
2020-06-29 14:26:05 +02:00
Willy Tarreau
20dc3cd4a6 MINOR: pools: move the LRU cache heads to thread_info
The LRU cache head was an array of lists, which caused false sharing
between 4 to 8 threads in the same cache line. Let's move it to the
thread_info structure instead. There's no need to do the same for the
pool_cache[] array since it's already quite large (32 pointers each).

By doing this the request rate increased by 1% on a 16-thread machine.
2020-06-29 10:36:37 +02:00
Willy Tarreau
c03d7632a5 CLEANUP: pool: only include the type files from types
pool-t.h was mistakenly including the full-blown includes for threads,
lists and api instead of the types, and as such, CONFIG_HAP_LOCAL_POOLS
and CONFIG_HAP_LOCKLESS_POOLS were not visible everywhere.
2020-06-29 10:11:24 +02:00
Willy Tarreau
e4d1505c83 REORG: includes: create tinfo.h for the thread_info struct
The thread_info struct is convenient to store various per-thread info
without having to resort to thread-local storage, which is slow and
painful to initialize.

The problem is that, by having this one in thread.h, it's very difficult
to add more entries there because everyone already includes thread.h, so
conversely thread.h cannot reference certain types.

There's no point in having it there; instead let's create a new pair
of files, tinfo{,-t}.h, which declare the structure. This way it will
become possible to extend them with other includes and to have certain
files store their own types there.
2020-06-29 09:57:23 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
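
A sketch of the resulting layout (member types are assumptions based on
the message):

  /* one per thread, padded to a cache line to stop false sharing */
  struct idle_conns {
          HA_SPINLOCK_T toremove_lock;
          struct work_list *cleanup_list;   /* assumed type for idle_cleanup */
          struct mt_list toremove_conns;
  } THREAD_ALIGNED(64);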
2020-06-28 10:52:36 +02:00
Willy Tarreau
d79422a0ff BUG/MEDIUM: buffers: always allocate from the local cache first
It looked strange to see pool_evict_from_cache() always very present
in "perf top", but there was actually a reason for this: while b_free()
uses pool_free(), which properly disposes of the buffer into the local
cache, and b_alloc_fast() allocates using pool_get_first(), which considers
the local cache, b_alloc_margin() does not consider the local cache as it
only uses __pool_get_first(), which only allocates from the shared pools.

The impact is that basically everywhere a buffer is allocated (muxes,
streams, applets), it's always picked from the shared pool (hence
involves locking) and is released to the local one, making it grow
until a flush is triggered by pool_evict_from_cache(). Buffer usage is
thus not thread-local at all, and causes the eviction of a lot of
possibly useful objects from the local caches.

Just fixing this results in a 10% request rate increase in an HTTP/1 test
on a 16-thread machine.

This bug was caused by recent commit ed891fd ("MEDIUM: memory: make local
pools independent on lockless pools") merged into 2.2-dev9, so no backport
is needed.
2020-06-28 10:45:35 +02:00
Willy Tarreau
4dc6c860b4 CLEANUP: buffers: remove unused buffer_wq_lock lock
Commit 2104659 ("MEDIUM: buffer: remove the buffer_wq lock") removed
usage of the lock but not the lock itself. It's totally unused, so let's
remove it.
2020-06-28 10:45:34 +02:00
Anthonin Bonnefoy
85048f80c9 MINOR: http: Add support for http 413 status
Add the 413 HTTP "payload too large" status code. This will allow 413 to
be used in deny_status and errorfile.
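
For instance (the size condition is an illustrative assumption):

  http-request deny deny_status 413 if { req.body_size gt 1048576 }
  errorfile 413 /etc/haproxy/errors/413.http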
2020-06-26 11:30:02 +02:00
Ilya Shipitsin
47d17182f4 CLEANUP: assorted typo fixes in the code and comments
This is the 10th iteration of typo fixes.
2020-06-26 11:27:28 +02:00
Ilya Shipitsin
f44d155515 BUILD: fix ssl_sample.c when building against BoringSSL
BoringSSL does not have X509_get_X509_PUBKEY, so let our emulation layer
define it for BoringSSL as well.

Build log:

src/ssl_sample.o: In function `smp_fetch_ssl_x_key_alg':
/home/travis/build/haproxy/haproxy/src/ssl_sample.c:592: undefined reference to `X509_get_X509_PUBKEY'
clang-7: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:860: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1

travis-ci: https://travis-ci.com/github/haproxy/haproxy/jobs/351670996
2020-06-26 10:33:38 +02:00
Willy Tarreau
c54e5ad9cc MINOR: cfgparse: sanitize the output a little bit
With the rework of the config line parser, we've started to emit a dump
of the initial line underlined by a caret character indicating the error
location. But with extremely large lines it starts to take time and can
even cause trouble on slow terminals (e.g. over ssh), and this becomes
useless. In addition, control characters could be dumped as-is, which is
bad, especially when the input file is accidentally wrong (an executable).

This patch adds a string sanitization function which isolates an area
around the error position in order to report only that area if the string
is too large. The limit was set to 80 characters, which results in only
roughly 40 chars around the error being reported, prefixed and suffixed
with "..." as needed. In addition, non-printable characters in the line are
now replaced with '?' so as not to corrupt the terminal. This way invalid
variable names, unmatched quotes etc will be easier to spot.

A typical output is now:

  [ALERT] 176/092336 (23852) : parsing [bad.cfg:8]: forbidden first char in environment variable name at position 811957:
    ...c$PATH$PATH$d(xlc`%?$PATH$PATH$dgc?T$%$P?AH?$PATH$PATH$d(?$PATH$PATH$dgc?%...
                                            ^
2020-06-25 09:43:27 +02:00
Willy Tarreau
e7723bddd7 MEDIUM: tasks: add a tune.sched.low-latency option
Now that all tasklet queues are scanned at once by run_tasks_from_lists(),
it becomes possible to always check for lower priority classes and jump
back to them when they exist.

This patch adds tune.sched.low-latency global setting to enable this
behavior. What it does is stick to the lowest ranked priority list in
which tasks are still present with an available budget, and leave the
loop to refill the tasklet lists if the trees got new tasks or if new
work arrived into the shared urgent queue.

Doing so allows the latency to be cut in half when running with extremely
deep run queues (10k-100k), thus allowing forwarding of small and large
objects to coexist better. It remains off by default since it does have
a small impact on large traffic (shorter batches).
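
To enable it:

  global
      tune.sched.low-latency on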
2020-06-24 12:21:26 +02:00
Willy Tarreau
59153fef86 MINOR: tasks: make run_tasks_from_lists() scan the queues itself
Now process_runnable_tasks is responsible for calculating the budgets
for each queue, dequeuing from the tree, and calling run_tasks_from_lists().
This latter one scans the queues, picking tasks there and respecting budgets.
Note that its name was updated with a plural "s" for this reason.
2020-06-24 12:21:26 +02:00
Willy Tarreau
ba48d5c8f9 MINOR: tasks: pass the queue index to run_task_from_list()
Instead of passing it a pointer to the queue, pass it the queue's index
so that it can perform all the work around current_queue and tl_class_mask.
2020-06-24 12:21:26 +02:00
Willy Tarreau
49f90bf148 MINOR: tasks: add a mask of the queues with active tasklets
It is neither convenient nor scalable to check each and every tasklet
queue to figure out whether it's empty or not when we often need to check
them all at once. This patch introduces a tasklet class mask which gets
a bit set for each queue representing one class of service. A single
test on the mask allows one to figure out whether there's still some work
to be done. It will later be usable to better factor the runqueue code.

Bits are set when tasklets are queued. They're cleared when queues are
emptied. It is possible that a queue is empty but still has its bit set
if a tasklet was added then removed, but this is not a problem as this
is properly checked for in run_tasks_from_lists().
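
A sketch of how the mask is meant to be used (simplified; "sched" and the
TL_* classes appear in the surrounding messages):

  /* when queueing a tasklet: mark its class as active */
  sched->tl_class_mask |= 1 << TL_NORMAL;

  /* one test tells whether any class still has pending tasklets */
  if (!sched->tl_class_mask)
          return;  /* nothing left to run */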
2020-06-24 12:21:26 +02:00
Willy Tarreau
c0a08ba2df MINOR: tasks: make current_queue an index instead of a pointer
It will be convenient to have the tasklet queue number soon, so better
make current_queue an index rather than a pointer to the queue. When not
currently running (e.g. from I/O), the index is -1.
2020-06-24 12:21:26 +02:00
William Lallemand
ee8530c65e MINOR: ssl: free the crtlist and the ckch during the deinit()
Add some functions to deinit the whole crtlist and ckch architecture.

They will free all crtlist, crtlist_entry, ckch_store, ckch_inst and their
associated SNI, ssl_conf and SSL_CTX.

The SSL_CTX in the default_ctx and initial_ctx still needs to be freed
separately.
2020-06-23 20:07:50 +02:00
William Lallemand
7df5c2dc3c BUG/MEDIUM: ssl: fix ssl_bind_conf double free
Since commit 2954c47 ("MEDIUM: ssl: allow crt-list caching"), the
ssl_bind_conf is allocated directly in the crt-list, and the crt-list
can be shared between several bind_conf. The deinit() code wasn't
changed to handle that.

This patch fixes the issue by removing the free of the ssl_conf in
ssl_sock_free_all_ctx().

It should be completed with a patch that frees the ssl_conf and the
crt-list.

Fixes issue #700.
2020-06-23 20:06:55 +02:00
Willy Tarreau
5bd73063ab BUG/MEDIUM: task: be careful not to run too many tasks at TL_URGENT
A test on large objects revealed a big performance loss compared to 2.1.
The cause was found to be related to cache locality between scheduled
operations that are batched using tasklets. It happens that we now
have several layers of tasklets, and that queuing all these operations
leaves time for memory objects to cool down in the CPU cache, effectively
halving the performance.

A quick test consisting of putting most unknown tasklets into the BULK
queue almost fixed the performance regression, but this is the wrong
approach as it can also slow down some low-latency transfers or accesses
to applets like the CLI.

What this patch does instead is to queue unknown tasklets into the same
queue as the current one when tasklet_wakeup() is itself called from a
task/tasklet, and otherwise to use the urgent queue for real I/O (when
sched->current is NULL). This results in the called tasklet being woken
up much sooner, often at the end of the current batch of tasklets.

By doing so, a test on 2 cores 4 threads with 256 concurrent H1 conns
transferring 16m objects with 256kB buffers jumped from 55 to 88 Gbps.
It's even possible to go as high as 101 Gbps by evaluating the URGENT
queue after the BULK one, though this was not done as it was considered
dangerous for latency-sensitive operations.

This reinforces the importance of getting back the CPU transfer
mechanisms based on tasklet_wakeup_after() to work at the tasklet level
by supporting an immediate wakeup in certain cases.

No backport is needed, this is strictly 2.2.
2020-06-23 16:45:28 +02:00
Willy Tarreau
116ef223d2 MINOR: task: add a new pointer to current tasklet queue
In task_per_thread[] we now have current_queue which is a pointer to
the current tasklet_list entry being evaluated. This will be used to
know the class under which the current task/tasklet is currently
running.
2020-06-23 16:35:38 +02:00
Willy Tarreau
38e8a1c7b8 MINOR: debug: add a new DEBUG_FD build option
When DEBUG_FD is set at build time, we'll keep a counter of per-FD events
in the fdtab. This counter is reported in "show fd" even for closed FDs if
not zero. The purpose is to help spot situations where an apparently closed
FD continues to be reported in loops, or where some events are dismissed.
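
For example (assuming the Makefile's usual DEBUG variable):

  $ make TARGET=linux-glibc DEBUG=-DDEBUG_FD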
2020-06-23 10:04:54 +02:00
Willy Tarreau
d1d005d7f6 MEDIUM: map: make the "clear map" operation yield
As reported in issue #419, a "clear map" operation on a very large map
can take a lot of time and freeze the entire process for several seconds.

This patch makes sure that pat_ref_prune() can regularly yield after
clearing some entries so that the rest of the process continues to work.
The first part, the removal of the patterns, can take quite some time
by itself in one run but it's still relatively fast. It may block for
up to 100ms for 16M IP addresses in a tree typically. This change needed
to declare an I/O handler for the clear operation so that we can get
back to it after yielding.

The second part can be much slower because it deconstructs the elements
and their users, but it iterates progressively so we can yield less often
here.

The patch was tested with parallel traffic soliciting the map being
released and showed no problem. Some traffic will definitely notice an
incomplete map, but the filling is not atomic anyway, so this is no
different.

It may be backported to stable versions once sufficiently tested for side
effects, at least as far as 2.0, in order to avoid the watchdog triggering
when the process is frozen there. For better behaviour, all these
prune_* functions should support yielding so that the callers also get a
chance to yield in turn.
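
For reference, the operation in question (the socket path is an assumption):

  $ echo "clear map /etc/haproxy/blocklist.map" | socat stdio /var/run/haproxy.sock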
2020-06-19 16:57:51 +02:00