haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-14 02:57:01 +02:00

Author	SHA1	Message	Date
Willy Tarreau	b1beaa302c	BUG/MINOR: init: make the automatic maxconn consider the max of soft/hard limits James Stroehmann reported something working as documented but that can be considered as a regression in the way the automatic maxconn is calculated from the process's limits: https://www.mail-archive.com/haproxy@formilux.org/msg36523.html The purpose of the changes in 2.0 was to have maxconn default to the highest possible value permitted to the user based on the ulimit -n setting. However, the calculation starts from the soft limit, which can be lower than what users were allowed with previous versions, where the default value of 2000 would force a higher ulimit -n as long as it fitted in the hard limit. Usually this is not noticeable if the user changes the limits, because quite commonly setting a new value restricts both the soft and hard values. Let's instead always use the max between the hard and soft limits, as we know these values are permitted. This was tried on the following setup: $ cat ulimit-n.cfg global stats socket /tmp/sock1 level admin $ ulimit -n 1024 Before the change the limits would show like this: $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max Maxsock: 1023 Maxconn: 489 After the change the limits are now much better and more in line with the default settings in earlier versions: $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max Maxsock: 4095 Maxconn: 2025 The difference becomes even more obvious when running moderately large configs with hundreds of checked servers and hundreds of listeners: $ cat ulimit-n.cfg global stats socket /tmp/sock1 level admin listen l bind :10000-10300 server-template srv- 300 0.0.0.0 check disabled Before After Maxsock 1024 4096 Maxconn 189 1725 This issue is tagged as minor since a trivial config change fixes it, but it would help new users to have it backported as far as 2.0.	2020-03-06 10:49:55 +01:00
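The limit selection described in the commit above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `effective_fd_limit` is a hypothetical helper name, and the sketch does not reproduce HAProxy's actual maxconn computation, only the choice of the larger of the soft and hard `RLIMIT_NOFILE` values as the starting point.

```python
import resource


def effective_fd_limit(soft, hard):
    """Return the larger of the soft and hard limits.

    Hypothetical helper: an unlimited value (RLIM_INFINITY) wins over
    any finite one, otherwise the numeric maximum is used.
    """
    if resource.RLIM_INFINITY in (soft, hard):
        return resource.RLIM_INFINITY
    return max(soft, hard)


# Query the limits of the current process and show the value that a
# "max of soft/hard" policy would start from.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard, "base:", effective_fd_limit(soft, hard))
```

With `ulimit -n 1024` lowering only the soft limit, a policy based on the soft limit alone starts from 1024, while this one starts from the hard limit.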
Carl Henrik Lunde	f91ac19299	OPTIM: startup: fast unique_id allocation for acl. pattern_finalize_config() uses an inefficient algorithm which is a problem with very large configuration files. This affects startup, and therefore reload time. When haproxy is deployed as a router in a Kubernetes cluster the generated configuration file may be large and reloads occur frequently, which makes this a significant issue. The old algorithm is O(n^2): * allocate missing uids - O(n^2) * sort linked list - O(n^2) The new algorithm is O(n log n): * find the user allocated uids - O(n) * store them for efficient lookup - O(n log n) * allocate missing uids - n times O(log n) * sort all uids - O(n log n) * convert back to linked list - O(n) Performance examples, startup time in seconds: pat_refs old new 1000 0.02 0.01 10000 2.1 0.04 20000 12.3 0.07 30000 27.9 0.10 40000 52.5 0.14 50000 77.5 0.17 Please backport to 1.8, 2.0 and 2.1.	2020-03-06 08:11:58 +01:00
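The steps listed in the commit above can be sketched as follows. This is not HAProxy's C implementation: the record layout (a dict with `name` and `uid`), the function name `assign_unique_ids`, and the choice to start automatic uids at 0 are all assumptions made for the illustration. A Python set stands in for whatever lookup structure the real code uses.

```python
def assign_unique_ids(refs):
    """Give every reference a unique uid and return them sorted by uid.

    refs is a list of dicts with a 'name' and a 'uid' that is None
    when the user did not assign one (hypothetical layout).
    """
    # 1. Find the user-allocated uids and store them for fast lookup.
    used = {r["uid"] for r in refs if r["uid"] is not None}

    # 2. Allocate the missing uids. The candidate only ever moves
    #    forward, so the whole pass never rescans earlier values.
    candidate = 0
    for r in refs:
        if r["uid"] is None:
            while candidate in used:
                candidate += 1
            r["uid"] = candidate
            used.add(candidate)

    # 3. Sort once by uid, O(n log n), instead of sorting a linked
    #    list with a quadratic method.
    return sorted(refs, key=lambda r: r["uid"])


refs = [
    {"name": "a", "uid": None},
    {"name": "b", "uid": 0},
    {"name": "c", "uid": None},
    {"name": "d", "uid": 2},
]
ordered = assign_unique_ids(refs)
print([(r["name"], r["uid"]) for r in ordered])
```

The point of the sketch is the shape of the work: one pass to collect, one pass to allocate, one sort, with no nested scan over all references.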
Willy Tarreau	3ebd55ee51	MINOR: haproxy: export run_poll_loop This will help refine debug traces.	2020-03-03 15:26:10 +01:00
Willy Tarreau	908071171b	BUILD: general: always pass unsigned chars to is* functions The isalnum(), isalpha(), isdigit() etc functions from ctype.h are supposed to take an int as argument which must either reflect an unsigned char or EOF. In practice on some platforms they're implemented as macros referencing an array, and when passed a char, they either cause a warning "array subscript has type 'char'" when lucky, or cause random segfaults when unlucky. It's quite inconvenient by the way since none of them may return true for negative values. The recent introduction of cygwin to the list of regularly tested build platforms revealed a lot of breakage there due to the same issues again. So this patch addresses the problem all over the code at once. It adds unsigned char casts to every valid use case, and also drops the unneeded double cast to int that was sometimes added on top of it. It may be backported by dropping irrelevant changes if that helps better support uncommon platforms. It's unlikely to fix bugs on platforms which would already not emit any warning though.	2020-02-25 08:16:33 +01:00
Christopher Faulet	6d0c3dfac6	MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding This patch introduces the 'http-after-response' rules. These rules are evaluated at the end of the response analysis, just before the data forwarding, on ALL HTTP responses, the server ones but also all responses generated by HAProxy. Thanks to this ruleset, it is now possible for instance to add some headers to the responses generated by the stats applet. The following actions are supported: * allow * add-header * del-header * replace-header * replace-value * set-header * set-status * set-var * strict-mode * unset-var	2020-02-06 14:55:34 +01:00
Christopher Faulet	546c4696bb	MINOR: global: Set default tune.maxrewrite value during global structure init When the global structure is initialized, instead of setting tune.maxrewrite to -1, its default value can be immediately set. This way, it is always defined during the configuration validity check. Otherwise, the only way to have it defined at this stage is to explicitly set it in the global section.	2020-02-06 09:36:36 +01:00
Willy Tarreau	71f95fa20e	[RELEASE] Released version 2.2-dev1 Released version 2.2-dev1 with the following main changes : - DOC: this is development again - MINOR: version: this is development again, update the status - SCRIPTS: update create-release to fix the changelog on new branches - CLEANUP: ssl: Clean up error handling - BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only - BUG/MINOR: h1: Don't test the host header during response parsing - BUILD/MINOR: trace: fix use of long type in a few printf format strings - DOC: Clarify behavior of server maxconn in HTTP mode - MINOR: ssl: deduplicate ca-file - MINOR: ssl: compute ca-list from deduplicate ca-file - MINOR: ssl: deduplicate crl-file - CLEANUP: dns: resolution can never be null - BUG/MINOR: http-htx: Don't make http_find_header() fail if the value is empty - DOC: ssl/cli: set/commit/abort ssl cert - BUG/MINOR: ssl: fix SSL_CTX_set1_chain compatibility for openssl < 1.0.2 - BUG/MINOR: fcgi-app: Make the directive pass-header case insensitive - BUG/MINOR: stats: Fix HTML output for the frontends heading - BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0 - DOC: clarify matching strings on binary fetches - DOC: Fix ordered list in summary - DOC: move the "group" keyword at the right place - MEDIUM: init: prevent process and thread creation at runtime - BUG/MINOR: ssl/cli: 'ssl cert' cmd only usable w/ admin rights - BUG/MEDIUM: stream-int: don't subscribed for recv when we're trying to flush data - BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible - BUG/MINOR: ssl/cli: don't overwrite the filters variable - BUG/MEDIUM: listener/thread: fix a race when pausing a listener - BUG/MINOR: ssl: certificate choice can be unexpected with openssl >= 1.1.1 - BUG/MEDIUM: mux-h1: Never reuse H1 connection if a shutw is pending - BUG/MINOR: mux-h1: Don't rely on CO_FL_SOCK_RD_SH to set H1C_F_CS_SHUTDOWN - BUG/MINOR: mux-h1: Fix conditions to know whether or not we 
may receive data - BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity(). - BUG/MEDIUM: checks: Make sure we set the task affinity just before connecting. - MINOR: debug: replace popen() with pipe+fork() in "debug dev exec" - MEDIUM: init: set NO_NEW_PRIVS by default when supported - BUG/MINOR: mux-h1: Be sure to set CS_FL_WANT_ROOM when EOM can't be added - BUG/MEDIUM: mux-fcgi: Handle cases where the HTX EOM block cannot be inserted - BUG/MINOR: proxy: make soft_stop() also close FDs in LI_PAUSED state - BUG/MINOR: listener/threads: always use atomic ops to clear the FD events - BUG/MINOR: listener: also clear the error flag on a paused listener - BUG/MEDIUM: listener/threads: fix a remaining race in the listener's accept() - MINOR: listener: make the wait paths cleaner and more reliable - MINOR: listener: split dequeue_all_listener() in two - REORG: listener: move the global listener queue code to listener.c - DOC: document the listener state transitions - BUG/MEDIUM: kqueue: Make sure we report read events even when no data. - BUG/MAJOR: dns: add minimalist error processing on the Rx path - BUG/MEDIUM: proto_udp/threads: recv() and send() must not be exclusive. 
- DOC: listeners: add a few missing transitions - BUG/MINOR: tasks: only requeue a task if it was already in the queue - MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups - DOC: proxies: HAProxy only supports 3 connection modes - DOC: remove references to the outdated architecture.txt - BUG/MINOR: log: fix minor resource leaks on logformat error path - BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers - BUG/MINOR: listener: do not immediately resume on transient error - BUG/MINOR: server: make "agent-addr" work on default-server line - BUG/MINOR: listener: fix off-by-one in state name check - BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy() - MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state - MINOR: http-htx: Add some htx sample fetches for debugging purpose - REGTEST: Add an HTX reg-test to check an edge case - DOC: clarify the fact that replace-uri works on a full URI - BUG/MINOR: sample: fix the closing bracket and LF in the debug converter - BUG/MINOR: sample: always check converters' arguments - MINOR: sample: Validate the number of bits for the sha2 converter - BUG/MEDIUM: ssl: Don't set the max early data we can receive too early. - MINOR: ssl/cli: 'show ssl cert' give information on the certificates - BUG/MINOR: ssl/cli: fix build for openssl < 1.0.2 - MINOR: debug: support logging to various sinks - MINOR: http: add a new "replace-path" action - REGTEST: ssl: test the "set ssl cert" CLI command - REGTEST: run-regtests: implement #REQUIRE_BINARIES - MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task - BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing - BUG/MEDIUM: ssl: Revamp the way early data are handled. 
- MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute - BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd - REGTEST: make the "set ssl cert" require version 2.1 - BUG/MINOR: ssl: openssl-compat: Fix getm_ defines - BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry - BUG/MINOR: state-file: do not store duplicates in the global tree - BUG/MINOR: state-file: do not leak memory on parse errors - BUG/MAJOR: mux-h1: Don't pretend the input channel's buffer is full if empty - BUG/MEDIUM: stream: Be sure to never assign a TCP backend to an HTX stream - BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility - BUILD: travis-ci: link with ssl libraries using rpath instead of LD_LIBRARY_PATH/DYLD_LIBRARY_PATH - BUILD: travis-ci: reenable address sanitizer for clang builds - BUG/MINOR: checks: refine which errno values are really errors. - BUG/MINOR: connection: only wake send/recv callbacks if the FD is active - CLEANUP: connection: conn->xprt is never NULL - MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP - MEDIUM: tcp: make tcp_connect_probe() consider ERR/HUP - REORG: connection: move tcp_connect_probe() to conn_fd_check() - MINOR: connection: check for connection validation earlier - MINOR: connection: remove the double test on xprt_done_cb() - CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE - MINOR: poller: do not call the IO handler if the FD is not active - OPTIM: epoll: always poll for recv if neither active nor ready - OPTIM: polling: do not create update entries for FD removal - BUG/MEDIUM: checks: Only attempt to do handshakes if the connection is ready. - BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection. - BUILD: CI: modernize cirrus-ci - MINOR: config: disable busy polling on old processes - MINOR: ssl: Remove unused variable "need_out". 
- BUG/MINOR: h1: Report the right error position when a header value is invalid - BUG/MINOR: proxy: Fix input data copy when an error is captured - BUG/MEDIUM: http-ana: Truncate the response when a redirect rule is applied - BUG/MINOR: channel: inject output data at the end of output - BUG/MEDIUM: session: do not report a failure when rejecting a session - MEDIUM: dns: implement synchronous send - MINOR: raw_sock: make sure to disable polling once everything is sent - MINOR: http: Add 410 to http-request deny - MINOR: http: Add 404 to http-request deny - CLEANUP: mux-h2: remove unused goto "out_free_h2s" - BUILD: cirrus-ci: choose proper openssl package name - BUG/MAJOR: listener: do not schedule a task-less proxy - CLEANUP: server: remove unused err section in server_finalize_init - REGTEST: set_ssl_cert.vtc: replace "echo" with "printf" - BUG/MINOR: stream-int: Don't trigger L7 retry if max retries is already reached - BUG/MEDIUM: tasks: Use the MT macros in tasklet_free(). - BUG/MINOR: mux-h2: use a safe list_for_each_entry in h2_send() - BUG/MEDIUM: mux-h2: fix missing test on sending_list in previous patch - CLEANUP: ssl: remove opendir call in ssl_sock_load_cert - MEDIUM: lua: don't call the GC as often when dealing with outgoing connections - BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary - BUG/MINOR: cli/mworker: can't start haproxy with 2 programs - REGTEST: mcli/mcli_start_progs: start 2 programs - BUG/MEDIUM: mworker: remain in mworker mode during reload - DOC: clarify crt-base usage - CLEANUP: compression: remove unused deinit_comp_ctx section - BUG/MEDIUM: mux_h1: Don't call h1_send if we subscribed(). - BUG/MEDIUM: raw_sock: Make sur the fd and conn are sync. 
- CLEANUP: proxy: simplify proxy_parse_rate_limit proxy checks - BUG/MAJOR: hashes: fix the signedness of the hash inputs - REGTEST: add sample_fetches/hashes.vtc to validate hashes - BUG/MEDIUM: cli: _getsocks must send the peers sockets - CLEANUP: cli: deduplicate the code in _getsocks - BUG/MINOR: stream: don't mistake match rules for store-request rules - BUG/MEDIUM: connection: add a mux flag to indicate splice usability - BUG/MINOR: pattern: handle errors from fgets when trying to load patterns - MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only - MINOR: stream-int: remove dependency on CO_FL_WAIT_ROOM for rcv_buf() - MEDIUM: connection: get rid of CO_FL_CURR_* flags - BUILD: pattern: include errno.h - MEDIUM: mux-h2: do not try to stop sending streams on blocked mux - MEDIUM: mux-fcgi: do not try to stop sending streams on blocked mux - MEDIUM: mux-h2: do not make an h2s subscribe to itself on deferred shut - MEDIUM: mux-fcgi: do not make an fstrm subscribe to itself on deferred shut - REORG: stream/backend: move backend-specific stuff to backend.c - MEDIUM: backend: move the connection finalization step to back_handle_st_con() - MEDIUM: connection: merge the send_wait and recv_wait entries - MEDIUM: xprt: merge recv_wait and send_wait in xprt_handshake - MEDIUM: ssl: merge recv_wait and send_wait in ssl_sock - MEDIUM: mux-h1: merge recv_wait and send_wait - MEDIUM: mux-h2: merge recv_wait and send_wait event notifications - MEDIUM: mux-fcgi: merge recv_wait and send_wait event notifications - MINOR: connection: make the last arg of subscribe() a struct wait_event* - MINOR: ssl: Add support for returning the dn samples from ssl_(c\|f)_(i\|s)_dn in LDAP v3 (RFC2253) format. 
- DOC: Fix copy and paste mistake in http-response replace-value doc - BUG/MINOR: cache: Fix leak of cache name in error path - BUG/MINOR: dns: Make dns_query_id_seed unsigned - BUG/MINOR: 51d: Fix bug when HTX is enabled - MINOR: http-htx: Move htx sample fetches in the scope "internal" - MINOR: http-htx: Rename 'internal.htx_blk.val' to 'internal.htx_blk.data' - MINOR: http-htx: Make 'internal.htx_blk_data' return a binary string - DOC: Add a section to document the internal sample fetches - MINOR: mux-h1: Inherit send flags from the upper layer - MINOR: contrib/prometheus-exporter: Add heathcheck status/code in server metrics - BUG/MINOR: http-ana/filters: Wait end of the http_end callback for all filters - BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules - BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during parsing - MEDIUM: http-rules: Register an action keyword for all http rules - MINOR: tcp-rules: Always set from which ruleset a rule comes from - MINOR: actions: Use ACT_RET_CONT code to ignore an error from a custom action - MINOR: tcp-rules: Kill connections when custom actions return ACT_RET_ERR - MINOR: http-rules: Return an error when custom actions return ACT_RET_ERR - MINOR: counters: Add a counter to report internal processing errors - MEDIUM: http-ana: Properly handle internal processing errors - MINOR: http-rules: Add a rule result to report internal error - MINOR: http-rules: Handle internal errors during HTTP rules evaluation - MINOR: http-rules: Add more return codes to let custom actions act as normal ones - MINOR: tcp-rules: Handle denied/aborted/invalid connections from TCP rules - MINOR: http-rules: Handle denied/aborted/invalid connections from HTTP rules - MINOR: stats: Report internal errors in the proxies/listeners/servers stats - MINOR: contrib/prometheus-exporter: Export internal errors per proxy/server - MINOR: counters: Remove failed_secu counter and use denied_resp instead - MINOR: counters: 
Review conditions to increment counters from analysers - MINOR: http-ana: Add a txn flag to support soft/strict message rewrites - MINOR: http-rules: Handle all message rewrites the same way - MINOR: http-rules: Add a rule to enable or disable the strict rewriting mode - MEDIUM: http-rules: Enable the strict rewriting mode by default - REGTEST: Fix format of set-uri HTTP request rule in h1or2_to_h1c.vtc - MINOR: actions: Add a function pointer to release args used by actions - MINOR: actions: Regroup some info about HTTP rules in the same struct - MINOR: http-rules/tcp-rules: Call the defined action function first if defined - MINOR: actions: Rename the act_flag enum into act_opt - MINOR: actions: Add flags to configure the action behaviour - MINOR: actions: Use an integer to set the action type - MINOR: http-rules: Use a specific action type for some custom HTTP actions - MINOR: http-rules: Make replace-header and replace-value custom actions - MINOR: http-rules: Make set-header and add-header custom actions - MINOR: http-rules: Make set/del-map and add/del-acl custom actions - MINOR: http-rules: Group all processing of early-hint rule in its case clause - MEDIUM: http-rules: Make early-hint custom actions - MINOR: http-rule/tcp-rules: Make track-sc* custom actions - MINOR: tcp-rules: Make tcp-request capture a custom action - MINOR: http-rules: Add release functions for existing HTTP actions - BUG/MINOR: http-rules: Fix memory releases on error path during action parsing - MINOR: tcp-rules: Add release functions for existing TCP actions - BUG/MINOR: tcp-rules: Fix memory releases on error path during action parsing - MINOR: http-htx: Add functions to read a raw error file and convert it in HTX - MINOR: http-htx: Add functions to create HTX redirect message - MINOR: config: Use dedicated function to parse proxy's errorfiles - MINOR: config: Use dedicated function to parse proxy's errorloc - MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP 
error messages - MINOR: proxy: Register keywords to parse errorfile and errorloc directives - MINOR: http-htx: Add a new section to create groups of custom HTTP errors - MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy - MINOR: http-rules: Update txn flags and status when a deny rule is executed - MINOR: http-rules: Support an optional status on deny rules for http reponses - MINOR: http-rules: Use same function to parse request and response deny actions - MINOR: http-ana: Add an error message in the txn and send it when defined - MEDIUM: http-rules: Support an optional error message in http deny rules - REGTEST: Add a strict rewriting mode reg test - REGEST: Add reg tests about error files - MINOR: ssl: accept 'verify' bind option with 'set ssl cert' - BUG/MINOR: ssl: ssl_sock_load_ocsp_response_from_file memory leak - BUG/MINOR: ssl: ssl_sock_load_issuer_file_into_ckch memory leak - BUG/MINOR: ssl: ssl_sock_load_sctl_from_file memory leak - BUG/MINOR: http_htx: Fix some leaks on error path when error files are loaded - CLEANUP: http-ana: Remove useless test on txn when the error message is retrieved - BUILD: CI: introduce ARM64 builds - BUILD: ssl: more elegant anti-replay feature presence check - MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive - MEDIUM: dns: use Additional records from SRV responses - CLEANUP: Consistently `unsigned int` for bitfields - CLEANUP: pattern: remove the pat_time definition - BUG/MINOR: http_act: don't check capture id in backend - BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x	2020-01-22 10:34:58 +01:00
Christopher Faulet	2f5339079b	MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive It is now possible to insert any attribute when a cookie is inserted by HAProxy. Any value may be set, no check is performed except the syntax validity (CTRL chars and ';' are forbidden). For instance, it may be used to add the SameSite attribute: cookie SRV insert attr "SameSite=Strict" The attr option may be repeated to add several attributes. This patch should fix the issue #361.	2020-01-22 07:18:31 +01:00
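The syntax check mentioned in the commit above (CTRL chars and ';' are forbidden) could look like the sketch below. The function name `is_valid_cookie_attr` is hypothetical, and treating "CTRL chars" as ASCII codes below 0x20 plus 0x7f is an assumption, not something the commit message states.

```python
def is_valid_cookie_attr(value):
    """Return True when value holds no control character and no ';'.

    Assumption: control characters are ASCII 0x00-0x1f and 0x7f.
    """
    for ch in value:
        code = ord(ch)
        if ch == ";" or code < 0x20 or code == 0x7F:
            return False
    return True


for attr in ("SameSite=Strict", "Secure; HttpOnly", "bad\tvalue"):
    print(repr(attr), is_valid_cookie_attr(attr))
```

Rejecting ';' keeps each `attr` value to a single attribute, which matches the commit's note that the option is repeated to add several.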
Christopher Faulet	5885775de1	MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages All custom HTTP errors are now stored in a global tree. Proxies hold references to these messages. The key used for errorfile directives is the file name as specified in the configuration. For errorloc directives, a key is created using the redirect code and the url. This means that the same custom error message is now stored only once. Even when it is used in several proxies or for several status codes, it is parsed and stored only once.	2020-01-20 15:18:46 +01:00
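The keying scheme described in the commit above can be illustrated with a small cache. This is a sketch only: `ErrorMessageStore`, its method names, and the dict used in place of HAProxy's tree are assumptions, and the stored values are placeholders rather than real HTX messages.

```python
class ErrorMessageStore:
    """Hypothetical global store keeping each custom error message once."""

    def __init__(self):
        self._messages = {}

    def errorfile(self, path, load):
        # errorfile entries are keyed by the file name exactly as it
        # is written in the configuration.
        key = ("errorfile", path)
        if key not in self._messages:
            self._messages[key] = load(path)  # parsed on first use only
        return self._messages[key]

    def errorloc(self, code, url):
        # errorloc entries are keyed by the redirect code and the URL.
        key = ("errorloc", code, url)
        if key not in self._messages:
            self._messages[key] = {"status": code, "location": url}
        return self._messages[key]


store = ErrorMessageStore()
parsed = []


def load(path):
    """Stand-in parser that records how often a file is parsed."""
    parsed.append(path)
    return {"source": path}


frontend_msg = store.errorfile("/etc/haproxy/errors/503.http", load)
backend_msg = store.errorfile("/etc/haproxy/errors/503.http", load)
print(frontend_msg is backend_msg, len(parsed))
```

Two proxies that name the same file end up sharing one object, and the file is parsed a single time.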
Christopher Faulet	58b3564fde	MINOR: actions: Add a function pointer to release args used by actions Arguments used by actions are never released during HAProxy deinit. Now, it is possible to specify a function to do so. ".release_ptr" field in the act_rule structure may be set during the configuration parsing to a specific deinit function depending on the action type.	2020-01-20 15:18:45 +01:00
Christopher Faulet	cb5501327c	BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules Functions to deinitialize the HTTP rules are buggy. These functions do not check the action name to release the right part in the arg union. Only some of the information is released. For auth rules, the realm is released and there is no problem here. But the regex <arg.hdr_add.re> is always unconditionally released. So it is easy to make these functions crash. For instance, with the following rule HAProxy crashes during the deinit : http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)] For now, these functions are simply removed and we rely on the deinit function used for TCP rules (renamed as deinit_act_rules()). This patch fixes the bug. But arguments used by actions are not released at all, this part will be addressed later. This patch must be backported to all stable versions.	2020-01-20 15:18:45 +01:00
William Lallemand	24c928c8bd	BUG/MEDIUM: mworker: remain in mworker mode during reload If you reload an haproxy started in master-worker mode with "master-worker" in the configuration, and no "-W" argument, the new process loses the fact that it was in master-worker mode, resulting in weird behaviors. The biggest problem is that if it is reloaded with a bad configuration, the master will exit instead of remaining in waitpid mode. This problem was discovered in bug #443. Should be backported to every version using the master-worker mode (as far as 1.8).	2020-01-14 18:10:29 +01:00
Willy Tarreau	719e07c989	BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy() Apparently gcc developers decided that strncpy() semantics are no longer valid and now deserve a warning, especially if used exactly as designed. This results in issue #304. Let's just remove one from the target size to please her majesty gcc, the God of C Compilers, who tries hard to make users completely eliminate any use of string.h and reimplement it by themselves at much higher risks. Pfff.... This can be backported to stable versions, the fix is harmless since it ignores the last zero that is already set on next line.	2019-12-11 16:29:10 +01:00
Willy Tarreau	d26c9f9465	BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers If a new process is started with -sf and it fails to bind, it may send a SIGTTOU to the master process in hope that it will temporarily unbind. Unfortunately this one doesn't catch it and stops to background instead of forwarding the signal to the workers. The same is true for SIGTTIN. This commit simply implements an extra signal handler for the master to deal with such signals that must be passed down to the workers. It must be backported as far as 1.8, though there the code differs in that it's entirely in haproxy.c and doesn't require an extra sig handler.	2019-12-11 14:26:53 +01:00
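The forwarding behaviour described in the commit above can be sketched like this. It is an illustration, not the master-worker code: `forward_to_workers`, `install_master_handlers`, and the injectable `kill` parameter are hypothetical names introduced so the logic can be exercised without real worker processes.

```python
import os
import signal

# Signals the master relays instead of acting on them itself.
FORWARDED_SIGNALS = (signal.SIGTTOU, signal.SIGTTIN)


def forward_to_workers(signum, worker_pids, kill=os.kill):
    """Relay signum to every worker and return the PIDs signalled."""
    delivered = []
    for pid in worker_pids:
        try:
            kill(pid, signum)
            delivered.append(pid)
        except ProcessLookupError:
            # The worker exited in the meantime, nothing to forward.
            pass
    return delivered


def install_master_handlers(worker_pids):
    """Catch the forwarded signals so the default action (stopping
    the master) is replaced by a relay to the workers."""
    for signum in FORWARDED_SIGNALS:
        signal.signal(
            signum,
            lambda received, _frame: forward_to_workers(received, worker_pids),
        )
```

Without a handler, the default disposition of SIGTTOU and SIGTTIN stops the receiving process, which is the symptom the commit describes for the master.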
Willy Tarreau	c49ba52524	MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups We used to have wake_expired_tasks() wake up tasks and return the next expiration delay. The problem this causes is that we have to call it just before poll() in order to consider latest timers, but this also means that we don't wake up all newly expired tasks upon return from poll(), which thus systematically requires a second poll() round. This is visible when running any scheduled task like a health check, as there are systematically two poll() calls, one with the interval, nothing is done after it, and another one with a zero delay, and the task is called: listen test bind *:8001 server s1 127.0.0.1:1111 check 09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0 09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0 09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0 >> nothing run here, as the expired task was not woken up yet. 
09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0 09:37:39.202505 epoll_wait(3, [], 200, 0) = 0 09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0 >> now the expired task was woken up 09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0 09:37:39.202683 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0 09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:37:39.202715 close(7) = 0 Let's instead split the function in two parts: - the first part, wake_expired_tasks(), called just before process_runnable_tasks(), wakes up all expired tasks; it doesn't compute any timeout. - the second part, next_timer_expiry(), called just before poll(), only computes the next timeout for the current thread. 
Thanks to this, all expired tasks are properly woken up when leaving poll, and each poll call's timeout remains up to date: 09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0 09:41:16.270457 epoll_wait(3, [], 200, 999) = 0 09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0 09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0 09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0 09:41:17.270323 epoll_wait(3, [{EPOLLOUT\|EPOLLERR\|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1 09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0 09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0 09:41:17.270367 close(7) = 0 This may be backported to 2.1 and 2.0 though it's unlikely to bring any user-visible improvement except to clarify debugging.	2019-12-11 09:42:58 +01:00
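The split described in the commit above can be modelled with a toy scheduler. Only the two function names `wake_expired_tasks()` and `next_timer_expiry()` and the order of calls come from the commit message; the `Scheduler` class, the heap-based timer queue, and the integer clock are assumptions made for the sketch.

```python
import heapq


class Scheduler:
    """Toy model: timers live in a heap, woken tasks in a run queue."""

    def __init__(self):
        self.timers = []      # heap of (expiry, sequence, task)
        self.run_queue = []
        self._seq = 0

    def schedule(self, expiry, task):
        # The sequence number keeps heap ordering stable and avoids
        # comparing task objects when two expiries are equal.
        heapq.heappush(self.timers, (expiry, self._seq, task))
        self._seq += 1

    def wake_expired_tasks(self, now):
        """First part: wake every expired task, compute no timeout."""
        while self.timers and self.timers[0][0] <= now:
            _, _, task = heapq.heappop(self.timers)
            self.run_queue.append(task)

    def process_runnable_tasks(self):
        results = [task() for task in self.run_queue]
        self.run_queue.clear()
        return results

    def next_timer_expiry(self, now):
        """Second part: only compute the delay until the next timer,
        or None when no timer is pending."""
        if not self.timers:
            return None
        return max(0, self.timers[0][0] - now)


sched = Scheduler()
sched.schedule(1000, lambda: "health check")

# One loop iteration at time 0: nothing expired, poll for 1000 ticks.
sched.wake_expired_tasks(0)
print(sched.process_runnable_tasks(), sched.next_timer_expiry(0))

# Returning from poll at time 1000: the task runs in the same
# iteration, with no extra zero-delay poll round.
sched.wake_expired_tasks(1000)
print(sched.process_runnable_tasks(), sched.next_timer_expiry(1000))
```

Because waking and timeout computation are separate, the expired task is picked up right after the poll returns rather than on a second pass.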
Willy Tarreau	a1d97f88e0	REORG: listener: move the global listener queue code to listener.c The global listener queue code and declarations were still lying in haproxy.c while not needed there anymore at all. This complicates the code for no reason. As a result, the global_listener_queue_task and the global_listener_queue were made static.	2019-12-10 14:16:03 +01:00
Willy Tarreau	241797a3fc	MINOR: listener: split dequeue_all_listener() in two We use it half of the time for the global_listener_queue and half of the time for a proxy's queue and this requires the callers to take care of these. Let's split it in two versions, the current one working only on the global queue and another one dedicated to proxies for the per-proxy queues. This cleans up quite a bit of code.	2019-12-10 14:14:09 +01:00
Willy Tarreau	a45a8b5171	MEDIUM: init: set NO_NEW_PRIVS by default when supported HAProxy doesn't need to call executables at run time (except when using external checks which are strongly recommended against), and is even expected to isolate itself into an empty chroot. As such, there basically is no valid reason to allow a setuid executable to be called without the user being fully aware of the risks. In a situation where haproxy would need to call external checks and/or disable chroot, exploiting a vulnerability in a library or in haproxy itself could lead to the execution of an external program. On Linux it is possible to lock the process so that any setuid bit present on such an executable is ignored. This significantly reduces the risk of privilege escalation in such a situation. This is what haproxy does by default. In case this causes a problem to an external check (for example one which would need the "ping" command), then it is possible to disable this protection by explicitly adding this directive in the global section. If enabled, it is possible to turn it back off by prefixing it with the "no" keyword. Before the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" uid=0(root) gid=0(root) groups=0(root After the option: $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id" sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?	2019-12-06 17:20:26 +01:00
Willy Tarreau	d96f1126fe	MEDIUM: init: prevent process and thread creation at runtime Some concerns are regularly raised about the risk of inheriting some Lua files which make use of a fork (e.g. via os.execute()) as well as whether or not some of the bugs we fix might be exploitable to run some code. Given that haproxy is event-driven, any foreground activity completely stops processing and is easy to detect, but background activity is a different story. A Lua script could very well discreetly fork a sub-process connecting to a remote location and taking commands, and some injected code could also try to hide its activity by creating a process or a thread without blocking the rest of the processing. While such activities should be extremely limited when run in an empty chroot without any permission, it would be better to get a higher assurance they cannot happen. This patch introduces something very simple: it limits the number of processes and threads to zero in the workers after the last thread was created. By doing so, it effectively instructs the system to fail on any fork() or clone() syscall. Thus any undesired activity has to happen in the foreground and is way easier to detect. This will obviously break external checks (whose concept is already totally insecure), and for this reason a new option "insecure-fork-wanted" was added to disable this protection, and it is suggested in the fork() error report from the checks. It is obviously recommended not to use it and to reconsider the reasons leading to it being enabled in the first place. If for any reason we fail to disable forks, we still start because it could be imaginable that some operating systems refuse to set this limit to zero, but in this case we emit a warning, that may or may not be reported since we're after the fork point. Ideally over the long term it should be conditioned by strict-limits and cause a hard fail.	2019-12-03 11:49:00 +01:00
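The idea in the commit above, limiting the number of processes and threads to zero so that later fork() calls fail, can be tried from Python. The sketch assumes the limit in question is `RLIMIT_NPROC`, which the commit message does not name, and the helper names are hypothetical. Note that on Linux a privileged process (for example root) is typically exempt from this limit, so the fork may still succeed there.

```python
import os
import resource


def forbid_process_creation():
    """Set both the soft and hard RLIMIT_NPROC to zero.

    Lowering the hard limit cannot be undone by an unprivileged
    process, so call this only once no more workers are needed.
    """
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))


def try_fork():
    """Return True when fork() still works, False when it is refused."""
    try:
        pid = os.fork()
    except OSError:
        return False
    if pid == 0:
        os._exit(0)  # child: leave immediately
    os.waitpid(pid, 0)
    return True
```

A worker would call `forbid_process_creation()` after its last thread is created. Any later `try_fork()` from an unprivileged process is then expected to return False.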
Willy Tarreau	47479eb0e7	MINOR: version: emit the link to the known bugs in output of "haproxy -v" The link to the known bugs page for the current version is built and reported there. When it is a development version (less than 2 dots), instead a link to github open issues is reported as there's no way to be sure about the current situation in this case and it's better that users report their trouble there.	2019-11-21 18:48:20 +01:00
Willy Tarreau	08dd202d73	MINOR: version: report the version status in "haproxy -v" As discussed on Discourse here: https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466 it's not always easy for end users to know the lifecycle of the version they are using. This patch introduces a "Status" line in the output of "haproxy -vv" indicating whether it's a development, stable, long-term supported version, possibly with an estimated end of life for the branch when it can be anticipated (e.g. for stable versions). This field should be adjusted when creating a major release to reflect the new status. It may make sense to backport this to other branches to clarify the situation.	2019-11-21 18:47:54 +01:00
William Lallemand	677e2f2c35	BUG/MEDIUM: mworker: don't fill the -sf argument with -1 during the reexec Upon a reexec_on_failure, if the process tried to exit after the initialization of the process structure but before it was filled with a PID, the PID in the mworker_proc structure is set to -1. In this particular case the -sf argument is filled with -1 and haproxy will exit with the usage message because of that argument. Should be backported in 2.0.	2019-11-19 17:30:34 +01:00
William Dauchy	f9af9d7f3c	MINOR: init: avoid code duplication while setting identity Since the introduction of mworker, the setuid/setgid was duplicated in two places; try to improve that by creating a dedicated function. This patch does not introduce any functional change. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2019-11-17 16:55:50 +01:00
William Dauchy	e039f26ba4	BUG/MINOR: init: fix set-dumpable when using uid/gid in mworker mode used with uid/gid settings, it was not possible to get a coredump despite the set-dumpable option. indeed prctl(2) manual page specifies the dumpable attribute is reverted to `/proc/sys/fs/suid_dumpable` in a few conditions such as process effective user and group are changed. this patch moves the whole set-dumpable logic before the polling code in order to catch all possible cases where we could have changed the uid/gid. It however does not cover the possible segfault at startup. this patch should be backported in 2.0. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2019-11-17 16:55:24 +01:00
William Dauchy	0fec3ab7bf	MINOR: init: always fail when setrlimit fails This patch introduces a strict-limits parameter which enforces the setrlimit setting instead of a warning. This option can be forcibly disabled with the "no" keyword. The general aim of this patch is to avoid bad surprises on a production environment where you change the maxconn for example, a new fd limit is calculated, but cannot be set because of sysfs setting. In that case you might want to have an explicit failure to be aware of it before seeing your traffic going down. During a global rollout it is also useful to explicitly fail as most progressive rollouts would simply check the general health check of the process. As discussed, plan to use the strict by default mode starting from v2.3. Signed-off-by: William Dauchy <w.dauchy@criteo.com>	2019-10-29 17:42:27 +01:00
William Lallemand	5fdb5b36e1	BUG/MINOR: mworker/ssl: close openssl FDs unconditionally Patch `56996da` ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload") fixes an issue where the /dev/random FD was leaked by OpenSSL upon a reload in master worker mode. Indeed the FD was not flagged with CLOEXEC. The fix was checking if ssl_used_frontend or ssl_used_backend were set to close the FD. This is wrong, indeed the lua init code creates an SSL server without increasing the backend value, so the deinit is never done when you don't use SSL in your configuration. To reproduce the problem you just need to build haproxy with openssl and lua with an openssl which does not use the getrandom() syscall. No openssl or lua configuration is required for haproxy. This patch must be backported as far as 1.8. Fix issue #314.	2019-10-17 11:36:22 +02:00
David Carlier	5e4c8e2a67	BUILD/MEDIUM: threads: enable cpu_affinity on osx Enable it but on a per thread basis only using Darwin native API.	2019-10-17 07:20:58 +02:00
David Carlier	a92c5cec2d	BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info On Darwin, the thread_info name exists as a standard function thus we need to rename our array to ha_thread_info to fix this conflict.	2019-10-17 07:15:17 +02:00
Olivier Houchard	bba1a263c5	BUG/MEDIUM: tasklets: Make sure we're waking the target thread if it sleeps. Now that we can wake tasklet for other threads, make sure that if the thread is sleeping, we wake it up, or the tasklet won't be executed until it's done sleeping. That also means that, before going to sleep, and after we put our bit in sleeping_thread_mask, we have to check that nobody added a tasklet for us, just checking for global_tasks_mask isn't enough anymore.	2019-09-24 14:58:45 +02:00
Willy Tarreau	d022e9c98b	MINOR: task: introduce a thread-local "sched" variable for local scheduler stuff The aim is to gather all scheduler information related to the current thread. It simply points to task_per_thread[tid] without having to perform the operation at each time. We save around 1.2 kB of code on performance sensitive paths and increase the request rate by almost 1%.	2019-09-24 11:23:30 +02:00
Olivier Houchard	859dc80f94	MEDIUM: list: Separate "locked" list from regular list. Instead of using the same type for regular linked lists and "autolocked" linked lists, use a separate type, "struct mt_list", for the autolocked one, and introduce a set of macros, similar to the LIST_* macros, with the MT_ prefix. When we use the same entry for both regular list and autolocked list, as is done for the "list" field in struct connection, we now have to explicitly cast it to struct mt_list when using MT_ macros.	2019-09-23 18:16:08 +02:00
Christopher Faulet	c16929658f	MINOR: config: Support per-proxy and per-server post-check functions callbacks Most of the time, when a keyword is added in proxy section or on the server line, we need to have a post-parser callback to check the config validity for the proxy or the server which uses this keyword. It is possible to register a global post-parser callback. But all these callbacks need to loop on the proxies and servers to do their job. It is neither handy nor efficient. Instead, it is now possible to register per-proxy and per-server post-check callbacks.	2019-09-17 10:18:54 +02:00
Christopher Faulet	3ea5cbe6a4	MINOR: config: Support per-proxy and per-server deinit functions callbacks Most of the time, when any allocation is done during configuration parsing because of a new keyword in proxy section or on the server line, we must add a call in the deinit() function to release allocated resources. It is now possible to register a post-deinit callback because, at this stage, the proxies and the servers are already released. Now, it is possible to register deinit callbacks per-proxy or per-server. These callbacks will be called for each proxy and server before releasing them.	2019-09-17 10:18:54 +02:00
Willy Tarreau	e0d86e2c1c	BUG/MINOR: mworker: disable SIGPROF on re-exec If haproxy is built with profiling enabled with -pg, it is possible to see the master quit during a reload while it's re-executing itself with error code 155 (signal 27) saying "Profile timer expired". This happens if the SIGPROF signal is delivered during the execve() call while the handler was already unregistered. The issue itself is not directly inside haproxy but it's easy to address. This patch disables this signal before calling execvp() during a master reload. A simple test for this consists in running this little script with haproxy started in master-worker mode : $ while usleep 50000; do killall -USR2 haproxy; done This fix should be backported to all versions using the master-worker model.	2019-08-26 10:44:48 +02:00
Olivier Houchard	305d5ab469	MAJOR: fd: Get rid of the fd cache. Now that the architecture was changed so that attempts to receive/send data always come from the upper layers, instead of them only trying to do so when the lower layer let them know they could try, we can finally get rid of the fd cache. We don't really need it anymore, and removing it gives us a small performance boost.	2019-07-31 14:12:55 +02:00
Christopher Faulet	f734638976	MINOR: http: Don't store raw HTTP errors in chunks anymore Default HTTP error messages are stored in an array of chunks. And since the HTX was added, these messages are also converted to HTX and stored in another array. But now, the first array is not used anymore because the legacy HTTP mode was removed. So now, only the array with the HTX messages is kept. The other one was removed.	2019-07-19 09:46:23 +02:00
Christopher Faulet	41ba36f8b2	MINOR: global: Preset tune.max_http_hdr to its default value By default, this tune parameter is set to MAX_HTTP_HDR. This assignment is done after the configuration parsing, when we check the configuration validity. So during the configuration parsing, its value is 0. Now, it is set to MAX_HTTP_HDR from the start. So, it is possible to rely on it during the configuration parsing.	2019-07-19 09:46:23 +02:00
Christopher Faulet	1b6adb4a51	MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists The keywords req* and rsp* are now unsupported. So the corresponding lists are now unused. It is safe to remove them from the structure proxy. As a result, the code dealing with these rules in HTTP analyzers was also removed.	2019-07-19 09:24:12 +02:00
Christopher Faulet	fc9cfe4006	REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files The old module proto_http does not exist anymore. All code dedicated to the HTTP analysis is now grouped in the file proto_htx.c. So, to finish the polishing after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in http_ana.{c,h} files. In addition, all HTX analyzers and related functions prefixed with "htx_" have been renamed to start with "http_" instead.	2019-07-19 09:24:12 +02:00
Christopher Faulet	711ed6ae4a	MAJOR: http: Remove the HTTP legacy code First of all, all legacy HTTP analyzers and all functions exclusively used by them were removed. So most of the functions in proto_http.{c,h} were removed. Only functions to deal with the HTTP transaction have been kept. Then, http_msg and hdr_idx modules were entirely removed. And finally the structure http_msg was lightened of all its useless information about the legacy HTTP. The structure hdr_ctx was also removed because unused now, just like unused states in the enum h1_state. Note that the memory pool "hdr_idx" was removed and "http_txn" is now smaller.	2019-07-19 09:24:12 +02:00
Willy Tarreau	7764a57d32	BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored Since commit `81492c989` ("MINOR: threads: flatten the per-thread cpu-map"), we don't keep the procthread matrix anymore to represent the full binding possibilities, but only the proc and thread ones. The problem is that the per-process binding is not the same for each thread and for the process, and the proc[] array was assumed to store the per-proc first thread value when doing this change. Worse, the logic present there tries to deal with thread ranges and process ranges in a way which automatically excluded the other possibility (since ranges cannot be used on both) but as such fails to apply changes if neither the process nor the thread is expressed as a range. The real problem comes from the fact that specifying cpu-map 1/1 doesn't yet reveal if the per-process mask or the per-thread mask needs to be updated. In practice it's the thread one but then the current storage doesn't allow to store the binding of the first thread of each other process in nbproc>1 configurations. When removing the procthread matrix, what ought to have been kept was both the thread column for process 1 and the process line for threads 1, but instead only the thread column was kept. This patch reintroduces the storage of the configuration for the first thread of each process so that it is again possible to store either the per-thread or per-process configuration. As a partial workaround for existing configurations, it is possible to systematically indicate at least two processes or two threads at once and map them by pairs or more so that at least two values are present in the range. E.g : # set processes 1-4 to cpus 0-3 : cpu-map auto:1-4/1 0 1 2 3 # or: cpu-map 1-2/1 0 1 cpu-map 3-4/1 2 3 # set threads 1-4 to cpus 0-3 : cpu-map auto:1/1-4 0 1 2 3 # or : cpu-map 1/1-2 0 1 cpu-map 1/3-4 2 3 This fix must be backported to 2.0.	2019-07-16 15:23:09 +02:00
William Lallemand	16866670dd	BUG/MEDIUM: mworker: don't call the thread and fdtab deinit Before switching to wait mode, the per thread deinit should not be called, because we didn't initiate threads and fdtab. The problem is that the master could crash if we try to reload HAProxy. The commit `944e619` ("MEDIUM: mworker: wait mode use standard init code path") removed the deinit code by accident, but its fix `7c756a8` ("BUG/MEDIUM: mworker: fix FD leak upon reload") was incomplete and did not take care of the WAIT_MODE. This fix must be backported in 1.9 and 2.0	2019-06-24 17:54:05 +02:00
Willy Tarreau	76a80c710c	BUILD: mworker: silence two printf format warnings around getpid() getpid() is documented as returning a pid_t result, not necessarily an int. This causes a build warning on Solaris 10 because '%d' or '%u' are used in the format passed to snprintf(). Let's just cast the result as an int (respectively unsigned int). This can be backported to 2.0 and possibly older versions though it really has no impact.	2019-06-22 07:57:56 +02:00
Willy Tarreau	3c39a7d889	CLEANUP: connection: rename the wait_event.task field to .tasklet It's really confusing to call it a task because it's a tasklet and used in places where tasks and tasklets are used together. Let's rename it to tasklet to remove this confusion.	2019-06-14 14:42:29 +02:00
William Lallemand	63329e36ab	MINOR: doc: update the manpage and usage message about -S Add -S in the manpage, and update the usage message. Should be backported to 1.9.	2019-06-13 17:09:27 +02:00
William Lallemand	1dc6963086	MINOR: mworker: add the HAProxy version in "show proc" Displays the HAProxy version so you can compare the version of old processes and new ones.	2019-06-12 19:19:57 +02:00
Willy Tarreau	34a150ccf5	MEDIUM: init/threads: don't use spinlocks during the init phase PiBa-NL found some pathological cases where starting threads can hinder each other and cause a measurable slow down. This problem is reproducible with the following config (haproxy must be built with -DDEBUG_DEV) : global stats socket /tmp/sock1 mode 666 level admin nbthread 64 backend stopme timeout server 1s option tcp-check tcp-check send "debug dev exit\n" server cli unix@/tmp/sock1 check This will cause the process to be stopped once the checks are ready to start. Binding all these to just a few cores magnifies the problem. Starting them in loops shows a significant time difference among the commits : # before startup serialization $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e186161 -db -f slow-init.cfg >/dev/null 2>&1; done real 0m1.581s user 0m0.621s sys 0m5.339s # after startup serialization $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e4d7c9dd -db -f slow-init.cfg >/dev/null 2>&1; done real 0m2.366s user 0m0.894s sys 0m8.238s In order to address this, let's use plain mutexes and cond_wait during the init phase. With this done, waiting threads now sleep and the problem completely disappeared : $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy -db -f slow-init.cfg >/dev/null 2>&1; done real 0m0.161s user 0m0.079s sys 0m0.149s	2019-06-11 11:30:26 +02:00
Willy Tarreau	e4d7c9dd65	OPTIM/MINOR: init/threads: only call protocol_enable_all() on first thread There's no point in calling this on each and every thread since the first thread passing there will enable the listeners, and the next ones will simply scan all of them in turn to discover that they are already initialized. Let's only initialize them on the first thread. This could slightly speed up start up on very large configurations, even though most of the time is still spent in the main thread binding the sockets. A few measurements have consistently shown that this decreases the startup time by ~0.1s for 150k listeners. Starting all of them in parallel doesn't provide better results and can still expose some undesired races.	2019-06-10 10:53:59 +02:00
Willy Tarreau	7109282577	BUG/MEDIUM: init/threads: prevent initialized threads from starting before others Since commit `6ec902a` ("MINOR: threads: serialize threads initialization") we now serialize threads initialization. But doing so has emphasized another race which is that some threads may actually start the loop before others are done initializing. As soon as all threads enter the first thread_release() call, their rdv bit is cleared and they're all waiting for all others' rdv to be cleared as well, with their harmless bit set. The first one to notice the cleared mask will progress through thread_isolate(), take rdv again preventing most others from noticing its short pass to zero, and this first one will be able to run all the way through the initialization till the last call to thread_release() which it happily crosses, being the only one with the rdv bit, leaving the room for one or a few others to do the same. This results in some threads entering the loop before others are done with their initialization, which is particularly bad. PiBa-NL reported that some regtests fail for him due to this (which was impossible to reproduce here, but races are racy by definition). However placing some printf() in the initialization code definitely shows this unsynchronized startup. This patch takes a different approach in three steps : - first, we don't start with thread_release() anymore and we don't set the rdv mask anymore in the main call. This was initially done to let all threads start together, which we don't want. Instead we just start with thread_isolate(). Since all threads are harmful by default, they all wait for each other's readiness before starting. - second, we don't release with thread_release() but with thread_sync_release(), meaning that we don't leave the function until other ones have reached the point in the function where they decide to leave it as well.
- third, it makes sure we don't start the listeners using protocol_enable_all() before all threads have allocated their local FD tables or have initialized their pollers, otherwise startup could be racy as well. It's worth noting that it is even possible to limit this call to thread #0 as it only needs to be performed once. This now guarantees that all thread init calls start only after all threads are ready, and that no thread enters the polling loop before all others have completed their initialization. Please check GH issues #111 and #117 for more context. No backport is needed, though if some new init races are reported in 1.9 (or even 1.8) which do not affect 2.0, then it may make sense to carefully backport this small series.	2019-06-10 10:53:52 +02:00
Willy Tarreau	6ec902a659	MINOR: threads: serialize threads initialization There is no point in initializing threads in parallel when we know that it's the moment where some global variables are turned to thread-local ones, and/or that some global variables are updated (like global_now or trash_size). Some FDs might be created/destroyed/reallocated and could be tricky to follow as well (think about epoll_fd for example). Instead of having to be extremely careful about all these, and to trigger false positives in thread sanitizers, let's simply initialize one thread at a time. The init step is very fast so nobody should even notice, and we won't have any more doubts about what might have happened when analysing a dump. See GH issues #111 and #117 for some background on this.	2019-06-07 15:37:47 +02:00
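
The startup scheme described in the last few entries (one thread initializing at a time, waiters sleeping on a plain mutex and condition variable instead of spinning, and no thread entering its loop before all others are initialized) can be sketched roughly as follows. This is only an illustration using POSIX threads, not haproxy's actual code: run_serialized_init(), thread_main(), per_thread_init() and seen_done[] are hypothetical names made up for this example.

```c
#include <pthread.h>

#define MAX_THREADS 64

static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  init_cond  = PTHREAD_COND_INITIALIZER;
static int total_threads;          /* threads expected to initialize */
static int init_done;              /* threads that completed their init step */
static int seen_done[MAX_THREADS]; /* init_done as seen when entering the loop */

/* Hypothetical placeholder for per-thread setup (pollers, FD tables, ...). */
static void per_thread_init(int id)
{
    (void)id;
}

static void *thread_main(void *arg)
{
    int id = *(int *)arg;

    pthread_mutex_lock(&init_mutex);

    /* The mutex is held here, so init steps run one thread at a time. */
    per_thread_init(id);
    init_done++;
    if (init_done >= total_threads)
        pthread_cond_broadcast(&init_cond);

    /* Sleep (no spinning) until every thread has finished its init step. */
    while (init_done < total_threads)
        pthread_cond_wait(&init_cond, &init_mutex);

    /* A real thread would enter its polling loop from this point on. */
    seen_done[id] = init_done;

    pthread_mutex_unlock(&init_mutex);
    return NULL;
}

/* Starts <nthreads> threads. Returns 0 if every thread reached its loop
 * only after all of them were initialized, -1 otherwise.
 */
int run_serialized_init(int nthreads)
{
    pthread_t th[MAX_THREADS];
    int ids[MAX_THREADS];
    int started = 0, ret = 0, i;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_mutex_lock(&init_mutex);
    total_threads = nthreads;
    init_done = 0;
    pthread_mutex_unlock(&init_mutex);

    for (i = 0; i < nthreads; i++) {
        ids[i] = i;
        seen_done[i] = 0;
        if (pthread_create(&th[i], NULL, thread_main, &ids[i]) != 0)
            break;
        started++;
    }

    if (started < nthreads) {
        /* Creation failed: lower the target so that waiters are not stuck. */
        pthread_mutex_lock(&init_mutex);
        total_threads = started;
        pthread_cond_broadcast(&init_cond);
        pthread_mutex_unlock(&init_mutex);
        ret = -1;
    }

    for (i = 0; i < started; i++) {
        pthread_join(th[i], NULL);
        if (seen_done[i] != total_threads)
            ret = -1;
    }
    return ret;
}
```

Calling run_serialized_init(8) from a program built with -pthread starts eight threads and checks that each one saw all eight initializations completed before proceeding. A mutex and condition variable are used here because the 34a150ccf5 entry names them as the replacement for spinlocks during the init phase; the rendez-vous primitives mentioned in 7109282577 (thread_isolate(), thread_sync_release()) are not modelled.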

702 Commits