haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-08 08:07:10 +02:00

Author	SHA1	Message	Date
William Lallemand	117b03ff4a	BUG/MINOR: mworker: leak of a socketpair during startup failure Aurelien Darragon found a case of leak when working on ticket #2184. When a reexec_on_failure() happens BEFORE protocol_bind_all(), the worker is not fork and the mworker_proc struct is still there with its 2 socketpairs. The socketpair that is supposed to be in the master is already closed in mworker_cleanup_proc(), the one for the worker was suppposed to be cleaned up in mworker_cleanlisteners(). However, since the fd is not bound during this failure, the fd is never closed. This patch fixes the problem by setting the fd to -1 in the mworker_proc after the fork, so we ensure that this it won't be close if everything was done right, and then we try to close it in mworker_cleanup_proc() when it's not set to -1. This could be triggered with the script in ticket #2184 and a `ulimit -H -n 300`. This will fail before the protocol_bind_all() when trying to increase the nofile setrlimit. In recent version of haproxy, there is a BUG_ON() in fd_insert() that could be triggered by this bug because of the global.maxsock check. Must be backported as far as 2.6. The problem could exist in previous version but the code is different and this won't be triggered easily without other consequences in the master.	2023-06-21 09:44:18 +02:00
Aurelien DARRAGON	33bbeecde3	BUILD: init: print rlim_cur as regular integer haproxy does not compile anymore on macOS+clang since `425d7ad` ("MINOR: init: pre-allocate kernel data structures on init"). This is due to rlim_cur being printed uncasted using %lu format specifier, with rlim_cur being stored as a rlim_t which is a typedef so its size may vary depending on the system's architecture. This is not the first time we need to dump rlim_cur in case of errors, there are already multiple occurences in the init code. Everywhere this happens, rlim is casted as a regular int and printed using the '%d' format specifier, so we do the same here as well to fix the build issue. No backport needed unless `425d7ad` gets backported.	2023-05-26 14:29:52 +02:00
Patrick Hemmer	425d7ad89d	MINOR: init: pre-allocate kernel data structures on init The Linux kernel maintains data structures to track a processes' open file descriptors, and it expands these structures as necessary when FD usage grows (at every FD=2^X starting at 64). However when threading is in use, during expansion the kernel will pause (observed up to 47ms) while it waits for thread synchronization (see https://bugzilla.kernel.org/show_bug.cgi?id=217366). This change addresses the issue and avoids the random pauses by opening the maximum file descriptor during initialization, so that expansion will not occur while processing traffic.	2023-05-26 09:28:18 +02:00
Willy Tarreau	c7b9308f20	BUG/MINOR: clock: automatically adjust the internal clock with the boot time This is a better and more general solution to the problem described in this commit: BUG/MINOR: checks: postpone the startup of health checks by the boot time Now we're updating the now_offset that is used to compute now_ms at the few points where we update the ready date during boot. This ensures that now_ms while being stable during all the boot process will be correct and will start with the boot value right after the boot is finished. As such the patch above is rolled back (we don't want to count the boot time twice). This must not be backported because it relies on the more flexible clock architecture in 2.8.	2023-05-17 09:33:54 +02:00
Willy Tarreau	da4aa6905c	MINOR: clock: measure the total boot time Some huge configs take a significant amount of time to start and this can cause some trouble (e.g. health checks getting delayed and grouped, process not responding to the CLI etc). For example, some configs might start fast in certain environments and slowly in other ones just due to the use of a wrong DNS server that delays all libc's resolutions. Let's first start by measuring it by keeping a copy of the most recently known ready date, once before calling check_config_validity() and then refine it when leaving this function. A last call is finally performed just before deciding to split between master and worker processes, and it covers the whole boot. It's trivial to collect and even allows to get rid of a call to clock_update_date() in function check_config_validity() that was used in hope to better schedule future events.	2023-05-17 09:33:54 +02:00
Willy Tarreau	c05d30e9d8	MINOR: clock: replace the timeval start_time with start_time_ns Now that "now" is no more a timeval, there's no point keeping a copy of it as a timeval, let's also switch start_time to nanoseconds, it simplifies operations.	2023-04-28 16:08:08 +02:00
Willy Tarreau	69530f59ae	MEDIUM: clock: replace timeval "now" with integer "now_ns" This puts an end to the occasional confusion between the "now" date that is internal, monotonic and not synchronized with the system's date, and "date" which is the system's date and not necessarily monotonic. Variable "now" was removed and replaced with a 64-bit integer "now_ns" which is a counter of nanoseconds. It wraps every 585 years, so if all goes well (i.e. if humanity does not need haproxy anymore in 500 years), it will just never wrap. This implies that now_ns is never nul and that the zero value can reliably be used as "not set yet" for a timestamp if needed. This will also simplify date checks where it becomes possible again to do "date1<date2". All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns". Due to the intricacies between now, global_now and now_offset, all 3 had to be turned to nanoseconds at once. It's not a problem since all of them were solely used in 3 functions in clock.c, but they make the patch look bigger than it really is. The clock_update_local_date() and clock_update_global_date() functions are now much simpler as there's no need anymore to perform conversions nor to round the timeval up or down. The wrapping continues to happen by presetting the internal offset in the short future so that the 32-bit now_ms continues to wrap 20 seconds after boot. The start_time used to calculate uptime can still be turned to nanoseconds now. One interrogation concerns global_now_ms which is used only for the freq counters. It's unclear whether there's more value in using two variables that need to be synchronized sequentially like today or to just use global_now_ns divided by 1 million. Both approaches will work equally well on modern systems, the difference might come from smaller ones. Better not change anyhting for now. One benefit of the new approach is that we now have an internal date with a resolution of the nanosecond and the precision of the microsecond, which can be useful to extend some measurements given that timestamps also have this resolution.	2023-04-28 16:08:08 +02:00
Willy Tarreau	0e875cf291	MEDIUM: listener: switch the default sharding to by-group Sharding by-group is exactly identical to by-process for a single group, and will use the same number of file descriptors for more than one group, while significantly lowering the kernel's locking overhead. Now that all special listeners (cli, peers) are properly handled, and that support for SO_REUSEPORT is detected at runtime per protocol, there should be no more reason for now switching to by-group by default. That's what this patch does. It does only this and nothing else so that it's easy to revert, should any issue be raised. Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed into 8 core complexes of 3 cores each, shows that configuring 8 groups (one per CCX) is sufficient to simply double the forwarded connection rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.	2023-04-23 10:18:16 +02:00
Willy Tarreau	7310164b2c	MINOR: listener: add a new global tune.listener.default-shards setting This new setting accepts "by-process", "by-group" and "by-thread" and will dictate how listeners will be sharded by default when nothing is specified. While the default remains "by-process", "by-group" should be much more efficient with many threads, while not changing anything for single-group setups.	2023-04-23 09:46:15 +02:00
Willy Tarreau	785b89f551	MINOR: protocol: move the global reuseport flag to the protocols Some protocol support SO_REUSEPORT and others not. Some have such a limitation in the kernel, and others in haproxy itself (e.g. sock_unix cannot support multiple bindings since each one will unbind the previous one). Also it's really protocol-dependent and not just family-dependent because on Linux for some time it was supported for TCP and not UDP. Let's move the definition to the protocols instead. Now it's preset in tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset. The enabled() config condition test validates IPv4 (generally sufficient), and -dR / noreuseport all protocols at once.	2023-04-23 09:46:15 +02:00
Willy Tarreau	84fe1f479b	MINOR: listener: support another thread dispatch mode: "fair" This new algorithm for rebalancing incoming connections to multiple threads is simpler and instead of considering the threads load, it will only cycle through all of them, offering a fair share of the traffic to each thread. It may be well suited for short-lived connections but is also convenient for very large thread counts where it's not always certain that the least loaded thread will always be found.	2023-04-21 17:41:26 +02:00
Aurelien DARRAGON	cca3355074	BUG/MINOR: log: free log forward proxies on deinit() Proxies belonging to the cfg_log_forward proxy list are not cleaned up in haproxy deinit() function. We add the missing cleanup directly in the main deinit() function since no other specific function may be used for this. This could be backported up to 2.4	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	9b1d15f53a	BUG/MINOR: sink: free forward_px on deinit() When a ring section is configured, a new sink is created and forward_px proxy may be allocated and assigned to the sink. Such sink-related proxies are added to the sink_proxies_list and thus don't belong to the main proxy list which is cleaned up in haproxy deinit() function. We don't have to manually clean up sink_proxies_list in the main deinit() func: sink API already provides the sink_deinit() function so we just add the missing free_proxy(sink->forward_px) there. This could be backported up to 2.4. [in 2.4, commit `b0281a49` ("MINOR: proxy: check if p is NULL in free_proxy()") must be backported first]	2023-04-05 08:58:16 +02:00
Willy Tarreau	9ef2742a51	MINOR: debug: support dumping the libs addresses when running in verbose mode Starting haproxy with -dL helps enumerate the list of libraries in use. But sometimes in order to go further we'd like to see their address ranges. This is already supported on the CLI's "show libs" but not on the command line where it can sometimes help troubleshoot startup issues. Let's dump them when in verbose mode. This way it doesn't change the existing behavior for those trying to enumerate libs to produce an archive.	2023-03-22 11:43:15 +01:00
William Lallemand	2078d4b1f7	BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value In environments where SYSTEM_MAXCONN is defined when compiling, the master will use this value instead of the original minimal value which was set to 100. When this happens, the master process could allocate RAM excessively since it does not need to have an high maxconn. (For example if SYSTEM_MAXCONN was set to 100000 or more) This patch fixes the issue by using the new define MASTER_MAXCONN which define a default maxconn of 100 for the master process. Must be backported as far as 2.5.	2023-03-09 14:28:44 +01:00
Amaury Denoyelle	5907fede87	MEDIUM: quic: release closing connections on stopping Since the following commit : commit `fb375574f9` MINOR: quic: mark quic-conn as jobs on socket allocation quic-conn instances are marked as jobs. This prevent haproxy process to stop while there is transfer in progress. To not delay process termination, idle connections are woken up through their MUX instances to be able to release them immediately. However, there is no mechanism to wake up quic connections left on closing or draining state. This means that haproxy process termination is delayed until every closing quic connections timer has expired. To improve this, a new function quic_handle_stopping() is called when haproxy process is stopping. It simply wakes up the idle timer task of all connections in the global closing list. These connections will thus be released immediately to not interrupt haproxy process stopping. This should be backported up to 2.7.	2023-03-08 14:41:28 +01:00
S�baastien Gross	2a1bcf1a59	MINOR: config: add HAPROXY_BRANCH environment variable This patch adds support from HAPROXY_BRANCH environment variable. It can be useful is some resources are loaded from different locations when migrating from one version to another. Signed-off-by: S�bastien Gross <sgross@haproxy.com>	2023-02-24 09:45:44 +01:00
Aurelien DARRAGON	28a6d48a60	MINOR: haproxy: always protocol unbind on startup error path In haproxy startup, all init error paths after the protocol bind step cautiously call protocol_unbind_all() before exiting except one that was conditional. We're not making an exception to the rule and we now properly call protocol_unbind_all() as well. No backport needed as this patch is unnoticeable.	2023-02-23 15:05:05 +01:00
William Lallemand	d4c0be6b20	MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start HAPROXY_STARTUP_VERSION: contains the version used to start, in master-worker mode this is the version which was used to start the master, even after updating the binary and reloading. This patch could be backported in every version since it is useful when debugging.	2023-02-21 14:16:45 +01:00
Christopher Faulet	2f7c82bfdf	BUG/MINOR: haproxy: Fix option to disable the fast-forward The option was renamed to only permit to disable the fast-forward. First there is no reason to enable it because it is the default behavior. Then it introduced a bug because there is no way to be sure the command line has precedence over the configuration this way. So, the option is now named "tune.disable-fast-forward" and does not support any argument. And of course, the commande line option "-dF" has now precedence over the configuration. No backport needed.	2023-02-21 11:44:55 +01:00
William Lallemand	5a7f83af84	BUG/MINOR: mworker: prevent incorrect values in uptime Since the recent changes on the clocks, now.tv_sec is not to be used between processes because it's a clock which is local to the process and does not contain a real unix timestamp. This patch fixes the issue by using "data.tv_sec" which is the wall clock instead of "now.tv_sec'. It prevents having incoherent timestamps. It also introduces some checks on negatives values in order to never displays a netative value if it was computed from a wrong value set by a previous haproxy version. It must be backported as far as 2.0.	2023-02-17 17:17:28 +01:00
Willy Tarreau	3e820a1056	MINOR: threads: add flags to know if a thread is started and/or running Several times during debugging it has been difficult to find a way to reliably indicate if a thread had been started and if it was still running. It's really not easy because the elements we look at are not necessarily reliable (e.g. harmless bit or idle bit might not reflect what we think during a signal). And such notions can be subjective anyway. Here we define two thread flags, TH_FL_STARTED which is set as soon as a thread enters run_thread_poll_loop() and drops the idle bit, and another one, TH_FL_IN_LOOP, which is set when entering run_poll_loop() and cleared when leaving it. This should help init/deinit code know whether it's called from a non-initialized thread (i.e. tid must not be trusted), or shared functions know if they're being called from a running thread or from init/deinit code outside of the polling loop.	2023-02-17 16:01:34 +01:00
Christopher Faulet	678a4ced70	MINOR: haproxy: Add an command option to disable data fast-forward The -dF option can now be used to disable data fast-forward. It does the same than the global option "tune.fast-forward off". Some reg-tests may rely on this optim. To detect the feature and skip such script, the following vtest command must be used: feature cmd "$HAPROXY_PROGRAM -cc '!(globa.tune & GTUNE_NO_FAST_FWD)'"	2023-02-17 10:17:02 +01:00
Amaury Denoyelle	2776e775ec	BUG/MINOR: mworker: fix uptime for master process Uptime calculation for master process was incorrect as it used <start_date> as its timestamp base time. Fix this by using the scheduler time <start_time> for this. The impact of this bug is minor as timestamp base time is only used for "show proc" CLI output. it was highlighted by the following commit. which caused a negative value to be displayed for the master process uptime on "show proc" output. `28360dc53f` MEDIUM: clock: force internal time to wrap early after boot This should be backported up to 2.0.	2023-02-10 15:57:33 +01:00
Willy Tarreau	6093ba47c0	BUG/MINOR: clock: do not mix wall-clock and monotonic time in uptime calculation We've had a start date even before the internal monotonic clock existed, but once the monotonic clock was added, the start date was not updated to distinguish the wall clock time units and the internal monotonic time units. The distinction is important because both clocks do not necessarily progress at the same speed. The very rare occurrences of the wall-clock date are essentially for human consumption and communication with third parties (e.g. report the start date in "show info" for monitoring purposes). However currently this one is also used to measure the distance to "now" as being the process' uptime. This is actually not correct. It only works because for now the two dates are initialized at the exact same instant at boot but could still be wrong if the system's date shows a big jump backwards during startup for example. In addition the current situation prevents us from enforcing an abritrary offset at boot to reveal some heisenbugs. This patch adds a new "start_time" at boot that is set from "now" and is used in uptime calculations. "start_date" instead is now set from "date" and will always reflect the system date for human consumption (e.g. in "show info"). This way we're now sure that any drift of the internal clock relative to the system date will not impact the reported uptime. This could possibly be backported though it's unlikely that anyone has ever noticed the problem.	2023-02-08 11:06:55 +01:00
Amaury Denoyelle	24d5b72ca9	MINOR: quic: add config for retransmit limit Define a new configuration option "tune.quic.max-frame-loss". This is used to specify the limit for which a single frame instance can be detected as lost. If exceeded, the connection is closed. This should be backported up to 2.7.	2023-02-03 11:56:46 +01:00
Aurelien DARRAGON	739281b3d6	BUG/MEDIUM: thread: consider secondary threads as idle+harmless during boot idle and harmless bits in the tgroup_ctx structure were not explicitly set during boot. \| struct tgroup_ctx ha_tgroup_ctx[MAX_TGROUPS] = { }; As the structure is first statically initialized, .threads_harmless and .threads_idle are automatically zero- initialized by the compiler. Unfortulately, this means that such threads are not considered idle nor harmless by thread_isolate(_full)() functions until they enter the polling loop (thread_harmless_now() and thread_idle_now() are respectively called before entering the polling loop) Because of this, any attempt to call thread_isolate() or thread_isolate_full() during a startup phase with nbthreads >= 2 will cause thread_isolate to loop until every secondary threads make it through their first polling loop. If the startup phase is aborted during boot (ie: "-c" option to check the configuration), secondary threads may be initialized but will never be started (ie: they won't enter the polling loop), thus thread_isolate() could would loop forever in such cases. We can easily reveal the bug with this patch reproducer: \| diff --git a/src/haproxy.c b/src/haproxy.c \| index e91691658..0b733f6ee 100644 \| --- a/src/haproxy.c \| +++ b/src/haproxy.c \| @@ -2317,6 +2317,10 @@ static void init(int argc, char *argv) \| if (pr \|\| px) { \| / At least one peer or one listener has been found */ \| qfprintf(stdout, "Configuration file is valid\n"); \| + printf("haproxy will loop...\n"); \| + thread_isolate(); \| + printf("we will never reach this\n"); \| + thread_release(); \| deinit_and_exit(0); \| } \| qfprintf(stdout, "Configuration file has no error but will not start (no listener) => exit(2).\n"); Now we start haproxy with a valid config: $> haproxy -c -f valid.conf Configuration file is valid haproxy will loop... ^C ------------------------------------------------------------------------------ This did not cause any issue so far because no early deinit paths require full thread isolation. But this may change when new features or requirements are introduced, so we should fix this before it becomes a real issue. To fix this, we explicitly assign .threads_harmless and .threads_idle to .threads_enabled value in thread_map_to_groups() function during boot. This is the proper place to do this since as long as .threads_enabled is not explicitly set, its default value is also 0 (zero-initialized by the compiler) code snippet from thread_isolate() function: ulong te = _HA_ATOMIC_LOAD(&ha_tgroup_info[tgrp].threads_enabled); ulong th = _HA_ATOMIC_LOAD(&ha_tgroup_ctx[tgrp].threads_harmless); if ((th & te) == te) break; Thus thread_isolate(_full()) won't be looping forever in thread_isolate() even if it were to be used before thread_map_to_groups() is executed. No backport needed unless this is a requirement.	2023-02-02 08:21:15 +01:00
Willy Tarreau	2c701dbc07	BUG/MINOR: log: release global log servers on exit Since 2.6 we have a free_logsrv() function that is used to release log servers. It must be called from deinit() instead of manually iterating over the log servers, otherwise some parts of the structure are not freed (namely the ring name), as reported by ASAN. This should be backported to 2.6.	2023-01-26 15:49:30 +01:00
Willy Tarreau	b2f38c13d1	BUG/MINOR: thread: always reload threads_enabled in loops A few loops waiting for threads to synchronize such as thread_isolate() rightfully filter the thread masks via the threads_enabled field that contains the list of enabled threads. However, it doesn't use an atomic load on it. Before 2.7, the equivalent variables were marked as volatile and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and the risk that the compiler keeps them in a register inside a loop is not null at all. In practice when ha_thread_relax() calls sched_yield() or an x86 PAUSE instruction, it could be verified that the variable is always reloaded. If these are avoided (e.g. architecture providing neither solution), it's visible in asm code that the variables are not reloaded. In this case, if a thread exists just between the moment the two values are read, the loop could spin forever. This patch adds the required _HA_ATOMIC_LOAD() on the relevant threads_enabled fields. It must be backported to 2.7.	2023-01-19 19:22:17 +01:00
Willy Tarreau	40c88f997f	[RELEASE] Released version 2.8-dev1 Released version 2.8-dev1 with the following main changes : - MEDIUM: 51d: add support for 51Degrees V4 with Hash algorithm - MINOR: debug: support pool filtering on "debug dev memstats" - MINOR: debug: add a balance of alloc - free at the end of the memstats dump - LICENSE: wurfl: clarify the dummy library license. - MINOR: event_hdl: add event handler base api - DOC/MINOR: api: add documentation for event_hdl feature - MEDIUM: ssl: rename the struct "cert_key_and_chain" to "ckch_data" - MINOR: quic: remove qc from quic_rx_packet - MINOR: quic: complete traces in qc_rx_pkt_handle() - MINOR: quic: extract datagram parsing code - MINOR: tools: add port for ipcmp as optional criteria - MINOR: quic: detect connection migration - MINOR: quic: ignore address migration during handshake - MINOR: quic: startup detect for quic-conn owned socket support - MINOR: quic: test IP_PKTINFO support for quic-conn owned socket - MINOR: quic: define config option for socket per conn - MINOR: quic: allocate a socket per quic-conn - MINOR: quic: use connection socket for emission - MEDIUM: quic: use quic-conn socket for reception - MEDIUM: quic: move receive out of FD handler to quic-conn io-cb - MINOR: mux-quic: rename duplicate function names - MEDIUM: quic: requeue datagrams received on wrong socket - MINOR: quic: reconnect quic-conn socket on address migration - MINOR: quic: activate socket per conn by default - BUG/MINOR: ssl: initialize SSL error before parsing - BUG/MINOR: ssl: initialize WolfSSL before parsing - BUG/MINOR: quic: fix fd leak on startup check quic-conn owned socket - BUG/MEDIIM: stconn: Flush output data before forwarding close to write side - MINOR: server: add srv->rid (revision id) value - MINOR: stats: add server revision id support - MINOR: server/event_hdl: add support for SERVER_ADD and SERVER_DEL events - MINOR: server/event_hdl: add support for SERVER_UP and SERVER_DOWN events - BUG/MEDIUM: checks: do not reschedule a possibly running task on state change - BUG/MINOR: checks: make sure fastinter is used even on forced transitions - CLEANUP: assorted typo fixes in the code and comments - MINOR: mworker: display an alert upon a wait-mode exit - BUG/MEDIUM: mworker: fix segv in early failure of mworker mode with peers - BUG/MEDIUM: mworker: create the mcli_reload socketpairs in case of upgrade - BUG/MINOR: checks: restore legacy on-error fastinter behavior - MINOR: check: use atomic for s->consecutive_errors - MINOR: stats: properly handle ST_F_CHECK_DURATION metric - MINOR: mworker: remove unused legacy code in mworker_cleanlisteners - MINOR: peers: unused code path in process_peer_sync - BUG/MINOR: init/threads: continue to limit default thread count to max per group - CLEANUP: init: remove useless assignment of nbthread - BUILD: atomic: atomic.h may need compiler.h on ARMv8.2-a - BUILD: makefile/da: also clean Os/ in Device Atlas dummy lib dir - BUG/MEDIUM: httpclient/lua: double LIST_DELETE on end of lua task - CLEANUP: pools: move the write before free to the uaf-only function - CLEANUP: pool: only include pool-os from pool.c not pool.h - REORG: pool: move all the OS specific code to pool-os.h - CLEANUP: pools: get rid of CONFIG_HAP_POOLS - DEBUG: pool: show a few examples in -dMhelp - MINOR: pools: make DEBUG_UAF a runtime setting - BUG/MINOR: promex: create haproxy_backend_agg_server_status - MINOR: promex: introduce haproxy_backend_agg_check_status - DOC: promex: Add missing backend metrics - BUG/MAJOR: fcgi: Fix uninitialized reserved bytes - REGTESTS: fix the race conditions in iff.vtc - CI: github: reintroduce openssl 1.1.1 - BUG/MINOR: quic: properly handle alloc failure in qc_new_conn() - BUG/MINOR: quic: handle alloc failure on qc_new_conn() for owned socket - CLEANUP: mux-quic: remove unused attribute on qcs_is_close_remote() - BUG/MINOR: mux-quic: remove qcs from opening-list on free - BUG/MINOR: mux-quic: handle properly alloc error in qcs_new() - CI: github: split ssl lib selection based on git branch - REGTESTS: startup: check maxconn computation - BUG/MINOR: startup: don't use internal proxies to compute the maxconn - REGTESTS: startup: change the expected maxconn to 11000 - CI: github: set ulimit -n to a greater value - REGTESTS: startup: activate automatic_maxconn.vtc - MINOR: sample: add param converter - CLEANUP: ssl: remove check on srv->proxy - BUG/MEDIUM: freq-ctr: Don't compute overshoot value for empty counters - BUG/MEDIUM: resolvers: Use tick_first() to update the resolvers task timeout - REGTESTS: startup: add alternatives values in automatic_maxconn.vtc - BUG/MEDIUM: h3: reject request with invalid header name - BUG/MEDIUM: h3: reject request with invalid pseudo header - MINOR: http: extract content-length parsing from H2 - BUG/MEDIUM: h3: parse content-length and reject invalid messages - CI: github: remove redundant ASAN loop - CI: github: split matrix for development and stable branches - BUG/MEDIUM: mux-h1: Don't release H1 stream upgraded from TCP on error - BUG/MINOR: mux-h1: Fix test instead a BUG_ON() in h1_send_error() - MINOR: http-htx: add BUG_ON to prevent API error on http_cookie_register - BUG/MEDIUM: h3: fix cookie header parsing - BUG/MINOR: h3: fix memleak on HEADERS parsing failure - MINOR: h3: check return values of htx_add_* on headers parsing - MINOR: ssl: Remove unneeded buffer allocation in show ocsp-response - MINOR: ssl: Remove unnecessary alloc'ed trash chunk in show ocsp-response - BUG/MINOR: ssl: Fix memory leak of find_chain in ssl_sock_load_cert_chain - MINOR: stats: provide ctx for dumping functions - MINOR: stats: introduce stats field ctx - BUG/MINOR: stats: fix show stat json buffer limitation - MINOR: stats: make show info json future-proof - BUG/MINOR: quic: fix crash on PTO rearm if anti-amplification reset - BUILD: 51d: fix build issue with recent compilers - REGTESTS: startup: disable automatic_maxconn.vtc - BUILD: peers: peers-t.h depends on stick-table-t.h - BUG/MEDIUM: tests: use tmpdir to create UNIX socket - BUG/MINOR: mux-h1: Report EOS on parsing/internal error for not running stream - BUG/MINOR:: mux-h1: Never handle error at mux level for running connection - BUG/MEDIUM: stats: Rely on a local trash buffer to dump the stats - OPTIM: pool: split the read_mostly from read_write parts in pool_head - MINOR: pool: make the thread-local hot cache size configurable - MINOR: freq_ctr: add opportunistic versions of swrate_add() - MINOR: pool: only use opportunistic versions of the swrate_add() functions - REGTESTS: ssl: enable the ssl_reuse.vtc test for WolfSSL - BUG/MEDIUM: mux-quic: fix double delete from qcc.opening_list - BUG/MEDIUM: quic: properly take shards into account on bind lines - BUG/MINOR: quic: do not allocate more rxbufs than necessary - MINOR: ssl: Add a lock to the OCSP response tree - MINOR: httpclient: Make the CLI flags public for future use - MINOR: ssl: Add helper function that extracts an OCSP URI from a certificate - MINOR: ssl: Add OCSP request helper function - MINOR: ssl: Add helper function that checks the validity of an OCSP response - MINOR: ssl: Add "update ssl ocsp-response" cli command - MEDIUM: ssl: Add ocsp_certid in ckch structure and discard ocsp buffer early - MINOR: ssl: Add ocsp_update_tree and helper functions - MINOR: ssl: Add crt-list ocsp-update option - MINOR: ssl: Store 'ocsp-update' mode in the ckch_data and check for inconsistencies - MEDIUM: ssl: Insert ocsp responses in update tree when needed - MEDIUM: ssl: Add ocsp update task main function - MEDIUM: ssl: Start update task if at least one ocsp-update option is set to on - DOC: ssl: Add documentation for ocsp-update option - REGTESTS: ssl: Add tests for ocsp auto update mechanism - MINOR: ssl: Move OCSP code to a dedicated source file - BUG/MINOR: ssl/ocsp: check chunk_strcpy() in ssl_ocsp_get_uri_from_cert() - CLEANUP: ssl/ocsp: add spaces around operators - BUG/MEDIUM: mux-h2: Refuse interim responses with end-stream flag set - BUG/MINOR: pool/stats: Use ullong to report total pool usage in bytes in stats - BUG/MINOR: ssl/ocsp: httpclient blocked when doing a GET - MINOR: httpclient: don't add body when istlen is empty - MEDIUM: httpclient: change the default log format to skip duplicate proxy data - BUG/MINOR: httpclient/log: free of invalid ptr with httpclient_log_format - MEDIUM: mux-quic: implement shutw - MINOR: mux-quic: do not count stream flow-control if already closed - MINOR: mux-quic: handle RESET_STREAM reception - MEDIUM: mux-quic: implement STOP_SENDING emission - MINOR: h3: use stream error when needed instead of connection - CI: github: enable github api authentication for OpenSSL tags read - BUG/MINOR: mux-quic: ignore remote unidirectional stream close - CI: github: use the GITHUB_TOKEN instead of a manually generated token - BUILD: makefile: build the features list dynamically - BUILD: makefile: move common options-oriented macros to include/make/options.mk - BUILD: makefile: sort the features list - BUILD: makefile: initialize all build options' variables at once - BUILD: makefile: add a function to collect all options' CFLAGS/LDFLAGS - BUILD: makefile: start to automatically collect CFLAGS/LDFLAGS - BUILD: makefile: ensure that all USE_* handlers appear before CFLAGS are used - BUILD: makefile: clean the wolfssl include and lib generation rules - BUILD: makefile: make sure to also ignore SSL_INC when using wolfssl - BUILD: makefile: reference libdl only once - BUILD: makefile: make sure LUA_INC and LUA_LIB are always initialized - BUILD: makefile: do not restrict Lua's prepend path to empty LUA_LIB_NAME - BUILD: makefile: never force -latomic, set USE_LIBATOMIC instead - BUILD: makefile: add an implicit USE_MATH variable for -lm - BUILD: makefile: properly report USE_PCRE/USE_PCRE2 in features - CLEANUP: makefile: properly indent ifeq/ifneq conditional blocks - BUILD: makefile: rework 51D to split v3/v4 - BUILD: makefile: support LIBCRYPT_LDFLAGS - BUILD: makefile: support RT_LDFLAGS - BUILD: makefile: support THREAD_LDFLAGS - BUILD: makefile: support BACKTRACE_LDFLAGS - BUILD: makefile: support SYSTEMD_LDFLAGS - BUILD: makefile: support ZLIB_CFLAGS and ZLIB_LDFLAGS - BUILD: makefile: support ENGINE_CFLAGS - BUILD: makefile: support OPENSSL_CFLAGS and OPENSSL_LDFLAGS - BUILD: makefile: support WOLFSSL_CFLAGS and WOLFSSL_LDFLAGS - BUILD: makefile: support LUA_CFLAGS and LUA_LDFLAGS - BUILD: makefile: support DEVICEATLAS_CFLAGS and DEVICEATLAS_LDFLAGS - BUILD: makefile: support PCRE[2]_CFLAGS and PCRE[2]_LDFLAGS - BUILD: makefile: refactor support for 51DEGREES v3/v4 - BUILD: makefile: support WURFL_CFLAGS and WURFL_LDFLAGS - BUILD: makefile: make all OpenSSL variants use the same settings - BUILD: makefile: remove the special case of the SSL option - BUILD: makefile: only consider settings from enabled options - BUILD: makefile: also list per-option settings in 'make opts' - BUG/MINOR: debug: don't mask the TH_FL_STUCK flag before dumping threads - MINOR: cfgparse-ssl: avoid a possible crash on OOM in ssl_bind_parse_npn() - BUG/MINOR: ssl: Missing goto in error path in ocsp update code - BUG/MINOR: stick-table: report the correct action name in error message - CI: Improve headline in matrix.py - CI: Add in-memory cache for the latest OpenSSL/LibreSSL - CI: Use proper `if` blocks instead of conditional expressions in matrix.py - CI: Unify the `GITHUB_TOKEN` name across matrix.py and vtest.yml - CI: Explicitly check environment variable against `None` in matrix.py - CI: Reformat `matrix.py` using `black` - MINOR: config: add environment variables for default log format - REGTESTS: Remove REQUIRE_VERSION=1.9 from all tests - REGTESTS: Remove REQUIRE_VERSION=2.0 from all tests - REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=1.9 - BUG/MINOR: http-fetch: Only fill txn status during prefetch if not already set - BUG/MAJOR: buf: Fix copy of wrapping output data when a buffer is realigned - DOC: config: fix alphabetical ordering of http-after-response rules - MINOR: http-rules: Add missing actions in http-after-response ruleset - DOC: config: remove duplicated "http-response sc-set-gpt0" directive - BUG/MINOR: proxy: free orgto_hdr_name in free_proxy() - REGTEST: fix the race conditions in json_query.vtc - REGTEST: fix the race conditions in add_item.vtc - REGTEST: fix the race conditions in digest.vtc - REGTEST: fix the race conditions in hmac.vtc - BUG/MINOR: fd: avoid bad tgid assertion in fd_delete() from deinit() - BUG/MINOR: http: Memory leak of http redirect rules' format string - MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters - MINOR: stick-table: implement the sc-add-gpc() action	2023-01-07 09:45:17 +01:00
Willy Tarreau	6c0117168e	MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters The number of stick-counter entries usable by track-sc rules is currently set at build time. There is no good value for this since the vast majority of users don't need any, most need only a few and rare users need more. Adding more counters for everyone increases memory and CPU usages for no reason. This patch moves the per-session and per-stream arrays to a pool of a size defined at boot time. This way it becomes possible to set the number of entries at boot time via a new global setting "tune.stick-counters" that sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR value still applies, or 3 if not set, as before. It is also possible to lower the value to 0 to save a bit of memory if not used at all. Note that a few low-level sample-fetch functions had to be protected due to the ability to use sample-fetches in the global section to set some variables.	2023-01-06 18:08:49 +01:00
S�bastien Gross	537b9e7f36	MINOR: config: add environment variables for default log format This patch provides a convenient way to override the default TCP, HTTP and HTTP log formats. Instead of having a look into the documentation to figure out what is the appropriate default log format three new environment variables can be used: HAPROXY_TCP_LOG_FMT, HAPROXY_HTTP_LOG_FMT and HAPROXY_HTTPS_LOG_FMT. Their content are substituted verbatim. These variables are set before parsing the configuration and are unset just after all configuration files are successful parsed. Example: # Instead of writing this long log-format line... log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC \ %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r \ lr=last_rule_file:last_rule_line" # ..the HAPROXY_HTTP_LOG_FMT can be used to provide the default # http log-format string log-format "${HAPROXY_HTTP_LOG_FMT} lr=last_rule_file:last_rule_line" Please note that nothing prevents users to unset the variables or override their content in a global section. Signed-off-by: S�bastien Gross <sgross@haproxy.com>	2023-01-04 08:23:43 +01:00
Willy Tarreau	284cfc67b8	MINOR: pool: make the thread-local hot cache size configurable Till now it was only possible to change the thread local hot cache size at build time using CONFIG_HAP_POOL_CACHE_SIZE. But along benchmarks it was sometimes noticed a huge contention in the lower level memory allocators indicating that larger caches could be beneficial, especially on machines with large L2 CPUs. Given that the checks against this value was no longer on a hot path anymore, there was no reason for continuing to force it to be tuned at build time. So this patch allows to set it by tune.memory-hot-size. It's worth noting that during the boot phase the value remains zero so that it's possible to know if the value was set or not, which opens the possibility that we try to automatically adjust it based on the per-cpu L2 cache size or the use of certain protocols (none of this is done yet).	2022-12-20 14:51:12 +01:00
Willy Tarreau	57c3e75d4e	CLEANUP: init: remove useless assignment of nbthread The old test consisting in setting global.nbthread if lower than 1 is useless nowadays since it's already done in check_config_validity().	2022-12-08 08:14:35 +01:00
William Lallemand	e57b702e2b	BUG/MEDIUM: mworker: create the mcli_reload socketpairs in case of upgrade In ticket #1956, it was reported that an upgrade from 2.6 to 2.7 via a reload would stop the master process. When upgrading the binary, the new process is considered reexec and does not try to creates the socketpair for the mcli_reload listener, then tries to bind on -1 since the socket doesn't exit. The failure provokes an exit() of the master. This patch fixes the issue by trying to create the mcli_reload sockets only when they don't exist, instead of creating them at first start. This way we also avoid possible fd leak since we always try to use the existing FDs first. Must be backported in 2.7.	2022-12-07 15:30:52 +01:00
William Lallemand	40db4ae8bb	MINOR: mworker: display an alert upon a wait-mode exit When the mworker wait mode fails it does an exit, but there is no error message which says it exits. Add a message which specify that the error is non-recoverable. Could be backported in 2.7 and possibly earlier branch.	2022-12-07 15:07:53 +01:00
William Lallemand	151dbbe778	BUG/MINOR: ssl: initialize WolfSSL before parsing The wolfSSL library need to be initialized before parsing the configuration which uses some SSL functions. To be backported in 2.6.	2022-12-02 17:17:43 +01:00
William Lallemand	44c80ce5b3	BUG/MINOR: ssl: initialize SSL error before parsing The SSL error initialization need to be done before the configuration parsing, because it uses the SSL. Need to be backported to 2.6.	2022-12-02 17:10:11 +01:00
Amaury Denoyelle	e30f378236	MINOR: quic: activate socket per conn by default Activate QUIC connection socket to achieve the best performance. The previous behavior can be reverted by tune.quic.socket-owner configuration option. This change is part of quic-conn owned socket implementation. Contrary to its siblings patches, I suggest to not backport it to 2.7. This should ensure that stable releases behavior is perserved. If a user faces issues with QUIC performance on 2.7, he can nonetheless change the default configuration.	2022-12-02 14:45:43 +01:00
Ilya Shipitsin	6f86eaae4f	CLEANUP: assorted typo fixes in the code and comments This is 33rd iteration of typo fixes	2022-11-30 14:02:36 +01:00
Uriah Pollock	3cbf09ed64	MEDIUM: ssl: add minimal WolfSSL support with OpenSSL compatibility mode This adds a USE_OPENSSL_WOLFSSL option, wolfSSL must be used with the OpenSSL compatibility layer. This must be used with USE_OPENSSL=1. WolfSSL build options: ./configure --prefix=/opt/wolfssl --enable-haproxy HAProxy build options: USE_OPENSSL=1 USE_OPENSSL_WOLFSSL=1 WOLFSSL_INC=/opt/wolfssl/include/ WOLFSSL_LIB=/opt/wolfssl/lib/ ADDLIB='-Wl,-rpath=/opt/wolfssl/lib' Using at least the commit 54466b6 ("Merge pull request #5810 from Uriah-wolfSSL/haproxy-integration") from WolfSSL. (2022-11-23). This is still to be improved, reg-tests are not supported yet, and more tests are to be done. Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2022-11-24 11:29:03 +01:00
Amaury Denoyelle	28ea31c7cb	MINOR: global: generate random cluster.secret if not defined If no cluster-secret is defined by the user, a random one is silently generated. This ensures that at least QUIC Retry tokens are generated if abnormal conditions are detected. However, it is advisable to specify it in the configuration for tokens to be valid even after a reload or across LBs instances in the same cluster. This should be backported up to 2.6.	2022-11-21 16:41:34 +01:00
Remi Tricot-Le Breton	e608b0eb16	BUG/MINOR: ssl: SSL_load_error_strings might not be defined The SSL_load_error_strings function was marked as deprecated in OpenSSL 1.1.0 so compiling HAProxy with OPENSSL_NO_DEPRECATED set and a recent OpenSSL library would fail. The manpages say that this function was replaced by OPENSSL_init_crypto and OPENSSL_init_ssl which are already called at start up by the SSL lib. We do not seem to be in a case where explicit call of those functions is required. This patch fixes GitHub issue #1813. It can be backported to 2.6.	2022-11-16 11:09:33 +01:00
Willy Tarreau	e98d385819	MINOR: deinit: add a "quick-exit" option to bypass the deinit step Once in a while we spot a bug in the deinit code that is complex, especially when it has to deal with incomplete initializations, and the ability to bypass this step has regularly been raised. In addition for fast-reloading setups it could theoretically save some time. Tests have shown that very large configs can barely save ~100-150ms by skipping the deinit step. However the ability not to crash if a bug is encountered can occasionally help. This patch adds an option to do exactly this. It's obviously not enabled by default and the documentation discourages from using it, but this might be useful in the future.	2022-11-15 09:37:09 +01:00
William Lallemand	eba6a54cd4	MINOR: logs: startup-logs can use a shm for logging the reload When compiled with USE_SHM_OPEN=1 the startup-logs are now able to use an shm which is used to keep the logs when switching to mworker wait mode. This allows to keep the failed reload logs. When allocating the startup-logs at first start of the process, haproxy will do a shm_open with a unique path using the PID of the process, the file is unlink immediatly so we don't let unwelcomed files be. The fd resulting from this shm is stored in the HAPROXY_STARTUPLOGS_FD environment variable so it can be mmap again when switching to wait mode. When forking children, the process is copying the mmap to a a mallocated ring so we never share the same memory section between the master and the workers. When switching to wait mode, the shm is not used anymore as it is also copied to a mallocated structure. This allow to use the "show startup-logs" command over the master CLI, to get the logs of the latest startup or reload. This way the logs of the latest failed reload are also kept. This is only activated on the linux-glibc target for now.	2022-10-13 16:50:22 +02:00
Willy Tarreau	c06557c23b	MINOR: init: do not try to shrink existing RLIMIT_NOFIlE As seen in issue #1866, some environments will not allow to change the current FD limit, and actually we don't need to do it, we only do it as a byproduct of adjusting the limit to the one that fits. Here we're replacing calls to setrlimit() with calls to raise_rlim_nofile(), which will avoid making the setrlimit() syscall in case the desired value is lower than the current process' one. This depends on previous commit "MINOR: fd: add a new function to only raise RLIMIT_NOFILE" and may need to be backported to 2.6, possibly earlier, depending on users' experience in such environments.	2022-10-04 08:38:47 +02:00
Amaury Denoyelle	92fa63f735	CLEANUP: quic: create a dedicated quic_conn module xprt_quic module was too large and did not reflect the true architecture by contrast to the other protocols in haproxy. Extract code related to XPRT layer and keep it under xprt_quic module. This code should only contains a simple API to communicate between QUIC lower layer and connection/MUX. The vast majority of the code has been moved into a new module named quic_conn. This module is responsible to the implementation of QUIC lower layer. Conceptually, it overlaps with TCP kernel implementation when comparing QUIC and HTTP1/2 stacks of haproxy. This should be backported up to 2.6.	2022-10-03 16:25:17 +02:00
Erwan Le Goas	f30c5d7666	MINOR: config: Add option line when the configuration file is dumped Add an option to dump the number lines of the configuration file when it's dumped. Other options can be easily added. Options are separated by ',' when tapping the command line: './haproxy -dC[key],line -f [file]' No backport needed, except if anonymization mechanism is backported.	2022-09-29 10:53:15 +02:00
William Lallemand	56f73b21a5	MINOR: mworker: stores the mcli_reload bind_conf Stores the mcli_reload bind_conf in order to identify it later.	2022-09-24 15:56:25 +02:00
William Lallemand	21623b5949	MINOR: mworker: mworker_cli_proxy_new_listener() returns a bind_conf mworker_cli_proxy_new_listener() now returns a bind_conf * or NULL upon failure.	2022-09-24 15:51:27 +02:00
William Lallemand	68192b2cdf	MINOR: mworker: store and shows loading status The environment variable HAPROXY_LOAD_SUCCESS stores "1" if it successfully load the configuration and started, "0" otherwise. The "_loadstatus" master CLI command displays either "Loading failure!\n" or "Loading success.\n"	2022-09-24 15:44:42 +02:00
William Lallemand	ec059c249e	MEDIUM: mworker/cli: keep the connection of the FD that ask for a reload When using the "reload" command over the master CLI, all connections to the master CLI were cut, this was unfortunate because it could have been used to implement a synchronous reload command. This patch implements an architecture to keep the connection alive after the reload. The master CLI is now equipped with a listener which uses a socketpair, the 2 FDs of this socketpair are stored in the mworker_proc of the master, which the master keeps via the environment variable. ipc_fd[1] is used as a listener for the master CLI. During the "reload" command, the CLI will send the FD of the current session over ipc_fd[0], then the reload is achieved, so the master won't handle the recv of the FD. Once reloaded, ipc_fd[1] receives the FD of the session, so the connection is preserved. Of course it is a new context, so everything like the "prompt mode" are lost. Only the FD which performs the reload is kept.	2022-09-22 18:16:19 +02:00
Erwan Le Goas	b0c0501516	MINOR: config: add command-line -dC to dump the configuration file This commit adds a new command line option -dC to dump the configuration file. An optional key may be appended to -dC in order to produce an anonymized dump using this key. The anonymizing process uses the same algorithm as the CLI so that the same key will produce the same hashes for the same identifiers. This way an admin may share an anonymized extract of a configuration to match against live dumps. Note that key 0 will not anonymize the output. However, in any case, the configuration is dumped after tokenizing, thus comments are lost.	2022-09-17 11:27:09 +02:00
Matthias Wirth	eea152ee68	BUG/MINOR: signals/poller: ensure wakeup from signals Add self-wake in signal_handler() to fix a race condition with a signal coming in between checking signal_queue_len and entering polling sleep. The changes in commit `43c891dda` ("BUG/MINOR: signals/poller: set the poller timeout to 0 when there are signals") were insufficient. Move the signal_queue_len check from the poll implementations to run_poll_loop() to keep that logic in one place. The poll loops are terminated either by the parameter wake being set or wake up due to a write to their poller_wr_pipe by wake_thread() in signal_handler(). This fixes issue #1841. Must be backported in every stable version.	2022-09-09 11:15:22 +02:00
Willy Tarreau	53bfac8c63	BUG/MEDIUM: master: force the thread count earlier Christopher bisected that recent commit `d0b73bca71` ("MEDIUM: listener: switch bind_thread from global to group-local") broke the master socket in that only the first out of the Nth initial connections would work, where N is the number of threads, after which they all work. The cause is that the master socket was bound to multiple threads, despite global.nbthread being 1 there, so the incoming connection load balancing would try to send incoming connections to non-existing threads, however the bind_thread mask would nonetheless include multiple threads. What happened is that in 1.9 we forced "nbthread" to 1 in the master's poll loop with commit `b3f2be338b` ("MEDIUM: mworker: use the haproxy poll loop"). In 2.0, nbthread detection was enabled by default in commit `149ab779cc` ("MAJOR: threads: enable one thread per CPU by default"). From this point on, the operation above is unsafe because everything during startup is performed with nbthread corresponding to the default value, then it changes to one when starting the polling loop. But by then we weren't using the wait mode except for reload errors, so even if it would have happened nobody would have noticed. In 2.5 with commit `fab0fdce9` ("MEDIUM: mworker: reexec in waitpid mode after successful loading") we started to rexecute all the time, not just for errors, so as to release precious resources and to possibly spot bugs that were rarely exposed in this mode. By then the incoming connection LB was enforcing all_threads_mask on the listener's thread mask so that the incorrect value was being corrected while using it. Finally in 2.7 commit `d0b73bca71` ("MEDIUM: listener: switch bind_thread from global to group-local") replaces the all_threads_mask there with the listener's bind_thread, but that one was never adjusted by the starting master, whose thread group was filled to N threads by the automatic detection during early setup. The best approach here is to set nbthread to 1 very early in init() when we're in the master in wait mode, so that we don't try to guess the best value and don't end up with incorrect bindings anymore. This patch does this and also sets nbtgroups to 1 in preparation for a possible future where this will also be automatically calculated. There is no need to backport this patch since no other versions were affected, but if it were to be discovered that the incorrect bind mask on some of the master's FDs could be responsible for any trouble in older versions, then the backport should be safe (provided that nbtgroups is dropped of course).	2022-07-22 17:51:53 +02:00
Willy Tarreau	41afd9084e	BUILD: add detection for unsupported compiler models As reported in github issue #1765, some people get trapped into building haproxy and companion libraries on Windows using a compiler following the LLP64 model. This has no chance to work, and definitely causes nasty bugs everywhere when pointers are passed as longs. Let's save them time and detect this at boot time. The message and detection was factored with the existing one for -fwrapv since we need the same info and actions. This should be backported to all recent supported versions (the ones that are likely to be tried on such platforms when people don't know).	2022-07-21 09:58:20 +02:00
William Lallemand	d4835a9680	BUG/MEDIUM: mworker: proc_self incorrectly set crashes upon reload When updating from 2.4 to 2.6, the child->reloads++ instruction changed place, resulting in a former worker from the 2.4 process, still identified as a current worker once in 2.6, because its reload counter is still 0. Unfortunately this counter is used to chose the mworker_proc structure that will be used for the new worker. What happens next, is that the mworker_proc structure of the previous process is selected, and this one has ipc_fd[1] set to -1, because this structure was supposed to be in the master. The process then forks, and mworker_sockpair_register_per_thread() tries to register ipc_fd[1] which is set to -1, instead of the fd of the new socketpair. This patch fixes the issue by checking if child->pid is equal to -1 when selecting proc_self. This way we could be sure it wasn't a previous process. Should fix issue #1785. This must be backported as far as 2.4 to fix the issue related to the reload computation difference. However backporting it in every stable branch will enforce the reload process.	2022-07-21 00:52:43 +02:00
William Lallemand	3b8bafd4a7	MINOR: init: load OpenSSL error strings Load OpenSSL Error strings in order to be able to output reason strings. This is mandatory to be able to use ERR_reason_error_string().	2022-07-19 19:13:08 +02:00
Willy Tarreau	c6b596dcce	CLEANUP: threads: remove the now unused all_threads_mask and tid_bit Since these are not used anymore, let's now remove them. Given the number of places where we're using ti->ldit_bit, maybe an equivalent might be useful though.	2022-07-15 20:25:41 +02:00
Willy Tarreau	5b09341c02	MEDIUM: cpu-map: replace the process number with the thread group number The principle remains the same, but instead of having a single process and ignoring extra ones, now we set the affinity masks for the respective threads of all groups. The doc was updated with a few extra examples.	2022-07-15 19:43:10 +02:00
Willy Tarreau	e5715bface	MEDIUM: poller: disable thread-groups for poll() and select() These old legacy pollers are not designed for this. They're still using a shared list of events for all threads, this will not scale at all, so there's no point in enabling thread-groups there. Modern systems have epoll, kqueue or event ports and do not need these ones. We arrange for failing at boot time, only when thread-groups > 1 so that existing setups will remain unaffected. If there's a compelling reason for supporting thread groups with these pollers in the future, the rework should not be too hard, it would just consume a lot of memory to have an fd_evts[] array per thread, but that is doable.	2022-07-15 19:43:10 +02:00
William Lallemand	a46a99e98c	MEDIUM: mworker/systemd: send STATUS over sd_notify The sd_notify API is not able to change the "Active:" line in "systemcl status". However a message can still be displayed on a "Status: " line, even if the service is still green and "active (running)". When startup succeed the Status will be set to "Ready.", upon a reload it will be set to "Reloading Configuration." If the configuration succeed "Ready." again. However if the reload failed, it will be set to "Reload failed!". Keep in mind that the "Active:" line won't change upon a reload failure, and will still be green.	2022-07-07 14:48:46 +02:00
Willy Tarreau	ad92fdf196	CLEANUP: thread: also remove a thread's bit from stopping_threads on stop As much as possible we should take care of not leaving bits from stopped threads in shared thread masks. It can avoid issues like the previous fix and will also make debugging less confusing.	2022-07-06 10:19:46 +02:00
Willy Tarreau	f34a3fa33d	BUG/MEDIUM: thread: mask stopping_threads with threads_enabled when checking it When soft-stopping, there's a comparison between stopping_threads and threads_enabled to make sure all threads are stopped, but this is not correct and is racy since the threads_enabled bit is removed when a thread is stopped but not its stopping_threads bit. The consequence is that depending on timing, when stopping, if the first stopping thread is fast enough to remove its bit from threads_enabled, the other threads will see that stopping_threads doesn't match threads_enabled anymore and will wait forever. As such the mask must be applied to stopping_threads during the test. This issue was introduced in recent commit `ef422ced9` ("MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups"), no backport is needed.	2022-07-06 10:19:46 +02:00
Willy Tarreau	24cfc9f76e	BUG/MEDIUM: thread: check stopping thread against local bit and not global one Commit `ef422ced9` ("MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups") moved the stopping_threads mask to per-group, but one test in the loop preserved its global value instead, resulting in stopping threads never sleeping on stop and eating 100% CPU until all were stopped. No backport is needed.	2022-07-04 14:09:39 +02:00
Willy Tarreau	291f6ff885	BUG/MEDIUM: threads: fix incorrect thread group being used on soft-stop Commit `377e37a80` ("MINOR: tinfo: add the mask of enabled threads in each group") forgot -1 on the tgid, thus the groups was not always correctly tested, which is visible only when running with more than one group. No backport is needed.	2022-07-04 13:37:31 +02:00
Willy Tarreau	ef422ced91	MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups Stopping threads need a mask to figure who's still there without scanning everything in the poll loop. This means this will have to be per-group. And we also need to have a global stopping groups mask to know what groups were already signaled. This is used both to figure what thread is the first one to catch the event, and which one is the first one to detect the end of the last job. The logic isn't changed, though a loop is required in the slow path to make sure all threads are aware of the end. Note that for now the soft-stop still takes time for group IDs > 1 as the poller is not yet started on these threads and needs to expire its timeout as there's no way to wake it up. But all threads are eventually stopped.	2022-07-01 19:15:15 +02:00
Willy Tarreau	cce203aae5	MINOR: thread: add a new all_tgroups_mask variable to know about active tgroups In order to kill all_threads_mask we'll need to have an equivalent for the thread groups. The all_tgroups_mask does just this, it keeps one bit set per enabled group.	2022-07-01 19:15:15 +02:00
Willy Tarreau	377e37a80f	MINOR: tinfo: add the mask of enabled threads in each group In order to replace the global "all_threads_mask" we'll need to have an equivalent per group. Take this opportunity for calling it threads_enabled and make sure which ones are counted there (in case in the future we allow to stop some).	2022-07-01 19:15:14 +02:00
Willy Tarreau	e7475c8e79	MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag Every single place where sleeping_thread_mask was still used was to test or set a single thread. We can now add a per-thread flag to indicate a thread is sleeping, and remove this shared mask. The wake_thread() function now always performs an atomic fetch-and-or instead of a first load then an atomic OR. That's cleaner and more reliable. This is not easy to test, as broadcast FD events are rare. The good way to test for this is to run a very low rate-limited frontend with a listener that listens to the fewest possible threads (2), and to send it only 1 connection at a time. The listener will periodically pause and the wakeup task will sometimes wake up on a random thread and will call wake_thread(): frontend test bind :8888 maxconn 10 thread 1-2 rate-limit sessions 5 Alternately, disabling/enabling a frontend in loops via the CLI also broadcasts such events, but they're more difficult to observe since this is causing connection failures.	2022-07-01 19:15:14 +02:00
Willy Tarreau	dce4ad755f	MEDIUM: thread: add a new per-thread flag TH_FL_NOTIFIED to remember wakeups Right now when an inter-thread wakeup happens, we preliminary check if the thread was asleep, and if so we wake the poller up and remove its bit from the sleeping mask. That's not very clean since the sleeping mask cannot be entirely trusted since a thread that's about to wake up will already have its sleeping bit removed. This patch adds a new per-thread flag (TH_FL_NOTIFIED) to remember that a thread was notified to wake up. It's cleared before checking the task lists last, so that new wakeups can be considered again (since wake_thread() is only used to notify about task wakeups and FD polling changes). This way we do not need to modify a remote thread's sleeping mask anymore. As such wake_thread() now only tests and sets the TH_FL_NOTIFIED flag but doesn't clear sleeping anymore.	2022-07-01 19:15:14 +02:00
William Lallemand	0a012aa16b	BUG/MEDIUM: mworker: use default maxconn in wait mode In bug #1751, it was reported that haproxy is consumming too much memory since the 2.4 version. This is because of a change in the master, which loses completely its configuration in wait mode, and lose its maxconn. Without the maxconn, haproxy will try to compute one itself, and will allocate RAM consequently, too much in our case. Which means the master will have a too high maxconn and too much RAM allocated. The patch fixes the issue by setting the maxconn to the default value when re-executing the master in wait mode. Must be backported as far as 2.5.	2022-06-21 14:22:49 +02:00
Frédéric Lécaille	aee675746c	MINOR: quic: Clarifications about transport parameters value This is becoming difficult to distinguish the default values for transport parameters which come with the RFC from our implementation default values when not set by configuration (tunable parameters). Add a comment to distinguish them. Prefix these default values by QUIC_TP_DFLT_ to distinguish them from QUIC_DFLT_* value even if there are not numerous. Furthermore ->max_udp_payload_size must be first initialized to QUIC_TP_DFLT_MAX_UDP_PAYLOAD_SIZE especially for received value.	2022-05-30 09:59:26 +02:00
Frédéric Lécaille	2674098569	MINOR: quic: Tunable "initial_max_streams_bidi" transport parameter Add tunable "tune.quic.frontend.max_streams_bidi" setting for QUIC frontends to set the "initial_max_streams_bidi" transport parameter. Add some documentation for this new setting.	2022-05-30 09:59:26 +02:00
Frédéric Lécaille	1d96d6e024	MINOR: quic: Tunable "max_idle_timeout" transport parameter Add two tunable settings both for backends and frontends "max_idle_timeout" QUIC transport parameter, "tune.quic.frontend.max-idle-timeout" and "tune.quic.backend.max-idle-timeout" respectively. cfg_parse_quic_time() has been implemented to parse a time value thanks to parse_time_err(). It should be reused for any tunable time value to be parsed. Add the documentation for this tunable setting only for frontend.	2022-05-30 09:59:26 +02:00
Willy Tarreau	8e5b9589b3	CLEANUP: init: address another coverity warning about a possible multiply overflow Commit `2cb3be76b` ("CLEANUP: init: address a coverity warning about possible multiply overflow") was incomplete, two other locations were present. This should address issue #1585.	2022-05-26 08:55:05 +02:00
Willy Tarreau	2cb3be76bf	CLEANUP: init: address a coverity warning about possible multiply overflow In issue #1585 Coverity suspects a risk of multiply overflow when calculating the SSL cache size, though in practice the cache is limited to 2^32 anyway thus it cannot really happen. Nevertheless, casting the operation should be sufficient to avoid marking it as a false positive.	2022-05-24 07:46:00 +02:00
Frédéric Lécaille	9286210aa8	MINOR: quic: Add tune.quic.retry-threshold keyword This QUIC specific keyword may be used to set the theshold, in number of connection openings, beyond which QUIC Retry feature will be automatically enabled. Its default value is 100.	2022-05-20 17:11:13 +02:00
Remi Tricot-Le Breton	5194446b76	MEDIUM: ssl: Delay random generator initialization after config parsing The random generator initialization needs to be performed before the chroot but it is not needed before. If we want to add provider configuration option to the configuration file, they need to be processed before any call to a crypto-related OpenSSL function. We can then delay the initialization until after the configuration file is parsed and processed.	2022-05-17 10:55:59 +02:00
Frédéric Lécaille	372508cc42	MINOR: config: Add "cluster-secret" new global keyword It could be usefull to set a ASCII secret which could be used for different usages. For instance, it will be used to derive QUIC stateless reset tokens.	2022-05-12 17:48:35 +02:00
William Lallemand	89e236f246	BUG/MINOR: startup: usage() when no -cc arguments Exit correctly with usage() instead of segfaulting when no argument were passed to -cc. Must be backported in 2.5.	2022-05-06 17:22:36 +02:00
William Lallemand	8b9a2df969	MINOR: init: exit() after pre-check upon error Add a test on the err_code variable so we don't go further if one of the pre-check callback failed.	2022-05-04 14:29:46 +02:00
Willy Tarreau	226866e1bb	CLEANUP: deinit: release the config postparsers These ones were not released either, it just requires to export the list ("postparsers") and it makes valgrind happy.	2022-04-27 18:07:24 +02:00
Willy Tarreau	65009ebde1	CLEANUP: deinit: release the pre-check callbacks The freeing of pre-check callbacks was missing when this feature was recently added with commit `b53eb8790` ("MINOR: init: add the pre-check callback"), let's do it to make valgrind happy.	2022-04-27 18:02:54 +02:00
Tim Duesterhus	77b3db0fbd	MINOR: Call deinit_and_exit(0) for `haproxy -vv` It appears that it is safe to call perform a clean deinit at this point, so let's do this to exercise the deinit paths some more. Running `valgrind --leak-check=full --show-leak-kinds=all ./haproxy -vv` with this change reports: ==261864== HEAP SUMMARY: ==261864== in use at exit: 344 bytes in 11 blocks ==261864== total heap usage: 1,178 allocs, 1,167 frees, 1,102,089 bytes allocated ==261864== ==261864== 24 bytes in 1 blocks are still reachable in loss record 1 of 2 ==261864== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==261864== by 0x324BA6: hap_register_pre_check (init.c:92) ==261864== by 0x155824: main (haproxy.c:3024) ==261864== ==261864== 320 bytes in 10 blocks are still reachable in loss record 2 of 2 ==261864== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==261864== by 0x26E54E: cfg_register_postparser (cfgparse.c:4238) ==261864== by 0x155824: main (haproxy.c:3024) ==261864== ==261864== LEAK SUMMARY: ==261864== definitely lost: 0 bytes in 0 blocks ==261864== indirectly lost: 0 bytes in 0 blocks ==261864== possibly lost: 0 bytes in 0 blocks ==261864== still reachable: 344 bytes in 11 blocks ==261864== suppressed: 0 bytes in 0 blocks which is looking pretty good.	2022-04-27 05:01:27 +02:00
Willy Tarreau	197715ae21	CLEANUP: compression: move the default setting of maxzlibmem to defaults __comp_fetch_init() only presets the maxzlibmem, and only when both USE_ZLIB and DEFAULT_MAXZLIBMEM are set. The intent is to preset a default value to protect the system against excessive memory usage when no setting is set by the user. Nowadays the entry in the global struct is always there so there's no point anymore in passing via a constructor to possibly set this value. Let's go the cleaner way by always presetting DEFAULT_MAXZLIBMEM to 0 in defaults.h unless these conditions are met, and always assigning it instead of pre-setting the entry to zero. This is more straightforward and removes some ifdefs and the last constructor. In addition, now the setting has a chance of being found.	2022-04-25 19:42:43 +02:00
Willy Tarreau	2df1fbf816	MINOR: init: add global setting "fd-hard-limit" to bound system limits On some systems, the hard limit for ulimit -n may be huge, in the order of 1 billion, and using this to automatically compute maxconn doesn't work as it requires way too much memory. Users tend to hard-code maxconn but that's not convenient to manage deployments on heterogenous systems, nor when porting configs to developers' machines. The ulimit-n parameter doesn't work either because it forces the limit. What most users seem to want (and it makes sense) is to respect the system imposed limits up to a certain value and cap this value. This is exactly what fd-hard-limit does. This addresses github issue #1622.	2022-04-25 18:04:49 +02:00
William Lallemand	b53eb8790e	MINOR: init: add the pre-check callback This adds a call to function <fct> to the list of functions to be called at the step just before the configuration validity checks. This is useful when you need to create things like it would have been done during the configuration parsing and where the initialization should continue in the configuration check. It could be used for example to generate a proxy with multiple servers using the configuration parser itself. At this step the trash buffers are allocated. Threads are not yet started so no protection is required. The function is expected to return non-zero on success, or zero on failure. A failure will make the process emit a succinct error message and immediately exit.	2022-04-22 15:45:47 +02:00
Amaury Denoyelle	97e84c6c69	MINOR: cfg-quic: define tune.quic.conn-buf-limit Add a new global configuration option to set the limit of buffers per QUIC connection. By default, this value is set to 30.	2022-04-21 12:04:04 +02:00
Remi Tricot-Le Breton	b5d968d9b2	MEDIUM: global: Add a "close-spread-time" option to spread soft-stop on time window The new 'close-spread-time' global option can be used to spread idle and active HTTP connction closing after a SIGUSR1 signal is received. This allows to limit bursts of reconnections when too many idle connections are closed at once. Indeed, without this new mechanism, in case of soft-stop, all the idle connections would be closed at once (after the grace period is over), and all active HTTP connections would be closed by appending a "Connection: close" header to the next response that goes over it (or via a GOAWAY frame in case of HTTP2). This patch adds the support of this new option for HTTP as well as HTTP2 connections. It works differently on active and idle connections. On active connections, instead of sending systematically the GOAWAY frame or adding the 'Connection: close' header like before once the soft-stop has started, a random based on the remainder of the close window is calculated, and depending on its result we could decide to keep the connection alive. The random will be recalculated for any subsequent request/response on this connection so the GOAWAY will still end up being sent, but we might wait a few more round trips. This will ensure that goaways are distributed along a longer time window than before. On idle connections, a random factor is used when determining the expire field of the connection's task, which should naturally spread connection closings on the time window (see h2c_update_timeout). This feature request was described in GitHub issue #1614. This patch should be backported to 2.5. It depends on "BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts" which refactorized the timeout management of HTTP2 connections.	2022-04-08 18:15:21 +02:00
Willy Tarreau	29d799d591	MINOR: sample: list registered sample converter functions Similar to the sample fetch keywords, let's also list the converter keywords. They're much simpler since there's no compatibility matrix. Instead the input and output types are listed. This is called by dump_registered_keywords() for the "cnv" keywords class.	2022-03-29 18:01:37 +02:00
Willy Tarreau	f78813f74f	MINOR: samples: add a function to list register sample fetch keywords New function smp_dump_fetch_kw lists registered sample fetch keywords with their compatibility matrix, mandatory and optional argument types, and output types. It's called from dump_registered_keywords() with class "smp".	2022-03-29 18:01:37 +02:00
Willy Tarreau	6ff7d1b9a5	MINOR: acl: add a function to dump the list of known ACL keywords New function acl_dump_kwd() dumps the registered ACL keywords and their sample-fetch equivalent to stdout. It's called by dump_registered_keywords() for keyword class "acl".	2022-03-29 18:01:37 +02:00
Willy Tarreau	06d0e2e034	MINOR: cli: add a new keyword dump function New function cli_list_keywords() scans the list of registered CLI keywords and dumps them on stdout. It's now called from dump_registered_keywords() for the class "cli". Some keywords are valid for the master, they'll be suffixed with "[MASTER]". Others are valid for the worker, they'll have "[WORKER]". Those accessible only in expert mode will show "[EXPERT]" and the experimental ones will show "[EXPERIM]".	2022-03-29 18:01:37 +02:00
Willy Tarreau	5fcc100d91	MINOR: services: extend list_services() to dump to stdout When no output stream is passed, stdout is used with one entry per line, and this is called from dump_registered_services() when passed the class "svc".	2022-03-29 18:01:37 +02:00
Willy Tarreau	3b65e14842	MINOR: filters: extend flt_dump_kws() to dump to stdout When passing a NULL output buffer the function will now dump to stdout with a more compact format that is more suitable for machine processing. An entry was added to dump_registered_keyword() to call it when the keyword class "flt" is requested.	2022-03-29 18:01:37 +02:00
Willy Tarreau	ca1acd6080	MINOR: config: add a function to dump all known config keywords All registered config keywords that are valid in the config parser are dumped to stdout organized like the regular sections (global, listen, etc). Some keywords that are known to only be valid in frontends or backends will be suffixed with [FE] or [BE]. All regularly registered "bind" and "server" keywords are also dumped, one per "bind" or "server" line. Those depending on ssl are listed after the "ssl" keyword. Doing so required to export the listener and server keyword lists that were static. The function is called from dump_registered_keywords() for keyword class "cfg".	2022-03-29 18:01:32 +02:00
Willy Tarreau	76871a4f8c	MINOR: management: add some basic keyword dump infrastructure It's difficult from outside haproxy to detect the supported keywords and syntax. Interestingly, many of our modern keywords are enumerated since they're registered from constructors, so it's not very hard to enumerate most of them. This patch creates some basic infrastructure to support dumping existing keywords from different classes on stdout. The format will differ depending on the classes, but the idea is that the output could easily be passed to a script that generates some simple syntax highlighting rules, completion rules for editors, syntax checkers or config parsers. The principle chosen here is that if "-dK" is passed on the command-line, at the end of the parsing the registered keywords will be dumped for the requested classes passed after "-dK". Special name "help" will show known classes, while "all" will execute all of them. The reason for doing that after the end of the config processor is that it will also enumerate internally-generated keywords, Lua or even those loaded from external code (e.g. if an add-on is loaded using LD_PRELOAD). A typical way to call this with a valid config would be: ./haproxy -dKall -q -c -f /path/to/config If there's no config available, feeding /dev/null will also do the job, though it will not be able to detect dynamically created keywords, of course. This patch also updates the management doc. For now nothing but the help is listed, various subsystems will follow in subsequent patches.	2022-03-29 17:55:54 +02:00
Willy Tarreau	edd426871f	DEBUG: move the tainted stuff to bug.h for easier inclusion The functions needed to manipulate the "tainted" flags were located in too high a level to be callable from the lower code layers. Let's move them to bug.h.	2022-02-25 11:55:38 +01:00
Willy Tarreau	9b4a0e6bac	BUG/MINOR: debug: fix get_tainted() to properly read an atomic value get_tainted() was using an atomic store from the atomic value to a local one instead of using an atomic load. In practice it has no effect given the relatively rare updates of this field and the fact that it's read only when dumping "show info" output, but better fix it. There's probably no need to backport this.	2022-02-25 11:54:30 +01:00

1 2 3 4 5 ...

1129 Commits