1199 Commits

Author SHA1 Message Date
William Lallemand
ef9a195742 BUG/MINOR: startup: set GTUNE_SOCKET_TRANSFER correctly
This bug was preventing the GTUNE_SOCKET_TRANSFER option from being set
when haproxy is neither in daemon mode nor in mworker mode. So it
basically only impacts the foreground mode.

The fix moves the code outside the 'if (global.mode & (MODE_DAEMON |
MODE_MWORKER | MODE_MWORKER_WAIT))' condition.

Bug was introduced with 7f80eb23 ("MEDIUM: proxy: zombify proxies only
when the expose-fd socket is bound").

Must be backported in every stable version.
2023-11-20 10:49:05 +01:00
Willy Tarreau
c7a90cc181 CLEANUP: haproxy: remove old comment from 1.1 from the file header
There was still a totally outdated comment speaking about issues
affecting Solaris on 1.1.8pre4 (April 2002, 21 years old)! This
proves that comments in headers are never read, so let's take this
opportunity for also removing the outdated one recommending to read
the "updated" RFC7230.
2023-11-17 18:10:16 +01:00
William Lallemand
d76fa37534 BUG/MEDIUM: mworker: set the master variable earlier
Since 2.7 and the mcli_reload_bind_conf (56f73b21a5), upon a reload
failure because of a bind error, the mcli_reload_bind_conf goes through a
sock_unbind(). This is not supposed to do anything when a listener is
RX_F_INHERITED in the master, but unfortunately this is done too early
and provokes an exit of the master.

We already suspected in the past that setting the 'master' variable this
late could have negative impact.

The fix sets the master variable earlier before the bind.

This must be backported at least to 2.7. It could be backported
further, but better wait for feedback on the fix.
2023-11-14 14:32:39 +01:00
William Lallemand
a06f6212c9 MEDIUM: startup: 'haproxy -c' is quiet when valid
MODE_CHECK does not output "Configuration file is valid" by default
anymore. To display this message the -V option must be used with -c.

However, warnings and errors are still output by default if they
exist.

This cleans up the output of the systemd unit file, which is doing a
-c.
2023-11-13 09:59:34 +01:00
William Lallemand
3ac3a06963 MEDIUM: mworker: -W is mandatory when using -S
Defining a master CLI without the master-worker mode emits a warning
since version 1.8. This patch enforces the behavior by forbidding the
usage of the -S option without the master-worker mode.
2023-11-09 15:07:15 +01:00
Christopher Faulet
023564b685 MINOR: global: Add an option to disable the zero-copy forwarding
The zero-copy forwarding or the mux-to-mux forwarding is a way to
fast-forward data without using the channels buffers. Data are transferred
from a mux to the other one. The kernel splicing is an optimization of the
zero-copy forwarding. But it can also use normal buffers (but not channels
ones). This way, it could be possible to fast-forward data with muxes not
supporting the kernel splicing (H2 and H3 muxes) but also with applets.

However, this mode can introduce regressions or bugs in the future (just like
the kernel splicing). Thus, it could be useful to disable this optimization. To do
so, in configuration, the global tune setting
'tune.disable-zero-copy-forwarding' may be set in a global section or the
'-dZ' command line parameter may be used to start HAProxy. Of course, this
also disables the kernel splicing.
2023-10-17 18:51:13 +02:00
Aurelien DARRAGON
18da35c123 MEDIUM: tree-wide: logsrv struct becomes logger
When 'log' directive was implemented, the internal representation was
named 'struct logsrv', because the 'log' directive would directly point
to the log target, which used to be a (UDP) log server exclusively at
that time, hence the name.

But things have become more complex, since today 'log' directive can point
to ring targets (implicit, or named) for example.

Indeed, a 'log' directive no longer references the "final" server to
which the log will be sent, but instead it describes which log API and
parameters to use for transporting the log messages to the proper log
destination.

So now the term 'logsrv' is rather confusing and prevents us from
introducing a new level of abstraction because they would be mixed
with logsrv.

So in order to better designate this 'log' directive, and make it more
generic, we chose the word 'logger' which now replaces logsrv everywhere
it was used in the code (including related comments).

This is internal rewording, so no functional change should be expected
on user-side.
2023-10-13 10:05:06 +02:00
Willy Tarreau
90fa2eaa15 MINOR: haproxy: permit to register features during boot
The regtests are using the "feature()" predicate but this one can only
rely on build-time options. It would be nice if some runtime-specific
options could be detected at boot time so that regtests could more
flexibly adapt to what is supported (capabilities, splicing, etc).

Similarly, certain features that are currently enabled with USE_XXX
could also be automatically detected at build time using ifdefs and
would simplify the configuration, but then we'd lose the feature
report in the feature list which is convenient for regtests.

This patch makes sure that haproxy -vv shows the variable's contents
and not the macro's contents, and adds a new hap_register_feature()
to allow the code to register a new keyword.
2023-10-06 11:40:02 +02:00
Willy Tarreau
5119109e3f MINOR: cpuset: dynamically allocate cpu_map
cpu_map is 8.2kB/entry and there's one such entry per group, that's
~520kB total. In addition, the init code is still in haproxy.c enclosed
in ifdefs. Let's make this a dynamically allocated array in the cpuset
code and remove that init code.

Later we may even consider reallocating it once the number of threads
and groups is known, in order to shrink it a little bit, as the typical
setup with a single group will only need 8.2kB, thus saving half a MB
of RAM. This would require that the upper bound is placed in a variable
though.
2023-09-08 16:25:19 +02:00
Willy Tarreau
5f10176e2c MEDIUM: init: initialize the trash earlier
More and more utility functions rely on the trash while most of the init
code doesn't have access to it because it's initialized very late (in
PRE_CHECK for the initial one). It's a pool, and it purposely supports
being reallocated, so let's initialize it in STG_POOL so that early
STG_INIT code can at least use it.
2023-09-08 16:25:19 +02:00
Frédéric Lécaille
292dfdd78d BUG/MINOR: quic: Wrong cluster secret initialization
The function generate_random_cluster_secret() which initializes the cluster secret
when not supplied by configuration is buggy. There is a 1/256 chance that the
cluster secret string is empty.

To fix this, the cluster secret is now stored as the first 128 bits of its own
SHA1 (160 bits) digest when it is defined by configuration. Otherwise, it
is initialized with a 128-bit random value. This way, the cluster secret
is always initialized.

As the cluster secret is always initialized, there are several tests which
are from now on useless. This patch removes such tests (if(global.cluster_secret))
in the QUIC code part and at parsing time: no need to check that a cluster
secret was initialized with "quic-force-retry" option.

Must be backported as far as 2.6.
2023-09-08 09:50:58 +02:00
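A note on commit 292dfdd78d above. One plausible reading of the 1/256 figure, stated here as an assumption rather than taken from the patch: if a random secret is handled as a NUL-terminated C string, it looks empty whenever its first byte is zero, which is one byte value out of 256. The sketch below only illustrates that counting argument. The helper names are hypothetical and this is not the code of generate_random_cluster_secret().

```c
/* Hypothetical helper: a buffer handled as a C string is "empty"
 * as soon as its first byte is the NUL terminator. */
static int looks_empty_as_cstring(const unsigned char *buf)
{
    return buf[0] == '\0';
}

/* Count how many of the 256 possible first-byte values make a
 * random secret look empty when read as a C string. */
int count_empty_first_bytes(void)
{
    unsigned char buf[16] = { 0 };
    int value, empty = 0;

    for (value = 0; value < 256; value++) {
        buf[0] = (unsigned char)value;
        empty += looks_empty_as_cstring(buf);
    }
    return empty;
}
```

Storing a fixed-size 128-bit value, as the commit describes, avoids the question entirely because the length no longer depends on the content.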
Aurelien DARRAGON
e187361b52 MINOR: log: move log-forwarders cleanup in log.c
Move the cleanup of the log-forward proxies from the global deinit()
function into the log-specific deinit function.

No backport needed.
2023-09-06 16:06:39 +02:00
William Lallemand
d90d3bf894 MINOR: global: export the display_version() symbol
Export the display_version() function which can be used elsewhere than
in haproxy.c
2023-09-05 15:24:39 +02:00
Willy Tarreau
86854dd032 MEDIUM: threads: detect excessive thread counts vs cpu-map
This detects when there are more threads bound via cpu-map than CPUs
enabled in cpu-map, or when there are more total threads than the total
number of CPUs available at boot (for unbound threads) and configured
for bound threads. In this case, a warning is emitted to explain the
problems it will cause and how to address the situation.

Note that some configurations will not be detected as faulty because
the algorithmic complexity to resolve all arrangements grows in O(N!).
This means that having 3 threads on 2 CPUs and one thread on 2 CPUs
will not be detected as it's 4 threads for 4 CPUs. But at least configs
such as T0:(1,4) T1:(1,4) T2:(2,4) T3:(3,4) will not trigger a warning
since they're valid.
2023-09-04 19:39:17 +02:00
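A note on commit 86854dd032 above. The limitation it mentions can be seen with a deliberately simplified check that compares the number of bound threads with the number of distinct CPUs in the union of their masks. This is a sketch under assumptions: one `unsigned long` mask per thread, CPU n mapped to bit n, and a hypothetical function name. It is not the detection code from the patch.

```c
/* Portable population count: number of bits set in a mask. */
static int count_bits(unsigned long mask)
{
    int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Returns 1 when there are more bound threads than distinct CPUs
 * across all masks, 0 otherwise. It only looks at global totals. */
int more_threads_than_cpus(const unsigned long *masks, int nthreads)
{
    unsigned long all_cpus = 0;
    int i;

    for (i = 0; i < nthreads; i++)
        all_cpus |= masks[i];
    return nthreads > count_bits(all_cpus);
}
```

With three threads on CPUs {0,1} and one thread on CPUs {2,3}, the totals are 4 threads for 4 CPUs, so this kind of check stays silent even though three threads share two CPUs, which is the undetected case described in the message.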
Willy Tarreau
8357f950cb MEDIUM: threads: detect incomplete CPU bindings
It's very easy to mess up with some cpu-map directives and to leave
some thread unbound. Let's add a test that checks that either all
threads are bound or none are bound, but that we do not face the
intermediary situation where some are pinned and others are left
wandering around, possibly on the same CPUs as bound ones.

Note that this should not be backported, or maybe turned into a
notice only, as it appears that it will easily catch invalid
configs and that may break updates for some users.
2023-09-04 19:39:17 +02:00
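A note on commit 8357f950cb above. The "all bound or none bound" rule can be sketched as follows, assuming one mask per thread where an empty mask means "not bound". The function name and the mask representation are hypothetical.

```c
/* Returns 1 when some threads are bound and others are not,
 * 0 when either all threads or no threads are bound. */
int bindings_incomplete(const unsigned long *masks, int nthreads)
{
    int i, bound = 0;

    for (i = 0; i < nthreads; i++) {
        if (masks[i] != 0)
            bound++;
    }
    return bound != 0 && bound != nthreads;
}
```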
Andrew Hopkins
b3f94f8b3b BUILD: ssl: Build with new cryptographic library AWS-LC
This adds a new Makefile option, USE_OPENSSL_AWSLC, and
updates the documentation with instructions to use HAProxy with
AWS-LC.

Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
2023-09-04 18:19:18 +02:00
Willy Tarreau
bd84387beb MEDIUM: capabilities: enable support for Linux capabilities
For a while there has been the constraint of having to run as root for
transparent proxying, and we're starting to see some cases where QUIC is
not running in socket-per-connection mode due to the missing capability
that would be needed to bind a privileged port. It's not realistic to
ask all QUIC users on port 443 to run as root, so instead let's provide
a basic support for capabilities at least on linux. The ones currently
supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The
mechanism was made OS-specific with a dedicated file because it really
is. It can be easily refined later for other OSes if needed.

A new keyword "setcap" is added to the global section, to enumerate the
capabilities that must be kept when switching from root to non-root. This
is ignored in other situations though. HAProxy has to be built with
USE_LINUX_CAP=1 for this to be supported, which is enabled by default
for linux-glibc, linux-glibc-legacy and linux-musl.

A good way to test this is to start haproxy with such a config:

    global
        uid 1000
        setcap cap_net_bind_service

    frontend test
        mode http
        timeout client 3s
        bind quic4@:443 ssl crt rsa+dh2048.pem allow-0rtt

and run it under "sudo strace -e trace=bind,setuid", then connecting
there from an H3 client. The bind() syscall must succeed despite the
user id having been switched.
2023-08-29 11:11:50 +02:00
Willy Tarreau
f54d8c6457 CLEANUP: cpuset: remove the unused proc_t1 field in cpu_map
This field used to store the cpumap of the first thread in a group, and
was used till 2.4 to hold some default settings, after which it was no
longer used. Let's just drop it.
2023-07-20 11:01:09 +02:00
Willy Tarreau
c955659906 BUG/MINOR: init: set process' affinity even in foreground
The per-process CPU affinity settings are only applied during forking,
which means that cpu-map are ignored when running in foreground (e.g.
haproxy started with -db). This is historic due to the original semantics
of a process array, but isn't documented and causes surprises when trying
to debug affinity settings.

Let's make sure the setting is applied to the workers themselves even
in foreground. This may be backported to 2.6 though it is really not
important. If backported, it also depends on previous commit:

  BUG/MINOR: cpuset: remove the bogus "proc" from the cpu_map struct
2023-07-20 11:01:09 +02:00
Willy Tarreau
151f9a2808 BUG/MINOR: cpuset: remove the bogus "proc" from the cpu_map struct
We're currently having a problem with the porting from cpu_map from
processes to thread-groups as it happened in 2.7 with commit 5b09341c0
("MEDIUM: cpu-map: replace the process number with the thread group
number"), though it seems that it has deeper roots even in 2.0 and
that it was progressively made wrong over time.

The issue stems in the way the per-process and per-thread cpu-sets were
employed over time. Originally only processes were supported. Then
threads were added after an optional "/" and it was documented that
"cpu-map 1" is exactly equivalent to "cpu-map 1/all" (this was clarified
in 2.5 by commit 317804d28 ("DOC: update references to process numbers
in cpu-map and bind-process")).

The reality is different: when processes were still supported, setting
"cpu-map 1" would apply the mask to the process itself (and only when
run in the background, which is not documented either and is also a
bug for another fix), and would be combined with any possible per-thread
mask when calculating the threads' affinity, possibly resulting in empty
sets. However, "cpu-map 1/all" would only set the mask for the threads
and not the process. As such the following:

    cpu-map 1 odd
    cpu-map 1/1-8 even

would leave no CPU while doing:

    cpu-map 1/all odd
    cpu-map 1/1-8 even

would allow all CPUs.

While such configs are very unlikely to ever be met (which is why this
bug is tagged minor), this is becoming quite more visible while testing
automatic CPU binding during 2.9 development because due to this bug
it's much more common to end up with incorrect bindings.

This patch fixes it by simply removing the .proc entry from cpu_map and
always setting all threads' maps. The process is no longer arbitrarily
bound to the group 1's mask, but in case threads are disabled, we'll
use thread 1's mask since it contains the configured CPUs.

This fix should be backported at least to 2.6, but no need to insist if
it resists as it's easier to break cpu-map than to fix an unlikely issue.
2023-07-20 11:01:09 +02:00
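A note on commit 151f9a2808 above. The first case it describes (a process mask combined with a per-thread mask) can be sketched as a plain intersection. The mask layout is an assumption made for illustration: 32 CPUs, CPU n mapped to bit n, so "odd" and "even" become the two constants below. The function name is hypothetical.

```c
/* Assumed layout: CPU n <-> bit n, 32 CPUs. */
#define ODD_CPUS  0xAAAAAAAAUL   /* CPUs 1, 3, 5, ... */
#define EVEN_CPUS 0x55555555UL   /* CPUs 0, 2, 4, ... */

/* Old behavior as described in the message: the process mask is
 * combined with the per-thread mask, so only CPUs present in both
 * remain usable. */
unsigned long combined_affinity(unsigned long process_mask,
                                unsigned long thread_mask)
{
    return process_mask & thread_mask;
}
```

Combining "odd" at process level with "even" at thread level leaves an empty set, which matches the "would leave no CPU" outcome for the first configuration.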
William Lallemand
117b03ff4a BUG/MINOR: mworker: leak of a socketpair during startup failure
Aurelien Darragon found a case of leak when working on ticket #2184.

When a reexec_on_failure() happens *BEFORE* protocol_bind_all(), the
worker is not forked and the mworker_proc struct is still there with
its 2 socketpairs.

The socketpair that is supposed to be in the master is already closed in
mworker_cleanup_proc(), the one for the worker was supposed to
be cleaned up in mworker_cleanlisteners().

However, since the fd is not bound during this failure, the fd is never
closed.

This patch fixes the problem by setting the fd to -1 in the mworker_proc
after the fork, so we ensure that it won't be closed if everything
was done right, and then we try to close it in mworker_cleanup_proc()
when it's not set to -1.

This could be triggered with the script in ticket #2184 and a `ulimit -H
-n 300`. This will fail before the protocol_bind_all() when trying to
increase the nofile setrlimit.

In recent versions of haproxy, there is a BUG_ON() in fd_insert() that
could be triggered by this bug because of the global.maxsock check.

Must be backported as far as 2.6.

The problem could exist in previous versions but the code is different
and this won't be triggered easily without other consequences in the
master.
2023-06-21 09:44:18 +02:00
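A note on commit 117b03ff4a above. The "-1 means not ours to close" convention it relies on can be sketched as below. The struct, its field and both functions are hypothetical stand-ins, not the mworker_proc structure or mworker_cleanup_proc() themselves.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in: [0] is the master side, [1] the worker side. */
struct worker_entry {
    int ipc_fd[2];
};

/* Cleanup closes the worker side only if it is still owned here.
 * Returns the number of descriptors it closed. */
static int cleanup_worker_fd(struct worker_entry *w)
{
    if (w->ipc_fd[1] == -1)
        return 0;
    close(w->ipc_fd[1]);
    w->ipc_fd[1] = -1;
    return 1;
}

/* Simulates a startup that either reaches the fork or fails before it.
 * Returns how many descriptors the cleanup had to close, or -1 if the
 * socketpair could not be created. */
int simulate_startup(int fork_happened)
{
    struct worker_entry w;
    int worker_side, closed;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, w.ipc_fd) < 0)
        return -1;
    worker_side = w.ipc_fd[1];

    if (fork_happened)
        w.ipc_fd[1] = -1;      /* handed over: cleanup must not touch it */

    closed = cleanup_worker_fd(&w);

    /* Demo housekeeping so the sketch itself leaks nothing. */
    close(w.ipc_fd[0]);
    if (fork_happened)
        close(worker_side);
    return closed;
}
```

When the failure happens before the fork, the cleanup finds a valid descriptor and closes it. After a successful fork the field is -1 and the cleanup leaves it alone.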
Aurelien DARRAGON
33bbeecde3 BUILD: init: print rlim_cur as regular integer
haproxy does not compile anymore on macOS+clang since 425d7ad ("MINOR:
init: pre-allocate kernel data structures on init"). This is due to
rlim_cur being printed uncasted using %lu format specifier, with rlim_cur
being stored as a rlim_t which is a typedef so its size may vary depending
on the system's architecture.

This is not the first time we need to dump rlim_cur in case of errors,
there are already multiple occurrences in the init code. Everywhere this
happens, rlim is cast to a regular int and printed using the '%d'
format specifier, so we do the same here as well to fix the build issue.

No backport needed unless 425d7ad gets backported.
2023-05-26 14:29:52 +02:00
Patrick Hemmer
425d7ad89d MINOR: init: pre-allocate kernel data structures on init
The Linux kernel maintains data structures to track a process's open file
descriptors, and it expands these structures as necessary when FD usage grows
(at every FD=2^X starting at 64). However when threading is in use, during
expansion the kernel will pause (observed up to 47ms) while it waits for thread
synchronization (see https://bugzilla.kernel.org/show_bug.cgi?id=217366).

This change addresses the issue and avoids the random pauses by opening the
maximum file descriptor during initialization, so that expansion will not occur
while processing traffic.
2023-05-26 09:28:18 +02:00
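A note on commit 425d7ad89d above. One way to make the kernel size its per-process FD table up front is to briefly occupy the highest descriptor that will ever be needed. The sketch below assumes a hypothetical function that takes the limit as a parameter and uses /dev/null as a throwaway source descriptor. It is not the code from the patch, which is only described as "opening the maximum file descriptor".

```c
#include <fcntl.h>
#include <unistd.h>

/* Briefly occupies descriptor (max_fd - 1) so that the kernel grows
 * its FD table once, at init time. Assumes that descriptor is not in
 * use yet, since dup2() would silently replace it.
 * Returns 0 on success, -1 on failure. */
int prealloc_fd_table(int max_fd)
{
    int target = max_fd - 1;
    int src;

    if (target < 0)
        return -1;

    src = open("/dev/null", O_RDONLY);
    if (src < 0)
        return -1;

    if (src != target) {
        if (dup2(src, target) < 0) {
            close(src);
            return -1;
        }
        close(target);      /* the table keeps its new size */
    }
    close(src);
    return 0;
}
```

In a real program the argument would typically come from the soft RLIMIT_NOFILE value obtained with getrlimit().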
Willy Tarreau
c7b9308f20 BUG/MINOR: clock: automatically adjust the internal clock with the boot time
This is a better and more general solution to the problem described in
this commit:

    BUG/MINOR: checks: postpone the startup of health checks by the boot time

Now we're updating the now_offset that is used to compute now_ms at the
few points where we update the ready date during boot. This ensures that
now_ms while being stable during all the boot process will be correct
and will start with the boot value right after the boot is finished. As
such the patch above is rolled back (we don't want to count the boot
time twice).

This must not be backported because it relies on the more flexible clock
architecture in 2.8.
2023-05-17 09:33:54 +02:00
Willy Tarreau
da4aa6905c MINOR: clock: measure the total boot time
Some huge configs take a significant amount of time to start and this
can cause some trouble (e.g. health checks getting delayed and grouped,
process not responding to the CLI etc). For example, some configs might
start fast in certain environments and slowly in other ones just due to
the use of a wrong DNS server that delays all libc's resolutions. Let's
first start by measuring it by keeping a copy of the most recently known
ready date, once before calling check_config_validity() and then refine
it when leaving this function. A last call is finally performed just
before deciding to split between master and worker processes, and it covers
the whole boot. It's trivial to collect and even allows to get rid of a
call to clock_update_date() in function check_config_validity() that was
used in hope to better schedule future events.
2023-05-17 09:33:54 +02:00
Willy Tarreau
c05d30e9d8 MINOR: clock: replace the timeval start_time with start_time_ns
Now that "now" is no longer a timeval, there's no point keeping a copy
of it as a timeval, let's also switch start_time to nanoseconds, it
simplifies operations.
2023-04-28 16:08:08 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never null and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One open question concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anything for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
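A note on commit 69530f59ae above. The wrapping figures follow from simple arithmetic on the counter widths. The sketch below uses hypothetical helper names and a 365.25-day year. It does not reproduce clock.c.

```c
#include <stdint.h>

#define NS_PER_SEC   1000000000ULL
#define SEC_PER_YEAR 31557600ULL     /* 365.25 days */
#define MS_PER_DAY   86400000ULL

/* Converts a (seconds, microseconds) pair, as found in a timeval,
 * into a single 64-bit nanosecond count. */
uint64_t to_ns(uint64_t sec, uint64_t usec)
{
    return sec * NS_PER_SEC + usec * 1000ULL;
}

/* Whole years before a 64-bit nanosecond counter wraps. */
uint64_t ns_counter_wrap_years(void)
{
    return UINT64_MAX / (NS_PER_SEC * SEC_PER_YEAR);
}

/* Whole days before a 32-bit millisecond counter wraps. */
uint64_t ms_counter_wrap_days(void)
{
    return (uint64_t)UINT32_MAX / MS_PER_DAY;
}
```

The 64-bit counter lasts about 584.5 years, which the message rounds to 585. A 32-bit millisecond counter wraps after a little under 50 days, which is why the message mentions presetting the internal offset so that now_ms wraps shortly after boot instead.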
Willy Tarreau
0e875cf291 MEDIUM: listener: switch the default sharding to by-group
Sharding by-group is exactly identical to by-process for a single
group, and will use the same number of file descriptors for more than
one group, while significantly lowering the kernel's locking overhead.

Now that all special listeners (cli, peers) are properly handled, and
that support for SO_REUSEPORT is detected at runtime per protocol, there
should be no more reason for not switching to by-group by default.

That's what this patch does. It does only this and nothing else so that
it's easy to revert, should any issue be raised.

Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed
into 8 core complexes of 3 cores each, shows that configuring 8 groups
(one per CCX) is sufficient to simply double the forwarded connection
rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.
2023-04-23 10:18:16 +02:00
Willy Tarreau
7310164b2c MINOR: listener: add a new global tune.listener.default-shards setting
This new setting accepts "by-process", "by-group" and "by-thread" and
will dictate how listeners will be sharded by default when nothing is
specified. While the default remains "by-process", "by-group" should be
much more efficient with many threads, while not changing anything for
single-group setups.
2023-04-23 09:46:15 +02:00
Willy Tarreau
785b89f551 MINOR: protocol: move the global reuseport flag to the protocols
Some protocols support SO_REUSEPORT and others do not. Some have such a
limitation in the kernel, and others in haproxy itself (e.g. sock_unix
cannot support multiple bindings since each one will unbind the previous
one). Also it's really protocol-dependent and not just family-dependent
because on Linux for some time it was supported for TCP and not UDP.

Let's move the definition to the protocols instead. Now it's preset in
tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset.
The enabled() config condition test validates IPv4 (generally sufficient),
and -dR / noreuseport disable it for all protocols at once.
2023-04-23 09:46:15 +02:00
Willy Tarreau
84fe1f479b MINOR: listener: support another thread dispatch mode: "fair"
This new algorithm for rebalancing incoming connections to multiple
threads is simpler and instead of considering the threads load, it will
only cycle through all of them, offering a fair share of the traffic to
each thread. It may be well suited for short-lived connections but is
also convenient for very large thread counts where it's not always certain
that the least loaded thread will always be found.
2023-04-21 17:41:26 +02:00
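A note on commit 84fe1f479b above. The difference between cycling through threads and looking for the least loaded one can be sketched with two small selection functions. Both are hypothetical illustrations of the two ideas, not the listener code.

```c
/* "Fair" idea: ignore load and cycle through the threads, using the
 * number of connections accepted so far as the cursor. */
unsigned int fair_pick_thread(unsigned int accepted, unsigned int nthreads)
{
    return nthreads ? accepted % nthreads : 0;
}

/* Load-based idea: scan every thread and keep the one with the
 * smallest load. The cost grows with the number of threads. */
unsigned int least_loaded_thread(const unsigned int *load,
                                 unsigned int nthreads)
{
    unsigned int i, best = 0;

    for (i = 1; i < nthreads; i++) {
        if (load[i] < load[best])
            best = i;
    }
    return best;
}
```

The first function needs no shared view of per-thread load, which is one reason a round-robin scheme can be attractive with very large thread counts.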
Aurelien DARRAGON
cca3355074 BUG/MINOR: log: free log forward proxies on deinit()
Proxies belonging to the cfg_log_forward proxy list are not cleaned up
in haproxy deinit() function.
We add the missing cleanup directly in the main deinit() function since
no other specific function may be used for this.

This could be backported up to 2.4
2023-04-05 08:58:16 +02:00
Aurelien DARRAGON
9b1d15f53a BUG/MINOR: sink: free forward_px on deinit()
When a ring section is configured, a new sink is created and forward_px
proxy may be allocated and assigned to the sink.
Such sink-related proxies are added to the sink_proxies_list and thus
don't belong to the main proxy list which is cleaned up in
haproxy deinit() function.

We don't have to manually clean up sink_proxies_list in the main deinit()
func:
sink API already provides the sink_deinit() function so we just add the
missing free_proxy(sink->forward_px) there.

This could be backported up to 2.4.
[in 2.4, commit b0281a49 ("MINOR: proxy: check if p is NULL in free_proxy()")
must be backported first]
2023-04-05 08:58:16 +02:00
Willy Tarreau
9ef2742a51 MINOR: debug: support dumping the libs addresses when running in verbose mode
Starting haproxy with -dL helps enumerate the list of libraries in use.
But sometimes in order to go further we'd like to see their address
ranges. This is already supported on the CLI's "show libs" but not on
the command line where it can sometimes help troubleshoot startup issues.
Let's dump them when in verbose mode. This way it doesn't change the
existing behavior for those trying to enumerate libs to produce an archive.
2023-03-22 11:43:15 +01:00
William Lallemand
2078d4b1f7 BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
In environments where SYSTEM_MAXCONN is defined when compiling, the
master will use this value instead of the original minimal value which
was set to 100. When this happens, the master process could allocate
RAM excessively since it does not need to have a high maxconn. (For
example if SYSTEM_MAXCONN was set to 100000 or more)

This patch fixes the issue by using the new define MASTER_MAXCONN which
defines a default maxconn of 100 for the master process.

Must be backported as far as 2.5.
2023-03-09 14:28:44 +01:00
Amaury Denoyelle
5907fede87 MEDIUM: quic: release closing connections on stopping
Since the following commit :
  commit fb375574f9
  MINOR: quic: mark quic-conn as jobs on socket allocation

quic-conn instances are marked as jobs. This prevents the haproxy process
from stopping while a transfer is in progress. To not delay process
termination, idle connections are woken up through their MUX instances
to be able to release them immediately.

However, there is no mechanism to wake up quic connections left on
closing or draining state. This means that haproxy process termination
is delayed until the timer of every closing quic connection has expired.

To improve this, a new function quic_handle_stopping() is called when
haproxy process is stopping. It simply wakes up the idle timer task of
all connections in the global closing list. These connections will thus
be released immediately to not interrupt haproxy process stopping.

This should be backported up to 2.7.
2023-03-08 14:41:28 +01:00
Sébastien Gross
2a1bcf1a59 MINOR: config: add HAPROXY_BRANCH environment variable
This patch adds support for the HAPROXY_BRANCH environment variable.
It can be useful if some resources are loaded from different
locations when migrating from one version to another.

Signed-off-by: Sébastien Gross <sgross@haproxy.com>
2023-02-24 09:45:44 +01:00
Aurelien DARRAGON
28a6d48a60 MINOR: haproxy: always protocol unbind on startup error path
In haproxy startup, all init error paths after the protocol bind step
cautiously call protocol_unbind_all() before exiting except one that was
conditional. We're not making an exception to the rule and we now properly
call protocol_unbind_all() as well.

No backport needed as this patch is unnoticeable.
2023-02-23 15:05:05 +01:00
William Lallemand
d4c0be6b20 MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
HAPROXY_STARTUP_VERSION: contains the version used to start, in
master-worker mode this is the version which was used to start the
master, even after updating the binary and reloading.

This patch could be backported in every version since it is useful when
debugging.
2023-02-21 14:16:45 +01:00
Christopher Faulet
2f7c82bfdf BUG/MINOR: haproxy: Fix option to disable the fast-forward
The option was renamed to only permit to disable the fast-forward. First
there is no reason to enable it because it is the default behavior. Then it
introduced a bug because there is no way to be sure the command line has
precedence over the configuration this way. So, the option is now named
"tune.disable-fast-forward" and does not support any argument. And of
course, the command line option "-dF" has now precedence over the
configuration.

No backport needed.
2023-02-21 11:44:55 +01:00
William Lallemand
5a7f83af84 BUG/MINOR: mworker: prevent incorrect values in uptime
Since the recent changes on the clocks, now.tv_sec is not to be used
between processes because it's a clock which is local to the process and
does not contain a real unix timestamp.  This patch fixes the issue by
using "date.tv_sec" which is the wall clock instead of "now.tv_sec".
It prevents having incoherent timestamps.

It also introduces some checks on negative values in order to never
display a negative value if it was computed from a wrong value set by a
previous haproxy version.

It must be backported as far as 2.0.
2023-02-17 17:17:28 +01:00
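A note on commit 5a7f83af84 above. The guard against negative values can be sketched as a clamp applied to a difference of wall-clock seconds. The function name and the choice of 0 as the fallback are assumptions made for illustration.

```c
/* Uptime in seconds computed from two wall-clock timestamps.
 * A start value that lies in the future (for example one inherited
 * from an older process using a different time base) yields 0
 * instead of a negative number. */
long displayed_uptime(long wall_now_sec, long wall_start_sec)
{
    long diff = wall_now_sec - wall_start_sec;

    return diff < 0 ? 0 : diff;
}
```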
Willy Tarreau
3e820a1056 MINOR: threads: add flags to know if a thread is started and/or running
Several times during debugging it has been difficult to find a way to
reliably indicate if a thread had been started and if it was still
running. It's really not easy because the elements we look at are not
necessarily reliable (e.g. harmless bit or idle bit might not reflect
what we think during a signal). And such notions can be subjective
anyway.

Here we define two thread flags, TH_FL_STARTED which is set as soon as
a thread enters run_thread_poll_loop() and drops the idle bit, and
another one, TH_FL_IN_LOOP, which is set when entering run_poll_loop()
and cleared when leaving it. This should help init/deinit code know
whether it's called from a non-initialized thread (i.e. tid must not
be trusted), or shared functions know if they're being called from a
running thread or from init/deinit code outside of the polling loop.
2023-02-17 16:01:34 +01:00
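A note on commit 3e820a1056 above. The life cycle of the two flags can be sketched with plain bit operations. The numeric values and the helper names are hypothetical. Only the flag names TH_FL_STARTED and TH_FL_IN_LOOP come from the message, and real code would update them atomically.

```c
/* Hypothetical bit values, chosen for this sketch only. */
#define TH_FL_STARTED 0x1u
#define TH_FL_IN_LOOP 0x2u

/* Set once, when the thread begins running its per-thread loop. */
unsigned int mark_started(unsigned int flags)
{
    return flags | TH_FL_STARTED;
}

/* Set on entry to the polling loop, cleared when leaving it. */
unsigned int enter_poll_loop(unsigned int flags)
{
    return flags | TH_FL_IN_LOOP;
}

unsigned int leave_poll_loop(unsigned int flags)
{
    return flags & ~TH_FL_IN_LOOP;
}

/* Init/deinit code can ask whether the thread was ever started. */
int thread_started(unsigned int flags)
{
    return (flags & TH_FL_STARTED) != 0;
}

/* Shared functions can ask whether they run inside the polling loop. */
int in_polling_loop(unsigned int flags)
{
    return (flags & TH_FL_IN_LOOP) != 0;
}
```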
Christopher Faulet
678a4ced70 MINOR: haproxy: Add a command option to disable data fast-forward
The -dF option can now be used to disable data fast-forward. It does the
same as the global option "tune.fast-forward off". Some reg-tests may rely
on this optim. To detect the feature and skip such script, the following
vtest command must be used:

  feature cmd "$HAPROXY_PROGRAM -cc '!(global.tune & GTUNE_NO_FAST_FWD)'"
2023-02-17 10:17:02 +01:00
Amaury Denoyelle
2776e775ec BUG/MINOR: mworker: fix uptime for master process
Uptime calculation for master process was incorrect as it used
<start_date> as its timestamp base time. Fix this by using the scheduler
time <start_time> for this.

The impact of this bug is minor as timestamp base time is only used for
"show proc" CLI output. It was highlighted by the following commit,
which caused a negative value to be displayed for the master process
uptime on "show proc" output.

  28360dc53f
  MEDIUM: clock: force internal time to wrap early after boot

This should be backported up to 2.0.
2023-02-10 15:57:33 +01:00
Willy Tarreau
6093ba47c0 BUG/MINOR: clock: do not mix wall-clock and monotonic time in uptime calculation
We've had a start date even before the internal monotonic clock existed,
but once the monotonic clock was added, the start date was not updated
to distinguish the wall clock time units and the internal monotonic time
units. The distinction is important because both clocks do not necessarily
progress at the same speed. The very rare occurrences of the wall-clock
date are essentially for human consumption and communication with third
parties (e.g. report the start date in "show info" for monitoring
purposes). However currently this one is also used to measure the distance
to "now" as being the process' uptime. This is actually not correct. It
only works because for now the two dates are initialized at the exact
same instant at boot but could still be wrong if the system's date shows
a big jump backwards during startup for example. In addition the current
situation prevents us from enforcing an arbitrary offset at boot to reveal
some heisenbugs.

This patch adds a new "start_time" at boot that is set from "now" and is
used in uptime calculations. "start_date" instead is now set from "date"
and will always reflect the system date for human consumption (e.g. in
"show info"). This way we're now sure that any drift of the internal
clock relative to the system date will not impact the reported uptime.

This could possibly be backported though it's unlikely that anyone has
ever noticed the problem.
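
As a rough illustration of the split described above, the sketch below records a wall-clock stamp for display and a separate monotonic stamp for uptime arithmetic. It is not HAProxy's code: the struct and function names are hypothetical, and POSIX CLOCK_MONOTONIC merely stands in for HAProxy's internal monotonic "now".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical container; HAProxy keeps these as separate globals. */
struct boot_stamps {
    struct timespec start_date; /* wall clock, for human consumption only */
    struct timespec start_time; /* monotonic clock, for uptime math */
};

/* Record both clocks once at boot. */
static void boot_stamps_init(struct boot_stamps *bs)
{
    assert(bs != NULL);
    clock_gettime(CLOCK_REALTIME, &bs->start_date);
    clock_gettime(CLOCK_MONOTONIC, &bs->start_time);
}

/* Uptime in whole seconds, derived only from the monotonic clock,
 * so a jump of the system date cannot affect the result.
 */
static long uptime_seconds(const struct boot_stamps *bs)
{
    struct timespec now;
    long secs;

    assert(bs != NULL);
    clock_gettime(CLOCK_MONOTONIC, &now);
    secs = (long)(now.tv_sec - bs->start_time.tv_sec);
    if (now.tv_nsec < bs->start_time.tv_nsec)
        secs--; /* borrow one second from the nanosecond part */
    return secs;
}
```

Changing start_date after boot (as a system date jump would) leaves uptime_seconds() unaffected, which is the property the patch is after.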
2023-02-08 11:06:55 +01:00
Amaury Denoyelle
24d5b72ca9 MINOR: quic: add config for retransmit limit
Define a new configuration option "tune.quic.max-frame-loss". This is
used to specify the limit for which a single frame instance can be
detected as lost. If exceeded, the connection is closed.

This should be backported up to 2.7.
2023-02-03 11:56:46 +01:00
Aurelien DARRAGON
739281b3d6 BUG/MEDIUM: thread: consider secondary threads as idle+harmless during boot
idle and harmless bits in the tgroup_ctx structure were not explicitly
set during boot.

    | struct tgroup_ctx ha_tgroup_ctx[MAX_TGROUPS] = { };

As the structure is first statically initialized,
.threads_harmless and .threads_idle are automatically zero-
initialized by the compiler.

Unfortunately, this means that such threads are not considered idle
nor harmless by thread_isolate(_full)() functions until they enter
the polling loop (thread_harmless_now() and thread_idle_now() are
respectively called before entering the polling loop)

Because of this, any attempt to call thread_isolate() or thread_isolate_full()
during a startup phase with nbthreads >= 2 will cause thread_isolate to
loop until every secondary thread makes it through its first polling loop.

If the startup phase is aborted during boot (ie: "-c" option to check the
configuration), secondary threads may be initialized but will never be started
(ie: they won't enter the polling loop), thus thread_isolate()
would loop forever in such cases.

We can easily reveal the bug with this patch reproducer:

    |  diff --git a/src/haproxy.c b/src/haproxy.c
    |  index e91691658..0b733f6ee 100644
    |  --- a/src/haproxy.c
    |  +++ b/src/haproxy.c
    |  @@ -2317,6 +2317,10 @@ static void init(int argc, char **argv)
    |   		if (pr || px) {
    |   			/* At least one peer or one listener has been found */
    |   			qfprintf(stdout, "Configuration file is valid\n");
    |  +			printf("haproxy will loop...\n");
    |  +			thread_isolate();
    |  +			printf("we will never reach this\n");
    |  +			thread_release();
    |   			deinit_and_exit(0);
    |   		}
    |   		qfprintf(stdout, "Configuration file has no error but will not start (no listener) => exit(2).\n");

Now we start haproxy with a valid config:
$> haproxy -c -f valid.conf
Configuration file is valid
haproxy will loop...

^C

------------------------------------------------------------------------------

This did not cause any issue so far because no early deinit paths require
full thread isolation. But this may change when new features or requirements
are introduced, so we should fix this before it becomes a real issue.

To fix this, we explicitly assign .threads_harmless and .threads_idle
to .threads_enabled value in thread_map_to_groups() function during boot.
This is the proper place to do this since as long as .threads_enabled is not
explicitly set, its default value is also 0 (zero-initialized by the compiler)

code snippet from thread_isolate() function:
       ulong te = _HA_ATOMIC_LOAD(&ha_tgroup_info[tgrp].threads_enabled);
       ulong th = _HA_ATOMIC_LOAD(&ha_tgroup_ctx[tgrp].threads_harmless);

       if ((th & te) == te)
           break;

Thus thread_isolate(_full()) won't be looping forever in thread_isolate()
even if it were to be used before thread_map_to_groups() is executed.

No backport needed unless this is a requirement.
2023-02-02 08:21:15 +01:00
Willy Tarreau
2c701dbc07 BUG/MINOR: log: release global log servers on exit
Since 2.6 we have a free_logsrv() function that is used to release log
servers. It must be called from deinit() instead of manually iterating
over the log servers, otherwise some parts of the structure are not
freed (namely the ring name), as reported by ASAN.

This should be backported to 2.6.
2023-01-26 15:49:30 +01:00
Willy Tarreau
b2f38c13d1 BUG/MINOR: thread: always reload threads_enabled in loops
A few loops waiting for threads to synchronize such as thread_isolate()
rightfully filter the thread masks via the threads_enabled field that
contains the list of enabled threads. However, it doesn't use an atomic
load on it. Before 2.7, the equivalent variables were marked as volatile
and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and
the risk that the compiler keeps them in a register inside a loop is not
null at all. In practice when ha_thread_relax() calls sched_yield() or
an x86 PAUSE instruction, it could be verified that the variable is
always reloaded. If these are avoided (e.g. architecture providing
neither solution), it's visible in asm code that the variables are not
reloaded. In this case, if a thread exits just between the moment the
two values are read, the loop could spin forever.

This patch adds the required _HA_ATOMIC_LOAD() on the relevant
threads_enabled fields. It must be backported to 2.7.
2023-01-19 19:22:17 +01:00
Willy Tarreau
40c88f997f [RELEASE] Released version 2.8-dev1
Released version 2.8-dev1 with the following main changes :
    - MEDIUM: 51d: add support for 51Degrees V4 with Hash algorithm
    - MINOR: debug: support pool filtering on "debug dev memstats"
    - MINOR: debug: add a balance of alloc - free at the end of the memstats dump
    - LICENSE: wurfl: clarify the dummy library license.
    - MINOR: event_hdl: add event handler base api
    - DOC/MINOR: api: add documentation for event_hdl feature
    - MEDIUM: ssl: rename the struct "cert_key_and_chain" to "ckch_data"
    - MINOR: quic: remove qc from quic_rx_packet
    - MINOR: quic: complete traces in qc_rx_pkt_handle()
    - MINOR: quic: extract datagram parsing code
    - MINOR: tools: add port for ipcmp as optional criteria
    - MINOR: quic: detect connection migration
    - MINOR: quic: ignore address migration during handshake
    - MINOR: quic: startup detect for quic-conn owned socket support
    - MINOR: quic: test IP_PKTINFO support for quic-conn owned socket
    - MINOR: quic: define config option for socket per conn
    - MINOR: quic: allocate a socket per quic-conn
    - MINOR: quic: use connection socket for emission
    - MEDIUM: quic: use quic-conn socket for reception
    - MEDIUM: quic: move receive out of FD handler to quic-conn io-cb
    - MINOR: mux-quic: rename duplicate function names
    - MEDIUM: quic: requeue datagrams received on wrong socket
    - MINOR: quic: reconnect quic-conn socket on address migration
    - MINOR: quic: activate socket per conn by default
    - BUG/MINOR: ssl: initialize SSL error before parsing
    - BUG/MINOR: ssl: initialize WolfSSL before parsing
    - BUG/MINOR: quic: fix fd leak on startup check quic-conn owned socket
    - BUG/MEDIIM: stconn: Flush output data before forwarding close to write side
    - MINOR: server: add srv->rid (revision id) value
    - MINOR: stats: add server revision id support
    - MINOR: server/event_hdl: add support for SERVER_ADD and SERVER_DEL events
    - MINOR: server/event_hdl: add support for SERVER_UP and SERVER_DOWN events
    - BUG/MEDIUM: checks: do not reschedule a possibly running task on state change
    - BUG/MINOR: checks: make sure fastinter is used even on forced transitions
    - CLEANUP: assorted typo fixes in the code and comments
    - MINOR: mworker: display an alert upon a wait-mode exit
    - BUG/MEDIUM: mworker: fix segv in early failure of mworker mode with peers
    - BUG/MEDIUM: mworker: create the mcli_reload socketpairs in case of upgrade
    - BUG/MINOR: checks: restore legacy on-error fastinter behavior
    - MINOR: check: use atomic for s->consecutive_errors
    - MINOR: stats: properly handle ST_F_CHECK_DURATION metric
    - MINOR: mworker: remove unused legacy code in mworker_cleanlisteners
    - MINOR: peers: unused code path in process_peer_sync
    - BUG/MINOR: init/threads: continue to limit default thread count to max per group
    - CLEANUP: init: remove useless assignment of nbthread
    - BUILD: atomic: atomic.h may need compiler.h on ARMv8.2-a
    - BUILD: makefile/da: also clean Os/ in Device Atlas dummy lib dir
    - BUG/MEDIUM: httpclient/lua: double LIST_DELETE on end of lua task
    - CLEANUP: pools: move the write before free to the uaf-only function
    - CLEANUP: pool: only include pool-os from pool.c not pool.h
    - REORG: pool: move all the OS specific code to pool-os.h
    - CLEANUP: pools: get rid of CONFIG_HAP_POOLS
    - DEBUG: pool: show a few examples in -dMhelp
    - MINOR: pools: make DEBUG_UAF a runtime setting
    - BUG/MINOR: promex: create haproxy_backend_agg_server_status
    - MINOR: promex: introduce haproxy_backend_agg_check_status
    - DOC: promex: Add missing backend metrics
    - BUG/MAJOR: fcgi: Fix uninitialized reserved bytes
    - REGTESTS: fix the race conditions in iff.vtc
    - CI: github: reintroduce openssl 1.1.1
    - BUG/MINOR: quic: properly handle alloc failure in qc_new_conn()
    - BUG/MINOR: quic: handle alloc failure on qc_new_conn() for owned socket
    - CLEANUP: mux-quic: remove unused attribute on qcs_is_close_remote()
    - BUG/MINOR: mux-quic: remove qcs from opening-list on free
    - BUG/MINOR: mux-quic: handle properly alloc error in qcs_new()
    - CI: github: split ssl lib selection based on git branch
    - REGTESTS: startup: check maxconn computation
    - BUG/MINOR: startup: don't use internal proxies to compute the maxconn
    - REGTESTS: startup: change the expected maxconn to 11000
    - CI: github: set ulimit -n to a greater value
    - REGTESTS: startup: activate automatic_maxconn.vtc
    - MINOR: sample: add param converter
    - CLEANUP: ssl: remove check on srv->proxy
    - BUG/MEDIUM: freq-ctr: Don't compute overshoot value for empty counters
    - BUG/MEDIUM: resolvers: Use tick_first() to update the resolvers task timeout
    - REGTESTS: startup: add alternatives values in automatic_maxconn.vtc
    - BUG/MEDIUM: h3: reject request with invalid header name
    - BUG/MEDIUM: h3: reject request with invalid pseudo header
    - MINOR: http: extract content-length parsing from H2
    - BUG/MEDIUM: h3: parse content-length and reject invalid messages
    - CI: github: remove redundant ASAN loop
    - CI: github: split matrix for development and stable branches
    - BUG/MEDIUM: mux-h1: Don't release H1 stream upgraded from TCP on error
    - BUG/MINOR: mux-h1: Fix test instead a BUG_ON() in h1_send_error()
    - MINOR: http-htx: add BUG_ON to prevent API error on http_cookie_register
    - BUG/MEDIUM: h3: fix cookie header parsing
    - BUG/MINOR: h3: fix memleak on HEADERS parsing failure
    - MINOR: h3: check return values of htx_add_* on headers parsing
    - MINOR: ssl: Remove unneeded buffer allocation in show ocsp-response
    - MINOR: ssl: Remove unnecessary alloc'ed trash chunk in show ocsp-response
    - BUG/MINOR: ssl: Fix memory leak of find_chain in ssl_sock_load_cert_chain
    - MINOR: stats: provide ctx for dumping functions
    - MINOR: stats: introduce stats field ctx
    - BUG/MINOR: stats: fix show stat json buffer limitation
    - MINOR: stats: make show info json future-proof
    - BUG/MINOR: quic: fix crash on PTO rearm if anti-amplification reset
    - BUILD: 51d: fix build issue with recent compilers
    - REGTESTS: startup: disable automatic_maxconn.vtc
    - BUILD: peers: peers-t.h depends on stick-table-t.h
    - BUG/MEDIUM: tests: use tmpdir to create UNIX socket
    - BUG/MINOR: mux-h1: Report EOS on parsing/internal error for not running stream
    - BUG/MINOR:: mux-h1: Never handle error at mux level for running connection
    - BUG/MEDIUM: stats: Rely on a local trash buffer to dump the stats
    - OPTIM: pool: split the read_mostly from read_write parts in pool_head
    - MINOR: pool: make the thread-local hot cache size configurable
    - MINOR: freq_ctr: add opportunistic versions of swrate_add()
    - MINOR: pool: only use opportunistic versions of the swrate_add() functions
    - REGTESTS: ssl: enable the ssl_reuse.vtc test for WolfSSL
    - BUG/MEDIUM: mux-quic: fix double delete from qcc.opening_list
    - BUG/MEDIUM: quic: properly take shards into account on bind lines
    - BUG/MINOR: quic: do not allocate more rxbufs than necessary
    - MINOR: ssl: Add a lock to the OCSP response tree
    - MINOR: httpclient: Make the CLI flags public for future use
    - MINOR: ssl: Add helper function that extracts an OCSP URI from a certificate
    - MINOR: ssl: Add OCSP request helper function
    - MINOR: ssl: Add helper function that checks the validity of an OCSP response
    - MINOR: ssl: Add "update ssl ocsp-response" cli command
    - MEDIUM: ssl: Add ocsp_certid in ckch structure and discard ocsp buffer early
    - MINOR: ssl: Add ocsp_update_tree and helper functions
    - MINOR: ssl: Add crt-list ocsp-update option
    - MINOR: ssl: Store 'ocsp-update' mode in the ckch_data and check for inconsistencies
    - MEDIUM: ssl: Insert ocsp responses in update tree when needed
    - MEDIUM: ssl: Add ocsp update task main function
    - MEDIUM: ssl: Start update task if at least one ocsp-update option is set to on
    - DOC: ssl: Add documentation for ocsp-update option
    - REGTESTS: ssl: Add tests for ocsp auto update mechanism
    - MINOR: ssl: Move OCSP code to a dedicated source file
    - BUG/MINOR: ssl/ocsp: check chunk_strcpy() in ssl_ocsp_get_uri_from_cert()
    - CLEANUP: ssl/ocsp: add spaces around operators
    - BUG/MEDIUM: mux-h2: Refuse interim responses with end-stream flag set
    - BUG/MINOR: pool/stats: Use ullong to report total pool usage in bytes in stats
    - BUG/MINOR: ssl/ocsp: httpclient blocked when doing a GET
    - MINOR: httpclient: don't add body when istlen is empty
    - MEDIUM: httpclient: change the default log format to skip duplicate proxy data
    - BUG/MINOR: httpclient/log: free of invalid ptr with httpclient_log_format
    - MEDIUM: mux-quic: implement shutw
    - MINOR: mux-quic: do not count stream flow-control if already closed
    - MINOR: mux-quic: handle RESET_STREAM reception
    - MEDIUM: mux-quic: implement STOP_SENDING emission
    - MINOR: h3: use stream error when needed instead of connection
    - CI: github: enable github api authentication for OpenSSL tags read
    - BUG/MINOR: mux-quic: ignore remote unidirectional stream close
    - CI: github: use the GITHUB_TOKEN instead of a manually generated token
    - BUILD: makefile: build the features list dynamically
    - BUILD: makefile: move common options-oriented macros to include/make/options.mk
    - BUILD: makefile: sort the features list
    - BUILD: makefile: initialize all build options' variables at once
    - BUILD: makefile: add a function to collect all options' CFLAGS/LDFLAGS
    - BUILD: makefile: start to automatically collect CFLAGS/LDFLAGS
    - BUILD: makefile: ensure that all USE_* handlers appear before CFLAGS are used
    - BUILD: makefile: clean the wolfssl include and lib generation rules
    - BUILD: makefile: make sure to also ignore SSL_INC when using wolfssl
    - BUILD: makefile: reference libdl only once
    - BUILD: makefile: make sure LUA_INC and LUA_LIB are always initialized
    - BUILD: makefile: do not restrict Lua's prepend path to empty LUA_LIB_NAME
    - BUILD: makefile: never force -latomic, set USE_LIBATOMIC instead
    - BUILD: makefile: add an implicit USE_MATH variable for -lm
    - BUILD: makefile: properly report USE_PCRE/USE_PCRE2 in features
    - CLEANUP: makefile: properly indent ifeq/ifneq conditional blocks
    - BUILD: makefile: rework 51D to split v3/v4
    - BUILD: makefile: support LIBCRYPT_LDFLAGS
    - BUILD: makefile: support RT_LDFLAGS
    - BUILD: makefile: support THREAD_LDFLAGS
    - BUILD: makefile: support BACKTRACE_LDFLAGS
    - BUILD: makefile: support SYSTEMD_LDFLAGS
    - BUILD: makefile: support ZLIB_CFLAGS and ZLIB_LDFLAGS
    - BUILD: makefile: support ENGINE_CFLAGS
    - BUILD: makefile: support OPENSSL_CFLAGS and OPENSSL_LDFLAGS
    - BUILD: makefile: support WOLFSSL_CFLAGS and WOLFSSL_LDFLAGS
    - BUILD: makefile: support LUA_CFLAGS and LUA_LDFLAGS
    - BUILD: makefile: support DEVICEATLAS_CFLAGS and DEVICEATLAS_LDFLAGS
    - BUILD: makefile: support PCRE[2]_CFLAGS and PCRE[2]_LDFLAGS
    - BUILD: makefile: refactor support for 51DEGREES v3/v4
    - BUILD: makefile: support WURFL_CFLAGS and WURFL_LDFLAGS
    - BUILD: makefile: make all OpenSSL variants use the same settings
    - BUILD: makefile: remove the special case of the SSL option
    - BUILD: makefile: only consider settings from enabled options
    - BUILD: makefile: also list per-option settings in 'make opts'
    - BUG/MINOR: debug: don't mask the TH_FL_STUCK flag before dumping threads
    - MINOR: cfgparse-ssl: avoid a possible crash on OOM in ssl_bind_parse_npn()
    - BUG/MINOR: ssl: Missing goto in error path in ocsp update code
    - BUG/MINOR: stick-table: report the correct action name in error message
    - CI: Improve headline in matrix.py
    - CI: Add in-memory cache for the latest OpenSSL/LibreSSL
    - CI: Use proper `if` blocks instead of conditional expressions in matrix.py
    - CI: Unify the `GITHUB_TOKEN` name across matrix.py and vtest.yml
    - CI: Explicitly check environment variable against `None` in matrix.py
    - CI: Reformat `matrix.py` using `black`
    - MINOR: config: add environment variables for default log format
    - REGTESTS: Remove REQUIRE_VERSION=1.9 from all tests
    - REGTESTS: Remove REQUIRE_VERSION=2.0 from all tests
    - REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=1.9
    - BUG/MINOR: http-fetch: Only fill txn status during prefetch if not already set
    - BUG/MAJOR: buf: Fix copy of wrapping output data when a buffer is realigned
    - DOC: config: fix alphabetical ordering of http-after-response rules
    - MINOR: http-rules: Add missing actions in http-after-response ruleset
    - DOC: config: remove duplicated "http-response sc-set-gpt0" directive
    - BUG/MINOR: proxy: free orgto_hdr_name in free_proxy()
    - REGTEST: fix the race conditions in json_query.vtc
    - REGTEST: fix the race conditions in add_item.vtc
    - REGTEST: fix the race conditions in digest.vtc
    - REGTEST: fix the race conditions in hmac.vtc
    - BUG/MINOR: fd: avoid bad tgid assertion in fd_delete() from deinit()
    - BUG/MINOR: http: Memory leak of http redirect rules' format string
    - MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters
    - MINOR: stick-table: implement the sc-add-gpc() action
2023-01-07 09:45:17 +01:00
Willy Tarreau
6c0117168e MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters
The number of stick-counter entries usable by track-sc rules is currently
set at build time. There is no good value for this since the vast majority
of users don't need any, most need only a few and rare users need more.
Adding more counters for everyone increases memory and CPU usages for no
reason.

This patch moves the per-session and per-stream arrays to a pool of a size
defined at boot time. This way it becomes possible to set the number of
entries at boot time via a new global setting "tune.stick-counters" that
sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR
value still applies, or 3 if not set, as before.

It is also possible to lower the value to 0 to save a bit of memory if
not used at all.

Note that a few low-level sample-fetch functions had to be protected due
to the ability to use sample-fetches in the global section to set some
variables.
2023-01-06 18:08:49 +01:00
Sébastien Gross
537b9e7f36 MINOR: config: add environment variables for default log format
This patch provides a convenient way to override the default TCP, HTTP
and HTTPS log formats. Instead of having to look into the documentation
to figure out what the appropriate default log format is, three new
environment variables can be used: HAPROXY_TCP_LOG_FMT,
HAPROXY_HTTP_LOG_FMT and HAPROXY_HTTPS_LOG_FMT. Their contents are
substituted verbatim.

These variables are set before parsing the configuration and are unset
just after all configuration files are successfully parsed.

Example:

    # Instead of writing this long log-format line...
    log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC \
                %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r \
                lr=last_rule_file:last_rule_line"

    # ..the HAPROXY_HTTP_LOG_FMT can be used to provide the default
    # http log-format string
    log-format "${HAPROXY_HTTP_LOG_FMT} lr=last_rule_file:last_rule_line"

Please note that nothing prevents users from unsetting the variables or
overriding their content in a global section.

Signed-off-by: Sébastien Gross <sgross@haproxy.com>
2023-01-04 08:23:43 +01:00
Willy Tarreau
284cfc67b8 MINOR: pool: make the thread-local hot cache size configurable
Until now it was only possible to change the thread-local hot cache size
at build time using CONFIG_HAP_POOL_CACHE_SIZE. But during benchmarks,
huge contention was sometimes noticed in the lower-level memory
allocators, indicating that larger caches could be beneficial, especially
on machines whose CPUs have large L2 caches.

Given that the checks against this value are no longer on a hot path,
there is no reason to keep forcing it to be tuned at build time. So this
patch allows it to be set via tune.memory-hot-size.

It's worth noting that during the boot phase the value remains zero so
that it's possible to know if the value was set or not, which opens the
possibility that we try to automatically adjust it based on the per-cpu
L2 cache size or the use of certain protocols (none of this is done yet).
2022-12-20 14:51:12 +01:00
Willy Tarreau
57c3e75d4e CLEANUP: init: remove useless assignment of nbthread
The old test consisting in setting global.nbthread if lower than 1
is useless nowadays since it's already done in check_config_validity().
2022-12-08 08:14:35 +01:00
William Lallemand
e57b702e2b BUG/MEDIUM: mworker: create the mcli_reload socketpairs in case of upgrade
In ticket #1956, it was reported that an upgrade from 2.6 to 2.7 via a
reload would stop the master process.

When upgrading the binary, the new process is considered reexec and does
not try to create the socketpair for the mcli_reload listener, then
tries to bind on -1 since the socket doesn't exist. The failure provokes
an exit() of the master.

This patch fixes the issue by trying to create the mcli_reload sockets
only when they don't exist, instead of creating them at first start.
This way we also avoid possible fd leak since we always try to use the
existing FDs first.

Must be backported in 2.7.
2022-12-07 15:30:52 +01:00
William Lallemand
40db4ae8bb MINOR: mworker: display an alert upon a wait-mode exit
When the mworker wait mode fails it does an exit, but there is no
error message which says it exits.

Add a message which specifies that the error is non-recoverable.

Could be backported to 2.7 and possibly earlier branches.
2022-12-07 15:07:53 +01:00
William Lallemand
151dbbe778 BUG/MINOR: ssl: initialize WolfSSL before parsing
The wolfSSL library needs to be initialized before parsing the
configuration which uses some SSL functions.

To be backported in 2.6.
2022-12-02 17:17:43 +01:00
William Lallemand
44c80ce5b3 BUG/MINOR: ssl: initialize SSL error before parsing
The SSL error initialization needs to be done before the configuration
parsing, because it uses the SSL.

Needs to be backported to 2.6.
2022-12-02 17:10:11 +01:00
Amaury Denoyelle
e30f378236 MINOR: quic: activate socket per conn by default
Activate QUIC connection socket to achieve the best performance. The
previous behavior can be reverted by tune.quic.socket-owner
configuration option.

This change is part of quic-conn owned socket implementation.

Contrary to its sibling patches, I suggest not backporting it to 2.7.
This should ensure that stable releases behavior is preserved. If a user
faces issues with QUIC performance on 2.7, he can nonetheless change the
default configuration.
2022-12-02 14:45:43 +01:00
Ilya Shipitsin
6f86eaae4f CLEANUP: assorted typo fixes in the code and comments
This is the 33rd iteration of typo fixes
2022-11-30 14:02:36 +01:00
Uriah Pollock
3cbf09ed64 MEDIUM: ssl: add minimal WolfSSL support with OpenSSL compatibility mode
This adds a USE_OPENSSL_WOLFSSL option, wolfSSL must be used with the
OpenSSL compatibility layer. This must be used with USE_OPENSSL=1.

WolfSSL build options:

   ./configure --prefix=/opt/wolfssl --enable-haproxy

HAProxy build options:

  USE_OPENSSL=1 USE_OPENSSL_WOLFSSL=1 WOLFSSL_INC=/opt/wolfssl/include/ WOLFSSL_LIB=/opt/wolfssl/lib/ ADDLIB='-Wl,-rpath=/opt/wolfssl/lib'

Using at least the commit 54466b6 ("Merge pull request #5810 from
Uriah-wolfSSL/haproxy-integration") from WolfSSL. (2022-11-23).

This is still to be improved, reg-tests are not supported yet, and more
tests are to be done.

Signed-off-by: William Lallemand <wlallemand@haproxy.org>
2022-11-24 11:29:03 +01:00
Amaury Denoyelle
28ea31c7cb MINOR: global: generate random cluster.secret if not defined
If no cluster-secret is defined by the user, a random one is silently
generated.

This ensures that at least QUIC Retry tokens are generated if abnormal
conditions are detected. However, it is advisable to specify it in the
configuration for tokens to be valid even after a reload or across LBs
instances in the same cluster.

This should be backported up to 2.6.
2022-11-21 16:41:34 +01:00
Remi Tricot-Le Breton
e608b0eb16 BUG/MINOR: ssl: SSL_load_error_strings might not be defined
The SSL_load_error_strings function was marked as deprecated in OpenSSL
1.1.0 so compiling HAProxy with OPENSSL_NO_DEPRECATED set and a recent
OpenSSL library would fail.
The manpages say that this function was replaced by OPENSSL_init_crypto
and OPENSSL_init_ssl which are already called at start up by the SSL
lib. We do not seem to be in a case where explicit call of those
functions is required.

This patch fixes GitHub issue #1813.
It can be backported to 2.6.
2022-11-16 11:09:33 +01:00
Willy Tarreau
e98d385819 MINOR: deinit: add a "quick-exit" option to bypass the deinit step
Once in a while we spot a bug in the deinit code that is complex,
especially when it has to deal with incomplete initializations, and the
ability to bypass this step has regularly been raised. In addition for
fast-reloading setups it could theoretically save some time. Tests have
shown that very large configs can barely save ~100-150ms by skipping the
deinit step. However the ability not to crash if a bug is encountered can
occasionally help.

This patch adds an option to do exactly this. It's obviously not enabled
by default and the documentation discourages from using it, but this might
be useful in the future.
2022-11-15 09:37:09 +01:00
William Lallemand
eba6a54cd4 MINOR: logs: startup-logs can use a shm for logging the reload
When compiled with USE_SHM_OPEN=1 the startup-logs are now able to use
an shm which is used to keep the logs when switching to mworker wait
mode. This allows to keep the failed reload logs.

When allocating the startup-logs at first start of the process, haproxy
will do a shm_open with a unique path using the PID of the process. The
file is unlinked immediately so that no stray file is left behind. The fd
resulting from this shm is stored in the HAPROXY_STARTUPLOGS_FD
environment variable so it can be mmapped again when switching to wait
mode.

When forking children, the process copies the mmap to a mallocated
ring so we never share the same memory section between the master and
the workers. When switching to wait mode, the shm is not used anymore as
it is also copied to a mallocated structure.

This allows using the "show startup-logs" command over the master CLI
to get the logs of the latest startup or reload. This way the logs of
the latest failed reload are also kept.

This is only activated on the linux-glibc target for now.
2022-10-13 16:50:22 +02:00
Willy Tarreau
c06557c23b MINOR: init: do not try to shrink existing RLIMIT_NOFILE
As seen in issue #1866, some environments will not allow to change the
current FD limit, and actually we don't need to do it, we only do it as
a byproduct of adjusting the limit to the one that fits. Here we're
replacing calls to setrlimit() with calls to raise_rlim_nofile(), which
will avoid making the setrlimit() syscall in case the desired value is
lower than the current process' one.

This depends on previous commit "MINOR: fd: add a new function to only
raise RLIMIT_NOFILE" and may need to be backported to 2.6, possibly
earlier, depending on users' experience in such environments.
2022-10-04 08:38:47 +02:00
Amaury Denoyelle
92fa63f735 CLEANUP: quic: create a dedicated quic_conn module
xprt_quic module was too large and did not reflect the true architecture
by contrast to the other protocols in haproxy.

Extract code related to XPRT layer and keep it under xprt_quic module.
This code should only contain a simple API to communicate between QUIC
lower layer and connection/MUX.

The vast majority of the code has been moved into a new module named
quic_conn. This module is responsible for the implementation of QUIC
lower layer. Conceptually, it overlaps with TCP kernel implementation
when comparing QUIC and HTTP1/2 stacks of haproxy.

This should be backported up to 2.6.
2022-10-03 16:25:17 +02:00
Erwan Le Goas
f30c5d7666 MINOR: config: Add option line when the configuration file is dumped
Add an option to dump the line numbers of the configuration file when
it's dumped. Other options can be easily added. Options are separated
by ',' when typing the command line:
'./haproxy -dC[key],line -f [file]'

No backport needed, except if anonymization mechanism is backported.
2022-09-29 10:53:15 +02:00
William Lallemand
56f73b21a5 MINOR: mworker: stores the mcli_reload bind_conf
Stores the mcli_reload bind_conf in order to identify it later.
2022-09-24 15:56:25 +02:00
William Lallemand
21623b5949 MINOR: mworker: mworker_cli_proxy_new_listener() returns a bind_conf
mworker_cli_proxy_new_listener() now returns a bind_conf * or NULL upon
failure.
2022-09-24 15:51:27 +02:00
William Lallemand
68192b2cdf MINOR: mworker: store and shows loading status
The environment variable HAPROXY_LOAD_SUCCESS stores "1" if the process
successfully loaded the configuration and started, "0" otherwise.

The "_loadstatus" master CLI command displays either
"Loading failure!\n" or "Loading success.\n"
2022-09-24 15:44:42 +02:00
William Lallemand
ec059c249e MEDIUM: mworker/cli: keep the connection of the FD that ask for a reload
When using the "reload" command over the master CLI, all connections to
the master CLI were cut, this was unfortunate because it could have been
used to implement a synchronous reload command.

This patch implements an architecture to keep the connection alive after
the reload.

The master CLI is now equipped with a listener which uses a socketpair,
the 2 FDs of this socketpair are stored in the mworker_proc of the
master, which the master keeps via the environment variable.

ipc_fd[1] is used as a listener for the master CLI. During the "reload"
command, the CLI will send the FD of the current session over ipc_fd[0],
then the reload is achieved, so the master won't handle the recv of the
FD. Once reloaded, ipc_fd[1] receives the FD of the session, so the
connection is preserved. Of course it is a new context, so state such
as the "prompt mode" is lost.

Only the FD which performs the reload is kept.
2022-09-22 18:16:19 +02:00
Erwan Le Goas
b0c0501516 MINOR: config: add command-line -dC to dump the configuration file
This commit adds a new command line option -dC to dump the configuration
file. An optional key may be appended to -dC in order to produce an
anonymized dump using this key. The anonymizing process uses the same
algorithm as the CLI so that the same key will produce the same hashes
for the same identifiers. This way an admin may share an anonymized
extract of a configuration to match against live dumps. Note that key 0
will not anonymize the output. However, in any case, the configuration
is dumped after tokenizing, thus comments are lost.
2022-09-17 11:27:09 +02:00
Matthias Wirth
eea152ee68 BUG/MINOR: signals/poller: ensure wakeup from signals
Add self-wake in signal_handler() to fix a race condition with a signal
coming in between checking signal_queue_len and entering polling sleep.

The changes in commit 43c891dda ("BUG/MINOR: signals/poller: set the
poller timeout to 0 when there are signals") were insufficient.

Move the signal_queue_len check from the poll implementations to
run_poll_loop() to keep that logic in one place.

The poll loops are terminated either by the parameter wake being set or
wake up due to a write to their poller_wr_pipe by wake_thread() in
signal_handler().

This fixes issue #1841.

Must be backported in every stable version.
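
The race described above can be illustrated with the classic self-pipe pattern. This is only a minimal sketch under stated assumptions: `wake_pipe`, `pending_signals`, `wake_setup()` and `poll_once()` are hypothetical names standing in for haproxy's per-thread wake pipe, `signal_queue_len` and `run_poll_loop()`, and it is not the code of the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };          /* [0] = read end, [1] = write end */
static volatile sig_atomic_t pending_signals;  /* hypothetical stand-in for signal_queue_len */

/* Signal handler: queue the signal, then write one byte to wake the poller. */
static void on_signal(int sig)
{
    int saved_errno = errno;

    (void)sig;
    pending_signals++;
    if (write(wake_pipe[1], "S", 1) < 0) {
        /* EAGAIN means the pipe is full, so a wakeup is already pending */
    }
    errno = saved_errno;
}

/* Creates the non-blocking pipe and installs the handler. Returns 0 or -1. */
int wake_setup(int sig)
{
    struct sigaction sa;

    if (pipe(wake_pipe) < 0)
        return -1;
    fcntl(wake_pipe[0], F_SETFL, O_NONBLOCK);
    fcntl(wake_pipe[1], F_SETFL, O_NONBLOCK);

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(sig, &sa, NULL);
}

/* One loop iteration. Returns the number of signals found queued. */
int poll_once(int timeout_ms)
{
    struct pollfd pfd = { .fd = wake_pipe[0], .events = POLLIN };
    char buf[64];
    int n;

    if (pending_signals)
        timeout_ms = 0;   /* the earlier fix: do not sleep when signals are queued */

    /* A signal landing exactly here is the race: the check above has passed
     * but poll() has not started yet. The byte written by the handler makes
     * poll() return immediately anyway.
     */
    if (poll(&pfd, 1, timeout_ms) > 0) {
        while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
            ;             /* drain the wakeup bytes */
    }

    /* A real loop would block the signal around this read-and-reset. */
    n = pending_signals;
    pending_signals = 0;
    return n;
}
```

Calling `wake_setup(SIGUSR1)` once and then `poll_once()` in a loop is enough to see that a signal raised at any point shortens the next sleep.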
2022-09-09 11:15:22 +02:00
Willy Tarreau
53bfac8c63 BUG/MEDIUM: master: force the thread count earlier
Christopher bisected that recent commit d0b73bca71 ("MEDIUM: listener:
switch bind_thread from global to group-local") broke the master socket
in that only the first of the N initial connections would work,
where N is the number of threads, after which they all work.

The cause is that the master socket was bound to multiple threads,
despite global.nbthread being 1 there, so the incoming connection load
balancing would try to send incoming connections to non-existing threads,
however the bind_thread mask would nonetheless include multiple threads.

What happened is that in 1.9 we forced "nbthread" to 1 in the master's poll
loop with commit b3f2be338b ("MEDIUM: mworker: use the haproxy poll loop").

In 2.0, nbthread detection was enabled by default in commit 149ab779cc
("MAJOR: threads: enable one thread per CPU by default"). From this point
on, the operation above is unsafe because everything during startup is
performed with nbthread corresponding to the default value, then it
changes to one when starting the polling loop. But by then we weren't
using the wait mode except for reload errors, so even if it would have
happened nobody would have noticed.

In 2.5 with commit fab0fdce9 ("MEDIUM: mworker: reexec in waitpid mode
after successful loading") we started to re-execute all the time, not just
for errors, so as to release precious resources and to possibly spot bugs
that were rarely exposed in this mode. By then the incoming connection LB
was enforcing all_threads_mask on the listener's thread mask so that the
incorrect value was being corrected while using it.

Finally in 2.7 commit d0b73bca71 ("MEDIUM: listener: switch bind_thread
from global to group-local") replaces the all_threads_mask there with
the listener's bind_thread, but that one was never adjusted by the
starting master, whose thread group was filled to N threads by the
automatic detection during early setup.

The best approach here is to set nbthread to 1 very early in init()
when we're in the master in wait mode, so that we don't try to guess
the best value and don't end up with incorrect bindings anymore. This
patch does this and also sets nbtgroups to 1 in preparation for a
possible future where this will also be automatically calculated.

There is no need to backport this patch since no other versions were
affected, but if it were to be discovered that the incorrect bind mask
on some of the master's FDs could be responsible for any trouble in
older versions, then the backport should be safe (provided that
nbtgroups is dropped of course).
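
To make the mask mismatch concrete, here is a small sketch, assuming a 64-bit thread mask. The helper names (`thread_mask()`, `lb_mask_clamped()`, `dead_targets()`) are made up for the illustration and do not exist in haproxy.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a mask with the <nbthread> lowest bits set (1 <= nbthread <= 64). */
uint64_t thread_mask(unsigned int nbthread)
{
    assert(nbthread >= 1 && nbthread <= 64);
    return nbthread == 64 ? ~UINT64_C(0) : (UINT64_C(1) << nbthread) - 1;
}

/* Pre-2.7 behaviour as described above: the listener's mask is ANDed with
 * the mask of running threads, which silently hides a too-wide bind mask.
 */
uint64_t lb_mask_clamped(uint64_t bind_thread, unsigned int running)
{
    return bind_thread & thread_mask(running);
}

/* Counts the bits of <mask> that designate threads which are not running. */
unsigned int dead_targets(uint64_t mask, unsigned int running)
{
    uint64_t dead = mask & ~thread_mask(running);
    unsigned int count = 0;

    for (; dead; dead &= dead - 1)   /* clear the lowest set bit each turn */
        count++;
    return count;
}
```

With a listener bound while 4 threads were auto-detected and a master that then runs a single thread, the clamped mask has no dead target, while the raw bind mask has three. That matches the symptom of only one connection out of N being served.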
2022-07-22 17:51:53 +02:00
Willy Tarreau
41afd9084e BUILD: add detection for unsupported compiler models
As reported in github issue #1765, some people get trapped into building
haproxy and companion libraries on Windows using a compiler following the
LLP64 model. This has no chance to work, and definitely causes nasty bugs
everywhere when pointers are passed as longs. Let's save them time and
detect this at boot time.

The message and detection were factored with the existing one for -fwrapv
since we need the same info and actions.

This should be backported to all recent supported versions (the ones
that are likely to be tried on such platforms when people don't know).
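
The kind of check involved can be sketched as below, assuming the relevant property is simply that a `long` must be able to hold a pointer. The function names are hypothetical and the real detection code may be written differently.

```c
#include <stddef.h>

/* Returns 1 if a pointer can be stored in a long without truncation.
 * LP64 (long and pointers both 8 bytes) and ILP32 (both 4 bytes) pass,
 * LLP64 (4-byte long, 8-byte pointers) does not.
 */
int model_is_supported(size_t long_size, size_t ptr_size)
{
    return long_size >= ptr_size;
}

/* Same check applied to the compiler that built this file. */
int current_model_is_supported(void)
{
    return model_is_supported(sizeof(long), sizeof(void *));
}
```

A program could call `current_model_is_supported()` first thing at boot and refuse to start with an explicit message when it returns 0.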
2022-07-21 09:58:20 +02:00
William Lallemand
d4835a9680 BUG/MEDIUM: mworker: proc_self incorrectly set crashes upon reload
When updating from 2.4 to 2.6, the child->reloads++ instruction changed
place, resulting in a former worker from the 2.4 process, still
identified as a current worker once in 2.6, because its reload counter
is still 0.

Unfortunately this counter is used to choose the mworker_proc structure
that will be used for the new worker.

What happens next, is that the mworker_proc structure of the previous
process is selected, and this one has ipc_fd[1] set to -1, because this
structure was supposed to be in the master.

The process then forks, and mworker_sockpair_register_per_thread() tries
to register ipc_fd[1] which is set to -1, instead of the fd of the new
socketpair.

This patch fixes the issue by checking if child->pid is equal to -1 when
selecting proc_self. This way we could be sure it wasn't a previous
process.

Should fix issue #1785.

This must be backported as far as 2.4 to fix the issue related to the
reload computation difference. However backporting it in every stable
branch will enforce the reload process.
2022-07-21 00:52:43 +02:00
William Lallemand
3b8bafd4a7 MINOR: init: load OpenSSL error strings
Load OpenSSL Error strings in order to be able to output reason strings.

This is mandatory to be able to use ERR_reason_error_string().
2022-07-19 19:13:08 +02:00
Willy Tarreau
c6b596dcce CLEANUP: threads: remove the now unused all_threads_mask and tid_bit
Since these are not used anymore, let's now remove them. Given the
number of places where we're using ti->ltid_bit, maybe an equivalent
might be useful though.
2022-07-15 20:25:41 +02:00
Willy Tarreau
5b09341c02 MEDIUM: cpu-map: replace the process number with the thread group number
The principle remains the same, but instead of having a single process
and ignoring extra ones, now we set the affinity masks for the respective
threads of all groups.

The doc was updated with a few extra examples.
2022-07-15 19:43:10 +02:00
Willy Tarreau
e5715bface MEDIUM: poller: disable thread-groups for poll() and select()
These old legacy pollers are not designed for this. They're still
using a shared list of events for all threads, this will not scale at
all, so there's no point in enabling thread-groups there. Modern
systems have epoll, kqueue or event ports and do not need these ones.

We arrange for failing at boot time, only when thread-groups > 1 so
that existing setups will remain unaffected.

If there's a compelling reason for supporting thread groups with these
pollers in the future, the rework should not be too hard, it would just
consume a lot of memory to have an fd_evts[] array per thread, but that
is doable.
2022-07-15 19:43:10 +02:00
William Lallemand
a46a99e98c MEDIUM: mworker/systemd: send STATUS over sd_notify
The sd_notify API is not able to change the "Active:" line in "systemctl
status". However a message can still be displayed on a "Status: " line,
even if the service is still green and "active (running)".

When startup succeeds the Status will be set to "Ready.", upon a reload
it will be set to "Reloading Configuration." If the reload succeeds it
is set to "Ready." again. However if the reload fails, it will be set to
"Reload failed!".

Keep in mind that the "Active:" line won't change upon a reload failure,
and will still be green.
2022-07-07 14:48:46 +02:00
Willy Tarreau
ad92fdf196 CLEANUP: thread: also remove a thread's bit from stopping_threads on stop
As much as possible we should take care of not leaving bits from stopped
threads in shared thread masks. It can avoid issues like the previous
fix and will also make debugging less confusing.
2022-07-06 10:19:46 +02:00
Willy Tarreau
f34a3fa33d BUG/MEDIUM: thread: mask stopping_threads with threads_enabled when checking it
When soft-stopping, there's a comparison between stopping_threads and
threads_enabled to make sure all threads are stopped, but this is not
correct and is racy since the threads_enabled bit is removed when a
thread is stopped but not its stopping_threads bit. The consequence is
that depending on timing, when stopping, if the first stopping thread
is fast enough to remove its bit from threads_enabled, the other threads
will see that stopping_threads doesn't match threads_enabled anymore and
will wait forever. As such the mask must be applied to stopping_threads
during the test. This issue was introduced in recent commit ef422ced9
("MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups"),
no backport is needed.
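
The difference between the racy test and the masked one can be shown with two tiny predicates. This is a sketch with hypothetical function names, using plain integers instead of haproxy's atomically accessed per-group masks.

```c
/* Racy form: compares the raw masks. Once a stopped thread has dropped its
 * bit from threads_enabled but not from stopping_threads, the two masks can
 * never be equal again.
 */
int all_stopped_racy(unsigned long stopping_threads, unsigned long threads_enabled)
{
    return stopping_threads == threads_enabled;
}

/* Masked form: only the bits of threads that are still enabled are compared. */
int all_stopped_masked(unsigned long stopping_threads, unsigned long threads_enabled)
{
    return (stopping_threads & threads_enabled) == threads_enabled;
}
```

For example, with two threads both marked as stopping (`0x3`) and the first one already removed from the enabled set (`0x2`), the racy form never becomes true whereas the masked form does.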
2022-07-06 10:19:46 +02:00
Willy Tarreau
24cfc9f76e BUG/MEDIUM: thread: check stopping thread against local bit and not global one
Commit ef422ced9 ("MEDIUM: thread: make stopping_threads per-group and add
stopping_tgroups") moved the stopping_threads mask to per-group, but one
test in the loop preserved its global value instead, resulting in stopping
threads never sleeping on stop and eating 100% CPU until all were stopped.

No backport is needed.
2022-07-04 14:09:39 +02:00
Willy Tarreau
291f6ff885 BUG/MEDIUM: threads: fix incorrect thread group being used on soft-stop
Commit 377e37a80 ("MINOR: tinfo: add the mask of enabled threads in each
group") forgot -1 on the tgid, thus the group was not always correctly
tested, which is visible only when running with more than one group. No
backport is needed.
2022-07-04 13:37:31 +02:00
Willy Tarreau
ef422ced91 MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups
Stopping threads need a mask to figure who's still there without scanning
everything in the poll loop. This means this will have to be per-group.
And we also need to have a global stopping groups mask to know what groups
were already signaled. This is used both to figure what thread is the first
one to catch the event, and which one is the first one to detect the end of
the last job. The logic isn't changed, though a loop is required in the
slow path to make sure all threads are aware of the end.

Note that for now the soft-stop still takes time for group IDs > 1 as the
poller is not yet started on these threads and needs to expire its timeout
as there's no way to wake it up. But all threads are eventually stopped.
2022-07-01 19:15:15 +02:00
Willy Tarreau
cce203aae5 MINOR: thread: add a new all_tgroups_mask variable to know about active tgroups
In order to kill all_threads_mask we'll need to have an equivalent for
the thread groups. The all_tgroups_mask does just this, it keeps one bit
set per enabled group.
2022-07-01 19:15:15 +02:00
Willy Tarreau
377e37a80f MINOR: tinfo: add the mask of enabled threads in each group
In order to replace the global "all_threads_mask" we'll need to have an
equivalent per group. Take this opportunity for calling it threads_enabled
and make sure which ones are counted there (in case in the future we allow
to stop some).
2022-07-01 19:15:14 +02:00
Willy Tarreau
e7475c8e79 MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag
Every single place where sleeping_thread_mask was still used was to test
or set a single thread. We can now add a per-thread flag to indicate a
thread is sleeping, and remove this shared mask.

The wake_thread() function now always performs an atomic fetch-and-or
instead of a first load then an atomic OR. That's cleaner and more
reliable.

This is not easy to test, as broadcast FD events are rare. The good
way to test for this is to run a very low rate-limited frontend with
a listener that listens to the fewest possible threads (2), and to
send it only 1 connection at a time. The listener will periodically
pause and the wakeup task will sometimes wake up on a random thread
and will call wake_thread():

   frontend test
        bind :8888 maxconn 10 thread 1-2
        rate-limit sessions 5

Alternately, disabling/enabling a frontend in loops via the CLI also
broadcasts such events, but they're more difficult to observe since
this is causing connection failures.
2022-07-01 19:15:14 +02:00
Willy Tarreau
dce4ad755f MEDIUM: thread: add a new per-thread flag TH_FL_NOTIFIED to remember wakeups
Right now when an inter-thread wakeup happens, we first check if the
thread was asleep, and if so we wake the poller up and remove its bit from
the sleeping mask. That's not very clean since the sleeping mask cannot be
entirely trusted since a thread that's about to wake up will already have
its sleeping bit removed.

This patch adds a new per-thread flag (TH_FL_NOTIFIED) to remember that a
thread was notified to wake up. It's cleared before checking the task lists
last, so that new wakeups can be considered again (since wake_thread() is
only used to notify about task wakeups and FD polling changes). This way
we do not need to modify a remote thread's sleeping mask anymore. As such
wake_thread() now only tests and sets the TH_FL_NOTIFIED flag but doesn't
clear sleeping anymore.
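
A minimal sketch of this test-and-set logic with C11 atomics follows. The flag values, the `struct thread_ctx` layout and the function names are assumptions made for the example, and the actual write to the wake pipe is left to the caller.

```c
#include <stdatomic.h>

#define TH_FL_SLEEPING  0x01u   /* hypothetical values, for illustration only */
#define TH_FL_NOTIFIED  0x02u

struct thread_ctx {
    atomic_uint flags;
};

/* Sets TH_FL_NOTIFIED in a single atomic fetch-and-or. Returns 1 if the
 * caller is the one that must wake the poller (the thread was sleeping and
 * nobody had notified it yet), otherwise 0. The sleeping bit is untouched.
 */
int wake_thread_sketch(struct thread_ctx *th)
{
    unsigned int old = atomic_fetch_or(&th->flags, TH_FL_NOTIFIED);

    return (old & (TH_FL_SLEEPING | TH_FL_NOTIFIED)) == TH_FL_SLEEPING;
}

/* Called by the target thread itself before its last check of the task
 * lists, so that later wakeups are taken into account again.
 */
void clear_notified(struct thread_ctx *th)
{
    atomic_fetch_and(&th->flags, ~TH_FL_NOTIFIED);
}
```

Because the previous flags are returned by the same atomic operation that sets the bit, two concurrent notifiers cannot both decide to wake the poller.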
2022-07-01 19:15:14 +02:00
William Lallemand
0a012aa16b BUG/MEDIUM: mworker: use default maxconn in wait mode
In bug #1751, it was reported that haproxy is consuming too much memory
since the 2.4 version. This is because of a change in the master, which
completely loses its configuration in wait mode, including its maxconn.

Without the maxconn, haproxy will try to compute one itself, and will
allocate RAM accordingly, too much in our case. This means the master
will have a too high maxconn and too much RAM allocated.

The patch fixes the issue by setting the maxconn to the default value
when re-executing the master in wait mode.

Must be backported as far as 2.5.
2022-06-21 14:22:49 +02:00
Frédéric Lécaille
aee675746c MINOR: quic: Clarifications about transport parameters value
It is becoming difficult to distinguish the default values for
transport parameters which come with the RFC from our implementation
default values when not set by configuration (tunable parameters).
Add a comment to distinguish them.
Prefix these default values by QUIC_TP_DFLT_ to distinguish them from
QUIC_DFLT_* values even if they are not numerous.
Furthermore ->max_udp_payload_size must be first initialized to
QUIC_TP_DFLT_MAX_UDP_PAYLOAD_SIZE especially for received value.
2022-05-30 09:59:26 +02:00
Frédéric Lécaille
2674098569 MINOR: quic: Tunable "initial_max_streams_bidi" transport parameter
Add tunable "tune.quic.frontend.max_streams_bidi" setting for QUIC frontends
to set the "initial_max_streams_bidi" transport parameter.
Add some documentation for this new setting.
2022-05-30 09:59:26 +02:00
Frédéric Lécaille
1d96d6e024 MINOR: quic: Tunable "max_idle_timeout" transport parameter
Add two tunable settings for the "max_idle_timeout" QUIC transport
parameter, one for frontends and one for backends:
"tune.quic.frontend.max-idle-timeout" and
"tune.quic.backend.max-idle-timeout" respectively.
cfg_parse_quic_time() has been implemented to parse a time value thanks
to parse_time_err(). It should be reused for any tunable time value to be
parsed.
Add the documentation for this tunable setting only for frontend.
2022-05-30 09:59:26 +02:00
Willy Tarreau
8e5b9589b3 CLEANUP: init: address another coverity warning about a possible multiply overflow
Commit 2cb3be76b ("CLEANUP: init: address a coverity warning about
possible multiply overflow") was incomplete, two other locations were
present. This should address issue #1585.
2022-05-26 08:55:05 +02:00
Willy Tarreau
2cb3be76bf CLEANUP: init: address a coverity warning about possible multiply overflow
In issue #1585 Coverity suspects a risk of multiply overflow when
calculating the SSL cache size, though in practice the cache is
limited to 2^32 anyway thus it cannot really happen. Nevertheless,
casting the operation should be sufficient to avoid marking it as a
false positive.
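
The class of warning, and what a cast changes, can be shown with fixed-width types. This is an illustrative sketch: the function names and the choice of 32-bit operands are assumptions, not the actual SSL cache size computation.

```c
#include <stdint.h>

/* The multiply is done on 32 bits and wraps modulo 2^32 before the result
 * is widened to 64 bits.
 */
uint64_t cache_bytes_wrapping(uint32_t blocks, uint32_t block_size)
{
    return blocks * block_size;
}

/* Casting one operand first makes the whole multiply happen on 64 bits. */
uint64_t cache_bytes_widened(uint32_t blocks, uint32_t block_size)
{
    return (uint64_t)blocks * block_size;
}
```

With 65536 blocks of 65536 bytes, the first form yields 0 on platforms where `int` is 32 bits, while the second yields 2^32.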
2022-05-24 07:46:00 +02:00
Frédéric Lécaille
9286210aa8 MINOR: quic: Add tune.quic.retry-threshold keyword
This QUIC specific keyword may be used to set the threshold, in number of
connection openings, beyond which QUIC Retry feature will be automatically
enabled. Its default value is 100.
2022-05-20 17:11:13 +02:00
Remi Tricot-Le Breton
5194446b76 MEDIUM: ssl: Delay random generator initialization after config parsing
The random generator initialization needs to be performed before the
chroot but it is not needed before. If we want to add provider
configuration option to the configuration file, they need to be
processed before any call to a crypto-related OpenSSL function.
We can then delay the initialization until after the configuration file
is parsed and processed.
2022-05-17 10:55:59 +02:00
Frédéric Lécaille
372508cc42 MINOR: config: Add "cluster-secret" new global keyword
It could be useful to set an ASCII secret which could be used for different
purposes. For instance, it will be used to derive QUIC stateless reset tokens.
2022-05-12 17:48:35 +02:00
William Lallemand
89e236f246 BUG/MINOR: startup: usage() when no -cc arguments
Exit correctly with usage() instead of segfaulting when no argument
was passed to -cc.

Must be backported in 2.5.
2022-05-06 17:22:36 +02:00
William Lallemand
8b9a2df969 MINOR: init: exit() after pre-check upon error
Add a test on the err_code variable so we don't go further if one of the
pre-check callbacks failed.
2022-05-04 14:29:46 +02:00
Willy Tarreau
226866e1bb CLEANUP: deinit: release the config postparsers
These ones were not released either, it just requires to export the list
("postparsers") and it makes valgrind happy.
2022-04-27 18:07:24 +02:00
Willy Tarreau
65009ebde1 CLEANUP: deinit: release the pre-check callbacks
The freeing of pre-check callbacks was missing when this feature was
recently added with commit b53eb8790 ("MINOR: init: add the pre-check
callback"), let's do it to make valgrind happy.
2022-04-27 18:02:54 +02:00
Tim Duesterhus
77b3db0fbd MINOR: Call deinit_and_exit(0) for haproxy -vv
It appears that it is safe to perform a clean deinit at this point, so
let's do this to exercise the deinit paths some more.

Running `valgrind --leak-check=full --show-leak-kinds=all ./haproxy -vv` with
this change reports:

    ==261864== HEAP SUMMARY:
    ==261864==     in use at exit: 344 bytes in 11 blocks
    ==261864==   total heap usage: 1,178 allocs, 1,167 frees, 1,102,089 bytes allocated
    ==261864==
    ==261864== 24 bytes in 1 blocks are still reachable in loss record 1 of 2
    ==261864==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==261864==    by 0x324BA6: hap_register_pre_check (init.c:92)
    ==261864==    by 0x155824: main (haproxy.c:3024)
    ==261864==
    ==261864== 320 bytes in 10 blocks are still reachable in loss record 2 of 2
    ==261864==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==261864==    by 0x26E54E: cfg_register_postparser (cfgparse.c:4238)
    ==261864==    by 0x155824: main (haproxy.c:3024)
    ==261864==
    ==261864== LEAK SUMMARY:
    ==261864==    definitely lost: 0 bytes in 0 blocks
    ==261864==    indirectly lost: 0 bytes in 0 blocks
    ==261864==      possibly lost: 0 bytes in 0 blocks
    ==261864==    still reachable: 344 bytes in 11 blocks
    ==261864==         suppressed: 0 bytes in 0 blocks

which is looking pretty good.
2022-04-27 05:01:27 +02:00
Willy Tarreau
197715ae21 CLEANUP: compression: move the default setting of maxzlibmem to defaults
__comp_fetch_init() only presets the maxzlibmem, and only when both
USE_ZLIB and DEFAULT_MAXZLIBMEM are set. The intent is to preset a
default value to protect the system against excessive memory usage
when no setting is set by the user.

Nowadays the entry in the global struct is always there so there's no
point anymore in passing via a constructor to possibly set this value.
Let's go the cleaner way by always presetting DEFAULT_MAXZLIBMEM to 0
in defaults.h unless these conditions are met, and always assigning it
instead of pre-setting the entry to zero. This is more straightforward
and removes some ifdefs and the last constructor. In addition, now the
setting has a chance of being found.
2022-04-25 19:42:43 +02:00
Willy Tarreau
2df1fbf816 MINOR: init: add global setting "fd-hard-limit" to bound system limits
On some systems, the hard limit for ulimit -n may be huge, in the order
of 1 billion, and using this to automatically compute maxconn doesn't
work as it requires way too much memory. Users tend to hard-code maxconn
but that's not convenient to manage deployments on heterogenous systems,
nor when porting configs to developers' machines. The ulimit-n parameter
doesn't work either because it forces the limit. What most users seem to
want (and it makes sense) is to respect the system imposed limits up to
a certain value and cap this value. This is exactly what fd-hard-limit
does.

This addresses github issue #1622.
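
The capping rule described above amounts to taking the smaller of the system limit and the configured bound. A minimal sketch, assuming a value of 0 means "no fd-hard-limit configured"; the function names are hypothetical and this is not haproxy's own code.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>

/* Respects the system hard limit, but never goes above <fd_hard_limit>
 * when one is set (0 = unset, an assumption of this sketch).
 */
rlim_t capped_fd_limit(rlim_t sys_hard_limit, rlim_t fd_hard_limit)
{
    if (fd_hard_limit && sys_hard_limit > fd_hard_limit)
        return fd_hard_limit;
    return sys_hard_limit;
}

/* Reads the RLIMIT_NOFILE hard limit of the current process and caps it.
 * Returns 0 if the limit cannot be read.
 */
rlim_t usable_fd_limit(rlim_t fd_hard_limit)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;
    return capped_fd_limit(lim.rlim_max, fd_hard_limit);
}
```

On a system reporting one billion file descriptors, a bound of 50000 yields 50000, while a system limited to 1024 stays at 1024. This differs from forcing a value the way ulimit-n does.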
2022-04-25 18:04:49 +02:00
William Lallemand
b53eb8790e MINOR: init: add the pre-check callback
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create things like it would have been done during the configuration
parsing and where the initialization should continue in the configuration
check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
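
The registration pattern described here can be sketched as a singly linked list of callbacks that is walked once before the configuration checks. All names below are hypothetical (the real registration function has its own prototype), and the return convention follows the description above: non-zero on success, zero on failure.

```c
#include <stdlib.h>

struct pre_check_fct {
    struct pre_check_fct *next;
    int (*fct)(void);
};

static struct pre_check_fct *pre_check_head;
static struct pre_check_fct **pre_check_tail = &pre_check_head;

/* Appends <fct> to the list. Returns 1 on success, 0 on allocation failure. */
int register_pre_check(int (*fct)(void))
{
    struct pre_check_fct *b = calloc(1, sizeof(*b));

    if (!b)
        return 0;
    b->fct = fct;
    *pre_check_tail = b;
    pre_check_tail = &b->next;
    return 1;
}

/* Runs the callbacks in registration order. Returns 1 if all of them
 * succeeded, or 0 as soon as one fails, without running the remaining ones.
 */
int run_pre_checks(void)
{
    const struct pre_check_fct *b;

    for (b = pre_check_head; b; b = b->next)
        if (!b->fct())
            return 0;
    return 1;
}

/* Releases the list, as a deinit path would do to keep leak checkers quiet. */
void free_pre_checks(void)
{
    while (pre_check_head) {
        struct pre_check_fct *next = pre_check_head->next;

        free(pre_check_head);
        pre_check_head = next;
    }
    pre_check_tail = &pre_check_head;
}

/* Two demo callbacks used only to exercise the list. */
static int demo_calls;
int demo_ok(void)   { demo_calls++; return 1; }
int demo_fail(void) { demo_calls++; return 0; }
```

A caller would typically exit right after `run_pre_checks()` returns 0, which is the behaviour the message describes.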
2022-04-22 15:45:47 +02:00
Amaury Denoyelle
97e84c6c69 MINOR: cfg-quic: define tune.quic.conn-buf-limit
Add a new global configuration option to set the limit of buffers per
QUIC connection. By default, this value is set to 30.
2022-04-21 12:04:04 +02:00
Remi Tricot-Le Breton
b5d968d9b2 MEDIUM: global: Add a "close-spread-time" option to spread soft-stop on time window
The new 'close-spread-time' global option can be used to spread idle and
active HTTP connection closing after a SIGUSR1 signal is received. This
allows to limit bursts of reconnections when too many idle connections
are closed at once. Indeed, without this new mechanism, in case of
soft-stop, all the idle connections would be closed at once (after the
grace period is over), and all active HTTP connections would be closed
by appending a "Connection: close" header to the next response that goes
over it (or via a GOAWAY frame in case of HTTP2).

This patch adds the support of this new option for HTTP as well as HTTP2
connections. It works differently on active and idle connections.

On active connections, instead of sending systematically the GOAWAY
frame or adding the 'Connection: close' header like before once the
soft-stop has started, a random based on the remainder of the close
window is calculated, and depending on its result we could decide to
keep the connection alive. The random will be recalculated for any
subsequent request/response on this connection so the GOAWAY will still
end up being sent, but we might wait a few more round trips. This will
ensure that goaways are distributed along a longer time window than
before.

On idle connections, a random factor is used when determining the expire
field of the connection's task, which should naturally spread connection
closings on the time window (see h2c_update_timeout).

This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on "BUG/MEDIUM:
mux-h2: make use of http-request and keep-alive timeouts" which
refactored the timeout management of HTTP2 connections.
2022-04-08 18:15:21 +02:00
Willy Tarreau
29d799d591 MINOR: sample: list registered sample converter functions
Similar to the sample fetch keywords, let's also list the converter
keywords. They're much simpler since there's no compatibility matrix.
Instead the input and output types are listed. This is called by
dump_registered_keywords() for the "cnv" keywords class.
2022-03-29 18:01:37 +02:00
Willy Tarreau
f78813f74f MINOR: samples: add a function to list register sample fetch keywords
New function smp_dump_fetch_kw lists registered sample fetch keywords
with their compatibility matrix, mandatory and optional argument types,
and output types. It's called from dump_registered_keywords() with class
"smp".
2022-03-29 18:01:37 +02:00
Willy Tarreau
6ff7d1b9a5 MINOR: acl: add a function to dump the list of known ACL keywords
New function acl_dump_kwd() dumps the registered ACL keywords and their
sample-fetch equivalent to stdout. It's called by dump_registered_keywords()
for keyword class "acl".
2022-03-29 18:01:37 +02:00
Willy Tarreau
06d0e2e034 MINOR: cli: add a new keyword dump function
New function cli_list_keywords() scans the list of registered CLI keywords
and dumps them on stdout. It's now called from dump_registered_keywords()
for the class "cli".

Some keywords are valid for the master, they'll be suffixed with
"[MASTER]". Others are valid for the worker, they'll have "[WORKER]".
Those accessible only in expert mode will show "[EXPERT]" and the
experimental ones will show "[EXPERIM]".
2022-03-29 18:01:37 +02:00
Willy Tarreau
5fcc100d91 MINOR: services: extend list_services() to dump to stdout
When no output stream is passed, stdout is used with one entry per line,
and this is called from dump_registered_keywords() when passed the class
"svc".
2022-03-29 18:01:37 +02:00
Willy Tarreau
3b65e14842 MINOR: filters: extend flt_dump_kws() to dump to stdout
When passing a NULL output buffer the function will now dump to stdout
with a more compact format that is more suitable for machine processing.

An entry was added to dump_registered_keywords() to call it when the
keyword class "flt" is requested.
2022-03-29 18:01:37 +02:00
Willy Tarreau
ca1acd6080 MINOR: config: add a function to dump all known config keywords
All registered config keywords that are valid in the config parser are
dumped to stdout organized like the regular sections (global, listen,
etc). Some keywords that are known to only be valid in frontends or
backends will be suffixed with [FE] or [BE].

All regularly registered "bind" and "server" keywords are also dumped,
one per "bind" or "server" line. Those depending on ssl are listed after
the "ssl" keyword. Doing so required to export the listener and server
keyword lists that were static.

The function is called from dump_registered_keywords() for keyword
class "cfg".
2022-03-29 18:01:32 +02:00
Willy Tarreau
76871a4f8c MINOR: management: add some basic keyword dump infrastructure
It's difficult from outside haproxy to detect the supported keywords
and syntax. Interestingly, many of our modern keywords are enumerated
since they're registered from constructors, so it's not very hard to
enumerate most of them.

This patch creates some basic infrastructure to support dumping existing
keywords from different classes on stdout. The format will differ depending
on the classes, but the idea is that the output could easily be passed to
a script that generates some simple syntax highlighting rules, completion
rules for editors, syntax checkers or config parsers.

The principle chosen here is that if "-dK" is passed on the command-line,
at the end of the parsing the registered keywords will be dumped for the
requested classes passed after "-dK". Special name "help" will show known
classes, while "all" will execute all of them. The reason for doing that
after the end of the config processor is that it will also enumerate
internally-generated keywords, Lua or even those loaded from external
code (e.g. if an add-on is loaded using LD_PRELOAD). A typical way to
call this with a valid config would be:

    ./haproxy -dKall -q -c -f /path/to/config

If there's no config available, feeding /dev/null will also do the job,
though it will not be able to detect dynamically created keywords, of
course.

This patch also updates the management doc.

For now nothing but the help is listed, various subsystems will follow
in subsequent patches.
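
The dispatch principle (a table of classes, plus the special names "help" and "all") might look like the sketch below. It assumes the classes are given as a comma-separated list. The table contents, the demo dump functions and `dump_keyword_classes()` are invented for the example and are not haproxy's implementation.

```c
#include <stdio.h>
#include <string.h>

struct kw_class {
    const char *name;
    void (*dump)(FILE *out);
};

/* Demo dumpers standing in for the per-subsystem functions. */
static void dump_cfg_demo(FILE *out) { fputs("cfg demo-keyword\n", out); }
static void dump_acl_demo(FILE *out) { fputs("acl demo-acl\n", out); }

static const struct kw_class kw_classes[] = {
    { "cfg", dump_cfg_demo },
    { "acl", dump_acl_demo },
};

/* Handles a list such as "cfg,acl", "all" or "help". Returns the number of
 * classes dumped, or -1 on an unknown or empty class name.
 */
int dump_keyword_classes(const char *list, FILE *out)
{
    const size_t nb = sizeof(kw_classes) / sizeof(kw_classes[0]);
    int dumped = 0;

    while (*list) {
        size_t len = strcspn(list, ",");
        int is_all  = (len == 3 && memcmp(list, "all", 3) == 0);
        int is_help = (len == 4 && memcmp(list, "help", 4) == 0);
        int matched = is_help;
        size_t i;

        for (i = 0; i < nb; i++) {
            const char *name = kw_classes[i].name;

            if (is_help) {
                fprintf(out, "%s\n", name);   /* only list the known classes */
            } else if (is_all || (strlen(name) == len && memcmp(list, name, len) == 0)) {
                kw_classes[i].dump(out);
                dumped++;
                matched = 1;
            }
        }
        if (!matched)
            return -1;
        list += len;
        if (*list == ',')
            list++;
    }
    return dumped;
}
```

Adding a subsystem then only requires a new table entry, which is in the spirit of the later patches that each plug one class into the dump function.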
2022-03-29 17:55:54 +02:00
Willy Tarreau
edd426871f DEBUG: move the tainted stuff to bug.h for easier inclusion
The functions needed to manipulate the "tainted" flags were located in
too high a level to be callable from the lower code layers. Let's move
them to bug.h.
2022-02-25 11:55:38 +01:00
Willy Tarreau
9b4a0e6bac BUG/MINOR: debug: fix get_tainted() to properly read an atomic value
get_tainted() was using an atomic store from the atomic value to a
local one instead of using an atomic load. In practice it has no effect
given the relatively rare updates of this field and the fact that it's
read only when dumping "show info" output, but better fix it.

There's probably no need to backport this.
2022-02-25 11:54:30 +01:00
Willy Tarreau
f4b79c4a01 MINOR: pools: support setting debugging options using -dM
The 9 currently available debugging options may now be checked, set, or
cleared using -dM. The directive now takes a comma-delimited list of
options after the optional poisoning byte. With "help", the list of
available options is displayed with a short help and their current
status.

The management doc was updated.
2022-02-23 17:28:41 +01:00
Willy Tarreau
1408b1f8be MINOR: pools: delegate parsing of command line option -dM to a new function
New function pool_parse_debugging() is now dedicated to parsing options
of -dM. For now it only handles the optional memory poisoning byte, but
the function may already return an informative message to be printed for
help, a warning or an error. This way we'll reuse it for the settings
that will be needed for configurable debugging options.
2022-02-23 17:28:41 +01:00
Willy Tarreau
18f96d02d3 MEDIUM: init: handle arguments earlier
The argument parser runs too late, we'll soon need it before creating
pools, hence just after init_early(). No visible change is expected but
this part is sensitive enough to be placed into its own commit for easier
bisection later if needed.
2022-02-23 17:28:41 +01:00
Willy Tarreau
392524d222 MINOR: init: extract args parsing to their own function
The cmdline argument parsing was performed quite late, which prevents
from retrieving elements that can be used to initialize the pools and
certain sensitive areas. The goal is to improve this by parsing command
line arguments right after the early init stage. This is possible
because the cmdline parser already does very little beyond retrieving
config elements that are used later.

Doing so requires to move the parser code to a separate function and
to externalize a few variables out of the function as they're used
later in the boot process, in the original function.

This patch creates init_args() but doesn't move it upfront yet, it's
still executed just before init(), which essentially corresponds to
what was done before (only the trash buffers, ACLs and Lua were
initialized earlier and are not needed for this).

The rest is not modified and as expected no change is observed.

Note that the diff doesn't do justice to the change as it makes it
look like the early init() code was moved to a new function after
the function was renamed, while in fact it's clearly the parser
itself which moved.
2022-02-23 17:11:33 +01:00
Willy Tarreau
34527d5354 MEDIUM: init: split the early initialization in its own function
There are some delicate chicken-and-egg situations in the initialization
code, because the init() function currently does way too much (it goes
as far as parsing the config) and due to this it must be started very
late. But it's also in charge of initializing a number of variables that
are needed in early boot (e.g. hostname/pid for error reporting, or
entropy for random generators).

This patch carefully extracts all the early code that depends on
absolutely nothing, and places it immediately after the STG_LOCK init
stage. The only possible failures at this stage are allocation
errors and they continue to provoke an immediate exit().

Some environment variables, hostname, date, pid etc are retrieved at
this stage. The program's arguments are also copied there since they're
needed to be kept intact for the master process.
2022-02-23 17:11:33 +01:00
Willy Tarreau
3ebe4d989c MEDIUM: initcall: move STG_REGISTER earlier
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.

This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.
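
To picture why the position of a stage matters, here is a minimal sketch of a
staged init runner. Only the three stage names come from the text above;
register_initcall(), run_initcalls(), the table size and the trace buffer are
hypothetical helpers for the example and are not HAProxy's initcall macros.

```c
#include <stddef.h>

/* Only the three stages named above, in their new order. The real list
 * contains more stages; STG_COUNT is just a sentinel for this sketch.
 */
enum init_stage { STG_LOCK, STG_REGISTER, STG_ALLOC, STG_COUNT };

#define MAX_INITCALLS 8 /* arbitrary capacity for the sketch */

struct initcall {
	enum init_stage stage;
	void (*fct)(void);
};

static struct initcall initcalls[MAX_INITCALLS];
static size_t nb_initcalls;

/* Records the order in which the callbacks actually ran. */
static char trace[MAX_INITCALLS + 1];
static size_t trace_len;

static void note(char c)
{
	if (trace_len < MAX_INITCALLS)
		trace[trace_len++] = c;
}

static void do_lock(void)     { note('L'); }
static void do_register(void) { note('R'); }
static void do_alloc(void)    { note('A'); }

/* Stores a callback for a given stage. Returns 0, or -1 if the table is full. */
static int register_initcall(enum init_stage stage, void (*fct)(void))
{
	if (nb_initcalls >= MAX_INITCALLS)
		return -1;
	initcalls[nb_initcalls].stage = stage;
	initcalls[nb_initcalls].fct = fct;
	nb_initcalls++;
	return 0;
}

/* Runs all callbacks stage by stage, whatever their registration order. */
static void run_initcalls(void)
{
	for (int stage = STG_LOCK; stage < STG_COUNT; stage++)
		for (size_t i = 0; i < nb_initcalls; i++)
			if ((int)initcalls[i].stage == stage)
				initcalls[i].fct();
}
```

Registering do_alloc, do_register and do_lock in that order and then calling
run_initcalls() still runs them as lock, register, alloc, so anything
registered at STG_REGISTER is known before the allocation stage starts.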
2022-02-23 17:11:33 +01:00
Willy Tarreau
ef301b7556 MINOR: pools: add a debugging flag for memory poisonning option
Now -dM will set POOL_DBG_POISON for consistency with the rest of the
pool debugging options. As such now we only check for the new flag,
which allows the default value to be preset.
2022-02-23 17:11:33 +01:00
Willy Tarreau
b61fccdc3f CLEANUP: init: remove the ifdef on HAPROXY_MEMMAX
It's ugly, let's move it to defaults.h with all other ones and preset
it to zero if not defined.
2022-02-23 17:11:33 +01:00
Willy Tarreau
cc0d554e5f CLEANUP: vars: move the per-process variables initialization to vars.c
There's no point keeping the vars_init_head() call in init() when we
already have a vars_init() registered at the right time to do that,
and it complexifies the boot sequence, so let's move it there.
2022-02-23 17:11:33 +01:00
William Lallemand
7b820a6191 BUG/MINOR: mworker: does not erase the pidfile upon reload
When started in master-worker mode combined with daemon mode, HAProxy
will open() with O_TRUNC the pidfile when switching to wait mode.

In 2.5, it happens every time after trying to load the configuration,
since we switch to wait mode.

In previous versions this happens upon a failure of the configuration
loading.

Fixes bug #1545.

Must be backported to every supported branch.
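
The effect of O_TRUNC can be reproduced with the small sketch below. It is not
the actual fix: open_pidfile() and its wait_mode argument are hypothetical, and
the sketch simply assumes that the file must be left alone in wait mode.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens the pidfile for writing, or returns -1 without touching it when
 * <wait_mode> is set. <wait_mode> is a hypothetical flag for this sketch.
 */
static int open_pidfile(const char *path, int wait_mode)
{
	if (wait_mode)
		return -1; /* keep the PIDs written by the previous run */

	/* O_TRUNC empties the file as soon as open() succeeds */
	return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}

/* Returns the size of the file in bytes, or -1 if it cannot be opened. */
static long file_size(const char *path)
{
	int fd = open(path, O_RDONLY);
	off_t len;

	if (fd < 0)
		return -1;
	len = lseek(fd, 0, SEEK_END);
	close(fd);
	return (long)len;
}
```

With a pidfile that already holds a PID, open_pidfile(path, 1) leaves its size
unchanged, while open_pidfile(path, 0) brings it down to zero before anything
is written.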
2022-02-14 09:28:13 +01:00
Willy Tarreau
2454d6ef5b [RELEASE] Released version 2.6-dev1
Released version 2.6-dev1 with the following main changes :
    - BUG/MINOR: cache: Fix loop on cache entries in "show cache"
    - BUG/MINOR: httpclient: allow to replace the host header
    - BUG/MINOR: lua: don't expose internal proxies
    - MEDIUM: mworker: seamless reload use the internal sockpairs
    - BUG/MINOR: lua: remove loop initial declarations
    - BUG/MINOR: mworker: does not add the -sf in wait mode
    - BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
    - MINOR: quic: do not reject PADDING followed by other frames
    - REORG: quic: add comment on rare thread concurrence during CID alloc
    - CLEANUP: quic: add comments on CID code
    - MEDIUM: quic: handle CIDs to rattach received packets to connection
    - MINOR: qpack: support litteral field line with non-huff name
    - MINOR: quic: activate QUIC traces at compilation
    - MINOR: quic: use more verbose QUIC traces set at compile-time
    - MEDIUM: pool: refactor malloc_trim/glibc and jemalloc api addition detections.
    - MEDIUM: pool: support purging jemalloc arenas in trim_all_pools()
    - BUG/MINOR: mworker: deinit of thread poller was called when not initialized
    - BUILD: pools: only detect link-time jemalloc on ELF platforms
    - CI: github actions: add the output of $CC -dM -E-
    - BUG/MEDIUM: cli: Properly set stream analyzers to process one command at a time
    - BUILD: evports: remove a leftover from the dead_fd cleanup
    - MINOR: quic: Set "no_application_protocol" alert
    - MINOR: quic: More accurate immediately close.
    - MINOR: quic: Immediately close if no transport parameters extension found
    - MINOR: quic: Rename qc_prep_hdshk_pkts() to qc_prep_pkts()
    - MINOR: quic: Possible crash when inspecting the xprt context
    - MINOR: quic: Dynamically allocate the secrete keys
    - MINOR: quic: Add a function to derive the key update secrets
    - MINOR: quic: Add structures to maintain key phase information
    - MINOR: quic: Optional header protection key for quic_tls_derive_keys()
    - MINOR: quic: Add quic_tls_key_update() function for Key Update
    - MINOR: quic: Enable the Key Update process
    - MINOR: quic: Delete the ODCIDs asap
    - BUG/MINOR: vars: Fix the set-var and unset-var converters
    - MEDIUM: pool: Following up on previous pool trimming update.
    - BUG/MEDIUM: mux-h1: Fix splicing by properly detecting end of message
    - BUG/MINOR: mux-h1: Fix splicing for messages with unknown length
    - MINOR: mux-h1: Improve H1 traces by adding info about http parsers
    - MINOR: mux-h1: register a stats module
    - MINOR: mux-h1: add counters instance to h1c
    - MINOR: mux-h1: count open connections/streams on stats
    - MINOR: mux-h1: add stat for total count of connections/streams
    - MINOR: mux-h1: add stat for total amount of bytes received and sent
    - REGTESTS: h1: Add a script to validate H1 splicing support
    - BUG/MINOR: server: Don't rely on last default-server to init server SSL context
    - BUG/MEDIUM: resolvers: Detach query item on response error
    - MEDIUM: resolvers: No longer store query items in a list into the response
    - BUG/MAJOR: segfault using multiple log forward sections.
    - BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted
    - BUG/MINOR: resolvers: Don't overwrite the error for invalid query domain name
    - BUILD: bug: Fix error when compiling with -DDEBUG_STRICT_NOCRASH
    - BUG/MEDIUM: sample: Fix memory leak in sample_conv_jwt_member_query
    - DOC: spoe: Clarify use of the event directive in spoe-message section
    - DOC: config: Specify %Ta is only available in HTTP mode
    - BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types
    - IMPORT: slz: use the correct CRC32 instruction when running in 32-bit mode
    - MINOR: quic: fix segfault on CONNECTION_CLOSE parsing
    - MINOR: h3: add BUG_ON on control receive function
    - MEDIUM: xprt-quic: finalize app layer initialization after ALPN nego
    - MINOR: h3: remove duplicated FIN flag position
    - MAJOR: mux-quic: implement a simplified mux version
    - MEDIUM: mux-quic: implement release mux operation
    - MEDIUM: quic: detect the stream FIN
    - MINOR: mux-quic: implement subscribe on stream
    - MEDIUM: mux-quic: subscribe on xprt if remaining data after send
    - MEDIUM: mux-quic: wake up xprt on data transferred
    - MEDIUM: mux-quic: handle when sending buffer is full
    - MINOR: quic: RX buffer full due to wrong CRYPTO data handling
    - MINOR: quic: Race issue when consuming RX packets buffer
    - MINOR: quic: QUIC encryption level RX packets race issue
    - MINOR: quic: Delete remaining RX handshake packets
    - MINOR: quic: Remove QUIC TX packet length evaluation function
    - MINOR: hq-interop: fix tx buffering
    - MINOR: mux-quic: remove uneeded code to check fin on TX
    - MINOR: quic: add HTX EOM on request end
    - BUILD: mux-quic: fix compilation with DEBUG_MEM_STATS
    - MINOR: http-rules: Add capture action to http-after-response ruleset
    - BUG/MINOR: cli/server: Don't crash when a server is added with a custom id
    - MINOR: mux-quic: do not release qcs if there is remaining data to send
    - MINOR: quic: notify the mux on CONNECTION_CLOSE
    - BUG/MINOR: mux-quic: properly initialize flow control
    - MINOR: quic: Compilation fix for quic_rx_packet_refinc()
    - MINOR: h3: fix possible invalid dereference on htx parsing
    - DOC: config: retry-on list is space-delimited
    - DOC: config: fix error-log-format example
    - BUG/MEDIUM: mworker/cli: crash when trying to access an old PID in prompt mode
    - MINOR: hq-interop: refix tx buffering
    - REGTESTS: ssl: use X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY for cert check
    - MINOR: cli: "show version" displays the current process version
    - CLEANUP: cfgparse: modify preprocessor guards around numa detection code
    - MEDIUM: cfgparse: numa detect topology on FreeBSD.
    - BUILD: ssl: unbreak the build with newer libressl
    - MINOR: vars: Move UPDATEONLY flag test to vars_set_ifexist
    - MINOR: vars: Set variable type to ANY upon creation
    - MINOR: vars: Delay variable content freeing in var_set function
    - MINOR: vars: Parse optional conditions passed to the set-var converter
    - MINOR: vars: Parse optional conditions passed to the set-var actions
    - MEDIUM: vars: Enable optional conditions to set-var converter and actions
    - DOC: vars: Add documentation about the set-var conditions
    - REGTESTS: vars: Add new test for conditional set-var
    - MINOR: quic: Attach timer task to thread for the connection.
    - CLEANUP: quic_frame: Remove a useless suffix to STOP_SENDING
    - MINOR: quic: Add traces for STOP_SENDING frame and modify others
    - CLEANUP: quic: Remove cdata_len from quic_tx_packet struct
    - MINOR: quic: Enable TLS 0-RTT if needed
    - MINOR: quic: No TX secret at EARLY_DATA encryption level
    - MINOR: quic: Add quic_set_app_ops() function
    - MINOR: ssl_sock: Set the QUIC application from ssl_sock_advertise_alpn_protos.
    - MINOR: quic: Make xprt support 0-RTT.
    - MINOR: qpack: Missing check for truncated QPACK fields
    - CLEANUP: quic: Comment fix for qc_strm_cpy()
    - MINOR: hq_interop: Stop BUG_ON() truncated streams
    - MINOR: quic: Do not mix packet number space and connection flags
    - CLEANUP: quic: Shorten a litte bit the traces in lstnr_rcv_pkt()
    - MINOR: mux-quic: fix trace on stream creation
    - CLEANUP: quic: fix spelling mistake in a trace
    - CLEANUP: quic: rename quic_conn conn to qc in quic_conn_free
    - MINOR: quic: add missing lock on cid tree
    - MINOR: quic: rename constant for haproxy CIDs length
    - MINOR: quic: refactor concat DCID with address for Initial packets
    - MINOR: quic: compare coalesced packets by DCID
    - MINOR: quic: refactor DCID lookup
    - MINOR: quic: simplify the removal from ODCID tree
    - REGTESTS: vars: Remove useless ssl tunes from conditional set-var test
    - MINOR: ssl: Remove empty lines from "show ssl ocsp-response" output
    - MINOR: quic: Increase the RX buffer for each connection
    - MINOR: quic: Add a function to list remaining RX packets by encryption level
    - MINOR: quic: Stop emptying the RX buffer asap.
    - MINOR: quic: Do not expect to receive only one O-RTT packet
    - MINOR: quic: Do not forget STREAM frames received in disorder
    - MINOR: quic: Wrong packet refcount handling in qc_pkt_insert()
    - DOC: fix misspelled keyword "resolve_retries" in resolvers
    - CLEANUP: quic: rename quic_conn instances to qc
    - REORG: quic: move mux function outside of xprt
    - MINOR: quic: add reference to quic_conn in ssl context
    - MINOR: quic: add const qualifier for traces function
    - MINOR: trace: add quic_conn argument definition
    - MINOR: quic: use quic_conn as argument to traces
    - MINOR: quic: add quic_conn instance in traces for qc_new_conn
    - MINOR: quic: Add stream IDs to qcs_push_frame() traces
    - MINOR: quic: unchecked qc_retrieve_conn_from_cid() returned value
    - MINOR: quic: Wrong dropped packet skipping
    - MINOR: quic: Handle the cases of overlapping STREAM frames
    - MINOR: quic: xprt traces fixes
    - MINOR: quic: Drop asap Retry or Version Negotiation packets
    - MINOR: pools: work around possibly slow malloc_trim() during gc
    - DEBUG: ssl: make sure we never change a servername on established connections
    - MINOR: quic: Add traces for RX frames (flow control related)
    - MINOR: quic: Add CONNECTION_CLOSE phrase to trace
    - REORG: quic: remove qc_ prefix on functions which not used it directly
    - BUG/MINOR: quic: upgrade rdlock to wrlock for ODCID removal
    - MINOR: quic: remove unnecessary call to free_quic_conn_cids()
    - MINOR: quic: store ssl_sock_ctx reference into quic_conn
    - MINOR: quic: remove unnecessary if in qc_pkt_may_rm_hp()
    - MINOR: quic: replace usage of ssl_sock_ctx by quic_conn
    - MINOR: quic: delete timer task on quic_close()
    - MEDIUM: quic: implement refcount for quic_conn
    - BUG/MINOR: quic: fix potential null dereference
    - BUG/MINOR: quic: fix potential use of uninit pointer
    - BUG/MEDIUM: backend: fix possible sockaddr leak on redispatch
    - BUG/MEDIUM: peers: properly skip conn_cur from incoming messages
    - CI: Github Actions: do not show VTest failures if build failed
    - BUILD: opentracing: display warning in case of using OT_USE_VARS at compile time
    - MINOR: compat: detect support for dl_iterate_phdr()
    - MINOR: debug: add ability to dump loaded shared libraries
    - MINOR: debug: add support for -dL to dump library names at boot
    - BUG/MEDIUM: ssl: initialize correctly ssl w/ default-server
    - REGTESTS: ssl: fix ssl_default_server.vtc
    - BUG/MINOR: ssl: free the fields in srv->ssl_ctx
    - BUG/MEDIUM: ssl: free the ckch instance linked to a server
    - REGTESTS: ssl: update of a crt with server deletion
    - BUILD/MINOR: cpuset FreeBSD 14 build fix.
    - MINOR: pools: always evict oldest objects first in pool_evict_from_local_cache()
    - DOC: pool: document the purpose of various structures in the code
    - CLEANUP: pools: do not use the extra pointer to link shared elements
    - CLEANUP: pools: get rid of the POOL_LINK macro
    - MINOR: pool: allocate from the shared cache through the local caches
    - CLEANUP: pools: group list updates in pool_get_from_cache()
    - MINOR: pool: rely on pool_free_nocache() in pool_put_to_shared_cache()
    - MINOR: pool: make pool_is_crowded() always true when no shared pools are used
    - MINOR: pool: check for pool's fullness outside of pool_put_to_shared_cache()
    - MINOR: pool: introduce pool_item to represent shared pool items
    - MINOR: pool: add a function to estimate how many may be released at once
    - MEDIUM: pool: compute the number of evictable entries once per pool
    - MINOR: pools: prepare pool_item to support chained clusters
    - MINOR: pools: pass the objects count to pool_put_to_shared_cache()
    - MEDIUM: pools: centralize cache eviction in a common function
    - MEDIUM: pools: start to batch eviction from local caches
    - MEDIUM: pools: release cached objects in batches
    - OPTIM: pools: reduce local pool cache size to 512kB
    - CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes
    - CI: github actions: update OpenSSL to 3.0.1
    - BUILD/MINOR: tools: solaris build fix on dladdr.
    - BUG/MINOR: cli: fix _getsocks with musl libc
    - BUG/MEDIUM: http-ana: Preserve response's FLT_END analyser on L7 retry
    - MINOR: quic: Wrong traces after rework
    - MINOR: quic: Add trace about in flight bytes by packet number space
    - MINOR: quic: Wrong first packet number space computation
    - MINOR: quic: Wrong packet number space computation for PTO
    - MINOR: quic: Wrong loss time computation in qc_packet_loss_lookup()
    - MINOR: quic: Wrong ack_delay compution before calling quic_loss_srtt_update()
    - MINOR: quic: Remove nb_pto_dgrams quic_conn struct member
    - MINOR: quic: Wrong packet number space trace in qc_prep_pkts()
    - MINOR: quic: Useless test in qc_prep_pkts()
    - MINOR: quic: qc_prep_pkts() code moving
    - MINOR: quic: Speeding up Handshake Completion
    - MINOR: quic: Probe Initial packet number space more often
    - MINOR: quic: Probe several packet number space upon timer expiration
    - MINOR: quic: Comment fix.
    - MINOR: quic: Improve qc_prep_pkts() flexibility
    - MINOR: quic: Do not drop secret key but drop the CRYPTO data
    - MINOR: quic: Prepare Handshake packets asap after completed handshake
    - MINOR: quic: Flag asap the connection having reached the anti-amplification limit
    - MINOR: quic: PTO timer too often reset
    - MINOR: quic: Re-arm the PTO timer upon datagram receipt
    - MINOR: proxy: add option idle-close-on-response
    - MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above.
    - CI: refactor spelling check
    - CLEANUP: assorted typo fixes in the code and comments
    - BUILD: makefile: add -Wno-atomic-alignment to work around clang abusive warning
    - MINOR: quic: Only one CRYPTO frame by encryption level
    - MINOR: quic: Missing retransmission from qc_prep_fast_retrans()
    - MINOR: quic: Non-optimal use of a TX buffer
    - BUG/MEDIUM: mworker: don't use _getsocks in wait mode
    - BUG/MINOR: ssl: Store client SNI in SSL context in case of ClientHello error
    - BUG/MAJOR: mux-h1: Don't decrement .curr_len for unsent data
    - DOC: internals: document the pools architecture and API
    - CI: github actions: clean default step conditions
    - BUILD: cpuset: fix build issue on macos introduced by previous change
    - MINOR: quic: Remaining TRACEs with connection as firt arg
    - MINOR: quic: Reset ->conn quic_conn struct member when calling qc_release()
    - MINOR: quic: Flag the connection as being attached to a listener
    - MINOR: quic: Wrong CRYPTO frame concatenation
    - MINOR: quid: Add traces quic_close() and quic_conn_io_cb()
    - REGTESTS: ssl: Fix ssl_errors regtest with OpenSSL 1.0.2
    - MINOR: quic: Do not dereference ->conn quic_conn struct member
    - MINOR: quic: fix return of quic_dgram_read
    - MINOR: quic: add config parse source file
    - MINOR: quic: implement Retry TLS AEAD tag generation
    - MEDIUM: quic: implement Initial token parsing
    - MINOR: quic: define retry_source_connection_id TP
    - MEDIUM: quic: implement Retry emission
    - MINOR: quic: free xprt tasklet on its thread
    - BUG/MEDIUM: connection: properly leave stopping list on error
    - MINOR: pools: enable pools with DEBUG_FAIL_ALLOC as well
    - MINOR: quic: As server, skip 0-RTT packet number space
    - MINOR: quic: Do not wakeup the I/O handler before the mux is started
    - BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
    - CI: github actions: use cache for OpenTracing
    - BUG/MINOR: httpclient: don't send an empty body
    - BUG/MINOR: httpclient: set default Accept and User-Agent headers
    - BUG/MINOR: httpclient/lua: don't pop the lua stack when getting headers
    - BUILD/MINOR: fix solaris build with clang.
    - BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl
    - CI: refactor OpenTracing build script
    - DOC: management: mark "set server ssl" as deprecated
    - MEDIUM: cli: yield between each pipelined command
    - MINOR: channel: add new function co_getdelim() to support multiple delimiters
    - BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
    - MEDIUM: h2/hpack: emit a Dynamic Table Size Update after settings change
    - MINOR: quic: Retransmit the TX frames in the same order
    - MINOR: quic: Remove the packet number space TX MT_LIST
    - MINOR: quic: Splice the frames which could not be added to packets
    - MINOR: quic: Add the number of TX bytes to traces
    - CLEANUP: quic: Replace <nb_pto_dgrams> by <probe>
    - MINOR: quic: Send two ack-eliciting packets when probing packet number spaces
    - MINOR: quic: Probe regardless of the congestion control
    - MINOR: quic: Speeding up handshake completion
    - MINOR: quic: Release RX Initial packets asap
    - MINOR: quic: Release asap TX frames to be transmitted
    - MINOR: quic: Probe even if coalescing
    - BUG/MEDIUM: cli: Never wait for more data on client shutdown
    - BUG/MEDIUM: mcli: do not try to parse empty buffers
    - BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
    - BUG/MINOR: stream: make the call_rate only count the no-progress calls
    - MINOR: quic: do not use quic_conn after dropping it
    - MINOR: quic: adjust quic_conn refcount decrement
    - MINOR: quic: fix race-condition on xprt tasklet free
    - MINOR: quic: free SSL context on quic_conn free
    - MINOR: quic: Add QUIC_FT_RETIRE_CONNECTION_ID parsing case
    - MINOR: quic: Wrong packet number space selection
    - DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
    - MINOR: quic: add missing include in quic_sock
    - MINOR: quic: fix indentation in qc_send_ppkts
    - MINOR: quic: remove dereferencement of connection when possible
    - MINOR: quic: set listener accept cb on parsing
    - MEDIUM: quic/ssl: add new ex data for quic_conn
    - MINOR: quic: initialize ssl_sock_ctx alongside the quic_conn
    - MINOR: ssl: fix build in release mode
    - MINOR: pools: partially uninline pool_free()
    - MINOR: pools: partially uninline pool_alloc()
    - MINOR: pools: prepare POOL_EXTRA to be split into multiple extra fields
    - MINOR: pools: extend pool_cache API to pass a pointer to a caller
    - DEBUG: pools: add new build option DEBUG_POOL_TRACING
    - DEBUG: cli: add a new "debug dev fd" expert command
    - MINOR: fd: register the write side of the poller pipe as well
    - CI: github actions: use cache for SSL libs
    - BUILD: debug/cli: condition test of O_ASYNC to its existence
    - BUILD: pools: fix build error on DEBUG_POOL_TRACING
    - MINOR: quic: refactor header protection removal
    - MINOR: quic: handle app data according to mux/connection layer status
    - MINOR: quic: refactor app-ops initialization
    - MINOR: receiver: define a flag for local accept
    - MEDIUM: quic: flag listener for local accept
    - MINOR: quic: do not manage connection in xprt snd_buf
    - MINOR: quic: remove wait handshake/L6 flags on init connection
    - MINOR: listener: add flags field
    - MINOR: quic: define QUIC flag on listener
    - MINOR: quic: create accept queue for QUIC connections
    - MINOR: listener: define per-thr struct
    - MAJOR: quic: implement accept queue
    - CLEANUP: mworker: simplify mworker_free_child()
    - BUILD/DEBUG: lru: update the standalone code to support the revision
    - DEBUG: lru: use a xorshift generator in the testing code
    - BUG/MAJOR: compiler: relax alignment constraints on certain structures
    - BUG/MEDIUM: fd: always align fdtab[] to 64 bytes
    - MINOR: quic: No DCID length for datagram context
    - MINOR: quic: Comment fix about the token found in Initial packets
    - MINOR: quic: Get rid of a struct buffer in quic_lstnr_dgram_read()
    - MINOR: quic: Remove the QUIC haproxy server packet parser
    - MINOR: quic: Add new defintion about DCIDs offsets
    - MINOR: quic: Add a list to QUIC sock I/O handler RX buffer
    - MINOR: quic: Allocate QUIC datagrams from sock I/O handler
    - MINOR: proto_quic: Allocate datagram handlers
    - MINOR: quic: Pass CID as a buffer to quic_get_cid_tid()
    - MINOR: quic: Convert quic_dgram_read() into a task
    - CLEANUP: quic: Remove useless definition
    - MINOR: proto_quic: Wrong allocations for TX rings and RX bufs
    - MINOR: quic: Do not consume the RX buffer on QUIC sock i/o handler side
    - MINOR: quic: Do not reset a full RX buffer
    - MINOR: quic: Attach all the CIDs to the same connection
    - MINOR: quic: Make usage of by datagram handler trees
    - MEDIUM: da: new optional data file download scheduler service.
    - MEDIUM: da: update doc and build for new scheduler mode service.
    - MEDIUM: da: update module to handle schedule mode.
    - MINOR: quic: Drop Initial packets with wrong ODCID
    - MINOR: quic: Wrong RX buffer tail handling when no more contiguous data
    - MINOR: quic: Iterate over all received datagrams
    - MINOR: quic: refactor quic CID association with threads
    - BUG/MEDIUM: resolvers: Really ignore trailing dot in domain names
    - DEV: flags: Add missing flags
    - BUG/MINOR: sink: Use the right field in appctx context in release callback
    - MINOR: sock: move the unused socket cleaning code into its own function
    - BUG/MEDIUM: mworker: close unused transferred FDs on load failure
    - BUILD: atomic: make the old HA_ATOMIC_LOAD() support const pointers
    - BUILD: cpuset: do not use const on the source of CPU_AND/CPU_ASSIGN
    - BUILD: checks: fix inlining issue on set_srv_agent_[addr,port}
    - BUILD: vars: avoid overlapping field initialization
    - BUILD: server-state: avoid using not-so-portable isblank()
    - BUILD: mux_fcgi: avoid aliasing of a const struct in traces
    - BUILD: tree-wide: mark a few numeric constants as explicitly long long
    - BUILD: tools: fix warning about incorrect cast with dladdr1()
    - BUILD: task: use list_to_mt_list() instead of casting list to mt_list
    - BUILD: mworker: include tools.h for platforms without unsetenv()
    - BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
    - MINOR: mworker: set the master side of ipc_fd in the worker to -1
    - MINOR: mworker: allocate and initialize a mworker_proc
    - CI: Consistently use actions/checkout@v2
    - REGTESTS: Remove REQUIRE_VERSION=1.8 from all tests
    - MINOR: mworker: sets used or closed worker FDs to -1
    - MINOR: quic: Try to accept 0-RTT connections
    - MINOR: quic: Do not try to treat 0-RTT packets without started mux
    - MINOR: quic: Do not try to accept a connection more than one time
    - MINOR: quic: Initialize the connection timer asap
    - MINOR: quic: Do not use connection struct xprt_ctx too soon
    - Revert "MINOR: mworker: sets used or closed worker FDs to -1"
    - BUILD: makefile: avoid testing all -Wno-* options when not needed
    - BUILD: makefile: validate support for extra warnings by batches
    - BUILD: makefile: only compute alternative options if required
    - DEBUG: fd: make sure we never try to insert/delete an impossible FD number
    - MINOR: mux-quic: add comment
    - MINOR: mux-quic: properly initialize qcc flags
    - MINOR: mux-quic: do not consider CONNECTION_CLOSE for the moment
    - MINOR: mux-quic: create a timeout task
    - MEDIUM: mux-quic: delay the closing with the timeout
    - MINOR: mux-quic: release idle conns on process stopping
    - MINOR: listener: replace the listener's spinlock with an rwlock
    - BUG/MEDIUM: listener: read-lock the listener during accept()
    - MINOR: mworker/cli: set expert/experimental mode from the CLI
2022-02-01 18:06:59 +01:00
William Lallemand
56be0e0146 MINOR: mworker: allocate and initialize a mworker_proc
mworker_proc_new() correctly allocates and initializes a mworker_proc
structure.
2022-01-28 23:52:36 +01:00
William Lallemand
7e01878e45 MINOR: mworker: set the master side of ipc_fd in the worker to -1
Once the child->ipc_fd[0] is closed in the worker, set the value to -1
so we don't reference a closed FD anymore.
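
The idiom is sketched below on a reduced structure. struct worker and
worker_drop_master_side() are hypothetical names for the example; only the
ipc_fd[0] field and the -1 convention come from the description above.

```c
#include <unistd.h>

/* Hypothetical, reduced version of the per-worker structure: only the
 * two ends of the socketpair matter here.
 */
struct worker {
	int ipc_fd[2]; /* [0] = master side, [1] = worker side */
};

/* Closes the master side if it is still open, then marks it as unused so
 * that later code cannot act on a stale descriptor number.
 */
static void worker_drop_master_side(struct worker *child)
{
	if (child->ipc_fd[0] >= 0) {
		close(child->ipc_fd[0]);
		child->ipc_fd[0] = -1;
	}
}
```

Because the field is reset, calling the function a second time is a no-op
instead of closing whatever unrelated FD may have reused the same number.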
2022-01-28 23:52:26 +01:00
William Lallemand
55a921c914 BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
When starting HAProxy in master-worker mode, the master pre-allocates a
struct mworker_proc and does a socketpair() before the configuration
parsing. If the configuration loading fails, the FDs are never closed
because they aren't part of a listener; they are not even in the fdtab.

This patch fixes the issue by cleaning the mworker_proc structures that
were not assigned a process, and closing their FDs.

Must be backported as far as 2.0. The srv_drop() only frees the memory
and could be dropped since it's done before an exec().
2022-01-28 23:47:43 +01:00
Willy Tarreau
e08acaed19 BUG/MEDIUM: mworker: close unused transferred FDs on load failure
When the master process is reloaded on a new config, it will try to
connect to the previous process' socket to retrieve all known
listening FDs to be reused by the new listeners. If listeners were
removed, their unused FDs are simply closed.

However there's a catch. In case a socket fails to bind, the master
will cancel its startup and switch to wait mode for a new operation
to happen. In this case it didn't close the possibly remaining FDs
that were left unused.

It is very hard to hit this case, but it can happen during a
troubleshooting session with fat fingers. For example, let's say
a config runs like this:

   frontend ftp
        bind 1.2.3.4:20000-29999

The admin wants to extend the port range down to 10000-29999 and
by mistake ends up with:

   frontend ftp
        bind 1.2.3.41:20000-29999

Upon restart the bind will fail if the address is not present, and the
master will then switch to wait mode without releasing the previous FDs
for 1.2.3.4:20000-29999 since they're now apparently unused. Then once
the admin fixes the config and does:

   frontend ftp
        bind 1.2.3.4:10000-29999

The service will start, but will bind new sockets, half of them
overlapping with the previous ones that were not properly closed. This
may result in a startup error (if SO_REUSEPORT is not enabled or not
available), in a FD number exhaustion (if the error is repeated many
times), or in connections being randomly accepted by the process if
they sometimes land on the old FD that nobody listens on.

This patch will need to be backported as far as 1.8, and depends on
previous patch:

   MINOR: sock: move the unused socket cleaning code into its own function

Note that before 2.3 most of the code was located inside haproxy.c, so
the patch above should probably relocate the function there instead of
sock.c.
2022-01-28 19:04:02 +01:00
Willy Tarreau
b510116fd2 MINOR: sock: move the unused socket cleaning code into its own function
The startup code used to scan the list of unused sockets retrieved from
an older process, and to close them one by one. This also required that
the knowledge of the internal storage of these temporary sockets was
known from outside sock.c and that the code was copy-pasted at every
call place.

This patch moves this into sock.c under the name
sock_drop_unused_old_sockets(), and removes the xfer_sock_list
definition from sock.h since the rest of the code doesn't need to know
this.

This cleanup is minimal and preliminary to a future fix that will need
to be backported to all versions featuring FD transfers over the CLI.
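
For readers who do not have sock.c at hand, the idea can be sketched as
follows. This is a simplified stand-in and not the real
sock_drop_unused_old_sockets(): the list layout and the three helper names
are assumptions made for the example.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical, minimal entry of the list of sockets inherited from the
 * old process.
 */
struct xfer_sock {
	int fd;
	struct xfer_sock *next;
};

static struct xfer_sock *xfer_sock_list;

/* Records a descriptor received from the old process. Returns 0 or -1. */
static int xfer_sock_add(int fd)
{
	struct xfer_sock *xs = malloc(sizeof(*xs));

	if (!xs)
		return -1;
	xs->fd = fd;
	xs->next = xfer_sock_list;
	xfer_sock_list = xs;
	return 0;
}

/* Claims <fd> for a new listener: the entry is unlinked but the
 * descriptor stays open. Returns 1 if it was found, 0 otherwise.
 */
static int xfer_sock_take(int fd)
{
	struct xfer_sock **prev = &xfer_sock_list;
	struct xfer_sock *xs;

	for (xs = *prev; xs; prev = &xs->next, xs = xs->next) {
		if (xs->fd == fd) {
			*prev = xs->next;
			free(xs);
			return 1;
		}
	}
	return 0;
}

/* Closes and frees every socket nobody claimed. Returns how many were
 * closed. All knowledge of the list stays in this file.
 */
static int drop_unused_old_sockets(void)
{
	int closed = 0;

	while (xfer_sock_list) {
		struct xfer_sock *xs = xfer_sock_list;

		xfer_sock_list = xs->next;
		close(xs->fd);
		free(xs);
		closed++;
	}
	return closed;
}
```

Callers then only need a single call once the new listeners are bound, or
once startup is cancelled, without knowing how the list is stored.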
2022-01-28 19:04:02 +01:00
David CARLIER
bb10dad5a8 BUILD: cpuset: fix build issue on macos introduced by previous change
The build on macos was broken by recent commit df91cbd58 ("MINOR: cpuset:
switch to sched_setaffinity for FreeBSD 14 and above."), let's move the
variable declaration inside the ifdef.
2022-01-11 15:09:49 +01:00
William Lallemand
f82afbb9cd BUG/MEDIUM: mworker: don't use _getsocks in wait mode
Since version 2.5 the master is automatically re-executed in wait-mode
when the config is successfully loaded, putting corner cases of the wait
mode in plain sight.

When using the -x argument and with the right timing, the master will
try to get the FDs again in wait mode even though it's not needed
anymore, which will harm the worker by removing its listeners.

However, if it fails (and it's supposed to, sometimes), the
master will exit with EXIT_FAILURE because it does not have the
MODE_MWORKER flag, but only the MODE_MWORKER_WAIT flag. With the
consequence of killing the workers.

This patch fixes the issue by restricting the use of _getsocks to some
modes.

This patch must be backported to every supported version, even though
the impact should be less harmful in versions prior to 2.5.
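
The kind of guard involved can be sketched as below. The flag values are
arbitrary, may_fetch_old_sockets() is a hypothetical helper, and the sketch
assumes that wait mode is the mode being excluded; the exact set of modes
allowed by the patch is not spelled out here.

```c
#include <stddef.h>

/* Flag names taken from the description above; the values are arbitrary
 * and only serve this sketch.
 */
#define MODE_MWORKER      0x01u
#define MODE_MWORKER_WAIT 0x02u

/* Tells whether the process should ask the old one for its listening FDs.
 * <old_sock_path> stands for the argument given to -x, NULL if absent.
 */
static int may_fetch_old_sockets(unsigned int mode, const char *old_sock_path)
{
	if (!old_sock_path)
		return 0; /* no -x, nothing to fetch */
	if (mode & MODE_MWORKER_WAIT)
		return 0; /* the master in wait mode binds no listener */
	return 1;
}
```

With such a test in front of the retrieval code, a master that only carries
MODE_MWORKER_WAIT never reaches the code path that may end in EXIT_FAILURE.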
2022-01-07 18:44:27 +01:00
David CARLIER
df91cbd584 MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above.
Following up on the previous update to cpuset-t.h. Ultimately, at some point
the cpuset_setaffinity code path could be removed.
2022-01-07 06:53:51 +01:00
Willy Tarreau
654726db5a MINOR: debug: add support for -dL to dump library names at boot
This is a second help to dump loaded library names late at boot, once
external code has already been initialized. The purpose is to provide
a format that makes it easy to pass to "tar" to produce an archive
containing the executable and the list of dependencies. For example
if haproxy is started as "haproxy -f foo.cfg", a config check alone
will suffice to quit before starting, "-q" will be used to disable
undesired output messages, and -dL will be used to dump libraries.
This will result in such a command to trivially produce a tarball
of loaded libraries:

   ./haproxy -q -c -dL -f foo.cfg | tar -T - -hzcf archive.tgz
2021-12-28 17:07:13 +01:00
William Lallemand
efd954793e BUG/MINOR: mworker: deinit of thread poller was called when not initialized
Commit 67e371e ("BUG/MEDIUM: mworker: FD leak of the eventpoll in wait
mode") introduced a regression. Upon a reload it tries to deinit the
poller per thread, but no poll loop was initialized after loading the
configuration.

This patch fixes the issue by moving this part of the code in
mworker_reload(), since this function will be called only when the
poller is fully initialized.

This patch must be backported in 2.5.
2021-11-26 14:43:57 +01:00
William Lallemand
67e371ea14 BUG/MEDIUM: mworker: FD leak of the eventpoll in wait mode
Since 2.5, before re-executing in wait mode, the master can have a
working configuration loaded, with an eventpoll FD. This case was not
handled correctly and a new eventpoll FD was leaking in the master at
each reload, and was inherited by the new worker.

Must be backported in 2.5.
2021-11-25 10:45:29 +01:00
William Lallemand
befab9ee4a BUG/MINOR: mworker: does not add the -sf in wait mode
Since the wait mode is automatically executed after loading the
configuration, -sf was shown in argv[] with the previous PID, which is
normal, but also with the current one. This is only a visual problem when
listing the processes, because -sf does not do anything in wait mode.

Fix the issue by removing the whole "-sf" part in wait mode; the
executed command can still be seen in the argv[] of the latest forked worker.

Must be backported in 2.5.
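
Stripping the "-sf" part from an argument vector can be sketched as below.
This is an illustration under assumptions, not the code of the patch: the
function name is hypothetical, and the sketch treats every argument after
"-sf" up to the next one starting with '-' as part of the PID list.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated, NULL-terminated copy of argv without "-sf"
 * and the PID list that follows it. The strings themselves are shared
 * with the original argv; only the array must be freed by the caller. */
char **argv_without_sf(int argc, char **argv)
{
    char **out = calloc((size_t)argc + 1, sizeof(*out));
    int i = 0, n = 0;

    if (!out)
        return NULL;

    while (i < argc) {
        if (strcmp(argv[i], "-sf") == 0) {
            i++;                              /* skip "-sf" itself */
            while (i < argc && argv[i][0] != '-')
                i++;                          /* skip each PID */
            continue;
        }
        out[n++] = argv[i++];
    }
    out[n] = NULL;
    return out;
}
```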
2021-11-25 10:39:54 +01:00
William Lallemand
2be557f7cb MEDIUM: mworker: seamless reload use the internal sockpairs
With the master worker, the seamless reload was still requiring an
external stats socket to the previous process, which is a pain to
configure.

This patch implements a way to use the internal socketpair between the
master and the workers to transfer the sockets during the reload.
This way, the master will always try to transfer the socket, even
without any configuration.

The master will still reload with the -x argument, followed by the
sockpair@ syntax (e.g. -x sockpair@4), which uses the FD of the internal
CLI to the worker.
2021-11-24 19:00:39 +01:00
William Lallemand
c4810b8cc8 BUG/MEDIUM: mworker: cleanup the listeners when reexecuting
Previously, the cleanup of the listeners was done in mworker_loop(),
which was called once the configuration file was parsed. HAProxy was
switching to wait mode when the configuration failed to load, so no
listeners were created.

Since the latest change to the mworker mode, HAProxy switches to wait mode
after successfully loading the configuration, without cleaning its
listeners, because that was done in mworker_loop(), resulting in the master
keeping its listeners open. The master needs its
configuration to know which listeners it needs to close, so that must be
done before the exec().

This patch fixes the problem by cleaning the listeners in the
mworker_reexec() function.

No backport needed.
2021-11-18 11:01:16 +01:00
William Lallemand
6883674084 MINOR: mworker: implement a reload failure counter
Implement a reload failure counter which counts the number of failures
since the last success. This counter is available in 'show proc' over
the master CLI.
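
The counting rule described above (failures since the last success) can be
sketched as follows. The struct and function names are hypothetical and do
not come from the patch; in particular, where the real counter is stored and
when it is updated are not specified in the message.

```c
/* Hypothetical bookkeeping kept by the master process. */
struct reload_stats {
    unsigned int reloads;         /* total reload attempts */
    unsigned int failed_reloads;  /* failures since the last success */
};

/* Record the outcome of one reload attempt. */
void reload_record(struct reload_stats *st, int success)
{
    st->reloads++;
    if (success)
        st->failed_reloads = 0;   /* a success resets the failure streak */
    else
        st->failed_reloads++;
}
```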
2021-11-10 15:53:01 +01:00
William Lallemand
ad221f4ece MINOR: mworker: only increment the number of reload in wait mode
Since the wait mode will be started after any reload, successful or
failed, change the way haproxy computes the number of reloads of the
processes.
2021-11-10 15:53:01 +01:00
William Lallemand
836bda226c MINOR: mworker: clarify starting/failure messages
Clarify the startup and reload messages:

On a successful configuration load, haproxy will emit "Loading success."
after successfully forking the children.

When it fails to load the configuration, it will emit "Loading failure!".

When trying to reload the master process, it will emit "Reloading
HAProxy".
2021-11-10 15:53:01 +01:00
William Lallemand
fab0fdce98 MEDIUM: mworker: reexec in waitpid mode after successful loading
Use the waitpid mode after successfully loading the configuration; this
way the memory allocated while loading it will be freed in the master.

This will be useful when doing a reload with a configuration which has
large maps or a lot of SSL certificates, avoiding an OOM because too
much memory was allocated in the master.
2021-11-10 15:53:01 +01:00
William Lallemand
5d71a6b0f1 CLEANUP: mworker: remove any relative PID reference
nbproc was removed, it's time to remove any reference to the relative
PID in the master-worker, since there can be only 1 current haproxy
process.

This patch cleans up the alerts and warnings emitted during the exit of
a process, as well as the "show proc" output.
2021-11-10 15:53:01 +01:00