Commit 048368ef6 ("MINOR: deinit: always deinit the init_mutex on
failed initialization") added the missing unlock but forgot to
condition it on USE_THREAD, resulting in a build failure. No
backport is needed.
This addresses oss-fuzz issue 36426.
This undocumented variable is only for internal use, and its sole
presence affects the process' behavior, as shown in bug #1324. It must
not be exported to workers, external checks, nor programs. Let's unset
it before forking programs and workers.
This should be backported as far as 1.8. The worker code might differ
a bit before 2.5 due to the recent removal of multi-process support.
The master-worker code registers an exit handler to deal with configuration
issues during reload, leading to a restart of the master process in wait
mode. But it shouldn't do that when it's expected that the program stops
during config parsing or condition checks, as the reload operation is
unexpectedly called and results in abnormal behavior and even crashes:
$ HAPROXY_MWORKER_REEXEC=1 ./haproxy -W -c -f /dev/null
Configuration file is valid
[NOTICE] (18418) : haproxy version is 2.5-dev2-ee2420-6
[NOTICE] (18418) : path to executable is ./haproxy
[WARNING] (18418) : config : Reexecuting Master process in waitpid mode
Segmentation fault
$ HAPROXY_MWORKER_REEXEC=1 ./haproxy -W -cc 1
[NOTICE] (18412) : haproxy version is 2.5-dev2-ee2420-6
[NOTICE] (18412) : path to executable is ./haproxy
[WARNING] (18412) : config : Reexecuting Master process in waitpid mode
[WARNING] (18412) : config : Reexecuting Master process
Note that the presence of this variable happens by accident when haproxy
is called from within its own programs (see issue #1324), but this should
be the object of a separate fix.
This patch fixes this by preventing the atexit registration in such
situations. This should be backported as far as 1.8. MODE_CHECK_CONDITION
has to be dropped for versions prior to 2.5.
The init_mutex was not unlocked in case an error is encountered during
a thread initialization, and the polling loop was aborted during startup.
In practise it does not have any observable effect since an explicit
exit() is placed there, but it could confuse some debugging tools or
some static analysers, so let's release it as expected.
This addresses issue #1326.
The removal for the shared inter-process cache in commit 6fd0450b4
("CLEANUP: shctx: remove the different inter-process locking techniques")
accidentally removed the enforcement of rlimit_memmax_all which
corresponds to what is passed to the command-line "-m" argument.
Let's restore it.
Thanks to @nafets227 for spotting this. This fixes github issue #1319.
Till now we were dealing with single-word expressions but in order to
extend the configuration condition language a bit more, we'll need to
support slightly more complex expressions involving operators, and we
must absolutely support spaces around them to keep them readable.
As all arguments are pointers to the same line with spaces replaced by
zeroes, we can trivially rebuild the whole line before calling the
condition evaluator, and remove the test for extraneous argument. This
is what this patch does.
The .if/.else/.endif and condition evaluation code is quite dirty and
was dumped into cfgparse.c because it was easy. But it should be tidied
quite a bit as it will need to evolve.
Let's move all that to cfgcond.{c,h}.
I found myself a few times testing some conditoin examples from the doc
against command line's "-cc" to see that they didn't work with environment
variables expansion. Not being documented as being on purpose it looks like
a miss, so let's add PARSE_OPT_ENV and PARSE_OPT_WORD_EXPAND to be able to
test for example -cc "streq(${WITH_SSL},yes)" to help debug expressions.
This adds the exact same restriction as commit 5546c8bdc ("MINOR:
cfgparse: Fail when encountering extra arguments in macro") but for
the "-cc" command line argument, for the sake of consistency.
Modern compilers love to break existing code, and some options detected
at build time (such as -fwrapv) are absolutely critical otherwise some
bad code can be generated.
Given that some users rely on packages that force CFLAGS without being
aware of this and can be hit by runtime bugs, we have to help packagers
figure that they need to be careful about their build options.
The test here consists in detecting correct wrapping of signed integers.
Some of the old code relies on it, and modern compilers recently decided
to break it. It's normally addressed using -fwrapv which users will
rarely enforce in their own flags. Thus it is a good indicator of missing
critical CFLAGS, and it happens to be very easy to detect at run time.
Note that the test uses argc in order to have a variable. While gcc
ignores wrapping even for constants, clang only ignores it for variables.
The way the code is constructed doesn't result in code being emitted for
optimized builds thanks to value range propagation.
This should address GitHub issue #1315, and should be backported to all
stable versions. It may result in instantly breaking binaries that seemed
to work fine (typically the ones suddenly showing a busy loop after a few
weeks of uptime), and require packagers to fix their flags. The vast
majority of distro packages are fine and will not be affected though.
Explicitly call ssl_initialize_random to initialize the random generator
in init() global function. If the initialization fails, the startup is
interrupted.
This commit is in preparation for support of ssl on dynamic servers. To
be able to activate ssl on dynamic servers, it is necessary to ensure
that the random generator is initialized on startup regardless of the
config. It cannot be called at runtime as access to /dev/urandom is
required.
This also has the effect to fix the previous non-consistent behavior.
Indeed, if bind or server in the config are using ssl, the
initialization function was called, and if it failed, the startup was
interrupted. Otherwise, the ssl initialization code could have been
called through the ssl server for lua, but this times without blocking
the startup on error. Or not called at all if lua was deactivated.
With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX
nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock
to lock between threads.
The relative_pid is always 1. In mworker mode we also have a
child->relative_pid which is always equalt relative_pid, except for a
master (0) or external process (-1), but these types are usually tested
for, except for one place that was amended to carefully check for the
PROC_O_TYPE_WORKER option.
Changes were pretty limited as most usages of relative_pid were for
designating a process in stats output and peers protocol.
Lots of places iterating over nbproc or comparing with nbproc could be
simplified. Further, "bind-process" and "process" parsing that was
already limited to process 1 or "all" or "odd" resulted in a bind_proc
field that was either 0 or 1 during the init phase and later always 1.
All the checks for compatibilities were removed since it's not possible
anymore to run a frontend and a backend on different processes or to
have peers and stick-tables bound on different ones. This is the largest
part of this patch.
The bind_proc field was removed from both the proxy and the receiver
structs.
Since the "process" and "bind-process" directives are still parsed,
configs making use of correct values allowing process 1 will continue
to work.
There was a loop iterating over all nbproc values during init that
couldn't be immediately removed because the loop's index was used
to distinguish a child from a parent. That's now fixed by replacing
the iterator with an in_parent flag. All bindings that were checking
(1UL << proc) or cpu_map.proc[proc] were adjusted to always use zero
for proc.
Since its introduction in 1.8 with commit 095ba4c24 ("MEDIUM: mworker:
replace systemd mode by master worker mode"), it says "cannot chroot1(...)"
which seems to be a leftover of a debug message. It could be backported but
probably nobody will notice.
This one was deprecated in 2.3 and marked for removal in 2.5. It suffers
too many limitations compared to threads, and prevents some improvements
from being engaged. Instead of a bypassable startup error, there is now
a hard error.
The parsing code was removed, and very few obvious cases were as well.
The code is deeply rooted at certain places (e.g. "for" loops iterating
from 0 to nbproc) so it will not be that trivial to remove everywhere.
The "bind" and "bind-process" parsers will have to be adjusted, though
maybe not completely changed if we later want to support thread groups
for large NUMA machines. Some stats socket restrictions were removed,
and the doc was updated according to what was done. A few places in the
doc still refer to nbproc and will have to be revisited. The master-worker
code also refers to the process number to distinguish between master and
workers and will have to be carefully adjusted. The MAX_PROCS macro was
reset to 1, this will at least reduce the size of some remaining arrays.
Two regtests were dependieng on this directive, one with an explicit
"nbproc 1" and another one testing the master's CLI using nbproc 4.
Both were adapted.
This patch adds the `-cc` (check condition) argument to evaluate conditions on
startup and return the result as the exit code.
As an example this can be used to easily check HAProxy's version in scripts:
haproxy -cc 'version_atleast(2.4)'
This resolves GitHub issue #1246.
Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
Set "config :" as a prefix for the user messages context before starting
the configuration parsing. All following stderr output will be prefixed
by it.
As a consequence, remove extraneous prefix "config" already specified in
various ha_alert/warning/notice calls.
Create a parsing_ctx structure. This type is used to store information
about the current file/line parsed. A global context is created and
can be manipulated when haproxy is in STARTING mode. When starting is
over, the context is resetted and should not be accessed anymore.
A memory allocation failure happening in mworker_env_to_proc_list when
trying to allocate a mworker_proc would have resulted in a crash. This
function is only called during init.
It was raised in GitHub issue #1233.
It could be backported to all stable branches.
A mistake was introduced in 2.4-dev17 by commit 982fb5339 ("MEDIUM:
config: use platform independent type hap_cpuset for cpu-map"), it
initializes cpu_map.thread[] from 0 to MAX_PROCS-1 instead of
MAX_THREADS-1 resulting in crashes when the two differ, e.g. when
building with USE_THREAD= but still with USE_CPU_AFFINITY=1.
No backport is needed.
When running "haproxy -v", we still get "HA-Proxy" which is the last
place where this confusing oddity happens. Being so used to it I didn't
even notice it until it was reported to me just after 2.2 but it never
got fixed, despite the PRODUCT_NAME macro that is used to report the
name in the stats page and in "show info" being already set to "HAProxy"
15 years ago in 1.2.14 with commit e03312613. It's about time to
uniformize everything.
Only mworker uses proc_self, and it was declared in global.h, forcing
users of global.h to include mworker and its dependencies.
Moving it to mworker reduces the preprocessed size of version.c from
170 to 125kB by shrinking the number of local includes from 30 to 16
and the number of system includes from 147 to 132.
The presence of this field causes a long dependency chain because almost
everyone includes global-t.h, and vars include sample_data which include
some system includes as well as HTTP parts.
There is absolutely no reason for having the process-wide variables in
the global struct, let's just move them into vars.c and vars.h. This
reduces from ~190k to ~170k the preprocessed output of version.c.
Add a new flag to mark a keyword as experimental. An experimental
keyword cannot be used if the global 'expose-experimental-directives' is
not present first.
Only keywords parsed through a standard cfg_keywords lists in
global/proxies section will be automatically detected if declared
experimental. To support a keyword outside of these lists,
check_kw_experimental must be called manually during its parsing.
If an experimental keyword is present in the config, the tainted flag is
updated.
For the moment, no keyword is marked as experimental.
Add a global flag named 'tainted'. Its purpose is to report various
status about experimental features used for the current process
lifetime.
By default it is initialized to 0. It can be set/retrieve by a couple of
new functions mark_tainted()/get_tainted(). Once a flag is set, it
cannot be resetted.
Currently, no tainted status is implemented, it will be the subject of
the following commits.
The new function split_version() converts a parsable haproxy version to
an array of integers. The function compare_current_version() compares an
arbitrary version to the current one. These two functions were written
by Thierry Fournier in 2013, and are still usable as-is. They will be
used to write config language predicates.
Till now it was only presented in the version output but could not be
consulted outside of haproxy.c, let's export it as a variable, and set
it to an empty string if not defined.
Implement a safe mechanism to close front idling connection which
prevents the soft-stop to complete. Every h1/h2 front connection is
added in a new per-thread list instance. On shutdown, a new task is
waking up which calls wake mux operation on every connection still
present in the new list.
A new stopping_list attach point has been added in the connection
structure. As this member is only used for frontend connections, it
shared the same union as the session_list reserved for backend
connections.
The compilation fails due to the following commit:
fc6ac53dca
BUG/MAJOR: fix build on musl with cpu_set_t support
The new global variable cpu_map conflicted with a local variable of the
same name in the code path for the apple platform when setting the
process affinity.
This does not need to be backported.
Move cpu_map structure outside of the global struct to a global
variable defined in cpuset.c compilation unit. This allows to reorganize
the includes without having to define _GNU_SOURCE everywhere for the
support of the cpu_set_t.
This fixes the compilation with musl libc, most notably used for the
alpine based docker image.
This fixes the github issue #1235.
No need to backport as this feature is new in the current
2.4-dev.
The compilation is currently broken on platform without USE_CPU_AFFINITY
set. An error has been reported by the cygwin build of the CI.
This does not need to be backported.
In file included from include/haproxy/global-t.h:27,
from include/haproxy/global.h:26,
from include/haproxy/fd.h:33,
from src/ev_poll.c:22:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
32 | # error "No cpuset support implemented on this platform"
| ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
37 | CPUSET_REPR cpuset;
| ^~~~~~~~~~~
make: *** [Makefile:944: src/ev_poll.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from include/haproxy/global-t.h:27,
from include/haproxy/global.h:26,
from include/haproxy/fd.h:33,
from include/haproxy/connection.h:30,
from include/haproxy/ssl_sock.h:27,
from src/ssl_sample.c:30:
include/haproxy/cpuset-t.h:32:3: error: #error "No cpuset support implemented on this platform"
32 | # error "No cpuset support implemented on this platform"
| ^~~~~
include/haproxy/cpuset-t.h:37:2: error: unknown type name ‘CPUSET_REPR’
37 | CPUSET_REPR cpuset;
| ^~~~~~~~~~~
make: *** [Makefile:944: src/ssl_sample.o] Error 1
Fix the warning treated as error on the CI for the macOS compilation :
"src/haproxy.c:2939:23: error: unused variable 'set'
[-Werror,-Wunused-variable]"
This does not need to be backported.
Render numa detection optional with a global configuration statement
'no numa-cpu-mapping'. This can be used if the applied affinity of the
algorithm is not optimal. Also complete the documentation with this new
keyword.
Use the platform independent type hap_cpuset for the cpu-map statement
parsing. This allow to address CPU index greater than LONGBITS.
Update the documentation to reflect the removal of this limit except for
platforms without cpu_set_t type or equivalent.
Since commit 3f12887 ("MINOR: mworker: don't use children variable
anymore"), the oldpids array is not used anymore to generate the new -sf
parameters. So we don't need to set nb_oldpids to 0 during the first
start of the master process.
This patch fixes a bug when 2 masters process tries to synchronize their
peers, there is a small chances that it won't work because nb_oldpids
equals 0.
Should be backported as far as 2.0.
This bug affects the peers synchronisation code which rely on the
nb_oldpids variable to synchronize the peer from the old PID.
In the case the process is not started in master-worker mode and tries
to synchronize using the peers, there is a small chance that won't work
because nb_oldpids equals 0.
Fix the bug by setting the variable to 0 only in the case of the
master-worker when not reloaded.
It could also be a problem when trying to synchronize the peers between
2 masters process which should be fixed in another patch.
Bug exists since commit 8a361b5 ("BUG/MEDIUM: mworker: don't reuse PIDs
passed to the master").
Sould be backported as far as 1.8.
The application of a cpu-map statement with both process and threads
is broken (P-Q/1 or 1/P-Q notation).
For example, before the fix, when using P-Q/1, proc_t1 would be updated.
Then it would be AND'ed with thread which is still 0 and thus does
nothing.
Another problem is when using 1/1[-Q], thread[0] is defined. But if
there is multiple processes, every processes will use this define
affinity even if it should be applied only to 1st process.
The solution to the fix is a little bit too complex for my taste and
there is maybe a simpler solution but I did not wish to break the
storage of global.cpu_map, as it is quite painful to test all the
use-cases. Besides, this code will probably be clean up when
multiprocess support removed on the future version.
Let's try to explain my logic.
* either haproxy runs in multiprocess or multithread mode. If on
multiprocess, we should consider proc_t1 (P-Q/1 notation). If on
multithread, we should consider thread (1/P-Q notation). However
during parsing, the final number of processes or threads is unknown,
thus we have to consider the two possibilities.
* there is a special case for the first thread / first process which is
present in both execution modes. And as a matter of fact cpu-map 1 or
1/1 notation represents the same thing. Thus, thread[0] and proc_t1[0]
represents the same thing. To solve this problem, only thread[0] is
used for this special case.
This fix must be backported up to 2.0.
Config information has been added into the logsrv struct. The filename
is duplicated and should be freed on exit.
Introduced in the current release.
This does not need to be backported.