On armv7 haproxy doesn't work because of the fixes on the double-word
CAS. There are two issues. The first one is that the last argument in
case of dwcas is a pointer to the set of values and not a value; the
second is that it's not enough to cast the data as (void*) since it will
be a single word. Let's fix this by using the pointers as an array of
long. This was tested on i386, armv7, x86_64 and aarch64 and it is now
fine. An alternate approach using a struct was attempted as well but it
used to produce less optimal code.
This fix must be backported to 1.9. This fixes github issue #105.
Cc: Olivier Houchard <ohouchard@haproxy.com>
The purpose is to manipulate rings made of series of buffers so that
it is possible to continue to work on a next buffer once one is full.
This will be used by muxes to deal with contention between multiple
streams and a single output buffer. No data is expected to span over
multiple buffers, all of them will be used like a regular buffer. This
will significantly limit the amount of changes and the code complexity
while still supporting larger output buffering.
The ring is made of head and tail indexes, both of which point to a
buffer descriptor. At least one descriptor is always valid, so it could
be seen as a form of pagination always presenting one buffer. The root
of the ring is itself stored into a buffer descriptor so that the user
only has to declare a buffer array and to call br_init() on it in order
to use it.
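For illustration, a minimal usage sketch, assuming br_init() simply takes
the array and its number of slots (slot 0 acting as the root descriptor) :

  struct buffer ring[4];

  br_init(ring, sizeof(ring) / sizeof(ring[0]));
  /* the ring now always exposes at least one valid buffer which can be
   * read and written like any regular struct buffer
   */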
It has not been used for many years, is unlikely to be reused and
conflicts with the similarly named macro in flt_trace, causing warnings
at build time when including debug.h in low-level files. Let's simply
remove it.
SI_TKILL is for Linux. We're again in the non-portable area. Both OSes
use macros to define these values so we can #ifdef them. Let's make
SI_TKILL defined based on SI_LWP when only the latter is defined.
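In other words, something like :

  /* only one of the two is defined depending on the OS */
  #if !defined(SI_TKILL) && defined(SI_LWP)
  #define SI_TKILL SI_LWP
  #endif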
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.
The following renames were done :
ENABLE_POLL -> USE_POLL
ENABLE_EPOLL -> USE_EPOLL
ENABLE_KQUEUE -> USE_KQUEUE
ENABLE_EVPORTS -> USE_EVPORTS
TPROXY -> USE_TPROXY
NETFILTER -> USE_NETFILTER
NEED_CRYPT_H -> USE_CRYPT_H
CONFIG_HAP_CRYPT -> USE_LIBCRYPT
CONFIG_HAP_NS -> USE_NS
CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
It seems it's not defined on FreeBSD while it's mentioned on Linux that
clock_gettime() can be detected using this. Given that we also have the
test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks
on other systems, we'll see.
Report was here :
https://github.com/haproxy/haproxy/runs/133866993
This will be used by the watchdog to detect that a thread locked up.
It's only defined on platforms supporting it. This patch only reserves
the room for the timer in the struct. A special value was reserved for
the uninitialized timer. The problem is that the POSIX API was horribly
designed, defining no invalid value, thus for each timer it is required
to keep a second variable to indicate whether it's valid. A quick check
shows that defining a 32-bit invalid value is not something uncommon
across other implementations, with ~0 being common. Let's try with this
and if it causes issues we can revisit this decision.
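As a sketch of that convention (the exact reserved value is an assumption
here) :

  /* reserved value meaning "this per-thread timer was never created",
   * since POSIX defines no invalid timer_t value of its own
   */
  #define TIMER_INVALID ((timer_t)(unsigned long)0xfffffffful)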
This flag is constantly cleared by the scheduler and will be set by the
watchdog timer to detect stuck threads. It is also set by the "show
threads" command so that it is easy to spot if the situation has evolved
between two subsequent calls : if the first "show threads" shows no stuck
thread and the second one shows such a stuck thread, it indicates that
this thread didn't manage to make any forward progress since the previous
call, which is extremely suspicious.
These functions are used respectively to signal one thread or all threads.
When multithreading is disabled, it's always the current thread which is
signaled.
Some code starts to add ifdefs everywhere to work around the lack of
threads_harmless_mask when threads are not compiled in. This one is
often used to indicate a thread having joined the rendez-vous point or
a thread sleeping in the poller. By defining it as zero we reflect what
debugging code usually expects (i.e. the only thread is currently
working), and for signal handlers we can use a combination of
threads_harmless_mask and sleeping_threads_mask to detect the polling
cases as well. Similarly do the same with threads_want_rdv_mask which
is less often used though.
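A sketch of the single-threaded fallback (the actual declarations may
differ slightly) :

  #ifndef USE_THREAD
  /* with a single thread, nobody is ever harmless nor waiting for a
   * rendez-vous point
   */
  #define threads_harmless_mask  0
  #define threads_want_rdv_mask  0
  #endif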
Some structures have optional fields which depend on availability of
certain features on certain platforms, and having to stuff lots of
ifdefs in these structs makes them unreadable. Using real values like
ints requires some initialization and adds even more confusion.
Here we take a different approach : we create an empty type called
empty_t to use as a substitute for the real type that is not implemented
and which doesn't contain any value (it's an empty struct). Thus it has
a size of zero but still an address, so a pointer may point to it. It
does not need to be initialized though; some initialization code might
even continue to work and simply do nothing, such as a memset() using
its sizeof, which is zero.
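A minimal sketch of the approach, relying on the GNU C empty-struct
extension and a hypothetical USE_FOO build option :

  typedef struct { } empty_t;   /* sizeof == 0, but it has an address */

  struct some_ctx {
  #ifdef USE_FOO
          struct foo_state foo;
  #else
          empty_t foo;          /* keeps the field name valid, costs nothing */
  #endif
  };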
The clock_gettime() man page says we must check that _POSIX_TIMERS is
defined to a value greater than zero, not just that it's simply defined
so let's fix this right now.
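In other words, the guard becomes something like :

  #if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
  /* clock_gettime() and friends may be used here */
  #endif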
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
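A sketch of the resulting declaration, set together with tid during each
thread's startup (exact names assumed) :

  extern THREAD_LOCAL struct thread_info *ti;   /* &thread_info[tid] */

  /* in the per-thread initialization code, right after tid is assigned */
  ti = &thread_info[tid];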
In order to ease the internal time API, we'll have the threads time always
present even when threads are disabled. Let's make sure clockid_t, and the
minimum clock times are defined even on older or non-compatible systems.
It doesn't make sense to keep this struct thread_info in global.h, it
causes difficulties to access its contents from hathreads.h, let's move
it to hathreads.h where it ought to have been created.
Historically standard.h was the location where we used to (re-)define the
standard set of macros and functions, and to complement the ones missing
on the target OS. Over time it has become a toolbox in itself relying on
many other things, and its definition of LONGBITS is used everywhere else
(e.g. for MAX_THREADS), resulting in painful circular dependencies.
Let's move these few defines (integer sizes) to compat.h where other
similar definitions normally are.
It's a bit too easy to crash by accident when using dump_hex() on any
area. Let's have a function to check if the memory may safely be read
first. This one abuses the stat() syscall checking if it returns EFAULT
or not, in which case it means we're not allowed to read from there. In
other situations it may return other codes or even succeed if the area
happens to contain the name of an existing file. It's important not to abuse it
though and as such it's tested only once per output line.
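A hedged sketch of such a probe (helper name hypothetical) :

  #include <sys/stat.h>
  #include <errno.h>

  /* returns non-zero if <ptr> looks readable, 0 if reading would fault */
  static int may_access(const void *ptr)
  {
          struct stat buf;

          if (stat(ptr, &buf) == -1 && errno == EFAULT)
                  return 0;
          /* any other result (success, ENOENT, ENAMETOOLONG...) means the
           * kernel could at least read the bytes at <ptr>
           */
          return 1;
  }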
This function dumps all existing threads using the thread dump mechanism
then aborts. This will be used by the lockup detection and by debugging
tools.
The current "show threads" command was too limited as it was not possible
to dump other threads' detailed states (e.g. their tasks). This patch
goes further by using thread signals so that each thread can dump its
own state in turn into a shared buffer provided by the caller. Threads
are synchronized using a mechanism very similar to the rendez-vous point
and using this method, each thread can safely dump any of its contents
and the caller can finally report the aggregated ones from the buffer.
It is important to keep in mind that the list of signal-safe functions
is limited, so we take care of only using chunk_printf() to write to a
pre-allocated buffer.
This mechanism is controlled by USE_THREAD_DUMP and is enabled by default
on Linux 2.6.28+. On other platforms it falls back to the previous
solution using the loop and the less precise dump.
At some places we're using a painful ifdef to decide whether to use
sched_yield() or pl_cpu_relax() to relax in loops, this is hardly
exportable. Let's move this to ha_thread_relax() instead and use
this one only.
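A sketch of the wrapper, choosing between the same two primitives the
ifdefs used to select (sched_yield() needs <sched.h>, pl_cpu_relax()
comes from the plock code) :

  static inline void ha_thread_relax(void)
  {
  #if defined(_POSIX_PRIORITY_SCHEDULING)
          sched_yield();
  #else
          pl_cpu_relax();
  #endif
  }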
Instead of having them dump into the trash and initialize it, let's have
the caller initialize a buffer and pass it. This will be convenient to
dump multiple threads at once into a single buffer.
The new function ha_thread_dump() will dump debugging info about all known
threads. The current thread will contain a bit more info. The long-term goal
is to make it possible to use it in signal handlers to improve the accuracy
of some dumps.
The function dumps its output into the trash; since this made it trivial
to add, a new "show threads" command appeared on the CLI.
Gil Bahat reported build issues on Cygwin starting with 1.9 due to a
difference in the way the linker handles the weak symbols there,
causing multiple declarations of ist_lc[] and ist_uc[]. It's likely
that this issue could also happen on any older or non-ELF linker.
This patch addresses this by using literals instead on such platforms,
leaving it to the compiler to merge the constants when it can. On other
platforms the resulting executable is slightly larger due to strings
that could not be merged but this is a minor detail compared to not
being able to build at all.
If this change alone is confirmed to fix these issues, it's safe to
backport to 1.9.
We do have some code paths testing for impossible errors that tend to
be quite confusing, first for maintenance (what to do on such errors,
and how far to guess the bug), second for developers as it tends to
hide the main purpose and expectations of these call places. Also
most of the time impossible errors are ignored by the callers so the
tests are not even usable during debugging.
Let's instead implement a BUG_ON macro which takes a condition, which
if true, will cause a message to be emitted and optionally to crash the
process. Additionally, these calls inserted at various places serve as
hints and documentation for developers to know that such conditions
must absolutely not happen.
This is only enabled when DEBUG_STRICT or DEBUG_STRICT_NOCRASH are set.
As its name implies, DEBUG_STRICT_NOCRASH only performs the test but
does not crash, which can be useful to track some checkpoints.
At the moment nothing uses this code.
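A minimal sketch of such a macro, assuming plain stderr output (the real
implementation may print more context and use different guards) :

  #include <stdio.h>
  #include <stdlib.h>

  #if defined(DEBUG_STRICT)
  #define BUG_ON(cond) do {                                      \
          if (cond) {                                            \
                  fprintf(stderr, "BUG: %s at %s:%d\n",          \
                          #cond, __FILE__, __LINE__);            \
                  abort();                                       \
          }                                                      \
  } while (0)
  #elif defined(DEBUG_STRICT_NOCRASH)
  #define BUG_ON(cond) do {                                      \
          if (cond)                                              \
                  fprintf(stderr, "BUG: %s at %s:%d\n",          \
                          #cond, __FILE__, __LINE__);            \
  } while (0)
  #else
  #define BUG_ON(cond) do { } while (0)
  #endif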
On recent gcc versions with the null-deref checks, ABORT_NOW() rightfully
emits such a warning. But here it's on purpose. Simply changing the memory
address to 1 makes gcc happy.
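Roughly :

  /* dereferencing address 1 still faults on purpose but doesn't trigger
   * gcc's null-dereference warning
   */
  #define ABORT_NOW() (*(volatile int *)1 = 0)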
Some code parts use LIST_ISEMPTY() a lot on list elements to detect
if they were reset consecutive to their removal from a list, but this
test is always confusing as this was initially designed for list heads.
Instead let's have a new macro, LIST_ADDED(), which returns true when
the element is in a list (i.e. it's not "empty").
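The macro boils down to something like :

  /* true when <el> is part of a list, i.e. not "empty"/self-linked */
  #define LIST_ADDED(el) ((el)->n != (el))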
This low-level asm implementation of a double CAS was implemented only
for certain architectures (x86_64, armv7, armv8). When threads are not
used, they were not defined, but since they were called directly from
a few locations, they were causing build issues on certain platforms
with threads disabled. This was addressed in commit f4436e1 ("BUILD:
threads: Add __ha_cas_dw fallback for single threaded builds") by
making it fall back to HA_ATOMIC_CAS() when threads are not defined,
but this actually made the situation worse by breaking other cases.
This patch fixes this by creating a high-level macro HA_ATOMIC_DWCAS()
which is similar to HA_ATOMIC_CAS() except that it's intended to work
on a double word, and which relies on the asm implementations when threads
are in use, and uses its own open-coded implementation when threads are
not used. The 3 call places relying on __ha_cas_dw() were updated to
use HA_ATOMIC_DWCAS() instead.
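For illustration, a sketch of the thread-less fallback, treating each
operand as an array of two longs (the exact macro body in the tree may
differ) :

  #ifndef USE_THREAD
  #define HA_ATOMIC_DWCAS(val, o, n) ({                          \
          long *_v = (long *)(val);                              \
          long *_o = (long *)(o);                                \
          long *_n = (long *)(n);                                \
          int _ret = (_v[0] == _o[0] && _v[1] == _o[1]);         \
          if (_ret) {                                            \
                  _v[0] = _n[0];                                 \
                  _v[1] = _n[1];                                 \
          } else {                                               \
                  _o[0] = _v[0];                                 \
                  _o[1] = _v[1];                                 \
          }                                                      \
          _ret;                                                  \
  })
  #endif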
This change was tested on i586, x86_64, armv7, armv8 with and without
threads with gcc 4.7, armv8 with gcc 5.4 with and without threads, as
well as i586 with gcc-3.4 without threads. It will need to be backported
to 1.9 along with the fix above to fix build on armv7 with threads
disabled.
The following macros are now defined for openssl < 1.1 so that we
can remove the code performing direct access to the structures :
BIO_get_data(), BIO_set_data(), BIO_set_init(), BIO_meth_free(),
BIO_meth_new(), BIO_meth_set_gets(), BIO_meth_set_puts(),
BIO_meth_set_read(), BIO_meth_set_write(), BIO_meth_set_create(),
BIO_meth_set_ctrl(), BIO_meth_set_destroy()
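For instance, on pre-1.1 versions where the BIO structure is still public,
a few of them can be mapped onto direct field accesses, roughly :

  #if OPENSSL_VERSION_NUMBER < 0x10100000L
  #define BIO_get_data(b)      ((b)->ptr)
  #define BIO_set_data(b, v)   ((b)->ptr = (v))
  #define BIO_set_init(b, v)   ((b)->init = (v))
  #endif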
__ha_cas_dw() is used in fd_rm_from_fd_list() and when built without
USE_THREAD=1 the linker fails to find __ha_cas_dw(). Add a definition
of __ha_cas_dw() for the #ifndef USE_THREAD case.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit from an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows us to remove no less than 8 such ifdef blocks
and make include blocks more readable.
Since we're providing a compatibility layer for multiple OpenSSL
implementations and their derivatives, it is important that no C file
directly includes openssl headers; they must only be included via
openssl-compat instead. As a bonus this also gets rid of redundant
complex rules for inclusion of certain files (engines etc).
Some defines like OPENSSL_VERSION or X509_getm_notBefore() have nothing
to do in ssl_sock and must move to openssl-compat.h so that they are
consistently shared by the whole code. A warning in the code was added
against wild additions of macros there.
Now we atomically allocate the my_regex struct within function
regex_comp() and compile the regex or free both in case of failure. The
pointer to the allocated my_regex struct is returned directly. The
my_regex* argument to regex_comp() is removed.
Function regex_free() was modified so that it systematically frees the
my_regex entry. The function does nothing when called with a NULL as
argument (like free()). This avoids the existing risk of not properly
freeing the initialized area.
Other structures are also updated in order to be compatible (the ones
related to Lua and action rules).
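A usage sketch of the new API (the error-pointer argument and flags are
assumptions here) :

  struct my_regex *re;
  char *err = NULL;

  re = regex_comp("^foo[0-9]+$", 1 /* case sensitive */, 1 /* capture */, &err);
  if (!re) {
          /* compilation or allocation failed; <err> may hold a message */
          free(err);
          return 0;
  }
  /* ... use <re> ... */
  regex_free(re);   /* also safe if <re> happened to be NULL */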
This implements support for the new API which relies on a call to
setsockopt().
On systems that support it (currently, only Linux >= 4.11), this enables
using TCP fast open when connecting to servers.
Please note that you should use the retry-on "conn-failure", "empty-response"
and "response-timeout" keywords, or the request won't be able to be retried
on failure.
Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>
The same way we have HA_ATOMIC_STORE(), implement HA_ATOMIC_LOAD().
This should be backported to 1.8 and 1.9, as we need it for a bug fix
in port ranges.
This patch implements the sampling and load-balancing of log servers
configured with the new "sample" keyword implemented by this commit:
'MINOR: log: Add "sample" new keyword to "log" lines'.
As the list of ranges used to sample and balance the log is ordered, we
only have to maintain the ->curr_idx member of the smp_info struct, which
is the index of the current sample, and check whether it belongs to the
current range to decide if the message must be sent to the log server or
not.
As specified in the function comment, the function h1_skip_chunk_crlf() must not
change anything and return zero if not enough data are available. This must
include the case where there is no data at all. On this point, it must do the
same as the other h1 parsing functions. This bug became visible with commit
91f77d599 ("BUG/MINOR: mux-h1: Process input even if the input buffer is
empty").
This patch must be backported to 1.9.
Older compilers (like gcc-3.4) warn about the use of "const" on functions
returning a struct, which makes sense since the return may only be copied :
include/common/htx.h:233: warning: type qualifiers ignored on function return type
Let's simply drop "const" here.
Move the definition of the various _HA_ATOMIC_* macros that use
__atomic_* in the #if GCC_VERSION >= 4.7, not just after it, so that we
can build with older versions of gcc again.
The upgrade is performed when an H2 preface is detected when the first request
on a connection is parsed. The CS is destroyed by setting the EOS flag on it. A
special flag is added on the HTX message to warn the HTX analyzers that the
stream will be closed because of an upgrade. This way, no error and no log are
emitted. When the H1 mux is released, we create an H2 mux, without any CS and
passing the buffer with the unparsed H2 preface.
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.
If this flag is set, the function h1_parse_next_connection_header() is
called in a loop during parsing instead of h1_parse_connection_header().
No need to backport this patch.
When creating a new initcall, don't forget to define the symbols, as it may
not be done automatically and that would lead to undefined symbols.
This should be backported to 1.9.
I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.
As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:
https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html
Also it's even clearer here in the autoconf doc :
https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html
"The C99 standard says that inttypes.h includes stdint.h, so there's
no need to include stdint.h separately in a standard environment.
Some implementations have inttypes.h but not stdint.h (e.g., Solaris
7), but we don't know of any implementation that has stdint.h but not
inttypes.h"
The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.
On platforms like AIX or Cygwin it seems difficult to figure the
beginning and end of sections as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of singly linked lists (one per init stage)
which are fed using constructors for each registration call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :
  $ size haproxy.sections haproxy.constructor
     text    data     bss     dec    hex filename
  4060312  249176 1457652 5767140 57ffe4 haproxy.sections
  4062862  260408 1457652 5780922 5835ba haproxy.constructor
This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.
Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.
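For illustration, a stripped-down sketch of the constructor-based approach,
with a single stage and hypothetical names (the real code generates the
descriptor and constructor from a macro keyed on the line number) :

  struct initcall {
          void (*fn)(void);
          struct initcall *next;
  };

  static struct initcall *init_stage_head;  /* one head per stage in reality */

  static void init_pools(void) { /* ... */ }

  /* one static descriptor and one static constructor per registered call */
  static struct initcall ic_init_pools = { init_pools, NULL };

  __attribute__((constructor))
  static void ic_register_init_pools(void)
  {
          ic_init_pools.next = init_stage_head;
          init_stage_head = &ic_init_pools;
  }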
Older Solaris and AIX versions do not have unsetenv(). This adds a
fairly simple implementation which scans the environment, for use
with those systems. It simply requires passing the define in
the "DEFINE" macro at build time like this :
DEFINE="-Dunsetenv=my_unsetenv"
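A sketch of such a scan-based replacement (simplified, no checks on a NULL
or '='-containing name) :

  #include <string.h>

  extern char **environ;

  static int my_unsetenv(const char *name)
  {
          size_t len = strlen(name);
          char **p = environ;

          while (*p) {
                  if (strncmp(*p, name, len) == 0 && (*p)[len] == '=') {
                          /* shift all following entries down by one */
                          char **q = p;
                          do
                                  q[0] = q[1];
                          while (*q++);
                          continue;   /* re-check the slot now at <p> */
                  }
                  p++;
          }
          return 0;
  }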
http_known_methods, HTTP_100 and HTTP_103 were not declared extern and
as such were multiply defined since they were in http.h. There was
apparently no more side effect but it may depend on the platform and
the linker.
This needs to be backported to 1.9.
Injecting on a saturated listener started to exhibit some deadlocks
again between LIST_POP_LOCKED() and LIST_DEL_LOCKED(). Olivier found
it was due to a leftover from a previous debugging session. This patch
fixes it.
This will have to be backported if the other LIST_*_LOCKED() patches
are backported.
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
even though most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
Add variants of the HA_ATOMIC* macros, prefixed with a _, that do the
atomic operation with no barrier generated by the compiler. It is expected
the developer adds barriers manually if needed.
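A sketch of the pattern on a single operation, assuming the gcc __atomic
builtins :

  /* barrier-generating version (sequentially consistent) */
  #define HA_ATOMIC_ADD(val, i)   __atomic_add_fetch(val, i, __ATOMIC_SEQ_CST)

  /* "_" version: same operation, no ordering beyond atomicity */
  #define _HA_ATOMIC_ADD(val, i)  __atomic_add_fetch(val, i, __ATOMIC_RELAXED)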
When using the new __atomic* API, ask the compiler to generate barriers.
A variant of those functions that doesn't generate barriers will be added later.
Before that, using HA_ATOMIC* would not generate any barrier, and some parts
of the code should be reviewed and missing barriers should be added.
This should probably be backported to 1.8 and 1.9.
Implement __ha_barrier functions to be used when trying to protect data
modified by atomic operations (except when using HA_ATOMIC_STORE).
On Intel, atomic operations use either the LOCK prefix or xchg, and both
act as a full barrier, so there's no need to add an extra barrier.
We already have my_ffsl() to find the lowest bit set in a word, and
this patch implements the search for the highest bit set in a word.
On x86 it uses the bsr instruction and on other architectures it
uses an efficient implementation.
It turns out that we call LIST_DEL+LIST_INIT very frequently and that
the compiler doesn't know what pointers get modified in the e->n->p
and e->p->n dance, so when LIST_INIT() is called, it reloads these
pointers, which is quite a bit of a mess in terms of performance.
This patch adds LIST_DEL_INIT() to perform the two operations at once
using local temporary variables so that the compiler knows these
pointers are left unaffected.
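A sketch of the combined macro, using local temporaries so the compiler
knows the neighbours' pointers are not affected by the re-initialization :

  #define LIST_DEL_INIT(el) ({                                   \
          typeof(el) __e = (el);                                 \
          typeof(__e->n) __n = __e->n;                           \
          typeof(__e->p) __p = __e->p;                           \
          __n->p = __p;                                          \
          __p->n = __n;                                          \
          __e->n = __e->p = __e;                                 \
  })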
Well, that's becoming embarrassing. Now this fixes commit 4ef6801c
("BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last
element") which itself tried to fix commit 285192564. This fix only
works under low contention and was tested with the listener's queue.
With the idle conns it's obvious that it's still wrong since adding
more than one element to the list leaves a LLIST_BUSY pointer into
the list's head. This was visible when accumulating idle connections
in a server's list.
This new version of the fix almost goes back to the original code,
except that since then we addressed issues with expectedly idempotent
operations that were not. Now the code has been verified on paper again
and has survived 300 million connections spread over 4 threads.
This will have to be backported if the commit above is backported.
As seen with Olivier, in the end the fix in commit 285192564 ("BUG/MEDIUM:
list: fix LIST_POP_LOCKED's removal of the last pointer") is wrong,
the code there was right but the bug was triggered by another bug in
LIST_ADDQ_LOCKED() which doesn't properly update the list's head by
inserting in the wrong order.
This will have to be backported if the commit above is backported.
There is a very difficult to reproduce race in the listener's accept
code, which is much easier to reproduce once connection limits are
properly enforced. It's an ABBA lock issue :
- the following functions take l->lock then lq_lock :
disable_listener, pause_listener, listener_full, limit_listener,
do_unbind_listener
- the following ones take lq_lock then l->lock :
resume_listener, dequeue_all_listeners
This is because __resume_listener() only takes the listener's lock
and expects to be called with lq_lock held. The problem can easily
happen when listener_full() and limit_listener() are called a lot
while in parallel another thread releases sessions for the same
listener using listener_release() which in turn calls resume_listener().
This scenario is more prevalent in 2.0-dev since the removal of the
accept lock in listener_accept(). However in 1.9 and before, a different
but extremely unlikely scenario can happen :
  thread1                             thread2
  ............................        enter listener_accept()
  limit_listener()
  ............................        long pause before taking the lock
  session_free()
    dequeue_all_listeners()
      lock(lq_lock) [1]
  ............................        try_lock(l->lock) [2]
      __resume_listener()
        spin_lock(l->lock) =>WAIT[2]
  ............................        accept()
                                      l->accept()
                                      nbconn==maxconn =>
                                        listener_full()
                                      state==LI_LIMITED =>
                                        lock(lq_lock) =>DEADLOCK[1]!
In practice it is almost impossible to trigger it because it requires
to limit both on the listener's maxconn and the frontend's rate limit,
at the same time, and to release the listener when the connection rate
goes below the limit between poll() returns the FD and the lock is
taken (a few nanoseconds). But maybe with threads competing on the
same core it has more chances to appear.
This patch removes the lq_lock and replaces it with a lockless queue
for the listener's wait queue (well, technically speaking a self-locked
queue) brought by commit a8434ec14 ("MINOR: lists: Implement locked
variations.") and its few subsequent fixes. This relieves us from the
need of the lq_lock and removes the deadlock. It also gets rid of the
distinction between __resume_listener() and resume_listener() since the
only difference was the lq_lock. All listener removals from the list
are now unconditional to avoid races on the state. It's worth noting
that the list used to never be initialized and that it used to work
only thanks to the state tests, so the initialization has now been
added.
This patch must carefully be backported to 1.9 and very likely 1.8.
It is mandatory to be careful about replacing all manipulations of
l->wait_queue, global.listener_queue and p->listener_queue.
These operations previously used to return a "locked" element, which is
a constraint when multiple threads try to delete the same element, because
the second one will block indefinitely. Instead, let's make sure that both
LIST_DEL_LOCKED() and LIST_POP_LOCKED() always reinitialize the element
after deleting it. This ensures that the second thread will immediately
unblock and succeed with the removal. It also secures the pop vs delete
competition that may happen when trying to remove an element that's about
to be dequeued.
Commit a8434ec14 ("MINOR: lists: Implement locked variations.")
introduced locked lists which use the elements pointers as locks
for concurrent operations. Under heavy stress the lists occasionally
fail. The cause is a missing barrier at some points when updating
the list element and the head : nothing prevents the compiler (or
CPU) from updating the list head first before updating the element,
making another thread jump to a wrong location. This patch simply
adds the missing barriers before these two operations.
This will have to be backported if the commit above is backported.
There was a typo making the last updated pointer be the pre-last element's
prev instead of the last element's prev. It didn't show up during early
tests because the contention is very rare on this one and it's implicitly
recovered when updating the pointers to go to the next element, but it was
clearly visible in the listener_accept() tests by having all threads block
on LIST_POP_LOCKED() with n==p==LLIST_BUSY.
This will have to be backported if commit a8434ec14 ("MINOR: lists:
Implement locked variations.") is backported.
Commit a8434ec14 ("MINOR: lists: Implement locked variations.")
introduced locked lists which use the elements pointers as locks
for concurrent operations. A copy-paste typo in LIST_ADDQ_LOCKED()
causes corruption in the list in case the next pointer is already
held, as it restores the previous pointer into the next one. It
may impact the server pools.
This will have to be backported if the commit above is backported.
Threads have long matured by now, still for most users their usage is
not trivial. It's about time to enable them by default on platforms
where we know the number of CPUs bound. This patch does this, it counts
the number of CPUs the process is bound to upon startup, and enables as
many threads by default. Of course, "nbthread" still overrides this, but
if it's not set the default behaviour is to start one thread per CPU.
The default number of threads is reported in "haproxy -vv". Simply using
"taskset -c" is now enough to adjust this number of threads so that there
is no more need for playing with cpu-map. And thanks to the previous
patches on the listener, the vast majority of configurations will not
need to duplicate "bind" lines with the "process x/y" statement anymore
either, so a simple config will automatically adapt to the number of
processors available.
Function mask_find_rank_bit() returns the bit position in mask <m> of
the nth bit set of rank <r>, between 0 and LONGBITS-1 included, starting
from the left. For example ranks 0,1,2,3 for mask 0x55 will be 6, 4, 2
and 0 respectively. This algorithm is based on a popcount variant and
is described here : https://graphics.stanford.edu/~seander/bithacks.html.
In LIST_DEL_LOCKED(), initialize p2 to NULL, and only attempt to set it back
to its previous value if we had a previous element, and thus p2 is non-NULL.
Implement LIST_ADD_LOCKED(), LIST_ADDQ_LOCKED(), LIST_DEL_LOCKED() and
LIST_POP_LOCKED().
LIST_ADD_LOCKED, LIST_ADDQ_LOCKED and LIST_DEL_LOCKED work the same as
LIST_ADD, LIST_ADDQ and LIST_DEL, except before any manipulation it locks
the relevant elements of the list, so it's safe to manipulate the list
with multiple threads.
LIST_POP_LOCKED() removes the first element from the list, and returns its
data.
This function is useful to parse strings made of unsigned integers
and to allocate a C array of unsigned integers from there.
For instance this function allocates this array { 1, 2, 3, 4, } from
this string: "1.2.3.4".
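A usage sketch, with the function name and prototype assumed :

  unsigned int *nums = NULL;
  size_t count = 0;

  if (parse_dotted_uints("1.2.3.4", &nums, &count)) {
          /* nums now points to { 1, 2, 3, 4 } and count == 4 */
  }
  free(nums);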
The function htx_drain() can now be used to drain data from an HTX message.
It will be used by other commits to fix bugs, so it must be backported to 1.9.
1xx responses do not work in HTTP/2 when HTX is enabled. First of all, when
a response is parsed, only one HEADERS frame is expected. So when an interim
response is received, the flag H2_SF_HEADERS_RCVD is set and the next HEADERS
frame (for another interim response or the final one) is parsed as a trailers
one. Then when the response is sent, because an EOM block is found at the end
of the interim HTX response, the ES flag is added on the frame, closing the
stream too early. Here, it is a design problem of the HTX. Interim responses
are considered as full messages, leading to some ambiguities when HTX messages
are processed. This will not be fixed now, but we need to keep it in mind for
future improvements.
To fix the parsing bug, the flag H2_MSGF_RSP_1XX is added when the response
headers are decoded. When this flag is set, an EOM block is added into the HTX
message, despite the fact that there is no ES flag on the frame. And we don't
set the flag H2_SF_HEADERS_RCVD on the corresponding H2S. So the next HEADERS
frame will not be parsed as a trailers one.
To fix the sending bug, the ES flag is not set on the frame when an interim
response is processed and the flag H2_SF_HEADERS_SENT is not set on the
corresponding H2S.
This patch must be backported to 1.9.
The existing threading flag in the 51Degrees API
(FIFTYONEDEGREES_NO_THREADING) has now been mapped to the HAProxy
threading flag (USE_THREAD), and the 51Degrees module code has been made
thread safe.
In Pattern, the cache is now locked with a spin lock from hathreads.h
using a new label 'OTHER_LOCK'. The workset pool is now created with the
same size as the number of threads to avoid any time spent waiting on a
workset.
In Hash Trie, the global device offsets structure is only used in single
threaded operation. Multi threaded operation creates a new offsets
structure in each thread.
For some embedded systems, it's pointless to have arrays sized for 32 or
even 64 processes when it's known that much fewer processes will be
used in the worst case. Let's introduce this MAX_PROCS define which
contains the highest number of processes allowed to run at once. It
still defaults to LONGBITS but may be lowered.
We'll call popcount() more often so better use a parallel method
than an iterative one. One optimal design is proposed at the site
below. It requires a fast multiplication though, but even without one
it will still be faster than the iterative one, and all relevant
64 bit platforms do have a multiply unit.
https://graphics.stanford.edu/~seander/bithacks.html
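The referenced method boils down to the classic SWAR counting, roughly :

  /* parallel popcount for an unsigned long (from the bithacks page) */
  static inline unsigned int my_popcountl(unsigned long a)
  {
          a =  a - ((a >> 1) & ~0UL / 3);
          a = (a & ~0UL / 15 * 3) + ((a >> 2) & ~0UL / 15 * 3);
          a = (a + (a >> 4)) & ~0UL / 255 * 15;
          return (unsigned long)(a * (~0UL / 255)) >>
                 (sizeof(unsigned long) - 1) * 8;
  }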
When compiling with DEBUG_FAIL_ALLOC, add a new option, tune.fail-alloc,
that gives the percentage of chances an allocation fails.
This is useful to check that allocation failures are always handled
gracefully.
The previous patch clarifies the fact that the htx pointer is never null
anywhere in the code. This test for a null pointer will never match and
didn't even catch the pointer 1 before the fix on b_is_null(), but it
confuses the compiler, letting it think that any dereference made to this
pointer after this test could actually mean we're dereferencing a null one.
Let's now drop this test. This saves us from having to add impossible tests
everywhere to avoid the warning.
This should be backported to 1.9 if the b_is_null() patch is backported.
Update the comments above htxbuf() and htx_from_buf() to make it clear
that they always return valid htx pointers so that callers know they do
not have to test them. This is only true after the fix on b_is_null()
which was the only known corner case.
This should be backported to 1.9 if the b_is_null() patch is backported.
In b_is_null(), make sure we return 1 if the buffer is waiting for its
allocation, as users assume there's memory allocated if b_is_null() returns
0.
The indirect impact of not having this was that htxbuf() would not match
b_is_null() for a buffer waiting for an allocation, and would thus return
the value 1 for the htx pointer, causing various crashes under low memory
condition.
Note that this patch makes gcc versions 6 and above report two null-deref
warnings in proto_htx.c since htx_is_empty() continues to check for a null
pointer without knowing that this is protected by the test on b_is_null().
This is addressed by the following patches.
This should be backported to 1.9.
The new function h2_frame_check() checks the protocol limits for the
received frame (length, ID, direction) and returns a verdict made of
a connection error code. The purpose is to be able to validate any
frame regardless of the state and the ability to call the frame handler,
and to emit a GOAWAY early in this case.
There's some value in being able to limit MAX_THREADS, either to save
precious resources in embedded environments, or to protect certain
deployments against accidentally incorrect settings.
With this patch, if MAX_THREADS is defined at build time, it will be
used. However, given that LONGBITS is not a macro but is defined
according to sizeof(long), we can't check the value range at build
time and instead we need to perform the check at early boot time.
However, the compiler is able to optimize away the constant comparisons
and doesn't even emit the check code when values are correct.
The output message regarding threading support was improved to report
the number of threads.