Commit Graph

882 Commits

Author SHA1 Message Date
Willy Tarreau
d678805783 REORG: include: move version.h to haproxy/
Few files were affected. The release scripts was updated.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
William Lallemand
9fc6c97fb3 BUG/MINOR: mworker: fix a memleak when execvp() failed
Free next_argv when execvp() failed.

Must be backported as far as 1.8.

Should fix issue #668.
2020-06-08 10:01:13 +02:00
William Lallemand
0041741ef7 BUG/MEDIUM: mworker: fix the reload with an -- option
When HAProxy is started with a '--' option, all following parameters are
considered configuration files. You can't add new options after a '--'.

The current reload system of the master-worker adds extra options at the
end of the arguments list. Which is a problem if HAProxy was started wih
'--'.

This patch fixes the issue by copying the new option at the beginning of
the arguments list instead of the end.

This patch must be backported as far as 1.8.
2020-06-05 14:30:53 +02:00
William Lallemand
a6b3249935 BUG/MINOR: init: -S can have a parameter starting with a dash
There is no reason the -S option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)

This can be backported only if the previous patch which fixes
copy_argv() is backported too.

Could be backported as far as 1.9.
2020-06-05 14:30:49 +02:00
William Lallemand
4f71d304aa BUG/MINOR: init: -x can have a parameter starting with a dash
There is no reason the -x option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)

This can be backported only if the previous patch which fixes
copy_argv() is backported too.

Could be backported as far as 1.8.
2020-06-05 14:30:45 +02:00
William Lallemand
df6c5a8ffa BUG/MEDIUM: mworker: fix the copy of options in copy_argv()
The copy_argv() function, which is used to copy and remove some of the
arguments of the command line in order to re-exec() the master process,
is poorly implemented.

The function tries to remove the -x and the -sf/-st options but without
taking into account that some of the options could take a parameter
starting with a dash.

In issue #644, haproxy starts with "-L -xfoo" which is perfectly
correct. However, the re-exec is done without "-xfoo" because the master
tries to remove the "-x" option. Indeed, the copy_argv() function does
not know how much arguments an option can have, and just assume that
everything starting with a dash is an option. So haproxy is exec() with
"-L" but without a parameter, which is wrong and leads to the exit of
the master, with usage().

To fix this issue, copy_argv() must know how much parameters an option
takes, and copy or skip the parameters correctly.

This fix is a first step but it should evolve to a cleaner way of
declaring the options to avoid deduplication of the parsing code, so we
avoid new bugs.

Should be backported with care as far as 1.8, by removing the options
that does not exists in the previous versions.
2020-06-05 14:30:34 +02:00
Willy Tarreau
d645574fd4 MINOR: soft-stop: let the first stopper only signal other threads
When the first thread stops and wakes others up, it's possible some of
them will also start to wake others in parallel. Let's make give this
notification task to the very first one instead since it's enough and
can reduce the amount of needless (though harmless) wakeup calls.
2020-05-13 14:30:25 +02:00
Willy Tarreau
d7a6b2f742 BUG/MINOR: soft-stop: always wake up waiting threads on stopping
Currently the soft-stop can lead to old processes remaining alive for as
long as two seconds after receiving a soft-stop signal. What happens is
that when receiving SIGUSR1, one thread (usually the first one) wakes up,
handles the signal, sets "stopping", goes into runn_poll_loop(), and
discovers that stopping is set, so its also sets itself in the
stopping_thread_mask bit mask. After this it sees that other threads are
not yet willing to stop, so it continues to wait.

From there, other threads which were waiting in poll() expire after one
second on poll timeout and enter run_poll_loop() in turn. That's already
one second of wait time. They discover each in turn that they're stopping
and see that other threads are not yet stopping, so they go back waiting.

After the end of the first second, all threads know they're stopping and
have set their bit in stopping_thread_mask. It's only now that those who
started to wait first wake up again on timeout to discover that all other
ones are stopping, and can now quit. One second later all threads will
have done it and the process will quit.

This is effectively strictly larger than one second and up to two seconds.

What the current patch does is simple, when the first thread stops, it sets
its own bit into stopping_thread_mask then wakes up all other threads to do
also set theirs. This kills the first second which corresponds to the time
to discover the stopping state. Second, when a thread exists, it wakes all
other ones again because some might have gone back sleeping waiting for
"jobs" to go down to zero (i.e. closing the last connection). This kills
the last second of wait time.

Thanks to this, as SIGUSR1 now acts instantly again if there's no active
connection, or it stops immediately after the last connection has left if
one was still present.

This should be backported as far as 2.0.
2020-05-13 14:11:18 +02:00
Christopher Faulet
e5870d872b MAJOR: checks: Implement HTTP check using tcp-check rules
HTTP health-checks are now internally based on tcp-checks. Of course all the
configuration parsing of the "http-check" keyword and the httpchk option has
been rewritten. But the main changes is that now, as for tcp-check ruleset, it
is possible to perform several send/expect sequences into the same
health-checks. Thus the connect rule is now also available from HTTP checks, jst
like set-var, unset-var and comment rules.

Because the request defined by the "option httpchk" line is used for the first
request only, it is now possible to set the method, the uri and the version on a
"http-check send" line.
2020-04-27 09:39:38 +02:00
Christopher Faulet
8892e5d30b BUG/MEDIUM: server/checks: Init server check during config validity check
The options and directives related to the configuration of checks in a backend
may be defined after the servers declarations. So, initialization of the check
of each server must not be performed during configuration parsing, because some
info may be missing. Instead, it must be done during the configuration validity
check.

Thus, callback functions are registered to be called for each server after the
config validity check, one for the server check and another one for the server
agent-check. In addition deinit callback functions are also registered to
release these checks.

This patch should be backported as far as 1.7. But per-server post_check
callback functions are only supported since the 2.1. And the initcall mechanism
does not exist before the 1.9. Finally, in 1.7, the code is totally
different. So the backport will be harder on older versions.
2020-04-27 09:39:37 +02:00
Christopher Faulet
f61f33a1b2 BUG/MINOR: checks: Respect the no-check-ssl option
This options is used to force a non-SSL connection to check a SSL server or to
invert a check-ssl option inherited from the default section. The use_ssl field
in the check structure is used to know if a SSL connection must be used
(use_ssl=1) or not (use_ssl=0). The server configuration is used by default.

The problem is that we cannot distinguish the default case (no specific SSL
check option) and the case of an explicit non-SSL check. In both, use_ssl is set
to 0. So the server configuration is always used. For a SSL server, when
no-check-ssl option is set, the check is still performed using a SSL
configuration.

To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value :

  * 0  = use server config
  * 1  = force SSL
  * -1 = force non-SSL

The same is done for the server parameter. It is not really necessary for
now. But it is a good way to know is the server no-ssl option is set.

In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl
to 1 for a check. Instead the flag is directly tested to prepare or destroy the
server SSL context.

This patch should be backported as far as 1.8.
2020-04-27 09:39:37 +02:00
Christopher Faulet
8acb1284bc MINOR: checks: Add a way to send custom headers and payload during http chekcs
The 'http-check send' directive have been added to add headers and optionnaly a
payload to the request sent during HTTP healthchecks. The request line may be
customized by the "option httpchk" directive but there was not official way to
add extra headers. An old trick consisted to hide these headers at the end of
the version string, on the "option httpchk" line. And it was impossible to add
an extra payload with an "http-check expect" directive because of the
"Connection: close" header appended to the request (See issue #16 for details).

So to make things official and fully support payload additions, the "http-check
send" directive have been added :

    option httpchk POST /status HTTP/1.1

    http-check send hdr Content-Type "application/json;charset=UTF-8" \
        hdr X-test-1 value1 hdr X-test-2 value2 \
        body "{id: 1, field: \"value\"}"

When a payload is defined, the Content-Length header is automatically added. So
chunk-encoded requests are not supported yet. For now, there is no special
validity checks on the extra headers.

This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and
as far as possible, it may be backported, at least as far as 1.8.
2020-04-27 09:39:37 +02:00
Tim Duesterhus
dfad6a41ad MINOR: version: Show uname output in display_version()
This patch adds the sysname, release, version and machine fields from
the uname results to the version output. It intentionally leaves out the
machine name, because it is usually not useful and users might not want to
expose their machine names for privacy reasons.

May be backported if it is considered useful for debugging.
2020-04-18 22:04:29 +02:00
Willy Tarreau
bb1b63c079 MINOR: init: report the compiler version in haproxy -vv
Some portability issues were met a few times in the past depending on
compiler versions, but this one was not reported in haproxy -vv output
while it's trivial to add it. This patch tries to be the most accurate
by explicitly reporting the clang version if detected, otherwise the
gcc version.
2020-04-15 17:00:03 +02:00
Willy Tarreau
3eb10b8e98 MINOR: init: add -dW and "zero-warning" to reject configs with warnings
Since some systems switched to service managers which hide all warnings
by default, some users are not aware of some possibly important warnings
and get caught too late with errors that could have been detected earlier.

This patch adds a new global keyword, "zero-warning" and an equivalent
command-line option "-dW" to refuse to start in case any warning is
detected. It is recommended to use these with configurations that are
managed by humans in order to catch mistakes very early.
2020-04-15 16:42:39 +02:00
Willy Tarreau
bebd212064 MINOR: init: report in "haproxy -c" whether there were warnings or not
This helps quickly checking if the config produces any warning. For
this we reuse the "warned" bit field to add a new WARN_ANY bit that is
set by ha_warning(). The rest of the bit field was also cleaned from
unused bits.
2020-04-15 16:42:00 +02:00
Willy Tarreau
95abd5be9f CLEANUP: haproxy/threads: don't check global_tasks_mask twice
In run_thread_poll_loop() we test both for (global_tasks_mask & tid_bit)
and thread_has_tasks(), but the former is useless since this test is
already part of the latter.
2020-03-23 09:33:32 +01:00
Willy Tarreau
4f46a354e6 BUG/MINOR: haproxy/threads: close a possible race in soft-stop detection
Commit 4b3f27b ("BUG/MINOR: haproxy/threads: try to make all threads
leave together") improved the soft-stop synchronization but it left a
small race open because it looks at tasks_run_queue, which can drop
to zero then back to one while another thread picks the task from the
run queue to insert it into the tasklet_list. The risk is very low but
not null. In addition the condition didn't consider the possible presence
of signals in the queue.

This patch moves the stopping detection just after the "wake" calculation
which already takes care of the various queues' sizes and signals. It
avoids needlessly duplicating these tests.

The bug was discovered during a code review but will probably never be
observed. This fix may be backported to 2.1 and 2.0 along with the commit
above.
2020-03-23 09:27:28 +01:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Willy Tarreau
2e8ab6b560 MINOR: use DISGUISE() everywhere we deliberately want to ignore a result
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
2020-03-14 11:04:49 +01:00
Willy Tarreau
4b3f27b67f BUG/MINOR: haproxy/threads: try to make all threads leave together
There's a small issue with soft stop combined with the incoming
connection load balancing. A thread may dispatch a connection to
another one at the moment stopping=1 is set, and the second one could
stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(),
without ever picking these connections from the queue. This is
visible in that it may occasionally cause a connection drop on
reload since no remaining thread will ever pick that connection
anymore.

In order to address this, this patch adds a stopping_thread_mask
variable by which threads acknowledge their willingness to stop
when their runqueue is empty. And all threads will only stop at
this moment, so that if finally some late work arrives in the
thread's queue, it still has a chance to process it.

This should be backported to 2.1 and 2.0.
2020-03-12 19:17:19 +01:00
Willy Tarreau
f8ea00e05e BUG/MINOR: haproxy: always initialize sleeping_thread_mask
Surprizingly the variable was never initialized, though on most
platforms it's zeroed at boot, and it is relatively harmless
anyway since in the worst case the bits are updated around poll().

This was introduced by commit 79321b95a8 and needs to be backported
as far as 1.9.
2020-03-12 19:09:46 +01:00
Olivier Houchard
8676514d4e MINOR: servers: Kill priv_conns.
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
2020-03-11 19:20:01 +01:00
Willy Tarreau
304e17eb88 MEDIUM: init: always try to push the FD limit when maxconn is set from -m
When a maximum memory setting is passed to haproxy and maxconn is not set
and ulimit-n is not set, it is expected that maxconn will be set to the
highest value permitted by this memory setting, possibly affecting the
FD limit.

When maxconn was changed to be deduced from the current process's FD limit,
the automatic setting above was partially lost because it now remains
limited to the current FD limit in addition to being limited to the
memory usage. For unprivileged processes it does not change anything,
but for privileged processes the difference is important. Indeed, the
previous behavior ensured that the new FD limit could be enforced on
the process as long as the user had the privilege to do so. Now this
does not happen anymore, and some people rely on this for automatic
sizing in VM environments.

This patch implements the ability to verify if the setting will be
enforceable on the process or not. First it computes maxconn based on
the memory limits alone, then checks if the process is willing to accept
them, otherwise tries again by respecting the process' hard limit.

Thanks to this we now have the best of the pre-2.0 behavior and the
current one, in that privileged users will be able to get as high a
maxconn as they need just based on the memory limit, while unprivileged
users will still get as high a setting as permitted by the intersection
of the memory limit and the process' FD limit.

Ideally, after some observation period, this patch along with the
previous one "MINOR: init: move the maxsock calculation code to
compute_ideal_maxsock()" should be backported to 2.1 and 2.0.

Thanks to Baptiste for raising the issue.
2020-03-10 18:08:11 +01:00
Willy Tarreau
a409f30d09 MINOR: init: move the maxsock calculation code to compute_ideal_maxsock()
The maxsock value is currently derived from global.maxconn and a few other
settings, some of which also depend on global.maxconn. This makes it
difficult to check if a limit is already too high or not during the maxconn
automatic sizing.

Let's move this code into a new function, compute_ideal_maxsock() which now
takes a maxconn in argument. It performs the same operations and returns
the maxsock value if global.maxconn were to be set to that value. It now
replaces the previous code to compute maxsock.
2020-03-10 18:08:11 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
2020-03-08 10:09:02 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
6c3a681bd6 BUG/MEDIUM: random: initialize the random pool a bit better
Since the UUID sample fetch was created, some people noticed that in
certain virtualized environments they manage to get exact same UUIDs
on different instances started exactly at the same moment. It turns
out that the randoms were only initialized to spread the health checks
originally, not to provide "clean" randoms.

This patch changes this and collects more randomness from various
sources, including existing randoms, /dev/urandom when available,
RAND_bytes() when OpenSSL is available, as well as the timing for such
operations, then applies a SHA1 on all this to keep a 160 bits random
seed available, 32 of which are passed to srandom().

It's worth mentioning that there's no clean way to pass more than 32
bits to srandom() as even initstate() provides an opaque state that
must absolutely not be tampered with since known implementations
contain state information.

At least this allows to have up to 4 billion different sequences
from the boot, which is not that bad.

Note that the thread safety was still not addressed, which is another
issue for another patch.

This must be backported to all versions containing the UUID sample
fetch function, i.e. as far as 2.0.
2020-03-07 06:11:11 +01:00
Willy Tarreau
b1beaa302c BUG/MINOR: init: make the automatic maxconn consider the max of soft/hard limits
James Stroehmann reported something working as documented but that can
be considered as a regression in the way the automatic maxconn is
calculated from the process' limits :

  https://www.mail-archive.com/haproxy@formilux.org/msg36523.html

The purpose of the changes in 2.0 was to have maxconn default to the
highest possible value permitted to the user based on the ulimit -n
setting, however the calculation starts from the soft limit, which
can be lower than what users were allowed to with previous versions
where the default value of 2000 would force a higher ulimit -n as
long as it fitted in the hard limit.

Usually this is not noticeable if the user changes the limits, because
quite commonly setting a new value restricts both the soft and hard
values.

Let's instead always use the max between the hard and soft limits, as
we know these values are permitted. This was tried on the following
setup:

  $ cat ulimit-n.cfg
  global
    stats socket /tmp/sock1 level admin
  $ ulimit -n
  1024

Before the change the limits would show like this:

  $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max
  Maxsock: 1023
  Maxconn: 489

After the change the limits are now much better and more in line with
the default settings in earlier versions:

  $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max
  Maxsock: 4095
  Maxconn: 2025

The difference becomes even more obvious when running moderately large
configs with hundreds of checked servers and hundreds of listeners:

  $ cat ulimit-n.cfg
  global
    stats socket /tmp/sock1 level admin

  listen l
    bind :10000-10300
    server-template srv- 300 0.0.0.0 check disabled

          Before   After
  Maxsock  1024    4096
  Maxconn  189     1725

This issue is tagged as minor since a trivial config change fixes it,
but it would help new users to have it backported as far as 2.0.
2020-03-06 10:49:55 +01:00
Carl Henrik Lunde
f91ac19299 OPTIM: startup: fast unique_id allocation for acl.
pattern_finalize_config() uses an inefficient algorithm which is a
problem with very large configuration files. This affects startup, and
therefore reload time. When haproxy is deployed as a router in a
Kubernetes cluster the generated configuration file may be large and
reloads are frequently occuring, which makes this a significant issue.

The old algorithm is O(n^2)
* allocate missing uids - O(n^2)
* sort linked list - O(n^2)

The new algorithm is O(n log n):
* find the user allocated uids - O(n)
* store them for efficient lookup - O(n log n)
* allocate missing uids - n times O(log n)
* sort all uids - O(n log n)
* convert back to linked list - O(n)

Performance examples, startup time in seconds:

    pat_refs old     new
    1000      0.02   0.01
    10000     2.1    0.04
    20000    12.3    0.07
    30000    27.9    0.10
    40000    52.5    0.14
    50000    77.5    0.17

Please backport to 1.8, 2.0 and 2.1.
2020-03-06 08:11:58 +01:00
Willy Tarreau
3ebd55ee51 MINOR: haproxy: export run_poll_loop
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
2020-02-25 08:16:33 +01:00
Christopher Faulet
6d0c3dfac6 MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding
This patch introduces the 'http-after-response' rules. These rules are evaluated
at the end of the response analysis, just before the data forwarding, on ALL
HTTP responses, the server ones but also all responses generated by
HAProxy. Thanks to this ruleset, it is now possible for instance to add some
headers to the responses generated by the stats applet. Following actions are
supported :

   * allow
   * add-header
   * del-header
   * replace-header
   * replace-value
   * set-header
   * set-status
   * set-var
   * strict-mode
   * unset-var
2020-02-06 14:55:34 +01:00
Christopher Faulet
546c4696bb MINOR: global: Set default tune.maxrewrite value during global structure init
When the global structure is initialized, instead of setting tune.maxrewrite to
-1, its default value can be immediately set. This way, it is always defined
during the configuration validity check. Otherwise, the only way to have it at
this stage, it is to explicity set it in the global section.
2020-02-06 09:36:36 +01:00
Willy Tarreau
71f95fa20e [RELEASE] Released version 2.2-dev1
Released version 2.2-dev1 with the following main changes :
    - DOC: this is development again
    - MINOR: version: this is development again, update the status
    - SCRIPTS: update create-release to fix the changelog on new branches
    - CLEANUP: ssl: Clean up error handling
    - BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only
    - BUG/MINOR: h1: Don't test the host header during response parsing
    - BUILD/MINOR: trace: fix use of long type in a few printf format strings
    - DOC: Clarify behavior of server maxconn in HTTP mode
    - MINOR: ssl: deduplicate ca-file
    - MINOR: ssl: compute ca-list from deduplicate ca-file
    - MINOR: ssl: deduplicate crl-file
    - CLEANUP: dns: resolution can never be null
    - BUG/MINOR: http-htx: Don't make http_find_header() fail if the value is empty
    - DOC: ssl/cli: set/commit/abort ssl cert
    - BUG/MINOR: ssl: fix SSL_CTX_set1_chain compatibility for openssl < 1.0.2
    - BUG/MINOR: fcgi-app: Make the directive pass-header case insensitive
    - BUG/MINOR: stats: Fix HTML output for the frontends heading
    - BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0
    - DOC: clarify matching strings on binary fetches
    - DOC: Fix ordered list in summary
    - DOC: move the "group" keyword at the right place
    - MEDIUM: init: prevent process and thread creation at runtime
    - BUG/MINOR: ssl/cli: 'ssl cert' cmd only usable w/ admin rights
    - BUG/MEDIUM: stream-int: don't subscribed for recv when we're trying to flush data
    - BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible
    - BUG/MINOR: ssl/cli: don't overwrite the filters variable
    - BUG/MEDIUM: listener/thread: fix a race when pausing a listener
    - BUG/MINOR: ssl: certificate choice can be unexpected with openssl >= 1.1.1
    - BUG/MEDIUM: mux-h1: Never reuse H1 connection if a shutw is pending
    - BUG/MINOR: mux-h1: Don't rely on CO_FL_SOCK_RD_SH to set H1C_F_CS_SHUTDOWN
    - BUG/MINOR: mux-h1: Fix conditions to know whether or not we may receive data
    - BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().
    - BUG/MEDIUM: checks: Make sure we set the task affinity just before connecting.
    - MINOR: debug: replace popen() with pipe+fork() in "debug dev exec"
    - MEDIUM: init: set NO_NEW_PRIVS by default when supported
    - BUG/MINOR: mux-h1: Be sure to set CS_FL_WANT_ROOM when EOM can't be added
    - BUG/MEDIUM: mux-fcgi: Handle cases where the HTX EOM block cannot be inserted
    - BUG/MINOR: proxy: make soft_stop() also close FDs in LI_PAUSED state
    - BUG/MINOR: listener/threads: always use atomic ops to clear the FD events
    - BUG/MINOR: listener: also clear the error flag on a paused listener
    - BUG/MEDIUM: listener/threads: fix a remaining race in the listener's accept()
    - MINOR: listener: make the wait paths cleaner and more reliable
    - MINOR: listener: split dequeue_all_listener() in two
    - REORG: listener: move the global listener queue code to listener.c
    - DOC: document the listener state transitions
    - BUG/MEDIUM: kqueue: Make sure we report read events even when no data.
    - BUG/MAJOR: dns: add minimalist error processing on the Rx path
    - BUG/MEDIUM: proto_udp/threads: recv() and send() must not be exclusive.
    - DOC: listeners: add a few missing transitions
    - BUG/MINOR: tasks: only requeue a task if it was already in the queue
    - MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
    - DOC: proxies: HAProxy only supports 3 connection modes
    - DOC: remove references to the outdated architecture.txt
    - BUG/MINOR: log: fix minor resource leaks on logformat error path
    - BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
    - BUG/MINOR: listener: do not immediately resume on transient error
    - BUG/MINOR: server: make "agent-addr" work on default-server line
    - BUG/MINOR: listener: fix off-by-one in state name check
    - BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy()
    - MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state
    - MINOR: http-htx: Add some htx sample fetches for debugging purpose
    - REGTEST: Add an HTX reg-test to check an edge case
    - DOC: clarify the fact that replace-uri works on a full URI
    - BUG/MINOR: sample: fix the closing bracket and LF in the debug converter
    - BUG/MINOR: sample: always check converters' arguments
    - MINOR: sample: Validate the number of bits for the sha2 converter
    - BUG/MEDIUM: ssl: Don't set the max early data we can receive too early.
    - MINOR: ssl/cli: 'show ssl cert' give information on the certificates
    - BUG/MINOR: ssl/cli: fix build for openssl < 1.0.2
    - MINOR: debug: support logging to various sinks
    - MINOR: http: add a new "replace-path" action
    - REGTEST: ssl: test the "set ssl cert" CLI command
    - REGTEST: run-regtests: implement #REQUIRE_BINARIES
    - MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task
    - BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing
    - BUG/MEDIUM: ssl: Revamp the way early data are handled.
    - MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute
    - BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd
    - REGTEST: make the "set ssl cert" require version 2.1
    - BUG/MINOR: ssl: openssl-compat: Fix getm_ defines
    - BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry
    - BUG/MINOR: state-file: do not store duplicates in the global tree
    - BUG/MINOR: state-file: do not leak memory on parse errors
    - BUG/MAJOR: mux-h1: Don't pretend the input channel's buffer is full if empty
    - BUG/MEDIUM: stream: Be sure to never assign a TCP backend to an HTX stream
    - BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility
    - BUILD: travis-ci: link with ssl libraries using rpath instead of LD_LIBRARY_PATH/DYLD_LIBRARY_PATH
    - BUILD: travis-ci: reenable address sanitizer for clang builds
    - BUG/MINOR: checks: refine which errno values are really errors.
    - BUG/MINOR: connection: only wake send/recv callbacks if the FD is active
    - CLEANUP: connection: conn->xprt is never NULL
    - MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP
    - MEDIUM: tcp: make tcp_connect_probe() consider ERR/HUP
    - REORG: connection: move tcp_connect_probe() to conn_fd_check()
    - MINOR: connection: check for connection validation earlier
    - MINOR: connection: remove the double test on xprt_done_cb()
    - CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE
    - MINOR: poller: do not call the IO handler if the FD is not active
    - OPTIM: epoll: always poll for recv if neither active nor ready
    - OPTIM: polling: do not create update entries for FD removal
    - BUG/MEDIUM: checks: Only attempt to do handshakes if the connection is ready.
    - BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection.
    - BUILD: CI: modernize cirrus-ci
    - MINOR: config: disable busy polling on old processes
    - MINOR: ssl: Remove unused variable "need_out".
    - BUG/MINOR: h1: Report the right error position when a header value is invalid
    - BUG/MINOR: proxy: Fix input data copy when an error is captured
    - BUG/MEDIUM: http-ana: Truncate the response when a redirect rule is applied
    - BUG/MINOR: channel: inject output data at the end of output
    - BUG/MEDIUM: session: do not report a failure when rejecting a session
    - MEDIUM: dns: implement synchronous send
    - MINOR: raw_sock: make sure to disable polling once everything is sent
    - MINOR: http: Add 410 to http-request deny
    - MINOR: http: Add 404 to http-request deny
    - CLEANUP: mux-h2: remove unused goto "out_free_h2s"
    - BUILD: cirrus-ci: choose proper openssl package name
    - BUG/MAJOR: listener: do not schedule a task-less proxy
    - CLEANUP: server: remove unused err section in server_finalize_init
    - REGTEST: set_ssl_cert.vtc: replace "echo" with "printf"
    - BUG/MINOR: stream-int: Don't trigger L7 retry if max retries is already reached
    - BUG/MEDIUM: tasks: Use the MT macros in tasklet_free().
    - BUG/MINOR: mux-h2: use a safe list_for_each_entry in h2_send()
    - BUG/MEDIUM: mux-h2: fix missing test on sending_list in previous patch
    - CLEANUP: ssl: remove opendir call in ssl_sock_load_cert
    - MEDIUM: lua: don't call the GC as often when dealing with outgoing connections
    - BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary
    - BUG/MINOR: cli/mworker: can't start haproxy with 2 programs
    - REGTEST: mcli/mcli_start_progs: start 2 programs
    - BUG/MEDIUM: mworker: remain in mworker mode during reload
    - DOC: clarify crt-base usage
    - CLEANUP: compression: remove unused deinit_comp_ctx section
    - BUG/MEDIUM: mux_h1: Don't call h1_send if we subscribed().
    - BUG/MEDIUM: raw_sock: Make sur the fd and conn are sync.
    - CLEANUP: proxy: simplify proxy_parse_rate_limit proxy checks
    - BUG/MAJOR: hashes: fix the signedness of the hash inputs
    - REGTEST: add sample_fetches/hashes.vtc to validate hashes
    - BUG/MEDIUM: cli: _getsocks must send the peers sockets
    - CLEANUP: cli: deduplicate the code in _getsocks
    - BUG/MINOR: stream: don't mistake match rules for store-request rules
    - BUG/MEDIUM: connection: add a mux flag to indicate splice usability
    - BUG/MINOR: pattern: handle errors from fgets when trying to load patterns
    - MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only
    - MINOR: stream-int: remove dependency on CO_FL_WAIT_ROOM for rcv_buf()
    - MEDIUM: connection: get rid of CO_FL_CURR_* flags
    - BUILD: pattern: include errno.h
    - MEDIUM: mux-h2: do not try to stop sending streams on blocked mux
    - MEDIUM: mux-fcgi: do not try to stop sending streams on blocked mux
    - MEDIUM: mux-h2: do not make an h2s subscribe to itself on deferred shut
    - MEDIUM: mux-fcgi: do not make an fstrm subscribe to itself on deferred shut
    - REORG: stream/backend: move backend-specific stuff to backend.c
    - MEDIUM: backend: move the connection finalization step to back_handle_st_con()
    - MEDIUM: connection: merge the send_wait and recv_wait entries
    - MEDIUM: xprt: merge recv_wait and send_wait in xprt_handshake
    - MEDIUM: ssl: merge recv_wait and send_wait in ssl_sock
    - MEDIUM: mux-h1: merge recv_wait and send_wait
    - MEDIUM: mux-h2: merge recv_wait and send_wait event notifications
    - MEDIUM: mux-fcgi: merge recv_wait and send_wait event notifications
    - MINOR: connection: make the last arg of subscribe() a struct wait_event*
    - MINOR: ssl: Add support for returning the dn samples from ssl_(c|f)_(i|s)_dn in LDAP v3 (RFC2253) format.
    - DOC: Fix copy and paste mistake in http-response replace-value doc
    - BUG/MINOR: cache: Fix leak of cache name in error path
    - BUG/MINOR: dns: Make dns_query_id_seed unsigned
    - BUG/MINOR: 51d: Fix bug when HTX is enabled
    - MINOR: http-htx: Move htx sample fetches in the scope "internal"
    - MINOR: http-htx: Rename 'internal.htx_blk.val' to 'internal.htx_blk.data'
    - MINOR: http-htx: Make 'internal.htx_blk_data' return a binary string
    - DOC: Add a section to document the internal sample fetches
    - MINOR: mux-h1: Inherit send flags from the upper layer
    - MINOR: contrib/prometheus-exporter: Add heathcheck status/code in server metrics
    - BUG/MINOR: http-ana/filters: Wait end of the http_end callback for all filters
    - BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
    - BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during parsing
    - MEDIUM: http-rules: Register an action keyword for all http rules
    - MINOR: tcp-rules: Always set from which ruleset a rule comes from
    - MINOR: actions: Use ACT_RET_CONT code to ignore an error from a custom action
    - MINOR: tcp-rules: Kill connections when custom actions return ACT_RET_ERR
    - MINOR: http-rules: Return an error when custom actions return ACT_RET_ERR
    - MINOR: counters: Add a counter to report internal processing errors
    - MEDIUM: http-ana: Properly handle internal processing errors
    - MINOR: http-rules: Add a rule result to report internal error
    - MINOR: http-rules: Handle internal errors during HTTP rules evaluation
    - MINOR: http-rules: Add more return codes to let custom actions act as normal ones
    - MINOR: tcp-rules: Handle denied/aborted/invalid connections from TCP rules
    - MINOR: http-rules: Handle denied/aborted/invalid connections from HTTP rules
    - MINOR: stats: Report internal errors in the proxies/listeners/servers stats
    - MINOR: contrib/prometheus-exporter: Export internal errors per proxy/server
    - MINOR: counters: Remove failed_secu counter and use denied_resp instead
    - MINOR: counters: Review conditions to increment counters from analysers
    - MINOR: http-ana: Add a txn flag to support soft/strict message rewrites
    - MINOR: http-rules: Handle all message rewrites the same way
    - MINOR: http-rules: Add a rule to enable or disable the strict rewriting mode
    - MEDIUM: http-rules: Enable the strict rewriting mode by default
    - REGTEST: Fix format of set-uri HTTP request rule in h1or2_to_h1c.vtc
    - MINOR: actions: Add a function pointer to release args used by actions
    - MINOR: actions: Regroup some info about HTTP rules in the same struct
    - MINOR: http-rules/tcp-rules: Call the defined action function first if defined
    - MINOR: actions: Rename the act_flag enum into act_opt
    - MINOR: actions: Add flags to configure the action behaviour
    - MINOR: actions: Use an integer to set the action type
    - MINOR: http-rules: Use a specific action type for some custom HTTP actions
    - MINOR: http-rules: Make replace-header and replace-value custom actions
    - MINOR: http-rules: Make set-header and add-header custom actions
    - MINOR: http-rules: Make set/del-map and add/del-acl custom actions
    - MINOR: http-rules: Group all processing of early-hint rule in its case clause
    - MEDIUM: http-rules: Make early-hint custom actions
    - MINOR: http-rule/tcp-rules: Make track-sc* custom actions
    - MINOR: tcp-rules: Make tcp-request capture a custom action
    - MINOR: http-rules: Add release functions for existing HTTP actions
    - BUG/MINOR: http-rules: Fix memory releases on error path during action parsing
    - MINOR: tcp-rules: Add release functions for existing TCP actions
    - BUG/MINOR: tcp-rules: Fix memory releases on error path during action parsing
    - MINOR: http-htx: Add functions to read a raw error file and convert it in HTX
    - MINOR: http-htx: Add functions to create HTX redirect message
    - MINOR: config: Use dedicated function to parse proxy's errorfiles
    - MINOR: config: Use dedicated function to parse proxy's errorloc
    - MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages
    - MINOR: proxy: Register keywords to parse errorfile and errorloc directives
    - MINOR: http-htx: Add a new section to create groups of custom HTTP errors
    - MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy
    - MINOR: http-rules: Update txn flags and status when a deny rule is executed
    - MINOR: http-rules: Support an optional status on deny rules for http reponses
    - MINOR: http-rules: Use same function to parse request and response deny actions
    - MINOR: http-ana: Add an error message in the txn and send it when defined
    - MEDIUM: http-rules: Support an optional error message in http deny rules
    - REGTEST: Add a strict rewriting mode reg test
    - REGEST: Add reg tests about error files
    - MINOR: ssl: accept 'verify' bind option with 'set ssl cert'
    - BUG/MINOR: ssl: ssl_sock_load_ocsp_response_from_file memory leak
    - BUG/MINOR: ssl: ssl_sock_load_issuer_file_into_ckch memory leak
    - BUG/MINOR: ssl: ssl_sock_load_sctl_from_file memory leak
    - BUG/MINOR: http_htx: Fix some leaks on error path when error files are loaded
    - CLEANUP: http-ana: Remove useless test on txn when the error message is retrieved
    - BUILD: CI: introduce ARM64 builds
    - BUILD: ssl: more elegant anti-replay feature presence check
    - MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
    - MEDIUM: dns: use Additional records from SRV responses
    - CLEANUP: Consistently `unsigned int` for bitfields
    - CLEANUP: pattern: remove the pat_time definition
    - BUG/MINOR: http_act: don't check capture id in backend
    - BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x
2020-01-22 10:34:58 +01:00
Christopher Faulet
2f5339079b MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
It is now possible to insert any attribute when a cookie is inserted by
HAProxy. Any value may be set, no check is performed except the syntax validity
(CTRL chars and ';' are forbidden). For instance, it may be used to add the
SameSite attribute:

    cookie SRV insert attr "SameSite=Strict"

The attr option may be repeated to add several attributes.

This patch should fix the issue #361.
2020-01-22 07:18:31 +01:00
Christopher Faulet
5885775de1 MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages
All custom HTTP errors are now stored in a global tree. Proxies use a references
on these messages. The key used for errorfile directives is the file name as
specified in the configuration. For errorloc directives, a key is created using
the redirect code and the url. This means that the same custom error message is
now stored only once. It may be used in several proxies or for several status
code, it is only parsed and stored once.
2020-01-20 15:18:46 +01:00
Christopher Faulet
58b3564fde MINOR: actions: Add a function pointer to release args used by actions
Arguments used by actions are never released during HAProxy deinit. Now, it is
possible to specify a function to do so. ".release_ptr" field in the act_rule
structure may be set during the configuration parsing to a specific deinit
function depending on the action type.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cb5501327c BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
Functions to deinitialize the HTTP rules are buggy. These functions does not
check the action name to release the right part in the arg union. Only few info
are released. For auth rules, the realm is released and there is no problem
here. But the regex <arg.hdr_add.re> is always unconditionally released. So it
is easy to make these functions crash. For instance, with the following rule
HAProxy crashes during the deinit :

      http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)]

For now, These functions are simply removed and we rely on the deinit function
used for TCP rules (renamed as deinit_act_rules()). This patch fixes the
bug. But arguments used by actions are not released at all, this part will be
addressed later.

This patch must be backported to all stable versions.
2020-01-20 15:18:45 +01:00
William Lallemand
24c928c8bd BUG/MEDIUM: mworker: remain in mworker mode during reload
If you reload an haproxy started in master-worker mode with
"master-worker" in the configuration, and no "-W" argument,
the new process lost the fact that is was in master-worker mode
resulting in weird behaviors.

The bigest problem is that if it is reloaded with an bad configuration,
the master will exits instead of remaining in waitpid mode.

This problem was discovered in bug #443.

Should be backported in every version using the master-worker mode.
(as far as 1.8)
2020-01-14 18:10:29 +01:00
Willy Tarreau
719e07c989 BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy()
Apparently gcc developers decided that strncpy() semantics are no longer
valid and now deserve a warning, especially if used exactly as designed.
This results in issue #304. Let's just remove one to the target size to
please her majesty gcc, the God of C Compilers, who tries hard to make
users completely eliminate any use of string.h and reimplement it by
themselves at much higher risks. Pfff....

This can be backported to stable version, the fix is harmless since it
ignores the last zero that is already set on next line.
2019-12-11 16:29:10 +01:00
Willy Tarreau
d26c9f9465 BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
If a new process is started with -sf and it fails to bind, it may send
a SIGTTOU to the master process in hope that it will temporarily unbind.
Unfortunately this one doesn't catch it and stops to background instead
of forwarding the signal to the workers. The same is true for SIGTTIN.

This commit simply implements an extra signal handler for the master to
deal with such signals that must be passed down to the workers. It must
be backported as far as 1.8, though there the code differs in that it's
entirely in haproxy.c and doesn't require an extra sig handler.
2019-12-11 14:26:53 +01:00
Willy Tarreau
c49ba52524 MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
We used to have wake_expired_tasks() wake up tasks and return the next
expiration delay. The problem this causes is that we have to call it just
before poll() in order to consider latest timers, but this also means that
we don't wake up all newly expired tasks upon return from poll(), which
thus systematically requires a second poll() round.

This is visible when running any scheduled task like a health check, as there
are systematically two poll() calls, one with the interval, nothing is done
after it, and another one with a zero delay, and the task is called:

  listen test
    bind *:8001
    server s1 127.0.0.1:1111 check

  09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0
  09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0
  09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0
>> nothing run here, as the expired task was not woken up yet.
  09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0
  09:37:39.202505 epoll_wait(3, [], 200, 0) = 0
  09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0
>> now the expired task was woken up
  09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0
  09:37:39.202683 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0
  09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:37:39.202715 close(7)                = 0

Let's instead split the function in two parts:
  - the first part, wake_expired_tasks(), called just before
    process_runnable_tasks(), wakes up all expired tasks; it doesn't
    compute any timeout.
  - the second part, next_timer_expiry(), called just before poll(),
    only computes the next timeout for the current thread.

Thanks to this, all expired tasks are properly woken up when leaving
poll, and each poll call's timeout remains up to date:

  09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0
  09:41:16.270457 epoll_wait(3, [], 200, 999) = 0
  09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0
  09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0
  09:41:17.270323 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0
  09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:41:17.270367 close(7)                = 0

This may be backported to 2.1 and 2.0 though it's unlikely to bring any
user-visible improvement except to clarify debugging.
2019-12-11 09:42:58 +01:00
Willy Tarreau
a1d97f88e0 REORG: listener: move the global listener queue code to listener.c
The global listener queue code and declarations were still lying in
haproxy.c while not needed there anymore at all. This complicates
the code for no reason. As a result, the global_listener_queue_task
and the global_listener_queue were made static.
2019-12-10 14:16:03 +01:00
Willy Tarreau
241797a3fc MINOR: listener: split dequeue_all_listener() in two
We use it half times for the global_listener_queue and half times
for a proxy's queue and this requires the callers to take care of
these. Let's split it in two versions, the current one working only
on the global queue and another one dedicated to proxies for the
per-proxy queues. This cleans up quite a bit of code.
2019-12-10 14:14:09 +01:00
Willy Tarreau
a45a8b5171 MEDIUM: init: set NO_NEW_PRIVS by default when supported
HAProxy doesn't need to call executables at run time (except when using
external checks which are strongly recommended against), and is even expected
to isolate itself into an empty chroot. As such, there basically is no valid
reason to allow a setuid executable to be called without the user being fully
aware of the risks. In a situation where haproxy would need to call external
checks and/or disable chroot, exploiting a vulnerability in a library or in
haproxy itself could lead to the execution of an external program. On Linux
it is possible to lock the process so that any setuid bit present on such an
executable is ignored. This significantly reduces the risk of privilege
escalation in such a situation. This is what haproxy does by default. In case
this causes a problem to an external check (for example one which would need
the "ping" command), then it is possible to disable this protection by
explicitly adding this directive in the global section. If enabled, it is
possible to turn it back off by prefixing it with the "no" keyword.

Before the option:

  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  uid=0(root) gid=0(root) groups=0(root

After the option:
  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the
        'nosuid' option set or an NFS file system without root privileges?
2019-12-06 17:20:26 +01:00
Willy Tarreau
d96f1126fe MEDIUM: init: prevent process and thread creation at runtime
Some concerns are regularly raised about the risk to inherit some Lua
files which make use of a fork (e.g. via os.execute()) as well as
whether or not some of bugs we fix might or not be exploitable to run
some code. Given that haproxy is event-driven, any foreground activity
completely stops processing and is easy to detect, but background
activity is a different story. A Lua script could very well discretely
fork a sub-process connecting to a remote location and taking commands,
and some injected code could also try to hide its activity by creating
a process or a thread without blocking the rest of the processing. While
such activities should be extremely limited when run in an empty chroot
without any permission, it would be better to get a higher assurance
they cannot happen.

This patch introduces something very simple: it limits the number of
processes and threads to zero in the workers after the last thread was
created. By doing so, it effectively instructs the system to fail on
any fork() or clone() syscall. Thus any undesired activity has to happen
in the foreground and is way easier to detect.

This will obviously break external checks (whose concept is already
totally insecure), and for this reason a new option
"insecure-fork-wanted" was added to disable this protection, and it
is suggested in the fork() error report from the checks. It is
obviously recommended not to use it and to reconsider the reasons
leading to it being enabled in the first place.

If for any reason we fail to disable forks, we still start because it
could be imaginable that some operating systems refuse to set this
limit to zero, but in this case we emit a warning, that may or may not
be reported since we're after the fork point. Ideally over the long
term it should be conditionned by strict-limits and cause a hard fail.
2019-12-03 11:49:00 +01:00
Willy Tarreau
47479eb0e7 MINOR: version: emit the link to the known bugs in output of "haproxy -v"
The link to the known bugs page for the current version is built and
reported there. When it is a development version (less than 2 dots),
instead a link to github open issues is reported as there's no way to
be sure about the current situation in this case and it's better that
users report their trouble there.
2019-11-21 18:48:20 +01:00
Willy Tarreau
08dd202d73 MINOR: version: report the version status in "haproxy -v"
As discussed on Discourse here:

    https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466

it's not always easy for end users to know the lifecycle of the version
they are using. This patch introduces a "Status" line in the output of
"haproxy -vv" indicating whether it's a development, stable, long-term
supported version, possibly with an estimated end of life for the branch
when it can be anticipated (e.g. for stable versions). This field should
be adjusted when creating a major release to reflect the new status.

It may make sense to backport this to other branches to clarify the
situation.
2019-11-21 18:47:54 +01:00
William Lallemand
677e2f2c35 BUG/MEDIUM: mworker: don't fill the -sf argument with -1 during the reexec
Upon a reexec_on_failure, if the process tried to exit after the
initialization of the process structure but before it was filled with a
PID, the PID in the mworker_proc structure is set to -1.

In this particular case the -sf argument is filled with -1 and haproxy
will exit with the usage message because of that argument.

Should be backported in 2.0.
2019-11-19 17:30:34 +01:00
William Dauchy
f9af9d7f3c MINOR: init: avoid code duplication while setting identify
since the introduction of mworker, the setuid/setgid was duplicated in
two places; try to improve that by creating a dedicated function.
this patch does not introduce any functional change.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-11-17 16:55:50 +01:00
William Dauchy
e039f26ba4 BUG/MINOR: init: fix set-dumpable when using uid/gid
in mworker mode used with uid/gid settings, it was not possible to get
a coredump despite the set-dumpable option.
indeed prctl(2) manual page specifies the dumpable attribute is reverted
to `/proc/sys/fs/suid_dumpable` in a few conditions such as process
effective user and group are changed.

this patch moves the whole set-dumpable logic before the polling code in
order to catch all possible cases where we could have changed the
uid/gid. It however does not cover the possible segfault at startup.

this patch should be backported in 2.0.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-11-17 16:55:24 +01:00
William Dauchy
0fec3ab7bf MINOR: init: always fail when setrlimit fails
this patch introduces a strict-limits parameter which enforces the
setrlimit setting instead of a warning. This option can be forcingly
disable with the "no" keyword.
The general aim of this patch is to avoid bad surprises on a production
environment where you change the maxconn for example, a new fd limit is
calculated, but cannot be set because of sysfs setting. In that case you
might want to have an explicit failure to be aware of it before seeing
your traffic going down. During a global rollout it is also useful to
explictly fail as most progressive rollout would simply check the
general health check of the process.

As discussed, plan to use the strict by default mode starting from v2.3.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-10-29 17:42:27 +01:00
William Lallemand
5fdb5b36e1 BUG/MINOR: mworker/ssl: close openssl FDs unconditionally
Patch 56996da ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload")
fixes a issue where the /dev/random FD was leaked by OpenSSL upon a
reload in master worker mode. Indeed the FD was not flagged with
CLOEXEC.

The fix was checking if ssl_used_frontend or ssl_used_backend were set
to close the FD. This is wrong, indeed the lua init code creates an SSL
server without increasing the backend value, so the deinit is never
done when you don't use SSL in your configuration.

To reproduce the problem you just need to build haproxy with openssl and
lua with an openssl which does not use the getrandom() syscall.  No
openssl nor lua configuration are required for haproxy.

This patch must be backported as far as 1.8.

Fix issue #314.
2019-10-17 11:36:22 +02:00
David Carlier
5e4c8e2a67 BUILD/MEDIUM: threads: enable cpu_affinity on osx
Enable it but on a per thread basis only using Darwin native API.
2019-10-17 07:20:58 +02:00
David Carlier
a92c5cec2d BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info
On Darwin, the thread_info name exists as a standard function thus
we need to rename our array to ha_thread_info to fix this conflict.
2019-10-17 07:15:17 +02:00
Olivier Houchard
bba1a263c5 BUG/MEDIUM: tasklets: Make sure we're waking the target thread if it sleeps.
Now that we can wake tasklet for other threads, make sure that if the thread
is sleeping, we wake it up, or the tasklet won't be executed until it's
done sleeping.
That also means that, before going to sleep, and after we put our bit
in sleeping_thread_mask, we have to check that nobody added a tasklet for
us, just checking for global_tasks_mask isn't enough anymore.
2019-09-24 14:58:45 +02:00
Willy Tarreau
d022e9c98b MINOR: task: introduce a thread-local "sched" variable for local scheduler stuff
The aim is to rassemble all scheduler information related to the current
thread. It simply points to task_per_thread[tid] without having to perform
the operation at each time. We save around 1.2 kB of code on performance
sensitive paths and increase the request rate by almost 1%.
2019-09-24 11:23:30 +02:00
Olivier Houchard
859dc80f94 MEDIUM: list: Separate "locked" list from regular list.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
2019-09-23 18:16:08 +02:00
Christopher Faulet
c16929658f MINOR: config: Support per-proxy and per-server post-check functions callbacks
Most of times, when a keyword is added in proxy section or on the server line,
we need to have a post-parser callback to check the config validity for the
proxy or the server which uses this keyword.

It is possible to register a global post-parser callback. But all these
callbacks need to loop on the proxies and servers to do their job. It is neither
handy nor efficient. Instead, it is now possible to register per-proxy and
per-server post-check callbacks.
2019-09-17 10:18:54 +02:00
Christopher Faulet
3ea5cbe6a4 MINOR: config: Support per-proxy and per-server deinit functions callbacks
Most of times, when any allocation is done during configuration parsing because
of a new keyword in proxy section or on the server line, we must add a call in
the deinit() function to release allocated ressources. It is now possible to
register a post-deinit callback because, at this stage, the proxies and the
servers are already releases.

Now, it is possible to register deinit callbacks per-proxy or per-server. These
callbacks will be called for each proxy and server before releasing them.
2019-09-17 10:18:54 +02:00
Willy Tarreau
e0d86e2c1c BUG/MINOR: mworker: disable SIGPROF on re-exec
If haproxy is built with profiling enabled with -pg, it is possible to
see the master quit during a reload while it's re-executing itself with
error code 155 (signal 27) saying "Profile timer expired)". This happens
if the SIGPROF signal is delivered during the execve() call while the
handler was already unregistered. The issue itself is not directly inside
haproxy but it's easy to address. This patch disables this signal before
calling execvp() during a master reload. A simple test for this consists
in running this little script with haproxy started in master-worker mode :

     $ while usleep 50000; do killall -USR2 haproxy; done

This fix should be backported to all versions using the master-worker
model.
2019-08-26 10:44:48 +02:00
Olivier Houchard
305d5ab469 MAJOR: fd: Get rid of the fd cache.
Now that the architecture was changed so that attempts to receive/send data
always come from the upper layers, instead of them only trying to do so when
the lower layer let them know they could try, we can finally get rid of the
fd cache. We don't really need it anymore, and removing it gives us a small
performance boost.
2019-07-31 14:12:55 +02:00
Christopher Faulet
f734638976 MINOR: http: Don't store raw HTTP errors in chunks anymore
Default HTTP error messages are stored in an array of chunks. And since the HTX
was added, these messages are also converted in HTX and stored in another
array. But now, the first array is not used anymore because the legacy HTTP mode
was removed.

So now, only the array with the HTX messages are kept. The other one was
removed.
2019-07-19 09:46:23 +02:00
Christopher Faulet
41ba36f8b2 MINOR: global: Preset tune.max_http_hdr to its default value
By default, this tune parameter is set to MAX_HTTP_HDR. This assignment is done
after the configuration parsing, when we check the configuration validity. So
during the configuration parsing, its value is 0. Now, it is set to MAX_HTTP_HDR
from the start. So, it is possible to rely on it during the configuration
parsing.
2019-07-19 09:46:23 +02:00
Christopher Faulet
1b6adb4a51 MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists
The keywords req* and rsp* are now unsupported. So the corresponding lists are
now unused. It is safe to remove them from the structure proxy.

As a result, the code dealing with these rules in HTTP analyzers was also
removed.
2019-07-19 09:24:12 +02:00
Christopher Faulet
fc9cfe4006 REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.

In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
2019-07-19 09:24:12 +02:00
Christopher Faulet
711ed6ae4a MAJOR: http: Remove the HTTP legacy code
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
2019-07-19 09:24:12 +02:00
Willy Tarreau
7764a57d32 BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
Since commit 81492c989 ("MINOR: threads: flatten the per-thread cpu-map"),
we don't keep the proc*thread matrix anymore to represent the full binding
possibilities, but only the proc and thread ones. The problem is that the
per-process binding is not the same for each thread and for the process,
and the proc[] array was assumed to store the per-proc first thread value
when doing this change. Worse, the logic present there tries to deal with
thread ranges and process ranges in a way which automatically exclused the
other possibility (since ranges cannot be used on both) but as such fails
to apply changes if neither the process nor the thread is expressed as a
range.

The real problem comes from the fact that specifying cpu-map 1/1 doesn't
yet reveal if the per-process mask or the per-thread mask needs to be
updated. In practice it's the thread one but then the current storage
doesn't allow to store the binding of the first thread of each other
process in nbproc>1 configurations.

When removing the proc*thread matrix, what ought to have been kept was
both the thread column for process 1 and the process line for threads 1,
but instead only the thread column was kept. This patch reintroduces the
storage of the configuration for the first thread of each process so that
it is again possible to store either the per-thread or per-process
configuration.

As a partial workaround for existing configurations, it is possible to
systematically indicate at least two processes or two threads at once
and map them by pairs or more so that at least two values are present
in the range. E.g :

  # set processes 1-4 to cpus 0-3 :

     cpu-map auto:1-4/1 0 1 2 3
  # or:
     cpu-map 1-2/1 0 1
     cpu-map 2-3/1 2 3

  # set threads 1-4 to cpus 0-3 :

     cpu-map auto:1/1-4 0 1 2 3
  # or :
     cpu-map 1/1-2 0 1
     cpu-map 3/3-4 2 3

This fix must be backported to 2.0.
2019-07-16 15:23:09 +02:00
William Lallemand
16866670dd BUG/MEDIUM: mworker: don't call the thread and fdtab deinit
Before switching to wait mode, the per thread deinit should not be
called, because we didn't initiate threads and fdtab.

The problem is that the master could crash if we try to reload HAProxy

The commit 944e619 ("MEDIUM: mworker: wait mode use standard init code
path") removed the deinit code by accident, but its fix 7c756a8
("BUG/MEDIUM: mworker: fix FD leak upon reload") was incomplete and did
not took care of the WAIT_MODE.

This fix must be backported in 1.9 and 2.0
2019-06-24 17:54:05 +02:00
Willy Tarreau
76a80c710c BUILD: mworker: silence two printf format warnings around getpid()
getpid() is documented as returning a pit pid_t result, not
necessarily an int. This causes a build warning on Solaris 10
because of '%d' or '%u' are used in the format passed to snprintf().
Let's just cast the result as an int (respectively unsigned int).

This can be backported to 2.0 and possibly older versions though
it really has no impact.
2019-06-22 07:57:56 +02:00
Willy Tarreau
3c39a7d889 CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
2019-06-14 14:42:29 +02:00
William Lallemand
63329e36ab MINOR: doc: update the manpage and usage message about -S
Add -S in the manpage, and update the usage message.

Should be backported to 1.9.
2019-06-13 17:09:27 +02:00
William Lallemand
1dc6963086 MINOR: mworker: add the HAProxy version in "show proc"
Displays the HAProxy version so you can compare the version of old
processes and new ones.
2019-06-12 19:19:57 +02:00
Willy Tarreau
34a150ccf5 MEDIUM: init/threads: don't use spinlocks during the init phase
PiBa-NL found some pathological cases where starting threads can hinder
each other and cause a measurable slow down. This problem is reproducible
with the following config (haproxy must be built with -DDEBUG_DEV) :

    global
	stats socket /tmp/sock1 mode 666 level admin
	nbthread 64

    backend stopme
	timeout server  1s
	option tcp-check
	tcp-check send "debug dev exit\n"
	server cli unix@/tmp/sock1 check

This will cause the process to be stopped once the checks are ready to
start. Binding all these to just a few cores magnifies the problem.
Starting them in loops shows a significant time difference among the
commits :

  # before startup serialization
  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e186161 -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m1.581s
  user    0m0.621s
  sys     0m5.339s

  # after startup serialization
  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy-e4d7c9dd -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m2.366s
  user    0m0.894s
  sys     0m8.238s

In order to address this, let's use plain mutexes and cond_wait during
the init phase. With this done, waiting threads now sleep and the problem
completely disappeared :

  $ time for i in {1..20}; do taskset -c 0,1,2,3 ./haproxy -db -f slow-init.cfg >/dev/null 2>&1; done

  real    0m0.161s
  user    0m0.079s
  sys     0m0.149s
2019-06-11 11:30:26 +02:00
Willy Tarreau
e4d7c9dd65 OPTIM/MINOR: init/threads: only call protocol_enable_all() on first thread
There's no point in calling this on each and every thread since the first
thread passing there will enable the listeners, and the next ones will
simply scan all of them in turn to discover that they are already
initialized. Let's only initilize them on the first thread. This could
slightly speed up start up on very large configurations, eventhough most
of the time is still spent in the main thread binding the sockets.

A few measurements have constantly shown that this decreases the startup
time by ~0.1s for 150k listeners. Starting all of them in parallel doesn't
provide better results and can still expose some undesired races.
2019-06-10 10:53:59 +02:00
Willy Tarreau
7109282577 BUG/MEDIUM: init/threads: prevent initialized threads from starting before others
Since commit 6ec902a ("MINOR: threads: serialize threads initialization")
we now serialize threads initialization. But doing so has emphasized another
race which is that some threads may actually start the loop before others
are done initializing.

As soon as all threads enter the first thread_release() call, their rdv
bit is cleared and they're all waiting for all others' rdv to be cleared
as well, with their harmless bit set. The first one to notice the cleared
mask will progress through thread_isolate(), take rdv again preventing
most others from noticing its short pass to zero, and this first one will
be able to run all the way through the initialization till the last call
to thread_release() which it happily crosses, being the only one with the
rdv bit, leaving the room for one or a few others to do the same. This
results in some threads entering the loop before others are done with
their initialization, which is particularly bad. PiBa-NL reported that
some regtests fail for him due to this (which was impossible to reproduce
here, but races are racy by definition). However placing some printf()
in the initialization code definitely shows this unsychronized startup.

This patch takes a different approach in three steps :
  - first, we don't start with thread_release() anymore and we don't
    set the rdv mask anymore in the main call. This was initially done
    to let all threads start toghether, which we don't want. Instead
    we just start with thread_isolate(). Since all threads are harmful
    by default, they all wait for each other's readiness before starting.

  - second, we don't release with thread_release() but with
    thread_sync_release(), meaning that we don't leave the function until
    other ones have reached the point in the function where they decide
    to leave it as well.

  - third, it makes sure we don't start the listeners using
    protocol_enable_all() before all threads have allocated their local
    FD tables or have initialized their pollers, otherwise startup could
    be racy as well. It's worth noting that it is even possible to limit
    this call to thread #0 as it only needs to be performed once.

This now guarantees that all thread init calls start only after all threads
are ready, and that no thread enters the polling loop before all others have
completed their initialization.

Please check GH issues #111 and #117 for more context.

No backport is needed, though if some new init races are reported in
1.9 (or even 1.8) which do not affect 2.0, then it may make sense to
carefully backport this small series.
2019-06-10 10:53:52 +02:00
Willy Tarreau
6ec902a659 MINOR: threads: serialize threads initialization
There is no point in initializing threads in parallel when we know that
it's the moment where some global variables are turned to thread-local
ones, and/or that some global variables are updated (like global_now or
trash_size). Some FDs might be created/destroyed/reallocated and could
be tricky to follow as well (think about epoll_fd for example).

Instead of having to be extremely careful about all these, and to trigger
false positives in thread sanitizers, let's simply initialize one thread
at a time. The init step is very fast so nobody should even notice, and
we won't have any more doubts about what might have happened when
analysing a dump.

See GH issues #111 and #117 for some background on this.
2019-06-07 15:37:47 +02:00
Willy Tarreau
7067b3a92e BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit
As reported in GH issue #99, when hard-stop-after triggers and threads
are in use, the chance that any thread releases the resources in use by
the other ones is non-null. Thus no thread should be allowed to deinit()
nor exit by itself.

Here we take a different approach. We simply use a 3rd possible value
for the "killed" variable so that all threads know they must break out
of the run-poll-loop and immediately stop.

This patch was tested by commenting the stream_shutdown() calls in
hard_stop() to increase the chances to see a stream use released
resources. With this fix applied, it never crashes anymore.

This fix should be backported to 1.9 and 1.8.
2019-06-02 11:30:07 +02:00
Olivier Houchard
cfbb3e6560 MEDIUM: tasks: Get rid of active_tasks_mask.
Remove the active_tasks_mask variable, we can deduce if we've work to do
by other means, and it is costly to maintain. Instead, introduce a new
function, thread_has_tasks(), that returns non-zero if there's tasks
scheduled for the thread, zero otherwise.
2019-05-29 21:53:37 +02:00
Willy Tarreau
2ae84e445d MEDIUM: poller: separate the wait time from the wake events
We have been abusing the do_poll()'s timeout for a while, making it zero
whenever there is some known activity. The problem this poses is that it
complicates activity diagnostic by incrementing the poll_exp field for
each known activity. It also requires extra computations that could be
avoided.

This change passes a "wake" argument to say that the poller must not
sleep. This simplifies the operations and allows one to differenciate
expirations from activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
e5733234f6 CLEANUP: build: rename some build macros to use the USE_* ones
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.

The following renames were done :

 ENABLE_POLL -> USE_POLL
 ENABLE_EPOLL -> USE_EPOLL
 ENABLE_KQUEUE -> USE_KQUEUE
 ENABLE_EVPORTS -> USE_EVPORTS
 TPROXY -> USE_TPROXY
 NETFILTER -> USE_NETFILTER
 NEED_CRYPT_H -> USE_CRYPT_H
 CONFIG_HAP_CRYPT -> USE_LIBCRYPT
 CONFIG_HAP_NS -> DUSE_NS
 CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
 CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
 CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
2019-05-22 19:47:57 +02:00
Willy Tarreau
082b62828d BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks
We currently have the ability to register functions to be called early
on thread creation and at thread deinitialization. It turns out this is
not sufficient because certain such functions may use resources that are
being allocated by the other ones, thus creating a race condition depending
only on the linking order. For example the mworker needs to register a
file descriptor while the pollers will reallocate the fd_updt[] array.
Similarly logs and trashes may be used by some init functions while it's
unclear whether they have been deduplicated.

The same issue happens on deinit, if the fd_updt[] or trash is released
before some functions finish to use them, we'll get into trouble.

This patch creates a couple of early and late callbacks for per-thread
allocation/freeing of resources. A few init functions were moved there,
and the fd init code was split between the two (since it used to both
allocate and initialize at once). This way the init/deinit sequence is
expected to be safe now.

This patch should be backported to 1.9 as at least the trash/log issue
seems to be present. The run_thread_poll_loop() code is a bit different
there as the mworker is not a callback, but it will have no effect and
it's enough to drop the mworker changes.

This bug was reported by Ilya Shipitsin in github issue #104.
2019-05-22 14:59:08 +02:00
Willy Tarreau
05ed14cfc4 CLEANUP: threads: really move thread_info to hathreads.c
Commit 5a6e2245f ("REORG: threads: move the struct thread_info from
global.h to hathreads.h") didn't hold its promise well, as the thread_info
struct was still declared and initialized in haproxy.c in addition to being
in hathreads.c. Let's move it for real now.
2019-05-22 11:50:48 +02:00
Tim Duesterhus
9b7a976cd6 BUG/MINOR: mworker: Fix memory leak of mworker_proc members
The struct mworker_proc is not uniformly freed everywhere, sometimes leading
to leaks of the `id` string (and possibly the other strings).

Introduce a mworker_free_child function instead of duplicating the freeing
logic everywhere to prevent this kind of issues.

This leak was reported in issue #96.

It looks like the leaks have been introduced in commit 9a1ee7ac31,
which is specific to 2.0-dev. Backporting `mworker_free_child` might be
helpful to ease backporting other fixes, though.
2019-05-22 11:29:18 +02:00
Willy Tarreau
f61782418c CLEANUP: time: refine the test on _POSIX_TIMERS
The clock_gettime() man page says we must check that _POSIX_TIMERS is
defined to a value greater than zero, not just that it's simply defined
so let's fix this right now.
2019-05-21 20:03:03 +02:00
Emmanuel Hocdet
0ba4f483d2 MAJOR: polling: add event ports support (Solaris)
Event ports are kqueue/epoll polling class for Solaris. Code is based
on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8.
Event ports are available only on SunOS systems derived from
Solaris 10 and later (including illumos systems).
2019-05-21 15:16:45 +02:00
Willy Tarreau
663fda4c90 BUILD: threads: only assign the clock_id when supported
I took extreme care to always check for _POSIX_THREAD_CPUTIME before
manipulating clock_id, except at one place (run_thread_poll_loop) as
found by Manu, breaking Solaris. Now fixed, no backport needed.
2019-05-21 15:14:08 +02:00
Willy Tarreau
8323a375bc MINOR: threads: add a thread-local thread_info pointer "ti"
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
2019-05-20 21:14:12 +02:00
Willy Tarreau
624dcbf41e MINOR: threads: always place the clockid in the struct thread_info
It will be easier to deal with the internal API to always have it.
2019-05-20 21:13:01 +02:00
Willy Tarreau
91e6df01fa MINOR: threads: add each thread's clockid into the global thread_info
This is the per-thread CPU runtime clock, it will be used to measure
the CPU usage of each thread and by the lockup detection mechanism. It
must only be retrieved at the beginning of run_thread_poll_loop() since
the thread must already have been started for this. But it must be done
before performing any per-thread initcall so that all thread init
functions have access to the clock ID.

Note that it could make sense to always have this clockid available even
in non-threaded situations and place the process' clock there instead.
But it would add portability issues which are currently easy to deal
with by disabling threads so it may not be worth it for now.
2019-05-20 11:42:25 +02:00
Willy Tarreau
522cfbc1ea MINOR: init/threads: make the global threads an array of structs
This way we'll be able to store more per-thread information than just
the pthread pointer. The storage became an array of struct instead of
an allocated array since it's very small (typically 512 bytes) and not
worth the hassle of dealing with memory allocation on this. The array
was also renamed thread_info to make its intended usage more explicit.
2019-05-20 11:37:57 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we don't need anymore the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
3078e9f8e2 MINOR: threads/init: synchronize the threads startup
It's a bit dangerous to let threads initialize at different speeds on
startup. Some are still in their init functions while others area already
running. It was even subject to some race condition bugs like the one
fixed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on
startup").

Here in order to secure all this, we take a very simplistic approach
consisting in using half of the rendez-vous point, which is made
exactly for this purpose : we first initialize the mask of the threads
requesting a rendez-vous to the mask of all threads, and we simply call
thread_release() once the init is complete. This guarantees that no
thread will go further than the initialization code during this time.

This could even safely be backported if any other issue related to an
init race was discovered in a stable release.
2019-05-20 11:26:12 +02:00
William Lallemand
7b302d8dd5 MINOR: init: setenv HAPROXY_CFGFILES
Set the HAPROXY_CFGFILES environment variable which contains the list of
configuration files used to start haproxy, separated by semicolon.
2019-05-20 11:21:00 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
8d164dc568 CLEANUP: ssl: never include openssl/*.h outside of openssl-compat.h anymore
Since we're providing a compatibility layer for multiple OpenSSL
implementations and their derivatives, it is important that no C file
directly includes openssl headers but only passes via openssl-compat
instead. As a bonus this also gets rid of redundant complex rules for
inclusion of certain files (engines etc).
2019-05-10 09:36:42 +02:00
Willy Tarreau
5599456ee2 REORG: ssl: move openssl-compat from proto to common
This way we can include it much earlier to cover types/ as well.
2019-05-10 09:19:50 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Willy Tarreau
affd1b980a BUILD: ssl: fix again a libressl build failure after the openssl FD leak fix
As with every single OpenSSL fix, LibreSSL build broke again, this time
after commit 56996dabe ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on
reload"). A definitive solution will have to be found quickly. For now,
let's exclude libressl from the version test.

This patch must be backported to 1.9 since the fix above was already
backported there.
2019-05-09 13:55:33 +02:00
William Lallemand
27edc4b915 MINOR: mworker: support a configurable maximum number of reloads
This patch implements a new global parameter for the master-worker mode.
When setting the mworker-max-reloads value, a worker receive a SIGTERM
if its number of reloads is greater than this value.
2019-05-07 19:09:01 +02:00
Willy Tarreau
f656279347 CLEANUP: task: remove unneeded tests before task_destroy()
Since previous commit it's not needed anymore to test a task pointer
before calling task_destory() so let's just remove these tests from
the various callers before they become confusing. The function's
arguments were also documented. The same should probably be done
with tasklet_free() which involves a test in roughly half of the
call places.
2019-05-07 19:08:16 +02:00
Dragan Dosen
7d61a33921 BUG/MEDIUM: stick-table: fix regression caused by a change in proxy struct
In commit 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as
proxies."), the ->table member of proxy struct was replaced by a pointer
that is not always checked and in some situations can cause a segfault,
eg. during reload or while using "show table" on CLI socket.

No backport is needed.
2019-05-07 14:56:59 +02:00
Rob Allen
56996dabe6 BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload
From OpenSSL 1.1.1, the default behaviour is to maintain open FDs to any
random devices that get used by the random number library. As a result,
those FDs leak when the master re-execs on reload; since those FDs are
not marked FD_CLOEXEC or O_CLOEXEC, they also get inherited by children.
Eventually both master and children run out of FDs.

OpenSSL 1.1.1 introduces a new function to control whether the random
devices are kept open. When clearing the keep-open flag, it also closes
any currently open FDs, so it can be used to clean-up open FDs too.
Therefore, a call to this function is made in mworker_reload prior to
re-exec.

The call is guarded by whether SSL is in use, because it will cause
initialisation of the OpenSSL random number library if that has not
already been done.

This should be backported to 1.9 and 1.8.
2019-05-07 14:11:55 +02:00
Dragan Dosen
2674303912 MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct
Now we atomically allocate the my_regex struct within function
regex_comp() and compile the regex or free both in case of failure. The
pointer to the allocated my_regex struct is returned directly. The
my_regex* argument to regex_comp() is removed.

Function regex_free() was modified so that it systematically frees the
my_regex entry. The function does nothing when called with a NULL as
argument (like free()). It will avoid existing risk of not properly
freeing the initialized area.

Other structures are also updated in order to be compatible (the ones
related to Lua and action rules).
2019-05-07 06:58:15 +02:00
Frdric Lcaille
1b8e68e89a MEDIUM: stick-table: Stop handling stick-tables as proxies.
This patch adds the support for the "table" line parsing in "peers" sections
to declare stick-table in such sections. This also prevents the user from having
to declare dummy backends sections with a unique stick-table inside.
Even if still supported, this usage will become deprecated.

To do so, the ->table member of proxy struct which is a stktable struct is replaced
by a pointer to a stktable struct allocated at parsing time in src/cfgparse-listen.c
for the dummy stick-table backends and in src/cfgparse.c for "peers" sections.
This has an impact on the code for stick-table sample converters and on the stickiness
rules parsers which first store the name of the dummy before resolving the rules.
This patch replaces proxy_tbl_by_name() calls by stktable_find_by_name() calls
to lookup for stick-tables stored in "stktable_by_name" ebtree at parsing time.
There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c.

At several places in the code we relied on the fact that ->size member of stick-table
was equal to zero to consider the stick-table was present by not configured,
this do not make sense anymore as ->table member of struct proxyis fow now on a pointer.
These tests are replaced by a test on ->table value itself.

In "peers" section we do not have to temporary store the name of the section the
stick-table are attached to because this name is obviously already known just after
having entered this "peers" section.

About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by
a pointer to a stktable struct.
2019-05-07 06:54:06 +02:00
Willy Tarreau
c40efc1919 MINOR: init/threads: make the threads array global
Currently the thread array is a local variable inside a function block
and there is no access to it from outside, which often complicates
debugging. Let's make it global and export it. Also the allocation
return is now checked.
2019-05-03 10:16:30 +02:00
Willy Tarreau
b4f7cc3839 MINOR: init/threads: remove the useless tids[] array
It's still obscure how we managed to initialize an array of integers
with values always equal to the index, just to retrieve the value
from an opaque pointer to the index instead of directly using it! I
suspect it's a leftover from the very early threading experiments.

This commit gets rid of this and simply passes the thread ID as the
argument to run_thread_poll_loop(), thus significantly simplifying the
few call places and removing the need to allocate then free an array
of identity.
2019-05-03 09:59:15 +02:00
Willy Tarreau
81492c989c MINOR: threads: flatten the per-thread cpu-map
When we initially experimented with threads and processes support, we
needed to implement arrays of threads per process for cpu-map, but this
is not needed anymore since we support either threads or processes.
Let's simply make the thread-based cpu-map per thread and not per
thread and per process since that's not used anymore. Doing so reduces
the global struct from 33kB to 1.5kB.
2019-05-03 09:46:45 +02:00
Dragan Dosen
026ef570e1 BUG/MINOR: checks: free memory allocated for tasklets
The check->wait_list.task and agent->wait_list.task were not
freed properly on deinit().

This patch should be backported to 1.9.
2019-05-02 10:05:09 +02:00
Dragan Dosen
61302da0e7 BUG/MINOR: log: properly free memory on logformat parse error and deinit()
This patch may be backported to all supported versions.
2019-05-02 10:05:07 +02:00
Dragan Dosen
2a7c20f602 BUG/MINOR: haproxy: fix rule->file memory leak
When using the "use_backend" configuration directive, the configuration
file name stored as rule->file was not freed in some situations. This
was introduced in commit 4ed1c95 ("MINOR: http/conf: store the
use_backend configuration file and line for logs").

This patch should be backported to 1.9, 1.8 and 1.7.
2019-05-02 10:05:06 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Willy Tarreau
d83b6c1ab3 BUG/MINOR: mworker: disable busy polling in the master process
When enabling busy polling, we don't want the master to use it, or it
wastes a dedicated processor to this!

Must be backported to 1.9.
2019-04-18 11:34:41 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Willy Tarreau
636848aa86 MINOR: init: add a "set-dumpable" global directive to enable core dumps
It's always a pain to get a core dump when enabling user/group setting
(which disables the dumpable flag on Linux), when using a chroot and/or
when haproxy is started by a service management tool which requires
complex operations to just raise the core dump limit.

This patch introduces a new "set-dumpable" global directive to work
around these troubles by doing the following :

  - remove file size limits     (equivalent of ulimit -f unlimited)
  - remove core size limits     (equivalent of ulimit -c unlimited)
  - mark the process dumpable again (equivalent of suid_dumpable=1)

Some of these will depend on the operating system. This way it becomes
much easier to retrieve a core file. Temporarily moving the chroot to
a user-writable place generally enough.
2019-04-16 14:31:23 +02:00
William Lallemand
482f9a9a2f MINOR: mworker: export HAPROXY_MWORKER=1 when running in mworker mode
Export HAPROXY_MWORKER=1 in an environment variable when running in
mworker mode.
2019-04-16 13:26:43 +02:00
William Lallemand
8f7069a389 CLEANUP: mworker: remove the type field in mworker_proc
Since the introduction of the options field, we can use it to store the
type of process.

type = 'm' is replaced by PROC_O_TYPE_MASTER
type = 'w' is replaced by PROC_O_TYPE_WORKER
type = 'e' is replaced by PROC_O_TYPE_PROG

The old values are still used in the HAPROXY_PROCESSES environment
variable to pass the information during a reload.
2019-04-16 13:26:43 +02:00
Willy Tarreau
0f93672dfe BUG/MEDIUM: pattern: assign pattern IDs after checking the config validity
Pavlos Parissis reported an interesting case where some map identifiers
were not assigned (appearing as -1 in show map). It turns out that it
only happens for log-format expressions parsed in check_config_validity()
that involve maps (log-format, use_backend, unique-id-header), as in the
sample configuration below :

    frontend foo
        bind :8001
        unique-id-format %[src,map(addr.lst)]
        log-format %[src,map(addr.lst)]
        use_backend %[src,map(addr.lst)]

The reason stems from the initial introduction of unique IDs in 1.5 via
commit af5a29d5f ("MINOR: pattern: Each pattern is identified by unique
id.") : the unique_id assignment was done before calling
check_config_validity() so all maps loaded after this call are not
properly configured. From what the function does, it seems they will not
be able to use a cache, will not have a unique_id assigned and will not
be updatable from the CLI.

This fix must be backported to all supported versions.
2019-04-11 14:52:25 +02:00
William Lallemand
9a1ee7ac31 MEDIUM: mworker-prog: implement program for master-worker
This patch implements the external binary support in the master worker.

To configure an external process, you need to use the program section,
for example:

	program dataplane-api
		command ./dataplane_api

Those processes are launched at the same time as the workers.

During a reload of HAProxy, those processes are dealing with the same
sequence as a worker:

  - the master is re-executed
  - the master sends a USR1 signal to the program
  - the master launches a new instance of the program

During a stop, or restart, a SIGTERM is sent to the program.
2019-04-01 14:45:37 +02:00
William Lallemand
3f12887ffa MINOR: mworker: don't use children variable anymore
The children variable is still used in haproxy, it is not required
anymore since we have the information about the current workers in the
mworker_proc linked list.

The oldpids array is also replaced by this linked list when we
generated the arguments for the master reexec.
2019-04-01 14:45:37 +02:00
William Lallemand
f3a86831ae MINOR: mworker: calloc mworker_proc structures
Initialize mworker_proc structures to 0 with calloc instead of just
doing a malloc.
2019-04-01 14:45:37 +02:00
William Lallemand
9001ce8c2f REORG: mworker: move mworker_cleanlisteners to mworker.c 2019-04-01 14:45:37 +02:00
William Lallemand
e25473c846 REORG: mworker: move signal handlers and related functions
Move the following functions to mworker.c:

void mworker_catch_sighup(struct sig_handler *sh);
void mworker_catch_sigterm(struct sig_handler *sh);
void mworker_catch_sigchld(struct sig_handler *sh);

static void mworker_kill(int sig);
int current_child(int pid);
2019-04-01 14:45:37 +02:00
William Lallemand
3fa724db87 REORG: mworker: move IPC functions to mworker.c
Move the following functions to mworker.c:

void mworker_accept_wrapper(int fd);
void mworker_pipe_register();
2019-04-01 14:45:37 +02:00
William Lallemand
3cd95d2f1b REORG: mworker: move signals functions to mworker.c
Move the following functions to mworker.c:

void mworker_block_signals();
void mworker_unblock_signals();
2019-04-01 14:45:37 +02:00
William Lallemand
48dfbbdea9 REORG: mworker: move serializing functions to mworker.c
Move the 2 following functions to mworker.c:

void mworker_proc_list_to_env()
void mworker_env_to_proc_list()
2019-04-01 14:45:37 +02:00
Willy Tarreau
7b5654f54a BUILD: re-implement an initcall variant without using executable sections
The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.

On platforms like AIX or Cygwin it seems difficult to figure the
beginning and end of sections as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of single linked (one per init stage)
which are fed using constructors for each register call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :

$ size  haproxy.sections haproxy.constructor
   text    data     bss     dec     hex filename
4060312  249176 1457652 5767140  57ffe4 haproxy.sections
4062862  260408 1457652 5780922  5835ba haproxy.constructor

This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.

Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.
2019-04-01 07:43:07 +02:00
William Lallemand
f94afebb94 BUG/MEDIUM: mworker: don't free the wrong child when not found
A bug occurs when the sigchld handler is called and a child which is
not in the process list just left, or with an empty process list.

The child variable won't be set and left as an uninitialized variable or
set to the wrong child entry, which can lead to a free of this
uninitialized variable or of the wrong child.

This can lead to a crash of the master during a stop or a reload.

It is not supposed to happen with a worker which was created by the
master. A cause could be a fork made by a dependency. (openssl, lua ?)

This patch strengthens the case of the missing child by doing the free
only if the child was found.

This patch must be backported to 1.9.
2019-03-28 11:36:18 +01:00
Willy Tarreau
7728ed3565 BUILD: report the whole feature set with their status in haproxy -vv
It's not convenient not to know the status of default options, and
requires the user to know what option is enabled by default in each
target. With this patch, a new "Features list" line is added to the
output of "haproxy -vv" to report the whole list of known features
with their respective status. They're prefixed with a "+" when enabled
or a "-" when disabled. The "USE_" prefix is removed for clarity.
2019-03-27 14:32:58 +01:00
Willy Tarreau
679bba13f7 MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filterr and pollers.
2019-03-19 08:08:10 +01:00
Willy Tarreau
3f20085617 BUG/MEDIUM: init/threads: consider epoll_fd/pipes for automatic maxconn calculation
This is the equivalent of the previous patch for the automatic maxconn
calculation. This doesn't need any backport.
2019-03-14 20:02:37 +01:00
Willy Tarreau
2c58b41c96 BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
Each thread uses one epoll_fd or kqueue_fd, and a pipe (thus two FDs).
These ones have to be accounted for in the maxsock calculation, otherwise
we can reach maxsock before maxconn. This is difficult to observe but it
in fact happens when a server connects back to the frontend and has checks
enabled : the check uses its FD and serves to fill the loop. In this case
all FDs planed for the datapath are used for this.

This needs to be backported to 1.9 and 1.8.
2019-03-14 20:02:37 +01:00
Willy Tarreau
df23c0ce45 MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
eventhough most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
2019-03-13 10:10:49 +01:00
Willy Tarreau
ca783d4ee6 MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
2019-03-13 10:10:25 +01:00
Olivier Houchard
b23a61f78a MEDIUM: threads: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Willy Tarreau
ac35093a19 MEDIUM: init: make the global maxconn default to what rlim_fd_cur permits
The global maxconn value is often a pain to configure :
  - in development the user never has the permissions to increase the
    rlim_cur value too high and gets warnings all the time ;

  - in some production environments, users may have limited actions on
    it or may only be able to act on rlim_fd_cur using ulimit -n. This
    is sometimes particularly true in containers or whatever environment
    where the user has no privilege to upgrade the limits.

  - keeping config homogenous between machines is even less easy.

We already had the ability to automatically compute maxconn from the
memory limits when they were set. This patch goes a bit further by also
computing the limit permitted by the configured limit on the number of
FDs. For this it simply reverses the rlim_fd_cur calculation to determine
maxconn based on the number of reserved sockets for listeners & checks,
the number of SSL engines and the number of pipes (absolute or relative).

This way it becomes possible to make maxconn always be the highest possible
value resulting in maxsock matching what was set using "ulimit -n", without
ever setting it. Note that we adjust to the soft limit, not the hard one,
since it's what is configured with ulimit -n. This allows users to also
limit to low values if needed.

Just like before, the calculated value is reported in verbose mode.
2019-03-01 15:54:16 +01:00
Willy Tarreau
8d687d8464 MINOR: init: move some maxsock updates earlier
We'll need to know the global maxsock before the maxconn calculation.
Actually only two components were calculated too late, the peers FD
and the stats FD. Let's move them a few lines upward.
2019-03-01 15:53:14 +01:00
Willy Tarreau
5a023f0d7a MINOR: init: make the maxpipe computation more accurate
The default number of pipes is adjusted based on the sum of frontends
and backends maxconn/fullconn settings. Now that it is possible to have
a null maxconn on a frontend to indicate "unlimited" with commit
c8d5b95e6 ("MEDIUM: config: don't enforce a low frontend maxconn value
anymore"), the sum of maxconn may remain low and limited to the only
frontends/backends where this limit is set.

This patch considers this new unlimited case when doing the check, and
automatically switches to the default value which is maxconn/4 in this
case. All the calculation was moved to a distinct function for ease of
use. This function also supports returning unlimited (-1) when the
value depends on global.maxconn and this latter is not yet set.
2019-03-01 15:53:14 +01:00
Willy Tarreau
8dca19549a BUG/MINOR: mworker: be careful to restore the original rlim_fd_cur/max on reload
When the master re-execs itself on reload, it doesn't restore the initial
rlim_fd_cur/rlim_fd_max values, which have been modified by the ulimit-n
or global maxconn directives. This is a problem, because if these values
were set really low it could prevent the process from restarting, and if
they were set very high, this could have some implications on the restart
time, or later on the computed maxconn.

Let's simply reset these values to the ones we had at boot to maintain
the system in a consistent state.

A backport could be performed to 1.9 and maybe 1.8. This patch depends on
the two previous ones.
2019-03-01 11:26:08 +01:00
Willy Tarreau
e5cfdacb83 BUG/MINOR: init: never lower rlim_fd_max
If a ulimit-n value is set, we must not lower the rlim_max value if the
new value is lower, we must only adjust the rlim_cur one. The effect is
that on very low values, this could prevent a master-worker reload, or
make an external check fail by lack of FDs.

This may be backported to 1.9 and earlier, but it depends on this patch
"MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max
values".
2019-03-01 10:40:30 +01:00
Willy Tarreau
bf6964007a MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max values
Let's keep a copy of these initial values. They will be useful to
compute automatic maxconn, as well as to restore proper limits when
doing an execve() on external checks.
2019-03-01 10:40:30 +01:00
Willy Tarreau
c8d5b95e6d MEDIUM: config: don't enforce a low frontend maxconn value anymore
Historically the default frontend's maxconn used to be quite low (2000),
which was sufficient two decades ago but often proved to be a problem
when users had purposely set the global maxconn value but forgot to set
the frontend's.

There is no point in keeping this arbitrary limit for frontends : when
the global maxconn is lower, it's already too high and when the global
maxconn is much higher, it becomes a limiting factor which causes trouble
in production.

This commit allows the value to be set to zero, which becomes the new
default value, to mean it's not directly limited, or in fact it's set
to the global maxconn. Since this operation used to be performed before
computing a possibly automatic global maxconn based on memory limits,
the calculation of the maxconn value and its propagation to the backends'
fullconn has now moved to a dedicated function, proxy_adjust_all_maxconn(),
which is called once the global maxconn is stabilized.

This comes with two benefits :
  1) a configuration missing "maxconn" in the defaults section will not
     limit itself to a magically hardcoded value but will scale up to the
     global maxconn ;

  2) when the global maxconn is not set and memory limits are used instead,
     the frontends' maxconn automatically adapts, and the backends' fullconn
     as well.
2019-02-28 17:05:32 +01:00
Willy Tarreau
149ab779cc MAJOR: threads: enable one thread per CPU by default
Threads have long matured by now, still for most users their usage is
not trivial. It's about time to enable them by default on platforms
where we know the number of CPUs bound. This patch does this, it counts
the number of CPUs the process is bound to upon startup, and enables as
many threads by default. Of course, "nbthread" still overrides this, but
if it's not set the default behaviour is to start one thread per CPU.

The default number of threads is reported in "haproxy -vv". Simply using
"taskset -c" is now enough to adjust this number of threads so that there
is no more need for playing with cpu-map. And thanks to the previous
patches on the listener, the vast majority of configurations will not
need to duplicate "bind" lines with the "process x/y" statement anymore
either, so a simple config will automatically adapt to the number of
processors available.
2019-02-27 14:51:50 +01:00
Willy Tarreau
7ac908bf8c MINOR: config: add global tune.listener.multi-queue setting
tune.listener.multi-queue { on | off }
  Enables ('on') or disables ('off') the listener's multi-queue accept which
  spreads the incoming traffic to all threads a "bind" line is allowed to run
  on instead of taking them for itself. This provides a smoother traffic
  distribution and scales much better, especially in environments where threads
  may be unevenly loaded due to external activity (network interrupts colliding
  with one thread for example). This option is enabled by default, but it may
  be forcefully disabled for troubleshooting or for situations where it is
  estimated that the operating system already provides a good enough
  distribution and connections are extremely short-lived.
2019-02-27 14:27:07 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Willy Tarreau
6c1b667e57 [RELEASE] Released version 2.0-dev1
Released version 2.0-dev1 with the following main changes :
    - MINOR: mux-h2: only increase the connection window with the first update
    - REGTESTS: remove the expected window updates from H2 handshakes
    - BUG/MINOR: mux-h2: make empty HEADERS frame return a connection error
    - BUG/MEDIUM: mux-h2: mark that we have too many CS once we have more than the max
    - MEDIUM: mux-h2: remove padlen during headers phase
    - MINOR: h2: add a bit-based frame type representation
    - MINOR: mux-h2: remove useless check for empty frame length in h2s_decode_headers()
    - MEDIUM: mux-h2: decode HEADERS frames before allocating the stream
    - MINOR: mux-h2: make h2c_send_rst_stream() use the dummy stream's error code
    - MINOR: mux-h2: add a new dummy stream for the REFUSED_STREAM error code
    - MINOR: mux-h2: fail stream creation more cleanly using RST_STREAM
    - MINOR: buffers: add a new b_move() function
    - MINOR: mux-h2: make h2_peek_frame_hdr() support an offset
    - MEDIUM: mux-h2: handle decoding of CONTINUATION frames
    - CLEANUP: mux-h2: remove misleading comments about CONTINUATION
    - BUG/MEDIUM: servers: Don't try to reuse connection if we switched server.
    - BUG/MEDIUM: tasks: Decrement tasks_run_queue in tasklet_free().
    - BUG/MINOR: htx: send the proper authenticate header when using http-request auth
    - BUG/MEDIUM: mux_h2: Don't add to the idle list if we're full.
    - BUG/MEDIUM: servers: Fail if we fail to allocate a conn_stream.
    - BUG/MAJOR: servers: Use the list api correctly to avoid crashes.
    - BUG/MAJOR: servers: Correctly use LIST_ELEM().
    - BUG/MAJOR: sessions: Use an unlimited number of servers for the conn list.
    - BUG/MEDIUM: servers: Flag the stream_interface on handshake error.
    - MEDIUM: servers: Be smarter when switching connections.
    - MEDIUM: sessions: Keep track of which connections are idle.
    - MINOR: payload: add sample fetch for TLS ALPN
    - BUG/MEDIUM: log: don't mark log FDs as non-blocking on terminals
    - MINOR: channel: Add the function channel_add_input
    - MINOR: stats/htx: Call channel_add_input instead of updating channel state by hand
    - BUG/MEDIUM: cache: Be sure to end the forwarding when XFER length is unknown
    - BUG/MAJOR: htx: Return the good block address after a defrag
    - MINOR: lb: allow redispatch when using consistent hash
    - CLEANUP: mux-h2: fix end-of-stream flag name when processing headers
    - BUG/MEDIUM: mux-h2: always restart reading if data are available
    - BUG/MINOR: mux-h2: set the stream-full flag when leaving h2c_decode_headers()
    - BUG/MINOR: mux-h2: don't check the CS count in h2c_bck_handle_headers()
    - BUG/MINOR: mux-h2: mark end-of-stream after processing response HEADERS, not before
    - BUG/MINOR: mux-h2: only update rxbuf's length for H1 headers
    - BUG/MEDIUM: mux-h1: use per-direction flags to indicate transitions
    - BUG/MEDIUM: mux-h1: make HTX chunking consistent with H2
    - BUG/MAJOR: stream-int: Update the stream expiration date in stream_int_notify()
    - BUG/MEDIUM: proto-htx: Set SI_FL_NOHALF on server side when request is done
    - BUG/MEDIUM: mux-h1: Add a task to handle connection timeouts
    - MINOR: mux-h2: make h2c_decode_headers() return a status, not a count
    - MINOR: mux-h2: add a new dummy stream : h2_error_stream
    - MEDIUM: mux-h2: make h2c_decode_headers() support recoverable errors
    - BUG/MINOR: mux-h2: detect when the HTX EOM block cannot be added after headers
    - MINOR: mux-h2: remove a misleading and impossible test
    - CLEANUP: mux-h2: clean the stream error path on HEADERS frame processing
    - MINOR: mux-h2: check for too many streams only for idle streams
    - MINOR: mux-h2: set H2_SF_HEADERS_RCVD when a HEADERS frame was decoded
    - BUG/MEDIUM: mux-h2: decode trailers in HEADERS frames
    - MINOR: h2: add h2_make_h1_trailers to turn H2 headers to H1 trailers
    - MEDIUM: mux-h2: pass trailers to H1 (legacy mode)
    - MINOR: htx: add a new function to add a block without filling it
    - MINOR: h2: add h2_make_htx_trailers to turn H2 headers to HTX trailers
    - MEDIUM: mux-h2: pass trailers to HTX
    - MINOR: mux-h1: parse the content-length header on output and set H1_MF_CLEN
    - BUG/MEDIUM: mux-h1: don't enforce chunked encoding on requests
    - MINOR: mux-h2: make HTX_BLK_EOM processing idempotent
    - MINOR: h1: make the H1 headers block parser able to parse headers only
    - MEDIUM: mux-h2: emit HEADERS frames when facing HTX trailers blocks
    - MINOR: stream/htx: Add info about the HTX structs in "show sess all" command
    - MINOR: stream: Add the subscription events of SIs in "show sess all" command
    - MINOR: mux-h1: Add the subscription events in "show fd" command
    - BUG/MEDIUM: h1: Get the h1m state when restarting the headers parsing
    - BUG/MINOR: cache/htx: Be sure to count partial trailers
    - BUG/MEDIUM: h1: In h1_init(), wake the tasklet instead of calling h1_recv().
    - BUG/MEDIUM: server: Defer the mux init until after xprt has been initialized.
    - MINOR: connections: Remove a stall comment.
    - BUG/MEDIUM: cli: make "show sess" really thread-safe
    - BUILD: add a new file "version.c" to carry version updates
    - MINOR: stream/htx: add the HTX flags output in "show sess all"
    - MINOR: stream/cli: fix the location of the waiting flag in "show sess all"
    - MINOR: stream/cli: report more info about the HTTP messages on "show sess all"
    - BUG/MINOR: lua: bad args are returned for Lua actions
    - BUG/MEDIUM: lua: dead lock when Lua tasks are trigerred
    - MINOR: htx: Add an helper function to get the max space usable for a block
    - MINOR: channel/htx: Add HTX version for some helper functions
    - BUG/MEDIUM: cache/htx: Respect the reserve when cached objects are served
    - BUG/MINOR: stats/htx: Respect the reserve when the stats page is dumped
    - DOC: regtest: make it clearer what the purpose of the "broken" series is
    - REGTEST: mailers: add new test for 'mailers' section
    - REGTEST: Add a reg test for health-checks over SSL/TLS.
    - BUG/MINOR: mux-h1: Close connection on shutr only when shutw was really done
    - MEDIUM: mux-h1: Clarify how shutr/shutw are handled
    - BUG/MINOR: compression: Disable it if another one is already in progress
    - BUG/MINOR: filters: Detect cache+compression config on legacy HTTP streams
    - BUG/MINOR: cache: Disable the cache if any compression filter precedes it
    - REGTEST: Add some informatoin to test results.
    - MINOR: htx: Add a function to truncate all blocks after a specific offset
    - MINOR: channel/htx: Add the HTX version of channel_truncate/erase
    - BUG/MINOR: proto_htx: Use HTX versions to truncate or erase a buffer
    - BUG/CRITICAL: mux-h2: re-check the frame length when PRIORITY is used
    - DOC: Fix typo in req.ssl_alpn example (commit 4afdd138424ab...)
    - DOC: http-request cache-use / http-response cache-store expects cache name
    - REGTEST: "capture (request|response)" regtest.
    - BUG/MINOR: lua/htx: Respect the reserve when data are send from an HTX applet
    - REGTEST: filters: add compression test
    - BUG/MEDIUM: init: Initialize idle_orphan_conns for first server in server-template
    - BUG/MEDIUM: ssl: Disable anti-replay protection and set max data with 0RTT.
    - DOC: Be a bit more explicit about allow-0rtt security implications.
    - MINOR: mux-h1: make the mux_h1_ops struct static
    - BUILD: makefile: add an EXTRA_OBJS variable to help build optional code
    - BUG/MEDIUM: connection: properly unregister the mux on failed initialization
    - BUG/MAJOR: cache: fix confusion between zero and uninitialized cache key
    - REGTESTS: test case for map_regm commit 271022150d
    - REGTESTS: Basic tests for concat,strcmp,word,field,ipmask converters
    - REGTESTS: Basic tests for using maps to redirect requests / select backend
    - DOC: REGTESTS README varnishtest -Dno-htx= define.
    - MINOR: spoe: Make the SPOE filter compatible with HTX proxies
    - MINOR: checks: Store the proxy in checks.
    - BUG/MEDIUM: checks: Avoid having an associated server for email checks.
    - REGTEST: Switch to vtest.
    - REGTEST: Adapt reg test doc files to vtest.
    - BUG/MEDIUM: h1: Make sure we destroy an inactive connectin that did shutw.
    - BUG/MINOR: base64: dec func ignores padding for output size checking
    - BUG/MEDIUM: ssl: missing allocation failure checks loading tls key file
    - MINOR: ssl: add support of aes256 bits ticket keys on file and cli.
    - BUG/MINOR: backend: don't use url_param_name as a hint for BE_LB_ALGO_PH
    - BUG/MINOR: backend: balance uri specific options were lost across defaults
    - BUG/MINOR: backend: BE_LB_LKUP_CHTREE is a value, not a bit
    - MINOR: backend: move url_param_name/len to lbprm.arg_str/len
    - MINOR: backend: make headers and RDP cookie also use arg_str/len
    - MINOR: backend: add new fields in lbprm to store more LB options
    - MINOR: backend: make the header hash use arg_opt1 for use_domain_only
    - MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3}
    - MINOR: backend: move hash_balance_factor out of chash
    - MEDIUM: backend: move all LB algo parameters into an union
    - MINOR: backend: make the random algorithm support a number of draws
    - BUILD/MEDIUM: da: Necessary code changes for new buffer API.
    - BUG/MINOR: stick_table: Prevent conn_cur from underflowing
    - BUG: 51d: Changes to the buffer API in 1.9 were not applied to the 51Degrees code.
    - BUG/MEDIUM: stats: Get the right scope pointer depending on HTX is used or not
    - DOC: add a missing space in the documentation for bc_http_major
    - REGTEST: checks basic stats webpage functionality
    - BUG/MEDIUM: servers: Make assign_tproxy_address work when ALPN is set.
    - BUG/MEDIUM: connections: Add the CO_FL_CONNECTED flag if a send succeeded.
    - DOC: add github issue templates
    - MINOR: cfgparse: Extract some code to be re-used.
    - CLEANUP: cfgparse: Return asap from cfg_parse_peers().
    - CLEANUP: cfgparse: Code reindentation.
    - MINOR: cfgparse: Useless frontend initialization in "peers" sections.
    - MINOR: cfgparse: Rework peers frontend init.
    - MINOR: cfgparse: Simplication.
    - MINOR: cfgparse: Make "peer" lines be parsed as "server" lines.
    - MINOR: peers: Make outgoing connection to SSL/TLS peers work.
    - MINOR: cfgparse: SSL/TLS binding in "peers" sections.
    - DOC: peers: SSL/TLS documentation for "peers"
    - BUG/MINOR: startup: certain goto paths in init_pollers fail to free
    - BUG/MEDIUM: checks: fix recent regression on agent-check making it crash
    - BUG/MINOR: server: don't always trust srv_check_health when loading a server state
    - BUG/MINOR: check: Wake the check task if the check is finished in wake_srv_chk()
    - BUG/MEDIUM: ssl: Fix handling of TLS 1.3 KeyUpdate messages
    - DOC: mention the effect of nf_conntrack_tcp_loose on src/dst
    - BUG/MINOR: proto-htx: Return an error if all headers cannot be received at once
    - BUG/MEDIUM: mux-h2/htx: Respect the channel's reserve
    - BUG/MINOR: mux-h1: Apply the reserve on the channel's buffer only
    - BUG/MINOR: mux-h1: avoid copying output over itself in zero-copy
    - BUG/MAJOR: mux-h2: don't destroy the stream on failed allocation in h2_snd_buf()
    - BUG/MEDIUM: backend: also remove from idle list muxes that have no more room
    - BUG/MEDIUM: mux-h2: properly abort on trailers decoding errors
    - MINOR: h2: declare new sets of frame types
    - BUG/MINOR: mux-h2: CONTINUATION in closed state must always return GOAWAY
    - BUG/MINOR: mux-h2: headers-type frames in HREM are always a connection error
    - BUG/MINOR: mux-h2: make it possible to set the error code on an already closed stream
    - BUG/MINOR: hpack: return a compression error on invalid table size updates
    - MINOR: server: make sure pool-max-conn is >= -1
    - BUG/MINOR: stream: take care of synchronous errors when trying to send
    - CLEANUP: server: fix indentation mess on idle connections
    - BUG/MINOR: mux-h2: always check the stream ID limit in h2_avail_streams()
    - BUG/MINOR: mux-h2: refuse to allocate a stream with too high an ID
    - BUG/MEDIUM: backend: never try to attach to a mux having no more stream available
    - MINOR: server: add a max-reuse parameter
    - MINOR: mux-h2: always consider a server's max-reuse parameter
    - MEDIUM: stream-int: always mark pending outgoing SI_ST_CON
    - MINOR: stream: don't wait before retrying after a failed connection reuse
    - MEDIUM: h2: always parse and deduplicate the content-length header
    - BUG/MINOR: mux-h2: always compare content-length to the sum of DATA frames
    - CLEANUP: h2: Remove debug printf in mux_h2.c
    - MINOR: cfgparse: make the process/thread parser support a maximum value
    - MINOR: threads: make MAX_THREADS configurable at build time
    - DOC: nbthread is no longer experimental.
    - BUG/MINOR: listener: always fill the source address for accepted socketpairs
    - BUG/MINOR: mux-h2: do not report available outgoing streams after GOAWAY
    - BUG/MINOR: spoe: corrected fragmentation string size
    - BUG/MINOR: task: fix possibly missed event in inter-thread wakeups
    - BUG/MEDIUM: servers: Attempt to reuse an unfinished connection on retry.
    - BUG/MEDIUM: backend: always call si_detach_endpoint() on async connection failure
    - SCRIPTS: add the issue tracker URL to the announce script
    - MINOR: peers: Extract some code to be reused.
    - CLEANUP: peers: Indentation fixes.
    - MINOR: peers: send code factorization.
    - MINOR: peers: Add new functions to send code and reduce the I/O handler.
    - MEDIUM: peers: synchronizaiton code factorization to reduce the size of the I/O handler.
    - MINOR: peers: Move update receive code to reduce the size of the I/O handler.
    - MINOR: peers: Move ack, switch and definition receive code to reduce the size of the I/O handler.
    - MINOR: peers: Move high level receive code to reduce the size of I/O handler.
    - CLEANUP: peers: Be more generic.
    - MINOR: peers: move error handling to reduce the size of the I/O handler.
    - MINOR: peers: move messages treatment code to reduce the size of the I/O handler.
    - MINOR: peers: move send code to reduce the size of the I/O handler.
    - CLEANUP: peers: Remove useless statements.
    - MINOR: peers: move "hello" message treatment code to reduce the size of the I/O handler.
    - MINOR: peers: move peer initializations code to reduce the size of the I/O handler.
    - CLEANUP: peers: factor the error handling code in peer_treet_updatemsg()
    - CLEANUP: peers: factor error handling in peer_treat_definedmsg()
    - BUILD/MINOR: peers: shut up a build warning introduced during last cleanup
    - BUG/MEDIUM: mux-h2: only close connection on request frames on closed streams
    - CLEANUP: mux-h2: remove two useless but misleading assignments
    - BUG/MEDIUM: checks: Check that conn_install_mux succeeded.
    - BUG/MEDIUM: servers: Only destroy a conn_stream we just allocated.
    - BUG/MEDIUM: servers: Don't add an incomplete conn to the server idle list.
    - BUG/MEDIUM: checks: Don't try to set ALPN if connection failed.
    - BUG/MEDIUM: h2: In h2_send(), stop the loop if we failed to alloc a buf.
    - BUG/MEDIUM: peers: Handle mux creation failure.
    - BUG/MEDIUM: servers: Close the connection if we failed to install the mux.
    - BUG/MEDIUM: compression: Rewrite strong ETags
    - BUG/MINOR: deinit: tcp_rep.inspect_rules not deinit, add to deinit
    - CLEANUP: mux-h2: remove misleading leftover test on h2s' nullity
    - BUG/MEDIUM: mux-h2: wake up flow-controlled streams on initial window update
    - BUG/MEDIUM: mux-h2: fix two half-closed to closed transitions
    - BUG/MEDIUM: mux-h2: make sure never to send GOAWAY on too old streams
    - BUG/MEDIUM: mux-h2: do not abort HEADERS frame before decoding them
    - BUG/MINOR: mux-h2: make sure response HEADERS are not received in other states than OPEN and HLOC
    - MINOR: h2: add a generic frame checker
    - MEDIUM: mux-h2: check the frame validity before considering the stream state
    - CLEANUP: mux-h2: remove stream ID and frame length checks from the frame parsers
    - BUG/MINOR: mux-h2: make sure request trailers on aborted streams don't break the connection
    - DOC: compression: Update the reasons for disabled compression
    - BUG/MEDIUM: buffer: Make sure b_is_null handles buffers waiting for allocation.
    - DOC: htx: make it clear that htxbuf() and htx_from_buf() always return valid pointers
    - MINOR: htx: never check for null htx pointer in htx_is_{,not_}empty()
    - MINOR: mux-h2: consistently rely on the htx variable to detect the mode
    - BUG/MEDIUM: peers: Peer addresses parsing broken.
    - BUG/MEDIUM: mux-h1: Don't add "transfer-encoding" if message-body is forbidden
    - BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE.
    - BUG/MINOR: stream: don't close the front connection when facing a backend error
    - BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
    - MINOR: stream-int: add a new flag to mention that we want the connection to be killed
    - MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
    - BUG/MEDIUM: mux-h2: do not close the connection on aborted streams
    - BUG/MINOR: server: fix logic flaw in idle connection list management
    - MINOR: mux-h2: max-concurrent-streams should be unsigned
    - MINOR: mux-h2: make sure to only check concurrency limit on the frontend
    - MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
    - BUG/MEDIUM: mux-h2: properly consider the peer's advertised max-concurrent-streams
    - MINOR: xref: Add missing barriers.
    - MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_free().
    - MINOR: debug: Add an option that causes random allocation failures.
    - BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
    - BUG/MEDIUM: htx: check the HTX compatibility in dynamic use-backend rules
    - BUG/MINOR: tune.fail-alloc: Don't forget to initialize ret.
    - BUG/MINOR: backend: check srv_conn before dereferencing it
    - BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
    - BUG/MEDIUM: mux-h2: always set :authority on request output
    - BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().
    - BUG/MINOR: threads: fix the process range of thread masks
    - BUG/MINOR: config: fix bind line thread mask validation
    - CLEANUP: threads: fix misleading comment about all_threads_mask
    - CLEANUP: threads: use nbits to calculate the thread mask
    - OPTIM: listener: optimize cache-line packing for struct listener
    - MINOR: tools: improve the popcount() operation
    - MINOR: config: keep an all_proc_mask like we have all_threads_mask
    - MINOR: global: add proc_mask() and thread_mask()
    - MINOR: config: simplify bind_proc processing using proc_mask()
    - MINOR: threads: make use of thread_mask() to simplify some thread calculations
    - BUG/MINOR: compression: properly report compression stats in HTX mode
    - BUG/MINOR: task: close a tiny race in the inter-thread wakeup
    - BUG/MAJOR: config: verify that targets of track-sc and stick rules are present
    - BUG/MAJOR: spoe: verify that backends used by SPOE cover all their callers' processes
    - BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
    - BUG/MINOR: config: make sure to count the error on incorrect track-sc/stick rules
    - DOC: ssl: Clarify when pre TLSv1.3 cipher can be used
    - DOC: ssl: Stop documenting ciphers example to use
    - BUG/MINOR: spoe: do not assume agent->rt is valid on exit
    - BUG/MINOR: lua: initialize the correct idle conn lists for the SSL sockets
    - BUG/MEDIUM: spoe: initialization depending on nbthread must be done last
    - BUG/MEDIUM: server: initialize the idle conns list after parsing the config
    - BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
    - MINOR: config: make MAX_PROCS configurable at build time
    - BUG/MAJOR: spoe: Don't try to get agent config during SPOP healthcheck
    - BUG/MINOR: config: Reinforce validity check when a process number is parsed
    - BUG/MEDIUM: peers: check that p->srv actually exists before using p->srv->use_ssl
    - CONTRIB: contrib/prometheus-exporter: Add a Prometheus exporter for HAProxy
    - BUG/MINOR: mux-h1: verify the request's version before dropping connection: keep-alive
    - BUG: 51d: In Hash Trie, multi header matching was affected by the header names stored globaly.
    - MEDIUM: 51d: Enabled multi threaded operation in the 51Degrees module.
    - BUG/MAJOR: stream: avoid double free on unique_id
    - BUILD/MINOR: stream: avoid a build warning with threads disabled
    - BUILD/MINOR: tools: fix build warning in the date conversion functions
    - BUILD/MINOR: peers: remove an impossible null test in intencode()
    - BUILD/MINOR: htx: fix some potential null-deref warnings with http_find_stline
    - BUG/MEDIUM: peers: Missing peer initializations.
    - BUG/MEDIUM: http_fetch: fix the "base" and "base32" fetch methods in HTX mode
    - BUG/MEDIUM: proto_htx: Fix data size update if end of the cookie is removed
    - BUG/MEDIUM: http_fetch: fix "req.body_len" and "req.body_size" fetch methods in HTX mode
    - BUILD/MEDIUM: initcall: Fix build on MacOS.
    - BUG/MEDIUM: mux-h2/htx: Always set CS flags before exiting h2_rcv_buf()
    - MINOR: h2/htx: Set the flag HTX_SL_F_BODYLESS for messages without body
    - BUG/MINOR: mux-h1: Add "transfer-encoding" header on outgoing requests if needed
    - BUG/MINOR: mux-h2: Don't add ":status" pseudo-header on trailers
    - BUG/MINOR: proto-htx: Consider a XFER_LEN message as chunked by default
    - BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled
    - MINOR: mux-h2: Set HTX extra value when possible
    - BUG/MEDIUM: htx: count the amount of copied data towards the final count
    - MINOR: mux-h2: make the H2 MAX_FRAME_SIZE setting configurable
    - BUG/MEDIUM: mux-h2/htx: send an empty DATA frame on empty HTX trailers
    - BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
    - BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
    - MINOR: fd: add a new my_closefrom() function to close all FDs
    - MINOR: checks: use my_closefrom() to close all FDs
    - MINOR: fd: implement an optimised my_closefrom() function
    - BUG/MINOR: fd: make sure my_closefrom() doesn't miss some FDs
    - BUG/MAJOR: fd/threads, task/threads: ensure all spin locks are unlocked
    - BUG/MAJOR: listener: Make sure the listener exist before using it.
    - MINOR: fd: Use closefrom() as my_closefrom() if supported.
    - BUG/MEDIUM: mux-h1: Report the right amount of data xferred in h1_rcv_buf()
    - BUG/MINOR: channel: Set CF_WROTE_DATA when outgoing data are skipped
    - MINOR: htx: Add function to drain data from an HTX message
    - MINOR: channel/htx: Add function to skips output bytes from an HTX channel
    - BUG/MAJOR: cache/htx: Set the start-line offset when a cached object is served
    - BUG/MEDIUM: cache: Get objects from the cache only for GET and HEAD requests
    - BUG/MINOR: cache/htx: Return only the headers of cached objects to HEAD requests
    - BUG/MINOR: mux-h1: Always initilize h1m variable in h1_process_input()
    - BUG/MEDIUM: proto_htx: Fix functions applying regex filters on HTX messages
    - BUG/MEDIUM: h2: advertise to servers that we don't support push
    - MINOR: standard: Add a function to parse uints (dotted notation).
    - MINOR: arg: Add support for ARGT_PBUF_FNUM arg type.
    - MINOR: http_fetch: add "req.ungrpc" sample fetch for gRPC.
    - MINOR: sample: Add two sample converters for protocol buffers.
    - DOC: sample: Add gRPC related documentation.
2019-02-26 16:43:49 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Willy Tarreau
ff9c9140f4 MINOR: config: make MAX_PROCS configurable at build time
For some embedded systems, it's pointless to have 32- or even 64- large
arrays of processes when it's known that much fewer processes will be
used in the worst case. Let's introduce this MAX_PROCS define which
contains the highest number of processes allowed to run at once. It
still defaults to LONGBITS but may be lowered.
2019-02-07 15:10:19 +01:00
Willy Tarreau
a38a7175b1 MINOR: config: keep an all_proc_mask like we have all_threads_mask
This simplifies some mask comparisons at various places where
nbits(global.nbproc) was used.
2019-02-04 05:09:15 +01:00
Kevin Zhu
13ebef7ecb BUG/MINOR: deinit: tcp_rep.inspect_rules not deinit, add to deinit
It seems like this can be backported as far as 1.5.
2019-01-30 10:22:34 +01:00
Willy Tarreau
0cac26cd88 MEDIUM: backend: move all LB algo parameters into an union
Since all of them are exclusive, let's move them to an union instead
of eating memory with the sum of all of them. We're using a transparent
union to limit the code changes.

Doing so reduces the struct lbprm from 392 bytes to 372, and thanks
to these changes, the struct proxy is now down to 6480 bytes vs 6624
before the changes (144 bytes saved per proxy).
2019-01-14 19:33:17 +01:00
Willy Tarreau
4c03d1c9b6 MINOR: backend: move url_param_name/len to lbprm.arg_str/len
This one is exclusively used by LB parameters, when using URL param
hashing. Let's move it to the lbprm struct under a more generic name.
2019-01-14 19:33:17 +01:00
Willy Tarreau
909b9d852b BUILD: add a new file "version.c" to carry version updates
While testing fixes, it's sometimes confusing to rebuild only one C file
(e.g. a mux) and not to have the correct commit ID reported in "haproxy -v"
nor on the stats page.

This patch adds a new "version.c" file which is always rebuilt. It's
very small and contains only 3 variables derived from the various
version strings. These variables are used instead of the macros at the
few places showing the version. This way the output version of the
running code is always correct for the parts that were rebuilt.
2019-01-04 18:20:32 +01:00
Willy Tarreau
d22d69bd58 CLEANUP: remove my name and address from the copyright banner
First, it's a pain to always have to think about updating this date,
second for a long time I've not been the only developer there, and third,
some users contact me hoping to get help that I can't deliver. It's about
time to redirect them to the main site where all the useful links should
be.
2018-12-19 19:07:04 +01:00
William Lallemand
a57b7e33ef MINOR: cli: implements 'reload' on master CLI
The reload command reload the haproxy master like it is done with a kill
-USR2 on the master process.
2018-12-15 13:33:49 +01:00
William Lallemand
6e0d8aee26 BUG/MINOR: mworker: don't use unitialized mworker_proc struct
If the reload fail after the parsing of the configuration, the
mworker_proc structures are created for the processes it tried to
create.

The mworker_proc_list_to_env() function was exporting these unitialized
structures in the "HAPROXY_PROCESSES" environment variable which was
leading to this kind of output in "show proc":

4294967295      worker          [was: 1]        1               17879d 16h26m28s
2018-12-14 19:41:38 +01:00
William Lallemand
2672eb987a MINOR: mworker: set all_threads_mask and pid_bit to 1
Reinit the all_threads_mask and pid_bit to 1 before the master polling
loop, this process is monothread.
2018-12-14 16:01:37 +01:00
Willy Tarreau
c77d364905 MINOR: config: round up global.tune.bufsize to the next multiple of 2 void*
Since HTX casts the buffer to a struct and stores relative pointers at the
end, it is mandatory that its end is properly aligned. This patch enforces
a buffer size rounding up to the next multiple of two void*, thus 8 on
32-bit and 16 on 64-bit, to match what malloc() already does on the beginning
of the buffer. In pratice it will never be really noticeable since default
sizes already are such multiples.
2018-12-12 06:19:42 +01:00
William Lallemand
27f3fa56f5 BUG/MEDIUM: mworker: stop every tasks in the master
The master is not supposed to run (at the moment) any task before the
polling loop, the created tasks should be run only in the workers but in
the master they should be disabled or removed.

No backport needed.
2018-12-06 14:12:58 +01:00
William Lallemand
2fd45fae46 BUG/MEDIUM: mworker: stop proxies which have no listener in the master
The previous code was only stopping the listeners in the master, not the
entire proxy.

Since we now have a polling loop in the master, there might be some side
effects, indeed some things that are still initialized. For example the
checks were still running.
2018-12-04 05:54:33 +01:00
Olivier Houchard
0c18a6fe34 MEDIUM: servers: Add a way to keep idle connections alive.
Add a new keyword for servers, "idle-timeout". If set, unused connections are
kept alive until the timeout happens, and will be picked for reuse if no
other connection is available.
2018-12-02 18:16:53 +01:00
Willy Tarreau
b6b3df3ed3 MEDIUM: initcall: use initcalls for a few initialization functions
signal_init(), init_log(), init_stream(), and init_task() all used to
only preset some values and lists. This needs to be done very early to
provide a reliable interface to all other users. The calls used to be
explicit in haproxy.c:init(). Now they're placed in initcalls at the
STG_PREPARE stage. The functions are not exported anymore.
2018-11-26 19:50:32 +01:00
Willy Tarreau
2455cebe00 MEDIUM: memory: use pool_destroy_all() to destroy all pools on deinit()
Instead of exporting a number of pools and having to manually delete
them in deinit() or to have dedicated destructors to remove them, let's
simply kill all pools on deinit().

For this a new function pool_destroy_all() was introduced. As its name
implies, it destroys and frees all pools (provided they don't have any
user anymore of course).

This allowed to remove 4 implicit destructors, 2 explicit ones, and 11
individual calls to pool_destroy(). In addition it properly removes
the mux_pt_ctx pool which was not cleared on exit (no backport needed
here since it's 1.9 only). The sig_handler pool doesn't need to be
exported anymore and became static now.
2018-11-26 19:50:32 +01:00
Willy Tarreau
8ceae72d44 MEDIUM: init: use initcall for all fixed size pool creations
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no more user of potentially uninitialized pool now.

It has been the opportunity to remove no less than 12 constructors
and 6 init functions.
2018-11-26 19:50:32 +01:00
Willy Tarreau
5794fb0c22 MINOR: init: process all initcalls in order at boot time
main() now iterates over all initcall stages at boot time. This will allow
to move init code from constructors to initcalls.
2018-11-26 19:50:32 +01:00
William Lallemand
7c756a8ccc BUG/MEDIUM: mworker: fix FD leak upon reload
We reintroduced some FDs leaking by using a poller and some listeners in
the master.

The master proxy needs to be stopped to avoid leaking its listeners, the
polling loop needs to be deinit, and the thread waker pipe need to be
closed too.

No backport needed.
2018-11-26 19:31:17 +01:00
Tim Duesterhus
742e0f9f1f BUG/MINOR: mworker: Do not attempt to close(2) fd -1
Valgrind reports:

==3389== Warning: invalid file descriptor -1 in syscall close()

Check for >= 0 before closing.

This bug was introduced in commit ce83b4a5dd
and is specific to 1.9. No backport needed.
2018-11-26 08:35:41 +01:00
Olivier Houchard
7fc3be76c7 MINOR: servers: Free [idle|safe|priv]_conns on exit.
Don't forget to free idle_conns, safe_conns and priv_conns on exit.

This can be backported to 1.8.
2018-11-22 19:53:03 +01:00
Willy Tarreau
609aad9e73 REORG: time/activity: move activity measurements to activity.{c,h}
At the moment the situation with activity measurement is quite tricky
because the struct activity is defined in global.h and declared in
haproxy.c, with operations made in time.h and relying on freq_ctr
which are defined in freq_ctr.h which itself includes time.h. It's
barely possible to touch any of these files without breaking all the
circular dependency.

Let's move all this stuff to activity.{c,h} and be done with it. The
measurement of active and stolen time is now done in a dedicated
function called just after tv_before_poll() instead of mixing the two,
which used to be a lazy (but convenient) decision.

No code was changed, stuff was just moved around.
2018-11-22 11:48:41 +01:00
William Lallemand
0564d41333 BUG/MEDIUM: mworker: unregister the signals of main()
The signal_register_fct() does not remove the handlers assigned to a
signal, but add a new handler to a list.

We accidentality inherited the handlers of the main() function in the
master process which is a problem because they act on the proxies.

The side effect was to stop the MASTER proxy which handle the master CLI
on a SIGUSR1, and to display some debug info when doing a SIGHUP and a
SIGQUIT.
2018-11-22 11:42:51 +01:00
William Lallemand
220567ec34 MINOR: mworker: use ha_notice to announce a new worker
Displays the PID and the relative PID when we fork a new worker with
 ha_notice().
2018-11-21 19:02:23 +01:00
William Lallemand
944e619b64 MEDIUM: mworker: wait mode use standard init code path
The mworker waitpid mode (which is used when a reload failed to apply
the new configuration) was still using a specific initialisation path.
That's a problem since we use a polling loop in the master now, the
master proxy is not initialized and the master CLI is not activated.

This patch removes the initialisation code of the wait mode and
introduce the MODE_MWORKER_WAIT in order to use the same init path as
the MODE_MWORKER with some exceptions. It allows to use the master proxy
and the master CLI during the waitpid mode.
2018-11-21 17:05:30 +01:00
William Lallemand
16dd1b3ead MINOR: cli: show master information in 'show proc'
Displays the master information in show proc.
2018-11-20 04:43:54 +01:00
William Lallemand
e368330128 MINOR: cli: displays uptime in show proc
Displays the uptime of the workers in `show proc`
2018-11-20 04:43:54 +01:00
Joseph Herlant
07a0834635 MINOR: Fix an error message thrown when we run out of memory
Fixes a typo in an error message that can be seen by the end user when
the haproxy subsystem runs out of memory.
2018-11-18 22:23:15 +01:00
Joseph Herlant
0342090ed7 CLEANUP: Fix some typos in the haproxy subsystem
Fix some typos in the code comments of the haproxy subsystem.
2018-11-18 22:23:15 +01:00
William Lallemand
a719926cf8 MEDIUM: jobs: support unstoppable jobs for soft stop
This patch allows a process to properly quit when some jobs are still
active, this feature is handled by the unstoppable_jobs variable, which
must be atomically incremented.

During each new iteration of run_poll_loop() the break condition of the
loop is now (jobs - unstoppable_jobs) == 0.

The unique usage of this at the moment is to handle the socketpair CLI
of a the worker during the stopping of the process.  During the soft
stop, we could mark the CLI listener as an unstoppable job and still
handle new connections till every other jobs are stopped.
2018-11-16 17:05:40 +01:00
William Lallemand
2e8fad9c30 MINOR: mworker: only close std{in,out,err} in daemon mode
This allows to output messages when we are not in daemon mode which is
useful to use log stdout in master worker mode.
2018-11-13 16:21:15 +01:00
David Carlier
42d9e5ae68 BUILD/MEDIUM: threads/affinity: DragonFly build fix
DragonFlyBSD does not have a build on its own, it has always
used the FreeBSD's. To be able to support the cpu affinity,
it needs few more headers.
2018-11-12 19:16:00 +01:00
William Lallemand
e260e0df44 BUG/MEDIUM: cli: crash when trying to access a worker
When using the CLI proxy of the master and trying to access a worker
with the @ prefix, the worker just crash.

The commit 7216032 ("MEDIUM: mworker: leave when the master die")
reintroduced the old code of the pipe, which was not trying to access
the pointers before. The owner of the FD was modified to a different
value, this is a problem since we call listener_accept() in most cases
now from the mworker_accept_wrapper() and it casts the owner variable to
get the listener.

This patch fix the issue by setting back the previous owner of the FD.
2018-11-08 14:48:06 +01:00
William Lallemand
a90cacfd70 BUG/MEDIUM: mworker: does not abort() in mworker_pipe_register()
The process was aborting with nbthread > 1.

The mworker_pipe_register() could be called several time in multithread
mode, we don't want to abort() there.
2018-11-07 09:26:36 +01:00
William Lallemand
7216032e6f MEDIUM: mworker: leave when the master die
When the master die, the worker should exit too, this is achieved by
checking if the FD of the socketpair/pipe was closed between the master
and the worker.

In the former architecture of the master-worker, there was only a pipe
between the master and the workers, and it was easy to check an EOF on
the pipe FD to exit() the worker.

With the new architecture, we use a socketpair by process, and this
socketpair is also used to accept new connections with the
listener_accept() callback.

This accept callback can't handle the EOF and the exit of the process,
because it's very specific to the master worker. This is why we
transformed the mworker_pipe_handler() function in a wrapper which check
if there is an EOF and exit the process, and if not call
listener_accept() to achieve the accept.
2018-11-06 18:30:57 +01:00
William Lallemand
5d05db8ce1 MINOR: mworker: displays a message when a worker is forked
Displays the PID and the relative PID when we fork a new worker.
2018-11-06 18:30:50 +01:00
William Lallemand
91723745c6 MEDIUM: mworker: exit with the incriminated exit code
The former behavior was to exit() the master process with the latest
status code known, which was the one of the last process to exit.

The problem is that the master process was not exiting with the status
code which provoked the exit-on-failure.
2018-11-06 18:28:33 +01:00
William Lallemand
18e52a8834 MINOR: mworker: displays more information when leaving
When a worker is leaving, we display the relative PID and the result of
the strsignal() function if it was killed by a signal.
2018-11-06 18:28:33 +01:00
William Lallemand
550db6d188 MEDIUM: mworker: does not create the CLI proxy when no listener
Does not create the CLI proxy if no -S argument was specified. It
prevents a warning that says that the MASTER proxy does not have any
bind option.
2018-11-06 18:28:33 +01:00
Willy Tarreau
2d372c2aa1 MINOR: stats: report the number of currently connected peers
The active peers output indicates both the number of established peers
connections and the number of peers connection attempts. The new counter
"ConnectedPeers" also indicates the number of currently connected peers.
This helps detect that some peers cannot be reached for example. It's
worth mentioning that this value changes over time because unused peers
are often disconnected and reconnected. Most of the time it should be
equal to ActivePeers.
2018-11-05 17:15:21 +01:00
Willy Tarreau
199ad24661 MINOR: stats: report the number of active peers in "show info"
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
2018-11-05 17:15:21 +01:00
William Lallemand
309dc9adec MEDIUM: mworker: stop the master proxy in the workers
The master proxy which handles the CLI should not be used or shown in
the stats of the workers. This proxy is now disabled after the fork.
2018-10-28 14:03:31 +01:00
William Lallemand
e736115d3a MEDIUM: mworker: create CLI listeners from argv[]
This patch introduces mworker_cli_proxy_new_listener() which allows the
creation of new listeners for the CLI proxy.

Using this function it is possible to create new listeners from the
program arguments with -Sa <unix_socket>. It is allowed to create
multiple listeners with several -Sa.
2018-10-28 13:51:39 +01:00
William Lallemand
8a02257d88 MEDIUM: mworker: proxy for the master CLI
This patch implements a listen proxy within the master. It uses the
sockpair of all the workers as servers.

In the current state of the code, the proxy is only doing round robin on
the CLI of the workers. A CLI mode will be needed to know to which CLI
send the requests.
2018-10-28 13:51:39 +01:00
William Lallemand
1b66361f8d MEDIUM: mworker: move proc_list gen before proxies startup
We need to generate the process list before starting the proxies,
because it will be used to create a proxy in the master
2018-10-28 13:51:38 +01:00
William Lallemand
7e1299bb3a REORG: mworker: move struct mworker_proc to global.h
Move the definition of the mworker_proc structure in types/global.h.
2018-10-28 13:51:38 +01:00
William Lallemand
ce83b4a5dd MEDIUM: mworker: each worker socketpair is a CLI listener
The init code of the mworker_proc structs has been moved before the
init of the listeners.

Each socketpair is now connected to a CLI within the workers, which
allows the master to access their CLI.

The inherited flag of the worker side socketpair is removed so the
socket can be closed in the master.
2018-10-28 13:51:38 +01:00
William Lallemand
f1a62860c8 MINOR: mworker: number of reload in the life of a worker
This patch adds a field in the mworker_proc structure which contains how
much time the master reloaded during the life of a worker.
2018-10-28 13:51:38 +01:00
William Lallemand
dd319a5b1d BUG/MEDIUM: mworker: don't poll on LI_O_INHERITED listeners
The listeners with the LI_O_INHERITED flag were deleted but not unbound
which is a problem since we have a polling in the master.

This patch unbind every listeners which are not require for the master,
but does not close the FD of those that have a LI_O_INHERITED flag.
2018-10-12 19:30:18 +02:00