489 Commits

Author SHA1 Message Date
Olivier Houchard
5fa300da89 MINOR: init: make stdout unbuffered
printf is unusable for debugging without this, and printf() is not used
for anything else.
2018-02-05 14:15:20 +01:00
Willy Tarreau
a9786b6f04 MINOR: fd: pass the iocb and owner to fd_insert()
fd_insert() is currently called just after setting the owner and iocb,
but proceeding like this prevents the operation from being atomic and
requires a lock to protect the maxfd computation in another thread from
meeting an incompletely initialized FD and computing a wrong maxfd.
Fortunately for now all fdtab[].owner are set before calling fd_insert(),
and the first lock in fd_insert() enforces a memory barrier so the code
is safe.

This patch moves the initialization of the owner and iocb to fd_insert()
so that the function will be able to properly arrange its operations and
remain safe even when modified to become lockless. There's no other change
beyond the internal API.
2018-01-29 16:07:25 +01:00
Willy Tarreau
173d9951e2 MEDIUM: polling: start to move maxfd computation to the pollers
Since only select() and poll() still make use of maxfd, let's move
its computation right there in the pollers themselves, and only
during each fd update pass. The computation doesn't need a lock
anymore, only a few atomic ops. It will be accurate, be done much
less often and will not be required anymore in the FD's fast patch.

This provides a small performance increase of about 1% in connection
rate when using epoll since we get rid of this computation which was
performed under a lock.
2018-01-29 15:22:57 +01:00
Christopher Faulet
da18b9db7b MINOR: threads: Use __decl_hathreads instead of #ifdef/#endif
A #ifdef/#endif on USE_THREAD was added in the commit 0048dd04 ("MINOR: threads:
Fix build when we're not compiling with threads.") to conditionally define the
start_lock variable, because HA_SPINLOCK_T is only defined when HAProxy is
compiled with threads.

If fact, to do that, we should use the macro __decl_hathreads instead.

If commit 0048dd04 is backported in 1.8, this one can also be backported.
2018-01-25 17:52:57 +01:00
Olivier Houchard
0048dd04c9 MINOR: threads: Fix build when we're not compiling with threads.
Only declare the start_lock if threads are compiled in, otherwise
HA_SPINLOCK_T won't be defined.
This should be backported to 1.8 when/if
1605c7ae6154d8c2cfcf3b325872b1a7266c5bc2 is backported.
2018-01-24 21:41:29 +01:00
Willy Tarreau
46ec48bc1a BUG/MINOR: mworker: only write to pidfile if it exists
A missing test causes a write(-1, $PID) to appear in strace output when
in master-worker mode. This is totally harmless though.

This fix must be backported to 1.8.
2018-01-23 19:20:19 +01:00
Willy Tarreau
1605c7ae61 BUG/MEDIUM: threads/mworker: fix a race on startup
Marc Fournier reported an interesting case when using threads with the
master-worker mode : sometimes, a listener would have its FD closed
during startup. Sometimes it could even be health checks seeing this.

What happens is that after the threads are created, and the pollers
enabled on each threads, the master-worker pipe is registered, and at
the same time a close() is performed on the write side of this pipe
since the children must not use it.

But since this is replicated in every thread, what happens is that the
first thread closes the pipe, thus releases the FD, and the next thread
starting a listener in parallel gets this FD reassigned. Then another
thread closes the FD again, which this time corresponds to the listener.
It can also happen with the health check sockets if they're started
early enough.

This patch splits the mworker_pipe_register() function in two, so that
the close() of the write side of the FD is performed very early after the
fork() and long before threads are created (we don't need to delay it
anyway). Only the pipe registration is done in the threaded code since
it is important that the pollers are properly allocated for this.
The mworker_pipe_register() function now takes care of registering the
pipe only once, and this is guaranteed by a new surrounding lock.

The call to protocol_enable_all() looks fragile in theory since it
scans the list of proxies and their listeners, though in practice
all threads scan the same list and take the same locks for each
listener so it's not possible that any of them escapes the process
and finishes before all listeners are started. And the operation is
idempotent.

This fix must be backported to 1.8. Thanks to Marc for providing very
detailed traces clearly showing the problem.
2018-01-23 19:18:57 +01:00
Christopher Faulet
32467fef98 BUG/MEDIUM: threads/polling: Use fd_cache_mask instead of fd_cache_num
fd_cache_num is the number of FDs in the FD cache. It is a global variable. So
it is underoptimized because we may be lead to consider there are waiting FDs
for the current thread in the FD cache while in fact all FDs are assigned to the
other threads. So, in such cases, the polling loop will be evaluated many more
times than necessary.

Instead, we now check if the thread id is set in the bitfield fd_cache_mask.

[wt: it's not exactly a bug, rather a design limitation of the thread
 which was not addressed in time for the 1.8 release. It can appear more
 often than we initially predicted, when more threads are running than
 the number of assigned CPU cores, or when certain threads spend
 milliseconds computing crypto keys while other threads spin on
 epoll_wait(0)=0]

This patch should be backported to 1.8.
2018-01-23 15:39:51 +01:00
Willy Tarreau
d80cb4ee13 MINOR: global: add some global activity counters to help debugging
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.

To backport to 1.8 to help collect more detailed bug reports.
2018-01-23 15:38:33 +01:00
Willy Tarreau
421f02e738 MINOR: threads: add a MAX_THREADS define instead of LONGBITS
This one allows not to inflate some structures when threads are
disabled. Now struct global is 1.4 kB instead of 33 kB.

Should be backported to 1.8 for ease of backporting of upcoming
patches.
2018-01-23 15:28:20 +01:00
William Lallemand
29f690c945 BUG/MEDIUM: mworker: execvp failure depending on argv[0]
The copy_argv() function lacks a check on '-' to remove the -x, -sf and
-st parameters.

When reloading a master process with a path starting by /st, /sf, or
/x..  the copy_argv() function skipped argv[0] leading to an execvp()
without the binary.
2018-01-09 23:44:18 +01:00
William Lallemand
e134041910 MINOR: don't close stdio anymore
Closing the standard IO FDs (0,1,2) can be troublesome, especially in
the case of the master-worker.

Instead of closing those FDs, they are now pointing to /dev/null which
prevents sending debugging messages to the wrong FDs.

This patch could be backported in 1.8.
2017-12-29 16:33:41 +01:00
PiBa-NL
149a81a443 BUG/MEDIUM: mworker: don't close stdio several time
This patch makes sure that a frontend socket that gets created after
initialization won't be closed when the master gets re-executed.

When used in daemon mode, the master-worker is closing the FDs 0, 1, 2
after the fork of the children.

When the master was reloading, those FDs were assigned again during the
parsing of the configuration (probably for some listeners), and the
workers were closing them thinking it was the stdio.

This patch must be backported to 1.8.
2017-12-29 16:31:10 +01:00
Tim Duesterhus
d16f450c98 MINOR: mworker: Improve wording in void mworker_wait()
Replace "left" / "leaving" with "exit" / "exiting".

This should be backported to haproxy 1.8.
2017-12-07 19:21:25 +01:00
Emeric Brun
ece0c334bd BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
The number of async fd is computed considering the maxconn, the number
of sides using ssl and the number of engines using async mode.

This patch should be backported on haproxy 1.8
2017-12-06 14:17:41 +01:00
Willy Tarreau
473cf5d0cd BUG/MEDIUM: mworker: also close peers sockets in the master
There's a nasty case related to signaling all processes via SIGUSR1.
Since the master process still holds the peers sockets, the old process
trying to connect to the new one to teach it its tables has a risk to
connect to the master instead, which will not do anything, causing the
old process to hang instead of quitting.

This patch ensures we correctly close the peers in the master process
on startup, just like it is done for proxies. Ultimately we would rather
have a complete list of listeners to avoid such issues. But that's a bit
trickier as it would require using unbind_all() and avoiding side effects
the master could cause to other processes (like unlinking unix sockets).

To be backported to 1.8.
2017-12-06 11:14:08 +01:00
Olivier Houchard
829aa24459 MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.
As with the call to cpuset_setaffinity(), FreeBSD expects the argument to
pthread_setaffinity_np() to be a cpuset_t, not an unsigned long, so the call
was silently failing.

This should probably be backported to 1.8.
2017-12-02 14:23:12 +01:00
PiBa-NL
baf6ea4bd5 BUG/MINOR: mworker: detach from tty when in daemon mode
This allows a calling script to show the first startup output and
know when to stop reading from stdout so haproxy can daemonize.

To be backpored to 1.8.
2017-12-02 14:13:40 +01:00
PiBa-NL
4763ffdf04 BUG/MINOR: mworker: fix validity check for the pipe FDs
Check if master-worker pipe getenv succeeded, also allow pipe fd 0 as
valid. On FreeBSD in quiet mode the stdin/stdout/stderr are closed
which lets the mworker_pipe to use fd 0 and fd 1. Additionally exit()
upon failure to create or get the master-worker pipe.

This needs to be backported to 1.8.
2017-12-02 13:24:47 +01:00
David Carlier
6d5c841d24 BUILD/MINOR: haproxy : FreeBSD/cpu affinity needs pthread_np header
for pthread_*_np calls, pthread_np.h is needed under FreeBSD.
2017-11-29 14:30:38 +01:00
William Lallemand
4cfede87a3 MAJOR: mworker: exits the master on failure
This patch changes the behavior of the master during the exit of a
worker.

When a worker exits with an error code, for example in the case of a
segfault, all workers are now killed and the master leaves.

If you don't want this behavior you can use the option
"master-worker no-exit-on-failure".
2017-11-24 22:48:27 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Christopher Faulet
cb6a94510d MINOR: config: Add the threads support in cpu-map directive
Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:

  cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...

where <process-set> and <thread-set> must follow the format:

  all | odd | even | number[-[number]]

Having a process range and a thread range in same time with the "auto:" prefix
is not supported. Only one range is supported, the other one must be a fixed
number. But it is allowed when there is no "auto:" prefix.

Because it is possible to define a mapping for a process and another for a
thread on this process, threads will be bound on the intersection of their
mapping and the one of the process on which they are attached. If the
intersection is null, no specific binding will be set for the threads.
2017-11-24 15:38:50 +01:00
Willy Tarreau
1f89b1805b BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks
While using mmap() to allocate pools for debugging purposes, kill -USR1 caused
libc aborts in deinit() on two calls to free() on proxies' tasks and the global
listener task. The issue comes from the fact that we're using free() to release
a task instead of task_free(), so the task was allocated from a pool and released
using a different method.

This bug has been there since at least 1.5, so a backport is desirable to all
maintained versions.
2017-11-22 16:57:05 +01:00
Lukas Tribus
f46bf95d2b BUG/MINOR: systemd: ignore daemon mode
Since we switched to notify mode in the systemd unit file in commit
d6942c8, haproxy won't start if the daemon keyword is present in the
configuration.

This change makes sure that haproxy remains in foreground when using
systemd mode and adds a note in the documentation.
2017-11-21 21:21:35 +01:00
Tim Duesterhus
d6942c8297 MEDIUM: mworker: Add systemd Type=notify support
This patch adds support for `Type=notify` to the systemd unit.

Supporting `Type=notify` improves both starting as well as reloading
of the unit, because systemd will be let known when the action completed.

See this quote from `systemd.service(5)`:
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to complete.

By making systemd aware of a reload in progress it is able to wait until
the reload actually succeeded.

This patch introduces both a new `USE_SYSTEMD` build option which controls
including the sd-daemon library as well as a `-Ws` runtime option which
runs haproxy in master-worker mode with systemd support.

When haproxy is running in master-worker mode with systemd support it will
send status messages to systemd using `sd_notify(3)` in the following cases:

- The master process forked off the worker processes (READY=1)
- The master process entered the `mworker_reload()` function (RELOADING=1)
- The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1)

Change the unit file to specify `Type=notify` and replace master-worker
mode (`-W`) with master-worker mode with systemd support (`-Ws`).

Future evolutions of this feature could include making use of the `STATUS`
feature of `sd_notify()` to send information about the number of active
connections to systemd. This would require bidirectional communication
between the master and the workers and thus is left for future work.
2017-11-20 18:39:41 +01:00
Christopher Faulet
7163056dc5 MAJOR: polling: Use active_appels_mask instead of applets_active_queue
applets_active_queue is the active queue size. It is a global variable. So it is
underoptimized because we may be lead to consider there are active applets for a
thread while in fact all active applets are assigned to the otherthreads. So, in
such cases, the polling loop will be evaluated many more times than necessary.

Instead, we now check if the thread id is set in the bitfield active_applets_mask.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
Christopher Faulet
8a48f67526 MAJOR: polling: Use active_tasks_mask instead of tasks_run_queue
tasks_run_queue is the run queue size. It is a global variable. So it is
underoptimized because we may be lead to consider there are active tasks for a
thread while in fact all active tasks are assigned to the other threads. So, in
such cases, the polling loop will be evaluated many more times than necessary.

Instead, we now check if the thread id is set in the bitfield active_tasks_mask.

Another change has been made in process_runnable_tasks. Now, we always limit the
number of tasks processed to 200.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
Christopher Faulet
96d4483df7 BUG/MINOR: Allocate the log buffers before the proxies startup
Since the commit cd7879adc ("BUG/MEDIUM: threads: Run the poll loop on the main
thread too"), the log buffers are allocated after the proxies startup. So log
messages produced during this startup was ignored.

To fix the bug, we restore the initialization of these buffers before proxies
startup.

This is specific to threads, no backport is needed.
2017-11-16 11:19:46 +01:00
William Lallemand
75ea0a06b0 BUG/MEDIUM: mworker: does not close inherited FD
At the end of the master initialisation, a call to protocol_unbind_all()
was made, in order to close all the FDs.

Unfortunately, this function closes the inherited FDs (fd@), upon reload
the master wasn't able to reload a configuration with those FDs.

The create_listeners() function now store a flag to specify if the fd
was inherited or not.

Replace the protocol_unbind_all() by  mworker_cleanlisteners() +
deinit_pollers()
2017-11-15 19:53:33 +01:00
William Lallemand
fade49d8fb BUG/MEDIUM: mworker: does not deinit anymore
Does not use the deinit() function during a reload, it's dangerous and
might be subject to double free, segfault and hazardous behavior if
it's called twice in the case of a execvp fail.
2017-11-15 19:53:31 +01:00
William Lallemand
2f8b31c2c6 BUG/MEDIUM: mworker: wait again for signals when execvp fail
After execvp fails, the signals were ignored, preventing to try a reload
again. It is now fixed by reaching the top of the mworker_wait()
function once the execvp failed.
2017-11-15 19:52:06 +01:00
William Lallemand
722d4ca0dd MINOR: mworker: display an accurate error when the reexec fail
When the master worker fail the execvp, it returns the wrong error
"Cannot allocate memory".

We now display the accurate error corresponding to the errno value.
2017-11-15 19:52:06 +01:00
Tim Duesterhus
0436ab7841 BUG/MEDIUM: mworker: Fix re-exec when haproxy is started from PATH
If haproxy is started using the name of the binary only (i.e.
not using a relative or absolute path) the `execv` in
`mworker_reload` fails with `ENOENT`, because it does not
examine the `PATH`:

  [WARNING] 315/161139 (7) : Reexecuting Master process
  [WARNING] 315/161139 (7) : Cannot allocate memory
  [WARNING] 315/161139 (7) : Failed to reexecute the master processs [7]

The error messages are misleading, because the return value of
`execv` is not checked. This should be fixed in a separate commit.

Once this happened the master process ignores any further
signals sent by the administrator.

Replace `execv` with `execvp` to establish the expected
behaviour.

This bug was introduced in commit 73b85e75b3963086be889e1fb40a59e7ef2ad63b.
2017-11-14 15:11:24 +01:00
Willy Tarreau
387bd4f69f CLEANUP: global: introduce variable pid_bit to avoid shifts with relative_pid
At a number of places, bitmasks are used for process affinity and to map
listeners to processes. Every time 1UL<<(relative_pid-1) is used. Let's
create a "pid_bit" variable corresponding to this value to clean this up.
2017-11-10 19:08:14 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
William Lallemand
92159b2901 MINOR: mworker: do not store child pid anymore in the pidfile
The parent process supervises itself the children, we don't need to
store the children pids anymore in the pidfile in master-worker mode.
2017-11-06 11:19:53 +01:00
William Lallemand
deed780a22 MINOR: mworker: write parent pid in the pidfile
The first pid in the pidfile is now the parent, it's more convenient for
supervising the processus.

You can now reload haproxy in master-worker mode with convenient command
like: kill -USR2 $(head -1 /tmp/haproxy.pid)
2017-11-06 11:08:38 +01:00
William Lallemand
8029300df6 MINOR: mworker: allow pidfile in mworker + foreground
This patch allows the use of the pidfile in master-worker mode without
using the background option.
2017-11-06 11:08:38 +01:00
William Lallemand
cc113822a7 MINOR: add master-worker in the warning about nbproc 2017-11-06 11:08:38 +01:00
Olivier Houchard
f143b8040b BUILD: use MAXPATHLEN instead of NAME_MAX.
This fixes building on at least Solaris, where NAME_MAX doesn't exist.
2017-11-04 17:09:23 +01:00
Olivier Houchard
e2b40b9eab MINOR: connection: introduce conn_stream
This patch introduces a new struct conn_stream. It's the stream-side of
a multiplexed connection. A pool is created and destroyed on exit. For
now the conn_streams are not used at all.
2017-10-31 18:03:23 +01:00
Christopher Faulet
d7bddda151 BUG/MEDIUM: threads: Initialize the sync-point
The sync point must be initialized before starting threads. This line was lost
in one of merges preparing the threads support integration.
2017-10-31 18:03:06 +01:00
Christopher Faulet
cd7879adc2 BUG/MEDIUM: threads: Run the poll loop on the main thread too
There was a flaw in the way the threads was created. the main one was just used
to create all the others and just wait to exit. Now, it is used to run a poll
loop. So we only create nbthread-1 threads.

This also fixes a bug about the compression filter when there is only 1 thread
(nbthread == 1 or no threads support). The bug was in the way thread-local
resources was initialized. per-thread init/deinit callbacks were never called
for the main process. So, with nthread set to 1, some buffers remained
uninitialized.
2017-10-31 13:58:33 +01:00
Christopher Faulet
6251902e67 MINOR: threads: Add thread-map config parameter in the global section
By default, no affinity is set for threads. To bind threads on CPU, you must
define a "thread-map" in the global section. The format is the same than the
"cpu-map" parameter, with a small difference. The process number must be
defined, with the same format than cpu-map ("all", "even", "odd" or a number
between 1 and 31/63).

A thread will be bound on the intersection of its mapping and the one of the
process on which it is attached. If the intersection is null, no specific bind
will be set for the thread.
2017-10-31 13:58:33 +01:00
Christopher Faulet
5b51755aef MEDIUM: threads/lb: Make LB algorithms (lb_*.c) thread-safe
A lock for LB parameters has been added inside the proxy structure and atomic
operations have been used to update server variables releated to lb.

The only significant change is about lb_map. Because the servers status are
updated in the sync-point, we can call recalc_server_map function synchronously
in map_set_server_status_up/down function.
2017-10-31 13:58:31 +01:00
Christopher Faulet
5d42e099c5 MINOR: threads/server: Add a lock to deal with insert in updates_servers list
This list is used to save changes on the servers state. So when serveral threads
are used, it must be locked. The changes are then applied in the sync-point. To
do so, servers_update_status has be moved in the sync-point. So this is useless
to lock it at this step because the sync-point is a protected area by iteself.
2017-10-31 13:58:31 +01:00
Christopher Faulet
29f77e846b MEDIUM: threads/server: Add a lock per server and atomically update server vars
The server's lock is use, among other things, to lock acces to the active
connection list of a server.
2017-10-31 13:58:31 +01:00