At the moment the situation with activity measurement is quite tricky
because the struct activity is defined in global.h and declared in
haproxy.c, with operations made in time.h and relying on freq_ctr
which are defined in freq_ctr.h which itself includes time.h. It's
barely possible to touch any of these files without breaking all the
circular dependency.
Let's move all this stuff to activity.{c,h} and be done with it. The
measurement of active and stolen time is now done in a dedicated
function called just after tv_before_poll() instead of mixing the two,
which used to be a lazy (but convenient) decision.
No code was changed, stuff was just moved around.
The signal_register_fct() does not remove the handlers assigned to a
signal, but add a new handler to a list.
We accidentality inherited the handlers of the main() function in the
master process which is a problem because they act on the proxies.
The side effect was to stop the MASTER proxy which handle the master CLI
on a SIGUSR1, and to display some debug info when doing a SIGHUP and a
SIGQUIT.
The mworker waitpid mode (which is used when a reload failed to apply
the new configuration) was still using a specific initialisation path.
That's a problem since we use a polling loop in the master now, the
master proxy is not initialized and the master CLI is not activated.
This patch removes the initialisation code of the wait mode and
introduce the MODE_MWORKER_WAIT in order to use the same init path as
the MODE_MWORKER with some exceptions. It allows to use the master proxy
and the master CLI during the waitpid mode.
This patch allows a process to properly quit when some jobs are still
active, this feature is handled by the unstoppable_jobs variable, which
must be atomically incremented.
During each new iteration of run_poll_loop() the break condition of the
loop is now (jobs - unstoppable_jobs) == 0.
The unique usage of this at the moment is to handle the socketpair CLI
of a the worker during the stopping of the process. During the soft
stop, we could mark the CLI listener as an unstoppable job and still
handle new connections till every other jobs are stopped.
When using the CLI proxy of the master and trying to access a worker
with the @ prefix, the worker just crash.
The commit 7216032 ("MEDIUM: mworker: leave when the master die")
reintroduced the old code of the pipe, which was not trying to access
the pointers before. The owner of the FD was modified to a different
value, this is a problem since we call listener_accept() in most cases
now from the mworker_accept_wrapper() and it casts the owner variable to
get the listener.
This patch fix the issue by setting back the previous owner of the FD.
The process was aborting with nbthread > 1.
The mworker_pipe_register() could be called several time in multithread
mode, we don't want to abort() there.
When the master die, the worker should exit too, this is achieved by
checking if the FD of the socketpair/pipe was closed between the master
and the worker.
In the former architecture of the master-worker, there was only a pipe
between the master and the workers, and it was easy to check an EOF on
the pipe FD to exit() the worker.
With the new architecture, we use a socketpair by process, and this
socketpair is also used to accept new connections with the
listener_accept() callback.
This accept callback can't handle the EOF and the exit of the process,
because it's very specific to the master worker. This is why we
transformed the mworker_pipe_handler() function in a wrapper which check
if there is an EOF and exit the process, and if not call
listener_accept() to achieve the accept.
The former behavior was to exit() the master process with the latest
status code known, which was the one of the last process to exit.
The problem is that the master process was not exiting with the status
code which provoked the exit-on-failure.
The active peers output indicates both the number of established peers
connections and the number of peers connection attempts. The new counter
"ConnectedPeers" also indicates the number of currently connected peers.
This helps detect that some peers cannot be reached for example. It's
worth mentioning that this value changes over time because unused peers
are often disconnected and reconnected. Most of the time it should be
equal to ActivePeers.
Peers are the last type of activity which can maintain a job present, so
it's important to report that such an entity is still active to explain
why the job count may be higher than zero. Here by "ActivePeers" we report
peers sessions, which include both established connections and outgoing
connection attempts.
This patch introduces mworker_cli_proxy_new_listener() which allows the
creation of new listeners for the CLI proxy.
Using this function it is possible to create new listeners from the
program arguments with -Sa <unix_socket>. It is allowed to create
multiple listeners with several -Sa.
This patch implements a listen proxy within the master. It uses the
sockpair of all the workers as servers.
In the current state of the code, the proxy is only doing round robin on
the CLI of the workers. A CLI mode will be needed to know to which CLI
send the requests.
The init code of the mworker_proc structs has been moved before the
init of the listeners.
Each socketpair is now connected to a CLI within the workers, which
allows the master to access their CLI.
The inherited flag of the worker side socketpair is removed so the
socket can be closed in the master.
The listeners with the LI_O_INHERITED flag were deleted but not unbound
which is a problem since we have a polling in the master.
This patch unbind every listeners which are not require for the master,
but does not close the FD of those that have a LI_O_INHERITED flag.
This bug appeared only if nbthread > 1. Handling the pipe with the
master, multiple threads of the same worker could process the deinit().
In addition, deinit() was called while some other threads were still
performing some tasks.
This patch assign the handler of the pipe with master to only the first
thread and removes the call to deinit() before exiting with an error.
This patch should be backported in v1.8.
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
Cyril Bont reported that commit f9cc07c25b broke the build without
thread.
We don't need to initialise tid = 0 in mworker_loop, so we could
completely remove it.
These error codes and messages are agnostic to the version, even if
they are represented as HTTP/1.0 messages. Ultimately they will have
to be transformed into internal HTTP messages to be used everywhere.
The HTTP/1.1 100 Continue message was turned to an IST and the local
copy in the Lua code was removed.
We need to clean the FDs registered manually in the poller to avoid FD
leaking during a reload of the master.
This patch call the per thread deinit function which close the thread
waker pipe.
In order to communicate with the workers, the master pipe has been
replaced by a socketpair() per worker.
The goal is to use these sockets as stats sockets and be able to access
them from the master.
When reloading, the master serialize the information of the workers and
put them in a environment variable. Once the master has been reexecuted
it unserialize that information and it is capable of closing the FDs of
the leaving children.
The master now use a poll loop, which should be initialized even in wait
mode. We need to init some variables if we didn't success to load the
configuration file.
If haproxy failed to load its configuration, the process is reexecuted
and it did not init the poller. So we must not try to deinit the poller
before the exec().
With the new way of handling the signals in the master worker, we are
are not staying in a waitpid() loop. Which means that we need to catch the
SIGCHLD signals to call waitpid().
The problem is when the master is reloading, this signal is neither
registered nor blocked so we lost all signals between the restart and
the call to mworker_loop().
This patch blocks the SIGCHLD signals before the reloading and ensure
it's not unblocked before the master registered the SIGCHLD handler.
In order to reorganize the code of the master worker, the mworker_wait()
function which was the main function was split. This function was
handling a wait() loop, but it does not need it anymore since the code
will use the poll loop of haproxy instead.
The function was split in several functions:
- mworker_catch_sigterm() which is a signal handler for SIGTERM ans
SIGUSR1 that sends the signals to the workers
- mworker_catch_sigchld() which is the code handling the leaving of a
child
- mworker_catch_sighup which basically call the mworker_restart()
function
- mworker_loop() which is the function calling the main poll loop in the
master
Now we try to synchronously push updates as they come using the new rdv
point, so that the call to the server update function from the main poll
loop is not needed anymore.
It further reduces the apparent latency in the health checks as the response
time almost always appears as 0 ms, resulting in a slightly higher check rate
of ~1960 conn/s. Despite this, the CPU consumption has slightly dropped again
to ~32% for the same test.
The only trick is that the checks code is built with a bit of recursivity
because srv_update_status() calls server_recalc_eweight(), and the latter
needs to signal srv_update_status() in case of updates. Thus we added an
extra argument to this function to indicate whether or not it must
propagate updates (no if it comes from srv_update_status).
This partially reverts commit d8fd2af ("BUG/MEDIUM: threads: Use the sync
point to check active jobs and exit") which used to address an issue in
the way the sync point used to check for present threads, which was later
addressed by commit ddb6c16 ("BUG/MEDIUM: threads: Fix the exit condition
of the thread barrier"). Thus there is no need anymore to use the sync
point for exiting and we can completely remove this call in the main loop.
The current sync point causes some important stress when a high number
of threads is in use on a config with lots of checks, because it wakes
up all threads every time a server state changes.
A config like the following can easily saturate a 4-core machine reaching
only 750 checks per second out of the ~2000 configured :
global
nbthread 4
defaults
mode http
timeout connect 5s
timeout client 5s
timeout server 5s
frontend srv
bind :8001 process 1/1
redirect location / if { method OPTIONS } { rand(100) ge 50 }
stats uri /
backend chk
option httpchk
server-template srv 1-100 127.0.0.1:8001 check rise 1 fall 1 inter 50
The reason is that the random on the fake server causes the responses
to randomly match an HTTP check, and results in a lot of up/down events
that are broadcasted to all threads. It's worth noting that the CPU usage
already dropped by about 60% between 1.8 and 1.9 just due to the scheduler
updates, but the sync point remains expensive.
In addition, it's visible on the stats page that a lot of requests end up
with an L7TOUT status in ~60ms. With smaller timeouts, it's even L4TOUT
around 20-25ms.
By not using THREAD_WANT_SYNC() anymore and only calling the server updates
under thread_isolate(), we can avoid all these wakeups. The CPU usage on
the same config drops to around 44% on the same machine, with all checks
being delivered at ~1900 checks per second, and the stats page shows no
more timeouts, even at 10 ms check interval. The difference is mainly
caused by the fact that there's no more need to wait for a thread to wake
up from poll() before starting to process check results.