When threads are disabled, some variables such as tid and tid_bit are
still checked everywhere, the MAX_THREADS_MASK macro is ~0UL while
MAX_THREADS is 1, and the all_threads_mask variable is replaced with a
macro forced to zero. The compiler cannot optimize away all this code
involving checks on tid and tid_bit, and we end up in special cases
where all_threads_mask has to be specifically tested for being zero or
not. It is not even certain the code paths are always equivalent when
testing without threads and with nbthread 1.
Let's change this to make sure we always present a single thread when
threads are disabled, and have the relevant values declared as constants
so that the compiler can optimize all the tests away. Now we have
MAX_THREADS_MASK set to 1, all_threads_mask set to 1, tid set to zero
and tid_bit set to 1. Doing just this has removed 4 kB of code in the
no-thread case.
A few checks for all_threads_mask==0 have been removed since it never
happens anymore.
The "process" directive on "bind" lines supports process references and
thread references. No check is performed on the thread number validity,
so that if a listener is only bound to non-existent threads, the traffic
will never be processed. It easily happens when setting one bind line per
thread with an incorrect (or reduced) thread count. No warning appears
and some random connections are never served. It also happens when setting
thread references with threads support disabled at build time.
This patch makes use of the all_threads_mask variable to detect if some
referenced threads don't exist, to emit a warning and fix this.
This patch needs to be backported to 1.8, just like the previous one which
it depends on (MINOR: threads: move "nbthread" parsing to hathreads.c).
The purpose is to make sure that all variables which directly depend
on this nbthread argument are set at the right moment. For now only
all_threads_mask needs to be set. It used to be set while calling
thread_sync_init() which is called too late for certain checks. The
same function handles threads and non-threads, which removes the need
for some thread-specific knowledge from cfgparse.c.
Thierry discovered that the following config crashes haproxy while
parsing the config (it's probably the smallest crasher) :
defaults
stick-table type ip size 1M
And indeed it does because it looks for the current proxy's name which it
does not have as it's the default one. This affects all versions since 1.6.
This fix must be backported to all versions back to 1.6.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
This introduces a new directive for the `resolvers` section:
`parse-resolv-conf`. When present, it will attempt to add any
nameservers in `/etc/resolv.conf` to the list of nameservers
for the current `resolvers` section.
[Mailing list thread][1].
[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.
Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
For large farms where servers are regularly added or removed, picking
a random server from the pool can ensure faster load transitions than
when using round-robin and less traffic surges on the newly added
servers than when using leastconn.
This commit introduces "balance random". It internally uses a random as
the key to the consistent hashing mechanism, thus all features available
in consistent hashing such as weights and bounded load via hash-balance-
factor are usable. It is extremely convenient because one common concern
when using random is what happens when a server is hammered a bit too
much. Here that can trivially be avoided, like in the configuration below :
backend bk0
balance random
hash-balance-factor 110
server-template s 1-100 127.0.0.1:8000 check inter 1s
Note that while "balance random" internally relies on a hash algorithm,
it holds the same properties as round-robin and as such is compatible with
reusing an existing server connection with "option prefer-last-server".
Louis Chanouha reported an inappropriate warning when http-reuse is
present in a defaults section while a TCP proxy accidently inherits
it and finds a conflict with other options like the use of the PROXY
protocol. To fix this patch removes the http-reuse option for TCP
proxies.
This fix needs to be backported to 1.8, 1.7 and possibly 1.6.
With "log global" line, the global list of loggers are copied into the proxy's
struct. The list coming from the default section is also copied when a frontend
or a backend section is parsed. So it is possible to have duplicate entries in
the proxy's list. For instance, with this following config, all messages will be
logged twice:
global
log 127.0.0.1 local0 debug
daemon
defaults
mode http
log global
option httplog
frontend front-http
log global
bind *:8888
default_backend back-http
backend back-http
server www 127.0.0.1:8000
Now, the function parse_logsrv should be used to parse a "log" line. This
function will update the list of loggers passed in argument. It can release all
log servers when "no log" line was parsed (by the caller) or it can parse "log
global" or "log <address> ... " lines. It takes care of checking the caller
context (global or not) to prohibit "log global" usage in the global section.
>From the very first day of force-persist and ignore-persist features,
they only applied to backends, except that the documentation stated it
could also be applied to frontends.
In order to make it clear, the documentation is updated and the parser
will raise a warning if the keywords are used in a frontend section.
This patch should be backported up to the 1.5 branch.
This removes the end label from parse_process_number() which
is unused since 5ab51775e736511b7e54f42e080dcef76a284da9, which
first was released in haproxy 1.8.0.
Martin Brauer reported an unexpected warning when some parts of the
global stats are defined but not the listening address, like below :
global
#stats socket run/admin.sock mode 660 level admin
stats timeout 30s
Then haproxy complains :
[WARNING] 334/150131 (23086) : config : frontend 'GLOBAL' has no
'bind' directive. Please declare it as a backend if this was intended.
This is because of the check for a bind-less frontend (the global section
creates a frontend for the stats). There's no clean fix for this one, so
here we're simply checking that the frontend is not the global stats one
before emitting the warning.
This patch should be backported to all stable versions.
This one allows not to inflate some structures when threads are
disabled. Now struct global is 1.4 kB instead of 33 kB.
Should be backported to 1.8 for ease of backporting of upcoming
patches.
Commit 4cfede87a313456fcbce7a185312460b4e1d05b7 removed
`exit-on-failure` in favor of `no-exit-on-failure`, but failed
to update references to the former in user facing messages.
This should be backported to haproxy 1.8.
"monitor-uri" may rely on "monitor fail" rules, which are processed
very early, immediately after the HTTP request is parsed and before
any http rulesets. It's not reported by the config parser when this
ruleset is misplaces, causing some configurations not to work like
users would expect. Let's just add the warning for a misplaced rule.
This patch changes the behavior of the master during the exit of a
worker.
When a worker exits with an error code, for example in the case of a
segfault, all workers are now killed and the master leaves.
If you don't want this behavior you can use the option
"master-worker no-exit-on-failure".
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.
[wt: some of these are definitely fixes that are worth backporting]
Now, it is possible to bind CPU at the thread level instead of the process level
by defining a thread set in "cpu-map" directives. Thus, its format is now:
cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...
where <process-set> and <thread-set> must follow the format:
all | odd | even | number[-[number]]
Having a process range and a thread range in same time with the "auto:" prefix
is not supported. Only one range is supported, the other one must be a fixed
number. But it is allowed when there is no "auto:" prefix.
Because it is possible to define a mapping for a process and another for a
thread on this process, threads will be bound on the intersection of their
mapping and the one of the process on which they are attached. If the
intersection is null, no specific binding will be set for the threads.
It was a temporary directive used for development purpose. Now, CPU mapping for
at the thread level should be done using the cpu-map directive. This feature
will be added in a next commit.
Now, processa and CPU ranges can be partially defined. The higher bound can be
omitted. In such case, it is replaced by the corresponding maximum value, 32 or
64 depending on the machine's word size.
By extension, It is also true for the "bind-process" directive and "process"
parameter on a "bind" or a "stats socket" line.
The prefix "auto:" can be added before the process set to let HAProxy
automatically bind a process to a CPU by incrementing process and CPU sets. To
be valid, both sets must have the same size. No matter the declaration order of
the CPU sets, it will be bound from the lower to the higher bound.
Examples:
# all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
# and so on.
cpu-map auto:1-4 0-3
cpu-map auto:1-4 0-1 2-3
cpu-map auto:1-4 3 2 1 0
# bind each process to exaclty one CPU using all/odd/even keyword
cpu-map auto:all 0-63
cpu-map auto:even 0-31
cpu-map auto:odd 32-63
# invalid cpu-map because process and CPU sets have different sizes.
cpu-map auto:1-4 0 # invalid
cpu-map auto:1 0-3 # invalid
Now, this function returns a status code to indicate a success (0) or a failure
(1) and the error message in set in <err> parameter. And the result of the parsing
is set in <proc> parameter.
Now, you can define processes concerned by a cpu-map line using a range. For
instance, the following line binds the first 32 processes on CPUs 0 to 3:
cpu-map 1-32 0-3
HTTP/2 mandates the support of 16384 bytes frames by default, so we need
a large enough buffer to process them. Till now if tune.bufsize was too
small, H2 connections were simply rejected during their establishment,
making it quite hard to troubleshoot the issue.
Now we detect when HTTP/2 is enabled on an HTTP frontend and emit an
error if tune.bufsize is not large enough, with the appropriate
recommendation.
At the end of the master initialisation, a call to protocol_unbind_all()
was made, in order to close all the FDs.
Unfortunately, this function closes the inherited FDs (fd@), upon reload
the master wasn't able to reload a configuration with those FDs.
The create_listeners() function now store a flag to specify if the fd
was inherited or not.
Replace the protocol_unbind_all() by mworker_cleanlisteners() +
deinit_pollers()
By default, no affinity is set for threads. To bind threads on CPU, you must
define a "thread-map" in the global section. The format is the same than the
"cpu-map" parameter, with a small difference. The process number must be
defined, with the same format than cpu-map ("all", "even", "odd" or a number
between 1 and 31/63).
A thread will be bound on the intersection of its mapping and the one of the
process on which it is attached. If the intersection is null, no specific bind
will be set for the thread.
A lock is used to protect accesses to a peer structure.
A the lock is taken in the applet handler when the peer is identified
and released living the applet handler.
In the scheduling task for peers section, the lock is taken for every
listed peer and released at the end of the process task function.
The peer 'force shutdown' function was also re-worked.
A lock for LB parameters has been added inside the proxy structure and atomic
operations have been used to update server variables releated to lb.
The only significant change is about lb_map. Because the servers status are
updated in the sync-point, we can call recalc_server_map function synchronously
in map_set_server_status_up/down function.
2 global locks have been added to protect, respectively, the run queue and the
wait queue. And a process mask has been added on each task. Like for FDs, this
mask is used to know which threads are allowed to process a task.
For many tasks, all threads are granted. And this must be your first intension
when you create a new task, else you have a good reason to make a task sticky on
some threads. This is then the responsibility to the process callback to lock
what have to be locked in the task context.
Nevertheless, all tasks linked to a session must be sticky on the thread
creating the session. It is important that I/O handlers processing session FDs
and these tasks run on the same thread to avoid conflicts.
Email alerts relies on checks to send emails. The link between a mailers section
and a proxy was resolved during the configuration parsing, But initialization was
done when the first alert is triggered. This implied memory allocations and
tasks creations. With this patch, everything is now initialized during the
configuration parsing. So when an alert is triggered, only the memory required
by this alert is dynamically allocated.
Moreover, alerts processing had a flaw. The task handler used to process alerts
to be sent to the same mailer, process_email_alert, was designed to give back
the control to the scheduler when an alert was sent. So there was a delay
between the sending of 2 consecutives alerts (the min of
"proxy->timeout.connect" and "mailer->timeout.mail"). To fix this problem, now,
we try to process as much queued alerts as possible when the task is woken up.
This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the threads support integration. But quickly,
I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes in a uniq patch. Here are all changes made on the DNS.
First, the DNS initialization has been refactored. The DNS configuration parsing
remains untouched, in cfgparse.c. But all checks have been moved in a post-check
callback. In the function dns_finalize_config, for each resolvers, the
nameservers configuration is tested and the task used to manage DNS resolutions
is created. The links between the backend's servers and the resolvers are also
created at this step. Here no connection are kept alive. So there is no needs
anymore to reopen them after HAProxy fork. Connections used to send DNS queries
will be opened on demand.
Then, the way DNS requesters are linked to a DNS resolution has been
reworked. The resolution used by a requester is now referenced into the
dns_requester structure and the resolution pointers in server and dns_srvrq
structures have been removed. wait and curr list of requesters, for a DNS
resolution, have been replaced by a uniq list. And Finally, the way a requester
is removed from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.
srv_set_fqdn function has been simplified. Now, there is only 1 way to set the
server's FQDN, independently it is done by the CLI or when a SRV record is
resolved.
The static DNS resolutions pool has been replaced by a dynamoc pool. The part
has been modified by Baptiste Assmann.
The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected. Especially
hold.valid. The default frequency to wake up a resolvers is now configurable
using "timeout resolve" parameter.
Now, as documented, as long as invalid repsonses are received, we really wait
all name servers responses before retrying.
As far as possible, resources allocated during DNS configuration parsing are
releases when HAProxy is shutdown.
Beside all these changes, the code has been cleaned to ease code review and the
doc has been updated.
Allow to register a function which will be called after the
configuration file parsing, at the end of the check_config_validity().
It's useful fo checking dependencies between sections or for resolving
keywords, pointers or values.