By default, no affinity is set for threads. To bind threads on CPU, you must
define a "thread-map" in the global section. The format is the same than the
"cpu-map" parameter, with a small difference. The process number must be
defined, with the same format than cpu-map ("all", "even", "odd" or a number
between 1 and 31/63).
A thread will be bound on the intersection of its mapping and the one of the
process on which it is attached. If the intersection is null, no specific bind
will be set for the thread.
A lock is used to protect accesses to a peer structure.
A the lock is taken in the applet handler when the peer is identified
and released living the applet handler.
In the scheduling task for peers section, the lock is taken for every
listed peer and released at the end of the process task function.
The peer 'force shutdown' function was also re-worked.
A lock for LB parameters has been added inside the proxy structure and atomic
operations have been used to update server variables releated to lb.
The only significant change is about lb_map. Because the servers status are
updated in the sync-point, we can call recalc_server_map function synchronously
in map_set_server_status_up/down function.
2 global locks have been added to protect, respectively, the run queue and the
wait queue. And a process mask has been added on each task. Like for FDs, this
mask is used to know which threads are allowed to process a task.
For many tasks, all threads are granted. And this must be your first intension
when you create a new task, else you have a good reason to make a task sticky on
some threads. This is then the responsibility to the process callback to lock
what have to be locked in the task context.
Nevertheless, all tasks linked to a session must be sticky on the thread
creating the session. It is important that I/O handlers processing session FDs
and these tasks run on the same thread to avoid conflicts.
Email alerts relies on checks to send emails. The link between a mailers section
and a proxy was resolved during the configuration parsing, But initialization was
done when the first alert is triggered. This implied memory allocations and
tasks creations. With this patch, everything is now initialized during the
configuration parsing. So when an alert is triggered, only the memory required
by this alert is dynamically allocated.
Moreover, alerts processing had a flaw. The task handler used to process alerts
to be sent to the same mailer, process_email_alert, was designed to give back
the control to the scheduler when an alert was sent. So there was a delay
between the sending of 2 consecutives alerts (the min of
"proxy->timeout.connect" and "mailer->timeout.mail"). To fix this problem, now,
we try to process as much queued alerts as possible when the task is woken up.
This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the threads support integration. But quickly,
I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes in a uniq patch. Here are all changes made on the DNS.
First, the DNS initialization has been refactored. The DNS configuration parsing
remains untouched, in cfgparse.c. But all checks have been moved in a post-check
callback. In the function dns_finalize_config, for each resolvers, the
nameservers configuration is tested and the task used to manage DNS resolutions
is created. The links between the backend's servers and the resolvers are also
created at this step. Here no connection are kept alive. So there is no needs
anymore to reopen them after HAProxy fork. Connections used to send DNS queries
will be opened on demand.
Then, the way DNS requesters are linked to a DNS resolution has been
reworked. The resolution used by a requester is now referenced into the
dns_requester structure and the resolution pointers in server and dns_srvrq
structures have been removed. wait and curr list of requesters, for a DNS
resolution, have been replaced by a uniq list. And Finally, the way a requester
is removed from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.
srv_set_fqdn function has been simplified. Now, there is only 1 way to set the
server's FQDN, independently it is done by the CLI or when a SRV record is
resolved.
The static DNS resolutions pool has been replaced by a dynamoc pool. The part
has been modified by Baptiste Assmann.
The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected. Especially
hold.valid. The default frequency to wake up a resolvers is now configurable
using "timeout resolve" parameter.
Now, as documented, as long as invalid repsonses are received, we really wait
all name servers responses before retrying.
As far as possible, resources allocated during DNS configuration parsing are
releases when HAProxy is shutdown.
Beside all these changes, the code has been cleaned to ease code review and the
doc has been updated.
Allow to register a function which will be called after the
configuration file parsing, at the end of the check_config_validity().
It's useful fo checking dependencies between sections or for resolving
keywords, pointers or values.
This commit implements a post section callback. This callback will be
used at the end of a section parsing.
Every call to cfg_register_section must be modified to use the new
prototype:
int cfg_register_section(char *section_name,
int (*section_parser)(const char *, int, char **, int),
int (*post_section_parser)());
This function is used to create a series of listeners for a specific
address and a port range. It automatically calls the matching protocol
handlers to add them to the relevant lists. This way cfgparse doesn't
need to manipulate listeners anymore. As an added bonus, the memory
allocation is checked.
cfgparse has no business directly calling each individual protocol's 'add'
function to create a listener. Now that they're all registered, better
perform a protocol lookup on the family and have a standard ->add method
for all of them.
It's a shame that cfgparse() has to make special cases of each protocol
just to cast the port to the target address family. Let's pass the port
in argument to the function. The unix listener simply ignores it.
During the configuration parsing, log buffers are reallocated when
global.max_syslog_len is updated. This can be done serveral time. So, instead of
doing it serveral time, we do it only once after the configuration parsing.
Now, we use init_log_buffers and deinit_log_buffers to, respectively, initialize
and deinitialize log buffers used for syslog messages.
These functions have been introduced to be used by threads, to deal with
thread-local log buffers.
Trash buffers are reallocated when "tune.bufsize" parameter is changed. Here, we
just move the realloc after the configuration parsing.
Given that the config parser doesn't rely on the trash size, it should be
harmless.
Now, we use init_trash_buffers and deinit_trash_buffers to, respectively,
initialize and deinitialize trash buffers (trash, trash_buf1 and trash_buf2).
These functions have been introduced to be used by threads, to deal with
thread-local trash buffers.
Historically listeners used to have a handler depending on the upper
layer. But now it's exclusively process_stream() and nothing uses it
anymore so it can safely be removed.
Since commit 9d8dbbc ("MINOR: dns: Maximum DNS udp payload set to 8192") it's
possible to specify a packet size, but passing too large a size or a negative
size is not detected and results in memset() being performed over a 2GB+ area
upon receipt of the first DNS response, causing runtime crashes.
We now check that the size is not smaller than the smallest packet which is
the DNS header size (12 bytes).
No backport is needed.
Following up DNS extension introduction, this patch aims at making the
computation of the maximum number of records in DNS response dynamic.
This computation is based on the announced payload size accepted by
HAProxy.
The "hold obsolete" timer is used to prevent HAProxy from moving a server to
an other IP or from considering the server as DOWN if the IP currently
affected to this server has not been seen for this period of time in DNS
responses.
That said, historically, HAProxy used to update servers as soon as the IP
has disappeared from the response. Current default timeout break this
historical behavior and may change HAProxy's behavior when people will
upgrade to 1.8.
This patch changes the default value to 0 to keep backward compatibility.
Edns extensions may be used to negotiate some settings between a DNS
client and a server.
For now we only use it to announce the maximum response payload size accpeted
by HAProxy.
This size can be set through a configuration parameter in the resolvers
section. If not set, it defaults to 512 bytes.
Make it so for each server, instead of specifying a hostname, one can use
a SRV label.
When doing so, haproxy will first resolve the SRV label, then use the
resulting hostnames, as well as port and weight (priority is ignored right
now), to each server using the SRV label.
It is resolved periodically, and any server disappearing from the SRV records
will be removed, and any server appearing will be added, assuming there're
free servers in haproxy.
As DNS servers may not return all IPs in one answer, we want to cache the
previous entries. Those entries are removed when considered obsolete, which
happens when the IP hasn't been returned by the DNS server for a time
defined in the "hold obsolete" parameter of the resolver section. The default
is 30s.
When several stick-tables were configured with several peers sections,
only a part of them could be synchronized: the ones attached to the last
parsed 'peers' section. This was due to the fact that, at least, the peer I/O handler
refered to the wrong peer section list, in fact always the same: the last one parsed.
The fact that the global peer section list was named "struct peers *peers"
lead to this issue. This variable name is dangerous ;).
So this patch renames global 'peers' variable to 'cfg_peers' to ensure that
no such wrong references are still in use, then all the functions wich used
old 'peers' variable have been modified to refer to the correct peer list.
Must be backported to 1.6 and 1.7.
The pool used to log the uri was created with a size of 0 because the
configuration and 'tune.http.logurilen' were parsed too earlier.
The fix consist to postpone the pool_create as it is done for
cookie captures.
Regression introduced with 'MINOR: log: Add logurilen tunable'
We cannot store more than 32K headers in the structure hdr_idx, because
internaly we use signed short integers. To avoid any bugs (due to an integers
overflow), a check has been added on tune.http.maxhdr to be sure to not set a
value greater than 32767 and lower than 1 (because this is a nonsense to set
this parameter to a value <= 0).
The documentation has been updated accordingly.
This patch can be backported in 1.7, 1.6 and 1.5.
This patch is a major upgrade of the internal run-time DNS resolver in
HAProxy and it brings the following 2 main changes:
1. DNS resolution task
Up to now, DNS resolution was triggered by the health check task.
From now, DNS resolution task is autonomous. It is started by HAProxy
right after the scheduler is available and it is woken either when a
network IO occurs for one of its nameserver or when a timeout is
matched.
From now, this means we can enable DNS resolution for a server without
enabling health checking.
2. Introduction of a dns_requester structure
Up to now, DNS resolution was purposely made for resolving server
hostnames.
The idea, is to ensure that any HAProxy internal object should be able
to trigger a DNS resolution. For this purpose, 2 things has to be done:
- clean up the DNS code from the server structure (this was already
quite clean actually) and clean up the server's callbacks from
manipulating too much DNS resolution
- create an agnostic structure which allows linking a DNS resolution
and a requester of any type (using obj_type enum)
3. Manage requesters through queues
Up to now, there was an uniq relationship between a resolution and it's
owner (aka the requester now). It's a shame, because in some cases,
multiple objects may share the same hostname and may benefit from a
resolution being performed by a third party.
This patch introduces the notion of queues, which are basically lists of
either currently running resolution or waiting ones.
The resolutions are now available as a pool, which belongs to the resolvers.
The pool has has a default size of 64 resolutions per resolvers and is
allocated at configuration parsing.
This patch introduces a some re-organisation around the DNS code in
HAProxy.
1. make the dns_* functions less dependent on 'struct server' and 'struct resolution'.
With this in mind, the following changes were performed:
- 'struct dns_options' has been removed from 'struct resolution' (well,
we might need it back at some point later, we'll see)
==> we'll use the 'struct dns_options' from the owner of the resolution
- dns_get_ip_from_response(): takes a 'struct dns_options' instead of
'struct resolution'
==> so the caller can pass its own dns options to get the most
appropriate IP from the response
- dns_process_resolve(): struct dns_option is deduced from new
resolution->requester_type parameter
2. add hostname_dn and hostname_dn_len into struct server
In order to avoid recomputing a server's hostname into its domain name
format (and use a trash buffer to store the result), it is safer to
compute it once at configuration parsing and to store it into the struct
server.
In the mean time, the struct resolution linked to the server doesn't
need anymore to store the hostname in domain name format. A simple
pointer to the server one will make the trick.
The function srv_alloc_dns_resolution() properly manages everything for
us: memory allocation, pointer updates, etc...
3. move resolvers pointer into struct server
This patch makes the pointer to struct dns_resolvers from struct
dns_resolution obsolete.
Purpose is to make the resolution as "neutral" as possible and since the
requester is already linked to the resolvers, then we don't need this
information anymore in the resolution itself.
The default len of request uri in log messages is 1024. In some use
cases, you need to keep the long trail of GET parameters. The only
way to increase this len is to recompile with DEFINE=-DREQURI_LEN=2048.
This commit introduces a tune.http.logurilen configuration directive,
allowing to tune this at runtime.
This option exits every workers when one of the current workers die.
It allows you to monitor the master process in order to relaunch
everything on a failure.
For example it can be used with systemd and Restart=on-failure in a spec
file.
This commit remove the -Ds systemd mode in HAProxy in order to replace
it by a more generic master worker system. It aims to replace entirely
the systemd wrapper in the near future.
The master worker mode implements a new way of managing HAProxy
processes. The master is in charge of parsing the configuration
file and is responsible for spawning child processes.
The master worker mode can be invoked by using the -W flag. It can be
used either in background mode (-D) or foreground mode. When used in
background mode, the master will fork to daemonize.
In master worker background mode, chroot, setuid and setgid are done in
each child rather than in the master process, because the master process
will still need access to filesystem to reload the configuration.
When HAProxy is running with multiple processes and some listeners
arebound to processes, the unused sockets were not closed in the other
processes. The aim was to be able to send those listening sockets using
the -x option.
However to ensure the previous behavior which was to close those
sockets, we provided the "no-unused-socket" global option.
This patch changes this behavior, it will close unused sockets which are
not in the same process as an expose-fd socket, making the
"no-unused-socket" option useless.
The "no-unused-socket" option was removed in this patch.
This patch adds a new stats socket command to modify server
FQDNs at run time.
Its syntax:
set server <backend>/<server> fqdn <FQDN>
This patch also adds FQDNs to server state file at the end
of each line for backward compatibility ("-" if not present).
This patch makes backend sections support 'server-template' new keyword.
Such 'server-template' objects are parsed similarly to a 'server' object
by parse_server() function, but its first arguments are as follows:
server-template <ID prefix> <nb | range> <ip | fqdn>:<port> ...
The remaining arguments are the same as for 'server' lines.
With such server template declarations, servers may be allocated with IDs
built from <ID prefix> and <nb | range> arguments.
For instance declaring:
server-template foo 1-5 google.com:80 ...
or
server-template foo 5 google.com:80 ...
would be equivalent to declare:
server foo1 google.com:80 ...
server foo2 google.com:80 ...
server foo3 google.com:80 ...
server foo4 google.com:80 ...
server foo5 google.com:80 ...
When running with multiple process, if some proxies are just assigned
to some processes, the other processes will just close the file descriptors
for the listening sockets. However, we may still have to provide those
sockets when reloading, so instead we just try hard to pretend those proxies
are dead, while keeping the sockets opened.
A new global option, no-reused-socket", has been added, to restore the old
behavior of closing the sockets not bound to this process.
The error doesn't prevent checking for other errors after an invalid
character was detected in an ACL name. Better quit ASAP to avoid risking
to emit garbled and confusing error messages if something else fails on
the same line.
This should be backported to 1.7, 1.6 and 1.5.
There is a silly case where a loop is not detected in tracked servers lists:
when a server tracks itself.
Ex:
server srv1 127.0.0.1:8000 track srv1
Well, this never happens and this does not prevent haproxy from working.
But with this next following configuration:
server srv1 127.0.0.1:8000 track srv2
server srv2 127.0.0.1:8000 track srv2
server srv3 127.0.0.1:8000 track srv2
the code in charge of detecting such loops never returns (without any error message).
haproxy becomes stuck in an infinite loop because of this statement found
in check_config_validity():
for (loop = srv->track; loop && loop != newsrv; loop = loop->track);
Again, such a configuration is never accidentally used I guess.
This latter example seems silly, but as several 'default-server' directives may be used
in the same proxy section, and as 'default-server' settings are not resetted each a
new 'default-server' line is created, it will match the following configuration, in the future,
when 'track' setting will be supported by 'default-server':
default-server track srv3
server srv1 127.0.0.1:8000
server srv2 127.0.0.1:8000
.
.
.
default-server check
server srv3 127.0.0.1:8000
(cherry picked from commit 6528fc93d3c065fdac841f24e55cfe9674a67414)
This adds a new "dynamic" keyword for the cookie option. If set, a cookie
will be generated for each server (assuming one isn't already provided on
the "server" line), from the IP of the server, the TCP port, and a secret
key provided. To provide the secret key, a new keyword as been added,
"dynamic-cookie-key", for backends.
Example :
backend bk_web
balance roundrobin
dynamic-cookie-key "bla"
cookie WEBSRV insert dynamic
server s1 127.0.0.1:80 check
server s2 192.168.56.1:80 check
This is a first step to be able to dynamically add and remove servers,
without modifying the configuration file, and still have all the load
balancers redirect the traffic to the right server.
Provide a way to generate session cookies, based on the IP address of the
server, the TCP port, and a secret key provided.
Surprizingly, http-request, http-response, block, redirect, and capture
rules did not cause a warning to be emitted when used in a TCP proxy, so
let's fix this.
This patch may be backported to older versions as it helps spotting
configuration issues.
Adrian Fitzpatrick reported that since commit f51658d ("MEDIUM: config:
relax use_backend check to make the condition optional"), typos like "of"
instead of "if" on use_backend rules are not properly detected. The reason
is that the parser only checks for "if" or "unless" otherwise it considers
there's no keyword, making the rule inconditional.
This patch fixes it. It may reveal some rare config bugs for some people,
but will not affect valid configurations.
This fix must be backported to 1.7, 1.6 and 1.5.