Move every ha_alert calls in parsing functions into parse_server.
Parsing functions now support a pointer-to-string argument which will be
allocated with an error message if needed via memprintf.
parse_server has then the responsibility to display errors with ha_alert.
This is groundwork for dynamic server. No traces should be printed on
stderr as a response to a cli command. cli_err will replace ha_alert in
this case.
The huge parse_server function is splitted into two smaller ones.
* _srv_parse_init allocates a new server instance and parses the address
parameter
* _srv_parse_kw parse the current server keyword
This simplify a bit the parse_server function. Besides, it will be
useful for dynamic server creation.
Move server-keyword hardcoded in parse_server into the srv_kws list of
server.c. Now every server keywords is checked through srv_find_kw. This
has the effect to reduce the size of parse_server. As a side-effect,
common kw list can be reduced.
This change has been made to be able to quickly discard these keywords
in case of a dynamic server.
DNS hostname comparisons were fixed to be case-insensitive (see b17b88487
"BUG/MEDIUM: dns: Consider the fact that dns answers are
case-insensitive"). However 2 comparisons are still case-sensitive.
This patch must be backported as far as 1.8.
At startup, if a SRV resolution is set for a server, no DNS resolution is
created. We must wait the first SRV resolution to know if it must be
triggered. It is important to do so for two reasons.
First, during a "classical" startup, a server based on a SRV resolution has
no hostname. Thus the created DNS resolution is useless. Best waiting the
first SRV resolution. It is not really a bug at this stage, it is just
useless.
Second, in the same situation, if the server state is loaded from a file,
its hosname will be set a bit later. Thus, if there is no additionnal record
for this server, because there is already a DNS resolution, it inhibits any
new DNS resolution. But there is no hostname attached to the existing DNS
resolution. So no resolution is performed at all for this server.
To avoid any problem, it is fairly easier to handle this special case during
startup. But this means we must be prepared to have no "resolv_requester"
field for a server at runtime.
This patch must be backported as far as 2.2.
Another way to say it: "Safely unlink requester from a requester callbacks".
Requester callbacks must never try to unlink a requester from a resolution, for
the current requester or another one. First, these callback functions are called
in a loop on a request list, not necessarily safe. Thus unlink resolution at
this place, may be unsafe. And it is useless to try to make these loops safe
because, all this stuff is placed in a loop on a resolution list. Unlink a
requester may lead to release a resolution if it is the last requester.
However, the unkink is necessary because we cannot reset the server state
(hostname and IP) with some pending DNS resolution on it. So, to workaround
this issue, we introduce the "safe" unlink. It is only performed from a
requester callback. In this case, the unlink function never releases the
resolution, it only reset it if necessary. And when a resolution is found
with an empty requester list, it is released.
This patch depends on the following commits :
* MINOR: resolvers: Purge answer items when a SRV resolution triggers an error
* MINOR: resolvers: Use a function to remove answers attached to a resolution
* MINOR: resolvers: Directly call srvrq_update_srv_state() when possible
* MINOR: resolvers: Add function to change the srv status based on SRV resolution
All the series must be backported as far as 2.2. It fixes a regression
introduced by the commit b4badf720 ("BUG/MINOR: resolvers: new callback to
properly handle SRV record errors").
don't release resolution from requester cb
When the server status must be updated from the result of a SRV resolution,
we can directly call srvrq_update_srv_state(). It is simpler and this avoid
a test on the server DNS resolution.
This patch is mandatory for the next commit. It also rely on "MINOR:
resolvers: Directly call srvrq_update_srv_state() when possible".
srvrq_update_srv_status() update the server status based on result of SRV
resolution. For now, it is only used from snr_update_srv_status() when
appropriate.
When a SRV request trigger an error, if we decide to handle the error
because last_valid duration is expired, the answer list may be purged. All
items are considered as obsolete.
If no ADD item is found for a SRV item in a SRV response, a DNS resolution
is triggered. When it succeeds, we must be sure the SRV item is still
alive. Otherwise the DNS resolution must be ignored.
This patch depends on the commit "MINOR: resolvers: Move last_seen time of
an ADD into its corresponding SRV item". Both must be backported as far as
2.2.
When a server is set in RMAINT becaues of a SRV resolution failure, the
server DNS resolution, if any, must be unlink first. It is mandatory to
handle the change in the context of a SRV resolution.
This patch must be backported as far as 2.2.
When a DNS resolution error is detected, in snr_resolution_error_cb(), the
server address must be reset only if the server status has changed. It this
case, it means the server is set to RMAINT. Thus the server address may by
reset.
This patch fixes a bug introduced by commit d127ffa9f ("BUG/MEDIUM:
resolvers: Reset address for unresolved servers"). It must be backported as
far as 2.0.
When an error is received for a DNS resolution, for instance a NXDOMAIN
error, the server must be considered to have no address when its status is
updated, not the opposite.
Concretly, because this parameter is not used on error path in
snr_update_srv_status(), there is no impact.
This patch must be backported as far as 1.8.
This was introduced in previous commit 49c2b45c1 ("MINOR: cfgparse/server:
try to fix spelling mistakes on server lines"), the loop was changed but
the increment left. No backport is needed.
Let's apply the fuzzy match to server keywords so that we can avoid
dumping the huge list of supported keywords each time there is a spelling
mistake, and suggest proper spelling instead:
$ printf "listen f\nserver s 0 sendpx-v2\n" | ./haproxy -c -f /dev/stdin
[NOTICE] 070/095718 (24152) : haproxy version is 2.4-dev11-caa6e3-25
[NOTICE] 070/095718 (24152) : path to executable is ./haproxy
[ALERT] 070/095718 (24152) : parsing [/dev/stdin:2] : 'server s' unknown keyword 'sendpx-v2'; did you mean 'send-proxy-v2' maybe ?
[ALERT] 070/095718 (24152) : Error(s) found in configuration file : /dev/stdin
[ALERT] 070/095718 (24152) : Fatal errors found in configuration.
The default proxy was passed as a variable to all parsers instead of a
const, which is not without risk, especially when some timeout parsers used
to make some int pointers point to the default values for comparisons. We
want to be certain that none of these parsers will modify the defaults
sections by accident, so it's important to mark this proxy as const.
This patch touches all occurrences found (89).
The actconns list creates massive contention on low server counts because
it's in fact a list of streams using a server, all threads compete on the
list's head and it's still possible to see some watchdog panics on 48
threads under extreme contention with 47 threads trying to add and one
thread trying to delete.
Moving this list per thread is trivial because it's only used by
srv_shutdown_streams(), which simply required to iterate over the list.
The field was renamed to "streams" as it's really a list of streams
rather than a list of connections.
There are multiple per-thread lists in the listeners, which isn't the
most efficient in terms of cache, and doesn't easily allow to store all
the per-thread stuff.
Now we introduce an srv_per_thread structure which the servers will have an
array of, and place the idle/safe/avail conns tree heads into. Overall this
was a fairly mechanical change, and the array is now always initialized for
all servers since we'll put more stuff there. It's worth noting that the Lua
code still has to deal with its own deinit by itself despite being in a
global list, because its server is not dynamically allocated.
It's a real pain not to have access to the list of all registered servers,
because whenever there is a need to late adjust their configuration, only
those attached to regular proxies are seen, but not the peers, lua, logs
nor DNS.
What this patch does is that new_server() will automatically add the newly
created server to a global list, and it does so as well for the 1 or 2
statically allocated servers created for Lua. This way it will be possible
to iterate over all of them.
It's been too short for quite a while now and is now full. It's still
time to extend it to 32-bits since we have room for this without
wasting any space, so we now gained 16 new bits for future flags.
The values were not reassigned just in case there would be a few
hidden u16 or short somewhere in which these flags are placed (as
it used to be the case with stream->pending_events).
The patch is tagged MEDIUM because this required to update the task's
process() prototype to use an int instead of a short, that's quite a
bunch of places.
Refactoring performed with the following Coccinelle patch:
@@
char *s;
@@
(
- ist2(s, strlen(s))
+ ist(s)
|
- ist2(strdup(s), strlen(s))
+ ist(strdup(s))
)
Note that this replacement is safe even in the strdup() case, because `ist()`
will not call `strlen()` on a `NULL` pointer. Instead is inserts a length of
`0`, effectively resulting in `IST_NULL`.
This makes the code more readable and less prone to copy-paste errors.
In addition, it allows to place some __builtin_constant_p() predicates
to trigger a link-time error in case the compiler knows that the freed
area is constant. It will also produce compile-time error if trying to
free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, some slightly refined to better deal
with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, more or less a
handful of cases where the compiler optimized slightly differently
the temporary variable that holds the copy of the pointer.
A non-negligible amount of {free(str);str=NULL;str_len=0;} are still
present in the config part (mostly header names in proxies). These
ones should also be cleaned for the same reasons, and probably be
turned into ist strings.
These function names are unbearably long, they don't even fit into the
screen in "show profiling", let's trim the "_connections" to "_conns",
which happens to match the name of the lists there.
Some static functions are now exported and renamed to follow the same
pattern of other exported functions. Here is the list :
* update_server_fqdn: Renamed to srv_update_fqdn and exported
* update_server_check_addr_port: renamed to srv_update_check_addr_port and exported
* update_server_agent_addr_port: renamed to srv_update_agent_addr_port and exported
* update_server_addr: renamed to srv_update_addr
* update_server_addr_potr: renamed to srv_update_addr_port
* srv_prepare_for_resolution: exported
This change is mandatory to move all functions dealing with the server-state
files in a separate file.
This change is not huge but may have a visible impact for users. Now, if a
line of a server-state file is corrupted, the whole file is ignored. A
warning is emitted with the corrupted line number.
In fact, there is no way to recover from a corrupted line. A line is
considered as corrupted if it is too long (truncated line) or if it contains
the wrong number of arguments. In both cases, it means the file was forged
(or at least manually edited). It is safer to ignore it.
Note for now, memory allocation errors are not reported and the
corresponding line is silently ignored.
Now, srv_state_parse_and_store_line() function is used to parse and store a
line in a tree. It is used for global and local server-state files. This
significatly simplies the apply_server_state() function.
Just like for the global server-state file, the line of a local server-state
file are now stored in a tree. This way, the file is fully parsed before
loading the servers state. And with this change, global and local
server-state files are now handled the same way. This will be the
opportunity to factorize the code. It is also a good way to validate the
file before loading any server state.
The loop on the servers of a proxy to load the server states was moved in
the function srv_state_px_update(). This simplify a bit the
apply_server_state() function. It is aslo mandatory to simplify the loading
of local server-state file.
When a server for a given backend is found in the tree containing all lines
of the global server-state file, the node is removed from the tree. It is
useless to keep it longer. It is a small improvement, but it may also be
usefull to track the orphan lines (not used for now).
Parsed parameters are now stored in the tree of server-state lines. This
way, a line from the global server-state file is only parsed once. Before,
it was parsed a first time to store it in the tree and one more time to load
the server state. To do so, the server-state line object must be allocated
before parsing a line. This means its size must no longer depend on the
length of first parsed parameters (backend and server names). Thus the node
type was changed to use a hashed key instead of a string.
Now, we read a full line and expects to found an integer only on it. And if
the line is empty or truncated, an error is returned. If the version is not
valid, an error is also returned. This way, the first line is no longer
partially read.
There is no reason to use a global variable to store the lines of the global
server-state file. This tree is only used during the file parsing, as a line
cache. Now the eb-tree is declared as a local variable in the
apply_server_state() function.
The structure used to store a server-state line in an eb-tree has a too
generic name. Instead of state_line, the structure is renamed as
server_state_line.
<state_line.name_name> field is a node in an eb-tree. Thus, instead of
"name_name", we now use "node" to name this field. If is a more explicit
name and not too strange.
The apply_server_state() function is really hard to read. Thus it was
refactored to be more maintainable. First, an helper function is used to get
the server-state file path. Some useless variables were removed and most of
other variables were renamed to be more readable. The error messages are now
prefixed to know the context (global vs per-proxy). Finally, the loop on the
proxies list was simplified.
This patch may seem a bit huge, but the changes are not so important.
There is no reason to fill two parameter arrays in srv_state_parse_line()
function. Now, only one array is used. The 4th first entries are just
skipped when srv_update_state() is called.
The srv_state_parse_line() function was rewritten to be more strict. First
of all, it is possible to make the difference between an ignored line and an
malformed one. Then, only blank characters (spaces and tabs) are now allowed
as field separator. An error is reported for truncated lines or for lines
with an unexpected number of arguments regarding the provided version.
However, for now, errors are ignored by the caller, invalid lines are just
skipped.
If the DNS resolution failed for a server, its ip address must be
removed. Otherwise, the server is stopped but keeps its ip. This may be
confusing when the servers state are retrieved on the CLI and it may lead to
undefined behavior if HAproxy is configured to load its servers state from a
file.
This patch should be backported as far as 2.0.
When a SRV record expires, the ip/port assigned to the associated server are
now removed. Otherwise, the server is stopped but keeps its ip/port while
the server hostname is removed. It is confusing when the servers state are
retrieve on the CLI and may be a problem if saved in a server-state
file. Because the reload may fail because of this inconsistency.
Here is an example:
* Declare a server template in a backend, using the resolver <dns>
server-template test 2 _http._tcp.example.com resolvers dns check
* 2 SRV records are announced with the corresponding additional
records. Thus, 2 servers are filled. Here is the "show servers state"
output :
2 frt 1 test1 192.168.1.1 2 64 0 1 2 15 3 4 6 0 0 0 http1.example.com 8001 _http._tcp.example.com 0 0 - - 0
2 frt 2 test2 192.168.1.2 2 64 0 1 1 15 3 4 6 0 0 0 http2.example.com 8002 _http._tcp.example.com 0 0 - - 0
* Then, one additional record is removed (or a SRV record is removed, the
result is the same). Here is the new "show servers state" output :
2 frt 1 test1 192.168.1.1 2 64 0 1 38 15 3 4 6 0 0 0 http1.example.com 8001 _http._tcp.example.com 0 0 - - 0
2 frt 2 test2 192.168.1.2 0 96 0 1 19 15 3 0 14 0 0 0 - 8002 _http._tcp.example.com 0 0 - - 0
On reload, if a server-state file is used, this leads to undefined behaviors
depending on the configuration.
This patch should be backported as far as 2.0.
When a SRV record was created, it used to register the regular server name
resolution callbacks. That said, SRV records and regular server name
resolution don't work the same way, furthermore on error management.
This patch introduces a new call back to manage DNS errors related to
the SRV queries.
this fixes github issue #50.
Backport status: 2.3, 2.2, 2.1, 2.0
When a server-state line is parsed, a test is performed to be sure there is
enough but not too much fields. However the test is buggy. The bug was
introduced in the commit ea2cdf55e ("MEDIUM: server: Don't introduce a new
server-state file version").
No backport needed.
This revert the commit 63e6cba12 ("MEDIUM: server: add server-states version
2"), but keeping all recent features added to the server-sate file. Instead
of adding a 2nd version for the server-state file format to handle the 5 new
fields added during the 2.4 development, these fields are considered as
optionnal during the parsing. So it is possible to load a server-state file
from HAProxy 2.3. However, from 2.4, these new fields are always dumped in
the server-state file. But it should not be a problem to load it on the 2.3.
This patch seems a bit huge but the diff ignoring the space is much smaller.
The version 2 of the server-state file format is reserved for a real
refactoring to address all issues of the current format.
If a line of a server-state file has too many fields, the last one is not
cut on the first following space, as all other fileds. It contains all the
end of the line. It is not the expected behavior. So, now, we cut it on the
next following space, if any. The parsing loop was slighly rewritten.
Note that for now there is no error reported if the line is too long.
This patch may be backported at least as far as 2.1. On 2.0 and prior the
code is not the same. The line parsing is inlined in apply_server_state()
function.
Same static arrays of parameters are used to parse all server-state
lines. Thus it is important to reinit them to be sure to not get params from
the previous line, eventually from the previous loaded file.
This patch should be backported to all stable branches. However, in 2.0 and
prior, the parsing of server-state lines are inlined in apply_server_state()
function. Thus the patch will have to be adapted on these versions.
Remove ebmb_node entry from struct connection and create a dedicated
struct conn_hash_node. struct connection contains now only a pointer to
a conn_hash_node, allocated only for connections where target is of type
OBJ_TYPE_SERVER. This will reduce memory footprints for every
connections that does not need http-reuse such as frontend connections.
we previously forgot to add `agent-*` commands.
Take this opportunity to rewrite the help string in a simpler way for
readability (mainly removing simple quotes)
Signed-off-by: William Dauchy <wdauchy@gmail.com>
The remaining contention on the server lock solely comes from
sess_change_server() which takes the lock to add and remove a
stream from the server's actconn list. This is both expensive
and pointless since we have mt-lists, and this list is only
used by the CLI's "shutdown server sessions" command!
Let's migrate to an mt-list and remove the need for this costly
lock. By doing so, the request rate increased by ~1.8%.
The RMAINT admin state is dynamic and should be remove from the
srv_admin_state parameter when a server state is loaded from a server-state
file. Otherwise an erorr is reported, the server-state line is ignored and
the server state is not updated.
This patch should fix the issue #576. It must be backported as far as 1.8.
This patch splits current dns.c into two files:
The first dns.c contains code related to DNS message exchange over UDP
and in future other TCP. We try to remove depencies to resolving
to make it usable by other stuff as DNS load balancing.
The new resolvers.c inherit of the code specific to the actual
resolvers.
Note:
It was really difficult to obtain a clean diff dur to the amount
of moved code.
Note2:
Counters and stuff related to stats is not cleany separated because
currently counters for both layers are merged and hard to separate
for now.
Resolv callbacks are also updated to rely on counters and not on
nameservers.
"show stat domain dns" will now show the parent id (i.e. resolvers
section name).
This variable is now only used to point on the local server-state file. When
the server-state is global, it is unused. So, we now use "localfilepath"
instead. Thus, the "filepath" variable can safely be removed.
When a local server-state file is loaded, if its name is too long, the error
is not properly handled, resulting to a call to fopen() with the "filepath"
variable set to NULL. To fix the bug, when this error occurs, we jump to the
next proxy, via a "continue" statement. And we take case to set "filepath"
variable after the error handling to be sure.
This patch should fix the issue #1111. It must be backported as far as 1.6.
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.
This will only affect new code so no backport is needed.
init_default_instance() was still left in cfgparse.c which is not the
best place to pre-initialize a proxy. Let's place it in proxy.c just
after init_new_proxy(), take this opportunity for renaming it to
proxy_preset_defaults() and taking out init_new_proxy() from it, and
let's pass it the pointer to the default proxy to be initialized instead
of implicitly assuming defproxy. We'll soon be able to exploit this.
Only two call places had to be updated.
server health checks and agent parameters are written the same way as
others to be able to enahcne code reuse: basically we make use of
parsing and assignment at the same place. It makes it difficult for
error handling to know whether srv object was modified partially or not.
The problem was already present with SRV resolution though.
I was a bit puzzled about the approach to take to be honest, and I did
not wanted to go into a full refactor, so I assumed it was ok to simply
notify whether the line was failed or partially applied.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.
also alloc a specific chunk to avoid mixing with other called functions
using it
Signed-off-by: William Dauchy <wdauchy@gmail.com>
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier
this patch confirm how painful it is to maintain this functionality.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.
This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
with the host IP
Signed-off-by: William Dauchy <wdauchy@gmail.com>
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.
This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.
Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.
For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.
Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduce a
rwlock for the SSL server session cache.
In the SSL session part this patch only lock in read, even if it writes.
The reason behind this, is that in the session part, there is one cache
storage per thread so it is not a problem to write in the cache from
several threads. The problem is only when trying to write in the cache
from the CLI (which could be on any thread) when a session is trying to
access the cache. So there is a write lock in the CLI part to prevent
simultaneous access by a session and the CLI.
This patch also remove the thread_isolate attempt which is eating too
much CPU time and was not protecting from the use of a free ptr in the
session.
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr. Thus these two lines don't have the same behavior:
server ... addr <addr1> agent-addr <addr2>
server ... agent-addr <addr2> addr <addr1>
After this patch `agent-addr` will always be the priority option over
`addr`. It means we test the flag before setting agentaddr.
We also fix all the places where we did not set the flag to be coherent
everywhere.
I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.
I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file. In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.
in the same manner of checkport flag, we don't really need checkaddr
flag. We can assume if checkaddr is not set, it means it was never set
by the user or config.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
When the server state is loaded from a server-state file, there is no reason
to set an unconfigured check port with the server port. Because by default,
if the check port is not set, the server's one is used. Thus we can remove
this useless assignment. It is mandatory for next improvements.
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:
- one comment is stating check's address should be updated if it uses
the server one; however the condition checks if `SRV_F_CHECKADDR` is
set; this flag is set when a check address is set; result is that we
override the check address where I was not expecting it. In fact we
don't need to update anything here as server addr is used when check
addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
unset. This is harmless because we also use server port if check port
is unset. However it creates some incoherency before/after using this
command, as check port should stay unset througout the life of the
process unless it is is set by `set server check-port` command.
quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
The client_crt member is not used anymore since the server's ssl context
initialization now behaves the same way as the bind lines one (using
ckch stores and instances).
An fatal error is now reported if a server is defined in a frontend
section. til now, a warning was just emitted and the server was ignored. The
warning was added in the 1.3.4 when the frontend/backend keywords were
introduced to allow a smooth transition and to not break existing
configs. It is old enough now to emit an fatal error in this case.
This patch is related to the issue #1043. It may be backported at least as
far as 2.2, and possibly to older versions. It relies on the previous commit
("MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities").
If a server is configured to not have any idle conns, returns immediatly
from srv_cleanup_connections. This avoids a segfault when a server is
configured with pool-max-conn to 0.
This should be backported up to 2.2.
Do not proceed on init_addr if the backend of the server is marked as
disabled. When marked as disabled, the server is not fully initialized
and some operation must be avoided to prevent segfault. It is correct
because there is no way to activate a disabled backend.
This fixes the github issue #1031.
This should be backported to 2.2.
GitHub Issue #1026 reported a crash during configuration check for the
following example config:
backend 0
server 0 0
server 0 0
HAProxy crashed in srv_set_addr_desc() due to a NULL pointer dereference
caused by `sa2str` returning NULL for an `AF_UNSPEC` address (`0`).
Check to make sure the address key is non-null before using it for
comparison or inserting it into the tree.
The crash was introduced in commit 92149f9a8 ("MEDIUM: stick-tables: Add
srvkey option to stick-table") which not in any released version so no
backport is needed.
Cc: Tim Duesterhus <tim@bastelstu.be>
This allows using the address of the server rather than the name of the
server for keeping track of servers in a backend for stickiness.
The peers code was also extended to support feeding the dictionary using
this key instead of the name.
Fixes#814
This patch adds QUIC structs to server struct so that to make the QUIC code
compile. Also initializes the ebtree to store the connections by connection
IDs.
in the context of a progressive backend migration, we want to be able to
activate SSL on outgoing connections to the server at runtime without
reloading.
This patch adds a `set server ssl` command; in order to allow that:
- add `srv_use_ssl` to `show servers state` command for compatibility,
also update associated parsing
- when using default-server ssl setting, and `no-ssl` on server line,
init SSL ctx without activating it
- when triggering ssl API, de/activate SSL connections as requested
- clean ongoing connections as it is done for addr/port changes, without
checking prior server state
example config:
backend be_foo
default-server ssl
server srv0 127.0.0.1:6011 weight 1 no-ssl
show servers state:
5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1
where srv0 can switch to ssl later during the runtime:
set server be_foo/srv0 ssl on
5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1
Also update existing tests and create a new one.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
Define a per-thread counters allocated with the greatest size of any
stat module counters. This variable is named trash_counters.
When using a proxy without allocated counters, return the trash counters
from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent
segfault.
This is useful for all the proxies used internally and not
belonging to the global proxy list. As these objects does not appears on
the stat report, it does not matter to use the dummy counters.
For this fix to be functional, the extra counters are explicitly
initialized to NULL on proxy/server/listener init functions.
Most notably, the crash has already been detected with the following
vtc:
- reg-tests/lua/txn_get_priv.vtc
- reg-tests/peers/tls_basic_sync.vtc
- reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc
There is probably other parts that may be impacted (SPOE for example).
This bug was introduced in the current release and do not need to be
backported. The faulty commits are
"MINOR: ssl: count client hello for stats" and
"MINOR: ssl: add counters for ssl sessions".
This function used to grab the idle lock when scanning the threads for
idle connections, but it doesn't need it since the lock only protects
the tree. Let's remove it.
In issue #933, @jaroslawr provided a report indicating that when using
many threads and many servers, it's very difficult to terminate the last
idle connections on each server. The issue has two causes in fact. The
first one is that during the calculation of the estimate of needed
connections, we round the computation up while in previous round it was
already rounded up, so we end up adding 1 to 1 which once divided by 2
remains 1. The second issue is that servers are not woken up anymore for
purging their connections if they don't have activity. The only reason
that was there to wake them up again was in case insufficient connections
were purged. And even then the purge task itself was not woken up. But
that is not enough for getting rid of the long tail of old connections
nor updating est_need_conns.
This patch makes sure to properly wake up as long as at least one idle
connection remains, and not to round up the needed connections anymore.
Prior to this patch, a test involving many connections which suddenly
stopped would keep many idle connections, now they're effectively halved
every pool-purge-delay.
This needs to be backported to 2.2.
When servers based on server templates are initialized, the configuration file
and line are now copied. This helps to emit understandable warning and alert
messages.
This patch may be backported if needed, as far as 1.8.
On startup, if a server has no address but the dns resolutions are configured,
"none" method is added to the default init-addr methods, in addition to "last"
and "libc". Thus on startup, this server is set to RMAINT mode if no address is
found. It is only performed if no other init-addr method is configured.
Setting the RMAINT mode on startup is important to inhibit the health checks.
For instance, following servers will now be set to RMAINT mode on startup :
server srv nofound.tld:80 check resolvers mydns
server srv _http._tcp.service.local check resolvers mydns
server-template srv 1-3 _http._tcp.service.local check resolvers mydns
while followings ones will trigger an error :
server srv nofound.tld:80 check
server srv nofound.tld:80 check resolvers mydns init-addr libc
server srv _http._tcp.service.local check
server srv _http._tcp.service.local check resolvers mydns init-addr libc
server-template srv 1-3 _http._tcp.service.local check resolvers mydns init-addr libc
This patch must be backported as far as 1.8.
Adjust condition used to report down_time for statistics. There was a
tiny probabilty to have a negative downtime if last_change was superior
to now. If this is the case, return only down_time.
This bug can backported up to 1.8.
When a server is up after a failure, its downtime was reset to 0 on the
statistics. This is due to a wrong condition that causes srv.down_time
to never be set. Fix this by updating down_time each time the server is in
STARTING state.
Fixes the github issue #920.
This bug can be backported up to 1.8.
No need to use an exclusive lock on the proxy anymore when reading its
setting, a read lock is enough. A few other places continue to use a
write-lock when modifying simple flags only in order to let this
function see a consistent value all along. This might be changed in
the future using barriers and local copies.
This is an anticipation of finer grained locking for the queues. For now
all lock places take a write lock so that there is no difference at all
with previous code.
If the slowstart value in a state file implies the latest state change
is within the slowstart period, we end up calling srv_update_status()
to reschedule the server's state change but its task is not yet
allocated and remains null, causing a crash on startup.
Make sure srv_update_status() supports being called with partially
initialized servers which do not yet have a task. If the task has to
be scheduled, it will necessarily happen after initialization since
it will result from a state change.
This should be backported wherever server-state is present.
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).
Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
Use the new stats module API to integrate the dns counters in the
standard stats. This is done in order to avoid code duplication, keep
the code related to cli out of dns and use the full possibility of the
stats function, allowing to print dns stats in csv or json format.
Most callers of str2sa_range() need the protocol only to check that it
provides a ->connect() method. It used to be used to verify that it's a
stream protocol, but it might be a bit early to get rid of it. Let's keep
the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT
is present. This way almost all call places could be cleaned from this.
There's a strange test in the server address parsing code that rechecks
the family from the socket which seems to be a duplicate of the previously
removed tests. It will have to be rechecked.
We'll need this so that it can return pointers to stacked protocol in
the future (for QUIC). In addition this removes a lot of tests for
protocol validity in the callers.
Some of them were checked further apart, or after a call to
str2listener() and they were simplified as well.
There's still a trick, we can fail to return a protocol in case the caller
accepts an fqdn for use later. This is what servers do and in this case it
is valid to return no protocol. A typical example is:
server foo localhost:1111
If a file descriptor was passed, we can optionally return it. This will
be useful for listening sockets which are both a pre-bound FD and a ready
socket.
These flags indicate whether the call is made to fill a bind or a server
line, or even just send/recv calls (like logs or dns). Some special cases
are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external
listeners), and there's a distinction between stream or dgram usage that's
expected to significantly help str2sa_range() proceed appropriately with
the input information. For now they are not used yet.
Now that str2sa_range() checks for appropriate port specification, we
don't need to implement adhoc test cases in every call place, if the
result is valid, the conditions are met otherwise the error message is
appropriately filled.
These flags indicate what is expected regarding port specifications. Some
callers accept none, some need fixed ports, some have it mandatory, some
support ranges, and some take an offset. Each possibilty is reflected by
an option. For now they are not exploited, but the goal is to instrument
str2sa_range() to properly parse that.
We currently have an argument to require that the address is resolved
but we'll soon add more, so let's turn it into a bit field. The old
"resolve" boolean is now PA_O_RESOLVE.
The socks4 keyword parser was a bit too much copy-pasted, it only checks
for a null port and reports "invalid range". Let's properly check for the
1-65535 range and report the correct error.
It may be backported everywhere "socks4" is present (2.0).
When the server address is set for the first time, the log message is a bit ugly
because there is no old ip address to report. Thus in the log, we can see :
PX/SRV changed its IP from to A.B.C.D by DNS additional record.
Now, when this happens, "(none)" is reported :
PX/SRV changed its IP from (none) to A.B.C.D by DNS additional record.
This patch may be backported to 2.2.
When a SRV record for an already known server is processed, only the weight is
updated, if not configured to be ignored. It is a problem if the IP address
carried by the associated additional record changes. Because the server IP
address is never renewed.
To fix this bug, If there is an addition record attached to a SRV record, we
always try to set the IP address. If it is the same, no change is
performed. This way, IP changes are always handled.
This patch should fix the issue #841. It must be backported to 2.2.
A regression was introduced by 13a9232ebc
when I added support for Additional section of the SRV responses..
Basically, when a server is managed through SRV records additional
section and it's disabled (because its associated Additional record has
disappeared), it never leaves its MAINT state and so never comes back to
production.
This patch updates the "snr_update_srv_status()" function to clear the
MAINT status when the server now has an IP address and also ensure this
function is called when parsing Additional records (and associating them
to new servers).
This can cause severe outage for people using HAProxy + consul (or any
other service registry) through DNS service discovery).
This should fix issue #793.
This should be backported to 2.2.
Since commit 13a9232eb ("MEDIUM: dns: use Additional records from SRV
responses"), a struct server can have a NULL dns_requester->resolution,
when SRV records are used and DNS answers contain an Additional section.
This is a problem when we call snr_update_srv_status() because it does
not check that resolution is NULL, and dereferences it. This patch
simply adds a test for resolution being NULL. When that happens, it means
we are using SRV records with Additional records, and an entry was removed.
This should fix issue #775.
This should be backported to 2.2.
Reported github issue #759 shows there is no name resolving
on server lines for ring and peers sections.
This patch introduce the resolving for those lines.
This patch adds boolean a parameter to parse_server function to specify
if we want the function to perform an initial name resolving using libc.
This boolean is forced to true in case of peers or ring section.
The boolean is kept to false in case of classic servers (from
backend/listen)
This patch should be backported in branches where peers sections
support 'server' lines.
Previous fix dc6e8a9a7 ("BUG/MEDIUM: server: resolve state file handle
leak on reload") traded a bug for another one, now we get this warning
when building server.c, which is valid since f is not necessarily
initialized (e.g. if no global state file is passed):
src/server.c: In function 'apply_server_state':
src/server.c:3272:3: warning: 'f' may be used uninitialized in this function [-Wmaybe-uninitialized]
fclose(f);
^~~~~~~~~
Let's initialize it first. This whole code block should really be
splitted, cleaned up and reorganized as it's possible that other
similar bugs are hidden in it.
This must be backported to the same branches the commit above is
backported to (likely 2.2 and 2.1).
During reload, server state file is read and file handle is not released
this was indepently reported in #738 and #660.
partially resolves#660. This should be backported to 2.2 and 2.1.
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.
But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:
MT_LIST_ADD{,Q} : inconditional operation, the caller owns the
element, and doesn't care about the element's
current state (exactly like LIST_ADD)
MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
already added or in the process of being added.
This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.
Note that the missing unchecked MT_LIST_ADD macro was added.
The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
In srv_cleanup_idle_connections(), we compute how many idle connections
are in excess compared to the average need. But we may actually be missing
some, for example if a certain number were recently closed and the average
of used connections didn't change much since previous period. In this
case exceed_conn can become negative. There was no special case for this
in the code, and calculating the per-thread share of connections to kill
based on this value resulted in special value -1 to be passed to
srv_migrate_conns_to_remove(), which for this function means "kill all of
them", as used in srv_cleanup_connections() for example.
This causes large variations of idle connections counts on servers and
CPU spikes at the moment the cleanup task passes. These were quite more
visible with SSL as it costs CPU to close and re-establish these
connections, and it also takes time, reducing the reuse ratio, hence
increasing the amount of connections during reconnection.
In this patch we simply skip the killing loop when this condition is met.
No backport is needed, this is purely 2.2.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.
This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.
In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.
In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.
We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.
A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:
haproxy 2.1.7: 185000 requests per second
haproxy 2.2: 314000 requests per second
haproxy 2.2 lowconn 32: 352000 requests per second
The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.
This patch tries to improve the situation this way:
- it keeps an estimate of the number of connections needed for a server.
This estimate is a copy of the max over previous purge period, or is a
max of what is seen over current period; it differs from max_used_conns
in that this one is a counter that's reset on each purge period ;
- when releasing, if the number of current idle+used connections is
lower than this last estimate, then we'll keep the connection;
- when releasing, if the current thread's idle conns head is empty,
and we don't exceed the estimate by the number of threads, then
we'll keep the connection.
- when cleaning up connections, we consider the max of the last two
periods to avoid killing too many idle conns when facing bursty
traffic.
Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).
On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
There's a minor glitch with the way idle connections start to be evicted.
The lookup always goes from thread 0 to thread N-1. This causes depletion
of connections on the first threads and abundance on the last ones. This
is visible with the takeover() stats below:
$ socat - /tmp/sock1 <<< "show activity"|grep ^fd ; \
sleep 10 ; \
socat -/tmp/sock1 <<< "show activity"|grep ^fd
fd_takeover: 300144 [ 91887 84029 66254 57974 ]
fd_takeover: 359631 [ 111369 99699 79145 69418 ]
There are respectively 19k, 15k, 13k and 11k takeovers for only 4 threads,
indicating that the first thread needs a foreign FD twice more often than
the 4th one.
This patch changes this si that all threads are scanned in round robin
starting with the current one. The takeovers now happen in a much more
distributed way (about 4 times 9k) :
fd_takeover: 1420081 [ 359562 359453 346586 354480 ]
fd_takeover: 1457044 [ 368779 368429 355990 363846 ]
There is no need to backport this, as this happened along a few patches
that were merged during 2.2 development.
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.
This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.
The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
Apparently Cygwin requires sys/types.h before netinet/tcp.h but doesn't
include it by itself, as shown here:
https://github.com/haproxy/haproxy/actions/runs/131943890
This patch makes sure it's always present, which is in server.c and
the SPOA example.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().
This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
Checks.c remains one of the largest file of the project and it contains
too many things. The tcpchecks code represents half of this file, and
both parts are relatively isolated, so let's move it away into its own
file. We now have tcpcheck.c, tcpcheck{,-t}.h.
Doing so required to export quite a number of functions because check.c
has almost everything made static, which really doesn't help to split!
check.c is one of the largest file and contains too many things. The
e-mail alerting code is stored there while nothing is in mailers.c.
Let's move this code out. That's only 4% of the code but a good start.
In order to do so, a few tcp-check functions had to be exported.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
The type file was slightly tidied. The cli-specific APPCTX_CLI_ST1_* flag
definitions were moved to cli.h. The type file was adjusted to include
buf-t.h and not the huge buf.h. A few call places were fixed because they
did not need this include.
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).
The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.
The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).
A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.
No further cleanup was done yet in order to keep changes minimal.
log-proto <logproto>
The "log-proto" specifies the protocol used to forward event messages to
a server configured in a ring section. Possible values are "legacy"
and "octet-count" corresponding respectively to "Non-transparent-framing"
and "Octet counting" in rfc6587. "legacy" is the default.
Notes: a separated io_handler was created to avoid per messages test
and to prepare code to set different log protocols such as
request- response based ones.
Helper functions are used to dump bind, server or filter keywords. These
functions are used to report errors during the configuration parsing. To have a
coherent API, these functions are now prepared to handle a null pointer as
argument. If so, no action is performed and functions immediately return.
This patch should fix the issue #631. It is not a bug. There is no reason to
backport it.
srv_cleanup_connections() is supposed to be static, so mark it as so.
This patch should be backported where commit 6318d33ce6
("BUG/MEDIUM: connections: force connections cleanup on server changes")
will be backported, that is to say v1.9 to v2.1.
Fixes: 6318d33ce6 ("BUG/MEDIUM: connections: force connections cleanup
on server changes")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
I've been trying to understand a change of behaviour between v2.2dev5 and
v2.2dev6. Indeed our probe is regularly testing to add and remove
servers on a given backend such as:
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -
-> curl on the corresponding frontend: reply from server:31255
# echo "set server be_foo/srv1 addr 10.236.139.34 port 31257" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31257' by 'stats socket command'
# echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
health check port updated.
# echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31257 -
-> curl on the corresponding frontend: reply for server:31257
(notice the difference of weight)
# echo "set server be_foo/srv1 state maint" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 addr 0.0.0.0 port 0" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '10.236.139.34' to '0.0.0.0', port changed from '31257' to '0' by 'stats socket command'
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -
-> curl on the corresponding frontend: reply from server:31255
# echo "set server be_foo/srv1 addr 10.236.139.34 port 31256" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31256' by 'stats socket command'
# echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
health check port updated.
# echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31256 -
-> curl on the corresponding frontend: reply from server:31257 (!)
Here we indeed would expect to get an anver from server:31256. The issue
is highly linked to the usage of `pool-purge-delay`, with a value which
is higher than the duration of the test, 10s in our case.
a git bisect between dev5 and dev6 seems to show commit
079cb9af22 ("MEDIUM: connections: Revamp the way idle connections are killed")
being the origin of this new behaviour.
So if I understand the later correctly, it seems that it was more a
matter of chance that we did not saw the issue earlier.
My patch proposes to force clean idle connections in the two following
cases:
- we set a (still running) server to maintenance
- we change the ip/port of a server
This commit should be backported to 2.1, 2.0, and 1.9.
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
The variable 'ret' must only be declared When HAProxy is compiled with the SSL
support (more precisely SSL_CTRL_SET_TLSEXT_HOSTNAME must be defined).
No backport needed.
A shared tcp-check ruleset is now created to support agent checks. The following
sequence is used :
tcp-check send "%[var(check.agent_string)] log-format
tcp-check expect custom
The custom function to evaluate the expect rule does the same that it was done
to handle agent response when a custom check was used.
Parsing of following keywords have been moved in checks.c file : addr, check,
check-send-proxy, check-via-socks4, no-check, no-check-send-proxy, rise, fall,
inter, fastinter, downinter and port.
A global list to tcp-check ruleset can now be used to share common rulesets with
all backends without any duplication. It is mandatory to convert all specific
protocol checks (redis, pgsql...) to tcp-check healthchecks.
To do so, a flag is now attached to each tcp-check ruleset to know if it is a
shared ruleset or not. tcp-check rules defined in a backend are still directly
attached to the proxy and not shared. In addition a second flag is used to know
if the ruleset is inherited from the defaults section.
To allow reusing these blocks without consuming more memory, their list
should be static and share-able accross uses. The head of the list will
be shared as well.
It is thus necessary to extract the head of the rule list from the proxy
itself. Transform it into a pointer instead, that can be easily set to
an external dynamically allocated head.
The options and directives related to the configuration of checks in a backend
may be defined after the servers declarations. So, initialization of the check
of each server must not be performed during configuration parsing, because some
info may be missing. Instead, it must be done during the configuration validity
check.
Thus, callback functions are registered to be called for each server after the
config validity check, one for the server check and another one for the server
agent-check. In addition deinit callback functions are also registered to
release these checks.
This patch should be backported as far as 1.7. But per-server post_check
callback functions are only supported since the 2.1. And the initcall mechanism
does not exist before the 1.9. Finally, in 1.7, the code is totally
different. So the backport will be harder on older versions.
Error codes ERR_WARN and ERR_ALERT are used to signal that the error
given is of the corresponding level. All errors are displayed as ALERT
in the display_parser_err() function.
Differentiate the display level based on the error code. If both
ERR_WARN and ERR_ALERT are used, ERR_ALERT is given priority.
Documentation states that default settings for ssl server options can be set
using either ssl-default-server-options or default-server directives. In practice,
not all ssl server options can have default values, such as ssl-min-ver, ssl-max-ver,
etc..
This patch adds the missing ssl options in srv_ssl_settings_cpy() and srv_parse_ssl(),
making it possible to write configurations like the following examples, and have them
behave as expected.
global
ssl-default-server-options ssl-max-ver TLSv1.2
defaults
mode http
listen l1
bind 1.2.3.4:80
default-server ssl verify none
server s1 1.2.3.5:443
listen l2
bind 2.2.3.4:80
default-server ssl verify none ssl-max-ver TLSv1.3 ssl-min-ver TLSv1.2
server s1 1.2.3.6:443
This should be backported as far as 1.8.
This fixes issue #595.
Before supporting "server" line in "peers" section, such sections without
any local peer were removed from the configuration to get it validated.
This patch fixes the issue where a "server" line without address and port which
is a remote peer without address and port makes the configuration parsing fail.
When encoutering such cases we now ignore such lines remove them from the
configuration.
Thank you to Jrme Magnin for having reported this bug.
Must be backported to 2.1 and 2.0.
The original algorithm always killed half the idle connections. This doesn't
take into account the way the load can change. Instead, we now kill half
of the exceeding connections (exceeding connection being the number of
used + idle connections past the last maximum used connections reached).
That way if we reach a peak, we will kill much less, and it'll slowly go back
down when there's less usage.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.
This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.
Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.
So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.
It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
Most DNS servers provide A/AAAA records in the Additional section of a
response, which correspond to the SRV records from the Answer section:
;; QUESTION SECTION:
;_http._tcp.be1.domain.tld. IN SRV
;; ANSWER SECTION:
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A1.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A8.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A5.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A6.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A4.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A3.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A2.domain.tld.
_http._tcp.be1.domain.tld. 3600 IN SRV 5 500 80 A7.domain.tld.
;; ADDITIONAL SECTION:
A1.domain.tld. 3600 IN A 192.168.0.1
A8.domain.tld. 3600 IN A 192.168.0.8
A5.domain.tld. 3600 IN A 192.168.0.5
A6.domain.tld. 3600 IN A 192.168.0.6
A4.domain.tld. 3600 IN A 192.168.0.4
A3.domain.tld. 3600 IN A 192.168.0.3
A2.domain.tld. 3600 IN A 192.168.0.2
A7.domain.tld. 3600 IN A 192.168.0.7
SRV record support was introduced in HAProxy 1.8 and the first design
did not take into account the records from the Additional section.
Instead, a new resolution is associated to each server with its relevant
FQDN.
This behavior generates a lot of DNS requests (1 SRV + 1 per server
associated).
This patch aims at fixing this by:
- when a DNS response is validated, we associate A/AAAA records to
relevant SRV ones
- set a flag on associated servers to prevent them from running a DNS
resolution for said FADN
- update server IP address with information found in the Additional
section
If no relevant record can be found in the Additional section, then
HAProxy will failback to running a dedicated resolution for this server,
as it used to do.
This behavior is the one described in RFC 2782.
Since commit 980855bd95 ("BUG/MEDIUM: server: initialize the orphaned
conns lists and tasks at the end"), we no longer use err section.
This should fix github issue #438
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Issue #417 reports a possible memory leak in the state-file loading code.
There's one such place in the loop which corresponds to parsing errors
where the curreently allocated line is not freed when dropped. In any
case this is very minor in that no more than the file's length may be
lost in the worst case, considering that the whole file is kept anyway
in case of success. This fix addresses this.
It should be backported to 2.1.
The global state file tree isn't configured for unique keys, so if an
entry appears multiple times, e.g. due to a bogus script that concatenates
entries multiple times, this will needlessly eat memory. Let's just drop
duplicates.
This should be backported to 2.1.
Starting haproxy with a state file of 700k servers eats 11.2 GB of RAM
due to a mistake in the function that loads the strings into a tree: it
allocates a full buffer for each backend+server name instead of allocating
just the required string. By just fixing this we're down to 80 MB.
This should be backported to 2.1.
As reported in issue #408, "agent-addr" doesn't work on default-server
lines. This is due to the transcription of the old "addr" option in commit
6e5e0d8f9e ("MINOR: server: Make 'default-server' support 'addr' keyword.")
which correctly assigns it to the check.addr and agent.addr fields, but
which also copies the default check.addr into both the check's and the
agent's addr fields. Thus the default agent's address is never used.
This fix makes sure to copy the check from the check and the agent from
the agent. However it's worth noting that if "addr" is specified on the
server line, it will still overwrite both the check and the agent's
addresses.
This must be backported as far as 1.8.
It was noted in #48 that there are times when a configuration
may use the server-template directive with SRV records and
simultaneously want to control weights using an agent-check or
through the runtime api. This patch adds a new option
"ignore-weight" to the "resolve-opts" directive.
When specified, any weight indicated within an SRV record will
be ignored. This is for both initial resolution and ongoing
resolution.
fopen() can return NULL when state file is missing. This patch adds a
check of fopen() return value so we can skip processing in such case.
No backport needed.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
There were 221 places where a status message or an error message were built
to be returned on the CLI. All of them were replaced to use cli_err(),
cli_msg(), cli_dynerr() or cli_dynmsg() depending on what was expected.
This removed a lot of duplicated code because most of the times, 4 lines
are replaced by a single, safer one.
A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.
While the latter seems more natural, it is fundamentally incompatible with
mayn other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. Howwever reversing the lock in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.
The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.
Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.
This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.
The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".
This must be backported as far as 1.8.
When we're purging idle connections, there's a race condition, when we're
removing the connection from the idle list, to add it to the list of
connections to free, if the thread owning the connection tries to free it
at the same time.
To fix this, simply add a per-thread lock, that has to be hold before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list, to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performances.
This has not been reported yet, but could provoke random segfaults.
This should be backported to 2.0.
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).
The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!
This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This result in way much less fopen, fgets, and strcmp calls, which make
loading of very big state files very quick now.
Since h7da71293e431b5ebb3d6289a55b0102331788ee6as has been added, the
server name (srv->id in the code) is now unique per backend, which
means it can reliabely be used to identify a server recovered from the
server-state file.
This patch cleans up the parsing of server-state file and ensure we use
only the server name as a reliable key.
As reported in GH issue #109 and in discourse issue
https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d
the time parser doesn't error on overflows nor underflows. This is a
recurring problem which additionally has the bad taste of taking a long
time before hitting the user.
This patch makes parse_time_err() return special error codes for overflows
and underflows, and adds the control in the call places to report suitable
errors depending on the requested unit. In practice, underflows are almost
never returned as the parsing function takes care of rounding values up,
so this might possibly happen on 64-bit overflows returning exactly zero
after rounding though. It is not really possible to cut the patch into
pieces as it changes the function's API, hence all callers.
Tests were run on about every relevant part (cookie maxlife/maxidle,
server inter, stats timeout, timeout*, cli's set timeout command,
tcp-request/response inspect-delay).
Commit fb55365f9 ("MINOR: server: increase the default pool-purge-delay
to 5 seconds") did this but the setting placed in new_server() was
overwritten by srv_settings_cpy() from the default-server values preset
in init_default_instance(). Now let's put it at the right place.
The default used to be a very aggressive delay of 1 second before starting
to purge idle connections, but tests show that with bursty traffic it's a
bit short. Let's increase this to 5 seconds.
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.
I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.
[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
layer is unique for a connection and will be used both for connecting
and for target address manipulation. As such it may for example report
incorrect destination addresses in logs if the proxy is reached over
IPv6]
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.
The following renames were done :
ENABLE_POLL -> USE_POLL
ENABLE_EPOLL -> USE_EPOLL
ENABLE_KQUEUE -> USE_KQUEUE
ENABLE_EVPORTS -> USE_EVPORTS
TPROXY -> USE_TPROXY
NETFILTER -> USE_NETFILTER
NEED_CRYPT_H -> USE_CRYPT_H
CONFIG_HAP_CRYPT -> USE_LIBCRYPT
CONFIG_HAP_NS -> DUSE_NS
CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
This implements support for the new API which relies on a call to
setsockopt().
On systems that support it (currently, only Linux >= 4.11), this enables
using TCP fast open when connecting to server.
Please note that you should use the retry-on "conn-failure", "empty-response"
and "response-timeout" keywords, or the request won't be able to be retried
on failure.
Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>
When copying the settings for all servers when using server templates,
fix a typo, or we would never copy the length of the ALPN to be used for
checks.
This should be backported to 1.9.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.
To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.
In addition, h1_release() and h2_release() have also been updated in the same
way.
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.
This should be backported to 1.9.
This also depends on the nbthread count, so it must only be performed after
parsing the whole config file. As a side effect, this removes some code
duplication between servers and server-templates.
This must be backported to 1.9.
The idle conns lists are sized according to the number of threads. As such
they cannot be initialized during the parsing since nbthread can be set
later, as revealed by this simple config which randomly crashes when used.
Let's do this at the end instead.
listen proxy
bind :4445
mode http
timeout client 10s
timeout server 10s
timeout connect 10s
http-reuse always
server s1 127.0.0.1:8000
global
nbthread 8
This fix must be backported to 1.9 and 1.8.
Some servers may wish to limit the total number of requests they execute
over a connection because some of their components might leak resources.
In HTTP/1 it was easy, they just had to emit a "connection: close" header
field with the last response. In HTTP/2, it's less easy because the info
is not always shared with the component dealing with the H2 protocol and
it could be harder to advertise a GOAWAY with a stream limit.
This patch provides a solution to this by adding a new "max-reuse" parameter
to the server keyword. This parameter indicates how many times an idle
connection may be reused for new requests. The information is made available
and the underlying muxes will be able to use it at will.
This patch should be backported to 1.9.
The keyword parser doesn't check the value range, but supported values are
-1 and positive values, thus we should check it.
This can be backported to 1.9.
When we load health values from a server state file, make sure what we assign
to srv->check.health actually matches the state we restore.
This should be backported as far as 1.6.
In order to address the mailers issues, we needed to store the proxy
into the checks struct, which was done by commit c98aa1f18 ("MINOR:
checks: Store the proxy in checks."). However this one did it only for
the health checks and not for the agent checks, resulting in an immediate
crash when the agent is enabled on a random config like this one :
listen agent
bind :8000
server s1 255.255.255.255:1 agent-check agent-port 1
Thanks to Seri Kim for reporting it and providing a reproducer in
issue #20. This fix must be backported to 1.9.
Make "bind" keywork be supported in "peers" sections.
All "bind" settings are supported on this line.
Add "default-bind" option to parse the binding options excepted the bind address.
Do not parse anymore the bind address for local peers on "server" lines.
Do not use anymore list_for_each_entry() to set the "peers" section
listener parameters because there is only one listener by "peers" section.
May be backported to 1.5 and newer.
With this patch "default-server" lines are supported in "peers" sections
to setup the default settings of peers which are from now setup
when parsing both "peer" and "server" lines.
May be backported to 1.5 and newer.
Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.
This should be backported to 1.9.
When initializing server-template all of the servers after the first
have srv->idle_orphan_conns initialized within server_template_init()
The first server does not have this initialized and when http-reuse
is active this causes a segmentation fault when accessed from
srv_add_to_idle_list(). This patch removes the check for
srv->tmpl_info.prefix within server_finalize_init() and allows
the first server within a server-template to have srv->idle_orphan_conns
properly initialized.
This should be backported to 1.9.
Add a way to configure the ALPN used by check, with a new "check-alpn"
keyword. By default, the checks will use the server ALPN, but it may not
be convenient, for instance because the server may use HTTP/2, while checks
are unable to do HTTP/2 yet.
Instead of the old "idle-timeout" mechanism, add a new option,
"pool-purge-delay", that sets the delay before purging idle connections.
Each time the delay happens, we destroy half of the idle connections.
Add a new command, "pool-max-conn" that sets the maximum number of connections
waiting in the orphan idling connections list (as activated with idle-timeout).
Using "-1" means unlimited. Using pools is now dependant on this.
Add a new keyword for servers, "idle-timeout". If set, unused connections are
kept alive until the timeout happens, and will be picked for reuse if no
other connection is available.
Currently a mux may be forced on a bind or server line by specifying the
"proto" keyword. The problem is that the mux may depend on the proxy's
mode, which is not known when parsing this keyword, so a wrong mux could
be picked.
Let's simply update the mux entry while checking its validity. We do have
the name and the side, we only need to see if a better mux fits based on
the proxy's mode. It also requires to remove the side check while parsing
the "proto" keyword since a wrong mux could be picked.
This way it becomes possible to declare multiple muxes with the same
protocol names and different sides or modes.
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :
- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().
The now unused si_cant_put() function was removed.
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
There are as many ways to build the globalfilepathlen variable as branches
in the if/then/else, creating lots of confusion. Address the most obvious
parts, but some polishing definitely is still needed.
OpenSSL released support for TLSv1.3. It also added a separate function
SSL_CTX_set_ciphersuites that is used to set the ciphers used in the
TLS 1.3 handshake. This change adds support for that new configuration
option by adding a ciphersuites configuration variable that works
essentially the same as the existing ciphers setting.
Note that it should likely be backported to 1.8 in order to ease usage
of the now released openssl-1.1.1.
This patch ensures that a DNS resolution may be launched before
setting a server FQDN via the CLI. Especially, it checks that
resolvers was set.
A LEVEL 4 reg testing file is provided.
Thanks to Lukas Tribus for having reported this issue.
Must be backported to 1.8.
Server state file has no indication that a server is currently managed
by a DNS SRV resolution.
And thus, both feature (DNS SRV resolution and server state), when used
together, does not provide the expected behavior: a smooth experience...
This patch introduce the "SRV record name" in the server state file and
loads and applies it if found and wherever required.
This patch applies to haproxy-dev branch only. For backport, a specific patch
is provided for 1.8.
thread_isolate() is currently being called with the server lock held.
This is not acceptable because it prevents other threads from reaching
the rendez-vous point. Now that the LB algos are thread-safe, let's get
rid of this call.
No backport is nedeed.
The server-specific CLI commands "set weight", "set maxconn",
"disable agent", "enable agent", "disable health", "enable health",
"disable server" and "enable server" were not protected against
concurrent accesses. Now they take the server lock around the
sensitive part.
This patch must be backported to 1.8.
At the moment it's totally unclear while reading the server's code which
functions require to be called with the server lock held and which ones
grab it and cannot be called this way. This commit simply inventories
all of them to indicate what is detected depending on how these functions
use the struct server. Only functions used at runtime were checked, those
dedicated to config parsing were skipped. Doing so already has uncovered
a few bugs on some CLI actions.
Commit 3ff577e ("MAJOR: server: make server state changes synchronous again")
reintroduced synchronous server state changes. However, during the previous
change from synchronous to asynchronous, the server state propagation was
placed at the end of the function to ease the code changes, and the commit
above didn't put it back at its place. This has resulted in propagated
states to be incomplete. For example, making a server leave maintenance
would make it up but would leave its tracking servers down because they
see their tracked server is still down.
Let's just move the status update right to its place. It also adds the
benefit of reporting state changes in the order they appear and not in
reverse.
No backport is needed.
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
Now we try to synchronously push updates as they come using the new rdv
point, so that the call to the server update function from the main poll
loop is not needed anymore.
It further reduces the apparent latency in the health checks as the response
time almost always appears as 0 ms, resulting in a slightly higher check rate
of ~1960 conn/s. Despite this, the CPU consumption has slightly dropped again
to ~32% for the same test.
The only trick is that the checks code is built with a bit of recursivity
because srv_update_status() calls server_recalc_eweight(), and the latter
needs to signal srv_update_status() in case of updates. Thus we added an
extra argument to this function to indicate whether or not it must
propagate updates (no if it comes from srv_update_status).
The current sync point causes some important stress when a high number
of threads is in use on a config with lots of checks, because it wakes
up all threads every time a server state changes.
A config like the following can easily saturate a 4-core machine reaching
only 750 checks per second out of the ~2000 configured :
global
nbthread 4
defaults
mode http
timeout connect 5s
timeout client 5s
timeout server 5s
frontend srv
bind :8001 process 1/1
redirect location / if { method OPTIONS } { rand(100) ge 50 }
stats uri /
backend chk
option httpchk
server-template srv 1-100 127.0.0.1:8001 check rise 1 fall 1 inter 50
The reason is that the random on the fake server causes the responses
to randomly match an HTTP check, and results in a lot of up/down events
that are broadcasted to all threads. It's worth noting that the CPU usage
already dropped by about 60% between 1.8 and 1.9 just due to the scheduler
updates, but the sync point remains expensive.
In addition, it's visible on the stats page that a lot of requests end up
with an L7TOUT status in ~60ms. With smaller timeouts, it's even L4TOUT
around 20-25ms.
By not using THREAD_WANT_SYNC() anymore and only calling the server updates
under thread_isolate(), we can avoid all these wakeups. The CPU usage on
the same config drops to around 44% on the same machine, with all checks
being delivered at ~1900 checks per second, and the stats page shows no
more timeouts, even at 10 ms check interval. The difference is mainly
caused by the fact that there's no more need to wait for a thread to wake
up from poll() before starting to process check results.
Commit 64cc49c ("MAJOR: servers: propagate server status changes
asynchronously.") heavily changed the way the server states are
updated since they became asynchronous. During this change, some
code was lost, which is used to shut down some sessions from a
backup server and to pick pending connections from a proxy once
a server is turned back from maintenance to ready state. The
effect is that when temporarily disabling a server, connections
stay in the backend's queue, and when re-enabling it, they are
not picked and they expire in the backend's queue. Now they're
properly picked again.
This fix must be backported to 1.8.
When parsing the configuration, if "server", "default-server" or
"server-template" are found in a frontend, we first warn that it will be
ignored, only to be considered a fatal error later. Be true to our word, and
just ignore it.
This should be backported to 1.8 and 1.7.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
By default, HAProxy's DNS resolution at runtime ensure that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.
This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".
When creating a state file using "show servers state" an empty field is
created in the srv_addr column if the server is from the socket family
AF_UNIX. This leads to a warning on start up when using
"load-server-state-from-file". This patch defaults srv_addr to "-" if
the socket family is not covered.
This patch should be backported to 1.8.