MySQL will in stop supporting pre-4.1 authentication packets in the future
and is already giving us a hard time regarding non-silencable warnings
which are logged on each health check. Warnings look like the following:
"[Warning] Client failed to provide its character set. 'latin1' will be used
as client character set."
This patch adds basic support for post-4.1 authentication by sending the proper
authentication packet with the character set, along with the QUIT command.
Now that it is possible to know whether a server is in forced maintenance
or inherits its maintenance status from another one, it is possible to
allow server tracking at more than one level. We still provide a loop
detection however.
Note that for the stats it's a bit trickier since we have to report the
check state which corresponds to the state of the server at the end of
the chain.
This change now involves a new flag SRV_ADMF_IMAINT to note that the
maintenance status of a server is inherited from another server. Thus,
we know at each server level in the chain if it's running, in forced
maintenance or in a maintenance status because it tracks another server,
or even in both states.
Disabling a server propagates this flag down to other servers. Enabling
a server flushes the flag down. A server becomes up again once both of
its flags are cleared.
Two new functions "srv_adm_set_maint()" and "srv_adm_set_ready()" are used to
manipulate this maintenance status. They're used by the CLI and the stats
page.
Now the stats page always says "MAINT" instead of "MAINT(via)" and it's
only the chk/down field which reports "via x/y" when the status is
inherited from another server, but it doesn't say it when a server was
forced into maintenance. The CSV output indicates "MAINT (via x/y)"
instead of only "MAINT(via)". This is the most accurate representation.
One important thing is that now entering/leaving maintenance for a
tracking server correctly follows the state of the tracked server.
Servers used to have 3 flags to store a state, now they have 4 states
instead. This avoids lots of confusion for the 4 remaining undefined
states.
The encoding from the previous to the new states can be represented
this way :
SRV_STF_RUNNING
| SRV_STF_GOINGDOWN
| | SRV_STF_WARMINGUP
| | |
0 x x SRV_ST_STOPPED
1 0 0 SRV_ST_RUNNING
1 0 1 SRV_ST_STARTING
1 1 x SRV_ST_STOPPING
Note that the case where all bits were set used to exist and was randomly
dealt with. For example, the task was not stopped, the throttle value was
still updated and reported in the stats and in the http_server_state header.
It was the same if the server was stopped by the agent or for maintenance.
It's worth noting that the internal function names are still quite confusing.
Now we introduce srv->admin and srv->prev_admin which are bitfields
containing one bit per source of administrative status (maintenance only
for now). For the sake of backwards compatibility we implement a single
source (ADMF_FMAINT) but the code already checks any source (ADMF_MAINT)
where the STF_MAINTAIN bit was previously checked. This will later allow
us to add ADMF_IMAINT for maintenance mode inherited from tracked servers.
Along doing these changes, it appeared that some places will need to be
revisited when implementing the inherited bit, this concerns all those
modifying the ADMF_FMAINT bit (enable/disable actions on the CLI or stats
page), and the checks to report "via" on the stats page. But currently
the code is harmless.
Till now, the server's state and flags were all saved as a single bit
field. It causes some difficulties because we'd like to have an enum
for the state and separate flags.
This commit starts by splitting them in two distinct fields. The first
one is srv->state (with its counter-part srv->prev_state) which are now
enums, but which still contain bits (SRV_STF_*).
The flags now lie in their own field (srv->flags).
The function srv_is_usable() was updated to use the enum as input, since
it already used to deal only with the state.
Note that currently, the maintenance mode is still in the state for
simplicity, but it must move as well.
Thomas Heil reported that previous commit 07fcaaa ("MINOR: fix a few
memory usage errors") make haproxy crash when req* rules are used. As
diagnosed by Cyril Bonté, this commit introduced a regression which
makes haproxy free the memory areas allocated for regex even when
they're going to be used, resulting in the crashes.
This patch does three things :
- undo the free() on the valid path
- add regfree() on the error path but only when regcomp() succeeds
- rename err_code to ret_code to avoid confusing the valid return
path with an error path.
John-Paul Bader reported a stupid regression in 1.5-dev25, we
forget to check that global.stats_fe is initialized before visiting
its sockets, resulting in a crash.
No backport is needed.
We used to have is_addr() in place to validate sometimes the existence
of an address, sometimes a valid IPv4 or IPv6 address. Replace them
carefully so that is_inet_addr() is used wherever we can only use an
IPv4/IPv6 address.
Till now a warning was emitted if the "stats bind-process" was not
specified when nbproc was greater than 1. Now we can be much finer
and only emit a warning when at least of the stats socket is bound
to more than one process at a time.
When a process list is specified on either the proxy or the bind lines,
the latter is refined to the intersection of the two. A warning is emitted
if no intersection is found, and the situation is fixed by either falling
back to the first process of the proxy or to all processes.
When a bind-process setting is present in a frontend or backend, we
now verify that the specified process range at least shares one common
process with those defined globally by nbproc. Then if the value is
set, it is reduced to the one enforced by nbproc.
A warning is emitted if process count does not match, and the fix is
done the following way :
- if a single process was specified in the range, it's remapped to
process #1
- if more than one process was specified, the binding is removed
and all processes are usable.
Note that since backends may inherit their settings from frontends,
depending on the declaration order, they may or may not be reported
as warnings.
Some consistency checks cannot be performed between frontends, backends
and peers at the moment because there is no way to check for intersection
between processes bound to some processes when the number of processes is
higher than the number of bits in a word.
So first, let's limit the number of processes to the machine's word size.
This means nbproc will be limited to 32 on 32-bit machines and 64 on 64-bit
machines. This is far more than enough considering that configs rarely go
above 16 processes due to scalability and management issues, so 32 or 64
should be fine.
This way we'll ensure we can always build a mask of all the processes a
section is bound to.
By default, a proxy's bind_proc is zero, meaning "bind to all processes".
It's only when not zero that its process list is restricted. So we don't
want the frontends to enforce the value on the backends when the backends
are still set to zero.
Now, haproxy exit an error saying:
Unable to initialize the lock for the shared SSL session cache. You can retry using
the global statement 'tune.ssl.force-private-cache' but it could increase the CPU
usage due to renegotiation if nbproc > 1.
Process shared mutex seems not supported on some OSs (FreeBSD).
This patch checks errors on mutex lock init to fallback
on a private session cache (per process cache) in error cases.
Commit fc6c032 ("MEDIUM: global: add support for CPU binding on Linux ("cpu-map")")
merged into 1.5-dev13 involves a useless test that clang reports as a warning. The
"low" variable cannot be negative here. Issue reported by Charles Carter.
The "block" rules are redundant with http-request rules because they
are performed immediately before and do exactly the same thing as
"http-request deny". Moreover, this duplication has led to a few
minor stats accounting issues fixed lately.
Instead of keeping the two rule sets, we now build a list of "block"
rules that we compile as "http-request block" and that we later insert
at the beginning of the "http-request" rules.
The only user-visible change is that in case of a parsing error, the
config parser will now report "http-request block rule" instead of
"blocking condition".
Finn Arne Gangstad suggested that we should have the ability to break
keep-alive when the target server has reached its maxconn and that a
number of connections are present in the queue. After some discussion
around his proposed patch, the following solution was suggested : have
a per-proxy setting to fix a limit to the number of queued connections
on a server after which we break keep-alive. This ensures that even in
high latency networks where keep-alive is beneficial, we try to find a
different server.
This patch is partially based on his original proposal and implements
this configurable threshold.
In version 1.3.4, we got the ability to split configuration parts between
frontends and backends. The stats was attached to the backend and a control
was made to ensure that it was used only in a listen or backend section, but
not in a frontend.
The documentation clearly says that the statement may only be used in the
backend.
But since that same version above, the defaults stats configuration is
only filled in the frontend part of the proxy and not in the backend's.
So a backend will not get stats which are enabled in a defaults section,
despite what the doc says. However, a frontend configured after a defaults
section will get stats and will not emit the warning!
There were many technical limitations in 1.3.4 making it impossible to
have the stats working both in the frontend and backend, but now this has
become a total mess.
It's common however to see people create a frontend with a perfectly
working stats configuration which only emits a warning stating that it
might not work, adding to the confusion. Most people workaround the tricky
behaviour by declaring a "listen" section with no server, which was the
recommended solution in 1.3 where it was even suggested to add a dispatch
address to avoid a warning.
So the right solution seems to do the following :
- ensure that the defaults section's settings apply to the backends,
as documented ;
- let the frontends work in order not to break existing setups relying
on the defaults section ;
- officially allow stats to be declared in frontends and remove the
warninng
This patch should probably not be backported since it's not certain that
1.4 is fully compatible with having stats in frontends and backends (which
was really made possible thanks to applets).
Till now there was no check against misplaced use-server rules, and
no warning was emitted, adding to the confusion. They're processed
just after the use_backend rules, or more exactly at the same level
but for the backend.
Recently, the http-request ruleset started to be used a lot and some
bug reports were caused by misplaced http-request rules because there
was no warning if they're after a redirect or use_backend rule. Let's
fix this now. http-request rules are just after the block rules.
Since it became possible to use log-format expressions in use_backend,
having a mandatory condition becomes annoying because configurations
are full of "if TRUE". Let's relax the check to accept no condition
like many other keywords (eg: redirect).
When compiled with USE_GETADDRINFO, make sure we use getaddrinfo(3) to
perform name lookups. On default dual-stack setups this will change the
behavior of using IPv6 first. Global configuration option
'nogetaddrinfo' can be used to revert to deprecated gethostbyname(3).
A few occurrences of sprintf() were causing harmless warnings on OpenBSD :
src/cfgparse.o(.text+0x259e): In function `cfg_parse_global':
src/cfgparse.c:1044: warning: sprintf() is often misused, please use snprintf()
These ones were easy to get rid of, so better do it.
The cfgparse.c file becomes huge, and a large part of it comes from the
server keyword parser. Since the configuration is a bit more modular now,
move this parser to server.c.
This patch also moves the check of the "server" keyword earlier in the
supported keywords list, resulting in a slightly faster config parsing
for configs with large numbers of servers (about 10%).
No functional change was made, only the code was moved.
We have a use case where we look up a customer ID in an HTTP header
and direct it to the corresponding server. This can easily be done
using ACLs and use_backend rules, but the configuration becomes
painful to maintain when the number of customers grows to a few
tens or even a several hundreds.
We realized it would be nice if we could make the use_backend
resolve its name at run time instead of config parsing time, and
use a similar expression as http-request add-header to decide on
the proper backend to use. This permits the use of prefixes or
even complex names in backend expressions. If no name matches,
then the default backend is used. Doing so allowed us to get rid
of all the use_backend rules.
Since there are some config checks on the use_backend rules to see
if the referenced backend exists, we want to keep them to detect
config errors in normal config. So this patch does not modify the
default behaviour and proceeds this way :
- if the backend name in the use_backend directive parses as a log
format rule, it's used as-is and is resolved at run time ;
- otherwise it's a static name which must be valid at config time.
There was the possibility of doing this with the use-server directive
instead of use_backend, but it seems like use_backend is more suited
to this task, as it can be used for other purposes. For example, it
becomes easy to serve a customer-specific proxy.pac file based on the
customer ID by abusing the errorfile primitive :
use_backend bk_cust_%[hdr(X-Cust-Id)] if { hdr(X-Cust-Id) -m found }
default_backend bk_err_404
backend bk_cust_1
errorfile 200 /etc/haproxy/static/proxy.pac.cust1
Signed-off-by: Bertrand Jacquin <bjacquin@exosec.fr>
This patch permit to register new sections in the haproxy's
configuration file. This run like all the "keyword" registration, it is
used during the haproxy initialization, typically with the
"__attribute__((constructor))" functions.
The function str2net runs DNS resolution if valid ip cannot be parsed.
The DNS function used is the standard function of the libc and it
performs asynchronous request.
The asynchronous request is not compatible with the haproxy
archictecture.
str2net() is used during the runtime throught the "socket".
This patch remove the DNS resolution during the runtime.
This patch remove the limit of 32 groups. It also permit to use standard
"pat_parse_str()" function in place of "pat_parse_strcat()". The
"pat_parse_strcat()" is no longer used and its removed. Before this
patch, the groups are stored in a bitfield, now they are stored in a
list of strings. The matching is slower, but the number of groups is
low and generally the list of allowed groups is short.
The fetch function "smp_fetch_http_auth_grp()" used with the name
"http_auth_group" return valid username. It can be used as string for
displaying the username or with the acl "http_auth_group" for checking
the group of the user.
Maybe the names of the ACL and fetch methods are no longer suitable, but
I keep the current names for conserving the compatibility with existing
configurations.
The function "userlist_postinit()" is created from verification code
stored in the big function "check_config_validity()". The code is
adapted to the new authentication storage system and it is moved in the
"src/auth.c" file. This function is used to check the validity of the
users declared in groups and to check the validity of groups declared
on the "user" entries.
This resolve function is executed before the check of all proxy because
many acl needs solved users and groups.
The binary samples are sometimes copied as is into http headers.
A sample can contain bytes unallowed by the http rfc concerning
header content, for example if it was extracted from binary data.
The resulting http request can thus be invalid.
This issue does not yet happen because haproxy currently (mistakenly)
hex-encodes binary data, so it is not really possible to retrieve
invalid HTTP chars.
The solution consists in hex-encoding all non-printable chars prefixed
by a '%' sign.
No backport is needed since existing code is not affected yet.
cfg_parse_listen() currently checks for duplicated proxy names.
Now that we have a tree for this, we can use it.
The config load time was further reduced by 1.6, which is now
about 4.5 times faster than what it was without the trees.
In fact it was the last CPU-intensive processing involving proxy
names. Now the only remaining point is the automatic fullconn
computation which can be bypassed by having a fullconn in the
defaults section, reducing the load time by another 10x.
Large configurations can take time to parse when thousands of backends
are in use. Let's store all the proxies in trees.
findproxy_mode() has been modified to use the tree for lookups, which
has divided the parsing time by about 2.5. But many lookups are still
present at many places and need to be dealt with.
Disabled backends don't have their symbols resolved. We must not initialize
their peers section since they're not valid and instead still contain the
section's name.
There are other places where such unions are still in use, and other similar
errors might still happen. Ideally we should get rid of all of them in the
quite sensible config stage.
Commits e0d1bfb ("[MINOR] Allow shutdown of sessions when a server
becomes unavailable") and eb2c24a ("MINOR: checks: add on-marked-up
option") mentionned that the directive was supported in default-server
but while it can be stated there, it's ignored because the config value
is not copied from the default server upon creation of a new server.
Moving the statement to the "server" lines works fine though. Thanks
to Baptiste Assmann for reporting and diagnosing this bug.
These features were introduced in 1.5-dev6 and 1.5-dev10 respectively,
so no backport is needed.
Cyril Bonté reported that despite commit 0dbbf317 which attempted
to fix the crash when a peers section has no name, we still get a
segfault after the error message when parsing the peers. The reason
is that the returned error code is ERR_FATAL and not ERR_ABORT, so
the parsing continues while the section was not initialized.
This is 1.5-specific, no backport is needed.
The ability to globally override the default client and server cipher
suites has been requested multiple times since the introduction of SSL.
This commit adds two new keywords to the global section for this :
- ssl-default-bind-ciphers
- ssl-default-server-ciphers
It is still possible to preset them at build time by setting the macros
LISTEN_DEFAULT_CIPHERS and CONNECT_DEFAULT_CIPHERS.
The new tune.idletimer value allows one to set a different value for
idle stream detection. The default value remains set to one second.
It is possible to disable it using zero, and to change the default
value at build time using DEFAULT_IDLE_TIMER.