Always try to remove a connexion from its toremove_list in conn_free.
This prevents a double-free in case the connection is freed but was
already added in toremove_list.
This bug was easily reproduced by running 4-5 runs of inject on a
single-thread instance of haproxy :
$ inject -u 10000 -d 10 -G 127.0.0.1:20080
A crash would soon be triggered in srv_cleanup_toremove_connections.
This does not need to be backported.
move listen status to a helper, defining both status enum and string
definition.
this will be helpful to be reused in prometheus code. It also removes
this hard-to-read nested ternary.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
From this patch it should be possible to add support for listen stats in
prometheus.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
This patch introduce the "dns_stream_nameserver" to use DNS over
TCP on strict nameservers. For the upper layer it is analog to
the api used with udp nameservers except that the user que switch
the name server in "stream" mode at the init using "dns_stream_init".
The fallback from UDP to TCP is not handled and this is not the
purpose of this feature. This is done to choose the transport layer
during the initialization.
Currently there is a hardcoded limit of 4 pipelined transactions
per TCP connections. A batch of idle connections is expired every 5s.
This code is designed to support a maximum DNS message size on TCP: 64k.
Note: this code won't perform retry on unanswered queries this
should be handled by the upper layer
This patch splits current dns.c into two files:
The first dns.c contains code related to DNS message exchange over UDP
and in future other TCP. We try to remove depencies to resolving
to make it usable by other stuff as DNS load balancing.
The new resolvers.c inherit of the code specific to the actual
resolvers.
Note:
It was really difficult to obtain a clean diff dur to the amount
of moved code.
Note2:
Counters and stuff related to stats is not cleany separated because
currently counters for both layers are merged and hard to separate
for now.
This patch splits recv and send functions in two layers. the
lowest is responsible of DNS message transactions over
the network. Doing this we could use DNS message layer
for something else than resolving. Load balancing for instance.
This patch also re-works the way to init a nameserver and
introduce the new struct dns_dgram_server to prepare the arrival
of dns_stream_server and the support of DNS over TCP.
The way to retry a send failure of a request because of EAGAIN
was re-worked. Previously there was no control and all "pending"
queries were re-played each time it reaches a EAGAIN. This
patch introduce a ring to stack messages in case of sent
failure. This patch is emptied if poller shows that the
socket is ready again to push messages.
Counters are currently stored into lowlevel nameservers struct but
most of them are resolving layer data and increased in the upper layer
So this patch renames the prototype used to allocate/dump them with prefix
'resolv' waiting for a clean split.
Some types are specific to resolver code and a renamed using
the 'resolv' prefix instead 'dns'.
-struct dns_query_item {
+struct resolv_query_item {
-struct dns_answer_item {
+struct resolv_answer_item {
-struct dns_response_packet {
+struct resolv_response {
This patch adds the attribute packed on struct dns_question
because it is directly memcpy to network building a response.
This patch also removes the commented line:
// struct list options; /* list of option records */
because it is also used directly using memcpy to build a request
and must not contain host data.
Resolv callbacks are also updated to rely on counters and not on
nameservers.
"show stat domain dns" will now show the parent id (i.e. resolvers
section name).
The "show peers" output has become huge due to the dictionaries making it
less readable. Now this feature has reached a certain level of maturity
which doesn't warrant to dump it all the time, given that it was essentially
needed by developers. Let's make it optional, and disabled by default, only
when "show peers dict" is requested. The default output reminds about the
command. The output has been divided by 5 :
$ socat - /tmp/sock1 <<< "show peers dict" | wc -l
125
$ socat - /tmp/sock1 <<< "show peers" | wc -l
26
It could be useful to backport this to recent stable versions.
Now default proxies are stored into a dedicated tree, sorted by name.
Only unnamed entries are not kept upon new section creation. The very
first call to cfg_parse_listen() will automatically allocate a dummy
defaults section which corresponds to the previous static one, since
the code requires to have one at a few places.
The first immediately visible benefit is that it allows to reuse
alloc_new_proxy() to allocate a defaults section instead of doing it by
hand. And the secret goal is to allow to keep multiple named defaults
section in memory to reuse them from various proxies.
Now we'll have a tree of named defaults sections. The regular insertion
and lookup functions take care of the capability in order to select the
appropriate tree. A new function proxy_destroy_defaults() removes a
proxy from this tree and frees it entirely.
We don't want to expose this one anymore as we'll soon keep multiple
default proxies. Let's move it inside the parser which is the only
place which still uses it, and initialize it on the fly once needed
instead of doing it at boot time.
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.
This will only affect new code so no backport is needed.
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.
This will only affect new code so no backport is needed.
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.
This will only affect new code so no backport is needed.
This used to be open-coded in cfgparse-listen.c when facing a "defaults"
keyword. Let's move this into proxy_free_defaults(). This code is ugly and
doesn't even reset the just freed pointers. Let's not change this yet.
This code should probably be merged with a generic proxy deinit function
called from deinit(). However there's a catch on uri_auth which cannot be
freed because it might be used by one or several proxies. We definitely
need refcounts there!
This new function takes over the old open-coding that used to be done
for too long in cfg_parse_listen() and it now does everything at once
in a proxy-centric function. The function does all the job of allocating
the structure, initializing it, presetting its defaults from the default
proxy and checking for errors. The code was almost unchanged except for
defproxy being passed as a pointer, and the error message being passed
using memprintf().
This change will be needed to ease reuse of multiple default proxies,
or to create dynamic backends in a distant future.
init_default_instance() was still left in cfgparse.c which is not the
best place to pre-initialize a proxy. Let's place it in proxy.c just
after init_new_proxy(), take this opportunity for renaming it to
proxy_preset_defaults() and taking out init_new_proxy() from it, and
let's pass it the pointer to the default proxy to be initialized instead
of implicitly assuming defproxy. We'll soon be able to exploit this.
Only two call places had to be updated.
struct comp is used in struct proxy but never declared prior to this
so depending on where proxy.h is included, touching the <comp> field
can break the build.
This is just an API bug but it's annoying when trying to tidy the code.
The source list passed in argument must be a const and not a variable,
as it's typically the list head from a default proxy and must obviously
not be modified by the function. No backport is needed as it only impacts
new code.
This is just an API bug but it's annoying when trying to tidy the code.
The default proxy passed in argument must be a const and not a variable.
No backport is needed as it only impacts new code.
In 2.1, commit ee4f5f83d ("MINOR: stats: get rid of the ST_CONVDONE flag")
introduced a subtle bug. By testing curproxy against defproxy in
check_config_validity(), it tried to eliminate the need for a flag
to indicate that stats authentication rules were already compiled,
but by doing so it left the issue opened for the case where a new
defaults section appears after the two proxies sharing the first
one:
defaults
mode http
stats auth foo:bar
listen l1
bind :8080
listen l2
bind :8181
defaults
# just to break above
This config results in:
[ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time.
[ALERT] 042/113725 (3121) : Fatal errors found in configuration.
Removing the last defaults remains OK. It turns out that the cleanups
that followed that patch render it useless, so the best fix is to revert
the change (with the up-to-date flags instead). The flag was marked as
belonging to the config. It's not exact but it's the closest to the
reality, as it's not there to configure the behavior but ti mention
that the config parser did its job.
This could be backported as far as 2.1, but in practice it looks like
nobody ever hit it.
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.
also alloc a specific chunk to avoid mixing with other called functions
using it
Signed-off-by: William Dauchy <wdauchy@gmail.com>
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier
this patch confirm how painful it is to maintain this functionality.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
Use the proxy protocol frame if proxy protocol is activated on the
server line. Do not add anymore these connections in the private list.
If some requests are made with the same proxy fields, they can reuse
the idle connection.
The reg-tests proxy_protocol_send_unique_id must be adapted has it
relied on the side effect behavior that every requests from a same
connection reused a private server connection. Now, a new connection is
created as expected if the proxy protocol fields differ.
The source address is used as an input to the the server connection hash. The
address and port are used as separate hash inputs. Do not add anymore these
connections in the private list.
This parameter is set only if used in the transparent-proxy mode.
The destination address is used as an input to the server connection hash. The
address and port are used as separated hash inputs. Note that they are not used
when statically specified on the server line. This is only useful for dynamic
destination address.
This is typically used when the server address is dynamically set via the
set-dst action. The address and port are separated hash parameters.
Most notably, it should fixed set-dst use case (cf github issue #947).
The sni parameter is an input to the server connection hash. Do not add
anymore connections with dynamic sni in the private list. Thus, it is
now possible to reuse a server connection if they use the same sni.
Compare the connection hash when reusing a connection from the session.
This ensures that a private connection is reused only if it shares the
same set of parameters.
The pointer of the target server is used as a first parameter for the
server connection hash calcul. This prevents the hash to be null when no
specific parameters are present, and can serve as a simple defense
against an attacker trying to reuse a non-conform connection.
This is a preliminary work for the calcul of the backend connection
hash. A structure conn_hash_params is the input for the operation,
containing the various specific parameters of a connection.
The high bits of the hash will reflect the parameters present as input.
A set of macros is written to manipulate the connection hash and extract
the parameters/payload.
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.
This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.
Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.
For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
Since commit 3169471964fdc49963e63f68c1fd88686821a0c4 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.
Signed-off-by: William Dauchy <wdauchy@gmail.com>
Amaury reported that the commit 3ce6eed ("MEDIUM: ssl: add a rwlock for
SSL server session cache") introduced some warning during compilation:
include/haproxy/thread.h|411 col 2| warning: enumeration value 'SSL_SERVER_LOCK' not handled in switch [-Wswitch]
This patch fix the issue by adding the right entry in the switch block.
Must be backported where 3ce6eed is backported. (2.4 only for now)
Historically we've been counting lots of client-triggered events in stick
tables to help detect misbehaving ones, but we've been missing the same on
the server side, and there's been repeated requests for being able to count
the server errors per URL in order to precisely monitor the quality of
service or even to avoid routing requests to certain dead services, which
is also called "circuit breaking" nowadays.
This commit introduces http_fail_cnt and http_fail_rate, which work like
http_err_cnt and http_err_rate in that they respectively count events and
their frequency, but they only consider server-side issues such as network
errors, unparsable and truncated responses, and 5xx status codes other
than 501 and 505 (since these ones are usually triggered by the client).
Note that retryable errors are purposely not accounted for, so that only
what the client really sees is considered.
With this it becomes very simple to put some protective measures in place
to perform a redirect or return an excuse page when the error rate goes
beyond a certain threshold for a given URL, and give more chances to the
server to recover from this condition. Typically it could look like this
to bypass a URL causing more than 10 requests per second:
stick-table type string len 80 size 4k expire 1m store http_fail_rate(1m)
http-request track-sc0 base # track host+path, ignore query string
http-request return status 503 content-type text/html \
lf-file excuse.html if { sc0_http_fail_rate gt 10 }
A more advanced mechanism using gpt0 could even implement high/low rates
to disable/enable the service.
Reg-test converteers_ref_cnt_never_dec.vtc was updated to test it.
mul32hi() multiples a constant a with a variable b from 0 to 0xffffffff
and shifts the result by 32 bits. It's visible that it's always impossible
to reach the constant a this way because the product always misses exactly
one unit of a to be preserved. And this cannot be corrected by the caller
either as adding one to the output will only shift the output range, and
it's not possible to pass 2^32 on the ratio <b>. The right approach is to
add "a" after the multiplication so that the input range is always
preserved for all ratio values from 0 to 0xffffffff:
(a=0x00000000 * b=0x00000000 + a=0x00000000) >> 32 = 0x00000000
(a=0x00000000 * b=0x00000001 + a=0x00000000) >> 32 = 0x00000000
(a=0x00000000 * b=0xffffffff + a=0x00000000) >> 32 = 0x00000000
(a=0x00000001 * b=0x00000000 + a=0x00000001) >> 32 = 0x00000000
(a=0x00000001 * b=0x00000001 + a=0x00000001) >> 32 = 0x00000000
(a=0x00000001 * b=0xffffffff + a=0x00000001) >> 32 = 0x00000001
(a=0xffffffff * b=0x00000000 + a=0xffffffff) >> 32 = 0x00000000
(a=0xffffffff * b=0x00000001 + a=0xffffffff) >> 32 = 0x00000001
(a=0xffffffff * b=0xffffffff + a=0xffffffff) >> 32 = 0xffffffff
This is only used in freq_ctr calculations and the slightly lower value
is unlikely to have ever been noticed by anyone. This may be backported
though it is not important.
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.
Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduce a
rwlock for the SSL server session cache.
In the SSL session part this patch only lock in read, even if it writes.
The reason behind this, is that in the session part, there is one cache
storage per thread so it is not a problem to write in the cache from
several threads. The problem is only when trying to write in the cache
from the CLI (which could be on any thread) when a session is trying to
access the cache. So there is a write lock in the CLI part to prevent
simultaneous access by a session and the CLI.
This patch also remove the thread_isolate attempt which is eating too
much CPU time and was not protecting from the use of a free ptr in the
session.