Commit Graph

6101 Commits

Author SHA1 Message Date
Emeric Brun
c943799c86 MEDIUM: resolvers/dns: split dns.c into dns.c and resolvers.c
This patch splits current dns.c into two files:

The first dns.c contains code related to DNS message exchange over UDP
and in future other TCP. We try to remove depencies to resolving
to make it usable by other stuff as DNS load balancing.

The new resolvers.c inherit of the code specific to the actual
resolvers.

Note:
It was really difficult to obtain a clean diff dur to the amount
of moved code.

Note2:
Counters and stuff related to stats is not cleany separated because
currently counters for both layers are merged and hard to separate
for now.
2021-02-13 10:03:46 +01:00
Emeric Brun
d26a6237ad MEDIUM: resolvers: split resolving and dns message exchange layers.
This patch splits recv and send functions in two layers. the
lowest is responsible of DNS message transactions over
the network. Doing this we could use DNS message layer
for something else than resolving. Load balancing for instance.

This patch also re-works the way to init a nameserver and
introduce the new struct dns_dgram_server to prepare the arrival
of dns_stream_server and the support of DNS over TCP.

The way to retry a send failure of a request because of EAGAIN
was re-worked. Previously there was no control and all "pending"
queries were re-played each time it reaches a EAGAIN. This
patch introduce a ring to stack messages in case of sent
failure. This patch is emptied if poller shows that the
socket is ready again to push messages.
2021-02-13 09:51:10 +01:00
Emeric Brun
d3b4495f0d MINOR: resolvers: rework dns stats prototype because specific to resolvers
Counters are currently stored into lowlevel nameservers struct but
most of them are resolving layer data and increased in the upper layer
So this patch renames the prototype used to allocate/dump them with prefix
'resolv' waiting for a clean split.
2021-02-13 09:43:18 +01:00
Emeric Brun
6a2006ae37 MINOR: resolvers: replace nameserver's resolver ref by generic parent pointer
This will allow to use nameservers in something else than a resolver
section (load balancing for instance).
2021-02-13 09:43:18 +01:00
Emeric Brun
d30e9a1709 MINOR: resolvers: rework prototype suffixes to split resolving and dns.
A lot of prototypes in dns.h are specific to resolvers and must
be renamed to split resolving and DNS layers.
2021-02-13 09:43:18 +01:00
Emeric Brun
456de77bdb MINOR: resolvers: renames resolvers DNS_UPD_* returncodes to RSLV_UPD_*
This patch renames some #defines prefixes from DNS to RSLV.
2021-02-13 09:43:18 +01:00
Emeric Brun
30c766ebbc MINOR: resolvers: renames resolvers DNS_RESP_* errcodes RSLV_RESP_*
This patch renames some #defines prefixes from DNS to RSLV.
2021-02-13 09:43:18 +01:00
Emeric Brun
21fbeedf97 MINOR: resolvers: renames some dns prefixed types using resolv prefix.
@@ -119,8 +119,8 @@ struct act_rule {
-               } dns;                         /* dns resolution */
+               } resolv;                      /* resolving */

-struct dns_options {
+struct resolv_options {
2021-02-13 09:43:18 +01:00
Emeric Brun
08622d3c0a MINOR: resolvers: renames some resolvers specific types to not use dns prefix
This patch applies those changes on names:

-struct dns_resolution {
+struct resolv_resolution {

-struct dns_requester {
+struct resolv_requester {

-struct dns_srvrq {
+struct resolv_srvrq {

@@ -185,12 +185,12 @@ struct stream {

        struct {
-               struct dns_requester *dns_requester;
+               struct resolv_requester *requester;
...
-       } dns_ctx;
+       } resolv_ctx;
2021-02-13 09:43:18 +01:00
Emeric Brun
750fe79cd0 MINOR: resolvers: renames type dns_resolvers to resolvers.
It also renames 'dns_resolvers' head list to sec_resolvers
to avoid conflicts with local variables 'resolvers'.
2021-02-13 09:43:17 +01:00
Emeric Brun
85914e9d9b MINOR: resolvers: renames some resolvers internal types and removes dns prefix
Some types are specific to resolver code and a renamed using
the 'resolv' prefix instead 'dns'.

-struct dns_query_item {
+struct resolv_query_item {

-struct dns_answer_item {
+struct resolv_answer_item {

-struct dns_response_packet {
+struct resolv_response {
2021-02-13 09:43:17 +01:00
Emeric Brun
67f830d29d BUG/MINOR: resolvers: fix attribute packed struct for dns
This patch adds the attribute packed on struct dns_question
because it is directly memcpy to network building a response.

This patch also removes the commented line:
//     struct list options;       /* list of option records */

because it is also used directly using memcpy to build a request
and must not contain host data.
2021-02-13 09:43:17 +01:00
Emeric Brun
50c870e4de BUG/MINOR: dns: add missing sent counter and parent id to dns counters.
Resolv callbacks are also updated to rely on counters and not on
nameservers.
"show stat domain dns" will now show the parent id (i.e. resolvers
section name).
2021-02-13 09:43:17 +01:00
Emeric Brun
e14b98c08e MINOR: ring: adds new ring_init function.
Adds the new ring_init function to initialize
a pre-allocated ring struct using the given
memory area.
2021-02-13 09:43:17 +01:00
Willy Tarreau
49962b58d0 MINOR: peers/cli: do not dump the peers dictionaries by default on "show peers"
The "show peers" output has become huge due to the dictionaries making it
less readable. Now this feature has reached a certain level of maturity
which doesn't warrant to dump it all the time, given that it was essentially
needed by developers. Let's make it optional, and disabled by default, only
when "show peers dict" is requested. The default output reminds about the
command. The output has been divided by 5 :

  $ socat - /tmp/sock1  <<< "show peers dict" | wc -l
  125
  $ socat - /tmp/sock1  <<< "show peers" | wc -l
  26

It could be useful to backport this to recent stable versions.
2021-02-12 17:00:52 +01:00
Willy Tarreau
e90904d5a9 MEDIUM: proxy: store the default proxies in a tree by name
Now default proxies are stored into a dedicated tree, sorted by name.
Only unnamed entries are not kept upon new section creation. The very
first call to cfg_parse_listen() will automatically allocate a dummy
defaults section which corresponds to the previous static one, since
the code requires to have one at a few places.

The first immediately visible benefit is that it allows to reuse
alloc_new_proxy() to allocate a defaults section instead of doing it by
hand. And the secret goal is to allow to keep multiple named defaults
section in memory to reuse them from various proxies.
2021-02-12 16:23:46 +01:00
Willy Tarreau
0a0f6a7e4f MINOR: proxy: support storing defaults sections into their own tree
Now we'll have a tree of named defaults sections. The regular insertion
and lookup functions take care of the capability in order to select the
appropriate tree. A new function proxy_destroy_defaults() removes a
proxy from this tree and frees it entirely.
2021-02-12 16:23:46 +01:00
Willy Tarreau
80dc6fea59 MINOR: proxy: add a new capability PR_CAP_DEF
In order to more easily distinguish a default proxy from a standard one,
let's introduce a new capability PR_CAP_DEF.
2021-02-12 16:23:46 +01:00
Willy Tarreau
7d0c143185 MINOR: cfgparse: move defproxy to cfgparse-listen as a static
We don't want to expose this one anymore as we'll soon keep multiple
default proxies. Let's move it inside the parser which is the only
place which still uses it, and initialize it on the fly once needed
instead of doing it at boot time.
2021-02-12 16:23:46 +01:00
Willy Tarreau
bb8669ae28 BUG/MINOR: server: parse_server() must take a const for the defproxy
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.

This will only affect new code so no backport is needed.
2021-02-12 16:23:46 +01:00
Willy Tarreau
54fa7e332a BUG/MINOR: tcpcheck: proxy_parse_*check*() must take a const for the defproxy
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.

This will only affect new code so no backport is needed.
2021-02-12 16:23:46 +01:00
Willy Tarreau
220fd70694 BUG/MINOR: extcheck: proxy_parse_extcheck() must take a const for the defproxy
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.

This will only affect new code so no backport is needed.
2021-02-12 16:23:46 +01:00
Willy Tarreau
a3320a0509 MINOR: proxy: move the defproxy freeing code to proxy.c
This used to be open-coded in cfgparse-listen.c when facing a "defaults"
keyword. Let's move this into proxy_free_defaults(). This code is ugly and
doesn't even reset the just freed pointers. Let's not change this yet.

This code should probably be merged with a generic proxy deinit function
called from deinit(). However there's a catch on uri_auth which cannot be
freed because it might be used by one or several proxies. We definitely
need refcounts there!
2021-02-12 16:23:46 +01:00
Willy Tarreau
7683893c70 REORG: proxy: centralize the proxy allocation code into alloc_new_proxy()
This new function takes over the old open-coding that used to be done
for too long in cfg_parse_listen() and it now does everything at once
in a proxy-centric function. The function does all the job of allocating
the structure, initializing it, presetting its defaults from the default
proxy and checking for errors. The code was almost unchanged except for
defproxy being passed as a pointer, and the error message being passed
using memprintf().

This change will be needed to ease reuse of multiple default proxies,
or to create dynamic backends in a distant future.
2021-02-12 16:23:46 +01:00
Willy Tarreau
144289b459 REORG: move init_default_instance() to proxy.c and pass it the defproxy pointer
init_default_instance() was still left in cfgparse.c which is not the
best place to pre-initialize a proxy. Let's place it in proxy.c just
after init_new_proxy(), take this opportunity for renaming it to
proxy_preset_defaults() and taking out init_new_proxy() from it, and
let's pass it the pointer to the default proxy to be initialized instead
of implicitly assuming defproxy. We'll soon be able to exploit this.
Only two call places had to be updated.
2021-02-12 16:23:46 +01:00
Willy Tarreau
168a414037 BUILD: proxy: add missing compression-t.h to proxy-t.h
struct comp is used in struct proxy but never declared prior to this
so depending on where proxy.h is included, touching the <comp> field
can break the build.
2021-02-12 16:23:46 +01:00
Willy Tarreau
09f2e77eb1 BUG/MINOR: tcpheck: the source list must be a const in dup_tcpcheck_var()
This is just an API bug but it's annoying when trying to tidy the code.
The source list passed in argument must be a const and not a variable,
as it's typically the list head from a default proxy and must obviously
not be modified by the function. No backport is needed as it only impacts
new code.
2021-02-12 16:23:46 +01:00
Willy Tarreau
016255a483 BUG/MINOR: http-htx: defpx must be a const in proxy_dup_default_conf_errors()
This is just an API bug but it's annoying when trying to tidy the code.
The default proxy passed in argument must be a const and not a variable.
No backport is needed as it only impacts new code.
2021-02-12 16:23:46 +01:00
Willy Tarreau
5bbc676608 BUG/MINOR: stats: revert the change on ST_CONVDONE
In 2.1, commit ee4f5f83d ("MINOR: stats: get rid of the ST_CONVDONE flag")
introduced a subtle bug. By testing curproxy against defproxy in
check_config_validity(), it tried to eliminate the need for a flag
to indicate that stats authentication rules were already compiled,
but by doing so it left the issue opened for the case where a new
defaults section appears after the two proxies sharing the first
one:

      defaults
          mode http
          stats auth foo:bar

      listen l1
          bind :8080

      listen l2
          bind :8181

      defaults
          # just to break above

This config results in:
  [ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time.
  [ALERT] 042/113725 (3121) : Fatal errors found in configuration.

Removing the last defaults remains OK. It turns out that the cleanups
that followed that patch render it useless, so the best fix is to revert
the change (with the up-to-date flags instead). The flag was marked as
belonging to the config. It's not exact but it's the closest to the
reality, as it's not there to configure the behavior but ti mention
that the config parser did its job.

This could be backported as far as 2.1, but in practice it looks like
nobody ever hit it.
2021-02-12 16:23:45 +01:00
William Dauchy
d1a7b85a40 MEDIUM: server: support {check,agent}_addr, agent_port in server state
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
William Dauchy
63e6cba12a MEDIUM: server: add server-states version 2
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
Amaury Denoyelle
1921d20fff MINOR: connection: use proxy protocol as parameter for srv conn hash
Use the proxy protocol frame if proxy protocol is activated on the
server line. Do not add anymore these connections in the private list.
If some requests are made with the same proxy fields, they can reuse
the idle connection.

The reg-tests proxy_protocol_send_unique_id must be adapted has it
relied on the side effect behavior that every requests from a same
connection reused a private server connection. Now, a new connection is
created as expected if the proxy protocol fields differ.
2021-02-12 12:54:04 +01:00
Amaury Denoyelle
d10a200f62 MINOR: connection: use src addr as parameter for srv conn hash
The source address is used as an input to the the server connection hash. The
address and port are used as separate hash inputs. Do not add anymore these
connections in the private list.

This parameter is set only if used in the transparent-proxy mode.
2021-02-12 12:54:04 +01:00
Amaury Denoyelle
01a287f1e5 MINOR: connection: use dst addr as parameter for srv conn hash
The destination address is used as an input to the server connection hash. The
address and port are used as separated hash inputs. Note that they are not used
when statically specified on the server line. This is only useful for dynamic
destination address.

This is typically used when the server address is dynamically set via the
set-dst action. The address and port are separated hash parameters.

Most notably, it should fixed set-dst use case (cf github issue #947).
2021-02-12 12:53:56 +01:00
Amaury Denoyelle
9b626e3c19 MINOR: connection: use sni as parameter for srv conn hash
The sni parameter is an input to the server connection hash. Do not add
anymore connections with dynamic sni in the private list. Thus, it is
now possible to reuse a server connection if they use the same sni.
2021-02-12 12:48:11 +01:00
Amaury Denoyelle
293dcc400e MINOR: backend: compare conn hash for session conn reuse
Compare the connection hash when reusing a connection from the session.
This ensures that a private connection is reused only if it shares the
same set of parameters.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
1a58aca84e MINOR: connection: use the srv pointer for the srv conn hash
The pointer of the target server is used as a first parameter for the
server connection hash calcul. This prevents the hash to be null when no
specific parameters are present, and can serve as a simple defense
against an attacker trying to reuse a non-conform connection.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
81c6f76d3e MINOR: connection: prepare hash calcul for server conns
This is a preliminary work for the calcul of the backend connection
hash. A structure conn_hash_params is the input for the operation,
containing the various specific parameters of a connection.

The high bits of the hash will reflect the parameters present as input.
A set of macros is written to manipulate the connection hash and extract
the parameters/payload.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
f232cb3e9b MEDIUM: connection: replace idle conn lists by eb trees
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.

This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
5c7086f6b0 MEDIUM: connection: protect idle conn lists with locks
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.

Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.

For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
2021-02-12 12:33:04 +01:00
William Dauchy
38cd986c54 BUG/MINOR: server: re-align state file fields number
Since commit 3169471964 ("MINOR: Add
server port field to server state file.") max_fields was not increased
on version number 1. So this patch aims to fix it. This should be
backported as far as v1.8, but the numbering should be adpated depending
on the version: simply increase the field by 1.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-10 16:25:42 +01:00
William Lallemand
7b41654495 MINOR: ssl: add SSL_SERVER_LOCK label in threads.h
Amaury reported that the commit 3ce6eed ("MEDIUM: ssl: add a rwlock for
SSL server session cache") introduced some warning during compilation:

    include/haproxy/thread.h|411 col 2| warning: enumeration value 'SSL_SERVER_LOCK' not handled in switch [-Wswitch]

This patch fix the issue by adding the right entry in the switch block.

Must be backported where 3ce6eed is backported. (2.4 only for now)
2021-02-10 16:17:19 +01:00
Willy Tarreau
826f3ab5e6 MINOR: stick-tables/counters: add http_fail_cnt and http_fail_rate data types
Historically we've been counting lots of client-triggered events in stick
tables to help detect misbehaving ones, but we've been missing the same on
the server side, and there's been repeated requests for being able to count
the server errors per URL in order to precisely monitor the quality of
service or even to avoid routing requests to certain dead services, which
is also called "circuit breaking" nowadays.

This commit introduces http_fail_cnt and http_fail_rate, which work like
http_err_cnt and http_err_rate in that they respectively count events and
their frequency, but they only consider server-side issues such as network
errors, unparsable and truncated responses, and 5xx status codes other
than 501 and 505 (since these ones are usually triggered by the client).
Note that retryable errors are purposely not accounted for, so that only
what the client really sees is considered.

With this it becomes very simple to put some protective measures in place
to perform a redirect or return an excuse page when the error rate goes
beyond a certain threshold for a given URL, and give more chances to the
server to recover from this condition. Typically it could look like this
to bypass a URL causing more than 10 requests per second:

  stick-table type string len 80 size 4k expire 1m store http_fail_rate(1m)
  http-request track-sc0 base       # track host+path, ignore query string
  http-request return status 503 content-type text/html \
      lf-file excuse.html if { sc0_http_fail_rate gt 10 }

A more advanced mechanism using gpt0 could even implement high/low rates
to disable/enable the service.

Reg-test converteers_ref_cnt_never_dec.vtc was updated to test it.
2021-02-10 12:27:01 +01:00
Willy Tarreau
e66ee1a651 BUG/MINOR: intops: fix mul32hi()'s off-by-one
mul32hi() multiples a constant a with a variable b from 0 to 0xffffffff
and shifts the result by 32 bits. It's visible that it's always impossible
to reach the constant a this way because the product always misses exactly
one unit of a to be preserved. And this cannot be corrected by the caller
either as adding one to the output will only shift the output range, and
it's not possible to pass 2^32 on the ratio <b>. The right approach is to
add "a" after the multiplication so that the input range is always
preserved for all ratio values from 0 to 0xffffffff:

     (a=0x00000000 * b=0x00000000 + a=0x00000000) >> 32 = 0x00000000
     (a=0x00000000 * b=0x00000001 + a=0x00000000) >> 32 = 0x00000000
     (a=0x00000000 * b=0xffffffff + a=0x00000000) >> 32 = 0x00000000
     (a=0x00000001 * b=0x00000000 + a=0x00000001) >> 32 = 0x00000000
     (a=0x00000001 * b=0x00000001 + a=0x00000001) >> 32 = 0x00000000
     (a=0x00000001 * b=0xffffffff + a=0x00000001) >> 32 = 0x00000001
     (a=0xffffffff * b=0x00000000 + a=0xffffffff) >> 32 = 0x00000000
     (a=0xffffffff * b=0x00000001 + a=0xffffffff) >> 32 = 0x00000001
     (a=0xffffffff * b=0xffffffff + a=0xffffffff) >> 32 = 0xffffffff

This is only used in freq_ctr calculations and the slightly lower value
is unlikely to have ever been noticed by anyone. This may be backported
though it is not important.
2021-02-09 17:52:50 +01:00
William Lallemand
3ce6eedb37 MEDIUM: ssl: add a rwlock for SSL server session cache
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.

Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduce a
rwlock for the SSL server session cache.

In the SSL session part this patch only lock in read, even if it writes.
The reason behind this, is that in the session part, there is one cache
storage per thread so it is not a problem to write in the cache from
several threads. The problem is only when trying to write in the cache
from the CLI (which could be on any thread) when a session is trying to
access the cache. So there is a write lock in the CLI part to prevent
simultaneous access by a session and the CLI.

This patch also remove the thread_isolate attempt which is eating too
much CPU time and was not protecting from the use of a free ptr in the
session.
2021-02-09 09:43:44 +01:00
Ilya Shipitsin
acf84595a7 CLEANUP: assorted typo fixes in the code and comments
This is 17th iteration of typo fixes
2021-02-08 10:49:08 +01:00
Ilya Shipitsin
7bbf5866e0 BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro
HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was introduced in ec60909871
however it was defined as HAVE_SL_CTX_ADD_SERVER_CUSTOM_EXT (missing "S")
let us fix typo
2021-02-08 00:11:41 +01:00
Willy Tarreau
4acb99f867 BUG/MINOR: xxhash: make sure armv6 uses memcpy()
There was a special case made to allow ARMv6 to use unaligned accesses
via a cast in xxHash when __ARM_FEATURE_UNALIGNED is defined. But while
ARMv6 (and v7) does support unaligned accesses, it's only for 32-bit
pointers, not 64-bit ones, leading to bus errors when the compiler emits
an ldrd instruction and the input (e.g. a pattern) is not aligned, as in
issue #1035.

Note that v7 was properly using the packed approach here and was safe,
however haproxy versions 2.3 and older use the old r39 xxhash code which
has the same issue for armv7. A slightly different fix is required there,
by using a different definition of packed for 32 and 64 bits.

The problem is really visible when running v7 code on a v8 kernel because
such kernels do not implement alignment trap emulation, and the process
dies when this happens. This is why in the issue above it was only detected
under lxc. The emulation could have been disabled on v7 as well by writing
zero to /proc/cpu/alignment though.

This commit is a backport of xxhash commit a470f2ef ("update default memory
access for armv6").

Thanks to @srkunze for the report and tests, @stgraber for his help on
setting up an easy reproducer outside of lxc, and @Cyan4973 for the
discussion around the best way to fix this. Details and alternate patches
available on https://github.com/Cyan4973/xxHash/issues/490.
2021-02-04 17:14:58 +01:00
William Dauchy
4858fb2e18 MEDIUM: check: align agentaddr and agentport behaviour
in the same manner of agentaddr, we now:
- permit to set agentport through `port` keyword, like it is the case
  for agentaddr through `addr`
- set the priority on `agent-port` keyword when used
- add a flag to be able to test when the value is set like for agentaddr

it makes the behaviour between `addr` and `port` more consistent.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 14:00:38 +01:00
William Dauchy
1c921cd748 BUG/MINOR: check: consitent way to set agentaddr
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr <addr1> agent-addr <addr2>
  server ... agent-addr <addr2> addr <addr1>

After this patch `agent-addr` will always be the priority option over
`addr`. It means we test the flag before setting agentaddr.
We also fix all the places where we did not set the flag to be coherent
everywhere.

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 13:55:04 +01:00
William Dauchy
fe03e7d045 MEDIUM: server: adding support for check_port in server state
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 10:46:52 +01:00
William Dauchy
69f118d7b6 MEDIUM: check: remove checkport checkaddr flag
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

in the same manner of checkport flag, we don't really need checkaddr
flag. We can assume if checkaddr is not set, it means it was never set
by the user or config.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 10:43:00 +01:00
William Dauchy
eedb9b13f4 MINOR: stats: improve pending connections description
In order to unify prometheus and stats description, we need to clarify
the description for pending connections.
- remove the BE reference in counters struct, as it is also used in
  servers
- remove reference of `qcur` field in description as it is specific to
  stats implemention
- try to reword cur and max pending connections description

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-01 15:16:33 +01:00
Christopher Faulet
7aa3271439 MINOR: checks: Add function to get the result code corresponding to a status
The function get_check_status_result() can now be used to get the result
code (CHK_RES_*) corresponding to a check status (HCHK_STATUS_*). It will be
used by the Prometheus exporter when reporting the check status of a server.
2021-02-01 15:16:33 +01:00
Willy Tarreau
d597ec2718 MINOR: listener: export manage_global_listener_queue()
This one pops up in tasks lists when running against a saturated
listener.
2021-01-29 14:29:57 +01:00
Willy Tarreau
02922e19ca MINOR: session: export session_expire_embryonic()
This is only to make it resolve nicely in "show tasks".
2021-01-29 12:27:57 +01:00
Willy Tarreau
fb5401f296 MINOR: listener: export accept_queue_process
This is only to make it resolve in "show tasks".
2021-01-29 12:25:23 +01:00
Willy Tarreau
3fb6a7b46e MINOR: activity: declare a new structure to collect per-function activity
The new sched_activity structure will be used to collect task-level
activity based on the target function. The principle is to declare a
large enough array to make collisions rare (256 entries), and hash
the function pointer using a reduced XXH to decide where to store the
stats. On first computation an entry is definitely assigned to the
array and it's done atomically. A special entry (0) is used to store
collisions ("others"). The goal is to make it easy and inexpensive for
the scheduler code to use these to store #calls, cpu_time and lat_time
for each task.
2021-01-29 12:10:33 +01:00
Willy Tarreau
aa622b822b MINOR: activity: make profiling more manageable
In 2.0, commit d2d3348ac ("MINOR: activity: enable automatic profiling
turn on/off") introduced an automatic mode to enable/disable profiling.
The problem is that the automatic mode automatically changes to on/off,
which implied that the forced on/off modes aren't sticky anymore. It's
annoying when debugging because as soon as the load decreases, profiling
stops.

This makes a small change which ought to have been done first, which
consists in having two states for "auto" (auto-on, auto-off) to
distinguish them from the forced states. Setting to "auto" in the config
defaults to "auto-off" as before, and setting it on the CLI switches to
auto but keeps the current operating state.

This is simple enough to be backported to older releases if needed.
2021-01-29 12:10:33 +01:00
Willy Tarreau
4deeb1055f MINOR: tools: add print_time_short() to print a condensed duration value
When reporting some values in debugging output we often need to have
some condensed, stable-length values. This function prints a duration
from nanosecond to years with at least 4 digits of accuracy using the
most suitable unit, always on 7 chars.
2021-01-29 12:10:33 +01:00
Christopher Faulet
405f054652 MINOR: h1: Raise the chunk size limit up to (2^52 - 1)
The allowed chunk size was historically limited to 2GB to avoid risk of
overflow. This restriction is no longer necessary because the chunk size is
immediately stored into a 64bits integer after the parsing. Thus, it is now
possible to raise this limit. However to never fed possibly bogus values
from languages that use floats for their integers, we don't get more than 13
hexa-digit (2^52 - 1). 4 petabytes is probably enough !

This patch should fix the issue #1065. It may be backported as far as
2.1. For the 2.0, the legacy HTTP part must be reviewed. But there is
honestely no reason to do so.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
f9dcbeeab3 MEDIUM: h2: send connect protocol h2 settings
In order to announce support for the Extended CONNECT h2 method by
haproxy, always send the ENABLE_CONNECT_PROTOCOL h2 settings. This new
setting has been described in the rfc 8441.

After receiving ENABLE_CONNECT_PROTOCOL, the client is free to use the
Extended CONNECT h2 method. This can notably be useful for the support
of websocket handshake on http/2.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
c9a0afcc32 MEDIUM: h2: parse Extended CONNECT request to htx
Support for the rfc 8441 Bootstraping WebSockets with HTTP/2

Convert an Extended CONNECT HTTP/2 request into a htx representation.
The htx message uses the GET method with an Upgrade header field to be
fully compatible with the equivalent HTTP/1.1 Upgrade mechanism.

The Extended CONNECT is of the following form :

:method = CONNECT
:protocol = websocket
:scheme = https
:path = /chat
:authority = server.example.com

The new pseudo-header :protocol has been defined and is used to identify
an Extended CONNECT method. Contrary to standard CONNECT, Extended
CONNECT must have :scheme, :path and :authority defined.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
aad333a9fc MEDIUM: h1: add a WebSocket key on handshake if needed
Add the header Sec-Websocket-Key when generating a h1 handshake websocket
without this header. This is the case when doing h2-h1 conversion.

The key is randomly generated and base64 encoded. It is stored on the session
side to be able to verify response key and reject it if not valid.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
7416274914 MEDIUM: h2: parse Extended CONNECT reponse to htx
Support for the rfc 8441 Bootstraping WebSockets with HTTP/2

Convert a 200 status reply from an Extended CONNECT request into a htx
representation. The htx message is set to 101 status code to be fully
compatible with the equivalent HTTP/1.1 Upgrade mechanism.

This conversion is only done if the stream flags H2_SF_EXT_CONNECT_SENT
has been set. This is true if an Extended CONNECT request has already
been seen on the stream.

Besides the 101 status, the additional headers Connection/Upgrade are
added to the htx message. The protocol is set from the value stored in
h2s. Typically it will be extracted from the client request. This is
only used if the client is using h1 as only the HTTP/1.1 101 Response
contains the Upgrade header.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
c193823343 MEDIUM: h1: generate WebSocket key on response if needed
Add the Sec-Websocket-Accept header on a websocket handshake response.
This header may be missing if a h2 server is used with a h1 client.

The response key is calculated following the rfc6455. For this, the
handshake request key must be stored in the h1 session, as a new field
name ws_key. Note that this is only done if the message has been
prealably identified as a Websocket handshake request.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
18ee5c3eb0 MINOR: h1: reject websocket handshake if missing key
If a request is identified as a WebSocket handshake, it must contains a
websocket key header or else it can be reject, following the rfc6455.

A new flag H1_MF_UPG_WEBSOCKET is set on such messages.  For the request
te be identified as a WebSocket handshake, it must contains the headers:
  Connection: upgrade
  Upgrade: websocket

This commit is a compagnon of
"MEDIUM: h1: generate WebSocket key on response if needed" and
"MEDIUM: h1: add a WebSocket key on handshake if needed".

Indeed, it ensures that a WebSocket key is added only from a http/2 side
and not for a http/1 bogus peer.
2021-01-28 16:37:14 +01:00
Christopher Faulet
7d247f0771 MINOR: h2/mux-h2: Add flags to notify the response is known to have no body
The H2 message flag H2_MSGF_BODYLESS_RSP is now used during the request or
the response parsing to notify the mux that, considering the parsed message,
the response is known to have no body. This happens during HEAD requests
parsing and during 204/304 responses parsing.

On the H2 multiplexer, the equivalent flag is set on H2 streams. Thus the
H2_SF_BODYLESS_RESP flag is set on a H2 stream if the H2_MSGF_BODYLESS_RSP
is found after a HEADERS frame parsing. Conversely, this flag is also set
when a HEADERS frame is emitted for HEAD requests and for 204/304 responses.

The H2_SF_BODYLESS_RESP flag will be used to ignore data payload from the
response but not the trailers.
2021-01-28 16:37:14 +01:00
Christopher Faulet
d1ac2b90cd MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead
The EOM block may be removed. The HTX_FL_EOM flags is enough. Most of time,
to know if the end of the message is reached, we just need to have an empty
HTX message with HTX_FL_EOM flag set. It may also be detected when the last
block of a message with HTX_FL_EOM flag is manipulated.

Removing EOM blocks simplifies the HTX message filling. Indeed, there is no
more edge problems when the message ends but there is no more space to write
the EOM block. However, some part are more tricky. Especially the
compression filter or the FCGI mux. The compression filter must finish the
compression on the last DATA block. Before it was performed on the EOM
block, an extra DATA block with the checksum was added. Now, we must detect
the last DATA block to be sure to finish the compression. The FCGI mux on
its part must be sure to reserve the space for the empty STDIN record on the
last DATA block while this record was inserted on the EOM block.

The H2 multiplexer is probably the part that benefits the most from this
change. Indeed, it is now fairly easier to known when to set the ES flag.

The HTX documentaion has been updated accordingly.
2021-01-28 16:37:14 +01:00
Christopher Faulet
789a472674 MINOR: htx: Add a function to know if a block is the only one in a message
The htx_is_unique_blk() function may now be used to know if a block is the
only one in an HTX message, excluding all unused blocks. Note the purpose of
this function is not to know if a block is the last one of an HTTP message.
This means no more data part from the message are expected, except tunneled
data. It only says if a block is alone in an HTX message.
2021-01-28 16:37:14 +01:00
Christopher Faulet
42432f347f MINOR: htx: Rename HTX_FL_EOI flag into HTX_FL_EOM
The HTX_FL_EOI flag is not well named. For now, it is not very used. But
that will change. It will replace the EOM block. Thus, it is renamed.
2021-01-28 16:37:14 +01:00
Christopher Faulet
576c358508 MINOR: htx/http-ana: Save info about Upgrade option in the Connection header
Add an HTX start-line flag and its counterpart into the HTTP message to
track the presence of the Upgrade option into the Connection header. This
way, without parsing the Connection header again, it will be easy to know if
a client asks for a protocol upgrade and if the server agrees to do so. It
will also be easy to perform some conformance checks when a
101-switching-protocols is received.
2021-01-28 16:27:48 +01:00
Christopher Faulet
4ef84c9c41 MINOR: stream: Add a function to validate TCP to H1 upgrades
TCP to H1 upgrades are buggy for now. When such upgrade is performed, a
crash is experienced. The bug is the result of the recent H1 mux
refactoring, and more specifically because of the commit c4bfa59f1 ("MAJOR:
mux-h1: Create the client stream as later as possible"). Indeed, now the H1
mux is responsible to create the frontend conn-stream once the request
headers are fully received. Thus the TCP to H1 upgrade is a problem because
the frontend conn-stream already exists.

To fix the bug, we must keep this conn-stream and the associate stream and
use it in the H1 mux. To do so, the upgrade will be performed in two
steps. First, the mux is upgraded from mux-pt to mux-h1. Then, the mux-h1
performs the stream upgrade, once the request headers are fully received and
parsed. To do so, stream_upgrade_from_cs() must be used. This function set
the SF_HTX flags to switch the stream to HTX mode, it removes the SF_IGNORE
flags and eventually it fills the request channel with some input data.

This patch is required to fix the TCP to H1 upgrades and is intimately
linked with the next commits.
2021-01-28 16:27:48 +01:00
Amaury Denoyelle
3f07c20fab BUG/MEDIUM: session: only retrieve ready idle conn from session
A bug was introduced by the early insertion of idle connections at the
end of connect_server. It is possible to reuse a connection not yet
ready waiting for an handshake (for example with proxy protocol or ssl).
A wrong duplicate xprt_handshake_io_cb tasklet is thus registered as a
side-effect.

This triggers the BUG_ON statement of xprt_handshake_subscribe :
    BUG_ON(ctx->subs && ctx->subs != es);

To counter this, a check is now present in session_get_conn to only
return a connection without the flag CO_FL_WAIT_XPRT. This might cause
sometimes the creation of dedicated server connections when in theory
reuse could have been used, but probably only occurs rarely in real
condition.

This behavior is present since commit :
    MEDIUM: connection: Add private connections synchronously in session server list
It could also be further exagerated by :
    MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking

It can be backported up to 2.3.

NOTE : This bug seems to be only reproducible with mode tcp, for an
unknown reason. However, reuse should never happen when not in http
mode. This improper behavior will be the subject of a dedicated patch.

This bug can easily be reproducible with the following config (a
webserver is required to accept proxy protocol on port 31080) :

    global

    defaults
      mode tcp
      timeout connect 1s
      timeout server 1s
      timeout client 1s

    listen li
      bind 0.0.0.0:4444
      server bla1 127.0.0.1:31080 check send-proxy-v2

with the inject client :
    $ inject -u 10000 -d 10 -G 127.0.0.1:4444

This should fix the github issue #1058.
2021-01-28 14:16:27 +01:00
Tim Duesterhus
491be54cf1 BUILD: Include stdlib.h in compiler.h if DEBUG_USE_ABORT is set
Building with `"DEBUG=-DDEBUG_STRICT=1 -DDEBUG_USE_ABORT=1"` previously emitted the warning:

    In file included from include/haproxy/api.h:35:0,
                     from src/mux_pt.c:13:
    include/haproxy/buf.h: In function ‘br_init’:
    include/haproxy/bug.h:42:90: warning: implicit declaration of function ‘abort’ [-Wimplicit-function-declaration]
     #define ABORT_NOW() do { extern void ha_backtrace_to_stderr(); ha_backtrace_to_stderr(); abort(); } while (0)
                                                                                              ^
    include/haproxy/bug.h:56:21: note: in expansion of macro ‘ABORT_NOW’
     #define CRASH_NOW() ABORT_NOW()
                         ^
    include/haproxy/bug.h:68:4: note: in expansion of macro ‘CRASH_NOW’
        CRASH_NOW();                                           \
        ^
    include/haproxy/bug.h:62:35: note: in expansion of macro ‘__BUG_ON’
     #define _BUG_ON(cond, file, line) __BUG_ON(cond, file, line)
                                       ^
    include/haproxy/bug.h:61:22: note: in expansion of macro ‘_BUG_ON’
     #define BUG_ON(cond) _BUG_ON(cond, __FILE__, __LINE__)
                          ^
    include/haproxy/buf.h:875:2: note: in expansion of macro ‘BUG_ON’
      BUG_ON(size < 2);
      ^

This patch fixes that issue. The `DEBUG_USE_ABORT` option exists for use with
static analysis tools. No backport needed.
2021-01-27 12:44:39 +01:00
William Lallemand
795bd9ba3a CLEANUP: ssl: remove SSL_CTX function parameter
Since the server SSL_CTX is now stored in the ckch_inst, it is not
needed anymore to pass an SSL_CTX to ckch_inst_new_load_srv_store() and
ssl_sock_load_srv_ckchs().
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
bb470aa327 MINOR: ssl: Remove client_crt member of the server's ssl context
The client_crt member is not used anymore since the server's ssl context
initialization now behaves the same way as the bind lines one (using
ckch stores and instances).
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
f3eedfe195 MEDIUM: ssl: Enable backend certificate hot update
When trying to update a backend certificate, we should find a
server-side ckch instance thanks to which we can rebuild a new ssl
context and a new ckch instance that replace the previous ones in the
server structure. This way any new ssl session will be built out of the
new ssl context and the newly updated certificate.

This resolves a subpart of GitHub issue #427 (the certificate part)
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
d817dc733e MEDIUM: ssl: Load client certificates in a ckch for backend servers
In order for the backend server's certificate to be hot-updatable, it
needs to fit into the implementation used for the "bind" certificates.
This patch follows the architecture implemented for the frontend
implementation and reuses its structures and general function calls
(adapted for the server side).
The ckch store logic is kept and a dedicated ckch instance is used (one
per server). The whole sni_ctx logic was not kept though because it is
not needed.
All the new functions added in this patch are basically server-side
copies of functions that already exist on the frontend side with all the
sni and bind_cond references removed.
The ckch_inst structure has a new 'is_server_instance' flag which is
used to distinguish regular instances from the server-side ones, and a
new pointer to the server's structure in case of backend instance.
Since the new server ckch instances are linked to a standard ckch_store,
a lookup in the ckch store table will succeed so the cli code used to
update bind certificates needs to be covered to manage those new server
side ckch instances.
2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton
442b7f2238 MINOR: ssl: Server ssl context prepare function refactoring
Split the server's ssl context initialization into the general ssl
related initializations and the actual initialization of a single
SSL_CTX structure. This way the context's initialization will be
usable by itself from elsewhere.
2021-01-26 15:19:36 +01:00
Christopher Faulet
6071c2d12d BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is unknown
It is only a problem on the response path because the request payload length
it always known. But when a filter is registered to analyze the response
payload, the filtering may hang if the server closes just after the headers.

The root cause of the bug comes from an attempt to allow the filters to not
immediately forward the headers if necessary. A filter may choose to hold
the headers by not forwarding any bytes of the payload. For a message with
no payload but a known payload length, there is always a EOM block to
forward. Thus holding the EOM block for bodyless messages is a good way to
also hold the headers. However, messages with an unknown payload length,
there is no EOM block finishing the message, but only a SHUTR flag on the
channel to mark the end of the stream. If there is no payload when it
happens, there is no payload at all to forward. In the filters API, it is
wrongly detected as a condition to not forward the headers.

Because it is not the most used feature and not the obvious one, this patch
introduces another way to hold the message headers at the begining of the
forwarding. A filter flag is added to explicitly says the headers should be
hold. A filter may choose to set the STRM_FLT_FL_HOLD_HTTP_HDRS flag and not
forwad anything to hold the headers. This flag is removed at each call, thus
it must always be explicitly set by filters. This flag is only evaluated if
no byte has ever been forwarded because the headers are forwarded with the
first byte of the payload.

reg-tests/filters/random-forwarding.vtc reg-test is updated to also test
responses with unknown payload length (with and without payload).

This patch must be backported as far as 2.0.
2021-01-26 09:53:52 +01:00
Tim Duesterhus
3d7f9ff377 MINOR: abort() on my_unreachable() when DEBUG_USE_ABORT is set.
Hopefully this helps static analysis tools detecting that the code after that
call is unreachable.

See GitHub Issue #1075.
2021-01-26 09:33:18 +01:00
William Dauchy
d3a9a4992b MEDIUM: stats: allow to select one field in stats_fill_sv_stats
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend and backend
side.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the server.

A few things to note though:
- state require prior calculation, so I moved that to a sort of helper
  `stats_fill_be_stats_computestate`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-26 09:24:51 +01:00
William Dauchy
da3b466fc2 MEDIUM: stats: allow to select one field in stats_fill_be_stats
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend side.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the backend

A few things to note though:
- status and uweight field requires prior compute, so I moved that to a
  sort of helper `stats_fill_be_stats_computesrv`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
  beginning of the function under a condition.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-26 09:24:19 +01:00
William Dauchy
2107a0faf5 CLEANUP: stats: improve field selection for frontend http fields
while working on backend/servers I realised I could have written that in
a better way and avoid one extra break. This is slightly improving
readiness.
also while being here, fix function declaration which was not 100%
accurate.

this patch does not change the behaviour of the code.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-25 15:53:28 +01:00
Ilya Shipitsin
1fc44d494a BUILD: ssl: guard Client Hello callbacks with HAVE_SSL_CLIENT_HELLO_CB macro instead of openssl version
let us introduce new macro HAVE_SSL_CLIENT_HELLO_CB and guard
callback functions with it
2021-01-22 20:45:24 +01:00
Willy Tarreau
2bfce7e424 MINOR: debug: let ha_dump_backtrace() dump a bit further for some callers
The dump state is now passed to the function so that the caller can adjust
the behavior. A new series of 4 values allow to stop *after* dumping main
instead of before it or any of the usual loops. This allows to also report
BUG_ON() that could happen very high in the call graph (e.g. startup, or
the scheduler itself) while still understanding what the call path was.
2021-01-22 14:48:34 +01:00
Willy Tarreau
5baf4fe31a MEDIUM: debug: now always print a backtrace on CRASH_NOW() and friends
The purpose is to enable the dumping of a backtrace on BUG_ON(). While
it's very useful to know that a condition was met, very often some
caller context is missing to figure how the condition could happen.
From now on, on systems featuring backtrace, a backtrace of the calling
thread will also be dumped to stderr in addition to the unexpected
condition. This will help users of DEBUG_STRICT as they'll most often
find this backtrace in their logs even if they can't find their core
file.

A new "debug dev bug" expert-mode CLI command was added to test the
feature.
2021-01-22 14:18:34 +01:00
Willy Tarreau
a8459b28c3 MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace
This function calls the ha_dump_backtrace() function with a locally
allocated buffer and sends the output slightly indented to fd #2. It's
meant to be used as an emergency backtrace dump.
2021-01-22 14:15:36 +01:00
Willy Tarreau
123fc9786a MINOR: debug: extract the backtrace dumping code to its own function
The backtrace dumping code was located into the thread dump function
but it looks particularly convenient to be able to call it to produce
a dump in other situations, so let's move it to its own function and
make sure it's called last in the function so that we can benefit from
tail merging to save one entry.
2021-01-22 13:52:41 +01:00
Willy Tarreau
2f1227eb3f MINOR: debug: always export the my_backtrace function
In order to simplify the code and remove annoying ifdefs everywhere,
let's always export my_backtrace() and make it adapt to the situation
and return zero if not supported. A small update in the thread dump
function was needed to make sure we don't use its results if it fails
now.
2021-01-22 12:12:29 +01:00
William Dauchy
b9577450ea MINOR: contrib/prometheus-exporter: use fill_fe_stats for frontend dump
use `stats_fill_fe_stats` when possible to avoid duplicating code; make
use of field selector to get the needed field only.

this should not introduce any difference of output.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
William Dauchy
0ef54397b0 MEDIUM: stats: allow to select one field in stats_fill_fe_stats
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the frontend.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
William Dauchy
defd15685e MINOR: stats: add new start time field
Another patch in order to try to reconciliate haproxy stats and
prometheus. Here I'm adding a proper start time field in order to make
proper use of uptime field.
That being done we can move the calculation in `fill_info`

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
William Dauchy
a8766cfad1 MINOR: stats: duplicate 3 fields in bytes in info
in order to prepare a possible merge of fields between haproxy stats and
prometheus, duplicate 3 fields:
  INF_MEMMAX
  INF_POOL_ALLOC
  INF_POOL_USED
Those were specifically named in MB unit which is not what prometheus
recommends. We therefore used them but changed the unit while doing the
calculation. It created a specific case for that, up to the description.
This patch:
- removes some possible confusion, i.e. using MB field for bytes
- will permit an easier merge of fields such as description

First consequence for now, is that we can remove the calculation on
prometheus side and move it on `fill_info`.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-21 18:59:30 +01:00
Christopher Faulet
142dd33912 MINOR: muxes: Add exit status for errors about not implemented features
The MUX_ES_NOTIMPL_ERR exit status is added to allow the multiplexers to
report errors about not implemented features. This will be used by the H1
mux to return 501-not-implemented errors.
2021-01-21 15:21:12 +01:00
Christopher Faulet
e095f31d36 MINOR: http: Add HTTP 501-not-implemented error message
Add the support for the 501-not-implemented status code with the
corresponding default message. The documentation is updated accordingly
because it is now part of status codes HAProxy may emit via an errorfile or
a deny/return HTTP action.
2021-01-21 15:21:12 +01:00
Christopher Faulet
8f100427c4 BUG/MEDIUM: tcpcheck: Don't destroy connection in the wake callback context
When a tcpcheck ruleset uses multiple connections, the existing one must be
closed and destroyed before openning the new one. This part is handled in
the tcpcheck_main() function, when called from the wake callback function
(wake_srv_chk). But it is indeed a problem, because this function may be
called from the mux layer. This means a mux may call the wake callback
function of the data layer, which may release the connection and the mux. It
is easy to see how it is hazardous. And actually, depending on the
scheduling, it leads to crashes.

Thus, we must avoid to release the connection in the wake callback context,
and move this part in the check's process function instead. To do so, we
rely on the CHK_ST_CLOSE_CONN flags. When a connection must be replaced by a
new one, this flag is set on the check, in tcpcheck_main() function, and the
check's task is woken up. Then, the connection is really closed in
process_chk_conn() function.

This patch must be backported as far as 2.2, with some adaptations however
because the code is not exactly the same.
2021-01-21 15:21:12 +01:00
Willy Tarreau
8050efeacb MINOR: cli: give the show_fd helpers the ability to report a suspicious entry
Now the show_fd helpers at the transport and mux levels return an integer
which indicates whether or not the inspected entry looks suspicious. When
an entry is reported as suspicious, "show fd" will suffix it with an
exclamation mark ('!') in the dump, that is supposed to help detecting
them.

For now, helpers were adjusted to adapt to the new API but none of them
reports any suspicious entry yet.
2021-01-21 08:58:15 +01:00
Willy Tarreau
108a271049 MINOR: xprt: add a new show_fd() helper to complete some "show fd" dumps.
Just like we did for the muxes, now the transport layers will have the
ability to provide helpers to report more detailed information about their
internal context. When the helper is not known, the pointer continues to
be dumped as-is if it's not NULL. This way a transport with no context nor
dump function will not add a useless "xprt_ctx=(nil)" but the pointer will
be emitted if valid or if a helper is defined.
2021-01-20 17:17:39 +01:00
Willy Tarreau
45fd1030d5 CLEANUP: tools: make resolve_sym_name() take a const pointer
When 0c439d895 ("BUILD: tools: make resolve_sym_name() return a const")
was written, the pointer argument ought to have been turned to const for
more flexibility. Let's do it now.
2021-01-20 17:17:39 +01:00
Tim Duesterhus
1d66e396bf MINOR: cache: Remove the hash part of the accept-encoding secondary key
As of commit 6ca89162dc this hash no longer is
required, because unknown encodings are not longer stored and known encodings
do not use the cache.
2021-01-18 15:01:41 +01:00
Willy Tarreau
31ffe9fad0 MINOR: pattern: add the missing generation ID manipulation functions
The functions needed to commit a pattern file generation number or
increase it were still missing. Better not have the caller play with
these.
2021-01-15 14:41:16 +01:00
Willy Tarreau
dc2410d093 CLEANUP: pattern: rename pat_ref_commit() to pat_ref_commit_elt()
It's about the third time I get confused by these functions, half of
which manipulate the reference as a whole and those manipulating only
an entry. For me "pat_ref_commit" means committing the pattern reference,
not just an element, so let's rename it. A number of other ones should
really be renamed before 2.4 gets released :-/
2021-01-15 14:11:59 +01:00
Christopher Faulet
d4a83dd6b3 MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities
This function must be used to emit an alert if a proxy does not have at
least one of the requested capabilities. An additional message may be
appended to the alert.
2021-01-13 17:45:34 +01:00
William Dauchy
5d9b8f3c93 MINOR: contrib/prometheus-exporter: use fill_info for process dump
use `stats_fill_info` when possible to avoid duplicating code.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-13 15:19:00 +01:00
Thayne McCombs
8f0cc5c4ba CLEANUP: Fix spelling errors in comments
This is from the output of codespell. It's done at once over a bunch
of files and only affects comments, so there is nothing user-visible.
No backport needed.
2021-01-08 14:56:32 +01:00
William Dauchy
5a982a7165 MINOR: contrib/prometheus-exporter: export build_info
commit c55a626217 ("MINOR: contrib/prometheus-exporter: Add
missing global and per-server metrics") is renaming two metrics between
v2.2 and v2.3:
  server_idle_connections_current
  server_idle_connections_limit

It is breaking some tools which are making use of those metrics while
supporting several haproxy versions. This build_info will permit tools
which make use of metrics to be able to match the haproxy version and
change the list of expected metrics. This was possible using the haproxy
stats socket but not with prometheus export.

This patch follows prometheus best pratices to export specific software
informations. It is adding a new field `build_info` so we can extend it
to other parameters if needed in the future.

example output:
  # HELP haproxy_process_build_info HAProxy build info.
  # TYPE haproxy_process_build_info gauge
  haproxy_process_build_info{version="2.4-dev5-2e1a3f-5"} 1

Even though it is not a bugfix, this patch will make more sense when
backported up to >= 2.0

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-01-08 14:48:13 +01:00
Ilya Shipitsin
b8888ab557 CLEANUP: assorted typo fixes in the code and comments
This is 15th iteration of typo fixes
2021-01-06 17:32:03 +01:00
Ilya Shipitsin
1e9a66603f CLEANUP: assorted typo fixes in the code and comments
This is 14th iteration of typo fixes
2021-01-06 16:26:50 +01:00
Frdric Lcaille
242fb1b639 MINOR: quic: Drop packets with STREAM frames with wrong direction.
A server initiates streams with odd-numbered stream IDs.
Also add useful traces when parsing STREAM frames.
2021-01-04 12:31:28 +01:00
Frdric Lcaille
6c1e36ce55 CLEANUP: quic: Remove useless QUIC event trace definitions.
Remove QUIC_EV_CONN_E* event trace macros which were defined for  errors.
Replace QUIC_EV_CONN_ECHPKT by QUIC_EV_CONN_BCFRMS used in qc_build_cfrms()
2021-01-04 12:31:28 +01:00
Frdric Lcaille
164096eb76 MINOR: qpack: Add static header table definitions for QPACK.
As HPACK, QPACK makes usage of a static header table.
2021-01-04 12:31:28 +01:00
Tim Duesterhus
54182ec9d7 CLEANUP: Apply the coccinelle patch for XXXcmp() on include/
Compare the various `XXXcmp()` functions against zero.
2021-01-04 10:09:02 +01:00
Thayne McCombs
92149f9a82 MEDIUM: stick-tables: Add srvkey option to stick-table
This allows using the address of the server rather than the name of the
server for keeping track of servers in a backend for stickiness.

The peers code was also extended to support feeding the dictionary using
this key instead of the name.

Fixes #814
2020-12-31 10:04:54 +01:00
Remi Tricot-Le Breton
ce9e7b2521 MEDIUM: cache: Manage a subset of encodings in accept-encoding normalizer
The accept-encoding normalizer now explicitely manages a subset of
encodings which will all have their own bit in the encoding bitmap
stored in the cache entry. This way two requests with the same primary
key will be served the same cache entry if they both explicitely accept
the stored response's encoding, even if their respective secondary keys
are not the same and do not match the stored response's one.
The actual hash of the accept-encoding will still be used if the
response's encoding is unmanaged.
The encoding matching and the encoding weight parsing are done for every
subpart of the accept-encoding values, and a bitmap of accepted
encodings is built for every request. It is then tested upon any stored
response that has the same primary key until one with an accepted
encoding is found.
The specific "identity" and "*" accept-encoding values are managed too.
When storing a response in the key, we also parse the content-encoding
header in order to only set the response's corresponding encoding's bit
in its cache_entry encoding bitmap.

This patch fixes GitHub issue #988.
It does not need to be backported.
2020-12-24 17:18:00 +01:00
Remi Tricot-Le Breton
56e46cb393 MINOR: http: Add helper functions to trim spaces and tabs
Add two helper functions that trim leading or trailing spaces and
horizontal tabs from an ist string.
2020-12-24 17:18:00 +01:00
Remi Tricot-Le Breton
2b5c5cbef6 MINOR: cache: Avoid storing responses whose secondary key was not correctly calculated
If any of the secondary hash normalizing functions raises an error, the
secondary hash will be unusable. In this case, the response will not be
stored anymore.
2020-12-24 17:18:00 +01:00
Frdric Lcaille
f63921fc24 MINOR: quic: Add traces for quic_packet_encrypt().
Add traces to have an idea why this function may fail. In fact
in never fails when the passed parameters are correct, especially the
lengths. This is not the case when a packet is not correctly built
before being encrypted.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
133e8a7146 MINOR: quic: make a packet build fails when qc_build_frm() fails.
Even if the size of frames built by qc_build_frm() are computed so that
not to overflow a buffer, do not rely on this and always makes a packet
build fails if we could not build a frame.
Also add traces to have an idea where qc_build_frm() fails.
Fixes a memory leak in qc_build_phdshk_apkt().
2020-12-23 11:57:26 +01:00
Frdric Lcaille
f7e0b8d6ae MINOR: quic: Add traces for in flght ack-eliciting packet counter.
Add trace for this counter. Also shorten its variable name (->ifae_pkts).
2020-12-23 11:57:26 +01:00
Frdric Lcaille
04ffb66bc9 MINOR: quic: Make usage of the congestion control window.
Remove ->ifcdata which was there to control the CRYPTO data sent to the
peer so that not to saturate its reception buffer. This was a sort
of flow control.
Add ->prep_in_flight counter to the QUIC path struct to control the
number of bytes prepared to be sent so that not to saturare the
congestion control window. This counter is increased each time a
packet was built. This has nothing to see with ->in_flight which
is the real in flight number of bytes which have really been sent.
We are olbiged to maintain two such counters to know how many bytes
of data we can prepared before sending them.
Modify traces consequently which were useful to diagnose issues about
the congestion control window usage.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
c5e72b9868 MINOR: quic: Attempt to make trace more readable
As there is a lot of information in this protocol, this is not
easy to make the traces readable. We remove here a few of them and
shorten some line shortening the variable names.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
8090b51e92 MAJOR: quic: Make usage of ebtrees to store QUIC ACK ranges.
Store QUIC ACK ranges in ebtrees in place of lists with a 0(n) time complexity
for insertion.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
f46c10cfb1 MINOR: server: Add QUIC definitions to servers.
This patch adds QUIC structs to server struct so that to make the QUIC code
compile. Also initializes the ebtree to store the connections by connection
IDs.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
884f2e9f43 MINOR: listener: Add QUIC info to listeners and receivers.
This patch adds a quic_transport_params struct to bind_conf struct
used for the listeners. This is to store the QUIC transport parameters
for the listeners. Also initializes them when calling str2listener().
Before str2sa_range() it's too early to figure we're going to speak QUIC,
and after it's too late as listeners are already created. So it seems that
doing it in str2listener() when the protocol is discovered is the best
place.

Also adds two ebtrees to the underlying receivers to store the connection
by connections IDs (one for the original connection IDs, and another
one for the definitive connection IDs which really identify the connections.

However it doesn't seem normal that it is stored in the receiver nor the
listener. There should be a private context in the listener so that
protocols can store internal information. This element should in
fact be the listener handle.

Something still feels wrong, and probably we'll have to make QUIC and
SSL co-exist: a proof of this is that there's some explicit code in
bind_parse_ssl() to prevent the "ssl" keyword from replacing the xprt.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
0c4e3b09b0 MINOR: quic: Add definitions for QUIC protocol.
This patch imports all the definitions for QUIC protocol with few modifications
from 20200720-quic branch of quic-dev repository found at
https://github.com/haproxytech/quic-dev.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
901ee2f37b MINOR: ssl: Export definitions required by QUIC.
QUIC needs to initialize its BIO and SSL session the same way as for SSL over TCP
connections. It needs also to use the same ClientHello callback.
This patch only exports functions and variables shared between QUIC and SSL/TCP
connections.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
5e3d83a221 MINOR: connection: Add a new xprt to connection.
Simply adds XPRT_QUIC new enum to integrate QUIC transport protocol.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
70da889d57 MINOR: quic: Redefine control layer callbacks which are QUIC specific.
We add src/quic_sock.c QUIC specific socket management functions as callbacks
for the control layer: ->accept_conn, ->default_iocb and ->rx_listening.
accept_conn() will have to be defined. The default I/O handler only recvfrom()
the datagrams received. Furthermore, ->rx_listening callback always returns 1 at
this time but should returns 0 when reloading the processus.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
72f7cb170a MINOR: connection: Attach a "quic_conn" struct to "connection" struct.
This is a simple patch to prepare the integration of QUIC support to come.
quic_conn struct is supposed to embed any QUIC specific information for a QUIC
connection.
2020-12-23 11:57:26 +01:00
Frdric Lcaille
ca42b2c9d3 MINOR: protocol: Create proto_quic QUIC protocol layer.
As QUIC is a connection oriented protocol, this file is almost a copy of
proto_tcp without TCP specific features. To suspend/resume a QUIC receiver
we proceed the same way as for proto_udp receivers.

With the recent updates to the listeners, we don't need a specific set of
quic*_add_listener() functions, the default ones are sufficient. The fields
declaration were reordered to make the various layers more visible like in
other protocols.

udp_suspend_receiver/udp_resume_receiver are up-to-date (the check for INHERITED
is present) and the code being UDP-specific, it's normal to use UDP here.
Note that in the future we might more reasily reference stacked layers so that
there's no more need for specifying the pointer here.
2020-12-23 11:57:26 +01:00
Dragan Dosen
6f7cc11e6d MEDIUM: xxhash: use the XXH_INLINE_ALL macro to inline all functions
This way we make all xxhash functions inline, with implementations being
directly included within xxhash.h.

Makefile is updated as well, since we don't need to compile and link
xxhash.o anymore.

Inlining should improve performance on small data inputs.
2020-12-23 06:39:21 +01:00
Dragan Dosen
967e7e79af MEDIUM: xxhash: use the XXH3 functions to generate 64-bit hashes
Replace the XXH64() function calls with the XXH3 variant function
XXH3_64bits_withSeed() where possible.
2020-12-23 06:39:21 +01:00
Dragan Dosen
de37443e64 IMPORT: xxhash: update to v0.8.0 that introduces stable XXH3 variant
A new XXH3 variant of hash functions shows a noticeable improvement in
performance (especially on small data), and also brings 128-bit support,
better inlining and streaming capabilities.

Performance comparison is available here:

  https://github.com/Cyan4973/xxHash/wiki/Performance-comparison
2020-12-23 06:39:21 +01:00
Olivier Houchard
63ee281854 MINOR: atomic: don't use ; to separate instruction on aarch64.
The assembler on MacOS aarch64 interprets ; as the beginning of comments,
so it is not suitable for separating instructions in inline asm. Use \n
instead.

This should be backported to 2.3, 2.2, 2.1, 2.0 and 1.9.
2020-12-23 01:23:41 +01:00
Willy Tarreau
4f59d38616 MINOR: time: increase the minimum wakeup interval to 60s
The MAX_DELAY_MS which is set an upper limit to the poll wait time and
force a wakeup this often used to be set to 1 second in order to easily
spot and correct time drifts. This was added 12 years ago at an era
where virtual machines were starting to become common in server
environments while not working particularly well. Nowadays, such issues
are not as common anymore, however forcing 64 threads to wake up every
single second starts to make the process visible on otherwise idle
systems. Let's increase this wakeup interval to one minute. In the worst
case it will make idle threads wake every second, which remains low.

If this is not sufficient anymore on some systems, another approach
would consist in implementing a deep-sleep mode which only triggers
after a while and which is always disabled if any time drift is
observed.
2020-12-22 10:35:43 +01:00
Christian Ruppert
b67e155895 BUILD: hpack: hpack-tbl-t.h uses VAR_ARRAY but does not include compiler.h
This fixes building hpack from contrib, which failed because of the
undeclared VAR_ARRAY:

make -C contrib/hpack
...
cc -O2 -Wall -g -I../../include -fwrapv -fno-strict-aliasing   -c -o gen-enc.o gen-enc.c
In file included from gen-enc.c:18:
../../include/haproxy/hpack-tbl-t.h:105:23: error: 'VAR_ARRAY' undeclared here (not in a function)
  105 |  struct hpack_dte dte[VAR_ARRAY]; /* dynamic table entries */
...

As discussed in the thread below, let's redefine VAR_ARRAY in this file
so that it remains self-sustaining:

   https://www.mail-archive.com/haproxy@formilux.org/msg39212.html
2020-12-22 10:18:07 +01:00
Ilya Shipitsin
f38a01884a CLEANUP: assorted typo fixes in the code and comments
This is 13n iteration of typo fixes
2020-12-21 11:24:48 +01:00
Ilya Shipitsin
af204881a3 BUILD: ssl: fine guard for SSL_CTX_get0_privatekey call
SSL_CTX_get0_privatekey is openssl/boringssl specific function present
since openssl-1.0.2, let us define readable guard for it, not depending
on HA_OPENSSL_VERSION
2020-12-21 11:17:36 +01:00
Willy Tarreau
b1f54925fc BUILD: plock: remove dead code that causes a warning in gcc 11
As Ilya reported in issue #998, gcc 11 complains about misleading code
indentation which is in fact caused by dead assignments to zero after
a loop which stops on zero. Let's clean both of these.
2020-12-21 10:27:18 +01:00
Miroslav Zagorac
7f8314c8d1 MINOR: opentracing: add ARGC_OT enum
Due to the addition of the OpenTracing filter it is necessary to define
ARGC_OT enum.  This value is used in the functions fmt_directive() and
smp_resolve_args().
2020-12-16 15:49:53 +01:00
Miroslav Zagorac
6deab79d59 MINOR: vars: replace static functions with global ones
The OpenTracing filter uses several internal HAProxy functions to work
with variables and therefore requires two static local HAProxy functions,
var_accounting_diff() and var_clear(), to be declared global.

In fact, the var_clear() function was not originally defined as static,
but it lacked a declaration.
2020-12-16 14:20:08 +01:00
Ilya Shipitsin
ec60909871 BUILD: SSL: fine guard for SSL_CTX_add_server_custom_ext call
SSL_CTX_add_server_custom_ext is openssl specific function present
since openssl-1.0.2, let us define readable guard for it, not depending
on HA_OPENSSL_VERSION
2020-12-15 16:13:35 +01:00
Willy Tarreau
472125bc04 MINOR: protocol: add a pair of check_events/ignore_events functions at the ctrl layer
Right now the connection subscribe/unsubscribe code needs to manipulate
FDs, which is not compatible with QUIC. In practice what we need there
is to be able to either subscribe or wake up depending on readiness at
the moment of subscription.

This commit introduces two new functions at the control layer, which are
provided by the socket code, to check for FD readiness or subscribe to it
at the control layer. For now it's not used.
2020-12-11 17:02:50 +01:00
Willy Tarreau
2ded48dd27 MINOR: connection: make conn_sock_drain() use the control layer's ->drain()
Now we don't touch the fd anymore there, instead we rely on the ->drain()
provided by the control layer. As such the function was renamed to
conn_ctrl_drain().
2020-12-11 16:26:01 +01:00
Willy Tarreau
427c846cc9 MINOR: protocol: add a ->drain() function at the connection control layer
This is what we need to drain pending incoming data from an connection.
The code was taken from conn_sock_drain() without the connection-specific
stuff. It still takes a connection for now for API simplicity.
2020-12-11 16:26:00 +01:00
Willy Tarreau
586f71b43f REORG: connection: move the socket iocb (conn_fd_handler) to sock.c
conn_fd_handler() is 100% specific to socket code. It's about time
it moves to sock.c which manipulates socket FDs. With it comes
conn_fd_check() which tests for the socket's readiness. The ugly
connection status check at the end of the iocb was moved to an inlined
function in connection.h so that if we need it for other socket layers
it's not too hard to reuse.

The code was really only moved and not changed at all.
2020-12-11 16:26:00 +01:00
Willy Tarreau
827fee7406 MINOR: connection: remove sock-specific code from conn_sock_send()
The send() loop present in this function and the error handling is already
present in raw_sock_from_buf(). Let's rely on it instead and stop touching
the FD from this place. The send flag was changed to use a more agnostic
CO_SFL_*. The name was changed to "conn_ctrl_send()" to remind that it's
meant to be used to send at the lowest level.
2020-12-11 16:25:11 +01:00
Willy Tarreau
3a9e56478e CLEANUP: connection: remove the unneeded fd_stop_{recv,send} on read0/shutw
These are two other areas where this fd_stop_recv()/fd_stop_send() makes no
sense anymore. Both happen by definition while the FD is *not* subscribed,
since nowadays it's subscribed after failing recv()/send(), in which case
we cannot close.
2020-12-11 13:56:12 +01:00
Willy Tarreau
3ec094b09d CLEANUP: remove the unused fd_stop_send() in conn_xprt_shutw{,_hard}()
These functions used to disable polling for writes when shutting down
but this is no longer used as it still happens later when closing if the
connection was subscribed to FD events. Let's just remove this fake and
undesired dependency on the FD layer.
2020-12-11 13:49:19 +01:00
Amaury Denoyelle
8d22823ade MEDIUM: http_act: define set-timeout server/tunnel action
Add a new http-request action 'set-timeout [server/tunnel]'. This action
can be used to update the server or tunnel timeout of a stream. It takes
two parameters, the timeout name to update and the new timeout value.
This rule is only valid for a proxy with backend capabilities. The
timeout value cannot be null. A sample expression can also be used
instead of a plain value.
2020-12-11 12:01:07 +01:00
Amaury Denoyelle
fb50443517 MEDIUM: stream: support a dynamic tunnel timeout
Allow the modification of the tunnel timeout on the stream side.
Use a new field in the stream for the tunnel timeout. It is initialized
by the tunnel timeout from backend unless it has already been set by a
set-timeout tunnel rule.
2020-12-11 12:01:07 +01:00
Amaury Denoyelle
b715078821 MINOR: stream: prepare the hot refresh of timeouts
Define a stream function to allow to update the timeouts.
This commit is in preparation for the support of dynamic timeouts with
the set-timeout rule.
2020-12-11 12:01:07 +01:00
Amaury Denoyelle
5a9fc2d10f MINOR: action: define enum for timeout type of the set-timeout rule
This enum is used to specify the timeout targetted by a set-timeout
rule.
2020-12-11 12:01:07 +01:00
Willy Tarreau
343d0356a5 CLEANUP: connection: remove the unused conn_{stop,cond_update}_polling()
These functions are not used anymore and were quite confusing given that
their names reflected their original role and not the current ones. Let's
kill them before they inspire anyone.
2020-12-11 11:21:53 +01:00
Willy Tarreau
6aee5b9a4c MINOR: connection: implement cs_drain_and_close()
We had cs_close() which forces a CS_SHR_RESET mode on the read side,
and due to this there are a few call places in the checks which
perform a manual call to conn_sock_drain() before calling cs_close().
This is absurd by principle, and it can be counter-productive in the
case of a mux where this could even cause the opposite of the desired
effect by deleting pending frames on the socket before closing.

Let's add cs_drain_and_close() which uses the CS_SHR_DRAIN mode to
prepare this.
2020-12-11 11:04:51 +01:00
Willy Tarreau
29885f0308 MINOR: udp: export udp_suspend_receiver() and udp_resume_receiver()
QUIC will rely on UDP at the receiver level, and will need these functions
to suspend/resume the receivers. In the future, protocol chaining may
simplify this.
2020-12-08 18:10:18 +01:00
Willy Tarreau
c14e7ae744 MINOR: connection: use the control layer's init/close
In conn_ctrl_init() and conn_ctrl_close() we now use the control layer's
functions instead of manipulating the FD directly. This is safe since the
control layer is always present when done. Note that now we also adjust
the flag before calling the function to make things cleaner in case such
a layer would need to call the same functions again for any reason.
2020-12-08 15:53:45 +01:00
Willy Tarreau
de471c4655 MINOR: protocol: add a set of ctrl_init/ctrl_close methods for setup/teardown
Currnetly conn_ctrl_init() does an fd_insert() and conn_ctrl_close() does an
fd_delete(). These are the two only short-term obstacles against using a
non-fd handle to set up a connection. Let's have pur these into the protocol
layer, along with the other connection-level stuff so that the generic
connection code uses them instead. This will allow to define new ones for
other protocols (e.g. QUIC).

Since we only support regular sockets at the moment, the code was placed
into sock.c and shared with proto_tcp, proto_uxst and proto_sockpair.
2020-12-08 15:50:56 +01:00
Willy Tarreau
b366c9a59a CLEANUP: protocol: group protocol struct members by usage
For the sake of an improved readability, let's group the protocol
field members according to where they're supposed to be defined:
  - connection layer (note: for now even UDP needs one)
  - binding layer
  - address family
  - socket layer
Nothing else was changed.
2020-12-08 14:58:24 +01:00
Willy Tarreau
b9b2fd7cf4 MINOR: protocol: export protocol definitions
The various protocols were made static since there was no point in
exporting them in the past. Nowadays with QUIC relying on UDP we'll
significantly benefit from UDP being exported and more generally from
being able to declare some functions as being the same as other
protocols'.

In an ideal world it should not be these protocols which should be
exported, but the intermediary levels:
  - socket layer (sock.c only right now), already exported as functions
    but nothing structured at the moment ;
  - family layer (sock_inet, sock_unix, sockpair etc): already structured
    and exported
  - binding layer (the part that relies on the receiver): currently fused
    within the protocol
  - connectiong layer (the part that manipulates connections): currently
    fused within the protocol
  - protocol (connection's control): shouldn't need to be exposed
    ultimately once the elements above are in an easily sharable way.
2020-12-08 14:54:08 +01:00
Willy Tarreau
f9ad06cb26 MINOR: protocol: remove the redundant ->sock_domain field
This field used to be needed before commit 2b5e0d8b6 ("MEDIUM: proto_udp:
replace last AF_CUST_UDP* with AF_INET*") as it was used as a protocol
entry selector. Since this commit it's always equal to the socket family's
value so it's entirely redundant. Let's remove it now to simplify the
protocol definition a little bit.
2020-12-08 12:13:54 +01:00
Christopher Faulet
16df178b6e BUG/MEDIUM: stream: Xfer the input buffer to a fully created stream
The input buffer passed as argument to create a new stream must not be
transferred when the request channel is initialized because the channel
flags are not set at this stage. In addition, the API is a bit confusing
regarding the buffer owner when an error occurred. The caller remains the
owner, but reading the code it is not obvious.

So, first of all, to avoid any ambiguities, comments are added on the
calling chain to make it clear. The buffer owner is the caller if any error
occurred. And the ownership is transferred to the stream on success.

Then, to make things simple, the ownership is transferred at the end of
stream_new(), in case of success. And the input buffer is updated to point
on BUF_NULL. Thus, in all cases, if the caller try to release it calling
b_free() on it, it is not a problem. Of course, it remains the caller
responsibility to release it on error.

The patch fixes a bug introduced by the commit 26256f86e ("MINOR: stream:
Pass an optional input buffer when a stream is created"). No backport is
needed.
2020-12-04 17:15:03 +01:00
Willy Tarreau
d1f250f87b MINOR: listener: now use a generic add_listener() function
With the removal of the family-specific port setting, all protocol had
exactly the same implementation of ->add(). A generic one was created
with the name "default_add_listener" so that all other ones can now be
removed. The API was slightly adjusted so that the protocol and the
listener are passed instead of the listener and the port.

Note that all protocols continue to provide this ->add() method instead
of routinely calling default_add_listener() from create_listeners(). This
makes sure that any non-standard protocol will still be able to intercept
the listener addition if needed.

This could be backported to 2.3 along with the few previous patches on
listners as a pure code cleanup.
2020-12-04 15:08:00 +01:00
Willy Tarreau
73bed9ff13 MINOR: protocol: add a ->set_port() helper to address families
At various places we need to set a port on an IPv4 or IPv6 address, and
it requires casts that are easy to get wrong. Let's add a new set_port()
helper to the address family to assist in this. It will be directly
accessible from the protocol and will make the operation seamless.
Right now this is only implemented for sock_inet as other families do
not need a port.
2020-12-04 15:08:00 +01:00
Christopher Faulet
6ad06066cd CLEANUP: connection: Remove CS_FL_READ_PARTIAL flag
Since the recent refactoring of the H1 multiplexer, this flag is no more
used. Thus it is removed.
2020-12-04 14:41:49 +01:00
Christopher Faulet
da831fa068 CLEANUP: http-ana: Remove TX_WAIT_NEXT_RQ unsued flag
This flags is now unused. It was used in REQ_WAIT_HTTP analyser, when a
stream was waiting for a request, to set the keep-alive timeout or to avoid
to send HTTP errors to client.
2020-12-04 14:41:49 +01:00
Christopher Faulet
2afd874704 CLEANUP: htx: Remove HTX_FL_UPGRADE unsued flag
Now the H1 to H2 upgrade is handled before the stream
creation. HTX_FL_UPGRADE flag is now unused.
2020-12-04 14:41:49 +01:00
Christopher Faulet
4c8ad84232 MINOR: mux: Add a ctl parameter to get the exit status of the multiplexers
The ctl param MUX_EXIT_STATUS can be request to get the exit status of a
multiplexer. For instance, it may be an HTTP status code or an H2 error. For
now, 0 is always returned. When the mux h1 will be able to return HTTP
errors itself, this ctl param will be used to get the HTTP status code from
the logs.

the mux_exit_status enum has been created to map internal mux exist status
to generic one. Thus there is 5 possible status for now: success, invalid
error, timeout error, internal error and unknown.
2020-12-04 14:41:49 +01:00
Christopher Faulet
7d0c19e82d MINOR: session: Add functions to increase http values of tracked counters
cumulative numbers of http request and http errors of counters tracked at
the session level and their rates can now be updated at the session level
thanks to two new functions. These functions are not used for now, but it
will be called to keep tracked counters up-to-date if an error occurs before
the stream creation.
2020-12-04 14:41:49 +01:00
Christopher Faulet
84600631cd MINOR: stick-tables: Add functions to update some values of a tracked counter
The cumulative numbers of http requests, http errors, bytes received and
sent and their respective rates for a tracked counters are now updated using
specific stream independent functions. These functions are used by the
stream but the aim is to allow the session to do so too. For now, there is
no reason to perform these updates from the session, except from the mux-h2
maybe. But, the mux-h1, on the frontend side, will be able to return some
errors to the client, before the stream creation. In this case, it will be
mandatory to update counters tracked at the session level.
2020-12-04 14:41:49 +01:00
Christopher Faulet
26256f86e1 MINOR: stream: Pass an optional input buffer when a stream is created
It is now possible to set the buffer used by the channel request buffer when
a stream is created. It may be useful if input data are already received,
instead of waiting the first call to the mux rcv_buf() callback. This change
is mandatory to support H1 connection with no stream attached.

For now, the multiplexers don't pass any buffer. BUF_NULL is thus used to
call stream_create_from_cs().
2020-12-04 14:41:48 +01:00
Christopher Faulet
afc02a4436 MINOR: muxes: Remove get_cs_info callback function now useless
This callback function was only defined by the mux-h1. But it has been
removed in the previous commit because it is unused now. So, we can do a
step forward removing the callback function from the mux definition and the
cs_info structure.
2020-12-04 14:41:48 +01:00
Christopher Faulet
d517396f8e MINOR: session: Add the idle duration field into the session
The idle duration between two streams is added to the session structure. It
is not necessarily pertinent on all protocols. In fact, it is only defined
for H1 connections. It is the duration between two H1 transactions. But the
.get_cs_info() callback function on the multiplexers only exists because
this duration is missing at the session level. So it is a simplification
opportunity for a really low cost.

To reduce the cost, a hole in the session structure is filled by moving
.srv_list field at the end of the structure.
2020-12-04 14:41:48 +01:00
Thierry Fournier
c749259dff MINOR: lua-thread: Store each function reference and init reference in array
The goal is to allow execution of one main lua state per thread.

The array introduces storage of one reference per thread, because each
lua state can have different reference id for a same function. A function
returns the preferred state id according to configuration and current
thread id.
2020-12-02 21:53:16 +01:00
Thierry Fournier
021d986ecc MINOR: lua-thread: Replace state_from by state_id
The goal is to allow execution of one main lua state per thread.

"state_from" is a pointer to the parent lua state. "state_id"
is the index of the parent state id in the reference lua states
array. "state_id" is better because the lock is a "== 0" test
which is quick than pointer comparison. In other way, the state_id
index could index other things the the Lua state concerned. I
think to the function references.
2020-12-02 21:53:16 +01:00
Thierry Fournier
62a22aa23f MINOR: lua-thread: Replace "struct hlua_function" allocation by dedicated function
The goal is to allow execution of one main lua state per thread.

This function will initialize the struct with other things than 0.
With this function helper, the initialization is centralized and
it prevents mistakes. This patch also keeps a reference to each
declared function in a list. It will be useful in next patches to
control consistency of declared references.
2020-12-02 21:53:16 +01:00
Thierry Fournier
75fc02956b MINOR: lua-thread: make hlua_ctx_init() get L from its caller
The goal is to allow execution of one main lua state per thread.

The function hlua_ctx_init() now gets the original lua state from
its caller. This allows the initialisation of lua_thread (coroutines)
from any master lua state.

The parent lua state is stored in the hlua struct.

This patch is a temporary transition, it will be modified later.
2020-12-02 21:53:16 +01:00
Thierry Fournier
ad5345fed7 MINOR: lua-thread: Replace embedded struct hlua_function by a pointer
The goal is to allow execution of one main lua state per thread.

Because this struct will be filled after the configuration parser, we
cannot copy the content. The actual state of the Haproxy code doesn't
justify this change, it is an update preparing next steps.
2020-12-02 21:53:16 +01:00
Thierry Fournier
a51a1fd174 MINOR: cli: add a function to look up a CLI service description
This function will be useful to check if the keyword is already registered.
Also add a define for the max number of args.

This will be needed by a next patch to fix a bug and will have to be
backported.
2020-12-02 09:45:18 +01:00
Thierry Fournier
87e539906b MINOR: actions: add a function returning a service pointer from its name
This function simply calls action_lookup() on the private service_keywords,
to look up a service name. This will be used to detect double registration
of a same service from Lua.

This will be needed by a next patch to fix a bug and will have to be
backported.
2020-12-02 09:45:18 +01:00
Thierry Fournier
7a71a6d9d2 MINOR: actions: Export actions lookup functions
These functions will be useful to check if a keyword is already registered.
This will be needed by a next patch to fix a bug, and will need to be
backported.
2020-12-02 09:45:18 +01:00
Willy Tarreau
a1f12746b1 MINOR: traces: add a new level "error" below the "user" level
Sometimes it would be nice to be able to only trace abnormal events such
as protocol errors. Let's add a new "error" level below the "user" level
for this. This will allow to add TRACE_ERROR() at various error points
and only see them.
2020-12-01 10:25:20 +01:00
Maciej Zdeb
fcdfd857b3 MINOR: log: Logging HTTP path only with %HPO
This patch adds a new logging variable '%HPO' for logging HTTP path only
(without query string) from relative or absolute URI.

For example:
log-format "hpo=%HPO hp=%HP hu=%HU hq=%HQ"

GET /r/1 HTTP/1.1
=>
hpo=/r/1 hp=/r/1 hu=/r/1 hq=

GET /r/2?q=2 HTTP/1.1
=>
hpo=/r/2 hp=/r/2 hu=/r/2?q=2 hq=?q=2

GET http://host/r/3 HTTP/1.1
=>
hpo=/r/3 hp=http://host/r/3 hu=http://host/r/3 hq=

GET http://host/r/4?q=4 HTTP/1.1
=>
hpo=/r/4 hp=http://host/r/4 hu=http://host/r/4?q=4 hq=?q=4
2020-12-01 09:32:44 +01:00
Emeric Brun
0237c4e3f5 BUG/MEDIUM: local log format regression.
Since 2.3 default local log format always adds hostame field.
This behavior change was due to log/sink re-work, because according
to rfc3164 the hostname field is mandatory.

This patch re-introduce a legacy "local" format which is analog
to rfc3164 but with hostname stripped. This is the new
default if logs are generated by haproxy.

To stay compliant with previous configurations, the option
"log-send-hostname" acts as if the default format is switched
to rfc3164.

This patch addresses the github issue #963

This patch should be backported in branches >= 2.3.
2020-12-01 06:58:42 +01:00
Willy Tarreau
4d6c594998 BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link
In issue #958 Ashley Penney reported intermittent crashes on AWS's ARM
nodes which would not happen on x86 nodes. After investigation it turned
out that the Neoverse N1 CPU cores used in the Graviton2 CPU are much
more aggressive than the usual Cortex A53/A72/A55 or any x86 regarding
memory ordering.

The issue that was triggered there is that if a tasklet_wakeup() call
is made on a tasklet scheduled to run on a foreign thread and that
tasklet is just being dequeued to be processed, there can be a race at
two places:
  - if MT_LIST_TRY_ADDQ() happens between MT_LIST_BEHEAD() and
    LIST_SPLICE_END_DETACHED() if the tasklet is alone in the list,
    because the emptiness tests matches ;

  - if MT_LIST_TRY_ADDQ() happens during LIST_DEL_INIT() in
    run_tasks_from_lists(), then depending on how LIST_DEL_INIT() ends
    up being implemented, it may even corrupt the adjacent nodes while
    they're being reused for the in-tree storage.

This issue was introduced in 2.2 when support for waking up remote
tasklets was added. Initially the attachment of a tasklet to a list
was enough to know its status and this used to be stable information.
Now it's not sufficient to rely on this anymore, thus we need to use
a different information.

This patch solves this by adding a new task flag, TASK_IN_LIST, which
is atomically set before attaching a tasklet to a list, and is only
removed after the tasklet is detached from a list. It is checked
by tasklet_wakeup_on() so that it may only be done while the tasklet
is out of any list, and is cleared during the state switch when calling
the tasklet. Note that the flag is not set for pure tasks as it's not
needed.

However this introduces a new special case: the function
tasklet_remove_from_tasklet_list() needs to keep both states in sync
and cannot check both the state and the attachment to a list at the
same time. This function is already limited to being used by the thread
owning the tasklet, so in this case the test remains reliable. However,
just like its predecessors, this function is wrong by design and it
should probably be replaced with a stricter one, a lazy one, or be
totally removed (it's only used in checks to avoid calling a possibly
scheduled event, and when freeing a tasklet). Regardless, for now the
function exists so the flag is removed only if the deletion could be
done, which covers all cases we're interested in regarding the insertion.
This removal is safe against a concurrent tasklet_wakeup_on() since
MT_LIST_DEL() guarantees the atomic test, and will ultimately clear
the flag only if the task could be deleted, so the flag will always
reflect the last state.

This should be carefully be backported as far as 2.2 after some
observation period. This patch depends on previous patch
"MINOR: task: remove __tasklet_remove_from_tasklet_list()".
2020-11-30 18:17:59 +01:00
Willy Tarreau
2da4c316c2 MINOR: task: remove __tasklet_remove_from_tasklet_list()
This function is only used at a single place directly within the
scheduler in run_tasks_from_lists() and it really ought not be called
by anything else, regardless of what its comment says. Let's delete
it, move the two lines directly into the call place, and take this
opportunity to factor the atomic decrement on tasks_run_queue. A comment
was added on the remaining one tasklet_remove_from_tasklet_list() to
mention the risks in using it.
2020-11-30 18:17:44 +01:00
Willy Tarreau
a868c2920b MINOR: task: remove tasklet_insert_into_tasklet_list()
This function is only called at a single place and adds more confusion
than it removes. It also makes one think it could be used outside of
the scheduler while it must absolutely not. Let's just move its two
lines to the call place, making the code more readable there. In
addition this clearly shows that the preliminary LIST_INIT() is
useless since the entry is immediately overwritten.
2020-11-30 18:17:44 +01:00
Olivier Houchard
1f05324cbe BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
In MT_LIST_TRY_ADDQ() and MT_LIST_TRY_ADD() we can't just check if the
element is already in a list, because there's a small race condition, it
could be added  between the time we checked, and the time we actually set
its next and prev, so we have to lock it first.

This is required to address issue #958.

This should be backported to 2.3, 2.2 and 2.1.
2020-11-30 18:17:29 +01:00
Your Name
1e237d037b MINOR: plock: use an ARMv8 instruction barrier for the pause instruction
As suggested by @AGSaidi in issue #958, on ARMv8 its convenient to use
an "isb" instruction in pl_cpu_relax() to improve fairness. Without it
I've met a few watchdog conditions on valid locks with 16 threads,
indicating that some threads couldn't manage to get it in 2 seconds. I
never happened again with it. In addition, the performance increased
by slightly more than 5% thanks to the reduced contention.

This should be backported as far as 2.2, possibly even 2.0.
2020-11-29 14:53:33 +01:00
Christopher Faulet
97b7bdfcf7 REORG: tcpcheck: Move check option parsing functions based on tcp-check
The parsing of the check options based on tcp-check rules (redis, spop,
smtp, http...) are moved aways from check.c. Now, these functions are placed
in tcpcheck.c. These functions are only related to the tcpcheck ruleset
configured on a proxy and not to the health-check attached to a server.
2020-11-27 10:30:23 +01:00
Christopher Faulet
bb9fb8b7f8 MINOR: config: Deprecate and ignore tune.chksize global option
This option is now ignored because I/O check buffers are now allocated using the
buffer pool. Thus, it is marked as deprecated in the documentation and ignored
during the configuration parsing. The field is also removed from the global
structure.

Because this option is ignored since a recent fix, backported as fare as 2.2,
this patch should be backported too. Especially because it updates the
documentation.
2020-11-27 10:30:23 +01:00
Christopher Faulet
b381a505c1 BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool
Historically, the input and output buffers of a check are allocated by hand
during the startup, with a specific size (not necessarily the same than
other buffers). But since the recent refactoring of the checks to rely
exclusively on the tcp-checks and to use the underlying mux layer, this part
is totally buggy. Indeed, because these buffers are now passed to a mux,
they maybe be swapped if a zero-copy is possible. In fact, for now it is
only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2
health-check is performed. But, it is a latent bug for other muxes.

Another problem is the size of these buffers. because it may differ for the
other buffer size, it might be source of bugs.

Finally, for configurations with hundreds of thousands of servers, having 2
buffers per check always allocated may be an issue.

To fix the bug, we now allocate these buffers when required using the buffer
pool. Thus not-running checks don't waste memory and muxes may swap them if
possible. The only drawback is the check buffers have now always the same
size than buffers used by the streams. This deprecates indirectly the
"tune.chksize" global option.

In addition, the http-check regtest have been update to perform some h2
health-checks.

Many thanks to @VigneshSP94 for its help on this bug.

This patch should solve the issue #936. It relies on the commit "MINOR:
tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main".
Both must be backport as far as 2.2.

bla
2020-11-27 10:29:41 +01:00
Remi Tricot-Le Breton
3d08236cb3 MINOR: cache: Prepare helper functions for Vary support
The Vary functionality is based on a secondary key that needs to be
calculated for every request to which a server answers with a Vary
header. The Vary header, which can only be found in server responses,
determines which headers of the request need to be taken into account in
the secondary key. Since we do not want to have to store all the headers
of the request until we have the response, we will pre-calculate as many
sub-hashes as there are headers that we want to manage in a Vary
context. We will only focus on a subset of headers which are likely to
be mentioned in a Vary response (accept-encoding and referer for now).
Every managed header will have its own normalization function which is
in charge of transforming the header value into a core representation,
more robust to insignificant changes that could exist between multiple
clients. For instance, two accept-encoding values mentioning the same
encodings but in different orders should give the same hash.
This patch adds a function that parses a Vary header value and checks if
all the values belong to our supported subset. It also adds the
normalization functions for our two headers, as well as utility
functions that can prebuild a secondary key for a given request and
transform it into an actual secondary key after the vary signature is
determined from the response.
2020-11-24 16:52:57 +01:00
Christopher Faulet
401e6dbff3 BUG/MAJOR: filters: Always keep all offsets up to date during data filtering
When at least one data filter is registered on a channel, the offsets of all
filters must be kept up to date. For data filters but also for others. It is
safer to do it in that way. Indirectly, this patch fixes 2 hidden bugs
revealed by the commit 22fca1f2c ("BUG/MEDIUM: filters: Forward all filtered
data at the end of http filtering").

The first one, the worst of both, happens at the end of http filtering when
at least one data filtered is registered on the channel. We call the
http_end() callback function on the filters, when defined, to finish the
http filtering. But it is performed for all filters. Before the commit
22fca1f2c, the only risk was to call the http_end() callback function
unexpectedly on a filter. Now, we may have an overflow on the offset
variable, used at the end to forward all filtered data. Of course, from the
moment we forward an arbitrary huge amount of data, all kinds of bad things
may happen. So offset computation is performed for all filters and
http_end() callback function is called only for data filters.

The other one happens when a data filter alter the data of a channel, it
must update the offsets of all previous filters. But the offset of non-data
filters must be up to date, otherwise, here too we may have an integer
overflow.

Another way to fix these bugs is to always ignore non-data filters from the
offsets computation. But this patch is safer and probably easier to
maintain.

This patch must be backported in all versions where the above commit is. So
as far as 2.0.
2020-11-24 14:17:32 +01:00
Ilya Shipitsin
5bfe66366c BUILD: SSL: do not "update" BoringSSL version equivalent anymore
we have added all required fine guarding, no need to reduce
BoringSSL version back to 1.1.0 anymore, we do not depend on it
2020-11-24 09:54:44 +01:00
Ilya Shipitsin
f04a89c549 CLEANUP: remove unused function "ssl_sock_is_ckch_valid"
"ssl_sock_is_ckch_valid" is not used anymore, let us remove it
2020-11-24 09:54:44 +01:00
Julien Pivotto
2de240a676 MINOR: stream: Add level 7 retries on http error 401, 403
Level-7 retries are only possible with a restricted number of HTTP
return codes. While it is usually not safe to retry on 401 and 403, I
came up with an authentication backend which was not synchronizing
authentication of users. While not perfect, being allowed to also retry
on those return codes is really helpful and acts as a hotfix until we
can fix the backend.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-23 09:33:14 +01:00
Maciej Zdeb
ebdd4c55da MINOR: http_act: Add -m flag for del-header name matching method
This patch adds -m flag which allows to specify header name
matching method when deleting headers from http request/response.
Currently beg, end, sub, str and reg are supported.

This is related to GitHub issue #909
2020-11-21 15:54:30 +01:00
Willy Tarreau
3aab17bd56 BUG/MAJOR: connection: reset conn->owner when detaching from session list
Baptiste reported a new crash affecting 2.3 which can be triggered
when using H2 on the backend, with http-reuse always and with a tens
of clients doing close only. There are a few combined cases which cause
this to happen, but each time the issue is the same, an already freed
session is dereferenced in session_unown_conn().

Two cases were identified to cause this:
  - a connection referencing a session as its owner, which is detached
    from the session's list and is destroyed after this session ends.
    The test on conn->owner before calling session_unown_conn() is not
    sufficent as the pointer is not null but is not valid anymore.

  - a connection that never goes idle and that gets killed form the
    mux, where session_free() is called first, then conn_free() calls
    session_unown_conn() which scans the just freed session for older
    connections. This one is only triggered with DEBUG_UAF

The reason for this session to be present here is that it's needed during
the connection setup, to be passed to conn_install_mux_be() to mux->init()
as the owning session, but it's never deleted aftrewards. Furthermore, even
conn_session_free() doesn't delete this pointer after freeing the session
that lies there. Both do definitely result in a use-after-free that's more
easily triggered under DEBUG_UAF.

This patch makes sure that the owner is always deleted after detaching
or killing the session. However it is currently not possible to clear
the owner right after a synchronous init because the proxy protocol
apparently needs it (a reg test checks this), and if we leave it past
the connection setup with the session not attached anywhere, it's hard
to catch the right moment to detach it. This means that the session may
remain in conn->owner as long as the connection has never been added to
nor removed from the session's idle list. Given that this patch needs to
remain simple enough to be backported, instead it adds a workaround in
session_unown_conn() to detect that the element is already not attached
anywhere.

This fix absolutely requires previous patch "CLEANUP: connection: do not
use conn->owner when the session is known" otherwise the situation will
be even worse, as some places used to rely on conn->owner instead of the
session.

The fix could theorically be backported as far as 1.8. However, the code
in this area has significantly changed along versions and there are more
risks of breaking working stuff than fixing real issues there. The issue
was really woken up in two steps during 2.3-dev when slightly reworking
the idle conns with commit 08016ab82 ("MEDIUM: connection: Add private
connections synchronously in session server list") and when adding
support for storing used H2 connections in the session and adding the
necessary call to session_unown_conn() in the muxes. But the same test
managed to crash 2.2 when built in DEBUG_UAF and patched like this,
proving that we used to already leave dangling pointers behind us:

|  diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h
|  index f8f235c1a..dd30b5f80 100644
|  --- a/include/haproxy/connection.h
|  +++ b/include/haproxy/connection.h
|  @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn)
|                          sess->idle_conns--;
|                  session_unown_conn(sess, conn);
|          }
|  +       else {
|  +               struct session *sess = conn->owner;
|  +               BUG_ON(sess && sess->origin != &conn->obj_type);
|  +       }
|
|          sockaddr_free(&conn->src);
|          sockaddr_free(&conn->dst);

It's uncertain whether an existing code path there can lead to dereferencing
conn->owner when it's bad, though certain suspicious memory corruption bugs
make one think it's a likely candidate. The patch should not be hard to
adapt there.

Backports to 2.1 and older are left to the appreciation of the person
doing the backport.

A reproducer consists in this:

  global
    nbthread 1

  listen l
    bind :9000
    mode http
    http-reuse always
    server s 127.0.0.1:8999 proto h2

  frontend f
    bind :8999 proto h2
    mode http
    http-request return status 200

Then this will make it crash within 2-3 seconds:

  $ h1load -e -r 1 -c 10 http://0:9000/

If it does not, it might be that DEBUG_UAF was not used (it's harder then)
and it might be useful to restart.
2020-11-21 15:29:22 +01:00
Willy Tarreau
38b4d2eb22 CLEANUP: connection: do not use conn->owner when the session is known
At a few places we used to rely on conn->owner to retrieve the session
while the session is already known. This is not correct because at some
of these points the reason the connection's owner was still the session
(instead of NULL) is a mistake. At one place a comparison is even made
between the session and conn->owner assuming it's valid without checking
if it's NULL. Let's clean this up to use the session all the time.

Note that this will be needed for a forthcoming fix and will have to be
backported.
2020-11-21 15:29:22 +01:00
Ilya Shipitsin
f34ed0b74c BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES
HAVE_SSL_CTX_SET_CIPHERSUITES is newly defined macro set in openssl-compat.h,
which helps to identify ssl libs (currently OpenSSL-1.1.1 only) that supports
TLS13 cipersuites manipulation on TLS13 context
2020-11-21 11:04:36 +01:00
Ilya Shipitsin
bdec3ba796 BUILD: ssl: use SSL_MODE_ASYNC macro instead of OPENSSL_VERSION 2020-11-19 19:59:32 +01:00
William Dauchy
f63704488e MEDIUM: cli/ssl: configure ssl on server at runtime
in the context of a progressive backend migration, we want to be able to
activate SSL on outgoing connections to the server at runtime without
reloading.
This patch adds a `set server ssl` command; in order to allow that:

- add `srv_use_ssl` to `show servers state` command for compatibility,
  also update associated parsing
- when using default-server ssl setting, and `no-ssl` on server line,
  init SSL ctx without activating it
- when triggering ssl API, de/activate SSL connections as requested
- clean ongoing connections as it is done for addr/port changes, without
  checking prior server state

example config:

backend be_foo
  default-server ssl
  server srv0 127.0.0.1:6011 weight 1 no-ssl

show servers state:

  5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1

where srv0 can switch to ssl later during the runtime:

  set server be_foo/srv0 ssl on

  5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1

Also update existing tests and create a new one.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2020-11-18 17:22:28 +01:00
Christopher Faulet
83fefbcdff MINOR: init: Fix the prototype for per-thread free callbacks
Functions registered to release memory per-thread have no return value. But the
registering function and the function pointer in per_thread_free_fct structure
specify it should return an integer. This patch fixes it.

This patch may be backported as far as 2.0.
2020-11-13 16:26:10 +01:00
Amaury Denoyelle
7f8f6cb926 BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one
Define a per-thread counters allocated with the greatest size of any
stat module counters. This variable is named trash_counters.

When using a proxy without allocated counters, return the trash counters
from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent
segfault.

This is useful for all the proxies used internally and not
belonging to the global proxy list. As these objects does not appears on
the stat report, it does not matter to use the dummy counters.

For this fix to be functional, the extra counters are explicitly
initialized to NULL on proxy/server/listener init functions.

Most notably, the crash has already been detected with the following
vtc:
- reg-tests/lua/txn_get_priv.vtc
- reg-tests/peers/tls_basic_sync.vtc
- reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc
There is probably other parts that may be impacted (SPOE for example).

This bug was introduced in the current release and do not need to be
backported. The faulty commits are
"MINOR: ssl: count client hello for stats" and
"MINOR: ssl: add counters for ssl sessions".
2020-11-12 15:16:05 +01:00
Remi Tricot-Le Breton
cc9bf2e5fe MEDIUM: cache: Change caching conditions
Do not cache responses that do not have an explicit expiration time
(s-maxage or max-age Cache-Control directives or Expires header) or a
validator (ETag or Last-Modified headers) anymore, as suggested in
RFC 7234#3.
The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as
not to change the behavior of the checkcache option.
2020-11-12 11:22:05 +01:00
Christopher Faulet
a66adf41ea MINOR: http-htx: Add understandable errors for the errorfiles parsing
No details are provided when an error occurs during the parsing of an errorfile,
Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an
understandable error message is reported.

This patch is not a bug fix in itself. But it will be required to change an
fatal error into a warning in last stable releases. Thus it must be backported
as far as 2.0.
2020-11-06 09:13:58 +01:00
Willy Tarreau
38d41996c1 MEDIUM: pattern: turn the pattern chaining to single-linked list
It does not require heavy deletion from the expr anymore, so we can now
turn this to a single-linked list since most of the time we want to delete
all instances of a given pattern from the head. By doing so we save 32 bytes
of memory per pattern. The pat_unlink_from_head() function was adjusted
accordingly.
2020-11-05 19:27:09 +01:00
Willy Tarreau
94b9abe200 MINOR: pattern: add pat_ref_purge_older() to purge old entries
This function will be usable to purge at most a specified number of old
entries from a reference. Entries are declared old if their generation
number is in the past compared to the one passed in argument. This will
ease removal of early entries when new ones have been appended.

We also call malloc_trim() when available, at the end of the series,
because this is one place where there is a lot of memory to save. Reloads
of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS
after 10 reloads and roughly stabilize there without this call, versus
only 260 MB when the call is present. Sadly there is no direct equivalent
for jemalloc, which stabilizes around 800MB-1GB.
2020-11-05 19:27:09 +01:00
Willy Tarreau
1a6857b9c1 MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation
pat_ref_load() basically combines pat_ref_append() and pat_ref_commit().
It's very similar to pat_ref_add() except that it also allows to set the
generation ID and the line number. pat_ref_add() was modified to directly
rely on it to avoid code duplication. Note that a previous declaration
of pat_ref_load() was removed as it was just a leftover of an earlier
incarnation of something possibly similar, so no existing functionality
was changed here.
2020-11-05 19:27:09 +01:00
Willy Tarreau
0439e5eeb4 MINOR: pattern: add pat_ref_commit() to commit a previously inserted element
This function will be used after a successful pat_ref_append() to propagate
the pattern to all use places (including parsing and indexing). On failure,
it will entirely roll back all insertions and free the pattern itself. It
also preserves the generation number so that it is convenient for use in
association with pat_ref_append(). pat_ref_add() was modified to rely on
it instead of open-coding the insertion and roll-back.
2020-11-05 19:27:09 +01:00
Willy Tarreau
29947745b5 MINOR: pattern: store a generation number in the reference patterns
Right now it's not possible to perform a safe reload because we don't
know what patterns were recently added or were already present. This
patch adds a generation counter to the reference patterns so that it
is possible to know what generation of the reference they were loaded
with. A reference now has two generations, the current one, used for
all additions, and the next one, allocated to those wishing to update
the contents. The generation wraps at 2^32 so comparisons must be made
relative to the current position.

The idea will be that upon full reload, the caller will first get a new
generation ID, will insert all new patterns using it, will then switch
the current ID to the new one, and will delete all entries older than
the current ID. This has the benefit of supporting chunked updates that
remain consistent and that won't block the whole process for ages like
pat_ref_reload() currently does.
2020-11-05 19:27:09 +01:00
Willy Tarreau
1fd52f70e5 MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference
Till now the only way to remove a known reference was via
pat_ref_delete_by_id() which scans the whole list to find a matching pointer.
Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be
called by the function above after the pointer is found, and can also be
used to roll back a failed insertion much more efficiently.
2020-11-05 19:27:09 +01:00
Willy Tarreau
a98b2882ac CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete()
These ones are not used anymore, so let's remove them to remove a bit
of the complexity. The ACL keyword's delete() function could be removed
as well, though most keyword declarations are positional and we have a
high risk of introducing a mistake here, so let's not touch the ACL part.
2020-11-05 19:27:09 +01:00
Willy Tarreau
f1c0892aa6 MINOR: pattern: remerge the list and tree deletion functions
pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal
with remaining cases, so let's complete the merge and have a generic
pattern deletion function acting on the reference and taking care of
reliably removing all elements.
2020-11-05 19:27:09 +01:00
Willy Tarreau
78777ead32 MEDIUM: pattern: change the pat_del_* functions to delete from the references
This is the next step in speeding up entry removal. Now we don't scan
the whole lists or trees for elements pointing to the target reference,
instead we start from the reference and delete all linked patterns.

This simplifies some delete functions since we don't need anymore to
delete multiple times from an expression since all nodes appear after
the reference element. We can now have one generic list and one generic
tree deletion function.

This required the replacement of pattern_delete() with an open-coded
version since we now need to lock all expressions first before proceeding.
This means there is a high risk of lock inversion here but given that the
expressions are always scanned in the same order from the same head, this
must not happen.

Now deleting first entries is instantaneous, and it's still slow to
delete the last ones when looking up their ID since it still requires
to look them up by a full scan, but it's already way faster than
previously. Typically removing the last 10 IP from a 20M entries ACL
with a full-scan each took less than 2 seconds.

It would be technically possible to make use of indexed entries to
speed up most lookups for removal by value (e.g. IP addresses) but
that's for later.
2020-11-05 19:27:09 +01:00
Willy Tarreau
4bdd0a13d6 MEDIUM: pattern: link all final elements from the reference
There is a data model issue in the current pattern design that makes
pattern deletion extremely expensive: there's no direct way from a
reference to access all indexed occurrences. As such, the only way
to remove all indexed entries corresponding to a reference update
is to scan all expressions's lists and trees to find a link to the
reference. While this was possibly OK when map removal was not
common and most maps were small, this is not conceivable anymore
with GeoIP maps containing 10M+ entries and del-map operations that
are triggered from http-request rulesets.

This patch introduces two list heads from the pattern reference, one
for the objects linked by lists and one for those linked by tree node.
Ideally a single list would be enough but the linked elements are too
much unrelated to be distinguished at the moment, so we'll need two
lists. However for the long term a single-linked list will suffice but
for now it's not possible due to the way elements are removed from
expressions. As such this patch adds 32 bytes of memory usage per
reference plus 16 per indexed entry, but both will be cut in half
later.

The links are not yet used for deletion, this patch only ensures the
list is always consistent.
2020-11-05 19:27:09 +01:00
Willy Tarreau
6d8a68914e MINOR: pattern: make the delete and prune functions more generic
Now we have a single prune() function to act on an expression, and one
delete function for the lists and one for the trees. The presence of a
pointer in the lists is enough to warrant a free, and we rely on the
PAT_SF_REGFREE flag to decide whether to free using free() or regfree().
2020-11-05 19:27:09 +01:00
Willy Tarreau
9b5c8bbc89 MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed
Currently we have no way to know how to delete/prune a pattern in a
generic way. A pattern doesn't contain its own type so we don't know
what function to call. Tree nodes are roughly OK but not lists where
regex are possible. Let's add one new bit for sflags at index time to
indicate that regex_free() will be needed upon deletion. It's not used
for now.
2020-11-05 19:27:08 +01:00
Willy Tarreau
3ee0de1b41 MINOR: pattern: move the update revision to the pat_ref, not the expression
It's not possible to uniquely update a single expression without updating
the pattern reference, I don't know why we've put the revision in the
expression back then, given that it in fact provides an update for a
full pattern. Let's move the revision into the reference's head instead.
2020-11-05 19:27:08 +01:00
Willy Tarreau
1d3c7003d9 MINOR: compat: automatically include malloc.h on glibc
This is in order to access malloc_trim() which is convenient after
clearing huge maps to reclaim memory. When this is detected, we also
define HA_HAVE_MALLOC_TRIM.
2020-11-05 19:27:08 +01:00
Baptiste Assmann
e279ca6bbe MINOR: sample: Add converts to parses MQTT messages
This patch implements a couple of converters to validate and extract data from a
MQTT (Message Queuing Telemetry Transport) message. The validation consists of a
few checks as well as "packet size" validation. The extraction can get any field
from the variable header and the payload.

This is limited to CONNECT and CONNACK packet types only. All other messages are
considered as invalid. It is not a problem for now because only the first packet
on each side can be parsed (CONNECT for the client and CONNACK for the server).

MQTT 3.1.1 and 5.0 are supported.

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
2020-11-05 19:27:03 +01:00
Baptiste Assmann
e138dda1e0 MINOR: sample: Add converters to parse FIX messages
This patch implements a couple of converters to validate and extract tag value
from a FIX (Financial Information eXchange) message. The validation consists in
a few checks such as mandatory fields and checksum computation. The extraction
can get any tag value based on a tag string or tag id.

This patch requires the istend() function. Thus it depends on "MINOR: ist: Add
istend() function to return a pointer to the end of the string".

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>
2020-11-05 19:26:30 +01:00
Christopher Faulet
cf26623780 MINOR: ist: Add istend() function to return a pointer to the end of the string
istend() is a shortcut to istptr() + istlen().
2020-11-05 19:25:12 +01:00
Willy Tarreau
1db5579bf8 [RELEASE] Released version 2.4-dev0
Released version 2.4-dev0 with the following main changes :
    - MINOR: version: it's development again.
    - DOC: mention in INSTALL that it's development again
2020-11-05 17:20:35 +01:00
Willy Tarreau
b9b2ac20f8 MINOR: version: it's development again.
This reverts commit 0badabc381.
2020-11-05 17:18:49 +01:00
Willy Tarreau
0badabc381 MINOR: version: mention that it's stable now
This version will be maintained up to around Q1 2022.
2020-11-05 17:00:50 +01:00
Ilya Shipitsin
0aa8c29460 BUILD: ssl: use feature macros for detecting ec curves manipulation support
Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in
openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL),
for feature detection instead of versions.
2020-11-05 15:08:41 +01:00
Willy Tarreau
5b8af1e30c MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL
OpenSSL 1.0.2 and onwards define SSL_CTX_set1_curves_list which is both a
function and a macro. OpenSSL 1.0.2 to 1.1.0 define SSL_CTRL_SET_CURVES_LIST
as a macro, which disappeared from 1.1.1. BoringSSL only has that one and
not the former macro but it does have the function. Let's keep the test on
the macro matching the function name by defining the macro to itself when
needed.
2020-11-05 15:05:09 +01:00
Willy Tarreau
7e98e28eb0 MINOR: fd: add fd_want_recv_safe()
This does the same as fd_want_recv() except that it does check for
fd_updt[] to be allocated, as this may be called during early listener
initialization. Previously we used to check fd_updt[] before calling
fd_want_recv() but this is not correct since it does not update the
FD flags. This method will be safer.
2020-11-04 14:22:42 +01:00
Willy Tarreau
9dd7f4fb4b MINOR: debug: don't count free(NULL) in memstats
The mem stats are pretty convenient to spot leaks, except that they count
free(NULL) as 1, and the code does actually have quite a number of free(foo)
guards where foo is NULL if the object was already freed. Let's just not
count these ones so that the stats remain consistent. Now it's possible
to compare the strdup()/malloc() and free() and verify they are consistent.
2020-11-03 16:46:48 +01:00
Ilya Shipitsin
04a5a440b8 BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions
let us use HAVE_OPENSSL_KEYLOG for feature detection instead
of versions
2020-11-03 14:54:15 +01:00
Willy Tarreau
b706a3b4e1 CLEANUP: pattern: remove unused entry "tree" in pattern.val
This one might have disappeared since patterns were reworked, but the
entry was not removed from the structure, let's do it now.
2020-11-02 11:32:05 +01:00
Willy Tarreau
6bedf151e1 MINOR: pattern: export pat_ref_push()
Strangely this one was marked static inline within the file itself.
Let's export it.
2020-10-31 13:13:48 +01:00
Willy Tarreau
f4edb72e0a MINOR: pattern: make pat_ref_append() return the newly added element
It's more convenient to return the element than to return just 0 or 1,
as the next thing we'll want to do is to act on this element! In addition
it was using variable arguments instead of consts, causing some reuse
constraints which were also addressed. This doesn't change its use as
a boolean, hence why call places were not modified.
2020-10-31 13:13:48 +01:00
Remi Tricot-Le Breton
bb4582cf71 MINOR: ist: Add a case insensitive istmatch function
Add a helper function that checks if a string starts with another string
while ignoring case.
2020-10-30 13:20:21 +01:00
Willy Tarreau
bd71510024 MINOR: stats: report server's user-configured weight next to effective weight
The "weight" column on the stats page is somewhat confusing when using
slowstart becaue it reports the effective weight, without being really
explicit about it. In some situations the user-configured weight is more
relevant (especially with long slowstarts where it's important to know
if the configured weight is correct).

This adds a new uweight stat which reports a server's user-configured
weight, and in a backend it receives the sum of all servers' uweights.
In addition it adds the mention of "effective" in a few descriptions
for the "weight" column (help and doc).

As a result, the list of servers in a backend is now always scanned
when dumping the stats. But this is not a problem given that these
servers are already scanned anyway and for way heavier processing.
2020-10-23 22:47:30 +02:00
Willy Tarreau
3e32036701 MINOR: stats: also support a "no-maint" show stat modifier
"no-maint" is a bit similar to "up" except that it will only hide
servers that are in maintenance (or disabled in the configuration), and
not those that are enabled but failed a check. One benefit here is to
significantly reduce the output of the "show stat" command when using
large server-templates containing entries that are not yet provisioned.

Note that the prometheus exporter also has such an option which does
the exact same.
2020-10-23 18:11:24 +02:00
Willy Tarreau
670119955b Revert "OPTIM: queue: don't call pendconn_unlink() when the pendconn is not queued"
This reverts commit b7ba1d9011. Actually
this test had already been removed in the past by commit fac0f645d
("BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe"),
but the condition to reproduce the bug mentioned there was not clear.

Now after analysis and a certain dose of code cleanup, things start to
appear more obvious. what happens is that if we check the presence of
the node in the tree without taking the lock, we can see the NULL at
the instant the node is being unlinked by another thread in
pendconn_process_next_strm() as part of __pendconn_unlink_prx() or
__pendconn_unlink_srv(). Till now there is no issue except that the
pendconn is not removed from the queue during this operation and that
the task is scheduled to be woken up by pendconn_process_next_strm()
with the stream being added to the list of the server's active
connections by __stream_add_srv_conn(). The first thread finishes
faster and gets back to stream_free() faster than the second one
sets the srv_conn on the stream, so stream_free() skips the s->srv_conn
test and doesn't try to dequeue the freshly queued entry. At the
very least a barrier would be needed there but we can't afford to
free the stream while it's being queued. So there's no other solution
than making sure that either __pendconn_unlink_prx() or
pendconn_cond_unlink() get the entry but never both, which is why the
lock is required around the test. A possible solution would be to set
p->target before unlinking the entry and using it to complete the test.
This would leave no dead period where the pendconn is not seen as
attached.

It is possible, yet extremely difficult, to reproduce this bug, which
was first noticed in bug #880. Running 100 servers with maxconn 1 and
maxqueue 1 on leastconn and a connect timeout of 30ms under 16 threads
with DEBUG_UAF, with a traffic making the backend's queue oscillate
around zero (typically using 250 connections with a local httpterm
server) may rarely manage to trigger a use-after-free.

No backport is needed.
2020-10-23 09:21:55 +02:00
Willy Tarreau
b7ba1d9011 OPTIM: queue: don't call pendconn_unlink() when the pendconn is not queued
On connection error processing, we can see massive storms of calls to
pendconn_cond_unlink() to release a possible place in the queue. For
example, in issue #908, on average half of the threads are caught in
this function via back_try_conn_req() consecutive to a synchronous
error. However we wait until grabbing the lock to know if the pendconn
is effectively in a queue, which is expensive for many cases. We know
the transition may only happen from in-queue to out-of-queue so it's safe
to first run a preliminary check to see if it's worth going further. This
will allow to avoid the cost of locking for most requests. This should
not change anything for those completing correctly as they're already
run through pendconn_free() which doesn't call pendconn_cond_unlink()
unless deemed necessary.
2020-10-22 17:32:28 +02:00
Willy Tarreau
ac66d6bafb MINOR: proxy; replace the spinlock with an rwlock
This is an anticipation of finer grained locking for the queues. For now
all lock places take a write lock so that there is no difference at all
with previous code.
2020-10-22 17:32:28 +02:00
Willy Tarreau
de785f04e1 MINOR: threads/debug: only report lock stats for used operations
In addition to the previous simplification, most locks don't use the
seek or read lock (e.g. spinlocks etc) so let's split the dump into
distinct operations (write/seek/read) and only report those which
were used. Now the output size is roughly divided by 5 compared
to previous ones.
2020-10-22 17:32:28 +02:00
Willy Tarreau
23d3b00bdd MINOR: threads/debug: only report used lock stats
The lock stats are very verbose and more than half of them are used in
a typical test, making it hard to spot the sought values. Let's simply
report "not used" for those which have not been called at all.
2020-10-22 17:32:28 +02:00
Christopher Faulet
d6c48366b8 BUG/MINOR: http-ana: Don't send payload for internal responses to HEAD requests
When an internal response is returned to a client, the message payload must be
skipped if it is a reply to a HEAD request. The payload is removed from the HTX
message just before the message forwarding.

This bugs has been around for a long time. It was already there in the pre-HTX
versions. In legacy HTTP mode, internal errors are not parsed. So this bug
cannot be easily fixed. Thus, this patch should only be backported in all HTX
versions, as far as 2.0. However, the code has significantly changed in the
2.2. Thus in the 2.1 and 2.0, the patch must be entirely reworked.
2020-10-22 17:13:22 +02:00
Remi Tricot-Le Breton
6cb10384a3 MEDIUM: cache: Add support for 'If-None-Match' request header
Partial support of conditional HTTP requests. This commit adds the
support of the 'If-None-Match' header (see RFC 7232#3.2).
When a client specifies a list of ETags through one or more
'If-None-Match' headers, they are all compared to the one that might have
been stored in the corresponding http cache entry until one of them
matches.
If a match happens, a specific "304 Not Modified" response is
sent instead of the cached data. This response has all the stored
headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data
is sent.
Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is
valid and also receives a "304 Not Modified" response (RFC 7434#4.3.2).

This resolves a part of GitHub issue #821.
2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton
bcced09b91 MINOR: http: Add etag comparison function
Add a function that compares two etags that might be of different types.
If any of them is weak, the 'W/' prefix is discarded and a strict string
comparison is performed.

Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
2020-10-22 16:06:20 +02:00
Tim Duesterhus
2493ee81d4 MINOR: http: Add enum etag_type http_get_etag_type(const struct ist)
http_get_etag_type returns whether a given `etag` is a strong, weak, or invalid
ETag.
2020-10-22 16:02:29 +02:00
William Lallemand
8e8581e242 MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension
In issue #785, users are reporting that it's not convenient to load a
".crt.key" when the configuration contains a ".crt".

This option allows to remove the extension of the certificate before
trying to load any extra SSL file (.key, .ocsp, .sctl, .issuer etc.)

The patch changes a little bit the way ssl_sock_load_files_into_ckch()
looks for the file.
2020-10-20 18:25:46 +02:00
Christopher Faulet
96ddc8ab43 BUG/MEDIUM: connection: Never cleanup server lists when freeing private conns
When a connection is released, depending on its state, it may be detached from
the session and it may be removed from the server lists. The first case may
happen for private or unsharable active connections. The second one should only
be performed for idle or available connections. We never try to remove a
connection from the server list if it is attached to a session. But it is also
important to never try to remove a private connecion from the server lists, even
if it is not attached to a session. Otherwise, the curr_used_conn server counter
is decremented once too often.

This bug was introduced by the commit 04a24c5ea ("MINOR: connection: don't check
priv flag on free"). It is related to the issue #881. It only affects the 2.3,
no backport is needed.
2020-10-19 17:19:10 +02:00
Willy Tarreau
69a7b8fc6c CLEANUP: task: remove the unused and mishandled global_rqueue_size
This counter is only updated and never used, and in addition it's done
without any atomicity so it's very unlikely to be correct on multi-CPU
systems! Let's just remove it since it's not used.
2020-10-19 14:08:13 +02:00
Willy Tarreau
e72a3f4489 CLEANUP: tree-wide: reorder a few structures to plug some holes around locks
A few structures were slightly rearranged in order to plug some holes
left around the locks. Sizes ranging from 8 to 32 bytes could be saved
depending on the structures. No performance difference was noticed (none
was expected there), though memory usage might be slightly reduced in
some rare cases.
2020-10-19 14:08:13 +02:00
Willy Tarreau
8f1f177ed0 MINOR: threads: change lock_t to an unsigned int
We don't need to waste the size of a long for the locks: with the plocks,
even an unsigned short would offer enough room for up to 126 threads! Let's
use an unsigned int which will be easier to place in certain structures
and will more conveniently plug some holes, and Atomic ops are at least
as fast on 32-bit as on 64-bit. This will not change anything for 32-bit
platforms.
2020-10-19 14:08:13 +02:00
Willy Tarreau
3d18498645 CLEANUP: threads: don't register an initcall when not debugging
It's a bit overkill to register an initcall to call a function to set
a lock to zero when not debugging, let's just declare the lock as
pre-initialized to zero.
2020-10-19 14:08:13 +02:00
Ilya Shipitsin
fcb69d768b BUILD: ssl: make BoringSSL use its own version numbers
BoringSSL is a fork of OpenSSL 1.1.0, however in
49e9f67d8b7cbeb3953b5548ad1009d15947a523 it has changed version to 1.1.1.

Should fix issue #895.

This must be backported to 2.2, 2.1, 2.0, 1.8
2020-10-19 11:34:37 +02:00
Willy Tarreau
cd10def825 MINOR: backend: replace the lbprm lock with an rwlock
It was previously a spinlock, and it happens that a number of LB algos
only lock it for lookups, without performing any modification. Let's
first turn it to an rwlock and w-lock it everywhere. This is strictly
identical.

It was carefully checked that every HA_SPIN_LOCK() was turned to
HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to
HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.
2020-10-17 18:51:41 +02:00
Willy Tarreau
61f799b8da MINOR: threads: add the transitions to/from the seek state
Since our locks are based on progressive locks, we support the upgradable
seek lock that is compatible with readers and upgradable to a write lock.
The main purpose is to take it while seeking down a tree for modification
while other threads may seek the same tree for an input (e.g. compute the
next event date).

The newly supported operations are:

  HA_RWLOCK_SKLOCK(lbl,l)         pl_take_s(l)      /* N --> S */
  HA_RWLOCK_SKTOWR(lbl,l)         pl_stow(l)        /* S --> W */
  HA_RWLOCK_WRTOSK(lbl,l)         pl_wtos(l)        /* W --> S */
  HA_RWLOCK_SKTORD(lbl,l)         pl_stor(l)        /* S --> R */
  HA_RWLOCK_WRTORD(lbl,l)         pl_wtor(l)        /* W --> R */
  HA_RWLOCK_SKUNLOCK(lbl,l)       pl_drop_s(l)      /* S --> N */
  HA_RWLOCK_TRYSKLOCK(lbl,l)      (!pl_try_s(l))    /* N -?> S */
  HA_RWLOCK_TRYRDTOSK(lbl,l)      (!pl_try_rtos(l)) /* R -?> S */

Existing code paths are left unaffected so this patch doesn't affect
any running code.
2020-10-16 16:53:46 +02:00
Willy Tarreau
8d5360ca7f MINOR: threads: augment rwlock debugging stats to report seek lock stats
We currently use only read and write lock operations with rwlocks, but
ours also support upgradable seek locks for which we do not report any
stats. Let's add them now when DEBUG_THREAD is enabled.
2020-10-16 16:51:49 +02:00
Willy Tarreau
233ad288cd CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind()
We don't need to specify the handler anymore since it's set in the
receiver. Let's remove this argument from the function and clean up
the remains of code that were still setting it.
2020-10-15 21:47:56 +02:00
Willy Tarreau
a74cb38e7c MINOR: protocol: register the receiver's I/O handler and not the protocol's
Now we define a new sock_accept_iocb() for socket-based stream protocols
and use it as a wrapper for listener_accept() which now takes a listener
and not an FD anymore. This will allow the receiver's I/O cb to be
redefined during registration, and more specifically to get rid of the
hard-coded hacks in protocol_bind_all() made for syslog.

The previous ->accept() callback in the protocol was removed since it
doesn't have anything to do with accept() anymore but is more generic.
A few places where listener_accept() was compared against the FD's IO
callback for debugging purposes on the CLI were updated.
2020-10-15 21:47:56 +02:00
Willy Tarreau
d2fb99f9d5 MINOR: protocol: add a default I/O callback and put it into the receiver
For now we're still using the protocol's default accept() function as
the I/O callback registered by the receiver into the poller. While
this is usable for most TCP connections where a listener is needed,
this is not suitable for UDP where a different handler is needed.

Let's make this configurable in the receiver just like the upper layer
is configurable for listeners. In order to ease stream protocols
handling, the protocols will now provide a default I/O callback
which will be preset into the receivers upon allocation so that
almost none of them has to deal with it.
2020-10-15 21:47:56 +02:00
Willy Tarreau
f1dc9f2f17 MINOR: sock: implement sock_accept_conn() to accept a connection
The socket-specific accept() code in listener_accept() has nothing to
do there. Let's move it to sock.c where it can be significantly cleaned
up. It will now directly return an accepted connection and provide a
status code instead of letting listener_accept() deal with various errno
values. Note that this doesn't support the sockpair specific code.

The function is now responsible for dealing with its own receiver's
polling state and calling fd_cant_recv() when facing EAGAIN.

One tiny change from the previous implementation is that the connection's
sockaddr is now allocated before trying accept(), which saves a memcpy()
of the resulting address for each accept at the expense of a cheap
pool_alloc/pool_free on the final accept returning EAGAIN. This still
apparently slightly improves accept performance in microbencharks.
2020-10-15 21:47:56 +02:00
Willy Tarreau
1e509a7231 MINOR: protocol: add a new function accept_conn()
This per-protocol function will be used to accept an incoming
connection and return it as a struct connection*. As such the protocol
stack's internal representation of a connection will not need to be
handled by the listener code.
2020-10-15 21:47:56 +02:00
Willy Tarreau
7d053e4211 MINOR: sock: rename sock_accept_conn() to sock_accepting_conn()
This call was introduced by commit 5ced3e887 ("MINOR: sock: add
sock_accept_conn() to test a listening socket") but is actually quite
confusing because it makes one think the socket will accept a connection
(which is what we want to have in a new function) while it only tells
whether it's configured to accept connections. Let's call it
sock_accepting_conn() instead.

The same change was applied to sockpair which had the same issue.
2020-10-15 21:47:56 +02:00
Willy Tarreau
65ed143841 MINOR: connection: add new error codes for accept_conn()
accept_conn() will be used to accept an incoming connection and return it.
It will have to deal with various error codes. The currently identified
ones were created as CO_AC_*.
2020-10-15 21:47:56 +02:00
Willy Tarreau
83efc320aa MEDIUM: listener: allocate the connection before queuing a new connection
Till now we would keep a per-thread queue of pending incoming connections
for which we would store:
  - the listener
  - the accepted FD
  - the source address
  - the source address' length

And these elements were first used in session_accept_fd() running on the
target thread to allocate a connection and duplicate them again. Doing
this induces various problems. The first one is that session_accept_fd()
may only run on file descriptors and cannot be reused for QUIC. The second
issue is that it induces lots of memory copies and that the listerner
queue thrashes a lot of cache, consuming 64 bytes per entry.

This patch changes this by allocating the connection before queueing it,
and by only placing the connection's pointer into the queue. Indeed, the
first two calls used to initialize the connection already store all the
information above, which can be retrieved from the connection pointer
alone. So we just have to pop one pointer from the target thread, and
pass it to session_accept_fd() which only needs the FD for the final
settings.

This starts to make the accept path a bit more transport-agnostic, and
saves memory and CPU cycles at the same time (1% connection rate increase
was noticed with 4 threads). Thanks to dividing the accept-queue entry
size from 64 to 8 bytes, its size could be increased from 256 to 1024
connections while still dividing the overall size by two. No single
queue full condition was met.

One minor drawback is that connection may be allocated from one thread's
pool to be used into another one. But this already happens a lot with
connection reuse so there is really nothing new here.
2020-10-15 21:47:56 +02:00
Willy Tarreau
9b7587a6af MINOR: connection: make sockaddr_alloc() take the address to be copied
Roughly half of the calls to sockadr_alloc() are made to copy an already
known address. Let's optionally pass it in argument so that the function
can handle the copy at the same time, this slightly simplifies its usage.
2020-10-15 21:47:56 +02:00
Willy Tarreau
0138f51f93 CLEANUP: fd: finally get rid of fd_done_recv()
fd_done_recv() used to be useful with the FD cache because it used to
allow to keep a file descriptor active in the poller without being
marked as ready in the cache, saving it from ringing immediately,
without incurring any system call. It was a way to make it yield
to wait for new events leaving a bit of time for others. The only
user left was the connection accepter (listen_accept()). We used
to suspect that with the FD cache removal it had become totally
useless since changing its readiness or not wouldn't change its
status regarding the poller itself, which would be the only one
deciding to report it again.

Careful tests showed that it indeed has exactly zero effect nowadays,
the syscall numbers are exactly the same with and without, including
when enabling edge-triggered polling.

Given that there's no more API available to manipulate it and that it
was directly called as an optimization from listener_accept(), it's
about time to remove it.
2020-10-15 21:47:56 +02:00
Willy Tarreau
e53e7ec9d9 CLEANUP: protocol: remove the ->drain() function
No protocol defines it anymore. The last user used to be the monitor-net
stuff that got partially broken already when the tcp_drain() function
moved to conn_sock_drain() with commit e215bba95 ("MINOR: connection:
make conn_sock_drain() work for all socket families") in 1.9-dev2.

A part of this will surely move back later when non-socket connections
arrive with QUIC but better keep the API clean and implement what's
needed in time instead.
2020-10-15 21:47:04 +02:00
Willy Tarreau
9e9919dd8b MEDIUM: proxy: remove obsolete "monitor-net"
As discussed here during 2.1-dev, "monitor-net" is totally obsolete:

   https://www.mail-archive.com/haproxy@formilux.org/msg35204.html

It's fundamentally incompatible with usage of SSL, and imposes the
presence of file descriptors with hard-coded syscalls directly in the
generic accept path.

It's very unlikely that anyone has used it in the last 10 years for
anything beyond testing. In the worst case if anyone would depend
on it, replacing it with "http-request return status 200 if ..." and
"mode http" would certainly do the trick.

The keyword is still detected as special by the config parser to help
users update their configurations appropriately.
2020-10-15 21:47:04 +02:00
Willy Tarreau
77e0daef9f MEDIUM: proxy: remove obsolete "mode health"
As discussed here during 2.1-dev, "mode health" is totally obsolete:

   https://www.mail-archive.com/haproxy@formilux.org/msg35204.html

It's fundamentally incompatible with usage of SSL, doesn't support
source filtering, and imposes the presence of file descriptors with
hard-coded syscalls directly in the generic accept path.

It's very unlikely that anyone has used it in the last 10 years for
anything beyond testing. In the worst case if anyone would depend
on it, replacing it with "http-request return status 200" and "mode
http" would certainly do the trick.

The keyword is still detected as special by the config parser to help
users update their configurations appropriately.
2020-10-15 21:47:04 +02:00
Amaury Denoyelle
04a24c5eaa MINOR: connection: don't check priv flag on free
Do not check CO_FL_PRIVATE flag to check if the connection is in session
list on conn_free. This is necessary due to the future patches which add
server connections in the session list even if not private, if the mux
protocol is the subject of HOL blocking.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
3d3c0918dc MINOR: mux/connection: add a new mux flag for HOL risk
This flag is used to indicate if the mux protocol is subject to
head-of-line blocking problem.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
c98df5fb44 MINOR: connection: improve list api usage
Replace !LIST_ISEMPTY by LIST_ADDED and LIST_DEL+LIST_INIT by
LIST_DEL_INIT for connection session list.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
9c13b62b47 BUG/MEDIUM: connection: fix srv idle count on conn takeover
On server connection migration from one thread to another, the wrong
idle thread-specific counter is decremented. This bug was introduced
since commit 3d52f0f1f8 due to the
factorization with srv_use_idle_conn. However, this statement is only
executed from conn_backend_get. Extract the decrement from
srv_use_idle_conn in conn_backend_get and use the correct
thread-specific counter.

Rename the function to srv_use_conn to better reflect its purpose as it
is also used with a newly initialized connection not in the idle list.

As a side change, the connection insertion to available list has also
been extracted to conn_backend_get. This will be useful to be able to
specify an alternative list for protocol subject to HOL risk that should
not be shared between several clients.

This bug is only present in this release and thus do not need a backport.
2020-10-15 15:19:34 +02:00
Willy Tarreau
29185140db MINOR: protocol: make proto_tcp & proto_uxst report listening sockets
Now we introdce a new .rx_listening() function to report if a receiver is
actually a listening socket. The reason for this is to help detect shared
sockets that might have been broken by sibling processes.
2020-10-13 18:15:33 +02:00
Willy Tarreau
5ced3e8879 MINOR: sock: add sock_accept_conn() to test a listening socket
At several places we need to check if a socket is still valid and still
willing to accept connections. Instead of open-coding this, each time,
let's add a new function for this.
2020-10-13 18:15:33 +02:00
Frdric Lcaille
3fc0fe05fd MINOR: peers: heartbeat, collisions and handshake information for "show peers" command.
This patch adds "coll" new counter and the heartbeat timer values to "show peers"
command. It also adds the elapsed time since the last handshake to new "last_hdshk"
new peer dump field.
2020-10-09 20:59:58 +02:00
Willy Tarreau
e03204c8e1 MEDIUM: listeners: implement protocol level ->suspend/resume() calls
Now we have ->suspend() and ->resume() for listeners at the protocol
level. This means that it now becomes possible for a protocol to redefine
its own way to suspend and resume. The default functions are provided for
TCP, UDP and unix, and they are pass-through to the receiver equivalent
as it used to be till now. Nothing was defined for sockpair since it does
not need to suspend/resume during reloads, hence it will succeed.
2020-10-09 18:44:37 +02:00
Willy Tarreau
7b2febde1d MINOR: listeners: split do_unbind_listener() in two
The inner part now goes into the protocol and is used to decide how to
unbind a given protocol's listener. The existing code which is able to
also unbind the receiver was provided as a default function that we
currently use everywhere. Some complex listeners like QUIC will use this
to decide how to unbind without impacting existing connections, possibly
by setting up other incoming paths for the traffic.
2020-10-09 18:44:37 +02:00
Willy Tarreau
f58b8db47b MEDIUM: receivers: add an rx_unbind() method in the protocols
This is used as a generic way to unbind a receiver at the end of
do_unbind_listener(). This allows to considerably simplify that function
since we can now let the protocol perform the cleanup. The generic code
was moved to sock.c, along with the conditional rx_disable() call. Now
the code also supports that the ->disable() function of the protocol
which acts on the listener performs the close itself and adjusts the
RX_F_BUOND flag accordingly.
2020-10-09 18:44:36 +02:00
Willy Tarreau
18c20d28d7 MINOR: listeners: move the LI_O_MWORKER flag to the receiver
This listener flag indicates whether the receiver part of the listener
is specific to the master or to the workers. In practice it's only used
by the master's CLI right now. It's used to know whether or not the FD
must be closed before forking the workers. For this reason it's way more
of a receiver's property than a listener's property, so let's move it
there under the name RX_F_MWORKER. The rest of the code remains
unchanged.
2020-10-09 18:43:05 +02:00
Willy Tarreau
75c98d166e CLEANUP: listeners: remove the do_close argument to unbind_listener()
And also remove it from its callers. This subtle distinction was added as
sort of a hack for the seamless reload feature but is not needed anymore
since the do_close turned unused since commit previous commit ("MEDIUM:
listener: let do_unbind_listener() decide whether to close or not").
This also removes the unbind_listener_no_close() function.
2020-10-09 18:41:56 +02:00
Willy Tarreau
02e8557e88 MINOR: protocol: add protocol_stop_now() to instant-stop listeners
This will instantly stop all listeners except those which belong to
a proxy configured with a grace time. This means that UDP listeners,
and peers will also be stopped when called this way.
2020-10-09 18:29:04 +02:00
Willy Tarreau
acde152175 MEDIUM: proxy: centralize proxy status update and reporting
There are multiple ways a proxy may switch to the disabled state,
but now it's essentially once it loses its last listener. Instead
of keeping duplicate code around and reporting the state change
before actually seeing it, we now report it at the moment it's
performed (from the last listener leaving) which allows to remove
the message from all other places.
2020-10-09 18:29:04 +02:00
Willy Tarreau
a389c9e1e3 MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends
For now we cannot easily distinguish a peers frontend from another one,
which will be problematic to avoid reporting them when stopping their
listeners. Let's add PR_MODE_PEERS for this. It's not supposed to cause
any issue since all non-HTTP proxies are handled similarly now.
2020-10-09 18:28:21 +02:00
Willy Tarreau
caa7df1296 MINOR: listeners: add a new stop_listener() function
This function will be used to definitely stop a listener (e.g. during a
soft_stop). This is actually tricky because it may be called for a proxy
or for a protocol, both of which require locks and already hold some. The
function takes booleans indicating which ones are already held, hoping
this will be enough. It's not well defined wether proto->disable() and
proto->rx_disable() are supposed to be called with any lock held, and
they are used from do_unbind_listener() with all these locks. Some back
annotations ought to be added on this point.

The proxy's listeners count is updated, and the proxy is marked as
disabled and woken up after the last one is gone. Note that a
listener in listen state is already not attached anymore since it
was disabled.
2020-10-09 18:27:48 +02:00
Willy Tarreau
b4c083f5bf MINOR: listeners: split delete_listener() in two versions
We'll need an already locked variant of this function so let's make
__delete_listener() which will be called with the protocol lock held
and the listener's lock held.
2020-10-09 11:27:30 +02:00
Willy Tarreau
5ddf1ce9c4 MINOR: protocol: add a new pair of enable/disable methods for listeners
These methods will be used to enable/disable accepting new connections
so that listeners do not play with FD directly anymore. Since all the
currently supported protocols work on socket for now, these are identical
to the rx_enable/rx_disable functions. However they were not defined in
sock.c since it's likely that some will quickly start to differ. At the
moment they're not used.

We have to take care of fd_updt before calling fd_{want,stop}_recv()
because it's allocated fairly late in the boot process and some such
functions may be called very early (e.g. to stop a disabled frontend's
listeners).
2020-10-09 11:27:30 +02:00
Willy Tarreau
686fa3db50 MINOR: protocol: add a new pair of rx_enable/rx_disable methods
These methods will be used to enable/disable rx at the receiver level so
that callers don't play with FDs directly anymore. All our protocols use
the generic ones from sock.c at the moment. For now they're not used.
2020-10-09 11:27:30 +02:00
Willy Tarreau
e70c7977f2 MINOR: sock: provide a set of generic enable/disable functions
These will be used on receivers, to enable or disable receiving on a
listener, which most of the time just consists in enabling/disabling
the file descriptor.

We have to take care of the existence of fd_updt to know if we may
or not call fd_{want,stop}_recv() since it's not permitted in very
early boot.
2020-10-09 11:27:30 +02:00
Willy Tarreau
58e6b71bb0 MINOR: protocol: implement an ->rx_resume() method
This one undoes ->rx_suspend(), it tries to restore an operational socket.
It was only implemented for TCP since it's the only one we support right
now.
2020-10-09 11:27:30 +02:00
Willy Tarreau
cb66ea60cf MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver)
The ->pause method is inappropriate since it doesn't exactly "pause" a
listener but rather temporarily disables it so that it's not visible at
all to let another process take its place. The term "suspend" is more
suitable, since the "pause" is actually what we'll need to apply to the
FULL and LIMITED states which really need to make a pause in the accept
process. And it goes well with the use of the "resume" function that
will also need to be made per-protocol.

Let's rename the function and make it act on the receiver since it's
already what it essentially does, hence the prefix "_rx" to make it
more explicit.

The protocol struct was a bit reordered because it was becoming a real
mess between the parts related to the listeners and those for the
receivers.
2020-10-09 11:27:30 +02:00
Willy Tarreau
d7f331c8b8 MINOR: protocol: rename the ->listeners field to ->receivers
Since the listeners were split into receiver+listener, this field ought
to have been renamed because it's confusing. It really links receivers
and not listeners, as most of the time it's used via rx.proto_list!
The nb_listeners field was updated accordingly.
2020-10-09 11:27:30 +02:00
Willy Tarreau
dae0692717 CLEANUP: listeners: remove the now unused enable_all_listeners()
It's not used anymore since previous commit. The good thing is that
no more listener function now directly acts on a protocol.
2020-10-09 11:27:30 +02:00
Willy Tarreau
078e1c7102 CLEANUP: protocol: remove the ->enable_all method
It's not used anymore, now the listeners are enabled from
protocol_enable_all().
2020-10-09 11:27:30 +02:00
Willy Tarreau
7834a3f70f MINOR: listeners: export enable_listener()
we'll soon call it from outside.
2020-10-09 11:27:30 +02:00
Willy Tarreau
d008009958 CLEANUP: listeners: remove unused disable_listener and disable_all_listeners
These ones have never been called, they were referenced by the protocol's
disable_all for some protocols but there are no traces of their use, so
in addition to not being sure the code works, it has never been tested.
Let's remove a bit of complexity starting from there.
2020-10-09 11:27:30 +02:00
Willy Tarreau
fb4ead8e8a CLEANUP: protocol: remove the ->disable_all method
This one has never been used, is only referenced by proto_uxst and
proto_sockpair, and it's not even certain it works at all. Let's
get rid of it.
2020-10-09 11:27:30 +02:00
Willy Tarreau
1accacbcc3 CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies()
They're not used anymore, delete them before someone thinks about using
them again!
2020-10-09 11:27:30 +02:00
Willy Tarreau
09819d1118 MINOR: protocol: introduce protocol_{pause,resume}_all()
These two functions are used to pause and resume all listeners of
all protocols. They use the standard listener functions for this
so they're supposed to handle the situation gracefully regardless
of the upper proxies' states, and they will report completion on
proxies once the switch is performed.

It might be nice to define a particular "failed" state for listeners
that cannot resume and to count them on proxies in order to mention
that they're definitely stuck. On the other hand, the current
situation is retryable which is quite appreciable as well.
2020-10-09 11:27:30 +02:00
Willy Tarreau
337c835d16 MEDIUM: proxy: merge zombify_proxy() with stop_proxy()
The two functions don't need to be distinguished anymore since they have
all the necessary info to act as needed on their listeners. Let's just
pass via stop_proxy() and make it check for each listener which one to
close or not.
2020-10-09 11:27:30 +02:00
Willy Tarreau
43ba3cf2b5 MEDIUM: proxy: remove start_proxies()
Its sole remaining purpose was to display "proxy foo started", which
has little benefit and pollutes output for those with plenty of proxies.
Let's remove it now.

The VTCs were updated to reflect this, because many of them had explicit
counts of dropped lines to match this message.

This is tagged as MEDIUM because some users may be surprized by the
loss of this quite old message.
2020-10-09 11:27:30 +02:00
Willy Tarreau
c3914d4fff MEDIUM: proxy: replace proxy->state with proxy->disabled
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).

Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
2020-10-09 11:27:30 +02:00
Willy Tarreau
1ad64acf6c CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled
The enabled/disabled config options were stored into a "state" field
that is an integer but contained only PR_STNEW or PR_STSTOPPED, which
is a bit confusing, and causes a dependency with proxies. This was
renamed to "disabled" and is used as a boolean. The field was also
moved to the end of the struct to stop creating a hole and fill another
one.
2020-10-09 11:27:30 +02:00
Willy Tarreau
f18d968830 MEDIUM: proxy: remove state PR_STPAUSED
This state was used to mention that a proxy was in PAUSED state, as opposed
to the READY state. This was causing some trouble because if a listener
failed to resume (e.g. because its port was temporarily in use during the
resume), it was not possible to retry the operation later. Now by checking
the number of READY or PAUSED listeners instead, we can accurately know if
something went bad and try to fix it again later. The case of the temporary
port conflict during resume now works well:

  $ socat readline /tmp/sock1
  prompt
  > disable frontend testme3

  > disable frontend testme3
  All sockets are already disabled.

  > enable frontend testme3
  Failed to resume frontend, check logs for precise cause (port conflict?).

  > enable frontend testme3

  > enable frontend testme3
  All sockets are already enabled.
2020-10-09 11:27:30 +02:00
Willy Tarreau
a17c91b37f MEDIUM: proxy: remove the PR_STERROR state
This state is only set when a pause() fails but isn't even set when a
resume() fails. And we cannot recover from this state. Instead, let's
just count remaining ready listeners to decide to emit an error or not.
It's more accurate and will better support new attempts if needed.
2020-10-09 11:27:30 +02:00
Willy Tarreau
6b3bf733dd MEDIUM: proxy: remove the unused PR_STFULL state
Since v1.4 or so, it's almost not possible anymore to set this state. The
only exception is by using the CLI to change a frontend's maxconn setting
below its current usage. This case makes no sense, and for other cases it
doesn't make sense either because "full" is a vague concept when only
certain listeners are full and not all. Let's just remove this unused
state and make it clear that it's not reported. The "ready" or "open"
states will continue to be reported without being misleading as they
will be opposed to "stop".
2020-10-09 11:27:30 +02:00
Willy Tarreau
efc0eec4c1 MINOR: proxy: maintain per-state counters of listeners
The proxy state tries to be synthetic but that doesn't work well with
many listeners, especially for transition phases or after a failed
pause/resume.

In order to address this, we'll instead rely on counters of listeners in
a given state for the 3 major states (ready, paused, listen) and a total
counter. We'll now be able to determine a proxy's state by comparing these
counters only.
2020-10-09 11:27:30 +02:00
Willy Tarreau
a37b244509 MINOR: listeners: introduce listener_set_state()
This function is used as a wrapper to set a listener's state everywhere.
We'll use it later to maintain some counters in a consistent state when
switching state so it's capital that all state changes go through it.
No functional change was made beyond calling the wrapper.
2020-10-09 11:27:30 +02:00
Willy Tarreau
c6dac6c7f5 MEDIUM: listeners: remove the now unused ZOMBIE state
The zombie state is not used anymore by the listeners, because in the
last two cases where it was tested it couldn't match as it was covered
by the test on the process mask. Instead now the FD is either in the
LISTEN state or the INIT state. This also avoids forcing the listener
to be single-dimensional because actually belonging to another process
isn't totally exclusive with the other states, which explains some of
the difficulties requiring to check the proc_mask and the fd sometimes.

So let's get rid of it now not to be tempted to reuse it.

The doc on the listeners state was updated.
2020-10-09 11:27:29 +02:00
Emeric Brun
b0c331f71f BUG/MINOR: proxy/log: frontend/backend and log forward names must differ
This patch disallow to use same name for a log forward section
and a frontend/backend section.
2020-10-08 08:53:26 +02:00
Emeric Brun
6d75616951 MINOR: channel: new getword and getchar functions on channel.
This patch adds two new functions to get a char
or a word from a channel.
2020-10-07 17:17:27 +02:00
Emeric Brun
2897644ae5 MINOR: stats: inc req counter on listeners.
This patch enables count of requests for listeners
if listener's counters are enabled.
2020-10-07 17:17:27 +02:00
Amaury Denoyelle
fbd0bc98fe MINOR: dns/stats: integrate dns counters in stats
Use the new stats module API to integrate the dns counters in the
standard stats. This is done in order to avoid code duplication, keep
the code related to cli out of dns and use the full possibility of the
stats function, allowing to print dns stats in csv or json format.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
0b70a8a314 MINOR: stats: add config "stats show modules"
By default, hide the extra statistics on the html page. Define a new
flag STAT_SHMODULES which is activated if the config "stats show
modules" is set.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
d3700a7fda MINOR: stats: support clear counters for dynamic stats
Add a boolean 'clearable' on stats module structure. If set, it forces
all the counters to be reset on 'clear counters' cli command. If not,
the counters are reset only when 'clear counters all' is used.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
ee63d4bd67 MEDIUM: stats: integrate static proxies stats in new stats
This is executed on startup with the registered statistics module. The
existing statistics have been merged in a list containing all
statistics for each domain. This is useful to print all available
statistics in a generic way.

Allocate extra counters for all proxies/servers/listeners instances.
These counters are allocated with the counters from the stats modules
registered on startup.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
730c727ea3 MEDIUM: stats: add abstract type to store counters
Implement a small API to easily add extra counters inside a structure
instance. This will be used to implement dynamic statistics linked on
every type of object as needed.

The counters are stored in a dynamic array inside the relevant objects.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
58d395e0d6 MEDIUM: stats: define an API to register stat modules
A stat module can be registered to quickly add new statistics on
haproxy. It must be attached to one of the available stats domain. The
register must be done using INITCALL on STG_REGISTER.

The stat module has a name which should be unique for each new module in
a domain. It also contains a statistics list with their name/desc and a
pointer to a function used to fill the stats from the module counters.

The module also provides the initial counters values used on
automatically allocated counters. The offset for these counters
are stored in the module structure.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
72b16e5173 MINOR: stats: define additional flag px cap on domain
This flag can be used to determine on what type of proxy object the
statistics should be relevant. It will be useful when adding dynamic
statistics. Currently, this flag is not used.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
072f97eddf MINOR: stats: define the concept of domain for statistics
The domain option will be used to have statistics attached to other
objects than proxies/listeners/servers. At the moment, only the PROXY
domain is available.

Add an argument 'domain' on the 'show stats' cli command to specify the
domain. Only 'domain proxy' is available now. If not specified, proxy
will be considered the default domain.

For HTML output, only proxy statistics will be displayed.
2020-10-05 12:02:14 +02:00
Amaury Denoyelle
da5b6d1cd9 MINOR: stats: hide px/sv/li fields in applet struct
Use an opaque pointer to store proxy instance. Regroup server/listener
as a single opaque pointer. This has the benefit to render the structure
more evolutive to support statistics on other types of objects in the
future.

This patch is needed to extend stat support for components other than
proxies objects.

The prometheus module has been adapted for these changes.
2020-10-05 10:48:58 +02:00
Amaury Denoyelle
97323c9ed4 MINOR: stats: add stats size as a parameter for csv/json dump
Render the stats size parametric in csv/json dump functions. This is
needed for the future patch which provides dynamic stats. For now the
static value ST_F_TOTAL_FIELDS is provided.

Remove unused parameter px on stats_dump_one_line.

This patch is needed to extend stat support to components other than
proxies objects.
2020-10-05 09:06:10 +02:00
Amaury Denoyelle
3ca927e68f REORG: stats: export some functions
Un-mark stats_dump_one_line and stats_putchk as static and export them
in the header file. These functions will be reusable by other components to
print their statistics.

This patch is needed to extend stat support to components other than
proxies objects.
2020-10-05 09:06:10 +02:00
Amaury Denoyelle
cd3de50779 MINOR: counters: fix a typo in comment
Wrong copy/paste comment, replace listeners/frontends by
servers/backends

This may be backported up to 1.7.
2020-10-05 09:05:57 +02:00
Willy Tarreau
fac0f645df BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe
A crash reported in github issue #880 looks impossible unless
pendconn_cond_unlink() occasionally sees a null leaf_p when attempting
to remove an entry, which seems to be confirmed by the reporter. What
seems to be happening is that depending on compiler optimizations,
this pointer can appear as null while pointers are moved if one of
the node's parents is removed from or inserted into the tree. There's
no explicit null of the pointer during these operations but those
pointers are rewritten in multiple steps and nothing prevents this
situation from happening, and there are no particular barrier nor
atomic ops around this.

This test was used to avoid unnecessary locking, for already deleted
entries, but looking at the code it appears that pendconn_free() already
resets s->pend_pos that's used as <p> there, and that the other call
reasons are after an error where the connection will be dropped as
well. So we don't save anything by doing this test, and make it
unsafe. The older code used to check for list emptiness there and
not inside pendconn_unlink(), which explains why the code has stayed
there. Let's just remove this now.

Thanks to @jaroslawr for reporting this issue in great details and for
testing the proposed fix.

This should be backpored to 1.8, where the test on LIST_ISEMPTY should
be moved to pendconn_unlink() instead (inside the lock, just like 2.0+).
2020-10-02 18:10:26 +02:00
Amaury Denoyelle
fa41cb6792 MINOR: tools: support for word expansion of environment in parse_line
Allow the syntax "${...[*]}" to expand an environment variable
containing several values separated by spaces as individual arguments. A
new flag PARSE_OPT_WORD_EXPAND has been added to toggle this feature on
parse_line invocation. In case of an invalid syntax, a new error
PARSE_ERR_WRONG_EXPAND will be triggered.

This feature has been asked on the github issue #165.
2020-10-01 17:24:14 +02:00
Willy Tarreau
3ca2365904 BUG/MEDIUM: h2: report frame bits only for handled types
As part of his GREASE experiments on Chromium, Bence Bky reported in
https://lists.w3.org/Archives/Public/ietf-http-wg/2020JulSep/0202.html
and https://bugs.chromium.org/p/chromium/issues/detail?id=1127060 that
a certain combination of frame type and frame flags was causing an error
on app.slack.com. It turns out that it's haproxy that is causing this
issue because the frame type is wrongly assumed to support padding, the
frame flags indicate padding is present, and the frame is too short for
this, resulting in an error.

The reason why only some frame types are affected is due to the frame
type being used in a bit shift to match against a mask, and where the
5 lower bits of the frame type only are used to compute the frame bit.
If the resulting frame bit matches a DATA, HEADERS or PUSH_PROMISE frame
bit, then padding support is assumed and the test is enforced, resulting
in a PROTOCOL_ERROR or FRAME_SIZE_ERROR depending on the payload size.

We must never match any such bit for unsupported frame types so let's
add a check for this. This must be backported as far as 1.8.

Thanks to Cooper Bethea for providing enough context to help narrow the
issue down and to Bence Bky for creating a simple reproducer.
2020-09-18 08:05:03 +02:00
Willy Tarreau
2b5e0d8b6a MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET*
We don't need to cheat with the sock_domain anymore, we now always have
the SOCK_DGRAM sock_type as a complementary selector. This patch restores
the sock_domain to AF_INET* in the udp* protocols and removes all traces
of the now unused AF_CUST_*.
2020-09-16 22:08:08 +02:00
Willy Tarreau
910c64da96 MEDIUM: protocol: store the socket and control type in the protocol array
The protocol array used to be only indexed by socket family, which is very
problematic with UDP (requiring an extra family) and with the forthcoming
QUIC (also requiring an extra family), especially since that binds them to
certain families, prevents them from supporting dgram UNIX sockets etc.

In order to address this, we now start to register the protocols with more
info, namely the socket type and the control type (either stream or dgram).
This is sufficient for the protocols we have to deal with, but could also
be extended further if multiple protocol variants were needed. But as is,
it still fits nicely in an array, which is convenient for lookups that are
instant.
2020-09-16 22:08:08 +02:00
Willy Tarreau
a54553f74f MINOR: protocol: add the control layer type in the protocol struct
This one will be needed to more accurately select a protocol. It may
differ from the socket type for QUIC, which uses dgram at the socket
layer and provides stream at the control layer. The upper level requests
a control layer only so we need this field.
2020-09-16 22:08:08 +02:00
Willy Tarreau
65ec4e3ff7 MEDIUM: tools: make str2sa_range() check that the protocol has ->connect()
Most callers of str2sa_range() need the protocol only to check that it
provides a ->connect() method. It used to be used to verify that it's a
stream protocol, but it might be a bit early to get rid of it. Let's keep
the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT
is present. This way almost all call places could be cleaned from this.

There's a strange test in the server address parsing code that rechecks
the family from the socket which seems to be a duplicate of the previously
removed tests. It will have to be rechecked.
2020-09-16 22:08:08 +02:00
Willy Tarreau
5fc9328aa2 MINOR: tools: make str2sa_range() directly return the protocol
We'll need this so that it can return pointers to stacked protocol in
the future (for QUIC). In addition this removes a lot of tests for
protocol validity in the callers.

Some of them were checked further apart, or after a call to
str2listener() and they were simplified as well.

There's still a trick, we can fail to return a protocol in case the caller
accepts an fqdn for use later. This is what servers do and in this case it
is valid to return no protocol. A typical example is:

   server foo localhost:1111
2020-09-16 22:08:08 +02:00
Willy Tarreau
9b3178df23 MINOR: listener: pass the chosen protocol to create_listeners()
The function will need to use more than just a family, let's pass it
the selected protocol. The caller will then be able to do all the fancy
stuff required to pick the best protocol.
2020-09-16 22:08:08 +02:00
Willy Tarreau
aa333123f2 MINOR: cfgparse: add str2receiver() to parse dgram receivers
This is at least temporary, as the migration at once is way too difficuly.
For now it still creates listeners but only allows DGRAM sockets. This
aims at easing the split between listeners and receivers.
2020-09-16 22:08:08 +02:00
Willy Tarreau
a93e5c7fae MINOR: tools: make str2sa_range() optionally return the fd
If a file descriptor was passed, we can optionally return it. This will
be useful for listening sockets which are both a pre-bound FD and a ready
socket.
2020-09-16 22:08:08 +02:00
Willy Tarreau
909c23b086 MINOR: listener: remove the inherited arg to create_listener()
This argument can now safely be determined from fd != -1, let's just
drop it.
2020-09-16 22:08:08 +02:00
Willy Tarreau
328199348b MINOR: tools: add several PA_O_* flags in str2sa_range() callers
These flags indicate whether the call is made to fill a bind or a server
line, or even just send/recv calls (like logs or dns). Some special cases
are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external
listeners), and there's a distinction between stream or dgram usage that's
expected to significantly help str2sa_range() proceed appropriately with
the input information. For now they are not used yet.
2020-09-16 22:08:08 +02:00
Willy Tarreau
809587635e MINOR: tools: add several PA_O_PORT_* flags in str2sa_range() callers
These flags indicate what is expected regarding port specifications. Some
callers accept none, some need fixed ports, some have it mandatory, some
support ranges, and some take an offset. Each possibilty is reflected by
an option. For now they are not exploited, but the goal is to instrument
str2sa_range() to properly parse that.
2020-09-16 22:08:07 +02:00
Willy Tarreau
cd3a5591f6 MINOR: tools: make str2sa_range() take more options than just resolve
We currently have an argument to require that the address is resolved
but we'll soon add more, so let's turn it into a bit field. The old
"resolve" boolean is now PA_O_RESOLVE.
2020-09-16 22:08:07 +02:00
Willy Tarreau
a5b325f92c MINOR: protocol: add a real family for existing FDs
At some places (log fd@XXX, bind fd@XXX) we support using an explicit
file descriptor number, that is placed into the sockaddr for later use.
The problem is that till now it was done with an AF_UNSPEC family, which
is also used for other situations like missing info or rings (for logs).

Let's create an "official" family AF_CUST_EXISTING_FD for this case so
that we are certain the FD can be found in the address when it is set.
2020-09-16 22:08:07 +02:00
Willy Tarreau
1e984b73f0 CLEANUP: protocol: remove family-specific fields from struct protocol
This removes the following fields from struct protocol that are now
retrieved from the protocol family instead: .sock_family, .sock_addrlen,
.l3_addrlen, .addrcmp, .bind, .get_src, .get_dst.

This also removes the UDP-specific udp{,6}_get_{src,dst}() functions
which were referenced but not used yet. Their goal was only to remap
the original AF_INET* addresses to AF_CUST_UDP*.

Note that .sock_domain is still there as it's used as a selector for
the protocol struct to be used.
2020-09-16 22:08:07 +02:00
Willy Tarreau
f1f660978c MINOR: protocol: retrieve the family-specific fields from the family
We now take care of retrieving sock_family, l3_addrlen, bind(),
addrcmp(), get_src() and get_dst() from the protocol family and
not just the protocol itself. There are very few places, this was
only seldom used. Interestingly in sock_inet.c used to rely on
->sock_family instead of ->sock_domain, and sock_unix.c used to
hard-code PF_UNIX instead of using ->sock_domain.

Also it appears obvious we have something wrong it the protocol
selection algorithm because sock_domain is the one set to the custom
protocols while it ought to be sock_family instead, which would avoid
having to hard-code some conversions for UDP namely.
2020-09-16 22:08:07 +02:00
Willy Tarreau
b0254cb361 MINOR: protocol: add a new proto_fam structure for protocol families
We need to specially handle protocol families which regroup common
functions used for a given address family. These functions include
bind(), addrcmp(), get_src() and get_dst() for now. Some fields are
also added about the address family, socket domain (protocol family
passed to the socket() syscall), and address length.

These protocol families are referenced from the protocols but not yet
used.
2020-09-16 22:08:07 +02:00
Willy Tarreau
62292b28a3 MEDIUM: sockpair: implement sockpair_bind_receiver()
Note that for now we don't have a sockpair.c file to host that unusual
family, so the new function was placed directly into proto_sockpair.c.
It's no big deal given that this family is currently not shared with
multiple protocols.

The function does almost nothing but setting up the receiver. This is
normal as the socket the FDs are passed onto are supposed to have been
already created somewhere else, and the only usable identifier for such
a socket pair is the receiving FD itself.

The function was assigned to sockpair's ->bind() and is not used yet.
2020-09-16 22:08:07 +02:00
Willy Tarreau
1e0a860099 MEDIUM: sock_unix: implement sock_unix_bind_receiver()
This function performs all the bind-related stuff for UNIX sockets that
was previously done in uxst_bind_listener(). There is a very tiny
difference however, which is that previously, in the unlikely event
where listen() would fail, it was still possible to roll back the binding
and rename the backup to the original socket. Now we have to rename it
before calling returning, hence it will be done before calling listen().
However, this doesn't cover any particular use case since listen() has no
reason to fail there (and the rollback is not done for inherited sockets),
that was just done that way as a generic error processing path.

The code is not used yet and is referenced in the uxst proto's ->bind().
2020-09-16 22:08:07 +02:00
Willy Tarreau
d69ce1ffbc MEDIUM: sock_inet: implement sock_inet_bind_receiver()
This function collects all the receiver-specific code from both
tcp_bind_listener() and udp_bind_listener() in order to provide a more
generic AF_INET/AF_INET6 socket binding function. For now the API is
not very elegant because some info are still missing from the receiver
while there's no ideal place to fill them except when calling ->listen()
at the protocol level. It looks like some polishing code is needed in
check_config_validity() or somewhere around this in order to finalize
the receivers' setup. The main issue is that listeners and receivers
are created *before* bind_conf options are parsed and that there's no
finishing step to resolve some of them.

The function currently sets up a receiver and subscribes it to the
poller. In an ideal world we wouldn't subscribe it but let the caller
do it after having finished to configure the L4 stuff. The problem is
that the caller would then need to perform an fd_insert() call and to
possibly set the exported flag on the FD while it's not its job. Maybe
an improvement could be to have a separate sock_start_receiver() call
in sock.c.

For now the function is not used but it will soon be. It's already
referenced as tcp and udp's ->bind().
2020-09-16 22:08:07 +02:00
Willy Tarreau
3e5c7ab7ce MINOR: protocol: add a new ->bind() entry to bind the receiver
This will be the function that must be used to bind the receiver. It
solely depends on the address family but for now it's simpler to have
it per protocol.
2020-09-16 22:08:07 +02:00
Willy Tarreau
b3580b19c8 MINOR: protocol: rename the ->bind field to ->listen
The function currently is doing both the bind() and the listen(), so
let's call it ->listen so that the bind() operation can move to another
place.
2020-09-16 22:08:07 +02:00
Willy Tarreau
c049c0d5ad MINOR: sock: make sock_find_compatible_fd() only take a receiver
We don't need to have a listener anymore to find an fd, a receiver with
its settings properly set is enough now.
2020-09-16 22:08:07 +02:00
Willy Tarreau
3fd3bdc836 MINOR: receiver: move the FOREIGN and V6ONLY options from listener to settings
The new RX_O_FOREIGN, RX_O_V6ONLY and RX_O_V4V6 options are now set into
the rx_settings part during the parsing, so that we don't need to adjust
them in each and every listener anymore. We have to keep both v4v6 and
v6only due to the precedence from v6only over v4v6.
2020-09-16 22:08:07 +02:00
Willy Tarreau
43046fa4f4 MINOR: listener: move the INHERITED flag down to the receiver
It's the receiver's FD that's inherited from the parent process, not
the listener's so the flag must move to the receiver so that appropriate
actions can be taken.
2020-09-16 22:08:07 +02:00
Willy Tarreau
0b9150155e MINOR: receiver: add a receiver-specific flag to indicate the socket is bound
In order to split the receiver from the listener, we'll need to know that
a socket is already bound and ready to receive. We used to do that via
tha LI_O_ASSIGNED state but that's not sufficient anymore since the
receiver might not belong to a listener anymore. The new RX_F_BOUND flag
is used for this.
2020-09-16 22:08:07 +02:00
Willy Tarreau
eef454224d MINOR: receiver: link the receiver to its owner
A receiver will have to pass a context to be installed into the fdtab
for use by the handler. We need to set this into the receiver struct
as the bind will happen longer after the configuration.
2020-09-16 22:08:07 +02:00
Willy Tarreau
0fce6bce34 MINOR: receiver: link the receiver to its settings
Just like listeners keep a pointer to their bind_conf, receivers now also
have a pointer to their rx_settings. All those belonging to a listener are
automatically initialized with a pointer to the bind_conf's settings.
2020-09-16 22:08:07 +02:00
Willy Tarreau
d45693d85c REORG: listener: move the receiver part to a new file
We'll soon add flags for the receivers, better add them to the final
file, so it's time to move the definition to receiver-t.h. The struct
receiver and rx_settings were placed there.
2020-09-16 22:08:07 +02:00
Willy Tarreau
b743661f04 REORG: listener: move the listener's proto to the receiver
The receiver is the one which depends on the protocol while the listener
relies on the receiver. Let's move the protocol there. Since there's also
a list element to get back to the listener from the proto list, this list
element (proto_list) was moved as well. For now when scanning protos, we
still see listeners which are linked by their rx.proto_list part.
2020-09-16 22:08:05 +02:00
Willy Tarreau
38ba647f9f REORG: listener: move the receiving FD to struct receiver
The listening socket is represented by its file descriptor, which is
generic to all receivers and not just listeners, so it must move to
the rx struct.

It's worth noting that in order to extend receivers and listeners to
other protocols such as QUIC, we'll need other handles than file
descriptors here, and that either a union or a cast to uintptr_t
will have to be used. This was not done yet and the field was
preserved under the name "fd" to avoid adding confusion.
2020-09-16 22:08:03 +02:00
Willy Tarreau
371590661e REORG: listener: move the listening address to a struct receiver
The address will be specific to the receiver so let's move it there.
2020-09-16 22:08:01 +02:00
Willy Tarreau
37d9d6721a REORG: listener: create a new struct receiver
In order to start to split the listeners into the listener part and the
event receiver part, we introduce a new field "rx" into struct listener
that will eventually become a separate struct receiver. This patch only
adds the struct with an options field that the receivers will need.
2020-09-16 22:07:58 +02:00
Willy Tarreau
be56c1038f MINOR: listener: move the network namespace to the struct settings
The netns is common to all listeners/receivers and is used to bind the
listening socket so it must be in the receiver settings and not in the
listener. This removes some yet another set of unnecessary loops.
2020-09-16 20:13:13 +02:00
Willy Tarreau
7e307215e8 MINOR: listener: move the interface to the struct settings
The interface is common to all listeners/receivers and is used to bind
the listening socket so it must be in the receiver settings and not in
the listener. This removes some unnecessary loops.
2020-09-16 20:13:13 +02:00
Willy Tarreau
e26993c098 MINOR: listener: move bind_proc and bind_thread to struct settings
As mentioned previously, these two fields come under the settings
struct since they'll be used to bind receivers as well.
2020-09-16 20:13:13 +02:00
Willy Tarreau
6e459d7f92 MINOR: listener: create a new struct "settings" in bind_conf
There currently is a large inconsistency in how binding parameters are
split between bind_conf and listeners. It happens that for historical
reasons some parameters are available at the listener level but cannot
be configured per-listener but only for a bind_conf, and thus, need to
be replicated. In addition, some of the bind_conf parameters are in fact
for the listening socket itself while others are for the instanciated
sockets.

A previous attempt at splitting listeners into receivers failed because
the boundary between all these settings is not well defined.

This patch introduces a level of listening socket settings in the
bind_conf, that will be detachable later. Such settings that are solely
for the listening socket are:
  - unix socket permissions (used only during binding)
  - interface (used for binding)
  - network namespace (used for binding)
  - process mask and thread mask (used during startup)

The rest seems to be used only to initialize the resulting sockets, or
to control the accept rate. For now, only the unix params (bind_conf->ux)
were moved there.
2020-09-16 20:13:13 +02:00
William Lallemand
70bf06e5f0 BUILD: fix build with openssl < 1.0.2 since bundle removal
Bundle removal broke the build with openssl version < 1.0.2.

Remove the #ifdef around SSL_SOCK_KEYTYPE_NAMES.
2020-09-16 18:10:00 +02:00
William Lallemand
e7eb1fec2f CLEANUP: ssl: remove utility functions for bundle
Remove the last utility functions for handling the multi-cert bundles
and remove the multi-variable from the ckch structure.

With this patch, the bundles are completely removed.
2020-09-16 16:28:26 +02:00
William Lallemand
bd8e6eda59 CLEANUP: ssl: remove test on "multi" variable in ckch functions
Since the removal of the multi-certificates bundle support, this
variable is not useful anymore, we can remove all tests for this
variable and suppose that every ckch contains a single certificate.
2020-09-16 16:28:26 +02:00
Willy Tarreau
441b6c31e9 BUILD: connection: fix build on clang after the VAR_ARRAY cleanup
Commit 4987a4744 ("CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in
various definitions") broke the build on clang due to the tlv field used
to receive/send the proxy protocol. The problem is that struct tlv is
included at the beginning of struct tlv_ssl, which doesn't make much
sense. In fact the value[] array isn't really a var array but just an
end of struct marker, and must really be an array of size zero.
2020-09-14 08:43:51 +02:00
Willy Tarreau
4987a47446 CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in various definitions
Surprisingly there were still a number of [0] definitions for variable
sized arrays in certain structures all over the code. We need to use
VAR_ARRAY instead of zero to accommodate various compilers' preferences,
as zero was used only on old ones and tends to report errors on new ones.
2020-09-12 20:56:41 +02:00
Ilya Shipitsin
4a034f2212 BUILD: introduce possibility to define ABORT_NOW() conditionally
code analysis tools recognize abort() better, so let us introduce
such possibility
2020-09-12 13:11:27 +02:00
Willy Tarreau
00c363ba9d REORG: tools: move PARSE_OPT_* from tools.h to tools-t.h
These would better be placed into the low-level type files with other
similar macros.
2020-09-11 11:27:22 +02:00
Willy Tarreau
76296dce68 BUILD: trace: always have an argument before variadic args in macros
tcc supports variadic macros provided that there is always at least one
argument, like older gcc versions. Thus we need to always keep one and
define args as the remaining ones. It's not an issue at all and doesn't
change the way to use them, just the internal definitions.
2020-09-10 09:35:54 +02:00
Willy Tarreau
d966f1497c BUILD: intops: on x86_64, the bswap instruction is called bswapq
Building with tcc fails on "bswap" which in fact ought to be called
"bswapq". Let's rename it as gas doesn't care.
2020-09-10 09:31:50 +02:00
Willy Tarreau
f6afda6539 BUILD: compiler: workaround a glibc madness around __attribute__()
For whatever reason, glibc decided that the __attribute__ keyword is
the exclusive property of gcc, and redefines it to an empty macro on
other compilers. Some non-gcc compilers also support it (possibly
partially), tinycc is one of them. By doing this, glibc silently
broke all constructors, resulting in code that arrives in main() with
uninitialized variables.

The solution we use here consists in undefining the macro on non-gcc
compilers, and redefining it to itself in order to cause a conflict in
the event the redefinition would happen afterwards. This visibly solved
the problem.
2020-09-10 09:26:50 +02:00
Willy Tarreau
d9537f6082 BUILD: compiler: reserve the gcc version checks to the gcc compiler
Some checks on __GNUC__ imply that if it's undefined it will match a
low value but that's not always what we want, like for example in the
VAR_ARRAY definition which is not needed on tcc. Let's always be explicit
on these tests.
2020-09-10 08:35:28 +02:00
Christopher Faulet
5a89175ac8 BUG/MEDIUM: dns: Don't store additional records in a linked-list
A SRV record keeps a reference on the corresponding additional record, if
any. But this additional record is also inserted in a separate linked-list into
the dns response. The problems arise when obsolete additional records are
released. The additional records list is purged but the SRV records always
reference these objects, leading to an undefined behavior. Worst, this happens
very quickly because additional records are never renewed. Thus, once received,
an additional record will always expire.

Now, the addtional record are only associated to a SRV record or simply
ignored. And the last version is always used.

This patch helps to fix the issue #841. It must be backported to 2.2.
2020-09-08 10:44:39 +02:00
Willy Tarreau
e91bff2134 MAJOR: init: start all listeners via protocols and not via proxies anymore
Ever since the protocols were added in 1.3.13, listeners used to be
started twice:
  - once by start_proxies(), which iteratees over all proxies then all
    listeners ;
  - once by protocol_bind_all() which iterates over all protocols then
    all listeners ;

It's a real mess because error reporting is not even consistent, and
more importantly now that some protocols do not appear in regular
proxies (peers, logs), there is no way to retry their binding should
it fail on the last step.

What this patch does is to make sure that listeners are exclusively
started by protocols. The failure to start a listener now causes the
emission of an error indicating the proxy's name (as it used to be
the case per proxy), and retryable failures are silently ignored
during all but last attempts.

The start_proxies() function was kept solely for setting the proxy's
state to READY and emitting the "Proxy started" message and log that
some have likely got used to seeking in their logs.
2020-09-02 11:11:43 +02:00
Willy Tarreau
576a633868 CLEANUP: protocol: remove all ->bind_all() and ->unbind_all() functions
These ones were not used anymore since the two previous patches, let's
drop them.
2020-09-02 10:40:33 +02:00
Christopher Faulet
bde2c4c621 MINOR: http-htx: Handle an optional reason when replacing the response status
When calling the http_replace_res_status() function, an optional reason may now
be set. It is ignored if it points to NULL and the original reason is
preserved. Only the response status is replaced. Otherwise both the status and
the reason are replaced.

It simplifies the API and most of time, avoids an extra call to
http_replace_res_reason().
2020-09-01 10:55:36 +02:00
Christopher Faulet
b8ce505c6f MINOR: http-htx: Add an option to eval query-string when the path is replaced
The http_replace_req_path() function now takes a third argument to evaluate the
query-string as part of the path or to preserve it. If <with_qs> is set, the
query-string is replaced with the path. Otherwise, only the path is replaced.

This patch is mandatory to fix issue #829. The next commit depends on it. So be
carefull during backports.
2020-09-01 10:55:14 +02:00
Willy Tarreau
9dbb6c43ce MINOR: sock: distinguish dgram from stream types when retrieving old sockets
For now we still don't retrieve dgram sockets, but the code must be able
to distinguish them before we switch to receivers. This adds a new flag
to the xfer_sock_list indicating that a socket is of type SOCK_DGRAM. The
way to set the flag for now is by looking at the dummy address family which
equals AF_CUST_UDP{4,6} in this case (given that other dgram sockets are not
yet supported).
2020-08-28 19:26:39 +02:00
Willy Tarreau
a2c17877b3 MINOR: sock: do not use LI_O_* in xfer_sock_list anymore
We'll want to store more info there and some info that are not represented
in listener options at the moment (such as dgram vs stream) so let's get
rid of these and instead use a new set of options (SOCK_XFER_OPT_*).
2020-08-28 19:26:38 +02:00
Willy Tarreau
429617459d REORG: sock: move get_old_sockets() from haproxy.c
The new function was called sock_get_old_sockets() and was left as-is
except a minimum amount of style lifting to make it more readable. It
will never be awesome anyway since it's used very early in the boot
sequence and needs to perform socket I/O without any external help.
2020-08-28 19:24:55 +02:00
Willy Tarreau
37bafdcbb1 MINOR: sock_inet: move the IPv4/v6 transparent mode code to sock_inet
This code was highly redundant, existing for TCP clients, TCP servers
and UDP servers. Let's move it to sock_inet where it belongs. The new
functions are sock_inet4_make_foreign() and sock_inet6_make_foreign().
2020-08-28 18:51:36 +02:00
Willy Tarreau
2d34a710b1 MINOR: sock: implement sock_find_compatible_fd()
This is essentially a merge from tcp_find_compatible_fd() and
uxst_find_compatible_fd() that relies on a listener's address and
compare function and still checks for other variations. For AF_INET6
it compares a few of the listener's bind options. A minor change for
UNIX sockets is that transparent mode, interface and namespace used
to be ignored when trying to pick a previous socket while now if they
are changed, the socket will not be reused. This could be refined but
it's still better this way as there is no more risk of using a
differently bound socket by accident.

Eventually we should not pass a listener there but a set of binding
parameters (address, interface, namespace etc...) which ultimately will
be grouped into a receiver. For now this still doesn't exist so let's
stick to the listener to break dependencies in the rest of the code.
2020-08-28 18:51:36 +02:00
Willy Tarreau
a6473ede5c MINOR: sock: add interface and namespace length to xfer_sock_list
This will ease and speed up comparisons in FD lookups.
2020-08-28 18:51:36 +02:00
Willy Tarreau
063d47d136 REORG: listener: move xfer_sock_list to sock.{c,h}.
This will be used for receivers as well thus it is not specific to
listeners but to sockets.
2020-08-28 18:51:36 +02:00
Willy Tarreau
e5bdc51bb5 REORG: sock_inet: move default_tcp_maxseg from proto_tcp.c
Let's determine it at boot time instead of doing it on first use. It
also saves us from having to keep it thread local. It's been moved to
the new sock_inet_prepare() function, and the variables were renamed
to sock_inet_tcp_maxseg_default and sock_inet6_tcp_maxseg_default.
2020-08-28 18:51:36 +02:00
Willy Tarreau
d88e8c06ac REORG: sock_inet: move v6only_default from proto_tcp.c to sock_inet.c
The v6only_default variable is not specific to TCP but to AF_INET6, so
let's move it to the right file. It's now immediately filled on startup
during the PREPARE stage so that it doesn't have to be tested each time.
The variable's name was changed to sock_inet6_v6only_default.
2020-08-28 18:51:36 +02:00
Willy Tarreau
25140cc573 REORG: inet: replace tcp_is_foreign() with sock_inet_is_foreign()
The function now makes it clear that it's independent on the socket
type and solely relies on the address family. Note that it supports
both IPv4 and IPv6 as we don't seem to need it per-family.
2020-08-28 18:51:36 +02:00
Willy Tarreau
c5a94c936b MINOR: sock_inet: implement sock_inet_get_dst()
This one is common to the TCPv4 and UDPv4 code, it retrieves the
destination address of a socket, taking care of the possiblity that for
an incoming connection the traffic was possibly redirected. The TCP and
UDP definitions were updated to rely on it and remove duplicated code.
2020-08-28 18:51:36 +02:00
Willy Tarreau
f172558b27 MINOR: tcp/udp/unix: make use of proto->addrcmp() to compare addresses
The new addrcmp() protocol member points to the function to be used to
compare two addresses of the same family.

When picking an FD from a previous process, we can now use the address
specific address comparison functions instead of having to rely on a
local implementation. This will help move that code to a more central
place.
2020-08-28 18:51:36 +02:00
Willy Tarreau
0d06df6448 MINOR: sock: introduce sock_inet and sock_unix
These files will regroup everything specific to AF_INET, AF_INET6 and
AF_UNIX socket definitions and address management. Some code there might
be agnostic to the socket type and could later move to af_xxxx.c but for
now we only support regular sockets so no need to go too far.

The files are quite poor at this step, they only contain the address
comparison function for each address family.
2020-08-28 18:51:36 +02:00
Willy Tarreau
18b7df7a2b REORG: sock: start to move some generic socket code to sock.c
The new file sock.c will contain generic code for standard sockets
relying on file descriptors. We currently have way too much duplication
between proto_uxst, proto_tcp, proto_sockpair and proto_udp.

For now only get_src, get_dst and sock_create_server_socket were moved,
and are used where appropriate.
2020-08-28 18:51:36 +02:00
Willy Tarreau
478331dd93 CLEANUP: tcp: stop exporting smp_fetch_src()
This is totally ugly, smp_fetch_src() is exported only so that stick_table.c
can (ab)use it in the {sc,src}_* sample fetch functions. It could be argued
that the sample could have been reconstructed there in place, but we don't
even need to duplicate the code. We'd rather simply retrieve the "src"
fetch's function from where it's used at init time and be done with it.
2020-08-28 18:51:36 +02:00
Willy Tarreau
bb1caff70f MINOR: fd: add a new "exported" flag and use it for all regular listeners
This new flag will be used to mark FDs that must be passed to any future
process across the CLI's "_getsocks" command.

The scheme here is quite complex and full of special cases:
  - FDs inherited from parent processes are *not* exported this way, as
    they are supposed to instead be passed by the master process itself
    across reloads. However such FDs ought never to be paused otherwise
    this would disrupt the socket in the parent process as well;

  - FDs resulting from a "bind" performed over a socket pair, which are
    in fact one side of a socket pair passed inside another control socket
    pair must not be passed either. Since all of them are used the same
    way, for now it's enough never to put this "exported" flag to FDs
    bound by the socketpair code.

  - FDs belonging to temporary listeners (e.g. a passive FTP data port)
    must not be passed either. Fortunately we don't have such FDs yet.

  - the rest of the listeners for now are made of TCP, UNIX stream, ABNS
    sockets and are exportable, so they get the flag.

  - UDP listeners were wrongly created as listeners and are not suitable
    here. Their FDs should be passed but for now they are not since the
    client doesn't even distinguish the SO_TYPE of the retrieved sockets.

In addition, it's important to keep in mind that:
  - inherited FDs may never be closed in master process but may be closed
    in worker processes if the service is shut down (useless since still
    bound, but technically possible) ;

  - inherited FDs may not be disabled ;

  - exported FDs may be disabled because the caller will perform the
    subsequent listen() on them. However that might not work for all OSes

  - exported FDs may be closed, it just means the service was shut down
    from the worker, and will be rebound in the new process. This implies
    that we have to disable exported on close().

=> as such, contrary to an apparently obvious equivalence, the "exported"
   status doesn't imply anything regarding the ability to close a
   listener's FD or not.
2020-08-26 18:33:52 +02:00
Willy Tarreau
63d8b6009b CLEANUP: fd: remove fd_remove() and rename fd_dodelete() to fd_delete()
This essentially undoes what we did in fd.c in 1.8 to support seamless
reload. Since we don't need to remove an fd anymore we can turn
fd_delete() to the simple function it used to be.
2020-08-26 18:33:52 +02:00
Willy Tarreau
bf3b06b03d MINOR: reload: determine the foreing binding status from the socket
Let's not look at the listener options passed by the original process
and determine from the socket itself whether it is configured for
transparent mode or not. This is cleaner and safer, and doesn't rely
on flag values that could possibly change between versions.
2020-08-26 10:33:02 +02:00
Shimi Gersner
5846c490ce MEDIUM: ssl: Support certificate chaining for certificate generation
haproxy supports generating SSL certificates based on SNI using a provided
CA signing certificate. Because CA certificates may be signed by multiple
CAs, in some scenarios, it is neccesary for the server to attach the trust chain
in addition to the generated certificate.

The following patch adds the ability to serve the entire trust chain with
the generated certificate. The chain is loaded from the provided
`ca-sign-file` PEM file.
2020-08-25 16:36:06 +02:00
David Carlier
7adf8f35df OPTIM: regex: PCRE2 use JIT match when JIT optimisation occured.
When a regex had been succesfully compiled by the JIT pass, it is better
 to use the related match, thanksfully having same signature, for better
 performance.

Signed-off-by: David Carlier <devnexen@gmail.com>
2020-08-14 07:53:40 +02:00
Christopher Faulet
d25d926806 MINOR: lua: Add support for userlist as fetches and converters arguments
It means now http_auth() and http_auth_group() sample fetches are now exported
to the lua.
2020-08-07 14:27:54 +02:00
Christopher Faulet
e02fc4d0dd MINOR: arg: Add an argument type to keep a reference on opaque data
The ARGT_PTR argument type may now be used to keep a reference to opaque data in
the argument array used by sample fetches and converters. It is a generic way to
point on data. I guess it could be used for some other arguments, like proxy,
server, map or stick-table.
2020-08-07 14:20:07 +02:00
Ilya Shipitsin
6b79f38a7a CLEANUP: assorted typo fixes in the code and comments
This is 12th iteration of typo fixes
2020-07-31 11:18:07 +02:00
Christopher Faulet
2747fbb7ac MEDIUM: tcp-rules: Use a dedicated expiration date for tcp ruleset
A dedicated expiration date is now used to apply the inspect-delay of the
tcp-request or tcp-response rulesets. Before, the analyse expiratation date was
used but it may also be updated by the lua (at least). So a lua script may
extend or reduce the inspect-delay by side effect. This is not expected. If it
becomes necessary, a specific function will be added to do this. Because, for
now, it is a bit confusing.
2020-07-30 09:31:09 +02:00
Christopher Faulet
810df06145 MEDIUM: htx: Add a flag on a HTX message when no more data are expected
The HTX_FL_EOI flag must now be set on a HTX message when no more data are
expected. Most of time, it must be set before adding the EOM block. Thus, if
there is no space for the EOM, there is still an information to know all data
were received and pushed in the HTX message. There is only an exception for the
HTTP replies (deny, return...). For these messages, the flag is set after all
blocks are pushed in the message, including the EOM block, because, on error,
we remove all inserted data.
2020-07-22 16:43:32 +02:00
Willy Tarreau
f2452b3c70 MINOR: tasks/debug: add a BUG_ON() check to detect requeued task on free
__task_free() cannot be called with a task still in the queue. This
test adds a check which confirms there is no concurrency issue on such
a case where a thread could requeue nor wakeup a task being freed.
2020-07-22 14:42:52 +02:00
Willy Tarreau
e5d79bccc0 MINOR: tasks/debug: add a few BUG_ON() to detect use of wrong timer queue
This aims at catching calls to task_unlink_wq() performed by the wrong
thread based on the shared status for the task, as well as calls to
__task_queue() with the wrong timer queue being used based on the task's
capabilities. This will at least help eliminate some hypothesis during
debugging sessions when suspecting that a wrong thread has attempted to
queue a task at the wrong place.
2020-07-22 14:42:52 +02:00
Willy Tarreau
2447bce554 MINOR: tasks/debug: make the thread affinity BUG_ON check a bit stricter
The BUG_ON() test in task_queue() only tests for the case where
we're queuing a task that doesn't run on the current thread. Let's
refine it a bit further to catch all cases where the task does not
run *exactly* on the current thread alone.
2020-07-22 14:22:38 +02:00
Emeric Brun
d3db3846c5 BUG/MEDIUM: resolve: fix init resolving for ring and peers section.
Reported github issue #759 shows there is no name resolving
on server lines for ring and peers sections.

This patch introduce the resolving for those lines.

This patch adds  boolean a parameter to parse_server function to specify
if we want the function to perform an initial name resolving using libc.

This boolean is forced to true in case of peers or ring section.

The boolean is kept to false in case of classic servers (from
backend/listen)

This patch should be backported in branches where peers sections
support 'server' lines.
2020-07-21 17:59:20 +02:00
Emeric Brun
45c457a629 MINOR: log: adds counters on received syslog messages.
This patch adds a global counter of received syslog messages
and this one is exported on CLI "show info" as "CumRecvLogs".

This patch also updates internal conn counter and freq
of the listener and the proxy for each received log message to
prepare a further export on the "show stats".
2020-07-15 17:50:12 +02:00
Emeric Brun
12941c82d0 MEDIUM: log: adds log forwarding section.
Log forwarding:

It is possible to declare one or multiple log forwarding section,
haproxy will forward all received log messages to a log servers list.

log-forward <name>
  Creates a new log forwarder proxy identified as <name>.

bind <addr> [param*]
  Used to configure a log udp listener to receive messages to forward.
  Only udp listeners are allowed, address must be prefixed using
  'udp@', 'udp4@' or 'udp6@'. This supports for all "bind" parameters
  found in 5.1 paragraph but most of them are irrelevant for udp/syslog case.

log global
log <address> [len <length>] [format <format>] [sample <ranges>:<smp_size>]
    <facility> [<level> [<minlevel>]]
  Used to configure target log servers. See more details on proxies
  documentation.
  If no format specified, haproxy tries to keep the incoming log format.
  Configured facility is ignored, except if incoming message does not
  present a facility but one is mandatory on the outgoing format.
  If there is no timestamp available in the input format, but the field
  exists in output format, haproxy will use the local date.

  Example:
    global
       log stderr format iso local7

    ring myring
        description "My local buffer"
        format rfc5424
        maxlen 1200
        size 32764
        timeout connect 5s
        timeout server 10s
        # syslog tcp server
        server mysyslogsrv 127.0.0.1:514 log-proto octet-count

    log-forward sylog-loadb
        bind udp4@127.0.0.1:1514
        # all messages on stderr
        log global
        # all messages on local tcp syslog server
        log ring@myring local0
        # load balance messages on 4 udp syslog servers
        log 127.0.0.1:10001 sample 1:4 local0
        log 127.0.0.1:10002 sample 2:4 local0
        log 127.0.0.1:10003 sample 3:4 local0
        log 127.0.0.1:10004 sample 4:4 local0
2020-07-15 17:50:12 +02:00
Emeric Brun
54932b4408 MINOR: log: adds syslog udp message handler and parsing.
This patch introduce a new fd handler used to parse syslog
message on udp.

The parsing function returns level, facility and metadata that
can be immediatly reused to forward message to a log server.

This handler is enabled on udp listeners if proxy is internally set
to mode PR_MODE_SYSLOG
2020-07-15 17:50:12 +02:00
Emeric Brun
546488559a MEDIUM: log/sink: re-work and merge of build message API.
This patch merges build message code between sink and log
and introduce a new API based on struct ist array to
prepare message header with zero copy, targeting the
log forwarding feature.

Log format 'iso' and 'timed' are now avalaible on logs line.
A new log format 'priority' is also added.
2020-07-15 17:50:12 +02:00
Emeric Brun
3835c0dcb5 MEDIUM: udp: adds minimal proto udp support for message listeners.
This patch introduce proto_udp.c targeting a further support of
log forwarding feature.

This code was originally produced by Frederic Lecaille working on
QUIC support and only minimal requirements for syslog support
have been merged.
2020-07-15 17:50:12 +02:00
Christopher Faulet
aaa70852d9 MINOR: raw_sock: Report the number of bytes emitted using the splicing
In the continuity of the commit 7cf0e4517 ("MINOR: raw_sock: report global
traffic statistics"), we are now able to report the global number of bytes
emitted using the splicing. It can be retrieved in "show info" output on the
CLI.

Note this counter is always declared, regardless the splicing support. This
eases the integration with monitoring tools plugged on the CLI.
2020-07-15 14:08:14 +02:00
Christopher Faulet
0f9ff14b17 CLEANUP: connection: remove unused field idle_time from the connection struct
Thanks to previous changes, this field is now unused.
2020-07-15 14:08:14 +02:00
Christopher Faulet
c6e7563b1a MINOR: server: Factorize code to deal with connections removed from an idle list
The srv_del_conn_from_list() function is now responsible to update the server
counters and the connection flags when a connection is removed from an idle list
(safe, idle or available). It is called when a connection is released or when a
connection is set as private. This function also removes the connection from the
idle list if necessary.
2020-07-15 14:08:14 +02:00
Christopher Faulet
3d52f0f1f8 MINOR: server: Factorize code to deal with reuse of server idle connections
The srv_use_idle_conn() function is now responsible to update the server
counters and the connection flags when an idle connection is reused. The same
function is called when a new connection is created. This simplifies a bit the
connect_server() function.
2020-07-15 14:08:14 +02:00
Christopher Faulet
15979619c4 MINOR: session: Take care to decrement idle_conns counter in session_unown_conn
So conn_free() only calls session_unown_conn() if necessary. The details are now
fully handled by session_unown_conn().
2020-07-15 14:08:14 +02:00
Christopher Faulet
236c93b108 MINOR: connection: Set the conncetion target during its initialisation
When a new connection is created, its target is always set just after. So the
connection target may set when it is created instead, during its initialisation
to be precise. It is the purpose of this patch. Now, conn_new() function is
called with the connection target as parameter. The target is then passed to
conn_init(). It means the target must be passed when cs_new() is called. In this
case, the target is only used when the conn-stream is created with no
connection. This only happens for tcpchecks for now.
2020-07-15 14:08:14 +02:00
Christopher Faulet
fcc3d8a1c0 MINOR: connection: Use a dedicated function to look for a session's connection
The session_get_conn() must now be used to look for an available connection
matching a specific target for a given session. This simplifies a bit the
connect_server() function.
2020-07-15 14:08:14 +02:00
Christopher Faulet
08016ab82d MEDIUM: connection: Add private connections synchronously in session server list
When a connection is marked as private, it is now added in the session server
list. We don't wait a stream is detached from the mux to do so. When the
connection is created, this happens after the mux creation. Otherwise, it is
performed when the connection is marked as private.

To allow that, when a connection is created, the session is systematically set
as the connectin owner. Thus, a backend connection has always a owner during its
creation. And a private connection has always a owner until its death.

Note that outside the detach() callback, if the call to session_add_conn()
failed, the error is ignored. In this situation, we retry to add the connection
into the session server list in the detach() callback. If this fails at this
step, the multiplexer is destroyed and the connection is closed.
2020-07-15 14:08:14 +02:00
Christopher Faulet
21ddc74e8a MINOR: connection: Add a wrapper to mark a connection as private
To set a connection as private, the conn_set_private() function must now be
called. It sets the CO_FL_PRIVATE flags, but it also remove the connection from
the available connection list, if necessary. For now, it never happens because
only HTTP/1 connections may be set as private after their creation. And these
connections are never inserted in the available connection list.
2020-07-15 14:08:14 +02:00
Willy Tarreau
a9d7b76f6a MINOR: connection: use MT_LIST_ADDQ() to add connections to idle lists
When a connection is added to an idle list, it's already detached and
cannot be seen by two threads at once, so there's no point using
TRY_ADDQ, there will never be any conflict. Let's just use the cheaper
ADDQ.
2020-07-10 08:52:13 +02:00
Willy Tarreau
8689127816 MINOR: buffer: use MT_LIST_ADDQ() for buffer_wait lists additions
The TRY_ADDQ there was not needed since the wait list is exclusively
owned by the caller. There's a preliminary test on MT_LIST_ADDED()
that might have been eliminated by keeping MT_LIST_TRY_ADDQ() but
it would have required two more expensive writes before testing so
better keep the test the way it is.
2020-07-10 08:52:13 +02:00
Willy Tarreau
de4db17dee MINOR: lists: rename some MT_LIST operations to clarify them
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.

But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:

   MT_LIST_ADD{,Q}    : inconditional operation, the caller owns the
                        element, and doesn't care about the element's
                        current state (exactly like LIST_ADD)
   MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
                        already added or in the process of being added.

This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.

Note that the missing unchecked MT_LIST_ADD macro was added.

The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
2020-07-10 08:50:41 +02:00
MIZUTA Takeshi
b24bc0dfb6 MINOR: tcp: Support TCP keepalive parameters customization
It is now possible to customize TCP keepalive parameters.
These correspond to the socket options TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL
and are valid for the defaults, listen, frontend and backend sections.

This patch fixes GitHub issue #670.
2020-07-09 05:22:16 +02:00
Willy Tarreau
3b8f9b7b88 BUG/MEDIUM: lists: add missing store barrier in MT_LIST_ADD/MT_LIST_ADDQ
The torture test run for previous commit 787dc20 ("BUG/MEDIUM: lists: add
missing store barrier on MT_LIST_BEHEAD()") finally broke again after 34M
connections. It appeared that MT_LIST_ADD and MT_LIST_ADDQ were suffering
from the same missing barrier when restoring the original pointers before
giving up, when checking if the element was already added. This is indeed
something which seldom happens with the shared scheduler, in case two
threads simultaneously try to wake up the same tasklet.

With a store barrier there after reverting the pointers, the torture test
survived 750M connections on the NanoPI-Fire3, so it looks good this time.
Probably that MT_LIST_BEHEAD should be added to test-list.c since it seems
to be more sensitive to concurrent accesses with MT_LIST_ADDQ.

It's worth noting that there is no barrier between the last two pointers
update, while there is one in MT_LIST_POP and MT_LIST_BEHEAD, the latter
having shown to be needed, but I cannot demonstrate why we would need
one here. Given that the code seems solid here, let's stick to what is
shown to work.

This fix should be backported to 2.1, just for the sake of safety since
the issue couldn't be triggered there, but it could change with the
compiler or when backporting a fix for example.
2020-07-09 05:01:27 +02:00
Willy Tarreau
787dc20952 BUG/MEDIUM: lists: add missing store barrier on MT_LIST_BEHEAD()
When running multi-threaded tests on my NanoPI-Fire3 (8 A53 cores), I
managed to occasionally get either a bus error or a segfault in the
scheduler, but only when running at a high connection rate (injecting
on a tcp-request connection reject rule). The bug is rare and happens
around once per million connections. I could never reproduce it with
less than 4 threads nor on A72 cores.

Haproxy 2.1.0 would also fail there but not 2.1.7.

Every time the crash happened with the TL_URGENT task list corrupted,
though it was not immediately after the LIST_SPLICE() call, indicating
background activity survived the MT_LIST_BEHEAD() operation. This queue
is where the shared runqueue is transferred, and the shared runqueue
gets fast inter-thread tasklet wakeups from idle conn takeover and new
connections.

Comparing the MT_LIST_BEHEAD() and MT_LIST_DEL() implementations, it's
quite obvious that a few barriers are missing from the former, and these
will simply fail on weakly ordered caches.

Two store barriers were added before the break() on failure, to match
what is done on the normal path. Missing them almost always results in
a segfault which is quite rare but consistent (after ~3M connections).

The 3rd one before updating n->prev seems intuitively needed though I
coudln't make the code fail without it. It's present in MT_LIST_DEL so
better not be needlessly creative. The last one is the most important
one, and seems to be the one that helps a concurrent MT_LIST_ADDQ()
detect a late failure and try again. With this, the code survives at
least 30M connections.

Interestingly the exact same issue was addressed in 2.0-dev2 for MT_LIST_DEL
with commit 690d2ad4d ("BUG/MEDIUM: list: add missing store barriers when
updating elements and head").

This fix must be backported to 2.1 as MT_LIST_BEHEAD() is also used there.

It's only tagged as medium because it will only affect entry-level CPUs
like Cortex A53 (x86 are not affected), and requires load levels that are
very hard to achieve on such machines to trigger it. In practice it's
unlikely anyone will ever hit it.
2020-07-08 19:45:50 +02:00
Willy Tarreau
e3cb9978c2 MINOR: version: back to development, update status message
Update the status message and update INSTALL again.
2020-07-07 16:38:51 +02:00
Willy Tarreau
33205c23a7 [RELEASE] Released version 2.3-dev0
Released version 2.3-dev0 with the following main changes :
    - exact copy of 2.2.0
2020-07-07 16:35:28 +02:00
Willy Tarreau
44c47de81a MINOR: version: mention that it's an LTS release now
The new version is going to be LTS up to around Q2 2025.
2020-07-07 16:31:52 +02:00
William Lallemand
7d42ef5b22 WIP/MINOR: ssl: add sample fetches for keylog in frontend
OpenSSL 1.1.1 provides a callback registering function
SSL_CTX_set_keylog_callback, which allows one to receive a string
containing the keys to deciphers TLSv1.3.

Unfortunately it is not possible to store this data in binary form and
we can only get this information using the callback. Which means that we
need to store it until the connection is closed.

This patches add 2 pools, the first one, pool_head_ssl_keylog is used to
store a struct ssl_keylog which will be inserted as a ex_data in a SSL *.
The second one is pool_head_ssl_keylog_str which will be used to store
the hexadecimal strings.

To enable the capture of the keys, you need to set "tune.ssl.keylog on"
in your configuration.

The following fetches were implemented:

ssl_fc_client_early_traffic_secret,
ssl_fc_client_handshake_traffic_secret,
ssl_fc_server_handshake_traffic_secret,
ssl_fc_client_traffic_secret_0,
ssl_fc_server_traffic_secret_0,
ssl_fc_exporter_secret,
ssl_fc_early_exporter_secret
2020-07-06 19:08:03 +02:00
Ilya Shipitsin
46a030cdda CLEANUP: assorted typo fixes in the code and comments
This is 11th iteration of typo fixes
2020-07-06 14:34:32 +02:00
Willy Tarreau
b0be8ae2a8 CLEANUP: auth: fix useless self-include of auth-t.h
Since recent include cleanups auth-t.h ended up including itself.
2020-07-05 21:32:47 +02:00
Willy Tarreau
0c439d8956 BUILD: tools: make resolve_sym_name() return a const
Originally it was made to return a void* because some comparisons in the
code where it was used required a lot of casts. But now we don't need
that anymore. And having it non-const breaks the build on NetBSD 9 as
reported in issue #728.

So let's switch to const and adjust debug.c to accomodate this.
2020-07-05 20:26:04 +02:00
Olivier Houchard
a74bb7e26e BUG/MEDIUM: connections: Let the xprt layer know a takeover happened.
When we takeover a connection, let the xprt layer know. If it has its own
tasklet, and it is already scheduled, then it has to be destroyed, otherwise
it may run the new mux tasklet on the old thread.

Note that we only do this for the ssl xprt for now, because the only other
one that might wake the mux up is the handshake one, which is supposed to
disappear before idle connections exist.

No backport is needed, this is for 2.2.
2020-07-03 17:49:33 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which would randomly reject ~1% of the incoming
connection and randomly killing some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s otherwise it's necessary to restart to
reset the idle connections lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Willy Tarreau
43079e0731 MINOR: sched: split tasklet_wakeup() into tasklet_wakeup_on()
tasklet_wakeup() only checks tl->tid to know whether the task is
programmed to run on the current thread or on a specific thread. We'll
have to ease this selection in a subsequent patch, preferably without
modifying tl->tid, so let's have a new tasklet_wakeup_on() function
to specify the thread number to run on. That the logic has not changed
at all.
2020-07-03 17:19:47 +02:00
Emeric Brun
9f9b22c4f1 MINOR: log: add time second fraction field to rfc5424 log timestamp.
This patch adds the time second fraction in microseconds
as supported by the rfc.
2020-07-02 17:56:06 +02:00
Willy Tarreau
dab586c3a8 BUILD: debug: avoid build warnings with DEBUG_MEM_STATS
Some libcs define strdup() as a macro and cause redefine warnings to
be emitted, so let's first undefine all functions we redefine.
2020-07-02 10:25:01 +02:00
Dragan Dosen
1e3b16f74f MINOR: log-format: allow to preserve spacing in log format strings
Now it's possible to preserve spacing everywhere except in "log-format",
"log-format-sd" and "unique-id-format" directives, where spaces are
delimiters and are merged. That may be useful when the response payload
is specified as a log format string by "lf-file" or "lf-string", or even
for headers or anything else.

In order to merge spaces, a new option LOG_OPT_MERGE_SPACES is applied
exclusively on options passed to function parse_logformat_string().

This patch fixes an issue #701 ("http-request return log-format file
evaluation altering spacing of ASCII output/art").
2020-07-02 10:11:44 +02:00
Willy Tarreau
a6026a0c92 MINOR: debug: add a new "debug dev memstats" command
Now when building with -DDEBUG_MEM_STATS, some malloc/calloc/strdup/realloc
stats are kept per file+line number and may be displayed and even reset on
the CLI using "debug dev memstats". This allows to easily track potential
leakers or abnormal usages.
2020-07-02 09:14:48 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.

This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
2020-07-01 19:07:37 +02:00
Olivier Houchard
ff1d0929b8 MEDIUM: connections: Don't use a lock when moving connections to remove.
Make it so we don't have to take a lock while moving a connection from
the idle list to the toremove_list by taking advantage of the MT_LIST.
2020-07-01 17:09:19 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Olivier Houchard
bbee1f7e78 MINOR: list: Add MT_LIST_DEL_SAFE_NOINIT() and MT_LIST_ADDQ_NOCHECK()
Add two new macros, MT_LIST_DEL_SAFE_NOINIT makes sure we remove the
element from the list, without reinitializing its next and prev, and
MT_LIST_ADDQ_NOCHECK is similar to MT_LIST_ADDQ(), except it doesn't check
if the element is already in a list.
The goal is to be able to move an element from a list we're currently
parsing to another, keeping it locked in the meanwhile.
2020-07-01 17:04:00 +02:00
Willy Tarreau
eb8c2c69fa MEDIUM: sched: implement task_kill() to kill a task
task_kill() may be used by any thread to kill any task with less overhead
than a regular wakeup. In order to achieve this, it bypasses the priority
tree and inserts the task directly into the shared tasklets list, cast as
a tasklet. The task_list_size is updated to make sure it is properly
decremented after execution of this task. The task will thus be picked by
process_runnable_tasks() after checking the tree and sent to the TL_URGENT
list, where it will be processed and killed.

If the task is bound to more than one thread, its first thread will be the
one notified.

If the task was already queued or running, nothing is done, only the flag
is added so that it gets killed before or after execution. Of course it's
the caller's responsibility to make sur any resources allocated by this
task were already cleaned up or taken over.
2020-07-01 16:35:53 +02:00
Willy Tarreau
8a6049c268 MEDIUM: sched: create a new TASK_KILLED task flag
This flag, when set, will be used to indicate that the task must die.
At the moment this may only be placed by the task itself or by the
scheduler when placing it into the TL_NORMAL queue.
2020-07-01 16:35:49 +02:00
Willy Tarreau
364f25a688 MINOR: backend: don't always takeover from the same threads
The next thread walking algorithm in commit 566df309c ("MEDIUM:
connections: Attempt to get idle connections from other threads.")
proved to be sufficient for most cases, but it still has some rough
edges when threads are unevenly loaded. If one thread wakes up with
10 streams to process in a burst, it will mainly take over connections
from the next one until it doesn't have anymore.

This patch implements a rotating index that is stored into the server
list and that any thread taking over a connection is responsible for
updating. This way it starts mostly random and avoids always picking
from the same place. This results in a smoother distribution overall
and a slightly lower takeover rate.
2020-07-01 16:07:43 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
2020-07-01 15:23:15 +02:00
Willy Tarreau
35e30c9670 BUG/MINOR: server: fix the connection release logic regarding nearly full conditions
There was a logic bug in commit ddfe0743d ("MEDIUM: server: use the two
thresholds for the connection release algorithm"): instead of keeping
only our first idle connection when FDs become scarce, the condition was
inverted resulting in enforcing this constraint unless FDs are scarce.
This results in less idle connections than permitted to be kept under
normal condition.

No backport needed.
2020-07-01 14:14:29 +02:00
Willy Tarreau
daf8aa62a8 MINOR: pools: increase MAX_BASE_POOLS to 64
When not sharing pools (i.e. when building with -DDEBUG_DONT_SHARE_POOLS)
we have about 47 pools right now, while MAX_BASE_POOLS is only 32, meaning
that only the first 32 ones will benefit from a per-thread cache entry.
This totally kills performance when pools are not shared (roughly -20%).

Let's double the limit to gain some margin, and make it possible to set
it as a build option.

It might be useful to backport this to stable versions as they're likely
to be affected as well.
2020-06-30 14:29:02 +02:00
Willy Tarreau
ddfe0743d8 MEDIUM: server: use the two thresholds for the connection release algorithm
The algorithm improvement in bdb86bd ("MEDIUM: server: improve estimate
of the need for idle connections") is still not enough because there's
a hard limit between below and above the FD count, so it continues to
end up with many killed connections.

Here we're proceeding differently. Given that there are two configured
limits, a low and a high one, what we do is that we drop connections
when the high limit is reached (what's already done by the killing task
anyway), when we're between the low and the high threshold, we only keep
the connection if our idle entries are empty (with a preference for safe
ones), and below the low threshold, we keep any connection so as to give
them a chance of being reused or taken over by another thread.

Proceeding like this results in much less dropped connections, we
typically see a 99.3% reuse rate (76k conns for 10M requests over 200
servers and 4 threads, with 335k takeovers or 3%), and much less CPU
usage variations because there are no more bursts to try to kill extra
connections.

It should be possible to further improve this by counting the number
of threads exploiting a server and trying to optimize the amount of
per-thread idle connections so that it is approximately balanced among
the threads.
2020-06-29 21:54:38 +02:00
Willy Tarreau
e69282a03f BUG/MINOR: server: always count one idle slot for current thread
The idle server connection estimates brought in commit bdb86bd ("MEDIUM:
server: improve estimate of the need for idle connections") were committed
without the minimum of 1 idle conn needed for the current thread. The net
effect is that there are bursts of dropped connections when the load varies
because there's no provision for the last connection.

No backport needed, this is 2.2-dev.
2020-06-29 21:54:38 +02:00
Willy Tarreau
d59946e673 Revert "BUG/MEDIUM: lists: Lock the element while we check if it is in a list."
This reverts previous commit 347bbf79d20e1cff57075a8a378355dfac2475e2i.

The original code was correct. This patch resulted from a mistaken analysis
and breaks the scheduler:

 ########################## Starting vtest ##########################
 Testing with haproxy version: 2.2-dev11-90b7d9-23
 #    top  TEST reg-tests/lua/close_wait_lf.vtc TIMED OUT (kill -9)
 #    top  TEST reg-tests/lua/close_wait_lf.vtc FAILED (10.008) signal=9
 1 tests failed, 0 tests skipped, 88 tests passed

 Program terminated with signal SIGABRT, Aborted.
 [Current thread is 1 (Thread 0x7fb0dac2c700 (LWP 11292))]
 (gdb) bt
 #0  0x00007fb0e7c143f8 in raise () from /lib64/libc.so.6
 #1  0x00007fb0e7c15ffa in abort () from /lib64/libc.so.6
 #2  0x000000000053f5d6 in ha_panic () at src/debug.c:269
 #3  0x00000000005a6248 in wdt_handler (sig=14, si=<optimized out>, arg=<optimized out>) at src/wdt.c:119
 #4  <signal handler called>
 #5  0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
 #6  listener_accept (fd=<optimized out>) at src/listener.c:999
 #7  0x00000000004262df in fd_update_events (evts=<optimized out>, fd=6) at include/haproxy/fd.h:418
 #8  _do_poll (p=<optimized out>, exp=<optimized out>, wake=<optimized out>) at src/ev_epoll.c:251
 #9  0x0000000000548d0f in run_poll_loop () at src/haproxy.c:2949
 #10 0x000000000054908b in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3067
 #11 0x00007fb0e902b684 in start_thread () from /lib64/libpthread.so.0
 #12 0x00007fb0e7ce5eed in clone () from /lib64/libc.so.6
 (gdb) up
 #5  0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
 351                     if (MT_LIST_ADDQ(&task_per_thread[tl->tid].shared_tasklet_list, (struct mt_list *)&tl->list) == 1) {

If the commit above is ever backported, this one must be as well!
2020-06-29 21:54:37 +02:00
Olivier Houchard
347bbf79d2 BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
In MT_LIST_ADDQ() and MT_LIST_ADD() we can't just check if the element is
already in a list, because there's a small race condition, it could be added
between the time we checked, and the time we actually set its next and prev.
So we have to lock it first.

This should be backported to 2.1.
2020-06-29 19:59:06 +02:00
Willy Tarreau
a9fcecbdf3 MINOR: stats: add the estimated need of concurrent connections per server
The max_used_conns value is used as an estimate of the needed number of
connections on a server to know how many to keep open. But this one is
not reported, making it hard to troubleshoot reuse issues. Let's export
it in the sessions/current column.
2020-06-29 16:29:11 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over previous purge period, or is a
    max of what is seen over current period; it differs from max_used_conns
    in that this one is a counter that's reset on each purge period ;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
b159132ea3 MINOR: activity: add per-thread statistics on FD takeover
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
2020-06-29 14:26:05 +02:00
Willy Tarreau
3bb617cfe0 MINOR: stats: add 3 new output values for the per-server idle conn state
The servers have internal states describing the status of idle connections,
unfortunately these were not exported in the stats. This patch adds the 3
following gauges:

 - idle_conn_cur : Current number of unsafe idle connections
 - safe_conn_cur : Current number of safe idle connections
 - used_conn_cur : Current number of connections in use
2020-06-29 14:26:05 +02:00
Willy Tarreau
20dc3cd4a6 MINOR: pools: move the LRU cache heads to thread_info
The LRU cache head was an array of list, which causes false sharing
between 4 to 8 threads in the same cache line. Let's move it to the
thread_info structure instead. There's no need to do the same for the
pool_cache[] array since it's already quite large (32 pointers each).

By doing this the request rate increased by 1% on a 16-thread machine.
2020-06-29 10:36:37 +02:00
Willy Tarreau
c03d7632a5 CLEANUP: pool: only include the type files from types
pool-t.h was mistakenly including the full-blown includes for threads,
lists and api instead of the types, and as such, CONFIG_HAP_LOCAL_POOLS
and CONFIG_HAP_LOCKLESS_POOLS were not visible everywhere.
2020-06-29 10:11:24 +02:00
Willy Tarreau
e4d1505c83 REORG: includes: create tinfo.h for the thread_info struct
The thread_info struct is convenient to store various per-thread info
without having to resort to a painful thread_local storage which is
slow and painful to initialize.

The problem is, by having this one in thread.h it's very difficult to
add more entries there because everyone already includes thread.h so
conversely thread.h cannot reference certain types.

There's no point in having this there, instead let's create a new pair
of files, tinfo{,-t}.h, which declare the structure. This way it will
become possible to extend them with other includes and have certain
files store their own types there.
2020-06-29 09:57:23 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
d79422a0ff BUG/MEDIUM: buffers: always allocate from the local cache first
It looked strange to see pool_evict_from_cache() always very present
on "perf top", but there was actually a reason to this: while b_free()
uses pool_free() which properly disposes the buffer into the local cache
and b_alloc_fast() allocates using pool_get_first() which considers the
local cache, b_alloc_margin() does not consider the local cache as it
only uses __pool_get_first() which only allocates from the shared pools.

The impact is that basically everywhere a buffer is allocated (muxes,
streams, applets), it's always picked from the shared pool (hence
involves locking) and is released to the local one and makes it grow
until it's required to trigger a flush using pool_evict_from_cache().
Buffers usage are thus not thread-local at all, and cause eviction of
a lot of possibly useful objects from the local caches.

Just fixing this results in a 10% request rate increase in an HTTP/1 test
on a 16-thread machine.

This bug was caused by recent commit ed891fd ("MEDIUM: memory: make local
pools independent on lockless pools") merged into 2.2-dev9, so not backport
is needed.
2020-06-28 10:45:35 +02:00
Willy Tarreau
4dc6c860b4 CLEANUP: buffers: remove unused buffer_wq_lock lock
Commit 2104659 ("MEDIUM: buffer: remove the buffer_wq lock") removed
usage of the lock but not the lock itself. It's totally unused, let's
remove it.
2020-06-28 10:45:34 +02:00
Anthonin Bonnefoy
85048f80c9 MINOR: http: Add support for http 413 status
Add 413 http "payload too large" status code. This will allow 413 to be
used in deny_status and errorfile.
2020-06-26 11:30:02 +02:00
Ilya Shipitsin
47d17182f4 CLEANUP: assorted typo fixes in the code and comments
This is 10th iteration of typo fixes
2020-06-26 11:27:28 +02:00
Ilya Shipitsin
f44d155515 BUILD: fix ssl_sample.c when building against BoringSSL
BoringSSL does not have X509_get_X509_PUBKEY
let our emulation level define that for BoringSSL as well

Build log:

src/ssl_sample.o: In function `smp_fetch_ssl_x_key_alg':
/home/travis/build/haproxy/haproxy/src/ssl_sample.c:592: undefined reference to `X509_get_X509_PUBKEY'
clang-7: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:860: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1

travis-ci: https://travis-ci.com/github/haproxy/haproxy/jobs/351670996
2020-06-26 10:33:38 +02:00
Willy Tarreau
c54e5ad9cc MINOR: cfgparse: sanitize the output a little bit
With the rework of the config line parser, we've started to emit a dump
of the initial line underlined by a caret character indicating the error
location. But with extremely large lines it starts to take time and can
even cause trouble to slow terminals (e.g. over ssh), and this becomes
useless. In addition, control characters could be dumped as-is which is
bad, especially when the input file is accidently wrong (an executable).

This patch adds a string sanitization function which isolates an area
around the error position in order to report only that area if the string
is too large. The limit was set to 80 characters, which will result in
roughly 40 chars around the error being reported only, prefixed and suffixed
with "..." as needed. In addition, non-printable characters in the line are
now replaced with '?' so as not to corrupt the terminal. This way invalid
variable names, unmatched quotes etc will be easier to spot.

A typical output is now:

  [ALERT] 176/092336 (23852) : parsing [bad.cfg:8]: forbidden first char in environment variable name at position 811957:
    ...c$PATH$PATH$d(xlc`%?$PATH$PATH$dgc?T$%$P?AH?$PATH$PATH$d(?$PATH$PATH$dgc?%...
                                            ^
2020-06-25 09:43:27 +02:00
Willy Tarreau
e7723bddd7 MEDIUM: tasks: add a tune.sched.low-latency option
Now that all tasklet queues are scanned at once by run_tasks_from_lists(),
it becomes possible to always check for lower priority classes and jump
back to them when they exist.

This patch adds tune.sched.low-latency global setting to enable this
behavior. What it does is stick to the lowest ranked priority list in
which tasks are still present with an available budget, and leave the
loop to refill the tasklet lists if the trees got new tasks or if new
work arrived into the shared urgent queue.

Doing so allows to cut the latency in half when running with extremely
deep run queues (10k-100k), thus allowing forwarding of small and large
objects to coexist better. It remains off by default since it does have
a small impact on large traffic by default (shorter batches).
2020-06-24 12:21:26 +02:00
Willy Tarreau
59153fef86 MINOR: tasks: make run_tasks_from_lists() scan the queues itself
Now process_runnable_tasks is responsible for calculating the budgets
for each queue, dequeuing from the tree, and calling run_tasks_from_lists().
This latter one scans the queues, picking tasks there and respecting budgets.
Note that its name was updated with a plural "s" for this reason.
2020-06-24 12:21:26 +02:00
Willy Tarreau
ba48d5c8f9 MINOR: tasks: pass the queue index to run_task_from_list()
Instead of passing it a pointer to the queue, pass it the queue's index
so that it can perform all the work around current_queue and tl_class_mask.
2020-06-24 12:21:26 +02:00
Willy Tarreau
49f90bf148 MINOR: tasks: add a mask of the queues with active tasklets
It is neither convenient nor scalable to check each and every tasklet
queue to figure whether it's empty or not while we often need to check
them all at once. This patch introduces a tasklet class mask which gets
a bit 1 set for each queue representing one class of service. A single
test on the mask allows to figure whether there's still some work to be
done. It will later be usable to better factor the runqueue code.

Bits are set when tasklets are queued. They're cleared when queues are
emptied. It is possible that a queue is empty but has a bit if a tasklet
was added then removed, but this is not a problem as this is properly
checked for in run_tasks_from_list().
2020-06-24 12:21:26 +02:00
Willy Tarreau
c0a08ba2df MINOR: tasks: make current_queue an index instead of a pointer
It will be convenient to have the tasklet queue number soon, better make
current_queue an index rather than a pointer to the queue. When not currently
running (e.g. from I/O), the index is -1.
2020-06-24 12:21:26 +02:00
William Lallemand
ee8530c65e MINOR: ssl: free the crtlist and the ckch during the deinit()
Add some functions to deinit the whole crtlist and ckch architecture.

It will free all crtlist, crtlist_entry, ckch_store, ckch_inst and their
associated SNI, ssl_conf and SSL_CTX.

The SSL_CTX in the default_ctx and initial_ctx still needs to be free'd
separately.
2020-06-23 20:07:50 +02:00
William Lallemand
7df5c2dc3c BUG/MEDIUM: ssl: fix ssl_bind_conf double free
Since commit 2954c47 ("MEDIUM: ssl: allow crt-list caching"), the
ssl_bind_conf is allocated directly in the crt-list, and the crt-list
can be shared between several bind_conf. The deinit() code wasn't
changed to handle that.

This patch fixes the issue by removing the free of the ssl_conf in
ssl_sock_free_all_ctx().

It should be completed with a patch that free the ssl_conf and the
crt-list.

Fix issue #700.
2020-06-23 20:06:55 +02:00
Willy Tarreau
5bd73063ab BUG/MEDIUM: task: be careful not to run too many tasks at TL_URGENT
A test on large objects revealed a big performance loss from 2.1. The
cause was found to be related to cache locality between scheduled
operations that are batched using tasklets. It happens that we now
have several layers of tasklets and that queuing all these operations
leaves time to let memory objects cool down in the CPU cache, effectively
resulting in halving the performance.

A quick test consisting in putting most unknown tasklets into the BULK
queue almost fixed the performance regression, but this is a wrong
approach as it can also slow down some low-latency transfers or access
to applets like the CLI.

What this patch does instead is to queue unknown tasklets into the same
queue as the current one when tasklet_wakeup() is itself called from a
task/tasklet, otherwise it uses urgent for real I/O (when sched->current
is NULL). This results in the called tasklet being woken up much sooner,
often at the end of the current batch of tasklets.

By doing so, a test on 2 cores 4 threads with 256 concurrent H1 conns
transferring 16m objects with 256kB buffers jumped from 55 to 88 Gbps.
It's even possible to go as high as 101 Gbps by evaluating the URGENT
queue after the BULK one, though this was not done as considered
dangerous for latency sensitive operations.

This reinforces the importance of getting back the CPU transfer
mechanisms based on tasklet_wakeup_after() to work at the tasklet level
by supporting an immediate wakeup in certain cases.

No backport is needed, this is strictly 2.2.
2020-06-23 16:45:28 +02:00
Willy Tarreau
116ef223d2 MINOR: task: add a new pointer to current tasklet queue
In task_per_thread[] we now have current_queue which is a pointer to
the current tasklet_list entry being evaluated. This will be used to
know the class under which the current task/tasklet is currently
running.
2020-06-23 16:35:38 +02:00
Willy Tarreau
38e8a1c7b8 MINOR: debug: add a new DEBUG_FD build option
When DEBUG_FD is set at build time, we'll keep a counter of per-FD events
in the fdtab. This counter is reported in "show fd" even for closed FDs if
not zero. The purpose is to help spot situations where an apparently closed
FD continues to be reported in loops, or where some events are dismissed.
2020-06-23 10:04:54 +02:00
Willy Tarreau
d1d005d7f6 MEDIUM: map: make the "clear map" operation yield
As reported in issue #419, a "clear map" operation on a very large map
can take a lot of time and freeze the entire process for several seconds.

This patch makes sure that pat_ref_prune() can regularly yield after
clearing some entries so that the rest of the process continues to work.
The first part, the removal of the patterns, can take quite some time
by itself in one run but it's still relatively fast. It may block for
up to 100ms for 16M IP addresses in a tree typically. This change needed
to declare an I/O handler for the clear operation so that we can get
back to it after yielding.

The second part can be much slower because it deconstructs the elements
and its users, but it iterates progressively so we can yield less often
here.

The patch was tested with traffic in parallel sollicitating the map being
released and showed no problem. Some traffic will definitely notice an
incomplete map but the filling is already not atomic anyway thus this is
not different.

It may be backported to stable versions once sufficiently tested for side
effects, at least as far as 2.0 in order to avoid the watchdog triggering
when the process is frozen there. For a better behaviour, all these
prune_* functions should support yielding so that the callers have a
chance to continue also yield in turn.
2020-06-19 16:57:51 +02:00
Willy Tarreau
bc52bec163 MEDIUM: fd: add experimental support for edge-triggered polling
Some of the recent optimizations around the polling to save a few
epoll_ctl() calls have shown that they could also cause some trouble.
However, over time our code base has become totally asynchronous with
I/Os always attempted from the upper layers and only retried at the
bottom, making it look like we're getting closer to EPOLLET support.

There are showstoppers there such as the listeners which cannot support
this. But given that most of the epoll_ctl() dance comes from the
connections, we can try to enable edge-triggered polling on connections.

What this patch does is to add a new global tunable "tune.fd.edge-triggered",
that makes fd_insert() automatically set an et_possible bit on the fd if
the I/O callback is conn_fd_handler. When the epoll code sees an update
for such an FD, it immediately registers it in both directions the first
time and doesn't update it anymore.

On a few tests it proved quite useful with a 14% request rate increase in
a H2->H1 scenario, reducing the epoll_ctl() calls from 2 per request to
2 per connection.

The option is obviously disabled by default as bugs are still expected,
particularly around the subscribe() code where it is possible that some
layers do not always re-attempt reading data after being woken up.
2020-06-19 14:21:46 +02:00
Dragan Dosen
13cd54c08b MEDIUM: peers: add the "localpeer" global option
localpeer <name>
  Sets the local instance's peer name. It will be ignored if the "-L"
  command line argument is specified or if used after "peers" section
  definitions. In such cases, a warning message will be emitted during
  the configuration parsing.

  This option will also set the HAPROXY_LOCALPEER environment variable.
  See also "-L" in the management guide and "peers" section in the
  configuration manual.
2020-06-19 11:37:30 +02:00
Dragan Dosen
4f01415d3b MINOR: peers: do not use localpeer as an array anymore
It is now dynamically allocated by using strdup().
2020-06-19 11:37:11 +02:00
Willy Tarreau
7af4fa9a48 MINOR: activity: rename the "stream" field to "stream_calls"
This one was confusingly called, I thought it was the cumulated number
of streams but it's the number of calls to process_stream(). Let's make
this clearer.
2020-06-17 20:52:29 +02:00
Willy Tarreau
e406386542 MINOR: activity: rename confusing poll_* fields in the output
We have poll_drop, poll_dead and poll_skip which are confusingly named
like their poll_io and poll_exp counterparts except that they are not
per poll() call but per-fd. This patch renames them to poll_drop_fd(),
poll_dead_fd() and poll_skip_fd() for this reason.
2020-06-17 20:35:33 +02:00
Willy Tarreau
e545153c50 MINOR: activity: report the number of times poll() reports I/O
The "show activity" output mentions a number of indicators to explain
wake up reasons but doesn't have the number of times poll() sees some
I/O. And given that multiple events can happen simultaneously, it's
not always possible to deduce this metric by subtracting.

This patch adds a new "poll_io" counter that allows one to see how
often poll() returns with at least one active FD. This should help
detect stuck events and measure various ratios of poll sub-metrics.
2020-06-17 20:25:18 +02:00
Willy Tarreau
c208a54ab2 DOC: fd: make it clear that some fields ordering must absolutely be respected
fd_set_running() and fd_takeover() may both use a double-word CAS on the
(running_mask, thread_mask) couple and as such they expect the fields to
be exactly arranged like this. It's critical not to reorder them, so add
a comment to avoid such a potential mistake later.
2020-06-17 19:58:37 +02:00
Willy Tarreau
4f72ec851c CLEANUP: activity: remove unused counter fd_lock
Since 2.1-dev2, with commit 305d5ab46 ("MAJOR: fd: Get rid of the fd cache.")
we don't have the fd_lock anymore and as such its acitvity counter is always
zero. Let's remove it from the struct and from "show activity" output, as
there are already plenty of indicators to look at.

The cache line comment in the struct activity was updated to reflect
reality as it looks like another one already got removed in the past.
2020-06-17 19:15:51 +02:00
Willy Tarreau
6d4c81db96 MINOR: compiler: always define __has_feature()
This macro is provided by clang but gcc lacks it. Not having it makes
it painful to test features on both compilers. Better define it to zero
when not available so that __has_feature(foo) never errors.
2020-06-16 19:13:24 +02:00
Willy Tarreau
c8d167bcfb MINOR: tools: add a new configurable line parse, parse_line()
This function takes on input a string to tokenize, an output storage
(which may be the same) and a number of options indicating how to handle
certain characters (single & double quote support, backslash support,
end of line on '#', environment variables etc). On output it will provide
a list of pointers to individual words after having possibly unescaped
some character sequences, handled quotes and resolved environment
variables, and it will also indicate a status made of:
  - a list of failures (overlap between src/dst, wrong quote etc)
  - the pointer to the first sequence in error
  - the required output length (a-la snprintf()).

This allows a caller to freely unescape/unquote a string by using a
pre-allocated temporary buffer and expand it as necessary. It takes
extreme care at avoiding expensive operations and intentionally does
not use memmove() when removing escapes, hence the reason for the
different input and output buffers. The goal is to use it as the basis
for the config parser.
2020-06-16 16:27:26 +02:00
Willy Tarreau
853926a9ac BUG/MEDIUM: ebtree: use a byte-per-byte memcmp() to compare memory blocks
As reported in issue #689, there is a subtle bug in the ebtree code used
to compared memory blocks. It stems from the platform-dependent memcmp()
implementation. Original implementations used to perform a byte-per-byte
comparison and to stop at the first non-matching byte, as in this old
example:

   https://www.retro11.de/ouxr/211bsd/usr/src/lib/libc/compat-sys5/memcmp.c.html

The ebtree code has been relying on this to detect the first non-matching
byte when comparing keys. This is made so that a zero-terminated string
can fail to match against a longer string.

Over time, especially with large busses and SIMD instruction sets,
multi-byte comparisons have appeared, making the processor fetch bytes
past the first different byte, which could possibly be a trailing zero.
This means that it's possible to read past the allocated area for a
string if it was allocated by strdup().

This is not correct and definitely confuses address sanitizers. In real
life the problem doesn't have visible consequences. Indeed, multi-byte
comparisons are implemented so that aligned words are loaded (e.g. 512
bits at once to process a cache line at a time). So there is no way such
a multi-byte access will cross a page boundary and end up reading from
an unallocated zone. This is why it was never noticed before.

This patch addresses this by implementing a one-byte-at-a-time memcmp()
variant for ebtree, called eb_memcmp(). It's optimized for both small and
long strings and guarantees to stop after the first non-matching byte. It
only needs 5 instructions in the loop and was measured to be 3.2 times
faster than the glibc's AVX2-optimized memcmp() on short strings (1 to
257 bytes), since that latter one comes with a significant setup cost.
The break-even seems to be at 512 bytes where both version perform
equally, which is way longer than what's used in general here.

This fix should be backported to stable versions and reintegrated into
the ebtree code.
2020-06-16 11:30:33 +02:00
Willy Tarreau
f3ca5a0273 BUILD: haproxy: mark deinit_and_exit() as noreturn
Commit 0a3b43d9c ("MINOR: haproxy: Make use of deinit_and_exit() for
clean exits") introduced this build warning:

  src/haproxy.c: In function 'main':
  src/haproxy.c:3775:1: warning: control reaches end of non-void function [-Wreturn-type]
   }
   ^

This is because the new deinit_and_exit() is not marked as "noreturn"
so depending on the optimizations, the noreturn attribute of exit() will
either leak through it and silence the warning or not and confuse the
compiler. Let's just add the attribute to fix this.

No backport is needed, this is purely 2.2.
2020-06-15 18:43:46 +02:00
Willy Tarreau
bcefb85009 BUILD: atomic: add string.h for memcpy() on ARM64
As reported in issue #686, ARM64 build fails since the include files
reorganization. This is caused by the lack of string.h while a memcpy()
is present in __ha_cas_dw().
2020-06-14 08:08:13 +02:00
Tim Duesterhus
2654055316 MINOR: haproxy: Add void deinit_and_exit(int)
This helper function calls deinit() and then exit() with the given status.
2020-06-14 07:39:42 +02:00
Willy Tarreau
db57a142c3 BUILD: thread: add parenthesis around values of locking macros
clang just failed on fd.c with this error:

  src/fd.c:491:9: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
          while (HA_SPIN_TRYLOCK(OTHER_LOCK, &log_lock) != 0) {
                 ^                                      ~~
That's because this expands to this:

          while (!pl_try_s(&log_lock) != 0) {

Let's just add parenthesis in the TRYLOCK macros to avoid this.
This may need to be backported if commit df187875d ("BUG/MEDIUM: log:
don't hold the log lock during writev() on a file descriptor") is
backported as well as it seems to be the first one to trigger it.
2020-06-12 11:46:44 +02:00
Willy Tarreau
7c18b54106 REORG: dgram: rename proto_udp to dgram
The set of files proto_udp.{c,h} were misleadingly named, as they do not
provide anything related to the UDP protocol but to datagram handling
instead, since currently all UDP processing is hard-coded where it's used
(dns, logs). They are to UDP what connection.{c,h} are to proto_tcp. This
was causing confusion about how to insert UDP socket management code,
so let's rename them right now to dgram.{c,h} which more accurately
matches what's inside since every function and type is already prefixed
with "dgram_".
2020-06-11 10:18:59 +02:00
Willy Tarreau
e5793916f0 REORG: include: make list-t.h part of the base API
There are list definitions everywhere in the code, let's drop the need
for including list-t.h to declare them. The rest of the list manipulation
is huge however and not needed everywhere so using the list walking macros
still requires to include list.h.
2020-06-11 10:18:59 +02:00
Willy Tarreau
b2551057af CLEANUP: include: tree-wide alphabetical sort of include files
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
2020-06-11 10:18:59 +02:00
Willy Tarreau
5b9cde4820 REORG: include: move THREAD_LOCAL and __decl_thread() to compiler.h
Since these are used as type attributes or conditional clauses, they
are used about everywhere and should not require a dependency on
thread.h. Moving them to compiler.h along with other similar statements
like ALIGN() etc looks more logical; this way they become part of the
base API. This allowed to remove thread-t.h from ~12 files, one was
found to only require thread-t and not thread and dict.c was found to
require thread.h.
2020-06-11 10:18:59 +02:00
Willy Tarreau
ca8b069aa7 REORG: include: move MAX_THREADS to defaults.h
That's already where MAX_PROCS is set, and we already handle the case of
the default value so there is no reason for placing it in thread.h given
that most call places don't need the rest of the threads definitions. The
include was removed from global-t.h and activity.c.
2020-06-11 10:18:59 +02:00
Willy Tarreau
6784c99463 CLEANUP: include: make atomic.h part of the base API
Atomic ops are used about everywhere, let's make them part of the base
API by including atomic.h in api.h.
2020-06-11 10:18:59 +02:00
Willy Tarreau
8e3f5c6661 CLEANUP: compiler: add a THREAD_ALIGNED macro and use it where appropriate
Sometimes we need to align a struct member or a struct's size only when
threads are enabled. This is the case on fdtab for example. Instead of
using ugly ifdefs in the code itself, let's have a THREAD_ALIGNED() macro
performing the alignment only when threads are enabled. For now this was
only applied to fd-t.h as it was the only place found.
2020-06-11 10:18:59 +02:00
Willy Tarreau
36979d9ad5 REORG: include: move the error reporting functions to from log.h to errors.h
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().

This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
2020-06-11 10:18:59 +02:00
Willy Tarreau
251c2aae06 CLEANUP: include: move sample_data out of sample-t.h
The struct sample_data is used by pattern, map and vars, and currently
requires to include sample-t which comes with many other dependencies.
Let's move sample_data into its own file to shorten the dependency tree.
This revealed a number of issues in adjacent files which were hidden by
the fact that sample-t.h brought everything that was missing.
2020-06-11 10:18:59 +02:00
Willy Tarreau
4f663ec022 CLEANUP: include: don't include proxy-t.h in global-t.h
We only need a forward declaration here to avoid embarking lots of
files, and by just doing this we reduce the build size by 3.5%.
2020-06-11 10:18:59 +02:00
Willy Tarreau
d62af6abe4 CLEANUP: include: don't include stddef.h directly
Directly including stddef.h in many files results in it being processed
multiple times while it can be centralized in api-t.h and be guarded
against multiple inclusions. Doing so reduces the number of preprocessed
lines by 1200!
2020-06-11 10:18:59 +02:00
Willy Tarreau
bcc6733fab REORG: check: extract the external checks from check.{c,h}
The health check code is ugly enough, let's take the external checks
out of it to simplify the code and shrink the file a little bit.
2020-06-11 10:18:58 +02:00
Willy Tarreau
d604ace940 REORG: check: move email_alert* from proxy-t.h to mailers-t.h
These ones are specific to mailers and have nothing to do in proxy-t.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
51cd5956ee REORG: check: move tcpchecks away from check.c
Checks.c remains one of the largest file of the project and it contains
too many things. The tcpchecks code represents half of this file, and
both parts are relatively isolated, so let's move it away into its own
file. We now have tcpcheck.c, tcpcheck{,-t}.h.

Doing so required to export quite a number of functions because check.c
has almost everything made static, which really doesn't help to split!
2020-06-11 10:18:58 +02:00
Willy Tarreau
cee013e4e0 REORG: check: move the e-mail alerting code to mailers.c
check.c is one of the largest file and contains too many things. The
e-mail alerting code is stored there while nothing is in mailers.c.
Let's move this code out. That's only 4% of the code but a good start.
In order to do so, a few tcp-check functions had to be exported.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4f6535d734 CLEANUP: hpack: export debug functions and move inlines to .h
When building contrib/hpack there is a warning about an unused static
function. Actually it makes no sense to make it static, instead it must
be regularly exported. Similarly there is hpack_dht_get_tail() which is
inlined in the C file and which would make more sense with all other ones
in the H file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
6be7849f39 REORG: include: move cfgparse.h to haproxy/cfgparse.h
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.

There is nothing left in common/ now so this directory must not be used
anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
1e56f92693 REORG: include: move server.h to haproxy/server{,-t}.h
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a55c45470f REORG: include: move queue.h to haproxy/queue{,-t}.h
Nothing outstanding here. A number of call places were not justified and
removed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4980160ecc REORG: include: move backend.h to haproxy/backend{,-t}.h
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
6c58ab0304 REORG: include: move spoe.h to haproxy/spoe{,-t}.h
Only minor change was to make sure all defines were before the structs
in spoe-t.h, everything else went smoothly.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a264d960f6 REORG: include: move proxy.h to haproxy/proxy{,-t}.h
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
2020-06-11 10:18:58 +02:00
Willy Tarreau
aeed4a85d6 REORG: include: move log.h to haproxy/log{,-t}.h
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.

Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.

A good approach would consist in splitting that file in 3 parts:
  - one for error output ("errors" ?).
  - one for log_format processing
  - and one for actual logging.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c6599682d5 REORG: include: move fcgi-app.h to haproxy/fcgi-app{,-t}.h
Only arg-t.h was missing from the types to get arg_list.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c7babd8570 REORG: include: move filters.h to haproxy/filters{,-t}.h
Just a minor change, moved the macro definitions upwards. A few caller
files were updated since they didn't need to include it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c2b1ff04e5 REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f1d32c475c REORG: include: move channel.h to haproxy/channel{,-t}.h
The files were moved with no change. The callers were cleaned up a bit
and a few of them had channel.h removed since not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5e539c9b8d REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h
Almost no changes, removed stdlib and added buf-t and connection-t to
the types to avoid a warning.
2020-06-11 10:18:58 +02:00
Willy Tarreau
209108dbbd REORG: include: move ssl_sock.h to haproxy/ssl_sock{,-t}.h
Almost nothing changed, just moved a static inline at the end and moved
an export from the types to the main file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2867159d63 REORG: include: move lb_map.h to haproxy/lb_map{,-t}.h
Nothing was changed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
83487a833c REORG: include: move cli.h to haproxy/cli{,-t}.h
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2eec9b5f95 REORG: include: move stats.h to haproxy/stats{,-t}.h
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
3f0f82e7a9 REORG: move applet.h to haproxy/applet{,-t}.h
The type file was slightly tidied. The cli-specific APPCTX_CLI_ST1_* flag
definitions were moved to cli.h. The type file was adjusted to include
buf-t.h and not the huge buf.h. A few call places were fixed because they
did not need this include.
2020-06-11 10:18:58 +02:00
Willy Tarreau
8c42b8a147 REORG: include: split common/uri_auth.h into haproxy/uri_auth{,-t}.h
Initially it looked like this could have been placed into auth.h or
stats.h but it's not the case as it's what makes the link between them
and the HTTP layer. However the file needed to be split in two. Quite
a number of call places were dropped because these were mostly leftovers
from the early days where the stats and cli were packed together.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dcc048a14a REORG: include: move acl.h to haproxy/acl.h{,-t}.h
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c6d61d762f REORG: include: move trace.h to haproxy/trace{,-t}.h
Only thread-t was added to satisfy THREAD_LOCAL but the rest was OK.
2020-06-11 10:18:58 +02:00
Willy Tarreau
48d25b3bc9 REORG: include: move session.h to haproxy/session{,-t}.h
Almost no change was needed beyond a little bit of reordering of the
types file and adjustments to use session-t instead of session at a
few places.
2020-06-11 10:18:58 +02:00
Willy Tarreau
872f2ea209 REORG: include: move stick_table.h to haproxy/stick_table{,-t}.h
The stktable_types[] array declaration was moved to the main file as
it had nothing to do in the types. A few declarations were reordered
in the types file so that defines were before the structs. Thread-t
was added since there are a few __decl_thread(). The loss of peers.h
revealed that cfgparse-listen needed it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
3c2a7c2788 REORG: include: move peers.h to haproxy/peers{,-t}.h
The cfg_peers external declaration was moved to the main file instead
of the type one. A few types were still missing from the proto, causing
warnings in the functions prototypes (proxy, stick_table).
2020-06-11 10:18:58 +02:00
Willy Tarreau
126ba3a1e1 REORG: include: move http_fetch.h to haproxy/http_fetch.h
There's no type file for this trivial one. The unneeded dependency on
htx.h was dropped.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4aa573da6f REORG: include: move checks.h to haproxy/check{,-t}.h
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).

The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
2020-06-11 10:18:58 +02:00
Willy Tarreau
7ea393d95e REORG: include: move connection.h to haproxy/connection{,-t}.h
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
2020-06-11 10:18:58 +02:00
Willy Tarreau
8b550afe1e REORG: include: move tcp_rules.h to haproxy/tcp_rules.h
There's no type file on this one which is pretty simple.
2020-06-11 10:18:58 +02:00
Willy Tarreau
3727a8a083 REORG: include: move signal.h to haproxy/signal{,-t}.h
No change was necessary. Include from wdt.c was dropped since unneeded.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fc77454aff REORG: include: move proto_tcp.h to haproxy/proto_tcp.h
There was no type file. This one really is trivial. A few missing
includes were added to satisfy the exported functions prototypes.
2020-06-11 10:18:58 +02:00
Willy Tarreau
cea0e1bb19 REORG: include: move task.h to haproxy/task{,-t}.h
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f268ee8795 REORG: include: split global.h into haproxy/global{,-t}.h
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.

It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.

The UNIX_MAX_PATH definition was moved to compat.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a171892501 REORG: include: move vars.h to haproxy/vars{,-t}.h
A few includes (sessions.h, stream.h, api-t.h) were added for arguments
that were first declared in function prototypes.
2020-06-11 10:18:58 +02:00
Willy Tarreau
b23e5958ed REORG: include: move protocol_buffers.h to haproxy/protobuf{,-t}.h
There is no C file for this one, the code was placed into sample.c which
thus has a dependency on this file which itself includes sample.h. Probably
that it would be wise to split that later.
2020-06-11 10:18:58 +02:00
Willy Tarreau
e6ce10be85 REORG: include: move sample.h to haproxy/sample{,-t}.h
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
469509b39e REORG: include: move payload.h to haproxy/payload.h
There's no type file, it only contains fetch_rdp_cookie_name() and
val_payload_lv() which probably ought to move somewhere else instead
of staying there.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2cd5809f94 REORG: include: move map to haproxy/map{,-t}.h
Only small cleanups, and removal of a few includes from files that
didn't need them.
2020-06-11 10:18:58 +02:00
Willy Tarreau
225a90aaec REORG: include: move pattern.h to haproxy/pattern{,-t}.h
It was moved as-is, except for extern declaration of pattern_reference.
A few C files used to include it but didn't need it anymore after having
been split apart so this was cleaned.
2020-06-11 10:18:58 +02:00
Willy Tarreau
213e99073b REORG: include: move listener.h to haproxy/listener{,-t}.h
stdlib and list were missing from listener.h, otherwise it was OK.
2020-06-11 10:18:58 +02:00
Willy Tarreau
546ba42c73 REORG: include: move lb_fwrr.h to haproxy/lb_fwrr{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
0254941666 REORG: include: move lb_fwlc.h to haproxy/lb_fwlc{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
b5fc3bf6dc REORG: include: move lb_fas.h to haproxy/lb_fas{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fbe8da3320 REORG: include: move lb_chash.h to haproxy/lb_chash{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
52d88725ab REORG: move ssl_crtlist.h to haproxy/ssl_crtlist{,-t}.h
These files were already clean as well. Just added ebptnode which is
needed in crtlist_entry.
2020-06-11 10:18:58 +02:00
Willy Tarreau
47d7f9064d REORG: include: move ssl_ckch.h to haproxy/ssl_ckch{,-t}.h
buf-t and ebmbtree were included.
2020-06-11 10:18:58 +02:00
Willy Tarreau
b2bd865804 REORG: include: move ssl_utils.h to haproxy/ssl_utils.h
Just added buf-t and openssl-compat for the missing types that appear
in the prototypes.
2020-06-11 10:18:57 +02:00
Willy Tarreau
b5abe5bd5d REORG: include: move mworker.h to haproxy/mworker{,-t}.h
One function prototype makes reference to struct mworker_proc which was
not defined there but in global.h instead. This definition, along with
the PROC_O_* fields were moved to mworker-t.h instead.
2020-06-11 10:18:57 +02:00
Willy Tarreau
d7d2c28104 CLEANUP: include: remove unused mux_pt.h
It used to be needed to export mux_pt_ops when it was the only way to
detect a mux but that's no longer the case.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c761f843da REORG: include: move http_rules.h to haproxy/http_rules.h
There was no include file. This one still includes types/proxy.h.
2020-06-11 10:18:57 +02:00
Willy Tarreau
8efbdfb77b REORG: include: move obj_type.h to haproxy/obj_type{,-t}.h
No change was necessary. It still includes lots of types/* files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
762d7a5117 REORG: include: move frontend.h to haproxy/frontend.h
There was no type file for this one, it only contains frontend_accept().
2020-06-11 10:18:57 +02:00
Willy Tarreau
278161c1b8 REORG: include: move capture.h to haproxy/capture{,-t}.h
The file was split into two since it contains a variable declaration.
2020-06-11 10:18:57 +02:00
Willy Tarreau
cc9bbfb7b5 REORG: include: split mailers.h into haproxy/mailers{,-t}.h
The file mostly contained struct definitions but there was also a
variable export. Most of the stuff currently lies in checks.h and
should definitely move here!
2020-06-11 10:18:57 +02:00
Willy Tarreau
167e1eb7c7 REORG: include: move counters.h to haproxy/counters-t.h
Since these are only type definitions, let's move them to counters-t.h
and reserve counters.h for when functions will be needed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7d865a5e3e REORG: include: move flt_http_comp.h to haproxy/
There was no type definition for this file which was moved as-is.
2020-06-11 10:18:57 +02:00
Willy Tarreau
eb92deb500 REORG: include: move dns.h to haproxy/dns{,-t}.h
The files were moved as-is.
2020-06-11 10:18:57 +02:00
Willy Tarreau
ac13aeaa89 REORG: include: move auth.h to haproxy/auth{,-t}.h
The STATS_DEFAULT_REALM and STATS_DEFAULT_URI were moved to defaults.h.
It was required to include types/pattern.h and types/sample.h since they
are mentioned in function prototypes.

It would be wise to merge this with uri_auth.h later.
2020-06-11 10:18:57 +02:00
Willy Tarreau
aa74c4e1b3 REORG: include: move arg.h to haproxy/arg{,-t}.h
Almost no change was needed; chunk.h was replaced with buf-t.h.
It dpeends on types/vars.h and types/protocol_buffers.h.
2020-06-11 10:18:57 +02:00
Willy Tarreau
122eba92b7 REORG: include: move action.h to haproxy/action{,-t}.h
List.h was missing for LIST_ADDQ(). A few unneeded includes of action.h
were removed from certain files.

This one still relies on applet.h and stick-table.h.
2020-06-11 10:18:57 +02:00
Willy Tarreau
8c794000c4 REORG: include: move hlua_fcn.h to haproxy/hlua_fcn.h
Added lua.h which was missing from the includes.
2020-06-11 10:18:57 +02:00
Willy Tarreau
8641605ff6 REORG: include: move hlua.h to haproxy/hlua{,-t}.h
This one required a few more includes as it uses list and ebpt_node.
It still references lots of types/ files for now.
2020-06-11 10:18:57 +02:00
Willy Tarreau
87735330d1 REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c6fe884c74 REORG: include: move h1_htx.h to haproxy/h1_htx.h
This one didn't have a type file. A few missing includes were
added (htx, types).
2020-06-11 10:18:57 +02:00
Willy Tarreau
0a3bd3919e REORG: include: move compression.h to haproxy/compression{,-t}.h
No change was needed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
f07f30c15f REORG: include: move proto/proto_sockpair.h to haproxy/proto_sockpair.h
This one didn't have any types file and was moved as-is.
2020-06-11 10:18:57 +02:00
Willy Tarreau
832ce65914 REORG: include: move proto_udp.h to haproxy/proto_udp{,-t}.h
No change was needed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
14e8af5932 CLEANUP: include: remove empty raw_sock.h
This one only contained an include for types/stream_interface.h, which
was already present in its 3 users.
2020-06-11 10:18:57 +02:00
Willy Tarreau
551271d99c REORG: include: move pipe.h to haproxy/pipe{,-t}.h
No change was needed beyond a minor cleanup.
2020-06-11 10:18:57 +02:00
Willy Tarreau
ba2f73d40e REORG: include: move sink.h to haproxy/sink{,-t}.h
The sink files could be moved with almost no change at since they
didn't rely on anything fancy. ssize_t required sys/types.h and
thread.h was needed for the locks.
2020-06-11 10:18:57 +02:00
Willy Tarreau
d2ad57c352 REORG: include: move ring to haproxy/ring{,-t}.h
Some includes were wrong in the type definition but beyond this no
change was needed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
0f6ffd652e REORG: include: move fd.h to haproxy/fd{,-t}.h
A few includes were missing in each file. A definition of
struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was
moved to defaults.h

Stdio used to be silently inherited from whatever path but it's needed
for list_pollers() which takes a FILE* and which can thus not be
forward-declared.
2020-06-11 10:18:57 +02:00
Willy Tarreau
fc8f6a8517 REORG: include: move port_range.h to haproxy/port_range{,-t}.h
The port ranges didn't depend on anything. However they were missing
some includes such as stdlib and api-t.h which were added.
2020-06-11 10:18:57 +02:00
Willy Tarreau
334099c324 REORG: include: move shctx to haproxy/shctx{,-t}.h
Minor cleanups were applied, some includes were missing from the types
file and some were incorrect in a few C files (duplicated or not using
path).
2020-06-11 10:18:57 +02:00
Willy Tarreau
3afc4c4bb0 REORG: include: move dict.h to hparoxy/dict{,-t}.h
This was entirely free-standing. haproxy/api-t.h was added for size_t.
2020-06-11 10:18:57 +02:00
Willy Tarreau
48fbcae07c REORG: tools: split common/standard.h into haproxy/tools{,-t}.h
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.

They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2dd7c35052 REORG: include: move protocol.h to haproxy/protocol{,-t}.h
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
2020-06-11 10:18:57 +02:00
Willy Tarreau
fa2ef5b5eb REORG: include: move common/fcgi.h to haproxy/
The file was moved almost verbatim (only stdio.h was dropped as useless).
It was not split between types and functions because it's only included
from direct C code (fcgi.c and mux_fcgi.c) as well as fcgi_app.h, included
from the same ones, which should also be remerged as a single one.
2020-06-11 10:18:57 +02:00
Willy Tarreau
bf0731491b REORG: include: move common/h2.h to haproxy/h2.h
No change was performed, the file is only included from C files and
currently doesn't need to be split into types+functions.
2020-06-11 10:18:57 +02:00
Willy Tarreau
be327fa332 REORG: include: move hpack*.h to haproxy/ and split hpack-tbl
The various hpack files are self-contained, but hpack-tbl was one of
those showing difficulties when pools were added because that began
to add quite some dependencies. Now when built in standalone mode,
it still uses the bare minimum pool definitions and doesn't require
to know the prototypes anymore when only the structures are needed.
Thus the files were moved verbatim except for hpack-tbl which was
split between types and prototypes.
2020-06-11 10:18:57 +02:00
Willy Tarreau
16f958c0e9 REORG: include: split common/htx.h into haproxy/htx{,-t}.h
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
5413a87ad3 REORG: include: move common/h1.h to haproxy/h1.h
The file was moved as-is. There was a wrong dependency on dynbuf.h
instead of buf.h which was addressed. There was no benefit to
splitting this between types and functions.
2020-06-11 10:18:57 +02:00
Willy Tarreau
0017be0143 REORG: include: split common/http-hdr.h into haproxy/http-hdr{,-t}.h
There's only one struct and 2 inline functions. It could have been
merged into http.h but that would have added a massive dependency on
the hpack parts for nothing, so better keep it this way since hpack
is already freestanding and portable.
2020-06-11 10:18:57 +02:00
Willy Tarreau
cd72d8c981 REORG: include: split common/http.h into haproxy/http{,-t}.h
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c2f7c5895c REORG: include: move common/ticks.h to haproxy/ticks.h
Nothing needed to be changed, there are no exported types.
2020-06-11 10:18:57 +02:00
Willy Tarreau
374b442cbc REORG: include: split common/xref.h into haproxy/xref{,-t}.h
The type is the only element needed by applet.h and hlua.h, while hlua.c
needs the various functions. XREF_BUSY was placed into the types as well
since it's better to have the special values there.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7cd8b6e3a4 REORG: include: split common/regex.h into haproxy/regex{,-t}.h
Regex are essentially included for myregex_t but it turns out that
several of the C files didn't include it directly, relying on the
one included by their own .h. This has been cleanly addressed so
that only the type is included by H files which need it, and adding
the missing includes for the other ones.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7a00efbe43 REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
2020-06-11 10:18:57 +02:00
Willy Tarreau
6131d6a731 REORG: include: move common/net_helper.h to haproxy/net_helper.h
No change was necessary.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2741c8c4aa REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.

This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.

Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
2020-06-11 10:18:57 +02:00
Willy Tarreau
a04ded58dc REORG: include: move activity to haproxy/
This moves types/activity.h to haproxy/activity-t.h and
proto/activity.h to haproxy/activity.h.

The macros defining the bit field values for the profiling variable
were moved to the type file to be more future-proof.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c13ed53b12 REORG: include: move common/chunk.h to haproxy/chunk.h
No change was necessary, it was already properly split.
2020-06-11 10:18:57 +02:00
Willy Tarreau
d0ef439699 REORG: include: move common/memory.h to haproxy/pool.h
Now the file is ready to be stored into its final destination. A few
minor reorderings were performed to keep the file properly organized,
making the various sections more visible (cache & lockless).

In addition and to stay consistent, memory.c was renamed to pool.c.
2020-06-11 10:18:57 +02:00
Willy Tarreau
ed891fda52 MEDIUM: memory: make local pools independent on lockless pools
Till now the local pool caches were implemented only when lockless pools
were in use. This was mainly due to the difficulties to disentangle the
code parts. However the locked pools would further benefit from the local
cache, and having this would reduce the variants in the code.

This patch does just this. It adds a new debug macro DEBUG_NO_LOCAL_POOLS
to forcefully disable local pool caches, and makes sure that the high
level functions are now strictly the same between locked and lockless
(pool_alloc(), pool_alloc_dirty(), pool_free(), pool_get_first()). The
pool index calculation was moved inside the CONFIG_HAP_LOCAL_POOLS guards.
This allowed to move them out of the giant #ifdef and to significantly
reduce the code duplication.

A quick perf test shows that with locked pools the performance increases
by roughly 10% on 8 threads and gets closer to the lockless one.
2020-06-11 10:18:57 +02:00
Willy Tarreau
f8c1b648c0 MINOR: memory: move pool-specific path of the locked pool_free() to __pool_free()
pool_free() was not identical between locked and lockless pools. The
different was the call to __pool_free() in one case versus open-coded
accesses in the other, and the poisoning brought by commit da52035a45
("MINOR: memory: also poison the area on freeing") which unfortunately
did if only for the lockless path.

Let's now have __pool_free() to work on the global pool also in the
locked case so that the code is architected similarly.
2020-06-11 10:18:56 +02:00
Willy Tarreau
fb117e6a8e MEDIUM: memory: don't let pool_put_to_cache() free the objects itself
Just as for the allocation path, the release path was not symmetrical.
It was not logical to have pool_put_to_cache() free the objects itself,
it was pool_free's job. In addition, just because of a variable export
issue, it the insertion of the object to free back into the local cache
couldn't be inlined while it was very cheap.

This patch just slightly reorganizes this code path by making pool_free()
decide whether or not to put the object back into the cache via
pool_put_to_cache() otherwise place it back to the global pool using
__pool_free().  Then pool_put_to_cache() adds the item to the local
cache and only calls pool_evict_from_cache() if the cache is too big.
2020-06-11 10:18:56 +02:00
Willy Tarreau
a6982e5868 MINOR: memory: don't let __pool_get_first() pick from the cache
When building with the local cache support, we have an asymmetry in
the allocation path which is that __pool_get_first() picks from the
cache while when no cache support is used, this one directly accesses
the shared area. It looks like it was done this way only to centralize
the call to __pool_get_from_cache() but this was not a good idea as it
complicates the splitting the code.

Let's move the cache access to the upper layer so thatt __pool_get_first()
remains agnostic to the cache support.

The call tree now looks like this with the cache enabled :

    pool_get_first()
      __pool_get_from_cache() // if cache enabled
      __pool_get_first()

    pool_alloc()
      pool_alloc_dirty()
        __pool_get_from_cache() // if cache enabled
        __pool_get_first()
        __pool_refill_alloc()

    __pool_free()
      pool_free_area()

    pool_put_to_cache()
      __pool_free()
      __pool_put_to_cache()

    pool_free()
      pool_put_to_cache()

With cache disabled, the pool_free() path still differs:

    pool_free()
      __pool_free_area()
      __pool_put_to_cache()
2020-06-11 10:18:56 +02:00
Willy Tarreau
24aa1eebaa REORG: memory: move the OS-level allocator to haproxy/pool-os.h
The memory.h file is particularly complex due to the combination of
debugging options. This patch extracts the OS-level interface and
places it into a new file: pool-os.h.

Doing this also moves pool_alloc_area() and pool_free_area() out of
the #ifndef CONFIG_HAP_LOCKLESS_POOLS, making them usable from
__pool_refill_alloc(), pool_free(), pool_flush() and pool_gc() instead
of having direct calls to malloc/free there that are hard to wrap for
debugging purposes.
2020-06-11 10:18:56 +02:00
Willy Tarreau
3646777a77 REORG: memory: move the pool type definitions to haproxy/pool-t.h
This is the beginning of the move and cleanup of memory.h. This first
step only extracts type definitions and basic macros that are needed
by the files which reference a pool. They're moved to pool-t.h (since
"pool" is more obvious than "memory" when looking for pool-related
stuff). 3 files which didn't need to include the whole memory.h were
updated.
2020-06-11 10:18:56 +02:00
Willy Tarreau
606135ac88 CLEANUP: pool: include freq_ctr.h and remove locally duplicated functions
In memory.h we had to reimplement the swrate* functions just because of
a broken circular dependency around freq_ctr.h. Now that this one is
solved, let's get rid of this copy and use the original ones instead.
2020-06-11 10:18:56 +02:00
Willy Tarreau
6634794992 REORG: include: move freq_ctr to haproxy/
types/freq_ctr.h was moved to haproxy/freq_ctr-t.h and proto/freq_ctr.h
was moved to haproxy/freq_ctr.h. Files were updated accordingly, no other
change was applied.
2020-06-11 10:18:56 +02:00
Willy Tarreau
889faf467b CLEANUP: include: remove excessive includes of common/standard.h
Some of them were simply removed as unused (possibly some leftovers
from an older cleanup session), some were turned to haproxy/bitops.h
and a few had to be added (hlua.c and stick-table.h need standard.h
for parse_time_err; htx.h requires chunk.h but used to get it through
standard.h).
2020-06-11 10:18:56 +02:00
Willy Tarreau
aea4635c38 REORG: include: move integer manipulation functions from standard.h to intops.h
There are quite a number of integer manipulation functions defined in
standard.h, which is one of the reasons why standard.h is included from
many places and participates to the dependencies loop.

Let's just have a new file, intops.h to place all these operations.
These are a few bitops, 32/64 bit mul/div/rotate, integer parsing and
encoding (including varints), the full avalanche hash function, and
the my_htonll/my_ntohll functions.

For now no new C file was created for these yet.
2020-06-11 10:18:56 +02:00
Willy Tarreau
92b4f1372e REORG: include: move time.h from common/ to haproxy/
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
2020-06-11 10:18:56 +02:00
Willy Tarreau
af613e8359 CLEANUP: thread: rename __decl_hathreads() to __decl_thread()
I can never figure whether it takes an "s" or not, and in the end it's
better if it matches the file's naming, so let's call it "__decl_thread".
2020-06-11 10:18:56 +02:00
Willy Tarreau
3f567e4949 REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).

All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").

For consistency, src/hathreads.c was also renamed thread.c.

A number of files were updated to only include thread-t which is the one
they really needed.

Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
2020-06-11 10:18:56 +02:00
Willy Tarreau
5775d0964a CLEANUP: threads: remove a few needless includes of hathreads.h
A few files were including it while not needing it (anymore). Some
only required access to the atomic ops and got haproxy/atomic.h in
exchange. Others didn't need it at all. A significant number of
files still include it only for THREAD_LOCAL definition.
2020-06-11 10:18:56 +02:00
Willy Tarreau
9453ecd670 REORG: threads: extract atomic ops from hathreads.h
The hathreads.h file has quickly become a total mess because it contains
thread definitions, atomic operations and locking operations, all this for
multiple combinations of threads, debugging and architectures, and all this
done with random ordering!

This first patch extracts all the atomic ops code from hathreads.h to move
it to haproxy/atomic.h. The code there still contains several sections
based on non-thread vs thread, and GCC versions in the latter case. Each
section was arranged in the exact same order to ease finding.

The redundant HA_BARRIER() which was the same as __ha_compiler_barrier()
was dropped in favor of the latter which follows the naming convention
of all other barriers. It was only used in freq_ctr.c which was updated.
Additionally, __ha_compiler_barrier() was defined inconditionally but
used only for thread-related operations, so it was made thread-only like
HA_BARRIER() used to be. We'd still need to have two types of compiler
barriers, one for the general case (e.g. signals) and another one for
concurrency, but this was not addressed here.

Some comments were added at the beginning of each section to inform about
the use case and warn about the traps to avoid.

Some files which continue to include hathreads.h solely for atomic ops
should now be updated.
2020-06-11 10:18:56 +02:00
Willy Tarreau
853b297c9b REORG: include: split mini-clist into haproxy/list and list-t.h
Half of the users of this include only need the type definitions and
not the manipulation macros nor the inline functions. Moves the various
types into mini-clist-t.h makes the files cleaner. The other one had all
its includes grouped at the top. A few files continued to reference it
without using it and were cleaned.

In addition it was about time that we'd rename that file, it's not
"mini" anymore and contains a bit more than just circular lists.
2020-06-11 10:18:56 +02:00
Willy Tarreau
f0f1c80daf REORG: include: move istbuf.h to haproxy/
This one now relies on two files that were already cleaned up and is
only used by buffer.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
8dabda7497 REORG: include: split buf.h into haproxy/buf-t.h and haproxy/buf.h
File buf.h is one common cause of pain in the dependencies. Many files in
the code need it to get the struct buffer definition, and a few also need
the inlined functions to manipulate a buffer, but the file used to depend
on a long chain only for BUG_ON() (addressed by last commit).

Now buf.h is split into buf-t.h which only contains the type definitions,
and buf.h for all inlined functions. Callers who don't care can continue
to use buf.h but files in types/ must only use buf-t.h. sys/types.h had
to be added to buf.h to get ssize_t as used by b_move(). It's worth noting
that ssize_t is only supposed to be a size_t supporting -1, so b_move()
ought to be rethought regarding this.

The files were moved to haproxy/ and all their users were updated
accordingly. A dependency issue was addressed on fcgi whose C file didn't
include buf.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
025beea507 CLEANUP: debug: drop unused function p_malloc()
This one was introduced 5 years ago for debugging and never really used.
It is the one which used to cause circular dependencies issues. Let's drop
it instead of starting to split the debug include in two.
2020-06-11 10:18:56 +02:00
Willy Tarreau
2a83d60662 REORG: include: move debug.h from common/ to haproxy/
The debug file is cleaner now and does not depend on much anymore.
2020-06-11 10:18:56 +02:00
Willy Tarreau
58017eef3f REORG: include: move the BUG_ON() code to haproxy/bug.h
This one used to be stored into debug.h but the debug tools got larger
and require a lot of other includes, which can't use BUG_ON() anymore
because of this. It does not make sense and instead this macro should
be placed into the lower includes and given its omnipresence, the best
solution is to create a new bug.h with the few surrounding macros needed
to trigger bugs and place assertions anywhere.

Another benefit is that it won't be required to add include <debug.h>
anymore to use BUG_ON, it will automatically be covered by api.h. No
less than 32 occurrences were dropped.

The FSM_PRINTF macro was dropped since not used at all anymore (probably
since 1.6 or so).
2020-06-11 10:18:56 +02:00
Willy Tarreau
eb6f701b99 REORG: include: move ist.h from common/ to import/
Fortunately that file wasn't made dependent upon haproxy since it was
integrated, better isolate it before it's too late. Its dependency on
api.h was the result of the change from config.h, which in turn wasn't
correct. It was changed back to stddef.h for size_t and sys/types.h for
ssize_t. The recently added reference to MAX() was changed as it was
placed only to avoid a zero length in the non-free-standing version and
was causing a build warning in the hpack encoder.
2020-06-11 10:18:56 +02:00
Willy Tarreau
6019faba50 REORG: include: move openssl-compat.h from common/ to haproxy/
This file is to openssl what compat.h is to the libc, so it makes sense
to move it to haproxy/. It could almost be part of api.h but given the
amount of openssl stuff that gets loaded I fear it could increase the
build time.

Note that this file contains lots of inlined functions. But since it
does not depend on anything else in haproxy, it remains safe to keep
all that together.
2020-06-11 10:18:56 +02:00
Willy Tarreau
8d36697dee REORG: include: move base64.h, errors.h and hash.h from common to to haproxy/
These ones do not depend on any other file. One used to include
haproxy/api.h but that was solely for stddef.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
d678805783 REORG: include: move version.h to haproxy/
Few files were affected. The release scripts was updated.
2020-06-11 10:18:56 +02:00
Willy Tarreau
fd4bffe7c0 REORG: include: move the base files from common/ to haproxy/
The files currently covered by api-t.h and api.h (compat, compiler,
defaults, initcall) are now located inside haproxy/.
2020-06-11 10:18:56 +02:00
Willy Tarreau
b9082a93e5 CLEANUP: include: remove unused common/tools.h
Let's definitely get rid of this old file.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4d653a6285 REORG: include: move SWAP/MID_RANGE/MAX_RANGE from tools.h to standard.h
Tools.h doesn't make sense for these 3 macros alone anymore, let's move
them to standard.h which will ultimately become again tools.h once moved.
2020-06-11 10:18:56 +02:00
Willy Tarreau
5ae5006dde REORG: include: move MIN/MAX from tools.h to compat.h
Given that these macros are usually provided by sys/param.h, better move
them to compat.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
57bb71e83a CLEANUP: include: remove unused template.h
There is one "template.h" per include subdirectory to show how to create
a new file but in practice nobody knows they're here so they're useless.
Let's simply remove them.
2020-06-11 10:18:56 +02:00
Willy Tarreau
86556a5377 CLEANUP: include: remove common/config.h
It was already an indirection to load other files, it's not used
anymore.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Willy Tarreau
7ab7031e34 REORG: include: create new file haproxy/api.h
This file includes everything that must be guaranteed to be available to
any buildable file in the project (including the contrib/ subdirs). For
now it includes <haproxy/api-t.h> so that standard integer types and
compiler macros are known, <common/initcall.h> to ease dynamic registration
of init functions, and <common/tools.h> for a few MIN/MAX macros.

version.h should probably also be added, though at the moment it doesn't
bring a great value.

All files which currently include the ones above should now switch to
haproxy/api.h or haproxy/api-t.h instead. This should also reduce build
time by having a single guard for several files at once.
2020-06-11 09:31:11 +02:00
Willy Tarreau
ca1765713b REORG: include: create new file haproxy/api-t.h
This file is at the lowest level of the include tree. Its purpose is to
make sure that common types are known pretty much everywhere, particularly
in structure declarations. It will essentially cover integer types such as
uintXX_t via inttypes.h, "size_t" and "ptrdiff_t" via stddef.h, and various
type modifiers such as __maybe_unused or ALIGN() via compiler.h, compat.h
and defaults.h.

It could be enhanced later if required, for example if some macros used
to compute array sizes are needed.
2020-06-11 09:31:11 +02:00
Willy Tarreau
8d2b777fe3 REORG: ebtree: move the include files from ebtree to include/import/
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.

The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).

A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.

No further cleanup was done yet in order to keep changes minimal.
2020-06-11 09:31:11 +02:00
Christopher Faulet
89aed32bff MINOR: mux-h1/proxy: Add a proxy option to disable clear h2 upgrade
By default, HAProxy is able to implicitly upgrade an H1 client connection to an
H2 connection if the first request it receives from a given HTTP connection
matches the HTTP/2 connection preface. This way, it is possible to support H1
and H2 clients on a non-SSL connections. It could be a problem if for any
reason, the H2 upgrade is not acceptable. "option disable-h2-upgrade" may now be
used to disable it, per proxy. The main puprose of this option is to let an
admin to totally disable the H2 support for security reasons. Recently, a
critical issue in the HPACK decoder was fixed, forcing everyone to upgrade their
HAProxy version to fix the bug. It is possible to disable H2 for SSL
connections, but not on clear ones. This option would have been a viable
workaround.
2020-06-03 10:23:39 +02:00
Willy Tarreau
39bd740d00 CLEANUP: regex: remove outdated support for regex actions
The support for reqrep and friends was removed in 2.1 but the
chain_regex() function and the "action" field in the regex struct
was still there. This patch removes them.

One point worth mentioning though. There is a check_replace_string()
function whose purpose was to validate the replacement strings passed
to reqrep. It should also be used for other replacement regex, but is
never called. Callers of exp_replace() should be checked and a call to
this function should be added to detect the error early.
2020-06-02 17:17:13 +02:00
Emeric Brun
975564784f MEDIUM: ring: add new srv statement to support octet counting forward
log-proto <logproto>
  The "log-proto" specifies the protocol used to forward event messages to
  a server configured in a ring section. Possible values are "legacy"
  and "octet-count" corresponding respectively to "Non-transparent-framing"
  and "Octet counting" in rfc6587. "legacy" is the default.

Notes: a separated io_handler was created to avoid per messages test
and to prepare code to set different log protocols such as
request- response based ones.
2020-05-31 10:49:43 +02:00
Emeric Brun
494c505703 MEDIUM: ring: add server statement to forward messages from a ring
This patch adds new statement "server" into ring section, and the
related "timeout connect" and "timeout server".

server <name> <address> [param*]
  Used to configure a syslog tcp server to forward messages from ring buffer.
  This supports for all "server" parameters found in 5.2 paragraph.
  Some of these parameters are irrelevant for "ring" sections.

timeout connect <timeout>
  Set the maximum time to wait for a connection attempt to a server to succeed.

  Arguments :
    <timeout> is the timeout value specified in milliseconds by default, but
              can be in any other unit if the number is suffixed by the unit,
              as explained at the top of this document.

timeout server <timeout>
  Set the maximum time for pending data staying into output buffer.

  Arguments :
    <timeout> is the timeout value specified in milliseconds by default, but
              can be in any other unit if the number is suffixed by the unit,
              as explained at the top of this document.

  Example:
    global
        log ring@myring local7

    ring myring
        description "My local buffer"
        format rfc3164
        maxlen 1200
        size 32764
        timeout connect 5s
        timeout server 10s
        server mysyslogsrv 127.0.0.1:6514
2020-05-31 10:46:13 +02:00
Emeric Brun
dcd58afaf1 MINOR: ring: re-work ring attach generic API.
Attach is now independent on appctx, which was unused anyway.
2020-05-31 10:37:31 +02:00
Willy Tarreau
21072b9480 CLEANUP: pools: use the regular lock for the flush operation on lockless pools
Commit 04f5fe87d3 introduced an rwlock in the pools to deal with the risk
that pool_flush() dereferences an area being freed, and commit 899fb8abdc
turned it into a spinlock. The pools already contain a spinlock in case of
locked pools, so let's use the same and simplify the code by removing ifdefs.

At this point I'm really suspecting that if pool_flush() would instead
rely on __pool_get_first() to pick entries from the pool, the concurrency
problem could never happen since only one user would get a given entry at
once, thus it could not be freed by another user. It's not certain this
would be faster however because of the number of atomic ops to retrieve
one entry compared to a locked batch.
2020-05-29 17:28:04 +02:00
Christopher Faulet
0bac4cdf1a CLEANUP: http: Remove unused HTTP message templates
HTTP_1XX, HTTP_3XX and HTTP_4XX message templates are no longer used. Only
HTTP_302 and HTTP_303 are used during configuration parsing by "errorloc" family
directives. So these templates are removed from the generic http code. And
HTTP_302 and HTTP_303 templates are moved as static strings in the function
parsing "errorloc" directives.
2020-05-28 15:07:20 +02:00
Christopher Faulet
b304883754 MINOR: http-rules: Use an action function to eval http-request auth rules
Now http-request auth rules are evaluated in a dedicated function and no longer
handled "in place" during the HTTP rules evaluation. Thus the action name
ACT_HTTP_REQ_AUTH is removed. In additionn, http_reply_40x_unauthorized() is
also removed. This part is now handled in the new action_ptr callback function.
2020-05-28 15:07:20 +02:00
Christopher Faulet
612f2eafe9 MINOR: http-ana: Use proxy's error replies to emit 401/407 responses
There is no reason to not use proxy's error replies to emit 401/407
responses. The function http_reply_40x_unauthorized(), responsible to emit those
responses, is not really complex. It only adds a
WWW-Authenticate/Proxy-Authenticate header to a generic message.

So now, error replies can be defined for 401 and 407 status codes, using
errorfile or http-error directives. When an http-request auth rule is evaluated,
the corresponding error reply is used. For 401 responses, all occurrences of the
WWW-Authenticate header are removed and replaced by a new one with a basic
authentication challenge for the configured realm. For 407 responses, the same
is done on the Proxy-Authenticate header. If the error reply must not be
altered, "http-request return" rule must be used instead.
2020-05-28 15:07:20 +02:00
Christopher Faulet
ae43b6c446 MINOR: http-ana: Make the function http_reply_to_htx() public
This function may be used from anywhere to convert an HTTP reply to an HTX
message.
2020-05-28 15:07:20 +02:00
Willy Tarreau
63a8738724 MEDIUM: pools: directly free objects when pools are too much crowded
During pool_free(), when the ->allocated value is 125% of needed_avg or
more, instead of putting the object back into the pool, it's immediately
freed using free(). By doing this we manage to significantly reduce the
amount of memory pinned in pools after transient traffic spikes.

During a test involving a constant load of 100 concurrent connections
each delivering 100 requests per second, the memory usage was a steady
21 MB RSS. Adding a 1 minute parallel load of 40k connections all looping
on 100kB objects made the memory usage climb to 938 MB before this patch.
With the patch it was only 660 MB. But when this parasit load stopped,
before the patch the RSS would remain at 938 MB while with the patch,
it went down to 480 then 180 MB after a few seconds, to stabilize around
69 MB after about 20 seconds.

This can be particularly important to improve reloads where the memory
has to be shared between the old and new process.

Another improvement would be welcome, we ought to have a periodic task
to check pools usage and continue to free up unused objects regardless
of any call to pool_free(), because the needed_avg value depends on the
past and will not cover recently refilled objects.
2020-05-27 08:32:42 +02:00
Willy Tarreau
a1e4f8c27c MINOR: pools: compute an estimate of each pool's average needed objects
This adds a sliding estimate of the pools' usage. The goal is to be able
to use this to start to more aggressively free memory instead of keeping
lots of unused objects in pools. The average is calculated as a sliding
average over the last 1024 consecutive measures of ->used during calls to
pool_free(), and is bumped up for 1/4 of its history from ->allocated when
allocation from the pool fails and results in a call to malloc().

The result is a floating value between ->used and ->allocated, that tries
to react fast to under-estimates that result in expensive malloc() but
still maintains itself well in case of stable usage, and progressively
goes down if usage shrinks over time.

This new metric is reported as "needed_avg" in "show pools".

Sadly due to yet another include dependency hell, we couldn't reuse the
functions from freq_ctr.h so they were temporarily duplicated into memory.h.
2020-05-27 08:32:42 +02:00
Emeric Brun
99c453df9d MEDIUM: ring: new section ring to declare custom ring buffers.
It is possible to globally declare ring-buffers, to be used as target for log
servers or traces.

ring <ringname>
  Creates a new ring-buffer with name <ringname>.

description <text>
  The descritpition is an optional description string of the ring. It will
  appear on CLI. By default, <name> is reused to fill this field.

format <format>
  Format used to store events into the ring buffer.

  Arguments:
    <format> is the log format used when generating syslog messages. It may be
             one of the following :

      iso     A message containing only the ISO date, followed by the text.
              The PID, process name and system name are omitted. This is
              designed to be used with a local log server.

      raw     A message containing only the text. The level, PID, date, time,
              process name and system name are omitted. This is designed to be
              used in containers or during development, where the severity
              only depends on the file descriptor used (stdout/stderr). This
              is the default.

      rfc3164 The RFC3164 syslog message format. This is the default.
              (https://tools.ietf.org/html/rfc3164)

      rfc5424 The RFC5424 syslog message format.
              (https://tools.ietf.org/html/rfc5424)

      short   A message containing only a level between angle brackets such as
              '<3>', followed by the text. The PID, date, time, process name
              and system name are omitted. This is designed to be used with a
              local log server. This format is compatible with what the systemd
              logger consumes.

      timed   A message containing only a level between angle brackets such as
              '<3>', followed by ISO date and by the text. The PID, process
              name and system name are omitted. This is designed to be
              used with a local log server.

maxlen <length>
  The maximum length of an event message stored into the ring,
  including formatted header. If an event message is longer than
  <length>, it will be truncated to this length.

size <size>
  This is the optional size in bytes for the ring-buffer. Default value is
  set to BUFSIZE.

  Example:
    global
        log ring@myring local7

    ring myring
        description "My local buffer"
        format rfc3164
        maxlen 1200

Note: ring names are resolved during post configuration processing.
2020-05-26 08:03:15 +02:00
Tim Duesterhus
b4fac1eb3c MINOR: vars: Make vars_(un|)set_by_name(_ifexist|) return a success value
Change the return type from `void` to `int` and return whether setting
the variable was successful.
2020-05-25 08:12:27 +02:00
Tim Duesterhus
7329327333 CLEANUP: vars: Remove void vars_unset_by_name(const char*, size_t, struct sample*)
With "MINOR: lua: Use vars_unset_by_name_ifexist()" the last user was
removed and as outlined in that commit there is no good reason for this
function to exist.

May be backported together with the commit mentioned above.
2020-05-25 08:12:23 +02:00
Willy Tarreau
0ff9b3d64f BUILD: hpack: make sure the hpack table can still be built standalone
Recent commit 2bdcc70fa7 ("MEDIUM: hpack: use a pool for the hpack table")
made the hpack code finally use a pool with very unintrusive code that was
assumed to be trivial enough to adjust if the code needed to be reused
outside of haproxy. Unfortunately the code in contrib/hpack already uses
it and broke the oss-fuzz tests as it doesn't build anymore.

This patch adds an HPACK_STANDALONE macro to decide if we should use the
pools or malloc+free. The resulting macros are called hpack_alloc() and
hpack_free() respectively, and the size must be passed into the pool
itself.
2020-05-22 12:13:43 +02:00
Christopher Faulet
3b967c1210 MINOR: http-htx/proxy: Add http-error directive using http return syntax
The http-error directive can now be used instead of errorfile to define an error
message in a proxy section (including default sections). This directive uses the
same syntax that http return rules. The only real difference is the limitation
on status code that may be specified. Only status codes supported by errorfile
directives are supported for this new directive. Parsing of errorfile directive
remains independent from http-error parsing. But functionally, it may be
expressed in terms of http-errors :

  errorfile <status> <file> ==> http-errror status <status> errorfile <file>
2020-05-20 18:27:14 +02:00
Christopher Faulet
963ce5bc06 CLEANUP: channel: Remove channel_htx_copy_msg() function
This function is now unused. So it is removed.
2020-05-20 18:27:14 +02:00
Christopher Faulet
2056736453 MINOR: htx: Add a function to copy a buffer in an HTX message
The htx_copy_msg() function can now be used to copy the HTX message stored in a
buffer in an existing HTX message. It takes care to not overwrite existing
data. If the destination message is empty, a raw copy is performed. All the
message is copied or nothing.

This function is used instead of channel_htx_copy_msg().
2020-05-20 18:27:14 +02:00
Christopher Faulet
f1fedc3cce CLEANUP: http-htx: Remove unused storage of error messages in buffers
Now, error messages are all stored in http replies. So the storage as a buffer
can safely be removed.
2020-05-20 18:27:14 +02:00
Christopher Faulet
8dfeccf6d3 MEDIUM: http-ana: Use http replies for HTTP error messages
When HAProxy returns an http error message, the corresponding http reply is now
used instead of the buffer containing the corresponding HTX message. So,
http_error_message() function now returns the http reply to use for a given
stream. And the http_reply_and_close() function now relies on
http_reply_message() to send the response to the client.
2020-05-20 18:27:14 +02:00
Christopher Faulet
507479b096 MINOR: http-ana: Use a TXN flag to prevent after-response ruleset evaluation
The txn flag TX_CONST_REPLY may now be used to prevent after-response ruleset
evaluation. It is used if this ruleset evaluation failed on an internal error
response. Before, it was done incrementing the parameter <final>. But it is not
really convenient if an intermediary function is used to produce the
response. Using a txn flag could also be a good way to prevent after-response
ruleset evaluation in a different context.
2020-05-20 18:27:13 +02:00
Christopher Faulet
e29a97e51a MINOR: http-htx: Use http reply from the http-errors section
When an http reply is configured to use an error message from an http-errors
section, instead of referencing the error message, the http reply is used. To do
so the new http reply type HTTP_REPLY_INDIRECT has been added.
2020-05-20 18:27:13 +02:00
Christopher Faulet
40e8569676 MINOR: proxy: Add references on http replies for proxy error messages
Error messages defined in proxy section or inherited from a default section are
now also referenced using an array of http replies. This is done during the
configuration validity check.
2020-05-20 18:27:13 +02:00
Christopher Faulet
5809e10b48 MINOR: http-htx: Store errorloc/errorfile messages in http replies
During configuration parsing, error messages resulting of parsing of errorloc
and errorfile directives are now also stored as an http reply. So, for now,
these messages are stored as a buffer and as an http reply. To be able to
release all these http replies when haproxy is stopped, a global list is
used. We must do that because the same http reply may be referenced several
times by different proxies if it is defined in a default section.
2020-05-20 18:27:13 +02:00
Christopher Faulet
de30bb7245 MINOR: http-htx: Store messages of an http-errors section in a http reply array
Error messages specified in an http-errors section is now also stored in an
array of http replies. So, for now, these messages are stored as a buffer and as
a http reply.
2020-05-20 18:27:13 +02:00
Christopher Faulet
1b13ecaca2 MINOR: http-htx: Store default error messages in a global http reply array
Default error messages are stored as a buffer, in http_err_chunks global array.
Now, they are also stored as a http reply, in http_err_replies global array.
2020-05-20 18:27:13 +02:00
Christopher Faulet
5cb513abeb MEDIUM: http-rules: Rely on http reply for http deny/tarpit rules
"http-request deny", "http-request tarpit" and "http-response deny" rules now
use the same syntax than http return rules and internally rely on the http
replies. The behaviour is not the same when no argument is specified (or only
the status code). For http replies, a dummy response is produced, with no
payload. For old deny/tarpit rules, the proxy's error messages are used. Thus,
to be compatible with existing configuration, the "default-errorfiles" parameter
is implied. For instance :

  http-request deny deny_status 404

is now an alias of

  http-request deny status 404 default-errorfiles
2020-05-20 18:27:13 +02:00
Christopher Faulet
0e2ad61315 MINOR: http-ana: Use a dedicated function to send a response from an http reply
The http_reply_message() function may be used to send an http reply to a
client. This function is responsile to convert the reply in HTX, to push it in
the response buffer and to forward it to the client. It is also responsible to
terminate the transaction.

This function is used during evaluation of http return rules.
2020-05-20 18:27:13 +02:00
Christopher Faulet
7eea241c39 MINOR: http-htx: Use a dedicated function to check http reply validity
A dedicated function is added to check the validity of an http reply object,
after parsing. It is used to check the validity of http return rules.

For now, this function is only used to find the right error message in an
http-errors section for http replies of type HTTP_REPLY_ERRFILES (using
"errorfiles" argument). On success, such replies are updated to point on the
corresponding error message and their type is set to HTTP_REPLY_ERRMSG. If an
unknown http-errors section is referenced, anx error is returned. If a unknown
error message is referenced inside an existing http-errors section, a warning is
emitted and the proxy's error messages are used instead.
2020-05-20 18:27:13 +02:00
Christopher Faulet
47e791e220 MINOR: http-htx: Use a dedicated function to parse http reply arguments
A dedicated function to parse arguments and create an http_reply object is
added. It is used to parse http return rule. Thus, following arguments are
parsed by this function :

  ... [status <code>] [content-type <type>]
      [ { default-errorfiles | errorfile <file> | errorfiles <name> |
          file <file> | lf-file <file> | string <str> | lf-string <fmt> } ]
      [ hdr <name> <fmt> ]*

Because the status code argument is optional, a default status code must be
defined when this function is called.
2020-05-20 18:27:13 +02:00
Christopher Faulet
18630643a9 MINOR: http-htx: Use a dedicated function to release http_reply objects
A function to release an http_reply object has been added. It is now called when
an http return rule is released.
2020-05-20 18:27:13 +02:00
Christopher Faulet
5ff0c64921 MINOR: http-rules: Use http_reply structure for http return rules
No real change here. Instead of using an internal structure to the action rule,
the http return rules are now stored as an http reply. The main change is about
the action type. It is now always set to ACT_CUSTOM. The http reply type is used
to know how to evaluate the rule.
2020-05-20 18:27:13 +02:00
Christopher Faulet
b6ea17c6fc CLEANUP: http-htx: Rename http_error structure into http_error_msg
The structure owns an error message, most of time loaded from a file, and
converted to HTX. It is created when an errorfile or errorloc directive is
parsed. It is renamed to avoid ambiguities with http_reply structure.
2020-05-20 18:27:13 +02:00
Christopher Faulet
7bd3de06e7 MINOR: http-htx: Add http_reply type based on what is used for http return rules
The http_reply structure is added. It represents a generic HTTP message used as
internal response by HAProxy. It is based on the structure used to store http
return rules. The aim is to store all error messages using this structure, as
well as http return and http deny rules.
2020-05-20 18:27:13 +02:00
Christopher Faulet
a53abad42d CLEANUP: http_ana: Remove unused TXN flags
TX_CLDENY, TX_CLALLOW, TX_SVDENY and TX_SVALLOW flags are unused. Only
TX_CLTARPIT is used to make the difference between an http deny rule and an http
tarpit rule. So these unused flags are removed.
2020-05-20 18:27:13 +02:00
William Lallemand
8177ad9895 MINOR: ssl: split config and runtime variable for ssl-{min,max}-ver
In the CLI command 'show ssl crt-list', the ssl-min-ver and the
ssl-min-max arguments were always displayed because the dumped versions
were the actual version computed and used by haproxy, instead of the
version found in the configuration.

To fix the problem, this patch separates the variables to have one with
the configured version, and one with the actual version used. The dump
only shows the configured version.
2020-05-20 16:49:02 +02:00
Willy Tarreau
d68a6927f7 Revert "MEDIUM: sink: add global statement to create a new ring (sink buffer)"
This reverts commit 957ec59571.

As discussed with Emeric, the current syntax is not extensible enough,
this will be turned to a section instead in a forthcoming patch.
2020-05-20 12:06:16 +02:00
Willy Tarreau
928068a74b MINOR: ring: make the applet code not depend on the CLI
The ring to applet communication was only made to deal with CLI functions
but it's generic. Let's have generic appctx functions and have the CLI
rely on these instead. This patch introduces ring_attach_appctx() and
ring_detach_appctx().
2020-05-19 19:37:12 +02:00
Willy Tarreau
9597cbd17a MINOR: applet: adopt the wait list entry from the CLI
A few fields, including a generic list entry, were added to the CLI context
by commit 300decc8d9 ("MINOR: cli: extend the CLI context with a list and
two offsets"). It turns out that the list entry (l0) is solely used to
consult rings and that the generic ring_write() code is restricted to a
consumer on the CLI due to this, which was not the initial intent. Let's
make it a general purpose wait_entry field that is properly initialized
during appctx_init(). This will allow any applet to wait on a ring, not
just the CLI.
2020-05-19 19:37:12 +02:00
Willy Tarreau
2bdcc70fa7 MEDIUM: hpack: use a pool for the hpack table
Instead of using malloc/free to allocate an HPACK table, let's declare
a pool. However the HPACK size is configured by the H2 mux, so it's
also this one which allocates it after post_check.
2020-05-19 11:40:39 +02:00
Emeric Brun
957ec59571 MEDIUM: sink: add global statement to create a new ring (sink buffer)
This patch adds the new global statement:
ring <name> [desc <desc>] [format <format>] [size <size>] [maxlen <length>]
  Creates a named ring buffer which could be used on log line for instance.

  <desc> is an optionnal description string of the ring. It will appear on
         CLI. By default, <name> is reused to fill this field.

  <format> is the log format used when generating syslog messages. It may be
           one of the following :

    iso       A message containing only the ISO date, followed by the text.
              The PID, process name and system name are omitted. This is
              designed to be used with a local log server.

    raw       A message containing only the text. The level, PID, date, time,
              process name and system name are omitted. This is designed to be
              used in containers or during development, where the severity only
              depends on the file descriptor used (stdout/stderr). This is
              the default.

    rfc3164   The RFC3164 syslog message format. This is the default.
              (https://tools.ietf.org/html/rfc3164)

    rfc5424   The RFC5424 syslog message format.
              (https://tools.ietf.org/html/rfc5424)

    short     A message containing only a level between angle brackets such as
              '<3>', followed by the text. The PID, date, time, process name
              and system name are omitted. This is designed to be used with a
              local log server. This format is compatible with what the systemd
              logger consumes.

    timed     A message containing only a level between angle brackets such as
              '<3>', followed by ISO date and by the text. The PID, process
              name and system name are omitted. This is designed to be
              used with a local log server.

  <length> is the maximum length of event message stored into the ring,
           including formatted header. If the event message is longer
           than <length>, it would be truncated to this length.

  <name> is the ring identifier, which follows the same naming convention as
         proxies and servers.

  <size> is the optionnal size in bytes. Default value is set to BUFSIZE.

Note: Historically sink's name and desc were refs on const strings. But with new
configurable rings a dynamic allocation is needed.
2020-05-19 11:04:11 +02:00
Emeric Brun
e709e1e777 MEDIUM: logs: buffer targets now rely on new sink_write
Before this path, they rely directly on ring_write bypassing
a part of the sink API.

Now the maxlen parameter of the log will apply only on the text
message part (and not the header, for this you woud prefer
to use the maxlen parameter on the sink/ring).

sink_write prototype was also reviewed to return the number of Bytes
written to be compliant with the other write functions.
2020-05-19 11:04:11 +02:00
Emeric Brun
bd163817ed MEDIUM: sink: build header in sink_write for log formats
This patch extends the sink_write prototype and code to
handle the rfc5424 and rfc3164 header.

It uses header building tools from log.c. Doing this some
functions/vars have been externalized.

facility and minlevel have been removed from the struct sink
and passed to args at sink_write because they depends of the log
and not of the sink (they remained unused by rest of the code
until now).
2020-05-19 11:04:11 +02:00
William Dauchy
1665c43fd8 BUILD: ssl: include buffer common headers for ssl_sock_ctx
since commit c0cdaffaa3 ("REORG: ssl: move ssl_sock_ctx and fix
cross-dependencies issues"), `struct ssl_sock_ctx` was moved in
ssl_sock.h. As it contains a `struct buffer`, including
`common/buffer.h` is now mandatory. I encountered an issue while
including ssl_sock.h on another patch:

include/types/ssl_sock.h:240:16: error: field ‘early_buf’ has incomplete type
  240 |  struct buffer early_buf;      /* buffer to store the early data received */

no backport needed.

Fixes: c0cdaffaa3 ("REORG: ssl: move ssl_sock_ctx and fix
cross-dependencies issues")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-05-18 08:29:32 +02:00
Marcin Deranek
4dc2b57d51 MINOR: stats: Prepare for more accurate moving averages
Add swrate_add_dynamic function which is similar to swrate_add, but more
accurate when calculating moving averages when not enough samples have
been processed yet.
2020-05-16 22:40:00 +02:00
William Lallemand
6a66a5ec9b REORG: ssl: move utility functions to src/ssl_utils.c
These functions are mainly used to extract information from
certificates.
2020-05-15 14:11:54 +02:00
William Lallemand
15e169447d REORG: ssl: move sample fetches to src/ssl_sample.c
Move all SSL sample fetches to src/ssl_sample.c.
2020-05-15 14:11:54 +02:00
William Lallemand
c0cdaffaa3 REORG: ssl: move ssl_sock_ctx and fix cross-dependencies issues
In order to move all SSL sample fetches in another file, moving the
ssl_sock_ctx definition in a .h file is required.

Unfortunately it became a cross dependencies hell to solve, because of
the struct wait_event field, so <types/connection.h> is needed which
created other problems.
2020-05-15 14:11:54 +02:00
William Lallemand
ef76107a4b MINOR: ssl: remove static keyword in some SSL utility functions
In order to move the the sample fetches to another file, remove the
static keyword of some utility functions in the SSL fetches.
2020-05-15 14:11:54 +02:00
William Lallemand
dad3105157 REORG: ssl: move ssl configuration to cfgparse-ssl.c
Move all the configuration parsing of the ssl keywords in cfgparse-ssl.c
2020-05-15 14:11:54 +02:00
William Lallemand
da8584c1ea REORG: ssl: move the CLI 'cert' functions to src/ssl_ckch.c
Move the 'ssl cert' CLI functions to src/ssl_ckch.c.
2020-05-15 14:11:54 +02:00
William Lallemand
c756bbd3df REORG: ssl: move the crt-list CLI functions in src/ssl_crtlist.c
Move the crtlist functions for the CLI to src/ssl_crtlist.c
2020-05-15 14:11:54 +02:00
William Lallemand
03c331c80a REORG: ssl: move the ckch_store related functions to src/ssl_ckch.c
Move the cert_key_and_chain functions:

int ssl_sock_load_files_into_ckch(const char *path, struct cert_key_and_chain *ckch, char **err);
int ssl_sock_load_pem_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch , char **err);
void ssl_sock_free_cert_key_and_chain_contents(struct cert_key_and_chain *ckch);

int ssl_sock_load_key_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch , char **err);
int ssl_sock_load_ocsp_response_from_file(const char *ocsp_path, char *buf, struct cert_key_and_chain *ckch, char **err);
int ssl_sock_load_sctl_from_file(const char *sctl_path, char *buf, struct cert_key_and_chain *ckch, char **err);
int ssl_sock_load_issuer_file_into_ckch(const char *path, char *buf, struct cert_key_and_chain *ckch, char **err);

And the utility ckch_store functions:

void ckch_store_free(struct ckch_store *store)
struct ckch_store *ckch_store_new(const char *filename, int nmemb)
struct ckch_store *ckchs_dup(const struct ckch_store *src)
ckch_store *ckchs_lookup(char *path)
ckch_store *ckchs_load_cert_file(char *path, int multi, char **err)
2020-05-15 14:11:54 +02:00
William Lallemand
c1c50b46e9 CLEANUP: ssl: avoid circular dependencies in ssl_crtlist.h
Add forward declarations in types/ssl_crtlist.h in order to avoid
circular dependencies. Also remove the listener.h include which is not
needed anymore.
2020-05-15 14:11:54 +02:00
William Lallemand
6e9556b635 REORG: ssl: move crtlist functions to src/ssl_crtlist.c
Move the crtlist functions to src/ssl_crtlist.c and their definitions to
proto/ssl_crtlist.h.

The following functions were moved:

/* crt-list entry functions */
void ssl_sock_free_ssl_conf(struct ssl_bind_conf *conf);
char **crtlist_dup_filters(char **args, int fcount);
void crtlist_free_filters(char **args);
void crtlist_entry_free(struct crtlist_entry *entry);
struct crtlist_entry *crtlist_entry_new();

/* crt-list functions */
void crtlist_free(struct crtlist *crtlist);
struct crtlist *crtlist_new(const char *filename, int unique);

/* file loading */
int crtlist_parse_line(char *line, char **crt_path, struct crtlist_entry *entry, const char *file, int linenum, char **err);
int crtlist_parse_file(char *file, struct bind_conf *bind_conf, struct proxy *curproxy, struct crtlist **crtlist, char **err);
int crtlist_load_cert_dir(char *path, struct bind_conf *bind_conf, struct crtlist **crtlist, char **err);
2020-05-15 14:11:54 +02:00
William Lallemand
c69973f7eb CLEANUP: ssl: add ckch prototypes in proto/ssl_ckch.h
Remove the static definitions of the ckch functions and add them to
ssl_ckch.h in order to use them outside ssl_sock.c.
2020-05-15 14:11:54 +02:00
William Lallemand
d4632b2b6d REORG: ssl: move the ckch structures to types/ssl_ckch.h
Move all the structures used for loading the SSL certificates in
ssl_ckch.h
2020-05-15 14:11:54 +02:00
William Lallemand
be21b663cd REORG: move the crt-list structures in their own .h
Move the structure definitions specifics to the crt-list in
types/ssl_crtlist.h.
2020-05-15 14:11:54 +02:00
William Lallemand
7fd8b4567e REORG: ssl: move macros and structure definitions to ssl_sock.h
The ssl_sock.c file contains a lot of macros and structure definitions
that should be in a .h. Move them to the more appropriate
types/ssl_sock.h file.
2020-05-15 14:11:54 +02:00
Dragan Dosen
eb607fe6a1 MINOR: ssl: add a new function ssl_sock_get_ssl_object()
This one can be used later to get a SSL object from connection. It will
return NULL if connection is not established over SSL.
2020-05-14 13:13:14 +02:00
Dragan Dosen
1e7ed04665 MEDIUM: ssl: allow to register callbacks for SSL/TLS protocol messages
This patch adds the ability to register callbacks for SSL/TLS protocol
messages by using the function ssl_sock_register_msg_callback().

All registered callback functions will be called when observing received
or sent SSL/TLS protocol messages.
2020-05-14 13:13:14 +02:00
Christopher Faulet
325504cf89 BUG/MINOR: sample/ssl: Fix digest converter for openssl < 1.1.0
The EVP_MD_CTX_create() and EVP_MD_CTX_destroy() functions were renamed to
EVP_MD_CTX_new() and EVP_MD_CTX_free() in OpenSSL 1.1.0, respectively. These
functions are used by the digest converter, introduced by the commit 8e36651ed
("MINOR: sample: Add digest and hmac converters"). So for prior versions of
openssl, macros are used to fallback on old functions.

This patch must only be backported if the commit 8e36651ed is backported too.
2020-05-12 16:30:41 +02:00
Willy Tarreau
5778fea4da CLEANUP: remove THREAD_LOCAL from config.h
This one really ought to be defined in hathreads.h like all other thread
definitions, which is what this patch does. As expected, all files but
one (regex.h) were already including hathreads.h when using THREAD_LOCAL;
regex.h was fixed for this.

This was the last entry in config.h which is now useless.
2020-05-09 09:08:09 +02:00
Willy Tarreau
3bc4e8bfe6 CLENAUP: config: move CONFIG_HAP_LOCKLESS_POOLS out of config.h
The setting of CONFIG_HAP_LOCKLESS_POOLS depending on threads and
compat was done in config.h for use only in memory.h and memory.c
where other settings are dealt with. Further, the default pool cache
size was set there from a fixed value instead of being set from
defaults.h

Let's move the decision to enable lockless pools via
CONFIG_HAP_LOCKLESS_POOLS to memory.h, and set the default pool
cache size in defaults.h like other default settings.

This was the next-to-last setting in config.h.
2020-05-09 09:02:35 +02:00
Willy Tarreau
755afc08d5 CLEANUP: config: drop unused setting CONFIG_HAP_INLINE_FD_SET
CONFIG_HAP_INLINE_FD_SET was introduced in 1.3.3 and dropped in 1.3.9
when the pollers were reworked, let's remove it.
2020-05-09 08:57:48 +02:00
Willy Tarreau
571eb3d659 CLEANUP: config: drop unused setting CONFIG_HAP_MEM_OPTIM
CONFIG_HAP_MEM_OPTIM was introduced with memory pools in 1.3 and dropped
in 1.6 when pools became the only way to allocate memory. Still the
option remained present in config.h. Let's kill it.
2020-05-09 08:53:31 +02:00
Christopher Faulet
67a234583e CLEANUP: checks: sort and rename tcpcheck_expect_type types
The same naming format is used for all expect rules. And names are sorted to be
grouped by type.
2020-05-06 12:38:44 +02:00
Christopher Faulet
aaab0836d9 MEDIUM: checks: Add matching on log-format string for expect rules
It is now possible to use log-format string (or hexadecimal string for the
binary version) to match a content in tcp-check based expect rules. For
hexadecimal log-format string, the conversion in binary is performed after the
string evaluation, during health check execution. The pattern keywords to use
are "string-lf" for the log-format string and "binary-lf" for the hexadecimal
log-format string.
2020-05-06 08:31:29 +02:00
Willy Tarreau
a4d9ee3d1c BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}()
Just like in previous patch, it happens that HA_ATOMIC_UPDATE_MIN() and
HA_ATOMIC_UPDATE_MAX() would evaluate the (val) argument up to 3 times.
However this time it affects both thread and non-thread versions. It's
strange because the copy was properly performed for the (new) argument
in order to avoid this. Anyway it was done for the "val" one as well.

A quick code inspection showed that this currently has no effect as
these macros are fairly limited in usage.

It would be best to backport this for long-term stability (till 1.8)
but it will not fix an existing bug.
2020-05-05 16:18:52 +02:00
Willy Tarreau
d66345d6b0 BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS()
When threads are disabled, HA_ATOMIC_CAS() becomes a simple compound
expression. However this expression presents a problem, which is that
its arguments are evaluated multiple times, once for the comparison
and once again for the assignement. This presents a risk of performing
some side-effect operations twice in the non-threaded case (e.g. in
case of auto-increment or function return).

The macro was rewritten using local copies for arguments like the
other macros do.

Fortunately a complete inspection of the code indicates that this case
currently never happens. It was however responsible for the strict-aliasing
warning emitted when building fd.c without threads but with 64-bit CAS.

This may be backported as far as 1.8 though it will not fix any existing
bug and is more of a long-term safety measure in case a future fix would
depend on this behavior.
2020-05-05 16:05:45 +02:00
Baptiste Assmann
0e9d87bf06 MINOR: istbuf: add ist2buf() function
Purpose of this function is to build a <struct buffer> from a <struct
ist>.
2020-05-05 15:28:59 +02:00
Baptiste Assmann
de80201460 MINOR: ist: add istissame() function
The istissame() function takes 2 ist and compare their <.ptr> and <.len>
values respectively. It returns non-zero if they are the same.
2020-05-05 15:28:59 +02:00
Baptiste Assmann
9ef1967af7 MINOR: ist: add istadv() function
The purpose of istadv() function is to move forward <.ptr> by <nb>
characters. It is very useful when parsing a payload.
2020-05-05 15:28:59 +02:00
Christopher Faulet
3970819a55 MEDIUM: checks: Support matching on headers for http-check expect rules
It is now possible to add http-check expect rules matching HTTP header names and
values. Here is the format of these rules:

  http-check expect header name [ -m <meth> ] <name> [log-format] \
                           [ value [ -m <meth> ] <value> [log-format] [full] ]

the name pattern (name ...) is mandatory but the value pattern (value ...) is
optionnal. If not specified, only the header presence is verified. <meth> is the
matching method, applied on the header name or the header value. Supported
matching methods are:

  * "str" (exact match)
  * "beg" (prefix match)
  * "end" (suffix match)
  * "sub" (substring match)
  * "reg" (regex match)

If not specified, exact matching method is used. If the "log-format" option is
used, the pattern (<name> or <value>) is evaluated as a log-format string. This
option cannot be used with the regex matching method. Finally, by default, the
header value is considered as comma-separated list. Each part may be tested. The
"full" option may be used to test the full header line. Note that matchings are
case insensitive on the header names.
2020-05-05 11:19:27 +02:00
Christopher Faulet
8dd33e13a5 MINOR: http-htx: Support different methods to look for header names
It is now possible to use different matching methods to look for header names in
an HTTP message:

 * The exact match. It is the default method. http_find_header() uses this
   method. http_find_str_header() is an alias.

 * The prefix match. It evals the header names starting by a prefix.
   http_find_pfx_header() must be called to use this method.

 * The suffix match. It evals the header names ending by a suffix.
   http_find_sfx_header() must be called to use this method.

 * The substring match. It evals the header names containing a string.
   http_find_sub_header() must be called to use this method.

 * The regex match. It evals the header names matching a regular expression.
   http_match_header() must be called to use this method.
2020-05-05 11:07:00 +02:00
Christopher Faulet
778f5ed478 MEDIUM: checks/http-fetch: Support htx prefetch from a check for HTTP samples
Some HTTP sample fetches will be accessible from the context of a http-check
health check. Thus, the prefetch function responsible to return the HTX message
has been update to handle a check, in addition to a channel. Both cannot be used
at the same time. So there is no ambiguity.
2020-05-05 11:06:43 +02:00
Willy Tarreau
86c6a9221a BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock
Given that a "count" value of 32M was seen in _shctx_wait4lock(), it
is very important to prevent this from happening again. It's absolutely
essential to prevent the value from growing unbounded because with an
increase of the number of threads, the number of successive failed
attempts will necessarily grow.

Instead now we're scanning all 2^p-1 values from 3 to 255 and are
bounding to count to 255 so that in the worst case each thread tries an
xchg every 255 failed read attempts. That's one every 4 on average per
thread when there are 64 threads, which corresponds to the initial count
of 4 for the first attempt so it seems like a reasonable value to keep a
low latency.

The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
2020-05-01 13:32:20 +02:00
Willy Tarreau
3801bdc3fc BUG/MEDIUM: shctx: really check the lock's value while waiting
Jrme reported an amazing crash in the spinlock version of
_shctx_wait4lock() with an extremely high <count> value of 32M! The
root cause is that the function cannot deal with contention on the lock
at all because it forgets to check if the lock's value has changed! As
such, every time it's called due to a contention, it waits twice as
long before trying again and lets the caller check for the contention
by itself.

The correct thing to do is to compare the value again at each loop.
This way it makes sure to mostly perform read accesses on the shared
cache line without writing too often, and to be ready fast enough to
try to grab the lock. And we must not increase the count on success
either!

Unfortunately I'd have expected to see a performance boost on the cache
with this but there was absolutely no change, so it's very likely that
these issues only happen once in a while and are sufficient to derail
the process when they strike, but not to have a permanent performance
impact.

The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
2020-05-01 13:29:14 +02:00
Willy Tarreau
f0e5da20e1 BUG/MINOR: debug: properly use long long instead of long for the thread ID
I changed my mind twice on this one and pushed after the last test with
threads disabled, without re-enabling long long, causing this rightful
build warning.

This needs to be backported if the previous commit ff64d3b027 ("MINOR:
threads: export the POSIX thread ID in panic dumps") is backported as
well.
2020-05-01 12:26:03 +02:00
Willy Tarreau
ff64d3b027 MINOR: threads: export the POSIX thread ID in panic dumps
It is very difficult to map a panic dump against a gdb thread dump
because the thread numbers do not match. However gdb provides the
pthread ID but this one is supposed to be opaque and not to be cast
to a scalar.

This patch provides a fnuction, ha_get_pthread_id() which retrieves
the pthread ID of the indicated thread and casts it to an unsigned
long long so as to lose the least possible amount of information from
it. This is done cleanly using a union to maintain alignment so as
long as these IDs are stored on 1..8 bytes they will be properly
reported. This ID is now presented in the panic dumps so it now
becomes possible to map these threads. When threads are disabled,
zero is returned. For example, this is a panic dump:

  Thread 1 is about to kill the process.
  *>Thread 1 : id=0x7fe92b825180 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
               stuck=1 prof=0 harmless=0 wantrdv=0
               cpu_ns: poll=5119122 now=2009446995 diff=2004327873
               curr_task=0xc99bf0 (task) calls=4 last=0
                 fct=0x592440(task_run_applet) ctx=0xca9c50(<CLI>)
               strm=0xc996a0 src=unix fe=GLOBAL be=GLOBAL dst=<CLI>
               rqf=848202 rqa=0 rpf=80048202 rpa=0 sif=EST,200008 sib=EST,204018
               af=(nil),0 csf=0xc9ba40,8200
               ab=0xca9c50,4 csb=(nil),0
               cof=0xbf0e50,1300:PASS(0xc9cee0)/RAW((nil))/unix_stream(20)
               cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)
               call trace(20):
               |       0x59e4cf [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xff/0x10c
               | 0x7fe92c170690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
               | 0x7ffce29519d9 [48 c1 e2 20 48 09 d0 48]: linux-vdso:+0x9d9
               | 0x7ffce2951d54 [eb d9 f3 90 e9 1c ff ff]: linux-vdso:__vdso_gettimeofday+0x104/0x133
               |       0x57b484 [48 89 e6 48 8d 7c 24 10]: main+0x157114
               |       0x50ee6a [85 c0 75 76 48 8b 55 38]: main+0xeaafa
               |       0x50f69c [48 63 54 24 20 85 c0 0f]: main+0xeb32c
               |       0x59252c [48 c7 c6 d8 ff ff ff 44]: task_run_applet+0xec/0x88c
    Thread 2 : id=0x7fe92b6e6700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
               stuck=0 prof=0 harmless=1 wantrdv=0
               cpu_ns: poll=786738 now=1086955 diff=300217
               curr_task=0
    Thread 3 : id=0x7fe92aee5700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
               stuck=0 prof=0 harmless=1 wantrdv=0
               cpu_ns: poll=828056 now=1129738 diff=301682
               curr_task=0
    Thread 4 : id=0x7fe92a6e4700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
               stuck=0 prof=0 harmless=1 wantrdv=0
               cpu_ns: poll=818900 now=1153551 diff=334651
               curr_task=0

And this is the gdb output:

  (gdb) info thr
    Id   Target Id                         Frame
  * 1    Thread 0x7fe92b825180 (LWP 15234) 0x00007fe92ba81d6b in raise () from /lib64/libc.so.6
    2    Thread 0x7fe92b6e6700 (LWP 15235) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
    3    Thread 0x7fe92a6e4700 (LWP 15237) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
    4    Thread 0x7fe92aee5700 (LWP 15236) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6

We can clearly see that while threads 1 and 2 are the same, gdb's
threads 3 and 4 respectively are haproxy's threads 4 and 3.

This may be backported to 2.0 as it removes some confusion in github issues.
2020-05-01 11:45:56 +02:00
Christopher Faulet
dc75d577b9 CLEANUP: checks: Fix checks includes 2020-04-29 13:32:29 +02:00
Christopher Faulet
1543d44607 MINOR: http-htx: Export functions to update message authority and host
These functions will be used by HTTP health checks when a request is formatted
before sending it.
2020-04-29 13:32:29 +02:00
Damien Claisse
57c8eb939d MINOR: log: Add "Tu" timer
It can be sometimes useful to measure total time of a request as seen
from an end user, including TCP/TLS negotiation, server response time
and transfer time. "Tt" currently provides something close to that, but
it also takes client idle time into account, which is problematic for
keep-alive requests as idle time can be very long. "Ta" is also not
sufficient as it hides TCP/TLS negotiationtime. To improve that, introduce
a "Tu" timer, without idle time and everything else. It roughly estimates
time spent time spent from user point of view (without DNS resolution
time), assuming network latency is the same in both directions.
2020-04-28 16:30:13 +02:00
Christopher Faulet
bfb0f72d52 BUG/MEDIUM: sessions: Always pass the mux context as argument to destroy a mux
This bug was introduced by the commit 2444aa5b ("MEDIUM: sessions: Don't be
responsible for connections anymore."). In session_check_idle_conn(), when the
mux is destroyed, its context must be passed as argument instead of the
connection.

It is de 2.2-dev bug. No need to backport.
2020-04-27 15:53:43 +02:00
Christopher Faulet
4a8c026117 BUG/MINOR: checks/server: use_ssl member must be signed 2020-04-27 12:13:06 +02:00
Christopher Faulet
8021a5f4a5 MINOR: checks: Support list of status codes on http-check expect rules
It is now possible to match on a comma-separated list of status codes or range
of codes. In addtion, instead of a string comparison to match the response's
status code, a integer comparison is performed. Here is an example:

  http-check expect status 200,201,300-310
2020-04-27 10:46:28 +02:00
Christopher Faulet
88d939c831 Revert "MEDIUM: checks: capture groups in expect regexes"
This reverts commit 1979943c30ef285ed04f07ecf829514de971d9b2.

Captures in comment was only used when a tcp-check expect based on a negative
regex matching failed to eventually report what was captured while it was not
expected. It is a bit far-fetched to be useable IMHO. on-error and on-success
log-format strings are far more usable. For now there is few check sample
fetches (in fact only one...). But it could be really powerful to report info in
logs.
2020-04-27 10:46:28 +02:00
Christopher Faulet
d7cee71e77 MINOR: checks: Use a tree instead of a list to store tcp-check rulesets
Since all tcp-check rulesets are globally stored, it is a problem to use
list. For configuration with many backends, the lookups in list may be costly
and slow downs HAProxy startup. To solve this problem, tcp-check rulesets are
now stored in a tree.
2020-04-27 10:46:28 +02:00
Christopher Faulet
0417975bdc MINOR: ist: Add a function to retrieve the ist pointer
There is already the istlen() function to get the ist length. Now, it is
possible to call istptr() to get the ist pointer.
2020-04-27 10:46:28 +02:00
Christopher Faulet
61cc852230 CLEANUP: checks: Reorg checks.c file to be more readable
The patch is not obvious at the first glance. But it is just a reorg. Functions
have been grouped and ordered in a more logical way. Some structures and flags
are now private to the checks module (so moved from the .h to the .c file).
2020-04-27 10:46:28 +02:00
Christopher Faulet
d7e639661a MEDIUM: checks: Implement default TCP check using tcp-check rules
Defaut health-checks, without any option, doing only a connection check, are now
based on tcp-checks. An implicit default tcp-check connect rule is used. A
shared tcp-check ruleset, name "*tcp-check" is created to support these checks.
2020-04-27 10:46:28 +02:00
Christopher Faulet
a9e1c4c7c2 MINOR: connection: Add a function to install a mux for a health-check
This function is unused for now. But it will have be used to install a mux for
an outgoing connection openned in a health-check context. In this case, the
session's origin is the check itself, and it is used to know the mode, HTTP or
TCP, depending on the tcp-check type and not the proxy mode. The check is also
used to get the mux protocol if configured.
2020-04-27 09:39:38 +02:00
Christopher Faulet
b356714769 MINOR: checks: Add a mux proto to health-check and tcp-check connect rule
It is not set and not used for now, but it will be possible to force the mux
protocol thanks to this patch. A mux proto field is added to the checks and to
tcp-check connect rules.
2020-04-27 09:39:38 +02:00
Christopher Faulet
a142c1deb4 BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function
The stream object (OBJ_TYPE_STREAM) was missing in the switch statement of the
obj_base_ptr() function.

This patch must be backported as far as 2.0.
2020-04-27 09:39:38 +02:00
Christopher Faulet
3829046893 MINOR: checks/obj_type: Add a new object type for checks
An object type is now affected to the check structure.
2020-04-27 09:39:38 +02:00
Christopher Faulet
e60abd1a06 MINOR: connection: Add macros to know if a conn or a cs uses an HTX mux
IS_HTX_CONN() and IS_HTX_CS may now be used to know if a connection or a
conn-stream use an HTX based multiplexer.
2020-04-27 09:39:38 +02:00
Christopher Faulet
e5870d872b MAJOR: checks: Implement HTTP check using tcp-check rules
HTTP health-checks are now internally based on tcp-checks. Of course all the
configuration parsing of the "http-check" keyword and the httpchk option has
been rewritten. But the main changes is that now, as for tcp-check ruleset, it
is possible to perform several send/expect sequences into the same
health-checks. Thus the connect rule is now also available from HTTP checks, jst
like set-var, unset-var and comment rules.

Because the request defined by the "option httpchk" line is used for the first
request only, it is now possible to set the method, the uri and the version on a
"http-check send" line.
2020-04-27 09:39:38 +02:00
Christopher Faulet
5eb96cbcbc MINOR: standard: Add my_memspn and my_memcspn
Do the same than strsnp() and strcspn() but on a raw bytes buffer.
2020-04-27 09:39:38 +02:00
Christopher Faulet
12d5740a38 MINOR: checks: Introduce flags to configure in tcp-check expect rules
Instead of having 2 independent integers, used as boolean values, to know if the
expect rule is invered and to know if the matching regexp has captures, we know
use a 32-bits bitfield.
2020-04-27 09:39:38 +02:00
Christopher Faulet
f930e4c4df MINOR: checks: Use an indirect string to represent the expect matching string
Instead of having a string in the expect union with its length outside of the
union, directly in the expect structure, an indirect string is now used.
2020-04-27 09:39:38 +02:00
Christopher Faulet
404f919995 MEDIUM: checks: Use a shared ruleset to store tcp-check rules
All tcp-check rules are now stored in the globla shared list. The ones created
to parse a specific protocol, for instance redis, are already stored in this
list. Now pure tcp-check rules are also stored in it. The ruleset name is
created using the proxy name and its config file and line. tcp-check rules
declared in a defaults section are also stored this way using "defaults" as
proxy name.

For now, all tcp-check ruleset are stored in a list. But it could be a bit slow
to looks for a specific ruleset with a huge number of backends. So, it could be
a good idea to use a tree instead.
2020-04-27 09:39:38 +02:00
Christopher Faulet
6f5579160a MINOR: proxy/checks: Move parsing of external-check option in checks.c
Parsing of the proxy directive "option external-check" have been moved in checks.c.
2020-04-27 09:39:38 +02:00
Christopher Faulet
430e480510 MINOR: proxy/checks: Move parsing of tcp-check option in checks.c
Parsing of the proxy directive "option tcp-check" have been moved in checks.c.
2020-04-27 09:39:38 +02:00
Christopher Faulet
6c2a743538 MINOR: proxy/checks: Move parsing of httpchk option in checks.c
Parsing of the proxy directive "option httpchk" have been moved in checks.c.
2020-04-27 09:39:38 +02:00
Christopher Faulet
ec07e386a7 MINOR: checks: Add an option to set success status of tcp-check expect rules
It is now possible to specified the healthcheck status to use on success of a
tcp-check rule, if it is the last evaluated rule. The option "ok-status"
supports "L4OK", "L6OK", "L7OK" and "L7OKC" status.
2020-04-27 09:39:38 +02:00
Christopher Faulet
799f3a4621 MINOR: Produce tcp-check info message for pure tcp-check rules only
This way, messages reported by protocol checks are closer that the old one.
2020-04-27 09:39:38 +02:00
Christopher Faulet
0ae3d1dbdf MEDIUM: checks: Implement agent check using tcp-check rules
A shared tcp-check ruleset is now created to support agent checks. The following
sequence is used :

    tcp-check send "%[var(check.agent_string)] log-format
    tcp-check expect custom

The custom function to evaluate the expect rule does the same that it was done
to handle agent response when a custom check was used.
2020-04-27 09:39:38 +02:00
Christopher Faulet
267b01b761 MEDIUM: checks: Implement SPOP check using tcp-check rules
A share tcp-check ruleset is now created to support SPOP checks. This way no
extra memory is used if several backends use a SPOP check.

The following sequence is used :

    tcp-check send-binary SPOP_REQ
    tcp-check expect custom min-recv 4

The spop request is the result of the function
spoe_prepare_healthcheck_request() and the expect rule relies on a custom
function calling spoe_handle_healthcheck_response().
2020-04-27 09:39:38 +02:00
Christopher Faulet
1997ecaa0c MEDIUM: checks: Implement LDAP check using tcp-check rules
A shared tcp-check ruleset is now created to support LDAP check. This way no
extra memory is used if several backends use a LDAP check.

The following sequance is used :

    tcp-check send-binary "300C020101600702010304008000"

    tcp-check expect rbinary "^30" min-recv 14 \
        on-error "Not LDAPv3 protocol"

    tcp-check expect custom

The last expect rule relies on a custom function to check the LDAP server reply.
2020-04-27 09:39:38 +02:00
Christopher Faulet
f2b3be5c27 MEDIUM: checks: Implement MySQL check using tcp-check rules
A share tcp-check ruleset is now created to support MySQL checks. This way no
extra memory is used if several backends use a MySQL check.

One for the following sequence is used :

    ## If no extra params are set
    tcp-check connect default linger
    tcp-check expect custom  ## will test the initial handshake

    ## If the username is defined
    tcp-check connect default linger
    tcp-check send-binary MYSQL_REQ log-format
    tcp-check expect custom  ## will test the initial handshake
    tcp-check expect custom  ## will test the reply to the client message

The log-format hexa string MYSQL_REQ depends on 2 preset variables, the packet
header containing the packet length and the sequence ID (check.header) and the
username (check.username). If is also different if the "post-41" option is set
or not. Expect rules relies on custom functions to check MySQL server packets.
2020-04-27 09:39:38 +02:00
Christopher Faulet
ce355074f1 MEDIUM: checks: Implement postgres check using tcp-check rules
A shared tcp-check ruleset is now created to support postgres check. This way no
extra memory is used if several backends use a pgsql check.

The following sequence is used :

    tcp-check connect default linger

    tcp-check send-binary PGSQL_REQ log-format

    tcp-check expect !rstring "^E" min-recv 5 \
        error-status "L7RSP" on-error "%[check.payload(6,0)]"

    tcp-check expect rbinary "^520000000800000000 min-recv "9" \
        error-status "L7STS" \
        on-success "PostgreSQL server is ok" \
        on-error "PostgreSQL unknown error"

The log-format hexa string PGSQL_REQ depends on 2 preset variables, the packet
length (check.plen) and the username (check.username).
2020-04-27 09:39:38 +02:00
Christopher Faulet
fbcc77c6ba MEDIUM: checks: Implement smtp check using tcp-check rules
A share tcp-check ruleset is now created to support smtp checks. This way no
extra memory is used if several backends use a smtp check.

The following sequence is used :

    tcp-check connect default linger

    tcp-check expect rstring "^[0-9]{3}[ \r]" min-recv 4 \
        error-status "L7RSP" on-error "%[check.payload(),cut_crlf]"

    tcp-check expect rstring "^2[0-9]{2}[ \r]" min-recv 4 \
        error-status "L7STS" \
        on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \
        status-code "check.payload(0,3)"

    tcp-echeck send "%[var(check.smtp_cmd)]\r\n" log-format

    tcp-check expect rstring "^2[0-9]{2}[- \r]" min-recv 4 \
        error-status "L7STS" \
        on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \
        on-success "%[check.payload(4,0),ltrim(' '),cut_crlf]" \
        status-code "check.payload(0,3)"

The variable check.smtp_cmd is by default the string "HELO localhost" by may be
customized setting <helo> and <domain> parameters on the option smtpchk
line. Note there is a difference with the old smtp check. The server gretting
message is checked before send the HELO/EHLO comand.
2020-04-27 09:39:38 +02:00
Christopher Faulet
811f78ced1 MEDIUM: checks: Implement ssl-hello check using tcp-check rules
A shared tcp-check ruleset is now created to support ssl-hello check. This way
no extra memory is used if several backends use a ssl-hello check.

The following sequence is used :

    tcp-check send-binary SSLV3_CLIENT_HELLO log-format

    tcp-check expect rbinary "^1[56]" min-recv 5 \
        error-status "L6RSP" tout-status "L6TOUT"

SSLV3_CLIENT_HELLO is a log-format hexa string representing a SSLv3 CLIENT HELLO
packet. It is the same than the one used by the old ssl-hello except the sample
expression "%[date(),htonl,hex]" is used to set the date field.
2020-04-27 09:39:38 +02:00
Christopher Faulet
33f05df650 MEDIUM: checks: Implement redis check using tcp-check rules
A share tcp-check ruleset is now created to support redis checks. This way no
extra memory is used if several backends use a redis check.

The following sequence is used :

  tcp-check send "*1\r\n$4\r\nPING\r\n"

  tcp-check expect string "+PONG\r\n" error-status "L7STS" \
      on-error "%[check.payload(),cut_crlf]" on-success "Redis server is ok"
2020-04-27 09:39:38 +02:00
Christopher Faulet
9e6ed1598e MINOR: checks: Support custom functions to eval a tcp-check expect rules
It is now possible to set a custom function to evaluate a tcp-check expect
rule. It is an internal and not documentd option because the right pointer of
function must be set and it is not possible to express it in the
configuration. It will be used to convert some protocol healthchecks to
tcp-checks.

Custom functions must have the following signature:

  enum tcpcheck_eval_ret (*custom)(struct check *, struct tcpcheck_rule *, int);
2020-04-27 09:39:38 +02:00
Christopher Faulet
6f87adcf20 MINOR: checks: Export the tcpcheck_eval_ret enum
This enum will be used to define custom function for tcp-check expect rules.
2020-04-27 09:39:38 +02:00
Christopher Faulet
7a1e2e1823 MEDIUM: checks: Add a list of vars to set before executing a tpc-check ruleset
A list of variables is now associated to each tcp-check ruleset. It is more a
less a list of set-var expressions. This list may be filled during the
configuration parsing. The listed variables will then be set during each
execution of the tcp-check healthcheck, at the begining, before execution of the
the first tcp-check rule.

This patch is mandatory to convert all protocol checks to tcp-checks. It is a
way to customize shared tcp-check rulesets.
2020-04-27 09:39:37 +02:00
Christopher Faulet
bb591a1a11 MINOR: checks: Relax the default option for tcp-check connect rules
Now this option may be mixed with other options. This way, options on the server
line are used but may be overridden by tcp-check connect options.
2020-04-27 09:39:37 +02:00
Christopher Faulet
98cc57cf5c MEDIUM: checks: Add status-code sample expression on tcp-check expect rules
This option defines a sample expression, evaluated as an integer, to set the
status code (check->code) if a tcp-check healthcheck ends on the corresponding
expect rule.
2020-04-27 09:39:37 +02:00
Christopher Faulet
be52b4de66 MEDIUM: checks: Add on-error/on-success option on tcp-check expect rules
These options define log-format strings used to produce the info message if a
tcp-check expect rule fails (on-error option) or succeeds (on-success
option). For this last option, it must be the ending rule, otherwise the
parameter is ignored.
2020-04-27 09:39:37 +02:00
Christopher Faulet
cf80f2f263 MINOR: checks: Add option to tcp-check expect rules to customize error status
It is now possible to specified the healthcheck status to use on error or on
timeout for tcp-check expect rules. First, to define the error status, the
option "error-status" must be used followed by "L4CON", "L6RSP", "L7RSP" or
"L7STS". Then, to define the timeout status, the option "tout-status" must be
used followed by "L4TOUT", "L6TOUT" or "L7TOUT".

These options will be used to convert specific protocol healthchecks (redis,
pgsql...) to tcp-check ones.
x
2020-04-27 09:39:37 +02:00
Christopher Faulet
1032059bd0 MINOR: checks: Use a name for the healthcheck status enum
The enum defining all healthcheck status (HCHK_STATUS_*) is now named.
2020-04-27 09:39:37 +02:00
Christopher Faulet
5d503fcf5b MEDIUM: checks: Add a shared list of tcp-check rules
A global list to tcp-check ruleset can now be used to share common rulesets with
all backends without any duplication. It is mandatory to convert all specific
protocol checks (redis, pgsql...) to tcp-check healthchecks.

To do so, a flag is now attached to each tcp-check ruleset to know if it is a
shared ruleset or not. tcp-check rules defined in a backend are still directly
attached to the proxy and not shared. In addition a second flag is used to know
if the ruleset is inherited from the defaults section.
2020-04-27 09:39:37 +02:00
Christopher Faulet
f50f4e956f MEDIUM: checks: Support log-format strings for tcp-check send rules
An extra parameter for tcp-check send rules can be specified to handle the
string or the hexa string as a log-format one. Using "log-format" option,
instead of considering the data to send as raw data, it is parsed as a
log-format string. Thus it is possible to call sample fetches to customize data
sent to a server. Of course, because we have no stream attached to healthchecks,
not all sample fetches are available. So be careful.

    tcp-check set-var(check.port) int(8000)
    tcp-check set-var(check.uri) str(/status)
    tcp-check connect port var(check.port)
    tcp-check send "GET %[check.uri] HTTP/1.0\r\n" log-format
    tcp-check send "Host: %[srv_name]\r\n" log-format
    tcp-check send "\r\n"
2020-04-27 09:39:37 +02:00
Christopher Faulet
b7d30098f3 MEDIUM: checks: Support expression to set the port
Since we have a session attached to tcp-check healthchecks, It is possible use
sample expression and variables. In addition, it is possible to add tcp-check
set-var rules to define custom variables. So, now, a sample expression can be
used to define the port to use to establish a connection for a tcp-check connect
rule. For instance:

    tcp-check set-var(check.port) int(8888)
    tcp-check connect port var(check.port)
2020-04-27 09:39:37 +02:00
Christopher Faulet
5c28874a69 MINOR: checks: Add the addr option for tcp-check connect rule
With this option, it is now possible to use a specific address to open the
connection for a tcp-check connect rule. If the port option is also specified,
it is used in priority.
2020-04-27 09:39:37 +02:00
Christopher Faulet
d75f57e94c MINOR: ssl: Export a generic function to parse an alpn string
Parsing of an alpn string has been moved in a dedicated function and exposed to
be used from outside the ssl_sock module.
2020-04-27 09:39:37 +02:00
Christopher Faulet
085426aea9 MINOR: checks: Add the via-socks4 option for tcp-check connect rules
With this option, it is possible to establish the connection opened by a
tcp-check connect rule using upstream socks4 proxy. Info from the socks4
parameter on the server are used.
2020-04-27 09:39:37 +02:00
Christopher Faulet
79b31d4ee5 MINOR: checks: Add the sni option for tcp-check connect rules
With this option, it is possible to specify the SNI to be used for SSL
conncection opened by a tcp-check connect rule.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
707b52f17e MEDIUM: checks: Parse custom action rules in tcp-checks
Register the custom action rules "set-var" and "unset-var", that will
call the parse_store() command upon parsing.

These rules are thus built and integrated to the tcp-check ruleset, but
have no further effect for the moment.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
13a5043a9e MINOR: checks/vars: Add a check scope for variables
Add a dedicated vars scope for checks. This scope is considered as part of the
session scope for accounting purposes.

The scope can be addressed by a valid session, even embryonic. The stream is not
necessary.

The scope is initialized after the check session is created. All variables are
then pruned before the session is destroyed.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
05d692dc09 MEDIUM: checks: Associate a session to each tcp-check healthcheck
Create a session for each healthcheck relying on a tcp-check ruleset. When such
check is started, a session is allocated, which will be freed when the check
finishes. A dummy static frontend is used to create these sessions. This will be
useful to support variables and sample expression. This will also be used,
later, by HTTP healthchecks to rely on HTTP muxes.
2020-04-27 09:39:37 +02:00
Christopher Faulet
b2c2e0fcca MAJOR: checks: Refactor and simplify the tcp-check loop
The loop in tcpcheck_main() function is quite hard to understand. Depending
where we are in the loop, The current_step is the currentely executed rule or
the one to execute on the next call to tcpcheck_main(). When the check result is
reported, we rely on the rule pointed by last_started_step or the one pointed by
current_step. In addition, the loop does not use the common list_for_each_entry
macro and it is thus quite confusing.

So the loop has been totally rewritten and splitted to several functions to
simplify its reading and its understanding. Tcp-check rules are evaluated in
dedicated functions. And a common for_each loop is used and only one rule is
referenced, the current one.
2020-04-27 09:39:37 +02:00
Christopher Faulet
a202d1d4c1 MEDIUM: checks: Add implicit tcp-check connect rule
After the configuration parsing, when its validity check, an implicit tcp-check
connect rule is added in front of the tcp-check ruleset if the first non-comment
rule is not a connect one. This implicit rule is flagged to use the default
check parameter.

This means now, all tcp-check rulesets begin with a connect and are never
empty. When tcp-check healthchecks are used, all connections are thus handled by
tcpcheck_main() function.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
06d963aeca MINOR: checks: define a tcp-check connect type
The check rule itself is not changed, only its representation.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
48219dc50e MINOR: checks: define tcp-check send type
The check rule itself is not changed, only its representation.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
5301b01f99 MINOR: checks: Set the tcp-check rule index during parsing
Now the position of a tcp-check rule in a chain is set during the parsing. This
simplify significantly the function retrieving the current step id.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
04578dbf37 MINOR: checks: Don't use a static tcp rule list head
To allow reusing these blocks without consuming more memory, their list
should be static and share-able accross uses. The head of the list will
be shared as well.

It is thus necessary to extract the head of the rule list from the proxy
itself. Transform it into a pointer instead, that can be easily set to
an external dynamically allocated head.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
9dcb09fc98 MEDIUM: checks: capture groups in expect regexes
Parse back-references in comments of tcp-check expect rules.  If references are
made, capture groups in the match and replace references to it within the
comment when logging the error. Both text and binary regex can caputre groups
and reference them in the expect rule comment.

[Cf: I slightly updated the patch. exp_replace() function is used instead of a
custom one. And if the trash buffer is too small to contain the comment during
the substitution, the comment is ignored.]
2020-04-27 09:39:37 +02:00
Gaetan Rivet
efab6c61d9 MINOR: checks: add rbinary expect match type
The rbinary match works similarly to the rstring match type, however the
received data is rewritten as hex-string before the match operation is
done.

This allows using regexes on binary content even with the POSIX regex
engine.

[Cf: I slightly updated the patch. mem2hex function was removed and dump_binary
is used instead.]
2020-04-27 09:39:37 +02:00
Gaetan Rivet
b616add793 MINOR: checks: define a tcp expect type
Extract the expect definition from its tcpcheck ; create a standalone type.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
f8ba6773e5 MINOR: checks: add linger option to tcp connect
Allow declaring tcpcheck connect commands with a new parameter,
"linger". This option will configure the connection to avoid using an
RST segment to close, instead following the four-way termination
handshake. Some servers would otherwise log each healthcheck as
an error.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
1afd826ae4 MINOR: checks: add min-recv tcp-check expect option
Some expect rules cannot be satisfied due to inherent ambiguity towards
the received data: in the absence of match, the current behavior is to
be forced to wait either the end of the connection or a buffer full,
whichever comes first. Only then does the matching diagnostic is
considered  conclusive. For instance :

    tcp-check connect
    tcp-check expect !rstring "^error"
    tcp-check expect string "valid"

This check will only succeed if the connection is closed by the server before
the check timeout. Otherwise the first expect rule will wait for more data until
"^error" regex matches or the check expires.

Allow the user to explicitly define an amount of data that will be
considered enough to determine the value of the check.

This allows succeeding on negative rstring rules, as previously
in valid condition no match happened, and the matching was repeated
until the end of the connection. This could timeout the check
while no error was happening.

[Cf: I slighly updated the patch. The parameter was renamed and the value is a
signed integer to support -1 as default value to ignore the parameter.]
2020-04-27 09:39:37 +02:00
Gaetan Rivet
4038b94706 MEDIUM: checks: rewind to the first inverse expect rule of a chain on new data
When receiving additional data while chaining multiple tcp-check expects,
previous inverse expects might have a different result with the new data. They
need to be evaluated again against the new data.

Add a pointer to the first inverse expect rule of the current expect chain
(possibly of length one) to each expect rule. When receiving new data, the
currently evaluated tcp-check rule is set back to this pointed rule.

Fonctionnaly speaking, it is a bug and it exists since the introduction of the
feature. But there is no way for now to hit it because when an expect rule does
not match, we wait for more data, independently on the inverse flag. The only
way to move to the following rule is to be sure no more data will be received.

This patch depends on the commit "MINOR: mini-clist: Add functions to iterate
backward on a list".

[Cf: I slightly updated the patch. First, it only concerns inverse expect
rule. Normal expect rules are not concerned. Then, I removed the BUG tag
because, for now, it is not possible to move to the following rule when the
current one does not match while more data can be received.]
2020-04-27 09:39:37 +02:00
Gaetan Rivet
dd66732ffe MINOR: checks: Use an enum to describe the tcp-check rule type
Replace the generic integer with an enumerated list. This allows light
type check and helps debugging (seeing action = 2 in the struct is not
helpful).
2020-04-27 09:39:37 +02:00
Christopher Faulet
31c30fdf1e CLEANUP: checks: Don't export anymore init_check and srv_check_healthcheck_port
These functions are no longer called outside the checks.
2020-04-27 09:39:37 +02:00
Christopher Faulet
f61f33a1b2 BUG/MINOR: checks: Respect the no-check-ssl option
This options is used to force a non-SSL connection to check a SSL server or to
invert a check-ssl option inherited from the default section. The use_ssl field
in the check structure is used to know if a SSL connection must be used
(use_ssl=1) or not (use_ssl=0). The server configuration is used by default.

The problem is that we cannot distinguish the default case (no specific SSL
check option) and the case of an explicit non-SSL check. In both, use_ssl is set
to 0. So the server configuration is always used. For a SSL server, when
no-check-ssl option is set, the check is still performed using a SSL
configuration.

To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value :

  * 0  = use server config
  * 1  = force SSL
  * -1 = force non-SSL

The same is done for the server parameter. It is not really necessary for
now. But it is a good way to know is the server no-ssl option is set.

In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl
to 1 for a check. Instead the flag is directly tested to prepare or destroy the
server SSL context.

This patch should be backported as far as 1.8.
2020-04-27 09:39:37 +02:00
Christopher Faulet
8acb1284bc MINOR: checks: Add a way to send custom headers and payload during http chekcs
The 'http-check send' directive have been added to add headers and optionnaly a
payload to the request sent during HTTP healthchecks. The request line may be
customized by the "option httpchk" directive but there was not official way to
add extra headers. An old trick consisted to hide these headers at the end of
the version string, on the "option httpchk" line. And it was impossible to add
an extra payload with an "http-check expect" directive because of the
"Connection: close" header appended to the request (See issue #16 for details).

So to make things official and fully support payload additions, the "http-check
send" directive have been added :

    option httpchk POST /status HTTP/1.1

    http-check send hdr Content-Type "application/json;charset=UTF-8" \
        hdr X-test-1 value1 hdr X-test-2 value2 \
        body "{id: 1, field: \"value\"}"

When a payload is defined, the Content-Length header is automatically added. So
chunk-encoded requests are not supported yet. For now, there is no special
validity checks on the extra headers.

This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and
as far as possible, it may be backported, at least as far as 1.8.
2020-04-27 09:39:37 +02:00
Christopher Faulet
bc1f54b0fc MINOR: mini-clist: Add functions to iterate backward on a list
list_for_each_entry_rev() and list_for_each_entry_from_rev() and corresponding
safe versions have been added to iterate on a list in the reverse order. All
these functions work the same way than the forward versions, except they use the
.p field to move for an element to another.
2020-04-27 09:39:37 +02:00
Christopher Faulet
aaae9a0e99 BUG/MINOR: check: Update server address and port to execute an external check
Server address and port may change at runtime. So the address and port passed as
arguments and as environment variables when an external check is executed must
be updated. The current number of connections on the server was already updated
before executing the command. So the same mechanism is used for the server
address and port. But in addition, command arguments are also updated.

This patch must be backported to all stable versions. It should fix the
issue #577.
2020-04-27 09:39:13 +02:00
Willy Tarreau
62ba9ba6ca BUG/MINOR: http: make url_decode() optionally convert '+' to SP
The url_decode() function used by the url_dec converter and a few other
call points is ambiguous on its processing of the '+' character which
itself isn't stable in the spec. This one belongs to the reserved
characters for the query string but not for the path nor the scheme,
in which it must be left as-is. It's only in argument strings that
follow the application/x-www-form-urlencoded encoding that it must be
turned into a space, that is, in query strings and POST arguments.

The problem is that the function is used to process full URLs and
paths in various configs, and to process query strings from the stats
page for example.

This patch updates the function to differentiate the situation where
it's parsing a path and a query string. A new argument indicates if a
query string should be assumed, otherwise it's only assumed after seeing
a question mark.

The various locations in the code making use of this function were
updated to take care of this (most call places were using it to decode
POST arguments).

The url_dec converter is usually called on path or url samples, so it
needs to remain compatible with this and will default to parsing a path
and turning the '+' to a space only after a question mark. However in
situations where it would explicitly be extracted from a POST or a
query string, it now becomes possible to enforce the decoding by passing
a non-null value in argument.

It seems to be what was reported in issue #585. This fix may be
backported to older stable releases.
2020-04-23 20:03:27 +02:00
Willy Tarreau
09568fd54d BUG/MINOR: tools: fix the i386 version of the div64_32 function
As reported in issue #596, the edx register isn't marked as clobbered
in div64_32(), which could technically allow gcc to try to reuse it
if it needed a copy of the 32 highest bits of the o1 register after
the operation.

Two attempts were tried, one using a dummy 32-bit local variable to
store the intermediary edx and another one switching to "=A" and making
result a long long. It turns out the former makes the resulting object
code significantly dirtier while the latter makes it better and was
kept. This is due to gcc's difficulties at working with register pairs
mixing 32- and 64- bit values on i386. It was verified that no code
change happened at all on x86_64, armv7, aarch64 nor mips32.

In practice it's only used by the frequency counters so this bug
cannot even be triggered but better fix it.

This may be backported to stable branches though it will not fix any
issue.
2020-04-23 17:21:37 +02:00
Ilya Shipitsin
856aabcda5 CLEANUP: assorted typo fixes in the code and comments
This is 8th iteration of typo fixes
2020-04-17 09:37:36 +02:00
Willy Tarreau
bb86986253 MINOR: init: report the haproxy version and executable path once on errors
If haproxy fails to start and emits an alert, then it can be useful
to have it also emit the version and the path used to load it. Some
users may be mistakenly launching the wrong binary due to a misconfigured
PATH variable and this will save them some troubleshooting time when it
reports that some keywords are not understood.

What we do here is that we *try* to extract the binary name from the
AUX vector on glibc, and we report this as a NOTICE tag before the
very first alert is emitted.
2020-04-16 10:52:41 +02:00
Ilya Shipitsin
d425950c68 CLEANUP: assorted typo fixes in the code and comments
This is 7th iteration of typo fixes
2020-04-16 10:04:36 +02:00
Willy Tarreau
3eb10b8e98 MINOR: init: add -dW and "zero-warning" to reject configs with warnings
Since some systems switched to service managers which hide all warnings
by default, some users are not aware of some possibly important warnings
and get caught too late with errors that could have been detected earlier.

This patch adds a new global keyword, "zero-warning" and an equivalent
command-line option "-dW" to refuse to start in case any warning is
detected. It is recommended to use these with configurations that are
managed by humans in order to catch mistakes very early.
2020-04-15 16:42:39 +02:00
Willy Tarreau
bebd212064 MINOR: init: report in "haproxy -c" whether there were warnings or not
This helps quickly checking if the config produces any warning. For
this we reuse the "warned" bit field to add a new WARN_ANY bit that is
set by ha_warning(). The rest of the bit field was also cleaned from
unused bits.
2020-04-15 16:42:00 +02:00
Frdric Lcaille
8ba10fea69 BUG/MINOR: peers: Incomplete peers sections should be validated.
Before supporting "server" line in "peers" section, such sections without
any local peer were removed from the configuration to get it validated.

This patch fixes the issue where a "server" line without address and port which
is a remote peer without address and port makes the configuration parsing fail.
When encoutering such cases we now ignore such lines remove them from the
configuration.

Thank you to Jrme Magnin for having reported this bug.

Must be backported to 2.1 and 2.0.
2020-04-15 10:47:39 +02:00
William Lallemand
b7296c42bd CLEANUP: ssl: remove a commentary in struct ckch_inst
The struct ckch_inst now handles the ssl_bind_conf so this commentary is
obsolete
2020-04-09 16:13:42 +02:00
William Lallemand
caa161982f CLEANUP: ssl/cli: use the list of filters in the crtlist_entry
In 'commit ssl cert', instead of trying to regenerate a list of filters
from the SNIs, use the list provided by the crtlist_entry used to
generate the ckch_inst.

This list of filters doesn't need to be free'd anymore since they are
always reused from the crtlist_entry.
2020-04-08 16:52:51 +02:00
William Lallemand
02e19a5c7b CLEANUP: ssl: use the refcount for the SSL_CTX'
Use the refcount of the SSL_CTX' to free them instead of freeing them on
certains conditions. That way we can free the SSL_CTX everywhere its
pointer is used.
2020-04-08 16:52:51 +02:00
William Lallemand
c69f02d0f0 MINOR: ssl/cli: replace dump/show ssl crt-list by '-n' option
The dump and show ssl crt-list commands does the same thing, they dump
the content of a crt-list, but the 'show' displays an ID in the first
column. Delete the 'dump' command so it is replaced by the 'show' one.
The old 'show' command is replaced by an '-n' option to dump the ID.
And the ID which was a pointer is replaced by a line number and placed
after colons in the filename.

Example:
  $ echo "show ssl crt-list -n kikyo.crt-list" | socat /tmp/sock1 -
  # kikyo.crt-list
  kikyo.pem.rsa:1 secure.domain.tld
  kikyo.pem.ecdsa:2 secure.domain.tld
2020-04-06 19:33:33 +02:00
Frdric Lcaille
876ed55d9b BUG/MINOR: protocol_buffer: Wrong maximum shifting.
This patch fixes a bad stop condition when decoding a protocol buffer variable integer
whose maximum lenghts are 10, shifting a uint64_t value by more than 63.

Thank you to Ilya for having reported this issue.

Must be backported to 2.1 and 2.0.
2020-04-02 15:09:46 +02:00
Olivier Houchard
4a0e7fe4f7 MINOR: connections: Don't mark conn flags 0x00000001 and 0x00000002 as unused.
Remove the comments saying 0x00000001 and 0x00000002 are unused, they are
now used by CO_FL_SAFE_LIST and CO_FL_IDLE_LIST.
2020-03-31 23:04:20 +02:00
William Lallemand
fa8cf0c476 MINOR: ssl: store a ptr to crtlist in crtlist_entry
Store a pointer to crtlist in crtlist_entry so we can re-insert a
crtlist_entry in its crtlist ebpt after updating its key.
2020-03-31 12:32:17 +02:00
William Lallemand
23d61c00b9 MINOR: ssl: add a list of crtlist_entry in ckch_store
When updating a ckch_store we may want to update its pointer in the
crtlist_entry which use it. To do this, we need the list of the entries
using the store.
2020-03-31 12:32:17 +02:00
William Lallemand
493983128b BUG/MINOR: ssl: ckch_inst wrongly inserted in crtlist_entry
The instances were wrongly inserted in the crtlist entries, all
instances of a crt-list were inserted in the last crt-list entry.
Which was kind of handy to free all instances upon error.

Now that it's done correctly, the error path was changed, it must
iterate on the entries and find the ckch_insts which were generated for
this bind_conf. To avoid wasting time, it stops the iteration once it
found the first unsuccessful generation.
2020-03-31 12:32:17 +02:00
William Lallemand
ad3c37b760 REORG: ssl: move SETCERT enum to ssl_sock.h
Move the SETCERT enum at the right place to cleanup ssl_sock.c.
2020-03-31 12:32:17 +02:00
William Lallemand
79d31ec0d4 MINOR: ssl: add a list of bind_conf in struct crtlist
In order to be able to add new certificate in a crt-list, we need the
list of bind_conf that uses this crt-list so we can create a ckch_inst
for each of them.
2020-03-31 12:32:17 +02:00
William Lallemand
638f6ad033 MINOR: cli: add a general purpose pointer in the CLI struct
This patch adds a p2 generic pointer which is inialized to zero before
calling the parser.
2020-03-31 12:32:17 +02:00
Olivier Houchard
cf612a0457 MINOR: servers: Add a counter for the number of currently used connections.
Add a counter to know the current number of used connections, as well as the
max, this will be used later to refine the algorithm used to kill idle
connections, based on current usage.
2020-03-30 00:30:01 +02:00
Jerome Magnin
824186bb08 MEDIUM: stream: support use-server rules with dynamic names
With server-template was introduced the possibility to scale the
number of servers in a backend without needing a configuration change
and associated reload. On the other hand it became impractical to
write use-server rules for these servers as they would only accept
existing server labels as argument. This patch allows the use of
log-format notation to describe targets of a use-server rules, such
as in the example below:

  listen test
    bind *:1234
    use-server %[hdr(srv)] if { hdr(srv) -m found }
    use-server s1 if { path / }
    server s1 127.0.0.1:18080
    server s2 127.0.0.1:18081

If a use-server rule is applied because it was conditionned by an
ACL returning true, but the target of the use-server rule cannot be
resolved, no other use-server rule is evaluated and we fall back to
load balancing.

This feature was requested on the ML, and bumped with issue #563.
2020-03-29 09:55:10 +02:00
Olivier Houchard
dbda31939d BUG/MINOR: connections: Set idle_time before adding to idle list.
In srv_add_to_idle_list(), make sure we set the idle_time before we add
the connection to an idle list, not after, otherwise another thread may
grab it, set the idle_time to 0, only to have the original thread set it
back to now_ms.
This may have an impact, as in conn_free() we check idle_time to decide
if we should decrement the idle connection counters for the server.
2020-03-22 20:05:59 +01:00
Olivier Houchard
ad91124bcf BUILD/MEDIUM: fd: Declare fd_mig_lock as extern.
Declare fd_mig_lock as extern so that it isn't defined multiple times.
This should fix build for architectures without double-width CAS.
2020-03-20 11:42:11 +01:00
Olivier Houchard
566df309c6 MEDIUM: connections: Attempt to get idle connections from other threads.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
d2489e00b0 MINOR: connections: Add a flag to know if we're in the safe or idle list.
Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one
know we are in the safe list, or the idle list.
2020-03-19 22:07:33 +01:00
Olivier Houchard
f0d4dff25c MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
00bdce24d5 MINOR: connections: Add a new mux method, "takeover".
Add a new mux method, "takeover", that will attempt to make the current thread
responsible for the connection.
It should return 0 on success, and non-zero on failure.
2020-03-19 22:07:33 +01:00
Olivier Houchard
8851664293 MINOR: fd: Implement fd_takeover().
Implement a new function, fd_takeover(), that lets you become the thread
responsible for the fd. On architectures that do not have a double-width CAS,
use a global rwlock.
fd_set_running() was also changed to be able to compete with fd_takeover(),
either using a dooble-width CAS on both running_mask and thread_mask, or
by claiming a reader on the global rwlock. This extra operation should not
have any measurable impact on modern architectures where threading is
relevant.
2020-03-19 22:07:33 +01:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Olivier Houchard
2444aa5b66 MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
2020-03-19 22:07:33 +01:00
Olivier Houchard
899fb8abdc MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc.
The flush_lock was introduced, mostly to be sure that pool_gc() will never
dereference a pointer that has been free'd. __pool_get_first() was acquiring
the lock to, the fear was that otherwise that pointer could get free'd later,
and then pool_gc() would attempt to dereference it. However, that can not
happen, because the only functions that can free a pointer, when using
lockless pools, are pool_gc() and pool_flush(), and as long as those two
are mutually exclusive, nobody will be able to free the pointer while
pool_gc() attempts to access it.
So change the flush_lock to a spinlock, and don't bother acquire/release
it in __pool_get_first(), that way callers of __pool_get_first() won't have
to wait while the pool is flushed. The worst that can happen is we call
__pool_refill_alloc() while the pool is getting flushed, and memory can
get allocated just to be free'd.

This may help with github issue #552

This may be backported to 2.1, 2.0 and 1.9.
2020-03-18 15:55:35 +01:00
Olivier Houchard
de01ea9878 MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h.
Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into
types/signal.h, so that we can access them in another file.
We need those definition to avoid blocking those signals when running
__signal_process_queue().

This should be backported to 2.1, 2.0 and 1.9.
2020-03-18 13:07:19 +01:00
Olivier Houchard
a7bf573520 MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock.
In the struct fdtab, introduce a new mask, running_mask. Each thread should
add its bit before using the fd.
Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just
spin as long as the mask is non-zero, to be sure we access the data
exclusively.
fd_set_running_excl() spins until the mask is 0, fd_set_running() just
adds the thread bit, and fd_clr_running() removes it.
2020-03-17 15:30:07 +01:00
William Lallemand
2954c478eb MEDIUM: ssl: allow crt-list caching
The crtlist structure defines a crt-list in the HAProxy configuration.
It contains crtlist_entry structures which are the lines in a crt-list
file.

crt-list are now loaded in memory using crtlist and crtlist_entry
structures. The file is read only once. The generation algorithm changed
a little bit, new ckch instances are generated from the crtlist
structures, instead of being generated during the file loading.

The loading function was split in two, one that loads and caches the
crt-list and certificates, and one that looks for a crt-list and creates
the ckch instances.

Filters are also stored in crtlist_entry->filters as a char ** so we can
generate the sni_ctx again if needed. I won't be needed anymore to parse
the sni_ctx to do that.

A crtlist_entry stores the list of all ckch_inst that were generated
from this entry.
2020-03-16 16:18:49 +01:00
Willy Tarreau
e4d42551bd BUILD: pools: silence build warnings with DEBUG_MEMORY_POOLS and DEBUG_UAF
With these debug options we still get these warnings:

include/common/memory.h:501:23: warning: null pointer dereference [-Wnull-dereference]
    *(volatile int *)0 = 0;
    ~~~~~~~~~~~~~~~~~~~^~~
include/common/memory.h:460:22: warning: null pointer dereference [-Wnull-dereference]
   *(volatile int *)0 = 0;
   ~~~~~~~~~~~~~~~~~~~^~~

These are purposely there to crash the process at specific locations.
But the annoying warnings do not help with debugging and they are not
even reliable as the compiler may decide to optimize them away. Let's
pass the pointer through DISGUISE() to avoid this.
2020-03-14 11:10:21 +01:00
Willy Tarreau
2e8ab6b560 MINOR: use DISGUISE() everywhere we deliberately want to ignore a result
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
2020-03-14 11:04:49 +01:00
Willy Tarreau
15ed69fd3f MINOR: debug: consume the write() result in BUG_ON() to silence a warning
Tim reported that BUG_ON() issues warnings on his distro, as the libc marks
some syscalls with __attribute__((warn_unused_result)). Let's pass the
write() result through DISGUISE() to hide it.
2020-03-14 10:58:35 +01:00
Willy Tarreau
f401668306 MINOR: debug: add a new DISGUISE() macro to pass a value as identity
This does exactly the same as ALREADY_CHECKED() but does it inline,
returning an identical copy of the scalar variable without letting
the compiler know how it might have been transformed. This can
forcefully disable certain null-pointer checks or result checks when
known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0
will not cause a null-deref warning.
2020-03-14 10:52:46 +01:00
Ilya Shipitsin
77e3b4a2c4 CLEANUP: assorted typo fixes in the code and comments
These are mostly comments in the code. A few error messages were fixed
and are of low enough importance not to deserve a backport. Some regtests
were also fixed.
2020-03-14 09:42:07 +01:00
Tim Duesterhus
cf6e0c8a83 MEDIUM: proxy_protocol: Support sending unique IDs using PPv2
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.

This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.

Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
2020-03-13 17:26:43 +01:00
Tim Duesterhus
d1b15b6e9b MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections
This patch reads a proxy protocol v2 provided unique ID and makes it
available using the `fc_pp_unique_id` fetch.
2020-03-13 17:25:23 +01:00
Tim Duesterhus
b435f77620 DOC: proxy_protocol: Reserve TLV type 0x05 as PP2_TYPE_UNIQUE_ID
This reserves and defines TLV type 0x05.
2020-03-13 17:25:23 +01:00
Olivier Houchard
84fd8a77b7 MINOR: lists: fix indentation.
Fix indentation in the recently added list_to_mt_list().
2020-03-11 21:41:13 +01:00
Olivier Houchard
8676514d4e MINOR: servers: Kill priv_conns.
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
2020-03-11 19:20:01 +01:00
Olivier Houchard
751e5e21a9 MINOR: lists: Implement function to convert list => mt_list and mt_list => list
Implement mt_list_to_list() and list_to_mt_list(), to be able to convert
from a struct list to a struct mt_list, and vice versa.
This is normally of no use, except for struct connection's list field, that
can go in either a struct list or a struct mt_list.
2020-03-11 17:10:40 +01:00
Olivier Houchard
49983a9fe1 MINOR: mt_lists: Appease gcc.
gcc is confused, and think p may end up being NULL in _MT_LIST_RELINK_DELETED.
It should never happen, so let gcc know that.
2020-03-11 17:10:08 +01:00
Willy Tarreau
638698da37 BUILD: stream-int: fix a few includes dependencies
The stream-int code doesn't need to load server.h as it doesn't use
servers at all. However removing this one reveals that proxy.h was
lacking types/checks.h that used to be silently inherited from
types/server.h loaded before in stream_interface.h.
2020-03-11 14:15:33 +01:00
Willy Tarreau
855796bdc8 BUG/MAJOR: list: fix invalid element address calculation
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:

    while (item != head) {
         item = LIST_ELEM(item.n);
    }

loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.

This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.

With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :

   	while ((listener = MT_LIST_POP(...)))

used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:

  53ded8:       48 8d 78 c0             lea    -0x40(%rax),%rdi
  53dedc:       48 83 f8 40             cmp    $0x40,%rax
  53dee0:       74 39                   je     53df1b <dequeue_proxy_listeners+0xab>

and now produces this one which is slightly more efficient as the
same register is used for both purposes:

  53dd08:       48 83 ef 40             sub    $0x40,%rdi
  53dd0c:       74 2d                   je     53dd3b <dequeue_proxy_listeners+0x9b>

Similarly, retrieving the channel from a stream_interface using si_ic()
and si_oc() used to cause this (stream-int in rdi):

    1cb7:       c7 47 1c 00 02 00 00    movl   $0x200,0x1c(%rdi)
    1cbe:       f6 47 04 10             testb  $0x10,0x4(%rdi)
    1cc2:       74 1c                   je     1ce0 <si_report_error+0x30>
    1cc4:       48 81 ef 00 03 00 00    sub    $0x300,%rdi
    1ccb:       81 4f 10 00 08 00 00    orl    $0x800,0x10(%rdi)

and now causes this:

    1cb7:       c7 47 1c 00 02 00 00    movl   $0x200,0x1c(%rdi)
    1cbe:       f6 47 04 10             testb  $0x10,0x4(%rdi)
    1cc2:       74 1c                   je     1ce0 <si_report_error+0x30>
    1cc4:       81 8f 10 fd ff ff 00    orl    $0x800,-0x2f0(%rdi)

There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.

This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.
2020-03-11 14:12:51 +01:00
Olivier Houchard
1d117e3dcd BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL;
In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable
instead of using the provided pointer directly, we shouldn't have changed
the code that set the pointer to NULL, as we really want the pointer
provided to be nullified, otherwise other parts of the code won't know
we just deleted an element, and bad things will happen.

This should be backported to 2.1.
2020-03-10 17:45:05 +01:00
Willy Tarreau
9a0dfa5298 CLEANUP: remove the now unused common/syscall.h
It was added 9 years ago to implement USE_MY_SPLICE on some libcs where
syscall() was bogus. It's about time to get rid of this.
2020-03-10 07:28:46 +01:00
Willy Tarreau
06c63aec95 CLEANUP: remove support for USE_MY_SPLICE
The splice() syscall has been supported in glibc since version 2.5 issued
in 2006 and is present on supported systems so there's no need for having
our own arch-specific syscall definitions anymore.
2020-03-10 07:23:41 +01:00
Willy Tarreau
3858b122a6 CLEANUP: remove support for USE_MY_EPOLL
This was made to support epoll on patched 2.4 kernels, and on early 2.6
using alternative libcs thanks to the arch-specific syscall definitions.
All the features we support have been around since 2.6.2 and present in
glibc since 2.3.2, neither of which are found in field anymore. Let's
simply drop this and use epoll normally.
2020-03-10 07:08:10 +01:00
Willy Tarreau
618ac6ea52 CLEANUP: drop support for USE_MY_ACCEPT4
The accept4() syscall has been present for a while now, there is no more
reason for maintaining our own arch-specific syscall implementation for
systems lacking it in libc but having it in the kernel.
2020-03-10 07:02:46 +01:00
Willy Tarreau
c3e926bf3b CLEANUP: remove support for Linux i686 vsyscalls
This was introduced 10 years ago to squeeze a few CPU cycles per syscall
on 32-bit x86 machines and was already quite old by then, requiring to
explicitly enable support for this in the kernel. We don't even know if
it still builds, let alone if it works at all on recent kernels! Let's
completely drop this now.
2020-03-10 06:55:52 +01:00
William Lallemand
6763016866 BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters
Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"),
HAProxy generates a list of filters based on the sni_ctx in memory.
However it's not always relevant, sometimes no filters were configured
and the CN/SAN in the new certificate are not the same.

This patch fixes the issue by using a flag filters in the ckch_inst, so
we are able to know if there were filters or not. In the late case it
uses the CN/SAN of the new certificate to generate the sni_ctx.

note: filters are still only used in the crt-list atm.
2020-03-09 17:32:04 +01:00
William Lallemand
0a52846603 CLEANUP: ssl: is_default is a bit in ckch_inst
The field is_default becomes a bit in the ckch_inst structure.
2020-03-09 17:32:04 +01:00
Miroslav Zagorac
d7dc67ba1d CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions
Shifting the variable 'a' one bit to the right has no effect on the
result of the functions.
2020-03-09 14:47:27 +01:00
Willy Tarreau
ee3bcddef7 MINOR: tools: add a generic function to generate UUIDs
We currently have two UUID generation functions, one for the sample
fetch and the other one in the SPOE filter. Both were a bit complicated
since they were made to support random() implementations returning an
arbitrary number of bits, and were throwing away 33 bits every 64. Now
we don't need this anymore, so let's have a generic function consuming
64 bits at once and use it as appropriate.
2020-03-08 18:04:16 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
2020-03-08 10:09:02 +01:00
Willy Tarreau
7a40909c00 MINOR: tools: add 64-bit rotate operators
This adds rotl64/rotr64 to rotate a 64-bit word by an arbitrary number
of bits. It's mainly aimed at being used with constants.
2020-03-08 00:42:18 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
6c3a681bd6 BUG/MEDIUM: random: initialize the random pool a bit better
Since the UUID sample fetch was created, some people noticed that in
certain virtualized environments they manage to get exact same UUIDs
on different instances started exactly at the same moment. It turns
out that the randoms were only initialized to spread the health checks
originally, not to provide "clean" randoms.

This patch changes this and collects more randomness from various
sources, including existing randoms, /dev/urandom when available,
RAND_bytes() when OpenSSL is available, as well as the timing for such
operations, then applies a SHA1 on all this to keep a 160 bits random
seed available, 32 of which are passed to srandom().

It's worth mentioning that there's no clean way to pass more than 32
bits to srandom() as even initstate() provides an opaque state that
must absolutely not be tampered with since known implementations
contain state information.

At least this allows to have up to 4 billion different sequences
from the boot, which is not that bad.

Note that the thread safety was still not addressed, which is another
issue for another patch.

This must be backported to all versions containing the UUID sample
fetch function, i.e. as far as 2.0.
2020-03-07 06:11:11 +01:00
Willy Tarreau
5a421a8f49 BUILD: listener: types/listener.h must not include standard.h
It's only a type definition, this header is not needed and causes
some circular dependency issues.
2020-03-07 06:07:18 +01:00
Willy Tarreau
c7f64e7a58 BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h
This is needed for div_64_32() which is there and currently accidently
inherited via global.h!
2020-03-07 06:07:18 +01:00
Willy Tarreau
f23e029409 BUILD: global: must not include common/standard.h but only types/freq_ctr.h
This one was accidently inherited and used to work but causes a circular
dependency.
2020-03-07 06:07:18 +01:00
Willy Tarreau
8dd0d55efe BUILD: ssl: include mini-clist.h
We use some list definitions and we don't include this header which
is in fact accidently inherited from others, causing a circular
dependency issue.
2020-03-07 06:07:18 +01:00
Willy Tarreau
a8561db936 BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h
buffer.h relies on proto/activity because it contains some code and not
just type definitions. It must not be included from types files. It
should probably also be split in two if it starts to include a proto.
This causes some circular dependencies at other places.
2020-03-07 06:07:18 +01:00
Christopher Faulet
d8f0e073dd MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY
This flag was used in some internal functions to be sure the current stream is
able to handle HTTP content. It was introduced when the legacy HTTP code was
still there. Now, It is possible to rely on stream's flags to be sure we have an
HTX stream.

So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is
replaced by a call to the IS_HTX_STRM() macro.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
1cdceb9365 MINOR: htx: Add a function to return a block at a specific offset
The htx_find_offset() function may be used to look for a block at a specific
offset in an HTX message, starting from the message head. A compound result is
returned, an htx_ret structure, with the found block and the position of the
offset in the block. If the offset is ouside of the HTX message, the returned
block is NULL.
2020-03-06 14:12:59 +01:00
Christopher Faulet
251f4917c3 MINOR: buf: Add function to insert a string at an absolute offset in a buffer
The b_insert_blk() function may now be used to insert a string, given a pointer
and the string length, at an absolute offset in a buffer, moving data between
this offset and the buffer's tail just after the end of the inserted string. The
buffer's length is automatically updated. This function supports wrapping. All
the string is copied or nothing. So it returns 0 if there are not enough space
to perform the copy. Otherwise, the number of bytes copied is returned.
2020-03-06 14:12:59 +01:00
Carl Henrik Lunde
f91ac19299 OPTIM: startup: fast unique_id allocation for acl.
pattern_finalize_config() uses an inefficient algorithm which is a
problem with very large configuration files. This affects startup, and
therefore reload time. When haproxy is deployed as a router in a
Kubernetes cluster the generated configuration file may be large and
reloads are frequently occuring, which makes this a significant issue.

The old algorithm is O(n^2)
* allocate missing uids - O(n^2)
* sort linked list - O(n^2)

The new algorithm is O(n log n):
* find the user allocated uids - O(n)
* store them for efficient lookup - O(n log n)
* allocate missing uids - n times O(log n)
* sort all uids - O(n log n)
* convert back to linked list - O(n)

Performance examples, startup time in seconds:

    pat_refs old     new
    1000      0.02   0.01
    10000     2.1    0.04
    20000    12.3    0.07
    30000    27.9    0.10
    40000    52.5    0.14
    50000    77.5    0.17

Please backport to 1.8, 2.0 and 2.1.
2020-03-06 08:11:58 +01:00
Tim Duesterhus
a17e66289c MEDIUM: stream: Make the unique_id member of struct stream a struct ist
The `unique_id` member of `struct stream` now is a `struct ist`.
2020-03-05 20:21:58 +01:00
Tim Duesterhus
0643b0e7e6 MINOR: proxy: Make header_unique_id a struct ist
The `header_unique_id` member of `struct proxy` now is a `struct ist`.
2020-03-05 19:58:22 +01:00
Tim Duesterhus
9576ab7640 MINOR: ist: Add struct ist istdup(const struct ist)
istdup() performs the equivalent of strdup() on a `struct ist`.
2020-03-05 19:53:12 +01:00
Tim Duesterhus
35005d01d2 MINOR: ist: Add struct ist istalloc(size_t) and void istfree(struct ist*)
`istalloc` allocates memory and returns an `ist` with the size `0` that points
to this allocation.

`istfree` frees the pointed memory and clears the pointer.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
e296d3e5f0 MINOR: ist: Add int isttest(const struct ist)
`isttest` returns whether the `.ptr` is non-null.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
241e29ef9c MINOR: ist: Add IST_NULL macro
`IST_NULL` is equivalent to an `struct ist` with `.ptr = NULL` and
`.len = 0`.
2020-03-05 19:52:07 +01:00
William Lallemand
cfca1422c7 MINOR: ssl: reach a ckch_store from a sni_ctx
It was only possible to go down from the ckch_store to the sni_ctx but
not to go up from the sni_ctx to the ckch_store.

To allow that, 2 pointers were added:

- a ckch_inst pointer in the struct sni_ctx
- a ckckh_store pointer in the struct ckch_inst
2020-03-05 11:28:42 +01:00
William Lallemand
38df1c8006 MINOR: ssl/cli: support crt-list filters
Generate a list of the previous filters when updating a certificate
which use filters in crt-list. Then pass this list to the function
generating the sni_ctx during the commit.

This feature allows the update of the crt-list certificates which uses
the filters with "set ssl cert".

This function could be probably replaced by creating a new
ckch_inst_new_load_store() function which take the previous sni_ctx list as
an argument instead of the char **sni_filter, avoiding the
allocation/copy during runtime for each filter. But since are still
handling the multi-cert bundles, it's better this way to avoid code
duplication.
2020-03-05 11:27:53 +01:00
Tim Duesterhus
127a74dd48 MINOR: stream: Add stream_generate_unique_id function
Currently unique IDs for a stream are generated using repetitive code in
multiple locations, possibly allowing for inconsistent behavior.
2020-03-05 07:23:00 +01:00
Willy Tarreau
899e5f69a1 MINOR: debug: use our own backtrace function on clang+x86_64
A test on FreeBSD with clang 4 to 8 produces this on a call to a
spinning loop on the CLI:

  call trace(5):
  |       0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
  |    0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e

with our own function it correctly produces this:

  call trace(20):
  |       0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
  |    0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e
  |    0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef
  | 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3
  |    0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199
  |    0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99
  |    0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12
  |       0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a
  |       0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b
  |       0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8
  |       0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1

So let's add clang+x86_64 to the list of platforms that will use our
simplified version. As a bonus it will not require to link with
-lexecinfo on FreeBSD and will work out of the box when passing
USE_BACKTRACE=1.
2020-03-04 12:04:07 +01:00
Willy Tarreau
13faf16e1e MINOR: debug: improve backtrace() on aarch64 and possibly other systems
It happens that on aarch64 backtrace() only returns one entry (tested
with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding
the stack due to the risk of hitting a bad pointer. Here we can use
may_access() to know when it's safe, so we can actually unwind the stack
without taking risks. It happens that the faulting function (the one
just after the signal handler) is not listed here, very likely because
the signal handler uses a special stack and did not create a new frame.

So this patch creates a new my_backtrace() function in standard.h that
either calls backtrace() or does its own unrolling. The choice depends
on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build
target.
2020-03-04 12:04:07 +01:00
Emmanuel Hocdet
842e94ee06 MINOR: ssl: add "ca-verify-file" directive
It's only available for bind line. "ca-verify-file" allows to separate
CA certificates from "ca-file". CA names sent in server hello message is
only compute from "ca-file". Typically, "ca-file" must be defined with
intermediate certificates and "ca-verify-file" with certificates to
ending the chain, like root CA.

Fix issue #404.
2020-03-04 11:53:11 +01:00
Willy Tarreau
eb8b1ca3eb MINOR: tools: add resolve_sym_name() to resolve function pointers
We use various hacks at a few places to try to identify known function
pointers in debugging outputs (show threads & show fd). Let's centralize
this into a new function dedicated to this. It already knows about the
functions matched by "show threads" and "show fd", and when built with
USE_DL, it can rely on dladdr1() to resolve other functions. There are
some limitations, as static functions are not resolved, linking with
-rdynamic is mandatory, and even then some functions will not necessarily
appear. It's possible to do a better job by rebuilding the whole symbol
table from the ELF headers in memory but it's less portable and the gains
are still limited, so this solution remains a reasonable tradeoff.
2020-03-03 18:18:40 +01:00
Willy Tarreau
762fb3ec8e MINOR: tools: add new function dump_addr_and_bytes()
This function dumps <n> bytes from <addr> in hex form into buffer <buf>
enclosed in brackets after the address itself, formatted on 14 chars
including the "0x" prefix. This is meant to be used as a prefix for code
areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: "
It relies on may_access() to know if the bytes are dumpable, otherwise "--"
is emitted. An optional prefix is supported.
2020-03-03 17:46:37 +01:00
Willy Tarreau
27d00c0167 MINOR: task: export run_tasks_from_list
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
3ebd55ee51 MINOR: haproxy: export run_poll_loop
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
1827845a3d MINOR: haproxy: export main to ease access from debugger
Better just export main instead of declaring it as extern, it's cleaner
and may be usable elsewhere.
2020-03-03 15:26:10 +01:00
Willy Tarreau
1ed3781e21 MINOR: fd: merge the read and write error bits into RW error
We always set them both, which makes sense since errors at the FD level
indicate a terminal condition for the socket that cannot be recovered.
Usually this is detected via a write error, but sometimes such an error
may asynchronously be reported on the read side. Let's simplify this
using only the write bit and calling it RW since it's used like this
everywhere, and leave the R bit spare for future use.
2020-02-28 07:42:29 +01:00
Willy Tarreau
a135ea63a6 CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags
There's no point in trying to be too generic for these flags as the
read and write sides will soon differ a bit. Better explicitly define
the flags for each direction without trying to be direction-agnostic.
this clarifies the code and removes some defines.
2020-02-28 07:42:29 +01:00
Willy Tarreau
f80fe832b1 CLEANUP: fd: remove the FD_EV_STATUS aggregate
This was used only by fd_recv_state() and fd_send_state(), both of
which are unused. This will not work anymore once recv and send flags
start to differ, so let's remove this.
2020-02-28 07:42:29 +01:00
Jerome Magnin
967d3cc105 BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits
commit c87e46881 ("MINOR: http-rules: Add a flag on redirect rules to know the
rule direction") introduced a new flag for redirect rules, but its value has
bits in common with REDIRECT_FLAG_DROP_QS, which makes us enter this code path
in http_apply_redirect_rule(), which will then drop the query string.
To fix this, just give REDIRECT_FLAG_FROM_REQ its own unique value.

This must be backported where c87e468816 is backported.

This should fix issue 521.
2020-02-27 23:44:41 +01:00
Willy Tarreau
2104659cd5 MEDIUM: buffer: remove the buffer_wq lock
This lock was only needed to protect the buffer_wq list, but now we have
the mt_list for this. This patch simply turns the buffer_wq list to an
mt_list and gets rid of the lock.

It's worth noting that the whole buffer_wait thing still looks totally
wrong especially in a threaded context: the wakeup_cb() callback is
called synchronously from any thread and may end up calling some
connection code that was not expected to run on a given thread. The
whole thing should probably be reworked to use tasklets instead and be
a bit more centralized.
2020-02-26 10:39:36 +01:00
William Lallemand
e0f3fd5b4c CLEANUP: ssl: move issuer_chain tree and definition
Move the cert_issuer_tree outside the global_ssl structure since it's
not a configuration variable. And move the declaration of the
issuer_chain structure in types/ssl_sock.h
2020-02-25 15:06:40 +01:00
Willy Tarreau
226ef26056 MINOR: compiler: add new alignment macros
This commit adds ALWAYS_ALIGN(), MAYBE_ALIGN() and ATOMIC_ALIGN() to
be placed as delimitors inside structures to force alignment to a
given size. These depend on the architecture's capabilities so that
it is possible to always align, align only on archs not supporting
unaligned accesses at all, or only on those not supporting them for
atomic accesses (e.g. before a lock).
2020-02-25 10:34:43 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
2020-02-25 08:16:33 +01:00
Willy Tarreau
03e7853581 BUILD: remove obsolete support for -mregparm / USE_REGPARM
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers
so better get rid of this once for all.
2020-02-25 07:41:47 +01:00
Tim Duesterhus
1d48ba91d7 CLEANUP: net_helper: Do not negate the result of unlikely
This patch turns the double negation of 'not unlikely' into 'likely'
and then turns the negation of 'not smaller' into 'greater or equal'
in an attempt to improve readability of the condition.

[wt: this was not a bug but purposely written like this to improve code
 generation on older compilers but not needed anymore as described here:
 https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]
2020-02-25 07:30:49 +01:00
Tim Duesterhus
927063b892 CLEANUP: conn: Do not pass a pointer to likely
Move the `!` inside the likely and negate it to unlikely.

The previous version should not have caused issues, because it is converted
to a boolean / integral value before being passed to __builtin_expect(), but
it's certainly unusual.

[wt: this was not a bug but purposely written like this to improve code
 generation on older compilers but not needed anymore as described here:
 https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]
2020-02-25 07:30:49 +01:00
Willy Tarreau
89ee79845c MINOR: compiler: drop special cases of likely/unlikely for older compilers
We used to special-case the likely()/unlikely() macros for a series of
early gcc 4.x compilers which used to produce very bad code when using
__builtin_expect(x,1), which basically used to build an integer (0 or 1)
from a condition then compare it to integer 1. This was already fixed in
5.x, but even now, looking at the code produced by various flavors of 4.x
this bad behavior couldn't be witnessed anymore. So let's consider it as
fixed by now, which will allow to get rid of some ugly tricks at some
specific places. A test on 4.7.4 shows that the code shrinks by about 3kB
now, thanks to some tests being inlined closer to the call place and the
unlikely case being moved to real functions. See the link below for more
background on this.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html
2020-02-25 07:29:55 +01:00
Willy Tarreau
0e2686762f MINOR: compiler: move CPU capabilities definition from config.h and complete them
These ones are irrelevant to the config but rather to the platform, and
as such are better placed in compiler.h.

Here we take the opportunity for declaring a few extra capabilities:
 - HA_UNALIGNED         : CPU supports unaligned accesses
 - HA_UNALIGNED_LE      : CPU supports unaligned accesses in little endian
 - HA_UNALIGNED_FAST    : CPU supports fast unaligned accesses
 - HA_UNALIGNED_ATOMIC  : CPU supports unaligned accesses in atomics

This will help remove a number of #ifdefs with arch-specific statements.
2020-02-21 16:32:57 +01:00
Jerome Magnin
9dde0b2d31 MINOR: ist: add an iststop() function
Add a function that finds a character in an ist and returns an
updated ist with the length of the portion of the original string
that doesn't contain the char.

Might be backported to 2.1
2020-02-21 11:47:25 +01:00
Willy Tarreau
716bec2dc6 MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE
This flag is currently supported by raw_sock to perform a single recv()
attempt and avoid subscribing. Typically on the request and response
paths with keep-alive, with short messages we know that it's very likely
that the first message is enough.
2020-02-21 11:22:45 +01:00
Willy Tarreau
5d4d1806db CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv}
This marks the end of the transition from the connection polling states
introduced in 1.5-dev12 and the subscriptions in that arrived in 1.9.
The socket layer can now safely use its FD while all upper layers rely
exclusively on subscriptions. These old functions were removed. Some may
deserve some renaming to improved clarty though. The single call to
conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling()
which already does the same.
2020-02-21 11:21:12 +01:00
Willy Tarreau
d1d14c3157 MINOR: connection: remove the last calls to conn_xprt_{want,stop}_*
The last few calls to conn_xprt_{want,stop}_{recv,send} in the central
connection code were replaced with their strictly exact equivalent fd_*,
adding the call to conn_ctrl_ready() when it was missing.
2020-02-21 11:21:12 +01:00
Willy Tarreau
19bc201c9f MEDIUM: connection: remove the intermediary polling state from the connection
Historically we used to require that the connections held the desired
polling states for the data layer and the socket layer. Then with muxes
these were more or less merged into the transport layer, and now it
happens that with all transport layers having their own state, the
"transport layer state" as we have it in the connection (XPRT_RD_ENA,
XPRT_WR_ENA) is only an exact copy of the undelying file descriptor
state, but with a delay. All of this is causing some difficulties at
many places in the code because there are still some locations which
use the conn_want_* API to remain clean and only rely on connection,
and count on a later collection call to conn_cond_update_polling(),
while others need an immediate action and directly use the FD updates.

Since our updates are now much cheaper, most of them being only an
atomic test-and-set operation, and since our I/O callbacks are deferred,
there's no benefit anymore in trying to "cache" the transient state
change in the connection flags hoping to cancel them before they
become an FD event. Better make such calls transparent indirections
to the FD layer instead and get rid of the deferred operations which
needlessly complicate the logic inside.

This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE.
A number of functions related to polling updates were either greatly
simplified or removed.

Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data
were expected to be sent after a PROXY protocol or SOCKSv4 header. These
ones were simply replaced with a check on the subscription which is
where we ought to get the autoritative information from.

Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts
are the same. conn_stop_polling() and conn_xprt_stop_both() are the
same as well. conn_cond_update_polling() only causes errors to stop
polling. It also becomes way more obvious that muxes should not at
all employ conn_xprt_{want|stop}_{recv,send}(), and that the call
to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer
is inappropriate, it ought to unsubscribe from reads instead. All of
this definitely requires a serious cleanup.
2020-02-21 11:21:12 +01:00
Christopher Faulet
727a3f1ca3 MINOR: http-htx: Add a function to retrieve the headers size of an HTX message
http_get_hdrs_size() function may now be used to get the bytes held by headers
in an HTX message. It only works if the headers were not already
forwarded. Metadata are not counted here.
2020-02-18 11:19:57 +01:00
Willy Tarreau
a71667c07d BUG/MINOR: tools: also accept '+' as a valid character in an identifier
The function is_idchar() was added by commit 36f586b ("MINOR: tools:
add is_idchar() to tell if a char may belong to an identifier") to
ease matching of sample fetch/converter names. But it lacked support
for the '+' character used in "base32+src" and "url32+src". A quick
way to figure the list of supported sample fetch+converter names is
to issue the following command:

   git grep '"[^"]*",.*SMP_T_.*SMP_USE_'|cut -f2 -d'"'|sort -u

No more entry is reported once searching for characters not covered
by is_idchar().

No backport is needed.
2020-02-17 06:37:40 +01:00
Willy Tarreau
e3b57bf92f MINOR: sample: make sample_parse_expr() able to return an end pointer
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
2020-02-14 19:02:06 +01:00
Willy Tarreau
80b53ffb1c MEDIUM: arg: make make_arg_list() stop after its own arguments
The main problem we're having with argument parsing is that at the
moment the caller looks for the first character looking like an end
of arguments (')') and calls make_arg_list() on the sub-string inside
the parenthesis.

Let's first change the way it works so that make_arg_list() also
consumes the parenthesis and returns the pointer to the first char not
consumed. This will later permit to refine each argument parsing.

For now there is no functional change.
2020-02-14 19:02:06 +01:00
Willy Tarreau
d4ad669051 MINOR: chunk: implement chunk_strncpy() to copy partial strings
This does like chunk_strcpy() except that the maximum string length may
be limited by the caller. A trailing zero is always appended. This is
particularly handy to extract portions of strings to put into the trash
for use with libc functions requiring a nul-terminated string.
2020-02-14 19:02:06 +01:00
Willy Tarreau
36f586b694 MINOR: tools: add is_idchar() to tell if a char may belong to an identifier
This function will simply be used to find the end of config identifiers
(proxies, servers, ACLs, sample fetches, converters, etc).
2020-02-14 19:02:06 +01:00
Ilya Shipitsin
88a2f0304c CLEANUP: ssl: remove unused functions in openssl-compat.h
functions SSL_SESSION_get0_id_context, SSL_CTX_get_default_passwd_cb,
SSL_CTX_get_default_passwd_cb_userdata are not used anymore
2020-02-14 16:15:00 +01:00
Willy Tarreau
160ad9e38a CLEANUP: mini-clist: simplify nested do { while(1) {} } while (0)
While looking for other occurrences of do { continue; } while (0) I
found these few leftovers in mini-clist where an outer loop was made
around "do { } while (0)" then another loop was placed inside just to
handle the continue. Let's clean this up by just removing the outer
one. Most of the patch is only the inner part of the loop that is
reindented. It was verified that the resulting code is the same.
2020-02-11 10:27:04 +01:00
Christopher Faulet
7716cdf450 MINOR: lua: Get the action return code on the stack when an action finishes
When an action successfully finishes, the action return code (ACT_RET_*) is now
retrieve on the stack, ff the first element is an integer. In addition, in
hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before
exiting. Thus, when a script uses this function, the corresponding action still
finishes with the good code. Thanks to this change, the flag HLUA_STOP is now
useless. So it has been removed.

It is a mandatory step to allow a lua action to return any action return code.
2020-02-06 15:13:03 +01:00
Christopher Faulet
07a718e712 CLEANUP: lua: Remove consistency check for sample fetches and actions
It is not possible anymore to alter the HTTP parser state from lua sample
fetches or lua actions. So there is no reason to still check for the parser
state consistency.
2020-02-06 15:13:03 +01:00
Christopher Faulet
4a2c142779 MEDIUM: http-rules: Support extra headers for HTTP return actions
It is now possible to append extra headers to the generated responses by HTTP
return actions, while it is not based on an errorfile. For return actions based
on errorfiles, these extra headers are ignored. To define an extra header, a
"hdr" argument must be used with a name and a value. The value is a log-format
string. For instance:

  http-request status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"
2020-02-06 15:13:03 +01:00
Christopher Faulet
24231ab61f MEDIUM: http-rules: Add the return action to HTTP rules
Thanks to this new action, it is now possible to return any responses from
HAProxy, with any status code, based on an errorfile, a file or a string. Unlike
the other internal messages generated by HAProxy, these ones are not interpreted
as errors. And it is not necessary to use a file containing a full HTTP
response, although it is still possible. In addition, using a log-format string
or a log-format file, it is possible to have responses with a dynamic
content. This action can be used on the request path or the response path. The
only constraint is to have a responses smaller than a buffer. And to avoid any
warning the buffer space reserved to the headers rewritting should also be free.

When a response is returned with a file or a string as payload, it only contains
the content-length header and the content-type header, if applicable. Here are
examples:

  http-request return content-type image/x-icon file /var/www/favicon.ico  \
      if { path /favicon.ico }

  http-request return status 403 content-type text/plain    \
      lf-string "Access denied. IP %[src] is blacklisted."  \
      if { src -f /etc/haproxy/blacklist.lst }
2020-02-06 15:12:54 +01:00
Christopher Faulet
6d0c3dfac6 MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding
This patch introduces the 'http-after-response' rules. These rules are evaluated
at the end of the response analysis, just before the data forwarding, on ALL
HTTP responses, the server ones but also all responses generated by
HAProxy. Thanks to this ruleset, it is now possible for instance to add some
headers to the responses generated by the stats applet. Following actions are
supported :

   * allow
   * add-header
   * del-header
   * replace-header
   * replace-value
   * set-header
   * set-status
   * set-var
   * strict-mode
   * unset-var
2020-02-06 14:55:34 +01:00
Christopher Faulet
ef70e25035 MINOR: http-ana: Add a function for forward internal responses
Operations performed when internal responses (redirect/deny/auth/errors) are
returned are always the same. The http_forward_proxy_resp() function is added to
group all of them under a unique function.
2020-02-06 14:55:34 +01:00
Christopher Faulet
72c7d8d040 MINOR: http-ana: Rely on http_reply_and_close() to handle server error
The http_server_error() function now relies on http_reply_and_close(). Both do
almost the same actions. In addtion, http_server_error() sets the error flag and
the final state flag on the stream.
2020-02-06 14:55:34 +01:00
Christopher Faulet
c87e468816 MINOR: http-rules: Add a flag on redirect rules to know the rule direction
HTTP redirect rules can be evaluated on the request or the response path. So
when a redirect rule is evaluated, it is important to have this information
because some specific processing may be performed depending on the direction. So
the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the
redirect rule during the parsing.

This patch is mandatory to fix a bug on redirect rule. It must be backported to
all stable versions.
2020-02-06 14:55:34 +01:00
Christopher Faulet
a4168434a7 MINOR: dns: Dynamically allocate dns options to reduce the act_rule size
<.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated
when a do-resolve rule is parsed. This drastically reduces the structure size.
2020-02-06 14:55:34 +01:00
Christopher Faulet
7651362e52 MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer
The channel_htx_copy_msg() function can now be used to copy an HTX message in a
channel's buffer. This function takes care to not overwrite existing data.

This patch depends on the commit "MINOR: htx: Add a function to append an HTX
message to another one". Both are mandatory to fix a bug in
http_reply_and_close() function. Be careful to backport both first.
2020-02-06 14:55:16 +01:00
Christopher Faulet
0ea0c86753 MINOR: htx: Add a function to append an HTX message to another one
the htx_append_msg() function can now be used to append an HTX message to
another one. All the message is copied or nothing. If an error occurs during the
copy, all changes are rolled back.

This patch is mandatory to fix a bug in http_reply_and_close() function. Be
careful to backport it first.
2020-02-06 14:54:47 +01:00
Olivier Houchard
1c7c0d6b97 BUG/MAJOR: memory: Don't forget to unlock the rwlock if the pool is empty.
In __pool_get_first(), don't forget to unlock the pool lock if the pool is
empty, otherwise no writer will be able to take the lock, and as it is done
when reloading, it leads to an infinite loop on reload.

This should be backported with commit 04f5fe87d3
2020-02-03 13:05:31 +01:00
Olivier Houchard
04f5fe87d3 BUG/MEDIUM: memory: Add a rwlock before freeing memory.
When using lockless pools, add a new rwlock, flush_pool. read-lock it when
getting memory from the pool, so that concurrenct access are still
authorized, but write-lock it when we're about to free memory, in
pool_flush() and pool_gc().
The problem is, when removing an item from the pool, we unreference it
to get the next one, however, that pointer may have been free'd in the
meanwhile, and that could provoke a crash if the pointer has been unmapped.
It should be OK to use a rwlock, as normal operations will still be able
to access the pool concurrently, and calls to pool_flush() and pool_gc()
should be pretty rare.

This should be backported to 2.1, 2.0 and 1.9.
2020-02-01 18:08:34 +01:00
Willy Tarreau
b30a153cd1 MINOR: task: detect self-wakeups on tl==sched->current instead of TASK_RUNNING
This is exactly what we want to detect (a task/tasklet waking itself),
so let's use the proper condition for this.
2020-01-31 17:45:10 +01:00
Willy Tarreau
bb238834da MINOR: task: permanently flag tasklets waking themselves up
Commit a17664d829 ("MEDIUM: tasks: automatically requeue into the bulk
queue an already running tasklet") tried to inflict a penalty to
self-requeuing tasks/tasklets which correspond to those involved in
large, high-latency data transfers, for the benefit of all other
processing which requires a low latency. However, it turns out that
while it ought to do this on a case-by-case basis, basing itself on
the RUNNING flag isn't accurate because this flag doesn't leave for
tasklets, so we'd rather need a distinct flag to tag such tasklets.

This commit introduces TASK_SELF_WAKING to mark tasklets acting like
this. For now it's still set when TASK_RUNNING is present but this
will have to change. The flag is kept across wakeups.
2020-01-31 17:45:10 +01:00
Willy Tarreau
a17664d829 MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet
When a tasklet re-runs itself such as in this chain:

   si_cs_io_cb -> si_cs_process -> si_notify -> si_chk_rcv

then we know it can easily clobber the run queue and harm latency. Now
what the scheduler does when it detects this is that such a tasklet is
automatically placed into the bulk list so that it's processed with the
remaining CPU bandwidth only. Thanks to this the CLI becomes instantly
responsive again even under heavy stress at 50 Gbps over 40kcon and
100% CPU on 16 threads.
2020-01-30 19:03:31 +01:00
Willy Tarreau
a62917b890 MEDIUM: tasks: implement 3 different tasklet classes with their own queues
We used to mix high latency tasks and low latency tasklets in the same
list, and to even refill bulk tasklets there, causing some unfairness
in certain situations (e.g. poll-less transfers between many connections
saturating the machine with similarly-sized in and out network interfaces).

This patch changes the mechanism to split the load into 3 lists depending
on the task/tasklet's desired classes :
  - URGENT: this is mainly for tasklets used as deferred callbacks
  - NORMAL: this is for regular tasks
  - BULK: this is for bulk tasks/tasklets

Arbitrary ratios of max_processed are picked from each of these lists in
turn, with the ability to complete in one list from what was not picked
in the previous one. After some quick tests, the following setup gave
apparently good results both for raw TCP with splicing and for H2-to-H1
request rate:

  - 0 to 75% for urgent
  - 12 to 50% for normal
  - 12 to what remains for bulk

Bulk is not used yet.
2020-01-30 18:59:33 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Willy Tarreau
4450b587dd MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE
Most places continue to check CO_FL_HANDSHAKE while in fact they should
check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one
dedicated to SSL renegotiation. In fact the SSL layer should be the
only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data
when a renegotiation is in progress, but other ones randomly include it
without knowing. And ideally it should even be an internal flag that's
not exposed in the connection.

This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag
consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL.

In order to limit the confusion that has accumulated over time, the
CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake,
possibly used by a renegotiation was moved after the other ones.
2020-01-23 16:34:26 +01:00
Willy Tarreau
c192b0ab95 MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.

Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.

So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.

This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:

  - while most places do check both L4+L6 and HANDSHAKE at the same time,
    some places like assign_server() or back_handle_st_con() and a few
    sample fetches looking for proxy protocol do check for L4+L6 but
    don't care about HANDSHAKE ; these ones will probably fail on TCP
    request session rules if the handshake is not complete.

  - some handshake handlers do validate that a connection is established
    at L4 but didn't clear CO_FL_WAIT_L4_CONN

  - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
    before declaring the mux ready while the snd_buf function also checks
    for the handshake's completion. Likely the former should validate the
    handshake as well and we should get rid of these extra tests in snd_buf.

  - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
    later clear CO_FL_WAIT_L4_CONN.

  - xprt_handshake would set CO_FL_CONNECTED itself without actually
    clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
    waiting for a pure Rx handshake.

  - most places in ssl_sock that were checking CO_FL_CONNECTED don't need
    to include the L4 check as an L6 check is enough to decide whether to
    wait for more info or not.

It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.

Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
2020-01-23 14:41:37 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Olivier Houchard
8af03b396a MEDIUM: streams: Always create a conn_stream in connect_server().
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
we have less special-casing needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
2020-01-22 18:55:59 +01:00
Emmanuel Hocdet
6b5b44e10f BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent
"set ssl cert <filename> <payload>" CLI command should have the same
result as reload HAproxy with the updated pem file (<filename>).
Is not the case, DHparams/cert-chain is kept from the previous
context if no DHparams/cert-chain is set in the context (<payload>).

This patch should be backport to 2.1
2020-01-22 15:55:55 +01:00
Adis Nezirovic
1a693fc2fd MEDIUM: cli: Allow multiple filter entries for "show table"
For complex stick tables with many entries/columns, it can be beneficial
to filter using multiple criteria. The maximum number of filter entries
can be controlled by defining STKTABLE_FILTER_LEN during build time.

This patch can be backported to older releases.
2020-01-22 14:33:17 +01:00
Ilya Shipitsin
056c629531 BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x
while working on issue #429, I encountered build failures with various
non-released openssl versions, let us improve ssl defines, switch to
features, not versions, for EVP_CTRL_AEAD_SET_IVLEN and
EVP_CTRL_AEAD_SET_TAG.

No backport is needed as there is no valid reason to build a stable haproxy
version against a development version of openssl.
2020-01-22 07:54:52 +01:00
Willy Tarreau
2086365f51 CLEANUP: pattern: remove the pat_time definition
It was inherited from acl_time, introduced in 1.3.10 by commit a84d374367
("[MAJOR] new framework for generic ACL support") and was never ever used.
Let's simply drop it now.
2020-01-22 07:44:36 +01:00
Tim Duesterhus
6a0dd73390 CLEANUP: Consistently unsigned int for bitfields
Signed bitfields of size `1` hold the values `0` and `-1`, but are
usually assigned `1`, possibly leading to subtle bugs when the value
is explicitely compared against `1`.
2020-01-22 07:28:39 +01:00
Baptiste Assmann
13a9232ebc MEDIUM: dns: use Additional records from SRV responses
Most DNS servers provide A/AAAA records in the Additional section of a
response, which correspond to the SRV records from the Answer section:

  ;; QUESTION SECTION:
  ;_http._tcp.be1.domain.tld.     IN      SRV

  ;; ANSWER SECTION:
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A1.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A8.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A5.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A6.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A4.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A3.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A2.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A7.domain.tld.

  ;; ADDITIONAL SECTION:
  A1.domain.tld.          3600    IN      A       192.168.0.1
  A8.domain.tld.          3600    IN      A       192.168.0.8
  A5.domain.tld.          3600    IN      A       192.168.0.5
  A6.domain.tld.          3600    IN      A       192.168.0.6
  A4.domain.tld.          3600    IN      A       192.168.0.4
  A3.domain.tld.          3600    IN      A       192.168.0.3
  A2.domain.tld.          3600    IN      A       192.168.0.2
  A7.domain.tld.          3600    IN      A       192.168.0.7

SRV record support was introduced in HAProxy 1.8 and the first design
did not take into account the records from the Additional section.
Instead, a new resolution is associated to each server with its relevant
FQDN.
This behavior generates a lot of DNS requests (1 SRV + 1 per server
associated).

This patch aims at fixing this by:
- when a DNS response is validated, we associate A/AAAA records to
  relevant SRV ones
- set a flag on associated servers to prevent them from running a DNS
  resolution for said FADN
- update server IP address with information found in the Additional
  section

If no relevant record can be found in the Additional section, then
HAProxy will failback to running a dedicated resolution for this server,
as it used to do.
This behavior is the one described in RFC 2782.
2020-01-22 07:19:54 +01:00
Christopher Faulet
2f5339079b MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
It is now possible to insert any attribute when a cookie is inserted by
HAProxy. Any value may be set, no check is performed except the syntax validity
(CTRL chars and ';' are forbidden). For instance, it may be used to add the
SameSite attribute:

    cookie SRV insert attr "SameSite=Strict"

The attr option may be repeated to add several attributes.

This patch should fix the issue #361.
2020-01-22 07:18:31 +01:00
Christopher Faulet
554c0ebffd MEDIUM: http-rules: Support an optional error message in http deny rules
It is now possible to set the error message to use when a deny rule is
executed. It may be a specific error file, adding "errorfile <file>" :

  http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http

It may also be an error file from an http-errors section, adding "errorfiles
<name>" :

  http-request deny errorfiles my-errors  # use 403 error from "my-errors" section

When defined, this error message is set in the HTTP transaction. The tarpit rule
is also concerned by this change.
2020-01-20 15:18:46 +01:00
Christopher Faulet
473e880a25 MINOR: http-ana: Add an error message in the txn and send it when defined
It is now possible to set the error message to return to client in the HTTP
transaction. If it is defined, this error message is used instead of proxy's
errors or default errors.
2020-01-20 15:18:46 +01:00
Christopher Faulet
76edc0f29c MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy
It is now possible to import in a proxy, fully or partially, error files
declared in an http-errors section. It may be done using the "errorfiles"
directive, followed by a name and optionally a list of status code. If there is
no status code specified, all error files of the http-errors section are
imported. Otherwise, only error files associated to the listed status code are
imported. For instance :

  http-errors my-errors
      errorfile 400 ...
      errorfile 403 ...
      errorfile 404 ...

  frontend frt
      errorfiles my-errors 403 404  # ==> error 400 not imported
2020-01-20 15:18:46 +01:00
Christopher Faulet
35cd81d363 MINOR: http-htx: Add a new section to create groups of custom HTTP errors
A new section may now be declared in the configuration to create global groups
of HTTP errors. These groups are not linked to a proxy and are referenced by
name. The section must be declared using the keyword "http-errors" followed by
the group name. This name must be unique. A list of "errorfile" directives may
be declared in such section. For instance:

    http-errors website-1
        errorfile 400 /path/to/site1/400.http
        errorfile 404 /path/to/site1/404.http

    http-errors website-2
        errorfile 400 /path/to/site2/400.http
        errorfile 404 /path/to/site2/404.http

For now, it is just possible to create "http-errors" sections. There is no
documentation because these groups are not used yet.
2020-01-20 15:18:46 +01:00
Christopher Faulet
5885775de1 MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages
All custom HTTP errors are now stored in a global tree. Proxies use a references
on these messages. The key used for errorfile directives is the file name as
specified in the configuration. For errorloc directives, a key is created using
the redirect code and the url. This means that the same custom error message is
now stored only once. It may be used in several proxies or for several status
code, it is only parsed and stored once.
2020-01-20 15:18:46 +01:00
Christopher Faulet
bdf6526e94 MINOR: http-htx: Add functions to create HTX redirect message
http_parse_errorloc() may now be used to create an HTTP 302 or 303 redirect
message with a specific url passed as parameter. A parameter is used to known if
it is a 302 or a 303 redirect. A status code is passed as parameter. It must be
one of the supported HTTP error codes to be valid. Otherwise an error is
returned. It aims to be used to parse "errorloc" directives. It relies on
http_load_errormsg() to do most of the job, ie converting it in HTX.
2020-01-20 15:18:45 +01:00
Christopher Faulet
5031ef58ca MINOR: http-htx: Add functions to read a raw error file and convert it in HTX
http_parse_errorfile() may now be used to parse a raw HTTP message from a
file. A status code is passed as parameter. It must be one of the supported HTTP
error codes to be valid. Otherwise an error is returned. It aims to be used to
parse "errorfile" directives. It relies on http_load_errorfile() to do most of
the job, ie reading the file content and converting it in HTX.
2020-01-20 15:18:45 +01:00
Christopher Faulet
d73b96d48c MINOR: tcp-rules: Make tcp-request capture a custom action
Now, this action is use its own dedicated function and is no longer handled "in
place" during the TCP rules evaluation. Thus the action name ACT_TCP_CAPTURE is
removed. The action type is set to ACT_CUSTOM and a check function is used to
know if the rule depends on request contents while there is no inspect-delay.
2020-01-20 15:18:45 +01:00
Christopher Faulet
ac98d81f46 MINOR: http-rule/tcp-rules: Make track-sc* custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the TCP/HTTP rules evaluation. Thus the action names
ACT_ACTION_TRK_SC0 and ACT_ACTION_TRK_SCMAX are removed. The action type is now
the tracking index. Thus the function trk_idx() is no longer needed.
2020-01-20 15:18:45 +01:00
Christopher Faulet
91b3ec13c6 MEDIUM: http-rules: Make early-hint custom actions
Now, the early-hint action uses its own dedicated action and is no longer
handled "in place" during the HTTP rules evaluation. Thus the action name
ACT_HTTP_EARLY_HINT is removed. In additionn, http_add_early_hint_header() and
http_reply_103_early_hints() are also removed. This part is now handled in the
new action_ptr callback function.
2020-01-20 15:18:45 +01:00
Christopher Faulet
046cf44f6c MINOR: http-rules: Make set/del-map and add/del-acl custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_*_ACL and ACT_HTTP_*_MAP are removed. The action type is now mapped as
following: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.
2020-01-20 15:18:45 +01:00
Christopher Faulet
d1f27e3394 MINOR: http-rules: Make set-header and add-header custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to
0 to set a header (so remove existing ones if any and add a new one) or to 1 to
add a header (add without remove).
2020-01-20 15:18:45 +01:00
Christopher Faulet
92d34fe38d MINOR: http-rules: Make replace-header and replace-value custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is
now set to 0 to evaluate the whole header or to 1 to evaluate every
comma-delimited values.

The function http_transform_header_str() is renamed to http_replace_hdrs() to be
more explicit and the function http_transform_header() is removed. In fact, this
last one is now more or less the new action function.

The lua code has been updated accordingly to use http_replace_hdrs().
2020-01-20 15:18:45 +01:00
Christopher Faulet
006f6507d7 MINOR: actions: Use an integer to set the action type
<action> field in the act_rule structure is now an integer. The act_name values
are used for all actions without action function (but it is not a pre-requisit
though) or the action will have no effect. But for all other actions, any
integer value may used, only the action function will take care of it. The
default for such actions is ACT_CUSTOM.
2020-01-20 15:18:45 +01:00
Christopher Faulet
245cf795c1 MINOR: actions: Add flags to configure the action behaviour
Some flags can now be set on an action when it is registered. The flags are
defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an
action to specify if it stops the rules evaluation. It is set on
ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH,
ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set
on custom actions.

Consequently, this flag is checked instead of the action type during the
configuration parsing to trigger a warning when a rule inhibits all the
following ones.
2020-01-20 15:18:45 +01:00
Christopher Faulet
105ba6cc54 MINOR: actions: Rename the act_flag enum into act_opt
The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT
prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the
action flags for the actions configuration.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cd26e8a2ec MINOR: http-rules/tcp-rules: Call the defined action function first if defined
When TCP and HTTP rules are evaluated, if an action function (action_ptr field
in the act_rule structure) is defined for a given action, it is now always
called in priority over the test on the action type. Concretly, for now, only
custom actions define it. Thus there is no change. It just let us the choice to
extend the action type beyond the existing ones in the enum.
2020-01-20 15:18:45 +01:00
Christopher Faulet
96bff76087 MINOR: actions: Regroup some info about HTTP rules in the same struct
Info used by HTTP rules manipulating the message itself are splitted in several
structures in the arg union. But it is possible to group all of them in a unique
struct. Now, <arg.http> is used by most of these rules, which contains:

  * <arg.http.i>   : an integer used as status code, nice/tos/mark/loglevel or
                     action id.
  * <arg.http.str> : an IST used as header name, reason string or auth realm.
  * <arg.http.fmt> : a log-format compatible expression
  * <arg.http.re>  : a regular expression used by replace rules
2020-01-20 15:18:45 +01:00
Christopher Faulet
58b3564fde MINOR: actions: Add a function pointer to release args used by actions
Arguments used by actions are never released during HAProxy deinit. Now, it is
possible to specify a function to do so. ".release_ptr" field in the act_rule
structure may be set during the configuration parsing to a specific deinit
function depending on the action type.
2020-01-20 15:18:45 +01:00
Christopher Faulet
e00d06c99f MINOR: http-rules: Handle all message rewrites the same way
In HTTP rules, error handling during a rewrite is now handle the same way for
all rules. First, allocation errors are reported as internal errors. Then, if
soft rewrites are allowed, rewrite errors are ignored and only the
failed_rewrites counter is incremented. Otherwise, when strict rewrites are
mandatory, interanl errors are returned.

For now, only soft rewrites are supported. Note also that the warning sent to
notify a rewrite failure was removed. It will be useless once the strict
rewrites will be possible.
2020-01-20 15:18:45 +01:00
Christopher Faulet
a00071e2e5 MINOR: http-ana: Add a txn flag to support soft/strict message rewrites
the HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore
rewrite errors on a message, from HTTP rules. The mode is called the soft
rewrites. If thes flag is not set, strict rewrites are performed. In this mode,
if a rewrite error occurred, an internal error is reported.

For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a
transaction in strict mode.
2020-01-20 15:18:45 +01:00
Christopher Faulet
a08546bb5a MINOR: counters: Remove failed_secu counter and use denied_resp instead
The failed_secu counter is only used for the servers stats. It is used to report
the number of denied responses. On proxies, the same info is stored in the
denied_resp counter. So, it is more consistent to use the same field for
servers.
2020-01-20 15:18:45 +01:00
Christopher Faulet
0159ee4032 MINOR: stats: Report internal errors in the proxies/listeners/servers stats
The stats field ST_F_EINT has been added to report internal errors encountered
per proxy, per listener and per server. It appears in the CLI export and on the
HTML stats page.
2020-01-20 15:18:45 +01:00
Christopher Faulet
30a2a3724b MINOR: http-rules: Add more return codes to let custom actions act as normal ones
When HTTP/TCP rules are evaluated, especially HTTP ones, some results are
possible for normal actions and not for custom ones. So missing return codes
(ACT_RET_) have been added to let custom actions act as normal ones. Concretely
following codes have been added:

 * ACT_RET_DENY : deny the request/response. It must be handled by the caller
 * ACT_RET_ABRT : abort the request/response, handled by action itsleft.
 * ACT_RET_INV  : invalid request/response
2020-01-20 15:18:45 +01:00
Christopher Faulet
4d90db5f4c MINOR: http-rules: Add a rule result to report internal error
Now, when HTTP rules are evaluated, HTTP_RULE_RES_ERROR must be returned when an
internal error is catched. It is a way to make the difference between a bad
request or a bad response and an error during its processing.
2020-01-20 15:18:45 +01:00
Christopher Faulet
d4ce6c2957 MINOR: counters: Add a counter to report internal processing errors
This counter, named 'internal_errors', has been added in frontend and backend
counters. It should be used when a internal error is encountered, instead for
failed_req or failed_resp.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cb5501327c BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
Functions to deinitialize the HTTP rules are buggy. These functions does not
check the action name to release the right part in the arg union. Only few info
are released. For auth rules, the realm is released and there is no problem
here. But the regex <arg.hdr_add.re> is always unconditionally released. So it
is easy to make these functions crash. For instance, with the following rule
HAProxy crashes during the deinit :

      http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)]

For now, These functions are simply removed and we rely on the deinit function
used for TCP rules (renamed as deinit_act_rules()). This patch fixes the
bug. But arguments used by actions are not released at all, this part will be
addressed later.

This patch must be backported to all stable versions.
2020-01-20 15:18:45 +01:00
Willy Tarreau
ee1a6fc943 MINOR: connection: make the last arg of subscribe() a struct wait_event*
The subscriber used to be passed as a "void *param" that was systematically
cast to a struct wait_event*. By now it appears clear that the subscribe()
call at every layer is well defined and always takes a pointer to an event
subscriber of type wait_event, so let's enforce this in the functions'
prototypes, remove the intermediary variables used to cast it and clean up
the comments to clarify what all these functions do in their context.
2020-01-17 18:30:37 +01:00
Willy Tarreau
7872d1fc15 MEDIUM: connection: merge the send_wait and recv_wait entries
In practice all callers use the same wait_event notification for any I/O
so instead of keeping specific code to handle them separately, let's merge
them and it will allow us to create new events later.
2020-01-17 18:30:36 +01:00
Willy Tarreau
3a9312af8f REORG: stream/backend: move backend-specific stuff to backend.c
For more than a decade we've kept all the sess_update_st_*() functions
in stream.c while they're only there to work in relation with what is
currently being done in backend.c (srv_redispatch_connect, connect_server,
etc). Let's move all this pollution over there and take this opportunity
to try to find slightly less confusing names for these old functions
whose role is only to handle transitions from one specific stream-int
state:

  sess_update_st_rdy_tcp() -> back_handle_st_rdy()
  sess_update_st_con_tcp() -> back_handle_st_con()
  sess_update_st_cer()     -> back_handle_st_cer()
  sess_update_stream_int() -> back_try_conn_req()
  sess_prepare_conn_req()  -> back_handle_st_req()
  sess_establish()         -> back_establish()

The last one remained in stream.c because it's more or less a completion
function which does all the initialization expected on a connection
success or failure, can set analysers and emit logs.

The other ones could possibly slightly benefit from being modified to
take a stream-int instead since it's really what they're working with,
but it's unimportant here.
2020-01-17 18:30:36 +01:00
Willy Tarreau
3381bf89e3 MEDIUM: connection: get rid of CO_FL_CURR_* flags
These ones used to serve as a set of switches between CO_FL_SOCK_* and
CO_FL_XPRT_*, and now that the SOCK layer is gone, they're always a
copy of the last know CO_FL_XPRT_* ones that is resynchronized before
I/O events by calling conn_refresh_polling_flags(), and that are pushed
back to FDs when detecting changes with conn_xprt_polling_changes().

While these functions are not particularly heavy, what they do is
totally redundant by now because the fd_want_*/fd_stop_*() actions
already perform test-and-set operations to decide to create an entry
or not, so they do the exact same thing that is done by
conn_xprt_polling_changes(). As such it is pointless to call that
one, and given that the only reason to keep CO_FL_CURR_* is to detect
changes there, we can now remove them.

Even if this does only save very few cycles, this removes a significant
complexity that has been responsible for many bugs in the past, including
the last one affecting FreeBSD.

All tests look good, and no performance regressions were observed.
2020-01-17 17:45:12 +01:00
Willy Tarreau
e2a0eeca77 MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only
CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared
by the stream-int when splicing is disabled, as well as in
conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could
be attempted by the I/O callbacks called from conn_fd_handler(). This
clearing in conn_refresh_polling_flags() makes no sense anymore and is
in no way related to the polling at all.

Since we don't call them from there anymore it's better to clear it
before attempting to receive, and to set it again later. So let's move
this operation where it should be, in raw_sock_to_pipe() so that it's
now symmetric. It was also placed in raw_sock_to_buf() so that we're
certain that it gets cleared if an attempt to splice is replaced with
a subsequent attempt to recv(). And these were currently already achieved
by the call to conn_refresh_polling_flags(). Now it could theorically be
removed from the stream-int.
2020-01-17 17:19:27 +01:00
Willy Tarreau
17ccd1a356 BUG/MEDIUM: connection: add a mux flag to indicate splice usability
Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.

What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (that only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message were skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.

All this goes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() inconditionally because, as the commit above showed, this
reduces the forwarding performance by 2 to 3 in TCP and legacy modes
due to data lying in the buffer preventing splicing from being used
later.

The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply be to indicate that more data are awaited
but the real need being to distinguish splicing and receiving, here
instead we clearly report the mux's willingness to be called for splicing
or not. Hence the flag's name, CS_FL_MAY_SPLICE.

The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.

The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.

All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity but definitely need to be thought about, and should
even significantly simplify the data processing layer.

The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunked encoded data, since these ones are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.

This fix must be backported to 2.1 and 2.0.
2020-01-17 17:00:12 +01:00
Willy Tarreau
340b07e868 BUG/MAJOR: hashes: fix the signedness of the hash inputs
Wietse Venema reported in the thread below that we have a signedness
issue with our hashes implementations: due to the use of const char*
for the input key that's often text, the crc32, sdbm, djb2, and wt6
algorithms return a platform-dependent value for binary input keys
containing bytes with bit 7 set. This means that an ARM or PPC
platform will hash binary inputs differently from an x86 typically.
Worse, some algorithms are well defined in the industry (like CRC32)
and do not provide the expected result on x86, possibly causing
interoperability issues (e.g. a user-agent would fail to compare the
CRC32 of a message body against the one computed by haproxy).

Fortunately, and contrary to the first impression, the CRC32c variant
used in the PROXY protocol processing is not affected. Thus the impact
remains very limited (the vast majority of input keys are text-based,
such as user-agent headers for exmaple).

This patch addresses the issue by fixing all hash functions' prototypes
(even those not affected, for API consistency). A reg test will follow
in another patch.

The vast majority of users do not use these hashes. And among those
using them, very few will pass them on binary inputs. However, for the
rare ones doing it, this fix MAY have an impact during the upgrade. For
example if the package is upgraded on one LB then on another one, and
the CRC32 of a binary input is used as a stick table key (why?) then
these CRCs will not match between both nodes. Similarly, if
"hash-type ... crc32" is used, LB inconsistency may appear during the
transition. For this reason it is preferable to apply the patch on all
nodes using such hashes at the same time. Systems upgraded via their
distros will likely observe the least impact since they're expected to
be upgraded within a short time frame.

And it is important for distros NOT to skip this fix, in order to avoid
distributing an incompatible implementation of a hash. This is the
reason why this patch is tagged as MAJOR, eventhough it's extremely
unlikely that anyone will ever notice a change at all.

This patch must be backported to all supported branches since the
hashes were introduced in 1.5-dev20 (commit 98634f0c). Some parts
may be dropped since implemented later.

Link to Wietse's report:
  https://marc.info/?l=postfix-users&m=157879464518535&w=2
2020-01-16 08:23:42 +01:00
Willy Tarreau
f31af9367e MEDIUM: lua: don't call the GC as often when dealing with outgoing connections
In order to properly close connections established from Lua in case
a Lua context dies, the context currently automatically gets a flag
HLUA_MUST_GC set whenever an outgoing connection is used. This causes
the GC to be enforced on the context's death as well as on yield. First,
it does not appear necessary to do it when yielding, since if the
connections die they are already cleaned up. Second, the problem with
the flag is that even if a connection gets properly closed, the flag is
not removed and the GC continues to be called on the Lua context.

The impact on performance looks quite significant, as noticed and
diagnosed by Sadasiva Gujjarlapudi in the following thread:

  https://www.mail-archive.com/haproxy@formilux.org/msg35810.html

This patch changes the flag for a counter so that each created
connection increments it and each cleanly closed connection decrements
it. That way we know we have to call the GC on the context's death only
if the count is non-null. As reported in the thread above, the Lua
performance gain is now over 20% by doing this.

Thanks to Sada and Thierry for the design discussion and tests that
led to this solution.
2020-01-14 10:12:31 +01:00
Olivier Houchard
3c4f40acbf BUG/MEDIUM: tasks: Use the MT macros in tasklet_free().
In tasklet_free(), to attempt to remove ourself, use MT_LIST_DEL, we can't
just use LIST_DEL(), as we theorically could be in the shared tasklet list.

This should be backported to 2.1.
2020-01-10 16:56:59 +01:00
Florian Tham
9205fea13a MINOR: http: Add 404 to http-request deny
This patch adds http status code 404 Not Found to http-request deny. See
issue #80.
2020-01-08 16:15:23 +01:00
Florian Tham
272e29b5cc MINOR: http: Add 410 to http-request deny
This patch adds http status code 410 Gone to http-request deny. See
issue #80.
2020-01-08 16:15:23 +01:00
Willy Tarreau
eaf05be0ee OPTIM: polling: do not create update entries for FD removal
In order to reduce the number of poller updates, we can benefit from
the fact that modern pollers use sampling to report readiness and that
under load they rarely report the same FD multiple times in a row. As
such it's not always necessary to disable such FDs especially when we're
almost certain they'll be re-enabled again and will require another set
of syscalls.

Now instead of creating an update for a (possibly temporary) removal,
we only perform this removal if the FD is reported again as ready while
inactive. In addition this is performed via another update so that
alternating workloads like transfers have a chance to re-enable the
FD without any syscall during the loop (typically after the data that
filled a buffer have been sent). However we only do that for single-
threaded FDs as the other ones require a more complex setup and are not
on the critical path.

This does cause a few spurious wakeups but almost totally eliminates the
calls to epoll_ctl() on connections seeing intermitent traffic like HTTP/1
to a server or client.

A typical example with 100k requests for 4 kB objects over 200 connections
shows that the number of epoll_ctl() calls doesn't depend on the number
of requests anymore but most exclusively on the number of established
connections:

Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 57.09    0.499964           0    654361    321190 recvfrom
 38.33    0.335741           0    369097         1 epoll_wait
  4.56    0.039898           0     44643           epoll_ctl
  0.02    0.000211           1       200       200 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.875814               1068301    321391 total

After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 59.25    0.504676           0    657600    323630 recvfrom
 40.68    0.346560           0    374289         1 epoll_wait
  0.04    0.000370           0       620           epoll_ctl
  0.03    0.000228           1       200       200 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.851834               1032709    323831 total

As expected there is also a slight increase of epoll_wait() calls since
delaying de-activation of events can occasionally cause one spurious
wakeup.
2019-12-27 16:38:47 +01:00
Willy Tarreau
19689882e6 MINOR: poller: do not call the IO handler if the FD is not active
For now this almost never happens but with subsequent patches it will
become more important not to uselessly call the I/O handlers if the FD
is not active.
2019-12-27 16:38:47 +01:00
Willy Tarreau
0fbc318e24 CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE
Both flags became equal in commit 82967bf9 ("MINOR: connection: adjust
CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the
overlap between xprt_done_cb() and wake() after the removal of the DATA
specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the
"_DONE" version already covers everything and explains the intent well
enough.
2019-12-27 16:38:47 +01:00
Willy Tarreau
4970e5adb7 REORG: connection: move tcp_connect_probe() to conn_fd_check()
The function is not TCP-specific at all, it covers all FD-based sockets
so let's move this where other similar functions are, in connection.c,
and rename it conn_fd_check().
2019-12-27 16:38:43 +01:00
Willy Tarreau
11ef0837af MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP
In practice it's all pollers except select(). It turns out that we're
keeping some legacy code only for select and enforcing it on all
pollers, let's offer the pollers the ability to declare that they
do not need that.
2019-12-27 14:04:33 +01:00
Lukas Tribus
a26d1e1324 BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility
SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled
with the no-deprecated option. Remove existing, incomplete guards and
add a compatibility macro in openssl-compat.h, just as OpenSSL does:

bf4006a6f9/include/openssl/ssl.h (L1486)

This should be backported as far as 2.0 and probably even 1.9.
2019-12-21 06:46:55 +01:00
Rosen Penev
b3814c2ca8 BUG/MINOR: ssl: openssl-compat: Fix getm_ defines
LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition
always true. Check for the define before checking it.

Signed-off-by: Rosen Penev <rosenp@gmail.com>

[wt: to be backported as far as 1.9]
2019-12-20 16:01:31 +01:00
Willy Tarreau
dd0e89a084 BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing
Since 1.9 with commit b20aa9eef3 ("MAJOR: tasks: create per-thread wait
queues") a task bound to a single thread will not use locks when being
queued or dequeued because the wait queue is assumed to be the owner
thread's.

But there exists a rare situation where this is not true: the health
check tasks may be running on one thread waiting for a response, and
may in parallel be requeued by another thread calling health_adjust()
after a detecting a response error in traffic when "observe l7" is set,
and "fastinter" is lower than "inter", requiring to shorten the running
check's timeout. In this case, the task being requeued was present in
another thread's wait queue, thus opening a race during task_unlink_wq(),
and gets requeued into the calling thread's wait queue instead of the
running one's, opening a second race here.

This patch aims at protecting against the risk of calling task_unlink_wq()
from one thread while the task is queued on another thread, hence unlocked,
by introducing a new TASK_SHARED_WQ flag.

This new flag indicates that a task's position in the wait queue may be
adjusted by other threads than then one currently executing it. This means
that such WQ manipulations must be performed under a lock. There are two
types of such tasks:
  - the global ones, using the global wait queue (technically speaking,
    those whose thread_mask has at least 2 bits set).
  - some local ones, which for now will be placed into the global wait
    queue as well in order to benefit from its lock.

The flag is automatically set on initialization if the task's thread mask
indicates more than one thread. The caller must also set it if it intends
to let other threads update the task's expiration delay (e.g. delegated
I/Os), or if it intends to change the task's affinity over time as this
could lead to the same situation.

Right now only the situation described above seems to be affected by this
issue, and it is very difficult to trigger, and even then, will often have
no visible effect beyond stopping the checks for example once the race is
met. On my laptop it is feasible with the following config, chained to
httpterm:

    global
        maxconn 400 # provoke FD errors, calling health_adjust()

    defaults
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s

    listen px
        bind :8001
        option httpchk /?t=50
        server sback 127.0.0.1:8000 backup
        server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7

This patch will automatically address the case for the checks because
check tasks are created with multiple threads bound and will get the
TASK_SHARED_WQ flag set.

If in the future more tasks need to rely on this (multi-threaded muxes
for example) and the use of the global wait queue becomes a bottleneck
again, then it should not be too difficult to place locks on the local
wait queues and queue the task on its bound thread.

This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on
previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to
requeue a task".

Many thanks to William Dauchy for providing detailed traces allowing to
spot the problem.
2019-12-19 14:42:22 +01:00
Christopher Faulet
76014fd118 MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state
During H1 parsing, the HTX EOM block is added before switching the message state
to H1_MSG_DONE. It is an exception in the way to convert an H1 message to
HTX. Except for this block, the message is first switched to the right state
before starting to add the corresponding HTX blocks. For instance, the message
is switched in H1_MSG_DATA state and then the HTX DATA blocks are added.

With this patch, the message is switched to the H1_MSG_DONE state when all data
blocks or trailers were processed. It is the caller responsibility to call
h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far
easier to catch failures when the HTX buffer is full.

The H1 and FCGI muxes have been updated accordingly.

This patch may eventually be backported to 2.1 if it helps other backports.
2019-12-11 16:46:16 +01:00
Willy Tarreau
fec56c6a76 BUG/MINOR: listener: fix off-by-one in state name check
As reported in issue #380, the state check in listener_state_str() is
invalid as it allows state value 9 to report crap. We don't use such
a state value so the issue should never happen unless the memory is
already corrupted, but better clean this now while it's harmless.

This should be backported to all maintained branches.
2019-12-11 15:51:37 +01:00
Willy Tarreau
d26c9f9465 BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
If a new process is started with -sf and it fails to bind, it may send
a SIGTTOU to the master process in hope that it will temporarily unbind.
Unfortunately this one doesn't catch it and stops to background instead
of forwarding the signal to the workers. The same is true for SIGTTIN.

This commit simply implements an extra signal handler for the master to
deal with such signals that must be passed down to the workers. It must
be backported as far as 1.8, though there the code differs in that it's
entirely in haproxy.c and doesn't require an extra sig handler.
2019-12-11 14:26:53 +01:00
Willy Tarreau
c49ba52524 MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
We used to have wake_expired_tasks() wake up tasks and return the next
expiration delay. The problem this causes is that we have to call it just
before poll() in order to consider latest timers, but this also means that
we don't wake up all newly expired tasks upon return from poll(), which
thus systematically requires a second poll() round.

This is visible when running any scheduled task like a health check, as there
are systematically two poll() calls, one with the interval, nothing is done
after it, and another one with a zero delay, and the task is called:

  listen test
    bind *:8001
    server s1 127.0.0.1:1111 check

  09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0
  09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0
  09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0
>> nothing run here, as the expired task was not woken up yet.
  09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0
  09:37:39.202505 epoll_wait(3, [], 200, 0) = 0
  09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0
>> now the expired task was woken up
  09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0
  09:37:39.202683 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0
  09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:37:39.202715 close(7)                = 0

Let's instead split the function in two parts:
  - the first part, wake_expired_tasks(), called just before
    process_runnable_tasks(), wakes up all expired tasks; it doesn't
    compute any timeout.
  - the second part, next_timer_expiry(), called just before poll(),
    only computes the next timeout for the current thread.

Thanks to this, all expired tasks are properly woken up when leaving
poll, and each poll call's timeout remains up to date:

  09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0
  09:41:16.270457 epoll_wait(3, [], 200, 999) = 0
  09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0
  09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0
  09:41:17.270323 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0
  09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:41:17.270367 close(7)                = 0

This may be backported to 2.1 and 2.0 though it's unlikely to bring any
user-visible improvement except to clarify debugging.
2019-12-11 09:42:58 +01:00
Willy Tarreau
440d09b244 BUG/MINOR: tasks: only requeue a task if it was already in the queue
Commit 0742c314c3 ("BUG/MEDIUM: tasks: Make sure we switch wait queues
in task_set_affinity().") had a slight side effect on expired timeouts,
which is that when used before a timeout is updated, it will cause an
existing task to be requeued earlier than its expected timeout when done
before being updated, resulting in the next poll wakup timeout too early
or even instantly if the previous wake up was done on a timeout. This is
visible in strace when health checks are enabled because there are two
poll calls, one of which has a short or zero delay. The correct solution
is to only requeue a task if it was already in the queue.

This can be backported to all branches having the fix above.
2019-12-11 09:21:36 +01:00
Willy Tarreau
a1d97f88e0 REORG: listener: move the global listener queue code to listener.c
The global listener queue code and declarations were still lying in
haproxy.c while not needed there anymore at all. This complicates
the code for no reason. As a result, the global_listener_queue_task
and the global_listener_queue were made static.
2019-12-10 14:16:03 +01:00
Willy Tarreau
241797a3fc MINOR: listener: split dequeue_all_listener() in two
We use it half times for the global_listener_queue and half times
for a proxy's queue and this requires the callers to take care of
these. Let's split it in two versions, the current one working only
on the global queue and another one dedicated to proxies for the
per-proxy queues. This cleans up quite a bit of code.
2019-12-10 14:14:09 +01:00
Willy Tarreau
a45a8b5171 MEDIUM: init: set NO_NEW_PRIVS by default when supported
HAProxy doesn't need to call executables at run time (except when using
external checks which are strongly recommended against), and is even expected
to isolate itself into an empty chroot. As such, there basically is no valid
reason to allow a setuid executable to be called without the user being fully
aware of the risks. In a situation where haproxy would need to call external
checks and/or disable chroot, exploiting a vulnerability in a library or in
haproxy itself could lead to the execution of an external program. On Linux
it is possible to lock the process so that any setuid bit present on such an
executable is ignored. This significantly reduces the risk of privilege
escalation in such a situation. This is what haproxy does by default. In case
this causes a problem to an external check (for example one which would need
the "ping" command), then it is possible to disable this protection by
explicitly adding this directive in the global section. If enabled, it is
possible to turn it back off by prefixing it with the "no" keyword.

Before the option:

  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  uid=0(root) gid=0(root) groups=0(root

After the option:
  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the
        'nosuid' option set or an NFS file system without root privileges?
2019-12-06 17:20:26 +01:00
Olivier Houchard
0742c314c3 BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().
In task_set_affinity(), leave the wait_queue if any before changing the
affinity, and re-enter a wait queue once it is done. If we don't do that,
the task may stay in the wait queue of another thread, and we later may
end up modifying that wait queue while holding no lock, which could lead
to memory corruption.

THis should be backported to 2.1, 2.0 and 1.9.
2019-12-05 15:11:19 +01:00
Willy Tarreau
d96f1126fe MEDIUM: init: prevent process and thread creation at runtime
Some concerns are regularly raised about the risk to inherit some Lua
files which make use of a fork (e.g. via os.execute()) as well as
whether or not some of bugs we fix might or not be exploitable to run
some code. Given that haproxy is event-driven, any foreground activity
completely stops processing and is easy to detect, but background
activity is a different story. A Lua script could very well discretely
fork a sub-process connecting to a remote location and taking commands,
and some injected code could also try to hide its activity by creating
a process or a thread without blocking the rest of the processing. While
such activities should be extremely limited when run in an empty chroot
without any permission, it would be better to get a higher assurance
they cannot happen.

This patch introduces something very simple: it limits the number of
processes and threads to zero in the workers after the last thread was
created. By doing so, it effectively instructs the system to fail on
any fork() or clone() syscall. Thus any undesired activity has to happen
in the foreground and is way easier to detect.

This will obviously break external checks (whose concept is already
totally insecure), and for this reason a new option
"insecure-fork-wanted" was added to disable this protection, and it
is suggested in the fork() error report from the checks. It is
obviously recommended not to use it and to reconsider the reasons
leading to it being enabled in the first place.

If for any reason we fail to disable forks, we still start because it
could be imaginable that some operating systems refuse to set this
limit to zero, but in this case we emit a warning, that may or may not
be reported since we're after the fork point. Ideally over the long
term it should be conditionned by strict-limits and cause a hard fail.
2019-12-03 11:49:00 +01:00
Emmanuel Hocdet
e9a100e982 BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0
Commit d4f9a60e "MINOR: ssl: deduplicate ca-file" uses undeclared X509
functions when build with openssl < 1.1.0. Introduce this functions
in openssl-compat.h .

Fix issue #385.
2019-12-03 07:13:12 +01:00
Emmanuel Hocdet
d4f9a60ee2 MINOR: ssl: deduplicate ca-file
Typically server line like:
'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt'
load ca-certificates.crt 1000 times and stay duplicated in memory.
Same case for bind line: ca-file is loaded for each certificate.
Same 'ca-file' can be load one time only and stay deduplicated in
memory.

As a corollary, this will prevent file access for ca-file when
updating a certificate via CLI.
2019-11-28 11:11:20 +01:00
Willy Tarreau
cdb27e8295 MINOR: version: this is development again, update the status
It's basically a revert of commit 9ca7f8cea.
2019-11-25 20:38:32 +01:00
Willy Tarreau
2e077f8d53 [RELEASE] Released version 2.2-dev0
Released version 2.2-dev0 with the following main changes :
    - exact copy of 2.1.0
2019-11-25 20:36:16 +01:00
Willy Tarreau
9ca7f8ceac MINOR: version: indicate that this version is stable
Also indicate that it will get fixes till ~Q1 2021.
2019-11-25 19:47:23 +01:00
Willy Tarreau
c22d5dfeb8 MINOR: h2: add a function to report H2 error codes as strings
Just like we have frame type to string, let's have error to string to
improve debugging and traces.
2019-11-25 11:34:26 +01:00
Willy Tarreau
8f3ce06f14 MINOR: ist: add ist_find_ctl()
This new function looks for the first control character in a string (a
char whose value is between 0x00 and 0x1F included) and returns it, or
NULL if there is none. It is optimized for quickly evicting non-matching
strings and scans ~0.43 bytes per cycle. It can be used as an accelerator
when it's needed to look up several of these characters (e.g. CR/LF/NUL).
2019-11-25 10:33:35 +01:00
Willy Tarreau
47479eb0e7 MINOR: version: emit the link to the known bugs in output of "haproxy -v"
The link to the known bugs page for the current version is built and
reported there. When it is a development version (less than 2 dots),
instead a link to github open issues is reported as there's no way to
be sure about the current situation in this case and it's better that
users report their trouble there.
2019-11-21 18:48:20 +01:00
Willy Tarreau
08dd202d73 MINOR: version: report the version status in "haproxy -v"
As discussed on Discourse here:

    https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466

it's not always easy for end users to know the lifecycle of the version
they are using. This patch introduces a "Status" line in the output of
"haproxy -vv" indicating whether it's a development, stable, long-term
supported version, possibly with an estimated end of life for the branch
when it can be anticipated (e.g. for stable versions). This field should
be adjusted when creating a major release to reflect the new status.

It may make sense to backport this to other branches to clarify the
situation.
2019-11-21 18:47:54 +01:00
William Lallemand
8b453912ce MINOR: ssl: ssl_sock_prepare_ctx() return an error code
Rework ssl_sock_prepare_ctx() so it fills a buffer with the error
messages instead of using ha_alert()/ha_warning(). Also returns an error
code (ERR_*) instead of the number of errors.
2019-11-21 17:48:11 +01:00
Daniel Corbett
f8716914c7 MEDIUM: dns: Add resolve-opts "ignore-weight"
It was noted in #48 that there are times when a configuration
may use the server-template directive with SRV records and
simultaneously want to control weights using an agent-check or
through the runtime api.  This patch adds a new option
"ignore-weight" to the "resolve-opts" directive.

When specified, any weight indicated within an SRV record will
be ignored.  This is for both initial resolution and ongoing
resolution.
2019-11-21 17:25:31 +01:00
Frdric Lcaille
ec1c10b839 MINOR: peers: Add debugging information to "show peers".
This patch adds three counters to help in debugging peers protocol issues
to "peer" struct:
	->no_hbt counts the number of reconnection period without receiving heartbeat
	->new_conn counts the number of reconnections after ->reconnect timeout expirations.
	->proto_err counts the number of protocol errors.
2019-11-19 14:48:28 +01:00
Frdric Lcaille
33cab3c0eb MINOR: peers: Add TX/RX heartbeat counters.
Add RX/TX heartbeat counters to "peer" struct to have an idead about which
peer is alive or not.
Dump these counters values on the CLI via "show peers" command.
2019-11-19 14:48:25 +01:00
Cdric Dufour
0d7712dff0 MINOR: stick-table: allow sc-set-gpt0 to set value from an expression
Allow the sc-set-gpt0 action to set GPT0 to a value dynamically evaluated from
its <expr> argument (in addition to the existing static <int> alternative).
2019-11-15 18:24:19 +01:00
Willy Tarreau
869efd5eeb BUG/MINOR: log: make "show startup-log" use a ring buffer instead
The copy of the startup logs used to rely on a re-allocated memory area
on the fly, that would attempt to be delivered at once over the CLI. But
if it's too large (too many warnings) it will take time to start up, and
may not even show up on the CLI as it doesn't fit in a buffer.

The ring buffer infrastructure solves all this with no more code, let's
switch to this instead. It simply requires a parsing function to attach
the ring via ring_attach_cli() and all the rest is automatically handled.

Initially this was imagined as a code cleanup, until a test with a config
involving 100k backends and just one occurrence of
"load-server-state-from-file global" in the defaults section took approx
20 minutes to parse due to the O(N^2) cost of concatenating the warnings
resulting in ~1 TB of data to be copied, while it took only 0.57s with
the ring.

Ideally this patch should be backported to 2.0 and 1.9, though it relies
on the ring infrastructure which will then also need to be backported.
Configs able to trigger the bug are uncommon, so another workaround for
older versions without backporting the rings would consist in simply
limiting the size of the error message in print_message() to something
always printable, which will only return the first errors.
2019-11-15 15:50:16 +01:00
Christopher Faulet
0d1c2a65e8 MINOR: stats: Report max times in addition of the averages for sessions
Now, for the sessions, the maximum times (queue, connect, response, total) are
reported in addition of the averages over the last 1024 connections. These
values are called qtime_max, ctime_max, rtime_max and ttime_max.

This patch is related to #272.
2019-11-15 14:23:54 +01:00
Christopher Faulet
efb41f0d8d MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
For backends and servers, some average times for last 1024 connections are
already calculated. For the moment, the averages for the time passed in the
queue, the connect time, the response time (for HTTP session only) and the total
time are calculated. Now, in addition, the maximum time observed for these
values are also stored.

In addition, These new counters are cleared as all other max values with the CLI
command "clear counters".

This patch is related to #272.
2019-11-15 14:23:21 +01:00
Christopher Faulet
e2e8c6779e MINOR: freq_ctr: Make the sliding window sums thread-safe
swrate_add() and swrate_add_scaled() now rely on the CAS atomic operation. So
the sliding window sums are atomically updated.
2019-11-15 13:43:08 +01:00
Christopher Faulet
b2e58492b1 MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
This change make the payload filtering uniform between TCP and HTTP
filters. Now, in TCP, like in HTTP, there is only one callback responsible to
forward data. Thus, old callbacks, tcp_data() and tcp_forward_data(), are
replaced by a single callback function, tcp_payload(). This new callback gets
the offset in the payload to (re)start the filtering and the maximum amount of
data it can forward. It is the filter's responsibility to be compatible with HTX
streams. If not, it must not set the flag FLT_CFG_FL_HTX.

Because of this change, nxt and fwd offsets are no longer needed. Thus they are
removed from the filter structure with their update functions,
flt_change_next_size() and flt_change_forward_size(). Moreover, the trace filter
has been updated accordingly.

This patch breaks the compatibility with the old API. Thus it should probably
not be backported. But, AFAIK, there is no TCP filter, thus the breakage is very
limited.
2019-11-15 13:43:08 +01:00
Willy Tarreau
da52035a45 MINOR: memory: also poison the area on freeing
Doing so sometimes helps detect some UAF situations without the overhead
associated to the DEBUG_UAF define.
2019-11-15 07:06:46 +01:00
Olivier Houchard
7031e3dace BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
In tasklet_remove_from_tasket_list(), we can be called for a tasklet that is
either in the private task list, or in the shared tasklet list. Take that into
account and always use MT_LIST_DEL() to remove it, otherwise if we're in the
shared list and another thread attempts to add a tasklet in it, bad things
will happen.

__tasklet_remove_from_tasklet_list() is left unchanged, it's only supposed
to be used by process_runnable_task() to remove task/tasklets from the private
tast list.

This should not be backported.
This should fix github issue #357.
2019-11-09 18:27:17 +01:00
Christopher Faulet
fee726ffa7 MINOR: http-ana: Remove the unused function http_reset_txn()
Since the legacy HTTP mode was removed, the stream is always released at the end
of each HTTP transaction and a new is created to handle the next request for
keep-alive connections. So the HTTP transaction is no longer reset and the
function http_reset_txn() can be removed.
2019-11-07 15:32:52 +01:00
Christopher Faulet
eea8fc737b MEDIUM: stream/trace: Register a new trace source with its events
Runtime traces are now supported for the streams, only if compiled with
debug. process_stream() is covered as well as TCP/HTTP analyzers and filters.

In traces, the first argument is always a stream. So it is easy to get the info
about the channels and the stream-interfaces. The second argument, when defined,
is always a HTTP transaction. And the third one is an HTTP message. The trace
message is adapted to report HTTP info when possible.
2019-11-06 10:14:32 +01:00
Christopher Faulet
db703b1918 MINOR: trace: Add a set of macros to trace events if HA is compiled with debug
The macros DBG_TRACE_*() can be used instead of existing trace macros to emit
trace messages in debug mode only, ie, when HAProxy is compiled with DEBUG_FULL
or DEBUG_DEV. Otherwise, these macros do nothing. So it is possible to add
traces for development purpose without impacting performance of production
instances.
2019-11-06 10:14:32 +01:00
William Lallemand
21724f0807 MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert'
If the SSL_CTX of a previous instance (ckch_inst) was used as a
default_ctx, replace the default_ctx of the bind_conf by the first
SSL_CTX inserted in the SNI tree.

Use the RWLOCK of the sni tree to handle the change of the default_ctx.
2019-11-04 18:16:53 +01:00
Damien Claisse
ae6f125c7b MINOR: sample: add us/ms support to date/http_date
It can be sometimes interesting to have a timestamp with a
resolution of less than a second.
It is currently painful to obtain this, because concatenation
of date and date_us lead to a shorter timestamp during first
100ms of a second, which is not parseable and needs ugly ACLs
in configuration to prepend 0s when needed.
To improve this, add an optional <unit> parameter to date sample
to report an integer with desired unit.
Also support this unit in http_date converter to report
a date string with sub-second precision.
2019-10-31 08:47:31 +01:00
William Lallemand
beea2a476e CLEANUP: ssl/cli: remove leftovers of bundle/certs (it < 2)
Remove the leftovers of the certificate + bundle updating in 'ssl set
cert' and 'commit ssl cert'.

* Remove the it variable in appctx.ctx.ssl.
* Stop doing everything twice.
* Indent
2019-10-30 17:52:34 +01:00
William Lallemand
bc6ca7ccaa MINOR: ssl/cli: rework 'set ssl cert' as 'set/commit'
This patch splits the 'set ssl cert' CLI command into 2 commands.

The previous way of updating the certificate on the CLI was limited with
the bundles. It was only able to apply one of the tree part of the
certificate during an update, which mean that we needed 3 updates to
update a full 3 certs bundle.

It was also not possible to apply atomically several part of a
certificate with the ability to rollback on error. (For example applying
a .pem, then a .ocsp, then a .sctl)

The command 'set ssl cert' will now duplicate the certificate (or
bundle) and update it in a temporary transaction..

The second command 'commit ssl cert' will commit all the changes made
during the transaction for the certificate.

This commit breaks the ability to update a certificate which was used as
a unique file and as a bundle in the HAProxy configuration. This way of
using the certificates wasn't making any sense.

Example:

	// For a bundle:

	$ echo -e "set ssl cert localhost.pem.rsa <<\n$(cat kikyo.pem.rsa)\n" | socat /tmp/sock1 -
	Transaction created for certificate localhost.pem!

	$ echo -e "set ssl cert localhost.pem.dsa <<\n$(cat kikyo.pem.dsa)\n" | socat /tmp/sock1 -
	Transaction updated for certificate localhost.pem!

	$ echo -e "set ssl cert localhost.pem.ecdsa <<\n$(cat kikyo.pem.ecdsa)\n" | socat /tmp/sock1 -
	Transaction updated for certificate localhost.pem!

	$ echo "commit ssl cert localhost.pem" | socat /tmp/sock1 -
	Committing localhost.pem.
	Success!
2019-10-30 17:01:07 +01:00
William Dauchy
0fec3ab7bf MINOR: init: always fail when setrlimit fails
this patch introduces a strict-limits parameter which enforces the
setrlimit setting instead of a warning. This option can be forcingly
disable with the "no" keyword.
The general aim of this patch is to avoid bad surprises on a production
environment where you change the maxconn for example, a new fd limit is
calculated, but cannot be set because of sysfs setting. In that case you
might want to have an explicit failure to be aware of it before seeing
your traffic going down. During a global rollout it is also useful to
explictly fail as most progressive rollout would simply check the
general health check of the process.

As discussed, plan to use the strict by default mode starting from v2.3.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-10-29 17:42:27 +01:00
Olivier Houchard
6e8e2ec849 BUG/MEDIUM: stream_interface: Only use SI_ST_RDY when the mux is ready.
In si_connect(), only switch the strema_interface status to SI_ST_RDY if
we're reusing a connection and if the connection's mux is ready. Otherwise,
maybe we're reusing a connection that is not fully established yet, and may
fail, and setting SI_ST_RDY would mean we would not be able to retry to
connect.

This should be backported to 1.9 and 2.0.
This commit depends on 55234e33708c5a584fb9efea81d71ac47235d518.
2019-10-29 14:15:20 +01:00
Olivier Houchard
9b8e11e691 MINOR: mux: Add a new method to get informations about a mux.
Add a new method, ctl(), to muxes. It uses a "enum mux_ctl_type" to
let it know which information we're asking for, and can output it either
directly by returning the expected value, or by using an optional argument.
"output" argument.
Right now, the only known mux_ctl_type is MUX_STATUS, that will return 0 if
the mux is not ready, or MUX_STATUS_READY if the mux is ready.

We probably want to backport this to 1.9 and 2.0.
2019-10-29 14:15:20 +01:00
Willy Tarreau
2254b8ef4a Revert "MINOR: istbuf: add b_fromist() to make a buffer from an ist"
This reverts commit 9e46496d45. It was
wrong and is not reliable, depending on the compiler's version and
optimization, as the struct is assigned inside a statement, thus on
its own stack. It's not needed anymore now so let's remove this.
2019-10-29 13:09:14 +01:00
Willy Tarreau
20020ae804 MINOR: chunk: add chunk_istcat() to concatenate an ist after a chunk
We previously relied on chunk_cat(dst, b_fromist(src)) for this but it
is not reliable as the allocated buffer is inside the expression and
may be on a temporary stack. While it's possible to allocate stack space
for a struct and return a pointer to it, it's not possible to initialize
it form a temporary variable to prevent arguments from being evaluated
multiple times. Since this is only used to append an ist after a chunk,
let's instead have a chunk_istcat() function to perform exactly this
from a native ist.

The only call place (URI computation in the cache) was updated.
2019-10-29 13:09:14 +01:00
Willy Tarreau
9b013701f1 MINOR: stats/debug: maintain a counter of debug commands issued
Debug commands will usually mark the fate of the process. We'd rather
have them counted and visible in a core or in stats output than trying
to guess how a flag combination could happen. The counter is only
incremented when the command is about to be issued however, so that
failed attempts are ignored.
2019-10-24 18:38:00 +02:00
Willy Tarreau
abb9f9b057 MINOR: cli: add an expert mode to hide dangerous commands
Some commands like the debug ones are not enabled by default but can be
useful on some production environments. In order to avoid the temptation
of using them incorrectly, let's introduce an "expert" mode for a CLI
connection, which allows some commands to appear and be used. It is
enabled by command "expert-mode on" which is not listed by default.
2019-10-24 18:38:00 +02:00
Willy Tarreau
86bfe146c9 REORG: move CLI access level definitions to cli.h
These ones were still in global.h which is misplaced.
2019-10-24 18:38:00 +02:00
William Lallemand
705e088f0a BUG/MINOR: ssl: fix build of X509_chain_up_ref() w/ libreSSL
LibreSSL brought X509_chain_up_ref() in 2.7.5, so no need to build our
own version starting from this version.
2019-10-23 23:20:08 +02:00
William Lallemand
89f5807315 BUG/MINOR: ssl: fix build with openssl < 1.1.0
8c1cddef ("MINOR: ssl: new functions duplicate and free a ckch_store")
use some OpenSSL refcount functions that were introduced in OpenSSL
1.0.2 and OpenSSL 1.1.0.

Fix the problem by introducing them in openssl-compat.h.

Fix #336.
2019-10-23 19:44:50 +02:00
William Lallemand
8f840d7e55 MEDIUM: cli/ssl: handle the creation of SSL_CTX in an IO handler
To avoid affecting too much the traffic during a certificate update,
create the SNIs in a IO handler which yield every 10 ckch instances.

This way haproxy continues to respond even if we tries to update a
certificate which have 50 000 instances.
2019-10-23 11:54:51 +02:00
Willy Tarreau
403bfbb130 BUG/MEDIUM: pattern: make the pattern LRU cache thread-local and lockless
As reported in issue #335, a lot of contention happens on the PATLRU lock
when performing expensive regex lookups. This is absurd since the purpose
of the LRU cache was to have a fast cache for expressions, thus the cache
must not be shared between threads and must remain lockless.

This commit makes the LRU cache thread-local and gets rid of the PATLRU
lock. A test with 7 threads on 4 cores climbed from 67kH/s to 369kH/s,
or a scalability factor of 5.5.

Given the huge performance difference and the regression caused to
users migrating from processes to threads, this should be backported at
least to 2.0.

Thanks to Brian Diekelman for his detailed report about this regression.
2019-10-23 07:27:25 +02:00
Willy Tarreau
8cdc167df8 BUG/MEDIUM: task: make tasklets either local or shared but not both at once
Tasklets may be woken up to run on the calling thread or by a specific thread
(the owner). But since we use a non-thread safe mechanism when the calling
thread is also the for the owner, there may sometimes be collisions when two
threads decide to wake the same tasklet up at the same time and one of them
is the owner.

This is more of a matter of usage than code, in that a tasklet usually is
designed to be woken up and executed on the calling thread only (most cases)
or on a specific thread. Thus it is a property of the tasklet itself as this
solely depends how the code is constructed around it.

This patch performs a small change to address this. By default tasklet_new()
creates a "local" tasklet, which will run on the calling thread, like in 2.0.
This is done by setting tl->tid to a negative value. If the caller wants the
tasklet to run exclusively on a specific thread, it just has to set tl->tid,
which is already what shared tasklet callers do anyway.

No backport is needed.
2019-10-18 09:04:55 +02:00
Willy Tarreau
891b5ef05a BUG/MEDIUM: tasklet: properly compute the sleeping threads mask in tasklet_wakeup()
The use of ~(1 << tid) to compute the sleeping_mask in tasklet_wakeup()
will result in breakage above 32 threads, because (1<<31) = 0xFFFFFFFF8000000,
and upper values will lead to theorically undefined results, but practically
will wrap over 0x1 to 0x80000000 again and indicate wrong sleeping masks. It
seems that the main visible effect maybe extra latency on some threads or
short CPU loops on others.

No backport is needed.
2019-10-18 09:00:26 +02:00
Olivier Houchard
2068ec4f89 BUG/MEDIUM: lists: Handle 1-element-lists in MT_LIST_BEHEAD().
In MT_LIST_BEHEAD(), explicitely set the next element of the prev to NULL,
instead of setting it to the prev of the next. If we only had one element,
then we'd set the next and the prev to the element itself, and thus it would
make the element appear to be outside any list.
2019-10-17 17:48:20 +02:00
Willy Tarreau
9e46496d45 MINOR: istbuf: add b_fromist() to make a buffer from an ist
A lot of our chunk-based functions are able to work on a buffer pointer
but not on an ist. Instead of duplicating all of them to also take an
ist as a source, let's have a macro to make a temporary dummy buffer
from an ist. This will only result in structure field manipulations
that the compiler will quickly figure to eliminate them with inline
functions, and in other cases it will just use 4 words in the stack
before calling a function, instead of performing intermediary
conversions.
2019-10-17 10:40:47 +02:00
David Carlier
a92c5cec2d BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info
On Darwin, the thread_info name exists as a standard function thus
we need to rename our array to ha_thread_info to fix this conflict.
2019-10-17 07:15:17 +02:00
Christopher Faulet
065118166c MINOR: htx: Add a flag on HTX to known when a response was generated by HAProxy
The flag HTX_FL_PROXY_RESP is now set on responses generated by HAProxy,
excluding responses returned by applets and services. It is an informative flag
set by the applicative layer.
2019-10-16 10:03:12 +02:00
Willy Tarreau
abefa34c34 MINOR: version: make the version strings variables, not constants
It currently is not possible to figure the exact haproxy version from a
core file for the sole reason that the version is stored into a const
string and as such ends up in the .text section that is not part of a
core file. By turning them into variables we move them to the data
section and they appear in core files. In order to help finding them,
we just prepend an extra variable in front of them and we're able to
immediately spot the version strings from a core file:

  $ strings core | fgrep -A2 'HAProxy version'
  HAProxy version follows
  2.1-dev2-e0f48a-88
  2019/10/15

(These are haproxy_version and haproxy_date respectively). This may be
backported to 2.0 since this part is not support to impact anything but
the developer's time spent debugging.
2019-10-16 09:56:57 +02:00
Christopher Faulet
53a899b946 CLEANUP: h1-htx: Move htx-to-h1 formatting functions from htx.c to h1_htx.c
The functions "htx_*_to_h1()" have been renamed into "h1_format_htx_*()" and
moved in the file h1_htx.c. It is the right place for such functions.
2019-10-14 22:28:50 +02:00
Christopher Faulet
48fa033f28 BUG/MINOR: chunk: Fix tests on the chunk size in functions copying data
When raw data are copied or appended in a chunk, the result must not exceed the
chunk size but it can reach it. Unlike functions to copy or append a string,
there is no terminating null byte.

This patch must be backported as far as 1.8. Note in 1.8, the functions
chunk_cpy() and chunk_cat() don't exist.
2019-10-14 16:45:09 +02:00
William Lallemand
e0c51ae358 BUG/MINOR: ssl: fix build without SSL
Commits 222a7c6 and 150bfa8 introduced some SSL initialization in
bind_conf_alloc() which broke the build without SSL.

Issue #322.
2019-10-14 11:24:17 +02:00
William Lallemand
246c0246d3 MINOR: ssl: load the ocsp in/from the ckch
Don't try to load the files containing the issuer and the OCSP response
each time we generate a SSL_CTX.

The .ocsp and the .issuer are now loaded in the struct
cert_key_and_chain only once and then loaded from this structure when
creating a SSL_CTX.
2019-10-11 17:32:03 +02:00
William Lallemand
a17f4116d5 MINOR: ssl: load the sctl in/from the ckch
Don't try to load the file containing the sctl each time we generate a
SSL_CTX.

The .sctl is now loaded in the struct cert_key_and_chain only once and
then loaded from this structure when creating a SSL_CTX.

Note that this now make possible the use of sctl with multi-cert
bundles.
2019-10-11 17:32:03 +02:00
William Lallemand
150bfa84e3 MEDIUM: ssl/cli: 'set ssl cert' updates a certificate from the CLI
$ echo -e "set ssl cert certificate.pem <<\n$(cat certificate2.pem)\n" | \
    socat stdio /var/run/haproxy.stat
    Certificate updated!

The operation is locked at the ckch level with a HA_SPINLOCK_T which
prevents the ckch architecture (ckch_store, ckch_inst..) to be modified
at the same time. So you can't do a certificate update at the same time
from multiple CLI connections.

SNI trees are also locked with a HA_RWLOCK_T so reading operations are
locked only during a certificate update.

Bundles are supported but you need to update each file (.rsa|ecdsa|.dsa)
independently. If a file is used in the configuration as a bundle AND
as a unique certificate, both will be updated.

Bundles, directories and crt-list are supported, however filters in
crt-list are currently unsupported.

The code tries to allocate every SNIs and certificate instances first,
so it can rollback the operation if that was unsuccessful.

If you have too much instances of the certificate (at least 20000 in my
tests on my laptop), the function can take too much time and be killed
by the watchdog. This will be fixed later. Also with too much
certificates it's possible that socat exits before the end of the
generation without displaying a message, consider changing the socat
timeout in this case (-t2 for example).

The size of the certificate is currently limited by the maximum size of
a payload, that must fit in a buffer.
2019-10-11 17:32:03 +02:00
William Lallemand
1d29c7438e MEDIUM: ssl: split ssl_sock_add_cert_sni()
In order to allow the creation of sni_ctx in runtime, we need to split
the function to allow rollback.

We need to be able to allocate all sni_ctxs required before inserting
them in case we need to rollback if we didn't succeed the allocation.

The function was splitted in 2 parts.

The first one ckch_inst_add_cert_sni() allocates a struct sni_ctx, fill
it with the right data and insert it in the ckch_inst's list of sni_ctx.

The second will take every sni_ctx in the ckch_inst and insert them in
the bind_conf's sni tree.
2019-10-11 17:32:03 +02:00
William Lallemand
9117de9e37 MEDIUM: ssl: introduce the ckch instance structure
struct ckch_inst represents an instance of a certificate (ckch_node)
used in a bind_conf. Every sni_ctx created for 1 ckch_node in a
bind_conf are linked in this structure.

This patch allocate the ckch_inst for each bind_conf and inserts the
sni_ctx in its linked list.
2019-10-11 17:32:03 +02:00
William Lallemand
222a7c6ae0 MINOR: ssl: initialize explicitly the sni_ctx trees 2019-10-11 17:32:02 +02:00
William Lallemand
f6adbe9f28 REORG: ssl: move structures to ssl_sock.h 2019-10-11 17:32:02 +02:00
Olivier Houchard
804ef244c6 MINOR: lists: Fix alignement of \ when relevant.
Make sure all the \ are properly aligned in macroes, this contains no
functional change.
2019-10-11 16:56:25 +02:00
Olivier Houchard
74715da030 MINOR: lists: Try to use local variables instead of macro arguments.
When possible, use local variables instead of using the macro arguments
explicitely, otherwise they may be evaluated over and over.
2019-10-11 16:56:25 +02:00
Olivier Houchard
06910464dd MEDIUM: task: Split the tasklet list into two lists.
As using an mt_list for the tasklet list is costly, instead use a regular list,
but add an mt_list for tasklet woken up by other threads, to be run on the
current thread. At the beginning of process_runnable_tasks(), we just take
the new list, and merge it into the task_list.
This should give us performances comparable to before we started using a
mt_list, but allow us to use tasklet_wakeup() from other threads.
2019-10-11 16:37:41 +02:00
Willy Tarreau
d7f2bbcbe3 MINOR: list: add new macro MT_LIST_BEHEAD
This macro atomically cuts the head of a list and returns the list
of elements as a detached list, meaning that they're all linked
together without any head. If the list was empty, NULL is returned.
2019-10-11 16:37:41 +02:00
Willy Tarreau
c32a0e522f MINOR: lists: add new macro LIST_SPLICE_END_DETACHED
This macro adds a detached list at the end of an existing
list. The detached list is a list without head, containing
only elements.
2019-10-11 16:37:41 +02:00
Willy Tarreau
eaa55370c3 MINOR: stats: prepare to add a description with each stat/info field
Several times some users have expressed the non-intuitive aspect of some
of our stat/info metrics and suggested to add some help. This patch
replaces the char* arrays with an array of name_desc so that we now have
some reserved room to store a description with each stat or info field.
These descriptions are currently empty and not reported yet.
2019-10-10 11:30:07 +02:00
Willy Tarreau
2f39738750 MINOR: stats: support the "desc" output format modifier for info and stat
Now "show info" and "show stat" can parse "desc" as an output format
modifier that will be passed down the chain to add some descriptions
to the fields depending on the format in use. For now it is not
exploited.
2019-10-10 11:30:07 +02:00
Willy Tarreau
ab02b3f345 MINOR: stats: get rid of the STAT_SHOWADMIN flag
This flag is used to decide to show the check box in front of a proxy
on the HTML stat page. It is always equal to STAT_ADMIN except when the
proxy has no backend capability (i.e. a pure frontend) or has no server,
in which case it's only used to avoid leaving an empty column at the
beginning of the table. Not only this is pretty useless, but it also
causes the columns not to align well when mixing multiple proxies with
or without servers.

Let's simply always use STAT_ADMIN and get rid of this flag.
2019-10-10 11:30:07 +02:00
Willy Tarreau
708c41602b MINOR: stats: replace the ST_* uri_auth flags with STAT_*
We used to rely on some config flags defined in uri_auth.h set during
parsing, and another set of STAT_* flags defined in stats.h set at run
time, with a somewhat gray area between the two sets. This is confusing
in the stats code as both are called "flags" in various functions and
it's quite hard to know which one describes what.

This patch cleans this up by replacing all ST_* by a newly assigned
value from the STAT_* set so that we can now use unified flags to
describe both the configuration and the current state. There is no
functional change at all.
2019-10-10 11:30:07 +02:00
Willy Tarreau
ee4f5f83d3 MINOR: stats: get rid of the ST_CONVDONE flag
This flag was added in 1.4-rc1 by commit 329f74d463 ("[BUG] uri_auth: do
not attemp to convert uri_auth -> http-request more than once") to
address the case where two proxies inherit the stats settings from
the defaults instance, and the first one compiles the expression while
the second one uses it. In this case since they use the exact same
uri_auth pointer, only the first one should compile and the second one
must not fail the check. This was addressed by adding an ST_CONVDONE
flag indicating that the expression conversion was completed and didn't
need to be done again. But this is a hack and it becomes cumbersome in
the middle of the other flags which are all relevant to the stats
applet. Let's instead fix it by checking if we're dealing with an
alias of the defaults instance and refrain from compiling this twice.
This allows us to remove the ST_CONVDONE flag.

A typical config requiring this check is :

   defaults
        mode http
        stats auth foo:bar

   listen l1
        bind :8080

   listen l2
        bind :8181

Without this (or previous) check it would cmoplain when checking l2's
validity since the rule was already built.
2019-10-10 11:30:07 +02:00
Christopher Faulet
16fdc55f79 MINOR: http: Add a function to get the authority into a URI
The function http_get_authority() may be used to parse a URI and looks for the
authority, between the scheme and the path. An option may be used to skip the
user info (part before the '@'). Most of time, the user info will be ignored.
2019-10-09 11:05:31 +02:00
Christopher Faulet
9a67c293b9 MINOR: htx: Add 2 flags on the start-line to have more info about the uri
The first flag, HTX_SL_F_HAS_AUTHORITY, is set when the uri contains an
authority. For the H1, it happens when a CONNECT request is received or when an
absolute uri is used. For the H2, it happens when the pseudo header ":authority"
is provided.

The second one, HTX_SL_F_NORMALIZED_URI, is set when the received uri is
represented as an absolute uri because of the protocol requirements. For now, it
is only used for h2 requests, when the pseudo headers :authority and :scheme are
found. Internally, the uri is represented as an absolute uri. This flag allows
us to make the difference between an absolute uri in h1 and h2.
2019-10-09 11:05:31 +02:00
Christopher Faulet
c5a3eb4e3a MINOR: fcgi: Add function to get the string representation of a record type
This function will be used to emit traces in the FCGI multiplexer.
2019-10-04 16:12:02 +02:00
Christopher Faulet
27aa65ecfb MINOR: htx: Adapt htx_dump() to be used from traces
This function now dumps info about the HTX message into a buffer, passed as
argument. In addition, it is possible to only dump meta information, without the
message content.
2019-10-04 15:48:55 +02:00
Christopher Faulet
af542635f7 MINOR: h1-htx: Update h1_copy_msg_data() to ease the traces in the mux-h1
This function now uses the address of the pointer to the htx message where the
copy must be performed. This way, when a zero-copy is performed, there is no
need to refresh the caller's htx message. It is a bit easier to do that way,
especially to add traces in the mux-h1.
2019-10-04 15:46:59 +02:00
Willy Tarreau
2aaeee34da BUG/MEDIUM: fd: HUP is an error only when write is active
William reported that since commit 6b3089856f ("MEDIUM: fd: do not use
the FD_POLL_* flags in the pollers anymore") the master's CLI often
fails to access sub-processes. There are two causes to this. One is
that we did report FD_POLL_ERR on an FD as soon as FD_EV_SHUT_W was
seen, which is automatically inherited from POLLHUP. And since we do
not store the current shutdown state of an FD we can't know if the
poller reports a sudden close resulting from an error or just a
byproduct of a previous shutdown(WR) followed by a read0. The current
patch addresses this by only considering this when the FD was active,
since a shutdown FD is not active. The second issue is that *somewhere*
down the chain, channel data are ignored if an error is reported on a
channel. This results in content truncation, but this cause was not
figured yet.

No backport is needed.
2019-10-01 11:52:08 +02:00
Tim Duesterhus
07626eafa2 CLEANUP: proxy: Remove proxy_tbl_by_name
It is no longer required as of 1b8e68e89a
and is no longer used when #306 is fixed.
2019-09-30 04:11:36 +02:00
Christopher Faulet
88a0db28ae MINOR: stats: Add the support of float fields in stats
It is now possible to format stats counters as floats. But the stats applet does
not use it.

This patch is required by the Prometheus exporter to send the time averages in
seconds. If the promex change is backported, this patch must be backported
first.
2019-09-27 08:49:09 +02:00
Christopher Faulet
d72665b425 CLEANUP: http-ana: Remove the unused function http_send_name_header()
Because the HTTP multiplexers are now responsible to handle the option
"http-send-name-header", the function http_send_name_header() can be removed.
2019-09-27 08:48:53 +02:00
Christopher Faulet
b1bb1afa47 MINOR: spoe: Support the async mode with several threads
A different engine-id is now generated for each thread. So, it is possible to
enable the async mode with several threads.

This patch may be backported to older versions.
2019-09-26 16:51:02 +02:00
Willy Tarreau
93acfa2263 MINOR: time: add timeofday_as_iso_us() to return instant time as ISO
We often need ISO time + microseconds in traces and ring buffers, thus
function does this by calling gettimeofday() and keeping a cached value
of the part representing the tv_sec value, and only rewrites the microsecond
part. The cache is per-thread so it's lockless and safe to use as-is.
Some tests already show that it's easy to see 3-4 events in a single
microsecond, thus it's likely that the nanosecond version will have to
be implemented as well. But certain comments on the net suggest that
some parsers are having trouble beyond microsecond, thus for now let's
stick to the microsecond only.
2019-09-26 08:13:38 +02:00
Olivier Houchard
bba1a263c5 BUG/MEDIUM: tasklets: Make sure we're waking the target thread if it sleeps.
Now that we can wake tasklet for other threads, make sure that if the thread
is sleeping, we wake it up, or the tasklet won't be executed until it's
done sleeping.
That also means that, before going to sleep, and after we put our bit
in sleeping_thread_mask, we have to check that nobody added a tasklet for
us, just checking for global_tasks_mask isn't enough anymore.
2019-09-24 14:58:45 +02:00
Willy Tarreau
d022e9c98b MINOR: task: introduce a thread-local "sched" variable for local scheduler stuff
The aim is to rassemble all scheduler information related to the current
thread. It simply points to task_per_thread[tid] without having to perform
the operation at each time. We save around 1.2 kB of code on performance
sensitive paths and increase the request rate by almost 1%.
2019-09-24 11:23:30 +02:00
Willy Tarreau
d66d75656e MINOR: task: split the tasklet vs task code in process_runnable_tasks()
There are a number of tests there which are enforced on tasklets while
they will never apply (various handlers, destroyed task or not, arguments,
results, ...). Instead let's have a single TASK_IS_TASKLET() test and call
the tasklet processing function directly, skipping all the rest.

It now appears visible that the only unneeded code is the update to
curr_task that is never used for tasklets, except for opportunistic
reporting in the debug handler, which can only catch si_cs_io_cb,
which in practice doesn't appear in any report so the extra cost
incurred there is pointless.

This change alone removes 700 bytes of code, mostly in
process_runnable_tasks() and increases the performance by about
1%.
2019-09-24 11:23:30 +02:00
Willy Tarreau
2bd65a781e OPTIM: listeners: use tasklets for the multi-queue rings
Now that we can wake up a remote thread's tasklet, it's way more
interesting to use a tasklet than a task in the accept queue, as it
will avoid passing through all the scheduler. Just doing this increases
the accept rate by about 4%, overall recovering the slight loss
introduced by the tasklet change. In addition it makes sure that
even a heavily loaded scheduler (e.g. many very fast checks) will
not delay a connection accept.
2019-09-24 06:57:32 +02:00
Olivier Houchard
ff1e9f39b9 MEDIUM: tasklets: Make the tasklet list a struct mt_list.
Change the tasklet code so that the tasklet list is now a mt_list.
That means that tasklet now do have an associated tid, for the thread it
is expected to run on, and any thread can now call tasklet_wakeup() for
that tasklet.
One can change the associated tid with tasklet_set_tid().
2019-09-23 18:16:08 +02:00
Olivier Houchard
0cd6a976ff MINOR: mt_lists: Give MT_LIST_ADD, MT_LIST_ADDQ and MT_LIST_DEL a return value.
Make it so MT_LIST_ADD and MT_LIST_ADDQ return 1 if it managed to add the
item, 0 (because it was already in a list) otherwise.
Make it so MT_LIST_DEL returns 1 if it managed to remove the item from a
list, or 0 otherwise (because it was in no list).
2019-09-23 18:16:08 +02:00
Olivier Houchard
cb22ad4f71 MINOR: mt_lists: Do nothing in MT_LIST_ADD/MT_LIST_ADDQ if already in list.
Modify MT_LIST_ADD and MT_LIST_ADDQ to do nothing if the element is already
in a list.
2019-09-23 18:16:08 +02:00
Olivier Houchard
9570ecf662 MEDIUM: servers: Use LIST_DEL_INIT() instead of LIST_DEL().
In srv_add_to_idle_list(), use LIST_DEL_INIT instead of just LIST_DEL.
We're about to add the connection to a mt_list, and MT_LIST_ADD/MT_LIST_ADDQ
will be modified to make sure we're not adding the element if it's already
in a list.
2019-09-23 18:16:08 +02:00
Olivier Houchard
5e9b92cbff MINOR: mt_lists: Add new macroes.
Add a few new macroes to the mt_lists.
MT_LIST_LOCK_ELT()/MT_LIST_UNLOCK_ELT() helps locking/unlocking an element.
This should only be used if you know for sure nobody else will remove the
element from the list in the meanwhile.
mt_list_for_each_entry_safe() is an iterator, similar to
list_for_each_entry_safe().
It takes 5 arguments, item, list_head, member are similar to those of
the non-mt variant, tmpelt is a temporary pointer to a struct mt_list, while
tmpelt2 is a struct mt_list itself.
MT_LIST_DEL_SELF() can be used to delete an item while parsing the list with
mt_list_for_each_entry_safe(). It shouldn't be used outside, and you
shouldn't use MT_LIST_DEL() while using mt_list_for_each_entry_safe().
2019-09-23 18:16:08 +02:00
Olivier Houchard
859dc80f94 MEDIUM: list: Separate "locked" list from regular list.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
2019-09-23 18:16:08 +02:00
Christopher Faulet
78fbb9f991 MEDIUM: fcgi-app: Add FCGI application and filter
The FCGI application handles all the configuration parameters used to format
requests sent to an application. The configuration of an application is grouped
in a dedicated section (fcgi-app <name>) and referenced in a backend to be used
(use-fcgi-app <name>). To be valid, a FCGI application must at least define a
document root. But it is also possible to set the default index, a regex to
split the script name and the path-info from the request URI, parameters to set
or unset...  In addition, this patch also adds a FCGI filter, responsible for
all processing on a stream.
2019-09-17 10:18:54 +02:00
Christopher Faulet
63bbf284a1 MINOR: fcgi: Add code related to FCGI protocol
This code is independant and is only responsible to encode and decode part of
the FCGI protocol.
2019-09-17 10:18:54 +02:00
Christopher Faulet
4f0f88a9d0 MEDIUM: mux-h1/h1-htx: move HTX convertion of H1 messages in dedicated file
To avoid code duplication in the futur mux FCGI, functions parsing H1 messages
and converting them into HTX have been moved in the file h1_htx.c. Some
specific parts remain in the mux H1. But most of the parsing is now generic.
2019-09-17 10:18:54 +02:00
Christopher Faulet
341fac1eb2 MINOR: http: Add function to parse value of the header Status
It will be used by the mux FCGI to get the status a response.
2019-09-17 10:18:54 +02:00
Christopher Faulet
5c6fefc8eb MINOR: log: Provide a function to emit a log for an application
Application is a generic term here. It is a modules which handle its own log
server list, with no dependency on a proxy. Such applications can now call the
function app_log() to log messages, passing a log server list and a tag as
parameters. Internally, the function __send_log() has been adapted accordingly.
2019-09-17 10:18:54 +02:00
Christopher Faulet
130cf21709 MINOR: istbuf: Add the function b_isteqi()
This function compares a part of a buffer to an indirect string (ist), ignoring
the case of the characters.
2019-09-17 10:18:54 +02:00
Christopher Faulet
c16929658f MINOR: config: Support per-proxy and per-server post-check functions callbacks
Most of times, when a keyword is added in proxy section or on the server line,
we need to have a post-parser callback to check the config validity for the
proxy or the server which uses this keyword.

It is possible to register a global post-parser callback. But all these
callbacks need to loop on the proxies and servers to do their job. It is neither
handy nor efficient. Instead, it is now possible to register per-proxy and
per-server post-check callbacks.
2019-09-17 10:18:54 +02:00
Christopher Faulet
3ea5cbe6a4 MINOR: config: Support per-proxy and per-server deinit functions callbacks
Most of times, when any allocation is done during configuration parsing because
of a new keyword in proxy section or on the server line, we must add a call in
the deinit() function to release allocated ressources. It is now possible to
register a post-deinit callback because, at this stage, the proxies and the
servers are already releases.

Now, it is possible to register deinit callbacks per-proxy or per-server. These
callbacks will be called for each proxy and server before releasing them.
2019-09-17 10:18:54 +02:00
Christopher Faulet
e3d2a877fb MINOR: http-ana: Remove err_state field from http_msg
This field is not used anymore. In addition, the state HTTP_MSG_ERROR is now
only used when an error occurred during the body forward.
2019-09-17 10:18:54 +02:00
Christopher Faulet
505adfca51 MINOR: htx: Add a flag on HTX message to report processing errors
This new flag may be used to report unexpected error because of not well
formatted HTX messages (not related to a parsing error) or our incapactity to
handle the processing because we reach a limit (ressource exhaustion, too big
headers...). It should result to an error 500 returned to the client when
applicable.
2019-09-17 10:18:54 +02:00
Christopher Faulet
6338a08c34 MINOR: stats: Add JSON export from the stats page
It is now possible to export stats using the JSON format from the HTTP stats
page. Like for the CSV export, to export stats in JSON, you must add the option
";json" on the stats URL. It is also possible to dump the JSON schema with the
option ";json-schema". Corresponding Links have been added on the HTML page.

This patch fixes the issue #263.
2019-09-10 10:29:54 +02:00
Willy Tarreau
f21d17bbe8 MINOR: stats: report the number of idle connections for each server
This adds two extra fields to the stats, one for the current number of idle
connections and one for the configured limit. A tooltip link now appears on
the HTML page to show these values in front of the active connection values.

This should be backported to 2.0 and 1.9 as it's the only way to monitor
the idle connections behaviour.
2019-09-08 09:30:50 +02:00
Willy Tarreau
4cae3bf631 BUG/MEDIUM: connection: don't keep more idle connections than ever needed
When using "http-reuse safe", which is the default, a new incoming connection
does not automatically reuse an existing connection for the first request, as
we don't want to risk to lose the contents if we know the client will not be
able to replay the request. A side effect to this is that when dealing with
mostly http-close traffic, the reuse rate is extremely low and we keep
accumulating server-side connections that may even never be reused. At some
point we're limited to a ratio of file descriptors, but when the system is
configured with very high FD limits, we can still reach the limit of outgoing
source ports and make the system significantly slow down trying to find an
available port for outgoing connections. A simple test on my laptop with
ulimit 100000 and with the following config results in the load immediately
dropping after a few seconds :

   listen l1
        bind :4445
        mode http
        server s1 127.0.0.1:8000

As can be seen, the load falls from 38k cps to 400 cps during the first 200ms
(in fact when the source port table is full and connect() takes ages to find
a spare port for a new connection):

   $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100
   hits ^hits hits/s  ^h/s     bytes  kB/s  last  errs  tout htime  sdht ptime
   2439  2439  39338 39338    356094  5743  5743     0     0 0.4 0.5 0.4
   7637  5198  38185 37666   1115002  5575  5499     0     0 0.7 0.5 0.7
   7719    82  25730   820   1127002  3756   120     0     0 21.8 18.8 21.8
   7797    78  19492   780   1138446  2846   114     0     0 61.4 2.5 61.4
   7877    80  15754   800   1150182  2300   117     0     0 58.6 0.5 58.6
   7920    43  13200   430   1156488  1927    63     0     0 58.9 0.3 58.9

At this point, lots of connections are indeed in use, for only 10 connections
on the frontend side:

   $ ss -ant state established | wc -l
   39022

This patch makes sure we never keep more idle connections than we've ever
had outstanding requests on a server. This way the total number of idle
connections will never exceed the sum of maximum connections. Thus highly
loaded servers will be able to get many connections and slightly loaded
servers will keep less. Ideally we should apply similar limits per process
and the per backend, but in practice this already addresses the issues
pretty well:

   $ injectl464 -p 4 -o 1 -u 10 -G 127.0.0.1:4445/ -F -c -w 100
   hits ^hits hits/s  ^h/s     bytes  kB/s  last  errs  tout htime  sdht ptime
   4423  4423  40209 40209    645758  5870  5870     0     0 0.2 0.4 0.2
   8020  3597  40100 39966   1170920  5854  5835     0     0 0.2 0.4 0.2
  12037  4017  40123 40170   1757402  5858  5864     0     0 0.2 0.4 0.2
  16069  4032  40172 40320   2346074  5865  5886     0     0 0.2 0.4 0.2
  20047  3978  40013 39386   2926862  5842  5750     0     0 0.3 0.4 0.3
  24005  3958  40008 39979   3504730  5841  5837     0     0 0.2 0.4 0.2

   $ ss -ant state established | wc -l
   234

This patch must be backported to 2.0. It could be useful in 1.9 as well
eventhough pools and reuse are not enabled by default there.
2019-09-08 09:30:50 +02:00
Willy Tarreau
6b3089856f MEDIUM: fd: do not use the FD_POLL_* flags in the pollers anymore
As mentioned in previous commit, these flags do not map well to
modern poller capabilities. Let's use the FD_EV_*_{R,W} flags instead.
This first patch only performs a 1-to-1 mapping making sure that the
previously reported flags are still reported identically while using
the closest possible semantics in the pollers.

It's worth noting that kqueue will now support improvements such as
returning distinctions between shut and errors on each direction,
though this is not exploited for now.
2019-09-06 19:09:56 +02:00
Willy Tarreau
77abb43ed1 MINOR: fd: add two flags ERR and SHUT to describe FD states
There's currently a big ambiguity on our use of POLLHUP because we
currently map POLLHUP and POLLRDHUP to FD_POLL_HUP. The first one
indicates a close in *both* directions while the second one indicates
a unidirectional close. Since we don't know from the resulting flag
we always have to read when reported. Furthermore kqueue only reports
unidirectional responses which are mapped to FD_POLL_HUP as well, and
their write closes are mapped to a general error.

We could add a new FD_POLL_RDHUP flag to improve the mapping, or
switch only to the POLL* flags, but that further complicates the
portability for operating systems like FreeBSD which do not have
POLLRDHUP but have its semantics.

Let's instead directly use the per-direction flag values we already
have, and it will be a first step in the direction of finer states.
Thus we introduce an ERR and a SHUT status for each direction, that
the pollers will be able to compute and pass to fd_update_events().

It's worth noting that FD_EV_STATUS already sees the two new flags,
but they are harmless since used only by fd_{recv,send}_state() which
are never called. Thus in its current state this patch must be totally
transparent.
2019-09-06 18:33:07 +02:00
Willy Tarreau
8f2825f3ab MINOR: fd: add two new calls fd_cond_{recv,send}()
These two functions are used to enable recv/send but only if the FD is
not marked as active yet. The purpose is to conditionally mark them as
tentatively usable without interfering with the polling if polling was
already enabled, when it's supposed to be likely true.
2019-09-06 17:50:36 +02:00
Willy Tarreau
4ac9d064d2 MEDIUM: fd: mark the FD as ready when it's inserted
Given that all our I/Os are now directed from top to bottom and not the
opposite way around, and the FD cache was removed, it doesn't make sense
anymore to create FDs that are marked not ready since this would prevent
the first accesses unless the caller explicitly does an fd_may_recv()
which is not expected to be its job (which conn_ctrl_init() has to do
by the way). Let's move this into fd_insert() instead, and have a single
atomic operation for both directions via fd_may_both().
2019-09-06 17:50:36 +02:00
Willy Tarreau
dbe3060e81 MINOR: fd: make updt_fd_polling() a normal function
It's called from many places, better use a real function than an inline.
2019-09-05 09:31:18 +02:00
Willy Tarreau
f8ecc7f667 MEDIUM: fd: simplify the fd_*_{recv,send} functions using BTS/BTR
Now that we don't have to update FD_EV_POLLED_* at the same time as
FD_EV_ACTIVE_*, we don't need to use a CAS anymore, a bit-test-and-set
operation is enough. Doing so reduces the code size by a bit more than
1 kB. One function was special, fd_done_recv(), whose comments and doc
were inaccurate for the part related to the lack of polling.
2019-09-05 09:31:18 +02:00
Willy Tarreau
5bee3e2f47 MEDIUM: fd: remove the FD_EV_POLLED status bit
Since commit 7ac0e35f2 in 1.9-dev1 ("MAJOR: fd: compute the new fd polling
state out of the fd lock") we've started to update the FD POLLED bit a
bit more aggressively. Lately with the removal of the FD cache, this bit
is always equal to the ACTIVE bit. There's no point continuing to watch
it and update it anymore, all it does is create confusion and complicate
the code. One interesting side effect is that it now becomes visible that
all fd_*_{send,recv}() operations systematically call updt_fd_polling(),
except fd_cant_recv()/fd_cant_send() which never saw it change.
2019-09-05 09:31:18 +02:00
Willy Tarreau
c046d167e4 MEDIUM: log: add support for logging to a ring buffer
Now by prefixing a log server with "ring@<name>" it's possible to send
the logs to a ring buffer. One nice thing is that it allows multiple
sessions to consult the logs in real time in parallel over the CLI, and
without requiring file system access. At the moment, ring0 is created as
a default sink for tracing purposes and is available. No option is
provided to create new rings though this is trivial to add to the global
section.
2019-08-30 15:24:59 +02:00
Willy Tarreau
f3dc30f6de MINOR: log: add a target type instead of hacking the address family
Instead of detecting an AF_UNSPEC address family for a log server and
to deduce a file descriptor, let's create a target type field and
explicitly mention that the socket is of type FD.
2019-08-30 15:07:25 +02:00
Willy Tarreau
d660990cee MINOR: fd: add a new "initialized" bit in the fdtab struct
The purpose is to be able to remember that initialization was already
done for a file descriptor. This will allow to get rid of some dirty
hacks performed in the logs or fd sinks where the init state of the
fd has to be guessed.
2019-08-30 15:07:25 +02:00
Willy Tarreau
76913d3ef4 CLEANUP: fd: remove leftovers of the fdcache
The "cache" entry was still present in the fdtab struct and it was
reported in "show sess". Removing it broke the cache-line alignment
on 64-bit machines which is important for threads, so it was fixed
by adding an attribute(aligned()) when threads are in use. Doing it
only in this case allows 32-bit thread-less platforms to see the
struct fit into 32 bytes.
2019-08-30 15:07:25 +02:00
Willy Tarreau
1d181e489c MEDIUM: ring: implement a wait mode for watchers
Now it is possible for a reader to subscribe and wait for new events
sent to a ring buffer. When new events are written to a ring buffer,
the applets that are subscribed are woken up to display new events.
For now we only support this with the CLI applet called by "show events"
since the I/O handler is indeed a CLI I/O handler. But it's not
complicated to add other mechanisms to consume events and forward them
to external log servers for example. The wait mode is enabled by adding
"-w" after "show events <sink>". An extra "-n" was added to directly
seek to new events only.
2019-08-30 11:58:58 +02:00
Willy Tarreau
300decc8d9 MINOR: cli: extend the CLI context with a list and two offsets
Some CLI parsers are currently abusing the CLI context types such as
pointers to stuff longs into them by lack of room. But the context is
80 bytes while cli is only 48, thus there's some room left. This patch
adds a list element and two size_t usable as various offsets. The list
element is initialized.
2019-08-30 11:58:58 +02:00
Willy Tarreau
370a694879 MINOR: trace: change the detail_level to per-source verbosity
The detail level initially based on syslog levels is not used, while
something related is missing, trace verbosity, to indicate whether or
not we want to call the decoding callback and what level of decoding
we want (raw captures etc). Let's change the field to "verbosity" for
this. A verbosity of zero means that the decoding callback is not
called, and all other levels are handled by this callback and are
source-specific. The source is now prompted to list the levels that
are proposed to the user. When the source doesn't define anything,
"quiet" and "default" are available.
2019-08-29 17:11:25 +02:00
Willy Tarreau
09fb0df6fd MINOR: trace: prepend the function name for developer level traces
Working on adding traces to mux-h2 revealed that the function names are
manually copied a lot in developer traces. The reason is that they are
not preprocessor macros and as such cannot be concatenated. Let's
slightly adjust the trace() function call to take a function name just
after the file:line argument. This argument is only added for the
TRACE_DEVEL and 3 new TRACE_ENTER, TRACE_LEAVE, and TRACE_POINT macros
and left NULL for others. This way the function name is only reported
for traces aimed at the developers. The pretty-print callback was also
extended to benefit from this. This will also significantly shrink the
data segment as the "entering" and "leaving" strings will now be merged.

One technical point worth mentioning is that the function name is *not*
passed as an ist to the inline function because it's not considered as
a builtin constant by the compiler, and would lead to strlen() being
run on it from all call places before calling the inline function. Thus
instead we pass the const char * (that the compiler knows where to find)
and it's the __trace() function that converts it to an ist for internal
consumption and for the pretty-print callback. Doing this avoids losing
5-10% peak performance.
2019-08-29 17:09:13 +02:00
Willy Tarreau
2ea549bc43 MINOR: trace: change the "payload" level to "data" and move it
The "payload" trace level was ambigous because its initial purpose was
to be able to dump received data. But it doesn't make sense to force to
report data transfers just to be able to report state changes. For
example, all snd_buf()/rcv_buf() operations coming from the application
layer should be tagged at this level. So here we move this payload level
above the state transitions and rename it to avoid the ambiguity making
one think it's only about request/response payload. Now it clearly is
about any data transfer and is thus just below the developer level. The
help messages on the CLI and the doc were slightly reworded to help
remove this ambiguity.
2019-08-29 10:46:11 +02:00
Willy Tarreau
be5a288424 MINOR: trace: replace struct trace_lockon_args with struct name_desc
No need for a specific struct anymore, name_desc suits us.
2019-08-29 09:34:53 +02:00
Willy Tarreau
fb4ba91ac1 MINOR: tools: add a generic struct "name_desc" for name-description pairs
In prompts on the CLI we now commonly need to propose a keyword name
and a description and it doesn't make sense to define a new struct for
each such pairs. Let's simply have a generic "name_desc" for this.
2019-08-29 09:34:53 +02:00
Geoff Simmons
7185b789f9 MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2
Save the authority TLV in a PROXYv2 header from the client connection,
if present, and make it available as fc_pp_authority.

The fetch can be used, for example, to set the SNI for a backend TLS
connection.
2019-08-28 17:16:20 +02:00
Willy Tarreau
c326ecc9b1 MINOR: trace: change the TRACE() calling convention to put the args and cb last
Previously the callback was almost mandatory so it made sense to have it
before the message. Now that it can default to the one declared in the
trace source, most TRACE() calls contain series of empty args and callbacks,
which make them suitable for being at the end and being totally omitted.

This patch thus reverses the TRACE arguments so that the message appears
first, then the mask, then arg1..arg4, then the callback. In practice
we'll mostly see 1 arg, or 2 args and nothing else, and it will not be
needed anymore to pass long series of commas in the middle of the
arguments. However if a source is enforced, the empty commas will still
be needed for all omitted arguments.
2019-08-28 10:39:43 +02:00
Willy Tarreau
3da0026d25 MINOR: trace: support a default callback for the source
It becomes apparent that most traces will use a single trace pretty
print callback, so let's allow the trace source to declare a default
one so that it can be omitted from trace calls, and will be used if
no other one is specified.
2019-08-28 07:06:23 +02:00
Willy Tarreau
8f24023ba0 MINOR: sink: now report the number of dropped events on output
The principle is that when emitting a message, if some dropped events
were logged, we first attempt to report this counter before going
further. This is done under an exclusive lock while all logs are
produced under a shared lock. This ensures that the dropped line is
accurately reported and doesn't accidently arrive after a later
event.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4ed23ca0e7 MINOR: sink: add support for ring buffers
This now provides sink_new_buf() which allocates a ring buffer. One such
ring ("buf0") of 1 MB is created already, and may be used by sink_write().
The sink's creation should probably be moved somewhere else later.
2019-08-27 17:14:19 +02:00
Willy Tarreau
072931cdcb MINOR: ring: add a generic CLI io_handler to dump a ring buffer
The three functions (attach, IO handler, and release) are meant to be
called by any CLI command which requires to dump the contents of a ring
buffer. We do not implement anything generic to dump any ring buffer on
the CLI since it's meant to be used by other functionalities above.
However these functions deal with locking and everything so it's trivial
to embed them in other code.
2019-08-27 17:14:19 +02:00
Willy Tarreau
be97853c2f MINOR: ring: add a ring_write() function
This function tries to write to the ring buffer, possibly removing enough
old messages to make room for the new one. It takes two arrays of fragments
on input to ease the insertion of prefixes by the caller. It atomically
writes the message, possibly truncating it if desired, and returns the
operation's status.
2019-08-27 17:14:19 +02:00
Willy Tarreau
172945fbad MINOR: ring: add a new mechanism for retrieving/storing ring data in buffers
Our circular buffers are well suited for being used as ring buffers for
not-so-structured data. The machanism here consists in making room in a
buffer before inserting a new record which is prefixed by its size, and
looking up next record based on the previous one's offset and size. We
can have up to 255 consumers watching for data (dump in progress, tail)
which guarantee that entrees are not recycled while they're being dumped.
The complete representation is described in the header file. For now only
ring_new(), ring_resize() and ring_free() are created.
2019-08-27 17:14:19 +02:00
Willy Tarreau
931d8b79a8 MINOR: fd: add fd_write_frag_line() to send a fragmented line to an fd
Currently both logs and event sinks may use a file descriptor to
atomically emit some output contents. The two may use the same FD though
nothing is done to make sure they use the same lock. Also there is quite
some redundancy between the two. Better make a specific function to send
a fragmented message to a file descriptor which will take care of the
locking via the fd's lock. The function is also able to truncate a
message and to enforce addition of a trailing LF when building the
output message.
2019-08-27 17:14:19 +02:00
Willy Tarreau
b88d231773 MINOR: buffer: add functions to read/write varints from/to buffers
The new functions are :
  __b_put_varint() : inserts a varint when it's known that it fits
  b_put_varint()   : tries to insert a varint at the tail
  b_get_varint()   : tries to get a varint from the head
  b_peek_varint()  : tries to peek a varint at a specific offset

Wrapping is supported so that they are expected to be safe to use to
manipulate varints with buffers anywhere.
2019-08-27 17:14:19 +02:00
Willy Tarreau
4d589e719b MINOR: tools: add a function varint_bytes() to report the size of a varint
It will sometimes be useful to encode varints to know the output size in
advance. Two versions are provided, one inline using a switch/case construct
which will be trivial for use with constants (and will be very fast albeit
huge) and one function iterating on the number which is 5 times smaller,
for use with variables.
2019-08-27 17:14:19 +02:00
Willy Tarreau
e40f274878 BUILD: trace: make the lockon_ptr const to silence a warning without threads
I forgot to fix this one before pushing, despite my tests. lockon_ptr is
only used to compare pointers, it doesn't need to point to a writable
location. Without threads the atomic store is turned into an assignment
and rightfully complains.
2019-08-22 20:26:28 +02:00
Willy Tarreau
c14eea49e6 MINOR: trace: add the possibility to lock on some arguments
Given that we can pass typed arguments to the trace() function, let's
add provisions for tracking them. They are source-specific so we need
to let the source fill their name and description. Only those with a
non-null name will be proposed.
2019-08-22 20:21:00 +02:00
Willy Tarreau
17a51c64b5 MINOR: trace: add a definition of typed arguments to trace()
With a few macros it's possible for a trace source to commit to only
using a certain type for a given argument (or set of). This will be
particularly useful to let the trace subsystem retrieve some precious
information such as a connection, session, listener, source address or
so, and enable/disable filtering and/or locking.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4ab242136d MINOR: trace: add per-level macros to produce traces
The new TRACE_<level>() macros take a mask, 4 args, a callback and a
static message. From this they also inherit the TRACE_SOURCE macro from
the caller, which contains the pointer to the trace source (so that it's
not required to paste it everywhere), and an ist string is also made by
the concatenation of the file name and the line number. This uses string
concatenation by the preprocessor, and turns it into an ist by the compiler
so that there is no operation at all to perform to adjust the data length
as the compiler knows where to cut during the optimization phase. Last,
the message is also automatically turned into an ist so that it's trivial
to put it into an iovec without having to run strlen() on it.

All arguments and the callback may be empty and will then automatically
be replaced with a NULL pointer. This makes the TRACE calls slightly
lighter especially since arguments are not always used. Several other
options were considered to use variadic macros but there's no outstanding
rule that justifies to place an argument before another one, and it still
looks convenient to have the message be the last one to encourage copy-
pasting of the trace statements.

A generic TRACE() macro takes TRACE_LEVEL in from the source file as the
trace level instead of taking it from its name. This may slightly simplify
the production of traces that always run at the same level (internal core
parts may probably only be called at developer level).
2019-08-22 20:21:00 +02:00
Willy Tarreau
bfd14fc6eb MINOR: trace: implement a call to a decode function
The trace() call will support an optional decoding callback and 4
arguments that this function is supposed to know how to use to provide
extra information. The output remains unchanged when the function is
NULL. Otherwise, the message is pre-filled into the thread-local
trace_buf, and the function is called with all arguments so that it
completes the buffer in a readable form depending on the expected
level of detail.
2019-08-22 20:21:00 +02:00
Willy Tarreau
5da408818b MINOR: trace: make trace() now also take a level in argument
This new "level" argument will allow the trace sources to label the
traces for different purposes, and filter out some of them if they
are not relevant to the current target. Right now we have 5 different
levels:
  - USER : the least verbose one, only a few functional information
  - PAYLOAD: like user but also displays some payload-related information
  - PROTO: focuses on the protocol's framing
  - STATE: also indicate state internal transitions or non-transitions
  - DEVELOPER: adds extra info about branches taken in the code (break
    points, return points)
2019-08-22 20:21:00 +02:00
Willy Tarreau
419bd49f0b MINOR: trace: add the file name and line number in the prefix
We now pass an extra argument "where" to the trace() call, which
is supposed to be an ist made of the concatenation of the filename
and the line number. We only keep the last 10 chars from this string
since the end of file names is most often easy to recognize. This
gives developers useful information at very low cost.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4c2ae48375 MINOR: trace: implement a very basic trace() function
For now it remains quite basic. It performs a few state checks, calls
the source's sink if defined, and performs the transitions between
RUNNING, STOPPED and WAITING when the configured events match.
2019-08-22 20:21:00 +02:00
Willy Tarreau
864e880f6c MINOR: trace/cli: register the "trace" CLI keyword to list the sources
For now it lists the sources if one is not provided, and checks
for the source's existence. It lists the events if not provided,
checks for their existence if provided, and adjusts reported
events/start/stop/pause events, and performs state transitions.
It lists sinks and adjusts them as well. Filters, lock, and
level are not implemented yet.
2019-08-22 20:21:00 +02:00
Willy Tarreau
88ebd4050e MINOR: trace: add allocation of buffer-sized trace buffers
This will be needed so that we can implement protocol decoders which will
have to emit their contents into such a buffer.
2019-08-22 20:21:00 +02:00
Willy Tarreau
4151c753fc MINOR: trace: start to create a new trace subsystem
The principle of this subsystem will be to support taking live traces
at various places in the code with conditional triggers, filters, and
ability to lock on some elements. The traces will support typed events
and will be sent into sinks made of ring buffers, file descriptors or
remote servers.
2019-08-22 20:21:00 +02:00
Willy Tarreau
973e662fe8 MINOR: sink: add a support for file descriptors
This is the most basic type of sink. It pre-registers "stdout" and
"stderr", and is able to use writev() on them. The writev() operation
is locked to avoid mixing outputs. It's likely that the registration
should move somewhere else to take into account the fact that stdout
and stderr are still opened or are closed.
2019-08-22 20:21:00 +02:00
Willy Tarreau
67b5a161b4 MINOR: sink: create definitions a minimal code for event sinks
The principle will be to be able to dispatch events to various destinations
called "sinks". This is already done in part in logs where log servers can
be either a UDP socket or a file descriptor. This will be needed with the
new trace subsystem where we may also want to add ring buffers. And it turns
out that all such destinations make sense at all places. Logs may need to be
sent to a TCP server via a ring buffer, or consulted from the CLI. Trace
events may need to be sent to stdout/stderr as well as to remote log servers.

This patch creates a new structure "sink" aiming at addressing these similar
needs. The goal is to merge together what is common to all of them, such as
the output format, the dropped events count, etc, and also keep separately
the target identification (network address, file descriptor). Provisions
were made to have a "waiter" on the sink. For a TCP log server it will be
the task to wake up after writing to the log buffer. For a ring buffer, it
could be the list of watchers on the CLI running a "tail" operation and
waiting for new events. A lock was also placed in the struct since many
operations will require some locking, including the FD ones. The output
formats covers those in use by logs and two extra ones prepending the ISO
time in front of the message (convenient for stdio/buffer).

For now only the generic infrastructure is present, no type-specific
output is implemented. There's the sink_write() function which prepares
and formats a message to be sent, trying hard to avoid copies and only
using pointer manipulation, where the type-specific code just has to be
added. Dropped messages are already counted (for now 100% drop). The
message is put into an iovec array as it will be trivial to use with
file descriptors and sockets.
2019-08-22 20:21:00 +02:00
Willy Tarreau
9eebd8a978 REORG: trace: rename trace.c to calltrace.c and mention it's not thread-safe
The function call tracing code is a quite old and was never ported to
support threads. It's not even sure whether it still works well, but
at least its presence creates confusion for future work so let's rename
it to calltrace.c and add a comment about its lack of thread-safety.
2019-08-22 20:21:00 +02:00
Willy Tarreau
32c24552e4 MINOR: tools: add a DEFNULL() macro to use NULL for empty args
It's sometimes convenient for debugging macros not to be forced to
explicitly pass NULL in an unused argument. This macro does this, it
replaces a missing arg with NULL.
2019-08-22 20:21:00 +02:00
Willy Tarreau
9bead8c7f5 MINOR: list: add LIST_SPLICE() to merge one list into another
This will move the contents of list <old> at the beginning of list
<new>.
2019-08-22 20:21:00 +02:00
Willy Tarreau
60409db0b1 MINOR: lua: export applet and task handlers
The current functions are seen outside from the debugging code and are
convenient to export so that we can improve the thread dump output :

  void hlua_applet_tcp_fct(struct appctx *ctx);
  void hlua_applet_http_fct(struct appctx *ctx);
  struct task *hlua_process_task(struct task *task, void *context, unsigned short state);

Of course they are only available when USE_LUA is defined.
2019-08-21 14:32:09 +02:00
Willy Tarreau
a2c9911ace MINOR: tools: add append_prefixed_str()
This is somewhat related to indent_msg() except that this one places a
known prefix at the beginning of each line, allows to replace the EOL
character, and not to insert a prefix on the first line if not desired.
It works with a normal output buffer/chunk so it doesn't need to allocate
anything nor to modify the input string. It is suitable for use in multi-
line backtraces.
2019-08-21 14:32:09 +02:00
Willy Tarreau
f5cab82025 MINOR: fd: make sure to mark the thread as not stuck in fd_update_events()
When I/O events are being processed, we want to make sure to mark the
thread as not stuck. The reason is that some pollers (like poll()) which
do not limit the number of FDs they report could possibly report a huge
amount of FD all having to perform moderately expensive operations in
the I/O callback (e.g. via mux-pt which forwards to the upper layers),
making the watchdog think the thread is stuck since it does not schedule.
Of course this must never happen but if it ever does we must be liberal
about it.

This should be backported to 2.0, where the situation may happen more
easily due to the FD cache which can start to collect a large amount of
events. It may be related to the report in issue #201 though nothing is
certain about it.
2019-08-16 16:06:14 +02:00
Willy Tarreau
edb91ad647 MINOR: cli: add cli_msg(), cli_err(), cli_dynmsg(), cli_dynerr()
These functions perform all the boring filling of the appctx's
cli struct needed by CLI parsers to return a message or an error,
and they return 1 so that they can be used as a single-line return
statement. They may be used for const messages or dynamic messages.
2019-08-09 10:11:38 +02:00
Willy Tarreau
d50c7feaa1 MINOR: cli: add two new states to print messages on the CLI
Right now we used to have extremely inconsistent states to report output,
one is CLI_ST_PRINT which prints constant message cli->msg with the
assigned severity, and CLI_ST_PRINT_FREE which prints dynamically
allocated cli->err with severity LOG_ERR, and nothing in between,
eventhough it's useful to be able to report dynamically allocated
messages as well as constant error messages.

This patch adds two extra states, which are not particularly well named
given the constraints imposed by existing ones. One is CLI_ST_PRINT_ERR
which prints a constant error message. The other one is CLI_ST_PRINT_DYN
which prints a dynamically allocated message. By doing so we maintain
the compatibility with current code.

It is important to keep in mind that we cannot pre-initialize pointers
and automatically detect what message type it is based on the assigned
fields, because the CLI's context is in a union shared with all other
users, thus unused fields contain anything upon return. This is why we
have no choice but using 4 states. Keeping the two fields <msg> and
<err> remains useful because one is const and not the other one, and
this catches may copy-paste mistakes. It's just that <err> is pretty
confusing here, it should be renamed.
2019-08-09 10:11:38 +02:00
Willy Tarreau
247a8b1d81 CLEANUP: task: move the cpu_time field to the task-only part
The CPU time accounting field called "cpu_time" is used only by tasks
and not tasklets, yet it used to be stored into the TASK_COMMON part,
which doesn't make sense and wastes tasklet memory. In addition, moving
it to tasks also helps better group the various parts in cache lines.
2019-08-08 10:11:05 +02:00
Willy Tarreau
e0d0b4089d CLEANUP: buffer: replace b_drop() with b_free()
Since last commit there's no point anymore in having two variants of the
same function, let's switch to b_free() only. __b_drop() was renamed to
__b_free() for obvious consistency reasons.
2019-08-08 08:07:45 +02:00
Willy Tarreau
3b091f80aa BUG/MINOR: buffers/threads: always clear a buffer's head before releasing it
A small race exists in buffers with "show sess all". This one wants to show
some information grabbed from the buffer (especially in HTX mode). But the
thread owning this buffer might just be releasing its area, right after a
free() or munmap() call, resulting in a head that is not seen as empty yet
though the area was released. It may then be dereferenced by "show sess all"
causing a crash. Note that in practice it only happens in debug mode with
UAF enabled, but it's tricky enough to fix it right now.

This should be backported to stable versions which support threads and a
store barrier. It's worth noting that by performing the clearing first,
b_free() and b_drop() now become two exact equivalent.
2019-08-08 08:07:45 +02:00
Willy Tarreau
229e739c21 BUG/MINOR: pools: don't mark the thread harmless if already isolated
Commit 85b2cae63 ("MINOR: pools: make the thread harmless during the
mmap/munmap syscalls") was used to relax the pressure experienced by
other threads when running in debug mode with UAF enabled. It places
a pair of thread_harmless_now()/thread_harmless_end() around the call
to mmap(), assuming callers are not sensitive to parallel activity.
But there are a few cases like "show sess all" where this happens in
isolated threads, and marking the thread as harmless there is a very
bad idea, even worse when arriving to thread_harmless_end() which loops
forever.

Let's only do that when the thread is not isolated. No backport is
needed as the patch above was only in 2.1-dev.
2019-08-08 07:41:52 +02:00
Frdric Lcaille
be36793d1d BUG/MEDIUM: stick-table: Wrong stick-table backends parsing.
When parsing references to stick-tables declared as backends, they are added to
a list of proxies (they are proxies!) which refer to this stick-tables.
Before this patch we added them to these list without checking they were already
present, making the silly hypothesis the actions/sample were checked/resolved in the same
order the proxies are parsed.

This patch implement a simple inline function to in_proxies_list() to test
the presence of a proxy in a list of proxies. We use this function when resolving
/checking samples/actions.

This bug was introduced by 015e4d7 commit.

Must be backported to 2.0.
2019-08-07 10:32:31 +02:00
Olivier Houchard
4c18f94c11 BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2
In stream_set_backend(), if we have a TCP stream, and we want to upgrade it
to H2 instead of attempting ot reuse the stream, just destroy the
conn_stream, make sure we don't log anything about the stream, and pretend
we failed setting the backend, so that the stream will get destroyed.
New streams will then be created by the mux, as if the connection just
happened.
This fixes a crash when upgrading from TCP to H2, as the H2 mux totally
ignored the conn_stream provided by the upgrade, as reported in github
issue #196.

This should be backported to 2.0.
2019-08-02 18:28:58 +02:00
Emmanuel Hocdet
f580d0f391 BUILD: ssl: BoringSSL add EVP_PKEY_base_id
Remove EVP_PKEY_base_id compatibility, it is now included in BoringSSL.
2019-08-01 11:21:42 +02:00
Willy Tarreau
a37cb1880c MINOR: wdt: also consider that waiting in the thread dumper is normal
It happens that upon looping threads the watchdog fires, starts a dump,
and other threads expire their budget while waiting for the other threads
to get dumped and trigger a watchdog event again, adding some confusion
to the traces. With this patch the situation becomes clearer as we export
the list of threads being dumped so that the watchdog can check it before
deciding to trigger. This way such threads in queue for being dumped are
not attempted to be reported in turn.

This should be backported to 2.0 as it helps understand stack traces.
2019-07-31 19:35:31 +02:00
Olivier Houchard
53055055c5 MEDIUM: pollers: Remember the state for read and write for each threads.
In the poller code, instead of just remembering if we're currently polling
a fd or not, remember if we're polling it for writing and/or for reading, that
way, we can avoid to modify the polling if it's already polled as needed.
2019-07-31 14:54:41 +02:00
Olivier Houchard
305d5ab469 MAJOR: fd: Get rid of the fd cache.
Now that the architecture was changed so that attempts to receive/send data
always come from the upper layers, instead of them only trying to do so when
the lower layer let them know they could try, we can finally get rid of the
fd cache. We don't really need it anymore, and removing it gives us a small
performance boost.
2019-07-31 14:12:55 +02:00
Willy Tarreau
5e83d996cf BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue()
A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.

While the latter seems more natural, it is fundamentally incompatible with
mayn other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. Howwever reversing the lock in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.

The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.

Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.

This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.

The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".

This must be backported as far as 1.8.
2019-07-30 14:02:06 +02:00
Christopher Faulet
bfab2dddad MINOR: hlua: Add a flag on the lua txn to know in which context it can be used
When a lua action or a lua sample fetch is called, a lua transaction is
created. It is an entry in the stack containing the class TXN. Thanks to it, we
can know the direction (request or response) of the call. But, for some
functions, it is also necessary to know if the buffer is "HTTP ready" for the
given direction. "HTTP ready" means there is a valid HTTP message in the
channel's buffer. So, when a lua action or a lua sample fetch is called, the
flag HLUA_TXN_HTTP_RDY is set if it is appropriate.
2019-07-29 11:17:52 +02:00
Willy Tarreau
d6e0c03384 BUILD: threads: add the definition of PROTO_LOCK
This one was added by commit daacf3664 ("BUG/MEDIUM: protocols: add a
global lock for the init/deinit stuff") but I forgot to add it to the
include file, breaking DEBUG_THREAD.
2019-07-25 07:53:56 +02:00
Christopher Faulet
98fbe9531a MEDIUM: mux-h1: Add the support of headers adjustment for bogus HTTP/1 apps
There is no standard case for HTTP header names because, as stated in the
RFC7230, they are case-insensitive. So applications must handle them in a
case-insensitive manner. But some bogus applications erroneously rely on the
case used by most browsers. This problem becomes critical with HTTP/2
because all header names must be exchanged in lowercase. And HAProxy uses the
same convention. All header names are sent in lowercase to clients and servers,
regardless of the HTTP version.

This design choice is linked to the HTX implementation. So, for previous
versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall
back to the legacy HTTP mode.

Since the legacy HTTP mode was removed, some users reported interoperability
issues because their application was not able anymore to handle HTTP/1 message
received from HAProxy. So, we've decided to add a way to change the case of some
headers before sending them. It is now possible to define a "mapping" between a
lowercase header name and a version supported by the bogus application. To do
so, you must use the global directives "h1-case-adjust" and
"h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and
"h1-case-adjust-bogus-server" may be used in proxy sections to enable the
conversion. See the configuration manual for more info.

Of course, our advice is to urgently upgrade these applications for
interoperability concerns and because they may be vulnerable to various types of
content smuggling attacks. But, if your are really forced to use an unmaintained
bogus application, you may use these directive, at your own risks.

If it is relevant, this feature may be backported to 2.0.
2019-07-24 18:32:47 +02:00
Willy Tarreau
daacf36645 BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff
Dragan Dosen found that the listeners lock is not sufficient to protect
the listeners list when proxies are stopping because the listeners are
also unlinked from the protocol list, and under certain situations like
bombing with soft-stop signals or shutting down many frontends in parallel
from multiple CLI connections, it could be possible to provoke multiple
instances of delete_listener() to be called in parallel for different
listeners, thus corrupting the protocol lists.

Such operations are pretty rare, they are performed once per proxy upon
startup and once per proxy on shut down. Thus there is no point trying
to optimize anything and we can use a global lock to protect the protocol
lists during these manipulations.

This fix (or a variant) will have to be backported as far as 1.8.
2019-07-24 16:45:02 +02:00
Christopher Faulet
90cc4811be BUG/MINOR: http_htx: Support empty errorfiles
Empty error files may be used to disable the sending of any message for specific
error codes. A common use-case is to use the file "/dev/null". This way the
default error message is overridden and no message is returned to the client. It
was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such
messages triggered an error.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.
2019-07-23 14:58:32 +02:00
Willy Tarreau
1c8d32bb62 MAJOR: stream: store the target address into s->target_addr
When forcing the outgoing address of a connection, till now we used to
allocate this outgoing connection and set the address into it, then set
SF_ADDR_SET. With connection reuse this causes a whole lot of issues and
difficulties in the code.

Thanks to the previous changes, it is now possible to store the target
address into the stream instead, and copy the address from the stream to
the connection when initializing the connection. assign_server_address()
does this and as a result SF_ADDR_SET now reflects the presence of the
target address in the stream, not in the connection. The http_proxy mode,
the peers and the master's CLI now use the same mechanism. For now the
existing connection code was not removed to limit the amount of tricky
changes, but the allocated connection is not used anymore.

This change also revealed a latent issue that we've been having around
option http_proxy : the address was set in the connection but neither the
SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection
could establish only due to the fact that it existed with a non-null
destination address.
2019-07-19 13:50:09 +02:00
Willy Tarreau
9042060b0b MINOR: stream: add a new target_addr entry in the stream structure
The purpose will be to store the target address there and not to
allocate a connection just for this anymore. For now it's only placed
in the struct, a few fields were moved to plug some holes, and the
entry is freed on release (never allocated yet for now). This must
have no impact. Note that in order to fit, the store_count which
previously was an int was turned into a short, which is way more
than enough given that the hard-coded limit is 8.
2019-07-19 13:50:09 +02:00
Willy Tarreau
e71fca81dd MAJOR: connection: remove the addr field
Now addresses are dynamically allocated when needed. Each connection is
created with src=dst=NULL, these entries are allocated on the fly, and
released when the connection is released.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ca79f59365 MEDIUM: connection: make sure all address producers allocate their address
This commit places calls to sockaddr_alloc() at the places where an address
is needed, and makes sure that the allocation is properly tested. This does
not add too many error paths since connection allocations are already in the
vicinity and share the same error paths. For the two cases where a
clear_addr() was called, instead the address was not allocated.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ff5d57b022 MINOR: connection: create a new pool for struct sockaddr_storage
This pool will be used to allocate storage for source and destination
addresses used in connections. Two functions sockaddr_{alloc,free}()
were added and will have to be used everywhere an address is needed.
These ones are safe for progressive replacement as they check that the
existing pointer is set before replacing it. The pool is not yet used
during allocation nor freeing. Also they operate on pointers to pointers
so they will perform checks and replace values. The free one nulls the
pointer.
2019-07-19 13:50:09 +02:00
Willy Tarreau
226572f55f MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to}
This is in preparation for the switch to dynamic address allocation,
let's migrate the code using the old fields to the pointers instead.
Note that no extra check was added for now, the purpose is only to
get the code to use the pointers and still work.

In the proxy protocol message handling we make sure the addresses are
properly allocated before declaring them unset.
2019-07-19 13:50:09 +02:00
Willy Tarreau
1ef4cbc693 MINOR: connection: add new src and dst fields
At the moment we're facing difficulties with connection reuse based on
the fact that connections may be allocated very early only to set a
target address in transparent mode. With the imminent removal of the
legacy mode, the connection reuse by a same stream will not exist
anymore and all this awful complexity is not justified anymore. However
we still need to be able to assign addresses somewhere.

Thus instead of allocating a connection, we'll only place addresses where
needed in the stream during operations. But this takes quite some room
(typically 128 bytes). This is a nice opportunity for cleaning all this
up and dynamically allocatating the addresses fields, which will result
in actually saving memory from connection structs since most of the time
the client's "to" address is not used and the server's "from" is not used
either, thus saving ~256 bytes per end-to-end connection.

For now these new "src" and "dst" pointers point to addr.from and addr.to.
This will allow us to smoothly update the whole code to use these pointers
prior to going further and switching them to pools.
2019-07-19 13:50:09 +02:00
Willy Tarreau
cc4df3b3de CLEANUP: connection: remove the now unused conn_get_{from,to}_addr()
These functions are not used anymore. They didn't report failures and
as such were often misused. conn_get_src() and conn_get_dst() now
replaced them everywhere.
2019-07-19 13:50:09 +02:00
Willy Tarreau
3cc01d84b3 MINOR: backend: switch to conn_get_{src,dst}() for port and address mapping
The backend connect code uses conn_get_{from,to}_addr to forward addresses
in transparent mode and to map server ports, without really checking if the
operation succeeds. In preparation of future changes, let's switch to
conn_get_{src,dst}() and integrate status check for possible failures.
2019-07-19 13:50:09 +02:00
Willy Tarreau
2e34c11458 MINOR: connection: add conn_get_src() and conn_get_dst()
These functions currently are the same as conn_get_from_addr() and
conn_get_to_addr() respectively except that they return a status for
the operation that the caller can test.
2019-07-19 13:50:09 +02:00
Christopher Faulet
f734638976 MINOR: http: Don't store raw HTTP errors in chunks anymore
Default HTTP error messages are stored in an array of chunks. And since the HTX
was added, these messages are also converted in HTX and stored in another
array. But now, the first array is not used anymore because the legacy HTTP mode
was removed.

So now, only the array with the HTX messages are kept. The other one was
removed.
2019-07-19 09:46:23 +02:00
Christopher Faulet
1b6adb4a51 MINOR: proxy/http_ana: Remove unused req_exp/rsp_exp and req_add/rsp_add lists
The keywords req* and rsp* are now unsupported. So the corresponding lists are
now unused. It is safe to remove them from the structure proxy.

As a result, the code dealing with these rules in HTTP analyzers was also
removed.
2019-07-19 09:24:12 +02:00
Christopher Faulet
8c3b63ae1d MINOR: proxy: Remove the unused list of block rules
The keyword "block" is now unsupported. So the list of block rules is now
unused. It can be safely removed from the structure proxy.
2019-07-19 09:24:12 +02:00
Christopher Faulet
a6a56e6483 MEDIUM: config: Remove parsing of req* and rsp* directives
It was announced for the 2.1. Following keywords are now unsupported:

  * reqadd, reqallow, reqiallow, reqdel, reqidel, reqdeny, reqideny, reqpass,
    reqipass, reqrep, reqirep reqtarpit, reqitarpit

  * rspadd, rspdel, rspidel, rspdeny, rspideny, rsprep, rspirep

a fatal error is emitted if one of these keyword is found during the
configuraion parsing.
2019-07-19 09:24:12 +02:00
Christopher Faulet
73e8ede156 MINOR: proxy: Remove support of the option 'http-tunnel'
The option 'http-tunnel' is deprecated and it was only used in the legacy HTTP
mode. So this option is now totally ignored and a warning is emitted during
HAProxy startup if it is found in a configuration file.
2019-07-19 09:24:12 +02:00
Christopher Faulet
fc9cfe4006 REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.

In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
2019-07-19 09:24:12 +02:00
Christopher Faulet
eb2754bef8 CLEANUP: proto_http: Remove unecessary includes and comments 2019-07-19 09:24:12 +02:00
Christopher Faulet
22dc248c2a CLEANUP: channel: Remove the unused flag CF_WAKE_CONNECT
This flag is tested or cleared but never set anymore.
2019-07-19 09:24:12 +02:00
Christopher Faulet
3716ebc50f CLEANUP: proto_http: Group remaining flags of the HTTP transaction 2019-07-19 09:24:12 +02:00
Christopher Faulet
cc76d5b9a1 MINOR: proto_http: Remove the unused flag HTTP_MSGF_WAIT_CONN
This flag is set but never used. So remove it.
2019-07-19 09:24:12 +02:00
Christopher Faulet
c41547b66e MINOR: proto_http: Remove unused http txn flags
Many flags of the HTTP transction (TX_*) are now unused and useless. So the
flags TX_WAIT_CLEANUP, TX_HDR_CONN_*, TX_CON_CLO_SET and TX_CON_KAL_SET were
removed. Most of TX_CON_WANT_* were also removed. Only TX_CON_WANT_TUN has been
kept.
2019-07-19 09:24:12 +02:00
Christopher Faulet
711ed6ae4a MAJOR: http: Remove the HTTP legacy code
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
2019-07-19 09:24:12 +02:00
Christopher Faulet
3d11969a91 MAJOR: filters: Remove code relying on the legacy HTTP mode
This commit breaks the compatibility with filters still relying on the legacy
HTTP code. The legacy callbacks were removed (http_data, http_chunk_trailers and
http_forward_data).

For now, the filters must still set the flag FLT_CFG_FL_HTX to be used on HTX
streams.
2019-07-19 09:18:27 +02:00
Christopher Faulet
28b18c5e21 CLEANUP: proxy: Remove the flag PR_O2_USE_HTX
This flag is now unused. So we can safely remove it.
2019-07-19 09:18:27 +02:00
Christopher Faulet
6d1dd46917 MEDIUM: http_fetch: Remove code relying on HTTP legacy mode
Since the legacy HTTP mode is disbabled, all HTTP sample fetches work on HTX
streams. So it is safe to remove all code relying on HTTP legacy mode. Among
other things, the function smp_prefetch_http() was removed with the associated
macros CHECK_HTTP_MESSAGE_FIRST() and CHECK_HTTP_MESSAGE_FIRST_PERM().
2019-07-19 09:18:27 +02:00
Christopher Faulet
c985f6c5d8 MINOR: connection: Remove the multiplexer protocol PROTO_MODE_HTX
Since the legacy HTTP mode is disabled and no multiplexer relies on it anymore,
there is no reason to have 2 multiplexer protocols for the HTTP. So the protocol
PROTO_MODE_HTX was removed and all HTTP multiplexers use now PROTO_MODE_HTTP.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5ed8353dcf CLEANUP: h2: Remove functions converting h2 requests to raw HTTP/1.1 ones
Because the h2 multiplexer only uses the HTX mode, following H2 functions were
removed :

  * h2_prepare_h1_reqline
  * h2_make_h1_request()
  * h2_make_h1_trailers()
2019-07-19 09:18:27 +02:00
Christopher Faulet
24e116bfe0 MINOR: htx: Slightly update htx_dump() to report better messages
Sign of <tail_addr>, <head_addr> and <end_addr> is respsected to not convert -1
into its unsigned representation.
2019-07-19 09:18:27 +02:00
Christopher Faulet
2bf43f0746 MINOR: htx: Use an array of char to store HTX blocks
Instead of using a array of (struct block), it is more natural and intuitive to
use an array of char. Indeed, not only (struct block) are stored in this array,
but also their payload.
2019-07-19 09:18:27 +02:00
Christopher Faulet
192c6a23d4 MINOR: htx: Deduce the number of used blocks from tail and head values
<head> and <tail> fields are now signed 32-bits integers. For an empty HTX
message, these fields are set to -1. So the field <used> is now useless and can
safely be removed. To know if an HTX message is empty or not, we just compare
<head> against -1 (it also works with <tail>). The function htx_nbblks() has
been added to get the number of used blocks.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5a916f7326 CLEANUP: htx: Remove the unsued function htx_add_blk_type_size() 2019-07-19 09:18:27 +02:00
Christopher Faulet
3b21972061 DOC: htx: Update comments in HTX files
This patch may be backported to 2.0 to have accurate comments.
2019-07-19 09:18:27 +02:00
Christopher Faulet
304cc40536 MINOR: proto_htx: Add the function htx_return_srv_error()
Instead of using a function from the legacy HTTP, the HTX code now uses its own
one.
2019-07-19 09:18:27 +02:00
Willy Tarreau
8280ea97a0 MINOR: applet: make appctx use their own pool
A long time ago, applets were seen as an alternative to connections,
and since their respective sizes were roughly equal it appeared wise
to share the same pool. Nowadays, connections got significantly larger
but applets are not that often used, except for the cache. However
applets are mostly complementary and not alternatives anymore, as
it's very possible not to have a back connection or to share one with
other streams.

The connections will soon lose their addresses and their size will
shrink so much that appctx won't fit anymore. Given that the old
benefits of sharing these pools have long disappeared, let's stop
doing this and have a dedicated pool for appctx.
2019-07-18 10:45:08 +02:00
Willy Tarreau
7764a57d32 BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
Since commit 81492c989 ("MINOR: threads: flatten the per-thread cpu-map"),
we don't keep the proc*thread matrix anymore to represent the full binding
possibilities, but only the proc and thread ones. The problem is that the
per-process binding is not the same for each thread and for the process,
and the proc[] array was assumed to store the per-proc first thread value
when doing this change. Worse, the logic present there tries to deal with
thread ranges and process ranges in a way which automatically exclused the
other possibility (since ranges cannot be used on both) but as such fails
to apply changes if neither the process nor the thread is expressed as a
range.

The real problem comes from the fact that specifying cpu-map 1/1 doesn't
yet reveal if the per-process mask or the per-thread mask needs to be
updated. In practice it's the thread one but then the current storage
doesn't allow to store the binding of the first thread of each other
process in nbproc>1 configurations.

When removing the proc*thread matrix, what ought to have been kept was
both the thread column for process 1 and the process line for threads 1,
but instead only the thread column was kept. This patch reintroduces the
storage of the configuration for the first thread of each process so that
it is again possible to store either the per-thread or per-process
configuration.

As a partial workaround for existing configurations, it is possible to
systematically indicate at least two processes or two threads at once
and map them by pairs or more so that at least two values are present
in the range. E.g :

  # set processes 1-4 to cpus 0-3 :

     cpu-map auto:1-4/1 0 1 2 3
  # or:
     cpu-map 1-2/1 0 1
     cpu-map 2-3/1 2 3

  # set threads 1-4 to cpus 0-3 :

     cpu-map auto:1/1-4 0 1 2 3
  # or :
     cpu-map 1/1-2 0 1
     cpu-map 3/3-4 2 3

This fix must be backported to 2.0.
2019-07-16 15:23:09 +02:00
Andrew Heberle
9723696759 MEDIUM: mworker-prog: Add user/group options to program section
This patch adds "user" and "group" config options to the "program"
section so the configured command can be run as a different user.
2019-07-15 16:43:16 +02:00
Olivier Houchard
4bd5867627 BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set.
Move the logic to decide if we redispatch to a new server from
sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and
use it in do_l7_retry() instead of just setting the state to SI_ST_REQ.
That way, when using L7 retries, we won't redispatch the request to another
server except if "option redispatch" is used.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Willy Tarreau
64e6012eb9 MINOR: task: introduce work lists
Sometimes we need to delegate some list processing to a function running
on another thread. In this case the list element will simply be queued
into a dedicated self-locked list and the task responsible for this list
will be woken up, calling the associated function which will run over the
list.

This is what work_list does. Such lists will be dedicated to a limited
type of work but will significantly ease such remote handling. A function
is provided to create these per-thread lists, their tasks and to properly
bind each task to a distinct thread, so that the caller only has to store
the resulting pointer to the start of the structure.

These structures should not be abused though as each head will consume
4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.
2019-07-12 09:07:48 +02:00
Olivier Houchard
4be7190c10 BUG/MEDIUM: servers: Fix a race condition with idle connections.
When we're purging idle connections, there's a race condition, when we're
removing the connection from the idle list, to add it to the list of
connections to free, if the thread owning the connection tries to free it
at the same time.
To fix this, simply add a per-thread lock, that has to be hold before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list, to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performances.
This has not been reported yet, but could provoke random segfaults.

This should be backported to 2.0.
2019-07-11 16:16:38 +02:00
Christopher Faulet
34ce7d075a BUG/MINOR: server: Be really able to keep "pool-max-conn" idle connections
The maximum number of idle connections for a server can be configured by setting
the server option "pool-max-conn". But when we try to add a connection in its
idle list, because of a wrong comparison, it may be rejected because there are
already "pool-max-conn - 1" idle connections.

This patch must be backported to 2.0 and 1.9.
2019-07-10 14:20:52 +02:00
Willy Tarreau
1dad3843dc BUG/MEDIUM: fd/threads: fix excessive CPU usage on multi-thread accept
While experimenting with potentially improved fairness and latency using
ticket locks on a Ryzen 16-thread/8-core, a very strange situation happened
a lot for some levels of traffic. Around 300k connections per second, no
more connections would be accepted on the multi-threaded listener but all
others would continue to work fine. All attempts to trace showed that the
threads were all in the trylock in the fd cache, or in the spinlock of
fd_update_events(), or in the one of fd_may_recv(). But as indicated this
was not a deadlock since the process continues to work fine.

After quite some investigation it appeared that the issue is caused by a
lack of fairness between the fdcache's trylock and these functions' spin
locks above. In fact, regardless of the success or failure of the fdcache's
attempt at grabbing the lock, the poller was calling fd_update_events()
which locks the FD once for something that can be done with a CAS, and
then calls fd_may_recv() with another lock for something that most often
didn't change. The high contention on these spinlocks leaves no chance to
any other thread to grab the lock using trylock(), and once this happens,
there is no thread left to process incoming connection events nor to stop
polling on the FD, leaving all threads at 100% CPU but partially operational.

This patch addresses the issue by using bit-test-and-set instead of the OR
in fd_may_recv() / fd_may_send() so that nothing is done if the FD was
already configured as expected. It does the same in fd_update_events()
using a CAS to check if the FD's events need to be changed at all or not.
With this patch applied, it became impossible to reproduce the issue, and
now there's no way to saturate all 16 CPUs with the load used for testing,
as no more than 1350-1400 were noticed at 300+kcps vs 1600.

Ideally this patch should go further and try to remove the remaining
incarnations of the fdlock as this seems possible, but it's difficult
enough to be done in a distinct patch that will not have to be backported.

It is possible that workloads involving a high connection rate may slightly
benefit from this patch and observe a slightly lower CPU usage even when
the service doesn't misbehave.

This patch must be backported to 2.0 and 1.9.
2019-07-09 10:41:24 +02:00
Willy Tarreau
85b2cae63c MINOR: pools: make the thread harmless during the mmap/munmap syscalls
These calls can take quite some time and leave the thread harmless so
it's better to mark it as such. This makes "show sess" respond way
faster during high loads running on processes build with DEBUG_UAF
since these calls are stressed a lot.
2019-07-09 10:40:33 +02:00
Willy Tarreau
828675421e MINOR: pools: always pre-initialize allocated memory outside of the lock
When calling mmap(), in general the system gives us a page but does not
really allocate it until we first dereference it. And it turns out that
this time is much longer than the time to perform the mmap() syscall.
Unfortunately, when running with memory debugging enabled, we mmap/munmap()
each object resulting in lots of such calls and a high contention on the
allocator. And the first accesses to the page being done under the pool
lock is extremely damaging to other threads.

The simple fact of writing a 0 at the beginning of the page after
allocating it and placing the POOL_LINK pointer outside of the lock is
enough to boost the performance by 8x in debug mode and to save the
watchdog from triggering on lock contention. This is what this patch
does.
2019-07-09 10:40:33 +02:00
Willy Tarreau
3e853ea74d MINOR: pools: release the pool's lock during the malloc/free calls
The malloc and free calls and especially the underlying mmap/munmap()
can occasionally take a huge amount of time and even cause the thread
to sleep. This is visible when haproxy is compiled with DEBUG_UAF which
causes every single pool allocation/free to allocate and release pages.
In this case, when using the locked pools, the watchdog can occasionally
fire under high contention (typically requesting 40000 1M objects in
parallel over 8 threads). Then, "perf top" shows that 50% of the CPU
time is spent in mmap() and munmap(). The reason the watchdog fires is
because some threads spin on the pool lock which is held by other threads
waiting on mmap() or munmap().

This patch modifies this so that the pool lock is released during these
syscalls. Not only this allows other threads to request try to allocate
their data in parallel, but it also considerably reduces the lock
contention.

Note that the locked pools are only used on small architectures where
high thread counts would not make sense, so this will not provide any
benefit in the general case. However it makes the debugging versions
way more stable, which is always appreciated.
2019-07-09 10:40:33 +02:00
Christopher Faulet
037b3ebd35 BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si
In the function stream_int_notify(), when the opposite stream-interface is
blocked because there is no more room into the input buffer, if the flag
CF_WRITE_PARTIAL is set on this buffer, it is unblocked. It is a way to unblock
the reads on the other side because some data was sent.

But it is a problem during the fast-forwarding because only the stream is able
to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag because
of a previous send while the input buffer of the opposite stream-interface is
now full. In such case, the opposite stream-interface will be woken up for
nothing because its input buffer is full. If the same happens on the opposite
side, we will have a loop consumming all the CPU.

To fix the bug, the opposite side is now only notify if there is some available
room in its input buffer in the function si_cs_send(), so only if some data was
sent.

This patch must be backported to 2.0 and 1.9.
2019-07-05 14:26:15 +02:00
Christopher Faulet
2e4843d1d2 MINOR: action: Add the return code ACT_RET_DONE for actions
This code should be now used by action to stop at the same time the rules
processing and the possible following processings. And from its side, the return
code ACT_RET_STOP should be used to only stop rules processing.

So concretely, for TCP rules, there is no changes. ACT_RET_STOP and ACT_RET_DONE
are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be
mapped on HTTP_RULE_RES_STOP and ACT_RET_DONE on HTTP_RULE_RES_DONE. So this
way, a action will have the possibilty to stop all processing or only rules
processing.

Note that changes about the TCP is done in this commit but changes about the
HTTP will be done in another one because it will fix a bug in the same time.

This patch must be backported to 2.0 because a bugfix depends on it.
2019-07-05 14:26:14 +02:00
Olivier Houchard
cee0389088 BUG/MEDIUM: sessions: Don't keep an extra idle connection in sessions.
When deciding if we keep an idle connection in the session, check if
the number of connections currently in the session is >= the max allowed,
not >, or we'll keep an extra connection.

This should be backported to 1.9 and 2.0.
2019-07-04 14:28:18 +02:00
Olivier Houchard
2ab3dada01 BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux.
Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not
enough, as there may be multiple XPRT involved. Instead, require that
any user of conn_upgrade_mux_fe() unsubscribe itself before calling it.
This should fix upgrading a TCP connection to HTX when using SSL.

This should be backported to 2.0.
2019-07-03 13:57:30 +02:00
Christopher Faulet
621da6bafa BUG/MEDIUM: channel/htx: Use the total HTX size in channel_htx_recv_limit()
The receive limit of an HTX channel must be calculated against the total size of
the HTX message. Otherwise, the buffer may never be seen as full whereas the
receive limit is 0. Indeed, the function channel_htx_full() already takes care
to add a block size to the buffer's reserve (8 bytes). So if the function
channel_htx_recv_limit() also keep a block size free in addition to the buffer's
reserve, it means that at least 2 block size will be kept free but only one will
be taken into account, freezing the stream if the option http-buffer-request is
enabled.

This patch fixes the Github issue #136. It should be backported to 2.0 and
1.9. Thanks jaroslawr (Jarosław Rzeszótko) for his help.
2019-07-02 21:32:45 +02:00
Olivier Houchard
6c7e96a3e1 BUG/MEDIUM: connections: Always call shutdown, with no linger.
Revert commit fe4abe62c7.
The goal was to make sure for health-checks, we would not get sockets in
TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set.
However that is wrong, and that means shutw would never be forwarded to
the server, and thus we could get connection that are never properly closed.
Instead, to fix the original problem as described here :
https://www.mail-archive.com/haproxy@formilux.org/msg34080.html
Just make sure the checks code call cs_shutr() before calling cs_shutw().
If shutr has been called, conn_sock_shutw() will make no attempt to call
shutdown(), as it knows close() will be called.
We should really review and revamp the shutr/shutw code, as described in
github issue #142.

This should be backported to 1.9 and 2.0.
2019-07-02 16:40:55 +02:00
William Lallemand
ad03288e6b BUG/MINOR: mworker/cli: don't output a \n before the response
When using a level lower than admin on the master CLI, a \n is output
before the response, this is caused by the response of the "operator" or
"user" that are sent before the actual command.

To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF which ask
a command response to not be followed by the final \n.
This patch made a special case with the command operator and user
followed by a - so they are not followed by \n.

This patch must be backported to 2.0 and 1.9.
2019-07-01 15:34:11 +02:00
Christopher Faulet
bb0efcdd29 MINOR: htx: Add the function htx_change_blk_value_len()
As its name suggest, this function change the value length of a block. But it
also update the HTX message accordingly. It simplifies the HTX API. The function
htx_set_blk_value_len() is still available and must be used with caution because
this one does not update the HTX message. It just updates the HTX block. It
should be considered as an internal function. When possible,
htx_change_blk_value_len() should be used instead.

This function is used to fix a bug affecting the 2.0. So, this patch must be
backported to 2.0.
2019-06-18 10:01:55 +02:00
Baptiste Assmann
da29fe2360 MEDIUM: server: server-state global file stored in a tree
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).

The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!

This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This result in way much less fopen, fgets, and strcmp calls, which make
loading of very big state files very quick now.
2019-06-17 13:40:42 +02:00
Tim Duesterhus
86e6b6ebf8 MEDIUM: Make '(cli|con|srv)timeout' directive fatal
They were deprecated with HAProxy 1.5. Time to remove them.
2019-06-17 13:35:54 +02:00
Tim Duesterhus
dac168bc15 MEDIUM: Make 'redispatch' directive fatal
It was deprecated with HAProxy 1.5. Time to remove it.
2019-06-17 13:35:54 +02:00
Tim Duesterhus
7b7c47f05c MEDIUM: Make 'block' directive fatal
It was deprecated with HAProxy 1.5. Time to remove it.
2019-06-17 13:35:54 +02:00
Willy Tarreau
9dc6b97429 [RELEASE] Released version 2.1-dev0
Released version 2.1-dev0 with the following main changes :
    - exact copy of 2.0.0
2019-06-16 21:49:47 +02:00
Willy Tarreau
bd20a9dd4e BUG: tasks: fix bug introduced by latest scheduler cleanup
In commit 86eded6c6 ("CLEANUP: tasks: rename task_remove_from_tasklet_list()
to tasklet_remove_*") which consisted in removing the casts between tasks
and tasklet, I was a bit too fast to believe that we only saw tasklets in
this function since process_runnable_tasks() also uses it with tasks under
a cast. So removing the bookkeeping on task_list_size was not appropriate.
Bah, the joy of casts which hide the real thing...

This patch does two things at once to address this mess once for all:
  - it restores the decrement of task_list_size when it's a real task,
    but moves it to process_runnable_task() since it's the only place
    where it's allowed to call it with a task

  - it moves the increment there as well and renames
    task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list()
    of obvious consistency reasons.

This way the increment/decrement of task_list_size is made at the only
places where the cast is enforced, so it has less risks to be missed.
The comments on top of these functions were updated to reflect that they
are only supposed to be used with tasklets and that the caller is responsible
for keeping task_list_size up to date if it decides to enforce a task there.

Now we don't have to worry anymore about how these functions work outside
of the scheduler, which is better longterm-wise. Thanks to Christopher for
spotting this mistake.

No backport is needed.
2019-06-14 18:16:19 +02:00
Olivier Houchard
fe4abe62c7 BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger.
In conn_sock_shutw(), avoid calling shutdown() if linger_risk is set. Not
doing so will result in getting sockets in TIME_WAIT for some time.
This is particularly observable with health checks.

This should be backported to 1.9.
2019-06-14 15:33:41 +02:00
Willy Tarreau
86eded6c69 CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
The function really only operates on tasklets, its arguments are always
tasklets cast as tasks to match the function's type, to be cast back to
a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(),
take a struct tasklet, and get rid of the undesired task casts.
2019-06-14 14:57:03 +02:00
Willy Tarreau
3c39a7d889 CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
2019-06-14 14:42:29 +02:00
Christopher Faulet
e21c01637a MINOR: htx: Add 3 flags on the start-line to deal with the request schemes
The first one, HTX_SL_F_HAS_SCHM, will be used to know the request has an
explicit scheme. So, in H2, it is always true because the pseudo-header
":scheme" is mandatory. In H1, it is only true when an absolute URI is found on
the start-line. The other flags, HTX_SL_F_SCHM_HTTP and HTX_SL_F_SCHM_HTTPS,
will be used to know which scheme the request have. For now, other protocols are
not handled.

The aim of these flags is to pass this information to the backend side in
general, and to the H2 mux in particular. So the multiplexer will have a chance
to use this information to send the right scheme to the server.
2019-06-14 11:13:32 +02:00
Christopher Faulet
36a7702b03 CLEANUP: channel: Remove channel_htx_fwd_payload() and channel_htx_fwd_all()
These functions are unused now. No backport needed.
2019-06-14 11:13:32 +02:00
Christopher Faulet
421e769783 BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis
In the HTX structure, the field <first> is used to know where to (re)start the
analysis. It may differ from the message's head. It is especially important to
update it to handle 1xx messages, to be sure to restart the analysis on the next
message (another 1xx message or the final one). It is also updated when some
data are forwarded (the headers or part of the body). But this update is an
error and must never be done at the analysis level. It is a bug, because some
sample fetches may be used after the data forwarding (but before the first send
of course). At this stage, if the first block position does not point on the
start-line, most of HTTP sample fetches fail.

So now, when something is forwarding by HTX analyzers, the first block position
is not update anymore.

This issue was reported on Github. See #119. No backport needed.
2019-06-14 11:13:32 +02:00
Christopher Faulet
87ebe944d6 BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full()
When channel_full() is called for an HTX stream, we fall back on the HTX
version. This function is called, among other, from tcp_inspect_request(). With
this patch, the inspect delay is respected again.

This patch must be backported to 1.9.
2019-06-14 11:13:32 +02:00
Willy Tarreau
3cec0f94f3 BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity
With both I/O and tasks in the same tasklet list, we now have a very
smooth and responsive scheduler, providing a good fairness between I/O
activities. With the lower layers relying on tasklet a lot (I/O wakeup,
subscribe, etc), there may often be a large number of totally autonomous
tasklets doing their business such as forwarding data between two muxes.

But the task scheduler historically refrained from picking tasks from the
priority-ordered run queue to put them into the tasklet list until this
later had less than max_runqueue_depth entries. This was to make sure that
low-latency, high-priority tasks would have an opportunity to be dequeued
before others even if they arrive late. But the counter used for this is
still the tasklet list size, which contains countless I/O events. This
causes an unfairness between unbounded I/Os and bounded tasks, resulting
for example in the CLI responding slower when forwarding 40 Gbps of HTTP
traffic spread over a thousand of connections.

A good solution consists in sticking to the initial intent of
max_runqueue_depth which is to limit the number of tasks in the list
(to maintain fairness between them) and not to limit the number of these
tasks among tasklets. It just turns out that the task_list_size initially
was this task counter and changed over time to be a tasklet list size.
Let's simply refrain from updating it for pure tasklets so that it takes
back its original role of counting real tasks as its name implies. With
this change the CLI becomes instantly responsive under load again.

This patch may possibly be backported to 1.9 though it requires some
careful checks.
2019-06-14 09:16:51 +02:00
William Lallemand
1dc6963086 MINOR: mworker: add the HAProxy version in "show proc"
Displays the HAProxy version so you can compare the version of old
processes and new ones.
2019-06-12 19:19:57 +02:00
Olivier Houchard
a0fdce3950 MINOR: fd: Don't use atomic operations when it's not needed.
In updt_fd_polling(), when updating fd_nbupdt, there's no need to use an
atomic operation, as it's a TLS variable.
2019-06-12 14:36:24 +02:00
Christopher Faulet
86fcf6d6cd MINOR: htx: Add the function htx_move_blk_before()
The function htx_add_data_before() was removed because it was buggy. The
function htx_move_blk_before() may be used if necessary to do something
equivalent, except it just moves blocks. It doesn't handle the adding.
2019-06-11 14:05:25 +02:00
Christopher Faulet
d7884d3449 MAJOR: htx: Rework how free rooms are tracked in an HTX message
In an HTX message, it may have 2 available rooms to store a new block. The first
one is between the blocks and their payload. Blocks are added starting from the
end of the buffer and their payloads are added starting from the begining. So
the first free room is between these 2 edges. The second one is at the begining
of the buffer, when we start to wrap to add new payloads. Once we start to use
this one, the other one is ignored until the next defragmentation of the HTX
message.

In theory, there is no problem. But in practice, some lacks in the HTX structure
force us to defragment too often HTX messages to always be in a known state. The
second free room is not tracked as it should do and the first one may be easily
corrupted when rewrites happen.

So to fix the problem and avoid unecessary defragmentation, the HTX structure
has been refactored. The front (the block's position of the first payload before
the blocks) is no more stored. Instead we keep the relative addresses of 3 edges:

 * tail_addr : The start address of the free space in front of the the blocks
               table
 * head_addr : The start address of the free space at the beginning
 * end_addr  : The end address of the free space at the beginning

Here is the general view of the HTX message now:

           head_addr     end_addr    tail_addr
               |            |            |
               V            V            V
  +------------+------------+------------+------------+------------------+
  |            |            |            |            |                  |
  |  PAYLOAD   | Free space |  PAYLOAD   | Free space |    Blocks area   |
  |    ==>     |     1      |    ==>     |     2      |        <==       |
  +------------+------------+------------+------------+------------------+

<head_addr> is always lower or equal to <end_addr> and <tail_addr>. <end_addr>
is always lower or equal to <tail_addr>.

In addition;, to simplify everything, the blocks area are now contiguous. It
doesn't wrap anymore. So the head is always the block with the lowest position,
and the tail is always the one with the highest position.
2019-06-11 14:05:25 +02:00
Christopher Faulet
86bc8df955 BUG/MEDIUM: compression/htx: Fix the adding of the last data block
The function htx_add_data_before() is buggy and cannot work. It first add a data
block and then move it before another one, passed in argument. The problem
happens when a defragmentation is done to add the new block. In this case, the
reference is no longer valid, because the blocks are rearranged. So, instead of
moving the new block before the reference, it is moved at the head of the HTX
message.

So this function has been removed. It was only used by the compression filter to
add a last data block before a TLR, EOT or EOM block. Now, the new function
htx_add_last_data() is used. It adds a last data block, after all others and
before any TLR, EOT or EOM block. Then, the next bock is get. It is the first
non-data block after data in the HTX message. The compression loop continues
with it.

This patch must be backported to 1.9.
2019-06-11 14:05:25 +02:00
Willy Tarreau
9a1f57351d MEDIUM: threads: add thread_sync_release() to synchronize steps
This function provides an alternate way to leave a critical section run
under thread_isolate(). Currently, a thread may remain in thread_release()
without having the time to notice that the rdv mask was released and taken
again by another thread entering thread_isolate() (often the same that just
released it). This is because threads wait in harmless mode in the loop,
which is compatible with the conditions to enter thread_isolate(). It's
not possible to make them wait with the harmless bit off or we cannot know
when the job is finished for the next thread to start in thread_isolate(),
and if we don't clear the rdv bit when going there, we create another
race on the start point of thread_isolate().

This new synchronous variant of thread_release() makes use of an extra
mask to indicate the threads that want to be synchronously released. In
this case, they will be marked harmless before releasing their sync bit,
and will wait for others to release their bit as well, guaranteeing that
thread_isolate() cannot be started by any of them before they all left
thread_sync_release(). This allows to construct synchronized blocks like
this :

     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()
     /* do something together here */
     thread_isolate()
     /* optionally do something alone here */
     thread_sync_release()

And so on. This is particularly useful during initialization where several
steps have to be respected and no thread must start a step before the
previous one is completed by other threads.

This one must not be placed after any call to thread_release() or it would
risk to block an earlier call to thread_isolate() which the current thread
managed to leave without waiting for others to complete, and end up here
with the thread's harmless bit cleared, blocking others. This might be
improved in the future.
2019-06-10 09:42:43 +02:00
Willy Tarreau
9faebe34cd MEDIUM: tools: improve time format error detection
As reported in GH issue #109 and in discourse issue
https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d
the time parser doesn't error on overflows nor underflows. This is a
recurring problem which additionally has the bad taste of taking a long
time before hitting the user.

This patch makes parse_time_err() return special error codes for overflows
and underflows, and adds the control in the call places to report suitable
errors depending on the requested unit. In practice, underflows are almost
never returned as the parsing function takes care of rounding values up,
so this might possibly happen on 64-bit overflows returning exactly zero
after rounding though. It is not really possible to cut the patch into
pieces as it changes the function's API, hence all callers.

Tests were run on about every relevant part (cookie maxlife/maxidle,
server inter, stats timeout, timeout*, cli's set timeout command,
tcp-request/response inspect-delay).
2019-06-07 19:32:02 +02:00
Frdric Lcaille
b65717fa55 MINOR: peers: Optimization for dictionary cache lookup.
When we look up an dictionary entry in the cache used upon transmission
we store the last result in ->prev_lookup of struct dcache_tx so that
to compare it with the subsequent entries to look up and save performances.
2019-06-07 15:47:54 +02:00
Frdric Lcaille
99de1d0479 MINOR: dict: Store the length of the dictionary entries.
When allocating new dictionary entries we store the length of the strings.
May be useful so that not to have to call strlen() too much often at runing
time.
2019-06-07 15:47:54 +02:00
Frdric Lcaille
6c39198b57 MINOR peers: data structure simplifications for server names dictionary cache.
We store pointers to server names dictionary entries in a pre-allocated array of
ebpt_node's (->entries member of struct dcache_tx) to cache those sent to remote
peers. Consequently the ID used to identify the server name dictionary entry is
also used as index for this array. There is no need to implement a lookup by key
for this dictionary cache.
2019-06-07 15:47:54 +02:00
Willy Tarreau
1bfd6020ce MINOR: logs: use the new bitmap functions instead of fd_sets for encoding maps
The fd_sets we've been using in the log encoding functions are not portable
and were shown to break at least under Cygwin. This patch gets rid of them
in favor of the new bitmap functions. It was verified with the config below
that the log output was exactly the same before and after the change :

    defaults
        mode http
        option httplog
        log stdout local0
        timeout client 1s
        timeout server 1s
        timeout connect 1s

    frontend foo
        bind :8001
        capture request header chars len 255

    backend bar
        option httpchk "GET" "/" "HTTP/1.0\r\nchars: \x01\x02\x03\x04\x05\x06\x07\x08\x09\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
        server foo 127.0.0.1:8001 check
2019-06-07 11:13:24 +02:00
Willy Tarreau
7355b040d1 MINOR: tools: add new bitmap manipulation functions
We now have ha_bit_{set,clr,flip,test} to manipulate bitfields made
of arrays of longs. The goal is to get rid of the remaining non-portable
FD_{SET,CLR,ISSET} that still exist at a few places.
2019-06-07 10:44:49 +02:00
Willy Tarreau
ad660e3f84 BUILD: stream-int: avoid a build warning in dev mode in si_state_bit()
The BUG_ON() test emits a warning about an always-true comparison regarding
<state> which cannot be lower than zero. Let's get rid of it.
2019-06-06 16:42:08 +02:00
Willy Tarreau
3b285d7fbd MINOR: stream-int: make si_sync_send() from the send code of si_update_both()
Just like we have a synchronous recv() function for the stream interface,
let's have a synchronous send function that we'll be able to call from
different places. For now this only moves the code, nothing more.
2019-06-06 16:36:19 +02:00
Willy Tarreau
236c4298b3 MINOR: stream-int: split si_update() into si_update_rx() and si_update_tx()
We should not update the two directions at once, in fact we should update
the Rx path after recv() and the Tx path after send(). Let's start by
splitting the update function in two for this.
2019-06-06 16:36:19 +02:00
Willy Tarreau
8c603ded39 MEDIUM: stream-int: make idle-conns switch to ST_RDY
The purpose of making idle-conns switch to SI_ST_CON was to make the
transition detectable and the operation retryable in case of connection
error. Now we have the RDY state for this which is much more suitable
since it indicates a validated connection on which we didn't necessarily
send anything yet. This will still lead to a transition to EST while not
requiring unnatural write polling nor connect timeouts.
2019-06-06 16:36:19 +02:00
Willy Tarreau
4f283fa604 MEDIUM: stream-int: introduce a new state SI_ST_RDY
The main reason for all the trouble we're facing with stream interface
error or timeout reports during the connection phase is that we currently
can't make the difference between a connection attempt and a validated
connection attempt. It is problematic because we tend to switch early
to SI_ST_EST but can't always do what we want in this state since it's
supposed to be set when we don't need to visit sess_establish() again.

This patch introduces a new state betwen SI_ST_CON and SI_ST_EST, which
is SI_ST_RDY. It indicates that we've verified that the connection is
ready. It's a transient state, like SI_ST_DIS, that cannot persist when
leaving process_stream(). For now it is not set, only verified in various
tests where SI_ST_CON was used or SI_ST_EST depending on the cases.

The stream-int state diagram was minimally updated to reflect the new
state, though it is largely obsolete and would need to be seriously
updated.
2019-06-06 16:36:19 +02:00
Willy Tarreau
7ab22adbf7 MEDIUM: stream-int: remove dangerous interval checks for stream-int states
The stream interface state checks involving ranges were replaced with
checks on a set of states, already revealing some issues. No issue was
fixed, all was replaced in a one-to-one mapping for easier control. Some
checks involving a strict difference were also replaced with fields to
be clearer. At this stage, the result must be strictly equivalent. A few
tests were also turned to their bit-field equivalent for better readability
or in preparation for upcoming changes.

The test performed in the SPOE filter was swapped so that the closed and
error states are evicted first and that the established vs conn state is
tested second.
2019-06-06 16:36:19 +02:00
Willy Tarreau
bedcd698b3 MINOR: stream-int: use bit fields to match multiple stream-int states at once
At some places we do check for ranges of stream-int states but those
are confusing as states ordering is not well known (e.g. it's not obvious
that CER is between CON and EST). Let's create a bit field from states so
that we can match multiple states at once instead. The new enum si_state_bit
contains SI_SB_* which are state bits instead of state values. The function
si_state_in() indicates if the state in argument is one of those represented
by the bit mask in second argument.
2019-06-06 16:36:19 +02:00
Olivier Houchard
03abf2d31e MEDIUM: connections: Remove CONN_FL_SOCK*
Now that the various handshakes come with their own XPRT, there's no
need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions,
so garbage-collect them.
2019-06-05 18:03:38 +02:00
Olivier Houchard
fe50bfb82c MEDIUM: connections: Introduce a handshake pseudo-XPRT.
Add a new XPRT that is used when using non-SSL handshakes, such as proxy
protocol or Netscaler, instead of taking care of it in conn_fd_handler().
This XPRT is installed when any of those is used, and it removes itself once
the handshake is done.
This should allow us to remove the distinction between CO_FL_SOCK* and
CO_FL_XPRT*.
2019-06-05 18:03:38 +02:00
Olivier Houchard
2e055483ff MINOR: connections: Add a new xprt method, add_xprt().
Add a new method to xprt_ops, add_xprt(), that changes the underlying
xprt to the one provided, and optionally provide the old one.
2019-06-05 18:03:38 +02:00
Olivier Houchard
5149b59851 MINOR: connections: Add a new xprt method, remove_xprt.
Add a new method to xprt_ops, remove_xprt. When called, if the provided
xprt_ctx is the same as the xprt's underlying xprt_ctx, it then uses the
new xprt provided, otherwise it calls the remove_xprt method of the next
xprt.
The goal is to be able to add a temporary xprt, that removes itself from
the chain when it did what it had to do. This will be used to implement
a pseudo-xprt for anything that just requires a handshake (such as the
proxy protocol).
2019-06-05 18:03:38 +02:00
Olivier Houchard
000694cf96 MINOR: ssl: Make ssl_sock_handshake() static.
ssl_sock_handshake is now only used by the ssl code itself, there's no need
to export it anymore, so make it static.
2019-06-05 18:03:38 +02:00
Olivier Houchard
ea8dd949e4 MEDIUM: ssl: Handle subscribe by itself.
As the SSL code may have different needs than the upper layer, ie it may want
to receive when the upper layer wants to right, instead of directly forwarding
the subscribe to the underlying xprt, handle it ourself. The SSL code will
know remember any subscribe call, and wake the tasklet when it is ready
for more I/O.
2019-06-05 18:03:38 +02:00
Christopher Faulet
54b5e214b0 MINOR: htx: Don't use end-of-data blocks anymore
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
2019-06-05 10:12:11 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Christopher Faulet
8f3c256f7e MEDIUM: cache/htx: Always store info about HTX blocks in the cache
It was only done for the headers (including the EOH marker). data were prefixed
by the info field of these blocks. The payload and the trailers of the messages
were stored in raw. The total size of headers and payload were kept in the
cached object state to help output formatting.

Now, info about each HTX block is store in the cache. Only data are allowed to
be splitted. Otherwise, all blocks of an HTX message are handled the same way,
both when storing a message in the cache and when delivering it from the
cache. This will help the cache implementation to be more robust to internal
changes in the HTX. Especially for the upcoming parsing of trailers. There is
also no more need to keep extra info in the cached object state.
2019-06-05 10:12:11 +02:00
Christopher Faulet
a4f9dd4a56 BUG/MINOR: channel/htx: Don't alter channel during forward for empty HTX message
In channel_htx_forward() and channel_htx_forward_forever(), if the HTX message
is empty, the underlying buffer may be really empty too. And we have no warranty
the caller will call htx_to_buf() later. And in practice, it is almost never
done. So the channel's buffer must not be altered. Otherwise, the buffer may be
considered as full (data == size) for an empty HTX message and no outgoing data.

This patch must be backported to 1.9.
2019-06-05 10:12:11 +02:00
Frdric Lcaille
8d78fa7def MINOR: peers: Make peers protocol support new "server_name" data type.
Make usage of the APIs implemented for dictionaries (dict.c) and their LRU caches (struct dcache)
so that to send/receive server names used for the server by name stickiness. These
names are sent over the network as follows:

 - in every case we send the encode length of the data (STD_T_DICT), then
 - if the server names is not present in the cache used upon transmission (struct dcache_tx)
   we cache it and we the ID of this TX cache entry followed the encode length of the
   server name, and finally the sever name itseft (non NULL terminated string).
 - if the server name is present, we repead these operations but we only send the TX cache
   entry ID.

Upon receipt, the couple of (cache IDs, server name) are stored the LRU cache used
only upon receipt (struct dcache_rx). As the peers protocol is symetrical, the fact
that the server name is present in the received data (resp. or not) denotes if
the entry is absent (resp. or not).
2019-06-05 08:42:33 +02:00
Frdric Lcaille
7da71293e4 MINOR: server: Add a dictionary for server names.
This patch only declares and defines a dictionary for the server
names (stored as ->id member field).
2019-06-05 08:33:35 +02:00
Frdric Lcaille
84d6046a33 MINOR: proxy: Add a "server by name" tree to proxy.
Add a tree to proxy struct to lookup by name for servers attached
to this proxy and populated it at parsing time.
2019-06-05 08:33:35 +02:00
Frdric Lcaille
5ad57ea85f MINOR: stick-table: Add "server_name" new data type.
This simple patch only adds definitions to create a new stick-table
data type ID and a new standard type to store information in relation
wich dictionary entries (STD_T_DICT).
2019-06-05 08:33:35 +02:00
Frdric Lcaille
74167b25f7 MINOR: peers: Add a LRU cache implementation for dictionaries.
We want to send some stick-table data fields stored as strings in dictionaries
without consuming too much memory and CPU. To do so we implement with this patch
a cache for send/received dictionaries entries. These dictionary of strings entries are
stored in others real dictionary entries with an identifier as key (unsigned int)
and a pointer to the dictionary of strings entries as values.
2019-06-05 08:33:35 +02:00
Frdric Lcaille
4a3fef834c MINOR: dict: Add dictionary new data structure.
This patch adds minimalistic definitions to implement dictionary new data structure
which is an ebtree of ebpt_node structs with strings as keys. Note that this has nothing
to see with real dictionary data structure (maps of keys in association with values).
2019-06-05 08:33:35 +02:00
Frdric Lcaille
1673bbdf98 CLEANUP: peers: Remove tabs characters.
This patch only replaces very annoying tabulation characters by spaces
so that not to have to use again tabulations where they should not be used.
2019-06-05 08:33:34 +02:00
Willy Tarreau
7bb39d7cd6 CLEANUP: connection: remove the now unused CS_FL_REOS flag
Let's remove it before it gets uesd again. It was mostly replaced with
CS_FL_EOI and by mux-specific states or flags.
2019-06-03 14:23:33 +02:00
Willy Tarreau
7067b3a92e BUG/MINOR: deinit/threads: make hard-stop-after perform a clean exit
As reported in GH issue #99, when hard-stop-after triggers and threads
are in use, the chance that any thread releases the resources in use by
the other ones is non-null. Thus no thread should be allowed to deinit()
nor exit by itself.

Here we take a different approach. We simply use a 3rd possible value
for the "killed" variable so that all threads know they must break out
of the run-poll-loop and immediately stop.

This patch was tested by commenting the stream_shutdown() calls in
hard_stop() to increase the chances to see a stream use released
resources. With this fix applied, it never crashes anymore.

This fix should be backported to 1.9 and 1.8.
2019-06-02 11:30:07 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Olivier Houchard
cfbb3e6560 MEDIUM: tasks: Get rid of active_tasks_mask.
Remove the active_tasks_mask variable, we can deduce if we've work to do
by other means, and it is costly to maintain. Instead, introduce a new
function, thread_has_tasks(), that returns non-zero if there's tasks
scheduled for the thread, zero otherwise.
2019-05-29 21:53:37 +02:00
Olivier Houchard
250031e444 MEDIUM: sessions: Introduce session flags.
Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when
we use NTLM authentication, and we should reuse the last connection. This
should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST.

This should be backported to 1.9.
2019-05-29 15:41:47 +02:00
Willy Tarreau
ef28dc11e3 MINOR: task: turn the WQ lock to an RW_LOCK
For now it's exclusively used as a write lock though, thus it remains
100% equivalent to the spinlock it replaces.
2019-05-28 19:15:44 +02:00
Willy Tarreau
186e96ece0 MEDIUM: buffers: relax the buffer lock a little bit
In lock profiles it's visible that there is a huge contention on the
buffer lock. The reason is that when offer_buffers() is called, it
systematically takes the lock before verifying if there is any
waiter. However doing so doesn't protect against races since a
waiter can happen just after we release the lock as well. Similarly
in h2 we take the lock every time an h2c is going to be released,
even without checking that the h2c belongs to a wait list. These
two have now been addressed by verifying non-emptiness of the list
prior to taking the lock.
2019-05-28 17:25:21 +02:00
Willy Tarreau
a8b2ce02b8 MINOR: activity: report the number of failed pool/buffer allocations
Haproxy is designed to be able to continue to run even under very low
memory conditions. However this can sometimes have a serious impact on
performance that it hard to diagnose. Let's report counters of failed
pool and buffer allocations per thread in show activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
2ae84e445d MEDIUM: poller: separate the wait time from the wake events
We have been abusing the do_poll()'s timeout for a while, making it zero
whenever there is some known activity. The problem this poses is that it
complicates activity diagnostic by incrementing the poll_exp field for
each known activity. It also requires extra computations that could be
avoided.

This change passes a "wake" argument to say that the poller must not
sleep. This simplifies the operations and allows one to differenciate
expirations from activity.
2019-05-28 17:25:21 +02:00
Willy Tarreau
0a7ef02074 MINOR: htx: make htx_add_data() return the transmitted byte count
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
2019-05-28 14:48:59 +02:00
Willy Tarreau
d4908fa465 MINOR: htx: rename htx_append_blk_value() to htx_add_data_atonce()
This function is now dedicated to data blocks, and we'll soon need to
access it from outside in a rare few cases. Let's rename it and export
it.
2019-05-28 14:48:59 +02:00
Christopher Faulet
39744f792d MINOR: htx: Remove support of pseudo headers because it is unused
The code to handle pseudo headers is unused and with no real value. So remove
it.
2019-05-28 07:42:33 +02:00
Christopher Faulet
613346b60e MINOR: htx: remove the unused function htx_find_blk() 2019-05-28 07:42:33 +02:00
Christopher Faulet
dab5ab551d MINOR: channel/htx: Add functions to forward a part or all HTX payload
The functions channel_htx_fwd_payload() and channel_htx_fwd_all() should now be
used to forward, respectively, a part of the HTX payload or all of it. These
functions forward data and update the first block position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
29f1758285 MEDIUM: htx: Store the first block position instead of the start-line one
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same. But
once all changes will be made on this part, this position will have to be used
by HTX analyzers, and only in the analysis context, to know where the analyse
should start.

When new blocks are added in an HTX message, if the first block position is not
defined, it is set. When the block pointed by it is removed, it is set to the
block following it. -1 remains the value to unset the position. the first block
position is unset when the HTX message is empty. It may also be unset on a
non-empty message, meaning every blocks were already analyzed.

From HTX analyzers point of view, this position is always set during headers
analysis. When they are waiting for a request or a response, if it is unset, it
means the analysis should wait. But once the analysis is started, and as long as
headers are not forwarded, it points to the message start-line.

As mentionned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to start
a loop on an HTX message.
2019-05-28 07:42:33 +02:00
Christopher Faulet
b2f4e83a28 MINOR: channel/htx: Add function to forward headers of an HTX message
The function channel_htx_fwd_headers() should now be used by HTX analyzers to
forward all headers of an HTX message, from the start-line to the corresponding
EOH. It takes care to update the star-line position.
2019-05-28 07:42:33 +02:00
Christopher Faulet
05c083ca8d MINOR: htx: Add a field to set the memory used by headers in the HTX start-line
The field hdrs_bytes has been added in the structure htx_sl. It should be used
to set how many bytes are help by all headers, from the start-line to the
corresponding EOH block. it must be set to -1 if it is unknown.
2019-05-28 07:42:12 +02:00
Christopher Faulet
9b04d22945 MINOR: connection: Remove the unused flag CO_RFL_KEEP_RSV 2019-05-28 07:42:12 +02:00
Christopher Faulet
2ae35045e2 MINOR: htx: Add function htx_get_max_blksz()
This functions should be used to get the maximum size for a block, not exceeding
the max amount of bytes passed in argument. Thus max may be set to -1 to have no
limit.
2019-05-28 07:42:12 +02:00
Christopher Faulet
aad458587d MINOR: channel/htx: Call channel_htx_recv_max() from channel_recv_max()
When channel_recv_max() is called for an HTX stream, we fall back on the HTX
version. This function is called from si_cs_recv(). This will let us pass the
max amount of bytes to read to HTX multiplexers.
2019-05-28 07:42:12 +02:00
Christopher Faulet
dd2ad8518f CLEANUP: htx: Remove unused function htx_get_stline() 2019-05-28 07:42:12 +02:00
Christopher Faulet
297fbb45fe MINOR: htx: Replace the function http_find_stline() by http_get_stline()
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.

It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
2019-05-28 07:42:12 +02:00
Christopher Faulet
a3ad6b1b8f MINOR: htx: Add functions to get the first block of an HTX message
It is the first block relatively to the start-line. So it is the start-line if
its position is set (sl_pos != -1), otherwise it is the head. The functions
htx_get_first() and htx_get_first_blk() can be used to get it.  This change is
mandatory to consider 1xx informational messages as part of a response.
2019-05-28 07:42:12 +02:00
Christopher Faulet
9c66b980fa MINOR: htx: Store start-line block's position instead of address of its payload
Nothing much to say. This change is just mandatory to consider 1xx informational
messages as part of a response.
2019-05-28 07:42:12 +02:00
Christopher Faulet
28f29c7eea MINOR: htx: Store the head position instead of the wrap one
The head of an HTX message is heavily used whereas the wrap position is only
used when a block is added or removed. So it is more logical to store the head
position in the HTX message instead of the wrap one. The wrap position can be
easily deduced. To get it, the new function htx_get_wrap() may be used.
2019-05-28 07:42:12 +02:00
Christopher Faulet
c8b246f108 MINOR: htx: Move the macro IS_HTX_STRM() in proto/stream.h
The macro IS_HTX_STRM() only relies on stream flags. So move it in
proto/stream.h.
2019-05-28 07:42:12 +02:00
Christopher Faulet
429b91d308 MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead
The macro IS_HTX_SMP() is only used at a place, in a context where the stream
always exists. So, we can remove it to use IS_HTX_STRM() instead.
2019-05-28 07:42:12 +02:00
Willy Tarreau
c3b5958255 BUG/MEDIUM: threads: fix double-word CAS on non-optimized 32-bit platforms
On armv7 haproxy doesn't work because of the fixes on the double-word
CAS. There are two issues. The first one is that the last argument in
case of dwcas is a pointer to the set of value and not a value ; the
second is that it's not enough to cast the data as (void*) since it will
be a single word. Let's fix this by using the pointers as an array of
long. This was tested on i386, armv7, x86_64 and aarch64 and it is now
fine. An alternate approach using a struct was attempted as well but it
used to produce less optimal code.

This fix must be backported to 1.9. This fixes github issue #105.

Cc: Olivier Houchard <ohouchard@haproxy.com>
2019-05-27 17:40:59 +02:00
Willy Tarreau
d6a7850200 MINOR: cli/activity: add 3 general purpose counters in development mode
The unused fd_del and fd_skip were being abused during debugging sessions
as general purpose event counters. With their removal, let's officially
have dedicated counters for such use cases. These counters are called
"ctr0".."ctr2" and are listed at the end when DEBUG_DEV is set.
2019-05-27 07:03:38 +02:00
Willy Tarreau
394c9b4215 MINOR: cli/activity: remove "fd_del" and "fd_skip" from show activity
These variables are never set anymore and were always reported as zero.
2019-05-27 06:59:14 +02:00
Willy Tarreau
c4943d5170 MINOR: buffer: add a new buffer ring API to manipulate rings of buffers
The purpose is to manipulate rings made of series of buffers so that
it is possible to continue to work on a next buffer once one is full.
This will be used by muxes to deal with contention between multiple
streams and a single output buffer. No data is expected to span over
multiple buffers, all of them will be used like a regular buffer. This
will significantly limit the amount of changes and the code complexity
while still supporting larger output buffering.

The ring is made of a head and a tail indexes both of which point to a
buffer descriptor. At least one descriptor is always valid, so it could
be seen as a form of pagination always presenting one buffer. The root
of the ring is itself stored into a buffer descriptor so that the user
only has to declare a buffer array and to call br_init() on it in order
to use it.
2019-05-26 09:26:59 +02:00
Willy Tarreau
e39b58f045 MINOR: buffer: introduce b_make() to make a buffer from its parameters
This is convenient to assign a buffer from parts of another one.
2019-05-26 09:26:59 +02:00
Willy Tarreau
7562a7291d CLEANUP: debug: remove the TRACE() macro
It has not been used for many years, is unlikely to be reused and
conflicts with the similarly named macro in flt_trace, causing warnings
at build time when including debug.h in low-level files. Let's simply
remove it.
2019-05-26 09:25:59 +02:00
Willy Tarreau
0d6c75d749 OPTIM: freq-ctr: don't take the date lock for most updates
It's amazing that the value was still incremented under the date lock,
let's first use an atomic increment for the counter and move it out of
the date lock to reduce contention. These are just counters, we don't
need to take locks if we're not rotating, atomic ops are enough. This
patch does this, and leaves the lock for when the period is over. It's
important to note that some values might be added just before or just
after a rotation but this is not a problem since we don't care if a
value is counted in the previous or next period when it's exactly on
the edge. Great care was taken to ensure that the current counter is
always atomically updated.

Other minor cleanups were performed, such as avoiding to reload the
value from memory after a CAS, or using &~1 instead of two shifts to
remove the lowest bit.
2019-05-25 20:31:53 +02:00
Willy Tarreau
7cf0e4517d MINOR: raw_sock: report global traffic statistics
Many times we've been missing per-process traffic statistics. While it
didn't make sense in multi-process mode, with threads it does. Thus we
now have a counter of bytes emitted by raw_sock, and a freq counter for
these as well. However, freq_ctr are limited to 32 bits, and given that
loads of 300 Gbps have already been reached over a loopback using
splicing, we need to downscale this a bit. Here we're storing 1/32 of
the byte rate, which gives a theorical limit of 128 GB/s or ~1 Tbps,
which is more than enough. Let's have fun re-reading this sentence in
2029 :-)  The values can be read in "show info" output on the CLI.
2019-05-23 11:45:38 +02:00
Willy Tarreau
f4c1e56b5e BUILD: signals: FreeBSD has SI_LWP instead of SI_TKILL
SI_TKILL is for Linux. We're again in the non-portable area. Both OSes
use macros to define these values so we can #ifdef them. Let's make
SI_TKILL defined based on SI_LWP when only the latter is defined.
2019-05-23 08:40:50 +02:00
Willy Tarreau
96d5195862 MEDIUM: config: deprecate the antique req* and rsp* commands
These commands don't follow the same flow as the rest of the commands,
each of them iterates over all header lines before switching to the
next directive. In addition they make no distinction between start
line and headers and can lead to unparsable rewrites which are very
difficult to deal with internally.

Most of them are still occasionally found in configurations, mainly
because of the usual "we've always done this way". By marking them
deprecated and emitting a warning and recommendation on first use of
each of them, we will raise users' awareness of users regarding the
cleaner, faster and more reliable alternatives.

Some use cases of "reqrep" still appear from time to time for URL
rewriting that is not so convenient with other rules. But at least
users facing this requirement will explain their use case so that we
can best serve them. Some discussion started on this subject in a
thread linked to from github issue #100.

The goal is to remove them in 2.1 since they require to reparse the
result before indexing it and we don't want this hack to live long.
The following directives were marked deprecated :

  -reqadd
  -reqallow
  -reqdel
  -reqdeny
  -reqiallow
  -reqidel
  -reqideny
  -reqipass
  -reqirep
  -reqitarpit
  -reqpass
  -reqrep
  -reqtarpit
  -rspadd
  -rspdel
  -rspdeny
  -rspidel
  -rspideny
  -rspirep
  -rsprep
2019-05-22 20:43:45 +02:00
Willy Tarreau
d1f56c9a01 BUG/MEDIUM: dns: make the port numbers unsigned
Mustafa Yildirim reported in Discourse that ports >32767 advertised
in SRV records are wrong. Given the high value they definitely
correspond to a sign extension of a negative number. The cause was
indeed that the port is declared as a signed int in the dns_answer_item
structure, and Lukas confirmed in github issue #103 that turning it to
unsigned addresses the issue.

It is worth noting that there are other such fields in this structure
that don't look right (ttl, priority, class, type) and that someone
should audit this part to be certain they are properly typed.

This fix must be backported to 1.9 and likely to 1.8 as well.
2019-05-22 20:07:45 +02:00
Willy Tarreau
e5733234f6 CLEANUP: build: rename some build macros to use the USE_* ones
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.

The following renames were done :

 ENABLE_POLL -> USE_POLL
 ENABLE_EPOLL -> USE_EPOLL
 ENABLE_KQUEUE -> USE_KQUEUE
 ENABLE_EVPORTS -> USE_EVPORTS
 TPROXY -> USE_TPROXY
 NETFILTER -> USE_NETFILTER
 NEED_CRYPT_H -> USE_CRYPT_H
 CONFIG_HAP_CRYPT -> USE_LIBCRYPT
 CONFIG_HAP_NS -> DUSE_NS
 CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
 CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
 CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
2019-05-22 19:47:57 +02:00
Willy Tarreau
823bda0eb7 BUILD: time: remove the test on _POSIX_C_SOURCE
It seems it's not defined on FreeBSD while it's mentioned on Linux that
clock_gettime() can be detected using this. Given that we also have the
test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks
on other systems, we'll see.

Report was here :
    https://github.com/haproxy/haproxy/runs/133866993
2019-05-22 19:14:59 +02:00
Willy Tarreau
082b62828d BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks
We currently have the ability to register functions to be called early
on thread creation and at thread deinitialization. It turns out this is
not sufficient because certain such functions may use resources that are
being allocated by the other ones, thus creating a race condition depending
only on the linking order. For example the mworker needs to register a
file descriptor while the pollers will reallocate the fd_updt[] array.
Similarly logs and trashes may be used by some init functions while it's
unclear whether they have been deduplicated.

The same issue happens on deinit, if the fd_updt[] or trash is released
before some functions finish to use them, we'll get into trouble.

This patch creates a couple of early and late callbacks for per-thread
allocation/freeing of resources. A few init functions were moved there,
and the fd init code was split between the two (since it used to both
allocate and initialize at once). This way the init/deinit sequence is
expected to be safe now.

This patch should be backported to 1.9 as at least the trash/log issue
seems to be present. The run_thread_poll_loop() code is a bit different
there as the mworker is not a callback, but it will have no effect and
it's enough to drop the mworker changes.

This bug was reported by Ilya Shipitsin in github issue #104.
2019-05-22 14:59:08 +02:00
Willy Tarreau
ca2a3cc8d5 MINOR: connection: report the mux names in "haproxy -vv"
Since the mux names appear at a few places (dumps etc), let's list
them in front of supported mux protocols in "haproxy -vv".
2019-05-22 11:50:48 +02:00
Willy Tarreau
430f590b5b MINOR: threads: add a timer_t per thread in thread_info
This will be used by the watchdog to detect that a thread locked up.
It's only defined on platforms supporting it. This patch only reserves
the room for the timer in the struct. A special value was reserved for
the uninitialized timer. The problem is that the POSIX API was horribly
designed, defining no invalid value, thus for each timer it is required
to keep a second variable to indicate whether it's valid. A quick check
shows that defining a 32-bit invalid value is not something uncommon
across other implementations, with ~0 being common. Let's try with this
and if it causes issues we can revisit this decision.
2019-05-22 11:50:48 +02:00
Willy Tarreau
e6a02fa65a MINOR: threads: add a "stuck" flag to the thread_info struct
This flag is constantly cleared by the scheduler and will be set by the
watchdog timer to detect stuck threads. It is also set by the "show
threads" command so that it is easy to spot if the situation has evolved
between two subsequent calls : if the first "show threads" shows no stuck
thread and the second one shows such a stuck thread, it indicates that
this thread didn't manage to make any forward progress since the previous
call, which is extremely suspicious.
2019-05-22 11:50:48 +02:00
Willy Tarreau
5484d58a17 MINOR: stream: introduce a stream_dump() function and use it in stream_dump_and_crash()
This function dumps a lot of information about a stream into the provided
buffer. It is now used by stream_dump_and_crash() and will be used by the
debugger as well.
2019-05-22 11:50:48 +02:00
Willy Tarreau
2beaaf7d46 MINOR: threads: implement ha_tkill() and ha_tkillall()
These functions are used respectively to signal one thread or all threads.
When multithreading is disabled, it's always the current thread which is
signaled.
2019-05-22 11:50:48 +02:00
Willy Tarreau
441259c561 MINOR: threads: make threads_{harmless|want_rdv}_mask constant 0 without threads
Some code starts to add ifdefs everywhere to work around the lack of
threads_harmless_mask when threads are not compiled in. This one is
often used to indicate a thread having joined the rendez-vous point or
a thread sleeping in the poller. By setting it to zero we translate
what usually is required in debugging code (i.e. the only thread is
currently working) and for signal handlers we can use a combination of
threads_harmless_mask and sleeping_threads_mask to detect the polling
cases as well. Similarly do the same with threads_want_rdv_mask which
is less often used though.
2019-05-22 11:50:48 +02:00
Willy Tarreau
6ea63c301d CLEANUP: objtype: make obj_type() and obj_type_name() take consts
There is no reason for them to require a writable area.
2019-05-22 11:50:48 +02:00
Tim Duesterhus
9b7a976cd6 BUG/MINOR: mworker: Fix memory leak of mworker_proc members
The struct mworker_proc is not uniformly freed everywhere, sometimes leading
to leaks of the `id` string (and possibly the other strings).

Introduce a mworker_free_child function instead of duplicating the freeing
logic everywhere to prevent this kind of issues.

This leak was reported in issue #96.

It looks like the leaks have been introduced in commit 9a1ee7ac31,
which is specific to 2.0-dev. Backporting `mworker_free_child` might be
helpful to ease backporting other fixes, though.
2019-05-22 11:29:18 +02:00
Willy Tarreau
80daaa1e9d CLEANUP: time: switch clockid_t to empty_t when not available
This is cleaner than using an int. We also get rid of the constants
that we don't need nor use.
2019-05-21 20:03:03 +02:00
Willy Tarreau
9a85a1700b MINOR: compat: define a new empty type empty_t for non-implemented fields
Some structures have optional fields which depend on availability of
certain features on certain platforms, and having to stuff lots of
ifdefs in these structs makes them unreadable. Using real values like
ints requires some initialization and adds even more confusion.

Here we take a different approach : we create an empty type called
empty_t to use as a substitute for the real type that is not implemented
and which doesn't contain any value (it's an empty struct). Thus it has
a size of zero but an address, thus a pointer may point to it. It will
not have to be initialized though. Some initialization code might even
continue to work and do nothing like initializing it using memset with
its sizeof which is zero.
2019-05-21 20:03:03 +02:00
Willy Tarreau
f61782418c CLEANUP: time: refine the test on _POSIX_TIMERS
The clock_gettime() man page says we must check that _POSIX_TIMERS is
defined to a value greater than zero, not just that it's simply defined
so let's fix this right now.
2019-05-21 20:03:03 +02:00
Emmanuel Hocdet
0ba4f483d2 MAJOR: polling: add event ports support (Solaris)
Event ports are kqueue/epoll polling class for Solaris. Code is based
on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8.
Event ports are available only on SunOS systems derived from
Solaris 10 and later (including illumos systems).
2019-05-21 15:16:45 +02:00
Willy Tarreau
219b829b62 MINOR: time: add a function to retrieve another thread's cputime
now_cpu_time_thread() does the same as now_cpu_time() but for another
thread based on its clockid.
2019-05-20 21:14:14 +02:00
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Willy Tarreau
8323a375bc MINOR: threads: add a thread-local thread_info pointer "ti"
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
2019-05-20 21:14:12 +02:00
Willy Tarreau
624dcbf41e MINOR: threads: always place the clockid in the struct thread_info
It will be easier to deal with the internal API to always have it.
2019-05-20 21:13:01 +02:00
Willy Tarreau
b81939cef0 MINOR: compat: make sure to always define clockid_t
In order to ease the internal time API, we'll have the threads time always
present even when threads are disabled. Let's make sure clockid_t, and the
minimum clock times are defined even on older or non-compatible systems.
2019-05-20 20:24:10 +02:00
Willy Tarreau
5a6e2245fa REORG: threads: move the struct thread_info from global.h to hathreads.h
It doesn't make sense to keep this struct thread_info in global.h, it
causes difficulties to access its contents from hathreads.h, let's move
it to the threads where it ought to have been created.
2019-05-20 20:00:25 +02:00
Willy Tarreau
e3e2b7283f REORG: compat: move some integer limit definitions from standard.h to compat.h
Historically standard.h was the location where we used to (re-)define the
standard set of macros and functions, and to complement the ones missing
on the target OS. Over time it has become a toolbox in itself relying on
many other things, and its definition of LONGBITS is used everywhere else
(e.g. for MAX_THREADS), resulting in painful circular dependencies.

Let's move these few defines (integer sizes) to compat.h where other
similar definitions normally are.
2019-05-20 19:59:34 +02:00
Willy Tarreau
3710105945 MINOR: tools: provide a may_access() function and make dump_hex() use it
It's a bit too easy to crash by accident when using dump_hex() on any
area. Let's have a function to check if the memory may safely be read
first. This one abuses the stat() syscall checking if it returns EFAULT
or not, in which case it means we're not allowed to read from there. In
other situations it may return other codes or even a success if the
area pointed to by the file exists. It's important not to abuse it
though and as such it's tested only once per output line.
2019-05-20 16:59:37 +02:00
Willy Tarreau
56131ca58e MINOR: debug: implement ha_panic()
This function dumps all existing threads using the thread dump mechanism
then aborts. This will be used by the lockup detection and by debugging
tools.
2019-05-20 16:51:30 +02:00
Willy Tarreau
9fc5dcbd71 MINOR: tools: add dump_hex()
This is used to dump a memory area into a buffer for debugging purposes.
2019-05-20 16:51:30 +02:00
Willy Tarreau
91e6df01fa MINOR: threads: add each thread's clockid into the global thread_info
This is the per-thread CPU runtime clock, it will be used to measure
the CPU usage of each thread and by the lockup detection mechanism. It
must only be retrieved at the beginning of run_thread_poll_loop() since
the thread must already have been started for this. But it must be done
before performing any per-thread initcall so that all thread init
functions have access to the clock ID.

Note that it could make sense to always have this clockid available even
in non-threaded situations and place the process' clock there instead.
But it would add portability issues which are currently easy to deal
with by disabling threads so it may not be worth it for now.
2019-05-20 11:42:25 +02:00
Willy Tarreau
522cfbc1ea MINOR: init/threads: make the global threads an array of structs
This way we'll be able to store more per-thread information than just
the pthread pointer. The storage became an array of struct instead of
an allocated array since it's very small (typically 512 bytes) and not
worth the hassle of dealing with memory allocation on this. The array
was also renamed thread_info to make its intended usage more explicit.
2019-05-20 11:37:57 +02:00
Willy Tarreau
b49a58dda2 CLEANUP: threads: remove the now unused START_LOCK label
The last two users are now gone.
2019-05-20 11:26:12 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we don't need anymore the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
c7091d89ae MEDIUM: debug/threads: implement an advanced thread dump system
The current "show threads" command was too limited as it was not possible
to dump other threads' detailed states (e.g. their tasks). This patch
goes further by using thread signals so that each thread can dump its
own state in turn into a shared buffer provided by the caller. Threads
are synchronized using a mechanism very similar to the rendez-vous point
and using this method, each thread can safely dump any of its contents
and the caller can finally report the aggregated ones from the buffer.

It is important to keep in mind that the list of signal-safe functions
is limited, so we take care of only using chunk_printf() to write to a
pre-allocated buffer.

This mechanism is enabled by USE_THREAD_DUMP and is enabled by default
on Linux 2.6.28+. On other platforms it falls back to the previous
solution using the loop and the less precise dump.
2019-05-17 17:16:20 +02:00
Willy Tarreau
29bf96d73d MINOR: task: always reset curr_task when freeing a task or tasklet
With the thread debugger it becomes visible that we can leave some
wandering pointers for a while in curr_task, which is inappropriate.
This patch addresses this by resetting curr_task to NULL before really
freeing the area. This way it becomes safe even regarding signals.
2019-05-17 17:16:20 +02:00
Willy Tarreau
38171daf21 MINOR: thread: implement ha_thread_relax()
At some places we're using a painful ifdef to decide whether to use
sched_yield() or pl_cpu_relax() to relax in loops, this is hardly
exportable. Let's move this to ha_thread_relax() instead and une
this one only.
2019-05-17 17:16:20 +02:00
Willy Tarreau
5cf64dd1bd MINOR: debug: make ha_thread_dump() and ha_task_dump() take a buffer
Instead of having them dump into the trash and initialize it, let's have
the caller initialize a buffer and pass it. This will be convenient to
dump multiple threads at once into a single buffer.
2019-05-17 17:16:20 +02:00
Willy Tarreau
4e2b646d60 MINOR: cli/debug: add a thread dump function
The new function ha_thread_dump() will dump debugging info about all known
threads. The current thread will contain a bit more info. The long-term goal
is to make it possible to use it in signal handlers to improve the accuracy
of some dumps.

The function dumps its output into the trash so as it was trivial to add,
a new "show threads" command appeared on the CLI.
2019-05-16 18:06:45 +02:00
Willy Tarreau
aa1e1be88f MINOR: task: export global_task_mask
It will be used in debugging functions and must be exported.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
10c6c16cde MEDIUM: Make 'option forceclose' actually warn
It is deprecated since 315b39c391 (1.9-dev),
but only was deprecated in the docs.

Make it warn when being used and remove it from the docs.
2019-05-16 18:02:03 +02:00
Willy Tarreau
0f35c593f6 BUILD: ist: turn the lower/upper case tables to literal on obsolete linkers
Gil Bahat reported build issues on Cygwin starting with 1.9 due to a
difference in the way the linker handles the weak symbols there,
causing multiple declarations of ist_lc[] and ist_uc[]. It's likely
that this issue could also happen on any older or non-ELF linker.

This patch addresses this by using literals instead on such platforms,
leaving it to the compiler to merge the constants when it can. On other
platforms the resulting executable is slightly larger due to strings
that could not be merged but this is a minor detail compared to not
being able to build at all.

If this change alone is confirmed to fix these issues, it's safe to
backport to 1.9.
2019-05-15 16:14:04 +02:00
Willy Tarreau
469fa2c9d9 MINOR: debug: add a new BUG_ON macro
We do have some code paths testing for impossible errors that tend to
be quite confusing, first for maintenance (what to do on such errors,
and how far to guess the bug), second for developers as it tends to
hide the main purpose and expectations of these call places. Also
most of the time impossible errors are ignored by the callers so the
tests are not even usable during debugging.

Let's instead implement a BUG_ON macro which takes a condition, which
if true, will cause a message to be emitted and optionally to crash the
process. Additionally, these calls inserted at various places server as
hints and documentation for developers to know that such conditions
must absolutely not happen.

This is only enabled when DEBUG_STRICT or DEBUG_STRICT_NOCRASH are set.
As its name implies, DEBUG_STRICT_NOCRASH only performs the test but
does not crash, which can be useful to track some checkpoints.

At the moment nothing uses this code.
2019-05-14 17:34:49 +02:00
Willy Tarreau
a5e33a9b66 BUILD: debug: make gcc not complain on the ABORT_NOW() macro
On recent gcc versions with the null-deref checks, ABORT_NOW() rightfully
emits such a warning. But here it's on purpose. Simply changing the memory
address to 1 makes gcc happy.
2019-05-14 17:22:28 +02:00
Willy Tarreau
8bdb5c9bb4 CLEANUP: connection: remove the handle field from the wait_event struct
It was only set and not consumed after the previous change. The reason
is that the task's context always contains the relevant information,
so there is no need for a second pointer.
2019-05-13 19:14:52 +02:00
Willy Tarreau
42ccb5ac45 MINOR: lists: add LIST_ADDED() to check if an element belongs to a list
Some code parts use LIST_ISEMPTY() a lot on list elements to detect
if they were reset consecutive to their removal from a list, but this
test is always confusing as this was initially designed for list heads.

Instead let's have a new macro, LIST_ADDED(), which returns true when
the element is in a list (i.e. it's not "empty").
2019-05-13 19:14:52 +02:00
Olivier Houchard
478281f55d BUG/MEDIUM: connections: Don't forget to set xprt_ctx to NULL on close.
In conn_xprt_close(), after calling xprt->close(), don't forget to set
conn->xprt_ctx to NULL, or we may attempt to reuse the now-free'd
conn->xprt_ctx if the connection failed and we're retrying it.
2019-05-13 19:11:38 +02:00
Willy Tarreau
6a38b3297c BUILD: threads: fix again the __ha_cas_dw() definition
This low-level asm implementation of a double CAS was implemented only
for certain architectures (x86_64, armv7, armv8). When threads are not
used, they were not defined, but since they were called directly from
a few locations, they were causing build issues on certain platforms
with threads disabled. This was addressed in commit f4436e1 ("BUILD:
threads: Add __ha_cas_dw fallback for single threaded builds") by
making it fall back to HA_ATOMIC_CAS() when threads are not defined,
but this actually made the situation worse by breaking other cases.

This patch fixes this by creating a high-level macro HA_ATOMIC_DWCAS()
which is similar to HA_ATOMIC_CAS() except that it's intended to work
on a double word, and which rely on the asm implementations when threads
are in use, and uses its own open-coded implementation when threads are
not used. The 3 call places relying on __ha_cas_dw() were updated to
use HA_ATOMIC_DWCAS() instead.

This change was tested on i586, x86_64, armv7, armv8 with and without
threads with gcc 4.7, armv8 with gcc 5.4 with and without threads, as
well as i586 with gcc-3.4 without threads. It will need to be backported
to 1.9 along with the fix above to fix build on armv7 with threads
disabled.
2019-05-11 18:13:29 +02:00
Willy Tarreau
295d614de1 CLEANUP: ssl: move all BIO_* definitions to openssl-compat
The following macros are now defined for openssl < 1.1 so that we
can remove the code performing direct access to the structures :

  BIO_get_data(), BIO_set_data(), BIO_set_init(), BIO_meth_free(),
  BIO_meth_new(), BIO_meth_set_gets(), BIO_meth_set_puts(),
  BIO_meth_set_read(), BIO_meth_set_write(), BIO_meth_set_create(),
  BIO_meth_set_ctrl(), BIO_meth_set_destroy()
2019-05-11 17:39:08 +02:00
Willy Tarreau
11b167167e CLEANUP: ssl: remove ifdef around SSL_CTX_get_extra_chain_certs()
Instead define this one in openssl-compat.h when
SSL_CTRL_GET_EXTRA_CHAIN_CERTS is not defined (which was the current
condition used in the ifdef).
2019-05-11 17:38:21 +02:00
Willy Tarreau
366a6987a7 CLEANUP: ssl: move the SSL_OP_* and SSL_MODE_* definitions to openssl-compat
These ones were defined in the middle of ssl_sock.c, better move them
to the include file to find them.
2019-05-11 17:37:44 +02:00
Olivier Houchard
602bf7d2ea MEDIUM: streams: Add a new http action, disable-l7-retry.
Add a new action for http-request, disable-l7-retry, that can be used to
disable any attempt at retry requests (see retry-on) if it fails for any
reason other than a connection failure.
This is useful for example to make sure POST requests aren't retried.
2019-05-10 17:49:09 +02:00
Chris Packham
f4436e145b BUILD: threads: Add __ha_cas_dw fallback for single threaded builds
__ha_cas_dw() is used in fd_rm_from_fd_list() and when built without
USE_THREADS=1 the linker fails to find __ha_cas_dw(). Add a definition
of __ha_cas_dw() for the #ifndef USE_THREADS case.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
2019-05-10 10:55:31 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00