Commit Graph

680 Commits

Author SHA1 Message Date
Willy Tarreau
751153e0f1 OPTIM: server: switch the actconn list to an mt-list
The remaining contention on the server lock solely comes from
sess_change_server() which takes the lock to add and remove a
stream from the server's actconn list. This is both expensive
and pointless since we have mt-lists, and this list is only
used by the CLI's "shutdown server sessions" command!

Let's migrate to an mt-list and remove the need for this costly
lock. By doing so, the request rate increased by ~1.8%.
2021-02-18 10:06:45 +01:00
Christopher Faulet
eaab7325a7 BUG/MINOR: server: Remove RMAINT from admin state when loading server state
The RMAINT admin state is dynamic and should be remove from the
srv_admin_state parameter when a server state is loaded from a server-state
file. Otherwise an erorr is reported, the server-state line is ignored and
the server state is not updated.

This patch should fix the issue #576. It must be backported as far as 1.8.
2021-02-15 11:56:31 +01:00
Emeric Brun
c943799c86 MEDIUM: resolvers/dns: split dns.c into dns.c and resolvers.c
This patch splits current dns.c into two files:

The first dns.c contains code related to DNS message exchange over UDP
and in future other TCP. We try to remove depencies to resolving
to make it usable by other stuff as DNS load balancing.

The new resolvers.c inherit of the code specific to the actual
resolvers.

Note:
It was really difficult to obtain a clean diff dur to the amount
of moved code.

Note2:
Counters and stuff related to stats is not cleany separated because
currently counters for both layers are merged and hard to separate
for now.
2021-02-13 10:03:46 +01:00
Emeric Brun
d30e9a1709 MINOR: resolvers: rework prototype suffixes to split resolving and dns.
A lot of prototypes in dns.h are specific to resolvers and must
be renamed to split resolving and DNS layers.
2021-02-13 09:43:18 +01:00
Emeric Brun
456de77bdb MINOR: resolvers: renames resolvers DNS_UPD_* returncodes to RSLV_UPD_*
This patch renames some #defines prefixes from DNS to RSLV.
2021-02-13 09:43:18 +01:00
Emeric Brun
21fbeedf97 MINOR: resolvers: renames some dns prefixed types using resolv prefix.
@@ -119,8 +119,8 @@ struct act_rule {
-               } dns;                         /* dns resolution */
+               } resolv;                      /* resolving */

-struct dns_options {
+struct resolv_options {
2021-02-13 09:43:18 +01:00
Emeric Brun
08622d3c0a MINOR: resolvers: renames some resolvers specific types to not use dns prefix
This patch applies those changes on names:

-struct dns_resolution {
+struct resolv_resolution {

-struct dns_requester {
+struct resolv_requester {

-struct dns_srvrq {
+struct resolv_srvrq {

@@ -185,12 +185,12 @@ struct stream {

        struct {
-               struct dns_requester *dns_requester;
+               struct resolv_requester *requester;
...
-       } dns_ctx;
+       } resolv_ctx;
2021-02-13 09:43:18 +01:00
Emeric Brun
750fe79cd0 MINOR: resolvers: renames type dns_resolvers to resolvers.
It also renames 'dns_resolvers' head list to sec_resolvers
to avoid conflicts with local variables 'resolvers'.
2021-02-13 09:43:17 +01:00
Emeric Brun
50c870e4de BUG/MINOR: dns: add missing sent counter and parent id to dns counters.
Resolv callbacks are also updated to rely on counters and not on
nameservers.
"show stat domain dns" will now show the parent id (i.e. resolvers
section name).
2021-02-13 09:43:17 +01:00
Christopher Faulet
469676423e CLEANUP: server: Remove useless "filepath" variable in apply_server_state()
This variable is now only used to point on the local server-state file. When
the server-state is global, it is unused. So, we now use "localfilepath"
instead. Thus, the "filepath" variable can safely be removed.
2021-02-12 16:42:00 +01:00
Christopher Faulet
8952ea636b BUG/MINOR: server: Don't call fopen() with server-state filepath set to NULL
When a local server-state file is loaded, if its name is too long, the error
is not properly handled, resulting to a call to fopen() with the "filepath"
variable set to NULL. To fix the bug, when this error occurs, we jump to the
next proxy, via a "continue" statement. And we take case to set "filepath"
variable after the error handling to be sure.

This patch should fix the issue #1111. It must be backported as far as 1.6.
2021-02-12 16:42:00 +01:00
Willy Tarreau
bb8669ae28 BUG/MINOR: server: parse_server() must take a const for the defproxy
The default proxy was passed as a variable, which in addition to being
a PITA to deal with in the config parser, doesn't feel safe to use when
it ought to be const.

This will only affect new code so no backport is needed.
2021-02-12 16:23:46 +01:00
Willy Tarreau
144289b459 REORG: move init_default_instance() to proxy.c and pass it the defproxy pointer
init_default_instance() was still left in cfgparse.c which is not the
best place to pre-initialize a proxy. Let's place it in proxy.c just
after init_new_proxy(), take this opportunity for renaming it to
proxy_preset_defaults() and taking out init_new_proxy() from it, and
let's pass it the pointer to the default proxy to be initialized instead
of implicitly assuming defproxy. We'll soon be able to exploit this.
Only two call places had to be updated.
2021-02-12 16:23:46 +01:00
William Dauchy
ddc7ce9645 MINOR: server: enhance error precision when applying server state
server health checks and agent parameters are written the same way as
others to be able to enahcne code reuse: basically we make use of
parsing and assignment at the same place. It makes it difficult for
error handling to know whether srv object was modified partially or not.
The problem was already present with SRV resolution though.

I was a bit puzzled about the approach to take to be honest, and I did
not wanted to go into a full refactor, so I assumed it was ok to simply
notify whether the line was failed or partially applied.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
William Dauchy
d1a7b85a40 MEDIUM: server: support {check,agent}_addr, agent_port in server state
logical followup from cli commands addition, so that the state server
file stays compatible with the changes made at runtime; use previously
added helper to load server attributes.

also alloc a specific chunk to avoid mixing with other called functions
using it

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
William Dauchy
63e6cba12a MEDIUM: server: add server-states version 2
Even if it is possibly too much work for the current usage, it makes
sure we don't break states file from v2.3 to v2.4; indeed, since v2.3,
we introduced two new fields, so we put them aside to guarantee we can
easily reload from a version 1.
The diff seems huge but there is no specific change apart from:
- introduce v2 where it is needed (parsing, update)
- move away from switch/case in update to be able to reuse code
- move srv lock to the whole function to make it easier

this patch confirm how painful it is to maintain this functionality.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
William Dauchy
7cabc06da6 MEDIUM: cli: add agent-port command
this patch allows to set agent port at runtime. In order to align with
both `addr` and `check-addr` commands, also add the possibility to
optionnaly set port on `agent-addr` command. This led to a small
refactor in order to use the same function for both `agent-addr` and
`agent-port` commands.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
William Dauchy
b456e1f389 MEDIUM: cli: add check-addr command
this patch allows to set server health check address at runtime. In
order to align with `addr` command, also allow to set port optionnaly.
This led to a small refactor in order to use the same function for both
`check-addr` and `check-port` commands.
for `check-port`, we however don't permit the change anymore if checks
are not enabled on the server.

This command becomes more and more useful for people having a consul
like architecture:
- the backend server is located on a container with its own IP
- the health checks are done the consul instance located on the host
  with the host IP

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-12 16:04:52 +01:00
Amaury Denoyelle
f232cb3e9b MEDIUM: connection: replace idle conn lists by eb trees
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.

This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
5c7086f6b0 MEDIUM: connection: protect idle conn lists with locks
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.

Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.

For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
2021-02-12 12:33:04 +01:00
William Lallemand
3ce6eedb37 MEDIUM: ssl: add a rwlock for SSL server session cache
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.

Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduce a
rwlock for the SSL server session cache.

In the SSL session part this patch only lock in read, even if it writes.
The reason behind this, is that in the session part, there is one cache
storage per thread so it is not a problem to write in the cache from
several threads. The problem is only when trying to write in the cache
from the CLI (which could be on any thread) when a session is trying to
access the cache. So there is a write lock in the CLI part to prevent
simultaneous access by a session and the CLI.

This patch also remove the thread_isolate attempt which is eating too
much CPU time and was not protecting from the use of a free ptr in the
session.
2021-02-09 09:43:44 +01:00
Christopher Faulet
a8979a9b59 DOC: server: Add missing params in comment of the server state line parsing
srv_use_ssl and srv_check_port parameters were not mentionned in the comment
of the function parsing a server state line.
2021-02-04 14:00:43 +01:00
William Dauchy
1c921cd748 BUG/MINOR: check: consitent way to set agentaddr
small consistency problem with `addr` and `agent-addr` options:
for the both options, the last one parsed is always used to set the
agent-check addr.  Thus these two lines don't have the same behavior:

  server ... addr <addr1> agent-addr <addr2>
  server ... agent-addr <addr2> addr <addr1>

After this patch `agent-addr` will always be the priority option over
`addr`. It means we test the flag before setting agentaddr.
We also fix all the places where we did not set the flag to be coherent
everywhere.

I was not really able to determine where this issue is coming from. So
it is probable we may backport it to all stable version where the agent
is supported.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 13:55:04 +01:00
William Dauchy
fe03e7d045 MEDIUM: server: adding support for check_port in server state
We can currently change the check-port using the cli command `set server
check-port` but there is a consistency issue when using server state.
This patch aims to fix this problem but will be also a good preparation
work to get rid of checkport flag, so we are able to know when checkport
was set by config.

I am fully aware this is not making github #953 moving forward, I
however think this might be acceptable while waiting for a proper
solution and resolve consistency problem faced with port settings.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 10:46:52 +01:00
William Dauchy
69f118d7b6 MEDIUM: check: remove checkport checkaddr flag
While trying to fix some consistency problem with the config file/cli
(e.g. check-port cli command does not set the flag), we realised
checkport flag was not necessarily needed. Indeed tcpcheck uses service
port as the last choice if check.port is zero. So we can assume if
check.port is zero, it means it was never set by the user, regardless if
it is by the cli or config file.  In the longterm this will avoid to
introduce a new consistency issue if we forget to set the flag.

in the same manner of checkport flag, we don't really need checkaddr
flag. We can assume if checkaddr is not set, it means it was never set
by the user or config.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 10:43:00 +01:00
Christopher Faulet
99497d7dba MINOR: server: Don't set the check port during the update from a state file
When the server state is loaded from a server-state file, there is no reason
to set an unconfigured check port with the server port. Because by default,
if the check port is not set, the server's one is used. Thus we can remove
this useless assignment. It is mandatory for next improvements.
2021-02-04 10:42:45 +01:00
William Dauchy
446db718cb BUG/MINOR: cli: fix set server addr/port coherency with health checks
while reading `update_server_addr_port` I found out some things which
can be seen as incoherency. I hope I did not overlooked anything:

- one comment is stating check's address should be updated if it uses
  the server one; however the condition checks if `SRV_F_CHECKADDR` is
  set; this flag is set when a check address is set; result is that we
  override the check address where I was not expecting it. In fact we
  don't need to update anything here as server addr is used when check
  addr is not set.
- same goes for check agent addr
- for port, it is a bit different, we update the check port if it is
  unset. This is harmless because we also use server port if check port
  is unset. However it creates some incoherency before/after using this
  command, as check port should stay unset througout the life of the
  process unless it is is set by `set server check-port` command.

quite hard to locate the origin of this this issue but the function was
introduced in commit d458adcc52 ("MINOR:
new update_server_addr_port() function to change both server's ADDR and
service PORT"). I was however not able to determine whether this is due
to a change of behavior along the years. So this patch can potentially
be backported up to v1.8 but we must be careful while doing so, as the
code has changed a lot. That being said, the bug being not very
impacting I would be fine keeping it for 2.4 only.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-02-04 09:06:04 +01:00
Remi Tricot-Le Breton
bb470aa327 MINOR: ssl: Remove client_crt member of the server's ssl context
The client_crt member is not used anymore since the server's ssl context
initialization now behaves the same way as the bind lines one (using
ckch stores and instances).
2021-01-26 15:19:36 +01:00
Amaury Denoyelle
18c68df558 CLEANUP: srv: fix comment for pool-max-conn
Adjust comment for the unlimited value of pool-max-conn which is -1.
2021-01-26 14:48:39 +01:00
Christopher Faulet
e3bdc81f8a MINOR: server: Forbid server definitions in frontend sections
An fatal error is now reported if a server is defined in a frontend
section. til now, a warning was just emitted and the server was ignored. The
warning was added in the 1.3.4 when the frontend/backend keywords were
introduced to allow a smooth transition and to not break existing
configs. It is old enough now to emit an fatal error in this case.

This patch is related to the issue #1043. It may be backported at least as
far as 2.2, and possibly to older versions. It relies on the previous commit
("MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities").
2021-01-13 17:45:34 +01:00
Ilya Shipitsin
2aa4b3a083 BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES
accidently src/server.c still used earlier guarding
2021-01-07 10:19:56 +01:00
Amaury Denoyelle
10d5c3172b BUG/MINOR: srv: do not cleanup idle conns if pool max is null
If a server is configured to not have any idle conns, returns immediatly
from srv_cleanup_connections. This avoids a segfault when a server is
configured with pool-max-conn to 0.

This should be backported up to 2.2.
2021-01-06 16:57:17 +01:00
Amaury Denoyelle
e3c4192962 BUG/MINOR: srv: do not init address if backend is disabled
Do not proceed on init_addr if the backend of the server is marked as
disabled. When marked as disabled, the server is not fully initialized
and some operation must be avoided to prevent segfault. It is correct
because there is no way to activate a disabled backend.

This fixes the github issue #1031.
This should be backported to 2.2.
2021-01-06 16:57:17 +01:00
Thayne McCombs
24da7e1aa6 BUG/MEDIUM: server: srv_set_addr_desc() crashes when a server has no address
GitHub Issue #1026 reported a crash during configuration check for the
following example config:

    backend 0
    server 0 0
    server 0 0

HAProxy crashed in srv_set_addr_desc() due to a NULL pointer dereference
caused by `sa2str` returning NULL for an `AF_UNSPEC` address (`0`).

Check to make sure the address key is non-null before using it for
comparison or inserting it into the tree.

The crash was introduced in commit 92149f9a8 ("MEDIUM: stick-tables: Add
srvkey option to stick-table") which not in any released version so no
backport is needed.

Cc: Tim Duesterhus <tim@bastelstu.be>
2021-01-06 09:19:15 +01:00
Tim Duesterhus
e5ff14100a CLEANUP: Compare the return value of XXXcmp() functions with zero
According to coding-style.txt it is recommended to use:

`strcmp(a, b) == 0` instead of `!strcmp(a, b)`

So let's do this.

The change was performed by running the following (very long) coccinelle patch
on src/:

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
      )
    (
      S
    |
      { ... }
    )

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
      )
    (
      S
    |
      { ... }
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )
2021-01-04 10:09:02 +01:00
Thayne McCombs
92149f9a82 MEDIUM: stick-tables: Add srvkey option to stick-table
This allows using the address of the server rather than the name of the
server for keeping track of servers in a backend for stickiness.

The peers code was also extended to support feeding the dictionary using
this key instead of the name.

Fixes #814
2020-12-31 10:04:54 +01:00
Frédéric Lécaille
f46c10cfb1 MINOR: server: Add QUIC definitions to servers.
This patch adds QUIC structs to server struct so that to make the QUIC code
compile. Also initializes the ebtree to store the connections by connection
IDs.
2020-12-23 11:57:26 +01:00
Ilya Shipitsin
f38a01884a CLEANUP: assorted typo fixes in the code and comments
This is 13n iteration of typo fixes
2020-12-21 11:24:48 +01:00
William Dauchy
f63704488e MEDIUM: cli/ssl: configure ssl on server at runtime
in the context of a progressive backend migration, we want to be able to
activate SSL on outgoing connections to the server at runtime without
reloading.
This patch adds a `set server ssl` command; in order to allow that:

- add `srv_use_ssl` to `show servers state` command for compatibility,
  also update associated parsing
- when using default-server ssl setting, and `no-ssl` on server line,
  init SSL ctx without activating it
- when triggering ssl API, de/activate SSL connections as requested
- clean ongoing connections as it is done for addr/port changes, without
  checking prior server state

example config:

backend be_foo
  default-server ssl
  server srv0 127.0.0.1:6011 weight 1 no-ssl

show servers state:

  5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - -1

where srv0 can switch to ssl later during the runtime:

  set server be_foo/srv0 ssl on

  5 be_foo 1 srv0 127.0.0.1 2 0 1 1 15 1 0 4 0 0 0 0 - 6011 - 1

Also update existing tests and create a new one.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2020-11-18 17:22:28 +01:00
Amaury Denoyelle
7f8f6cb926 BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one
Define a per-thread counters allocated with the greatest size of any
stat module counters. This variable is named trash_counters.

When using a proxy without allocated counters, return the trash counters
from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent
segfault.

This is useful for all the proxies used internally and not
belonging to the global proxy list. As these objects does not appears on
the stat report, it does not matter to use the dummy counters.

For this fix to be functional, the extra counters are explicitly
initialized to NULL on proxy/server/listener init functions.

Most notably, the crash has already been detected with the following
vtc:
- reg-tests/lua/txn_get_priv.vtc
- reg-tests/peers/tls_basic_sync.vtc
- reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc
There is probably other parts that may be impacted (SPOE for example).

This bug was introduced in the current release and do not need to be
backported. The faulty commits are
"MINOR: ssl: count client hello for stats" and
"MINOR: ssl: add counters for ssl sessions".
2020-11-12 15:16:05 +01:00
Willy Tarreau
f5fe70620c MINOR: server: remove idle lock in srv_cleanup_connections
This function used to grab the idle lock when scanning the threads for
idle connections, but it doesn't need it since the lock only protects
the tree. Let's remove it.
2020-11-06 13:22:44 +01:00
Willy Tarreau
21b9ff59b2 BUG/MEDIUM: server: make it possible to kill last idle connections
In issue #933, @jaroslawr provided a report indicating that when using
many threads and many servers, it's very difficult to terminate the last
idle connections on each server. The issue has two causes in fact. The
first one is that during the calculation of the estimate of needed
connections, we round the computation up while in previous round it was
already rounded up, so we end up adding 1 to 1 which once divided by 2
remains 1. The second issue is that servers are not woken up anymore for
purging their connections if they don't have activity. The only reason
that was there to wake them up again was in case insufficient connections
were purged. And even then the purge task itself was not woken up. But
that is not enough for getting rid of the long tail of old connections
nor updating est_need_conns.

This patch makes sure to properly wake up as long as at least one idle
connection remains, and not to round up the needed connections anymore.

Prior to this patch, a test involving many connections which suddenly
stopped would keep many idle connections, now they're effectively halved
every pool-purge-delay.

This needs to be backported to 2.2.
2020-11-05 09:12:20 +01:00
Christopher Faulet
75bef00538 MINOR: server: Copy configuration file and line for server templates
When servers based on server templates are initialized, the configuration file
and line are now copied. This helps to emit understandable warning and alert
messages.

This patch may be backported if needed, as far as 1.8.
2020-11-03 10:44:38 +01:00
Christopher Faulet
ac1c60fd9c BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup
On startup, if a server has no address but the dns resolutions are configured,
"none" method is added to the default init-addr methods, in addition to "last"
and "libc". Thus on startup, this server is set to RMAINT mode if no address is
found. It is only performed if no other init-addr method is configured.

Setting the RMAINT mode on startup is important to inhibit the health checks.

For instance, following servers will now be set to RMAINT mode on startup :

  server srv nofound.tld:80 check resolvers mydns
  server srv _http._tcp.service.local check resolvers mydns
  server-template srv 1-3 _http._tcp.service.local check resolvers mydns

while followings ones will trigger an error :

  server srv nofound.tld:80 check
  server srv nofound.tld:80 check resolvers mydns init-addr libc
  server srv _http._tcp.service.local check
  server srv _http._tcp.service.local check resolvers mydns init-addr libc
  server-template srv 1-3 _http._tcp.service.local check resolvers mydns init-addr libc

This patch must be backported as far as 1.8.
2020-11-03 10:44:26 +01:00
Amaury Denoyelle
e6ba7915eb BUG/MINOR: server: fix down_time report for stats
Adjust condition used to report down_time for statistics. There was a
tiny probabilty to have a negative downtime if last_change was superior
to now. If this is the case, return only down_time.

This bug can backported up to 1.8.
2020-10-29 18:52:39 +01:00
Amaury Denoyelle
fe2bf091f6 BUG/MINOR: server: fix srv downtime calcul on starting
When a server is up after a failure, its downtime was reset to 0 on the
statistics. This is due to a wrong condition that causes srv.down_time
to never be set. Fix this by updating down_time each time the server is in
STARTING state.

Fixes the github issue #920.
This bug can be backported up to 1.8.
2020-10-29 18:52:18 +01:00
Willy Tarreau
595e767030 MINOR: server: read-lock the cookie during srv_set_dyncookie()
No need to use an exclusive lock on the proxy anymore when reading its
setting, a read lock is enough. A few other places continue to use a
write-lock when modifying simple flags only in order to let this
function see a consistent value all along. This might be changed in
the future using barriers and local copies.
2020-10-22 17:32:28 +02:00
Willy Tarreau
ac66d6bafb MINOR: proxy; replace the spinlock with an rwlock
This is an anticipation of finer grained locking for the queues. For now
all lock places take a write lock so that there is no difference at all
with previous code.
2020-10-22 17:32:28 +02:00
Willy Tarreau
1e690bb6c4 BUG/MEDIUM: server: support changing the slowstart value from state-file
If the slowstart value in a state file implies the latest state change
is within the slowstart period, we end up calling srv_update_status()
to reschedule the server's state change but its task is not yet
allocated and remains null, causing a crash on startup.

Make sure srv_update_status() supports being called with partially
initialized servers which do not yet have a task. If the task has to
be scheduled, it will necessarily happen after initialization since
it will result from a state change.

This should be backported wherever server-state is present.
2020-10-22 12:07:07 +02:00
Willy Tarreau
c3914d4fff MEDIUM: proxy: replace proxy->state with proxy->disabled
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).

Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
2020-10-09 11:27:30 +02:00
Amaury Denoyelle
fbd0bc98fe MINOR: dns/stats: integrate dns counters in stats
Use the new stats module API to integrate the dns counters in the
standard stats. This is done in order to avoid code duplication, keep
the code related to cli out of dns and use the full possibility of the
stats function, allowing to print dns stats in csv or json format.
2020-10-05 12:02:14 +02:00
Willy Tarreau
65ec4e3ff7 MEDIUM: tools: make str2sa_range() check that the protocol has ->connect()
Most callers of str2sa_range() need the protocol only to check that it
provides a ->connect() method. It used to be used to verify that it's a
stream protocol, but it might be a bit early to get rid of it. Let's keep
the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT
is present. This way almost all call places could be cleaned from this.

There's a strange test in the server address parsing code that rechecks
the family from the socket which seems to be a duplicate of the previously
removed tests. It will have to be rechecked.
2020-09-16 22:08:08 +02:00
Willy Tarreau
5fc9328aa2 MINOR: tools: make str2sa_range() directly return the protocol
We'll need this so that it can return pointers to stacked protocol in
the future (for QUIC). In addition this removes a lot of tests for
protocol validity in the callers.

Some of them were checked further apart, or after a call to
str2listener() and they were simplified as well.

There's still a trick, we can fail to return a protocol in case the caller
accepts an fqdn for use later. This is what servers do and in this case it
is valid to return no protocol. A typical example is:

   server foo localhost:1111
2020-09-16 22:08:08 +02:00
Willy Tarreau
a93e5c7fae MINOR: tools: make str2sa_range() optionally return the fd
If a file descriptor was passed, we can optionally return it. This will
be useful for listening sockets which are both a pre-bound FD and a ready
socket.
2020-09-16 22:08:08 +02:00
Willy Tarreau
328199348b MINOR: tools: add several PA_O_* flags in str2sa_range() callers
These flags indicate whether the call is made to fill a bind or a server
line, or even just send/recv calls (like logs or dns). Some special cases
are made for outgoing FDs (e.g. pipes for logs) or socket FDs (e.g external
listeners), and there's a distinction between stream or dgram usage that's
expected to significantly help str2sa_range() proceed appropriately with
the input information. For now they are not used yet.
2020-09-16 22:08:08 +02:00
Willy Tarreau
8b0fa8f0ab MEDIUM: config: remove all checks for missing/invalid ports/ranges
Now that str2sa_range() checks for appropriate port specification, we
don't need to implement adhoc test cases in every call place, if the
result is valid, the conditions are met otherwise the error message is
appropriately filled.
2020-09-16 22:08:08 +02:00
Willy Tarreau
809587635e MINOR: tools: add several PA_O_PORT_* flags in str2sa_range() callers
These flags indicate what is expected regarding port specifications. Some
callers accept none, some need fixed ports, some have it mandatory, some
support ranges, and some take an offset. Each possibilty is reflected by
an option. For now they are not exploited, but the goal is to instrument
str2sa_range() to properly parse that.
2020-09-16 22:08:07 +02:00
Willy Tarreau
cd3a5591f6 MINOR: tools: make str2sa_range() take more options than just resolve
We currently have an argument to require that the address is resolved
but we'll soon add more, so let's turn it into a bit field. The old
"resolve" boolean is now PA_O_RESOLVE.
2020-09-16 22:08:07 +02:00
Willy Tarreau
9743f709d0 BUG/MINOR: server: report correct error message for invalid port on "socks4"
The socks4 keyword parser was a bit too much copy-pasted, it only checks
for a null port and reports "invalid range". Let's properly check for the
1-65535 range and report the correct error.

It may be backported everywhere "socks4" is present (2.0).
2020-09-15 12:00:29 +02:00
Christopher Faulet
b0b7607a54 MINOR: server: Improve log message sent when server address is updated
When the server address is set for the first time, the log message is a bit ugly
because there is no old ip address to report. Thus in the log, we can see :

  PX/SRV changed its IP from  to A.B.C.D by DNS additional record.

Now, when this happens, "(none)" is reported :

  PX/SRV changed its IP from (none) to A.B.C.D by DNS additional record.

This patch may be backported to 2.2.
2020-09-08 10:44:57 +02:00
Christopher Faulet
d6c6b5f43b BUG/MEDIUM: dns: Be sure to renew IP address for already known servers
When a SRV record for an already known server is processed, only the weight is
updated, if not configured to be ignored. It is a problem if the IP address
carried by the associated additional record changes. Because the server IP
address is never renewed.

To fix this bug, If there is an addition record attached to a SRV record, we
always try to set the IP address. If it is the same, no change is
performed. This way, IP changes are always handled.

This patch should fix the issue #841. It must be backported to 2.2.
2020-09-08 10:44:57 +02:00
Baptiste Assmann
87138c3524 BUG/MAJOR: dns: disabled servers through SRV records never recover
A regression was introduced by 13a9232ebc
when I added support for Additional section of the SRV responses..

Basically, when a server is managed through SRV records additional
section and it's disabled (because its associated Additional record has
disappeared), it never leaves its MAINT state and so never comes back to
production.
This patch updates the "snr_update_srv_status()" function to clear the
MAINT status when the server now has an IP address and also ensure this
function is called when parsing Additional records (and associating them
to new servers).

This can cause severe outage for people using HAProxy + consul (or any
other service registry) through DNS service discovery).

This should fix issue #793.
This should be backported to 2.2.
2020-08-05 21:48:23 +02:00
Jerome Magnin
012261ab34 BUG/MAJOR: dns: fix null pointer dereference in snr_update_srv_status
Since commit 13a9232eb ("MEDIUM: dns: use Additional records from SRV
responses"), a struct server can have a NULL dns_requester->resolution,
when SRV records are used and DNS answers contain an Additional section.

This is a problem when we call snr_update_srv_status() because it does
not check that resolution is NULL, and dereferences it. This patch
simply adds a test for resolution being NULL. When that happens, it means
we are using SRV records with Additional records, and an entry was removed.

This should fix issue #775.
This should be backported to 2.2.
2020-07-29 12:05:55 +02:00
Emeric Brun
d3db3846c5 BUG/MEDIUM: resolve: fix init resolving for ring and peers section.
Reported github issue #759 shows there is no name resolving
on server lines for ring and peers sections.

This patch introduce the resolving for those lines.

This patch adds  boolean a parameter to parse_server function to specify
if we want the function to perform an initial name resolving using libc.

This boolean is forced to true in case of peers or ring section.

The boolean is kept to false in case of classic servers (from
backend/listen)

This patch should be backported in branches where peers sections
support 'server' lines.
2020-07-21 17:59:20 +02:00
Willy Tarreau
2d067f93fb BUG/MEDIUM: server: fix possibly uninitialized state file on close
Previous fix dc6e8a9a7 ("BUG/MEDIUM: server: resolve state file handle
leak on reload") traded a bug for another one, now we get this warning
when building server.c, which is valid since f is not necessarily
initialized (e.g. if no global state file is passed):

  src/server.c: In function 'apply_server_state':
  src/server.c:3272:3: warning: 'f' may be used uninitialized in this function [-Wmaybe-uninitialized]
     fclose(f);
   ^~~~~~~~~

Let's initialize it first. This whole code block should really be
splitted, cleaned up and reorganized as it's possible that other
similar bugs are hidden in it.

This must be backported to the same branches the commit above is
backported to (likely 2.2 and 2.1).
2020-07-16 06:44:04 +02:00
Ilya Shipitsin
dc6e8a9a7b BUG/MEDIUM: server: resolve state file handle leak on reload
During reload, server state file is read and file handle is not released
this was indepently reported in #738 and #660.

partially resolves #660. This should be backported to 2.2 and 2.1.
2020-07-16 04:41:32 +02:00
Willy Tarreau
de4db17dee MINOR: lists: rename some MT_LIST operations to clarify them
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.

But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:

   MT_LIST_ADD{,Q}    : inconditional operation, the caller owns the
                        element, and doesn't care about the element's
                        current state (exactly like LIST_ADD)
   MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
                        already added or in the process of being added.

This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.

Note that the missing unchecked MT_LIST_ADD macro was added.

The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
2020-07-10 08:50:41 +02:00
Willy Tarreau
18ed789ae2 BUG/MEDIUM: server: don't kill all idle conns when there are not enough
In srv_cleanup_idle_connections(), we compute how many idle connections
are in excess compared to the average need. But we may actually be missing
some, for example if a certain number were recently closed and the average
of used connections didn't change much since previous period. In this
case exceed_conn can become negative. There was no special case for this
in the code, and calculating the per-thread share of connections to kill
based on this value resulted in special value -1 to be passed to
srv_migrate_conns_to_remove(), which for this function means "kill all of
them", as used in srv_cleanup_connections() for example.

This causes large variations of idle connections counts on servers and
CPU spikes at the moment the cleanup task passes. These were quite more
visible with SSL as it costs CPU to close and re-establish these
connections, and it also takes time, reducing the reuse ratio, hence
increasing the amount of connections during reconnection.

In this patch we simply skip the killing loop when this condition is met.

No backport is needed, this is purely 2.2.
2020-07-02 19:05:30 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.

This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
2020-07-01 19:07:37 +02:00
Olivier Houchard
ff1d0929b8 MEDIUM: connections: Don't use a lock when moving connections to remove.
Make it so we don't have to take a lock while moving a connection from
the idle list to the toremove_list by taking advantage of the MT_LIST.
2020-07-01 17:09:19 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
2020-07-01 15:23:15 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over previous purge period, or is a
    max of what is seen over current period; it differs from max_used_conns
    in that this one is a counter that's reset on each purge period ;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
c35bcfcc21 BUG/MINOR: server: start cleaning idle connections from various points
There's a minor glitch with the way idle connections start to be evicted.
The lookup always goes from thread 0 to thread N-1. This causes depletion
of connections on the first threads and abundance on the last ones. This
is visible with the takeover() stats below:

 $ socat - /tmp/sock1 <<< "show activity"|grep ^fd ; \
   sleep 10 ; \
   socat -/tmp/sock1 <<< "show activity"|grep ^fd
 fd_takeover: 300144 [ 91887 84029 66254 57974 ]
 fd_takeover: 359631 [ 111369 99699 79145 69418 ]

There are respectively 19k, 15k, 13k and 11k takeovers for only 4 threads,
indicating that the first thread needs a foreign FD twice more often than
the 4th one.

This patch changes this si that all threads are scanned in round robin
starting with the current one. The takeovers now happen in a much more
distributed way (about 4 times 9k) :

  fd_takeover: 1420081 [ 359562 359453 346586 354480 ]
  fd_takeover: 1457044 [ 368779 368429 355990 363846 ]

There is no need to backport this, as this happened along a few patches
that were merged during 2.2 development.
2020-06-29 14:43:16 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
0a4b0ab177 BUILD: include: add sys/types before netinet/tcp.h
Apparently Cygwin requires sys/types.h before netinet/tcp.h but doesn't
include it by itself, as shown here:
  https://github.com/haproxy/haproxy/actions/runs/131943890

This patch makes sure it's always present, which is in server.c and
the SPOA example.
2020-06-11 11:22:44 +02:00
Willy Tarreau
b2551057af CLEANUP: include: tree-wide alphabetical sort of include files
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
2020-06-11 10:18:59 +02:00
Willy Tarreau
36979d9ad5 REORG: include: move the error reporting functions to from log.h to errors.h
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().

This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
2020-06-11 10:18:59 +02:00
Willy Tarreau
51cd5956ee REORG: check: move tcpchecks away from check.c
Checks.c remains one of the largest file of the project and it contains
too many things. The tcpchecks code represents half of this file, and
both parts are relatively isolated, so let's move it away into its own
file. We now have tcpcheck.c, tcpcheck{,-t}.h.

Doing so required to export quite a number of functions because check.c
has almost everything made static, which really doesn't help to split!
2020-06-11 10:18:58 +02:00
Willy Tarreau
cee013e4e0 REORG: check: move the e-mail alerting code to mailers.c
check.c is one of the largest file and contains too many things. The
e-mail alerting code is stored there while nothing is in mailers.c.
Let's move this code out. That's only 4% of the code but a good start.
In order to do so, a few tcp-check functions had to be exported.
2020-06-11 10:18:58 +02:00
Willy Tarreau
6be7849f39 REORG: include: move cfgparse.h to haproxy/cfgparse.h
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.

There is nothing left in common/ now so this directory must not be used
anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
1e56f92693 REORG: include: move server.h to haproxy/server{,-t}.h
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a55c45470f REORG: include: move queue.h to haproxy/queue{,-t}.h
Nothing outstanding here. A number of call places were not justified and
removed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4980160ecc REORG: include: move backend.h to haproxy/backend{,-t}.h
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5e539c9b8d REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h
Almost no changes, removed stdlib and added buf-t and connection-t to
the types to avoid a warning.
2020-06-11 10:18:58 +02:00
Willy Tarreau
83487a833c REORG: include: move cli.h to haproxy/cli{,-t}.h
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2eec9b5f95 REORG: include: move stats.h to haproxy/stats{,-t}.h
Just some minor reordering, and the usual cleanup of call places for
those which didn't need it. We don't include the whole tools.h into
stats-t anymore but just tools-t.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
3f0f82e7a9 REORG: move applet.h to haproxy/applet{,-t}.h
The type file was slightly tidied. The cli-specific APPCTX_CLI_ST1_* flag
definitions were moved to cli.h. The type file was adjusted to include
buf-t.h and not the huge buf.h. A few call places were fixed because they
did not need this include.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4aa573da6f REORG: include: move checks.h to haproxy/check{,-t}.h
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).

The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
2020-06-11 10:18:58 +02:00
Willy Tarreau
7ea393d95e REORG: include: move connection.h to haproxy/connection{,-t}.h
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
2020-06-11 10:18:58 +02:00
Willy Tarreau
cea0e1bb19 REORG: include: move task.h to haproxy/task{,-t}.h
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f268ee8795 REORG: include: split global.h into haproxy/global{,-t}.h
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.

It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.

The UNIX_MAX_PATH definition was moved to compat.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
e6ce10be85 REORG: include: move sample.h to haproxy/sample{,-t}.h
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
eb92deb500 REORG: include: move dns.h to haproxy/dns{,-t}.h
The files were moved as-is.
2020-06-11 10:18:57 +02:00
Willy Tarreau
fc8f6a8517 REORG: include: move port_range.h to haproxy/port_range{,-t}.h
The port ranges didn't depend on anything. However they were missing
some includes such as stdlib and api-t.h which were added.
2020-06-11 10:18:57 +02:00
Willy Tarreau
3afc4c4bb0 REORG: include: move dict.h to hparoxy/dict{,-t}.h
This was entirely free-standing. haproxy/api-t.h was added for size_t.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2dd7c35052 REORG: include: move protocol.h to haproxy/protocol{,-t}.h
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7a00efbe43 REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
2020-06-11 10:18:57 +02:00
Willy Tarreau
92b4f1372e REORG: include: move time.h from common/ to haproxy/
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
2020-06-11 10:18:56 +02:00
Willy Tarreau
af613e8359 CLEANUP: thread: rename __decl_hathreads() to __decl_thread()
I can never figure whether it takes an "s" or not, and in the end it's
better if it matches the file's naming, so let's call it "__decl_thread".
2020-06-11 10:18:56 +02:00
Willy Tarreau
8d36697dee REORG: include: move base64.h, errors.h and hash.h from common to to haproxy/
These ones do not depend on any other file. One used to include
haproxy/api.h but that was solely for stddef.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Willy Tarreau
8d2b777fe3 REORG: ebtree: move the include files from ebtree to include/import/
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.

The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).

A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.

No further cleanup was done yet in order to keep changes minimal.
2020-06-11 09:31:11 +02:00
Emeric Brun
975564784f MEDIUM: ring: add new srv statement to support octet counting forward
log-proto <logproto>
  The "log-proto" specifies the protocol used to forward event messages to
  a server configured in a ring section. Possible values are "legacy"
  and "octet-count" corresponding respectively to "Non-transparent-framing"
  and "Octet counting" in rfc6587. "legacy" is the default.

Notes: a separated io_handler was created to avoid per messages test
and to prepare code to set different log protocols such as
request- response based ones.
2020-05-31 10:49:43 +02:00
Christopher Faulet
784063eeb2 MINOR: config: Don't dump keywords if argument is NULL
Helper functions are used to dump bind, server or filter keywords. These
functions are used to report errors during the configuration parsing. To have a
coherent API, these functions are now prepared to handle a null pointer as
argument. If so, no action is performed and functions immediately return.

This patch should fix the issue #631. It is not a bug. There is no reason to
backport it.
2020-05-18 18:30:06 +02:00
Ilya Shipitsin
c02a23f981 CLEANUP: assorted typo fixes in the code and comments
This is 9th iteration of typo fixes
2020-05-11 10:11:29 +02:00
William Dauchy
707ad328ef CLEANUP: connections: align function declaration
srv_cleanup_connections() is supposed to be static, so mark it as so.
This patch should be backported where commit 6318d33ce6
("BUG/MEDIUM: connections: force connections cleanup on server changes")
will be backported, that is to say v1.9 to v2.1.

Fixes: 6318d33ce6 ("BUG/MEDIUM: connections: force connections cleanup
on server changes")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-05-04 19:26:19 +02:00
William Dauchy
6318d33ce6 BUG/MEDIUM: connections: force connections cleanup on server changes
I've been trying to understand a change of behaviour between v2.2dev5 and
v2.2dev6. Indeed our probe is regularly testing to add and remove
servers on a given backend such as:

 # echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
 113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
 113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -

 -> curl on the corresponding frontend: reply from server:31255

 # echo "set server be_foo/srv1 addr 10.236.139.34 port 31257" | sudo socat stdio /var/lib/haproxy/stats
 IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31257' by 'stats socket command'
 # echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
 # echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
 health check port updated.
 # echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
 # echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
 113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
 113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31257 -

 -> curl on the corresponding frontend: reply for server:31257
 (notice the difference of weight)

 # echo "set server be_foo/srv1 state maint" | sudo socat stdio /var/lib/haproxy/stats
 # echo "set server be_foo/srv1 addr 0.0.0.0 port 0" | sudo socat stdio /var/lib/haproxy/stats
 IP changed from '10.236.139.34' to '0.0.0.0', port changed from '31257' to '0' by 'stats socket command'
 # echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
 113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
 113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -

 -> curl on the corresponding frontend: reply from server:31255

 # echo "set server be_foo/srv1 addr 10.236.139.34 port 31256" | sudo socat stdio /var/lib/haproxy/stats
 IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31256' by 'stats socket command'
 # echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
 # echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
 health check port updated.
 # echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
 # echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
 113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
 113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31256 -

 -> curl on the corresponding frontend: reply from server:31257 (!)

Here we indeed would expect to get an anver from server:31256. The issue
is highly linked to the usage of `pool-purge-delay`, with a value which
is higher than the duration of the test, 10s in our case.

a git bisect between dev5 and dev6 seems to show commit
079cb9af22 ("MEDIUM: connections: Revamp the way idle connections are killed")
being the origin of this new behaviour.

So if I understand the later correctly, it seems that it was more a
matter of chance that we did not saw the issue earlier.

My patch proposes to force clean idle connections in the two following
cases:
- we set a (still running) server to maintenance
- we change the ip/port of a server

This commit should be backported to 2.1, 2.0, and 1.9.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-05-02 22:24:36 +02:00
Christopher Faulet
b3b53524ad BUG/MINOR: server: Fix server_finalize_init() to avoid unused variable
The variable 'ret' must only be declared When HAProxy is compiled with the SSL
support (more precisely SSL_CTRL_SET_TLSEXT_HOSTNAME must be defined).

No backport needed.
2020-04-27 11:17:12 +02:00
Christopher Faulet
3829046893 MINOR: checks/obj_type: Add a new object type for checks
An object type is now affected to the check structure.
2020-04-27 09:39:38 +02:00
Christopher Faulet
0ae3d1dbdf MEDIUM: checks: Implement agent check using tcp-check rules
A shared tcp-check ruleset is now created to support agent checks. The following
sequence is used :

    tcp-check send "%[var(check.agent_string)] log-format
    tcp-check expect custom

The custom function to evaluate the expect rule does the same that it was done
to handle agent response when a custom check was used.
2020-04-27 09:39:38 +02:00
Christopher Faulet
ce8111ec60 MINOR: server/checks: Move parsing of server check keywords in checks.c
Parsing of following keywords have been moved in checks.c file : addr, check,
check-send-proxy, check-via-socks4, no-check, no-check-send-proxy, rise, fall,
inter, fastinter, downinter and port.
2020-04-27 09:39:38 +02:00
Christopher Faulet
cbba66cdc3 MINOR: server/checks: Move parsing of agent keywords in checks.c
Parsing of following keywords have been moved in checks.c file: agent-addr,
agent-check, agent-inter, agent-port, agent-send and no-agent-check.
2020-04-27 09:39:38 +02:00
Christopher Faulet
5d503fcf5b MEDIUM: checks: Add a shared list of tcp-check rules
A global list to tcp-check ruleset can now be used to share common rulesets with
all backends without any duplication. It is mandatory to convert all specific
protocol checks (redis, pgsql...) to tcp-check healthchecks.

To do so, a flag is now attached to each tcp-check ruleset to know if it is a
shared ruleset or not. tcp-check rules defined in a backend are still directly
attached to the proxy and not shared. In addition a second flag is used to know
if the ruleset is inherited from the defaults section.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
04578dbf37 MINOR: checks: Don't use a static tcp rule list head
To allow reusing these blocks without consuming more memory, their list
should be static and share-able accross uses. The head of the list will
be shared as well.

It is thus necessary to extract the head of the rule list from the proxy
itself. Transform it into a pointer instead, that can be easily set to
an external dynamically allocated head.
2020-04-27 09:39:37 +02:00
Christopher Faulet
8892e5d30b BUG/MEDIUM: server/checks: Init server check during config validity check
The options and directives related to the configuration of checks in a backend
may be defined after the servers declarations. So, initialization of the check
of each server must not be performed during configuration parsing, because some
info may be missing. Instead, it must be done during the configuration validity
check.

Thus, callback functions are registered to be called for each server after the
config validity check, one for the server check and another one for the server
agent-check. In addition deinit callback functions are also registered to
release these checks.

This patch should be backported as far as 1.7. But per-server post_check
callback functions are only supported since the 2.1. And the initcall mechanism
does not exist before the 1.9. Finally, in 1.7, the code is totally
different. So the backport will be harder on older versions.
2020-04-27 09:39:37 +02:00
Gaetan Rivet
10c4b4a795 MINOR: server: respect warning and alert semantic
Error codes ERR_WARN and ERR_ALERT are used to signal that the error
given is of the corresponding level. All errors are displayed as ALERT
in the display_parser_err() function.

Differentiate the display level based on the error code. If both
ERR_WARN and ERR_ALERT are used, ERR_ALERT is given priority.
2020-04-27 09:39:37 +02:00
Jerome Magnin
2e8d52f869 BUG/MINOR: ssl: default settings for ssl server options are not used
Documentation states that default settings for ssl server options can be set
using either ssl-default-server-options or default-server directives. In practice,
not all ssl server options can have default values, such as ssl-min-ver, ssl-max-ver,
etc..

This patch adds the missing ssl options in srv_ssl_settings_cpy() and srv_parse_ssl(),
making it possible to write configurations like the following examples, and have them
behave as expected.

   global
     ssl-default-server-options ssl-max-ver TLSv1.2

   defaults
     mode http

   listen l1
     bind 1.2.3.4:80
     default-server ssl verify none
     server s1 1.2.3.5:443

   listen l2
     bind 2.2.3.4:80
     default-server ssl verify none ssl-max-ver TLSv1.3 ssl-min-ver TLSv1.2
     server s1 1.2.3.6:443

This should be backported as far as 1.8.
This fixes issue #595.
2020-04-22 15:43:03 +02:00
Frédéric Lécaille
8ba10fea69 BUG/MINOR: peers: Incomplete peers sections should be validated.
Before supporting "server" line in "peers" section, such sections without
any local peer were removed from the configuration to get it validated.

This patch fixes the issue where a "server" line without address and port which
is a remote peer without address and port makes the configuration parsing fail.
When encoutering such cases we now ignore such lines remove them from the
configuration.

Thank you to Jérôme Magnin for having reported this bug.

Must be backported to 2.1 and 2.0.
2020-04-15 10:47:39 +02:00
Olivier Houchard
079cb9af22 MEDIUM: connections: Revamp the way idle connections are killed
The original algorithm always killed half the idle connections. This doesn't
take into account the way the load can change. Instead, we now kill half
of the exceeding connections (exceeding connection being the number of
used + idle connections past the last maximum used connections reached).
That way if we reach a peak, we will kill much less, and it'll slowly go back
down when there's less usage.
2020-03-30 00:30:07 +02:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Tim Duesterhus
cf6e0c8a83 MEDIUM: proxy_protocol: Support sending unique IDs using PPv2
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.

This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.

Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
2020-03-13 17:26:43 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
2020-02-25 08:16:33 +01:00
Willy Tarreau
e3b57bf92f MINOR: sample: make sample_parse_expr() able to return an end pointer
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
2020-02-14 19:02:06 +01:00
Baptiste Assmann
13a9232ebc MEDIUM: dns: use Additional records from SRV responses
Most DNS servers provide A/AAAA records in the Additional section of a
response, which correspond to the SRV records from the Answer section:

  ;; QUESTION SECTION:
  ;_http._tcp.be1.domain.tld.     IN      SRV

  ;; ANSWER SECTION:
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A1.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A8.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A5.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A6.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A4.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A3.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A2.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A7.domain.tld.

  ;; ADDITIONAL SECTION:
  A1.domain.tld.          3600    IN      A       192.168.0.1
  A8.domain.tld.          3600    IN      A       192.168.0.8
  A5.domain.tld.          3600    IN      A       192.168.0.5
  A6.domain.tld.          3600    IN      A       192.168.0.6
  A4.domain.tld.          3600    IN      A       192.168.0.4
  A3.domain.tld.          3600    IN      A       192.168.0.3
  A2.domain.tld.          3600    IN      A       192.168.0.2
  A7.domain.tld.          3600    IN      A       192.168.0.7

SRV record support was introduced in HAProxy 1.8 and the first design
did not take into account the records from the Additional section.
Instead, a new resolution is associated to each server with its relevant
FQDN.
This behavior generates a lot of DNS requests (1 SRV + 1 per server
associated).

This patch aims at fixing this by:
- when a DNS response is validated, we associate A/AAAA records to
  relevant SRV ones
- set a flag on associated servers to prevent them from running a DNS
  resolution for said FADN
- update server IP address with information found in the Additional
  section

If no relevant record can be found in the Additional section, then
HAProxy will failback to running a dedicated resolution for this server,
as it used to do.
This behavior is the one described in RFC 2782.
2020-01-22 07:19:54 +01:00
William Dauchy
7675c720f8 CLEANUP: server: remove unused err section in server_finalize_init
Since commit 980855bd95 ("BUG/MEDIUM: server: initialize the orphaned
conns lists and tasks at the end"), we no longer use err section.

This should fix github issue #438

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-09 05:54:48 +01:00
Willy Tarreau
ca7a5af664 BUG/MINOR: state-file: do not leak memory on parse errors
Issue #417 reports a possible memory leak in the state-file loading code.
There's one such place in the loop which corresponds to parsing errors
where the curreently allocated line is not freed when dropped. In any
case this is very minor in that no more than the file's length may be
lost in the worst case, considering that the whole file is kept anyway
in case of success. This fix addresses this.

It should be backported to 2.1.
2019-12-20 17:33:05 +01:00
Willy Tarreau
fd1aa01f72 BUG/MINOR: state-file: do not store duplicates in the global tree
The global state file tree isn't configured for unique keys, so if an
entry appears multiple times, e.g. due to a bogus script that concatenates
entries multiple times, this will needlessly eat memory. Let's just drop
duplicates.

This should be backported to 2.1.
2019-12-20 17:23:40 +01:00
Willy Tarreau
7d6a1fa311 BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry
Starting haproxy with a state file of 700k servers eats 11.2 GB of RAM
due to a mistake in the function that loads the strings into a tree: it
allocates a full buffer for each backend+server name instead of allocating
just the required string. By just fixing this we're down to 80 MB.

This should be backported to 2.1.
2019-12-20 17:18:13 +01:00
Willy Tarreau
2444108f16 BUG/MINOR: server: make "agent-addr" work on default-server line
As reported in issue #408, "agent-addr" doesn't work on default-server
lines. This is due to the transcription of the old "addr" option in commit
6e5e0d8f9e ("MINOR: server: Make 'default-server' support 'addr' keyword.")
which correctly assigns it to the check.addr and agent.addr fields, but
which also copies the default check.addr into both the check's and the
agent's addr fields. Thus the default agent's address is never used.

This fix makes sure to copy the check from the check and the agent from
the agent. However it's worth noting that if "addr" is specified on the
server line, it will still overwrite both the check and the agent's
addresses.

This must be backported as far as 1.8.
2019-12-11 15:43:45 +01:00
Daniel Corbett
f8716914c7 MEDIUM: dns: Add resolve-opts "ignore-weight"
It was noted in #48 that there are times when a configuration
may use the server-template directive with SRV records and
simultaneously want to control weights using an agent-check or
through the runtime api.  This patch adds a new option
"ignore-weight" to the "resolve-opts" directive.

When specified, any weight indicated within an SRV record will
be ignored.  This is for both initial resolution and ongoing
resolution.
2019-11-21 17:25:31 +01:00
Vedran Furac
5d48627aba BUG/MINOR: server: check return value of fopen() in apply_server_state()
fopen() can return NULL when state file is missing. This patch adds a
check of fopen() return value so we can skip processing in such case.

No backport needed.
2019-10-21 16:00:24 +02:00
Olivier Houchard
859dc80f94 MEDIUM: list: Separate "locked" list from regular list.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
2019-09-23 18:16:08 +02:00
Willy Tarreau
9d00869323 CLEANUP: cli: replace all occurrences of manual handling of return messages
There were 221 places where a status message or an error message were built
to be returned on the CLI. All of them were replaced to use cli_err(),
cli_msg(), cli_dynerr() or cli_dynmsg() depending on what was expected.
This removed a lot of duplicated code because most of the times, 4 lines
are replaced by a single, safer one.
2019-08-09 11:26:10 +02:00
Willy Tarreau
5e83d996cf BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue()
A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.

While the latter seems more natural, it is fundamentally incompatible with
mayn other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. Howwever reversing the lock in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.

The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.

Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.

This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.

The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".

This must be backported as far as 1.8.
2019-07-30 14:02:06 +02:00
Olivier Houchard
4be7190c10 BUG/MEDIUM: servers: Fix a race condition with idle connections.
When we're purging idle connections, there's a race condition, when we're
removing the connection from the idle list, to add it to the list of
connections to free, if the thread owning the connection tries to free it
at the same time.
To fix this, simply add a per-thread lock, that has to be hold before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list, to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performances.
This has not been reported yet, but could provoke random segfaults.

This should be backported to 2.0.
2019-07-11 16:16:38 +02:00
Frédéric Lécaille
1b9423d214 MINOR: server: Add "no-tfo" option.
Simple patch to add "no-tfo" option to "default-server" and "server" lines
to disable any usage of TCP fast open.

Must be backported to 2.0.
2019-07-04 14:45:52 +02:00
Olivier Houchard
8d82db70a5 BUG/MEDIUM: servers: Authorize tfo in default-server.
There's no reason to forbid using tfo with default-server, so allow it.

This should be backported to 2.0.
2019-07-04 13:34:25 +02:00
Baptiste Assmann
da29fe2360 MEDIUM: server: server-state global file stored in a tree
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).

The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!

This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This result in way much less fopen, fgets, and strcmp calls, which make
loading of very big state files very quick now.
2019-06-17 13:40:42 +02:00
Baptiste Assmann
95c2c01ced MEDIUM: server: server-state only rely on server name
Since h7da71293e431b5ebb3d6289a55b0102331788ee6as has been added, the
server name (srv->id in the code) is now unique per backend, which
means it can reliabely be used to identify a server recovered from the
server-state file.

This patch cleans up the parsing of server-state file and ensure we use
only the server name as a reliable key.
2019-06-14 14:18:55 +02:00
Willy Tarreau
9faebe34cd MEDIUM: tools: improve time format error detection
As reported in GH issue #109 and in discourse issue
https://discourse.haproxy.org/t/haproxy-returns-408-or-504-error-when-timeout-client-value-is-every-25d
the time parser doesn't error on overflows nor underflows. This is a
recurring problem which additionally has the bad taste of taking a long
time before hitting the user.

This patch makes parse_time_err() return special error codes for overflows
and underflows, and adds the control in the call places to report suitable
errors depending on the requested unit. In practice, underflows are almost
never returned as the parsing function takes care of rounding values up,
so this might possibly happen on 64-bit overflows returning exactly zero
after rounding though. It is not really possible to cut the patch into
pieces as it changes the function's API, hence all callers.

Tests were run on about every relevant part (cookie maxlife/maxidle,
server inter, stats timeout, timeout*, cli's set timeout command,
tcp-request/response inspect-delay).
2019-06-07 19:32:02 +02:00
Willy Tarreau
975b155ebb MINOR: server: really increase the pool-purge-delay default to 5 seconds
Commit fb55365f9 ("MINOR: server: increase the default pool-purge-delay
to 5 seconds") did this but the setting placed in new_server() was
overwritten by srv_settings_cpy() from the default-server values preset
in init_default_instance(). Now let's put it at the right place.
2019-06-06 16:25:55 +02:00
Frédéric Lécaille
7da71293e4 MINOR: server: Add a dictionary for server names.
This patch only declares and defines a dictionary for the server
names (stored as ->id member field).
2019-06-05 08:33:35 +02:00
Willy Tarreau
fb55365f9e MINOR: server: increase the default pool-purge-delay to 5 seconds
The default used to be a very aggressive delay of 1 second before starting
to purge idle connections, but tests show that with bursty traffic it's a
bit short. Let's increase this to 5 seconds.
2019-06-04 14:06:31 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Willy Tarreau
e5733234f6 CLEANUP: build: rename some build macros to use the USE_* ones
We still have quite a number of build macros which are mapped 1:1 to a
USE_something setting in the makefile but which have a different name.
This patch cleans this up by renaming them to use the USE_something
one, allowing to clean up the makefile and make it more obvious when
reading the code what build option needs to be added.

The following renames were done :

 ENABLE_POLL -> USE_POLL
 ENABLE_EPOLL -> USE_EPOLL
 ENABLE_KQUEUE -> USE_KQUEUE
 ENABLE_EVPORTS -> USE_EVPORTS
 TPROXY -> USE_TPROXY
 NETFILTER -> USE_NETFILTER
 NEED_CRYPT_H -> USE_CRYPT_H
 CONFIG_HAP_CRYPT -> USE_LIBCRYPT
 CONFIG_HAP_NS -> DUSE_NS
 CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE
 CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY
 CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL
2019-05-22 19:47:57 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Willy Tarreau
034c88cf03 MEDIUM: tcp: add the "tfo" option to support TCP fastopen on the server
This implements support for the new API which relies on a call to
setsockopt().
On systems that support it (currently, only Linux >= 4.11), this enables
using TCP fast open when connecting to server.
Please note that you should use the retry-on "conn-failure", "empty-response"
and "response-timeout" keywords, or the request won't be able to be retried
on failure.

Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>
2019-05-06 22:29:39 +02:00
Ilya Shipitsin
0c50b1ecbb BUG/MEDIUM: servers: fix typo "src" instead of "srv"
When copying the settings for all servers when using server templates,
fix a typo, or we would never copy the length of the ALPN to be used for
checks.

This should be backported to 1.9.
2019-04-30 23:04:47 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Willy Tarreau
0e492e2ad0 BUILD: address a few cases of "static <type> inline foo()"
Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.
2019-04-15 21:55:48 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Willy Tarreau
c912f94b57 MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
2019-02-28 16:08:54 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Willy Tarreau
980855bd95 BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
This also depends on the nbthread count, so it must only be performed after
parsing the whole config file. As a side effect, this removes some code
duplication between servers and server-templates.

This must be backported to 1.9.
2019-02-07 15:08:13 +01:00
Willy Tarreau
835daa119e BUG/MEDIUM: server: initialize the idle conns list after parsing the config
The idle conns lists are sized according to the number of threads. As such
they cannot be initialized during the parsing since nbthread can be set
later, as revealed by this simple config which randomly crashes when used.
Let's do this at the end instead.

    listen proxy
        bind :4445
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s
        http-reuse always
        server s1 127.0.0.1:8000

    global
        nbthread 8

This fix must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
9c538e01c2 MINOR: server: add a max-reuse parameter
Some servers may wish to limit the total number of requests they execute
over a connection because some of their components might leak resources.
In HTTP/1 it was easy, they just had to emit a "connection: close" header
field with the last response. In HTTP/2, it's less easy because the info
is not always shared with the component dealing with the H2 protocol and
it could be harder to advertise a GOAWAY with a stream limit.

This patch provides a solution to this by adding a new "max-reuse" parameter
to the server keyword. This parameter indicates how many times an idle
connection may be reused for new requests. The information is made available
and the underlying muxes will be able to use it at will.

This patch should be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
15c120d251 CLEANUP: server: fix indentation mess on idle connections
Apparently some code was moved around leaving the inner block incorrectly
indented and with the closing brace in the middle of nowhere.
2019-01-24 19:06:43 +01:00
Willy Tarreau
cb923d5001 MINOR: server: make sure pool-max-conn is >= -1
The keyword parser doesn't check the value range, but supported values are
-1 and positive values, thus we should check it.

This can be backported to 1.9.
2019-01-24 16:31:56 +01:00
Jérôme Magnin
f57afa453a BUG/MINOR: server: don't always trust srv_check_health when loading a server state
When we load health values from a server state file, make sure what we assign
to srv->check.health actually matches the state we restore.

This should be backported as far as 1.6.
2019-01-21 11:09:03 +01:00
Willy Tarreau
1ba32032ef BUG/MEDIUM: checks: fix recent regression on agent-check making it crash
In order to address the mailers issues, we needed to store the proxy
into the checks struct, which was done by commit c98aa1f18 ("MINOR:
checks: Store the proxy in checks."). However this one did it only for
the health checks and not for the agent checks, resulting in an immediate
crash when the agent is enabled on a random config like this one :

  listen agent
      bind :8000
      server s1 255.255.255.255:1 agent-check agent-port 1

Thanks to Seri Kim for reporting it and providing a reproducer in
issue #20. This fix must be backported to 1.9.
2019-01-21 07:48:26 +01:00
Frédéric Lécaille
355b2033ec MINOR: cfgparse: SSL/TLS binding in "peers" sections.
Make "bind" keywork be supported in "peers" sections.
All "bind" settings are supported on this line.
Add "default-bind" option to parse the binding options excepted the bind address.
Do not parse anymore the bind address for local peers on "server" lines.
Do not use anymore list_for_each_entry() to set the "peers" section
listener parameters because there is only one listener by "peers" section.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
c06b5d4f74 MINOR: cfgparse: Make "peer" lines be parsed as "server" lines.
With this patch "default-server" lines are supported in "peers" sections
to setup the default settings of peers which are from now setup
when parsing both "peer" and "server" lines.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Olivier Houchard
c98aa1f182 MINOR: checks: Store the proxy in checks.
Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.

This should be backported to 1.9.
2019-01-14 11:15:11 +01:00
Daniel Corbett
43bb842a08 BUG/MEDIUM: init: Initialize idle_orphan_conns for first server in server-template
When initializing server-template all of the servers after the first
have srv->idle_orphan_conns initialized within server_template_init()
The first server does not have this initialized and when http-reuse
is active this causes a segmentation fault when accessed from
srv_add_to_idle_list().  This patch removes the check for
srv->tmpl_info.prefix within server_finalize_init() and allows
the first server within a server-template to have srv->idle_orphan_conns
properly initialized.

This should be backported to 1.9.
2019-01-09 14:45:21 +01:00
Olivier Houchard
921501443b MEDIUM: checks: Add check-alpn.
Add a way to configure the ALPN used by check, with a new "check-alpn"
keyword. By default, the checks will use the server ALPN, but it may not
be convenient, for instance because the server may use HTTP/2, while checks
are unable to do HTTP/2 yet.
2018-12-21 19:54:16 +01:00
Olivier Houchard
21944019ca BUG/MEDIUM: server: Also copy "check-sni" for server templates.
When using server templates, if "check-sni" is used, make sure it shows up
in all the created servers.

This should be backported to 1.8 and 1.9.
2018-12-21 19:53:28 +01:00
Olivier Houchard
b7b3faa79c MEDIUM: servers: Replace idle-timeout with pool-purge-delay.
Instead of the old "idle-timeout" mechanism, add a new option,
"pool-purge-delay", that sets the delay before purging idle connections.
Each time the delay happens, we destroy half of the idle connections.
2018-12-15 23:50:09 +01:00
Olivier Houchard
006e3101f9 MEDIUM: servers: Add a command to limit the number of idling connections.
Add a new command, "pool-max-conn" that sets the maximum number of connections
waiting in the orphan idling connections list (as activated with idle-timeout).
Using "-1" means unlimited. Using pools is now dependant on this.
2018-12-15 23:50:08 +01:00
Olivier Houchard
0c18a6fe34 MEDIUM: servers: Add a way to keep idle connections alive.
Add a new keyword for servers, "idle-timeout". If set, unused connections are
kept alive until the timeout happens, and will be picked for reuse if no
other connection is available.
2018-12-02 18:16:53 +01:00
Willy Tarreau
76a551de2e MINOR: config: make sure to associate the proper mux to bind and servers
Currently a mux may be forced on a bind or server line by specifying the
"proto" keyword. The problem is that the mux may depend on the proxy's
mode, which is not known when parsing this keyword, so a wrong mux could
be picked.

Let's simply update the mux entry while checking its validity. We do have
the name and the side, we only need to see if a better mux fits based on
the proxy's mode. It also requires to remove the side check while parsing
the "proto" keyword since a wrong mux could be picked.

This way it becomes possible to declare multiple muxes with the same
protocol names and different sides or modes.
2018-12-02 13:29:35 +01:00
Willy Tarreau
0108d90c6c MEDIUM: init: convert all trivial registration calls to initcalls
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :

- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
2018-11-26 19:50:32 +01:00
Olivier Houchard
c756600103 MINOR: server: Add "alpn" and "npn" keywords.
Add new keywords to "server" lines, alpn and npn.
If set, when connecting through SSL, those alpn/npn will be negociated
during the SSL handshake.
2018-11-22 19:50:08 +01:00
Joseph Herlant
44466826b1 CLEANUP: fix a few typos in the comments of the server subsystem
A few misspells where detected in the server subsystem. This commit
fixes them.
2018-11-18 22:23:15 +01:00
Willy Tarreau
db398435aa MINOR: stream-int: replace si_cant_put() with si_rx_room_{blk,rdy}()
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().

The now unused si_cant_put() function was removed.
2018-11-18 21:41:50 +01:00
Willy Tarreau
0cd3bd628a MINOR: stream-int: rename si_applet_{want|stop|cant}_{get|put}
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
2018-11-11 10:18:37 +01:00
William Lallemand
313bfd18c1 MINOR: server: export new_server() function
The new_server() function will be useful to create a proxy for the
master-worker.
2018-10-28 13:51:38 +01:00
Willy Tarreau
5dfb6c4cc9 CLEANUP: state-file: make the path concatenation code a bit more consistent
There are as many ways to build the globalfilepathlen variable as branches
in the if/then/else, creating lots of confusion. Address the most obvious
parts, but some polishing definitely is still needed.
2018-10-16 19:26:12 +02:00
Olivier Houchard
17f8b90736 MINOR: server: Use memcpy() instead of strncpy().
Use memcpy instead of strncpy, strncpy buys us nothing, and gcc is being
annoying.
2018-10-16 19:22:20 +02:00
Dirkjan Bussink
415150f764 MEDIUM: ssl: add support for ciphersuites option for TLSv1.3
OpenSSL released support for TLSv1.3. It also added a separate function
SSL_CTX_set_ciphersuites that is used to set the ciphers used in the
TLS 1.3 handshake. This change adds support for that new configuration
option by adding a ciphersuites configuration variable that works
essentially the same as the existing ciphers setting.

Note that it should likely be backported to 1.8 in order to ease usage
of the now released openssl-1.1.1.
2018-10-08 19:20:13 +02:00
Frédéric Lécaille
5afb3cfbcc BUG/MINOR: server: Crash when setting FQDN via CLI.
This patch ensures that a DNS resolution may be launched before
setting a server FQDN via the CLI. Especially, it checks that
resolvers was set.

A LEVEL 4 reg testing file is provided.

Thanks to Lukas Tribus for having reported this issue.

Must be backported to 1.8.
2018-09-12 07:41:41 +02:00
Baptiste Assmann
6d0f38f00d BUG/MEDIUM: dns/server: fix incomatibility between SRV resolution and server state file
Server state file has no indication that a server is currently managed
by a DNS SRV resolution.
And thus, both feature (DNS SRV resolution and server state), when used
together, does not provide the expected behavior: a smooth experience...

This patch introduce the "SRV record name" in the server state file and
loads and applies it if found and wherever required.

This patch applies to haproxy-dev branch only. For backport, a specific patch
is provided for 1.8.
2018-09-04 17:40:22 +02:00
Willy Tarreau
49725a0977 BUG/MEDIUM: check/threads: do not involve the rendez-vous point for status updates
thread_isolate() is currently being called with the server lock held.
This is not acceptable because it prevents other threads from reaching
the rendez-vous point. Now that the LB algos are thread-safe, let's get
rid of this call.

No backport is nedeed.
2018-08-21 19:54:09 +02:00
Willy Tarreau
3bcc2699ba BUG/MEDIUM: cli/threads: protect some server commands against concurrent operations
The server-specific CLI commands "set weight", "set maxconn",
"disable agent", "enable agent", "disable health", "enable health",
"disable server" and "enable server" were not protected against
concurrent accesses. Now they take the server lock around the
sensitive part.

This patch must be backported to 1.8.
2018-08-21 15:35:31 +02:00
Willy Tarreau
46b7f53ad9 DOC: server/threads: document which functions need to be called with/without locks
At the moment it's totally unclear while reading the server's code which
functions require to be called with the server lock held and which ones
grab it and cannot be called this way. This commit simply inventories
all of them to indicate what is detected depending on how these functions
use the struct server. Only functions used at runtime were checked, those
dedicated to config parsing were skipped. Doing so already has uncovered
a few bugs on some CLI actions.
2018-08-21 14:58:25 +02:00
Willy Tarreau
eeba36b3af BUG/MEDIUM: server: update our local state before propagating changes
Commit 3ff577e ("MAJOR: server: make server state changes synchronous again")
reintroduced synchronous server state changes. However, during the previous
change from synchronous to asynchronous, the server state propagation was
placed at the end of the function to ease the code changes, and the commit
above didn't put it back at its place. This has resulted in propagated
states to be incomplete. For example, making a server leave maintenance
would make it up but would leave its tracking servers down because they
see their tracked server is still down.

Let's just move the status update right to its place. It also adds the
benefit of reporting state changes in the order they appear and not in
reverse.

No backport is needed.
2018-08-21 08:29:25 +02:00
Patrick Hemmer
0355dabd7c MINOR: queue: replace the linked list with a tree
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
2018-08-10 15:06:27 +02:00
Christopher Faulet
8ed0a3e32a MINOR: mux/server: Add 'proto' keyword to force the multiplexer's protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the server's definition.
2018-08-08 10:42:08 +02:00
Willy Tarreau
91c2826e1d CLEANUP: server: remove the update list and the update lock
These ones are not more used, let's get rid of them.
2018-08-08 09:57:45 +02:00
Willy Tarreau
3ff577e165 MAJOR: server: make server state changes synchronous again
Now we try to synchronously push updates as they come using the new rdv
point, so that the call to the server update function from the main poll
loop is not needed anymore.

It further reduces the apparent latency in the health checks as the response
time almost always appears as 0 ms, resulting in a slightly higher check rate
of ~1960 conn/s. Despite this, the CPU consumption has slightly dropped again
to ~32% for the same test.

The only trick is that the checks code is built with a bit of recursivity
because srv_update_status() calls server_recalc_eweight(), and the latter
needs to signal srv_update_status() in case of updates. Thus we added an
extra argument to this function to indicate whether or not it must
propagate updates (no if it comes from srv_update_status).
2018-08-08 09:57:45 +02:00
Willy Tarreau
3d3700f216 MEDIUM: checks: use the new rendez-vous point to spread check result
The current sync point causes some important stress when a high number
of threads is in use on a config with lots of checks, because it wakes
up all threads every time a server state changes.

A config like the following can easily saturate a 4-core machine reaching
only 750 checks per second out of the ~2000 configured :

    global
        nbthread 4

    defaults
        mode    http
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend srv
        bind :8001 process 1/1
        redirect location / if { method OPTIONS } { rand(100) ge 50 }
        stats uri /

    backend chk
        option httpchk
        server-template srv 1-100 127.0.0.1:8001 check rise 1 fall 1 inter 50

The reason is that the random on the fake server causes the responses
to randomly match an HTTP check, and results in a lot of up/down events
that are broadcasted to all threads. It's worth noting that the CPU usage
already dropped by about 60% between 1.8 and 1.9 just due to the scheduler
updates, but the sync point remains expensive.

In addition, it's visible on the stats page that a lot of requests end up
with an L7TOUT status in ~60ms. With smaller timeouts, it's even L4TOUT
around 20-25ms.

By not using THREAD_WANT_SYNC() anymore and only calling the server updates
under thread_isolate(), we can avoid all these wakeups. The CPU usage on
the same config drops to around 44% on the same machine, with all checks
being delivered at ~1900 checks per second, and the stats page shows no
more timeouts, even at 10 ms check interval. The difference is mainly
caused by the fact that there's no more need to wait for a thread to wake
up from poll() before starting to process check results.
2018-08-08 09:56:32 +02:00
Willy Tarreau
6a78e61694 BUG/MEDIUM: servers: check the queues once enabling a server
Commit 64cc49c ("MAJOR: servers: propagate server status changes
asynchronously.") heavily changed the way the server states are
updated since they became asynchronous. During this change, some
code was lost, which is used to shut down some sessions from a
backup server and to pick pending connections from a proxy once
a server is turned back from maintenance to ready state. The
effect is that when temporarily disabling a server, connections
stay in the backend's queue, and when re-enabling it, they are
not picked and they expire in the backend's queue. Now they're
properly picked again.

This fix must be backported to 1.8.
2018-08-07 10:14:53 +02:00
Olivier Houchard
306e653331 BUG/MINOR: servers: Don't make "server" in a frontend fatal.
When parsing the configuration, if "server", "default-server" or
"server-template" are found in a frontend, we first warn that it will be
ignored, only to be considered a fatal error later. Be true to our word, and
just ignore it.

This should be backported to 1.8 and 1.7.
2018-07-24 17:13:54 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Baptiste Assmann
8e2d9430c0 MINOR: dns: new DNS options to allow/prevent IP address duplication
By default, HAProxy's DNS resolution at runtime ensure that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.

This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".
2018-07-12 17:56:44 +02:00
Daniel Corbett
9215ffa6b2 BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file
When creating a state file using "show servers state" an empty field is
created in the srv_addr column if the server is from the socket family
AF_UNIX.  This leads to a warning on start up when using
"load-server-state-from-file". This patch defaults srv_addr to "-" if
the socket family is not covered.

This patch should be backported to 1.8.
2018-05-24 22:06:08 +02:00
Aurélien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Emmanuel Hocdet
4399c75f6c MINOR: proxy-v2-options: add crc32c
This patch add option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2.
It compute the checksum of proxy protocol v2 header as describe in
"doc/proxy-protocol.txt".
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
253c3b7516 MINOR: connection: add proxy-v2-options authority
This patch add option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS
connection was negotiated. In this case, authority corresponds to the sni.
2018-03-01 11:38:32 +01:00
Emmanuel Hocdet
fa8d0f1875 MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
This patch implement proxy protocol v2 options related to crypto information:
ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and
cert-key (PP2_SUBTYPE_SSL_KEY_ALG).
2018-03-01 11:38:28 +01:00
Emmanuel Hocdet
f643b80429 MINOR: introduce proxy-v2-options for send-proxy-v2
Proxy protocol v2 can transport many optional informations. To avoid
send-proxy-v2-* explosion, this patch introduce proxy-v2-options parameter
and will allow to write: "send-proxy-v2 proxy-v2-options ssl,cert-cn".
2018-02-02 05:52:51 +01:00
Christopher Faulet
8d01fd6b3c BUG/MEDIUM: threads/server: Fix deadlock in srv_set_stopping/srv_set_admin_flag
Because of a typo (HA_SPIN_LOCK instead of HA_SPIN_UNLOCK), there is a deadlock
in srv_set_stopping and srv_set_admin_flag when there is at least one trackers.

This patch must be backported in 1.8.
2018-01-25 13:51:23 +01:00
Olivier Houchard
e9bad0a936 MINOR: servers: Don't report duplicate dyncookies for disabled servers.
Especially with server-templates, it can happen servers starts with a
placeholder IP, in the disabled state. In this case, we don't want to report
that the same cookie was generated for multiple servers. So defer the test
until the server is enabled.

This should be backported to 1.8.
2018-01-23 14:05:17 +01:00
Emeric Brun
e31148031f BUG/MEDIUM: checks: a server passed in maint state was not forced down.
Setting a server in maint mode, the required next_state was not set
before calling the 'lb_down' function and so the system state was never
commited.

This patch should be backported in 1.8
2017-12-21 15:23:55 +01:00
Emeric Brun
8f29829e24 BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
The new admin state was not correctly commited in this case.
Checks were fully disabled but the server was not marked in MAINT state.
It results with a server definitely stucked on the DOWN state.

This patch should be backported on haproxy 1.8
2017-12-06 17:01:00 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
Willy Tarreau
358847f026 BUILD: server: check->desc always exists
Clang reports this warning :

  src/server.c:872:14: warning: address of array 'check->desc' will
  always evaluate to 'true' [-Wpointer-bool-conversion]

Indeed, check->desc used to be a pointer to a dynamically allocated area
a long time ago and is now an array. Let's remove the useless test.
2017-11-20 21:33:21 +01:00
Christopher Faulet
9dcf9b6f03 MINOR: threads: Use __decl_hathreads to declare locks
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
2017-11-13 11:38:17 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This remove any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Olivier Houchard
283810773a BUG/MINOR: dns: Don't lock the server lock in snr_check_ip_callback().
snr_check_ip_callback() may be called with the server lock, so don't attempt
to lock it again, instead, make sure the callers always have the lock before
calling it.
2017-11-06 18:34:42 +01:00
Olivier Houchard
55dcdf4c39 BUG/MINOR: dns: Don't try to get the server lock if it's already held.
dns_link_resolution() can be called with the server lock already held, so
don't attempt to lock it again in that case.
2017-11-06 18:34:24 +01:00
Willy Tarreau
6ce38f3eab CLEANUP: server: get rid of return statements in the CLI parser
There were two many return, some of them missing a spin_unlock call,
let's use a goto to a central place instead.
2017-11-05 10:19:23 +01:00
Willy Tarreau
a075258a2c BUG/MINOR: cli: add severity in "set server addr" parser
Commit c3680ec ("MINOR: add severity information to cli feedback messages")
introduced a severity level to CLI messages, but one of them was missed
on "set server addr". No backport is needed.
2017-11-05 10:17:49 +01:00
Willy Tarreau
1c8980f9b5 BUG/MINOR: cli: do not perform an invalid action on "set server check-port"
The "set server <srv> check-port" CLI handler forgot to return after
detecting an error on the port number, and still proceeds with the action.
This needs to be backported to 1.7.
2017-11-05 10:13:37 +01:00
Willy Tarreau
2a858a82ec BUG/MAJOR: threads/server: missing unlock in CLI fqdn parser
This one didn't properly unlock before returning an error message.
2017-11-05 10:13:37 +01:00
Emeric Brun
e9fd6b5916 BUG/MINOR: dns: fix missing lock protection on server.
To avoid inconsistencies server's attributes must be read
or updated under lock.
2017-11-03 15:17:55 +01:00
Olivier Houchard
d16bfe6c01 BUG/MINOR: dns: Fix SRV records with the new thread code.
srv_set_fqdn() may be called with the DNS lock already held, but tries to
lock it anyway. So, add a new parameter to let it know if it was already
locked or not;
2017-10-31 15:47:55 +01:00
Emeric Brun
9f0b458525 MEDIUM: threads/server: Use the server lock to protect health check and cli concurrency 2017-10-31 13:58:33 +01:00
Christopher Faulet
b2812a6240 MEDIUM: thread/dns: Make DNS thread-safe 2017-10-31 13:58:33 +01:00
Christopher Faulet
5d42e099c5 MINOR: threads/server: Add a lock to deal with insert in updates_servers list
This list is used to save changes on the servers state. So when serveral threads
are used, it must be locked. The changes are then applied in the sync-point. To
do so, servers_update_status has be moved in the sync-point. So this is useless
to lock it at this step because the sync-point is a protected area by iteself.
2017-10-31 13:58:31 +01:00
Christopher Faulet
29f77e846b MEDIUM: threads/server: Add a lock per server and atomically update server vars
The server's lock is use, among other things, to lock acces to the active
connection list of a server.
2017-10-31 13:58:31 +01:00
Christopher Faulet
40a007cf2a MEDIUM: threads/server: Make connection list (priv/idle/safe) thread-safe
For now, we have a list of each type per thread. So there is no need to lock
them. This is the easiest solution for now, but not the best one because there
is no sharing between threads. An idle connection on a thread will not be able
be used by a stream on another thread. So it could be a good idea to rework this
patch later.
2017-10-31 13:58:30 +01:00
Christopher Faulet
67957bd59e MAJOR: dns: Refactor the DNS code
This is a huge patch with many changes, all about the DNS. Initially, the idea
was to update the DNS part to ease the threads support integration. But quickly,
I started to refactor some parts. And after several iterations, it was
impossible for me to commit the different parts atomically. So, instead of
adding tens of patches, often reworking the same parts, it was easier to merge
all my changes in a uniq patch. Here are all changes made on the DNS.

First, the DNS initialization has been refactored. The DNS configuration parsing
remains untouched, in cfgparse.c. But all checks have been moved in a post-check
callback. In the function dns_finalize_config, for each resolvers, the
nameservers configuration is tested and the task used to manage DNS resolutions
is created. The links between the backend's servers and the resolvers are also
created at this step. Here no connection are kept alive. So there is no needs
anymore to reopen them after HAProxy fork. Connections used to send DNS queries
will be opened on demand.

Then, the way DNS requesters are linked to a DNS resolution has been
reworked. The resolution used by a requester is now referenced into the
dns_requester structure and the resolution pointers in server and dns_srvrq
structures have been removed. wait and curr list of requesters, for a DNS
resolution, have been replaced by a uniq list. And Finally, the way a requester
is removed from a DNS resolution has been simplified. Now everything is done in
dns_unlink_resolution.

srv_set_fqdn function has been simplified. Now, there is only 1 way to set the
server's FQDN, independently it is done by the CLI or when a SRV record is
resolved.

The static DNS resolutions pool has been replaced by a dynamoc pool. The part
has been modified by Baptiste Assmann.

The way the DNS resolutions are triggered by the task or by a health-check has
been totally refactored. Now, all timeouts are respected. Especially
hold.valid. The default frequency to wake up a resolvers is now configurable
using "timeout resolve" parameter.

Now, as documented, as long as invalid repsonses are received, we really wait
all name servers responses before retrying.

As far as possible, resources allocated during DNS configuration parsing are
releases when HAProxy is shutdown.

Beside all these changes, the code has been cleaned to ease code review and the
doc has been updated.
2017-10-31 11:36:12 +01:00
Olivier Houchard
796a2b3324 BUG/MEDIUM: server: Allocate tmptrash before using it.
Don't forget to allocate tmptrash before using it, and free it once we're
done.

[wt: introduced by commit 64cc49cf ("MAJOR: servers: propagate server
status changes asynchronously"), no backport needed]
2017-10-24 19:54:25 +02:00
Emeric Brun
5a1335110c BUG/MEDIUM: log: check result details truncated.
Fix regression introduced by commit:
'MAJOR: servers: propagate server status changes asynchronously.'

The building of the log line was re-worked to be done at the
postponed point without lack of data.

[wt: this only affects 1.8-dev, no backport needed]
2017-10-19 18:51:32 +02:00
Willy Tarreau
06d80a9a9c REORG: channel: finally rename the last bi_* / bo_* functions
For HTTP/2 we'll need some buffer-only equivalent functions to some of
the ones applying to channels and still squatting the bi_* / bo_*
namespace. Since these names have kept being misleading for quite some
time now and are really getting annoying, it's time to rename them. This
commit will use "ci/co" as the prefix (for "channel in", "channel out")
instead of "bi/bo". The following ones were renamed :

  bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr,
  bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject,
  bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf
2017-10-19 15:01:08 +02:00
Emeric Brun
64cc49cf7e MAJOR: servers: propagate server status changes asynchronously.
In order to prepare multi-thread development, code was re-worked
to propagate changes asynchronoulsy.

Servers with pending status changes are registered in a list
and this one is processed and emptied only once 'run poll' loop.

Operational status changes are performed before administrative
status changes.

In a case of multiple operational status change or admin status
change in the same 'run poll' loop iteration, those changes are
merged to reach only the targeted status.
2017-10-13 12:00:27 +02:00
Willy Tarreau
6fb4ba38e0 BUG/MEDIUM: server: unwanted behavior leaving maintenance mode on tracked stopping server (take2)
Previous patch got accidently broken. This one fixes it.
2017-09-21 17:37:38 +02:00
Emeric Brun
e1e3947e7e BUG/MEDIUM: server: unwanted behavior leaving maintenance mode on tracked stopping server
Leaving the maintenance state and if the server remains in stopping mode due
to a tracked one:

- We mistakenly try to grab some pending conns and shutdown backup sessions.
- The proxy down time and last change were also mistakenly updated
2017-09-21 17:30:01 +02:00
Christopher Faulet
3bbd65b23e BUG/MINOR: dns: Fix check on nameserver in snr_resolution_cb
snr_resolution_cb can be called with <nameserver> parameter set to NULL. So we
must check it before using it. This is done most of time, except when we deal
with invalid DNS response.
2017-09-15 18:42:23 +02:00
Andjelko Iharos
c3680ecdf8 MINOR: add severity information to cli feedback messages 2017-09-13 13:38:32 +02:00
Willy Tarreau
3d609a755e Revert "BUG/MINOR: server: Remove FQDN requirement for using init-addr and state file"
This reverts commit 19e8aa58f7.

It causes some trouble reported by Manu :
   listen tls
     [...]
     server bla 127.0.0.1:8080

   [ALERT] 248/130258 (21960) : parsing [/etc/haproxy/test.cfg:53] : 'server bla' : no method found to resolve address '(null)'
   [ALERT] 248/130258 (21960) : Failed to initialize server(s) addr.

According to Nenad :
  "It's not a good way to fix the issue we were experiencing
   before. It will need a bigger rewrite, because the logic in
   srv_iterate_initaddr needs to be changed."
2017-09-06 14:22:45 +02:00
Nenad Merdanovic
19e8aa58f7 BUG/MINOR: server: Remove FQDN requirement for using init-addr and state file
Historically the DNS was the only way of updating the server IP dynamically
and the init-addr processing and state file load required the server to have
an FQDN defined. Given that we can now update the IP through the socket as
well and also can have different init-addr values (like IP and 'none') - this
requirement needs to be removed.

This patch should be backported to 1.7.
2017-09-05 15:52:58 +02:00
Emeric Brun
52a91d3d48 MEDIUM: check: server states and weight propagation re-work
The server state and weight was reworked to handle
"pending" values updated by checks/CLI/LUA/agent.
These values are commited to be propagated to the
LB stack.

In further dev related to multi-thread, the commit
will be handled into a sync point.

Pending values are named using the prefix 'next_'
Current values used by the LB stack are named 'cur_'
2017-09-05 15:23:16 +02:00
Baptiste Assmann
747359eeca BUG/MINOR: dns: server set by SRV records stay in "no resolution" status
This patch fixes a bug where some servers managed by SRV record query
types never ever recover from a "no resolution" status.
The problem is due to a wrong function called when breaking the
server/resolution (A/AAAA) relationship: this is performed when a server's SRV
record disappear from the SRV response.
2017-08-22 11:34:49 +02:00
Baptiste Assmann
6fb8192b28 MINOR: dns: enable caching of responses for server set by a SRV record
The function srv_set_fqdn() is used to update a server's fqdn and set
accordingly its DNS resolution.
Current implementation prevents a server whose update is triggered by a
SRV record from being linked to an existing resolution in the cache (if
applicable).
This patch aims at fixing this.
2017-08-18 11:25:41 +02:00
Olivier Houchard
8da5f98fbe MINOR: dns: Handle SRV records.
Make it so for each server, instead of specifying a hostname, one can use
a SRV label.
When doing so, haproxy will first resolve the SRV label, then use the
resulting hostnames, as well as port and weight (priority is ignored right
now), to each server using the SRV label.
It is resolved periodically, and any server disappearing from the SRV records
will be removed, and any server appearing will be added, assuming there're
free servers in haproxy.
2017-08-09 16:32:49 +02:00
Olivier Houchard
a8c6db8d2d MINOR: dns: Cache previous DNS answers.
As DNS servers may not return all IPs in one answer, we want to cache the
previous entries. Those entries are removed when considered obsolete, which
happens when the IP hasn't been returned by the DNS server for a time
defined in the "hold obsolete" parameter of the resolver section. The default
is 30s.
2017-08-09 16:32:49 +02:00
Frédéric Lécaille
3169471964 MINOR: Add server port field to server state file.
This patch adds server ports to server state file at the end of each line
for backward compatibility.
2017-08-03 14:31:46 +02:00
Frédéric Lécaille
0bedb8ac90 BUG/MAJOR: server: Segfault after parsing server state file.
This patch makes the server state file parser ignore servers wich are
not present in the configuration file.
2017-06-15 15:30:30 +02:00
Baptiste Assmann
201c07f681 MAJOR/REORG: dns: DNS resolution task and requester queues
This patch is a major upgrade of the internal run-time DNS resolver in
HAProxy and it brings the following 2 main changes:

1. DNS resolution task

Up to now, DNS resolution was triggered by the health check task.
From now, DNS resolution task is autonomous. It is started by HAProxy
right after the scheduler is available and it is woken either when a
network IO occurs for one of its nameserver or when a timeout is
matched.

From now, this means we can enable DNS resolution for a server without
enabling health checking.

2. Introduction of a dns_requester structure

Up to now, DNS resolution was purposely made for resolving server
hostnames.
The idea, is to ensure that any HAProxy internal object should be able
to trigger a DNS resolution. For this purpose, 2 things has to be done:
  - clean up the DNS code from the server structure (this was already
    quite clean actually) and clean up the server's callbacks from
    manipulating too much DNS resolution
  - create an agnostic structure which allows linking a DNS resolution
    and a requester of any type (using obj_type enum)

3. Manage requesters through queues

Up to now, there was an uniq relationship between a resolution and it's
owner (aka the requester now). It's a shame, because in some cases,
multiple objects may share the same hostname and may benefit from a
resolution being performed by a third party.
This patch introduces the notion of queues, which are basically lists of
either currently running resolution or waiting ones.

The resolutions are now available as a pool, which belongs to the resolvers.
The pool has has a default size of 64 resolutions per resolvers and is
allocated at configuration parsing.
2017-06-02 11:58:54 +02:00
Baptiste Assmann
fa4a663095 MINOR: dns: implement a LRU cache for DNS resolutions
Introduction of a DNS response LRU cache in HAProxy.

When a positive response is received from a DNS server, HAProxy stores
it in the struct resolution and then also populates a LRU cache with the
response.
For now, the key in the cache is a XXHASH64 of the hostname in the
domain name format concatened to the query type in string format.
2017-06-02 11:40:01 +02:00
Baptiste Assmann
729c901c3f MAJOR: dns: save a copy of the DNS response in struct resolution
Prior this patch, the DNS responses were stored in a pre-allocated
memory area (allocated at HAProxy's startup).
The problem is that this memory is erased for each new DNS responses
received and processed.

This patch removes the global memory allocation (which was not thread
safe by the way) and introduces a storage of the dns response  in the
struct
resolution.
The memory in the struct resolution is also reserved at start up and is
thread safe, since each resolution structure will have its own memory
area.

For now, we simply store the response and use it atomically per
response per server.
2017-06-02 11:30:21 +02:00
Baptiste Assmann
fb7091e213 MINOR: dns: new snr_check_ip_callback function
In the process of breaking links between dns_* functions and other
structures (mainly server and a bit of resolution), the function
dns_get_ip_from_response needs to be reworked: it now can call
"callback" functions based on resolution's owner type to allow modifying
the way the response is processed.

For now, main purpose of the callback function is to check that an IP
address is not already affected to an element of the same type.

For now, only server type has a callback.
2017-06-02 11:28:14 +02:00
Baptiste Assmann
42746373eb REORG: dns: dns_option structure, storage of hostname_dn
This patch introduces a some re-organisation around the DNS code in
HAProxy.

1. make the dns_* functions less dependent on 'struct server' and 'struct resolution'.

With this in mind, the following changes were performed:
- 'struct dns_options' has been removed from 'struct resolution' (well,
  we might need it back at some point later, we'll see)
  ==> we'll use the 'struct dns_options' from the owner of the resolution
- dns_get_ip_from_response(): takes a 'struct dns_options' instead of
  'struct resolution'
  ==> so the caller can pass its own dns options to get the most
      appropriate IP from the response
- dns_process_resolve(): struct dns_option is deduced from new
  resolution->requester_type parameter

2. add hostname_dn and hostname_dn_len into struct server

In order to avoid recomputing a server's hostname into its domain name
format (and use a trash buffer to store the result), it is safer to
compute it once at configuration parsing and to store it into the struct
server.
In the mean time, the struct resolution linked to the server doesn't
need anymore to store the hostname in domain name format. A simple
pointer to the server one will make the trick.

The function srv_alloc_dns_resolution() properly manages everything for
us: memory allocation, pointer updates, etc...

3. move resolvers pointer into struct server

This patch makes the pointer to struct dns_resolvers from struct
dns_resolution obsolete.
Purpose is to make the resolution as "neutral" as possible and since the
requester is already linked to the resolvers, then we don't need this
information anymore in the resolution itself.
2017-06-02 11:26:48 +02:00
Baptiste Assmann
4f91f7ea59 MINOR: dns: parse_server() now uses srv_alloc_dns_resolution()
In order to make DNS code more consistent, the function parse_server()
now uses srv_alloc_dns_resolution() to set up a server and its
resolution.
2017-06-02 11:20:50 +02:00
Baptiste Assmann
81ed1a0516 MINOR: dns: functions to manage memory for a DNS resolution structure
A couple of new functions to allocate and free memory for a DNS
resolution structure. Main purpose is to to make the code related to DNS
more consistent.
They allocate or free memory for the structure itself. Later, if needed,
they should also allocate / free the buffers, etc, used by this structure.
They don't set/unset any parameters, this is the role of the caller.

This patch also implement calls to these function eveywhere it is
required.
2017-06-02 11:20:29 +02:00
Baptiste Assmann
9d41fe7f98 CLEANUP: server.c: missing prototype of srv_free_dns_resolution
Prototype for the function srv_free_dns_resolution() missing at the top
of the file.
2017-06-02 11:18:28 +02:00
Frédéric Lécaille
b418c1228c MINOR: server: cli: Add server FQDNs to server-state file and stats socket.
This patch adds a new stats socket command to modify server
FQDNs at run time.
Its syntax:
  set server <backend>/<server> fqdn <FQDN>
This patch also adds FQDNs to server state file at the end
of each line for backward compatibility ("-" if not present).
2017-05-03 06:58:53 +02:00
Frédéric Lécaille
72ed4758d6 MINOR: server: Add server_template_init() function to initialize servers from a templates.
This patch adds server_template_init() function used to initialize servers
from server templates. It is called just after having parsed a 'server-template'
line.
2017-04-21 15:42:10 +02:00
Frédéric Lécaille
b82f742b78 MINOR: server: Add 'server-template' new keyword supported in backend sections.
This patch makes backend sections support 'server-template' new keyword.
Such 'server-template' objects are parsed similarly to a 'server' object
by parse_server() function, but its first arguments are as follows:
    server-template <ID prefix> <nb | range> <ip | fqdn>:<port> ...

The remaining arguments are the same as for 'server' lines.

With such server template declarations, servers may be allocated with IDs
built from <ID prefix> and <nb | range> arguments.

For instance declaring:
    server-template foo 1-5 google.com:80 ...
or
    server-template foo 5 google.com:80 ...

would be equivalent to declare:
    server foo1 google.com:80 ...
    server foo2 google.com:80 ...
    server foo3 google.com:80 ...
    server foo4 google.com:80 ...
    server foo5 google.com:80 ...
2017-04-21 15:42:10 +02:00
Frédéric Lécaille
759ea98db2 MINOR: server: Extract the code which finalizes server initializations after 'server' lines parsing.
This patch moves the code which is responsible of finalizing server initializations
after having fully parsed a 'server' line (health-check, agent check and SNI expression
initializations) from parse_server() to new functions.
2017-04-21 15:42:10 +02:00
Frédéric Lécaille
58b207cdd5 MINOR: server: Extract the code responsible of copying default-server settings.
This patch moves the code responsible of copying default server settings
to a new server instance from parse_server() function to new defsrv_*_cpy()
functions which may be used both during server lines parsing and during server
templates initializations to come.

These defsrv_*_cpy() do not make any reference to anything else than default
server settings.
2017-04-21 15:42:10 +02:00
Frédéric Lécaille
daa2fe6621 BUG/MINOR: server: missing default server 'resolvers' setting duplication.
'resolvers' setting was not duplicated from default server setting to
new server instances when parsing 'server' lines.
This fix is simple: strdup() default resolvers <id> string argument after
having allocated a new server when parsing 'server' lines.

This patch must be backported to 1.7 and 1.6.
2017-04-21 15:42:09 +02:00
Olivier Houchard
7d8e688953 BUG/MINOR: server: don't use "proxy" when px is really meant.
In server_parse_sni_expr(), we use the "proxy" global variable, when we
should probably be using "px" given as an argument.
It happens to work by accident right now, but may not in the future.

[wt: better backport it]
2017-04-20 19:51:10 +02:00
Frédéric Lécaille
dfacd69b94 BUG/MAJOR: Broken parsing for valid keywords provided after 'source' setting.
Any valid keyword could not be parsed anymore if provided after 'source' keyword.
This was due to the fact that 'source' number of arguments is variable.
So, as its parser srv_parse_source() is the only one who may know how many arguments
was provided after 'source' keyword, it updates 'cur_arg' variable (the index
in the line of the current arg to be parsed), this is a good thing.
This variable is also incremented by one (to skip the 'source' keyword).
This patch disable this behavior.

Should have come with dba9707 commit.
2017-04-16 18:13:06 +02:00
Frédéric Lécaille
8d083ed796 BUG/MINOR: server: Fix a wrong error message during 'usesrc' keyword parsing.
'usesrc' setting is not permitted on 'server' lines if not provided after
'source' setting. This is now also the case on 'default-server' lines.
Without this patch parse_server() parser displayed that 'usersrc' is
an unknown keyword.

Should have come with dba9707 commit.
2017-04-15 13:42:55 +02:00
Willy Tarreau
04bf98149b BUG/MEDIUM: servers: unbreak server weight propagation
This reverts commit 266b1a8 ("MEDIUM: server: Inherit CLI weight changes and
agent-check weight responses") from Michal Idzikowski, which is still broken.
It stops propagating weights at the first error encountered, leaving servers
in a random state depending on what LB algorithms are used on other servers
tracking the one experiencing the weight change. It's unsure what the best
way to address this is, but we cannot leave the servers in an inconsistent
state between farms. For example :

  backend site1
      mode http
      balance uri
      hash-type consistent
      server s1 127.0.0.1:8001 weight 10 track servers/s1

  backend site2
      mode http
      balance uri
      server s1 127.0.0.1:8001 weight 10 track servers/s1

  backend site3
      mode http
      balance uri
      hash-type consistent
      server s1 127.0.0.1:8001 weight 10 track servers/s1

  backend servers
      server s1 127.0.0.1:8001 weight 10 check inter 1s

The weight change is applied on "servers/s1". It tries to propagate
to the servers tracking it, which are site1/s1, site2/s1 and site3/s1.
Let's say that "weight 50%" is requested. The servers are linked in
reverse-order, so the change is applied to "servers/s1", then to
"site3/s1", then to "site2/s1" and this one fails and rejects the
change. The change is aborted and never propagated to "site1/s1",
which keeps the server in a different state from "site3/s1". At the
very least, in case of error, the changes should probably be unrolled.

Also the error reported on the CLI (when changing from the CLI) simply says :

  Backend is using a static LB algorithm and only accepts weights '0%' and '100%'.

Without more indications what the faulty backend is.

Let's revert this change for now, as initially feared it will definitely
cause more harm than good and at least needs to be revisited. It was never
backported to any stable branch so no backport is needed.
2017-04-13 15:09:26 +02:00
Michal Idzikowski
266b1a8336 MEDIUM: server: Inherit CLI weight changes and agent-check weight responses
When agent-check or CLI command executes relative weight change this patch
propagates it to tracking server allowing grouping many backends running on
same server underneath. Additionaly in case with many src IPs many backends
can have shared state checker, so there won't be unnecessary health checks.

[wt: Note: this will induce some behaviour change on some setups]
2017-04-13 11:31:38 +02:00
David Carlier
3a471935e6 BUG/MINOR: server : no transparent proxy for DragonflyBSD
IP*_BINDANY is not defined under this system thus it is
necessary to make those fields access since CONFIG_HAP_TRANSPARENT
is not defined.
[wt: problem introduced late in 1.8-dev. The same fix was also reported
  by Steven Davidovitz]
2017-04-10 15:27:46 +02:00
Olivier Houchard
b4a2d5e19a MINOR server: Restrict dynamic cookie check to the same proxy.
Each time we generate a dynamic cookie, we try to make sure the same
cookie hasn't been generated for another server, it's very unlikely, but
it may happen.
We only have to check that for the servers in the same proxy, no, need to
check in others, plus the code was buggy and would always check in the
first proxy of the proxy list.
2017-04-10 15:20:11 +02:00
David Carlier
6f1820864b CLEANUP: server: moving netinet/tcp.h inclusion
netinet/tcp.h needs sys/types.h for u_int* types usage,
issue found while building on OpenBSD.
2017-04-04 08:22:48 +02:00
Frédéric Lécaille
acd4827eca BUG/MEDIUM: server: Wrong server default CRT filenames initialization.
This patch fixes a bug which came with 5e57643 commit where server
default CRT filenames were initialized to the same value as server
default CRL filenames.
2017-03-29 16:37:56 +02:00
Frédéric Lécaille
6e0843c0e0 MINOR: server: Add 'no-agent-check' server keyword.
This patch adds 'no-agent-check' setting supported both by 'default-server'
and 'server' directives to disable an agent check for a specific server which would
have 'agent-check' set as default value (inherited from 'default-server'
'agent-check' setting), or, on 'default-server' lines, to disable 'agent-check' setting
as default value for any further 'server' declarations.

For instance, provided this configuration:

    default-server agent-check
    server srv1
    server srv2 no-agent-check
    server srv3
    default-server no-agent-check
    server srv4

srv1 and srv3 would have an agent check enabled contrary to srv2 and srv4.

We do not allocate anymore anything when parsing 'default-server' 'agent-check'
setting.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
2a0d061a60 MINOR: server: Make 'default-server' support 'disabled' keyword.
Before this patch, only 'server' directives could support 'disabled' setting.
This patch makes also 'default-server' directives support this setting.
It is used to disable a list of servers declared after a 'defaut-server' directive.
'enabled' new keyword has been added, both supported as 'default-server' and
'server' setting, to enable again a list of servers (so, declared after a
'default-server enabled' directive) or to explicitly enable a specific server declared
after a 'default-server disabled' directive.

For instance provided this configuration:

    default-server disabled
    server srv1...
    server srv2...
    server srv3... enabled
    server srv4... enabled

srv1 and srv2 are disabled and srv3 and srv4 enabled.

This is equivalent to this configuration:

    default-server disabled
    server srv1...
    server srv2...
    default-server enabled
    server srv3...
    server srv4...

even if it would have been preferable/shorter to declare:

    server srv3...
    server srv4...
    default-server disabled
    server srv1...
    server srv2...

as 'enabled' is the default server state.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
6e5e0d8f9e MINOR: server: Make 'default-server' support 'addr' keyword.
This patch makes 'default-server' support 'addr' setting.
The code which was responsible of parsing 'server' 'addr' setting
has moved from parse_server() to implement a new parser
callable both as 'default-server' and 'server' 'addr' setting parser.

Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
9a146de934 MINOR: server: Make 'default-server' support 'sni' keyword.
This patch makes 'default-server' directives support 'sni' settings.
A field 'sni_expr' has been added to 'struct server' to temporary
stores SNI expressions as strings during both 'default-server' and 'server'
lines parsing. So, to duplicate SNI expressions from 'default-server' 'sni' setting
for new 'server' instances we only have to "strdup" these strings as this is
often done for most of the 'server' settings.
Then, sample expressions are computed calling sample_parse_expr() (only for 'server'
instances).
A new function has been added to produce the same error output as before in case
of any error during 'sni' settings parsing (display_parser_err()).
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
dba9707713 MINOR: server: Make 'default-server' support 'source' keyword.
Before this patch, only 'server' directives could support 'source' setting.
This patch makes also 'default-server' directives support this setting.

To do so, we had to extract the code responsible of parsing 'source' setting
arguments from parse_server() function and make it callable both
as 'default-server' and 'server' 'source' setting parser. So, the code is mostly
the same as before except that before allocating anything for 'struct conn_src'
members, we must free the memory previously allocated.

Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
22f41a2d23 MINOR: server: Make 'default-server' support 'namespace' keyword.
Before this patch, 'namespace' setting was only supported by 'server' directive.
This patch makes 'default-server' directive support this setting.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
5c3cd97550 MINOR: server: Make 'default-server' support 'tcp-ut' keyword.
This patch makes 'default-server' directive support 'tcp-ut' keyword.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
bcaf1d7397 MINOR: server: Make 'default-server' support 'ciphers' keyword.
This patch makes 'default-server' directive support 'ciphers' setting.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
9d1b95b591 MINOR: server: Make 'default-server' support 'cookie' keyword.
Before this patch, 'cookie' setting was only supported by 'server' directives.
This patch makes 'default-server' directive also support 'cookie' setting.
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
547356e484 MINOR: server: Make 'default-server' support 'observe' keyword.
Before this path, 'observe' setting was only supported by 'server' directives.
This patch makes 'default-server' directives also support 'observe' setting.
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
16186236dd MINOR: server: Make 'default-server' support 'redir' keyword.
Before this patch only 'server' directives could support 'redir' setting.
This patch makes also 'default-server' directives support 'redir' setting.
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
5e57643e09 MINOR: server: Make 'default-server' support 'ca-file', 'crl-file' and 'crt' settings.
This patch makes 'default-server' directives support 'ca-file', 'crl-file' and
'crt' settings.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
67e0e61316 MINOR: server: Make 'default-server' support 'track' setting.
Before this patch only 'server' directives could support 'track' setting.
This patch makes 'default-server' directives also support this setting.
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
65aa356c0b MINOR: server: Make 'default-server' support 'check' keyword.
Before this patch 'check' setting was only supported by 'server' directives.
This patch makes also 'default-server' directives support this setting.
A new 'no-check' keyword parser has been implemented to disable this setting both
in 'default-server' and 'server' directives.
Should not break anything.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
273f321404 MINOR: server: Make 'default-server' support 'verifyhost' setting.
This patch makes 'default-server' directive support 'verifyhost' setting.
Note: there was a little memory leak when several 'verifyhost' arguments were
supplied on the same 'server' line.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
7c8cd587c2 MINOR: server: Make 'default-server' support 'verify' keyword.
This patch makes 'default-server' directive support 'verify' keyword.
2017-03-27 14:37:01 +02:00
Frédéric Lécaille
31045e4c10 MINOR: server: Make 'default-server' support 'send-proxy' and 'send-proxy-v2 keywords.
This patch makes 'default-server' directive support 'send-proxy'
(resp. 'send-proxy-v2') setting.
A new keyword 'no-send-proxy' (resp. 'no-send-proxy-v2') has been added
to disable 'send-proxy' (resp. 'send-proxy-v2') setting both in 'server' and
'default-server' directives.
2017-03-27 14:36:12 +02:00
Frédéric Lécaille
f9bc1d6a13 MINOR: server: Make 'default-server' support 'non-stick' keyword.
This patch makes 'default-server' directive support 'non-stick' setting.
A new keyword 'stick' has been added so that to disable
'non-stick' setting both in 'server' and 'default-server' directives.
2017-03-27 14:36:11 +02:00
Frédéric Lécaille
1502cfd1a3 CLEANUP: server: code alignement.
Code alignement.
2017-03-27 14:36:11 +02:00
Frédéric Lécaille
25df89066b MINOR: server: Make 'default-server' support 'check-send-proxy' keyword.
This patch makes 'default-server' directive support 'check-send-proxy' setting.
A new keyword 'no-check-send-proxy' has been added so that to disable
'check-send-proxy' setting both in 'server' and 'default-server' directives.
2017-03-27 14:36:11 +02:00
Frédéric Lécaille
f5bf903be6 MINOR: server: Make 'default-server' support 'backup' keyword.
At this time, only 'server' supported 'backup' keyword.
This patch makes also 'default-server' directive support this keyword.
A new keyword 'no-backup' has been added so that to disable 'backup' setting
both in 'server' and 'default-server' directives.

For instance, provided the following sequence of directives:

default-server backup
server srv1
server srv2 no-backup

default-server no-backup
server srv3
server srv4 backup

srv1 and srv4 are declared as backup servers,
srv2 and srv3 are declared as non-backup servers.
2017-03-27 14:36:11 +02:00
Frédéric Lécaille
8065b6d4f2 MINOR: server: irrelevant error message with 'default-server' config file keyword.
There is no reason to emit such an error message:
"'default-server' expects <name> and <addr>[:<port>] as arguments."
if less than two arguments are provided on 'default-server' lines.
This is a 'server' specific error message.
2017-03-27 14:33:58 +02:00
Olivier Houchard
2cb49ebbc4 BUG/MEDIUM server: Fix crash when dynamic is defined, but not key is provided.
Wait until we're sure we have a key before trying to calculate its length.

[wt: no backport needed, was just merged]
2017-03-15 16:01:33 +01:00
Olivier Houchard
4e694049fa MINOR: server: Add dynamic session cookies.
This adds a new "dynamic" keyword for the cookie option. If set, a cookie
will be generated for each server (assuming one isn't already provided on
the "server" line), from the IP of the server, the TCP port, and a secret
key provided. To provide the secret key, a new keyword as been added,
"dynamic-cookie-key", for backends.

Example :
backend bk_web
  balance roundrobin
  dynamic-cookie-key "bla"
  cookie WEBSRV insert dynamic
  server s1 127.0.0.1:80 check
  server s2 192.168.56.1:80 check

This is a first step to be able to dynamically add and remove servers,
without modifying the configuration file, and still have all the load
balancers redirect the traffic to the right server.

Provide a way to generate session cookies, based on the IP address of the
server, the TCP port, and a secret key provided.
2017-03-15 11:37:30 +01:00
Misiek
2da082d732 MINOR: cli: Add possiblity to change agent config via CLI/socket
This change adds possibility to change agent-addr and agent-send directives
by CLI/socket. Now you can replace server's and their configuration without
reloading/restarting whole haproxy, so it's a step in no-reload/no-restart
direction.

Depends on #e9602af - agent-addr is implemented there.

Can be backported to 1.7.
2017-01-16 11:38:59 +01:00
Misiek
ea849333ca MINOR: checks: Add agent-addr config directive
This directive add possibility to set different address for agent-checks.
With this you can manage server status and weight from central place.

Can be backported to 1.7.
2017-01-16 11:38:02 +01:00
Ryabin Sergey
77ee7526de BUG/MINOR: Reset errno variable before calling strtol(3)
Sometimes errno != 0 before calling strtol(3)

[wt: this needs to be backported to 1.7]
2017-01-11 21:30:07 +01:00
Willy Tarreau
9698f4b295 MEDIUM: server: disable protocol validations when the server doesn't resolve
When a server doesn't resolve we don't know the address family so we
can't perform the basic protocol validations. However we know that we'll
ultimately resolve to AF_INET4 or AF_INET6 so the controls are OK. It is
important to proceed like this otherwise it will not be possible to start
with unresolved addresses.
2017-01-06 19:29:34 +01:00
Willy Tarreau
6ecb10aec7 MINOR: server: take the destination port from the port field, not the addr
Next patch will cause the port to disappear from the address field when servers
do not resolve so we need to take it from the separate field provided by
str2sa_range().
2017-01-06 19:29:34 +01:00
Willy Tarreau
48ef4c95b6 MINOR: tools: make str2sa_range() return the port in a separate argument
This will be needed so that we're don't have to extract it from the
returned address where it will not always be anymore (eg: for unresolved
servers).
2017-01-06 19:29:34 +01:00
Willy Tarreau
04276f3d6e MEDIUM: server: split the address and the port into two different fields
Keeping the address and the port in the same field causes a lot of problems,
specifically on the DNS part where we're forced to cheat on the family to be
able to keep the port. This causes some issues such as some families not being
resolvable anymore.

This patch first moves the service port to a new field "svc_port" so that the
port field is never used anymore in the "addr" field (struct sockaddr_storage).
All call places were adapted (there aren't that many).
2017-01-06 19:29:33 +01:00
Willy Tarreau
3acfcd1aa1 BUG/MEDIUM: server: consider AF_UNSPEC as a valid address family
The DNS code is written so as to support AF_UNSPEC to decide on the
server family based on responses, but unfortunately snr_resolution_cb()
considers it as invalid causing a DNS storm to happen when a server
arrives with this family.

This situation is not supposed to happen as long as unresolved addresses
are forced to AF_INET, but this will change with the upcoming fixes and
it's possible that it's not granted already when changing an address on
the CLI.

This fix must be backported to 1.7 and 1.6.
2017-01-06 19:21:37 +01:00