Commit Graph

787 Commits

Author SHA1 Message Date
Christopher Faulet
95a61e8a0e MINOR: stream: Add pointer to front/back conn-streams into stream struct
frontend and backend conn-streams are now directly accesible from the
stream. This way, and with some other changes, it will be possible to remove
the stream-interfaces from the stream structure.
2022-02-24 11:00:02 +01:00
Christopher Faulet
13a35e5752 MAJOR: conn_stream/stream-int: move the appctx to the conn-stream
Thanks to previous changes, it is now possible to set an appctx as endpoint
for a conn-stream. This means the appctx is no longer linked to the
stream-interface but to the conn-stream. Thus, a pointer to the conn-stream
is explicitly stored in the stream-interface. The endpoint (connection or
appctx) can be retrieved via the conn-stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
e00ad358c9 MEDIUM: stream: No longer release backend conn-stream on connection retry
The backend conn-stream is no longer released on connection retry. This
means the conn-stream is detached from the underlying connection but not
released. Thus, during connection retries, the stream has always an
allocated conn-stream with no connection. All previous changes were made to
make this possible.

Note that .attach() mux callback function was changed to get the conn-stream
as argument. The muxes are no longer responsible to create the conn-stream
when a server connection is attached to a stream.
2022-02-24 11:00:02 +01:00
Christopher Faulet
0256da14a5 MINOR: connection: Be prepared to handle conn-stream with no connection
The conn-stream will progressively replace the stream-interface. Thus, a
stream will have to allocate the backend conn-stream during its
creation. This means it will be possible to have a conn-stream with no
connection. To prepare this change, we test the conn-stream's connection
when we retrieve it.
2022-02-24 11:00:01 +01:00
Willy Tarreau
88bc800eae BUILD: tree-wide: avoid warnings caused by redundant checks of obj_types
At many places we use construct such as:

   if (objt_server(blah))
       do_something(objt_server(blah));

At -O2 the compiler manages to simplify the operation and see that the
second one returns the same result as the first one. But at -O1 that's
not always the case, and the compiler is able to emit a second
expression and sees the potential null that results from it, and may
warn about a potential null deref (e.g. with gcc-6.5). There are two
solutions to this:
  - either the result of the first test has to be passed to a local
    variable
  - or the second reference ought to be unchecked using the __objt_*
    variant.

This patch fixes all occurrences at once by taking the second approach
(the least intrusive). For constructs like:

   objt_server(blah) ? objt_server(blah)->name : "no name"

a macro could be useful. It would for example take the object type
(server), the field name (name) and the default value. But there
are probably not enough occurrences across the whole code for this
to really matter.

This should be backported wherever it applies.
2021-12-06 09:11:47 +01:00
Christopher Faulet
34a3eb4c42 MINOR: backend: Get client dst address to set the server's one only if needful
In alloc_dst_address(), the client destination address must only be
retrieved when we are sure to use it. Most of time, this save a syscall to
getsockname(). It is not a bugfix in itself. But it revealed a bug in the
QUIC part. The CO_FL_ADDR_TO_SET flag is not set when the destination
address is create for anew quic client connection.
2021-11-05 15:25:34 +01:00
Amaury Denoyelle
9c3251d108 MEDIUM: server/backend: implement websocket protocol selection
Handle properly websocket streams if the server uses an ALPN with both
h1 and h2. Add a new field h2_ws in the server structure. If set to off,
reuse is automatically disable on backend and ALPN is forced to http1.x
if possible. Nothing is done if on.

Implement a mechanism to be able to use a different http version for
websocket streams. A new server member <ws> represents the algorithm to
select the protocol. This can overrides the server <proto>
configuration. If the connection uses ALPN for proto selection, it is
updated for websocket streams to select the right protocol.

Three mode of selection are implemented :
- auto : use the same protocol between non-ws and ws streams. If ALPN is
  use, try to update it to "http/1.1"; this is only done if the server
  ALPN contains "http/1.1".
- h1 : use http/1.1
- h2 : use http/2.0; this requires the server to support RFC8441 or an
  error will be returned by haproxy.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
ac03ef26e8 MINOR: connection: add alternative mux_ops param for conn_install_mux_be
Add a new parameter force_mux_ops. This will be useful to specify an
alternative to the srv->mux_proto field. If non-NULL, it will be use to
force the mux protocol wether srv->mux_proto is set or not.

This argument will become useful to install a mux for non-standard
streams, most notably websocket streams.
2021-11-03 16:24:48 +01:00
Willy Tarreau
14e7f29e86 MINOR: protocols: replace protocol_by_family() with protocol_lookup()
At a few places we were still using protocol_by_family() instead of
the richer protocol_lookup(). The former is limited as it enforces
SOCK_STREAM and a stream protocol at the control layer. At least with
protocol_lookup() we don't have this limitationn. The values were still
set for now but later we can imagine making them configurable on the
fly.
2021-10-27 17:41:07 +02:00
Christopher Faulet
16f16afb31 MINOR: stream: Use backend stream-interface dst address instead of target_addr
target_addr field in the stream structure is removed. The backend
stream-interface destination address is now used.
2021-10-27 11:35:59 +02:00
Christopher Faulet
a8e95fed43 MEDIUM: backend: Rely on addresses at stream level to init server connection
Client source and destination addresses at stream level are used to initiate
the connections to a server. For now, stream-interface addresses are never
set. So, thanks to the fallback mechanism, no changes are expected with this
patch. But its purpose is to rely on addresses at the appropriate level when
set instead of those at the connection level.
2021-10-27 11:35:59 +02:00
Amaury Denoyelle
926712ab2d MINOR: backend: improve perf with tcp proxies skipping idle conns
Skip the hash connection calcul when reuse must not be used in
connect_server() : this is the case for TCP proxies. This should result
in slightly better performance when using this use-case.
2021-10-22 17:28:29 +02:00
Amaury Denoyelle
aee4fdbd17 BUG/MINOR: backend: fix improper insert in avail tree for always reuse
In connect_server(), if http-reuse always is set, the backend connection
is inserted into the available tree as soon as created. However, the
hash connection field is only set later at the end of the function.

This seems to have no impact as the hash connection field is always
position before a lookup. However, this is not a proper usage of ebmb
API. Fix this by setting the hash connection field before the insertion
into the avail tree.

This must be backported up to 2.4.
2021-10-22 17:26:22 +02:00
Amaury Denoyelle
1252b6f951 MINOR: backend: add traces for idle connections reuse
Add traces in connect_server() to debug idle connection reuse. These
are attached to stream trace module, as it's already in use in
backend.c with the macro TRACE_SOURCE.
2021-10-22 17:21:14 +02:00
Christopher Faulet
37a9e21a3a MINOR: sample/arg: Be able to resolve args found in defaults sections
It is not yet used but thanks to this patch, it will be possible to resolve
arguments found in defaults sections. However, there is some restrictions:

  * For FE (frontend) or BE (backend) arguments, if the proxy is explicity
    defined, there is no change. But for implicit proxy (not specified), the
    argument points on the default proxy. when a sample fetch using this
    kind of argument is evaluated, the default proxy replaced by the current
    one.

  * For SRV (server) and TAB (stick-table)arguments, the proxy must always
    be specified. Otherwise an error is reported.

This patch is mandatory to support TCP/HTTP rules in defaults sections.
2021-10-15 14:12:19 +02:00
Willy Tarreau
5d9ddc5442 BUILD: tree-wide: add several missing activity.h
A number of files currently access activity counters but rely on their
definitions to be inherited from other files (task.c, backend.c hlua.c,
sock.c, pool.c, stats.c, fd.c).
2021-10-07 01:36:51 +02:00
Willy Tarreau
63617dbec6 BUILD: idleconns: include missing ebmbtree.h at several places
backend.c, all muxes, backend.c started manipulating ebmb_nodes with
the introduction of idle conns but the types were inherited through
other includes. Let's add ebmbtree.h there.
2021-10-07 01:36:51 +02:00
Willy Tarreau
b131049eb5 BUILD: ssl: fix two remaining occurrences of #if USE_OPENSSL
One was in backend.c and the other one in hlua.c. No other candidate
was found with "git grep '^#if\s*USE'". It's worth noting that 3
other such tests exist for SSL_OP_NO_{SSLv3,TLSv1_1,TLSv1_2} but
that these ones are properly set to 0 in openssl-compat.h when not
defined.
2021-08-30 09:39:24 +02:00
Willy Tarreau
252412316e MEDIUM: proxy: remove long-broken 'option http_proxy'
This option had always been broken in HTX, which means that the first
breakage appeared in 1.9, that it was broken by default in 2.0 and that
no workaround existed starting with 2.1. The way this option works is
praticularly unfit to the rest of the configuration and to the internal
architecture. It had some uses when it was introduced 14 years ago but
nowadays it's possible to do much better and more reliable using a
set of "http-request set-dst" and "http-request set-uri" rules, which
additionally are compatible with DNS resolution (via do-resolve) and
are not exclusive to normal load balancing. The "option-http_proxy"
example config file was updated to reflect this.

The option is still parsed so that an error message gives hints about
what to look for.
2021-07-18 19:35:32 +02:00
Amaury Denoyelle
c453f9547e MINOR: http: use http uri parser for path
Replace http_get_path by the http_uri_parser API. The new functions is
renamed http_parse_path. Replace duplicated code for scheme and
authority parsing by invocations to http_parse_scheme/authority.

If no scheme is found for an URI detected as an absolute-uri/authority,
consider it to be an authority format : no path will be found. For an
absolute-uri or absolute-path, use the remaining of the string as the
path. A new http_uri_parser state is declared to mark the path parsing
as done.
2021-07-08 17:11:17 +02:00
Willy Tarreau
9ab78293bf MEDIUM: queue: simplify again the process_srv_queue() API (v2)
This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
2021-06-24 10:52:31 +02:00
Willy Tarreau
ccd85a3e08 Revert "MEDIUM: queue: simplify again the process_srv_queue() API"
This reverts commit c83e45e9b0.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:22:18 +02:00
Willy Tarreau
5ffb045ed1 CLEANUP: backend: remove impossible case of round-robin + consistent hash
In 1.4, consistent hashing was brought by commit 6b2e11be1 ("[MEDIUM]
backend: implement consistent hashing variation") which took care of
replacing all direct calls to map_get_server_rr() with an alternate
call to chash_get_next_server() if consistent hash was being used.

One of them, however, cannot happen because a preliminary test for
static round-robin is being done prior to the call, so we're certain
that if it matches it cannot use a consistent hash tree.

Let's remove it.
2021-06-22 19:21:11 +02:00
Willy Tarreau
c83e45e9b0 MEDIUM: queue: simplify again the process_srv_queue() API
This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
2021-06-22 18:57:15 +02:00
Willy Tarreau
a05704582c MINOR: server: replace the pendconns-related stuff with a struct queue
Just like for proxies, all three elements (pendconns, nbpend, queue_idx)
were moved to struct queue.
2021-06-22 18:43:14 +02:00
Willy Tarreau
7f3c1df248 MINOR: proxy: replace the pendconns-related stuff with a struct queue
All three elements (pendconns, nbpend, queue_idx) were moved to struct
queue.
2021-06-22 18:43:14 +02:00
Willy Tarreau
5941ef0a6c MINOR: lb/api: remove the locked argument from take_conn/drop_conn
This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers
of take_conn/drop_conn tell if they have the lock") that was merged
during 2.4 before the various locks could be eliminated at the lower
layers. Passing that information complicates the cleanup of the queuing
code and it's become useless.
2021-06-22 18:43:12 +02:00
Amaury Denoyelle
0274286dd3 BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check
The server_parse_maxconn_change_request locks the server lock. However,
this function can be called via agent-checks or lua code which already
lock it. This bug has been introduced by the following commit :

  commit 79a88ba3d0
  BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'

This commit tried to fix another deadlock with can occur because
previoulsy server_parse_maxconn_change_request requires the server lock
to be held. However, it may call internally process_srv_queue which also
locks the server lock. The locking policy has thus been updated. The fix
is functional for the CLI 'set maxconn' but fails to address the
agent-check / lua counterparts.

This new issue is fixed in two steps :
- changes from the above commit have been reverted. This means that
  server_parse_maxconn_change_request must again be called with the
  server lock.

- to counter the deadlock fixed by the above commit, process_srv_queue
  now takes an argument to render the server locking optional if the
  caller already held it. This is only used by
  server_parse_maxconn_change_request.

The above commit was subject to backport up to 1.8. Thus this commit
must be backported in every release where it is already present.
2021-06-22 11:39:20 +02:00
Amaury Denoyelle
655dec81bd BUG/MINOR: backend: do not set sni on connection reuse
When reusing a backend connection, do not reapply the SNI on the
connection. It should already be defined when the connection was
instantiated on a previous connect_server invocation. As the SNI is a
parameter used to select a connection, only connection with same value
can be reused.

The impact of this bug is unknown and may be null. No memory leak has
been reported by valgrind. So this is more a cleaning fix.

This commit relies on the SF_SRV_REUSED flag and thus depends on the
following fix :
  BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose

This should be backported up to 2.4.
2021-06-17 18:01:57 +02:00
Amaury Denoyelle
2b1d91758d BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose
The SF_SRV_REUSED flag was set if a stream reused a backend connection.
One of its purpose is to count the total reuse on the backend in
opposition to newly instantiated connection.

However, the flag was diverted from its original purpose since the
following commit :

  e8f5f5d8b2
  BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.

With this change, the flag is not set anymore if the mux is not ready
when a connection is picked for reuse. This can happen for multiplexed
connections which are inserted in the available list as soon as created
in http-reuse always mode. The goal of this change is to not retry
immediately this request in case on an error on the same server if the
reused connection is not fully ready.

This change is justified for the retry timeout handling but it breaks
other places which still uses the flag for its original purpose. Mainly,
in this case the wrong 'connect' backend counter is incremented instead
of the 'reuse' one. The flag is also used in http_return_srv_error and
may have an impact if a http server error is replied for this stream.

To fix this problem, the original purpose of the flag is restored by
setting it unconditionaly when a connection is reused. Additionally, a
new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the
connection is reused but the mux is not ready yet. For the timeout
handling on error, the request is retried immediately only if the stream
reused a connection without this newly anticipated flag.

This must be backported up to 2.1.
2021-06-17 17:58:50 +02:00
Ilya Shipitsin
213bb99f9e CLEANUP: assorted typo fixes in the code and comments
This is 24th iteration of typo fixes
2021-06-17 09:02:16 +02:00
Willy Tarreau
f9a7c442f6 MINOR: backend: only skip LB when there are actual connections
In 2.3, a significant improvement was brought against situations where
the queue was heavily used, because some LB algos were still checked
for no reason before deciding to put the request into the queue. This
was commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend
is full").

As seen in previous commit ("BUG/MAJOR: queue: set SF_ASSIGNED when
setting strm->target on dequeue") the dequeuing code is extremely
tricky, and the optimization above tends to emphasize transient issues
by making them permanent until the next reload, which is not acceptable
as the code must always be robust against any bad situation.

This commit brings a protection against such a situation by slightly
relaxing the test. Instead of checking that there are pending connections
in the backend queue, it also verifies that the backend's connections are
not solely composed of queued connections, which would then indicate we
are in this situation. This is not rocket science, but at least if the
situation happens, we know that it will unlock by itself once the streams
have left, as new requests will be allowed to reach the servers and to
flush the queue again.

This needs to be backported to 2.4 and 2.3.
2021-06-16 09:05:35 +02:00
Christopher Faulet
e9106d69cb MINOR: backend: Don't release SI endpoint anymore in connect_server()
Thanks to the previous patch (822decfd "BUG/MAJOR: stream-int: Release SI
endpoint on server side ASAP on retry"), it is now useless to release any
existing connection in connect_server() because it was already done in
back_handle_st_cer() if necessary.

This patch is not a CLEANUP because it may introduce some bugs in edge
cases. There is no reason to backport it for now except if it is required to
fix a bug.
2021-06-01 15:54:50 +02:00
Christopher Faulet
f822decfda BUG/MAJOR: stream-int: Release SI endpoint on server side ASAP on retry
When a connection attempt failed, if a retry is possible, the SI endpoint on
the server side is immediately released, instead of waiting to establish a
new connection to a server. Thus, when the backend SI is switched from
SI_ST_CER state to SI_ST_REQ, SI_ST_ASS or SI_ST_TAR, its endpoint is
released. It is expected because the SI is moved to a state prior to the
connection stage ( < SI_ST_CONN). So it seems logical to not have any server
connection.

It is especially important if the retry is delayed (SI_ST_TAR or
SI_ST_QUE). Because, if the server connection is preserved, any error at the
connection level is unexpectedly relayed to the stream, via the
stream-interface, leading to an infinite loop in process_stream(). if
SI_FL_ERR flag is set on the backend SI in another state than SI_ST_CLO, an
internal goto is performed to resync the stream-interfaces. In addtition,
some ressources are not released ASAP.

This bug is quite old and was reported 1 or 2 times per years since the 2.2
(at least) with not enough information to catch it. It must be backported as
far as 2.2 with a special care because this part has moved several times and
after some observation period and feedback from users to be sure. For info,
in 2.0 and prior, the connection is released when an error is encountered in
SI_ST_CON or SI_ST_RDY states.
2021-06-01 15:53:54 +02:00
Willy Tarreau
2b71810cb3 CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion
The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").

Let's use more explicit (but slightly longer) names now:

   LIST_ADD        ->       LIST_INSERT
   LIST_ADDQ       ->       LIST_APPEND
   LIST_ADDED      ->       LIST_INLIST
   LIST_DEL        ->       LIST_DELETE

The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.

The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.

The list doc was updated.
2021-04-21 09:20:17 +02:00
Willy Tarreau
4781b1521a CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
2021-04-07 18:18:37 +02:00
Willy Tarreau
1db427399c CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or
Currently our atomic ops return a value but it's never known whether
the fetch is done before or after the operation, which causes some
confusion each time the value is desired. Let's create an explicit
variant of these operations suffixed with _FETCH to explicitly mention
that the fetch occurs after the operation, and make use of it at the
few call places.
2021-04-07 18:18:37 +02:00
Christopher Faulet
1bb6afa35d MINOR: stream: Use stream type instead of proxy mode when appropriate
We now use the stream instead of the proxy to know if we are processing HTTP
data or not. If the stream is an HTX stream, it means we are dealing with
HTTP data. It is more accurate than the proxy mode because when an HTTP
upgrade is performed, the proxy is not changed and only the stream may be
used.

Note that it was not a problem to rely on the proxy because HTTP upgrades
may only happen when an HTTP backend was set. But, we will add the support
of HTTP upgrades on the frontend side, after te tcp-request rules
evaluation.  In this context, we cannot rely on the proxy mode.
2021-04-01 11:06:48 +02:00
Willy Tarreau
9b9f8477f8 MEDIUM: backend: use a trylock to grab a connection on high FD counts as well
Commit b1adf03df ("MEDIUM: backend: use a trylock when trying to grab an
idle connection") solved a contention issue on the backend under normal
condition, but there is another one further, which only happens when the
number of FDs in use is considered too high, and which obviously causes
random crashes with just 16 threads once the number of FDs is about to be
exhausted.

Like the aforementioned patch, this one should be backported to 2.3.
2021-03-27 09:39:23 +01:00
Amaury Denoyelle
65bf600cc3 BUG/MEDIUM: release lock on idle conn killing on reached pool high count
Release the lock before calling mux destroy in connect_server when
trying to kill an idle connection because the pool high count has been
reached.

The lock must be released because the mux destroy will call
srv_release_conn which also takes the lock to remove the connection from
the tree. As the connection was already deleted from the tree at this
stage, it is safe to release the lock, and the removal in
srv_release_conn will be a noop.

It does not need to be backported because it is only present in the
current release. It has been introduced by
    5c7086f6b0
    MEDIUM: connection: protect idle conn lists with locks
2021-03-25 11:55:35 +01:00
Olivier Houchard
1b3c931bff MEDIUM: connections: Introduce a new XPRT method, start().
Introduce a new XPRT method, start(). The init() method will now only
initialize whatever is needed for the XPRT to run, but any action the XPRT
has to do before being ready, such as handshakes, will be done in the new
start() method. That way, we will be sure the full stack of xprt will be
initialized before attempting to do anything.
The init() call is also moved to conn_prepare(). There's no longer any reason
to wait for the ctrl to be ready, any action will be deferred until start(),
anyway. This means conn_xprt_init() is no longer needed.
2021-03-19 15:33:04 +01:00
Amaury Denoyelle
249f0562cf BUG/MINOR: backend: fix condition for reuse on mode HTTP
This commit is a fix/complement to the following one :
08d87b3f49
BUG/MEDIUM: backend: never reuse a connection for tcp mode

It fixes the check for the early insertion of backend connections in
the reuse lists if the backend mode is HTTP.

The impact of this bug seems limited because :
- in tcp mode, no insertion is done in the avail list as mux_pt does not
  support multiple streams.
- in http mode, muxes are also responsible to insert backend connections
  in lists in their detach functions. Prior to this fix the reuse rate
  could be slightly inferior.

It can be backported to 2.3.
2021-03-05 15:44:51 +01:00
Amaury Denoyelle
d7faa3d6e9 MINOR: backend: add a BUG_ON if conn mux NULL in connect_server
Currently, there seems to be no way to have the transport layer ready
but not the mux in the function connect_server. Add a BUG_ON to report
if this implicit condition is not true anymore.

This should fix coverity report from github issue #1120.
2021-03-05 15:27:41 +01:00
Willy Tarreau
430bf4a483 MINOR: server: allocate a per-thread struct for the per-thread connections stuff
There are multiple per-thread lists in the listeners, which isn't the
most efficient in terms of cache, and doesn't easily allow to store all
the per-thread stuff.

Now we introduce an srv_per_thread structure which the servers will have an
array of, and place the idle/safe/avail conns tree heads into. Overall this
was a fairly mechanical change, and the array is now always initialized for
all servers since we'll put more stuff there. It's worth noting that the Lua
code still has to deal with its own deinit by itself despite being in a
global list, because its server is not dynamically allocated.
2021-03-05 15:00:24 +01:00
Ubuntu
1adaddb494 OPTIM: lb-random: use a cheaper PRNG to pick a server
The PRNG used by the "random" LB algorithm was the central one which tries
hard to produce "correct" (i.e. hardly predictable) values suitable for use
in UUIDs or cookies. It's much too expensive for pure load balancing where
a cheaper thread-local PRNG is sufficient, and the current PRNG is part of
the hot places when running with many threads.

Let's switch to the stastistical PRNG instead, it's thread-local, very
fast, and with a period of (2^32)-1 which is more than enough to decide
on a server.
2021-03-05 08:30:08 +01:00
Ubuntu
b1adf03df9 MEDIUM: backend: use a trylock when trying to grab an idle connection
In conn_backend_get() we can cause some extreme contention due to the
idle_conns_lock. Indeed, even though it's per-thread, it still causes
high contention when running with many threads. The reason is that all
threads which do not have any idle connections are quickly skipped,
till the point where there are still some, so the first reaching that
point will grab the lock and the other ones wait behind. From this
point, all threads are synchronized waiting on the same lock, and
will follow the leader in small jumps, all hindering each other.

Here instead of doing this we're using a trylock. This way when a thread
is already checking a list, other ones will continue to next thread. In
the worst case, a high contention will lead to a few new connections to
be set up, but this may actually be what is required to avoid contention
in the first place. With this change, the contention has mostly
disappeared on this lock (it's still present in muxes and transport
layers due to the takeover).

Surprisingly, checking for emptiness of the tree root before taking
the lock didn't address any contention.

A few improvements are still possible and desirable here. The first
one would be to avoid seeing all threads jump to the next one. We
could have each thread use a different prime number as the increment
so as to spread them across the entire table instead of keeping them
synchronized. The second one is that the lock in the muck layers
shouldn't be needed to check for the tasklet's context availability.
2021-03-05 08:30:08 +01:00
Amaury Denoyelle
8ede3db080 MINOR: backend: handle reuse for conns with no server as target
If dispatch mode or transparent backend is used, the backend connection
target is a proxy instead of a server. In these cases, the reuse of
backend connections is not consistent.

With the default behavior, no reuse is done and every new request uses a
new connection. However, if http-reuse is set to never, the connection
are stored by the mux in the session and can be reused for future
requests in the same session.

As no server is used for these connections, no reuse can be made outside
of the session, similarly to http-reuse never mode. A different
http-reuse config value should not have an impact. To achieve this, mark
these connections as private to have a defined behavior.

For this feature to properly work, the connection hash has been slightly
adjusted. The server pointer as an input as been replaced by a generic
target pointer to refer to the server or proxy instance. The hash is
always calculated on connect_server even if the connection target is not
a server. This also requires to allocate the connection hash node for
every backend connections, not just the one with a server target.
2021-03-03 11:31:19 +01:00
Amaury Denoyelle
68967e595b BUG/MINOR: backend: free allocated bind_addr if reuse conn
Fix a leak in connect_server which happens when a connection is reused
and a bind_addr was allocated because transparent mode is active. The
connection has already an allocated bind_addr so free the newly
allocated one.

No backport needed.
2021-03-03 11:28:02 +01:00
Amaury Denoyelle
603657835f CLEANUP: backend: fix a wrong comment
missing 'not' when skipping reuse if proxy mode not HTTP
2021-03-03 11:28:02 +01:00
Tim Duesterhus
7b5777d9b4 CLEANUP: Use isttest(const struct ist) whenever possible
Refactoring performed with the following Coccinelle patch:

    @@
    struct ist i;
    @@

    - i.ptr != NULL
    + isttest(i)
2021-03-03 05:07:10 +01:00
Christopher Faulet
ae3056157c BUG/MINOR: connection: Use the client's dst family for adressless servers
When the selected server has no address, the destination address of the
client is used. However, for now, only the address is set, not the
family. Thus depending on how the server is configured and the client's
destination address, the server address family may be wrong.

For instance, with such server :

   server srv 0.0.0.0:0

The server address family is AF_INET. The server connection will fail if a
client is asking for an IPv6 destination.

To fix the bug, we take care to set the rigth family, the family of the
client destination address.

This patch should fix the issue #202. It must be backported to all stable
versions.
2021-03-01 11:34:00 +01:00
Amaury Denoyelle
8990b010a0 MINOR: connection: allocate dynamically hash node for backend conns
Remove ebmb_node entry from struct connection and create a dedicated
struct conn_hash_node. struct connection contains now only a pointer to
a conn_hash_node, allocated only for connections where target is of type
OBJ_TYPE_SERVER. This will reduce memory footprints for every
connections that does not need http-reuse such as frontend connections.
2021-02-19 16:59:18 +01:00
Willy Tarreau
59b0fecfd9 MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock
The two algos defining these functions (first and leastconn) do not need the
server's lock. However it's already present in pendconn_process_next_strm()
so the API must be updated so that the functions may take it if needed and
that the callers indicate whether they already own it.

As such, the call places (backend.c and stream.c) now do not take it
anymore, queue.c was unchanged since it's already held, and both "first"
and "leastconn" were updated to take it if not already held.

A quick test on the "first" algo showed a jump from 432 to 565k rps by
just dropping the lock in stream.c!
2021-02-18 10:06:45 +01:00
Amaury Denoyelle
36441f46c4 MINOR: connection: remove pointers for prehash in conn_hash_params
Replace unneeded pointers for sni/proxy prehash by plain data type.
The code is slightly cleaner.
2021-02-17 16:43:07 +01:00
Amaury Denoyelle
4c09800b76 BUG/MINOR: backend: do not call smp_make_safe for sni conn hash
conn_hash_prehash does not need a nul-terminated string, thus it is only
needed to test if the sni sample is not null before using it as
connection hash input.

Moreover, a bug could be introduced between smp_make_safe and
ssl_sock_set_servername call. Indeed, smp_make_safe may call smp_dup
which duplicates the sample in the trash buffer. If another function
manipulates the trash buffer before the call to ssl_sock_set_servername,
the sni sample might be erased. Currently, no function seems to do that
except make_proxy_line in case proxy protocol is used simultaneously
with the sni on the server.

This does not need to be backported.
2021-02-17 16:38:20 +01:00
Amaury Denoyelle
edadf192fe BUG/MINOR: backend: fix compilation without ssl
sni_smp/sni_hash are reported as unused on compilation without
USE_OPENSL and may cause compilation failure

This does not need to be backported.
2021-02-12 13:49:42 +01:00
Amaury Denoyelle
1921d20fff MINOR: connection: use proxy protocol as parameter for srv conn hash
Use the proxy protocol frame if proxy protocol is activated on the
server line. Do not add anymore these connections in the private list.
If some requests are made with the same proxy fields, they can reuse
the idle connection.

The reg-tests proxy_protocol_send_unique_id must be adapted has it
relied on the side effect behavior that every requests from a same
connection reused a private server connection. Now, a new connection is
created as expected if the proxy protocol fields differ.
2021-02-12 12:54:04 +01:00
Amaury Denoyelle
d10a200f62 MINOR: connection: use src addr as parameter for srv conn hash
The source address is used as an input to the the server connection hash. The
address and port are used as separate hash inputs. Do not add anymore these
connections in the private list.

This parameter is set only if used in the transparent-proxy mode.
2021-02-12 12:54:04 +01:00
Amaury Denoyelle
f7bdf00071 MINOR: backend: rewrite alloc of connection src address
This commit is similar to
"MINOR: backend: rewrite alloc of stream target address" but with source
address.
2021-02-12 12:54:04 +01:00
Amaury Denoyelle
01a287f1e5 MINOR: connection: use dst addr as parameter for srv conn hash
The destination address is used as an input to the server connection hash. The
address and port are used as separated hash inputs. Note that they are not used
when statically specified on the server line. This is only useful for dynamic
destination address.

This is typically used when the server address is dynamically set via the
set-dst action. The address and port are separated hash parameters.

Most notably, it should fixed set-dst use case (cf github issue #947).
2021-02-12 12:53:56 +01:00
Amaury Denoyelle
68cf3959b3 MINOR: backend: rewrite alloc of stream target address
Change the API of the function used to allocate the stream target
address. This is done in order to be able to allocate the destination
address and use it to reuse a connection sharing with the same address.
In particular, the flag stream SF_ADDR_SET is now set outside of the
function.
2021-02-12 12:53:56 +01:00
Amaury Denoyelle
9b626e3c19 MINOR: connection: use sni as parameter for srv conn hash
The sni parameter is an input to the server connection hash. Do not add
anymore connections with dynamic sni in the private list. Thus, it is
now possible to reuse a server connection if they use the same sni.
2021-02-12 12:48:11 +01:00
Amaury Denoyelle
293dcc400e MINOR: backend: compare conn hash for session conn reuse
Compare the connection hash when reusing a connection from the session.
This ensures that a private connection is reused only if it shares the
same set of parameters.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
1a58aca84e MINOR: connection: use the srv pointer for the srv conn hash
The pointer of the target server is used as a first parameter for the
server connection hash calcul. This prevents the hash to be null when no
specific parameters are present, and can serve as a simple defense
against an attacker trying to reuse a non-conform connection.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
81c6f76d3e MINOR: connection: prepare hash calcul for server conns
This is a preliminary work for the calcul of the backend connection
hash. A structure conn_hash_params is the input for the operation,
containing the various specific parameters of a connection.

The high bits of the hash will reflect the parameters present as input.
A set of macros is written to manipulate the connection hash and extract
the parameters/payload.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
aa890aef3d MINOR: backend: search conn in idle tree after safe on always reuse
With http-reuse always, if no matching safe connection is found, check
in idle tree for a matching one. This is needed because now idle
connections can be differentiated from each other.

If only the safe tree was checked because not empty, but did not contain
a matching connection, we could miss matching entry in idle tree.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
1399d695c0 MINOR: backend: search conn in idle/safe trees after available
If no matching connection is found on available, check on idle/safe
trees for a matching one. This is needed because now idle connections
can be differentiated from each other.

If only the available list was checked because not empty, but did not
contain a matching connection, we could miss matching entries in idle or
safe trees.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
f232cb3e9b MEDIUM: connection: replace idle conn lists by eb trees
The server idle/safe/available connection lists are replaced with ebmb-
trees. This is used to store backend connections, with the new field
connection hash as the key. The hash is a 8-bytes size field, used to
reflect specific connection parameters.

This is a preliminary work to be able to reuse connection with SNI,
explicit src/dst address or PROXY protocol.
2021-02-12 12:33:05 +01:00
Amaury Denoyelle
5c7086f6b0 MEDIUM: connection: protect idle conn lists with locks
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.

Protect every access to idle conn lists with a lock. This is currently
strictly not needed because the access to the list are made with atomic
operations. However, to be able to reuse connection with specific
parameters, the list storage will be converted to eb-trees. As this
structure does not have atomic operation, it is mandatory to protect it
with a lock.

For this, the takeover lock is reused. Its role was to protect during
connection takeover. As it is now extended to general idle conns usage,
it is renamed to idle_conns_lock. A new lock section is also
instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.
2021-02-12 12:33:04 +01:00
Amaury Denoyelle
a3bf62ec54 BUG/MINOR: backend: hold correctly lock when killing idle conn
The wrong lock seems to be held when trying to remove another thread
connection if max fd limit has been reached (locking the current thread
instead of the target thread lock).

This could be backported up to 2.0.
2021-02-12 12:32:31 +01:00
Amaury Denoyelle
a81bb7197e BUG/MINOR: backend: check available list allocation for reuse
Do not consider reuse connection if available list is not allocated for
the target server. This will prevent a crash when using a standalone
server for an external purpose like socket_tcp/socket_ssl on hlua code.
For the idle/safe lists, they are considered allocated if
srv.max_idle_conns is not null.

Note that the hlua code is currently safe thanks to the additional
checks on proxy http mode and stream reuse policy not never. However,
this might not be sufficient for future code.

This patch should be backported in every branches containing the
following patch :
  7f68d815af (2.4 tree)
  REORG: backend: simplify conn_backend_get
2021-01-28 18:12:07 +01:00
Amaury Denoyelle
08d87b3f49 BUG/MEDIUM: backend: never reuse a connection for tcp mode
The reuse of idle connections should only happen for a proxy with the
http mode. In case of a backend with the tcp mode, the reuse selection
and insertion in session list are skipped.

This behavior is present since commit :
MEDIUM: connection: Add private connections synchronously in session server list
It could also be further exagerated by :
MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking

It can be backported up to 2.3.
2021-01-28 14:18:33 +01:00
Amaury Denoyelle
7f68d815af REORG: backend: simplify conn_backend_get
Reorganize the conditions for the reuse of idle/safe connections :
- reduce code by using variable to store reuse mode and idle/safe conns
  counts
- consider that idle/safe/avail lists are properly allocated if
  max_idle_conns not null. An allocation failure prevents haproxy
  startup.
2021-01-26 14:48:39 +01:00
Amaury Denoyelle
37e25bcd1e CLEANUP: backend: remove an obsolete comment on conn_backend_get
This comment was valid for haproxy 1.8 but now it is obsolete.
2021-01-26 14:48:39 +01:00
Thayne McCombs
8f0cc5c4ba CLEANUP: Fix spelling errors in comments
This is from the output of codespell. It's done at once over a bunch
of files and only affects comments, so there is nothing user-visible.
No backport needed.
2021-01-08 14:56:32 +01:00
Tim Duesterhus
e5ff14100a CLEANUP: Compare the return value of XXXcmp() functions with zero
According to coding-style.txt it is recommended to use:

`strcmp(a, b) == 0` instead of `!strcmp(a, b)`

So let's do this.

The change was performed by running the following (very long) coccinelle patch
on src/:

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
      )
    (
      S
    |
      { ... }
    )

    @@
    statement S;
    expression E;
    expression F;
    @@

      if (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
      )
    (
      S
    |
      { ... }
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) != 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G &&
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    G ||
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    && G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    || G
    )

    @@
    expression E;
    expression F;
    expression G;
    @@

    (
    - !
    (
    dns_hostname_cmp
    |
    eb_memcmp
    |
    memcmp
    |
    strcasecmp
    |
    strcmp
    |
    strncasecmp
    |
    strncmp
    )
    -  (E, F)
    +  (E, F) == 0
    )
2021-01-04 10:09:02 +01:00
Amaury Denoyelle
d91d779618 MINOR: backend: add timeout sample fetches
Add be_server_timeout and be_tunnel_timeout.

These sample fetches return the configuration value for server or tunnel
timeout on the backend side.
2020-12-11 12:01:07 +01:00
Willy Tarreau
38b4d2eb22 CLEANUP: connection: do not use conn->owner when the session is known
At a few places we used to rely on conn->owner to retrieve the session
while the session is already known. This is not correct because at some
of these points the reason the connection's owner was still the session
(instead of NULL) is a mistake. At one place a comparison is even made
between the session and conn->owner assuming it's valid without checking
if it's NULL. Let's clean this up to use the session all the time.

Note that this will be needed for a forthcoming fix and will have to be
backported.
2020-11-21 15:29:22 +01:00
Willy Tarreau
8ae8c48eb0 MEDIUM: fwlc: re-enable per-server queuing up to maxqueue
Leastconn has the nice propery of being able to sort servers by their
current usage. It's really a shame to force all requests into the backend
queue when the algo would be able to also consider their current queue.

In order not to change existing behavior but extend it, this patch allows
leastconn to elect servers which are already full if they have an explicitly
configured maxqueue setting above zero and their queue hasn't reached that
threshold. This will significantly reduce the pressure in the backend queue
when queuing a lot with lots of servers.

A test on 8 threads with 100 servers configured with maxconn 1 jumped
from 165krps to 330krps with maxqueue 15 with this patch.

This partially undoes commit 82cd5c13a ("OPTIM: backend: skip LB when we
know the backend is full") but allows to scale much better even by setting
a single-digit maxqueue value. Some better heuristics could be used to
maintain the behavior of the bypass in the patch above, consisting in
keeping it if it's known that there is no server with a configured
maxqueue in the farm (or in the backend).
2020-10-22 18:30:25 +02:00
Christopher Faulet
26a52af642 BUG/MEDIUM: lb: Always lock the server when calling server_{take,drop}_conn
The server lock must be held when server_take_conn() and server_drop_conn()
lbprm callback functions are called. It is a documented prerequisite but it is
not always performed. It only affects leastconn and fas lb algorithm. Others
don't use these callback functions.

A race condition on the next pending effecive weight (next_eweight) may be
encountered with the leastconn lb algorithm. An agent check may set it to 0
while fwlc_srv_reposition() is called. The server is locked during the
next_eweight update. But because the server lock is not acquired when
fwlc_srv_reposition() is called, we may use it to recompute the server key,
leading to a division by 0.

This patch must be backported as far as 1.8.
2020-10-17 09:29:43 +02:00
Amaury Denoyelle
7239c24986 MEDIUM: backend: reuse connection if using a static sni
Detect if the sni used a constant value and if so, allow to reuse this
connection for later sessions. Use a combination of SMP_USE_INTRN +
!SMP_F_VOLATILE to consider a sample as a constant value.

This features has been requested on github issue #371.
2020-10-16 17:48:01 +02:00
Willy Tarreau
9b7587a6af MINOR: connection: make sockaddr_alloc() take the address to be copied
Roughly half of the calls to sockadr_alloc() are made to copy an already
known address. Let's optionally pass it in argument so that the function
can handle the copy at the same time, this slightly simplifies its usage.
2020-10-15 21:47:56 +02:00
Amaury Denoyelle
0d21deaded MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking
If a connection is using a mux protocol subject to HOL blocking, add it
to the session instead of the available list to avoid sharing it with
other clients on connection reuse.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
00464ab8f4 MEDIUM: backend: add new conn to session if mux marked as HOL blocking
When allocating a new session on connect_server, if the mux protocol is
marked as subject of HOL blocking, add it into session instead of
available list to avoid sharing it with other clients.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
9c13b62b47 BUG/MEDIUM: connection: fix srv idle count on conn takeover
On server connection migration from one thread to another, the wrong
idle thread-specific counter is decremented. This bug was introduced
since commit 3d52f0f1f8 due to the
factorization with srv_use_idle_conn. However, this statement is only
executed from conn_backend_get. Extract the decrement from
srv_use_idle_conn in conn_backend_get and use the correct
thread-specific counter.

Rename the function to srv_use_conn to better reflect its purpose as it
is also used with a newly initialized connection not in the idle list.

As a side change, the connection insertion to available list has also
been extracted to conn_backend_get. This will be useful to be able to
specify an alternative list for protocol subject to HOL risk that should
not be shared between several clients.

This bug is only present in this release and thus do not need a backport.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
5f1ded5629 BUG/MINOR: connection: fix loop iter on connection takeover
The loop always missed one iteration due to the incrementation done on
the for check. Move the incrementation on the loop last statement to fix
this behaviour.

This bug has a very limited impact, not at all visible to the user, but
could be backported to 2.2.
2020-10-15 15:19:25 +02:00
Willy Tarreau
82cd5c13a5 OPTIM: backend: skip LB when we know the backend is full
For some algos (roundrobin, static-rr, leastconn, first) we know that
if there is any request queued in the backend, it's because a previous
attempt failed at finding a suitable server after trying all of them.
This alone is sufficient to decide that the next request will skip the
LB algo and directly reach the backend's queue. Doing this alone avoids
an O(N) lookup when load-balancing on a saturated farm of N servers,
which starts to be very expensive for hundreds of servers, especially
under the lbprm lock. This change alone has increased the request rate
from 110k to 148k RPS for 200 saturated servers on 8 threads, and
fwlc_reposition_srv() doesn't show up anymore in perf top. See github
issue #880 for more context.

It could have been the same for random, except that random is performed
using a consistent hash and it only considers a small set of servers (2
by default), so it may result in queueing at the backend despite having
some free slots on unknown servers. It's no big deal though since random()
only performs two attempts by default.

For hashing algorithms this is pointless since we don't queue at the
backend, except when there's no hash key found, which is the least of
our concerns here.
2020-09-29 17:18:37 +02:00
Willy Tarreau
b88ae18021 OPTIM: backend/random: never queue on the server, always on the backend
If random() returns a server whose maxconn is reached or the queue is
used, instead of adding the request to the server's queue, better add
it to the backend queue so that it can be served by any server (hence
the fastest one).
2020-09-29 17:18:11 +02:00
Willy Tarreau
57a374131c MINOR: backend: add a new "path-only" option to "balance uri"
Since we've fixed the way URIs are handled in 2.1, some users have started
to experience inconsistencies in "balance uri" between requests received
over H1 and the same ones received over H2. This is caused by the fact
that H1 rarely uses absolute URIs while H2 always uses them. Similar
issues were reported already around replace-uri etc, leading to "pathq"
recently being introduced, so this isn't new.

Here what this patch does is add a new option to "balance uri" to indicate
that the hashing should only start at the path and not cover the authority.
This makes H1 relative URIs and H2 absolute URI hashes equally again.

Some extra options could be added to normalize URIs by always hashing the
authority (or host) in front of them, which would make sure that both
absolute and relative requests provide the same hash. This is left for
later if needed.
2020-09-23 08:56:29 +02:00
Willy Tarreau
3d1119d225 MINOR: backend: make the "whole" option of balance uri take only one bit
We'll want to add other boolean options on "balance uri", so let's make
some room aside "whole" and make it take only one bit and not one int.
2020-09-23 08:05:47 +02:00
Ilya Shipitsin
6b79f38a7a CLEANUP: assorted typo fixes in the code and comments
This is 12th iteration of typo fixes
2020-07-31 11:18:07 +02:00
Willy Tarreau
a3b17563e1 BUG/MEDIUM: backend: always attach the transport before installing the mux
In connect_server(), we can enter in a stupid situation:

  - conn_install_mux_be() is called to install the mux. This one
    subscribes for receiving and quits ;

  - then we discover that a handshake is required on the connection
    (e.g. send-proxy), so xprt_add_hs() is called and subscribes as
    well.

  - we crash in conn_subscribe() which gets a different subscriber.
    And if BUG_ON is disabled, we'd likely lose one event.

Note that it doesn't seem to happen by default, but definitely does
if connect() rightfully performs fd_cant_recv(), so it's a matter of
who does what and in what order.

A simple reproducer consists in adding fd_cant_recv() after fd_cant_send()
in tcp_connect_server() and running it on this config, as discussed in issue

  listen foo
    bind :8181
    mode http
    server srv1 127.0.0.1:8888 send-proxy-v2

The root cause is that xprt_add_hs() installs an xprt layer underneath
the mux without taking over its subscriptions. Ideally if we want to
support this, we'd need to steal the connection's wait_events and
replace them by new ones. But there doesn't seem to be any case where
we're interested in doing this so better simply always install the
transport layer before installing the mux, that's safer and simpler.

This needs to be backported to 2.1 which is constructed the same way
and thus suffers from the same issue, though the code is slightly
different there.
2020-07-31 08:47:58 +02:00
Christopher Faulet
b4de420472 MINOR: connection: Preinstall the mux for non-ssl connect
In the connect_server() function, there is an optim to install the mux as soon
as possible. It is possible if we can determine the mux to use from the
configuration only. For instance if the mux is explicitly specified or if no ALPN
is set. This patch adds a new condition to preinstall the mux for non-ssl
connection. In this case, by default, we always use the mux_pt for raw
connections and the mux-h1 for HTTP ones.

This patch is related to the issue #762. It may be backported to 2.2 (and
possibly as far as 1.9 if necessary).
2020-07-30 09:31:09 +02:00
Christopher Faulet
3f5bcd0c96 BUG/MEDIUM: connection: Be sure to always install a mux for sync connect
Sometime, a server connection may be performed synchronously. Most of time on
TCP socket, it does not happen. It is easier to have sync connect with unix
socket. When it happens, if we are not waiting for any hanshake completion, we
must be sure to have a mux installed before leaving the connect_server()
function because an attempt to send may be done before the I/O connection
handler have a chance to be executed to install the mux, if not already done.

For now, It is not expected to perform a send with no mux installed, leading to
a crash if it happens.

This patch should fix the issue #762 and probably #779 too. It must be
backported as far as 1.9.
2020-07-30 09:31:09 +02:00
Willy Tarreau
dc2ac81c41 BUG/MINOR: backend: fix potential null deref on srv_conn
Commit 08016ab82 ("MEDIUM: connection: Add private connections
synchronously in session server list") introduced a build warning about
a potential null dereference which is actually true: in case a reuse
fails an we fail to allocate a new connection, we could crash. The
issue was already present earlier but the compiler couldn't detect
it since it was guarded by an independent condition.

This should be carefully backported to older versions (at least 2.2
and maybe 2.1), the change consists in only adding a test on srv_conn.

The whole sequence of "if" blocks is ugly there and would deserve being
cleaned up so that the !srv_conn condition is matched ASAP and the
assignment is done later. This would remove complicated conditions.
2020-07-15 17:46:32 +02:00
Christopher Faulet
1bea865811 MINOR: backend: Add sample fetches to get the server's weight
The following sample fetches have been added :

 * srv_iweight : returns the initial server's weight
 * srv_uweight : returns the user-visible server's weight
 * srv_weight  : returns the current (or effetctive) server's weight

The requested server must be passed as argument, evnetually preceded by the
backend name. For instance :

  srv_weight(back-http/www1)
2020-07-15 14:08:14 +02:00
Christopher Faulet
3d52f0f1f8 MINOR: server: Factorize code to deal with reuse of server idle connections
The srv_use_idle_conn() function is now responsible to update the server
counters and the connection flags when an idle connection is reused. The same
function is called when a new connection is created. This simplifies a bit the
connect_server() function.
2020-07-15 14:08:14 +02:00
Christopher Faulet
236c93b108 MINOR: connection: Set the conncetion target during its initialisation
When a new connection is created, its target is always set just after. So the
connection target may set when it is created instead, during its initialisation
to be precise. It is the purpose of this patch. Now, conn_new() function is
called with the connection target as parameter. The target is then passed to
conn_init(). It means the target must be passed when cs_new() is called. In this
case, the target is only used when the conn-stream is created with no
connection. This only happens for tcpchecks for now.
2020-07-15 14:08:14 +02:00
Christopher Faulet
fcc3d8a1c0 MINOR: connection: Use a dedicated function to look for a session's connection
The session_get_conn() must now be used to look for an available connection
matching a specific target for a given session. This simplifies a bit the
connect_server() function.
2020-07-15 14:08:14 +02:00
Christopher Faulet
08016ab82d MEDIUM: connection: Add private connections synchronously in session server list
When a connection is marked as private, it is now added in the session server
list. We don't wait a stream is detached from the mux to do so. When the
connection is created, this happens after the mux creation. Otherwise, it is
performed when the connection is marked as private.

To allow that, when a connection is created, the session is systematically set
as the connectin owner. Thus, a backend connection has always a owner during its
creation. And a private connection has always a owner until its death.

Note that outside the detach() callback, if the call to session_add_conn()
failed, the error is ignored. In this situation, we retry to add the connection
into the session server list in the detach() callback. If this fails at this
step, the multiplexer is destroyed and the connection is closed.
2020-07-15 14:08:14 +02:00
Christopher Faulet
21ddc74e8a MINOR: connection: Add a wrapper to mark a connection as private
To set a connection as private, the conn_set_private() function must now be
called. It sets the CO_FL_PRIVATE flags, but it also remove the connection from
the available connection list, if necessary. For now, it never happens because
only HTTP/1 connections may be set as private after their creation. And these
connections are never inserted in the available connection list.
2020-07-15 14:08:14 +02:00
Christopher Faulet
c64badd573 MINOR: connection: Set new connection as private on reuse never
When a new connection is created, it may immediatly be set as private if
http-reuse never is configured for the backend. There is no reason to wait the
call to mux->detach() to do so.
2020-07-15 14:08:14 +02:00
Christopher Faulet
27bd6ff96d MINOR: connection: Set the SNI on server connections before installing the mux
If an expression is configured to set the SNI on a server connection, the
connection is marked as private. To not needlessly add it in the available
connection list when the mux is installed, the SNI is now set on the connection
before installing the mux, just after the call to si_connect().
2020-07-15 14:08:14 +02:00
Willy Tarreau
a9d7b76f6a MINOR: connection: use MT_LIST_ADDQ() to add connections to idle lists
When a connection is added to an idle list, it's already detached and
cannot be seen by two threads at once, so there's no point using
TRY_ADDQ, there will never be any conflict. Let's just use the cheaper
ADDQ.
2020-07-10 08:52:13 +02:00
Willy Tarreau
de4db17dee MINOR: lists: rename some MT_LIST operations to clarify them
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.

But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:

   MT_LIST_ADD{,Q}    : inconditional operation, the caller owns the
                        element, and doesn't care about the element's
                        current state (exactly like LIST_ADD)
   MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
                        already added or in the process of being added.

This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.

Note that the missing unchecked MT_LIST_ADD macro was added.

The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
2020-07-10 08:50:41 +02:00
Christopher Faulet
aa27853ce2 BUG/MEDIUM: connection: Don't consider new private connections as available
When a connection is created and the multiplexer is installed, if the connection
is marked as private, don't consider it as available, regardless the number of
available streams. This test is performed when the mux is installed when the
connection is created, in connect_server(), and when the mux is installed after
the handshakes stage.

No backport needed, this is 2.2-dev.
2020-07-07 14:30:38 +02:00
Christopher Faulet
e91a526c8f BUG/MINOR: backend: Remove CO_FL_SESS_IDLE if a client remains on the last server
When a connection is picked from the session server list because the proxy or
the session are marked to use the last requested server, if it is idle, we must
marked it as used removing the CO_FL_SESS_IDLE flag and decrementing the session
idle_conns counter.

This patch must be backported as far as 1.9.
2020-07-07 14:30:26 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which would randomly reject ~1% of the incoming
connection and randomly killing some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s otherwise it's necessary to restart to
reset the idle connections lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.

This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
2020-07-01 19:07:37 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Willy Tarreau
364f25a688 MINOR: backend: don't always takeover from the same threads
The next thread walking algorithm in commit 566df309c ("MEDIUM:
connections: Attempt to get idle connections from other threads.")
proved to be sufficient for most cases, but it still has some rough
edges when threads are unevenly loaded. If one thread wakes up with
10 streams to process in a burst, it will mainly take over connections
from the next one until it doesn't have anymore.

This patch implements a rotating index that is stored into the server
list and that any thread taking over a connection is responsible for
updating. This way it starts mostly random and avoids always picking
from the same place. This results in a smoother distribution overall
and a slightly lower takeover rate.
2020-07-01 16:07:43 +02:00
Willy Tarreau
0d587116c2 BUG/MEDIUM: backend: always search in the safe list after failing on the idle one
There's a tricky behavior that was lost when the idle connections were
made sharable between thread in commit 566df309c ("MEDIUM: connections:
Attempt to get idle connections from other threads."), it is the ability
to retry from the safe list when looking for any type of idle connection
and not finding one in the idle list.

It is already important when dealing with long-lived connections since
they ultimately all become safe, but that case is already covered by
the fact that safe conns not being used end up closing and are not
looked up anymore since connect_server() sees there are none.

But it's even more important when using server-side connections which
periodically close, because the new connections may spend half of their
time in safe state and the other half in the idle state, and failing
to grab one such connection from the right list results in establishing
a new connection.

This patch makes sure that a failure to find an idle connection results
in a new attempt at finding one from the safe list if available. In order
to avoid locking twice, connections are attempted alternatively from the
idle then safe list when picking from siblings. Tests have shown a ~2%
performance increase by avoiding to lock twice.

A typical test with 10000 connections over 16 threads with 210 servers
having a 1 millisecond response time and closing every 5 requests shows
a degrading performance starting at 120k req/s down to 60-90k and an
average reuse rate of 44%. After the fix, the reuse rate raises to 79%
and the performance becomes stable at 254k req/s. Similarly the previous
test with full keep-alive has now increased from 96% reuse rate to 99%
and from 352k to 375k req/s.

No backport is needed as this is 2.2-only.
2020-07-01 15:49:21 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
2020-07-01 15:23:15 +02:00
Willy Tarreau
151c253a1e MINOR: server: skip servers with no idle conns earlier
In conn_backend_get() we can avoid locking other servers when trying
to steal their connections when we know for sure they will not have
one, so let's do it to lower the contention on the lock.
2020-07-01 10:33:39 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over previous purge period, or is a
    max of what is seen over current period; it differs from max_used_conns
    in that this one is a counter that's reset on each purge period ;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
c35bcfcc21 BUG/MINOR: server: start cleaning idle connections from various points
There's a minor glitch with the way idle connections start to be evicted.
The lookup always goes from thread 0 to thread N-1. This causes depletion
of connections on the first threads and abundance on the last ones. This
is visible with the takeover() stats below:

 $ socat - /tmp/sock1 <<< "show activity"|grep ^fd ; \
   sleep 10 ; \
   socat -/tmp/sock1 <<< "show activity"|grep ^fd
 fd_takeover: 300144 [ 91887 84029 66254 57974 ]
 fd_takeover: 359631 [ 111369 99699 79145 69418 ]

There are respectively 19k, 15k, 13k and 11k takeovers for only 4 threads,
indicating that the first thread needs a foreign FD twice more often than
the 4th one.

This patch changes this si that all threads are scanned in round robin
starting with the current one. The takeovers now happen in a much more
distributed way (about 4 times 9k) :

  fd_takeover: 1420081 [ 359562 359453 346586 354480 ]
  fd_takeover: 1457044 [ 368779 368429 355990 363846 ]

There is no need to backport this, as this happened along a few patches
that were merged during 2.2 development.
2020-06-29 14:43:16 +02:00
Willy Tarreau
b159132ea3 MINOR: activity: add per-thread statistics on FD takeover
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
2020-06-29 14:26:05 +02:00
Olivier Houchard
4ba494ca1b BUG/MEDIUM: connections: Don't increase curr_used_conns for shared connections.
In connect_server(), we want to increase curr_used_conns only if the
connection is new, or if it comes from an idle_pool, otherwise it means
the connection is already used by at least one another stream, and it is
already accounted for.
2020-06-28 16:16:13 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
b2551057af CLEANUP: include: tree-wide alphabetical sort of include files
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
2020-06-11 10:18:59 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
1e56f92693 REORG: include: move server.h to haproxy/server{,-t}.h
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a55c45470f REORG: include: move queue.h to haproxy/queue{,-t}.h
Nothing outstanding here. A number of call places were not justified and
removed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4980160ecc REORG: include: move backend.h to haproxy/backend{,-t}.h
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a264d960f6 REORG: include: move proxy.h to haproxy/proxy{,-t}.h
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
2020-06-11 10:18:58 +02:00
Willy Tarreau
aeed4a85d6 REORG: include: move log.h to haproxy/log{,-t}.h
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.

Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.

A good approach would consist in splitting that file in 3 parts:
  - one for error output ("errors" ?).
  - one for log_format processing
  - and one for actual logging.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c2b1ff04e5 REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f1d32c475c REORG: include: move channel.h to haproxy/channel{,-t}.h
The files were moved with no change. The callers were cleaned up a bit
and a few of them had channel.h removed since not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5e539c9b8d REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h
Almost no changes, removed stdlib and added buf-t and connection-t to
the types to avoid a warning.
2020-06-11 10:18:58 +02:00
Willy Tarreau
209108dbbd REORG: include: move ssl_sock.h to haproxy/ssl_sock{,-t}.h
Almost nothing changed, just moved a static inline at the end and moved
an export from the types to the main file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2867159d63 REORG: include: move lb_map.h to haproxy/lb_map{,-t}.h
Nothing was changed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dcc048a14a REORG: include: move acl.h to haproxy/acl.h{,-t}.h
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
48d25b3bc9 REORG: include: move session.h to haproxy/session{,-t}.h
Almost no change was needed beyond a little bit of reordering of the
types file and adjustments to use session-t instead of session at a
few places.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4aa573da6f REORG: include: move checks.h to haproxy/check{,-t}.h
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).

The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fc77454aff REORG: include: move proto_tcp.h to haproxy/proto_tcp.h
There was no type file. This one really is trivial. A few missing
includes were added to satisfy the exported functions prototypes.
2020-06-11 10:18:58 +02:00
Willy Tarreau
cea0e1bb19 REORG: include: move task.h to haproxy/task{,-t}.h
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f268ee8795 REORG: include: split global.h into haproxy/global{,-t}.h
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.

It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.

The UNIX_MAX_PATH definition was moved to compat.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
e6ce10be85 REORG: include: move sample.h to haproxy/sample{,-t}.h
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
469509b39e REORG: include: move payload.h to haproxy/payload.h
There's no type file, it only contains fetch_rdp_cookie_name() and
val_payload_lv() which probably ought to move somewhere else instead
of staying there.
2020-06-11 10:18:58 +02:00
Willy Tarreau
546ba42c73 REORG: include: move lb_fwrr.h to haproxy/lb_fwrr{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
0254941666 REORG: include: move lb_fwlc.h to haproxy/lb_fwlc{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
b5fc3bf6dc REORG: include: move lb_fas.h to haproxy/lb_fas{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fbe8da3320 REORG: include: move lb_chash.h to haproxy/lb_chash{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
d7d2c28104 CLEANUP: include: remove unused mux_pt.h
It used to be needed to export mux_pt_ops when it was the only way to
detect a mux but that's no longer the case.
2020-06-11 10:18:57 +02:00
Willy Tarreau
8efbdfb77b REORG: include: move obj_type.h to haproxy/obj_type{,-t}.h
No change was necessary. It still includes lots of types/* files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
762d7a5117 REORG: include: move frontend.h to haproxy/frontend.h
There was no type file for this one, it only contains frontend_accept().
2020-06-11 10:18:57 +02:00
Willy Tarreau
aa74c4e1b3 REORG: include: move arg.h to haproxy/arg{,-t}.h
Almost no change was needed; chunk.h was replaced with buf-t.h.
It dpeends on types/vars.h and types/protocol_buffers.h.
2020-06-11 10:18:57 +02:00
Willy Tarreau
87735330d1 REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2dd7c35052 REORG: include: move protocol.h to haproxy/protocol{,-t}.h
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
2020-06-11 10:18:57 +02:00
Willy Tarreau
16f958c0e9 REORG: include: split common/htx.h into haproxy/htx{,-t}.h
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
cd72d8c981 REORG: include: split common/http.h into haproxy/http{,-t}.h
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c2f7c5895c REORG: include: move common/ticks.h to haproxy/ticks.h
Nothing needed to be changed, there are no exported types.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7a00efbe43 REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
2020-06-11 10:18:57 +02:00
Willy Tarreau
2741c8c4aa REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.

This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.

Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
2020-06-11 10:18:57 +02:00
Willy Tarreau
92b4f1372e REORG: include: move time.h from common/ to haproxy/
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
2020-06-11 10:18:56 +02:00
Willy Tarreau
58017eef3f REORG: include: move the BUG_ON() code to haproxy/bug.h
This one used to be stored into debug.h but the debug tools got larger
and require a lot of other includes, which can't use BUG_ON() anymore
because of this. It does not make sense and instead this macro should
be placed into the lower includes and given its omnipresence, the best
solution is to create a new bug.h with the few surrounding macros needed
to trigger bugs and place assertions anywhere.

Another benefit is that it won't be required to add include <debug.h>
anymore to use BUG_ON, it will automatically be covered by api.h. No
less than 32 occurrences were dropped.

The FSM_PRINTF macro was dropped since not used at all anymore (probably
since 1.6 or so).
2020-06-11 10:18:56 +02:00
Willy Tarreau
8d36697dee REORG: include: move base64.h, errors.h and hash.h from common to to haproxy/
These ones do not depend on any other file. One used to include
haproxy/api.h but that was solely for stddef.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Olivier Houchard
68ad53cb37 BUG/MEDIUM: backend: set the connection owner to the session when using alpn.
In connect_server(), if we can't create the mux immediately because we have
to wait until the alpn is negociated, store the session as the connection's
owner. conn_create_mux() expects it to be set, and provides it to the mux
init() method. Failure to do so will result to crashes later if the
connection is private, and even if we didn't do so it would prevent connection
reuse for private connections.
This should fix github issue #651.
2020-05-27 01:34:33 +02:00
Christopher Faulet
f98e626491 MINOR: checks/sample: Remove unnecessary tests on the sample session
A sample must always have a session defined. Otherwise, it is a bug. So it is
unnecessary to test if it is defined when called from a health checks context.

This patch fixes the issue #616.
2020-05-06 12:44:46 +02:00
Christopher Faulet
d1b4464b69 MINOR: checks: Add support of be_id, be_name, srv_id and srv_name sample fetches
It is now possible to call be_id, be_name, srv_id and srv_name sample fetches
from any sample expression or log-format string in a tcp-check based ruleset.
2020-05-05 11:06:43 +02:00
Olivier Houchard
cf612a0457 MINOR: servers: Add a counter for the number of currently used connections.
Add a counter to know the current number of used connections, as well as the
max, this will be used later to refine the algorithm used to kill idle
connections, based on current usage.
2020-03-30 00:30:01 +02:00
Olivier Houchard
c0caac2cc8 BUG/MINOR: connections: Make sure we free the connection on failure.
In connect_server(), make sure we properly free a newly created connection
if we somehow fail, and it has not yet been attached to a conn_stream, or
it would lead to a memory leak.
This should appease coverity for backend.c, as reported in inssue #556.

This should be backported to 2.1, 2.0 and 1.9
2020-03-20 14:35:07 +01:00
Olivier Houchard
fdc7ee2173 BUG/MEDIUM: connections: Don't forget to decrement idle connection counters.
In conn_backend_get(), when we manage to get an idle connection from the
current thread's pool, don't forget to decrement the idle connection
counters, or we may end up not reusing connections when we could, and/or
killing connections when we shouldn't.
2020-03-19 23:56:08 +01:00
Olivier Houchard
b3397367dc MEDIUM: connections: Kill connections even if we are reusing one.
In connect_server(), if we notice we have more file descriptors opened than
we should, there's no reason not to close a connection just because we're
reusing one, so do it anyway.
2020-03-19 22:07:34 +01:00
Olivier Houchard
566df309c6 MEDIUM: connections: Attempt to get idle connections from other threads.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
d2489e00b0 MINOR: connections: Add a flag to know if we're in the safe or idle list.
Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one
know we are in the safe list, or the idle list.
2020-03-19 22:07:33 +01:00
Olivier Houchard
f0d4dff25c MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Olivier Houchard
2444aa5b66 MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
2020-03-19 22:07:33 +01:00
Willy Tarreau
b9f54c5592 MINOR: backend: use a single call to ha_random32() for the random LB algo
For the random LB algorithm we need a random 32-bit hashing key that used
to be made of two calls to random(). Now we can simply perform a single
call to ha_random32() and get rid of the useless operations.
2020-03-08 17:31:39 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
2020-03-08 10:09:02 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
ada4c5806b MEDIUM: stream-int: make sure to try to immediately validate the connection
In the rare case of immediate connect() (unix sockets, socket pairs, and
occasionally TCP over the loopback), it is counter-productive to subscribe
for sending and then getting immediately back to process_stream() after
having passed through si_cs_process() just to update the connection. We
already know it is established and it doesn't have any handshake anymore
so we just have to complete it and return to process_stream() with the
stream_interface in the SI_ST_RDY state. In this case, process_stream will
simply loop back to the beginning to synchronize the state and turn it to
SI_ST_EST/ASS/CLO/TAR etc.

This will save us from having to needlessly subscribe in the connect()
code, something which in addition cannot work with edge-triggered pollers.
2020-03-04 19:29:12 +01:00
Olivier Houchard
849d4f047f BUG/MEDIUM: connections: Don't forget to unlock when killing a connection.
Commit 140237471e made sure we hold the
toremove_lock for the corresponding thread before removing a connection
from its idle_orphan_conns list, however it failed to unlock it if we
found a connection, leading to a deadlock, so add the missing deadlock.

This should be backported to 2.1 and 2.0.
2020-01-31 17:25:37 +01:00
Olivier Houchard
1fc5a648bf MEDIUM: streams: Don't close the connection in back_handle_st_rdy().
In back_handle_st_rdy(), don't bother trying to close the connection, it
should be taken care of somewhere else.
2020-01-24 15:40:34 +01:00
Olivier Houchard
7c30642ede MEDIUM: streams: Don't close the connection in back_handle_st_con().
In back_handle_st_con(), don't bother trying to close the connection, it
should be taken care of elsewhere.
2020-01-24 15:40:34 +01:00
Olivier Houchard
b43589cac5 BUG/MEDIUM: stream: Don't install the mux in back_handle_st_con().
In back_handle_st_con(), don't bother setting up the mux, it is now done by
conn_fd_handler().
2020-01-24 15:40:34 +01:00
Olivier Houchard
ecffb7d841 BUG/MEDIUM: streams: Move the conn_stream allocation outside #IF USE_OPENSSL.
When commit 477902bd2e made the conn_stream
allocation unconditional, it unfortunately moved the code doing the allocation
inside #if USE_OPENSSL, which means anybody compiling haproxy without
openssl wouldn't allocate any conn_stream, and would get a segfault later.
Fix that by moving the code that does the allocation outside #if USE_OPENSSL.
2020-01-24 14:14:35 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Willy Tarreau
4450b587dd MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE
Most places continue to check CO_FL_HANDSHAKE while in fact they should
check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one
dedicated to SSL renegotiation. In fact the SSL layer should be the
only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data
when a renegotiation is in progress, but other ones randomly include it
without knowing. And ideally it should even be an internal flag that's
not exposed in the connection.

This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag
consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL.

In order to limit the confusion that has accumulated over time, the
CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake,
possibly used by a renegotiation was moved after the other ones.
2020-01-23 16:34:26 +01:00
Willy Tarreau
c192b0ab95 MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.

Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.

So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.

This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:

  - while most places do check both L4+L6 and HANDSHAKE at the same time,
    some places like assign_server() or back_handle_st_con() and a few
    sample fetches looking for proxy protocol do check for L4+L6 but
    don't care about HANDSHAKE ; these ones will probably fail on TCP
    request session rules if the handshake is not complete.

  - some handshake handlers do validate that a connection is established
    at L4 but didn't clear CO_FL_WAIT_L4_CONN

  - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
    before declaring the mux ready while the snd_buf function also checks
    for the handshake's completion. Likely the former should validate the
    handshake as well and we should get rid of these extra tests in snd_buf.

  - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
    later clear CO_FL_WAIT_L4_CONN.

  - xprt_handshake would set CO_FL_CONNECTED itself without actually
    clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
    waiting for a pure Rx handshake.

  - most places in ssl_sock that were checking CO_FL_CONNECTED don't need
    to include the L4 check as an L6 check is enough to decide whether to
    wait for more info or not.

It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.

Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
2020-01-23 14:41:37 +01:00
Willy Tarreau
79fd577ac1 CLEANUP: backend: shut another false null-deref in back_handle_st_con()
objt_conn() may return a NULL though here we don't have this situation
anymore since the connection is always there, so let's simply switch
to the unchecked __objt_conn(). This addresses issue #454.
2020-01-23 11:40:40 +01:00
Willy Tarreau
b1a40c72e7 CLEANUP: backend: remove useless test for inexistent connection
Coverity rightfully reported that it's pointless to test for "conn"
to be null while all code paths leading to it have already
dereferenced it. This addresses issue #461.
2020-01-23 11:37:43 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Olivier Houchard
8af03b396a MEDIUM: streams: Always create a conn_stream in connect_server().
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
we have less special-casing needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
2020-01-22 18:55:59 +01:00
Willy Tarreau
062df2c23a MEDIUM: backend: move the connection finalization step to back_handle_st_con()
Currently there's still lots of code in conn_complete_server() that performs
one half of the connection setup, which is then checked and finalized in
back_handle_st_con(). There isn't a valid reason for this anymore, we can
simplify this and make sure that conn_complete_server() only wakes the stream
up to inform it about the fact the whole connection stack is set up so that
back_handle_st_con() finishes its job at the stream-int level.

It looks like the there could even be further simplified, but for now it
was moved straight out of conn_complete_server() with no modification.
2020-01-17 18:30:36 +01:00
Willy Tarreau
3a9312af8f REORG: stream/backend: move backend-specific stuff to backend.c
For more than a decade we've kept all the sess_update_st_*() functions
in stream.c while they're only there to work in relation with what is
currently being done in backend.c (srv_redispatch_connect, connect_server,
etc). Let's move all this pollution over there and take this opportunity
to try to find slightly less confusing names for these old functions
whose role is only to handle transitions from one specific stream-int
state:

  sess_update_st_rdy_tcp() -> back_handle_st_rdy()
  sess_update_st_con_tcp() -> back_handle_st_con()
  sess_update_st_cer()     -> back_handle_st_cer()
  sess_update_stream_int() -> back_try_conn_req()
  sess_prepare_conn_req()  -> back_handle_st_req()
  sess_establish()         -> back_establish()

The last one remained in stream.c because it's more or less a completion
function which does all the initialization expected on a connection
success or failure, can set analysers and emit logs.

The other ones could possibly slightly benefit from being modified to
take a stream-int instead since it's really what they're working with,
but it's unimportant here.
2020-01-17 18:30:36 +01:00
Olivier Houchard
140237471e BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection.
In connect_server(), when we decide we want to kill the connection of
another thread because there are too many idle connections, hold the
toremove_lock of the corresponding thread, othervise, there's a small race
condition where we could try to add the connection to the toremove_connections
list while it has already been free'd.

This should be backported to 2.0 and 2.1.
2019-12-30 18:18:28 +01:00
Christopher Faulet
eea8fc737b MEDIUM: stream/trace: Register a new trace source with its events
Runtime traces are now supported for the streams, only if compiled with
debug. process_stream() is covered as well as TCP/HTTP analyzers and filters.

In traces, the first argument is always a stream. So it is easy to get the info
about the channels and the stream-interfaces. The second argument, when defined,
is always a HTTP transaction. And the third one is an HTTP message. The trace
message is adapted to report HTTP info when possible.
2019-11-06 10:14:32 +01:00
vkill
1dfd16536f MINOR: backend: Add srv_name sample fetche
The sample fetche can get srv_name without foreach
`core.backends["bk"].servers`.

Then we can get Server class quickly via
`core.backends[txn.f:be_name()].servers[txn.f:srv_name()]`.

Issue#342
2019-11-01 05:40:24 +01:00
Olivier Houchard
e8f5f5d8b2 BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.
In connect_server(), if we're reusing a connection, only use SF_SRV_REUSED
if the connection is fully ready. We may be using a multiplexed connection
created by another stream that is not yet ready, and may fail.
If we set SF_SRV_REUSED, process_stream() will then not wait for the timeout
to expire, and retry to connect immediately.

This should be backported to 1.9 and 2.0.
This commit depends on 55234e33708c5a584fb9efea81d71ac47235d518.
2019-10-29 14:15:20 +01:00
Olivier Houchard
859dc80f94 MEDIUM: list: Separate "locked" list from regular list.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
2019-09-23 18:16:08 +02:00
Christopher Faulet
1dbc4676c6 BUG/MINOR: backend: Fix a possible null pointer dereference
In the function connect_server(), when we are not able to reuse a connection and
too many FDs are opened, the variable srv must be defined to kill an idle
connection.

This patch fixes the issue #257. It must be backported to 2.0
2019-09-13 10:08:44 +02:00
Nenad Merdanovic
177adc9e57 MINOR: backend: Add srv_queue converter
The converter can be useful to look up a server queue from a dynamic value.

It takes an input value of type string, either a server name or
<backend>/<server> format and returns the number of queued sessions
on that server. Can be used in places where we want to look up
queued sessions from a dynamic name, like a cookie value (e.g.
req.cook(SRVID),srv_queue) and then make a decision to break
persistence or direct a request elsewhere.

Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2019-08-27 04:32:06 +02:00
Willy Tarreau
b082186528 MEDIUM: backend: remove impossible cases from connect_server()
Now that we start by releasing any possibly existing connection,
the conditions simplify a little bit and some of the complex cases
can be removed. A few comments were also added for non-obvious cases.
2019-07-19 13:50:09 +02:00
Willy Tarreau
a5797aab11 MEDIUM: backend: always release any existing prior connection in connect_server()
When entering connect_server() we're not supposed to have a connection
already, except when retrying a failed connection, which is pretty rare.
Let's simplify the code by starting to unconditionally release any existing
connection. For now we don't go further, as this change alone will lead to
quite some simplification that'd rather be done as a separate cleanup.
2019-07-19 13:50:09 +02:00
Willy Tarreau
1c8d32bb62 MAJOR: stream: store the target address into s->target_addr
When forcing the outgoing address of a connection, till now we used to
allocate this outgoing connection and set the address into it, then set
SF_ADDR_SET. With connection reuse this causes a whole lot of issues and
difficulties in the code.

Thanks to the previous changes, it is now possible to store the target
address into the stream instead, and copy the address from the stream to
the connection when initializing the connection. assign_server_address()
does this and as a result SF_ADDR_SET now reflects the presence of the
target address in the stream, not in the connection. The http_proxy mode,
the peers and the master's CLI now use the same mechanism. For now the
existing connection code was not removed to limit the amount of tricky
changes, but the allocated connection is not used anymore.

This change also revealed a latent issue that we've been having around
option http_proxy : the address was set in the connection but neither the
SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection
could establish only due to the fact that it existed with a non-null
destination address.
2019-07-19 13:50:09 +02:00
Willy Tarreau
16aa4aff6b MINOR: connection: don't use clear_addr() anymore, just release the address
Now that we have dynamically allocated addresses, there's no need to
clear an address before reusing it, just release it. Note that this
is not equivalent to saying that an address is never zero, as shown in
assign_server_address() where an address 0.0.0.0 can still be assigned
to a connection for the time it takes to modify it.
2019-07-19 13:50:09 +02:00