Commit Graph

419 Commits

Author SHA1 Message Date
Willy Tarreau
08e2b41e81 BUILD: connections: shut up gcc about impossible out-of-bounds warning
Since commit 88698d9 ("MEDIUM: connections: Add a way to control the
number of idling connections.") when building without threads, gcc
complains that the operations made on the idle_orphan_conns[] list is
out of bounds, which is always false since 1) <i> can only equal zero,
and 2) given it's equal to <tid> we never even enter the loop. But as
usual it thinks it knows better, so let's mask the origin of this <i>
value to shut it up. Another solution consists in making <i> unsigned
and adding an explicit range check.
2019-05-26 11:54:20 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Olivier Houchard
4cd2af4e5d BUG/MEDIUM: ssl: Don't attempt to use early data with libressl.
Libressl doesn't yet provide early data, so don't put the CO_FL_EARLY_SSL_HS
on the connection if we're building with libressl, or the handshake will
never be done.
2019-05-06 15:20:42 +02:00
Olivier Houchard
865d8392bb MEDIUM: streams: Add a way to replay failed 0rtt requests.
Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to
replay requests for which we sent early data that got rejected by the
server.
If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set
on the server line even if the client didn't send early data.
2019-05-04 10:20:24 +02:00
Olivier Houchard
010941f876 BUG/MEDIUM: ssl: Use the early_data API the right way.
We can only read early data if we're a server, and write if we're a client,
so don't attempt to mix both.

This should be backported to 1.8 and 1.9.
2019-05-03 21:00:10 +02:00
Olivier Houchard
a48237fd07 BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown.
When for some reason the session is not the owner of the connection anymore,
make sure we remove CO_FL_SESS_IDLE, even if we're about to call
conn->mux->destroy(), as the destroy may not destroy the connection
immediately if it's still in use.
This should be backported to 1.9.
u
2019-05-02 12:08:39 +02:00
Christopher Faulet
46451d6e04 MINOR: gcc: Fix a silly gcc warning in connect_server()
Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.
2019-04-19 15:53:23 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Olivier Houchard
237f781f2d MEDIUM: backend: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Willy Tarreau
c912f94b57 MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
2019-02-28 16:08:54 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Olivier Houchard
7f1bc31fee MEDIUM: servers: Used a locked list for idle_orphan_conns.
Use the locked macros when manipulating idle_orphan_conns, so that other
threads can remove elements from it.
It will be useful later to avoid having a task per server and per thread to
cleanup the orphan list.
2019-02-26 18:17:32 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Olivier Houchard
e737103173 BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
Use atomic operations when dealing with srv->curr_idle_conns, as it's shared
between threads, otherwise we could get inconsistencies.

This should be backported to 1.9.
2019-02-21 19:07:19 +01:00
Christopher Faulet
f7679ad4db BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
A piece of code about the HTX was lost this summer, after the "big merge"
(htx/http2/connection layer refactoring). I forgot to keep HTX changes in the
functions connect_server() and assign_server(). So, this patch fixes "uri",
"url_param" and "hdr" LB algorithms when the HTX is enabled. It also fixes
evaluation of the "sni" expression on server lines.

This issue was reported on github. See issue #32.

This patch must be backported in 1.9.
2019-02-06 10:20:01 +01:00
Willy Tarreau
1da41ecf5b BUG/MINOR: backend: check srv_conn before dereferencing it
Commit 3c4e19f42 ("BUG/MEDIUM: backend: always release the previous
connection into its own target srv_list") introduced a valid warning
about a null-deref risk since we didn't check conn_new()'s return value.

This patch must be backported to 1.9 with the patch above.
2019-02-01 16:47:46 +01:00
Willy Tarreau
3c4e19f429 BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
There was a bug reported in issue #19 regarding the fact that haproxy
could mis-route requests to the wrong server. It turns out that when
switching to another server, the old connection was put back into the
srv_list corresponding to the stream's target instead of this connection's
target. Thus if this connection was later picked, it was pointing to the
wrong server.

The patch fixes this and also clarifies the assignment to srv_conn->target
so that it's clear we don't change it when picking it from the srv_list.

This must be backported to 1.9 only.
2019-02-01 11:58:33 +01:00
Olivier Houchard
7493114729 BUG/MEDIUM: servers: Close the connection if we failed to install the mux.
If we failed to install the mux, just close the connection, or
conn_fd_handler() will be called for the FD, and crash as soon as it attempts
to access the mux' wake method.

This should be backported to 1.9.
2019-01-29 19:53:09 +01:00
Olivier Houchard
26da323cb9 BUG/MEDIUM: servers: Don't add an incomplete conn to the server idle list.
If we failed to add the connection to the session, only try to add it back
to the server idle list if it has a mux, otherwise the connection is
incomplete and unusable.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
4dc85538ba BUG/MEDIUM: servers: Only destroy a conn_stream we just allocated.
In connect_server(), if we failed to add the connection to the session,
only destroy the conn_stream if we just allocated it, otherwise it may
have been allocated outside connect_server(), along with a connection which
has its destination address set.
Also use si_release_endpoint() instead of cs_destroy(), to make sure the
stream_interface doesn't reference it anymore.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Willy Tarreau
d822013f45 BUG/MEDIUM: backend: always call si_detach_endpoint() on async connection failure
In case an asynchronous connection (ALPN) succeeds but the mux fails to
attach, we must release the stream interface's endpoint, otherwise we
leave the stream interface with an endpoint pointing to a freed connection
with si_ops == si_conn_ops, and sess_update_st_cer() calls si_shutw() on
it, causing it to crash.

This must be backported to 1.9 only.
2019-01-28 16:33:35 +01:00
Olivier Houchard
9ef5155ba6 BUG/MEDIUM: servers: Attempt to reuse an unfinished connection on retry.
In connect_server(), if the previous connection failed, but had an alpn, no
mux was created, and thus the stream_interface's endpoint would be the
connection. In this case, instead of forgetting about it, and overriding
the stream_interface's endpoint later, try to reuse the connection, or the
connection will still be in the session's connection list, and will reference
to a stream that was probably destroyed.

This should be backported to 1.9.
2019-01-28 16:33:31 +01:00
Willy Tarreau
2c7deddc06 BUG/MEDIUM: backend: never try to attach to a mux having no more stream available
The code dealing with idle connections used to check the number of streams
available on the connection only to unlink the connection from the idle
list. But this still resulted in too many streams reusing the same connection
when they were already attached to it.

We must detect that there is no more room and refrain from using this
connection at all, and instead fall back to the no-reuse case. Ideally
we should try to search among other idle connections, but for a backport
let's stay safe.

This must be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
5ce6337254 BUG/MEDIUM: backend: also remove from idle list muxes that have no more room
The current test consists in removing muxes which report that they're going
to assign their last available stream, but a mux may already be saturated
without having passed in this situation at all. This is what happens with
mux_h2 when receiving a GOAWAY frame informing the mux about the ID of the
last stream the other end is willing to process. The limit suddenly changes
from near infinite to 0. Currently what happens is that such a mux remains
in the idle list for a long time and refuses all new streams. Now at least
it will only fail a single stream in a retryable way. A future improvement
should consist in trying to pick another connection from the idle list.

This fix must be backported to 1.9.
2019-01-24 13:53:06 +01:00
Olivier Houchard
09a0f03994 BUG/MEDIUM: servers: Make assign_tproxy_address work when ALPN is set.
If an ALPN is set on the server line, then when we reach assign_tproxy_address,
the stream_interface's endpoint will be a connection, not a conn_stream,
so make sure assign_tproxy_address() handles both cases.

This should be backported to 1.9.
2019-01-17 19:18:20 +01:00
Willy Tarreau
21c741a665 MINOR: backend: make the random algorithm support a number of draws
When an argument <draws> is present, it must be an integer value one
or greater, indicating the number of draws before selecting the least
loaded of these servers. It was indeed demonstrated that picking the
least loaded of two servers is enough to significantly improve the
fairness of the algorithm, by always avoiding to pick the most loaded
server within a farm and getting rid of any bias that could be induced
by the unfair distribution of the consistent list. Higher values N will
take away N-1 of the highest loaded servers at the expense of performance.
With very high values, the algorithm will converge towards the leastconn's
result but much slower. The default value is 2, which generally shows very
good distribution and performance. This algorithm is also known as the
Power of Two Random Choices and is described here :

http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
2019-01-14 19:33:17 +01:00
Willy Tarreau
a9a7249966 MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3}
The algo-specific settings move from the proxy to the LB algo this way :
  - uri_whole => arg_opt1
  - uri_len_limit => arg_opt2
  - uri_dirs_depth1 => arg_opt3
2019-01-14 19:33:17 +01:00
Willy Tarreau
9fed8586b5 MINOR: backend: make the header hash use arg_opt1 for use_domain_only
This is only a boolean extra arg. Let's map it to arg_opt1 and remove
hh_match_domain from struct proxy.
2019-01-14 19:33:17 +01:00
Willy Tarreau
484ff07691 MINOR: backend: make headers and RDP cookie also use arg_str/len
These ones used to rely on separate variables called hh_name/hh_len
but they are exclusive with the former. Let's use the same variable
which becomes a generic argument name and length for the LB algorithm.
2019-01-14 19:33:17 +01:00
Willy Tarreau
4c03d1c9b6 MINOR: backend: move url_param_name/len to lbprm.arg_str/len
This one is exclusively used by LB parameters, when using URL param
hashing. Let's move it to the lbprm struct under a more generic name.
2019-01-14 19:33:17 +01:00
Willy Tarreau
6c30be52da BUG/MINOR: backend: BE_LB_LKUP_CHTREE is a value, not a bit
There are a few instances where the lookup algo is tested against
BE_LB_LKUP_CHTREE using a binary "AND" operation while this macro
is a value among a set, and not a bit. The test happens to work
because the value is exactly 4 and no bit overlaps with the other
possible values but this is a latent bug waiting for a new LB algo
to appear to strike. At the moment the only other algo sharing a bit
with it is the "first" algo which is never supported in the same code
places.

This fix should be backported to maintained versions for safety if it
passes easily, otherwise it's not important as it will not fix any
visible issue.
2019-01-14 19:33:17 +01:00
Willy Tarreau
602a499da5 BUG/MINOR: backend: balance uri specific options were lost across defaults
The "balance uri" options "whole", "len" and "depth" were not properly
inherited from the defaults sections. In addition, "whole" and "len"
were not even reset when parsing "uri", meaning that 2 subsequent
"balance uri" statements would not have the expected effect as the
options from the first one would remain for the second one.

This may be backported to all maintained versions.
2019-01-14 19:33:17 +01:00
Olivier Houchard
5cd6217185 BUG/MEDIUM: server: Defer the mux init until after xprt has been initialized.
In connect_server(), if we're using a new connection, and we have to
initialize the mux right away, only do it so after si_connect() has been
called. si_connect() is responsible for initializing the xprt, and the
mux initialization may depend on the xprt being usable, as it may try to
receive data. Otherwise, the connection will be flagged as having an error,
and we will have to try to connect a second time.

This should be backported to 1.9.
2019-01-04 17:08:47 +01:00
Willy Tarreau
59884a646c MINOR: lb: allow redispatch when using consistent hash
Redispatch traditionally only worked for cookie based persistence.

Adding redispatch support for consistent hash based persistence - also
update docs.

Reported by Oskar Stenman on discourse:
https://discourse.haproxy.org/t/balance-uri-consistent-hashing-redispatch-3-not-redispatching/3344

Should be backported to 1.8.

Cc: Lukas Tribus <lukas@ltri.eu>
2019-01-02 20:22:17 +01:00
Olivier Houchard
a2dbeb22fc MEDIUM: sessions: Keep track of which connections are idle.
Instead of keeping track of the number of connections we're responsible for,
keep track of the number of connections we're responsible for that we are
currently considering idling (ie that we are not using, they may be in use
by other sessions), that way we can actually reuse connections when we have
more connections than the max configured.
2018-12-28 19:16:03 +01:00
Olivier Houchard
c685d700fd MEDIUM: servers: Be smarter when switching connections.
When connecting to a server, and reusing a connection, always attempt to give
the owner of the previous session one of its own connections, so that one
session won't be responsible for too many connections.

This should be backported to 1.9.
2018-12-28 16:34:03 +01:00
Olivier Houchard
4f41751ad2 BUG/MEDIUM: servers: Flag the stream_interface on handshake error.
When creating a new outgoing connection, if we're using ALPN and waiting
for the handshake completion to choose the mux, and for some reason the
handshake failed, add the SI_FL_ERR flag to the stream_interface, so that
process_streams() knows the connection failed, and can attempt to retry,
instead of just hanging.

This should be backported to 1.9.
2018-12-28 16:33:22 +01:00
Olivier Houchard
351411facd BUG/MAJOR: sessions: Use an unlimited number of servers for the conn list.
When a session adds a connection to its connection list, we used to remove
connections for an another server if there were not enough room for our
server. This can't work, because those lists are now the list of connections
we're responsible for, not just the idle connections.
To fix this, allow for an unlimited number of servers, instead of using
an array, we're now using a linked list.
2018-12-28 16:33:13 +01:00
Olivier Houchard
5f7de56a08 BUG/MAJOR: servers: Correctly use LIST_ELEM().
To access the first element of the list, correctly use LIST_ELEM(), or we
end up getting the head of the list, instead of getting the first connection.

This should be backported to 1.9.
2018-12-28 16:33:06 +01:00
Olivier Houchard
c3fa638b4c BUG/MAJOR: servers: Use the list api correctly to avoid crashes.
In connect_server(), if we looked for an usable connection and failed to
find one, srv_conn won't be NULL at the end of list_for_each_entry(), but
will point to the head of a list, which is not a pointer to a struct
connection, so explicitely set it to NULL.

This should be backported to 1.9.
2018-12-28 16:33:00 +01:00
Olivier Houchard
134a2045bb BUG/MEDIUM: servers: Fail if we fail to allocate a conn_stream.
If, for some reason we failed to allocate a conn_stream when reusing an
existing connection, set srv_conn to NULL, so that we fail later, instead
of pretending all is right. This ends up giving a stream_interface with
no endpoint, and so the stream will never end.

This should be backported to 1.9.
2018-12-28 15:49:24 +01:00
Olivier Houchard
bb3dac37a2 BUG/MEDIUM: servers: Don't try to reuse connection if we switched server.
In connect_server(), don't attempt to reuse the old connection if it's
targetting a different server than the one we're supposed to access, or
we will never be able to connect to a server if the first one we tried failed.

This should be backported to 1.9.
2018-12-24 13:45:43 +01:00
Willy Tarreau
94031d30d7 MINOR: connection: remove an unwelcome dependency on struct stream
There was a reference to struct stream in conn_free() for the case
where we're freeing a connection that doesn't have a mux attached.
For now we know it's always a stream, and we only need to do it to
put a NULL in s->si[1].end.

Let's do it better by storing the pointer to si[1].end in the context
and specifying that this pointer is always nulled if the mux is null.
This way it allows a connection to detach itself from wherever it's
being used. Maybe we could even get rid of the condition on the mux.
2018-12-19 14:36:29 +01:00
Willy Tarreau
3d2ee55ebd CLEANUP: connection: rename conn->mux_ctx to conn->ctx
We most often store the mux context there but it can also be something
else while setting up the connection. Better call it "ctx" and know
that it's the owner's context than misleadingly call it mux_ctx and
get caught doing suspicious tricks.
2018-12-19 14:13:07 +01:00
Olivier Houchard
7aec9ed2f8 MEDIUM: servers: Be more agressive when adding H2 connection to idle lists.
Add the newly created to the idle list as long as http-reuse != never, and
when completing a H2 request, add the connection to the safe list instead of
the idle list, if we have to add it at that point, that means we created
many streams so we know it's safe.
2018-12-15 23:50:10 +01:00
Olivier Houchard
a4d4fdfaa3 MEDIUM: sessions: Don't keep an infinite number of idling connections.
In session, don't keep an infinite number of connection that can idle.
Add a new frontend parameter, "max-session-srv-conns" to set a max number,
with a default value of 5.
2018-12-15 23:50:10 +01:00
Olivier Houchard
f502aca5c2 MEDIUM: mux: provide the session to the init() and attach() method.
Instead of trying to get the session from the connection, which is not
always there, and of course there could be multiple sessions per connection,
provide it with the init() and attach() methods, so that we know the
session for each outgoing stream.
2018-12-15 23:50:09 +01:00