Commit Graph

632 Commits

Author SHA1 Message Date
Christopher Faulet
aa27853ce2 BUG/MEDIUM: connection: Don't consider new private connections as available
When a connection is created and the multiplexer is installed, if the connection
is marked as private, don't consider it as available, regardless the number of
available streams. This test is performed when the mux is installed when the
connection is created, in connect_server(), and when the mux is installed after
the handshakes stage.

No backport needed, this is 2.2-dev.
2020-07-07 14:30:38 +02:00
Christopher Faulet
e91a526c8f BUG/MINOR: backend: Remove CO_FL_SESS_IDLE if a client remains on the last server
When a connection is picked from the session server list because the proxy or
the session are marked to use the last requested server, if it is idle, we must
marked it as used removing the CO_FL_SESS_IDLE flag and decrementing the session
idle_conns counter.

This patch must be backported as far as 1.9.
2020-07-07 14:30:26 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which would randomly reject ~1% of the incoming
connection and randomly killing some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s otherwise it's necessary to restart to
reset the idle connections lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.

This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
2020-07-01 19:07:37 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Willy Tarreau
364f25a688 MINOR: backend: don't always takeover from the same threads
The next thread walking algorithm in commit 566df309c ("MEDIUM:
connections: Attempt to get idle connections from other threads.")
proved to be sufficient for most cases, but it still has some rough
edges when threads are unevenly loaded. If one thread wakes up with
10 streams to process in a burst, it will mainly take over connections
from the next one until it doesn't have anymore.

This patch implements a rotating index that is stored into the server
list and that any thread taking over a connection is responsible for
updating. This way it starts mostly random and avoids always picking
from the same place. This results in a smoother distribution overall
and a slightly lower takeover rate.
2020-07-01 16:07:43 +02:00
Willy Tarreau
0d587116c2 BUG/MEDIUM: backend: always search in the safe list after failing on the idle one
There's a tricky behavior that was lost when the idle connections were
made sharable between thread in commit 566df309c ("MEDIUM: connections:
Attempt to get idle connections from other threads."), it is the ability
to retry from the safe list when looking for any type of idle connection
and not finding one in the idle list.

It is already important when dealing with long-lived connections since
they ultimately all become safe, but that case is already covered by
the fact that safe conns not being used end up closing and are not
looked up anymore since connect_server() sees there are none.

But it's even more important when using server-side connections which
periodically close, because the new connections may spend half of their
time in safe state and the other half in the idle state, and failing
to grab one such connection from the right list results in establishing
a new connection.

This patch makes sure that a failure to find an idle connection results
in a new attempt at finding one from the safe list if available. In order
to avoid locking twice, connections are attempted alternatively from the
idle then safe list when picking from siblings. Tests have shown a ~2%
performance increase by avoiding to lock twice.

A typical test with 10000 connections over 16 threads with 210 servers
having a 1 millisecond response time and closing every 5 requests shows
a degrading performance starting at 120k req/s down to 60-90k and an
average reuse rate of 44%. After the fix, the reuse rate raises to 79%
and the performance becomes stable at 254k req/s. Similarly the previous
test with full keep-alive has now increased from 96% reuse rate to 99%
and from 352k to 375k req/s.

No backport is needed as this is 2.2-only.
2020-07-01 15:49:21 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
2020-07-01 15:23:15 +02:00
Willy Tarreau
151c253a1e MINOR: server: skip servers with no idle conns earlier
In conn_backend_get() we can avoid locking other servers when trying
to steal their connections when we know for sure they will not have
one, so let's do it to lower the contention on the lock.
2020-07-01 10:33:39 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over previous purge period, or is a
    max of what is seen over current period; it differs from max_used_conns
    in that this one is a counter that's reset on each purge period ;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
c35bcfcc21 BUG/MINOR: server: start cleaning idle connections from various points
There's a minor glitch with the way idle connections start to be evicted.
The lookup always goes from thread 0 to thread N-1. This causes depletion
of connections on the first threads and abundance on the last ones. This
is visible with the takeover() stats below:

 $ socat - /tmp/sock1 <<< "show activity"|grep ^fd ; \
   sleep 10 ; \
   socat -/tmp/sock1 <<< "show activity"|grep ^fd
 fd_takeover: 300144 [ 91887 84029 66254 57974 ]
 fd_takeover: 359631 [ 111369 99699 79145 69418 ]

There are respectively 19k, 15k, 13k and 11k takeovers for only 4 threads,
indicating that the first thread needs a foreign FD twice more often than
the 4th one.

This patch changes this si that all threads are scanned in round robin
starting with the current one. The takeovers now happen in a much more
distributed way (about 4 times 9k) :

  fd_takeover: 1420081 [ 359562 359453 346586 354480 ]
  fd_takeover: 1457044 [ 368779 368429 355990 363846 ]

There is no need to backport this, as this happened along a few patches
that were merged during 2.2 development.
2020-06-29 14:43:16 +02:00
Willy Tarreau
b159132ea3 MINOR: activity: add per-thread statistics on FD takeover
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
2020-06-29 14:26:05 +02:00
Olivier Houchard
4ba494ca1b BUG/MEDIUM: connections: Don't increase curr_used_conns for shared connections.
In connect_server(), we want to increase curr_used_conns only if the
connection is new, or if it comes from an idle_pool, otherwise it means
the connection is already used by at least one another stream, and it is
already accounted for.
2020-06-28 16:16:13 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
b2551057af CLEANUP: include: tree-wide alphabetical sort of include files
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
2020-06-11 10:18:59 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
1e56f92693 REORG: include: move server.h to haproxy/server{,-t}.h
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a55c45470f REORG: include: move queue.h to haproxy/queue{,-t}.h
Nothing outstanding here. A number of call places were not justified and
removed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4980160ecc REORG: include: move backend.h to haproxy/backend{,-t}.h
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a264d960f6 REORG: include: move proxy.h to haproxy/proxy{,-t}.h
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
2020-06-11 10:18:58 +02:00
Willy Tarreau
aeed4a85d6 REORG: include: move log.h to haproxy/log{,-t}.h
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.

Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.

A good approach would consist in splitting that file in 3 parts:
  - one for error output ("errors" ?).
  - one for log_format processing
  - and one for actual logging.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c2b1ff04e5 REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f1d32c475c REORG: include: move channel.h to haproxy/channel{,-t}.h
The files were moved with no change. The callers were cleaned up a bit
and a few of them had channel.h removed since not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5e539c9b8d REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h
Almost no changes, removed stdlib and added buf-t and connection-t to
the types to avoid a warning.
2020-06-11 10:18:58 +02:00
Willy Tarreau
209108dbbd REORG: include: move ssl_sock.h to haproxy/ssl_sock{,-t}.h
Almost nothing changed, just moved a static inline at the end and moved
an export from the types to the main file.
2020-06-11 10:18:58 +02:00
Willy Tarreau
2867159d63 REORG: include: move lb_map.h to haproxy/lb_map{,-t}.h
Nothing was changed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dcc048a14a REORG: include: move acl.h to haproxy/acl.h{,-t}.h
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
48d25b3bc9 REORG: include: move session.h to haproxy/session{,-t}.h
Almost no change was needed beyond a little bit of reordering of the
types file and adjustments to use session-t instead of session at a
few places.
2020-06-11 10:18:58 +02:00
Willy Tarreau
4aa573da6f REORG: include: move checks.h to haproxy/check{,-t}.h
All includes that were not absolutely necessary were removed because
checks.h happens to very often be part of dependency loops. A warning
was added about this in check-t.h. The fields, enums and structs were
a bit tidied because it's particularly tedious to find anything there.
It would make sense to split this in two or more files (at least
extract tcp-checks).

The file was renamed to the singular because it was one of the rare
exceptions to have an "s" appended to its name compared to the struct
name.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fc77454aff REORG: include: move proto_tcp.h to haproxy/proto_tcp.h
There was no type file. This one really is trivial. A few missing
includes were added to satisfy the exported functions prototypes.
2020-06-11 10:18:58 +02:00
Willy Tarreau
cea0e1bb19 REORG: include: move task.h to haproxy/task{,-t}.h
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
f268ee8795 REORG: include: split global.h into haproxy/global{,-t}.h
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.

It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.

The UNIX_MAX_PATH definition was moved to compat.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
e6ce10be85 REORG: include: move sample.h to haproxy/sample{,-t}.h
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
469509b39e REORG: include: move payload.h to haproxy/payload.h
There's no type file, it only contains fetch_rdp_cookie_name() and
val_payload_lv() which probably ought to move somewhere else instead
of staying there.
2020-06-11 10:18:58 +02:00
Willy Tarreau
546ba42c73 REORG: include: move lb_fwrr.h to haproxy/lb_fwrr{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
0254941666 REORG: include: move lb_fwlc.h to haproxy/lb_fwlc{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
b5fc3bf6dc REORG: include: move lb_fas.h to haproxy/lb_fas{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
fbe8da3320 REORG: include: move lb_chash.h to haproxy/lb_chash{,-t}.h
Nothing fancy, includes were already OK. The proto didn't reference the
type, this was fixed. Still references proxy.h and server.h from types/.
2020-06-11 10:18:58 +02:00
Willy Tarreau
d7d2c28104 CLEANUP: include: remove unused mux_pt.h
It used to be needed to export mux_pt_ops when it was the only way to
detect a mux but that's no longer the case.
2020-06-11 10:18:57 +02:00
Willy Tarreau
8efbdfb77b REORG: include: move obj_type.h to haproxy/obj_type{,-t}.h
No change was necessary. It still includes lots of types/* files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
762d7a5117 REORG: include: move frontend.h to haproxy/frontend.h
There was no type file for this one, it only contains frontend_accept().
2020-06-11 10:18:57 +02:00
Willy Tarreau
aa74c4e1b3 REORG: include: move arg.h to haproxy/arg{,-t}.h
Almost no change was needed; chunk.h was replaced with buf-t.h.
It dpeends on types/vars.h and types/protocol_buffers.h.
2020-06-11 10:18:57 +02:00
Willy Tarreau
87735330d1 REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2dd7c35052 REORG: include: move protocol.h to haproxy/protocol{,-t}.h
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
2020-06-11 10:18:57 +02:00
Willy Tarreau
16f958c0e9 REORG: include: split common/htx.h into haproxy/htx{,-t}.h
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
cd72d8c981 REORG: include: split common/http.h into haproxy/http{,-t}.h
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
c2f7c5895c REORG: include: move common/ticks.h to haproxy/ticks.h
Nothing needed to be changed, there are no exported types.
2020-06-11 10:18:57 +02:00
Willy Tarreau
7a00efbe43 REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
2020-06-11 10:18:57 +02:00
Willy Tarreau
2741c8c4aa REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.

This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.

Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
2020-06-11 10:18:57 +02:00
Willy Tarreau
92b4f1372e REORG: include: move time.h from common/ to haproxy/
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
2020-06-11 10:18:56 +02:00
Willy Tarreau
58017eef3f REORG: include: move the BUG_ON() code to haproxy/bug.h
This one used to be stored into debug.h but the debug tools got larger
and require a lot of other includes, which can't use BUG_ON() anymore
because of this. It does not make sense and instead this macro should
be placed into the lower includes and given its omnipresence, the best
solution is to create a new bug.h with the few surrounding macros needed
to trigger bugs and place assertions anywhere.

Another benefit is that it won't be required to add include <debug.h>
anymore to use BUG_ON, it will automatically be covered by api.h. No
less than 32 occurrences were dropped.

The FSM_PRINTF macro was dropped since not used at all anymore (probably
since 1.6 or so).
2020-06-11 10:18:56 +02:00
Willy Tarreau
8d36697dee REORG: include: move base64.h, errors.h and hash.h from common to to haproxy/
These ones do not depend on any other file. One used to include
haproxy/api.h but that was solely for stddef.h.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Olivier Houchard
68ad53cb37 BUG/MEDIUM: backend: set the connection owner to the session when using alpn.
In connect_server(), if we can't create the mux immediately because we have
to wait until the alpn is negociated, store the session as the connection's
owner. conn_create_mux() expects it to be set, and provides it to the mux
init() method. Failure to do so will result to crashes later if the
connection is private, and even if we didn't do so it would prevent connection
reuse for private connections.
This should fix github issue #651.
2020-05-27 01:34:33 +02:00
Christopher Faulet
f98e626491 MINOR: checks/sample: Remove unnecessary tests on the sample session
A sample must always have a session defined. Otherwise, it is a bug. So it is
unnecessary to test if it is defined when called from a health checks context.

This patch fixes the issue #616.
2020-05-06 12:44:46 +02:00
Christopher Faulet
d1b4464b69 MINOR: checks: Add support of be_id, be_name, srv_id and srv_name sample fetches
It is now possible to call be_id, be_name, srv_id and srv_name sample fetches
from any sample expression or log-format string in a tcp-check based ruleset.
2020-05-05 11:06:43 +02:00
Olivier Houchard
cf612a0457 MINOR: servers: Add a counter for the number of currently used connections.
Add a counter to know the current number of used connections, as well as the
max, this will be used later to refine the algorithm used to kill idle
connections, based on current usage.
2020-03-30 00:30:01 +02:00
Olivier Houchard
c0caac2cc8 BUG/MINOR: connections: Make sure we free the connection on failure.
In connect_server(), make sure we properly free a newly created connection
if we somehow fail, and it has not yet been attached to a conn_stream, or
it would lead to a memory leak.
This should appease coverity for backend.c, as reported in inssue #556.

This should be backported to 2.1, 2.0 and 1.9
2020-03-20 14:35:07 +01:00
Olivier Houchard
fdc7ee2173 BUG/MEDIUM: connections: Don't forget to decrement idle connection counters.
In conn_backend_get(), when we manage to get an idle connection from the
current thread's pool, don't forget to decrement the idle connection
counters, or we may end up not reusing connections when we could, and/or
killing connections when we shouldn't.
2020-03-19 23:56:08 +01:00
Olivier Houchard
b3397367dc MEDIUM: connections: Kill connections even if we are reusing one.
In connect_server(), if we notice we have more file descriptors opened than
we should, there's no reason not to close a connection just because we're
reusing one, so do it anyway.
2020-03-19 22:07:34 +01:00
Olivier Houchard
566df309c6 MEDIUM: connections: Attempt to get idle connections from other threads.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
d2489e00b0 MINOR: connections: Add a flag to know if we're in the safe or idle list.
Add flags to connections, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST, to let one
know we are in the safe list, or the idle list.
2020-03-19 22:07:33 +01:00
Olivier Houchard
f0d4dff25c MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Olivier Houchard
2444aa5b66 MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
2020-03-19 22:07:33 +01:00
Willy Tarreau
b9f54c5592 MINOR: backend: use a single call to ha_random32() for the random LB algo
For the random LB algorithm we need a random 32-bit hashing key that used
to be made of two calls to random(). Now we can simply perform a single
call to ha_random32() and get rid of the useless operations.
2020-03-08 17:31:39 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
2020-03-08 10:09:02 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
ada4c5806b MEDIUM: stream-int: make sure to try to immediately validate the connection
In the rare case of immediate connect() (unix sockets, socket pairs, and
occasionally TCP over the loopback), it is counter-productive to subscribe
for sending and then getting immediately back to process_stream() after
having passed through si_cs_process() just to update the connection. We
already know it is established and it doesn't have any handshake anymore
so we just have to complete it and return to process_stream() with the
stream_interface in the SI_ST_RDY state. In this case, process_stream will
simply loop back to the beginning to synchronize the state and turn it to
SI_ST_EST/ASS/CLO/TAR etc.

This will save us from having to needlessly subscribe in the connect()
code, something which in addition cannot work with edge-triggered pollers.
2020-03-04 19:29:12 +01:00
Olivier Houchard
849d4f047f BUG/MEDIUM: connections: Don't forget to unlock when killing a connection.
Commit 140237471e made sure we hold the
toremove_lock for the corresponding thread before removing a connection
from its idle_orphan_conns list, however it failed to unlock it if we
found a connection, leading to a deadlock, so add the missing deadlock.

This should be backported to 2.1 and 2.0.
2020-01-31 17:25:37 +01:00
Olivier Houchard
1fc5a648bf MEDIUM: streams: Don't close the connection in back_handle_st_rdy().
In back_handle_st_rdy(), don't bother trying to close the connection, it
should be taken care of somewhere else.
2020-01-24 15:40:34 +01:00
Olivier Houchard
7c30642ede MEDIUM: streams: Don't close the connection in back_handle_st_con().
In back_handle_st_con(), don't bother trying to close the connection, it
should be taken care of elsewhere.
2020-01-24 15:40:34 +01:00
Olivier Houchard
b43589cac5 BUG/MEDIUM: stream: Don't install the mux in back_handle_st_con().
In back_handle_st_con(), don't bother setting up the mux, it is now done by
conn_fd_handler().
2020-01-24 15:40:34 +01:00
Olivier Houchard
ecffb7d841 BUG/MEDIUM: streams: Move the conn_stream allocation outside #IF USE_OPENSSL.
When commit 477902bd2e made the conn_stream
allocation unconditional, it unfortunately moved the code doing the allocation
inside #if USE_OPENSSL, which means anybody compiling haproxy without
openssl wouldn't allocate any conn_stream, and would get a segfault later.
Fix that by moving the code that does the allocation outside #if USE_OPENSSL.
2020-01-24 14:14:35 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Willy Tarreau
4450b587dd MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE
Most places continue to check CO_FL_HANDSHAKE while in fact they should
check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one
dedicated to SSL renegotiation. In fact the SSL layer should be the
only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data
when a renegotiation is in progress, but other ones randomly include it
without knowing. And ideally it should even be an internal flag that's
not exposed in the connection.

This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag
consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL.

In order to limit the confusion that has accumulated over time, the
CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake,
possibly used by a renegotiation was moved after the other ones.
2020-01-23 16:34:26 +01:00
Willy Tarreau
c192b0ab95 MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.

Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.

So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.

This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:

  - while most places do check both L4+L6 and HANDSHAKE at the same time,
    some places like assign_server() or back_handle_st_con() and a few
    sample fetches looking for proxy protocol do check for L4+L6 but
    don't care about HANDSHAKE ; these ones will probably fail on TCP
    request session rules if the handshake is not complete.

  - some handshake handlers do validate that a connection is established
    at L4 but didn't clear CO_FL_WAIT_L4_CONN

  - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
    before declaring the mux ready while the snd_buf function also checks
    for the handshake's completion. Likely the former should validate the
    handshake as well and we should get rid of these extra tests in snd_buf.

  - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
    later clear CO_FL_WAIT_L4_CONN.

  - xprt_handshake would set CO_FL_CONNECTED itself without actually
    clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
    waiting for a pure Rx handshake.

  - most places in ssl_sock that were checking CO_FL_CONNECTED don't need
    to include the L4 check as an L6 check is enough to decide whether to
    wait for more info or not.

It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.

Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
2020-01-23 14:41:37 +01:00
Willy Tarreau
79fd577ac1 CLEANUP: backend: shut another false null-deref in back_handle_st_con()
objt_conn() may return a NULL though here we don't have this situation
anymore since the connection is always there, so let's simply switch
to the unchecked __objt_conn(). This addresses issue #454.
2020-01-23 11:40:40 +01:00
Willy Tarreau
b1a40c72e7 CLEANUP: backend: remove useless test for inexistent connection
Coverity rightfully reported that it's pointless to test for "conn"
to be null while all code paths leading to it have already
dereferenced it. This addresses issue #461.
2020-01-23 11:37:43 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Olivier Houchard
8af03b396a MEDIUM: streams: Always create a conn_stream in connect_server().
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
we have less special-casing needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
2020-01-22 18:55:59 +01:00
Willy Tarreau
062df2c23a MEDIUM: backend: move the connection finalization step to back_handle_st_con()
Currently there's still lots of code in conn_complete_server() that performs
one half of the connection setup, which is then checked and finalized in
back_handle_st_con(). There isn't a valid reason for this anymore, we can
simplify this and make sure that conn_complete_server() only wakes the stream
up to inform it about the fact the whole connection stack is set up so that
back_handle_st_con() finishes its job at the stream-int level.

It looks like the there could even be further simplified, but for now it
was moved straight out of conn_complete_server() with no modification.
2020-01-17 18:30:36 +01:00
Willy Tarreau
3a9312af8f REORG: stream/backend: move backend-specific stuff to backend.c
For more than a decade we've kept all the sess_update_st_*() functions
in stream.c while they're only there to work in relation with what is
currently being done in backend.c (srv_redispatch_connect, connect_server,
etc). Let's move all this pollution over there and take this opportunity
to try to find slightly less confusing names for these old functions
whose role is only to handle transitions from one specific stream-int
state:

  sess_update_st_rdy_tcp() -> back_handle_st_rdy()
  sess_update_st_con_tcp() -> back_handle_st_con()
  sess_update_st_cer()     -> back_handle_st_cer()
  sess_update_stream_int() -> back_try_conn_req()
  sess_prepare_conn_req()  -> back_handle_st_req()
  sess_establish()         -> back_establish()

The last one remained in stream.c because it's more or less a completion
function which does all the initialization expected on a connection
success or failure, can set analysers and emit logs.

The other ones could possibly slightly benefit from being modified to
take a stream-int instead since it's really what they're working with,
but it's unimportant here.
2020-01-17 18:30:36 +01:00
Olivier Houchard
140237471e BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection.
In connect_server(), when we decide we want to kill the connection of
another thread because there are too many idle connections, hold the
toremove_lock of the corresponding thread, othervise, there's a small race
condition where we could try to add the connection to the toremove_connections
list while it has already been free'd.

This should be backported to 2.0 and 2.1.
2019-12-30 18:18:28 +01:00
Christopher Faulet
eea8fc737b MEDIUM: stream/trace: Register a new trace source with its events
Runtime traces are now supported for the streams, only if compiled with
debug. process_stream() is covered as well as TCP/HTTP analyzers and filters.

In traces, the first argument is always a stream. So it is easy to get the info
about the channels and the stream-interfaces. The second argument, when defined,
is always a HTTP transaction. And the third one is an HTTP message. The trace
message is adapted to report HTTP info when possible.
2019-11-06 10:14:32 +01:00
vkill
1dfd16536f MINOR: backend: Add srv_name sample fetche
The sample fetche can get srv_name without foreach
`core.backends["bk"].servers`.

Then we can get Server class quickly via
`core.backends[txn.f:be_name()].servers[txn.f:srv_name()]`.

Issue#342
2019-11-01 05:40:24 +01:00
Olivier Houchard
e8f5f5d8b2 BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.
In connect_server(), if we're reusing a connection, only use SF_SRV_REUSED
if the connection is fully ready. We may be using a multiplexed connection
created by another stream that is not yet ready, and may fail.
If we set SF_SRV_REUSED, process_stream() will then not wait for the timeout
to expire, and retry to connect immediately.

This should be backported to 1.9 and 2.0.
This commit depends on 55234e33708c5a584fb9efea81d71ac47235d518.
2019-10-29 14:15:20 +01:00
Olivier Houchard
859dc80f94 MEDIUM: list: Separate "locked" list from regular list.
Instead of using the same type for regular linked lists and "autolocked"
linked lists, use a separate type, "struct mt_list", for the autolocked one,
and introduce a set of macros, similar to the LIST_* macros, with the
MT_ prefix.
When we use the same entry for both regular list and autolocked list, as
is done for the "list" field in struct connection, we know have to explicitely
cast it to struct mt_list when using MT_ macros.
2019-09-23 18:16:08 +02:00
Christopher Faulet
1dbc4676c6 BUG/MINOR: backend: Fix a possible null pointer dereference
In the function connect_server(), when we are not able to reuse a connection and
too many FDs are opened, the variable srv must be defined to kill an idle
connection.

This patch fixes the issue #257. It must be backported to 2.0
2019-09-13 10:08:44 +02:00
Nenad Merdanovic
177adc9e57 MINOR: backend: Add srv_queue converter
The converter can be useful to look up a server queue from a dynamic value.

It takes an input value of type string, either a server name or
<backend>/<server> format and returns the number of queued sessions
on that server. Can be used in places where we want to look up
queued sessions from a dynamic name, like a cookie value (e.g.
req.cook(SRVID),srv_queue) and then make a decision to break
persistence or direct a request elsewhere.

Signed-off-by: Nenad Merdanovic <nmerdan@haproxy.com>
2019-08-27 04:32:06 +02:00
Willy Tarreau
b082186528 MEDIUM: backend: remove impossible cases from connect_server()
Now that we start by releasing any possibly existing connection,
the conditions simplify a little bit and some of the complex cases
can be removed. A few comments were also added for non-obvious cases.
2019-07-19 13:50:09 +02:00
Willy Tarreau
a5797aab11 MEDIUM: backend: always release any existing prior connection in connect_server()
When entering connect_server() we're not supposed to have a connection
already, except when retrying a failed connection, which is pretty rare.
Let's simplify the code by starting to unconditionally release any existing
connection. For now we don't go further, as this change alone will lead to
quite some simplification that'd rather be done as a separate cleanup.
2019-07-19 13:50:09 +02:00
Willy Tarreau
1c8d32bb62 MAJOR: stream: store the target address into s->target_addr
When forcing the outgoing address of a connection, till now we used to
allocate this outgoing connection and set the address into it, then set
SF_ADDR_SET. With connection reuse this causes a whole lot of issues and
difficulties in the code.

Thanks to the previous changes, it is now possible to store the target
address into the stream instead, and copy the address from the stream to
the connection when initializing the connection. assign_server_address()
does this and as a result SF_ADDR_SET now reflects the presence of the
target address in the stream, not in the connection. The http_proxy mode,
the peers and the master's CLI now use the same mechanism. For now the
existing connection code was not removed to limit the amount of tricky
changes, but the allocated connection is not used anymore.

This change also revealed a latent issue that we've been having around
option http_proxy : the address was set in the connection but neither the
SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection
could establish only due to the fact that it existed with a non-null
destination address.
2019-07-19 13:50:09 +02:00
Willy Tarreau
16aa4aff6b MINOR: connection: don't use clear_addr() anymore, just release the address
Now that we have dynamically allocated addresses, there's no need to
clear an address before reusing it, just release it. Note that this
is not equivalent to saying that an address is never zero, as shown in
assign_server_address() where an address 0.0.0.0 can still be assigned
to a connection for the time it takes to modify it.
2019-07-19 13:50:09 +02:00
Willy Tarreau
ca79f59365 MEDIUM: connection: make sure all address producers allocate their address
This commit places calls to sockaddr_alloc() at the places where an address
is needed, and makes sure that the allocation is properly tested. This does
not add too many error paths since connection allocations are already in the
vicinity and share the same error paths. For the two cases where a
clear_addr() was called, instead the address was not allocated.
2019-07-19 13:50:09 +02:00
Willy Tarreau
c0e16f208d MEDIUM: backend: turn all conn->addr.{from,to} to conn->{src,dst}
All reads were carefully reviewed for only reading already checked
values. Assignments were commented indicating that an allocation will
be needed once they become dynamic. The memset() used to clear the
addresses should then be turned to a free() and a NULL assignment.
2019-07-19 13:50:09 +02:00
Willy Tarreau
3cc01d84b3 MINOR: backend: switch to conn_get_{src,dst}() for port and address mapping
The backend connect code uses conn_get_{from,to}_addr to forward addresses
in transparent mode and to map server ports, without really checking if the
operation succeeds. In preparation of future changes, let's switch to
conn_get_{src,dst}() and integrate status check for possible failures.
2019-07-19 13:50:09 +02:00
Christopher Faulet
fc9cfe4006 REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.

In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
2019-07-19 09:24:12 +02:00
Christopher Faulet
7d37fbb753 MEDIUM: backend: Remove code relying on the HTTP legacy mode
The L7 loadbalancing algorithms are concerned (uri, url_param and hdr), the
"sni" parameter on the server line and the "source" parameter on the server line
when used with "use_src hdr_ip()".
2019-07-19 09:18:27 +02:00
Christopher Faulet
b5f86f116b MINOR: backend/htx: Don't rewind output data to set the sni on a srv connection
Rewind on output data is useless for HTX streams.
2019-07-19 09:18:27 +02:00
Willy Tarreau
09e0203ef4 BUG/MINOR: backend: do not try to install a mux when the connection failed
If si_connect() failed, do not try to install the mux nor to complete
the operations or add the connection to an idle list, and abort quickly
instead. No obvious side effects were identified, but continuing to
allocate some resources after something has already failed seems risky.

This was a result of a prior fix which already wanted to push this code
further : aa089d80b ("BUG/MEDIUM: server: Defer the mux init until after
xprt has been initialized.") but it ought to have pushed it even further
to maintain the error check just after si_connect().

To be backported to 2.0 and 1.9.
2019-07-18 16:49:11 +02:00
Olivier Houchard
a1ab97316f BUG/MEDIUM: servers: Don't forget to set srv_cs to NULL if we can't reuse it.
In connect_server(), if there were already a CS assosciated with the stream,
but we can't reuse it, because the target is different (because we tried a
previous connection, it failed, and we use redispatch so we switched servers),
don't forget to set srv_cs to NULL. Otherwise, if we end up reusing another
connection, we would consider we already have a conn_stream, and we won't
create a new one, so we'd have a new connection but we would not be able to
use it.
This can explain frozen streams and connections stuck in CLOSE_WAIT when
using redispatch.

This should be backported to 1.9 and 2.0.
2019-07-08 16:32:58 +02:00
Olivier Houchard
6c6dc58da0 BUG/MEDIUM: connections: Always add the xprt handshake if needed.
In connect_server(), we used to only call xprt_add_hs() if CO_FL_SEND_PROXY
was set during the function call, we would not do it if the flag was set
before connect_server() was called. The rational at the time was if the flag
was already set, then the XPRT was already present. But now the xprt_handshake
always removes itself, so we have to re-add it each time, or it wouldn't be
done if the first connection attempt failed.
While I'm there, check any non-ssl handshake flag, instead of just
CO_FL_SEND_PROXY, or we'd miss the SOCKS4 flags.

This should be backported to 2.0.
2019-06-24 19:00:16 +02:00
Olivier Houchard
8694e5bc99 BUG/MEDIUM: connections: Don't try to send early data if we have no mux.
In connect_server(), if we don't yet have a mux, because we're choosing
one depending on the ALPN, don't attempt to send early data. We can't do
it because those data would depend on the mux, that will only be determined
by the handshake.

This should be backported to 1.9.
2019-06-15 11:35:00 +02:00
Olivier Houchard
b4a8b2c63d BUG/MEDIUM: connections: Don't use ALPN to pick mux when in mode TCP.
In connect_server(), don't wait until we negociate the ALPN to choose the
mux, the only mux we want to use is the mux_pt anyway.

This should be backported to 1.9.
2019-06-15 11:34:55 +02:00
Olivier Houchard
fe50bfb82c MEDIUM: connections: Introduce a handshake pseudo-XPRT.
Add a new XPRT that is used when using non-SSL handshakes, such as proxy
protocol or Netscaler, instead of taking care of it in conn_fd_handler().
This XPRT is installed when any of those is used, and it removes itself once
the handshake is done.
This should allow us to remove the distinction between CO_FL_SOCK* and
CO_FL_XPRT*.
2019-06-05 18:03:38 +02:00
Olivier Houchard
14fcc2ebcc BUG/MEDIUM: servers: Don't attempt to destroy idle connections if disabled.
In connect_server(), when deciding if we should attempt to remove idle
connections, because we have to many file descriptors opened, don't attempt
to do so if idle connection pool is disabled (with pool-max-conn 0), as
if it is, srv->idle_orphan_conns won't even be allocated, and trying to
dereference it will cause a crash.
2019-06-05 13:58:06 +02:00
Willy Tarreau
7bb39d7cd6 CLEANUP: connection: remove the now unused CS_FL_REOS flag
Let's remove it before it gets uesd again. It was mostly replaced with
CS_FL_EOI and by mux-specific states or flags.
2019-06-03 14:23:33 +02:00
Alexander Liu
2a54bb74cd MEDIUM: connection: Upstream SOCKS4 proxy support
Have "socks4" and "check-via-socks4" server keyword added.
Implement handshake with SOCKS4 proxy server for tcp stream connection.
See issue #82.

I have the "SOCKS: A protocol for TCP proxy across firewalls" doc found
at "https://www.openssh.com/txt/socks4.protocol". Please reference to it.

[wt: for now connecting to the SOCKS4 proxy over unix sockets is not
 supported, and mixing IPv4/IPv6 is discouraged; indeed, the control
 layer is unique for a connection and will be used both for connecting
 and for target address manipulation. As such it may for example report
 incorrect destination addresses in logs if the proxy is reached over
 IPv6]
2019-05-31 17:24:06 +02:00
Olivier Houchard
250031e444 MEDIUM: sessions: Introduce session flags.
Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when
we use NTLM authentication, and we should reuse the last connection. This
should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST.

This should be backported to 1.9.
2019-05-29 15:41:47 +02:00
Christopher Faulet
a3f1550dfa MEDIUM: http/htx: Perform analysis relatively to the first block
The first block is the start-line, if defined. Otherwise it the head of the HTX
message. So now, during HTTP analysis, lookup are all done using the first block
instead of the head. Concretely, for now, it is the same because only one HTTP
message is stored at a time in an HTX message. 1xx informational messages are
handled separatly from the final reponse and from each other. But it will make
sense when the 1xx informational messages and the associated final reponse will
be stored in the same HTX message.
2019-05-28 07:42:12 +02:00
Christopher Faulet
297fbb45fe MINOR: htx: Replace the function http_find_stline() by http_get_stline()
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.

It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
2019-05-28 07:42:12 +02:00
Willy Tarreau
08e2b41e81 BUILD: connections: shut up gcc about impossible out-of-bounds warning
Since commit 88698d9 ("MEDIUM: connections: Add a way to control the
number of idling connections.") when building without threads, gcc
complains that the operations made on the idle_orphan_conns[] list is
out of bounds, which is always false since 1) <i> can only equal zero,
and 2) given it's equal to <tid> we never even enter the loop. But as
usual it thinks it knows better, so let's mask the origin of this <i>
value to shut it up. Another solution consists in making <i> unsigned
and adding an explicit range check.
2019-05-26 11:54:20 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers, it even results in some of them appearing in a random order
and multiple times just to benefit form an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all check to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
Olivier Houchard
4cd2af4e5d BUG/MEDIUM: ssl: Don't attempt to use early data with libressl.
Libressl doesn't yet provide early data, so don't put the CO_FL_EARLY_SSL_HS
on the connection if we're building with libressl, or the handshake will
never be done.
2019-05-06 15:20:42 +02:00
Olivier Houchard
865d8392bb MEDIUM: streams: Add a way to replay failed 0rtt requests.
Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to
replay requests for which we sent early data that got rejected by the
server.
If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set
on the server line even if the client didn't send early data.
2019-05-04 10:20:24 +02:00
Olivier Houchard
010941f876 BUG/MEDIUM: ssl: Use the early_data API the right way.
We can only read early data if we're a server, and write if we're a client,
so don't attempt to mix both.

This should be backported to 1.8 and 1.9.
2019-05-03 21:00:10 +02:00
Olivier Houchard
a48237fd07 BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown.
When for some reason the session is not the owner of the connection anymore,
make sure we remove CO_FL_SESS_IDLE, even if we're about to call
conn->mux->destroy(), as the destroy may not destroy the connection
immediately if it's still in use.
This should be backported to 1.9.
u
2019-05-02 12:08:39 +02:00
Christopher Faulet
46451d6e04 MINOR: gcc: Fix a silly gcc warning in connect_server()
Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.
2019-04-19 15:53:23 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Olivier Houchard
237f781f2d MEDIUM: backend: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Willy Tarreau
c912f94b57 MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
2019-02-28 16:08:54 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Olivier Houchard
7f1bc31fee MEDIUM: servers: Used a locked list for idle_orphan_conns.
Use the locked macros when manipulating idle_orphan_conns, so that other
threads can remove elements from it.
It will be useful later to avoid having a task per server and per thread to
cleanup the orphan list.
2019-02-26 18:17:32 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Olivier Houchard
e737103173 BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
Use atomic operations when dealing with srv->curr_idle_conns, as it's shared
between threads, otherwise we could get inconsistencies.

This should be backported to 1.9.
2019-02-21 19:07:19 +01:00
Christopher Faulet
f7679ad4db BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
A piece of code about the HTX was lost this summer, after the "big merge"
(htx/http2/connection layer refactoring). I forgot to keep HTX changes in the
functions connect_server() and assign_server(). So, this patch fixes "uri",
"url_param" and "hdr" LB algorithms when the HTX is enabled. It also fixes
evaluation of the "sni" expression on server lines.

This issue was reported on github. See issue #32.

This patch must be backported in 1.9.
2019-02-06 10:20:01 +01:00
Willy Tarreau
1da41ecf5b BUG/MINOR: backend: check srv_conn before dereferencing it
Commit 3c4e19f42 ("BUG/MEDIUM: backend: always release the previous
connection into its own target srv_list") introduced a valid warning
about a null-deref risk since we didn't check conn_new()'s return value.

This patch must be backported to 1.9 with the patch above.
2019-02-01 16:47:46 +01:00
Willy Tarreau
3c4e19f429 BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
There was a bug reported in issue #19 regarding the fact that haproxy
could mis-route requests to the wrong server. It turns out that when
switching to another server, the old connection was put back into the
srv_list corresponding to the stream's target instead of this connection's
target. Thus if this connection was later picked, it was pointing to the
wrong server.

The patch fixes this and also clarifies the assignment to srv_conn->target
so that it's clear we don't change it when picking it from the srv_list.

This must be backported to 1.9 only.
2019-02-01 11:58:33 +01:00
Olivier Houchard
7493114729 BUG/MEDIUM: servers: Close the connection if we failed to install the mux.
If we failed to install the mux, just close the connection, or
conn_fd_handler() will be called for the FD, and crash as soon as it attempts
to access the mux' wake method.

This should be backported to 1.9.
2019-01-29 19:53:09 +01:00
Olivier Houchard
26da323cb9 BUG/MEDIUM: servers: Don't add an incomplete conn to the server idle list.
If we failed to add the connection to the session, only try to add it back
to the server idle list if it has a mux, otherwise the connection is
incomplete and unusable.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Olivier Houchard
4dc85538ba BUG/MEDIUM: servers: Only destroy a conn_stream we just allocated.
In connect_server(), if we failed to add the connection to the session,
only destroy the conn_stream if we just allocated it, otherwise it may
have been allocated outside connect_server(), along with a connection which
has its destination address set.
Also use si_release_endpoint() instead of cs_destroy(), to make sure the
stream_interface doesn't reference it anymore.

This should be backported to 1.9.
2019-01-29 19:47:20 +01:00
Willy Tarreau
d822013f45 BUG/MEDIUM: backend: always call si_detach_endpoint() on async connection failure
In case an asynchronous connection (ALPN) succeeds but the mux fails to
attach, we must release the stream interface's endpoint, otherwise we
leave the stream interface with an endpoint pointing to a freed connection
with si_ops == si_conn_ops, and sess_update_st_cer() calls si_shutw() on
it, causing it to crash.

This must be backported to 1.9 only.
2019-01-28 16:33:35 +01:00
Olivier Houchard
9ef5155ba6 BUG/MEDIUM: servers: Attempt to reuse an unfinished connection on retry.
In connect_server(), if the previous connection failed, but had an alpn, no
mux was created, and thus the stream_interface's endpoint would be the
connection. In this case, instead of forgetting about it, and overriding
the stream_interface's endpoint later, try to reuse the connection, or the
connection will still be in the session's connection list, and will reference
to a stream that was probably destroyed.

This should be backported to 1.9.
2019-01-28 16:33:31 +01:00
Willy Tarreau
2c7deddc06 BUG/MEDIUM: backend: never try to attach to a mux having no more stream available
The code dealing with idle connections used to check the number of streams
available on the connection only to unlink the connection from the idle
list. But this still resulted in too many streams reusing the same connection
when they were already attached to it.

We must detect that there is no more room and refrain from using this
connection at all, and instead fall back to the no-reuse case. Ideally
we should try to search among other idle connections, but for a backport
let's stay safe.

This must be backported to 1.9.
2019-01-24 19:06:43 +01:00
Willy Tarreau
5ce6337254 BUG/MEDIUM: backend: also remove from idle list muxes that have no more room
The current test consists in removing muxes which report that they're going
to assign their last available stream, but a mux may already be saturated
without having passed in this situation at all. This is what happens with
mux_h2 when receiving a GOAWAY frame informing the mux about the ID of the
last stream the other end is willing to process. The limit suddenly changes
from near infinite to 0. Currently what happens is that such a mux remains
in the idle list for a long time and refuses all new streams. Now at least
it will only fail a single stream in a retryable way. A future improvement
should consist in trying to pick another connection from the idle list.

This fix must be backported to 1.9.
2019-01-24 13:53:06 +01:00
Olivier Houchard
09a0f03994 BUG/MEDIUM: servers: Make assign_tproxy_address work when ALPN is set.
If an ALPN is set on the server line, then when we reach assign_tproxy_address,
the stream_interface's endpoint will be a connection, not a conn_stream,
so make sure assign_tproxy_address() handles both cases.

This should be backported to 1.9.
2019-01-17 19:18:20 +01:00
Willy Tarreau
21c741a665 MINOR: backend: make the random algorithm support a number of draws
When an argument <draws> is present, it must be an integer value one
or greater, indicating the number of draws before selecting the least
loaded of these servers. It was indeed demonstrated that picking the
least loaded of two servers is enough to significantly improve the
fairness of the algorithm, by always avoiding to pick the most loaded
server within a farm and getting rid of any bias that could be induced
by the unfair distribution of the consistent list. Higher values N will
take away N-1 of the highest loaded servers at the expense of performance.
With very high values, the algorithm will converge towards the leastconn's
result but much slower. The default value is 2, which generally shows very
good distribution and performance. This algorithm is also known as the
Power of Two Random Choices and is described here :

http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
2019-01-14 19:33:17 +01:00
Willy Tarreau
a9a7249966 MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3}
The algo-specific settings move from the proxy to the LB algo this way :
  - uri_whole => arg_opt1
  - uri_len_limit => arg_opt2
  - uri_dirs_depth1 => arg_opt3
2019-01-14 19:33:17 +01:00
Willy Tarreau
9fed8586b5 MINOR: backend: make the header hash use arg_opt1 for use_domain_only
This is only a boolean extra arg. Let's map it to arg_opt1 and remove
hh_match_domain from struct proxy.
2019-01-14 19:33:17 +01:00
Willy Tarreau
484ff07691 MINOR: backend: make headers and RDP cookie also use arg_str/len
These ones used to rely on separate variables called hh_name/hh_len
but they are exclusive with the former. Let's use the same variable
which becomes a generic argument name and length for the LB algorithm.
2019-01-14 19:33:17 +01:00
Willy Tarreau
4c03d1c9b6 MINOR: backend: move url_param_name/len to lbprm.arg_str/len
This one is exclusively used by LB parameters, when using URL param
hashing. Let's move it to the lbprm struct under a more generic name.
2019-01-14 19:33:17 +01:00
Willy Tarreau
6c30be52da BUG/MINOR: backend: BE_LB_LKUP_CHTREE is a value, not a bit
There are a few instances where the lookup algo is tested against
BE_LB_LKUP_CHTREE using a binary "AND" operation while this macro
is a value among a set, and not a bit. The test happens to work
because the value is exactly 4 and no bit overlaps with the other
possible values but this is a latent bug waiting for a new LB algo
to appear to strike. At the moment the only other algo sharing a bit
with it is the "first" algo which is never supported in the same code
places.

This fix should be backported to maintained versions for safety if it
passes easily, otherwise it's not important as it will not fix any
visible issue.
2019-01-14 19:33:17 +01:00
Willy Tarreau
602a499da5 BUG/MINOR: backend: balance uri specific options were lost across defaults
The "balance uri" options "whole", "len" and "depth" were not properly
inherited from the defaults sections. In addition, "whole" and "len"
were not even reset when parsing "uri", meaning that 2 subsequent
"balance uri" statements would not have the expected effect as the
options from the first one would remain for the second one.

This may be backported to all maintained versions.
2019-01-14 19:33:17 +01:00
Olivier Houchard
5cd6217185 BUG/MEDIUM: server: Defer the mux init until after xprt has been initialized.
In connect_server(), if we're using a new connection, and we have to
initialize the mux right away, only do it so after si_connect() has been
called. si_connect() is responsible for initializing the xprt, and the
mux initialization may depend on the xprt being usable, as it may try to
receive data. Otherwise, the connection will be flagged as having an error,
and we will have to try to connect a second time.

This should be backported to 1.9.
2019-01-04 17:08:47 +01:00
Willy Tarreau
59884a646c MINOR: lb: allow redispatch when using consistent hash
Redispatch traditionally only worked for cookie based persistence.

Adding redispatch support for consistent hash based persistence - also
update docs.

Reported by Oskar Stenman on discourse:
https://discourse.haproxy.org/t/balance-uri-consistent-hashing-redispatch-3-not-redispatching/3344

Should be backported to 1.8.

Cc: Lukas Tribus <lukas@ltri.eu>
2019-01-02 20:22:17 +01:00