Commit Graph

1559 Commits

Author SHA1 Message Date
Olivier Houchard
4c28328572 MEDIUM: task: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
aa4d71a7fe MEDIUM: server: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
11ecfd1c01 MEDIUM: proxy: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
d5f9b19196 MEDIUM: freq_ctr: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
d360879fb5 MEDIUM: fd: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
a2735340fb MEDIUM: applets: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
92fce85d03 MINOR: fd: Remove debugging code.
Remove a debugging test, and call to abort, it's no longer needed.
2019-03-08 16:05:25 +01:00
Willy Tarreau
1e56c70cc9 OPTIM: task: limit the impact of memory barriers in taks_remove_from_task_list()
In this function we end up with successive locked operations then a
store barrier, and in addition the compiler has to emit less efficient
code due to a longer jump. There's no need for absolutely updating the
tasks_run_queue counter before clearing the task's leaf pointer, so
let's swap the two operations and benefit from a single barrier as much
as possible. This code is on the hot path and shows about half a percent
of improvement with 8 threads.
2019-03-07 18:44:12 +01:00
Willy Tarreau
b238b12e98 MINOR: task: use LIST_DEL_INIT() to remove a task from the queue
By using LIST_DEL_INIT() instead of LIST_DEL()+LIST_INIT() we manage
to bump the peak connection rate by no less than 3% on 8 threads.
The perf top profile shows much less contention in this area which
suffered from the second reload.
2019-03-07 11:45:44 +01:00
Frédéric Lécaille
5f33f85ce8 MINOR: sample: Extract some protocol buffers specific code.
We move the code responsible of parsing protocol buffers messages
inside gRPC messages from sample.c to include/proto/protocol_buffers.h
so that to reuse it to cascade "ungrpc" converter.
2019-03-06 15:36:02 +01:00
Frédéric Lécaille
756d97f205 MINOR: sample: Rework gRPC converter code.
For now on, "ungrpc" may take a second optional argument to provide
the protocol buffers types used to encode the field value to be extracted.
When absent the field value is extracted as a binary sample which may then
followed by others converters like "hex" which takes binary as input sample.
When this second argument is a type which does not match the one found by "ungrpc",
this field is considered as not found even if present.

With this patch we also remove the useless "varint" and "svarint" converters.

Update the documentation about "ungrpc" converters.
2019-03-05 11:04:23 +01:00
Frédéric Lécaille
7c93e88d0c MINOR: sample: Code factorization "ungrpc" converter.
Parsing protocol buffer fields always consists in skip the field
if the field is not found or store the field value if found.
So, with this patch we factorize a little bit the code for "ungrpc" converter.
2019-03-05 11:03:53 +01:00
Willy Tarreau
c8d5b95e6d MEDIUM: config: don't enforce a low frontend maxconn value anymore
Historically the default frontend's maxconn used to be quite low (2000),
which was sufficient two decades ago but often proved to be a problem
when users had purposely set the global maxconn value but forgot to set
the frontend's.

There is no point in keeping this arbitrary limit for frontends : when
the global maxconn is lower, it's already too high and when the global
maxconn is much higher, it becomes a limiting factor which causes trouble
in production.

This commit allows the value to be set to zero, which becomes the new
default value, to mean it's not directly limited, or in fact it's set
to the global maxconn. Since this operation used to be performed before
computing a possibly automatic global maxconn based on memory limits,
the calculation of the maxconn value and its propagation to the backends'
fullconn has now moved to a dedicated function, proxy_adjust_all_maxconn(),
which is called once the global maxconn is stabilized.

This comes with two benefits :
  1) a configuration missing "maxconn" in the defaults section will not
     limit itself to a magically hardcoded value but will scale up to the
     global maxconn ;

  2) when the global maxconn is not set and memory limits are used instead,
     the frontends' maxconn automatically adapts, and the backends' fullconn
     as well.
2019-02-28 17:05:32 +01:00
Willy Tarreau
e2711c7bd6 MINOR: listener: introduce listener_backlog() to report the backlog value
In an attempt to try to provide automatic maxconn settings, we need to
decorrelate a listner's backlog and maxconn so that these values can be
independent. This introduces a listener_backlog() function which retrieves
the backlog value from the listener's backlog, the frontend's, the
listener's maxconn, the frontend's or falls back to 1024. This
corresponds to what was done in cfgparse.c to force a value there except
the last fallback which was not set since the frontend's maxconn is always
known.
2019-02-28 17:05:29 +01:00
Willy Tarreau
c912f94b57 MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize
the removed element, there's no need for keeping this LIST_INIT() call in the
idle connection code.
2019-02-28 16:08:54 +01:00
Willy Tarreau
1efafce61f MINOR: listener: implement multi-queue accept for threads
There is one point where we can migrate a connection to another thread
without taking risk, it's when we accept it : the new FD is not yet in
the fd cache and no task was created yet. It's still possible to assign
it a different thread than the one which accepted the connection. The
only requirement for this is to have one accept queue per thread and
their respective processing tasks that have to be woken up each time
an entry is added to the queue.

This is a multiple-producer, single-consumer model. Entries are added
at the queue's tail and the processing task is woken up. The consumer
picks entries at the head and processes them in order. The accept queue
contains the fd, the source address, and the listener. Each entry of
the accept queue was rounded up to 64 bytes (one cache line) to avoid
cache aliasing because tests have shown that otherwise performance
suffers a lot (5%). A test has shown that it's important to have at
least 256 entries for the rings, as at 128 it's still possible to fill
them often at high loads on small thread counts.

The processing task does almost nothing except calling the listener's
accept() function and updating the global session and SSL rate counters
just like listener_accept() does on synchronous calls.

At this point the accept queue is implemented but not used.
2019-02-27 14:27:07 +01:00
Willy Tarreau
b2b50a7784 MINOR: listener: pre-compute some thread counts per bind_conf
In order to quickly pick a thread ID when accepting a connection, we'll
need to know certain pre-computed values derived from the thread mask,
which are counts of bits per position multiples of 1, 2, 4, 8, 16 and
32. In practice it is sufficient to compute only the 4 first ones and
store them in the bind_conf. We update the count every time the
bind_thread value is adjusted.

The fields in the bind_conf struct have been moved around a little bit
to make it easier to group all thread bit values into the same cache
line.

The function used to return a thread number is bind_map_thread_id(),
and it maps a number between 0 and 31/63 to a thread ID between 0 and
31/63, starting from the left.
2019-02-27 14:27:07 +01:00
Olivier Houchard
9ea5d361ae MEDIUM: servers: Reorganize the way idle connections are cleaned.
Instead of having one task per thread and per server that does clean the
idling connections, have only one global task for every servers.
That tasks parses all the servers that currently have idling connections,
and remove half of them, to put them in a per-thread list of connections
to kill. For each thread that does have connections to kill, wake a task
to do so, so that the cleaning will be done in the context of said thread.
2019-02-26 18:17:32 +01:00
Olivier Houchard
7f1bc31fee MEDIUM: servers: Used a locked list for idle_orphan_conns.
Use the locked macros when manipulating idle_orphan_conns, so that other
threads can remove elements from it.
It will be useful later to avoid having a task per server and per thread to
cleanup the orphan list.
2019-02-26 18:17:32 +01:00
Frédéric Lécaille
1fceee8316 MINOR: http_fetch: add "req.ungrpc" sample fetch for gRPC.
This patch implements "req.ungrpc" sample fetch method to decode and
parse a gRPC request. It takes only one argument: a protocol buffers
field number to identify the protocol buffers message number to be looked up.
This argument is a sort of path in dotted notation to the terminal field number
to be retrieved.

  ex:
    req.ungrpc(1.2.3.4)

This sample fetch catch the data in raw mode, without interpreting them.
Some protocol buffers specific converters may be used to convert the data
to the correct type.
2019-02-26 16:27:05 +01:00
Christopher Faulet
c6827d52c1 MINOR: channel/htx: Add function to skips output bytes from an HTX channel
It is the HTX version of co_skip(). Internally, It uses the function htx_drain().

It will be used by other commits to fix bugs, so it must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Christopher Faulet
729b5b308c BUG/MINOR: channel: Set CF_WROTE_DATA when outgoing data are skipped
in co_skip(), the flag CF_WRITE_PARTIAL is set on the channel. The flag
CF_WROTE_DATA must also be set to notify the channel some data were sent.

This patch must be backported to 1.9.
2019-02-26 14:04:23 +01:00
Richard Russo
bc9d9844d5 BUG/MAJOR: fd/threads, task/threads: ensure all spin locks are unlocked
Calculate if the fd or task should be locked once, before locking, and
reuse the calculation when determing when to unlock.

Fixes a race condition added in 87d54a9a for fds, and b20aa9ee for tasks,
released in 1.9-dev4. When one thread modifies thread_mask to be a single
thread for a task or fd while a second thread has locked or is waiting on a
lock for that task or fd, the second thread will not unlock it.  For FDs,
this is observable when a listener is polled by multiple threads, and is
closed while those threads have events pending.  For tasks, this seems
possible, where task_set_affinity is called, but I did not observe it.

This must be backported to 1.9.
2019-02-25 16:16:36 +01:00
Willy Tarreau
2d7f81b809 MINOR: fd: add a new my_closefrom() function to close all FDs
This is a naive implementation of closefrom() which closes all FDs
starting from the one passed in argument. closefrom() is not provided
on all operating systems, and other versions will follow.
2019-02-21 22:19:17 +01:00
Olivier Houchard
f131481a0a BUG/MEDIUM: servers: Add a per-thread counter of idle connections.
Add a per-thread counter of idling connections, and use it to determine
how many connections we should kill after the timeout, instead of using
the global counter, or we're likely to just kill most of the connections.

This should be backported to 1.9.
2019-02-21 19:07:45 +01:00
Olivier Houchard
e737103173 BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns.
Use atomic operations when dealing with srv->curr_idle_conns, as it's shared
between threads, otherwise we could get inconsistencies.

This should be backported to 1.9.
2019-02-21 19:07:19 +01:00
Frédéric Lécaille
76d2cef0c2 BUG/MEDIUM: peers: Missing peer initializations.
Initialize ->srv peer field for all the peers, the local peer included.
Indeed, a haproxy process needs to connect to the local peer of a remote
process. Furthermore, when a "peer" or "server" line is parsed by parse_server()
the address must be copied to ->addr field of the peer object only if this address
has been also parsed by parse_server(). This is not the case if this address belongs
to the local peer and is provided on a "server" line.

After having parsed the "peer" or "server" lines of a peer
sections, the ->srv part of all the peer must be initialized for SSL, if
enabled. Same thing for the binding part.

Revert 1417f0b commit which is no more required.

No backport is needed, this is purely 2.0.
2019-02-12 19:49:22 +01:00
Willy Tarreau
1417f0b5dc BUG/MEDIUM: peers: check that p->srv actually exists before using p->srv->use_ssl
Commit 1055e687a ("MINOR: peers: Make outgoing connection to SSL/TLS
peers work.") introduced an "srv" field in the peers, which points to
the equivalent server to hold SSL settings. This one is not set when
the peer is local so we must always test it before testing p->srv->use_ssl
otherwise haproxy dies during reloads.

No backport is needed, this is purely 2.0.
2019-02-08 10:22:31 +01:00
Willy Tarreau
980855bd95 BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
This also depends on the nbthread count, so it must only be performed after
parsing the whole config file. As a side effect, this removes some code
duplication between servers and server-templates.

This must be backported to 1.9.
2019-02-07 15:08:13 +01:00
Willy Tarreau
00f18a36b6 BUG/MINOR: server: fix logic flaw in idle connection list management
With variable connection limits, it's not possible to accurately determine
whether the mux is still in use by comparing usage and max to be equal due
to the fact that one determines the capacity and the other one takes care
of the context. This can cause some connections to be dropped before they
reach their stream ID limit.

It seems it could also cause some connections to be terminated with
streams still alive if the limit was reduced to match the newly computed
avail_streams() value, though this cannot yet happen with existing muxes.

Instead let's switch to usage reports and simply check whether connections
are both unused and available before adding them to the idle list.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Willy Tarreau
0f9cd7b196 MINOR: stream-int: add a new flag to mention that we want the connection to be killed
The new flag SI_FL_KILL_CONN is now set by the rare actions which
deliberately want the whole connection (and not just the stream) to be
killed. This is only used for "tcp-request content reject",
"tcp-response content reject", "tcp-response content close" and
"http-request reject". The purpose is to desambiguate the close from
a regular shutdown. This will be used by the next patches.
2019-01-31 19:38:25 +01:00
Olivier Houchard
8788b4111c BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE.
If we're adding a connection to the server orphan idle list, don't forget
to remove the CO_FL_SESS_IDLE flag, or we will assume later it's still
attached to a session.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Willy Tarreau
13afcb7ab3 BUG/MINOR: task: fix possibly missed event in inter-thread wakeups
There's a very small but existing uncertainty window when waking another
thread up where it is possible for task_wakeup() not to wake the other
task up because it's still running while this once is in the process of
finishing and loses its TASK_RUNNING flag. In this case the wakeup will
be missed.

The problem is that we have a single flag to store 3 states, since the
transition from running to sleeping isn't atomic. Thus we need to have
another flag to cover this part. This patch introduces TASK_QUEUED to
mention that the task is already in the run queue, running or not. This
bit will be removed while TASK_RUNNING is kept once dequeued, and will
be used when removing TASK_RUNNING to check if the task has been requeued.

It might be possible to slightly improve this but the occurrence rate
is quite low and we don't really need to complexify the scheduler to
optimize for a rare case.

The impact with the current code is very low since we have few inter-
thread wakeups. Most of them are caused by checks killing sessions.

This must be backported to 1.9.
2019-01-28 15:03:04 +01:00
Willy Tarreau
bf66bd1b8b MEDIUM: stream-int: always mark pending outgoing SI_ST_CON
Before the first send() attempt, we should be in SI_ST_CON, not
SI_ST_EST, since we have not yet attempted to send and we are
allowed to retry. This is particularly important with complex
outgoing muxes which can fail during the first send attempt (e.g.
failed stream ID allocation).

It only requires that sess_update_st_con_tcp() knows about this
possibility, as we must not forcefully close a reused connection
when facing an error in this case, this will be handled later.

This may be backported to 1.9 with care after some observation period.
2019-01-24 19:06:43 +01:00
Frédéric Lécaille
355b2033ec MINOR: cfgparse: SSL/TLS binding in "peers" sections.
Make "bind" keywork be supported in "peers" sections.
All "bind" settings are supported on this line.
Add "default-bind" option to parse the binding options excepted the bind address.
Do not parse anymore the bind address for local peers on "server" lines.
Do not use anymore list_for_each_entry() to set the "peers" section
listener parameters because there is only one listener by "peers" section.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Frédéric Lécaille
1055e687a2 MINOR: peers: Make outgoing connection to SSL/TLS peers work.
This patch adds pointer to a struct server to peer structure which
is initialized after having parsed a remote "peer" line.

After having parsed all peers section we run ->prepare_srv to initialize
all SSL/TLS stuff of remote perr (or server).

Remaining thing to do to completely support peer protocol over SSL/TLS:
make "bind" keyword be supported in "peers" sections to make SSL/TLS
incoming connections to local peers work.

May be backported to 1.5 and newer.
2019-01-18 14:26:21 +01:00
Tim Duesterhus
8b87c01c4d BUG/MINOR: stick_table: Prevent conn_cur from underflowing
When using the peers feature a race condition could prevent
a connection from being properly counted. When this connection
exits it is being "uncounted" nonetheless, leading to a possible
underflow (-1) of the conn_curr stick table entry in the following
scenario :

  - Connect to peer A     (A=1, B=0)
  - Peer A sends 1 to B   (A=1, B=1)
  - Kill connection to A  (A=0, B=1)
  - Connect to peer B     (A=0, B=2)
  - Peer A sends 0 to B   (A=0, B=0)
  - Peer B sends 0/2 to A (A=?, B=0)
  - Kill connection to B  (A=?, B=-1)
  - Peer B sends -1 to A  (A=-1, B=-1)

This fix may be backported to all supported branches.
2019-01-15 15:34:49 +01:00
Emeric Brun
9e7547740c MINOR: ssl: add support of aes256 bits ticket keys on file and cli.
Openssl switched from aes128 to aes256 since may 2016  to compute
tls ticket secrets used by default. But Haproxy still handled only
128 bits keys for both tls key file and CLI.

This patch permit the user to set aes256 keys throught CLI or
the key file (80 bytes encoded in base64) in the same way that
aes128 keys were handled (48 bytes encoded in base64):
- first 16 bytes for the key name
- next 16/32 bytes for aes 128/256 key bits key
- last 16/32 bytes for hmac 128/256 bits

Both sizes are now supported (but keys from same file must be
of the same size and can but updated via CLI only using a key of
the same size).

Note: This feature need the fix "dec func ignores padding for output
size checking."
2019-01-14 19:32:58 +01:00
Willy Tarreau
762475e1f9 BUG/MEDIUM: connection: properly unregister the mux on failed initialization
When mux->init() fails, session_free() will call it again to unregister
it while it was already done, resulting in null derefs or use-after-free.
This typically happens on out-of-memory conditions during H1 or H2 connection
or stream allocation.

This fix must be backported to 1.9.
2019-01-10 19:47:43 +01:00
Christopher Faulet
f7ed195ac8 MINOR: channel/htx: Add the HTX version of channel_truncate/erase
The function channel_htx_truncate() can now be used on HTX buffer to truncate
all incoming data, keeping outgoing one intact. This function relies on the
function channel_htx_erase() and htx_truncate().

This patch may be backported to 1.9. If so, the patch "MINOR: channel/htx: Add
the HTX version of channel_truncate()" must also be backported.
2019-01-08 12:06:55 +01:00
Christopher Faulet
5811db0043 MINOR: channel/htx: Add HTX version for some helper functions
HTX versions for functions to test the free space in input against the reserve
have been added. Now, on HTX streams, following functions can be used:

  * channel_htx_may_recv
  * channel_htx_recv_limit
  * channel_htx_recv_max
  * channel_htx_full

This patch must be backported in 1.9 because it will be used by a futher patch
to fix a bug.
2019-01-07 16:32:05 +01:00
Willy Tarreau
59884a646c MINOR: lb: allow redispatch when using consistent hash
Redispatch traditionally only worked for cookie based persistence.

Adding redispatch support for consistent hash based persistence - also
update docs.

Reported by Oskar Stenman on discourse:
https://discourse.haproxy.org/t/balance-uri-consistent-hashing-redispatch-3-not-redispatching/3344

Should be backported to 1.8.

Cc: Lukas Tribus <lukas@ltri.eu>
2019-01-02 20:22:17 +01:00
Christopher Faulet
e64582929f MINOR: channel: Add the function channel_add_input
This function must be called when new incoming data are pushed in the channel's
buffer. It updates the channel state and take care of the fast forwarding by
consuming right amount of data and decrementing "->to_forward" accordingly when
necessary. In fact, this patch just moves a part of ci_putblk in a dedicated
function.

This patch must be backported to 1.9.
2019-01-02 20:12:44 +01:00
Olivier Houchard
a2dbeb22fc MEDIUM: sessions: Keep track of which connections are idle.
Instead of keeping track of the number of connections we're responsible for,
keep track of the number of connections we're responsible for that we are
currently considering idling (ie that we are not using, they may be in use
by other sessions), that way we can actually reuse connections when we have
more connections than the max configured.
2018-12-28 19:16:03 +01:00
Olivier Houchard
351411facd BUG/MAJOR: sessions: Use an unlimited number of servers for the conn list.
When a session adds a connection to its connection list, we used to remove
connections for an another server if there were not enough room for our
server. This can't work, because those lists are now the list of connections
we're responsible for, not just the idle connections.
To fix this, allow for an unlimited number of servers, instead of using
an array, we're now using a linked list.
2018-12-28 16:33:13 +01:00
Olivier Houchard
09e498f1a1 BUG/MEDIUM: tasks: Decrement tasks_run_queue in tasklet_free().
If the tasklet is in the list, don't forget to decrement tasks_run_queue
in tasklet_free().

This should be backported to 1.9.
2018-12-24 14:04:55 +01:00
Olivier Houchard
ab28a320aa MINOR: ssl: Add ssl_sock_set_alpn().
Add a new function, ssl_sock_set_alpn(), to be able to change the ALPN
for a connection, instead of relying of the one defined in the SSL_CTX.
2018-12-21 19:53:30 +01:00
Olivier Houchard
8ab8a6eee5 BUG/MAJOR: connections: Close the connection before freeing it.
In si_release_endpoint(), if the end point is a connection, because we don't
know which mux to use it, make sure we close the connection before freeing it,
or else, we'd have a fd left for polling, which would point to a now free'd
connection.

This should be backported to 1.9.
2018-12-20 06:03:14 +01:00
Willy Tarreau
e9f4301f0f MINOR: connection: add cs_set_error() to set the error bits
Depending on the CS_FL_EOS status, we either set CS_FL_ERR_PENDING
or CS_FL_ERROR at various places. Let's have a generic function to
do this.
2018-12-19 18:13:52 +01:00
Willy Tarreau
14bfe9af12 CLEANUP: stream-int: consistently call the si/stream_int functions
As long-time changes have accumulated over time, the exported functions
of the stream-interface were almost all prefixed "si_<something>" while
most private ones (mostly callbacks) were called "stream_int_<something>".
There were still a few confusing exceptions, which were addressed to
follow this shcme :
  - stream_sock_read0(), only used internally, was renamed stream_int_read0()
    and made static
  - stream_int_notify() is only private and was made static
  - stream_int_{check_timeouts,report_error,retnclose,register_handler,update}
    were renamed si_<something>.

Now it is clearer when checking one of these if it risks to be used outside
or not.
2018-12-19 15:25:43 +01:00