6236 Commits

Author SHA1 Message Date
Willy Tarreau
1b4cf9b754 BUG/MINOR: h1: make the HTTP/1 status code parser check for digits
The H1 parser used by the H2 gateway was a bit lax and could validate
non-numbers in the status code. Since it computes the code on the fly
it's problematic, as "30:" is read as status code 310. Let's properly
check that it's a number now. No backport needed.
2017-11-09 11:15:45 +01:00
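Why "30:" can be read as 310 is easier to see in code. The sketch below is not the HAProxy parser: both function names are hypothetical, and the strict variant assumes a status code is exactly three digits. It only shows how computing the code on the fly without a digit check accepts ':' (ASCII 58, i.e. '0' + 10), and what a per-byte check looks like.

```c
#include <assert.h>
#include <stddef.h>

/* Lax variant (hypothetical): accumulates (c - '0') for every byte without
 * checking that the byte is a digit. ':' is '0' + 10 in ASCII, so "30:"
 * yields 3*100 + 0*10 + 10 = 310. */
int parse_status_lax(const char *s, size_t len)
{
    int code = 0;

    assert(s != NULL);
    for (size_t i = 0; i < len; i++)
        code = code * 10 + (s[i] - '0');
    return code;
}

/* Strict variant (hypothetical): rejects any byte outside '0'..'9'.
 * Assumes exactly three digits; returns -1 on error. */
int parse_status_strict(const char *s, size_t len)
{
    int code = 0;

    assert(s != NULL);
    if (len != 3)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char d = (unsigned char)s[i] - '0';

        if (d > 9)          /* also catches bytes below '0' after wrap-around */
            return -1;
        code = code * 10 + d;
    }
    return code;
}
```

With these definitions, `parse_status_lax("30:", 3)` returns 310 while `parse_status_strict("30:", 3)` returns -1.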
Willy Tarreau
ddfbd83780 BUILD: shctx: do not depend on openssl anymore
The build breaks on a machine without openssl/crypto.h because shctx
still loads openssl-compat.h while it doesn't need it anymore since
the code was moved :

In file included from src/shctx.c:20:0:
include/proto/openssl-compat.h:3:28: fatal error: openssl/crypto.h: No such file or directory
 #include <openssl/crypto.h>

Just remove include openssl-compat from shctx.
2017-11-08 14:33:36 +01:00
Willy Tarreau
46c9d3e6cb BUILD: ssl: fix build of backend without ssl
Commit 522eea7 ("MINOR: ssl: Handle sending early data to server.") added
a dependency on SRV_SSL_O_EARLY_DATA which only exists when USE_OPENSSL
is defined (which is probably not the best solution) and breaks the build
when ssl is not enabled. Just add an ifdef USE_OPENSSL around the block
for now.
2017-11-08 14:28:08 +01:00
Olivier Houchard
522eea7110 MINOR: ssl: Handle sending early data to server.
This adds a new keyword on the "server" line, "allow-0rtt". If set, we'll try
to send early data to the server, but only when the client itself sent early
data: if the server rejects the early data, we no longer have them and can't
resend them, so the only option we have is to send back a 425, and we need
to be sure the client knows how to interpret it correctly.
2017-11-08 14:11:10 +01:00
Olivier Houchard
cfdef2e312 MINOR: ssl: Spell 0x10101000L correctly.
Issue added in 1.8-dev by c2aae74 ("MEDIUM: ssl: Handle early data with
OpenSSL 1.1.1"), no impact on older versions.
2017-11-08 14:10:02 +01:00
Olivier Houchard
bd84ac8737 MINOR: ssl: Handle session resumption with TLS 1.3
With TLS 1.3, sessions aren't established until after the main handshake
has completed. So we can't just rely on calling SSL_get1_session(). Instead,
we now register a callback for the "new session" event. This should work for
previous versions of TLS as well.
2017-11-08 14:08:07 +01:00
Olivier Houchard
35a63cc1c7 BUG/MINOR: ssl: Don't assume we have a ssl_bind_conf because a SNI is matched.
We only have a ssl_bind_conf if crt-list is used, however we can still
match a certificate SNI, so don't assume we have a ssl_bind_conf.
2017-11-08 14:08:07 +01:00
Willy Tarreau
9e45b33f7e BUG/MAJOR: threads/tasks: fix the scheduler again
My recent change in commit ce4e0aa ("MEDIUM: task: change the construction
of the loop in process_runnable_tasks()") was bogus as it used to keep the
rq_next across an unlock/lock sequence, occasionally leading to crashes for
tasks that are eligible to any thread. We must use the lookup call for each
new batch instead. The problem is easily triggered with such a configuration :

    global
        nbthread 4

    listen check
        mode http
        bind 0.0.0.0:8080
        redirect location /
        option httpchk GET /
        server s1 127.0.0.1:8080 check inter 1
        server s2 127.0.0.1:8080 check inter 1

Thanks to Olivier for diagnosing this one. No backport is needed.
2017-11-08 14:05:19 +01:00
Willy Tarreau
ecd2e15919 BUG/MINOR: stream-int: don't set MSG_MORE on closed request path
Commit 4ac4928 ("BUG/MINOR: stream-int: don't set MSG_MORE on SHUTW_NOW
without AUTO_CLOSE") was incomplete. H2 reveals another situation where
the input stream is marked closed with the request and we set MSG_MORE,
causing a delay before the request leaves.

Better avoid setting the flag on the request path for close cases in
general.
2017-11-07 15:07:25 +01:00
Emeric Brun
11f5886e5c BUG/MINOR: comp: fix compilation warning compiling without compression.
This is specific to threads, no backport is needed.
2017-11-07 14:48:13 +01:00
Emeric Brun
d8b3b65faa BUG/MEDIUM: splice/threads: pipe reuse list was not protected.
The list is now protected using a global spinlock.
2017-11-07 14:47:28 +01:00
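As a rough illustration of the fix described in the entry above, the sketch below guards a shared reuse list with one global spinlock. It is not the HAProxy code: the structure, the function names, and the C11 `atomic_flag` spinlock are all assumptions chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical pipe descriptor kept in a LIFO reuse list. */
struct pipe_entry {
    struct pipe_entry *next;
    int fds[2];
};

static atomic_flag pool_lock = ATOMIC_FLAG_INIT;   /* global spinlock */
static struct pipe_entry *pool_head = NULL;        /* shared reuse list */

static void pool_lock_acquire(void)
{
    /* spin until the flag was previously clear */
    while (atomic_flag_test_and_set_explicit(&pool_lock, memory_order_acquire))
        ;
}

static void pool_lock_release(void)
{
    atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/* Return an unused pipe to the list; the list is only touched under the lock. */
void pipe_pool_put(struct pipe_entry *p)
{
    assert(p != NULL);
    pool_lock_acquire();
    p->next = pool_head;
    pool_head = p;
    pool_lock_release();
}

/* Take a pipe from the list, or return NULL if the list is empty. */
struct pipe_entry *pipe_pool_get(void)
{
    struct pipe_entry *p;

    pool_lock_acquire();
    p = pool_head;
    if (p)
        pool_head = p->next;
    pool_lock_release();

    if (p)
        p->next = NULL;
    return p;
}
```

Without the lock, two threads popping at the same time could both read the same head and hand out the same entry twice.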
Willy Tarreau
926fa4c098 BUG/MINOR: h2: don't send GOAWAY on failed response
As part of the detection for intentional closes, we can kill the
connection if a shutw() happens before the headers. But it can also
happen that an invalid response is not properly parsed, preventing
any headers frame from being sent and making the function believe
it was an abort. Now instead we check if any response was received
from the stream, regardless of the fact that it was properly
converted.
2017-11-07 14:47:04 +01:00
Willy Tarreau
c4312d3dfd MINOR: h2: add new stream flag H2_SF_OUTGOING_DATA
This one indicates whether we've received data to mux out. It helps
tell the difference between a clean close and an erroneous one.
2017-11-07 14:47:04 +01:00
Willy Tarreau
58e3208714 BUG/MINOR: h2: correctly check for H2_SF_ES_SENT before closing
In h2_shutw() we must not send another empty frame (nor RST) after
one has been sent, as the stream is already in HLOC/CLOSED state.
2017-11-07 14:47:04 +01:00
Willy Tarreau
6d8b682f9a BUG/MEDIUM: h2: properly set H2_SF_ES_SENT when sending the final frame
When sending DATA+ES, it's important to set H2_SF_ES_SENT as we don't
want to emit it several times nor to send an RST afterwards.
2017-11-07 14:47:04 +01:00
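The two entries above both rely on one "end of stream already sent" flag. The sketch below shows that idea in isolation, assuming a simplified stream structure; `SF_ES_SENT`, `send_data()` and `shutw()` are hypothetical stand-ins, not the real H2 mux functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SF_ES_SENT 0x01u   /* hypothetical flag: END_STREAM already emitted */

struct stream {
    uint32_t flags;
    int frames_sent;       /* counts emitted frames, for illustration only */
};

/* Emit a DATA frame; when it carries END_STREAM, remember it.
 * Returns 1 if a frame was emitted, 0 if the stream was already ended. */
int send_data(struct stream *s, int end_stream)
{
    assert(s != NULL);
    if (s->flags & SF_ES_SENT)
        return 0;
    s->frames_sent++;
    if (end_stream)
        s->flags |= SF_ES_SENT;   /* the step the fix makes sure is not skipped */
    return 1;
}

/* Shut the write side: only emit an empty END_STREAM frame if none was sent.
 * Returns 1 if a frame was emitted, 0 otherwise. */
int shutw(struct stream *s)
{
    assert(s != NULL);
    if (s->flags & SF_ES_SENT)
        return 0;                 /* already half-closed: no empty frame, no RST */
    s->frames_sent++;
    s->flags |= SF_ES_SENT;
    return 1;
}
```

If `send_data()` forgot to set the flag, a later `shutw()` would emit a second end-of-stream frame on a stream the peer already considers closed.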
Willy Tarreau
e6ae77f64f MINOR: h2: don't re-enable the connection's task when we're closing
It's pointless to requeue the task when we're closing, so swap the
order of the task_queue() and h2_release(). It also matches what
was written in the comment regarding re-arming the timer.
2017-11-07 14:47:04 +01:00
Willy Tarreau
83906c2f91 BUG/MEDIUM: h2: don't close the connection if there are data left
h2_detach() is called after a stream was closed, and it evaluates if it's
worth closing the connection. The issue there is that the connection is
closed too early in case there's demand for closing after the last stream,
even if some data remain in the mux. Let's change the condition to check
for this.
2017-11-07 14:47:04 +01:00
Christopher Faulet
2a944ee16b BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix
This removes any name conflicts, especially on Solaris.
2017-11-07 11:10:24 +01:00
Willy Tarreau
7d8e4af46a BUG/MEDIUM: h2: fix some wrong error codes on connections
When the assignment of the connection state was moved into h2c_error(),
3 of them were missed because they were wrong, using H2_SS_ERROR instead.
This resulted in the connection's state being set to H2_CS_ERROR2 in fact,
so the error was not properly sent.
2017-11-07 11:08:28 +01:00
Willy Tarreau
721c974e5e MEDIUM: h2: remove the H2_SS_RESET intermediate state
This one was created to maintain the knowledge that a stream was closed
after having sent an RST_STREAM frame but that's not needed anymore and
it confuses certain conditions on the error processing path. It's time
to get rid of it.
2017-11-07 11:05:42 +01:00
Willy Tarreau
319994a2e9 BUG/MEDIUM: h2: don't try (and fail) to send non-existing data in the mux
The call to xprt->snd_buf() was not conditioned on the presence of
data in the buffer, resulting in snd_buf() returning 0 and never
disabling the polling. It was revealed by the previous bug on error
processing but must properly be handled.
2017-11-07 11:03:56 +01:00
Willy Tarreau
3eabe9b174 BUG/MEDIUM: h2: properly send the GOAWAY frame in the mux
A typo on a condition prevented H2_CS_ERROR from being processed,
leading to an infinite loop on connection error.
2017-11-07 11:03:01 +01:00
Willy Tarreau
c6795ca7c1 BUG/MEDIUM: h2: properly send an RST_STREAM on mux stream error
Some stream errors are detected on the MUX path (eg: H1 response
encoding). These ones forgot to emit an RST_STREAM frame, causing the
client to wait and/or to see the connection being immediately closed.
This is now fixed.
2017-11-07 09:43:06 +01:00
Willy Tarreau
6743420778 BUG/MINOR: h2: set the "HEADERS_SENT" flag on stream, not connection
This flag was added after the GOAWAY flags were introduced and mistakenly
placed in the connection, but that doesn't make sense as it's specific to
the stream. The main impact is the risk of returning a DATA0+ES frame for
an error instead of an RST_STREAM.
2017-11-06 20:20:51 +01:00
Olivier Houchard
283810773a BUG/MINOR: dns: Don't lock the server lock in snr_check_ip_callback().
snr_check_ip_callback() may be called with the server lock, so don't attempt
to lock it again, instead, make sure the callers always have the lock before
calling it.
2017-11-06 18:34:42 +01:00
Olivier Houchard
55dcdf4c39 BUG/MINOR: dns: Don't try to get the server lock if it's already held.
dns_link_resolution() can be called with the server lock already held, so
don't attempt to lock it again in that case.
2017-11-06 18:34:24 +01:00
Willy Tarreau
f0c531ab55 MEDIUM: tasks: implement a lockless scheduler for single-thread usage
The scheduler is complex and uses local queues to amortize the cost of
locks. But all this comes with a cost that is quite observable with
single-thread workloads.

The purpose of this patch is to reimplement the much simpler scheduler
for the case where threads are not used. The code is very small and
simple. It doesn't impact the multi-threaded performance at all, and
provides a nice 10% performance increase in single-thread by reaching
606kreq/s on the tests that showed 550kreq/s before.
2017-11-06 11:20:11 +01:00
Willy Tarreau
9d4b56b88e MINOR: tasks: only visit filled task slots after processing them
process_runnable_tasks() needs to requeue or wake up tasks after
processing them in batches. By only refilling the existing ones, we
avoid revisiting all the queue. The performance gain is measurable
starting with two threads, where the request rate climbs to 657k/s
compared to 644k.
2017-11-06 11:20:11 +01:00
Willy Tarreau
ce4e0aa7f3 MEDIUM: task: change the construction of the loop in process_runnable_tasks()
This patch slightly rearranges the loop to pack the locked code a little
bit, and to try to concentrate accesses to the tree together to benefit
more from the cache.

It also fixes how the loop handles the right margin : now that it is guaranteed
that the retrieved nodes are filtered to only match the current thread, we
don't need to rewind every 16 entries. Instead we can rewind each time we
reach the right margin again.

With this change, we now achieve the following performance for 10 H2 conns
each containing 100 streams :

   1 thread : 550kreq/s
   2 threads : 644kreq/s
   3 threads : 598kreq/s
2017-11-06 11:20:11 +01:00
Willy Tarreau
b992ba16ef MINOR: task: simplify wake_expired_tasks() to avoid unlocking in the loop
This function is sensitive, let's make it shorter by factoring out the
unlock and leave code. This reduced the function's size by a few tens
of bytes and increased the overall performance by about 1%.
2017-11-06 11:20:11 +01:00
Willy Tarreau
8d38805d3d MAJOR: task: make use of the scope-aware ebtree functions
Currently the task scheduler suffers from an O(n) lookup when
skipping tasks that are not for the current thread. The reason
is that eb32_lookup_ge() has no information about the current
thread so it always revisits many tasks for other threads before
finding its own tasks.

This is particularly visible with HTTP/2 since the number of
concurrent streams created at once causes long series of tasks
for the same stream in the scheduler. With only 10 connections
and 100 streams each, by running on two threads, the performance
drops from 640kreq/s to 11.2kreq/s! Lookup metrics show that for
only 200000 task lookups, 430 million skips had to be performed,
which means that on average, each lookup leads to 2150 nodes to
be visited.

This commit backports the principle of scope lookups for ebtrees
from the ebtree_v7 development tree. The idea is that each node
contains a mask indicating the union of the scopes for the nodes
below it, which is fed during insertion, and used during lookups.

Then during lookups, branches that do not contain any leaf matching
the requested scope are simply ignored. This perfectly matches a
thread mask, allowing a thread to only extract the tasks it cares
about from the run queue, and to always find them in O(log(n))
instead of O(n). Thus the scheduler uses tid_bit and
task->thread_mask as the ebtree scope here.

Doing this has recovered most of the performance, as can be seen on
the test below with two threads, 10 connections, 100 streams each,
and 1 million requests total :

                              Before     After    Gain
              test duration : 89.6s      4.73s     x19
    HTTP requests/s (DEBUG) : 11200     211300     x19
     HTTP requests/s (PROD) : 15900     447000     x28
             spin_lock time : 85.2s      0.46s    /185
            time per lookup : 13us       40ns     /325

Even when going to 6 threads (on 3 hyperthreaded CPU cores), the
performance stays around 284000 req/s, showing that the contention
is much lower.

A test showed that there's no benefit in using this for the wait queue
though.
2017-11-06 11:20:11 +01:00
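The scope lookup principle described in the entry above can be sketched on a plain binary search tree. This is not ebtree and not the HAProxy scheduler: the node layout and the `insert()` / `lookup_ge()` names are assumptions, and the tree is unbalanced, so it shows the pruning only and does not guarantee the O(log(n)) bound. Each node carries the union of the scopes found in its subtree, fed during insertion, and lookups skip any branch whose union does not intersect the requested scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node {
    struct node *left, *right;
    uint32_t key;            /* e.g. an expiration date or queue position */
    unsigned long scope;     /* bit mask of threads allowed to take this entry */
    unsigned long subtree;   /* union of scope over this node and all below it */
};

/* Insert n and feed the subtree masks on the way down. Returns the root. */
struct node *insert(struct node *root, struct node *n)
{
    struct node *cur = root;

    assert(n != NULL);
    n->left = n->right = NULL;
    n->subtree = n->scope;
    if (!root)
        return n;

    for (;;) {
        struct node **next;

        cur->subtree |= n->scope;
        next = (n->key < cur->key) ? &cur->left : &cur->right;
        if (!*next) {
            *next = n;
            break;
        }
        cur = *next;
    }
    return root;
}

/* Return the node with the smallest key >= key whose scope intersects
 * the requested one, ignoring branches that contain no matching leaf. */
struct node *lookup_ge(struct node *root, uint32_t key, unsigned long scope)
{
    if (!root || !(root->subtree & scope))
        return NULL;                      /* nothing for us below: prune */

    if (root->key >= key) {
        struct node *n = lookup_ge(root->left, key, scope);

        if (n)
            return n;
        if (root->scope & scope)
            return root;
    }
    return lookup_ge(root->right, key, scope);
}
```

A thread would call `lookup_ge()` with its own bit as the scope, which matches the role the entry gives to tid_bit and task->thread_mask.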
William Lallemand
92159b2901 MINOR: mworker: do not store child pid anymore in the pidfile
The parent process supervises the children itself, so we don't need to
store the children pids anymore in the pidfile in master-worker mode.
2017-11-06 11:19:53 +01:00
William Lallemand
deed780a22 MINOR: mworker: write parent pid in the pidfile
The first pid in the pidfile is now the parent, it's more convenient for
supervising the process.

You can now reload haproxy in master-worker mode with a convenient command
like: kill -USR2 $(head -1 /tmp/haproxy.pid)
2017-11-06 11:08:38 +01:00
William Lallemand
8029300df6 MINOR: mworker: allow pidfile in mworker + foreground
This patch allows the use of the pidfile in master-worker mode without
using the background option.
2017-11-06 11:08:38 +01:00
William Lallemand
cc113822a7 MINOR: add master-worker in the warning about nbproc
2017-11-06 11:08:38 +01:00
Willy Tarreau
6dbd3e963b BUG/MEDIUM: threads: don't try to free build option message on exit
Commit 0493149 ("MINOR: thread: report multi-thread support in haproxy -vv")
added information about thread support in haproxy -vv output but accidentally
marked the message as "must_free" while it's a constant. This causes a segv
on the old process on clean exit if threads are enabled. It doesn't affect
the stability during operations however.
2017-11-05 11:51:48 +01:00
Willy Tarreau
bbd09b9306 BUG/MAJOR: thread/listeners: enable_listener must not call unbind_listener()
unbind_listener() takes the listener lock, which is already held by
enable_listener(). This situation happens when starting with nbproc > 1
with some bind lines limited to a certain process, because in this case
enable_listener() tries to stop unneeded listeners.

This commit introduces __do_unbind_listeners() which must be called with
the lock held, and makes enable_listener() use this one. Given that the
only return code has never been used and that it starts to make the code
more complicated to propagate it before throwing it to the trash, the
function's return type was changed to void.
2017-11-05 11:38:44 +01:00
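The split between a locking entry point and a "must be called with the lock held" helper, as described in the entry above, can be sketched as follows. The structure and function names are hypothetical, and the lock is a toy that aborts on re-entry instead of spinning forever, so the mistake is visible rather than a hang.

```c
#include <assert.h>

/* Hypothetical listener with a toy, non-recursive lock. */
struct listener {
    int lock_held;
    int bound;     /* 1 while the listener is bound */
    int wanted;    /* 1 if this process should keep the listener */
};

static void l_lock(struct listener *l)
{
    assert(!l->lock_held);   /* a real spinlock would deadlock here */
    l->lock_held = 1;
}

static void l_unlock(struct listener *l)
{
    assert(l->lock_held);
    l->lock_held = 0;
}

/* Must be called with the lock held; returns nothing, as in the entry above. */
void do_unbind_locked(struct listener *l)
{
    assert(l->lock_held);
    l->bound = 0;
}

/* Public entry point: takes the lock itself. */
void unbind(struct listener *l)
{
    l_lock(l);
    do_unbind_locked(l);
    l_unlock(l);
}

/* Already holds the lock, so it must use the locked helper:
 * calling unbind() from here would try to take the lock twice. */
void enable(struct listener *l)
{
    l_lock(l);
    if (!l->wanted)
        do_unbind_locked(l);
    l_unlock(l);
}
```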
Willy Tarreau
3340029b97 BUG/MAJOR: h2: set the connection's task to NULL when no client timeout is set
If "timeout client" is missing from the frontend, the task is not initialized,
causing a crash on connection teardown.
2017-11-05 11:23:40 +01:00
Willy Tarreau
4d5f13cab3 BUG/MEDIUM: threads/stick-tables: close a race condition on stktable_trash_expired()
The spin_unlock() was called just before setting the expiry to
TICK_ETERNITY, so if another thread has the time to perform its
update and set a timeout, this would clear it.
2017-11-05 11:04:47 +01:00
Willy Tarreau
03071f6937 BUG/MAJOR: threads/lb: fix missing unlock on map-based hash LB
We often left the function with the lock held on success.
2017-11-05 10:59:12 +01:00
Willy Tarreau
1ed90ac377 BUG/MAJOR: threads/lb: fix missing unlock on consistent hash LB
If no matching node was found, the function was left without unlocking
the tree.
2017-11-05 10:54:50 +01:00
Willy Tarreau
5ec84574c7 BUG/MAJOR: threads/dns: add missing unlock on allocation failure path
An unlock was missing when a memory allocation failure is detected.
2017-11-05 10:35:57 +01:00
Willy Tarreau
70124ce3e1 BUG/MAJOR: cli/streams: missing unlock on exit "show sess"
An unlock was missing on the situation where the session disappeared
while watching it.
2017-11-05 10:31:10 +01:00
Willy Tarreau
6ce38f3eab CLEANUP: server: get rid of return statements in the CLI parser
There were too many returns, some of them missing a spin_unlock call,
let's use a goto to a central place instead.
2017-11-05 10:19:23 +01:00
Willy Tarreau
a075258a2c BUG/MINOR: cli: add severity in "set server addr" parser
Commit c3680ec ("MINOR: add severity information to cli feedback messages")
introduced a severity level to CLI messages, but one of them was missed
on "set server addr". No backport is needed.
2017-11-05 10:17:49 +01:00
Willy Tarreau
62ac84f843 CLEANUP: checks: remove return statements in locked functions
Given that all spinning loops we've had since 1.8-rc1 were caused by
unbalanced lock/unlock, let's get rid of all return statements in the
locked check functions and only exit via a single unlock place.
2017-11-05 10:13:38 +01:00
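The single-exit pattern described in the entry above can be sketched as follows. The structure, the toy lock and the port-validation function are hypothetical and only illustrate the shape: every path, including the error path, leaves through one label that performs the unlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check object with a toy lock flag. */
struct check {
    int lock_held;
    int port;
};

static void chk_lock(struct check *c)   { c->lock_held = 1; }
static void chk_unlock(struct check *c) { c->lock_held = 0; }

/* Returns 0 on success, -1 on error. There is no early return between
 * lock and unlock, so the lock can never be left held. */
int set_check_port(struct check *c, int port)
{
    int ret = -1;

    assert(c != NULL);
    chk_lock(c);

    if (port <= 0 || port > 65535)
        goto out;            /* error: skip the action but still unlock */

    c->port = port;
    ret = 0;
out:
    chk_unlock(c);
    return ret;
}
```

The same shape also prevents the opposite mistake, detecting an error and still performing the action, because the error path jumps past it.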
Willy Tarreau
73247e0757 BUG/MAJOR: threads/checks: wrong use of SPIN_LOCK instead of SPIN_UNLOCK
Must unlock on exit, copy-paste error.
2017-11-05 10:13:37 +01:00
Willy Tarreau
1c8980f9b5 BUG/MINOR: cli: do not perform an invalid action on "set server check-port"
The "set server <srv> check-port" CLI handler forgot to return after
detecting an error on the port number, and still proceeds with the action.
This needs to be backported to 1.7.
2017-11-05 10:13:37 +01:00
Willy Tarreau
2a858a82ec BUG/MAJOR: threads/server: missing unlock in CLI fqdn parser
This one didn't properly unlock before returning an error message.
2017-11-05 10:13:37 +01:00
Willy Tarreau
1cd153aa89 BUG/MAJOR: threads/checks: add 4 missing spin_unlock() in various functions
Some unlocks were missing, resulting in deadlocks even with a single thread.
We really need to make these functions safer by getting rid of all those
remaining "return" calls and only leave using a goto!
2017-11-05 10:13:28 +01:00