10599 Commits

Christopher Faulet
00618aadf9 MINOR: proto_htx: Rely on the HTX function to apply redirect rules
There is no reason to use the legacy HTTP version here, which falls back on the
HTX version in this case.
2019-07-19 09:18:27 +02:00
Christopher Faulet
75b4cd967d MINOR: proto_htx: Directly call htx_check_response_for_cacheability()
Instead of using the HTTP legacy version.
2019-07-19 09:18:27 +02:00
Christopher Faulet
4d0e263079 BUG/MINOR: hlua: Make the function txn:done() HTX aware
The function hlua_txn_done() was still relying, for HTTP, on the legacy HTTP
mode. Now, for HTX streams, it calls the function htx_reply_and_close().

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
5f2c49f5ee BUG/MINOR: cache/htx: Make maxage calculation HTX aware
The function http_calc_maxage() was not updated to be HTX aware. So the header
"Cache-Control" on the response was never parsed to find "max-age" or "s-maxage"
values.
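
For context, this computation matters for any HTX proxy placed in front of a
cache section; a purely illustrative setup (names are arbitrary) looks like:

  cache small-cache
    total-max-size 64
    max-age 60

  backend be_static
    http-request cache-use small-cache
    http-response cache-store small-cache
    server static1 127.0.0.1:8080 check

With the fix, "max-age" and "s-maxage" values found in the response's
"Cache-Control" header are again taken into account when computing the entry's
expiration instead of always falling back to the default.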

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
7b889cb387 BUG/MINOR: http_htx: Initialize HTX error messages for TCP proxies
Since the HTX is the default mode for all proxies, HTTP and TCP, we must
initialize all HTX error messages for all HTX-aware proxies and not only for
HTTP ones. It is required to support HTTP upgrade for TCP proxies.

This patch must be backported to 2.0.
2019-07-19 09:18:27 +02:00
Christopher Faulet
cd76195061 BUG/MINOR: http_fetch: Fix http_auth/http_auth_group when called from TCP rules
These sample fetches rely on the static function get_http_auth(). For HTX
streams and TCP proxies, this function gets its HTX message from the request's
channel. When called from an HTTP rule, there is no problem. But when called
from TCP rules for a TCP proxy, this buffer is a raw buffer, not an HTX
message. For instance, using the following TCP rule leads to a crash:

  tcp-request content accept if { http_auth(Users) }

To fix the bug, we must rely on the HTX message returned by the function
smp_prefetch_htx(). So now, the HTX message is passed as argument to the
function get_http_auth().

This patch must be backported to 2.0 and 1.9.
2019-07-19 09:18:27 +02:00
Christopher Faulet
6d36e1c282 MINOR: mux-h2: Don't adjust anymore the amount of data sent in h2_snd_buf()
Because the infinite forwarding is HTX aware, it is useless to tinker with the
number of bytes really sent. This was fixed long ago for the H1 multiplexer but
was forgotten for the H2 one.
2019-07-19 09:18:27 +02:00
Willy Tarreau
09e0203ef4 BUG/MINOR: backend: do not try to install a mux when the connection failed
If si_connect() failed, do not try to install the mux nor to complete
the operations or add the connection to an idle list, and abort quickly
instead. No obvious side effects were identified, but continuing to
allocate some resources after something has already failed seems risky.

This was a result of a prior fix which already wanted to push this code
further: aa089d80b ("BUG/MEDIUM: server: Defer the mux init until after
xprt has been initialized.") but it ought to have pushed it even further
to maintain the error check just after si_connect().

To be backported to 2.0 and 1.9.
2019-07-18 16:49:11 +02:00
Willy Tarreau
69564b1c49 BUG/MEDIUM: http/htx: unbreak option http_proxy
The temporary connection used to hold the target connection's address
was missing a valid target, resulting in a 500 server error being
reported when trying to connect to a remote host. Strangely this
issue was introduced as a side effect of commit 2c52a2b9e ("MEDIUM:
connection: make mux->detach() release the connection") which at
first glance looks unrelated but solidly stops the bisection (note
that by default this part even crashes). It's suspected that, in fact, the
error only happens when closing and destroys pending data.
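
For reference, the broken feature is the plain HTTP proxy mode enabled with
"option http_proxy", where the client sends an absolute URI containing an
explicit IP address; a minimal, purely illustrative setup looks like:

  listen plain-proxy
    bind :3128
    option http_proxy
    timeout connect 5s
    timeout client  30s
    timeout server  30s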

Given that this feature was broken very early during 1.8-rc1 development
it doesn't seem to be used often. This must be backported as far as 1.8.
2019-07-18 16:49:11 +02:00
Olivier Houchard
0ba6c85a0b BUG/MEDIUM: checks: Don't attempt to receive data if we already subscribed.
tcpcheck_main() might be called while we already attempted to subscribe, and
failed. There's no point in trying to call rcv_buf() again, and failing
would lead to us trying to subscribe again, which is not allowed.

This should be backported to 2.0 and 1.9.
2019-07-18 16:42:45 +02:00
Willy Tarreau
8280ea97a0 MINOR: applet: make appctx use their own pool
A long time ago, applets were seen as an alternative to connections,
and since their respective sizes were roughly equal it appeared wise
to share the same pool. Nowadays, connections have become significantly larger
but applets are not used that often, except for the cache. However, applets
are now mostly complementary and not alternatives anymore, as
it's very possible not to have a back connection or to share one with
other streams.

The connections will soon lose their addresses and their size will
shrink so much that appctx won't fit anymore. Given that the old
benefits of sharing these pools have long disappeared, let's stop
doing this and have a dedicated pool for appctx.
2019-07-18 10:45:08 +02:00
Willy Tarreau
45726fd458 BUG/MINOR: dns: remove irrelevant dependency on a client connection
The do-resolve action tests for a client connection to the stream and
tries to get the client's address, otherwise it refrains from performing
the resolution. This really makes no sense at all and looks like an
earlier attempt at resolving the client's address to test that the
code was working. Further, it prevents the action from being used
from other places such as an autonomous applet for example, even if
at the moment this use case does not exist.

This patch simply removes the irrelevant test.
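
As a reminder, the action resolves an expression at runtime and stores the
result in a variable; a typical, purely illustrative use from HTTP rules:

  resolvers mydns
    nameserver dns1 192.168.0.2:53

  frontend fe
    bind :8080
    http-request do-resolve(txn.dstip,mydns,ipv4) hdr(Host),lower
    http-request set-dst var(txn.dstip)
    default_backend be_dynamic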

This can be backported to 2.0.
2019-07-17 14:11:57 +02:00
Willy Tarreau
7764a57d32 BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
Since commit 81492c989 ("MINOR: threads: flatten the per-thread cpu-map"),
we don't keep the proc*thread matrix anymore to represent the full binding
possibilities, but only the proc and thread ones. The problem is that the
per-process binding is not the same for each thread and for the process,
and the proc[] array was assumed to store the per-proc first thread value
when doing this change. Worse, the logic present there tries to deal with
thread ranges and process ranges in a way which automatically excludes the
other possibility (since ranges cannot be used on both) but as such fails
to apply changes if neither the process nor the thread is expressed as a
range.

The real problem comes from the fact that specifying cpu-map 1/1 doesn't
yet reveal if the per-process mask or the per-thread mask needs to be
updated. In practice it's the thread one but then the current storage
doesn't allow to store the binding of the first thread of each other
process in nbproc>1 configurations.

When removing the proc*thread matrix, what ought to have been kept was
both the thread column for process 1 and the process line for threads 1,
but instead only the thread column was kept. This patch reintroduces the
storage of the configuration for the first thread of each process so that
it is again possible to store either the per-thread or per-process
configuration.

As a partial workaround for existing configurations, it is possible to
systematically indicate at least two processes or two threads at once
and map them by pairs or more so that at least two values are present
in the range. E.g.:

  # set processes 1-4 to cpus 0-3 :

     cpu-map auto:1-4/1 0 1 2 3
  # or:
     cpu-map 1-2/1 0 1
     cpu-map 3-4/1 2 3

  # set threads 1-4 to cpus 0-3 :

     cpu-map auto:1/1-4 0 1 2 3
  # or :
     cpu-map 1/1-2 0 1
     cpu-map 1/3-4 2 3

This fix must be backported to 2.0.
2019-07-16 15:23:09 +02:00
Andrew Heberle
9723696759 MEDIUM: mworker-prog: Add user/group options to program section
This patch adds "user" and "group" config options to the "program"
section so the configured command can be run as a different user.
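
A hypothetical example of the resulting syntax (program name and command are
purely illustrative):

  program log-forwarder
    command /usr/local/bin/forwarder --foreground
    user  daemon
    group daemon

As before, program sections are only launched by the master when running in
master-worker mode (-W).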
2019-07-15 16:43:16 +02:00
Willy Tarreau
7df8ca6296 BUG/MEDIUM: tcp-check: unbreak multiple connect rules again
The last connect rule used to be ignored and that was fixed by commit
248f1173f ("BUG/MEDIUM: tcp-check: single connect rule can't detect
DOWN servers") during 1.9 development. However this patch went a bit
too far by not breaking out of the loop after a pending connect(),
resulting in a series of failed connect() calls being quickly skipped and
only the last one being taken into account.

Technically speaking the series is not exactly skipped, it's just that
TCP checks suffer from a design issue which is that there is no
distinction between a new rule and this rule's completion in the
"connect" rule handling code. As such, when evaluating TCPCHK_ACT_CONNECT
a new connection is created regardless of any previous connection in
progress, and the previous result is ignored. It seems that this issue
is mostly specific to the connect action if we refer to the comments at
the top of the function, so it might be possible to durably address it
by reworking the connect state.

For now this patch does something simpler, it restores the behaviour
before the commit above consisting in breaking out of the loop when
the connection is in progress and after skipping comment rules. This
way we fall back to the default code waiting for completion.
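
For illustration, the affected configurations are check sequences chaining
several connect rules, along these lines (ports are arbitrary):

  backend be_multi
    option tcp-check
    tcp-check connect port 8001
    tcp-check connect port 8002
    tcp-check connect port 8003
    server app1 192.168.0.10:8001 check

With the fix, each connect is again waited for before the next rule is
evaluated, so a failure on any of the ports marks the server DOWN.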

This patch must be backported as far as 1.8 since the commit above
was backported there. Thanks to Jérôme Magnin for reporting and
bisecting this issue.
2019-07-15 11:10:36 +02:00
Willy Tarreau
9cca8dfc0b BUG/MINOR: mux-pt: do not pretend there's more data after a read0
Commit 8706c8131 ("BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE.")
was a bit excessive in setting this flag: it refrained from removing
it after read0 unless it was on an empty call. The problem this causes
is that read0 is thus ignored on the first call:

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:34:23.956897 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:34:23.956938 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.956958 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.957033 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:34:23.957229 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:34:23.957297 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

If CO_FL_SOCK_RD_SH is reported by the transport layer, it indicates the
read0 was already seen, thus we must not try again and must immediately
report it. The simple fix consists in removing the test on ret==0:

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:44:21.634835 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:44:21.635020 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:44:21.635056 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:44:21.635269 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:44:21.635330 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

The issue is minor, it only results in extra syscalls and CPU usage.
This fix should be backported to 2.0 and 1.9.
2019-07-15 06:47:54 +02:00
Olivier Houchard
4bd5867627 BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set.
Move the logic to decide if we redispatch to a new server from
sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and
use it in do_l7_retry() instead of just setting the state to SI_ST_REQ.
That way, when using L7 retries, we won't redispatch the request to another
server unless "option redispatch" is used.
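
Concretely, with an illustrative backend like the one below, an L7 retry may
now pick another server only because "option redispatch" is present; without
it, the retry stays on the same server:

  backend be_app
    balance roundrobin
    option redispatch
    retries 3
    retry-on conn-failure empty-response response-timeout
    server app1 192.168.0.11:8080 check
    server app2 192.168.0.12:8080 check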

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Olivier Houchard
29cac3c5f7 BUG/MEDIUM: streams: Don't give up if we couldn't send the request.
In htx_request_forward_body(), don't give up if we failed to send the request
and L7 retries are activated. If we do give up, we will not retry when we should.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Dave Pirotte
234740f65d BUG/MINOR: mux-h1: Correctly report Ti timer when HTX and keepalives are used
When HTTP keepalives are used in conjunction with HTX, the Ti timer
reports the elapsed time since the beginning of the connection instead
of since the end of the previous request, as stated in the documentation. Th,
Tq and Tt are also reported incorrectly as a result.

When creating a new h1s, check if it is the first request on the
connection. If not, set the session create times to the current
timestamp rather than the initial session accept timestamp. This makes
the logged timers behave as stated in the documentation.

This fix should be backported to 1.9 and 2.0.
2019-07-12 16:14:12 +02:00
Christopher Faulet
37243bc61f BUG/MEDIUM: mux-h1: Don't release h1 connection if there is still data to send
When the h1 stream (h1s) is detached, if the connection is not really shut down
yet and if there is still some data to send, the h1 connection (h1c) must not be
released. Otherwise, the remaining data are lost. This bug was introduced by the
commit 3ac0f430 ("BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for
writes was reported").

Here are the conditions to release an h1 connection when the h1 stream is
detached:

  * An error or a shutdown write occurred on the connection
    (CO_FL_ERROR|CO_FL_SOCK_WR_SH)

  * an error, an h2 upgrade or full shutdown occurred on the h1 connection
    (H1C_F_CS_ERROR|H1C_F_UPG_H2C|H1C_F_CS_SHUTDOWN)

  * A shutdown write is pending on the h1 connection and there is no more data
    in the output buffer
    ((h1c->flags & H1C_F_CS_SHUTW_NOW) && !b_data(&h1c->obuf))

If one of these conditions is fulfilled, the h1 connection is
released. Otherwise, the release is delayed. If we are waiting to send remaining
data, a timeout is set.

This patch must be backported to 2.0 and 1.9. It fixes the issue #164.
2019-07-12 10:06:41 +02:00
Willy Tarreau
f2cb169487 BUG/MAJOR: listener: fix thread safety in resume_listener()
resume_listener() can be called from a thread not part of the listener's
mask after a curr_conn has gone lower than a proxy's or the process' limit.
This results in fd_may_recv() being called unlocked if the listener is
bound to only one thread, and quickly locks up.

This patch solves this by creating a per-thread work_list dedicated to
listeners, and modifying resume_listener() so that it bounces the listener
to the work_list of one of its owning threads and wakes it up. This thread will
then call resume_listener() again and will perform the operation on the
file descriptor itself. It is important to do it this way so that the
listener's state cannot be modified while the listener is being moved,
otherwise multiple threads can take conflicting decisions and the listener
could be put back into the global queue if the listener was used at the
same time.

It seems like a slightly simpler approach would be possible if the locked
list API would provide the ability to return a locked element. In this
case the listener would be immediately requeued in dequeue_all_listeners()
without having to go through resume_listener() with its associated lock.

This fix must be backported to all versions having the lock-less accept
loop, which is as far as 1.8 since deadlock fixes involving this feature
had to be backported there. It is expected that the code should not differ
too much there. However, previous commit "MINOR: task: introduce work lists"
will be needed as well and should not present difficulties either. For 1.8,
the commits introducing thread_mask() and LIST_ADDED() will be needed as
well, either backporting my_flsl() or switching to my_ffsl() will be OK,
and some changes will have to be performed so that the init function is
properly called (and maybe the deinit one can be dropped).

In order to test for the fix, simply set up a multi-threaded frontend with
multiple bind lines each attached to a single thread (reproduced with 16
threads here), set up a very low maxconn value on the frontend, and inject
heavy traffic on all listeners in parallel with slightly more connections
than the configured limit (typically +20%) so that it flips very
frequently. If the bug is still there, at some point (5-20 seconds) the
traffic will go much lower or even stop, either with spinning threads or
not.
2019-07-12 09:07:48 +02:00
Willy Tarreau
64e6012eb9 MINOR: task: introduce work lists
Sometimes we need to delegate some list processing to a function running
on another thread. In this case the list element will simply be queued
into a dedicated self-locked list and the task responsible for this list
will be woken up, calling the associated function which will run over the
list.

This is what work_list does. Such lists will be dedicated to a limited
type of work but will significantly ease such remote handling. A function
is provided to create these per-thread lists, their tasks and to properly
bind each task to a distinct thread, so that the caller only has to store
the resulting pointer to the start of the structure.

These structures should not be abused though as each head will consume
4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.
2019-07-12 09:07:48 +02:00
Olivier Houchard
4be7190c10 BUG/MEDIUM: servers: Fix a race condition with idle connections.
When we're purging idle connections, there's a race condition: while we're
removing a connection from the idle list to add it to the list of connections
to free, the thread owning the connection may try to free it at the same time.
To fix this, simply add a per-thread lock that has to be held before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performance.
This has not been reported yet, but could provoke random segfaults.

This should be backported to 2.0.
2019-07-11 16:16:38 +02:00
Frédéric Lécaille
51596c166b CLEANUP: proto_tcp: Remove useless header inclusions.
I guess "sys/un.h" and "sys/stat.h" were included for debugging purposes when
"proto_tcp.c" was initially created. There are no more useful.
2019-07-11 10:40:20 +02:00
David Carlier
7df4185f3c BUG/MEDIUM: da: cast the chunk to string.
In fetch mode, the output was incorrect; the chunk's type is now explicitly
set to string.

This should be backported to all stable versions.
2019-07-11 10:20:09 +02:00
Olivier Houchard
bc89ad8d94 BUG/MEDIUM: checks: Don't attempt to read if we destroyed the connection.
In event_srv_chk_io(), only call __event_srv_chk_r() if we did not subscribe
for reading, and if wake_srv_chk() didn't return -1, as it would mean it
just destroyed the connection and the conn_stream, and attempting to use
those to recv data would lead to a crash.

This should be backported to 1.9 and 2.0.
2019-07-10 16:29:12 +02:00
Willy Tarreau
828675421e MINOR: pools: always pre-initialize allocated memory outside of the lock
When calling mmap(), in general the system gives us a page but does not
really allocate it until we first dereference it. And it turns out that
this time is much longer than the time to perform the mmap() syscall.
Unfortunately, when running with memory debugging enabled, we mmap/munmap()
each object resulting in lots of such calls and a high contention on the
allocator. And the first accesses to the page being done under the pool
lock is extremely damaging to other threads.

The simple fact of writing a 0 at the beginning of the page after
allocating it and placing the POOL_LINK pointer outside of the lock is
enough to boost the performance by 8x in debug mode and to save the
watchdog from triggering on lock contention. This is what this patch
does.
2019-07-09 10:40:33 +02:00
Willy Tarreau
3e853ea74d MINOR: pools: release the pool's lock during the malloc/free calls
The malloc and free calls and especially the underlying mmap/munmap()
can occasionally take a huge amount of time and even cause the thread
to sleep. This is visible when haproxy is compiled with DEBUG_UAF which
causes every single pool allocation/free to allocate and release pages.
In this case, when using the locked pools, the watchdog can occasionally
fire under high contention (typically requesting 40000 1M objects in
parallel over 8 threads). Then, "perf top" shows that 50% of the CPU
time is spent in mmap() and munmap(). The reason the watchdog fires is
because some threads spin on the pool lock which is held by other threads
waiting on mmap() or munmap().

This patch modifies this so that the pool lock is released during these
syscalls. Not only does this allow other threads to try to allocate
their data in parallel, but it also considerably reduces the lock
contention.

Note that the locked pools are only used on small architectures where
high thread counts would not make sense, so this will not provide any
benefit in the general case. However it makes the debugging versions
way more stable, which is always appreciated.
2019-07-09 10:40:33 +02:00
Lukas Tribus
4979916134 BUG/MINOR: ssl: revert empty handshake detection in OpenSSL <= 1.0.2
Commit 54832b97 ("BUILD: enable several LibreSSL hacks, including")
changed empty handshake detection in OpenSSL <= 1.0.2 and LibreSSL,
from accessing packet_length directly (not available in LibreSSL) to
calling SSL_state() instead.

However, SSL_state() appears to be fully broken in both OpenSSL and
LibreSSL.

Since there is no possibility in LibreSSL to detect an empty handshake,
let's not try (like BoringSSL) and restore this functionality for
OpenSSL 1.0.2 and older, by reverting to the previous behavior.

Should be backported to 2.0.
2019-07-09 04:47:18 +02:00
Olivier Houchard
a1ab97316f BUG/MEDIUM: servers: Don't forget to set srv_cs to NULL if we can't reuse it.
In connect_server(), if there was already a CS associated with the stream,
but we can't reuse it, because the target is different (because we tried a
previous connection, it failed, and we use redispatch so we switched servers),
don't forget to set srv_cs to NULL. Otherwise, if we end up reusing another
connection, we would consider we already have a conn_stream, and we won't
create a new one, so we'd have a new connection but we would not be able to
use it.
This can explain frozen streams and connections stuck in CLOSE_WAIT when
using redispatch.

This should be backported to 1.9 and 2.0.
2019-07-08 16:32:58 +02:00
Christopher Faulet
037b3ebd35 BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si
In the function stream_int_notify(), when the opposite stream-interface is
blocked because there is no more room in the input buffer, it is unblocked if
the flag CF_WRITE_PARTIAL is set on this buffer. It is a way to unblock the
reads on the other side because some data was sent.

But it is a problem during fast-forwarding because only the stream is able
to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag set
because of a previous send while the input buffer of the opposite
stream-interface is now full. In such a case, the opposite stream-interface
will be woken up for nothing because its input buffer is full. If the same
happens on the opposite side, we will have a loop consuming all the CPU.

To fix the bug, the opposite side is now only notified from si_cs_send() if
there is some available room in its input buffer, so only if some data was
sent.

This patch must be backported to 2.0 and 1.9.
2019-07-05 14:26:15 +02:00
Christopher Faulet
86162db15c MINOR: stream-int: Factorize processing done after sending data in si_cs_send()
In the function si_cs_send(), what is done when an error occurred on the
connection or the conn_stream, or when some data was successfully sent via a
pipe or the channel's buffer, may be factorized within the function. This
slightly simplifies the function.

This patch must be backported to 2.0 and 1.9 because a bugfix depends on it.
2019-07-05 14:26:15 +02:00
Christopher Faulet
0e54d547f1 BUG/MINOR: mux-h1: Don't process input or output if an error occurred
It is useless to proceed if an error already occurred. Instead, it is better to
wait for it to be caught by the stream or the connection, depending on which
one detects it first.

This patch must be backported to 2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
f8db73efbe BUG/MEDIUM: mux-h1: Handle TUNNEL state when outgoing messages are formatted
Since the commit 94b2c7 ("MEDIUM: mux-h1: refactor output processing"), the
formatting of outgoing messages is performed based on the message state and no
longer on the HTX blocks read. But the TUNNEL state was left out. So, HTTP
tunneling using the CONNECT method or switching the protocol (for instance,
to WebSocket) does not work.

This issue was reported on Github. See #131. This patch must be backported to
2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
16b2be93ad BUG/MEDIUM: lb_fas: Don't test the server's lb_tree from outside the lock
In the function fas_srv_reposition(), the server's lb_tree is tested from
outside the lock. So it is possible to remove it after the test and then call
eb32_insert() in fas_queue_srv() with a NULL root pointer, which is
invalid. Moving the test in the scope of the lock fixes the bug.

This issue was reported on Github, issue #126.

This patch must be backported to 2.0, 1.9 and 1.8.
2019-07-05 14:26:15 +02:00
Christopher Faulet
8f1aa77b42 BUG/MEDIUM: http/applet: Finish request processing when a service is registered
In the analyzers AN_REQ_HTTP_PROCESS_FE/BE, when a service is registered, it is
important not to interrupt the remaining processing, only the http-request
rules processing. Otherwise, the part that handles the applet installation is
skipped.

Among the several effects, if the service is registered on a frontend (not a
listen), the forwarding of the request is skipped because not all analyzers
are set on the request channel. If the service does not depend on it, the
response is still produced and forwarded to the client. But the stream is
infinitely blocked because the request is not fully consumed. This issue was
reported on Github, see #151.

So this bug is fixed thanks to the new action return ACT_RET_DONE. Once a
service is registered, the action process_use_service() still returns
ACT_RET_STOP. But now, only rules processing is stopped. As a side effect, the
action http_action_reject() must now return ACT_RET_DONE to really stop all
processing.

This patch must be backported to 2.0. It depends on the commit introducing the
return code ACT_RET_DONE.
2019-07-05 14:26:14 +02:00
Christopher Faulet
2e4843d1d2 MINOR: action: Add the return code ACT_RET_DONE for actions
This return code should now be used by actions to stop at the same time the
rules processing and any possible following processing. And from its side, the
return code ACT_RET_STOP should be used to only stop rules processing.

So concretely, for TCP rules, there is no change. ACT_RET_STOP and ACT_RET_DONE
are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be
mapped on HTTP_RULE_RES_STOP and ACT_RET_DONE on HTTP_RULE_RES_DONE. This way,
an action will have the possibility to stop all processing or only rules
processing.

Note that the changes for TCP rules are done in this commit, but the changes
for HTTP rules will be done in another one because it will fix a bug at the
same time.

This patch must be backported to 2.0 because a bugfix depends on it.
2019-07-05 14:26:14 +02:00
Frédéric Lécaille
1b9423d214 MINOR: server: Add "no-tfo" option.
Simple patch to add "no-tfo" option to "default-server" and "server" lines
to disable any usage of TCP fast open.
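
A hypothetical use case is enabling TCP Fast Open by default and opting a
single server out of it:

  backend be_app
    default-server tfo check
    server app1 192.168.0.21:8080
    server app2 192.168.0.22:8080 no-tfo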

Must be backported to 2.0.
2019-07-04 14:45:52 +02:00
Olivier Houchard
8d82db70a5 BUG/MEDIUM: servers: Authorize tfo in default-server.
There's no reason to forbid using tfo with default-server, so allow it.

This should be backported to 2.0.
2019-07-04 13:34:25 +02:00
Olivier Houchard
2ab3dada01 BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux.
Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not
enough, as there may be multiple XPRT involved. Instead, require that
any user of conn_upgrade_mux_fe() unsubscribe itself before calling it.
This should fix upgrading a TCP connection to HTX when using SSL.

This should be backported to 2.0.
2019-07-03 13:57:30 +02:00
Christopher Faulet
9060fc02b5 BUG/MINOR: hlua/htx: Respect the reserve when HTX data are sent
The previous commit 7e145b3e2 ("BUG/MINOR: hlua: Don't use
channel_htx_recv_max()") is buggy. The buffer's reserve must be respected.

This patch must be backported to 2.0 and 1.9.
2019-07-03 11:47:20 +02:00
Christopher Faulet
7e145b3e24 BUG/MINOR: hlua: Don't use channel_htx_recv_max()
The function htx_free_data_space() must be used instead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.

This patch must be backported to 2.0 and 1.9.
2019-07-02 21:32:45 +02:00
Olivier Houchard
f494957980 BUG/MEDIUM: checks: Make sure the tasklet won't run if the connection is closed.
wake_srv_chk() can be called from conn_fd_handler(), and may decide to
destroy the conn_stream and the connection, by calling cs_close(). If that
happens, we have to make sure the tasklet isn't scheduled to run, or it will
probably crash trying to access the connection or the conn_stream.
This fixes a crash that can be seen when using tcp checks.

This should be backported to 1.9 and 2.0.
For 1.9, the call should instead be:
task_remove_from_tasklet_list((struct task *)check->wait_list.task);
That function was renamed in 2.0.
2019-07-02 17:45:35 +02:00
Olivier Houchard
6c7e96a3e1 BUG/MEDIUM: connections: Always call shutdown, with no linger.
Revert commit fe4abe62c7c5206dff1802f42d17014e198b9141.
The goal was to make sure that, for health checks, we would not get sockets in
TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set.
However that is wrong, and it means shutw would never be forwarded to
the server, and thus we could get connections that are never properly closed.
Instead, to fix the original problem as described here:
https://www.mail-archive.com/haproxy@formilux.org/msg34080.html
just make sure the checks code calls cs_shutr() before calling cs_shutw().
If shutr has been called, conn_sock_shutw() will make no attempt to call
shutdown(), as it knows close() will be called.
We should really review and revamp the shutr/shutw code, as described in
github issue #142.

This should be backported to 1.9 and 2.0.
2019-07-02 16:40:55 +02:00
Christopher Faulet
b8fc304e8f BUG/MINOR: mux-h1: Don't return the empty chunk on HEAD responses
HEAD responses must not have any body payload. But, because of a bug, for
chunked responses, the empty chunk was always added.

This patch fixes the Github issue #146. It must be backported to 2.0 and 1.9.
2019-07-01 16:24:01 +02:00
Christopher Faulet
5433a0b021 BUG/MINOR: mux-h1: Skip trailers for non-chunked outgoing messages
Unlike H1, H2 messages may contain trailers while the header "Content-Length"
is set. Indeed, because of the framed structure of HTTP/2, it is no longer
necessary to use the chunked transfer encoding. So trailing HEADERS frames,
after all DATA frames, may be added on messages with an explicit content length.

But in H1, it is impossible to have trailers on non-chunked messages. So when
outgoing messages are formatted by the H1 multiplexer, if the message is not
chunked, all trailers must be dropped.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted for the 1.9.
2019-07-01 16:24:01 +02:00
Willy Tarreau
2df8cad0fe BUG/MEDIUM: checks: unblock signals in external checks
As discussed in issue #140, processes are forked with signals blocked
resulting in haproxy's kill being ignored. This happens when the command
takes more time to complete than the configured check timeout or interval.
Just calling "sleep 30" every second makes the problem obvious.

The fix simply consists in unblocking the signals in the child after the
fork. It needs to be backported to all stable branches containing external
checks and where signals are blocked on startup. It's unclear when it
started, but the following config exhibits the issue:

  global
    external-check

  listen www
    bind :8001
    timeout client 5s
    timeout server 5s
    timeout connect 5s
    option external-check
    external-check command "$PWD/sleep10.sh"
    server local 127.0.0.1:80 check inter 200

  $ cat sleep10.sh
  #!/bin/sh
  exec /bin/sleep 10

The "sleep" processes keep accumulating for 10 seconds and stabilize
around 25 when the bug is present. Just issuing "killall sleep" has no
effect on them, and stopping haproxy leaves these processes behind.
2019-07-01 16:03:44 +02:00
William Lallemand
ad03288e6b BUG/MINOR: mworker/cli: don't output a \n before the response
When using a level lower than admin on the master CLI, a \n is output
before the response. This is caused by the responses to the "operator" or
"user" commands that are sent before the actual command.

To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF which asks for
a command response not to be followed by the final \n.
This patch makes a special case of the "operator" and "user" commands
followed by a '-' so that they are not followed by a \n.

This patch must be backported to 2.0 and 1.9.
2019-07-01 15:34:11 +02:00
Christopher Faulet
3ac0f43020 BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for writes was reported
We must take care of this when the stream is detached from the
connection. Otherwise, on the server side, the connection is inserted in the
list of idle connections of the session. But when reused, because the shutdown
for writes was already caught, nothing is sent to the server and the session
is blocked with a frozen connection.

This patch must be backported to 2.0 and 1.9. It is related to the issue #136
reported on Github.
2019-06-28 17:58:15 +02:00
Olivier Houchard
e488ea865a BUG/MEDIUM: ssl: Don't attempt to set alpn if we're not using SSL.
Checks use ssl_sock_set_alpn() to set the ALPN if check-alpn is used. However,
check-alpn failed to check whether the connection was indeed using SSL, and
thus would crash if it was used on a non-SSL connection. Fix this by
making sure the connection uses SSL before attempting to set the ALPN.
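
For reference, check-alpn is meant to be combined with an SSL health check,
for example (addresses purely illustrative):

  backend be_h2
    server app1 192.168.0.31:443 check check-ssl verify none check-alpn h2,http/1.1

With this fix, a check-alpn set on a server whose check connection is not SSL
is simply not applied instead of crashing the process.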

This should be backported to 2.0 and 1.9.
2019-06-28 14:12:28 +02:00