Commit Graph

3875 Commits

Author SHA1 Message Date
Willy Tarreau
eaf05be0ee OPTIM: polling: do not create update entries for FD removal
In order to reduce the number of poller updates, we can benefit from
the fact that modern pollers use sampling to report readiness and that
under load they rarely report the same FD multiple times in a row. As
such it's not always necessary to disable such FDs especially when we're
almost certain they'll be re-enabled again and will require another set
of syscalls.

Now instead of creating an update for a (possibly temporary) removal,
we only perform this removal if the FD is reported again as ready while
inactive. In addition this is performed via another update so that
alternating workloads like transfers have a chance to re-enable the
FD without any syscall during the loop (typically after the data that
filled a buffer have been sent). However we only do that for single-
threaded FDs as the other ones require a more complex setup and are not
on the critical path.

This does cause a few spurious wakeups but almost totally eliminates the
calls to epoll_ctl() on connections seeing intermitent traffic like HTTP/1
to a server or client.

A typical example with 100k requests for 4 kB objects over 200 connections
shows that the number of epoll_ctl() calls doesn't depend on the number
of requests anymore but most exclusively on the number of established
connections:

Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 57.09    0.499964           0    654361    321190 recvfrom
 38.33    0.335741           0    369097         1 epoll_wait
  4.56    0.039898           0     44643           epoll_ctl
  0.02    0.000211           1       200       200 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.875814               1068301    321391 total

After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 59.25    0.504676           0    657600    323630 recvfrom
 40.68    0.346560           0    374289         1 epoll_wait
  0.04    0.000370           0       620           epoll_ctl
  0.03    0.000228           1       200       200 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.851834               1032709    323831 total

As expected there is also a slight increase of epoll_wait() calls since
delaying de-activation of events can occasionally cause one spurious
wakeup.
2019-12-27 16:38:47 +01:00
Willy Tarreau
19689882e6 MINOR: poller: do not call the IO handler if the FD is not active
For now this almost never happens but with subsequent patches it will
become more important not to uselessly call the I/O handlers if the FD
is not active.
2019-12-27 16:38:47 +01:00
Willy Tarreau
0fbc318e24 CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE
Both flags became equal in commit 82967bf9 ("MINOR: connection: adjust
CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the
overlap between xprt_done_cb() and wake() after the removal of the DATA
specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the
"_DONE" version already covers everything and explains the intent well
enough.
2019-12-27 16:38:47 +01:00
Willy Tarreau
4970e5adb7 REORG: connection: move tcp_connect_probe() to conn_fd_check()
The function is not TCP-specific at all, it covers all FD-based sockets
so let's move this where other similar functions are, in connection.c,
and rename it conn_fd_check().
2019-12-27 16:38:43 +01:00
Willy Tarreau
11ef0837af MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP
In practice it's all pollers except select(). It turns out that we're
keeping some legacy code only for select and enforcing it on all
pollers, let's offer the pollers the ability to declare that they
do not need that.
2019-12-27 14:04:33 +01:00
Lukas Tribus
a26d1e1324 BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility
SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled
with the no-deprecated option. Remove existing, incomplete guards and
add a compatibility macro in openssl-compat.h, just as OpenSSL does:

bf4006a6f9/include/openssl/ssl.h (L1486)

This should be backported as far as 2.0 and probably even 1.9.
2019-12-21 06:46:55 +01:00
Rosen Penev
b3814c2ca8 BUG/MINOR: ssl: openssl-compat: Fix getm_ defines
LIBRESSL_VERSION_NUMBER evaluates to 0 under OpenSSL, making the condition
always true. Check for the define before checking it.

Signed-off-by: Rosen Penev <rosenp@gmail.com>

[wt: to be backported as far as 1.9]
2019-12-20 16:01:31 +01:00
Willy Tarreau
dd0e89a084 BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing
Since 1.9 with commit b20aa9eef3 ("MAJOR: tasks: create per-thread wait
queues") a task bound to a single thread will not use locks when being
queued or dequeued because the wait queue is assumed to be the owner
thread's.

But there exists a rare situation where this is not true: the health
check tasks may be running on one thread waiting for a response, and
may in parallel be requeued by another thread calling health_adjust()
after a detecting a response error in traffic when "observe l7" is set,
and "fastinter" is lower than "inter", requiring to shorten the running
check's timeout. In this case, the task being requeued was present in
another thread's wait queue, thus opening a race during task_unlink_wq(),
and gets requeued into the calling thread's wait queue instead of the
running one's, opening a second race here.

This patch aims at protecting against the risk of calling task_unlink_wq()
from one thread while the task is queued on another thread, hence unlocked,
by introducing a new TASK_SHARED_WQ flag.

This new flag indicates that a task's position in the wait queue may be
adjusted by other threads than then one currently executing it. This means
that such WQ manipulations must be performed under a lock. There are two
types of such tasks:
  - the global ones, using the global wait queue (technically speaking,
    those whose thread_mask has at least 2 bits set).
  - some local ones, which for now will be placed into the global wait
    queue as well in order to benefit from its lock.

The flag is automatically set on initialization if the task's thread mask
indicates more than one thread. The caller must also set it if it intends
to let other threads update the task's expiration delay (e.g. delegated
I/Os), or if it intends to change the task's affinity over time as this
could lead to the same situation.

Right now only the situation described above seems to be affected by this
issue, and it is very difficult to trigger, and even then, will often have
no visible effect beyond stopping the checks for example once the race is
met. On my laptop it is feasible with the following config, chained to
httpterm:

    global
        maxconn 400 # provoke FD errors, calling health_adjust()

    defaults
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s

    listen px
        bind :8001
        option httpchk /?t=50
        server sback 127.0.0.1:8000 backup
        server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7

This patch will automatically address the case for the checks because
check tasks are created with multiple threads bound and will get the
TASK_SHARED_WQ flag set.

If in the future more tasks need to rely on this (multi-threaded muxes
for example) and the use of the global wait queue becomes a bottleneck
again, then it should not be too difficult to place locks on the local
wait queues and queue the task on its bound thread.

This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on
previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to
requeue a task".

Many thanks to William Dauchy for providing detailed traces allowing to
spot the problem.
2019-12-19 14:42:22 +01:00
Christopher Faulet
76014fd118 MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state
During H1 parsing, the HTX EOM block is added before switching the message state
to H1_MSG_DONE. It is an exception in the way to convert an H1 message to
HTX. Except for this block, the message is first switched to the right state
before starting to add the corresponding HTX blocks. For instance, the message
is switched in H1_MSG_DATA state and then the HTX DATA blocks are added.

With this patch, the message is switched to the H1_MSG_DONE state when all data
blocks or trailers were processed. It is the caller responsibility to call
h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far
easier to catch failures when the HTX buffer is full.

The H1 and FCGI muxes have been updated accordingly.

This patch may eventually be backported to 2.1 if it helps other backports.
2019-12-11 16:46:16 +01:00
Willy Tarreau
fec56c6a76 BUG/MINOR: listener: fix off-by-one in state name check
As reported in issue #380, the state check in listener_state_str() is
invalid as it allows state value 9 to report crap. We don't use such
a state value so the issue should never happen unless the memory is
already corrupted, but better clean this now while it's harmless.

This should be backported to all maintained branches.
2019-12-11 15:51:37 +01:00
Willy Tarreau
d26c9f9465 BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
If a new process is started with -sf and it fails to bind, it may send
a SIGTTOU to the master process in hope that it will temporarily unbind.
Unfortunately this one doesn't catch it and stops to background instead
of forwarding the signal to the workers. The same is true for SIGTTIN.

This commit simply implements an extra signal handler for the master to
deal with such signals that must be passed down to the workers. It must
be backported as far as 1.8, though there the code differs in that it's
entirely in haproxy.c and doesn't require an extra sig handler.
2019-12-11 14:26:53 +01:00
Willy Tarreau
c49ba52524 MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
We used to have wake_expired_tasks() wake up tasks and return the next
expiration delay. The problem this causes is that we have to call it just
before poll() in order to consider latest timers, but this also means that
we don't wake up all newly expired tasks upon return from poll(), which
thus systematically requires a second poll() round.

This is visible when running any scheduled task like a health check, as there
are systematically two poll() calls, one with the interval, nothing is done
after it, and another one with a zero delay, and the task is called:

  listen test
    bind *:8001
    server s1 127.0.0.1:1111 check

  09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0
  09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0
  09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0
>> nothing run here, as the expired task was not woken up yet.
  09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0
  09:37:39.202505 epoll_wait(3, [], 200, 0) = 0
  09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0
>> now the expired task was woken up
  09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0
  09:37:39.202683 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0
  09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:37:39.202715 close(7)                = 0

Let's instead split the function in two parts:
  - the first part, wake_expired_tasks(), called just before
    process_runnable_tasks(), wakes up all expired tasks; it doesn't
    compute any timeout.
  - the second part, next_timer_expiry(), called just before poll(),
    only computes the next timeout for the current thread.

Thanks to this, all expired tasks are properly woken up when leaving
poll, and each poll call's timeout remains up to date:

  09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0
  09:41:16.270457 epoll_wait(3, [], 200, 999) = 0
  09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0
  09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0
  09:41:17.270323 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0
  09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:41:17.270367 close(7)                = 0

This may be backported to 2.1 and 2.0 though it's unlikely to bring any
user-visible improvement except to clarify debugging.
2019-12-11 09:42:58 +01:00
Willy Tarreau
440d09b244 BUG/MINOR: tasks: only requeue a task if it was already in the queue
Commit 0742c314c3 ("BUG/MEDIUM: tasks: Make sure we switch wait queues
in task_set_affinity().") had a slight side effect on expired timeouts,
which is that when used before a timeout is updated, it will cause an
existing task to be requeued earlier than its expected timeout when done
before being updated, resulting in the next poll wakup timeout too early
or even instantly if the previous wake up was done on a timeout. This is
visible in strace when health checks are enabled because there are two
poll calls, one of which has a short or zero delay. The correct solution
is to only requeue a task if it was already in the queue.

This can be backported to all branches having the fix above.
2019-12-11 09:21:36 +01:00
Willy Tarreau
a1d97f88e0 REORG: listener: move the global listener queue code to listener.c
The global listener queue code and declarations were still lying in
haproxy.c while not needed there anymore at all. This complicates
the code for no reason. As a result, the global_listener_queue_task
and the global_listener_queue were made static.
2019-12-10 14:16:03 +01:00
Willy Tarreau
241797a3fc MINOR: listener: split dequeue_all_listener() in two
We use it half times for the global_listener_queue and half times
for a proxy's queue and this requires the callers to take care of
these. Let's split it in two versions, the current one working only
on the global queue and another one dedicated to proxies for the
per-proxy queues. This cleans up quite a bit of code.
2019-12-10 14:14:09 +01:00
Willy Tarreau
a45a8b5171 MEDIUM: init: set NO_NEW_PRIVS by default when supported
HAProxy doesn't need to call executables at run time (except when using
external checks which are strongly recommended against), and is even expected
to isolate itself into an empty chroot. As such, there basically is no valid
reason to allow a setuid executable to be called without the user being fully
aware of the risks. In a situation where haproxy would need to call external
checks and/or disable chroot, exploiting a vulnerability in a library or in
haproxy itself could lead to the execution of an external program. On Linux
it is possible to lock the process so that any setuid bit present on such an
executable is ignored. This significantly reduces the risk of privilege
escalation in such a situation. This is what haproxy does by default. In case
this causes a problem to an external check (for example one which would need
the "ping" command), then it is possible to disable this protection by
explicitly adding this directive in the global section. If enabled, it is
possible to turn it back off by prefixing it with the "no" keyword.

Before the option:

  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  uid=0(root) gid=0(root) groups=0(root

After the option:
  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the
        'nosuid' option set or an NFS file system without root privileges?
2019-12-06 17:20:26 +01:00
Olivier Houchard
0742c314c3 BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().
In task_set_affinity(), leave the wait_queue if any before changing the
affinity, and re-enter a wait queue once it is done. If we don't do that,
the task may stay in the wait queue of another thread, and we later may
end up modifying that wait queue while holding no lock, which could lead
to memory corruption.

THis should be backported to 2.1, 2.0 and 1.9.
2019-12-05 15:11:19 +01:00
Willy Tarreau
d96f1126fe MEDIUM: init: prevent process and thread creation at runtime
Some concerns are regularly raised about the risk to inherit some Lua
files which make use of a fork (e.g. via os.execute()) as well as
whether or not some of bugs we fix might or not be exploitable to run
some code. Given that haproxy is event-driven, any foreground activity
completely stops processing and is easy to detect, but background
activity is a different story. A Lua script could very well discretely
fork a sub-process connecting to a remote location and taking commands,
and some injected code could also try to hide its activity by creating
a process or a thread without blocking the rest of the processing. While
such activities should be extremely limited when run in an empty chroot
without any permission, it would be better to get a higher assurance
they cannot happen.

This patch introduces something very simple: it limits the number of
processes and threads to zero in the workers after the last thread was
created. By doing so, it effectively instructs the system to fail on
any fork() or clone() syscall. Thus any undesired activity has to happen
in the foreground and is way easier to detect.

This will obviously break external checks (whose concept is already
totally insecure), and for this reason a new option
"insecure-fork-wanted" was added to disable this protection, and it
is suggested in the fork() error report from the checks. It is
obviously recommended not to use it and to reconsider the reasons
leading to it being enabled in the first place.

If for any reason we fail to disable forks, we still start because it
could be imaginable that some operating systems refuse to set this
limit to zero, but in this case we emit a warning, that may or may not
be reported since we're after the fork point. Ideally over the long
term it should be conditionned by strict-limits and cause a hard fail.
2019-12-03 11:49:00 +01:00
Emmanuel Hocdet
e9a100e982 BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0
Commit d4f9a60e "MINOR: ssl: deduplicate ca-file" uses undeclared X509
functions when build with openssl < 1.1.0. Introduce this functions
in openssl-compat.h .

Fix issue #385.
2019-12-03 07:13:12 +01:00
Emmanuel Hocdet
d4f9a60ee2 MINOR: ssl: deduplicate ca-file
Typically server line like:
'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt'
load ca-certificates.crt 1000 times and stay duplicated in memory.
Same case for bind line: ca-file is loaded for each certificate.
Same 'ca-file' can be load one time only and stay deduplicated in
memory.

As a corollary, this will prevent file access for ca-file when
updating a certificate via CLI.
2019-11-28 11:11:20 +01:00
Willy Tarreau
cdb27e8295 MINOR: version: this is development again, update the status
It's basically a revert of commit 9ca7f8cea.
2019-11-25 20:38:32 +01:00
Willy Tarreau
2e077f8d53 [RELEASE] Released version 2.2-dev0
Released version 2.2-dev0 with the following main changes :
    - exact copy of 2.1.0
2019-11-25 20:36:16 +01:00
Willy Tarreau
9ca7f8ceac MINOR: version: indicate that this version is stable
Also indicate that it will get fixes till ~Q1 2021.
2019-11-25 19:47:23 +01:00
Willy Tarreau
c22d5dfeb8 MINOR: h2: add a function to report H2 error codes as strings
Just like we have frame type to string, let's have error to string to
improve debugging and traces.
2019-11-25 11:34:26 +01:00
Willy Tarreau
8f3ce06f14 MINOR: ist: add ist_find_ctl()
This new function looks for the first control character in a string (a
char whose value is between 0x00 and 0x1F included) and returns it, or
NULL if there is none. It is optimized for quickly evicting non-matching
strings and scans ~0.43 bytes per cycle. It can be used as an accelerator
when it's needed to look up several of these characters (e.g. CR/LF/NUL).
2019-11-25 10:33:35 +01:00
Willy Tarreau
47479eb0e7 MINOR: version: emit the link to the known bugs in output of "haproxy -v"
The link to the known bugs page for the current version is built and
reported there. When it is a development version (less than 2 dots),
instead a link to github open issues is reported as there's no way to
be sure about the current situation in this case and it's better that
users report their trouble there.
2019-11-21 18:48:20 +01:00
Willy Tarreau
08dd202d73 MINOR: version: report the version status in "haproxy -v"
As discussed on Discourse here:

    https://discourse.haproxy.org/t/haproxy-branch-support-lifetime/4466

it's not always easy for end users to know the lifecycle of the version
they are using. This patch introduces a "Status" line in the output of
"haproxy -vv" indicating whether it's a development, stable, long-term
supported version, possibly with an estimated end of life for the branch
when it can be anticipated (e.g. for stable versions). This field should
be adjusted when creating a major release to reflect the new status.

It may make sense to backport this to other branches to clarify the
situation.
2019-11-21 18:47:54 +01:00
William Lallemand
8b453912ce MINOR: ssl: ssl_sock_prepare_ctx() return an error code
Rework ssl_sock_prepare_ctx() so it fills a buffer with the error
messages instead of using ha_alert()/ha_warning(). Also returns an error
code (ERR_*) instead of the number of errors.
2019-11-21 17:48:11 +01:00
Daniel Corbett
f8716914c7 MEDIUM: dns: Add resolve-opts "ignore-weight"
It was noted in #48 that there are times when a configuration
may use the server-template directive with SRV records and
simultaneously want to control weights using an agent-check or
through the runtime api.  This patch adds a new option
"ignore-weight" to the "resolve-opts" directive.

When specified, any weight indicated within an SRV record will
be ignored.  This is for both initial resolution and ongoing
resolution.
2019-11-21 17:25:31 +01:00
Frdric Lcaille
ec1c10b839 MINOR: peers: Add debugging information to "show peers".
This patch adds three counters to help in debugging peers protocol issues
to "peer" struct:
	->no_hbt counts the number of reconnection period without receiving heartbeat
	->new_conn counts the number of reconnections after ->reconnect timeout expirations.
	->proto_err counts the number of protocol errors.
2019-11-19 14:48:28 +01:00
Frdric Lcaille
33cab3c0eb MINOR: peers: Add TX/RX heartbeat counters.
Add RX/TX heartbeat counters to "peer" struct to have an idead about which
peer is alive or not.
Dump these counters values on the CLI via "show peers" command.
2019-11-19 14:48:25 +01:00
Cdric Dufour
0d7712dff0 MINOR: stick-table: allow sc-set-gpt0 to set value from an expression
Allow the sc-set-gpt0 action to set GPT0 to a value dynamically evaluated from
its <expr> argument (in addition to the existing static <int> alternative).
2019-11-15 18:24:19 +01:00
Willy Tarreau
869efd5eeb BUG/MINOR: log: make "show startup-log" use a ring buffer instead
The copy of the startup logs used to rely on a re-allocated memory area
on the fly, that would attempt to be delivered at once over the CLI. But
if it's too large (too many warnings) it will take time to start up, and
may not even show up on the CLI as it doesn't fit in a buffer.

The ring buffer infrastructure solves all this with no more code, let's
switch to this instead. It simply requires a parsing function to attach
the ring via ring_attach_cli() and all the rest is automatically handled.

Initially this was imagined as a code cleanup, until a test with a config
involving 100k backends and just one occurrence of
"load-server-state-from-file global" in the defaults section took approx
20 minutes to parse due to the O(N^2) cost of concatenating the warnings
resulting in ~1 TB of data to be copied, while it took only 0.57s with
the ring.

Ideally this patch should be backported to 2.0 and 1.9, though it relies
on the ring infrastructure which will then also need to be backported.
Configs able to trigger the bug are uncommon, so another workaround for
older versions without backporting the rings would consist in simply
limiting the size of the error message in print_message() to something
always printable, which will only return the first errors.
2019-11-15 15:50:16 +01:00
Christopher Faulet
0d1c2a65e8 MINOR: stats: Report max times in addition of the averages for sessions
Now, for the sessions, the maximum times (queue, connect, response, total) are
reported in addition of the averages over the last 1024 connections. These
values are called qtime_max, ctime_max, rtime_max and ttime_max.

This patch is related to #272.
2019-11-15 14:23:54 +01:00
Christopher Faulet
efb41f0d8d MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
For backends and servers, some average times for last 1024 connections are
already calculated. For the moment, the averages for the time passed in the
queue, the connect time, the response time (for HTTP session only) and the total
time are calculated. Now, in addition, the maximum time observed for these
values are also stored.

In addition, These new counters are cleared as all other max values with the CLI
command "clear counters".

This patch is related to #272.
2019-11-15 14:23:21 +01:00
Christopher Faulet
e2e8c6779e MINOR: freq_ctr: Make the sliding window sums thread-safe
swrate_add() and swrate_add_scaled() now rely on the CAS atomic operation. So
the sliding window sums are atomically updated.
2019-11-15 13:43:08 +01:00
Christopher Faulet
b2e58492b1 MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
This change make the payload filtering uniform between TCP and HTTP
filters. Now, in TCP, like in HTTP, there is only one callback responsible to
forward data. Thus, old callbacks, tcp_data() and tcp_forward_data(), are
replaced by a single callback function, tcp_payload(). This new callback gets
the offset in the payload to (re)start the filtering and the maximum amount of
data it can forward. It is the filter's responsibility to be compatible with HTX
streams. If not, it must not set the flag FLT_CFG_FL_HTX.

Because of this change, nxt and fwd offsets are no longer needed. Thus they are
removed from the filter structure with their update functions,
flt_change_next_size() and flt_change_forward_size(). Moreover, the trace filter
has been updated accordingly.

This patch breaks the compatibility with the old API. Thus it should probably
not be backported. But, AFAIK, there is no TCP filter, thus the breakage is very
limited.
2019-11-15 13:43:08 +01:00
Willy Tarreau
da52035a45 MINOR: memory: also poison the area on freeing
Doing so sometimes helps detect some UAF situations without the overhead
associated to the DEBUG_UAF define.
2019-11-15 07:06:46 +01:00
Olivier Houchard
7031e3dace BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
In tasklet_remove_from_tasket_list(), we can be called for a tasklet that is
either in the private task list, or in the shared tasklet list. Take that into
account and always use MT_LIST_DEL() to remove it, otherwise if we're in the
shared list and another thread attempts to add a tasklet in it, bad things
will happen.

__tasklet_remove_from_tasklet_list() is left unchanged, it's only supposed
to be used by process_runnable_task() to remove task/tasklets from the private
tast list.

This should not be backported.
This should fix github issue #357.
2019-11-09 18:27:17 +01:00
Christopher Faulet
fee726ffa7 MINOR: http-ana: Remove the unused function http_reset_txn()
Since the legacy HTTP mode was removed, the stream is always released at the end
of each HTTP transaction and a new is created to handle the next request for
keep-alive connections. So the HTTP transaction is no longer reset and the
function http_reset_txn() can be removed.
2019-11-07 15:32:52 +01:00
Christopher Faulet
eea8fc737b MEDIUM: stream/trace: Register a new trace source with its events
Runtime traces are now supported for the streams, only if compiled with
debug. process_stream() is covered as well as TCP/HTTP analyzers and filters.

In traces, the first argument is always a stream. So it is easy to get the info
about the channels and the stream-interfaces. The second argument, when defined,
is always a HTTP transaction. And the third one is an HTTP message. The trace
message is adapted to report HTTP info when possible.
2019-11-06 10:14:32 +01:00
Christopher Faulet
db703b1918 MINOR: trace: Add a set of macros to trace events if HA is compiled with debug
The macros DBG_TRACE_*() can be used instead of existing trace macros to emit
trace messages in debug mode only, ie, when HAProxy is compiled with DEBUG_FULL
or DEBUG_DEV. Otherwise, these macros do nothing. So it is possible to add
traces for development purpose without impacting performance of production
instances.
2019-11-06 10:14:32 +01:00
William Lallemand
21724f0807 MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert'
If the SSL_CTX of a previous instance (ckch_inst) was used as a
default_ctx, replace the default_ctx of the bind_conf by the first
SSL_CTX inserted in the SNI tree.

Use the RWLOCK of the sni tree to handle the change of the default_ctx.
2019-11-04 18:16:53 +01:00
Damien Claisse
ae6f125c7b MINOR: sample: add us/ms support to date/http_date
It can be sometimes interesting to have a timestamp with a
resolution of less than a second.
It is currently painful to obtain this, because concatenation
of date and date_us lead to a shorter timestamp during first
100ms of a second, which is not parseable and needs ugly ACLs
in configuration to prepend 0s when needed.
To improve this, add an optional <unit> parameter to date sample
to report an integer with desired unit.
Also support this unit in http_date converter to report
a date string with sub-second precision.
2019-10-31 08:47:31 +01:00
William Lallemand
beea2a476e CLEANUP: ssl/cli: remove leftovers of bundle/certs (it < 2)
Remove the leftovers of the certificate + bundle updating in 'ssl set
cert' and 'commit ssl cert'.

* Remove the it variable in appctx.ctx.ssl.
* Stop doing everything twice.
* Indent
2019-10-30 17:52:34 +01:00
William Lallemand
bc6ca7ccaa MINOR: ssl/cli: rework 'set ssl cert' as 'set/commit'
This patch splits the 'set ssl cert' CLI command into 2 commands.

The previous way of updating the certificate on the CLI was limited with
the bundles. It was only able to apply one of the tree part of the
certificate during an update, which mean that we needed 3 updates to
update a full 3 certs bundle.

It was also not possible to apply atomically several part of a
certificate with the ability to rollback on error. (For example applying
a .pem, then a .ocsp, then a .sctl)

The command 'set ssl cert' will now duplicate the certificate (or
bundle) and update it in a temporary transaction..

The second command 'commit ssl cert' will commit all the changes made
during the transaction for the certificate.

This commit breaks the ability to update a certificate which was used as
a unique file and as a bundle in the HAProxy configuration. This way of
using the certificates wasn't making any sense.

Example:

	// For a bundle:

	$ echo -e "set ssl cert localhost.pem.rsa <<\n$(cat kikyo.pem.rsa)\n" | socat /tmp/sock1 -
	Transaction created for certificate localhost.pem!

	$ echo -e "set ssl cert localhost.pem.dsa <<\n$(cat kikyo.pem.dsa)\n" | socat /tmp/sock1 -
	Transaction updated for certificate localhost.pem!

	$ echo -e "set ssl cert localhost.pem.ecdsa <<\n$(cat kikyo.pem.ecdsa)\n" | socat /tmp/sock1 -
	Transaction updated for certificate localhost.pem!

	$ echo "commit ssl cert localhost.pem" | socat /tmp/sock1 -
	Committing localhost.pem.
	Success!
2019-10-30 17:01:07 +01:00
William Dauchy
0fec3ab7bf MINOR: init: always fail when setrlimit fails
this patch introduces a strict-limits parameter which enforces the
setrlimit setting instead of a warning. This option can be forcingly
disable with the "no" keyword.
The general aim of this patch is to avoid bad surprises on a production
environment where you change the maxconn for example, a new fd limit is
calculated, but cannot be set because of sysfs setting. In that case you
might want to have an explicit failure to be aware of it before seeing
your traffic going down. During a global rollout it is also useful to
explictly fail as most progressive rollout would simply check the
general health check of the process.

As discussed, plan to use the strict by default mode starting from v2.3.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-10-29 17:42:27 +01:00
Olivier Houchard
6e8e2ec849 BUG/MEDIUM: stream_interface: Only use SI_ST_RDY when the mux is ready.
In si_connect(), only switch the strema_interface status to SI_ST_RDY if
we're reusing a connection and if the connection's mux is ready. Otherwise,
maybe we're reusing a connection that is not fully established yet, and may
fail, and setting SI_ST_RDY would mean we would not be able to retry to
connect.

This should be backported to 1.9 and 2.0.
This commit depends on 55234e33708c5a584fb9efea81d71ac47235d518.
2019-10-29 14:15:20 +01:00
Olivier Houchard
9b8e11e691 MINOR: mux: Add a new method to get informations about a mux.
Add a new method, ctl(), to muxes. It uses a "enum mux_ctl_type" to
let it know which information we're asking for, and can output it either
directly by returning the expected value, or by using an optional argument.
"output" argument.
Right now, the only known mux_ctl_type is MUX_STATUS, that will return 0 if
the mux is not ready, or MUX_STATUS_READY if the mux is ready.

We probably want to backport this to 1.9 and 2.0.
2019-10-29 14:15:20 +01:00
Willy Tarreau
2254b8ef4a Revert "MINOR: istbuf: add b_fromist() to make a buffer from an ist"
This reverts commit 9e46496d45. It was
wrong and is not reliable, depending on the compiler's version and
optimization, as the struct is assigned inside a statement, thus on
its own stack. It's not needed anymore now so let's remove this.
2019-10-29 13:09:14 +01:00
Willy Tarreau
20020ae804 MINOR: chunk: add chunk_istcat() to concatenate an ist after a chunk
We previously relied on chunk_cat(dst, b_fromist(src)) for this but it
is not reliable as the allocated buffer is inside the expression and
may be on a temporary stack. While it's possible to allocate stack space
for a struct and return a pointer to it, it's not possible to initialize
it form a temporary variable to prevent arguments from being evaluated
multiple times. Since this is only used to append an ist after a chunk,
let's instead have a chunk_istcat() function to perform exactly this
from a native ist.

The only call place (URI computation in the cache) was updated.
2019-10-29 13:09:14 +01:00
Willy Tarreau
9b013701f1 MINOR: stats/debug: maintain a counter of debug commands issued
Debug commands will usually mark the fate of the process. We'd rather
have them counted and visible in a core or in stats output than trying
to guess how a flag combination could happen. The counter is only
incremented when the command is about to be issued however, so that
failed attempts are ignored.
2019-10-24 18:38:00 +02:00
Willy Tarreau
abb9f9b057 MINOR: cli: add an expert mode to hide dangerous commands
Some commands like the debug ones are not enabled by default but can be
useful on some production environments. In order to avoid the temptation
of using them incorrectly, let's introduce an "expert" mode for a CLI
connection, which allows some commands to appear and be used. It is
enabled by command "expert-mode on" which is not listed by default.
2019-10-24 18:38:00 +02:00
Willy Tarreau
86bfe146c9 REORG: move CLI access level definitions to cli.h
These ones were still in global.h which is misplaced.
2019-10-24 18:38:00 +02:00
William Lallemand
705e088f0a BUG/MINOR: ssl: fix build of X509_chain_up_ref() w/ libreSSL
LibreSSL brought X509_chain_up_ref() in 2.7.5, so no need to build our
own version starting from this version.
2019-10-23 23:20:08 +02:00
William Lallemand
89f5807315 BUG/MINOR: ssl: fix build with openssl < 1.1.0
8c1cddef ("MINOR: ssl: new functions duplicate and free a ckch_store")
use some OpenSSL refcount functions that were introduced in OpenSSL
1.0.2 and OpenSSL 1.1.0.

Fix the problem by introducing them in openssl-compat.h.

Fix #336.
2019-10-23 19:44:50 +02:00
William Lallemand
8f840d7e55 MEDIUM: cli/ssl: handle the creation of SSL_CTX in an IO handler
To avoid affecting too much the traffic during a certificate update,
create the SNIs in a IO handler which yield every 10 ckch instances.

This way haproxy continues to respond even if we tries to update a
certificate which have 50 000 instances.
2019-10-23 11:54:51 +02:00
Willy Tarreau
403bfbb130 BUG/MEDIUM: pattern: make the pattern LRU cache thread-local and lockless
As reported in issue #335, a lot of contention happens on the PATLRU lock
when performing expensive regex lookups. This is absurd since the purpose
of the LRU cache was to have a fast cache for expressions, thus the cache
must not be shared between threads and must remain lockless.

This commit makes the LRU cache thread-local and gets rid of the PATLRU
lock. A test with 7 threads on 4 cores climbed from 67kH/s to 369kH/s,
or a scalability factor of 5.5.

Given the huge performance difference and the regression caused to
users migrating from processes to threads, this should be backported at
least to 2.0.

Thanks to Brian Diekelman for his detailed report about this regression.
2019-10-23 07:27:25 +02:00
Willy Tarreau
8cdc167df8 BUG/MEDIUM: task: make tasklets either local or shared but not both at once
Tasklets may be woken up to run on the calling thread or by a specific thread
(the owner). But since we use a non-thread safe mechanism when the calling
thread is also the for the owner, there may sometimes be collisions when two
threads decide to wake the same tasklet up at the same time and one of them
is the owner.

This is more of a matter of usage than code, in that a tasklet usually is
designed to be woken up and executed on the calling thread only (most cases)
or on a specific thread. Thus it is a property of the tasklet itself as this
solely depends how the code is constructed around it.

This patch performs a small change to address this. By default tasklet_new()
creates a "local" tasklet, which will run on the calling thread, like in 2.0.
This is done by setting tl->tid to a negative value. If the caller wants the
tasklet to run exclusively on a specific thread, it just has to set tl->tid,
which is already what shared tasklet callers do anyway.

No backport is needed.
2019-10-18 09:04:55 +02:00
Willy Tarreau
891b5ef05a BUG/MEDIUM: tasklet: properly compute the sleeping threads mask in tasklet_wakeup()
The use of ~(1 << tid) to compute the sleeping_mask in tasklet_wakeup()
will result in breakage above 32 threads, because (1<<31) = 0xFFFFFFFF8000000,
and upper values will lead to theorically undefined results, but practically
will wrap over 0x1 to 0x80000000 again and indicate wrong sleeping masks. It
seems that the main visible effect maybe extra latency on some threads or
short CPU loops on others.

No backport is needed.
2019-10-18 09:00:26 +02:00
Olivier Houchard
2068ec4f89 BUG/MEDIUM: lists: Handle 1-element-lists in MT_LIST_BEHEAD().
In MT_LIST_BEHEAD(), explicitely set the next element of the prev to NULL,
instead of setting it to the prev of the next. If we only had one element,
then we'd set the next and the prev to the element itself, and thus it would
make the element appear to be outside any list.
2019-10-17 17:48:20 +02:00
Willy Tarreau
9e46496d45 MINOR: istbuf: add b_fromist() to make a buffer from an ist
A lot of our chunk-based functions are able to work on a buffer pointer
but not on an ist. Instead of duplicating all of them to also take an
ist as a source, let's have a macro to make a temporary dummy buffer
from an ist. This will only result in structure field manipulations
that the compiler will quickly figure to eliminate them with inline
functions, and in other cases it will just use 4 words in the stack
before calling a function, instead of performing intermediary
conversions.
2019-10-17 10:40:47 +02:00
David Carlier
a92c5cec2d BUILD/MEDIUM: threads: rename thread_info struct to ha_thread_info
On Darwin, the thread_info name exists as a standard function thus
we need to rename our array to ha_thread_info to fix this conflict.
2019-10-17 07:15:17 +02:00
Christopher Faulet
065118166c MINOR: htx: Add a flag on HTX to known when a response was generated by HAProxy
The flag HTX_FL_PROXY_RESP is now set on responses generated by HAProxy,
excluding responses returned by applets and services. It is an informative flag
set by the applicative layer.
2019-10-16 10:03:12 +02:00
Willy Tarreau
abefa34c34 MINOR: version: make the version strings variables, not constants
It currently is not possible to figure the exact haproxy version from a
core file for the sole reason that the version is stored into a const
string and as such ends up in the .text section that is not part of a
core file. By turning them into variables we move them to the data
section and they appear in core files. In order to help finding them,
we just prepend an extra variable in front of them and we're able to
immediately spot the version strings from a core file:

  $ strings core | fgrep -A2 'HAProxy version'
  HAProxy version follows
  2.1-dev2-e0f48a-88
  2019/10/15

(These are haproxy_version and haproxy_date respectively). This may be
backported to 2.0 since this part is not support to impact anything but
the developer's time spent debugging.
2019-10-16 09:56:57 +02:00
Christopher Faulet
53a899b946 CLEANUP: h1-htx: Move htx-to-h1 formatting functions from htx.c to h1_htx.c
The functions "htx_*_to_h1()" have been renamed into "h1_format_htx_*()" and
moved in the file h1_htx.c. It is the right place for such functions.
2019-10-14 22:28:50 +02:00
Christopher Faulet
48fa033f28 BUG/MINOR: chunk: Fix tests on the chunk size in functions copying data
When raw data are copied or appended in a chunk, the result must not exceed the
chunk size but it can reach it. Unlike functions to copy or append a string,
there is no terminating null byte.

This patch must be backported as far as 1.8. Note in 1.8, the functions
chunk_cpy() and chunk_cat() don't exist.
2019-10-14 16:45:09 +02:00
William Lallemand
e0c51ae358 BUG/MINOR: ssl: fix build without SSL
Commits 222a7c6 and 150bfa8 introduced some SSL initialization in
bind_conf_alloc() which broke the build without SSL.

Issue #322.
2019-10-14 11:24:17 +02:00
William Lallemand
246c0246d3 MINOR: ssl: load the ocsp in/from the ckch
Don't try to load the files containing the issuer and the OCSP response
each time we generate a SSL_CTX.

The .ocsp and the .issuer are now loaded in the struct
cert_key_and_chain only once and then loaded from this structure when
creating a SSL_CTX.
2019-10-11 17:32:03 +02:00
William Lallemand
a17f4116d5 MINOR: ssl: load the sctl in/from the ckch
Don't try to load the file containing the sctl each time we generate a
SSL_CTX.

The .sctl is now loaded in the struct cert_key_and_chain only once and
then loaded from this structure when creating a SSL_CTX.

Note that this now make possible the use of sctl with multi-cert
bundles.
2019-10-11 17:32:03 +02:00
William Lallemand
150bfa84e3 MEDIUM: ssl/cli: 'set ssl cert' updates a certificate from the CLI
$ echo -e "set ssl cert certificate.pem <<\n$(cat certificate2.pem)\n" | \
    socat stdio /var/run/haproxy.stat
    Certificate updated!

The operation is locked at the ckch level with a HA_SPINLOCK_T which
prevents the ckch architecture (ckch_store, ckch_inst..) to be modified
at the same time. So you can't do a certificate update at the same time
from multiple CLI connections.

SNI trees are also locked with a HA_RWLOCK_T so reading operations are
locked only during a certificate update.

Bundles are supported but you need to update each file (.rsa|ecdsa|.dsa)
independently. If a file is used in the configuration as a bundle AND
as a unique certificate, both will be updated.

Bundles, directories and crt-list are supported, however filters in
crt-list are currently unsupported.

The code tries to allocate every SNIs and certificate instances first,
so it can rollback the operation if that was unsuccessful.

If you have too much instances of the certificate (at least 20000 in my
tests on my laptop), the function can take too much time and be killed
by the watchdog. This will be fixed later. Also with too much
certificates it's possible that socat exits before the end of the
generation without displaying a message, consider changing the socat
timeout in this case (-t2 for example).

The size of the certificate is currently limited by the maximum size of
a payload, that must fit in a buffer.
2019-10-11 17:32:03 +02:00
William Lallemand
1d29c7438e MEDIUM: ssl: split ssl_sock_add_cert_sni()
In order to allow the creation of sni_ctx in runtime, we need to split
the function to allow rollback.

We need to be able to allocate all sni_ctxs required before inserting
them in case we need to rollback if we didn't succeed the allocation.

The function was splitted in 2 parts.

The first one ckch_inst_add_cert_sni() allocates a struct sni_ctx, fill
it with the right data and insert it in the ckch_inst's list of sni_ctx.

The second will take every sni_ctx in the ckch_inst and insert them in
the bind_conf's sni tree.
2019-10-11 17:32:03 +02:00
William Lallemand
9117de9e37 MEDIUM: ssl: introduce the ckch instance structure
struct ckch_inst represents an instance of a certificate (ckch_node)
used in a bind_conf. Every sni_ctx created for 1 ckch_node in a
bind_conf are linked in this structure.

This patch allocate the ckch_inst for each bind_conf and inserts the
sni_ctx in its linked list.
2019-10-11 17:32:03 +02:00
William Lallemand
222a7c6ae0 MINOR: ssl: initialize explicitly the sni_ctx trees 2019-10-11 17:32:02 +02:00
William Lallemand
f6adbe9f28 REORG: ssl: move structures to ssl_sock.h 2019-10-11 17:32:02 +02:00
Olivier Houchard
804ef244c6 MINOR: lists: Fix alignement of \ when relevant.
Make sure all the \ are properly aligned in macroes, this contains no
functional change.
2019-10-11 16:56:25 +02:00
Olivier Houchard
74715da030 MINOR: lists: Try to use local variables instead of macro arguments.
When possible, use local variables instead of using the macro arguments
explicitely, otherwise they may be evaluated over and over.
2019-10-11 16:56:25 +02:00
Olivier Houchard
06910464dd MEDIUM: task: Split the tasklet list into two lists.
As using an mt_list for the tasklet list is costly, instead use a regular list,
but add an mt_list for tasklet woken up by other threads, to be run on the
current thread. At the beginning of process_runnable_tasks(), we just take
the new list, and merge it into the task_list.
This should give us performances comparable to before we started using a
mt_list, but allow us to use tasklet_wakeup() from other threads.
2019-10-11 16:37:41 +02:00
Willy Tarreau
d7f2bbcbe3 MINOR: list: add new macro MT_LIST_BEHEAD
This macro atomically cuts the head of a list and returns the list
of elements as a detached list, meaning that they're all linked
together without any head. If the list was empty, NULL is returned.
2019-10-11 16:37:41 +02:00
Willy Tarreau
c32a0e522f MINOR: lists: add new macro LIST_SPLICE_END_DETACHED
This macro adds a detached list at the end of an existing
list. The detached list is a list without head, containing
only elements.
2019-10-11 16:37:41 +02:00
Willy Tarreau
eaa55370c3 MINOR: stats: prepare to add a description with each stat/info field
Several times some users have expressed the non-intuitive aspect of some
of our stat/info metrics and suggested to add some help. This patch
replaces the char* arrays with an array of name_desc so that we now have
some reserved room to store a description with each stat or info field.
These descriptions are currently empty and not reported yet.
2019-10-10 11:30:07 +02:00
Willy Tarreau
2f39738750 MINOR: stats: support the "desc" output format modifier for info and stat
Now "show info" and "show stat" can parse "desc" as an output format
modifier that will be passed down the chain to add some descriptions
to the fields depending on the format in use. For now it is not
exploited.
2019-10-10 11:30:07 +02:00
Willy Tarreau
ab02b3f345 MINOR: stats: get rid of the STAT_SHOWADMIN flag
This flag is used to decide to show the check box in front of a proxy
on the HTML stat page. It is always equal to STAT_ADMIN except when the
proxy has no backend capability (i.e. a pure frontend) or has no server,
in which case it's only used to avoid leaving an empty column at the
beginning of the table. Not only this is pretty useless, but it also
causes the columns not to align well when mixing multiple proxies with
or without servers.

Let's simply always use STAT_ADMIN and get rid of this flag.
2019-10-10 11:30:07 +02:00
Willy Tarreau
708c41602b MINOR: stats: replace the ST_* uri_auth flags with STAT_*
We used to rely on some config flags defined in uri_auth.h set during
parsing, and another set of STAT_* flags defined in stats.h set at run
time, with a somewhat gray area between the two sets. This is confusing
in the stats code as both are called "flags" in various functions and
it's quite hard to know which one describes what.

This patch cleans this up by replacing all ST_* by a newly assigned
value from the STAT_* set so that we can now use unified flags to
describe both the configuration and the current state. There is no
functional change at all.
2019-10-10 11:30:07 +02:00
Willy Tarreau
ee4f5f83d3 MINOR: stats: get rid of the ST_CONVDONE flag
This flag was added in 1.4-rc1 by commit 329f74d463 ("[BUG] uri_auth: do
not attemp to convert uri_auth -> http-request more than once") to
address the case where two proxies inherit the stats settings from
the defaults instance, and the first one compiles the expression while
the second one uses it. In this case since they use the exact same
uri_auth pointer, only the first one should compile and the second one
must not fail the check. This was addressed by adding an ST_CONVDONE
flag indicating that the expression conversion was completed and didn't
need to be done again. But this is a hack and it becomes cumbersome in
the middle of the other flags which are all relevant to the stats
applet. Let's instead fix it by checking if we're dealing with an
alias of the defaults instance and refrain from compiling this twice.
This allows us to remove the ST_CONVDONE flag.

A typical config requiring this check is :

   defaults
        mode http
        stats auth foo:bar

   listen l1
        bind :8080

   listen l2
        bind :8181

Without this (or previous) check it would cmoplain when checking l2's
validity since the rule was already built.
2019-10-10 11:30:07 +02:00
Christopher Faulet
16fdc55f79 MINOR: http: Add a function to get the authority into a URI
The function http_get_authority() may be used to parse a URI and looks for the
authority, between the scheme and the path. An option may be used to skip the
user info (part before the '@'). Most of time, the user info will be ignored.
2019-10-09 11:05:31 +02:00
Christopher Faulet
9a67c293b9 MINOR: htx: Add 2 flags on the start-line to have more info about the uri
The first flag, HTX_SL_F_HAS_AUTHORITY, is set when the uri contains an
authority. For the H1, it happens when a CONNECT request is received or when an
absolute uri is used. For the H2, it happens when the pseudo header ":authority"
is provided.

The second one, HTX_SL_F_NORMALIZED_URI, is set when the received uri is
represented as an absolute uri because of the protocol requirements. For now, it
is only used for h2 requests, when the pseudo headers :authority and :scheme are
found. Internally, the uri is represented as an absolute uri. This flag allows
us to make the difference between an absolute uri in h1 and h2.
2019-10-09 11:05:31 +02:00
Christopher Faulet
c5a3eb4e3a MINOR: fcgi: Add function to get the string representation of a record type
This function will be used to emit traces in the FCGI multiplexer.
2019-10-04 16:12:02 +02:00
Christopher Faulet
27aa65ecfb MINOR: htx: Adapt htx_dump() to be used from traces
This function now dumps info about the HTX message into a buffer, passed as
argument. In addition, it is possible to only dump meta information, without the
message content.
2019-10-04 15:48:55 +02:00
Christopher Faulet
af542635f7 MINOR: h1-htx: Update h1_copy_msg_data() to ease the traces in the mux-h1
This function now uses the address of the pointer to the htx message where the
copy must be performed. This way, when a zero-copy is performed, there is no
need to refresh the caller's htx message. It is a bit easier to do that way,
especially to add traces in the mux-h1.
2019-10-04 15:46:59 +02:00
Willy Tarreau
2aaeee34da BUG/MEDIUM: fd: HUP is an error only when write is active
William reported that since commit 6b3089856f ("MEDIUM: fd: do not use
the FD_POLL_* flags in the pollers anymore") the master's CLI often
fails to access sub-processes. There are two causes to this. One is
that we did report FD_POLL_ERR on an FD as soon as FD_EV_SHUT_W was
seen, which is automatically inherited from POLLHUP. And since we do
not store the current shutdown state of an FD we can't know if the
poller reports a sudden close resulting from an error or just a
byproduct of a previous shutdown(WR) followed by a read0. The current
patch addresses this by only considering this when the FD was active,
since a shutdown FD is not active. The second issue is that *somewhere*
down the chain, channel data are ignored if an error is reported on a
channel. This results in content truncation, but this cause was not
figured yet.

No backport is needed.
2019-10-01 11:52:08 +02:00
Tim Duesterhus
07626eafa2 CLEANUP: proxy: Remove proxy_tbl_by_name
It is no longer required as of 1b8e68e89a
and is no longer used when #306 is fixed.
2019-09-30 04:11:36 +02:00
Christopher Faulet
88a0db28ae MINOR: stats: Add the support of float fields in stats
It is now possible to format stats counters as floats. But the stats applet does
not use it.

This patch is required by the Prometheus exporter to send the time averages in
seconds. If the promex change is backported, this patch must be backported
first.
2019-09-27 08:49:09 +02:00
Christopher Faulet
d72665b425 CLEANUP: http-ana: Remove the unused function http_send_name_header()
Because the HTTP multiplexers are now responsible to handle the option
"http-send-name-header", the function http_send_name_header() can be removed.
2019-09-27 08:48:53 +02:00
Christopher Faulet
b1bb1afa47 MINOR: spoe: Support the async mode with several threads
A different engine-id is now generated for each thread. So, it is possible to
enable the async mode with several threads.

This patch may be backported to older versions.
2019-09-26 16:51:02 +02:00
Willy Tarreau
93acfa2263 MINOR: time: add timeofday_as_iso_us() to return instant time as ISO
We often need ISO time + microseconds in traces and ring buffers, thus
function does this by calling gettimeofday() and keeping a cached value
of the part representing the tv_sec value, and only rewrites the microsecond
part. The cache is per-thread so it's lockless and safe to use as-is.
Some tests already show that it's easy to see 3-4 events in a single
microsecond, thus it's likely that the nanosecond version will have to
be implemented as well. But certain comments on the net suggest that
some parsers are having trouble beyond microsecond, thus for now let's
stick to the microsecond only.
2019-09-26 08:13:38 +02:00
Olivier Houchard
bba1a263c5 BUG/MEDIUM: tasklets: Make sure we're waking the target thread if it sleeps.
Now that we can wake tasklet for other threads, make sure that if the thread
is sleeping, we wake it up, or the tasklet won't be executed until it's
done sleeping.
That also means that, before going to sleep, and after we put our bit
in sleeping_thread_mask, we have to check that nobody added a tasklet for
us, just checking for global_tasks_mask isn't enough anymore.
2019-09-24 14:58:45 +02:00
Willy Tarreau
d022e9c98b MINOR: task: introduce a thread-local "sched" variable for local scheduler stuff
The aim is to rassemble all scheduler information related to the current
thread. It simply points to task_per_thread[tid] without having to perform
the operation at each time. We save around 1.2 kB of code on performance
sensitive paths and increase the request rate by almost 1%.
2019-09-24 11:23:30 +02:00
Willy Tarreau
d66d75656e MINOR: task: split the tasklet vs task code in process_runnable_tasks()
There are a number of tests there which are enforced on tasklets while
they will never apply (various handlers, destroyed task or not, arguments,
results, ...). Instead let's have a single TASK_IS_TASKLET() test and call
the tasklet processing function directly, skipping all the rest.

It now appears visible that the only unneeded code is the update to
curr_task that is never used for tasklets, except for opportunistic
reporting in the debug handler, which can only catch si_cs_io_cb,
which in practice doesn't appear in any report so the extra cost
incurred there is pointless.

This change alone removes 700 bytes of code, mostly in
process_runnable_tasks() and increases the performance by about
1%.
2019-09-24 11:23:30 +02:00
Willy Tarreau
2bd65a781e OPTIM: listeners: use tasklets for the multi-queue rings
Now that we can wake up a remote thread's tasklet, it's way more
interesting to use a tasklet than a task in the accept queue, as it
will avoid passing through all the scheduler. Just doing this increases
the accept rate by about 4%, overall recovering the slight loss
introduced by the tasklet change. In addition it makes sure that
even a heavily loaded scheduler (e.g. many very fast checks) will
not delay a connection accept.
2019-09-24 06:57:32 +02:00