10599 Commits

Author SHA1 Message Date
Willy Tarreau
ddd8533f1b MINOR: debug: switch to SIGURG for thread dumps
The current choice of SIGPWR has the adverse effect of stopping gdb each
time it is triggered using "show threads" or example, which is not really
convenient. Let's switch to SIGURG instead, which we don't use either.
2019-05-22 11:50:48 +02:00
Tim Duesterhus
9b7a976cd6 BUG/MINOR: mworker: Fix memory leak of mworker_proc members
The struct mworker_proc is not uniformly freed everywhere, sometimes leading
to leaks of the `id` string (and possibly the other strings).

Introduce a mworker_free_child function instead of duplicating the freeing
logic everywhere to prevent this kind of issues.

This leak was reported in issue #96.

It looks like the leaks have been introduced in commit 9a1ee7ac31c56fd7d881adf2ef4659f336e50c9f,
which is specific to 2.0-dev. Backporting `mworker_free_child` might be
helpful to ease backporting other fixes, though.
2019-05-22 11:29:18 +02:00
Willy Tarreau
f61782418c CLEANUP: time: refine the test on _POSIX_TIMERS
The clock_gettime() man page says we must check that _POSIX_TIMERS is
defined to a value greater than zero, not just that it's simply defined
so let's fix this right now.
2019-05-21 20:03:03 +02:00
Olivier Houchard
aacc405c1f BUG/MEDIUM: streams: Don't switch from SI_ST_CON to SI_ST_DIS on read0.
When we receive a read0, and we're still in SI_ST_CON state (so on an
outgoing conneciton), don't immediately switch to SI_ST_DIS, or, we would
never call sess_establish(), and so the analysers will never run.
Instead, let sess_establish() handle that case, and switch to SI_ST_DIS if
we already have CF_SHUTR on the channel.

This should be backported to 1.9.
2019-05-21 19:05:09 +02:00
Emmanuel Hocdet
0ba4f483d2 MAJOR: polling: add event ports support (Solaris)
Event ports are kqueue/epoll polling class for Solaris. Code is based
on https://github.com/joyent/haproxy-1.8/tree/joyent/dev-v1.8.8.
Event ports are available only on SunOS systems derived from
Solaris 10 and later (including illumos systems).
2019-05-21 15:16:45 +02:00
Willy Tarreau
663fda4c90 BUILD: threads: only assign the clock_id when supported
I took extreme care to always check for _POSIX_THREAD_CPUTIME before
manipulating clock_id, except at one place (run_thread_poll_loop) as
found by Manu, breaking Solaris. Now fixed, no backport needed.
2019-05-21 15:14:08 +02:00
Willy Tarreau
9c8800af3b MINOR: debug: report each thread's cpu usage in "show thread"
Now we can report each thread's CPU time, both at wake up (poll) and
retrieved while dumping (now), then the difference, which directly
indicates how long the thread has been running uninterrupted. A very
high value for the diff could indicate a deadlock, especially if it
happens between two threads. Note that it may occasionally happen
that a wrong value is displayed since nothing guarantees that the
date is read atomically.
2019-05-20 21:14:14 +02:00
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Willy Tarreau
8323a375bc MINOR: threads: add a thread-local thread_info pointer "ti"
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
2019-05-20 21:14:12 +02:00
Willy Tarreau
624dcbf41e MINOR: threads: always place the clockid in the struct thread_info
It will be easier to deal with the internal API to always have it.
2019-05-20 21:13:01 +02:00
Willy Tarreau
5a6e2245fa REORG: threads: move the struct thread_info from global.h to hathreads.h
It doesn't make sense to keep this struct thread_info in global.h, it
causes difficulties to access its contents from hathreads.h, let's move
it to the threads where it ought to have been created.
2019-05-20 20:00:25 +02:00
Willy Tarreau
a9f9fc9e5b MINOR: debug: make ha_panic() report threads starting at 1
Internally they start at zero but everywhere (config, dumps) we show
them starting at 1, so let's fix the confusion.
2019-05-20 17:46:14 +02:00
Willy Tarreau
3710105945 MINOR: tools: provide a may_access() function and make dump_hex() use it
It's a bit too easy to crash by accident when using dump_hex() on any
area. Let's have a function to check if the memory may safely be read
first. This one abuses the stat() syscall checking if it returns EFAULT
or not, in which case it means we're not allowed to read from there. In
other situations it may return other codes or even a success if the
area pointed to by the file exists. It's important not to abuse it
though and as such it's tested only once per output line.
2019-05-20 16:59:37 +02:00
Willy Tarreau
6bdf3e9b11 MINOR: debug/cli: add some debugging commands for developers
When haproxy is built with DEBUG_DEV, the following commands are added
to the CLI :

  debug dev close <fd>        : close this file descriptor
  debug dev delay [ms]        : sleep this long
  debug dev exec  [cmd] ...   : show this command's output
  debug dev exit  [code]      : immediately exit the process
  debug dev hex   <addr> [len]: dump a memory area
  debug dev log   [msg] ...   : send this msg to global logs
  debug dev loop  [ms]        : loop this long
  debug dev panic             : immediately trigger a panic
  debug dev tkill [thr] [sig] : send signal to thread

These are essentially aimed at helping developers trigger certain
conditions and are expected to be complemented over time.
2019-05-20 16:59:30 +02:00
Willy Tarreau
56131ca58e MINOR: debug: implement ha_panic()
This function dumps all existing threads using the thread dump mechanism
then aborts. This will be used by the lockup detection and by debugging
tools.
2019-05-20 16:51:30 +02:00
Willy Tarreau
9fc5dcbd71 MINOR: tools: add dump_hex()
This is used to dump a memory area into a buffer for debugging purposes.
2019-05-20 16:51:30 +02:00
Willy Tarreau
da5a63f8f1 CLEANUP: stream: remove an obsolete debugging test
The test consisted in checking that there was always a timeout on a
stream's task and was only enabled when built in development mode,
but 1) it is never tested and 2) if it had been tested it would have
been noticed that it triggers a bit too easily on the CLI. Let's get
rid of this old one.
2019-05-20 16:19:40 +02:00
Willy Tarreau
91e6df01fa MINOR: threads: add each thread's clockid into the global thread_info
This is the per-thread CPU runtime clock, it will be used to measure
the CPU usage of each thread and by the lockup detection mechanism. It
must only be retrieved at the beginning of run_thread_poll_loop() since
the thread must already have been started for this. But it must be done
before performing any per-thread initcall so that all thread init
functions have access to the clock ID.

Note that it could make sense to always have this clockid available even
in non-threaded situations and place the process' clock there instead.
But it would add portability issues which are currently easy to deal
with by disabling threads so it may not be worth it for now.
2019-05-20 11:42:25 +02:00
Willy Tarreau
522cfbc1ea MINOR: init/threads: make the global threads an array of structs
This way we'll be able to store more per-thread information than just
the pthread pointer. The storage became an array of struct instead of
an allocated array since it's very small (typically 512 bytes) and not
worth the hassle of dealing with memory allocation on this. The array
was also renamed thread_info to make its intended usage more explicit.
2019-05-20 11:37:57 +02:00
Willy Tarreau
64a47b943c CLEANUP: memory: make the fault injection code use the OTHER_LOCK label
The mem_should_fail() function sets a lock while it's building its
messages, and when this was done there was no relevant label available
hence the confusing use of START_LOCK. Now OTHER_LOCK is available for
such use cases, so let's switch to this one instead as START_LOCK is
going to disappear.
2019-05-20 11:26:12 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we don't need anymore the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
3078e9f8e2 MINOR: threads/init: synchronize the threads startup
It's a bit dangerous to let threads initialize at different speeds on
startup. Some are still in their init functions while others area already
running. It was even subject to some race condition bugs like the one
fixed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on
startup").

Here in order to secure all this, we take a very simplistic approach
consisting in using half of the rendez-vous point, which is made
exactly for this purpose : we first initialize the mask of the threads
requesting a rendez-vous to the mask of all threads, and we simply call
thread_release() once the init is complete. This guarantees that no
thread will go further than the initialization code during this time.

This could even safely be backported if any other issue related to an
init race was discovered in a stable release.
2019-05-20 11:26:12 +02:00
William Lallemand
7b302d8dd5 MINOR: init: setenv HAPROXY_CFGFILES
Set the HAPROXY_CFGFILES environment variable which contains the list of
configuration files used to start haproxy, separated by semicolon.
2019-05-20 11:21:00 +02:00
Willy Tarreau
c7091d89ae MEDIUM: debug/threads: implement an advanced thread dump system
The current "show threads" command was too limited as it was not possible
to dump other threads' detailed states (e.g. their tasks). This patch
goes further by using thread signals so that each thread can dump its
own state in turn into a shared buffer provided by the caller. Threads
are synchronized using a mechanism very similar to the rendez-vous point
and using this method, each thread can safely dump any of its contents
and the caller can finally report the aggregated ones from the buffer.

It is important to keep in mind that the list of signal-safe functions
is limited, so we take care of only using chunk_printf() to write to a
pre-allocated buffer.

This mechanism is enabled by USE_THREAD_DUMP and is enabled by default
on Linux 2.6.28+. On other platforms it falls back to the previous
solution using the loop and the less precise dump.
2019-05-17 17:16:20 +02:00
Willy Tarreau
0ad46fa6f5 MINOR: stream: detach the stream from its own task on stream_free()
This makes sure that the stream is not visible from its own task just
before starting to free some of its components. This way we have the
guarantee that a stream found in a task list is totally valid and can
safely be dereferenced.
2019-05-17 17:16:20 +02:00
Willy Tarreau
01f3489752 MINOR: task: put barriers after each write to curr_task
This one may be watched by signal handlers, we don't want the compiler
to optimize its assignment away at the end of the loop and leave some
wandering pointers there.
2019-05-17 17:16:20 +02:00
Willy Tarreau
38171daf21 MINOR: thread: implement ha_thread_relax()
At some places we're using a painful ifdef to decide whether to use
sched_yield() or pl_cpu_relax() to relax in loops, this is hardly
exportable. Let's move this to ha_thread_relax() instead and une
this one only.
2019-05-17 17:16:20 +02:00
Willy Tarreau
20db9115dc BUG/MINOR: debug: don't check the call date on tasklets
tasklets don't have a call date, so when a tasklet is cast into a task
and is present at the end of a page we run a risk of dereferencing
unmapped memory when dumping them in ha_task_dump(). This commit
simplifies the test and uses to distinct calls for tasklets and tasks.
No backport is needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
5cf64dd1bd MINOR: debug: make ha_thread_dump() and ha_task_dump() take a buffer
Instead of having them dump into the trash and initialize it, let's have
the caller initialize a buffer and pass it. This will be convenient to
dump multiple threads at once into a single buffer.
2019-05-17 17:16:20 +02:00
Willy Tarreau
14a1ab75d0 BUG/MINOR: debug: make ha_task_dump() actually dump the requested task
It used to only dump the current task, which isn't different for now
but the purpose clearly is to dump the requested task. No backport is
needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
231ec395c1 BUG/MINOR: debug: make ha_task_dump() always check the task before dumping it
For now it cannot happen since we're calling it from a task but it will
break with signals. No backport is needed.
2019-05-17 17:16:20 +02:00
Olivier Houchard
6db1699f77 BUG/MEDIUM: streams: Try to L7 retry before aborting the connection.
In htx_wait_for_response, in case of error, attempt a L7 retry before
aborting the connection if the TX_NOT_FIRST flag is set.
If we don't do that, then we wouldn't attempt L7 retries after the first
request, or if we use HTTP/2, as with HTTP/2 that flag is always set.
2019-05-17 15:49:21 +02:00
Olivier Houchard
ce1a0292bf BUG/MEDIUM: streams: Don't use CF_EOI to decide if the request is complete.
In si_cs_send(), don't check CF_EOI on the request channel to decide if the
request is complete and if we should save the buffer to eventually attempt
L7 retries. The flag may not be set yet, and it may too be set to early,
before we're done modifying the buffer. Instead, get the msg, and make sure
its state is HTTP_MSG_DONE.
That way we will store the request buffer when sending it even in H2.
2019-05-17 15:49:21 +02:00
Willy Tarreau
4e2b646d60 MINOR: cli/debug: add a thread dump function
The new function ha_thread_dump() will dump debugging info about all known
threads. The current thread will contain a bit more info. The long-term goal
is to make it possible to use it in signal handlers to improve the accuracy
of some dumps.

The function dumps its output into the trash so as it was trivial to add,
a new "show threads" command appeared on the CLI.
2019-05-16 18:06:45 +02:00
Willy Tarreau
58d9621fc8 MINOR: cli/activity: show the dumping thread ID starting at 1
Both the config and gdb report thread IDs starting at 1, so better do the
same in "show activity" to limit confusion. We also display the full
permitted range.

This could be backported to 1.9 since it was present there.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
3506dae342 MEDIUM: Make 'resolution_pool_size' directive fatal
This directive never appeared in a stable release and instead was
introduced and deprecated within 1.8-dev. While it technically could
be outright removed we detect it and error out for good measure.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
10c6c16cde MEDIUM: Make 'option forceclose' actually warn
It is deprecated since 315b39c3914f4c2301ce19a93564566caa2ede50 (1.9-dev),
but only was deprecated in the docs.

Make it warn when being used and remove it from the docs.
2019-05-16 18:02:03 +02:00
Christopher Faulet
c1f40dd492 BUG/MINOR: http_fetch: Rely on the smp direction for "cookie()" and "hdr()"
A regression was introduced in the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get
the channel depending on the keyword used") on the samples "cookie()" and
"hdr()". Unlike other samples manipulating the HTTP headers, these ones depend
on the sample direction. To fix the bug, these samples use now their own
functions. Depending on the sample direction, they call smp_fetch_cookie() and
smp_fetch_hdr() with the appropriate keyword.

Thanks to Yves Lafon to report this issue.

This patch must be backported wherever the commit 89dc49935 was backported. For
now, 1.9 and 1.8.
2019-05-16 11:31:28 +02:00
Olivier Houchard
35d116885d MINOR: connections: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
00b8f7c60b MINOR: h1: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
f8338151a3 MINOR: h2: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Christopher Faulet
fa922f03a3 BUG/MEDIUM: mux-h2: Set EOI on the conn_stream during h2_rcv_buf()
Just like CS_FL_REOS previously, the CS_FL_EOI flag is abused as a proxy
for H2_SF_ES_RCVD. The problem is that this flag is consumed by the
application layer and is set immediately when an end of stream was met,
which is too early since the application must retrieve the rxbuf's
contents first. The effect is that some transfers are truncated (mostly
the first one of a connection in most tests).

The problem of mixing CS flags and H2S flags in the H2 mux is not new
(and is currently being addressed) but this specific one was emphasized
in commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") which was backported to 1.9. Note that other
flags, particularly CS_FL_REOS still need to be asynchronously reported,
though their impact seems more limited for now.

This patch makes sure that all internal uses of CS_FL_EOI are replaced
with a test on H2_SF_ES_RCVD (as there is a 1-to-1 equivalence) and that
CS_FL_EOI is only reported once the rxbuf is empty.

This should ideally be backported to 1.9 unless it causes too much
trouble due to the recent changes in this area, as 1.9 *seems* not
to be directly affected by this bug.
2019-05-14 15:47:57 +02:00
Willy Tarreau
99ad1b3e8c MINOR: mux-h2: stop relying on CS_FL_REOS
This flag was introduced early in 1.9 development (a3f7efe00) to report
the fact that the rxbuf that was present on the conn_stream was followed
by a shutr. Since then the rxbuf moved from the conn_stream to the h2s
(638b799b0) but the flag remained on the conn_stream. It is problematic
because some state transitions inside the mux depend on it, thus depend
on the CS, and as such have to test for its existence before proceeding.

This patch replaces the test on CS_FL_REOS with a test on the only
states that set this flag (H2_SS_CLOSED, H2_SS_HREM, H2_SS_ERROR).
The few places where the flag was set were removed (the flag is not
used by the data layer).
2019-05-14 15:47:57 +02:00
Willy Tarreau
4c688eb8d1 MINOR: mux-h2: add macros to check multiple stream states at once
At many places we need to test for several stream states at once, let's
have macros to make a bit mask from a state to ease this.
2019-05-14 15:47:57 +02:00
Willy Tarreau
f8fe3d63f0 CLEANUP: mux-h2: don't test for impossible CS_FL_REOS conditions
This flag is currently set when an incoming close was received, which
results in the stream being in either H2_SS_HREM, H2_SS_CLOSED, or
H2_SS_ERROR states, so let's remove the test for the OPEN and HLOC
cases.
2019-05-14 15:47:57 +02:00
Willy Tarreau
3cf69fe6b2 BUG/MINOR: mux-h2: make sure to honor KILL_CONN in do_shut{r,w}
If the stream closes and quits while there's no room in the mux buffer
to send an RST frame, next time it is attempted it will not lead to
the connection being closed because the conn_stream will have been
released and the KILL_CONN flag with it as well.

This patch reserves a new H2_SF_KILL_CONN flag that is copied from
the CS when calling shut{r,w} so that the stream remains autonomous
on this even when the conn_stream leaves.

This should ideally be backported to 1.9 though it depends on several
previous patches that may or may not be suitable for backporting. The
severity is very low so there's no need to insist in case of trouble.
2019-05-14 15:47:57 +02:00
Willy Tarreau
aebbe5ef72 MINOR: mux-h2: make h2s_wake_one_stream() not depend on temporary CS flags
In h2s_wake_one_stream() we used to rely on the temporary flags used to
adjust the CS to determine the new h2s state. This really is not convenient
and creates far too many dependencies. This commit just moves the same
condition to the places where the temporary flags were set so that we
don't have to rely on these anymore. Whether these are relevant or not
was not the subject of the operation, what matters was to make sure the
conditions to adjust the stream's state and the CS's flags remain the
same. Later it could be studied if these conditions are correct or not.
2019-05-14 15:47:57 +02:00
Willy Tarreau
13b6c2e8b3 MINOR: mux-h2: make h2s_wake_one_stream() the only function to deal with CS
h2s_wake_one_stream() has access to all the required elements to update
the connstream's flags and figure the necessary state transitions, so
let's move the conditions there from h2_wake_some_streams().
2019-05-14 15:47:57 +02:00
Willy Tarreau
234829111f MINOR: mux-h2: make h2_wake_some_streams() not depend on the CS flags
It's problematic to have to pass some CS flags to this function because
that forces some h2s state transistions to update them just in time
while some of them are supposed to only be updated during I/O operations.

As a first step this patch transfers the decision to pass CS_FL_ERR_PENDING
from the caller to the leaf function h2s_wake_one_stream(). It is easy
since this is the only flag passed there and it depends on the position of
the stream relative to the last_sid if it was set.
2019-05-14 15:47:57 +02:00
Willy Tarreau
c3b1183f57 MINOR: mux-h2: remove useless test on stream ID vs last in wake function
h2_wake_some_streams() first looks up streams whose IDs are greater than
or equal to last+1, then checks if the id is lower than or equal to last,
which by definition will never match. Let's remove this confusing leftover
from ancient code.
2019-05-14 15:47:57 +02:00