Commit Graph

6804 Commits

Author SHA1 Message Date
Olivier Houchard
af4021e680 MEDIUM: connections: Get rid of the recv() method.
Remove the recv() method from mux and conn_stream.
The goal is to always receive from the upper layers, instead of waiting
for the connection later. For now, recv() is still called from the wake()
method, but that should change soon.
2018-09-12 17:37:55 +02:00
Olivier Houchard
4cf7fb148f MEDIUM: connections/mux: Add a recv and a send+recv wait list.
For struct connection, struct conn_stream, and for the h2 mux, add 2 new
lists, one that handles waiters for recv, and one that handles waiters for
recv and send. That way we can ask to subscribe for either recv or send.
2018-09-12 17:37:55 +02:00
Olivier Houchard
524344b4e0 MEDIUM: connections: Don't reset the polling flags in conn_fd_handler().
Resetting the polling flags at the end of conn_fd_handler() shouldn't be
needed anymore, and it will create problem when we won't handle send/recv
from conn_fd_handler() anymore.
2018-09-12 17:37:55 +02:00
William Lallemand
cd5c944ea5 BUILD: fix build without thread
Cyril Bonté reported that commit f9cc07c25b broke the build without
thread.

We don't need to initialise tid = 0 in mworker_loop, so we could
completely remove it.
2018-09-12 13:59:00 +02:00
Willy Tarreau
2c096c3b7a BUG/MINOR: h2: report asynchronous end of stream on closed connections
Christopher noticed that the CS_FL_EOS to CS_FL_REOS conversion was
incomplete : when the connectionis closed, we mark the streams with EOS
instead of REOS, causing the loss of any possibly pending data. At the
moment it's not an issue since H2 is used only with a client, but with
servers it could be a real problem if servers close the connection right
after sending their response.

This patch should be backported to 1.8.
2018-09-12 09:45:54 +02:00
Frédéric Lécaille
5afb3cfbcc BUG/MINOR: server: Crash when setting FQDN via CLI.
This patch ensures that a DNS resolution may be launched before
setting a server FQDN via the CLI. Especially, it checks that
resolvers was set.

A LEVEL 4 reg testing file is provided.

Thanks to Lukas Tribus for having reported this issue.

Must be backported to 1.8.
2018-09-12 07:41:41 +02:00
William Lallemand
2fe7dd0b2e MEDIUM: protocol: sockpair protocol
This protocol is based on the uxst one, but it uses socketpair and FD
passing insteads of a connect()/accept().

The "sockpair@" prefix has been implemented for both bind and server
keywords.

When HAProxy wants to connect through a sockpair@, it creates 2 new
sockets using the socketpair() syscall and pass one of the socket
through the FD specified on the server line.

On the bind side, haproxy will receive the FD, and will use it like it
was the FD of an accept() syscall.

This protocol was designed for internal communication within HAProxy
between the master and the workers, but it's possible to use it
externaly with a wrapper and pass the FD through environment variabls.
2018-09-12 07:20:17 +02:00
William Lallemand
2d3f8a411f MEDIUM: protocol: use a custom AF_MAX to help protocol parser
It's possible to have several protocols per family which is a problem
with the current way the protocols are stored.

This allows to register a new protocol in HAProxy which is not a
protocol in the strict socket definition. It will be used to register a
SOCK_STREAM protocol using socketpair().
2018-09-12 07:12:27 +02:00
Olivier Houchard
5ab33944cd BUG/MAJOR: kqueue: Don't reset the changes number by accident.
In _update_fd(), if the fd wasn't polled, and we don't want it to be polled,
we just returned 0, however, we should return changes instead, or all previous
changes will be lost.

This should be backported to 1.8.
2018-09-11 14:53:00 +02:00
Willy Tarreau
ab813a4b05 REORG: http: move some header value processing functions to http.c
The following functions only deal with header field values and are agnostic
to the HTTP version so they were moved to http.c :

http_header_match2(), find_hdr_value_end(), find_cookie_value_end(),
extract_cookie_value(), parse_qvalue(), http_find_url_param_pos(),
http_find_next_url_param().

Those lacking the "http_" prefix were modified to have it.
2018-09-11 10:30:25 +02:00
Willy Tarreau
e10cd48a83 REORG: http: move the log encoding tables to log.c
There are 3 tables in proto_http which are used exclusively by logs :
hdr_encode_map[], url_encode_map[] and http_encode_map[]. They indicate
what characters are safe to be emitted in logs depending on the part of
the message where they are placed. Let's move this to log.c, as well as
its initialization. It's worth noting that the rfc5424 map was already
initialized there.
2018-09-11 10:30:25 +02:00
Willy Tarreau
04f1e2d202 REORG: http: move error codes production and processing to http.c
These error codes and messages are agnostic to the version, even if
they are represented as HTTP/1.0 messages. Ultimately they will have
to be transformed into internal HTTP messages to be used everywhere.

The HTTP/1.1 100 Continue message was turned to an IST and the local
copy in the Lua code was removed.
2018-09-11 10:30:25 +02:00
Willy Tarreau
6b952c8101 REORG: http: move http_get_path() to http.c
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time so that allowed to slightly
simplify http_get_path_from_string() which had to look up the end pointer
previously and which is not needed anymore.
2018-09-11 10:30:25 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
William Lallemand
123f1f6441 MEDIUM: mworker: call per_thread deinit in mworker_reload()
We need to clean the FDs registered manually in the poller to avoid FD
leaking during a reload of the master.

This patch call the per thread deinit function which close the thread
waker pipe.
2018-09-11 10:23:24 +02:00
William Lallemand
333d7979cd MEDIUM: threads: close the thread-waker pipe during deinit
In order to avoid FD leaking, we close the pipe used to wake the threads
up during per thread deinit.
2018-09-11 10:23:24 +02:00
William Lallemand
e22f11ff47 MINOR: mworker: keep and clean the listeners
Keep the listeners that should be used in the master process and clean
them in the workers.
2018-09-11 10:23:24 +02:00
William Lallemand
bc19305e53 MEDIUM: mworker: replace the master pipe by socketpairs
In order to communicate with the workers, the master pipe has been
replaced by a socketpair() per worker.

The goal is to use these sockets as stats sockets and be able to access
them from the master.

When reloading, the master serialize the information of the workers and
put them in a environment variable. Once the master has been reexecuted
it unserialize that information and it is capable of closing the FDs of
the leaving children.
2018-09-11 10:21:58 +02:00
William Lallemand
f9cc07c25b MEDIUM: mworker: master wait mode use its own initialization
The master now use a poll loop, which should be initialized even in wait
mode. We need to init some variables if we didn't success to load the
configuration file.
2018-09-11 10:21:58 +02:00
William Lallemand
de0ff5ab20 MINOR: mworker: don't deinit the poller fd when in wait mode
If haproxy failed to load its configuration, the process is reexecuted
and it did not init the poller. So we must not try to deinit the poller
before the exec().
2018-09-11 10:21:58 +02:00
William Lallemand
d3801c1c21 MEDIUM: startup: unify signal init between daemon and mworker mode
The signals are now unblocked only once the configuration have been
parsed.
2018-09-11 10:21:58 +02:00
William Lallemand
242aae96c7 MEDIUM: mworker: never block SIG{TERM,INT} during reload
The master should be able to be killed even if the reload is not
finished.
2018-09-11 10:21:58 +02:00
William Lallemand
ebf304f8dd MEDIUM: mworker: block SIGCHLD until the master is ready
With the new way of handling the signals in the master worker, we are
are not staying in a waitpid() loop. Which means that we need to catch the
SIGCHLD signals to call waitpid().

The problem is when the master is reloading, this signal is neither
registered nor blocked so we lost all signals between the restart and
the call to mworker_loop().

This patch blocks the SIGCHLD signals before the reloading and ensure
it's not unblocked before the master registered the SIGCHLD handler.
2018-09-11 10:21:58 +02:00
William Lallemand
91c13b696a MINOR: mworker: mworker_cleanlisteners() delete the listeners
The mworker_cleanlisteners() function now remove the listeners, we don't
need them in the master for now.
2018-09-11 10:21:58 +02:00
William Lallemand
3da9769ee4 BUG/MINOR: mworker: no need to stop peers for each proxy
The mworker_cleanlisteners() was cleaning the peers in the proxy loop,
which is useless since we need to stop the peers only once.
2018-09-11 10:21:58 +02:00
William Lallemand
b3f2be338b MEDIUM: mworker: use the haproxy poll loop
In order to reorganize the code of the master worker, the mworker_wait()
function which was the main function was split. This function was
handling a wait() loop, but it does not need it anymore since the code
will use the poll loop of haproxy instead.

The function was split in several functions:

- mworker_catch_sigterm() which is a signal handler for SIGTERM ans
SIGUSR1 that sends the signals to the workers
- mworker_catch_sigchld() which is the code handling the leaving of a
child
- mworker_catch_sighup which basically call the mworker_restart()
function
- mworker_loop() which is the function calling the main poll loop in the
master
2018-09-11 10:21:58 +02:00
William Lallemand
73e1dfcfdf MEDIUM: mworker: remove register/unregister signal functions
Remove the register and unregister signal functions specifics to the
master worker, because that should be done with the generic ones.
2018-09-11 10:21:58 +02:00
Willy Tarreau
4bc7d90d3b MEDIUM: snapshot: merge the captured data after the descriptor
Instead of having a separate area for the captured data, we now have a
contigous block made of the descriptor and the data. At the moment, since
the area is dynamically allocated, we can adjust its size to what is
needed, but the idea is to quickly switch to a pool and an LRU list.
2018-09-07 20:07:17 +02:00
Willy Tarreau
c55015ee5b MEDIUM: snapshots: dynamically allocate the snapshots
Now upon error we dynamically allocate the snapshot instead of overwriting
it. This way there is no more memory wasted in the proxy to hold the two
error snapshot descriptors. Also an appreciable side effect of this is that
the proxy's lock is only taken during the pointer swap, no more while copying
the buffer's contents. This saves 480 bytes of memory per proxy.
2018-09-07 19:59:58 +02:00
Willy Tarreau
36b2736a69 BUG/MEDIUM: snapshot: take the proxy's lock while dumping errors
The proxy's lock it held while filling the error but not while dumping
it, so it's possible to dereference pointers being replaced, typically
server pointers. The risk is very low and unlikely but not inexistent.

Since "show errors" is rarely used in parallel, let's simply grab the
proxy's lock while dumping. Ideally we should use an R/W lock here but
it will not make any difference.

This patch must be backported to 1.8, but the code is in proto_http.c
there, though mostly similar.
2018-09-07 19:55:44 +02:00
Willy Tarreau
ddb68ac69e REORG: cli: move the "show errors" handler from http to proxy
There's nothing HTTP-specific there anymore at all, let's move this
to the proxy where it belongs.
2018-09-07 18:36:50 +02:00
Willy Tarreau
fd9419d560 MINOR: http: remove the pointer to the error snapshot in http_capture_bad_message()
It's not needed anymore as we know the side thanks to the channel. This
will allow the proxy generic code to better manage the error snapshots.
2018-09-07 18:36:04 +02:00
Willy Tarreau
ef3ca73fc3 MINOR: http: make the HTTP error capture rely on the generic proxy code
Now that we have a generic error capture function, let's simplify
http_capture_bad_message() to make use of it. At this point the API
is not changed at all, but it could be further simplified.
2018-09-07 18:36:04 +02:00
Willy Tarreau
75fb65a51f MINOR: proxy: add a new generic proxy_capture_error()
This function now captures an error regardless of its side and protocol.
The caller must pass a number of elements and may pass a protocol-specific
structure and a callback to display it. Later this function may deal with
more advanced allocation techniques to avoid allocating as many buffers
as proxies.
2018-09-07 18:36:04 +02:00
Willy Tarreau
7ccdd8dad9 MEDIUM: snapshot: implement a show() callback and use it for HTTP
The HTTP dumps are now configurable in the code : "show errors" now
calls a protocol-specific function to emit the decoded output. For
now only HTTP is implemented.
2018-09-07 18:36:01 +02:00
Willy Tarreau
0b5b480594 MEDIUM: snapshot: start to reorder the HTTP snapshot output a little bit
The output of "show errors" was slightly reordered to split the HTTP part
in a single chunk_appendf() call. The useless buffer total input was
replaced to report the buffer's start offset, which is the offset in the
stream of the first input byte (thus not counting output). Also it was
the opportunity to stop calling the stream "session".
2018-09-07 17:48:14 +02:00
Willy Tarreau
7480f323ff MINOR: snapshot: split the error snapshots into common and proto-specific parts
The idea will be to make the error snapshot feature accessible to other
protocols than just HTTP. This patch only introduces an "http_snapshot"
structure and renames a few fields to make things more explicit. The
HTTP part was installed inside a union so that we can easily add more
protocols in the future.
2018-09-07 16:13:45 +02:00
Willy Tarreau
5865a8fe69 MINOR: snapshot: restart on the event ID and not the stream ID
The snapshots have the ability to restart a partial dump and they use
the stream ID as the restart point. Since it's purely HTTP, let's use
the event ID instead.
2018-09-07 15:00:43 +02:00
Willy Tarreau
e9e878a056 BUG/MINOR: http/threads: atomically increment the error snapshot ID
Let's use an atomic increment for the error snapshot, as we'd rather
not assign the same ID to two errors happening in parallel. It's very
unlikely that it will ever happen though.

This patch must be backported to 1.8 with the other one it relies on
("MINOR: thread: implement HA_ATOMIC_XADD()").
2018-09-07 11:31:58 +02:00
Baptiste Assmann
044fd5bc2c BUG/MINOR: dns: check and link servers' resolvers right after config parsing
On the Mailing list, Marcos Moreno reported that haproxy configuration
validation (through "haproxy -c cfgfile") does not detect when a
resolvers section does not exist for a server.
That said, this checking is done after HAProxy has started up.

The problem is that this can create production issue, since init
script can't detect the problem before starting / reloading HAProxy.

To fix this issue, this patch registers the function which validates DNS
configuration validity and run it right after configuration parsing is
finished (through cfg_register_postparser()).
Thanks to it, now "haproxy -c cfgfile" will fail when a server
points to a non-existing resolvers section (or any other validation made
by the function above).

Backport status: 1.8
2018-09-06 19:41:30 +02:00
Willy Tarreau
be373150c7 MINOR: connection: make the initialization more consistent
Sometimes a connection is prepared before the target is set, sometimes
after. There's no real rule since the few functions involved operate on
different and independent fields. Soon we'll benefit from knowing the
target at the connection layer, in order to figure the associated proxy
and retrieve the various parameters (timeouts etc). This patch slightly
reorders a few calls to conn_prepare() so that we can make sure that the
target is always known to the mux.
2018-09-06 11:45:30 +02:00
Willy Tarreau
950a8a6fde BUG/MINOR: h1: fix buffer shift after realignment
Commit 5e74b0b ("MEDIUM: h1: port to new buffer API.") introduced a
minor bug by which a buffer's head could stay shifted by the amount
of removed CRLF if it started with empty lines. This would cause the
second request (or response) not to work until it would receive a few
extra characters. This most only impacts requests sent by hand though.

This is purely 1.9, no backport is needed.
2018-09-06 10:48:15 +02:00
Willy Tarreau
22de8d3e01 MEDIUM: h2: produce some logs on early errors that prevent streams from being created
The h2 mux currently lacks some basic transparency. Some errors cause the
connection to be aborted but they couldn't be reported. With this patch,
almost all situations where an error will cause a stream or connection to
be aborted without the ability for an existing stream to report it will be
reported in the logs. This at least provides a solution to monitor the
activity and abnormal traffic.
2018-09-06 09:43:41 +02:00
Willy Tarreau
5383935856 MINOR: log: provide a function to emit a log for a session
The new function sess_log() only needs a session to emit a log. It will
ignore the parts that depend on the stream. It is usable to emit a log
to report early errors in muxes. These ones will typically mention
"<BADREQ>" for the request and 0 for the HTTP status code.
2018-09-06 09:43:41 +02:00
Willy Tarreau
09bb27cdea MEDIUM: log: make sess_build_logline() support being called with no stream
Till now it was impossible to emit logs from the lower layers only because
a stream was mandatory. From now on it will at least be possible to emit a
log to report a bad request or some timings for example. When the stream
is null, sess_build_logline() will use default values and will extract the
timing information from the session just like stream_new() does, so the
resulting log line is perfectly valid.

The termination state will indicate a proxy error during the request phase
since it is the only realistic use for such a call with no stream.
2018-09-06 09:43:06 +02:00
Willy Tarreau
5cacab63e1 MINOR: log: use zero as the request counter if there is no stream
When s==NULL we don't have any assigned request counter. Ideally we
should proceed exactly like when a stream is initialized and assign
a unique value. For now we only place it into a local variable.
2018-09-05 20:01:23 +02:00
Willy Tarreau
b8bc52522c MINOR: log: keep a copy of s->flags early to avoid a dereference
By placing s->flags into a local variable we'll be able to force it new
values when s is NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
02fdf4f77b MINOR: log: use NULL for the unique_id if there is no stream
Now s->unique_id is used as NULL (not set) if s==NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
abd71a5c2e MINOR: log: don't check the stream-int's conn_retries if the stream is NULL
Let's simply forget the conn_retries when there is no stream since we
haven't tried to connect yet.
2018-09-05 20:01:23 +02:00
Willy Tarreau
e1809dfdaf MINOR: log: be sure not to dereference a null stream for a target
The supported targets are either a server or an applet, so both are
NULL if the stream is NULL.
2018-09-05 20:01:23 +02:00
Willy Tarreau
d4f9166f4e MINOR: log: do not dereference a null stream to access captures
If the stream is null, let's simply not check captures. That's already
done if there is no capture.
2018-09-05 20:01:23 +02:00
Willy Tarreau
2393c5b6a9 MINOR: log: keep a copy of the backend connection early in sess_build_logline()
This way we can avoid dereferencing a possibly inexisting stream.
2018-09-05 20:01:23 +02:00
Willy Tarreau
26ffa8544d CLEANUP: log: make the low_level lf_{ip,port,text,text_len} functions take consts
These ones were abusively relying on variables making it hard to integrate
with const arguments.
2018-09-05 20:01:23 +02:00
Willy Tarreau
372ac5abff MINOR: log: don't unconditionally pick log info from s->logs
We'll soon support s==NULL so let's use an intermediary variable for the
logs structure. For now it only points to s->logs but will support a local
variable as an alternative later.
2018-09-05 20:01:23 +02:00
Willy Tarreau
56a91dddc6 MINOR: log: make sess_build_logline() not dereference a NULL stream for txn
If the stream is NULL, the txn is NULL as well. This condition is already
handled everywhere else.
2018-09-05 20:01:23 +02:00
Willy Tarreau
a21c0e60d2 MINOR: log: make the backend fall back to the frontend when there's no stream
This is already what happens before the backend is assigned, except that
now we don't need to dereference a NULL stream to figure this.
2018-09-05 20:01:23 +02:00
Willy Tarreau
43c538eab6 MINOR: log: move the log code to sess_build_logline() to add extra arguments
The current build_logline() can only be used with valid streams, which
means it is not suitable for use from muxes. We start by moving it into
another more generic function which takes the session as an argument,
to avoid complexifying all the internal API for jsut a few use cases.
This new function is not supposed to be called directly from outside so
we'll be able to instrument it to support several calling conventions.

For now the behaviour and conditions remain unchanged.
2018-09-05 20:01:23 +02:00
Willy Tarreau
a0d11b6fd5 BUG/MEDIUM: h2: fix risk of memory leak on malformated wrapped frames
While parsing a headers frame, if the frame is wrapped in the buffer
and needs to be unwrapped, it will be duplicated before being processed.
But if it contains certain combinations of invalid flags, the parser
returns without releasing the temporary buffer leading to a memory
leak.

This fix needs to be backported to 1.8.
2018-09-05 20:01:14 +02:00
Willy Tarreau
590a0514f2 BUG/MEDIUM: session: fix reporting of handshake processing time in the logs
The handshake processing time used to be stored per stream, which was
valid when there was exactly one stream per session. With H2 and
multiplexing it's not the case anymore and the reported handshake times
are wrong in the logs as it's computed between the TCP accept() and the
stream creation. Let's first move the handshake where it belongs, which
is the session.

However, this is not enough because we don't want to report an excessive
idle time either for H2 (since many requests use the connection).

So the solution used here is to have the stream retrieve sess->tv_accept
and the handshake duration when the stream is created, and let the mux
immediately reset them. This way, the handshake time becomes zero for the
second and subsequent requests in H2 (which was already the case in H1),
and the idle time exactly counts how long the connection remained unused
while it could be used, so in H1 it runs from the end of the previous
response and in H2 it runs from the end of the previous request since the
channel is already available.

This patch will need to be backported to 1.8.
2018-09-05 16:30:23 +02:00
Willy Tarreau
90a7c03ec0 BUG/MINOR: stream: use atomic increments for the request counter
The request counter is incremented when creating a new stream and when
resetting a stream, preparing for a new request. Unfortunately during
the thread migration this was missed, leading to non-atomic increments
in case threads are in use. The most visible side effect is that two
requests may have the same ID from time to time in the logs. However
the SPOE also uses this ID to route responses back to the stream so it
may also lead to occasional spurious SPOE timeouts.

Note that it still doesn't guarantee temporal unicity in the stream
identifiers since a long and a short connection could technically use
the same ID. The likeliness that this happens at the same time is almost
null (roughly threads*runqueue_depth/2^32 that it happens in the same
poll loop), but it will have to be addressed later anyway.

This patch must be backported to 1.8 with the other one it relies on
("MINOR: thread: implement HA_ATOMIC_XADD()").
2018-09-05 16:30:19 +02:00
Willy Tarreau
f16cb41d19 MINOR: tools: make date2str_log() take some consts
The "tm" and "date" field are not modified, they can be const instead
of forcing their callers to use vars.
2018-09-05 16:30:11 +02:00
Emmanuel Hocdet
9f9b0c6a7f BUG/MEDIUM: ECC cert should work with TLS < v1.2 and openssl >= 1.1.1
With openssl >= 1.1.1 and boringssl multi-cert is natively supported.
ECDSA/RSA selection is done and work correctly with TLS >= v1.2.
TLS < v1.2 have no TLSEXT_TYPE_signature_algorithms extension: ECC
certificate can't be selected, and handshake fail if no RSA cert
is present. Safe ECC certificate selection without client announcement
can be very tricky (browser compatibilty). The safer approach is to
select ECDSA certificate if no other certificate matches, like it is
with openssl < 1.1.1: certificate selection is only done via the SNI.

Thanks to Lukas Tribus for reporting this and analysing the problem.

This patch should be backported to 1.8
2018-09-04 17:47:10 +02:00
Baptiste Assmann
6d0f38f00d BUG/MEDIUM: dns/server: fix incomatibility between SRV resolution and server state file
Server state file has no indication that a server is currently managed
by a DNS SRV resolution.
And thus, both feature (DNS SRV resolution and server state), when used
together, does not provide the expected behavior: a smooth experience...

This patch introduce the "SRV record name" in the server state file and
loads and applies it if found and wherever required.

This patch applies to haproxy-dev branch only. For backport, a specific patch
is provided for 1.8.
2018-09-04 17:40:22 +02:00
Olivier Houchard
9e643ea172 BUG/MEDIUM: hlua: Don't call RESET_SAFE_LJMP if SET_SAFE_LJMP returns 0.
If SET_SAFE_LJMP returns 0, the spinlock is already unlocked, and lua_atpanic
is already set back to hlua_panic_safe, so there's no need to call
RESET_SAFE_LJMP.

This should be MFC'd into 1.8.
2018-08-31 16:14:58 +02:00
Frédéric Lécaille
54f2bcf22b BUG/MAJOR: thread: lua: Wrong SSL context initialization.
When calling ->prepare_srv() callback for SSL server which
depends on global "nbthread" value, this latter was not already parsed,
so equal to 1 default value. This lead to bad memory accesses.

Thank you to Pieter (PiBa-NL) for having reported this issue and
for having provided a very helpful reg testing file to reproduce
this issue (reg-test/lua/b00002.*).

Must be backported to 1.8.
2018-08-30 10:06:45 +02:00
Olivier Houchard
c7ffa91763 BUG/MEDIUM: stream_interface: try to call si_cs_send() earlier.
Call si_cs_send() at the beginning of si_cs_wake_cb(), instead of from
stream_int_notify-), so that if we get a connection error while trying to
send, the stream_interface will get SI_FL_ERR, the associated task will
be woken up, and the connection will be properly destroyed.

No backport needed.
2018-08-28 19:46:45 +02:00
Olivier Houchard
4501c3e099 MINOR: checks: Call wake_srv_chk() when we can finally send data.
Instead of calling __event_srv_chk_w, call wake_srv_chk(), which will then
either call tcpcheck_main() or __event_srv_chk_w().
Also make tcpcheck_main() subscribe if it can't send.
2018-08-28 19:43:57 +02:00
Olivier Houchard
594c8c5015 BUG/MEDIUM: hlua: Make sure we drain the output buffer when done.
In hlua_applet_tcp_fct(), drain the output buffer when the applet is done
running, every time we're called.
Overwise, there's a race condition, and the output buffer could be filled
after the applet ran, and as it is never cleared, the stream interface
will never be destroyed.

This should be backported to 1.8 and 1.7.
2018-08-28 16:18:34 +02:00
Patrick Hemmer
155e93e570 MINOR: Add srv_conn_free sample fetch
This adds the 'srv_conn_free([<backend>/]<server>)' sample fetch. This fetch
provides the number of available connections on the designated server.
2018-08-27 16:38:56 +02:00
Patrick Hemmer
4cdf3abaa0 MINOR: add be_conn_free sample fetch
This adds the sample fetch 'be_conn_free([<backend>])'. This sample fetch
provides the total number of unused connections across available servers in the
specified backend.
2018-08-27 14:10:16 +02:00
Patrick Hemmer
e3faf02581 BUG/MEDIUM: lua: reset lua transaction between http requests
Previously LUA code would maintain the transaction state between http
requests, resulting in things like txn:get_priv() retrieving data from
a previous request. This addresses the issue by ensuring the LUA state
is reset between requests.

Co-authored-by: Tim Düsterhus <tim@bastelstu.be>
2018-08-25 07:51:02 +02:00
Willy Tarreau
ad7f0ad1c3 BUG/MEDIUM: mux_pt: dereference the connection with care in mux_pt_wake()
mux_pt_wake() calls data->wake() which can return -1 indicating that the
connection was just destroyed. We need to check for this condition and
immediately exit in this case otherwise we dereference a just freed
connection. Note that this mainly happens on idle connections between
two HTTP requests. It can have random implications between requests as
it may lead a wrong connection's polling to be re-enabled or disabled
for example, especially with threads.

This patch must be backported to 1.8.
2018-08-24 15:48:59 +02:00
Frédéric Lécaille
83ed5d58d2 BUG/MINOR: lua: Bad HTTP client request duration.
HTTP LUA applet callback should not update the date on which the HTTP client requests
arrive. This was done just after the LUA applet has completed its job.

This patch simply removes the affected statement. The same fixe has been applied
to TCP LUA applet callback.

To reproduce this issue, as reported by Patrick Hemmer, implement an HTTP LUA applet
which sleeps a bit before replying:

  core.register_service("foo", "http", function(applet)
      core.msleep(100)
      applet:set_status(200)
      applet:start_response()
  end)

This had as a consequence to log %TR field with approximatively the same value as
the LUA sleep time.

Thank you to Patrick Hemmer for having reported this issue.

Must be backported to 1.8, 1.7 and 1.6.
2018-08-24 14:49:30 +02:00
Willy Tarreau
e215bba956 MINOR: connection: make conn_sock_drain() work for all socket families
This patch improves the previous fix by implementing the socket draining
code directly in conn_sock_drain() so that it always applies regardless
of the protocol's family. Thus it gets rid of tcp_drain().
2018-08-24 14:45:46 +02:00
Willy Tarreau
fe5d2ac65f BUG/MEDIUM: unix: provide a ->drain() function
Right now conn_sock_drain() calls the protocol's ->drain() function if
it exists, otherwise it simply tries to disable polling for receiving
on the connection. This doesn't work well anymore since we've implemented
the muxes in 1.8, and it has a side effect with keep-alive backend
connections established over unix sockets. What happens is that if
during the idle time after a request, a connection reports some data,
si_idle_conn_null_cb() is called, which will call conn_sock_drain().
This one sees there's no drain() on unix sockets and will simply disable
polling for data on the connection. But it doesn't do anything on the
conn_stream. Thus while leaving the conn_fd_handler, the mux's polling
is updated and recomputed based on the conn_stream's polling state,
which is still enabled, and nothing changes, so we see the process
use 100% CPU in this case because the FD remains active in the cache.

There are several issues that need to be addressed here. The first and
most important is that we cannot expect some protocols to simply stop
reading data when asked to drain pending data. So this patch make the
unix sockets rely on tcp_drain() since the functions are the same. This
solution is appropriate for backporting, but a better one is desired for
the long term. The second issue is that si_idle_conn_null_cb() shouldn't
drain the connection but the conn_stream.

At the moment we don't have any way to drain a conn_stream, though a flag
on rcv_buf() will do it well. Until we support muxes on the server side
it is not a problem so this part can be addressed later.

This fix must be backported to 1.8.
2018-08-24 14:42:50 +02:00
Willy Tarreau
bba81563cf MINOR: chunk: remove impossible tests on negative chunk->data
Since commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields
match the buffer struct") a chunk length is unsigned so we can remove
negative size checks.
2018-08-22 05:28:32 +02:00
Willy Tarreau
1c913e4232 BUG/MEDIUM: cli/ssl: don't store base64dec() result in the trash's length
By convenience or laziness we used to store base64dec()'s return code
into trash.data and to compare it against 0 to check for conversion
failure, but it's now unsigned since commit 843b7cb ("MEDIUM: chunks:
make the chunk struct's fields match the buffer struct"). Let's clean
this up and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:28:32 +02:00
Willy Tarreau
b406b8708f BUG/MEDIUM: connection: don't store recv() result into trash.data
Cyril Bonté discovered that the proxy protocol randomly fails since
commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields match
the buffer struct"). This is because we used to store recv()'s return
code into trash.data which is now unsigned, so it never compares as
negative against 0. Let's clean this up and test the result itself
without storing it first.

No backport is needed.
2018-08-22 05:28:32 +02:00
Willy Tarreau
2842e05c7c BUG/MEDIUM: map: don't store exp_replace() result in the trash's length
By convenience or laziness we used to store exp_replace()'s return code
into str->data. The result checks applied there compare str->data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct"). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:33 +02:00
Willy Tarreau
f6ee9dc616 BUG/MEDIUM: dns: don't store dns_build_query() result in the trash's length
By convenience or laziness we used to store dns_build_query()'s return code
into trash.data. The result checks applied there compare trash.data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct"). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
9c768fdca1 BUG/MEDIUM: http: don't store url_decode() result in the samples's length
By convenience or laziness we used to store url_decode()'s return code
into smp->data.u.str.data. The result checks applied there compare it
to 0 while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make
the chunk struct's fields match the buffer struct "). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
6e27be1a5d BUG/MEDIUM: http: don't store exp_replace() result in the trash's length
By convenience or laziness we used to store exp_replace()'s return code
into trash.data. The result checks applied there compare trash.data to -1
while it's now unsigned since commit 843b7cb ("MEDIUM: chunks: make the
chunk struct's fields match the buffer struct "). Let's clean this up
and test the result itself without storing it first.

No backport is needed.
2018-08-22 05:16:32 +02:00
Willy Tarreau
5f6333caca BUG/MINOR: chunks: do not store -1 into chunk_printf() in case of error
Since commit 843b7cb ("MEDIUM: chunks: make the chunk struct's fields
match the buffer struct") a chunk length is unsigned so we can't reliably
store -1 and check for negative values in the caller. Only one such
location was found in proto_http's http-request auth rules (which cannot
realistically fail).

No backport is needed.
2018-08-22 05:16:31 +02:00
Willy Tarreau
49725a0977 BUG/MEDIUM: check/threads: do not involve the rendez-vous point for status updates
thread_isolate() is currently being called with the server lock held.
This is not acceptable because it prevents other threads from reaching
the rendez-vous point. Now that the LB algos are thread-safe, let's get
rid of this call.

No backport is nedeed.
2018-08-21 19:54:09 +02:00
Willy Tarreau
1b87748ff5 BUG/MEDIUM: lb/threads: always properly lock LB algorithms on maintenance operations
Since commit 3ff577e ("MAJOR: server: make server state changes
synchronous again"), srv_update_status() calls the various maintenance
operations of the LB algorithms (->set_server_up, ->set_server_down,
->update_server_weight()). These ones are called with a single thread
guaranteed by the rendez-vous point, so the fact that they're lacking
some locks has no effect. However we'll need to remove the rendez-vous
point so we have to take care of properly locking all the LB algos.

The comments have been properly updated on the various functions to
mention their locking expectations. All these functions are called
with the server lock held, and all of them now support concurrent
calls by using the lbprm's lock.

This fix doesn't need to be backported at the moment, though if any
check-specific issue surfaced in 1.8, it could make sense to reuse it.
2018-08-21 19:44:53 +02:00
Willy Tarreau
deca26c452 BUG/MAJOR: queue/threads: make pendconn_redistribute not lock the server
Since commit 3ff577e ("MAJOR: server: make server state changes
synchronous again"), srv_update_status() is called with the server
lock held. It calls (among others) pendconn_redistribute() which used
to take this lock, causing CPU loops by default, or crashes if build
with -DDEBUG_THREAD. Since this function is not called from any other
place anymore, it doesn't require the lock on its own so let's simply
drop it from there.

No backport is needed, this is 1.9-specific.
2018-08-21 18:11:03 +02:00
Olivier Houchard
80c56790d9 BUG/MEDIUM: stream_interface: Call the wake callback after sending.
If we subscribed to send, and the callback is called, call the wake callback
after, so that process_stream() may be woken up if needed.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:57 +02:00
Olivier Houchard
fab7c7e91c BUG/MEDIUM: H2: Activate polling after successful h2_snd_buf().
Make sure h2_send() is called after h2_snd_buf() by activating polling.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:57 +02:00
Olivier Houchard
a6ff035770 BUG/MEDIUM: stream-int: Check if the conn_stream exist in si_cs_io_cb.
It is possible that the conn_stream gets detached from the stream_interface,
and as it subscribed to the wait list, si_cs_io_cb() gets called anyway,
so make sure we have a conn_stream before attempting to send more data.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:54 +02:00
Olivier Houchard
18a85fe602 BUG/MEDIUM: streams: Don't forget to remove the si from the wait list.
When freeing the stream, make sure we remove the stream interfaces from the
wait lists, in case it was in there.

This is 1.9-specific, no backport is needed.
2018-08-21 18:06:33 +02:00
Willy Tarreau
3bcc2699ba BUG/MEDIUM: cli/threads: protect some server commands against concurrent operations
The server-specific CLI commands "set weight", "set maxconn",
"disable agent", "enable agent", "disable health", "enable health",
"disable server" and "enable server" were not protected against
concurrent accesses. Now they take the server lock around the
sensitive part.

This patch must be backported to 1.8.
2018-08-21 15:35:31 +02:00
Willy Tarreau
46b7f53ad9 DOC: server/threads: document which functions need to be called with/without locks
At the moment it's totally unclear while reading the server's code which
functions require to be called with the server lock held and which ones
grab it and cannot be called this way. This commit simply inventories
all of them to indicate what is detected depending on how these functions
use the struct server. Only functions used at runtime were checked, those
dedicated to config parsing were skipped. Doing so already has uncovered
a few bugs on some CLI actions.
2018-08-21 14:58:25 +02:00
Willy Tarreau
a275a3710e BUG/MEDIUM: cli/threads: protect all "proxy" commands against concurrent updates
The proxy-related commands like "{enable|disable|shutdown} frontend",
"{enable|disable} dynamic-cookie", "set dynamic-cookie-key" were not
protected against concurrent accesses making their use dangerous with
threads.

This patch must be backported to 1.8.
2018-08-21 14:58:25 +02:00
Willy Tarreau
eeba36b3af BUG/MEDIUM: server: update our local state before propagating changes
Commit 3ff577e ("MAJOR: server: make server state changes synchronous again")
reintroduced synchronous server state changes. However, during the previous
change from synchronous to asynchronous, the server state propagation was
placed at the end of the function to ease the code changes, and the commit
above didn't put it back at its place. This has resulted in propagated
states to be incomplete. For example, making a server leave maintenance
would make it up but would leave its tracking servers down because they
see their tracked server is still down.

Let's just move the status update right to its place. It also adds the
benefit of reporting state changes in the order they appear and not in
reverse.

No backport is needed.
2018-08-21 08:29:25 +02:00
Cyril Bonté
7ee465f1ad BUG/MINOR: lua: fix extra 500ms added to socket timeouts
Since commit #56cc12509, haproxy accepts double values for timeouts. The
value is then converted to  milliseconds before being rounded up and cast
to int. The issue is that to round up the value, a constant value of 0.5
is added to it, but too early in the conversion, resulting in an
additional 500ms to the value. We are talking about a precision of 1ms,
so we can safely get rid of this rounding trick and adjust resulting
timeouts equal to 0 to a minimum of 1ms.

This patch is specific to the 1.9 branch and doesn't require to be
backported.
2018-08-19 22:11:28 +02:00
Cyril Bonté
7bb6345497 BUG/MEDIUM: lua: socket timeouts are not applied
Sachin Shetty reported that socket timeouts set in LUA code have no effect.
Indeed, connect timeout is never modified and is always set to its default,
set to 5 seconds. Currently, this patch will apply the specified timeout
value to the connect timeout.
For the read and write timeouts, the issue is that the timeout is updated but
the expiration dates were not updated.

This patch should be backported up to the 1.6 branch.
2018-08-18 00:23:52 +02:00
Olivier Houchard
19bdf2428d MINOR: tasks: Don't special-case when nbthreads == 1
Instead of checking if nbthreads == 1, just and thread_mask with
all_threads_mask to know if we're supposed to add the task to the local or
the global runqueue.
2018-08-17 14:50:37 +02:00
Bertrand Jacquin
a25282bb39 DOC: ssl: Use consistent naming for TLS protocols
In most cases, "TLSv1.x" naming is used across and documentation, lazy
people tend to grep too much and may not find what they are looking for.

Fixing people is hard.
2018-08-16 20:20:26 +02:00
Emeric Brun
271022150d BUG/MINOR: map: fix map_regm with backref
Due to a cascade of get_trash_chunk calls the sample is
corrupted when we want to read it.

The fix consist to use a temporary chunk to copy the sample
value and use it.

[wt: for 1.8 and older, a backport was successfully tested here :
 https://www.mail-archive.com/haproxy@formilux.org/msg30694.html]
2018-08-16 19:44:04 +02:00
Emeric Brun
e1b4ed4352 BUG/MEDIUM: ssl: loading dh param from certifile causes unpredictable error.
If the dh parameter is not found, the openssl's error global
stack was not correctly cleared causing unpredictable error
during the following parsing (chain cert parsing for instance).

This patch should be backported in 1.8 (and perhaps 1.7)
2018-08-16 19:36:08 +02:00
Emeric Brun
eb155b6ca6 BUG/MEDIUM: ssl: fix missing error loading a keytype cert from a bundle.
If there was an issue loading a keytype's part of a bundle, the bundle
was implicitly ignored without errors.

This patch should be backported in 1.8 (and perhaps 1.7)
2018-08-16 19:36:06 +02:00
Olivier Houchard
fde2a09a15 BUG/MEDIUM: sessions: Don't use t->state.
In session_expire_embryonic(), don't use t->state, use the "state" argument
instead, as t->state has been cleaned before we're being called.
2018-08-16 19:25:56 +02:00
Olivier Houchard
d8b7a4701d BUG/MEDIUM: tasks: Don't insert in the global rqueue if nbthread == 1
Make sure we don't insert a task in the global run queue if nbthread == 1,
as, as an optimisation, we avoid reading from it if nbthread == 1.
2018-08-16 19:25:46 +02:00
Olivier Houchard
5c110b924e MINOR: checks: Add event_srv_chk_io().
In checks, introduce event_srv_chk_io() as a callback to be called if data
can be sent again, instead of abusing event_srv_chk_w.
2018-08-16 17:29:54 +02:00
Olivier Houchard
29fb89dc5e MINOR: mux_h2: Don't use h2_send() as a callback.
Instead of using h2_send() directly as a callback, introcude h2_io_cb(), that
will call h2_send() if it is possible to send data.
2018-08-16 17:29:54 +02:00
Olivier Houchard
8f0b4c66f5 MINOR: stream_interface: Give stream_interface its own wait_list.
Instead of just using the conn_stream wait_list, give the stream_interface
its own. When the conn_stream will have its own buffers, the stream_interface
may have to wait on it.
2018-08-16 17:29:54 +02:00
Olivier Houchard
91894cbf4c MINOR: stream_interface: Don't use si_cs_send() as a task handler.
Instead of using si_cs_send() as a task handler, define a new function,
si_cs_io_cb(), and give si_cs_send() its original prototype. Right now
si_cs_io_cb() just handles send, but later it'll handle recv() too.
2018-08-16 17:29:54 +02:00
Olivier Houchard
e1c6dbcd70 MINOR: connections/mux: Add the wait reason(s) to wait_list.
Add a new element to the wait_list, that let us know which event(s) we are
waiting on.
2018-08-16 17:29:53 +02:00
Olivier Houchard
ed0f207ef5 MINOR: connections: Get rid of txbuf.
Remove txbuf from conn_stream. It is not used yet, and its only user will
probably be the mux_h2, so it will be better suited in the struct h2s.
2018-08-16 17:29:51 +02:00
Olivier Houchard
638b799b09 MINOR: connections: Move rxbuf from the conn_stream to the h2s.
As the mux_h2 is the only user of rxbuf, move it to the struct h2s, instead
of conn_stream.
2018-08-16 17:28:11 +02:00
Olivier Houchard
511efeae7e MINOR: connections: Make rcv_buf mandatory and nuke cs_recv().
Reintroduce h2_rcv_buf(), right now it just does what cs_recv() did, but
should be modified later.
2018-08-16 17:23:44 +02:00
Emeric Brun
77e8919fc6 BUG/MINOR: ssl: empty connections reported as errors.
Empty connection is reported as handshake error
even if dont-log-null is specified.

This bug affect is a regression du to:

BUILD: ssl: fix to build (again) with boringssl

New openssl 1.1.1 defines OPENSSL_NO_HEARTBEATS as boring ssl
so the test was replaced by OPENSSL_IS_BORINGSSL

This fix should be backported on 1.8
2018-08-16 11:59:59 +02:00
Patrick Hemmer
248cb4c503 MEDIUM: queue: adjust position based on priority-class and priority-offset
The priority values are used when connections are queued to determine
which connections should be served first. The lowest priority class is
served first. When multiple requests from the same class are found, the
earliest (according to queue_time + offset) is served first. The queue
offsets can span over roughly 17 minutes after which the offsets will
wrap around. This allows up to 8 minutes spent in the queue with no
reordering.
2018-08-10 15:06:48 +02:00
Patrick Hemmer
268a707a3d MEDIUM: add set-priority-class and set-priority-offset
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet, which is the purpose of the next commit, but all the logic to
set and clear the values is there.
2018-08-10 15:06:31 +02:00
Patrick Hemmer
0355dabd7c MINOR: queue: replace the linked list with a tree
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
2018-08-10 15:06:27 +02:00
Patrick Hemmer
da282f4a8f MINOR: queue: store the queue index in the stream when enqueuing
We store the queue index in the stream and check it on dequeueing to
figure how many entries were processed in between. This way we'll be
able to count the elements that may later be added before ours.
2018-08-10 15:06:25 +02:00
Patrick Hemmer
ffe5e8c638 MINOR: stream: rename {srv,prx}_queue_size to *_queue_pos
The current name is misleading as it implies a queue size, but the value
instead indicates a position in the queue.
The value is only the queue size at the exact moment the element is enqueued.
Soon we will gain the ability to insert anywhere into the queue, upon which
clarity of the name is more important.
2018-08-10 15:04:14 +02:00
Willy Tarreau
66425e31b5 MINOR: queue: make sure the pendconn is released before logging
We'll soon need to rely on the pendconn position at the time of dequeuing
to figure the position a stream took in the queue. Usually it's not a
problem since pendconn_free() is called once the connection starts, but
it will make a difference for failed dequeues (eg: queue timeout reached).
Thus it's important to call pendconn_free() before logging in cases we are
not certain whether it was already performed, and to call pendconn_unlink()
after we know the pendconn will not be used so that we collect the queue
state as accurately as possible. As a benefit it will also make the
server's and backend's queues count more accurate in these cases.
2018-08-10 15:04:08 +02:00
Christopher Faulet
7ce0c891ab MEDIUM: mux: Use the mux protocol specified on bind/server lines
To do so, mux choices are split to handle incoming and outgoing connections in a
different way. The protocol specified on the bind/server line is used in
priority. Then, for frontend connections, the ALPN is retrieved and used to
choose the best mux. For backend connection, there is no ALPN. Finaly, if no
protocol is specified and no protocol matches the ALPN, we fall back on a
default mux, choosing in priority the first mux with exactly the same mode.
2018-08-08 10:42:08 +02:00
Christopher Faulet
8ed0a3e32a MINOR: mux/server: Add 'proto' keyword to force the multiplexer's protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the server's definition.
2018-08-08 10:42:08 +02:00
Christopher Faulet
a717b99284 MINOR: mux/frontend: Add 'proto' keyword to force the mux protocol
For now, it is parsed but not used. Tests are done on it to check if the side
and the mode are compatible with the proxy's definition.
2018-08-08 10:41:11 +02:00
Christopher Faulet
7c42eacbe9 BUG/MEDIUM: stream_int: Don't check CO_FL_SOCK_RD_SH flag to trigger cs receive
It is mandatory to be sure to process data blocked in the RX buffer of the
conn_stream while the shutr/read0 was already processed. The stream interface
doesn't need to rely on this flags because it already tests CS_FL_EOS.
2018-08-08 10:41:11 +02:00
Willy Tarreau
91c2826e1d CLEANUP: server: remove the update list and the update lock
These ones are not more used, let's get rid of them.
2018-08-08 09:57:45 +02:00
Willy Tarreau
3ff577e165 MAJOR: server: make server state changes synchronous again
Now we try to synchronously push updates as they come using the new rdv
point, so that the call to the server update function from the main poll
loop is not needed anymore.

It further reduces the apparent latency in the health checks as the response
time almost always appears as 0 ms, resulting in a slightly higher check rate
of ~1960 conn/s. Despite this, the CPU consumption has slightly dropped again
to ~32% for the same test.

The only trick is that the checks code is built with a bit of recursivity
because srv_update_status() calls server_recalc_eweight(), and the latter
needs to signal srv_update_status() in case of updates. Thus we added an
extra argument to this function to indicate whether or not it must
propagate updates (no if it comes from srv_update_status).
2018-08-08 09:57:45 +02:00
Willy Tarreau
647c70b681 MINOR: threads: remove the previous synchronization point
It's not needed anymore as it is fully covered by the new rendez-vous
point. This also removes the pipe and its polling.
2018-08-08 09:57:45 +02:00
Willy Tarreau
85c459d7e8 MEDIUM: haproxy: don't use sync_poll_loop() anymore in the main loop
This partially reverts commit d8fd2af ("BUG/MEDIUM: threads: Use the sync
point to check active jobs and exit") which used to address an issue in
the way the sync point used to check for present threads, which was later
addressed by commit ddb6c16 ("BUG/MEDIUM: threads: Fix the exit condition
of the thread barrier"). Thus there is no need anymore to use the sync
point for exiting and we can completely remove this call in the main loop.
2018-08-08 09:56:32 +02:00
Willy Tarreau
3d3700f216 MEDIUM: checks: use the new rendez-vous point to spread check result
The current sync point causes some important stress when a high number
of threads is in use on a config with lots of checks, because it wakes
up all threads every time a server state changes.

A config like the following can easily saturate a 4-core machine reaching
only 750 checks per second out of the ~2000 configured :

    global
        nbthread 4

    defaults
        mode    http
        timeout connect 5s
        timeout client  5s
        timeout server  5s

    frontend srv
        bind :8001 process 1/1
        redirect location / if { method OPTIONS } { rand(100) ge 50 }
        stats uri /

    backend chk
        option httpchk
        server-template srv 1-100 127.0.0.1:8001 check rise 1 fall 1 inter 50

The reason is that the random on the fake server causes the responses
to randomly match an HTTP check, and results in a lot of up/down events
that are broadcasted to all threads. It's worth noting that the CPU usage
already dropped by about 60% between 1.8 and 1.9 just due to the scheduler
updates, but the sync point remains expensive.

In addition, it's visible on the stats page that a lot of requests end up
with an L7TOUT status in ~60ms. With smaller timeouts, it's even L4TOUT
around 20-25ms.

By not using THREAD_WANT_SYNC() anymore and only calling the server updates
under thread_isolate(), we can avoid all these wakeups. The CPU usage on
the same config drops to around 44% on the same machine, with all checks
being delivered at ~1900 checks per second, and the stats page shows no
more timeouts, even at 10 ms check interval. The difference is mainly
caused by the fact that there's no more need to wait for a thread to wake
up from poll() before starting to process check results.
2018-08-08 09:56:32 +02:00
Christopher Faulet
98d9fe21e0 MINOR: mux: Print the list of existing mux protocols during HA startup
This is done in verbose/debug mode and when build options are reported.
2018-08-08 09:54:22 +02:00
Christopher Faulet
32f61c0421 MINOR: mux: Unlink ALPN and multiplexers to rather speak of mux protocols
Multiplexers are not necessarily associated to an ALPN. ALPN is a TLS extension,
so it is not always defined or used. Instead, we now rather speak of
multiplexer's protocols. So in this patch, there are no significative changes,
some structures and functions are just renamed.
2018-08-08 09:54:22 +02:00
Christopher Faulet
2d5292a412 MINOR: mux: Add info about the supported side in alpn_mux_list structure
Now, a multiplexer can specify if it can be install on incoming connections
(ALPN_SIDE_FE), on outgoing connections (ALPN_SIDE_BE) or both
(ALPN_SIDE_BOTH). These flags are compatible with proxies' ones.
2018-08-08 09:54:22 +02:00
Christopher Faulet
b75bb21092 MEDIUM: backend: don't rely on mux_pt_ops in connect_server()
The comment above the change remains true. We assume there is always 1
conn_stream per outgoing connectionq. Today, it is always true because H2 is not
supported yet for server connections.
2018-08-08 09:54:22 +02:00
Christopher Faulet
6cc7afa04e MINOR: backend: Try to find the best mux for outgoing connections
For now, there is no effect. mux-pt will always be used because this is only
available mux for backend connections.
2018-08-08 09:54:22 +02:00
Christopher Faulet
063f786553 MINOR: conn_stream: add cs_send() as a default snd_buf() function
This function is generic and is able to automatically transfer data from a
buffer to the conn_stream's tx buffer. It does this automatically if the mux
doesn't define another snd_buf() function.

It cannot yet be used as-is with the conn_stream's txbuf without risking to
lose data on close since conn_streams need to be orphaned for this.
2018-08-08 09:53:58 +02:00
Christopher Faulet
2bf88c05d0 CLEANUP: backend: Move mux install to call it at only one place
It makes the code readability simpler. It will also ease futur changes.
2018-08-07 14:37:37 +02:00
Christopher Faulet
d44a9b3627 MEDIUM: mux: Remove const on the buffer in mux->snd_buf()
This is a partial revert of the commit deccd1116 ("MEDIUM: mux: make
mux->snd_buf() take the byte count in argument"). It is a requirement to do
zero-copy transfers. This will be mandatory when the TX buffer of the
conn_stream will be used.

So, now, data are consumed by mux->snd_buf() and not only sent. So it needs to
update the buffer state. On its side, the caller must be aware the buffer can be
replaced y an empty or unallocated one.

As a side effet of this change, the function co_set_data() is now only responsible
to update the channel set, by update ->output field.
2018-08-07 14:36:52 +02:00
Willy Tarreau
a8694654ba BUG/MEDIUM: queue: prevent a backup server from draining the proxy's connections
When switching back from a backup to an active server, the backup server
currently continues to drain the proxy's connections, which is a problem
because it's not expected to be able to pick them.

This patch ensures that a backup server will only pick backend connections
if there is no active server and it is the selected backup server or all
backup servers are supposed to be used.

This issue seems to have existed forever, so this fix should be backported
to all stable versions.
2018-08-07 10:52:01 +02:00
Willy Tarreau
6a78e61694 BUG/MEDIUM: servers: check the queues once enabling a server
Commit 64cc49c ("MAJOR: servers: propagate server status changes
asynchronously.") heavily changed the way the server states are
updated since they became asynchronous. During this change, some
code was lost, which is used to shut down some sessions from a
backup server and to pick pending connections from a proxy once
a server is turned back from maintenance to ready state. The
effect is that when temporarily disabling a server, connections
stay in the backend's queue, and when re-enabling it, they are
not picked and they expire in the backend's queue. Now they're
properly picked again.

This fix must be backported to 1.8.
2018-08-07 10:14:53 +02:00
Willy Tarreau
ab657ce251 BUG/MEDIUM: threads: fix the no-thread case after the change to the sync point
In commit 0c026f4 ("MINOR: threads: add more consistency between certain
variables in no-thread case"), we ensured that we don't have all_threads_mask
zeroed anymore. But one test was missed for the write() to the sync pipe.
This results in a situation where when running single-threaded, once a
server status changes, a wake-up message is written to the pipe and never
consumed, showing a 100% CPU usage.

No backport is needed.
2018-08-07 10:07:15 +02:00
Willy Tarreau
65e94d1ce9 [RELEASE] Released version 1.9-dev1
Released version 1.9-dev1 with the following main changes :
    - BUG/MEDIUM: kqueue: Don't bother closing the kqueue after fork.
    - DOC: cache: update sections and fix some typos
    - BUILD/MINOR: deviceatlas: enable thread support
    - BUG/MEDIUM: tcp-check: Don't lock the server in tcpcheck_main
    - BUG/MEDIUM: ssl: don't allocate shctx several time
    - BUG/MEDIUM: cache: bad computation of the remaining size
    - BUILD: checks: don't include server.h
    - BUG/MEDIUM: stream: fix session leak on applet-initiated connections
    - BUILD/MINOR: haproxy : FreeBSD/cpu affinity needs pthread_np header
    - BUILD/MINOR: Makefile : enabling USE_CPU_AFFINITY
    - BUG/MINOR: ssl: CO_FL_EARLY_DATA removal is managed by stream
    - BUG/MEDIUM: threads/peers: decrement, not increment jobs on quitting
    - BUG/MEDIUM: h2: don't report an error after parsing a 100-continue response
    - BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync.
    - BUG/MAJOR: thread/peers: fix deadlock on peers sync.
    - BUILD/MINOR: haproxy: compiling config cpu parsing handling when needed
    - MINOR: config: report when "monitor fail" rules are misplaced
    - BUG/MINOR: mworker: fix validity check for the pipe FDs
    - BUG/MINOR: mworker: detach from tty when in daemon mode
    - MINOR: threads: Fix pthread_setaffinity_np on FreeBSD.
    - BUG/MAJOR: thread: Be sure to request a sync between threads only once at a time
    - BUILD: Fix LDFLAGS vs. LIBS re linking order in various makefiles
    - BUG/MEDIUM: checks: Be sure we have a mux if we created a cs.
    - BUG/MINOR: hpack: fix debugging output of pseudo header names
    - BUG/MINOR: hpack: must reject huffman literals padded with more than 7 bits
    - BUG/MINOR: hpack: reject invalid header index
    - BUG/MINOR: hpack: dynamic table size updates are only allowed before headers
    - BUG/MAJOR: h2: correctly check the request length when building an H1 request
    - BUG/MINOR: h2: immediately close if receiving GOAWAY after the last stream
    - BUG/MINOR: h2: try to abort closed streams as soon as possible
    - BUG/MINOR: h2: ":path" must not be empty
    - BUG/MINOR: h2: fix a typo causing PING/ACK to be responded to
    - BUG/MINOR: h2: the TE header if present may only contain trailers
    - BUG/MEDIUM: h2: enforce the per-connection stream limit
    - BUG/MINOR: h2: do not accept SETTINGS_ENABLE_PUSH other than 0 or 1
    - BUG/MINOR: h2: reject incorrect stream dependencies on HEADERS frame
    - BUG/MINOR: h2: properly check PRIORITY frames
    - BUG/MINOR: h2: reject response pseudo-headers from requests
    - BUG/MEDIUM: h2: remove connection-specific headers from request
    - BUG/MEDIUM: h2: do not accept upper case letters in request header names
    - BUG/MINOR: h2: use the H2_F_DATA_* macros for DATA frames
    - BUG/MINOR: action: Don't check http capture rules when no id is defined
    - BUG/MAJOR: hpack: don't pretend large headers fit in empty table
    - BUG/MINOR: ssl: support tune.ssl.cachesize 0 again
    - BUG/MEDIUM: mworker: also close peers sockets in the master
    - BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
    - BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
    - BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface
    - BUG/MEDIUM: h2: fix handling of end of stream again
    - MINOR: mworker: Update messages referencing exit-on-failure
    - MINOR: mworker: Improve wording in `void mworker_wait()`
    - CONTRIB: halog: Add help text for -s switch in halog program
    - BUG/MEDIUM: email-alert: don't set server check status from a email-alert task
    - BUG/MEDIUM: threads/vars: Fix deadlock in register_name
    - MINOR: systemd: remove comment about HAPROXY_STATS_SOCKET
    - DOC: notifications: add precisions about thread usage
    - BUG/MEDIUM: lua/notification: memory leak
    - MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data
    - BUG/MEDIUM: stream-int: always set SI_FL_WAIT_ROOM on CS_FL_RCV_MORE
    - BUG/MEDIUM: h2: automatically set CS_FL_RCV_MORE when the output buffer is full
    - BUG/MEDIUM: h2: enable recv polling whenever demuxing is possible
    - BUG/MEDIUM: h2: work around a connection API limitation
    - BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
    - MINOR: h2: store the demux padding length in the h2c struct
    - BUG/MEDIUM: h2: support uploading partial DATA frames
    - MINOR: h2: don't demand that a DATA frame is complete before processing it
    - BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
    - BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
    - BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
    - BUG/MEDIUM: h2: fix stream limit enforcement
    - BUG/MINOR: stream-int: don't try to receive again after receiving an EOS
    - MINOR: sample: add len converter
    - BUG: MAJOR: lb_map: server map calculation broken
    - BUG: MINOR: http: don't check http-request capture id when len is provided
    - MINOR: sample: rename the "len" converter to "length"
    - BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd
    - DOC/MINOR: intro: typo, wording, formatting fixes
    - MINOR: netscaler: respect syntax
    - MINOR: netscaler: remove the use of cip_magic only used once
    - MINOR: netscaler: rename cip_len to clarify its uage
    - BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
    - BUG/MAJOR: netscaler: address truncated CIP header detection
    - MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
    - MEDIUM: netscaler: do not analyze original IP packet size
    - MEDIUM: netscaler: add support for standard NetScaler CIP protocol
    - MINOR: spoe: add force-set-var option in spoe-agent configuration
    - CONTRIB: iprange: Fix compiler warning in iprange.c
    - CONTRIB: halog: Fix compiler warnings in halog.c
    - BUG/MINOR: h2: properly report a stream error on RST_STREAM
    - MINOR: mux: add flags to describe a mux's capabilities
    - MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts
    - BUG/MEDIUM: stream: don't consider abortonclose on muxes which close cleanly
    - BUG/MEDIUM: checks: a server passed in maint state was not forced down.
    - BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()
    - MINOR: http: adjust the list of supposedly cacheable methods
    - MINOR: http: update the list of cacheable status codes as per RFC7231
    - MINOR: http: start to compute the transaction's cacheability from the request
    - BUG/MINOR: http: do not ignore cache-control: public
    - BUG/MINOR: http: properly detect max-age=0 and s-maxage=0 in responses
    - BUG/MINOR: cache: do not force the TX_CACHEABLE flag before checking cacheability
    - MINOR: http: add a function to check request's cache-control header field
    - BUG/MEDIUM: cache: do not try to retrieve host-less requests from the cache
    - BUG/MEDIUM: cache: replace old object on store
    - BUG/MEDIUM: cache: respect the request cache-control header
    - BUG/MEDIUM: cache: don't cache the response on no-cache="set-cookie"
    - BUG/MAJOR: connection: refine the situations where we don't send shutw()
    - BUG/MEDIUM: checks: properly set servers to stopping state on 404
    - BUG/MEDIUM: h2: properly handle and report some stream errors
    - BUG/MEDIUM: h2: improve handling of frames received on closed streams
    - DOC/MINOR: configuration: typo, formatting fixes
    - BUG/MEDIUM: h2: ensure we always know the stream before sending a reset
    - BUG/MEDIUM: mworker: don't close stdio several time
    - MINOR: don't close stdio anymore
    - BUG/MEDIUM: http: don't automatically forward request close
    - BUG/MAJOR: hpack: don't return direct references to the dynamic headers table
    - MINOR: h2: add a function to report pseudo-header names
    - DEBUG: hpack: make hpack_dht_dump() expose the output file
    - DEBUG: hpack: add more traces to the hpack decoder
    - CONTRIB: hpack: add an hpack decoder
    - MEDIUM: h2: prepare a graceful shutdown when the frontend is stopped
    - BUG/MEDIUM: h2: properly handle the END_STREAM flag on empty DATA frames
    - BUILD: ssl: silence a warning when building without NPN nor ALPN support
    - CLEANUP: rbtree: remove
    - BUG/MEDIUM: ssl: cache doesn't release shctx blocks
    - BUG/MINOR: lua: Fix default value for pattern in Socket.receive
    - DOC: lua: Fix typos in comments of hlua_socket_receive
    - BUG/MEDIUM: lua: Fix IPv6 with separate port support for Socket.connect
    - BUG/MINOR: lua: Fix return value of Socket.settimeout
    - MINOR: dns: Handle SRV record weight correctly.
    - BUG/MEDIUM: mworker: execvp failure depending on argv[0]
    - MINOR: hathreads: add support for gcc < 4.7
    - BUILD/MINOR: ancient gcc versions atomic fix
    - BUG/MEDIUM: stream: properly handle client aborts during redispatch
    - MINOR: spoe: add register-var-names directive in spoe-agent configuration
    - MINOR: spoe: Don't queue a SPOE context if nothing is sent
    - DOC: clarify the scope of ssl_fc_is_resumed
    - CONTRIB: debug: fix a few flags definitions
    - BUG/MINOR: poll: too large size allocation for FD events
    - MINOR: sample: add date_us sample
    - BUG/MEDIUM: peers: fix expire date wasn't updated if entry is modified remotely.
    - MINOR: servers: Don't report duplicate dyncookies for disabled servers.
    - MINOR: global/threads: move cpu_map at the end of the global struct
    - MINOR: threads: add a MAX_THREADS define instead of LONGBITS
    - MINOR: global: add some global activity counters to help debugging
    - MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache
    - BUG/MEDIUM: threads/polling: Use fd_cache_mask instead of fd_cache_num
    - BUG/MEDIUM: fd: maintain a per-thread update mask
    - MINOR: fd: add a bitmask to indicate that an FD is known by the poller
    - BUG/MEDIUM: epoll/threads: use one epoll_fd per thread
    - BUG/MEDIUM: kqueue/threads: use one kqueue_fd per thread
    - BUG/MEDIUM: threads/mworker: fix a race on startup
    - BUG/MINOR: mworker: only write to pidfile if it exists
    - MINOR: threads: Fix build when we're not compiling with threads.
    - BUG/MINOR: threads: always set an owner to the thread_sync pipe
    - BUG/MEDIUM: threads/server: Fix deadlock in srv_set_stopping/srv_set_admin_flag
    - BUG/MEDIUM: checks: Don't try to release undefined conn_stream when a check is freed
    - BUG/MINOR: kqueue/threads: Don't forget to close kqueue_fd[tid] on each thread
    - MINOR: threads: Use __decl_hathreads instead of #ifdef/#endif
    - BUILD: epoll/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
    - BUILD: kqueue/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
    - CLEANUP: sample: Fix comment encoding of sample.c
    - CLEANUP: sample: Fix outdated comment about sample casts functions
    - BUG/MINOR: sample: Fix output type of c_ipv62ip
    - CLEANUP: Fix typo in ARGT_MSK6 comment
    - CLEANUP: standard: Use len2mask4 in str2mask
    - MINOR: standard: Add str2mask6 function
    - MINOR: config: Add support for ARGT_MSK6
    - MEDIUM: sample: Add IPv6 support to the ipmask converter
    - MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters.
    - BUG/MINOR: cli: use global.maxsock and not maxfd to list all FDs
    - MINOR: polling: make epoll and kqueue not depend on maxfd anymore
    - MINOR: fd: don't report maxfd in alert messages
    - MEDIUM: polling: start to move maxfd computation to the pollers
    - CLEANUP: fd/threads: remove the now unused fdtab_lock
    - MINOR: poll: more accurately compute the new maxfd in the loop
    - CLEANUP: fd: remove the unused "new" field
    - MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h
    - MEDIUM: select: make use of hap_fd_* functions
    - MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock
    - MEDIUM: select: don't use the old FD state anymore
    - MEDIUM: poll: don't use the old FD state anymore
    - MINOR: fd: pass the iocb and owner to fd_insert()
    - BUG/MINOR: threads: Update labels array because of changes in lock_label enum
    - MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters.
    - BUG/MINOR: epoll/threads: only call epoll_ctl(DEL) on polled FDs
    - DOC: don't suggest using http-server-close
    - MINOR: introduce proxy-v2-options for send-proxy-v2
    - BUG/MEDIUM: spoe: Always try to receive or send the frame to detect shutdowns
    - BUG/MEDIUM: spoe: Allow producer to read and to forward shutdown on request side
    - MINOR: spoe: Remove check on min_applets number when a SPOE context is queued
    - MINOR: spoe: Always link a SPOE context with the applet processing it
    - MINOR: spoe: Replace sending_rate by a frequency counter
    - MINOR: spoe: Count the number of frames waiting for an ack for each applet
    - MEDIUM: spoe: Use an ebtree to manage idle applets
    - MINOR: spoa_example: Count the number of frames processed by each worker
    - MINOR: spoe: Add max-waiting-frames directive in spoe-agent configuration
    - MINOR: init: make stdout unbuffered
    - MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams.
    - MINOR: early data: Never remove the CO_FL_EARLY_DATA flag.
    - MINOR: compiler: introduce offsetoff().
    - MINOR: threads: Introduce double-width CAS on x86_64 and arm.
    - MINOR: threads: add test and set/reset operations
    - MINOR: pools/threads: Implement lockless memory pools.
    - MAJOR: fd/threads: Make the fdcache mostly lockless.
    - MEDIUM: fd/threads: Make sure we don't miss a fd cache entry.
    - MAJOR: fd: compute the new fd polling state out of the fd lock
    - MINOR: epoll: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: kqueue: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: poll: get rid of the now useless fd_compute_new_polled_status()
    - MINOR: select: get rid of the now useless fd_compute_new_polled_status()
    - CLEANUP: fd: remove the now unused fd_compute_new_polled_status() function
    - MEDIUM: fd: make updt_fd_polling() use atomics
    - MEDIUM: poller: use atomic ops to update the fdtab mask
    - MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c
    - BUG/MINOR: fd/threads: properly dereference fdcache as volatile
    - MINOR: fd: remove the unneeded last CAS when adding an fd to the list
    - MINOR: fd: reorder fd_add_to_fd_list()
    - BUG/MINOR: time/threads: ensure the adjusted time is always correct
    - BUG/MEDIUM: standard: Fix memory leak in str2ip2()
    - MINOR: init: emit warning when -sf/-sd cannot parse argument
    - BUILD: fd/threads: fix breakage build breakage without threads
    - DOC: Describe routing impact of using interface keyword on bind lines
    - DOC: Mention -Ws in the list of available options
    - BUG/MINOR: config: don't emit a warning when global stats is incompletely configured
    - BUG/MINOR: fd/threads: properly lock the FD before adding it to the fd cache.
    - BUG/MEDIUM: threads: fix the double CAS implementation for ARMv7
    - BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable.
    - BUILD/MINOR: memory: stdint is needed for uintptr_t
    - BUG/MINOR: init: Add missing brackets in the code parsing -sf/-st
    - DOC: lua: new prototype for function "register_action()"
    - DOC: cfgparse: Warn on option (tcp|http)log in backend
    - BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe
    - MINOR: sample: add a new "concat" converter
    - BUG/MEDIUM: ssl: Shutdown the connection for reading on SSL_ERROR_SYSCALL
    - BUG/MEDIUM: http: Switch the HTTP response in tunnel mode as earlier as possible
    - BUG/MEDIUM: ssl/sample: ssl_bc_* fetch keywords are broken.
    - MINOR: ssl/sample: adds ssl_bc_is_resumed fetch keyword.
    - CLEANUP: cfgparse: Remove unused label end
    - CLEANUP: spoe: Remove unused label retry
    - CLEANUP: h2: Remove unused labels from mux_h2.c
    - CLEANUP: pools: Remove unused end label in memory.h
    - CLEANUP: standard: Fix typo in IPv6 mask example
    - BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
    - BUG/MINOR: debug/pools: properly handle out-of-memory when building with DEBUG_UAF
    - MINOR: debug/pools: make DEBUG_UAF also detect underflows
    - MINOR: stats: display the number of threads in the statistics.
    - BUG/MINOR: h2: Set the target of dbuf_wait to h2c
    - BUG/MEDIUM: h2: always consume any trailing data after end of output buffers
    - BUG/MEDIUM: buffer: Fix the wrapping case in bo_putblk
    - BUG/MEDIUM: buffer: Fix the wrapping case in bi_putblk
    - BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping
    - Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')"
    - MINOR: ssl: extract full pkey info in load_certificate
    - MINOR: ssl: add ssl_sock_get_pkey_algo function
    - MINOR: ssl: add ssl_sock_get_cert_sig function
    - MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
    - MINOR: connection: add proxy-v2-options authority
    - MINOR: systemd: Add section for SystemD sandboxing to unit file
    - MINOR: systemd: Add SystemD's Protect*= options to the unit file
    - MINOR: systemd: Add SystemD's SystemCallFilter option to the unit file
    - CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
    - MINOR: h2: provide and use h2s_detach() and h2s_free()
    - MEDIUM: h2: use a single buffer allocator
    - MINOR/BUILD: fix Lua build on Mac OS X
    - BUILD/MINOR: fix Lua build on Mac OS X (again)
    - BUG/MINOR: session: Fix tcp-request session failure if handshake.
    - CLEANUP: .gitignore: Ignore binaries from the contrib directory
    - BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list.
    - DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
    - BUG/MEDIUM: h2: also arm the h2 timeout when sending
    - BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd"
    - CLEANUP: ssl: Remove a duplicated #include
    - CLEANUP: cli: Remove a leftover debug message
    - BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage
    - BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc
    - BUG/MINOR: force-persist and ignore-persist only apply to backends
    - BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
    - BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
    - BUG/MINOR: dns: don't downgrade DNS accepted payload size automatically
    - TESTS: Add a testcase for multi-port + multi-server listener issue
    - CLEANUP: dns: remove duplicate code in src/dns.c
    - BUG/MINOR: seemless reload: Fix crash when an interface is specified.
    - BUG/MINOR: cli: Ensure all command outputs end with a LF
    - BUG/MINOR: cli: Fix a crash when sending a command with too many arguments
    - BUILD: ssl: Fix build with OpenSSL without NPN capability
    - BUG/MINOR: spoa-example: unexpected behavior for more than 127 args
    - BUG/MINOR: lua: return bad error messages
    - CLEANUP: lua/syntax: lua is a name and not an acronym
    - BUG/MEDIUM: tcp-check: single connect rule can't detect DOWN servers
    - BUG/MINOR: tcp-check: use the server's service port as a fallback
    - BUG/MEDIUM: threads/queue: wake up other threads upon dequeue
    - MINOR: log: stop emitting alerts when it's not possible to write on the socket
    - BUILD/BUG: enable -fno-strict-overflow by default
    - BUG/MEDIUM: fd/threads: ensure the fdcache_mask always reflects the cache contents
    - DOC: log: more than 2 log servers are allowed
    - MINOR: hash: add new function hash_crc32c
    - MINOR: proxy-v2-options: add crc32c
    - MINOR: accept-proxy: support proxy protocol v2 CRC32c checksum
    - REORG: compact "struct server"
    - MINOR: samples: add crc32c converter
    - BUG/MEDIUM: h2: properly account for DATA padding in flow control
    - BUG/MINOR: h2: ensure we can never send an RST_STREAM in response to an RST_STREAM
    - BUG/MINOR: listener: Don't decrease actconn twice when a new session is rejected
    - CLEANUP: map, stream: remove duplicate code in src/map.c, src/stream.c
    - BUG/MINOR: lua: the function returns anything
    - BUG/MINOR: lua funtion hlua_socket_settimeout don't check negative values
    - CLEANUP: lua: typo fix in comments
    - BUILD/MINOR: fix build when USE_THREAD is not defined
    - MINOR: lua: allow socket api settimeout to accept integers, float, and doubles
    - BUG/MINOR: hpack: fix harmless use of uninitialized value in hpack_dht_insert
    - MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown"
    - MINOR: cli: make "show fd" report the mux and mux_ctx pointers when available
    - BUILD/MINOR: cli: fix a build warning introduced by last commit
    - BUG/MAJOR: h2: remove orphaned streams from the send list before closing
    - MINOR: h2: always call h2s_detach() in h2_detach()
    - MINOR: h2: fuse h2s_detach() and h2s_free() into h2s_destroy()
    - BUG/MEDIUM: h2/threads: never release the task outside of the task handler
    - BUG/MEDIUM: h2: don't consider pending data on detach if connection is in error
    - BUILD/MINOR: threads: always export thread_sync_io_handler()
    - MINOR: mux: add a "show_fd" function to dump debugging information for "show fd"
    - MINOR: h2: implement a basic "show_fd" function
    - MINOR: cli: report cache indexes in "show fd"
    - BUG/MINOR: h2: remove accidental debug code introduced with show_fd function
    - BUG/MEDIUM: h2: always add a stream to the send or fctl list when blocked
    - BUG/MINOR: checks: check the conn_stream's readiness and not the connection
    - BUG/MINOR: fd: Don't clear the update_mask in fd_insert.
    - BUG/MINOR: email-alert: Set the mailer port during alert initialization
    - BUG/MINOR: cache: fix "show cache" output
    - BUG/MAJOR: cache: fix random crashes caused by incorrect delete() on non-first blocks
    - BUG/MINOR: spoe: Initialize variables used during conf parsing before any check
    - BUG/MINOR: spoe: Don't release the context buffer in .check_timeouts callbaclk
    - BUG/MINOR: spoe: Register the variable to set when an error occurred
    - BUG/MINOR: spoe: Don't forget to decrement fpa when a processing is interrupted
    - MINOR: spoe: Add metrics in to know time spent in the SPOE
    - MINOR: spoe: Add options to store processing times in variables
    - MINOR: log: move 'log' keyword parsing in dedicated function
    - MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries
    - MINOR: spoe: Add loggers dedicated to the SPOE agent
    - MINOR: spoe: Add support for option dontlog-normal in the SPOE agent section
    - MINOR: spoe: use agent's logger to log SPOE messages
    - MINOR: spoe: Add counters to log info about SPOE agents
    - BUG/MAJOR: cache: always initialize newly created objects
    - MINOR: servers: Support alphanumeric characters for the server templates names
    - BUG/MEDIUM: threads: Fix the max/min calculation because of name clashes
    - BUG/MEDIUM: connection: Make sure we have a mux before calling detach().
    - BUG/MINOR: http: Return an error in proxy mode when url2sa fails
    - MINOR: proxy: Add fe_defbe fetcher
    - MINOR: config: Warn if resolvers has no nameservers
    - BUG/MINOR: cli: Guard against NULL messages when using CLI_ST_PRINT_FREE
    - MINOR: cli: Ensure the CLI always outputs an error when it should
    - MEDIUM: sample: Extend functionality for field/word converters
    - MINOR: export localpeer as an environment variable
    - BUG/MEDIUM: kqueue: When adding new events, provide an output to get errors.
    - BUILD: sample: avoid build warning in sample.c
    - BUG/CRITICAL: h2: fix incorrect frame length check
    - DOC: lua: update the links to the config and Lua API
    - BUG/MINOR: pattern: Add a missing HA_SPIN_INIT() in pat_ref_newid()
    - BUG/MAJOR: channel: Fix crash when trying to read from a closed socket
    - BUG/MINOR: log: t_idle (%Ti) is not set for some requests
    - BUG/MEDIUM: lua: Fix segmentation fault if a Lua task exits
    - MINOR: h2: detect presence of CONNECT and/or content-length
    - BUG/MEDIUM: h2: implement missing support for chunked encoded uploads
    - BUG/MINOR: spoe: Fix counters update when processing is interrupted
    - BUG/MINOR: spoe: Fix parsing of dontlog-normal option
    - MEDIUM: cli: Add payload support
    - MINOR: map: Add payload support to "add map"
    - MINOR: ssl: Add payload support to "set ssl ocsp-response"
    - BUG/MINOR: lua/threads: Make lua's tasks sticky to the current thread
    - MINOR: sample: Add strcmp sample converter
    - MINOR: http: Add support for 421 Misdirected Request
    - BUG/MINOR: config: disable http-reuse on TCP proxies
    - MINOR: ssl: disable SSL sample fetches when unsupported
    - MINOR: ssl: add fetch 'ssl_fc_session_key' and 'ssl_bc_session_key'
    - BUG/MINOR: checks: Fix check->health computation for flapping servers
    - BUG/MEDIUM: threads: Fix the sync point for more than 32 threads
    - BUG/MINOR, BUG/MINOR: lua: Put tasks to sleep when waiting for data
    - MINOR: backend: implement random-based load balancing
    - DOC/MINOR: clean up LUA documentation re: servers & array/table.
    - MINOR: lua: Add server name & puid to LUA Server class.
    - MINOR: lua: add get_maxconn and set_maxconn to LUA Server class.
    - BUG/MINOR: map: correctly track reference to the last ref_elt being dumped
    - BUG/MEDIUM: task: Don't free a task that is about to be run.
    - MINOR: fd: Make the lockless fd list work with multiple lists.
    - BUG/MEDIUM: pollers: Use a global list for fd shared between threads.
    - MINOR: pollers: move polled_mask outside of struct fdtab.
    - BUG/MINOR: lua: schedule socket task upon lua connect()
    - BUG/MINOR: lua: ensure large proxy IDs can be represented
    - BUG/MEDIUM: pollers/kqueue: use incremented position in event list
    - BUG/MINOR: cli: don't stop cli_gen_usage_msg() when kw->usage == NULL
    - BUG/MEDIUM: http: don't always abort transfers on CF_SHUTR
    - BUG/MEDIUM: ssl: properly protect SSL cert generation
    - BUG/MINOR: lua: Socket.send threw runtime error: 'close' needs 1 arguments.
    - BUG/MINOR: spoe: Mistake in error message about SPOE configuration
    - BUG/MEDIUM: spoe: Flags are not encoded in network order
    - CLEANUP: spoe: Remove unused variables the agent structure
    - DOC: spoe: fix a typo
    - BUG/MEDIUM: contrib/mod_defender: Use network order to encode/decode flags
    - BUG/MEDIUM: contrib/modsecurity: Use network order to encode/decode flags
    - DOC: add some description of the pending rework of the buffer structure
    - BUG/MINOR: ssl/lua: prevent lua from affecting automatic maxconn computation
    - MINOR: lua: Improve error message
    - BUG/MEDIUM: cache: don't cache when an Authorization header is present
    - MINOR: ssl: set SSL_OP_PRIORITIZE_CHACHA
    - BUG/MEDIUM: dns: Delay the attempt to run a DNS resolution on check failure.
    - BUG/BUILD: threads: unbreak build without threads
    - BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file
    - BUG/MEDIUM: lua/socket: Length required read doesn't work
    - MINOR: tasks: Change the task API so that the callback takes 3 arguments.
    - MAJOR: tasks: Create a per-thread runqueue.
    - MAJOR: tasks: Introduce tasklets.
    - MINOR: tasks: Make the number of tasks to run at once configurable.
    - MAJOR: applets: Use tasks, instead of rolling our own scheduler.
    - BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
    - MINOR: http: Log warning if (add|set)-header fails
    - DOC: management: add the new wrew stats column
    - MINOR: stats: also report the failed header rewrites warnings on the stats page
    - BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue.
    - BUG/MEDIUM: task: Don't forget to decrement max_processed after each task.
    - MINOR: task: Also consider the task list size when getting global tasks.
    - MINOR: dns: Implement `parse-resolv-conf` directive
    - BUG/MEDIUM: spoe: Return an error when the wrong ACK is received in sync mode
    - MINOR: task/notification: Is notifications registered ?
    - BUG/MEDIUM: lua/socket: wrong scheduling for sockets
    - BUG/MAJOR: lua: Dead lock with sockets
    - BUG/MEDIUM: lua/socket: Notification error
    - BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock
    - BUG/MEDIUM: lua/socket: Buffer error, may segfault
    - DOC: contrib/modsecurity: few typo fixes
    - DOC: SPOE.txt: fix a typo
    - MAJOR: spoe: upgrade the SPOP version to 2.0 and remove the support for 1.0
    - BUG/MINOR: contrib/spoa_example: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/mod_defender: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/modsecurity: Don't reset the status code during disconnect
    - BUG/MINOR: contrib/mod_defender: update pointer on the end of the frame
    - BUG/MINOR: contrib/modsecurity: update pointer on the end of the frame
    - MINOR: task: Fix a compiler warning by adding a cast.
    - MINOR: stats: also report the nice and number of calls for applets
    - MINOR: applet: assign the same nice value to a new appctx as its owner task
    - MINOR: task: Fix compiler warning.
    - BUG/MEDIUM: tasks: Use the local runqueue when building without threads.
    - MINOR: tasks: Don't define rqueue if we're building without threads.
    - BUG/MINOR: unix: Make sure we can transfer abns sockets on seamless reload.
    - MINOR: lua: Increase debug information
    - BUG/MEDIUM: threads: handle signal queue only in thread 0
    - BUG/MINOR: don't ignore SIG{BUS,FPE,ILL,SEGV} during signal processing
    - BUG/MINOR: signals: ha_sigmask macro for multithreading
    - BUG/MAJOR: map: fix a segfault when using http-request set-map
    - DOC: regression testing: Add a short starting guide.
    - MINOR: tasks: Make sure we correctly init and deinit a tasklet.
    - BUG/MINOR: tasklets: Just make sure we don't pass a tasklet to the handler.
    - BUG/MINOR: lua: Segfaults with wrong usage of types.
    - BUG/MAJOR: ssl: Random crash with cipherlist capture
    - BUG/MAJOR: ssl: OpenSSL context is stored in non-reserved memory slot
    - BUG/MEDIUM: ssl: do not store pkinfo with SSL_set_ex_data
    - MINOR: tests: First regression testing file.
    - MINOR: reg-tests: Add reg-tests/README file.
    - MINOR: reg-tests: Add a few regression testing files.
    - DOC: Add new REGTEST tag info about reg testing.
    - BUG/MEDIUM: fd: Don't modify the update_mask in fd_dodelete().
    - MINOR: Some spelling cleanup in the comments.
    - BUG/MEDIUM: threads: Use the sync point to check active jobs and exit
    - MINOR: threads: Be sure to remove threads from all_threads_mask on exit
    - REGTEST/MINOR: Wrong URI in a reg test for SSL/TLS.
    - REGTEST/MINOR: Set HAPROXY_PROGRAM default value.
    - REGTEST/MINOR: Add levels to reg-tests target.
    - BUG/MAJOR: Stick-tables crash with segfault when the key is not in the stick-table
    - BUG/BUILD: threads: unbreak build without threads
    - BUG/MAJOR: stick_table: Complete incomplete SEGV fix
    - MINOR: stick-tables: make stktable_release() do nothing on NULL
    - BUG/MEDIUM: lua: possible CLOSE-WAIT state with '\n' headers
    - MINOR: startup: change session/process group settings
    - MINOR: systemd: consider exit status 143 as successful
    - REGTEST/MINOR: Wrong URI syntax.
    - CLEANUP: dns: remove obsolete macro DNS_MAX_IP_REC
    - CLEANUP: dns: inacurate comment about prefered IP score
    - MINOR: dns: fix wrong score computation in dns_get_ip_from_response
    - MINOR: dns: new DNS options to allow/prevent IP address duplication
    - REGTEST/MINOR: Unexpected curl URL globling.
    - BUG/MINOR: ssl: properly ref-count the tls_keys entries
    - MINOR: h2: keep a count of the number of conn_streams attached to the mux
    - BUG/MEDIUM: h2: don't accept new streams if conn_streams are still in excess
    - MINOR: h2: add the mux and demux buffer lengths on "show fd"
    - BUG/MEDIUM: h2: never leave pending data in the output buffer on close
    - BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout
    - MINOR: tasklet: Set process to NULL.
    - MINOR: buffer: implement a new file for low-level buffer manipulation functions
    - MINOR: buffer: switch buffer sizes and offsets to size_t
    - MINOR: buffer: add a few basic functions for the new API
    - MINOR: buffer: Introduce b_sub(), b_add(), and bo_add()
    - MINOR: buffer: Add b_set_data().
    - MINOR: buffer: introduce b_realign_if_empty()
    - MINOR: compression: pass the channel to http_compression_buffer_end()
    - MINOR: channel: add a few basic functions for the new buffer API
    - MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign()
    - MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign()
    - MEDIUM: channel: make channel_slow_realign() take a swap buffer
    - MINOR: h2: use b_slow_realign() with the trash as a swap buffer
    - MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code
    - MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}
    - MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps()
    - MINOR: buffer: remove bi_getblk() and bi_getblk_nc()
    - MINOR: buffer: split bi_contig_data() into ci_contig_data and b_config_data()
    - MINOR: buffer: remove bi_ptr()
    - MINOR: buffer: remove bo_ptr()
    - MINOR: buffer: remove bo_end()
    - MINOR: buffer: remove bi_end()
    - MINOR: buffer: remove bo_contig_data()
    - MINOR: buffer: merge b{i,o}_contig_space()
    - MINOR: buffer: replace bo_getblk() with direction agnostic b_getblk()
    - MINOR: buffer: replace bo_getblk_nc() with b_getblk_nc() which takes an offset
    - MINOR: buffer: replace bi_del() and bo_del() with b_del()
    - MINOR: buffer: convert most b_ptr() calls to c_ptr()
    - MINOR: h1: make h1_measure_trailers() take the byte count in argument
    - MINOR: h2: clarify the fact that the send functions are unsigned
    - MEDIUM: h2: prevent the various mux encoders from modifying the buffer
    - MINOR: h1: make h1_skip_chunk_crlf() not depend on b_ptr() anymore
    - MINOR: h1: make h1_parse_chunk_size() not depend on b_ptr() anymore
    - MINOR: h1: make h1_measure_trailers() use an offset and a count
    - MEDIUM: h2: do not use buf->o anymore inside h2_snd_buf's loop
    - MEDIUM: h2: don't use b_ptr() nor b_end() anymore
    - MINOR: buffer: get rid of b_end() and b_to_end()
    - MINOR: buffer: make b_getblk_nc() take const pointers
    - MINOR: buffer: make b_getblk_nc() take size_t for the block sizes
    - MEDIUM: connection: make xprt->snd_buf() take the byte count in argument
    - MEDIUM: mux: make mux->snd_buf() take the byte count in argument
    - MEDIUM: connection: make xprt->rcv_buf() use size_t for the count
    - MEDIUM: mux: make mux->rcv_buf() take a size_t for the count
    - MINOR: connection: add a flags argument to rcv_buf()
    - MINOR: connection: add a new receive flag : CO_RFL_BUF_WET
    - MINOR: buffer: get rid of b_ptr() and convert its last users
    - MINOR: buffer: use b_room() to determine available space in a buffer
    - MINOR: buffer: replace buffer_not_empty() with b_data() or c_data()
    - MINOR: buffer: replace buffer_empty() with b_empty() or c_empty()
    - MINOR: buffer: make bo_putchar() use b_tail()
    - MINOR: buffer: replace buffer_full() with channel_full()
    - MINOR: buffer: replace bi_space_for_replace() with ci_space_for_replace()
    - MINOR: buffer: replace buffer_pending() with ci_data()
    - MINOR: buffer: replace buffer_flush() with c_adv(chn, ci_data(chn))
    - MINOR: buffer: use c_head() instead of buffer_wrap_sub(c->buf, p-o)
    - MINOR: buffer: use b_orig() to replace most references to b->data
    - MINOR: buffer: Use b_add()/bo_add() instead of accessing b->i/b->o.
    - MINOR: channel: remove almost all references to buf->i and buf->o
    - MINOR: channel: Add co_set_data().
    - MEDIUM: channel: adapt to the new buffer API
    - MINOR: checks: adapt to the new buffer API
    - MEDIUM: h2: update to the new buffer API
    - MINOR: buffer: remove unused bo_add()
    - MEDIUM: spoe: use the new buffer API for the SPOE buffer
    - MINOR: stats: adapt to the new buffers API
    - MINOR: cli: use the new buffer API
    - MINOR: cache: use the new buffer API
    - MINOR: stream-int: use the new buffer API
    - MINOR: stream: use wrappers instead of directly manipulating buffers
    - MINOR: backend: use new buffer API
    - MEDIUM: http: use wrappers instead of directly manipulating buffers states
    - MINOR: filters: convert to the new buffer API
    - MINOR: payload: convert to the new buffer API
    - MEDIUM: h1: port to new buffer API.
    - MINOR: flt_trace: adapt to the new buffer API
    - MEDIUM: compression: start to move to the new buffer API
    - MINOR: lua: use the wrappers instead of directly manipulating buffer states
    - MINOR: buffer: convert part bo_putblk() and bi_putblk() to the new API
    - MINOR: buffer: adapt buffer_slow_realign() and buffer_dump() to the new API
    - MAJOR: start to change buffer API
    - MINOR: buffer: remove the check for output on b_del()
    - MINOR: buffer: b_set_data() doesn't truncate output data anymore
    - MINOR: buffer: rename the "data" field to "area"
    - MEDIUM: buffers: move "output" from struct buffer to struct channel
    - MINOR: buffer: replace bi_fast_delete() with b_del()
    - MINOR: buffer: replace b{i,o}_put* with b_put*
    - MINOR: buffer: add a new file for ist + buffer manipulation functions
    - MINOR: checks: use b_putist() instead of b_putstr()
    - MINOR: buffers: remove b_putstr()
    - CLEANUP: buffer: minor cleanups to buffer.h
    - MINOR: buffers/channel: replace buffer_insert_line2() with ci_insert_line2()
    - MINOR: buffer: replace buffer_replace2() with b_rep_blk()
    - MINOR: buffer: rename the data length member to '->data'
    - MAJOR: buffer: finalize buffer detachment
    - MEDIUM: chunks: make the chunk struct's fields match the buffer struct
    - MAJOR: chunks: replace struct chunk with struct buffer
    - DOC: buffers: document the new buffers API
    - DOC: buffers: remove obsolete docs about buffers
    - MINOR: tasklets: Don't attempt to add a tasklet in the list twice.
    - MINOR: connections/mux: Add a new "subscribe" method.
    - MEDIUM: connections/mux: Revamp the send direction.
    - MINOR: connection: simplify subscription by adding a registration function
    - BUG/MINOR: http: Set brackets for the unlikely macro at the right place
    - BUG/MINOR: build: Fix compilation with debug mode enabled
    - BUILD: Generate sha256 checksums in publish-release
    - MINOR: debug: Add check for CO_FL_WILL_UPDATE
    - MINOR: debug: Add checks for conn_stream flags
    - MINOR: ist: Add the function isteqi
    - BUG/MEDIUM: threads: Fix the exit condition of the thread barrier
    - BUG/MEDIUM: mux_h2: Call h2_send() before updating polling.
    - MINOR: buffers: simplify b_contig_space()
    - MINOR: buffers: split b_putblk() into __b_putblk()
    - MINOR: buffers: add b_xfer() to transfer data between buffers
    - DOC: add some design notes about the new layering model
    - MINOR: conn_stream: add a new CS_FL_REOS flag
    - MINOR: conn_stream: add an rx buffer to the conn_stream
    - MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function
    - MEDIUM: stream-int: automatically call si_cs_recv_cb() if the cs has data on wake()
    - MINOR: h2: make each H2 stream support an intermediary input buffer
    - MEDIUM: h2: make h2_frt_decode_headers() use an intermediary buffer
    - MEDIUM: h2: make h2_frt_transfer_data() copy via an intermediary buffer
    - MEDIUM: h2: centralize transfer of decoded frames in h2_rcv_buf()
    - MEDIUM: h2: move headers and data frame decoding to their respective parsers
    - MEDIUM: buffers: make b_xfer() automatically swap buffers when possible
    - MEDIUM: h2: perform a single call to the data layer in demux()
    - MEDIUM: h2: don't call data_cb->recv() anymore
    - MINOR: h2: make use of CS_FL_REOS to indicate that end of stream was seen
    - MEDIUM: h2: use the default conn_stream's receive function
    - DOC: add more design feedback on the new layering model
    - MINOR: h2: add the error code and the max/last stream IDs to "show fd"
    - BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full
    - BUG/MEDIUM: stats: don't ask for more data as long as we're responding
    - BUG/MINOR: servers: Don't make "server" in a frontend fatal.
    - BUG/MEDIUM: tasks: make sure we pick all tasks in the run queue
    - BUG/MEDIUM: tasks: Decrement rqueue_size at the right time.
    - BUG/MEDIUM: tasks: use atomic ops for active_tasks_mask
    - BUG/MEDIUM: tasks: Make sure there's no task left before considering inactive.
    - MINOR: signal: don't pass the signal number anymore as the wakeup reason
    - MINOR: tasks: extend the state bits from 8 to 16 and remove the reason
    - MINOR: tasks: Add a flag that tells if we're in the global runqueue.
    - BUG/MEDIUM: tasks: make __task_unlink_rq responsible for the rqueue size.
    - MINOR: queue: centralize dequeuing code a bit better
    - MEDIUM: queue: make pendconn_free() work on the stream instead
    - DOC: queue: document the expected locking model for the server's queue
    - MINOR: queue: make sure pendconn->strm->pend_pos is always valid
    - MINOR: queue: use a distinct variable for the assigned server and the queue
    - MINOR: queue: implement pendconn queue locking functions
    - MEDIUM: queue: get rid of the pendconn lock
    - MINOR: tasks: Make active_tasks_mask volatile.
    - MINOR: tasks: Make global_tasks_mask volatile.
    - MINOR: pollers: Add a way to wake a thread sleeping in the poller.
    - MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code.
    - BUG/MEDIUM: threads/sync: use sched_yield when available
    - MINOR: ssl: BoringSSL matches OpenSSL 1.1.0
    - BUG/MEDIUM: h2: prevent orphaned streams from blocking a connection forever
    - BUG/MINOR: config: stick-table is not supported in defaults section
    - BUILD/MINOR: threads: unbreak build with threads disabled
    - BUG/MINOR: threads: Handle nbthread == MAX_THREADS.
    - BUG/MEDIUM: threads: properly fix nbthreads == MAX_THREADS
    - MINOR: threads: move "nbthread" parsing to hathreads.c
    - BUG/MEDIUM: threads: unbreak "bind" referencing an incorrect thread number
    - MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed
    - BUILD/MINOR: compiler: fix offsetof() on older compilers
    - SCRIPTS: git-show-backports: add missing quotes to "echo"
    - MINOR: threads: add more consistency between certain variables in no-thread case
    - MEDIUM: hathreads: implement a more flexible rendez-vous point
    - BUG/MEDIUM: cli: make "show fd" thread-safe
2018-08-02 18:12:50 +02:00
Willy Tarreau
bf9fd65088 BUG/MEDIUM: cli: make "show fd" thread-safe
The "show fd" command was implemented as a debugging aid but it's not
thread safe. Its features have grown, it can now dump some mux-specific
parts and is being used in production to capture some useful debugging
traces. But it will quickly crash the process when used during an H2 load
test for example, especially when haproxy is built with the DEBUG_UAF
option. It cannot afford not to be thread safe anymore. Let's make use
of the new rendez-vous point using thread_isolate() / thread_release()
to ensure that the data being dumped are not changing under us. The dump
becomes slightly slower under load but now it's safe.

This should be backported to 1.8 along with the rendez-vous point code
once considered stable enough.
2018-08-02 17:51:49 +02:00
Willy Tarreau
60b639ccbe MEDIUM: hathreads: implement a more flexible rendez-vous point
The current synchronization point enforces certain restrictions which
are hard to workaround in certain areas of the code. The fact that the
critical code can only be called from the sync point itself is a problem
for some callback-driven parts. The "show fd" command for example is
fragile regarding this.

Also it is expensive in terms of CPU usage because it wakes every other
thread just to be sure all of them join to the rendez-vous point. It's a
problem because the sleeping threads would not need to be woken up just
to know they're doing nothing.

Here we implement a different approach. We keep track of harmless threads,
which are defined as those either doing nothing, or doing harmless things.
The rendez-vous is used "for others" as a way for a thread to isolate itself.
A thread then requests to be alone using thread_isolate() when approaching
the dangerous area, and then waits until all other threads are either doing
the same or are doing something harmless (typically polling). The function
only returns once the thread is guaranteed to be alone, and the critical
section is terminated using thread_release().
2018-08-02 17:51:45 +02:00
Willy Tarreau
0c026f49e7 MINOR: threads: add more consistency between certain variables in no-thread case
When threads are disabled, some variables such as tid and tid_bit are
still checked everywhere, the MAX_THREADS_MASK macro is ~0UL while
MAX_THREADS is 1, and the all_threads_mask variable is replaced with a
macro forced to zero. The compiler cannot optimize away all this code
involving checks on tid and tid_bit, and we end up in special cases
where all_threads_mask has to be specifically tested for being zero or
not. It is not even certain the code paths are always equivalent when
testing without threads and with nbthread 1.

Let's change this to make sure we always present a single thread when
threads are disabled, and have the relevant values declared as constants
so that the compiler can optimize all the tests away. Now we have
MAX_THREADS_MASK set to 1, all_threads_mask set to 1, tid set to zero
and tid_bit set to 1. Doing just this has removed 4 kB of code in the
no-thread case.

A few checks for all_threads_mask==0 have been removed since it never
happens anymore.
2018-08-02 17:48:09 +02:00
Tim Duesterhus
7fec021537 MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed
http-request set-src possibly creates a situation where src and dst
are from different address families. Convert both addresses to IPv6
to avoid a PROXY UNKNOWN.

This patch should be backported to haproxy 1.8.
2018-07-30 11:23:30 +02:00
Willy Tarreau
c477b6fcc9 BUG/MEDIUM: threads: unbreak "bind" referencing an incorrect thread number
The "process" directive on "bind" lines supports process references and
thread references. No check is performed on the thread number validity,
so that if a listener is only bound to non-existent threads, the traffic
will never be processed. It easily happens when setting one bind line per
thread with an incorrect (or reduced) thread count. No warning appears
and some random connections are never served. It also happens when setting
thread references with threads support disabled at build time.

This patch makes use of the all_threads_mask variable to detect if some
referenced threads don't exist, to emit a warning and fix this.

This patch needs to be backported to 1.8, just like the previous one which
it depends on (MINOR: threads: move "nbthread" parsing to hathreads.c).
2018-07-30 11:10:46 +02:00
Willy Tarreau
0ccd32285f MINOR: threads: move "nbthread" parsing to hathreads.c
The purpose is to make sure that all variables which directly depend
on this nbthread argument are set at the right moment. For now only
all_threads_mask needs to be set. It used to be set while calling
thread_sync_init() which is called too late for certain checks. The
same function handles threads and non-threads, which removes the need
for some thread-specific knowledge from cfgparse.c.
2018-07-30 11:10:46 +02:00
Willy Tarreau
5e954e1f27 BUG/MEDIUM: threads: properly fix nbthreads == MAX_THREADS
While moving Olivier's patch for nbthread==MAX_THREADS in commit
3e12304 ("BUG/MINOR: threads: Handle nbthread == MAX_THREADS.") to
hathreads.c, I missed one place resulting in the computed thread mask
being used as the thread count, which is worse than the initial bug.
Let's fix it properly this time.

This fix must be backported to 1.8 just like the other one.
2018-07-30 11:10:26 +02:00
Olivier Houchard
3e12304ae0 BUG/MINOR: threads: Handle nbthread == MAX_THREADS.
If nbthread is MAX_THREADS, the shift operation needed to compute
all_threads_mask fails in thread_sync_init(). Instead pass a number
of threads to this function and let it compute the mask without
overflowing.

This should be backported to 1.8.
2018-07-27 17:18:22 +02:00
Willy Tarreau
85d9b84eb1 BUILD/MINOR: threads: unbreak build with threads disabled
Depending on the optimization level, gcc may complain that wake_thread()
uses an invalid array index for poller_wr_pipe[] when called from
__task_wakeup(). Normally the condition to get there never happens,
but it's simpler to ifdef out this part of the code which is only
used to wake other threads up. No backport is needed, this was brought
by the recent introduction of the ability to wake a sleeping thread.
2018-07-27 17:18:22 +02:00
Willy Tarreau
c786768dba BUG/MINOR: config: stick-table is not supported in defaults section
Thierry discovered that the following config crashes haproxy while
parsing the config (it's probably the smallest crasher) :

   defaults
       stick-table type ip size 1M

And indeed it does because it looks for the current proxy's name which it
does not have as it's the default one. This affects all versions since 1.6.

This fix must be backported to all versions back to 1.6.
2018-07-27 10:26:22 +02:00
Willy Tarreau
a2b5181e7a BUG/MEDIUM: h2: prevent orphaned streams from blocking a connection forever
Some h2 connections remaining in CLOSE_WAIT state forever have been
reported for a while. Thanks to detailed captures provided by Milan
Petruzelka, the sequence where this happens became clearer :

  1) multiple streams compete for the mux and are queued in the send_list

  2) at this point the mux has to emit a GOAWAY for any reason (for
     example because it received a bad message)

  3) the streams are woken up, notified about the error

  4) h2_detach() is called for each of them

  5) the CS they are detached from the H2S

  6) since the streams are marked as blocked for some room, they are
     orphaned and nothing more is done on them.

  7) at this point, any activity on the connection goes through h2_wake()
     which sees the conneciton in ERROR2 state, tries again to release
     the streams, cannot, and stops polling (thus even connection errors
     cannot be detected anymore).

=> from this point, no more events can be received on the connection, and
   the streams remain orphaned forever.

This patch makes sure that we never return without doing anything once
an error was met. It has to act both on the h2_detach() side (for h2
streams being detached after the error was emitted) and on the h2_wake()
side (for errors reported after h2s have already been orphaned).

Many thanks to Milan Petruzelka and Janusz Dziemidowicz for their
awesome work on this issue, collecting traces and testing patches,
and to Olivier Doucet for extra testing and confirming the fix.

This fix must be backported to 1.8.
2018-07-27 09:55:14 +02:00
Willy Tarreau
3ea2490b48 BUG/MEDIUM: threads/sync: use sched_yield when available
There is a corner case with the sync point which can significantly
degrade performance. The reason is that it forces all threads to busy
spin there, and that if there are less CPUs available than threads,
this busy activity from some threads will force others to wait longer
in epoll() or to simply be scheduled out while doing something else,
and will increase the time needed to reach the sync point.

Given that the sync point is not expected to be stressed *that* much,
better call sched_yield() while waiting there to release the CPU and
offer it to waiting threads.

On a simple test with 4 threads bound to two cores using "maxconn 1"
on the server line, the performance was erratic before the recent
scheduler changes (between 40 and 200 conn/s with hundreds of ms
response time), and it jumped to 7200 with 12ms response time with
this fix applied.

It should be backported to 1.8 since 1.8 is affected as well.
2018-07-27 07:54:08 +02:00
Olivier Houchard
ecfe673f61 MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code.
Now that we can wake one thread sleeping in the poller, we don't have to
use THREAD_WANT_SYNC any more.

This gives a significant performance boost on highly contended accesses
(servers with maxconn 1), showing a jump from 21k to 31k conn/s on a
test involving 8 threads.
2018-07-26 20:55:02 +02:00
Olivier Houchard
79321b95a8 MINOR: pollers: Add a way to wake a thread sleeping in the poller.
Add a new pipe, one per thread, so that we can write on it to wake a thread
sleeping in a poller, and use it to wake threads supposed to take care of a
task, if they are all sleeping.
2018-07-26 19:09:50 +02:00
Olivier Houchard
eba0c0b51d MINOR: tasks: Make global_tasks_mask volatile.
In order to make sure modifications are noticed by other threads when needed,
make global_tasks_mask volatile.
2018-07-26 19:09:50 +02:00
Olivier Houchard
9b03c0c9a7 MINOR: tasks: Make active_tasks_mask volatile.
To be sure we have the relevant informations, make active_tasks_mask volatile
2018-07-26 19:09:50 +02:00
Willy Tarreau
3201e4e428 MEDIUM: queue: get rid of the pendconn lock
This lock was necessary to manipulate the pendconn element between
concurrent places, but was causing great difficulties in the list walk
by having to iterate over multiple entries instead of being able to
safely pick the first one (in fact the first element was always the
right one but the locking model was hard to prove).

Here since we know we can always rely on the queue's locks, we take
the queue's lock every time we need to modify the element. In practice
it was already the case everywhere except in pendconn_dequeue() which
only works on an element that was already detached. This function had
to be protected against the risk of meeting an incompletely detached
element (which could be unlinked but not yet assigned). By taking the
queue lock around the LIST_ISEMPTY test, it's enough to ensure that a
concurrent thread either didn't begin or had completed the operation.

The true benefit really is in pendconn_process_next_strm() where we
can again safely work with the first element of each queue. This will
significantly simplify next updates to this code.
2018-07-26 17:32:51 +02:00
Willy Tarreau
7c6f8a2b0d MINOR: queue: implement pendconn queue locking functions
The new pendconn_queue_lock() and pendconn_queue_unlock() functions are
made to make it more convenient to lock or unlock the pendconn queue
either at the proxy or the server depending on pendconn->srv. This way
it is possible to remove the open-coding of these locks at various places.
These ones have been used in pendconn_unlink() and pendconn_add(), thus
significantly simplifying the logic there.
2018-07-26 17:32:51 +02:00
Willy Tarreau
88930dd364 MINOR: queue: use a distinct variable for the assigned server and the queue
The pendconn struct uses ->px and ->srv to designate where the element is
queued. There is something confusing regarding threads though, because we
have to lock the appropriate queue before inserting/removing elements, and
this queue may only be determined by looking at ->srv (if it's not NULL
it's the server, otherwise use the proxy). But pendconn_grab_from_px() and
pendconn_process_next_strm() both assign this ->srv field, making it
complicated to know what queue to lock before manipulating the element,
which is exactly why we have the pendconn_lock in the first place.

This commit introduces pendconn->target which is the target server that
the two aforementioned functions will set when assigning the server.
Thanks to this, the server pointer may always be relied on to determine
what queue to use.
2018-07-26 17:32:51 +02:00
Willy Tarreau
c1a60d6218 MINOR: queue: make sure pendconn->strm->pend_pos is always valid
pendconn_add() used to assign strm->pend_pos very late, after unlocking
the queue, so that a watching thread could see a random value in
pendconn->strm->pend_pos even while holding the lock on the element and
the queue itself. While there's currently nothing wrong with this, it
costs nothing to arrange it and will simplify code analysis later.
2018-07-26 17:32:51 +02:00
Willy Tarreau
6bdd05c0ef DOC: queue: document the expected locking model for the server's queue
The locking model is not trivial and is worth documenting to avoid seeing
apparent bugs everywhere while they are not.
2018-07-26 17:32:51 +02:00
Willy Tarreau
d0ad4a87f0 MEDIUM: queue: make pendconn_free() work on the stream instead
Now pendconn_free() takes a stream, checks that pend_pos is set, clears
it, and uses pendconn_unlink() to complete the job. It's cleaner and
centralizes all the bookkeeping work in pendconn_unlink() only and
ensures that there's a single place where the stream's position in the
queue is manipulated.
2018-07-26 17:32:51 +02:00
Willy Tarreau
9624faec86 MINOR: queue: centralize dequeuing code a bit better
For now the pendconns may be dequeued at two places :
  - pendconn_unlink(), which operates on a locked queue
  - pendconn_free(), which operates on an unlocked queue and frees
    everything.

Some changes are coming to the queue and we'll need to be able to be a
bit stricter regarding the places where we dequeue to keep the accounting
accurate. This first step renames the locked function __pendconn_unlink()
as it's for use by those aware of it, and introduces a new general purpose
pendconn_unlink() function which automatically grabs the necessary locks
before calling the former, and pendconn_cond_unlink() which additionally
checks the pointer and the presence in the queue.
2018-07-26 17:32:48 +02:00
Olivier Houchard
77551ee8a7 BUG/MEDIUM: tasks: make __task_unlink_rq responsible for the rqueue size.
As __task_wakeup() is responsible for increasing
rqueue_local[tid]/global_rqueue_size, make __task_unlink_rq responsible for
decreasing it, as process_runnable_tasks() isn't the only one that removes
tasks from runqueues.
2018-07-26 16:33:29 +02:00
Olivier Houchard
76e45181b2 MINOR: tasks: Add a flag that tells if we're in the global runqueue.
How that we have bits available in task->state, add a flag that tells if we're
in the global runqueue or not.
2018-07-26 16:33:10 +02:00
Willy Tarreau
ad8bd2467c MINOR: signal: don't pass the signal number anymore as the wakeup reason
This is never used and would even be wrong since the reasons are ORed
so two signals would be turned into a third value, just like if any
other reason was used at the same time.
2018-07-26 16:12:48 +02:00
Olivier Houchard
c4aac9effe BUG/MEDIUM: tasks: Make sure there's no task left before considering inactive.
We may remove the thread's bit in active_tasks_mask despite tasks for that
thread still being present in the global runqueue. To fix that, introduce
global_tasks_mask, and set the correspnding bits when we add a task to the
runqueue.
2018-07-26 15:40:22 +02:00
Willy Tarreau
189ea856a7 BUG/MEDIUM: tasks: use atomic ops for active_tasks_mask
We don't have the lock anymore so we need to protect it.
2018-07-26 15:16:43 +02:00
Olivier Houchard
e85ee7b663 BUG/MEDIUM: tasks: Decrement rqueue_size at the right time.
We need to decrement requeue_size when we remove a task form rqueue_local,
not when we remove if from the task list, or we'd also decrement it for any
tasklet, that was never in the rqueue in the first place.
2018-07-26 15:00:58 +02:00
Willy Tarreau
9a77186cb0 BUG/MEDIUM: tasks: make sure we pick all tasks in the run queue
Commit 09eeb76 ("BUG/MEDIUM: tasks: Don't forget to increase/decrease
tasks_run_queue.") addressed a count issue in the run queue and uncovered
another issue with the way the tasks are dequeued from the global run
queue. The number of tasks to pick is computed using an integral divide,
which results in up to nbthread-1 tasks never being run. The fix simply
consists in getting rid of the divide and checking the task count in the
loop.

No backport is needed, this is 1.9-specific.
2018-07-26 14:24:46 +02:00
Olivier Houchard
306e653331 BUG/MINOR: servers: Don't make "server" in a frontend fatal.
When parsing the configuration, if "server", "default-server" or
"server-template" are found in a frontend, we first warn that it will be
ignored, only to be considered a fatal error later. Be true to our word, and
just ignore it.

This should be backported to 1.8 and 1.7.
2018-07-24 17:13:54 +02:00
Willy Tarreau
055ba4f505 BUG/MEDIUM: stats: don't ask for more data as long as we're responding
The stats applet is still a bit hackish. It uses the HTTP txn to parse
the POST contents. Due to this it pretends not having parsed the request
from the buffer so that the HTTP parser continues to work fine on these
data. This comes with a side effect : the request lies pending in the
channel's buffer, and because of this, stream_int_update_applet() always
wakes the applet up. It's very visible when retrieving a large stats page
over a slow link as haproxy eats 100% of the CPU waiting for the data to
leave.

While the proper long term solution definitely is to consume these data
and parse the body from the applet, changing this is not suitable for a
fix.

What this patch does instead is to disable request polling as long as there
are pending data in the response buffer. Given that for almost all cases,
the applet remains busy sending data, this is at least enough to ensure
that we don't wake up for the pending request data while we're waiting for
the client to receive these data. Now a 5k backend stats page is dumped at
1% CPU over a 10 Mbps link instead of 100%, using 1500 epoll_wait() calls
instead of 80000.

Note that the previous fix (BUG/MEDIUM: stream-int: don't immediately
enable reading when the buffer was reportedly full) is necessary for the
effects of the fix to be noticed since both bugs have the exact same
effect.

This fix must be backported at least as far as 1.5.
2018-07-24 17:13:32 +02:00
Willy Tarreau
171d5f203a BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full
There is a long-time issue which affects some applets, at least the stats
applet. If a large stats page is read over a slow link, regularly the
channel's buffer contains too many response data to allow another round
of ci_putblk() to copy a new message. In this case the applet calls
si_applet_cant_put() to mention that it failed to emit data into the
channel's buffer, and wants to be called only once some room is made.

The problem is that stream_int_update(), which is called from
process_stream(), will clear this flag whenever it sees there's some
spare room in the channel's buffer. It causes the applet to be woken
again immediately. This is very visible when reading a large stats
page over a slow link, because in this case haproxy will run at 100%
CPU and strace shows mostly epoll_wait(0). It is very likely that some
other applets like CLI, Lua, peers or SPOE have also been affected but
that the effect were less noticeable because it was mixed with traffic.

Ideally stream_int_update() should not touch these flags, but changing
this would require a very careful auditing of all users. Instead here
what we do is that we respect the flag if the channel still has output
data. This way the flag will automatically disappear once the buffer is
empty, and the applet function will be called only when input data
remains, if at all.

This patch alone is not enough to observe the behaviour change on the
stats page because another bug takes over, addressed by next patch
(BUG/MEDIUM: stats: don't ask for more data as long as we're responding).
When both are applied, dumping stats for 5k backends over a 10 Mbps link
take 1% CPU instead of 100%, with 1.5k epoll_wait() calls instead of 80k.

This fix should be backported at least as far as 1.5.
2018-07-24 17:12:38 +02:00
Willy Tarreau
616ac81dec MINOR: h2: add the error code and the max/last stream IDs to "show fd"
This is intented to help debugging H2 in field.
2018-07-24 14:12:42 +02:00
Willy Tarreau
842ed9b1cb MEDIUM: h2: use the default conn_stream's receive function
This removes h2_rcv_buf() now that the generic code can handle it fine.
2018-07-20 19:37:12 +02:00
Willy Tarreau
39d68508c3 MINOR: h2: make use of CS_FL_REOS to indicate that end of stream was seen
This allows h2_rcv_buf() not to depend anymore on h2s at all and to become
generic.
2018-07-20 19:35:14 +02:00
Willy Tarreau
2df65e7194 MEDIUM: h2: don't call data_cb->recv() anymore
Now we simply call data_cb->wake() which will automatically perform the
recv() call if required.
2018-07-20 19:31:36 +02:00
Willy Tarreau
2a761dcf0d MEDIUM: h2: perform a single call to the data layer in demux()
Instead of calling the data layer from each individual frame processing
function, we now call it from demux. This requires to know the h2s that
was created inside h2c_frt_handle_headers(), which is why the pointer is
now returned. This results in a small performance boost from 58k to 60k
POST requests/s compared to -master, thanks to half the number of
si_cs_recv_cb() calls and 66% calls to si_cs_wake_cb().

It's interesting to note that all calls to data_cb->recv() are now always
immediately followed by a call to data_cb->wake(). The next step should
be to let the ->wake handler perform the recv() call itself. For this it
will be useful to have some info on the CS to indicate whether or not it
is ready to be read (ie: contains a non-empty input buffer).
2018-07-20 19:30:03 +02:00
Willy Tarreau
a56a6def91 MEDIUM: h2: move headers and data frame decoding to their respective parsers
Now we entirely process the input frame before transfering it above, so
that h2_rcv_buf() doesn't have to "speak" h2 anymore.
2018-07-20 19:21:43 +02:00
Willy Tarreau
454b57b347 MEDIUM: h2: centralize transfer of decoded frames in h2_rcv_buf()
We still call the parser but it should soon not be needed anymore. The
decode functions don't need the buffer nor the max size anymore. They
must also not touch the CS_FL_EOS or CS_FL_RCV_MORE flags either, so
this is done within h2_rcv_buf() after transmission.

The "flags" argument to h2_frt_decode_headers() and h2_frt_transfer_data()
has been removed since it's not used anymore.
2018-07-20 19:21:43 +02:00
Willy Tarreau
d755ea6c7d MEDIUM: h2: make h2_frt_transfer_data() copy via an intermediary buffer
The purpose here is also to ensure we can split the lower from the top
layers. The way the CS_FL_MSG_MORE flag is set was updated so that it's
set or cleared upon exit depending on the buffer's remaining contents.
2018-07-20 19:21:43 +02:00
Willy Tarreau
937f760e1e MEDIUM: h2: make h2_frt_decode_headers() use an intermediary buffer
The purpose is to decode to a temporary buffer and then to copy this buffer
to the caller. This double-copy definitely has an impact on performance, the
test code goes down from 220k to 140k req/s, but this memcpy() will disappear
soon.

The test on CO_RFL_BUF_WET has become irrelevant now since we only use
the cs' rxbuf, so we cannot be blocked by "output" data that has to be
forwarded first. Thus instead we don't start until the rxbuf is empty
(it will be drained from any input data once the stream processes it).
2018-07-20 19:21:43 +02:00
Willy Tarreau
0b559071dd MINOR: h2: make each H2 stream support an intermediary input buffer
The purpose is to decode to a temporary buffer and then to copy this buffer
to the caller upon request to avoid having to process frames on the fly
when called from the higher level. For now the buffer is only initialized
on stream creation via cs_new() and allocated if the buffer_wait's callback
is called.
2018-07-20 19:21:43 +02:00
Willy Tarreau
67b1e78f68 MEDIUM: stream-int: automatically call si_cs_recv_cb() if the cs has data on wake()
If the cs has data pending or shutdown and the input channel is still
waiting for reads, let's simply call the recv() function from the wake()
callback. This will allow the lower layers to simply wake the upper one
up without having to consider the recv() nor anything else.
2018-07-20 19:21:43 +02:00
Willy Tarreau
11c9aa424e MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function
This function is generic and is able to automatically transfer data
from a conn_stream's rx buffer to the destination buffer. It does this
automatically if the mux doesn't define another rcv_buf() function.
2018-07-20 19:21:43 +02:00
Olivier Houchard
f495fc460e BUG/MEDIUM: mux_h2: Call h2_send() before updating polling.
In h2_wake(), make sure we call h2_send() before we try to update the
polling flags, and detect connection errors, or errors will never be
detected.
2018-07-20 19:07:49 +02:00
Christopher Faulet
ddb6c16576 BUG/MEDIUM: threads: Fix the exit condition of the thread barrier
In thread_sync_barrier, we exit when all threads have set their own bit in the
barrier mask. It is done by comparing it to all_threads_mask. But we must not
use a simple equality to do so, becaue all_threads_mask may change. Since commit
ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
we must use a bitwise AND to test is all bits of all_threads_mask are set.

This also requires that all_threads_mask is set to volatile if we want to
catch changes.

This patch must be backported in 1.8.
2018-07-20 14:24:41 +02:00
Christopher Faulet
4507351a2f BUG/MINOR: build: Fix compilation with debug mode enabled
It remained some fragments of the old buffers API in debug messages, here and
there.

This was caused by the recent buffer API changes, no backport is needed.
2018-07-20 10:45:20 +02:00
Christopher Faulet
005e79e5dd BUG/MINOR: http: Set brackets for the unlikely macro at the right place
When test on the header "Early-Data" is made, the unlikely macro must encompass
the condition.

This patch must be backported in 1.8.
2018-07-20 10:42:53 +02:00
Willy Tarreau
8318885487 MINOR: connection: simplify subscription by adding a registration function
This new function wl_set_waitcb() prepopulates a wait_list with a tasklet
and a context and returns it so that it can be passed to ->subscribe() to
be added to a connection or conn_stream's wait_list. The caller doesn't
need to know all the insiders details anymore this way.
2018-07-19 18:31:07 +02:00
Olivier Houchard
910b2bc829 MEDIUM: connections/mux: Revamp the send direction.
Totally nuke the "send" method, instead, the upper layer decides when it's
time to send data, and if it's not possible, uses the new subscribe() method
to be called when it can send data again.
2018-07-19 18:31:07 +02:00
Olivier Houchard
6ff2039d13 MINOR: connections/mux: Add a new "subscribe" method.
Add a new "subscribe" method for connection, conn_stream and mux, so that
upper layer can subscribe to them, to be called when the event happens.
Right now, the only event implemented is "SUB_CAN_SEND", where the upper
layer can register to be called back when it is possible to send data.

The connection and conn_stream got a new "send_wait_list" entry, which
required to move a few struct members around to maintain an efficient
cache alignment (and actually this slightly improved performance).
2018-07-19 16:23:43 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Willy Tarreau
c9fa0480af MAJOR: buffer: finalize buffer detachment
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.

The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.

The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.

The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
2018-07-19 16:23:43 +02:00
Willy Tarreau
e3128024bf MINOR: buffer: replace buffer_replace2() with b_rep_blk()
This one is more generic and designed to work on a random block. It
may later get a b_rep_ist() variant since many strings are already
available as (ptr,len).
2018-07-19 16:23:43 +02:00
Willy Tarreau
4d893d440c MINOR: buffers/channel: replace buffer_insert_line2() with ci_insert_line2()
There was no point keeping that function in the buffer part since it's
exclusively used by HTTP at the channel level, since it also automatically
appends the CRLF. This further cleans up the buffer code.
2018-07-19 16:23:43 +02:00
Willy Tarreau
a094fde2b6 MINOR: checks: use b_putist() instead of b_putstr()
This slightly simplifies the code.
2018-07-19 16:23:43 +02:00
Willy Tarreau
ea1b06d5bb MINOR: buffer: add a new file for ist + buffer manipulation functions
The new file istbuf.h links the indirect strings (ist) with the buffers.
The purpose is to encourage addition of more standard buffer manipulation
functions that rely on this in order to improve the overall ease of use
along all the code. Just like ist.h and buf.h, this new file is not
expected to depend on anything beyond these two files.

A few functions were added and/or converted from buffer.h :
  - b_isteq()  : indicates if a buffer and a string match
  - b_isteat() : consumes a string from the buffer if it matches
  - b_istput() : appends a small string to a buffer (all or none)
  - b_putist() : appends part of a large string to a buffer

The equivalent functions were removed from buffer.h and changed at the
various call places.
2018-07-19 16:23:43 +02:00
Willy Tarreau
55372f646f MINOR: buffer: replace b{i,o}_put* with b_put*
The two variants now do exactly the same (appending at the tail of the
buffer) so let's not keep the distinction between these classes of
functions and have generic ones for this. It's also worth noting that
b{i,o}_putchk() wasn't used at all and was removed.
2018-07-19 16:23:43 +02:00
Willy Tarreau
72a100b386 MINOR: buffer: replace bi_fast_delete() with b_del()
There's no distinction between in and out data now. The latter covers
the needs of the former and supports wrapping. The extra cost is
negligible given the locations where it's used.
2018-07-19 16:23:43 +02:00
Olivier Houchard
08afac0fd7 MEDIUM: buffers: move "output" from struct buffer to struct channel
Since we never access this field directly anymore, but only through the
channel's wrappers, it can now move to the channel. The buffers are now
completely free from the distinction between input and output data.
2018-07-19 16:23:43 +02:00
Willy Tarreau
892f1dbe4f MINOR: buffer: rename the "data" field to "area"
Since we use "_data" for the amount of data at many places, as opposed to
"_space" for the amount of space, let's rename the "data" field to "area"
so that we can reuse "data" later for the amount of data in the buffer
(currently called "len" despite not being contigous).
2018-07-19 16:23:43 +02:00
Willy Tarreau
d54a8ceb97 MAJOR: start to change buffer API
This is intentionally the minimal and safest set of changes, some cleanups
area still required. These changes are quite tricky and cannot be
independantly tested, so it's important to keep this patch as bisectable
as possible.

buf_empty and buf_wanted were changed and are now exactly similar since
there's no <p> member in the structure anymore. Given that no test is
ever made in the code to check that buf == &buf_wanted, it may be possible
that we don't need to have two anymore, unless some buf_empty tests have
precedence. This will have to be investigated.

A significant part of this commit affects the HTTP compression code,
which used to deeply manipulate the input and output buffers without
any reasonable solution for a better abstraction. For this reason, if
any regression is met and designates this patch as the culprit, it is
important to run tests which specifically involve compression or which
definitely don't use it in order to spot the issue.

Cc: Olivier Houchard <ohouchard@haproxy.com>
2018-07-19 16:23:42 +02:00
Willy Tarreau
81521ed850 MINOR: buffer: adapt buffer_slow_realign() and buffer_dump() to the new API
These are the last ones which need to be adapted.
2018-07-19 16:23:42 +02:00
Willy Tarreau
a79021af6f MINOR: lua: use the wrappers instead of directly manipulating buffer states
This replaces chn->buf->p with ci_head(chn), chn->buf->o with co_data(chn)
and chn->buf->i with ci_data(chn). This is in order to help porting to the
new buffer API.
2018-07-19 16:23:42 +02:00
Olivier Houchard
0b662843c8 MEDIUM: compression: start to move to the new buffer API
This part is tricky, it passes a channel where we used to have a buffer,
in order to reduce the API changes during the big switch. This way all
the channel's wrappers to distinguish between input and output are
available. It also makes sense given that the compression applies on
a channel since it's in the forwarding path.
2018-07-19 16:23:42 +02:00
Willy Tarreau
f158937620 MINOR: flt_trace: adapt to the new buffer API
The trace_hexdump() function now takes a count in argument to know where
to start dumping from.
2018-07-19 16:23:42 +02:00
Willy Tarreau
5e74b0ba3b MEDIUM: h1: port to new buffer API.
The parser now uses the channel exclusively to access the data. In order
to avoid the cost of indirection, a local variable "input" was added to
the function that replaces buf->p. Given that this part is on the critical
path, it will have to be tested again for any visible performance loss.
2018-07-19 16:23:42 +02:00
Willy Tarreau
fc0785d26c MINOR: payload: convert to the new buffer API
Mostly mechanical changes. It seems that some of them could be further
factored out by adding a few more wrappers at the channel level.
2018-07-19 16:23:42 +02:00
Willy Tarreau
44a41a83fb MINOR: filters: convert to the new buffer API
Use b_set_data() to modify the buffer size, and use the usual wrappers.
2018-07-19 16:23:42 +02:00
Willy Tarreau
f37954d4da MEDIUM: http: use wrappers instead of directly manipulating buffers states
This is aimed at easing the transition to the new API. There are a few places
which deserve some simplifications afterwards because ci_head() is called
often and may be placed into a local pointer.
2018-07-19 16:23:42 +02:00
Willy Tarreau
6a445ebc8a MINOR: backend: use new buffer API
The few locations dealing with the buffer rewind were updated not to
touch ->o nor ->p anymore and to use the channel's functions instead.
2018-07-19 16:23:42 +02:00
Willy Tarreau
7e9c30a7e0 MINOR: stream: use wrappers instead of directly manipulating buffers
This will help transitioning to the new API. These changes are very
scarce limited.
2018-07-19 16:23:42 +02:00
Willy Tarreau
77e478c56e MINOR: stream-int: use the new buffer API
A few locations still accessing ->i and ->o directly were changed to
use ci_data() and co_data() respectively. A call to b_del() was replaced
with co_set_data() in si_cs_send() so that ->o will is automatically be
decremented after the migration.
2018-07-19 16:23:42 +02:00
Willy Tarreau
178b987025 MINOR: cache: use the new buffer API
A few direct accesses to buf->p now use ci_head() instead.
2018-07-19 16:23:42 +02:00
Willy Tarreau
851d12c3d4 MINOR: cli: use the new buffer API
Almost nothing required to be touched.
2018-07-19 16:23:42 +02:00
Willy Tarreau
97f538b895 MINOR: stats: adapt to the new buffers API
The changes are fairly straightforward. Some places require to trim
the length. Maybe we'd need a b_extend() or b_adjust() for this.
2018-07-19 16:23:42 +02:00
Willy Tarreau
4fca5a9940 MEDIUM: spoe: use the new buffer API for the SPOE buffer
The buffer is not used as a forwarding buffer so we can simply map ->i
to ->len and ->p to b_head(). It *seems* that p is never modified, so
that we could even always use b_orig(). This needs to be rechecked.
2018-07-19 16:23:42 +02:00
Willy Tarreau
b7b5fe1a14 MEDIUM: h2: update to the new buffer API
There is no more distinction between ->i and ->o for the mux's buffers,
we always use b_data() to know the buffer's length since only one side
is used for each direction.
2018-07-19 16:23:42 +02:00
Willy Tarreau
876171e636 MINOR: checks: adapt to the new buffer API
The code exclusively used ->i for data received and ->o for data sent. Now
it always uses b_data(), b_head() and b_tail() so that there is no more
distinction between ->i and ->o.
2018-07-19 16:23:42 +02:00
Willy Tarreau
cd9e60db00 MEDIUM: channel: adapt to the new buffer API
Also, ci_swpbuf() was removed (unused).
2018-07-19 16:23:42 +02:00
Olivier Houchard
acd1403794 MINOR: buffer: Use b_add()/bo_add() instead of accessing b->i/b->o.
Use the newly available functions instead of using the buffer fields directly.
2018-07-19 16:23:42 +02:00
Willy Tarreau
591d445049 MINOR: buffer: use b_orig() to replace most references to b->data
This patch updates most users of b->data to use b_orig().
2018-07-19 16:23:42 +02:00
Willy Tarreau
144c5c4d21 MINOR: buffer: replace buffer_flush() with c_adv(chn, ci_data(chn))
It used to forward some input into output.
2018-07-19 16:23:41 +02:00
Willy Tarreau
5ba65521a3 MINOR: buffer: replace buffer_pending() with ci_data()
It used to return b->i for channels, which is what ci_data() does.
2018-07-19 16:23:41 +02:00
Willy Tarreau
3f6799975f MINOR: buffer: replace bi_space_for_replace() with ci_space_for_replace()
This one computes the size that can be overwritten over the input part
of the buffer, so it's channel-specific.
2018-07-19 16:23:41 +02:00
Willy Tarreau
2375233ef0 MINOR: buffer: replace buffer_full() with channel_full()
It's only used by channels since we need to know the amount of output
data.
2018-07-19 16:23:41 +02:00
Willy Tarreau
0c7ed5d264 MINOR: buffer: replace buffer_empty() with b_empty() or c_empty()
For the same consistency reasons, let's use b_empty() at the few places
where an empty buffer is expected, or c_empty() if it's done on a channel.
Some of these places were there to realign the buffer so
{b,c}_realign_if_empty() was used instead.
2018-07-19 16:23:41 +02:00
Willy Tarreau
d760eecf61 MINOR: buffer: replace buffer_not_empty() with b_data() or c_data()
It's mostly for consistency as many places already use one of these instead.
2018-07-19 16:23:41 +02:00
Willy Tarreau
eac5259888 MINOR: buffer: use b_room() to determine available space in a buffer
We used to have variations around buffer_total_space() and
size-buffer_len() or size-b_data(). Let's simplify all this. buffer_len()
was also removed as not used anymore.
2018-07-19 16:23:41 +02:00
Willy Tarreau
337ea57cfc MINOR: connection: add a new receive flag : CO_RFL_BUF_WET
With this flag we introduce the notion of "dry" vs "wet" buffers : some
demultiplexers like the H2 mux require as much room as possible for some
operations that are not retryable like decoding a headers frame. For this
they need to know if the buffer is congested with data scheduled for
leaving soon or not. Since the new API will not provide this information
in the buffer itself, the caller must indicate it. We never need to know
the amount of such data, just the fact that the buffer is not in its
optimal condition to be used for receipt. This "CO_RFL_BUF_WET" flag is
used to mention that such outgoing data are still pending in the buffer
and that a sensitive receiver should better let it "dry" before using it.
2018-07-19 16:23:41 +02:00
Willy Tarreau
7f3225f251 MINOR: connection: add a flags argument to rcv_buf()
The mux and transport rcv_buf() now takes a "flags" argument, just like
the snd_buf() one or like the equivalent syscall lower part. The upper
layers will use this to pass some information such as indicating whether
the buffer is free from outgoing data or if the lower layer may allocate
the buffer itself.
2018-07-19 16:23:41 +02:00
Willy Tarreau
d9cf540457 MEDIUM: mux: make mux->rcv_buf() take a size_t for the count
It also returns a size_t. This is in order to clean the API. Note
that the H2 mux still uses some ints in the functions called from
h2_rcv_buf(), though it's not really a problem given that H2 frames
are smaller. It may deserve a general cleanup later though.
2018-07-19 16:23:41 +02:00
Willy Tarreau
bfc4d77ad3 MEDIUM: connection: make xprt->rcv_buf() use size_t for the count
Just like we have a size_t for xprt->snd_buf(), we adjust to use size_t
for rcv_buf()'s count argument and return value. It also removes the
ambiguity related to the possibility to see a negative value there.
2018-07-19 16:23:41 +02:00
Willy Tarreau
deccd1116d MEDIUM: mux: make mux->snd_buf() take the byte count in argument
This way the mux doesn't need to modify the buffer's metadata anymore
nor to know the output's size. The mux->snd_buf() function now takes a
const buffer and it's up to the caller to update the buffer's state.

The return type was updated to return a size_t to comply with the count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
787db9a6a4 MEDIUM: connection: make xprt->snd_buf() take the byte count in argument
This way the senders don't need to modify the buffer's metadata anymore
nor to know about the output's split point. This way the functions can
take a const buffer and it's clearer who's in charge of updating the
buffer after a send. That's why the buffer realignment is now performed
by the caller of the transport's snd_buf() functions.

The return type was updated to return a size_t to comply with the count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
55f3ce1c91 MINOR: buffer: make b_getblk_nc() take size_t for the block sizes
Till now we used to reimplement it using ints to limit external changes
but we must adjust it and the various users to switch to size_t.
2018-07-19 16:23:41 +02:00
Willy Tarreau
206ba834ef MINOR: buffer: make b_getblk_nc() take const pointers
Now that there are no more users requiring to modify the buffer anymore,
switch these ones to const char and const buffer. This will make it more
obvious next time send functions are tempted to modify the buffer's output
count. Minor adaptations were necessary at a few call places which were
using char due to the function's previous prototype.
2018-07-19 16:23:41 +02:00
Willy Tarreau
9c7f2d19bf MEDIUM: h2: don't use b_ptr() nor b_end() anymore
The few places where they were still used were replaced with b_peek() and
b_wrap() respectively. The parts making use of ->i and ->o should now be
convertible to the new API.
2018-07-19 16:23:41 +02:00
Willy Tarreau
0bad0439f4 MEDIUM: h2: do not use buf->o anymore inside h2_snd_buf's loop
buf->o is only retrieved at the loop entry and modified using b_del()
on exit. We're close to being able to change the API to take a count
argument.
2018-07-19 16:23:41 +02:00
Willy Tarreau
f40e68227b MINOR: h1: make h1_measure_trailers() use an offset and a count
This will be needed by the H2 encoder to restart after wrapping.
2018-07-19 16:23:41 +02:00
Willy Tarreau
84d6b7af87 MINOR: h1: make h1_parse_chunk_size() not depend on b_ptr() anymore
It's similar to the previous commit so that the function doesn't rely
on buf->p anymore.
2018-07-19 16:23:41 +02:00
Willy Tarreau
c0973c6742 MINOR: h1: make h1_skip_chunk_crlf() not depend on b_ptr() anymore
It now takes offsets relative to the buffer's head. It's up to the
callers to add this offset which corresponds to the buffer's output
size.
2018-07-19 16:23:41 +02:00
Willy Tarreau
5dd17353d5 MEDIUM: h2: prevent the various mux encoders from modifying the buffer
Functions h2s_frt_make_resp_headers() and h2s_frt_make_resp_data() used
to modify the buffer's output data count. This is problematic for the
buffer's rework as we don't want to rely on this anymore. This commit
modifies these functions to take an offset (relative to the buffer's
head) and a maximum byte count. Thus h2_snd_buf() now calls them with
buf->o and takes care of removing deleted data itself. The send functions
now almost support being passed const buffers (except for the data part
which is still embedded).
2018-07-19 16:23:41 +02:00
Willy Tarreau
1dc41e75d8 MINOR: h2: clarify the fact that the send functions are unsigned
There's no more error return combined with the send output, though
the comments were misleading. Let's fix this as well as the functions'
prototypes. h2_snd_buf()'s return value wasn't changed yet since it
has to match the ->snd_buf prototype.
2018-07-19 16:23:40 +02:00
Willy Tarreau
7314be8e2c MINOR: h1: make h1_measure_trailers() take the byte count in argument
The principle is that it should not have to take this value from the
buffer itself anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
188e230704 MINOR: buffer: convert most b_ptr() calls to c_ptr()
The latter uses the channel wherever a channel is known.
2018-07-19 16:23:40 +02:00
Willy Tarreau
e5f12ce7f2 MINOR: buffer: replace bi_del() and bo_del() with b_del()
Till now the callers had to know which one to call for specific use cases.
Let's fuse them now since a single one will remain after the API migration.
Given that bi_del() may only be used where o==0, just combine the two tests
by first removing output data then only input.
2018-07-19 16:23:40 +02:00
Willy Tarreau
a1f78fb652 MINOR: buffer: replace bo_getblk_nc() with b_getblk_nc() which takes an offset
This will be important so that we can parse a buffer without touching it.
Now we indicate where from the buffer's head we plan to start to copy, and
for how many bytes. This will be used by send functions to loop at the end
of the buffer without having to update the buffer's output byte count.
2018-07-19 16:23:40 +02:00
Willy Tarreau
90ed3836db MINOR: buffer: replace bo_getblk() with direction agnostic b_getblk()
This new functoin limits itself to the amount of data available in the
buffer and doesn't care about the direction anymore. It's only called
from co_getblk() which already checks that no more than the available
output bytes is requested.
2018-07-19 16:23:40 +02:00
Willy Tarreau
e4d5a036ed MINOR: buffer: merge b{i,o}_contig_space()
These ones were merged into a single b_contig_space() that covers both
(the bo_ case was a simplified version of the other one). The function
doesn't use ->i nor ->o anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
0e11d59af6 MINOR: buffer: remove bo_contig_data()
The two call places now make use of b_contig_data(0) and check by
themselves that the returned size is no larger than the scheduled
output data.
2018-07-19 16:23:40 +02:00
Willy Tarreau
8f9c72d301 MINOR: buffer: remove bi_end()
It was replaced by ci_tail() when the channel is known, or b_tail() in
other cases.
2018-07-19 16:23:40 +02:00
Willy Tarreau
41e38ac0ee MINOR: buffer: remove bo_end()
It was replaced by either b_tail() when the buffer has no input data, or
b_peek(b, b->o).
2018-07-19 16:23:40 +02:00
Willy Tarreau
89faf5d7c3 MINOR: buffer: remove bo_ptr()
It was replaced by co_head() when a channel was known, otherwise b_head().
2018-07-19 16:23:40 +02:00
Willy Tarreau
dda2e41881 MINOR: buffer: remove bi_ptr()
It's now been replaced by b_head() when b->o is null, ci_head() when
the channel is known, or b_peek(b, b->o) in other situations.
2018-07-19 16:23:40 +02:00
Willy Tarreau
7194d3cc3b MINOR: buffer: split bi_contig_data() into ci_contig_data and b_config_data()
This function was sometimes used from a channel and sometimes from a buffer.
In both cases it requires knowledge of the size of the output data (to skip
them). Here the split ensures the channel can deal with this point, and that
other places not having output data can continue to work.
2018-07-19 16:23:40 +02:00
Willy Tarreau
aa7af7213d MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps()
And remove the unused function.
2018-07-19 16:23:40 +02:00
Willy Tarreau
bcbd39370f MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
2018-07-19 16:23:40 +02:00
Willy Tarreau
c0a51c51b1 MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code
Since all call places can use the trash now, this is not needed anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
0db4d10efc MINOR: h2: use b_slow_realign() with the trash as a swap buffer
H2 doesn't use the trash so it can make use of it as a swap area when
calling b_slow_realign(). This way we don't need buffer_slow_realign()
anymore.
2018-07-19 16:23:40 +02:00
Willy Tarreau
fd8d42f496 MEDIUM: channel: make channel_slow_realign() take a swap buffer
The few call places where it's used can use the trash as a swap buffer,
which is made for this exact purpose. This way we can rely on the
generic b_slow_realign() call.
2018-07-19 16:23:40 +02:00
Willy Tarreau
4cf1300e6a MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign()
Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.
2018-07-19 16:23:40 +02:00
Willy Tarreau
d5b343bf9e MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign()
This patch removes buffer_realign() and replaces it with c_realign_if_empty()
instead.
2018-07-19 16:23:40 +02:00
Willy Tarreau
4d452384a3 MINOR: compression: pass the channel to http_compression_buffer_end()
This will be needed to access the output data count from the channel
after the buffer/channel changes.
2018-07-19 16:23:39 +02:00
Willy Tarreau
506a29ac6e MINOR: buffer: switch buffer sizes and offsets to size_t
Passing unsigned ints everywhere is painful, and will cause some headache
later when we'll want to integrate better with struct ist which already
uses size_t. Let's switch buffers to use size_t instead.
2018-07-19 16:23:39 +02:00
Willy Tarreau
42d55b9b6a BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout
If a timeout strikes on the connection side with some active streams,
there is a corner case which can sometimes cause the following sequence
to happen :

  - There are active streams but there are data in the mux buffer
    (eg: a client suddenly disconnected during a download with pending
    requests). The timeout is active.

  - The timeout strikes, h2_timeout_task() is called, kills the task and
    doesn't close the connection since there are streams left ; The
    connection is marked in H2_CS_ERROR ;

  - the streams are woken up and closed ;

  - when the last stream closes, calling h2_detach(), it sees the
    tree list is empty, but there is no condition allowing the
    connection to be closed (mbuf->o > 0), thus it does nothing ;

  - since the task is dead, there's no more hope to clear this
    situation later

For now we can take care of this by adding a test for the presence of
H2_CS_ERROR and !task, implying the timeout task triggered already
and will not be able to handle this again.

Over the long term it seems like a more reliable test on should be
made, so that it is possible to know whether or not someone is still
able to close this connection.

A big thanks to Janusz Dziemidowicz and Milan Petruzelka for providing
many details helping in figuring this bug.
2018-07-19 14:31:47 +02:00
Willy Tarreau
00610960a1 BUG/MEDIUM: h2: never leave pending data in the output buffer on close
We currently don't process trailers on H2, but this has an impact : on
chunked HTTP/1 responses, we decide to emit the ES bit once we see the
0CRLF. From this point the stream switches to the CLOSED state, which
aborts processing of the remaining bytes. Thus the extra CRLF which ends
trailers is not processed and remains in the buffer. This prevents the
stream from being notified about end of transmission, which in turn keeps
the mux busy and prevents the connection from quitting.

The case of the trailers is not the root cause of this issue, though it
is what triggers it. The root cause is that upon error and/or close, once
we know we're not going to process any more data, we must absolutely flush
any remaining bytes from the output buffer, otherwise there is no way the
stream can quit. This is what this patch does.

It looks very likely related to the issues reported and debugged by
Janusz Dziemidowicz and Milan Petruzelka.

One way to reproduce it is to chain two proxies with the last one emitting
chunked data (typically using the stats page) :

    global
        stats socket /tmp/sock1 mode 666 level admin
        stats timeout 1h
        tune.ssl.default-dh-param 1024
        tune.bufsize 16384

    defaults
        mode http
        timeout connect 4s
        timeout client 10s
        timeout server 20s

    listen px1
        bind :4443 ssl crt rsa+dh2048.pem npn h2 alpn h2
        server s1 127.0.0.1:4445

    listen px2
        bind :4444 ssl crt rsa+dh2048.pem npn h2 alpn h2
        bind :4445
        stats uri /

Then use curl to fetch the stats through px1 :

    curl --http2 -k "https://127.0.0.1:4443/"

When curl is sent to the first one, "show sess" issued to the CLI will
show a remaining session during the client timeout. When curl is aimed at
port 4444 (px2), there is no such remaining session.

This fix needs to be backported to 1.8.
2018-07-19 11:09:12 +02:00
Willy Tarreau
c65edac804 MINOR: h2: add the mux and demux buffer lengths on "show fd"
It is convenient during debugging sessions to know if the mux and demux
buffers are empty/full/other. Let's report this on "show fd" output.
2018-07-19 10:54:43 +02:00
Willy Tarreau
f210191dcd BUG/MEDIUM: h2: don't accept new streams if conn_streams are still in excess
The streams bookkeeping made in H2 is used for protocol compliance only
but it doesn't consider the number of conn_streams still attached to the
mux. It causes an issue when http-request set-nice rules are applied on
H2 requests processed on a saturated machine. Indeed, in this case, the
requests are accepted and assigned a default nice value of zero. When
they are processed, their nice value changes to a higher one (say 1024).
The response is sent through the H2 mux, which detects the end of stream
and decrements the protocol-level stream count (h2c->nb_streams). The
client may then send a new request. But the conn_stream is still attached
and will require a new call to process_stream() to finish, which is made
through the scheduler. Given that the machine is saturated, it is assumed
that many tasks are present in the scheduler. Thus the closing tasks holding
a higher nice value will pass after the new stream creations. If the client
is fast enough with a low latency link, it may add a lot of new stream
creations before the stream terminations have a chance to disappear due
to their high nice value, resulting in a huge amount of memory being used.

The solution consists in letting a mux always monitor its conn_streams and
refrain from creating new ones when it is full. Here the H2 mux checks the
nb_cs counter and sets a new blocked flag (H2_CF_DEM_TOOMANY) if the limit
was reached, so that the frame parser requests a pause in the new stream
creation, leaving some time for the pending conn_streams to vanish.

Several experiments were made using varying thresholds to see if
overbooking would provide any benefit here but it turned out not to be
the case, so the conn_stream limit remains set to the exact streams
limit. Interestingly various performance measurements showed that the
code tends to be slightly faster now than without the limit, probably
due to the smoother memory usage.

This commit requires previous patch ("MINOR: h2: keep a count of the number
of conn_streams attached to the mux"). It needs to be backported to 1.8.
2018-07-19 10:23:15 +02:00
Willy Tarreau
7ac60e836a MINOR: h2: keep a count of the number of conn_streams attached to the mux
The h2 mux only knows about the number of H2 streams which are not in a
CLOSED state. This is used for protocol compliance. But it doesn't hold
the number of really attached streams. It is a problem because depending
on scheduling, it is possible that more streams are attached to the mux
than the ones seen at the protocol level, due to some streams taking some
time to be detached. Let's add this count based on the conn_streams.

Note: this patch is part of a series of fixes which will have to be
backported to 1.8.
2018-07-19 09:06:37 +02:00
Willy Tarreau
17b4aa1adc BUG/MINOR: ssl: properly ref-count the tls_keys entries
Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.

Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.

Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.

This fix needs to be backported from 1.6 to 1.8.
2018-07-18 08:59:50 +02:00
Baptiste Assmann
8e2d9430c0 MINOR: dns: new DNS options to allow/prevent IP address duplication
By default, HAProxy's DNS resolution at runtime ensure that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.

This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".
2018-07-12 17:56:44 +02:00
Baptiste Assmann
84221b4e90 MINOR: dns: fix wrong score computation in dns_get_ip_from_response
dns_get_ip_from_response() is used to compare the caller current IP to
the IP available in the records returned by the DNS server.
A scoring system is in place to get the best IP address available.
That said, in the current implementation, there are a couple of issues:
1. a comment does not match what the code does
2. the code does not match what the commet says (score value is not
   incremented with '2')

This patch fixes both issues.

Backport status: 1.8
2018-07-12 17:56:34 +02:00
Baptiste Assmann
741e00a820 CLEANUP: dns: inacurate comment about prefered IP score
The comment was about "prefered network ip version" while it's actually
"prefered ip version" in the code.
Fixed

Backport status: 1.7 and 1.8
  Be careful, this patch may not apply on 1.7, since the score was '4'
  for this item at that time.
2018-07-12 17:55:16 +02:00
Baptiste Assmann
e56fffd896 CLEANUP: dns: remove obsolete macro DNS_MAX_IP_REC
Since a8c6db8d2d, this macro is not used
anymore and can be safely removed.

Backport status: 1.8
2018-07-12 17:55:05 +02:00
William Lallemand
bfd8eb5909 MINOR: startup: change session/process group settings
Change the way the process groups are set. Indeed setsid() was called
for every processes which caused the worker to have a different process
group than the master.

This patch behave in a better way:

- In daemon mode only, each child do a setsid()
- In master worker + daemon mode, the setsid() is done in the master before
forking the children
- In any foreground mode, we don't do a setsid()

Could be backported in 1.8 but the master-worker mode is mostly used
with systemd which rely on cgroups so that won't affect much people.
2018-07-04 19:29:56 +02:00
Thierry FOURNIER
70d318ccb7 BUG/MEDIUM: lua: possible CLOSE-WAIT state with '\n' headers
The Lua parser doesn't takes in account end-of-headers containing
only '\n'. It expects always '\r\n'. If a '\n' is processes the Lua
parser considers it miss 1 byte, and wait indefinitely for new data.

When the client reaches their timeout, it closes the connection.
This close is not detected and the connection keep in CLOSE-WAIT
state.

I guess that this patch fix only a visible part of the problem.
If the Lua HTTP parser wait for data, the timeout server or the
connectio closed by the client may stop the applet.

How reproduce the problem:

HAProxy conf:

   global
      lua-load bug38.lua
   frontend frt
      timeout client 2s
      timeout server 2s
      mode http
      bind *:8080
      http-request use-service lua.donothing

Lua conf

   core.register_service("donothing", "http", function(applet) end)

Client request:

   echo -ne 'GET / HTTP/1.1\n\n' | nc 127.0.0.1 8080

Look for CLOSE-WAIT in the connection with "netstat" or "ss". I
use this script:

   while sleep 1; do ss | grep CLOSE-WAIT; done

This patch must be backported in 1.6, 1.7 and 1.8

Workaround: enable the "hard-stop-after" directive, and perform
periodic reload.
2018-07-01 06:08:43 +02:00
Willy Tarreau
43e903553e MINOR: stick-tables: make stktable_release() do nothing on NULL
stktable_release() has been involved in two recent crashes by being
used without enough care. Just like any free() function this one is
often called on an exit path with a possibly unsafe argument. Given
that there is another case (smp_fetch_sc_trackers()) which theorically
could call it with an unchecked NULL, though it cannot happen since
the function doesn't support being called with src_* hence cannot make
use of tmpstkctr, let's rather move the check into the function itself
to make it safer for the long term.

This patch could be backported to 1.8 as a strengthening measure.
2018-06-27 06:33:20 +02:00
Tim Duesterhus
65189c17c6 BUG/MAJOR: stick_table: Complete incomplete SEGV fix
This commit completes the incomplete segmentation fault fix
in commit ac1f3ed64b.

Likewise it must be backported to haproxy 1.8.
2018-06-26 20:29:36 +02:00
William Lallemand
091d827e09 BUG/BUILD: threads: unbreak build without threads
The build without threads was once again broken.

This issue was introduced in commit ba86c6c ("MINOR: threads: Be sure to
remove threads from all_threads_mask on exit").

This is exactly the same problem as last time it happened, because of
all_threads_mask not being defined with USE_THREAD=

This must be backported in 1.8
2018-06-26 14:15:12 +02:00
Thierry FOURNIER
ac1f3ed64b BUG/MAJOR: Stick-tables crash with segfault when the key is not in the stick-table
When a lookup is done on a key not present in the stick-table the "st"
pointer is NULL and it is used to return the converter result, but it
is used untested with stktable_release().

This regression was introduced in 1.8.10 here:

   BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
   commit d7bd88009d88dd413e01bc0baa90d6662a3d7718
   Author: Daniel Corbett <dcorbett@haproxy.com>
   Date:   Sun May 27 09:47:12 2018 -0400

Minimal conf for reproducong the problem:

   frontend test
      mode http
      stick-table type ip size 1m expire 1h store gpc0
      bind *:8080
      http-request redirect location /a if { src,in_table(test) }

The segfault is triggered using:

   curl -i http://127.0.0.1:8080/

This patch must be backported in 1.8
2018-06-26 13:51:46 +02:00
Christopher Faulet
ba86c6c25b MINOR: threads: Be sure to remove threads from all_threads_mask on exit
When HAProxy is started with several threads, Each running thread holds a bit in
the bitfiled all_threads_mask. This bitfield is used here and there to check
which threads are registered to take part in a specific processing. So when a
thread exits, it seems normal to remove it from all_threads_mask.

No direct impact could be identified with this right now but it would
be better to backport it to 1.8 as a preventive measure to avoid complex
situations like the one in previous bug.
2018-06-22 14:55:15 +02:00
Christopher Faulet
d8fd2af882 BUG/MEDIUM: threads: Use the sync point to check active jobs and exit
When HAProxy is shutting down, it exits the polling loop when there is no jobs
anymore (jobs == 0). When there is no thread, it works pretty well, but when
HAProxy is started with several threads, a thread can decide to exit because
jobs variable reached 0 while another one is processing a task (e.g. a
health-check). At this stage, the running thread could decide to request a
synchronization. But because at least one of them has already gone, the others
will wait infinitly in the sync point and the process will never die.

To fix the bug, when the first thread (and only this one) detects there is no
active jobs anymore, it requests a synchronization. And in the sync point, all
threads will check if jobs variable reached 0 to exit the polling loop.

This patch must be backported in 1.8.
2018-06-22 10:16:26 +02:00
Dave Chiluk
8618a6a5e2 MINOR: Some spelling cleanup in the comments.
Signed-off-by: Dave Chiluk <chiluk+haproxy@indeed.com>
2018-06-21 20:43:52 +02:00
Olivier Houchard
d0e60d852a BUG/MEDIUM: fd: Don't modify the update_mask in fd_dodelete().
Only the pollers should remove bits in the update_mask. Removing it will
mean if the fd is currently in the global update list, it will never be
removed, and while it's mostly harmless in 1.9, in 1.8, only update_mask
is checked to know if the fd is already in the list or not, so we can end
up trying to add a fd that is already in the list, and corrupt it, which
means some fd may not be added to the poller.

This should be backported to 1.8.
2018-06-20 10:21:44 +02:00
Emmanuel Hocdet
3448c490ca BUG/MEDIUM: ssl: do not store pkinfo with SSL_set_ex_data
Bug from 96b7834e: pkinfo is stored on SSL_CTX ex_data and should
not be also stored on SSL ex_data without reservation.
Simply extract pkinfo from SSL_CTX in ssl_sock_get_pkey_algo.

No backport needed.
2018-06-18 13:34:09 +02:00
Thierry FOURNIER
28962c9941 BUG/MAJOR: ssl: OpenSSL context is stored in non-reserved memory slot
We never saw unexplicated crash with SSL, so I suppose that we are
luck, or the slot 0 is always reserved. Anyway the usage of the macro
SSL_get_app_data() and SSL_set_app_data() seem wrong. This patch change
the deprecated functions SSL_get_app_data() and SSL_set_app_data()
by the new functions SSL_get_ex_data() and SSL_set_ex_data(), and
it reserves the slot in the SSL memory space.

For information, this is the two declaration which seems wrong or
incomplete in the OpenSSL ssl.h file. We can see the usage of the
slot 0 whoch is hardcoded, but never reserved.

   #define SSL_set_app_data(s,arg)     (SSL_set_ex_data(s,0,(char *)arg))
   #define SSL_get_app_data(s)      (SSL_get_ex_data(s,0))

This patch must be backported at least in 1.8, maybe in other versions.
2018-06-18 10:32:14 +02:00
Thierry FOURNIER
16ff050478 BUG/MAJOR: ssl: Random crash with cipherlist capture
The cipher list capture struct is stored in the SSL memory space,
but the slot is reserved in the SSL_CTX memory space. This causes
ramdom crashes.

This patch should be backported to 1.8
2018-06-18 10:32:12 +02:00
Frédéric Lécaille
f874a83b57 BUG/MINOR: lua: Segfaults with wrong usage of types.
Patrick reported that this simple configuration made haproxy segfaults:

    global
        lua-load /tmp/haproxy.lua

    frontend f1
        mode http
        bind :8000
        default_backend b1

        http-request lua.foo

    backend b1
        mode http
        server s1 127.0.0.1:8080

with this '/tmp/haproxy.lua' script:

    core.register_action("foo", { "http-req" }, function(txn)
        txn.sc:ipmask(txn.f:src(), 24, 112)
    end)

This is due to missing initialization of the array of arguments
passed to hlua_lua2arg_check() which makes it enter code with
corrupted arguments.

Thanks a lot to Patrick Hemmer for having reported this issue.

Must be backported to 1.8, 1.7 and 1.6.
2018-06-18 10:23:47 +02:00
Olivier Houchard
9db0fedb59 BUG/MINOR: tasklets: Just make sure we don't pass a tasklet to the handler.
We can't just set t to NULL if it's a tasklet, or we'd have a hard time
accessing to t->process, so just make sure we pass NULL as the first parameter
of t->process if it's a tasklet.
This should be a non-issue at this point, as tasklets aren't used yet.
2018-06-14 18:57:26 +02:00
William Lallemand
579fb25b62 BUG/MAJOR: map: fix a segfault when using http-request set-map
The bug happens with an existing entry, when you try to overwrite the
value with wrong data, for example, a string when the type is INT.

The code path was not secure and tried to set *err and *merr while
err = merr = NULL when performing an http action.

Must be backported in 1.6, 1.7, 1.8.
2018-06-11 11:02:06 +02:00
William Lallemand
6e1796e85d BUG/MINOR: signals: ha_sigmask macro for multithreading
The behavior of sigprocmask in an multithreaded environment is
undefined.

The new macro ha_sigmask() calls either pthreads_sigmask() or
sigprocmask() if haproxy was built with thread support or not.

This should be backported to 1.8.
2018-06-08 18:24:53 +02:00
William Lallemand
933642c6ef BUG/MINOR: don't ignore SIG{BUS,FPE,ILL,SEGV} during signal processing
We don't have any reason of blocking those signals.

If SIGBUS, SIGFPE, SIGILL, or SIGSEGV are generated while they are blocked, the
result is undefined, unless the signal was generated by kill(2), sigqueue(3), or
raise(3).

This should be backported to 1.8.
2018-06-08 18:22:43 +02:00
William Lallemand
1aab50bb4a BUG/MEDIUM: threads: handle signal queue only in thread 0
Signals were handled in all threads which caused some signals to be lost
from time to time. To avoid complicated lock system (threads+signals),
we prefer handling the signals in one thread avoiding concurrent access.

The side effect of this bug was that some process were not leaving from
time to time during a reload.

This patch must be backported in 1.8.
2018-06-08 18:22:31 +02:00
Thierry FOURNIER
fc044c98e4 MINOR: lua: Increase debug information
When an unrecoverable error raises, the user receive poor information
for the trouble shooting. For example:

   [ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big.

Unfortunately, the memory allocation error can be throwed by many
function, and we have no informatio to reach the original cause.
This patch add the list of function called from the entry point to
the function in error, like this:

   [ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big from [C] method 'req_get_headers', bug35.lua:2 global 'ee', bug35.lua:6 global 'ff', bug35.lua:10 C function line 9.
2018-06-08 18:18:33 +02:00
Olivier Houchard
b4dd15bd6f BUG/MINOR: unix: Make sure we can transfer abns sockets on seamless reload.
When checking if a socket we got from the parent is suitable for a listener,
we just checked that the path matched sockname.tmp, however this is
unsuitable for abns sockets, where we don't have to create a temporary
file and rename it later.
To detect that, check that the first character of the sun_path is 0 for
both, and if so, that &sun_path[1] is the same too.

This should be backported to 1.8.
2018-06-07 14:33:44 +02:00
Olivier Houchard
b1ca58b245 MINOR: tasks: Don't define rqueue if we're building without threads.
To make sure we don't inadvertently insert task in the global runqueue,
while only the local runqueue is used without threads, make its definition
and usage conditional on USE_THREAD.
2018-06-06 16:35:12 +02:00
David Carlier
cc0a957a50 MINOR: task: Fix compiler warning.
Waking up task, when checking if it is a valid entry.
Similarly to commit caa8a37ffe,
casting explicitally to void pointer as HA_ATOMIC_CAS needs.
2018-06-05 13:55:57 +02:00
Willy Tarreau
34b1facbcf MINOR: stats: also report the nice and number of calls for applets
Since applets are now part of the main scheduler, it's useful to report
their nice value and the number of calls to the applet handler, to see
where the CPU is spent.
2018-06-05 11:18:21 +02:00
Christopher Faulet
6381650516 MAJOR: spoe: upgrade the SPOP version to 2.0 and remove the support for 1.0
The commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order")
introduced an incompatibility with older agents. So the major version of the
SPOP is increased to make the situation unambiguous. And because before the fix,
the protocol is buggy, the support of the version 1.0 is removed to be sure to
not continue to support buggy agents.

The agents in the contrib folder (spoa_example, modsecurity and mod_defender)
are also updated to announce the SPOP version 2.0.

So, to be clear, from the patch, connections to agents announcing the SPOP
version 1.0 will be rejected.

This patch must be backported in 1.8.
2018-06-04 17:33:48 +02:00
Thierry FOURNIER
66b8919b10 BUG/MEDIUM: lua/socket: Buffer error, may segfault
The buffer pointer is already updated. It is again updated
when it is given to the function ci_putblk().

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
101b97619a BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock
When we write data, we risk to encounter a dead-loack. The
function "stream_int_notify()" cannot be called the the
cosocket because the caller acquire a lock and when the socket
is closed, the cleanup function try to acquire the same lock.,
so a dead-lock raises.

In other way, the function stream_int_update_applet() can't
be called because it schedumes the applet only if some activity
in the buffers were detected. It is not always the case. We
replace this function by appctx_wakeup() which wake up the
applet inconditionnaly.

The last part of the fix is setting right signals. the applet
call the stream_int_update() function if the output buffer si
not empty, and ask for put data if some rite signals are
registered.

This patch must be backported in 1.6, 1.7 and 1.8. Note that it requires
patch "MINOR: task/notification: Is notifications registered" to be
applied.
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
ba42fcd064 BUG/MEDIUM: lua/socket: Notification error
Each time the send function yields, a notification must be registered.
Without this notification, the task is never wakeup when data arrives.

Today, the notification is registered only if the buffer is not available.
Other cases like the buffer is too small for all data are not processed.

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
7e4ee47acc BUG/MAJOR: lua: Dead lock with sockets
In some cases, when we are waiting for data and the socket
timeout expires, we have a dead lock. The Lua socket locks
the applet socket, and call for a notify. The notify
immediately executes code and try to acquire the same lock,
so ... dead lock.

stream_int_notify() cant be used because it wakeup the applet
task only if the stream have changes. The changes are forces
by Lua, but not repported on the stream.

stream_int_update_applet() cant be used because the deadlock.

So, I inconditionnaly wakeup the applet. This wake is performed
asynchronously, and will call a stream_int_notify().

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Thierry FOURNIER
af4bd0867a BUG/MEDIUM: lua/socket: wrong scheduling for sockets
The appctx pointer is given from any variable which are wrong.
This implies the wakeup of wrong applet, and the socket are no
longer responsive.

This behavior is hidden by another inherited error which is
fixed in the next patch.

This patch remove all wrong appctx affectations.

This patch must be backported in 1.6, 1.7 and 1.8
2018-05-31 10:58:41 +02:00
Christopher Faulet
3a47e5e25c BUG/MEDIUM: spoe: Return an error when the wrong ACK is received in sync mode
This is required to let a message processing timed out. Because, when it
happens, there is no more context attached to the SPOE applet that sent the
NOTIFY frame. So when the ACK is received, it is too late. This is the same
situation when we receive the wrong ACK. It is invalid in sync mode. Otherwise,
the SPOE applet remains in the state "WAITING_SYNC_ACK" until the idle timeout
is reached. In such case, the applet is seen as busy and it is unusable. If this
happens too often, more and more applets will be created because some others are
blocked. If there is a maxconn on the SPOE backend, all processings will be
drastically slowdown.

Returning an error in such cases, in sync mode, allow us to terminate the SPOE
applet. Because it means the agent is unresponsive or too slow.

Note this bug exists only if the sync mode is used.

This patch must be backported in 1.8.
2018-05-30 15:34:48 +02:00
Ben Draut
44e609bfa5 MINOR: dns: Implement parse-resolv-conf directive
This introduces a new directive for the `resolvers` section:
`parse-resolv-conf`. When present, it will attempt to add any
nameservers in `/etc/resolv.conf` to the list of nameservers
for the current `resolvers` section.

[Mailing list thread][1].

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html
2018-05-30 05:17:16 +02:00
Olivier Houchard
082627af77 MINOR: task: Also consider the task list size when getting global tasks.
We're taking tasks from the global runqueue based on the number of tasks
the thread already have in its local runqueue, but now that we have a task
list, we also have to take that into account.
2018-05-28 15:20:59 +02:00
Olivier Houchard
736ea41c6c BUG/MEDIUM: task: Don't forget to decrement max_processed after each task.
When the task list was introduced, we bogusly lost max_processed--, that means
we would execute as much tasks as present in the list, and we would never
set active_tasks_mask, so the thread would go to sleep even if more tasks were
to be executed.

1.9-dev only, no backport is needed.
2018-05-28 15:20:57 +02:00
Willy Tarreau
1b0f85e47f MINOR: stats: also report the failed header rewrites warnings on the stats page
These ones concern the warnings detected during header addition/insertion.
They are visible in the tooltip reporting the per-status codes stats. The
frontend and backend contain a total of request+response warnings, while
server only has the response warnings.
2018-05-28 15:16:23 +02:00
Tim Duesterhus
3fd1973d37 MINOR: http: Log warning if (add|set)-header fails
This patch adds a warning if an http-(request|reponse) (add|set)-header
rewrite fails to change the respective header in a request or response.

This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.
2018-05-28 14:53:59 +02:00
Daniel Corbett
3e60b11100 BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
When using table_* converters ref_cnt was incremented
and never decremented causing entries to not expire.

The root cause appears to be that stktable_lookup_key()
was called within all sample_conv_table_* functions which was
incrementing ref_cnt and not decrementing after completion.

Added stktable_release() to the end of each sample_conv_table_*
function and reworked the end logic to ensure that ref_cnt is
always decremented after use.

This should be backported to 1.8
2018-05-28 10:36:20 +02:00
Olivier Houchard
673867c357 MAJOR: applets: Use tasks, instead of rolling our own scheduler.
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.

Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
2018-05-26 20:03:30 +02:00
Olivier Houchard
1599b80360 MINOR: tasks: Make the number of tasks to run at once configurable.
Instead of hardcoding 200, make the number of tasks to be run configurable
using tune.runqueue-depth. 200 is still the default.
2018-05-26 20:03:24 +02:00
Olivier Houchard
b0bdae7b88 MAJOR: tasks: Introduce tasklets.
Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.

For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidently be confused as not in the queue.

Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).
2018-05-26 20:03:19 +02:00
Olivier Houchard
f6e6dc12cd MAJOR: tasks: Create a per-thread runqueue.
A lot of tasks are run on one thread only, so instead of having them all
in the global runqueue, create a per-thread runqueue which doesn't require
any locking, and add all tasks belonging to only one thread to the
corresponding runqueue.

The global runqueue is still used for non-local tasks, and is visited
by each thread when checking its own runqueue. The nice parameter is
thus used both in the global runqueue and in the local ones. The rare
tasks that are bound to multiple threads will have their nice value
used twice (once for the global queue, once for the thread-local one).
2018-05-26 19:27:29 +02:00
Olivier Houchard
9f6af33222 MINOR: tasks: Change the task API so that the callback takes 3 arguments.
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
2018-05-26 19:23:57 +02:00
Thierry FOURNIER
8c126c7235 BUG/MEDIUM: lua/socket: Length required read doesn't work
The limit of data read works only if all the data is in the
input buffer. Otherwise (if the data arrive in chunks), the
total amount of data is not taken in acount.

Only the current read data are compared to the expected amout
of data.

This patch must be backported from 1.9 to 1.6
2018-05-26 08:51:05 +02:00
Daniel Corbett
9215ffa6b2 BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file
When creating a state file using "show servers state" an empty field is
created in the srv_addr column if the server is from the socket family
AF_UNIX.  This leads to a warning on start up when using
"load-server-state-from-file". This patch defaults srv_addr to "-" if
the socket family is not covered.

This patch should be backported to 1.8.
2018-05-24 22:06:08 +02:00
Olivier Houchard
f3d9e608d7 BUG/MEDIUM: dns: Delay the attempt to run a DNS resolution on check failure.
When checks fail, the code tries to run a dns resolution, in case the IP
changed.
The old way of doing that was to check, in case the last dns resolution
hadn't expired yet, if there were an applicable IP, which should be useless,
because it has already be done when the resolution was first done, or to
run a new resolution.
Both are a locking nightmare, and lead to deadlocks, so instead, just wake the
resolvers task, that should do the trick.

This should be backported to 1.8.
2018-05-23 16:57:15 +02:00
Lukas Tribus
926594f606 MINOR: ssl: set SSL_OP_PRIORITIZE_CHACHA
Sets OpenSSL 1.1.1's SSL_OP_PRIORITIZE_CHACHA unconditionally, as per [1]:

When SSL_OP_CIPHER_SERVER_PREFERENCE is set, temporarily reprioritize
ChaCha20-Poly1305 ciphers to the top of the server cipher list if a
ChaCha20-Poly1305 cipher is at the top of the client cipher list. This
helps those clients (e.g. mobile) use ChaCha20-Poly1305 if that cipher
is anywhere in the server cipher list; but still allows other clients to
use AES and other ciphers. Requires SSL_OP_CIPHER_SERVER_PREFERENCE.

[1] https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_clear_options.html
2018-05-23 16:55:15 +02:00
William Lallemand
8a16fe0d05 BUG/MEDIUM: cache: don't cache when an Authorization header is present
RFC 7234 says:

A cache MUST NOT store a response to any request, unless:
[...] the Authorization header field (see Section 4.2 of [RFC7235]) does
      not appear in the request, if the cache is shared, unless the
      response explicitly allows it (see Section 3.2), [...]

In this patch we completely disable the cache upon the receipt of an
Authorization header in the request. In this case it's not possible to
either use the cache or store into the cache anymore.

Thanks to Adam Eijdenberg of Digital Transformation Agency for raising
this issue.

This patch must be backported to 1.8.
2018-05-23 10:36:44 +02:00
Thierry Fournier
d5b073cf1f MINOR: lua: Improve error message
The function hlua_ctx_resume return less text message and more error
code. These error code allow the caller to return appropriate
message to the user.
2018-05-22 18:57:46 +02:00
Willy Tarreau
cbe6da5eb0 BUG/MINOR: ssl/lua: prevent lua from affecting automatic maxconn computation
Since commit 36d1374 ("BUG/MINOR: lua: Fix SSL initialisation") in 1.6, the
Lua code always initializes an SSL server. It caused a small visible side
effect which is that by calling ssl_sock_prepare_srv_ctx(), it forces
global.ssl_used_backend to 1 and makes the initialization code believe that
there are some SSL servers in certain backends. This detection is used to
figure how to set the global maxconn value when only the memory usage is
limited. As such, even a configuration with no SSL at all will have a very
conservative maxconn.

The configuration below exhibits this :

   global
        ssl-server-verify none
        stats socket /tmp/sock1 mode 666 level admin
        tune.bufsize 16384

   listen  px
        timeout client  5s
        timeout server  5s
        timeout connect 5s
        bind :4445
        #bind :4443 ssl crt rsa+dh2048.pem
        #server s1 127.0.0.1:8003 ssl

Starting it with "-m 200" to limit it to 200 MB of RAM reports 1500 for
Maxconn, the same when uncommenting the "server" line, and 1300 when
uncommenting the "bind" line, regardless of the "server" line's status.

In practice it doesn't make sense to consider that Lua's server template
counts for one regular SSL server, because even if used for SSL, it will
not take large connection counts, compared to a backend relaying traffic.
Thus the solution consists in resetting the ssl_used_backend to its
previous value after creating the server_ctx from the Lua code. With the
fix, the same config with the same parameters now show :
  - maxconn=5700 when neither side uses SSL
  - maxconn=1500 when only one side uses SSL
  - maxconn=1300 when both sides use SSL

This fix can be backported to versions 1.6 and beyond.
2018-05-18 17:09:35 +02:00
Christopher Faulet
68db0235fd CLEANUP: spoe: Remove unused variables the agent structure
applets_act and applets_idle were used for debugging purpose. Now, these values
are part of the agent's counters.
2018-05-18 15:04:46 +02:00
Thierry FOURNIER
c4dcaff3f0 BUG/MEDIUM: spoe: Flags are not encoded in network order
The flags are direct copy of the "unsigned int" in the network stream,
so the stream contains a 32 bits field encoded with the host endian.
 - This is not reliable for stream betwen different architecture host
 - For x86, the bits doesn't correspond to the documentation.

This patch add some precision in the documentation and put the bitfield
in the stream usig network butes order.

Warning: this patch can break compatibility with existing agents.

This patch should be backported in all version supporing SPOE

Original network capture:

   12:28:16.181343 IP 127.0.0.1.46782 > 127.0.0.1.12345: Flags [P.], seq 134:168, ack 59, win 342, options [nop,nop,TS val 2855241281 ecr 2855241281], length 34
           0x0000:  4500 0056 6b94 4000 4006 d10b 7f00 0001  E..Vk.@.@.......
           0x0010:  7f00 0001 b6be 3039 a3d1 ee54 7d61 d6f7  ......09...T}a..
           0x0020:  8018 0156 fe4a 0000 0101 080a aa2f 8641  ...V.J......./.A
           0x0030:  aa2f 8641 0000 001e 0301 0000 0000 010f  ./.A............
                                          ^^^^^^^^^^
           0x0040:  6368 6563 6b2d 636c 6965 6e74 2d69 7001  check-client-ip.
           0x0050:  0006 7f00 0001                           ......

Fixed network capture:

   12:24:26.948165 IP 127.0.0.1.46706 > 127.0.0.1.12345: Flags [P.], seq 4066280627:4066280661, ack 3148908096, win 342, options [nop,nop,TS val 2855183972 ecr 2855177690], length 34
           0x0000:  4500 0056 0538 4000 4006 3768 7f00 0001  E..V.8@.@.7h....
           0x0010:  7f00 0001 b672 3039 f25e 84b3 bbb0 8640  .....r09.^.....@
           0x0020:  8018 0156 fe4a 0000 0101 080a aa2e a664  ...V.J.........d
           0x0030:  aa2e 8dda 0000 001e 0300 0000 0114 010f  ................
                                          ^^^^^^^^^^
           0x0040:  6368 6563 6b2d 636c 6965 6e74 2d69 7001  check-client-ip.
           0x0050:  0006 7f00 0001                           ......
2018-05-18 13:50:53 +02:00
Thierry FOURNIER
01a3f20740 BUG/MINOR: spoe: Mistake in error message about SPOE configuration
The announced accepted chars are "[a-zA-Z_-.]", but
the real accepted alphabet is "[a-zA-Z0-9_.]".

Numbers are supported and "-" is not supported.

This patch should be backported to 1.8 and 1.7
2018-05-18 13:50:40 +02:00
sada
05ed330d72 BUG/MINOR: lua: Socket.send threw runtime error: 'close' needs 1 arguments.
Function `hlua_socket_close` expected exactly one argument on the Lua stack.
But when `hlua_socket_close` was called from `hlua_socket_write_yield`,
Lua stack had 3 arguments. So `hlua_socket_close` threw the exception with
message "'close' needs 1 arguments".

Introduced new helper function `hlua_socket_close_helper`, which removed the
Lua stack argument count check and only checked if the first argument was
a socket.

This fix should be backported to 1.8, 1.7 and 1.6.
2018-05-18 13:48:21 +02:00
Willy Tarreau
03f4ec47d9 BUG/MEDIUM: ssl: properly protect SSL cert generation
Commit 821bb9b ("MAJOR: threads/ssl: Make SSL part thread-safe") added
insufficient locking to the cert lookup and generation code : it uses
lru64_lookup(), which will automatically remove and add a list element
to the LRU list. It cannot be simply read-locked.

A long-term improvement should consist in using a lockless mechanism
in lru64_lookup() to safely move the list element at the head. For now
let's simply use a write lock during the lookup. The effect will be
minimal since it's used only in conjunction with automatically generated
certificates, which are much more expensive and rarely used.

This fix must be backported to 1.8.
2018-05-17 10:56:47 +02:00
Willy Tarreau
ba20dfc501 BUG/MEDIUM: http: don't always abort transfers on CF_SHUTR
Pawel Karoluk reported on Discourse[1] that HTTP/2 breaks url_param.

Christopher managed to track it down to the HTTP_MSGF_WAIT_CONN flag
which is set there to ensure the connection is validated before sending
the headers, as we may need to rewind the stream and hash again upon
redispatch. What happens is that in the forwarding code we refrain
from forwarding when this flag is set and the connection is not yet
established, and for this we go through the missing_data_or_waiting
path. This exit path was initially designed only to wait for data
from the client, so it rightfully checks whether or not the client
has already closed since in that case it must not wait for more data.
But it also has the side effect of aborting such a transfer if the
client has closed after the request, which is exactly what happens
in H2.

A study on the code reveals that this whole combined check should
be revisited : while it used to be true that waiting had the same
error conditions as missing data, it's not true anymore. Some other
corner cases were identified, such as the risk to report a server
close instead of a client timeout when waiting for the client to
read the last chunk of data if the shutr is already present, or
the risk to fail a redispatch when a client uploads some data and
closes before the connection establishes. The compression seems to
be at risk of rare issues there if a write to a full buffer is not
yet possible but a shutr is already queued.

At the moment these risks are extremely unlikely but they do exist,
and their impact is very minor since it mostly concerns an issue not
being optimally handled, and the fixes risk to cause more serious
issues. Thus this patch only focuses on how the HTTP_MSGF_WAIT_CONN
is handled and leaves the rest untouched.

This patch needs to be backported to 1.8, and could be backported to
earlier versions to properly take care of HTTP/1 requests passing via
url_param which are closed immediately after the headers, though this
is unlikely as this behaviour is only exhibited by scripts.

[1] https://discourse.haproxy.org/t/haproxy-1-8-x-url-param-issue-in-http2/2482/13
2018-05-16 11:35:05 +02:00
William Lallemand
0154edc96f BUG/MINOR: cli: don't stop cli_gen_usage_msg() when kw->usage == NULL
In commit abbf607 ("MEDIUM: cli: Add payload support") some cli keywords
without usage message have been added at the beginning of the keywords
array.

cli_gen_usage_usage_msg() use the kw->usage == NULL to stop generating
the usage message for the current keywords array. With those keywords at
the beginning, the whole array in cli.c was ignored in the usage message
generation.

This patch now checks the keyword itself, allowing a keyword without
usage message anywhere in the array.
2018-05-15 15:16:23 +02:00
PiBa-NL
c55b88ece6 BUG/MEDIUM: pollers/kqueue: use incremented position in event list
When composing the event list for subscribe to kqueue events, the index
where the new event is added must be after the previous events, as such
the changes counter should continue counting.

This caused haproxy to accept connections but not try read and process
the incoming data.

This patch is for 1.9 only
2018-05-11 14:08:56 +02:00
Willy Tarreau
29d698040d BUG/MINOR: lua: ensure large proxy IDs can be represented
In function hlua_fcn_new_proxy() too small a buffer was passed to
snprintf(), resulting in large proxy or listener IDs to make
snprintf() fail. It is unlikely to meet this case but let's fix it
anyway.

This fix must be backported to all stable branches where it applies.
2018-05-06 14:50:09 +02:00
PiBa-NL
706d5ee0c3 BUG/MINOR: lua: schedule socket task upon lua connect()
The parameters like server-address, port and timeout should be set before
process_stream task is called to avoid the stream being 'closed' before it
got initialized properly. This is most clearly visible when running with
tune.lua.forced-yield=1.. So scheduling the task should not be done when
creating the lua socket, but when connect is called. The error
"socket: not yet initialised, you can't set timeouts." would then appear.

Below code for example also shows this issue, as the sleep will
yield the lua code:
  local con = core.tcp()
  core.sleep(1)
  con:settimeout(10)
2018-05-06 14:36:41 +02:00
Olivier Houchard
cb92f5cae4 MINOR: pollers: move polled_mask outside of struct fdtab.
The polled_mask is only used in the pollers, and removing it from the
struct fdtab makes it fit in one 64B cacheline again, on a 64bits machine,
so make it a separate array.
2018-05-06 06:27:34 +02:00
Olivier Houchard
6b96f7289c BUG/MEDIUM: pollers: Use a global list for fd shared between threads.
With the old model, any fd shared by multiple threads, such as listeners
or dns sockets, would only be updated on one threads, so that could lead
to missed event, or spurious wakeups.
To avoid this, add a global list for fd that are shared, using the same
implementation as the fd cache, and only remove entries from this list
when every thread as updated its poller.

[wt: this will need to be backported to 1.8 but differently so this patch
 must not be backported as-is]
2018-05-06 06:27:09 +02:00
Olivier Houchard
6a2cf8752c MINOR: fd: Make the lockless fd list work with multiple lists.
Modify fd_add_to_fd_list() and fd_rm_from_fd_list() so that they take an
offset in the fdtab to the list entry, instead of hardcoding the fd cache,
so we can use them with other lists.
2018-05-06 06:25:49 +02:00
Olivier Houchard
9b36cb4a41 BUG/MEDIUM: task: Don't free a task that is about to be run.
While running a task, we may try to delete and free a task that is about to
be run, because it's part of the local tasks list, or because rq_next points
to it.
So flag any task that is in the local tasks list to be deleted, instead of
run, by setting t->process to NULL, and re-make rq_next a global,
thread-local variable, that is modified if we attempt to delete that task.

Many thanks to PiBa-NL for reporting this and analysing the problem.

This should be backported to 1.8.
2018-05-04 20:11:04 +02:00
Dragan Dosen
336a11f755 BUG/MINOR: map: correctly track reference to the last ref_elt being dumped
The bug was introduced in the commit 8d85aa4 ("BUG/MAJOR: map: fix
segfault during 'show map/acl' on cli").

This patch should be backported to 1.8, 1.7 and 1.6.
2018-05-04 17:14:39 +02:00
Patrick Hemmer
32d539fa88 MINOR: lua: add get_maxconn and set_maxconn to LUA Server class. 2018-05-03 18:53:42 +02:00
Patrick Hemmer
a62ae7ed9a MINOR: lua: Add server name & puid to LUA Server class. 2018-05-03 18:44:44 +02:00
Willy Tarreau
760e81d356 MINOR: backend: implement random-based load balancing
For large farms where servers are regularly added or removed, picking
a random server from the pool can ensure faster load transitions than
when using round-robin and less traffic surges on the newly added
servers than when using leastconn.

This commit introduces "balance random". It internally uses a random as
the key to the consistent hashing mechanism, thus all features available
in consistent hashing such as weights and bounded load via hash-balance-
factor are usable. It is extremely convenient because one common concern
when using random is what happens when a server is hammered a bit too
much. Here that can trivially be avoided, like in the configuration below :

    backend bk0
        balance random
        hash-balance-factor 110
        server-template s 1-100 127.0.0.1:8000 check inter 1s

Note that while "balance random" internally relies on a hash algorithm,
it holds the same properties as round-robin and as such is compatible with
reusing an existing server connection with "option prefer-last-server".
2018-05-03 07:20:40 +02:00
PiBa-NL
fe971b35ae BUG/MINOR, BUG/MINOR: lua: Put tasks to sleep when waiting for data
If a lua socket is waiting for data it currently spins at 100% cpu usage.
This because the TICK_ETERNITY returned by the socket is ignored when
setting the 'expire' time of the task.

Fixed by removing the check for yields that return TICK_ETERNITY.

This should be backported to at least 1.8.
2018-05-03 05:00:25 +02:00
Christopher Faulet
148b16e1ce BUG/MEDIUM: threads: Fix the sync point for more than 32 threads
In the sync point, to know if a thread has requested a synchronization, we call
the function thread_need_sync(). It should return 1 if yes, otherwise it should
return 0. It is intended to return a signed integer.

But internally, instead of returning 0 or 1, it returns 0 or tid_bit
(threads_want_sync & tid_bit). So, tid_bit is casted in integer. For the first
32 threads, it's ok, because we always check if thread_need_sync() returns
something else than 0. But this is a problem if HAProxy is started with more
than 32 threads, because for threads 33 to 64 (so for tid 32 to 63), their
tid_bit casted to integer are evaluated to 0. So the sync point does not work for
more than 32 threads.

Now, the function thread_need_sync() respects its contract, returning 0 or
1. the function thread_no_sync() has also been updated to avoid any ambiguities.

This patch must be backported in HAProxy 1.8.
2018-05-02 17:58:36 +02:00
Christopher Faulet
b119a79fc3 BUG/MINOR: checks: Fix check->health computation for flapping servers
This patch fixes an old bug introduced in the commit 7b1d47ce ("MAJOR: checks:
move health checks changes to set_server_check_status()"). When a DOWN server is
flapping, everytime a check succeds, check->health is incremented. But when a
check fails, it is decremented only when it is higher than the rise value. So if
only one check succeds for a DOWN server, check->health will remain set to 1 for
all subsequent failing checks.

So, at first glance, it seems not that terrible because the server remains
DOWN. But it is reported in the transitional state "DOWN server, going up". And
it will remain in this state until it is UP again. And there is also an
insidious side effect. If a DOWN server is flapping time to time, It will end to
be considered UP after a uniq successful check, , regardless the rise threshold,
because check->health will be increased slowly and never decreased.

To fix the bug, we just need to reset check->health to 0 when a check fails for
a DOWN server. To do so, we just need to relax the condition to handle a failure
in the function set_server_check_status.

This patch must be backported to haproxy 1.5 and newer.
2018-05-02 14:57:58 +02:00
Patrick Hemmer
e027547f8d MINOR: ssl: add fetch 'ssl_fc_session_key' and 'ssl_bc_session_key'
These fetches return the SSL master key of the front/back connection.
This is useful to decrypt traffic encrypted with ephemeral ciphers.
2018-04-30 14:56:19 +02:00
Patrick Hemmer
419667746b MINOR: ssl: disable SSL sample fetches when unsupported
Previously these fetches would return empty results when HAProxy was
compiled
without the requisite SSL support. This results in confusion and problem
reports from people who unexpectedly encounter the behavior.
2018-04-30 14:56:19 +02:00
Willy Tarreau
46deab6e64 BUG/MINOR: config: disable http-reuse on TCP proxies
Louis Chanouha reported an inappropriate warning when http-reuse is
present in a defaults section while a TCP proxy accidently inherits
it and finds a conflict with other options like the use of the PROXY
protocol. To fix this patch removes the http-reuse option for TCP
proxies.

This fix needs to be backported to 1.8, 1.7 and possibly 1.6.
2018-04-28 07:18:15 +02:00
Tim Duesterhus
e2b10bf491 MINOR: http: Add support for 421 Misdirected Request
This makes haproxy aware of HTTP 421 Misdirected Request, which
is defined in RFC 7540, section 9.1.2.
2018-04-28 07:03:39 +02:00
Tim Duesterhus
ca097c16a8 MINOR: sample: Add strcmp sample converter
This converter supplements the existing string matching by allowing
strings to be converted to a variable.

Example usage:

  http-request set-var(txn.host) hdr(host)
  # Check whether the client is attempting domain fronting.
  acl ssl_sni_http_host_match ssl_fc_sni,strcmp(txn.host) eq 0
2018-04-28 07:03:39 +02:00
Christopher Faulet
5bc9972ed8 BUG/MINOR: lua/threads: Make lua's tasks sticky to the current thread
PiBa-NL reported a bug with tasks registered in lua when HAProxy is started with
serveral threads. These tasks have not specific affinity with threads so they
can be woken up on any threads. So, it is impossbile for these tasks to handled
cosockets or applets, because cosockets and applets are sticky on the thread
which created them. It is forbbiden to manipulate a cosocket from another
thread.

So to fix the bug, tasks registered in lua are now sticky to the current
thread. Because these tasks can be registered before threads creation, the
affinity is set the first time a lua's task is processed.

This patch must be backported in HAProxy 1.8.
2018-04-26 22:58:16 +02:00
Aurélien Nephtali
1e0867cfbc MINOR: ssl: Add payload support to "set ssl ocsp-response"
It is now possible to use a payload with the "set ssl ocsp-response"
command.  These syntaxes will work the same way:

 # echo "set ssl ocsp-response $(base64 -w 10000 ocsp.der)" | \
     socat /tmp/sock1 -

 # echo -e "set ssl ocsp-response <<\n$(base64 ocsp.der)\n" | \
     socat /tmp/sock1 -

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:20:09 +02:00
Aurélien Nephtali
25650ce513 MINOR: map: Add payload support to "add map"
It is now possible to use a payload with the "add map" command.
These syntaxes will work the same way:

 # echo "add map #-1 key value" | socat /tmp/sock1 -

 # echo -e "add map #-1 <<\n$(cat data)\n" | socat /tmp/sock1 -

with

 # cat data
 key1 value1 with spaces
 key2 value2
 key3 value3 also with spaces

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:20:01 +02:00
Aurélien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Christopher Faulet
799f51801a BUG/MINOR: spoe: Fix parsing of dontlog-normal option
A missing goto led to a parsing error when line "option dontlog-normal" was
parsed.
2018-04-26 11:50:30 +02:00
Christopher Faulet
ebe1399efe BUG/MINOR: spoe: Fix counters update when processing is interrupted
When the processing is interrupted, because of a typo, <nb_sending> was
incremented instead of decremented.
2018-04-26 11:50:18 +02:00
Willy Tarreau
eba10f24b7 BUG/MEDIUM: h2: implement missing support for chunked encoded uploads
Upload requests not carrying a content-length nor tunnelling data must
be sent chunked-encoded over HTTP/1. The code was planned but for some
reason forgotten during the implementation, leading to such payloads to
be sent as tunnelled data.

Browsers always emit a content length in uploads so this problem doesn't
happen for most sites. However some applications may send data frames
after a request without indicating it earlier.

The only way to detect that a client will need to send data is that the
HEADERS frame doesn't hold the ES bit. In this case it's wise to look
for the content-length header. If it's not there, either we're in tunnel
(CONNECT method) or chunked-encoding (other methods).

This patch implements this.

The following request is sent using content-length :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPOST -T /large/file

and these ones using chunked-encoding :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T /large/file
    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T - < /dev/urandom

Thanks to Robert Samuel Newson for raising this issue with details.
This fix must be backported to 1.8.
2018-04-26 10:20:44 +02:00
Willy Tarreau
174b06a572 MINOR: h2: detect presence of CONNECT and/or content-length
We'll need this in order to support uploading chunks. The h2 to h1
converter checks for the presence of the content-length header field
as well as the CONNECT method and returns these information to the
caller. The caller indicates whether or not a body is detected for
the message (presence of END_STREAM or not). No transfer-encoding
header is emitted yet.
2018-04-26 10:15:14 +02:00
Tim Duesterhus
cd235c6042 BUG/MEDIUM: lua: Fix segmentation fault if a Lua task exits
PiBa-NL reported that haproxy crashes with a segmentation fault
if a function registered using `core.register_task` returns.

An example Lua script that reproduces the bug is:

  mytask = function()
  	core.Info("Stopping task")
  end
  core.register_task(mytask)

The Valgrind output is as follows:

  ==6759== Process terminating with default action of signal 11 (SIGSEGV)
  ==6759==  Access not within mapped region at address 0x20
  ==6759==    at 0x5B60AA9: lua_sethook (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==6759==    by 0x430264: hlua_ctx_resume (hlua.c:1009)
  ==6759==    by 0x43BB68: hlua_process_task (hlua.c:5525)
  ==6759==    by 0x4FED0A: process_runnable_tasks (task.c:231)
  ==6759==    by 0x4B2256: run_poll_loop (haproxy.c:2397)
  ==6759==    by 0x4B2256: run_thread_poll_loop (haproxy.c:2459)
  ==6759==    by 0x41A7E4: main (haproxy.c:3049)

Add the missing `task = NULL` for the `HLUA_E_OK` case. The error cases
have been fixed as of 253e53e661 which
first was included in haproxy v1.8-dev3. This bugfix should be backported
to haproxy 1.8.
2018-04-25 11:30:56 +02:00
Rian McGuire
89fcb7d929 BUG/MINOR: log: t_idle (%Ti) is not set for some requests
If TCP content inspection is used, msg_state can be >= HTTP_MSG_ERROR
the first time http_wait_for_request is called. t_idle was being left
unset in that case.

In the example below :
     stick-table type string len 64 size 100k expire 60s
     tcp-request inspect-delay 1s
     tcp-request content track-sc1 hdr(X-Session)

%Ti will always be -1, because the msg_state is already at HTTP_MSG_BODY
when http_wait_for_request is called for the first time.

This patch should backported to 1.8 and 1.7.
2018-04-25 08:59:23 +02:00
Tim Duesterhus
45be38c9c7 BUG/MAJOR: channel: Fix crash when trying to read from a closed socket
When haproxy is compiled using GCC <= 3.x or >= 5.x the `unlikely`
macro performs a comparison with zero: `(x) != 0`, thus returning
either 0 or 1.

In `int co_getline_nc()` this macro was accidentally applied to
the variable `retcode` itself, instead of the result of the
comparison `retcode <= 0`. As a result any negative `retcode`
is converted to `1` for purposes of the comparison.
Thus never taking the branch (and exiting the function) for
negative values.

This in turn leads to reads of uninitialized memory in the for-loop
below:

  ==12141== Conditional jump or move depends on uninitialised value(s)
  ==12141==    at 0x4EB6B4: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==
  ==12141== Use of uninitialised value of size 8
  ==12141==    at 0x4EB6B9: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==
  ==12141== Invalid read of size 1
  ==12141==    at 0x4EB6B9: co_getline_nc (channel.c:346)
  ==12141==    by 0x421CA4: hlua_socket_receive_yield (hlua.c:1713)
  ==12141==    by 0x421F6F: hlua_socket_receive (hlua.c:1896)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B497: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529711A: lua_pcallk (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52ABDF0: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B08F: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x52A7EFC: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529A9F1: ??? (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==    by 0x529B523: lua_resume (in /usr/lib/x86_64-linux-gnu/liblua5.3.so.0.0.0)
  ==12141==  Address 0x8637171e928bb500 is not stack'd, malloc'd or (recently) free'd

Fix this bug by correctly applying the `unlikely` macro to the result of the comparison.

This bug exists as of commit ca16b03813
which is the first commit adding this function.

v1.6-dev1 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
2018-04-25 05:39:49 +02:00
Aurélien Nephtali
564d15a71e BUG/MINOR: pattern: Add a missing HA_SPIN_INIT() in pat_ref_newid()
pat_ref_newid() is lacking a spinlock init. It was probably forgotten
in b5997f740b ("MAJOR: threads/map: Make acls/maps thread safe").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-19 17:49:48 +02:00
Willy Tarreau
3f0e1ec701 BUG/CRITICAL: h2: fix incorrect frame length check
The incoming H2 frame length was checked against the max_frame_size
setting instead of being checked against the bufsize. The max_frame_size
only applies to outgoing traffic and not to incoming one, so if a large
enough frame size is advertised in the SETTINGS frame, a wrapped frame
will be defragmented into a temporary allocated buffer where the second
fragment my overflow the heap by up to 16 kB.

It is very unlikely that this can be exploited for code execution given
that buffers are very short lived and their address not realistically
predictable in production, but the likeliness of an immediate crash is
absolutely certain.

This fix must be backported to 1.8.

Many thanks to Jordan Zebor from F5 Networks for reporting this issue
in a responsible way.
2018-04-19 10:35:30 +02:00
Willy Tarreau
9eb2a4addf BUILD: sample: avoid build warning in sample.c
Recent commit 9631a28 ("MEDIUM: sample: Extend functionality for field/word
converters") introduced this minor build warning that this patch addresses :

 src/sample.c: In function 'sample_conv_word':
 src/sample.c:2108:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
 src/sample.c:2137:8: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]

No backport is needed.
2018-04-19 10:33:28 +02:00
Olivier Houchard
ebaba75429 BUG/MEDIUM: kqueue: When adding new events, provide an output to get errors.
When adding new events using kevent(), if there's an error, because we're
trying to delete an event that wasn't there, or because the fd has already
been closed, kevent() will either add an event in the eventlist array if
there's enough room for it, and keep on handling other events, or stop and
return -1.
We want it to process all the events, so give it a large-enough array to
store any error.

Special thanks to PiBa-NL for diagnosing the root cause of this bug.

This should be backported to 1.8.
2018-04-17 17:46:56 +02:00
William Lallemand
daf4cd209a MINOR: export localpeer as an environment variable
Export localpeer as the environment variable $HAPROXY_LOCALPEER,
allowing to use this variable in the configuration file.

It's useful to use this variable in the case of synchronized
configuration between peers.
2018-04-17 17:17:58 +02:00
Marcin Deranek
9631a28275 MEDIUM: sample: Extend functionality for field/word converters
Extend functionality of field/word converters, so it's possible
to extract field(s)/word(s) counting from the beginning/end and/or
extract multiple fields/words (including separators) eg.

str(f1_f2_f3__f5),field(2,_,2)  # f2_f3
str(f1_f2_f3__f5),field(2,_,0)  # f2_f3__f5
str(f1_f2_f3__f5),field(-2,_,3) # f2_f3_
str(f1_f2_f3__f5),field(-3,_,0) # f1_f2_f3

str(w1_w2_w3___w4),word(3,_,2)  # w3___w4
str(w1_w2_w3___w4),word(2,_,0)  # w2_w3___w4
str(w1_w2_w3___w4),word(-2,_,3) # w1_w2_w3
str(w1_w2_w3___w4),word(-3,_,0) # w1_w2

Change is backward compatible.
2018-04-17 11:27:48 +02:00
Aurélien Nephtali
9a4da683a6 MINOR: cli: Ensure the CLI always outputs an error when it should
When using the CLI_ST_PRINT_FREE state, always output something back
if the faulty function did not fill the 'err' variable.
The map/acl code could lead to a crash whereas the SSL code was silently
failing.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-16 19:23:16 +02:00
Aurélien Nephtali
c511b7cc97 BUG/MINOR: cli: Guard against NULL messages when using CLI_ST_PRINT_FREE
Some error paths (especially those followed when running out of memory)
can set the error message to NULL. In order to avoid a crash, use a
generic message ("Out of memory") when this case arises.

It should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-16 19:22:42 +02:00
Ben Draut
054fbee67a MINOR: config: Warn if resolvers has no nameservers
Today, a `resolvers` section may be configured without any `nameserver`
directives, which is useless. This implements a warning when such
sections are detected.

[List thread][1].

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html
2018-04-16 15:58:23 +02:00
Marcin Deranek
9a66dfbd6c MINOR: proxy: Add fe_defbe fetcher
Patch adds ability to fetch frontend's default backend name in your
logic, so it can be used later to derive other backend names to make routing
decisions.
2018-04-16 15:51:57 +02:00
Christopher Faulet
11ebb2080e BUG/MINOR: http: Return an error in proxy mode when url2sa fails
In proxy mode, the result of url2sa is never checked. So when the function fails
to resolve the destination server from the URL, we continue. Depending on the
internal state of the connection, we get different behaviours. With a newly
allocated connection, the field <addr.to> is not set. So we will get a HTTP
error. The status code is 503 instead of 400, but it's not really critical. But,
if it's a recycled connection, we will reuse the previous value of <addr.to>,
opening a connection on an unexpected server.

To fix the bug, we return an error when url2sa fails.

This patch should be backported in all version from 1.5.
2018-04-16 15:31:18 +02:00
Thierry Fournier
f7b7c3e2f2 MINOR: servers: Support alphanumeric characters for the server templates names
'server-template' directive doesn't support the same name alphabet as
the 'server' directive. This patch allows the usage of chars [0-9].

[wt: let's backport this to 1.8 to apply the principle of least surprize
 to people migrating to server templates]
2018-04-06 19:16:18 +02:00
Willy Tarreau
1093a4586c BUG/MAJOR: cache: always initialize newly created objects
Recent commit 5bd37fa ("BUG/MAJOR: cache: fix random crashes caused by
incorrect delete() on non-first blocks") addressed an issue where
dangling objects could be deleted in the cache, but even after this fix
some similar segfaults were reported at the same place (cache_free_blocks()).
The tree was always corrupted as well. Placing some traces revealed
that this time it's caused by a missing initialization in
http_action_store_cache() : while object->eb.key is used to note that
the object is not in the tree, the first retrieved block may contain
random data and is not initialized. Further, this entry can be updated
later without the object being inserted into the tree. Thus, if at the
end the object is not stored and the blocks are put back to the avail
list, the next attempt to use them will find eb.key != 0 and will try
to delete the uninitialized block, will see that eb.node.leaf_p is not
NULL (random data), and will dereference it as well as a few other
uninitialized pointers. It was harder to trigger than the previous one,
despite being very closely related. This time the following config
was used :

  listen l1
    mode http
    bind :8888
    http-request cache-use c1
    http-response cache-store c1
    server s1 127.0.0.1:8000

  cache c1
    total-max-size 4
    max-age 10

Httpterm was running on port 8000.

And it was stressed this way :

  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  ... wait 5 seconds then Ctrl-C ...
  # wait 3 seconds doing nothing
  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  => segfault

Other values don't work well. The size and the small pieces in the
responses (p=1) are critical to make it work.

Here the fix consists in pre-zeroing object->eb.key AND object->eb.leaf_p
just after the object is allocated so as to stay consistent with other
locations. Ideally this could be simplified later by only relying on
eb->node.leaf_p everywhere since in the end the key alone is not a
reliable indicator, so that we use only one indicator of being part of
the tree or not.

This fix needs to be backported to 1.8.
2018-04-06 19:02:25 +02:00
Christopher Faulet
caf2feca62 MINOR: spoe: Add counters to log info about SPOE agents
In addition to metrics about time spent in the SPOE, following counters have
been added:

  * applets : number of SPOE applets.
  * idles : number of idle applets.
  * nb_sending : number of streams waiting to send data.
  * nb_waiting : number of streams waiting for a ack.
  * nb_processed : number of events/groups processed by the SPOE (from the
                   stream point of view).
  * nb_errors : number of errors during the processing (from the stream point of
                view).

Log messages has been updated to report these counters. Following pattern has
been added at the end of the log message:

    ... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>
2018-04-05 15:13:54 +02:00
Christopher Faulet
3b8e34902b MINOR: spoe: use agent's logger to log SPOE messages
Instead of using the logger of the stream, we now use dedicated logger of the
SPOE. This means a logger should be defined.
2018-04-05 15:13:54 +02:00
Christopher Faulet
0e0f085a73 MINOR: spoe: Add support for option dontlog-normal in the SPOE agent section
It does the same than for proxies.
2018-04-05 15:13:54 +02:00
Christopher Faulet
7250b8fb5c MINOR: spoe: Add loggers dedicated to the SPOE agent
Now it is possible to configure a logger in a spoe-agent section using a "log"
line, as for a proxy. "no log", "log global" and "log <address> ..." syntaxes
are supported.
2018-04-05 15:13:54 +02:00
Christopher Faulet
28ac099907 MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries
With "log global" line, the global list of loggers are copied into the proxy's
struct. The list coming from the default section is also copied when a frontend
or a backend section is parsed. So it is possible to have duplicate entries in
the proxy's list. For instance, with this following config, all messages will be
logged twice:

    global
        log 127.0.0.1 local0 debug
        daemon

    defaults
        mode   http
        log    global
        option httplog

    frontend front-http
        log global
        bind *:8888
        default_backend back-http

    backend back-http
        server www 127.0.0.1:8000
2018-04-05 15:13:54 +02:00
Christopher Faulet
4b0b79dd56 MINOR: log: move 'log' keyword parsing in dedicated function
Now, the function parse_logsrv should be used to parse a "log" line. This
function will update the list of loggers passed in argument. It can release all
log servers when "no log" line was parsed (by the caller) or it can parse "log
global" or "log <address> ... " lines. It takes care of checking the caller
context (global or not) to prohibit "log global" usage in the global section.
2018-04-05 15:13:54 +02:00
Christopher Faulet
36bda1cd4a MINOR: spoe: Add options to store processing times in variables
"set-process-time" and "set-total-time" options have been added to store
processing times in the transaction scope, at each event and group processing,
the current one and the total one. So it is possible to get them.

TODO: documentation
2018-04-05 15:13:54 +02:00
Christopher Faulet
b2dd1e034c MINOR: spoe: Add metrics in to know time spent in the SPOE
Following metrics are added for each event or group of messages processed in the
SPOE:

  * processing time: the delay to process the event or the group. From the
                     stream point of view, it is the latency added by the SPOE
                     processing.
  * request time : It is the encoding time. It includes ACLs processing, if
                   any. For fragmented frames, it is the sum of all fragments.
  * queue time : the delay before the request gets out the sending queue. For
                 fragmented frames, it is the sum of all fragments.
  * waiting time: the delay before the reponse is received. No fragmentation
                  supported here.
  * response time: the delay to process the response. No fragmentation supported
                   here.
  * total time: (unused for now). It is the sum of all events or groups
                processed by the SPOE for a specific threads.

Log messages has been updated. Before, only errors was logged (status_code !=
0). Now every processing is logged, following this format:

  SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUC-CODE reqT/qT/wT/resT/pT

where:

  AGENT              is the agent name
  TYPE               is EVENT of GROUP
  NAME               is the event or the group name
  STREAM-ID          is an integer, the unique id of the stream
  STATUS_CODE        is the processing's status code
  reqT/qT/wT/resT/pT are delays descrive above

For all these delays, -1 means the processing was interrupted before the end. So
-1 for the queue time means the request was never dequeued. For fragmented
frames it is harder to know when the interruption happened.

For now, messages are logged using the same logger than the backend of the
stream which initiated the request.
2018-04-05 15:13:53 +02:00
Christopher Faulet
879dca9a76 BUG/MINOR: spoe: Don't forget to decrement fpa when a processing is interrupted
In async or pipelining mode, we count the number of NOTIFY frames sent waiting
for their corresponding ACK frames. This is a way to evaluate the "load" of a
SPOE applet. For pipelining mode, it is easy to make the link between a NOTIFY
frame and its ACK one, because exchanges are done using the same TCP connection.

For async mode, it is harder because a ACK frame can be received on another
connection than the one sending the NOTIFY frame. So to decrement the fpa of the
right applet, we need to keep it in the SPOE context. Most of time, it works
expect when the processing is interrupted by the stream, because of a timeout.

This patch fixes this issue. If a SPOE applet is still link to a SPOE context
when the processing is interrupted by the stream, the applet's fpa is
decremented. This is only done for unfragmented frames.
2018-04-05 15:13:53 +02:00
Christopher Faulet
b7426d1562 BUG/MINOR: spoe: Register the variable to set when an error occurred
Variables referenced in HAProxy's configuration file are registered during the
configuration parsing (during parsing of "var", "set-var" or "unset-var"
keywords). For the SPOE, you can use "register-var-names" directive to
explicitly register variable names. All unknown variables will be rejected
(unless you set "force-set-var" option). But, the variable set when an error
occurred (when "set-on-error" option is defined) should also be regiestered by
default. This is done with this patch.
2018-04-05 15:13:53 +02:00
Christopher Faulet
ac580608d7 BUG/MINOR: spoe: Don't release the context buffer in .check_timeouts callbaclk
It is better to let spoe_stop_processing release this buffer because, in
.check_timeouts callback, we lack information to know if it should be release or
not. For instance, if the processing timeout is reached while the SPOE applet
receives the reply, it is preferable to ignore the timeout and process the
result.

This patch should be backported in 1.8.
2018-04-05 15:13:53 +02:00
Christopher Faulet
84c844eb12 BUG/MINOR: spoe: Initialize variables used during conf parsing before any check
Some initializations must be done at the beginning of parse_spoe_flt to avoid
segmentaion fault when first errors are catched, when the "filter spoe" line is
parsed.

This patch must be backported in 1.8.
[cf: the variable "curvars" doesn't exist in 1.8. So the patch must be adapted.]
2018-04-05 15:13:53 +02:00
Willy Tarreau
5bd37fa625 BUG/MAJOR: cache: fix random crashes caused by incorrect delete() on non-first blocks
Several segfaults were reported in the cache, each time in eb_delete()
called from cache_free_blocks() itself called from shctx_row_reserve_hot().
Each time the tree node was corrupted with random cached data (often JS or
HTML contents).

The problem comes from an incompatibility between the cache's expectations
and the recycling algorithm used in the shctx. The shctx allocates and
releases a chain of blocks at once. And when it needs to allocate N blocks
from the avail list while a chain of M>N is found, it picks the first N
from the list, moves them to the hot list, and marks all remaining M-N
blocks as isolated blocks (chains of 1).

For each such released block, the shctx->free_block() callback is used
and passed a pointer to the first and current block of the chain. For
the cache, it's cache_free_blocks(). What this function does is check
that the current block is the first one, and in this case delete the
object from the tree and mark it as not in tree by setting key to zero.

The problem this causes is that the tail blocks when M>N become first
blocks for the next call to shctx_row_reserve_hot(), these ones will
be passed to cache_free_blocks() as list heads, and will be sent to
eb_delete() despite containing only cached data.

The simplest solution for now is to mark each block as holding no cache
object by setting key to zero all the time. It keeps the principle used
elsewhere in the code. The SSL code is not subject to this problem
because it relies on the block's len not being null, which happens
immediately after a block was released. It was uncertain however whether
this method is suitable for the cache. It is not critical though since
this code is going to change soon in 1.9 to dynamically allocate only
the number of required blocks.

This fix must be backported to 1.8. Thanks to Thierry for providing
exploitable cores.
2018-04-04 20:17:03 +02:00
Willy Tarreau
afe1de5d98 BUG/MINOR: cache: fix "show cache" output
The "show cache" command used to dump the header for each entry into into
the handler loop, making it repeated every ~16kB of output data. Additionally
chunk_appendf() was used instead of chunk_printf(), causing the output to
repeat already emitted lines, and the output size to grow in O(n^2). It used
to take several minutes to report tens of millions of objects from a small
cache containing only a few thousands. There was no more impact though.

This fix must be backported to 1.8.
2018-04-04 11:56:43 +02:00
Christopher Faulet
b797ae1f15 BUG/MINOR: email-alert: Set the mailer port during alert initialization
Since the commit 2f3a56b4f ("BUG/MINOR: tcp-check: use the server's service port
as a fallback"), email alerts stopped working because the mailer's port was
overriden by the server's port. Remember, email alerts are defined as checks
with specific tcp-check rules and triggered on demand to send alerts. So to send
an email, a check is executed. Because no specific port's was defined, the
server's one was used.

To fix the bug, the ports used for checks attached an email alert are explicitly
set using the mailer's port. So this port will be used instead of the server's
one.

In this patch, the assignement to a default port (587) when an email alert is
defined has been removed. Indeed, when a mailer is defined, the port must be
defined. So the default port was never used.

This patch must be backported in 1.8.
2018-04-04 10:36:50 +02:00
Olivier Houchard
8ef1a6b0d8 BUG/MINOR: fd: Don't clear the update_mask in fd_insert.
Clearing the update_mask bit in fd_insert may lead to duplicate insertion
of fd in fd_updt, that could lead to a write past the end of the array.
Instead, make sure the update_mask bit is cleared by the pollers no matter
what.

This should be backported to 1.8.
[wt: warning: 1.8 doesn't have the lockless fdcache changes and will
 require some careful changes in the pollers]
2018-04-03 19:38:15 +02:00
Willy Tarreau
2500fc2c34 BUG/MINOR: checks: check the conn_stream's readiness and not the connection
Since commit 9aaf778 ("MAJOR: connection : Split struct connection into
struct connection and struct conn_stream."), the checks use a conn_stream
and not directly the connection anymore. However wake_srv_chk() still used
to verify the connection's readiness instead of the conn_stream's. Due to
the existence of a mux, the connection is always waiting for receiving
something, and doesn't reflect the changes made in event_srv_chk_{r,w}(),
causing the connection appear as not ready yet, and the check to be
validated only after its timeout. The difference is only visible when
sending pure TCP checks, and simply adding a "tcp-check connect" line
is enough to work around it.

This fix must be backported to 1.8.
2018-04-03 19:31:38 +02:00
Willy Tarreau
b2e290acb6 BUG/MEDIUM: h2: always add a stream to the send or fctl list when blocked
When a stream blocks on a mux buffer full/unallocated or on connection
flow control, a flag among H2_SF_MUX_M* is set, but the stream is not
always added to the connection's list. It's properly done when the
operations are performed from the connection handler but not always when
done from the stream handler. For instance, a simple shutr or shutw may
fail by lack of room. If it's immediately followed by a call to h2_detach(),
the stream remains lying around in no list at all, and prevents the
connection from ending. This problem is actually quite difficult to
trigger and seems to require some large objects and low server-side
timeouts.

This patch covers all identified paths. Some are redundant but since the
code will change and will be simplified in 1.9, it's better to stay on
the safe side here for now. It must be backported to 1.8.
2018-03-30 17:43:49 +02:00
Willy Tarreau
1a1dd6066f BUG/MINOR: h2: remove accidental debug code introduced with show_fd function
Commit e3f36cd ("MINOR: h2: implement a basic "show_fd" function")
accidently brought one surrounding debugging part that was in the same
context. No backport needed.
2018-03-30 17:41:19 +02:00
Willy Tarreau
c754b343a2 MINOR: cli: report cache indexes in "show fd"
Instead of just indicating "cache={0,1}" we now report cache.next and
cache.prev since they are the ones used with the lockless fd cache.
2018-03-30 15:00:15 +02:00
Willy Tarreau
e3f36cd479 MINOR: h2: implement a basic "show_fd" function
The purpose here is to dump some information regarding an H2 connection,
and a few statistics about its streams. The output looks like this :

     35 : st=0x55(R:PrA W:PrA) ev=0x00(heopi) [lc] cache=0 owner=0x7ff49ee15e80 iocb=0x588a61(conn_fd_handler) tmask=0x1 umask=0x0 cflg=0x00201366 fe=decrypt mux=H2 mux_ctx=0x7ff49ee16f30 st0=2 flg=0x00000002 fctl_cnt=0 send_cnt=33 tree_cnt=33 orph_cnt=0

- st0 is the connection's state (FRAME_H here)
- flg is the connection's flags (MUX_MFULL here)
- fctl_cnt is the number of streams in the fctl_list
- send_cnt is the number of streams in the send_list
- tree_cnt is the number of streams in the streams_by_id tree
- orph_cnt is the number of orphaned streams (cs==0) in the tree
2018-03-30 14:43:13 +02:00
Willy Tarreau
b011d8f4c4 MINOR: mux: add a "show_fd" function to dump debugging information for "show fd"
This function will be called from the CLI's "show fd" command to append some
extra mux-specific information that only the mux handler can decode. This is
supposed to help collect various hints about what is happening when facing
certain anomalies.
2018-03-30 14:41:19 +02:00
Willy Tarreau
e96e61cadc BUILD/MINOR: threads: always export thread_sync_io_handler()
Otherwise it doesn't build again without threads.
2018-03-29 18:54:33 +02:00
Willy Tarreau
3041fcc2fd BUG/MEDIUM: h2: don't consider pending data on detach if connection is in error
Interrupting an h2load test shows that some connections remain active till
the client timeout. This is due to the fact that h2_detach() immediately
returns if the h2s flags indicate that the h2s is still waiting for some
buffer room in the output mux (possibly to emit a response or to send some
window updates). If the connection is broken, these data will never leave
and must not prevent the stream from being terminated nor the connection
from being released.

This fix must be backported to 1.8.
2018-03-29 15:41:32 +02:00
Willy Tarreau
0975f11d55 BUG/MEDIUM: h2/threads: never release the task outside of the task handler
Currently, h2_release() will release all resources assigned to the h2
connection, including the timeout task if any. But since the multi-threaded
scheduler, the timeout task could very well be queued in the thread-local
list of running tasks without any way to remove it, so task_delete() will
have no effect and task_free() will cause this undefined object to be
dereferenced.

In order to prevent this from happening, we never release the task in
h2_release(), instead we wake it up after marking its context NULL so that
the task handler can release the task.

Future improvements could consist in modifying the scheduler so that a
task_wakeup() has to be done on any task having to be killed, letting
the scheduler take care of it.

This fix must be backported to 1.8. This bug was apparently not reported
so far.
2018-03-29 15:22:59 +02:00
Willy Tarreau
71049cce3f MINOR: h2: fuse h2s_detach() and h2s_free() into h2s_destroy()
Since these two functions are always used together, let's simplify
the code by having a single one for both operations. It also ensures
we don't leave wandering elements that risk to leak later.
2018-03-29 13:22:15 +02:00
Willy Tarreau
e323f3458c MINOR: h2: always call h2s_detach() in h2_detach()
The code is safer and more robust this way, it avoids multiple paths.
This is possible due to the idempotence of LIST_DEL() and eb32_delete()
that are called in h2s_detach().
2018-03-29 13:22:15 +02:00
Willy Tarreau
4a333d3d53 BUG/MAJOR: h2: remove orphaned streams from the send list before closing
Several people reported very strange occasional crashes when using H2.
Every time it appeared that either an h2s or a task was corrupted. The
outcome is that a missing LIST_DEL() when removing an orphaned stream
from the list in h2_wake_some_streams() can cause this stream to
remain present in the send list after it was freed. This may happen
when receiving a GOAWAY frame for example. In the mean time the send
list may be processed due to pending streams, and the just released
stream is still found. If due to a buffer full condition we left the
h2_process_demux() loop before being able to process the pending
stream, the pool entry may be reassigned somewhere else. Either another
h2 connection will get it, or a task, since they are the same size and
are shared. Then upon next pass in h2_process_mux(), the stream is
processed again. Either it crashes here due to modifications, or the
contents are harmless to it and its last changes affect the other object
reasigned to this area (typically a struct task). In the case of a
collision with struct task, the LIST_DEL operation performed on h2s
corrupts the task's wait queue's leaf_p pointer, thus all the wait
queue's structure.

The fix consists in always performing the LIST_DEL in h2s_detach().
It will also make h2s_stream_new() more robust against a possible
future situation where stream_create_from_cs() could have sent data
before failing.

Many thanks to all the reporters who provided extremely valuable
information, traces and/or cores, namely Thierry Fournier, Yves Lafon,
Holger Amann, Peter Lindegaard Hansen, and discourse user "slawekc".

This fix must be backported to 1.8. It is probably better to also
backport the following code cleanups with it as well to limit the
divergence between master and 1.8-stable :

  00dd078 CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
  0a10de6 MINOR: h2: provide and use h2s_detach() and h2s_free()
2018-03-29 13:22:15 +02:00
Willy Tarreau
a833cd90b2 BUILD/MINOR: cli: fix a build warning introduced by last commit
Commit 35b1b48 ("MINOR: cli: make "show fd" report the mux and mux_ctx
pointers when available") introduced an accidental build warning due to
a missing const statement.
2018-03-29 13:19:37 +02:00
Willy Tarreau
35b1b48c75 MINOR: cli: make "show fd" report the mux and mux_ctx pointers when available
This is handy to quickly distinguish H2 connections as well as to easily
access the h2c context. It could be backported to 1.8 to help during
troubleshooting sessions.
2018-03-28 18:41:30 +02:00
Willy Tarreau
4037a3f904 MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown"
The output was confusing when the sync point's dummy handler was shown.

This patch should be backported to 1.8 to help with troubleshooting.
2018-03-28 18:06:47 +02:00
Willy Tarreau
a7394e1b72 BUG/MINOR: hpack: fix harmless use of uninitialized value in hpack_dht_insert
A warning is reported here by valgrind on first pass in hpack_dht_insert().
The cause is that the not-yet-initialized dht->head is checked in
hpack_dht_get_tail(), though the result is not used, making it have
no impact. At the very least it confuses valgrind, and maybe it makes
it harder for gcc to optimize the code path. Let's move the variable
initialization around to shut it up. Thanks to Olivier for reporting
this one.

This fix may be backported to 1.8 at least to make valgrind usage less
painful.
2018-03-27 20:05:13 +02:00
Mark Lakes
56cc12509c MINOR: lua: allow socket api settimeout to accept integers, float, and doubles
Instead of hlua_socket_settimeout() accepting only integers, allow user
to specify float and double as well. Convert to milliseconds much like
cli_parse_set_timeout but also sanity check the value.

http://w3.impa.br/~diego/software/luasocket/tcp.html#settimeout

T. Fournier edit:

The main goal is to keep compatibility with the LuaSocket API. This
API only accept seconds, so using a float to specify milliseconds is
an acceptable way.

Update doc.
2018-03-27 14:17:02 +02:00
Ilya Shipitsin
7741c854cd BUILD/MINOR: fix build when USE_THREAD is not defined
src/queue.o: In function `pendconn_redistribute':
/home/ilia/haproxy/src/queue.c:272: undefined reference to `thread_want_sync'
src/queue.o: In function `pendconn_grab_from_px':
/home/ilia/haproxy/src/queue.c:311: undefined reference to `thread_want_sync'
src/queue.o: In function `process_srv_queue':
/home/ilia/haproxy/src/queue.c:184: undefined reference to `thread_want_sync'
collect2: error: ld returned 1 exit status
make: *** [Makefile:900: haproxy] Error 1

To be backported to 1.8.
2018-03-26 17:17:59 +02:00
Mark Lakes
22154b437d CLEANUP: lua: typo fix in comments
Some typo fixes in comments.
2018-03-26 11:12:41 +02:00
Thierry Fournier
17a921b799 BUG/MINOR: lua funtion hlua_socket_settimeout don't check negative values
Negatives timeouts doesn't have sense. A negative timeout doesn't cause
a crash, but the connection expires before the system try to extablish it.

This patch should be backported in all versions from 1.6
2018-03-26 11:11:49 +02:00
Thierry Fournier
e9636f192a BUG/MINOR: lua: the function returns anything
The output of these function indicates that one element is pushed in
the stack, but no element is set in the stack. Actually, if anyone
read the value returned by this function, is gets "something"
present in the stack.

This patch is a complement of these one: 119a5f10e4

The LuaSocket documentation tell anything about the returned value,
but the effective code set an integer of value one.

   316a9455b9/src/timeout.c (L172)

Thanks to Tim for the bug report.

This patch should be backported in all version from 1.6
2018-03-26 11:11:23 +02:00
Ilya Shipitsin
f93f0935c9 CLEANUP: map, stream: remove duplicate code in src/map.c, src/stream.c
issue was identified by cppcheck

[src/map.c:372] -> [src/map.c:376]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:433] -> [src/map.c:437]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:555] -> [src/map.c:559]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/stream.c:3264] -> [src/stream.c:3268]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?

Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
2018-03-23 18:00:09 +01:00
Christopher Faulet
fe234281d6 BUG/MINOR: listener: Don't decrease actconn twice when a new session is rejected
When a freshly created session is rejected, for any reason, during the accept in
the function "session_accept_fd", the variable "actconn" is decreased twice. The
first time when the rejected session is released, then in the function
"listener_accpect", because of the failure. So it is possible to have an
negative value for actconn. Note that, in this case, we will also have a negatve
value for the current number of connections on the listener rejecting the
session (actconn and l->nbconn are in/decreased in same time).

It is easy to reproduce the bug with this small configuration:

  global
      stats socket /tmp/haproxy

  listen test
      bind *:12345
      tcp-request connection reject if TRUE

A "show info" on the stat socket, after a connection attempt, will show a very
high value (the unsigned representation of -1).

To fix the bug, if the function "session_accept_fd" returns an error, it
decrements the right counters and "listener_accpect" leaves them untouched.

This patch must be backported in 1.8.
2018-03-23 16:21:50 +01:00
Willy Tarreau
8adae7c15f BUG/MINOR: h2: ensure we can never send an RST_STREAM in response to an RST_STREAM
There are some corner cases where this could happen by accident. Since
the spec explicitly forbids this (RFC7540#5.4.2), let's add a test in
the two only functions which make the RST to avoid this. Thanks to user
klzgrad for reporting this problem. Usually it is expected to be harmless
but may result in browsers issuing a warning.

This fix must be backported to 1.8.
2018-03-22 17:37:05 +01:00
Willy Tarreau
d1023bbab3 BUG/MEDIUM: h2: properly account for DATA padding in flow control
Recent fixes made to process partial frames broke the flow control on
DATA frames, as the padding is not considered anymore, only the actual
data is. Let's simply take account of the padding once the transfer
ends. The probability to meet this bug is low because, when used, padding
is small and it can require a large number of padded transfers before the
window is completely depleted.

Thanks to user klzgrad for reporting this bug and confirming the fix.

This fix must be backported to 1.8.
2018-03-22 16:53:12 +01:00
Emmanuel Hocdet
50791a7df3 MINOR: samples: add crc32c converter
This patch adds the support of CRC32c (rfc4960).
2018-03-21 16:17:00 +01:00
Emmanuel Hocdet
115df3e38e MINOR: accept-proxy: support proxy protocol v2 CRC32c checksum
When proxy protocol v2 CRC32c tlv is received, check it before accept
connection (as describe in "doc/proxy-protocol.txt").
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
4399c75f6c MINOR: proxy-v2-options: add crc32c
This patch add option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2.
It compute the checksum of proxy protocol v2 header as describe in
"doc/proxy-protocol.txt".
2018-03-21 05:04:01 +01:00
Emmanuel Hocdet
6afd898988 MINOR: hash: add new function hash_crc32c
This function will be used to perform CRC32c computations. This is
required to compute proxy protocol v2 CRC32C tlv (PP2_TYPE_CRC32C).
2018-03-21 05:04:01 +01:00
Willy Tarreau
c98aebcdb8 MINOR: log: stop emitting alerts when it's not possible to write on the socket
This is a recurring pain when using certain unix domain sockets or when
sending to temporarily unroutable addresses, if the process remains in
the foreground, the console is full of error which it's impossible to
do anything about. It's even worse when the process is remote, or when
run from a serial console which will slow the whole process down. Let's
send them only once now to warn about a possible config issue, and not
pollute the system nor slow everything down.
2018-03-20 16:44:25 +01:00
Christopher Faulet
fd83f0bfa4 BUG/MEDIUM: threads/queue: wake up other threads upon dequeue
The previous patch about queues (5cd4bbd7a "BUG/MAJOR: threads/queue: Fix
thread-safety issues on the queues management") revealed a performance drop when
multithreading is enabled (nbthread > 1). This happens when pending connections
handled by other theads are dequeued. If these other threads are blocked in the
poller, we have to wait the poller's timeout (or any I/O event) to process the
dequeued connections.

To fix the problem, at least temporarly, we "wake up" the threads by requesting
a synchronization. This may seem a bit overkill to use the sync point to do a
wakeup on threads, but it fixes this performance issue. So we can now think
calmly on the good way to address this kind of issues.

This patch should be backported in 1.8 with the commit 5cd4bbd7a ("BUG/MAJOR:
threads/queue: Fix thread-safety issues on the queues management").
2018-03-19 22:16:58 +01:00
Baptiste Assmann
2f3a56b4ff BUG/MINOR: tcp-check: use the server's service port as a fallback
When running tcp-check scripts, one must ensure we can establish a tcp
connection first.
When doing this action, HAProxy needs a TCP port configured either on
the server or on the check itself or on the connect rule itself.
For some reasons, the connect code did not evaluate the service port on
the server structure...

this patch fixes this error.

Backport status: 1.8
2018-03-19 13:55:55 +01:00
Baptiste Assmann
248f1173f2 BUG/MEDIUM: tcp-check: single connect rule can't detect DOWN servers
When tcpcheck is used to do TCP port monitoring only and the script is
composed by a single "tcp-check connect" rule (whatever port and ssl
options enabled), then the server can't be seen as DOWN.
Simple configuration to reproduce:

  backend b
    [...]
    option tcp-check
    tcp-check connect
    server s1 127.0.0.1:22 check

The main reason for this issue is that the piece of code which validates
that we're not at the end of the chained list (of rules) prevents
executing the validation of the establishment of the TCP connection.
Since validation is not executed, the rule is terminated and the report
says no errors were encountered, hence the server is UP all the time.

The workaround is simple: move the connection validation outsied the
CONNECT rule processing loop, into the main function.
That way, if the connection status is not CONNECTED, then HAProxy will
now add more time to wait for it. If the time is expired, an error is
now well reported.

Backport status: 1.8
2018-03-19 13:53:59 +01:00
Thierry FOURNIER
2986c0db88 CLEANUP: lua/syntax: lua is a name and not an acronym
This patch fix some first letter upercase for Lua messages.
2018-03-19 12:59:26 +01:00
Thierry FOURNIER
fd1e955a56 BUG/MINOR: lua: return bad error messages
The returned type is the type of the top of stack value and
not the type of the checked argument.

[wt: this can be backported to 1.8, 1.7 and 1.6]
2018-03-19 12:59:19 +01:00
Bernard Spil
13c53f8cc2 BUILD: ssl: Fix build with OpenSSL without NPN capability
OpenSSL can be built without NEXTPROTONEG support by passing
-no-npn to the configure script. This sets the
OPENSSL_NO_NEXTPROTONEG flag in opensslconf.h

Since NEXTPROTONEG is now considered deprecated, it is superseeded
by ALPN (Application Layer Protocol Next), HAProxy should allow
building withough NPN support.
2018-03-19 12:43:15 +01:00
Aurélien Nephtali
6a61e968ac BUG/MINOR: cli: Fix a crash when sending a command with too many arguments
This bug was introduced in 48bcfdab2 ("MEDIUM: dumpstat: make the CLI
parser understand the backslash as an escape char").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:15:38 +01:00
Aurélien Nephtali
6e8a41d8fc BUG/MINOR: cli: Ensure all command outputs end with a LF
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.

This is required as mentioned in doc/management.txt:

"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:13:02 +01:00
Olivier Houchard
33e083c92e BUG/MINOR: seemless reload: Fix crash when an interface is specified.
When doing a seemless reload, while receiving the sockets from the old process
the new process will die if the socket has been bound to a specific
interface.
This happens because the code that tries to parse the informations bogusly
try to set xfer_sock->namespace, while it should be setting wfer_sock->iface.

This should be backported to 1.8.
2018-03-19 12:10:53 +01:00
Ilya Shipitsin
210eb259bf CLEANUP: dns: remove duplicate code in src/dns.c
issue was identified by cppcheck

[src/dns.c:2037] -> [src/dns.c:2041]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
2018-03-19 12:09:16 +01:00
Baptiste Assmann
1fa7d2acce BUG/MINOR: dns: don't downgrade DNS accepted payload size automatically
Automatic downgrade of DNS accepted payload size may have undesired side
effect, which could make a backend with all servers DOWN.

After talking with Lukas on the ML, I realized this "feature" introduces
more issues that it fixes problem.
The "best" way to handle properly big responses will be to implement DNS
over TCP.

To be backported to 1.8.
2018-03-19 11:41:52 +01:00
Christopher Faulet
5cd4bbd7ab BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.

So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.

However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.

This patch must be backported in 1.8.
2018-03-19 10:03:06 +01:00
Christopher Faulet
510c0d67ef BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.

Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.

This patch must be backported in 1.8.
2018-03-16 11:19:07 +01:00
Cyril Bonté
4288c5a9d8 BUG/MINOR: force-persist and ignore-persist only apply to backends
>From the very first day of force-persist and ignore-persist features,
they only applied to backends, except that the documentation stated it
could also be applied to frontends.

In order to make it clear, the documentation is updated and the parser
will raise a warning if the keywords are used in a frontend section.

This patch should be backported up to the 1.5 branch.
2018-03-12 22:52:24 +01:00
Cyril Bonté
d400ab3a36 BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc
Krishna Kumar reported a 100% cpu usage with a configuration using
cpu-map and a high number of threads,

Indeed, this minimal configuration to reproduce the issue :
  global
    nbthread 40
    cpu-map auto:1/1-40 0-39

  frontend test
    bind :8000

This is due to a wrong type in a shift operator (int vs unsigned long int),
causing an endless loop while applying the cpu affinity on threads. The same
issue may also occur with nbproc under FreeBSD. This commit addresses both
cases.

This patch must be backported to 1.8.
2018-03-12 22:52:24 +01:00
Aurélien Nephtali
b53e20826e BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage
The correct keyword is 'ssl-sessions' (vs. 'ssl-session').
The typo was introduced in 45c742be05 ('REORG: cli: move the "set
rate-limit" functions to their own parser').

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:08 +01:00
Aurélien Nephtali
bca08762d2 CLEANUP: cli: Remove a leftover debug message
This printf() was added in f886e3478d ("MINOR: cli: Add a command to
send listening sockets.").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:05 +01:00
Aurélien Nephtali
76de95a4c0 CLEANUP: ssl: Remove a duplicated #include
openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl:
add support for SNI and wildcard certificates").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:01 +01:00
Aurélien Nephtali
498a115727 BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd"
This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd"
command").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:47:26 +01:00
Willy Tarreau
84b118f312 BUG/MEDIUM: h2: also arm the h2 timeout when sending
Right now the h2 idle timeout is only set when there is no stream. If we
fail to send because the socket buffers are full (generally indicating
the client has left), we also need to arm it so that we can properly
expire such connections, otherwise some failed transfers might leave
H2 connections pending forever.

Thanks to Thierry Fournier for the diag and the traces.

This patch needs to be backported to 1.8.
2018-03-08 18:43:56 +01:00
Willy Tarreau
c41b3e8dff DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
This one is only used to compare pointers and NULL is permitted though
this is far from being clear.
2018-03-08 18:33:48 +01:00
Olivier Houchard
ec9516a6dc BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list.
When removing the socket from the xfer_sock_list, we want to set
next->prev to prev, not to next->prev, which is useless.

This should be backported to 1.8.
2018-03-08 18:33:11 +01:00
Emeric Brun
1738e86771 BUG/MINOR: session: Fix tcp-request session failure if handshake.
Some sample fetches check if session is established using
the flag CO_FL_CONNECTED. But in some cases, when a handshake
is performed this flag is set too late, after the process
of the tcp-request session rules.

This fix move the raising of the flag at the beginning of the
conn_complete_session function which processes the tcp-request
session rules.

This fix must be backported to 1.8 (and perhaps 1.7)
2018-03-06 14:04:45 +01:00
Willy Tarreau
44e973f508 MEDIUM: h2: use a single buffer allocator
We used to have one buffer allocator per direction while we can never
block on two buffers at once. Let's have a single one and rely on the
connection's flags to know which one we're waitinf for.
2018-03-01 17:58:15 +01:00
Willy Tarreau
0a10de6066 MINOR: h2: provide and use h2s_detach() and h2s_free()
These ones save us from open-coding the cleanup functions on each and
every error path. The code was updated to use them with no functional
change.
2018-03-01 16:35:01 +01:00
Willy Tarreau
00dd07895a CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
This function takes an h2c and an h2s but it never uses the h2c, which
is a bit confusing at some places in the code. Let's make it clear that
it only operates on the h2s instead by renaming it and removing the
unused h2c argument.
2018-03-01 16:31:34 +01:00
Emmanuel Hocdet
253c3b7516 MINOR: connection: add proxy-v2-options authority
This patch add option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS
connection was negotiated. In this case, authority corresponds to the sni.
2018-03-01 11:38:32 +01:00
Emmanuel Hocdet
fa8d0f1875 MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
This patch implement proxy protocol v2 options related to crypto information:
ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and
cert-key (PP2_SUBTYPE_SSL_KEY_ALG).
2018-03-01 11:38:28 +01:00
Emmanuel Hocdet
283e004a85 MINOR: ssl: add ssl_sock_get_cert_sig function
ssl_sock_get_cert_sig can be used to report cert signature short name
to log and ppv2 (RSA-SHA256).
2018-03-01 11:34:08 +01:00
Emmanuel Hocdet
96b7834e98 MINOR: ssl: add ssl_sock_get_pkey_algo function
ssl_sock_get_pkey_algo can be used to report pkey algorithm to log
and ppv2 (RSA2048, EC256,...).
Extract pkey information is not free in ssl api (lock/alloc/free):
haproxy can use the pkey information computed in load_certificate.
Store and use this information in a SSL ex_data when available,
compute it if not (SSL multicert bundled and generated cert).
2018-03-01 11:34:05 +01:00
Emmanuel Hocdet
ddc090bc55 MINOR: ssl: extract full pkey info in load_certificate
Private key information is used in switchctx to implement native multicert
selection (ecdsa/rsa/anonymous). This patch extract and store full pkey
information: dsa type and pkey size in bits. This can be used for switchctx
or to report pkey informations in ppv2 and log.
2018-03-01 11:33:18 +01:00
Emmanuel Hocdet
8c0c34b6e7 Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')"
This reverts commit 82913e4f79.
TLV string value should not be null-terminated.

This should be backported to 1.8.
2018-03-01 06:48:05 +01:00
Christopher Faulet
7d9f1ba246 BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping
In the SPOE applet's handler, when an applet is switched from the state IDLE to
PROCESSING, it is removed for the list of idle applets. But when HAProxy is
stopping, this applet can be switched to DISCONNECT. In this case, we also need
to remove it from the list of idle applets. Else the applet is removed but still
present in the list. It could lead to a segmentation fault or an infinite loop,
depending the code path.
2018-02-28 16:20:33 +01:00
Willy Tarreau
35a62705df BUG/MEDIUM: h2: always consume any trailing data after end of output buffers
In case a stream tries to emit more data than advertised by the chunks
or content-length headers, the extra data remains in the channel's output
buffer until the channel's timeout expires. It can easily happen when
sending malformed error files making use of a wrong content-length or
having extra CRLFs after the empty chunk. It may also be possible to
forge such a bad response using Lua.

The H1 to H2 encoder must protect itself against this by marking the data
presented to it as consumed if it decides to discard them, so that the
sending stream doesn't wait for the timeout to trigger.

The visible effect of this problem is a huge memory usage and a high
concurrent connection count during benchmarks when using such bad data
(a typical place where this easily happens).

This fix must be backported to 1.8.
2018-02-27 15:37:25 +01:00
Christopher Faulet
929b52d8a1 BUG/MINOR: h2: Set the target of dbuf_wait to h2c
In h2_get_dbuf, when the buffer allocation was failing, dbuf_wait.target was
errornously set to the connection (h2c->conn) instead of the h2 connection
descriptor (h2c).

This patch must be backported to 1.8.
2018-02-26 17:33:16 +01:00
Yves Lafon
95317289e9 MINOR: stats: display the number of threads in the statistics.
Add the nbthread global variable to the output, matching nbproc.

This may be backported to 1.8
2018-02-26 11:53:46 +01:00
Willy Tarreau
f161d0f51e BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory
pools."), we support lockless pools. However the parts dedicated to
detecting use-after-free are not present in this part, making DEBUG_UAF
useless in this situation.

The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such
a compatible architecture is detected, and when pool debugging is not
requested, then makes use of this everywhere in pools and buffers
functions. This way enabling DEBUG_UAF will automatically disable the
lockless version.

No backport is needed as this is purely 1.9-dev.
2018-02-22 14:18:45 +01:00
Tim Duesterhus
5e64286bab CLEANUP: standard: Fix typo in IPv6 mask example
IPv6 addresses with two double colons are invalid.

This typo was introduced in commit 471851713a.
2018-02-21 05:07:35 +01:00
Tim Duesterhus
66888f907c CLEANUP: h2: Remove unused labels from mux_h2.c
This removes the unused next_header_block and try_again labels
from mux_h2.c.

try_again is unused as of a76e4c2183,
which first appeared in haproxy 1.8.0.
next_header_block is unused as of 872855998b,
which was backported to haproxy 1.8.0 as
59fcb216085a7aa9744cffe39567c80de4ebd6bf.
2018-02-20 08:30:13 +01:00
Tim Duesterhus
932bb289dd CLEANUP: spoe: Remove unused label retry
This removes the retry labels from spoe_send_frame and spoe_recv_frame
which are unused since d5216d474d, which
is unreleased, but was backported to haproxy 1.8 as
f13f3a4babdb1ce23a7e982c765704bca728111a.
2018-02-20 08:30:12 +01:00
Tim Duesterhus
9619e72c6b CLEANUP: cfgparse: Remove unused label end
This removes the end label from parse_process_number() which
is unused since 5ab51775e7, which
first was released in haproxy 1.8.0.
2018-02-20 08:30:12 +01:00
Emeric Brun
74f7ffa229 MINOR: ssl/sample: adds ssl_bc_is_resumed fetch keyword.
Returns true when the back connection was made over an SSL/TLS transport
layer and the newly created SSL session was resumed using a cached
session or a TLS ticket.
2018-02-19 16:50:20 +01:00
Emeric Brun
eb8def9f34 BUG/MEDIUM: ssl/sample: ssl_bc_* fetch keywords are broken.
Since the split between connections and conn-stream objects, this
keywords are broken.

This patch must be backported in 1.8
2018-02-19 16:50:05 +01:00
Christopher Faulet
fd04fcf5ed BUG/MEDIUM: http: Switch the HTTP response in tunnel mode as earlier as possible
When the body length is undefined (no Content-Length or Transfer-Encoding
headers), The reponse remains in ending mode, waiting the request is done. So,
most of time this is not a problem because the resquest is done before the
response. But when a client sends data to a server that replies without waiting
all the data, it is really not desirable to wait the end of the request to
finish the response.

This bug was introduced when the tunneling of the request and the reponse was
refactored, in commit 4be980391 ("MINOR: http: Switch requests/responses in
TUNNEL mode only by checking txn flag").

This patch should be backported in 1.8 and 1.7.
2018-02-19 16:47:12 +01:00
Christopher Faulet
4ac77a98cd BUG/MEDIUM: ssl: Shutdown the connection for reading on SSL_ERROR_SYSCALL
When SSL_read returns SSL_ERROR_SYSCALL and errno is unset or set to EAGAIN, the
connection must be shut down for reading. Else, the connection loops infinitly,
consuming all the CPU.

The bug was introduced in the commit 7e2e50500 ("BUG/MEDIUM: ssl: Don't always
treat SSL_ERROR_SYSCALL as unrecovarable."). This patch must be backported in
1.8 too.
2018-02-19 15:37:47 +01:00
Willy Tarreau
280f42b99e MINOR: sample: add a new "concat" converter
It's always a pain not to be able to combine variables. This commit
introduces the "concat" converter, which appends a delimiter, a variable's
contents and another delimiter to an existing string. The result is a string.
This makes it easier to build composite variables made of other variables.
2018-02-19 15:34:12 +01:00
Christopher Faulet
16f45c87d5 BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe
A TLS ticket keys file can be updated on the CLI and used in same time. So we
need to protect it to be sure all accesses are thread-safe. Because updates are
infrequent, a R/W lock has been used.

This patch must be backported in 1.8
2018-02-19 14:15:38 +01:00
Tim Duesterhus
9ad9f3517e DOC: cfgparse: Warn on option (tcp|http)log in backend
The option does not seem to have any effect since at least haproxy
1.3. Also the `log-format` directive already warns when being used
in a backend.
2018-02-19 13:57:32 +01:00
Aurélien Nephtali
39b89889e7 BUG/MINOR: init: Add missing brackets in the code parsing -sf/-st
The codes tries to strip trailing spaces of arguments but due to missing
brackets, it will always exit.

It can be reproduced with this (silly) example:

$ haproxy -f /etc/haproxy/haproxy.cfg -sf 1234 "1235 " 1236
$ echo $?
1

This was introduced in commit 236062f7c ("MINOR: init: emit warning when
-sf/-sd cannot parse argument")

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@gmail.com>
2018-02-19 08:02:21 +01:00
Olivier Houchard
7e2e505006 BUG/MEDIUM: ssl: Don't always treat SSL_ERROR_SYSCALL as unrecovarable.
Bart Geesink reported some random errors appearing under the form of
termination flags SD in the logs for connections involving SSL traffic
to reach the servers.

Tomek Gacek and Mateusz Malek finally narrowed down the problem to commit
c2aae74 ("MEDIUM: ssl: Handle early data with OpenSSL 1.1.1"). It happens
that the special case of SSL_ERROR_SYSCALL isn't handled anymore since
this commit.

SSL_read() might return <= 0, and SSL_get_erro() return SSL_ERROR_SYSCALL,
without meaning the connection is gone. Before flagging the connection
as in error, check the errno value.

This should be backported to 1.8.
2018-02-14 18:44:28 +01:00
Olivier Houchard
1ff9104117 BUG/MINOR: fd/threads: properly lock the FD before adding it to the fd cache.
It was believed that it was useless to lock the "prev" field when adding a
FD. However, if there's only one element in the FD cache, and that element
removes itself from the fd cache, and another FD is added before the first
add completed, there's a risk of losing elements. To prevent that, lock the
"prev" field, so that such a removal will wait until the add completed.
2018-02-08 17:24:06 +01:00
Willy Tarreau
58aa5ccd76 BUG/MINOR: config: don't emit a warning when global stats is incompletely configured
Martin Brauer reported an unexpected warning when some parts of the
global stats are defined but not the listening address, like below :

  global
    #stats socket run/admin.sock mode 660 level admin
    stats timeout 30s

Then haproxy complains :
  [WARNING] 334/150131 (23086) : config : frontend 'GLOBAL' has no
'bind' directive. Please declare it as a backend if this was intended.

This is because of the check for a bind-less frontend (the global section
creates a frontend for the stats). There's no clean fix for this one, so
here we're simply checking that the frontend is not the global stats one
before emitting the warning.

This patch should be backported to all stable versions.
2018-02-08 09:55:09 +01:00
Willy Tarreau
821069832e BUILD: fd/threads: fix breakage build breakage without threads
The last fix for the volatile dereference made use of pl_deref_int()
which is unknown when building without threads. Let's simply open-code
it instead. No backport needed.
2018-02-06 12:00:27 +01:00
Chris Lane
236062f7ce MINOR: init: emit warning when -sf/-sd cannot parse argument
Previously, -sf and -sd command line parsing used atol which cannot
detect errors.  I had a problem where I was doing -sf "$pid1 $pid2 $pid"
and it was sending the gracefully terminate signal only to the first pid.
The change uses strtol and checks endptr and errno to see if the parsing
worked.  It will exit when the pid list is not parsed.

[wt: this should be backported to 1.8]
2018-02-06 07:23:32 +01:00
Tim Duesterhus
7d58b4d156 BUG/MEDIUM: standard: Fix memory leak in str2ip2()
An haproxy compiled with:

> make -j4 all TARGET=linux2628 USE_GETADDRINFO=1

And running with a configuration like this:

  defaults
  	log	global
  	mode	http
  	option	httplog
  	option	dontlognull
  	timeout connect 5000
  	timeout client  50000
  	timeout server  50000

  frontend fe
  	bind :::8080 v4v6

  	default_backend be

  backend be
  	server s example.com:80 check

Will leak memory inside `str2ip2()`, because the list `result` is not
properly freed in success cases:

==18875== 140 (76 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 87 of 111
==18875==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18875==    by 0x537A565: gaih_inet (getaddrinfo.c:1223)
==18875==    by 0x537DD5D: getaddrinfo (getaddrinfo.c:2425)
==18875==    by 0x4868E5: str2ip2 (standard.c:733)
==18875==    by 0x43F28B: srv_set_addr_via_libc (server.c:3767)
==18875==    by 0x43F50A: srv_iterate_initaddr (server.c:3879)
==18875==    by 0x43F50A: srv_init_addr (server.c:3944)
==18875==    by 0x475B30: init (haproxy.c:1595)
==18875==    by 0x40406D: main (haproxy.c:2479)

The exists as long as the usage of getaddrinfo in that function exists,
it was introduced in commit:
d5f4328efd

v1.5-dev8 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-02-05 21:04:15 +01:00
Willy Tarreau
a331544c33 BUG/MINOR: time/threads: ensure the adjusted time is always correct
In the time offset calculation loop, we ensure we only commit the new
date once it's futher in the future than the current one. However there
is a small issue here on 32-bit platforms : if global_now is written in
two cycles by another thread, starting with the tv_sec part, and the
current thread reads it in the middle of a change, it may compute a
wrong "adjusted" value on the first round, with the new (larger) tv_sec
and the old (large) tv_usec. This will be detected as the CAS will fail,
and another attempt will be made, but this time possibly with too large
an adusted value, pushing the date further than needed (at worst almost
one second).

This patch addresses this by using a temporary adjusted time in the loop
that always restarts from the last known one, and by assigning the result
to the final value only once the CAS succeeds.

The impact is very limited, it may cause the time to advance in small
jumps on 32 bit platforms and in the worst case some timeouts might
expire 1 second too early.

This fix should be backported to 1.8.
2018-02-05 20:11:38 +01:00
Willy Tarreau
11559a7530 MINOR: fd: reorder fd_add_to_fd_list()
The function was cleaned up a bit from duplicated parts inherited from
the initial attempt at getting it to work. It's a bit smaller and cleaner
this way.
2018-02-05 19:45:41 +01:00
Willy Tarreau
3a8263f86b MINOR: fd: remove the unneeded last CAS when adding an fd to the list
This was a leftover from the initial code where two threads could fight
for the list's tail.
2018-02-05 19:45:39 +01:00
Willy Tarreau
abeaff2d54 BUG/MINOR: fd/threads: properly dereference fdcache as volatile
In fd_rm_from_fd_list(), we have loops waiting for another change to
complete, in case we don't have support for a double CAS. But these
ones fail to place a compiler barrier or to dereference the fdcache
as a volatile, resulting in an endless loop on the first collision,
which is visible when run on MIPS32.

No backport needed.
2018-02-05 19:45:31 +01:00
Willy Tarreau
4cc67a2782 MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c
There's not point inlining these huge functions, better move them to real
functions in fd.c.
2018-02-05 17:19:40 +01:00
Willy Tarreau
62a627ac19 MEDIUM: poller: use atomic ops to update the fdtab mask
We don't need to lock the fdtab[].lock anymore since we only have one
modification left (update update_mask). Let's use an atomic AND instead.
2018-02-05 16:02:22 +01:00
Willy Tarreau
d4daeac7f1 MINOR: select: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
1394eb0120 MINOR: poll: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
7d24fadf7c MINOR: kqueue: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate.
2018-02-05 16:02:22 +01:00
Willy Tarreau
038e54cb3c MINOR: epoll: get rid of the now useless fd_compute_new_polled_status()
Do not call it anymore and avoid updating the fdstate. We're not very far
from removing the fd lock it seems.
2018-02-05 16:02:22 +01:00
Olivier Houchard
1256836ebf MEDIUM: fd/threads: Make sure we don't miss a fd cache entry.
An fd cache entry might be removed and added at the end of the list, while
another thread is parsing it, if that happens, we may miss fd cache entries,
to avoid that, add a new field in the struct fdtab, "added_mask", which
contains a mask for potentially affected threads, if it is set, the
corresponding thread will set its bit in fd_cache_mask, to avoid waiting in
poll while it may have more work to do.
2018-02-05 16:02:22 +01:00
Olivier Houchard
4815c8cbfe MAJOR: fd/threads: Make the fdcache mostly lockless.
Create a local, per-thread, fdcache, for file descriptors that only belongs
to one thread, and make the global fd cache mostly lockless, as we can get
a lot of contention on the fd cache lock.
2018-02-05 16:02:22 +01:00
Olivier Houchard
cf975d46bc MINOR: pools/threads: Implement lockless memory pools.
On CPUs that support a double-width compare-and-swap, implement lockless
pools.
2018-02-05 16:02:22 +01:00
Olivier Houchard
25ae45a078 MINOR: early data: Never remove the CO_FL_EARLY_DATA flag.
It may be useful to keep the CO_FL_EARLY_DATA flag, so that we know early
data were used, so instead of doing this, only add the Early-data header,
and have the sample fetch ssl_fc_has_early return 1, if CO_FL_EARLY_DATA is
set, and if the handshake isn't done yet.
2018-02-05 14:24:50 +01:00
Olivier Houchard
6fa63d9852 MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams.
Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake
up a stream, because it is waiting for a SSL handshake, instead add a new
conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on
CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.
2018-02-05 14:24:50 +01:00
Olivier Houchard
5fa300da89 MINOR: init: make stdout unbuffered
printf is unusable for debugging without this, and printf() is not used
for anything else.
2018-02-05 14:15:20 +01:00
Christopher Faulet
e8ade385b4 MINOR: spoe: Add max-waiting-frames directive in spoe-agent configuration
This is the maximum number of frames waiting for an acknowledgement on the same
connection. This value is only used when the pipelinied or asynchronus exchanges
between HAProxy and SPOA are enabled. By default, it is set to 20.
2018-02-02 16:00:32 +01:00
Christopher Faulet
b077cdc012 MEDIUM: spoe: Use an ebtree to manage idle applets
Instead of using a list of applets with idle ones in front, we now use an
ebtree. Aapplets in the tree are idle by definition. And the key is the applet's
weight. When a new frame is queued, the first idle applet (with the lowest
weight) is woken up and its weight is increased by one. And when an applet sends
a frame to a SPOA, its weight is decremented by one.

This is empirical, but it should avoid to overuse a very few number of applets
and increase the balancing between idle applets.
2018-02-02 16:00:32 +01:00
Christopher Faulet
8f82b203d5 MINOR: spoe: Count the number of frames waiting for an ack for each applet
So it is easier to respect the max_fpa value. This is no more the maximum frames
processed by an applet at each loop but the maximum frames waiting for an ack
for a specific applet.

The function spoe_handle_processing_appctx has been rewritten accordingly.
2018-02-02 16:00:32 +01:00
Christopher Faulet
6f9ea4f87b MINOR: spoe: Replace sending_rate by a frequency counter
sending_rate was a counter used to evaluate the SPOE capacity to process
frames. Because it was not really accurrate, it has been replaced by a frequency
counter representing the number of frames handled by the SPOE per second. We
just check this counter is higher than the number of streams waiting for a
reply. If not, a new applet is created.
2018-02-02 16:00:32 +01:00
Christopher Faulet
fce747bbaa MINOR: spoe: Always link a SPOE context with the applet processing it
This was already done for fragmented frames. Now, this is true for all
frames.
2018-02-02 16:00:32 +01:00
Christopher Faulet
420977903b MINOR: spoe: Remove check on min_applets number when a SPOE context is queued
The calculation of a minimal number of active applets was really empirical and
finally useless. On heavy load, there are always many active applets (most of
time, more than the minimal required) and when the load is low, there is no
reason to keep unused applets opened.

Because of this change, the flag SPOE_APPCTX_FL_PERSIST is now unused. So it has
been removed.
2018-02-02 16:00:32 +01:00
Christopher Faulet
9cdca976d3 BUG/MEDIUM: spoe: Allow producer to read and to forward shutdown on request side
This is mandatory to correctly set right timeout on the stream. Else the client
timeout is never set. So only SPOE processing timeout will be evaluated. If it
is not defined (ie infinity), the stream can be blocked for a while, waiting the
SPOA reply. Of course, this is not a good idea to let the SPOE processing
timeout undefined, but it can happen.

This patch must be backported in 1.8.
2018-02-02 16:00:31 +01:00
Christopher Faulet
d5216d474d BUG/MEDIUM: spoe: Always try to receive or send the frame to detect shutdowns
Before, we checked if the buffer was allocated or not to avoid sending or
receiving a frame. This was done to not call ci_putblk or co_getblk if there is
nothing to do. But the checks on the buffers are also done in these
functions. So this is not mandatory here. But in these functions, the channel
state is also checked, so an error is returned if it is closed. By skipping the
call, we also skip the checks on the channel state, delaying shutdowns
detection.

Now, we always try to send or receive a frame. So if the corresponding channel
is closed, we can immediatly handle the error.

This patch must be backported in 1.8
2018-02-02 16:00:31 +01:00
Emmanuel Hocdet
f643b80429 MINOR: introduce proxy-v2-options for send-proxy-v2
Proxy protocol v2 can transport many optional informations. To avoid
send-proxy-v2-* explosion, this patch introduce proxy-v2-options parameter
and will allow to write: "send-proxy-v2 proxy-v2-options ssl,cert-cn".
2018-02-02 05:52:51 +01:00
Willy Tarreau
4979592907 BUG/MINOR: epoll/threads: only call epoll_ctl(DEL) on polled FDs
Commit d9e7e36 ("BUG/MEDIUM: epoll/threads: use one epoll_fd per thread")
addressed an issue with the polling and required that cloned FDs are removed
from all polling threads on close. But in fact it does it for all bound
threads, some of which may not necessarily poll the FD. This is harmless,
but it may also make it harder later to deal with FD migration between
threads. Better use polled_mask which only reports threads still aware
of the FD instead of thread_mask.

This fix should be backported to 1.8.
2018-01-31 09:49:29 +01:00
Frédéric Lécaille
6778b27542 MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters.
Implement exactly the same code as this has been done for "gpc0" and "gpc0_rate"
counters.
2018-01-31 09:40:05 +01:00
Willy Tarreau
a9786b6f04 MINOR: fd: pass the iocb and owner to fd_insert()
fd_insert() is currently called just after setting the owner and iocb,
but proceeding like this prevents the operation from being atomic and
requires a lock to protect the maxfd computation in another thread from
meeting an incompletely initialized FD and computing a wrong maxfd.
Fortunately for now all fdtab[].owner are set before calling fd_insert(),
and the first lock in fd_insert() enforces a memory barrier so the code
is safe.

This patch moves the initialization of the owner and iocb to fd_insert()
so that the function will be able to properly arrange its operations and
remain safe even when modified to become lockless. There's no other change
beyond the internal API.
2018-01-29 16:07:25 +01:00
Willy Tarreau
fc6eea4de2 MEDIUM: poll: don't use the old FD state anymore
The polling updates are now performed exactly like the epoll/kqueue
ones : only the new polled state is considered, and the previous one
is checked using polled_mask. The only specific stuff here is that
the fd state is shared between all threads, so an FD removal has to
be done only once.
2018-01-29 16:03:15 +01:00
Willy Tarreau
56dd12a7f0 MEDIUM: select: don't use the old FD state anymore
The polling updates are now performed exactly like the epoll/kqueue
ones : only the new polled state is considered, and the previous one
is checked using polled_mask. The only specific stuff here is that
the fd state is shared between all threads, so an FD removal has to
be done only once.
2018-01-29 16:03:15 +01:00
Willy Tarreau
82b37d74d2 MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock
Now that we can use atomic ops to set/clear an fd occurrence in an
fd_set, we don't need the poll_lock anymore. Let's remove it.
2018-01-29 16:03:15 +01:00
Willy Tarreau
d51a507dbd MEDIUM: select: make use of hap_fd_* functions
Given that FD_{CLR,SET} are not always guaranteed to be thread safe,
let's fall back to using the hap_fd_* functions as we used to till
1.5-dev18 and as poll() continues to use. This will make it easier
to remove the poll_lock.
2018-01-29 16:03:15 +01:00
Willy Tarreau
322e6c7e73 MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h
These functions were created for poll() in 1.5-dev18 (commit 80da05a4) to
replace the previous FD_{CLR,SET,ISSET} that were shared with select()
because some libcs enforce a limit on FD_SET. But FD_SET doesn't seem
to be universally MT-safe, requiring locks in the select() code that
are not needed in the poll code. So let's move back to the initial
situation where we used to only use bit fields, since that has been in
use since day one without a problem, and let's use these hap_fd_*
functions instead of FD_*.

This patch only moves the functions to fd.h and revives hap_fd_isset()
that was recently removed to kill an "unused" warning.
2018-01-29 16:03:15 +01:00
Willy Tarreau
745c60eac6 CLEANUP: fd: remove the unused "new" field
This field has been unused since 1.6, it's only updated and never
tested. Let's remove it.
2018-01-29 16:02:59 +01:00
Willy Tarreau
2d3c2db868 MINOR: poll: more accurately compute the new maxfd in the loop
Last commit 173d995 ("MEDIUM: polling: start to move maxfd computation
to the pollers") moved the maxfd computation to the polling loop, but
it still adds an entry when removing an fd, forcing the next loop to
seek from further away than necessary. Let's only update the max when
actually adding an entry.
2018-01-29 16:00:28 +01:00
Willy Tarreau
f2b5c99b4c CLEANUP: fd/threads: remove the now unused fdtab_lock
It was only used to protect maxfd computation and is not needed
anymore.
2018-01-29 15:25:35 +01:00
Willy Tarreau
173d9951e2 MEDIUM: polling: start to move maxfd computation to the pollers
Since only select() and poll() still make use of maxfd, let's move
its computation right there in the pollers themselves, and only
during each fd update pass. The computation doesn't need a lock
anymore, only a few atomic ops. It will be accurate, be done much
less often and will not be required anymore in the FD's fast patch.

This provides a small performance increase of about 1% in connection
rate when using epoll since we get rid of this computation which was
performed under a lock.
2018-01-29 15:22:57 +01:00
Willy Tarreau
c5532acb4d MINOR: fd: don't report maxfd in alert messages
The listeners and connectors may complain that process-wide or
system-wide FD limits have been reached and will in this case report
maxfd as the limit. This is wrong in fact since there's no reason for
the whole FD space to be contiguous when the total # of FD is reached.
A better approach would consist in reporting the accurate number of
opened FDs, but this is pointless as what matters here is to give a
hint about what might be wrong. So let's simply report the configured
maxsock, which will generally explain why the process' limits were
reached, which is the most common reason. This removes another
dependency on maxfd.
2018-01-29 15:18:54 +01:00
Willy Tarreau
ce036bc2da MINOR: polling: make epoll and kqueue not depend on maxfd anymore
Maxfd is really only useful to poll() and select(), yet epoll and
kqueue reference it almost by mistake :
  - cloning of the initial FDs (maxsock should be used here)
  - max polled events, it's maxpollevents which should be used here.

Let's fix these places.
2018-01-29 15:18:54 +01:00
Willy Tarreau
ccea35c980 BUG/MINOR: cli: use global.maxsock and not maxfd to list all FDs
The "show fd" command on the CLI doesn't list the last FD in use since
it doesn't include maxfd. We don't need to use maxfd here anyway as
global.maxsock will do the job pretty well and removes this dependency.
This patch may be backported to 1.8.
2018-01-29 15:18:54 +01:00
Frédéric Lécaille
a41d531e4e MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters.
This patch really adds support for up to MAX_SESS_STKCTR stick counters.
2018-01-29 13:53:56 +01:00
Tim Duesterhus
1478aa795e MEDIUM: sample: Add IPv6 support to the ipmask converter
Add an optional second parameter to the ipmask converter that specifies
the number of bits to mask off IPv6 addresses.

If the second parameter is not given IPv6 addresses fail to mask (resulting
in an empty string), preserving backwards compatibility: Previously
a sample like `src,ipmask(24)` failed to give a result for IPv6 addresses.

This feature can be tested like this:

  defaults
  	log	global
  	mode	http
  	option	httplog
  	option	dontlognull
  	timeout connect 5000
  	timeout client  50000
  	timeout server  50000

  frontend fe
  	bind :::8080 v4v6

  	# Masked IPv4 for IPv4, empty for IPv6 (with and without this commit)
  	http-response set-header Test %[src,ipmask(24)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test2 %[src,ipmask(24,ffff:ffff:ffff:ffff::)]
  	# Correctly masked IP addresses for both IPv4 and IPv6
  	http-response set-header Test3 %[src,ipmask(24,64)]

  	default_backend be

  backend be
  	server s example.com:80

Tested-By: Jarno Huuskonen <jarno.huuskonen@uef.fi>
2018-01-25 22:25:40 +01:00
Tim Duesterhus
b814da6c5c MINOR: config: Add support for ARGT_MSK6
This commit adds support for ARGT_MSK6 to make_arg_list().
2018-01-25 22:25:40 +01:00
Tim Duesterhus
471851713a MINOR: standard: Add str2mask6 function
This new function mirrors the str2mask() function for IPv4 addresses.

This commit is in preparation to support ARGT_MSK6.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
8575f72e93 CLEANUP: standard: Use len2mask4 in str2mask
The len2mask4 function was introduced in commit:
70473a5f8c
which is about six years later than the commit that introduced the
str2mask function:
2937c0dd20

This is a clean up in preparation for a str2mask6 function which
will use len2mask6.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
bf5ce02eff BUG/MINOR: sample: Fix output type of c_ipv62ip
c_ipv62ip failed to set the output type of the cast to SMP_T_IPV4
even for a successful conversion.

This bug exists as of commit cc4d1716a2
which is the first commit adding this function.

v1.6-dev4 is the first tag containing this commit, the fix should
be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
ec6b0a2d18 CLEANUP: sample: Fix outdated comment about sample casts functions
The cast functions modify their output type as of commit:
b805f71d1b

v1.5-dev20 is the first tag containing this comment, the fix
should be backported to haproxy 1.5 and newer.
2018-01-25 22:25:40 +01:00
Tim Duesterhus
c555ee0c45 CLEANUP: sample: Fix comment encoding of sample.c
The file contained an 'e' with an gravis accent and thus was
not US-ASCII, but ISO-8859-1.

Also correct the spelling in the incorrect comment.

The incorrect character was introduced in commit:
4d9a1d1a5c

v1.6-dev1 is the first tag containing this comment, the fix
should be backported to haproxy 1.6 and newer.
2018-01-25 22:25:40 +01:00
Christopher Faulet
727c89b3df BUILD: kqueue/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
This is the same patch than the previous one ("BUILD: epoll/threads: Add test on
MAX_THREADS to avoid warnings when complied without threads ").

It should be backported in 1.8 with the commit 7a2364d4 ("BUG/MEDIUM:
kqueue/threads: use one kqueue_fd per thread").
2018-01-25 17:52:57 +01:00
Christopher Faulet
3e805ed08e BUILD: epoll/threads: Add test on MAX_THREADS to avoid warnings when complied without threads
When HAProxy is complied without threads, gcc throws following warnings:

  src/ev_epoll.c:222:3: warning: array subscript is outside array bounds [-Warray-bounds]
  ...
  src/ev_epoll.c:199:11: warning: array subscript is outside array bounds [-Warray-bounds]
  ...

Of course, this is not a bug. In such case, tid is always equal to 0. But to
avoid the noise, a check on MAX_THREADS in "if (tid)" lines makes gcc happy.

This patch should be backported in 1.8 with the commit d9e7e36c ("BUG/MEDIUM:
epoll/threads: use one epoll_fd per thread").
2018-01-25 17:52:57 +01:00
Christopher Faulet
da18b9db7b MINOR: threads: Use __decl_hathreads instead of #ifdef/#endif
A #ifdef/#endif on USE_THREAD was added in the commit 0048dd04 ("MINOR: threads:
Fix build when we're not compiling with threads.") to conditionally define the
start_lock variable, because HA_SPINLOCK_T is only defined when HAProxy is
compiled with threads.

If fact, to do that, we should use the macro __decl_hathreads instead.

If commit 0048dd04 is backported in 1.8, this one can also be backported.
2018-01-25 17:52:57 +01:00
Christopher Faulet
13b007d583 BUG/MINOR: kqueue/threads: Don't forget to close kqueue_fd[tid] on each thread
in deinit_kqueue_per_thread, kqueue_fd[tid] must be closed, except for the main
thread (the first one, tid==0).

This patch must be backported in 1.8 with commit 7a2364d4.
2018-01-25 17:52:57 +01:00
Christopher Faulet
23d86d157e BUG/MEDIUM: checks: Don't try to release undefined conn_stream when a check is freed
When a healt-check is released, the attached conn_stream may be undefined. For
instance, this happens when 'no-check' option is used on a server line. So we
must check it is defined before trying to release it.

This patch must be backported in 1.8.
2018-01-25 13:51:23 +01:00
Christopher Faulet
8d01fd6b3c BUG/MEDIUM: threads/server: Fix deadlock in srv_set_stopping/srv_set_admin_flag
Because of a typo (HA_SPIN_LOCK instead of HA_SPIN_UNLOCK), there is a deadlock
in srv_set_stopping and srv_set_admin_flag when there is at least one trackers.

This patch must be backported in 1.8.
2018-01-25 13:51:23 +01:00
Willy Tarreau
c20d737338 BUG/MINOR: threads: always set an owner to the thread_sync pipe
The owner of the fd used by the synchronization pipe was set to NULL,
making it ignored by maxfd computation. The risk would be that some
synchronization events get delayed between threads when using poll()
or select(). However this is only theorical since the pipe is created
before listeners are bound so normally its FD should be lower and
this should normally not happen. The only possible situation would
be if all listeners are bound to inherited FDs which are lower than
the pipe's.

This patch must be backported to 1.8.
2018-01-25 07:31:08 +01:00
Olivier Houchard
0048dd04c9 MINOR: threads: Fix build when we're not compiling with threads.
Only declare the start_lock if threads are compiled in, otherwise
HA_SPINLOCK_T won't be defined.
This should be backported to 1.8 when/if
1605c7ae61 is backported.
2018-01-24 21:41:29 +01:00
Willy Tarreau
46ec48bc1a BUG/MINOR: mworker: only write to pidfile if it exists
A missing test causes a write(-1, $PID) to appear in strace output when
in master-worker mode. This is totally harmless though.

This fix must be backported to 1.8.
2018-01-23 19:20:19 +01:00
Willy Tarreau
1605c7ae61 BUG/MEDIUM: threads/mworker: fix a race on startup
Marc Fournier reported an interesting case when using threads with the
master-worker mode : sometimes, a listener would have its FD closed
during startup. Sometimes it could even be health checks seeing this.

What happens is that after the threads are created, and the pollers
enabled on each threads, the master-worker pipe is registered, and at
the same time a close() is performed on the write side of this pipe
since the children must not use it.

But since this is replicated in every thread, what happens is that the
first thread closes the pipe, thus releases the FD, and the next thread
starting a listener in parallel gets this FD reassigned. Then another
thread closes the FD again, which this time corresponds to the listener.
It can also happen with the health check sockets if they're started
early enough.

This patch splits the mworker_pipe_register() function in two, so that
the close() of the write side of the FD is performed very early after the
fork() and long before threads are created (we don't need to delay it
anyway). Only the pipe registration is done in the threaded code since
it is important that the pollers are properly allocated for this.
The mworker_pipe_register() function now takes care of registering the
pipe only once, and this is guaranteed by a new surrounding lock.

The call to protocol_enable_all() looks fragile in theory since it
scans the list of proxies and their listeners, though in practice
all threads scan the same list and take the same locks for each
listener so it's not possible that any of them escapes the process
and finishes before all listeners are started. And the operation is
idempotent.

This fix must be backported to 1.8. Thanks to Marc for providing very
detailed traces clearly showing the problem.
2018-01-23 19:18:57 +01:00
Willy Tarreau
7a2364d474 BUG/MEDIUM: kqueue/threads: use one kqueue_fd per thread
This is the same principle as the previous patch (BUG/MEDIUM:
epoll/threads: use one epoll_fd per thread) except that this time it's
for kqueue. We don't want all threads to wake up because of activity on
a single other thread that the other ones are not interested in.

Just like with previous patch, this one shows that the polling state
doesn't need to be changed here and that some simplifications are now
possible. This patch only implements the minimum required for a stable
backport.

This should be backported to 1.8.
2018-01-23 15:50:03 +01:00
Willy Tarreau
d9e7e36c6e BUG/MEDIUM: epoll/threads: use one epoll_fd per thread
There currently is a problem regarding epoll(). While select() and poll()
compute their polling state on the fly upon each call, epoll() keeps a
shared state between all threads via the epoll_fd. The problem is that
once an fd is registered on *any* thread, all other threads receive
events for that FD as well. It is clearly visible when binding a listener
to a single thread like in the configuration below where all 4 threads
will work, 3 of them simply spinning to skip the event :

    global
        nbthread 4

    frontend foo
        bind :1234 process 1/1

The worst case happens when some slow operations are in progress on a
busy thread, preventing it from processing its task and causing the
other ones to wake up not being able to do anything with this event.
Typically computing a large TLS key will delay processing of next
events on the same thread while others will still wake up.

All this simply shows that the poller must remain thread-specific, with
its own events and its own ability to sleep when it doesn't have anyhing
to do.

This patch does exactly this. For this, it proceeds like this :

   - have one epoll_fd per thread instead of one per process
   - initialize these epoll_fd when threads are created.
   - mark all known FDs as updated so that the next invocation of
     _do_poll() recomputes their polling status (including a possible
     removal of undesired polling from the original FD) ;
   - use each fd's polled_mask to maintain an accurate status of
     the current polling activity for this FD.
   - when scanning updates, only focus on events whose new polling
     status differs from the existing one
   - during updates, always verify the thread_mask to resist migration
   - on __fd_clo(), for cloned FDs (typically listeners inherited
     from the parent during a graceful shutdown), run epoll_ctl(DEL)
     on all epoll_fd. This is the reason why epoll_fd is stored in a
     shared array and not in a thread_local storage. Note: maybe this
     can be moved to an update instead.

Interestingly, this shows that we don't need the FD's old state anymore
and that we only use it to convert it to the new state based on stable
information. It appears clearly that the FD code can be further improved
by computing the final state directly when manipulating it.

With this change, the config above goes from 22000 cps at 380% CPU to
43000 cps at 100% CPU : not only the 3 unused threads are not activated,
but they do not disturb the activity anymore.

The output of "show activity" before and after the patch on a 4-thread
config where a first listener on thread 2 forwards over SSL to threads
3 & 4 shows this a much smaller amount of undesired events (thread 1
doesn't wake up anymore, poll_skip remains zero, fd_skip stays low) :

  // before: 400% CPU, 7700 cps, 13 seconds
  loops: 11380717 65879 5733468 5728129
  wake_cache: 0 63986 317547 314174
  wake_tasks: 0 0 0 0
  wake_applets: 0 0 0 0
  wake_signal: 0 0 0 0
  poll_exp: 0 63986 317547 314174
  poll_drop: 1 0 49981 48893
  poll_dead: 65514 0 31334 31934
  poll_skip: 46293690 34071 22867786 22858208
  fd_skip: 66068135 174157 33732685 33825727
  fd_lock: 0 2 2809 2905
  fd_del: 0 494361 80890 79464
  conn_dead: 0 0 0 0
  stream: 0 407747 50526 49474
  empty_rq: 11380718 1914 5683023 5678715
  long_rq: 0 0 0 0

  // after: 200% cpu, 9450 cps, 11 seconds
  loops: 17 66147 1001631 450968
  wake_cache: 0 66119 865139 321227
  wake_tasks: 0 0 0 0
  wake_applets: 0 0 0 0
  wake_signal: 0 0 0 0
  poll_exp: 0 66119 865139 321227
  poll_drop: 6 5 38279 60768
  poll_dead: 0 0 0 0
  poll_skip: 0 0 0 0
  fd_skip: 54 172661 4411407 2008198
  fd_lock: 0 0 10890 5394
  fd_del: 0 492829 58965 105091
  conn_dead: 0 0 0 0
  stream: 0 406223 38663 61338
  empty_rq: 18 40 962999 390549
  long_rq: 0 0 0 0

This patch presents a few risks but fixes a real problem with threads,
and as such it needs be backported to 1.8. It depends on previous patch
("MINOR: fd: add a bitmask to indicate that an FD is known by the poller").

Special thanks go to Samuel Reed for providing a large amount of useful
debugging information and for testing fixes.
2018-01-23 15:48:08 +01:00
Willy Tarreau
c9c8378c2b MINOR: fd: add a bitmask to indicate that an FD is known by the poller
Some pollers like epoll() need to know if the fd is already known or
not in order to compute the operation to perform (add, mod, del). For
now this is performed based on the difference between the previous FD
state and the new state but this will not be usable anymore once threads
become responsible for their own polling.

Here we come with a different approach : a bitmask is stored with the
fd to indicate which pollers already know it, and the pollers will be
able to simply perform the add/mod/del operations based on this bit
combined with the new state.

This patch only adds the bitmask declaration and initialization, it
is it not yet used. It will be needed by the next two fixes and will
need to be backported to 1.8.
2018-01-23 15:42:57 +01:00
Willy Tarreau
ebc78d78a2 BUG/MEDIUM: fd: maintain a per-thread update mask
Since the fd update tables are per-thread, we need to have a bit per
thread to indicate whether an update exists, otherwise this can lead
to lost update events every time multiple threads want to update the
same FD. In practice *for now*, it only happens at start time when
listeners are enabled and ask for polling after facing their first
EAGAIN. But since the pollers are still shared, a lost event is still
recovered by a neighbor thread. This will not reliably work anymore
with per-thread pollers, where it has been observed a few times on
startup that a single-threaded listener would not always accept
incoming connections upon startup.

It's worth noting that during this code review it appeared that the
"new" flag in the fdtab isn't used anymore.

This fix should be backported to 1.8.
2018-01-23 15:41:19 +01:00
Christopher Faulet
32467fef98 BUG/MEDIUM: threads/polling: Use fd_cache_mask instead of fd_cache_num
fd_cache_num is the number of FDs in the FD cache. It is a global variable. So
it is underoptimized because we may be lead to consider there are waiting FDs
for the current thread in the FD cache while in fact all FDs are assigned to the
other threads. So, in such cases, the polling loop will be evaluated many more
times than necessary.

Instead, we now check if the thread id is set in the bitfield fd_cache_mask.

[wt: it's not exactly a bug, rather a design limitation of the thread
 which was not addressed in time for the 1.8 release. It can appear more
 often than we initially predicted, when more threads are running than
 the number of assigned CPU cores, or when certain threads spend
 milliseconds computing crypto keys while other threads spin on
 epoll_wait(0)=0]

This patch should be backported to 1.8.
2018-01-23 15:39:51 +01:00
Christopher Faulet
69553fe62c MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache
A bitfield has been added to know if there are some FDs processable by a
specific thread in the FD cache. When a FD is inserted in the FD cache, the bits
corresponding to its thread_mask are set. On each thread, the bitfield is
updated when the FD cache is processed. If there is no FD processed, the thread
is removed from the bitfield by unsetting its tid_bit.

Note that this bitfield is updated but not checked in
fd_process_cached_events. So, when this function is called, the FDs cache is
always processed.

[wt: should be backported to 1.8 as it will help fix a design limitation]
2018-01-23 15:39:10 +01:00
Willy Tarreau
d80cb4ee13 MINOR: global: add some global activity counters to help debugging
A number of counters have been added at special places helping better
understanding certain bug reports. These counters are maintained per
thread and are shown using "show activity" on the CLI. The "clear
counters" commands also reset these counters. The output is sent as a
single write(), which currently produces up to about 7 kB of data for
64 threads. If more counters are added, it may be necessary to write
into multiple buffers, or to reset the counters.

To backport to 1.8 to help collect more detailed bug reports.
2018-01-23 15:38:33 +01:00
Willy Tarreau
421f02e738 MINOR: threads: add a MAX_THREADS define instead of LONGBITS
This one allows not to inflate some structures when threads are
disabled. Now struct global is 1.4 kB instead of 33 kB.

Should be backported to 1.8 for ease of backporting of upcoming
patches.
2018-01-23 15:28:20 +01:00
Olivier Houchard
e9bad0a936 MINOR: servers: Don't report duplicate dyncookies for disabled servers.
Especially with server-templates, it can happen servers starts with a
placeholder IP, in the disabled state. In this case, we don't want to report
that the same cookie was generated for multiple servers. So defer the test
until the server is enabled.

This should be backported to 1.8.
2018-01-23 14:05:17 +01:00
Emeric Brun
5548291395 BUG/MEDIUM: peers: fix expire date wasn't updated if entry is modified remotely.
The stktable_touch_remote considers the expire field stored in the stksess
struct.
The expire field was updated on the a newly created stksess to store.

But if the stksess with a same key is still present the expire was not updated.

This patch postpones the update of the expire field of the stksess just before
processing the "touch".

These bug was introduced in commit:

MEDIUM: threads/stick-tables: handle multithreads on stick tables.

And the fix should be backported on 1.8.
2018-01-22 16:03:25 +01:00
Etienne Carriere
a792a0aa93 MINOR: sample: add date_us sample
Add date_us sample that returns the microsecond part of the timeval
structure representing the date of the structure. The "second" part of
the timeval can already be fetched by the "date" sample
2018-01-21 07:56:42 +01:00
Willy Tarreau
cc35923c32 BUG/MINOR: poll: too large size allocation for FD events
Commit 80da05a ("MEDIUM: poll: do not use FD_* macros anymore") which
appeared in 1.5-dev18 and which was backported to 1.4.23 made explicit
use of arrays of FDs mapped to unsigned ints. The problem lies in the
allocated size for poll(), as the resulting size is in bits and not
bytes, resulting in poll() arrays being 8 times larger than necessary!

In practice poll() is not used on highly loaded systems, explaining why
nobody noticed. But it definetely has to be addressed.

This fix needs to be backported to all stable versions.
2018-01-17 15:52:11 +01:00
Christopher Faulet
333694d771 MINOR: spoe: Don't queue a SPOE context if nothing is sent
When some messages must be sent to an agent, the SPOE context of the stream is
queued to be handled by an SPOE applet. If there is no available applet, a new
one is created, thus opening a connection with the agent.

Since the support of ACLs on messages, some processing can now be discarded. So,
to avoid opening a connection for nothing, the SPOE context is now queued after
the messages encoding.
2018-01-15 13:48:03 +01:00
Christopher Faulet
336d3ef0e7 MINOR: spoe: add register-var-names directive in spoe-agent configuration
In addition to "option force-set-var", recently added, this directive can be
used to selectivelly register unknown variable names, without totally relaxing
their registration during the runtime, like "option force-set-var" does.

So there is no way for a malicious agent to exhaust memory by defining a too
high number of variable names. In other hand, you need to enumerate all
variable names. This could be painfull in some circumstances.

Remember, this directive is only usefull when the variable names are not
referenced anywhere in the HAProxy configuration or the SPOE one.

Thanks to Etienne Carrière for his help on this part.
2018-01-15 13:47:27 +01:00
Willy Tarreau
d651ba14d4 BUG/MEDIUM: stream: properly handle client aborts during redispatch
James Mc Bride reported an interesting case affecting all versions since
at least 1.5 : if a client aborts a connection on an empty buffer at the
exact moment a server redispatch happens, the CF_SHUTW_NOW flag on the
channel is immediately turned into CF_SHUTW, which is not caught by
check_req_may_abort(), leading the redispatch to be performed anyway
with the channel marked as shut in both directions while the stream
interface correctly establishes. This situation makes no sense.
Ultimately the transfer times out and the server-side stream interface
remains in EST state while the client is in CLO state, and this case
doesn't correspond to anything we can handle in process_stream, leading
to poll() being woken up all the time without any progress being made.
And the session cannot even be killed from the CLI.

So we must ensure that check_req_may_abort() also considers the case
where the channel is already closed, which is what this patch does.
Thanks to James for providing detailed captures allowing to diagnose
the problem.

This fix must be backported to all maintained versions.
2018-01-12 10:47:48 +01:00
William Lallemand
29f690c945 BUG/MEDIUM: mworker: execvp failure depending on argv[0]
The copy_argv() function lacks a check on '-' to remove the -x, -sf and
-st parameters.

When reloading a master process with a path starting by /st, /sf, or
/x..  the copy_argv() function skipped argv[0] leading to an execvp()
without the binary.
2018-01-09 23:44:18 +01:00