Commit Graph

9436 Commits

Author SHA1 Message Date
Christopher Faulet
ddc005ae57 BUG/MINOR: rules: Increment be_counters if backend is assigned for a silent-drop
Backend counters must be incremented only if a backend was already assigned to
the stream (when the stream exists). Otherwise, it means we are still on the
frontend side.

This patch may be backported as far as 1.6.
2020-03-06 15:36:04 +01:00
Christopher Faulet
f573ba2033 BUG/MINOR: rules: Return ACT_RET_ABRT when a silent-drop action is executed
When an action interrupts a transaction, returning a response or not, it must
return the ACT_RET_ABRT value and not ACT_RET_STOP. ACT_RET_STOP is reserved to
stop the processing of the current ruleset.

No backport needed because on previous versions, the action return values are
not handled the same way.
2020-03-06 15:36:04 +01:00
Christopher Faulet
177f480f2c BUG/MINOR: rules: Preserve FLT_END analyzers on silent-drop action
When at least a filter is attached to a stream, FLT_END analyzers must be
preserved on request and response channels.

This patch should be backported as far as 1.7.
2020-03-06 15:36:04 +01:00
Christopher Faulet
5e896510a8 MINOR: compression/filters: Initialize the comp filter when stream is created
Since the HTX mode is the only mode to process HTTP messages, the stream is
created for a uniq transaction. The keep-alive is handled at the mux level. So,
the compression filter can be initialized when the stream is created and
released with the stream. Concretly, .channel_start_analyze and
.channel_end_analyze callback functions are replaced by .attach and .detach
ones.

With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END
analysers for the compression filter.
2020-03-06 15:36:04 +01:00
Christopher Faulet
65554e1b95 MINOR: cache/filters: Initialize the cache filter when stream is created
Since the HTX mode is the only mode to process HTTP messages, the stream is
created for a uniq transaction. The keep-alive is handled at the mux level. So,
the cache filter can be initialized when the stream is created and released with
the stream. Concretly, .channel_start_analyze and .channel_end_analyze callback
functions are replaced by .attach and .detach ones.

With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END
analysers for the cache filter.
2020-03-06 15:36:04 +01:00
Christopher Faulet
d4a824e533 BUG/MINOR: http-rules: Fix a typo in the reject action function
A typo was introduced by the commit c5bb5a0f2 ("BUG/MINOR: http-rules: Preserve
FLT_END analyzers on reject action").

This patch must be backported with the commit c5bb5a0f2.
2020-03-06 15:36:04 +01:00
Christopher Faulet
c5bb5a0f2b BUG/MINOR: http-rules: Preserve FLT_END analyzers on reject action
When at least a filter is attached to a stream, FLT_END analyzers must be
preserved on request and response channels.

This patch should be backported as far as 1.8.
2020-03-06 14:13:00 +01:00
Christopher Faulet
90d22a88cb BUG/MINOR: http-rules: Return ACT_RET_ABRT to abort a transaction
When an action interrupts a transaction, returning a response or not, it must
return the ACT_RET_ABRT value and not ACT_RET_DONE. ACT_RET_DONE is reserved to
stop the processing on the current channel but some analysers may still be
active. When ACT_RET_ABRT is returned, all analysers are removed, except FLT_END
if it is set.

No backport needed because on previous verions, the action return value was not
handled the same way.

It is stated in the comment the return action returns ACT_RET_ABRT on
success. It it the right code to use to abort a transaction. ACT_RET_DONE must
be used when the message processing must be stopped. This does not means the
transaction is interrupted.

No backport needed.
2020-03-06 14:13:00 +01:00
Christopher Faulet
bc275a9e44 BUG/MINOR: lua: Init the lua wake_time value before calling a lua function
The wake_time of a lua context is now always set to TICK_ETERNITY when the
context is initialized and when everytime the execution of the lua stack is
started. It is mandatory to not set arbitrary wake_time when an action yields.

No backport needed.
2020-03-06 14:13:00 +01:00
Christopher Faulet
501465d94b MINOR: lua: Rename hlua_action_wake_time() to hlua_set_wake_time()
This function does not depends on the action class. So use a more generic
name. It will be easier to bind it on another class if necessary.
2020-03-06 14:13:00 +01:00
Christopher Faulet
d8f0e073dd MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY
This flag was used in some internal functions to be sure the current stream is
able to handle HTTP content. It was introduced when the legacy HTTP code was
still there. Now, It is possible to rely on stream's flags to be sure we have an
HTX stream.

So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is
replaced by a call to the IS_HTX_STRM() macro.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
d31c7b322c MINOR: lua: Stop using lua txn in hlua_http_del_hdr() and hlua_http_add_hdr()
In these functions, the lua txn was not used. So it can be removed from the
function argument list.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
d1914aaa03 MINOR: lua: Stop using the lua txn in hlua_http_rep_hdr()
In this function, the lua txn was only used to retrieve the stream. But it can
be retieve from the HTTP message, using its channel pointer. So, the lua txn can
be removed from the function argument list.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
9d1332bbf4 MINOR: lua: Stop using the lua txn in hlua_http_get_headers()
In this function, the lua txn was only used to test if the HTTP transaction is
defined. But it is always used in a context where it is true. So, the lua txn
can be removed from the function argument list.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
2ac9ba2a1c MINOR: lua: Add function to know if a channel is a response one
It is now possible to call Channel.is_resp(chn) method to know if a channel is a
response channel or not.
2020-03-06 14:13:00 +01:00
Christopher Faulet
0ec740eaee BUG/MINOR: lua: Ignore the reserve to know if a channel is full or not
The Lua function Channel.is_full() should not take care of the reserve because
it is not called from a producer (an applet for instance). From an action, it is
allowed to overwrite the buffer reserve.

This patch should be backported as far as 1.7. But it must be adapted for 1.8
and lower because there is no HTX on these versions.
2020-03-06 14:13:00 +01:00
Christopher Faulet
4ad7310399 BUG/MINOR: lua: Abort when txn:done() is called from a Lua action
When a lua action aborts a transaction calling txn:done() function, the action
must return ACT_RET_ABRT instead of ACT_RET_DONE. It is mandatory to
abort the message analysis.

This patch must be backported everywhere the commit 7716cdf45 ("MINOR: lua: Get
the action return code on the stack when an action finishes") was
backported. For now, no backport needed.
2020-03-06 14:12:59 +01:00
Christopher Faulet
e58c0002ff BUG/MINOR: http-ana: Reset request analysers on a response side error
When an error occurred on the response side, request analysers must be reset. At
this stage, only AN_REQ_HTTP_XFER_BODY analyser remains, and possibly
AN_REQ_FLT_END, if at least one filter is attached to the stream. So it is safe
to remove the AN_REQ_HTTP_XFER_BODY analyser. An error was already handled and a
response was already returned to the client (or it was at least scheduled to be
sent). So there is no reason to continue to process the request payload. It may
cause some troubles for the filters because when an error occurred, data from
the request buffer are truncated.

This patch must be backported as far as 1.9, for the HTX part only. I don't know
if the legacy HTTP code is affected.
2020-03-06 14:12:59 +01:00
Christopher Faulet
e6a62bf796 BUG/MEDIUM: compression/filters: Fix loop on HTX blocks compressing the payload
During the payload filtering, the offset is relative to the head of the HTX
message and not its first index. This index is the position of the first block
to (re)start the HTTP analysis. It must be used during HTTP analysis but not
during the payload forwarding.

So, from the compression filter point of view, when we loop on the HTX blocks to
compress the response payload, we must start from the head of the HTX
message. To ease the loop, we use the function htx_find_offset().

This patch must be backported as far as 2.0. It depends on the commit "MINOR:
htx: Add a function to return a block at a specific an offset". So this one must
be backported first.
2020-03-06 14:12:59 +01:00
Christopher Faulet
497c759558 BUG/MEDIUM: cache/filters: Fix loop on HTX blocks caching the response payload
During the payload filtering, the offset is relative to the head of the HTX
message and not its first index. This index is the position of the first block
to (re)start the HTTP analysis. It must be used during HTTP analysis but not
during the payload forwarding.

So, from the cache point of view, when we loop on the HTX blocks to cache the
response payload, we must start from the head of the HTX message. To ease the
loop, we use the function htx_find_offset().

This patch must be backported as far as 2.0. It depends on the commit "MINOR:
htx: Add a function to return a block at a specific an offset". So this one must
be backported first.
2020-03-06 14:12:59 +01:00
Christopher Faulet
81340d7b53 BUG/MINOR: filters: Forward everything if no data filters are called
If a filter enable the data filtering, in TCP or in HTTP, but it does not
defined the corresponding callback function (so http_payload() or
tcp_payload()), it will be ignored. If all configured data filter do the same,
we must be sure to forward everything. Otherwise nothing will be forwarded at
all.

This patch must be forwarded as far as 1.9.
2020-03-06 14:12:59 +01:00
Christopher Faulet
c50ee0b3b4 BUG/MINOR: filters: Use filter offset to decude the amount of forwarded data
When the tcp or http payload is filtered, it is important to use the filter
offset to decude the amount of forwarded data because this offset may change
during the call to the callback function. So we should not rely on a local
variable defined before this call.

For now, existing HAproxy filters don't change this offset, so this bug may only
affect external filters.

This patch must be forwarded as far as 1.9.
2020-03-06 14:12:59 +01:00
Christopher Faulet
24598a499f MINOR: flt_trace: Use htx_find_offset() to get the available payload length
The trace_get_htx_datalen() function now uses htx_find_offset() to get the
payload length, ie. the length of consecutives DATA blocks.
2020-03-06 14:12:59 +01:00
Christopher Faulet
bb76aa4d37 MINOR: htx: Use htx_find_offset() to truncate an HTX message
The htx_truncate() function now uses htx_find_offset() to find the first block
to start the truncation.
2020-03-06 14:12:59 +01:00
Christopher Faulet
1cdceb9365 MINOR: htx: Add a function to return a block at a specific offset
The htx_find_offset() function may be used to look for a block at a specific
offset in an HTX message, starting from the message head. A compound result is
returned, an htx_ret structure, with the found block and the position of the
offset in the block. If the offset is ouside of the HTX message, the returned
block is NULL.
2020-03-06 14:12:59 +01:00
Tim Duesterhus
ba837ec367 CLEANUP: proxy_protocol: Use size_t when parsing TLVs
Change `int` to `size_t` for consistency.
2020-03-06 11:16:19 +01:00
Tim Duesterhus
488ee7fb6e BUG/MAJOR: proxy_protocol: Properly validate TLV lengths
This patch fixes PROXYv2 parsing when the payload of the TCP connection is
fused with the PROXYv2 header within a single recv() call.

Previously HAProxy ignored the PROXYv2 header length when attempting to
parse the TLV, possibly interpreting the first byte of the payload as a
TLV type.

This patch adds proper validation. It ensures that:

1. TLV parsing stops when the end of the PROXYv2 header is reached.
2. TLV lengths cannot exceed the PROXYv2 header length.
3. The PROXYv2 header ends together with the last TLV, not allowing for
   "stray bytes" to be ignored.

A reg-test was added to ensure proper behavior.

This patch tries to find the sweat spot between a small and easily
backportable one, and a cleaner one that's more easily adaptable to
older versions, hence why it merges the "if" and "while" blocks which
causes a reindent of the whole block. It should be used as-is for
versions 1.9 to 2.1, the block about PP2_TYPE_AUTHORITY should be
dropped for 2.0 and the block about CRC32C should be dropped for 1.8.

This bug was introduced when TLV parsing was added. This happened in commit
b3e54fe387. This commit was first released
with HAProxy 1.6-dev1.

A similar issue was fixed in commit 7209c204bd.

This patch must be backported to HAProxy 1.6+.
2020-03-06 11:11:22 +01:00
Willy Tarreau
b1beaa302c BUG/MINOR: init: make the automatic maxconn consider the max of soft/hard limits
James Stroehmann reported something working as documented but that can
be considered as a regression in the way the automatic maxconn is
calculated from the process' limits :

  https://www.mail-archive.com/haproxy@formilux.org/msg36523.html

The purpose of the changes in 2.0 was to have maxconn default to the
highest possible value permitted to the user based on the ulimit -n
setting, however the calculation starts from the soft limit, which
can be lower than what users were allowed to with previous versions
where the default value of 2000 would force a higher ulimit -n as
long as it fitted in the hard limit.

Usually this is not noticeable if the user changes the limits, because
quite commonly setting a new value restricts both the soft and hard
values.

Let's instead always use the max between the hard and soft limits, as
we know these values are permitted. This was tried on the following
setup:

  $ cat ulimit-n.cfg
  global
    stats socket /tmp/sock1 level admin
  $ ulimit -n
  1024

Before the change the limits would show like this:

  $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max
  Maxsock: 1023
  Maxconn: 489

After the change the limits are now much better and more in line with
the default settings in earlier versions:

  $ socat - /tmp/sock1 <<< "show info" | grep -im2 ^Max
  Maxsock: 4095
  Maxconn: 2025

The difference becomes even more obvious when running moderately large
configs with hundreds of checked servers and hundreds of listeners:

  $ cat ulimit-n.cfg
  global
    stats socket /tmp/sock1 level admin

  listen l
    bind :10000-10300
    server-template srv- 300 0.0.0.0 check disabled

          Before   After
  Maxsock  1024    4096
  Maxconn  189     1725

This issue is tagged as minor since a trivial config change fixes it,
but it would help new users to have it backported as far as 2.0.
2020-03-06 10:49:55 +01:00
Carl Henrik Lunde
f91ac19299 OPTIM: startup: fast unique_id allocation for acl.
pattern_finalize_config() uses an inefficient algorithm which is a
problem with very large configuration files. This affects startup, and
therefore reload time. When haproxy is deployed as a router in a
Kubernetes cluster the generated configuration file may be large and
reloads are frequently occuring, which makes this a significant issue.

The old algorithm is O(n^2)
* allocate missing uids - O(n^2)
* sort linked list - O(n^2)

The new algorithm is O(n log n):
* find the user allocated uids - O(n)
* store them for efficient lookup - O(n log n)
* allocate missing uids - n times O(log n)
* sort all uids - O(n log n)
* convert back to linked list - O(n)

Performance examples, startup time in seconds:

    pat_refs old     new
    1000      0.02   0.01
    10000     2.1    0.04
    20000    12.3    0.07
    30000    27.9    0.10
    40000    52.5    0.14
    50000    77.5    0.17

Please backport to 1.8, 2.0 and 2.1.
2020-03-06 08:11:58 +01:00
Tim Duesterhus
a17e66289c MEDIUM: stream: Make the unique_id member of struct stream a struct ist
The `unique_id` member of `struct stream` now is a `struct ist`.
2020-03-05 20:21:58 +01:00
Tim Duesterhus
0643b0e7e6 MINOR: proxy: Make header_unique_id a struct ist
The `header_unique_id` member of `struct proxy` now is a `struct ist`.
2020-03-05 19:58:22 +01:00
Tim Duesterhus
ed5263739b CLEANUP: Use isttest() and istfree()
This adjusts a few locations to make use of `isttest()` and `istfree()`.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
e296d3e5f0 MINOR: ist: Add int isttest(const struct ist)
`isttest` returns whether the `.ptr` is non-null.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
241e29ef9c MINOR: ist: Add IST_NULL macro
`IST_NULL` is equivalent to an `struct ist` with `.ptr = NULL` and
`.len = 0`.
2020-03-05 19:52:07 +01:00
Willy Tarreau
6cbe62b858 MINOR: debug: add CLI command "debug dev write" to write an arbitrary size
This command is used to produce an arbitrary amount of data on the
output. It can be used to test the CLI's state machine as well as
the internal parts related to applets an I/O. A typical test consists
in asking for all sizes from 0 to 16384:

  $ (echo "prompt;expert-mode on";for i in {0..16384}; do
     echo "debug dev write $i"; done) | socat - /tmp/sock1 | wc -c
  134258738

A better test would consist in first waiting for the response before
sending a new request.

This command is not restricted to the admin since it's harmless.
2020-03-05 17:20:15 +01:00
Willy Tarreau
d04a2a6654 BUG/MINOR: ssl-sock: do not return an uninitialized pointer in ckch_inst_sni_ctx_to_sni_filters
There's a build error reported here:
   c9c6cdbf9c/checks

It's just caused by an inconditional assignment of tmp_filter to
*sni_filter without having been initialized, though it's harmless because
this return pointer is not used when fcount is NULL, which is the only
case where this happens.

No backport is needed as this was brought today by commit 38df1c8006
("MINOR: ssl/cli: support crt-list filters").
2020-03-05 16:26:12 +01:00
William Lallemand
cfca1422c7 MINOR: ssl: reach a ckch_store from a sni_ctx
It was only possible to go down from the ckch_store to the sni_ctx but
not to go up from the sni_ctx to the ckch_store.

To allow that, 2 pointers were added:

- a ckch_inst pointer in the struct sni_ctx
- a ckckh_store pointer in the struct ckch_inst
2020-03-05 11:28:42 +01:00
William Lallemand
38df1c8006 MINOR: ssl/cli: support crt-list filters
Generate a list of the previous filters when updating a certificate
which use filters in crt-list. Then pass this list to the function
generating the sni_ctx during the commit.

This feature allows the update of the crt-list certificates which uses
the filters with "set ssl cert".

This function could be probably replaced by creating a new
ckch_inst_new_load_store() function which take the previous sni_ctx list as
an argument instead of the char **sni_filter, avoiding the
allocation/copy during runtime for each filter. But since are still
handling the multi-cert bundles, it's better this way to avoid code
duplication.
2020-03-05 11:27:53 +01:00
Willy Tarreau
f4629a5346 BUG/MINOR: connection/debug: do not enforce !event_type on subscribe() anymore
When building with DEBUG_STRICT, there are still some BUG_ON(events&event_type)
in the subscribe() code which are not welcome anymore since we explicitly
permit to wake the caller up on readiness. This causes some regtests to fail
since 2c1f37d353 ("OPTIM: mux-h1: subscribe rather than waking up at a few
other places") when built with this option.

No backport is needed, this is 2.2-dev.
2020-03-05 07:46:33 +01:00
Tim Duesterhus
2825b4b0ca MINOR: stream: Use stream_generate_unique_id
This patch replaces the ad-hoc generation of stream's `unique_id` values
by calls to `stream_generate_unique_id`.
2020-03-05 07:23:00 +01:00
Tim Duesterhus
127a74dd48 MINOR: stream: Add stream_generate_unique_id function
Currently unique IDs for a stream are generated using repetitive code in
multiple locations, possibly allowing for inconsistent behavior.
2020-03-05 07:23:00 +01:00
Willy Tarreau
2c1f37d353 OPTIM: mux-h1: subscribe rather than waking up at a few other places
This is another round of conversion from a blind tasklet_wakeup() to
a more careful subscribe(). It has significantly improved the number
of function calls per HTTP request (/?s=1k/t=20) :
                        before    after
 tasklet_wakeup:          3         2
 conn_subscribe:          3         2
 h1_iocb:                 3         2
 h1_process:              3         2
 h1_parse_msg_hdrs:       4         3
 h1_rcv_buf:              5         3
 h1_send:                 5         4
 h1_subscribe:            2         1
 h1_wake_stream_for_send: 5         4
 http_wait_for_request:   2         1
 process_stream:          3         2
 si_cs_io_cb:             4         2
 si_cs_process:           3         1
 si_cs_rcv:               5         3
 si_sync_send:            2         1
 si_update_both:          2         1
 stream_int_chk_rcv_conn: 3         2
 stream_int_notify:       3         1
 stream_release_buffers:  9         4
2020-03-04 19:29:12 +01:00
Willy Tarreau
6f95f6e111 OPTIM: connection: disable receiving on disabled events when the run queue is too high
In order to save a lot on syscalls, we currently don't disable receiving
on a file descriptor anymore if its handler was already woken up. But if
the run queue is huge and the poller collects a lot of events, this causes
excess wakeups which take CPU time which is not used to flush these tasklets.
This patch simply considers the run queue size to decide whether or not to
stop receiving. Tests show that by stopping receiving when the run queue
reaches ~16 times its configured size, we can still hold maximal performance
in extreme situations like maxpollevents=20k for runqueue_depth=2, and still
totally avoid calling epoll_event under moderate load using default settings
on keep-alive connections.
2020-03-04 19:29:12 +01:00
Willy Tarreau
8de5c4fa15 MEDIUM: connection: only call ->wake() for connect() without I/O
We used to call ->wake() for any I/O event for which there was no
subscriber. But this is a problem because this causes massive wake()
storms since we disabled fd_stop_recv() to save syscalls.

The only reason for the io_available condition is to detect that an
asynchronous connect() just finished and will not be handled by any
registered event handler. Since we now properly handle synchronous
connects, we can detect this situation by the fact that we had a
success on conn_fd_check() and no requested I/O took over.
2020-03-04 19:29:12 +01:00
Willy Tarreau
4c69cff438 MINOR: tcp/uxst/sockpair: only ask for I/O when really waiting for a connect()
Now that the stream-interface properly handles synchonous connects, there
is no more reason for subscribing and doing nothing.
2020-03-04 19:29:12 +01:00
Willy Tarreau
ada4c5806b MEDIUM: stream-int: make sure to try to immediately validate the connection
In the rare case of immediate connect() (unix sockets, socket pairs, and
occasionally TCP over the loopback), it is counter-productive to subscribe
for sending and then getting immediately back to process_stream() after
having passed through si_cs_process() just to update the connection. We
already know it is established and it doesn't have any handshake anymore
so we just have to complete it and return to process_stream() with the
stream_interface in the SI_ST_RDY state. In this case, process_stream will
simply loop back to the beginning to synchronize the state and turn it to
SI_ST_EST/ASS/CLO/TAR etc.

This will save us from having to needlessly subscribe in the connect()
code, something which in addition cannot work with edge-triggered pollers.
2020-03-04 19:29:12 +01:00
Willy Tarreau
667fefdc90 BUG/MEDIUM: connection: stop polling for sending when the event is ready
With commit 065a025610 ("MEDIUM: connection: don't stop receiving events
in the FD handler") we disabled a number of fd_stop_* in conn_fd_handler(),
in order to wait for their respective handlers to deal with them. But it
is not correct to do that for the send direction, as we may very well
have nothing to send. This is visible when connecting in TCP mode to
a server with no data to send, there's nobody anymore to disable the
polling for the send direction.

And it is logical, on the recv() path we know the system has data to
deliver and that some code will be in charge of it. On the send
direction we simply don't know if it was the result of a successful
connect() or if there is still something to send. In any case we
almost never fill the network buffer on a single send() after being
woken up by the system, so disabling the FD immediately or much later
will not change the number of operations.

No backport is needed, this is 2.2-dev.
2020-03-04 19:29:12 +01:00
Willy Tarreau
109201fc5c BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr()
The approach was wrong. USE_DL is for the makefile to know if it's required
to append -ldl at link time. Some platforms do not need it (and in fact do
not have it) yet they have a working dladdr(). The real condition is related
to ELF. Given that due to Lua, all platforms that require -ldl already have
USE_DL set, let's replace USE_DL with __ELF__ here and consider that dladdr
is always needed on ELF, which basically is already the case.
2020-03-04 12:04:07 +01:00
Willy Tarreau
9133e48f2a BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms
resolve_sym_name() doesn't build when USE_DL is set on non-GNU platforms
because "Elf(W)" isn't defined. Since it's only used for dladdr1(), let's
refactor all this so that we can completely ifdef out that part on other
platforms. Now we have a separate function to perform the call depending
on the platform and it only returns the size when available.
2020-03-04 12:04:07 +01:00
Willy Tarreau
a91b7946bd MINOR: debug: dump the whole trace if we can't spot the starting point
Instead of special-casing the use of the symbol resolving to decide
whether to dump a partial or complete trace, let's simply start over
and dump everything when we reach the end after having found nothing.
It will be more robust against dirty traces as well.
2020-03-04 12:04:07 +01:00
Willy Tarreau
13faf16e1e MINOR: debug: improve backtrace() on aarch64 and possibly other systems
It happens that on aarch64 backtrace() only returns one entry (tested
with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding
the stack due to the risk of hitting a bad pointer. Here we can use
may_access() to know when it's safe, so we can actually unwind the stack
without taking risks. It happens that the faulting function (the one
just after the signal handler) is not listed here, very likely because
the signal handler uses a special stack and did not create a new frame.

So this patch creates a new my_backtrace() function in standard.h that
either calls backtrace() or does its own unrolling. The choice depends
on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build
target.
2020-03-04 12:04:07 +01:00
Willy Tarreau
cdd8074433 MINOR: debug: report the number of entries in the backtrace
It's useful to get an indication of unresolved stuff or memory
corruption to have the apparent depth of the stack trace in the
output, especially if we dump nothing.
2020-03-04 12:02:27 +01:00
Willy Tarreau
e58114e0e5 MINOR: wdt: do not depend on USE_THREAD
There is no reason for restricting the use of the watchdog to threads
anymore, as it works perfectly without threads as well.
2020-03-04 12:02:27 +01:00
Willy Tarreau
d6f1966543 MEDIUM: wdt: fall back to CLOCK_REALTIME if CLOCK_THREAD_CPUTIME is not available
At least FreeBSD has a fully functional CLOCK_THREAD_CPUTIME but it
cannot create a timer on it. This is not a problem since our timer is
only used to measure each thread's usage using now_cpu_time_thread().
So by just replacing this clock with CLOCK_REALTIME we allow such
platforms to periodically call the wdt and check the thread's CPU usage.
The consequence is that even on a totally idle system there will still
be a few extra periodic wakeups, but the watchdog becomes usable there
as well.
2020-03-04 12:02:27 +01:00
Willy Tarreau
7259fa2b89 BUG/MINOR: wdt: do not return an error when the watchdog couldn't be enabled
On operating systems not supporting to create a timer on
POSIX_THREAD_CPUTIME we emit a warning but we return an error so the
process fails to start, which is absurd. Let's return a success once
the warning is emitted instead.

This may be backported to 2.1 and 2.0.
2020-03-04 12:02:27 +01:00
Emmanuel Hocdet
842e94ee06 MINOR: ssl: add "ca-verify-file" directive
It's only available for bind line. "ca-verify-file" allows to separate
CA certificates from "ca-file". CA names sent in server hello message is
only compute from "ca-file". Typically, "ca-file" must be defined with
intermediate certificates and "ca-verify-file" with certificates to
ending the chain, like root CA.

Fix issue #404.
2020-03-04 11:53:11 +01:00
Willy Tarreau
0214b45a61 MINOR: debug: call backtrace() once upon startup
Calling backtrace() will access libgcc at runtime. We don't want to do
it after the chroot, so let's perform a first call to have it ready in
memory for later use.
2020-03-04 06:01:40 +01:00
Willy Tarreau
f5b4e064dc MEDIUM: debug: add support for dumping backtraces of stuck threads
When a panic() occurs due to a stuck thread, we'll try to dump a
backtrace of this thread if the config directive USE_BACKTRACE is
set (which is the case on linux+glibc). For this we use the
backtrace() call provided by glibc and iterate the pointers through
resolve_sym_name(). In order to minimize the output (which is limited
to one buffer), we only do this for stuck threads, and we start the
dump above ha_panic()/ha_thread_dump_all_to_trash(), and stop when
meeting known points such as main/run_tasks_from_list/run_poll_loop.

If enabled without USE_DL, the dump will be complete with no details
except that pointers will all be given relative to main, which is
still better than nothing.

The new USE_BACKTRACE config option is enabled by default on glibc since
it has been present for ages. When it is set, the export-dynamic linker
option is enabled so that all non-static symbols are properly resolved.
2020-03-03 18:40:03 +01:00
Willy Tarreau
cf12f2ee66 MINOR: cli: make "show fd" rely on resolve_sym_name()
This way we can drop all hard-coded iocb matching.
2020-03-03 18:19:04 +01:00
Willy Tarreau
2e89b0930b MINOR: debug: use resolve_sym_name() to dump task handlers
Now in "show threads", the task/tasklet handler will be resolved
using this function, which will provide more detailed results and
will still support offsets to main for unresolved symbols.
2020-03-03 18:19:04 +01:00
Willy Tarreau
eb8b1ca3eb MINOR: tools: add resolve_sym_name() to resolve function pointers
We use various hacks at a few places to try to identify known function
pointers in debugging outputs (show threads & show fd). Let's centralize
this into a new function dedicated to this. It already knows about the
functions matched by "show threads" and "show fd", and when built with
USE_DL, it can rely on dladdr1() to resolve other functions. There are
some limitations, as static functions are not resolved, linking with
-rdynamic is mandatory, and even then some functions will not necessarily
appear. It's possible to do a better job by rebuilding the whole symbol
table from the ELF headers in memory but it's less portable and the gains
are still limited, so this solution remains a reasonable tradeoff.
2020-03-03 18:18:40 +01:00
Willy Tarreau
762fb3ec8e MINOR: tools: add new function dump_addr_and_bytes()
This function dumps <n> bytes from <addr> in hex form into buffer <buf>
enclosed in brackets after the address itself, formatted on 14 chars
including the "0x" prefix. This is meant to be used as a prefix for code
areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: "
It relies on may_access() to know if the bytes are dumpable, otherwise "--"
is emitted. An optional prefix is supported.
2020-03-03 17:46:37 +01:00
Willy Tarreau
55a6c4f34d BUILD: tools: remove obsolete and conflicting trace() from standard.c
Since commit 4c2ae48375 ("MINOR: trace: implement a very basic trace()
function") merged in 2.1, trace() is an inline function. It must not
appear in standard.c anymore and may break build depending on includes.

This can be backported to 2.1.
2020-03-03 17:46:37 +01:00
Willy Tarreau
27d00c0167 MINOR: task: export run_tasks_from_list
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
3ebd55ee51 MINOR: haproxy: export run_poll_loop
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
1827845a3d MINOR: haproxy: export main to ease access from debugger
Better just export main instead of declaring it as extern, it's cleaner
and may be usable elsewhere.
2020-03-03 15:26:10 +01:00
Willy Tarreau
82aafc4a0f BUG/MEDIUM: debug: make the debug_handler check for the thread in threads_to_dump
It happens that just sending the debug signal to the process makes on
thread wait for its turn while nobody wants to dump. We need to at
least verify that a dump was really requested for this thread.

This can be backported to 2.1 and 2.0.
2020-03-03 08:31:34 +01:00
Willy Tarreau
516853f1cc MINOR: debug: report the task handler's pointer relative to main
Often in crash dumps we see unknown function pointers. Let's display
them relative to main, that helps quite a lot figure the function
from an executable, for example:

  (gdb) x/a main+645360
  0x4c56a0 <h1_timeout_task>:     0x2e6666666666feeb

This could be backported to 2.0.
2020-03-03 07:04:42 +01:00
Willy Tarreau
7d9421deca MINOR: tools: make sure to correctly check the returned 'ms' in date2std_log
In commit 4eee38a ("BUILD/MINOR: tools: fix build warning in the date
conversion functions") we added some return checks to shut build
warnings but the last test is useless since the tested pointer is not
updated by the last call to utoa_pad() used to convert the milliseconds.
It turns out the original code from 2012 already skipped this part,
probably in order to avoid the risk of seeing a millisecond field not
belonging to the 0-999 range. Better keep the check and put the code
into stricter shape.

No backport is needed. This fixes issue #526.
2020-02-29 09:08:02 +01:00
Willy Tarreau
77e463f729 BUG/MINOR: arg: don't reject missing optional args
Commit 80b53ffb1c ("MEDIUM: arg: make make_arg_list() stop after its
own arguments") changed the way we detect the empty list because we
cannot stop by looking up the closing parenthesis anymore, thus for
the first missing arg we have to enter the parsing loop again. And
there, finding an empty arg means we go to the empty_err label, where
it was not initially planned to handle this condition. This results
in %[date()] to fail while %[date] works. Let's simply check if we've
reached the minimally supported args there (it used to be done during
the function entry).

Thanks to Jrme for reporting this issue. No backport is needed,
this is 2.2-dev2+ only.
2020-02-28 16:41:29 +01:00
Willy Tarreau
493d9dc6ba MEDIUM: mux-h1: do not blindly wake up the tasklet at end of request anymore
Since commit "MEDIUM: connection: make the subscribe() call able to wakeup
if ready" we have the guarantee that the tasklet will be woken up if
subscribing to a connection for an even that's ready. Since we have too
many tasklet_wakeup() calls in mux-h1, let's now use this property to
improve the situation a bit.

With this change, no syscall count changed, however the number of useless
calls to some functions significantly went down. Here are the differences
for the test below (100k req), in number of calls per request :

  $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20

                           before   after  change   note
  tasklet_wakeup:           3        1      -66%
  h1_io_cb:                 4        3      -25%
  h1_send:                  6.7      5.4    -19%
  h1_wake:                  0.73     0.44   -39%
  h1_process:               4.7      3.4    -27%
  h1_wake_stream_for_send:  6.7      5.5    -18%
  si_cs_process             3.7      3.4     -7.8%
  conn_fd_handler           2.7      2.4    -10%
  raw_sock_to_buf:          4        2      -50%
  pool_free:                4        2      -50%    from failed rx calls

Note that the situation could be further improved by having muxes lazily
subscribe to Rx events in case the FD is already being polled. However
this requires deeper changes to implement a LAZY_RECV subscribe mode,
and to replace the FD's active bit by 3 states representing the desired
action to perform on the FD during the update, among NONE (no need to
change), POLL (can't proceed without), and STOP (buffer full). This
would only impact Rx since on Tx we know what we have to send. The
savings to expect from this might be more visible with splicing and/or
when dealing with many connections having long think times.
2020-02-28 16:17:09 +01:00
Willy Tarreau
065a025610 MEDIUM: connection: don't stop receiving events in the FD handler
The remaining epoll_ctl() calls are exclusively caused by the disagreement
between conn_fd_handler() and the mux receiving the data: the fd handler
wants to stop after having woken up the tasklet, then the mux after
receiving data wants to receive again. Given that they don't happen in
the same poll loop when there are many FDs, this causes a lot of state
changes.

As suggested by Olivier, if the task is already scheduled for running,
we don't need to disable the event because it's in the run queue, poll()
cannot stop, and reporting it again will be harmless. What *might*
happen however is that a sampling-based poller like epoll() would report
many times the same event and has trouble getting others behind. But if
it would happen, it would still indicate the run queue has plenty of
pending operations, so it would in fact only displace the problem from
the poller to the run queue, which doesn't seem to be worse (and in
fact we do support priorities while the poller does not).

By doing this change, the keep-alive test with 1k conns and 100k reqs
completely gets rid of the per-request epoll_ctl changes, while still
not causing extra recvfrom() :

  $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20

  200000 sendto 1
  200000 recvfrom 1
   10762 epoll_wait 1
    3664 epoll_ctl 1
    1999 recvfrom -1

In close mode, it didn't change anything, we're still in the optimal
case (2 epoll per connection) :

  $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20

  203764 epoll_ctl 1
  200000 sendto 1
  200000 recvfrom 1
    6091 epoll_wait 1
    2994 recvfrom -1
2020-02-28 16:17:09 +01:00
Willy Tarreau
7e59c0a5e1 MEDIUM: connection: make the subscribe() call able to wakeup if ready
There's currently an internal API limitation at the connection layer
regarding conn_subscribe(). We must not subscribe if we haven't yet
met EAGAIN or such a condition, so we sometimes force ourselves to read
in order to meet this condition and being allowed to call subscribe.
But reading cannot always be done (e.g. at the end of a loop where we
cannot afford to retrieve new data and start again) so we instead
perform a tasklet_wakeup() of the requester's io_cb. This is what is
done in mux_h1 for example. The problem with this is that it forces
a new receive when we're not necessarily certain we need one. And if
the FD is not ready and was already being polled, it's a useless
wakeup.

The current patch improves the connection-level subscribe() so that
it really manipulates the polling if the FD is marked not-ready, but
instead schedules the argument tasklet for a wakeup if the FD is
ready. This guarantees we'll wake this tasklet up in any case once the
FD is ready, either immediately or after polling.

By doing so, a test on pure close mode shows we cut in half the number
of epoll_ctl() calls and almost eliminate failed recvfrom():

  $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20

  before:
   399464 epoll_ctl 1
   200007 recvfrom 1
   200000 sendto 1
   100000 recvfrom -1
     7508 epoll_wait 1

  after:
   205739 epoll_ctl 1
   200000 sendto 1
   200000 recvfrom 1
     6084 epoll_wait 1
     2651 recvfrom -1

On keep-alive there is no change however.
2020-02-28 16:17:09 +01:00
Willy Tarreau
8dd348c90c MINOR: rawsock: always mark the FD not ready when we're certain it happens
This partially reverts commit 1113116b4a ("MEDIUM: raw-sock: remove
obsolete calls to fd_{cant,cond,done}_{send,recv}") so that we can mark
the FD not ready as required since commit 19bc201c9f ("MEDIUM: connection:
remove the intermediary polling state from the connection"). Indeed, with
the removal of the latter we don't have any other reliable indication that
the FD is blocked, which explains why there are so many EAGAIN in traces.

It's worth noting that a short read or a short write are also reliable
indicators of exhausted buffers and are even documented as such in the
epoll man page in case of edge-triggered mode. That's why we also report
the FD as blocked in such a case.

With this change we completely got rid of EAGAIN in keep-alive tests, but
they were expectedly transferred to epoll_ctl:

  $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20

  before:
   266331 epoll_ctl 1
   200000 sendto 1
   200000 recvfrom 1
   135757 recvfrom -1
     8626 epoll_wait 1

  after:
   394865 epoll_ctl 1
   200000 sendto 1
   200000 recvfrom 1
    10748 epoll_wait 1
     1999 recvfrom -1
2020-02-28 16:17:09 +01:00
Christopher Faulet
b045bb221a MINOR: mux-h1: Remove useless case-insensitive comparisons
Header names from an HTX message are always in lower-case, so the comparison may
be case-sensitive.
2020-02-28 10:49:09 +01:00
Christopher Faulet
3e1f7f4a39 BUG/MINOR: http-htx: Do case-insensive comparisons on Host header name
When a header is added or modified, in http_add_header() or
http_replace_header(), a comparison is performed on its name to know if it is
the Host header and if the authority part of the uri must be updated or
not. This comparision must be case-insensive.

This patch should fix the issue #522. It must be backported to 2.1.
2020-02-28 10:49:09 +01:00
Lukas Tribus
81725b867c BUG/MINOR: dns: ignore trailing dot
As per issue #435 a hostname with a trailing dot confuses our DNS code,
as for a zero length DNS label we emit a null-byte. This change makes us
ignore the zero length label instead.

Must be backported to 1.8.
2020-02-28 10:26:29 +01:00
William Lallemand
858885737c BUG/MEDIUM: ssl: chain must be initialized with sk_X509_new_null()
Even when there isn't a chain, it must be initialized to a empty X509
structure with sk_X509_new_null().

This patch fixes a segfault which appears with older versions of the SSL
libs (openssl 0.9.8, libressl 2.8.3...) because X509_chain_up_ref() does
not check the pointer.

This bug was introduced by b90d2cb ("MINOR: ssl: resolve issuers chain
later").

Should fix issue #516.
2020-02-27 14:48:35 +01:00
Tim Duesterhus
530408f976 BUG/MINOR: sample: Make sure to return stable IDs in the unique-id fetch
Previously when the `unique-id-format` contained non-deterministic parts,
such as the `uuid` fetch each use of the `unique-id` fetch would generate
a new unique ID, replacing the old one. The following configuration shows
the error:

  global
        log stdout format short daemon

  listen test
        log global
        log-format "%ID"
        unique-id-format %{+X}o\ TEST-%[uuid]

        mode http
        bind *:8080
        http-response set-header A %[unique-id]
        http-response set-header B %[unique-id]
        server example example.com:80

Without the patch the contents of the `A` and `B` response header would
differ.

This bug was introduced in commit f4011ddcf5,
which was first released with HAProxy 1.7-dev3.

This fix should be backported to HAProxy 1.7+.
2020-02-27 03:50:10 +01:00
Willy Tarreau
55c5399846 MINOR: epoll: always initialize all of epoll_event to please valgrind
valgrind complains that epoll_ctl() uses an epoll_event in which we
have only set the part we use from the data field (i.e. the fd). Tests
show that pre-initializing the struct in the stack doesn't have a
measurable impact so let's do it.
2020-02-26 14:36:27 +01:00
Willy Tarreau
c1563e5474 MINOR: wdt: always clear sigev_value to make valgrind happy
In issue #471 it was reported that valgrind sometimes complains about
timer_create() being called with uninitialized bytes. These are in fact
the bits from sigev_value.sival_ptr that are not part of sival_int that
are tagged as such, as valgrind has no way to know we're using the int
instead of the ptr in the union. It's cheap to initialize the field so
let's do it.
2020-02-26 14:05:20 +01:00
Willy Tarreau
fd2658c0c6 BUG/MINOR: h2: reject again empty :path pseudo-headers
Since commit 92919f7fd5 ("MEDIUM: h2: make the request parser rebuild
a complete URI") we make sure to rebuild a complete URI. Unfortunately
the test for an empty :path pseudo-header that is mandated by #8.1.2.3
appened to be performed on the URI before this patch, which is never
empty anymore after being rebuilt, causing h2spec to complain :

  8. HTTP Message Exchanges
    8.1. HTTP Request/Response Exchange
      8.1.2. HTTP Header Fields
        8.1.2.3. Request Pseudo-Header Fields
          - 1: Sends a HEADERS frame with empty ":path" pseudo-header field
            -> The endpoint MUST respond with a stream error of type PROTOCOL_ERROR.
               Expected: GOAWAY Frame (Error Code: PROTOCOL_ERROR)
                         RST_STREAM Frame (Error Code: PROTOCOL_ERROR)
                         Connection closed
                 Actual: DATA Frame (length:0, flags:0x01, stream_id:1)

It's worth noting that this error doesn't trigger when calling h2spec
with a timeout as some scripts do, which explains why it wasn't detected
after the patch above.

This fixes one half of issue #471 and should be backported to 2.1.
2020-02-26 13:56:24 +01:00
Emmanuel Hocdet
cf8cf6c5cd MINOR: ssl/cli: "show ssl cert" command should print the "Chain Filename:"
When the issuers chain of a certificate is picked from
the "issuers-chain-path" tree, "ssl show cert" prints it.
2020-02-26 13:11:59 +01:00
Emmanuel Hocdet
6f507c7c5d MINOR: ssl: resolve ocsp_issuer later
The goal is to use the ckch to store data from PEM files or <payload> and
only for that. This patch adresses the ckch->ocsp_issuer case. It finds
issuers chain if no chain is present in the ckch in ssl_sock_put_ckch_into_ctx(),
filling the ocsp_issuer from the chain must be done after.
It changes the way '.issuer' is managed: it tries to load '.issuer' in
ckch->ocsp_issuer first and then look for the issuer in the chain later
(in ssl_sock_load_ocsp() ). "ssl-load-extra-files" without the "issuer"
parameter can negate extra '.issuer' file check.
2020-02-26 13:11:59 +01:00
Emmanuel Hocdet
b90d2cbc42 MINOR: ssl: resolve issuers chain later
The goal is to use the ckch to store data from a loaded PEM file or a
<payload> and only for that. This patch addresses the ckch->chain case.
Looking for the issuers chain, if no chain is present in the ckch, can
be done in ssl_sock_put_ckch_into_ctx(). This way it is possible to know
the origin of the certificate chain without an extra state.
2020-02-26 13:06:04 +01:00
Emmanuel Hocdet
75a7aa13da MINOR: ssl: move find certificate chain code to its own function
New function ssl_get_issuer_chain(cert) to find an issuer_chain entry
from "issers-chain-path" tree.
2020-02-26 12:48:47 +01:00
Willy Tarreau
2104659cd5 MEDIUM: buffer: remove the buffer_wq lock
This lock was only needed to protect the buffer_wq list, but now we have
the mt_list for this. This patch simply turns the buffer_wq list to an
mt_list and gets rid of the lock.

It's worth noting that the whole buffer_wait thing still looks totally
wrong especially in a threaded context: the wakeup_cb() callback is
called synchronously from any thread and may end up calling some
connection code that was not expected to run on a given thread. The
whole thing should probably be reworked to use tasklets instead and be
a bit more centralized.
2020-02-26 10:39:36 +01:00
William Lallemand
e0f3fd5b4c CLEANUP: ssl: move issuer_chain tree and definition
Move the cert_issuer_tree outside the global_ssl structure since it's
not a configuration variable. And move the declaration of the
issuer_chain structure in types/ssl_sock.h
2020-02-25 15:06:40 +01:00
William Lallemand
a90e593a7a MINOR: ssl/cli: reorder 'show ssl cert' output
Reorder the 'show ssl cert' output so it's easier to see if the whole
chain is correct.

For a chain to be correct, an "Issuer" line must have the same
content as the next "Subject" line.

Example:

  Subject: /C=FR/ST=Paris/O=HAProxy Test Certificate/CN=test.haproxy.local
  Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local
  Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local
  Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local
  Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local
  Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Root CA/CN=root.haproxy.local
2020-02-25 14:17:50 +01:00
William Lallemand
bb7288a9f5 MINOR: ssl/cli: 'show ssl cert'displays the issuer in the chain
For each certificate in the chain, displays the issuer, so it's easy to
know if the chain is right.

Also rename "Chain" to "Chain Subject".

Example:

  Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local
  Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local
  Chain Subject: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local
  Chain Issuer: /C=FR/ST=Paris/O=HAProxy Test Root CA/CN=root.haproxy.local
2020-02-25 14:17:44 +01:00
William Lallemand
35f4a9dd8c MINOR: ssl/cli: 'show ssl cert' displays the chain
Display the subject of each certificate contained in the chain in the
output of "show ssl cert <filename>".
Each subjects are on a unique line prefixed by "Chain: "

Example:

Chain: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 2/CN=ca2.haproxy.local
Chain: /C=FR/ST=Paris/O=HAProxy Test Intermediate CA 1/CN=ca1.haproxy.local
2020-02-25 12:02:51 +01:00
Willy Tarreau
1b85785bc2 MINOR: config: mark global.debug as deprecated
This directive has never made any sense and has already caused trouble
by forcing the process to stay in foreground during the boot process.
Let's emit a warning mentioning it's deprecated and will be removed in
2.3.
2020-02-25 11:28:58 +01:00
Willy Tarreau
7f26391bc5 BUG/MINOR: connection: make sure to correctly tag local PROXY connections
As reported in issue #511, when sending an outgoing local connection
(e.g. health check) we must set the "local" tag and not a "proxy" tag.
The issue comes from historic support on v1 which required to steal the
address on the outgoing connection for such ones, creating confusion in
the v2 code which believes it sees the incoming connection.

In order not to risk to break existing setups which might rely on seeing
the LB's address in the connection's source field, let's just change the
connection type from proxy to local and keep the addresses. The protocol
spec states that for local, the addresses must be ignored anyway.

This problem has always existed, this can be backported as far as 1.5,
though it's probably not a good idea to change such setups, thus maybe
2.0 would be more reasonable.
2020-02-25 10:31:37 +01:00
Willy Tarreau
1ac83af560 CLEANUP: connection: use read_u32() instead of a cast in the netscaler parser
The netscaler protocol parser used to involve a few casts from char to
(uint32_t*), let's properly use u32 for this instead.
2020-02-25 10:24:51 +01:00
Willy Tarreau
26474c486d CLEANUP: lua: fix aliasing issues in the address matching code
Just use read_u32() instead of casting IPv6 addresses to uint32_t*.
2020-02-25 10:24:51 +01:00
Willy Tarreau
296cfd17ef MINOR: pattern: fix all remaining strict aliasing issues
There were still a number of struct casts from various sizes. All of
them were now replaced with read_u32(), read_u16(), read_u64() or
memcpy().
2020-02-25 10:24:51 +01:00
Willy Tarreau
a8b7ecd4dc CLEANUP: sample: use read_u64() in ipmask() to apply an IPv6 mask
There were 8 strict aliasing warnings there due to the dereferences
casting to uint32_t of input and output. We can achieve the same using
two write_u64() and four read_u64() which do not cause this issue and
even let the compiler use 64-bit operations.
2020-02-25 10:24:14 +01:00
Willy Tarreau
6cde5d883c CLEANUP: stick-tables: use read_u32() to display a node's key
This fixes another aliasing issue that pops up in stick_table.c
and peers.c's debug code.
2020-02-25 09:41:22 +01:00
Willy Tarreau
8b5075806d CLEANUP: cache: use read_u32/write_u32 to access the cache entry's hash
Enabling strict aliasing fails on the cache's hash which is a series of
20 bytes cast as u32. And in practice it could even fail on some archs
if the http_txn didn't guarantee the hash was properly aligned. Let's
use read_u32() to read the value and write_u32() to set it, this makes
sure the compiler emits the correct code to access these and knows about
the intentional aliasing.
2020-02-25 09:35:07 +01:00
Willy Tarreau
2b9f0664d6 CLEANUP: fd: use a union in fd_rm_from_fd_list() to shut aliasing warnings
Enabling strict aliasing fails in fd.c when using the double-word CAS,
let's get rid of the (void**)(void*)&cur_list junk and use a union
instead. This way the compiler knows they do alias.
2020-02-25 09:25:53 +01:00
Willy Tarreau
105599c1ba BUG/MEDIUM: ssl: fix several bad pointer aliases in a few sample fetch functions
Sample fetch functions ssl_x_sha1(), ssl_fc_npn(), ssl_fc_alpn(),
ssl_fc_session_id(), as well as the CLI's "show cert details" handler
used to dereference the output buffer's <data> field by casting it to
"unsigned *". But while doing this could work before 1.9, it broke
starting with commit 843b7cbe9d ("MEDIUM: chunks: make the chunk struct's
fields match the buffer struct") which merged chunks and buffers, causing
the <data> field to become a size_t. The impact is only on 64-bit platform
and depends on the endianness: on little endian, there should never be any
non-zero bits in the field as it is supposed to have been zeroed before the
call, so it shouldbe harmless; on big endian, the high part of the value
only is written instead of the lower one, often making the result appear
4 billion times larger, and making such values dropped everywhere due to
being larger than a buffer.

It seems that it would be wise to try to re-enable strict-aliasing to
catch such errors.

This must be backported till 1.9.
2020-02-25 08:59:23 +01:00
Willy Tarreau
5715da269d BUG/MINOR: sample: fix the json converter's endian-sensitivity
About every time there's a pointer cast in the code, there's a hidden
bug, and this one was no exception, as it passes the first octet of the
native representation of an integer as a single-character string, which
obviously only works on little endian machines. On big-endian machines,
something as simple as "str(foo),json" only returns zeroes.

This bug was introduced with the JSON converter in 1.6-dev1 by commit
317e1c4f1e ("MINOR: sample: add "json" converter"), the fix may be
backported to all stable branches.
2020-02-25 08:47:45 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
2020-02-25 08:16:33 +01:00
Willy Tarreau
ded15b7564 BUILD: ssl: only pass unsigned chars to isspace()
A build failure on cygwin was reported on github actions here:

  https://github.com/haproxy/haproxy/runs/466507874

It's caused by a signed char being passed to isspace(), and this one
being implemented as a macro instead of a function as the man page
suggests. It's the same issue that regularly pops up on Solaris. This
comes from commit 98263291cc which was merged in 1.8-dev1. A backport
is possible though not incredibly useful.
2020-02-25 07:51:59 +01:00
Tim Duesterhus
017484c80f CLEANUP: cfgparse: Fix type of second calloc() parameter
`curr_idle_thr` is of type `unsigned int`, not `int`. Fix this issue by
taking the size of the dereferenced `curr_idle_thr` array.

This issue was introduced when adding the `curr_idle_thr` struct member
in commit f131481a0a. This commit is first
tagged in 2.0-dev1 and marked for backport to 1.9.
2020-02-25 07:42:51 +01:00
Willy Tarreau
03e7853581 BUILD: remove obsolete support for -mregparm / USE_REGPARM
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers
so better get rid of this once for all.
2020-02-25 07:41:47 +01:00
William Lallemand
3f25ae31bd BUG/MINOR: ssl: load .key in a directory only after PEM
Don't try to load a .key in a directory without loading its associated
certificate file.

This patch ignores the .key files when iterating over the files in a
directory.

Introduced by 4c5adbf ("MINOR: ssl: load the key from a dedicated
file").
2020-02-24 16:34:16 +01:00
William Lallemand
4c5adbf595 MINOR: ssl: load the key from a dedicated file
For a certificate on a bind line, if the private key was not found in
the PEM file, look for a .key and load it.

This default behavior can be changed by using the ssl-load-extra-files
directive in the global section

This feature was mentionned in the issue #221.
2020-02-24 15:39:53 +01:00
Willy Tarreau
02ac950a11 CLEANUP: http/h1: rely on HA_UNALIGNED_LE instead of checking for CPU families
Now that we have flags indicating the CPU's capabilities, better use
them instead of missing some updates for new CPU families (ARMv8 was
missing there).
2020-02-21 16:32:57 +01:00
Willy Tarreau
a7ddab0c25 BUG/MEDIUM: shctx: make sure to keep all blocks aligned
The blocksize and the extra field are not necessarily aligned on a
machine word. This can result in crashing an align-sensitive machine
when initializing the shctx area. Let's round both sizes up to a pointer
size to make this safe everywhere.

This fixes issue #512. This should be backported as far as 1.8.
2020-02-21 13:45:58 +01:00
Jerome Magnin
4fb196c1d6 CLEANUP: sample: use iststop instead of a for loop
In sample_fetch_path we can use iststop() instead of a for loop
to find the '?' and return the correct length. This requires commit
"MINOR: ist: add an iststop() function".
2020-02-21 11:53:18 +01:00
Jerome Magnin
4bbc9494b7 BUG/MINOR: http: http-request replace-path duplicates the query string
In http_action_replace_uri() we call http_get_path() in the case of
a replace-path rule. http_get_path() will return an ist pointing to
the start of the path, but uri.ptr + uri.len points to the end of the
uri. As as result, we are matching against a string containing the
query, which we append to the "path" later, effectively duplicating
the query string.

This patch uses the iststop() function introduced in "MINOR: ist: add
an iststop() function" to find the '?' character and update the ist
length when needed.

This fixes issue #510.

The bug was introduced by commit 262c3f1a ("MINOR: http: add a new
"replace-path" action"), which was backported to 2.1 and 2.0.
2020-02-21 11:52:14 +01:00
Willy Tarreau
6e59cb5db1 MINOR: mux-h1: pass CO_RFL_READ_ONCE to the lower layers when relevant
When we're in H1_MSG_RQBEFORE or H1_MSG_RPBEFORE, we know that the
first message is highly likely the only one and that it's pointless
to try to perform a second recvfrom() to complete a first partial
read. This is similar to what used to be done in the older I/O methods
with the CF_READ_DONTWAIT flag on the channel. So let's pass
CO_RFL_READ_ONCE to the transport layer during rcv_buf() in this case.

By doing so, in a test involving keep-alive connections with a non-null
client think time, we remove 20% of the recvfrom() calls, all of which
used to systematically fail. More precisely, we observe a drop from 5.0
recvfrom() per request with 60% failure to 4.0 per request with 50%
failure.
2020-02-21 11:38:50 +01:00
Willy Tarreau
716bec2dc6 MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE
This flag is currently supported by raw_sock to perform a single recv()
attempt and avoid subscribing. Typically on the request and response
paths with keep-alive, with short messages we know that it's very likely
that the first message is enough.
2020-02-21 11:22:45 +01:00
Willy Tarreau
5d4d1806db CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv}
This marks the end of the transition from the connection polling states
introduced in 1.5-dev12 and the subscriptions in that arrived in 1.9.
The socket layer can now safely use its FD while all upper layers rely
exclusively on subscriptions. These old functions were removed. Some may
deserve some renaming to improved clarty though. The single call to
conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling()
which already does the same.
2020-02-21 11:21:12 +01:00
Willy Tarreau
d1d14c3157 MINOR: connection: remove the last calls to conn_xprt_{want,stop}_*
The last few calls to conn_xprt_{want,stop}_{recv,send} in the central
connection code were replaced with their strictly exact equivalent fd_*,
adding the call to conn_ctrl_ready() when it was missing.
2020-02-21 11:21:12 +01:00
Willy Tarreau
562e0d8619 MINOR: tcp/uxst/sockpair: use fd_want_send() instead of conn_xprt_want_send()
Just like previous commit, we don't need to pass through the connection
layer anymore to enable polling during a connect(), we know the FD, so
let's simply call fd_want_send().
2020-02-21 11:21:12 +01:00
Willy Tarreau
3110eb769b MINOR: raw_sock: directly call fd_stop_send() and not conn_xprt_stop_send()
Now that we know that the connection layer is transparent for polling
changes, we have no reason for hiding behind conn_xprt_stop_send() and
can safely call fd_stop_send() on the FD once the buffer is empty.
2020-02-21 11:21:12 +01:00
Willy Tarreau
19bc201c9f MEDIUM: connection: remove the intermediary polling state from the connection
Historically we used to require that the connections held the desired
polling states for the data layer and the socket layer. Then with muxes
these were more or less merged into the transport layer, and now it
happens that with all transport layers having their own state, the
"transport layer state" as we have it in the connection (XPRT_RD_ENA,
XPRT_WR_ENA) is only an exact copy of the undelying file descriptor
state, but with a delay. All of this is causing some difficulties at
many places in the code because there are still some locations which
use the conn_want_* API to remain clean and only rely on connection,
and count on a later collection call to conn_cond_update_polling(),
while others need an immediate action and directly use the FD updates.

Since our updates are now much cheaper, most of them being only an
atomic test-and-set operation, and since our I/O callbacks are deferred,
there's no benefit anymore in trying to "cache" the transient state
change in the connection flags hoping to cancel them before they
become an FD event. Better make such calls transparent indirections
to the FD layer instead and get rid of the deferred operations which
needlessly complicate the logic inside.

This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE.
A number of functions related to polling updates were either greatly
simplified or removed.

Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data
were expected to be sent after a PROXY protocol or SOCKSv4 header. These
ones were simply replaced with a check on the subscription which is
where we ought to get the autoritative information from.

Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts
are the same. conn_stop_polling() and conn_xprt_stop_both() are the
same as well. conn_cond_update_polling() only causes errors to stop
polling. It also becomes way more obvious that muxes should not at
all employ conn_xprt_{want|stop}_{recv,send}(), and that the call
to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer
is inappropriate, it ought to unsubscribe from reads instead. All of
this definitely requires a serious cleanup.
2020-02-21 11:21:12 +01:00
Willy Tarreau
902871dd07 CLEANUP: epoll: place the struct epoll_event in the stack
Historically we used to have a global epoll_event for various
manipulations involving epoll_ctl() and when threads were added,
this was turned to a thread_local, which is needlessly expensive
since it's just a temporary variable. Let's move it to a local
variable wherever it's called instead.
2020-02-21 11:21:12 +01:00
Willy Tarreau
7c9d0e1b20 MINOR: checks: do not call conn_xprt_stop_send() anymore
While trying to address issue #253, Commit 5909380c ("BUG/MINOR: checks:
stop polling for write when we have nothing left to send") made sure that
we stop polling for writes when the buffer is empty. This was actually
more a workaround than a bug fix because by doing so we may be stopping
polling for an intermediary transport layer without acting on the check
itself in case there's SSL or send-proxy in the chain for example, thus
the approach is wrong. In practice due to the small size of check
requests, this will not have any impact. At best, we ought to unsubscribe
for sending, but that's already the case when we arrive in this function.
But given that the root cause of the issue was addressed later in commits
cc705a6b, c5940392 and ccf3f6d1, we can now safely revert this change.

It was confirmed on the faulty config that this change doesn't have any
effect anymore on the test.
2020-02-21 11:21:12 +01:00
Willy Tarreau
d57e34978d BUG/MINOR: mux: do not call conn_xprt_stop_recv() on buffer shortage
In H1/H2/FCGI, the *_get_buf() functions try to disable receipt of data
when there's no buffer available. But they do so at the lowest possible
level, which is unrelated to the upper transport layers which may still
be trying to feed data based on subscription. The correct approach here
would theorically be to only disable subscription, though when we get
there, the subscription will already have been dropped, so we can safely
just remove that call.

It's unlikely that this could have had any practical impact, as the upper
xprt layer would call this callback which would fail an not resubscribe.
Having the lowest layer disabled would just be temporary since when
re-enabling reading, a subscribe at the end of data would re-enable it.

Backport should not harm but seems useless at this point.
2020-02-21 11:21:12 +01:00
Christopher Faulet
9d9d645409 BUG/MAJOR: http-ana: Always abort the request when a tarpit is triggered
If an client error is reported on the request channel (CF_READ_ERROR) while a
session is tarpitted, no error is returned to the client. Concretly,
http_reply_and_close() function is not called. This function is reponsible to
forward the error to the client. But not only. It is also responsible to abort
the request. Because this function is not called when a read error is reported
on the request channel, and because the tarpit analyzer is the last one, there
is nothing preventing a connection attempt on a server while it is totally
unexpected.

So, a useless connexion on a backend server may be performed because of this
bug. If an HTTP load-balancing algorithm is used on the backend side, it leads
to a crash of HAProxy because the request was already erased.

If you have tarpit rules and if you use an HTTP load-balancing algorithm on your
backends, you must apply this patch. Otherwise a simple TCP reset on a tarpitted
connexion will most likely crash your HAProxy. A safe workaround is to use a
silent-drop rule or a deny rule instead of a tarpit.

This bug also affect the legacy code. It is in fact an very old hidden bug. But
the refactoring of process_stream() in the 1.9 makes it visible. And,
unfortunately, with the HTX, it is easier to hit it because many processing has
been moved in lower layers, in the muxes.

It must be backported as far as 1.9. For the 2.0 and the 1.9, the legacy HTTP
code must also be patched the same way. For older versions, it may be backported
but the bug seems to not impact them.

Thanks to Olivier D <webmaster@ajeux.com> to have reported the bug and provided
all the infos to analyze it.
2020-02-21 11:18:08 +01:00
Tim Duesterhus
e8aa5f24d6 BUG/MINOR: ssl: Stop passing dynamic strings as format arguments
gcc complains rightfully:

src/ssl_sock.c: In function ‘ssl_load_global_issuers_from_path’:
src/ssl_sock.c:9860:4: warning: format not a string literal and no format arguments [-Wformat-security]
    ha_warning(warn);
    ^

Introduced in 70df7bf19c.
2020-02-19 11:46:18 +01:00
Christopher Faulet
6072beb214 MINOR: http-ana: Match on the path if the monitor-uri starts by a /
if the monitor-uri starts by a slash ('/'), the matching is performed against
the request's path instead of the request's uri. It is a workaround to let the
HTTP/2 requests match the monitor-uri. Indeed, in HTTP/2, clients are encouraged
to send absolute URIs only.

This patch is not tagged as a bug, because the previous behavior matched exactly
what the doc describes. But it may surprise that HTTP/2 requests don't match the
monitor-uri.

This patch may be backported to 2.1 because URIs of HTTP/2 are stored using the
absolute-form starting this version. For previous versions, this patch will only
helps explicitely absolute HTTP/1 requests (and only the HTX part because on the
legacy HTTP, all the URI is matched).

It should fix the issue #509.
2020-02-18 16:29:29 +01:00
Christopher Faulet
d27689e952 BUG/MINOR: http-ana: Matching on monitor-uri should be case-sensitive
The monitor-uri should be case-sensitive. In reality, the scheme and the host
part are case-insensitives and only the path is case-sensive. But concretely,
since the start, the matching on the monitor-uri is case-sensitive. And it is
probably the expected behavior of almost all users.

This patch must be backported as far as 1.9. For HAProxy 2.0 and 1.9, it must be
applied on src/proto_htx.c.
2020-02-18 16:29:23 +01:00
Emmanuel Hocdet
70df7bf19c MINOR: ssl: add "issuers-chain-path" directive.
Certificates loaded with "crt" and "crt-list" commonly share the same
intermediate certificate in PEM file. "issuers-chain-path" is a global
directive to share intermediate chain certificates in a directory. If
certificates chain is not included in certificate PEM file, haproxy
will complete chain if issuer match the first certificate of the chain
stored via "issuers-chain-path" directive. Such chains will be shared
in memory.
2020-02-18 14:33:05 +01:00
Willy Tarreau
23997daf4e BUG/MINOR: sample: exit regsub() in case of trash allocation error
As reported in issue #507, since commiy 07e1e3c93e ("MINOR: sample:
regsub now supports backreferences") we must not proceed in regsub()
if we fali to allocate a trash (which in practice never happens). No
backport needed.
2020-02-18 14:27:44 +01:00
Christopher Faulet
0f19e43f2e BUG/MINOR: stream: Don't incr frontend cum_req counter when stream is closed
This counter is already incremented when a new request is received (or if an
error occurred waiting it). So it must not be incremented when the stream is
terminated, at the end of process_strem(). This bug was introduced by the commit
cff0f739e ("MINOR: counters: Review conditions to increment counters from
analysers").

No backport needed.
2020-02-18 11:56:22 +01:00
Christopher Faulet
34b18e4391 BUG/MINOR: http-htx: Don't return error if authority is updated without changes
When an Host header is updated, the autority part, if any, is also updated to
keep the both syncrhonized. But, when the update is performed while there is no
change, a failure is reported while, in reality, no update is necessary. This
bug was introduced by the commit d7b7a1ce5 ("MEDIUM: http-htx: Keep the Host
header and the request start-line synchronized").

This commit was pushed in the 2.1. But on this version, the bug is hidden
because rewrite errors are silently ignored. And because it happens when there
is no change, if the rewrite fails, noone notices it. But since the 2.2, rewrite
errors are now fatals by default. So when the bug is hit, a 500 error is
returned to the client. Without this fix, a workaround is to disable the strict
rewriting mode (see the "strict-mode" HTTP rule).

The following HTTP rule is a good way to reproduce the bug if a request with an
authority is received. In HTT2, it is pretty common.

    acl host_header_exists req.hdr(host) -m found
    http-request set-header host %[req.hdr(host)] if host_header_exists

This patch must be backported to 2.1 and everywhere the commit d7b7a1ce5 is
backported. It should fix the issue #494.
2020-02-18 11:19:57 +01:00
Christopher Faulet
9c44e4813c BUG/MINOR: filters: Count HTTP headers as filtered data but don't forward them
In flt_analyze_http_headers() HTTP analyzer, we must not forward systematically
the headers. We must only count them as filtered data (ie. increment the offset
of the right size). It is the http_payload callback responsibility to decide to
forward headers or not by forwarding at least 1 byte of payload. And there is
always at least 1 byte of payload to forward, the EOM block.

This patch depends on following commits:

 * MINOR: filters: Forward data only if the last filter forwards something
 * MINOR: http-htx: Add a function to retrieve the headers size of an HTX message

This patch must be backported with commits above as far as 1.9. In HAProxy 2.0
and 1.9, the patch must be adapted because of the legacy HTTP code.
2020-02-18 11:19:57 +01:00
Christopher Faulet
71179a3ea9 MINOR: filters: Forward data only if the last filter forwards something
In flt_tcp_payload() and flt_http_payload(), if the last filter does not
forwarding anything, nothing is forwarded, not even the already filtered
data. For now, this patch is useless because the last filter is always sync with
the stream's offset. But it will be mandatory for a bugfix.
2020-02-18 11:19:57 +01:00
Christopher Faulet
727a3f1ca3 MINOR: http-htx: Add a function to retrieve the headers size of an HTX message
http_get_hdrs_size() function may now be used to get the bytes held by headers
in an HTX message. It only works if the headers were not already
forwarded. Metadata are not counted here.
2020-02-18 11:19:57 +01:00
Jerome Magnin
07e1e3c93e MINOR: sample: regsub now supports backreferences
Now that the configuration parser is more flexible with samples,
converters and their arguments, we can leverage this to enable
support for backreferences in regsub.
2020-02-16 19:48:54 +01:00
Willy Tarreau
9af749b43e BUG/MINOR: arg: fix again incorrect argument length check
Recent commit 807aef8a14 ("BUG/MINOR: arg: report an error if an argument
is larger than bufsize") aimed at fixing the argument length check but
relied on the fact that the end of string was not reached, except that
it forgot to consider the delimiters (comma and parenthesis) which are
valid conditions to break out of the loop. This used to break simple
expressions like "hdr(xff,1)". Thanks to Jrme for reporting this.

No backport is needed.
2020-02-16 10:49:55 +01:00
Willy Tarreau
807aef8a14 BUG/MINOR: arg: report an error if an argument is larger than bufsize
Commit ef21facd99 ("MEDIUM: arg: make make_arg_list() support quotes
in arguments") removed the chunk_strncpy() to fill the trash buffer
as the input is being parsed, and accidently dropped the jump to the
error path in case the argument is too large, which is now fixed.

No backport is needed, this is for 2.2. This addresses issue #502.
2020-02-15 14:54:28 +01:00
Willy Tarreau
cd0d2ed6ee MEDIUM: log-format: make the LF parser aware of sample expressions' end
For a very long time it used to be impossible to pass a closing square
bracket as a valid character in argument to a sample fetch function or
to a converter because the LF parser used to stop on the first such
character found and to pass what was between the first '[' and the first
']' to sample_parse_expr().

This patch addresses this by passing the whole string to sample_parse_expr()
which is the only one authoritative to indicate the first character that
does not belong to the expression. The LF parser then verifies it matches
a ']' or fails. As a result it is finally possible to write rules such as
the following, which is totally valid an unambigous :

    http-request redirect location %[url,regsub([.:/?-],!,g)]
                                                |-----| | |
                                                  arg1  | `---> arg3
                                                        `-----> arg2
                                         |-----------------|
                                              converter
                                     |---------------------|
                                        sample expression
                                   |------------------------|
                                         log-format tag
2020-02-14 19:02:06 +01:00
Willy Tarreau
e3b57bf92f MINOR: sample: make sample_parse_expr() able to return an end pointer
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
2020-02-14 19:02:06 +01:00
Willy Tarreau
ef21facd99 MEDIUM: arg: make make_arg_list() support quotes in arguments
Now it becomes possible to reuse the quotes within arguments, allowing
the parser to distinguish a ',' or ')' that is part of the value from
one which delimits the argument. In addition, ',' and ')' may be escaped
using a backslash. However, it is also important to keep in mind that
just like in shell, quotes are first resolved by the word tokenizer, so
in order to pass quotes that are visible to the argument parser, a second
level is needed, either using backslash escaping, or by using an alternate
type.

For example, it's possible to write this to append a comma:

     http-request add-header paren-comma-paren "%[str('(--,--)')]"

or this:

     http-request add-header paren-comma-paren '%[str("(--,--)")]'

or this:

     http-request add-header paren-comma-paren %[str(\'(--,--)\')]

or this:

     http-request add-header paren-comma-paren %[str(\"(--,--)\")]

or this:

     http-request add-header paren-comma-paren %[str(\"(\"--\',\'--\")\")]

Note that due to the wide use of '\' in front of parenthesis in regex,
the backslash character will purposely *not* escape parenthesis, so that
'\)' placed in quotes is passed verbatim to a regex engine.
2020-02-14 19:02:06 +01:00
Willy Tarreau
338c670745 MEDIUM: arg: copy parsed arguments into the trash instead of allocating them
For each and every argument parsed by make_arg_list(), there was an
strndup() call, just so that we have a trailing zero for most functions,
and this temporary buffer is released afterwards except for strings where
it is kept.

Proceeding like this is not convenient because 1) it performs a huge
malloc/free dance, and 2) it forces to decide upfront where the argument
ends, which is what prevents commas and right parenthesis from being used.

This patch makes the function copy the temporary argument into the trash
instead, so that we avoid the malloc/free dance for most all non-string
args (e.g. integers, addresses, time, size etc), and that we can later
produce the contents on the fly while parsing the input. It adds a length
check to make sure that the argument is not longer than the buffer size,
which should obviously never be the case but who knows what people put in
their configuration.
2020-02-14 19:02:06 +01:00
Willy Tarreau
80b53ffb1c MEDIUM: arg: make make_arg_list() stop after its own arguments
The main problem we're having with argument parsing is that at the
moment the caller looks for the first character looking like an end
of arguments (')') and calls make_arg_list() on the sub-string inside
the parenthesis.

Let's first change the way it works so that make_arg_list() also
consumes the parenthesis and returns the pointer to the first char not
consumed. This will later permit to refine each argument parsing.

For now there is no functional change.
2020-02-14 19:02:06 +01:00
Willy Tarreau
ed2c662b01 MINOR: sample/acl: use is_idchar() to locate the fetch/conv name
Instead of scanning a string looking for an end of line, ')' or ',',
let's only accept characters which are actually valid identifier
characters. This will let the parser know that in %[src], only "src"
is the sample fetch name, not "src]". This was done both for samples
and ACLs since they are the same here.
2020-02-14 19:02:06 +01:00
Christopher Faulet
6c57f2da43 MINOR: mux-fcgi: Make the capture of the path-info optional in pathinfo regex
Now, only one capture is mandatory in the path-info regex, the one matching the
script-name. The path-info capture is optional. Of couse, it must be defined to
fill the PATH_INFO parameter. But it is not mandatory. This way, it is possible
to get the script-name part from the path, excluding the path-info.

This patch is small enough to be backported to 2.1.
2020-02-14 18:31:29 +01:00
Christopher Faulet
28cb36613b BUG/MINOR: mux-fcgi: Forbid special characters when matching PATH_INFO param
If a regex to match the PATH_INFO parameter is configured, it systematically
fails if a newline or a null character is present in the URL-decoded path. So,
from the moment there is at least a "%0a" or a "%00" in the request path, we
always fail to get the PATH_INFO parameter and all the decoded path is used for
the SCRIPT_NAME parameter.

It is probably not the expected behavior. Because, most of time, these
characters are not expected at all in a path, an error is now triggered when one
of these characters is found in the URL-decoded path before trying to execute
the path_info regex. However, this test is not performed if there is no regex
configured.

Note that in reality, the newline character is only a problem when HAProxy is
complied with pcre or pcre2 library and conversely, the null character is only a
problem for the libc's regex library. But both are always excluded to avoid any
inconsistency depending on compile options.

An alternative, not implemented yet, is to replace these characters by another
one. If someone complains about this behavior, it will be re-evaluated.

This patch must be backported to all versions supporting the FastCGI
applications, so to 2.1 for now.
2020-02-14 16:02:35 +01:00
Olivier Houchard
12ffab03b6 BUG/MEDIUM: muxes: Use the right argument when calling the destroy method.
When calling the mux "destroy" method, the argument should be the mux
context, not the connection. In a few instances in the mux code, the
connection was used (mainly when the session wouldn't handle the idle
connection, and the server pool was fool), and that could lead to random
segfaults.

This should be backported to 2.1, 2.0, and 1.9
2020-02-14 13:28:38 +01:00
William Dauchy
f7dcdc8a6f BUG/MINOR: namespace: avoid closing fd when socket failed in my_socketat
we cannot return right after socket opening as we need to move back to
the default namespace first

this should fix github issue #500

this might be backported to all version >= 1.6

Fixes: b3e54fe387 ("MAJOR: namespace: add Linux network namespace
support")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-02-14 04:23:08 +01:00
William Dauchy
97a7bdac3e BUG/MINOR: tcp: don't try to set defaultmss when value is negative
when `getsockopt` previously failed, we were trying to set defaultmss
with -2 value.

this is a followup of github issue #499

this should be backported to all versions >= v1.8

Fixes: 153659f1ae ("MINOR: tcp: When binding socket, attempt to
reuse one from the old proc.")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-02-12 16:01:50 +01:00
William Dauchy
c0e23aef05 BUG/MINOR: tcp: avoid closing fd when socket failed in tcp_bind_listener
we were trying to close file descriptor even when `socket` call was
failing.
this should fix github issue #499

this should be backported to all versions >= v1.8

Fixes: 153659f1ae ("MINOR: tcp: When binding socket, attempt to
reuse one from the old proc.")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-02-12 15:24:21 +01:00
Willy Tarreau
0948a781fc BUG/MINOR: listener: enforce all_threads_mask on bind_thread on init
When intializing a listener, let's make sure the bind_thread mask is
always limited to all_threads_mask when inserting the FD. This will
avoid seeing listening FDs with bits corresponding to threads that are
not active (e.g. when using "bind ... process 1/even"). The side effect
is very limited, all that was identified is that atomic operations are
used in fd_update_events() when not necessary. It's more a matter of
long-term correctness in practice.

This fix might be backported as far as 1.8 (then proto_sockpair must
be dropped).
2020-02-12 10:21:49 +01:00
Willy Tarreau
50b659476c BUG/MEDIUM: listener: only consider running threads when resuming listeners
In bug #495 we found that it is possible to resume a listener on an
inexistent thread. This happens when a bind's thread_mask contains bits
out of the active threads mask, such as when using "1/odd" or "1/even".
The thread_mask was used as-is to pick a thread number to re-enable the
listener, and given that the highest number is used, 1/odd or 1/even can
produce quite high thread numbers and crash the process by queuing some
entries into non-existent lists.

This bug is an incomplete fix of commit 413e926ba ("BUG/MAJOR: listener:
fix thread safety in resume_listener()") though it will only trigger if
some bind lines are explicitly bound to thread numbers higher than the
thread count. The fix must be backported to all branches having the fix
above (as far as 1.8, though the code is different there, see the commit
message in 1.8 for changes).

There are a few other places where bind_thread is used without
enforcing all_thread_mask, namely when doing fd_insert() while creating
listeners. It seems harmless but would probably deserve another fix.
2020-02-12 10:21:33 +01:00
Willy Tarreau
e35d1d4f42 BUILD: http_act: cast file sizes when reporting file size error
As seen in issue #496, st_size may be of varying types on different
systems. Let's simply cast it to long long and use long long for
all size outputs.
2020-02-11 10:58:56 +01:00
Willy Tarreau
157788c7b1 BUG/MINOR: connection: correctly retry I/O on signals
Issue #490 reports that there are a few bogus constructs of the famous
"do { if (cond) continue; } while (0)" in the connection code, that are
used to retry on I/O failures caused by receipt of a signal. Let's turn
them into the more correct "while (1) { if (cond) continue; break }"
instead. This may or may not be backported, it shouldn't have any
visible effect.
2020-02-11 10:26:39 +01:00
Willy Tarreau
327ea5aec8 BUG/MINOR: unix: better catch situations where the unix socket path length is close to the limit
We do have some checks for the UNIX socket path length to validate the
full pathname of a unix socket but the pathname extension is only taken
into account when using a bind_prefix. The second check only matches
against MAXPATHLEN. So this means that path names between 98 and 108
might successfully parse but fail to bind. Let's adjust the check in
the address parser and refine the error checking at the bind() step.

This addresses bug #493.
2020-02-11 06:49:42 +01:00
Willy Tarreau
508f989758 BUG/MAJOR: mux-h2: don't wake streams after connection was destroyed
In commit 477902b ("MEDIUM: connections: Get ride of the xprt_done
callback.") we added an inconditional call to h2_wake_some_streams()
in h2_wake(), though we must not do it if the connection is destroyed
or we end up with a use-after-free. In this case it's already done in
h2_process() before destroying the connection anyway.

Let's just add this test for now. A cleaner approach might consist in
doing it in the h2_process() function itself when a connection status
change is detected.

No backport is needed, this is purely 2.2.
2020-02-11 04:42:05 +01:00
Christopher Faulet
67307796e6 BUG/MEDIUM: tcp-rules: Fix track-sc* actions for L4/L5 TCP rules
A bug was introduced during TCP rules refactoring by the commit ac98d81f4
("MINOR: http-rule/tcp-rules: Make track-sc* custom actions"). There is no
stream when L4/L5 TCP rules are evaluated. For these rulesets, In track-sc*
actions, we must take care to rely on the session instead of the stream.

Because of this bug, any evaluation of L4/L5 TCP rules using a track-sc* action
leads to a crash of HAProxy.

No backport needed, except if the above commit is backported.
2020-02-10 10:09:58 +01:00
William Lallemand
696f317f13 BUG/MEDIUM: ssl/cli: 'commit ssl cert' wrong SSL_CTX init
The code which is supposed to apply the bind_conf configuration on the
SSL_CTX was not called correctly. Indeed it was called with the previous
SSL_CTX so the new ones were left with default settings. For example the
ciphers were not changed.

This patch fixes #429.

Must be backported in 2.1.
2020-02-07 20:55:35 +01:00
Christopher Faulet
817c4e39e5 BUG/MINOR: http-act: Fix bugs on error path during parsing of return actions
This patch fixes memory leaks and a null pointer dereference found by coverity
on the error path when an HTTP return action is parsed. See issue #491.

No need to backport this patch except the HTT return action is backported too.
2020-02-07 10:37:59 +01:00
Christopher Faulet
692a6c2e69 BUG/MINOR: http-act: Set stream error flag before returning an error
In action_http_set_status(), when a rewrite error occurred, the stream error
flag must be set before returning the error.

No need to backport this patch except if commit 333bf8c33 ("MINOR: http-rules:
Set SF_ERR_PRXCOND termination flag when a header rewrite fails") is
backported. This bug was reported in issue #491.
2020-02-07 10:37:53 +01:00
Tim Duesterhus
f1bc24cb27 BUG/MINOR: acl: Fix type of log message when an acl is named 'or'
The patch adding this check initially only issued a warning, instead of
being fatal. It was changed before committing. However when making this
change the type of the log message was not changed from `ha_warning` to
`ha-alert`. This patch makes this forgotten adjustment.

see 0cf811a5f9
No backport needed. The initial patch was backported as a warning, thus
the log message type is correct.
2020-02-06 22:16:07 +01:00
Tim Duesterhus
0cf811a5f9 MINOR: acl: Warn when an ACL is named 'or'
Consider a configuration like this:

> acl t always_true
> acl or always_false
>
> http-response set-header Foo Bar if t or t

The 'or' within the condition will be treated as a logical disjunction
and the header will be set, despite the ACL 'or' being falsy.

This patch makes it an error to declare such an ACL that will never
work. This patch may be backported to stable releases, turning the
error into a warning only (the code was written in a way to make this
trivial). It should not break anything and might improve the users'
lifes.
2020-02-06 16:08:36 +01:00
Willy Tarreau
9d6bb5a546 BUILD: lua: silence a warning on systems where longjmp is not marked as noreturn
If the longjmp() call is not flagged as "noreturn", for example, because the
operating system doesn't target a gcc-compatible compiler, we may get this
warning when building Lua :

  src/hlua.c: In function 'hlua_panic_ljmp':
  src/hlua.c:128:1: warning: no return statement in function returning non-void [-Wreturn-type]
   static int hlua_panic_ljmp(lua_State *L) { longjmp(safe_ljmp_env, 1); }
   ^~~~~~

The function's prototype cannot be changed because it must be compatible
with Lua's callbacks. Let's simply enclose the call inside WILL_LJMP()
which we created exactly to signal a call to longjmp(). It lets the compiler
know we won't get back into the function and that the return statement is
not needed.
2020-02-06 16:01:04 +01:00
Christopher Faulet
700d9e88ad MEDIUM: lua: Add ability for actions to intercept HTTP messages
It is now possible to intercept HTTP messages from a lua action and reply to
clients. To do so, a reply object must be provided to the function
txn:done(). It may contain a status code with a reason, a header list and a
body. By default, if an empty reply object is used, an empty 200 response is
returned. If no reply is passed when txn:done() is called, the previous
behaviour is respected, the transaction is terminated and nothing is returned to
the client. The same is done for TCP streams. When txn:done() is called, the
action is terminated with the code ACT_RET_DONE on success and ACT_RET_ERR on
error, interrupting the message analysis.

The reply object may be created for the lua, by hand. Or txn:reply() may be
called. If so, this object provides some methods to fill it:

  * Reply:set_status(<status> [  <reason>]) : Set the status and optionally the
   reason. If no reason is provided, the default one corresponding to the status
   code is used.

  * Reply:add_header(<name>, <value>) : Add a header. For a given name, the
    values are stored in an ordered list.

  * Reply:del_header(<name>) : Removes all occurrences of a header name.

  * Reply:set_body(<body>) : Set the reply body.

Here are some examples, all doing the same:

    -- ex. 1
    txn:done{
        status  = 400,
        reason  = "Bad request",
        headers = {
            ["content-type"]  = { "text/html" },
            ["cache-control"] = { "no-cache", "no-store" },
        },
        body = "<html><body><h1>invalid request<h1></body></html>"
    }

    -- ex. 2
    local reply = txn:reply{
        status  = 400,
        reason  = "Bad request",
        headers = {
            ["content-type"]  = { "text/html" },
            ["cache-control"] = { "no-cache", "no-store" }
        },
        body = "<html><body><h1>invalid request<h1></body></html>"
    }
    txn:done(reply)

    -- ex. 3
    local reply = txn:reply()
    reply:set_status(400, "Bad request")
    reply:add_header("content-length", "text/html")
    reply:add_header("cache-control", "no-cache")
    reply:add_header("cache-control", "no-store")
    reply:set_body("<html><body><h1>invalid request<h1></body></html>")
    txn:done(reply)
2020-02-06 15:13:04 +01:00
Christopher Faulet
2c2c2e381b MINOR: lua: Add act:wake_time() function to set a timeout when an action yields
This function may be used to defined a timeout when a lua action returns
act:YIELD. It is a way to force to reexecute the script after a short time
(defined in milliseconds).

Unlike core:sleep() or core:yield(), the script is fully reexecuted if it
returns act:YIELD. With core functions to yield, the script is interrupted and
restarts from the yield point. When a script returns act:YIELD, it is finished
but the message analysis is blocked on the action waiting its end.
2020-02-06 15:13:04 +01:00
Christopher Faulet
0f3c8907c3 MINOR: lua: Create the global 'act' object to register all action return codes
ACT_RET_* code are now available from lua scripts. The gloabl object "act" is
used to register these codes as constant. Now, lua actions can return any of
following codes :

  * act.CONTINUE for ACT_RET_CONT
  * act.STOP for ACT_RET_STOP
  * act.YIELD for ACT_RET_YIELD
  * act.ERROR for ACT_RET_ERR
  * act.DONE for ACT_RET_DONE
  * act.DENY for ACT_RET_DENY
  * act.ABORT for ACT_RET_ABRT
  * act.INVALID for ACT_RET_INV

For instance, following script denied all requests :

  core.register_action("deny", { "http-req" }, function (txn)
      return act.DENY
  end)

Thus "http-request lua.deny" do exactly the same than "http-request deny".
2020-02-06 15:13:03 +01:00
Christopher Faulet
7716cdf450 MINOR: lua: Get the action return code on the stack when an action finishes
When an action successfully finishes, the action return code (ACT_RET_*) is now
retrieve on the stack, ff the first element is an integer. In addition, in
hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before
exiting. Thus, when a script uses this function, the corresponding action still
finishes with the good code. Thanks to this change, the flag HLUA_STOP is now
useless. So it has been removed.

It is a mandatory step to allow a lua action to return any action return code.
2020-02-06 15:13:03 +01:00
Christopher Faulet
a20a653e07 BUG/MINOR: http-ana: Increment failed_resp counters on invalid response
In http_process_res_common() analyzer, when a invalid response is reported, the
failed_resp counters must be incremented.

No need to backport this patch, except if the commit b8a5371a ("MEDIUM:
http-ana: Properly handle internal processing errors") is backported too.
2020-02-06 15:13:03 +01:00
Christopher Faulet
07a718e712 CLEANUP: lua: Remove consistency check for sample fetches and actions
It is not possible anymore to alter the HTTP parser state from lua sample
fetches or lua actions. So there is no reason to still check for the parser
state consistency.
2020-02-06 15:13:03 +01:00
Christopher Faulet
4a2c142779 MEDIUM: http-rules: Support extra headers for HTTP return actions
It is now possible to append extra headers to the generated responses by HTTP
return actions, while it is not based on an errorfile. For return actions based
on errorfiles, these extra headers are ignored. To define an extra header, a
"hdr" argument must be used with a name and a value. The value is a log-format
string. For instance:

  http-request status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"
2020-02-06 15:13:03 +01:00
Christopher Faulet
24231ab61f MEDIUM: http-rules: Add the return action to HTTP rules
Thanks to this new action, it is now possible to return any responses from
HAProxy, with any status code, based on an errorfile, a file or a string. Unlike
the other internal messages generated by HAProxy, these ones are not interpreted
as errors. And it is not necessary to use a file containing a full HTTP
response, although it is still possible. In addition, using a log-format string
or a log-format file, it is possible to have responses with a dynamic
content. This action can be used on the request path or the response path. The
only constraint is to have a responses smaller than a buffer. And to avoid any
warning the buffer space reserved to the headers rewritting should also be free.

When a response is returned with a file or a string as payload, it only contains
the content-length header and the content-type header, if applicable. Here are
examples:

  http-request return content-type image/x-icon file /var/www/favicon.ico  \
      if { path /favicon.ico }

  http-request return status 403 content-type text/plain    \
      lf-string "Access denied. IP %[src] is blacklisted."  \
      if { src -f /etc/haproxy/blacklist.lst }
2020-02-06 15:12:54 +01:00
Christopher Faulet
6d0c3dfac6 MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding
This patch introduces the 'http-after-response' rules. These rules are evaluated
at the end of the response analysis, just before the data forwarding, on ALL
HTTP responses, the server ones but also all responses generated by
HAProxy. Thanks to this ruleset, it is now possible for instance to add some
headers to the responses generated by the stats applet. Following actions are
supported :

   * allow
   * add-header
   * del-header
   * replace-header
   * replace-value
   * set-header
   * set-status
   * set-var
   * strict-mode
   * unset-var
2020-02-06 14:55:34 +01:00
Christopher Faulet
a72a7e49e8 MINOR: http-ana/http-rules: Use dedicated function to forward internal responses
Call http_forward_proxy_resp() function when an internal response is
returned. It concerns redirect, auth and error reponses. But also 100-Continue
and 103-Early-Hints responses. For errors, there is a subtlety. if the forward
fails, an HTTP 500 error is generated if it is not already an internal
error. For now http_forward_proxy_resp() cannot fail. But it will be possible
when the new ruleset applied on all responses will be added.
2020-02-06 14:55:34 +01:00
Christopher Faulet
ef70e25035 MINOR: http-ana: Add a function for forward internal responses
Operations performed when internal responses (redirect/deny/auth/errors) are
returned are always the same. The http_forward_proxy_resp() function is added to
group all of them under a unique function.
2020-02-06 14:55:34 +01:00
Christopher Faulet
72c7d8d040 MINOR: http-ana: Rely on http_reply_and_close() to handle server error
The http_server_error() function now relies on http_reply_and_close(). Both do
almost the same actions. In addtion, http_server_error() sets the error flag and
the final state flag on the stream.
2020-02-06 14:55:34 +01:00
Christopher Faulet
60b33a5a62 MINOR: http-rules: Handle the rule direction when a redirect is evaluated
The rule direction must be tested to do specific processing on the request
path. intercepted_req counter shoud be updated if the rule is evaluated on the
frontend and remaining request's analyzers must be removed. But only on the
request path. The rule direction must also be tested to set the right final
stream state flag.

This patch depends on the commit "MINOR: http-rules: Add a flag on redirect
rules to know the rule direction". Both must be backported to all stable
versions.
2020-02-06 14:55:34 +01:00
Christopher Faulet
c87e468816 MINOR: http-rules: Add a flag on redirect rules to know the rule direction
HTTP redirect rules can be evaluated on the request or the response path. So
when a redirect rule is evaluated, it is important to have this information
because some specific processing may be performed depending on the direction. So
the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the
redirect rule during the parsing.

This patch is mandatory to fix a bug on redirect rule. It must be backported to
all stable versions.
2020-02-06 14:55:34 +01:00
Christopher Faulet
c20afb810f BUG/MINOR: http-ana: Set HTX_FL_PROXY_RESP flag if a server perform a redirect
It is important to not forget to specify the HTX resposne was internally
generated when a server perform a redirect. This information is used by the H1
multiplexer to choose the right connexion mode when the response is sent to the
client.

This patch must be backported to 2.1.
2020-02-06 14:55:34 +01:00
Christopher Faulet
7a138dc908 BUG/MINOR: http-ana: Reset HTX first index when HAPRoxy sends a response
The first index in an HTX message is the HTX block index from which the HTTP
analysis must be performed. When HAProxy sends an HTTP response, on error or
redirect, this index must be reset because all pending incoming data are
considered as forwarded. For now, it is only a bug for 103-Early-Hints
response. For other responses, it is not a problem. But it will be when the new
ruleset applied on all responses will be added. For 103 responses, if the first
index is not reset, if there are rewritting rules on server responses, the
generated 103 responses, if any, are evaluated too.

This patch must be backported and probably adapted, at least for 103 responses,
as far as 1.9.
2020-02-06 14:55:34 +01:00
Christopher Faulet
3b2bb63ded MINOR: dns: Add function to release memory allocated for a do-resolve rule
Memory allocated when a do-resolve rule is parsed is now released when HAProxy
exits.
2020-02-06 14:55:34 +01:00
Christopher Faulet
a4168434a7 MINOR: dns: Dynamically allocate dns options to reduce the act_rule size
<.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated
when a do-resolve rule is parsed. This drastically reduces the structure size.
2020-02-06 14:55:34 +01:00
Christopher Faulet
637259e044 BUG/MINOR: http-ana: Don't overwrite outgoing data when an error is reported
When an error is returned to a client, the right message is injected into the
response buffer. It is performed by http_server_error() or
http_replay_and_close(). Both ignore any data already present into the channel's
buffer. While it is legitimate to remove all input data, it is important to not
remove any outgoing data.

So now, we try to append the error message to the response buffer, only removing
input data. We rely on the channel_htx_copy_msg() function to do so. So this
patch depends on the following two commits:

  * MINOR: htx: Add a function to append an HTX message to another one
  * MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer

This patch must be backported as far as 1.9. However, above patches must be
backported first.
2020-02-06 14:55:34 +01:00
Christopher Faulet
0ea0c86753 MINOR: htx: Add a function to append an HTX message to another one
the htx_append_msg() function can now be used to append an HTX message to
another one. All the message is copied or nothing. If an error occurs during the
copy, all changes are rolled back.

This patch is mandatory to fix a bug in http_reply_and_close() function. Be
careful to backport it first.
2020-02-06 14:54:47 +01:00
Christopher Faulet
0a589fde7c MINOR: http-htx: Emit a warning if an error file runs over the buffer's reserve
If an error file is too big and, once converted in HTX, runs over the buffer
space reserved to headers rewritting, a warning is emitted. Because a new set of
rules will be added to allow headers rewritting on all responses, including
HAProxy ones, it is important to always keep this space free for error files.
2020-02-06 09:36:36 +01:00
Christopher Faulet
333bf8c33f MINOR: http-rules: Set SF_ERR_PRXCOND termination flag when a header rewrite fails
When a header rewrite fails, an internal errors is triggered. But
SF_ERR_INTERNAL is documented to be the concequence of a bug and must be
reported to the dev teamm. So, when this happens, the SF_ERR_PRXCOND termination
flag is set now.
2020-02-06 09:36:36 +01:00
Christopher Faulet
546c4696bb MINOR: global: Set default tune.maxrewrite value during global structure init
When the global structure is initialized, instead of setting tune.maxrewrite to
-1, its default value can be immediately set. This way, it is always defined
during the configuration validity check. Otherwise, the only way to have it at
this stage, it is to explicity set it in the global section.
2020-02-06 09:36:36 +01:00
Christopher Faulet
91e31d83c9 BUG/MINOR: http-act: Use the good message to test strict rewritting mode
Since the strict rewritting mode was introduced, actions manipulating headers
(set/add/replace) always rely on the request message to test if the
HTTP_MSGF_SOFT_RW flag is set or not. But, of course, we must only rely on the
request for http-request rules. For http-response rules, we must use the
response message.

This patch must be backported if the strict rewritting is backported too.
2020-02-06 09:36:36 +01:00
Tim Duesterhus
d02ffe9b6d CLEANUP: peers: Remove unused static function free_dcache_tx
The function was added in commit 6c39198b57,
but was also used within a single function `free_dcache` which was unused
itself.

see issue #301
see commit 10ce0c2f31 which removed
`free_dcache`
2020-02-05 23:40:17 +01:00
Tim Duesterhus
10ce0c2f31 CLEANUP: peers: Remove unused static function free_dcache
The function was changed to be static in commit
6c39198b57, but even that commit
no longer uses it. The purpose of the change vs. outright removal
is unclear.

see issue #301
2020-02-05 18:49:29 +01:00
Willy Tarreau
077d366ef7 CLEANUP: hpack: remove a redundant test in the decoder
As reported in issue #485 the test for !len at the end of the
loop in get_var_int() is useless since it was already done inside
the loop. Actually the code is more readable if we remove the first
one so let's do this instead. The resulting code is exactly the same
since the compiler already optimized the test away.
2020-02-05 15:39:08 +01:00
William Lallemand
4dd145a888 BUG/MINOR: ssl: clear the SSL errors on DH loading failure
In ssl_sock_load_dh_params(), if haproxy failed to apply the dhparam
with SSL_CTX_set_tmp_dh(), it will apply the DH with
SSL_CTX_set_dh_auto().

The problem is that we don't clean the OpenSSL errors when leaving this
function so it could fail to load the certificate, even if it's only a
warning.

Fixes bug #483.

Must be backported in 2.1.
2020-02-05 15:32:24 +01:00
Willy Tarreau
731248f0db BUG/MINOR: ssl: we may only ignore the first 64 errors
We have the ability per bind option to ignore certain errors (CA, crt, ...),
and for this we use a 64-bit field. In issue #479 coverity reports a risk of
too large a left shift. For now as of OpenSSL 1.1.1 the highest error value
that may be reported by X509_STORE_CTX_get_error() seems to be around 50 so
there should be no risk yet, but it's enough of a warning to add a check so
that we don't accidently hide random errors in the future.

This may be backported to relevant stable branches.
2020-02-04 14:04:36 +01:00
William Lallemand
3af48e706c MINOR: ssl: ssl-load-extra-files configure loading of files
This new setting in the global section alters the way HAProxy will look
for unspecified files (.ocsp, .sctl, .issuer, bundles) during the
loading of the SSL certificates.

By default, HAProxy discovers automatically a lot of files not specified
in the configuration, and you may want to disable this behavior if you
want to optimize the startup time.

This patch sets flags in global_ssl.extra_files and then check them
before trying to load an extra file.
2020-02-03 17:50:26 +01:00
Olivier Houchard
04f5fe87d3 BUG/MEDIUM: memory: Add a rwlock before freeing memory.
When using lockless pools, add a new rwlock, flush_pool. read-lock it when
getting memory from the pool, so that concurrenct access are still
authorized, but write-lock it when we're about to free memory, in
pool_flush() and pool_gc().
The problem is, when removing an item from the pool, we unreference it
to get the next one, however, that pointer may have been free'd in the
meanwhile, and that could provoke a crash if the pointer has been unmapped.
It should be OK to use a rwlock, as normal operations will still be able
to access the pool concurrently, and calls to pool_flush() and pool_gc()
should be pretty rare.

This should be backported to 2.1, 2.0 and 1.9.
2020-02-01 18:08:34 +01:00
Olivier Houchard
8af97eb4a1 MINOR: memory: Only init the pool spinlock once.
In pool_create(), only initialize the pool spinlock if we just created the
pool, in the event we're reusing it, there's no need to initialize it again.
2020-02-01 18:08:34 +01:00
Olivier Houchard
b6fa08bc7b BUG/MEDIUM: memory_pool: Update the seq number in pool_flush().
In pool_flush(), we can't just set the free_list to NULL, or we may suffer
the ABA problem. Instead, use a double-width CAS and update the sequence
number.

This should be backported to 2.1, 2.0 and 1.9.
This may, or may not, be related to github issue #476.
2020-02-01 18:08:34 +01:00
Willy Tarreau
952c2640b0 MINOR: task: don't set TASK_RUNNING on tasklets
We can't clear flags on tasklets because we don't know if they're still
present upon return (they all return NULL, maybe that could change in
the future). As a side effect, once TASK_RUNNING is set, it's never
cleared anymore, which is misleading and resulted in some incorrect
flagging of bulk tasks in the recent scheduler changes. And the only
reason for setting TASK_RUNNING on tasklets was to detect self-wakers,
which is not done using a dedicated flag. So instead of setting this
flags for no opportunity to clear it, let's simply not set it.
2020-01-31 18:37:03 +01:00
Willy Tarreau
1dfc9bbdc6 OPTIM: task: readjust CPU bandwidth distribution since last update
Now that we can more accurately watch which connection is really
being woken up from itself, it was desirable to re-adjust the CPU BW
thresholds based on measurements. New tests with 60000 concurrent
connections were run at 100 Gbps with unbounded queues and showed
the following distribution:

     scenario           TC0 TC1 TC2   observation
    -------------------+---+---+----+---------------------------
     TCP conn rate     : 32, 51, 17
     HTTP conn rate    : 34, 41, 25
     TCP byte rate     :  2,  3, 95   (2 MB objets)
     splicing byte rate: 11,  6, 83   (2 MB objets)
     H2 10k object     : 44, 23, 33   client-limited
     mixed traffic     : 18, 10, 72   2*1m+1*0: 11kcps, 36 Gbps

The H2 experienced a huge change since it uses a persistent connection
that was accidently flagged in the previous test. The splicing test
exhibits a higher need for short tasklets, so does the mixed traffic
test. Given that latency mainly matters for conn rate and H2 here,
the ratios were readjusted as 33% for TC0, 50% for TC1 and 17% for
TC2, keeping in mind that whatever is not consumed by one class is
automatically shared in equal propertions by the next one(s). This
setting immediately provided a nice improvement as with the default
settings (maxpollevents=200, runqueue-depth=200), the same ratios as
above are still reported, while the time to request "show activity"
on the CLI dropped to 30-50ms. The average loop time is around 5.7ms
on the mixed traffic.

In addition, one extra stress test at 90.5 Gbps with 5100 conn/s shows
70-100ms CLI request time, with an average loop time of 17 ms.
2020-01-31 18:37:01 +01:00
Willy Tarreau
d23d413e38 MINOR: task: make sched->current also reflect tasklets
sched->current is used to know the current task/tasklet, and is currently
only used by the panic dump code. However it turns out it was not set for
tasklets, which prevents us from using it for more usages, despite the
panic handling code already handling this case very well. Let's make sure
it's now set.
2020-01-31 17:45:10 +01:00
Willy Tarreau
bb238834da MINOR: task: permanently flag tasklets waking themselves up
Commit a17664d829 ("MEDIUM: tasks: automatically requeue into the bulk
queue an already running tasklet") tried to inflict a penalty to
self-requeuing tasks/tasklets which correspond to those involved in
large, high-latency data transfers, for the benefit of all other
processing which requires a low latency. However, it turns out that
while it ought to do this on a case-by-case basis, basing itself on
the RUNNING flag isn't accurate because this flag doesn't leave for
tasklets, so we'd rather need a distinct flag to tag such tasklets.

This commit introduces TASK_SELF_WAKING to mark tasklets acting like
this. For now it's still set when TASK_RUNNING is present but this
will have to change. The flag is kept across wakeups.
2020-01-31 17:45:10 +01:00
Olivier Houchard
849d4f047f BUG/MEDIUM: connections: Don't forget to unlock when killing a connection.
Commit 140237471e made sure we hold the
toremove_lock for the corresponding thread before removing a connection
from its idle_orphan_conns list, however it failed to unlock it if we
found a connection, leading to a deadlock, so add the missing deadlock.

This should be backported to 2.1 and 2.0.
2020-01-31 17:25:37 +01:00
Willy Tarreau
c633607c06 OPTIM: task: refine task classes default CPU bandwidth ratios
Measures with unbounded execution ratios under 40000 concurrent
connections at 100 Gbps showed the following CPU bandwidth
distribution between task classes depending on traffic scenarios:

    scenario           TC0 TC1 TC2   observation
   -------------------+---+---+----+---------------------------
    TCP conn rate     : 29, 48, 23   221 kcps
    HTTP conn rate    : 29, 47, 24   200 kcps
    TCP byte rate     :  3,  5, 92   53 Gbps
    splicing byte rate:  5, 10, 85   70 Gbps
    H2 10k object     : 10, 21, 74   client-limited
    mixed traffic     :  4,  7, 89   2*1m+1*0: 11kcps, 36 Gbps

Thus it seems that we always need a bit of bulk tasks even for short
connections, which seems to imply a suboptimal processing somewhere,
and that there are roughly twice as many tasks (TC1=normal) as regular
tasklets (TC0=urgent). This ratio stands even when data forwarding
increases. So at first glance it looks reasonable to enforce the
following ratio by default:

  - 16% for TL_URGENT
  - 33% for TL_NORMAL
  - 50% for TL_BULK

With this, the TCP conn rate climbs to ~225 kcps, and the mixed traffic
pattern shows a more balanced 17kcps + 35 Gbps with 35ms CLI request
time time instead of 11kcps + 36 Gbps and 400 ms response time. The
byte rate tests (1M objects) are not affected at all. This setting
looks "good enough" to allow immediate merging, and could be refined
later.

It's worth noting that it resists very well to massive increase of
run queue depth and maxpollevents: with the run queue depth changed
from 200 to 10000 and maxpollevents to 10000 as well, the CLI's
request time is back to the previous ~400ms, but the mixed traffic
test reaches 52 Gbps + 7500 CPS, which was never met with the previous
scheduling model, while the CLI used to show ~1 minute response time.
The reason is that in the bulk class it becomes possible to perform
multiple rounds of recv+send and eliminate objects at once, increasing
the L3 cache hit ratio, and keeping the connection count low, without
degrading too much the latency.

Another test with mixed traffic involving 2/3 splicing on huge objects
and 1/3 on empty objects without touching any setting reports 51 Gbps +
5300 cps and 35ms CLI request time.
2020-01-31 07:09:10 +01:00
Willy Tarreau
a62917b890 MEDIUM: tasks: implement 3 different tasklet classes with their own queues
We used to mix high latency tasks and low latency tasklets in the same
list, and to even refill bulk tasklets there, causing some unfairness
in certain situations (e.g. poll-less transfers between many connections
saturating the machine with similarly-sized in and out network interfaces).

This patch changes the mechanism to split the load into 3 lists depending
on the task/tasklet's desired classes :
  - URGENT: this is mainly for tasklets used as deferred callbacks
  - NORMAL: this is for regular tasks
  - BULK: this is for bulk tasks/tasklets

Arbitrary ratios of max_processed are picked from each of these lists in
turn, with the ability to complete in one list from what was not picked
in the previous one. After some quick tests, the following setup gave
apparently good results both for raw TCP with splicing and for H2-to-H1
request rate:

  - 0 to 75% for urgent
  - 12 to 50% for normal
  - 12 to what remains for bulk

Bulk is not used yet.
2020-01-30 18:59:33 +01:00
Willy Tarreau
4ffa0b526a MINOR: tasks: move the list walking code to its own function
New function run_tasks_from_list() will run over a tasklet list and will
run all the tasks and tasklets it finds there within a limit of <max>
that is passed in arggument. This is a preliminary work for scheduler QoS
improvements.
2020-01-30 18:13:13 +01:00
Willy Tarreau
876b411f2b BUG/MEDIUM: pipe/thread: fix atomicity of pipe counters
Previous patch 160287b676 ("MEDIUM: pipe/thread: maintain a per-thread
local cache of recently used pipes") didn't replace all pipe counter
updates with atomic ops since some were already under a lock, which is
obviously not a valid reason since these ones can be updated in parallel
to other atomic ops. The result was that the pipes_used could seldom be
seen as negative in the stats (harmless) but also this could result in
slightly more pipes being allocated than permitted, thus stealing a few
file descriptors that were not usable for connections anymore. Let's use
pure atomic ops everywhere these counters are updated.

No backport is needed.
2020-01-30 09:15:37 +01:00
Willy Tarreau
160287b676 MEDIUM: pipe/thread: maintain a per-thread local cache of recently used pipes
In order to completely remove the pipe locking cost and try to reuse
hot pipes, each thread now maintains a local cache of recently used pipes
that is no larger than its share (maxpipes/nbthreads). All extra pipes
are instead refilled into the global pool. Allocations are made from the
local pool first, and fall back to the global one before allocating one.
This completely removes the observed pipe locking cost at high bit rates,
which was still around 5-6%.
2020-01-29 11:12:07 +01:00
Willy Tarreau
a945cfdfe0 MEDIUM: pipe/thread: reduce the locking overhead
In a quick test involving splicing, we can see that get_pipe() and
put_pipe() together consume up to 12% of the CPU. That's not surprizing
considering how much work is performed under the lock, including the
pipe struct allocation, the pipe creation and its initialization. Same
for releasing, we don't need a lock there to call close() nor to free
to the pool.

Changing this alone was enough to cut the overhead in half. A better
approach should consist in having a per-thread pipe cache, which will
also help keep pages hot in the CPU caches.
2020-01-29 10:44:00 +01:00
William Lallemand
a25a19fdee BUG/MINOR: ssl/cli: fix unused variable with openssl < 1.0.2
src/ssl_sock.c: In function ‘cli_io_handler_show_cert’:
src/ssl_sock.c:10214:6: warning: unused variable ‘n’ [-Wunused-variable]
  int n;
      ^
Fix this problem in the io handler of the "show ssl cert" function.
2020-01-29 00:08:10 +01:00
Willy Tarreau
1113116b4a MEDIUM: raw-sock: remove obsolete calls to fd_{cant,cond,done}_{send,recv}
Given that raw_sock's functions solely act on connections and that all its
callers properly use subscribe() when they want to receive/send more, there
is no more reason for calling fd_{cant,cond,done}_{send,recv} anymore as
this call is immediately overridden by the subscribe call. It's also worth
noting that the purpose of fd_cond_recv() whose purpose was to speculatively
enable reading in the FD cache if the FD was active but not yet polled was
made to save on expensive epoll_ctl() calls and was implicitly covered more
cleanly by recent commit 5d7dcc2a8e ("OPTIM: epoll: always poll for recv if
neither active nor ready").

No change on the number of calls to epoll_ctl() was noticed consecutive to
this change.
2020-01-28 19:06:41 +01:00
William Dauchy
1e2256d4d3 MINOR: proxy: clarify number of connections log when stopping
this log could be sometimes a bit confusing (depending on the number in
fact) when you read it (e.g is it the number of active connection?) -
only trained eyes knows haproxy output a different log when closing
active connections while stopping.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-28 13:10:03 +01:00
William Dauchy
aecd5dcac2 BUG/MINOR: dns: allow 63 char in hostname
hostname were limited to 62 char, which is not RFC1035 compliant;
- the parsing loop should stop when above max label char
- fix len label test where d[i] was wrongly used
- simplify the whole function to avoid using two extra char* variable

this should fix github issue #387

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
Acked-by: Baptiste <bedis9@gmail.com>
2020-01-28 13:08:08 +01:00
William Dauchy
bd8bf67102 BUG/MINOR: connection: fix ip6 dst_port copy in make_proxy_line_v2
triggered by coverity; src_port is set earlier.

this should fix github issue #467

Fixes: 7fec021537 ("MEDIUM: proxy_protocol: Convert IPs to v6 when
protocols are mixed")
This should be backported to 1.8.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
2020-01-28 13:02:58 +01:00
Christopher Faulet
c20b37112b BUG/MINOR: http-rules: Always init log-format expr for common HTTP actions
Many HTTP actions rely on <.arg.http> in the act_rule structure. Not all actions
use the log-format expression, but it must be initialized anyway. Otherwise,
HAProxy may crash during the deinit when the release function is called.

No backport needed. This patch should fix issue #468.
2020-01-27 15:51:57 +01:00
Willy Tarreau
74ab7d2b80 BUG/MINOR: tcpchecks: fix the connect() flags regarding delayed ack
In issue #465, we see that Coverity detected dead code in checks.c
which is in fact a missing parenthesis to build the connect() flags
consecutive to the API change in commit fdcb007ad8 ("MEDIUM: proto:
Change the prototype of the connect() method.").

The impact should be imperceptible as in the best case it may have
resulted in a missed optimization trying to save a syscall or to merge
outgoing packets.

It may be backported as far as 2.0 though it's not critical.
2020-01-24 17:52:37 +01:00
Olivier Houchard
1fc5a648bf MEDIUM: streams: Don't close the connection in back_handle_st_rdy().
In back_handle_st_rdy(), don't bother trying to close the connection, it
should be taken care of somewhere else.
2020-01-24 15:40:34 +01:00
Olivier Houchard
7c30642ede MEDIUM: streams: Don't close the connection in back_handle_st_con().
In back_handle_st_con(), don't bother trying to close the connection, it
should be taken care of elsewhere.
2020-01-24 15:40:34 +01:00
Olivier Houchard
b43589cac5 BUG/MEDIUM: stream: Don't install the mux in back_handle_st_con().
In back_handle_st_con(), don't bother setting up the mux, it is now done by
conn_fd_handler().
2020-01-24 15:40:34 +01:00
Olivier Houchard
efe5e8e998 BUG/MEDIUM: ssl: Don't forget to free ctx->ssl on failure.
In ssl_sock_init(), if we fail to allocate the BIO, don't forget to free
the SSL *, or we'd end up with a memory leak.

This should be backported to 2.1 and 2.0.
2020-01-24 15:17:38 +01:00
Olivier Houchard
6d53cd6978 MINOR: ssl: Remove dead code.
Now that we don't call the handshake function directly, but merely wake
the tasklet, we can no longer have CO_FL_ERR, so don't bother checking it.
2020-01-24 15:13:57 +01:00
Frédéric Lécaille
3139c1b198 BUG/MINOR: ssl: Possible memleak when allowing the 0RTT data buffer.
​
As the server early data buffer is allocated in the middle of the loop
used to allocate the SSL session without being freed before retrying,
this leads to a memory leak.
​
To fix this we move the section of code responsible of this early data buffer
alloction after the one reponsible of allocating the SSL session.
​
Must be backported to 2.1 and 2.0.
2020-01-24 15:12:21 +01:00
Olivier Houchard
ecffb7d841 BUG/MEDIUM: streams: Move the conn_stream allocation outside #IF USE_OPENSSL.
When commit 477902bd2e made the conn_stream
allocation unconditional, it unfortunately moved the code doing the allocation
inside #if USE_OPENSSL, which means anybody compiling haproxy without
openssl wouldn't allocate any conn_stream, and would get a segfault later.
Fix that by moving the code that does the allocation outside #if USE_OPENSSL.
2020-01-24 14:14:35 +01:00
Christopher Faulet
99ac8a1aa4 BUG/MINOR: stream: Be sure to have a listener to increment its counters
In process_stream(), when a client or a server abort is handled, the
corresponding listener's counter is incremented. But, we must be sure to have a
listener attached to the session. This bug was introduced by the commit
cff0f739e5.

Thanks to Fred to reporting me the bug.

No need to backport this patch, except if commit cff0f739e5 is backported.
2020-01-24 11:55:17 +01:00
Christopher Faulet
be20cf36af BUG/MINOR: http-ana: Increment the backend counters on the backend
A stupid cut-paste bug was introduced in the commit cff0f739e5. Backend
counters must of course be incremented on the stream's backend. Not the
frontend.

No need to backport this patch, except if commit cff0f739e5 is backported.
2020-01-24 11:55:17 +01:00
Willy Tarreau
645c588e71 BUILD: cfgparse: silence a bogus gcc warning on 32-bit machines
A first patch was made during 2.0-dev to silence a bogus warning emitted
by gcc : dd1c8f1f72 ("MINOR: cfgparse: Add a cast to make gcc happier."),
but it happens it was not sufficient as the warning re-appeared on 32-bit
machines under gcc-8 and gcc-9 :

  src/cfgparse.c: In function 'check_config_validity':
  src/cfgparse.c:3642:33: warning: argument 1 range [2147483648, 4294967295] exceeds maximum object size 2147483647 [-Walloc-size-larger-than=]
       newsrv->idle_orphan_conns = calloc((unsigned int)global.nbthread, sizeof(*newsrv->idle_orphan_conns));
                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This warning doesn't trigger in other locations, and it immediately
vanishes if the previous or subsequent loops do not depend on
global.nbthread anymore, or if the field ordering of the struct server
changes! As discussed in the thread at:

   https://www.mail-archive.com/haproxy@formilux.org/msg36107.html

playing with -Walloc-size-larger-than has no effect. And a minimal
reproducer could be isolated, indicating it's pointless to circle around
this one. Let's just cast nbthread to ushort so that gcc cannot make
this wrong detection. It's unlikely we'll use more than 65535 threads in
the near future anyway.

This may be backported to older releases if they are also affected, at
least to ease the job of distro maintainers.

Thanks to Ilya for testing.
2020-01-24 11:30:06 +01:00
Tim Duesterhus
541fe1ec52 MINOR: lua: Add HLUA_PREPEND_C?PATH build option
This complements the lua-prepend-path configuration option to allow
distro maintainers to add a default path for HAProxy specific Lua
libraries.
2020-01-24 09:22:03 +01:00
Tim Duesterhus
dd74b5f237 MINOR: lua: Add lua-prepend-path configuration option
lua-prepend-path allows the administrator to specify a custom Lua library
path to load custom Lua modules that are useful within the context of HAProxy
without polluting the global Lua library folder.
2020-01-24 09:22:03 +01:00
Tim Duesterhus
c9fc9f2836 MINOR: lua: Add hlua_prepend_path function
This function is added in preparation for following patches.
2020-01-24 09:21:35 +01:00
Willy Tarreau
bb2c4ae065 BUG/MEDIUM: mux-h2: make sure we don't emit TE headers with anything but "trailers"
While the H2 parser properly checks for the absence of anything but
"trailers" in the TE header field, we forget to check this when sending
the request to an H2 server. The problem is that an H2->H2 conversion
may keep "gzip" and fail on the next stage.

This patch makes sure that we only send "TE: trailers" if the TE header
contains the "trailers" token, otherwise it's dropped.

This fixes issue #464 and should be backported till 1.9.
2020-01-24 09:07:53 +01:00
Willy Tarreau
508d232a06 BUG/MINOR: stktable: report the current proxy name in error messages
Since commit 1b8e68e89a ("MEDIUM: stick-table: Stop handling stick-tables
as proxies."), a rule referencing the current proxy with no table leads
to the following error :

  [ALERT] 023/071924 (16479) : Proxy 'px': unable to find stick-table '(null)'.
  [ALERT] 023/071914 (16479) : Fatal errors found in configuration.

for a config like this one:

  backend px
        stick on src

This patch fixes it and should be backported as far as 2.0.
2020-01-24 07:19:34 +01:00
Willy Tarreau
f22758d12a MINOR: connection: remove some unneeded checks for CO_FL_SOCK_WR_SH
A few places in health checks and stream-int on the send path were still
checking for this flag. Now we do not and instead we rely on snd_buf()
to report the error if any.

It's worth noting that all 3 real muxes still use CO_FL_SOCK_WR_SH and
CO_FL_ERROR interchangeably at various places to decide to abort and/or
free their data. This should be clarified and fixed so that only
CO_FL_ERROR is used, and this will render the error paths simpler and
more accurate.
2020-01-23 19:01:37 +01:00
Willy Tarreau
a8c7e8e3a8 MINOR: raw-sock: always check for CO_FL_SOCK_WR_SH before sending
The test was added before splice() and send() to make sure we never
accidently send after a shutdown, because upper layers do not all
check and it's not their job to do it. In such a case we also set
errno to EPIPE so that the error can be accurately reported, e.g.,
in health checks.
2020-01-23 19:01:37 +01:00
Willy Tarreau
49139cb914 MINOR: connection: don't check for CO_FL_SOCK_WR_SH too early in handshakes
Just like with CO_FL_SOCK_RD_SH, we don't need to check for this flag too
early because conn_sock_send() already does it. No error was lost so it
was harmless, it was only useless code.
2020-01-23 19:01:37 +01:00
Willy Tarreau
d838fb840c MINOR: connection: do not check for CO_FL_SOCK_RD_SH too early
The handshake functions dedicated to proxy proto, netscaler and
socks4 all check for this flag before proceeding. This is wrong,
they must not do and instead perform the call to recv() then
report the close. The reason for this is that the current
construct managed to lose the CO_ER_CIP_EMPTY error code in case
the connection was already shut, thus causing a race condition
with some errors being reported correctly or as unknown depending
on the timing.
2020-01-23 18:05:18 +01:00
Willy Tarreau
6d015724ec MINOR: connection: remove checks for CO_FL_HANDSHAKE before I/O
There are still leftovers from the pre-xprt_handshake era with lots
of places where I/O callbacks refrain from receiving/sending if they
see that a handshake is present. This needlessly duplicates the
subscribe calls as it will automatically be done by the underlying
xprt_handshake code when attempting the operation.

The only reason for still checking CO_FL_HANDSHAKE is when we decide
to instantiate xprt_handshake. This patch removes all other ones.
2020-01-23 17:30:42 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Willy Tarreau
4450b587dd MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE
Most places continue to check CO_FL_HANDSHAKE while in fact they should
check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one
dedicated to SSL renegotiation. In fact the SSL layer should be the
only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data
when a renegotiation is in progress, but other ones randomly include it
without knowing. And ideally it should even be an internal flag that's
not exposed in the connection.

This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag
consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL.

In order to limit the confusion that has accumulated over time, the
CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake,
possibly used by a renegotiation was moved after the other ones.
2020-01-23 16:34:26 +01:00
Willy Tarreau
18955db43d MINOR: stream-int: always report received shutdowns
As mentioned in c192b0ab95 ("MEDIUM: connection: remove CO_FL_CONNECTED
and only rely on CO_FL_WAIT_*"), si_cs_recv() currently does not propagate
CS_FL_EOS to CF_READ_NULL if CO_FL_WAIT_L4L6 is set, while this situation
doesn't exist anymore. Let's get rid of this confusing test.
2020-01-23 16:34:26 +01:00
Olivier Houchard
220a26c316 BUG/MEDIUM: 0rtt: Only consider the SSL handshake.
We only add the Early-data header, or get ssl_fc_has_early to return 1, if
we didn't already did the SSL handshake, as otherwise, we know the early
data were fine, and there's no risk of replay attack. But to do so, we
wrongly checked CO_FL_HANDSHAKE, we have to check CO_FL_SSL_WAIT_HS instead,
as we don't care about the status of any other handshake.

This should be backported to 2.1, 2.0, and 1.9.

When deciding if we should add the Early-Data header, or if the sample fetch
should return
2020-01-23 15:01:11 +01:00
Willy Tarreau
c192b0ab95 MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.

Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.

So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.

This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:

  - while most places do check both L4+L6 and HANDSHAKE at the same time,
    some places like assign_server() or back_handle_st_con() and a few
    sample fetches looking for proxy protocol do check for L4+L6 but
    don't care about HANDSHAKE ; these ones will probably fail on TCP
    request session rules if the handshake is not complete.

  - some handshake handlers do validate that a connection is established
    at L4 but didn't clear CO_FL_WAIT_L4_CONN

  - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
    before declaring the mux ready while the snd_buf function also checks
    for the handshake's completion. Likely the former should validate the
    handshake as well and we should get rid of these extra tests in snd_buf.

  - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
    later clear CO_FL_WAIT_L4_CONN.

  - xprt_handshake would set CO_FL_CONNECTED itself without actually
    clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
    waiting for a pure Rx handshake.

  - most places in ssl_sock that were checking CO_FL_CONNECTED don't need
    to include the L4 check as an L6 check is enough to decide whether to
    wait for more info or not.

It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.

Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
2020-01-23 14:41:37 +01:00
Emmanuel Hocdet
078156d063 BUG/MINOR: ssl/cli: ocsp_issuer must be set w/ "set ssl cert"
ocsp_issuer is primary set from ckch->chain when PEM is loaded from file,
but not set when PEM is loaded via CLI payload. Set ckch->ocsp_issuer in
ssl_sock_load_pem_into_ckch to fix that.

Should be backported in 2.1.
2020-01-23 14:33:14 +01:00
Olivier Houchard
a8a415d31a BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session().
We can't just assume conn_create_mux() will be called, and set CO_FL_CONNECTED,
conn_complete_session() might be call synchronously if we're not using SSL,
so ew haee no choice but to set CO_FL_CONNECTED in there. This should fix
the recent breakage of the mcli reg tests.
2020-01-23 13:20:03 +01:00
William Lallemand
dad239d08b BUG/MINOR: ssl: typo in previous patch
The previous patch 5c3c96f ("BUG/MINOR: ssl: memory leak w/ the
ocsp_issuer") contains a typo that prevent it to build.

Should be backported in 2.1.
2020-01-23 11:59:02 +01:00
William Lallemand
5c3c96fd36 BUG/MINOR: ssl: memory leak w/ the ocsp_issuer
This patch frees the ocsp_issuer in
ssl_sock_free_cert_key_and_chain_contents().

Shoudl be backported in 2.1.
2020-01-23 11:57:39 +01:00
William Lallemand
b829dda57b BUG/MINOR: ssl: increment issuer refcount if in chain
When using the OCSP response, if the issuer of the response is in
the certificate chain, its address will be stored in ckch->ocsp_issuer.
However, since the ocsp_issuer could be filled by a separate file, this
pointer is free'd. The refcount of the X509 need to be incremented to
avoid a double free if we free the ocsp_issuer AND the chain.
2020-01-23 11:57:39 +01:00
Willy Tarreau
027d206b57 CLEANUP: stats: shut up a wrong null-deref warning from gcc 9.2
As reported in bug #447, gcc 9.2 invents impossible code paths and then
complains that we don't check for our pointers to be NULL... This code
path is not critical, better add the test to shut it up than try to
help it being less creative.

This code hasn't changed for a while, so it could help distros to
backport this to older releases.
2020-01-23 11:49:02 +01:00
Willy Tarreau
79fd577ac1 CLEANUP: backend: shut another false null-deref in back_handle_st_con()
objt_conn() may return a NULL though here we don't have this situation
anymore since the connection is always there, so let's simply switch
to the unchecked __objt_conn(). This addresses issue #454.
2020-01-23 11:40:40 +01:00
Willy Tarreau
b1a40c72e7 CLEANUP: backend: remove useless test for inexistent connection
Coverity rightfully reported that it's pointless to test for "conn"
to be null while all code paths leading to it have already
dereferenced it. This addresses issue #461.
2020-01-23 11:37:43 +01:00
William Lallemand
75b15f790f BUG/MINOR: ssl/cli: free the previous ckch content once a PEM is loaded
When using "set ssl cert" on the CLI, if we load a new PEM, the previous
sctl, issuer and OCSP response are still loaded. This doesn't make any
sense since they won't be usable with a new private key.

This patch free the previous data.

Should be backported in 2.1.
2020-01-23 11:08:46 +01:00
Adis Nezirovic
d0142e7224 MINOR: cli: Report location of errors or any extra data for "show table"
When using multiple filters with "show table", it can be useful to
report which filter entry failed

  > show table MY_TABLE data.gpc0 gt 0 data.gpc0a lt 1000
  Filter entry #2: Unknown data type

  > show table MY_TABLE data.gpc0 gt 0 data.gpc0 lt 1000a
  Filter entry #2: Require a valid integer value to compare against

We now also catch garbage data after the filter

  > show table MY_TABLE data.gpc0 gt 0 data.gpc0 lt 1000 data.gpc0 gt 1\
    data.gpc0 gt 10 a
  Detected extra data in filter, 16th word of input, after '10'

Even before multi-filter feature we've also silently accepted garbage
after the input, hiding potential bugs

  > show table MY_TABLE data.gpc0 gt 0 data.gpc0
or
  > show table MY_TABLE data.gpc0 gt 0 a

In both cases, only first filter entry would be used, silently ignoring
extra filter entry or garbage data.

Last, but not the least, it is now possible to detect multi-filter
feature from cli with something like the following:

  > show table MY_TABLE data.blah
  Filter entry #1: Unknown data type
2020-01-23 10:43:52 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Olivier Houchard
8af03b396a MEDIUM: streams: Always create a conn_stream in connect_server().
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
we have less special-casing needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
2020-01-22 18:55:59 +01:00
Adis Nezirovic
56dd354b3c BUG/MINOR: cli: Missing arg offset for filter data values.
We don't properly check for missing data values for additional filter
entries, passing out of bounds index to args[], then passing to strlen.

Introduced in commit 1a693fc2: (MEDIUM: cli: Allow multiple filter
entries for "show table")
2020-01-22 18:09:06 +01:00
Willy Tarreau
2b64a35184 BUILD: stick-table: fix build errors introduced by last stick-table change
Last commit 1a693fc2fd ("MEDIUM: cli: Allow multiple filter entries for "show table"")
broke the build at two places:

  src/stick_table.c: In function 'table_prepare_data_request':
  src/stick_table.c:3620:33: warning: ordered comparison of pointer with integer zero [-Wextra]
  src/stick_table.c: In function 'cli_io_handler_table':
  src/stick_table.c:3763:5: error: 'for' loop initial declarations are only allowed in C99 mode
  src/stick_table.c:3763:5: note: use option -std=c99 or -std=gnu99 to compile your code
  make: *** [src/stick_table.o] Error 1

This patch fixes both. No backport needed.
2020-01-22 17:11:00 +01:00
Emmanuel Hocdet
6b5b44e10f BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent
"set ssl cert <filename> <payload>" CLI command should have the same
result as reload HAproxy with the updated pem file (<filename>).
Is not the case, DHparams/cert-chain is kept from the previous
context if no DHparams/cert-chain is set in the context (<payload>).

This patch should be backport to 2.1
2020-01-22 15:55:55 +01:00
Olivier Houchard
1a9dbe58a6 BUG/MEDIUM: netscaler: Don't forget to allocate storage for conn->src/dst.
In conn_recv_netscaler_cip(), don't forget to allocate conn->src and
conn->dst, as those are now dynamically allocated. Not doing so results in
getting a crash when using netscaler.
This should fix github issue #460.

This should be backported to 2.1.
2020-01-22 15:33:03 +01:00
Adis Nezirovic
1a693fc2fd MEDIUM: cli: Allow multiple filter entries for "show table"
For complex stick tables with many entries/columns, it can be beneficial
to filter using multiple criteria. The maximum number of filter entries
can be controlled by defining STKTABLE_FILTER_LEN during build time.

This patch can be backported to older releases.
2020-01-22 14:33:17 +01:00
Willy Tarreau
71f95fa20e [RELEASE] Released version 2.2-dev1
Released version 2.2-dev1 with the following main changes :
    - DOC: this is development again
    - MINOR: version: this is development again, update the status
    - SCRIPTS: update create-release to fix the changelog on new branches
    - CLEANUP: ssl: Clean up error handling
    - BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only
    - BUG/MINOR: h1: Don't test the host header during response parsing
    - BUILD/MINOR: trace: fix use of long type in a few printf format strings
    - DOC: Clarify behavior of server maxconn in HTTP mode
    - MINOR: ssl: deduplicate ca-file
    - MINOR: ssl: compute ca-list from deduplicate ca-file
    - MINOR: ssl: deduplicate crl-file
    - CLEANUP: dns: resolution can never be null
    - BUG/MINOR: http-htx: Don't make http_find_header() fail if the value is empty
    - DOC: ssl/cli: set/commit/abort ssl cert
    - BUG/MINOR: ssl: fix SSL_CTX_set1_chain compatibility for openssl < 1.0.2
    - BUG/MINOR: fcgi-app: Make the directive pass-header case insensitive
    - BUG/MINOR: stats: Fix HTML output for the frontends heading
    - BUG/MINOR: ssl: fix X509 compatibility for openssl < 1.1.0
    - DOC: clarify matching strings on binary fetches
    - DOC: Fix ordered list in summary
    - DOC: move the "group" keyword at the right place
    - MEDIUM: init: prevent process and thread creation at runtime
    - BUG/MINOR: ssl/cli: 'ssl cert' cmd only usable w/ admin rights
    - BUG/MEDIUM: stream-int: don't subscribed for recv when we're trying to flush data
    - BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible
    - BUG/MINOR: ssl/cli: don't overwrite the filters variable
    - BUG/MEDIUM: listener/thread: fix a race when pausing a listener
    - BUG/MINOR: ssl: certificate choice can be unexpected with openssl >= 1.1.1
    - BUG/MEDIUM: mux-h1: Never reuse H1 connection if a shutw is pending
    - BUG/MINOR: mux-h1: Don't rely on CO_FL_SOCK_RD_SH to set H1C_F_CS_SHUTDOWN
    - BUG/MINOR: mux-h1: Fix conditions to know whether or not we may receive data
    - BUG/MEDIUM: tasks: Make sure we switch wait queues in task_set_affinity().
    - BUG/MEDIUM: checks: Make sure we set the task affinity just before connecting.
    - MINOR: debug: replace popen() with pipe+fork() in "debug dev exec"
    - MEDIUM: init: set NO_NEW_PRIVS by default when supported
    - BUG/MINOR: mux-h1: Be sure to set CS_FL_WANT_ROOM when EOM can't be added
    - BUG/MEDIUM: mux-fcgi: Handle cases where the HTX EOM block cannot be inserted
    - BUG/MINOR: proxy: make soft_stop() also close FDs in LI_PAUSED state
    - BUG/MINOR: listener/threads: always use atomic ops to clear the FD events
    - BUG/MINOR: listener: also clear the error flag on a paused listener
    - BUG/MEDIUM: listener/threads: fix a remaining race in the listener's accept()
    - MINOR: listener: make the wait paths cleaner and more reliable
    - MINOR: listener: split dequeue_all_listener() in two
    - REORG: listener: move the global listener queue code to listener.c
    - DOC: document the listener state transitions
    - BUG/MEDIUM: kqueue: Make sure we report read events even when no data.
    - BUG/MAJOR: dns: add minimalist error processing on the Rx path
    - BUG/MEDIUM: proto_udp/threads: recv() and send() must not be exclusive.
    - DOC: listeners: add a few missing transitions
    - BUG/MINOR: tasks: only requeue a task if it was already in the queue
    - MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
    - DOC: proxies: HAProxy only supports 3 connection modes
    - DOC: remove references to the outdated architecture.txt
    - BUG/MINOR: log: fix minor resource leaks on logformat error path
    - BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
    - BUG/MINOR: listener: do not immediately resume on transient error
    - BUG/MINOR: server: make "agent-addr" work on default-server line
    - BUG/MINOR: listener: fix off-by-one in state name check
    - BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy()
    - MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state
    - MINOR: http-htx: Add some htx sample fetches for debugging purpose
    - REGTEST: Add an HTX reg-test to check an edge case
    - DOC: clarify the fact that replace-uri works on a full URI
    - BUG/MINOR: sample: fix the closing bracket and LF in the debug converter
    - BUG/MINOR: sample: always check converters' arguments
    - MINOR: sample: Validate the number of bits for the sha2 converter
    - BUG/MEDIUM: ssl: Don't set the max early data we can receive too early.
    - MINOR: ssl/cli: 'show ssl cert' give information on the certificates
    - BUG/MINOR: ssl/cli: fix build for openssl < 1.0.2
    - MINOR: debug: support logging to various sinks
    - MINOR: http: add a new "replace-path" action
    - REGTEST: ssl: test the "set ssl cert" CLI command
    - REGTEST: run-regtests: implement #REQUIRE_BINARIES
    - MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task
    - BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing
    - BUG/MEDIUM: ssl: Revamp the way early data are handled.
    - MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute
    - BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd
    - REGTEST: make the "set ssl cert" require version 2.1
    - BUG/MINOR: ssl: openssl-compat: Fix getm_ defines
    - BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry
    - BUG/MINOR: state-file: do not store duplicates in the global tree
    - BUG/MINOR: state-file: do not leak memory on parse errors
    - BUG/MAJOR: mux-h1: Don't pretend the input channel's buffer is full if empty
    - BUG/MEDIUM: stream: Be sure to never assign a TCP backend to an HTX stream
    - BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility
    - BUILD: travis-ci: link with ssl libraries using rpath instead of LD_LIBRARY_PATH/DYLD_LIBRARY_PATH
    - BUILD: travis-ci: reenable address sanitizer for clang builds
    - BUG/MINOR: checks: refine which errno values are really errors.
    - BUG/MINOR: connection: only wake send/recv callbacks if the FD is active
    - CLEANUP: connection: conn->xprt is never NULL
    - MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP
    - MEDIUM: tcp: make tcp_connect_probe() consider ERR/HUP
    - REORG: connection: move tcp_connect_probe() to conn_fd_check()
    - MINOR: connection: check for connection validation earlier
    - MINOR: connection: remove the double test on xprt_done_cb()
    - CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE
    - MINOR: poller: do not call the IO handler if the FD is not active
    - OPTIM: epoll: always poll for recv if neither active nor ready
    - OPTIM: polling: do not create update entries for FD removal
    - BUG/MEDIUM: checks: Only attempt to do handshakes if the connection is ready.
    - BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection.
    - BUILD: CI: modernize cirrus-ci
    - MINOR: config: disable busy polling on old processes
    - MINOR: ssl: Remove unused variable "need_out".
    - BUG/MINOR: h1: Report the right error position when a header value is invalid
    - BUG/MINOR: proxy: Fix input data copy when an error is captured
    - BUG/MEDIUM: http-ana: Truncate the response when a redirect rule is applied
    - BUG/MINOR: channel: inject output data at the end of output
    - BUG/MEDIUM: session: do not report a failure when rejecting a session
    - MEDIUM: dns: implement synchronous send
    - MINOR: raw_sock: make sure to disable polling once everything is sent
    - MINOR: http: Add 410 to http-request deny
    - MINOR: http: Add 404 to http-request deny
    - CLEANUP: mux-h2: remove unused goto "out_free_h2s"
    - BUILD: cirrus-ci: choose proper openssl package name
    - BUG/MAJOR: listener: do not schedule a task-less proxy
    - CLEANUP: server: remove unused err section in server_finalize_init
    - REGTEST: set_ssl_cert.vtc: replace "echo" with "printf"
    - BUG/MINOR: stream-int: Don't trigger L7 retry if max retries is already reached
    - BUG/MEDIUM: tasks: Use the MT macros in tasklet_free().
    - BUG/MINOR: mux-h2: use a safe list_for_each_entry in h2_send()
    - BUG/MEDIUM: mux-h2: fix missing test on sending_list in previous patch
    - CLEANUP: ssl: remove opendir call in ssl_sock_load_cert
    - MEDIUM: lua: don't call the GC as often when dealing with outgoing connections
    - BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary
    - BUG/MINOR: cli/mworker: can't start haproxy with 2 programs
    - REGTEST: mcli/mcli_start_progs: start 2 programs
    - BUG/MEDIUM: mworker: remain in mworker mode during reload
    - DOC: clarify crt-base usage
    - CLEANUP: compression: remove unused deinit_comp_ctx section
    - BUG/MEDIUM: mux_h1: Don't call h1_send if we subscribed().
    - BUG/MEDIUM: raw_sock: Make sur the fd and conn are sync.
    - CLEANUP: proxy: simplify proxy_parse_rate_limit proxy checks
    - BUG/MAJOR: hashes: fix the signedness of the hash inputs
    - REGTEST: add sample_fetches/hashes.vtc to validate hashes
    - BUG/MEDIUM: cli: _getsocks must send the peers sockets
    - CLEANUP: cli: deduplicate the code in _getsocks
    - BUG/MINOR: stream: don't mistake match rules for store-request rules
    - BUG/MEDIUM: connection: add a mux flag to indicate splice usability
    - BUG/MINOR: pattern: handle errors from fgets when trying to load patterns
    - MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only
    - MINOR: stream-int: remove dependency on CO_FL_WAIT_ROOM for rcv_buf()
    - MEDIUM: connection: get rid of CO_FL_CURR_* flags
    - BUILD: pattern: include errno.h
    - MEDIUM: mux-h2: do not try to stop sending streams on blocked mux
    - MEDIUM: mux-fcgi: do not try to stop sending streams on blocked mux
    - MEDIUM: mux-h2: do not make an h2s subscribe to itself on deferred shut
    - MEDIUM: mux-fcgi: do not make an fstrm subscribe to itself on deferred shut
    - REORG: stream/backend: move backend-specific stuff to backend.c
    - MEDIUM: backend: move the connection finalization step to back_handle_st_con()
    - MEDIUM: connection: merge the send_wait and recv_wait entries
    - MEDIUM: xprt: merge recv_wait and send_wait in xprt_handshake
    - MEDIUM: ssl: merge recv_wait and send_wait in ssl_sock
    - MEDIUM: mux-h1: merge recv_wait and send_wait
    - MEDIUM: mux-h2: merge recv_wait and send_wait event notifications
    - MEDIUM: mux-fcgi: merge recv_wait and send_wait event notifications
    - MINOR: connection: make the last arg of subscribe() a struct wait_event*
    - MINOR: ssl: Add support for returning the dn samples from ssl_(c|f)_(i|s)_dn in LDAP v3 (RFC2253) format.
    - DOC: Fix copy and paste mistake in http-response replace-value doc
    - BUG/MINOR: cache: Fix leak of cache name in error path
    - BUG/MINOR: dns: Make dns_query_id_seed unsigned
    - BUG/MINOR: 51d: Fix bug when HTX is enabled
    - MINOR: http-htx: Move htx sample fetches in the scope "internal"
    - MINOR: http-htx: Rename 'internal.htx_blk.val' to 'internal.htx_blk.data'
    - MINOR: http-htx: Make 'internal.htx_blk_data' return a binary string
    - DOC: Add a section to document the internal sample fetches
    - MINOR: mux-h1: Inherit send flags from the upper layer
    - MINOR: contrib/prometheus-exporter: Add heathcheck status/code in server metrics
    - BUG/MINOR: http-ana/filters: Wait end of the http_end callback for all filters
    - BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
    - BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during parsing
    - MEDIUM: http-rules: Register an action keyword for all http rules
    - MINOR: tcp-rules: Always set from which ruleset a rule comes from
    - MINOR: actions: Use ACT_RET_CONT code to ignore an error from a custom action
    - MINOR: tcp-rules: Kill connections when custom actions return ACT_RET_ERR
    - MINOR: http-rules: Return an error when custom actions return ACT_RET_ERR
    - MINOR: counters: Add a counter to report internal processing errors
    - MEDIUM: http-ana: Properly handle internal processing errors
    - MINOR: http-rules: Add a rule result to report internal error
    - MINOR: http-rules: Handle internal errors during HTTP rules evaluation
    - MINOR: http-rules: Add more return codes to let custom actions act as normal ones
    - MINOR: tcp-rules: Handle denied/aborted/invalid connections from TCP rules
    - MINOR: http-rules: Handle denied/aborted/invalid connections from HTTP rules
    - MINOR: stats: Report internal errors in the proxies/listeners/servers stats
    - MINOR: contrib/prometheus-exporter: Export internal errors per proxy/server
    - MINOR: counters: Remove failed_secu counter and use denied_resp instead
    - MINOR: counters: Review conditions to increment counters from analysers
    - MINOR: http-ana: Add a txn flag to support soft/strict message rewrites
    - MINOR: http-rules: Handle all message rewrites the same way
    - MINOR: http-rules: Add a rule to enable or disable the strict rewriting mode
    - MEDIUM: http-rules: Enable the strict rewriting mode by default
    - REGTEST: Fix format of set-uri HTTP request rule in h1or2_to_h1c.vtc
    - MINOR: actions: Add a function pointer to release args used by actions
    - MINOR: actions: Regroup some info about HTTP rules in the same struct
    - MINOR: http-rules/tcp-rules: Call the defined action function first if defined
    - MINOR: actions: Rename the act_flag enum into act_opt
    - MINOR: actions: Add flags to configure the action behaviour
    - MINOR: actions: Use an integer to set the action type
    - MINOR: http-rules: Use a specific action type for some custom HTTP actions
    - MINOR: http-rules: Make replace-header and replace-value custom actions
    - MINOR: http-rules: Make set-header and add-header custom actions
    - MINOR: http-rules: Make set/del-map and add/del-acl custom actions
    - MINOR: http-rules: Group all processing of early-hint rule in its case clause
    - MEDIUM: http-rules: Make early-hint custom actions
    - MINOR: http-rule/tcp-rules: Make track-sc* custom actions
    - MINOR: tcp-rules: Make tcp-request capture a custom action
    - MINOR: http-rules: Add release functions for existing HTTP actions
    - BUG/MINOR: http-rules: Fix memory releases on error path during action parsing
    - MINOR: tcp-rules: Add release functions for existing TCP actions
    - BUG/MINOR: tcp-rules: Fix memory releases on error path during action parsing
    - MINOR: http-htx: Add functions to read a raw error file and convert it in HTX
    - MINOR: http-htx: Add functions to create HTX redirect message
    - MINOR: config: Use dedicated function to parse proxy's errorfiles
    - MINOR: config: Use dedicated function to parse proxy's errorloc
    - MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages
    - MINOR: proxy: Register keywords to parse errorfile and errorloc directives
    - MINOR: http-htx: Add a new section to create groups of custom HTTP errors
    - MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy
    - MINOR: http-rules: Update txn flags and status when a deny rule is executed
    - MINOR: http-rules: Support an optional status on deny rules for http reponses
    - MINOR: http-rules: Use same function to parse request and response deny actions
    - MINOR: http-ana: Add an error message in the txn and send it when defined
    - MEDIUM: http-rules: Support an optional error message in http deny rules
    - REGTEST: Add a strict rewriting mode reg test
    - REGEST: Add reg tests about error files
    - MINOR: ssl: accept 'verify' bind option with 'set ssl cert'
    - BUG/MINOR: ssl: ssl_sock_load_ocsp_response_from_file memory leak
    - BUG/MINOR: ssl: ssl_sock_load_issuer_file_into_ckch memory leak
    - BUG/MINOR: ssl: ssl_sock_load_sctl_from_file memory leak
    - BUG/MINOR: http_htx: Fix some leaks on error path when error files are loaded
    - CLEANUP: http-ana: Remove useless test on txn when the error message is retrieved
    - BUILD: CI: introduce ARM64 builds
    - BUILD: ssl: more elegant anti-replay feature presence check
    - MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
    - MEDIUM: dns: use Additional records from SRV responses
    - CLEANUP: Consistently `unsigned int` for bitfields
    - CLEANUP: pattern: remove the pat_time definition
    - BUG/MINOR: http_act: don't check capture id in backend
    - BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x
2020-01-22 10:34:58 +01:00
Baptiste Assmann
19a69b3740 BUG/MINOR: http_act: don't check capture id in backend
A wrong behavior was introduced by
e9544935e8, leading to preventing loading
any configuration where a capture slot id is used in a backend.
IE, the configuration below does not parse:

  frontend f
   bind *:80
   declare capture request len 32
   default_backend webserver

  backend webserver
   http-request capture req.hdr(Host) id 1

The point is that such type of configuration is valid and should run.

This patch enforces the check of capture slot id only if the action rule
is configured in a frontend.
The point is that at configuration parsing time, it is impossible to
check which frontend could point to this backend (furthermore if we use
dynamic backend name resolution at runtime).

The documentation has been updated to warn the user to ensure that
relevant frontends have required declaration when such rule has to be
used in a backend.
If no capture slot can be found, then the action will just not be
executed and HAProxy will process the next one in the list, as expected.

This should be backported to all supported branches (bug created as part
of a bug fix introduced into 1.7 and backported to 1.6).
2020-01-22 07:44:36 +01:00
Baptiste Assmann
13a9232ebc MEDIUM: dns: use Additional records from SRV responses
Most DNS servers provide A/AAAA records in the Additional section of a
response, which correspond to the SRV records from the Answer section:

  ;; QUESTION SECTION:
  ;_http._tcp.be1.domain.tld.     IN      SRV

  ;; ANSWER SECTION:
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A1.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A8.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A5.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A6.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A4.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A3.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A2.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A7.domain.tld.

  ;; ADDITIONAL SECTION:
  A1.domain.tld.          3600    IN      A       192.168.0.1
  A8.domain.tld.          3600    IN      A       192.168.0.8
  A5.domain.tld.          3600    IN      A       192.168.0.5
  A6.domain.tld.          3600    IN      A       192.168.0.6
  A4.domain.tld.          3600    IN      A       192.168.0.4
  A3.domain.tld.          3600    IN      A       192.168.0.3
  A2.domain.tld.          3600    IN      A       192.168.0.2
  A7.domain.tld.          3600    IN      A       192.168.0.7

SRV record support was introduced in HAProxy 1.8 and the first design
did not take into account the records from the Additional section.
Instead, a new resolution is associated to each server with its relevant
FQDN.
This behavior generates a lot of DNS requests (1 SRV + 1 per server
associated).

This patch aims at fixing this by:
- when a DNS response is validated, we associate A/AAAA records to
  relevant SRV ones
- set a flag on associated servers to prevent them from running a DNS
  resolution for said FADN
- update server IP address with information found in the Additional
  section

If no relevant record can be found in the Additional section, then
HAProxy will failback to running a dedicated resolution for this server,
as it used to do.
This behavior is the one described in RFC 2782.
2020-01-22 07:19:54 +01:00
Christopher Faulet
2f5339079b MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
It is now possible to insert any attribute when a cookie is inserted by
HAProxy. Any value may be set, no check is performed except the syntax validity
(CTRL chars and ';' are forbidden). For instance, it may be used to add the
SameSite attribute:

    cookie SRV insert attr "SameSite=Strict"

The attr option may be repeated to add several attributes.

This patch should fix the issue #361.
2020-01-22 07:18:31 +01:00
Ilya Shipitsin
e9ff8992a1 BUILD: ssl: more elegant anti-replay feature presence check
Instead of tracking the version number to figure whether
SSL_OP_NO_ANTI_REPLAY is defined, simply rely on its definition.
2020-01-22 06:50:21 +01:00
Christopher Faulet
53a87e134e CLEANUP: http-ana: Remove useless test on txn when the error message is retrieved
In http_error_message(), the HTTP txn is always defined. So, this is no reason
to test its nullity.

This patch partially fixes the issue #457.
2020-01-21 11:12:37 +01:00
Christopher Faulet
7cde96c829 BUG/MINOR: http_htx: Fix some leaks on error path when error files are loaded
No backports needed. This patch partially fixes the issue #457.
2020-01-21 11:12:37 +01:00
Emmanuel Hocdet
224a087a27 BUG/MINOR: ssl: ssl_sock_load_sctl_from_file memory leak
"set ssl cert <filename.sctl> <payload>" CLI command must free
previous context.

This patch should be backport to 2.1
2020-01-21 10:44:33 +01:00
Emmanuel Hocdet
eb73dc34bb BUG/MINOR: ssl: ssl_sock_load_issuer_file_into_ckch memory leak
"set ssl cert <filename.issuer> <payload>" CLI command must free
previous context.

This patch should be backport to 2.1
2020-01-21 10:44:33 +01:00
Emmanuel Hocdet
0667faebcf BUG/MINOR: ssl: ssl_sock_load_ocsp_response_from_file memory leak
"set ssl cert <filename.ocsp> <payload>" CLI command must free
previous context.

This patch should be backport to 2.1
2020-01-21 10:44:33 +01:00
Emmanuel Hocdet
ebf840bf37 MINOR: ssl: accept 'verify' bind option with 'set ssl cert'
Since patches initiated with d4f9a60e "MINOR: ssl: deduplicate ca-file",
no more file access is done for 'verify' bind options (crl/ca file).
Remove conditional restriction for "set ssl cert" CLI commands.
2020-01-21 09:58:41 +01:00
Christopher Faulet
554c0ebffd MEDIUM: http-rules: Support an optional error message in http deny rules
It is now possible to set the error message to use when a deny rule is
executed. It may be a specific error file, adding "errorfile <file>" :

  http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http

It may also be an error file from an http-errors section, adding "errorfiles
<name>" :

  http-request deny errorfiles my-errors  # use 403 error from "my-errors" section

When defined, this error message is set in the HTTP transaction. The tarpit rule
is also concerned by this change.
2020-01-20 15:18:46 +01:00
Christopher Faulet
473e880a25 MINOR: http-ana: Add an error message in the txn and send it when defined
It is now possible to set the error message to return to client in the HTTP
transaction. If it is defined, this error message is used instead of proxy's
errors or default errors.
2020-01-20 15:18:46 +01:00
Christopher Faulet
e0fca297d5 MINOR: http-rules: Use same function to parse request and response deny actions
Because there is no more difference between http-request and http-response
rules, the same function is now used to parse them.
2020-01-20 15:18:46 +01:00
Christopher Faulet
040c8cdbbe MINOR: http-rules: Support an optional status on deny rules for http reponses
It is now possible to specified the status code to return an http-response deny
rules. For instance :

    http-response deny deny_status 500
2020-01-20 15:18:46 +01:00
Christopher Faulet
b58f62b316 MINOR: http-rules: Update txn flags and status when a deny rule is executed
When a deny rule is executed, the flag TX_CLDENY and the status code are set on
the HTTP transaction. Now, these steps are handled by the code executing the
deny rule. So into http_req_get_intercept_rule() for the request and
http_res_get_intercept_rule() for the response.
2020-01-20 15:18:46 +01:00
Christopher Faulet
76edc0f29c MEDIUM: proxy: Add a directive to reference an http-errors section in a proxy
It is now possible to import in a proxy, fully or partially, error files
declared in an http-errors section. It may be done using the "errorfiles"
directive, followed by a name and optionally a list of status code. If there is
no status code specified, all error files of the http-errors section are
imported. Otherwise, only error files associated to the listed status code are
imported. For instance :

  http-errors my-errors
      errorfile 400 ...
      errorfile 403 ...
      errorfile 404 ...

  frontend frt
      errorfiles my-errors 403 404  # ==> error 400 not imported
2020-01-20 15:18:46 +01:00
Christopher Faulet
35cd81d363 MINOR: http-htx: Add a new section to create groups of custom HTTP errors
A new section may now be declared in the configuration to create global groups
of HTTP errors. These groups are not linked to a proxy and are referenced by
name. The section must be declared using the keyword "http-errors" followed by
the group name. This name must be unique. A list of "errorfile" directives may
be declared in such section. For instance:

    http-errors website-1
        errorfile 400 /path/to/site1/400.http
        errorfile 404 /path/to/site1/404.http

    http-errors website-2
        errorfile 400 /path/to/site2/400.http
        errorfile 404 /path/to/site2/404.http

For now, it is just possible to create "http-errors" sections. There is no
documentation because these groups are not used yet.
2020-01-20 15:18:46 +01:00
Christopher Faulet
07f41f79cb MINOR: proxy: Register keywords to parse errorfile and errorloc directives
errorfile and errorloc directives are now pased in dedicated functions in
http_htx.c.
2020-01-20 15:18:46 +01:00
Christopher Faulet
5885775de1 MEDIUM: http-htx/proxy: Use a global and centralized storage for HTTP error messages
All custom HTTP errors are now stored in a global tree. Proxies use a references
on these messages. The key used for errorfile directives is the file name as
specified in the configuration. For errorloc directives, a key is created using
the redirect code and the url. This means that the same custom error message is
now stored only once. It may be used in several proxies or for several status
code, it is only parsed and stored once.
2020-01-20 15:18:46 +01:00
Christopher Faulet
ac2412fee8 MINOR: config: Use dedicated function to parse proxy's errorloc
The parsing of the "errorloc" directive is now handled by the function
http_parse_errorloc().
2020-01-20 15:18:45 +01:00
Christopher Faulet
13d297f3d6 MINOR: config: Use dedicated function to parse proxy's errorfiles
The parsing of the "errorfile" directive is now handled by the function
http_parse_errorfile().
2020-01-20 15:18:45 +01:00
Christopher Faulet
bdf6526e94 MINOR: http-htx: Add functions to create HTX redirect message
http_parse_errorloc() may now be used to create an HTTP 302 or 303 redirect
message with a specific url passed as parameter. A parameter is used to known if
it is a 302 or a 303 redirect. A status code is passed as parameter. It must be
one of the supported HTTP error codes to be valid. Otherwise an error is
returned. It aims to be used to parse "errorloc" directives. It relies on
http_load_errormsg() to do most of the job, ie converting it in HTX.
2020-01-20 15:18:45 +01:00
Christopher Faulet
5031ef58ca MINOR: http-htx: Add functions to read a raw error file and convert it in HTX
http_parse_errorfile() may now be used to parse a raw HTTP message from a
file. A status code is passed as parameter. It must be one of the supported HTTP
error codes to be valid. Otherwise an error is returned. It aims to be used to
parse "errorfile" directives. It relies on http_load_errorfile() to do most of
the job, ie reading the file content and converting it in HTX.
2020-01-20 15:18:45 +01:00
Christopher Faulet
fdb6fbfa9a BUG/MINOR: tcp-rules: Fix memory releases on error path during action parsing
When an error occurred during the parsing of a TCP action, if some memory was
allocated, it should be released before exiting.  Here, the fix consists for
replace a call to free() on a sample expression by a call to
release_sample_expr().

This patch may be backported to all supported versions.
2020-01-20 15:18:45 +01:00
Christopher Faulet
adfc6e8e14 MINOR: tcp-rules: Add release functions for existing TCP actions
TCP actions allocating memory during configuration parsing now use dedicated
functions to release it.
2020-01-20 15:18:45 +01:00
Christopher Faulet
1337b328d9 BUG/MINOR: http-rules: Fix memory releases on error path during action parsing
When an error occurred during the parsing of an HTTP action, if some memory was
allocated, it should be released before exiting. Sometime a call to free() is
used on a sample expression instead of a call to release_sample_expr(). Other
time, it is just a string or a regex that should be released.

There is no real reason to backport this patch. Especially because this part was
highly modified recentely in 2.2-DEV.
2020-01-20 15:18:45 +01:00
Christopher Faulet
2eb539687e MINOR: http-rules: Add release functions for existing HTTP actions
HTTP actions allocating memory during configuration parsing now use dedicated
functions to release it.
2020-01-20 15:18:45 +01:00
Christopher Faulet
d73b96d48c MINOR: tcp-rules: Make tcp-request capture a custom action
Now, this action is use its own dedicated function and is no longer handled "in
place" during the TCP rules evaluation. Thus the action name ACT_TCP_CAPTURE is
removed. The action type is set to ACT_CUSTOM and a check function is used to
know if the rule depends on request contents while there is no inspect-delay.
2020-01-20 15:18:45 +01:00
Christopher Faulet
ac98d81f46 MINOR: http-rule/tcp-rules: Make track-sc* custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the TCP/HTTP rules evaluation. Thus the action names
ACT_ACTION_TRK_SC0 and ACT_ACTION_TRK_SCMAX are removed. The action type is now
the tracking index. Thus the function trk_idx() is no longer needed.
2020-01-20 15:18:45 +01:00
Christopher Faulet
91b3ec13c6 MEDIUM: http-rules: Make early-hint custom actions
Now, the early-hint action uses its own dedicated action and is no longer
handled "in place" during the HTTP rules evaluation. Thus the action name
ACT_HTTP_EARLY_HINT is removed. In additionn, http_add_early_hint_header() and
http_reply_103_early_hints() are also removed. This part is now handled in the
new action_ptr callback function.
2020-01-20 15:18:45 +01:00
Christopher Faulet
5275aa7540 MINOR: http-rules: Group all processing of early-hint rule in its case clause
To know if the 103 response start-line must be added, we test if it is the first
rule of the ruleset or if the previous rule is not an early-hint rule. And at
the end, to know if the 103 response must be terminated, we test if it is the
last rule of the ruleset or if the next rule is not an early-hint rule. This
way, all the code dealing with early-hint rules is grouped in its case clause.
2020-01-20 15:18:45 +01:00
Christopher Faulet
046cf44f6c MINOR: http-rules: Make set/del-map and add/del-acl custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_*_ACL and ACT_HTTP_*_MAP are removed. The action type is now mapped as
following: 0 = add-acl, 1 = set-map, 2 = del-acl and 3 = del-map.
2020-01-20 15:18:45 +01:00
Christopher Faulet
d1f27e3394 MINOR: http-rules: Make set-header and add-header custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_SET_HDR and ACT_HTTP_ADD_VAL are removed. The action type is now set to
0 to set a header (so remove existing ones if any and add a new one) or to 1 to
add a header (add without remove).
2020-01-20 15:18:45 +01:00
Christopher Faulet
92d34fe38d MINOR: http-rules: Make replace-header and replace-value custom actions
Now, these actions use their own dedicated function and are no longer handled
"in place" during the HTTP rules evaluation. Thus the action names
ACT_HTTP_REPLACE_HDR and ACT_HTTP_REPLACE_VAL are removed. The action type is
now set to 0 to evaluate the whole header or to 1 to evaluate every
comma-delimited values.

The function http_transform_header_str() is renamed to http_replace_hdrs() to be
more explicit and the function http_transform_header() is removed. In fact, this
last one is now more or less the new action function.

The lua code has been updated accordingly to use http_replace_hdrs().
2020-01-20 15:18:45 +01:00
Christopher Faulet
2c22a6923a MINOR: http-rules: Use a specific action type for some custom HTTP actions
For set-method, set-path, set-query and set-uri, a specific action type is
used. The same as before but no longer stored in <arg.http.i>. Same is done for
replace-path and replace-uri. The same types are used than the "set-" versions.
2020-01-20 15:18:45 +01:00
Christopher Faulet
245cf795c1 MINOR: actions: Add flags to configure the action behaviour
Some flags can now be set on an action when it is registered. The flags are
defined in the act_flag enum. For now, only ACT_FLAG_FINAL may be set on an
action to specify if it stops the rules evaluation. It is set on
ACT_ACTION_ALLOW, ACT_ACTION_DENY, ACT_HTTP_REQ_TARPIT, ACT_HTTP_REQ_AUTH,
ACT_HTTP_REDIR and ACT_TCP_CLOSE actions. But, when required, it may also be set
on custom actions.

Consequently, this flag is checked instead of the action type during the
configuration parsing to trigger a warning when a rule inhibits all the
following ones.
2020-01-20 15:18:45 +01:00
Christopher Faulet
105ba6cc54 MINOR: actions: Rename the act_flag enum into act_opt
The flags in the act_flag enum have been renamed act_opt. It means ACT_OPT
prefix is used instead of ACT_FLAG. The purpose of this patch is to reserve the
action flags for the actions configuration.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cd26e8a2ec MINOR: http-rules/tcp-rules: Call the defined action function first if defined
When TCP and HTTP rules are evaluated, if an action function (action_ptr field
in the act_rule structure) is defined for a given action, it is now always
called in priority over the test on the action type. Concretly, for now, only
custom actions define it. Thus there is no change. It just let us the choice to
extend the action type beyond the existing ones in the enum.
2020-01-20 15:18:45 +01:00
Christopher Faulet
96bff76087 MINOR: actions: Regroup some info about HTTP rules in the same struct
Info used by HTTP rules manipulating the message itself are splitted in several
structures in the arg union. But it is possible to group all of them in a unique
struct. Now, <arg.http> is used by most of these rules, which contains:

  * <arg.http.i>   : an integer used as status code, nice/tos/mark/loglevel or
                     action id.
  * <arg.http.str> : an IST used as header name, reason string or auth realm.
  * <arg.http.fmt> : a log-format compatible expression
  * <arg.http.re>  : a regular expression used by replace rules
2020-01-20 15:18:45 +01:00
Christopher Faulet
58b3564fde MINOR: actions: Add a function pointer to release args used by actions
Arguments used by actions are never released during HAProxy deinit. Now, it is
possible to specify a function to do so. ".release_ptr" field in the act_rule
structure may be set during the configuration parsing to a specific deinit
function depending on the action type.
2020-01-20 15:18:45 +01:00
Christopher Faulet
1aea50e1ff MEDIUM: http-rules: Enable the strict rewriting mode by default
Now, by default, when a rule performing a rewrite on an HTTP message fails, an
internal error is triggered. Before, the failure was ignored. But most of users
are not aware of this behavior. And it does not happen very often because the
buffer reserve space in large enough. So it may be surprising. Returning an
internal error makes the rewrite failure explicit. If it is acceptable to
silently ignore it, the strict rewriting mode can be disabled.
2020-01-20 15:18:45 +01:00
Christopher Faulet
46f95543c5 MINOR: http-rules: Add a rule to enable or disable the strict rewriting mode
It is now possible to explicitly instruct rewriting rules to be strict or not
towards errors. It means that in this mode, an internal error is trigger if a
rewrite rule fails. The HTTP action "strict-mode" can be used to enable or
disable the strict rewriting mode. It can be used in an http-request and an
http-response ruleset.

For now, by default the strict rewriting mode is disabled. Because it is the
current behavior. But it will be changed in another patch.
2020-01-20 15:18:45 +01:00
Christopher Faulet
e00d06c99f MINOR: http-rules: Handle all message rewrites the same way
In HTTP rules, error handling during a rewrite is now handle the same way for
all rules. First, allocation errors are reported as internal errors. Then, if
soft rewrites are allowed, rewrite errors are ignored and only the
failed_rewrites counter is incremented. Otherwise, when strict rewrites are
mandatory, interanl errors are returned.

For now, only soft rewrites are supported. Note also that the warning sent to
notify a rewrite failure was removed. It will be useless once the strict
rewrites will be possible.
2020-01-20 15:18:45 +01:00
Christopher Faulet
a00071e2e5 MINOR: http-ana: Add a txn flag to support soft/strict message rewrites
the HTTP_MSGF_SOFT_RW flag must now be set on the HTTP transaction to ignore
rewrite errors on a message, from HTTP rules. The mode is called the soft
rewrites. If thes flag is not set, strict rewrites are performed. In this mode,
if a rewrite error occurred, an internal error is reported.

For now, HTTP_MSGF_SOFT_RW is always set and there is no way to switch a
transaction in strict mode.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cff0f739e5 MINOR: counters: Review conditions to increment counters from analysers
Now, for these counters, the following rules are followed to know if it must be
incremented or not:

  * if it exists for a frontend, the counter is incremented
  * if stats must be collected for the session's listener, if the counter exists
    for this listener, it is incremented
  * if the backend is already assigned, if the counter exists for this backend,
    it is incremented
  * if a server is attached to the stream, if the counter exists for this
    server, it is incremented

It is not hardcoded rules. Some counters are still handled in a different
way. But many counters are incremented this way now.
2020-01-20 15:18:45 +01:00
Christopher Faulet
a08546bb5a MINOR: counters: Remove failed_secu counter and use denied_resp instead
The failed_secu counter is only used for the servers stats. It is used to report
the number of denied responses. On proxies, the same info is stored in the
denied_resp counter. So, it is more consistent to use the same field for
servers.
2020-01-20 15:18:45 +01:00
Christopher Faulet
0159ee4032 MINOR: stats: Report internal errors in the proxies/listeners/servers stats
The stats field ST_F_EINT has been added to report internal errors encountered
per proxy, per listener and per server. It appears in the CLI export and on the
HTML stats page.
2020-01-20 15:18:45 +01:00
Christopher Faulet
74f67af8d4 MINOR: http-rules: Handle denied/aborted/invalid connections from HTTP rules
The new possible results for a custom action (deny/abort/invalid) are now handled
during HTTP rules evaluation. These codes are mapped on HTTP rules ones :

  * ACT_RET_DENY => HTTP_RULE_RES_DENY
  * ACT_RET_ABRT => HTTP_RULE_RES_ABRT
  * ACT_RET_INV  => HTTP_RULE_RES_BADREQ

For now, no custom action uses these new codes.
2020-01-20 15:18:45 +01:00
Christopher Faulet
282992e25f MINOR: tcp-rules: Handle denied/aborted/invalid connections from TCP rules
The new possible results for a custom action (deny/abort/invalid) are now
handled during TCP rules evaluation. For L4/L5 rules, the session is
rejected. For L7 rules, the right counter is incremented, then the connections
killed.

For now, no custom action uses these new codes.
2020-01-20 15:18:45 +01:00
Christopher Faulet
3a26beea18 MINOR: http-rules: Handle internal errors during HTTP rules evaluation
The HTTP_RULE_RES_ERROR code is now used by HTTP analyzers to handle internal
errors during HTTP rules evaluation. It is used instead of HTTP_RULE_RES_BADREQ,
used for invalid requests/responses. In addition, the SF_ERR_RESOURCE flag is
set on the stream when an allocation failure happens.

Note that the return value of http-response rules evaluation is now tested in
the same way than the result of http-request rules evaluation.
2020-01-20 15:18:45 +01:00
Christopher Faulet
b8a5371a32 MEDIUM: http-ana: Properly handle internal processing errors
Now, processing errors are properly handled. Instead of returning an error 400
or 502, depending where the error happens, an error 500 is now returned. And the
processing_errors counter is incremented. By default, when such error is
detected, the SF_ERR_INTERNAL stream error is used. When the error is caused by
an allocation failure, and when it is reasonnably possible, the SF_ERR_RESOURCE
stream error is used. Thanks to this patch, bad requests and bad responses
should be easier to detect.
2020-01-20 15:18:45 +01:00
Christopher Faulet
28160e73dd MINOR: http-rules: Return an error when custom actions return ACT_RET_ERR
Thanks to the commit "MINOR: actions: Use ACT_RET_CONT code to ignore an error
from a custom action", it is now possible to trigger an error from a custom
action in http rules. Now, when a custom action returns the ACT_RET_ERR code
from an http-request rule, an error 400 is returned. And from an http-response
rule, an error 502 is returned.

Be careful if this patch is backported. The other mentioned patch must be
backported first.
2020-01-20 15:18:45 +01:00
Christopher Faulet
491ab5e2e5 MINOR: tcp-rules: Kill connections when custom actions return ACT_RET_ERR
Thanks to the commit "MINOR: actions: Use ACT_RET_CONT code to ignore an error
from a custom action", it is now possible to trigger an error from a custom
action in tcp-content rules. Now, when a custom action returns the ACT_RET_ERR
code, it has the same behavior than a reject rules, the connection is killed.

Be careful if this patch is backported. The other mentioned patch must be
backported first.
2020-01-20 15:18:45 +01:00
Christopher Faulet
13403761d5 MINOR: actions: Use ACT_RET_CONT code to ignore an error from a custom action
Some custom actions are just ignored and skipped when an error is encoutered. In
that case, we jump to the next rule. To do so, most of them use the return code
ACT_RET_ERR. Currently, for http rules and tcp content rules, it is not a
problem because this code is handled the same way than ACT_RET_CONT. But, it
means there is no way to handle the error as other actions. The custom actions
must handle the error and return ACT_RET_DONE. For instance, when http-request
rules are processed, an error when we try to replace a header value leads to a
bad request and an error 400 is returned to the client. But when we fail to
replace the URI, the error is silently ignored. This difference between the
custom actions and the others is an obstacle to write new custom actions.

So, in this first patch, ACT_RET_CONT is now returned from custom actions
instead of ACT_RET_ERR when an error is encoutered if it should be ignored. The
behavior remains the same but it is now possible to handle true errors using the
return code ACT_RET_ERR. Some actions will probably be reviewed to determine if
an error is fatal or not. Other patches will be pushed to trigger an error when
a custom action returns the ACT_RET_ERR code.

This patch is not tagged as a bug because it is just a design issue. But others
will depends on it. So be careful during backports, if so.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cb9106b3e3 MINOR: tcp-rules: Always set from which ruleset a rule comes from
The ruleset from which a TCP rule comes from (the <from> field in the act_rule
structure) is only set when a rule is created from a registered keyword and not
for all TCP rules. But this information may be useful to check the configuration
validity or during the rule evaluation. So now, we systematically set it.
2020-01-20 15:18:45 +01:00
Christopher Faulet
81e20177df MEDIUM: http-rules: Register an action keyword for all http rules
There are many specific http actions that don't use the action registration
mechanism (allow, deny, set-header...). Instead, the parsing of these actions is
inlined in the functions responsible to parse the http-request/http-response
rules. There is no reason to not register an action keyword for all these
actions. It it the purpose of this patch. The new functions responsible to parse
these http actions are defined in http_act.c
2020-01-20 15:18:45 +01:00
Christopher Faulet
28436e23d3 BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during parsing
During the parsing of the sc-inc-gpc0, sc-inc-gpc1 and sc-inc-gpt1 actions, the
maximum stick table track ID allowed is tested against ACT_ACTION_TRK_SCMAX. It
is the action number and not the maximum number of stick counters. Instead,
MAX_SESS_STKCTR must be used.

This patch must be backported to all stable versions.
2020-01-20 15:18:45 +01:00
Christopher Faulet
cb5501327c BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
Functions to deinitialize the HTTP rules are buggy. These functions does not
check the action name to release the right part in the arg union. Only few info
are released. For auth rules, the realm is released and there is no problem
here. But the regex <arg.hdr_add.re> is always unconditionally released. So it
is easy to make these functions crash. For instance, with the following rule
HAProxy crashes during the deinit :

      http-request set-map(/path/to/map) %[src] %[req.hdr(X-Value)]

For now, These functions are simply removed and we rely on the deinit function
used for TCP rules (renamed as deinit_act_rules()). This patch fixes the
bug. But arguments used by actions are not released at all, this part will be
addressed later.

This patch must be backported to all stable versions.
2020-01-20 15:18:45 +01:00
Christopher Faulet
1a3e0279c6 BUG/MINOR: http-ana/filters: Wait end of the http_end callback for all filters
Filters may define the "http_end" callback, called at the end of the analysis of
any HTTP messages. It is called at the end of the payload forwarding and it can
interrupt the stream processing. So we must be sure to not remove the XFER_BODY
analyzers while there is still at least filter in progress on this callback.

Unfortunatly, once the request and the response are borh in the DONE or the
TUNNEL mode, we consider the XFER_BODY analyzer has finished its processing on
both sides. So it is possible to prematurely interrupt the execution of the
filters "http_end" callback.

To fix this bug, we switch a message in the ENDING state. It is then switched in
DONE/TUNNEL mode only after the execution of the filters "http_end" callback.

This patch must be backported (and adapted) to 2.1, 2.0 and 1.9. The legacy HTTP
mode shoud probaly be fixed too.
2020-01-20 15:18:45 +01:00
Christopher Faulet
46230363af MINOR: mux-h1: Inherit send flags from the upper layer
Send flags (CO_SFL_*) used when xprt->snd_buf() is called, in h1_send(), are now
inherited from the upper layer, when h1_snd_buf() is called. First, the flag
CO_SFL_MSG_MORE is no more set if the output buffer is full, but only if the
stream-interface decides to set it. It has more info to do it than the
mux. Then, the flag CO_SFL_STREAMER is now also handled this way. It was just
ignored till now.
2020-01-20 15:18:45 +01:00
Christopher Faulet
8178e4006c MINOR: http-htx: Make 'internal.htx_blk_data' return a binary string
This internal sample fetch now returns a binary string (SMP_T_BIN) instead of a
character string.
2020-01-20 15:18:45 +01:00
Christopher Faulet
c5db14c5d4 MINOR: http-htx: Rename 'internal.htx_blk.val' to 'internal.htx_blk.data'
Use a more explicit name for this internal sample fetch.
2020-01-20 15:18:45 +01:00
Christopher Faulet
01f44456e6 MINOR: http-htx: Move htx sample fetches in the scope "internal"
HTX sample fetches are now prefixed by "internal." to explicitly reserve their
uses for debugging or testing purposes.
2020-01-20 15:18:45 +01:00
Ben51Degrees
6bf0672711 BUG/MINOR: 51d: Fix bug when HTX is enabled
When HTX is enabled, the sample flags were set too early. When matching for
multiple HTTP headers, the sample is fetched more than once, meaning that the
flags would need to be set again. Instead, the flags are now set last (just
before the outermost function returns). This could be further improved by
passing around the message without calling prefetch again.

This patch must be backported as far as 1.9. it should fix bug #450.
2020-01-20 14:01:52 +01:00
Tim Duesterhus
fcac33d0c1 BUG/MINOR: dns: Make dns_query_id_seed unsigned
Left shifting of large signed values and negative values is undefined.

In a test script clang's ubsan rightfully complains:

> runtime error: left shift of 1934242336581872173 by 13 places cannot be represented in type 'int64_t' (aka 'long')

This bug was introduced in the initial version of the DNS resolver
in 325137d603. The fix must be backported
to HAProxy 1.6+.
2020-01-18 06:45:54 +01:00
Tim Duesterhus
d34b1ce5a2 BUG/MINOR: cache: Fix leak of cache name in error path
This issue was introduced in commit 99a17a2d91
which first appeared in tag v1.9-dev11. This bugfix should be backported
to HAProxy 1.9+.
2020-01-18 06:45:54 +01:00
Elliot Otchet
71f829767d MINOR: ssl: Add support for returning the dn samples from ssl_(c|f)_(i|s)_dn in LDAP v3 (RFC2253) format.
Modifies the existing sample extraction methods (smp_fetch_ssl_x_i_dn,
smp_fetch_ssl_x_s_dn) to accommodate a third argument that indicates the
DN should be returned in LDAP v3 format. When the third argument is
present, the new function (ssl_sock_get_dn_formatted) is called with
three parameters including the X509_NAME, a buffer containing the format
argument, and a buffer for the output.  If the supplied format matches
the supported format string (currently only "rfc2253" is supported), the
formatted value is extracted into the supplied output buffer using
OpenSSL's X509_NAME_print_ex and BIO_s_mem. 1 is returned when a dn
value is retrieved.  0 is returned when a value is not retrieved.

Argument validation is added to each of the related sample
configurations to ensure the third argument passed is either blank or
"rfc2253" using strcmp.  An error is returned if the third argument is
present with any other value.

Documentation was updated in configuration.txt and it was noted during
preliminary reviews that a CLEANUP patch should follow that adjusts the
documentation.  Currently, this patch and the existing documentation are
copied with some minor revisions for each sample configuration.  It
might be better to have one entry for all of the samples or entries for
each that reference back to a primary entry that explains the sample in
detail.

Special thanks to Chris, Willy, Tim and Aleks for the feedback.

Author: Elliot Otchet <degroens@yahoo.com>
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
2020-01-18 06:42:30 +01:00
Willy Tarreau
ee1a6fc943 MINOR: connection: make the last arg of subscribe() a struct wait_event*
The subscriber used to be passed as a "void *param" that was systematically
cast to a struct wait_event*. By now it appears clear that the subscribe()
call at every layer is well defined and always takes a pointer to an event
subscriber of type wait_event, so let's enforce this in the functions'
prototypes, remove the intermediary variables used to cast it and clean up
the comments to clarify what all these functions do in their context.
2020-01-17 18:30:37 +01:00
Willy Tarreau
8907e4ddb8 MEDIUM: mux-fcgi: merge recv_wait and send_wait event notifications
This is the last of the "recv_wait+send_wait merge" patches and is
functionally equivalent to previous commit "MEDIUM: mux-h2: merge
recv_wait and send_wait event notifications" but for FCGI this time.
The principle is pretty much the same, since the code is very similar.
We use a single wait_event for both recv and send and rely on the
subscribe flags to know the desired notifications.
2020-01-17 18:30:37 +01:00
Willy Tarreau
f96508aae6 MEDIUM: mux-h2: merge recv_wait and send_wait event notifications
This is the continuation of the recv+send event notifications merge
that was started. This patch is less trivial than the previous ones
because the existence of a send event subscription is also used to
decide to put a stream back into the send list.
2020-01-17 18:30:36 +01:00
Willy Tarreau
1b0d4d19fc MEDIUM: mux-h1: merge recv_wait and send_wait
This is the same principle as previous commit, but for the H1 mux this
time. The checks in the subscribe()/unsubscribe() calls were factored
and some BUG_ON() were added to detect unexpected cases.
h1_wake_for_recv() and h1_wake_for_send() needed to be refined to
consider the current subscription before deciding to wake up.
2020-01-17 18:30:36 +01:00
Willy Tarreau
113d52bfb4 MEDIUM: ssl: merge recv_wait and send_wait in ssl_sock
This is the same principle as previous commit, but for ssl_sock.
2020-01-17 18:30:36 +01:00
Willy Tarreau
ac6febd3ae MEDIUM: xprt: merge recv_wait and send_wait in xprt_handshake
This is the same principle as previous commit, but for xprt_handshake.
2020-01-17 18:30:36 +01:00
Willy Tarreau
7872d1fc15 MEDIUM: connection: merge the send_wait and recv_wait entries
In practice all callers use the same wait_event notification for any I/O
so instead of keeping specific code to handle them separately, let's merge
them and it will allow us to create new events later.
2020-01-17 18:30:36 +01:00
Willy Tarreau
062df2c23a MEDIUM: backend: move the connection finalization step to back_handle_st_con()
Currently there's still lots of code in conn_complete_server() that performs
one half of the connection setup, which is then checked and finalized in
back_handle_st_con(). There isn't a valid reason for this anymore, we can
simplify this and make sure that conn_complete_server() only wakes the stream
up to inform it about the fact the whole connection stack is set up so that
back_handle_st_con() finishes its job at the stream-int level.

It looks like the there could even be further simplified, but for now it
was moved straight out of conn_complete_server() with no modification.
2020-01-17 18:30:36 +01:00
Willy Tarreau
3a9312af8f REORG: stream/backend: move backend-specific stuff to backend.c
For more than a decade we've kept all the sess_update_st_*() functions
in stream.c while they're only there to work in relation with what is
currently being done in backend.c (srv_redispatch_connect, connect_server,
etc). Let's move all this pollution over there and take this opportunity
to try to find slightly less confusing names for these old functions
whose role is only to handle transitions from one specific stream-int
state:

  sess_update_st_rdy_tcp() -> back_handle_st_rdy()
  sess_update_st_con_tcp() -> back_handle_st_con()
  sess_update_st_cer()     -> back_handle_st_cer()
  sess_update_stream_int() -> back_try_conn_req()
  sess_prepare_conn_req()  -> back_handle_st_req()
  sess_establish()         -> back_establish()

The last one remained in stream.c because it's more or less a completion
function which does all the initialization expected on a connection
success or failure, can set analysers and emit logs.

The other ones could possibly slightly benefit from being modified to
take a stream-int instead since it's really what they're working with,
but it's unimportant here.
2020-01-17 18:30:36 +01:00
Willy Tarreau
7aad7039e4 MEDIUM: mux-fcgi: do not make an fstrm subscribe to itself on deferred shut
This is the port to FCGI of previous commit "MEDIUM: mux-h2: do not make
an h2s subscribe to itself on deferred shut".

The purpose is to avoid subscribing to the send_wait list when trying to
close, because we'll soon have to merge both recv and send lists. Basic
testing showed no difference (performance nor issues).
2020-01-17 18:30:36 +01:00
Willy Tarreau
5723f295d8 MEDIUM: mux-h2: do not make an h2s subscribe to itself on deferred shut
The logic handling the deferred shutdown is a bit complex because it
involves a wait_event struct in each h2s dedicated to subscribing to
itself when shutdowns are not immediately possible. This implies that
we will not be able to support a shutdown and a receive subscription
in the future when we merge all wait events.

Let's solely rely on the H2_SF_WANT_SHUT_{R,W} flags instead and have
an autonomous tasklet for this. This requires to add a few controls
in the code because now when waking up a stream we need to check if it
is for I/O or just a shut, but since sending and shutting are exclusive
it's not difficult.

One point worth noting is that further resources could be shaved off
by only allocating the tasklet when failing to shut, given that in the
vast majority of streams it will never be used. In fact the sole purpose
of the tasklet is to support calling this code from outside the H2 mux
context. Looking at the code, it seems that not too many adaptations
would be required to have the send_list walking code deal with sending
the shut bits itself and further simplify all this.
2020-01-17 18:30:36 +01:00
Willy Tarreau
f11be0ea1e MEDIUM: mux-fcgi: do not try to stop sending streams on blocked mux
This is essentially the same change as applied to mux-h2 in previous commit
"MEDIUM: mux-h2: do not try to stop sending streams on blocked mux". The
goal is to make sure we don't need to keep the item in the send_wait list
until it's executed so that we can later merge it with the recv_wait list.
No performance changes were observed.
2020-01-17 18:30:36 +01:00
Willy Tarreau
d9464167fa MEDIUM: mux-h2: do not try to stop sending streams on blocked mux
This partially reverts commit d846c267 ("MINOR: h2: Don't run tasks that
are waiting to send if mux in full"). This commit was introduced to
limit the start/stop overhead incurred by waking many streams to let
only a few work. But since commit 9c218e7521 ("MAJOR: mux-h2: switch
to next mux buffer on buffer full condition."), this situation occurs
way less (typically 2000 to 4000 times less often) and the benefits of
the patch above do not outweigh its shortcomings anymore. And commit
c7ce4e3e7f ("BUG/MEDIUM: mux-h2: don't stop sending when crossing a
buffer boundary") addressed a root cause of many unexpected sleeps and
wakeups.

The main problem it's causing is that it requires to keep the element
in the send_wait list until it's executed, leaving the entry in an
uncertain state, and significantly complicating the coexistence of this
list and the wait list dedicated to shutdown. Also it happens that this
call to tasklet_remove_from_task_list() will not be usable anymore once
we start to support streams on different threads. And finally, some of
the other streams that we remove might very well have managed to find
their way to the h2_snd_buf() with an unblocked condition as well so it
is possible that some of these removals were not welcome.

So this patch now makes sure that send_wait is immediately nulled when
the task is woken up, and that we don't have to play with it afterwards.
Since we don't need to stop the tasklets anymore, we don't need the
sending_list that we can remove.

However one very useful benefit of the sending_list was that it used to
provide the information about the fact that the stream already tried to
send and failed. This was an important factor to improve fairness because
late arrived streams should not be allowed to send if others are already
scheduled. So this patch introduces a new per-stream flag H2_SF_NOTIFIED
to distinguish such streams.

With this patch the fairness is preserved, and the ratio of aborted
h2_snd_buf() due to other streams already sending remains quite low
(~0.3-2.1% measured depending on object size, this is within
expectations for 100 independent streams).

If the contention issue the patch above used to address comes up again
in the future, a much better (though more complicated) solution would
be to switch to per-connection buffer pools to distribute between the
connection and the streams so that by default there are more buffers
available for the mux and the streams only have some when the mux's are
unused, i.e. it would push the memory pressure back to the data layer.

One observation made while developing this patch is that when dealing
with large objects we still spend a huge amount of time scanning the
send_list with tasks that are already woken up every time a send()
manages to purge a bit more data. Indeed, by removing the elements
from the list when H2_SF_NOTIFIED is set, the netowrk bandwidth on
1 MB objects fetched over 100 streams per connection increases by 38%.
This was not done here to preserve fairness but is worth studying (e.g.
by keeping a restart pointer on the list or just having a flag indicating
if an entry was added since last scan).
2020-01-17 18:30:36 +01:00
Jerome Magnin
b8bd6d7efd BUILD: pattern: include errno.h
Commit 3c79d4bdc introduced the use of errno in pattern.c without
including errno.h.
If we build haproxy without any option errno is not defined and the
build fails.
2020-01-17 18:30:06 +01:00
Willy Tarreau
3381bf89e3 MEDIUM: connection: get rid of CO_FL_CURR_* flags
These ones used to serve as a set of switches between CO_FL_SOCK_* and
CO_FL_XPRT_*, and now that the SOCK layer is gone, they're always a
copy of the last know CO_FL_XPRT_* ones that is resynchronized before
I/O events by calling conn_refresh_polling_flags(), and that are pushed
back to FDs when detecting changes with conn_xprt_polling_changes().

While these functions are not particularly heavy, what they do is
totally redundant by now because the fd_want_*/fd_stop_*() actions
already perform test-and-set operations to decide to create an entry
or not, so they do the exact same thing that is done by
conn_xprt_polling_changes(). As such it is pointless to call that
one, and given that the only reason to keep CO_FL_CURR_* is to detect
changes there, we can now remove them.

Even if this does only save very few cycles, this removes a significant
complexity that has been responsible for many bugs in the past, including
the last one affecting FreeBSD.

All tests look good, and no performance regressions were observed.
2020-01-17 17:45:12 +01:00
Willy Tarreau
93c9f59a9c MINOR: stream-int: remove dependency on CO_FL_WAIT_ROOM for rcv_buf()
The only case where this made sense was with mux_h1 but Since we
introduced CS_FL_MAY_SPLICE, we don't need to rely on this anymore,
thus we don't need to clear it either when we do not splice.

There is a last check on this flag used to determine if the rx channel
is full and that cannot go away unless it's changed to use the CS
instead but for now this wouldn't add any benefit so better not do
it yet.
2020-01-17 17:24:30 +01:00
Willy Tarreau
e2a0eeca77 MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only
CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared
by the stream-int when splicing is disabled, as well as in
conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could
be attempted by the I/O callbacks called from conn_fd_handler(). This
clearing in conn_refresh_polling_flags() makes no sense anymore and is
in no way related to the polling at all.

Since we don't call them from there anymore it's better to clear it
before attempting to receive, and to set it again later. So let's move
this operation where it should be, in raw_sock_to_pipe() so that it's
now symmetric. It was also placed in raw_sock_to_buf() so that we're
certain that it gets cleared if an attempt to splice is replaced with
a subsequent attempt to recv(). And these were currently already achieved
by the call to conn_refresh_polling_flags(). Now it could theorically be
removed from the stream-int.
2020-01-17 17:19:27 +01:00
Jerome Magnin
3c79d4bdc4 BUG/MINOR: pattern: handle errors from fgets when trying to load patterns
We need to do some error handling after we call fgets to make sure everything
went fine. If we don't users can be fooled into thinking they can load pattens
from directory because cfgparse doesn't flinch. This applies to acl patterns
map files.

This should be backported to all supported versions.
2020-01-17 17:09:50 +01:00
Willy Tarreau
17ccd1a356 BUG/MEDIUM: connection: add a mux flag to indicate splice usability
Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.

What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (that only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message were skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.

All this goes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() inconditionally because, as the commit above showed, this
reduces the forwarding performance by 2 to 3 in TCP and legacy modes
due to data lying in the buffer preventing splicing from being used
later.

The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply be to indicate that more data are awaited
but the real need being to distinguish splicing and receiving, here
instead we clearly report the mux's willingness to be called for splicing
or not. Hence the flag's name, CS_FL_MAY_SPLICE.

The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.

The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.

All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity but definitely need to be thought about, and should
even significantly simplify the data processing layer.

The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunked encoded data, since these ones are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.

This fix must be backported to 2.1 and 2.0.
2020-01-17 17:00:12 +01:00
Jerome Magnin
bee00ad080 BUG/MINOR: stream: don't mistake match rules for store-request rules
In process_sticking_rules() we only want to apply the first store-request
rule for a given table, but when doing so we need to make sure we only
count actual store-request rules when we list the sticking rules.

Failure to do so leads to not being able to write store-request and match
sticking rules in any order as a match rule after a store-request rule
will be ignored.

The following configuration reproduces the issue:

  global
    stats socket /tmp/foobar

  defaults
    mode http

  frontend in
    bind *:8080
    default_backend bar

  backend bar
    server s1 127.0.0.1:21212
    server s2 127.0.0.1:21211
    stick store-request req.hdr(foo)
    stick match req.hdr(foo)
    stick-table type string size 10

  listen foo
    bind *:21212
    bind *:21211
    http-request deny deny_status 200 if { dst_port 21212 }
    http-request deny

This patch fixes issue #448 and should be backported as far as 1.6.
2020-01-16 18:07:52 +01:00
William Lallemand
d308c5e0ce CLEANUP: cli: deduplicate the code in _getsocks
Since the fix 5fd3b28 ("BUG/MEDIUM: cli: _getsocks must send the peers
sockets") for bug #443. The code which sends the socket for the peers
and the proxies is duplicated. This patch move this code in a separated
function.
2020-01-16 16:26:41 +01:00
William Lallemand
5fd3b28c9c BUG/MEDIUM: cli: _getsocks must send the peers sockets
This bug prevents to reload HAProxy when you have both the seamless
reload (-x / expose-fd listeners) and the peers.

Indeed the _getsocks command does not send the FDs of the peers
listeners, so if no reuseport is possible during the bind, the new
process will fail to bind and exits.

With this feature, it is not possible to fallback on the SIGTTOU method
if we didn't receive all the sockets, because you can't close() the
sockets of the new process without closing those of the previous
process, they are the same.

Should fix bug #443.

Must be backported as far as 1.8.
2020-01-16 16:25:22 +01:00
Willy Tarreau
340b07e868 BUG/MAJOR: hashes: fix the signedness of the hash inputs
Wietse Venema reported in the thread below that we have a signedness
issue with our hashes implementations: due to the use of const char*
for the input key that's often text, the crc32, sdbm, djb2, and wt6
algorithms return a platform-dependent value for binary input keys
containing bytes with bit 7 set. This means that an ARM or PPC
platform will hash binary inputs differently from an x86 typically.
Worse, some algorithms are well defined in the industry (like CRC32)
and do not provide the expected result on x86, possibly causing
interoperability issues (e.g. a user-agent would fail to compare the
CRC32 of a message body against the one computed by haproxy).

Fortunately, and contrary to the first impression, the CRC32c variant
used in the PROXY protocol processing is not affected. Thus the impact
remains very limited (the vast majority of input keys are text-based,
such as user-agent headers for exmaple).

This patch addresses the issue by fixing all hash functions' prototypes
(even those not affected, for API consistency). A reg test will follow
in another patch.

The vast majority of users do not use these hashes. And among those
using them, very few will pass them on binary inputs. However, for the
rare ones doing it, this fix MAY have an impact during the upgrade. For
example if the package is upgraded on one LB then on another one, and
the CRC32 of a binary input is used as a stick table key (why?) then
these CRCs will not match between both nodes. Similarly, if
"hash-type ... crc32" is used, LB inconsistency may appear during the
transition. For this reason it is preferable to apply the patch on all
nodes using such hashes at the same time. Systems upgraded via their
distros will likely observe the least impact since they're expected to
be upgraded within a short time frame.

And it is important for distros NOT to skip this fix, in order to avoid
distributing an incompatible implementation of a hash. This is the
reason why this patch is tagged as MAJOR, eventhough it's extremely
unlikely that anyone will ever notice a change at all.

This patch must be backported to all supported branches since the
hashes were introduced in 1.5-dev20 (commit 98634f0c). Some parts
may be dropped since implemented later.

Link to Wietse's report:
  https://marc.info/?l=postfix-users&m=157879464518535&w=2
2020-01-16 08:23:42 +01:00
William Dauchy
bb9da0b8e2 CLEANUP: proxy: simplify proxy_parse_rate_limit proxy checks
rate-limits are valid for both frontend and listen, but not backend; so
we can simplify this check in a similar manner as it is done in e.g
max-keep-alive-queue.

this should fix github issue #449

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-16 07:04:05 +01:00
Olivier Houchard
ac8147446c BUG/MEDIUM: raw_sock: Make sur the fd and conn are sync.
Commit 08fa16e397 made sure
we let the fd layer we didn't want to poll anymore if
we failed to send and sendto() returne EAGAIN.
However, just disabling the polling with fd_stop_send()
while not notifying the connection layer means the
connection layer may believe the polling is activated
and nothing needs to be done when it is wrong.
A better fix is to revamp that whole code, for the
time being, just make sure the fd and connection
layer are properly synchronised.

This should fix the problem recently reported on FreeBSD.
2020-01-15 19:16:23 +01:00
Olivier Houchard
68787ef70a BUG/MEDIUM: mux_h1: Don't call h1_send if we subscribed().
In h1_snd_buf(), only attempt to call h1_send() if we haven't
already subscribed.
It makes no sense to do it if we subscribed, as we know we failed
to send before, and will create a useless call to sendto(), and
in 2.2, the call to raw_sock_from_buf() will disable polling if
it is enabled.

This should be backported to 2.2, 2.1, 2.0 and 1.9.
2020-01-15 19:13:32 +01:00
William Dauchy
8602394040 CLEANUP: compression: remove unused deinit_comp_ctx section
Since commit 27d93c3f94 ("BUG/MAJOR: compression/cache: Make it
really works with these both filters"), we no longer use section
deinit_comp_ctx.

This should fix github issue #441

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-15 10:58:17 +01:00
William Lallemand
24c928c8bd BUG/MEDIUM: mworker: remain in mworker mode during reload
If you reload an haproxy started in master-worker mode with
"master-worker" in the configuration, and no "-W" argument,
the new process lost the fact that is was in master-worker mode
resulting in weird behaviors.

The bigest problem is that if it is reloaded with an bad configuration,
the master will exits instead of remaining in waitpid mode.

This problem was discovered in bug #443.

Should be backported in every version using the master-worker mode.
(as far as 1.8)
2020-01-14 18:10:29 +01:00
William Lallemand
a31b09e982 BUG/MINOR: cli/mworker: can't start haproxy with 2 programs
When trying to start HAProxy with the master CLI and more than one
program in the configuration, it refuses to start with:

[ALERT] 013/132926 (1378) : parsing [cur--1:0] : proxy 'MASTER', another server named 'cur--1' was already defined at line 0, please use distinct names.
[ALERT] 013/132926 (1378) : Fatal errors found in configuration.

The problem is that haproxy tries to create a server for the MASTER
proxy but only the worker are supposed to be in the server list.

Fix issue #446.

Must be backported as far as 2.0.
2020-01-14 15:42:38 +01:00
Willy Tarreau
c7ce4e3e7f BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary
In version 2.0, after commit 9c218e7521 ("MAJOR: mux-h2: switch to next
mux buffer on buffer full condition."), the H2 mux started to use a ring
buffer for the output data in order to reduce competition between streams.
However, one corner case was suboptimally covered: when crossing a buffer
boundary, we have to shrink the outgoing frame size to the one left in
the output buffer, but this shorter size is later used as a signal of
incomplete send due to a buffer full condition (which used to be true when
using a single buffer). As a result, function h2s_frt_make_resp_data()
used to return less than requested, which in turn would cause h2_snd_buf()
to stop sending and leave some unsent data in the buffer, and si_cs_send()
to subscribe for sending more later.

But it goes a bit further than this, because subscribing to send again
causes the mux's send_list not to be empty anymore, hence extra streams
can be denied the access to the mux till the first stream is woken again.
This causes a nasty wakeup-sleep dance between streams that makes it
totally impractical to try to remove the sending list. A test showed
that it was possible to observe 3 million h2_snd_buf() giveups for only
100k requests when using 100 concurrent streams on 20kB objects.

It doesn't seem likely that a stream could get blocked and time out due
to this bug, though it's not possible either to demonstrate the opposite.
One risk is that incompletely sent streams do not have any blocking flags
so they may not be identified as blocked. However on first scan of the
send_list they meet all conditions for a wakeup.

This patch simply allows to continue on a new frame after a partial
frame. with only this change, the number of failed h2_snd_buf() was
divided by 800 (4% of calls). And by slightly increasing the H2C_MBUF_CNT
size, it can go down to zero.

This fix must be backported to 2.1 and 2.0.
2020-01-14 13:55:04 +01:00
Willy Tarreau
f31af9367e MEDIUM: lua: don't call the GC as often when dealing with outgoing connections
In order to properly close connections established from Lua in case
a Lua context dies, the context currently automatically gets a flag
HLUA_MUST_GC set whenever an outgoing connection is used. This causes
the GC to be enforced on the context's death as well as on yield. First,
it does not appear necessary to do it when yielding, since if the
connections die they are already cleaned up. Second, the problem with
the flag is that even if a connection gets properly closed, the flag is
not removed and the GC continues to be called on the Lua context.

The impact on performance looks quite significant, as noticed and
diagnosed by Sadasiva Gujjarlapudi in the following thread:

  https://www.mail-archive.com/haproxy@formilux.org/msg35810.html

This patch changes the flag for a counter so that each created
connection increments it and each cleanly closed connection decrements
it. That way we know we have to call the GC on the context's death only
if the count is non-null. As reported in the thread above, the Lua
performance gain is now over 20% by doing this.

Thanks to Sada and Thierry for the design discussion and tests that
led to this solution.
2020-01-14 10:12:31 +01:00
William Dauchy
9a8ef7f51d CLEANUP: ssl: remove opendir call in ssl_sock_load_cert
Since commit 3180f7b554 ("MINOR: ssl: load certificates in
alphabetical order"), `readdir` was replaced by `scandir`. We can indeed
replace it with a check on the previous `stat` call.

This micro cleanup can be a good benefit when you have hundreds of bind
lines which open TLS certificates directories in terms of syscall,
especially in a case of frequent reloads.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-13 19:51:52 +01:00
Willy Tarreau
70c5b0e5fd BUG/MEDIUM: mux-h2: fix missing test on sending_list in previous patch
Previous commit 989539b048 ("BUG/MINOR: mux-h2: use a safe
list_for_each_entry in h2_send()") accidently lost its sending_list test,
resulting in some elements to be woken up again while already in the
sending_list and h2_unsubscribe() crashing on integrity tests (only
when built with DEBUG_DEV).

If the fix above is backported this one must be as well.
2020-01-10 18:20:15 +01:00
Willy Tarreau
989539b048 BUG/MINOR: mux-h2: use a safe list_for_each_entry in h2_send()
h2_send() uses list_for_each_entry() to scan paused streams and resume
them, but happily deletes any leftover from a previous failed unsubscribe,
which is obviously not safe and would corrupt the list. In practice this
is a proof that this doesn't happen, but it's not the best way to prove it.
In order to fix this and reduce the maintenance burden caused by code
duplication (this list walk exists at 3 places), let's introduce a new
function h2_resume_each_sending_h2s() doing exactly this and use it at
all 3 places.

This bug was introduced as a side effect of fix 998410a41b ("BUG/MEDIUM:
h2: Revamp the way send subscriptions works.") so it should be backported
as far as 1.9.
2020-01-10 17:18:32 +01:00
Christopher Faulet
48726b78e5 BUG/MINOR: stream-int: Don't trigger L7 retry if max retries is already reached
When an HTTP response is received, at the stream-interface level, if a L7 retry
must be triggered because of the status code, the response is trashed and a read
error is reported on the response channel. Then the stream handles this error
and perform the retry. Except if the maximum connection retries is reached. In
this case, an error is reported. Because the server response was already trashed
by the stream-interface, a generic 502 error is returned to the client instead
of the server's one.

Now, the stream-interface triggers a L7 retry only if the maximum connection
retries is not already reached. Thus, at the end, the last server's response is
returned.

This patch must be backported to 2.1 and 2.0. It should fix the issue #439.
2020-01-09 15:39:06 +01:00
William Dauchy
7675c720f8 CLEANUP: server: remove unused err section in server_finalize_init
Since commit 980855bd95 ("BUG/MEDIUM: server: initialize the orphaned
conns lists and tasks at the end"), we no longer use err section.

This should fix github issue #438

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-09 05:54:48 +01:00
Willy Tarreau
eeea8082a8 BUG/MAJOR: listener: do not schedule a task-less proxy
Apparently seamingless commit 0591bf7deb ("MINOR: listener: make the
wait paths cleaner and more reliable") caused a nasty regression and
revealed a rare race that hits regtest stickiness/lb-services.vtc
about 4% of the times for 8 threads.

The problem is that when a multi-threaded listener wakes up on an
incoming connection, several threads can receive the event, especially
when idle. And all of them will race to accept the connections in
parallel, adjusting the listener's nbconn and proxy's feconn until
one reaches the proxy's limit and declines. At this step the changes
are cancelled, the listener is marked "limited", and when the threads
exit the function, one of them will unlimit the listener/proxy again
so that it can accept incoming connections again.

The problem happens when many threads connect to a small peers section
because its maxconn is very limited (typically 6 for 2 peers), and it's
sometimes possible for enough competing threads to hit the limit and
one of them will limit the listener and queue the proxy's task... except
that peers do not initialize their proxy task since they do not use rate
limiting. Thus the process crashes when doing task_schedule(p->task).
Prior to the cleanup patch above, this didn't happen because the error
path that was dedicated to only limiting the listener did not call
task_schedule(p->task).

Given that the proxy's task is optional, and that the expire value
passed there is always TICK_ETERNITY, it's sufficient and reasonable to
avoid calling this task_schedule() when expire is not set. And for long
term safety we can also avoid to do it when the task is not set. A first
fix consisted in allocating a task for the peers proxies but it's never
used and would eat resources for reason.

No backport is needed as this commit was only merged into 2.2.
2020-01-08 19:39:09 +01:00
William Dauchy
cd7fa3dcfc CLEANUP: mux-h2: remove unused goto "out_free_h2s"
Since commit fa8aa867b9 ("MEDIUM: connections: Change struct
wait_list to wait_event.") we no longer use this section.

this should fix github issue #437

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-08 16:16:19 +01:00
Florian Tham
9205fea13a MINOR: http: Add 404 to http-request deny
This patch adds http status code 404 Not Found to http-request deny. See
issue #80.
2020-01-08 16:15:23 +01:00
Florian Tham
272e29b5cc MINOR: http: Add 410 to http-request deny
This patch adds http status code 410 Gone to http-request deny. See
issue #80.
2020-01-08 16:15:23 +01:00
Willy Tarreau
08fa16e397 MINOR: raw_sock: make sure to disable polling once everything is sent
Analysing traces revealed a rare but surprizing pattern :

    connect()  = -1 EAGAIN
    send()     = success
    epoll_ctl(ADD, EPOLLOUT)
    epoll_wait()
    recvfrom() = success
    close()

What happens is that the failed connect() creates an FD update for pollout,
but the successful synchronous send() doesn't disable it because polling was
only disabled in the FD handler. But a successful synchronous connect()
cancellation is a good opportunity to disable polling before it's effectively
enabled in the next loop, so better disable it when reaching the end. The
cost is very low if it was already disabled anyway (one atomic op).

This only affects local connections but with this the typical number of
epoll_ctl() calls per connection dropped from ~4.2 to ~3.8 for plain TCP
and 10k transfers.
2020-01-08 09:59:40 +01:00
Willy Tarreau
0eae6323bf MEDIUM: dns: implement synchronous send
In dns_send_query(), there's no point in first waking up the FD, to get
called back by the poller to send the request and sleep. Instead let's
simply send the request as soon as it's known and only subscribe to the
poller when the socket buffers are full and it's required to poll (i.e.
almost never).

This significantly reduces the number of calls to the poller. A large
config sees the number of epoll_ctl() calls reduced from 577 to 7 over
10 seconds, the number of recvfrom() from 1533 to 582 and the number of
sendto() from 369 to 162.

It also has the extra benefit of building each requests only once per
resolution and sending it to multiple resolvers instead of rebuilding
it for each and every resolver.

This will reduce the risk of seeing situations similar to bug #416 in
the future.
2020-01-08 06:10:38 +01:00
Willy Tarreau
e5891ca6c1 BUG/MEDIUM: session: do not report a failure when rejecting a session
In session_accept_fd() we can perform a synchronous call to
conn_complete_session() and if it succeeds the connection is accepted
and turned into a session. If it fails we take it as an error while it
is not, in this case, it's just that a tcp-request rule has decided to
reject the incoming connection. The problem with reporting such an event
as an error is that the failed status is passed down to the listener code
which decides to disable accept() for 100ms in order to leave some time
for transient issues to vanish, and that's not what we want to do here.

This fix must be backported as far as 1.7. In 1.7 the code is a bit
different as tcp_exec_l5_rules() is called directly from within
session_new_fd() and ret=0 must be assigned there.
2020-01-07 18:15:32 +01:00
Christopher Faulet
584348be63 BUG/MINOR: channel: inject output data at the end of output
In co_inject(), data must be inserted at the end of output, not the end of
input. For the record, this function does not take care of input data which are
supposed to not exist. But the caller may reset input data after or before the
call. It is its own choice.

This bug, among other effects, is visible when a redirect is performed on
the response path, on legacy HTTP mode (so for HAProxy < 2.1). The redirect
response is appended after the server response when it should overwrite it.

Thanks to Kevin Zhu <ip0tcp@gmail.com> to report the bug. It must be backported
as far as 1.9.
2020-01-07 10:51:15 +01:00
Kevin Zhu
96b363963f BUG/MEDIUM: http-ana: Truncate the response when a redirect rule is applied
When a redirect rule is executed on the response path, we must truncate the
received response. Otherwise, the redirect is appended after the response, which
is sent to the client. So it is obviously a bug because the redirect is not
performed. With bodyless responses, it is the "only" bug. But if the response
has a body, the result may be invalid. If the payload is not fully received yet
when the redirect is performed, an internal error is reported.

It must be backported as far as 1.9.
2020-01-07 10:50:28 +01:00
Christopher Faulet
47a7210b9d BUG/MINOR: proxy: Fix input data copy when an error is captured
In proxy_capture_error(), input data are copied in the error snapshot. The copy
must take care of the data wrapping. But the length of the first block is
wrong. It should be the amount of contiguous input data that can be copied
starting from the input's beginning. But the mininum between the input length
and the buffer size minus the input length is used instead. So it is a problem
if input data are wrapping or if more than the half of the buffer is used by
input data.

This patch must be backported as far as 1.9.
2020-01-06 13:58:30 +01:00
Christopher Faulet
1703478e2d BUG/MINOR: h1: Report the right error position when a header value is invalid
During H1 messages parsing, when the parser has finished to parse a full header
line, some tests are performed on its value, depending on its name, to be sure
it is valid. The content-length is checked and converted in integer and the host
header is also checked. If an error occurred during this step, the error
position must point on the header value. But from the parser point of view, we
are already on the start of the next header. Thus the effective reported
position in the error capture is the beginning of the unparsed header line. It
is a bit confusing when we try to figure out why a message is rejected.

Now, the parser state is updated to point on the invalid value. This way, the
error position really points on the right position.

This patch must be backported as far as 1.9.
2020-01-06 13:58:21 +01:00
Olivier Houchard
7f4f7f140f MINOR: ssl: Remove unused variable "need_out".
The "need_out" variable was used to let the ssl code know we're done
reading early data, and we should start the handshake.
Now that the handshake function is responsible for taking care of reading
early data, all that logic has been removed from ssl_sock_to_buf(), but
need_out was forgotten, and left. Remove it know.
This patch was submitted by William Dauchy <w.dauchy@criteo.com>, and should
fix github issue #434.
This should be backported to 2.0 and 2.1.
2020-01-05 16:45:14 +01:00
William Dauchy
3894d97fb8 MINOR: config: disable busy polling on old processes
in the context of seamless reload and busy polling, older processes will
create unecessary cpu conflicts; we can assume there is no need for busy
polling for old processes which are waiting to be terminated.

This patch is not a bug fix itself but might be a good stability
improvment when you are un the context of frequent seamless reloads with
a high "hard-stop-after" value; for that reasons I think this patch
should be backported in all 2.x versions.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-02 10:29:49 +01:00
Olivier Houchard
140237471e BUG/MEDIUM: connections: Hold the lock when wanting to kill a connection.
In connect_server(), when we decide we want to kill the connection of
another thread because there are too many idle connections, hold the
toremove_lock of the corresponding thread, othervise, there's a small race
condition where we could try to add the connection to the toremove_connections
list while it has already been free'd.

This should be backported to 2.0 and 2.1.
2019-12-30 18:18:28 +01:00
Olivier Houchard
37d7897aaf BUG/MEDIUM: checks: Only attempt to do handshakes if the connection is ready.
When creating a new check connection, only attempt to add an handshake
connection if the connection has fully been initialized. It can not be the
case if a DNS resolution is still pending, and thus we don't yet have the
address for the server, as the handshake code assumes the connection is fully
initialized and would otherwise crash.
This is not ideal, the check shouldn't probably run until we have an address,
as it leads to check failures with "Socket error".
While I'm there, also add an xprt handshake if we're using socks4, otherwise
checks wouldn't be able to use socks4 properly.
This should fix github issue #430

This should be backported to 2.0 and 2.1.
2019-12-30 15:18:16 +01:00
Willy Tarreau
5d7dcc2a8e OPTIM: epoll: always poll for recv if neither active nor ready
The cost of enabling polling in one direction with epoll is very high
because it requires one syscall per FD and per direction change. In
addition we don't know about input readiness until we either try to
receive() or enable polling and watch the result. With HTTP keep-alive,
both are equally expensive as it's very uncommon to see the server
instantly respond (unless it's a second stage of the same process on
localhost, which has become much less common with threads).

But when a connection is established it's also quite usual to have to
poll for sending (except on localhost or UNIX sockets where it almost
always instantly works). So this cost of polling could be factored out
with the second step if both were enabled together.

This is the idea behind this patch. What it does is to always enable
polling for Rx if it's not ready and at least one direction is active.
This means that if it's not explicitly disabled, or if it was but in a
state that causes the loss of the information (rx ready cannot be
guessed), then let's take any opportunity for a polling change to
enable it at the same time, and learn about rx readiness for free.

In addition the FD never gets unregistered for Rx unless it's ready
and was blocked (buffer full). This avoids a lot of the flip-flop
behaviour at beginning and end of requests.

On a test with 10k requests in keep-alive, the difference is quite
noticeable:

Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.67    0.010847           0     20078           epoll_ctl
 16.33    0.002117           0      2231           epoll_wait
  0.00    0.000000           0        20        20 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.012964                 22329        20 total

After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.35    0.003351           1      2644           epoll_wait
  2.36    0.000082           4        20        20 connect
  1.29    0.000045           0        66           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.003478                  2730        20 total

It may also save a recvfrom() after connect() by changing the following
sequence, effectively saving one epoll_ctl() and one recvfrom() :

           before              |            after
  -----------------------------+----------------------------
  - connect()                  |  - connect()
  - epoll_ctl(add,out)         |  - epoll_ctl(add, in|out)
  - sendto()                   |  - epoll_wait() = out
  - epoll_ctl(mod,in|out)      |  - send()
  - epoll_wait() = out         |  - epoll_wait() = in|out
  - recvfrom() = EAGAIN        |  - recvfrom() = OK
  - epoll_ctl(mod,in)          |  - recvfrom() = EAGAIN
  - epoll_wait() = in          |  - epoll_ctl(mod, in)
  - recvfrom() = OK            |  - epoll_wait()
  - recvfrom() = EAGAIN        |
  - epoll_wait()               |
    (...)

Now on a 10M req test on 16 threads with 2k concurrent conns and 415kreq/s,
we see 190k updates total and 14k epoll_ctl() only.
2019-12-27 16:38:47 +01:00
Willy Tarreau
0fbc318e24 CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE
Both flags became equal in commit 82967bf9 ("MINOR: connection: adjust
CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the
overlap between xprt_done_cb() and wake() after the removal of the DATA
specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the
"_DONE" version already covers everything and explains the intent well
enough.
2019-12-27 16:38:47 +01:00
Willy Tarreau
cbcf77edb7 MINOR: connection: remove the double test on xprt_done_cb()
The conn_fd_handler used to have one possible call to this function to
notify about end of handshakes, and another one to notify about connection
setup or error. But given that we're now only performing wakeup calls
after connection validation, we don't need to keep two places to run
this test since the conditions do not change in between.

This patch merges the two tests into a single one and moves the
CO_FL_CONNECTED test appropriately as well so that it's called even
on the error path if needed.
2019-12-27 16:38:47 +01:00
Willy Tarreau
b2a7ab08a8 MINOR: connection: check for connection validation earlier
In conn_fd_handler() we used to first give a chance to the send()
callback to try to send data and validate the connection at the same
time. But since 1.9 we do not call this callback anymore inline, it's
scheduled. So let's validate the connection ealier so that all other
decisions can be taken based on this confirmation. This may notably
be useful to the xprt_done_cb() to know that the connection was
properly validated.
2019-12-27 16:38:47 +01:00
Willy Tarreau
4970e5adb7 REORG: connection: move tcp_connect_probe() to conn_fd_check()
The function is not TCP-specific at all, it covers all FD-based sockets
so let's move this where other similar functions are, in connection.c,
and rename it conn_fd_check().
2019-12-27 16:38:43 +01:00
Willy Tarreau
7deff246ce MEDIUM: tcp: make tcp_connect_probe() consider ERR/HUP
Now that we know what pollers can return ERR/HUP, we can take this
into account to save one syscall: with such a poller, if neither are
reported, then we know the connection succeeded and we don't need to
go with getsockopt() nor connect() to validate this. In addition, for
the remaining cases (select() or suspected errors), we'll always go
through the extra connect() attempt and enumerate possible "in progress",
"connected" or "failed" status codes and take action solely based on
this.

This results in one saved syscall on modern pollers, only a second
connect() still being used on select() and the server's address never
being needed anymore.

Note that we cannot safely replace connect() with getsockopt() as the
latter clears the error on the socket without saving it, and health
checks rely on it for their reporting. This would be OK if the error
was saved in the connection itself.
2019-12-27 16:38:04 +01:00
Willy Tarreau
11ef0837af MINOR: pollers: add a new flag to indicate pollers reporting ERR & HUP
In practice it's all pollers except select(). It turns out that we're
keeping some legacy code only for select and enforcing it on all
pollers, let's offer the pollers the ability to declare that they
do not need that.
2019-12-27 14:04:33 +01:00
Willy Tarreau
8081abe26a CLEANUP: connection: conn->xprt is never NULL
Let's remove this outdated test that's been there since 1.5. For quite
some time now xprt hasn't been NULL anymore on an initialized connection.
2019-12-27 14:04:33 +01:00
Willy Tarreau
70ccb2cddf BUG/MINOR: connection: only wake send/recv callbacks if the FD is active
Since commit c3df4507fa ("MEDIUM: connections: Wake the upper layer even
if sending/receiving is disabled.") the send/recv callbacks are called
on I/O if the FD is ready and not just if it's active. This means that
in some situations (e.g. send ready but nothing to send) we may
needlessly enter the if() block, notice we're not subscribed, set
io_available=1 and call the wake() callback even if we're just called
for read activity. Better make sure we only do this when the FD is
active in that direction..

This may be backported as far as 2.0 though it should remain under
observation for a few weeks first as the risk of harm by a mistake
is higher than the trouble it should cause.
2019-12-27 14:04:33 +01:00
Willy Tarreau
c8dc20a825 BUG/MINOR: checks: refine which errno values are really errors.
Two regtest regularly fail in a random fashion depending on the machine's
load (one could really wonder if it's really worth keeping such
unreproducible tests) :
  - tcp-check_multiple_ports.vtc
  - 4be_1srv_smtpchk_httpchk_layer47errors.vtc

It happens that one of the reason is the time it takes to connect to
the local socket (hence the load-dependent aspect): if connect() on the
loopback returns EINPROGRESS then this status is reported instead of a
real error. Normally such a test is expected to see the error cleaned
by tcp_connect_probe() but it really depends on the timing and instead
we may very well send() first and see this error. The problem is that
everything is collected based on errno, hoping it won't get molested
in the way from the last unsuccesful syscall to wake_srv_chk(), which
obviously is hard to guarantee.

This patch at least makes sure that a few non-errors are reported as
zero just like EAGAIN. It doesn't fix the root cause but makes it less
likely to report incorrect failures.

This fix could be backported as far as 1.9.
2019-12-27 14:04:33 +01:00
Lukas Tribus
a26d1e1324 BUILD: ssl: improve SSL_CTX_set_ecdh_auto compatibility
SSL_CTX_set_ecdh_auto() is not defined when OpenSSL 1.1.1 is compiled
with the no-deprecated option. Remove existing, incomplete guards and
add a compatibility macro in openssl-compat.h, just as OpenSSL does:

bf4006a6f9/include/openssl/ssl.h (L1486)

This should be backported as far as 2.0 and probably even 1.9.
2019-12-21 06:46:55 +01:00
Christopher Faulet
eec7f8ac01 BUG/MEDIUM: stream: Be sure to never assign a TCP backend to an HTX stream
With a TCP frontend, it is possible to upgrade a connection to HTTP when the
backend is in HTTP mode. Concretly the upgrade install a new mux. So, once it is
done, the downgrade to TCP is no longer possible. So we must take care to never
assign a TCP backend to a stream on this connection. Otherwise, HAProxy crashes
because raw data from the server are handled as structured data on the client
side.

This patch fixes the issue #420. It must be backported to all versions
supporting the HTX.
2019-12-20 18:09:49 +01:00
Christopher Faulet
6716cc2b93 BUG/MAJOR: mux-h1: Don't pretend the input channel's buffer is full if empty
A regression was introduced by the commit 76014fd1 ("MEDIUM: h1-htx: Add HTX EOM
block when the message is in H1_MSG_DONE state"). When nothing is copied in the
channel's buffer when the input message is parsed, we erroneously pretend it is
because there is not enough room by setting the CS_FL_WANT_ROOM flag on the
conn-stream. This happens when a partial request is parsed. Because of this
flag, we never try anymore to get input data from the mux because we first wait
for more room in the channel's buffer, which is empty. Because of this bug, it
is pretty easy to freeze a h1 connection.

To fix the bug, we must obsiously set the CS_FL_WANT_ROOM flag only when there
are still data to transfer while the channel's buffer is not empty.

This patch must be backported if the patch 76014fd1 is backported too. So for
now, no backport needed.
2019-12-20 18:09:19 +01:00
Willy Tarreau
ca7a5af664 BUG/MINOR: state-file: do not leak memory on parse errors
Issue #417 reports a possible memory leak in the state-file loading code.
There's one such place in the loop which corresponds to parsing errors
where the curreently allocated line is not freed when dropped. In any
case this is very minor in that no more than the file's length may be
lost in the worst case, considering that the whole file is kept anyway
in case of success. This fix addresses this.

It should be backported to 2.1.
2019-12-20 17:33:05 +01:00
Willy Tarreau
fd1aa01f72 BUG/MINOR: state-file: do not store duplicates in the global tree
The global state file tree isn't configured for unique keys, so if an
entry appears multiple times, e.g. due to a bogus script that concatenates
entries multiple times, this will needlessly eat memory. Let's just drop
duplicates.

This should be backported to 2.1.
2019-12-20 17:23:40 +01:00
Willy Tarreau
7d6a1fa311 BUG/MEDIUM: state-file: do not allocate a full buffer for each server entry
Starting haproxy with a state file of 700k servers eats 11.2 GB of RAM
due to a mistake in the function that loads the strings into a tree: it
allocates a full buffer for each backend+server name instead of allocating
just the required string. By just fixing this we're down to 80 MB.

This should be backported to 2.1.
2019-12-20 17:18:13 +01:00
Olivier Houchard
fc51f0f588 BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd
There's a very hard-to-trigger bug in the FD list code where the
fd_add_to_fd_list() function assumes that if the FD it's trying to add
is already locked, it's in the process of being added. Unfortunately, it
can also be in the process of being removed. It is very hard to trigger
because it requires that one thread is removing the FD while another one
is adding it. First very few FDs run on multiple threads (listeners and
DNS), and second, it does not make sense to add and remove the FD at the
same time.

In practice the DNS code built on the older callback-only model does
perform bursts of fd_want_send() for all resolvers at once when it wants
to send a new query (dns_send_query()). And this is more likely to happen
when here are lots of resolutions in parallel and many resolvers, because
the dns_response_recv() callback can also trigger a series of queries on
all resolvers for each invalid response it receives. This means that it
really is perfectly possible to both stop and start in parallel during
short periods of time there.

This issue was not reported before 2.1, but 2.1 had the FD cache, built
on the exact same code base. It's very possible that the issue caused
exactly the opposite situation, where an event was occasionally lost,
causing a DNS retry that worked, and nobody noticing the problem in the
end. In 2.1 the lost entries are the updates asking for not polling for
writes anymore, and the effect is that the poller contiuously reports
writability on the socket when the issue happens.

This patch fixes bug #416 and must be backported as far as 1.8, and
absolutely requires that previous commit "MINOR: fd/threads: make
_GET_NEXT()/_GET_PREV() use the volatile attribute" is backported as
well otherwise it will make the issue worse.

Special thanks to Julien Pivotto for setting up a reliable reproducer
for this difficult issue.
2019-12-20 08:09:28 +01:00
Willy Tarreau
337fb719ee MINOR: fd/threads: make _GET_NEXT()/_GET_PREV() use the volatile attribute
These macros are either used between atomic ops which cause the volatile
to be implicit, or with an explicit volatile cast. However not having it
in the macro causes some traps in the code because certain loop paths
cannot safely be used without risking infinite loops if one isn't careful
enough.

Let's place the volatile attribute inside the macros and remove them from
the explicit places to avoid this. It was verified that the output executable
remains exactly the same byte-wise.
2019-12-20 08:09:28 +01:00
Olivier Houchard
54907bb848 BUG/MEDIUM: ssl: Revamp the way early data are handled.
Instead of attempting to read the early data only when the upper layer asks
for data, allocate a temporary buffer, stored in the ssl_sock_ctx, and put
all the early data in there. Requiring that the upper layer takes care of it
means that if for some reason the upper layer wants to emit data before it
has totally read the early data, we will be stuck forever.

This should be backported to 2.1 and 2.0.
This may fix github issue #411.
2019-12-19 15:22:04 +01:00
Willy Tarreau
dd0e89a084 BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing
Since 1.9 with commit b20aa9eef3 ("MAJOR: tasks: create per-thread wait
queues") a task bound to a single thread will not use locks when being
queued or dequeued because the wait queue is assumed to be the owner
thread's.

But there exists a rare situation where this is not true: the health
check tasks may be running on one thread waiting for a response, and
may in parallel be requeued by another thread calling health_adjust()
after a detecting a response error in traffic when "observe l7" is set,
and "fastinter" is lower than "inter", requiring to shorten the running
check's timeout. In this case, the task being requeued was present in
another thread's wait queue, thus opening a race during task_unlink_wq(),
and gets requeued into the calling thread's wait queue instead of the
running one's, opening a second race here.

This patch aims at protecting against the risk of calling task_unlink_wq()
from one thread while the task is queued on another thread, hence unlocked,
by introducing a new TASK_SHARED_WQ flag.

This new flag indicates that a task's position in the wait queue may be
adjusted by other threads than then one currently executing it. This means
that such WQ manipulations must be performed under a lock. There are two
types of such tasks:
  - the global ones, using the global wait queue (technically speaking,
    those whose thread_mask has at least 2 bits set).
  - some local ones, which for now will be placed into the global wait
    queue as well in order to benefit from its lock.

The flag is automatically set on initialization if the task's thread mask
indicates more than one thread. The caller must also set it if it intends
to let other threads update the task's expiration delay (e.g. delegated
I/Os), or if it intends to change the task's affinity over time as this
could lead to the same situation.

Right now only the situation described above seems to be affected by this
issue, and it is very difficult to trigger, and even then, will often have
no visible effect beyond stopping the checks for example once the race is
met. On my laptop it is feasible with the following config, chained to
httpterm:

    global
        maxconn 400 # provoke FD errors, calling health_adjust()

    defaults
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s

    listen px
        bind :8001
        option httpchk /?t=50
        server sback 127.0.0.1:8000 backup
        server-template s 0-999 127.0.0.1:8000 check port 8001 inter 100 fastinter 10 observe layer7

This patch will automatically address the case for the checks because
check tasks are created with multiple threads bound and will get the
TASK_SHARED_WQ flag set.

If in the future more tasks need to rely on this (multi-threaded muxes
for example) and the use of the global wait queue becomes a bottleneck
again, then it should not be too difficult to place locks on the local
wait queues and queue the task on its bound thread.

This patch needs to be backported to 2.1, 2.0 and 1.9. It depends on
previous patch "MINOR: task: only check TASK_WOKEN_ANY to decide to
requeue a task".

Many thanks to William Dauchy for providing detailed traces allowing to
spot the problem.
2019-12-19 14:42:22 +01:00
Willy Tarreau
8fe4253bf6 MINOR: task: only check TASK_WOKEN_ANY to decide to requeue a task
After processing a task, its RUNNING bit is cleared and at the same time
we check for other bits to decide whether to requeue the task or not. It
happens that we only want to check the TASK_WOKEN_* bits, because :
  - TASK_RUNNING was just cleared
  - TASK_GLOBAL and TASK_QUEUE cannot be set yet as the task was running,
    preventing it from being requeued

It's important not to catch yet undefined flags there because it would
prevent addition of new task flags. This also shows more clearly that
waking a task up with flags 0 is not something safe to do as the task
will not be woken up if it's already running.
2019-12-19 14:42:22 +01:00
Willy Tarreau
262c3f1a00 MINOR: http: add a new "replace-path" action
This action is very similar to "replace-uri" except that it only acts on the
path component. This is assumed to better match users' expectations when they
used to rely on "replace-uri" in HTTP/1 because mostly origin forms were used
in H1 while mostly absolute URI form is used in H2, and their rules very often
start with a '/', and as such do not match.

It could help users to get this backported to 2.0 and 2.1.
2019-12-19 09:24:57 +01:00
Willy Tarreau
0851fd5eef MINOR: debug: support logging to various sinks
As discussed in the thread below [1], the debug converter is currently
not of much use given that it's only built when DEBUG_EXPR is set, and
it is limited to stderr only.

This patch changes this to make it take an optional prefix and an optional
target sink so that it can log to stdout, stderr or a ring buffer. The
default output is the "buf0" ring buffer, that can be consulted from the
CLI.

[1] https://www.mail-archive.com/haproxy@formilux.org/msg35671.html

Note: if this patch is backported, it also requires the following commit to
work: 46dfd78cbf ("BUG/MINOR: sample: always check converters' arguments").
2019-12-19 09:19:13 +01:00
William Lallemand
ba22e901b3 BUG/MINOR: ssl/cli: fix build for openssl < 1.0.2
Commit d4f946c ("MINOR: ssl/cli: 'show ssl cert' give information on the
certificates") introduced a build issue with openssl version < 1.0.2
because it uses the certificate bundles.
2019-12-18 20:40:20 +01:00
William Lallemand
d4f946c469 MINOR: ssl/cli: 'show ssl cert' give information on the certificates
Implement the 'show ssl cert' command on the CLI which list the frontend
certificates. With a certificate name in parameter it will show more
details.
2019-12-18 18:16:34 +01:00
Olivier Houchard
545989f37f BUG/MEDIUM: ssl: Don't set the max early data we can receive too early.
When accepting the max early data, don't set it on the SSL_CTX while parsing
the configuration, as at this point global.tune.maxrewrite may still be -1,
either because it was not set, or because it hasn't been set yet. Instead,
set it for each connection, just after we created the new SSL.
Not doing so meant that we could pretend to accept early data bigger than one
of our buffer.

This should be backported to 2.1, 2.0, 1.9 and 1.8.
2019-12-17 15:45:38 +01:00
Tim Duesterhus
cd3732456b MINOR: sample: Validate the number of bits for the sha2 converter
Instead of failing the conversion when an invalid number of bits is
given the sha2 converter now fails with an appropriate error message
during startup.

The sha2 converter was introduced in d437630237,
which is in 2.1 and higher.
2019-12-17 13:28:00 +01:00
Willy Tarreau
46dfd78cbf BUG/MINOR: sample: always check converters' arguments
In 1.5-dev20, sample-fetch arguments parsing was addresse by commit
689a1df0a1 ("BUG/MEDIUM: sample: simplify and fix the argument parsing").
The issue was that argument checks were not run for sample-fetches if
parenthesis were not present. Surprisingly, the fix was mde only for
sample-fetches and not for converters which suffer from the exact same
problem. There are even a few comments in the code mentioning that some
argument validation functions are not called when arguments are missing.

This fix applies the exact same method as the one above. The impact of
this bug is limited because over the years the code has learned to work
around this issue instead of fixing it.

This may be backported to all maintained versions.
2019-12-17 10:44:49 +01:00
Willy Tarreau
5060326798 BUG/MINOR: sample: fix the closing bracket and LF in the debug converter
The closing bracket was emitted for the "debug" converter even when the
opening one was not sent, and the new line was not always emitted. Let's
fix this. This is harmless since this converter is not built by default.
2019-12-17 09:04:38 +01:00
Christopher Faulet
29f7284333 MINOR: http-htx: Add some htx sample fetches for debugging purpose
These sample fetches are internal and must be used for debugging purpose. Idea
is to have a way to add some checks on the HTX content from http rules. The main
purpose is to ease reg-tests writing.
2019-12-11 16:46:16 +01:00
Christopher Faulet
76014fd118 MEDIUM: h1-htx: Add HTX EOM block when the message is in H1_MSG_DONE state
During H1 parsing, the HTX EOM block is added before switching the message state
to H1_MSG_DONE. It is an exception in the way to convert an H1 message to
HTX. Except for this block, the message is first switched to the right state
before starting to add the corresponding HTX blocks. For instance, the message
is switched in H1_MSG_DATA state and then the HTX DATA blocks are added.

With this patch, the message is switched to the H1_MSG_DONE state when all data
blocks or trailers were processed. It is the caller responsibility to call
h1_parse_msg_eom() when the H1_MSG_DONE state is reached. This way, it is far
easier to catch failures when the HTX buffer is full.

The H1 and FCGI muxes have been updated accordingly.

This patch may eventually be backported to 2.1 if it helps other backports.
2019-12-11 16:46:16 +01:00
Willy Tarreau
719e07c989 BUILD/MINOR: unix sockets: silence an absurd gcc warning about strncpy()
Apparently gcc developers decided that strncpy() semantics are no longer
valid and now deserve a warning, especially if used exactly as designed.
This results in issue #304. Let's just remove one to the target size to
please her majesty gcc, the God of C Compilers, who tries hard to make
users completely eliminate any use of string.h and reimplement it by
themselves at much higher risks. Pfff....

This can be backported to stable version, the fix is harmless since it
ignores the last zero that is already set on next line.
2019-12-11 16:29:10 +01:00
Willy Tarreau
2444108f16 BUG/MINOR: server: make "agent-addr" work on default-server line
As reported in issue #408, "agent-addr" doesn't work on default-server
lines. This is due to the transcription of the old "addr" option in commit
6e5e0d8f9e ("MINOR: server: Make 'default-server' support 'addr' keyword.")
which correctly assigns it to the check.addr and agent.addr fields, but
which also copies the default check.addr into both the check's and the
agent's addr fields. Thus the default agent's address is never used.

This fix makes sure to copy the check from the check and the agent from
the agent. However it's worth noting that if "addr" is specified on the
server line, it will still overwrite both the check and the agent's
addresses.

This must be backported as far as 1.8.
2019-12-11 15:43:45 +01:00
Willy Tarreau
cdcba115b8 BUG/MINOR: listener: do not immediately resume on transient error
The listener supports a "transient error" situation, which corresponds
to those situations where accept fails badly but poll() reports an event.
This happens for example when a listener is paused, or on out of FD. The
same mechanism is used when facing a maxconn or maxsessrate limitation.
When this happens, the listener is disabled for up to 100ms and put back
into the global listener queue so that it automatically wakes up again
as soon as the conditions change from an existing connection releasing
one resource, or the system recovers from a transient issue.

The listener_accept() function has a bug in its exit path causing a
freshly limited listener to be immediately enabled again because all
the conditions are met (connection count < max). It doesn't take into
account the fact that the listener might have been queued and must
first wait for the timeout to expire before doing so. The impact is
that upon certain errors, the faulty process will busy loop on the
accept code without sleeping. This is the scenario reported and
diagnosed by @hedong0411 in issue #382.

This commit fixes it by verifying that the global queue's delay is
at least expired before deciding to resume the listener. Another
approach could consist in having an extra state like LI_DELAY for
situations where only a delay is acceptable, but this would probably
not bring anything except more complex code.

This issue was introduced with the lock-free listener accept code
(commits 3f0d02b and 82c9789a) that were backported to 1.8.20+ and
1.9.7+, so this fix must be backported to the relevant branches.
2019-12-11 15:06:30 +01:00
Willy Tarreau
d26c9f9465 BUG/MINOR: mworker: properly pass SIGTTOU/SIGTTIN to workers
If a new process is started with -sf and it fails to bind, it may send
a SIGTTOU to the master process in hope that it will temporarily unbind.
Unfortunately this one doesn't catch it and stops to background instead
of forwarding the signal to the workers. The same is true for SIGTTIN.

This commit simply implements an extra signal handler for the master to
deal with such signals that must be passed down to the workers. It must
be backported as far as 1.8, though there the code differs in that it's
entirely in haproxy.c and doesn't require an extra sig handler.
2019-12-11 14:26:53 +01:00
Willy Tarreau
51013e82d4 BUG/MINOR: log: fix minor resource leaks on logformat error path
As reported by Ilya in issue #392, Coverity found that we're leaking
allocated strings on error paths in parse_logformat(). Let's use a
proper exit label for failures instead of seeding return 0 everywhere.

This should be backported to all supported versions.
2019-12-11 12:05:39 +01:00
Willy Tarreau
c49ba52524 MINOR: tasks: split wake_expired_tasks() in two parts to avoid useless wakeups
We used to have wake_expired_tasks() wake up tasks and return the next
expiration delay. The problem this causes is that we have to call it just
before poll() in order to consider latest timers, but this also means that
we don't wake up all newly expired tasks upon return from poll(), which
thus systematically requires a second poll() round.

This is visible when running any scheduled task like a health check, as there
are systematically two poll() calls, one with the interval, nothing is done
after it, and another one with a zero delay, and the task is called:

  listen test
    bind *:8001
    server s1 127.0.0.1:1111 check

  09:37:38.200959 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8696843}) = 0
  09:37:38.200967 epoll_wait(3, [], 200, 1000) = 0
  09:37:39.202459 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8712467}) = 0
>> nothing run here, as the expired task was not woken up yet.
  09:37:39.202497 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8715766}) = 0
  09:37:39.202505 epoll_wait(3, [], 200, 0) = 0
  09:37:39.202513 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8719064}) = 0
>> now the expired task was woken up
  09:37:39.202522 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:37:39.202537 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:37:39.202565 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:37:39.202577 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:37:39.202585 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:37:39.202659 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:37:39.202673 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8814713}) = 0
  09:37:39.202683 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:37:39.202693 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=8818617}) = 0
  09:37:39.202701 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:37:39.202715 close(7)                = 0

Let's instead split the function in two parts:
  - the first part, wake_expired_tasks(), called just before
    process_runnable_tasks(), wakes up all expired tasks; it doesn't
    compute any timeout.
  - the second part, next_timer_expiry(), called just before poll(),
    only computes the next timeout for the current thread.

Thanks to this, all expired tasks are properly woken up when leaving
poll, and each poll call's timeout remains up to date:

  09:41:16.270449 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10223556}) = 0
  09:41:16.270457 epoll_wait(3, [], 200, 999) = 0
  09:41:17.270130 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10238572}) = 0
  09:41:17.270157 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7
  09:41:17.270194 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  09:41:17.270204 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
  09:41:17.270216 setsockopt(7, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
  09:41:17.270224 connect(7, {sa_family=AF_INET, sin_port=htons(1111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
  09:41:17.270299 epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLOUT, {u32=7, u64=7}}) = 0
  09:41:17.270314 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10337841}) = 0
  09:41:17.270323 epoll_wait(3, [{EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}], 200, 1000) = 1
  09:41:17.270332 clock_gettime(CLOCK_THREAD_CPUTIME_ID, {tv_sec=0, tv_nsec=10341860}) = 0
  09:41:17.270340 getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
  09:41:17.270367 close(7)                = 0

This may be backported to 2.1 and 2.0 though it's unlikely to bring any
user-visible improvement except to clarify debugging.
2019-12-11 09:42:58 +01:00
Willy Tarreau
d7f76a0a50 BUG/MEDIUM: proto_udp/threads: recv() and send() must not be exclusive.
This is a complement to previous fix for bug #399. The exclusion between
the recv() and send() calls prevents send handlers from being called if
rx readiness is reported. The DNS code can trigger this situations with
threads where the fd_recv_ready() flag disappears between the test in
dgram_fd_handler() and the second test in dns_resolve_recv() while a
thread calls fd_cant_recv(), and this situation can sustain itself for
a while. With 8 threads and an error in the socket queue, placing a
printf on the return statement in dns_resolve_recv() scrolls very fast.

Simply removing the "else" in dgram_fd_handler() addresses the issue.

This fix must be backported as far as 1.6.
2019-12-10 19:09:15 +01:00
Willy Tarreau
1c75995611 BUG/MAJOR: dns: add minimalist error processing on the Rx path
It was reported in bug #399 that the DNS sometimes enters endless loops
after hours working fine. The issue is caused by a lack of error
processing in the DNS's recv() path combined with an exclusive recv OR
send in the UDP layer, resulting in some errors causing CPU loops that
will never stop until the process is restarted.

The basic cause is that the FD_POLL_ERR and FD_POLL_HUP flags are sticky
on the FD, and contrary to a stream socket, receiving an error on a
datagram socket doesn't indicate that this socket cannot be used anymore.
Thus the Rx code must at least handle this situation and flush the error
otherwise it will constantly be reported. In theory this should not be a
big issue but in practise it is due to another bug in the UDP datagram
handler which prevents the send() callback from being called when Rx
readiness was reported, so the situation cannot go away. It happens way
more easily with threads enabled, so that there is no dead time between
the moment the FD is disabled and another recv() is called, such as in
the example below where the request was sent to a closed port on the
loopback provoking an ICMP unreachable to be sent back:

  [pid 20888] 18:26:57.826408 sendto(29, ";\340\1\0\0\1\0\0\0\0\0\1\0031wt\2eu\0\0\34\0\1\0\0)\2\0\0\0\0\0\0\0", 35, 0, NULL, >
  [pid 20893] 18:26:57.826566 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 ECONNREFUSED (Connection refused)
  [pid 20889] 18:26:57.826601 recvfrom(29, 0x7f97c76182f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20892] 18:26:57.826630 recvfrom(29, 0x7f97c5cf02f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20891] 18:26:57.826684 recvfrom(29, 0x7f97c66162f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20895] 18:26:57.826716 recvfrom(29, 0x7f97bffda2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20894] 18:26:57.826747 recvfrom(29, 0x7f97c4cee2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20888] 18:26:58.419838 recvfrom(29, 0x7ffcc8712c20, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  [pid 20893] 18:26:58.419900 recvfrom(29, 0x7f97c54ef2f0, 513, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
  (... hundreds before next sendto() ...)

This situation was handled by clearing HUP and ERR when recv()
returns <0.

A second case was handled, there was a control for a missing dgram
handler, but it does nothing, causing the FD to ring again if this
situation ever happens. After looking at the rest of the code, it
doesn't seem possible to face such a situation because these handlers
are registered during startup, but at least we need to handle it
properly.

A third case was handled, that's mainly a small optimization. With
threads and massive responses, due to the large lock around the loop,
it's likely that some threads will have seen fd_recv_ready() and will
wait at the lock(). But if they wait here, chances are that other
threads will have eliminated pending data and issued fd_cant_recv().
In this case, better re-check fd_recv_ready() before performing the
recv() call to avoid the huge amounts of syscalls that happen on
massively threaded setups.

This patch must be backported as far as 1.6 (the atomic AND just
needs to be turned to a regular AND).
2019-12-10 19:09:15 +01:00
Olivier Houchard
eaefc3c503 BUG/MEDIUM: kqueue: Make sure we report read events even when no data.
When we have a EVFILT_READ event, an optimization was made, and the FD was
not reported as ready to receive if there were no data available. That way,
if the socket was closed by our peer (the EV8EOF flag was set), and there were
no remaining data to read, we would just close(), and avoid doing a recv().
However, it may be fine for TCP socket, but it is not for UDP.
If we send data via UDP, and we receive an error, the only way to detect it
is to attempt a recv(). However, in this case, kevent() will report a read
event, but with no data, so we'd just ignore that read event, nothing would be
done about it, and the poller would be woken up by it over and over.
To fix this, report read events if either we have data, or the EV_EOF flag
is not set.

This should be backported to 2.1, 2.0, 1.9 and 1.8.
2019-12-10 18:27:17 +01:00
Willy Tarreau
a1d97f88e0 REORG: listener: move the global listener queue code to listener.c
The global listener queue code and declarations were still lying in
haproxy.c while not needed there anymore at all. This complicates
the code for no reason. As a result, the global_listener_queue_task
and the global_listener_queue were made static.
2019-12-10 14:16:03 +01:00
Willy Tarreau
241797a3fc MINOR: listener: split dequeue_all_listener() in two
We use it half times for the global_listener_queue and half times
for a proxy's queue and this requires the callers to take care of
these. Let's split it in two versions, the current one working only
on the global queue and another one dedicated to proxies for the
per-proxy queues. This cleans up quite a bit of code.
2019-12-10 14:14:09 +01:00
Willy Tarreau
0591bf7deb MINOR: listener: make the wait paths cleaner and more reliable
In listener_accept() there are several situations where we have to wait
for an event or a delay. These ones all implement their own call to
limit_listener() and the associated task_schedule(). In addition to
being ugly and confusing, one expire date computation is even wrong as
it doesn't take in account the fact that we're using threads and that
the value might change in the middle. Fortunately task_schedule() gets
it right for us.

This patch creates two jump locations, one for the global queue and
one for the proxy queue, allowing the rest of the code to only compute
the expire delay and jump to the right location.
2019-12-10 12:04:27 +01:00
Willy Tarreau
92079934a9 BUG/MEDIUM: listener/threads: fix a remaining race in the listener's accept()
Recent fix 4c044e274c ("BUG/MEDIUM: listener/thread: fix a race when
pausing a listener") is insufficient and moves the race slightly farther.
What now happens is that if we're limiting a listener due to a transient
error such as an accept() error for example, or because the proxy's
maxconn was reached, another thread might in the mean time have switched
again to LI_READY and at the end of the function we'll disable polling on
this FD, resulting in a listener that never accepts anything anymore. It
can more easily happen when sending SIGTTOU/SIGTTIN to temporarily pause
the listeners to let another process bind next to them.

What this patch does instead is to move all enable/disable operations at
the end of the function and condition them to the state. The listener's
state is checked under the lock and the FD's polling state adjusted
accordingly so that the listener's state and the FD always remain 100%
synchronized. It was verified with 16 threads that the cost of taking
that lock is not measurable so that's fine.

This should be backported to the same branches the patch above is
backported to.
2019-12-10 10:43:31 +01:00
Willy Tarreau
20aeb1c7cd BUG/MINOR: listener: also clear the error flag on a paused listener
When accept() fails because a listener is temporarily paused, the
FD might have both FD_POLL_HUP and FD_POLL_ERR bits set. While we do
not exploit FD_POLL_ERR here it's better to clear it because it is
reported on "show fd" and is confusing.

This may be backported to all versions.
2019-12-10 10:43:31 +01:00
Willy Tarreau
7cdeb61701 BUG/MINOR: listener/threads: always use atomic ops to clear the FD events
There was a leftover of the single-threaded era when removing the
FD_POLL_HUP flag from the listeners. By not using an atomic operation
to clear the flag, another thread acting on the same listener might
have lost some events, though this would have resulted in that thread
to reprocess them immediately on the next loop pass.

This should be backported as far as 1.8.
2019-12-10 10:43:31 +01:00
Willy Tarreau
67878d7bdc BUG/MINOR: proxy: make soft_stop() also close FDs in LI_PAUSED state
The proxies' soft_stop() function closes the FDs in all opened states
except LI_PAUSED. This means that a transient error on a listener might
cause it to turn back to the READY state if it happens exactly when a
reload signal is received.

This must be backported to all supported versions.
2019-12-10 10:43:31 +01:00
Christopher Faulet
f950c2e97e BUG/MEDIUM: mux-fcgi: Handle cases where the HTX EOM block cannot be inserted
During the HTTP response parsing, if there is not enough space in the channel's
buffer, it is possible to fail to add the HTX EOM block while all data in the
rxbuf were consumed. As for the h1 mux, we must notify the conn-stream the
buffer is full to have a chance to add the HTX EOM block later. In this case, we
must also be carefull to not report a server abort by setting too early the
CS_FL_EOS flag on the conn-stream.

To do so, the FCGI_SF_APPEND_EOM flag must be set on the FCGI stream to know the
HTX EOM block is missing.

This patch must be backported to 2.1.
2019-12-09 09:30:50 +01:00
Christopher Faulet
7aae858001 BUG/MINOR: mux-h1: Be sure to set CS_FL_WANT_ROOM when EOM can't be added
During the message parsing, when the HTX buffer is full and only the HTX EOM
block cannot be added, it is important to notify the conn-stream that some
processing must still be done but it is blocked because there is not enough room
in the buffer. The way to do so is to set the CS_FL_WANT_ROOM flag on the
conn-stream. Otherwise, because all data are received and consumed, the mux is
not called anymore to add this last block, leaving the message unfinished from
the HAProxy point of view. The only way to unblock it is to receive a shutdown
for reads or to hit a timeout.

This patch must be backported to 2.1 and 2.0. The 1.9 does not seem to be
affected.
2019-12-09 09:30:50 +01:00
Willy Tarreau
a45a8b5171 MEDIUM: init: set NO_NEW_PRIVS by default when supported
HAProxy doesn't need to call executables at run time (except when using
external checks which are strongly recommended against), and is even expected
to isolate itself into an empty chroot. As such, there basically is no valid
reason to allow a setuid executable to be called without the user being fully
aware of the risks. In a situation where haproxy would need to call external
checks and/or disable chroot, exploiting a vulnerability in a library or in
haproxy itself could lead to the execution of an external program. On Linux
it is possible to lock the process so that any setuid bit present on such an
executable is ignored. This significantly reduces the risk of privilege
escalation in such a situation. This is what haproxy does by default. In case
this causes a problem to an external check (for example one which would need
the "ping" command), then it is possible to disable this protection by
explicitly adding this directive in the global section. If enabled, it is
possible to turn it back off by prefixing it with the "no" keyword.

Before the option:

  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  uid=0(root) gid=0(root) groups=0(root

After the option:
  $ socat - /tmp/sock1 <<< "expert-mode on; debug dev exec sudo /bin/id"
  sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the
        'nosuid' option set or an NFS file system without root privileges?
2019-12-06 17:20:26 +01:00
Willy Tarreau
368bff40ce MINOR: debug: replace popen() with pipe+fork() in "debug dev exec"
popen() is annoying because it doesn't catch stderr. The command was
implemented using it just by pure laziness, let's just redo it a bit
cleaner using normal syscalls. Note that this command is only enabled
when built with -DDEBUG_DEV.
2019-12-06 17:20:26 +01:00
Olivier Houchard
aebeff74fc BUG/MEDIUM: checks: Make sure we set the task affinity just before connecting.
In process_chk_conn(), make sure we set the task affinity to the current
thread as soon as we're attempting a connection (and reset the affinity to
"any thread" if we detect a failure).
We used to only set the task affinity if connect_conn_chk() returned
SF_ERR_NONE, however for TCP checks, SF_ERR_UP is returned, so for those
checks, the task could still run on any thread, and this could lead to a
race condition where the connection runs on one thread, while the task runs
on another one, which could create random memory corruption and/or crashes.
This may fix github issue #369.

This should be backported to 2.1, 2.0 and 1.9.
2019-12-05 15:31:44 +01:00
Christopher Faulet
2545a0b352 BUG/MINOR: mux-h1: Fix conditions to know whether or not we may receive data
The h1_recv_allowed() function is inherited from the h2 multiplexer. But for the
h1, conditions to know if we may receive data are less complex because there is
no multiplexing and because data are not parsed when received. So now, following
rules are respected :

 * if an error or a shutdown for reads was detected on the connection we must
   not attempt to receive
 * if the input buffer failed to be allocated or is full, we must not try to
   receive
 * if the input processing is busy waiting for the output side, we may attempt
   to receive
 * otherwise must may not attempt to receive

This patch must be backported as far as 1.9.
2019-12-05 13:36:03 +01:00
Christopher Faulet
7b109f2f8b BUG/MINOR: mux-h1: Don't rely on CO_FL_SOCK_RD_SH to set H1C_F_CS_SHUTDOWN
The CO_FL_SOCK_RD_SH flag is only set when a read0 is received. So we must not
rely on it to set the H1 connection in shutdown state (H1C_F_CS_SHUTDOWN). In
fact, it is suffisant to set the connection in shutdown state when the shutdown
for writes is forwared to the sock layer.

This patch must be backported as far as 1.9.
2019-12-05 13:36:03 +01:00
Christopher Faulet
aaa67bcef2 BUG/MEDIUM: mux-h1: Never reuse H1 connection if a shutw is pending
On the server side, when a H1 stream is detached from the connection, if the
connection is not reusable but some outgoing data remain, the connection is not
immediatly released. In this case, the connection is not inserted in any idle
connection list. But it is still attached to the session. Because of that, it
can be erroneously reused. h1_avail_streams() always report a free slot if no
stream is attached to the connection, independently on the connection's
state. It is obviously a bug. If a second request is handled by the same session
(it happens with H2 connections on the client side), this connection is reused
before we close it.

There is small window to hit the bug, but it may lead to very strange
behaviors. For instance, if a first h2 request is quickly aborted by the client
while it is blocked in the mux on the server side (so before any response is
received), a second request can be processed and sent to the server. Because the
connection was not closed, the possible reply to the first request will be
interpreted as a reply to the second one. It is probably the bug described by
Peter Fröhlich in the issue #290.

To fix the bug, a new flag has been added to know if an H1 connection is idle or
not. So now, H1C_F_CS_IDLE is set when a connection is idle and useable to
handle a new request. If it is set, we try to add the connection in an idle
connection list. And h1_avail_streams() only relies on this flag
now. Concretely, this flag is set when a K/A stream is detached and both the
request and the response are in DONE state. It is exclusive to other H1C_F_CS
flags.

This patch must be backported as far as 1.9.
2019-12-05 13:31:16 +01:00
Emmanuel Hocdet
3777e3ad14 BUG/MINOR: ssl: certificate choice can be unexpected with openssl >= 1.1.1
It's regression from 9f9b0c6 "BUG/MEDIUM: ECC cert should work with
TLS < v1.2 and openssl >= 1.1.1". Wilcard EC certifcate could be selected
at the expense of specific RSA certificate.
In any case, specific certificate should always selected first, next wildcard.
Reflect this rule in a loop to avoid any bug in certificate selection changes.

Fix issue #394.

It should be backported as far as 1.8.
2019-12-05 10:49:24 +01:00
Willy Tarreau
4c044e274c BUG/MEDIUM: listener/thread: fix a race when pausing a listener
There exists a race in the listener code where a thread might disable
receipt on a listener's FD then turn it to LI_PAUSED while at the same
time another one faces EAGAIN on accept() and enables it again via
fd_cant_recv(). The result is that the FD is in LI_PAUSED state with
its polling still enabled. listener_accept() does not do anything then
and doesn't disable the FD either, resulting in a thread eating all the
CPU as reported in issue #358. A solution would be to take the listener's
lock to perform the fd_cant_recv() call and do it only if the FD is still
in LI_READY state, but this would be totally overkill while in practice
the issue only happens during shutdown.

Instead what is done here is that when leaving we recheck the state and
disable polling if the listener is not in LI_READY state, which never
happens except when being limited. In the worst case there could be one
extra check per thread for the time required to converge, which is
absolutely nothing.

This fix was successfully tested, and should be backported to all
versions using the lock-free listeners, which means all those containing
commit 3f0d02bb ("MAJOR: listener: do not hold the listener lock in
listener_accept()"), hence 2.1, 2.0, 1.9.7+, 1.8.20+.
2019-12-05 07:40:32 +01:00
William Lallemand
920b035238 BUG/MINOR: ssl/cli: don't overwrite the filters variable
When a crt-list line using an already used ckch_store does not contain
filters, it will overwrite the ckchs->filters variable with 0.
This problem will generate all sni_ctx of this ckch_store without
filters. Filters generation mustn't be allowed in any case.

Must be backported in 2.1.
2019-12-05 00:00:04 +01:00
Willy Tarreau
c640ef1a7d BUG/MINOR: stream-int: avoid calling rcv_buf() when splicing is still possible
In si_cs_recv(), we can end up with a partial splice() call that will be
followed by an attempt to us rcv_buf(). Sometimes this works and places
data into the buffer, which then prevent splicing from being used, and
this causes splice() and recvfrom() calls to alternate. Better simply
refrain from calling rcv_buf() when there are data in the pipe and still
data to be forwarded. Usually this indicates that we've ate everything
available and that we still want to use splice() on subsequent calls.

This should be backported to 2.1 and 2.0.
2019-12-04 11:55:49 +01:00
Willy Tarreau
1ac5f20804 BUG/MEDIUM: stream-int: don't subscribed for recv when we're trying to flush data
If we cannot splice incoming data using rcv_pipe() due to remaining data
in the buffer, we must not subscribe to the mux but instead tag the
stream-int as blocked on missing Rx room. Otherwise when data are
flushed, calling si_chk_rcv() will have no effect because the WAIT_EP
flag remains present, and we'll end in an rx timeout. This case is very
hard to reproduce, and requires an inversion of the polling side in the
middle of a transfer. This can only happen when the client and the server
are using similar links and when splicing is enabled. It typically takes
hundreds of MB to GB for the problem to happen, and tends to be magnified
by the use of option contstats which causes process_stream() to be called
every 5s and to try again to recv.

This fix must be backported to 2.1, 2.0, and possibly 1.9.
2019-12-04 11:55:49 +01:00
William Lallemand
230662a0dd BUG/MINOR: ssl/cli: 'ssl cert' cmd only usable w/ admin rights
The 3 commands 'set ssl cert', 'abort ssl cert' and 'commit ssl cert'
must be only usable with admin rights over the CLI.

Must be backported in 2.1.
2019-12-03 15:10:46 +01:00
Willy Tarreau
d96f1126fe MEDIUM: init: prevent process and thread creation at runtime
Some concerns are regularly raised about the risk to inherit some Lua
files which make use of a fork (e.g. via os.execute()) as well as
whether or not some of bugs we fix might or not be exploitable to run
some code. Given that haproxy is event-driven, any foreground activity
completely stops processing and is easy to detect, but background
activity is a different story. A Lua script could very well discretely
fork a sub-process connecting to a remote location and taking commands,
and some injected code could also try to hide its activity by creating
a process or a thread without blocking the rest of the processing. While
such activities should be extremely limited when run in an empty chroot
without any permission, it would be better to get a higher assurance
they cannot happen.

This patch introduces something very simple: it limits the number of
processes and threads to zero in the workers after the last thread was
created. By doing so, it effectively instructs the system to fail on
any fork() or clone() syscall. Thus any undesired activity has to happen
in the foreground and is way easier to detect.

This will obviously break external checks (whose concept is already
totally insecure), and for this reason a new option
"insecure-fork-wanted" was added to disable this protection, and it
is suggested in the fork() error report from the checks. It is
obviously recommended not to use it and to reconsider the reasons
leading to it being enabled in the first place.

If for any reason we fail to disable forks, we still start because it
could be imaginable that some operating systems refuse to set this
limit to zero, but in this case we emit a warning, that may or may not
be reported since we're after the fork point. Ideally over the long
term it should be conditionned by strict-limits and cause a hard fail.
2019-12-03 11:49:00 +01:00
Christopher Faulet
bc271ec113 BUG/MINOR: stats: Fix HTML output for the frontends heading
Since the flag STAT_SHOWADMIN was removed, the frontends heading in the HTML
output appears unaligned because the space reserved for the checkbox (not
displayed for frontends) is not inserted.

This patch fixes the issue #390. It must be backported to 2.1.
2019-12-02 11:40:04 +01:00
Christopher Faulet
bc96c90614 BUG/MINOR: fcgi-app: Make the directive pass-header case insensitive
The header name configured by the directive "pass-header", in the "fcgi-app"
section, must be case insensitive. For now, it must be in lowercase to match an
header. Internally, header names are in lowercase but there is no reason to
impose this syntax in the configuration.

This patch must be backported to 2.1.
2019-12-02 10:38:52 +01:00
Emmanuel Hocdet
140b64fb56 BUG/MINOR: ssl: fix SSL_CTX_set1_chain compatibility for openssl < 1.0.2
Commit 1c65fdd5 "MINOR: ssl: add extra chain compatibility" really implement
SSL_CTX_set0_chain. Since ckch can be used to init more than one ctx with
openssl < 1.0.2 (commit 89f58073 for X509_chain_up_ref compatibility),
SSL_CTX_set1_chain compatibility is required.

This patch must be backported to 2.1.
2019-11-29 17:02:30 +01:00
Christopher Faulet
f3ad62996f BUG/MINOR: http-htx: Don't make http_find_header() fail if the value is empty
http_find_header() is used to find the next occurrence of a header matching on
its name. When found, the matching header is returned with the corresponding
value. This value may be empty. Unfortunatly, because of a bug, an empty value
make the function fail.

This patch must be backported to 2.1, 2.0 and 1.9.
2019-11-29 11:48:15 +01:00
William Dauchy
be8a387e93 CLEANUP: dns: resolution can never be null
`eb` being tested above, `res` cannot be null, so the condition is
not needed and introduces potential dead code.

also fix a typo in associated comment

This should fix issue #349

Reported-by: Илья Шипицин <chipitsine@gmail.com>
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2019-11-28 20:41:46 +01:00
Emmanuel Hocdet
b270e8166c MINOR: ssl: deduplicate crl-file
Load file for crl or ca-cert is realy done with the same function in OpenSSL,
via X509_STORE_load_locations. Accordingly, deduplicate crl-file and ca-file
can share the same function.
2019-11-28 11:11:20 +01:00
Emmanuel Hocdet
129d3285a5 MINOR: ssl: compute ca-list from deduplicate ca-file
ca-list can be extracted from ca-file already loaded in memory.
This patch set ca-list from deduplicated ca-file when needed
and share it in ca-file tree.

As a corollary, this will prevent file access for ca-list when
updating a certificate via CLI.
2019-11-28 11:11:20 +01:00
Emmanuel Hocdet
d4f9a60ee2 MINOR: ssl: deduplicate ca-file
Typically server line like:
'server-template srv 1-1000 *:443 ssl ca-file ca-certificates.crt'
load ca-certificates.crt 1000 times and stay duplicated in memory.
Same case for bind line: ca-file is loaded for each certificate.
Same 'ca-file' can be load one time only and stay deduplicated in
memory.

As a corollary, this will prevent file access for ca-file when
updating a certificate via CLI.
2019-11-28 11:11:20 +01:00
Willy Tarreau
e18f53e01c BUILD/MINOR: trace: fix use of long type in a few printf format strings
Building on a 32-bit platform produces these warnings in trace code:

src/stream.c: In function 'strm_trace':
src/stream.c:226:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 9 has type 'size_t {aka const unsigned int}' [-Wformat=]
   chunk_appendf(&trace_buf, " req=(%p .fl=0x%08x .ana=0x%08x .exp(r,w,a)=(%u,%u,%u) .o=%lu .tot=%llu .to_fwd=%u)",
                             ^
src/stream.c:229:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 9 has type 'size_t {aka const unsigned int}' [-Wformat=]
   chunk_appendf(&trace_buf, " res=(%p .fl=0x%08x .ana=0x%08x .exp(r,w,a)=(%u,%u,%u) .o=%lu .tot=%llu .to_fwd=%u)",
                             ^
src/mux_fcgi.c: In function 'fcgi_trace':
src/mux_fcgi.c:443:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t {aka const unsigned int}' [-Wformat=]
   chunk_appendf(&trace_buf, " - VAL=%lu", *val);
                             ^
src/mux_h1.c: In function 'h1_trace':
src/mux_h1.c:290:29: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t {aka const unsigned int}' [-Wformat=]
   chunk_appendf(&trace_buf, " - VAL=%lu", *val);
                             ^

Let's just cast the type to long. This should be backported to 2.1.
2019-11-27 15:45:11 +01:00
Christopher Faulet
bc7c03eba3 BUG/MINOR: h1: Don't test the host header during response parsing
During the H1 message parsing, the host header is tested to be sure it matches
the request's authority, if defined. When there are multiple host headers, we
also take care they are all the same. Of course, these tests must only be
performed on the requests. A host header in a response has no special meaning.

This patch must be backported to 2.1.
2019-11-27 14:01:17 +01:00
Tim Duesterhus
9312853530 CLEANUP: ssl: Clean up error handling
This commit removes the explicit checks for `if (err)` before
passing `err` to `memprintf`. `memprintf` already checks itself
whether the `**out*` parameter is `NULL` before doing anything.
This reduces the indentation depth and makes the code more readable,
before there is less boilerplate code.

Instead move the check into the ternary conditional when the error
message should be appended to a previous message. This is consistent
with the rest of ssl_sock.c and with the rest of HAProxy.

Thus this patch is the arguably cleaner fix for issue #374 and builds
upon
5f1fa7db86 and
8b453912ce

Additionally it fixes a few places where the check *still* was missing.
2019-11-26 04:16:56 +01:00
Willy Tarreau
2e7fdfc9a1 BUG/MEDIUM: trace: fix a typo causing an incorrect startup error
Since commit 88ebd40 ("MINOR: trace: add allocation of buffer-sized
trace buffers") we have a trace buffer allocated at boot time. But
there was a copy-paste error there making the test verify that the
trash was allocated instead of the trace buffer. The result is that
depending on the link order either the test will succeed or fail,
preventing haproxy from starting at all.

No backport is needed.
2019-11-25 19:47:22 +01:00
Willy Tarreau
f3ce0418aa MINOR: mux-h2/trace: report the connection and/or stream error code
We were missing the error code when tracing a call to h2s_error() or
h2c_error(), let's report it when it's set.
2019-11-25 11:34:26 +01:00
Willy Tarreau
57a1816fae BUG/MAJOR: mux-h2: don't try to decode a response HEADERS frame in idle state
Christopher found another issue in the H2 backend implementation that
results from a miss in the H2 spec: the processing of a HEADERS frame
is always permitted in IDLE state, but this doesn't make sense on the
response path! And here when facing such a frame, we try to decode it
while we didn't allocate any stream, so we end up trying to fill the
idle stream's buffer (read-only) and crash.

What we're doing here is that if we get a HEADERS frame in IDLE state
from a server, we terminate the connection with a PROTOCOL_ERROR. No
such transition seems to be permitted by the spec but it seems to be
the only sane solution.

This fix must be backported as far as 1.9. Note that in 2.0 and earlier
there's no h2_frame_check_vs_state() function, instead the check is
inlined in h2_process_demux().
2019-11-25 11:34:20 +01:00
Willy Tarreau
146f53ae7e BUG/MAJOR: h2: make header field name filtering stronger
Tim Dsterhus found that the amount of sanitization we perform on HTTP
header field names received in H2 is insufficient. Currently we reject
upper case letters as mandated by RFC7540#8.1.2, but section 10.3 also
requires that intermediaries translating streams to HTTP/1 further
refine the filtering to also reject invalid names (which means any name
that doesn't match a token). There is a small trick here which is that
the colon character used to start pseudo-header names doesn't match a
token, so pseudo-header names fall into that category, thus we have to
swap the pseudo-header name lookup with this check so that we only check
from the second character (past the ':') in case of pseudo-header names.

Another possibility could have been to perform this check only in the
HTX-to-H1 trancoder but doing would still expose the configured rules
and logs to such header names.

This fix must be backported as far as 1.8 since this bug could be
exploited and serve as the base for an attack. In 2.0 and earlier,
functions h2_make_h1_request() and h2_make_h1_trailers() must also
be adapted to sanitize requests coming in legacy mode.
2019-11-25 11:11:32 +01:00