12501 Commits

Author SHA1 Message Date
Willy Tarreau
fdf53b4962 BUG/MINOR: pools: don't mark ourselves as harmless in DEBUG_UAF mode
When haproxy is built with DEBUG_UAF=1, some particularly slow
allocation functions are used for each pool, and it was not uncommon
to see the watchdog trigger during performance tests. For this reason
the allocation functions were surrounded by a pair of thread_harmless
calls to mention that the function was waiting in slow syscalls. The
problem is that this also releases functions blocked in thread_isolate()
which can then start their work.

In order to protect against the accidental removal of a shared resource
in this situation, in 2.5-dev4 with commit ba3ab7907 ("MEDIUM: servers:
make the server deletion code run under full thread isolation") was added
thread_isolate_full() for functions which want to be totally protected
due to being manipulating some data.

But this is not sufficient, because there are still places where we
can allocate/free (thus sleep) under a lock, such as in long call
chains involving the release of an idle connection. In this case, if
one thread asks for isolation, one thread might hang in
pool_alloc_area_uaf() with a lock held (for example the conns_lock
when coming from conn_backend_get()->h1_takeover()->task_new()), with
another thread blocked on a lock waiting for that one to release it,
both keeping their bit clear in the thread_harmless mask, preventing
the first thread from being released, thus causing a deadlock.

In addition to this, it was already seen that the "show fd" CLI handler
could wake up during a pool_free_area_uaf() with an incompletely
released memory area while deleting a file descriptor, and be fooled
showing bad pointers, or during a pool_alloc() on another thread that
was in the process of registering a freshly allocated connection to a
new file descriptor.

One solution could consist in replacing all thread_isolate() calls by
thread_isolate_full() but then that makes thread_isolate() useless
and only shifts the problem by one slot.

A better approach could possibly consist in having a way to mark that
a thread is entering an extremely slow section. Such sections would
be timed so that this is not abused, and the bit would be used to
make the watchdog more patient. This would be acceptable as this would
only affect debugging.

The approach used here for now consists in removing the harmless bits
around the UAF allocator, thus essentially undoing commit 85b2cae63
("MINOR: pools: make the thread harmless during the mmap/munmap
syscalls").

This is marked as minor because nobody is expected to be running with
DEBUG_UAF outside of development or serious debugging, so this issue
cannot affect regular users. It must be backported to stable branches
that have thread_harmless_now() around the mmap() call.
2021-11-12 11:17:37 +01:00
Christopher Faulet
47940c39e2 BUG/MINOR: mux-h2: Fix H2_CF_DEM_SHORT_READ value
The value for H2_CF_DEM_SHORT_READ flag is wrong. 2 bits are erroneously
set, 0x200 and 0x80000.  It is not an issue because both bits are not used
anywhere else.

The typo was introduced in the commit b5f7b5296 ("BUG/MEDIUM: mux-h2: Handle
remaining read0 cases on partial frames"). Thus this patch must also be
backported as far a 2.0.
2021-11-10 18:04:36 +01:00
William Lallemand
67b778418e BUG/MEDIUM: httpclient/cli: free of unallocated hc->req.uri
httpclient_new() sets the hc->req.uri ist without duplicating its
memory, which is a problem since the string in the ist could be
inaccessible at some point. The API was made to use a ist which was
allocated dynamically, but httpclient_new() didn't do that, which result
in a crash when calling istfree().

This patch fixes the problem by doing an istdup()

Fix issue #1452.
2021-11-10 17:02:50 +01:00
William Lallemand
5f47b2e280 BUG/MINOR: mworker: doesn't launch the program postparser
When in wait mode, the mworker-prog postparser is launched, but
unfortunately the child structure doesn't contain all required
information to be able to launch the test.

This test is only required when doing a configuration parsing.

Must be backported as far as 2.0.
2021-11-10 15:53:01 +01:00
William Lallemand
90034bba15 MINOR: mworker: change the way we set PROC_O_LEAVING
Since the wait mode is always used once we successfuly loaded the
configuration, every processes were marked as old workers.

To fix this, the PROC_O_LEAVING flag is set only on the processes which
have a number of reloads greater than the current processes.
2021-11-10 15:53:01 +01:00
William Lallemand
3ba7c7b5e1 MINOR: mworker: ReloadFailed shown depending on failedreload
The ReloadFailed prompt in the master CLI is shown only when
failedreloads > 0. It was previously using a check on the wait mode, but
we always use the wait mode now.
2021-11-10 15:53:01 +01:00
William Lallemand
6883674084 MINOR: mworker: implement a reload failure counter
Implement a reload failure counter which counts the number of failure
since the last success. This counter is available in 'show proc' over
the master CLI.
2021-11-10 15:53:01 +01:00
William Lallemand
ad221f4ece MINOR: mworker: only increment the number of reload in wait mode
Since the wait mode will be started in any case of succesful or failed
reload, change the way haproxy computes the number of reloads of the
processes.
2021-11-10 15:53:01 +01:00
William Lallemand
836bda226c MINOR: mworker: clarify starting/failure messages
Clarify the startup and reload messages:

On a successful configuration load, haproxy will emit "Loading success."
after successfuly forked the children.

When it didn't success to load the configuration it will emit "Loading failure!".

When trying to reload the master process, it will emit "Reloading
HAProxy".
2021-11-10 15:53:01 +01:00
William Lallemand
fab0fdce98 MEDIUM: mworker: reexec in waitpid mode after successful loading
Use the waitpid mode after successfully loading the configuration, this
way the memory will be freed in the master, and will preserve the memory.

This will be useful when doing a reload with a configuration which has
large maps or a lot of SSL certificates, avoiding an OOM because too
much memory was allocated in the master.
2021-11-10 15:53:01 +01:00
William Lallemand
5d71a6b0f1 CLEANUP: mworker: remove any relative PID reference
nbproc was removed, it's time to remove any reference to the relative
PID in the master-worker, since there can be only 1 current haproxy
process.

This patch cleans up the alerts and warnings emitted during the exit of
a process, as well as the "show proc" output.
2021-11-10 15:53:01 +01:00
Christopher Faulet
99293b0380 MINOR: mux-h1: Slightly Improve H1 traces
Connection and conn-stream pointers and flags are now dumped, if available,
in each trace messages. In addition, shutr and shutw mode is now reported.
2021-11-10 11:45:27 +01:00
Christopher Faulet
4c5a591b10 Revert "BUG/MINOR: http-ana: Don't eval front after-response rules if stopped on back"
This reverts commit 597909f4e67866c4f3ecf77f95f2cd4556c0c638

http-after-response rules evaluation was changed to do the same that was
done for http-response, in the code. However, the opposite must be performed
instead. Only the rules of the current section must be stopped. Thus the
above commit is reverted and the http-response rules evaluation will be
fixed instead.

Note that only "allow" action is concerned. It is most probably an uncommon
action for an http-after-request rule.

This patch must be backported as far as 2.2 if the above commit was
backported.
2021-11-09 18:02:49 +01:00
Christopher Faulet
46f46df300 BUG/MINOR: http-ana: Apply stop to the current section for http-response rules
A TCP/HTTP action can stop the rules evaluation. However, it should be
applied on the current section only. For instance, for http-requests rules,
an "allow" on a frontend must stop evaluation of rules defined in this
frontend. But the backend rules, if any, must still be evaluated.

For http-response rulesets, according the configuration manual, the same
must be true. Only "allow" action is concerned. However, since the
beginning, this action stops evaluation of all remaining rules, not only
those of the current section.

This patch may be backported to all supported versions. But it is not so
critical because the bug exists since a while. I doubt it will break any
existing configuration because the current behavior is
counterintuitive.
2021-11-09 18:02:36 +01:00
William Dauchy
42d7c402d5 MINOR: promex: backend aggregated server check status
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status
  this permits to exclude per server check status as the usage is often
  to rely on the total. Indeed in large setup having thousands of
  servers per backend the memory impact is not neglible to store the per
  server metric.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus to avoid counting at
every dump.

this patch is an attempt to close github issue #1312. It may bebackported
to 2.4 if requested.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2021-11-09 10:51:08 +01:00
William Lallemand
db8a1f391d BUG/MEDIUM: httpclient: channel_add_input() must use htx->data
The httpclient uses channel_add_input() to notify the channel layer that
it must forward some data. This function was used with b_data(&req->buf)
which ask to send the size of a buffer (because of the HTX metadata
which fill the buffer completely).

This is wrong and will have the consequence of trying to send data that
doesn't exist, letting HAProxy looping at 100% CPU.

When using htx channel_add_input() must be used with the size of the htx
payload, and not the size of a buffer.

When sending the request payload it also need to sets the buffer size to
0, which is achieved with a htx_to_buf() when the htx payload is empty.
2021-11-08 17:36:31 +01:00
William Lallemand
933fe394bb BUG/MINOR: httpclient/lua: rcv freeze when no request payload
This patch fixes the receive part of the lua httpclient when no payload
was sent.

The lua task was not awoken once it jumped into
hlua_httpclient_rcv_yield(), which caused the lua client to freeze.

It works with a payload because the payload push is doing the wakeup.

A change in the state machine of the IO handler is also require to
achieve correctly the change from the REQ state to the RES state, it has
to detect if there is the right EOM flag in the request.
2021-11-08 17:36:31 +01:00
Willy Tarreau
1f38bdb3f6 BUG/MINOR: cache: properly ignore unparsable max-age in quotes
When "max-age" or "s-maxage" receive their values in quotes, the pointer
to the integer to be parsed is advanced by one, but the error pointer
check doesn't consider this advanced offset, so it will not match a
parse error such as max-age="a" and will take the value zero instead.

This probably needs to be backported, though it's unsure it has any
effect in the real world.
2021-11-08 12:09:27 +01:00
Willy Tarreau
49b0482ed4 CLEANUP: chunk: remove misleading chunk_strncat() function
This function claims to perform an strncat()-like operation but it does
not, it always copies the indicated number of bytes, regardless of the
presence of a NUL character (what is currently done by chunk_memcat()).
Let's remove it and explicitly replace it with chunk_memcat().
2021-11-08 12:08:26 +01:00
Tim Duesterhus
9f7ed8a60c CLEANUP: Apply ist.cocci
This is to make use of `chunk_istcat()`.
2021-11-08 12:08:26 +01:00
Tim Duesterhus
2471f5c2b2 CLEANUP: Apply ist.cocci
Make use of the new rules to use `isttrim()`.
2021-11-08 12:08:26 +01:00
Frédéric Lécaille
c4becf5424 MINOR: quic: Fix potential null pointer dereference
Fix compilation warnings about non initialized pointers.

This partially address #1445 github issue.
2021-11-08 11:31:12 +01:00
Amaury Denoyelle
b9ce14e5a2 MINOR: h3: fix potential NULL dereference
Fix potential allocation failure of HTX start-line during H3 request
decoding. In this case, h3_decode_qcs returns -1 as error code.

This addresses in part github issue #1445.
2021-11-08 09:17:24 +01:00
Amaury Denoyelle
7bb54f9906 MINOR: mux-quic: fix gcc11 warning
Fix minor warnings about an unused variable.

This addresses in part github issue #1445.
2021-11-08 08:59:30 +01:00
Amaury Denoyelle
3cae4049b0 MINOR: h3/qpack: fix gcc11 warnings
Fix minor warnings about unused variables and mixed declarations.

This addresses in part github issue #1445.
2021-11-08 08:59:30 +01:00
Tim Duesterhus
16cc16dd82 CLEANUP: Re-apply xalloc_size.cocci
Use a consistent size as the parameter for the *alloc family.
2021-11-08 08:05:39 +01:00
Tim Duesterhus
4c8f75fc31 CLEANUP: Apply ist.cocci
Make use of the new rules to use `istend()`.
2021-11-08 08:05:39 +01:00
Willy Tarreau
68574dd492 MEDIUM: log: add the client's SNI to the default HTTPS log format
During a troublehooting it came obvious that the SNI always ought to
be logged on httpslog, as it explains errors caused by selection of
the default certificate (or failure to do so in case of strict-sni).

This expectation was also confirmed on the mailing list.

Since the field may be empty it appeared important not to leave an
empty string in the current format, so it was decided to place the
field before a '/' preceding the SSL version and ciphers, so that
in the worst case a missing field leads to a field looking like
"/TLSv1.2/AES...", though usually a missing element still results
in a "-" in logs.

This will change the log format for users who already deployed the
2.5-dev versions (hence the medium level) but no released version
was using this format yet so there's no harm for stable deployments.
The reg-test was updated to check for "-" there since we don't send
SNI in reg-tests.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg41410.html
Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Willy Tarreau
579259d150 MINOR: ssl: make the ssl_fc_sni() sample-fetch function always available
Its definition is enclosed inside an ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
which is defined since OpenSSL 0.9.8. Having it conditioned like this
prevents us from using it by default in a log format, which could cause
an error on an old or exotic library.

Let's just always define it and make the sample fetch fail to return
anything on such libs instead.
2021-11-06 09:20:07 +01:00
Willy Tarreau
6f7497616e MEDIUM: connection: rename fc_conn_err and bc_conn_err to fc_err and bc_err
Commit 3d2093af9 ("MINOR: connection: Add a connection error code sample
fetch") added these convenient sample-fetch functions but it appears that
due to a misunderstanding the redundant "conn" part was kept in their
name, causing confusion, since "fc" already stands for "front connection".

Let's simply call them "fc_err" and "bc_err" to match all other related
ones before they appear in a final release. The VTC they appeared in were
also updated, and the alpha sort in the keywords table updated.

Cc: William Lallemand <wlallemand@haproxy.org>
2021-11-06 09:20:07 +01:00
Christopher Faulet
44d34bfbe7 MINOR: compression: Warn for 'compression offload' in defaults sections
This directive is documented as being ignored if set in a defaults
section. But it is only mentionned in a small note in the configuration
manual. Thus, now, a warning is emitted. To do so, the errors handling in
parse_compression_options() function was slightly changed.

In addition, this directive is now documented apart from the other
compression directives. This way, it is clearly visible that it must not be
used in a defaults section.
2021-11-05 16:36:42 +01:00
Christopher Faulet
34a3eb4c42 MINOR: backend: Get client dst address to set the server's one only if needful
In alloc_dst_address(), the client destination address must only be
retrieved when we are sure to use it. Most of time, this save a syscall to
getsockname(). It is not a bugfix in itself. But it revealed a bug in the
QUIC part. The CO_FL_ADDR_TO_SET flag is not set when the destination
address is create for anew quic client connection.
2021-11-05 15:25:34 +01:00
Frédéric Lécaille
b0006eee09 MINOR: quic: Use QUIC_LOCK QUIC specific lock label.
Very minor modifications without any impact.
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
46ea033be0 MINOR: quic: Remove a useless lock for CRYPTO frames
->frms_rwlock is an old lock supposed to be used when several threads
could handle the same connection. This is no more the case since this
commit:
 "MINOR: quic: Attach the QUIC connection to a thread."
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
324ecdafbb MINOR: quic: Enhance the listener RX buffering part
Add a buffer per QUIC connection. At this time the listener which receives
the UDP datagram is responsible of identifying the underlying QUIC connection
and must copy the QUIC packets to its buffer.
->pkt_list member has been added to quic_conn struct to enlist the packets
in the order they have been copied to the connection buffer so that to be
able to consume this buffer when the packets are freed. This list is locked
thanks to a R/W lock to protect it from concurent accesses.
quic_rx_packet struct does not use a static buffer anymore to store the QUIC
packets contents.
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
c5c69a0ad2 CLEANUP: quic: Remove useless code
Remove old I/O handler implementation (listener and server).
At this time keep a defined but not used function for servers (qc_srv_pkt_rcv()).
2021-11-05 15:20:04 +01:00
Frédéric Lécaille
c1029f6182 MINOR: quic: Allocate listener RX buffers
At this time we allocate an RX buffer by thread.
Also take the opportunity offered by this patch to rename TX related variable
names to distinguish them from the RX part.
2021-11-05 15:20:04 +01:00
Tim Duesterhus
284fbe1214 CLEANUP: Apply ist.cocci
Make use of the new rules to use `istnext()`.
2021-11-05 07:48:38 +01:00
Tim Duesterhus
025b93e3a2 CLEANUP: Apply ha_free.cocci
Use `ha_free()` where possible.
2021-11-05 07:48:38 +01:00
Remi Tricot-Le Breton
7266350181 BUG/MINOR: jwt: Fix jwt_parse_alg incorrectly returning JWS_ALG_NONE
jwt_parse_alg would mistakenly return JWT_ALG_NONE for algorithms "",
"n", "no" and "non" because of a strncmp misuse. It now sees them as
unknown algorithms.

No backport needed.

Cc: Tim Duesterhus <tim@bastelstu.be>
2021-11-03 17:19:48 +01:00
Emeric Brun
f8642ee826 MEDIUM: resolvers: rename dns extra counters to resolvers extra counters
This patch renames all dns extra counters and stats functions, types and
enums using the 'resolv' prefix/suffixes.

The dns extra counter domain id used on cli was replaced by "resolvers"
instead of "dns".

The typed extra counter prefix dumping resolvers domain "D." was
also renamed "N." because it points counters on a Nameserver.

This was done to finish the split between "resolver" and "dns" layers
and to avoid further misunderstanding when haproxy will handle dns
load balancing.

This should not be backported.
2021-11-03 17:16:46 +01:00
Emeric Brun
d174f0e59a MINOR: resolvers/dns: split dns and resolver counters in dns_counter struct
This patch add a union and struct into dns_counter struct to split
application specific counters.

The only current existing application is the resolver.c layer but
in futur we could handle different application such as dns load
balancing with others specific counters.

This patch should not be backported.
2021-11-03 17:16:46 +01:00
Emeric Brun
0161d32df2 BUG/MINOR: resolvers: throw log message if trash not large enough for query
Before this patch the sent error counter was increased
for each targeted nameserver as soon as we were unable to build
the query message into the trash buffer. But this counter is here
to count sent errors at dns.c transport layer and this error is not
related to a nameserver.

This patch stops to increase those counters and sent a log message
to signal the trash buffer size is not large enough to build the query.

Note: This case should not happen except if trash size buffer was
customized to a very low value.

The function was also re-worked to return -1 in this error case
as it was specified in comment. This function is currently
called at multiple point in resolver.c but return code
is still not yet handled. So to advert the user of the malfunction
the log message was added.

This patch should be backported on all versions including the
layer split between dns.c and resolver.c (v >= 2.4)
2021-11-03 17:16:46 +01:00
Emeric Brun
c37caab21c BUG/MINOR: resolvers: fix sent messages were counted twice
The sent messages counter was increased at both resolver.c and dns.c
layers.

This patch let the dns.c layer count the sent messages since this
layer handle a retry if transport layer is not ready (EAGAIN on udp
or tcp session ring buffer full).

This patch should be backported on all versions using a split of those
layers for resolving (v >=2.4)
2021-11-03 17:16:46 +01:00
Amaury Denoyelle
f9d5957cd9 MINOR: server: add ws keyword
Implement parsing for the server keyword 'ws'. This is used to configure
the mode of selection for websocket protocol. The configuration
documentation has been updated.

A new regtest has been created to test the proper behavior of the
keyword.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
9c3251d108 MEDIUM: server/backend: implement websocket protocol selection
Handle properly websocket streams if the server uses an ALPN with both
h1 and h2. Add a new field h2_ws in the server structure. If set to off,
reuse is automatically disable on backend and ALPN is forced to http1.x
if possible. Nothing is done if on.

Implement a mechanism to be able to use a different http version for
websocket streams. A new server member <ws> represents the algorithm to
select the protocol. This can overrides the server <proto>
configuration. If the connection uses ALPN for proto selection, it is
updated for websocket streams to select the right protocol.

Three mode of selection are implemented :
- auto : use the same protocol between non-ws and ws streams. If ALPN is
  use, try to update it to "http/1.1"; this is only done if the server
  ALPN contains "http/1.1".
- h1 : use http/1.1
- h2 : use http/2.0; this requires the server to support RFC8441 or an
  error will be returned by haproxy.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
ac03ef26e8 MINOR: connection: add alternative mux_ops param for conn_install_mux_be
Add a new parameter force_mux_ops. This will be useful to specify an
alternative to the srv->mux_proto field. If non-NULL, it will be use to
force the mux protocol wether srv->mux_proto is set or not.

This argument will become useful to install a mux for non-standard
streams, most notably websocket streams.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
2454bda140 MINOR: connection: implement function to update ALPN
Implement a new function to update the ALPN on an existing connection.
on an existing connection. The ALPN from the ssl context can be checked
to update the ALPN only if it is a subset of the context value.

This method will be useful to change a connection ALPN for websocket,
must notably if the server does not support h2 websocket through the
rfc8441 Extended Connect.
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
90ac605ef3 MINOR: stream/mux: implement websocket stream flag
Define a new stream flag SF_WEBSOCKET and a new cs flag CS_FL_WEBSOCKET.
The conn-stream flag is first set by h1/h2 muxes if the request is a
valid websocket upgrade. The flag is then converted to SF_WEBSOCKET on
the stream creation.

This will be useful to properly manage websocket streams in
connect_server().
2021-11-03 16:24:48 +01:00
Amaury Denoyelle
0df043608f BUG/MEDIUM: mux-h2: reject upgrade if no RFC8441 support
The RFC8441 was not respected by haproxy in regards with server support
for Extended CONNECT. The Extended CONNECT method was used to convert an
Upgrade header stream even if no SETTINGS_ENABLE_CONNECT_PROTOCOL was
received, which is forbidden by the RFC8441. In this case, the behavior
of the http/2 server is unspecified.

Fix this by flagging the connection on receiption of the RFC8441
settings SETTINGS_ENABLE_CONNECT_PROTOCOL. Extended CONNECT is thus only
be used if the flag is present. In the other case, the stream is
immediatly closed as there is no way to handle it in http/2. It results
in a http/1.1 502 or http/2 RESET_STREAM to the client side.

The protocol-upgrade regtest has been extended to test that haproxy does
not emit Extended CONNECT on servers without RFC8441 support.

It must be backported up to 2.4.
2021-11-03 16:24:48 +01:00