14759 Commits

Author SHA1 Message Date
Amaury Denoyelle
2f668f0e60 MINOR: quic: complete traces/debug for handshake
Add more traces to follow CRYPTO data buffering in ncbuf. Offset for
quic_enc_level is now reported for event QUIC_EV_CONN_PRHPKTS. Also
ncb_advance() must never fail so a BUG_ON() statement is here to
guarantee it.

This was useful to track handshake failure reported too often. This is
related to github issue #1903.

This should be backported up to 2.6.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
bc174b2101 BUG/MEDIUM: quic: fix memleak for out-of-order crypto data
Liberate quic_enc_level ncbuf in quic_stream_free(). In most cases, this
will already be done when handshake is completed via
qc_treat_rx_crypto_frms(). However, if a connection is released before
handshake completion, a leak was present without this patch.

Under normal situation, this leak should have been limited due to the
majority of QUIC connection success on handshake. However, another bug
caused handshakes to fail too frequently, especially with chrome client.
This had the side-effect to dramatically increase this memory leak.

This should fix in part github issue #1903.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
ff95f2d447 BUG/MEDIUM: quic: fix unsuccessful handshakes on ncb_advance error
QUIC handshakes were frequently in error due to haproxy misuse of
ncbuf. This resulted in one of the following scenario :
- handshake rejected with CONNECTION_CLOSE due to overlapping data
  rejected
- CRYPTO data fully received by haproxy but handshake completion signal
  not reported causing the client to emit PING repeatedly before timeout

This was produced because ncb_advance() result was not checked after
providing crypto data to the SSL stack in qc_provide_cdata(). However,
this operation can fail if a too small gap is formed. In the meantime,
quic_enc_level offset was always incremented. In some cases, this caused
next ncb_add() to report rejected overlapping data. In other cases, no
error was reported but SSL stack never received the end of CRYPTO data.

Change slightly the handling of new CRYPTO frames to avoid this bug :
when receiving a CRYPTO frame for the current offset, handle them
directly as previously done only if quic_enc_level ncbuf is empty. In
the other case, copy them to the buffer before treating contiguous data
via qc_treat_rx_crypto_frms().

This change ensures that ncb_advance() operation is now conducted only
in a data block : thus this is guaranteed to never fail.

This bug was easily reproduced with chromium as it fragments CRYPTO
frames randomly in several frames out of order.

This commit has two drawbacks :
- it is slightly less worst on performance because as sometimes even
  data at the current offset will be memcpy
- if a client uses too many fragmented CRYPTO frames, this can cause
  repeated ncb_add() error on gap size. This can be reproduced with
  chrome, albeit with a slighly less frequent rate than the fixed issue.

This change should fix in part github issue #1903.

This must be backported up to 2.6.
2022-11-18 16:44:46 +01:00
Amaury Denoyelle
7f0295f08a MINOR: ncbuf: complete doc for ncb_advance()
ncb_advance() operation may reject the operation if a too small gap is
formed in buffer front. This must be documented to avoid an issue with
it.

This should be backported up to 2.6.
2022-11-18 16:44:46 +01:00
Christopher Faulet
4cfdcbbd19 BUILD: peers: Remove unused variables
Since 0909f62266 ("BUG/MEDIUM: peers: messages about unkown tables not
correctly ignored"), the 'sc' variable is no longer used in
peer_treat_updatemsg() and peer_treat_definemsg() functions. So, we must
remove them to avoid compilation warning.

This patch must be backported with the commit above.
2022-11-18 16:40:56 +01:00
Christopher Faulet
5534334f1f MEDIUM: thread: Restric nbthread/thread-group(s) to very first global sections
nbhread, thead-group and thread-groups directives must only be defined in
very first global sections. It means no other section must have been parsed
before. Indeed, some parts of the configuratio depends on the value of these
settings and it is undefined to change them after.
2022-11-18 16:03:45 +01:00
Christopher Faulet
037e3f8735 MINOR: cfgparse: Always check the section position
In diag mode, the section position is checked and a warning is emitted if a
global section is defined after any non-global one. Now, this check is
always performed. But the warning is still only emitted in diag mode. In
addition, the result of this check is now stored in a global variable, to be
used from anywhere.

The aim of this patch is to be able to restrict usage of some global
directives to the very first global sections. It will be useful to avoid
undefined behaviors. Indeed, some config parts may depend on global settings
and it is a problem if these settings are changed after.
2022-11-18 16:03:45 +01:00
Emeric Brun
0909f62266 BUG/MEDIUM: peers: messages about unkown tables not correctly ignored
Table defintion's messages and update messages are not correctly
ignored if the table is not configured on the local peer.

It is a bug because, receiving those messages, the parser
returns an error and the upper layer considers that the state of the
peer's connection is modified (as it is done in the case of protocol
error) and switch immediatly the automate to process the new state.
But, even if message is silently ignored because the connection's
state doesn't change and we continue to process the next message, some
processing remains not performed: for instance the ALIVE flag is not set
on the peer's connection as it should be done after receiving any valid
messages. This results in a shutdown of the connection when timeout
is elapsed as if no message has been received during this delay.

This patch fix the behavior, those messages are now silently ignored
and the upper layer continue the processing as it is done for any valid
messages.

This bug appears with the code re-work of the peers on 2.0 so
it should be backported until this version.
2022-11-18 15:54:33 +01:00
William Lallemand
b60a77b6d0 BUG/MINOR: ssl: don't initialize the keylog callback when not required
The registering of the keylog callback seems to provoke a loss of
performance. Disable the registration as well as the fetches if
tune.ssl.keylog is off.

Must be backported as far as 2.2.
2022-11-18 15:24:23 +01:00
Christopher Faulet
dfefebcd7a BUG/MEDIUM: raw-sock: Don't report connection error if something was received
In raw_sock_to_buf(), if a low-level error is reported, we no longer
immediately set an error on the connexion if something was received. This
may happen when a RST is received with data. This way, we let a chance to
the mux to process received data first instead of immediately aborting.

This patch should fix some spurious health-check failures. It is pretty hard
to observe, but with a server immediately returning the response followed by
a RST, without waiting the request, it is possible to have some health-check
errors. For instance, with the following tcploop server:

  tcploop 8000 L Q W N1 A S:"HTTP/1.0 200 OK\r\n\r\n" F K
  ( Accept -> send response -> FIN -> Close)

we can have such strace output:

15:11:21.433005 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 38
15:11:21.433141 fcntl(38, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
15:11:21.433233 setsockopt(38, SOL_TCP, TCP_NODELAY, [1], 4) = 0
15:11:21.433359 setsockopt(38, SOL_TCP, TCP_QUICKACK, [0], 4) = 0
15:11:21.433457 connect(38, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
15:11:21.434215 epoll_ctl(4, EPOLL_CTL_ADD, 38, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=38, u64=38}}) = 0
15:11:21.434468 epoll_wait(4, [{events=EPOLLOUT, data={u32=38, u64=38}}], 200, 21) = 1
15:11:21.434810 recvfrom(38, 0x7f32a83e5020, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
15:11:21.435405 sendto(38, "OPTIONS / HTTP/1.0\r\ncontent-leng"..., 41, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 41
15:11:21.435833 epoll_ctl(4, EPOLL_CTL_MOD, 38, {events=EPOLLIN|EPOLLRDHUP, data={u32=38, u64=38}}) = 0
15:11:21.435907 epoll_wait(4, [{events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP, data={u32=38, u64=38}}], 200, 17) = 1
15:11:21.436024 recvfrom(38, "HTTP/1.0 200 OK\r\n\r\n", 16320, 0, NULL, NULL) = 19
15:11:21.436189 close(38)  = 0
15:11:21.436402 write(2, "[WARNING]  (163564) : Server bac"..., 184[WARNING]  (163564) : Server back-http/www is DOWN, reason: Socket error, check duration: 5ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

The response was received, but it is ignored because an error was reported
too. The error handling must be refactored. But it a titanic stain. Thus,
for now, a good fix is to delay the error report when something was
received. The error will be reported on the next receive, if any.

This patch should fix the issue #1863, but it must be confirmed. At least it
fixes the above example. It must be backported to 2.6. For older versions,
it must be evaluated first.
2022-11-18 15:12:23 +01:00
Aurelien DARRAGON
5ad2b64262 BUG/MINOR: http_ana/txn: don't re-initialize txn and req var lists
In http_create_txn(): vars_init_head() was performed on both s->vars_txn
and s->var_reqres lists.

But this is wrong, these two lists are already initialized upon stream
creation in stream_new().
Moreover, between stream_new() and http_create_txn(), some variable may
be defined (e.g.: by the frontend), resulting in lists not being empty.

Because of this "extra" list initialization, already defined variables
can be lost.
This causes txn dependant code not being able to access previously defined
variables as well as memory leak because http_destroy_txn() relies on these
lists to perform the purge.

This proved to be the case when a frontend sets variables and lua sample
fetch is used in backend section as described in GH #1935.
Many thanks to Darragh O'Toole for his detailed report.

Removing extra var_init_head (x2) in http_create_txn() to fix the issue.
Adding somme comments in the code in an attempt to prevent future misuses
of s->var_reqres, and s->var_txn lists.

It should be backported in every stable version.
(This is an old bug that seems to exist since 1.6-dev6)

[cf: On 2.0 and 1.8, for the legacy HTTP code, vars_init() are used during
     the TXN cleanup, when the stream is reused. So, these calls must be
     moved from http_init_txn() to http_reset_txn() and not removed.]
2022-11-18 10:20:44 +01:00
Christopher Faulet
ce7928d19c CLEANUP: mux-h1: Don't test h1c in h1_shutw_conn()
The H1 connection cannot be NULL when h1_shutw_conn() is called. Thus there
is no reason to test it.

This patch should fix the issue #1936.
2022-11-18 08:44:46 +01:00
Christopher Faulet
e6ef4cd747 BUG/MINOR: mux-h1: Fix error handling when H1S allocation failed on client side
The goto label is not at the right place. When H1S allocation failed, the
error is immediately handled. Thus, "no_parsing" label must be set just
after h1_send() call to skip the request parsing part.

It is 2-7-specific. No backport needed.
2022-11-17 15:54:13 +01:00
Christopher Faulet
ddfb50eec6 CLEANUP: listener: Remove useless task_queue from manage_global_listener_queue
At the end of manage_global_listener_queue(), the task expire date is set to
TICK_ETERNITY. Thus, it is useless to call task_queue() just after because
the function does nothing in this case.
2022-11-17 15:18:59 +01:00
Christopher Faulet
13e86d947d BUG/MEDIUM: listener: Fix race condition when updating the global mngmt task
It is pretty similar to fbb934da90 ("BUG/MEDIUM: stick-table: fix a race
condition when updating the expiration task"). When the global management
task is running, at the end of its process function, it resets the expire
date by setting it to TICK_ETERNITY. In same time, a listener may reach a
global limit and decides to schedule the task. Thus it is possible to queue
the task and trigger the BUG_ON() on the expire date because its value was
set to TICK_ETERNITY in the means time:

  FATAL: bug condition "task->expire == 0" matched at src/task.c:310
    call trace(12):
    |       0x662de8 [b8 01 00 00 00 c6 00 00]: __task_queue+0xc7/0x11e
    |       0x63b03f [48 b8 04 00 00 00 05 00]: main+0x2535e
    |       0x63ed1a [e9 d2 fd ff ff 48 8b 45]: listener_accept+0xf72/0xfda
    |       0x6a36d3 [eb 01 90 c9 c3 55 48 89]: sock_accept_iocb+0x82/0x87
    |       0x6af22f [48 8b 05 ca f9 13 00 8b]: fd_update_events+0x35a/0x497
    |       0x42a7a8 [89 45 d8 83 7d d8 02 75]: main-0x1eb539
    |       0x6158fb [48 8b 05 e6 06 1c 00 64]: run_poll_loop+0x2e7/0x319
    |       0x615b6c [48 8b 05 ed 65 1d 00 48]: main-0x175
    | 0x7ffff775bded [e9 69 fe ff ff 48 8b 4c]: libc:+0x8cded
    | 0x7ffff77e1370 [48 89 c7 b8 3c 00 00 00]: libc:+0x112370

To fix the bug, a RW lock is introduced. It is used to fix the race
condition. A read lock is taken when the task is scheduled, in
listener_accpet() and a write lock is used at the end of process function to
set the expire date to TICK_ETERNITY. This lock should not be used very
often and most of time by "readers". So, the impact should be really
limited.

This patch should fix the issue #1930. It must be backported as far as 1.8
with some cautions because the code has evolved a lot since then.
2022-11-17 15:18:40 +01:00
Christopher Faulet
62138aab3e MINOR: mux-h1: Rely on a H1S flag to know a WS key was found or not
h1_process_mux() is written to allow partial headers formatting. For now,
all headers are forwarded in one time. But it is still good to keep this
ability at the H1 mux level. So we must rely on a H1S flag instead of a
local variable to know a WebSocket key was found in headers to be able to
generate a key if necessary.

There is no reason to backport this patch.
2022-11-17 14:33:15 +01:00
Christopher Faulet
7f6aa56d47 MINOR: sconn: Set SE_FL_ERROR only when there is no more data to read
SE_FL_ERR_PENDING flag is used when there is still data to be read. So we
must take care to not set SE_FL_ERROR too early. Thus, on sending path, it
must be set if SE_FL_EOS was already set.
2022-11-17 14:33:15 +01:00
Christopher Faulet
ab79b321d6 MEDIUM: mux-fcgi: Introduce flags to deal with connection read/write errors
Similarly to the H1 and H2 multiplexers, FCFI_CF_ERR_PENDING is now used to
report an error when we try to send data and FCGI_CF_ERROR to report an
error when we try to read data. In other funcions, we rely on these flags
instead of connection ones. Only FCGI_CF_ERROR is considered as a final
error.  FCGI_CF_ERR_PENDING does not block receive attempt.

In addition, FCGI_CF_EOS flag was added. we rely on it to test if a read0
was received or not.
2022-11-17 14:33:15 +01:00
Christopher Faulet
68ee7845cf CLEANUP: mux-h2: Remove unused fields in h2c structures
Some fields in h2c structures are not used: .mfl, .mft and .mff. Just remove
them.

.msi field is also removed. It is tested but never set, except when a H2
connection is initialized. It also means h2c_mux_busy() function is useless
because it always returns 0 (.msi is always -1). And thus, by transitivity,
H2_CF_DEM_MBUSY is also useless because it is never set. So .msi field,
h2c_mux_busy() function and H2C_MUX_BUSY flag are removed.
2022-11-17 14:33:15 +01:00
Christopher Faulet
ff7925dce0 MEDIUM: mux-h2: Introduce flags to deal with connection read/write errors
Similarly to the H1 multiplexer, H2_CF_ERR_PENDING is now used to report an
error when we try to send data and H2_CF_ERROR to report an error when we
try to read data. In other funcions, we rely on these flags instead of
connection ones. Only H2_CF_ERROR is considered as a final error.
H2_CF_ERR_PENDING does not block receive attempt.

In addition, we rely on H2_CF_RCVD_SHUT flag to test if a read0 was received
or not.
2022-11-17 14:33:15 +01:00
Christopher Faulet
b65af26e19 MEDIUM: mux-pt: Don't always set a final error on SE on the sending path
SE_FL_ERROR must be set on the SE descriptor only if EOS was already
reported. So call se_fl_set_error() function to properly the
ERR_PENDING/ERROR flags. It is not really a bug because the mux-pt is really
simple. But it is better to do it now the right way.
2022-11-17 14:33:15 +01:00
Christopher Faulet
31da34d1e7 MEDIUM: mux-h1: Don't report a final error whe a message is aborted
When the H1 connection is aborted, we no longer set a final error. To do so,
the flag H1C_F_ABORTED was added. For now, it is only set when a error is
detected on the H1 stream. Idea is to use ERR_PENDING/ERROR for upgoing
errors and ABRT_PENDING/ABRTED for downgoing errors.
2022-11-17 14:33:15 +01:00
Christopher Faulet
fc473a6453 MEDIUM: mux-h1: Rely on the H1C to deal with shutdown for reads
read0 is now handled with a H1 connection flag (H1C_F_EOS). Corresponding
flag was removed on the H1 stream and we fully rely on the SE descriptor at
the stream level.

Concretly, it means we rely on the H1 connection flags instead of the
connection one. H1C_F_EOS is only set in h1_recv() or h1_rcv_pipe() after a
read if a read0 was detected.
2022-11-17 14:33:15 +01:00
Christopher Faulet
bef8900cd6 MINOR: mux-h1: Add flag on H1 stream to deal with internal errors
A new error is added on H1 stream to deal with internal errors. For now,
this error is only reported when we fail to create a stream-connector. This
way, the error is reported at the H1 stream level and not the H1 connection
level.
2022-11-17 14:33:14 +01:00
Christopher Faulet
56a499475f CLEANUP: mux-h1: Rename H1C_F_ERR_PENDING into H1C_F_ABRT_PENDING
H1C_F_ERR_PENDING flags will be used to refactor error handling at the H1
connection level. It will be used to notify error during sends. Thus, the
flag to notify an error must be sent before closing the connection is now
named H1C_F_ABRT_PENDING.

This introduce a naming convertion: ERROR must be used to notify upper layer
of an event at the lower ones while ABORT must be used in the opposite
direction.
2022-11-17 14:33:14 +01:00
Christopher Faulet
2177d96acd MINOR: mux-h1: Don't handle subscribe for reads in h1_process_demux()
When the request headers are not fully received, we must subscribe the H1
connection for reads to be able to receive more data. This was performed in
h1_process_demux(). It is now perfoemd in h1_process_demux().
2022-11-17 14:33:14 +01:00
Christopher Faulet
4e72b172d7 MEDIUM: mux-h1: Handle H1C states via its state field instead of H1C_F_ST_*
The H1 connection state is now handled in a dedicated state. H1C_F_ST_*
flags are removed. All states are now exclusives. It is easier to know the
H1 connection states. It is alive, or usable, if it is not CLOSING or
CLOSED. It is CLOSING if it should be closed ASAP but a stream is still
attached and/or the output buffer is not empty. CLOSED is used when the H1
connection is ready to be closed. Other states are quite easy to understand.

There is no special changes in the H1 connection behavior. Except in
h1_send(). When a CLOSING connection is CLOSED, the function now reports an
activity. In addition, when an embryonic H1 stream is aborted, it is
destroyed. This way, the H1 connection can be switched to CLOSED state.
2022-11-17 14:33:14 +01:00
Christopher Faulet
ef93be2a7b MINOR: mux-h1: Add a dedicated enum to deal with H1 connection state
The H1 connection state will be handled is a dedicated field. To do so,
h1_cs enum was added. The different states are more or less equivalent to
H1C_F_ST_* flags:

 * H1_CS_IDLE      <=> H1C_F_ST_IDLE
 * H1_CS_EMBRYONIC <=> H1C_F_ST_EMBRYONIC
 * H1_CS_UPGRADING <=> H1C_F_ST_ATTACHED && !H1C_F_ST_READY
 * H1_CS_RUNNING   <=> H1C_F_ST_ATTACHED && H1C_F_ST_READY
 * H1_CS_CLOSING   <=> H1C_F_ST_SHUTDOWN && (H1C_F_ST_ATTACHED || b_data(&h1c->ibuf))
 * H1_CS_CLOSED    <=> H1C_F_ST_SHUTDOWN && !H1C_F_ST_ATTACHED && !b_data(&h1c->ibuf)

In addition, in this patch, the h1_is_alive() and h1_close() function are
added. The first one will be used to know if a H1 connection is alive or
not. The second one will be used to set the connection in CLOSING or CLOSED
state, depending on the output buffer state and if there is still a H1
stream or not.

For now, the H1 connection state is not used.
2022-11-17 14:33:14 +01:00
Christopher Faulet
71abc0cfd5 CLEANUP: mux-h1: Rename H1C_F_ST_ERROR and H1C_F_ST_SILENT_SHUT flags
_ST_ part is removed from these 2 flags because they don't reflect a
state. In addition, the H1 connection state will be handled in a dedicated
enum.
2022-11-17 14:33:14 +01:00
Christopher Faulet
089cc6e805 REORG: mux-h1: Reorg the H1C structure
Fields in H1C structure are reorganised to not have the output buffer
straddled between to cache lines. There is 4-bytes hole after the flags, but
it will be partially filled by an enum representing the H1 connection state.
2022-11-17 14:33:14 +01:00
Christopher Faulet
7fcbcc0e4c CLEANUP: mux-h1; Rename H1S_F_ERROR flag into H1S_F_ERROR_MASK
In fact, H1S_F_ERROR is not a flag but a mask. So rename it to make it
clear.
2022-11-17 14:33:14 +01:00
Christopher Faulet
c3fe6f3b7a MINOR: mux-h1: Remove usless code inside shutr callback
For now, at the transport-layer level, there is no shutr callback (in
xprt_ops). Thus, to ease the aborts refacotring, the code is removed from
the h1_shutr() function. The callback is not removed for now. It is only
kept to have a trace message. It may be handy for debugging sessions.
2022-11-17 14:33:14 +01:00
Willy Tarreau
0c5e9896c7 BUG/MINOR: pool/cli: use ullong to report total pool usage in bytes
As noticed by Gabriel Tzagkarakis in issue #1903, the total pool size
in bytes is historically still in 32 bits, but at least we should report
the product of the number of objects and their size in 64 bits so that
the value doesn't wrap around 4G.

This may be backported to all versions.
2022-11-17 11:10:53 +01:00
Amaury Denoyelle
3a72ba2aed BUILD: quic: fix dubious 0-byte overflow on qc_release_lost_pkts
With GCC 12.2.0 and O2 optimization activated, compiler reports the
following warning for qc_release_lost_pkts().

In function ‘quic_tx_packet_refdec’,
    inlined from ‘qc_release_lost_pkts.constprop’ at src/quic_conn.c:2056:3:
include/haproxy/atomic.h:320:41: error: ‘__atomic_sub_fetch_4’ writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
  320 | #define HA_ATOMIC_SUB_FETCH(val, i)     __atomic_sub_fetch(val, i, __ATOMIC_SEQ_CST)
      |                                         ^~~~~~~~~~~~~~~~~~
include/haproxy/quic_conn.h:499:14: note: in expansion of macro ‘HA_ATOMIC_SUB_FETCH’
  499 |         if (!HA_ATOMIC_SUB_FETCH(&pkt->refcnt, 1)) {
      |              ^~~~~~~~~~~~~~~~~~~

GCC thinks that quic_tx_packet_refdec() can be called with a NULL
argument from qc_release_lost_pkts() with <oldest_lost> as arg.

This warning is a false positive as <oldest_lost> cannot be NULL in
qc_release_lost_pkts() at this stage. This is due to the previous check
to ensure that <pkts> list is not empty.

This warning is silenced by using ALREADY_CHECKED() macro.

This should be backported up to 2.6.

This should fix github issue #1852.
2022-11-17 10:34:47 +01:00
Willy Tarreau
1b662aabbf BUG/MEDIUM: ring: fix creation of server in uninitialized ring
If a "ring" section initialization fails (e.g. due to a duplicate name,
invalid chars, or missing memory), any subsequent "server" statement that
appears in the same section will crash the config parser by dereferencing
the currently NULL cfg_sink. E.g:

    ring x
    ring x                 # fails on "already exists"
       server srv 1.1.1.1  # crashes on cfg_sink==NULL

All other statements have a test for this but "server" was missing it,
so this patch adds it.

Thanks to Joel Hutchinson for reporting this issue.

This must be backported as far as 2.2.
2022-11-16 18:59:43 +01:00
Willy Tarreau
9fd0542148 MEDIUM: trace: create a new "trace" statement in the "global" section
The exact same commands as those from the CLI may be pre-loaded at boot
time by passing them one per line after the "trace" keyword in the global
section; i.e. just copy-pasting all commands directly there will do the
job. Note that if a ring is mentioned, it needs to be declared before the
global section. Another option is to append another global section after
"ring".

For now the keyword is marked as experimental to discourage its broad
adoption by default. "expose-experimental-directives" needs to be placed
in the global section to expose it.
2022-11-16 17:55:53 +01:00
Willy Tarreau
c11f1cdf4d MINOR: trace: split the CLI "trace" parser in CLI vs statement
In order to be able to reuse the "trace" statements elsewhere (e.g.
global section), we'll first need to split its parser. It turns out
that the whole thing is self-contained inside a single function that
emits a single message on warning/error or nothing on success. That's
quite easy to split in two parts, the one that does the job and produces
the status message and the one that sends it to the CLI. That's what
this patch does.
2022-11-16 17:55:53 +01:00
Mickael Torres
226082d13a BUG/MINOR: mux-h1: Do not send a last null chunk on body-less answers
HEAD answers should not contain any body data. Currently when a
"transfer-encoding: chunked" header is returned, a last null-chunk is added to
the answer. Some clients choke on it and fail when trying to reuse the
connection. Check that the response should not be body-less before sending the
null-chunk.

This patch should fix #1932. It must be backported as far as 2.4.
2022-11-16 16:25:26 +01:00
Remi Tricot-Le Breton
e608b0eb16 BUG/MINOR: ssl: SSL_load_error_strings might not be defined
The SSL_load_error_strings function was marked as deprecated in OpenSSL
1.1.0 so compiling HAProxy with OPENSSL_NO_DEPRECATED set and a recent
OpenSSL library would fail.
The manpages say that this function was replaced by OPENSSL_init_crypto
and OPENSSL_init_ssl which are already called at start up by the SSL
lib. We do not seem to be in a case where explicit call of those
functions is required.

This patch fixes GitHub issue #1813.
It can be backported to 2.6.
2022-11-16 11:09:33 +01:00
Christopher Faulet
52fd8a1b7b BUG/MEDIUM: mux-fcgi: Avoid value length overflow when it doesn't fit at once
When the request data are copied in a mbuf, if the free space is too small
to copy all data at once, the data length is shortened. When this is
performed, we reserve the size of the STDIN recod header and eventually the
same for the empty STDIN record if it is the last HTX block of the request.

However, there is no test to be sure the free space is large enough. Thus,
on this special case, when the mbuf is almost full, it is possible to
overflow the value length. Because of this bug, it is possible to experience
crashes from time to time.

This patch should fix the issue #1923. It must be backported as far as 2.4.
2022-11-16 09:27:09 +01:00
Christopher Faulet
e8c7fb3588 BUG/MINOR: mux-fcgi: Be sure to send empty STDING record in case of zero-copy
When the last HTX DATA block was copied in zero-copy, the empty STDIN
record, marking the end of the request data was never sent. Thanks to this
patch, it is now sent.

This patch must be backported as far as 2.4.
2022-11-16 09:27:09 +01:00
Christopher Faulet
2364b39984 BUG/MINOR: resolvers: Set port before IP address when processing SRV records
For a server subject to SRV resolution, when the server's address is set,
its dynamic cookie, if any, and its server key are computed. Both are based
on the ip/port pair. However, this happens before the server's port is
set. Thus the port is equal to 0 at this stage. It is a problem if several
servers share the same IP but with different ports because they will share
the same dynamic cookie and the same server key, disturbing this way the
connection persistency and the session stickiness.

This patch must be backported as far as 2.2.
2022-11-16 09:27:09 +01:00
Christopher Faulet
68a61b6321 BUG/MINOR: resolvers: Don't wait periodic resolution on healthcheck failure
DNS resoltions may be triggered via a "do-resolve" action or when a connection
failure is experienced during a healthcheck. Cached valid responses are used, if
possible. But if the entry is expired or if there is no valid response, a new
reolution should be performed. However, an resolution is only performed if the
"resolve" timeout is expired. Thus, when this comes from a healthcheck, it means
no extra resolution is performed at all.

Now, when the resolution is performed for a server (SRV or SRVEQ) and no valid
response is found, the resolution timer is reset (last_resolution is set to
TICK_ETERNITY). Of course, it is only performed if no resolution is already
running.

Note that this feature was broken 5 years ago when the resolvers code was
refactored (67957bd59e).

This patch should fix the issue #1906. It affects all stable versions. However,
it is probably a good idea to not backport it too far (2.6, maybe 2.4) and with
some delay.
2022-11-16 09:27:09 +01:00
Christopher Faulet
5a3d9a77e2 BUG/MINOR: http-htx: Fix error handling during parsing http replies
When an error is triggered during arguments parsing of an http reply (for
instance, from a "return" rule), while a log-format body was expected but
not evaluated yet, HAproxy crashes when the body log-format string is
released because it was not properly initialized.

The list used for the log-format string must be initialized earlier.

This patch should fix the issue #1925. It must be backported as far as 2.2.
2022-11-16 09:27:09 +01:00
William Lallemand
78c7a06e4f MINOR: ssl: reintroduce ERR_GET_LIB(ret) == ERR_LIB_PEM in ssl_sock_load_pem_into_ckch()
Commit 432cd1a ("MEDIUM: ssl: be stricter about chain error") removed
the ERR_GET_LIB(ret) != ERR_LIB_PEM to be stricter about errors.

However, PEM_R_NO_START_LINE is better be checked with ERR_LIB_PEM.
So this patch complete the previous one.

The original problem was that the condition was wrongly inversed. This
original code from openssl:

  if (ERR_GET_LIB(err) == ERR_LIB_PEM &&
      ERR_GET_REASON(err) == PEM_R_NO_START_LINE)

became:

  if (ret && (ERR_GET_LIB(ret) != ERR_LIB_PEM &&
              ERR_GET_REASON(ret) != PEM_R_NO_START_LINE))

instead of:

  if (ret && !(ERR_GET_LIB(ret) == ERR_LIB_PEM &&
               ERR_GET_REASON(ret) == PEM_R_NO_START_LINE))

This must not be backported as it will break a lot of setup. That's too
bad because a lot of errors are lost. Not marked as a bug because of the
breakage it could cause on working setups.
2022-11-15 18:24:17 +01:00
William Lallemand
45fed2c7a6 MINOR: ssl: ssl_sock_load_cert_chain() display error strings
Display error strings when SSL_CTX_use_certificate()  or
SSL_CTX_set1_chain() doesn't work.
2022-11-15 16:56:03 +01:00
Willy Tarreau
e98d385819 MINOR: deinit: add a "quick-exit" option to bypass the deinit step
Once in a while we spot a bug in the deinit code that is complex,
especially when it has to deal with incomplete initializations, and the
ability to bypass this step has regularly been raised. In addition for
fast-reloading setups it could theoretically save some time. Tests have
shown that very large configs can barely save ~100-150ms by skipping the
deinit step. However the ability not to crash if a bug is encountered can
occasionally help.

This patch adds an option to do exactly this. It's obviously not enabled
by default and the documentation discourages from using it, but this might
be useful in the future.
2022-11-15 09:37:09 +01:00
Aurelien DARRAGON
16d6c0cb09 BUG/MEDIUM: wdt/clock: properly handle early task hangs
In ae053b30 - BUG/MEDIUM: wdt: don't trigger the watchdog when p is unitialized:
	wdt is not triggering until prev_cpu_time
	is initialized to prevent unexpected process
	termination.

Unfortunately this is not enough, some tasks could start
immediately after process startup, and in such cases
prev_cpu_time could be uninitialized, because
prev_cpu_time is set after the polling loop while
process_runnable_tasks() is executed before the polling loop.

It happens to be the case with lua tasks registered using
register_task function from lua script.

Those tasks are registered in early init stage of haproxy and
they are scheduled to run before the first polling loop,
leading to prev_cpu_time being uninitialized (equals 0)
on the thread when the task is first executed.

Because of this, if such tasks get stuck right away
(e.g: blocking IO) the watchdog won't behave as expected
and the thread will remain stuck indefinitely.
(polling loop for the thread won't run at all as
the thread is already stuck)

To solve this, we're now making sure that prev_cpu_time is first
set before any tasks are processed on the thread.
This is done by setting initial prev_cpu_time value directly
in clock_init_thread_date()

Thanks to Abhijeet Rastogi for reporting this unexpected behavior.

It could be backported in every stable versions.
(everywhere ae053b30 is, because both are related)
2022-11-14 19:14:53 +01:00
Willy Tarreau
aa1909edf7 MEDIUM: http-ana: remove set-cookie2 support
This has never really been implemented in clients nor servers. We wanted
to drop it from 2.5 already but forgot, so let's do it now. The code was
only minimally changed. It could possibly be slightly simplified but it
would only be marginal, at the great risk of breaking something, thus
let's keep it in its proven state instead.

Tracked in github issue #1551.
2022-11-14 19:01:45 +01:00
Willy Tarreau
fbb934da90 BUG/MEDIUM: stick-table: fix a race condition when updating the expiration task
Pierre Cheynier reported a rare crash that can affect stick-tables. When
a entry is created, the stick-table's expiration date is updated. But if
at exactly the same time the expiration task runs, it finishes by updating
its expiration timer without any protection, which may collide with the
call to task_queue() in another thread. In this case, it sometimes happens
that the first test for TICK_ETERNITY in task_queue() passes, then the
"expire" field is reset, then the BUG_ON() triggers, like below:

  FATAL: bug condition "task->expire == 0" matched at src/task.c:279
    call trace(13):
    |       0x649d86 [c6 04 25 01 00 00 00 00]: __task_queue+0xc6/0xce
    |       0x596bef [eb 90 ba 03 00 00 00 be]: stktable_requeue_exp+0x1ef/0x258
    |       0x596c87 [48 83 bb 90 00 00 00 00]: stktable_touch_with_exp+0x27/0x312
    |       0x563698 [48 8b 4c 24 18 4c 8b 4c]: stream_process_counters+0x3a8/0x6a2
    |       0x569344 [49 8b 87 f8 00 00 00 48]: process_stream+0x3964/0x3b4f
    |       0x64a80b [49 89 c7 e9 23 ff ff ff]: run_tasks_from_lists+0x3ab/0x566
    |       0x64ad66 [29 44 24 14 8b 7c 24 14]: process_runnable_tasks+0x396/0x71e
    |       0x6184b2 [83 3d 47 b3 a6 00 01 0f]: run_poll_loop+0x92/0x4ff
    |       0x618acf [48 8b 1d aa 20 7d 00 48]: main+0x1877ef
    | 0x7fc7d6ec1e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45
    | 0x7fc7d6c9e4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

This one is extremely difficult to reproduce in practice, but adding a
printf() in process_table_expire() before assigning the value, while
running with an expire delay of 1ms helps a lot and may trigger the
crash in less than one minute on a 8-thread machine. Interestingly,
depending on the sequencing, this bug could also have made a table fail
to expire if the expire field got reset after the last update but before
the call to task_queue(). It would require to be quite unlucky so that
the table is never touched anymore after the race though.

The solution taken by this patch is to take the table's lock when
updating its expire value in stktable_requeue_exp(), enclosing the call
to task_queue(), and to update the task->expire field while still under
the lock in process_table_expire(). Note that thanks to previous changes,
taking the table's lock for the update in stktable_requeue_exp() costs
almost nothing since we now have the guarantee that this is not done more
than 1000 times a second.

Since process_table_expire() sets the timeout after returning from
stktable_trash_expired() which just released the lock, the two functions
were merged so that the task's expire field is updated while still under
the lock. Note that this heavily depends on the two previous patches
below:

  CLEANUP: stick-table: remove the unused table->exp_next
  OPTIM: stick-table: avoid atomic ops in stktable_requeue_exp() when possible

This is a bit complicated due to the fact that in 2.7 some parts were
made lockless. In 2.6 and older, the second part (the merge of the
two functions) will be sufficient since the task_queue() call was
already performed under the table's lock, and the patches above are
not needed.

This needs to be backported as far as 1.8 scrupulously following
instructions above.
2022-11-14 18:20:38 +01:00