14970 Commits

Author SHA1 Message Date
Christopher Faulet
52fd8a1b7b BUG/MEDIUM: mux-fcgi: Avoid value length overflow when it doesn't fit at once
When the request data are copied in a mbuf, if the free space is too small
to copy all data at once, the data length is shortened. When this is
performed, we reserve the size of the STDIN recod header and eventually the
same for the empty STDIN record if it is the last HTX block of the request.

However, there is no test to be sure the free space is large enough. Thus,
on this special case, when the mbuf is almost full, it is possible to
overflow the value length. Because of this bug, it is possible to experience
crashes from time to time.

This patch should fix the issue #1923. It must be backported as far as 2.4.
2022-11-16 09:27:09 +01:00
Christopher Faulet
e8c7fb3588 BUG/MINOR: mux-fcgi: Be sure to send empty STDING record in case of zero-copy
When the last HTX DATA block was copied in zero-copy, the empty STDIN
record, marking the end of the request data was never sent. Thanks to this
patch, it is now sent.

This patch must be backported as far as 2.4.
2022-11-16 09:27:09 +01:00
Christopher Faulet
2364b39984 BUG/MINOR: resolvers: Set port before IP address when processing SRV records
For a server subject to SRV resolution, when the server's address is set,
its dynamic cookie, if any, and its server key are computed. Both are based
on the ip/port pair. However, this happens before the server's port is
set. Thus the port is equal to 0 at this stage. It is a problem if several
servers share the same IP but with different ports because they will share
the same dynamic cookie and the same server key, disturbing this way the
connection persistency and the session stickiness.

This patch must be backported as far as 2.2.
2022-11-16 09:27:09 +01:00
Christopher Faulet
68a61b6321 BUG/MINOR: resolvers: Don't wait periodic resolution on healthcheck failure
DNS resoltions may be triggered via a "do-resolve" action or when a connection
failure is experienced during a healthcheck. Cached valid responses are used, if
possible. But if the entry is expired or if there is no valid response, a new
reolution should be performed. However, an resolution is only performed if the
"resolve" timeout is expired. Thus, when this comes from a healthcheck, it means
no extra resolution is performed at all.

Now, when the resolution is performed for a server (SRV or SRVEQ) and no valid
response is found, the resolution timer is reset (last_resolution is set to
TICK_ETERNITY). Of course, it is only performed if no resolution is already
running.

Note that this feature was broken 5 years ago when the resolvers code was
refactored (67957bd59e).

This patch should fix the issue #1906. It affects all stable versions. However,
it is probably a good idea to not backport it too far (2.6, maybe 2.4) and with
some delay.
2022-11-16 09:27:09 +01:00
Christopher Faulet
5a3d9a77e2 BUG/MINOR: http-htx: Fix error handling during parsing http replies
When an error is triggered during arguments parsing of an http reply (for
instance, from a "return" rule), while a log-format body was expected but
not evaluated yet, HAproxy crashes when the body log-format string is
released because it was not properly initialized.

The list used for the log-format string must be initialized earlier.

This patch should fix the issue #1925. It must be backported as far as 2.2.
2022-11-16 09:27:09 +01:00
William Lallemand
78c7a06e4f MINOR: ssl: reintroduce ERR_GET_LIB(ret) == ERR_LIB_PEM in ssl_sock_load_pem_into_ckch()
Commit 432cd1a ("MEDIUM: ssl: be stricter about chain error") removed
the ERR_GET_LIB(ret) != ERR_LIB_PEM to be stricter about errors.

However, PEM_R_NO_START_LINE is better be checked with ERR_LIB_PEM.
So this patch complete the previous one.

The original problem was that the condition was wrongly inversed. This
original code from openssl:

  if (ERR_GET_LIB(err) == ERR_LIB_PEM &&
      ERR_GET_REASON(err) == PEM_R_NO_START_LINE)

became:

  if (ret && (ERR_GET_LIB(ret) != ERR_LIB_PEM &&
              ERR_GET_REASON(ret) != PEM_R_NO_START_LINE))

instead of:

  if (ret && !(ERR_GET_LIB(ret) == ERR_LIB_PEM &&
               ERR_GET_REASON(ret) == PEM_R_NO_START_LINE))

This must not be backported as it will break a lot of setup. That's too
bad because a lot of errors are lost. Not marked as a bug because of the
breakage it could cause on working setups.
2022-11-15 18:24:17 +01:00
William Lallemand
45fed2c7a6 MINOR: ssl: ssl_sock_load_cert_chain() display error strings
Display error strings when SSL_CTX_use_certificate()  or
SSL_CTX_set1_chain() doesn't work.
2022-11-15 16:56:03 +01:00
Willy Tarreau
e98d385819 MINOR: deinit: add a "quick-exit" option to bypass the deinit step
Once in a while we spot a bug in the deinit code that is complex,
especially when it has to deal with incomplete initializations, and the
ability to bypass this step has regularly been raised. In addition for
fast-reloading setups it could theoretically save some time. Tests have
shown that very large configs can barely save ~100-150ms by skipping the
deinit step. However the ability not to crash if a bug is encountered can
occasionally help.

This patch adds an option to do exactly this. It's obviously not enabled
by default and the documentation discourages from using it, but this might
be useful in the future.
2022-11-15 09:37:09 +01:00
Aurelien DARRAGON
16d6c0cb09 BUG/MEDIUM: wdt/clock: properly handle early task hangs
In ae053b30 - BUG/MEDIUM: wdt: don't trigger the watchdog when p is unitialized:
	wdt is not triggering until prev_cpu_time
	is initialized to prevent unexpected process
	termination.

Unfortunately this is not enough, some tasks could start
immediately after process startup, and in such cases
prev_cpu_time could be uninitialized, because
prev_cpu_time is set after the polling loop while
process_runnable_tasks() is executed before the polling loop.

It happens to be the case with lua tasks registered using
register_task function from lua script.

Those tasks are registered in early init stage of haproxy and
they are scheduled to run before the first polling loop,
leading to prev_cpu_time being uninitialized (equals 0)
on the thread when the task is first executed.

Because of this, if such tasks get stuck right away
(e.g: blocking IO) the watchdog won't behave as expected
and the thread will remain stuck indefinitely.
(polling loop for the thread won't run at all as
the thread is already stuck)

To solve this, we're now making sure that prev_cpu_time is first
set before any tasks are processed on the thread.
This is done by setting initial prev_cpu_time value directly
in clock_init_thread_date()

Thanks to Abhijeet Rastogi for reporting this unexpected behavior.

It could be backported in every stable versions.
(everywhere ae053b30 is, because both are related)
2022-11-14 19:14:53 +01:00
Willy Tarreau
aa1909edf7 MEDIUM: http-ana: remove set-cookie2 support
This has never really been implemented in clients nor servers. We wanted
to drop it from 2.5 already but forgot, so let's do it now. The code was
only minimally changed. It could possibly be slightly simplified but it
would only be marginal, at the great risk of breaking something, thus
let's keep it in its proven state instead.

Tracked in github issue #1551.
2022-11-14 19:01:45 +01:00
Willy Tarreau
fbb934da90 BUG/MEDIUM: stick-table: fix a race condition when updating the expiration task
Pierre Cheynier reported a rare crash that can affect stick-tables. When
a entry is created, the stick-table's expiration date is updated. But if
at exactly the same time the expiration task runs, it finishes by updating
its expiration timer without any protection, which may collide with the
call to task_queue() in another thread. In this case, it sometimes happens
that the first test for TICK_ETERNITY in task_queue() passes, then the
"expire" field is reset, then the BUG_ON() triggers, like below:

  FATAL: bug condition "task->expire == 0" matched at src/task.c:279
    call trace(13):
    |       0x649d86 [c6 04 25 01 00 00 00 00]: __task_queue+0xc6/0xce
    |       0x596bef [eb 90 ba 03 00 00 00 be]: stktable_requeue_exp+0x1ef/0x258
    |       0x596c87 [48 83 bb 90 00 00 00 00]: stktable_touch_with_exp+0x27/0x312
    |       0x563698 [48 8b 4c 24 18 4c 8b 4c]: stream_process_counters+0x3a8/0x6a2
    |       0x569344 [49 8b 87 f8 00 00 00 48]: process_stream+0x3964/0x3b4f
    |       0x64a80b [49 89 c7 e9 23 ff ff ff]: run_tasks_from_lists+0x3ab/0x566
    |       0x64ad66 [29 44 24 14 8b 7c 24 14]: process_runnable_tasks+0x396/0x71e
    |       0x6184b2 [83 3d 47 b3 a6 00 01 0f]: run_poll_loop+0x92/0x4ff
    |       0x618acf [48 8b 1d aa 20 7d 00 48]: main+0x1877ef
    | 0x7fc7d6ec1e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45
    | 0x7fc7d6c9e4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

This one is extremely difficult to reproduce in practice, but adding a
printf() in process_table_expire() before assigning the value, while
running with an expire delay of 1ms helps a lot and may trigger the
crash in less than one minute on a 8-thread machine. Interestingly,
depending on the sequencing, this bug could also have made a table fail
to expire if the expire field got reset after the last update but before
the call to task_queue(). It would require to be quite unlucky so that
the table is never touched anymore after the race though.

The solution taken by this patch is to take the table's lock when
updating its expire value in stktable_requeue_exp(), enclosing the call
to task_queue(), and to update the task->expire field while still under
the lock in process_table_expire(). Note that thanks to previous changes,
taking the table's lock for the update in stktable_requeue_exp() costs
almost nothing since we now have the guarantee that this is not done more
than 1000 times a second.

Since process_table_expire() sets the timeout after returning from
stktable_trash_expired() which just released the lock, the two functions
were merged so that the task's expire field is updated while still under
the lock. Note that this heavily depends on the two previous patches
below:

  CLEANUP: stick-table: remove the unused table->exp_next
  OPTIM: stick-table: avoid atomic ops in stktable_requeue_exp() when possible

This is a bit complicated due to the fact that in 2.7 some parts were
made lockless. In 2.6 and older, the second part (the merge of the
two functions) will be sufficient since the task_queue() call was
already performed under the table's lock, and the patches above are
not needed.

This needs to be backported as far as 1.8 scrupulously following
instructions above.
2022-11-14 18:20:38 +01:00
Willy Tarreau
3238f79d12 OPTIM: stick-table: avoid atomic ops in stktable_requeue_exp() when possible
Since the task's time resolution is the millisecond we know there
will not be more than 1000 useful updates per second, so there's no
point in doing a CAS and a task_queue() for each call, better first
check if we're going to change the date. Now we're certain not to
perform such operations more than 1000 times a second for a given
table.

The loop was modified because this improvement will also be used to fix a
bug later.
2022-11-14 18:20:38 +01:00
Willy Tarreau
6342714052 CLEANUP: stick-table: remove the unused table->exp_next
The ->exp_next field of the stick-table was probably useful in 1.5 but
it currently only carries a copy of what the future value of the table's
task's expire value will be, while it's systematically copied over there
immediately after being assigned. As such it provides exactly a local
variable. Let's remove it, as it costs atomic operations.
2022-11-14 18:20:38 +01:00
William Lallemand
f813dab175 BUG/MINOR: ssl: crt-ignore-err memory leak with 'all' parameter
Only allocate "str" if the parameter is not "all" in order to avoid any
memory leak.

No backport needed.
2022-11-14 11:43:52 +01:00
William Lallemand
380085dfc1 CLEANUP: ssl: remove printf in bind_parse_ignore_err
Remove debug printf.

No backport needed.
2022-11-14 11:34:07 +01:00
Willy Tarreau
476c2802c5 BUILD: stconn: use __fallthrough in various shutw() functions
This avoids 7 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
91d398cce2 BUILD: compression: use __fallthrough in comp_http_payload()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
ab42dc3358 BUILD: map: use __fallthrough in cli_io_handler_*()
This avoids 6 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
68a71ed3f6 BUILD: vars: use __fallthrough in var_accounting_{diff,add}()
This avoids 6 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
779aa693d9 BUILD: h1_htx: use __fallthrough in h1_parse_chunk()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
c5bc4ad24d BUILD: http_act: use __fallthrough in parse_http_del_header()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
36a73439f9 BUILD: check: use __fallthrough in __health_adjust()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
80f9a63184 BUILD: logs: use __fallthrough in build_log_header()
This avoids 4 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
3064f52b15 BUILD: spoe: use __fallthrough in spoe_handle_appctx()
This avoids two build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
8de35935b0 BUILD: acl: use __fallthrough in parse_acl_expr()
This avoids two build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
266ce55109 BUILD: args: use __fallthrough in make_arg_list()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
7de8de0bf8 BUILD: tools: use __fallthrough in url_decode()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
cff89874ea BUILD: hash: use __fallthrough in hash_djb2()
This avoids 5 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
7910825409 BUILD: peers: use __fallthrough in peer_io_handler()
This avoids 7 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
9eb93c031b BUILD: stats: use __fallthrough in stats_dump_proxy_to_buffer()
This avoids 12 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
f3f60763fa BUILD: tcpcheck: use __fallthrough in check_proxy_tcpcheck()
This avoids 1 build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
469847945c BUILD: stream: use __fallthrough in stats_dump_full_strm_to_buffer()
This avoids 1 build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
14de395a30 BUILD: hlua: use __fallthrough in hlua_post_init_state()
This avoids 5 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
a551f4fbfd BUILD: ssl: use __fallthrough in cli_io_handler_tlskeys_files()
This avoids two build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
6fcc86b312 BUILD: ssl: use __fallthrough in cli_io_handler_commit_{cert,cafile_crlfile}()
This avoids 8 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
aef8448b58 BUILD: ssl/crt-list: use __fallthrough in cli_io_handler_add_crtlist()
This avoids 3 build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
5c8b52f80a BUILD: quic: use __fallthrough in quic_connect_server()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
7ed0597ce8 BUILD: sample: use __fallthrough in smp_is_rw() and smp_dup()
This avoids three build warnings when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
eda36f1c23 IMPORT: slz: declare len to fix debug build when optimal match is enabled
Building with -DFIND_OPTIMAL_MATCH would fail on undeclared "len".
This one likely vanished in some cleanup.

This is libslz upstream commit 1ea20360715e1ad0cd81db83fa4361310716b8cc
2022-11-14 11:14:02 +01:00
Willy Tarreau
a8a83bcc80 BUILD: ssl_utils: fix build on gcc versions before 8
Commit 960fb74ca ("MEDIUM: ssl: {ca,crt}-ignore-err can now use error
constant name") provided a very convenient way to initialize only desired
macros. Unfortunately with gcc versions older than 8, it breaks with:

  src/ssl_utils.c:473:12: error: initializer element is not constant

because it seems that the compiler cannot resolve strings to constants
at build time.

This patch takes a different approach, it stores the value of the macro
as a string and this string is converted to integer at boot time. This
way it works everywhere.
2022-11-14 11:12:49 +01:00
William Lallemand
4639689d89 BUG/MINOR: ssl: bind_conf is uncorrectly accessed when using QUIC
Since commit 9b2598 ("BUG/MEDIUM: ssl: Verify error codes can exceed
63"), the ca_ignerr_bitfield and crt_ignerr_bietfield are incorrecly
accessed from __objt_listener(conn->target)->bind_conf which is not
avaiable from QUIC. The bind_conf variable was mistakenly replaced.

This patch fixes the issue by using again the bind_conf variable.

Must be backported where 9b2598 was backported.
2022-11-10 16:56:21 +01:00
Amaury Denoyelle
30fc6da148 MINOR: server: clear prefix on stderr logs after add server
cli_parse_add_server() is the CLI handler for 'add server' command. This
functions uses usermsgs_ctx to retrieve logs messages from internal
ha_alert() calls and display it at the end of the handler.

At the beginning of the handler, stderr prefix is defined to "CLI" via
usermsgs_clr() function. However, this is not resetted at the end. This
causes inconsistency for stderr output :
1. each ha_alert() invocation will reuse "CLI" prefix if 'add server'
   command was executed before, even in non-CLI context
2. usermsgs_ctx is thread local, so this is only true if this runs on
   the same thread as 'add server' handler.

To fix this, ensure that "CLI" prefix is now resetted after
cli_parse_add_server(). This is done thanks to the addition to
cli_umsg()/cli_umsgerr() functions.

This can be backported up to 2.5 if we prefer to ensure output
consistency at the risk of changing stderr behaviors in stable versions.
In this case, the previous commit should be backported before :
  MINOR: cli: define usermsgs print context
2022-11-10 16:42:47 +01:00
Amaury Denoyelle
24e9961a8f MINOR: cli: define usermsgs print context
CLI 'add server' handler relies on usermsgs_ctx to display errors in
internal function on CLI output. This may be also extended to other
handlers.

However, to not clutter stderr from another contextes, usermsgs_ctx must
be resetted when it is not needed anymore. This operation cannot be
conducted in the CLI parse handler as display is conducted after it.

To achieve this, define new CLI states CLI_ST_PRINT_UMSG /
CLI_ST_PRINT_UMSGERR. Their principles is nearly identical to states for
dynamic messages printing.
2022-11-10 16:42:47 +01:00
Amaury Denoyelle
56f50a03b7 CLEANUP: cli: rename dynamic error printing state
Rename CLI_ST_PRINT_FREE to CLI_ST_PRINT_DYNERR.

Most notably, this highlights that this is reserved to error printing.

This is done to ensure consistency between CLI_ST_PRINT/CLI_ST_PRINT_DYN
and CLI_ST_PRINT_ERR/CLI_ST_PRINT_DYNERR. The name is also consistent
with the function cli_dynerr() which activates it.
2022-11-10 16:42:47 +01:00
William Lallemand
9fbc84e571 MINOR: ssl: x509_v_err_str converter transforms an integer to a X509_V_ERR name
The x509_v_err_str converter transforms a numerical X509 verify error
to its constant name.
2022-11-10 13:28:37 +01:00
William Lallemand
960fb74cae MEDIUM: ssl: {ca,crt}-ignore-err can now use error constant name
The ca-ignore-err and crt-ignore-err directives are now able to use the
openssl X509_V_ERR constant names instead of the numerical values.

This allow a configuration to survive an OpenSSL upgrade, because the
numerical ID can change between versions. For example
X509_V_ERR_INVALID_CA was 24 in OpenSSL 1 and is 79 in OpenSSL 3.

The list of errors must be updated when a new major OpenSSL version is
released.
2022-11-10 13:28:37 +01:00
Remi Tricot-Le Breton
9b25982716 BUG/MEDIUM: ssl: Verify error codes can exceed 63
The CRT and CA verify error codes were stored in 6 bits each in the
xprt_st field of the ssl_sock_ctx meaning that only error code up to 63
could be stored. Likewise, the ca-ignore-err and crt-ignore-err options
relied on two unsigned long longs that were used as bitfields for all
the ignored error codes. On the latest OpenSSL1.1.1 and with OpenSSLv3
and newer, verify errors have exceeded this value so these two storages
must be increased. The error codes will now be stored on 7 bits each and
the ignore-err bitfields are replaced by a big enough array and
dedicated bit get and set functions.

It can be backported on all stable branches.

[wla: let it be tested a little while before backport]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
2022-11-10 11:45:48 +01:00
Remi Tricot-Le Breton
aa529f776d BUG/MINOR: ssl: ocsp structure not freed properly in case of error
In case of error, the ocsp item might already be in the ocsp certificate
tree but simply freed instead of destroyed through ssl_sock_free_ocsp.

This patch can be backported to all stable versions.
2022-11-04 11:40:29 +01:00
Remi Tricot-Le Breton
1621dc1cc5 BUG/MINOR: ssl: Memory leak of AUTHORITY_KEYID struct when loading issuer
When calling ssl_get0_issuer_chain, if akid is not NULL but its keyid
is, then the AUTHORITY_KEYID is not freed.

This patch can be backported to all stable branches.
2022-11-04 11:40:29 +01:00
Remi Tricot-Le Breton
a2c21db155 BUG/MINOR: ssl: Memory leak of DH BIGNUM fields
When running HAProxy with OpenSSLv3, the two BIGNUMs used to build our
own DH parameters are not freed. It was not necessary previously because
ownership of those parameters was transferred to OpenSSL through the
DH_set0_pqg call.

This patch should be backported to 2.6.
2022-11-04 11:40:29 +01:00