In github issue #1878, Bart Butler reported observing turn-around states
(1 second pause) after connection retries going to different servers,
while this ought not happen.
In fact it does happen because back_handle_st_cer() enforces the TAR
state for any algo that's not round-robin. This means that even leastconn
has it, as well as hashes after the number of servers changed.
Prior to doing that, the call to stream_choose_redispatch() has already
had a chance to perform the correct choice and to check the algo and
the number of retries left. So instead we should just let that function
deal with the algo when needed (and focus on deterministic ones), and
let the former just obey. Bart confirmed that the fixed version works
as expected (no more delays during retries).
This may be backported to older releases, though it doesn't seem very
important. At least Bart would like to have it in 2.4 so let's go there
for now after it has cooked a few weeks in 2.6.
The maxconn value is decoded using atol(), so values like "3k" are
rightly converter as interger 3, while the user wants 3000.
This patch fixes this behavior by reporting a parsing error.
This patch could be backported on all maintained version, but it
could break some configuration. The bug is really minor, I recommend
to not backport, or backport a patch which only throws a warning in
place of a fatal error.
Idle connections do not work on 32-bit machines due to an alignment issue
causing the connection nodes to be indexed with their lower 32-bits set to
zero and the higher 32 ones containing the 32 lower bitss of the hash. The
cause is the use of ebmb_node with an aligned data, as on this platform
ebmb_node is only 32-bit aligned, leaving a hole before the following hash
which is a uint64_t:
$ pahole -C conn_hash_node ./haproxy
struct conn_hash_node {
struct ebmb_node node; /* 0 20 */
/* XXX 4 bytes hole, try to pack */
int64_t hash; /* 24 8 */
struct connection * conn; /* 32 4 */
/* size: 40, cachelines: 1, members: 3 */
/* sum members: 32, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 40 bytes */
};
Instead, eb64 nodes should be used when it comes to simply storing a
64-bit key, and that is what this patch does.
For backports, a variant consisting in simply marking the "hash" member
with a "packed" attribute on the struct also does the job (tested), and
might be preferable if the fix is difficult to adapt. Only 2.6 and 2.5
are affected by this.
Previous commit 8a6767d26 ("BUG/MINOR: config: don't count trailing spaces
as empty arg (v2)") was still not enough. As reported by ClusterFuzz in
issue 52049 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52049),
there remains a case where for the sake of reporting the correct argument
count, the function may produce virtual args that span beyond the end of
the output buffer if that one is too short. That's what's happening with
a config file of one empty line followed by a large number of args.
This means that what args[] points to cannot be relied on and that a
different approach is needed. Since no output is produced for spaces and
comments, we know that args[arg] continues to point to out+outpos as long
as only comments or spaces are found, which is what we're interested in.
As such it's safe to check the last arg's pointer against the one before
the trailing zero was emitted, in order to decide to count one final arg.
No backport is needed, unless the commit above is backported.
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.
This is the second attempt for this problem, here the explanation :
When called for the first line, no <out> was allocated yet so it's NULL,
letting the caller realloc a larger line if needed. However the words are
parsed and their respective args[arg] are filled with out+position, which
means that while the first arg is NULL, the other ones are no and fail the
test that was meant to avoid dereferencing a NULL. Let's simply check <out>
instead of <args> since the latter is always derived from the former and
cannot be NULL without the former also being.
This may need to be backported to stable versions.
In function hlua_applet_http_send_response(), a pushvalue
is performed with index '0'.
But according to lua doc (https://www.lua.org/manual/5.3/manual.html#4.3):
"Note that 0 is never an acceptable index".
Adding a FIXME comment near to the pushvalue operation
so that this can get some chance to be reviewed later.
No backport needed.
When providing multiple optional arguments with lua-load or
lua-load-per-thread directives, arguments where pushed 1 by 1
to the stack using lua_pushstring() without checking if the stack
could handle it.
This could easily lead to program crash when providing too much
arguments. I can easily reproduce the crash starting from ~50 arguments.
Calling lua_checkstack() before pushing to the stack fixes the crash:
According to lua.org, lua_checkstack() does some housekeeping and
allow the stack to be expanded as long as some memory is available
and the hard limit isn't reached.
When no memory is available to expand the stack or the limit is reached,
lua_checkstacks returns an error: in this case we force hlua_load_state()
to return a meaningfull error instead of crashing.
In practice though, cfgparse complains about too many words
way before such event may occur on a normal system.
TLDR: the ~50 arguments limitation is not an issue anymore.
No backport needed, except if 'MINOR: hlua: Allow argument on
lua-lod(-per-thread) directives' (ae6b568) is backported.
Calling the function with an offset when "offset + len" was superior or equal
to the targeted blk length caused 'v' value to be improperly set.
And because 'v' is directly provided to htx_replace_blk_value(), blk consistency was compromised.
(It seems that blk was overrunning in htx_replace_blk_value() due to
this and header data was overwritten in this case).
This patch adds the missing checks to make the function behave as
expected when offset is set and offset+len is greater or equals to the targeted blk length.
Some comments were added to the function as well.
It may be backported to 2.6 and 2.5
"stdout" and "stderr" are not hashed. In the same spirit, "fd@" and
"sockpair@" prefixes are not hashed too. There is no reason to hash such
address and it may be useful to diagnose bugs.
No backport needed, except if anonymization mechanism is backported.
Add PA_O_DGRAM and PA_O_PORT_OFS options when str2sa_range() is called. This
way dgram sockets and addresses with port offsets are supported.
No backport needed, except if anonymization mechanism is backported.
Correct a commentary in in include/haproxy/global-t.h and include/haproxy/tools.h
Replace the CLI command 'set global-key <key>' by 'set anon global-key <key>' in
order to find it easily when you don't remember it, the recommandation can guide
you when you just tap 'set anon'.
No backport needed, except if anonymization mechanism is backported.
Add an option to dump the number lines of the configuration file when
it's dumped. Other options can be easily added. Options are separated
by ',' when tapping the command line:
'./haproxy -dC[key],line -f [file]'
No backport needed, except if anonymization mechanism is backported.
Add keywords recognized during the dump of the configuration file,
these keywords are followed by sensitive information.
Remove the condition 'localhost' for the second argument of keyword
'server', consider as not essential and can disturb when comparing
it in cli section (there is no exception 'localhost').
No backport needed, except if anonymization mechanism is backported.
Add HA_ANON_CLI to the srv->hostname when using 'show servers state'.
It can contain sensitive information like 'www....com'
No backport needed, except if anonymization mechanism is backported.
Replace HA_ANON_CLI by hash_ipanon to anonynmized address like
anonymizing address in the configuration file.
No backport needed, except if anonymization mechanism is backported.
Add a parameter hasport to return a simple hash or ipstring when
ipstring has no port. Doesn't hash if scramble is null. Add
option PA_O_PORT_RESOLVE to str2sa_range. Add a case UNIX.
Those modification permit to use hash_ipanon in cli section
in order to dump the same anonymization of address in the
configuration file and with CLI.
No backport needed, except if anonymization mechanism is backported.
Removed the error message in 'set anon on|off', it's more user
friendly: users use 'set anon on' even if the mode is already
activated, and the same for 'set anon off'. That allows users
to write the command line in the anonymized mode they want
without errors.
No backport needed, except if anonymization mechanism is backported.
Add an anonymization for an element missed in the first merge
for 'show sess all'.
No backport needed, except if anonymization mechanism is backported.
hlua_http_msg_insert_data() function is called upon
HTTPMessage.insert() method from lua script.
This function did not work properly for multiple reasons:
- An incorrect argument check was performed and prevented the user
from providing optional offset argument.
- Input and output variables were inverted
and offset was not handled properly. The same bug
was also fixed in hlua_http_msg_del_data(), see:
'BUG/MINOR: hlua: fixing hlua_http_msg_del_data behavior'
The function now behaves as described in the documentation.
This could be backported to 2.6 and 2.5.
GH issue #1885 reported that HTTPMessage.remove() did not
work as expected.
It turns out that underlying hlua_http_msg_del_data() function
was not working properly due to input / output inversion as well
as incorrect user offset handling.
This patch fixes it so that the behavior is the one described in
the documentation.
This could be backported to 2.6 and 2.5.
This reverts commit 5529424ef1.
Since this patch, HAProxy crashes when the first line of the configuration
file contains more than one parameter because, on the first call of
parse_line(), the output line is not allocated. Thus elements in the
arguments array may point on invalid memory area.
It may be considered as a bug to reference invalid memory area and should be
fixed. But for now, it is safer to revert this patch
If the reverted commit is backported, this one must be backported too.
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.
This may need to be backported to stable versions.
Fix the size check in ring_make_from_area() which is checking the size
of the pointer instead of the size of the structure.
No backport needed, 2.7 only.
To avoid any UAF when a resolution is released, a mechanism was added to
abort a resolution and delayed the released at the end of the current
execution path. This mechanism depends on an hard assumption: Any reference
on an aborted resolution must be removed. So, when a resolution is aborted,
it is removed from the resolver lists and inserted into a death row list.
However, a resolution may still be referenced in the query_ids tree. It is
the tree containing all resolutions with a pending request. Because aborted
resolutions are released outside the resolvers lock, it is possible to
release a resolution on a side while a query ansswer is received and
processed on another one. Thus, it is still possible to have a UAF because
of this bug.
To fix the issue, when a resolution is aborted, it is removed from any list,
but it is also removed from the query_ids tree.
This patch should solve the issue #1862 and may be related to #1875. It must
be backported as far as 2.2.
If stream_new() fails after the frontend SC is attached, the underlying SE
descriptor is not properly reset. Among other things, SE_FL_ORPHAN flag is
not set again. Because of this error, a BUG_ON() is triggered when the mux
stream on the frontend side is destroyed.
Thus, now, when stream_new() fails, SE_FL_ORPHAN flag is set on the SE
descriptor and its stream-connector is set to NULL.
This patch should solve the issue #1880. It must be backported to 2.6.
The frontend SC is attached before the backend one is allocated. Thus an
allocation error on backend SC must be handled before an error on the
frontend SC.
This patch must be backported to 2.6.
Upon a reload with the master CLI, the FD of the master CLI session is
received by the internal socketpair listener.
This session is used to display the status of the reload and then will
close.
The environment variable HAPROXY_LOAD_SUCCESS stores "1" if it
successfully load the configuration and started, "0" otherwise.
The "_loadstatus" master CLI command displays either
"Loading failure!\n" or "Loading success.\n"
As pointed out by chipitsine in GH #1879, coverity complains
about a sizeof with char ** type where it should be char *.
This was introduced in 'MINOR: hlua: Allow argument on
lua-lod(-per-thread) directives' (ae6b568)
Luckily this had no effect so far because on most platforms
sizeof(char **) == sizeof(char *), but this can not be safely
assumed for portability reasons.
The fix simply changes the argument to sizeof so that it refers to
'*per_thread_load[len]' instead of 'per_thread_load[len]'.
No backport needed.
When using the "reload" command over the master CLI, all connections to
the master CLI were cut, this was unfortunate because it could have been
used to implement a synchronous reload command.
This patch implements an architecture to keep the connection alive after
the reload.
The master CLI is now equipped with a listener which uses a socketpair,
the 2 FDs of this socketpair are stored in the mworker_proc of the
master, which the master keeps via the environment variable.
ipc_fd[1] is used as a listener for the master CLI. During the "reload"
command, the CLI will send the FD of the current session over ipc_fd[0],
then the reload is achieved, so the master won't handle the recv of the
FD. Once reloaded, ipc_fd[1] receives the FD of the session, so the
connection is preserved. Of course it is a new context, so everything
like the "prompt mode" are lost.
Only the FD which performs the reload is kept.
chipitsine reported in github issue #1872 that in function hash_anon and
hash_ipanon, index_hash can be equal to NB_L_HASH_WORD and can reach an
inexisting line table, the table is initialized hash_word[NB_L_HASH_WORD][20];
so hash_word[NB_L_HASH_WORD] doesn't exist.
No backport needed, except if anonymization mechanism is backported.
Because memprintf return an error to the caller and not on screen.
the function which perform display of message on the right output
is in charge of adding \n if it is necessary.
This patch may be backported.
At present option smtpchk closes the TCP connection abruptly on completion of service checking,
even if successful. This can result in a very high volume of errors in backend SMTP server logs.
This patch ensures an SMTP QUIT is sent and a positive 2xx response is received from the SMTP
server prior to disconnection.
This patch depends on the following one:
* MINOR: smtpchk: Update expect rule to fully match replies to EHLO commands
This patch should fix the issue #1812. It may be backported as far as 2.2
with the commit above On the 2.2, proxy_parse_smtpchk_opt() function is
located in src/check.c
[cf: I updated reg-tests script accordingly]
The response to EHLO command is a multiline reply. However the corresponding
expect rule only match on the first line. For now, it is not an issue. But
to be able to send the QUIT command and gracefully close the connection, we
must be sure to consume the full EHLO reply first.
To do so, the regex has been updated to match all 2xx lines at a time.
Tests with forced wakeups on a 24c/48t machine showed that we're caping
at 7.3M loops/s, which means 6.6 microseconds of loop delay without
having anything to do.
This is caused by two factors:
- the load and update of the now_offset variable
- the update of the global_now variable
What is happening is that threads are not running within the one-
microsecond time precision provided by gettimeofday(), so each thread
waking up sees a slightly different date and causes undesired updates
to global_now. But worse, these undesired updates mean that we then
have to adjust the now_offset to match that, and adds significant noise
to this variable, which then needs to be updated upon each call.
By only allowing sightly less precision we can completely eliminate
that contention. Here we're ignoring the 5 lowest bits of the usec
part, meaning that the global_now variable may be off by up to 31 us
(16 on avg). The variable is only used to correct the time drift some
threads might be observing in environments where CPU clocks are not
synchronized, and it's used by freq counters. In both cases we don't
need that level of precision and even one millisecond would be pretty
fine. We're just 30 times better at almost no cost since the global_now
and now_offset variables now only need to be updated 30000 times a
second in the worst case, which is unnoticeable.
After this change, the wakeup rate jumped from 7.3M/s to 66M/s, meaning
that the loop delay went from 6.6us to 0.73us, that's a 9x improvement
when under load! With real tasks we're seeing a boost from 28M to 52M
wakeups/s. The clock_update_global_date() function now only takes
1.6%, it's good enough so that we don't need to go further.
This patch modifies epoll, kqueue and evports (the 3 pollers that support
busy polling) to only update the local date in the inner polling loop,
the global one being done when leaving the loop. Testing with epoll on
a 24c/48t machine showed a boost from 53M to 352M loops/s, indicating
that the loop was spending 85% of its time updating the global date or
causing side effects (which was confirmed with perf top showing 67% in
clock_update_global_date() alone).
Pollers that support busy polling spend a lot of time (and cause
contention) updating the global date when they're looping over themselves
while it serves no purpose: what's needed is only an update on the local
date to know when to stop looping.
This patch splits clock_pudate_date() into a pair of local and global
update functions, so that pollers can be easily improved.
Patrick Hemmer reported an improper log behavior when using
log-format to escape log data (+E option):
Some bytes were truncated from the output:
- escape_string() function now takes an extra parameter that
allow the caller to specify input string stop pointer in
case the input string is not guaranteed to be zero-terminated.
- Minors checks were added into lf_text_len() to make sure dst
string will not overflow.
- lf_text_len() now makes proper use of escape_string() function.
This should be backported as far as 1.8.
The commit 372b38f935 ("BUG/MEDIUM: mux-h1: Handle connection error after a
synchronous send") introduced a bug. In h1_snd_buf(), consumed data are not
properly accounted if a connection error is detected. Indeed, data are
consumed when the output buffer is filled. But, on connection error, we exit
from the loop without incremented total variable accordingly.
When this happens, this leaves the channel buffer in an inconsistent
state. The buffer may be empty with some output at the channel level.
Because an error is reported, it is harmless. But it is safer to fix this
bug now to avoid any regression in future.
This patch must be backported as far as 2.2.
MUX QUIC snd_buf operation whill return early if a qcs instance is
resetted. In this case, HTX is left untouched and the callback returns
the whole bufer size. This lead to an undefined behavior as the stream
layer is notified about a transfer but does not see its HTX buffer
emptied. In the end, the transfer may stall which will lead to a leak on
session.
To fix this, HTX buffer is now resetted when snd_buf is short-circuited.
This should fix the issue as now the stream layer can continue the
transfer until its completion.
This patch has already been tested by Tristan and is reported to solve
the github issue #1801.
This should be backported up to 2.6.
Factorize common code between h3 and hq-interop snd_buf operation. This
is inserted in MUX QUIC snd_buf own callback.
The h3/hq-interop API has been adjusted to directly receive a HTX
message instead of a plain buf. This led to extracting part of MUX QUIC
snd_buf in qmux_http module.
This should be backported up to 2.6.
Extract function dealing with HTX outside of MUX QUIC. For the moment,
only rcv_buf stream operation is concerned.
The main objective is to be able to support both TCP and HTTP proxy mode
with a common base and add specialized modules on top of it.
This should be backported up to 2.6.
QUIC MUX implements several APIs to interface with stream, quic-conn and
app-ops layers. It is planified to better separate this roles, possibly
by using several files.
The first step is to extract QUIC MUX traces in a dedicated source
files. This will allow to reuse traces in multiple files.
The main objective is to be
able to support both TCP and HTTP proxy mode with a common base and add
specialized modules on top of it.
This should be backported up to 2.6.
A qcs instance free may be postponed in stream detach operation if the
stream is not locally closed. This condition is there to achieve
transfering data still present in Tx buffer. Once all data have been
emitted to quic-conn layer, qcs instance can be released.
However, the stream is only closed locally if HTX EOM has been seen or
it has been resetted. In case the transfer finished without EOM, a
detached qcs won't be freed even if there is no more activity on it.
This bug was not reproduced but was found on code analysis. Its precise
impact is unknown but it should not cause any leak as all qcs instances
are freed with its parent qcc connection : this should eventually happen
on MUX timeout or QUIC idle timeout.
To adjust this, condition to mark a stream as locally closed has been
extended. On qcc_streams_sent_done() notification, if its Tx buffer has
been fully transmitted, it will be closed if either FIN STREAM was set
or the stream is detached.
This must be backported up to 2.6.
Some tables are currently used to decode bit blocks and lengths. We do
see such lookups in perf top. We have 4 512-byte tables and one 64-byte
one. Looking closer, the second half of the table (length) has so few
variations that most of the time it will be computed in a single "if",
and never more than 3. This alone allows to cut the tables in half. In
addition, one table (bits 15-11) is only 32-element long, while another
one (bits 11-4) starts at 0x60, so we can merge the two as they do not
overlap, and further save size. We're now down to 4 256-entries tables.
This is visible in h3 and h2 where the max request rate is slightly higher
(e.g. +1.6% for h2). The huff_dec() function got slightly larger but the
overall code size shrunk:
$ nm --size haproxy-before | grep huff_dec
000000000000029e T huff_dec
$ nm --size haproxy-after | grep huff_dec
0000000000000345 T huff_dec
$ size haproxy-before haproxy-after
text data bss dec hex filename
7591126 569268 2761348 10921742 a6a70e haproxy-before
7591082 568180 2761348 10920610 a6a2a2 haproxy-after
The locally defined static variables 'httpclient_srv_raw' and
'httpclient_srv_ssl' are not used anywhere in the source code,
except that they are set in the httpclient_precheck() function.
nb_hreq is a counter on qcc for active HTTP requests. It is incremented
for each qcs where a full HTTP request was received. It is decremented
when the stream is closed locally :
- on HTTP response fully transmitted
- on stream reset
A bug will occur if a stream is resetted without having processed a full
HTTP request. nb_hreq will be decremented whereas it was not
incremented. This will lead to a crash when building with
DEBUG_STRICT=2. If BUG_ON_HOT are not active, nb_hreq counter will wrap
which may break the timeout logic for the connection.
This bug was triggered on haproxy.org. It can be reproduced by
simulating the reception of a STOP_SENDING frame instead of a STREAM one
by patching qc_handle_strm_frm() :
+ if (quic_stream_is_bidi(strm_frm->id))
+ qcc_recv_stop_sending(qc->qcc, strm_frm->id, 0);
+ //ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+ // strm_frm->offset.key, strm_frm->fin,
+ // (char *)strm_frm->data);
To fix this bug, a qcs is now flagged with a new QC_SF_HREQ_RECV. This
is set when the full HTTP request is received. When the stream is closed
locally, nb_hreq will be decremented only if this flag was set.
This must be backported up to 2.6.
This commit adds a new command line option -dC to dump the configuration
file. An optional key may be appended to -dC in order to produce an
anonymized dump using this key. The anonymizing process uses the same
algorithm as the CLI so that the same key will produce the same hashes
for the same identifiers. This way an admin may share an anonymized
extract of a configuration to match against live dumps. Note that key 0
will not anonymize the output. However, in any case, the configuration
is dumped after tokenizing, thus comments are lost.
Modify proxy.c in order to anonymize the following confidential data on
commands 'show servers state' and 'show servers conn':
- proxy name
- server name
- server address
Modify stream.c in order to hash the following confidential data if the
anonymized mode is enabled:
- configuration elements such as frontend/backend/server names
- IP addresses
In order to allow users to dump internal states using a specific key
without changing the global one, we're introducing a key in the CLI's
appctx. This key is preloaded from the global one when "set anon on"
is used (and if none exists, a random one is assigned). And the key
can optionally be assigned manually for the whole CLI session.
A "show anon" command was also added to show the anon state, and the
current key if the users has sufficient permissions. In addition, a
"debug dev hash" command was added to test the feature.
Add a uint32_t key in global to hash words with it. A new CLI command
'set global-key <key>' was added to change the global anonymizing key.
The global may also be set in the configuration using the global
"anonkey" directive. For now this key is not used.
These macros and functions will be used to anonymize strings by producing
a short hash. This will allow to match config elements against dump elements
without revealing the original data. This will later be used to anonymize
configuration parts and CLI commands output. For now only string, identifiers
and addresses are supported, but the model is easily extensible.
Ilya reported in issue #1816 a build warning on armhf (promoted to error
here since -Werror):
src/fd.c: In function fd_rm_from_fd_list:
src/fd.c:209:87: error: passing argument 3 of __ha_cas_dw discards volatile qualifier from pointer target type [-Werror=discarded-array-qualifiers]
209 | unlikely(!_HA_ATOMIC_DWCAS(((long *)&fdtab[fd].update), (uint32_t *)&cur_list.u32, &next_list.u32))
| ^~~~~~~~~~~~~~
This happens only on such an architecture because the DWCAS requires the
pointer not the value, and gcc seems to be needlessly picky about reading
a const from a volatile! This may safely be backported to older versions.
Ed Hein reported in github issue #1856 some occasional watchdog panics
in 2.4.18 showing extreme contention on the proxy's lock while the libc
was in malloc()/free(). One cause of this problem is that we call free()
under the proxy's lock in proxy_capture_error(), which makes no sense
since if we can free the object under the lock after it's been detached,
we can also free it after releasing the lock (since it's not referenced
anymore).
This should be backported to all relevant versions, likely all
supported ones.
This fixes 4 tiny and harmless typos in mux_quic.c, quic_tls.c and
ssl_sock.c. Originally sent via GitHub PR #1843.
Signed-off-by: cui fliter <imcusg@gmail.com>
[Tim: Rephrased the commit message]
[wt: further complete the commit message]
When calling 'add server' with a hostname from the cli (runtime),
str2sa_range() does not resolve hostname because it is purposely
called without PA_O_RESOLVE flag.
This leads to 'srv->addr_node.key' being NULL. According to Willy it
is fine behavior, as long as we handle it properly, and is already
handled like this in srv_set_addr_desc().
This patch fixes GH #1865 by adding an extra check before inserting
'srv->addr_node' into 'be->used_server_addr'. Insertion and removal
will be skipped if 'addr_node.key' is NULL.
It must be backported to 2.6 and 2.5 only.
A stream is considered as remotely closed once we have received all the
data with the FIN bit set.
The condition to close the stream was wrong. In particular, if we
receive an empty STREAM frame with FIN bit set, this would have close
the stream even if we do not have yet received all the data. The
condition is now adjusted to ensure that Rx buffer contains all the data
up to the stream final size.
In most cases, this bug is harmless. However, if compiled with
DEBUG_STRICT=2, a BUG_ON_HOT crash would have been triggered if close is
done too early. This was most notably the case sometimes on interop test
suite with quinn or kwik clients. This can also be artificially
reproduced by simulating reception of an empty STREAM frame with FIN bit
set in qc_handle_strm_frm() :
+ if (strm_frm->fin) {
+ qcc_recv(qc->qcc, strm_frm->id, 0,
+ strm_frm->len, strm_frm->fin,
+ (char *)strm_frm->data);
+ }
ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
strm_frm->offset.key, strm_frm->fin,
(char *)strm_frm->data);
This must be backported up to 2.6.
Small cleanup on snd_buf for application protocol layer.
* do not export h3_snd_buf
* replace stconn by a qcs argument. This is better as h3/hq-interop only
uses the qcs instance.
This should be backported up to 2.6.
The same was performed for the H2 multiplexer. H1C and H1S flags are moved
in a dedicated header file. It will be mainly used to be able to decode
mux-h1 flags from the flags utility.
In this patch, we only move the flags to mux_h1-t.h.
H3 SETTINGS emission has recently been delayed. The idea is to send it
with the first STREAM to reduce sendto syscall invocation. This was
implemented in the following patch :
3dd79d378c
MINOR: h3: Send the h3 settings with others streams (requests)
This patch works fine under nominal conditions. However, it will cause a
crash if a HTTP/3 connection is released before having sent any data,
for example when receiving an invalid first request. In this case,
qc_release will first free qcc.app_ops HTTP/3 application protocol layer
via release callback. Then qc_send is called to emit any closing frames
built by app_ops release invocation. However, in qc_send, as no data has
been sent, it will try to complete application layer protocol
intialization, with a SETTINGS emission for HTTP/3. Thus, qcc.app_ops is
reused, which is invalid as it has been just freed. This will cause a
crash with h3_finalize in the call stack.
This bug can be reproduced artificially by generating incomplete HTTP/3
requests. This will in time trigger http-request timeout without any
data send. This is done by editing qc_handle_strm_frm function.
- ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+ ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len - 1,
strm_frm->offset.key, strm_frm->fin,
(char *)strm_frm->data);
To fix this, application layer closing API has been adjusted to be done
in two-steps. A new shutdown callback is implemented : it is used by the
HTTP/3 layer to generate GOAWAY frame in qc_release prologue.
Application layer context qcc.app_ops is then freed later in qc_release
via the release operation which is now only used to liberate app layer
ressources. This fixes the problem as the intermediary qc_send
invocation will be able to reuse app_ops before it is freed.
This patch fixes the crash, but it would be better to adjust H3 SETTINGS
emission in case of early connection closing : in this case, there is no
need to send it. This should be implemented in a future patch.
This should fix the crash recently experienced by Tristan in github
issue #1801.
This must be backported up to 2.6.
With quicTLS the set_encruption_secrets callback is always called with
the read_secret and the write_secret.
However this is not the case with libreSSL, which uses the
set_read_secret()/set_write_secret() mecanism. It still provides the
set_encryption_secrets() callback, which is called with a NULL
parameter for the write_secret during the read, and for the read_secret
during the write.
The exchange key was not designed in haproxy to be called separately for
read and write, so this patch allow calls with read or write key to
NULL.
httpclient_new_from_proxy() is a variant of httpclient_new() which
allows to create the requests from a different proxy.
The proxy and its 2 servers are now stored in the httpclient structure.
The proxy must have been created with httpclient_create_proxy() to be
used.
The httpclient_postcheck() callback will finish the initialization of
all proxies created with PR_CAP_HTTPCLIENT.
httpclient_create_proxy() is a function which creates a proxy that could
be used for the httpclient. It will allocate a proxy, a raw server and
an ssl server.
This patch moves most of the code from httpclient_precheck() into a
generic function httpclient_create_proxy().
The proxy will have the PR_CAP_HTTPCLIENT capability.
This could be used for specifics httpclient instances that needs
different proxy settings.
The init of tcp sink, particularly for SSL, was done
too early in the code, during parsing, and this can cause
a crash specially if nbthread was not configured.
This was detected by William using ASAN on a new regtest
on log forward.
This patch adds the 'struct proxy' created for a sink
to a list and this list is now submitted to the same init
code than the main proxies list or the log_forward's proxies
list. Doing this, we are assured to use the right init sequence.
It also removes the ini code for ssl from post section parsing.
This patch should be backported as far as v2.2
Note: this fix uses 'goto' labels created by commit
'BUG/MAJOR: log-forward: Fix log-forward proxies not fully initialized'
but this code didn't exist before v2.3 so this patch needs to be
adapted for v2.2.
Originally in 1.8 we wanted to have an independent mux that could possibly
be disabled and would not impose dependencies on the outside. Everything
would fit into a single C file and that was fine.
Nowadays muxes are unavoidable, and not being able to easily inspect them
from outside is sometimes a bit of a pain. In particular, the flags utility
still cannot be used to decode their flags.
As a first step towards this, this patch moves the flags and enums to
mux_h2-t.h, as well as the two state decoding inline functions. It also
dropped the H2_SS_*_BIT defines that nobody uses. The mux_h2.c file remains
the only one to include that for now.
Please refer to GH #1859 for more info.
Coverity suspected improper proxy pointer handling.
Without the fix it is considered safe for the moment, but it might not
be the case in the future as we want to keep the ability to have
isolated listeners.
Making sure stop_listener(), pause_listener(), resume_listener()
and listener_release() functions make proper use
of px pointer in that context.
No need for backport except if multi-connection protocols (ie:FTP)
were to be backported as well.
Since this counter was added, it was incremented at the wrong place for
client streams. It was incremented when the stream-connector (formely the
conn-stream) was created while it should be done when the H1 stream is
created. Thus, on parsing error, on H1>H2 upgrades or TCP>H1 upgrades, the
counter is not incremented. However, it is always decremented when the H1
stream is destroyed.
On bakcned side, there is no issue.
This patch must be backported to 2.6.
As reported by Ilya and Coverity in issue #1858, since recent commit
eea152ee6 ("BUG/MINOR: signals/poller: ensure wakeup from signals")
which removed the test for the global signal flag from the pollers'
loop, the remaining "wake" flag doesn't need to be tested since it
already participates to zeroing the wait_time and will be caught
on the previous line.
Let's just remove that test now.
This patch adresses the issue #1626.
Adding support for PR_FL_PAUSED flag in the function stats_fill_fe_stats().
The command 'show stat' now properly reports a disabled frontend
using "PAUSED" state label.
This patch depends on the following commits:
- 7d00077fd5 "BUG/MEDIUM: proxy: ensure pause_proxy()
and resume_proxy() own PROXY_LOCK".
- 001328873c "MINOR: listener: small API change"
- d46f437de6 "MINOR: proxy/listener: support for additional PAUSED state"
It should be backported to 2.6, 2.5 and 2.4
This patch is a prerequisite for #1626.
Adding PAUSED state to the list of available proxy states.
The flag is set when the proxy is paused at runtime (pause_listener()).
It is cleared when the proxy is resumed (resume_listener()).
It should be backported to 2.6, 2.5 and 2.4
A minor API change was performed in listener(.c/.h) to restore consistency
between stop_listener() and (resume/pause)_listener() functions.
LISTENER_LOCK was never locked prior to calling stop_listener():
lli variable hint is thus not useful anymore.
Added PROXY_LOCK locking in (resume/pause)_listener() functions
with related lpx variable hint (prerequisite for #1626).
It should be backported to 2.6, 2.5 and 2.4
There was a race involving hlua_proxy_* functions
and some proxy management functions.
pause_proxy() and resume_proxy() can be used directly from lua code,
but that could lead to some race as lua code didn't make sure PROXY_LOCK
was owned before calling the proxy functions.
This patch makes sure it won't happen again elsewhere in the code
by locking PROXY_LOCK directly in resume and pause proxy functions
so that it's not the caller's responsibility anymore.
(based on stop_proxy() behavior that was already safe prior to the patch)
This should be backported to stable series.
Note that the API will likely differ < 2.4
Add self-wake in signal_handler() to fix a race condition with a signal
coming in between checking signal_queue_len and entering polling sleep.
The changes in commit 43c891dda ("BUG/MINOR: signals/poller: set the
poller timeout to 0 when there are signals") were insufficient.
Move the signal_queue_len check from the poll implementations to
run_poll_loop() to keep that logic in one place.
The poll loops are terminated either by the parameter wake being set or
wake up due to a write to their poller_wr_pipe by wake_thread() in
signal_handler().
This fixes issue #1841.
Must be backported in every stable version.
This is the ->finalize application callback which prepares the unidirectional STREAM
frames for h3 settings and wakeup the mux I/O handler to send them. As haproxy is
at the same time always waiting for the client request, this makes haproxy
call sendto() to send only about 20 bytes of stream data. Furthermore in case
of heavy loss, this give less chances to short h3 requests to succeed.
Drawback: as at this time the mux sends its streams by their IDs ascending order
the stream 0 is always embedded before the unidirectional stream 3 for h3 settings.
Nevertheless, as these settings may be lost and received after other h3 request
streams, this is permitted by the RFC.
Perhaps there is a better way to do. This will have to be checked with Amaury.
Must be backported to 2.6.
This was due to a missing check in h3_trace() about the first argument
presence (connection) and h3_parse_settings_frm() which calls TRACE_LEAVE()
without any argument. Then this argument was dereferenced.
Must be backported to 2.6
<qc> variable was confused with <qel>. The consequence was that it was
always the same packet number space which was displayed: the first one (or
the Initial packet number space).
Must be backported to 2.6.
It is possible to speed up the handshake completion but only one time
by connection as mentionned in RFC 9002 "6.2.3. Speeding up Handshake Completion".
Add a flag to prevent this process to be run several times
(see https://www.rfc-editor.org/rfc/rfc9002#name-speeding-up-handshake-compl).
Must be backported to 2.6.
When receiving a signal before entering the poller, and without any
activity in the process, the poller will be entered with a timeout
calculated without checking the signals.
Since commit 4f59d3 ("MINOR: time: increase the minimum wakeup interval
to 60s") the issue is much more visible because it could be stuck for
60s.
When in mworker mode, if a worker quits and the SIGCHLD signal deliver
at the right time to the master, this one could be stuck for the time of
the timeout.
This should fix issue #1841
Must be backported in every stable version.
By default we now dump stats between caller and callee, but by
specifying "aggr" on the command line, stats get aggregated by
callee again as it used to be before the feature was available.
It may sometimes be helpful when comparing total call counts,
though that's about all.
The following task/tasklet handlers often appear in "show profiling tasks"
but were not resolved since static:
qc_io_cb, quic_conn_app_io_cb, process_timer,
quic_accept_run, qc_idle_timer_task
This commit simply exports them so they can be resolved now. "process_timer"
which was a bit too generic and renamed to qc_process_timer.
The function appears like this in "show profiling tasks", so let's export
it:
function calls cpu_tot cpu_avg lat_tot lat_avg
main+0x1463f0 92 77.28us 839.0ns 2.018ms 21.93us <- wake_expired_tasks@src/task.c:429 task_drop_running
This removes all the hard-coded 8-bit and 256 entries to use a pair of
macros instead so that we can more easily experiment with larger table
sizes if needed.
Now that ->wake_date is common to tasks and tasklets, we don't need
anymore to carry a duplicate control block to read and update it for
tasks and tasklets. And given that this code was present early in the
if/else fork between tasks and tasklets, taking it out of the block
allows to move the task part into a more visible "else" branch that
also allows to factor the epilogue that resets th_ctx->current and
updates profile_entry->cpu_time, which also used to be duplicated.
Overall, doing just that saved 253 bytes in the function, or ~1/6,
which is not bad considering that it's on a hot path. And the code
got much ore readable.
The memstats code currently defines its own file/function/line number,
type and extra pointer. We don't need to keep them separate and we can
easily replace them all with just a struct ha_caller. Note that the
extra pointer could be converted to a pool ID stored into arg8 or
arg32 and be dropped as well, but this would first require to define
IDs for pools (which we currently do not have).