Add postparsing checks to control server line conformity regarding QUIC
both on the server address and the MUX protocol. An error is reported in
the following case :
* proto quic is explicitely specified but server address does not
specify quic4/quic6 prefix
* another proto is explicitely specified but server address uses
quic4/quic6 prefix
There was already a check for this but there used to be an exception
that allowed duplicate server names only in case where their IDs were
explicit and different. This has been emitting a warning since 3.1 and
planned for removal in 3.3, so let's do it now. The doc was updated,
though it never mentioned this unicity constraint, so that was added.
Only the check for the exception was removed, the rest of the code
that is currently made to deal with duplicate server names was not
cleaned yet (e.g. the tree doesn't need to support dups anymore, and
this could be done at insertion time). This may be a subject for future
cleanups.
Force QUIC as <mux_proto> for server if a QUIC address is used. This is
similarly to what is already done for bind instances on the frontend
side. This step ensures that conn_create_mux() will select the proper
protocol.
This XPRT callback is called from check_config_validity() after the configuration
has been parsed to initialize all the SSL server contexts.
This patch implements the same thing for the QUIC servers.
If an empty argument is used in configuration, for example due to an
undefined environment variable, the rest of the line is not parsed. As
such, a warning is emitted to report this.
The warning was not totally correct as it reported the wrong argument
index. Fix this by this patch. Note that there is still an issue with
the "^" indicator, but this is not as easy to fix yet.
This is related to github issue #2995.
This should be backported up to 3.2.
Hide warning about empty argument outside of discovery mode. This is
necessary, else the message will be displayed twice, which hampers
haproxy output lisibility.
This should fix github isue #2995.
This should be backported up to 3.2.
proxies, listeners and server shared counters are now managed via helpers
added in one of the previous commits.
When guid is not set (ie: when not yet assigned), shared counters pointer
is allocated using calloc() (local memory) and a flag is set on the shared
counters struct to know how to manipulate (and free it). Else if guid is
set, then it means that the counters may be shared so while for now we
don't actually use a shared memory location the API is ready for that.
The way it works, for proxies and servers (for which guid is not known
during creation), we first call counters_{fe,be}_shared_get with guid not
set, which results in local pointer being retrieved (as if we just
manually called calloc() to retrieve a pointer). Later (during postparsing)
if guid is set we try to upgrade the pointer from local to shared.
Lastly, since the memory location for some objects (proxies and servers
counters) may change from creation to postparsing, let's update
counters->last_change member directly under counters_{fe,be}_shared_get()
so we don't miss it.
No change of behavior is expected, this is only preparation work.
Shareable counters are not tagged as shared counters and are dynamically
allocated in separate memory area as a prerequisite for being stored
in shared memory area. For now, GUID and threads groups are not taken into
account, this is only a first step.
also we ensure all counters are now manipulated using atomic operations,
namely, "last_change" counter is now read from and written to using atomic
ops.
Despite the numerous changes caused by the counters being moved away from
counters struct, no change of behavior should be expected.
rename _srv_postparse() internal function to srv_init() function and group
srv_init_per_thr() plus idle conns list init inside it. This way we can
perform some simplifications as srv_init() performs multiple server
init steps after parsing.
SRV_F_CHECKED flag was added, it is automatically set when srv_init()
runs successfully. If the flag is already set and srv_init() is called
again, nothing is done. This permis to manually call srv_init() earlier
than the default POST_CHECK hook when needed without risking to do things
twice.
OSS Fuzz found that the previous fix ebb19fb367 ("BUG/MINOR: cfgparse:
consider the special case of empty arg caused by \x00") was incomplete,
as the output can sometimes be larger than the input (due to variables
expansion) in which case the work around to try to report a bad arg will
fail. While the parse_line() function has been made more robust now in
order to avoid this condition, let's fix the handling of this special
case anyway by just pointing to the beginning of the line if the supposed
error location is out of the line's buffer.
All details here:
https://oss-fuzz.com/testcase-detail/5202563081502720
No backport is needed unless the fix above is backported.
The reporting of the empty arg location added with commit 08d3caf30
("MINOR: cfgparse: visually show the input line on empty args") falls
victim of a special case detected by OSS Fuzz:
https://issues.oss-fuzz.com/issues/415850462
In short, making an argument start with "\x00" doesn't make it empty for
the parser, but still emits an empty string which is detected and
displayed. Unfortunately in this case the error pointer is not set so
the sanitization function crashes.
What we're doing in this case is that we fall back to the position of
the output argument as an estimate of where it was located in the input.
It's clearly inexact (quoting etc) but will still help the user locate
the problem.
No backport is needed unless the commit above is backported.
Now when an empty arg is found on a line, we emit the sanitized
input line and the position of the first empty arg so as to help
the user figure the cause (likely an empty environment variable).
Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>
For historical reasons, the config parser relies on the trailing '\0'
to detect the end of the line being parsed. When the lines started to be
tokenized into arguments, this principle has been preserved, and now all
the parsers rely on *args[arg]='\0' to detect the end of a line. But as
reported in issue #2944, while most of the time it breaks the parsing
like below:
http-request deny if { path_dir '' }
it can also cause some elements to be silently ignored like below:
acl bad_path path_sub '%2E' '' '%2F'
This may also subtly happen with environment variables that don't exist
or which are empty:
acl bad_path path_sub '%2E' "$BAD_PATTERN" '%2F'
Fortunately, parse_line() returns the number of arguments found, so it's
easy from the callers to verify if any was empty. The goal of this commit
is not to perform sensitive changes, it's only to mention when parsing a
line that an empty argument was found and alert about its consequences
using a warning. Most of the time when this happens, the config does not
parse. But for examples as the ACLs above, there could be consequences
that are better detected early.
This patch depends on this previous fix:
BUG/MINOR: tools: do not create an empty arg from trailing spaces
Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>
In check_config_validity(), we initialize the proxy in several stages.
We do so for the sink list for stage1, but not for stage2. It may not be
needed right now, but it may become needed in the future, so do it
anyway.
Move the call to initialize the proxy's per-thread structure earlier
than currently done, so that they are usable when we're initializing the
load balancers.
Move the code responsible for calling per-thread server initialization
earlier than it was done, so that per-thread structures are available a
bit later, when we initialize load-balancing.
In this patch we try to use the proxy API init functions as much as
possible to avoid code redundancy and prevent proxy initialization
errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy()
instead of manually allocating the proxy pointer and performing the
base init ourselves.
When first pre-parsing the config to detect the presence or absence of
the master mode, we must not emit messages because they are not supposed
to be visible at this point, otherwise they appear twice each. The
pre-parsing, also called discovery mode, is only for internal use,
thus it should remain silent.
This should be backported to 3.1 where this mode was introduced.
For now the function refrains from detecting the CPU topology when a
restrictive taskset or cpu-map was already performed on the process,
and it's documented as such, the reason being that until we're able
to automatically create groups, better not change user settings. But
we'll need to be able to detect bound CPUs and to process them as
desired by the user, so we now need to move that detection into the
function itself. It changes nothing to the logic, just gives more
freedom to the function.
The function is not convenient because it doesn't allow us to undo the
startup changes, and depending on where it's being used, we don't know
whether the values read have already been altered (this is not the case
right now but it's going to evolve).
Let's just compute the status during cpu_detect_usable() and set a
variable accordingly. This way we'll always read the init value, and
if needed we can even afford to reset it. Also, placing it in cpu_topo.c
limits cross-file dependencies (e.g. threads without affinity etc).
The cpuset files are normally used only for cpu manipulations. It happens
that the initial CPU binding detection was initially placed there since
there was no better place, but in practice, being OS-specific, it should
really be in cpu-topo. This simplifies cpuset which doesn't need to know
about the OS anymore.
Invalid (incomplete) "server" or "peer" lines under peers section are now
properly ignored. For completeness, in this patch we add some reports so
that the user knows that incomplete lines were ignored.
For an incomplete server line, since it is tolerated (see GH #565), we
only emit a diag warning.
For an incomplete peer line, we report a real warning, as it is not
expected to have a peer line without an address:port specified.
Also, 'newpeer == curpeers->local' check could be simplified since
we already have the 'local_peer' variable which tells us that the
parsed line refers to a local peer.
In 8ba10fea6 ("BUG/MINOR: peers: Incomplete peers sections should be
validated."), some checks were relaxed in parse_server(), and extra logic
was added in the peers section parser in an attempt to properly ignore
incomplete "server" or "peer" statement under peers section.
This was done in response to GH #565, the main intent was that haproxy
should already complain about incomplete peers section (ie: missing
localpeer).
However, 8ba10fea69 explicitly skipped the peer cleanup upon missing
srv association for local peers. This is wrong because later haproxy
code always assumes that peer->srv is valid. Indeed, we got reports
that the (invalid) config below would cause segmentation fault on
all stable versions:
global
localpeer 01JM0TEPAREK01FQQ439DDZXD8
peers my-table
peer 01JM0TEPAREK01FQQ439DDZXD8
listen dummy
bind localhost:8080
To fix the issue, instead of by-passing some cleanup for the local
peer, handle this case specifically by doing the regular peer cleanup
and reset some fields set on the curpeers and curpeers proxy because
of the invalid local peer (do as if the peer was not declared).
It should still comply with requirements from #565.
This patch should be backported to all stable versions.
In the "peers" section parser, right after parse_server() is called, we
used to check whether the curpeers->peers_fe->srv pointer was set or not
to know if parse_server() successfuly added a server to the peers proxy,
server that we can then associate to the new peer.
However the check is wrong, as curpeers->peers_fe->srv points to the
last added server, if a server was successfully added before the
failing one, we cannot detect that the last parse_server() didn't
add a server. This is known to cause bug with bad "peer"/"server"
statements.
To fix the issue, we save a pointer on the last known
curpeers->peers_fe->srv before parse_server() is called, and we then
compare the save with the pointer after parse_server(), if the value
didn't change, then parse_server() didn't add a server. This makes
the check consistent in all situations.
It should be backported to all stable versions.
In check_config_validity() we evaluate some sample fetch expressions
(log-format, server rules, etc). These expressions may use external files like
maps.
If some particular 'default-path' was set in the global section before, it's no
longer applied to resolve file pathes in check_config_validity(). parse_cfg()
at the end of config parsing switches back to the initial cwd.
This fixes the issue #2886.
This patch should be backported in all stable versions since 2.4.0, including
2.4.0.
When "peers" keyword is followed by more than one argument and it's the first
"peers" section in the config, cfg_parse_peers() detects it and exits with
"ERR_ALERT|ERR_FATAL" err_code.
So, upper layer parser, parse_cfg(), continues and parses the next keyword
"peer" and then he tries to check the global cfg_peers, which should contain
"my_cluster". The global cfg_peers is still NULL, because after alerting a user
in alertif_too_many_args, cfg_parse_peers() exited.
peers my_cluster __some_wrong_data__
peer haproxy1 1.1.1.1 1000
In order to fix this, let's add ERR_ABORT, if "peers" keyword is followed by
more than one argument. Like this parse_cfg() will stops immediately and
terminates haproxy with "too many args for peers my_cluster..." alert message.
It's more reliable, than add checks "if (cfg_peers !=NULL)" in "peer"
subparser, as we may have many "peers" sections.
peers my_another_cluster
peer haproxy1 1.1.1.2 1000
peers my_cluster __some_wrong_data__
peer haproxy1 1.1.1.1 1000
In addition, for the example above, parse_cfg() will parse all configuration
until the end and only then terminates haproxy with the alert
"too many args...". Peer haproxy1 will be wrongly associated with
my_another_cluster.
This fixes the issue #2872.
This should be backported in all stable versions.
Since we are now iterating on post_section_parser() for a same keyword,
we need to exit at the first ERR_ABORT.
The post_section_parser() is called when parsing a new section, but also
at the end of the file to be called for the last section.
The changes in 4de86bb ("MEDIUM: initcall: allow to register mutiple
post_section_parser per section") should have added tests on the
ERR_ABORT value.
Also pcs->post_section_parser() must be called instead of
cs->post_section_parser() because we could have a NULL ptr.
This bug does not affect anything since we don't use
REGISTER_CONFIG_POST_SECTION() yet.
Before this patch, REGISTER_CONFIG_SECTION() allowed to register one and only
one callback (<post>) called after the parsing of a section.
It was limitating because you couldn't register a post callback from anywhere
else in the code.
This patch introduces the new REGISTER_CONFIG_SECTION_POST() macros which allows
to register a new post callback for a section keyword from anywhere.
This patch introduces the feature by allowing `struct cfg_section` entries that
does not have a `section_parser`, and then iterating on all cfg_section with a
post_section_parser for a keyword.
Previous patch 2c270a05f ("BUG/MINOR: mworker: section ignored in
discovery after a post_section_parser") needs an adjustment for the last
section of the file.
Indeed the post_section_parser of the last section must not be called in
discovery mode.
Must be backported in 3.1.
When a new section is discovered, the post_section_parser of the
previous section is called. However in the new master-worker mode the
discovery mode will skip the post_section_parser. But instead of
trying to parse the current section keyword after that, it would skip
completely the current line.
This is a minor bug since there isn't a lot of section with
post_section_parser, and not a lot of section to parse in discovery
mode.
But this could be reproduced like this:
global
expose-deprecated-directives
resolvers res
parse-resolv-conf
program foo
command sleep 10
program bar
command sleep 10
Ths 'resolvers' section has a post_section_parser which will be ignored
in discovery mode with the consequence of ignoring the first program
section.
This must be backported in 3.1.
When a group is defined in a userlist section, only one 'users' option is
expected. But it was not tested. Thus it was possible to set several options
leading to a memory leak.
It is now tested, and it is not allowed to redefine the users option.
It was reported by Coverity in #2841: CID 1587771.
This patch could be backported to all stable versions.
A new global option was recently introduced to disable pacing. However,
the value used (1<<31) caused issue with some compiler as options field
used for storage is declared as int. Move pacing deactivation flag
outside into the newly defined quic_tune to fix this.
This should be backported up to 3.1 after a period of observation. Note
that it relied on the previous patch which defined new quic_tune type.
Pacing support was previously activated on each bind line individually,
via an optional argument of quic-cc-algo keyword. Remove this optional
argument and introduce a global setting to enable/disable pacing. Pacing
activation is still flagged as experimental.
One important change is that previously BBR usage automatically
activated pacing support. This is not the case anymore, so users should
now always explicitely activate pacing if BBR is selected. A new warning
message will be displayed if this is not the case.
Another consequence of this change is that now pacing_inter callback is
always defined for every quic_cc_algo types. As such, QUIC MUX uses
global.tune.options to determine if pacing is required.
This should be backported up to 3.1, after a period of observation.
Add a per-thread group field to struct proxy, that will contain a struct
queue, as well as a new field, "queueslength".
This is currently unused, so should change nothing.
Please note that proxy_init_per_thr() must now be called for each proxy
once the thread groups number is known.
proxy auth_uri struct was manually cleaned up during deinit, but the logic
behind was kind of akward because it was required to find out which ones
were shared or not. Instead, let's switch to a proper refcount mechanism
and free the auth_uri struct directly in proxy_free_common().
Each server is inserted in a global list named servers_list on
new_server(). This list is then only used to finalize servers
initialization after parsing.
On dynamic server creation, there is no issue as new_server() is under
thread isolation. However, when a server is deleted after its refcount
reached zero, srv_drop() removes it from servers_list without lock
protection. In the longterm, this can cause list corruption and crashes,
especially if multiple adjacent servers are removed in parallel.
To fix this, convert servers_list to a mt_list. This should not impact
performance as servers_list is not used during runtime outside of server
creation/deletion.
This should fix github issue #2733. Thanks to Chris Staite who first
found the issue here.
This must be backported up to 2.6.
This patch is a part of series to reintroduce the program support in the new
master-worker architecture.
Programs are launched by master, thus only the master process needs its
configuration. Therefore, program section parser should be called only in
discovery mode, when master parses its configuration.
Program section has a post section parser. It should be called only in
discovery mode as well.
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.
So, in discovery mode, when we read the configuration the first time, we
parse for the moment only the "global" section. Unknown section names will be
ignored.
It is no longer supported to declare debug traces, via 'trace' directive, in
a global section. A 'traces' directive must be used instead. The syntax of
the 'trace' directive in these sections remains the same. But it is no
longer experimental.
The main reason for this change is to avoid to have a ring section defined
before a global one. Indeed, for now, forward declarations of ring sections
are not supported. So to configure traces, you had to add a ring section
before the global one defining the traces. Most of time, that meant to have
two global sections :
global
[...] # global settings
ring <name>
[...]
global
[...] # trace config
In addition, it will be possible to easily extend the traces section by
adding some new directives.
As discussed below, there are too many problems and limitations caused
by still supporting duplicate server names. That's already particularly
complicated and dissuasive to use since it requires these servers to
have explicit IDs to be accept. Let's now warn on any duplicate, even
with explicit IDs and remind that this will become forbidden in 3.3.
Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html
Surprisingly, the duplicate server name detection has never made use
of the names tree, so lookups were still in O(N^2). It took 1 second
to validate 50k servers spread into 25 backends at 2k per backend.
By simply using the tree (and since the current server already is in
the tree), we just have to walk using ebpt_prev_dup to visit previous
servers with the same name. We can then detect which ones conflict
without having an ID set and error. The config check time is now 1/4
of the previous one for 2k servers per backend, and more importantly
it will make it simpler to check for any duplicates later.
Proxy file names are assigned a bit everywhere (resolvers, peers,
cli, logs, proxy). All these elements were enumerated and now use
copy_file_name(). The only ha_free() call was turned to drop_file_name().
As a bonus side effect, a 300k backend config saved 14 MB of RAM.
Add a new parameter "alt" that will store wether this configuration
use an alternate protocol.
This alt pointer will contain a value that can be transparently
passed to protocol_lookup to obtain an appropriate protocol structure.
This change is needed to allow for example the servers to know if it
need to use an alternate protocol or not.
load_cfg_in_mem() can continuously reallocate memory in order to load an
extremely large input from /dev/stdin, until it fails with ENOMEM, which means
that process has consumed all available RAM. In case of containers and
virtualized environments it's not very good.
So, in order to prevent this, let's introduce MAX_CFG_SIZE as 10MB, which will
limit the size of input supplied via /dev/stdin.
This commit fixes potential null ptr dereferences reported by coverity, see
more details about it in the issues #2676 and #2668.
'outline' ptr, which is initialized to NULL explicitly as a temporary buffer to
store split keywords may be in theory implicitly dereferenced in some corner
cases (which we haven't encountered yet with real world configurations) in
'if (!**args)'. parse_line() code, called before under some conditions
assigns: args[arg] = outline + outpos and outpos initial value is 0.