The "429" status can now be specified on retry-on directives. PR_RE_* flags
were updated to remains sorted.
This patch should fix the issue #2687. It is quite simple so it may safely
be backported to 3.0 if necessary.
Some systems require log formats in the CLF format and that meant that I
could not send my logs for proxies in mode tcp to those servers. This
implements a format that uses log variables that are compatble with TCP
mode frontends and replaces traditional HTTP values in the CLF format
to make them stand out. Instead of logging method and URI like this
"GET /example HTTP/1.1" it will log "TCP " and for a response code I
used "000" so it would be easy to separate from legitimate HTTP
traffic. Now your log servers that require a CLF format can see the
timings for TCP traffic as well as HTTP.
Implement a new set of rules labelled as quic-initial.
These rules as specific to QUIC. They are scheduled to be executed early
on Initial packet parsing, prior a new QUIC connection instantiation.
Contrary to tcp-request connection, this allows to reject traffic
earlier, most notably by avoiding unnecessary QUIC SSL handshake
processing.
A new module quic_rules is created. Its main function
quic_init_exec_rules() is called on Initial packet parsing in function
quic_rx_pkt_retrieve_conn().
For the moment, only "accept" and "dgram-drop" are valid actions. Both
are final. The latter drops silently the Initial packet instead of
allocating a new QUIC connection.
The SPOE was significantly lightened. It is now possible to refactor it to
use a dedicated multiplexer. The first step is to add a SPOP mode for
proxies. The corresponding multiplexer mode is also added.
For now, there is no SPOP multiplexer, so it is only declarative. But at the
end, the SPOP multiplexer will be automatically selected for servers inside
a SPOP backend.
The related issue is #2502.
As shown in GH #2608 and ("BUG/MEDIUM: proxy: fix email-alert invalid
free"), simply calling free_email_alert() from free_proxy() is not the
right thing to do.
In this patch, we reuse proxy->email_alert.set memory space to introduce
proxy->email_alert.flags in order to support 2 flags:
PR_EMAIL_ALERT_SET (to mimic proxy->email_alert.set) and
PR_EMAIL_ALERT_RESOLVED (set once init_email_alert() was called on the
proxy to resolve email_alert.mailer pointer).
Thanks to PR_EMAIL_ALERT_RESOLVED flag, free_email_alert() may now
properly handle the freeing of proxy email_alert settings: if the RESOLVED
flag is set, then it means the .email_alert.mailers.name parsing hint was
replaced by the actual mailers pointer, thus no free should be attempted.
No backport needed: as described in ("BUG/MEDIUM: proxy: fix email-alert
invalid free"), this historical leak is not sensitive as it cannot be
triggered during runtime.. thus given that the fix is not backport-
friendly, it's not worth the trouble.
In fa90a7d3 ("BUG/MINOR: proxy: fix email-alert leak on deinit()"), I
tried to fix email-alert deinit() leak the simple way by leveraging
existing free_email_alert() helper function which was already used for
freeing email alert settings used in a default section.
However, as described in GH #2608, there is a subtelty that makes
free_email_alert() not suitable for use from free_proxy().
Indeed, proxy 'mailers.name' hint shares the same memory space than the
pointer to the corresponding mailers section (once the proxy is resolved,
name hint is replaced by the pointer to the section). However, since both
values share the same space (through union), we have to take care of not
freeing `mailers.name` once init_email_alert() was called on the proxy.
Unfortunately, free_email_alert() isn't protected against that, causing
double free() during deinit when mailers section is referenced from
multiple proxy sections. Since there is no easy fix, and that the leak in
itself isn't a big deal (fa90a7d3 was simply an opportunistic fix rather
than a must-have given that the leak only occurs during deinit and not
during runtime), let's actually revert the fix to restore legacy behavior
and prevent deinit errors.
Thanks to @snetat for having reported the issue on Github as well as
providing relevant infos to pinpoint the bug.
It should be backported everywhere fa90a7d3 was backported.
[ada: for versions prior to 3.0, simply revert the offending commit using
'git revert' as proxy_free_common() first appears in 3.0]
When parsing a logformat expression using parse_logformat_string(), the
caller passes the proxy under which the expression is found as argument.
This information allows the logformat expression API to check if the
expression is compatible with the proxy settings.
Since 7a21c3a ("MAJOR: log: implement proper postparsing for logformat
expressions"), the proxy compatibilty checks are postponed after the proxy
is fully parsed to ensure proxy properties are fully resolved for checks
consistency.
The way it works, is that each time parse_logformat_string() is called for
a given expression and proxy, it schedules the expression for postchecking
by appending the expression to the list of pending expression checks on
the proxy (lf_checks struct). Then, when the proxy is called with the
REGISTER_POST_PROXY_CHECK() hook, it iterates over unchecked expressions
and performs the check, then it removes the expression from its list.
However, I overlooked a special case: if a logformat expression is used
on a proxy that is disabled or a default proxy:
REGISTER_POST_PROXY_CHECK() hook is never called. Because of that, lf
expressions may still point to the proxy after the proxy is freed.
For most logformat expressions, this isn't an issue because they are
stored within the proxy itself, but this isn't the case with
{tcp,http}checks logformat expressions: during deinit() sequence, all
proxies are first cleaned up, and only then shared checks are freed.
Because of that, the below config will trigger UAF since 7a21c3a:
uaf.conf:
listen dummy
bind localhost:2222
backend testback
disabled
mode http
option httpchk
http-check send hdr test "test"
http-check expect status 200
haproxy -f uaf.conf -c:
==152096== Invalid write of size 8
==152096== at 0x21C317: lf_expr_deinit (log.c:3491)
==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:84)
==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:79)
==152096== by 0x2334A3: free_tcpcheck_http_hdrs (tcpcheck.c:98)
==152096== by 0x23365A: free_tcpcheck.part.0 (tcpcheck.c:130)
==152096== by 0x2338B1: free_tcpcheck (tcpcheck.c:108)
==152096== by 0x2338B1: deinit_tcpchecks (tcpcheck.c:3780)
==152096== by 0x2CF9A4: deinit (haproxy.c:2949)
==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052)
==152096== by 0x169BC0: main (haproxy.c:3996)
==152096== Address 0x52a8df8 is 6,968 bytes inside a block of size 7,168 free'd
==152096== at 0x484B27F: free (vg_replace_malloc.c:872)
==152096== by 0x2CF8AD: deinit (haproxy.c:2906)
==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052)
==152096== by 0x169BC0: main (haproxy.c:3996)
To fix the issue, let's ensure in proxy_free_common() that no unchecked
expressions may still point to the proxy after the proxy is freed by
purging the list (DEL_INIT is used to reset list items).
Special thanks to GH user @mhameed who filed a comprehensive issue with
all the relevant information required to reproduce the bug (see GH #2597),
after having first reported the issue on the alpine project bug tracker.
As shown by previous patch series, having to free some common proxy
struct members twice (in free_proxy() and proxy_free_defaults()) is
error-prone: we often overlook one of the two free locations when
adding new features.
To prevent such bugs from being introduced in the future, and also avoid
code duplication, we now have a proxy_free_common() function to free all
proxy struct members that are common to all proxy types (either regular or
default ones).
This should greatly improve code maintenance related to proxy freeing
logic.
proxy header_unique_id wasn't cleaned up in proxy_free_defaults(),
resulting in small memory leak if "unique-id-header" was used on a
default proxy section.
It may be backported to all stable versions.
proxy conn_src.iface_name was only freed in proxy_free_defaults(), whereas
proxy conn_src.bind_hdr_name was only freed in free_proxy().
Because of that, using "source usesrc hdr_ip()" in a default proxy, or
"source interface" in a regular or default proxy would cause memory leaks
during deinit.
It may be backported to all stable versions.
proxy dyncookie_key wasn't cleaned up in free_proxy(), resulting in small
memory leak if "dynamic-cookie-key" was used on a regular or default
proxy.
It may be backported to all stable versions.
proxy check_{command,path} members (used for "external-check" feature)
weren't cleaned up in free_proxy(), resulting in small memory leak if
"external-check command" or "external-check path" were used on a regular
or default proxy.
It may be backported to all stable versions.
proxy email-alert settings weren't cleaned up in free_proxy(), resulting
in small memory leak if "email-alert to" or "email-alert from" were used
on a regular or default proxy.
It may be backported to all stable versions.
proxy log_tag wasn't cleaned up in free_proxy(), resulting in small
memory leak if "log-tag" was used on a regular or default proxy.
It may be backported to all stable versions.
proxy server_id_hdr_name member (used for "http-send-name-header" option)
wasn't cleaned up in free_proxy(), resulting in small memory leak if
"http-send-name-header" was used on a regular or default proxy.
This may be backported to all stable versions.
last_change was a member present in both proxy and server struct. It is
used as an age statistics to report the last update of the object.
Move last_change into fe_counters/be_counters. This is necessary to be
able to manipulate it through generic stat column and report it into
stats-file.
Note that there is a change for proxy structure with now 2 different
last_change values, on frontend and backend side. Special care was taken
to ensure that the value is initialized only on the proxy side. The
other value is set to 0 unless a listen proxy is instantiated. For the
moment, only backend counter is reported in stats. However, with now two
distinct values, stats could be extended to report it on both side.
Move freq-ctr defined in proxy or server structures into their dedicated
fe_counters/be_counters struct.
Functionnaly no change here. This commit will allow to convert rate
stats column to generic one, which is mandatory to manipulate them in
the stats-file.
This commit is part of a serie to align counters usage between
frontends/listeners on one side and backends/servers on the other.
"stot" metric refers to the total number of sessions. On backend side,
it is interpreted as a number of streams. Previously, this was accounted
using <cum_sess> be_counters field for servers, but <cum_conn> instead
for backend proxies.
Adjust this by using <cum_sess> for both proxies and servers. As such,
<cum_conn> field can be removed from be_counters.
Note that several diagnostic messages which reports total frontend and
backend connections were adjusted to use <cum_sess>. However, this is an
outdated and misleading information as it does reports streams count on
backend side. These messages should be fixed in a separate commit.
This should be backported to all stable releases.
This commit is similar with the two previous ones. Its purpose is to add
GUID support on listeners. Due to bind_conf and listeners configuration,
some specifities were required.
Its possible to define several listeners on a single bind line, for
example by specifying multiple addresses. As such, it's impossible to
support a "guid" keyword on a bind line. The problem is exacerbated by
the cloning of listeners when sharding is used.
To resolve this, a new keyword "guid-prefix" is defined for bind lines.
It allows to specify a string which will be used as a prefix for
automatically generated GUID for each listeners attached to a bind_conf.
Automatic GUID listeners generation is implemented via a new function
bind_generate_guid(). It is called on post-parsing, after
bind_complete_thread_setup(). For each listeners on a bind_conf, a new
GUID is generated with bind_conf prefix and the index of the listener
relative to other listeners in the bind_conf. This last value is stored
in a new bind_conf field named <guid_idx>. If a GUID cannot be inserted,
for example due to a non-unique value, an error is returned, startup is
interrupted with configuration rejected.
Implement proxy identiciation through GUID. As such, a guid_node member
is inserted into proxy structure. A proxy keyword "guid" is defined to
allow user to fix its value.
Currently, the way proxy-oriented logformat directives are handled is way
too complicated. Indeed, "log-format", "log-format-error", "log-format-sd"
and "unique-id-format" all rely on preparsing hints stored inside
proxy->conf member struct. Those preparsing hints include the original
string that should be compiled once the proxy parameters are known plus
the config file and line number where the string was found to generate
precise error messages in case of failure during the compiling process
that happens within check_config_validity().
Now that lf_expr API permits to compile a lf_expr struct that was
previously prepared (with original string and config hints), let's
leverage lf_expr_compile() from check_config_validity() and instead
of relying on individual proxy->conf hints for each logformat expression,
store string and config hints in the lf_expr struct directly and use
lf_expr helpers funcs to handle them when relevant (ie: original
logformat string freeing is now done at a central place inside
lf_expr_deinit(), which allows for some simplifications)
Doing so allows us to greatly simplify the preparsing logic for those 4
proxy directives, and to finally save some space in the proxy struct.
Also, since httpclient proxy has its "logformat" automatically compiled
in check_config_validity(), we now use the file hint from the logformat
expression struct to set an explicit name that will be reported in case
of error ("parsing [httpclient:0] : ...") and remove the extraneous check
in httpclient_precheck() (logformat was parsed twice previously..)
This patch tries to address a design flaw with how logformat expressions
are parsed from config. Indeed, some parse_logformat_string() calls are
performed during config parsing when the proxy mode is not yet known.
Here's a config example that illustrates the issue:
defaults
mode tcp
listen test
bind :8888
http-response set-header custom-hdr "%trl" # needs http
mode http
The above config should work, because the effective proxy mode is http,
yet haproxy fails with this error:
[ALERT] (99051) : config : parsing [repro.conf:6] : error detected in proxy 'test' while parsing 'http-response set-header' rule : format tag 'trl' is reserved for HTTP mode.
To fix the issue once and for all, let's implement smart postparsing for
logformat expressions encountered during config parsing:
- split parse_logformat_string() (and subfonctions) in order to create a
new lf_expr_postcheck() function that must be called to finish
preparing and checking the logformat expression once the proxy type is
known.
- save some config hints info during parse_logformat_string() to
generate more precise error messages during lf_expr_postcheck(), if
needed, we rely on curpx->conf.args.{file,line} hints for that because
parse_logformat_string() doesn't know about current file and line
number.
- lf_expr_postcheck() uses PR_FL_CHECKED proxy flag to know if the
function may try to make the proxy compatible with the expression, or
if it should simply fail as soon as an incompatibility is detected.
- if parse_logformat_string() is called from an unchecked proxy, then
schedule the expression for postparsing, else (ie: during runtime),
run the postcheck right away.
This change will also allow for some logformat expression error handling
simplifications in the future.
log format expressions are broadly used within the code: once they are
parsed from input string, they are converted to a linked list of
logformat nodes.
We're starting to face some limitations because we're simply storing the
converted expression as a generic logformat_node list.
The first issue we're facing is that storing logformat expressions that
way doesn't allow us to add metadata alongside the list, which is part
of the prerequites for implementing log-profiles.
Another issue with storing logformat expressions as generic lists of
logformat_node elements is that it's starting to become really hard to
tell when we rely on logformat expressions or not in the code given that
there isn't always a comment near the list declaration or manipulation
to indicate that it's relying on logformat expressions under the hood,
so this adds some complexity for code maintenance.
This patch looks quite impressive due to changes in a lot of header and
source files (since logformat expressions are broadly used), but it does
a simple thing: it defines the lf_expr structure which itself holds a
generic list of logformat nodes, and then declares some helpers to
manipulate lf_expr elements and fixes the code so that we now exclusively
manipulate logformat_node lists as lf_expr elements outside of log.c.
For now, lf_expr struct only contains the list of logformat nodes (no
additional metadata), but now that we have dedicated type and helpers,
doing so in the future won't be problematic at all and won't require
extensive code changes.
When support for dynamic names was added for use_backend rules in
702d44f2f ("MEDIUM: proxy: support use_backend with dynamic names"), the
sample expression resulting from parse_logformat_string() was only freed
for non dynamic rules (when the expression resolved to a simple string
node). But for complex expressions (ie: multiple nodes), rule->dynamic
was set but the expression was never released, resulting in a small
memory leak when freeing the parent proxy.
To fix the issue, in free_proxy(), we free the switching rule expression
if the switching rule is dynamic.
This should be backported to every stable versions.
[ada: prior to 2.9, free_logformat_list() helper did not exist: we may
use the same manual sample expr freeing logic as in server_rules pruning
right above it]
log load-balancing implementation was not seamlessly integrated within
lbprm API. The consequence is that it could become harder to maintain
over time since it added some specific cases just for the log backend.
Moreover, it resulted in some code duplication since balance algorithms
that are common to logs and regular (tcp, http) backends were specifically
rewritten for log backends.
Thanks to the previous commit, we now have all the prerequisites to make
log load-balancing fully leverage lbprm logic. Thus in this patch we make
__do_send_log_backend() use existing lbprm algorithms, and we no longer
require log-specific lbprm initialization in cfgparse.c and in
postcheck_log_backend().
As a bonus, for log backends this allows weighed algorithms to properly
support weights (ie: roundrobin, random and log-hash) since we now
leverage the same lb algorithms that we use for tcp/http backends
(doc was updated).
It is the third applet to be refactored to use its own buffers. In addition to
the CLI applet, some I/O handlers of CLI commands were also updated, especially
the stats ones.
Some command I/O handlers were updated to use applet's buffers instead of
channels ones.
The main CLI I/O handle is responsible to interrupt the processing on
shutdown/abort. It is not the responsibility of the I/O handler of CLI
commands to take care of it.
Contrary to static servers, dynamic servers does not initialize their
settings from a default server instance. As such, _srv_parse_init() was
responsible to set a set of minimal values to have a correct behavior.
However, some settings were not properly initialized. This caused
dynamic servers to not behave as static ones without explicit
parameters.
Currently, the main issue detected is connection reuse which was
completely impossible. This is due to incorrect pool_purge_delay and
max_reuse settings incompatible with srv_add_to_idle_list().
To fix the connection reuse, but also more generally to ensure dynamic
servers are aligned with other server instances, define a new function
srv_settings_init(). This is used to set initial values for both default
servers and dynamic servers. For static servers, srv_settings_cpy() is
kept instead, using their default server as reference.
This patch could have unexpected effects on dynamic servers behavior as
it restored proper initial settings. Previously, they were set to 0 via
calloc() invocation from new_server().
This should be backported up to 2.6, after a brief period of
observation.
In several places in the source, there was the same block of code that was
used to deinitialize the log buffer. There were even two functions that
did this, but they were called only from the code that is in the same
source file (free_tcpcheck_fmt() in src/tcpcheck.c and free_logformat_list()
in src/proxy.c - they were both static functions).
The function free_logformat_list() was moved from the file src/proxy.c to
src/log.c, and a check of the list before freeing the memory was added to
that function.
Remove some QUIC definitions of members from server structure as the haproxy QUIC
stack does not support at all the server part (QUIC client) as this time.
Remove the statements in relation with their initializations.
This patch should be backported as far as 2.6 to save memory.
4e5e2664 ("MINOR: proxy: add findserver_unique_id() and findserver_unique_name()")
added findserver_unique_id() and findserver_unique_name() functions that
were inspired from the historical findserver() function, so unfortunately
they don't perform well when used on large backend farms because they scan
the whole server list linearly.
I was about to provide a patch to optimize such functions when I stumbled
on Baptiste's work:
19a106d24 ("MINOR: server: server_find functions: id, name, best_match")
It turns out Baptiste already implemented helper functions to supersed
the unoptimized findserver() function (at least at runtime when servers
have been assigned their final IDs and inserted in the lookup trees): they
offer more matching options and rely on eb lookups so they are much more
suitable for fast queries. I don't know how I missed that, but they are a
perfect base for the server rid matching functions.
So in this patch, we essentially revert 4e5e2664 to provide the optimized
equivalent functions named server_find_by_id_unique() and
server_find_by_name_unique(), then we force existing findserver_unique_*()
callers to switch to the new functions.
This patch depends on:
- "OPTIM: server: eb lookup for server_find_by_name()"
This could be backported up to 2.8.
Currently, we have a check in proxy_cfg_ensure_no_http() that generates a
warning if the monitor-uri is configured on a proxy that doesn't have
mode HTTP enabled.
However, when we give a look at monitor-uri implementation, it's not
100% correct. Indeed, despite the warning message, the directive will
still be evaluated when HTTP upgrade occurs from a TCP frontend.
Thus the error is misleading.
To make the error message comply with the actual behavior, the check was
moved alongside other checks that accept both native HTTP mode or HTTP
upgrades in cfgparse.c.
Take the px->server_rules freeing part out of free_proxy() and make it
a dedicated helper function so that it becomes possible to use it from
anywhere.
There are multiple places inside free_proxy() where we need to perform
the exact same operation: freeing a logformat list which includes freeing
every member.
To prevent code duplication, we add the free_logformat_list() function
that takes such list as parameter and does all the freeing job on its
own.
Previous commit renames 'proto_reverse_connect' module to 'proto_rhttp'.
This commits follows this by replacing various custom prefix by 'rhttp_'
to make the code uniform.
Note that 'reverse_' prefix was kept in connection module. This is
because if a new reversable protocol not based on HTTP is implemented,
it may be necessary to reused the same connection function which are
protocol agnostic.
When a default-server directive is used in a defaults section, it's never
freed and the "defaults" proxy gets reset without freeing the fields from
that default-server. Normally there are no allocation there, except for
the config file location stored in srv->conf.file form an strdup() since
commit 9394a9444 ("REORG: server: move alert traces in parse_server")
that appeared in 2.4. In addition, if a "default-server" directive
appears multiple times in a defaults section, one more entry will be
leaked per call.
This commit addresses this by checking that we don't overwrite the file
upon multiple calls, and by clearing it when resetting the default proxy.
This should be backported to 2.4.
add proxy_cfg_ensure_no_log() function (similar to
proxy_cfg_ensure_no_http()) to ensure at the end of proxy parsing that
no log exclusive options are found if the proxy is not in log mode.
In 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as proxies.")
we forgot to free the table pointer which is now dynamically allocated.
Let's take this opportunity to also fix a missing free in the table itself
(the table expire task wasn't properly destroyed)
This patch depends on:
- "MINOR: stktable: add sktable_deinit function"
It should be backported in every stable versions.
Add a new timeout for the handshake, on the frontend side only. Such a hanshake
will be typically used for TLS hanshakes during client connections to TLS/TCP or
QUIC frontends.
The proxy's initialization is rather odd. First, init_new_proxy() is
called to zero all the lists and certain values, except those that can
come from defaults, which are initialized by proxy_preset_defaults().
The default server settings are also only set there.
This results in these settings not to be set for a number of internal
proxies that do not explicitly call proxy_preset_defaults() after
allocation, such as sink and log forwarders.
This was revealed by last commit 79aa63823 ("MINOR: server: always
initialize pp_tlvs for default servers") which crashes in log parsers
when applied to certain proxies which did not initialize their default
servers.
In theory this should be backported, however it would be desirable to
wait a bit before backporting it, in case certain parts would rely on
these elements not being initialized.
In commit 6f4bfed3a ("MINOR: server: Add parser support for
set-proxy-v2-tlv-fmt") a suspicious check for a NULL srv_tlv was placed
in the list_for_each_entry(), that should not be needed. In practice,
it's caused by the list head not being initialized, hence the first
element is NULL, as shown by Alexander's reproducer below which crashes
if the test in the loop is removed:
backend dummy
default-server send-proxy-v2 set-proxy-v2-tlv-fmt(0xE1) %[fc_pp_tlv(0xE1)]
server dummy_server 127.0.0.1:2319
The right place to initialize this field is proxy_preset_defaults().
We'd really need a function to initialize a server :-/
The check in the loop was removed. No backport is needed.
Simplify stick and store sticktable proxy rules postparsing by adding
a sticking rule entry resolve (postparsing) function.
This will ease code maintenance.
Using "mode log" in a backend section turns the proxy in a log backend
which can be used to log-balance logs between multiple log targets
(udp or tcp servers)
log backends can be used as regular log targets using the log directive
with "backend@be_name" prefix, like so:
| log backend@mybackend local0
A log backend will distribute log messages to servers according to the
log load-balancing algorithm that can be set using the "log-balance"
option from the log backend section. For now, only the roundrobin
algorithm is supported and set by default.
When 'log' directive was implemented, the internal representation was
named 'struct logsrv', because the 'log' directive would directly point
to the log target, which used to be a (UDP) log server exclusively at
that time, hence the name.
But things have become more complex, since today 'log' directive can point
to ring targets (implicit, or named) for example.
Indeed, a 'log' directive does no longer reference the "final" server to
which the log will be sent, but instead it describes which log API and
parameters to use for transporting the log messages to the proper log
destination.
So now the term 'logsrv' is rather confusing and prevents us from
introducing a new level of abstraction because they would be mixed
with logsrv.
So in order to better designate this 'log' directive, and make it more
generic, we chose the word 'logger' which now replaces logsrv everywhere
it was used in the code (including related comments).
This is internal rewording, so no functional change should be expected
on user-side.