When an htx DATA block is partially transfer, we must take care to remove
exactly the copied size. To do so, we must save the size of the last block
value copied and not rely on the last data block after the copy. Indeed,
data can be merged with an existing DATA block, so the last block size can
be larger than the last part copied.
Because of this issue, it is possible to remove more data than
expected. Worse, this could lead to a crash by performing an integer
overflow on the block size.
No backport needed.
Fix a leak of the task object in acme_start_task() when one of the
condition in the function failed.
Fix issue #3308.
Must be backported to 3.2 and later.
There are two settings to control idle connection sharing across
threads.
tune.idle-pool.shared, that enables or disables it, and then
tune.takeover-other-tg-connections, which lets you or not get idle
connections from other thread groups.
Add a new keyword for tune.idle-pool.shared, "full", that lets you get
connections from other thread groups (equivalent to "full" keyword for
tune.takeover-other-tg-connections). The "on" keyword now will be
equivalent to the "restrict" one, which allowed getting connection from
other thread groups only when not doing it would result in a connection
failure (when reverse-http or when strict-macxonn are used).
tune.takeover-other-tg-connections will be deprecated.
If server returns an auth with status valid it seems that client
needs to always skip it, CA can recycle authorizations, without
this change haproxy fails to obtain certificates in that case.
It is also something that is explicitly allowed and stated
in the dns-persist-01 draft RFC.
Note that it would be better to change how haproxy does status polling,
and implements the state machine, but that will take some thought
and time, this patch is a quick fix of the problem.
See:
https://github.com/letsencrypt/boulder/issues/2125https://github.com/letsencrypt/pebble/issues/133
This must be backported to 3.2 and later.
When abortonclose option is enabled (by default since 3.3), the HTTP rules
can no longer yield if the client aborts. However, stream aborts were also
considered. So it was possible to interrupt yielding rules, especially on
the response processing, while the client was still waiting for the
response.
So now, when abortonclose option is enabled, we now take care to only
consider client aborts to prevent HTTP rules to yield.
Many thanks to @DirkyJerky for his detailed analysis.
This patch should fix the issue #3306. It should be backported as far as
2.8.
warnif_misplaced_* functions return 1 when a warning is reported and 0
otherwise. So the caller must properly handle the return value.
When parsing a proxy, ERR_WARN code must be added to the error code instead
of the return value. When a warning was reported, ERR_RETRYABLE (1) was
added instead of ERR_WARN.
And when tcp rules were parsed, warnings were ignored. Message were emitted
but the return values were ignored.
This patch should be backported to all stable versions.
When warnif_cond_conflicts() is called, we must take care to emit a warning
only when a conflict is reported. We cannot rely on the err_code variable
because some warnings may have been already reported. We now rely on the
errmsg variable. If it contains something, a warning is emitted. It is good
enough becasue warnif_cond_conflicts() only reports warnings.
This patch should fix the issue #3305. It is a 3.4-dev specific issue. No
backport needed.
When picking a mux, pay attention to its MX_FL_FRAMED. If it is set,
then it means we explicitely want QUIC, so don't use that mux for any
protocol that is not QUIC.
When parsing the check address, store the associated proto too.
That way we can use the notation like quic4@address, and the right
protocol will be used. It is possible for checks to use a different
protocol than the server, ie we can have a QUIC server but want to run
TCP checks, so we can't just reuse whatever the server uses.
WIP: store the protocol in checks
Don't assume the check will reuse the server's xprt. It may not be true
if some settings such as the ALPN has been set, and it differs from the
server's one. If the server is QUIC, and we want to use TCP for checks,
we certainly don't want to reuse its XPRT.
Permission checks on the CLI for ACME are missing.
This patch adds a check on the ACME commands
so they can only be run in admin mode.
ACME is stil a feature in experimental-mode.
Initial report by Cameron Brown.
Must be backported to 3.2 and later.
Permission checks on the CLI for ECH are missing.
This patch adds a check for "(add|set|del|show) ssl ech" commands
so they can only be run in admin mode.
ECH is stil a feature in experimental-mode and is not compiled by
default.
Initial report by Cameron Brown.
Must be backported to 3.3.
This patch fixes a warning that can be reproduced with gcc-8.5 on RHEL8
(gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-28)).
This should fix issue #3303.
Must be backported everywhere 917e82f283 ("MINOR: debug: copy debug
symbols from /usr/lib/debug when present") was backported, which is
to branch 3.2 for now.
Fix the check or arguments of the 'acme challenge_ready' command which
was checking if all arguments are NULL instead of one of the argument.
Must be backported to 3.2 and later.
Replace atol() by _strl2uic() in cases the input are ISTs when parsing
the retry-after header. There's no risk of an error since it will stop
at the first non-digit.
Must be backported to 3.2 and later.
In acme_req_finalize() the data buffer is only freed when a2base64url
succeed. This patch moves the allocation so it free() the DER buffer in
every cases.
Must be backported to 3.2 and later.
In the documentation of 'http-check expect' directive, the parameter
'status-code' was missing. Let's add it.
This patch could be backported to all stable versions.
Thanks to previous commits, it is possible to use small buffers at different
places: to store the request when a connection is queued or when L7 retries
are enabled, or for health-checks requests. However, there was no
configuration parameter to fine tune small buffer use.
It is now possible, thanks to the proxy option "use-small-buffers".
Documentation was updated accordingly.
When healthchecks were configured for a proxy, an enum-like was used to
sepcify the check's type. The idea was to reserve some values for futur
types of healthcheck. But it is overkill. I doubt we will ever have
something else than tcp and external checks. So corresponding PR_O2 flags
were slightly reviewed and a hole was filled.
Thanks to this change, some bits were released in options2 bitfield.
If support for small buffers is enabled, we now try to use them for
healthcheck requests. First, we take care the tcpcheck ruleset may use small
buffers. Send rules using LF strings or too large data are excluded. The
ability to use small buffers or not are set on the ruleset. All send rules
of the ruleset must be compatible. This info is then transfer to server's
healthchecks relying on this ruleset.
Then, when a healthcheck is running, when a send rule is evaluated, if
possible, we try to use small buffers. On error, the ability to use small
buffers is removed and we retry with a regular buffer. It means on the first
error, the support is disabled for the healthcheck and all other runs will
use regular buffers.
In h2_rcv_buf(), HTX flags are transfer with data when htx_xfer() is
called. There is no reason to continue to deal with them in the H2 mux. In
addition, there is no reason to set SE_FL_EOI flag when a parsing error was
reported. This part was added before the stconn era. Nowadays, when an HTX
parsing error is reported, an error on the sedesc should also be reported.
This reverts commit 44932b6c417e472d25039ec3d7b8bf14e07629bc.
The patch above was only necessary to handle partial headers or trailers
parsing. There was nothing to prevent the H2 multiplexer to start to add
headers or trailers in an HTX message and to stop the processing on error,
leaving the HTX message with no EOH/EOT block.
From the HTX API point of view, it is unexepected. And this was fixed thanks
to the commit ba7dc46a9 ("BUG/MINOR: h2/h3: Never insert partial
headers/trailers in an HTX message").
So this patch can be reverted. It is important to not report a parsign error
too early, when there are still data to transfer to the upper layer.
This patch must be backport where 44932b6c4 was backported but only after
backporting ba7dc46a9 first.
When a HTX stream is queued, if the request is small enough, it is moved
into a small buffer. This should save memory on instances intensively using
queues.
Applet and connection receive function were update to block receive when a
small buffer is in use.
In the same way support for large chunks was added to properly work with
large buffers, we are now adding supports for small chunks because it is
possible to process small buffers.
So a dedicated memory pool is added to allocate small
chunks. alloc_small_trash_chunk() must be used to allocate a small
chunk. alloc_trash_chunk_sz() and free_trash_chunk() were uppdated to
support small chunks.
In addition, small trash buffers are also created, using the same mechanism
than for regular trash buffers. So three thread-local trash buffers are
created. get_small_trash_chunk() must be used to get a small trash buffer.
And get_trash_chunk_sz() was updated to also deal with small buffers.
htx_move_to_small_buffer()/htx_move_to_large_buffer() and
htx_copy_to_small_buffer()/htx_copy_to_large_buffer() functions can now be
used to move or copy blocks from a default buffer to a small or large
buffer. The destination buffer is allocated and then each blocks are
transferred into it.
These funtions relies in htx_xfer() function.
htx_xfer() function should replace htx_xfer_blks(). It will be a bit easier to
maintain and to use. The behavior of htx_xfer() can be changed by calling it
with specific flags:
* HTX_XFER_KEEP_SRC_BLKS: Blocks from the source message are just copied
* HTX_XFER_PARTIAL_HDRS_COPY: It is allowed to partially xfer headers or trailers
* HTX_XFER_HDRS_ONLY: only headers are xferred
By default (HTX_XFER_DEFAULT or 0), all blocks from the source message are moved
into to the destination mesage. So copied in the destination messageand removed
from the source message.
The caller must still define the maximum amount of data (including meta-data)
that can be xferred.
It is no longer necessary to specify a block type to stop the copy. Most of
time, with htx_xfer_blks(), this parameter was set to HTX_BLK_UNUSED. And
otherwise it was only specified to transfer headers.
It is important to not that the caller is responsible to verify the original
HTX message is well-formated. Especially, it must be sure headers part and
trailers part are complete (finished by EOH/EOT block).
For now, htx_xfer_blks() is not removed for compatiblity reason. But it is
deprecated.
When small buffer size was greater than the default buffer size, an error
was triggered. We now do the same than for large buffer. A warning is
emitted and the small buffer size is set to 0 do disable small buffer
allocation.
Because small buffers were only used by QUIC streams, the pool used to alloc
these buffers was located in the quic code. However, their usage will be
extended to other parts. So, the small buffers pool was moved into the
dynbuf part.
http-errors parsing has been refactored in a recent serie of patches.
However, a null deref was introduced by the following patch in case a
non-existent http-errors section is referenced by an "errorfiles"
directive.
commit 2ca7601c2d6781f455cf205e4f3b52f5beb16e41
MINOR/OPTIM: http_htx: lookup once http_errors section on check/init
Fix this by delaying ha_free() so that it is called after ha_alert().
No need to backport.
The cfg_parse_acme() function checks if an 'acme' section is already
existing in the configuration with cur_acme->linenum > 0. But the wrong
filename and line number are displayed in the commit message.
Must be backported to 3.2 and later.
This patch fixes a leak of the ext_san structure when
sk_X509_EXTENSION_push() failed. sk_X509_EXTENSION_pop_free() is already
suppose to free it, so ext_san must be set to NULL upon success to avoid
a double-free.
Must be backported to 3.2 and later.
Use proxy_check_http_errors() on defaults proxy instances. This will
emit alert messages for errorfiles directives referencing a non-existing
http-errors section, or a warning if an explicitely listed status code
is not present in the target section.
This is a small behavior changes, as previouly this was only performed
for regular proxies. Thus, errorfile/errorfiles directives in an unused
defaults were never checked.
This may prevent startup of haproxy with a configuration file previously
considered as valid. However, this change is considered as necessary to
be able to use http-errors with dynamic backends. Any invalid defaults
will be detected on startup, rather than having to discover it at
runtime via "add backend" invokation.
Thus, any restriction on http-errors usage is now lifted for the
creation of dynamic backends.
The previous patch has splitted the original proxy_check_errors()
function in two, so that check and init steps are performed separately.
However, this renders the code inefficient for "errorfiles" directive as
tree lookup on http-errors section is performed twice.
Optimize this by adding a reference to the section in conf_errors
structure. This is resolved during proxy_check_http_errors() and
proxy_finalize_http_errors() can reuse it.
No need to backport.
Function proxy_check_errors() is used when configuration parsing is
over. This patch splits it in two newly named ones.
The first function is named proxy_check_http_errors(). It is responsible
to check for the validity of any "errorfiles" directive which could
reference non-existent http-errors section or code not defined in such
section. This function is now called via proxy_finalize().
The second function is named proxy_finalize_http_errors(). It converts
each conf_errors type used during parsing in a proper http_reply type
for runtime usage. This function is still called via post-proxy-check,
after proxy_finalize().
This patch does not bring any functional change. However, it will become
necessary to ensure http-errors can be used as expected with dynamic
backends.
This patch is the second part of the refactoring for http-errors
parsing. It renames some fields in <conf_errors> structure to clarify
their usage. In particular, union variants are renamed "inl"/"section",
which better highlight the link with the newly defined enum
http_err_directive.
In conf_errors struct, arbitrary integer values were used for both
<type> field and <status> array. This renders the code difficult to
follow.
Replaces these values with proper enums type. Two new types are defined
for each of these fields. The first one represents the directive type,
derived from the keyword used (errorfile vs errorfiles). This directly
represents which part of <info> union should be manipulated.
The second enum is used for errorfiles directive with a reference on a
http-errors section. It indicates whether or not if a status code should
be imported from this section, and if this import is explicit or
implicit.
Several resources were leaked on both success and error paths:
- X509_NAME *nm was never freed. X509_REQ_set_subject_name() makes
an internal copy, so nm must be freed separately by the caller.
- str_san allocated via my_strndup() was never freed on either path.
- On error paths after allocation, x (X509_REQ) and exts
(STACK_OF(X509_EXTENSION)) were also leaked.
Fix this by adding proper cleanup of all allocated resources in both
the success and error paths. Also move sk_X509_EXTENSION_pop_free()
after X509_REQ_sign() so it is not skipped when sign fails, and
initialize nm to NULL to make early error paths safe.
Must be backported as far as 3.2.
There was a leftover of "activity[tid].ctr1++" in commit 7d40b3134
("MEDIUM: sched: do not run a same task multiple times in series")
that unfortunately only builds in development mode :-(
This introduces 3 new settings: tune.h2.be.max-frames-at-once and
tune.h2.fe.max-frames-at-once, which limit the number of frames that
will be processed at once for backend and frontend side respectively,
and tune.h2.fe.max-rst-at-once which limits the number of RST_STREAM
frames processed at once on the frontend.
We can now yield when reading too many frames at once, which allows to
limit the latency caused by processing too many frames in large buffers.
However if we stop due to the RST budget being depleted, it's most likely
the sign of a protocol abuse, so we make the tasklet go to BULK since
the goal is to punish it.
By limiting the number of RST per loop to 1, the SSL response time drops
from 95ms to 1.6ms during an H2 RST flood attack, and the maximum SSL
connection rate drops from 35.5k to 28.0k instead of 11.8k. A moderate
SSL load that shows 1ms response time and 23kcps increases to 2ms with
15kcps versus 95ms and 800cps before. The average loop time goes down
from 270-280us to 160us, while still doubling the attack absorption
rate with the same CPU capacity.
This patch may usefully be backported to 3.3 and 3.2. Note that to be
effective, this relies on the following patches:
MEDIUM: sched: do not run a same task multiple times in series
MINOR: sched: do not requeue a tasklet into the current queue
MINOR: sched: do not punish self-waking tasklets anymore
MEDIUM: sched: do not punish self-waking tasklets if TASK_WOKEN_ANY
MEDIUM: sched: change scheduler budgets to lower TL_BULK
Having less yielding tasks in TL_BULK and more in TL_NORMAL, we need
to rebalance these queues' priorities. Tests have shown that raising
TL_NORMAL to 40% and lowering TL_BULK to 3% seems to give about the
best tradeoffs.
Self-waking tasklets are currently punished and go to the BULK list.
However it's a problem with muxes or the stick-table purge that just
yield and wake themselves up to limit the latency they cause to the
rest of the process, because by doing so to help others, they punish
themselves. Let's check if any TASK_WOKEN_ANY flag is present on
the tasklet and stop sending tasks presenting such a flag to TL_BULK.
Since tasklet_wakeup() by default passes TASK_WOKEN_OTHER, it means
that such tasklets will no longer be punished. However, tasks which
only want a best-effort wakeup can simply pass 0.
It's worth noting that a comparison was made between going into
TL_BULK at all and only setting the TASK_SELF_WAKING flag, and
it shows that the average latencies are ~10% better when entirely
avoiding TL_BULK in this case.
Nowadays due to yield etc, it's counter-productive to permanently
punish self-waking tasklets, let's abandon this principle as it prevent
finer task priority handling.
We continue to check for the TASK_SELF_WAKING flag to place a task
into TL_BULK in case some code wants to make use of it in the future
(similarly to TASK_HEAVY), but no code sets it anymore. It could
possible make sense in the future to replace this flag with a one-shot
variant requesting low-priority.
As found by Christopher, the concept of waking a tasklet up into the
current queue is totally flawed, because if a task is in TL_BULK or
TL_HEAVY, all the tasklets it will wake up will end up in the same
queue. Not only this will clobber such queues, but it will also
reduce their quality of service, and this can contaminate other
tasklets due to the numerous wakeups there are now with the subsribe
mechanism between layers.
There's always a risk that some tasks run multiple times if they wake
each other up. Now we include the loop counter in the task struct and
stop processing the queue it's in when meeting a task that has already
run. We only pick 16 bits since that's only what remains free in the
task common part, so from time to time (once every 65536) it will be
possible to wrongly match a task as having already run and stop evaluating
its queue, but it's rare enough that we don't care, because this will
be OK on the next iteration.
This patch improves the robustness of the QPACK varint decoder and fixes
potential 1-byte out-of-bounds reads in qpack_decode_fs().
In qpack_decode_fs(), two 1-byte OOB reads were possible on truncated
streams between two varint decoding. These occurred when trying to read
the byte containing the Huffman bit <h> and the Value Length prefix
immediately following an Index or a Name Length.
Note that these OOB are limited to a single byte because
qpack_get_varint() already ensures that its input length is non-zero
before consuming any data.
The fixes in qpack_decode_fs() are:
- When decoding an index, we now verify that at least one byte remains
to safely access the following <h> bit and value length.
- When decoding a literal, we now check len < name_len + 1 to ensure
the byte starting the header value is reachable.
In qpack_get_varint(), the maximum value is now strictly capped at 2^62-1
as per RFC. This is enforced using a budget-based check:
(v & 127) > (limit - ret) >> shift
This prevents values from overflowing into the 63rd or 64th bits, which
would otherwise break subsequent signed comparisons (e.g., if (len < name_len))
by interpreting the length as a negative value, leading to false positive
tests.
Thank you to @jming912 for having reported this issue in GH #3302.
Must be backported as far as 2.6
In the ENFILE and ENOMEM cases, when accept() fails, an irrelevant
global.maxsock value was printed that doesn't reflect system limits.
Now the actconn is printed that gives a hint about the failure reasons.
Should be backported in all stable branches.
We anticipated that the do-log action should be expanded with optional
arguments at some point. Now that we heard of multiple use-cases
that could be achieved with do-log action, but that are limitated by the
fact that all do-log statements inherit from the implicit log-profile
defined on the logger, we need to provide a way for the user to specify
that custom log-profile that could be used per do-log actions individually
This is what we try to achieve in this commit, by leveraging the
prerequisite work performed by the last 2 commits.
In process_send_log(), now also consider the ctx if ctx->profile != NULL
In that case, we do as if logger->prof was set, but we consider
ctx->profile in priority over the logger one. What this means is that
it will become possible to pass ctx.profile to a profile that will be
used no matter what to generate the log payload.
This is a pre-requisite to implement optional "profile" argument for
do-log action
do_log() is just a wrapper to use do_log_ctx() with pre-filled ctx, but
we now have the low-level do_log_ctx() variant which can be used to
pass specific ctx parameters instead.
Released version 3.4-dev7 with the following main changes :
- BUG/MINOR: stconn: Increase SC bytes_out value in se_done_ff()
- BUG/MINOR: ssl-sample: Fix sample_conv_sha2() by checking EVP_Digest* failures
- BUG/MINOR: backend: Don't get proto to use for webscoket if there is no server
- BUG/MINOR: jwt: Missing 'jwt_tokenize' return value check
- MINOR: flt_http_comp: define and use proxy_get_comp() helper function
- MEDIUM: flt_http_comp: split "compression" filter in 2 distinct filters
- CLEANUP: flt_http_comp: comp_state doesn't bother about the direction anymore
- BUG/MINOR: admin: haproxy-reload use explicit socat address type
- MEDIUM: admin: haproxy-reload conversion to POSIX sh
- BUG/MINOR: admin: haproxy-reload rename -vv long option
- SCRIPTS: git-show-backports: hide the common ancestor warning in quiet mode
- SCRIPTS: git-show-backports: add a restart-from-last option
- MINOR: mworker: add a BUG_ON() on mproxy_li in _send_status
- BUG/MINOR: mworker: don't set the PROC_O_LEAVING flag on master process
- Revert "BUG/MINOR: jwt: Missing 'jwt_tokenize' return value check"
- MINOR: jwt: Improve 'jwt_tokenize' function
- MINOR: jwt: Convert EC JWK to EVP_PKEY
- MINOR: jwt: Parse ec-specific fields in jose header
- MINOR: jwt: Manage ECDH-ES algorithm in jwt_decrypt_jwk function
- MINOR: jwt: Add ecdh-es+axxxkw support in jwt_decrypt_jwk converter
- MINOR: jwt: Manage ec certificates in jwt_decrypt_cert
- DOC: jwt: Add ECDH support in jwt_decrypt converters
- MINOR: stconn: Call sc_conn_process from the I/O callback if TASK_WOKEN_MSG state was set
- MINOR: mux-h2: Rely on h2s_notify_send() when resuming h2s for sending
- MINOR: mux-spop: Rely on spop_strm_notify_send() when resuming streams for sending
- MINOR: muxes: Wakup the data layer from a mux stream with TASK_WOKEN_IO state
- MAJOR: muxes: No longer use app_ops .wake() callback function from muxes
- MINOR: applet: Call sc_applet_process() instead of .wake() callback function
- MINOR: connection: Call sc_conn_process() instead of .wake() callback function
- MEDIUM: stconn: Remove .wake() callback function from app_ops
- MINOR: check: Remove wake_srv_chk() function
- MINOR: haterm: Remove hstream_wake() function
- MINOR: stconn: Wakup the SC with TASK_WOKEN_IO state from opposite side
- MEDIUM: stconn: Merge all .chk_rcv() callback functions in sc_chk_rcv()
- MINOR: stconn: Remove .chk_rcv() callback functions
- MEDIUM: stconn: Merge all .chk_snd() callback functions in sc_chk_snd()
- MINOR: stconn: Remove .chk_snd() callback functions
- MEDIUM: stconn: Merge all .abort() callback functions in sc_abort()
- MINOR: stconn: Remove .abort() callback functions
- MEDIUM: stconn: Merge all .shutdown() callback functions in sc_shutdown()
- MINOR: stconn: Remove .shutdown() callback functions
- MINOR: stconn: Totally app_ops from the stconns
- MINOR: stconn: Simplify sc_abort/sc_shutdown by merging calls to se_shutdown
- DEBUG: stconn: Add a CHECK_IF() when I/O are performed on a orphan SC
- MEDIUM: mworker: exiting when couldn't find the master mworker_proc element
- BUILD: ssl: use ASN1_STRING accessors for OpenSSL 4.0 compatibility
- BUILD: ssl: make X509_NAME usage OpenSSL 4.0 ready
- BUG/MINOR: tcpcheck: Fix typo in error error message for `http-check expect`
- BUG/MINOR: jws: fix memory leak in jws_b64_signature
- DOC: configuration: http-check expect example typo
- DOC/CLEANUP: config: update mentions of the old "Global parameters" section
- BUG/MEDIUM: ssl: Handle receiving early data with BoringSSL/AWS-LC
- BUG/MINOR: mworker: always stop the receiving listener
- BUG/MEDIUM: ssl: Don't report read data as early data with AWS-LC
- BUILD: makefile: fix range build without test command
- BUG/MINOR: memprof: avoid a small memory leak in "show profiling"
- BUG/MINOR: proxy: do not forget to validate quic-initial rules
- MINOR: activity: use dynamic allocation for "show profiling" entries
- MINOR: tools: extend the pointer hashing code to ease manipulations
- MINOR: tools: add a new pointer hash function that also takes an argument
- MINOR: memprof: attempt different retry slots for different hashes on collision
- MINOR: tinfo: start to add basic thread_exec_ctx
- MINOR: memprof: prepare to consider exec_ctx in reporting
- MINOR: memprof: also permit to sort output by calling context
- MINOR: tools: add a function to write a thread execution context.
- MINOR: debug: report the execution context on thread dumps
- MINOR: memprof: report the execution context on profiling output
- MINOR: initcall: record the file and line declaration of an INITCALL
- MINOR: tools: decode execution context TH_EX_CTX_INITCALL
- MINOR: tools: support decoding ha_caller type exec context
- MINOR: sample: store location for fetch/conv via initcalls
- MINOR: sample: also report contexts registered directly
- MINOR: tools: support an execution context that is just a function
- MINOR: actions: store the location of keywords registered via initcalls
- MINOR: actions: also report execution contexts registered directly
- MINOR: filters: set the exec context to the current filter config
- MINOR: ssl: set the thread execution context during message callbacks
- MINOR: connection: track mux calls to report their allocation context
- MINOR: task: set execution context on task/tasklet calls
- MINOR: applet: set execution context on applet calls
- MINOR: cli: keep the info of the current keyword being processed in the appctx
- MINOR: cli: keep track of the initcall context since kw registration
- MINOR: cli: implement execution context for manually registered keywords
- MINOR: activity: support aggregating by caller also for memprofile
- MINOR: activity: raise the default number of memprofile buckets to 4k
- DOC: internals: short explanation on how thread_exec_ctx works
- BUG/MINOR: mworker: only match worker processes when looking for unspawned proc
- MINOR: traces: defer processing of "-dt" options
- BUG/MINOR: mworker: fix typo &= instead of & in proc list serialization
- BUG/MINOR: mworker: set a timeout on the worker socketpair read at startup
- BUG/MINOR: mworker: avoid passing NULL version in proc list serialization
- BUG/MINOR: sockpair: set FD_CLOEXEC on fd received via SCM_RIGHTS
- BUG/MEDIUM: stconn: Don't forget to wakeup applets on shutdown
- BUG/MINOR: spoe: Properly switch SPOE filter to WAITING_ACK state
- BUG/MEDIUM: spoe: Properly abort processing on client abort
- BUG/MEDIUM: stconn: Fix abort on close when a large buffer is used
- BUG/MEDIUM: stconn: Don't perform L7 retries with large buffer
- BUG/MINOR: h2/h3: Only test number of trailers inserted in HTX message
- MINOR: htx: Add function to truncate all blocks after a specific block
- BUG/MINOR: h2/h3: Never insert partial headers/trailers in an HTX message
- BUG/MINOR: http-ana: Swap L7 buffer with request buffer by hand
- BUG/MINOR: stream: Fix crash in stream dump if the current rule has no keyword
- BUG/MINOR: mjson: make mystrtod() length-aware to prevent out-of-bounds reads
- MEDIUM: stats-file/clock: automatically update now_offset based on shared clock
- MINOR: promex: export "haproxy_sticktable_local_updates" metric
- BUG/MINOR: spoe: Fix condition to abort processing on client abort
- BUILD: spoe: Remove unsused variable
- MINOR: tools: add a function to create a tar file header
- MINOR: tools: add a function to load a file into a tar archive
- MINOR: config: support explicit "on" and "off" for "set-dumpable"
- MINOR: debug: read all libs in memory when set-dumpable=libs
- DEV: gdb: add a new utility to extract libs from a core dump: libs-from-core
- MINOR: debug: copy debug symbols from /usr/lib/debug when present
- MINOR: debug: opportunistically load libthread_db.so.1 with set-dumpable=libs
- BUG/MINOR: mworker: don't try to access an initializing process
- BUG/MEDIUM: peers: enforce check on incoming table key type
- BUG/MINOR: mux-h2: properly ignore R bit in GOAWAY stream ID
- BUG/MINOR: mux-h2: properly ignore R bit in WINDOW_UPDATE increments
- OPTIM: haterm: use chunk builders for generated response headers
- BUG/MAJOR: h3: check body size with content-length on empty FIN
- BUG/MEDIUM: h3: reject unaligned frames except DATA
- BUG/MINOR: mworker/cli: fix show proc pagination losing entries on resume
- CI: github: treat vX.Y.Z release tags as stable like haproxy-* branches
- MINOR: freq_ctr: add a function to add values with a peak
- MINOR: task: maintain a per-thread indicator of the peak run-queue size
- MINOR: mux-h2: store the concurrent streams hard limit in the h2c
- MINOR: mux-h2: permit to moderate the advertised streams limit depending on load
- MINOR: mux-h2: permit to fix a minimum value for the advertised streams limit
- BUG/MINOR: mworker: fix sort order of mworker_proc in 'show proc'
- CLEANUP: mworker: fix tab/space mess in mworker_env_to_proc_list()
Since version 3.1, the display order of old workers in 'show proc' was
accidentally reversed. The oldest worker was shown first and the newest
last, which was not the intended behavior. This regression was introduced
during the master-worker rework.
Fix this by sorting the list during deserialization in
mworker_env_to_proc_list().
An alternative fix would have been to iterate the list in reverse order
in the show proc function, but that approach risks introducing
inconsistencies when backporting to older versions.
Must be backported to 3.1 and later.
When using rq-load on tune.h2.fe.max-concurrent-streams, it's easy to
reach a situation where only one stream is allowed. There's nothing
wrong with this but it turns out that slightly higher values do not
necessarily cause significantly higher loads and will improve the user
experience. For this reason the keyword now also supports "min" to
specify a value. Experimentation shows that values from 5 to 15 remain
very effective at protecting the run queue while allowing a great level
of parallelism that keeps a site fluid.
Global setting tune.h2.fe.max-concurrent-streams now supports an optional
"rq-load" option to pass either a target load, or a keyword among "auto"
and "ignore". These are used to quadratically reduce the advertised streams
limit when the thread's run queue size goes beyong the configured value,
and automatically reduce the load on the process from new connections.
With "auto", instead of taking an explicit value, it uses as a target the
"tune.runqueue-depth" setting (which might be automatic). Tests have shown
that values between 50 and 100 are already very effective at reducing the
loads during attacks from 100000 to around 1500. By default, "ignore"
is in effect, which means that the dynamic tuning is not enabled.
The hard limit on the number of concurrent streams is currently
determined only by configuration and returned by
h2c_max_concurrent_streams(). However this doesn't permit to
change such settings on the fly without risking to break connections,
and it doesn't allow a connection to pick a different value, which
could be desirable for example to try to slow abuse down.
Let's store a copy of h2c_max_concurrent_streams() at connection
creation time into the h2c as streams_hard_limit. This inflates
the h2c size from 1324 to 1328 (0.3%) which is acceptable for the
expected benefits.
The new field th_ctx->rq_tot_peak contains the computed peak run queue
length averaged over the last 512 calls. This is computed when entering
process_runnable_tasks. It will not take into account new tasks that are
created or woken up during this round nor those which are evicted, which
is the reason why we're using a peak measurement to increase chances to
observe transient high values. Tests have shown that 512 samples are good
to provide a relatively smooth average measurement while still fading
away in a matter of milliseconds at high loads. Since this value is
only updated once per round, it cannot be used as a statistic and
shouldn't be exposed, it's only for internal use (self-regulation).
Sometimes it's desirable to observe fading away peak values, where a new
value that is higher than the historical one instantly replaces it,
otherwise contributes to it. It is convenient when trying to observe
certain phenomenons like peak queue sizes. The new function
swrate_add_peak_local() does that to a private variable (no atomic ops
involved as it's not worth the cost since such use cases are typically
local).
Add detection of release tags matching the vX.Y.Z pattern so they use
the same stable CI configuration as haproxy-* branches, rather than the
development one.
It prevents stable tag to trigger the CI with docker images and SSL
libraries only used for development.
Must be backported in stable releases.
After commit 594408cd612b5 ("BUG/MINOR: mworker/cli: fix show proc
pagination using reload counter"), the old-workers pagination stores
ctx->next_reload = child->reloads on flush failure, then skips entries
with child->reloads >= ctx->next_reload on resume.
The >= comparison is direction-dependent: it assumes the list is in
descending reload order (newest first). On current master, proc_list
is in ascending order (oldest first) because mworker_env_to_proc_list()
appends deserialized entries before mworker_prepare_master() appends
the new worker. This means the skip logic is inverted and can miss
entries or loop incorrectly depending on the version.
We fix this by renaming the context field to resume_reload and changing its
semantics: it now tracks the reload count of the last *successfully
flushed* row rather than the failed one. On flush failure, resume_reload
is left unchanged so the failed row is replayed on the next call. On
resume, entries are skipped by walking the list until the marker entry is
found (exact == match), which works regardless of list direction.
Additionally, we have to handle the unlikely case where the marker entry
is deleted from proc_list between handler calls (e.g. the process exits and
SIGCHLD processing removes it). Detect this by tracking the previous
LEAVING entry's reload count during the skip phase: if two consecutive
entries straddle the skip value (one > skip, the other < skip), the
deleted entry's former position has been crossed, so skipping stops and
the current entry is emitted.
This should be backported to all stable branches. On branches where
proc_list is in descending order (2.9, 3.0), the fix applies the
same way since the skip logic is now direction-agnostic.
HTTP/3 parser cannot deal with unaligned frames, except for DATA. As it
was expected that such case would not occur, a simple BUG_ON() was
written to protect HEADERS parsing.
First, this BUG_ON() was incorrectly written due an incorrect operator
'>=' vs '>' when checking if data wraps. Thus this patch correct it.
However this correction is not sufficient as it still possible to handle
a large unaligned HEADERS frame, which would trigger this BUG_ON(). This
is very unlikely as HEADERS is the first received frame on a request
stream, but not completely impossible. As HTTP/3 frame header (type +
length) is parsed first and removed, this leaves a small gap at the
buffer beginning. If this small gap is then filled with the remaining
frame payload, it would result in unaligned data. Also, trailers are
also sensitive here as in this case a HEADERS frame is handled after
other frames.
The objective of this patch is to ensure that an unaligned frame is now
handled in a safe way. This is extend to all HTTP/3 frames (except DATA)
and not only to HEADERS type. Parsing is interrupted if frame payload is
wrapping in the buffer. This should never happen except maybe with some
weird clients, so the connection is closed with H3_EXCESSIVE_LOAD error.
This approach is considered the safest one, in particular for backport
purpose. In the future, realign operation via copy may be implemented
instead if considered as useful.
This must be backported up to 2.6.
In QUIC, a STREAM frame may be received with no data but with FIN bit
set. This situation is tedious to handle and haproxy parsing code has
changed several times to deal with this situation. Now, H3 and H09
layers parsing code are skipped in favor of the shared function
qcs_http_handle_standalone_fin() used to handle the HTX EOM emission.
However, this shortcut bypasses an important HTTP/3 validation check on
the received body size vs the announced content-length header. Under
some conditions, this could cause a desynchronization with the backend
server which could be exploited for request smuggling.
Fix HTTP/3 parsing code by adding a call to h3_check_body_size() prior
to qcs_http_handle_standalone_fin() if content-length header has been
found. If the body size is incorrect, the stream is immediately resetted
with H3_MESSAGE_ERROR code and the error is forwarded to the stream
layer.
Thanks to Martino Spagnuolo for his detailed report on this issue and
for having contacting us about it via the security mailing list.
This must be backported up to 2.6.
hstream_build_http_resp() currently uses snprintf() to build the
status code and the generated X-req/X-rsp header values.
These strings are short and are fully derived from already parsed request
state, so they can be assembled directly in the HAProxy trash buffer using
`chunk_strcat()` and `ultoa_o()`.
This keeps the generated output unchanged while removing the remaining
`snprintf()` calls from the response-building path.
No functional change is expected.
Signed-off-by: Aleksandar Lazic <al-haproxy@none.at>
The window size increments are 31 bits and the topmost bit is reserved
and should be ignored, however it was not masked, so a peer sending it
set would emit a negative value which could actually reduce the current
window instead of increasing it. Note that the window cannot reach zero
as there's already a test for this, but transfers could slow down to
the same speed as if an initial window of just a few bytes had been
advertised. Let's just mask the reserved bit before processing.
This should be backported to all stable versions.
The stream ID indicated in GOAWAY frames must have its bit 31 (R) ignored
and this wasn't the case. The effect is that if this bit was present, the
GOAWAY frame would mark the last acceptable stream as negative, which is
the default situation (unlimited), thus would basically result in this
GOAWAY frame to be ignored since it would replace a negative last_sid
with another negative one. The impact is thus basically that if a peer
would emit anything non-zero in the R bit, the GOAWAY frame would be
ignored and new streams would still be initiated on the backend, before
being rejected by the server.
Thanks to Haruto Kimura (Stella) for finding and reporting this bug.
This fix needs to be backported to all stable versions.
The key type received over the peers protocol is not checked for
validity and as a result can crash the process when passed through
peer_int_key_type[] in peer_treat_definemsg(). The risk remains
very low since only trusted peers may exchange tables, however it
represents a risk the day haproxy supports new key types, because
mixing old and new versions could then cause the old ones to crash.
Let's add the required check in peer_treat_definemsg().
It is also worth noting that in this function a few protocol identifiers
of type int read directly from a var_int via intdecode() and that some
protocol aliasing may occur (e.g. table_id, table_id_len etc). This is
not supposed to be a problem but it could hide implementation bugs and
cause interoperability issues once fixed, so these should be addressed
in a future commit that will not be marked for backporting.
Thanks to Haruto Kimura (Stella) for finding and reporting this bug.
This fix needs to be backported to all stable versions.
In pcli_prefix_to_pid(), when resolving a worker by absolute pid
(@!<pid>) or by relative pid (@1), a worker that still has PROC_O_INIT
set (i.e. not yet ready, still initializing) could be returned as a
valid target.
During a reload, if a client connects to the master CLI and sends a
command targeting a worker (e.g. @@1 or @@!<pid>), the master resolves
the target pid and attempts to forward the command by transferring a fd
over the worker's sockpair. If the worker is still initializing and has
not yet sent its READY signal, its end of the sockpair is not usable,
causing send_fd_uxst() to fail with EPIPE. This results in the
following alert being repeated in a loop:
[ALERT] (550032) : socketpair: Cannot transfer the fd 13 over sockpair@5. Giving up.
The situation is even worse if the initializing worker has already
exited (e.g. due to a bind failure) but has not yet been removed from
the process list: in that case the sockpair's remote end is already
closed, making the failure immediate and unrecoverable until the dead
worker is cleaned up.
This was not possible before 3.1 because the master's polling loop only
started once all workers were fully ready, making it impossible to
receive CLI connections while a worker was still initializing.
Fix this by skipping workers with PROC_O_INIT set in both the absolute
and relative pid resolution paths of pcli_prefix_to_pid(), so that
only fully initialized workers can be targeted.
Must be backported to 3.1 and later.
When loading libs into the core dump, let's also try to load
libthread_db.so.1 that gdb usually requires. It can significantly help
decoding the threads for systems which require it, and the file is quite
small. It can appear at a few different locations and is generally next
to libpthread.so, or alternately libc, so we first look where we found
them, and fall back to a few other common places. The file is really
small, a few tens of kB usually.
When set-dumpable=libs, let's also pick the debug symbols for the libs
we're loading. For now we only try /usr/lib/debug/<path>, which is quite
common and easy to guess. Build IDs could also be used but are more
complex to deal with, so let's stay simple for now.
This utility takes in argument the path to a core dump, and it looks
for the archive signature of libraries embedded with "set-dumpable libs",
and either emits the offset and size of stdout, or directly dumps the
contents so that the tar file can be extracted directly by piping the
output to tar xf.
When "set-dumpable" is set to "libs", in addition to marking the process
dumpable, haproxy also reads the binary and shared objects into memory as
a tar archive in a page-aligned location so that these files are easily
extractable from a future core dump. The goal here is to always have
access to the exact same binary and libs as those which caused the core
to happen. It's indeed very frequent to miss some of these, or to get
mismatching files due to a local update that didn't experience a reload,
or to get those of a host system instead of the container.
The in-memory tar file presents everything under a directory called
"core-%d" where %d corresponds to the PID of the worker process. In
order to ease the finding of these data in the core dump, the memory
area is contiguous and surrounded by PROT_NONE pages so that it appears
in its own segment in the core file. The total size used by this is a
few tens of MB, which is not a problem on large systems.
The global "set-dumpable" keyword currently is only positional. Let's
extend its syntax to support arguments. For now we support both "on"
and "off" to explicitly enable or disable it.
New function load_file_into_tar() concatenates a file into an in-memory
tar archive and grows its size. Only the base name and a provided prefix
are used to name the faile. If the file cannot be loaded, it's added as
size zero and permissions 0 to show that it failed to load. This will
be used to load post-mortem information so it needs to remain simple.
The purpose here is to create a tar file header in memory from a known
file name, prefix, size and mode. It will be used to prepare archives
of libs in use for improved debugging, but may probably be useful for
other purposes due to its simplicity.
Since 7a1382da7 ("BUG/MINOR: spoe: Fix condition to abort processing on
client abort"), the chn variable is no longer used in
spoe_process_event(). Let's remove it
This patch must be backported with the commit above, as far as 3.1.
The test to detect client aborts in the SPOE, introduced by commit b3be3b94a
("BUG/MEDIUM: spoe: Properly abort processing on client abort"), was no
correct. Producer flags must not be tested. Only the frontend SC must be
tested when the abortonclose option is set.
Because of this bug, when a client aborted, the SPOE processing was aborted
too, regardless the abortonclose option.
This patch must be backpoeted with the commit above, so as far as 3.1.
haproxy_sticktable_local_updates corresponds to the table->localupdate
counter, which is used internally by the peers protocol to identify
update messages in order to send and ack them among peers.
Here we decide to expose this information, as it is already the case in
"show peers" output, because it turns out that this value, which is
cumulative and grows in sync with the number of updates triggered on the
table due to changes initiated by the current process, can be used to
compute the update rate of the table. Computing the update rate of the
table (from the process point of view, ie: updates sent by the process and
not those received by the process), can be a great load indicator in order
to properly scale the infrastructure that is intended to handle the
table updates.
Note that there is a pitfall, which is that the value will eventually
wrap since it is stored using unsigned 32bits integer. Scripts or system
making use of this value must take wrapping into account between two
readings to properly compute the effective number of updates that were
performed between two readings. Also, they must ensure that the "polling"
rate between readings is small enough so that the value cannot wrap behind
their back.
We no longer rely on now_offset stored in the shm-stats-file. Instead
haproxy automatically computes the now_offset relative to the monotonic
clock and the shared global clock.
Indeed, the previous model based on static now_offset when monotonic
clock is available proved to be insufficient when used in
combination with shm-stats-file (that is when monotonic clock is shared
between multiple co-processes). In ideal situation co-processes would
correctly apply the offset to their local monotonic clock and end up
with consistent now_ns. But when restarting from an existing
shm-stats-file from a previous session (ie: prior to reboot), then the
local monotonic clock would no longer be consistent with the one used
to update the file previously, so applying a static offset would fail
to restore clock consistency.
For this specific issue, a workaround was brought by 09bf116
("BUG/MEDIUM: stats-file: detect and fix inconsistent shared clock when resuming from shm-stats-file")
but the solution implemented there was deemed too fragile, because there
is a 60sec window where the fix would fail to detect inconsistent clock
and would leave haproxy with a broken clock ranging from 0 to 60 seconds,
which can be huge..
By simply recomputing the now_offset each time we learn from another
process (through the shared map by reading global_now_ns), we simply
recompute our local offset (difference between OUR monotonic clock
and the SHARED one). Also, in clock_update_global_date(), we make
sure we always recompute the now_offset as now_ms may have been
updated from shared clock if shared clock was ahead of us.
Thanks to that new logic, interrupted processes, resumed processes,
processed started with shm-stats-file from previous session now
correctly recover from those various situations and multiple
co-processes with diverting clocks on startup end up converging to
the same values.
Since it is no longer relevant to save now_offset in the map, it was
removed but to prevent shm-stats-file incompatibility with previous
versions, 8-byte hole was forced, and we didn't bump the shm-stats-file
version on purpose.
This patch may be backported in 3.3 after a solid period of observation
to ensure we didn't break things.
mystrtod() was not length-aware and relied on null-termination or a
non-numeric character to stop. The fix adds a length parameter as a
strict upper bound for all pointer accesses.
The practical impact in haproxy is essentially null: all callers embed
the JSON payload inside a large haproxy buffer, so the speculative read
past the last digit lands on memory that is still within the same
allocation. ASAN cannot detect it in a normal haproxy run for the same
reason — the overread never escapes the enclosing buffer. Triggering a
detectable fault requires placing the JSON payload at the exact end of
an allocation.
Note: the 'path' buffer was using a null-terminated string so the result
of strlen is passed to it, this part was not at risk.
Thanks to Kamil Frankowicz for the original bug report.
This patch must be backported to all maintained versions.
The commit 9f1e9ee0e ("DEBUG: stream: Display the currently running rule in
stream dump") revealed a bug. When a stream is dumped, if it is blocked on a
rule, we must take care the rule has a keyword to display its name.
Indeed, some action parsings are inlined with the rule parser. In that case,
there is no keyword attached to the rule.
Because of this bug, crashes can be experienced when a stream is
dumped. Now, when there is no keyword, "?" is display instead.
This patch must be backported as far as 2.6.
When a L7 retry is performed, we should not rely on b_xfer() to swap the L7
buffer with the request buffer. When it is performed the request buffer is
not allocated. b_xfer() must not be called with an unallocated destination
buffer. The swap remains an optim. For instance, It is not performed on
buffers of different size. So the caller is responsible to provide an
allocated destination buffer with enough free space to transfer data.
However, when a L7 retry is performed, we cannot allocate a request buffer,
because we cannot yield. An error was reported, if we wait for a buffer, the
error will be handled by process_stream(). But we can swap the buffers by
hand. At this stage, we know there is no request buffer, so we can easily
swap it with the L7 buffer.
Note there is no real bug for now.
This patch could be backported to all stable versions.
In HTX, headers and trailers parts must always be complete. It is unexpected
to found header blocks without the EOH block or trailer blocks without the
EOT block. So, during H2/H3 message parsing, we must take care to remove any
HEADER/TRAILER block inserted when an error is encountered. It is mandatory
to be sure to properly report parsing error to upper layer.x
It is now performed by calling htx_truncat_blk() function on the error
path. The tail block is saved before converting any HEADERS/TRAILERS frame
to HTX. It is used to remove all inserted block on error.
This patch rely on the following one:
"MINOR: htx: Add function to truncate all blocks after a specific block"
It should be backported with the commit above to all stable versions for
the H2 part and as far as 2.8 for h3 one.
When H2 or H3 trailers are inserted in an HTX message, we must take care to
not exceed the maximum number of trailers allowed in a message (same than
the maximum number of headers, i.e tune.http.maxhdr). However, all HTX
blocks in the HTX message were considered. Only TRAILERS HTX blocks must be
considered.
To fix the issue, in h2_make_htx_trailers(), we rely on the "idx" variable
at the end of the for loop. In h3_trailers_to_htx(), we rely on the
"hdr_idx" variable.
This patch must be backported to all stables versions for the H2 part and as
far as 2.8 for the H3 one.
pouet
L7 retries are buggy when a large buffer is used on the request channel. A
memcpy is used to copy data from the request buffer into the L7 buffer. The
L7 buffer is for now always a standard buffer. So if a larger buffer is
used, this leads to a buffer overflow and crash the process.
The Best way to fix the issue is to disable L7 retries when a large buffer
was allocated for the request channel. In that case, we don't want to
allocate an extra large buffer.
No backport needed.
When a large buffer is used on a channel, once we've started to send data to
the opposite side, receives are blocked temporarily to be sure to flush the
large buffer ASAP to be able to fall back on regular buffers. This was
performed by skipping call to the endpoint (connection or applet). Howerver,
doing so, this broken the abortonclose and more generally this masked any
shut or error events reported by the lower layer.
To fix the issue, instead of skipping receives, we now try a receive but
with a requested size set to 0.
No backport needed
Client abort when abortonclose is configured was ignored when messges were
sent on event while it works properly when messages are sent via an
"send-spoe-group" action.
To fix the issue, when the SPOE filter is waiting for the SPOE applet
response, it must check if a client abort was reported and if so, must
interrupt its processing.
This patch should be backported as far as 3.1.
When the SPOE applet is created, the SPOE filter is set in SENDING_MSGS
state. When the applet has transferred data, it should switch the filter to
WAITING_ACK state. Concretly, there is no bug. At best, it could save some
useless applet wakeups.
This patch should be backported as far as 3.1
When SC's shudown callback functions were merged, a regression was
introduced. The applet was no longer woken up. Because of this bug, an
applet could remain blocked, waiting for an I/O event or a timeout.
This patch should fix the issue #3301.
No backport needed.
FDs received through recv_fd_uxst() do not have FD_CLOEXEC set.
The equivalent sock_accept_conn() already handles this correctly:
any FD accepted or received in the master must be marked close-on-exec
to avoid leaking it across the execvp() performed on soft-reload.
This is currently triggering a leak in the master since 3.1: the worker
sends a socketpair fd to the master to issue the _send_status CLI
command, and recv_fd_uxst() receive it without setting FD_CLOEXEC. If a
re-exec is emitted before the master had the chance to close that fd, it
survives execvp() and appears as an untracked unnamed AF_UNIX socket in
the new master generation.
This must be backported to all maintained branches.
Add a NULL guard for the version field. This has no functional impact
since the master process never uses this field for its own mworker_proc
element, and should be the only one impacted. This avoid seeing "(null)"
in the version field when debugging.
Must be backported to 3.1 and later.
During a soft reload, a starting worker sends sock_pair[0] to the master
via send_fd_uxst(), then reads on sock_pair[1] waiting for the master to
acknowledge receipt. Because of a documented macOS sendmsg(2) bug, the
worker must keep sock_pair[0] open until the master confirms the fd was
received by the CLI applet. This means the read() on sock_pair[1] will
never return 0 (EOF), since the worker itself still holds a reference to
sock_pair[0]. The worker can only unblock when the master actively sends
a byte back. If the master crashes before doing so, the worker blocks
indefinitely in read().
Fix this by setting a 2-second SO_RCVTIMEO on sock_pair[1] before the
read(), so the worker can unblock and continue regardless of the master's
state.
This was introduced by d7f6819161c ("BUG/MEDIUM: mworker: fix startup
and reload on macOS").
This should be backported to 3.1 and later.
In mworker_proc_list_to_env(), a typo used '&=' instead of '&' when
checking PROC_O_TYPE_WORKER in child->options. This would corrupt the
options field by clearing all bits except PROC_O_TYPE_WORKER, but since
the function is called right before the master re-execs itself during a
reload, the corruption has no actual effect: the in-memory proc_list is
discarded by the exec, and the options field is not serialized to the
environment anyway.
This should be backported to all maintained versions.
We defer processing of the "-dt" options until after the configuration
file has been read. This will be useful if we ever allow trace sources
to be registered later, for instance with LUA.
No backport needed.
In master-worker mode, when a freshly forked worker looks up its own
entry in proc_list to send its "READY" status to the master, the loop
was breaking on the first process with pid == -1 regardless of its
type. If a non-worker process (e.g. a master or program) also had
pid == -1, the wrong entry could be selected, causing send_fd_uxst()
to use an invalid ipc_fd.
Fix this by adding a PROC_O_TYPE_WORKER check to the loop condition,
and add a BUG_ON() assertion to catch any case where the loop exits
without finding a valid worker entry.
Must be backported to 3.1.
"show profiling" supports "aggr" for tasks but it was ignored for
memory. Now that we're having many more entries, it makes sense to
have it to ignore the call path and merge similar operations.
Keywords registered out of an initcall will have a TH_EX_CTX_CLI_KWL
execution context pointing to the keyword list. The report will indicate
the 5 first words of the first command of the list, e.g.:
exec_ctx: cli kwl starting with 'debug counters '
This should also work for CLI keywords registered in Lua.
Now CLI keywords registered via an initcall will be tracked during
execution, by keeping a link to their initcall location. "show threads"
now shows "exec_ctx: kw registered at @debug.c:3093" which indeed
corresponds to the initcall for the debugging commands.
Till now the CLI didn't know what keyword was being processed after it
was parsed. In order to report the execution context, we'll need to
store it. And this may even help for post-mortem analysis to know the
exact keyword being processed, so let's store the pointer in the cli_ctx
part of the appctx.
It allows to know when a thread is currnetly running inside an applet.
For example now "show threads" will show "applet '<CLI>'" for the thread
issuing this command.
It now appears almost everywhere due to callbacks (e.g. ssl_sock_io_cb).
Muxes also become visible now on memory profiling. A small test on h1+ssl
yields 838 lines of statistics. The number of buckets should definitely
be increased, and more grouping criteria should be added.
A performance test was conducted to observe the possible effect of
setting the execution context on each task switch, and it didn't change
at all, remaining at about 1.01 billion ctxsw/s on a 128-thread EPYC.
Most calls to mux ops were instrumented with a CALL_MUX_WITH_RET() or
CALL_MUX_NO_RET() macro in order to make the current thread's context
point to the called mux and be able to track its allocations. Only
a bunch of harmless mux_ctl() and ->subscribe/unsubscribe calls were
left untouched since useless. But destroy/detach/shut/init/snd_buf
and rcv_buf are now tracked.
It will not show allocations performed in IO callback via tasklet
wakeups however.
In order to ease reading of the output, cmp_memprof_ctx() knows about
muxes and sorts based on the .subscribe function address instead of
the mux_ops address so as to keep various callers grouped.
In order to be able to track memory allocation performed from message
callbacks, let's set the thread execution context to a generic function
pointing to them during their call. This allows for example to observe
the share of SSL allocations caused by ssl_sock_parse_clienthello() when
SSL captures are enabled.
The release calls are automatic from the SSL library for these, and are
registered directly via SSL_get_ex_new_index(). Maybe we should improve
the internal API to wrap that function and systematically track free
calls as well. In this case, maybe even registering the message callback
registration could take both the callback and the release function.
There are few such users however, essentially capture and keylog.
Doing this allows to report the allocations/releases performed by filters
when running with memory profiling enabled. The flt_conf pointer is kept
and the report shows the filter name.
A bit similar to what was done for sample fetch functions and converters,
we now store with each action keyword the location of the initcall when
they're registered this way. Since there are many functions only calling
a LIST_APPEND() (one per ruleset), we now implement a dedicated function
to store the context in all keywords before doing the append.
However that's not sufficient, because keywords are not mandatory for
actions, so we cannot safely rely on rule->kw. Thus we then set the
exec_ctx per rule when they are all scanned in check_action_rules(),
based on the keyword if it exists, otherwise we make a context from
the action_ptr function if it is set (it should).
Finally at all call points we now check rule->exec_ctx.
The purpose here is to be able to spot certain callbacks, such as the
SSL message callbacks, which are difficult to associate to anything.
Thus we introduce a new context type, TH_EX_CTX_FUNC, for which the
context is just the function pointed to by the void *pointer. One
difficulty with callbacks is that the allocation and release contexts
will likely be different, so the code should be properly structured
to allow proper tracking, either by instrumenting all calls, or by
making sure that the free calls are easy to spot in a report.
With the two new context types TH_EX_CTX_SMPF/CONV, we can now also
report contexts corresponding to direct calls to sample_register_fetches()
and sample_register_convs(). In this case, the first word of the keyword
list is reported.
Now keywords are registered with an exec_ctx and this one is passed
when calling ->process. The ctx is of type INITCALL when passed via
an initcall where we know the file name and line number.
This was tested with and extra "malloc(15)" added in smp_fetch_path()
which shows that it works:
$ socat /tmp/sock1 - <<< "show profiling memory"|grep via
Calls | Tot Bytes | Caller and method [via]
1893399 0 60592592 0| 0x78b2ec task_run_applet+0x3339c malloc(32) [via initcall @http_fetch.c:2416]
When the execution context is set to TH_EX_CTX_INITCALL, the pointer
points to a valid initcall, and the decoder will show "kw registered
at %s:%d" with file and line number of the initcall declaration. It's
up to the caller to make the initcall pointer point to the one that was
set during the initcall. The purpose here is to be able to preserve and
pass that knowledge of an initcall down the chain so that future calls
to functions registered via the initcall are still assigned to it.
The INITCALL macros will now store the file and line number where they
are declared into the initcall struct, and RUN_INITCALLS() will assign
them to the global caller_file and caller_line variables, and will even
set caller_initcall to the current initall so that at any instant such
functions know where their caller declared them. This will help with
error messages and traces where a bit of context will be welcome.
Now we have one extra line saying "exec_ctx: something" in thread dumps
when it's known. It may help with warnings and panics to figure what
is ongoing.
The new function chunk_append_thread_ctx() appends to a buffer the given
execution context based on its type and pointer. The goal is to easily
use it in profiling output and thread dumps. For now it only handles
TH_EX_CTX_NONE (which prints nothing) and TH_EX_CTX_OTHER (which indicates
"other ctx" followed by the pointer). It will be extended by new types as
they arrive.
By passing "byctx" to "show profiling memory", it's possible to sort by
the calling context first, which could help group certain calls by
subsystem and ease the interpretation of the output.
This now allows to report the same function in multiple bins based on the
th_ctx's exec_ctx discriminant. It's also worth noting that the context is
not atomically committed, but this shouldn't be a problem since a single
entry can get it. In the worst case, a second thread trying to create the
same context in parallel would create a different bin just for this call,
which is harmless. The same situation already exists with the caller
pointer.
We have the struct made of a type and a pointer in the th_ctx and a
function to switch it for the current thread. Two macros are provided
to enclose a callee within a temporary context. For now only type OTHER
is supported (only a generic pointer).
When two pointer hash to the same memprofile bin, we currently try again
with the same bin until we find a spare one or we reach the limit of 16.
Olivier suggested to try with a different step for different pointers so
as to limit the number of bins to visit in such a case, so let's split
the pointer hash calculation so that we keep the raw hash before reduction
and use its lowest bits as the retry step. We force lowest bit to 1 to
avoid integral multiples that would oscillate between only a few positions.
Quick tests with h1+h2 requests show that for ~744 distinct entries, we
used to have 1.17 retries per lookup before and 0.6 now so we're halving
the cost of hash collisions. A heavier workload that used to produce 920
entries with 2.01 retries per lookup now reaches 966 entries (94.3% usage
vs 89.8% before) with only 1.44 retries per lookup.
This should be safe to backport, but depends on this previous commit:
MINOR: tools: extend the pointer hashing code to ease manipulations
The purpose here is to combine two pointers and a long argument instead
of having the caller perform the mixing. Also it's cleaner and more
efficient this was as the arg is mixed after the multiplications, and
modern processors are efficient at multiplying then adding.
We'll need to further extend the pointer hashing code to pass extra
parameters and to retrieve the dropped bits, so let's first split the
part that hashes the pointer from the part that reduces the hash to
the desired size.
Historically, the data manipulated by "show profiling" were copied
onto the stack for sorting and aggregating, but not only this limits
the number of entries we can keep, but it also has an impact on CPU
usage (having to redo the whole copy+sort upon each resume) and the
output accuracy (if sorting changes lines, resume may happen from an
incorrect one).
Instead, let's dynamically allocate the work buffer and place it into
the service context. We only allocate it immediately before needing it
and release it immediately afterwards so that it doesn't stay long. It
also requires a release handler to release those allocates by interrupted
dumps, but that's all. The overall result is now much cleaner, more
accurate, faster and safer.
This patch may be backported to older LTS releases.
In check_config_validity() and proxy_finalize() we check the consistency
of all rule sets, but the quic_initial rules were not placed there. This
currently has little to no impact, however we're going to use that to
also finalize certain debugging info so better call the function. This
can be backported to 3.1 (proxy_finalize is 3.4-only).
In 3.1, per-DSO statistics were added to the memprofile output by
commit 401fb0e87a ("MINOR: activity/memprofile: show per-DSO stats").
However an strdup() is performed there on the .info field, that is
never freed when leaving the function. Let's do it each time we leave
it. Ironically, this was found thanks to "show profiling" showing
itself as an unbalanced caller of strdup().
This needs to be backported to 3.0 since that commit was backported
there.
In 3.3, the "make range" target adopted a test command via the TEST_CMD
variable, with commit 90b70b61b1 ("BUILD: makefile: implement support
for running a command in range"). However now it breaks the script when
TEST_CMD is not set due to the shell expansion leaving two '||' operators
side by side. Let's fix this by passing the contents of the makefile
variable in positional arguments before executing them.
To read early data with AWS-LC (and BoringSSL), we have to use
SSL_read(). But SSL_read() will also try to do the handshake if it
hasn't been done yet, and at some point will do the handshake and will
return data that are actually not early data. So use SSL_in_early_data()
to make sure that the data we received are actually early data, and only
if so add the CO_FL_EARLY_DATA flag. Otherwise any data first received will be
considered early, and a Early-data header will be added.
As this bug was introduced by 76ba026548975a6d1bc23d1344807c64d994bf1e,
it should be backported with it.
Upon _send_status, always stop the listener from which the request
was received, rather than looking it up from the proc_list entry via
fdtab[proc->ipc_fd[0]].owner.
A BUG_ON is added to verify that the listener which received the
request is the one expected for the reported PID.
This means it is no longer possible to send "_send_status READY XXX"
manually through the master CLI for testing, as that would trigger
the BUG_ON.
Must be backported as far as 3.1.
The API for early data is a bit different with BoringSSL and AWS-LC than
it is for OpenSSL. As it was implemented, early data would be accepted,
but would not be processed until the handshake is done. Change that by
doing something similar to what OpenSSL does, and, if 0RTT has been
enabled on the listener, use SSL_read() to try to get early data before
starting the handshake, and if there's any, provide them to the mux the
same way it is done for OpenSSL.
That replaces a bunch of #ifdef SSL_READ_EARLY_DATA_SUCCESS by
something specific to OpenSSL has to be done.
This should be backported to 3.3.
The name of "Global section" was changed only in the summary, not in the
text itself. The names of some related refs were also updated.
Should be backported as far as 3.2.
EVP_MD_CTX is allocated using EVP_MD_CTX_new() but was never freed.
ctx should be initialized to NULL otherwise EVP_MD_CTX_free(ctx) could
segfault.
Must be backported as far as 3.2.
With a config:
backend bk_app
http-check expect status 200 string "status: ok"
This now correctly emits the error:
config : parsing [./patch.cfg:2] : 'http-check expect' : only one pattern expected.
This line containing the typo is unchanged since at least HAProxy 2.2, the
patch should be backported into all supported branches.
Starting with OpenSSL 4.0, X509_get_subject_name(), X509_get_issuer_name(),
and X509_CRL_get_issuer() return a const-qualified X509_NAME pointer.
Similarly, X509_NAME_get_entry() returns a const X509_NAME_ENTRY *, and
X509_NAME_ENTRY_get_data() returns a const ASN1_STRING *.
Introduce the __X509_NAME_CONST__ macro (defined to 'const' for OpenSSL
>= 4.0.0, empty for WolfSSL and older OpenSSL version which lacks const
on these APIs) and use it to qualify X509_NAME * variables and the
parameters of the three DN helper functions ssl_sock_get_dn_entry(),
ssl_sock_get_dn_formatted(), and ssl_sock_get_dn_oneline(). This avoids
both const-qualifier warnings on OpenSSL 4.0 and discarded-qualifier
warnings on WolfSSL, without needing explicit casts at call sites.
In ssl_sock.c (ssl_get_client_ca_file) and ssl_gencert.c
(ssl_sock_do_create_cert), a __X509_NAME_CONST__ X509_NAME * variable was
being reused to store the result of X509_NAME_dup() and then passed to
mutating functions (X509_NAME_add_entry_by_txt, X509_NAME_free). Introduce
separate X509_NAME * variables (xn_dup, subject) to hold the mutable
duplicate.
Original patch from Alexandr Nedvedicky <sashan@openssl.org>:
https://www.mail-archive.com/haproxy@formilux.org/msg46696.html
In OpenSSL 4.0, the ASN1_STRING struct was made opaque and direct access
to its members (->data, ->length, ->type) no longer compiles. Replace
these accesses in ssl_sock_get_serial(), ssl_sock_get_time(), and
asn1_generalizedtime_to_epoch() with the proper accessor functions
ASN1_STRING_get0_data(), ASN1_STRING_length(), and ASN1_STRING_type().
The old direct access is preserved under USE_OPENSSL_WOLFSSL since
WolfSSL does not provide these accessor functions.
Original patch from Alexandr Nedvedicky <sashan@openssl.org>:
https://www.mail-archive.com/haproxy@formilux.org/msg46696.html
When a master process is reloading, the HAPROXY_PROCESSES variable is
deserialized. In older version of the master-worker (< 1.9), no master
element was existing in this variable.
This is not suppose to happen anymore, and could have provoked problems
in the master anyway.
This patch changes the behavior by exiting the master with an alert if
mp master element was found in this variable.
When no endpoint is attached to a SC, it is unexpected to have I/O (receive
or send). But we honestly don't know if it happens or not. So a CHECK_IF()
is added to be able to track such calls.
Calls to se_shutdown were no the same between applets and mux endpoints.
Only the SHUTW flag was not the same. However, on the multiplexers are
sensitive to the true SHUTW flag. The applets handle all of them the same
way. So calls to se_shutdown() from sc_abort() and sc_shutdown() can be
merged to always use the multiplexer version.
wake_srv_chk() function is now only used by srv_chk_io_cb(), the
health-checl I/O callback function. So let's remove it. The code of the
function was moved in srv_chk_io_cb().
At we fail to create a mux, in conn_create_mux(), instead of calling the
app_ops .wake() callback function, we can directly call sc_conn_process().
At this stage, we know we are using an connection, so it is safe to do so.
At the end of task_run_applet() and task_process_applet(), instead of
calling the app_ops .wake() callback function, we can directly call
sc_applet_process(). At this stage, we know we are using an applet, so it is
safe to do so.
Thanks to previous commits, it is now possible to wake the data layer up,
via a tasklet_wakeup, instead of using the app_ops .wake() callback
function.
When a data layer must be notified of a mux event (an error for instance),
we now always perform a tasklet_wakeup(). TASK_WOKEN_MSG state is used by
default. TASK_WOKEN_IO is eventually added if the data layer was subscribed
to receives or sends.
Changes are not trivial at all. We replaced a synchronous call to the
sc_conn_process() function by a tasklet_wakeup().
Now, when a mux stream is waking its data layer up for receives or sends, it
uses the TASK_WOKEN_IO state. The state is not used by the stconn I/O
callback function for now.
In spop_resume_each_sending_spop_strm(), there was exactly the same code
than spop_strm_notify_send(). So let's use spop_strm_notify_send() instead
of duplicating code.
It is the first commit of a series to refactor the SC app_ops. The first
step is to remove the .wake() callback function from the app_ops to replace
all uses by a wakeup of the SC tasklet.
Here, when the SC is woken up, the state is now tested and if TASK_WOKEN_MSG
is set, sc_conn_process() is called.
When ECDH-ES algorithm is used in a JWE token, no cek is provided and
one must be built in order to decrypt the contents of the token. The
decrypting key is built by deriving a temporary key out of a public key
provided in the token and the private key provided by the user and
performing a concatKDF operation.
When the encoding is of the ECDH family, the optional "apu" and "apv"
fields of the JOSE header must be parsed, as well as the mandatory "epk"
field that contains an EC public key used to derive a key that allows
either to decrypt the contents of the token (in case of ECDH-ES) or to
decrypt the content encoding key (cek) when using ECDH-ES+AES Key Wrap.
Convert a JWK with the "EC" key type ("kty") into an EVP_PKEY. The JWK
can either represent a public key if it only contains the "x" and "y"
fields, or a private key if it also contains the "d" field.
The 'jwt_tokenize' function that can be used to split a JWT token into
its subparts can either fully process the token (from beginning to end)
when we need to check its signature, or only partially when using the
jwt_header_query or jwt_member_query converters. In this case we relied
on the fact that the return value of the 'jwt_tokenize' function was not
checked because a '-1' was returned (which was not actually an error).
In order to make this logic more explicit, the 'jwt_tokenize' function
now has a way to warn the caller that the token was invalid (less
subparts than the specified 'item_num') or that the token was not
processed in full (enough subparts found without parsing the token all
the way).
The function will now only return 0 if we found strictly the same number
of subparts as 'item_num'.
The master process in the proc_list mustn't set the PROC_O_LEAVING flag
since the reload doesn't mean the master will leave.
Could be backported as far as 3.1.
mproxy_li is supposed to be used in _send_status to stop the sockpair FD
between the master and the new worker, being a listener.
This can only work if the listener has been stored in the fdtab owner,
and there's no reason it shouldn't be here.
It's always a bit tricky to avoid already backported patches when they
just got a different ID (e.g. a critical fix in a topic branch). Most
often with stable topic branches we just want to pick all stable commits
since the last backported one. New option -L instead of -m does exactly
this: it enumerates only commits that were added to the reference branch
after its most recent backport.
The -vv option used --verbose as its long form, which was identical to
the long form of -v. Since the case statement matches top-to-bottom,
--verbose would always trigger -v (VERBOSE=2), making -vv unreachable
via its long option. The long form is renamed to --verbose=all to avoid
the conflict, and the usage string is updated accordingly.
Must be backported to 3.3.
The script relied on a bash-specific process substitution (< <(...)) to
feed socat's output into the read loop. This is replaced with a standard
POSIX pipe into a command group.
The response parsing is also simplified: instead of iterating over each
line with a while loop and echoing them individually, the status line is
read first, the "--" separator consumed, and the remaining output is
streamed to stderr or discarded as a whole depending on the verbosity
level.
Could be backported to 3.3 as it makes it more portable, but introduce a
slight change in the error format.
socat was used with the ${MASTER_SOCKET} variable directly, letting it
auto-detect the network protocol. However, when given a plain filename
that does not point to a UNIX socket, socat would create a file at that
path instead of reporting an error.
To fix this, the address type is now determined explicitly: if
MASTER_SOCKET points to an existing UNIX socket file (checked with -S),
UNIX-CONNECT: is used; if it matches a <host>:<port> pattern, TCP: is
used; otherwise an error is reported. The socat_addr variable is also
properly scoped as local to the reload() function.
Could be backported in 3.3.
no need to have duplicated comp_ctx and comp_algo for request vs response
in comp_state struct, because thanks to previous commit compression filter
is either oriented on the request or the response, and 2 distinct filters
are instanciated when we need to handle both requests and responses
compression.
Thus we can save us from duplicated struct members and related operations.
Existing "compression" filter is a multi-purpose filter that will try
to compress both requests and responses according to "compression"
settings, such as "compression direction".
One of the pre-requisite work identified to implement decompression
filter is that we needed a way to manually define the sequence of
enabled filters to chain them in the proper order to make
compression and decompression chains work as expected in regard
to the intended use-case.
Due to the current nature of the "compression" filter this was not
possible, because the filter has a combined action as it will try
to compress both requests and responses, and as we are about to
implement "filter-sequence" directive, we will not be able to
change the order of execution of the compression filter between
requests and responses.
A possible solution we identified to solve this issue is to split the
existing "compression" filter into 2 distinct filters, one which is
request-oriented, "comp-req", and another one which is response-oriented
"comp-res". This is what we are doing in this commit. Compression logic
in itself is unchanged, "comp-req" will only aim to compress the request
while "comp-res" will try to compress the response. Both filters will
still be invoked on request and responses hooks, but they only do their
part of the job.
From now on, to compress both requests and responses, both filters have
to be enabled on the proxy. To preserve original behavior, the "compression"
filter is still supported, what it does is that it instantiates both
"comp-req" and "comp-res" filters implicitly, as the compression filter is
now effectively split into 2 separate filters under the hood.
When using "comp-res" and "comp-req" filters explicitly, the use of the
"compression direction" setting is not relevant anymore. Indeed, the
compression direction is assumed as soon as one or both filters are
enabled. Thus "compression direction" is kept as a legacy option in
order to configure the "compression" generic filter.
Documentation was updated.
proxy_get_comp() function can be used to retrieve proxy->comp options or
allocate and initialize it if missing
For now, it is solely used by parse_compression_options(), but the goal is
to be able to use this helper from multiple origins.
There was a "jwt_tokenize" call whose return value was not checked.
This was found by coverity and raised in GitHub #3277.
This patch can be backported to all stable branches.
In connect_server(), it is possible to have no server defined (dispatch mode
or transparent backend). In that case, we must be carefull to check the srv
variable in all calls involving the server. It was not perform at one place,
when the protocol to use for websocket is retrieved. This must not be done
when there is no server.
This patch should fix the first report in #3144. It must be backported to
all stable version.
In sample_conv_sha2(), calls to EVP_Digest* can fail. So we must check
return value of each call and report a error on failure and release the
digest context.
This patch should fix the issue #3274. It should be backported as far as
2.6.
Released version 3.4-dev6 with the following main changes :
- CLEANUP: acme: remove duplicate includes
- BUG/MINOR: proxy: detect strdup error on server auto SNI
- BUG/MINOR: server: set auto SNI for dynamic servers
- BUG/MINOR: server: enable no-check-sni-auto for dynamic servers
- MINOR: haterm: provide -b and -c options (RSA key size, ECDSA curves)
- MINOR: haterm: add long options for QUIC and TCP "bind" settings
- BUG/MINOR: haterm: missing allocation check in copy_argv()
- BUG/MINOR: quic: fix counters used on BE side
- MINOR: quic: add BUG_ON() on half_open_conn counter access from BE
- BUG/MINOR: quic/h3: display QUIC/H3 backend module on HTML stats
- BUG/MINOR: acme: acme_ctx_destroy() leaks auth->dns
- BUG/MINOR: acme: wrong labels logic always memprintf errmsg
- MINOR: ssl: clarify error reporting for unsupported keywords
- BUG/MINOR: acme: fix incorrect number of arguments allowed in config
- CLEANUP: haterm: remove unreachable labels hstream_add_data()
- CLEANUP: haterm: avoid static analyzer warnings about rand() use
- CLEANUP: ssl: Remove a useless variable from ssl_gen_x509()
- CI: use the latest docker for QUIC Interop
- CI: remove redundant "halog" compilation
- CLENAUP: cfgparse: accept-invalid-http-* does not support "no"/"defaults"
- BUG/MEDIUM: spoe: Acquire context buffer in applet before consuming a frame
- MINOR: traces: always mark trace_source as thread-aligned
- MINOR: ncbmbuf: improve itbmap_next() code
- MINOR: proxy: improve code when checking server name conflicts
- MINOR: quic: add a new metric for ncbuf failures
- BUG/MINOR: haterm: cannot reset default "haterm" mode
- BUG/MEDIUM: cpu-topo: Distribute CPUs fairly across groups
- BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions
- CLEANUP: ssl: remove outdated comments
- MINOR: mux-h2: also count glitches on invalid trailers
- MINOR: mux-h2: add a new setting, "tune.h2.log-errors" to tweak error logging
- BUG/MEDIUM: mux-h2: make sure to always report pending errors to the stream
- BUG/MINOR: server: adjust initialization order for dynamic servers
- CLEANUP: tree-wide: drop a few useless null-checks before free()
- CLEANUP: quic-stats: include counters from quic_stats
- REORG: stats/counters: move extra_counters to counters not stats
- CLEANUP: stats: drop stats.h / stats-t.h where not needed
- MEDIUM: counters: change the fill_stats() API to pass the module and extra_counters
- CLEANUP: counters: only retrieve zeroes for unallocated extra_counters
- MEDIUM: counters: add a dedicated storage for extra_counters in various structs
- MINOR: counters: store a tgroup step for extra_counters to access multiple tgroups
- MEDIUM: counters: store the number of thread groups accessing extra_counters
- MINOR: counters: add EXTRA_COUNTERS_BASE() to retrieve extra_counters base storage
- MEDIUM: counters: return aggregate extra counters in ->fill_stats()
- MEDIUM: counters: make EXTRA_COUNTERS_GET() consider tgid
- BUG/MINOR: call EXTRA_COUNTERS_FREE() before srv_free_params() in srv_drop()
- MINOR: promex: test applet resume in stress mode
- BUG/MINOR: promex: fix server iteration when last server is deleted
- BUG/MINOR: proxy: add dynamic backend into ID tree
- MINOR: proxy: convert proxy flags to uint
- MINOR: server: refactor srv_detach()
- MINOR: proxy: define a basic "del backend" CLI
- MINOR: proxy: define proxy watcher member
- MINOR: stats: protect proxy iteration via watcher
- MINOR: promex: use watcher to iterate over backend instances
- MINOR: lua: use watcher for proxies iterator
- MINOR: proxy: add refcount to proxies
- MINOR: proxy: rename default refcount to avoid confusion
- MINOR: server: take proxy refcount when deleting a server
- MINOR: lua: handle proxy refcount
- MINOR: proxy: prevent backend removal when unsupported
- MINOR: proxy: prevent deletion of backend referenced by config elements
- MINOR: proxy: prevent backend deletion if server still exists in it
- MINOR: server: mark backend removal as forbidden if QUIC was used
- MINOR: cli: implement wait on be-removable
- MINOR: proxy: add comment for defaults_px_ref/unref_all()
- MEDIUM: proxy: add lock for global accesses during proxy free
- MEDIUM: proxy: add lock for global accesses during default free
- MINOR: proxy: use atomic ops for default proxy refcount
- MEDIUM: proxy: implement backend deletion
- REGTESTS: add a test on "del backend"
- REGTESTS: complete "del backend" with unnamed defaults ref free
- BUG/MINOR: hlua: fix return with push nil on proxy check
- BUG/MEDIUM: stream: Handle TASK_WOKEN_RES as a stream event
- MINOR: quic: use signed char type for ALPN manipulation
- MINOR: quic/h3: reorganize stream reject after MUX closure
- MINOR: mux-quic: add function for ALPN to app-ops conversion
- MEDIUM: quic/mux-quic: adjust app-ops install
- MINOR: quic: use server cache for ALPN on BE side
- BUG/MEDIUM: hpack: correctly deal with too large decoded numbers
- BUG/MAJOR: qpack: unchecked length passed to huffman decoder
- BUG/MINOR: qpack: fix 1-byte OOB read in qpack_decode_fs_pfx()
- BUG/MINOR: quic: fix OOB read in preferred_address transport parameter
- BUG/MEDIUM: qpack: correctly deal with too large decoded numbers
- BUG/MINOR: hlua: Properly enable/disable line receives from HTTP applet
- BUG/MEDIUM: hlua: Fix end of request detection when retrieving payload
- BUG/MINOR: hlua: Properly enable/disable receives for TCP applets
- MINOR: htx: Add a function to retrieve the HTTP version from a start-line
- MINOR: h1-htx: Reports non-HTTP version via dedicated flags
- BUG/MINOR: h1-htx: Be sure that H1 response version starts by "HTTP/"
- MINOR: http-ana: Save the message version in the http_msg structure
- MEDIUM: http-fetch: Rework how HTTP message version is retrieved
- MEDIUM: http-ana: Use the version of the opposite side for internal messages
- DEBUG: stream: Display the currently running rule in stream dump
- MINOR: filters: Use filter API as far as poissible to break loops on filters
- MINOR: filters: Set last_entity when a filter fails on stream_start callback
- MINOR: stream: Display the currently running filter per channel in stream dump
- DOC: config: Use the right alias for %B
- BUG/MINOR: channel: Increase the stconn bytes_in value in channel_add_input()
- BUG/MINOR: sample: Fix sample to retrieve the number of bytes received and sent
- BUG/MINOR: http-ana: Increment scf bytes_out value if an haproxy error is sent
- BUG/MAJOR: fcgi: Fix param decoding by properly checking its size
- BUG/MAJOR: resolvers: Properly lowered the names found in DNS response
- BUG/MEDIUM: mux-fcgi: Use a safe loop to resume each stream eligible for sending
- MINOR: mux-fcgi: Use a dedicated function to resume streams eligible for sending
- CLEANUP: qpack: simplify length checks in qpack_decode_fs()
- MINOR: counters: Introduce COUNTERS_UPDATE_MAX()
- MINOR: listeners: Update the frequency counters separately when needed
- MINOR: proxies: Update beconn separately
- MINOR: stats: Add an option to disable the calculation of max counters
Add a new option, "stats calculate-max-counters [on|off]".
It makes it possible to disable the calculation of max counters, as they
can have a performance cost.
Update beconn separately from the call to COUNTERS_UPDATE_MAX(), as soon
there will be an option to get COUNTERS_UPDATE_MAX() to do nothing, and
we still want beconn to be properly updated, as it is used for other
purposes.
Update the frequency counters that are exported to the stats page
outside of the call to COUNTERS_UPDATE_MAX(), so that they will
happen even if COUNTERS_UPDATE_MAX() ends up doing nothing.
Introduce COUNTERS_UPDATE_MAX(), and use it instead of using
HA_ATOMIC_UPDATE_MAX() directly.
For now it just calls HA_ATOMIC_UPDATE_MAX(), but will later be modified
so that we can disable max calculation.
This can be backported up to 2.8 if the usage of COUNTERS_UPDATE_MAX()
generates too many conflicts.
This patch simplifies the decoding loop by merging the variable-length
integer truncation check (len == -1) with the subsequent buffer
availability check (len < length).
This removes redundant code blocks and improves readability without
changing the decoding logic.
Note that the second removal is correct, as the check was duplicate and
unnecessary."
At the end of fcgi_send(), if the connection is not full anymore, we loop on
the send list to resume FCGI stream for sending. But a streams may be
removed from the this list during the loop. So a safe loop must be used.
This patch should be backported to all stable versions.
Names found in DNS responses are lowered to be compared. A name is composed
of several labels, strings precedeed by their length on one byte. For
instance:
3www7haproxy3org
There is an bug when labels are lowered. The label length is not skipped and
tolower() function is called on it. So for label length in the range [65-90]
(uppercase char), 32 is added to the label length due to the conversion of a
uppercase char to lowercase. This bugs can lead to OOB read later in the
resolvers code.
The fix is quite obvious, the label length must be skipped when the label is
lowered.
Thank you to Kamil Frankowicz for having reported this.
This patch must be backported to all stable versions.
In functions used to decode a FCGI parameter, the test on the data length
before reading the parameter's name and value did not consider the offset
value used to skip already parsed data. So it was possible to read more data
than available (OOB read). To do so, a malicious FCGI server must send a
forged GET_VALUES_RESULT record containing a parameter with wrong name/value
length.
Thank you to Kamil Frankowicz for having reported this.
This patch must be backported to all stable versions.
There was an issue in the if/else statement in smp_fetch_bytes() function.
When req.bytes_in or req.bytes_out was requested, res.bytes_in was always
returned. It is now fixed.
This patch must be backported to 3.3.
This function is no longer used. So it is not really an bug. But it is still
available and could be used by legacy applets. In that case, we must take
care to increment the stconn bytes_in value accordingly when input data are
inserted.
This patch must be backported to 3.3.
In custom log format part, %[req.bytes_in] was erroneously documented as the
alias of %B. The good alias is %[res.bytes_in]. It is now fixed.
This patch must be backported to 3.3.
Since the 3.1, when stream's info are dump, it is possible to print the
yielding filter on each channel, if any. It was useful to detect buggy
filter on spinning loop. But it is not possible to detect a filter consuming
too much CPU per-execution. We can see a filter was executing in the
backtrace reported by the watchdog, but we are unable to spot the specific
one.
Thanks to this patch, it is now possible. When a dump is emitted, the
running or yield filters on each channel are now displayed with their
current state (RUNNING or YIELDING).
This patch could be backported as far as 3.2 because it could be useful to
spot issues. But the filter API was slightly refactored in 3.4, so this
patch should be adapted.
On the stream, the last_entity should reference the last rule or the last
filter evaluated during the stream processing. However, this info was not
saved when a filter failed on strem_start callback function. It is now
fixed.
This patch could be backported as far as 3.1.
When the filters API was refactored to improve loops on filters, some places
were not updated (or not fully updated). Some loops were not relying on
resume_filter_list_break() while it was possible. So let's do so with this
patch.
Since the 2.5, when stream's info are dump, it is possible to print the
yielding rule, if any. It was useful to detect buggy rules on spinning
loop. But it is not possible to detect a rule consuming too much CPU
per-execution. We can see a rule was executing in the backtrace reported by
the watchdog, but we are unable to spot the specific rule.
Thanks to this patch, it is now possible. When a dump is emitted, the
running or yield rule is now displayed with its current state (RUNNING or
YIELDING).
This patch could be backported as far as 3.2 because it could be useful to
spot issues.
When the response is send by HAProxy, from a applet or for the analyzers,
The request version is used for the response. The main reason is that there
is not real version for the response when this happens. "HTTP/1.1" is used,
but it is in fact just an HTX response. So the version of the request is
used.
In the same manner, when the request is sent from an applet (httpclient),
the response version is used, once available.
The purpose of this change is to return the most accurate version from the
user point of view.
Thanks to previous patches, we can now rely on the version stored in the
http_msg structure to get the request or the response version.
"req.ver" and "res.ver" sample fetch functions returns the string
representation of the version, without the prefix, so "<major>.<minor>", but
only if the version is valid. For the response, "res.ver" may be added from
a health-check context, in that case, the HTX message is used.
"capture.req.ver" and "capture.res.ver" does the same but the "HTTP/" prefix
is added to the result. And "capture.res.ver" cannot be called from a
health-check.
To ease the version formatting and avoid code duplication, an helper
function was added. So these samples are now relying on "get_msg_version()".
When the request or the response is received, the numerical value of the
message version is now saved. To do so, the field "vsn" was added in the
http_msg structure. It is an unsigned char. The 4 MSB bits are used for the
major digit and the 4 LSB bits for the minor one.
Of couse, the version must be valid. the HTX_SL_F_NOT_HTTP flag of the
start-line is used to be sure the version is valid. But because this flag is
quite new, we also take care the string representation of the version is 8
bytes length. 0 means the version is not valid.
When the response is parsed, we test the version to be sure it is
valid. However, the protocol was not tested. Now we take care that the
response version starts by "HTTP/", otherwise an error is returned.
Of course, it is still possible to by-pass this test with
"accept-unsafe-violations-in-http-response" option.
This patch could be backported to all stable versions.
Now, when the HTTP version format is not strictly valid, flags are set on
the h1 parser and the HTX start-line. H1_MF_NOT_HTTP is set on the H1 parser
and HTX_SL_F_NOT_HTTP is set on the HTX start-line. These flags were
introduced to avoid parsing again and again the version to know if it is a
valid version or not, escpecially because it is most of time valid.
htx_sl_vsn() function can now be used to retrieve the ist string
representing the HTTP version from a start-line passed as parameter. This
function takes care to return the right part of the start-line, depending on
its type (request or response).
From a lua TCP applet, in functions used to retrieve data (receive,
try_receive and getline), we must take care to disable receives when data
are returned (or on failure) and to restart receives when these functions
are called again. In addition, when an applet execution is finished, we must
restart receives to properly drain the request.
This patch should be backported to 3.3. On older version, no bug was
reported so we can wait a report first.
When the lua HTTP applet was refactored to use its own buffers, a bug was
introduced in receive() and getline() function. We rely on HTX_FL_EOM flag
to detect the end of the request. But there is nothing preventing extra
calls to these function, when the whole request was consumed. If this
happens, the call will yield waiting for more data with no way to stop it.
To fix the issue, APPLET_REQ_RECV flag was added to know the whole request
was received.
This patch should fix#3293. It must be backported to 3.3.
From a lua HTTP applet, in the getline() function, we must take care to
disable receives when a line is retrieved and to restart receives when the
function is called again. In addition, when an applet execution is finished,
we must restart receives to properly drain the request.
This patch could help to fix#3293. It must be backported to 3.3. On older
version, no bug was reported so we can wait a report first. But in that
case, hlua_applet_http_recv() should also be fixed (on 3.3 it was fixed
during the applets refactoring).
Same fix as this one for hpack:
7315428615 ("BUG/MEDIUM: hpack: correctly deal with too large decoded numbers")
Indeed, the encoding of integers for QPACK is the same as for HPACK but for 64 bits
integers.
Must be backported as far as 2.6.
This bug impacts only the QUIC backend. A QUIC server does receive
a server preferred address transport parameter.
In quic_transport_param_dec_pref_addr(), the boundary check for the
connection ID was inverted and incorrect. This could lead to an
out-of-bounds read during the following memcpy.
This patch fixes the comparison to ensure the buffer has enough input data
for both the CID and the mandatory Stateless Reset Token.
Thank you to Kamil Frankowicz for having reported this.
Must be backported to 3.3.
In qpack_decode_fs_pfx(), if the first qpack_get_varint() call
consumes the entire buffer, the code would perform a 1-byte
out-of-bounds read when accessing the sign bit via **raw.
This patch adds an explicit length check at the beginning of
qpack_get_varint(), which systematically secures all other callers
against empty inputs. It also adds a necessary check before the
second varint call in qpack_decode_fs_pfx() to ensure data is still
available before dereferencing the pointer to extract the sign bit,
returning QPACK_RET_TRUNCATED if the buffer is exhausted.
Thank you to Kamil Frankowicz for having reported this.
Must be backported as far as 2.6.
A call to huffman decoder function (huff_dec()) is made from qpack_decode_fs()
without checking the buffer length passed to this function, leading to OOB read
which can crash the process.
Thank you to Kamil Frankowicz for having reported this.
Must be backport as far as 2.6.
The varint hpack decoder supports unbounded numbers but returns 32-bit
results. This means that possible truncation my happen on some field
lengths or indexes that would be emitted as quantities that do not fit
in a 32-bit number. The final value will also depend on how the left
shift operation behaves on the target architecture (e.g. whether bits
are lost or used modulo 31). This could lead to a desynchronization of
the HPACK stream decoding compared to what an external observer would
see (e.g. from a network traffic capture). However, there isn't any
impact between streams, HPACK is performed at the connection level,
not at the stream level, so no stream may try to leverage this
limitation to have any effect on another one.
For the fix, instead of adding checks everywhere in the loop and for
the final stage, let's rewrite the decoder to compare the read value
to a max value that is shifted by 7 bits for every 7 bits read. This
allows a sender to continue to emit zeroes for higher bits without
being blocked, while detecting that a received value would overflow.
The loop is now simpler as it deals both with values with the higher
bit set and the final ones, and stops once the final value was recorded.
A test on non-zero before performing the shift was added to please
ubsan, though in practice zero shifted by any quantity remains zero.
But the test is cheap so that's OK.
Thanks to Guillaume Meunier, Head of Vulnerability Operations Center
France at Orange Cyberdefense, for reporting this bug.
This should be backported to all stable versions.
On the backend side, QUIC MUX may be started preemptively before the
ALPN negotiation. This is useful notably for 0-RTT implementation.
However, this was a source of crashes. ALPN was expected to be retrieved
from the server cache, however QUIC MUX still used the ALPN from the
transport layer. This could cause a crash, especially when several
connections runs in parallel as the server cache is shared among
threads.
Thanks to the previous patch which reworks QUIC MUX init, this solution
can now be fixed. Indeed, if conn_get_alpn() is not successful, MUX can
look at the server cache again to use the expected value.
Note that this could still prevent the MUX to work as expected if the
server cache is resetted between connect_server() and MUX init. Thus,
the ultimate solution would be to copy the cached ALPN into the
connection. This problem is not specific to QUIC though, and must be
fixed in a separate patch.
This patch reworks the installation of app-ops layer by QUIC MUX.
Previously, app_ops field was stored directly into the quic_conn
structure. Then the MUX reused it directly during its qmux_init().
This patch removes app_ops field from quic_conn and replaces it with a
copy of the negotiated ALPN. By using quic_alpn_to_app_ops(), it ensures
it remains compatible with a known application layer.
On the MUX layer, qcc_install_app_ops() now uses the standard
conn_get_alpn() to retrieve the ALPN from the transport layer. This is
done via the newly defined <get_alpn> QUIC xprt callback.
This new architecture should be cleaner as it better highlights the
responsibility of each layers in the ALPN/app negotiation.
Extract the conversion from ALPN to qcc_app_ops type from quic_conn
source file into QUIC MUX. The newly created function is named
quic_alpn_to_app_ops(). This will serve as a central point to identify
which ALPNs are currently supported in our QUIC stack.
This patch is purely a small refactoring. It will be useful for the next
one which rework MUX app-ops layer init. The current cleanup allows
notably to remove H3/hq-interop headers from quic_conn source file.
The QUIC MUX layer is closed after its transport counterpart. This may
be necessary then to reject any new streams opened by the remote peer.
This operation is dependent however from the application protocol.
Previously, a function qc_h3_request_reject() was directly implemented
in quic_conn source file for use when HTTP/3 was previously negotiated.
However, this solution was not evolutive and broke layering.
This patch introduces a new proper separation with a <strm_reject>
callback defined in quic_conn structure. When set, it will be used to
preemptively close any new stream. QUIC MUX is responsible to set it
just before its closure.
No functional change. This patch is purely a refactoring with a better
architecture design. Especially, H3 specific code from transport layer
is now completely removed.
In most of haproxy code, ALPN is used as a signed char pointer. In QUIC
code instead, it is manipulated as unsigned.
Unifies this by using signed type in QUIC code. This allows to remove a
bunch of unnecessary casts.
The conversion of TASK_WOKEN_RES to a stream event was missing. Among other
things, this wakeup reason is used when a stream is dequeued. So it was
possible to skip the connection establishment if the stream was also woken
up for a timer reason. When this happened, the stream was blocked till the
queue timeout expiration.
Converting TASK_WOKEN_RES to STRM_EVT_RES fixes the issue.
This patch should fix the issue #3290. It must be backported as far as 3.2.
hlua_check_proxy() may now return NULL if the target proxy instance has
been flagged for deletion. Thus, proxies method have been adjusted and
may push nil to report such case.
This patch fixes these error paths. When nil is pushed, 1 must be
returned instead of 0. This represents the count of pushed values on the
stack which can be retrieved by the caller.
No need to backport.
Complete delete backend regtests by checking deletion of a proxy with a
reference on an unnamed defaults instance. This operation is sensible as
the defaults refcount is decremented, and when the last backend is
removed, the defaults is also freed.
Add a reg-tests to test "del backend" CLI command. First, checks are
performed to ensure a backend cannot be deleted if not in the expected
state.
Then, a "del backend" success is tested. Stats are dumped to ensure the
backend instance is indeed removed.
This patch finalizes "del backend" handler by implementing the proper
proxy deletion.
After ensuring backend deletion can be performed, several steps are
executed. First, any watcher elements are updated to point on the next
proxy instance. The backend is then removed from ID and name global
trees and is finally detached from proxies_list.
Once the backend instance is removed from proxies_list, the backend
cannot be found by new elements. Thread isolation is lifted and
proxy_drop() is called, which will purge the proxy if its refcount is
null. Thanks to recently introduced PROXIES_DEL_LOCK, proxy_drop() is
thread safe.
Default proxy refcount <def_ref> is used to comptabilize reference on a
default proxy instance by standard proxies. Currently, this is necessary
when a default proxy defines TCP/HTTP rules or a tcpcheck ruleset.
Transform every access on <def_ref> so that atomic operations are now
used. Currently, this is not strictly needed as default proxies
references are only manipulated at init or deinit in single thread mode.
However, when dynamic backends deletion will be implemented, <def_ref>
will be decremented at runtime also.
This patch is similar to the previous one, but this time it deals with
functions related to defaults proxies instances. Lock PROXIES_DEL_LOCK
is used to protect accesses on global collections.
This patch will be necessary to implement dynamic backend deletion, even
if defaults won't be use as direct target of a "del backend" CLI.
However, a backend may have a reference on a default instance. When the
backend is freed, this references is released, which can in turn cause
the freeing of the default proxy instance. All of this will occur at
runtime, outside of thread isolation.
Define a new lock with label PROXIES_DEL_LOCK. Its purpose is to protect
operations performed on global lists or trees while a proxy is freed.
Currently, this lock is unneeded as proxies are only freed on
single-thread init or deinit. However, with the incoming dynamic backend
deletion, this operation will be also performed at runtime, outside of
thread isolation.
Implement be-removable argument to CLI wait. This is implemented via
be_check_for_deletion() invokation, also used by "del backend" handler.
The objective is to test whether a backend instance can be removed. If
this is not the case, the command may returns immediately if the target
proxy is incompatible with dynamic removal or if a user action is
required. Else, the command will wait until the temporary restriction is
lifted.
Currenly, quic_conn on the backend side may access their parent proxy
instance during their lifetime. In particular, this is the case for
counters update, with <prx_counters> field directly referencing a proxy
memory zone.
As such, this prevents safe backend removal. One solution would be to
check if the upper connection instance is still alive, as a proxy cannot
be removed if connection are still active. However, this would
completely prevent proxy counters update via
quic_conn_prx_cntrs_update(), as this is performed on quic_conn release.
Another solution would be to use refcount, or a dedicated counter on the
which account for QUIC connections on a backend instance. However,
refcount is currently only used by short-term references, and it could
also have a negative impact on performance.
Thus, the simplest solution for now is to disable a backend removal if a
QUIC server is/was used in it. This is considered acceptable for now as
QUIC on the backend side is experimental.
Ensure a backend instance cannot be removed if there is still server in
it. This is checked via be_check_for_deletion() to ensure "del backend"
cannot be executed. The only solution is to use "del server" to remove
on the servers instances.
This check only covers servers not yet targetted via "del server". For
deleted servers not yet purged (due to their refcount), the proxy
refcount is incremented but this does not block "del backend"
invokation.
Define a new proxy flag PR_FL_NON_PURGEABLE. This is used to mark every
proxy instance explicitely referenced in the config. Such instances
cannot be deleted at runtime.
Static use_backend/default_backend rules are handled in
proxy_finalize(). Also, sample expression proxy references are protected
via smp_resolve_args().
Note that this last case also incidentally protects any proxies
referenced via a CLI "set var" expression. This should not be the case
as in this case variable value is instantly resolved so the proxy
reference is not needed anymore. This also affects dynamic servers.
Prevent removal of a backend which relies on features not compatible
with dynamic backends. This is the case if either dispatch or
transparent option is used, or if a stick-table is declared.
These limitations are similar to the "add backend" ones.
Implement proxy refcount for Lua proxy class. This is similar to the
server class.
In summary, proxy_take() is used to increment refcount when a Lua proxy
is instantiated. proxy_drop() is called via Lua garbage collector. To
ensure a deleted backend is released asap, hlua_check_proxy() now
returns NULL if PR_FL_DELETED is set.
This approach is directly dependable on Lua GC execution. As such, it
probably suffers from the same limitations as the ones already described
in the previous commit. With the current patch, "del backend" is not
directly impacted though. However, the final proxy deinit may happen
after a long period of time, which could cause memory pressure increase.
One final observations regarding deinit : it is necessary to delay a
BUG_ON() which checks that defaults proxies list is empty. Now this must
be executed after Lua deinit (called via post_deinit_list). This should
guarantee that all proxies and their defaults refcount are null.
When a server is deleted via "del server", increment refcount of its
parent backend. This is necessary as the server is not referenced
anymore in the backend, but can still access it via its own <proxy>
member. Thus, backend removal must not happen until the complete purge
of the server.
The proxy refcount is released in srv_drop() if the flag SRV_F_DELETED
is set, which indicates that "del server" was used. This operation is
performed after the complete release of the server instance to ensure no
access will be performed on the proxy via itself. The refcount must not
be decremented if a server is freed without "del server" invokation.
Another solution could be for servers to always increment the refcount.
However, for now in haproxy refcount usage is limited, so the current
approach is preferred. It should also ensure that if the refcount is
still incremented, it may indicate that some servers are not completely
purged themselves.
Note that this patch may cause issues if "del backend" are used in
parallel with LUA scripts referencing servers. Currently, any servers
referenced by LUA must be released by its garbage collector to ensure it
can be finally freed. However, it appeas that in some case the gc does
not run for several minutes. At least this has been observed with Lua
version 5.4.8. In the end, this will result in indefinitely blocking of
"del backend" commands.
Rename proxy conf <refcount> to <def_ref>. This field only serves for
defaults proxy instances. The objective is to avoid confusion with the
newly introduced <refcount> field used for dynamic backends.
As an optimization, it could be possible to remove <def_ref> and only
use <refcount> also for defaults proxies usage. However for now the
simplest solution is implemented.
This patch does not bring any functional change.
Implement refcount notion into proxy structure. The objective is to be
able to increment refcount on proxy to prevent its deletion temporarily.
This is similar to the server refcount : "del backend" is not blocked
and will remove the targetted instance from the global proxies_list.
However, the final free operation is delayed until the refcount is null.
As stated above, the API is similar to servers. Proxies are initialized
with a refcount of 1. Refcount can be incremented via proxy_take(). When
no longer useful, refcount is decremented via proxy_drop() which
replaces the older free_proxy(). Deinit is only performed once refcount
is null.
This commit also defines flag PR_FL_DELETED. It is set when a proxy
instance has been removed via a "del backend" CLI command. This should
serve as indication to modules which may still have a refcount on the
target proxy so that they can release it as soon as possible.
Note that this new refcount is completely ignored for a default proxy
instance. For them, proxy_take() is pure noop. Free is immediately
performed on first proxy_drop() invokation.
Ensures proxies iteration in promex applet is safe via a new watcher
member. The principle is similar to the one already used for servers
iteration.
Note that ctx.p[0] is not updated anymore at the end of a function, as
this is automatically done via the watcher itself.
Define a new <px_watch> watcher member in stats applet context. It is
used to register the applet on a proxy when iterating over the proxies
list. <obj1> is automatically updated via the watcher interaction.
Watcher is first initialized prior to stats_dump_proxies() invocation.
This guarantees that stats dump is safe even if applet yields and a
backend is removed in parallel.
Define a new member watcher_list in proxy. It will be used to register
modules which iterate over the proxies list. This will ensure that the
operation is safe even if a backend is removed in parallel.
Add "del backend" handler which is restricted to admin level. Along with
it, a new function be_check_for_deletion() is used to test if the
backend is removable.
Correct documentation for srv_detach() which previously stated that this
function could be called for a server even if not stored in its proxy
list. In fact there is a BUG_ON() which detects this case.
Proxy flags member were of type char. This will soon enough not be
sufficient as new flags will be defined. As such, convert flags member
to unsigned int type.
Add missing proxy_index_id() call in "add backend" handler. This step is
responsible to store the newly created proxy instance in the
used_proxy_id global tree.
No need to backport.
Servers iteration via promex is now resilient to server runtime deletion
thanks to the watcher mechanism. However, the watcher was not correctly
initialized which could cause duplicate metrics reporting.
This issue happens when promex dump yielded when manipulating the last
server of a proxy. If this server is removed in parallel, <sv> pointer
will be set to NULL when promex resumes. Instead of switching to another
proxy, the code would reuse the same one and iterate again on the same
server list.
To fix this issue, <sv> pointer must not be reinitialized just after a
resumption point. Instead, this is now performed before
promex_dump_srv_metrics(), or just after switching to another proxy
instance. Thus, on resumption, if promex_dump_srv_metrics() is started
with <sv> as NULL, it means that the server was deleted and the end of
the current proxy list is reached, hence iteration is restarted on the
next proxy instance.
Note that ctx.p[1] does not need to be manually updated at the end of
promex_dump_srv_metrics() as srv_watch already does that.
This patch must be backported up to 3.0.
Implement a stress mode with force yield for promex applet each time a
metric is displayed. This is implemented by returning 0 in
promex_dump_ts() each time the output buffer is not empty.
To test this, haproxy must be compiled with DEBUG_STRESS and the
following configuration must be used :
global
stress-level 1
As seen with the last changes to counters allocation, the move of the
counters storage to the thread group as operated in commit 04a9f86a85
("MEDIUM: counters: add a dedicated storage for extra_counters in various
structs") causes some random errors when using ASAN, because the extra
counters are freed in srv_drop() after calling srv_free_params(), which
is responsible for freeing the per-thread group storage.
For the proxies however it's OK because free calls are made before the
call to deinit_proxy() which frees the per_tgrp area.
No backport is needed, this is purely 3.4-dev.
Now we store and retrieve only counters for the current tgid when more
than one is supported. This allows to significantly reduce contention
on shared stats. The haterm utility saw its performance increase from
4.9 to 5.8M req/s in H1, and 6.0 to 7.6M for H2, both with 5 groups of
16 threads, showing that we don't necessarily need insane amounts of
groups.
Now thanks to new macro EXTRA_COUNTERS_AGGR() we can iterate over all
thread groups storages when returning the data for a given metric. This
remains convenient and mostly transparent. The caller continues to pass
the pointer to the metric in the first group, and offsets are calculated
for all other groups and data summed. For now all groups except the
first one contain only zeroes but reported values are nevertheless
correct.
The goal is to always retrieve the storage address of the first thread
group for the given module. This will be used to iterate over all thread
groups. For now it returns the same value as EXTRA_COUNTERS_GET().
In order to be able to properly allocate all storage and retrieve data
from there, we'll need to know how many thread groups are supposed to
access it. Let's store the number of thread groups at init time. If the
tgrp_step is zero, there's always only one tg though.
Now EXTRA_COUNTERS_ALLOC() takes this number of thread groups in argument
and stores it in the structure. It also allocates as many areas as needed,
incrementing the datap pointer by the step for each of them.
EXTRA_COUNTERS_FREE() uses this info to free all allocated areas.
EXTRA_COUNTERS_INIT() initializes all allocated areas, this is used
elsewhere to clear/preset counters, e.g. in proxy_stats_clear_counters().
It involves a memcpy() call for each array, which is normally preset to
something empty but might also be used to preset certain non-scalar
fields such as an instance name.
We'll need to permit any user to update its own tgroup's extra counters
instead of the global ones. For this we now store the per-tgroup step
between two consecutive data storages, for when they're stored in a
tgroup array. When shared (e.g. resolvers or listeners), we just store
zero to indicate that it doesn't scale with tgroups. For now only the
registration was handled, it's not used yet.
Servers, proxies, listeners and resolvers all use extra_counters. We'll
need to move the storage to per-tgroup for those where it matters. Now
we're relying on an external storage, and the data member of the struct
was replaced with a pointer to that pointer to data called datap. When
the counters are registered, these datap are set to point to relevant
locations. In the case of proxies and servers, it points to the first
tgrp's storage. For listeners and resolvers, it points to a local
storage. The rationale here is that listeners are limited to a single
group anyway, and that resolvers have a low enough load so that we do
not care about contention there.
Nothing should change for the user at this point.
Since version 2.4 with commit 7f8f6cb926 ("BUG/MEDIUM: stats: prevent
crash if counters not alloc with dummy one") we can afford to always
update extra_counters because we know they're always either allocated
or linked to a dedicated trash. However, the ->fill_stats() callbacks
continue to access such values, making it technically possible to
retrieve random counters from this trash, which is not really clean.
Let's implement an explicit test in the ->fill_stats() functions to
only return 0 for the metric when not allocated like this. It's much
cleaner because it guarantees that we're returning an empty counter
in this case rather than random values.
The situation currently happens for dummy servers like the ones used
in Lua proxies as well as those used by rings (e.g. used for logging
or traces). Normally, none of the objects retrieved via stats or
Prometheus is concerned by this unallocated extra_counters situation,
so this is more about a cleanup than a real fix.
We'll soon need to iterate over thread groups in the fill_stats() functions,
so let's first pass the extra_counters and stats_module pointers to the
fill_stats functions. They now call EXTRA_COUNTERS_GET() themselves with
these elements in order to retrieve the required pointer. Nothing else
changed, and it's getting even a bit more transparent for callers.
This doesn't change anything visible however.
A number of C files include stats.h or stats-t.h, many of which were
just to access the counters. Now those which really need counters rely
on counters.h or counters-t.h, which already reduces the amount of
preprocessed code to be built (~3000 lines or about 0.05%).
It was always difficult to find extra_counters when the rest of the
counters are now in counters-t.h. Let's move the types to counters-t.h
and the macros to counters.h. Stats include them since they're used
there. But some users could be cleaned from the stats definitions now.
There's something a bit awkward in the way stats counters are inherited
through the QUIC modules: quic_conn-t includes quic_stats-t.h, which
declares quic_stats_module as extern from a type that's not known from
this file. And anyway externs should not be exported from type defintions
since they're not part of the ABI itself.
This commit moves the declaration to quic_stats.h which now takes care
to include stats-t.h to get the definition of struct stats_module. The
few users who used to learn it through quic_conn-t.h now include it
explicitly. As a bonus this reduces the number of preprocessed lines
by 5000 (~0.1%).
By the way, it looks like struct stats_module could benefit from being
moved off stats-t.h since it's only used at places where the rest of
the stats is not needed. Maybe something to consider for a future
cleanup.
We only support platforms where free(NULL) is a NOP so that
null checks are useless before free(). Let's drop them to keep
the code clean. There were a few in cfgparse-global, flt_trace,
ssl_sock and stats.
It appears that in cli_parse_add_server(), we're calling srv_alloc_lb()
and stats_allocate_proxy_counters_internal() before srv_preinit() which
allocates the thread groups. LB algos can make use of the per_tgrp part
which is initialized by srv_preinit(). Fortunately for now no algo uses
both tgrp and ->server_init() so this explains why this remained
unnoticed to date. Also, extra counters will soon require per_tgrp to
already be initialized. So let's move these between srv_preinit() and
srv_postinit(). It's possible that other parts will have to be moved
in between.
This could be backported to recent versions for the sake of safety but
it looks like the current code cannot tell the difference.
Some stream parsing errors that do not affect the connection result in
the parsed block not being transferred from the rx buffer to the channel
and not being reported upstream in rcv_buf(), causing the stconn to time
out. Let's detect this condition, and propagate term flags anyway since
no more progress will be made otherwise.
This should be backported at least till 3.2, probably even 2.8.
The H2 mux currently logs whenever some decoding fails. Most of the errors
happen at the connection level, but some are even at the stream level,
meaning that multiple logs can be emitted for a given connection, which
can quickly use some resource for little value. This new setting allows
to tweak this and decide to only log errors that affect the connection,
or even none at all.
This should be backported at least as far as 3.2.
Two cases were not causing glitches to be incremented:
- invalid trailers
- trailers on closed streams
This patch addresses this. It could be backported, at least to 3.2.
ssl_sock_srv_try_reuse_sess() was modified by this commit to no longer
fail (it now returns void), but the related comments remained:
BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions
This patch cleans them up.
The QUIC mux requires "application operations" (app ops), which are a list
of callbacks associated with the application level (i.e., h3, h0.9) and
derived from the ALPN. For 0-RTT, when the session cache cannot be reused
before activation, the current code fails to reach the initialization of
these app ops, causing the mux to crash during its initialization.
To fix this, this patch restores the behavior of
ssl_sock_srv_try_reuse_sess(), whose purpose was to reuse sessions stored
in the session cache regardless of whether 0-RTT was enabled, prior to
this commit:
MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend
sessions (0-RTT)
With this patch, this function now does only one thing: attempt to reuse a
session, and that's it!
This patch allows ignoring whether a session was successfully reused from
the cache or not. This directly fixes the issue where app ops
initialization was skipped upon a session cache reuse failure. From a
functional standpoint, starting a mux without reusing the session cache
has no negative impact; the mux will start, but with no early data to
send.
Finally, there is the case where the ALPN is reset when the backend is
stopped. It is critical to continue locking read access to the ALPN to
secure shared access, which this patch does. It is indeed possible for the
server to be stopped between the call to connect_server() and
quic_reuse_srv_params(). But this cannot prevent the mux to start
without app ops. This is why a 'TODO' section was added, as a reminder that a
race condition regarding the ALPN reset still needs to be fixed.
Must be backported to 3.3
Make sure CPUs are distributed fairly across groups, in case the number
of groups to generate is not a divider of the number of CPUs, otherwise
we may end up with a few groups that will have no CPU bound to them.
This was introduced in 3.4-dev2 with commit 56fd0c1a5c ("MEDIUM: cpu-topo:
Add an optional directive for per-group affinity"). No backport is
needed unless this commit is backported.
When "mode haterm" was set in a "defaults" section, it could not be
overridden in subsequent sections using the "mode" keyword. This is because
the proxy stream instantiation callback was not being reset to the
default stream_new() value.
This could break the stats URI with a configuration such as:
defaults
mode haterm
# ...
frontend stats
bind :8181
mode http
stats uri /
This patch ensures the ->stream_new_from_sc() proxy callback is reset
to stream_new() when the "mode" keyword is parsed for any mode other
than "haterm".
No need to backport.
During proxy_finalize(), a lookup is performed over the servers by name
tree to detect any collision. Only the first conflict for each server
instance is reported to avoid a combinatory explosion with too many
alerts shown.
Previously, this was written using a for loop without any iteration.
Replace this by a simple if statement as this is cleaner.
This should fix github issue #3276.
itbmap_next() advances an iterator over a ncbmbuf buffer storage. When
reaching the end of the buffer, <b> field is set to NULL, and the caller
is expected to stop working with the iterator.
Complete this part to ensure that itbmap type is fully initialized in
case null iterator value is returned. This is not strictly required
given the above description, but this is better to avoid any possible
future mistake.
This should fix coverity issue from github #3273.
This could be backported up to 2.8.
Some perf profiles occasionally show that reading the trace source's
state can take some time, which is not expected at all. It just happens
that the trace_source is not cache-aligned so depending on linkage, it
may share a cache line with a more active variable, thereby inducing a
slow down to all threads trying to read the variable.
Let's always mark it aligned to avoid this. For now the problem was not
observed again.
Changes brought to support large buffers revealed a bug in the SPOE applet
when a frame is copied in the SPOE context buffer. A b_xfer() was performed
without allocating the SPOE context buffer. It is not expected. As stated in
the function documentation, the caller is responsible for ensuring there is
enough space in the destination buffer. So first of all, it must ensure this
buffer was allocated.
With recent changes, we are able to hit a BUG_ON() because the swap is no
longer possible if source and destination buffers size are not the same.
This patch should fix the issue #3286. It could be backported as far as 3.1.
Some options do not support "no" nor "defaults" and they're placed after
the check for their absence. However, "accept-invalid-http-request" and
"accept-invalid-http-response" still used to check for the flags that
come with these prefixes, but Coverity noticed this was dead code in
github issue #3272. Let's just drop the test.
No backport needed as it's just dead code.
This function was recently created by moving code from acme_gen_tmp_x509()
(in acme.c) to ssl_gencrt.c (ssl_gen_x509()). The <ctmp> variable was
initialized and then freed without ever being used. This was already the
case in the original acme_gen_tmp_x509() function.
This patch removes these useless statements.
Reported in GH #3284
Avoid such a warnings from coverity:
CID 1645121: (#1 of 1): Calling risky function (DC.WEAK_CRYPTO)
dont_call: random should not be used for security-related applications,
because linear congruential algorithms are too easy to break.
Reported in GH #3283 and #3285
This patch changes the registration of the following keywords to be
unconditional:
- ssl-dh-param-file
- ssl-engine
- ssl-propquery, ssl-provider, ssl-provider-path
- ssl-default-bind-curves, ssl-default-server-curves
- ssl-default-bind-sigalgs, ssl-default-server-sigalgs
- ssl-default-bind-client-sigalgs, ssl-default-server-client-sigalgs
Instead of excluding them at compile time via #ifdef guards in the keyword
registration table, their parsing functions now check feature availability
at runtime and return a descriptive error when the feature is missing.
For features controlled by the SSL library (providers, curves, sigalgs,
DH), the error message includes the actual OpenSSL version string via
OpenSSL_version(OPENSSL_VERSION), so users can immediately identify which
library they are running rather than seeing cryptic internal macro names.
For ssl-dh-param-file, the message also includes "(no DH support)" as a
hint, since OPENSSL_NO_DH can be set either by an OpenSSL build or by
HAProxy itself in certain configurations.
For ssl-engine, which depends on a HAProxy build-time flag (USE_ENGINE),
the message retains the flag name as it is more actionable for the user.
This addresses issue https://github.com/haproxy/haproxy/issues/3246.
In acme_req_finalize(), acme_req_challenge(), acme_req_neworder(),
acme_req_account(), and acme_post_as_get(), the success path always
calls unconditionally memprintf(errmsg, ...).
This may result in a leak of errmsg.
Additionally, acme_res_chkorder(), acme_res_finalize(), acme_res_auth(),
and acme_res_neworder() had unused 'out:' labels that were removed.
Must be backported as far as 3.2.
365a696 ("MINOR: acme: emit a log for DNS-01 challenge response")
introduces the auth->dns member which is istdup(). But this member is
never free, instead auth->token was freed twice by mistake.
Must be backported to 3.2.
QUIC is now implemented on the backend side. Complete definitions for
QUIC/H3 stats module to add STATS_PX_CAP_BE capability.
This change is necessary to display QUIC/H3 counters on backend lines
for HTML stats page.
This should be backported up to 3.3.
half_open_conn is a proxy counter used to account for quic_conn in
half-open state : this represents a connection whose address is not yet
validated (handshake successful, or via token validation).
This counter only has sense for the frontend side. Currently, code is
safe as access is only performed if quic_conn is not yet flagged with
QUIC_FL_CONN_PEER_VALIDATED_ADDR, which is always set for backend
connections.
To better reflect this, add a BUG_ON() when half_open_conn is
incremented/decremented to ensure this never occurs for backend
connections.
quic_conn is initialized with a pointer to its proxy counters. These
counters are then updated during the connection lifetime.
Counters pointer was incorrect for backend quic_conn, as it always
referenced frontend counters. For pure backend, no stats would be
updated. For listen instances, this resulted in incorrect stats
reporting.
Fix this by correctly set proxy counters based on the connection side.
This must be backported up to 3.3.
This is a very minor bug with a very low probability of occurring.
However, it could be flagged by a static analyzer or result in a small
contribution, which is always time-consuming for very little gain.
Add the --quic-bind-opts and --tcp-bind-opts long options to append
settings to all QUIC and TCP bind lines. This requires modifying the argv
parser to first process these new options, ensuring they are available
during the second argv pass to be added to each relevant "bind" line.
Add -b and -c options to the haterm argv parser. Use -b to specify the RSA
private key size (in bits) and -c to define the ECDSA certificate curves.
These self-signed certificates are required for haterm SSL bindings.
Allows server keyword "no-check-sni-auto" for dynamic servers. This may
be necessary to users who do not want to benefit from auto SNI for
checks.
Keyword "check-sni-auto" is still deactivated for dynamic servers, for
the same reason as "sni-auto" (cf the previous patch for a complete
explanation).
This must be backported up to 3.3.
Auto SNI configuration is configured during check config validity.
However, nothing was implemented for dynamic servers.
Fix this by implementing auto SNI configuration during "add server" CLI
handler. Auto SNI configuration code is moved in a dedicated function
srv_configure_auto_sni() called both for static and dynamic servers.
Along with this, allows the keyword "no-sni-auto" on dynamic servers, so
that this process can be deactivated if wanted. Note that "sni-auto"
remains unavailable as it only makes sense with default-servers which
are never used for dynamic server creation.
This must be backported up to 3.3.
There was no check on the result of strdup() used to setup auto SNI on a
server instance during check config validity. In case of failure, the
error would be silently ignored as the following server_parse_exprs()
does nothing when <sni_expr> server field is NULL. Hence, no SNI would
be used on the server, without any error nor warning reported.
Fix this by adding a check on strdup() return value. On error, ERR_ABORT
is reported along with an alert, parsing should be interrupted as soon
as possible.
This must be backported up to 3.3. Note that the related code in this
case is present in cfgparse.c source file.
Released version 3.4-dev5 with the following main changes :
- DOC: internals: addd mworker V3 internals
- BUG/MINOR: threads: Initialize maxthrpertgroup earlier.
- BUG/MEDIUM: threads: Differ checking the max threads per group number
- BUG/MINOR: startup: fix allocation error message of progname string
- BUG/MINOR: startup: handle a possible strdup() failure
- MINOR: cfgparse: validate defaults proxies separately
- MINOR: cfgparse: move proxy post-init in a dedicated function
- MINOR: proxy: refactor proxy inheritance of a defaults section
- MINOR: proxy: refactor mode parsing
- MINOR: backend: add function to check support for dynamic servers
- MINOR: proxy: define "add backend" handler
- MINOR: proxy: parse mode on dynamic backend creation
- MINOR: proxy: parse guid on dynamic backend creation
- MINOR: proxy: check default proxy compatibility on "add backend"
- MEDIUM: proxy: implement dynamic backend creation
- MINOR: proxy: assign dynamic proxy ID
- REGTESTS: add dynamic backend creation test
- BUG/MINOR: proxy: fix clang build error on "add backend" handler
- BUG/MINOR: proxy: fix null dereference in "add backend" handler
- MINOR: net_helper: extend the ip.fp output with an option presence mask
- BUG/MINOR: proxy: fix default ALPN bind settings
- CLEANUP: lb-chash: free lb_nodes from chash's deinit(), not global
- BUG/MEDIUM: lb-chash: always properly initialize lb_nodes with dynamic servers
- CLEANUP: haproxy: fix bad line wrapping in run_poll_loop()
- MINOR: activity: support setting/clearing lock/memory watching for task profiling
- MEDIUM: activity: apply and use new finegrained task profiling settings
- MINOR: activity: allow to switch per-task lock/memory profiling at runtime
- MINOR: startup: Add the SSL lib verify directory in haproxy -vv
- BUG/MINOR: ssl: SSL_CERT_DIR environment variable doesn't affect haproxy
- CLEANUP: initcall: adjust comments to INITCALL{0,1} macros
- DOC: proxy-proto: underline the packed attribute for struct pp2_tlv_ssl
- MINOR: queues: Check minconn first in srv_dynamic_maxconn()
- MINOR: servers: Call process_srv_queue() without lock when possible
- BUG/MINOR: quic: ensure handshake speed up is only run once per conn
- BUG/MAJOR: quic: reject invalid token
- BUG/MAJOR: quic: fix parsing frame type
- MINOR: ssl: Missing '\n' in error message
- MINOR: jwt: Convert an RSA JWK into an EVP_PKEY
- MINOR: jwt: Add new jwt_decrypt_jwk converter
- REGTESTS: jwt: Add new "jwt_decrypt_jwk" tests
- MINOR: startup: Add HAVE_WORKING_TCP_MD5SIG in haproxy -vv
- MINOR: startup: sort the feature list in haproxy -vv
- MINOR: startup: show the list of detected features at runtime with haproxy -vv
- SCRIPTS: build-vtest: allow to set a TMPDIR and a DESTDIR
- MINOR: filters: rework RESUME_FILTER_* macros as inline functions
- MINOR: filters: rework filter iteration for channel related callback functions
- MEDIUM: filters: use per-channel filter list when relevant
- DEV: gdb: add a utility to find the post-mortem address from a core
- BUG/MINOR: deviceatlas: add missing return on error in config parsers
- BUG/MINOR: deviceatlas: add NULL checks on strdup() results in config parsers
- BUG/MEDIUM: deviceatlas: fix resource leaks on init error paths
- BUG/MINOR: deviceatlas: fix off-by-one in da_haproxy_conv()
- BUG/MINOR: deviceatlas: fix cookie vlen using wrong length after extraction
- BUG/MINOR: deviceatlas: fix double-checked locking race in checkinst
- BUG/MINOR: deviceatlas: fix resource leak on hot-reload compile failure
- BUG/MINOR: deviceatlas: fix deinit to only finalize when initialized
- BUG/MINOR: deviceatlas: set cache_size on hot-reloaded atlas instance
- MINOR: deviceatlas: check getproptype return and remove pprop indirection
- MINOR: deviceatlas: increase DA_MAX_HEADERS and header buffer sizes
- MINOR: deviceatlas: define header_evidence_entry in dummy library header
- MINOR: deviceatlas: precompute maxhdrlen to skip oversized headers early
- CLEANUP: deviceatlas: add unlikely hints and minor code tidying
- DEV: gdb: use unsigned longs to display pools memory usage
- BUG/MINOR: ssl: lack crtlist_dup_ssl_conf() declaration
- BUG/MINOR: ssl: double-free on error path w/ ssl-f-use parser
- BUG/MINOR: ssl: fix leak in ssl-f-use parser upon error
- BUG/MINOR: ssl: clarify ssl-f-use errors in post-section parsing
- BUG/MINOR: ssl: error with ssl-f-use when no "crt"
- MEDIUM: backend: make "balance random" consider tg local req rate when loads are equal
- BUG/MAJOR: Revert "MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS"
- BUG/MEDIUM: h3: reject frontend CONNECT as currently not implemented
- MINOR: mux-quic: add BUG_ON_STRESS() when draining data on closed stream
- REGTESTS: fix quoting in feature cmd which prevents test execution
- BUG/MEDIUM: mux-h2/quic: Stop sending via fast-forward if stream is closed
- BUG/MEDIUM: mux-h1: Stop sending vi fast-forward for unexpected states
- BUG/MEDIUM: applet: Fix test on shut flags for legacy applets (v2)
- DEV: term-events: Fix hanshake events decoding
- BUG/MINOR: flt-trace: Properly compute length of the first DATA block
- MINOR: flt-trace: Add an option to limit the amount of data forwarded
- CLEANUP: compression: Remove unused static buffers
- BUG/MEDIUM: shctx: Use the next block when data exactly filled a block
- BUG/MINOR: http-ana: Stop to wait for body on client error/abort
- MINOR: stconn: Add missing SC_FL_NO_FASTFWD flag in sc_show_flags
- REORG: stconn: Move functions related to channel buffers to sc_strm.h
- BUG/MEDIUM: jwe: fix timing side-channel and dead code in JWE decryption
- MINOR: tree-wide: Use the buffer size instead of global setting when possible
- MINOR: buffers: Swap buffers of same size only
- BUG/MINOR: config: Check buffer pool creation for failures
- MEDIUM: cache: Don't rely on a chunk to store messages payload
- MEDIUM: stream: Limit number of synchronous send per stream wakeup
- MEDIUM: compression: Be sure to never compress more than a chunk at once
- MEDIUM: mux-h1/mux-h2/mux-fcgi/h3: Disable 0-copy for buffers of different size
- MEDIUM: applet: Disable 0-copy for buffers of different size
- MINOR: h1-htx: Disable 0-copy for buffers of different size
- MEDIUM: stream: Offer buffers of default size only
- BUG/MEDIUM: htx: Fix function used to change part of a block value when defrag
- MEDIUM: htx: Refactor transfer of htx blocks to merge DATA blocks if possible
- MEDIUM: htx: Refactor htx defragmentation to merge data blocks
- MEDIUM: htx: Improve detection of fragmented/unordered HTX messages
- MINOR: http-ana: Do a defrag on unaligned HTX message when waiting for payload
- MINOR: http-fetch: Use pointer to HTX DATA block when retrieving HTX body
- MEDIUM: dynbuf: Add a pool for large buffers with a configurable size
- MEDIUM: chunk: Add support for large chunks
- MEDIUM: stconn: Properly handle large buffers during a receive
- MEDIUM: sample: Get chunks with a size dependent on input data when necessary
- MEDIUM: http-fetch: Be able to use large chunks when necessary
- MINPR: htx: Get large chunk if necessary to perform a defrag
- MEDIUM: http-ana: Use a large buffer if necessary when waiting for body
- MINOR: dynbuf: Add helpers to know if a buffer is a default or a large buffer
- MINOR: config: reject configs using HTTP with large bufsize >= 256 MB
- CI: do not use ghcr.io for Quic Interop workflows
- BUG/MEDIUM: ssl: SSL backend sessions used after free
- CI: vtest: move the vtest2 URL to vinyl-cache.org
- CI: github: disable windows.yml by default on unofficials repo
- MEDIUM: Add connect/queue/tarpit timeouts to set-timeout
- CLEANUP: mux-h1: Remove unneeded null check
- DOC: remove openssl no-deprecated CI image
- BUG/MINOR: acme: fix X509_NAME leak when X509_set_issuer_name() fails
- BUG/MINOR: backend: check delay MUX before conn_prepare()
- OPTIM: backend: reduce contention when checking MUX init with ALPN
- DOC: configuration: add the ACME wiki page link
- MINOR: ssl/ckch: Move EVP_PKEY and cert code generation from acme
- MINOR: ssl/ckch: certificates generation from "load" "crt-store" directive
- MINOR: trace: add definitions for haterm streams
- MINOR: init: allow a fileless init mode
- MEDIUM: init: allow the redefinition of argv[] parsing function
- MINOR: stconn: stream instantiation from proxy callback
- MINOR: haterm: add haterm HTTP server
- MINOR: haterm: new "haterm" utility
- MINOR: haterm: increase thread-local pool size
- BUG/MEDIUM: stats-file: fix shm-stats-file recover when all process slots are full
- BUG/MINOR: stats-file: manipulate shm-stats-file heartbeat using unsigned int
- BUG/MEDIUM: stats-file: detect and fix inconsistent shared clock when resuming from shm-stats-file
- CI: github: only enable OS X on development branches
Don't use the macOS job on maintenance branches, it's mainly use for
development and checking portability, but we don't support actively
macOS on stable branches.
When leveraging shm-stats-file, global_now_ms and global_now_ns are stored
(and thus shared) inside the shared map, so that all co-processes share
the same clock source.
Since the global_now_{ns,ms} clocks are derived from now_ns, and given
that now_ns is a monotonic clock (hence inconsistent from one host to
another or reset after reboot) special care must be taken to detect
situations where the clock stored in the shared map is inconsistent
with the one from the local process during startup, and cannot be
relied upon anymore. A common situation where the current implementation
fails is resuming from a shared file after reboot: the global_now_ns stored
in the shm-stats-file will be greater than the local now_ns after reboot,
and applying the shared offset doesn't help since it was only relevant to
processes prior to rebooting. Haproxy's clock code doesn't expect that
(once the now offset is applied) global_now_ns > now_ns, and it creates
ambiguous situation where the clock computations (both haproxy oriented
and shm-stats-file oriented) are broken.
To fix the issue, when we detect that the clock stored in the shm is
off by more than SHM_STATS_FILE_HEARTBEAT_TIMEOUT (60s) from the
local now_ns, since this situation is not supposed to happen in normal
environment on the host, we assume that the shm file was previously used
on a different system (or that the current host rebooted).
In this case, we perform a manually adjustment of the now offset so that
the monotonic clock from the current host is consistent again with the
global_now_ns stored in the file. Doing so we can ensure that clock-
dependent objects (such as freq_counters) stored within the map will keep
working as if we just (re)started where we left off when the last process
stopped updating the map.
Normally it is not expected that we update the now offset stored in the
map once the map was already created (because of concurrent accesses to
the file when multiple processes are attached to it), but in this specific
case, we know we are the first process on this host to start working
(again) on the file, thus we update the offset as if we created the
file ourself, while keeping existing content.
It should be backported in 3.3
shm-stats-file heartbeat is derived from now_ms with an extra time added
to it, thus it should be handled using the same time as now_ms is.
Until now, we used to handle heartbeat using signed integer. This was not
found to cause severe harm but it could result in improper handling due
to early wrapping because of signedness for instance, so let's better fix
that before it becomes a real issue.
It should be backported in 3.3
Amaury reported that when the following warning is reported by haproxy:
[WARNING] (296347) : config: failed to get shm stats file slot for 'haproxy.stats', all slots are occupied
haproxy would segfault right after during clock update operation.
The reason for the warning being emitted is not the object of this commit
(all shm-stats-file slots occupied by simultaneous co-processes) but since
it was is intended that haproxy is able to keep working despite that
warning (ignoring the use of shm-stats-file), we should fix the crash.
The crash is caused by the fact that we detach from the shared memory while
the global_now_ns and global_now_ms clock pointers still point to the shared
memory. Instead we should revert to using our local clock instead before
detaching from the map.
It should be backported in 3.3
QUIC uses many objects and the default pool size causes a lot of
thrashing at the current request rate, taking ~12% CPU in pools.
Let's increase it to 3MB, which allows us to reach around 11M
req/s on a 80-core machine.
haterm_init.c is added to implement haproxy_init_args() which overloads
the one defined by haproxy.c. This way, haterm program uses its own argv[]
parsing function. It generates its own configuration in memory that is
parsed during boot and executed by the common code.
Contrary to haproxy, httpterm does not support all the HTTP protocols.
Furthermore, it has become easier to handle inbound/outbound
connections / streams since the rework done at conn_stream level.
This patch implements httpterm HTTP server services into haproxy. To do
so, it proceeds the same way as for the TCP checks which use only one
stream connector, but on frontend side.
The makefile is modified to handle haterm.c in additions to all the C
files for haproxy to build new haterm program into haproxy, the haterm
server also instantiates a haterm stream (hstream struct) attached to a
stream connector for each incoming connection without backend stream
connector. This is the role of sc_new_from_endp() called by the muxes to
instantiate streams/hstreams.
As for stream_new(), hstream_new() instantiates a task named
process_hstream() (see haterm.c) which has the same role as
process_stream() but for haterm streams.
haterm into haproxy takes advantage of the HTTP muxes and HTX API to
support all the HTTP protocols supported by haproxy.
Add a pointer to function to proxies as ->stream_new_from_sc proxy
struct member to instantiate stream from connection as this is done by
all the muxes when they call sc_new_from_endp(). The default value for
this pointer is obviously stream_new() which is exported by this patch.
This patches allows the argv[] parsing function to be redefined from
others C modules. This is done extracting the function which really
parse the argv[] array to implement haproxy_init_args(). This function
is declared as a weak symbol which may be overloaded by others C module.
Same thing for copy_argv() which checks/cleanup/modifies the argv array.
One may want this function to be redefined. This is the case when other
C modules do not handle the same command line option. Copying such
argv[] would lead to conflicts with the original haproxy argv[] during
the copy.
This patch provides the possibility to initialize haproxy without
configuration file. This may be identified by the new global and exported
<fileless_mode> and <fileless_cfg> variables which may be used to
provide a struct cfgfile to haproxy by others means than a physical
file (built in memory).
When enabled, this fileless mode skips all the configuration files
parsing.
Add definitions for haterm stream as arguments to be used by the TRACE API.
This will be used by the haterm module to come which will have to handle
hstream struct objects (in place of stream struct objects).
Add "generate-dummy" on/off type keyword to "load" directive to
automatically generate dummy certificates as this is done for ACME from
ckch_conf_load_pem_or_generate() function which is called if a "crt"
keyword is also provide for this directive.
Also implement "keytype" to specify the key type used for these
certificates. Only "RSA" or "ECDSA" is accepted. This patch also
implements "bits" keyword for the "load" directive to specify the
private key size used for RSA. For ECDSA, a new "curves" keyword is also
provided by this patch to specify the curves to be used for the EDCSA
private keys generation.
ckch_conf_load_pem_or_generate() is modified to use these parameters
provided by "keytype", "bits" and "curves" to generate the private key
with ssl_gen_EVP_PKEY() before generating the X509 certificate calling
ssl_gen_x509().
Move acme_EVP_PKEY_gen() implementation to ssl_gencrt.c and rename it to
ssl_EVP_PKEY_gen(). Also extract from acme_gen_tmp_x509() the generic
part to implement ssl_gen_x509() into ssl_gencrt.c.
To generate a self-signed expired certificate ssl_EVP_PKEY_gen() must be
used to generate the private key. Then, ssl_gen_x509() must be called
with the private key as argument. acme_gen_tmp_x509() is also modified
to called these two functions to generate a temporary certificate has
done before modifying this part.
Such an expired self-signed certificate should not be use on the field
but only during testing and development steps.
In connect_server(), MUX initialization must be delayed if ALPN
negotiation is configured, unless ALPN can already be retrieved via the
server cache.
A readlock is used to consult the server cache. Prior to this patch, it
was always taken even if no ALPN is configured. The lock was thus used
for every new backend connection instantiation.
Rewrite the check so that now the lock is only used if ALPN is
configured. Thus, no lock access is done if SSL is not used or if ALPN
is not defined.
In practice, there will be no performance gain, as the read lock should
never block if ALPN is not configured. However, the code is cleaner as
it better reflect that only access to server nego_alpn requires the
path_params lock protection.
In connect_server(), when a new connection must be instantiated, MUX
initialization is delayed if an ALPN setting is present on the server
line configuration, as negotiation must be performed to select the
correct MUX. However, this is not the case if the ALPN can already be
retrieved on the server cache.
This check is performed too late however and may cause issue with the
QUIC stack. The problem can happen when the server ALPN is not yet set.
In the normal case, quic_conn layer is instantiated and MUX init is
delayed until the handshake completion. When the MUX is finally
instantiated, it reused without any issue app_ops from its quic_conn,
which is derived from the negotiated ALPN.
However, there is a race condition if another QUIC connection populates
the server ALPN cache. If this happens after the first quic_conn init
but prior to the MUX delay check, the MUX will thus immediately start in
connect_server(). When app_ops is retrieved from its quic_conn, a crash
occurs in qcc_install_app_ops() as the QUIC handshake is not yet
finalized :
#0 0x000055e242a66df4 in qcc_install_app_ops (qcc=0x7f127c39da90, app_ops=0x0) at src/mux_quic.c:1697
1697 if (app_ops->init && !app_ops->init(qcc)) {
[Current thread is 1 (Thread 0x7f12810f06c0 (LWP 25758))]
To fix this, MUX delay check is moved up in connect_server(). It is now
performed prior conn_prepare() which is responsible for the quic_conn
layer instantiation. Thus, it ensures consistency for the QUIC stack :
MUX init is always delayed if the quic_conn does not reuses itself the
SSL session and ALPN server cache (no quic_reuse_srv_params()).
This must be backported up to 3.3.
In acme_gen_tmp_x509(), if X509_set_issuer_name() fails, the code
jumped to the mkcert_error label without freeing the previously
allocated X509_NAME object. The other error paths after X509_NAME_new()
(X509_NAME_add_entry_by_txt and X509_set_subject_name) already properly
freed the name before jumping to mkcert_error, but this one was missed.
Fix this by freeing name before the goto, consistent with the other
error paths in the same function.
Must be backported as far as 3.3.
Since 3.1 a task is always created when H1 connections initialize, so
the later null check before task_queue() became unneeded.
Could be backported with 3c09b3432 (BUG/MEDIUM: mux-h1: Fix how timeouts
are applied on H1 connections).
Add the ability to set connect, queue and tarpit timeouts from the
set-timeout action. This is especially useful when using set-dst to
dynamically connect to servers.
This patch also adds the relevant fe_/be_/cur_ sample fetches for these
timeouts.
Disable the windows job for repository in repositories that are not in
the "haproxy" organization. This is mostly used for portability during
development and only making noise during the maintenance cycle.
Must be backported in every branches.
This bug impacts only the backends. The sessions cached could be used after been
freed because of a missing write lock into ssl_sock_handle_hs_error() when freeing
such objects. This issue could be rarely reproduced and only with QUIC with
difficulties (random CRYPTO data corruptions and instrumented code).
Must be backported as far as 2.6.
due to some (yet unknown) changes in ghcr.io we are not able to pull
images from it anymore. Lets temporarily switch to "local only" images
storage.
no functional change
b_is_default() and b_is_large() can now be used to know if a buffer is a
default buffer or a large one. _b_free() now relies on it.
These functions are also used when possible (stream_free(),
stream_release_buffers() and http_wait_for_msg_body()).
Thanks to previous patches, it is now possible to allocate a large buffer to
store the message payload in the context of the "wait-for-body" action. To
do so, "use-large-buffer" option must be set.
It means now it is no longer necessary to increase the regular buffer size
to be able to get message payloads of some requests or responses.
The function used to fetch params was update to get a chunk accordingly to
the parameter size. The same was also performed when the message body is
retrieved.
The function used to duplicate a sample was update to support large buffers. In
addition several converters were also reviewed to deal with large buffers. For
instance, base64 encoding and decoding must use chunks of the same size than the
sample. For some of them, a retry is performed to enlarge the chunk is possible.
TODO: Review reg_sub, concat and add_item to get larger chunk if necessary
While large buffers are still unused internally, functions receiving data
from endpoint (connections or applets) were updated to block the receives
when channels are using large buffer and the data forwarding was started.
The goal of this patch is to be able to flush large buffers at the end of
the analyzis stage to return asap on regular buffers.
Because there is now a memory pool for large buffers, we must also add the
support for large chunks. So, if large buffers are configured, a dedicated
memory pool is created to allocate large chunks. alloc_large_trash_chunk()
must be used to allocate a large chunk. alloc_trash_chunk_sz() can be used to
allocate a chunk with the best size. However free_trash_chunk() remains the
only way to release a chunk, regular or large.
In addition, large trash buffers are also created, using the same mechanism
than for regular trash buffers. So three thread-local trash buffers are
created. get_large_trash_chunk() must be used to get a large trash buffer.
And get_trash_chunk_sz() may be used to get a trash buffer with the best
size.
Add the support for large bufers. A dedicated memory pool is added. The size
of these buffers must be explicitly configured by setting
"tune.bufsize.large" directive. If it is not set, the pool is not
created. In addition, if the size for large buffers is the same than for
regular buffer, the feature is automatically disable.
For now, large buffers remain unused.
In sample fetch functions retrieving the message payload (req.body,
res.body...), instead of copying the payload in a trash buffer, we know
directely return a pointer the payload in the HTX message. To do so, we must
be sure there is only one HTX DATA block. Thanks to previous patches, it is
possible. However, we must take care to perform a defragmentation if
necessary.
When we are waiting for the request or response payload, it is usually
because the payload will be analyzed in a way or another. So, perform a
defrag if necessary. This should ease payload analyzis.
First, an HTX flags was added to know when blocks are unordered. It may
happen when a header is added while part of the payload was already received
or when the start-line is replaced by an new one. In these cases, the blocks
indexes are in the right order but not the blocks payload. Knowing a message
is unordered can be useful to trigger a defragmentation, mainly to be able
to append data properly for instance.
Then, detection of fragmented messages was improved, especially when a
header or a start-line is replaced by a new one.
Finally, when data are added in a message and cannot be appended into the
previous DATA block because the message is not aligned, a defragmentation is
performed to realign the message and append data.
When an HTX message is defragmented, the HTX DATA blocks are now merged into
one block. Just like the previous commit, this will help all payload
analysis, if any. However, there is an exception when the reference on a
DATA block must be preserved, via the <blk> parameter. In that case, this
DATA block is not merged with previous block.
htx_replace_blk_value() is buggy when a defrag is performed. It only happens
on data expension. But in that case, because a defragmentation is performed,
the blocks data are moved and old data of the updated block are no longer
accessible.
To fix the bug, we now use a chunk to temporarily copy the new data of the
block. This way we can safely perform the HTX defragmentation and then
recopy the data from the chunk to the HTX message.
It is theorically possible to hit this bug but concretly it is pretty hard.
This patch should be backported to all stable versions.
Check the channels buffers size on release before trying to offer it to
waiting entities. Only normal buffers must be considered. This will be
mandatory when the large buffers support on channels will be added.
When a message payload is parsed, it is possible to swap buffers. We must
only take care both buffers have same size. It will be mandatory when the
large buffers support on channels will be added.
Just like the previous commit, we must take care to never swap buffers of
different size when data are exchanged between an applet and a SC. it will
be mandatory when the large buffers support on channels will be added.
Today, it is useless to check the buffers size before performing a 0-copy in
muxes when data are sent, but it will be mandatory when the large buffers
support on channels will be added. Indeed, muxes will still rely on normal
buffers, so we must take care to never swap buffers of different size.
When the compression is performed, a trash chunk is used. So be sure to
never compression more data than the trash size. Otherwise the commression
could fail. Today, this cannot happen. But with the large buffers support on
channels, it could be an issue.
Note that this part should be reviewed to evaluate if we should use a larger
chunk too to perform the compression, maybe via an option.
It is not a bug fix, because there is no way to hit the issue for now. But
there is nothing preventing a loop of synchronous sends in process_stream().
Indead, when a synchronous send is successfully performed, we restart the
SCs evaluation and at the end another synchronous send is attempted. So with
an endpoint consuming data bit by bit or with a filter fowarding few bytes
at each call, it is possible to loop for a while in process_stream().
Because it is not expected, we now limit the number of synchronous send per
wakeup to two calls. In a nominal case, it should never be more. This commit
is mandatory to be able to handle large buffers on channels
There is no reason to backport this commit except if the large buffers
support on channels are backported.
When the response payload is stored in the cache, we can avoid to use a
trash chunk as temporary space area before copying everything in the cache
in one call. Instead we can directly write each HTX block in the cache, one
by one. It should not be an issue because, most of time, there is only one
DATA block.
This commit depends on "BUG/MEDIUM: shctx: Use the next block when data
exactly filled a block".
The call to init_buffer() during the worker startup may fail. In that case,
an error message is displayed but the error was not properly handled. So
let's add the proper check and exit on error.
At many places, we rely on global.tune.bufsize value instead of using the buffer
size. For now, it is not a problem. But if we want to be able to deal with
buffers of different sizes, it is good to reduce as far as possible dependencies
on the global value. most of time, we can use b_size() or c_size()
functions. The main change is performed on the error snapshot where the buffer
size was added into the error_snapshot structure.
Fix two issues in JWE token processing:
- Replace memcmp() with CRYPTO_memcmp() for authentication tag
verification in build_and_check_tag() to prevent timing
side-channel attacks. Also add a tag length validation check
before the comparison to avoid potential buffer over-read when
the decoded tag length doesn't match the expected HMAC half.
- Remove unreachable break statement after JWE_ALG_A256GCMKW case
in decrypt_cek_aesgcmkw().
sc_have_buff(), sc_need_buff(), sc_have_room() and sc_need_room() are
related to the buffer's channel. So we can move them in sc_strm.h header
file. In addition, this will be mandatory for the next commit.
During the message analysis, we must take care to stop wait for the message
body if an error was reported on client side or an abort was detected with
abort-on-close configured (by default now).
The bug was introduced when the "wait-for-body" action was added. Only the
producer state was tested. So, when we were waiting for the request payload,
there was no issue. But when we were waiting for the response payload, error
or abort on client side was not considered.
This patch should be backported to all stable versions.
When the hot list was removed in 3.0, a regression was introduced.
Theorically, it is possible to override data in a block when new data are
appended. It happens when data are copied. If the data size is a multiple of
the block size, all data are copied and the last used block is full. But
instead of saving a reference on the next block as the restart point for the
next copies, we keep a reference on the last full one. On the next read, we
reuse this block and old data are crushed. To hit the bug, no new blocks
should be reserved between the two data copy attempts.
Concretely, for now, it seems not possible to hit the bug. But with a block
size set to 1024, if more than 1024 bytes are reseved, with a first copy of
1024 bytes and a second one with remaining data, data in the first block
will be crushed.
So to fix the bug, the reference of the last block used to write data (which
is in fact the next one to use to perform the next copy) is only updated
when a block is full. In that case the next block is used.
This patch should be backported as far as 3.0 after a period of observation.
Since the legacy HTTP code was removed, the global and thread-local buffers,
tmpbuf and zbuf, are no longer used. So let's removed them.
This could be backported, theorically to all supported versions. But at
least it could be good to do so as far as 3.2 as it saves 2 buffers
per-thread.
This bug is quite old. When the length of the first DATA block is computed,
the offset is used instead of the block length minus the offset. It is only
used with random forwarding and there is a test just after to prevent any
issue, so there is no effect.
It could be backported to all stable versions.
Handshakes events were not properly decoded. Only send errors were decoded
as expected, other events were reported with a '-'. It is now fixes.
This patch could be backported as far as 3.2.
The previous fix was wrong. When shut flags are tested for legacy applets,
to know if the I/O handler can be called or not, we must be sure shut for
reads and for writes are both set to skip the applet I/O handler.
This bug introduced regression, at least for the peer applet and for the DNS
applet.
This patch must be backported with abc1947e1 ("BUG/MEDIUM: applet: Fix test
on shut flags for legacy applets"), so as far as 3.0.
If a producer tries to send data via the fast-forward mechanism while the
message is in an unexpected state from the consumer point of view, the
fast-forward is now disabled. Concretely, we now take care that the message
is in its data/tunnel stage to proceed in h1_nego_ff().
By disabling fast-forward in that case, we will automatically fall back on
the regular sending path and be able to handle the error in h1_snd_buf().
This patch should be backported as far as 3.0
If is illegal to send data if the stream is already closed. The case is
properly handled when data are sent via snd_buf(), by draining the data. But
it was still possible to process these data via nego_ff().
So, in this patch, both for the H2 and QUIC multiplexers, the fast-forward
is disabled if the stream is closed and nothing is performed. Doing so, we
will automatically fall back on the regular sending path and be able to
drain data in snd_buf().
Thanks to Mike Walker for his investigation on the subject.
This patch should be backported as far as 3.0.
Remove extra quote in feature cmd used to test SSL compatibility with
set_ssl_cafile QUIC regtest. Due to this syntax error, the test was
never executed.
No need to backport.
HTTP/3 CONNECT transcoding is not properly implemented on the frontend
side. Neither tunnel mode of application nor extended connect are
currently functional.
Clarify this situation by rejecting any CONNETC attempts on the frontend
side. The stream is thus now closed via a RESET_STREAM with error code
REQUEST_REJECTED.
This should be backported to every stable versions.
This reverts commit 235e8f1afd7e9753a26051b30c47ecc398ccfd12.
Prior to the above commit, snd_buf callback for QUIC MUX was able to
deal with data even after stream closure. The excess was simply
discarded, as no STREAM frame can be emitted after FIN/RESET_STREAM.
This code was later removed and replaced by a BUG_ON() to ensure snd_buf
is never called after stream closure.
However, this approach is too strict. Indeed, there is nothing in the
haproxy stream architecture which forbids this scheduling, in part
because QUIC MUX is the sole responsible of the stream closure. As such,
it is preferable to revert to the old code to prevent any triggering of
a BUG_ON() failure.
Note that nego_ff does not implement data draining if called after
stream closure. This will be done in a future patch.
Thanks to Mike Walker for his investigation on the subject.
This must be backported up to 2.8.
This is a follow up to b6bdb2553 ("MEDIUM: backend: make "balance random"
consider req rate when loads are equal")
In the above patch, we used the global sess_per_sec metric to choose which
server we should be using. But the original intent was to use the per
thread group statistic.
No backport needed, the previous patch already improved the situation in
3.3, so let's not take the risk of breaking that.
ssl-f-use lines tries to load a crt file, but the "crt" keyword is not
mandatory. That could lead to crtlist_load_crt() being called with a
NULL path, and trying to do a stat.
In this particular case we don't need to try anything and it's better to
leave with an actual error.
Must be backported as far as 3.2.
crtlist_load_crt() in post_section_frontend_crt_init() won't give
details about the line being parsed, this should be done by the caller.
Modify post_section_frontend_crt_init() to ouput the right error format.
Must be backported to 3.2.
In post_section_frontend_crt_init(), the crt_entry is populated by the
ssl_conf fromt the cfg_crt_node. On error path, the crt_list is
completely freed, including the ssl_conf structure. But the ssl_conf
structure was already freed when freeing the cfg_crt_node.
Fix the issue by doing a crtlist_dup_ssl_conf(n->ssl_conf) in the
crtlist_entry instead of an assignation.
Fix issue #3268.
Need to be backported as far as 3.2. The previous patch which adds the
crtlist_dup_ssl_conf() declaration is needed.
The pools memory usage calculation was done using ints by default, making
it harder to identify large ones. Let's switch to unsigned long for the
size calculations.
Add unlikely() hints on error paths in init, conv and fetch functions.
Remove unnecessary zero-initialization of local buffers that are
always written before use. Fix indentation in da_haproxy_checkinst()
and remove unused loop variable initialization.
Precompute the maximum header name length from the atlas evidence
headers at init and hot-reload time. Use it in da_haproxy_fetch() to
skip headers early that cannot match any known DeviceAtlas evidence
header, avoiding unnecessary string copies and comparisons.
Add the struct header_evidence_entry definition to the dummy dac.h
to accommodate the ongoing deviceatlas module update which now
iterates over atlas header_priorities to precompute maxhdrlen.
The struct was already referenced by struct da_atlas but lacked
a definition in the dummy header.
Increase DA_MAX_HEADERS from 24 to 32 and hbuf from 24 to 64 to
accommodate current DeviceAtlas data files which may use more headers
and longer header names.
Check the return value of da_atlas_getproptype() and skip the property
on failure instead of using an uninitialized proptype. Also remove the
unnecessary pprop pointer indirection, using prop directly.
When hot-reloading the atlas in da_haproxy_checkinst(), the configured
cache_size was not applied to the new instance, causing it to use the
default value.
This should be backported to lower branches.
da_fini() was called unconditionally in deinit_deviceatlas() even when
da_init() was never called. Move it inside the daset check. Also remove
the erroneous shm_unlink() call which could affect the dadwsch shared
memory used by the scheduling process.
This should be backported to lower branches.
In da_haproxy_checkinst(), when da_atlas_compile() failed, the cnew
buffer was leaked. Add a free(cnew) in the else branch.
This should be backported to lower branches.
In da_haproxy_checkinst(), base[0] was checked before acquiring the
lock but not re-checked after. Another thread could have already
processed the reload between the initial check and the lock
acquisition, leading to a race condition.
This should be backported to lower branches.
In da_haproxy_fetch(), vlen was set from v.len (the raw header value
length) instead of the truncated copy length. Also the cookie-specific
vlen calculation used an incorrect subtraction instead of the actual
extracted cookie value length (pl) returned by
http_extract_cookie_value().
This should be backported to lower branches.
The user-agent string copy had an off-by-one error: the buffer size
limit did not account for the null terminator, and the memcpy length
used i-1 which truncated the last character of the user-agent string.
This should be backported to lower branches.
When da_atlas_compile() or da_atlas_open() failed in init_deviceatlas(),
atlasimgptr was leaked and da_fini() was never called. Also add a NULL
check on strdup() for the default cookie name with proper cleanup of
the atlas and image pointer on failure.
This should be backported to lower branches.
Add missing NULL checks after strdup() for the json file path in
da_json_file() and the cookie name in da_properties_cookie().
This should be backported to lower branches.
da_log_level() and da_cache_size() were missing a return -1 on error,
causing fall-through to the normal return 0 path when invalid values
were provided.
This should be backported to lower branches.
More and more often, core dumps retrieved on systems that build with
-fPIE by default are becoming unexploitable. Even functions and global
symbols get relocated and gdb cannot figure their final position.
Ironically the post_mortem struct lying in its own section that was
meant to ease its finding is not exempt from this problem.
The only remaining way is to inspect the core to search for the
post-mortem magic, figure its offset from the file and look up the
corresponding virtual address with objdump. This is quite a hassle.
This patch implements a simple utility that opens a 64-bit core dump,
scans the program headers looking for a data segment which contains
the post-mortem magic, and prints it on stdout. It also places the
"pm_init" command alone on its own line to ease copy-pasting into the
gdb console. With this, at least the other commands in this directory
work again and allow to inspect the program's state. E.g:
$ ./getpm core.57612
Found post-mortem magic in segment 5:
Core File Offset: 0xfc600 (0xd5000 + 0x27600)
Runtime VAddr: 0x5613e52b6600 (0x5613e528f000 + 0x27600)
Segment Size: 0x28000
In gdb, copy-paste this line:
pm_init 0x5613e52b6600
It's worth noting that the program has so few dependencies that it even
builds with nolibc, allowing to upload a static executable into containers
being debugged and lacking development tools and compilers. The build
procedure is indicated inthe source code.
In the historical implementation, all filter related information where
stored at the stream level (using struct strm_flt * context), and filters
iteration was performed at the stream level also.
We identified that this was not ideal and would make the implementation of
future filters more complex since filters ordering should be handled in
a different order during request and response handling for decompression
for instance.
To make such thing possible, in this commit we migrate some channel
specific filter contexts in the channel directly (request or response),
and we implement 2 additional filter lists, one on the request channel
and another on the response channel. The historical stream filter list
is kept as-is because in some contexts only the stream is available and
we have to iterate on all filters. But for functions where we only are
interested in request side or response side filters, we now use dedicated
channel filters list instead.
The only overhead is that the "struct filter" was expanded by two "struct
list".
For now, no change of behavior is expected.
Multiple channel related functions have the same construction: they use
list_for_each_entry() to work on a given filter from the stream+channel
combination. In future commits we will try to use filter list from
dedicated channel list instead of the stream one, thus in this patch we
need as a prerequisite to implement and use the flt_list_{start,next} API
to iterate over filter list, giving the API the responsibility to iterate
over the correct list depending on the context, while the calling function
remains free to use the iteration construction it needs. This way we will
be able to easily change the way we iterate over filter list without
duplicating the code for requests and responses.
There is no need to have those helpers defined as macro, and since it
is not mandatory, code maintenance is much easier using functions,
thus let's switch to function definitions.
Also, we change the way we iterate over the list so that the calling
function now has a pseudo API to get and iterate over filter pointers
while keeping control on how they implement the iterating logic.
One benefit of this is that we will also be able to switch between lists
depending on the channel type, which is a prerequisite for upcoming
rework that split the filter list over request and response channels
(commit will follow)
No change of behavior is expected.
Implement a way to set a destination directory using DESTDIR, and a tmp
directory using TMPDIR.
By default:
- DESTDIR is ../vtest like it was done previously
- TMPDIR is mktemp -d
Only the vtest binary is copied in DESTDIR.
Example:
TMPDIR=/dev/shm/ DESTDIR=/home/user/.local/bin/ ./scripts/build-vtest.sh
Features prefixed by "HAVE_WORKING_" in the haproxy -vv feature list,
are features that are detected during runtime.
This patch splits these features on another line in haproxy -vv. This
line is named "Detected feature list".
The feature list in haproxy -vv is partly generated from the Makefile
using the USE_* keywords, but it's also possible to add keywords in the
feature list using hap_register_feature(), which adds the keyword at the
end of list. When doing so, the list is not correctly sorted anymore.
This patch fixes the problem by splitting the string using an array of
ist and applying a qsort() on it.
the TCP_MD5SIG ifdef is not enough to check if the feature is usable.
The code might compile but the OS could prevent to use it.
This patch tries to use the TCP_MD5SIG setsockopt before adding
HAVE_WORKING_TCP_MD5SIG in the feature list. so it would prevent to
start reg-tests if the OS can't run it.
Test the new "jwt_decrypt_jwk" converter that takes a JWK as argument,
either as a string or in a variable.
Only "RSA" and "oct" types are managed for now.
This converter takes a private key in the JWK format (RFC7517) that can
be provided as a string of via a variable.
The only keys managed for now are of type 'RSA' or 'oct'.
Add helper functions that take a JWK (JSON representation of an RSA
private key) into an EVP_PKEY (containing the private key).
Those functions are not used yet, they will be used in the upcoming
'jwt_decrypt_jwk' converter.
QUIC frame type is encoded as a varint. Initially, haproxy parsed it as
a single byte, which was enough to cover frames defined in RFC9000.
The code has been extended recently to support multi-bytes encoded
value, in anticipation of QUIC frames extension support. However, there
was no check on the varint format. This is interpreted erroneously as a
PADDING frame as this serves as the initial value. Thus the rest of the
packet is incorrectly handled, with various resulting effects, including
infinite loops and/or crashes.
This patch fixes this by checking the return value of quic_dec_int(). If
varint cannot be parsed, the connection is immediately closed.
This issue is assigned to CVE-2026-26080 report.
This must be backported up to 3.2.
Reported-by: Asim Viladi Oglu Manizada <manizada@pm.me>
Token parsing code on INITIAL packet for the NEW_TOKEN format is not
robust enough and may even crash on some rare malformed packets.
This patch fixes this by adding a check on the expected length of the
received token. The packet is now rejected if the token does not match
QUIC_TOKEN_LEN. This check is legitimate as haproxy should only parse
tokens emitted by itself.
This issue has been introduced with the implementation of NEW_TOKEN
tokens parsing required for 0-RTT support.
This issue is assigned to CVE-2026-26081 report.
This must be backported up to 3.0.
Reported-by: Asim Viladi Oglu Manizada <manizada@pm.me>
When a duplicated CRYPTO frame is received during handshake, a server
may consider that there was a packet loss and immediately retransmit its
pending CRYPTO data without having to wait for PTO expiration. However,
RFC 9002 indicates that this should only be performed at most once per
connection to avoid excessive packet transmission.
QUIC connection is flagged with QUIC_FL_CONN_HANDSHAKE_SPEED_UP to mark
that a fast retransmit has been performed. However, during the
refactoring on CRYPTO handling with the storage conversion from ncbuf to
ncbmbuf, the check on the flag was accidentely removed. The faulty patch
is the following one :
commit f50425c021eceb324add6873b58cc5f366554d31
MINOR: quic: remove received CRYPTO temporary tree storage
This patch adds again the check on QUIC_FL_CONN_HANDSHAKE_SPEED_UP
before initiating fast retransmit. This ensures this is only performed
once per connection.
This must be backported up to 3.3.
In srv_dynamic_maxconn(), we'll decide that the max number of connection
is the server's maxconn if 1) the proxy's number of connection is over
fullconn, or if minconn was not set.
Check if minconn is not set first, as it will be true most of the time,
and as the proxy's "beconn" variable is in a busy cache line, it can be
costly to access it, while minconn/maxconn is in a cache line that
should very rarely change.
Oto Valek rightfully reported in issue #3262 that the proxy-protocol
doc makes no mention of the packed attribute on struct pp2_tlv_ssl,
which is mandatory since fields are not type-aligned in it. Let's
add it in the definition and make an explicit mention about it to
save implementers from wasting their time trying to debug this.
It can be backported.
The documentation of @system-ca specifies that one can overwrite the
value provided by the SSL Library using SSL_CERT_DIR.
However it seems like X509_get_default_cert_dir() is not affected by
this environment variable, and X509_get_default_cert_dir_env() need to
be used in order to get the variable name, and get the value manually.
This could be backported in every stable branches. Note that older
branches don't have the memprintf in ssl_sock.c.
SSL libraries built manually might lack the right
X509_get_default_cert_dir() value.
The common way to fix the problem is to build openssl with
./configure --openssldir=/etc/ssl/
In order to verify this setting, output it with haproxy -vv.
Given that we already have "set profiling task", it's easy to permit to
enable/disable the lock and/or memory profiling at run time. However, the
change will only be applied next time the task profiling will be switched
from off/auto to on.
The patch is very minor and is best viewed with git show -b because it
indents a whole block that moves in a "if" clause.
This can be backported to 3.3 along with the two previous patches.
In continuity of previous patch, this one makes use of the new profiling
flags. For this, based on the global "profiling" setting, when switching
profiling on, we set or clear two flags on the thread context,
TH_FL_TASK_PROFILING_L and TH_FL_TASK_PROFILING_M to indicate whether
lock profiling and/or malloc profiling are desired when profiling is
enabled. These flags are checked along with TH_FL_TASK_PROFILING to
decide when to collect time around a lock or a malloc. And by default
we're back to the behavior of 3.2 in that neither lock nor malloc times
are collected anymore.
This is sufficient to see the CPU usage spent in the VDSO to significantly
drop from 22% to 2.2% on a highly loaded system.
This should be backported to 3.3 along with the previous patch.
Damien Claisse reported in issue #3257 a performance regression between
3.2 and 3.3 when task profiling is enabled, more precisely in relation
with the following patches were merged:
98cc815e3e ("MINOR: activity: collect time spent with a lock held for each task")
503084643f ("MINOR: activity: collect time spent waiting on a lock for each task")
9d8c2a888b ("MINOR: activity: collect CPU time spent on memory allocations for each task")
The issue mostly comes from the first patches. What happens is that the
local time is taken when entering and leaving each lock, which costs a
lot on a contended system. The problem here is the lack of finegrained
settings for lock and malloc profiling.
This patch introduces a better approach. The task profiler goes back to
its default behavior in on/auto modes, but the configuration now accepts
new extra options "lock", "no-lock", "memory", "no-memory" to precisely
indicate other timers to watch for each task when profiling turns on.
This is achieved by setting two new flags HA_PROF_TASKS_LOCK and
HA_PROF_TASKS_MEM in the global "profiling" variable.
This patch only parses the new values and assigns them to the global
variable from the config file for now. The doc was updated.
Commit 3674afe8a0 ("BUG/MEDIUM: threads: Atomically set TH_FL_SLEEPING
and clr FL_NOTIFIED") accidentally left a strange-looking line wrapping
making one think of an editing mistake, let's fix it and keep it on a
single line given that even indented wrapping is almost as large.
This can be backported with the fix above till 2.8 to keep the patch
context consistent between versions.
An issue was introduced in 3.0 with commit faa8c3e024 ("MEDIUM: lb-chash:
Deterministic node hashes based on server address"): the new server_key
field and lb_nodes entries initialization were not updated for servers
added at run time with "add server": server_key remains zero and the key
used in lb_node remains the one depending only on the server's ID.
This will cause trouble when adding new servers with consistent hashing,
because the hash-key will be ignored until the server's weight changes
and the key difference is detected, leading to its recalculation.
This is essentially caused by the poorly placed lb_nodes initialization
that is specific to lb-chash and had to be replicated in the code dealing
with server addition.
This commit solves the problem by adding a new ->server_init() function
in the lbprm proxy struct, that is called by the server addition code.
This also allows to abandon the complex check for LB algos that was
placed there for that purpose. For now only lb-chash provides such a
function, and calls it as well during initial setup. This way newly
added servers always use the correct key now.
While it should also theoretically have had an impact on servers added
with the "random" algorithm, it's unlikely that the difference between
proper server keys and those based on their ID could have had any visible
effect.
This patch should be backported as far as 3.0. The backport may be eased
by a preliminary backport of previous commit "CLEANUP: lb-chash: free
lb_nodes from chash's deinit(), not global", though this is not strictly
necessary if context is manually adjusted.
There's an ambuity on the ownership of lb_nodes in chash, it's allocated
by chash but freed by the server code in srv_free_params() from srv_drop()
upon deinit. Let's move this free() call to a chash-specific function
which will own the responsibility for doing this instead. Note that
the .server_deinit() callback is properly called both on proxy being
taken down and on server deletion.
For "add backend" implementation, postparsing code in
check_config_validity() from cfgparse.c has been extracted in a new
dedicated function named proxy_finalize() into proxy.c.
This has caused unexpected compilation issue as in the latter file
TLSEXT_TYPE_application_layer_protocol_negotiation macro may be
undefined, in particular when building without QUIC support. Thus, code
related to default ALPN on binds is discarded after the preprocessing
stage.
Fix this by including openssl-compat header file into proxy source file.
This should be sufficient to ensure SSL related defines are properly
included.
This should fix recent issues on SSL regtests.
No need to backport.
Emeric suggested that it's sometimes convenient to instantly know if a
client has advertised support for window scaling or timestamps for
example. While the info is present in the TCP options output, it's hard
to extract since it respects the options order.
So here we're extending the 56-bit fingerprint with 8 extra bits that
indicate the presence of options 2..8, and any option above 9 for the
last bit. In practice this is sufficient since higher options are not
commonly used. Also TCP option 5 is normally not sent on the SYN (SACK,
only SACK_perm is sent), and echo options 6 & 7 are no longer used
(replaced with timestamps). These fields might be repurposed in the
future if some more meaningful options are to be mapped (e.g. MPTCP,
TFO cookie, auth).
When a backend is created at runtime, the new proxy instance is inserted
at the end of proxies_list. This operation is buggy if this list is
empty : the code causes a null dereference which will lead to a crash.
This causes the following compilation error :
CC src/proxy.o
src/proxy.c: In function 'cli_parse_add_backend':
src/proxy.c:4933:36: warning: null pointer dereference [-Wnull-dereference]
4933 | proxies_list->next = px;
| ~~~~~~~~~~~~~~~~~~~^~~~
This patch fixes this issue. Note that in reality it cannot occur at
this moment as proxies_list cannot be empty (haproxy requires at least
one frontend to start, and the list also always contains internal
proxies).
No need to backport.
This patch fixes the following compilation error :
src/proxy.c:4954:12: error: format string is not a string literal
(potentially insecure) [-Werror,-Wformat-security]
4954 | ha_notice(msg);
| ^~~
No need to backport.
Add a new regtests to validate backend creation at runtime. A server is
then added and requests made to validate the newly created instance
before (with force-be-switch) and after publishing.
Implement proxy ID generation for dynamic backends. This is performed
through the already function existing proxy_get_next_id().
As an optimization, lookup will performed starting from a global
variable <dynpx_next_id>. It is initialized to the greatest ID assigned
after parsing, and updated each time a backend instance is created. When
backend deletion will be implemented, it could be lowered to the newly
available slot.
Implement the required operations for "add backend" handler. This
requires a new proxy allocation, settings copy from the specified
default instance and proxy config finalization. All handlers registered
via REGISTER_POST_PROXY_CHECK() are also called on the newly created
instance.
If no error were encountered, the newly created proxy is finally
attached in the proxies list.
This commits completes "add backend" handler with some checks performed
on the specified default proxy instance. These are additional checks
outside of the already existing inheritance rules, specific to dynamic
backends.
For now, a default proxy is considered not compatible if it is not in
mode TCP/HTTP. Also, a default proxy is rejected if it references HTTP
errors. This limitation may be lifted in the future, when HTTP errors
are partiallay reworked.
Add an optional "mode" argument to "add backend" CLI command. This
argument allows to specify if the backend is in TCP or HTTP mode.
By default, it is mandatory, unless the inherited default proxy already
explicitely specifies the mode. To differentiate if TCP mode is implicit
or explicit, a new proxy flag PR_FL_DEF_EXPLICIT_MODE is defined. It is
set for every defaults instances which explicitely defined their mode.
Define a basic CLI handler for "add backend".
For now, this handler only performs a parsing of the name argument and
return an error if a duplicate already exists. It runs under thread
isolation, to guarantee thread safety during the proxy creation.
This feature is considered in development. CLI command requires to set
experimental-mode.
Move backend compatibility checks performed during 'add server' in a
dedicated function be_supports_dynamic_srv(). This should simplify
addition of future restriction.
This function will be reused when implementing backend creation at
runtime.
Define a new utility function str_to_proxy_mode() which is able to
convert a string into the corresponding proxy mode if possible. This new
function is used for the parsing of "mode" configuration proxy keyword.
This patch will be reused for dynamic backend implementation, in order
to parse a similar "mode" argument via a CLI handler.
If a proxy is referencing a defaults instance, some checks must be
performed to ensure that inheritance will be compatible. Refcount of the
defaults instance may also be incremented if some settings cannot be
copied. This operation is performed when parsing a new proxy of defaults
section which references a defaults, either implicitely or explicitely.
This patch extracts this code into a dedicated function named
proxy_ref_defaults(). This in turn may call defaults_px_ref()
(previously called proxy_ref_defaults()) to increment its refcount.
The objective of this patch is to be able to reuse defaults inheritance
validation for dynamic backends created at runtime, outside of the
parsing code.
A lot of proxies initialization code is delayed on post-parsing stage,
as it depends on the configuration fully parsed. This is performed via a
loop on proxies_list.
Extract this code in a dedicated function proxy_finalize(). This patch
will be useful for dynamic backends creation.
Note that for the moment the code has been extracted as-is. With each
new features, some init code was added there. This has become a giant
loop with no real ordering. A future patch may provide some cleanup in
order to reorganize this.
Default proxies validation occurs during post-parsing. The objective is
to report any tcp/http-rules which could not behave as expected.
Previously, this was performed while looping over standard proxies list,
when such proxy is referencing a default instance. This was enough as
only named referenced proxies were kept after parsing. However, this is
not the case anymore in the context of dynamic backends creation at
runtime.
As such, this patch now performs validation on every named defaults
outside of the standard proxies list loop. This should not cause any
behavior difference, as defaults are validated without using the proxy
which relies on it.
Along with this change, PR_FL_READY proxy flag is now removed. Its usage
was only really needed for defaults, to avoid validating a same instance
multiple times. With the validation of defaults in their own loop, it is
now redundant.
Fix unhandled strdup() failure when initializing global.log_tag.
Bug was introduced with the fix UAF for global progname pointer from
351ae5dbe. So it must be backported as far as 3.1.
Initially when init_early was introduced the progname string was a local
used for temporary storage of log_tag. Now it's global and detached from
log_tag enough. Thus, in the past we could inform that log_tag
allocation has been failed but not now.
Must be backported since the progname string became global, that is
v3.1-dev9-96-g49772c55e
Differ checking the max threads per group number until we're done
parsing the configuration file, as it may be set after a "thread-group-
directive. Otherwise the default value of 64 will be used, even if there
is a max-threads-per-group directive.
This should be backported to 3.3.
Give global.maxthrpertgroup its default value at global creation,
instead of later when we're trying to detect the thread count.
It is used when verifying the configuration file validity, and if it was
not set in the config file, in a few corner cases, the value of 0 would
be used, which would then reject perfectly fine configuration files.
This should be backported to 3.3.
Document the mworker V3 implementation introduced in HAProxy 3.1.
Explains the rationale behind moving configuration parsing out of the
master process to improve robustness.
Could be backported to 3.1.
2026-02-04 16:39:44 +01:00
232 changed files with 13488 additions and 4940 deletions
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.