There is a bug in the way the channel flags are checked to set the clientfin
or serverfin timeout. Indeed, to set the clientfin timeout, the request channel
must be shut for reads (CF_SHUTR) or the response channel must be shut for
writes (CF_SHUTW). Conversely, the serverfin timeout must be set when
the request channel is shut for writes (CF_SHUTW) or the response channel is
shut for reads (CF_SHUTR).
It is a 2.8-dev specific issue. No backport needed.
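The rule described above can be sketched as follows. This is only a minimal illustration: the flag values and the two helper names are hypothetical, and the real CF_SHUTR/CF_SHUTW definitions live in HAProxy's channel headers.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define CF_SHUTR 0x1u
#define CF_SHUTW 0x2u

/* Client side closed: request shut for reads or response shut for writes. */
static bool use_clientfin(unsigned int req_flags, unsigned int res_flags)
{
	return (req_flags & CF_SHUTR) || (res_flags & CF_SHUTW);
}

/* Server side closed: request shut for writes or response shut for reads. */
static bool use_serverfin(unsigned int req_flags, unsigned int res_flags)
{
	return (req_flags & CF_SHUTW) || (res_flags & CF_SHUTR);
}
```

Swapping the two channels in either test is enough to pick the wrong timeout, which is the kind of mix-up this patch addresses.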
In the commit b08c5259e ("MINOR: stconn: Always report READ/WRITE event on
shutr/shutw"), a return statement was erroneously removed from
sc_app_shutr(). As a consequence, the CF_SHUTR flag was never set. Fortunately,
it is the default .shutr callback function. Thus when a connection or an
applet is attached to the SC, another callback is used to perform a
shutdown for reads.
It is a 2.8-dev specific issue. No backport needed.
It is not possible to successfully match an empty response. However, using
a regex, it should be possible to reject responses with any content. For
instance:
tcp-check expect !rstring ".+"
It may seem strange to do that, but it is possible and it is a valid
config, so it must work. Thanks to this patch, it is now really supported.
This patch may be backported as far as 2.2, but only if someone asks for it.
As timestamps based on now_ms values are used to compute the probing timeout,
they may wrap. So, use the ticks API to compare them.
Must be backported to 2.7 and 2.6.
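To illustrate why a plain comparison of wrapping millisecond timestamps is unsafe, here is a minimal sketch of a wrap-tolerant comparison. The helper names are hypothetical and are not HAProxy's ticks API; the sketch assumes the two values are less than 2^31 ms apart and ignores any special handling of particular tick values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if <a> is strictly before <b>, even when the
 * 32-bit counter wrapped between the two. Relies on the usual
 * two's-complement conversion to int32_t. */
static bool ms_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical helper: milliseconds left before <expire>, 0 if passed. */
static uint32_t ms_remain(uint32_t now, uint32_t expire)
{
	return ms_before(now, expire) ? expire - now : 0;
}
```

With now = 0xFFFFFFF0 and expire = 0x10, a direct "now < expire" test is false although the expiration date is still 32 ms in the future, while the signed difference gives the expected answer.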
Since this commit, it is the ->idle_expire field of the quic_conn struct which
must be taken into account to display the idle timer task expiration value:
MEDIUM: quic: Ack delay implementation
Furthermore, this value was always zero until now_ms has wrapped (20 s after
the start time) due to this commit:
MEDIUM: clock: force internal time to wrap early after boot
Do not rely on the value of now_ms compared to ->idle_expire to display the
difference but use ticks_remain() to compute it.
Must be backported to 2.7 where "show quic" has already been backported.
This bug arrived with this commit:
MEDIUM: quic: Ack delay implementation
It is possible that the idle timer task was already in the run queue when its
->expire field was updated by a call to qc_idle_timer_do_rearm(). To prevent
this task from running in this condition, one must check its ->expire field
value with the condition below. When it is not fulfilled, the timer has really
expired and the task must be run:
!tick_is_expired(t->expire, now_ms)
Furthermore, as this task may be directly woken up with a call to task_wakeup(),
for instance by qc_kill_conn() to kill the connection, one must check that this
task has really been woken up while it was in the wait queue and not by a direct
call to task_wakeup(), thanks to this test:
(state & TASK_WOKEN_ANY) == TASK_WOKEN_TIMER
Again, when this condition is not fulfilled, the task must be run.
Must be backported where the commit mentioned above was backported.
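The combination of the two tests above can be sketched as follows. The WOKEN_* values stand in for TASK_WOKEN_TIMER and TASK_WOKEN_ANY and are hypothetical, as are the helper names; this is an illustration of the decision, not the actual task handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wakeup reasons, for illustration only. */
#define WOKEN_TIMER 0x1u
#define WOKEN_OTHER 0x2u
#define WOKEN_ANY   (WOKEN_TIMER | WOKEN_OTHER)

/* True if <expire> is at or before <now>, tolerating 32-bit wrapping. */
static bool ms_expired(uint32_t expire, uint32_t now)
{
	return (int32_t)(now - expire) >= 0;
}

/* The run is skipped only for a pure timer wakeup whose timer was re-armed
 * in the meantime. Any other case must run the task. */
static bool idle_task_must_run(uint32_t expire, uint32_t now, unsigned int state)
{
	bool timer_only = (state & WOKEN_ANY) == WOKEN_TIMER;

	return !(timer_only && !ms_expired(expire, now));
}
```

A task which is not run would simply be queued again with its updated expiration date.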
As found in issue #2089, it's easy to mistakenly paste a colon in a
header name, or other chars (e.g. spaces) when quotes are in use, and
this causes all sorts of trouble in the field because such chars are
rejected by the peer.
Better try to detect these upfront. That's what we're doing here during
the parsing of the add-header/set-header/early-hint actions, where a
warning is emitted if a non-token character is found in a header name.
A special case is made for the colon at the beginning so that it remains
possible to place any future pseudo-headers that may appear. E.g:
[WARNING] (14388) : config : parsing [badchar.cfg:23] : header name 'X-Content-Type-Options:' contains forbidden character ':'.
This should be backported to 2.7, and ideally it should be turned to an
error in future versions.
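As an illustration of such a check, here is a minimal sketch which looks for the first non-token character in a header name, using the token character set from RFC 7230 and tolerating one leading colon. The function name is hypothetical and this is not the parser's actual code.

```c
#include <string.h>

/* Returns the first character of <name> which is not an HTTP token
 * character, or 0 if the name is clean. A single leading ':' is tolerated
 * so that pseudo-header-like names remain possible. */
static char first_forbidden_char(const char *name)
{
	/* non-alphanumeric token characters (RFC 7230, section 3.2.6) */
	static const char extra[] = "!#$%&'*+-.^_`|~";
	const char *p = name;

	if (*p == ':')
		p++;
	for (; *p; p++) {
		char c = *p;
		int alnum = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
		            (c >= 'a' && c <= 'z');

		if (!alnum && !strchr(extra, c))
			return c;
	}
	return 0;
}
```

Called with "X-Content-Type-Options:", it returns ':' which is the character reported in the warning above.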
As now_ms may be zero, these BUG_ON() could be triggered when its value has wrapped.
These calls to BUG_ON() may be removed because the values they were supposed to
check are safely used by the ticks API.
Must be backported to 2.6 and 2.7.
If OPENSSL_NO_DEPRECATED is set, we get an 'error: ‘RSA_PKCS1_PADDING’
undeclared' when building jwt.c. The symbol is not deprecated, we are
just missing an include.
This was raised in GitHub issue #2098.
It does not need to be backported.
This very old bug has been there since the first implementation of the newreno
congestion algorithm. It was a very bad idea to put a state variable into the
quic_cc_algo struct, which only defines the congestion control algorithm used
by a QUIC listener, typically its type and its callbacks, and is thus shared
by the connections of this listener.
This bug could lead to crashes since BUG_ON() calls have been added to each algorithm
implementation. This was revealed by interop tests, but not very often, as there
were rarely several connections running at the same time during these tests.
Fortunately this was also reported by Tristan in GH #2095.
Move the congestion algorithm state to the correct structures which are private
to a connection (see cubic and nr structs).
Must be backported to 2.7 and 2.6.
When entering a recovery period, the algo state is set by quic_enter_recovery().
And that's it! These two lines should have been removed with this commit:
BUG/MINOR: quic: Wrong use of now_ms timestamps (cubic algo)
Take the opportunity of this patch to add a missing TRACE_LEAVE() call in
quic_cc_cubic_ca_cb().
Must be backported to 2.7 and 2.6.
This algorithm does nothing except initializing the congestion control window
to a fixed value. Very smart!
Modify the QUIC congestion control configuration parser to support this new
algorithm. The congestion control algorithm must be set as follows:
quic-cc-algo nocc-<cc window size (KB)>
For instance if "nocc-15" is provided as quic-cc-algo keyword value, this
will set a fixed window of 15KB.
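A minimal sketch of how such a keyword value could be parsed is shown below. The function name is hypothetical, and the conversion assumes that 1KB means 1024 bytes, which is not stated above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Parses "nocc-<n>" and stores the window size in bytes into <cwnd>.
 * Returns 1 on success, 0 on a malformed value.
 * Assumption: 1KB = 1024 bytes. */
static int parse_nocc(const char *arg, unsigned long *cwnd)
{
	static const char prefix[] = "nocc-";
	char *end;
	unsigned long kb;

	if (strncmp(arg, prefix, sizeof(prefix) - 1) != 0)
		return 0;
	arg += sizeof(prefix) - 1;
	if (*arg < '0' || *arg > '9')
		return 0;              /* rejects empty, signed or spaced values */
	errno = 0;
	kb = strtoul(arg, &end, 10);
	if (errno || *end != '\0' || kb > (unsigned long)-1 / 1024)
		return 0;              /* overflow or trailing garbage */
	*cwnd = kb * 1024;
	return 1;
}
```

Under this assumption, "nocc-15" yields a window of 15360 bytes.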
Depending on what we're debugging, some FDs can represent pollution in
the "show fd" output. Here we add a set of filters allowing to pick (or
exclude) any combination of listener, frontend conn, backend conn, pipes,
etc. "show fd l" will only list listening connections for example.
The ->srtt field of the quic_loss struct stores 8*srtt so as not to have to
multiply/divide it to compute the RTT variance (at least). This is where there
was a bug in quic_loss_srtt_update(): each time ->srtt must be used, it must be
divided by 8 or right shifted by 3.
This bug had a very bad impact on networks with non-negligible packet loss.
Must be backported to 2.6 and 2.7.
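To make the scaling convention concrete, here is a minimal sketch of an RTT update using the smoothing formulas from RFC 9002 (section 5.3). The structure and function names are hypothetical, and the variance is stored unscaled here for simplicity, which may differ from the real quic_loss struct.

```c
#include <stdint.h>

/* Hypothetical structure: only the "srtt is stored multiplied by 8"
 * convention comes from the description above. */
struct rtt_state {
	uint32_t srtt8;   /* 8 * smoothed RTT, in ms */
	uint32_t rtt_var; /* RTT variance, in ms (unscaled in this sketch) */
};

static void rtt_update(struct rtt_state *s, uint32_t rtt)
{
	uint32_t srtt = s->srtt8 >> 3;  /* scale back before any use */
	uint32_t diff = srtt > rtt ? srtt - rtt : rtt - srtt;

	/* rtt_var = 3/4 * rtt_var + 1/4 * |srtt - rtt| */
	s->rtt_var = (3 * s->rtt_var + diff) / 4;
	/* srtt = 7/8 * srtt + 1/8 * rtt, i.e. 8*srtt += rtt - srtt */
	s->srtt8 = s->srtt8 - srtt + rtt;
}
```

Using s->srtt8 directly in the difference would inflate the variance by roughly a factor of 8, which gives an idea of the impact of this kind of bug.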
Reuse the idle timeout task to delay the acknowledgments. The time of the
idle timer expiration is from now on stored in ->idle_expire. The one to
trigger the acknowledgements is stored in ->ack_expire.
Add QUIC_FL_CONN_ACK_TIMER_FIRED new connection flag to mark a connection
whose acknowledgement timer has fired.
Modify qc_may_build_pkt() to prevent the sending of "ack only" packets and
allow the connection to send packets when the ack timer has fired.
It is possible that acks are sent before the ack timer has triggered. In
this case it is cancelled only if ACK frames are really sent.
The idle timer expiration must be set again when the ack timer has been
triggered or when it is cancelled.
Must be backported to 2.7.
Dump the variables displayed by TRACE_ENTER() or TRACE_LEAVE() with calls to
TRACE_PROTO(). No more variables are displayed by the two former macros. From
now on, this information is accessible from the proto level.
Add new calls to TRACE_PROTO() at important locations in relation with the QUIC
transport protocol.
When relevant, try to prefix such traces with TX or RX keyword to identify the
concerned subpart (transmission or reception) of the protocol.
Must be backported to 2.7.
This callback was left as not implemented. It should at least display
the algorithm state, the congestion control window, the slow start threshold
and the time of the current recovery period. Should be helpful to debug.
Must be backported to 2.7.
This bug was revealed by handshakeloss interop tests (often with quiceh) where one
could see haproxy receive an Initial packet without TLS ClientHello message (only
a padded PING frame). In this case, as the ->max_idle_timeout was not initialized,
the connection was closed about three seconds later, and haproxy opened a new one
with a new source connection ID upon receipt of the next client Initial packet. As
the interop runner counts the number of source connection IDs used by the server to
check that exactly 50 such IDs were used, it considered the test as failed.
So, the ->max_idle_timeout of the connection must be at least initialized
to the local "max_idle_timeout" transport parameter value to avoid such
a situation (closing connections too soon) until it is "negotiated" with the
client when receiving its TLS ClientHello message.
Must be backported to 2.7 and 2.6.
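How the value is "negotiated" is not detailed above. As a sketch, assuming the rule from RFC 9000 (section 10.1) where the effective idle timeout is the minimum of the two advertised values and a zero value means "not advertised", the computation could look like this (the function name is hypothetical):

```c
#include <stdint.h>

/* Effective idle timeout in ms. <peer> is 0 as long as the peer transport
 * parameters are unknown, in which case the local value applies. */
static uint64_t effective_idle_timeout(uint64_t local, uint64_t peer)
{
	if (!peer)
		return local;
	if (!local)
		return peer;
	return local < peer ? local : peer;
}
```

With such a rule, a connection which has not yet received the ClientHello still gets a sane timeout from the local transport parameter.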
This patch is similar to the one for cubic algorithm:
"BUG/MINOR: quic: Wrong use of timestamps with now_ms variable (cubic algo)"
As now_ms may wrap, one must use the ticks API to protect the newreno congestion
control algorithm implementation from side effects due to this.
Furthermore, to make the newreno congestion control algorithm more readable and easy
to maintain, add quic_cc_cubic_rp_cb() new callback for the "in recovery period"
state (QUIC_CC_ST_RP).
Must be backported to 2.7 and 2.6.
Add ->srtt, ->rtt_var, ->rtt_min and ->pto_count values from ->path->loss
struct to "show quic". Same thing for ->cwnd from ->path struct.
Also take the opportunity of this patch to dump the packet number
space information directly from ->pktns[] array in place of ->els[]
array. Indeed, ->els[QUIC_TLS_ENC_LEVEL_EARLY_DATA] and ->els[QUIC_TLS_ENC_LEVEL_APP]
have the same packet number space.
Must be backported to 2.7 where "show quic" implementation has already been
backported.
As now_ms may wrap, one must use the ticks API to protect the cubic congestion
control algorithm implementation from side effects due to this.
Furthermore, to make the cubic congestion control algorithm more readable and
easier to maintain, add a new state ("in recovery period", QUIC_CC_ST_RP new
enum). Implement quic_cc_cubic_rp_cb() which is the callback for this new state.
Must be backported to 2.7 and 2.6.
If a bundle is used in a crt-list, the ssl-min-ver and ssl-max-ver
options were not taken into account in entries other than the first one
because the corresponding fields in the ssl_bind_conf structure were not
copied in crtlist_dup_ssl_conf.
This should fix GitHub issue #2069.
This patch should be backported up to 2.4.
In some extremely unlikely case (or even impossible for now), we might
exit cli_parse_update_ocsp_response without raising an error but with a
filled 'err' buffer. It was not properly free'd.
It does not need to be backported.
This patch removes dead code from the cli_parse_update_ocsp_response
function. The 'end' label is only used in case of error so the check of
the 'errcode' variable and the errcode variable itself become useless.
This patch does not need to be backported.
It fixes GitHub issue #2077.
When a proxy enters the STOPPED state, it will no longer accept new
connections.
However, it doesn't mean that it's completely inactive yet: it will
still be able to handle already pending / keep-alive connections,
thus finishing ongoing work before effectively stopping.
be_usable_srv(), which is used by nbsrv converter and sample fetch,
will return 0 if the proxy is either stopped or disabled.
nbsrv behaves this way since it was originally implemented in b7e7c4720
("MINOR: Add nbsrv sample converter").
(Since then, multiple refactors were performed around this area, but
the current implementation still follows the same logic)
It was found that if nbsrv is used in a proxy section to perform routing
logic, unexpected decisions are being made when nbsrv is used on a proxy
with STOPPED state, since in-flight requests will suffer from nbsrv
returning 0 instead of the current number of usable servers which may
still process existing connections.
For instance, this can happen during process soft-stop, or even when
stopping the proxy from the cli / lua.
To fix this: we now make sure be_usable_srv() always returns the
current number of usable servers, unless the proxy is explicitly
disabled (from the config, not at runtime)
This could be backported up to 2.6.
For older versions, the need for a backport should be evaluated first.
--
Note for 2.4: proxy flags did not exist, it was implemented with fd10ab5e
("MINOR: proxy: Introduce proxy flags to replace disabled bitfield")
For 2.2: STOPPED and DISABLED states were not separated, so we have no
easy way to apply the fix anyway.
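The behavior change can be summarized with the following sketch. The flags, the structure and the function names are hypothetical and only illustrate the logic before and after the fix, not the real be_usable_srv() code.

```c
/* Hypothetical flags and structure, for illustration only. */
#define PR_DISABLED 0x1u  /* disabled from the configuration */
#define PR_STOPPED  0x2u  /* stopped at runtime (soft-stop, CLI, Lua) */

struct fake_proxy {
	unsigned int flags;
	int usable_srv;   /* servers currently able to process traffic */
};

/* Before the fix: 0 for a stopped or disabled proxy. */
static int usable_srv_old(const struct fake_proxy *px)
{
	if (px->flags & (PR_DISABLED | PR_STOPPED))
		return 0;
	return px->usable_srv;
}

/* After the fix: 0 only when the proxy is explicitly disabled. */
static int usable_srv_new(const struct fake_proxy *px)
{
	if (px->flags & PR_DISABLED)
		return 0;
	return px->usable_srv;
}
```

For a stopped proxy with three usable servers, the first variant reports 0 while the second one reports 3, so routing rules based on nbsrv keep working for in-flight requests.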
During soft-stop, manage_proxy() (p->task) will try to purge
trashable (expired and not referenced) sticktable entries,
effectively releasing the process memory to leave some space
for new processes.
This is done by calling stktable_trash_oldest(), immediately
followed by a pool_gc() to give the memory back to the OS.
As already mentioned in dfe7925 ("BUG/MEDIUM: stick-table:
limit the time spent purging old entries"), calling
stktable_trash_oldest() with a huge batch can result in the function
spending too much time searching and purging entries, and ultimately
triggering the watchdog.
Lately, an internal issue was reported in which we could see
that the watchdog is being triggered in stktable_trash_oldest()
on soft-stop (thus initiated by manage_proxy())
According to the report, the crash seems to only occur since 5938021
("BUG/MEDIUM: stick-table: do not leave entries in end of window during purge")
This could be the result of stktable_trash_oldest() now working
as expected, and thus spending a large amount of time purging
entries when called with a large enough <to_batch>.
Instead of adding new checks in stktable_trash_oldest(), here we
chose to address the issue directly in manage_proxy().
Since the stktable_trash_oldest() function is called with
<to_batch> == <p->table->current>, it's pretty obvious that it could
cause some issues during soft-stop if a large table, assuming it is
full prior to the soft-stop, suddenly sees most of its entries
becoming trashable because of the soft-stop.
Moreover, we should note that the call to stktable_trash_oldest() is
immediately followed by a call to pool_gc():
We know for sure that pool_gc(), as it involves malloc_trim() on
glibc, is rather expensive, and the more memory to reclaim,
the longer the call.
We need to ensure that stktable_trash_oldest() and the consequent
pool_gc() call theoretically fit together in a single task execution window
to avoid contention, and thus prevent the watchdog from being triggered.
To do this, we now allocate a "budget" for each purging attempt.
The budget is capped at 32K, which means that each sticktable cleanup
attempt will trash at most 32K entries.
The 32K value is quite arbitrary here, and might need to be adjusted or
even deduced from other parameters if this fails to properly address
the issue without introducing new side-effects.
The goal is to find a good balance between the max duration of each
cleanup batch and the frequency of (expensive) pool_gc() calls.
If most of the budget is actually spent trashing entries, then the task
will immediately be rescheduled to continue the purge.
This way, the purge is effectively batched over multiple task runs.
This may be slowly backported to all stable versions.
[Please note that this commit depends on 6e1fe25 ("MINOR: proxy/pool:
prevent unnecessary calls to pool_gc()")]
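The batching described above can be sketched with a simplified model. Everything here is hypothetical (a table reduced to a counter, the function names, and the "most of the budget" threshold arbitrarily set to half of it); the memory release is only counted, and it is skipped when nothing was trashed, in the spirit of the pool_gc() dependency noted above.

```c
#include <stddef.h>

#define PURGE_BUDGET 32768  /* 32K entries per attempt, as described above */

/* Hypothetical stand-in for a stick table: only counts trashable entries. */
struct fake_table {
	size_t trashable;
};

/* Trashes at most <budget> entries and returns how many were trashed. */
static size_t trash_oldest(struct fake_table *t, size_t budget)
{
	size_t n = t->trashable < budget ? t->trashable : budget;

	t->trashable -= n;
	return n;
}

/* One task run. Returns 1 if the task should be rescheduled immediately.
 * <gc_calls> counts the (expensive) memory release calls. */
static int purge_once(struct fake_table *t, unsigned int *gc_calls)
{
	size_t cleaned = trash_oldest(t, PURGE_BUDGET);

	if (cleaned)
		(*gc_calls)++;   /* release memory only if something was freed */
	return cleaned > PURGE_BUDGET / 2;
}
```

In this model, a table with 100000 trashable entries is purged in four task runs instead of a single long one.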
This commit adds a new optional argument to smp_fetch_url_param
and smp_fetch_url_param_val that makes the parameter key comparison
case-insensitive.
Users can now retrieve URL parameters regardless of their case,
which allows matching parameters for case-insensitive applications.
The doc was updated.
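As an illustration of the comparison itself, here is a minimal sketch of a query string lookup with an optional case-insensitive key match. The function name and its signature are hypothetical and do not reflect the real smp_fetch_url_param() code; strncasecmp() comes from POSIX.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp() (POSIX) */

/* Looks up parameter <key> in query string <qs> ("a=1&b=2"), using <delim>
 * as the separator. If <icase> is non-zero the key comparison ignores case.
 * Returns a pointer to the value and sets <len>, or NULL if not found. */
static const char *find_url_param(const char *qs, const char *key, char delim,
                                  int icase, size_t *len)
{
	size_t klen = strlen(key);

	while (*qs) {
		const char *end = strchr(qs, delim);
		size_t plen = end ? (size_t)(end - qs) : strlen(qs);

		if (plen > klen && qs[klen] == '=' &&
		    (icase ? strncasecmp(qs, key, klen)
		           : strncmp(qs, key, klen)) == 0) {
			*len = plen - klen - 1;
			return qs + klen + 1;
		}
		if (!end)
			break;
		qs = end + 1;
	}
	return NULL;
}
```

Looking for "name" in "ID=42&Name=bob" only succeeds when the case-insensitive mode is enabled.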
This commit adds a new argument to smp_fetch_url_param
that makes the parameter key comparison case-insensitive.
Several levels of callers were modified to pass this info.
In anticipation of adding a third parameter to the url_param
sample-fetch function, we need to make the second parameter optional.
Users can now pass an empty 2nd argument to keep the default delimiter.
The table in section 2.2 ("Quoting and escaping") was formatted in a way
which is not recognized by haproxy-dconv, breaking it, and cutting off
the entire section.
This commit fixes that by formatting the table in a way which allows the
converter to produce the correct HTML.
Fixes cbonte/haproxy-dconv#35
Since eb77824 ("MEDIUM: proxy: remove the deprecated "grace" keyword"),
stop_time is never set, so the related code in manage_proxy() is not
relevant anymore.
Removing code that refers to p->stop_time, since it was probably
overlooked.
Under certain soft-stopping conditions (ie: sticktable attached to proxy
and in-progress connections to the proxy that prevent haproxy from
exiting), manage_proxy() (p->task) will wake up every second to perform
a cleanup attempt on the proxy sticktable (to purge unused entries).
However, as reported by TimWolla in GH #2091, it was found that a
systematic call to pool_gc() could cause some CPU waste, mainly
because malloc_trim() (which is rather expensive) is being called
for each pool_gc() invocation.
As a result, such soft-stopping process could be spending a significant
amount of time in the malloc_trim->madvise() syscall for nothing.
Example "strace -c -f -p `pidof haproxy`" output (taken from
Tim's report):
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 46.77    1.840549        3941       467         1 epoll_wait
 43.82    1.724708          13    128509           sched_yield
  8.82    0.346968          11     29696           madvise
  0.58    0.023011          24       951           clock_gettime
  0.01    0.000257          10        25         7 recvfrom
  0.00    0.000033          11         3           sendto
  0.00    0.000021          21         1           rt_sigreturn
  0.00    0.000021          21         1           timer_settime
------ ----------- ----------- --------- --------- ----------------
100.00    3.935568          24    159653         8 total
To prevent this, we now only call pool_gc() when some memory is
really expected to be reclaimed as a direct result of the previous
stick table cleanup.
This is pretty straightforward since stktable_trash_oldest() returns
the number of trashed sticky sessions.
This may be backported to all stable versions.
This bug arrived with this commit:
MINOR: quic: Send PING frames when probing Initial packet number space
This may happen when haproxy needs to probe the peer with very short packets
(only one PING frame). In this case, the packet must be padded. There was clearly
a case which was removed by the commit mentioned above. That said, there was
an extra byte which was added to the PADDING frame before that commit.
This is no longer the case with this patch.
Thank you to @tatsuhiro-t (ngtcp2 manager) for having reported this issue which
was revealed by the keyupdate test (on client side).
Must be backported to 2.7 and 2.6.
When a backend H2 connection is waiting for the connection to be fully
established, nothing is sent. However, it remains useful to detect connection
errors at this stage. It is especially important to release the H2 connection
on connect error. Being able to set the H2_CF_ERR_PENDING or H2_CF_ERROR flags
when the underlying connection is not fully established will prevent the H2C
from being inserted in an idle list in h2_detach().
Without this fix, an H2C in PREFACE state and relying on a connection in
error can be inserted in the safe list. Of course, it will be purged if not
reused. But in the mean time, it can be reused. When this happens, the
connection remains in error and nothing happens. At the end a connection
error is returned to the client. On low traffic, we can imagine a scenario
where this dead connection is the only idle connection. If it is always
reused before being purged, no connection to the server is possible.
In addition, h2c_is_dead() is updated to declare as dead any H2 connection
with a pending error if its state is PREFACE or SETTINGS1 (thus if no
SETTINGS frame was received yet).
This patch should fix the issue #2092. It must be backported as far as 2.6.
In commit c2c043ed4 ("BUG/MEDIUM: stats: Consume the request except when
parsing the POST payload"), a change about applet was pushed too early. The
applet must still call cf_shutr() when the response is fully sent. It is
planned to rely on SE_FL_EOS flag, just like connections. But it is not
possible for now.
However, at first glance, this bug has no visible effect.
It is 2.8-specific. No backport needed.
Released version 2.8-dev6 with the following main changes :
- BUG/MEDIUM: mux-pt: Set EOS on error on sending path if read0 was received
- MINOR: ssl: Change the ocsp update log-format
- MINOR: ssl: Use ocsp update task for "update ssl ocsp-response" command
- BUG/MINOR: ssl: Fix double free in ocsp update deinit
- MINOR: ssl: Accept certpath as param in "show ssl ocsp-response" CLI command
- MINOR: ssl: Add certificate path to 'show ssl ocsp-response' output
- BUG/MEDIUM: proxy: properly stop backends on soft-stop
- BUG/MEDIUM: resolvers: Properly stop server resolutions on soft-stop
- DEBUG: cli/show_fd: Display connection error code
- DEBUG: ssl-sock/show_fd: Display SSL error code
- BUG/MEDIUM: mux-h1: Don't block SE_FL_ERROR if EOS is not reported on H1C
- BUG/MINOR: tcp_sample: fix a bug in fc_dst_port and fc_dst_is_local sample fetches
- BUG/MINOR: quic: Missing STREAM frame length updates
- BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list
- BUG/MINOR: mux-h2: make sure the h2c task exists before refreshing it
- MINOR: buffer: add br_count() to return the number of allocated bufs
- MINOR: buffer: add br_single() to check if a buffer ring has more than one buf
- BUG/MEDIUM: mux-h2: only restart sending when mux buffer is decongested
- BUG/MINOR: mux-h2: set CO_SFL_STREAMER when sending lots of data
- BUG/MINOR: quic: Missing STREAM frame data pointer updates
- MINOR: stick-table: add sc-add-gpc() to http-after-response
- MINOR: doc: missing entries for sc-add-gpc()
- BUG/MAJOR: qpack: fix possible read out of bounds in static table
- OPTIM: mux-h1: limit first read size to avoid wrapping
- MINOR: mux-h2: set CO_SFL_MSG_MORE when sending multiple buffers
- MINOR: ssl-sock: pass the CO_SFL_MSG_MORE info down the stack
- MINOR: quic: Stop stressing the acknowledgments process (RX ACK frames)
- BUG/MINOR: quic: Dysfunctional 01RTT packet number space probing
- BUG/MEDIUM: stream: do not try to free a failed stream-conn
- BUG/MEDIUM: mux-h2: do not try to free an unallocated h2s->sd
- BUG/MEDIUM: mux-h2: erase h2c->wait_event.tasklet on error path
- BUG/MEDIUM: stconn: don't set the type before allocation succeeds
- BUG/MINOR: stconn: fix sedesc memory leak on stream allocation failure
- MINOR: dynbuf: set POOL_F_NO_FAIL on buffer allocation
- MINOR: pools: preset the allocation failure rate to 1% with -dMfail
- BUG/MEDIUM: mux-h1: properly destroy a partially allocated h1s
- BUG/MEDIUM: applet: only set appctx->sedesc on successful allocation
- BUG/MINOR: quic: wake up MUX on probing only for 01RTT
- BUG/MINOR: quic: ignore congestion window on probing for MUX wakeup
- BUILD: thread: implement thread_harmless_end_sig() for threadless builds
- BUILD: thread: silence a build warning when threads are disabled
- MINOR: debug: support dumping the libs addresses when running in verbose mode
- BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used
- BUG/MINOR: trace: fix hardcoded level for TRACE_PRINTF
- BUG/MEDIUM: mux-quic: release data from conn flow-control on qcs reset
- MINOR: mux-quic: complete traces for qcs emission
- MINOR: mux-quic: adjust trace level for MAX_DATA/MAX_STREAM_DATA recv
- MINOR: mux-quic: add flow-control info to minimal trace level
- MINOR: pools: make sure 'no-memory-trimming' is always used
- MINOR: pools: intercept malloc_trim() instead of trying to plug holes
- MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim()
- MINOR: pools: export trim_all_pools()
- MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim()
- MINOR: tools: relax dlopen() on malloc/free checks
- MEDIUM: tools: further relax dlopen() checks to consider grouped symbols
- BUG/MINOR: pools: restore detection of built-in allocator
- MINOR: pools: report a replaced memory allocator instead of just malloc_trim()
- BUG/MINOR: h3: properly handle incomplete remote uni stream type
- BUG/MINOR: mux-quic: prevent CC status to be erased by shutdown
- MINOR: mux-quic: interrupt qcc_recv*() operations if CC scheduled
- MINOR: mux-quic: ensure CONNECTION_CLOSE is scheduled once per conn
- MINOR: mux-quic: close on qcs allocation failure
- MINOR: mux-quic: close on frame alloc failure
- BUG/MINOR: syslog: Request for more data if message was not fully received
- BUG/MEDIUM: stats: Consume the request except when parsing the POST payload
- DOC: config: set-var() dconv rendering issues
- BUG/MEDIUM: mux-h1: Wakeup H1C on shutw if there is no I/O subscription
- BUG/MINOR: applet/new: fix sedesc freeing logic
- BUG/MINOR: quic: Missing STREAM frame type updated
- BUILD: da: extends CFLAGS to support API v3 from 3.1.7 and onwards.
- BUG/MINOR: ssl: Stop leaking `err` in ssl_sock_load_ocsp()
Previously performing a config check of `.github/h2spec.config` would report a
20 byte leak as reported in GitHub Issue #2082.
The leak was introduced in a6c0a59e9a, which is
dev only. No backport needed.
Minor build update to still support both the v2 and v3 APIs from
the 3.1.7 release, which supports a cache but would need a shift
in the HAProxy build that is not necessary at the moment.
In the second half of the year and for the next major HAProxy release
branch, v2 could be dropped altogether, thus the next HAProxy 2.9
major release will contain more changes towards the v3 support
and a reminder of the v2 EOL.
To be backported.
This patch follows this commit which was not sufficient:
BUG/MINOR: quic: Missing STREAM frame data pointer updates
Indeed, after updating the ->offset field, the bit which informs the
frame builder of its presence must be systematically set.
This bug was revealed by the following BUG_ON() from
quic_build_stream_frame() :
bug condition "!!(frm->type & 0x04) != !!stream->offset.key" matched at src/quic_frame.c:515
This should fix the last crash which occurred in GitHub issue #2074.
Must be backported to 2.6 and 2.7.