Commit Graph

19891 Commits

Author SHA1 Message Date
Aurelien DARRAGON
9f5853fa38 BUG/MINOR: server: don't miss server stats update on server state transitions
s->last_change and s->down_time updates were manually updated for each
effective server state change within srv_update_status().

This is rather error-prone, and as a result there were still some state
transitions that were not handled properly since at least 1.8.

ie:
- when transitionning from DRAIN to READY: downtime was updated
  (which is wrong since a server in DRAIN state should not be
   considered as DOWN)
- when transitionning from MAINT to READY: downtime was not updated
  (this can be easily seen in the html stats page)

To fix these all at once, and prevent similar bugs from being introduced,
we centralize the server last_change and down_time stats logic at the end
of srv_update_status():

If the server state changed during the call, then it means that
last_change must be updated, with a special case when changing from
STOPPED state which means the server was previously DOWN and thus
downtime should be updated.

This patch depends on:

- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to every stable versions.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
e80ddb18a8 BUG/MINOR: server: don't miss proxy stats update on server state transitions
backend "down" stats logic has been duplicated multiple times in
srv_update_status(), resulting in the logic now being error-prone.

For example, the following bugfix was needed to compensate for a
copy-paste introduced bug: d332f139
("BUG/MINOR: server: update last_change on maint->ready transitions too")

While the above patch works great, we actually forgot to update the
proxy downtime like it is done for other down->up transitions...
This is simply illustrating that the current design is error-prone,
it is very easy to miss something in this area.

To properly update the proxy downtime stats on the maint->ready transition,
to cleanup srv_update_status() and to prevent similar bugs from being
introduced in the future, proxy/backend stats update are now automatically
performed at the end of the server state change if needed.

Thus we can remove existing updates that were performed at various places
within the function, this simplifies things a bit.

This patch depends on:
- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to all stable versions.

Backport notes:

2.2:

Replace
        struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
        struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
22151c70bb MINOR: server: explicitly commit state change in srv_update_status()
As shown in 8f29829 ("BUG/MEDIUM: checks: a down server going to
maint remains definitely stucked on down state."), state changes
that don't result in explicit lb state change, require us to perform
an explicit server state commit to make sure the next state is
applied before returning from the function.

This is the case for server state changes that don't trigger lb logic
and only perform some logging.

This is quite error prone, we could easily forget a state change
combination that could result in next_state, next_admin or next_eweight
not being applied. (cur_state, cur_admin and cur_eweight would be left
with unexpected values)

To fix this, we explicitly call srv_lb_commit_status() at the end
of srv_update_status() to enforce the new values, even if they were
already applied. (when a state changes requires lb state update
an implicit commit is already performed)

Applying the state change multiple times is safe (since the next value
always points to the current value).

Backport notes:

2.2:

Replace
	struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
	struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
9a1df02ccb BUG/MINOR: server: incorrect report for tracking servers leaving drain
Report message for tracking servers completely leaving drain is wrong:

The check for "leaving drain .. via" never evaluates because the
condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the
current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN).

For tracking servers that leave inherited drain mode, this results in the
following message being emitted:
  "Server x/b is UP (leaving forced drain)"

Instead of:
  "Server x/b is UP (leaving drain) via x/a"

To this fix: we check if FDRAIN is currently set, else it means that the
drain status is inherited from the tracked server (IDRAIN)

This regression was introduced with 64cc49cf ("MAJOR: servers: propagate server
status changes asynchronously."), thus it may be backported to every stable
versions.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
2dac67af7d DOC: lua: restore 80 char limitation
Restore 80 char limitation throughout the file for easier reading
on the cli, and fix some raw formatting issues without altering
html rendering.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
096b383e16 MINOR: hlua/event_hdl: timestamp for events
'when' optional argument is provided to lua event handlers.

It is an integer representing the number of seconds elapsed since Epoch
and may be used in conjunction with lua `os.date()` function to provide
a custom format string.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
e9314fb7a7 MINOR: event_hdl: provide event->when for advanced handlers
For advanced async handlers only
(Registered using EVENT_HDL_ASYNC_TASK() macro):

event->when is provided as a struct timeval and fetched from 'date'
haproxy global variable.

Thanks to 'when', related event consumers will be able to timestamp
events, even if they don't work in real-time or near real-time.
Indeed, unlike sync or normal async handlers, advanced async handlers
could purposely delay the consumption of pending events, which means
that the date wouldn't be accurate if computed directly from within
the handler.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
ebf58e991a MINOR: event_hdl: dynamically allocated event data members
Add the ability to provide a cleanup function for event data passed
via the publishing function.

One use case could be the need to provide valid pointers in the safe
section of the data struct.
Cleanup function will be automatically called with data (or copy of data)
as argument when all handlers consumed the event, which provides an easy
way to release some memory or decrement refcounts to ressources that were
provided through the data struct.
data in itself may not be freed by the cleanup function, it is handled
by the API.

This would allow passing large (allocated) data blocks through the data
struct while keeping data struct size under the EVENT_HDL_ASYNC_EVENT_DATA
size limit.

To do so, when publishing an event, where we would currently do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
	 * may not use pointers to short-living resources
	 */
        event_data.safe.my_custom_data = x;

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA(&event_data));

We could do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
	 * may not use pointers to short-living resources,
	 * unless EVENT_HDL_CB_DATA_DM is used to ensure pointer
	 * consistency (ie: refcount)
	 */
        event_data.safe.my_custom_static_data = x;
	event_data.safe.my_custom_dynamic_data = malloc(1);

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA_DM(&event_data, data_new_family_cleanup));

With data_new_family_cleanup func which would look like this:

      void data_new_family_cleanup(const void *data)
      {
      	const struct event_hdl_cb_data_new_family *event_data = ptr;

	/* some data members require specific cleanup once the event
	 * is consumed
	 */
      	free(event_data.safe.my_custom_dynamic_data);
	/* don't ever free data! it is not ours */
      }

Not sure if this feature will become relevant in the future, so I prefer not
to mention it in the doc for now.

But given that the implementation is trivial and does not put a burden
on the existing API, it's a good thing to have it there, just in case.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
147691fd83 CLEANUP: event_hdl: fix comment typo about _sync assertion
Fixing a comment relative to EVENT_HDL_ASSERT_SYNC macro where a
typo was made and the comment was lacking some context.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
363ef4daa7 CLEANUP: event_hdl: updating obsolete comment for EVENT_HDL_CB_DATA
EVENT_HDL_CB_DATA macro comments were not updated during the API
refactor, fixing that.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
8273bfc639 BUG/MINOR: event_hdl: don't waste 1 event subtype slot
ESUB_INDEX(n) index macro is used exclusively with n > 0
Fixing it so that it starts numbering at 1 instead of 2.

This way, we don't waste a subtype slot in event_hdl_sub_type
struct, and we comply with the structure comments about max
supported event subtypes (currently set at 16).

If 68e692da0 ("MINOR: event_hdl: add event handler base api")
is being backported, then this commit should be backported with it.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
a63f4903c9 MINOR: server/event_hdl: prepare for upcoming refactors
This commit does nothing that ought to be mentioned, except that
it adds missing comments and slighty moves some function calls
out of "sensitive" code in preparation of some server code refactors.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
2f6a07dce8 MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args
Changing hlua_event_hdl_cb_data_push_args() return type to void since it
does not return anything useful.
Also changing its name to hlua_event_hdl_cb_push_args() since it does more
than just pushing cb data argument (it also handles event type and mgmt).

Errors catched by the function are reported as lua errors.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
55f84c7cab MINOR: hlua/event_hdl: expose proxy_uuid variable in server events
Adding proxy_uuid to ServerEvent class.
proxy_uuid contains the uuid of the proxy to which the server belongs
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
3d9bf4e1a5 MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups
Since "MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server"
we may now use proxy_uuid variable to perform proxy lookups when
handling a server event.

It is more reliable since proxy_uuid isn't subject to any size limitation
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
d714213862 MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server
Expose proxy_uuid variable in event_hdl_cb_data_server struct to
overcome proxy_name fixed length limitation.

proxy_uuid may be used by the handler to perform proxy lookups.
This should be preferred over lookups relying proxy_name.
(proxy_name is suitable for printing / logging purposes but not for
ID lookups since it has a maximum fixed length)
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
0ddf052972 CLEANUP: server: fix update_status() function comment
srv_update_status() function comment says that the function "is designed
to be called asynchronously".

While this used to be true back then with 64cc49cf
("MAJOR: servers: propagate server status changes asynchronously.")

This is not true anymore since 3ff577e ("MAJOR: server: make server state changes
synchronous again")

Fixing the comment in order to better reflect current behavior.
2023-04-21 14:36:45 +02:00
Aurelien DARRAGON
88687f0980 CLEANUP: errors: fix obsolete function comments
Since 9f903af5 ("MEDIUM: log: slightly refine the output format of
alerts/warnings/etc"), messages generated by ha_{alert,warning,notice}
don't embed date/time information anymore.

Updating some old function comments that kept saying otherwise.
2023-04-21 14:36:45 +02:00
Amaury Denoyelle
a65dd3a2c8 BUG/MINOR: quic: consume Rx datagram even on error
A BUG_ON crash can occur on qc_rcv_buf() if a Rx packet allocation
failed.

To fix this, datagram are marked as consumed even if a fatal error
occured during parsing. For the moment, only a Rx packet allocation
failure could provoke this. At this stage, it's unknown if the datagram
were partially parsed or not at all so it's better to discard it
completely.

This bug was detected using -dMfail argument.

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
d537ca79dc BUG/MINOR: quic: prevent crash on qc_new_conn() failure
Properly initialize el_th_ctx member first on qc_new_conn(). This
prevents a segfault if release should be called later due to memory
allocation failure in the function on qc_detach_th_ctx_list().

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
9bbfa72b67 BUG/MINOR: h3: fix crash on h3s alloc failure
Do not emit a CONNECTION_CLOSE on h3s allocation failure. Indeed, this
causes a crash as the calling function qcs_new() will also try to emit a
CONNECTION_CLOSE which triggers a BUG_ON() on qcc_emit_cc().

This was reproduced using -dMfail.

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
93d2ebe9f3 BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure
Previously, if a STREAM frame cannot be allocated for emission, a crash
would occurs due to an ABORT_NOW() statement in _qc_send_qcs().

Replace this by proper error code handling. Each stream were sending
fails are removed temporarily from qcc::send_list to a list local to
_qc_send_qcs(). Once emission has been conducted for all streams,
reinsert failed stream to qcc::send_list. This avoids to reloop on
failed streams on the second while loop at the end of _qc_send_qcs().

This crash was reproduced using -dMfail.

This should be backported up to 2.6.
2023-04-20 14:49:32 +02:00
Amaury Denoyelle
ed820823f0 BUG/MINOR: mux-quic: fix crash with app ops install failure
On MUX initialization, the application layer is setup via
qcc_install_app_ops(). If this function fails MUX is deallocated and an
error is returned.

This code path causes a crash before connection has been registered
prior into the mux_stopping_data::list for stopping idle frontend conns.
To fix this, insert the connection later in qc_init() once no error can
occured.

The crash was seen on the process closing with SUGUSR1 with a segfault
on mux_stopping_process(). This was reproduced using -dMfail.

This regression was introduced by the following patch :
  commit b4d119f0c7
  BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission

This should be backported up to 2.7.
2023-04-20 14:49:32 +02:00
Frédéric Lécaille
d07421331f BUG/MINOR: quic: Wrong Retry token generation timestamp computing
Again a now_ms variable value used without the ticks API. It is used
to store the generation time of the Retry token to be received back
from the client.

Must be backported to 2.6 and 2.7.
2023-04-19 17:31:28 +02:00
Frédéric Lécaille
45662efb2f BUG/MINOR: quic: Unchecked buffer length when building the token
As server, an Initial does not contain a token but only the token length field
with zero as value. The remaining room was not checked before writting this field.

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Frédéric Lécaille
0ed94032b2 MINOR: quic: Do not allocate too much ack ranges
Limit the maximum number of ack ranges to QUIC_MAX_ACK_RANGES(32).

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Frédéric Lécaille
4b2627beae BUG/MINOR: quic: Stop removing ACK ranges when building packets
Since this commit:

    BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.

There are more chances that ack ranges may be removed from their trees when
building a packet. It is preferable to impose a limit to these trees. This
will be the subject of the a next commit to come.

For now on, it is sufficient to stop deleting ack range from their trees.
Remove quic_ack_frm_reduce_sz() and quic_rm_last_ack_ranges() which were
there to do that.
Make qc_frm_len() support ACK frames and calls it to ensure an ACK frame
may be added to a packet before building it.

Must be backported to 2.6 and 2.7.
2023-04-19 11:36:54 +02:00
Aurelien DARRAGON
8cd620b46f MINOR: hlua: safe coroutine.create()
Overriding global coroutine.create() function in order to link the
newly created subroutine with the parent hlua ctx.
(hlua_gethlua() function from a subroutine will return hlua ctx from the
hlua ctx on which the coroutine.create() was performed, instead of NULL)

Doing so allows hlua_hook() function to support being called from
subroutines created using coroutine.create() within user lua scripts.

That is: the related subroutine will be immune to the forced-yield,
but it will still be checked against hlua timeouts. If the subroutine
fails to yield or finish before the timeout, the related lua handler will
be aborted (instead of going rogue unnoticed like it would be the case prior
to this commit)
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
cf0f792490 MINOR: hlua: hook yield on known lua state
When forcing a yield attempt from hlua_hook(), we should perform it on
the known hlua state, not on a potential substate created using
coroutine.create() from an existing hlua state from lua script.

Indeed, only true hlua couroutines will properly handle the yield and
perform the required timeout checks when returning in hlua_ctx_resume().

So far, this was not a concern because hlua_gethlua() would return NULL
if hlua_hook() is not directly being called from a hlua coroutine anyway.

But with this we're trying to make hlua_hook() ready for being called
from a subcoroutine which inherits from a parent hlua ctx.
In this case, no yield attempt will be performed, we will simply check
for hlua timeouts.

Not doing so would result in the timeout checks not being performed since
hlua_ctx_resume() is completely bypassed when yielding from the subroutine,
resulting in a user-defined coroutine potentially going rogue unnoticed.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
2a9764baae CLEANUP: hlua: avoid confusion between internal timers and tick based timers
Not all hlua "time" variables use the same time logic.

hlua->wake_time relies on ticks since its meant to be used in conjunction
with task scheduling. Thus, it should be stored as a signed int and
manipulated using the tick api.
Adding a few comments about that to prevent mixups with hlua internal
timer api which doesn't rely on the ticks api.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
58e36e5b14 MEDIUM: hlua: introduce tune.lua.burst-timeout
The "burst" execution timeout applies to any Lua handler.
If the handler fails to finish or yield before timeout is reached,
handler will be aborted to prevent thread contention, to prevent
traffic from not being served for too long, and ultimately to prevent
the process from crashing because of the watchdog kicking in.

Default value is 1000ms.
Combined with forced-yield default value of 10000 lua instructions, it
should be high enough to prevent any existing script breakage, while
still being able to catch slow lua converters or sample fetches
doing thread contention and risking the process stability.

Setting value to 0 completely bypasses this check. (not recommended but
could be required to restore original behavior if this feature breaks
existing setups somehow...)

No backport needed, although it could be used to prevent watchdog crashes
due to poorly coded (slow/cpu consuming) lua sample fetches/converters.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
da9503ca9a MEDIUM: hlua: reliable timeout detection
For non yieldable lua handlers (converters, fetches or yield
incompatible lua functions), current timeout detection relies on now_ms
thread local variable.

But within non-yieldable contexts, now_ms won't be updated if not by us
(because we're momentarily stuck in lua context so we won't
re-enter the polling loop, which is responsible for clock updates).

To circumvent this, clock_update_date(0, 1) was manually performed right
before now_ms is being read for the timeout checks.

But this fails to work consistently, because if no other concurrent
threads periodically run clock_update_global_date(), which do happen if
we're the only active thread (nbthread=1 or low traffic), our
clock_update_date() call won't reliably update our local now_ms variable

Moreover, clock_update_date() is not the right tool for this anyway, as
it was initially meant to be used from the polling context.
Using it could have negative impact on other threads relying on now_ms
to be stable. (because clock_update_date() performs global clock update
from time to time)

-> Introducing hlua multipurpose timer, which is internally based on
now_cpu_time_fast() that provides per-thread consistent clock readings.

Thanks to this new hlua timer API, hlua timeout logic is less error-prone
and more robust.

This allows the timeout detection to work as expected for both yieldable
and non-yieldable lua handlers.

This patch depends on commit "MINOR: clock: add now_cpu_time_fast() function"

While this could theorically be backported to all stable versions,
it is advisable to avoid backports unless we're confident enough
since it could cause slight behavior changes (timing related) in
existing setups.
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
df188f145b MINOR: clock: add now_cpu_time_fast() function
Same as now_cpu_time(), but for fast queries (less accurate)
Relies on now_cpu_time() and now_mono_time_fast() is used
as a cache expiration hint to prevent now_cpu_time() from being
called too often since it is known to be quite expensive.

Depends on commit "MINOR: clock: add now_mono_time_fast() function"
2023-04-19 11:03:31 +02:00
Aurelien DARRAGON
07cbd8e074 MINOR: clock: add now_mono_time_fast() function
Same as now_mono_time(), but for fast queries (less accurate)
Relies on coarse clock source (also known as fast clock source on
some systems).

Fallback to now_mono_time() if coarse source is not supported on the system.
2023-04-19 11:03:31 +02:00
Willy Tarreau
be336620b7 BUG/MINOR: cfgparse: make sure to include openssl-compat
Commit 5003ac7fe ("MEDIUM: config: set useful ALPN defaults for HTTPS
and QUIC") revealed a build dependency bug: if QUIC is not enabled,
cfgparse doesn't have any dependency on the SSL stack, so the various
ifdefs that try to check special conditions such as rejecting an H2
config with too small a bufsize, are silently ignored. This was
detected because the default ALPN string was not set and caused the
alpn regtest to fail without QUIC support. Adding openssl-compat to
the list of includes seems to be sufficient to have what we need.

It's unclear when this dependency was broken, it seems that even 2.2
didn't have an explicit dependency on anything SSL-related, though it
could have been inherited through other files (as happens with QUIC
here). It would be safe to backport it to all stable branches. The
impact is very low anyway.
2023-04-19 10:46:21 +02:00
Amaury Denoyelle
89e48ff92f BUG/MEDIUM: quic: prevent crash on Retry sending
The following commit introduced a regression :
  commit 1a5cc19cec
  MINOR: quic: adjust Rx packet type parsing

Since this commit, qv variable was left to NULL as version is stored
directly in quic_rx_packet instance. In most cases, this only causes
traces to skip version printing. However, qv is dereferenced when
sending a Retry which causes a segfault.

To fix this, simply remove qv variable and use pkt->version instead,
both for traces and send_retry() invocation.

This bug was detected thanks to QUIC interop runner. It can easily be
reproduced by using quic-force-retry on the bind line.

This must be backported up to 2.7.
2023-04-19 10:18:58 +02:00
Willy Tarreau
5003ac7fe9 MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC
This commit makes sure that if three is no "alpn", "npn" nor "no-alpn"
setting on a "bind" line which corresponds to an HTTPS or QUIC frontend,
we automatically turn on "h2,http/1.1" as an ALPN default for an HTTP
listener, and "h3" for a QUIC listener. This simplifies the configuration
for end users since they won't have to explicitly configure the ALPN
string to enable H2, considering that at the time of writing, HTTP/1.1
represents less than 7% of the traffic on large infrastructures. The
doc and regtests were updated. For more info, refer to the following
thread:

  https://www.mail-archive.com/haproxy@formilux.org/msg43410.html
2023-04-19 09:52:20 +02:00
Willy Tarreau
de85de69ec MINOR: ssl_crtlist: dump "no-alpn" on "show crtlist" when "no-alpn" was set
Instead of dumping "alpn " better show "no-alpn" as configured.
2023-04-19 09:12:43 +02:00
Willy Tarreau
a2a095536a MINOR: ssl: do not set ALPN callback with the empty string
While it does not have any effect, it's better not to try to setup an
ALPN callback nor to try to lookup algorithms when the configured ALPN
string is empty as a result of "no-alpn" being used.
2023-04-19 09:12:43 +02:00
Willy Tarreau
74d7cc0891 DOC: add missing documentation for "no-alpn" on bind lines
This is the doc for the no-alpn keyword that was mistakenly left out of
commit 158c18e85 ("MINOR: config: add "no-alpn" support for bind lines").
2023-04-19 09:12:43 +02:00
Willy Tarreau
2e3c04a7c7 REGTESTS: add a new "ssl_alpn" test to test ALPN negotiation
This teg-test verifies that different ALPN values on the "server" line
will negotiate the expected protocol depending on the ALPN "bind" line.
2023-04-19 08:38:06 +02:00
Willy Tarreau
158c18e85a MINOR: config: add "no-alpn" support for bind lines
It's possible to replace a previously set ALPN but not to disable ALPN
if it was previously set. The new "no-alpn" setting allows to disable
a previously set ALPN setting by preparing an empty one that will be
replaced and freed when the config is validated.
2023-04-19 08:38:06 +02:00
Christopher Faulet
d0c57d3d33 BUG/MEDIUM: stconn: Propagate error on the SC on sending path
On sending path, a pending error can be promoted to a terminal error at the
endpoint level (SE_FL_ERR_PENDING to SE_FL_ERROR). When this happens, we
must propagate the error on the SC to be able to handle it at the stream
level and eventually forward it to the other side.

Because of this bug, it is possible to freeze sessions, for instance on the
CLI.

It is a 2.8-specific issue. No backport needed.
2023-04-18 18:57:04 +02:00
Christopher Faulet
845f7c4708 CLEANUP: cli: Remove useless debug message in cli_io_handler()
When compiled in debug mode, HAProxy prints a debug message at the end of
the cli I/O handle. It is pretty annoying and useless because, we can active
applet traces. Thus, just remove it.
2023-04-18 18:57:04 +02:00
Christopher Faulet
cbfcb02e21 CLEANUP: backend: Remove useless debug message in assign_server()
When compiled in debug mode, HAProxy prints a debug message at the beginning
of assign_server(). It is pretty annoying and useless because, in debug
mode, we can active stream traces. Thus, just remove it.
2023-04-18 18:57:04 +02:00
Christopher Faulet
27c17d1ca5 BUG/MINOR: http-ana: Update analyzers on both sides when switching in TUNNEL mode
The commit 9704797fa ("BUG/MEDIUM: http-ana: Properly switch the request in
tunnel mode on upgrade") fixes the switch in TUNNEL mode, but only
partially. Because both channels are switch in TUNNEL mode in same time on
one side, the channel's analyzers on the opposite side are not updated
accordingly. This prevents the tunnel timeout to be applied.

So instead of updating both sides in same time, we only force the analysis
on the other side by setting CF_WAKE_ONCE flag when a channel is switched in
TUNNEL mode. In addition, we must take care to forward all data if there is
no DATAa TCP filters registered.

This patch is related to the issue #2125. It is 2.8-specific. No backport
needed.
2023-04-18 18:57:04 +02:00
Amaury Denoyelle
0783a7b08e MINOR: listener: remove unneeded local accept flag
Remove the receiver RX_F_LOCAL_ACCEPT flag. This was used by QUIC
protocol before thread rebinding was supported by the quic_conn layer.

This should be backported up to 2.7 after the previous patch has also
been taken.
2023-04-18 17:09:34 +02:00
Amaury Denoyelle
1acbbca171 MAJOR: quic: support thread balancing on accept
Before this patch, QUIC protocol used a custom add_listener callback.
This was because a quic_conn instance was allocated before accept. Its
thread affinity was fixed and could not be changed after. The thread was
derived itself from the CID selected by the client which prevent an even
repartition of QUIC connections on multiple threads.

A series of patches was introduced with a lot of changes. The most
important ones :
* removal of affinity between an encoded CID and a thread
* possibility to rebind a quic_conn on a new thread

Thanks to this, it's possible to suppress the custom add_listener
callback. Accept is conducted for QUIC protocol as with the others. A
less loaded thread is selected on listener_accept() and the connection
stack is bind on it. This operation implies that quic_conn instance is
moved to the new thread using the set_affinity QUIC protocol callback.

To reactivate quic_conn instance after thread rebind,
qc_finalize_affinity_rebind() is called after accept on the new thread
by qc_xprt_start() through accept_queue_process() / session_accept_fd().

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:09:34 +02:00
Amaury Denoyelle
739de3f119 MINOR: quic: properly finalize thread rebinding
When a quic_conn instance is rebinded on a new thread its tasks and
tasklet are destroyed and new ones created. Its socket is also migrated
to a new thread which stop reception on it.

To properly reactivate a quic_conn after rebind, wake up its tasks and
tasklet if they were active before thread rebind. Also reactivate
reading on the socket FD. These operations are implemented on a new
function qc_finalize_affinity_rebind().

This should be backported up to 2.7 after a period of observation.
2023-04-18 17:09:02 +02:00
Amaury Denoyelle
5f8704152a BUG/MINOR: quic: transform qc_set_timer() as a reentrant function
qc_set_timer() function is used to rearm the timer for loss detection
and probing. Previously, timer was always rearm when congestion window
was free due to a wrong interpretation of the RFC which mandates the
client to rearm the timer before handshake completion to avoid a
deadlock related to anti-amplification.

Fix this by removing this code from quic_pto_pktns(). This allows
qc_set_timer() to be reentrant and only activate the timer if needed.

The impact of this bug seems limited. It can probably caused the timer
task to be processed too frequently which could caused too frequent
probing.

This change will allow to reuse easily qc_set_timer() after quic_conn
thread migration. As such, the new timer task will be scheduled only if
needed.

This should be backported up to 2.6.
2023-04-18 17:09:02 +02:00