This bit field used to be a per-thread cache of the result of the last
lookup of the presence of a task for each thread in the shared cache.
Since we now know that each thread has its own shared cache, a test of
emptiness is now sufficient to decide whether or not the shared tree
has a task for the current thread. Let's just remove this mask.
grq_total was only used to know how many tasks were being queued in the
global runqueue for stats purposes, and that was transferred to the per
thread rq_total counter once assigned. We don't need this anymore since
we know where they are, so let's just directly update rq_total and drop
that one.
Since we only use the shared runqueue to put tasks only assigned to
known threads, let's move that runqueue to each of these threads. The
goal will be to arrange an N*(N-1) mesh instead of a central contention
point.
The global_rqueue_ticks had to be dropped (for good) since we'll now
use the per-thread rqueue_ticks counter for both trees.
A few points to note:
- the rq_lock stlil remains the global one for now so there should not
be any gain in doing this, but should this trigger any regression, it
is important to detect whether it's related to the lock or to the tree.
- there's no more reason for using the scope-based version of the ebtree
now, we could switch back to the regular eb32_tree.
- it's worth checking if we still need TASK_GLOBAL (probably only to
delete a task in one's own shared queue maybe).
The runqueue ticks counter is per-thread and wasn't initially meant to
be shared. We'll soon have to share it so let's make it atomic. It's
only updated when waking up a task, and no performance difference was
observed. It was moved in the thread_ctx struct so that it doesn't
pollute the local cache line when it's later updated by other threads.
This function stopped being used before 2.4 because either the task is
dequeued by the scheduler itself and it knows where to find it, or it's
killed by any thread, and task_kill() must be used for this as only this
one is safe.
It's difficult to say whether task_unlink_rq() is still safe, but once
the lock moves to a thread declared in the task itself, it will be even
more difficult to keep it safe.
Let's just remove it now before someone reuses it and causes trouble.
TASK_SHARED_WQ was set upon task creation and never changed afterwards.
Thus if a task was created to run anywhere (e.g. a check or a Lua task),
all its timers would always pass through the shared timers queue with a
lock. Now we know that tid<0 indicates a shared task, so we can use that
to decide whether or not to use the shared queue. The task might be
migrated using task_set_affinity() but it's always dequeued first so
the check will still be valid.
Not only this removes a flag that's difficult to keep synchronized with
the thread ID, but it should significantly lower the load on systems with
many checks. A quick test with 5000 servers and fast checks that were
saturating the CPU shows that the check rate increased by 20% (hence the
CPU usage dropped by 17%). It's worth noting that run_task_lists() almost
no longer appears in perf top now.
As previously advertised in comments, the mask-based task_new() is now
gone. The low-level function now is task_new_on() which takes a thread
number or a negative value for "any thread", which is turned to zero
for thread-less builds since there's no shared WQ in thiscase. The
task_new_here() and task_new_anywhere() functions were adjusted
accordingly.
This removes the mask-based variant so that from now on the low-level
function becomes appctx_new_on() and it takes either a thread number or
a negative value for "any thread". This way we can use task_new_on() and
task_new_anywhere() instead of task_new() which will soon disappear.
At several places we need to figure the ID of the first thread allowed
to run a task. Till now this was performed using my_ffsl(t->thread_mask)
but since we now have the thread ID stored into the task, let's use it
instead. This is tagged major because it starts to assume that tid<0 is
strictly equivalent to atleast2(thread_mask), and that as such, among
the allowed threads are the current one.
The tasks currently rely on a mask but do not have an assigned thread ID,
contrary to tasklets. However, in practice they're either running on a
single thread or on any thread, so that it will be worth simplifying all
this in order to ease the transition to the thread groups.
This patch introduces a "tid" field in the task struct, that's either
the number of the thread the task is attached to, or a negative value
if the task is not bound to a thread, (i.e. its mask is all_threads_mask).
The new ID is only set and updated but not used yet.
For now we still set tid_bit to (1UL << tid) because FDs will not
work with more than one group without this, but once FDs start to
adopt local masks this must change to thr->ltid_bit.
Some FDs might be offered to some external code (external libraries)
which will deal with them until they close them. As such we must not
close them upon fd_delete() but we need to delete them anyway so that
they do not appear anymore in the fdtab. This used to be handled by
fd_remove() before 2.3 but we don't have this anymore.
This patch introduces a new flag FD_DISOWN to let fd_delete() know that
the core doesn't own the fd and it must not be closed upon removal from
the fd_tab. This way it's totally unregistered from the poller but still
open.
This patch must be backported on branches >= 2.3 because it will be
needed to fix a bug affecting SSL async. it should be adapted on 2.3
because state flags were stored in a different way (via bits in the
structure).
Implement a new status function for ncbuf. It allows to quickly report
if a buffer contains data in a fragmented way, i.e. with gaps in between
or at start of the buffer.
To summarize, a buffer is considered as non-fragmented in the following
cases :
- a null or empty buffer
- a full buffer
- a buffer containing exactly one data block at the beginning, following
by a gap until the end.
With ~1500 bytes QUIC datagrams, we can handle less than 200 datagrams
which is less than the default maxpollevents value. This should reduce
the chances of fulfilling the connections RX buffers as reported by
Tristan in GH #1737.
Must be backported to 2.6.
Remove the call to qc_list_all_rx_pkts() which print messages on stderr
during RX buffer overruns and add a new counter for the number of dropped packets
because of such events.
Must be backported to 2.6
We want to be able to schedule a tasklet onto a thread after the current tasklet
is done. What we have to do is to insert this tasklet at the head of the thread
task list. Furthermore, we would like to serialize the tasklets. They must be
run in the same order as the order in which they have been scheduled. This is
implemented passing a list of tasklet as parameter (see <head> parameters) which
must be reused for subsequent calls.
_tasklet_wakeup_after_on() is implemented to accomplish this job.
tasklet_wakeup_after_on() and tasklet_wake_after() are only wrapper macros around
_tasklet_wakeup_after_on(). tasklet_wakeup_after_on() does exactly the same thing
as _tasklet_wakeup_after_on() without having to pass the filename and line in the
filename as parameters (usefull when DEBUG_TASK is enabled).
tasklet_wakeup_after() hides also the usage of the thread parameter which is
<tl> tasklet thread ID.
In GH #1760 (which is marked as being a feature), there were compilation
errors on MacOS which could be reproduced in Linux when building 32-bit code
(-m32 gcc option). Most of them were due to variables types mixing in QUIC_MIN macro
or using size_t type in place of uint64_t type.
Must be backported to 2.6.
Sometimes using "debug dev memstats" can be frustrating because all
pool allocations are reported through pool-os.h and that's all.
But in practice there's nothing wrong with also intercepting pool_alloc,
pool_free and pool_zalloc and report their call counts and locations,
so that's what this patch does. It only uses an alternate set of macroes
for these 3 calls when DEBUG_MEM_STATS is defined. The outputs are
reported as P_ALLOC (for both pool_malloc() and pool_zalloc()) and
P_FREE (for pool_free()).
freq_ctr_overshoot_period() function may be used to retrieve the excess of
events over the current period for a givent frequency counter, ignoring the
history. It is a way compare the "current rate" (the number of events over
the current period) to a given rate and estimate the excess of events.
It may be used to safely add new events, especially at the begining of the
current period for a frequency counter with large period. This way, it is
possible to smoothly add events during the whole period without quickly
consuming all the quota at the beginning of the period and waiting for the
next one to be able to add new events.
Sometimes we need to be able to signal one thread among a mask, without
caring much about which bit will be picked. At the moment we use ffsl()
for this but this sometimes results in imbalance at certain specific
places where the same first thread in a set is always the same one that
is selected.
Another approach would consist in using the rank finding function but it
requires a popcount and a setup phase, and possibly a modulo operation
depending on the popcount, which starts to be very expensive.
Here we take a different approach. The idea is an input bit value is
passed, from 0 to LONGBITS-1, and that as much as possible we try to
pick the bit matching it if it is set. Otherwise we look at a mirror
position based on a decreasing power of two, and jump to the side
that still has bits left. In 6 iterations it ends up spotting one bit
among 64 and the operations are very cheap and optimizable. This method
has the benefit that we don't care where the holes are located in the
mask, thus it shows a good distribution of output bits based on the
input ones. A long-time test shows an average of 16 cycles, or ~4ns
per lookup at 3.8 GHz, which is about twice as fast as using the rank
finding function.
Just like for that one, the code was stored into tools.c since we don't
have a C file for intops.
Implement quic_tp_version_info_dump() to dump such a transport parameter (only remote).
Call it from quic_transport_params_dump() which dump all the transport parameters.
Can be backported to 2.6 as it's useful for debugging.
This counter must be incremented only one time by connection and decremented
as soon as the handshake has failed or succeeded. This is a gauge. Under certain
conditions this counter could be decremented twice. For instance
after having received a TLS alert, then upon SSL_do_handshake() failure.
To stop having to deal to all the current combinations which can lead to such a
situation (and the next to come), add a connection flag to denote if this counter
has been already decremented for a connection. So, this counter must be decremented
only if this flag has not been already set.
Must be backported up to 2.6.
In order to better detect the danger caused by extra shared libraries
which replace some symbols, upon dlopen() we now compare a few critical
symbols such as malloc(), free(), and some OpenSSL symbols, to see if
the loaded library comes with its own version. If this happens, a
warning is emitted and TAINTED_REDEFINITION is set. This is important
because some external libs might be linked against different libraries
than the ones haproxy was linked with, and most often this will end
very badly (e.g. an OpenSSL object is allocated by haproxy and freed
by such libs).
Since the main source of dlopen() calls is the Lua lib, via a "require"
statement, it's worth trying to show a Lua call trace when detecting a
symbol redefinition during dlopen(). As such we emit a Lua backtrace if
Lua is detected as being in use.
Several bug reports were caused by shared libraries being loaded by other
libraries or some Lua code. Such libraries could define alternate symbols
or include dependencies to alternate versions of a library used by haproxy,
making it very hard to understand backtraces and analyze the issue.
Let's intercept dlopen() and set a new TAINTED_SHARED_LIBS flag when it
succeeds, so that "show info" makes it visible that some external libs
were added.
The redefinition is based on the definition of RTLD_DEFAULT or RTLD_NEXT
that were previously used to detect that dlsym() is usable, since we need
it as well. This should be sufficient to catch most situations.
This function may be used to try to show where some Lua code is currently
being executed. It tries hard to detect the initialization phase, both for
the global and the per-thread states, and for runtime states. This intends
to be used by error handlers to provide the users with indications about
what Lua code was being executed when the error triggered.
When the protocol is changed for a client connection at the stream level
(from TCP to H1/H2), there are two cases. The stream may be reused or
not. The first case, when the stream is reused is working. The second one is
buggy since the conn-stream refactoring and leads to a crash.
In this case, the new mux don't reuse the stream. It must be silently
aborted. However, its front stream connector is still referencing the
connection. So it must be detached. But it must be performed in two stages,
to be sure to not loose the context for the upgrade and to be able to
rollback on error. So now, before the upgrade, we prepare to detach the
stconn and it is finally detached if the upgrade succeeds. There is a trick
here. Because we pretend the stconn is detached but its state is preserved.
This patch must be backported to 2.6.
At this time haproxy supported only incompatible version negotiation feature which
consists in sending a Version Negotiation packet after having received a long packet
without compatible value in its version field. This version value is the version
use to build the current packet. This patch does not modify this behavior.
This patch adds the support for compatible version negotiation feature which
allows endpoints to negotiate during the first flight or packets sent by the
client the QUIC version to use for the connection (or after the first flight).
This is done thanks to "version_information" parameter sent by both endpoints.
To be short, the client offers a list of supported versions by preference order.
The server (or haproxy listener) chooses the first version it also supported as
negotiated version.
This implementation has an impact on the tranport parameters handling (in both
direcetions). Indeed, the server must sent its version information, but only
after received and parsed the client transport parameters). So we cannot
encode these parameters at the same time we instantiated a new connection.
Add QUIC_TP_DRAFT_VERSION_INFORMATION(0xff73db) new transport parameter.
Add tp_version_information new C struct to handle this new parameter.
Implement quic_transport_param_enc_version_info() (resp.
quic_transport_param_dec_version_info()) to encode (resp. decode) this
parameter.
Add qc_conn_finalize() which encodes the transport parameters and configure
the TLS stack to send them.
Add ->negotiated_ictx quic_conn C struct new member to store the Initial
QUIC TLS context for the negotiated version. The Initial secrets derivation
is version dependent.
Rename ->version to ->original_version and add ->negotiated_version to
this C struct to reflect the QUIC-VN RFC denomination.
Modify most of the QUIC TLS API functions to pass a version as parameter.
Export the QUIC version definitions to be reused at least from quic_tp.c
(transport parameters.
Move the token check after the QUIC connection lookup. As this is the original
version which is sent into a Retry packet, and because this original version is
stored into the connection, we must check the token after having retreived this
connection.
Add packet version to traces.
See https://datatracker.ietf.org/doc/html/draft-ietf-quic-version-negotiation-08
for more information about this new feature.
This is becoming difficult to handle the QUIC TLS related definitions
which arrive with a QUIC version (draft or not). So, here we add
quic_version C struct which does not define only the QUIC version number,
but also the QUIC TLS definitions which depend on a QUIC version.
Modify consequently all the QUIC TLS API to reuse these definitions
through new quic_version C struct.
Implement quic_pkt_type() function which return a packet type (0 up to 3)
depending on the QUIC version number.
Stop harding the Retry packet first byte in send_retry(): this is not more
possible because the packet type field depends on the QUIC version.
Also modify quic_build_packet_long_header() for the same reason: the packet
type depends on the QUIC version.
Add a quic_version C struct member to quic_conn C struct.
Modify qc_lstnr_pkt_rcv() to set this member asap.
Remove the version member from quic_rx_packet C struct: a packet is attached
asap to a connection (or dropped) which is the unique object which should
store the QUIC version.
Modify qc_pkt_is_supported_version() to return a supported quic_version C
struct from a version number.
Add Initial salt for QUIC v2 draft (initial_salt_v2_draft).
Due to missing brackets around the ternary C operator, quic_pto() could return zero
at the first run, before the QUIC connection was completely initialized. This leaded
the idle timeout task to be executed before this initialization completion. Then
this connection could be immediately released.
This bug was revealed by the multi_packet_client_hello QUIC tracker test.
Must be backported to 2.6.
The nonce and keys used to cipher the Retry tag depend on the QUIC version.
Add these definitions for 0xff00001d (draft-29) and v2 QUIC version. At least
draft-29 is useful for QUIC tracker tests with "quic-force-retry" enabled
on haproxy side.
Validated with -v 0xff00001d ngtcp2 option.
Could not validate the v2 nonce and key at this time because not supported.
ensures that we never insert too many entries in a headers input list.
On the decoding side, a new error QPACK_ERR_TOO_LARGE is reported in
this case.
This prevents crash if headers number on a H3 request or response is
superior to tune.http.maxhdr config value. Previously, a crash would
occur in QPACK decoding function.
Note that the process still crashes later with ABORT_NOW() because error
reporting on frame parsing is not implemented for now. It should be
treated with a RESET_STREAM frame in most cases.
This can be backported up to 2.6.
Clean up QPACK decoder API by removing dependencies on ncbuf and
MUX-QUIC. This reduces includes statements. It will also help to
implement a standalone QPACK decoder.
Add comments on the decoding function to facilitate code analysis.
Also remove the qpack_debug_hexdump() which prints the whole left buffer
on each header parsing. With large HEADERS frame payload, QPACK traces
are complicated to debug with this statement.
This macro was used both for binding and for lookups. When binding tasks
or FDs, using all_threads_mask instead is better as it will later be per
group. For lookups, ~0UL always does the job. Thus in practice the macro
was already almost not used anymore since the rest of the code could run
fine with a constant of all ones there.
Each thread has its own local thread id and its own global thread id,
in addition to the masks corresponding to each. Once the global thread
ID can go beyond 64 it will not be possible to have a global thread Id
bit anymore, so better start to remove it and use only the local one
from the struct thread_info.
Instead of having a global mask of all the profiled threads, let's have
one flag per thread in each thread's flags. They are never accessed more
than one at a time an are better located inside the threads' contexts for
both performance and scalability.
The flow control enforced at connection level is incorrectly calculated.
There is a risk of exceeding the limit. In most cases, this results in a
segfault produced by a BUG_ON which is here to catch this kind of error.
If not compiled with DEBUG_STRICT, this should generate a connection
closed by the client due to the flow control overflow.
The problem is encountered when transfered payload is big enough to fill
the transport congestion window. In this case, some data are rejected by
the transport layer and kept by the MUX to be reemitted later. However,
these preserved data are not counted on the connection flow control when
resubmitted, which gradually amplify the gap between expected and real
consumed flow control.
To fix this, handle the flow-control at the connection level in the same
way as the stream level. A new field qcc.tx.offsets is incremented as
soon as data are transfered between stream TX buffers. The field
qcc.tx.sent_offsets is preserved to count bytes handled by the transport
layer and stop the MUX transfer if limit is reached.
As already stated, this bug can occur during transfers with enough
emitted data, over multiple streams. When using a single stream, the
flow control at the stream level hides it.
The BUG_ON crash is reproduced systematically with quiche client :
$ quiche-client --no-verify --http-version HTTP/3 -n 10000 https://127.0.0.1:20443/10K
This must be backported up to 2.6 when confirmed to work as expected.
This should fix github issue #1738.
Frame type has changed during HTTP/3 specification process. Adjust it to
reflect the latest RFC 9114 status.
Concretly, type for GOAWAY and MAX_PUSH_ID frames has been adjusted.
The impact of this bug is limited as currently these frames are not
handled by haproxy and are ignored.
This can be backported up to 2.6.
Benoit Dolez reported that gcc-4.4 emits several "may be used
uninitialized" warnings around places where there are BUG_ON()
or ABORT_NOW(). The reason is that __builtin_unreachable() was
introduced in gcc-4.5 thus older ones do not know that the code
after such statements is not reachable.
This patch solves the problem by deplacing the statement with
an infinite loop on older versions. The compiler knows that the
code following it cannot be reached, and this is quite cheap
(2 to 4 bytes depending on architectures). It even reduces the
code size a little bit as the compiler doesn't have to optimize
for branches that do not exist.
This may be backported to older versions.
Clean the API used by decode_qcs() and transcoder internal functions.
Parsing functions now returns a ssize_t which represents the number of
consumed bytes or a negative error code. The total consumed bytes is
returned via decode_qcs().
The API is now unified and cleaner. The MUX can thus simply use the
return value of decode_qcs() instead of substracting the data bytes in
the buffer before and after the call. Transcoders functions are not
anymore obliged to remove consumed bytes from the buffer which was not
obvious.
Slightly modify decode_qcs function used by transcoders. The MUX now
gives a buffer instance on which each transcoder is free to work on it.
At the return of the function, the MUX removes consume data from its own
buffer.
This reduces the number of invocation to qcs_consume at the end of a
full demuxing process. The API is also cleaner with the transcoders not
responsible of calling it with the risk of having the input buffer
freed if empty.
As a mirror to qcc/qcs types, add a h3c pointer into h3s struct. This
should help to clean up H3 code and avoid to use qcs.qcc.ctx to retrieve
the h3c instance.
htx_get_head_blk() is used at plenty of places, many of which are known
to be safe, but the function checks for the presence of a first block
and returns NULL if it doesn't exist. While it's properly used, it makes
compilers complain at -Os on stream.c and mux_fcgi.c because they probably
don't propagate variables far enough to see that there's no risk.
Let's add an unchecked version for these use cases.
Add ->inc_err_cnt new callback to qcc_app_ops struct which can
be called from xprt to increment the application level error code counters.
It take the application context as first parameter to be generic and support
new QUIC applications to come.
Add h3_stats.c module with counters for all the frame types and error codes.
Add new counters to count the number of dropped packet upon parsing error, lost
sent packets and the number of stateless reset packet sent.
Take the oppportunity of this patch to rename CONN_OPENINGS to QUIC_ST_HALF_OPEN_CONN
(total number of half open connections) and QUIC_ST_HDSHK_FAILS to QUIC_ST_HDSHK_FAIL.
This is becoming difficult to distinguish the default values for
transport parameters which come with the RFC from our implementation
default values when not set by configuration (tunable parameters).
Add a comment to distinguish them.
Prefix these default values by QUIC_TP_DFLT_ to distinguish them from
QUIC_DFLT_* value even if there are not numerous.
Furthermore ->max_udp_payload_size must be first initialized to
QUIC_TP_DFLT_MAX_UDP_PAYLOAD_SIZE especially for received value.
Add tunable "tune.quic.frontend.max_streams_bidi" setting for QUIC frontends
to set the "initial_max_streams_bidi" transport parameter.
Add some documentation for this new setting.
Add two tunable settings both for backends and frontends "max_idle_timeout"
QUIC transport parameter, "tune.quic.frontend.max-idle-timeout" and
"tune.quic.backend.max-idle-timeout" respectively.
cfg_parse_quic_time() has been implemented to parse a time value thanks
to parse_time_err(). It should be reused for any tunable time value to be
parsed.
Add the documentation for this tunable setting only for frontend.
Add quic_transport_params_dump() static inline function to do so for
a quic_transport_parameters struct as parameter.
We use the trace API do dump these transport parameters both
after they have been initialized (RX/local) or received (TX/remote).
Make the transport parameters be standlone as much as possible as
it consists only in encoding/decoding data into/from buffers.
Reduce the size of xprt_quic.h. Unfortunalety, I think we will
have to continue to include <xprt_quic-t.h> to use the trace API
into this module.
The only two places where it was used was to carefully preserve the
SE_FL_WILL_CONSUME flag (since others are irrelevant there and the
previous RXBLK* flags moved to the stconn). Now that the flag is
cleared by default there's no need to re-created a fresh new one
when replacing the descriptor, so we can eliminate that remaining
trick.
This flag was the only remaining one that was inverted as a blocking
condition, requiring special handling to preset it on sedesc allocation.
Let's flip it in its definition and accessors.
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. Both the core functions and the ones in the
resolvers files were updated.
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. The HTTP analyser and the backend functions
were all updated after being reviewed. Function stream_update_both_cs()
was renamed to stream_update_both_sc()
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. The "nb_cs" stream-connector counter was
renamed to "nb_sc" and qc_attach_cs() was renamed to qc_attach_sc().
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. The change is huge (~580 lines), so extreme
care was given not to change anything else.
The check struct had a "cs" field renamed to "sc", which also required
a tiny update to a few functions using it to distinguish a check from
a stream (log.c, payload.c, ssl_sample.c, tcp_sample.c, tcpcheck.c,
connection.c).
Function arguments and local variables called "cs" were renamed to "sc".
The presence of one "cs=" in the debugging traces was also turned to
"sc=" for consistency.
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
This file contains all the stream-connector functions that are specific
to application layers of type stream. So let's name it accordingly so
that it's easier to figure what's located there.
The alphabetical ordering of include files was preserved.
An equivalent applet_need_more_data() was added as well since that function
is mostly used from applet code. It makes it much clearer that the applet
is waiting for data from the stream layer.
These ones are essentially for the stream endpoint, let's give them a
name that matches the intent. Equivalent versions were provided in the
applet namespace to ease code legibility.
These ones were used exactly once and together, in sc_is_send_allowed().
No need to give them confusing names, instead let's just put the flags,
they're way more explicit, and drop the two functions.
This flag indicates the that stream endpoint is willing to consume output
data from the stream. Its new name makes this more explicit. The function
names will be updated accordingly, which will remove the disturbing "get"
everywhere.
The following flags are not at all related to the endpoint but to the
connector itself:
- SE_FL_RXBLK_ROOM
- SE_FL_RXBLK_BUFF
- SE_FL_RXBLK_CHAN
As such they have no business staying in the endpoint descriptor and
they must move to the stream connector. They've also been renamed
accordingly to better match what they correspond to (the same name
as the function that sets them).
The rare occurrences of cs_rx_blocked() were replaced by an explicit
test on the list of flags. The reason is that cs_rx_blocked() used to
preserve some tests that are not needed at certain places since already
known. For the same reason SE_FL_RXBLK_ANY wasn't converted. As such it
will later be possible to carefully review these few locations and
eliminate the unneeded flags from the tests. No particular function
was made to test them since they're explicit enough.
It now looks like ci_putchk() and friends could very well place the flag
themselves on the connector when they detect a buffer full condition, as
this would significantly simplify the high-level API. But all usages must
first be reviewed before this simplification can be done. For now it
remains done by applet_put*() instead.
It's more explicit this way. The cs_rx_endp_ready() function could be
removed so that the flag is directly tested. In the future it should
be inverted and the few places where it's set (or preserved via
SE_FL_APP_MASK) could be dropped.
At plenty of places we combine multiple flags checks to determine if we
can receive (endp_ready, rx_blocked, cf_shutr etc). Let's group them
under a single function that is meant to replace existing tests.
Some tests were only checking the rxblk flags at the connection level,
so for now they were not converted, this requires a bit of auditing
first, and probably a test to determine whether or not to check for
cf_shutr (e.g. there is none if no stream is present).
The analysis of cs_rx_endp_more() showed that the purpose is for a stream
endpoint to inform the connector that it's ready to deliver more data to
that one, and conversely cs_rx_endp_done() that it's done delivering data
so it should not be bothered again for this.
This was modified two ways:
- the operation is no longer performed on the connector but on the
endpoint so that there is no more doubt when reading applet code
about what this rx refers to; it's the endpoint that has more or
no more data.
- an applet implementation is also provided and mostly used from
applet code since it saves the caller from having to access the
endpoint descriptor.
It's visible that the flag ought to be inverted because some places
have to set it by default for no reason.
These functions are used by the application layer to disable or enable
reading at the stream connector's level when the input buffer failed to
be allocated (or was finally allocated). The new names makes things
clearer.
These functions were used by the channel to inform the lower layer
whether reading was acceptable or not. Usually this directly mimmicks
the CF_DONT_READ flag from the channel, which may be set when it's
desired not to buffer incoming data that will not be processed, or
that the buffer wants to be flushed before starting to read again,
or that bandwidth limiting might be enforced, etc. It's always a
policy reason, not a purely resource-based one.
The new name mor eclearly indicates that a stream connector cannot make
any more progress because it needs room in the channel buffer, or that
it may be unblocked because the buffer now has more room available. The
testing function is sc_waiting_room(). This is mostly used by applets.
Note that the flags will change soon.
This makes SE_FL_APPLET_NEED_CONN autonomous, in that we check for it
everywhere we have a relevant cs_rx_blocked(), so that the flag doesn't
need anymore to be covered by cs_rx_blocked(). Indeed, this flag doesn't
really translate a receive blocking condition but rather a refusal to
wake up an applet that is waiting for a connection to finish to setup.
This also ensures we will not risk to set it back on a new endpoint
after cs_reset_endp() via SE_FL_APP_MASK, because the flag being
specific to the endpoint only and not to the connector, we don't
want to preserve it when replacing the endpoint.
It's possible that cs_chk_rcv() could later be further simplified if
we can demonstrate that the two tests in it can be merged.
This flag is exclusively used when a front applet needs to wait for the
other side to connect (or fail to). Let's give it a more explicit name
and remove the ambiguous function that was used only once.
This also ensures we will not risk to set it back on a new endpoint
after cs_reset_endp() via SE_FL_APP_MASK, because the flag being
specific to the endpoint only and not to the connector, we don't
want to preserve it when replacing the endpoint.
This flag is no more needed, it was only set on shut read to be tested
by cs_rx_blocked() which is now properly tested for shutr as well. The
cs_rx_blk_shut() calls were removed. Interestingly it allowed to remove
a special case in the L7 retry code.
This also ensures we will not risk to set it back on a new endpoint
after cs_reset_endp() via SE_FL_APP_MASK.
One flag (RXBLK_SHUT) is always set with CF_SHUTR, so in order to remove
it, we first need to make sure we always check for CF_SHUTR where
cs_rx_blocked() is being used.
When a shutdown(WR) is performed, send is no longer allowed, and that is
currently handled by the explicit cs_done_get() in the various shutw()
calls. That's a bit ugly and complicated for no reason, let's simply
integrate the test of SHUTW in sc_is_send_allowed().
Note that the test could also be added wherever sc_is_send_allowed() is
used but for now proceeding like this limits the changes.
sc_is_send_allowed() is now used everywhere instead of the combination
of cs_tx_endp_ready() && !cs_tx_blocked(). There's no place where we
need them individually thus it's simpler. The test was placed in cs_util
as we'll complete it later.
A number of functions in cs_utils.h are not usable from functions taking
a const because they're not declared as using const, despite never
modifying the stconn. Let's address this for the following ones:
sc_ic(), sc_oc(), sc_ib(), sc_ob(), sc_strm_task(),
cs_opposite(), sc_conn_ready(), cs_src(), cs_dst(),
First it applies to the stream endpoint and not the conn_stream, and
second it only tests and touches the flags so it makes sense to call
it se_fl_ like other functions which only manipulate the flags, as
it's just a special case of flags.
It returns an stconn from a connection and not the opposite, so the name
change was more appropriate. In addition it was moved to connection.h
which manipulates the connection stuff, and it happens that only
connection.c uses it.
The following functions which act on a connection-based stream connector
were renamed to sc_conn_* (~60 places):
cs_conn_drain_and_shut
cs_conn_process
cs_conn_read0
cs_conn_ready
cs_conn_recv
cs_conn_send
cs_conn_shut
cs_conn_shutr
cs_conn_shutw
The function doesn't return a pointer to the mux but to the mux stream
(h1s, h2s etc). Let's adjust its name to reflect this. It's rarely used,
the name can be enlarged a bit. And of course s/cs/sc to accommodate for
the updated name.
These functions return the app-layer associated with an stconn, which
is a check, a stream or a stream's task. They're used a lot to access
channels, flags and for waking up tasks. Let's just name them
appropriately for the stream connector.
We're starting to propagate the stream connector's new name through the
API. Most call places of these functions that retrieve the channel or its
buffer are in applets. The local variable names are not changed in order
to keep the changes small and reviewable. There were ~92 uses of cs_ic(),
~96 of cs_oc() (due to co_get*() being less factorizable than ci_put*),
and ~5 accesses to the buffer itself.
The vast majority of calls to ci_putchk() etc are performed from applets
which directly know an endpoint. Figuring the correct API (writing into
input channel etc) isn't trivial for newcomers, and knowing that they
must mark the flag indicating a buffer full condition isn't trivial
either.
Here we're adding wrappers to these functions but to be used directly
from the appctx. That's already what is being done in multiple steps in
the applet code, where the endp is derived from the appctx, then the cs
from the endp, then the stream from the cs, then the channel from the
stream, and so on. But this time the function doesn't require to know
much of the internals, applet_putchr() writes a char from the appctx,
and marks the buffer full if needed. Period. This will allow to remove
a significant amount of obscure ci_putchk() and cs_ic() calls from the
code, hence a significant number of possible mistakes.
For historical reasons (stream-interface and connections), we used to
require two independent fields for the application level callbacks and
the transport-level functions. Over time the distinction faded away so
much that the low-level functions became specific to the application
and conversely. For example, applets may only work with streams on top
since they rely on the channels, and the stream-level functions differ
between applets and connections. Right now the application level only
contains a wake() callback and the low-level ones contain the functions
that act at the lower level to perform the shutr/shutw and at the upper
level to notify about readability and writability. Let's just merge them
together into a single set and get rid of this confusing distinction.
Note that the check ops do not define any app-level function since these
are only called by streams.
We currently call all ->shutr, ->chk_snd etc from ->ops unconditionally,
while the ->wake() call from data_cb is checked. Better check ops as
well for consistency, this will help get them merged.
This also follows the natural naming. There are roughly 238 changes, all
totally trivial. conn_stream-t.h has become completely void of any
"conn_stream" related stuff now (except its name).
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
Let's start to introduce the stream connector at the app_ops level.
This is entirely self-contained into conn_stream.c. The functions
were also updated to reflect the new name, and the comments were
updated.
Just like for the appctx, this is a pointer to a stream endpoint descriptor,
so let's make this explicit and not confuse it with the full endpoint. There
are very few changes thanks to the preliminary refactoring of the flags
manipulation.
Now at least it makes it obvious that it's the stream endpoint descriptor
and not an endpoint. There were few changes thanks to the previous refactor
of the flags.
After some discussion we found that the cs_endpoint was precisely the
descriptor for a stream endpoint, hence the naturally coming name,
stream endpoint constructor.
This patch renames only the type everywhere and the new/init/free functions
to remain consistent with it. Future patches will address field names and
argument names in various code areas.
That's the "stream endpoint" pointer. Let's change it now while it's
not much spread. The function __cs_endp_target() wasn't yet renamed
because that will change more globally soon.
This changes all main uses of endp->flags to the se_fl_*() equivalent
by applying coccinelle script endp_flags.cocci. The se_fl_*() functions
themselves were manually excluded from the change, of course.
Note: 144 locations were touched, manually reviewed and found to be OK.
The script was applied with all includes:
spatch --in-place --recursive-includes -I include --sp-file $script $files
This changes all main uses of cs->endp->flags to the sc_ep_*() equivalent
by applying coccinelle script cs_endp_flags.cocci.
Note: 143 locations were touched, manually reviewed and found to be OK,
except a single one that was adjusted in cs_reset_endp() where the flags
are read and filtered to be used as-is and not as a boolean, hence was
replaced with sc_ep_get() & $FLAGS.
The script was applied with all includes:
spatch --in-place --recursive-includes -I include --sp-file $script $files
At plenty of places we need to manipulate the conn_stream's endpoint just
to set or clear a flag. This patch adds a handful of functions to perform
the common operations (clr/set/get etc) on these flags at both the endpoint
and at the conn_stream level.
The functions were named after the target names, i.e. se_fl_*() to act on
the stream endpoint flags, and sc_ep_* to manipulate the endpoint flags
from the stream connector (currently conn_stream).
For now they're not used.
This one is exclusively used by the connection, regardless its generic
name "ctx" is rather confusing. Let's make it a struct connection* and
call it "conn". This way there's no doubt about what it is and there's
no way it will be used by accident by being taken for something else.
This reverts commit d9404b464f.
In fact, there is a BUG_ON() in __task_free() function to be sure the task
is no longer in the wait-queue or the run-queue. Because the patch tries to
fix a "leak" on deinit, it is safer to revert it. there is no reason to
introduce potential bug for this kind of issues. And there is no reason to
impact the normal use-cases at runtime with additionnal conditions to only
remove a task on deinit.
Bring some improvment to h3_parse_settings_frm() function. The first one
is the parsing which now manipulates a buffer instead of a plain char*.
This is more to unify with other parsing functions rather than dealing
with data wrapping : it's unlikely to happen as SETTINGS is only
received as the first frame on the control STREAM.
Various errors are now properly reported as connection error :
* on incomplete frame payload
* on a duplicated settings in the same frame
* on reserved settings receive
As specified by HTTP/3 draft, an unknown unidirectional stream can be
aborted. To do this, use a new flag QC_SF_READ_ABORTED. When the MUX
detects this flag, QCS instance is automatically freed.
Previously, such streams were instead automatically drained. By aborting
them, we economize some useless memcpy instruction. On future data
reception, QCS instance is not found in the tree and considered as
already closed. The frame payload is thus deleted without copying it.
Remove all unnecessary bits of code for H3 unidirectional streams. Most
notable, an individual tasklet is not require anymore for each stream.
This is useless since the merge of RX/TX uni streams handling with
bidirectional streams code.
The whole QUIC stack is impacted by this change :
* at quic-conn level, a single function is now used to handle uni and
bidirectional streams. It uses qcc_recv() function from MUX.
* at MUX level, qc_recv() io-handler function does not skip uni streams
* most changes are conducted at app layer. Most notably, all received
data is handle by decode_qcs operation.
Now that decode_qcs is the single app read function, the H3 layer can be
simplified. Uni streams parsing was extracted from h3_attach_ruqs() to
h3_decode_qcs().
h3_decode_qcs() is able to deal with all HTTP/3 frame types. It first
check if the frame is valid for the H3 stream type. Most notably,
SETTINGS parsing was moved from h3_control_recv() into h3_decode_qcs().
This commit has some major benefits besides removing duplicated code.
Mainly, QUIC flow control is now enforced for uni streams as with bidi
streams. Also, an unknown frame received on control stream does not set
an error : it is now silently ignored as required by the specification.
Some cleaning in H3 code is already done with this patch :
h3_control_recv() and h3_attach_ruqs() are removed as they are now
unused. A final patch should clean up the unneeded remaining bit.
Define a new enum h3s_t. This is used to differentiate between the
different stream types used in a HTTP/3 connection, including the QPACK
encoder/decoder streams.
For the moment, only bidirectional streams is positioned. This patch
will be useful to unify reception of uni streams with bidirectional
ones.
Replace h3_uqs type by qcs in stream callbacks. This change is done in
the context of unification between bidi and uni-streams. h3_uqs type
will be unneeded when this is achieved.
Complete quic-conn API for error reporting. A new parameter <app> is
defined in the function quic_set_connection_close(). This will transform
the frame into a CONNECTION_CLOSE_APP type.
This type of frame will be generated by the applicative layer, h3 or
hq-interop for the moment. A new function qcc_emit_cc_app() is exported
by the MUX layer for them.
Do not allocate cs_endpoint for every QCS instances in qcs_new().
Instead, this is delayed to qc_attach_cs() function.
In effect, with H3 as app protocol, cs_endpoint will be allocated on
HEADERS parsing. Thus, no cs_endpoint is allocated for H3 unidirectional
streams which do not convey any HTTP data.
A running or queued task is not released when task_destroy() is called,
except if it is the current task. Its process function is set to NULL and we
let the scheduler to release the task. However, when HAProxy is stopping, it
never happens and some tasks may leak. To fix the issue, we now also rely on
the global MODE_STOPPING flag. When this flag is set, the task is always
immediately released.
This patch should fix the issue #1714. It could be backported as far as 2.4
but it's not a real problem in practice because it only happens on
deinit. The leak exists on previous versions but not MODE_STOPPING flag.
the cpuset api changes done fir the future 14 release had been
backported to the 13.1 release so changing the cpuset api of choice
condition change accordingly.
In order to be able to check compatibility between muxes and transport
layers, we'll need a new flag to tag muxes that work on framed transport
layers like QUIC. Only QUIC has this flag now.
Let's collect the set of xprt-level and sock-level dgram/stream protocols
seen on a bind line and store that in the bind_conf itself while they're
being parsed. This will make it much easier to detect incompatibilities
later than the current approch which consists in scanning all listeners
in post-parsing.
There is no way to store useful info there, yet there's about one entry
per boolean. Let's add an "options" attribute which will collect various
options.
In practice, even the BC_O_SSL_* flags and a few info such as strict_sni
could move there.
The "bind" parsing code was duplicated for the peers section and as a
result it wasn't kept updated, resulting in slightly different error
behavior (e.g. errors were not freed, warnings were emitted as alerts)
Let's first unify it into a new dedicated function that properly reports
and frees the error.
There's been some great confusion between proto_type, ctrl_type and
sock_type. It turns out that ctrl_type was improperly chosen because
it's not the control layer that is of this or that type, but the
transport layer, and it turns out that the transport layer doesn't
(normally) denaturate the underlying control layer, except for QUIC
which turns dgrams to streams. The fact that the SOCK_{DGRAM|STREAM}
set of values was used added to the confusion.
Let's replace it with xprt_type which reuses the later introduced
PROTO_TYPE_* values, and update the comments to explain which one
works at what level.
Send a CONNECTION_CLOSE if the peer emits more data than authorized by
our flow-control. This is implemented for both stream and connection
level.
Fields have been added in qcc/qcs structures to differentiate received
offsets for limit enforcing with consumed offsets for sending of
MAX_DATA/MAX_STREAM_DATA frames.
Define an API to easily set a CONNECTION_CLOSE. This will mainly be
useful for the MUX when an error is detected which require to close the
whole connection.
On the MUX side, a new flag is added when a CONNECTION_CLOSE has been
prepared. This will disable add future send operations.
This QUIC specific keyword may be used to set the theshold, in number of
connection openings, beyond which QUIC Retry feature will be automatically
enabled. Its default value is 100.
First commit to handle the QUIC stats counters. There is nothing special to say
except perhaps for ->conn_openings which is a gauge to count the number of
connection openings. It is incremented after having instantiated a quic_conn
struct, then decremented when the handshake was successful (handshake completed
state) or failed or when the connection timed out without reaching the handshake
completed state.
If we add a new stats module to C source files including only
stats.h we get these errors:
include/haproxy/stats.h:39:31: error: array type has incomplete element type
‘struct name_desc’
39 | extern const struct name_desc stat_fields[];
include/haproxy/stats.h:55:50: warning: ‘struct listener’ declared inside
parameter list will not be visible outside of this definition or declaration
55 | int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
name_desc struct is defined in tools-t.h and listener struct in listner-t.h.
Here is the format of a token:
- format (1 byte)
- ODCID (from 9 up 21 bytes)
- creation timestamp (4 bytes)
- salt (16 bytes)
A format byte is required to distinguish the Retry token from others sent in
NEW_TOKEN frames.
The Retry token is ciphered after having derived a strong secret from the cluster secret
and generated the AEAD AAD, as well as a 16 bytes long salt. This salt is
added to the token. Obviously it is not ciphered. The format byte is not
ciphered too.
The AAD are built by quic_generate_retry_token_aad() which concatenates the version,
the client SCID and the IP address and port. We had to implement quic_saddr_cpy()
to copy the IP address and port to the AAD buffer. Only the Retry SCID is generated
on our side to build a Retry packet, the others fields come from the first packet
received by the client. It must reuse this Retry SCID in response to our Retry packet.
So, we have not to store it on our side. Everything is offloaded to the client (stateless).
quic_generate_retry_token() must be used to generate a Retry packet. It calls
quic_pkt_encrypt() to cipher the token.
quic_generate_retry_check() must be used to check the validity of a Retry token.
It is able to decipher a token which arrives into an Initial packet in response
to a Retry packet. It calls parse_retry_token() after having deciphered the token
to store the ODCID into a local quic_cid struct variable. Finally this ODCID may
be stored into the transport parameter thanks to qc_lstnr_params_init().
The Retry token lifetime is 10 seconds. This lifetime is also checked by
quic_generate_retry_check(). If quic_generate_retry_check() fails, the received
packet is dropped without anymore packet processing at this time.
This function does exactly the same thing as quic_tls_decrypt(), except that
it does reuse its input buffer as output buffer. This is needed
to decrypt the Retry token without modifying the packet buffer which
contains this token. Indeed, this would prevent us from decryption
the packet itself as the token belong to the AEAD AAD for the packet.
This function must be used to derive strong secrets from a non pseudo-random
secret (cluster-secret setting in our case) and an IV. First it call
quic_hkdf_extract_and_expand() to do that for a temporary strong secret (tmpkey)
then two calls to quic_hkdf_expand() reusing this strong temporary secret
to derive the final strong secret and IV.
The local and remote TPs were both processed through the same function
quic_transport_params_init(). This caused the remote TPs to be
overwritten with values configured for our local usage.
Change this by reserving quic_transport_params_init() only for our local
TPs. Remote TPs are simply initialized via
quic_dflt_transport_params_cpy().
This bug could result in a connection closed in error by the client due
to a violation of its TPs. For example, curl client closed the
connection after receiving too many CONNECTION_ID due to an invalid
active_connection_id value used.
Instead of using the health-check to subscribe to read/write events, we now
rely on the conn-stream. Indeed, on the server side, the conn-stream's
endpoint is a multiplexer. Thus it seems appropriate to handle subscriptions
for read/write events the same way than for the streams. Of course, the I/O
callback function is not the same. We use srv_chk_io_cb() instead of
cs_conn_io_cb().
event_srv_chk_io() function is renamed srv_chk_io_cb() to be consistant with
the I/O callback function of connections. In addition, this function is
exported. It will be required to use the conn-stream's subscriptions.
This commit is similar to the previous one but deals with MAX_DATA for
connection-level data flow control. It uses the same function
qcc_consume_qcs() to update flow control level and generate a MAX_DATA
frame if needed.
Send MAX_STREAM_DATA frames when at least half of the allocated
flow-control has been demuxed, frame and cleared. This is necessary to
support QUIC STREAM with received data greater than a buffer.
Transcoders must use the new function qcc_consume_qcs() to empty the QCS
buffer. This will allow to monitor current flow-control level and
generate a MAX_STREAM_DATA frame if required. This frame will be emitted
via qc_io_cb().
Adjust the mechanism for MAX_STREAMS_BIDI emission. When a bidirectional
stream is removed, current flow-control level is checked. If needed, a
MAX_STREAMS_BIDI frame is generated and inserted in a new list in the
QCS instance. The new frames will be emitted at the start of qc_send().
This has no impact on the current MAX_STREAMS_BIDI behavior. However,
this mechanism is more flexible and will allow to implement quickly
MAX_STREAM_DATA/MAX_DATA emission.
Slightly change the interface for qcc_recv() between MUX and XPRT. The
MUX is now responsible to call qcc_decode_qcs(). This is cleaner as now
the XPRT does not have to deal with an extra QCS parameter and the MUX
will call qcc_decode_qcs() only if really needed.
This change is possible since there is no extra buffering for
out-of-order STREAM frames and the XPRT does not have to handle buffered
frames.
This commit is an adaptation from the following patch :
commit d0de677682
Author: Willy Tarreau <w@1wt.eu>
Date: Fri Feb 4 09:05:37 2022 +0100
BUG/MINOR: mux-h2: update the session's idle delay before creating the stream
This should fix the incorrect timeouts present in httplog format for
QUIC requests.
First adjusted some typos in comments inside the function. Second,
change the naming of some variable to reduce confusion.
A special case has been inserted when advance is done inside a GAP block
and this block is the last of the buffer. In this case, the whole buffer
will be emptied, equivalent to a ncb_init() operation.
quic_rx_pkts_del() function removes packets from QUIC RX buffer. In most
cases, the buffer will be emptied after it. In this case, it's useful to
realign it. This will avoid future data wrapping and use of an
unnecessary junk to fill a too small contiguous space.
It is now possible to start an appctx on a thread subset. Some controls were
added here and there. It is forbidden to start a backend appctx on another
thread than the local one. If a frontend appctx is started on another thread
or a thread subset, the applet .init callback function must be defined. This
callback function is responsible to finalize the appctx startup. It can be
performed synchornously. In this case, the appctx is started on the local
thread. It is not really useful but it is valid. Or it can be performed
asynchronously. In this case, .init callback function is called when the
appctx is woken up for the first time. When this happens, the appctx
affinity is set to the current thread to be able to start the session and
the stream.
In the same way than for the tasks, the applets api was changed to be able
to start a new appctx on a thread subset. For now the feature is
disabled. Only appctx_new_here() is working. But it will be possible to
start an appctx on a specific thread or a subset via a mask.
appctx_free_on_early_error() must be used to release a freshly created
frontend appctx if an error occurred during the init stage. It takes care to
release the stream instead of the appctx if it exists. For a backend appctx,
it just calls appctx_free().
appctx_finalize_startup() may be used to finalize the frontend appctx
startup. It is responsible to create the appctx's session and the frontend
conn-stream. On error, it is the caller responsibility to release the
appctx. However, the session is released if it was created. On success, if
an error is encountered in the caller function, the stream must be released
instead of the appctx.
This function should ease the init stage when new appctx is created.
It is just a helper function that call the .init applet callback function,
if it exists. This will simplify a bit the init stage when a new applet is
started. For now, this callback function is only used when a new service is
started.
Applets were moved at the same level than multiplexers. Thus, gradually,
applets code is changed to be less dependent from the stream. With this
commit, the frontend appctx are ready to own the session. It means a
frontend appctx will be responsible to release the session.
When HAProxy is linked to an OpenSSLv3 library, this option can be used
to load a provider during init. You can specify multiple ssl-provider
options, which will be loaded in the order they appear. This does not
prevent OpenSSL from parsing its own configuration file in which some
other providers might be specified.
A linked list of the providers loaded from the configuration file is
kept so that all those providers can be unloaded during cleanup. The
providers loaded directly by OpenSSL will be freed by OpenSSL.
The "http-restrict-req-hdr-names" option can now be set to restrict allowed
characters in the request header names to the "[a-zA-Z0-9-]" charset.
Idea of this option is to not send header names with non-alphanumeric or
hyphen character. It is especially important for FastCGI application because
all those characters are converted to underscore. For instance,
"X-Forwarded-For" and "X_Forwarded_For" are both converted to
"HTTP_X_FORWARDED_FOR". So, header names can be mixed up by FastCGI
applications. And some HAProxy rules may be bypassed by mangling header
names. In addition, some non-HTTP compliant servers may incorrectly handle
requests when header names contain characters ouside the "[a-zA-Z0-9-]"
charset.
When this option is set, the policy must be specify:
* preserve: It disables the filtering. It is the default mode for HTTP
proxies with no FastCGI application configured.
* delete: It removes request headers with a name containing a character
outside the "[a-zA-Z0-9-]" charset. It is the default mode for
HTTP backends with a configured FastCGI application.
* reject: It rejects the request with a 403-Forbidden response if it
contains a header name with a character outside the
"[a-zA-Z0-9-]" charset.
The option is evaluated per-proxy and after http-request rules evaluation.
This patch may be backported to avoid any secuirty issue with FastCGI
application (so as far as 2.2).
quic_rx_strm_frm type was used to buffered STREAM frames received out of
order. Now the MUX is able to deal directly with these frames and
buffered it inside its ncbuf.
Rx has been simplified since the conversion of buffer to a ncbuf. The
old buffer can now be removed. The frms tree is also removed. It was
used previously to stored out-of-order received STREAM frames. Now the
MUX is able to buffer them directly into the ncbuf.
Add a ncbuf for data reception on qcs. Thanks to this, the MUX is able
to buffered all received frame directly into the buffer. Flow control
parameters will be used to ensure there is never an overflow.
This change will simplify Rx path with the future deletion of acked
frames tree previously used for frames out of order.
Redefine the initial local flow-control to enforce by us. Use bufsize as
the maximum offset allowed to be received.
This change is part of an adjustement on the Rx path. Mux buffer will be
converted to a ncbuf. Flow-control parameters must ensure that we never
receive a frame larger than the buffer. With this, all received frames
will be stored in the MUX buffer.
The two functions became exact copies since there's no more special case
for the appctx owner. Let's merge them into a single one, that simplifies
the code.