This commit is inoffensive but will allow some code refactoring in
existing proxy http options. Newly created http-related proxy options
will also benefit from this.
Just like http_append_header(), but this time to insert a new value before
an existing one.
If the header already contains one or multiple values, a ',' is automatically
inserted after the new value.
This function may be called as an alternative to http_replace_header_value()
to append a new value to an existing header instead of replacing the whole
header content.
If the header already contains one or multiple values, a ',' is automatically
appended before the new value.
This function is not meant for prepending (providing an empty ctx value),
in which case we should consider implementing a dedicated prepend
alternative function.
Till now pseudo headers were passed as const strings, but having them as
ISTs will be more convenient for traces. This doesn't change anything for
strings which are derived from them (and being constants they're still
zero-terminated).
TRACE_PRINTF() can be used to produce arbitrary trace contents at any
trace level. It uses the exact same arguments as other TRACE_* macros,
but here they are mandatory since they are followed by the format-string,
though they may be filled with zeroes. The reason for the arguments is to
match tracking or filtering and not pollute other non-inspected objects.
It will probably be used inside loops, in which case there are a few points
to be careful about:
- output atomicity is only per-message, so competing threads may see
their messages interleaved. As such, it is recommended that the
caller places a recognizable unique context at the beginning of the
message such as a connection pointer.
- iterating over arrays or lists for all requests could be very
expensive. In order to avoid this it is best to condition the call
via TRACE_ENABLED() with the same arguments, which will return the
same decision.
- messages longer than TRACE_MAX_MSG-1 (1023 by default) will be
truncated.
For example, in order to dump the list of HTTP headers between hpack
and h2:
  if (outlen > 0 &&
      TRACE_ENABLED(TRACE_LEVEL_DEVELOPER,
                    H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0)) {
          int i;

          for (i = 0; list[i].n.len; i++)
                  TRACE_PRINTF(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR,
                               h2c->conn, 0, 0, 0, "h2c=%p hdr[%d]=%s:%s", h2c, i,
                               list[i].n.ptr, list[i].v.ptr);
  }
In addition, a lower-level TRACE_PRINTF_LOC() macro is provided; it takes
two extra arguments, the caller's location and the caller's function name.
This makes it possible to emit composite traces from central functions on
behalf of another one.
By default, passing a NULL cb to the trace functions will result in the
source's default one being used. In some cases we won't want to use any
callback at all, not even the default one. Let's define a trace_no_cb()
function for this, which does absolutely nothing.
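As an illustration, such a no-op callback is essentially the following (a
minimal sketch assuming the usual trace callback prototype; not necessarily
the exact final code):

  void trace_no_cb(enum trace_level level, uint64_t mask,
                   const struct trace_source *src,
                   const struct ist where, const struct ist func,
                   const void *a1, const void *a2, const void *a3, const void *a4)
  {
          /* intentionally do nothing: the caller wants the trace decision
           * and message handling without any per-source decoding.
           */
  }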
Sometimes it would be necessary to prepare some messages, pre-process some
blocks or maybe duplicate some contents before they vanish for the purpose
of tracing them. However we don't want to do that for everything that is
submitted to the traces, it's important to do it only for what will really
be traced.
The __trace() function has all the knowledge for this, to the point of even
checking the lockon pointers. This commit splits the function in two, one
with the trace decision logic, and the other one for the trace production.
The first one is now usable through wrappers such as _trace_enabled() and
TRACE_ENABLED() which will indicate whether traces are going to be produced
for the current source, level, event mask, parameters and tracking.
There are ifdefs in several places to only define TRC_ARGS_QCON when
QUIC is defined, but nothing prevents this code from building without it.
Let's just remove those ifdefs; the single "if" they avoid is not worth
the extra maintenance burden.
As specified by the HTTP/3 RFC, the SETTINGS frame should be sent as soon
as possible. Before this patch, this was only done on the first qc_send()
invocation. This significantly delayed SETTINGS emission until the first
H3 response was ready to be transferred.
This patch fixes this by ensuring SETTINGS is emitted when the MUX-QUIC is
being set up.
As a side point, the return value of the finalize operation is checked. This
means that an error during SETTINGS emission will cause the connection
init to fail.
This should be backported up to 2.7.
thread_harmless_end() needs to wait for rdv_requests to disappear so
that we're certain to respect a harmless promise that possibly allowed
another thread to proceed under isolation. But this doesn't work in a
signal handler because a thread could be interrupted by the debug
handler while already waiting for isolation and with rdv_requests > 0.
As such this function could cause a deadlock in such a signal handler.
Let's implement a specific variant for this, thread_harmless_end_sig(),
that just resets the thread's bit and doesn't wait. It must of course
not be used past a check point that would allow the isolation requester
to return and see the thread as temporarily harmless and then going back
on its promise.
This will be needed to fix a race in the debug handler.
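A minimal sketch of what this variant looks like (assuming the current
thread-group fields; the exact code may differ):

  void thread_harmless_end_sig(void)
  {
          /* only clear this thread's harmless bit; unlike
           * thread_harmless_end(), do not wait for rdv_requests to drop to
           * zero, since we may be running inside a signal handler.
           */
          HA_ATOMIC_AND(&tg_ctx->threads_harmless, ~ti->ltid_bit);
  }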
A few loops waiting for threads to synchronize such as thread_isolate()
rightfully filter the thread masks via the threads_enabled field that
contains the list of enabled threads. However, it doesn't use an atomic
load on it. Before 2.7, the equivalent variables were marked as volatile
and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and
the risk that the compiler keeps them in a register inside a loop is not
null at all. In practice, when ha_thread_relax() calls sched_yield() or
uses an x86 PAUSE instruction, it could be verified that the variable is
always reloaded. If these are avoided (e.g. an architecture providing
neither solution), it's visible in the asm code that the variables are not
reloaded. In this case, if a thread exits just between the moment the
two values are read, the loop could spin forever.
This patch adds the required _HA_ATOMIC_LOAD() on the relevant
threads_enabled fields. It must be backported to 2.7.
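The pattern being fixed looks roughly like this (an illustrative sketch
with simplified names, not the exact haproxy code):

  /* without the explicit atomic loads, the compiler may keep the masks in
   * registers and never observe a thread exiting while we spin.
   */
  for (;;) {
          unsigned long harmless = _HA_ATOMIC_LOAD(&grp_ctx->threads_harmless);
          unsigned long enabled  = _HA_ATOMIC_LOAD(&grp_info->threads_enabled);

          if ((harmless & enabled) == enabled)
                  break;
          ha_thread_relax();
  }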
Slightly adjust b_quic_enc_int(). This function is used to encode an
integer as a QUIC varint in a struct buffer.
A new parameter is added to the function API to specify the width of the
encoded integer. By default, 0 should be used to ensure that the minimum
space is used. Other valid values are 1, 2, 4 or 8. An error is reported
if the width is not large enough.
This new parameter will be useful when buffer space is reserved prior to
encoding an unknown integer value. The maximum size of 8 bytes will be
reserved and some data can be put after. When finally encoding the
integer, the width can be requested to be 8 bytes.
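For reference, the encoding scheme is roughly the following (an
illustrative sketch only, not the exact b_quic_enc_int() code):

  /* encode <val> as a QUIC variable-length integer (RFC 9000) into <out>,
   * forcing <width> bytes (1, 2, 4 or 8), or the minimal width if 0.
   * Returns the number of bytes written, or 0 on error.
   */
  static int enc_varint(unsigned char *out, uint64_t val, int width)
  {
          int need = (val < 0x40) ? 1 : (val < 0x4000) ? 2 :
                     (val < 0x40000000) ? 4 : 8;

          if (val >= (1ULL << 62))
                  return 0; /* not encodable as a varint */
          if (!width)
                  width = need;
          else if ((width != 1 && width != 2 && width != 4 && width != 8) ||
                   width < need)
                  return 0; /* invalid or too small width */

          for (int i = width - 1; i >= 0; i--) {
                  out[i] = val & 0xff;
                  val >>= 8;
          }
          /* the two most significant bits of the first byte encode the width */
          out[0] |= (width == 1) ? 0x00 : (width == 2) ? 0x40 :
                    (width == 4) ? 0x80 : 0xc0;
          return width;
  }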
With this new parameter, a small refactoring of the function has been
conducted to remove some useless internal variables.
This should be backported up to 2.7. It will be mostly useful to
implement H3 trailers encoding.
This function was introduced in OpenSSL 1.1.0. Prior to that, the
ECDSA_SIG structure was public.
This function was used in commit 5a8f02ae "BUG/MEDIUM: jwt: Properly
process ecdsa signatures (concatenated R and S params)".
This patch needs to be backported up to branch 2.5 alongside commit
5a8f02ae.
This reverts commit d65791e26c12b57723f2feb7eacdbbd99601371b.
It conflicts with the patch which was originally written and lacks the
BN_clear_free() and the NULL check.
Commit 5a8f02ae6 ("BUG/MEDIUM: jwt: Properly process ecdsa signatures
(concatenated R and S params)") makes use of ECDSA_SIG_set0() which only
appeared in openssl-1.1.0 and libressl 2.7, and breaks the build before.
Let's just do what it minimally does (only assigns the two fields to the
destination).
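The minimal replacement is essentially this (a sketch; the BIGNUM variable
names are illustrative):

  #if (OPENSSL_VERSION_NUMBER < 0x10100000L)
          /* before OpenSSL 1.1.0 the ECDSA_SIG structure is public, so the
           * two BIGNUM fields can simply be assigned to the destination.
           */
          sig->r = bn_r;
          sig->s = bn_s;
  #else
          ECDSA_SIG_set0(sig, bn_r, bn_s);
  #endif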
This will need to be backported where the commit above is, likely 2.5.
Add "no-quic" to "global" section to disable the use of QUIC transport protocol
by all configured QUIC listeners. This is listeners with QUIC addresses on their
"bind" lines. Internally, the socket addresses binding is skipped by
protocol_bind_all() for receivers with <proto_quic4> or <proto_quic6> as
protocol (see protocol struct).
Add information about "no-quic" global option to the documentation.
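For reference, usage in the configuration is simply:

  global
      no-quic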
Must be backported to 2.7.
In se_fl_set_error() we used to switch to SE_FL_ERROR only when there
is already SE_FL_EOS indicating that the read side is closed. But that
is not sufficient, we need to consider all cases where no more reads
will be performed on the connection, and as such also include SE_FL_EOI.
Without this, some aborted connections during a transfer sometimes only
stop after the timeout, because the ERR_PENDING is never promoted to
ERROR.
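The adjusted logic is roughly the following (a sketch close to, but not
necessarily identical to, the final code):

  static inline void se_fl_set_error(struct sedesc *se)
  {
          /* if the read side is already over (EOS or EOI), report a final
           * error right away, otherwise keep it pending until reads end.
           */
          if (se_fl_test(se, SE_FL_EOS | SE_FL_EOI))
                  se_fl_set(se, SE_FL_ERROR);
          else
                  se_fl_set(se, SE_FL_ERR_PENDING);
  }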
This must be backported to 2.7 and requires previous patch "CLEANUP:
stconn: always use se_fl_set_error() to set the pending error".
When the payload length cannot be determined, the htx extra field is set to
the magical value ULLONG_MAX. It is not obvious. Thus a dedicated HTX value
is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case,
instead of ULLONG_MAX.
There is already a function to set termination flags but it is not well
suited for HTTP streams. So a function, dedicated to the HTTP analysis, was
added. This way, this new function will be called for HTTP analysers on
error. And if the error is not caught at this stage, the generic function
will still be called from process_stream().
Here, by default a PRXCOND error is reported and depending on the stream
state, the reason will be set accordingly:
* If the backend SC is in INI state, SF_FINST_T is reported on tarpit and
SF_FINST_R otherwise.
* SF_FINST_Q if the server connection is queued
* SF_FINST_C in any connection attempt state (REQ/TAR/ASS/CONN/CER/RDY),
except for applets, for which SF_FINST_R is reported.
* Once the server connection is established, SF_FINST_H is reported while
in HTTP_MSG_DATA state on the response side.
* SF_FINST_L is reported if the response is in HTTP_MSG_DONE state or
higher and a client error/timeout was reported.
* Otherwise SF_FINST_D is reported.
During multiple tests we've already noticed that shared stats counters
have become a real bottleneck under large thread counts. With QUIC it's
pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread
machine at only 25 Gbps, and this CPU is entirely spent in the atomic
increment of the byte count and byte rate. It's also visible in H1/H2
but slightly less since we're working with larger buffers, hence less
frequent updates. These counters are exclusively used to report the
byte count in "show info" and the byte rate in the stats.
Let's move them to the thread_ctx struct and make the stats reader
just collect each thread's stats when requested. That's way more
efficient than competing on a single cache line.
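As a rough sketch of the pattern (field and function names below are
illustrative, not the actual haproxy ones):

  struct th_bytes {
          unsigned long long out_bytes;   /* bytes sent by this thread only */
  } __attribute__((aligned(64)));         /* one cache line per thread */

  static struct th_bytes th_stats[MAX_THREADS];

  /* fast path: the writer only touches its own, uncontended cache line */
  static inline void count_out_bytes(int tid, size_t len)
  {
          HA_ATOMIC_ADD(&th_stats[tid].out_bytes, len);
  }

  /* slow path: "show info" sums all per-thread counters on demand */
  static unsigned long long total_out_bytes(int nbthread)
  {
          unsigned long long total = 0;
          int t;

          for (t = 0; t < nbthread; t++)
                  total += HA_ATOMIC_LOAD(&th_stats[t].out_bytes);
          return total;
  }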
After this, qc_snd_buf has totally disappeared from the perf profile
and tests made in h1 show roughly 1% performance increase on small
objects.
When a STOP_SENDING or RESET_STREAM must be sent, its corresponding qcs
is inserted into <qcc.send_list> via qcc_reset_stream() or
qcc_abort_stream_read().
This allows removing the iteration over the full qcs tree in qc_send().
Instead, STOP_SENDING and RESET_STREAM emission is done in the loop over
<qcc.send_list>, as with STREAM frames. This should slightly improve
performance, most notably when a large number of streams are opened.
This must be backported up to 2.7.
Complete the qcc_send_stream() function to allow specifying whether the
stream should be handled in priority. Internally this will insert the qcs
instance at the front of <qcc.send_list> to be able to treat it before
other streams.
This functionality is useful when some QUIC streams should be sent
before others. Most notably, this is used to guarantee that H3 SETTINGS
is done first via the control stream.
This must be backported up to 2.7.
Implement a mechanism to register streams ready to send data in new
STREAM frames. Internally, this is implemented with a new list
<qcc.send_list> which contains qcs instances.
A qcs can be registered safely using the new function qcc_send_stream().
This is done automatically in qc_send_buf() which covers most cases.
Also, the application layer is free to use it for internal-usage streams.
This is currently the case for the H3 control stream with SETTINGS sending.
The main point of this patch is to handle stream sending fairly. This is
in stark contrast with previous code where streams with lower ID were
always prioritized. This could cause other streams to be indefinitely
blocked behind a stream which has a lot of data to transfer. Now,
streams are handled in an order scheduled by the se_desc layer.
This commit is the first one of a series which will bring other
improvements that also rely on the send_list implementation.
This must be backported up to 2.7 when deemed sufficiently stable.
In applets, we stop processing when a write error (CF_WRITE_ERROR) or a shutdown
for writes (CF_SHUTW) is detected. However, any write error leads to an
immediate shutdown for writes. Thus, it is enough to only test if CF_SHUTW is
set.
When a read error (CF_READ_ERROR) is reported, a shutdown for reads is
always performed (CF_SHUTR). Thus, there is no reason to check if
CF_READ_ERROR is set if CF_SHUTR is also checked.
CF_READ_ATTACHED flag is only used in input events for stream analyzers,
CF_MASK_ANALYSER. A read event can be reported instead and this flag can be
removed. We must only take care to report a read event when the client
connection is upgraded from TCP to HTTP.
It appears CF_ANA_TIMEOUT is a flag only used in CF_MASK_ANALYSER. All
analyzer timeouts rely on the analysis expiration date (chn->analyse_exp).
Worse, once set, this flag is never removed. Thus this flag can be removed
and replaced by a read event (CF_READ_EVENT).
Thanks to previous changes, CF_WRITE_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_WRITE_EVENT|CF_WRITE_ERROR).
Thanks to previous changes, CF_READ_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_READ_EVENT|CF_READ_ERROR).
Just like CF_READ_PARTIAL, CF_WRITE_PARTIAL is now merged with
CF_WRITE_EVENT. There is a subtlety in sc_notify(). The "connect" event
(formerly CF_WRITE_NULL) is now detected with
(CF_WRITE_EVENT + sc->state < SC_ST_EST).
CF_READ_PARTIAL flag is now merged with CF_READ_EVENT. It means
CF_READ_EVENT is set when a read0 is received (formerly CF_READ_NULL) or when
data are received (formerly CF_READ_ACTIVITY).
There is nothing special here, except conditions to wake the stream up in
sc_notify(). Indeed, the test was changed a bit to reflect recent
changes. The read0 event is now formalized by (CF_READ_EVENT + CF_SHUTR).
As for CF_READ_NULL, it appears CF_WRITE_NULL and other write events on a
channel are mainly used to wake up the stream and may be replaced by one
write event.
In this patch, we introduce the CF_WRITE_EVENT flag as a replacement for
CF_WRITE_NULL. There is no breaking change for now, it is just a
rename. Gradually, other write events will be merged with this one.
CF_READ_NULL flag is not really useful nor much used. It is a transient
event used to wake up the stream. As we will see, all read events on a
channel may be reduced to only one and are all used to wake up the stream.
In this patch, we introduce the CF_READ_EVENT flag as a replacement for
CF_READ_NULL. There is no breaking change for now, it is just a
rename. Gradually, other read events will be merged with this one.
This action increments the General Purpose Counter at the index <idx> of
the array associated to the sticky counter designated by <sc-id> by the
value of either integer <int> or the integer evaluation of expression
<expr>. Integers and expressions are limited to unsigned 32-bit values.
If an error occurs, this action silently fails and the actions evaluation
continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer
between 0 and 2. It also silently fails if there is no GPC stored at
this index. The entry in the table is refreshed even if the value is zero.
The 'gpc_rate' is automatically adjusted to reflect the average growth
rate of the gpc value.
The main use of this action is to count scores or total volumes (e.g.
estimated danger per source IP reported by the server or a WAF, total
uploaded bytes, etc).
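A configuration sketch of the intended usage (keyword shown as sc-add-gpc,
values and condition purely illustrative):

  # track the source address on sticky counter 0, then add a score to the
  # GPC stored at index 1 of that counter's array
  http-request track-sc0 src
  http-request sc-add-gpc(1,0) 10 if { path_beg /login }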
The number of stick-counter entries usable by track-sc rules is currently
set at build time. There is no good value for this since the vast majority
of users don't need any, most need only a few and rare users need more.
Adding more counters for everyone increases memory and CPU usages for no
reason.
This patch moves the per-session and per-stream arrays to a pool of a size
defined at boot time. This way it becomes possible to set the number of
entries at boot time via a new global setting "tune.stick-counters" that
sets the limit for the whole process. When not set, the MAX_SESS_STKCTR
value still applies, or 3 if not set, as before.
It is also possible to lower the value to 0 to save a bit of memory if
not used at all.
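For example (configuration sketch):

  global
      # allow up to 10 track-sc entries per session/stream instead of the
      # build-time default
      tune.stick-counters 10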
Note that a few low-level sample-fetch functions had to be protected due
to the ability to use sample-fetches in the global section to set some
variables.
There is a bug in the b_slow_realign() function when wrapping output data
are copied into the swap buffer: block1 and block2 sizes are inverted. Thus
blocks with a wrong size are copied. It leads to data mixing if the first
block is in reality larger than the second one, or to a copy of data outside
the buffer if the first block is smaller than the second one.
The bug was introduced when the buffer API was refactored in 1.9. It was
found by a code review and seems never to have been triggered in almost 5
years. However, we cannot exclude that it is responsible for some unresolved bugs.
This patch should fix issue #1978. It must be backported as far as 2.0.
Due to the previous SSL exception we couldn't restrict the collected
CFLAGS/LDFLAGS to those of enabled options, so all of them were
considered if set. The problem is that it would prevent simply
disabling a build option without unsetting its xxx_CFLAGS or _LDFLAGS
values if those had incompatible values (e.g. -lfoo).
Now that only existing options are listed in collect_opts_flags, we
can safely check that the option is set and only consider its settings
in this case. Thus OT_LDFLAGS will not be used if USE_OT is not set
for example.
By creating USE_SSL and enabling it when USE_OPENSSL is set, we can
get rid of the special case that was made with it regarding cflags
collect and when resetting options. The option doesn't need to be
manually set, though in the future it might prove useful if other
non-openssl APIs are supported.
It's getting complicated to configure includes and lib dirs for
OpenSSL API variants such as WolfSSL, because some settings are
common and others are specific but carry a prefix that doesn't
match the USE_* rule scheme.
This patch simplifies everything by considering that all SSL libs
will use SSL_INC, SSL_LIB, SSL_CFLAGS and SSL_LDFLAGS. That's much
more convenient. This works thanks to the settings collector which
explicitly checks the SSL_* settings. When USE_OPENSSL_WOLFSSL is
set, then USE_OPENSSL is implied, so that there's no need to
duplicate maintenance effort.
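A typical build invocation then looks like this (paths are illustrative):

  $ make TARGET=linux-glibc USE_OPENSSL_WOLFSSL=1 \
         SSL_INC=/opt/wolfssl/include SSL_LIB=/opt/wolfssl/lib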
The new function collect_opts_flags now scans all USE_* options defined
in use_opts and appends the corresponding *_CFLAGS and *_LDFLAGS to
OPTIONS_{C,LD}FLAGS respectively. This will be useful to get rid of all
the individual concatenations to these variables.
A lot of _SRC, _INC, _LIB etc variables are set and expected to be
initialized to an empty string by default. However, an in-depth
review of all of them showed that WOLFSSL_{INC,LIB}, SSL_{INC,LIB},
LUA_{INC,LIB}, and maybe others were not always initialized and could
sometimes leak from the environment and as such cause strange build
issues when running from cascaded scripts that had exported them.
The approach taken here consists in iterating over all USE_* options
and unsetting any _SRC, _INC, _LIB, _CFLAGS and _LDFLAGS that follows
the same name. For the few variable names options that don't exactly
match the build option (SSL & WOLFSSL), these ones are specifically
added to the list. The few that were explicitly cleared in their own
sections were just removed since not needed anymore. Note that an
"undefine" command appeared in GNU make 3.82 but since we support
older ones we can only initialize the variables to an empty string
here. It's not a problem in practice.
We're now certain that these variables are empty wherever they are
used, and that it is possible to just append to them, or use them
as-is.
Some macros and functions are barely understandable and are only used
to iterate over known options from the use_opts list. Better assign
them a name and move them into a dedicated file to clean the makefile
a little bit. Now at least "use_opts" only appears once, where it is
defined. This also made it possible to completely remove the BUILD_FEATURES
macro that caused some confusion until the previous commit.
Implement STOP_SENDING. This is divided into two main functions:
* qcc_abort_stream_read() which can be used by the application protocol to
request a STOP_SENDING. This sets the flag QC_SF_READ_ABORTED.
* qcs_send_reset() is a static function called after the preceding one.
It will send a STOP_SENDING via qcc_send().
QC_SF_READ_ABORTED flag is now properly used: if activated on a stream
during qcc_recv(), the <qcc.app_ops.decode_qcs> callback is skipped. Also,
aborting the read on an unknown unidirectional remote stream is now fully
supported with the emission of a STOP_SENDING as specified by RFC 9000.
This commit is part of implementing H3 errors at the stream level. This
will allow the H3 layer to request the peer to close its endpoint for
an error on a stream.
This should be backported up to 2.7.
Implement RESET_STREAM reception by mux-quic. On reception, the qcs
instance will be marked as remotely closed and its Rx buffer released. The
stream layer will be flagged on error if still attached.
This commit is part of implementing H3 errors at the stream level.
Indeed, on H3 stream errors, STOP_SENDING + RESET_STREAM should be
emitted. The STOP_SENDING will in turn generate a RESET_STREAM by the
remote peer which will be handled thanks to this patch.
This should be backported up to 2.7.
Implement the mux_ops shutw operation for the QUIC mux. A RESET_STREAM is
emitted unless the stream is already closed because all data or a
RESET_STREAM have already been transmitted.
This operation is notably useful when the upper stream layer wants to close
the connection early due to an error.
This was tested by using an HTTP server which listens with PROXY protocol
support. The corresponding server line in the haproxy configuration
deliberately does not specify send-proxy. This causes the server to
abruptly close the connection. Without this patch, nothing was done on the
QUIC stream, which was kept open until the whole connection was closed.
Now, a proper RESET_STREAM is emitted to report the error.
This should be backported up to 2.7.