Commit Graph

6718 Commits

Author SHA1 Message Date
Amaury Denoyelle
57b3eaa793 MINOR: quic: refactor frame deallocation
Define a new function qc_frm_free() to handle frame deallocation. New
BUG_ON() statements ensure that the deallocated frame is not referenced
by other frame. To support this, all LIST_DELETE() have been replaced by
LIST_DEL_INIT(). This should enforce that frame deallocation is robust.

As a complement, qc_frm_unref() has been moved into quic_frame module.
It is justified as this is a utility function related to frame
deallocation. It allows to use it in quic_pktns_tx_pkts_release() before
calling qc_frm_free().

This should be backported up to 2.7.
2023-02-03 11:55:41 +01:00
Amaury Denoyelle
40c24f1a10 MINOR: quic: define new functions for frame alloc
Define two utility functions for quic_frame allocation :
* qc_frm_alloc() is used to allocate a new frame
* qc_frm_dup() is used to allocate a new frame by duplicating an
  existing one

Theses functions are useful to centralize quic_frame initialization.
Note that pool_zalloc() is replaced by a proper pool_alloc() + explicit
initialization code.

This commit will simplify implementation of the per frame retransmission
limitation. Indeed, a new counter will be added in quic_frame structure
which must be initialized to 0.

This should be backported up to 2.7.
2023-02-03 10:44:26 +01:00
Amaury Denoyelle
2216b0866e MINOR: quic: remove fin from quic_stream frame type
A dedicated <fin> field was used in quic_stream structure. However, this
info is already encoded in the frame type field as specified by QUIC
protocol.

In fact, only code for packet reception used the <fin> field. On the
sending side, we only checked for the FIN bit. To align both sides,
remove the <fin> field and only used the FIN bit.

This should be backported up to 2.7.
2023-02-03 09:46:55 +01:00
Amaury Denoyelle
1e340ba6bc MINOR: mux-quic/h3: define stream close callback
Define a new qcc_app_ops callback named close(). This will be used to
notify app-layer about the closure of a stream by the remote peer. Its
main usage is to ensure that the closure is allowed by the application
protocol specification.

For the moment, close is not implemented by H3 layer. However, this
function will be mandatory to properly reject a STOP_SENDING on the
control stream and preventing a later crash. As such, this commit must
be backported with the next one on 2.6.

This is related to github issue #2006.
2023-01-30 15:56:25 +01:00
Aurelien DARRAGON
b2e2ec51b3 MEDIUM: proxy/http_ext: implement dynamic http_ext
proxy http-only options implemented in http_ext were statically stored
within proxy struct.

We're making some changes so that http_ext are now stored in a dynamically
allocated structs.
http_ext related structs are only allocated when needed to save some space
whenever possible, and they are automatically freed upon proxy deletion.

Related PX_O_HTTP{7239,XFF,XOT) option flags were removed because we're now
considering an http_ext option as 'active' if it is allocated (ptr is not NULL)

A few checks (and BUG_ON) were added to make these changes safe because
it adds some (acceptable) complexity to the previous design.

Also, proxy.http was renamed to proxy.http_ext to make things more explicit.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
9ded834adc OPTIM: http_ext/7239: introduce c_mode to save some space
forwarded header option (rfc7239) deals with sample expressions in two
steps: first a sample expression string is extracted from the config file
and later in startup sequence this string is converted into the resulting
sample_expr.

We need to perform these two steps because we cannot compile the expr
too early in the parsing sequence. (or we would miss some context)
Because of this, we have two dinstinct structure members (expr and expr_s)
for each 7239 field supporting sample expressions.
This is not cool, because we're bloating the http forwarded config structure,
and thus, bloating proxy config structure.

To address this, we now merge both expr and expr_s members inside a single
union to regain some space. This forces us to perform some additional logic
to make sure to use the proper structure member at different parsing steps.

Thanks to this, we're also able to free/release related config hints and
sample expression strings as soon as the sample expression
compilation is finished.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
f958341610 MINOR: proxy: move 'originalto' option to http_ext
Just like forwarded (7239) header and forwardfor header, move parsing,
logic and management of 'originalto' option into http_ext dedicated class.

We're only doing this to standardize proxy http options management.
Existing behavior remains untouched.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
730b9836a6 MINOR: proxy: move 'forwardfor' option to http_ext
Just like forwarded (7239) header, move parsing, logic and management
of 'forwardfor' option into http_ext dedicated class.

We're only doing this to standardize proxy http options management.
Existing behavior remains untouched.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
b2bb9257d2 MINOR: proxy/http_ext: introduce proxy forwarded option
Introducing http_ext class for http extension related work that
doesn't fit into existing http classes.

HTTP extension "forwarded", introduced with 7239 RFC is now supported
by haproxy.

The option supports various modes from simple to complex usages involving
custom sample expressions.

  Examples :

    # Those servers want the ip address and protocol of the client request
    # Resulting header would look like this:
    #   forwarded: proto=http;for=127.0.0.1
    backend www_default
        mode http
        option forwarded
        #equivalent to: option forwarded proto for

    # Those servers want the requested host and hashed client ip address
    # as well as client source port (you should use seed for xxh32 if ensuring
    # ip privacy is a concern)
    # Resulting header would look like this:
    #   forwarded: host="haproxy.org";for="_000000007F2F367E:60138"
    backend www_host
        mode http
        option forwarded host for-expr src,xxh32,hex for_port

    # Those servers want custom data in host, for and by parameters
    # Resulting header would look like this:
    #   forwarded: host="host.com";by=_haproxy;for="[::1]:10"
    backend www_custom
        mode http
        option forwarded host-expr str(host.com) by-expr str(_haproxy) for for_port-expr int(10)

    # Those servers want random 'for' obfuscated identifiers for request
    # tracing purposes while protecting sensitive IP information
    # Resulting header would look like this:
    #   forwarded: for=_000000002B1F4D63
    backend www_for_hide
        mode http
        option forwarded for-expr rand,hex

By default (no argument provided), forwarded option will try to mimic
x-forward-for common setups (source client ip address + source protocol)

The option is not available for frontends.
no option forwarded is supported.

More info about 7239 RFC here: https://www.rfc-editor.org/rfc/rfc7239.html

More info about the feature in doc/configuration.txt

This should address feature request GH #575

Depends on:
  - "MINOR: http_htx: add http_append_header() to append value to header"
  - "MINOR: sample: add ARGC_OPT"
  - "MINOR: proxy: introduce http only options"
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
832e9f4119 MINOR: proxy: introduce http only options
This commit is innoffensive but will allow to do some code refactors in
existing proxy http options. Newly created http related proxy options
will also benefit from this.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
5f7f5fe76a MINOR: sample: add ARGC_OPT
Add ARGC_OPT enum to provide more context for upcoming sample parse errors
involving proxy "option" config directives.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
38ebffaf10 MINOR: http_htx: add http_prepend_header() to prepend value to header
Just like http_append_header(), but this time to insert new value before
an existing one.

If the header already contains one or multiple values, ',' is automatically
inserted after the new value.
2023-01-27 15:18:59 +01:00
Aurelien DARRAGON
a5a8552cab MINOR: http_htx: add http_append_header() to append value to header
Calling this function as an alternative to http_replace_header_value()
to append a new value to existing header instead of replacing the whole
header content.

If the header already contains one or multiple values: a ',' is automatically
appended before the new value.

This function is not meant for prepending (providing empty ctx value),
in which case we should consider implementing dedicated prepend
alternative function.
2023-01-27 15:18:59 +01:00
Willy Tarreau
271c440392 MINOR: h2: add h2_phdr_to_ist() to make ISTs from pseudo headers
Till now pseudo headers were passed as const strings, but having them as
ISTs will be more convenient for traces. This doesn't change anything for
strings which are derived from them (and being constants they're still
zero-terminated).
2023-01-26 15:49:43 +01:00
Willy Tarreau
b8b243ac6a MINOR: trace: add the long awaited TRACE_PRINTF()
TRACE_PRINTF() can be used to produce arbitrary trace contents at any
trace level. It uses the exact same arguments as other TRACE_* macros,
but here they are mandatory since they are followed by the format-string,
though they may be filled with zeroes. The reason for the arguments is to
match tracking or filtering and not pollute other non-inspected objects.

It will probably be used inside loops, in which case there are two points
to be careful about:
  - output atomicity is only per-message, so competing threads may see
    their messages interleaved. As such, it is recommended that the
    caller places a recognizable unique context at the beginning of the
    message such as a connection pointer.
  - iterating over arrays or lists for all requests could be very
    expensive. In order to avoid this it is best to condition the call
    via TRACE_ENABLED() with the same arguments, which will return the
    same decision.
  - messages longer than TRACE_MAX_MSG-1 (1023 by default) will be
    truncated.

For example, in order to dump the list of HTTP headers between hpack
and h2:

  if (outlen > 0 &&
      TRACE_ENABLED(TRACE_LEVEL_DEVELOPER,
      H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0)) {
          int i;
          for (i = 0; list[i].n.len; i++)
              TRACE_PRINTF(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR,
                           h2c->conn, 0, 0, 0, "h2c=%p hdr[%d]=%s:%s", h2c, i,
                           list[i].n.ptr, list[i].v.ptr);
  }

In addition, a lower-level TRACE_PRINTF_LOC() macro is provided, that takes
two extra arguments, the caller's location and the caller's function name.
This will allow to emit composite traces from central functions on the
behalf of another one.
2023-01-26 15:49:43 +01:00
Willy Tarreau
4b36d5e8de MINOR: trace: add a trace_no_cb() dummy callback for when to use no callback
By default, passing a NULL cb to the trace functions will result in the
source's default one to be used. For some cases we won't want to use any
callback at all, not event the default one. Let's define a trace_no_cb()
function for this, that does absolutely nothing.
2023-01-26 15:49:43 +01:00
Willy Tarreau
8f9a9704bb MINOR: trace: add a TRACE_ENABLED() macro to determine if a trace is active
Sometimes it would be necessary to prepare some messages, pre-process some
blocks or maybe duplicate some contents before they vanish for the purpose
of tracing them. However we don't want to do that for everything that is
submitted to the traces, it's important to do it only for what will really
be traced.

The __trace() function has all the knowledge for this, to the point of even
checking the lockon pointers. This commit splits the function in two, one
with the trace decision logic, and the other one for the trace production.
The first one is now usable through wrappers such as _trace_enabled() and
TRACE_ENABLED() which will indicate whether traces are going to be produced
for the current source, level, event mask, parameters and tracking.
2023-01-26 15:49:43 +01:00
Willy Tarreau
80f36b2ac2 CLEANUP: trace: remove the QUIC-specific ifdefs
There are ifdefs at several places to only define TRC_ARGS_QCON when
QUIC is defined, but nothing prevents this code from building without.
Let's just remove those ifdefs, the single "if" they avoid is not worth
the extra maintenance burden.
2023-01-26 15:49:43 +01:00
Amaury Denoyelle
71fd03632f MINOR: mux-quic/h3: send SETTINGS as soon as transport is ready
As specified by HTTP3 RFC, SETTINGS frame should be sent as soon as
possible. Before this patch, this was only done on the first qc_send()
invocation. This delay significantly SETTINGS emission until the first
H3 response is ready to be transferred.

This patch fixes this by ensuring SETTINGS is emitted when MUX-QUIC is
being setup.

As a side point, return value of finalize operation is checked. This
means that an error during SETTINGS emission will cause the connection
init to fail.

This should be backported up to 2.7.
2023-01-25 16:01:55 +01:00
Willy Tarreau
7e70bfc8cb MINOR: threads: add a thread_harmless_end() version that doesn't wait
thread_harmless_end() needs to wait for rdv_requests to disappear so
that we're certain to respect a harmless promise that possibly allowed
another thread to proceed under isolation. But this doesn't work in a
signal handler because a thread could be interrupted by the debug
handler while already waiting for isolation and with rdv_request>0.
As such this function could cause a deadlock in such a signal handler.

Let's implement a specific variant for this, thread_harmless_end_sig(),
that just resets the thread's bit and doesn't wait. It must of course
not be used past a check point that would allow the isolation requester
to return and see the thread as temporarily harmless then turning back
on its promise.

This will be needed to fix a race in the debug handler.
2023-01-19 19:22:17 +01:00
Willy Tarreau
b2f38c13d1 BUG/MINOR: thread: always reload threads_enabled in loops
A few loops waiting for threads to synchronize such as thread_isolate()
rightfully filter the thread masks via the threads_enabled field that
contains the list of enabled threads. However, it doesn't use an atomic
load on it. Before 2.7, the equivalent variables were marked as volatile
and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and
the risk that the compiler keeps them in a register inside a loop is not
null at all. In practice when ha_thread_relax() calls sched_yield() or
an x86 PAUSE instruction, it could be verified that the variable is
always reloaded. If these are avoided (e.g. architecture providing
neither solution), it's visible in asm code that the variables are not
reloaded. In this case, if a thread exists just between the moment the
two values are read, the loop could spin forever.

This patch adds the required _HA_ATOMIC_LOAD() on the relevant
threads_enabled fields. It must be backported to 2.7.
2023-01-19 19:22:17 +01:00
Amaury Denoyelle
7d78eff889 MINOR: h3: extend function for QUIC varint encoding
Slighty adjust b_quic_enc_int(). This function is used to encode an
integer as a QUIC varint in a struct buffer.

A new parameter is added to the function API to specify the width of the
encoded integer. By default, 0 should be use to ensure that the minimum
space is used. Other valid values are 1, 2, 4 or 8. An error is reported
if the width is not large enough.

This new parameter will be useful when buffer space is reserved prior to
encode an unknown integer value. The maximum size of 8 bytes will be
reserved and some data can be put after. When finally encoding the
integer, the width can be requested to be 8 bytes.

With this new parameter, a small refactoring of the function has been
conducted to remove some useless internal variables.

This should be backported up to 2.7. It will be mostly useful to
implement H3 trailers encoding.
2023-01-19 15:09:01 +01:00
Remi Tricot-Le Breton
bb35e1f5aa BUG/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 (missing ECDSA_SIG_set0)
This function was introduced in OpenSSL 1.1.0. Prior to that, the
ECDSA_SIG structure was public.
This function was used in commit 5a8f02ae "BUG/MEDIUM: jwt: Properly
process ecdsa signatures (concatenated R and S params)".

This patch needs to be backported up to branch 2.5 alongside commit
5a8f02ae.
2023-01-19 11:13:51 +01:00
William Lallemand
2edc6d0301 Revert "BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7"
This reverts commit d65791e26c.

Conflict with the patch which was originally written and lacks the
BN_clear_free() and the NULL check.
2023-01-19 11:13:24 +01:00
Willy Tarreau
d65791e26c BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7
Commit 5a8f02ae6 ("BUG/MEDIUM: jwt: Properly process ecdsa signatures
(concatenated R and S params)") makes use of ECDSA_SIG_set0() which only
appeared in openssl-1.1.0 and libressl 2.7, and breaks the build before.
Let's just do what it minimally does (only assigns the two fields to the
destination).

This will need to be backported where the commit above is, likely 2.5.
2023-01-19 10:57:00 +01:00
Frdric Lcaille
21c4c9b854 MINOR: quic: Replace v2 draft definitions by those of the final 2 version
This should finalize the support for the QUIC version 2.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Frdric Lcaille
12a0317fed MINOR: quic: Add "no-quic" global option
Add "no-quic" to "global" section to disable the use of QUIC transport protocol
by all configured QUIC listeners. This is listeners with QUIC addresses on their
"bind" lines. Internally, the socket addresses binding is skipped by
protocol_bind_all() for receivers with <proto_quic4> or <proto_quic6> as
protocol (see protocol struct).
Add information about "no-quic" global option to the documentation.

Must be backported to 2.7.
2023-01-17 16:35:20 +01:00
Willy Tarreau
e77f4306ba BUG/MEDIUM: stconn: also consider SE_FL_EOI to switch to SE_FL_ERROR
In se_fl_set_error() we used to switch to SE_FL_ERROR only when there
is already SE_FL_EOS indicating that the read side is closed. But that
is not sufficient, we need to consider all cases where no more reads
will be performed on the connection, and as such also include SE_FL_EOI.

Without this, some aborted connections during a transfer sometimes only
stop after the timeout, because the ERR_PENDING is never promoted to
ERROR.

This must be backported to 2.7 and requires previous patch "CLEANUP:
stconn: always use se_fl_set_error() to set the pending error".
2023-01-17 16:27:35 +01:00
Christopher Faulet
2e47e3a1cf MINOR: htx: Add an HTX value for the extra field is payload length is unknown
When the payload length cannot be determined, the htx extra field is set to
the magical vlaue ULLONG_MAX. It is not obvious. This a dedicated HTX value
is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case,
instead of ULLONG_MAX.
2023-01-13 11:51:11 +01:00
Christopher Faulet
4da82395d8 CLEANUP: http-ana: Remove HTTP_MSG_ERROR state
This state is now unused. Thus it can be removed.
2023-01-13 11:22:13 +01:00
Christopher Faulet
71236dedb9 MINOR: http-ana: Add a function to set HTTP termination flags
There is already a function to set termination flags but it is not well
suited for HTTP streams. So a function, dedicated to the HTTP analysis, was
added. This way, this new function will be called for HTTP analysers on
error. And if the error is not caugth at this stage, the generic function
will still be called from process_stream().

Here, by default a PRXCOND error is reported and depending on the stream
state, the reson will be set accordingly:

  * If the backend SC is in INI state, SF_FINST_T is reported on tarpit and
    SF_FINST_R otherwise.

  * SF_FINST_Q is the server connection is queued

  * SF_FINST_C in any connection attempt state (REQ/TAR/ASS/CONN/CER/RDY).
    Except for applets, a SF_FINST_R is reported.

  * Once the server connection is established, SF_FINST_H is reported while
    HTTP_MSG_DATA state on the response side.

  * SF_FINST_L is reported if the response is in HTTP_MSG_DONE state or
    higher and a client error/timeout was reported.

  * Otherwise SF_FINST_D is reported.
2023-01-13 09:45:23 +01:00
Willy Tarreau
6be8d09a61 OPTIM: global: move byte counts out of global and per-thread
During multiple tests we've already noticed that shared stats counters
have become a real bottleneck under large thread counts. With QUIC it's
pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread
machine at only 25 Gbps, and this CPU is entirely spent in the atomic
increment of the byte count and byte rate. It's also visible in H1/H2
but slightly less since we're working with larger buffers, hence less
frequent updates. These counters are exclusively used to report the
byte count in "show info" and the byte rate in the stats.

Let's move them to the thread_ctx struct and make the stats reader
just collect each thread's stats when requested. That's way more
efficient than competing on a single cache line.

After this, qc_snd_buf has totally disappeared from the perf profile
and tests made in h1 show roughly 1% performance increase on small
objects.
2023-01-12 16:37:45 +01:00
Amaury Denoyelle
0a1154afb5 MINOR: mux-quic: use send-list for STOP_SENDING/RESET_STREAM emission
When a STOP_SENDING or RESET_STREAM must be send, its corresponding qcs
is inserted into <qcc.send_list> via qcc_reset_stream() or
qcc_abort_stream_read().

This allows to remove the iteration on full qcs tree in qc_send().
Instead, STOP_SENDING and RESET_STREAM is done in the loop over
<qcc.send_list> as with STREAM frames. This should improve slightly the
performance, most notably when large number of streams are opened.

This must be backported up to 2.7.
2023-01-10 17:49:50 +01:00
Amaury Denoyelle
f9b03265f0 MEDIUM: h3: send SETTINGS before STREAM frames
Complete qcc_send_stream() function to allow to specify if the stream
should be handled in priority. Internally this will insert the qcs
instance in front of <qcc.send_list> to be able to treat it before other
streams.

This functionality is useful when some QUIC streams should be sent
before others. Most notably, this is used to guarantee that H3 SETTINGS
is done first via the control stream.

This must be backported up to 2.7.
2023-01-10 17:49:50 +01:00
Amaury Denoyelle
20f2a425ff MAJOR: mux-quic: rework stream sending priorization
Implement a mechanism to register streams ready to send data in new
STREAM frames. Internally, this is implemented with a new list
<qcc.send_list> which contains qcs instances.

A qcs can be registered safely using the new function qcc_send_stream().
This is done automatically in qc_send_buf() which covers most cases.
Also, application layer is free to use it for internal usage streams.
This is currently the case for H3 control stream with SETTINGS sending.

The main point of this patch is to handle stream sending fairly. This is
in stark contrast with previous code where streams with lower ID were
always prioritized. This could cause other streams to be indefinitely
blocked behind a stream which has a lot of data to transfer. Now,
streams are handled in an order scheduled by se_desc layer.

This commit is the first one of a serie which will bring other
improvments which also relied on the send_list implementation.

This must be backported up to 2.7 when deemed sufficiently stable.
2023-01-10 17:49:50 +01:00
Christopher Faulet
da89e9b95b MINOR: channel/applets: Stop to test CF_WRITE_ERROR flag if CF_SHUTW is enough
In applets, we stop processing when a write error (CF_WRITE_ERROR) or a shutdown
for writes (CF_SHUTW) is detected. However, any write error leads to an
immediate shutdown for writes. Thus, it is enough to only test if CF_SHUTW is
set.
2023-01-09 18:41:08 +01:00
Christopher Faulet
4b490b7517 MINOR: channel: Stop to test CF_READ_ERROR flag if CF_SHUTR is enough
When a read error (CF_READ_ERROR) is reported, a shutdown for reads is
always performed (CF_SHUTR). Thus, there is no reason to check if
CF_READ_ERROR is set if CF_SHUTR is also checked.
2023-01-09 18:41:08 +01:00
Christopher Faulet
2357718217 MEDIUM: channel: Remove CF_READ_ATTACHED and report CF_READ_EVENT instead
CF_READ_ATTACHED flag is only used in input events for stream analyzers,
CF_MASK_ANALYSER. A read event can be reported instead and this flag can be
removed. We must only take care to report a read event when the client
connection is upgraded from TCP to HTTP.
2023-01-09 18:41:08 +01:00
Christopher Faulet
049fbcd36a MINOR: channel: Remove CF_ANA_TIMEOUT and report CF_READ_EVENT instead
It appears CF_ANA_TIMEOUT is flag only used in CF_MASK_ANALYSER. All
analyzer timeout relies on the analysis expiration date (chn->analyse_exp).
Worst, once set, this flag is never removed. Thus this flag can be removed
and replaced by a read event (CF_READ_EVENT).
2023-01-09 18:41:08 +01:00
Christopher Faulet
a63f8f379f MINOR: channel: Remove CF_WRITE_ACTIVITY
Thanks to previous changes, CF_WRITE_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_WRITE_EVENT|CF_WRITE_ERROR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
33e03cec5f MINOR: channel: Remove CF_READ_ACTIVITY
Thanks to previous changes, CF_READ_ACTIVITY flags can be removed.
Everywhere it was used, its value is now directly used
(CF_READ_EVENT|CF_READ_ERROR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
d898841530 MEDIUM: channel: Use CF_WRITE_EVENT instead of CF_WRITE_PARTIAL
Just like CF_READ_PARTIAL, CF_WRITE_PARTIAL is now merged with
CF_WRITE_EVENT. There a subtlety in sc_notify(). The "connect" event
(formely CF_WRITE_NULL) is now detected with
(CF_WRITE_EVENT + sc->state < SC_ST_EST).
2023-01-09 18:41:08 +01:00
Christopher Faulet
285f7616ee MEDIUM: channel: Use CF_READ_EVENT instead of CF_READ_PARTIAL
CF_READ_PARTIAL flag is now merged with CF_READ_EVENT. It means
CF_READ_EVENT is set when a read0 is received (formely CF_READ_NULL) or when
data are received (formely CF_READ_ACTIVITY).

There is nothing special here, except conditions to wake the stream up in
sc_notify(). Indeed, the test was a bit changed to reflect recent
change. read0 event is now formalized by (CF_READ_EVENT + CF_SHUTR).
2023-01-09 18:41:08 +01:00
Christopher Faulet
b96f2aa380 REORG: channel: Rename CF_WRITE_NULL to CF_WRITE_EVENT
As for CF_READ_NULL, it appears CF_WRITE_NULL and other write events on a
channel are mainly used to wake up the stream and may be replace by on write
event.

In this patch, we introduce CF_WRITE_EVENT flag as a replacement to
CF_WRITE_EVENT_NULL. There is no breaking change for now, it is just a
rename. Gradually, other write events will be merged with this one.
2023-01-09 18:41:08 +01:00
Christopher Faulet
6e1bbc446b REORG: channel: Rename CF_READ_NULL to CF_READ_EVENT
CF_READ_NULL flag is not really useful and used. It is a transient event
used to wakeup the stream. As we will see, all read events on a channel may
be resumed to only one and are all used to wake up the stream.

In this patch, we introduce CF_READ_EVENT flag as a replacement to
CF_READ_NULL. There is no breaking change for now, it is just a
rename. Gradually, other read events will be merged with this one.
2023-01-09 18:41:08 +01:00
Willy Tarreau
5a72d03a58 MINOR: stick-table: implement the sc-add-gpc() action
This action increments the General Purpose Counter at the index <idx> of
the array associated to the sticky counter designated by <sc-id> by the
value of either integer <int> or the integer evaluation of expression
<expr>. Integers and expressions are limited to unsigned 32-bit values.
If an error occurs, this action silently fails and the actions evaluation
continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer
between 0 and 2. It also silently fails if the there is no GPC stored at
this index. The entry in the table is refreshed even if the value is zero.
The 'gpc_rate' is automatically adjusted to reflect the average growth
rate of the gpc value.

The main use of this action is to count scores or total volumes (e.g.
estimated danger per source IP reported by the server or a WAF, total
uploaded bytes, etc).
2023-01-07 09:11:22 +01:00
Willy Tarreau
6c0117168e MEDIUM: stick-table: set the track-sc limit at boottime via tune.stick-counters
The number of stick-counter entries usable by track-sc rules is currently
set at build time. There is no good value for this since the vast majority
of users don't need any, most need only a few and rare users need more.
Adding more counters for everyone increases memory and CPU usages for no
reason.

This patch moves the per-session and per-stream arrays to a pool of a size
defined at boot time. This way it becomes possible to set the number of
entries at boot time via a new global setting "tune.stick-counters" that
sets the limit for the whole process. When not set, the MAX_SESS_STR_CTR
value still applies, or 3 if not set, as before.

It is also possible to lower the value to 0 to save a bit of memory if
not used at all.

Note that a few low-level sample-fetch functions had to be protected due
to the ability to use sample-fetches in the global section to set some
variables.
2023-01-06 18:08:49 +01:00
Christopher Faulet
61aded057d BUG/MAJOR: buf: Fix copy of wrapping output data when a buffer is realigned
There is a bug in b_slow_realign() function when wrapping output data are
copied in the swap buffer. block1 and block2 sizes are inverted. Thus blocks
with a wrong size are copied. It leads to data mixin if the first block is
in reality larger than the second one or to a copy of data outside the
buffer is the first block is smaller than the second one.

The bug was introduced when the buffer API was refactored in 1.9. It was
found by a code review and seems never to have been triggered in almost 5
years. However, we cannot exclude it is responsible of some unresolved bugs.

This patch should fix issue #1978. It must be backported as far as 2.0.
2023-01-05 09:34:49 +01:00
Willy Tarreau
6e70a3986c BUILD: makefile: only consider settings from enabled options
Due to the previous SSL exception we coudln't restrict the collected
CFLAGS/LDFLAGS to those of enabled options, so all of them were
considered if set. The problem is that it would prevent simply
disabling a build option without unsetting its xxx_CFLAGS or _LDFLAGS
values if those had incompatible values (e.g. -lfoo).

Now that only existing options are listed in collect_opts_flags, we
can safely check that the option is set and only consider its settings
in this case. Thus OT_LDFLAGS will not be used if USE_OT is not set
for example.
2022-12-23 17:01:55 +01:00
Willy Tarreau
6a2cd33509 BUILD: makefile: remove the special case of the SSL option
By creating USE_SSL and enabling it when USE_OPENSSL is set, we can
get rid of the special case that was made with it regarding cflags
collect and when resetting options. The option doesn't need to be
manually set, though in the future it might prove useful if other
non-openssl API are supported.
2022-12-23 16:53:35 +01:00
Willy Tarreau
2b8d0978f3 BUILD: makefile: make all OpenSSL variants use the same settings
It's getting complicated to configure includes and lib dirs for
OpenSSL API variants such as WolfSSL, because some settings are
common and others are specific but carry a prefix that doesn't
match the USE_* rule scheme.

This patch simplifies everything by considering that all SSL libs
will use SSL_INC, SSL_LIB, SSL_CFLAGS and SSL_LDFLAGS. That's much
more convenient. This works thanks to the settings collector which
explicitly checks the SSL_* settings. When USE_OPENSSL_WOLFSSL is
set, then USE_OPENSSL is implied, so that there's no need to
duplicate maintenance effort.
2022-12-23 16:53:35 +01:00
Willy Tarreau
8fa2f49f24 BUILD: makefile: add a function to collect all options' CFLAGS/LDFLAGS
The new function collect_opts_flags now scans all USE_* options defined
in use_opts and appends the corresponding *_CFLAGS and *_LDFLAGS to
OPTIONS_{C,LD}FLAGS respectively. This will be useful to get rid of all
the individual concatenations to these variables.
2022-12-23 16:53:35 +01:00
Willy Tarreau
b14e89e322 BUILD: makefile: initialize all build options' variables at once
A lot of _SRC, _INC, _LIB etc variables are set and expected to be
initialized to an empty string by default. However, an in-depth
review of all of them showed that WOLFSSL_{INC,LIB}, SSL_{INC,LIB},
LUA_{INC,LIB}, and maybe others were not always initialized and could
sometimes leak from the environment and as such cause strange build
issues when running from cascaded scripts that had exported them.

The approach taken here consists in iterating over all USE_* options
and unsetting any _SRC, _INC, _LIB, _CFLAGS and _LDFLAGS that follows
the same name. For the few variable names options that don't exactly
match the build option (SSL & WOLFSSL), these ones are specifically
added to the list. The few that were explicitly cleared in their own
sections were just removed since not needed anymore. Note that an
"undefine" command appeared in GNU make 3.82 but since we support
older ones we can only initialize the variables to an empty string
here. It's not a problem in practice.

We're now certain that these variables are empty wherever they are
used, and that it is possible to just append to them, or use them
as-is.
2022-12-23 16:53:35 +01:00
Willy Tarreau
848362f2d2 BUILD: makefile: sort the features list
The features list that appears in -vv appears in a random order, which
always makes it a pain to look for certain features. Let's sort it.
2022-12-23 16:53:35 +01:00
Willy Tarreau
69e7b7f677 BUILD: makefile: move common options-oriented macros to include/make/options.mk
Some macros and functions are barely understandable and are only used
to iterate over known options from the use_opts list. Better assign
them a name and move them into a dedicated file to clean the makefile
a little bit. Now at least "use_opts" only appears once, where it is
defined. This also allowed to completely remove the BUILD_FEATURES
macro that caused some confusion until previous commit.
2022-12-23 16:53:35 +01:00
Amaury Denoyelle
663e872e3a MEDIUM: mux-quic: implement STOP_SENDING emission
Implement STOP_SENDING. This is divided in two main functions :
* qcc_abort_stream_read() which can be used by application protocol to
  request for a STOP_SENDING. This set the flag QC_SF_READ_ABORTED.
* qcs_send_reset() is a static function called after the preceding one.
  It will send a STOP_SENDING via qcc_send().

QC_SF_READ_ABORTED flag is now properly used : if activated on a stream
during qcc_recv(), <qcc.app_ops.decode_qcs> callback is skipped. Also,
abort reading on unknown unidirection remote stream is now fully
supported with the emission of a STOP_SENDING as specified by RFC 9000.

This commit is part of implementing H3 errors at the stream level. This
will allows the H3 layer to request the peer to close its endpoint for
an error on a stream.

This should be backported up to 2.7.
2022-12-22 16:38:16 +01:00
Amaury Denoyelle
5854fc08cc MINOR: mux-quic: handle RESET_STREAM reception
Implement RESET_STREAM reception by mux-quic. On reception, qcs instance
will be mark as remotely closed and its Rx buffer released. The stream
layer will be flagged on error if still attached.

This commit is part of implementing H3 errors at the stream level.
Indeed, on H3 stream errors, STOP_SENDING + RESET_STREAM should be
emitted. The STOP_SENDING will in turn generate a RESET_STREAM by the
remote peer which will be handled thanks to this patch.

This should be backported up to 2.7.
2022-12-22 16:38:04 +01:00
Amaury Denoyelle
a473f196f1 MEDIUM: mux-quic: implement shutw
Implement mux_ops shutw operation for QUIC mux. A RESET_STREAM is
emitted unless the stream is already closed due to all data or
RESET_STREAM already transmitted.

This operation is notably useful when upper stream layer wants to close
the connection early due to an error.

This was tested by using a HTTP server which listens with PROXY protocol
support. The corresponding server line on haproxy configuration
deliberately not specify send-proxy. This causes the server to close
abruptly the connection. Without this patch, nothing was done on the QUIC
stream which was kept open until the whole connection is closed. Now, a
proper RESET_STREAM is emitted to report the error.

This should be backported up to 2.7.
2022-12-22 16:22:39 +01:00
William Lallemand
be6a873096 BUG/MINOR: httpclient/log: free of invalid ptr with httpclient_log_format
free_proxy() must check if the ptr is not httpclient_log_format before
trying to free p->conf.logformat_string.

No backport needed.
2022-12-22 15:39:31 +01:00
Christopher Faulet
c960a3b60f BUG/MINOR: pool/stats: Use ullong to report total pool usage in bytes in stats
The same change was already performed for the cli. The stats applet and the
prometheus exporter are also concerned. Both use the stats API and rely on
pool functions to get total pool usage in bytes. pool_total_allocated() and
pool_total_used() must return 64 bits unsigned integer to avoid any wrapping
around 4G.

This may be backported to all versions.
2022-12-22 13:46:21 +01:00
Remi Tricot-Le Breton
c8d814ed63 MINOR: ssl: Move OCSP code to a dedicated source file
This is a simple cleanup that moves OCSP related code to a dedicated
file instead of interlacing it in some pure ssl connection code.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
6477bbd78d MEDIUM: ssl: Add ocsp update task main function
This patch contains the main function of the ocsp auto update mechanism
as well as an init and destroy function of the task used for this.
The task is not created in this patch but in a later one.

The function has two distinct parts and the branching to one or the
other is completely based on the fact that the cur_ocsp pointer of the
ssl_ocsp_task_ctx member is set.
If the pointer is not set, we need to look at the first item of the
update tree and see if it needs to be updated. If it does not we simply
wait until the time is right and let the task asleep. If it does need to
be updated, we simply build and send the corresponding ocsp request
thanks to the http_client. The task is then sent to sleep with an expire
time set to infinity. The http_client will wake it back up once the
response is received (or a timeout occurs). Just note that during this
whole process the cetificate_ocsp object corresponding to the entry
being updated is taken out of the update tree and only stored in the
ssl_ocsp_task_ctx context.
Once the task is waken up by the http_client, it branches on the
response processing part of the function which basically checks that the
response is valid and inserts it into the ocsp_response tree. The task
then goes back to sleep until another entry needs to be updated.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
fb2b9988e8 MINOR: ssl: Store 'ocsp-update' mode in the ckch_data and check for inconsistencies
The 'ocsp-update' option is parsed at the same time as all the other
bind line options but it does not actually have anything to do with the
bind line since it concerns the frontend certificate instead. For that
reason, we should have a mean to identify inconsistencies in the
configuration and raise an error when a given certificate has two
different ocsp-update modes specified in one or more crt-lists.
The simplest way to do it is to store the ocsp update mode directly in
the ckch and not only in the ssl_bind_conf.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
03c5ffff8e MINOR: ssl: Add crt-list ocsp-update option
This option will define how the ocsp update mechanism behaves. The
option can either be set to 'on' or 'off' and can only be specified in a
crt-list entry so that we ensure that it concerns a single certificate.
The 'off' mode is the default one and corresponds to the old behavior
(no automatic update).
When the option is set to 'on', we will try to get an ocsp response
whenever an ocsp uri can be found in the frontend's certificate. The
only limitation of this mode is that the certificate's issuer will have
to be known in order for the OCSP certid to be built.

This patch only adds the parsing of the option. The full functionality
will come in a later commit.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
cc346678dc MEDIUM: ssl: Add ocsp_certid in ckch structure and discard ocsp buffer early
The ocsp_response member of the cert_key_and_chain structure is only
used temporarily. During a standard init process where an ocsp response
is provided, this ocsp file is first copied into the ocsp_response
buffer without any ocsp-related parsing (see
ssl_sock_load_ocsp_response_from_file), and then the contents are
actually interpreted and inserted into the actual ocsp tree
(cert_ocsp_tree) later in the process (see ssl_sock_load_ocsp). If the
response was deemed valid, it is then copied into the actual
ocsp_response structure's 'response' field (see
ssl_sock_load_ocsp_response). From this point, the ocsp_response field
of the cert_key_and_chain object could be discarded since actual ocsp
operations will be based of the certificate_ocsp object.

The only remaining runtime use of the ckch's ocsp_response field was in
the CLI, and more precisely in the 'show ssl cert' mechanism.
This constraint could be removed by adding an OCSP_CERTID directly in
the ckch because the buffer was only used to get this id.

This patch then adds the OCSP_CERTID pointer in the ckch, it clears the
ocsp_response buffer early and simplifies the ckch_store_build_certid
function.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
c0b4058e7e MINOR: ssl: Add helper function that checks the validity of an OCSP response
This helper function will check that an OCSP response is valid, meaning
that the proper "Content-Type: application/ocsp-response" header is
present and the data itself is a proper OCSP_RESPONSE that can be
checked thanks to the issuer certificate.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
e09d2ae598 MINOR: ssl: Add OCSP request helper function
This function creates the url and body that will be used to build a
proper OCSP request for a given certid (following section A.1 of
RFC6960).
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
47a4f1239d MINOR: ssl: Add helper function that extracts an OCSP URI from a certificate
This function extracts the first OCSP URI (if any) contained in a
certificate. It only takes the first of potentially multiple URIs.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
95e7cf1ddf MINOR: httpclient: Make the CLI flags public for future use
Those flags used by the http_client in its CLI function might come to
use for OCSP updates that will strongly rely on the http client.
2022-12-21 11:21:07 +01:00
Remi Tricot-Le Breton
2b96364b35 MINOR: ssl: Add a lock to the OCSP response tree
The tree that contains OCSP responses is never locked despite being used
at runtime for OCSP stapling as well as the CLI through "set ssl cert"
and "set ssl ocsp-response" commands.
Everything works though because the certificate_ocsp structure is
refcounted and the tree's entries are cleaned up when SSL_CTXs are
destroyed (thanks to an ex_data entry in which the certificate_ocsp
pointer is stored).
This new lock will come to use when the OCSP auto update mechanism is
fully implemented because this new feature will be based on another tree
that stores the same certificate_ocsp members and updates their contents
periodically.
2022-12-21 11:21:07 +01:00
Willy Tarreau
eed7826529 BUG/MEDIUM: quic: properly take shards into account on bind lines
Shards were completely forgotten in commit f5a0c8abf ("MEDIUM: quic:
respect the threads assigned to a bind line"). The thread mask is
taken from the bind_conf, but since shards were introduced in 2.5,
the per-listener mask is held by the receiver and can be smaller
than the bind_conf's mask.

The effect here is that the traffic is not distributed to the
appropriate thread. At first glance it's not dramatic since it remains
one of the threads eligible by the bind_conf, but it still means that
in some contexts such as "shards by-thread", some concurrency may
persist on listeners while they're expected to be alone. One identified
impact is that it requires more rxbufs than necessary, but there may
possibly be other not yet identified side effects.

This must be backported to 2.7 and everywhere the commit above is
backported.
2022-12-21 09:27:26 +01:00
Amaury Denoyelle
15337fd808 BUG/MEDIUM: mux-quic: fix double delete from qcc.opening_list
qcs instances for bidirectional streams are inserted in
<qcc.opening_list>. It is removed from the list once a full HTTP request
has been parsed. This is required to implement http-request timeout.

In case a stream is deleted before receiving full HTTP request, it also
must be removed from <qcc.opening_list>. This was not the case on first
implementation but has been fixed by the following patch :
  641a65ff3c
  BUG/MINOR: mux-quic: remove qcs from opening-list on free

This means that now a stream can be deleted from the list in two
different functions. Sadly, as LIST_DELETE was used in both cases,
nothing prevented a double-deletion from the list, even though
LIST_INLIST was used. Both calls are replaced with LIST_DEL_INIT which
is idempotent.

This bug causes memory corruption which results in most cases in a
segfault, most of times outside of mux-quic code itself. It has been
found first by gabrieltz who reported it on the github issue #1903. Big
thanks to him for his testing.

This bug also causes failures on several 'M' transfer testcase of QUIC
interop-runner. The s2n-quic client is particularly useful in this case
as segfaults triggers were most of the times on the LIST_DELETE
operation itself. This is probably due to its encapsulating of HEADERS
frame with fin bit delayed in a following empty STREAM frame.

This must be backported wherever the above patch is, up to 2.6.
2022-12-21 08:58:04 +01:00
Willy Tarreau
e327b4a73e MINOR: freq_ctr: add opportunistic versions of swrate_add()
Some uses of swrate_add() only consist in getting a rough estimate of
a frequency. There are cases where speed matters more than accuracy
(e.g. pools). For such use cases, let's just stop looping on the CAS,
if the update fails, another thread is already providing input, and
it's not dramatic to lose the race. All these functions are now
suffixed with "_opportunistic".
2022-12-20 14:51:12 +01:00
Willy Tarreau
284cfc67b8 MINOR: pool: make the thread-local hot cache size configurable
Till now it was only possible to change the thread local hot cache size
at build time using CONFIG_HAP_POOL_CACHE_SIZE. But along benchmarks it
was sometimes noticed a huge contention in the lower level memory
allocators indicating that larger caches could be beneficial, especially
on machines with large L2 CPUs.

Given that the checks against this value was no longer on a hot path
anymore, there was no reason for continuing to force it to be tuned at
build time. So this patch allows to set it by tune.memory-hot-size.

It's worth noting that during the boot phase the value remains zero so
that it's possible to know if the value was set or not, which opens the
possibility that we try to automatically adjust it based on the per-cpu
L2 cache size or the use of certain protocols (none of this is done yet).
2022-12-20 14:51:12 +01:00
Willy Tarreau
4dd33d9c32 OPTIM: pool: split the read_mostly from read_write parts in pool_head
Performance profiling on a 48-thread machine showed a lot of time spent
in pool_free(), precisely at the point where pool->limit was retrieved.
And the reason is simple. Some parts of the pool_head are heavily updated
only when facing a cache miss ("allocated", "used", "needed_avg"), while
others are always accessed (limit, flags, size). The fact that both
entries were stored into the same cache line makes it very difficult for
each thread to access these precious info even when working with its own
cache.

By just splitting the fields apart, a test on QUIC (which stresses pools
a lot) more than doubled performance from 42 Gbps to 96 Gbps!

Given that the patch only reorders fields and addresses such a significant
contention, it should be backported to 2.7 and 2.6.
2022-12-20 14:51:12 +01:00
William Lallemand
46bea1c616 BUILD: peers: peers-t.h depends on stick-table-t.h
peers-t.h uses "struct stktable" as well as STKTABLE_DATA_TYPES which
are defined in stick-table-t.h. It works by accident because
stick-table-t.h was always included before. But could provoke build
issue with EXTRA code.

To be backported as far as 2.2.
2022-12-16 15:51:44 +01:00
Aurelien DARRAGON
5594184190 MINOR: stats: introduce stats field ctx
Add a new value in stats ctx: field.
Implement field support in line dumping parent functions
stats_print_proxy_field_json() and stats_dump_proxy_to_buffer().

This will allow child dumping functions to support partial line dumping
when needed. ie: when dumping buffer is exhausted: do a partial send and
wait for a new buffer to finish the dump. Thanks to field ctx, the function
can start dumping where it left off on previous (unterminated) invokation.
2022-12-15 16:53:49 +01:00
Amaury Denoyelle
15f3cc4b38 MINOR: http: extract content-length parsing from H2
Extract function h2_parse_cont_len_header() in the generic HTTP module.
This allows to reuse it for all HTTP/x parsers. The function is now
available as http_parse_cont_len_header().

Most notably, this will be reused in the next bugfix for the H3 parser.
This is necessary to check that content-length header match the length
of DATA frames.

Thus, it must be backported to 2.6.
2022-12-14 11:34:18 +01:00
Amaury Denoyelle
dbf6ad470b BUG/MINOR: quic: properly handle alloc failure in qc_new_conn()
qc_new_conn() is used to allocate a quic_conn instance and its various
internal members. If one allocation fails, quic_conn_release() is used
to cleanup things.

For the moment, pool_zalloc() is used which ensures that all content is
null. However, some members must be initialized to a special values
to be able to use quic_conn_release() safely. This is the case for
quic_conn lists and its tasklet.

Also, some quic_conn internal allocation functions were doing their own
cleanup on failure without reset to NULL. This caused an issue with
quic_conn_release() which also frees this members. To fix this, these
functions now only return an error without cleanup. It is the caller
responsibility to free the allocated content, which is done via
quic_conn_release().

Without this patch, allocation failure in qc_new_conn() would often
result in segfault. This was reproduced easily using fail-alloc at 10%.

This should be backported up to 2.6.
2022-12-12 11:44:34 +01:00
Cedric Paillet
e06e31ea3b MINOR: promex: introduce haproxy_backend_agg_check_status
This patch introduces haproxy_backend_agg_check_status metric
as we wanted in 42d7c402d but with the right data source.

This patch could be backported as far as 2.4.
2022-12-09 10:54:48 +01:00
Cedric Paillet
7d6644e689 BUG/MINOR: promex: create haproxy_backend_agg_server_status
haproxy_backend_agg_server_check_status currently aggregates
haproxy_server_status instead of haproxy_server_check_status.
We deprecate this and create a new one,
haproxy_backend_agg_server_status to clarify what it really
does.

This patch could be backported as far as 2.4.
2022-12-09 10:54:27 +01:00
Willy Tarreau
9192d20f02 MINOR: pools: make DEBUG_UAF a runtime setting
Since the massive pools cleanup that happened in 2.6, the pools
architecture was made quite more hierarchical and many alternate code
blocks could be moved to runtime flags set by -dM. One of them had not
been converted by then, DEBUG_UAF. It's not much more difficult actually,
since it only acts on a pair of functions indirection on the slow path
(OS-level allocator) and a default setting for the cache activation.

This patch adds the "uaf" setting to the options permitted in -dM so
that it now becomes possible to set or unset UAF at boot time without
recompiling. This is particularly convenient, because every 3 months on
average, developers ask a user to recompile haproxy with DEBUG_UAF to
understand a bug. Now it will not be needed anymore, instead the user
will only have to disable pools and enable uaf using -dMuaf. Note that
-dMuaf only disables previously enabled pools, but it remains possible
to re-enable caching by specifying the cache after, like -dMuaf,cache.
A few tests with this mode show that it can be an interesting combination
which catches significantly less UAF but will do so with much less
overhead, so it might be compatible with some high-traffic deployments.

The change is very small and isolated. It could be helpful to backport
this at least to 2.7 once confirmed not to cause build issues on exotic
systems, and even to 2.6 a bit later as this has proven to be useful
over time, and could be even more if it did not require a rebuild. If
a backport is desired, the following patches are needed as well:

  CLEANUP: pools: move the write before free to the uaf-only function
  CLEANUP: pool: only include pool-os from pool.c not pool.h
  REORG: pool: move all the OS specific code to pool-os.h
  CLEANUP: pools: get rid of CONFIG_HAP_POOLS
  DEBUG: pool: show a few examples in -dMhelp
2022-12-08 18:54:59 +01:00
Willy Tarreau
4da51bd190 CLEANUP: pools: get rid of CONFIG_HAP_POOLS
This one was set in defaults.h only when neither DEBUG_NO_POOLS nor
DEBUG_UAF were set. This was not the most convenient location to look
for it, and it was only used in pool.c to decide on the initial value
of POOL_DBG_NO_CACHE.

Let's just use DEBUG_NO_POOLS || DEBUG_UAF directly on this flag and
get rid of the intermediary condition. This also has the benefit of
removing a double inversion, which is always nice for understanding.
2022-12-08 17:45:08 +01:00
Willy Tarreau
a95636682d REORG: pool: move all the OS specific code to pool-os.h
Till now pool-os used to contain a mapping from pool_{alloc,free}_area()
to pool_{alloc,free}_area_uaf() in case of DEBUG_UAF, or the regular
malloc-based function. And the *_uaf() functions were in pool.c. But
since 2.4 with the first cleanup of the pools, there has been no more
calls to pool_{alloc,free}_area() from anywhere but pool.c, from exactly
one place each. As such, there's no more need to keep *_uaf() apart in
pool.c, we can inline it into pool-os.h and leave all the OS stuff there,
with pool.c calling either based on DEBUG_UAF. This is cleaner with less
round trips between both files and easier to find.
2022-12-08 17:32:57 +01:00
Willy Tarreau
76a97a98ca CLEANUP: pool: only include pool-os from pool.c not pool.h
There's no need for the low-level pool functions to be known from all
callers anymore, they're only used by pool.c. Let's reduce the amount
of header files processed.
2022-12-08 17:32:40 +01:00
Willy Tarreau
5ab3c61932 BUILD: atomic: atomic.h may need compiler.h on ARMv8.2-a
We get a build error in ncbuf.c when building for ARMv8.2-a because ncbuf
has minimal includes and among them bug.h which includes atomic.h. Atomic.h
may use "forceinline" without including compiler.h, hence the build error.
It was verified that adding it doesn't inflate the total headers.

Since all other C files include api.h which already covers this, there's
no real need to bapkport this. The issue was already there in 2.3 though.
2022-12-08 08:36:24 +01:00
Aurelien DARRAGON
7d541a91ec BUG/MINOR: checks: restore legacy on-error fastinter behavior
With previous commit, 9e080bf ("BUG/MINOR: checks: make sure fastinter is used
even on forced transitions"), on-error mark-down|sudden-death|fail-check are
now working as expected.

However, on-error fastinter remains broken because srv_getinter(), used in
the above commit to check the expiration date, won't return fastinter interval
if server health is maxed out (which is the case with on-error fastinter mode).

To fix this, we introduce a check flag named CHK_ST_FASTINTER.
This flag is set when on-error is triggered. This way we can force
srv_getinter() to return fastinter interval whenever the flag is set.
The flag is automatically cleared as soon as the new check task expiry is
recalculated in process_chk_conn().
This restores original behavior prior to d114f4a ("MEDIUM: checks: spread the
checks load over random threads").

It must be backported to 2.7 along with the aforementioned commits.
2022-12-07 17:03:55 +01:00
Ilya Shipitsin
5fa29b8a74 CLEANUP: assorted typo fixes in the code and comments
This is 34th iteration of typo fixes
2022-12-07 09:08:18 +01:00
Aurelien DARRAGON
22f82f81e5 MINOR: server/event_hdl: add support for SERVER_UP and SERVER_DOWN events
We're using srv_update_status() as the only event source or UP/DOWN server
events in an attempt to simplify the support for these 2 events.

It seems srv_update_status() is the common path for server state changes anyway

Tested with server state updated from various sources:
  - the cli
  - server-state file (maybe we could disable this or at least don't publish
  in global event queue in the future if it ends in slower startup for setups
  relying on huge server state files)
  - dns records (ie: srv template)
  (again, could be fined tuned to only publish in server specific subscriber
  list and no longer in global subscription list if mass dns update tend to
  slow down srv_update_status())
  - normal checks and observe checks (HCHK_STATUS_HANA)
  (same as above, if checks related state update storms are expected)
  - lua scripts
  - html stats page (admin mode)
2022-12-06 10:22:07 +01:00
Aurelien DARRAGON
129ecf441f MINOR: server/event_hdl: add support for SERVER_ADD and SERVER_DEL events
Basic support for ADD and DEL server events are added through this commit:
	SERVER_ADD is published on dynamic server addition through cli.
	SERVER_DEL is published on dynamic server deletion through cli.

This work depends on:
	"MINOR: event_hdl: add event handler base api"
	"MINOR: server: add srv->rid (revision id) value"
2022-12-06 10:22:07 +01:00
Aurelien DARRAGON
745ce8e8ad MINOR: stats: add server revision id support
Make use of the new srv->rid value in stats.

Stat is referred as ST_F_SRID, it is now used in stats_fill_sv_stats
function in order to be included in csv and json stats dumps.

Moreover, "rid: $value" will be displayed next to server puid
in html stats page if "stats show-legend" is specified in the stats frontend.
(mouse hovering tooltip)

Depends on the following commit:
	"MINOR: server: add srv->rid (revision id) value"
2022-12-06 10:22:06 +01:00
Aurelien DARRAGON
61e3894dfe MINOR: server: add srv->rid (revision id) value
With current design, we could not distinguish between
previously existing deleted server and a new server reusing
the deleted server name/id.

This can cause some confusion when auditing stats/events/logs,
because the new server will look similar to the old
one.

To address this, we're adding a new value in server structure: rid

rid (revision id) value is an unsigned 32bits value that is set upon
server creation. Value is derived from a global counter that starts
at 0 and is incremented each time one or multiple server deletions are
followed by a server addition (meaning that old name/id reuse could occur).

Thanks to this revision id, it is now easy to tell whether the server
we're looking at is the same as before or if it has been deleted and
re-added in the meantime.
(combining server name/id + server revision id yields a process-wide unique
identifier)
2022-12-06 10:22:06 +01:00
Amaury Denoyelle
d3083c9df9 MINOR: quic: reconnect quic-conn socket on address migration
UDP addresses may change over time for a QUIC connection. When using
quic-conn owned socket, we have to detect address change to break the
bind/connect association on the socket.

For the moment, on change detected, QUIC connection socket is closed and
a new one is opened. In the future, we may improve this by trying to
keep the original socket and reexecute only bind/connect syscalls.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
7c9fdd9c3a MEDIUM: quic: move receive out of FD handler to quic-conn io-cb
This change is the second part for reception on QUIC connection socket.
All operations inside the FD handler has been delayed to quic-conn
tasklet via the new function qc_rcv_buf().

With this change, buffer management on reception has been simplified. It
is now possible to use a local buffer inside qc_rcv_buf() instead of
quic_receiver_buf().

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
5b41486b7f MEDIUM: quic: use quic-conn socket for reception
Try to use the quic-conn socket for reception if it is allocated. For
this, the socket is inserted in the fdtab. This will call the new
handler quic_conn_io_cb() which is responsible to process the recv()
system call. It will reuse datagram dispatch for simplicity. However,
this is guaranteed to be called on the quic-conn thread, so it will be
more efficient to use a dedicated buffer. This will be implemented in
another commit.

This patch should improve performance by reducing contention on the
receiver socket. However, more gain can be obtained when the datagram
dispatch operation will be skipped.

Older quic_sock_fd_iocb() is renamed to quic_lstnr_sock_fd_iocb() to
emphasize its usage for the receiver socket.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
40909dfec5 MINOR: quic: allocate a socket per quic-conn
Allocate quic-conn owned socket if possible. This requires that this is
activated in haproxy configuration. Also, this is done only if local
address is known so it depends on the support of IP_PKTINFO.

For the moment this socket is not used. This causes QUIC support to be
broken as received datagram are not read. This commit will be completed
by a following patch to support recv operation on the newly allocated
socket.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
75839a44e7 MINOR: quic: startup detect for quic-conn owned socket support
To be able to use individual sockets for QUIC connections, we rely on
the OS network stack which must support UDP sockets binding on the same
local address.

Add a detection code for this feature executed on startup. When the
first QUIC listener socket is binded, a test socket is created and
binded on the same address. If the bind call fails, we consider that
it's impossible to use individual socket for QUIC connections.

A new global option GTUNE_QUIC_SOCK_PER_CONN is defined. If startup
detect fails, this value is resetted from global options. For the
moment, there is no code to activate the option : this will be in a
follow-up patch with the introduction of a new configuration option.

This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
eec0b3c1bd MINOR: quic: detect connection migration
Detect connection migration attempted by the client. This is done by
comparing addresses stored in quic-conn with src/dest addresses of the
UDP datagram.

A new function qc_handle_conn_migration() has been added. For the
moment, no operation is conducted and the function will be completed
during connection migration implementation. The only notable things is
the increment of a new counter "quic_conn_migration_done".

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
21e611dc89 MINOR: tools: add port for ipcmp as optional criteria
Complete ipcmp() function with a new argument <check_port>. If this
argument is true, the function will compare port values besides IP
addresses and return true only if both are identical.

This commit will simplify QUIC connection migration detection. As such,
it should be backported to 2.7.
2022-12-02 14:45:43 +01:00
Amaury Denoyelle
8687b63c69 MINOR: quic: extract datagram parsing code
Extract individual datagram parsing code outside of datagrams list loop
in quic_lstnr_dghdlr(). This is moved in a new function named
quic_dgram_parse().

To complete this change, quic_lstnr_dghdlr() has been moved into
quic_sock source file : it belongs to QUIC socket lower layer and is
directly called by quic_sock_fd_iocb().

This commit will ease implementation of quic-conn owned socket.
New function quic_dgram_parse() will be easily usable after a receive
operation done on quic-conn IO-cb.

This should be backported up to 2.7.
2022-12-02 14:45:43 +01:00