haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-22 22:31:28 +02:00

Author	SHA1	Message	Date
Christopher Faulet	3aeb36681c	BUG/MINOR: syslog: Request for more data if message was not fully received In the syslog applet, when a message was not fully received, we must request for more data by calling appctx_need_more_data() and not by setting CF_READ_DONTWAIT flag on the request channel. Indeed, this flag is only used to only try a read at once. This patch could be backported as far as 2.4. On 2.5 and 2.4, applet_need_more_data() must be replaced by si_cant_get().	2023-03-24 09:24:03 +01:00
Amaury Denoyelle	abbb5ad1f5	MINOR: mux-quic: close on frame alloc failure Replace all BUG_ON() on frame allocation failure by a CONNECTION_CLOSE sending with INTERNAL_ERROR code. This can happen for the following cases : * sending of MAX_STREAM_DATA * sending of MAX_DATA * sending of MAX_STREAMS_BIDI In other cases (STREAM, STOP_SENDING, RESET_STREAM), an allocation failure will only result in the current operation to be interrupted and retried later. However, it may be desirable in the future to replace this with a simpler CONNECTION_CLOSE emission to recover better under a memory pressure issue. This should be backported up to 2.7.	2023-03-23 14:39:49 +01:00
Amaury Denoyelle	c0c6b6d8c0	MINOR: mux-quic: close on qcs allocation failure Emit a CONNECTION_CLOSE with INTERNAL_ERROR code each time qcs allocation fails. This can happen in two cases : * when creating a local stream through application layer * when instantiating a remote stream through qcc_get_qcs() In both cases, error paths are already in place to interrupt the current operation and a CONNECTION_CLOSE will be emitted soon after. This should be backported up to 2.7.	2023-03-23 14:39:49 +01:00
Amaury Denoyelle	e2213df9fe	MINOR: mux-quic: ensure CONNECTION_CLOSE is scheduled once per conn Add BUG_ON() statements to ensure qcc_emit_cc()/qcc_emit_cc_app() is not called more than one time for each connection. This should improve code resilience of MUX-QUIC and H3 and it will ensure that a scheduled CONNECTION_CLOSE is not overwritten by another one with a different error code. This commit relies on the previous one to ensure all QUIC operations are not conducted as soon as a CONNECTION_CLOSE has been prepared : commit d7fbf458f8a4c5b09cbf0da0208fbad70caaca33 MINOR: mux-quic: interrupt most operations if CONNECTION_CLOSE scheduled This should be backported up to 2.7.	2023-03-23 14:39:49 +01:00
Amaury Denoyelle	b47310d883	MINOR: mux-quic: interrupt qcc_recv() operations if CC scheduled Ensure that external MUX operations are interrupted if a CONNECTION_CLOSE is scheduled. This was already the cases for some functions. This is extended to the qcc_recv() family for MAX_STREAM_DATA, RESET_STREAM and STOP_SENDING. Also, qcc_release_remote_stream() is skipped in qcs_destroy() if a CONNECTION_CLOSE is already scheduled. All of this will ensure we only proceed to minimal treatment as soon as a CONNECTION_CLOSE is prepared. Indeed, all sending and receiving is stopped as soon as a CONNECTION_CLOSE is emitted so only internal cleanup code should be necessary at this stage. This should prevent a registered CONNECTION_CLOSE error status to be overwritten by an error in a follow-up treatment. This should be backported up to 2.7.	2023-03-23 14:39:47 +01:00
Amaury Denoyelle	665817a91c	BUG/MINOR: mux-quic: prevent CC status to be erased by shutdown HTTP/3 graceful shutdown operation is used to emit a GOAWAY followed by a CONNECTION_CLOSE with H3_NO_ERROR status. It is used for every connection on release which means that if a CONNECTION_CLOSE was already registered for a previous error, its status code is overwritten. To fix this, skip shutdown operation if a CONNECTION_CLOSE is already registered at the MUX level. This ensures that the correct error status is reported to the peer. This should be backported up to 2.6. Note that qc_shutdown() does not exist on 2.6 so modification will have to be made directly in qc_release() as follows : diff --git a/src/mux_quic.c b/src/mux_quic.c index 49df0dc418..3463222956 100644 --- a/src/mux_quic.c +++ b/src/mux_quic.c @@ -1766,19 +1766,21 @@ static void qc_release(struct qcc *qcc) TRACE_ENTER(QMUX_EV_QCC_END, conn); - if (qcc->app_ops && qcc->app_ops->shutdown) { - /* Application protocol with dedicated connection closing - * procedure. - */ - qcc->app_ops->shutdown(qcc->ctx); + if (!(qcc->flags & QC_CF_CC_EMIT)) { + if (qcc->app_ops && qcc->app_ops->shutdown) { + /* Application protocol with dedicated connection closing + * procedure. + */ + qcc->app_ops->shutdown(qcc->ctx); - /* useful if application protocol should emit some closing - * frames. For example HTTP/3 GOAWAY frame. - */ - qc_send(qcc); - } - else { - qcc_emit_cc_app(qcc, QC_ERR_NO_ERROR, 0); + /* useful if application protocol should emit some closing + * frames. For example HTTP/3 GOAWAY frame. + */ + qc_send(qcc); + } + else { + qcc_emit_cc_app(qcc, QC_ERR_NO_ERROR, 0); + } } if (qcc->task) {	2023-03-23 14:38:06 +01:00
Amaury Denoyelle	5aa21c1748	BUG/MINOR: h3: properly handle incomplete remote uni stream type An H3 unidirectional stream is always opened with its stream type first encoded as a QUIC variable integer. If the STREAM frame contains some data but not enough to decode this varint, haproxy would crash due to an ABORT_NOW() statement. To fix this, ensure we support an incomplete stream type. In this case, h3_init_uni_stream() returns 0 and the buffer content is not cleared. Stream decoding will resume when new data are received for this stream which should be enough to decode the stream type varint. This bug has never occurred in production because standard H3 stream types are small enough to be encoded on a single byte. This should be backported up to 2.6.	2023-03-23 14:38:06 +01:00
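The incomplete-varint case described in the commit above can be illustrated with a small decoder for the QUIC variable-length integer encoding (RFC 9000, section 16). This is only a sketch: quic_varint_decode() is a hypothetical name, not haproxy's function, and the "return 0 and leave the buffer alone" convention merely mirrors what the message says about h3_init_uni_stream().

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a QUIC variable-length integer from <buf>, which holds <len> bytes.
 * Returns the number of bytes consumed, or 0 if <len> bytes are not enough
 * to decode the value. In that case <*out> is left untouched so that the
 * caller can keep its buffer as-is and retry once more data has arrived.
 * Hypothetical helper for illustration only.
 */
static size_t quic_varint_decode(const uint8_t *buf, size_t len, uint64_t *out)
{
    size_t need, i;
    uint64_t val;

    if (len == 0)
        return 0;

    /* the two most significant bits of the first byte give the length:
     * 00 -> 1 byte, 01 -> 2 bytes, 10 -> 4 bytes, 11 -> 8 bytes
     */
    need = (size_t)1 << (buf[0] >> 6);
    if (len < need)
        return 0; /* incomplete: not an error, just wait for more data */

    val = buf[0] & 0x3f;
    for (i = 1; i < need; i++)
        val = (val << 8) | buf[i];

    *out = val;
    return need;
}
```

Standard H3 stream types fit in the one-byte form, which is consistent with the remark that the incomplete case was never hit in production.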
Willy Tarreau	1751db140a	MINOR: pools: report a replaced memory allocator instead of just malloc_trim() Instead of reporting the inaccurate "malloc_trim() support" on -vv, let's report the case where the memory allocator was actively replaced from the one used at build time, as this is the corner case we want to be cautious about. We also put a tainted bit when this happens so that it's possible to detect it at run time (e.g. the user might have inherited it from an environment variable during a reload operation). The now unused is_trim_enabled() function was finally dropped.	2023-03-22 18:05:02 +01:00
Willy Tarreau	0c27ec5df7	BUG/MINOR: pools: restore detection of built-in allocator The runtime detection of the default memory allocator was broken by Commit d8a97d8f6 ("BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used") due to a misunderstanding of its role. The purpose is not to detect whether we're on non-jemalloc but whether or not the allocator was changed from the one we booted with, in which case we must be extra cautious and absolutely refrain from calling malloc_trim() and its friends. This was done only to drop the message saying that malloc_trim() is supported, which will be totally removed in another commit, and could possibly be removed even in older versions if this patch would get backported since in the end it provides limited value.	2023-03-22 17:57:13 +01:00
Willy Tarreau	c3b297d5a4	MEDIUM: tools: further relax dlopen() checks to consider grouped symbols There's a recurring issue regarding shared library loading from Lua. If the imported library is linked with a different version of openssl but doesn't use it, the check will trigger and emit a warning. In practice it's not necessarily a problem as long as the API is the same, because all symbols are replaced and the library will use the included ssl lib. It's only a problem if the library comes with a different API because the dynamic linker will only replace known symbols with ours, and not all. Thus the loaded lib may call (via a static inline or a macro) a few different symbols that will allocate or preinitialize structures, and which will then pass them to the common symbols coming from a different and incompatible lib, exactly what happens to users of Lua's luaossl when building haproxy with quictls and without rebuilding luaossl. In order to better address this situation, we now define groups of symbols that must always appear/disappear in a consistent way. It's OK if they're all absent from either haproxy or the lib, it means that one of them doesn't use them so there's no problem. But if any of them is defined on any side, all of them must be in the exact same state on the two sides. The symbols are represented using a bit in a mask, and the mask of the group of other symbols they're related to. This allows to check 64 symbols, this should be OK for a while. The first ones that are tested for now are symbols whose combination differs between openssl versions 1.0, 1.1, and 3.0 as well as quictls. Thus a difference there will indicate upcoming trouble, but no error will mean that we're running on a seemingly compatible API and that all symbols should be replaced at once.
The same mechanism could possibly be used for pcre/pcre2, zlib and the few other optional libs that may occasionally cause runtime issues when used by dependencies, provided that discriminatory symbols are found to distinguish them. But in practice such issues are pretty rare, mainly because loading standard libs via Lua on a custom build of haproxy is not pretty common. In the event that further symbol compatibility issues would be reported in the future, backporting this patch as well as the following series might be an acceptable solution given that the scope of changes is very narrow (the malloc stuff is needed so that the malloc/free tests can be dropped): BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used MINOR: pools: make sure 'no-memory-trimming' is always used MINOR: pools: intercept malloc_trim() instead of trying to plug holes MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim() MINOR: pools: export trim_all_pools() MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim() MINOR: tools: relax dlopen() on malloc/free checks	2023-03-22 17:30:28 +01:00
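The grouped-symbol rule described in the commit above (one bit per symbol, plus the mask of the group it belongs to) can be sketched as follows. All names here are hypothetical and this is one reading of the commit message, not haproxy's actual code: a group is accepted when either side defines none of its symbols, and otherwise both sides must define exactly the same subset.

```c
#include <stdint.h>

/* One entry per tracked symbol: its own bit and the mask of the whole
 * group it belongs to (the group mask includes the symbol's own bit).
 * Hypothetical layout for illustration only.
 */
struct sym_group {
    uint64_t bit;
    uint64_t group;
};

/* <ours> and <theirs> have one bit set per symbol that the program and the
 * loaded library respectively define. Returns 1 if every group is in a
 * consistent state on both sides, otherwise 0.
 */
static int sym_groups_compatible(const struct sym_group *syms, int nb,
                                 uint64_t ours, uint64_t theirs)
{
    int i;

    for (i = 0; i < nb; i++) {
        uint64_t grp = syms[i].group;

        /* group entirely unused by one side: nothing can be mixed up */
        if (!(ours & grp) || !(theirs & grp))
            continue;

        /* both sides use the group: the exact same symbols must exist */
        if ((ours & grp) != (theirs & grp))
            return 0;
    }
    return 1;
}
```

With a 64-bit mask this covers at most 64 symbols, which matches the limit mentioned in the message.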
Willy Tarreau	58912b8d92	MINOR: tools: relax dlopen() on malloc/free checks Now that we can provide a safe malloc_trim() we don't need to detect anymore that some dependencies use a different set of malloc/free functions than ours because they will use the same as those we're seeing, and we control their use of malloc_trim(). The comment about the incompatibility with DEBUG_MEM_STATS is not true anymore either since the feature relies on macros so we're now OK. This will stop catching libraries linked against glibc's allocator when haproxy is natively built with jemalloc. This was especially annoying since dlopen() on a lib depending on jemalloc() tends to fail on TLS issues.	2023-03-22 17:30:28 +01:00
Willy Tarreau	9b060f148e	MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim() First this will ensure that we serialize the threads and avoid severe contention. Second it removes ugly ifdefs and conditions.	2023-03-22 17:30:28 +01:00
Willy Tarreau	7aee683541	MINOR: pools: export trim_all_pools() This way it will be usable from outside instead of malloc_trim().	2023-03-22 17:30:28 +01:00
Willy Tarreau	4138f15182	MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim() We already have some generic code in trim_all_pools() to implement the equivalent of malloc_trim() on jemalloc and macos. Instead of keeping the logic there, let's just move it to our own malloc_trim() implementation so that we can unify the mechanism and the logic. Now any low-level code calling malloc_trim() will either be disabled by haproxy's config if the user decides to, or will be mapped to the equivalent mechanism if malloc() was intercepted by a preloaded jemalloc. Trim_all_pools() preserves the benefit of serializing threads (which we must not impose to other libs which could come with their own threads). It means that our own code should mostly use trim_all_pools() instead of calling malloc_trim() directly.	2023-03-22 17:30:28 +01:00
Willy Tarreau	eaba76b02d	MINOR: pools: intercept malloc_trim() instead of trying to plug holes As reported by Miroslav in commit d8a97d8f6 ("BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used") there are still occasional cases where it's discovered that malloc_trim() is being used without its suitability being checked first. This is a problem when using another incompatible allocator. But there's a class of use cases we'll never be able to cover, it's dynamic libraries loaded from Lua. In order to address this more reliably, we now define our own malloc_trim() that calls the previous one after checking that the feature is supported and that the allocator is the expected one. This way child libraries that would call it will also be safe. The function is intentionally left defined all the time so that it will be possible to clean up some code that uses it by removing ifdefs.	2023-03-22 17:30:28 +01:00
Willy Tarreau	4db0b0430d	MINOR: pools: make sure 'no-memory-trimming' is always used The global option 'no-memory-trimming' was added in 2.6 with commit c4e56dc58 ("MINOR: pools: add a new global option "no-memory-trimming"") but there were some cases left where it was not considered. Let's make is_trim_enabled() also consider it.	2023-03-22 17:29:23 +01:00
Amaury Denoyelle	f4e7616e6c	MINOR: mux-quic: add flow-control info to minimal trace level Complete traces with information from qcc and qcs instances about flow-control level. This should help to debug further issue on sending. This must be backported up to 2.7.	2023-03-22 16:08:54 +01:00
Amaury Denoyelle	b7143a8781	MINOR: mux-quic: adjust trace level for MAX_DATA/MAX_STREAM_DATA recv Change the trace from developer to data level whenever the flow control limitation is updated following a MAX_DATA or MAX_STREAM_DATA reception. This should be backported up to 2.7.	2023-03-22 16:08:54 +01:00
Amaury Denoyelle	1ec78ff421	MINOR: mux-quic: complete traces for qcs emission Add traces for _qc_send_qcs() function. Most notably, traces have been added each time a qc_stream_desc buffer allocation fails and when stream or connection flow-level is reached. This should improve debugging for emission issues. This must be backported up to 2.7.	2023-03-22 16:08:54 +01:00
Amaury Denoyelle	178fbffda1	BUG/MEDIUM: mux-quic: release data from conn flow-control on qcs reset Connection flow-control level calculation is a bit complicated. To ensure it is never exceeded, each time a transfer occurs from a qcs.tx.buf to its qc_stream_desc buffer it is accounted in qcc.tx.offsets at the connection level. This value is not decremented even if the corresponding STREAM frame is rejected by the quic-conn layer as its emission will be retried later. In normal cases this works as expected. However there is an issue if a qcs instance is removed with prepared data left. In this case, its data is still accounted in qcc.tx.offsets despite being removed which may block other streams. This happens every time a qcs is reset with remaining data which will be discarded in favor of a RESET_STREAM frame. To fix this, if a stream has prepared data in qcc_reset_stream(), it is decremented from qcc.tx.offsets. A BUG_ON() has been added to ensure qcs_destroy() is never called for a stream with prepared data left. This bug can cause two issues : * transfer freeze as data unsent from closed streams still count on the connection flow-control limit and will block other streams. Note that this issue was not reproduced so it's unsure if this really happens without the following issue first. * a crash on a BUG_ON() statement in qc_send() loop over qc_send_frames(). Streams may remained in the send list with nothing to send due to connection flow-control limit. However, limit is never reached through qcc_streams_sent_done() so QC_CF_BLK_MFCTL flag is not set which will allow the loop to continue. The last case was reproduced after several minutes of testing using the following command : $ ngtcp2-client --exit-on-all-streams-close -t 0.1 -r 0.1 \ --max-data=100K -n32 \ 127.0.0.1 20443 "https://127.0.0.1:20443/?s=1g" 2>/dev/null This should fix github issues #2049 and #2074.	2023-03-22 16:08:54 +01:00
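The accounting problem fixed by the commit above can be modelled with a deliberately simplified sketch. The structures and function names below are hypothetical and far simpler than the real qcc/qcs objects: data prepared by a stream is charged against a connection-level counter, and resetting a stream must give that amount back, otherwise other streams stay blocked by bytes that will never be sent.

```c
#include <stdint.h>

/* Simplified connection-level flow control: <tx_accounted> is the sum of
 * all bytes prepared by streams and must never exceed <tx_limit>.
 * Hypothetical model for illustration only.
 */
struct fc_conn {
    uint64_t tx_limit;
    uint64_t tx_accounted;
};

struct fc_stream {
    uint64_t prepared; /* bytes prepared but not yet sent */
};

/* Try to prepare <len> bytes on stream <s>. Returns the number of bytes
 * actually accepted, which is capped by the remaining connection room.
 */
static uint64_t fc_stream_prepare(struct fc_conn *c, struct fc_stream *s,
                                  uint64_t len)
{
    uint64_t room = c->tx_limit - c->tx_accounted;

    if (len > room)
        len = room;
    s->prepared += len;
    c->tx_accounted += len;
    return len;
}

/* Reset stream <s>: its prepared data is discarded, so it must also be
 * released from the connection-level accounting.
 */
static void fc_stream_reset(struct fc_conn *c, struct fc_stream *s)
{
    c->tx_accounted -= s->prepared;
    s->prepared = 0;
}
```

Dropping the subtraction in fc_stream_reset() reproduces the freeze scenario in this model: the second stream can never obtain the room held by the reset one.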
Amaury Denoyelle	1d0ed1a2e9	BUG/MINOR: trace: fix hardcoded level for TRACE_PRINTF Level argument was not ignored by TRACE_PRINTF due to an hardcoded value of TRACE_LEVEL_DEVELOPER inside the macro. This must be backported up to 2.6.	2023-03-22 15:31:55 +01:00
Miroslav Zagorac	d8a97d8f60	BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used In the event that HAProxy is linked with the jemalloc library, it is still shown that malloc_trim() is enabled when executing "haproxy -vv": .. Support for malloc_trim() is enabled. .. It's not so much a problem as it is that malloc_trim() is called in the pat_ref_purge_range() function without any checking. This was solved by setting the using_default_allocator variable to the correct value in the detect_allocator() function and before calling malloc_trim() it is checked whether the function should be called.	2023-03-22 14:14:50 +01:00
Willy Tarreau	9ef2742a51	MINOR: debug: support dumping the libs addresses when running in verbose mode Starting haproxy with -dL helps enumerate the list of libraries in use. But sometimes in order to go further we'd like to see their address ranges. This is already supported on the CLI's "show libs" but not on the command line where it can sometimes help troubleshoot startup issues. Let's dump them when in verbose mode. This way it doesn't change the existing behavior for those trying to enumerate libs to produce an archive.	2023-03-22 11:43:15 +01:00
Willy Tarreau	1b536a11e7	BUILD: thread: silence a build warning when threads are disabled When threads are disabled, the compiler complains that we might be accessing tg->abs[] out of bounds since the array is of size 1. It cannot know that the condition to do this is never met, and given that it's not in a fast path, we can make it more obvious.	2023-03-22 10:40:06 +01:00
Willy Tarreau	0de1e6180a	BUILD: thread: implement thread_harmless_end_sig() for threadless builds Building without thread support was broken in 2.8-dev2 with commit 7e70bfc8c ("MINOR: threads: add a thread_harmless_end() version that doesn't wait") that forgot to define the function for the threadless cases. No backport is needed.	2023-03-22 10:40:06 +01:00
Amaury Denoyelle	8afe4b88c4	BUG/MINOR: quic: ignore congestion window on probing for MUX wakeup qc_notify_send() is used to wake up the MUX layer for sending. This function first ensures that all sending conditions are met to avoid waking up the MUX unnecessarily. One of these conditions is to check if there is room in the congestion window. However, when probe packets must be sent due to a PTO expiration, RFC 9002 explicitly mentions that the congestion window must be ignored, which was not the case prior to this patch. This commit fixes this by first setting <pto_probe> of 01RTT packet space before invoking qc_notify_send(). This ensures that congestion window won't be checked anymore to wake up the MUX layer until probing packets are sent. This commit replaces the following one which was not sufficient : commit e25fce03ebe3307bc104d1f81356108e271d2bc3 BUG/MINOR: quic: Dysfunctional 01RTT packet number space probing This should be backported up to 2.7.	2023-03-21 14:52:02 +01:00
Amaury Denoyelle	2a19b6e564	BUG/MINOR: quic: wake up MUX on probing only for 01RTT On PTO probe timeout expiration, a probe packet must be emitted. quic_pto_pktns() is used to determine for which packet space the timer has expired. However, if MUX is already subscribed for sending, it is woken up without checking first if this happened for the 01RTT packet space. It is unsure that this is really a bug as in most cases, MUX is established only after Initial and Handshake packet spaces are removed. However, the situation is not so clear when 0-RTT is used. For this reason, adjust the code to explicitly check for the 01RTT packet space before waking up the MUX layer. This should be backported up to 2.6. Note that qc_notify_send() does not exist in 2.6 so it should be replaced by the explicit block checking (qc->subs && qc->subs->events & SUB_RETRY_SEND).	2023-03-21 14:09:50 +01:00
Willy Tarreau	465a6c8506	BUG/MEDIUM: applet: only set appctx->sedesc on successful allocation If appctx_new_on() fails to allocate a task, it will not remove the freshly allocated sedesc from the appctx despite freeing it, causing a UAF. Let's only assign appctx->sedesc upon success. This needs to be backported to 2.6. In 2.6 the function is slightly different and called appctx_new(), though the issue is exactly the same.	2023-03-21 10:50:51 +01:00
Willy Tarreau	a220e59ad8	BUG/MEDIUM: mux-h1: properly destroy a partially allocated h1s In h1c_frt_stream_new() and h1c_bck_stream_new(), if we fail to completely initialize the freshly allocated h1s, typically because sc_attach_mux() fails, we must use h1s_destroy() to de-initialize it. Otherwise it stays attached to the h1c when released, causing use-after-free upon the next wakeup. This can be triggered upon memory shortage. This needs to be backported to 2.6.	2023-03-21 10:44:44 +01:00
Willy Tarreau	0c4348c982	MINOR: pools: preset the allocation failure rate to 1% with -dMfail Using -dMfail alone does nothing unless tune.fail-alloc is set, which renders it pretty useless as-is, and is not intuitive. Let's change this so that the failure rate is preset to 1% when the option is set on the command line. This allows to inject failures without having to edit the configuration.	2023-03-21 09:26:55 +01:00
Willy Tarreau	69869e6354	MINOR: dynbuf: set POOL_F_NO_FAIL on buffer allocation b_alloc() is used to allocate a buffer. We can provoke fault injection based on forced memory allocation failures using -dMfail on the command line, but we know that the buffer_wait list is a bit weak and doesn't always recover well. As such, submitting buffer allocation to such a treatment seriously limits the usefulness of -dMfail which cannot really be used for other purposes. Let's just disable it for buffers for now.	2023-03-21 09:15:13 +01:00
Willy Tarreau	7a8ca0a063	BUG/MINOR: stconn: fix sedesc memory leak on stream allocation failure If we fail to allocate a new stream in sc_new_from_endp(), and the call to sc_new() allocated the sedesc itself (which normally doesn't happen), then it doesn't get released on the failure path. Let's explicitly handle this case so that it's not overlooked and avoids some head scratching sessions. This may be backported to 2.6.	2023-03-20 19:58:38 +01:00
Willy Tarreau	e2f7946339	BUG/MEDIUM: stconn: don't set the type before allocation succeeds There's an occasional crash that can be triggered in sc_detach_endp() when calling conn->mux->detach() upon memory allocation error. The problem in fact comes from sc_attach_mux(), which doesn't reset the sc type flags upon tasklet allocation failure, leading to an attempt at detaching an incompletely initialized stconn. Let's just attach the sc after the tasklet allocation succeeds, not before. This must be backported to 2.6.	2023-03-20 19:58:38 +01:00
Willy Tarreau	389ab0d4b4	BUG/MEDIUM: mux-h2: erase h2c->wait_event.tasklet on error path On the allocation error path in h2_init() we may check if h2c->wait_event.tasklet needs to be released but it has not yet been zeroed. Let's do this before jumping to the freeing location. This needs to be backported to all maintained versions.	2023-03-20 19:58:38 +01:00
Willy Tarreau	bcdc6cc15b	BUG/MEDIUM: mux-h2: do not try to free an unallocated h2s->sd In h2s_close() we may dereference h2s->sd to get the sc, but this function may be called on allocation error paths, so we must check for this specific condition. Let's also update the comment to make it explicitly permitted. This needs to be backported to 2.6.	2023-03-20 19:58:38 +01:00
Willy Tarreau	a45e7e81ec	BUG/MEDIUM: stream: do not try to free a failed stream-conn In stream_free() if we fail to allocate s->scb() we go to the path where we try to free it, and it doesn't like being called with a null at all. It's easily reproducible with -dMfail,no-cache and "tune.fail-alloc 10" in the global section. This must be backported to 2.6.	2023-03-20 19:58:38 +01:00
Frédéric Lécaille	e25fce03eb	BUG/MINOR: quic: Dysfunctional 01RTT packet number space probing This bug arrived with this commit: "MINOR: quic: implement qc_notify_send()". The ->tx.pto_probe variable was no longer set by qc_process_timer(), the timer task for the connection responsible for detecting packet loss and probing upon PTO expiration, leading to interrupted stream transfers. This was revealed by failed blackhole interop tests where one could see that qc_process_timer() was woken up without traces such as the following one in the log file: "needs to probe 01RTT packet number space" Must be backported to 2.7 and to 2.6 if the commit mentioned above is backported to 2.6 in the meantime.	2023-03-20 17:50:36 +01:00
Frédéric Lécaille	c664e644eb	MINOR: quic: Stop stressing the acknowledgments process (RX ACK frames) The packets of an ACK frame range were handled from the largest to the smallest packet number, leading to a large number of ebtree insertions because the packets were handled in the reverse order of their emission. This was detected a long time ago but left in the code to stress our implementation. It is time to be more efficient and process the packets in a way that avoids useless ebtree insertions. Modify qc_ackrng_pkts(), which is responsible for handling the packets acknowledged by an ACK frame range. Must be backported to 2.7.	2023-03-20 17:47:12 +01:00
Willy Tarreau	ac78c4fd9d	MINOR: ssl-sock: pass the CO_SFL_MSG_MORE info down the stack Despite having replaced the SSL BIOs to use our own raw_sock layer, we still didn't exploit the CO_SFL_MSG_MORE flag which is pretty useful to avoid sending incomplete packets. It's particularly important for SSL since the extra overhead almost guarantees that each send() will be followed by an incomplete (and often odd-sided) segment. We already have an xprt_st set of flags to pass info to the various layers, so let's just add a new one, SSL_SOCK_SEND_MORE, that is set or cleared during ssl_sock_from_buf() to transfer the knowledge of CO_SFL_MSG_MORE. This way we can recover this information and pass it to raw_sock. This alone is sufficient to increase by ~5-10% the H2 bandwidth over SSL when multiple streams are used in parallel.	2023-03-17 16:43:51 +01:00
Willy Tarreau	464fa06e9a	MINOR: mux-h2: set CO_SFL_MSG_MORE when sending multiple buffers Traces show that sendto() rarely has MSG_MORE on H2 despite sending multiple buffers. The reason is that the loop iterating over the buffer ring doesn't have this info and doesn't pass it down. But now we know how many buffers are left to be sent, so we know whether or not the current buffer is the last one. As such we can set this flag for all buffers but the last one.	2023-03-17 16:43:51 +01:00
Willy Tarreau	88718955f4	OPTIM: mux-h1: limit first read size to avoid wrapping Before muxes were used, we used to refrain from reading past the buffer's reserve. But with muxes which have their own buffer, this rule was a bit forgotten, resulting in an extraneous read to be performed just because the rx buffer cannot be entirely transferred to the stream layer: sendto(12, "GET /?s=16k HTTP/1.1\r\nhost: 127."..., 84, MSG_DONTWAIT\|MSG_NOSIGNAL, NULL, 0) = 84 recvfrom(12, "HTTP/1.1 200\r\nContent-length: 16"..., 16320, 0, NULL, NULL) = 16320 recvfrom(12, ".123456789.12345", 16, 0, NULL, NULL) = 16 recvfrom(12, "6789.123456789.12345678\n.1234567"..., 15244, 0, NULL, NULL) = 182 recvfrom(12, 0x1e5d5d6, 15062, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) Here the server sends 16kB of payload after a headers block, the mux reads 16320 into the ibuf, and the stream layer consumes 15360 from the first h1_rcv_buf(), which leaves 960 into the buffer and releases a few indexes. The buffer cannot be realigned due to these remaining data, and a subsequent read is made on 16 bytes, then again on 182 bytes. By avoiding to read too much on the first call, we can avoid needlessly filling this buffer: recvfrom(12, "HTTP/1.1 200\r\nContent-length: 16"..., 15360, 0, NULL, NULL) = 15360 recvfrom(12, "456789.123456789.123456789.12345"..., 16220, 0, NULL, NULL) = 1158 recvfrom(12, 0x1d52a3a, 15062, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) This is much more efficient and uses less RAM since the first buffer that was emptied can now be released. Note that a further improvement (tested) consists in reading even less (typically 1kB) so that most of the data are transferred in zero-copy, and are not read until process_stream() is scheduled. This patch doesn't do that for now so that it can be backported without any obscure impact.	2023-03-17 16:43:51 +01:00
Willy Tarreau	f41dfc22b2	BUG/MAJOR: qpack: fix possible read out of bounds in static table CertiK Skyfall Team reported that passing an index greater than QPACK_SHT_SIZE in a qpack instruction referencing a literal field name with name reference or and indexed field line will cause a read out of bounds that may crash the process, and confirmed that this fix addresses the issue. This needs to be backported as far as 2.5.	2023-03-17 16:43:51 +01:00
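The class of bug fixed by the commit above comes down to validating a peer-supplied index before using it to read a fixed-size table. The sketch below is not haproxy's code: the table is a three-entry excerpt of the QPACK static table from RFC 9204 (the real one has 99 entries), and sht_lookup() and SHT_SIZE are hypothetical names. Note that the index must be rejected as soon as it is equal to the table size, not only when it is greater.

```c
#include <stddef.h>
#include <stdint.h>

struct sht_entry {
    const char *name;
    const char *value;
};

/* Excerpt of the QPACK static table (first three entries only) */
static const struct sht_entry sht[] = {
    { ":authority", ""  },
    { ":path",      "/" },
    { "age",        "0" },
};

#define SHT_SIZE (sizeof(sht) / sizeof(sht[0]))

/* Returns the static table entry at index <idx>, or NULL if <idx> comes
 * from the wire and designates an entry that does not exist. The caller
 * is expected to turn NULL into a decoding error.
 */
static const struct sht_entry *sht_lookup(uint64_t idx)
{
    if (idx >= SHT_SIZE)
        return NULL;
    return &sht[idx];
}
```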
Aurelien DARRAGON	5b4e16ee2d	MINOR: doc: missing entries for sc-add-gpc() When sc-add-gpc() action was implemented in 5a72d03 ("MINOR: stick-table: implement the sc-add-gpc() action"), its usage was only documented for "http-request", but according to the code it now applies everywhere sc-inc-gpc() is mentioned. Adding the missing entries in the doc everywhere the action may be used. The issue was detected by the haproxy-controller bot and was reported by Pratik Mohanty and Marko Juraga. No backport needed, unless 5a72d03 ("MINOR: stick-table: implement the sc-add-gpc() action") is being backported.	2023-03-17 13:09:09 +01:00
Aurelien DARRAGON	e2907c7ee3	MINOR: stick-table: add sc-add-gpc() to http-after-response sc-add-gpc() was implemented in 5a72d03 ("MINOR: stick-table: implement the sc-add-gpc() action") This new action was exposed everywhere sc-inc-gpc() is available, except for http-after-response. But there doesn't seem to be a technical constraint that prevents us from exposing it in http-after-response. It was probably overlooked, let's add it. No backport needed, unless 5a72d03 ("MINOR: stick-table: implement the sc-add-gpc() action") is being backported.	2023-03-17 13:09:09 +01:00
Frédéric Lécaille	ca07979b97	BUG/MINOR: quic: Missing STREAM frame data pointer updates This patch follows this one which was not sufficient: "BUG/MINOR: quic: Missing STREAM frame length updates" Indeed, it is not sufficient to update the ->len and ->offset member of a STREAM frame to move it forward. The data pointer must also be updated. This is not done by the STREAM frame builder. Must be backported to 2.6 and 2.7.	2023-03-17 09:21:18 +01:00
Willy Tarreau	14ea98af73	BUG/MINOR: mux-h2: set CO_SFL_STREAMER when sending lots of data Emeric noticed that h2 bit-rate performance was always slightly lower than h1 when the CPU is saturated. Strace showed that we were always sending data in 2kB chunks, corresponding to the max_record size. What's happening is that when this mechanism of dynamic record size was introduced, the STREAMER flag at the stream level was relied upon. Since all this was moved to the muxes, the flag has to be passed as an argument to the snd_buf() function, but the mux h2 did not use it despite a comment mentioning it, probably because before the multi-buf it was not easy to figure the status of the buffer. The solution here consists in checking if the mbuf is congested or not, by checking if it has more than one buffer allocated. If so we set the CO_SFL_STREAMER flag, otherwise we don't. This way moderate size exchanges continue to be made over small chunks, but downloads will be able to use the large ones. While it could be backported to all supported versions, it would be better to limit it to the last LTS, so let's do it for 2.7 and 2.6 only. This patch requires previous commit "MINOR: buffer: add br_single() to check if a buffer ring has more than one buf".	2023-03-16 18:45:46 +01:00
Willy Tarreau	93c5511af8	BUG/MEDIUM: mux-h2: only restart sending when mux buffer is decongested During performance tests, Emeric faced a case where the wakeups of sc_conn_io_cb() caused by h2_resume_each_sending_h2s() was multiplied by 5-50 and a lot of CPU was being spent doing this for apparently no reason. The culprit is h2_send() not behaving well with congested buffers and small SSL records. What happens when the output is congested is that all buffers are full, and data are emitted in 2kB chunks, which are sufficient to wake all streams up again to ask them to send data again, something that will obviously only work for one of them at best, and waste a lot of CPU in wakeups and memcpy() due to the small buffers. When this happens, the performance can be divided by 2-2.5 on large objects. Here the chosen solution against this is to keep in mind that as long as there are still at least two buffers in the ring after calling xprt->snd_buf(), it means that the output is congested and there's no point trying again, because these data will just be placed into such buffers and will wait there. Instead we only mark the buffer decongested once we're back to a single allocated buffer in the ring. By doing so we preserve the ability to deal with large concurrent bursts while not causing a thundering herd by waking all streams for almost nothing. This needs to be backported to 2.7 and 2.6. Other versions could benefit from it as well but it's not strictly necessary, and we can reconsider this option if some excess calls to sc_conn_io_cb() are faced. Note that this fix depends on this recent commit: MINOR: buffer: add br_single() to check if a buffer ring has more than one buf	2023-03-16 18:45:46 +01:00
Willy Tarreau	9824f8c890	MINOR: buffer: add br_single() to check if a buffer ring has more than one buf It's cheaper and cleaner than using br_count()==1 given that it just compares two indexes, and that a ring having a single buffer is in a special case where it is between empty and used up-to-1. In other words it's not congested.	2023-03-16 18:45:46 +01:00
Willy Tarreau	e5a26eb2de	MINOR: buffer: add br_count() to return the number of allocated bufs We have no way to know how many buffers are currently allocated in a buffer ring. Let's add br_count() for this.	2023-03-16 18:45:46 +01:00
Willy Tarreau	3fb2c6d5b4	BUG/MINOR: mux-h2: make sure the h2c task exists before refreshing it When detaching a stream, if it's the last one and the mbuf is blocked, we leave without freeing the stream yet. We also refresh the h2c task's timeout, except that it's possible that there's no such task in case there is no client timeout, causing a crash. The fix just consists in doing this when the task exists. This bug has always been there and is extremely hard to meet even without a client timeout. This fix has to be backported to all branches, but it's unlikely anyone has ever met it anyway.	2023-03-16 18:45:46 +01:00


19791 Commits