haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-20 14:11:19 +02:00

Author	SHA1	Message	Date
Nicolas CARPi	a33407b499	CLEANUP: mqtt: fix typo in MQTT_REMAINING_LENGHT_MAX_SIZE There was a typo in the macro name, where LENGTH was incorrectly written. This didn't cause any issue because the typo appeared in all occurrences in the codebase.	2024-08-30 14:58:59 +02:00
Christopher Faulet	62c9d51ca4	BUG/MINIR: proxy: Match on 429 status when trying to perform a L7 retry Support for 429 was recently added to L7 retries (0d142e075 "MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status"). But the l7_status_match() function was not properly updated. The switch statement must match the 429 status to be able to perform a L7 retry. This patch must be backported if the commit above is backported. It is related to #2687.	2024-08-30 12:13:32 +02:00
Christopher Faulet	0d142e0756	MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status The "429" status can now be specified on retry-on directives. PR_RE_* flags were updated to remains sorted. This patch should fix the issue #2687. It is quite simple so it may safely be backported to 3.0 if necessary.	2024-08-28 10:05:34 +02:00
William Lallemand	e8fecef0ff	MEDIUM: ssl: capture the signature_algorithms extension from Client Hello Activate the capture of the TLS signature_algorithms extension from the Client Hello. This list is stored in the ssl_capture buffer when the global option "tune.ssl.capture-cipherlist-size" is enabled.	2024-08-26 15:17:40 +02:00
William Lallemand	ce7fb6628e	MEDIUM: ssl: capture the supported_versions extension from Client Hello Activate the capture of the TLS supported_versions extension from the Client Hello. This list is stored in the ssl_capture buffer when the global option "tune.ssl.capture-cipherlist-size" is enabled.	2024-08-26 15:12:42 +02:00
Valentine Krasnobaeva	7b78e1571b	MINOR: mworker: restore initial env before wait mode This patch is the follow-up of 1811d2a6ba (MINOR: tools: add helpers to backup/clean/restore env). In order to avoid unexpected behaviour in master-worker mode during the process reload with a new configuration, when the old one has contained '*env' keywords, let's backup its initial environment before calling parse_cfg() and let's clean and restore it in the context of master process, just before it enters in a wait polling loop. This will garantee that new workers will have a new updated environment and not the previous one inherited from the master, which does not read the configuration, when it's in a wait-mode.	2024-08-23 17:06:59 +02:00
Valentine Krasnobaeva	1811d2a6ba	MINOR: tools: add helpers to backup/clean/restore env 'setenv', 'presetenv', 'unsetenv', 'resetenv' keywords in configuration could modify the process runtime environment. In case of master-worker mode this creates a problem, as the configuration is read only once before the forking a worker and then the master process does the reexec without reading any config files, just to free the memory. So, during the reload a new worker process will be created, but it will inherited the previous unchanged environment from the master in wait mode, thus it won't benefit the changes in configuration, related to '*env' keywords. This may cause unexpected behavior or some parser errors in master-worker mode. So, let's add a helper to backup all process env variables just before it will read its configuration. And let's also add helpers to clean up the current runtime environment and to restore it to its initial state (as it was before parsing the config).	2024-08-23 17:06:33 +02:00
Willy Tarreau	2a799b64b0	MINOR: protocol: add the real address family to the protocol For custom families, there's sometimes an underlying real address and it would be nice to be able to directly use the real family in calls to bind() and connect() without having to add explicit checks for exceptions everywhere. Let's add a .real_family field to struct proto_fam for this. For now it's always equal to the family except for non-transferable ones such as rhttp where it's equal to the custom one (anything else could fit).	2024-08-21 17:37:46 +02:00
Willy Tarreau	ba4a416c66	MINOR: protocol: add a family lookup At plenty of places we have access to an address family which may include some custom addresses but we cannot simply convert them to the real families without performing some random protocol lookups. Let's simply add a proto_fam table like we have for the protocols. The protocols could even be indexed there, but for now it's not worth it.	2024-08-21 16:46:15 +02:00
Willy Tarreau	732913f848	MINOR: protocol: properly assign the sock_domain and sock_family When we finally split sock_domain from sock_family in 2.3, something was not cleanly finished. The family is what should be stored in the address while the domain is what is supposed to be passed to socket(). But for the custom addresses, we did the opposite, just because the protocol_lookup() function was acting on the domain, not the family (both of which are equal for non-custom addresses). This is an API bug but there's no point backporting it since it does not have visible effects. It was visible in the code since a few places were using PF_UNIX while others were comparing the domain against AF_MAX instead of comparing the family. This patch clarifies this in the comments on top of proto_fam, addresses the indexing issue and properly reconfigures the two custom families.	2024-08-21 16:46:15 +02:00
Willy Tarreau	67bf1d6c9e	MINOR: quic: support a tolerance for spurious losses Tests performed between a 1 Gbps connected server and a 100 mbps client, distant by 95ms showed that: - we need 1.1 MB in flight to fill the link - rare but inevitable losses are sufficient to make cubic's window collapse fast and long to recover - a 100 MB object takes 69s to download - tolerance for 1 loss between two ACKs suffices to shrink the download time to 20-22s - 2 losses go to 17-20s - 4 losses reach 14-17s At 100 concurrent connections that fill the server's link: - 0 loss tolerance shows 2-3% losses - 1 loss tolerance shows 3-5% losses - 2 loss tolerance shows 10-13% losses - 4 loss tolerance shows 23-29% losses As such while there can be a significant gain sometimes in setting this tolerance above zero, it can also significantly waste bandwidth by sending far more than can be received. While it's probably not a solution to real world problems, it repeatedly proved to be a very effective troubleshooting tool helping to figure different root causes of low transfer speeds. In spirit it is comparable to the no-cc congestion algorithm, i.e. it must not be used except for experimentation.	2024-08-21 08:34:30 +02:00
Willy Tarreau	fab0e99aa1	MINOR: quic: store the lost packets counter in the quic_cc_event element Upon loss detection, qc_release_lost_pkts() notifies congestion controllers about the event and its final time. However it does not pass the number of lost packets, that can provide useful hints for some controllers. Let's just pass this option.	2024-08-21 08:02:44 +02:00
Amaury Denoyelle	0d6112b40b	MINOR: mux-quic: retry after small buf alloc failure Previous commit switch to small buffers for HTTP/3 HEADERS emission. This ensures that several parallel streams can allocate their own buffer without hitting the connection buffer limit based now on the congestion window size. However, this prevents the transmission of responses with uncommonly large headers. Indeed, if all headers cannot be encoded in a single buffer, an error is reported which cause the whole connection closure. Adjust this by implementing a realloc API exposed by QUIC MUX. This allows application layer to switch from a small to a default buffer and restart its processing. This guarantees that again headers not longer than bufsize can be properly transferred.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	885e4c5cf8	MINOR: quic: support sbuf allocation in quic_stream This patch extends qc_stream_desc API to be able to allocate small buffers. QUIC MUX API is similarly updated as ultimatly each application protocol is responsible to choose between a default or a smaller buffer. Internally, the type of allocated buffer is remembered via qc_stream_buf instance. This is mandatory to ensure that the buffer is released in the correct pool, in particular as small and standard buffers can be configured with the same size. This commit is purely an API change. For the moment, small buffers are not used. This will changed in a dedicated patch.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	d0d8e57d47	MINOR: quic: define sbuf pool Define a new buffer pool reserved to allocate smaller memory area. For the moment, its usage will be restricted to QUIC, as such it is declared in quic_stream module. Add a new config option "tune.bufsize.small" to specify the size of the allocated objects. A special check ensures that it is not greater than the default bufsize to avoid unexpected effects.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	1de5f718cf	MINOR: quic/config: adapt settings to new conn buffer limit QUIC MUX buffer allocation limit is now directly based on the underlying congestion window size. previous static limit based on conn-tx-buffers is now unused. As such, this commit adds a warning to users to prevent that it is now obsolete. Secondly, update max-window-size setting. It is now the main entrypoint to limit both the maximum congestion window size and the number of QUIC MUX allocated buffer on emission. Remove its special value '0' which was used to automatically adjust it on now unused conn-tx-buffers.	2024-08-20 17:59:35 +02:00
Amaury Denoyelle	aeb8c1ddc3	MAJOR: mux-quic: allocate Tx buffers based on congestion window Each QUIC MUX may allocate buffers for MUX stream emission. These buffers are then shared with quic_conn to handle ACK reception and retransmission. A limit on the number of concurrent buffers used per connection has been defined statically and can be updated via a configuration option. This commit replaces the limit to instead use the current underlying congestion window size. The purpose of this change is to remove the artificial static buffer count limit, which may be difficult to choose. Indeed, if a connection performs with minimal loss rate, the buffer count would limit severely its throughput. It could be increase to fix this, but it also impacts others connections, even with less optimal performance, causing too many extra data buffering on the MUX layer. By using the dynamic congestion window size, haproxy ensures that MUX buffering corresponds roughly to the network conditions. Using QCC <buf_in_flight>, a new buffer can be allocated if it is less than the current window size. If not, QCS emission is interrupted and haproxy stream layer will subscribe until a new buffer is ready. One of the criticals parts is to ensure that MUX layer previously blocked on buffer allocation is properly woken up when sending can be retried. This occurs on two occasions : * after an already used Tx buffer is cleared on ACK reception. This case is already handled by qcc_notify_buf() via quic_stream layer. * on congestion window increase. A new qcc_notify_buf() invokation is added into qc_notify_send(). Finally, remove <avail_bufs> QCC field which is now unused. This commit is labelled MAJOR as it may have unexpected effect and could cause significant behavior change. For example, in previous implementation QUIC MUX would be able to buffer more data even if the congestion window is small. With this patch, data cannot be transferred from the stream layer which may cause more streams to be shut down on client timeout. Another effect may be more CPU consumption as the connection limit would be hit more often, causing more streams to be interrupted and woken up in cycle.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	000976af58	MINOR: mux-quic: define buf_in_flight Define a new QCC counter named <buf_in_flight>. Its purpose is to account the current sum of all allocated stream buffer size used on emission. For this moment, this counter is updated and buffer allocation and deallocation. It will be used to replace <avail_bufs> once congestion window is used as limit for buffer allocation in a future commit.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	4c4bf26f44	MEDIUM: mux-quic: implement API to ignore txbuf limit for some streams Define a new qc_stream_desc flag QC_SD_FL_OOB_BUF. This is to mark streams which are not subject to the connection limit on allocated MUX stream buffer. The purpose is to simplify handling of QUIC MUX streams which do not transfer data and as such are not driven by haproxy layer, for example HTTP/3 control stream. These streams interacts synchronously with QUIC MUX and cannot retry emission in case of temporary failure. This commit will be useful once connection buffer allocation limit is reimplemented to directly rely on the congestion window size. This will probably cause the buffer limit to be reached more frequently, maybe even on QUIC MUX initialization. As such, it will be possible to mark control streams and prevent them to be subject to the buffer limit. QUIC MUX expose a new function qcs_send_metadata(). It can be used by an application protocol to specify which streams are used for control exchanges. For the moment, no such stream use this mechanism.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	f4d1bd0b76	MINOR: mux-quic: account stream txbuf in QCC A limit per connection is put on the number of buffers allocated by QUIC MUX for emission accross all its streams. This ensures memory consumption remains under control. This limit is simply explained as a count of buffers which can be concurrently allocated for each connection. As such, quic_conn structure was used to account currently allocated buffers. However, a quic_conn nevers allocates new stream buffers. This is only done at QUIC MUX layer. As such, this commit moves buffer accounting inside QCC structure. This simplifies the API, most notably qc_stream_buf_alloc() usage. Note that this commit inverts the accounting. Previously, it was initially set to 0 and increment for each allocated buffer. Now, it is set to the maximum value and decrement for each buf usage. This is considered as clearer to use.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	c24c8667b2	MINOR: quic: define max-window-size config setting Define a new global keyword tune.quic.frontend.max-window-size. This allows to set globally the maximum congestion window size for each QUIC frontend connections. The default value is 0. It is a special value which automatically derive the size from the configured QUIC connection buffer limit. This is similar to the previous "quic-cc-algo" behavior, which can be used to override the maximum window size per bind line.	2024-08-20 17:02:29 +02:00
Valentine Krasnobaeva	8b1dfa9def	MINOR: cfgparse: limit file size loaded via /dev/stdin load_cfg_in_mem() can continuously reallocate memory in order to load an extremely large input from /dev/stdin, until it fails with ENOMEM, which means that process has consumed all available RAM. In case of containers and virtualized environments it's not very good. So, in order to prevent this, let's introduce MAX_CFG_SIZE as 10MB, which will limit the size of input supplied via /dev/stdin.	2024-08-20 14:28:34 +02:00
Nathan Wehrman	fd48b28315	MINOR: Implements new log format of option tcplog clf Some systems require log formats in the CLF format and that meant that I could not send my logs for proxies in mode tcp to those servers. This implements a format that uses log variables that are compatble with TCP mode frontends and replaces traditional HTTP values in the CLF format to make them stand out. Instead of logging method and URI like this "GET /example HTTP/1.1" it will log "TCP " and for a response code I used "000" so it would be easy to separate from legitimate HTTP traffic. Now your log servers that require a CLF format can see the timings for TCP traffic as well as HTTP.	2024-08-20 07:46:34 +02:00
Aurelien DARRAGON	f8299bc5ea	MINOR: log: "drop" support for log-profile steps It is now possible to use "drop" keyword for "on" lines under a log-profile section to specify that no log at all should be emitted for the specified step (setting an empty format was not sufficient to do so because only the log payload would be empty, not the log header, thus the log would still be emitted). It may be useful to selectively disable logging at specific steps for a given log target (since the log profile may be set on log directives): log-profile myprof on request format "blabla" sd "custom sd" on response drop New testcase was added to reg-tests/log/log_profiles.vtc	2024-08-19 18:53:01 +02:00
William Lallemand	b2a8e8731d	MINOR: channel: implement ci_insert() function ci_insert() is a function which allows to insert a string <str> of size <len> at <pos> of the input buffer. This is the equivalent of ci_insert_line2() but without inserting '\r\n'	2024-08-08 17:29:37 +02:00
Valentine Krasnobaeva	c6cfa7cb4a	MINOR: startup: rename readcfgfile in parse_cfg As readcfgfile no longer opens configuration files and reads them with fgets, but performs only the parsing of provided data, let's rename it to parse_cfg by analogy with read_cfg in haproxy.c.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	5b52df4c4d	MEDIUM: startup: load and parse configs from memory Let's call load_cfg_in_ram() helper for each configuration file to load it's content in some area in memory. Adapt readcfgfile() parser function respectively. In order to limit changes in its scope we give as an argument a cfgfile structure, already filled in init_args() and in load_cfg_in_ram() with file metadata and content. Parser function (readcfgfile()) uses now fgets_from_mem() instead of standard fgets from libc implementations. SPOE filter parses its own configuration file, pointed by 'config' keyword in the configuration already loaded in memory. So, let's allocate and fill for this a supplementary cfgfile structure, which is not referenced in cfg_cfgfiles list. This structure and the memory with content of SPOE filter configuration are freed immediately in parse_spoe_flt(), when readcfgfile() returns. HAProxy OpenTracing filter also uses its own configuration file. So, let's follow the same logic as we do for SPOE filter.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	007f7f2f02	MINOR: tools: add fgets_from_mem Add fgets_from_mem() helper to read lines from configuration files, stored now as memory chunks. In order to limit changes in the first-level parser code (readcfgfile()), it is better to reimplement the standard fgets, i.e. to have a fgets, which can read the serialized data line by line from some memory area, instead of file stream, and can keep the same behaviour as libc implementations fgets.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	5b9ed6e4be	MINOR: cfgparse: add load_cfg_in_mem Add load_cfg_in_mem() helper, which allows to store the content of a given file in memory.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	bafb0ce272	MINOR: startup: adapt list_append_word to use cfgfile list_append_word() helper was used before only to chain configuration file names in a list. As now we start to use cfgfile structure which represents entire file in memory and its metadata, let's adapt this helper to use this structure and let's rename it to list_append_cfgfile(). Adapt functions, which process configuration files and directories to use cfgfile structure and list_append_cfgfile() instead of wordlist.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	39f2a19620	REORG: tools: move list_append_word to cfgparse Let's move list_append_word to cfgparse.c as it is used only to fill cfg_cfgfiles list with configuration file names.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	70b842e847	MINOR: cfgparse: add struct cfgfile to represent config in memory This and following commits serve to prepare loading configuration files in memory, before parsing them, as we may need to parse some parts of configuration in different moments of the startup sequence. This is a case of the new master-worker initialization process. Here we need to read at first only the global and the program sections and only after some steps (forking worker, etc) the rest of the configuration. Add a new structure cfgfile to keep configuration files metadata and content, loaded somewhere in a memory. Instances of filled cfgfile structures could be chained in a list, as the order in which they were loaded is important.	2024-08-07 18:41:41 +02:00
Willy Tarreau	10c8baca44	MINOR: trace: add a per-source helper to pre-fill the context Now sources which want to do it can provide a helper that can pre-fill some fields in the context based on their knowledge (e.g. mux streams).	2024-08-07 16:02:59 +02:00
Willy Tarreau	7d55a70f5a	MINOR: trace: move the known trace context into a dedicated struct We now have a trace_ctx to hold the sess, conn, qc, stream and so on. This will allow us to pass it across layers so that other helpers can help fill them. Ideally it should be passed as an argument to __trace_enabled() by __trace() so that it can be passed back to the trace callback. But it seems that trace callbacks are smart enough to figure all their info when they need them.	2024-08-07 16:02:59 +02:00
Willy Tarreau	d465610ec3	MEDIUM: trace: implement a "follow" mechanism With "follow" from one source to another, it becomes possible for a source to automatically follow another source's tracked pointer. The best example is the session: - the "session" source is enabled and has a "lockon session" -> its lockon_ptr is equal to the session when valid - other sources (h1,h2,h3 etc) are configured for "follow session" and will then automatically check if session's lockon_ptr matches its own session, in which case tracing will be enabled for that trace (no state change). It's not necessary to start/pause/stop traces when using this, only "follow" followed by a source with lockon enabled is needed. Some combinations might work better than others. At the moment the session is almost never known from the backend, but this may improve. The meta-source "all" is supported for the follower so that all sources will follow the tracked one.	2024-08-07 16:02:59 +02:00
Amaury Denoyelle	9f829ea3f3	MINOR: mux-quic: measure QCS lifetime and its blocking state Reuse newly defined tot_time structure to measure various values related to a QCS lifetime. First, a timer is used to comptabilize the total QCS lifetime. Then, two other timers are used to account the total time during which Tx from stream layer to MUX is blocked, either on lack of buffer or due to flow-control. These three timers are reported in qmux_dump_qcs_info(). Thus, they are available in traces and for QUIC MUX debug string sample.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	a6e2523ca1	MINOR: time: define tot_time structure Define a new utility type tot_time. Its purpose is to be able to account elapsed time accross multiple periods. Functions are defined to easily start and stop measures, and return the current value.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	663416b4ef	MINOR: quic: dump quic_conn debug string for logs Define a new xprt_ops callback named dump_info. This can be used to extend MUX debug string with infos from the lower layer. Implement dump_info for QUIC stack. For now, only minimal info are reported : bytes in flight and size of the sending window. This should allow to detect if the congestion controller is fine. These info are reported via QUIC MUX debug string sample.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	eb4dfa3b36	MINOR: mux-quic: define dump functions for QCC and QCS Extract trace code to dump QCC and QCS instances into dedicated functions named qmux_dump_qc{c,s}_info(). This will allow to easily print QCC/QCS infos outside of traces.	2024-08-07 15:40:52 +02:00
Willy Tarreau	921e04bf87	MINOR: stconn: add a new pair of sf functions {bs,fs}.debug_str These are passed to the underlying mux to retrieve debug information at the mux level (stream/connection) as a string that's meant to be added to logs. The API is quite complex just because we can't pass any info to the bottom function. So we construct a union and pass the argument as an int, and expect the callee to fill that with its buffer in return. Most likely the mux->ctl and ->sctl API should be reworked before the release to simplify this. The functions take an optional argument that is a bit mask of the layers to dump: muxs=1 muxc=2 xprt=4 conn=8 sock=16 The default (0) logs everything available.	2024-08-07 14:07:41 +02:00
Amaury Denoyelle	e177cf341c	BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM STREAM frames have dedicated handling on retransmission. A special check is done to remove data already acked in case of duplicated frames, thus only unacked data are retransmitted. This handling is faulty in case of an empty STREAM frame with FIN set. On retransmission, this frame does not cover any unacked range as it is empty and is thus discarded. This may cause the transfer to freeze with the client waiting indefinitely for the FIN notification. To handle retransmission of empty FIN STREAM frame, qc_stream_desc layer have been extended. A new flag QC_SD_FL_WAIT_FOR_FIN is set by MUX QUIC when FIN has been transmitted. If set, it prevents qc_stream_desc to be freed until FIN is acknowledged. On retransmission side, qc_stream_frm_is_acked() has been updated. It now reports false if FIN bit is set on the frame and qc_stream_desc has QC_SD_FL_WAIT_FOR_FIN set. This must be backported up to 2.6. However, this modifies heavily critical section for ACK handling and retransmission. As such, it must be backported only after a period of observation. This issue can be reproduced by using the following socat command as server to add delay between the response and connection closure : $ socat TCP-LISTEN:<port>,fork,reuseaddr,crlf SYSTEM:'echo "HTTP/1.1 200 OK"; echo ""; sleep 1;' On the client side, ngtcp2 can be used to simulate packet drop. Without this patch, connection will be interrupted on QUIC idle timeout or haproxy client timeout with ERR_DRAINING on ngtcp2 : $ ngtcp2-client --exit-on-all-streams-close -r 0.3 <host> <port> "http://<host>:<port>/?s=32o" Alternatively to ngtcp2 random loss, an extra haproxy patch can also be used to force skipping the emission of the empty STREAM frame : diff --git a/include/haproxy/quic_tx-t.h b/include/haproxy/quic_tx-t.h index efbdfe687..1ff899acd 100644 --- a/include/haproxy/quic_tx-t.h +++ b/include/haproxy/quic_tx-t.h @@ -26,6 +26,8 @@ extern struct pool_head pool_head_quic_cc_buf; / Flag a sent packet as being probing with old data / #define QUIC_FL_TX_PACKET_PROBE_WITH_OLD_DATA (1UL << 5) +#define QUIC_FL_TX_PACKET_SKIP_SENDTO (1UL << 6) + / Structure to store enough information about TX QUIC packets. / struct quic_tx_packet { / List entry point. / diff --git a/src/quic_tx.c b/src/quic_tx.c index 2f199ac3c..2702fc9b9 100644 --- a/src/quic_tx.c +++ b/src/quic_tx.c @@ -318,7 +318,7 @@ static int qc_send_ppkts(struct buffer buf, struct ssl_sock_ctx ctx) tmpbuf.size = tmpbuf.data = dglen; TRACE_PROTO("TX dgram", QUIC_EV_CONN_SPPKTS, qc); - if (!skip_sendto) { + if (!skip_sendto && !(first_pkt->flags & QUIC_FL_TX_PACKET_SKIP_SENDTO)) { int ret = qc_snd_buf(qc, &tmpbuf, tmpbuf.data, 0, gso); if (ret < 0) { if (gso && ret == -EIO) { @@ -354,6 +354,7 @@ static int qc_send_ppkts(struct buffer buf, struct ssl_sock_ctx ctx) qc->cntrs.sent_bytes_gso += ret; } } + first_pkt->flags &= ~QUIC_FL_TX_PACKET_SKIP_SENDTO; b_del(buf, dglen + QUIC_DGRAM_HEADLEN); qc->bytes.tx += tmpbuf.data; @@ -2066,6 +2067,17 @@ static int qc_do_build_pkt(unsigned char pos, const unsigned char *end, continue; } + switch (cf->type) { + case QUIC_FT_STREAM_8 ... QUIC_FT_STREAM_F: + if (!cf->stream.len && (qc->flags & QUIC_FL_CONN_TX_MUX_CONTEXT)) { + TRACE_USER("artificially drop packet with empty STREAM frame", QUIC_EV_CONN_TXPKT, qc); + pkt->flags \|= QUIC_FL_TX_PACKET_SKIP_SENDTO; + } + break; + default: + break; + } + quic_tx_packet_refinc(pkt); cf->pkt = pkt; }	2024-08-07 11:03:32 +02:00
Amaury Denoyelle	714009b7bc	MINOR: quic: implement function to check if STREAM is fully acked When a STREAM frame is retransmitted, a check is performed to remove range of data already acked from it. This is useful when STREAM frames are duplicated and splitted to cover different data ranges. The newly retransmitted frame contains only unacked data. This process is performed similarly in qc_dup_pkt_frms() and qc_build_frms(). Refactor the code into a new function named qc_stream_frm_is_acked(). It returns true if frame data are already fully acked and retransmission can be avoided. If only a partial range of data is acknowledged, frame content is updated to only cover the unacked data. This patch does not have any functional change. However, it simplifies retransmission for STREAM frames. Also, it will be reused to fix retransmission for empty STREAM frames with FIN set from the following patch : BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM As such, it must be backported prior to it.	2024-08-07 10:57:10 +02:00
Amaury Denoyelle	bb9ac256a1	MINOR: quic: convert qc_stream_desc release field to flags qc_stream_desc had a field <release> used as a boolean. Convert it with a new <flags> field and QC_SD_FL_RELEASE value as equivalent. The purpose of this patch is to be able to extend qc_stream_desc by adding newer flags values. This patch is required for the following patch BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM As such, it must be backported prior to it.	2024-08-06 18:00:17 +02:00
Amaury Denoyelle	7b89aa5b19	BUG/MINOR: h1: do not forward h2c upgrade header token haproxy supports tunnel establishment through HTTP Upgrade mechanism. Since the following commit, extended CONNECT is also supported for HTTP/2 both on frontend and backend side. commit 9bf957335e2c385b74901481f7a89c9565dfce53 MEDIUM: mux_h2: generate Extended CONNECT from htx upgrade As specified by HTTP/2 rfc, "h2c" can be used by an HTTP/1.1 client to request an upgrade to HTTP/2. In haproxy, this is not supported so it silently ignores this. However, Connection and Upgrade headers are forwarded as-is on the backend side. If using HTTP/1 on the backend side and the server supports this upgrade mechanism, haproxy won't be able to parse the HTTP response. If using HTTP/2, mux backend tries to incorrectly convert the request to an Extended CONNECT with h2c protocol, which may also prevent the response to be transmitted. To fix this, flag HTTP/1 request with "h2c" or "h2" token in an upgrade header. On converting the header list to HTX, the upgrade header is skipped if any of this token is present and the H1_MF_CONN_UPG flag is removed. This issue can easily be reproduced using curl --http2 argument to connect to an HTTP/1 frontend. This must be backported up to 2.4 after a period of observation.	2024-08-01 18:23:32 +02:00
Amaury Denoyelle	4b0bda42f7	MINOR: flags/mux-quic: decode qcc and qcs flags Decode QUIC MUX connection and stream elements via qcc_show_flags() and qcs_show_flags(). Flags definition have been moved outside of USE_QUIC to ease compilation of flags binary.	2024-07-31 17:59:35 +02:00
Frederic Lecaille	1733dff42a	MINOR: tcp_sample: Move TCP low level sample fetch function to control layer Add ->get_info() new control layer callback definition to protocol struct to retreive statiscal counters information at transport layer (TCPv4/TCPv6) identified by an integer into a long long int. Move the TCP specific code from get_tcp_info() to the tcp_get_info() control layer function (src/proto_tcp.c) and define it as the ->get_info() callback for TCPv4 and TCPv6. Note that get_tcp_info() is called for several TCP sample fetches. This patch is useful to support some of these sample fetches for QUIC and to keep the code simple and easy to maintain.	2024-07-31 10:29:42 +02:00
William Lallemand	f76e8e50f4	BUILD: ssl: replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC Replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC in the code source, so we won't need to set USE_OPENSSL_AWSLC in the Makefile on the long term.	2024-07-30 18:53:08 +02:00
William Lallemand	56eefd6827	BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC Then reactivate HAVE_SSL_0RTT and HAVE_SSL_0RTT_QUIC for AWS-LC, which were wrongly deactivated in f5353f2c ("MINOR: ssl: add HAVE_SSL_0RTT constant"). Must be backported to 3.0.	2024-07-30 18:53:08 +02:00
Willy Tarreau	1a8f3a368f	MINOR: queue: add a function to check for TOCTOU after queueing There's a rare TOCTOU case that happens from time to time with maxconn 1 and multiple threads. Between the moment we see the queue full and the moment we queue a request, it's possible that the last request on the server or proxy ended and that no other one is left to offer it its place. Given that all this code path is performance-critical and we cannot afford to increase the lock duration, better recheck for the condition after queueing. For this we need to be able to check for the condition and cleanly dequeue a request. That's what this patch provides via the new function pendconn_must_try_again(). It will catch more requests than absolutely needed though it will catch them all. It may find that around 1/1000 of requests are at risk, though testing shows that in practice, it's around 1 per million that really gets stuck (other ones benefit from timing and finishing late requests). Maybe in the future some conditions might be refined but it's harmless. What happens to such requests is that they're dequeued and their pendconn freed, so that the caller can decide to try to LB or queue them again. For now the function is not used, it's just added separately for easier tracking.	2024-07-29 09:27:01 +02:00
Frederic Lecaille	76ff8afa2d	MINOR: quic: Add information to "show quic" for CUBIC cc. Add ->state_cli() new callback to quic_cc_algo struct to define a function called by the "show quic (cc\|full)" commands to dump some information about the congestion algorithm internal state currently in use by the QUIC connections. Implement this callback for CUBIC algorithm to dump its internal variables: - K: (the time to reach the cubic curve inflexion point), - last_w_max: the last maximum window value reached before intering the last recovery period. This is also the window value at the inflexion point of the cubic curve, - wdiff: the difference between the current window value and last_w_max. So negative before the inflexion point, and positive after.	2024-07-26 16:42:44 +02:00

... 11 12 13 14 15 ...

8446 Commits