haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-12 01:56:58 +02:00

Author	SHA1	Message	Date
Willy Tarreau	b64ef3e3f8	DOC: internals: document the pools architecture and API The purpose here is to explain how memory pools work, what their architecture is depending on the build options (4 possible combinations), and how the various build options affect their behavior. Two pool-specific macros that were previously documented in initcalls were moved to pools.txt.	2022-01-11 14:51:41 +01:00
Christopher Faulet	b4eca0e908	BUG/MAJOR: mux-h1: Don't decrement .curr_len for unsent data A regression was introduced by commit `140f1a58` ("BUG/MEDIUM: mux-h1: Fix splicing by properly detecting end of message"). To detect end of the outgoing message, when the content-length is announced, we count amount of data already sent. But only data really sent must be counted. If the output buffer is full, we can fail to send data (fully or partially). In this case, we must take care to only count sent data. Otherwise we may think too much data were sent and an internal error may be erroneously reported. This patch should fix issues #1510 and #1511. It must be backported as far as 2.4.	2022-01-11 09:15:13 +01:00
Remi Tricot-Le Breton	a996763619	BUG/MINOR: ssl: Store client SNI in SSL context in case of ClientHello error If an error is raised during the ClientHello callback on the server side (ssl_sock_switchctx_cbk), the servername callback won't be called and the client's SNI will not be saved in the SSL context. But since we use the SSL_get_servername function to return this SNI in the ssl_fc_sni sample fetch, that means that in case of error, such as an SNI mismatch with a frontend having the strict-sni option enabled, the sample fetch would not work (making strict-sni related errors hard to debug). This patch fixes that by storing the SNI as an ex_data in the SSL context in case the ClientHello callback returns an error. This way the sample fetch can fall back to getting the SNI this way. It will still call the SSL_get_servername function first since it is the proper way of getting a client's SNI when the handshake succeeded. In order to avoid memory allocations at runtime in this highly used runtime function, a new memory pool was created to store those client SNIs. Its entry size is set to 256 bytes since SNIs can't be longer than 255 characters. This fixes GitHub #1484. It can be backported in 2.5.	2022-01-10 16:31:22 +01:00
William Lallemand	f82afbb9cd	BUG/MEDIUM: mworker: don't use _getsocks in wait mode Since version 2.5 the master is automatically re-executed in wait-mode when the config is successfully loaded, putting corner cases of the wait mode in plain sight. When using the -x argument and with the right timing, the master will try to get the FDs again in wait mode even though it's not needed anymore, which will harm the worker by removing its listeners. However, if it fails (and it's supposed to, sometimes), the master will exit with EXIT_FAILURE because it does not have the MODE_MWORKER flag, but only the MODE_MWORKER_WAIT flag, with the consequence of killing the workers. This patch fixes the issue by restricting the use of _getsocks to some modes. This patch must be backported to every supported version, even though the impact should be less harmful in versions prior to 2.5.	2022-01-07 18:44:27 +01:00
Frédéric Lécaille	99942d6f4c	MINOR: quic: Non-optimal use of a TX buffer When full, after having reset the writer index, let's reuse the TX buffer in any case.	2022-01-07 17:58:26 +01:00
Frédéric Lécaille	f010f0aaf2	MINOR: quic: Missing retransmission from qc_prep_fast_retrans() In fact we must look for the first packet with some ack-eliciting frame in the packet number space tree to retransmit from. Obviously there may already be retransmitted packets which are not deemed lost and are still present in the packet number space tree for TX packets.	2022-01-07 17:58:26 +01:00
Frédéric Lécaille	d4ecf94827	MINOR: quic: Only one CRYPTO frame by encryption level When receiving CRYPTO data from the TLS stack, concatenate the CRYPTO data to the first allocated CRYPTO frame if present. This reduces by one the number of handshake packets built for a connection with a standard size certificate.	2022-01-07 17:58:26 +01:00
Willy Tarreau	790169fe69	BUILD: makefile: add -Wno-atomic-alignment to work around clang abusive warning As reported in github issue #1502, clang, when building for i386, will try to use CMPXCHG8B-based loops for 64-bit atomic operations, and emits warnings for all 64-bit operands that are not 64-bit aligned, an alignment that is not required by the ABI, that the compiler itself does not enforce, and that the intel SDM clearly says is not required on this 32-bit platform for this operation. But this is likely an excessive outcome of the same code being used in 64-bit for CMPXCHG16B which does require proper alignment. Firefox already gave up on this one 3 years ago, let's not waste our time arguing and just shut up the warning instead. It might hide some real bugs in the future but till now experience showed that overall it's unlikely. This should be backported to all maintained branches that use 64-bit atomic ops (e.g. for counters). Thanks to Brad Smith for reporting it and confirming that shutting the warning addresses it.	2022-01-07 14:58:48 +01:00
Ilya Shipitsin	37d3e38130	CLEANUP: assorted typo fixes in the code and comments This is 30th iteration of typo fixes	2022-01-07 14:42:54 +01:00
Ilya Shipitsin	6569de2b88	CI: refactor spelling check let us switch to codespell github actions instead of invocation from cmdline. also, "ifset,thrid,strack,ba,chck,hel,unx,mor" added to whitelist, those are variable names and special terms widely used in HAProxy	2022-01-07 14:42:33 +01:00
David CARLIER	df91cbd584	MINOR: cpuset: switch to sched_setaffinity for FreeBSD 14 and above. Following up previous update on cpuset-t.h. Ultimately, at some point the cpuset_setaffinity code path could be removed.	2022-01-07 06:53:51 +01:00
William Dauchy	a9dd901143	MINOR: proxy: add option idle-close-on-response Avoid closing idle connections if a soft stop is in progress. By default, idle connections will be closed during a soft stop. In some environments, a client talking to the proxy may have prepared some idle connections in order to send requests later. If there is no proper retry on write errors, this can result in errors while haproxy is reloading. Even though a proper implementation should retry on connection/write errors, this option was introduced to support back compat with haproxy < v2.4. Indeed before v2.4, we were waiting for a last request to be able to add a "connection: close" header and advise the client to close the connection. In a real life example, this behavior was seen in AWS using the ALB in front of a haproxy. The end result was ALB sending 502 during haproxy reloads. This patch was tested on haproxy v2.4, with a regular reload on the process, and a constant trend of requests coming in. Before the patch, we see regular 502 returned to the client; when activating the option, the 502 disappear. This patch should help fixing github issue #1506. In order to unblock some v2.3 to v2.4 migration, this patch should be backported up to v2.4 branch. Signed-off-by: William Dauchy <wdauchy@gmail.com> [wt: minor edits to the doc to mention other options to care about] Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-01-06 09:09:51 +01:00
Frédéric Lécaille	6b6631593f	MINOR: quic: Re-arm the PTO timer upon datagram receipt When blocked by the anti-amplification limit, it is the responsibility of the client to unblock it by sending new datagrams. On the server side, even if not well parsed, such datagrams must trigger the PTO timer arming.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	078634d126	MINOR: quic: PTO timer too often reset It must be reset when the anti-amplification limit was reached but only if the peer address was not validated.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	41a076087b	MINOR: quic: Flag asap the connection having reached the anti-amplification limit The best location to flag the connection is just after having built the packet which reached the anti-amplification limit.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	de6f7c503e	MINOR: quic: Prepare Handshake packets asap after completed handshake Switch back to QUIC_HS_ST_SERVER_HANDSHAKE state after a completed handshake if acks must be sent. Also ensure we build post handshake frames only one time without using prev_st variable and ensure we discard the Handshake packet number space only one time.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	917a7dbdc7	MINOR: quic: Do not drop secret key but drop the CRYPTO data We need to be able to decrypt late Handshake packets after the TLS secret keys have been discarded. If not, the peer sends Handshake packets which have not been acknowledged. But for such packets, we discard the CRYPTO data.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	ee2b8b377f	MINOR: quic: Improve qc_prep_pkts() flexibility We want to be able to choose the encryption levels to be used by qc_prep_pkts() outside of it.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	f7ef97698a	MINOR: quic: Comment fix. When we drop a packet with unknown length, this is the entire datagram which must be skipped.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	a56054e438	MINOR: quic: Probe several packet number space upon timer expiration When the loss detection timer expires, we SHOULD include new data in our probing packets (RFC 9002 par 6.2.4. Sending Probe Packets).	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	db6a4727cf	MINOR: quic: Probe Initial packet number space more often Especially when the PTO expires for Handshake packet number space and when Initial packets are still flying (for QUIC servers).	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	3bb457c4ba	MINOR: quic: Speeding up Handshake Completion According to RFC 9002 par. 6.2.3. when receiving duplicate Initial CRYPTO data a server may send a packet containing unacknowledged CRYPTO data before the PTO expiry.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	63556772cc	MINOR: quic: qc_prep_pkts() code moving Move the switch default case code out of the switch to improve the readability.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	5732062cd2	MINOR: quic: Useless test in qc_prep_pkts() These tests were there to initiate PTO probing but they are not correct. Furthermore they may break the PTO probing process and lead to useless packet building.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	dd51da599e	MINOR: quic: Wrong packet number space trace in qc_prep_pkts() It was always the first packet number space information which was dumped.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	466e9da145	MINOR: quic: Remove nb_pto_dgrams quic_conn struct member From now on we rely on tx->pto_probe pktns struct member to inform the packet building function we want to probe.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	22576a2e55	MINOR: quic: Wrong ack_delay computation before calling quic_loss_srtt_update() RFC 9002 5.3. Estimating smoothed_rtt and rttvar: MUST use the lesser of the acknowledgment delay and the peer's max_ack_delay after the handshake is confirmed.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	dc90c07715	MINOR: quic: Wrong loss time computation in qc_packet_loss_lookup() This part has been modified by the RFC since our first implementation.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	09e0f8319d	MINOR: quic: Wrong packet number space computation for PTO This led quic_pto_pktns() to return the 01RTT packet number space when initiating probing even if the handshake was not completed!	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	1f6cf18183	MINOR: quic: Wrong first packet number space computation I really do not know where these inversions come from.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	22cfd83890	MINOR: quic: Add trace about in flight bytes by packet number space This parameter is useful to diagnose packet loss detection issues.	2022-01-04 17:30:00 +01:00
Frédéric Lécaille	fde2a98dd1	MINOR: quic: Wrong traces after rework TRACE_*() macros must take a quic_conn struct as first argument.	2022-01-04 17:30:00 +01:00
Christopher Faulet	7bf46bb9a9	BUG/MEDIUM: http-ana: Preserve response's FLT_END analyser on L7 retry When a filter is attached on a stream, the FLT_END analyser must not be removed from the response channel on L7 retry. It is especially important because CF_FLT_ANALYZE flag is still set. This means the synchronization between the two sides when the filter ends can be blocked. Depending on the timing, this can freeze the stream infinitely or lead to a spinning loop. Note that the synchronization between the two sides at the end of the analysis was introduced because the stream was reused in HTTP between two transactions. But, since the HTX was introduced, a new stream is created for each transaction. So it is probably possible to remove this step for 2.2 and higher. This patch must be backported as far as 2.0.	2022-01-04 10:56:04 +01:00
William Lallemand	148d7a0301	BUG/MINOR: cli: fix _getsocks with musl libc In ticket #1413, the transfer of FDs couldn't correctly work on alpine linux. After a few tests with musl on another distribution it seems to be a limitation of this libc. The number of FDs that could be sent per sendmsg was set to 253, which does not seem to work with musl; decreasing it to 252 seems to work better, so let's set this value everywhere since it does not have that much impact. This must be backported in every maintained version.	2022-01-03 19:50:34 +01:00
David Carlier	ae5c42f4d0	BUILD/MINOR: tools: solaris build fix on dladdr. dladdr takes a mutable address on this platform.	2022-01-03 14:43:51 +01:00
Ilya Shipitsin	874c907a2e	CI: github actions: update OpenSSL to 3.0.1 OpenSSL-3.0.1 was released on 14 Dec 2021, let's switch to it	2022-01-03 14:42:12 +01:00
Ilya Shipitsin	5e87bcf870	CLEANUP: assorted typo fixes in the code and comments This is 29th iteration of typo fixes	2022-01-03 14:40:58 +01:00
Willy Tarreau	f5e94b2f47	OPTIM: pools: reduce local pool cache size to 512kB Now that we support batched allocations/releases, it appears that we can reach the same performance on H2 with shared pools and 256kB thread-local cache as without shared pools, a fast allocator and 1MB thread-local cache. With 512kB we're up to 10% faster on highly multiplexed H2 than without the shared cache. This was tested on a 16-core ARM machine. Thus it's time to slightly reduce the per-thread memory cost, which may also improve the performance on machines with smaller L2 caches. It essentially reverts commit `f587003fe` ("MINOR: pools: double the local pool cache size to 1 MB").	2022-01-02 19:52:15 +01:00
Willy Tarreau	1513c5479a	MEDIUM: pools: release cached objects in batches With this patch pool_evict_last_items builds clusters of up to CONFIG_HAP_POOL_CLUSTER_SIZE entries so that accesses to the shared pools are reduced by CONFIG_HAP_POOL_CLUSTER_SIZE and the inter-thread contention is reduced by as much.	2022-01-02 19:35:26 +01:00
Willy Tarreau	43937e920f	MEDIUM: pools: start to batch eviction from local caches Since previous patch we can forcefully evict multiple objects from the local cache, even when evicting based on the LRU entries. Let's define a compile-time configurable setting to batch releasing of objects. For now we set this value to 8 items per round. This is marked medium because eviction from the LRU will slightly change in order to group the last items that are freed within a single cache instead of accurately scanning only the oldest ones exactly in their order of appearance. But this is required in order to evolve towards batched removals.	2022-01-02 19:35:26 +01:00
Willy Tarreau	a0b5831eed	MEDIUM: pools: centralize cache eviction in a common function We currently have two functions to evict cold objects from local caches: pool_evict_from_local_cache() to evict from a single cache, and pool_evict_from_local_caches() to evict oldest objects from all caches. The new function pool_evict_last_items() focuses on scanning oldest objects from a pool and releasing a predefined number of them, either to the shared pool or to the system. For now they're evicted one at a time, but the next step will consist in creating clusters.	2022-01-02 19:35:26 +01:00
Willy Tarreau	337410c5a4	MINOR: pools: pass the objects count to pool_put_to_shared_cache() This is in order to let the caller build the cluster of items to be released. For now single items are released hence the count is always 1.	2022-01-02 19:35:26 +01:00
Willy Tarreau	148160b027	MINOR: pools: prepare pool_item to support chained clusters In order to support batched allocations and releases, we'll need to prepare chains of items linked together and that can be atomically attached and detached at once. For this we implement a "down" pointer in each pool_item that points to the other items belonging to the same group. For now it's always NULL though freeing functions already check them when trying to release everything.	2022-01-02 19:35:26 +01:00
Willy Tarreau	361e31e3fe	MEDIUM: pool: compute the number of evictable entries once per pool In pool_evict_from_local_cache() we used to check for room left in the pool for each and every object. Now we compute the value before entering the loop and keep into a local list what has to be released, and call the OS-specific functions for the other ones. It should already save some cycles since it's not needed anymore to recheck for the pool's filling status. But the main expected benefit comes from the ability to pre-construct a list of all releasable objects, that will later help with grouping them.	2022-01-02 19:35:26 +01:00
Willy Tarreau	91a8e28f90	MINOR: pool: add a function to estimate how many may be released at once At the moment we count the number of releasable objects to a shared pool one by one. The way the formula is made allows to pre-compute the number of available slots, so let's add a function for that so that callers can do it once before iterating. This takes into account the average number of entries needed and the minimum availability per pool. The function is not used yet.	2022-01-02 19:35:26 +01:00
Willy Tarreau	c16ed3b090	MINOR: pool: introduce pool_item to represent shared pool items In order to support batch allocation from/to shared pools, we'll have to support a specific representation for pool objects. The new pool_item structure will be used for this. For now it only contains a "next" pointer that matches exactly the current storage model. The few functions that deal with the shared pool entries were adapted to use the new type. There is no functionality difference at this point.	2022-01-02 19:35:26 +01:00
Willy Tarreau	b46674a283	MINOR: pool: check for pool's fullness outside of pool_put_to_shared_cache() Instead of letting pool_put_to_shared_cache() pass the object to the underlying OS layer when there's no more room, let's have the caller check if the pool is full and either call pool_put_to_shared_cache() or call pool_free_nocache(). Doing this sensibly simplifies the code as this function now only has to deal with a pool and an item and only for cases where there are local caches and shared caches. As the code was simplified and the calls more isolated, the function was moved to pool.c. Note that it's only called from pool_evict_from_local_cache{,s}() and that a part of its logic might very well move there when dealing with batches.	2022-01-02 19:35:26 +01:00
Willy Tarreau	a06f78b376	MINOR: pool: make pool_is_crowded() always true when no shared pools are used This function is used to know whether the shared pools are full or if we can store more objects in them. Right now it cannot be used in a generic way because when shared pools are not used it will return false, letting one think pools can accept objects. Let's make one variant for each build model.	2022-01-02 19:35:26 +01:00
Willy Tarreau	57c5c6db0c	MINOR: pool: rely on pool_free_nocache() in pool_put_to_shared_cache() At the moment pool_put_to_shared_cache() checks if the pool is crowded, and if so it does the exact same job as pool_free_nocache(), otherwise it adds the object there. This patch rearranges the code so that the function is split in two and either uses one path or the other, and always relies on pool_free_nocache() in case we don't want to store the object. This way there will be a common path with the variant not using the shared cache. The patch is better viewed using git show -b since a whole block got reindented. It's worth noting that there is a tiny difference now in the local cache usage measurement, as the decrement of "used" used to be performed before checking for pool_is_crowded() instead of being done after. This used to result in always one less object being kept in the cache than what was configured in minavail. The rearrangement of the code aligns it with other call places.	2022-01-02 19:35:26 +01:00
Willy Tarreau	594775d17c	CLEANUP: pools: group list updates in pool_get_from_cache() Some changes affect the list element and others affect the pool stats. Better group them together, as the compiler may not detect certain possible optimizations after the casts made by the list macros.	2022-01-02 19:34:19 +01:00

17520 Commits