haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-26 00:51:24 +02:00

Author	SHA1	Message	Date
Amaury Denoyelle	c603de4d84	MINOR: mux-quic: count in-progress requests Add a new qcc member named <nb_hreq>. Its purpose is close to <nb_sc> which represents the number of attached stream connectors. Both are incremented inside qc_attach_sc(). The difference is on the decrement operation. While <nb_sc> is decremented on sedesc detach callback, <nb_hreq> is decremented when the qcs is locally closed. In most cases, <nb_hreq> will be decremented before <nb_sc>. However, it will be the reverse if a stream must be kept alive after detach callback. The main purpose of this field is to implement http-keep-alive timeout. Both <nb_sc> and <nb_hreq> must be null to activate the http-keep-alive timeout.	2022-08-01 14:58:41 +02:00
Amaury Denoyelle	07bf8f4d86	MINOR: mux-quic: save proxy instance into qcc Store a reference to proxy in the qcc structure. This will be useful to access to proxy members outside of qcc_init(). Most notably, this change is required to implement timeout refreshing by using the various timeouts configured at the proxy level.	2022-08-01 14:23:21 +02:00
Amaury Denoyelle	4ea5090f55	CLEANUP: mux-quic: remove useless app_ops is_active callback Timeout in QUIC MUX has evolved from the simple first implementation. At the beginning, a connection was considered dead unless bidirectional streams were opened. This was abstracted through an app callback is_active(). Now this paradigm has been reversed and a connection is considered alive by default, unless an error has been reported or a timeout has already been fired. The callback is_active() is thus not used anymore and can be safely removed to simplify qcc_is_dead(). This commit should be backported to 2.6.	2022-08-01 14:13:51 +02:00
Willy Tarreau	81f3b80e32	MINOR: ebtree: add ebmb_lookup_shorter() to pursue lookups This function is designed to enlarge the scope of a lookup performed by a caller via ebmb_lookup_longest() that was not satisfied with the result. It will first visit next duplicates, and if none are found, it will go up in the tree to visit similar keys with shorter prefixes and will return them if they match. We only use the starting point's value to perform the comparison since it was expected to be valid for the looked up key, hence it has all bits in common with its own length. The algorithm is a bit complex because when going up we may visit nodes that are located beneath the level we just come from. However it is guaranteed that keys having a shorter prefix will be present above the current location, though they may be attached to the left branch of a cover node, so we just visit all nodes as long as their prefix is too large, possibly go down along the left branch on cover nodes, and stop when either there's a match, or there's a non-matching prefix anymore. The following tricky case now works fine and properly finds 10.0.0.0/7 when looking up 11.0.0.1 from tree version 1 though both belong to different sub-trees: prepare map #1 add map @1 #1 10.0.0.0/7 10.0.0.0/7 add map @1 #1 10.0.0.0/7 10.0.0.0/7 commit map @1 #1 prepare map #1 add map @2 #1 11.0.0.0/8 11.0.0.0/8 add map @2 #1 11.0.0.0/8 11.0.0.0/8 prepare map #1 add map @1 #1 10.0.0.0/7 10.0.0.0/7 commit map @1 #1 prepare map #1 add map @2 #1 10.0.0.0/7 10.0.0.0/7 add map @2 #1 11.0.0.0/8 11.0.0.0/8 add map @2 #1 11.0.0.0/8 11.0.0.0/8	2022-08-01 11:59:46 +02:00
Willy Tarreau	0dc9e6dca2	DEBUG: tools: provide a tree dump function for ebmbtrees as well It's convenient for debugging IP trees. However we're not dumping the full keys, for the sake of simplicity, only the 4 first bytes are dumped as a u32 hex value. In practice this is sufficient for debugging. As a reminder since it seems difficult to recover the command each time it's needed, the output is converted to an image using dot from Graphviz: dot -o a.png -Tpng dump.txt	2022-08-01 11:59:15 +02:00
Willy Tarreau	688709d814	MAJOR: threads/plock: update the embedded library The plock code hasn't been updated since 2017 and didn't benefit from the exponential back-off improvements that were added in 2018. Simply updating the file shows a massive performance gain on large thread count (>=48) with dequeuing going from 113k RPS to 300k RPS and round robin from 229k RPS to 1020k RPS. It was about time to update. In addition, some recent improvements to the code will be useful with thread groups. An interesting improvement concerns EPYC CPUs. This one alone increased fairness and was sufficient to avoid crashes in process_srv_queue() there, when hammering two servers with maxconn 200 under 1k connections.	2022-07-30 10:15:44 +02:00
Frédéric Lécaille	43910a9450	MINOR: quic: New "quic-cc-algo" bind keyword As it could be interesting to be able to choose the QUIC congestion control algorithm to be used by a listener, add the new "quic-cc-algo" keyword to do so. Update the documentation accordingly. Must be backported to 2.6.	2022-07-29 17:32:05 +02:00
Frédéric Lécaille	1c9c2f6c02	MEDIUM: quic: Cubic congestion control algorithm implementation Cubic is the congestion control algorithm used by default by the Linux kernel since 2.6.15 version. This algorithm is supposed to achieve good scalability and fairness between flows using the same network path, it should also be used by QUIC by default. This patch implements this algorithm and selects it as the default algorithm for congestion control. Must be backported to 2.6.	2022-07-29 17:32:05 +02:00
Frédéric Lécaille	c591459d11	MINOR: quic: Congestion control architecture refactoring Ease the integration of new congestion control algorithm to come. Move the congestion controller state to a private array of uint32_t to stop using a union. We do not want to continue using such long paths cc->algo_state.<algo>.<var> to modify the internal state variable for each algorithm. Must be backported to 2.6	2022-07-29 17:32:05 +02:00
William Lallemand	dc66f2f97d	DEBUG: fd: split the fd check Split the BUG_ON(fd < 0 || fd >= global.maxsock) so it's easier to know if it quits because of a -1.	2022-07-26 10:35:24 +02:00
Willy Tarreau	51b1fcedeb	DEBUG: fd: detect possibly invalid tgid in fd_insert() Since the API is still a bit young, let's make sure nobody tries to assign an FD to a group not strictly 1..MAX_TGROUPS as that would indicate a bug. Note: some of these might be relaxed to BUG_ON_HOT() in the future	2022-07-25 15:47:45 +02:00
Christopher Faulet	7e94b40a22	BUG/MINOR: fd: Properly init the fd state in fd_insert() When a new fd is inserted in the fdtab array, its state is initialized. The "newstate" variable is used to compute the right state (0 by default, but FD_ET_POSSIBLE flag is set if edge-triggered is supported for the fd). However, this variable is never used and the fd state is always set to 0. Now, the fd state is initialized with "newstate" variable. This bug was introduced by commit ddedc1662 ("MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid"). No backport needed.	2022-07-19 12:11:04 +02:00
Willy Tarreau	03f3049df1	BUG/MINOR: tools: fix statistical_prng_range()'s output range This function was added by commit 84ebfabf7 ("MINOR: tools: add statistical_prng_range() to get a random number over a range") but it contains a bug on the range, since mul32hi() covers the whole input range, we must pass it range-1. For now it didn't have any impact, but if used to find an array's index it will cause trouble. This should be backported to 2.4.	2022-07-18 19:09:55 +02:00
Willy Tarreau	5b3cd9561b	BUG/MEDIUM: tools: avoid calling dlsym() in static builds (try 2) The first approach in commit 288dc1d8e ("BUG/MEDIUM: tools: avoid calling dlsym() in static builds") relied on dlopen() but on certain configs (at least gcc-4.8+ld-2.27+glibc-2.17) it used to catch situations where it ought not fail. Let's have a second try on this using dladdr() instead. The variable was renamed "build_is_static" as it's exactly what's being detected there. We could even take it for reporting in -vv though that doesn't seem very useful. At least the variable was made global to ease inspection via the debugger, or in case it's useful later. Now it properly detects a static build even with gcc-4.4+glibc-2.11.1 and doesn't crash anymore.	2022-07-18 14:03:54 +02:00
Willy Tarreau	856d56d2d2	MINOR: config: change default MAX_TGROUPS to 16 This will allow nbtgroups > 1 to be declared in the config without recompiling. The theoretical limit is 64, though we'd rather not push it too far for now as some structures might be enlarged to be indexed per group. Let's start with 16 groups max, allowing to experiment with dual-socket machines suffering from up to 8 loosely coupled L3 caches. It's a good start and doesn't engage us too far.	2022-07-15 21:51:48 +02:00
Willy Tarreau	c6b596dcce	CLEANUP: threads: remove the now unused all_threads_mask and tid_bit Since these are not used anymore, let's now remove them. Given the number of places where we're using ti->ltid_bit, maybe an equivalent might be useful though.	2022-07-15 20:25:41 +02:00
Willy Tarreau	88c4c14050	MINOR: fd: add fd_reregister_all() to deal with boot-time FDs At boot the pollers are allocated for each thread and they need to reprogram updates for all FDs they will manage. This code is not trivial, especially when trying to respect thread groups, so we'd rather avoid duplicating it. Let's centralize this into fd.c with this function. It avoids closed FDs, those whose thread mask doesn't match the requested one or whose thread group doesn't match the requested one, and performs the update if required under thread-group protection.	2022-07-15 20:16:30 +02:00
Willy Tarreau	ddedc16624	MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid These functions need to set/reset the FD's tgid but when they're called there may still be wakeups on other threads that discover late updates and have to touch the tgid at the same time. As such, it is not possible to just read/write the tgid there. It must only be done using operations that are compatible with what other threads may be doing. As we're using inc/dec on the refcount, it's safe to AND the area to zero the lower part when resetting the value. However, in order to set the value, there's no other choice but fd_claim_tgid() which will assign it only if possible (via a CAS). This is convenient in the end because it protects the FD's masks from being modified by late threads, so while we hold this refcount we can safely reset the thread_mask and a few other elements. A debug test for non-null masks was added to fd_insert() as it must not be possible to face this situation thanks to the protection offered by the tgid.	2022-07-15 20:16:30 +02:00
Willy Tarreau	3638d174e5	MEDIUM: fd: make thread_mask now represent group-local IDs With the change that was started on other masks, the thread mask was still not fully converted, sometimes being used as a global mask and sometimes as a local one. This finishes the code modifications so that the mask is always considered as a group-local mask. This doesn't change anything as long as there's a single group, but is necessary for groups 2 and above since it's used against running_mask and so on.	2022-07-15 20:16:30 +02:00
Willy Tarreau	d6e1987612	MINOR: fd: make fd_clr_running() return the previous value instead It's an AND so it destroys information and due to this there's a call place where we have to perform two reads to know the previous value then to change it. With a fetch-and-and instead, in a single operation we can know if the bit was previously present, which is more efficient.	2022-07-15 20:16:30 +02:00
Willy Tarreau	a707d02657	MEDIUM: fd/poller: turn running_mask to group-local IDs From now on, the FD's running_mask only refers to local thread IDs. However, there remains a limitation, in updt_fd_polling(), we temporarily have to check and set shared FDs against .thread_mask, which still contains global ones. As such, nbtgroups > 1 may break (but this is not yet supported without special build options).	2022-07-15 20:16:30 +02:00
Willy Tarreau	6d3c501c08	MEDIUM: fd/poller: turn update_mask to group-local IDs From now on, the FD's update_mask only refers to local thread IDs. However, there remains a limitation, in updt_fd_polling(), we temporarily have to check and set shared FDs against .thread_mask, which still contains global ones. As such, nbtgroups > 1 may break (but this is not yet supported without special build options).	2022-07-15 20:16:30 +02:00
Willy Tarreau	ceffd17f52	MINOR: fd: add fd_get_running() to atomically return the running mask The running mask is only valid if the tgid is the expected one. This function takes a reference on the tgid before reading the running mask, so that both are checked at once. It returns either the mask or zero if the tgid differs, thus providing a simple way for a caller to check if it still holds the FD.	2022-07-15 20:16:30 +02:00
Willy Tarreau	080373ea38	MINOR: fd: add functions to manipulate the FD's tgid The FD's tgid is refcounted and must be atomically manipulated. Function fd_grab_tgid() will increase the refcount but only if the tgid matches the one in argument (likely the current one). fd_claim_tgid() will be used to self-assign the tgid after waiting for its refcount to reach zero. fd_drop_tgid() will be used to drop a temporarily held tgid. All of these are needed to prevent an FD from being reassigned to another group, either when inspecting/modifying the running_mask, or when checking for updates, in order to be certain that the mask being seen corresponds to the desired group. Note that once at least one bit is set in the running mask of an active FD, it cannot be closed, thus not migrated, thus the reference does not need to be held long.	2022-07-15 20:16:09 +02:00
Willy Tarreau	9464bb1f05	MEDIUM: fd: add the tgid to the fd and pass it to fd_insert() The file descriptors will need to know the thread group ID in addition to the mask. This extends fd_insert() to take the tgid, and will store it into the FD. In the FD, the tgid is stored as a combination of tgid on the lower 16 bits and a refcount on the higher 16 bits. This allows to know when it's really possible to trust the tgid and the running mask. If a refcount is higher than 1 it indeed indicates another thread else might be in the process of updating these values. Since a closed FD must necessarily have a zero refcount, a test was added to fd_insert() to make sure that it is the case.	2022-07-15 19:58:06 +02:00
Willy Tarreau	512dd2dc1c	MINOR: fd: make fd_insert() apply the thread mask itself It's a bit ugly to see that half of the callers of fd_insert() have to apply all_threads_mask themselves to the bit field they're passing, because usually it comes from a listener that may have other bits set. Let's make the function apply the mask itself.	2022-07-15 19:58:06 +02:00
Willy Tarreau	35ee710ece	MEDIUM: fd/poller: make the update-list per-group The update-list needs to be per-group because its inspection is based on a mask and we need to be certain when scanning it if a mask is for the same thread or another one. Once per-group there's no doubt about it, even if the FD's polling changes, the entry remains valid. It will be needed to check the tgid though. Note that a soft-stop or pause/resume might not necessarily work here with tgroups>1, because the operation might be delivered to a thread that doesn't belong to the group and whose update mask will not reflect one that is interesting here. We can't do better at this stage.	2022-07-15 19:57:28 +02:00
Willy Tarreau	91a7c164b4	MINOR: task: move the niced_tasks counter to the thread group context This one is only used as a hint to improve scheduling latency, so there is no more point in keeping it global since each thread group handles its own run q	2022-07-15 19:43:10 +02:00
Willy Tarreau	b0e7712fb2	MEDIUM: task/thread: move the task shared wait queues per thread group Their migration was postponed for convenience only but now's time for having the shared wait queues per thread group and not just per process, otherwise the WQ lock uses a huge amount of CPU alone.	2022-07-15 19:43:10 +02:00
Willy Tarreau	82e378aa8a	MINOR: fd/thread: get rid of thread_mask() Since commit d2494e048 ("BUG/MEDIUM: peers/config: properly set the thread mask") there must not remain any single case of a receiver that is bound nowhere, so there's no need anymore for thread_mask(). We're adding a test in fd_insert() to make sure this doesn't happen by accident though, but the function was removed and its rare uses were replaced with the original value of the bind_thread mask.	2022-07-15 19:43:10 +02:00
Willy Tarreau	5b09341c02	MEDIUM: cpu-map: replace the process number with the thread group number The principle remains the same, but instead of having a single process and ignoring extra ones, now we set the affinity masks for the respective threads of all groups. The doc was updated with a few extra examples.	2022-07-15 19:43:10 +02:00
Willy Tarreau	7aa41196cf	MEDIUM: debug/threads: make the lock debugging take tgroups into account Since we have to use masks to verify owners/waiters, we have no other option but to have them per group. This definitely inflates the size of the locks, but this is only used for extreme debugging anyway so that's not dramatic. Thus as of now, all masks in the lock stats are local bit masks, derived from ti->ltid_bit. Since at boot ltid_bit might not be set, we just take care of this situation (since some structs are initialized under lock during boot), and use bit 0 from group 0 only.	2022-07-15 19:41:26 +02:00
Willy Tarreau	4d9888ca69	CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros They were initially made to deal with both the cache and the update list but there's no cache anymore and keeping them for the update list adds a lot of obfuscation that is really not desired. Let's get rid of them now. Their purpose was simply to get a pointer to fdtab[fd].update.{,next,prev} in order to perform atomic tests and modifications. The offset passed in argument to the functions (fd_add_to_fd_list() and fd_rm_from_fd_list()) was the offset of the ->update field in fdtab, and as it's not used anymore it was removed. This also removes a number of casts, though those used by the atomic ops have to remain since only scalars are supported.	2022-07-15 19:41:26 +02:00
Willy Tarreau	91f7a1af34	CLEANUP: applet: remove the obsolete command context from the appctx The "ctx" and "st2" parts in the appctx were marked for removal in 2.7 and were emulated using memcpy/memset etc for possible external code. Let's remove this now.	2022-07-15 19:41:26 +02:00
Amaury Denoyelle	d666d740d2	MINOR: mux-quic: support app graceful shutdown Adjust qcc_emit_cc_app() to allow the delay of emission of a CONNECTION_CLOSE. This will only set the error code but the quic-conn layer is not flagged for immediate close. The quic-conn will be responsible to shut the connection when deemed suitable. This change will allow to implement application graceful shutdown, such as HTTP/3 with GOAWAY emission. This will allow to emit closing frames on MUX release. Once all work is done at the lower layer, the quic-conn should emit a CONNECTION_CLOSE with the registered error code.	2022-07-15 15:06:59 +02:00
Amaury Denoyelle	57e6db7021	MINOR: quic: define a generic QUIC error type Define a new structure quic_err to abstract a QUIC error type. This allows to easily differentiate a transport and an application error code. This simplifies error transmission from QUIC MUX and H3 layers. This new type is defined in quic_frame module. It is used to replace <err_code> field in <quic_conn>. QUIC_FL_CONN_APP_ALERT flag is removed as it is now useless. Utility functions are defined to be able to quickly instantiate transport, tls and application errors.	2022-07-15 14:57:49 +02:00
Amaury Denoyelle	41cd879383	CLEANUP: quic: clean up include on quic_frame-t.h quic_frame-t.h and xprt_quic-t.h include themselves mutually. This may cause some troubles later. In fact, xprt_quic does not need to include quic_frame so remove this. And as quic_frame is a generic source file which is included in multiple places, it is useful to also remove the xprt_quic include in it. Use forward declaration for this.	2022-07-15 14:54:24 +02:00
Amaury Denoyelle	a5b5075211	MEDIUM: mux-quic: implement STOP_SENDING handling Implement support for STOP_SENDING frame parsing. The stream is reset as specified by RFC 9000. This will automatically interrupt all future send operations in qc_send(). A RESET_STREAM will be sent with the code extracted from the original STOP_SENDING frame.	2022-07-11 16:45:04 +02:00
Amaury Denoyelle	843a1196b3	MEDIUM: mux-quic: implement RESET_STREAM emission Implement functions to be able to reset a stream via RESET_STREAM. If needed, a qcs instance is flagged with QC_SF_TO_RESET to schedule a stream reset. This will interrupt all future send operations. On stream emission, if a stream is flagged with QC_SF_TO_RESET, a RESET_STREAM frame is generated and emitted to the transport layer. If this operation succeeds, the stream is locally closed. If upper layer is instantiated, error flag is set on it.	2022-07-11 16:45:04 +02:00
Amaury Denoyelle	38e6006da1	MINOR: mux-quic: define basic stream states Implement a basic state machine to represent stream lifecycle. By default a stream is idle. It is marked as open when sending or receiving the first data on a stream. Bidirectional streams have two states to represent the closing on both receive and send channels. This distinction does not exist for unidirectional streams, which pass automatically from open to closed state. This patch is mostly internal and has a limited visible impact. Some behaviors are slightly updated: * closed streams are garbage collected at the start of io handler * send operation is interrupted if a stream is closed locally Outside of this, there is no functional change. However, some additional BUG_ON guards are implemented to ensure that we do not conduct invalid operations on a stream. This should strengthen the code safety. Also, stream states are displayed on trace which should help debugging.	2022-07-11 16:37:21 +02:00
Amaury Denoyelle	b143723411	REORG: mux-quic: rename stream initialization function Rename both qcc_open_stream_local/remote() functions to qcc_init_stream_local/remote(). This change is purely cosmetic. It will reduce the ambiguity with the soon to be implemented OPEN states for QCS instances.	2022-07-11 16:24:03 +02:00
Willy Tarreau	481edaceb8	BUILD: debug: silence warning on gcc-5 In 2.6, 8a0fd3a36 ("BUILD: debug: work around gcc-12 excessive -Warray-bounds warnings") disabled some warnings that were reported around the BUG() statement. But the -Wnull-dereference warning isn't known from gcc-5, it only arrived in gcc-6, hence makes gcc-5 complain loudly that it doesn't know this directive. Let's just condition this one to gcc-6.	2022-07-10 14:13:48 +02:00
Christopher Faulet	ca7218aaf0	MINOR: http: Add function to detect default port http_is_default_port() can be used to test if a port is a default HTTP/HTTPS port. A scheme may be specified. In this case, it is used to detect defaults ports, 80 for "http://" and 443 for "https://". Otherwise, with no scheme, both are considered as default ports.	2022-07-06 17:54:03 +02:00
Christopher Faulet	658f971621	MINOR: http: Add function to get port part of a host http_get_host_port() function can be used to get the port part of a host. It will be used to get the port of an uri authority or a host header value. This function only looks for a port starting from the end of the host. It is the caller's responsibility to call it with a valid host value. An indirect string is returned.	2022-07-06 17:54:03 +02:00
Amaury Denoyelle	3f39b40fe0	MINOR: mux-quic: rename qcs flag FIN_RECV to SIZE_KNOWN Rename QC_SF_FIN_RECV to the more generic name QC_SF_SIZE_KNOWN. This better align with the QUIC RFC 9000 which uses the "Size Known" state definition. This change is purely cosmetic.	2022-07-05 16:18:27 +02:00
Amaury Denoyelle	a509ffb505	MEDIUM: mux-quic: refactor streams opening Review the whole API used to access/instantiate qcs. A public function qcc_open_stream_local() is available to the application protocol layer. It allows to easily open a local stream. The ID is automatically attributed to the next one available. For remote streams, qcc_open_stream_remote() has been implemented. It will automatically take care of allocating streams in a linear way according to the ID. This function is called via qcc_get_qcs() which can be used for each qcc_recv*() operation. For the moment, it is only used for STREAM frames via qcc_recv(), but soon it will be implemented for other frame types which can also be used to open a new stream. qcs_new() and qcs_free() have been restricted to the MUX QUIC only as they are now reserved for internal usage. This change is a pure refactoring and should not have any noticeable impact. It clarifies the developer intent and helps to ensure that a stream is not automatically opened when not desired.	2022-07-05 16:18:27 +02:00
Amaury Denoyelle	321fa7733c	REORG: mux-quic: reorganize flow-control fields <qcc.cl_bidi_r> is used to implement STREAM ID flow control enforcement. Move it with all fields related to this operation and separate it from the MAX STREAM DATA calculation.	2022-07-05 11:20:02 +02:00
Amaury Denoyelle	a441ec9c7a	CLEANUP: mux-quic: do not export qc_get_ncbuf qc_get_ncbuf() is only used internally : thus its prototype in QUIC MUX include is not required.	2022-07-05 11:06:52 +02:00
Emeric Brun	36d9097cf3	MINOR: fd: Add BUG_ON checks on fd_insert() This patch adds two BUG_ON on fd_insert() into the fdtab checking if the fd has been correctly re-initialized into the fdtab before a new insert. It will raise a BUG if we try to insert the same fd multiple times without an intermediate fd_delete(). First one checks that the owner for this fd in fdtab was reset to NULL. Second one checks that the state flags for this fd in fdtab was reset to 0. This patch could be backported on version >= 2.4	2022-07-05 05:18:51 +02:00
Willy Tarreau	1229ef312d	MINOR: wdt: do not rely on threads_to_dump anymore This flag is not needed anymore as we're already marking the waiting threads as harmless, thus the thread's bit is already covered by this information. The variable was unexported.	2022-07-01 19:26:35 +02:00
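
The range fix described in commit 03f3049df1 can be illustrated with a small sketch. It assumes that a "multiply and keep the high 32 bits" helper maps the full 32-bit input onto a closed interval, which is why the caller has to pass range-1. The names scale32_inclusive() and pick_in_range() are hypothetical and are not the functions from the HAProxy tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "multiply, keep the high 32 bits" helper.
 * It maps a in [0, 2^32-1] onto [0, b], both ends included, so the
 * maximum input yields exactly b.
 */
static inline uint32_t scale32_inclusive(uint32_t a, uint32_t b)
{
	return (uint32_t)(((uint64_t)a * b + a) >> 32);
}

/* Returns a value in [0, range-1] for a 32-bit random input <rnd>.
 * Passing <range> itself to the helper would allow the result to reach
 * <range>, which is one past the end when used as an array index.
 */
static inline uint32_t pick_in_range(uint32_t rnd, uint32_t range)
{
	assert(range > 0);
	return scale32_inclusive(rnd, range - 1);
}
```

With a range of 10, the largest random input gives 9 through pick_in_range(), whereas calling the helper directly with 10 would give 10.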

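The refcounted tgid from commits 9464bb1f05, 080373ea38 and d6e1987612 can be sketched with C11 atomics. This is only a model under stated assumptions: the tgid sits in the lower 16 bits and the refcount in the higher 16 bits, as the commit message says, but the demo_* names, the single-shot claim (the real function is described as waiting for the refcount to reach zero) and the exact layout constants are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_TGID_MASK 0xffffu   /* low 16 bits: thread group ID */
#define DEMO_REFC_ONE  0x10000u  /* one reference, in the high 16 bits */

/* Take a reference, but keep it only if the stored tgid is <expected>.
 * Returns 1 when the reference is held, 0 otherwise.
 */
static int demo_grab_tgid(_Atomic uint32_t *refc_tgid, uint32_t expected)
{
	uint32_t old = atomic_fetch_add(refc_tgid, DEMO_REFC_ONE);

	if ((old & DEMO_TGID_MASK) == expected)
		return 1;
	atomic_fetch_sub(refc_tgid, DEMO_REFC_ONE);
	return 0;
}

/* Release a reference previously taken. */
static void demo_drop_tgid(_Atomic uint32_t *refc_tgid)
{
	atomic_fetch_sub(refc_tgid, DEMO_REFC_ONE);
}

/* Try once to self-assign <tgid> with a CAS. It succeeds only while the
 * refcount is zero, and leaves one reference held by the caller.
 */
static int demo_try_claim_tgid(_Atomic uint32_t *refc_tgid, uint32_t tgid)
{
	uint32_t old = atomic_load(refc_tgid) & DEMO_TGID_MASK; /* refcount 0 */

	assert(tgid != 0 && tgid <= DEMO_TGID_MASK);
	return atomic_compare_exchange_strong(refc_tgid, &old,
	                                      DEMO_REFC_ONE | tgid);
}

/* Fetch-and-and: clear <bit> and return the previous mask, so a single
 * operation tells whether the bit was set before.
 */
static unsigned long demo_clr_running(_Atomic unsigned long *running,
                                      unsigned long bit)
{
	return atomic_fetch_and(running, ~bit);
}
```

A caller that gets 0 from demo_grab_tgid() knows the FD now belongs to another group and must not trust the masks it read.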

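The port helpers described in commits 658f971621 and ca7218aaf0 can be approximated on plain C strings. This is a rough sketch only: the real functions are said to return an indirect string, whereas the hypothetical demo_host_port() and demo_is_default_port() below work on NUL-terminated strings, and their handling of an empty port or an unknown scheme is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Scan backwards from the end of <host> over trailing digits and return a
 * pointer to the first digit if they are preceded by ':'. Returns NULL when
 * there is no ":<digits>" suffix. The host is assumed to be valid, e.g. an
 * IPv6 literal must be enclosed in brackets.
 */
static const char *demo_host_port(const char *host)
{
	size_t len = strlen(host);
	size_t i = len;

	while (i > 0 && isdigit((unsigned char)host[i - 1]))
		i--;
	if (i == len || i == 0 || host[i - 1] != ':')
		return NULL;
	return host + i;
}

/* With a scheme, 80 is the default for "http://" and 443 for "https://".
 * With no scheme (NULL), both are considered default ports.
 */
static int demo_is_default_port(const char *scheme, const char *port)
{
	int is80 = strcmp(port, "80") == 0;
	int is443 = strcmp(port, "443") == 0;

	if (!scheme)
		return is80 || is443;
	if (strcmp(scheme, "http://") == 0)
		return is80;
	if (strcmp(scheme, "https://") == 0)
		return is443;
	return 0;
}
```

Scanning from the end keeps the colons inside a bracketed IPv6 literal from being mistaken for the port separator.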
8446 Commits