haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-08 08:07:10 +02:00

Author	SHA1	Message	Date
Christopher Faulet	9aed1124ed	MINOR: stconn: Add a field to specify the room needed by the SC to progress When the SC is blocked because it is waiting for room in the input buffer, it will be responsible for specifying the minimum free space required to progress. In this commit, we only introduce the field in the stconn structure that will be used to store this value. It is a signed value with the following meaning: * -1: The SC is waiting for room but not based on the buffer state. It will typically be used during splicing when the pipe is full. In this case, only a successful send can unblock the SC. * >= 0: The minimum free space in the input buffer to unblock the SC. 0 is a special value to specify the SC must be unblocked ASAP, by the stream, at the end of process_stream() or when output data are consumed on the opposite side.	2023-05-05 15:41:30 +02:00
Christopher Faulet	7a48b72d39	MINOR: peers: Use the applet API to send message The peers applet now uses the applet API to send messages instead of the channel API. This way, it does not need to take care to request more room if it fails to put data into the channel's buffer.	2023-05-05 15:41:30 +02:00
Christopher Faulet	f4258bdf3b	MINOR: stats: Use the applet API to write data stats_putchk() is updated to use the applet API instead of the channel API to write data. To do so, the appctx is passed as parameter instead of the channel. This way, the applet does not need to take care to request more room if it fails to put data into the channel's buffer.	2023-05-05 15:41:29 +02:00
Christopher Faulet	e8ee27b0fd	MINOR: cli: Use applet API to write output message Instead of using the channel API to write output messages from the CLI applet, we use the applet API. This way, the applet does not need to take care to request more room if it fails to put its message into the channel's buffer.	2023-05-05 15:41:19 +02:00
William Lallemand	b6ae2aafde	MINOR: ssl: allow to change the signature algorithm for client authentication This commit introduces the keyword "client-sigalgs" for the bind line, which does the same as "sigalgs" but for the client authentication. "ssl-default-bind-client-sigalgs" allows to set the default parameter for all the bind lines. This patch should fix issue #2081.	2023-05-05 00:05:46 +02:00
William Lallemand	1d3c822300	MINOR: ssl: allow to change the server signature algorithm This patch introduces the "sigalgs" keyword for the bind line, which allows to configure the list of server signature algorithms negotiated during the handshake. Also available as "ssl-default-bind-sigalgs" in the default section. This patch was originally written by Bruno Henc.	2023-05-04 22:43:18 +02:00
Willy Tarreau	e69919d1ba	CLEANUP: debug: remove the now unused ha_thread_dump_all_to_trash() The function isn't used anymore since each call place performs its own loop. Let's get rid of it.	2023-05-04 19:19:04 +02:00
Willy Tarreau	009b5519e6	MINOR: debug: make "show threads" properly iterate over all threads Previously it would re-dump all threads to the same trash if the output buffer was full, which it never was since the trash is of the same size. Now it dumps one thread, copies it to the buffer and yields until it can continue. Showing 256 threads works as expected.	2023-05-04 19:15:50 +02:00
Willy Tarreau	880d1684a7	MINOR: debug: write panic dump to stderr one thread at a time Currently large setups cannot dump all their threads because they're first dumped to the trash buffer, then copied to stderr. Here we can now change this, instead we dump one thread at a time into the trash and immediately send it to stderr. We also keep a copy into a local trash chunk that's assigned to thread_dump_buffer so that a core file still contains a copy of a large number of threads, which is generally sufficient for the vast majority of situations. It was verified that dumping 256 threads now produces ~55kB of output and all of them are properly dumped.	2023-05-04 19:15:50 +02:00
Willy Tarreau	9a6ecbd590	MEDIUM: debug: simplify the thread dump mechanism The thread dump mechanism that is used by "show threads" and by the panic dump is overly complicated due to an initial misdesign. It first wakes all threads, then serializes their dumps, then releases them, while taking extreme care not to face colliding dumps. In fact this is not what we need and it reached a limit where big machines cannot dump all their threads anymore due to buffer size limitations. What is needed instead is to be able to dump one thread, and to let the requester iterate on all threads. That's what this patch does. It adds the thread_dump_buffer to the struct thread_ctx so that the requester offers the buffer to the thread that is about to be dumped. This buffer also serves as a lock. A thread at rest has a NULL, a valid pointer indicates the thread is using it, and 0x1 (NULL+1) is used by the dumped thread to tell the requester it's done. This makes sure that a given thread is dumped once at a time. In addition to this, the calling thread decides whether it accesses the thread by itself or via the debug signal handler, in order to get a backtrace. This is much saner because the calling thread is free to do whatever it wants with the buffer after each thread is dumped, and there is no dependency between threads, once they've dumped, they're free to continue (and possibly to dump for another requester if needed). Finally, when the THREAD_DUMP feature is disabled and the debug signal is not used, the requester accesses the thread by itself like before. For now we still have the buffer size limitation but it will be addressed in future patches.	2023-05-04 19:15:44 +02:00
Christopher Faulet	34f81d5815	BUG/MINOR: mux-h2: Also expect data when waiting for a tunnel establishment When a client H2 stream is waiting for a tunnel establishment, it must state it expects data from server. It is the second fix that should fix regressions of the commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished") It is a 2.8-specific bug. No backport needed.	2023-05-04 16:58:33 +02:00
Willy Tarreau	cb01f5daa7	BUG/MINOR: debug: do not emit empty lines in thread dumps In 2.3, commit `471425f51` ("BUG/MINOR: debug: Don't dump the lua stack if it is not initialized") introduced the possibility to emit an empty line when there's no Lua info to dump. The problem is that doing this on the CLI in "show threads" marks the end of the output, and it may affect some external tools. We need to make sure that LFs are only emitted if there's something on the line and that all lines properly start with the prefix. This may be backported as far as 2.0 since the commit above was backported there.	2023-05-04 16:51:50 +02:00
Amaury Denoyelle	d4af04198b	MINOR: mux-quic: close connection asap on local error With the change for QUIC MUX local error API, the new flag QC_CF_ERRL is now checked on qc_detach(). If set, qcs instance is freed even though transfer is not finished. This should help to quickly release qcs and eventually all MUX instance resources. To further accelerate this, a specific check has been added in qc_shutw(). It is skipped if local error flag is set to prevent noisy reset stream invocation. In the same way, QUIC MUX is not rescheduled on qc_recv_buf() operation if local error flag set. This should be backported up to 2.7.	2023-05-04 16:36:51 +02:00
Amaury Denoyelle	35542ce7bf	MINOR: mux-quic: report local error on stream endpoint asap If an error is detected at the MUX layer, all remaining stream endpoints should be closed asap with error set. This is now done by checking for QC_CF_ERRL flag on qc_wake_some_streams() and qc_send_buf(). To complete this, qc_wake_some_streams() is called by qc_process() if needed. This should help to quickly release streams as soon as a new error is detected locally by the MUX or APP layer. This allows to in turn free the MUX instance itself. Previously, error would not have been automatically reported until the transport layer closure would occur on CONNECTION_CLOSE emission. This should be backported up to 2.7.	2023-05-04 16:36:51 +02:00
Amaury Denoyelle	51f116d65e	MINOR: mux-quic: adjust local error API When a fatal error is detected by the QUIC MUX or H3 layer, the connection should be closed with a CONNECTION_CLOSE with an error code as the reason. Previously, a direct call was used to the quic_conn layer to try to close the connection. This API was adjusted to be more flexible. Now, when an error is detected, the function qcc_set_error() is called. This sets the flag QC_CF_ERRL with the error code stored by the MUX. The connection will be closed soon so most of the operations are not conducted anymore. Connection is then finally closed during qc_send() via quic_conn layer if QC_CF_ERRL is set. This will set the flag QC_CF_ERRL_DONE which indicates that the MUX instance can be freed. This model is cleaner and brings the following improvements: - interaction with quic_conn layer for closure is centralized on a single function - CO_FL_ERROR is not set anymore. This was incorrect as this should be reserved to errors reported by the transport layer to be similar with other haproxy components. As a consequence, qcc_is_dead() has been adjusted to check for QC_CF_ERRL_DONE to release the MUX instance. This should be backported up to 2.7.	2023-05-04 16:36:51 +02:00
Amaury Denoyelle	b8901d2c86	MINOR: mux-quic: wake up after recv only if avail data When HTX content is transferred from qcs instance to upper stream endpoint, a wakeup is conducted for MUX tasklet. However, this is only necessary if demux was interrupted due to a full QCS HTX buffer. This should be backported up to 2.7.	2023-05-04 16:36:51 +02:00
Amaury Denoyelle	8d44bfaf0b	MINOR: mux-quic: add trace event for local error Add a dedicated trace event QMUX_EV_QCC_ERR. This is used for locally detected error when a CONNECTION_CLOSE should be emitted. This should be backported up to 2.7.	2023-05-04 16:36:51 +02:00
Amaury Denoyelle	b737f95009	BUG/MINOR: mux-quic: prevent quic_conn error code to be overwritten When MUX performs a graceful shutdown, quic_conn error code is set to a "no error" code which depends on the application layer used. However, this may overwrite a previous error code if quic_conn layer has detected an error on its side. In practice, this behavior has not been seen on production. In fact, it may have undesirable effect only if this error code modification happens between the quic_conn error detection and the emission of the CONNECTION_CLOSE, so it should be pretty rare. However, there is still a tiny possibility it may happen. To prevent this, first check that quic_conn error code is not set before setting it. Ideally, transport layer API should be adjusted to be able to set this without fiddling with the quic_conn directly. This should be backported up to 2.6.	2023-05-04 16:36:51 +02:00
Christopher Faulet	4403cdf653	BUG/MEDIUM: mux-h2: Properly handle end of request to expect data from server The commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished") introduced a regression in the H2 multiplexer. The end of the request is not systematically handled to state a H2 stream on client side now expects data from the server. Indeed, while the client is uploading its request, the H2 stream warns it does not expect data from the server. This way, no server timeout is applied at this stage. When end of the request is detected, the H2 stream must state it now expects the server response. This enables the server timeout. However, it was only performed at one place while the end of the request can be handled at different places. First, during a zero-copy in h2_rcv_buf(). Then, when the SC is created with the full request. Because of this bug, it is possible to totally disable the server timeout for H2 streams. In h2_rcv_buf(), we now rely on h2s flags to detect the end of the request, but only when the rxbuf was emptied. It is a 2.8-specific bug. No backport needed.	2023-05-04 16:29:27 +02:00
Willy Tarreau	e5e62231d8	MINOR: debug: permit the "debug dev loop" to run under isolation Sometimes it's convenient to test the effect of tasks running under isolation, e.g. to validate the contents of the crash dumps. Let's add an optional "isolated" keyword to "debug dev loop" for this.	2023-05-04 11:50:26 +02:00
Willy Tarreau	b30ced3d88	BUG/MINOR: debug: fix incorrect profiling status reporting in show threads Thread dumps include a field "prof" for each thread that reports whether task profiling is currently active or not. It turns out that in 2.7-dev1, commit `680ed5f28` ("MINOR: task: move profiling bit to per-thread") mistakenly replaced it with a check for the current thread's bit in the thread dumps, which basically is the only place where another thread is being watched. The same mistake was done a few lines later by confusing threads_want_rdv_mask with the profiling mask. This mask disappeared in 2.7-dev2 with commit `598cf3f22` ("MAJOR: threads: change thread_isolate to support inter-group synchronization"), though instead we know the ID of the isolated thread. This commit fixes this and now reports "isolated" instead of "wantrdv". This can be backported to 2.7.	2023-05-04 11:41:33 +02:00
Willy Tarreau	8b3e39e37b	MINOR: activity: allow "show activity" to restart in the middle of a line 16kB buffers are not enough to dump 4096 threads with up to 10 bytes value on each line. By storing the column number in the applet's context, we can now restart from the last attempted column. This requires to dump all values as they are produced, but it doesn't cost that much: a 4096-thread output from a fresh process produces 300kB of output in ~8ms, or ~400us per call (19*16kB), most of which are spent in vfprintf(). Given that we don't print more than needed, it doesn't really change anything. The main caveat is that when interrupted on such large lines, there's a great possibility that the total or average on the first column doesn't match anymore the sum or average of all dumped values. In order to avoid this whenever possible (typically less than ~1500 threads), we first try to dump entire lines and only proceed one column at a time when we have to retry a failed dump. This is already the same for other stats that are dumped in an interruptible way anyway and there's little that can be done about it at this point (and not much immediately perceived benefit in doing this with extreme accuracy for >1500 threads).	2023-05-03 17:26:11 +02:00
Willy Tarreau	6ed0b9885d	MINOR: activity: allow "show activity" to restart dumping on any line When using many threads, it's difficult to see the end of "show activity" due to the numerous columns which fill the buffer. For example a dump of a 256-thread, freshly booted process yields around 15kB. Here by arranging the dump in a loop around a switch/case block where each case checks the code line number against the current dump position, we have a restartable counter for free with a granularity of the line of code, without having to maintain a matching between states and specific lines. It just requires to reset the trash buffer for each line and to try to dump it after each line. Now dumping 256 threads after a few seconds of traffic happily emits 20kB.	2023-05-03 17:24:54 +02:00
Willy Tarreau	8ee0d11cb8	MINOR: activity: iterate over all fields in a main loop for dumping Now each line of "show activity" will iterate over n+2 fields, one for the line header, one for the total, and one per thread. This will soon allow us to save the current state in a restartable way.	2023-05-03 17:24:54 +02:00
Willy Tarreau	a465b21516	MINOR: activity: show the line header inside the SHOW_VAL macro Doing so will allow us to drop the extra chunk_appendf() dedicated to the line header and simplify iteration over restartable columns.	2023-05-03 17:24:54 +02:00
Willy Tarreau	5ddf9bea09	MINOR: activity: use a single macro to iterate over all fields Instead of having SHOW_AVG() and SHOW_TOT(), let's just have SHOW_VAL() which iterates over all values.	2023-05-03 17:24:54 +02:00
Willy Tarreau	ff508f12c6	BUILD: cli: fix build on Windows due to isalnum() implemented as a macro Commit `986798718` ("DEBUG: cli: add "debug dev task" to show/wake/expire/kill tasks and tasklets") broke the build on windows due to this: src/debug.c:940:95: error: array subscript has type char [-Werror=char-subscripts] 940 \| caller && may_access(caller) && may_access(caller->func) && isalnum(*caller->func) ? caller->func : "0", \| ^~~~~~~~~~~~~ It's classical on platforms which implement ctype.h as macros instead of functions, let's cast it as uchar. No backport is needed.	2023-05-03 16:32:50 +02:00
William Lallemand	117c7fde06	BUG/MINOR: ssl/sample: x509_v_err_str converter output when not found The x509_v_err_str converter now outputs the numerical value as a string when the corresponding constant name was not found. Must be backported as far as 2.7.	2023-05-03 15:19:38 +02:00
Willy Tarreau	9867987182	DEBUG: cli: add "debug dev task" to show/wake/expire/kill tasks and tasklets When analyzing certain types of bugs in the field, sometimes it would be nice to be able to wake up a task or tasklet to see how events progress (e.g. to detect a missing wakeup condition), or expire or kill such a task. This restricted command shows the current state of a task or tasklet and allows to manipulate it like this. However it must be used with extreme care because while it does verify that the pointers are mapped, it cannot know if they point to a real task, and performing such actions on something not a task will easily lead to a crash. In addition, performing a "kill" on a task has great chances of provoking a deferred crash due to a double free and/or another kill that is not idempotent. Use with extreme care!	2023-05-03 11:47:44 +02:00
Willy Tarreau	dd01448953	MINOR: debug: clarify "debug dev stream" help message The help message was insufficient to figure how to use it and specify the stream pointer and changes to operate.	2023-05-03 11:47:44 +02:00
Willy Tarreau	65efd33c06	BUG/MINOR: stream/cli: fix stream age calculation in "show sess" The "show sess" command displays the stream's age in synthetic form, and also makes it appear in the long version (show sess all). But that last one uses the wrong origin, it uses accept_date.tv_sec instead of accept_ts (formerly known as tv_accept). This was introduced in 1.4.2 with the long format, with commit `66dc20a17` ("[MINOR] stats socket: add show sess <id> to dump details about a session"), while the code that split the two variables was introduced in 1.3.16 with commit `b7f694f20` ("[MEDIUM] implement a monotonic internal clock"). This problem was revealed by recent change `ad5a5f677` ("MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request") that made this value report random garbage, and generally emphasized by the fact that in 2.8 the two clocks have sufficiently large an offset for such mistakes to be noticeable early. Arguably a difference between date and accept_date could also make sense, to indicate if the stream had been there for more than 49 days, but this would introduce instabilities for most sockets (including negative times) for extremely rare cases while the goal is essentially to see how much longer than a configured timeout a stream has been there. And that's what other locations (including the short form) provide. This patch could be backported but most users will never notice. In case of backport, tv_accept.tv_sec should be used instead of accept_date.tv_sec.	2023-05-03 11:47:44 +02:00
William Lallemand	64a77e3ea5	MINOR: ssl: disable CRL checks with WolfSSL when no CRL file WolfSSL is enabling by default the CRL checks even if a CRL file wasn't provided. This patch resets the default X509_STORE flags so this is not checked by default.	2023-05-02 18:30:11 +02:00
Tim Duesterhus	0ababda701	BUG/MINOR: stats: fix typo in `TotalSplicedBytesOut` field name An additional `d` slipped in there. This likely should not be backported, because scripts might rely on the typoed name. Public discussion on this topic here: https://www.mail-archive.com/haproxy@formilux.org/msg43359.html	2023-05-02 11:15:49 +02:00
Amaury Denoyelle	bc0adfa334	MINOR: proxy: factorize send rate measurement Implement a new dedicated function increment_send_rate() which can be called anywhere new bytes must be accounted for global total sent.	2023-04-28 16:53:44 +02:00
Amaury Denoyelle	1bcb695a05	MINOR: quic: use real sending rate measurement Before this patch, global sending rate was measured on the QUIC lower layer just after sendto(). This meant that all QUIC frames were accounted for, including non STREAM frames and also retransmission. To have a better reflection of the application data transferred, move the incrementation into the MUX layer. This allows to account only for STREAM frames payload on their first emission. This should be backported up to 2.6.	2023-04-28 16:52:26 +02:00
Aleksandar Lazic	5529c9985e	MINOR: sample: Add bc_rtt and bc_rttvar This Patch adds fetch samples for backends round trip time.	2023-04-28 16:31:08 +02:00
Willy Tarreau	c05d30e9d8	MINOR: clock: replace the timeval start_time with start_time_ns Now that "now" is no more a timeval, there's no point keeping a copy of it as a timeval, let's also switch start_time to nanoseconds, it simplifies operations.	2023-04-28 16:08:08 +02:00
Willy Tarreau	69530f59ae	MEDIUM: clock: replace timeval "now" with integer "now_ns" This puts an end to the occasional confusion between the "now" date that is internal, monotonic and not synchronized with the system's date, and "date" which is the system's date and not necessarily monotonic. Variable "now" was removed and replaced with a 64-bit integer "now_ns" which is a counter of nanoseconds. It wraps every 585 years, so if all goes well (i.e. if humanity does not need haproxy anymore in 500 years), it will just never wrap. This implies that now_ns is never null and that the zero value can reliably be used as "not set yet" for a timestamp if needed. This will also simplify date checks where it becomes possible again to do "date1<date2". All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns". Due to the intricacies between now, global_now and now_offset, all 3 had to be turned to nanoseconds at once. It's not a problem since all of them were solely used in 3 functions in clock.c, but they make the patch look bigger than it really is. The clock_update_local_date() and clock_update_global_date() functions are now much simpler as there's no need anymore to perform conversions nor to round the timeval up or down. The wrapping continues to happen by presetting the internal offset in the short future so that the 32-bit now_ms continues to wrap 20 seconds after boot. The start_time used to calculate uptime can still be turned to nanoseconds now. One interrogation concerns global_now_ms which is used only for the freq counters. It's unclear whether there's more value in using two variables that need to be synchronized sequentially like today or to just use global_now_ns divided by 1 million. Both approaches will work equally well on modern systems, the difference might come from smaller ones. Better not change anything for now. One benefit of the new approach is that we now have an internal date with a resolution of the nanosecond and the precision of the microsecond, which can be useful to extend some measurements given that timestamps also have this resolution.	2023-04-28 16:08:08 +02:00
Willy Tarreau	eed5da1037	MINOR: clock: do not use now.tv_sec anymore Instead we're using ns_to_sec(tv_to_ns(&now)) which allows the tv_sec part to disappear. At this point, "now" is only used as a timeval in clock.c where it is updated.	2023-04-28 16:08:08 +02:00
Willy Tarreau	e8e4712771	MINOR: checks: use a nanosecond counters instead of timeval for checks->start Now we store the checks start date as a nanosecond timestamp instead of a timeval, this will simplify the operations with "now" in the near future.	2023-04-28 16:08:08 +02:00
Willy Tarreau	b68d308aec	MINOR: activity: use nanoseconds, not timeval to compute uptime Now that we have the required functions, let's get rid of the timeval in intermediary calculations.	2023-04-28 16:08:08 +02:00
Willy Tarreau	563efe62e9	MINOR: stats: use nanoseconds, not timeval to compute uptime Now that we have the required functions, let's get rid of the timeval in intermediary calculations.	2023-04-28 16:08:08 +02:00
Willy Tarreau	ad5a5f6779	MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request Let's get rid of timeval in storage of internal timestamps so that they are no longer mistaken for wall clock time. These were exclusively used subtracted from each other or to/from "now" after being converted to ns, so this patch removes the tv_to_ns() conversion to use them natively. Two occurrences of tv_isge() were turned to a regular wrapping subtract.	2023-04-28 16:08:08 +02:00
Willy Tarreau	aaebcae58b	MINOR: spoe: switch the timeval-based timestamps to nanosecond timestamps Various points were collected during a request/response and were stored using timeval. Let's now switch them to nanosecond based timestamps.	2023-04-28 16:08:08 +02:00
Willy Tarreau	76d343d3d3	MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract Instead of operating on {sec, usec} now we convert both operands to ns then subtract them and convert to ms. This is a first step towards dropping timeval from these timestamps. Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at all and could be removed.	2023-04-28 16:08:08 +02:00
Willy Tarreau	7222db7b84	BUG/MINOR: stats: report the correct start date in "show info" The "show info" help for "Start_time_sec" says "Start time in seconds" so it's definitely the start date in human format, not the internal one that is solely used to compute uptime. Since commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot"), both are split apart since the start time takes into account the offset needed to cause the early wraparound, so we must only use start_date here. No backport is needed.	2023-04-28 16:08:08 +02:00
Christopher Faulet	2ebac6a320	BUG/MEDIUM: tcpcheck: Don't eval custom expect rule on an empty buffer The commit `a664aa6a6` ("BUG/MINOR: tcpcheck: Be able to expect an empty response") introduced a regression for expect rules relying on a custom function. Indeed, there is no check on the buffer to be sure it is not empty before calling the custom function. But some of these functions expect to have data and don't perform any test on the buffer emptiness. So instead of fixing all custom functions, we just don't eval them if the buffer is empty. This patch must be backported but only if the commit above was backported first.	2023-04-28 15:01:10 +02:00
Christopher Faulet	89aeabff5b	BUG/MINOR: resolvers: Use sc_need_room() to wait more room when dumping stats It was a cut/paste typo during stream-interface to conn-stream refactoring. sc_have_room() was used instead of sc_need_room(). This patch must be backported as far as 2.6.	2023-04-28 08:51:34 +02:00
Christopher Faulet	e99c43907c	BUG/MEDIUM: spoe: Don't start new applet if there are enough idle ones It is possible to start too many applets on sporadic bursts of events after an inactivity period. It is due to the way we estimate if a new applet must be created or not. It is based on a frequency counter. We compare the events processing rate against the number of events currently processed (in progress or waiting to be processed). But we should also take care of the number of idle applets. We already track the number of idle applets, but it is global and not per-thread. Thus we now also track the number of idle applets per-thread. It is not a big deal because this fills a hole in the spoe_agent structure. Thanks to this counter, we can refrain from creating applets if there are enough idle applets to handle currently processed events. This patch should be backported to all stable versions.	2023-04-28 08:51:34 +02:00
Willy Tarreau	d2f61de8c2	BUG/MINOR: hlua: return wall-clock date, not internal date in core.now() That's hopefully the last one affected by this. It was a bit trickier because there's the promise in the doc that the date is monotonous, so we continue to use now-start_time as the uptime value and add it to start_date to get the current date. It was also emphasized by commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot"), causing core.now() to return a date of Mar 20 on Apr 27. No backport is needed.	2023-04-27 18:44:14 +02:00
Willy Tarreau	bc3c4e85f0	BUG/MINOR: trace: show wall-clock date, not internal date in show activity Yet another case where "now" was used instead of "date" for a publicly visible date that was already incorrect and became worse after commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot"). No backport is needed.	2023-04-27 18:22:34 +02:00
Willy Tarreau	22b6d26c57	BUG/MINOR: calltrace: fix 'now' being used in place of 'date' Since commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot") we have a much clearer distinction between 'now' (the internal, drifting clock) and 'date' (the wall clock time). The calltrace code was using "now" instead of "date" since the value is displayed to humans. No backport is needed.	2023-04-27 18:14:57 +02:00
Willy Tarreau	fe1b3b8777	Revert "BUG/MINOR: clock: fix a few occurrences of 'now' being used in place of 'date'" This reverts commit `aadcfc9ea6`. The parts affecting the DeviceAtlas addon were wrong actually, the "now" variable was a local time_t in a file that's not compiled with the haproxy binary (dadwsch). Only the fix to the calltrace is correct, so better revert and fix the only one in a separate commit. No backport is needed.	2023-04-27 18:14:57 +02:00
Willy Tarreau	82bde18aa4	BUG/MINOR: activity: show wall-clock date, not internal date in show activity Another case where "now" was used instead of "date" for a publicly visible date that was already incorrect and became worse after commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot"). No backport is needed.	2023-04-27 14:47:50 +02:00
Willy Tarreau	a5f0e6cfc0	BUG/MINOR: spoe: use "date" not "now" in debug messages The debug messages were still emitted with a date taken from "now" instead of "date", which was not correct a long time ago but which became worse in 2.8 since commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot"). Let's fix it. No backport is needed.	2023-04-27 11:57:53 +02:00
Willy Tarreau	aadcfc9ea6	BUG/MINOR: clock: fix a few occurrences of 'now' being used in place of 'date' Since commit `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot") we have a much clearer distinction between 'now' (the internal, drifting clock) and 'date' (the wall clock time). There were still a few places where 'now' was being used for human consumption. No backport is needed.	2023-04-26 19:21:25 +02:00
Amaury Denoyelle	7b516d3732	BUG/MINOR: quic: fix race on quic_conns list during affinity rebind Each quic_conn is attached to a global thread-local quic_conns list used for "show quic" command. During thread rebinding, a connection is detached from its local list instance and moved to its new thread list. However this operation is not thread-safe and may cause a race condition. To fix this, only remove the connection from its list inside qc_set_tid_affinity(). The connection is inserted only afterwards in qc_finalize_affinity_rebind() on the new thread instance, thus preventing a race condition. One impact of this is that a connection will be invisible during rebinding for "show quic". A connection must not transition to closing state in between these two steps or else cleanup via quic_handle_stopping() may miss it. To ensure this, this patch relies on the previous commit : commit `d6646dddcc` MINOR: quic: finalize affinity change as soon as possible This should be backported up to 2.7.	2023-04-26 17:50:22 +02:00
Amaury Denoyelle	d6646dddcc	MINOR: quic: finalize affinity change as soon as possible During accept, a quic-conn is rebound to a new thread. This process is done in two steps : * first on the original thread via qc_set_tid_affinity() * then on the newly assigned thread via qc_finalize_affinity_rebind() Most quic_conn operations (I/O tasklet, task and quic_conn FD socket read) are reactivated only after the second step. However, there is a possibility that datagrams are handled before it via quic_dgram_parse() when using listener sockets. This does not seem to cause any issue but this may cause unexpected behavior in the future. To simplify this, qc_finalize_affinity_rebind() will be called both by qc_xprt_start() and quic_dgram_parse(). Only one invocation will be performed thanks to the new flag QUIC_FL_CONN_AFFINITY_CHANGED. This should be backported up to 2.7.	2023-04-26 17:50:16 +02:00
Amaury Denoyelle	a57ab0fabe	MINOR: mux-quic: do not allocate Tx buf for empty STREAM frame Sometimes it may be necessary to send an empty STREAM frame to signal clean stream closure with FIN bit set. Prior to this change, a Tx buffer was allocated unconditionally even if no data is transferred. Most of the time, allocation was not performed due to an older buffer being reused. But if data were already acknowledged, a new buffer is allocated. No memory leak occurs as the buffer is properly released when the empty frame acknowledgment is received. But this allocation is unnecessary and it consumes a connection Tx buffer for nothing. Improve this by skipping buffer allocation if there is no data to transfer. qcs_build_stream_frm() is now able to deal with a NULL out argument. This should be backported up to 2.6.	2023-04-26 17:50:16 +02:00
Amaury Denoyelle	42c5b75cac	MINOR: mux-quic: do not set buffer for empty STREAM frame Previous patch fixes an issue occurring with empty STREAM frames without payload. The crash was hidden in part because buf/data fields of qf_stream were set even if no payload is referenced. This was not the true cause of the crash but to ease future debugging, a STREAM frame built with no payload now has its buf and data fields set to NULL. This should be backported up to 2.6.	2023-04-26 17:50:16 +02:00
Amaury Denoyelle	19eaf88fda	BUG/MINOR: quic: prevent buggy memcpy for empty STREAM Sometimes it may be necessary to send empty STREAM frames with only the FIN bit set. For these frames, memcpy is thus unnecessary as their payload is empty. However, we did not prevent its invocation inside quic_build_stream_frame(). Normally, memcpy invocation with length==0 is safe. However, there is an extra condition in our function to handle data wrapping. For an empty STREAM frame in the context of MUX emission, this is safe as the frame points to a valid buffer, which causes the wrapping condition to be false and results in a memcpy with 0 length. However, in the context of retransmission, this may lead to a crash. Consider the following scenario: two STREAM frames A and B are produced, one with payload and one empty with FIN set, pointing to the same stream_desc buffer. If A is acknowledged by the peer, its buffer is released as no more data is left in it. If B needs to be resent, the wrapping condition will be messed up due to the reuse of a freed buffer. Most of the time, <wrap> will be a negative number, which results in a memcpy invocation causing a buffer overflow. To fix this, simply add an extra condition to skip memcpy and the wrapping check if the STREAM frame length is zero inside quic_build_stream_frame(). This crash is pretty rare as it relies on a lot of conditions that are difficult to reproduce. It seems to be the cause of the latest crashes reported under github issue #2120. In all the inspected dumps, the segfault occurred during retransmission with an empty STREAM frame being used as input. Thanks again to Tristan from Mangadex for his help and investigation on it. This should be backported up to 2.6.	2023-04-26 17:50:16 +02:00
Amaury Denoyelle	7c5591facb	BUG/MEDIUM: mux-quic: improve streams fairness to prevent early timeout Since the following mentioned patch, a send-list mechanism was implemented to improve streams prioritization on sending. commit `20f2a425ff` MAJOR: mux-quic: rework stream sending priorization This is done to prevent the same streams from always being used first on emission. However there is still a flaw in the algorithm. Once put in the send-list, a stream is not removed until it has sent all of its content. When a stream transfers a large object, it will remain in the send-list during the whole transfer and will soon monopolize the first place, never leaving its position until the transfer is finished. Other streams behind it won't have the opportunity to advance on their own transfers due to Tx buffer exhaustion. This situation is especially problematic if a small timeout client is used. As some streams won't advance on their transfer for a long period of time, they will be aborted due to a stream layer timeout client, causing a RESET_STREAM emission. To fix this, during sending, each stream with at least some bytes transferred from its tx.buf to the qc_stream_desc out buffer is put at the end of the send-list. This ensures that on the next iteration, streams that could not transfer anything will be used in priority. This patch significantly improves h2load benchmarks for large objects with several streams opened in parallel on a single connection. Without it, errors may be reported by h2load for aborted streams. For example, this improved the following scenario on a 10mbit/s link with a 10s timeout client : $ ./build/bin/h2load --npn-list h3 -t 1 -c 1 -m 30 -n 30 https://198.18.10.11:20443/?s=500k This fix may help with the github issue #2004 where the chrome browser stops using QUIC after receiving RESET_STREAM frames. This should be backported up to 2.7.	2023-04-26 17:50:16 +02:00
Amaury Denoyelle	24962dd178	BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length Some HTX responses may not always contain an EOM block. For example this is the case if the content-length header is missing from the HTTP server response. Stream termination is thus signaled to the QUIC mux via the shutw callback. However, this is interpreted unconditionally as an early close by the mux with a RESET_STREAM emission. Most of the time, QUIC clients report this as an error. To fix this, check if htx.extra is set to HTX_UNKOWN_PAYLOAD_LENGTH for a qcs instance. If true, shutw will never be used to emit a RESET_STREAM. Instead, the stream will be closed properly with a FIN STREAM frame. If all data were already transferred, an empty STREAM frame is sent. This fix may help with the github issue #2004 where the chrome browser stops using QUIC after receiving RESET_STREAM frames. This issue was reported by Vladimir Zakharychev. Thanks to him for his help and testing. It was also reproduced locally using httpterm with the query string "/?s=1k&b=0&C=1". This should be backported up to 2.7.	2023-04-26 17:50:09 +02:00
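The guard described in 19eaf88fda can be pictured with a small standalone sketch. This is not the code of quic_build_stream_frame(); `copy_wrapped()` and its parameters are hypothetical names used only to show the idea of returning before any wrapping arithmetic when the frame carries no payload, so that a possibly released buffer is never inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy <len> bytes starting at <data> out of the ring
 * area [area, area + size) into <dst>, handling wrap-around.
 * Returns the number of bytes copied.
 */
static size_t copy_wrapped(unsigned char *dst, const unsigned char *area,
                           size_t size, const unsigned char *data, size_t len)
{
    size_t contig;

    /* Empty frame: return before touching <area> or <data>, which may
     * refer to a buffer that was already released.
     */
    if (!len)
        return 0;

    /* Bytes available before the wrapping point. */
    contig = (size_t)(area + size - data);
    if (contig > len)
        contig = len;

    memcpy(dst, data, contig);
    /* Wrapped remainder, taken from the start of the area (may be 0 bytes). */
    memcpy(dst + contig, area, len - contig);
    return len;
}
```

With an 8-byte area holding "ABCDEFGH", copying 4 bytes from offset 6 yields "GHAB", while a zero-length request returns immediately even if both pointers are NULL.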
Frédéric Lécaille	7d23e8d1a6	CLEANUP: quic: Rename several <buf> variables into quic_sock.c Rename some variables which are not struct buffer variables. Should be backported to 2.7.	2023-04-24 15:53:27 +02:00
Frédéric Lécaille	bb426aa5f1	CLEANUP: quic: Rename <buf> variable into qc_parse_hd_form() There is no struct buffer variable manipulated by this function. Should be backported to 2.7.	2023-04-24 15:53:27 +02:00
Frédéric Lécaille	6ff52f9ce5	CLEANUP: quic: Rename <buf> variable into quic_packet_read_long_header() Make this function be more readable: there is no struct buffer variable passed as parameter to this function. Should be backported to 2.7.	2023-04-24 15:53:27 +02:00
Frédéric Lécaille	81a02b59f5	CLEANUP: quic: Rename several <buf> variables at low level Make quic_stateless_reset_token_cpy(), quic_derive_cid() and quic_get_cid_tid() be more readable: there is no struct buffer variable manipulated by these functions. Should be backported to 2.7.	2023-04-24 15:53:27 +02:00
Frédéric Lécaille	182934d80b	CLEANUP: quic: Rename quic_get_dgram_dcid() <buf> variable quic_get_dgram_dcid() does not manipulate any struct buffer variable. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	1e0f8255a1	CLEANUP: quic: Make qc_build_pkt() be more readable There is no <buf> variable passed to this function. Also rename <buf_end> to <end> to mimic other functions. Rename <beg> to <first_byte> and <end> to <last_byte>. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	3adb9e85a1	CLEANUP: quic: Rename <buf> variable for several low level functions Make quic_build_packet_long_header(), quic_build_packet_short_header() and quic_apply_header_protection() be more readable: there is no struct buffer variable used by these functions. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	bef3098d33	CLEANUP: quic: Rename <buf> variable into quic_rx_pkt_parse() Make this function be more readable: there is no struct buffer variable used by this function. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	7f0b1c7016	CLEANUP: quic: Rename <buf> variable into quic_padding_check() Make quic_padding_check() be more readable: there is no struct buffer variable used by this function. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	dad0ede28a	CLEANUP: quic: Rename <buf> variable to <token> in quic_generate_retry_token() Make quic_generate_retry_token() be more readable: there is no struct buffer variable used in this function. Should be backported to 2.7.	2023-04-24 15:53:26 +02:00
Frédéric Lécaille	e66d67a1ae	CLEANUP: quic: Remove useless parameters passes to qc_purge_tx_buf() Remove the pointer to the connection passed as parameter to qc_purge_tx_buf() and other similar functions which came with the qc_purge_tx_buf() implementation. They were there to track the connection during tests. Must be backported to 2.7.	2023-04-24 15:53:26 +02:00
Amaury Denoyelle	d5f03cd576	CLEANUP: quic: rename frame variables Rename all frame variables with the suffix _frm. This helps to differentiate frame instances from other internal objects. This should be backported up to 2.7.	2023-04-24 15:35:22 +02:00
Amaury Denoyelle	888c5f283a	CLEANUP: quic: rename frame types with an explicit prefix Each frame type used in quic_frame union has been renamed with the following prefix "qf_". This helps to differentiate frame instances from other internal objects. This should be backported up to 2.7.	2023-04-24 15:35:03 +02:00
Frédéric Lécaille	b73762ad78	BUG/MINOR: quic: Useless I/O handler task wakeups (draining, killing state) From the idle_timer_task(), the I/O handler must be woken up to send ack. But there is no reason to do that in draining state or killing state. In draining state this is even forbidden. Must be backported to 2.7.	2023-04-24 11:47:11 +02:00
Frédéric Lécaille	d21c628ffd	BUG/MINOR: quic: Useless probing retransmission in draining or killing state The timer task responsible for triggering probing retransmission did not inspect the state of the connection before doing its job. But there is no need to probe the peer when the connection is in draining or killing state. In the draining state, this is even forbidden. Must be backported to 2.7 and 2.6.	2023-04-24 11:46:33 +02:00
Frédéric Lécaille	c6bec2a3af	BUG/MINOR: quic: Possible leak during probing retransmissions qc_dgrams_retransmit() prepares two lists of frames to be retransmitted into two datagrams. If the first datagram could not be sent, the TX buffer will be purged with the prepared packet and its frames, but this was not the case for the second list of frames. Must be backported to 2.7.	2023-04-24 11:38:28 +02:00
Frédéric Lécaille	ce0bb338c6	BUG/MINOR: quic: Possible memory leak from TX packets This bug arrived with this commit which was not sufficient: BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts() Indeed, there were also remaining allocated TX packets to be released and their TX frames. Implement qc_purge_tx_buf() to do so which depends on qc_free_tx_coalesced_pkts() and qc_free_frm_list(). Must be backported to 2.7.	2023-04-24 11:38:28 +02:00
Frédéric Lécaille	e95e00e305	MINOR: quic: Move traces at proto level These traces have already been useful to debug issues. Must be backported to 2.7 and 2.6.	2023-04-24 11:38:16 +02:00
Willy Tarreau	0e875cf291	MEDIUM: listener: switch the default sharding to by-group Sharding by-group is exactly identical to by-process for a single group, and will use the same number of file descriptors for more than one group, while significantly lowering the kernel's locking overhead. Now that all special listeners (cli, peers) are properly handled, and that support for SO_REUSEPORT is detected at runtime per protocol, there should be no more reason for not switching to by-group by default. That's what this patch does. It does only this and nothing else so that it's easy to revert, should any issue be raised. Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed into 8 core complexes of 3 cores each, shows that configuring 8 groups (one per CCX) is sufficient to simply double the forwarded connection rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.	2023-04-23 10:18:16 +02:00
Willy Tarreau	7310164b2c	MINOR: listener: add a new global tune.listener.default-shards setting This new setting accepts "by-process", "by-group" and "by-thread" and will dictate how listeners will be sharded by default when nothing is specified. While the default remains "by-process", "by-group" should be much more efficient with many threads, while not changing anything for single-group setups.	2023-04-23 09:46:15 +02:00
Willy Tarreau	c38499ceae	MINOR: listener: do not restrict CLI to first group anymore Now that we're able to run listeners on any set of groups, we don't need to maintain a special case about the stats socket anymore. It used to be forced to group 1 only so as to avoid startup failures in case several groups were configured, but if it's done now, it will automatically bind the needed FDs to have one per group so this is no more an issue.	2023-04-23 09:46:15 +02:00
Willy Tarreau	f1003ea7fa	MINOR: protocol: perform a live check for SO_REUSEPORT support When testing if a protocol supports SO_REUSEPORT, we're now able to verify if the OS does really support it. While it may be supported at build time, it may possibly have been blocked in a container for example so we'd rather know what it's like.	2023-04-23 09:46:15 +02:00
Willy Tarreau	b073573c10	MINOR: sock: add a function to check for SO_REUSEPORT support at runtime The new function _sock_supports_reuseport() will be used to check if a protocol type supports SO_REUSEPORT or not. This will be useful to verify that shards can really work.	2023-04-23 09:46:15 +02:00
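For readers unfamiliar with the idea behind b073573c10, a runtime probe can be sketched as follows. This is a simplified illustration and not the body of _sock_supports_reuseport(): `probe_reuseport()` is a hypothetical name, and the sketch only checks that the option can be set on a fresh socket, whereas a real check may need to go further (for example by actually binding sockets).

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if a socket of <family>/<type> accepts
 * SO_REUSEPORT at runtime, 0 otherwise (including when the option is not
 * known at build time or when the socket cannot be created).
 */
static int probe_reuseport(int family, int type)
{
    int ret = 0;
#ifdef SO_REUSEPORT
    int one = 1;
    int fd = socket(family, type, 0);

    if (fd < 0)
        return 0;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) == 0)
        ret = 1;
    close(fd);
#else
    (void)family;
    (void)type;
#endif
    return ret;
}
```

Calling `probe_reuseport(AF_INET, SOCK_STREAM)` once at startup is enough for such a check; the result reflects the running environment rather than the build-time headers, which matters when the option is blocked inside a container.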
Willy Tarreau	8a5e6f4cca	MINOR: protocol: add a function to check if some features are supported The new function protocol_supports_flag() checks the protocol flags to verify if some features are supported, but will support being extended to refine the tests. Let's use it to check for REUSEPORT.	2023-04-23 09:46:15 +02:00
Willy Tarreau	c1fbdd6397	MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT Now if multiple shards are explicitly requested, and the listener's protocol doesn't support SO_REUSEPORT, sharding is disabled, which will result in the socket being automatically duped if needed. A warning is emitted when this happens. If "shards by-group" or "shards by-thread" are used, these will automatically be turned down to 1 since we want this to be possible easily using -dR on the command line without having to adjust the config. For "by-thread", a diag warning will be emitted to help troubleshoot possible performance issues.	2023-04-23 09:46:15 +02:00
Willy Tarreau	785b89f551	MINOR: protocol: move the global reuseport flag to the protocols Some protocols support SO_REUSEPORT and others do not. Some have such a limitation in the kernel, and others in haproxy itself (e.g. sock_unix cannot support multiple bindings since each one will unbind the previous one). Also it's really protocol-dependent and not just family-dependent because on Linux for some time it was supported for TCP and not UDP. Let's move the definition to the protocols instead. Now it's preset in tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset. The enabled() config condition test validates IPv4 (generally sufficient), and -dR / noreuseport affect all protocols at once.	2023-04-23 09:46:15 +02:00
Willy Tarreau	65df7e028d	MINOR: protocol: add a flags field to store info about protocols We'll use these flags to know if some protocols are supported, and if so, with what options/extensions. Reuseport will move there for example. Two functions were added to globally set/clear a flag.	2023-04-23 09:46:15 +02:00
Willy Tarreau	a22db6567f	MEDIUM: peers: call bind_complete_thread_setup() to finish the config The listeners in peers sections were still not handling the thread groups properly. Shards were silently ignored and if a listener was bound to more than one group, it would simply fail. Now we can call the dedicated function to resolve all this and possibly create the missing extra listeners. bind_complete_thread_setup() was adjusted to use the proxy_type_str() instead of writing "proxy" at the only place where this word was still hard-coded so that we continue to speak about peers sections when relevant.	2023-04-23 09:46:15 +02:00
Willy Tarreau	f6a8444f55	REORG: listener: move the bind_conf's thread setup code to listener.c What used to be only two lines to apply a mask in a loop in check_config_validity() grew into a 130-line block that performs deeply listener-specific operations that do not have their place there anymore. In addition it's worth noting that the peers code still doesn't support shards nor being bound to more than one group, which is a second reason for moving that code to its own function. Nothing was changed except recreating the missing variables from the bind_conf itself (the fe only).	2023-04-23 09:46:15 +02:00
Willy Tarreau	e1a0107f9c	BUG/MINOR: config: fix NUMA topology detection on FreeBSD In 2.6-dev1, NUMA topology detection was enabled on FreeBSD with commit `f5d48f8b3` ("MEDIUM: cfgparse: numa detect topology on FreeBSD."). But it suffers from a minor bug which is that it forgets to check for the number of domains and always emits a confusing warning indicating that multiple sockets were found while it's not the case. This can be backported to 2.6.	2023-04-23 09:46:15 +02:00
Willy Tarreau	997ad155fe	BUG/MINOR: tools: check libssl and libcrypto separately The lib compatibility checks introduced in 2.8-dev6 with commit `c3b297d5a` ("MEDIUM: tools: further relax dlopen() checks too consider grouped symbols") were partially incorrect in that they check at the same time libcrypto and libssl. But if loading a library that only depends on libcrypto, the ssl-only symbols will be missing and this might present an inconsistency. This is what is observed on FreeBSD 13.1 when libcrypto is being loaded, where it sees two symbols having disappeared. The fix consists in splitting the checks for libcrypto and libssl. No backport is needed, unless the patch above finally gets backported.	2023-04-23 09:46:15 +02:00
Willy Tarreau	9f53b7b41a	BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available On FreeBSD 13.1 I noticed that thread balancing using shards was not always working. Sometimes several threads would work, but most of the time a single one was taking all the traffic. This is related to how SO_REUSEPORT works on FreeBSD since version 12, as it seems there is no guarantee that multiple sockets will receive the traffic. However there is SO_REUSEPORT_LB that is designed exactly for this, so we'd rather use it when available. This patch may possibly be backported, but nobody complained and it's not sure that many users rely on shards. So better wait for some feedback before backporting this.	2023-04-23 09:46:15 +02:00
Ilya Shipitsin	ccf8012f28	CLEANUP: assorted typo fixes in the code and comments This is 36th iteration of typo fixes	2023-04-23 09:44:53 +02:00
Willy Tarreau	023c311d70	BUG/MINOR: cli: clarify error message about stats bind-process In 2.7-dev2, "stats bind-process" was removed by commit `94f763b5e` ("MEDIUM: config: remove deprecated "bind-process" directives from frontends") and an error message indicates that it's no more supported. However it says "stats" is not supported instead of "stats bind-process", making it a bit confusing. This should be backported to 2.7.	2023-04-23 09:40:56 +02:00
Tim Duesterhus	1307cd42d2	CLEANUP: Stop checking the pointer before calling `ring_free()` Changes performed with this Coccinelle patch: @@ expression e; @@ - if (e != NULL) { ring_free(e); - } @@ expression e; @@ - if (e) { ring_free(e); - } @@ expression e; @@ - if (e) ring_free(e); @@ expression e; @@ - if (e != NULL) ring_free(e);	2023-04-23 00:28:25 +02:00
Tim Duesterhus	fe83f58906	CLEANUP: Stop checking the pointer before calling `task_free()` Changes performed with this Coccinelle patch: @@ expression e; @@ - if (e != NULL) { task_destroy(e); - } @@ expression e; @@ - if (e) { task_destroy(e); - } @@ expression e; @@ - if (e) task_destroy(e); @@ expression e; @@ - if (e != NULL) task_destroy(e);	2023-04-23 00:28:25 +02:00
Tim Duesterhus	c18e244515	CLEANUP: Stop checking the pointer before calling `pool_free()` Changes performed with this Coccinelle patch: @@ expression e; expression p; @@ - if (e != NULL) { pool_free(p, e); - } @@ expression e; expression p; @@ - if (e) { pool_free(p, e); - } @@ expression e; expression p; @@ - if (e) pool_free(p, e); @@ expression e; expression p; @@ - if (e != NULL) pool_free(p, e);	2023-04-23 00:28:25 +02:00
Tim Duesterhus	b1ec21d259	CLEANUP: Stop checking the pointer before calling `tasklet_free()` Changes performed with this Coccinelle patch: @@ expression e; @@ - if (e != NULL) { tasklet_free(e); - } @@ expression e; @@ - if (e) { tasklet_free(e); - } @@ expression e; @@ - if (e) tasklet_free(e); @@ expression e; @@ - if (e != NULL) tasklet_free(e); See GitHub Issue #2126	2023-04-23 00:28:25 +02:00
Willy Tarreau	8adffaa899	MINOR: listener: always compare the local thread as well By comparing the local thread's load with the least loaded thread's load, we can further improve the fairness and at the same time also improve locality since it allows a small ratio of connections not to be migrated. This is visible on CPU usage with long connections on very large thread counts (224) and high bandwidth (200G). The cost of checking the local thread's load remains fairly low so there's no reason not to do this. We continue to update the index if we select the local thread, because it means that the two other threads were both more loaded so we'd rather find better ones.	2023-04-21 17:41:26 +02:00
Willy Tarreau	ff18504d73	MINOR: listener: make sure to avoid ABA updates in per-thread index One limitation of the current thread index mechanism is that if the values are assigned multiple times to the same thread and the index loops, it can match again the old value, which will not prevent a competing thread from finishing its CAS and assigning traffic to a thread that's not the optimal one. The probability is low but the solution is simple enough and consists in implementing an update counter in the high bits of the index to force a mismatch in this case (assuming we don't try to cover for extremely unlikely cases where the update counter loops while the index remains equal). So let's do that. In order to improve the situation a little bit, we now set the index to a ulong so that in 32 bits we have 8 bits of counter and in 64 bits we have 40 bits.	2023-04-21 17:41:26 +02:00
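The update-counter idea from ff18504d73 can be illustrated with a minimal C11 sketch. It is not the listener code: `update_idx()` is a hypothetical helper, and the 24-bit index width is only inferred from the 8-bit and 40-bit counter sizes quoted in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>

#define IDX_BITS 24                        /* assumed width of the index part */
#define IDX_MASK ((1UL << IDX_BITS) - 1)

/* Hypothetical helper: try to replace the value previously read as <old>
 * with a new index <new_idx>. The counter kept in the upper bits is bumped
 * on every update, so a stale <old> no longer matches even if the index
 * part has looped back to the same value. Returns non-zero on success.
 */
static inline int update_idx(_Atomic unsigned long *v, unsigned long old,
                             unsigned long new_idx)
{
    unsigned long next;

    /* Increment the counter (upper bits) and install the new index. */
    next = ((old & ~IDX_MASK) + (IDX_MASK + 1)) | (new_idx & IDX_MASK);
    return atomic_compare_exchange_strong(v, &old, next);
}
```

A thread that read index 5, was delayed while others moved the index to 7 and back to 5, then fails its compare-and-swap because the counter bits differ, which is the mismatch the commit aims to force.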
Willy Tarreau	77e33509c8	MINOR: listener: resync with the thread index before heavy calculations During heavy accept competition, the CAS will occasionally fail and we'll have to go through all the calculation again. While the first two loops look heavy, they're almost never taken so they're quite cheap. However the rest of the operation is heavy because we have to consult connection counts and queue indexes for other threads, so better double-check if the index is still valid before continuing. Tests show that it's more efficient to retry half-way like this.	2023-04-21 17:41:26 +02:00
Willy Tarreau	b657492680	MINOR: listener: use a common thr_idx from the reference listener Instead of seeing each listener use its own thr_idx, let's use the same for all those from a shard. It should provide more accurate and smoother thread allocation.	2023-04-21 17:41:26 +02:00
Willy Tarreau	9d360604bd	MEDIUM: listener: rework thread assignment to consider all groups Till now threads were assigned in listener_accept() to other threads of the same group only, using a single group mask. Now that we have all the relevant info (array of listeners of the same shard), we can spread the thr_idx to cover all assigned groups. The thread indexes now contain the group number in their upper bits, and the indexes run over the whole list of threads, all groups included. One particular subtlety here is that switching to a thread from another group also means switching the group, hence the listener. As such, when changing the group we need to update the connection's owner to point to the listener of the same shard that is bound to the target group.	2023-04-21 17:41:26 +02:00
Willy Tarreau	e6f5ab5afa	MINOR: listener: make accept_queue index atomic There has always been a race when checking the length of an accept queue to determine which one is more loaded than another, because the head and tail are read at two different moments. This is not required, we can merge them as two 16 bit numbers inside a single 32-bit index that is always accessed atomically. This way we read both values at once and always have a consistent measurement.	2023-04-21 17:41:26 +02:00
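As a rough illustration of the packing described in e6f5ab5afa (not the actual accept_queue code; `pack_idx()`, `queue_len()` and the ring size are hypothetical), head and tail can share one 32-bit word so that a single atomic load yields a consistent pair:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QUEUE_SIZE 1024u   /* hypothetical ring size, must be a power of two */

/* Pack a 16-bit head (upper half) and a 16-bit tail (lower half) together. */
static inline uint32_t pack_idx(uint16_t head, uint16_t tail)
{
    return ((uint32_t)head << 16) | tail;
}

/* Number of queued entries, computed from ONE atomic snapshot, so head and
 * tail always belong to the same instant.
 */
static inline unsigned int queue_len(_Atomic uint32_t *idx)
{
    uint32_t snap = atomic_load(idx);
    uint16_t head = snap >> 16;
    uint16_t tail = snap & 0xffff;

    /* Modular distance from head to tail within the ring. */
    return (unsigned int)(tail - head) & (QUEUE_SIZE - 1);
}
```

With head at 1020 and tail at 4 in a 1024-entry ring, the computed length is 8. Reading the two halves separately could instead combine an old head with a new tail and report a length that never existed.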
Willy Tarreau	09b52d1c3d	MEDIUM: config: permit to start a bind on multiple groups at once Now it's possible for a bind line to span multiple thread groups. When this happens, the first one will become the reference and will be entirely set up, and the subsequent ones will be duplicated from this reference, so that they can be registered in distinct groups. The reference is always setup and started first so it is always available when the other ones are started. The doc was updated to reflect this new possibility with its limitations and impacts, and the differences with the "shards" option.	2023-04-21 17:41:26 +02:00
Willy Tarreau	09e266e6f5	MINOR: proto: skip socket setup for duped FDs It's not strictly necessary, but it's still better to avoid setting up the same socket multiple times when it's being duplicated to a few FDs. We don't change that for inherited ones however since they may really need to be set up, so we only skip duplicated ones.	2023-04-21 17:41:26 +02:00
Willy Tarreau	0e1aaf4e78	MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP The different protocol's ->bind() function will now check the receiver's RX_F_MUST_DUP flag to decide whether to bind a fresh new listener from scratch or reuse an existing one and just duplicate it. It turns out that the existing code already supports reusing FDs since that was done as part of the FD passing and inheriting mechanism. Here it's not much different, we pass the FD of the reference receiver, it gets duplicated and becomes the new receiver's FD. These FDs are also marked RX_F_INHERITED so that they are not exported and avoid being touched directly (only the reference should be touched).	2023-04-21 17:41:26 +02:00
Willy Tarreau	aae1810b4d	MINOR: receiver: add a struct shard_info to store info about each shard In order to create multiple receivers for one multi-group shard, we'll need some more info about the shard. Here we store: - the number of groups (= number of receivers) - the number of threads (will be used for accept LB) - pointer to the reference rx (to get the FD and to find all threads) - pointers to the other members (to iterate over all threads) For now since there's only one group per shard it remains simple. The listener deletion code already takes care of removing the current member from its shards list and moving others' reference to the last one if it was their reference (so as to avoid o(n^2) updates during ordered deletes). Since the vast majority of setups will not use multi-group shards, we try to save memory usage by only allocating the shard_info when it is needed, so the principle here is that a receiver shard_info==NULL is alone and doesn't share its socket with another group. Various approaches were considered and tests show that the management of the listeners during boot makes it easier to just attach to or detach from a shard_info and automatically allocate it if it does not exist, which is what is being done here. For now the attach code is not called, but detach is already called on delete.	2023-04-21 17:41:26 +02:00
Willy Tarreau	84fe1f479b	MINOR: listener: support another thread dispatch mode: "fair" This new algorithm for rebalancing incoming connections to multiple threads is simpler and instead of considering the threads load, it will only cycle through all of them, offering a fair share of the traffic to each thread. It may be well suited for short-lived connections but is also convenient for very large thread counts where it's not always certain that the least loaded thread will always be found.	2023-04-21 17:41:26 +02:00
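The principle behind the "fair" mode in 84fe1f479b, cycling through threads without looking at their load, can be sketched in a few lines. This is only an illustration under simplifying assumptions (a single shared counter, a flat thread list); `pick_thread_fair()` is a hypothetical name and not a function from listener.c.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: pick the next thread in strict rotation, ignoring
 * load. <rr> is a counter shared by all accepting threads and <nbthr> is
 * the number of eligible threads (must be > 0).
 */
static inline unsigned int pick_thread_fair(_Atomic unsigned int *rr,
                                            unsigned int nbthr)
{
    /* atomic_fetch_add() returns the previous value, so concurrent callers
     * each get a distinct slot.
     */
    return atomic_fetch_add(rr, 1) % nbthr;
}
```

Successive calls with three threads return 0, 1, 2, 0 and so on. No per-thread connection count or queue length has to be consulted, which is what keeps this approach cheap at very large thread counts.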
Willy Tarreau	6a4d48b736	MINOR: quic_sock: index li->per_thr[] on local thread id, not global one There's a li_per_thread array in each listener for use with QUIC listeners. Since thread groups were introduced, this array can be allocated too large because global.nbthread is allocated for each listener, while no more than MIN(nbthread,MAX_THREADS_PER_GROUP) may be used by a single listener. This was because the global thread ID is used as the index instead of the local ID (since a listener may only be used by a single group). Let's just switch to local ID and reduce the allocated size.	2023-04-21 17:41:26 +02:00
Willy Tarreau	77d37b07b1	MINOR: quic: support migrating the listener as well When migrating a quic_conn to another thread, we may need to also switch the listener if the thread belongs to another group. When this happens, the freshly created connection will already have the target listener, so let's just pick it from the connection and use it in qc_set_tid_affinity(). Note that it will be the caller's responsibility to guarantee this.	2023-04-21 17:41:26 +02:00
Aurelien DARRAGON	23f352f7d0	MINOR: server/event_hdl: prepare for server event data wrapper Adding the possibility to publish an event using a struct wrapper around existing SERVER events to provide additional contextual info. Using the specific struct wrapper is not required: it is supported to cast event data as a regular server event data struct so that we don't break the existing API. However, casting event data with a more explicit data type allows to fetch event-only relevant hints.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	f71e0645c1	MEDIUM: server: split srv_update_status() in two functions Considering that srv_update_status() is now synchronous again since `3ff577e1` ("MAJOR: server: make server state changes synchronous again"), and that we can easily identify if the update is from an operational or administrative context thanks to "MINOR: server: pass adm and op cause to srv_update_status()". And given that administrative and operational updates cannot be cumulated (since srv_update_status() is called synchronously and independently for admin updates and state/operational updates, and the function directly consumes the changes). We split srv_update_status() in 2 distinct parts: Either <type> is 0, meaning the update is an operational update which is handled by directly looking at cur_state and next_state to apply the proper transition. Also, the check to prevent operational state from being applied if MAINT admin flag is set is no longer needed given that the calling functions already ensure this (ie: srv_set_{running,stopping,stopped) Or <type> is 1, meaning the update is an administrative update, where cur_admin and next_admin are evaluated to apply the proper transition and deduct the resulting server state (next_state is updated implicitly). Once this is done, both operations share a common code path in srv_update_status() to update proxy and servers stats if required. Thanks to this change, the function's behavior is much more predictable, it is not an all-in-one function anymore. Either we apply an operational change, else it is an administrative change. That's it, we cannot mix the 2 since both code paths are now properly separated.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	76e255520f	MINOR: server: pass adm and op cause to srv_update_status() Operational and administrative state change causes are not propagated through srv_update_status(), instead they are directly consumed within the function to provide additional info during the call when required. Thus, there is no valid reason for keeping adm and op causes within server struct. We are wasting space and keeping unneeded complexity. We now explicitly pass change type (operational or administrative) and associated cause to srv_update_status() so that no extra storage is needed since those values are only relevant from srv_update_status().	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	10518c0d59	CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment Fixing function comments for the server state changing function since they still refer to asynchronous propagation of server state which is no longer in play. Moreover, there were some mixups between running/stopping.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	c54b98ac9a	CLEANUP: server: remove unused variables in srv_update_status() check and px local variable aliases are not very useful. Let's remove them and use s->check and s->proxy instead.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	1746b56e68	MINOR: server: change srv_op_st_chg_cause storage type This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type". While looking at current srv_op_st_chg_cause usage, it was clear that the struct needed some cleanup since some leftovers from asynchronous server state change updates were left behind and resulted in some useless code duplication, and making the whole thing harder to maintain. Two observations were made: - by tracking down srv_set_{running, stopped, stopping} usage, we can see that the <reason> argument is always a fixed statically allocated string. - check-related state change context (duration, status, code...) is not used anymore since srv_append_status() directly extracts the values from the server->check. This is pure legacy from when the state changes were applied asynchronously. To prevent code duplication, useless string copies and make the reason/cause more exportable, we store it as an enum now, and we provide srv_op_st_chg_cause() function to fetch the related description string. HEALTH and AGENT causes (check related) are now explicitly identified to make consumers like srv_append_op_chg_cause() able to fetch checks info from the server itself if they need to.	2023-04-21 14:36:45 +02:00
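The enum-plus-lookup pattern described in the commit above can be sketched as follows. This is only a minimal illustration, not HAProxy's code: the `demo_` names, the enum members and the description strings are all hypothetical.

```c
/* Hypothetical cause identifiers; these are NOT HAProxy's actual values. */
enum demo_op_cause {
	DEMO_OP_CAUSE_NONE = 0,
	DEMO_OP_CAUSE_HEALTH, /* health check: details live in the check itself */
	DEMO_OP_CAUSE_AGENT,  /* agent check: details live in the agent check */
	DEMO_OP_CAUSE_CLI,
	DEMO_OP_CAUSE_MAX
};

/* One static description per cause, so nothing is copied at runtime. */
static const char *const demo_op_cause_str[DEMO_OP_CAUSE_MAX] = {
	[DEMO_OP_CAUSE_NONE]   = "",
	[DEMO_OP_CAUSE_HEALTH] = "health check",
	[DEMO_OP_CAUSE_AGENT]  = "agent check",
	[DEMO_OP_CAUSE_CLI]    = "changed from CLI",
};

/* Return the human-readable description for a cause. */
const char *demo_op_cause_desc(enum demo_op_cause cause)
{
	if ((unsigned int)cause >= DEMO_OP_CAUSE_MAX)
		return "unknown";
	return demo_op_cause_str[cause];
}

/* Tell a consumer whether extra info must be fetched from the checks. */
int demo_op_cause_needs_check(enum demo_op_cause cause)
{
	return cause == DEMO_OP_CAUSE_HEALTH || cause == DEMO_OP_CAUSE_AGENT;
}
```

Storing a small integer instead of a string buffer keeps the struct compact, and the description is only resolved when a consumer actually needs to print it.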
Aurelien DARRAGON	f3b48a808e	MINOR: server: srv_append_status refacto srv_append_status() has become a swiss-knife function over time. It is used from server code and also from checks code, with various inputs and distinct code paths, making it very hard to guess the actual behavior of the function (resulting string output). To simplify the logic behind it, we're dividing it in multiple contextual functions that take simple inputs and do explicit things, making them more predictable and easier to maintain.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9b1ccd7325	MINOR: server: change adm_st_chg_cause storage type Even though it doesn't look like it at first glance, this is more like a cleanup than an actual code improvement: Given that srv->adm_st_chg_cause has been used to exclusively store static strings ever since it was implemented, we make the choice to store it as an enum instead of a fixed-size string within server struct. This will allow to save some space in server struct, and will make it more easily exportable (ie: event handlers) because of the reduced memory footprint during handling and the ability to later get the corresponding human-readable message when it's explicitly needed.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	85b91375bf	MINOR: server: propagate lb changes through srv_lb_propagate() Now that we have a generic srv_lb_propagate(s) function, let's use it each time we explicitly want to set the status down as well. Indeed, it is tricky to try to handle "down" case explicitly, instead we use srv_lb_propagate() which will call the proper function that will handle the new server state. This will allow some code cleanup and will prevent any logic error. This commit depends on: - "MINOR: server: propagate server state change to lb through single function"	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	8bbe643acc	MINOR: server: propagate server state change to lb through single function Use a dedicated helper function to propagate server state change to lb algorithms, since it is performed at multiple places within srv_update_status() function.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	5f80f8bbc5	MINOR: server: central update for server counters on state change Based on "BUG/MINOR: server: don't miss server stats update on server state transitions", we're also taking advantage of the new centralized logic to update down_trans server counter directly from there instead of multiple places.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9c21ff0208	BUG/MINOR: server: don't use date when restoring last_change from state file When restoring from a state file: the server "Status" reports weird values on the html stats page: "5s UP" becomes -> "? UP" after the restore This is due to a bug in srv_state_srv_update(): when restoring the states from a state file, we rely on date.tv_sec to compute the process-relative server last_change timestamp. This is wrong because everywhere else we use now.tv_sec when dealing with last_change, for instance in srv_update_status(). date (which is Wall clock time) deviates from now (monotonic time) in the long run. They should not be mixed, and given that last_change is an internal time value, we should rely on now.tv_sec instead. last_change export through "show servers state" cli is safe since we export a delta and not the raw time value in dump_servers_state(): srv_time_since_last_change = now.tv_sec - srv->last_change -- While this bug affects all stable versions, it was revealed in 2.8 thanks to `28360dc` ("MEDIUM: clock: force internal time to wrap early after boot") This is due to the fact that "now" immediately deviates from "date", whereas in the past they had the same value when starting. Thus prior to 2.8 the bug is trickier since it could take some time for date and now to deviate sufficiently for the issue to arise, and instead of reporting absurd values that are easy to spot it could just result in last_change becoming inconsistent over time. As such, the fix should be backported to all stable versions. [for 2.2 the patch needs to be applied manually since srv_state_srv_update() was named srv_update_state() and can be found in server.c instead of server_state.c]	2023-04-21 14:36:45 +02:00
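The clock mix-up fixed by the commit above can be illustrated with a small sketch. It assumes, as the message states, that the state file carries the number of seconds elapsed since the last change; the function names and plain `long` second counters are hypothetical simplifications, not HAProxy's API.

```c
/* Rebuild the internal last_change timestamp from the elapsed time found
 * in the state file. The reference must be the internal clock ("now"),
 * because every later computation also uses that clock.
 */
long demo_restore_last_change(long now_sec, long saved_elapsed)
{
	return now_sec - saved_elapsed;
}

/* Delta that would be exported or displayed later on. */
long demo_time_since_last_change(long now_sec, long last_change)
{
	return now_sec - last_change;
}
```

If the wall-clock value were passed to the first function while the second one keeps using the internal clock, the resulting delta would be meaningless as soon as both clocks differ, which is consistent with the "? UP" status described in the report.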
Aurelien DARRAGON	9f5853fa38	BUG/MINOR: server: don't miss server stats update on server state transitions s->last_change and s->down_time updates were manually updated for each effective server state change within srv_update_status(). This is rather error-prone, and as a result there were still some state transitions that were not handled properly since at least 1.8. ie: - when transitioning from DRAIN to READY: downtime was updated (which is wrong since a server in DRAIN state should not be considered as DOWN) - when transitioning from MAINT to READY: downtime was not updated (this can be easily seen in the html stats page) To fix these all at once, and prevent similar bugs from being introduced, we centralize the server last_change and down_time stats logic at the end of srv_update_status(): If the server state changed during the call, then it means that last_change must be updated, with a special case when changing from STOPPED state which means the server was previously DOWN and thus downtime should be updated. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to all stable versions.	2023-04-21 14:36:45 +02:00
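The centralized rule described above (update last_change on any state change, and account downtime only when leaving the STOPPED state) can be sketched like this. The struct, enum and function names are hypothetical stand-ins; the real server struct and states are more involved.

```c
/* Hypothetical, simplified server states. */
enum demo_srv_state {
	DEMO_ST_STOPPED,
	DEMO_ST_STARTING,
	DEMO_ST_RUNNING,
	DEMO_ST_STOPPING
};

struct demo_srv {
	enum demo_srv_state cur_state; /* state after the change was committed */
	long last_change;              /* internal time of the last state change */
	long down_time;                /* cumulated seconds spent DOWN */
};

/* Called once at the end of the state update, with the state the server
 * had when the update started.
 */
void demo_update_stats(struct demo_srv *s, enum demo_srv_state prev_state,
                       long now_sec)
{
	if (s->cur_state == prev_state)
		return; /* no effective change: nothing to account */

	/* leaving STOPPED means the server was DOWN until now */
	if (prev_state == DEMO_ST_STOPPED)
		s->down_time += now_sec - s->last_change;

	s->last_change = now_sec;
}
```

Because the accounting happens in a single place, a newly added transition cannot forget it, which is the stated motivation of the fix.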
Aurelien DARRAGON	e80ddb18a8	BUG/MINOR: server: don't miss proxy stats update on server state transitions backend "down" stats logic has been duplicated multiple times in srv_update_status(), resulting in the logic now being error-prone. For example, the following bugfix was needed to compensate for a copy-paste introduced bug: `d332f139` ("BUG/MINOR: server: update last_change on maint->ready transitions too") While the above patch works great, we actually forgot to update the proxy downtime like it is done for other down->up transitions... This is simply illustrating that the current design is error-prone, it is very easy to miss something in this area. To properly update the proxy downtime stats on the maint->ready transition, to cleanup srv_update_status() and to prevent similar bugs from being introduced in the future, proxy/backend stats update are now automatically performed at the end of the server state change if needed. Thus we can remove existing updates that were performed at various places within the function, this simplifies things a bit. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to all stable versions. Backport notes: 2.2: Replace struct task srv_cleanup_toremove_conns(struct task task, void context, unsigned int state) by struct task srv_cleanup_toremove_connections(struct task task, void context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	22151c70bb	MINOR: server: explicitly commit state change in srv_update_status() As shown in `8f29829` ("BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state."), state changes that don't result in explicit lb state change, require us to perform an explicit server state commit to make sure the next state is applied before returning from the function. This is the case for server state changes that don't trigger lb logic and only perform some logging. This is quite error prone, we could easily forget a state change combination that could result in next_state, next_admin or next_eweight not being applied. (cur_state, cur_admin and cur_eweight would be left with unexpected values) To fix this, we explicitly call srv_lb_commit_status() at the end of srv_update_status() to enforce the new values, even if they were already applied. (when a state change requires lb state update an implicit commit is already performed) Applying the state change multiple times is safe (since the next value always points to the current value). Backport notes: 2.2: Replace struct task srv_cleanup_toremove_conns(struct task task, void context, unsigned int state) by struct task srv_cleanup_toremove_connections(struct task task, void context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9a1df02ccb	BUG/MINOR: server: incorrect report for tracking servers leaving drain Report message for tracking servers completely leaving drain is wrong: The check for "leaving drain .. via" never evaluates because the condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN). For tracking servers that leave inherited drain mode, this results in the following message being emitted: "Server x/b is UP (leaving forced drain)" Instead of: "Server x/b is UP (leaving drain) via x/a" To fix this: we check if FDRAIN is currently set, else it means that the drain status is inherited from the tracked server (IDRAIN) This regression was introduced with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously."), thus it may be backported to all stable versions.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	096b383e16	MINOR: hlua/event_hdl: timestamp for events 'when' optional argument is provided to lua event handlers. It is an integer representing the number of seconds elapsed since Epoch and may be used in conjunction with lua `os.date()` function to provide a custom format string.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	e9314fb7a7	MINOR: event_hdl: provide event->when for advanced handlers For advanced async handlers only (Registered using EVENT_HDL_ASYNC_TASK() macro): event->when is provided as a struct timeval and fetched from 'date' haproxy global variable. Thanks to 'when', related event consumers will be able to timestamp events, even if they don't work in real-time or near real-time. Indeed, unlike sync or normal async handlers, advanced async handlers could purposely delay the consumption of pending events, which means that the date wouldn't be accurate if computed directly from within the handler.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	ebf58e991a	MINOR: event_hdl: dynamically allocated event data members Add the ability to provide a cleanup function for event data passed via the publishing function. One use case could be the need to provide valid pointers in the safe section of the data struct. Cleanup function will be automatically called with data (or copy of data) as argument when all handlers consumed the event, which provides an easy way to release some memory or decrement refcounts to resources that were provided through the data struct. data in itself may not be freed by the cleanup function, it is handled by the API. This would allow passing large (allocated) data blocks through the data struct while keeping data struct size under the EVENT_HDL_ASYNC_EVENT_DATA size limit. To do so, when publishing an event, where we would currently do: struct event_hdl_cb_data_new_family event_data; /* safe data, available from both sync and async contexts * may not use pointers to short-living resources */ event_data.safe.my_custom_data = x; /* unsafe data, only available from sync contexts */ event_data.unsafe.my_unsafe_data = y; /* once data is prepared, we can publish the event */ event_hdl_publish(NULL, EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1, EVENT_HDL_CB_DATA(&event_data)); We could do: struct event_hdl_cb_data_new_family event_data; /* safe data, available from both sync and async contexts * may not use pointers to short-living resources, * unless EVENT_HDL_CB_DATA_DM is used to ensure pointer * consistency (ie: refcount) */ event_data.safe.my_custom_static_data = x; event_data.safe.my_custom_dynamic_data = malloc(1); /* unsafe data, only available from sync contexts */ event_data.unsafe.my_unsafe_data = y; /* once data is prepared, we can publish the event */ event_hdl_publish(NULL, EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1, EVENT_HDL_CB_DATA_DM(&event_data, data_new_family_cleanup)); With data_new_family_cleanup func which would look like this: void data_new_family_cleanup(const void *data) { const struct event_hdl_cb_data_new_family *event_data = data; /* some data members require specific cleanup once the event * is consumed */ free(event_data->safe.my_custom_dynamic_data); /* don't ever free data! it is not ours */ } Not sure if this feature will become relevant in the future, so I prefer not to mention it in the doc for now. But given that the implementation is trivial and does not put a burden on the existing API, it's a good thing to have it there, just in case.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	a63f4903c9	MINOR: server/event_hdl: prepare for upcoming refactors This commit does nothing that ought to be mentioned, except that it adds missing comments and slightly moves some function calls out of "sensitive" code in preparation of some server code refactors.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	2f6a07dce8	MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args Changing hlua_event_hdl_cb_data_push_args() return type to void since it does not return anything useful. Also changing its name to hlua_event_hdl_cb_push_args() since it does more than just pushing cb data argument (it also handles event type and mgmt). Errors caught by the function are reported as lua errors.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	55f84c7cab	MINOR: hlua/event_hdl: expose proxy_uuid variable in server events Adding proxy_uuid to ServerEvent class. proxy_uuid contains the uuid of the proxy to which the server belongs	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	3d9bf4e1a5	MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups Since "MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server" we may now use proxy_uuid variable to perform proxy lookups when handling a server event. It is more reliable since proxy_uuid isn't subject to any size limitation	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	d714213862	MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server Expose proxy_uuid variable in event_hdl_cb_data_server struct to overcome proxy_name fixed length limitation. proxy_uuid may be used by the handler to perform proxy lookups. This should be preferred over lookups relying on proxy_name. (proxy_name is suitable for printing / logging purposes but not for ID lookups since it has a maximum fixed length)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	0ddf052972	CLEANUP: server: fix update_status() function comment srv_update_status() function comment says that the function "is designed to be called asynchronously". While this used to be true back then with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously.") This is not true anymore since `3ff577e` ("MAJOR: server: make server state changes synchronous again") Fixing the comment in order to better reflect current behavior.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	88687f0980	CLEANUP: errors: fix obsolete function comments Since `9f903af5` ("MEDIUM: log: slightly refine the output format of alerts/warnings/etc"), messages generated by ha_{alert,warning,notice} don't embed date/time information anymore. Updating some old function comments that kept saying otherwise.	2023-04-21 14:36:45 +02:00
Amaury Denoyelle	a65dd3a2c8	BUG/MINOR: quic: consume Rx datagram even on error A BUG_ON crash can occur on qc_rcv_buf() if a Rx packet allocation failed. To fix this, datagrams are marked as consumed even if a fatal error occurred during parsing. For the moment, only a Rx packet allocation failure could provoke this. At this stage, it's unknown if the datagram was partially parsed or not at all so it's better to discard it completely. This bug was detected using -dMfail argument. This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	d537ca79dc	BUG/MINOR: quic: prevent crash on qc_new_conn() failure Properly initialize el_th_ctx member first on qc_new_conn(). This prevents a segfault if release should be called later due to memory allocation failure in the function on qc_detach_th_ctx_list(). This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	9bbfa72b67	BUG/MINOR: h3: fix crash on h3s alloc failure Do not emit a CONNECTION_CLOSE on h3s allocation failure. Indeed, this causes a crash as the calling function qcs_new() will also try to emit a CONNECTION_CLOSE which triggers a BUG_ON() on qcc_emit_cc(). This was reproduced using -dMfail. This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	93d2ebe9f3	BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure Previously, if a STREAM frame could not be allocated for emission, a crash would occur due to an ABORT_NOW() statement in _qc_send_qcs(). Replace this by proper error code handling. Each stream for which sending fails is temporarily moved from qcc::send_list to a list local to _qc_send_qcs(). Once emission has been conducted for all streams, failed streams are reinserted into qcc::send_list. This avoids looping again over failed streams in the second while loop at the end of _qc_send_qcs(). This crash was reproduced using -dMfail. This should be backported up to 2.6.	2023-04-20 14:49:32 +02:00
Amaury Denoyelle	ed820823f0	BUG/MINOR: mux-quic: fix crash with app ops install failure On MUX initialization, the application layer is setup via qcc_install_app_ops(). If this function fails MUX is deallocated and an error is returned. This code path causes a crash because the connection has already been registered into the mux_stopping_data::list for stopping idle frontend conns. To fix this, insert the connection later in qc_init() once no error can occur. The crash was seen on the process closing with SIGUSR1 with a segfault on mux_stopping_process(). This was reproduced using -dMfail. This regression was introduced by the following patch : commit `b4d119f0c7` BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission This should be backported up to 2.7.	2023-04-20 14:49:32 +02:00
Frédéric Lécaille	d07421331f	BUG/MINOR: quic: Wrong Retry token generation timestamp computing Again a now_ms variable value used without the ticks API. It is used to store the generation time of the Retry token to be received back from the client. Must be backported to 2.6 and 2.7.	2023-04-19 17:31:28 +02:00
Frédéric Lécaille	45662efb2f	BUG/MINOR: quic: Unchecked buffer length when building the token As server, an Initial does not contain a token but only the token length field with zero as value. The remaining room was not checked before writing this field. Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Frédéric Lécaille	0ed94032b2	MINOR: quic: Do not allocate too much ack ranges Limit the maximum number of ack ranges to QUIC_MAX_ACK_RANGES(32). Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Frédéric Lécaille	4b2627beae	BUG/MINOR: quic: Stop removing ACK ranges when building packets Since this commit: BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit. There are more chances that ack ranges may be removed from their trees when building a packet. It is preferable to impose a limit to these trees. This will be the subject of a next commit to come. For now, it is sufficient to stop deleting ack ranges from their trees. Remove quic_ack_frm_reduce_sz() and quic_rm_last_ack_ranges() which were there to do that. Make qc_frm_len() support ACK frames and call it to ensure an ACK frame may be added to a packet before building it. Must be backported to 2.6 and 2.7.	2023-04-19 11:36:54 +02:00
Aurelien DARRAGON	8cd620b46f	MINOR: hlua: safe coroutine.create() Overriding global coroutine.create() function in order to link the newly created subroutine with the parent hlua ctx. (hlua_gethlua() function from a subroutine will return hlua ctx from the hlua ctx on which the coroutine.create() was performed, instead of NULL) Doing so allows hlua_hook() function to support being called from subroutines created using coroutine.create() within user lua scripts. That is: the related subroutine will be immune to the forced-yield, but it will still be checked against hlua timeouts. If the subroutine fails to yield or finish before the timeout, the related lua handler will be aborted (instead of going rogue unnoticed like it would be the case prior to this commit)	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	cf0f792490	MINOR: hlua: hook yield on known lua state When forcing a yield attempt from hlua_hook(), we should perform it on the known hlua state, not on a potential substate created using coroutine.create() from an existing hlua state from lua script. Indeed, only true hlua coroutines will properly handle the yield and perform the required timeout checks when returning in hlua_ctx_resume(). So far, this was not a concern because hlua_gethlua() would return NULL if hlua_hook() is not directly being called from a hlua coroutine anyway. But with this we're trying to make hlua_hook() ready for being called from a subcoroutine which inherits from a parent hlua ctx. In this case, no yield attempt will be performed, we will simply check for hlua timeouts. Not doing so would result in the timeout checks not being performed since hlua_ctx_resume() is completely bypassed when yielding from the subroutine, resulting in a user-defined coroutine potentially going rogue unnoticed.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	2a9764baae	CLEANUP: hlua: avoid confusion between internal timers and tick based timers Not all hlua "time" variables use the same time logic. hlua->wake_time relies on ticks since it's meant to be used in conjunction with task scheduling. Thus, it should be stored as a signed int and manipulated using the tick api. Adding a few comments about that to prevent mixups with hlua internal timer api which doesn't rely on the ticks api.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	58e36e5b14	MEDIUM: hlua: introduce tune.lua.burst-timeout The "burst" execution timeout applies to any Lua handler. If the handler fails to finish or yield before timeout is reached, handler will be aborted to prevent thread contention, to prevent traffic from not being served for too long, and ultimately to prevent the process from crashing because of the watchdog kicking in. Default value is 1000ms. Combined with forced-yield default value of 10000 lua instructions, it should be high enough to prevent any existing script breakage, while still being able to catch slow lua converters or sample fetches doing thread contention and risking the process stability. Setting value to 0 completely bypasses this check. (not recommended but could be required to restore original behavior if this feature breaks existing setups somehow...) No backport needed, although it could be used to prevent watchdog crashes due to poorly coded (slow/cpu consuming) lua sample fetches/converters.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	da9503ca9a	MEDIUM: hlua: reliable timeout detection For non yieldable lua handlers (converters, fetches or yield incompatible lua functions), current timeout detection relies on now_ms thread local variable. But within non-yieldable contexts, now_ms won't be updated if not by us (because we're momentarily stuck in lua context so we won't re-enter the polling loop, which is responsible for clock updates). To circumvent this, clock_update_date(0, 1) was manually performed right before now_ms is being read for the timeout checks. But this fails to work consistently, because if no other concurrent threads periodically run clock_update_global_date(), which does happen if we're the only active thread (nbthread=1 or low traffic), our clock_update_date() call won't reliably update our local now_ms variable. Moreover, clock_update_date() is not the right tool for this anyway, as it was initially meant to be used from the polling context. Using it could have negative impact on other threads relying on now_ms to be stable. (because clock_update_date() performs global clock update from time to time) -> Introducing hlua multipurpose timer, which is internally based on now_cpu_time_fast() that provides per-thread consistent clock readings. Thanks to this new hlua timer API, hlua timeout logic is less error-prone and more robust. This allows the timeout detection to work as expected for both yieldable and non-yieldable lua handlers. This patch depends on commit "MINOR: clock: add now_cpu_time_fast() function" While this could theoretically be backported to all stable versions, it is advisable to avoid backports unless we're confident enough since it could cause slight behavior changes (timing related) in existing setups.	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	df188f145b	MINOR: clock: add now_cpu_time_fast() function Same as now_cpu_time(), but for fast queries (less accurate). Relies on now_cpu_time(); now_mono_time_fast() is used as a cache expiration hint to prevent now_cpu_time() from being called too often since it is known to be quite expensive. Depends on commit "MINOR: clock: add now_mono_time_fast() function"	2023-04-19 11:03:31 +02:00
Aurelien DARRAGON	07cbd8e074	MINOR: clock: add now_mono_time_fast() function Same as now_mono_time(), but for fast queries (less accurate). Relies on coarse clock source (also known as fast clock source on some systems). Falls back to now_mono_time() if coarse source is not supported on the system.	2023-04-19 11:03:31 +02:00
Willy Tarreau	be336620b7	BUG/MINOR: cfgparse: make sure to include openssl-compat Commit `5003ac7fe` ("MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC") revealed a build dependency bug: if QUIC is not enabled, cfgparse doesn't have any dependency on the SSL stack, so the various ifdefs that try to check special conditions such as rejecting an H2 config with too small a bufsize, are silently ignored. This was detected because the default ALPN string was not set and caused the alpn regtest to fail without QUIC support. Adding openssl-compat to the list of includes seems to be sufficient to have what we need. It's unclear when this dependency was broken, it seems that even 2.2 didn't have an explicit dependency on anything SSL-related, though it could have been inherited through other files (as happens with QUIC here). It would be safe to backport it to all stable branches. The impact is very low anyway.	2023-04-19 10:46:21 +02:00
Amaury Denoyelle	89e48ff92f	BUG/MEDIUM: quic: prevent crash on Retry sending The following commit introduced a regression : commit `1a5cc19cec` MINOR: quic: adjust Rx packet type parsing Since this commit, qv variable was left to NULL as version is stored directly in quic_rx_packet instance. In most cases, this only causes traces to skip version printing. However, qv is dereferenced when sending a Retry which causes a segfault. To fix this, simply remove qv variable and use pkt->version instead, both for traces and send_retry() invocation. This bug was detected thanks to QUIC interop runner. It can easily be reproduced by using quic-force-retry on the bind line. This must be backported up to 2.7.	2023-04-19 10:18:58 +02:00
Willy Tarreau	5003ac7fe9	MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC This commit makes sure that if there is no "alpn", "npn" nor "no-alpn" setting on a "bind" line which corresponds to an HTTPS or QUIC frontend, we automatically turn on "h2,http/1.1" as an ALPN default for an HTTP listener, and "h3" for a QUIC listener. This simplifies the configuration for end users since they won't have to explicitly configure the ALPN string to enable H2, considering that at the time of writing, HTTP/1.1 represents less than 7% of the traffic on large infrastructures. The doc and regtests were updated. For more info, refer to the following thread: https://www.mail-archive.com/haproxy@formilux.org/msg43410.html	2023-04-19 09:52:20 +02:00
Willy Tarreau	de85de69ec	MINOR: ssl_crtlist: dump "no-alpn" on "show crtlist" when "no-alpn" was set Instead of dumping "alpn " better show "no-alpn" as configured.	2023-04-19 09:12:43 +02:00
Willy Tarreau	a2a095536a	MINOR: ssl: do not set ALPN callback with the empty string While it does not have any effect, it's better not to try to setup an ALPN callback nor to try to lookup algorithms when the configured ALPN string is empty as a result of "no-alpn" being used.	2023-04-19 09:12:43 +02:00
Willy Tarreau	158c18e85a	MINOR: config: add "no-alpn" support for bind lines It's possible to replace a previously set ALPN but not to disable ALPN if it was previously set. The new "no-alpn" setting allows to disable a previously set ALPN setting by preparing an empty one that will be replaced and freed when the config is validated.	2023-04-19 08:38:06 +02:00
Christopher Faulet	d0c57d3d33	BUG/MEDIUM: stconn: Propagate error on the SC on sending path On sending path, a pending error can be promoted to a terminal error at the endpoint level (SE_FL_ERR_PENDING to SE_FL_ERROR). When this happens, we must propagate the error on the SC to be able to handle it at the stream level and eventually forward it to the other side. Because of this bug, it is possible to freeze sessions, for instance on the CLI. It is a 2.8-specific issue. No backport needed.	2023-04-18 18:57:04 +02:00
Christopher Faulet	845f7c4708	CLEANUP: cli: Remove useless debug message in cli_io_handler() When compiled in debug mode, HAProxy prints a debug message at the end of the cli I/O handler. It is pretty annoying and useless because we can activate applet traces. Thus, just remove it.	2023-04-18 18:57:04 +02:00
Christopher Faulet	cbfcb02e21	CLEANUP: backend: Remove useless debug message in assign_server() When compiled in debug mode, HAProxy prints a debug message at the beginning of assign_server(). It is pretty annoying and useless because, in debug mode, we can active stream traces. Thus, just remove it.	2023-04-18 18:57:04 +02:00
Christopher Faulet	27c17d1ca5	BUG/MINOR: http-ana: Update analyzers on both sides when switching in TUNNEL mode The commit `9704797fa` ("BUG/MEDIUM: http-ana: Properly switch the request in tunnel mode on upgrade") fixes the switch in TUNNEL mode, but only partially. Because both channels are switch in TUNNEL mode in same time on one side, the channel's analyzers on the opposite side are not updated accordingly. This prevents the tunnel timeout to be applied. So instead of updating both sides in same time, we only force the analysis on the other side by setting CF_WAKE_ONCE flag when a channel is switched in TUNNEL mode. In addition, we must take care to forward all data if there is no DATAa TCP filters registered. This patch is related to the issue #2125. It is 2.8-specific. No backport needed.	2023-04-18 18:57:04 +02:00
Amaury Denoyelle	0783a7b08e	MINOR: listener: remove unneeded local accept flag Remove the receiver RX_F_LOCAL_ACCEPT flag. This was used by QUIC protocol before thread rebinding was supported by the quic_conn layer. This should be backported up to 2.7 after the previous patch has also been taken.	2023-04-18 17:09:34 +02:00
Amaury Denoyelle	1acbbca171	MAJOR: quic: support thread balancing on accept Before this patch, QUIC protocol used a custom add_listener callback. This was because a quic_conn instance was allocated before accept. Its thread affinity was fixed and could not be changed afterwards. The thread was itself derived from the CID selected by the client, which prevented an even distribution of QUIC connections across multiple threads. A series of patches was introduced with a lot of changes. The most important ones: * removal of affinity between an encoded CID and a thread * possibility to rebind a quic_conn on a new thread Thanks to this, it's possible to remove the custom add_listener callback. Accept is conducted for QUIC protocol as with the others. A less loaded thread is selected on listener_accept() and the connection stack is bound to it. This operation implies that the quic_conn instance is moved to the new thread using the set_affinity QUIC protocol callback. To reactivate the quic_conn instance after thread rebind, qc_finalize_affinity_rebind() is called after accept on the new thread by qc_xprt_start() through accept_queue_process() / session_accept_fd(). This should be backported up to 2.7 after a period of observation.	2023-04-18 17:09:34 +02:00
Amaury Denoyelle	739de3f119	MINOR: quic: properly finalize thread rebinding When a quic_conn instance is rebound to a new thread, its tasks and tasklet are destroyed and new ones created. Its socket is also migrated to the new thread, which stops reception on it. To properly reactivate a quic_conn after rebind, wake up its tasks and tasklet if they were active before thread rebind. Also reactivate reading on the socket FD. These operations are implemented in a new function qc_finalize_affinity_rebind(). This should be backported up to 2.7 after a period of observation.	2023-04-18 17:09:02 +02:00
Amaury Denoyelle	5f8704152a	BUG/MINOR: quic: transform qc_set_timer() as a reentrant function qc_set_timer() function is used to rearm the timer for loss detection and probing. Previously, the timer was always rearmed when the congestion window was free due to a wrong interpretation of the RFC which mandates the client to rearm the timer before handshake completion to avoid a deadlock related to anti-amplification. Fix this by removing this code from quic_pto_pktns(). This allows qc_set_timer() to be reentrant and only activate the timer if needed. The impact of this bug seems limited. It can probably cause the timer task to be processed too frequently, which could cause too frequent probing. This change will make it easy to reuse qc_set_timer() after quic_conn thread migration. As such, the new timer task will be scheduled only if needed. This should be backported up to 2.6.	2023-04-18 17:09:02 +02:00
Amaury Denoyelle	25174d51ef	MEDIUM: quic: implement thread affinity rebinding Implement a new function qc_set_tid_affinity(). This function is responsible for rebinding a quic_conn instance to a new thread. This operation consists mostly of releasing existing tasks and tasklet and allocating new instances on the new thread. If the quic_conn uses its own socket, it is also migrated to the new thread. The migration is finally completed by updating the CID TID to the new thread. After this step, the connection is thus accessible to the new thread and cannot be accessed anymore on the old one without risking a race condition. To ensure rebinding is either done completely or not at all, tasks and tasklet are pre-allocated before all operations. If this fails, an error is returned and rebinding is not done. To destroy the older tasklet, its context is set to NULL before wake up. In I/O callbacks, a new function qc_process() is used to check the context and free the tasklet if NULL. The thread rebinding can cause a race condition if the older thread quic_dghdlrs::dgrams list contains datagrams for the connection after rebinding is done. To prevent this, quic_rx_pkt_retrieve_conn() always checks if the packet CID is still associated to the current thread or not. In the latter case, no connection is returned and the new thread is returned to allow redispatching the datagram to the new thread in a thread-safe way. This should be backported up to 2.7 after a period of observation.	2023-04-18 17:08:34 +02:00
Amaury Denoyelle	1304d19dee	MINOR: quic: delay post handshake frames after accept When QUIC handshake is completed on our side, some frames are prepared to be sent: * HANDSHAKE_DONE * several NEW_CONNECTION_ID with CIDs allocated This step was previously executed in quic_conn_io_cb() directly after CRYPTO frames parsing. This patch delays it to be completed after accept. Special care has been taken to ensure it is still functional with 0-RTT activated. For the moment, this patch should have no impact. However, when quic_conn thread migration on accept is implemented, it will be easier to remap only one CID to the new thread. New CIDs will be allocated after migration on the new thread. This should be backported up to 2.7 after a period of observation.	2023-04-18 17:08:28 +02:00
Amaury Denoyelle	a66e04338e	MINOR: protocol: define new callback set_affinity Define a new protocol callback set_affinity. This function is used during listener_accept() to notify about a rebind on a new thread just before pushing the connection on the selected thread queue. If the callback fails, accept is done locally. This change will be useful for protocols with state allocated before accept is done. For the moment, only QUIC protocol is concerned. This will allow to rebind the quic_conn to a new thread depending on its load. This should be backported up to 2.7 after a period of observation.	2023-04-18 16:54:52 +02:00
Amaury Denoyelle	987812b190	MINOR: quic: do not proceed to accept for closing conn Each quic_conn is inserted in an accept queue to allocate the upper layers. This is done through a listener tasklet in quic_sock_accept_conn(). This patch interrupts the accept process for a quic_conn in closing/draining state. Indeed, this connection will soon be closed so it's unnecessary to allocate a complete stack for it. This patch will become necessary when thread migration is implemented. Indeed, it won't be allowed to proceed to thread migration for a closing quic_conn. This should be backported up to 2.7 after a period of observation.	2023-04-18 16:54:48 +02:00
Amaury Denoyelle	f16ec344d5	MEDIUM: quic: handle conn bootstrap/handshake on a random thread TID encoding in CID was removed by a recent change. It is now possible to access the <tid> member stored in quic_connection_id instance. For unknown CID, a quick solution was to redispatch to the thread corresponding to the first CID byte. This ensures that an identical CID will always be handled by the same thread to avoid creating the same connection multiple times. However, this forces an uneven load distribution which can be critical for QUIC handshake operation. To improve this, remove the above constraint. An unknown CID is now handled by its receiving thread. However, this means that if multiple packets are received with the same unknown CID, several threads will try to allocate the same connection. To prevent this race condition, CID insertion in the global tree is now conducted first before creating the connection. This is a thread-safe operation which can only be executed by a single thread. The thread which has inserted the CID will then proceed to quic_conn allocation. Other threads won't be able to insert the same CID: this will stop the treatment of the current packet, which is redispatched to the now owning thread. This should be backported up to 2.7 after a period of observation.	2023-04-18 16:54:44 +02:00
Amaury Denoyelle	1e959ad522	MINOR: quic: remove TID encoding in CID CIDs were moved from a per-thread list to a global list instance. The TID encoding is thus no longer needed. This should be backported up to 2.7 after a period of observation.	2023-04-18 16:54:31 +02:00
Amaury Denoyelle	e83f937cc1	MEDIUM: quic: use a global CID trees list Previously, quic_connection_id were stored in a per-thread tree list. Datagrams were first dispatched to the correct thread using the encoded TID before a tree lookup was done. Remove these trees and replace them with a global trees list of 256 entries. A CID uses the list index corresponding to its first byte. On datagram dispatch, the CID is looked up in its tree and the TID is retrieved using the new member quic_connection_id.tid. As such, a read-write lock protects each list instance. With 256 entries, it is expected that contention should be reduced. A new structure quic_cid_tree serves as a tree container associated with its read-write lock. An API is implemented to ensure lock safety for insert/lookup/delete operations. This patch is a step forward to be able to break the affinity between a CID and a TID-encoded thread. This is required to be able to migrate a quic_conn after accept to select a thread based on its load. This should be backported up to 2.7 after a period of observation.	2023-04-18 16:54:17 +02:00
Amaury Denoyelle	66947283ba	MINOR: quic: remove TID ref from quic_conn Remove <tid> member in quic_conn. This is moved to quic_connection_id instance. For the moment, this change has no impact. Indeed, qc.tid reference could easily be replaced by tid as all of this work was already done on the connection thread. However, it is planned to support quic_conn thread migration in the future, so removal of qc.tid will simplify this. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	c2a9264f34	MINOR: quic: adjust quic CID derive API ODCID are never stored in the CID tree. Instead, we store our generated CID which is directly derived from the CID using a hash function. This operation is done via quic_derive_cid(). Previously, generated CID was returned as a 64-bits integer. However, this is cumbersome to convert as an array of bytes which is the most common CID representation. Adjust this by modifying return type to a quic_cid struct. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	1a5cc19cec	MINOR: quic: adjust Rx packet type parsing qc_parse_hd_form() is the function used to parse the first byte of a packet and return its type and version. Its API has been simplified with the following changes: * extra out parameters are removed (long_header and version). All infos are now stored directly in quic_rx_packet instance * a new dummy version is declared in quic_versions array with a 0 number code. This can be used to match Version negotiation packets. * a new default packet type is defined QUIC_PACKET_TYPE_UNKNOWN to be used as an initial value. Also, the function has been exported to an include file. This will be useful to be able to reuse it in quic-sock to parse the first packet of a datagram. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	6ac0fb0f13	MINOR: quic: remove unneeded tasklet_wakeup after accept No need to explicitly wake up quic-conn tasklet after accept is done. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	591e7981d9	CLEANUP: quic: rename quic_connection_id vars Two different structs exist for QUIC connection ID: * quic_connection_id which represents a full CID with its sequence number * quic_cid which is just a buffer with a length. It is contained in the above structure. To better differentiate them, rename all quic_connection_id variable instances to "conn_id" by contrast to "cid" which is used for quic_cid. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	9b68b64572	CLEANUP: quic: remove unused qc param on stateless reset token Remove quic_conn instance as first parameter of quic_stateless_reset_token_init() and quic_stateless_reset_token_cpy() functions. It was only used for trace purpose. The main advantage is that it will be possible to allocate a QUIC CID without a quic_conn instance using new_quic_cid(), which is required to first check if a CID is existing before allocating a connection. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	90e5027e46	CLEANUP: quic: remove unused scid_node Remove unused scid_node member for quic_conn structure. It was prepared for QUIC backend support. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	22a368ce58	CLEANUP: quic: remove unused QUIC_LOCK label QUIC_LOCK label is never used. Indeed, lock usage is minimal on QUIC as every connection is pinned to its own thread. This should be backported up to 2.7.	2023-04-18 16:20:47 +02:00
Amaury Denoyelle	c361937d51	BUG/MINOR: task: allow to use tasklet_wakeup_after with tid -1 Adjust BUG_ON() statement to allow tasklet_wakeup_after() for tasklets with tid pinned to -1 (the current thread). This is similar to tasklet_wakeup(). This should be backported up to 2.6.	2023-04-18 16:20:47 +02:00
Willy Tarreau	ca1027c22f	MINOR: mux-h2: make the max number of concurrent streams configurable per side For a long time the maximum number of concurrent streams was set once for both sides (front and back) while the impacts are different. This commit allows it to be configured separately for each side. The older setting remains the fallback choice when other ones are not set.	2023-04-18 15:58:55 +02:00
Willy Tarreau	9d7abda787	MINOR: mux-h2: make the initial window size configurable per side For a long time the initial window size (per-stream size) was set once for both directions, frontend and backend, resulting in a tradeoff between upload speed and download fairness. This commit allows it to be configured separately for each side. The older setting remains the fallback choice when other ones are not set.	2023-04-18 15:58:55 +02:00
Christopher Faulet	b36e512bd0	MINOR: stconn: Propagate EOS from an applet to the attached stream-connector In the same way as for a stream-connector attached to a mux, an EOS is now propagated from an applet to its stream-connector. To do so, sc_applet_eos() function is added.	2023-04-17 17:41:28 +02:00
Christopher Faulet	1aec6c92cb	MINOR: stconn: Propagate EOS from a mux to the attached stream-connector Now that there is a SC flag to state the endpoint has reported an end-of-stream, it is possible to distinguish an EOS from an abort at the stream-connector level. sc_conn_read0() function is renamed to sc_conn_eos() and it propagates an EOS by setting SC_FL_EOS instead of SC_FL_ABRT_DONE. It only concerns stream-connectors attached to a mux.	2023-04-17 17:41:28 +02:00
Christopher Faulet	ca5309a9a3	MINOR: stconn: Add a flag to report EOS at the stream-connector level SC_FL_EOS flag is added to report the end-of-stream at the SC level. It will be used to distinguish end of stream reported by the endpoint, via the SE_FL_EOS flag, and the abort triggered by the stream, via the SC_FL_ABRT_DONE flag. In this patch, the flag is defined and is systematically tested everywhere SC_FL_ABRT_DONE is tested. It should be safe because it is never set.	2023-04-17 17:41:28 +02:00
Christopher Faulet	285aa40d35	BUG/MEDIUM: log: Properly handle client aborts in syslog applet In the syslog applet, when there is no output data, nothing is performed and the applet leaves by requesting more data. But it is an issue because a client abort is only handled if it is reported with the last bytes of the message. If the abort occurs after the message was handled, it is ignored. The session remains opened and inactive until the client timeout is triggered. If no such timeout is configured, given that the default maxconn is 10, all slots can be quickly busy and make the applet unresponsive. To fix the issue, the best is to always try to read a message when the I/O handler is called. This way, the abort can be handled. And if there is no data, we leave as usual. This patch should fix the issue #2112. It must be backported as far as 2.4.	2023-04-17 16:50:30 +02:00
Christopher Faulet	9704797fa2	BUG/MEDIUM: http-ana: Properly switch the request in tunnel mode on upgrade Since the commit `f2b02cfd9` ("MAJOR: http-ana: Review error handling during HTTP payload forwarding"), during the payload forwarding, when we are analyzing one side, we no longer test the opposite side. It means when the HTTP request forwarding analyzer is called, we no longer check the response side and vice versa. Unfortunately, since then, the HTTP tunneling is broken after a protocol upgrade. Only the response is switched to TUNNEL mode. The request remains in DONE state. As a consequence, data received from the server are forwarded to the client but not data received from the client. To fix the bug, when both sides are in DONE state, both are switched at the same time to TUNNEL mode if it was requested. It is performed in the same way in http_end_request() and http_end_response(). This patch should fix the issue #2125. It is 2.8-specific. No backport needed.	2023-04-17 16:17:35 +02:00
William Lallemand	a21ca74e83	MINOR: ssl: remove OpenSSL 1.0.2 mention into certificate loading error Remove the mention to OpenSSL 1.0.2 in the certificate chain loading error, which is not relevant. Could be backported in 2.7.	2023-04-17 14:45:40 +02:00
Ilya Shipitsin	2ca01589a0	CLEANUP: use "offsetof" where appropriate let's use the C library macro "offsetof"	2023-04-16 09:58:49 +02:00
Frédéric Lécaille	b5efe7901d	BUG/MINOR: quic: Do not use ack delay during the handshakes As revealed by GH #2120 opened by @Tristan971, there are cases where ACKs have to be sent without packet to acknowledge because the ACK timer has been triggered and the connection needs to probe the peer at the same time. Indeed Thank you to @Tristan971 for having reported this issue. Must be backported to 2.6 and 2.7.	2023-04-14 21:09:13 +02:00
Christopher Faulet	75b954fea4	BUG/MINOR: stconn: Don't set SE_FL_ERROR at the end of sc_conn_send() When I reworked my series, this code was first removed and reinserted by error. So let's remove it again.	2023-04-14 17:32:44 +02:00
Christopher Faulet	25d9fe50f5	MEDIUM: stconn: Rely on SC flags to handle errors instead of SE flags It is the last commit on this subject. We stop using the SE_FL_ERROR flag from the SC, except at the I/O level. Otherwise, we rely on the SC_FL_ERROR flag. Now, there should be a real separation between SE flags and SC flags.	2023-04-14 17:05:54 +02:00
Christopher Faulet	e182a8e651	MEDIUM: stream: Stop to use SE flags to detect endpoint errors Here again, we stop to use SE_FL_ERROR flag from process_stream() and sub-functions and we fully rely on SC_FL_ERROR to do so.	2023-04-14 17:05:54 +02:00
Christopher Faulet	d7bac88427	MEDIUM: stream: Stop to use SE flags to detect read errors from analyzers In the same way as the previous commit, we stop using the SE_FL_ERROR flag from analyzers and their sub-functions. We now fully rely on SC_FL_ERROR to do so.	2023-04-14 17:05:54 +02:00
Christopher Faulet	725170eee6	MEDIUM: backend: Stop to use SE flags to detect connection errors SE_FL_ERROR flag is no longer set when an error is detected during the connection establishment. SC_FL_ERROR flag is set instead. So it is safe to remove the test on SE_FL_ERROR to detect connection establishment errors.	2023-04-14 17:05:54 +02:00
Christopher Faulet	88d05a0f3b	MEDIUM: tree-wide: Stop to set SE_FL_ERROR from upper layer We can now fully rely on SC_FL_ERROR flag from the stream. The first step is to stop to set the SE_FL_ERROR flag. Only endpoints are responsible to set this flag. It was a design limitation. It is now fixed.	2023-04-14 17:05:54 +02:00
Christopher Faulet	ad46e52814	MINOR: tree-wide: Test SC_FL_ERROR with SE_FL_ERROR from upper layer From the stream, when SE_FL_ERROR flag is tested, we now also test the SC_FL_ERROR flag. Idea is to stop to rely on the SE descriptor to detect errors.	2023-04-14 17:05:54 +02:00
Christopher Faulet	340021b89f	MINOR: stream: Set SC_FL_ERROR on channels' buffer allocation error Set SC_FL_ERROR flag when we fail to allocate a buffer for a stream.	2023-04-14 17:05:54 +02:00
Christopher Faulet	38656f406c	MINOR: backend: Set SC_FL_ERROR on connection error During connection establishment, if an error occurred, the SC_FL_ERROR flag is now set. Concretely, it is set when SE_FL_ERROR is also set.	2023-04-14 17:05:53 +02:00
Christopher Faulet	a1d14a7c7f	MINOR: stconn: Add a flag to ack endpoint errors at SC level The flag SC_FL_ERROR is added to ack errors on the endpoint. When SE_FL_ERROR flag is detected on the SE descriptor, the corresponding flag is set on the SC. The idea is to avoid, as far as possible, manipulating the SE descriptor in upper layers and to know when an error in the endpoint is handled by the SC. For now, this flag is only set and cleared but never tested.	2023-04-14 17:05:53 +02:00
Christopher Faulet	638fe6ab0f	MINOR: stconn: Don't clear SE_FL_ERROR when endpoint is reset There is no reason to remove this flag. When the SC endpoint is reset, it is replaced by a new one. The old one is released. It was useful when the new endpoint inherited some flags from the old one. But it is no longer performed. Thus there is no reason to still unset this flag.	2023-04-14 17:05:53 +02:00
Christopher Faulet	e8bcef5f22	MEDIUM: stconn: Forbid applets with more to deliver if EOI was reached When an applet is woken up, before calling its io_handler, we pretend it has no more data to deliver. So, after the io_handler execution, it is a bug if an applet states it has more data to deliver while the end of input is reached. So a BUG_ON() is added to be sure it never happens.	2023-04-14 17:05:53 +02:00
Christopher Faulet	56a2b608b0	MINOR: stconn: Stop to set SE_FL_ERROR on sending path It is not the SC responsibility to report errors on the SE descriptor. It is the endpoint responsibility. It must switch SE_FL_ERR_PENDING into SE_FL_ERROR if the end of stream was detected. It can even be considered as a bug if it is not done by the endpoint. So now, on sending path, a BUG_ON() is added to abort if SE_FL_EOS and SE_FL_ERR_PENDING flags are set but not SE_FL_ERROR. It is truly important to handle this case in the endpoint to be able to properly shut the endpoint down.	2023-04-14 17:05:53 +02:00
Christopher Faulet	d3bc340e7e	BUG/MINOR: cli: Don't close when SE_FL_ERR_PENDING is set in cli analyzer SE_FL_ERR_PENDING is used to report an error on the write side. But it is not a terminal error. Some incoming data may still be available. In the cli analyzers, it is important to not close the stream when this flag is set. Otherwise the response to a command can be truncated. It is probably hard to observe. But it remains a bug. While this patch could be backported to 2.7, there is no real reason to do so, except if someone reports a bug about truncated responses.	2023-04-14 16:49:04 +02:00
Christopher Faulet	214f1b5c16	MINOR: tree-wide: Replace several chn_prod() by the corresponding SC At many places, call to chn_prod() can be easily replaced by the corresponding SC. It is a bit easier to understand which side is manipulated.	2023-04-14 15:06:04 +02:00
Christopher Faulet	64350bbf05	MINOR: tree-wide: Replace several chn_cons() by the corresponding SC At many places, call to chn_cons() can be easily replaced by the corresponding SC. It is a bit easier to understand which side is manipulated.	2023-04-14 15:04:03 +02:00
Christopher Faulet	b2b1c3a6ea	MINOR: channel/stconn: Replace sc_shutw() by sc_shutdown() All references to a shutw are replaced by a shutdown. So sc_shutw() is renamed sc_shutdown(). SC app ops functions are renamed accordingly.	2023-04-14 15:02:57 +02:00
Christopher Faulet	208c712b40	MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE Here again, it is just a flag renaming. In SC flags, there is no longer shutdown for writes but shutdowns.	2023-04-14 15:01:21 +02:00
Christopher Faulet	cfc11c0eae	MINOR: channel/stconn: Replace sc_shutr() by sc_abort() All references to a shutr are replaced by an abort. So sc_shutr() is renamed sc_abort(). SC app ops functions are renamed accordingly.	2023-04-14 14:54:35 +02:00
Christopher Faulet	0c370eee6d	MINOR: stconn: Rename SC_FL_SHUTR in SC_FL_ABRT_DONE Here again, it is just a flag renaming. In SC flags, there is no longer shutdown for reads but aborts. For now this flag is set when a read0 is detected. It is of course not accurate. This will be changed later.	2023-04-14 14:51:22 +02:00
Christopher Faulet	df7cd710a8	MINOR: channel/stconn: Replace channel_shutw_now() by sc_schedule_shutdown() After the flag renaming, it is now the turn for the channel function to be renamed and moved in the SC scope. channel_shutw_now() is replaced by sc_schedule_shutdown(). The request channel is replaced by the front SC and the response is replaced by the back SC.	2023-04-14 14:49:45 +02:00
Christopher Faulet	e38534cbd0	MINOR: stconn: Rename SC_FL_SHUTW_NOW in SC_FL_SHUT_WANTED Because shutdowns for reads are now considered as aborts, the shutdowns for writes can now be considered as shutdowns. Here it is just a flag renaming. SC_FL_SHUTW_NOW is renamed SC_FL_SHUT_WANTED.	2023-04-14 14:46:07 +02:00
Christopher Faulet	12762f09a5	MINOR: channel/stconn: Replace channel_shutr_now() by sc_schedule_abort() After the flag renaming, it is now the turn for the channel function to be renamed and moved in the SC scope. channel_shutr_now() is replaced by sc_schedule_abort(). The request channel is replaced by the front SC and the response is replaced by the back SC.	2023-04-14 14:08:49 +02:00
Christopher Faulet	573ead1e68	MINOR: stconn: Rename SC_FL_SHUTR_NOW in SC_FL_ABRT_WANTED It is the first step to transform shutdown for reads for the upper layer into aborts. This patch is quite simple, it is just a flag renaming.	2023-04-14 14:06:01 +02:00
Christopher Faulet	7eb837df4a	MINOR: stream: Introduce stream_abort() to abort on both sides in same time The function stream_abort() should now be called when an abort is performed on both channels at the same time.	2023-04-14 14:04:59 +02:00
Christopher Faulet	3db538ac2f	MINOR: channel: Forward close to other side on abort Most calls to channel_abort() are associated with a call to channel_auto_close(). Others are in areas where the auto close is the default. So, it is now systematically enabled when an abort is performed on a channel, as part of channel_abort() function.	2023-04-14 13:56:28 +02:00
Christopher Faulet	0adffb62c1	MINOR: filters: Review and simplify errors handling First, it is useless to abort both channels explicitly. For HTTP streams, http_reply_and_close() is called. This function already takes care of aborting processing. For TCP streams, we can rely on stream_retnclose(). To set termination flags, we can also rely on http_set_term_flags() for HTTP streams and sess_set_term_flags() for TCP streams. Thus there is no reason to handle them by hand. In the end, the error handling after filters evaluation is now quite simple.	2023-04-14 12:13:09 +02:00
Christopher Faulet	dbad8ec787	MINOR: stream: Uninline and export sess_set_term_flags() function This function will be used to set termination flags on TCP streams from outside of process_stream(). Thus, it must be uninlined and exported.	2023-04-14 12:13:09 +02:00
Christopher Faulet	95125886ee	BUG/MEDIUM: stconn: Do nothing in sc_conn_recv() when the SC needs more room We erroneously thought that an attempt to receive data was not possible if the SC was waiting for more room in the channel buffer. A BUG_ON() was added to detect bugs. And in fact, it is possible. The regression was added in commit `341a5783b` ("BUG/MEDIUM: stconn: stop to enable/disable reads from streams via si_update_rx"). This patch should fix the issue #2115. It must be backported if the commit above is backported.	2023-04-14 12:13:09 +02:00
Christopher Faulet	915ba08b57	BUG/MEDIUM: stream: Report write timeouts before testing the flags A regression was introduced when stream's timeouts were refactored. Write timeouts are not tested in the right order. When timeouts of the front SC are handled, we must then test the read timeout on the request channel and the write timeout on the response channel. But the write timeout is tested on the request channel instead. On the back SC, the same mix-up is performed. We must be careful to handle timeouts before checking channel flags. To avoid any confusion, all timeouts are handled first, on front and back SCs. Then flags of both channels are tested. It is a 2.8-specific issue. No backport needed.	2023-04-14 12:13:09 +02:00
Christopher Faulet	925279ccf2	BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity There is a bug at the beginning of process_stream(). The SE_FL_ERROR flag is tested against backend stream-connector's flags instead of its SE descriptor's flags. It is an old typo, introduced when the stream-interfaces were replaced by the conn-streams. This patch must be backported as far as 2.6.	2023-04-14 12:13:09 +02:00
Frédéric Lécaille	895700bd32	BUG/MINOR: quic: Wrong Application encryption level selection when probing This bug arrived with this commit: MEDIUM: quic: Ack delay implementation After having probed the Handshake packet number space, one must not select the Application encryption level to continue trying building packets as this is done when the connection is not probing. Indeed, if the ACK timer has been triggered in the meantime, the packet builder will try to build a packet at the Application encryption level to acknowledge the received packet. But there is very often no 01RTT packet to acknowledge when the connection is probing before the handshake is completed. This triggers a BUG_ON() in qc_do_build_pkt() which checks that the tree of ACK ranges to be used is not empty. Thank you to @Tristan971 for having reported this issue in GH #2109. Must be backported to 2.6 and 2.7.	2023-04-13 19:20:09 +02:00
Frédéric Lécaille	a576c1b0c6	MINOR: quic: Remove a useless test about probing in qc_prep_pkts() qel->pktns->tx.pto_probe is set to 0 after having prepared a probing datagram. There is no reason to check this parameter. Furthermore it is always 0 when the connection does not probe the peer. Must be backported to 2.6 and 2.7.	2023-04-13 19:20:09 +02:00
Frédéric Lécaille	91369cfcd0	MINOR: quic: Display the packet number space flags in traces Display this information when the encryption level is also displayed. Must be backported to 2.6 and 2.7.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	595251f22e	BUG/MINOR: quic: SIGFPE in quic_cubic_update() As reported by @Tristan971 in GH #2116, the congestion control window could be zero due to an inversion in the code about the reduction factor to be applied. On a new loss event, it must be applied to the slow start threshold and the window should never be below ->min_cwnd (2*max_udp_payload_sz). Same issue in both newReno and cubic algorithm. Furthermore in newReno, only the threshold was decremented. Must be backported to 2.6 and 2.7.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	9d68c6aaf6	BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit. Add two missing checks so as not to subtract too large a value from a smaller one. In this case the resulting wrapped huge values could be passed to the function which has to remove the last range of a tree of ACK ranges as the encoded size limit not to go below, cancelling the ACK ranges deletion. The consequence could be that no ACKs were sent. Must be backported to 2.6 and 2.7.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	45bf1a82f1	BUG/MEDIUM: quic: Code sanitization about acknowledgements requirements qc_may_build_pkt() has been modified several times regardless of the conditions that the functions it is supposed to allow to send packets (qc_build_pkt()/qc_do_build_pkt()) really use to finally send packets just after having received others, leading to contradictions and possibly very long loops sending empty packets (PADDING-only packets), because qc_may_build_pkt() could allow qc_build_pkt()/qc_do_build_pkt() to build a packet, and the latter did nothing except send PADDING frames because, from its point of view, there was nothing to send. From now on, it is the job of qc_may_build_pkt() to decide whether there are packets to send just after having received others AND to provide this information to qc_build_pkt()/qc_do_build_pkt(). Note that the only case where the acknowledgements are completely ignored is when the endpoint must probe. But at least this happens when sending at most two datagrams! This commit also fixes the issue reported by Willy about very low throughput when the client serialized its requests. Must be backported to 2.7 and 2.6.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	eb3e5171ed	MINOR: quic: Add connection flags to traces This should help in diagnosing issues. Some adjustments have to be done to avoid dereferencing quic_conn objects from TRACE_*() calls. Must be backported to 2.7 and 2.6.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	809bd9fed1	BUG/MINOR: quic: Ignored less than 1ms RTTs Do not ignore very short RTTs (less than 1ms) before computing the smoothed RTT initializing it to an "infinite" value (UINT_MAX). Must be backported to 2.7 and 2.6.	2023-04-13 19:20:08 +02:00
Frédéric Lécaille	fad0e6cf73	MINOR: quic: Add packet loss and maximum cc window to "show quic" Add the number of lost packets and the maximum congestion control window computed by the algorithms to "show quic". Same thing for the traces of existing congestion control algorithms. Must be backported to 2.7 and 2.6.	2023-04-13 19:20:08 +02:00
Olivier Houchard	f98a8c317e	BUG/MEDIUM: fd: don't wait for tmask to stabilize if we're not in it. In fd_update_events(), we loop until there's no bit in the running_mask that is not in the thread_mask. Problem is, the thread sets its running_mask bit before that loop, and so if 2 threads do the same, and a 3rd one just closes the FD and sets the thread_mask to 0, then running_mask will always be non-zero, and we will loop forever. This is trivial to reproduce when using a DNS resolver that will just answer "port unreachable", but could theoretically happen with other types of file descriptors too. To fix that, just don't bother looping if we're no longer in the thread_mask, if that happens we know we won't have to take care of the FD, anyway. This should be backported to 2.7, 2.6 and 2.5.	2023-04-13 18:04:46 +02:00
Willy Tarreau	a07635ead5	MINOR: bind-conf: support a new shards value: "by-group" Setting "shards by-group" will create one shard per thread group. This can often be a reasonable tradeoff between a single one that can be suboptimal on CPUs with many cores, and too many that will eat a lot of file descriptors. It was shown to provide good results on a 224 thread machine, with a distribution that was even smoother than the system's since here it can take into account the number of connections per thread in the group. Depending on how popular it becomes, it could even become the default setting in a future version.	2023-04-13 17:38:31 +02:00
Willy Tarreau	d30e82b9f0	MINOR: receiver: reserve special values for "shards" Instead of artificially setting the shards count to MAX_THREAD when "by-thread" is used, let's reserve special values for symbolic names so that we can add more in the future. For now we use value -1 for "by-thread", which requires to turn the type to signed int but it was already used as such everywhere anyway.	2023-04-13 17:12:50 +02:00
Amaury Denoyelle	53fc98c3bc	MINOR: fd: implement fd_migrate_on() to migrate on a non-local thread fd_migrate_on() can be used to migrate an existing FD to any thread, even one belonging to a different group from the current one and from the caller's. All that is needed is to make sure the FD is still valid when the operation is performed (which is the case when such operations happen). This is potentially slightly expensive since it locks the tgid during the delicate operation, but it is normally performed only from an owning thread to offer the FD to another one (e.g. reassign a better thread upon accept()).	2023-04-13 16:57:51 +02:00
Willy Tarreau	97da942ba6	MINOR: thread: keep a bitmask of enabled groups in thread_set We're only checking for 0, 1, or >1 groups enabled there, and we'll soon need to be more precise and know quickly which groups are non-empty. Let's just replace the count with a mask of enabled groups. This will allow to quickly spot the presence of any such group in a set.	2023-04-13 16:57:51 +02:00
William Lallemand	3f210970bf	BUG/MINOR: stick_table: alert when type len has incorrect characters Alert when the len argument of a stick table type contains incorrect characters. Replace atol by strtol. Could be backported in every maintained versions.	2023-04-13 14:46:08 +02:00
Willy Tarreau	28f2a590f6	MINOR: activity: add a line reporting the average CPU usage to "show activity" It was missing from the output but is sometimes convenient to observe and understand how incoming connections are distributed. The CPU usage is reported as the instant measurement of 100-idle_pct for each thread, and the average value is shown for the aggregated value. This could be backported as it's helpful in certain troubleshooting sessions.	2023-04-12 08:42:52 +02:00
Frédéric Lécaille	6fd2576d5e	MINOR: quic: Add a trace for packet with an ACK frame As the ACK frames are not added to the packet list of ack-eliciting frames, it could not be traced. But there is a flag to identify such packet. Let's use it to add this information to the traces of TX packets. Must be backported to 2.6 and 2.7.	2023-04-11 10:47:19 +02:00
Frédéric Lécaille	e47adca432	MINOR: quic: Dump more information at proto level when building packets This should be helpful to debug issues without too many traces. Must be backported to 2.7 and 2.6.	2023-04-11 10:47:19 +02:00
Frédéric Lécaille	c0aaa07aa3	MINOR: quic: Modify qc_try_rm_hp() traces Dump at proto level the packet information when its header protection was removed. Remove the no longer used qpkt_trace variable. Must be backported to 2.7 and 2.6.	2023-04-11 10:47:19 +02:00
Frédéric Lécaille	68737316ea	BUG/MINOR: quic: Wrong packet number space probing before confirmed handshake It is possible that the handshake was not confirmed and there were no more packets in flight to probe with. In this case the server must wait for the client to be unblocked without probing any packet number space contrary to what was revealed by interop tests as follows: [01\|quic\|2\|uic_loss.c:65] TX loss pktns : qc@0x7fac301cd390 pktns=I pp=0 [01\|quic\|2\|uic_loss.c:67] TX loss pktns : qc@0x7fac301cd390 pktns=H pp=0 tole=-102ms [01\|quic\|2\|uic_loss.c:67] TX loss pktns : qc@0x7fac301cd390 pktns=01RTT pp=0 if=1054 tole=-1987ms [01\|quic\|5\|uic_loss.c:73] quic_loss_pktns(): leaving : qc@0x7fac301cd390 [01\|quic\|5\|uic_loss.c:91] quic_pto_pktns(): entering : qc@0x7fac301cd390 [01\|quic\|3\|ic_loss.c:121] TX PTO handshake not already completed : qc@0x7fac301cd390 [01\|quic\|2\|ic_loss.c:141] TX PTO : qc@0x7fac301cd390 pktns=I pp=0 dur=83ms [01\|quic\|5\|ic_loss.c:142] quic_pto_pktns(): leaving : qc@0x7fac301cd390 [01\|quic\|3\|c_conn.c:5179] needs to probe Initial packet number space : qc@0x7fac301cd390 This bug was not visible before this commit: BUG/MINOR: quic: wake up MUX on probing only for 01RTT This means that before it, one could do bad things (probing the 01RTT packet number space before the handshake was confirmed). Must be backported to 2.7 and 2.6.	2023-04-11 10:47:19 +02:00
Frédéric Lécaille	2513b1dd7b	MINOR: quic: Trace fix in quic_pto_pktns() (handshaske status) The handshake must be confirmed before probing the 01RTT packet number space. Must be backported to 2.7 and 2.6.	2023-04-11 10:47:19 +02:00
Christopher Faulet	c202c740b5	BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR When end-of-stream is reported by a H2 stream, we must take care to also report an error if end-of-input was not reported. Indeed, it is now mandatory to set SE_FL_EOI or SE_FL_ERROR flags when SE_FL_EOS is set. It is a 2.8-specific issue. No backport needed.	2023-04-11 08:59:10 +02:00
Christopher Faulet	c393c9e388	BUG/MEDIUM: mux-h1: Report EOI when a TCP connection is upgraded to H2 When a TCP connection is first upgraded to H1 then to H2, the stream-connector, created by the PT mux, must be destroyed because the H2 mux cannot inherit from it. When it is performed, the SE_FL_EOS flag is set but SE_FL_EOI must also be set. It is now required to never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR. It is a 2.8-specific issue. No backport needed.	2023-04-11 08:45:18 +02:00
Christopher Faulet	f65cf3684d	MINOR: hlua: Stop to check the SC state when executing a hlua cli command This part has changed but it was already handled by the CLI applet. There is no reason to perform this test when a hlua cli command is executed.	2023-04-11 08:19:06 +02:00
Christopher Faulet	5220a8c5c4	BUG/MEDIUM: resolvers: Force the connect timeout for DNS resolutions Timeouts for dynamic resolutions are not handled at the stream level but by the resolvers themselves. It means there is no connect, client and server timeouts defined on the internal proxy used by a resolver. While it is not an issue for DNS resolution over UDP, it can be a problem for resolution over TCP. New sessions are automatically created when required, and killed on excess. But only established connections are considered. Connecting ones are never killed. Because there is no connect timeout, we rely on the kernel to report a connection error. And this may be quite long. Because resolutions are periodically triggered, this may lead to an excess of unusable sessions in connecting state. This also prevents HAProxy from quickly exiting on soft-stop. It is annoying, especially because there is no reason to not set a connect timeout. So to mitigate the issue, we now use the "resolve" timeout as connect timeout for the internal proxy attached to a resolver. This patch should be backported as far as 2.4.	2023-04-11 08:19:06 +02:00
Christopher Faulet	142cc1b52a	BUG/MINOR: resolvers: Wakeup DNS idle task on stopping Thanks to previous commit ("BUG/MEDIUM: dns: Kill idle DNS sessions during stopping stage"), DNS idle sessions are killed on stopping stage. But the task responsible to kill these sessions is running every 5 seconds. It means, when HAProxy is stopped, we can observe a delay before the process exits. To reduce this delay, when the resolvers task is executed, all DNS idle tasks are woken up. This patch must be backported as far as 2.6.	2023-04-11 08:19:06 +02:00
Christopher Faulet	e0f4717727	BUG/MEDIUM: dns: Kill idle DNS sessions during stopping stage There is no server timeout for DNS sessions over TCP. It means an idle session cannot be killed by itself. There is a task running periodically, every 5s, to kill the excess of idle sessions. But the last one is never killed. During the stopping stage, it is an issue since the dynamic resolutions are no longer performed (2ec6f14c "BUG/MEDIUM: resolvers: Properly stop server resolutions on soft-stop"). Before the above commit, during stopping stage, the DNS sessions were killed when a resolution was triggered. Now, nothing kills these sessions. This prevents the process from finishing on soft-stop. To fix this bug, the task killing excess of idle sessions now kills all idle sessions on stopping stage. This patch must be backported as far as 2.6.	2023-04-11 08:19:06 +02:00
Christopher Faulet	211452ef9a	BUG/MEDIUM: log: Eat output data when waiting for appctx shutdown When the log applet is executed while a shut is pending, the remaining output data must always be consumed. Otherwise, this can prevent the stream to exit, leading to a spinning loop on the applet. It is 2.8-specific. No backport needed.	2023-04-11 08:19:06 +02:00
Christopher Faulet	9837bd86dc	BUG/MEDIUM: stats: Eat output data when waiting for appctx shutdown When the stats applet is executed while a shut is pending, the remaining output data must always be consumed. Otherwise, this can prevent the stream to exit, leading to a spinning loop on the applet. It is 2.8-specific. No backport needed.	2023-04-11 07:43:26 +02:00
Christopher Faulet	1901c1bf5a	BUG/MEDIUM: http-client: Eat output data when waiting for appctx shutdown When the http-client applet is executed while a shut is pending, the remaining output data must always be consumed. Otherwise, this can prevent the stream to exit, leading to a spinning loop on the applet. It is 2.8-specific. No backport needed.	2023-04-11 07:43:26 +02:00
Christopher Faulet	1fb97e47f0	BUG/MEDIUM: cli: Eat output data when waiting for appctx shutdown When the cli applet is executed while a shut is pending, the remaining output data must always be consumed. Otherwise, this can prevent the stream to exit, leading to a spinning loop on the applet. This patch should fix the issue #2107. It is 2.8-specific. No backport needed.	2023-04-11 07:43:26 +02:00
Christopher Faulet	33af99655e	BUG/MEDIUM: cli: Set SE_FL_EOI flag for '_getsocks' and 'quit' commands An applet must never set SE_FL_EOS flag without SE_FL_EOI or SE_FL_ERROR flags. Here, SE_FL_EOI flag was missing for "quit" or "_getsocks" commands. Indeed, these commands are terminal. This bug triggers a BUG_ON() recently added. This patch is related to the issue #2107. It is 2.8-specific. No backport needed.	2023-04-11 07:43:26 +02:00
Olivier Houchard	0963b8a07f	BUG/MEDIUM: listeners: Use the right parameters for strlcpy2(). When calls to strcpy() were replaced with calls to strlcpy2(), one of them was replaced wrong, and the source and size were inverted. Correct that. This should fix issue #2110.	2023-04-08 15:01:57 +02:00
Willy Tarreau	fc458ec8aa	CLEANUP: tree-wide: remove strpcy() from constant strings These ones are generally harmless on modern compilers because the compiler checks them. While gcc optimizes them away without even referencing strcpy(), clang prefers to call strcpy(). Nevertheless they prevent from enabling stricter checks so better remove them altogether. They were all replaced by strlcpy2() and the size of the destination which is always known there.	2023-04-07 18:14:28 +02:00
Willy Tarreau	6d4c0c2ca2	CLEANUP: ocsp: do no use strpcy() to copy a path! strcpy() is quite nasty but tolerable to copy constants, but here it copies a variable path into a node in a code path that's not trivial to follow given that it takes the node as the result of a tree lookup. Let's get rid of it and mention where the entry is retrieved.	2023-04-07 17:57:05 +02:00
Willy Tarreau	a0fa577070	CLEANUP: tcpcheck: remove the only occurrence of sprintf() in the code There's a single sprintf() in the whole code, in the "option smtpchk" parser in tcpcheck.c. Let's turn it to a safer snprintf().	2023-04-07 16:04:54 +02:00
Willy Tarreau	22450af22a	BUG/MINOR: lua: remove incorrect usage of strncat() As every time strncat() is used, it's wrong, and this one is no exception. Users often think that the length applies to the destination except it applies to the source and makes it hard to use correctly. The bug did not have an impact because the length was preallocated from the sum of all the individual lengths as measured by strlen() so there was no chance one of them would change in between. But it could change in the future. Let's fix it to use memcpy() instead for strings, or byte copies for delimiters. No backport is needed, though it can be done if it helps to apply other fixes.	2023-04-07 16:04:54 +02:00
Olivier Houchard	ead43fe4f2	MEDIUM: compression: Make it so we can compress requests as well. Add code so that compression can be used for requests as well. New compression keywords are introduced : "direction" that specifies what we want to compress. Valid values are "request", "response", or "both". "type-req" and "type-res" define content-type to be compressed for requests and responses, respectively. "type" is kept as an alias for "type-res" for backward compatibility. "algo-req" specifies the compression algorithm to be used for requests. Only one algorithm can be provided. "algo-res" provides the list of algorithms that can be used to compress responses. "algo" is kept as an alias for "algo-res" for backward compatibility.	2023-04-07 00:49:17 +02:00
Olivier Houchard	dea25f51b6	MINOR: compression: Count separately request and response compression Duplicate the compression counters, so that we have separate counters for request and response compression.	2023-04-07 00:47:04 +02:00
Olivier Houchard	db573e9c58	MINOR: compression: Store algo and type for both request and response Make provision for being able to store both compression algorithms and content-types to compress for both requests and responses. For now only the responses one are used.	2023-04-07 00:46:59 +02:00
Olivier Houchard	dfc11da561	MINOR: compression: Prepare compression code for request compression Make provision for storing the compression algorithm and the compression context twice, one for requests, and the other for responses. Only the response ones are used for now.	2023-04-07 00:46:55 +02:00
Olivier Houchard	3ce0f01b81	MINOR: compression: Make compression offload a flag Turn compression offload into a flag in struct comp, instead of using an int just for it.	2023-04-07 00:46:45 +02:00
Christopher Faulet	8eeec38bfa	MINOR: applet: Use unsafe version to get stream from SC in the trace function When a trace message for an applet is dumped, if the SC exists, the stream always exists too. There is no way to attach an applet to a health-check. So, we can use the unsafe version __sc_strm() to get the stream. This patch is related to #2106. Not sure it will be enough for Coverity. However, there is no bug here.	2023-04-06 08:48:17 +02:00
Aurelien DARRAGON	b28ded19a4	BUG/MINOR: errors: invalid use of memprintf in startup_logs_init() On startup/reload, startup_logs_init() will try to export startup logs shm filedescriptor through the internal HAPROXY_STARTUPLOGS_FD env variable. While memprintf() is used to prepare the string to be exported via setenv(), str_fd argument (first argument passed to memprintf()) could be non NULL as a result of HAPROXY_STARTUPLOGS_FD env variable being already set. Indeed: str_fd is already used earlier in the function to store the result of getenv("HAPROXY_STARTUPLOGS_FD"). The issue here is that memprintf() is designed to free the 'out' argument if out != NULL, and here we don't expect str_fd to be freed since it was provided by getenv() and would result in memory violation. To prevent any invalid free, we must ensure that str_fd is set to NULL prior to calling memprintf(). This must be backported in 2.7 with `eba6a54cd4` ("MINOR: logs: startup-logs can use a shm for logging the reload")	2023-04-05 17:06:38 +02:00
William Lallemand	b4e651f12f	BUG/MINOR: mworker: unset more internal variables from program section People who use HAProxy as process 1 in containers sometimes start other things from the program section. This is still not recommended as the master process has minimal features regarding process management. Environment variables are still inherited, even internal ones. Since 2.7, it could provoke a crash when inheriting the HAPROXY_STARTUPLOGS_FD variable. Note: for future releases it would be better to clean the env and set a list of variables to be exported. We need to determine which variables are used by users before. Must be backported in 2.7.	2023-04-05 16:02:36 +02:00
Amaury Denoyelle	15adc4cc4e	MINOR: quic: remove address concatenation to ODCID Previously, ODCID were concatenated with the client address. This was done to prevent a collision between two endpoints which used the same ODCID. Thanks to the two previous patches, first connection generated CID is now directly derived from the client ODCID using a hash function which uses the client source address from the same purpose. Thus, it is now unneeded to concatenate client address to <odcid> quic-conn member. This change allows to simplify the quic_cid structure management and reduce its size which is important as it is embedded several times in various structures such as quic_conn and quic_rx_packet. This should be backported up to 2.7.	2023-04-05 11:09:57 +02:00
Amaury Denoyelle	2c98209c1c	MINOR: quic: remove ODCID dedicated tree First connection CID generation has been altered. It is now directly derived from client ODCID since previous commit : commit `162baaff7a` MINOR: quic: derive first DCID from client ODCID This patch removes the ODCID tree which is now unneeded. On connection lookup via CID, if a DCID is not found the hash derivation is performed for an INITIAL/0-RTT packet only. In case a client has used an ODCID multiple times, this will allow to retrieve our generated DCID in the CID tree without storing the ODCID node. The impact of these two combined patches is that they may slightly improve haproxy memory footprint by removing a tree node from quic_conn structure. The cpu calculation induced by hash derivation should be performed only a few times per connection as the client will start to use our generated CID as soon as it has received it. This should be backported up to 2.7.	2023-04-05 11:07:01 +02:00
Amaury Denoyelle	162baaff7a	MINOR: quic: derive first DCID from client ODCID Change the generation of the first CID of a connection. It is directly derived from the client ODCID using a 64-bits hash function. Client address is added to avoid collision between clients which could use the same ODCID. For the moment, this change has no functional impact. However, it will be directly used for the next commit to be able to remove the ODCID tree. This should be backported up to 2.7.	2023-04-05 11:06:04 +02:00
Frédéric Lécaille	ce5c145df5	BUG/MINOR: quic: Possible crashes in qc_idle_timer_task() This is due to this commit: MINOR: quic: Add trace to debug idle timer task issues which was added without having been tested at developer level. <qc> was dereferenced after having been released by qc_conn_release(). Set qc to NULL value after having been released to forbid its dereferencing. Add a check for qc->idle_timer_task in the traces added by the commit mentioned above to prevent its dereferencing if NULL. Take the opportunity of this patch to modify trace events from QUIC_EV_CONN_SSLALERT to QUIC_EV_CONN_IDLE_TIMER. Must be backported to 2.6 and 2.7.	2023-04-05 11:03:20 +02:00
Christopher Faulet	2954bcc1e8	BUG/MINOR: http-ana: Don't switch message to DATA when waiting for payload The HTTP message must remain in BODY state during the analysis, to be able to report accurate termination state in logs. It is also important to know the HTTP analysis is still in progress. Thus, when we are waiting for the message payload, the message is no longer switched to DATA state. This was used to not process "Expect: " header at each evaluation. But thanks to the previous patch, it is no longer necessary. This patch also fixes a bug in the lua filter api. Some functions must be called during the message analysis and not during the payload forwarding. It is not valid to try to manipulate headers during the forward stage because headers are already forwarded. We rely on the message state to detect errors. So the api was unusable if a "wait-for-body" action was used. This patch should fix the issue #2093. It relies on the commit: * MINOR: http-ana: Add a HTTP_MSGF flag to state the Expect header was checked Both must be backported as far as 2.5.	2023-04-05 10:53:20 +02:00
Christopher Faulet	ffcffa8e93	MINOR: http-ana: Add a HTTP_MSGF flag to state the Expect header was checked HTTP_MSGF_EXPECT_CHECKED is now set on the request message to know the "Expect: " header was already handled, if any. The flag is set from the moment we try to handle the header to send a "100-continue" response, whether it was found or not. This way, when we are waiting for the request payload, thanks to this flag, we try to handle "Expect: " header only once. Before it was performed by changing the message state from BODY to DATA. But this has some side effects and it is not accurate. So, it is better to rely on a flag to do so.	2023-04-05 10:33:32 +02:00
Aurelien DARRAGON	223770ddca	MINOR: hlua/event_hdl: per-server event subscription Now that event_hdl api is properly implemented in hlua, we may add the per-server event subscription in addition to the global event subscription. Per-server subscription allows to be notified for events related to single server. It is useful to track a server UP/DOWN and DEL events. It works exactly like core.event_sub() except that the subscription will be performed within the server dedicated subscription list instead of the global one. The callback function will only be called for server events affecting the server from which the subscription was performed. Regarding the implementation, it is pretty trivial at this point, we add more doc than code this time. Usage examples have been added to the (lua) documentation.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	c84899c636	MEDIUM: hlua/event_hdl: initial support for event handlers Now that the event handler API is pretty mature, we can expose it in the lua API. Introducing the core.event_sub(<event_types>, <cb>) lua function that takes an array of event types <event_types> as well as a callback function <cb> as argument. The function returns a subscription <sub> on success. Subscription <sub> allows you to manage the subscription from anywhere in the script. To this day only the sub->unsub method is implemented. The following event types are currently supported: - "SERVER_ADD": when a server is added - "SERVER_DEL": when a server is removed from haproxy - "SERVER_DOWN": server states goes from up to down - "SERVER_UP": server states goes from down to up As for the <cb> function: it will be called when one of the registered event types occur. The function will be called with 3 arguments: cb(<event>,<data>,<sub>) <event>: event type (string) that triggered the function. (could be any of the types used in <event_types> when registering the subscription) <data>: data associated with the event (specific to each event family). For "SERVER_" family events, server details such as server name/id/proxy will be provided. If the server still exists (not yet deleted), a reference to the live server is provided to spare you from an additionnal lookup if you need to have direct access to the server from lua. <sub> refers to the subscription. In case you need to manage it from within an event handler. (It refers to the same subscription that the one returned from core.event_sub()) Subscriptions are per-thread: the thread that will be handling the event is the one who performed the subscription using core.event_sub() function. 
Each thread treats events sequentially, it means that if you have, let's say SERVER_UP, then SERVER_DOWN in a short timelapse, then your cb function will first be called with SERVER_UP, and once you're done handling the event, your function will be called again with SERVER_DOWN. This is to ensure event consistency when it comes to logging / triggering logic from lua. Your lua cb function may yield if needed, but you should process the event as fast as possible to prevent the event queue from growing. To prevent abuses, if the event queue for the current subscription goes over 100 unconsumed events, the subscription will pause itself automatically for as long as it takes for your handler to catch up. This would lead to events being missed, so a warning will be emitted in the logs to inform you about that. This is not something you want to let happen too often, it may indicate that you subscribed to an event that is occurring too frequently and/or that your callback function is too slow to keep up the pace and you should review it. If you want to do some parallel processing because your callback functions are slow: you might want to create subtasks from lua using core.register_task() from within your callback function to perform the heavy job in a dedicated task and allow remaining events to be processed more quickly. Please check the lua documentation for more information.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	4e5e26641d	MINOR: proxy: add findserver_unique_id() and findserver_unique_name() Adding alternative findserver() functions to be able to perform an unique match based on name or puid and by leveraging revision id (rid) to make sure the function won't match with a new server reusing the same name or puid of the "potentially deleted" server we were initially looking for. For example, if you were in the position of finding a server based on a given name provided to you by a different context: Since dynamic servers were implemented, between the time the name was picked and the time you will perform the findserver() call some dynamic server deletion/additions could've been performed in the mean time. In such cases, findserver() could return a new server that re-uses the name of a previously deleted server. Depending on your needs, it could be perfectly fine, but there are some cases where you want to lookup the original server that was provided to you (if it still exists).	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	f751a97a11	MINOR: event_hdl: pause/resume for subscriptions While working on event handling from lua, the need for a pause/resume function to temporarily disable a subscription was raised. We solve this by introducing the EHDL_SUB_F_PAUSED flag for subscriptions. The flag is set via _pause() and cleared via _resume(), and it is checked prior to notifying the subscription in publish function. Pause and Resume functions are also available via lookups for identified subscriptions. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	b4b7320a6a	MINOR: event_hdl: add event_hdl_async_equeue_size() function Use event_hdl_async_equeue_size() in advanced async task handler to get the near real-time event queue size. By near real-time, you should understand that the queue size is not updated during element insertion/removal, but shortly before insertion and shortly after removal, so the size should reflect the approximate queue size at a given time but should definitely not be used as a unique source of truth. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	b289fd1420	MINOR: event_hdl: normal tasks support for advanced async mode advanced async mode (EVENT_HDL_ASYNC_TASK) provided full support for custom tasklets registration. Due to the similarities between tasks and tasklets, it may be useful to use the advanced mode with an existing task (not a tasklet). While the API did not explicitly disallow this usage, things would get bad if we try to wakeup a task using tasklet_wakeup() for notifying the task about new events. To make the API support both custom tasks and tasklets, we use the TASK_IS_TASKLET() macro to call the proper waking function depending on the task's type: - For tasklets: we use tasklet_wakeup() - For tasks: we use task_wakeup() If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	afcfc20e14	BUG/MEDIUM: event_hdl: fix async data refcount issue In _event_hdl_publish(), when publishing an event to async handler(s), async_data is allocated only once and then relies on a refcount logic to reuse the same data block for multiple async event handlers. (this allows to save significant amount of memory) Because the refcount is first set to 0, there is a small race where the consumers could consume async data (async data refcount reaching 0) before publishing is actually over. The consequence is that async data may be freed by one of the consumers while we still rely on it within _event_hdl_publish(). This was discovered by chance when stress-testing the API with multiple async handlers registered to the same event: some of the handlers were notified about a new event for which the event data was already freed, resulting in invalid reads and/or segfaults. To fix this, we first set the refcount to 1, assuming that the publish function relies on async_data until the publish is over. At the end of the publish, the reference to the async data is dropped. This way, async_data is either freed by _event_hdl_publish() itself or by one of the consumers, depending on who is the last one relying on it. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
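The refcount fix described in afcfc20e14 follows a common pattern: the publisher holds its own reference for the whole duration of the publish, so consumers cannot free the block underneath it. Below is a minimal sketch of that pattern using C11 atomics. All names (`async_data`, `publish_begin()`, `publish_add_consumer()`, `publish_end()`, `freed_blocks`) are hypothetical and are not the event_hdl API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the data block shared by async handlers. */
struct async_data {
	atomic_int refcount;
	int payload;
};

/* Number of blocks freed so far; only here so the sketch can be checked. */
static atomic_int freed_blocks;

/* Drop one reference; whoever drops the last one frees the block. */
static void async_data_drop(struct async_data *d)
{
	assert(d != NULL);
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {
		free(d);
		atomic_fetch_add(&freed_blocks, 1);
	}
}

/* Start publishing: the refcount starts at 1 (the publisher's reference), not 0. */
static struct async_data *publish_begin(int payload)
{
	struct async_data *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	atomic_init(&d->refcount, 1);
	d->payload = payload;
	return d;
}

/* Hand the block to one more consumer, which must call async_data_drop() once. */
static void publish_add_consumer(struct async_data *d)
{
	assert(d != NULL);
	atomic_fetch_add(&d->refcount, 1);
}

/* End of publish: the publisher releases its own reference. */
static void publish_end(struct async_data *d)
{
	async_data_drop(d);
}
```

With the count starting at 1, consumers that finish early only bring it back to 1, so the block remains valid until `publish_end()` runs. The block is freed by the publisher or by the last consumer, whichever drops the final reference.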
Aurelien DARRAGON	ef6ca67176	BUG/MEDIUM: event_hdl: clean soft-stop handling soft-stop was not explicitly handled in event_hdl API. Because of this, event_hdl was causing some leaks on deinit paths. Moreover, a task responsible for handling events could require some additional cleanups (ie: advanced async task), and as the task was not protected against abort when soft-stopping, such cleanup could not be performed unless the task itself implements the required protections, which is not optimal. Consider this new approach: 'jobs' global variable is incremented whenever an async subscription is created to prevent the related task from being aborted before the task acknowledges the final END event. Once the END event is acknowledged and freed by the task, the 'jobs' variable is decremented, and the deinit process may continue (including the abortion of remaining tasks not guarded by the 'jobs' variable). To do this, a new global mt_list is required: known_event_hdl_sub_list This list tracks the known (initialized) subscription lists within the process. sub_lists are automatically added to the "known" list when calling event_hdl_sub_list_init(), and are removed from the list with event_hdl_sub_list_destroy(). This allows us to implement a global thread-safe event_hdl deinit() function that is automatically called on soft-stop thanks to signal(0). When event_hdl deinit() is initiated, we simply iterate against the known subscription lists to destroy them. event_hdl_subscribe_ptr() was slightly modified to make sure that a sub_list may not accept new subscriptions once it is destroyed (removed from the known list) This can occur between the time the soft-stop is initiated (signal(0)) and haproxy actually enters in the deinit() function (once tasks are either finished or aborted and other threads already joined). 
It is safe to destroy() the subscription list multiple times as long as the pointer is still valid (ie: first on soft-stop when handling the '0' signal, then from regular deinit() path): the function does nothing if the subscription list is already removed. We partially reverted "BUG/MINOR: event_hdl: make event_hdl_subscribe thread-safe" since we can use parent mt_list locking instead of a dedicated lock to make the check against duplicate subscription ID. (insert_lock is not useful anymore) The check in itself is not changed, only the locking method. sizeof(event_hdl_sub_list) slightly increases: from 24 to 32 bytes due to the additional mt_list struct within it. With that said, having a thread-safe list to store known subscription lists is a good thing: it could help to implement additional management logic for subscription lists and could be useful to add some stats or debugging tools in the future. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	3a81e997ac	MINOR: event_hdl: global sublist management clarification event_hdl_sub_list_init() and event_hdl_sub_list_destroy() don't expect to be called with a NULL argument (to use global subscription list implicitly), simply because the global subscription list init and destroy is internally managed. Adding BUG_ON() to detect such invalid usages, and updating some comments to prevent confusion around these functions. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	d514ca45c6	BUG/MINOR: event_hdl: make event_hdl_subscribe thread-safe List insertion in event_hdl_subscribe() was not thread-safe when dealing with unique identifiers. Indeed, in this case the list insertion is conditional (we check for a duplicate, then we insert). And while we're using mt lists for this, the whole operation is not atomic: there is a race between the check and the insertion. This could lead to the same ID being registered multiple times with concurrent calls to event_hdl_subscribe() on the same ID. To fix this, we add 'insert_lock' dedicated lock in the subscription list struct. The lock's cost is nearly 0 since it is only used when registering identified subscriptions and the lock window is very short: we only guard the duplicate check and the list insertion to make the conditional insertion "atomic" within a given subscription list. This is the only place where we need the lock: as soon as the item is properly inserted we're out of trouble because all other operations on the list are already thread-safe thanks to mt lists. A new lock hint is introduced: LOCK_EHDL which is dedicated to event_hdl The patch may seem quite large since we had to rework the logic around the subscribe function and switch from simple mt_list to a dedicated struct wrapping both the mt_list and the insert_lock for the event_hdl_sub_list type. (sizeof(event_hdl_sub_list) is now 24 instead of 16) However, all the changes are internal: we don't break the API. If `68e692da0` ("MINOR: event_hdl: add event handler base api") is being backported, then this commit should be backported with it.	2023-04-05 08:58:17 +02:00
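The race described in d514ca45c6 is a classic check-then-insert problem. The sketch below shows the general idea of guarding both the duplicate check and the insertion with one short-lived lock. It uses a plain singly linked list and a pthread mutex as stand-ins; the real code relies on mt_list and HAProxy's own lock primitives, and every name here (`sub_list`, `subscribe_unique()`, ...) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical identified subscription. */
struct sub {
	uint64_t id;
	struct sub *next;
};

/* Hypothetical list wrapper: the list head plus a lock dedicated to insertion. */
struct sub_list {
	struct sub *head;
	pthread_mutex_t insert_lock;
};

static void sub_list_init(struct sub_list *l)
{
	assert(l != NULL);
	l->head = NULL;
	pthread_mutex_init(&l->insert_lock, NULL);
}

/* Insert <id> only if it is not registered yet.
 * Returns 1 if inserted, 0 on duplicate or allocation failure.
 */
static int subscribe_unique(struct sub_list *l, uint64_t id)
{
	struct sub *cur, *new_sub;
	int inserted = 0;

	/* allocate outside of the lock to keep the locked window short */
	new_sub = malloc(sizeof(*new_sub));
	if (!new_sub)
		return 0;
	new_sub->id = id;

	/* the duplicate check and the insertion form one critical section */
	pthread_mutex_lock(&l->insert_lock);
	for (cur = l->head; cur; cur = cur->next)
		if (cur->id == id)
			break;
	if (!cur) {
		new_sub->next = l->head;
		l->head = new_sub;
		inserted = 1;
	}
	pthread_mutex_unlock(&l->insert_lock);

	if (!inserted)
		free(new_sub);
	return inserted;
}
```

This sketch only covers insertion. In the commit, every other list operation is already thread-safe thanks to mt lists, which is why the lock only needs to cover this one window.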
Aurelien DARRAGON	b8038996e9	MINOR: hlua: support for optional arguments to core.register_task() core.register_task(function) may now take up to 4 additional arguments that will be passed as-is to the task function. This could be convenient to spawn sub-tasks from existing functions supporting core.register_task() without the need to use global variables to pass some context to the newly created task function. The new prototype is: core.register_task(function[, arg1[, arg2[, ...[, arg4]]]]) Implementation remains backward-compatible with existing scripts.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	94ee6632ee	MINOR: hlua_fcn: add server->get_rid() method Server revision ID was recently added to haproxy with `61e3894` ("MINOR: server: add srv->rid (revision id) value") Let's add it to the hlua server class.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	6b0b9bd39f	BUG/MEDIUM: hlua: prevent deadlocks with main lua lock Main lua lock is used at various places in the code. Most of the time it is used from unprotected lua environments, in which case the locking is mandatory. But there are some cases where the lock is attempted from protected lua environments, meaning that lock is already owned by the current thread. Thus new locking attempt should be skipped to prevent any deadlocks from occurring. To address this, "already_safe" lock hint was implemented in hlua_ctx_init() function with commit `bf90ce1` ("BUG/MEDIUM: lua: dead lock when Lua tasks are trigerred") But this approach is not very safe, for 2 reasons: First reason is that there are still some code paths that could lead to deadlocks. For instance, in register_task(), hlua_ctx_init() is called with already_safe set to 1 to prevent deadlock from occurring. But in case of task init failure, hlua_ctx_destroy() will be called from the same environment (protected environment), and hlua_ctx_destroy() does not offer the already_safe lock hint, resulting in a deadlock. Second reason is that already_safe hint is used to completely skip SET_LJMP macros (which manipulates the lock internally), resulting in some logics in the function being unprotected from lua aborts in case of unexpected errors when manipulating the lua stack (the lock does not protect against longjmps) Instead of leaving the locking responsibility to the caller, which is quite error prone since we must find out ourselves if we are or not in a protected environment (and is not robust against code re-use), we move the deadlock protection logic directly in hlua_lock() function. Thanks to a thread-local lock hint, we can easily guess if the current thread already owns the main lua lock, in which case the locking attempt is skipped. The thread-local lock hint is implemented as a counter so that the lock is properly dropped when the counter reaches 0.
(to match actual lock() and unlock() calls) This commit depends on "MINOR: hlua: simplify lua locking". It may be backported to all stable versions. [prior to 2.5 lua filter API did not exist, filter-related parts should be skipped]	2023-04-05 08:58:17 +02:00
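The thread-local lock hint described in 6b0b9bd39f can be illustrated with a small sketch: a per-thread counter records how many times the current thread has "taken" the lock, and the underlying mutex is only touched on the 0 to 1 and 1 to 0 transitions. This is an assumption-laden illustration, not the hlua code: `main_lock`, `lock_depth` and the `sketch_*` functions are hypothetical, and a pthread mutex stands in for HAProxy's own lock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the main lua lock. */
static pthread_mutex_t main_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread nesting depth: non-zero means this thread already owns the lock. */
static _Thread_local unsigned int lock_depth;

/* Take the lock only on the outermost call made by this thread. */
static void sketch_lock(void)
{
	if (lock_depth++ == 0)
		pthread_mutex_lock(&main_lock);
}

/* Release the lock only when the outermost call is undone. */
static void sketch_unlock(void)
{
	assert(lock_depth > 0);
	if (--lock_depth == 0)
		pthread_mutex_unlock(&main_lock);
}

/* Expose the current depth, mainly for checks and debugging. */
static unsigned int sketch_lock_depth(void)
{
	return lock_depth;
}
```

Because the decision is made inside the lock wrapper, callers no longer need to know whether they run in a protected environment: a nested `sketch_lock()` from the owning thread just bumps the counter instead of deadlocking.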
Aurelien DARRAGON	e36f803b71	MINOR: hlua: simplify lua locking The check on lua state==0 to know whether locking is required or not can be performed in a locking wrapper to simplify things a bit and prevent implementation errors. Locking from hlua context should now be performed via hlua_lock(L) and unlocking via hlua_unlock(L)	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	fde199dddc	CLEANUP: hlua: use hlua_unref() instead of luaL_unref() Replacing some luaL_unref(, LUA_REGISTRYINDEX) calls with hlua_unref() which is simpler to use and more explicit.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	4fdf8b58f2	CLEANUP: hlua: use hlua_pushref() instead of lua_rawgeti() Using hlua_pushref() everywhere temporary lua objects are involved. (ie: hlua_checkfunction(), hlua_checktable...) Those references are expected to be cleared using hlua_unref() when they are no longer used.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	73d1a98d52	CLEANUP: hlua: use hlua_ref() instead of luaL_ref() Using hlua_ref() everywhere temporary lua objects are involved. Those references are expected to be cleared using hlua_unref() when they are no longer used.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	55afbedfb4	BUG/MINOR: hlua: prevent function and table reference leaks on errors Several error paths were leaking function or table references. (Obtained through hlua_checkfunction() and hlua_checktable() functions) Now we properly release the references thanks to hlua_unref() in such cases. This commit depends on "MINOR: hlua: add simple hlua reference handling API". This could be backported to all stable versions although it is not mandatory as such leaks only occur on rare error/warn paths. [prior to 2.5 lua filter API did not exist, the hlua_register_filter() part should be skipped]	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	16d047b615	BUG/MINOR: hlua: fix reference leak in hlua_post_init_state() hlua init function references were not released during hlua_post_init_state(). Fortunately, this function is only used during startup so the resulting leak is not a big deal. Since each init lua function runs precisely once, it is safe to release the ref as soon as the function is restored on the stack. This could be backported to all stable versions. Please note that this commit depends on "MINOR: hlua: add simple hlua reference handling API"	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	be58d6683c	BUG/MINOR: hlua: fix reference leak in core.register_task() In core.register_task(): we take a reference to the function passed as argument in order to push it in the new coroutine substack. However, once pushed in the substack: the reference is not useful anymore and should be cleared. Currently, this is not the case in hlua_register_task(). Explicitly dropping the reference once the function is pushed to the coroutine's stack to prevent any reference leak (which could contribute to resource shortage) This may be backported to all stable versions. Please note that this commit depends on "MINOR: hlua: add simple hlua reference handling API"	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	9ee0d04770	MINOR: hlua: fix return type for hlua_checkfunction() and hlua_checktable() hlua_checktable() and hlua_checkfunction() both return the raw value of luaL_ref() function call. As luaL_ref() returns a signed int, both functions should return a signed int as well to prevent any misuse of the returned reference value.	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	f8f8a2b872	MINOR: hlua: add simple hlua reference handling API We're doing this in an attempt to simplify temporary lua objects references handling. Adding the new hlua_unref() function to release lua object references created using luaL_ref(, LUA_REGISTRYINDEX) (ie: hlua_checkfunction() and hlua_checktable()) Failure to release unused object reference prevents the reference index from being re-used and prevents the referred resource from being garbage collected. Adding hlua_pushref(L, ref) to replace lua_rawgeti(L, LUA_REGISTRYINDEX, ref) Adding hlua_ref(L) to replace luaL_ref(L, LUA_REGISTRYINDEX)	2023-04-05 08:58:17 +02:00
Aurelien DARRAGON	60ab0f7d20	CLEANUP: hlua: fix conflicting comment in hlua_ctx_destroy() The comment for the hlua_ctx_destroy() function states that the "lua" struct is not freed. This is not true anymore since `2c8b54e7` ("MEDIUM: lua: remove Lua struct from session, and allocate it with memory pools") Updating the function comment to properly report the actual behavior. This could be backported to all stable versions with `2c8b54e7` ("MEDIUM: lua: remove Lua struct from session, and allocate it with memory pools")	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	c4b2437037	MEDIUM: hlua_fcn/api: remove some old server and proxy attributes Since ("MINOR: hlua_fcn: alternative to old proxy and server attributes"): - s->name(), s->puid() are superseded by s->get_name() and s->get_puid() - px->name(), px->uuid() are superseded by px->get_name() and px->get_uuid() And considering this is now the proper way to retrieve proxy name/uuid and server name/puid from lua: We're now removing such legacy attributes, but for retro-compatibility purposes we will be emulating them and warning the user for some time before completely dropping their support. To do this, we first remove old legacy code. Then we move server and proxy methods out of the metatable to allow direct elements access without systematically involving the "__index" metamethod. This allows us to involve the "__index" metamethod only when the requested key is missing from the table. Then we define relevant hlua_proxy_index and hlua_server_index functions that will be used as the "__index" metamethod to respectively handle "name, uuid" (proxy) or "name, puid" (server) keys, in which case we warn the user about the need to use the new getter function instead of the legacy attribute (to prepare for the potential upcoming removal), and we call the getter function to return the value as if the getter function was directly called from the script. Note: Using the legacy variables instead of the getter functions results in a slight overhead due to the "__index" metamethod indirection, thus it is recommended to switch to the getter functions right away. With this commit we're also adding a deprecation notice about legacy attributes.	2023-04-05 08:58:16 +02:00
Thierry Fournier	1edf36a369	MEDIUM: hlua_fcn: dynamic server iteration and indexing This patch proposes to enumerate servers using internal HAProxy list. Also, remove the flag SRV_F_NON_PURGEABLE which makes the server non purgeable each time Lua uses the server. Removing reg-tests/cli_delete_server_lua.vtc since this test is no longer relevant (we don't set the SRV_F_NON_PURGEABLE flag anymore) and we already have a more generic test: reg-tests/server/cli_delete_server.vtc Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2023-04-05 08:58:16 +02:00
Thierry Fournier	b0467730a0	MINOR: hlua_fcn: alternative to old proxy and server attributes This patch adds new lua methods: - "Proxy.get_uuid()" - "Proxy.get_name()" - "Server.get_puid()" - "Server.get_name()" These methods will be equivalent to their old analog Proxy.{uuid,name} and Server.{puid,name} attributes, but this will be the new preferred way to fetch such infos as it duplicates memory only when necessary and thus reduce the overall lua Server/Proxy objects memory footprint. Legacy attributes (now superseded by the explicit getters) are expected to be removed some day. Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2023-04-05 08:58:16 +02:00
Thierry Fournier	467913c84e	MEDIUM: hlua: Dynamic list of frontend/backend in Lua When HAProxy is loaded with a lot of frontends/backends (tested with 300k), it is slow to start and it uses a lot of memory just for indexing backends in the lua tables. This patch uses the internal frontend/backend index of HAProxy in place of lua table. HAProxy startup is now quicker as each frontend/backend object is created on demand and not at init. This has to come with some cost: the execution of Lua will be a little bit slower.	2023-04-05 08:58:16 +02:00
Thierry Fournier	599f2311a8	MINOR: hlua: Fix two functions that return nothing useful Two lua init functions seem to return something useful, but this is not the case. The function "hlua_concat_init" seems to return a failure status, but the function never fails. The function "hlua_fcn_reg_core_fcn" seems to return a number of elements in the stack, but it is not the case.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	87f52974ba	BUG/MINOR: hlua: enforce proper running context for register_x functions register_{init, converters, fetches, action, service, cli, filter} are meant to run exclusively from body context according to the documentation (unlike register_task which is designed to work from both init and runtime contexts) A quick code inspection confirms that only register_task implements the required precautions to make it safe out of init context. Trying to use those register_* functions from a runtime lua task will lead to a program crash since they all assume that they are running from the main lua context and with no concurrent runs: core.register_task(function() core.register_init(function() end) end) When loaded from the config, the above example would segfault. To prevent this undefined behavior, we now report an explicit error if the user tries to use such functions outside of init/body context. This should be backported to all stable versions. [prior to 2.5 lua filter API did not exist, the hlua_register_filter() part should be skipped]	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	795441073c	MINOR: hlua: properly handle hlua_process_task HLUA_E_ETMOUT In hlua_process_task: when HLUA_E_ETMOUT was returned by hlua_ctx_resume(), meaning that the lua task reached tune.lua.task-timeout (default: none), we logged "Lua task: unknown error." before stopping the task. Now we properly handle HLUA_E_ETMOUT to report a meaningful error message.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	0ebd41ff50	BUG/MINOR: hlua: hook yield does not behave as expected In function hlua_hook, a yieldk is performed when function is yieldable. But the following code in that function seems to assume that the yield never returns, which is not the case! Moreover, Lua documentation says that in this situation the yieldk call must immediately be followed by a return. This patch adds a return statement after the yieldk call. It also adds some comments and removes a needless lua_sethook call. It could be backported to all stable versions, but it is not mandatory, because even if it is undefined behavior this bug doesn't seem to negatively affect lua 5.3/5.4 stacks.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	32483ecaac	MINOR: server: correctly free servers on deinit() srv_drop() function is responsible for freeing the server when the refcount reaches 0. There is one exception: when global.mode has the MODE_STOPPING flag set, srv_drop() will ignore the refcount and free the server on first invocation. This logic has been implemented with `13f2e2ce` ("BUG/MINOR: server: do not use refcount in free_server in stopping mode") and back then doing so was not a problem since dynamic server API was just implemented and srv_take() and srv_drop() were not widely used. Now that dynamic server API is starting to get more popular we cannot afford to keep the current logic: some modules or lua scripts may hold references to existing server and also do their cleanup in deinit phases. In this kind of situation, it would be easy to trigger double-frees since every call to srv_drop() on a specific server will try to free it. To fix this, we take a different approach and try to fix the issue at the source: we now properly drop server references involved with checks/agent_checks in deinit_srv_check() and deinit_srv_agent_check(). While this could theoretically be backported up to 2.6, it is not very relevant for now since srv_drop() usage in older versions is very limited and we're only starting to face the issue in mid 2.8 developments. (ie: lua core updates)	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	b5ee8bebfc	MINOR: server: always call ssl->destroy_srv when available In srv_drop(), we only call the ssl->destroy_srv() method on specific conditions. But this has two downsides: First, destroy_srv() is responsible for freeing data that may have been allocated in prepare_srv(), but not exclusively: it also frees ssl-related parameters allocated when parsing a server entry, such as ca-file for instance. So this is quite error-prone, we could easily miss a condition where some data needs to be deallocated using destroy_srv() even if prepare_srv() was not used (since prepare_srv() is also conditional), thus resulting in memory leaks. Moreover, depending on srv->proxy to guard the check is probably not a good idea here, since srv_drop() could be called in late de-init paths in which related proxy could be freed already. srv_drop() should only take care of freeing local server data without external logic. Thankfully, destroy_srv() function performs the necessary checks to ensure that a systematic call to the function won't result in invalid reads or double frees. No backport needed.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	cca3355074	BUG/MINOR: log: free log forward proxies on deinit() Proxies belonging to the cfg_log_forward proxy list are not cleaned up in haproxy deinit() function. We add the missing cleanup directly in the main deinit() function since no other specific function may be used for this. This could be backported up to 2.4	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	9b1d15f53a	BUG/MINOR: sink: free forward_px on deinit() When a ring section is configured, a new sink is created and forward_px proxy may be allocated and assigned to the sink. Such sink-related proxies are added to the sink_proxies_list and thus don't belong to the main proxy list which is cleaned up in haproxy deinit() function. We don't have to manually clean up sink_proxies_list in the main deinit() func: sink API already provides the sink_deinit() function so we just add the missing free_proxy(sink->forward_px) there. This could be backported up to 2.4. [in 2.4, commit `b0281a49` ("MINOR: proxy: check if p is NULL in free_proxy()") must be backported first]	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	99a8d0f5d8	BUG/MINOR: stats: properly handle server stats dumping resumption In stats_dump_proxy_to_buffer() function, special care was taken when dealing with servers dump. Indeed, stats_dump_proxy_to_buffer() can be interrupted and resumed if buffer space is not big enough to complete dump. Thus, a reference is taken on the server being dumped in the hope that the server will still be valid when the function resumes. (to prevent the server from being freed in the meantime) While this is now true thanks to: - "BUG/MINOR: server/del: fix legacy srv->next pointer consistency" We still have an issue: when resuming, saved server reference is not dropped. This prevents the server from being freed when we no longer use it. Moreover, as the saved server might now be deleted (SRV_F_DELETED flag set), the current deleted server may still be dumped in the stats and while this is not a bug, this could be misleading for the user. Let's add a px_st variable to detect if the stats_dump_proxy_to_buffer() is being resumed at the STAT_PX_ST_SV stage: perform some housekeeping to skip deleted servers and properly drop the reference on the saved server. This commit depends on: - "MINOR: server: add SRV_F_DELETED flag" - "BUG/MINOR: server/del: fix legacy srv->next pointer consistency" This should be backported up to 2.6	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	f175b08bfb	BUG/MINOR: server/del: fix srv->next pointer consistency We recently discovered a bug which affects dynamic server deletion: When a server is deleted, it is removed from the "visible" server list. But as we've seen in previous commit ("MINOR: server: add SRV_F_DELETED flag"), it can still be accessed by someone who keeps a reference on it (waiting for the final srv_drop()). Throughout this transient state, server ptr is still valid (may be dereferenced) and the flag SRV_F_DELETED is set. However, as the server is not part of server list anymore, we have an issue: srv->next pointer won't be updated anymore as the only place where we perform such update is in cli_parse_delete_server() by iterating over the "visible" server list. Because of this, we cannot guarantee that a server with the SRV_F_DELETED flag has a valid 'next' ptr: 'next' could be pointing to a fully removed (already freed) server. This problem can be easily demonstrated with server dumping in the stats: server list dumping is performed in stats_dump_proxy_to_buffer() The function can be interrupted and resumed later by design. ie: output buffer is full: partial dump and finish the dump after the flush This is implemented by calling srv_take() on the server being dumped, and only releasing it when we're done with it using srv_drop(). (drop can be delayed after function resume if buffer is full) While the function design seems OK, it works with the assumption that srv->next will still be valid after the function resumes, which is not true. (especially if multiple servers are being removed in between the 2 dumping attempts) In practice, this did not cause any crash yet (at least this was not reported so far), because server dumping is so fast that it is very unlikely that multiple server deletions make their way between 2 dumping attempts in most setups. 
But still, this is a problem that we need to address because some upcoming work might depend on this assumption as well and for the moment it is not safe at all.

========================================================================

Here is a quick reproducer:

With this patch, we're creating a large deletion window of 3s as soon as we reach a server named "t2" while iterating over the list. This will give us plenty of time to perform multiple deletions before the function is resumed.

  | diff --git a/src/stats.c b/src/stats.c
  | index 84a4f9b6e..15e49b4cd 100644
  | --- a/src/stats.c
  | +++ b/src/stats.c
  | @@ -3189,11 +3189,24 @@ int stats_dump_proxy_to_buffer(struct stconn *sc, struct htx *htx,
  |                   * Temporarily increment its refcount to prevent its
  |                   * anticipated cleaning. Call free_server to release it.
  |                   */
  | +                struct server *orig = ctx->obj2;
  |                  for (; ctx->obj2 != NULL;
  |                       ctx->obj2 = srv_drop(sv)) {
  |
  |                          sv = ctx->obj2;
  | +                        printf("sv = %s\n", sv->id);
  |                          srv_take(sv);
  | +                        if (!strcmp("t2", sv->id) && orig == px->srv) {
  | +                                printf("deletion window: 3s\n");
  | +                                thread_idle_now();
  | +                                thread_harmless_now();
  | +                                sleep(3);
  | +                                thread_harmless_end();
  | +
  | +                                thread_idle_end();
  | +
  | +                                goto full; /* simulate full buffer */
  | +                        }
  |
  |                          if (htx) {
  |                                  if (htx_almost_full(htx))
  | @@ -4353,6 +4366,7 @@ static void http_stats_io_handler(struct appctx *appctx)
  |          struct channel *res = sc_ic(sc);
  |          struct htx *req_htx, *res_htx;
  |
  | +        printf("http dump\n");
  |          /* only proxy stats are available via http */
  |          ctx->domain = STATS_DOMAIN_PROXY;
  |

Ok, we're ready, now we start haproxy with the following conf:

  global
      stats socket /tmp/ha.sock mode 660 level admin expose-fd listeners thread 1-1
      nbthread 2

  frontend stats
      mode http
      bind :8081 thread 2-2
      stats enable
      stats uri /

  backend farm
      server t1 127.0.0.1:1899 disabled
      server t2 127.0.0.1:18999 disabled
      server t3 127.0.0.1:18998 disabled
      server t4 127.0.0.1:18997 disabled

And finally, we execute the following
script:

  curl localhost:8081/stats&
  sleep .2
  echo "del server farm/t2" | nc -U /tmp/ha.sock
  echo "del server farm/t3" | nc -U /tmp/ha.sock

This should be enough to reveal the issue, I easily manage to consistently crash haproxy with the following reproducer:

  http dump
  sv = t1
  http dump
  sv = t1
  sv = t2
  deletion window = 3s
  [NOTICE] (2940566) : Server deleted.
  [NOTICE] (2940566) : Server deleted.
  http dump
  sv = t2
  sv = ��U
  [1] 2940566 segmentation fault (core dumped) ./haproxy -f ttt.conf

========================================================================

To fix this, we add prev_deleted mt_list in server struct. For a given "visible" server, this list will contain the pending "deleted" servers references that point to it using their 'next' ptr. This way, whenever this "visible" server is going to be deleted via cli_parse_delete_server() it will check for servers in its 'prev_deleted' list and update their 'next' pointer so that they no longer point to it, and then it will push them in its 'next->prev_deleted' list to transfer the update responsibility to the next 'visible' server (if next != NULL). Then, following the same logic, the server about to be removed in cli_parse_delete_server() will push itself as well into its 'next->prev_deleted' list (if next != NULL) so that it may still use its 'next' ptr for the time it is in transient removal state. In srv_drop(), right before the server is finally freed, we make sure to remove it from the 'next->prev_deleted' list so that 'next' won't try to perform the pointers update for this server anymore. This has to be done atomically to prevent 'next' srv from accessing a purged server. As a result: for a valid server, either deleted or not, 'next' ptr will always point to a non deleted (ie: visible) server. With the proposed fix, and several removal combinations (including unordered cli_parse_delete_server() and srv_drop() calls), I cannot reproduce the crash anymore.
Example tricky removal sequence that is now properly handled: sv list: t1,t2,t3,t4,t5,t6 ops: take(t2) del(t4) del(t3) del(t5) drop(t3) drop(t4) drop(t5) drop(t2)	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	75b9d1c041	MINOR: server: add SRV_F_DELETED flag Set the SRV_F_DELETED flag when server is removed from the cli. When removing a server from the cli (in cli_parse_delete_server()), we update the "visible" server list so that the removed server is no longer part of the list. However, despite the server being removed from "visible" server list, one could still access the server data from a valid ptr (ie: srv_take()) Deleted flag helps detecting when a server is in transient removal state: that is, removed from the list, thus not visible but not yet purged from memory.	2023-04-05 08:58:16 +02:00
Christopher Faulet	8019f78326	MINOR: stconn/applet: Add BUG_ON_HOT() to be sure SE_FL_EOS is never set alone SE_FL_EOS flag must never be set on the SE descriptor without SE_FL_EOI or SE_FL_ERROR. When a mux or an applet reports an end of stream, it must be able to state if it is the end of input too or if it is an error. Because all this part was recently refactored, especially the applet part, it is a bit sensitive. Thus a BUG_ON_HOT() is used and not a BUG_ON().	2023-04-05 08:57:06 +02:00
Christopher Faulet	7faac7cf34	MINOR: tree-wide: Simplifiy some tests on SHUT flags by accessing SCs directly At many places, we simplify the tests on SHUT flags to remove calls to chn_prod() or chn_cons() function because the corresponding SC is available.	2023-04-05 08:57:06 +02:00
Christopher Faulet	87633c3a11	MEDIUM: tree-wide: Move flags about shut from the channel to the SC The purpose of this patch is only a one-to-one replacement, as far as possible. CF_SHUTR(_NOW) and CF_SHUTW(_NOW) flags are now carried by the stream-connector. CF_ prefix is replaced by SC_FL_ one. Of course, it is not so simple because at many places, we were testing if a channel was shut for reads and writes at the same time. To do the same, shut for reads must be tested on one side on the SC and shut for writes on the other side on the opposite SC. Special care was taken with process_stream(): flags of SCs must be saved to be able to detect changes, just like for the channels.	2023-04-05 08:57:06 +02:00
Christopher Faulet	904763f562	MINOR: stconn/channel: Move CF_EOI into the SC and rename it The channel flag CF_EOI is renamed to SC_FL_EOI and moved into the stream-connector.	2023-04-05 08:57:06 +02:00
Christopher Faulet	be08df8fb3	MEDIUM: http_client: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream. Here, the applet is a bit refactored to handle SE descriptor EOS, EOI and ERROR flags	2023-04-05 08:57:06 +02:00
Christopher Faulet	df15a5d1f3	MEDIUM: stats: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream.	2023-04-05 08:57:06 +02:00
Christopher Faulet	a739dc22c5	MEDIUM: sink: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream.	2023-04-05 08:57:06 +02:00
Christopher Faulet	4b866959d8	MINOR: sink: Remove the tests on the opposite SC state to process messages The state of the opposite SC is already tested to wait until the connection is established before sending messages. So, there is no reason to test it again before looping on the ring buffer.	2023-04-05 08:57:06 +02:00
Christopher Faulet	3d949010bc	MEDIUM: peers: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream. We must just be sure to consume request data when we are waiting for the applet to be released.	2023-04-05 08:57:05 +02:00
Christopher Faulet	22a88f06d4	MEDIUM: log: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream. Here, the refactoring only reports errors by setting SE_FL_ERROR flag.	2023-04-05 08:57:05 +02:00
Christopher Faulet	31572229ed	MEDIUM: hlua/applet: Use the sedesc to report and detect end of processing There are 3 kinds of applet in lua: The co-sockets, the TCP services and the HTTP services. The three are refactored to use the SE descriptor instead of the channel to report error and end-of-stream.	2023-04-05 08:57:05 +02:00
Christopher Faulet	d550d26a39	MEDIUM: spoe: Use the sedesc to report and detect end of processing Just like for other applets, we now use the SE descriptor instead of the channel to report error and end-of-stream. We must just be sure to consume request data when we are waiting for the applet to be released. This patch is a bit different from the others because message handling is dispatched across several functions. But the idea is the same.	2023-04-05 08:57:05 +02:00
Christopher Faulet	26769b0775	MEDIUM: dns: Use the sedesc to report and detect end of processing It is now the dns turn to be refactored to use the SE descriptor instead of the channel to report error and end-of-stream. We must just be sure to consume request data when we are waiting for the applet to be released.	2023-04-05 08:57:05 +02:00
Christopher Faulet	4d3283f44b	MINOR: dns: Remove the test on the opposite SC state to send requests The state of the opposite SC is already tested to wait until the connection is established before sending requests. So, there is no reason to test it again before looping on the ring buffer.	2023-04-05 08:57:05 +02:00
Christopher Faulet	2fd0c7669d	MEDIUM: cli: Use the sedesc to report and detect end of processing It is the same kind of change as for the cache applet. The idea is to use the SE desc instead of the channel or the SC to report end-of-input, end-of-stream and errors. Truncated commands are now reported on error. Other changes are the same as for the cache applet. We now set SE_FL_EOS flag instead of calling cf_shutr() and calls to cf_shutw are removed.	2023-04-05 08:57:05 +02:00
Christopher Faulet	f8130b2de2	MEDIUM: cache: Use the sedesc to report and detect end of processing We now try, as far as possible, to rely on the SE descriptor to detect end of processing. The idea is to no longer rely on the channel or the SC to do so. First, we now set SE_FL_EOS instead of calling cf_shutr() to report the end of the stream. It happens when the response is fully sent (SE_FL_EOI is already set in this case) or when an error is reported. In this last case, SE_FL_ERROR is also set. Thanks to this change, it is now possible to detect that the applet must only consume the request while waiting for the upper layer to release it. So, if SE_FL_EOS or SE_FL_ERROR are set, it means the response was fully handled. And if SE_FL_SHR or SE_FL_SHW are set, it means the applet was released by the upper layer and is waiting to be freed.	2023-04-05 08:57:05 +02:00
Christopher Faulet	0ffc9d7be3	MINOR: stconn/applet: Handle EOS in the applet .wake callback function Just like for end of input, the end of stream reported by the endpoint (SE_FL_EOS flag) is now handled in sc_applet_process(). The idea is to have applets acting as muxes by reporting events through the SE descriptor, as far as possible.	2023-04-05 08:57:05 +02:00
Christopher Faulet	92297749e1	MINOR: applet: No longer set EOI on the SC Thanks to the previous patch, it is now possible for applets to not set the CF_EOI flag on the channels. On this point, the applets get closer to the muxes.	2023-04-05 08:57:05 +02:00
Christopher Faulet	f8fbb6de66	MINOR: stconn/applet: Handle EOI in the applet .wake callback function The end of input reported by the endpoint (SE_FL_EOI flag), is now handled in sc_applet_process(). This function is always called after an applet was called. So, the applets can now only report EOI on the SE descriptor and have no reason to update the channel too.	2023-04-05 08:57:05 +02:00
Christopher Faulet	b208d8cd64	MINOR: stconn: Always ack EOS at the end of sc_conn_recv() EOS is now acknowledged at the end of sc_conn_recv(), even if an error was encountered. There is no reason to not do so, especially because, if it is not performed here, it will be acknowledged in sc_conn_process(). Note, it is still performed in sc_conn_process() because this function is also the .wake callback function and can be directly called from the lower layer.	2023-04-05 08:57:05 +02:00
Christopher Faulet	e9bacf642d	MINOR: mux-h1: Report an error to the SE descriptor on truncated message On truncated message, a parsing error is still reported. But an error on the SE descriptor is also reported. This will avoid any bugs in future. We are now sure the SC is able to detect the error, independently of the HTTP analyzers.	2023-04-05 08:57:05 +02:00
Christopher Faulet	88dd0b0d13	CLEANUP: mux-h1/mux-pt: Remove useless test on SE_FL_SHR/SE_FL_SHW flags It is already performed by the callers, sc_conn_shutr() and sc_conn_shutw(). So there is no reason to still test these flags in the PT and H1 muxes.	2023-04-05 08:57:05 +02:00
Christopher Faulet	147e18f9d8	BUG/MINOR: mux-h1: Properly report EOI/ERROR on read0 in h1_rcv_pipe() In h1_rcv_pipe(), only the end of stream was reported when a read0 was detected. However, it is also important to report the end of input or an error, depending on the message state. This patch does not fix any real issue for now, but some others, specific to the 2.8, rely on it. No backport needed.	2023-04-05 08:57:05 +02:00
Christopher Faulet	872b01c984	MINOR: mux-pt: Report end-of-input with the end-of-stream after a read In the PT multiplexer, the end of stream is also the end of input. Thus we must report EOI to the stream-endpoint descriptor when the EOS is reported. For now, it is a bit useless but it will be important to distinguish a shutdown from an error or an abort. To be sure to not report an EOI on an error, the errors are now handled first.	2023-04-05 08:57:05 +02:00
Christopher Faulet	84d3ef982c	MINOR: stconn/channel: Move CF_EXPECT_MORE into the SC and rename it The channel flag CF_EXPECT_MORE is renamed to SC_FL_SND_EXP_MORE and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	68ef218a72	MINOR: stconn/channel: Move CF_NEVER_WAIT into the SC and rename it The channel flag CF_NEVER_WAIT is renamed to SC_FL_SND_NEVERWAIT and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	5c281d58ea	MINOR: stconn/channel: Move CF_SEND_DONTWAIT into the SC and rename it The channel flag CF_SEND_DONTWAIT is renamed to SC_FL_SND_ASAP and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	9a790f63ed	MINOR: stconn/channel: Move CF_READ_DONTWAIT into the SC and rename it The channel flag CF_READ_DONTWAIT is renamed to SC_FL_RCV_ONCE and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	9bce9724ec	MINOR: stconn: Remove unecessary test on SE_FL_EOS before receiving data In sc_conn_recv(), if the EOS is reported by the endpoint, it will always be acknowledged by the SC and a read0 will be performed on the input channel. Thus there is no reason to still test it at the beginning of the function because there is already a test on CF_SHUTR.	2023-04-05 08:57:05 +02:00
Christopher Faulet	28975e1e10	BUG/MEDIUM: dns: Properly handle error when a response consumed When a response is consumed, the result of co_getblk() is never checked. It seems OK because the amount of output data is always checked first. But there is an issue when we try to get the first 2 bytes to read the message length. If there is only one byte followed by a shutdown, the applet ignores the shutdown and loops until the timeout to get more data. So to avoid any issue and improve shutdown detection, the co_getblk() return value is always tested. In addition, if there is not enough data, the applet explicitly asks for more data by calling applet_need_more_data(). This patch relies on the previous one: * BUG/MEDIUM: channel: Improve reports for shut in co_getblk() Both should be backported as far as 2.4. On 2.5 and 2.4, applet_need_more_data() must be replaced by si_rx_endp_more().	2023-04-05 08:57:05 +02:00
Christopher Faulet	5f5c94617e	BUG/MEDIUM: channel: Improve reports for shut in co_getblk() When co_getblk() is called with a length and an offset to 0, shutdown is never reported. It may be an issue when the function is called to retrieve all available output data, while there is no output data at all. And it seems pretty annoying to handle this case in the caller. Thus, now, in co_getblk(), -1 is returned when the channel is empty and a shutdown was received. There is no real reason to backport this patch alone. However, another fix will rely on it.	2023-04-05 08:57:05 +02:00
Christopher Faulet	2726624ee7	CLEANUP: stconn: Remove remaining debug messages It is now possible to enable traces for applets. Thus we can remove annoying debug messages (DPRINTF) to track calls to applets.	2023-04-05 08:57:05 +02:00
Christopher Faulet	26e0935681	MEDIUM: applet/trace: Register a new trace source with its events Traces are now supported for applets. The first argument is always the appctx. This will help to debug applets.	2023-04-05 08:46:06 +02:00
Christopher Faulet	a5915eb1dd	MINOR: applet: Uninline appctx_free() This function is uninlined and moved to src/applet.c. This is required to add traces for applets.	2023-04-05 08:46:06 +02:00
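
The 'prev_deleted' bookkeeping described in the server deletion fix earlier in this list can be modelled with a small, single-threaded C sketch. Everything below is hypothetical and for illustration only: struct srv, pd_head, pd_link and the simplified srv_take()/srv_del()/srv_drop() helpers are not HAProxy's actual struct server, mt_list or function implementations, and the locking and atomic operations the commit message requires are left out. The sketch only shows how each deleted server keeps a 'next' pointer to a visible server until it is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-threaded model, NOT HAProxy's struct server. */
struct srv {
    const char *id;
    struct srv *next;     /* next visible server, or NULL */
    struct srv *pd_head;  /* deleted servers whose 'next' points to us */
    struct srv *pd_link;  /* our link inside next->pd_head */
    int refcount;
    int deleted;          /* plays the role of SRV_F_DELETED */
};

static struct srv *srv_new(const char *id, struct srv *next)
{
    struct srv *s = calloc(1, sizeof(*s));

    if (!s)
        abort();
    s->id = id;
    s->next = next;
    s->refcount = 1;
    return s;
}

/* Record that deleted server <d> points to visible server <owner>. */
static void pd_push(struct srv *owner, struct srv *d)
{
    d->pd_link = owner->pd_head;
    owner->pd_head = d;
}

/* Forget deleted server <d> in the pending list of <owner>. */
static void pd_remove(struct srv *owner, struct srv *d)
{
    struct srv **p = &owner->pd_head;

    while (*p && *p != d)
        p = &(*p)->pd_link;
    if (*p)
        *p = d->pd_link;
    d->pd_link = NULL;
}

static void srv_take(struct srv *s)
{
    s->refcount++;
}

/* Model of the deletion step: unlink <s> from the visible list, retarget
 * the deleted servers that pointed to it, then register <s> itself in
 * its successor's pending list.
 */
static void srv_del(struct srv **head, struct srv *s)
{
    struct srv **p = head;

    while (*p && *p != s)
        p = &(*p)->next;
    if (!*p)
        return; /* not in the visible list */
    *p = s->next;
    s->deleted = 1;

    while (s->pd_head) {
        struct srv *d = s->pd_head;

        s->pd_head = d->pd_link;
        d->pd_link = NULL;
        d->next = s->next;
        if (s->next)
            pd_push(s->next, d);
    }
    if (s->next)
        pd_push(s->next, s);
}

/* Release one reference; returns 1 if the server was freed. */
static int srv_drop(struct srv *s)
{
    assert(s->refcount > 0);
    if (--s->refcount > 0)
        return 0;
    if (s->deleted && s->next)
        pd_remove(s->next, s);
    free(s);
    return 1;
}
```

Replaying the sequence from the commit message (take(t2), del(t4), del(t3), del(t5), then the drops) on a six-server list built with srv_new() leaves t1, t2 and t6 visible in this model, and every deleted server's 'next' points to t6 until that server is dropped.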

16085 Commits