haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-10 00:57:02 +02:00

Author	SHA1	Message	Date
Christopher Faulet	dabeb5cbd5	MEDIUM: stream: Reset response analyse expiration date if there is no analyzer When the stream expiration date is computed at the end of process_stream(), if there is no longer analyzer on the request channel, its analyse expiration date is reset. The same is now performed on the response channel. This way, we are sure to not inherit of an orphan expired date. This should prevent spinning loop on process_stream().	2023-08-01 11:33:45 +02:00
Willy Tarreau	4ad1c9635a	BUG/MINOR: stream: do not use client-fin/server-fin with HTX Historically the client-fin and server-fin timeouts were made to allow a connection closure to be effective quickly if the last data were sent down a socket and the client didn't close, something that can happen when the peer's FIN is lost and retransmits are blocked by a firewall for example. This made complete sense in 1.5 for TCP and HTTP in close mode. But nowadays with muxes, it's not done at the right layer anymore and even the description doesn't match what is being done, because what happens is that the stream will abort the whole transfer after it's done sending to the mux and this timeout expires. We've seen in GH issue 2095 that this can happen with very short timeout values, and while this didn't trigger often before, now that the muxes (h2 & quic) properly report an end of stream before even the first sc_conn_sync_recv(), it seems that it can happen more often, and have two undesirable effects: - logging a timeout when that's not the case - aborting the request channel, hence the server-side conn, possibly before it had a chance to be put back to the idle list, causing this connection to be closed and not reusable. Unfortunately for TCP (mux_pt) this remains necessary because the mux doesn't have a timeout task. So here we're adding tests to only do this through an HTX mux. But to be really clean we should in fact completely drop all of this and implement these timeouts in the mux itself. This needs to be backported to 2.8 where the issue was discovered, and maybe carefully to older versions, though that is not sure at all. In any case, using a higher timeout or removing client-fin in HTTP proxies is sufficient to make the issue disappear.	2023-06-02 16:33:40 +02:00
Christopher Faulet	bd90a16564	MEDIUM: stream: Resync analyzers at the end of process_stream() on change At the end of process_stream(), if there was any change on request/response analyzers, we now trigger a resync. It is performed if any analyzer is added but also removed. It should help to catch internal changes on a stream and eventually avoid it to be frozen. There is no reason to backport this patch. But it may be good to keep an eye on it, just in case.	2023-05-10 16:45:36 +02:00
Christopher Faulet	b1368adcc7	BUG/MEDIUM: stream: Forward shutdowns when unhandled errors are caught In process_stream(), after request and response analyzers evaluation, unhandled errors are processed, if any. In this case, depending on the case, remaining request or response analyzers may be removed, unlesse the last one about end of filters. However, auto-close is not reenabled in same time. Thus it is possible to not forward the shutdown for a side to the other one while no analyzer is there to do so or at least to make evolved the situation. In theory, it is thus possible to freeze a stream if no wakeup happens. And it seems possible because it explain a freeze we've oberseved. This patch could be backported to every stable versions but only after a period of observation and if it may match an unexplained bug. It should not lead to any loop but at worst and eventually to truncated messages.	2023-05-10 16:45:36 +02:00
Willy Tarreau	65efd33c06	BUG/MINOR: stream/cli: fix stream age calculation in "show sess" The "show sess" command displays the stream's age in synthetic form, and also makes it appear in the long version (show sess all). But that last one uses the wrong origin, it uses accept_date.tv_sec instead of accept_ts (formerly known as tv_accept). This was introduced in 1.4.2 with the long format, with commit `66dc20a17` ("[MINOR] stats socket: add show sess <id> to dump details about a session"), while the code that split the two variables was introduced in 1.3.16 with commit `b7f694f20` ("[MEDIUM] implement a monotonic internal clock"). This problem was revealed by recent change `ad5a5f677` ("MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request") that made this value report random garbage, and generally emphasized by the fact that in 2.8 the two clocks have sufficiently large an offset for such mistakes to be noticeable early. Arguably a difference between date and accept_date could also make sense, to indicate if the stream had been there for more than 49 days, but this would introduce instabilities for most sockets (including negative times) for extremely rare cases while the goal is essentially to see how much longer than a configured timeout a stream has been there. And that's what other locations (including the short form) provide. This patch could be backported but most users will never notice. In case of backport, tv_accept.tv_sec should be used instead of accept_date.tv_sec.	2023-05-03 11:47:44 +02:00
Willy Tarreau	69530f59ae	MEDIUM: clock: replace timeval "now" with integer "now_ns" This puts an end to the occasional confusion between the "now" date that is internal, monotonic and not synchronized with the system's date, and "date" which is the system's date and not necessarily monotonic. Variable "now" was removed and replaced with a 64-bit integer "now_ns" which is a counter of nanoseconds. It wraps every 585 years, so if all goes well (i.e. if humanity does not need haproxy anymore in 500 years), it will just never wrap. This implies that now_ns is never nul and that the zero value can reliably be used as "not set yet" for a timestamp if needed. This will also simplify date checks where it becomes possible again to do "date1<date2". All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns". Due to the intricacies between now, global_now and now_offset, all 3 had to be turned to nanoseconds at once. It's not a problem since all of them were solely used in 3 functions in clock.c, but they make the patch look bigger than it really is. The clock_update_local_date() and clock_update_global_date() functions are now much simpler as there's no need anymore to perform conversions nor to round the timeval up or down. The wrapping continues to happen by presetting the internal offset in the short future so that the 32-bit now_ms continues to wrap 20 seconds after boot. The start_time used to calculate uptime can still be turned to nanoseconds now. One interrogation concerns global_now_ms which is used only for the freq counters. It's unclear whether there's more value in using two variables that need to be synchronized sequentially like today or to just use global_now_ns divided by 1 million. Both approaches will work equally well on modern systems, the difference might come from smaller ones. Better not change anyhting for now. One benefit of the new approach is that we now have an internal date with a resolution of the nanosecond and the precision of the microsecond, which can be useful to extend some measurements given that timestamps also have this resolution.	2023-04-28 16:08:08 +02:00
Willy Tarreau	eed5da1037	MINOR: clock: do not use now.tv_sec anymore Instead we're using ns_to_sec(tv_to_ns(&now)) which allows the tv_sec part to disappear. At this point, "now" is only used as a timeval in clock.c where it is updated.	2023-04-28 16:08:08 +02:00
Willy Tarreau	ad5a5f6779	MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request Let's get rid of timeval in storage of internal timestamps so that they are no longer mistaken for wall clock time. These were exclusively used subtracted from each other or to/from "now" after being converted to ns, so this patch removes the tv_to_ns() conversion to use them natively. Two occurrences of tv_isge() were turned to a regular wrapping subtract.	2023-04-28 16:08:08 +02:00
Willy Tarreau	76d343d3d3	MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract Instead of operating on {sec, usec} now we convert both operands to ns then subtract them and convert to ms. This is a first step towards dropping timeval from these timestamps. Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at all and could be removed.	2023-04-28 16:08:08 +02:00
Christopher Faulet	ca5309a9a3	MINOR: stconn: Add a flag to report EOS at the stream-connector level SC_FL_EOS flag is added to report the end-of-stream at the SC level. It will be used to distinguish end of stream reported by the endoint, via the SE_FL_EOS flag, and the abort triggered by the stream, via the SC_FL_ABRT_DONE flag. In this patch, the flag is defined and is systematically tested everywhere SC_FL_ABRT_DONE is tested. It should be safe because it is never set.	2023-04-17 17:41:28 +02:00
Christopher Faulet	e182a8e651	MEDIUM: stream: Stop to use SE flags to detect endpoint errors Here again, we stop to use SE_FL_ERROR flag from process_stream() and sub-functions and we fully rely on SC_FL_ERROR to do so.	2023-04-14 17:05:54 +02:00
Christopher Faulet	88d05a0f3b	MEDIUM: tree-wide: Stop to set SE_FL_ERROR from upper layer We can now fully rely on SC_FL_ERROR flag from the stream. The first step is to stop to set the SE_FL_ERROR flag. Only endpoints are responsible to set this flag. It was a design limitation. It is now fixed.	2023-04-14 17:05:54 +02:00
Christopher Faulet	ad46e52814	MINOR: tree-wide: Test SC_FL_ERROR with SE_FL_ERROR from upper layer From the stream, when SE_FL_ERROR flag is tested, we now also test the SC_FL_ERROR flag. Idea is to stop to rely on the SE descriptor to detect errors.	2023-04-14 17:05:54 +02:00
Christopher Faulet	340021b89f	MINOR: stream: Set SC_FL_ERROR on channels' buffer allocation error Set SC_FL_ERROR flag when we fail to allocate a buffer for a stream.	2023-04-14 17:05:54 +02:00
Christopher Faulet	64350bbf05	MINOR: tree-wide: Replace several chn_cons() by the corresponding SC At many places, call to chn_cons() can be easily replaced by the corresponding SC. It is a bit easier to understand which side is manipulated.	2023-04-14 15:04:03 +02:00
Christopher Faulet	b2b1c3a6ea	MINOR: channel/stconn: Replace sc_shutw() by sc_shutdown() All reference to a shutw is replaced by an abort. So sc_shutw() is renamed sc_shutdown(). SC app ops functions are renamed accordingly.	2023-04-14 15:02:57 +02:00
Christopher Faulet	208c712b40	MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE Here again, it is just a flag renaming. In SC flags, there is no longer shutdown for writes but shutdowns.	2023-04-14 15:01:21 +02:00
Christopher Faulet	cfc11c0eae	MINOR: channel/stconn: Replace sc_shutr() by sc_abort() All reference to a shutr is replaced by an abort. So sc_shutr() is renamed sc_abort(). SC app ops functions are renamed accordingly.	2023-04-14 14:54:35 +02:00
Christopher Faulet	0c370eee6d	MINOR: stconn: Rename SC_FL_SHUTR in SC_FL_ABRT_DONE Here again, it is just a flag renaming. In SC flags, there is no longer shutdown for reads but aborts. For now this flag is set when a read0 is detected. It is of couse not accurate. This will be changed later.	2023-04-14 14:51:22 +02:00
Christopher Faulet	df7cd710a8	MINOR: channel/stconn: Replace channel_shutw_now() by sc_schedule_shutdown() After the flag renaming, it is now the turn for the channel function to be renamed and moved in the SC scope. channel_shutw_now() is replaced by sc_schedule_shutdown(). The request channel is replaced by the front SC and the response is replace by the back SC.	2023-04-14 14:49:45 +02:00
Christopher Faulet	e38534cbd0	MINOR: stconn: Rename SC_FL_SHUTW_NOW in SC_FL_SHUT_WANTED Because shutowns for reads are now considered as aborts, the shudowns for writes can now be considered as shutdowns. Here it is just a flag renaming. SC_FL_SHUTW_NOW is renamed SC_FL_SHUT_WANTED.	2023-04-14 14:46:07 +02:00
Christopher Faulet	12762f09a5	MINOR: channel/stconn: Replace channel_shutr_now() by sc_schedule_abort() After the flag renaming, it is now the turn for the channel function to be renamed and moved in the SC scope. channel_shutr_now() is replaced by sc_schedule_abort(). The request channel is replaced by the front SC and the response is replace by the back SC.	2023-04-14 14:08:49 +02:00
Christopher Faulet	573ead1e68	MINOR: stconn: Rename SC_FL_SHUTR_NOW in SC_FL_ABRT_WANTED It is the first step to transform shutdown for reads for the upper layer into aborts. This patch is quite simple, it is just a flag renaming.	2023-04-14 14:06:01 +02:00
Christopher Faulet	7eb837df4a	MINOR: stream: Introduce stream_abort() to abort on both sides in same time The function stream_abort() should now be called when an abort is performed on the both channels in same time.	2023-04-14 14:04:59 +02:00
Christopher Faulet	3db538ac2f	MINOR: channel: Forwad close to other side on abort Most of calls to channel_abort() are associated to a call to channel_auto_close(). Others are in areas where the auto close is the default. So, it is now systematically enabled when an abort is performed on a channel, as part of channel_abort() function.	2023-04-14 13:56:28 +02:00
Christopher Faulet	dbad8ec787	MINOR: stream: Uninline and export sess_set_term_flags() function This function will be used to set termination flags on TCP streams from outside of process_stream(). Thus, it must be uninlined and exported.	2023-04-14 12:13:09 +02:00
Christopher Faulet	915ba08b57	BUG/MEDIUM: stream: Report write timeouts before testing the flags A regression was introduced when stream's timeouts were refactored. Write timeouts are not testing is the right order. When timeous of the front SC are handled, we must then test the read timeout on the request channel and the write timeout on the response channel. But write timeout is tested on the request channel instead. On the back SC, the same mix-up is performed. We must be careful to handle timeouts before checking channel flags. To avoid any confusions, all timeuts are handled first, on front and back SCs. Then flags of the both channels are tested. It is a 2.8-specific issue. No backport needed.	2023-04-14 12:13:09 +02:00
Christopher Faulet	925279ccf2	BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity There is a bug at begining of process_stream(). The SE_FL_ERROR flag is tested against backend stream-connector's flags instead of its SE descriptor's flags. It is an old typo, introduced when the stream-interfaces were replaced by the conn-streams. This patch must be backported as far as 2.6.	2023-04-14 12:13:09 +02:00
Christopher Faulet	7faac7cf34	MINOR: tree-wide: Simplifiy some tests on SHUT flags by accessing SCs directly At many places, we simplify the tests on SHUT flags to remove calls to chn_prod() or chn_cons() function because the corresponding SC is available.	2023-04-05 08:57:06 +02:00
Christopher Faulet	87633c3a11	MEDIUM: tree-wide: Move flags about shut from the channel to the SC The purpose of this patch is only a one-to-one replacement, as far as possible. CF_SHUTR(_NOW) and CF_SHUTW(_NOW) flags are now carried by the stream-connecter. CF_ prefix is replaced by SC_FL_ one. Of course, it is not so simple because at many places, we were testing if a channel was shut for reads and writes in same time. To do the same, shut for reads must be tested on one side on the SC and shut for writes on the other side on the opposite SC. A special care was taken with process_stream(). flags of SCs must be saved to be able to detect changes, just like for the channels.	2023-04-05 08:57:06 +02:00
Christopher Faulet	68ef218a72	MINOR: stconn/channel: Move CF_NEVER_WAIT into the SC and rename it The channel flag CF_NEVER_WAIT is renamed to SC_FL_SND_NEVERWAIT and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	9a790f63ed	MINOR: stconn/channel: Move CF_READ_DONTWAIT into the SC and rename it The channel flag CF_READ_DONTWAIT is renamed to SC_FL_RCV_ONCE and moved into the stream-connector.	2023-04-05 08:57:05 +02:00
Christopher Faulet	947b2e5922	BUG/MINOR: stream: Fix test on channels flags to set clientfin/serverfin touts There is a bug in a way the channels flags are checked to set clientfin or serverfin timeout. Indeed, to set the clientfin timeout, the request channel must be shut for reads (CF_SHUTR) or the response channel must be shut for writes (CF_SHUTW). As the opposite, the serverfin timeout must be set when the request channel is shut for writes (CF_SHUTW) or the response channel is shut for reads (CF_SHUTR). It is a 2.8-dev specific issue. No backport needed.	2023-04-05 08:46:06 +02:00
Willy Tarreau	a45e7e81ec	BUG/MEDIUM: stream: do not try to free a failed stream-conn In stream_free() if we fail to allocate s->scb() we go to the path where we try to free it, and it doesn't like being called with a null at all. It's easily reproducible with -dMfail,no-cache and "tune.fail-alloc 10" in the global section. This must be backported to 2.6.	2023-03-20 19:58:38 +01:00
Christopher Faulet	e758b5c703	MEDIUM: stream: Eventually handle stream timeouts when exiting process_stream() When we exit from process_stream(), if the task is expired, we try to handle the stream timeouts and we resync the stream-connectors. This avoids a useless immediate wakeup. It is not really an issue, but it is a small improvement in edge cases.	2023-02-27 17:45:45 +01:00
Christopher Faulet	85e568f594	MINOR: stream: Handle stream's timeouts in a dedicated function This will be mandatory to be able to handle stream's timeouts before exiting process_stream(). So, to not duplicate code, all this stuff is moved in a dedicated function.	2023-02-27 17:45:45 +01:00
Christopher Faulet	3bbd2baab3	BUG/MINOR: stream: Remove BUG_ON about the task expiration in process_stream() At the end of process_stream(), A BUG_ON was recently added to abort if we leave the function with an expired task. However, it may happen if an event prevents the timeout to be handled but nothing evolved. In this case, the task expiration is not updated and we expect to catch the timeout on the immediate task wakeup. No backport needed.	2023-02-27 17:45:45 +01:00
Christopher Faulet	be5cc766b0	MINOR: stconn: Remove half-closed timeout The half-closed timeout is now directly retrieved from the proxy settings. There is no longer usage for the .hcto field in the stconn structure. So let's remove it.	2023-02-22 15:59:16 +01:00
Christopher Faulet	15315d6c0a	CLEANUP: stconn: Remove old read and write expiration dates Old read and write expiration dates are no longer used. Thus we can safely remove them.	2023-02-22 15:59:16 +01:00
Christopher Faulet	80e4532105	MINOR: stream: Use relative expiration date in trace messages Expiration dates in trace messages are now relative to now_ms. It is fairly easier to read traces this way. And an expired date is now negative. So, it is also easy to detect when a timeout was reached.	2023-02-22 15:59:16 +01:00
Christopher Faulet	03d5e62e13	MINOR: stream: Report rex/wex value using the sedesc date in trace messages Becasue read and write timeout are now detected using .lra and .fsb fields of the stream-endpoint descriptor, it is better to also use these fields to report read and write expiration date in trace messages. Especially because old rex and wex fields will be removed.	2023-02-22 15:59:16 +01:00
Christopher Faulet	6e59e871bf	MINOR: stream: Dump the task expiration date in trace messages The expiration date of the stream's task is now diplayed in the trace messages. This will help to track changes in the stream traces.	2023-02-22 15:59:16 +01:00
Christopher Faulet	b374ba563a	MAJOR: stream: Use SE descriptor date to detect read/write timeouts We stop to use the channel's expiration dates to detect read and write timeouts on the channels. We now rely on the stream-endpoint descriptor to do so. All the stuff is handled in process_stream(). The stream relies on 2 helper functions to know if the receives or sends may expire: sc_rcv_may_expire() and sc_snd_may_expire().	2023-02-22 15:57:16 +01:00
Christopher Faulet	4c13568b49	MEDIUM: stconn: Add two date to track successful reads and blocked sends The stream endpoint descriptor now owns two date, lra (last read activity) and fsb (first send blocked). The first one is updated every time a read activity is reported, including data received from the endpoint, successful connect, end of input and shutdown for reads. A read activity is also reported when receives are unblocked. It will be used to detect read timeouts. The other one is updated when no data can be sent to the endpoint and reset when some data are sent. It is the date of the first send blocked by the endpoint. It will be used to detect write timeouts. Helper functions are added to report read/send activity and to retrieve lra/fsb date.	2023-02-22 14:52:15 +01:00
Christopher Faulet	5aaacfbccd	MEDIUM: stconn: Replace read and write timeouts by a unique I/O timeout Read and write timeouts (.rto and .wto) are now replaced by an unique timeout, call .ioto. Since the recent refactoring on channel's timeouts, both use the same value, the client timeout on client side and the server timeout on the server side. Thus, this part may be simplified. Now it represents the I/O timeout.	2023-02-22 14:52:15 +01:00
Christopher Faulet	f8413cba2a	MEDIUM: channel/stconn: Move rex/wex timer from the channel to the sedesc These timers are related to the I/O. Thus it is logical to move them into the SE descriptor. The patch is a bit huge but it is just a replacement. However it is error-prone. From the stconn or the stream, helper functions are used to get, set or reset these timers. This simplify the timers manipulations.	2023-02-22 14:52:15 +01:00
Christopher Faulet	ed7e66fe1a	MINOR: channel/stconn: Move rto/wto from the channel to the stconn Read and write timeouts concerns the I/O. Thus, it is logical to move it into the stconn. At the end, the stream is responsible to detect the timeouts. So it is logcial to have these values in the stconn and not in the SE descriptor. But it may change depending on the recfactoring. So, now: * scf->rto is used instead of req->rto * scf->wto is used instead of res->wto * scb->rto is used instead of res->rto * scb->wto is used instead of req->wto	2023-02-22 14:52:15 +01:00
Christopher Faulet	6362934793	DEBUG: stream/trace: Add sedesc flags in trace messages In trace messages, when info about stream-connector are dumped, the stream-endpoint descriptor flags are now also dumped.	2023-02-22 14:52:15 +01:00
Christopher Faulet	2e56a73459	MAJOR: channel: Remove flags to report READ or WRITE errors This patch removes CF_READ_ERROR and CF_WRITE_ERROR flags. We now rely on SE_FL_ERR_PENDING and SE_FL_ERROR flags. SE_FL_ERR_PENDING is used for write errors and SE_FL_ERROR for read or unrecoverable errors. When a connection error is reported, SE_FL_ERROR and SE_FL_EOS are now set and a read event and a write event are reported to be sure the stream will properly process the error. At the stream-connector level, it is similar. When an error is reported during a send, a write event is triggered. On the read side, nothing more is performed because an error at this stage is enough to wake the stream up. A major change is brought with this patch. We stop to check flags of the ooposite channel to report abort or timeout. It also means when an read or write error is reported on a side, we no longer update the other side. Thus a read error on the server side does no long lead to a write error on the client side. This should ease errors report.	2023-02-22 14:52:15 +01:00
Christopher Faulet	81fdeb8ce2	MEDIUM: channel: Remove CF_READ_NOEXP flag This flag was introduced in 1.3 to fix a design issue. It was untouch since then but there is no reason to still have this trick. Note it could be good to review what happens in HTTP with the server is waiting for the end of the request. It could be good to be sure a client timeout is always reported.	2023-02-22 14:52:14 +01:00

1 2 3 4 5 ...

682 Commits