1031 Commits

Author SHA1 Message Date
Willy Tarreau
250b630fb9 BUG/MINOR: mux-h2: commit the current stream ID even on reject
The H2 spec says that a HEADERS frame turns an idle stream to the open
state, and it may then turn to half-closed(remote) on ES, then to close,
all at once, if we respond with RST (e.g. on error). Due to the fact that
we process a complete frame at once since h2_dec_hdrs() may reassemble
CONTINUATION frames until everything is complete, the state was only
committed after the frame was completely valid (otherwise multiple passes
could result in subsequent frames being rejected as the stream ID would
be equal to the highest one).

However this is not correct because it means that a client may retry on
the same ID as a previously failed one, which technically is forbidden
(for example the client couldn't know which of them a WINDOW_UPDATE or
RST_STREAM frame is for).

In practice, due to the error paths, this would only be possible when
failing to decode HPACK while leaving the HPACK stream intact, thus
when the valid decoded HPACK stream cannot be turned into a valid HTTP
representation, e.g. when the resulting headers are too large.
The solution to avoid this consists in committing the stream ID on this
error path as well. h2spec continues to be happy.

Thanks to Annika Wickert and Tim Windelschmidt for reporting this issue.

This fix must be backported to all stable versions.
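
As a minimal sketch of the idea (not the actual mux-h2 code; the names
sketch_h2c, max_id and sketch_accept_headers are hypothetical), the highest
stream ID can be committed before the request is validated, so that a retry
on the same ID is refused even after a reject:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified connection state: only the highest stream ID
 * seen so far is tracked. */
struct sketch_h2c {
    int32_t max_id;
};

enum sketch_result { SK_ACCEPTED, SK_REJECTED, SK_PROTO_ERROR };

/* Handles a complete HEADERS frame for stream <id>. <valid> stands for
 * "the decoded headers could be turned into a valid HTTP request". */
static enum sketch_result sketch_accept_headers(struct sketch_h2c *h2c,
                                                int32_t id, bool valid)
{
    assert(h2c != NULL);

    /* A client may not reuse an ID that was already consumed. */
    if (id <= h2c->max_id)
        return SK_PROTO_ERROR;

    /* Commit the ID first so that it is consumed on the reject path too. */
    h2c->max_id = id;

    return valid ? SK_ACCEPTED : SK_REJECTED;
}
```

With this ordering, a rejected request on stream 1 still advances max_id,
so a second HEADERS frame on stream 1 is treated as a protocol error.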
2023-10-20 21:09:12 +02:00
Willy Tarreau
08f3bb5bd5 MINOR: mux-h2/traces: clarify the "rejected H2 request" event
In h2_frt_handle_headers() all failures lead to a generic message saying
"rejected H2 request". It's quite inexpressive while there are a few
distinct tests that are made before jumping there:

  - trailers on closed stream
  - unparsable request
  - refused stream

Let's emit the traces from these call points instead so that we get more
info about what happened. Since these are user-level messages, we take
care of keeping them aligned as much as possible.

For example before it would say:

  [04|h2|1|mux_h2.c:2859] rejected H2 request : h2c=0x7f5d58036fd0(F,FRE)
  [04|h2|5|mux_h2.c:2860] h2c_frt_handle_headers(): leaving on error : h2c=0x7f5d58036fd0(F,FRE) dsi=1 h2s=0x9fdb60(0,CLO)

And now it says:

  [04|h2|1|mux_h2.c:2817] rcvd unparsable H2 request : h2c=0x7f55f8037160(F,FRH) dsi=1 h2s=CLO
  [04|h2|5|mux_h2.c:2875] h2c_frt_handle_headers(): leaving on error : h2c=0x7f55f8037160(F,FRE) dsi=1 h2s=CLO
2023-10-20 21:09:12 +02:00
Willy Tarreau
1deac6f99a MINOR: mux-h2/traces: explicitly show the error/refused stream states
Sometimes it's unclear whether a stream is still open or closed when
certain traces are emitted, for example when the stream was refused,
because the reported pointer and ID in fact correspond to the refused
stream. And for closed streams, no pointer/name is printed, leaving
some confusion about the state. This patch makes the situation easier
to analyse by explicitly reporting "h2s=CLO" on closed/error/refused
streams so that we don't waste time comparing pointers and we instantly
know the stream is closed. Now instead of emitting:

   [03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=0x9fdb60(0,CLO)

It will emit:

   [03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=CLO
2023-10-20 21:09:12 +02:00
Willy Tarreau
3dd963b35f BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again
Stefan Behte reported that since commit f279a2f14 ("BUG/MINOR: mux-h2:
refresh the idle_timer when the mux is empty"), the http-request and
http-keep-alive timeouts don't work anymore on H2. Before this patch,
and since 3e448b9b64 ("BUG/MEDIUM: mux-h2: make sure control frames do
not refresh the idle timeout"), they would only be refreshed after stream
frames were sent (HEADERS or DATA) but the patch above that adds more
refresh points broke these so they don't expire anymore as long as
there's some activity.

We cannot just revert the fix since it also addressed an issue by which
sometimes the timeout would trigger too early and provoke truncated
responses. The right approach here is in fact to only refresh the
idle timer when the mux buffer was flushed from any such stream frames.

In order to achieve this, we're now setting a flag on the connection
whenever we write a stream frame, and we consider that flag when deciding
to refresh the timer after the buffer is emptied. This way we'll only clear that
flag once the buffer is empty and there were stream data in it, not if
there were no such stream data. In theory it remains possible to leave
the flag on if some control data is appended after the buffer and it's
never cleared, but in practice it's not a problem as a buffer will always
get sent in large blocks when the window opens. Even a large buffer should
be emptied once in a while as control frames will not fill it as much as
data frames could.
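
The flag-based refresh described above can be sketched as follows. This is
an illustration under assumptions, not the mux-h2 implementation: the
structure, the flag name SK_CF_STREAM_DATA and both functions are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: a HEADERS or DATA frame sits in the mux buffer. */
#define SK_CF_STREAM_DATA 0x1u

struct sketch_conn {
    uint32_t flags;
    size_t   mbuf_len;    /* bytes pending in the mux buffer */
    uint64_t idle_start;  /* base timestamp for the idle timeouts */
};

/* Queues <len> bytes; <is_stream_frame> is non-zero for HEADERS or DATA. */
static void sketch_queue_frame(struct sketch_conn *c, size_t len,
                               int is_stream_frame)
{
    assert(c != NULL);
    c->mbuf_len += len;
    if (is_stream_frame)
        c->flags |= SK_CF_STREAM_DATA;
}

/* Called once the whole buffer was sent: refresh the idle timer only if
 * stream frames were flushed, then clear the flag. */
static void sketch_buffer_flushed(struct sketch_conn *c, uint64_t now)
{
    assert(c != NULL);
    c->mbuf_len = 0;
    if (c->flags & SK_CF_STREAM_DATA) {
        c->idle_start = now;
        c->flags &= ~SK_CF_STREAM_DATA;
    }
}
```

Flushing a buffer that only held control frames leaves idle_start
untouched, which is what lets the http-request and http-keep-alive
timeouts expire despite control-frame activity.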

Given the patch above was backported as far as 2.6, this patch should
also be backported as far as 2.6.
2023-10-18 17:17:58 +02:00
Christopher Faulet
203211f4cb REORG: stconn/muxes: Rename init step in fast-forwarding
Instead of speaking of an initialisation stage for each data
fast-forwarding, we now use the negotiate term. Thus init_ff/init_fastfwd
functions were renamed nego_ff/nego_fastfwd.
2023-10-18 12:46:55 +02:00
Christopher Faulet
11c05c516a MEDIUM: mux-h2: Add consumer-side fast-forwarding support
The H2 multiplexer now implements callbacks to consume fast-forwarded
data. It is the most useful case: A H2 client getting data from a H1
server. It is also the easiest case to implement. The producer side is
trickier because of multiplexing. It is not obvious this case would be
improved with data fast-forwarding.
2023-10-17 18:51:13 +02:00
Christopher Faulet
1fdfa4f9ba BUG/MEDIUM: mux-h2: Don't report an error on shutr if a shutw is pending
If a shutw is blocked because the mux is full or busy, we must defer the
shutr. In this case, the H2 stream is not in H2_SS_CLOSED state because the
shutw is also deferred. If the shutr is performed, this will lead to an
error.

Concretely, when the mux is unblocked, a RST_STREAM is sent while in some
cases, an empty DATA frame with ES flag set could be sent.

This patch should be backported to all stable versions.
2023-10-17 18:51:13 +02:00
Willy Tarreau
68d02e5fa9 BUG/MINOR: mux-h2: make up other blocked streams upon removal from list
An interesting issue was met when testing the mux-to-mux forwarding code.

In order to preserve fairness, in h2_snd_buf() if other streams are waiting
in send_list or fctl_list, the stream that is attempting to send also goes
to its list, and will be woken up by h2_process_mux() or h2_send() when
some space is released. But on rare occasions, there are only a few (or
even a single) streams waiting in this list, and these streams are just
quickly removed because of a timeout or a quick h2_detach() that calls
h2s_destroy(). In this case there's no event to wake up the other waiting
stream in its list, and this will possibly resume processing after some
client WINDOW_UPDATE frames or even new streams, so usually it doesn't
last too long and is not very noticeable, which is why it was left that
long. In addition, measures have shown that in heavy network-bound
benchmark, this exact situation happens on less than 1% of the streams
(reached 4% with mux-mux).

The fix here consists in replacing these LIST_DEL_INIT() calls on
h2s->list with a function call that checks if other streams were queued
to the send_list recently, and if so, which also tries to resume them
by calling h2_resume_each_sending_h2s(). The detection of late additions
is made via a new flag on the connection, H2_CF_WAIT_INLIST, which is set
when a stream is queued due to other streams being present, and which is
cleared when this function is called.

It is particularly difficult to reproduce this case, which is highly
timing-dependent, but in a constrained environment, a test involving 32
conns of 20 streams each, all downloading a 10 MB object previously
showed a limitation of 17 Gbps with lots of idle CPU time, and now
filled the cable at 25 Gbps.

This should be backported to all versions where it applies.
2023-10-17 16:43:44 +02:00
Willy Tarreau
5798b5bb14 BUG/MAJOR: connection: make sure to always remove a connection from the tree
Since commit 5afcb686b ("MAJOR: connection: purge idle conn by last usage")
in 2.9-dev4, the test on conn->toremove_list added to conn_get_idle_flag()
in 2.8 by commit 3a7b539b1 ("BUG/MEDIUM: connection: Preserve flags when a
conn is removed from an idle list") becomes misleading. Indeed, now both
toremove_list and idle_list are shared by a union since the presence in
these lists is mutually exclusive. However, in conn_get_idle_flag() we
check for the presence in the toremove_list to decide whether or not to
delete the connection from the tree. This test now fails because instead
it sees the presence in the idle or safe list via the union, and concludes
the element must not be removed. Thus the element remains in the tree and
can be found later after the connection is released, causing crashes that
Tristan reported in issue #2292.
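
The aliasing effect can be reproduced in isolation. The sketch below is an
assumption-laden illustration, not HAProxy's code: the list helpers and the
sk_conn structure are hypothetical, and only the union layout mirrors the
description above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly-linked list element: when it is not in any list, it
 * points to itself. */
struct sk_list {
    struct sk_list *n, *p;
};

static void sk_list_init(struct sk_list *l) { l->n = l->p = l; }
static int  sk_list_inlist(const struct sk_list *l) { return l->n != l; }

static void sk_list_append(struct sk_list *head, struct sk_list *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}

/* Hypothetical connection: both list elements share the same storage
 * because a connection is never in both lists at once. */
struct sk_conn {
    union {
        struct sk_list idle_list;
        struct sk_list toremove_list;
    };
};

/* The misleading test: meant to ask "is it queued for removal?", but it
 * is also true when the connection merely sits in the idle list. */
static int sk_looks_queued_for_removal(const struct sk_conn *c)
{
    assert(c != NULL);
    return sk_list_inlist(&c->toremove_list);
}
```

Once the connection is appended to an idle list through idle_list, the test
on toremove_list reports true as well, since both names designate the same
memory.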

The following config is sufficient to reproduce it with 2 threads:

   defaults
        mode http
        timeout client 5s
        timeout server 5s
        timeout connect 1s

   listen front
        bind :8001
        server next 127.0.0.1:8002

   frontend next
        bind :8002
        timeout http-keep-alive 1
        http-request redirect location /

Sending traffic with a few concurrent connections and some short timeouts
suffices to instantly crash it after ~10k reqs:

   $ h2load -t 4 -c 16 -n 10000 -m 1 -w 1 http://0:8001/

With Amaury we analyzed the conditions in which the function is called
in order to figure a better condition for the test and concluded that
->toremove_list is never filled there so we can safely remove that part
from the test and just move the flag retrieval back to what it was prior
to the 2.8 patch above. Note that the patch is not reverted though, as
the parts that would drop the unexpected flags removal are unchanged.

This patch must NOT be backported. The code in 2.8 works correctly, it's
only the change in 2.9 that makes it misbehave.
2023-10-12 14:20:03 +02:00
Amaury Denoyelle
337c71423f MINOR: connection: define mux flag for reverse support
Add a new MUX flag MX_FL_REVERSABLE. This value is used to indicate that
MUX instance supports connection reversal. For the moment, only HTTP/2
multiplexer is flagged with it.

This makes it possible to dynamically check if reversal can be completed
during MUX installation. This will allow relaxing the requirement on config
writing for 'tcp-request session attach-srv' which currently cannot be used
mixed with non-http/2 listener instances, even if used conditionally with an
ACL.
2023-09-29 18:09:08 +02:00
Christopher Faulet
89e20033c7 BUG/MAJOR: mux-h2: Report a protocol error for any DATA frame before headers
If any DATA frame is received before all headers are fully received, a
protocol error must be reported. It is required by the HTTP/2 RFC but it is
also important because the HTTP analyzers expect the first HTX block is a
start-line. It leads to a crash if this statement is not respected.

For instance, it is possible to trigger a crash by sending an interim
message with a DATA frame (It may be an empty DATA frame with the ES
flag). AFAIK, only the server side is affected by this bug.

To fix the issue, a protocol error is reported for the stream.
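
A minimal sketch of such a check, assuming a hypothetical per-stream field
that records whether the final (non-interim) headers were fully received.
None of these names come from the mux-h2 code.

```c
#include <assert.h>
#include <stddef.h>

enum sk_result { SK_OK, SK_PROTOCOL_ERROR };

/* Hypothetical stream state: set once the final (non-interim) headers
 * were fully received. */
struct sk_stream {
    int final_headers_rcvd;
};

/* A complete HEADERS block arrived; interim (1xx) messages are assumed
 * not to count as the final headers. */
static enum sk_result sk_on_headers(struct sk_stream *s, int is_interim)
{
    assert(s != NULL);
    if (!is_interim)
        s->final_headers_rcvd = 1;
    return SK_OK;
}

/* A DATA frame arrived, possibly empty with ES set: reject it unless the
 * headers were fully received first. */
static enum sk_result sk_on_data(const struct sk_stream *s)
{
    assert(s != NULL);
    return s->final_headers_rcvd ? SK_OK : SK_PROTOCOL_ERROR;
}
```

In this model an interim message followed by a DATA frame yields a protocol
error instead of letting a payload reach the analyzers without a
start-line.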

This patch should fix the issue #2291. It must be backported as far as 2.2
(and probably to 2.0 too).
2023-09-14 11:39:39 +02:00
Willy Tarreau
a7b9baa2cc BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error
If the connection is closed in h2_release(), which is indicated by ret<0, we
must not dereference conn anymore. This was introduced in 2.9-dev4 by commit
5053e8914 ("MEDIUM: h2: prevent stream opening before connection reverse
completed") and detected after a few hours of runtime thanks to running with
pool integrity checks and caller enabled. No backport is needed.
2023-08-26 17:05:19 +02:00
Amaury Denoyelle
61fc9568fb MINOR: server: move idle tree insert in a dedicated function
Define a new function _srv_add_idle(). This is a simple wrapper to
insert a connection in the server idle tree. This is reserved for simple
usage and requires the idle_conns lock. In most cases,
srv_add_to_idle_list() should be used.

This patch does not have any functional change. However, it will help
with the next patch as idle connection will be always inserted in a list
as secondary storage along with idle/safe trees.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
77ac8eb4a6 MINOR: connection: simplify removal of idle conns from their trees
Small change of API for conn_delete_from_tree(). Now the connection
instance is taken as argument instead of its inner node.

No functional change introduced with this commit. This simplifies
slightly invocation of conn_delete_from_tree(). The most useful change
is that this function will be extended in the next patch to be able to
remove the connection from its new idle list at the same time as in its
idle tree.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
6bd994d5d7 BUG/MINOR: h2: fix reverse if no timeout defined
h2c.task is not allocated in h2_init() if timeout client/server is not
defined depending on the connection side. This caused crash on
connection reverse due to systematic requeuing of h2c.task in
h2_conn_reverse().

To fix this, check h2c.task in h2_conn_reverse(). If old timeout was
undefined but new one is, h2c.task must be allocated as it was not in
h2_init(). On the opposite situation, if old timeout was defined and new
one is not, h2c.task is freed. In this case, or if neither timeout is
defined, skip the task requeuing.

This bug is easily reproduced by using reverse bind or server with
undefined timeout client/server depending on the connection reverse
direction.

This bug has been introduced by reverse connect support.
No need to backport it.
2023-08-24 17:58:14 +02:00
Amaury Denoyelle
5053e89142 MEDIUM: h2: prevent stream opening before connection reverse completed
HTTP/2 demux must be handled with care for active reverse connection.
Until accept has been completed, it should be forbidden to handle
HEADERS frame as session is not yet ready to handle streams.

To implement this, use the flag H2_CF_DEM_TOOMANY which blocks demux
process. This flag is automatically set just after conn_reverse()
invocation. The flag is removed on rev_accept_conn() callback via a new
H2 ctl enum. H2 tasklet is woken up to restart demux process.

As a side-effect, reporting in H2 mux may be blocked as demux functions
are used to convert error status at the connection level with
CO_FL_ERROR. To ensure error is reported for a reverse connection, check
h2c_is_dead() specifically for this case in h2_wake(). This change also
has its own side-effect : h2c_is_dead() conditions have been adjusted to
always exclude !h2c->conn->owner condition which is always true for
reverse connection or else H2 mux may kill them unexpectedly.
2023-08-24 17:03:08 +02:00
Amaury Denoyelle
6820b9b393 MEDIUM: h2: implement active connection reversal
Implement active reverse on h2_conn_reverse().

Only minimal steps are done here : HTTP version session counters are
incremented on the listener instance. Also, the connection is inserted
in the mux_stopping_list to ensure it will be actively closed on process
shutdown/listener suspend.
2023-08-24 17:03:08 +02:00
Amaury Denoyelle
b781a1bb09 MINOR: connection: prepare init code paths for active reverse
When an active reverse connection is initialized, it has no stream-conn
attached to it contrary to other backend connections. This requires adding
an extra check on stream existence in conn_create_mux() and h2_init().

There are also extra checks required for session_accept_fd() after
reverse and accept is done. This is because contrary to other frontend
connections, reversed connections have already initialized their mux and
transport layers. This forces us to skip the majority of
session_accept_fd() initialization part.

Finally, if session_accept_fd() is interrupted due to an early error, a
reverse connection cannot be freed directly or else mux will remain
alone. Instead, the mux destroy callback is used to free all connection
elements properly.
2023-08-24 17:02:37 +02:00
Amaury Denoyelle
4fb538d4b6 MEDIUM: h2: reverse connection after SETTINGS reception
Reverse connection after SETTINGS reception if it was set as reversable.
This operation is done in a new function h2_conn_reverse(). It regroups
common changes which are needed for both reversal directions :
H2_CF_IS_BACK is set or unset and timeouts are inverted.

For the moment, only passive reverse is fully implemented. Once done,
the connection instance is directly inserted in its targeted server
pool. It can then be used immediately for future transfers using this
server.
2023-08-24 14:49:03 +02:00
Willy Tarreau
d93a00861d MINOR: h2: pass accept-invalid-http-request down the request parser
We're adding a new argument "relaxed" to h2_make_htx_request() so that
we can control its level of acceptance of certain invalid requests at
the proxy level with "option accept-invalid-http-request". The goal
will be to add deactivable checks that are still desirable to have by
default. For now no test is subject to it.
2023-08-08 19:10:54 +02:00
Willy Tarreau
db97bb42d9 MINOR: mux-h2/traces: also suggest invalid header upon parsing error
Historically the parsing error used to apply only to too large headers,
so this is what has been reported in traces. But nowadays we can also
reject invalid characters, and when this happens the trace is a bit
misleading, so let's mention "or invalid".
2023-08-08 19:02:24 +02:00
Christopher Faulet
ef2b15998c BUG/MINOR: htx/mux-h1: Properly handle bodyless responses when splicing is used
There is a mechanism in the H1 and H2 multiplexers to skip the payload when
a response is returned to the client when it must not contain any payload
(response to a HEAD request or a 204/304 response). However, this does not
work when the splicing is used. The H2 multiplexer does not support the
splicing, so there is no issue. But with the mux-h1, when data are sent
using the kernel splicing, the mux on the server side is not aware the
client side should skip the payload. And once the data are put in a pipe,
there is no way to stop the sending.

It is a defect of the current design. It will be easier to deal with this
case once the mux-to-mux forwarding is implemented. But for now, to fix
the issue, we should add an HTX flag on the start-line to pass the info from
the client side to the server side and be able to disable the splicing if
necessary.

The associated reg-test was improved to be sure it does not fail when the
splicing is configured.

This patch should be backported as far as 2.4.
2023-08-02 12:05:05 +02:00
Willy Tarreau
f279a2f148 BUG/MINOR: mux-h2: refresh the idle_timer when the mux is empty
There's a rare case where on long fat pipes, we can see the keep-alive
timeout trigger before the end of the transfer of the last large object,
and the connection closed a bit quickly after the end of the transfer
because a GOAWAY is queued. The data are not destroyed, except that
the WINDOW_UPDATES from the client arriving late while the last data
are being drained by the socket buffers may at some point trigger a
reset, and some clients might choke a bit too early on these. Let's
make sure we only arm the idle_start timestamp once the output buffer
is empty. Of course it will still not cover for the data pending in the
socket buffers but it will at least let those in the buffer leave in
peace. More elaborate options can be used to protect the data in the
kernel buffers, such as the one described in GH issue #5.

It's very likely that this old issue was emphasized by the following
commit in 2.6:
  15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts")

and the behavior probably changed again with this one in 2.8, which
was backported to 2.7 and scheduled for 2.6:
  d38d8c6cc ("BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout")

As such this patch should be backported to 2.6 after some observation
period.
2023-05-31 10:45:30 +02:00
Christopher Faulet
c2f1d0ee5e BUG/MEDIUM: mux-h2: Propagate termination flags when frontend SC is created
We must evaluate if EOS/EOI/ERR_PENDING/ERROR flags must be set on the SE
when the frontend SC is created because the rxbuf is transferred to the
stream at this stage. It means the call to h2_rcv_buf() may be skipped in
some circumstances.

And indeed, it happens when HAProxy quickly replies, for instance because of
a deny rule. In this case, depending on the scheduling, the abort may block
the receive attempt from the SC. In this case if SE flags were not properly
set earlier, there is no way to terminate the request and the session may be
frozen.

For now, I can't explain why there is no timeout when this happens but it
remains an issue because here we should not rely on timeouts to close the
stream.

This patch relies on following commits:

    * MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE
    * MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame

The issue was encountered on the 2.8 but it seems the bug exists since the
2.4. But it is probably a good idea to backport the series to 2.7 only
and wait for a bug report on earlier versions.

This patch should solve the issue #2147.
2023-05-24 16:06:11 +02:00
Christopher Faulet
531dd050ff MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE
The function h2s_propagate_term_flags() was added to check the H2S state and
evaluate when EOI/EOS/ERR_PENDING/ERROR flags must be set on the SE. It is
not the only place where those flags are set. But it centralizes the synchro
between the H2 stream and the SC.

For now, this function is only used at the end of h2_rcv_buf(). But it will
be used to fix a bug.
2023-05-24 16:06:11 +02:00
Christopher Faulet
1a60a66306 MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame
The flag H2_SF_ES_RCVD is set on the H2 stream when the ES flag is found in
a frame. On HEADERS frame, it was set in function processing the frame. It
is moved in the function decoding the frame. Fundamentally, this changes
nothing. But it will be useful to have this information earlier when a
client H2 stream is created.
2023-05-24 16:06:11 +02:00
Christopher Faulet
78b1eb2b04 BUG/MINOR: mux-h2: Check H2_SF_BODY_TUNNEL on H2S flags and not demux frame ones
In h2c_frt_stream_new(), the H2_SF_BODY_TUNNEL flag was tested on demux frame
flags (h2c->dff) instead of the h2s flags.  By chance, it is a noop test
because H2_SF_BODY_TUNNEL value, once converted to an int8_t, is 0.
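
The reason such a test can never match is easy to check in isolation. The
sketch below uses a hypothetical flag value (the real one is not quoted
here) and an unsigned 8-bit type so that the narrowing is well defined; any
stream flag whose only set bit lies above bit 7 behaves the same way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value for illustration only. */
#define SK_SF_BODY_TUNNEL 0x00020000u

/* Compile-time check that the flag has no bit in the low byte. */
static_assert((SK_SF_BODY_TUNNEL & 0xffu) == 0,
              "flag has no bit in the low byte");

/* Returns the part of a 32-bit stream flag that can match an 8-bit
 * frame-flags field. */
static uint8_t sk_low_byte(uint32_t flag)
{
    return (uint8_t)(flag & 0xffu);
}

/* Tests a stream flag against 8-bit frame flags, in the spirit of the
 * faulty test. */
static int sk_test_on_frame_flags(uint8_t frame_flags, uint32_t flag)
{
    return (frame_flags & sk_low_byte(flag)) != 0;
}
```

Whatever the frame flags are, the test evaluates to false, which is why the
mistake had no visible effect.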

It is a 2.8-specific issue. No backport needed.
2023-05-24 16:06:11 +02:00
Willy Tarreau
d38d8c6ccb BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout
Christopher found as part of the analysis of Tim's issue #1891 that commit
15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive
timeouts") introduced in 2.6 incompletely addressed a timeout issue in the
H2 mux. The problem was that the http-keepalive and http-request timeouts
were not applied before it. With that commit they are now considered, but
if a GOAWAY is sent (or even attempted to be sent), then they are not used
anymore again, because the way the code is arranged consists in applying
the client-fin timeout (if set) to the current date, and falling back to
the client timeout, without considering the idle_start period. This means
that a config having a "timeout http-keepalive" would still not close the
connection quickly when facing a client that periodically sends PING,
PRIORITY or whatever other frame types.

In addition, after the GOAWAY was attempted to be sent, there was no check
for pending data in the output buffer, meaning that it would be possible
to truncate some responses in configs involving a very short client-fin
timeout.

Finally the spreading of the closures during the soft-stop brought in 2.6
by commit b5d968d9b ("MEDIUM: global: Add a "close-spread-time" option to
spread soft-stop on time window") didn't consider the particular case of
an idle "pre-connect" connection, which would also live long if a browser
failed to deliver a valid request for a long time.

All of this indicates that the conditions must be reworked so as not to
have that level of exclusion between conditions, but rather stick to the
rules from the doc that are already enforced on other muxes:
  - timeout client always applies if there are data pending, and is
    relative to each new I/O ;
  - timeout http-request applies before the first complete request and
    is relative to the entry in idle state ;
  - timeout http-keepalive applies between idle and the next complete
    request and is relative to the entry in idle state ;
  - timeout client-fin applies when in idle after a shut was sent (here
    the shut is the GOAWAY). The shut may only be considered as sent if
    the buffer is empty and the flags indicate that it was successfully
    sent (or failed) but not if it's still waiting for some room in the
    output buffer for example. This implies that this timeout may then
    lower the http-keepalive/http-request ones.

This is what this patch implements. Of course the client timeout still
applies as a fallback when all the ones above are not set or when their
conditions are not met.

It would seem reasonable to backport this to 2.7 first, then only after
one or two releases to 2.6.
2023-05-15 12:01:20 +02:00
Amaury Denoyelle
25cf19d5c8 MINOR: htx: add function to set EOM reliably
Implement a new HTX utility function htx_set_eom(). If the HTX message
is empty, it will first add a dummy EOT block. This is a small trick
needed to ensure readers will detect the HTX buffer as not empty and
retrieve the EOM flag.
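
The trick can be sketched on a toy message type. Everything below is
hypothetical (structure, flag and function names) and heavily simplified
compared to real HTX messages; it only shows the "add a dummy block before
setting the flag on an empty message" idea.

```c
#include <assert.h>
#include <stddef.h>

#define SK_FL_EOM 0x1u  /* hypothetical "end of message" flag */

enum sk_blk_type { SK_BLK_DATA, SK_BLK_EOT };

/* Hypothetical message: a block counter, the type of the last block and
 * a flags field. */
struct sk_msg {
    unsigned int     nb_blocks;
    enum sk_blk_type last_type;
    unsigned int     flags;
};

static int sk_msg_is_empty(const struct sk_msg *m)
{
    return m->nb_blocks == 0;
}

/* Sets EOM; an empty message first gets a dummy EOT block so that a
 * reader does not skip it as empty and miss the flag. */
static void sk_set_eom(struct sk_msg *m)
{
    assert(m != NULL);
    if (sk_msg_is_empty(m)) {
        m->nb_blocks = 1;
        m->last_type = SK_BLK_EOT;
    }
    m->flags |= SK_FL_EOM;
}
```

A message that already holds blocks is left unchanged apart from the flag.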

Replace the related H2 code with a htx_set_eom() invocation. QUIC also has
the same code which will be replaced in the next commit.

This should be backported up to 2.7 before the related QUIC patch.
2023-05-12 15:29:28 +02:00
Christopher Faulet
34f81d5815 BUG/MINOR: mux-h2: Also expect data when waiting for a tunnel establishment
When a client H2 stream is waiting for a tunnel establishment, it must state
it expects data from server. It is the second fix that should fix
regressions of the commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from
server as long as request is unfinished")

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:58:33 +02:00
Christopher Faulet
4403cdf653 BUG/MEDIUM: mux-h2: Properly handle end of request to expect data from server
The commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from server as long
as request is unfinished") introduced a regression in the H2 multiplexer.
The end of the request is not systematically handled to state a H2 stream on
client side now expects data from the server.

Indeed, while the client is uploading its request, the H2 stream warns it
does not expect data from the server. This way, no server timeout is applied
at this stage. When end of the request is detected, the H2 stream must state
it now expects the server response. This enables the server timeout.

However, it was only performed at one place while the end of the request can
be handled at different places. First, during a zero-copy in
h2_rcv_buf(). Then, when the SC is created with the full request. Because of
this bug, it is possible to totally disable the server timeout for H2
streams.

In h2_rcv_buf(), we now rely on h2s flags to detect the end of the request,
but only when the rxbuf was emptied.

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:29:27 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never null and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".
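
The wrapping period is simple arithmetic. As a sketch, assuming an unsigned
64-bit counter and a year of 365.25 days (the function names are
hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define SK_NS_PER_SEC   1000000000ULL
#define SK_SEC_PER_YEAR 31557600ULL /* 365.25 days of 86400 seconds */

static_assert(SK_SEC_PER_YEAR == 36525ULL * 864ULL, "365.25 days");

/* Whole years before an unsigned 64-bit nanosecond counter wraps. */
static uint64_t sk_years_before_wrap(void)
{
    return UINT64_MAX / SK_NS_PER_SEC / SK_SEC_PER_YEAR;
}

/* Elapsed nanoseconds between two readings; unsigned subtraction gives
 * the right result even across a wrap. */
static uint64_t sk_ns_elapsed(uint64_t since, uint64_t now)
{
    return now - since;
}
```

The first function returns 584 whole years, i.e. roughly the 585 years
quoted above once the fractional part is counted.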

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One interrogation concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anything for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
ad5a5f6779 MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request
Let's get rid of timeval in storage of internal timestamps so that they
are no longer mistaken for wall clock time. These were exclusively used
subtracted from each other or to/from "now" after being converted to ns,
so this patch removes the tv_to_ns() conversion to use them natively. Two
occurrences of tv_isge() were turned to a regular wrapping subtract.
2023-04-28 16:08:08 +02:00
Willy Tarreau
76d343d3d3 MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract
Instead of operating on {sec, usec} now we convert both operands to
ns then subtract them and convert to ms. This is a first step towards
dropping timeval from these timestamps.
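
A sketch of the linear subtract, using hypothetical helper names rather
than the project's own functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Converts a {sec, usec} pair to nanoseconds. */
static uint64_t sk_tv_to_ns(const struct timeval *tv)
{
    assert(tv != NULL);
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/* Milliseconds elapsed from <from> to <to>, computed with one linear
 * subtract on nanoseconds instead of per-field arithmetic. */
static uint64_t sk_ms_elapsed(const struct timeval *from,
                              const struct timeval *to)
{
    return (sk_tv_to_ns(to) - sk_tv_to_ns(from)) / 1000000ULL;
}
```

For instance, going from 1.9 s to 3.1 s gives 1200 ms without any borrow
handling between the seconds and microseconds fields.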

Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at
all and could be removed.
2023-04-28 16:08:08 +02:00
Tim Duesterhus
b1ec21d259 CLEANUP: Stop checking the pointer before calling tasklet_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
    	tasklet_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
    	tasklet_free(e);

See GitHub Issue #2126
2023-04-23 00:28:25 +02:00
Willy Tarreau
ca1027c22f MINOR: mux-h2: make the max number of concurrent streams configurable per side
For a long time the maximum number of concurrent streams was set once for
both sides (front and back) while the impacts are different. This commit
allows it to be configured separately for each side. The older settings
remains the fallback choice when other ones are not set.
2023-04-18 15:58:55 +02:00
Willy Tarreau
9d7abda787 MINOR: mux-h2: make the initial window size configurable per side
For a long time the initial window size (per-stream size) was set once
for both directions, frontend and backend, resulting in a tradeoff between
upload speed and download fairness. This commit allows it to be configured
separately for each side. The older setting remains the fallback choice
when the other ones are not set.
2023-04-18 15:58:55 +02:00
Christopher Faulet
c202c740b5 BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR
When end-of-stream is reported by a H2 stream, we must take care to also
report an error if end-of-input was not reported. Indeed, it is now
mandatory to set SE_FL_EOI or SE_FL_ERROR flags when SE_FL_EOS is set.
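
The rule can be sketched as follows. The flag values and the function name are hypothetical placeholders chosen for illustration; they are not the real SE_FL_* definitions.

```c
/* Hypothetical flag values, standing in for SE_FL_EOI, SE_FL_EOS and
 * SE_FL_ERROR.
 */
#define SK_FL_EOI   0x1u
#define SK_FL_EOS   0x2u
#define SK_FL_ERROR 0x4u

/* Report end-of-stream. If end-of-input was not reported before, an
 * error is reported as well, so that EOS is never set alone.
 */
static unsigned int report_eos_sketch(unsigned int flags)
{
	flags |= SK_FL_EOS;
	if (!(flags & SK_FL_EOI))
		flags |= SK_FL_ERROR;
	return flags;
}
```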

It is a 2.8-specific issue. No backport needed.
2023-04-11 08:59:10 +02:00
Ilya Shipitsin
07be66d21b CLEANUP: assorted typo fixes in the code and comments
This is the 35th iteration of typo fixes.
2023-04-01 18:33:40 +02:00
Christopher Faulet
21fb6bdab4 BUG/MEDIUM: mux-h2: Be able to detect connection error during handshake
When a backend H2 connection is waiting for the connection to be fully
established, nothing is sent. However, it remains useful to detect connection
errors at this stage. It is especially important to release the H2 connection
on connect error. Being able to set the H2_CF_ERR_PENDING or H2_CF_ERROR flags
when the underlying connection is not fully established prevents the H2C from
being inserted into an idle list in h2_detach().

Without this fix, an H2C in PREFACE state and relying on a connection in
error can be inserted in the safe list. Of course, it will be purged if not
reused. But in the mean time, it can be reused. When this happens, the
connection remains in error and nothing happens. At the end a connection
error is returned to the client. On low traffic, we can imagine a scenario
where this dead connection is the only idle connection. If it is always
reused before being purged, no connection to the server is possible.

In addition, h2c_is_dead() is updated to declare as dead any H2 connection
with a pending error if its state is PREFACE or SETTINGS1 (thus if no
SETTINGS frame was received yet).
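
The added condition can be pictured with the sketch below. The enum, the flag value and the function name are all hypothetical; the real h2c_is_dead() checks more than this single case.

```c
/* Hypothetical connection states; only the first two matter here. */
enum h2_state_sketch {
	SK_ST_PREFACE,
	SK_ST_SETTINGS1,
	SK_ST_FRAME_H,
};

/* Hypothetical flag value standing in for H2_CF_ERR_PENDING. */
#define SK_CF_ERR_PENDING 0x1u

/* Non-zero when a connection with a pending error has not received its
 * first SETTINGS frame yet, i.e. is still in PREFACE or SETTINGS1.
 */
static int dead_before_settings(enum h2_state_sketch st, unsigned int flags)
{
	return (flags & SK_CF_ERR_PENDING) &&
	       (st == SK_ST_PREFACE || st == SK_ST_SETTINGS1);
}
```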

This patch should fix the issue #2092. It must be backported as far as 2.6.
2023-03-28 14:52:42 +02:00
Willy Tarreau
389ab0d4b4 BUG/MEDIUM: mux-h2: erase h2c->wait_event.tasklet on error path
On the allocation error path in h2_init() we may check if
h2c->wait_event.tasklet needs to be released but it has not yet been
zeroed. Let's do this before jumping to the freeing location.
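
The general pattern is sketched below with a hypothetical context structure; this is not the h2_init() code. The pointer is cleared before any jump to the shared cleanup label, so the cleanup never looks at an uninitialized value.

```c
#include <stdlib.h>

/* Hypothetical context with one late-allocated member. */
struct ctx_sketch {
	void *tasklet;
};

/* fail_early simulates an error that happens before the member is
 * allocated. Returns NULL on failure.
 */
static struct ctx_sketch *ctx_new_sketch(int fail_early)
{
	struct ctx_sketch *c = malloc(sizeof(*c));

	if (!c)
		return NULL;

	/* Clear the pointer before the first "goto fail" can happen. */
	c->tasklet = NULL;

	if (fail_early)
		goto fail;

	c->tasklet = malloc(16);
	if (!c->tasklet)
		goto fail;

	return c;

fail:
	free(c->tasklet); /* safe: either NULL or a valid allocation */
	free(c);
	return NULL;
}
```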

This needs to be backported to all maintained versions.
2023-03-20 19:58:38 +01:00
Willy Tarreau
bcdc6cc15b BUG/MEDIUM: mux-h2: do not try to free an unallocated h2s->sd
In h2s_close() we may dereference h2s->sd to get the sc, but this
function may be called on allocation error paths, so we must check
for this specific condition. Let's also update the comment to make
it explicitly permitted.

This needs to be backported to 2.6.
2023-03-20 19:58:38 +01:00
Willy Tarreau
464fa06e9a MINOR: mux-h2: set CO_SFL_MSG_MORE when sending multiple buffers
Traces show that sendto() rarely has MSG_MORE on H2 despite sending
multiple buffers. The reason is that the loop iterating over the buffer
ring doesn't have this info and doesn't pass it down.

But now we know how many buffers are left to be sent, so we know whether
or not the current buffer is the last one. As such we can set this flag
for all buffers but the last one.
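
A minimal sketch of the "all buffers but the last one" rule, with a hypothetical flag value and helper name rather than the real CO_SFL_MSG_MORE definition:

```c
#include <stddef.h>

/* Hypothetical flag value standing in for CO_SFL_MSG_MORE. */
#define SK_SFL_MSG_MORE 0x1u

/* Send flags for buffer number idx (0-based) out of count buffers:
 * every buffer except the last one announces that more data follows.
 */
static unsigned int send_flags_sketch(size_t idx, size_t count)
{
	return (idx + 1 < count) ? SK_SFL_MSG_MORE : 0;
}
```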
2023-03-17 16:43:51 +01:00
Willy Tarreau
14ea98af73 BUG/MINOR: mux-h2: set CO_SFL_STREAMER when sending lots of data
Emeric noticed that h2 bit-rate performance was always slightly lower
than h1 when the CPU is saturated. Strace showed that we were always
sending data in 2kB chunks, corresponding to the max_record size. What's
happening is that when this mechanism of dynamic record size was
introduced, the STREAMER flag at the stream level was relied upon.
Since all this was moved to the muxes, the flag has to be passed as
an argument to the snd_buf() function, but the mux h2 did not use it
despite a comment mentioning it, probably because before the multi-buf
it was not easy to figure the status of the buffer.

The solution here consists in checking if the mbuf is congested or not,
by checking if it has more than one buffer allocated. If so we set the
CO_SFL_STREAMER flag, otherwise we don't. This way moderate size
exchanges continue to be made over small chunks, but downloads will
be able to use the large ones.

While it could be backported to all supported versions, it would be
better to limit it to the last LTS, so let's do it for 2.7 and 2.6 only.
This patch requires previous commit "MINOR: buffer: add br_single() to
check if a buffer ring has more than one buf".
2023-03-16 18:45:46 +01:00
Willy Tarreau
93c5511af8 BUG/MEDIUM: mux-h2: only restart sending when mux buffer is decongested
During performance tests, Emeric faced a case where the wakeups of
sc_conn_io_cb() caused by h2_resume_each_sending_h2s() were multiplied
by 5-50 and a lot of CPU was being spent doing this for apparently no
reason.

The culprit is h2_send() not behaving well with congested buffers and
small SSL records. What happens when the output is congested is that
all buffers are full, and data are emitted in 2kB chunks, which are
sufficient to wake all streams up again to ask them to send data again,
something that will obviously only work for one of them at best, and
waste a lot of CPU in wakeups and memcpy() due to the small buffers.
When this happens, the performance can be divided by 2-2.5 on large
objects.

Here the chosen solution against this is to keep in mind that as long
as there are still at least two buffers in the ring after calling
xprt->snd_buf(), it means that the output is congested and there's
no point trying again, because these data will just be placed into
such buffers and will wait there. Instead we only mark the buffer
decongested once we're back to a single allocated buffer in the ring.
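
The decision can be sketched as below. The ring structure and the function name are hypothetical; the actual fix relies on br_single() and the mux's own flags.

```c
/* Hypothetical ring of output buffers; only the count matters here. */
struct mbuf_ring_sketch {
	unsigned int allocated; /* buffers currently allocated in the ring */
};

/* After a send attempt, waiting streams are only worth waking up once
 * the ring is back to a single allocated buffer. With two or more still
 * allocated, the output is considered congested.
 */
static int should_resume_senders(const struct mbuf_ring_sketch *ring)
{
	return ring->allocated <= 1;
}
```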

By doing so we preserve the ability to deal with large concurrent
bursts while not causing a thundering herd by waking all streams for
almost nothing.

This needs to be backported to 2.7 and 2.6. Other versions could
benefit from it as well but it's not strictly necessary, and we can
reconsider this option if some excess calls to sc_conn_io_cb() are
faced.

Note that this fix depends on this recent commit:
    MINOR: buffer: add br_single() to check if a buffer ring has more than one buf
2023-03-16 18:45:46 +01:00
Willy Tarreau
3fb2c6d5b4 BUG/MINOR: mux-h2: make sure the h2c task exists before refreshing it
When detaching a stream, if it's the last one and the mbuf is blocked,
we leave without freeing the stream yet. We also refresh the h2c task's
timeout, except that it's possible that there's no such task in case
there is no client timeout, causing a crash. The fix just consists in
doing this when the task exists.
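
In sketch form, with hypothetical structures in place of the real h2c and task types, the guard amounts to:

```c
#include <stddef.h>

/* Hypothetical timer task. */
struct task_sketch {
	int expire;
};

/* Hypothetical connection; task is NULL when no timeout is configured. */
struct h2c_sketch {
	struct task_sketch *task;
};

/* Refresh the timeout only when the task exists.
 * Returns 1 if it was refreshed, 0 if there was no task.
 */
static int refresh_timeout_sketch(struct h2c_sketch *c, int new_expire)
{
	if (!c->task)
		return 0;
	c->task->expire = new_expire;
	return 1;
}
```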

This bug has always been there and is extremely hard to meet even
without a client timeout. This fix has to be backported to all
branches, but it's unlikely anyone has ever met it anyway.
2023-03-16 18:45:46 +01:00
Christopher Faulet
3a7b539b12 BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list
The commit 5e1b0e7bf ("BUG/MEDIUM: connection: Clear flags when a conn is
removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and
CO_FL_IDLE_LIST flags are used when the connection is released to properly
decrement used/idle connection counters. If a connection is idle, these
flags must be preserved till the connection is really released. It may be
removed from the list but not immediately released. If these flags are lost
when it is finally released, the current number of used connections is
erroneously decremented. It means this counter may become negative and the
counters tracking the number of idle connections are not decremented,
suggesting a leak.

So, the above commit is reverted and instead we improve a bit the way to
detect an idle connection. The function conn_get_idle_flag() must now be
used to know if a connection is in an idle list. It returns the connection
flag corresponding to the idle list if the connection is idle
(CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection
is scheduled to be removed, 0 is also returned, regardless of the connection
flags.
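
The behaviour described above can be sketched as follows. The flag values, the pending_removal field and the function name are hypothetical; how the real conn_get_idle_flag() detects a connection scheduled for removal is not shown in this message.

```c
/* Hypothetical flag values standing in for CO_FL_SAFE_LIST and
 * CO_FL_IDLE_LIST.
 */
#define SK_FL_SAFE_LIST 0x1u
#define SK_FL_IDLE_LIST 0x2u
#define SK_FL_LIST_MASK (SK_FL_SAFE_LIST | SK_FL_IDLE_LIST)

/* Hypothetical connection; pending_removal is an assumed marker for
 * "scheduled to be removed".
 */
struct idle_conn_sketch {
	unsigned int flags;
	int pending_removal;
};

/* Returns the idle-list flag of the connection, or 0 when it is not
 * idle or is scheduled to be removed. The stored flags are left intact.
 */
static unsigned int idle_flag_sketch(const struct idle_conn_sketch *c)
{
	if (c->pending_removal)
		return 0;
	return c->flags & SK_FL_LIST_MASK;
}
```

Because the stored flags are never cleared, the release path can still use them to adjust the used/idle counters.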

This new function is used when the connection is temporarily removed from
the list to be used, mainly in muxes.

This patch should fix #2078 and #2057. It must be backported as far as 2.2.
2023-03-16 15:34:20 +01:00
Christopher Faulet
5e1b0e7bf8 BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
When a connection is removed from the safe list or the idle list,
CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags must be cleared. It is performed
when the connection is reused. But not when it is moved into the
toremove_conns list. It may be an issue because the multiplexer owning the
connection may be woken up before the connection is really removed. If the
connection flags are not sanitized, it may think the connection is idle and
reinsert it in the corresponding list. From this point, we can imagine
several bugs: a UAF, or a connection reused with an invalid state, for
instance.

To avoid any issue, the connection flags are sanitized when an idle
connection is moved into the toremove_conns list. The same is performed at
right places in the multiplexers. Especially because the connection release
may be delayed (for h2 and fcgi connections).

This patch should fix the issue #2057. It must carefully be backported as
far as 2.2. Especially on the 2.2 where the code is really different. But
some conflicts should be expected on the 2.4 too.
2023-02-28 18:36:29 +01:00
Christopher Faulet
72722c04b0 MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished
As for the H1 stream, the H2 stream now states it does not expect data from
the server as long as the request is unfinished. The aim is the same. We
must be sure to not trigger a read timeout on server side if the client is
still uploading data.

From the moment the end of the request is received and forwarded to upper
layer, the H2 stream reports it expects to receive data from the opposite
endpoint. This re-enables read timeout on the server side.
2023-02-27 17:45:45 +01:00
Willy Tarreau
0d6e5d271f MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
Traces from this function would miss a TRACE_LEAVE() on the success path,
with two consequences: 1) it was difficult to figure out where the
function was left, and 2) the allocated stream ID was never clearly
visible (the one returned by h2c_frt_stream_new() is actually the
right one, but that is not obvious).

This can be backported to 2.7 and 2.6.
2023-02-20 17:22:03 +01:00