Commit Graph

859 Commits

Author SHA1 Message Date
Willy Tarreau
f279a2f148 BUG/MINOR: mux-h2: refresh the idle_timer when the mux is empty
There's a rare case where on long fat pipes, we can see the keep-alive
timeout trigger before the end of the transfer of the last large object,
and the connection closed a bit quickly after the end of the transfer
because a GOAWAY is queued. The data are not destroyed, except that
the WINDOW_UPDATES from the client arriving late while the last data
are being drained by the socket buffers may at some point trigger a
reset, and some clients might choke a bit too early on these. Let's
make sure we only arm the idle_start timestamp once the output buffer
is empty. Of course it will still not cover for the data pending in the
socket buffers but it will at least let those in the buffer leave in
peace. More elaborate options can be used to protect the data in the
kernel buffers, such as the one described in GH issue #5.

It's very likely that this old issue was emphasized by the following
commit in 2.6:
  15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts")

and the behavior probably changed again with this one in 2.8, which
was backported to 2.7 and scheduled for 2.6:
  d38d8c6cc ("BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout")

As such this patch should be backported to 2.6 after some observation
period.
2023-05-31 10:45:30 +02:00
Christopher Faulet
c2f1d0ee5e BUG/MEDIUM: mux-h2: Propagate termination flags when frontend SC is created
We must evaluate if EOS/EOI/ERR_PENDING/ERROR flags must be set on the SE
when the frontend SC is created because the rxbuf is transferred to the
steeam at this stage. It means the call to h2_rcv_buf() may be skipped on
some circumstances.

And indeed, it happens when HAproxy quickly replies, for instance because of
a deny rule. In this case, depending on the scheduling, the abort may block
the receive attempt from the SC. In this case if SE flags were not properly
set earlier, there is no way to terminate the request and the session may be
freezed.

For now, I can't explain why there is no timeout when this happens but it
remains an issue because here we should not rely on timeouts to close the
stream.

This patch relies on following commits:

    * MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE
    * MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame

The issue was encountered on the 2.8 but it seems the bug exists since the
2.4. But it is probably a good idea to only backport the series to 2.7 only
and wait for a bug report on earlier versions.

This patch should solve the issue #2147.
2023-05-24 16:06:11 +02:00
Christopher Faulet
531dd050ff MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE
The function h2s_propagate_term_flags() was added to check the H2S state and
evaluate when EOI/EOS/ERR_PENDING/ERROR flags must be set on the SE. It is
not the only place where those flags are set. But it centralizes the synchro
between the H2 stream and the SC.

For now, this function is only used at the end of h2_rcv_buf(). But it will
be used to fix a bug.
2023-05-24 16:06:11 +02:00
Christopher Faulet
1a60a66306 MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame
The flag H2_SF_ES_RCVD is set on the H2 stream when the ES flag is found in
a frame. On HEADERS frame, it was set in function processing the frame. It
is moved in the function decoding the frame. Fundamentally, this changes
nothing. But it will be useful to have this information earlier when a
client H2 stream is created.
2023-05-24 16:06:11 +02:00
Christopher Faulet
78b1eb2b04 BUG/MINOR: mux-h2: Check H2_SF_BODY_TUNNEL on H2S flags and not demux frame ones
In h2c_frt_stream_new(), H2_SF_BODY_TUNNEL flags was tested on demux frame
flags (h2c->dff) instead of the h2s flags.  By chance, it is a noop test
becasue H2_SF_BODY_TUNNEL value, once converted to an int8_t, is 0.

It is a 2.8-specific issue. No backport needed.
2023-05-24 16:06:11 +02:00
Willy Tarreau
d38d8c6ccb BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout
Christopher found as part of the analysis of Tim's issue #1891 that commit
15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive
timeouts") introduced in 2.6 incompletely addressed a timeout issue in the
H2 mux. The problem was that the http-keepalive and http-request timeouts
were not applied before it. With that commit they are now considered, but
if a GOAWAY is sent (or even attempted to be sent), then they are not used
anymore again, because the way the code is arranged consists in applying
the client-fin timeout (if set) to the current date, and falling back to
the client timeout, without considering the idle_start period. This means
that a config having a "timeout http-keepalive" would still not close the
connection quickly when facing a client that periodically sends PING,
PRIORITY or whatever other frame types.

In addition, after the GOAWAY was attempted to be sent, there was no check
for pending data in the output buffer, meaning that it would be possible
to truncate some responses in configs involving a very short client-fin
timeout.

Finally the spreading of the closures during the soft-stop brought in 2.6
by commit b5d968d9b ("MEDIUM: global: Add a "close-spread-time" option to
spread soft-stop on time window") didn't consider the particular case of
an idle "pre-connect" connection, which would also live long if a browser
failed to deliver a valid request for a long time.

All of this indicates that the conditions must be reworked so as not to
have that level of exclusion between conditions, but rather stick to the
rules from the doc that are already enforced on other muxes:
  - timeout client always applies if there are data pending, and is
    relative to each new I/O ;
  - timeout http-request applies before the first complete request and
    is relative to the entry in idle state ;
  - timeout http-keepalive applies between idle and the next complete
    request and is relative to the entry in idle state ;
  - timeout client-fin applies when in idle after a shut was sent (here
    the shut is the GOAWAY). The shut may only be considered as sent if
    the buffer is empty and the flags indicate that it was successfully
    sent (or failed) but not if it's still waiting for some room in the
    output buffer for example. This implies that this timeout may then
    lower the http-keepalive/http-request ones.

This is what this patch implements. Of course the client timeout still
applies as a fallback when all the ones above are not set or when their
conditions are not met.

It would seem reasoanble to backport this to 2.7 first, then only after
one or two releases to 2.6.
2023-05-15 12:01:20 +02:00
Amaury Denoyelle
25cf19d5c8 MINOR: htx: add function to set EOM reliably
Implement a new HTX utility function htx_set_eom(). If the HTX message
is empty, it will first add a dummy EOT block. This is a small trick
needed to ensure readers will detect the HTX buffer as not empty and
retrieve the EOM flag.

Replace the H2 code related by a htx_set_eom() invocation. QUIC also has
the same code which will be replaced in the next commit.

This should be backported up to 2.7 before the related QUIC patch.
2023-05-12 15:29:28 +02:00
Christopher Faulet
34f81d5815 BUG/MINOR: mux-h2: Also expect data when waiting for a tunnel establishment
When a client H2 stream is waiting for a tunnel establishment, it must state
it expects data from server. It is the second fix that should fix
regressions of the commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from
server as long as request is unfinished")

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:58:33 +02:00
Christopher Faulet
4403cdf653 BUG/MEDIUM: mux-h2: Properly handle end of request to expect data from server
The commit 2722c04b ("MEDIUM: mux-h2: Don't expect data from server as long
as request is unfinished") introduced a regression in the H2 multiplexer.
The end of the request is not systematically handled to state a H2 stream on
client side now expexts data from the server.

Indeed, while the client is uploading its request, the H2 stream warns it
does not expect data from the server. This way, no server timeout is applied
at this stage. When end of the request is detected, the H2 stream must state
it now expects the server response. This enables the server timeout.

However, it was only performed at one place while the end of the request can
be handled at different places. First, during a zero-copy in
h2_rcv_buf(). Then, when the SC is created with the full request. Because of
this bug, it is possible to totally disable the server timeout for H2
streams.

In h2_rcv_buf(), we now rely on h2s flags to detect the end of the request,
but only when the rxbuf was emptied.

It is a 2.8-specific bug. No backport needed.
2023-05-04 16:29:27 +02:00
Willy Tarreau
69530f59ae MEDIUM: clock: replace timeval "now" with integer "now_ns"
This puts an end to the occasional confusion between the "now" date
that is internal, monotonic and not synchronized with the system's
date, and "date" which is the system's date and not necessarily
monotonic. Variable "now" was removed and replaced with a 64-bit
integer "now_ns" which is a counter of nanoseconds. It wraps every
585 years, so if all goes well (i.e. if humanity does not need
haproxy anymore in 500 years), it will just never wrap. This implies
that now_ns is never nul and that the zero value can reliably be used
as "not set yet" for a timestamp if needed. This will also simplify
date checks where it becomes possible again to do "date1<date2".

All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns".
Due to the intricacies between now, global_now and now_offset, all 3
had to be turned to nanoseconds at once. It's not a problem since all
of them were solely used in 3 functions in clock.c, but they make the
patch look bigger than it really  is.

The clock_update_local_date() and clock_update_global_date() functions
are now much simpler as there's no need anymore to perform conversions
nor to round the timeval up or down.

The wrapping continues to happen by presetting the internal offset in
the short future so that the 32-bit now_ms continues to wrap 20 seconds
after boot.

The start_time used to calculate uptime can still be turned to
nanoseconds now. One interrogation concerns global_now_ms which is used
only for the freq counters. It's unclear whether there's more value in
using two variables that need to be synchronized sequentially like today
or to just use global_now_ns divided by 1 million. Both approaches will
work equally well on modern systems, the difference might come from
smaller ones. Better not change anyhting for now.

One benefit of the new approach is that we now have an internal date
with a resolution of the nanosecond and the precision of the microsecond,
which can be useful to extend some measurements given that timestamps
also have this resolution.
2023-04-28 16:08:08 +02:00
Willy Tarreau
ad5a5f6779 MEDIUM: tree-wide: replace timeval with nanoseconds in tv_accept and tv_request
Let's get rid of timeval in storage of internal timestamps so that they
are no longer mistaken for wall clock time. These were exclusively used
subtracted from each other or to/from "now" after being converted to ns,
so this patch removes the tv_to_ns() conversion to use them natively. Two
occurrences of tv_isge() were turned to a regular wrapping subtract.
2023-04-28 16:08:08 +02:00
Willy Tarreau
76d343d3d3 MINOR: time: replace calls to tv_ms_elapsed() with a linear subtract
Instead of operating on {sec, usec} now we convert both operands to
ns then subtract them and convert to ms. This is a first step towards
dropping timeval from these timestamps.

Interestingly, tv_ms_elapsed() and tv_ms_remain() are no longer used at
all and could be removed.
2023-04-28 16:08:08 +02:00
Tim Duesterhus
b1ec21d259 CLEANUP: Stop checking the pointer before calling tasklet_free()
Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
    	tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
    	tasklet_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
    	tasklet_free(e);

See GitHub Issue #2126
2023-04-23 00:28:25 +02:00
Willy Tarreau
ca1027c22f MINOR: mux-h2: make the max number of concurrent streams configurable per side
For a long time the maximum number of concurrent streams was set once for
both sides (front and back) while the impacts are different. This commit
allows it to be configured separately for each side. The older settings
remains the fallback choice when other ones are not set.
2023-04-18 15:58:55 +02:00
Willy Tarreau
9d7abda787 MINOR: mux-h2: make the initial window size configurable per side
For a long time the initial window size (per-stream size) was set once
for both directions, frontend and backend, resulting in a tradeoff between
upload speed and download fairness. This commit allows it to be configured
separately for each side. The older settings remains the fallback choice
when other ones are not set.
2023-04-18 15:58:55 +02:00
Christopher Faulet
c202c740b5 BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR
When end-of-stream is reported by a H2 stream, we must take care to also
report an error is end-of-input was not reported. Indeed, it is now
mandatory to set SE_FL_EOI or SE_FL_ERROR flags when SE_FL_EOS is set.

It is a 2.8-specific issue. No backport needed.
2023-04-11 08:59:10 +02:00
Ilya Shipitsin
07be66d21b CLEANUP: assorted typo fixes in the code and comments
This is 35th iteration of typo fixes
2023-04-01 18:33:40 +02:00
Christopher Faulet
21fb6bdab4 BUG/MEDIUM: mux-h2: Be able to detect connection error during handshake
When a backend H2 connection is waiting the connection is fully established,
nothing is sent. However, it remains useful to detect connection error at
this stage. It is especially important to release H2 connection on connect
error. Be able to set H2_CF_ERR_PENDiNG or H2_CF_ERROR flags when the
underlying connection is not fully established will exclude the H2C to be
inserted in a idle list in h2_detach().

Without this fix, an H2C in PREFACE state and relying on a connection in
error can be inserted in the safe list. Of course, it will be purged if not
reused. But in the mean time, it can be reused. When this happens, the
connection remains in error and nothing happens. At the end a connection
error is returned to the client. On low traffic, we can imagine a scenario
where this dead connection is the only idle connection. If it is always
reused before being purged, no connection to the server is possible.

In addition, h2c_is_dead() is updated to declare as dead any H2 connection
with a pending error if its state is PREFACE or SETTINGS1 (thus if no
SETTINGS frame was received yet).

This patch should fix the issue #2092. It must be backported as far as 2.6.
2023-03-28 14:52:42 +02:00
Willy Tarreau
389ab0d4b4 BUG/MEDIUM: mux-h2: erase h2c->wait_event.tasklet on error path
On the allocation error path in h2_init() we may check if
h2c->wait_event.tasklet needs to be released but it has not yet been
zeroed. Let's do this before jumping to the freeing location.

This needs to be backported to all maintained versions.
2023-03-20 19:58:38 +01:00
Willy Tarreau
bcdc6cc15b BUG/MEDIUM: mux-h2: do not try to free an unallocated h2s->sd
In h2s_close() we may dereference h2s->sd to get the sc, but this
function may be called on allocation error paths, so we must check
for this specific condition. Let's also update the comment to make
it explicitly permitted.

This needs to be backported to 2.6.
2023-03-20 19:58:38 +01:00
Willy Tarreau
464fa06e9a MINOR: mux-h2: set CO_SFL_MSG_MORE when sending multiple buffers
Traces show that sendto() rarely has MSG_MORE on H2 despite sending
multiple buffers. The reason is that the loop iterating over the buffer
ring doesn't have this info and doesn't pass it down.

But now we know how many buffers are left to be sent, so we know whether
or not the current buffer is the last one. As such we can set this flag
for all buffers but the last one.
2023-03-17 16:43:51 +01:00
Willy Tarreau
14ea98af73 BUG/MINOR: mux-h2: set CO_SFL_STREAMER when sending lots of data
Emeric noticed that h2 bit-rate performance was always slightly lower
than h1 when the CPU is saturated. Strace showed that we were always
data in 2kB chunks, corresponding to the max_record size. What's
happening is that when this mechanism of dynamic record size was
introduced, the STREAMER flag at the stream level was relied upon.
Since all this was moved to the muxes, the flag has to be passed as
an argument to the snd_buf() function, but the mux h2 did not use it
despite a comment mentioning it, probably because before the multi-buf
it was not easy to figure the status of the buffer.

The solution here consists in checking if the mbuf is congested or not,
by checking if it has more than one buffer allocated. If so we set the
CO_SFL_STREAMER flag, otherwise we don't. This way moderate size
exchanges continue to be made over small chunks, but downloads will
be able to use the large ones.

While it could be backported to all supported versions, it would be
better to limit it to the last LTS, so let's do it for 2.7 and 2.6 only.
This patch requires previous commit "MINOR: buffer: add br_single() to
check if a buffer ring has more than one buf".
2023-03-16 18:45:46 +01:00
Willy Tarreau
93c5511af8 BUG/MEDIUM: mux-h2: only restart sending when mux buffer is decongested
During performance tests, Emeric faced a case where the wakeups of
sc_conn_io_cb() caused by h2_resume_each_sending_h2s() was multiplied
by 5-50 and a lot of CPU was being spent doing this for apparently no
reason.

The culprit is h2_send() not behaving well with congested buffers and
small SSL records. What happens when the output is congested is that
all buffers are full, and data are emitted in 2kB chunks, which are
sufficient to wake all streams up again to ask them to send data again,
something that will obviously only work for one of them at best, and
waste a lot of CPU in wakeups and memcpy() due to the small buffers.
When this happens, the performance can be divided by 2-2.5 on large
objects.

Here the chosen solution against this is to keep in mind that as long
as there are still at least two buffers in the ring after calling
xprt->snd_buf(), it means that the output is congested and there's
no point trying again, because these data will just be placed into
such buffers and will wait there. Instead we only mark the buffer
decongested once we're back to a single allocated buffer in the ring.

By doing so we preserve the ability to deal with large concurrent
bursts while not causing a thundering herd by waking all streams for
almost nothing.

This needs to be backported to 2.7 and 2.6. Other versions could
benefit from it as well but it's not strictly necessary, and we can
reconsider this option if some excess calls to sc_conn_io_cb() are
faced.

Note that this fix depends on this recent commit:
    MINOR: buffer: add br_single() to check if a buffer ring has more than one buf
2023-03-16 18:45:46 +01:00
Willy Tarreau
3fb2c6d5b4 BUG/MINOR: mux-h2: make sure the h2c task exists before refreshing it
When detaching a stream, if it's the last one and the mbuf is blocked,
we leave without freeing the stream yet. We also refresh the h2c task's
timeout, except that it's possible that there's no such task in case
there is no client timeout, causing a crash. The fix just consists in
doing this when the task exists.

This bug has always been there and is extremely hard to meet even
without a client timeout. This fix has to be backported to all
branches, but it's unlikely anyone has ever met it anyay.
2023-03-16 18:45:46 +01:00
Christopher Faulet
3a7b539b12 BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list
The commit 5e1b0e7bf ("BUG/MEDIUM: connection: Clear flags when a conn is
removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and
CO_FL_IDLE_LIST flags are used when the connection is released to properly
decrement used/idle connection counters. if a connection is idle, these
flags must be preserved till the connection is really released. It may be
removed from the list but not immediately released. If these flags are lost
when it is finally released, the current number of used connections is
erroneously decremented. If means this counter may become negative and the
counters tracking the number of idle connecitons is not decremented,
suggesting a leak.

So, the above commit is reverted and instead we improve a bit the way to
detect an idle connection. The function conn_get_idle_flag() must now be
used to know if a connection is in an idle list. It returns the connection
flag corresponding to the idle list if the connection is idle
(CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection
is scheduled to be removed, 0 is also returned, regardless the connection
flags.

This new function is used when the connection is temporarily removed from
the list to be used, mainly in muxes.

This patch should fix #2078 and #2057. It must be backported as far as 2.2.
2023-03-16 15:34:20 +01:00
Christopher Faulet
5e1b0e7bf8 BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
When a connection is removed from the safe list or the idle list,
CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags must be cleared. It is performed
when the connection is reused. But not when it is moved into the
toremove_conns list. It may be an issue because the multiplexer owning the
connection may be woken up before the connection is really removed. If the
connection flags are not sanitized, it may think the connection is idle and
reinsert it in the corresponding list. From this point, we can imagine
several bugs. An UAF or a connection reused with an invalid state for
instance.

To avoid any issue, the connection flags are sanitized when an idle
connection is moved into the toremove_conns list. The same is performed at
right places in the multiplexers. Especially because the connection release
may be delayed (for h2 and fcgi connections).

This patch shoudld fix the issue #2057. It must carefully be backported as
far as 2.2. Especially on the 2.2 where the code is really different. But
some conflicts should be expected on the 2.4 too.
2023-02-28 18:36:29 +01:00
Christopher Faulet
72722c04b0 MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished
As for the H1 stream, the H2 stream now states it does not expect data from
the server as long as the request is unfinished. The aim is the same. We
must be sure to not trigger a read timeout on server side if the client is
still uploading data.

From the moment the end of the request is received and forwarded to upper
layer, the H2 stream reports it expects to receive data from the opposite
endpoint. This re-enables read timeout on the server side.
2023-02-27 17:45:45 +01:00
Willy Tarreau
0d6e5d271f MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
Traces from this function would miss a TRACE_LEAVE() on the success path,
which had for consequences, 1) that it was difficult to figure where the
function was left, and 2) that we never had the allocated stream ID
clearly visible (actually the one returned by h2c_frt_stream_new() is
the right one but it's not obvious).

This can be backported to 2.7 and 2.6.
2023-02-20 17:22:03 +01:00
Willy Tarreau
f9f4499429 MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
Functions which are called with dummy streams pass it down the traces
and that leads to somewhat confusing "h2s=0x1234568(0,IDL)" for example
while the nature of the called function makes this stream useless at that
place. Better not report a random pointer, especially since it always
requires to look at the code before remembering how this should be
interpreted.

Now what we're doing is that the idle stream only prints "h2s=IDL" which
is shorter and doesn't report a pointer, closed stream do not report
anything since the stream ID 0 already implies it, and other ones are
reported normally.

This could be backported to 2.7 and 2.6 as it improves traces legibility.
2023-02-20 17:22:03 +01:00
Frdric Lcaille
9969adbcdc MINOR: stats: add by HTTP version cumulated number of sessions and requests
Add cum_sess_ver[] new array of counters to count the number of cumulated
HTTP sessions by version (h1, h2 or h3).
Implement proxy_inc_fe_cum_sess_ver_ctr() to increment these counter.
This function is called each a HTTP mux is correctly initialized. The QUIC
must before verify the application operations for the mux is for h3 before
calling proxy_inc_fe_cum_sess_ver_ctr().
ST_F_SESS_OTHER stat field for the cumulated of sessions others than
HTTP sessions is deduced from ->cum_sess_ver counter (for all the session,
not only HTTP sessions) from which the HTTP sessions counters are substracted.

Add cum_req[] new array of counters to count the number of cumulated HTTP
requests by version and others than HTTP requests. This new member replace ->cum_req.
Modify proxy_inc_fe_req_ctr() which increments these counters to pass an HTTP
version, 0 special values meaning "other than an HTTP request". This is the case
for instance for syslog.c from which proxy_inc_fe_req_ctr() is called with 0
as version parameter.
ST_F_REQ_TOT stat field compputing for the cumulated number of requests is modified
to count the sum of all the cum_req[] counters.

As this patch is useful for QUIC, it must be backported to 2.7.
2023-02-03 17:55:49 +01:00
Christopher Faulet
c254516c53 BUG/MINOR: mux-h2: Fix possible null pointer deref on h2c in _h2_trace_header()
As reported by Coverity, this function may be called with no h2c. Thus, the
pointer must always be checked before any access. One test was missing in
TRACE_PRINTF_LOC().

This patch should fix the issue #2015. No backport needed, except if the
commit 11e8a8c2a ("MEDIUM: mux-h2/trace: add tracing support for headers")
is backported.
2023-01-30 08:26:12 +01:00
Willy Tarreau
7cfbb81c85 CLEANUP: mux-h2/trace: shorten the name of the header enc/dec functions
The functions in charge of processing headers have their names in the
traces and they're among the longest of the mux_h2.c file, while even
containing some redundancy. These names are not used outside, let's
shorten them:

- h2c_decode_headers        -> h2c_dec_hdrs
- h2s_bck_make_req_headers  -> h2s_snd_bhdrs
- h2s_frt_make_resp_headers -> h2s_snd_fhdrs

Now the traces are a bit more readable:

  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh :method: GET
  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh :path: /
  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh :scheme: http
  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh :authority: localhost:14446
  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh user-agent: curl/7.54.1
  [00|h2|5|mux_h2.c:4822] h2c_dec_hdrs(): h2c=0x1870510(F,FRP) dsi=1 rcvh accept: */*
2023-01-26 16:05:51 +01:00
Willy Tarreau
11e8a8c2ac MEDIUM: mux-h2/trace: add tracing support for headers
Now we can make use of TRACE_PRINTF() to iterate over headers as they
are received or dumped. It's worth noting that the dumps may occasionally
be interrupted due to a buffer full or a realign, but in this case it
will be visible because the trace will restart from the first one. All
these headers (and trailers) may be interleaved with other connections'
so they're all preceeded by the pointer to the connection and optionally
the stream (or alternately the stream ID) to help discriminating them.

Since it's not easy to read the header directions, sent headers are
prefixed with "sndh" and received headers are prefixed with "rcvh", both
of which are rare enough in the traces to conveniently support a quick
grep.

In order to avoid code duplication, h2_encode_headers() was implemented
as a wrapper on top of hpack_encode_header(), which optionally emits the
header to the trace if the trace is active. In addition, for headers that
are encoded using a different method, h2_trace_header() was added as well.

Header names are truncated to 256 bytes and values to 1024 bytes. If
the lengths are larger, they will be truncated and suffixed with
"(... +xxx)" where "xxx" is the number of extra bytes.

Example of what an end-to-end H2 request gives:

  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh :method: GET
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh :path: /
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh :scheme: http
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh :authority: localhost:14446
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh user-agent: curl/7.54.1
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh accept: */*
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c13120(F,FRP) dsi=1 rcvh cookie: blah
  [00|h2|5|mux_h2.c:5491] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh :method: GET
  [00|h2|5|mux_h2.c:5572] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh :authority: localhost:14446
  [00|h2|5|mux_h2.c:5596] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh :path: /
  [00|h2|5|mux_h2.c:5647] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh user-agent: curl/7.54.1
  [00|h2|5|mux_h2.c:5647] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh accept: */*
  [00|h2|5|mux_h2.c:5647] h2s_bck_make_req_headers(): h2c=0x1c1cd90(B,FRH) h2s=0x1c1e3d0(1,IDL) sndh cookie: blah
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c1cd90(B,FRP) dsi=1 rcvh :status: 200
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c1cd90(B,FRP) dsi=1 rcvh content-length: 0
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c1cd90(B,FRP) dsi=1 rcvh x-req: size=102, time=0 ms
  [00|h2|5|mux_h2.c:4818] h2c_decode_headers(): h2c=0x1c1cd90(B,FRP) dsi=1 rcvh x-rsp: id=dummy, code=200, cache=1, size=0, time=0 ms (0 real)
  [00|h2|5|mux_h2.c:5210] h2s_frt_make_resp_headers(): h2c=0x1c13120(F,FRH) h2s=0x1c1c780(1,HCR) sndh :status: 200
  [00|h2|5|mux_h2.c:5231] h2s_frt_make_resp_headers(): h2c=0x1c13120(F,FRH) h2s=0x1c1c780(1,HCR) sndh content-length: 0
  [00|h2|5|mux_h2.c:5231] h2s_frt_make_resp_headers(): h2c=0x1c13120(F,FRH) h2s=0x1c1c780(1,HCR) sndh x-req: size=102, time=0 ms
  [00|h2|5|mux_h2.c:5231] h2s_frt_make_resp_headers(): h2c=0x1c13120(F,FRH) h2s=0x1c1c780(1,HCR) sndh x-rsp: id=dummy, code=200, cache=1, size=0, time=0 ms (0 real)

At some point the frontend/backend names would be useful but that's a more
general comment than just the H2 traces.
2023-01-26 15:51:30 +01:00
Willy Tarreau
17c630b846 BUG/MINOR: mux-h2: add missing traces on failed headers decoding
In case HPACK cannot be decoded, logs are emitted but there's no info
in the H2 traces, so let's add them.

This may be backported to all supported versions.
2023-01-20 00:02:21 +01:00
Willy Tarreau
f43f36da5b BUG/MINOR: mux-h2: make sure to produce a log on invalid requests
As reported by Dominik Froehlich in github issue #1968, some H2 request
parsing errors do not result in a log being emitted. This is annoying
for debugging because while an RST_STREAM is correctly emitted to the
client, there's no way without enabling traces to find it on the
haproxy side.

After some testing with various abnormal requests, a few places were
found where logs were missing and could be added. In this case, we
simply use sess_log() so some sample fetch functions might not be
available since the stream is not created. But at least there will
be a BADREQ in the logs. A good eaxmple of this consists in sending
forbidden headers or header syntax (e.g. presence of LF in value).
Some quick tests can be done this way:

- protocol error (LF in value):
  curl -iv --http2-prior-knowledge -H "$(printf 'a:b\na')" http://0:8001/

- too large header block after decoding:
  curl -v --http2-prior-knowledge -H "a:$(perl -e "print('a'x10000)")"  -H "a:$(perl -e "print('a'x10000)")"  http://localhost:8001/

This should be backported where needed, most likely 2.7 and 2.6 at
least for a start, and progressively to other versions.
2023-01-19 23:37:00 +01:00
Willy Tarreau
35c4dd0005 CLEANUP: stconn: always use se_fl_set_error() to set the pending error
In mux-h2 and mux-quic we still had two places manually setting
SE_FL_ERR_PENDING or SE_FL_ERROR depending on the EOS state, instead
of using se_fl_set_error() which takes care of the condition. Better
use the specialized function for this, it will allow to centralize
the conditions. Note that this will be needed to fix a bug.
2023-01-17 16:25:29 +01:00
Christopher Faulet
2e47e3a1cf MINOR: htx: Add an HTX value for the extra field is payload length is unknown
When the payload length cannot be determined, the htx extra field is set to
the magical vlaue ULLONG_MAX. It is not obvious. This a dedicated HTX value
is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case,
instead of ULLONG_MAX.
2023-01-13 11:51:11 +01:00
Christopher Faulet
462f52260c BUG/MEDIUM: mux-h2: Don't send CANCEL on shutw when response length is unkown
Since commit 473e0e54 ("BUG/MINOR: mux-h2: send a CANCEL instead of ES on
truncated writes"), a CANCEL may be reported when the response length is
unkown. It happens for H1 reponses without "Content-lenght" or
"Transfer-encoding" header.

Indeed, in this case, the end of the reponse is detected when the server
connection is closed. On the fontend side, the H2 multiplexer handles this
event as an abort and sensd a RST_STREAM frame with CANCEL error code.

The issue is not with the above commit but with the commit 4877045f1
("MINOR: mux-h2: make streams know if they need to send more data"). The
H2_SF_MORE_HTX_DATA flag must only be set if the payload length can be
determined.

This patch should fix the issue #1992. It must be backported to 2.7.
2023-01-13 11:28:32 +01:00
Christopher Faulet
827a6299e6 BUG/MEDIUM: mux-h2: Refuse interim responses with end-stream flag set
As state in RFC9113#8.1, HEADERS frame with the ES flag set that carries an
informational status code is malformed. However, there is no test on this
condition.

On 2.4 and higher, it is hard to predict consequences of this bug because
end of the message is only reported with a flag. But on 2.2 and lower, it
leads to a crash because there is an unexpected extra EOM block at the end
of an interim response.

Now, when a ES flag is detected on a HEADERS frame for an interim message, a
stream error is sent (RST_STREAM/PROTOCOL_ERROR).

This patch should solve the issue #1972. It should be backported as far as
2.0.
2022-12-22 13:46:21 +01:00
Willy Tarreau
f8c7709013 MINOR: mux-h2: add the expire task and its expiration date in "show fd"
Some issues such as #1929 seem to involve a task without timeout but we
can't find the condition to reproduce this in the code. However, not having
this info in the output doesn't help, so this patch adds the task pointer
and its timeout (when the task is non-null). It may be useful to backport
it.
2022-11-29 15:29:00 +01:00
Christopher Faulet
68ee7845cf CLEANUP: mux-h2: Remove unused fields in h2c structures
Some fields in h2c structures are not used: .mfl, .mft and .mff. Just remove
them.

.msi field is also removed. It is tested but never set, except when a H2
connection is initialized. It also means h2c_mux_busy() function is useless
because it always returns 0 (.msi is always -1). And thus, by transitivity,
H2_CF_DEM_MBUSY is also useless because it is never set. So .msi field,
h2c_mux_busy() function and H2C_MUX_BUSY flag are removed.
2022-11-17 14:33:15 +01:00
Christopher Faulet
ff7925dce0 MEDIUM: mux-h2: Introduce flags to deal with connection read/write errors
Similarly to the H1 multiplexer, H2_CF_ERR_PENDING is now used to report an
error when we try to send data and H2_CF_ERROR to report an error when we
try to read data. In other funcions, we rely on these flags instead of
connection ones. Only H2_CF_ERROR is considered as a final error.
H2_CF_ERR_PENDING does not block receive attempt.

In addition, we rely on H2_CF_RCVD_SHUT flag to test if a read0 was received
or not.
2022-11-17 14:33:15 +01:00
Willy Tarreau
8522348482 BUG/MAJOR: conn-idle: fix hash indexing issues on idle conns
Idle connections do not work on 32-bit machines due to an alignment issue
causing the connection nodes to be indexed with their lower 32-bits set to
zero and the higher 32 ones containing the 32 lower bitss of the hash. The
cause is the use of ebmb_node with an aligned data, as on this platform
ebmb_node is only 32-bit aligned, leaving a hole before the following hash
which is a uint64_t:

  $ pahole -C conn_hash_node ./haproxy
  struct conn_hash_node {
        struct ebmb_node           node;                 /*     0    20 */

        /* XXX 4 bytes hole, try to pack */

        int64_t                    hash;                 /*    24     8 */
        struct connection *        conn;                 /*    32     4 */

        /* size: 40, cachelines: 1, members: 3 */
        /* sum members: 32, holes: 1, sum holes: 4 */
        /* padding: 4 */
        /* last cacheline: 40 bytes */
  };

Instead, eb64 nodes should be used when it comes to simply storing a
64-bit key, and that is what this patch does.

For backports, a variant consisting in simply marking the "hash" member
with a "packed" attribute on the struct also does the job (tested), and
might be preferable if the fix is difficult to adapt. Only 2.6 and 2.5
are affected by this.
2022-10-03 12:06:36 +02:00
Willy Tarreau
6c0fadfb7d REORG: mux-h2: extract flags and enums into mux_h2-t.h
Originally in 1.8 we wanted to have an independent mux that could possibly
be disabled and would not impose dependencies on the outside. Everything
would fit into a single C file and that was fine.

Nowadays muxes are unavoidable, and not being able to easily inspect them
from outside is sometimes a bit of a pain. In particular, the flags utility
still cannot be used to decode their flags.

As a first step towards this, this patch moves the flags and enums to
mux_h2-t.h, as well as the two state decoding inline functions. It also
dropped the H2_SS_*_BIT defines that nobody uses. The mux_h2.c file remains
the only one to include that for now.
2022-09-12 19:33:07 +02:00
Willy Tarreau
7051f73efe MINOR: mux-h2: insert line breaks in "show sess all" output for legibility
h2s and h2c were extremely long in the "show sess all" output, around 300
chars each. This adds a few line breaks to improve legibility, there are
now 3 lines for each, which are around the same length as the other ones
while keeping a natural arrangement. E.g (lines highlighted with '>'):

  0x7faad8144f80: [02/Sep/2022:15:49:40.171620] id=105283 proto=tcpv4 source=127.0.0.1:42942
    flags=0x100c4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x1f44b20, pend_pos=(nil) waiting=0 epoch=0
    frontend=decrypt (id=2 mode=http), listener=? (id=3) addr=127.0.0.1:8001
    backend=decrypt (id=2 mode=http) addr=127.0.0.1:18144
    server=httpterm (id=1) addr=127.0.0.1:8000
    task=0x7faad812b7c0 (state=0x00 nice=0 calls=2 rate=0 exp=4s tid=7(1/7) age=0s)
    txn=0x7faad81453e0 flags=0x43000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4c rsp.f=0x0d
    scf=0x7faad81625d0 flags=0x00000080 state=EST endp=CONN,0x7faad811d380,0x02805001 sub=1
>       h2s=0x7faad811d380 h2s.id=2113 .st=HCR .flg=0x207001 .rxbuf=0@(nil)+0/0
>       .sc=0x7faad81625d0(.flg=0x00000080 .app=0x7faad8144f80) .sd=0x7faad8119dc0(.flg=0x02805001)
>       .subs=0x7faad81625e0(ev=1 tl=0x7faad86d6500 tl.calls=4 tl.ctx=0x7faad81625d0 tl.fct=sc_conn_io_cb)
>       h2c=0x7faad802c640 h2c.st0=FRH .err=0 .maxid=2157 .lastid=-1 .flg=0x0600 .nbst=1 .nbsc=1
>       .fctl_cnt=0 .send_cnt=0 .tree_cnt=1 .orph_cnt=0 .sub=1 .dsi=2157 .dbuf=0@(nil)+0/0
>       .msi=-1 .mbuf=[6..6|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0]
        co0=0x7faae402efc0 ctrl=tcpv4 xprt=RAW mux=H2 data=STRM target=LISTENER:0x1f43c40
        flags=0x00000300 fd=95 fd.state=121 updt=0 fd.tmask=0x80
    scb=0x7faad8145370 flags=0x00000011 state=EST endp=CONN,0x7faad8115630,0x02840001 sub=1
        co1=0x7faad86c0730 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x1f44b20
        flags=0x00000300 fd=1656 fd.state=10121 updt=0 fd.tmask=0x80
    req=0x7faad8144fa0 (f=0x49c40000 an=0x8000 pipe=0 tofwd=0 total=110)
        an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
        buf=0x7faad8144fa8 data=(nil) o=0 p=0 i=0 size=0
        htx=0xdd90a0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
    res=0x7faad8145000 (f=0x80040202 an=0x4000000 pipe=0 tofwd=-1 total=60365)
        an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
        buf=0x7faad8145008 data=(nil) o=0 p=0 i=0 size=0
        htx=0xdd90a0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
2022-09-02 16:43:03 +02:00
Willy Tarreau
bf4ec6f4a0 MINOR: mux-h2: provide a "show_sd" helper to output stream debugging info
With this, it now becomes possible to see the state of each H2 stream from
"show sess all". Lines are still too long and need to be split, but that's
for another patch.
2022-09-02 15:48:50 +02:00
Willy Tarreau
4e97bcc76b MINOR: mux-h2: extract the connection dump function out of h2_show_fd()
The function will be reusable to dump connections, so let's extract it.
2022-09-02 15:48:10 +02:00
Willy Tarreau
90bffa2ce3 MINOR: mux-h2: extract the stream dump function out of h2_show_fd()
The function will be reusable to dump streams, so let's extract it.
Note that due to "last_h2s" being originally printed as a prefix for
the stream dump, now the pointer is displayed by the caller instead.
2022-09-02 15:48:10 +02:00
Willy Tarreau
ba7657ca0f BUG/MINOR: mux-h2: fix the "show fd" dest buffer for the subscriber
Commit 98e40b981 ("MINOR: mux-h2: make the "show fd" helper also decode
the h2s subscriber when known") improved the output of "show fd" for the
H2 mux, but the output is sent to the trash buffer instead of the msg
argument. It turns out that this has no effect right now as the caller
passes the trash but this is risky.

This should be backported to 2.4.
2022-09-02 14:23:56 +02:00
Willy Tarreau
473e0e54f5 BUG/MINOR: mux-h2: send a CANCEL instead of ES on truncated writes
If a POST upload is cancelled after having advertised a content-length,
or a response body is truncated after a content-length, we're not allowed
to send ES because in this case the total body length must exactly match
the advertised value. Till now that's what we were doing, and that was
causing the other side (possibly haproxy) to respond with an RST_STREAM
PROTOCOL_ERROR due to "ES on DATA frame before content-length".

We can behave a bit cleaner here. Let's detect that we haven't sent
everything, and send an RST_STREAM(CANCEL) instead, which is designed
exactly for this purpose.

This patch could be backported to older versions but only a little bit
of exposure to make sure it doesn't wake up a bad behavior somewhere.
It relies on the following previous commit:

   "MINOR: mux-h2: make streams know if they need to send more data"
2022-08-19 08:03:53 +02:00