Commit Graph

859 Commits

Author SHA1 Message Date
Willy Tarreau
8050efeacb MINOR: cli: give the show_fd helpers the ability to report a suspicious entry
Now the show_fd helpers at the transport and mux levels return an integer
which indicates whether or not the inspected entry looks suspicious. When
an entry is reported as suspicious, "show fd" will suffix it with an
exclamation mark ('!') in the dump, that is supposed to help detecting
them.

For now, helpers were adjusted to adapt to the new API but none of them
reports any suspicious entry yet.
2021-01-21 08:58:15 +01:00
Willy Tarreau
98e40b9818 MINOR: mux-h2: make the "show fd" helper also decode the h2s subscriber when known
When dumping a valid h2 stream, also dump the subscriber, its events,
tasklet context and handler. Example:

    128 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x1 umask=0x0 owner=0x7f40380d7370 iocb=0x65b71b(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x1ad23e0 h2c.st0=FRP .err=0 .maxid=3 .lastid=-1 .flg=0x10000 .nbst=2 .nbcs=2 .fctl_cnt=0 .send_cnt=0 .tree_cnt=2 .orph_cnt=0 .sub=1 .dsi=3 .dbuf=16366@0x1ea9380+16441/16448 .msi=-1 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] last_h2s=0x20a8340 .id=3 .st=OPN .flg=0x4100 .rxbuf=0@(nil)+0/0 .cs=0x20a8440(.flg=0x00100000 .data=0x20a8738) .subs=0x20a8760(ev=1 tl=0x20a89b0 tl.calls=22 tl.ctx=0x20a8738 tl.fct=si_cs_io_cb) xprt=SSL xprt_ctx=0x1aaf4c0 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x1ad28e0(ev=1 tl=0x1ab3c70 tl.calls=176 tl.ctx=0x1ad23e0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0
2021-01-20 17:17:39 +01:00
Willy Tarreau
691d503896 MINOR: xprt/mux: export all *_io_cb functions so that "show fd" resolves them
In FD dumps it's often very important to figure what upper layer function
is going to be called. Let's export the few I/O callbacks that appear as
tasklet functions so that "show fd" can resolve them instead of printing
a pointer relative to main. For example:

   1028 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7f00b889b200 iocb=0x65b638(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7f00c8824de0 h2c.st0=FRH .err=0 .maxid=795 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=795 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7f00c86d0750 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7f00c88252e0(ev=1 tl=0x7f00a07d1aa0 tl.calls=1047 tl.ctx=0x7f00c8824de0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0
2021-01-20 17:17:39 +01:00
Willy Tarreau
ed4464e6c6 BUG/MINOR: mux_h2: missing space between "st" and ".flg" in the "show fd" helper
That was causing confusing outputs like this one whenan H2S is known:

   1030 : ... last_h2s=0x2ed8390 .id=775 .st=HCR.flg=0x4001 .rxbuf=...
                                                ^^^^

This was introduced by commit ab2ec4540 in 2.1-dev2 so the fix can be
backported as far as 2.1.
2021-01-20 17:17:39 +01:00
Tim Duesterhus
12a08d8849 BUG/MEDIUM: mux_h2: Add missing braces in h2_snd_buf()around trace+wakeup
This is a regression in 7838a79ba ("MEDIUM: mux-h2/trace: add lots of traces
all over the code"). The issue was found using -Wmisleading-indentation.
This patch fixes GitHub issue #1015.

The impact of this bug is that it could in theory cause occasional delays
on some long responses for connections having otherwise no traffic.

This patch should be backported to 2.1+, the commit was first tagged in
v2.1-dev2.
2020-12-22 09:02:11 +01:00
Christopher Faulet
fac0f8f029 CLEANUP: mux-h2: Rename h2c_frt_handle_data() to be generic
h2c_frt_handle_data() is now used to parse DATA frames on the frontend and
the backend side. Thus it is renamed into h2c_handle_data().
2020-12-18 15:05:57 +01:00
Christopher Faulet
142854b1da CLEANUP: mux-h2: Rename h2s_frt_make_resp_data() to be generic
h2s_frt_make_resp_data() is now used to emit DATA frames on the frontend and
the backend side. Thus it is renamed into h2s_make_data().
2020-12-18 15:05:57 +01:00
Christopher Faulet
4c8ad84232 MINOR: mux: Add a ctl parameter to get the exit status of the multiplexers
The ctl param MUX_EXIT_STATUS can be request to get the exit status of a
multiplexer. For instance, it may be an HTTP status code or an H2 error. For
now, 0 is always returned. When the mux h1 will be able to return HTTP
errors itself, this ctl param will be used to get the HTTP status code from
the logs.

the mux_exit_status enum has been created to map internal mux exist status
to generic one. Thus there is 5 possible status for now: success, invalid
error, timeout error, internal error and unknown.
2020-12-04 14:41:49 +01:00
Christopher Faulet
26256f86e1 MINOR: stream: Pass an optional input buffer when a stream is created
It is now possible to set the buffer used by the channel request buffer when
a stream is created. It may be useful if input data are already received,
instead of waiting the first call to the mux rcv_buf() callback. This change
is mandatory to support H1 connection with no stream attached.

For now, the multiplexers don't pass any buffer. BUF_NULL is thus used to
call stream_create_from_cs().
2020-12-04 14:41:48 +01:00
Willy Tarreau
f965b2ad13 BUG/MINOR: mux-h2/stats: not all GOAWAY frames are errors
The stats on haproxy.org reported ~12k GOAWAY for ~34k connections, with
only 2 protocol errorss. It turns out that the GOAWAY frame counter added
in commit a8879238c ("MINOR: mux-h2: report detected error on stats")
matches a bit too many situations. First it counts those which are not
sent as well as failed retries, second it counts as errors the cases of
attempts to cleanly close, while it's titled "GOAWAY sent on detected
error". Let's address this by moving the counter up one line and excluding
the clean codes.

This can be backported to 2.3.
2020-12-01 10:47:18 +01:00
Willy Tarreau
5dd36ac8a0 MINOR: mux-h2/trace: add traces at level ERROR for protocol errors
A number of traces could be added, and a few TRACE_PROTO were replaced
with TRACE_ERROR. The goal is to be able to enable error tracing only
to detect anomalies.

It looks like they're mostly correct as they don't seem to strike on
valid H2 traffic but are very verbose on h2spec.
2020-12-01 10:30:37 +01:00
Willy Tarreau
a307528fe2 BUG/MINOR: mux-h2/stats: make stream/connection proto errors more accurate
Since commit a8879238c ("MINOR: mux-h2: report detected error on stats")
we now have some error stats on stream/connection level protocol errors,
but some were improperly marked as stream while they're connection, and
2 or 3 relevant ones were missing and have now been added.

This could be backported to 2.3.
2020-12-01 10:25:20 +01:00
Christopher Faulet
5214099233 MINOR: config/mux-h2: Return ERR_ flags from init_h2() instead of a status
post-check function callbacks must return ERR_* flags. Thus, init_h2() is fixed
to return ERR_NONE on success or (ERR_ALERT|ERR_FATAL) on error.

This patch may be backported as far as 2.2.
2020-11-13 16:26:10 +01:00
Christopher Faulet
fafd1b0a5b CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream
Since the h2 multiplexer no longer relies on the legacy HTTP representation, and
uses exclusively the HTX, the H1 parser state (h1m) is no longer used by the h2
streams. Thus it can be removed.

This patch may be backported as far as 2.1.
2020-11-04 15:02:24 +01:00
Amaury Denoyelle
e7b891f7d3 MINOR: mux_h2: add stat for total count of connections/streams
Add counters for total number of http2 connections/stream since haproxy
startup. Contrary to open_conn/stream, they are never reset to zero.
2020-11-03 17:04:22 +01:00
Amaury Denoyelle
2ac34d97a6 MINOR: mux_h2: capitalize frame type in stats
http/2 frame type names are capitalized in the rfc, use the same
notation on the stats labels.
2020-11-03 17:04:22 +01:00
Amaury Denoyelle
66942c1d4d MINOR: mux-h2: count open connections/streams on stats
Implement as a gauge h2 counters for currently open connections and
streams. The counters are decremented when closing the stream or the
connection.
2020-10-28 08:55:23 +01:00
Amaury Denoyelle
a8879238ce MINOR: mux-h2: report detected error on stats
Implement counters for h2 protocol error on connection or stream level.
Also count the total number of rst_stream and goaway frames sent by the
mux in response to a detected error.
2020-10-28 08:55:19 +01:00
Amaury Denoyelle
2dec1ebec2 MINOR: mux-h2: add stats for received frame types
Implement counters for h2 frame received based on their type for
HEADERS, DATA, SETTINGS, RST_STREAM and GOAWAY.
2020-10-28 08:55:16 +01:00
Amaury Denoyelle
c92697d977 MINOR: mux-h2: add counters instance to h2c
Add pointer to counters as a member for h2c structure. This pointer is
initialized on h2_init function. This is useful to quickly access and
manipulate the counters inside every h2 functions.
2020-10-28 08:55:11 +01:00
Amaury Denoyelle
3238b3f906 MINOR: mux-h2: register a stats module
Use statistics API to register a new stats module generating counters
on h2 module. The counters are attached to frontend/backend instances.
2020-10-28 08:55:07 +01:00
Christopher Faulet
9a3d3fcb5d BUG/MAJOR: mux-h2: Don't try to send data if we know it is no longer possible
In h2_send(), if we are in a state where we know it is no longer possible to
send data, we must exit the sending loop to avoid any possiblity to loop
forever. It may happen if the mbuf ring is released while the H2_CF_MUX_MFULL
flag is still set. Here is a possible scenario to trigger the bug :

  1) The mbuf ring is full because we are unable to send data. The
     H2_CF_MUX_MFULL flag is set on the H2 connection.

  2) At this stage, the task timeout expires because the H2 connection is
     blocked. We enter in h2_timeout_task() function. Because the mbuf ring is
     full, we cannot send the GOAWAY frame. Thus the H2_CF_GOAWAY_FAILED flag is
     set. The H2 connection is not released yet because there is still a stream
     attached. Here we leave h2_timeout_task() function.

  3) A bit later, the H2 connection is woken up. If h2_process(), nothing is
     performed by the first attempt to send data, in h2_send(). Then, because
     the H2_CF_GOAWAY_FAILED flag is set, the mbuf ring is released. But the
     H2_CF_MUX_MFULL flag is still there. At this step a second attempt to send
     data is performed.

  4) In h2_send(), we try to send data in a loop. To exist this loop, done
     variable must be set to 1. Because the H2_CF_MUX_MFULL flag is set, we
     don't call h2_process_mux() and done is not updated. Because the mbuf ring
     is now empty, nothing is sent and the H2_CF_MUX_MFULL flag is never
     removed. Now, we loop forever... waiting for the watchdog.

To fix the bug, we now exit the loop if one of these conditions is true :

  - The H2_CF_GOAWAY_FAILED flag is set on the H2 connection
  - The CO_FL_SOCK_WR_SH flag is set on the underlying connection
  - The H2 connection is in the H2_CS_ERROR2 state

This patch should fix the issue #912 and most probably #875. It must be
backported as far as the 1.8.
2020-10-22 17:13:22 +02:00
Amaury Denoyelle
6b8daef56b MEDIUM: h2: remove conn from session on detach
H2 mux is marked with HOL blocking. On safe reuse mode, the connection
using it are placed on the sessions instead of the available lists to
avoid sharing it with several clients. On detach, if they are no more
streams, remove the connection from the session before adding it to the
idle list. If there is still used streams, do not add it to available
list as it should be already on the session list.
2020-10-15 15:19:34 +02:00
Amaury Denoyelle
3d3c0918dc MINOR: mux/connection: add a new mux flag for HOL risk
This flag is used to indicate if the mux protocol is subject to
head-of-line blocking problem.
2020-10-15 15:19:34 +02:00
Willy Tarreau
b1e600c9c5 BUG/MINOR: mux-h2: do not stop outgoing connections on stopping
There are reports of a few "SC" in logs during reloads when H2 is used
on the backend side. Christopher analysed this as being caused by the
proxy disabled test in h2_process(). As the comment says, this was done
for frontends only, and must absolutely not send a GOAWAY to the backend,
as all it will result in is to make newly queued streams fail.

The fix consists in simply testing the connection side before deciding
to send the GOAWAY.

This may be backported as far as 2.0, though for whatever reason it seems
to manifest itself only since 2.2 (probably due to changes in the outgoing
connection setup sequence).
2020-10-13 18:15:33 +02:00
Willy Tarreau
c3914d4fff MEDIUM: proxy: replace proxy->state with proxy->disabled
The remaining proxy states were only used to distinguish an enabled
proxy from a disabled one. Due to the initialization order, both
PR_STNEW and PR_STREADY were equivalent after startup, and they
would only differ from PR_STSTOPPED when the proxy is disabled or
shutdown (which is effectively another way to disable it).

Now we just have a "disabled" field which allows to distinguish them.
It's becoming obvious that start_proxies() is only used to print a
greeting message now, that we'd rather get rid of. Probably that
zombify_proxy() and stop_proxy() should be merged once their
differences move to the right place.
2020-10-09 11:27:30 +02:00
Christopher Faulet
aade4edc1a BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams
This patch is similar to the previous one on the fcgi. Same is true for the
H2. But the bug is far harder to trigger because of the protocol cinematic. But
it may explain strange aborts in some edge cases.

A read0 received on the connection must not be handled too early by H2 streams.
If the demux buffer is not empty, the pending read0 must not be considered. The
H2 streams must not be passed in half-closed remote state in
h2s_wake_one_stream() and the CS_FL_EOS flag must not be set on the associated
conn-stream in h2_rcv_buf(). To sum up, it means, if there are still data
pending in the demux buffer, no abort must be reported to the streams.

To fix the issue, a dedicated function has been added, responsible for detecting
pending read0 for a H2 connection. A read0 is reported only if the demux buffer
is empty. This function is used instead of conn_xprt_read0_pending() at some
places.

Note that the HREM stream state should not be used to report aborts. It is
performed on h2s_wake_one_stream() function and it is a legacy of the very first
versions of the mux-h2.

This patch should be backported as far as 2.0. In the 1.8, the code is too
different to apply it like that. But it is probably useless because the mux-h2
can only be installed on the client side.
2020-10-09 10:02:09 +02:00
Willy Tarreau
bba7a4dafd BUG/MINOR: h2/trace: do not display "stream error" after a frame ACK
When sending a frame ACK, the parser state is not equal to H2_CS_FRAME_H
and we used to report it as an error, which is not true. In fact we should
only indicate when we skip remaining data.

This may be backported as far as 2.1.
2020-09-18 07:41:28 +02:00
Willy Tarreau
8520d87198 MINOR: h2/trace: also display the remaining frame length in traces
It's often missing when debugging, even though it's often zero for
control frames or after data are consumed.
2020-09-18 07:39:29 +02:00
Willy Tarreau
022e5e56ed BUILD: traces: don't pass an empty argument for missing ones
It initially looked appealing to be able to call traces with ",,," for
unused arguments, but tcc doesn't like empty macro arguments, and quite
frankly, adding a zero between the few remaining ones is no big deal.
Let's do so now.
2020-09-10 09:37:52 +02:00
William Dauchy
477757c66b CLEANUP: fix all duplicated semicolons
trivial commit, does not change the code behaviour

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-08-10 08:49:38 +02:00
Christopher Faulet
810df06145 MEDIUM: htx: Add a flag on a HTX message when no more data are expected
The HTX_FL_EOI flag must now be set on a HTX message when no more data are
expected. Most of time, it must be set before adding the EOM block. Thus, if
there is no space for the EOM, there is still an information to know all data
were received and pushed in the HTX message. There is only an exception for the
HTTP replies (deny, return...). For these messages, the flag is set after all
blocks are pushed in the message, including the EOM block, because, on error,
we remove all inserted data.
2020-07-22 16:43:32 +02:00
Christopher Faulet
236c93b108 MINOR: connection: Set the conncetion target during its initialisation
When a new connection is created, its target is always set just after. So the
connection target may set when it is created instead, during its initialisation
to be precise. It is the purpose of this patch. Now, conn_new() function is
called with the connection target as parameter. The target is then passed to
conn_init(). It means the target must be passed when cs_new() is called. In this
case, the target is only used when the conn-stream is created with no
connection. This only happens for tcpchecks for now.
2020-07-15 14:08:14 +02:00
Christopher Faulet
08016ab82d MEDIUM: connection: Add private connections synchronously in session server list
When a connection is marked as private, it is now added in the session server
list. We don't wait a stream is detached from the mux to do so. When the
connection is created, this happens after the mux creation. Otherwise, it is
performed when the connection is marked as private.

To allow that, when a connection is created, the session is systematically set
as the connectin owner. Thus, a backend connection has always a owner during its
creation. And a private connection has always a owner until its death.

Note that outside the detach() callback, if the call to session_add_conn()
failed, the error is ignored. In this situation, we retry to add the connection
into the session server list in the detach() callback. If this fails at this
step, the multiplexer is destroyed and the connection is closed.
2020-07-15 14:08:14 +02:00
Christopher Faulet
21ddc74e8a MINOR: connection: Add a wrapper to mark a connection as private
To set a connection as private, the conn_set_private() function must now be
called. It sets the CO_FL_PRIVATE flags, but it also remove the connection from
the available connection list, if necessary. For now, it never happens because
only HTTP/1 connections may be set as private after their creation. And these
connections are never inserted in the available connection list.
2020-07-15 14:08:14 +02:00
Christopher Faulet
c64badd573 MINOR: connection: Set new connection as private on reuse never
When a new connection is created, it may immediatly be set as private if
http-reuse never is configured for the backend. There is no reason to wait the
call to mux->detach() to do so.
2020-07-15 14:08:14 +02:00
Christopher Faulet
c5579d18d1 BUG/MEDIUM: mux-h2: Don't add private connections in available connection list
When a stream is detached from a backend private connection, we must not insert
it in the available connection list. In addition, we must be sure to remove it
from this list. To ensure it is properly performed, this part has been slightly
refactored to clearly split processing of private connections from the others.

This patch should probably be backported to 2.2.
2020-07-15 14:08:14 +02:00
Willy Tarreau
a9d7b76f6a MINOR: connection: use MT_LIST_ADDQ() to add connections to idle lists
When a connection is added to an idle list, it's already detached and
cannot be seen by two threads at once, so there's no point using
TRY_ADDQ, there will never be any conflict. Let's just use the cheaper
ADDQ.
2020-07-10 08:52:13 +02:00
Willy Tarreau
8689127816 MINOR: buffer: use MT_LIST_ADDQ() for buffer_wait lists additions
The TRY_ADDQ there was not needed since the wait list is exclusively
owned by the caller. There's a preliminary test on MT_LIST_ADDED()
that might have been eliminated by keeping MT_LIST_TRY_ADDQ() but
it would have required two more expensive writes before testing so
better keep the test the way it is.
2020-07-10 08:52:13 +02:00
Willy Tarreau
de4db17dee MINOR: lists: rename some MT_LIST operations to clarify them
Initially when mt_lists were added, their purpose was to be used with
the scheduler, where anyone may concurrently add the same tasklet, so
it sounded natural to implement a check in MT_LIST_ADD{,Q}. Later their
usage was extended and MT_LIST_ADD{,Q} started to be used on situations
where the element to be added was exclusively owned by the one performing
the operation so a conflict was impossible. This became more obvious with
the idle connections and the new macro was called MT_LIST_ADDQ_NOCHECK.

But this remains confusing and at many places it's not expected that
an MT_LIST_ADD could possibly fail, and worse, at some places we start
by initializing it before adding (and the test is superflous) so let's
rename them to something more conventional to denote the presence of the
check or not:

   MT_LIST_ADD{,Q}    : inconditional operation, the caller owns the
                        element, and doesn't care about the element's
                        current state (exactly like LIST_ADD)
   MT_LIST_TRY_ADD{,Q}: only perform the operation if the element is not
                        already added or in the process of being added.

This means that the previously "safe" MT_LIST_ADD{,Q} are not "safe"
anymore. This also means that in case of backport mistakes in the
future causing this to be overlooked, the slower and safer functions
will still be used by default.

Note that the missing unchecked MT_LIST_ADD macro was added.

The rest of the code will have to be reviewed so that a number of
callers of MT_LIST_TRY_ADDQ are changed to MT_LIST_ADDQ to remove
the unneeded test.
2020-07-10 08:50:41 +02:00
Ilya Shipitsin
46a030cdda CLEANUP: assorted typo fixes in the code and comments
This is 11th iteration of typo fixes
2020-07-06 14:34:32 +02:00
Willy Tarreau
46ac781779 BUILD: mux-h2: fix typo breaking build when using DEBUG_LOCK
A typo was accidently introduced in commit 48ce6a3 ("BUG/MEDIUM: muxes:
Make sure nobody stole the connection before using it."), a "&" was
placed in front of "OTHER_LOCK", which breaks DEBUG_LOCK. No backport
is needed.
2020-07-04 07:16:18 +02:00
Olivier Houchard
a74bb7e26e BUG/MEDIUM: connections: Let the xprt layer know a takeover happened.
When we takeover a connection, let the xprt layer know. If it has its own
tasklet, and it is already scheduled, then it has to be destroyed, otherwise
it may run the new mux tasklet on the old thread.

Note that we only do this for the ssl xprt for now, because the only other
one that might wake the mux up is the handshake one, which is supposed to
disappear before idle connections exist.

No backport is needed, this is for 2.2.
2020-07-03 17:49:33 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which would randomly reject ~1% of the incoming
connection and randomly killing some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s otherwise it's necessary to restart to
reset the idle connections lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Olivier Houchard
48ce6a3ab1 BUG/MEDIUM: muxes: Make sure nobody stole the connection before using it.
In the various timeout functions, make sure nobody stole the connection from
us before attempting to doing anything with it, there's a very small race
condition between the time we access the task context, and the time we
actually check it again with the lock, where it could have been free'd.
2020-07-02 14:17:25 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Willy Tarreau
617e80ff76 MEDIUM: mux-h2: use task_kill() during h2_takeover() instead of task_wakeup()
task_wakeup() passes the task through the global run queue under the
global RQ lock, which is expensive when dealing with large amounts of
h2_takeover() calls. Let's use the new task_kill() instead to kill the
task.
2020-07-01 16:47:12 +02:00
Willy Tarreau
bd42e9257d MINOR: mux-h2: avoid taking the toremove_lock in on dying tasks
If the owning task is already dying (context was destroyed by h2_takeover)
there's no point taking the lock then removing it later since all the code
in between is conditionned by a non-null context. Let's simplify this.
2020-06-30 14:06:19 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
3846877329 BUG/MINOR: mux_h2: don't lose the leaving trace in h2_io_cb()
Commit cd4159f ("MEDIUM: mux_h2: Implement the takeover() method.")
added a return in the middle of the function, and as usual with such
stray return statements, some unrolling was lost. Here it's only the
TRACE_LEAVE() call, so it's mostly harmless. That's 2.2 only, no
backport is needed.
2020-06-28 00:34:23 +02:00
Willy Tarreau
b2551057af CLEANUP: include: tree-wide alphabetical sort of include files
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
2020-06-11 10:18:59 +02:00
Willy Tarreau
36979d9ad5 REORG: include: move the error reporting functions to from log.h to errors.h
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().

This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
2020-06-11 10:18:59 +02:00
Willy Tarreau
6be7849f39 REORG: include: move cfgparse.h to haproxy/cfgparse.h
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.

There is nothing left in common/ now so this directory must not be used
anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5e539c9b8d REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h
Almost no changes, removed stdlib and added buf-t and connection-t to
the types to avoid a warning.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c6d61d762f REORG: include: move trace.h to haproxy/trace{,-t}.h
Only thread-t was added to satisfy THREAD_LOCAL but the rest was OK.
2020-06-11 10:18:58 +02:00
Willy Tarreau
48d25b3bc9 REORG: include: move session.h to haproxy/session{,-t}.h
Almost no change was needed beyond a little bit of reordering of the
types file and adjustments to use session-t instead of session at a
few places.
2020-06-11 10:18:58 +02:00
Willy Tarreau
7ea393d95e REORG: include: move connection.h to haproxy/connection{,-t}.h
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
2020-06-11 10:18:58 +02:00
Willy Tarreau
87735330d1 REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
2020-06-11 10:18:57 +02:00
Willy Tarreau
bf0731491b REORG: include: move common/h2.h to haproxy/h2.h
No change was performed, the file is only included from C files and
currently doesn't need to be split into types+functions.
2020-06-11 10:18:57 +02:00
Willy Tarreau
be327fa332 REORG: include: move hpack*.h to haproxy/ and split hpack-tbl
The various hpack files are self-contained, but hpack-tbl was one of
those showing difficulties when pools were added because that began
to add quite some dependencies. Now when built in standalone mode,
it still uses the bare minimum pool definitions and doesn't require
to know the prototypes anymore when only the structures are needed.
Thus the files were moved verbatim except for hpack-tbl which was
split between types and prototypes.
2020-06-11 10:18:57 +02:00
Willy Tarreau
16f958c0e9 REORG: include: split common/htx.h into haproxy/htx{,-t}.h
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
5413a87ad3 REORG: include: move common/h1.h to haproxy/h1.h
The file was moved as-is. There was a wrong dependency on dynbuf.h
instead of buf.h which was addressed. There was no benefit to
splitting this between types and functions.
2020-06-11 10:18:57 +02:00
Willy Tarreau
6131d6a731 REORG: include: move common/net_helper.h to haproxy/net_helper.h
No change was necessary.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2741c8c4aa REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.

This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.

Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
2020-06-11 10:18:57 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Willy Tarreau
8d2b777fe3 REORG: ebtree: move the include files from ebtree to include/import/
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.

The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).

A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.

No further cleanup was done yet in order to keep changes minimal.
2020-06-11 09:31:11 +02:00
Willy Tarreau
2bdcc70fa7 MEDIUM: hpack: use a pool for the hpack table
Instead of using malloc/free to allocate an HPACK table, let's declare
a pool. However the HPACK size is configured by the H2 mux, so it's
also this one which allocates it after post_check.
2020-05-19 11:40:39 +02:00
Olivier Houchard
199d4fade4 MINOR: muxes: Note that we can't usee a connection when added to the srv idle.
In the various muxes, add a comment documenting that once
srv_add_to_idle_list() got called, any thread may pick that conenction up,
so it is unsafe to access the mux context/the connection, the only thing we
can do is returning.
2020-03-22 23:25:51 +01:00
Olivier Houchard
cd4159f039 MEDIUM: mux_h2: Implement the takeover() method.
Implement a takeover() method in the mux_h2, so that other threads may
take an idle connection over if they need it.
2020-03-19 22:07:34 +01:00
Olivier Houchard
f0d4dff25c MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
2020-03-19 22:07:33 +01:00
Olivier Houchard
dc2f2753e9 MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
2020-03-19 22:07:33 +01:00
Olivier Houchard
2444aa5b66 MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
2020-03-19 22:07:33 +01:00
Olivier Houchard
8676514d4e MINOR: servers: Kill priv_conns.
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
2020-03-11 19:20:01 +01:00
Willy Tarreau
f4629a5346 BUG/MINOR: connection/debug: do not enforce !event_type on subscribe() anymore
When building with DEBUG_STRICT, there are still some BUG_ON(events&event_type)
in the subscribe() code which are not welcome anymore since we explicitly
permit to wake the caller up on readiness. This causes some regtests to fail
since 2c1f37d353 ("OPTIM: mux-h1: subscribe rather than waking up at a few
other places") when built with this option.

No backport is needed, this is 2.2-dev.
2020-03-05 07:46:33 +01:00
Willy Tarreau
2104659cd5 MEDIUM: buffer: remove the buffer_wq lock
This lock was only needed to protect the buffer_wq list, but now we have
the mt_list for this. This patch simply turns the buffer_wq list to an
mt_list and gets rid of the lock.

It's worth noting that the whole buffer_wait thing still looks totally
wrong especially in a threaded context: the wakeup_cb() callback is
called synchronously from any thread and may end up calling some
connection code that was not expected to run on a given thread. The
whole thing should probably be reworked to use tasklets instead and be
a bit more centralized.
2020-02-26 10:39:36 +01:00
Willy Tarreau
d57e34978d BUG/MINOR: mux: do not call conn_xprt_stop_recv() on buffer shortage
In H1/H2/FCGI, the *_get_buf() functions try to disable receipt of data
when there's no buffer available. But they do so at the lowest possible
level, which is unrelated to the upper transport layers which may still
be trying to feed data based on subscription. The correct approach here
would theorically be to only disable subscription, though when we get
there, the subscription will already have been dropped, so we can safely
just remove that call.

It's unlikely that this could have had any practical impact, as the upper
xprt layer would call this callback which would fail an not resubscribe.
Having the lowest layer disabled would just be temporary since when
re-enabling reading, a subscribe at the end of data would re-enable it.

Backport should not harm but seems useless at this point.
2020-02-21 11:21:12 +01:00
Olivier Houchard
12ffab03b6 BUG/MEDIUM: muxes: Use the right argument when calling the destroy method.
When calling the mux "destroy" method, the argument should be the mux
context, not the connection. In a few instances in the mux code, the
connection was used (mainly when the session wouldn't handle the idle
connection, and the server pool was fool), and that could lead to random
segfaults.

This should be backported to 2.1, 2.0, and 1.9
2020-02-14 13:28:38 +01:00
Willy Tarreau
508f989758 BUG/MAJOR: mux-h2: don't wake streams after connection was destroyed
In commit 477902b ("MEDIUM: connections: Get ride of the xprt_done
callback.") we added an inconditional call to h2_wake_some_streams()
in h2_wake(), though we must not do it if the connection is destroyed
or we end up with a use-after-free. In this case it's already done in
h2_process() before destroying the connection anyway.

Let's just add this test for now. A cleaner approach might consist in
doing it in the h2_process() function itself when a connection status
change is detected.

No backport is needed, this is purely 2.2.
2020-02-11 04:42:05 +01:00
Willy Tarreau
bb2c4ae065 BUG/MEDIUM: mux-h2: make sure we don't emit TE headers with anything but "trailers"
While the H2 parser properly checks for the absence of anything but
"trailers" in the TE header field, we forget to check this when sending
the request to an H2 server. The problem is that an H2->H2 conversion
may keep "gzip" and fail on the next stage.

This patch makes sure that we only send "TE: trailers" if the TE header
contains the "trailers" token, otherwise it's dropped.

This fixes issue #464 and should be backported till 1.9.
2020-01-24 09:07:53 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Willy Tarreau
ee1a6fc943 MINOR: connection: make the last arg of subscribe() a struct wait_event*
The subscriber used to be passed as a "void *param" that was systematically
cast to a struct wait_event*. By now it appears clear that the subscribe()
call at every layer is well defined and always takes a pointer to an event
subscriber of type wait_event, so let's enforce this in the functions'
prototypes, remove the intermediary variables used to cast it and clean up
the comments to clarify what all these functions do in their context.
2020-01-17 18:30:37 +01:00
Willy Tarreau
f96508aae6 MEDIUM: mux-h2: merge recv_wait and send_wait event notifications
This is the continuation of the recv+send event notifications merge
that was started. This patch is less trivial than the previous ones
because the existence of a send event subscription is also used to
decide to put a stream back into the send list.
2020-01-17 18:30:36 +01:00
Willy Tarreau
5723f295d8 MEDIUM: mux-h2: do not make an h2s subscribe to itself on deferred shut
The logic handling the deferred shutdown is a bit complex because it
involves a wait_event struct in each h2s dedicated to subscribing to
itself when shutdowns are not immediately possible. This implies that
we will not be able to support a shutdown and a receive subscription
in the future when we merge all wait events.

Let's solely rely on the H2_SF_WANT_SHUT_{R,W} flags instead and have
an autonomous tasklet for this. This requires to add a few controls
in the code because now when waking up a stream we need to check if it
is for I/O or just a shut, but since sending and shutting are exclusive
it's not difficult.

One point worth noting is that further resources could be shaved off
by only allocating the tasklet when failing to shut, given that in the
vast majority of streams it will never be used. In fact the sole purpose
of the tasklet is to support calling this code from outside the H2 mux
context. Looking at the code, it seems that not too many adaptations
would be required to have the send_list walking code deal with sending
the shut bits itself and further simplify all this.
2020-01-17 18:30:36 +01:00
Willy Tarreau
d9464167fa MEDIUM: mux-h2: do not try to stop sending streams on blocked mux
This partially reverts commit d846c267 ("MINOR: h2: Don't run tasks that
are waiting to send if mux in full"). This commit was introduced to
limit the start/stop overhead incurred by waking many streams to let
only a few work. But since commit 9c218e7521 ("MAJOR: mux-h2: switch
to next mux buffer on buffer full condition."), this situation occurs
way less (typically 2000 to 4000 times less often) and the benefits of
the patch above do not outweigh its shortcomings anymore. And commit
c7ce4e3e7f ("BUG/MEDIUM: mux-h2: don't stop sending when crossing a
buffer boundary") addressed a root cause of many unexpected sleeps and
wakeups.

The main problem it's causing is that it requires to keep the element
in the send_wait list until it's executed, leaving the entry in an
uncertain state, and significantly complicating the coexistence of this
list and the wait list dedicated to shutdown. Also it happens that this
call to tasklet_remove_from_task_list() will not be usable anymore once
we start to support streams on different threads. And finally, some of
the other streams that we remove might very well have managed to find
their way to the h2_snd_buf() with an unblocked condition as well so it
is possible that some of these removals were not welcome.

So this patch now makes sure that send_wait is immediately nulled when
the task is woken up, and that we don't have to play with it afterwards.
Since we don't need to stop the tasklets anymore, we don't need the
sending_list that we can remove.

However one very useful benefit of the sending_list was that it used to
provide the information about the fact that the stream already tried to
send and failed. This was an important factor to improve fairness because
late arrived streams should not be allowed to send if others are already
scheduled. So this patch introduces a new per-stream flag H2_SF_NOTIFIED
to distinguish such streams.

With this patch the fairness is preserved, and the ratio of aborted
h2_snd_buf() due to other streams already sending remains quite low
(~0.3-2.1% measured depending on object size, this is within
expectations for 100 independent streams).

If the contention issue the patch above used to address comes up again
in the future, a much better (though more complicated) solution would
be to switch to per-connection buffer pools to distribute between the
connection and the streams so that by default there are more buffers
available for the mux and the streams only have some when the mux's are
unused, i.e. it would push the memory pressure back to the data layer.

One observation made while developing this patch is that when dealing
with large objects we still spend a huge amount of time scanning the
send_list with tasks that are already woken up every time a send()
manages to purge a bit more data. Indeed, by removing the elements
from the list when H2_SF_NOTIFIED is set, the netowrk bandwidth on
1 MB objects fetched over 100 streams per connection increases by 38%.
This was not done here to preserve fairness but is worth studying (e.g.
by keeping a restart pointer on the list or just having a flag indicating
if an entry was added since last scan).
2020-01-17 18:30:36 +01:00
Willy Tarreau
c7ce4e3e7f BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary
In version 2.0, after commit 9c218e7521 ("MAJOR: mux-h2: switch to next
mux buffer on buffer full condition."), the H2 mux started to use a ring
buffer for the output data in order to reduce competition between streams.
However, one corner case was suboptimally covered: when crossing a buffer
boundary, we have to shrink the outgoing frame size to the one left in
the output buffer, but this shorter size is later used as a signal of
incomplete send due to a buffer full condition (which used to be true when
using a single buffer). As a result, function h2s_frt_make_resp_data()
used to return less than requested, which in turn would cause h2_snd_buf()
to stop sending and leave some unsent data in the buffer, and si_cs_send()
to subscribe for sending more later.

But it goes a bit further than this, because subscribing to send again
causes the mux's send_list not to be empty anymore, hence extra streams
can be denied the access to the mux till the first stream is woken again.
This causes a nasty wakeup-sleep dance between streams that makes it
totally impractical to try to remove the sending list. A test showed
that it was possible to observe 3 million h2_snd_buf() giveups for only
100k requests when using 100 concurrent streams on 20kB objects.

It doesn't seem likely that a stream could get blocked and time out due
to this bug, though it's not possible either to demonstrate the opposite.
One risk is that incompletely sent streams do not have any blocking flags
so they may not be identified as blocked. However on first scan of the
send_list they meet all conditions for a wakeup.

This patch simply allows to continue on a new frame after a partial
frame. with only this change, the number of failed h2_snd_buf() was
divided by 800 (4% of calls). And by slightly increasing the H2C_MBUF_CNT
size, it can go down to zero.

This fix must be backported to 2.1 and 2.0.
2020-01-14 13:55:04 +01:00
Willy Tarreau
70c5b0e5fd BUG/MEDIUM: mux-h2: fix missing test on sending_list in previous patch
Previous commit 989539b048 ("BUG/MINOR: mux-h2: use a safe
list_for_each_entry in h2_send()") accidently lost its sending_list test,
resulting in some elements to be woken up again while already in the
sending_list and h2_unsubscribe() crashing on integrity tests (only
when built with DEBUG_DEV).

If the fix above is backported this one must be as well.
2020-01-10 18:20:15 +01:00
Willy Tarreau
989539b048 BUG/MINOR: mux-h2: use a safe list_for_each_entry in h2_send()
h2_send() uses list_for_each_entry() to scan paused streams and resume
them, but happily deletes any leftover from a previous failed unsubscribe,
which is obviously not safe and would corrupt the list. In practice this
is a proof that this doesn't happen, but it's not the best way to prove it.
In order to fix this and reduce the maintenance burden caused by code
duplication (this list walk exists at 3 places), let's introduce a new
function h2_resume_each_sending_h2s() doing exactly this and use it at
all 3 places.

This bug was introduced as a side effect of fix 998410a41b ("BUG/MEDIUM:
h2: Revamp the way send subscriptions works.") so it should be backported
as far as 1.9.
2020-01-10 17:18:32 +01:00
William Dauchy
cd7fa3dcfc CLEANUP: mux-h2: remove unused goto "out_free_h2s"
Since commit fa8aa867b9 ("MEDIUM: connections: Change struct
wait_list to wait_event.") we no longer use this section.

this should fix github issue #437

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-08 16:16:19 +01:00
Willy Tarreau
f3ce0418aa MINOR: mux-h2/trace: report the connection and/or stream error code
We were missing the error code when tracing a call to h2s_error() or
h2c_error(), let's report it when it's set.
2019-11-25 11:34:26 +01:00
Willy Tarreau
57a1816fae BUG/MAJOR: mux-h2: don't try to decode a response HEADERS frame in idle state
Christopher found another issue in the H2 backend implementation that
results from a miss in the H2 spec: the processing of a HEADERS frame
is always permitted in IDLE state, but this doesn't make sense on the
response path! And here when facing such a frame, we try to decode it
while we didn't allocate any stream, so we end up trying to fill the
idle stream's buffer (read-only) and crash.

What we're doing here is that if we get a HEADERS frame in IDLE state
from a server, we terminate the connection with a PROTOCOL_ERROR. No
such transition seems to be permitted by the spec but it seems to be
the only sane solution.

This fix must be backported as far as 1.9. Note that in 2.0 and earlier
there's no h2_frame_check_vs_state() function, instead the check is
inlined in h2_process_demux().
2019-11-25 11:34:20 +01:00
Christopher Faulet
ea009736d8 BUILD: debug: Avoid warnings in dev mode with -02 because of some BUG_ON tests
Some BUG_ON() tests emit a warning because of a potential null pointer
dereference on an HTX block. In fact, it should never happen, but now, GCC is
happy.

This patch must be backported to 2.0.
2019-11-20 14:11:47 +01:00
Willy Tarreau
cab2295ae7 BUG/MEDIUM: mux-h2: immediately report connection errors on streams
In case a stream tries to send on a connection error, we must report the
error so that the stream interface keeps the data available and may safely
retry on another connection. Till now this would happen only before the
connection was established, not in case of a failed handshake or an early
GOAWAY for example.

This should be backported to 2.0 and 1.9.
2019-10-31 15:48:18 +01:00
Willy Tarreau
4481e26e5d BUG/MEDIUM: mux-h2: immediately remove a failed connection from the idle list
If a connection faces an error or a timeout, it must be removed from its
idle list ASAP. We certainly don't want to risk sending new streams on
it.

This should be backported to 2.0 (replacing MT_LIST_DEL with LIST_DEL_LOCKED)
and 1.9 (there's no lock there, the idle lists are per-thread and per-server
however a LIST_DEL_INIT will be needed).
2019-10-31 15:39:27 +01:00
Willy Tarreau
c61966f9b4 BUG/MEDIUM: mux-h2: report no available stream on a connection having errors
If an H2 mux has met an error, we must not report available streams
anymore, or it risks to accumulate new streams while not being able
to process them.

This should be backported to 2.0 and 1.9.
2019-10-31 15:10:03 +01:00
Olivier Houchard
9b8e11e691 MINOR: mux: Add a new method to get informations about a mux.
Add a new method, ctl(), to muxes. It uses a "enum mux_ctl_type" to
let it know which information we're asking for, and can output it either
directly by returning the expected value, or by using an optional argument.
"output" argument.
Right now, the only known mux_ctl_type is MUX_STATUS, that will return 0 if
the mux is not ready, or MUX_STATUS_READY if the mux is ready.

We probably want to backport this to 1.9 and 2.0.
2019-10-29 14:15:20 +01:00
Christopher Faulet
69fe5cea21 BUG/MINOR: mux-h2: Don't pretend mux buffers aren't full anymore if nothing sent
In h2_send(), when something is sent, we remove the flags
(H2_CF_MUX_MFULL|H2_CF_DEM_MROOM) on the h2 connection. This way, we are able to
wake up all streams waiting to send data. Unfortunatly, these flags are
unconditionally removed, even when nothing was sent. So if the h2c is blocked
because the mux buffers are full and we are unable to send anything, all streams
in the send_list are woken up for nothing. Now, we only remove these flags if at
least a send succeeds.

This patch must be backport to 2.0.
2019-10-26 08:24:45 +02:00
Willy Tarreau
9364a5fda3 BUG/MINOR: mux-h2: do not emit logs on backend connections
The logs were added to the H2 mux so that we can report logs in case
of errors that prevent a stream from being created, but as a side effect
these logs are emitted twice for backend connections: once by the H2 mux
itself and another time by the upper layer stream. It can even happen
more with connection retries.

This patch makes sure we do not emit logs for backend connections.

It should be backported to 2.0 and 1.9.
2019-10-23 11:12:22 +02:00
Willy Tarreau
572d9f5847 MINOR: mux-h2: also support emitting CONTINUATION on trailers
Trailers were forgotten by commit cb985a4da6 ("MEDIUM: mux-h2: support
emitting CONTINUATION frames after HEADERS"), this one just fixes this
miss.
2019-10-11 17:00:04 +02:00
Olivier Houchard
5a3671d8b1 MINOR: h2: Document traps to be avoided on multithread.
Document a few traps to avoid if we ever attempt to allow the upper layer
of the mux h2 to be run by multiple threads.
2019-10-11 16:37:41 +02:00
Willy Tarreau
b8ce8905cf MEDIUM: mux-h2: do not map Host to :authority on output
Instead of mapping the Host header field to :authority, we now act
differently if the request is in origin form or in absolute form.
If it's absolute, we extract the scheme and the authority from the
request, fix the path if it's empty, and drop the Host header.
Otherwise we take the scheme from the http/https flags in the HTX
layer, make the URI be the path only, and emit the Host header,
as indicated in RFC7540#8.1.2.3. This allows to distinguish between
absolute and origin requests for H1 to H2 conversions.
2019-10-09 11:10:19 +02:00
Willy Tarreau
cb985a4da6 MEDIUM: mux-h2: support emitting CONTINUATION frames after HEADERS
There are some reports of users not being able to pass "enterprise"
traffic through haproxy when using H2 because it doesn't emit CONTINUATION
frames and as such is limited to headers no longer than the negociated
max-frame-size which usually is 16 kB.

This patch implements support form emitting CONTINUATION when a HEADERS
frame cannot fit within a limit of mfs. It does this by first filling a
buffer-wise frame, then truncating it starting from the tail to append
CONTINUATION frames. This makes sure that we can truncate on any byte
without being forced to stop on a header boundary, and ensures that the
common case (no fragmentation) doesn't add any extra cost. By moving
the tail first we make sure that each byte is moved only once, thus the
performance impact remains negligible.

This addresses github issue #249.
2019-10-07 18:18:32 +02:00
Christopher Faulet
67d580994e MINOR: http: Remove headers matching the name of http-send-name-header option
It is not explicitly stated in the documentation, but some users rely on this
behavior. When the server name is inserted in a request, headers with the same
name are first removed.

This patch is not tagged as a bug, because it is not explicitly documented. We
choose to keep the same implicit behavior to not break existing
configuration. Because this option is used very little, it is not a big deal.
2019-10-04 16:12:02 +02:00
Christopher Faulet
f81ef0344e BUG/MINOR: mux-h2/trace: Fix traces on h2c initialization
When a new H2 connection is initialized, the connection context is not changed
before the end. So, traces emitted during this initialization are buggy, except
the last one when no error occurred, because the connection context is not an
h2c.

To fix the bug, the connection context is saved and set as soon as possible. So,
the connection can always safely be used in all traces, except for the very
first one. And on error, the connection context is restored.

No need to backport.
2019-10-04 15:46:59 +02:00
Willy Tarreau
c2ea47fb18 BUG/MEDIUM: mux-h2: do not enforce timeout on long connections
Alexandre Derumier reported issue #308 in which the client timeout will
strike on an H2 mux when it's shorter than the server's response time.
What happens in practice is that there is no activity on the connection
and there's no data pending on output so we can expire it. But this does
not take into account the possibility that some streams are in fact
waiting for the data layer above. So what we do now is that we enforce
the timeout when:
  - there are no more streams
  - some data are pending in the output buffer
  - some streams are blocked on the connection's flow control
  - some streams are blocked on their own flow control
  - some streams are in the send/sending list

In all other cases the connection will not timeout as it means that some
streams are actively used by the data layer.

This fix must be backported to 2.0, 1.9 and probably 1.8 as well. It
depends on the new "blocked_list" field introduced by "MINOR: mux-h2:
add a per-connection list of blocked streams". It would be nice to
also backport "ebtree: make eb_is_empty() and eb_is_dup() take a const"
to avoid a build warning.
2019-10-02 15:27:03 +02:00
Willy Tarreau
9edf6dbecc MINOR: mux-h2: add a per-connection list of blocked streams
Currently the H2 mux doesn't have a list of all the streams blocking on
the H2 side. It only knows about those trying to send or waiting for a
connection window update. It is problematic to enforce timeouts because
we never know if a stream has to live as long as the data layer wants or
has to be timed out becase it's waiting for a stream window update. This
patch adds a new list, "blocked_list", to store streams blocking on
stream flow control, or later, dependencies. Streams blocked on sfctl
are now added there. It doesn't modify the rest of the logic.
2019-10-02 14:16:14 +02:00
Willy Tarreau
35fb846333 MINOR: mux-h2/trace: missing conn pointer in demux full message
One trace was missing the connection's pointer, reporting "demux buffer full"
without indicating for what connection it was.
2019-10-02 14:16:14 +02:00
Christopher Faulet
72ba6cd8c0 MINOR: http: Add server name header from HTTP multiplexers
the option "http-send-name-header" is an eyesore. It was responsible of several
bugs because it is handled after the message analysis. With the HTX
representation, the situation is cleaner because no rewind on forwarded data is
required. But it remains ugly.

With recent changes in HAProxy, we have the opportunity to make it fairly
better. The message formatting in now done in the HTTP multiplexers. So it seems
to be the right place to handle this option. Now, the server name is added by
the HTTP multiplexers (h1, h2 and fcgi).
2019-09-27 08:48:21 +02:00
Christopher Faulet
5112a603d9 BUG/MAJOR: mux_h2: Don't consume more payload than received for skipped frames
When a frame is received for a unknown or already closed stream, it must be
skipped. This also happens when a stream error is reported. But we must be sure
to only skip received data. In the loop in h2_process_demux(), when such frames
are handled, all the frame lenght is systematically skipped. If the frame
payload is partially received, it leaves the demux buffer in an undefined
state. Because of this bug, all sort of errors may be observed, like crash or
intermittent freeze.

This patch must be backported to 2.0, 1.9 and 1.8.
2019-09-26 16:51:02 +02:00
Christopher Faulet
ea7a7781a9 BUG/MINOR: mux-h2: Use the dummy error when decoding headers for a closed stream
Since the commit 6884aa3e ("BUG/MAJOR: mux-h2: Handle HEADERS frames received
after a RST_STREAM frame"), HEADERS frames received for an unknown or already
closed stream are decoded. Once decoded, an error is reported for the
stream. But because it is a dummy stream (h2_closed_stream), its state cannot be
changed. So instead, we must return the dummy error stream (h2_error_stream).

This patch must be backported to 2.0 and 1.9.
2019-09-26 16:51:02 +02:00
Christopher Faulet
b2d930ebe6 BUG/MINOR: mux-h2: Fix missing braces because of traces in h2_detach()
Braces was missing aroung a "if" statement in the function h2_detach(), leaving
an unconditional return.

No backport needed.
2019-09-26 16:51:02 +02:00
Willy Tarreau
4c08f12dd8 BUG/MEDIUM: mux-h2: don't reject valid frames on closed streams
Consecutive to commit 6884aa3eb0 ("BUG/MAJOR: mux-h2: Handle HEADERS frames
received after a RST_STREAM frame") some valid frames on closed streams
(RST_STREAM, PRIORITY, WINDOW_UPDATE) were now rejected. It turns out that
the previous condition was in fact intentional to catch only sensitive
frames, which was indeed a mistake since these ones needed to be decoded
to keep HPACK synchronized. But we must absolutely accept WINDOW_UPDATES
or we risk to stall some transfers. And RST/PRIO definitely are valid.

Let's adjust the condition to reflect that and update the comment to
explain the reason for this unobvious condition.

This must be backported to 2.0 and 1.9 after the commit above is brought
there.
2019-09-26 08:47:15 +02:00
Willy Tarreau
cec60056e4 BUG/MINOR: mux-h2: do not wake up blocked streams before the mux is ready
In h2_send() we used to scan pending streams and wake them up when it's
possible to send, without considering the connection's state. Thus caused
some excess failed calls to h2_snd_buf() during the preface on backend
connections :

[01|h2|4|mux_h2.c:3562] h2_wake(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:3475] h2_process(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:3326] h2_send(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:3152] h2_process_mux(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:1508] h2c_bck_send_preface(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:1379] h2c_send_settings(): entering : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:1464] h2c_send_settings(): leaving : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:1543] h2c_bck_send_preface(): leaving : h2c=0x7f1430032ed0(B,PRF)
[01|h2|4|mux_h2.c:3241] h2_process_mux(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|3|mux_h2.c:3384] sent data : h2c=0x7f1430032ed0(B,STG)
  >>> streams woken up here
[01|h2|4|mux_h2.c:3428] h2_send(): waking up pending stream : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3435] h2_send(): leaving with everything sent : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3326] h2_send(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3152] h2_process_mux(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3241] h2_process_mux(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3435] h2_send(): leaving with everything sent : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3552] h2_process(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3564] h2_wake(): leaving
  >>> I/O callback was already scheduled and called despite having nothing left to do
[01|h2|4|mux_h2.c:3454] h2_io_cb(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3326] h2_send(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3152] h2_process_mux(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3241] h2_process_mux(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3435] h2_send(): leaving with everything sent : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:3463] h2_io_cb(): leaving
  >>> stream tries and fails again here!
[01|h2|4|mux_h2.c:5568] h2_snd_buf(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5587] h2_snd_buf(): connection not ready, leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5398] h2_subscribe(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5408] h2_subscribe(): subscribe(send) : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5422] h2_subscribe(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5475] h2_rcv_buf(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5535] h2_rcv_buf(): leaving : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5398] h2_subscribe(): entering : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5400] h2_subscribe(): subscribe(recv) : h2c=0x7f1430032ed0(B,STG)
[01|h2|4|mux_h2.c:5422] h2_subscribe(): leaving : h2c=0x7f1430032ed0(B,STG)

This can happen when sending the preface, the settings, and the settings
ACK. Let's simply condition the wake up on st0 >= FRAME_H as is done at
other places.
2019-09-25 08:34:15 +02:00
Willy Tarreau
73db434f7f MINOR: h2/trace: report the frame type when known
In state match error cases, we don't know what frame type was received
because we don't reach the frame parsers. Let's add the demuxed frame
type and flags in the trace when it's known. For this we make sure to
always reset h2c->dsi when switching back to FRAME_H. Only one location
was missing. The state transitions were not always clear (sometimes
reported before, sometimes after), these were clarified by being
reported only before switching.
2019-09-25 08:34:15 +02:00
Willy Tarreau
2d22144559 MINOR: h2/trace: indicate 'F' or 'B' to locate the side of an h2c in traces
It was difficult in traces showing h2-to-h2 communications to figure the
connection side solely based on the pointer. With this patch we prepend
'F' or 'B' before the state to make this more explicit:

[06|h2|4|mux_h2.c:5487] h2_rcv_buf(): entering : h2c=0x7f6acc026440(F,FRH) h2s=0x7f6acc021720(1,CLO)
[06|h2|4|mux_h2.c:5547] h2_rcv_buf(): leaving : h2c=0x7f6acc026440(F,FRH) h2s=0x7f6acc021720(1,CLO)
[06|h2|4|mux_h2.c:4040] h2_shutw(): entering : h2c=0x7f6acc026440(F,FRH) h2s=0x7f6acc021720(1,CLO)
2019-09-25 07:30:59 +02:00
Christopher Faulet
6884aa3eb0 BUG/MAJOR: mux-h2: Handle HEADERS frames received after a RST_STREAM frame
As stated in the RFC7540#5.1, an endpoint that receives any frame other than
PRIORITY after receiving a RST_STREAM MUST treat that as a stream error of type
STREAM_CLOSED. However, frames carrying compression state must still be
processed before being dropped to keep the HPACK decoder synchronized. This had
to be the purpose of the commit 8d9ac3ed8b ("BUG/MEDIUM: mux-h2: do not abort
HEADERS frame before decoding them"). But, the test on the frame type was
inverted.

This bug is major because desynchronizing the HPACK decoder leads to mixup
indexed headers in messages. From the time an HEADERS frame is received and
ignored for a closed stream, wrong headers may be sent to the following streams.

This patch may fix several bugs reported on github (#116, #290, #292). It must
be backported to 2.0 and 1.9.
2019-09-23 15:28:23 +02:00
Christopher Faulet
21d849f52f BUG/MINOR: mux-h2: Be sure to have a connection to unsubcribe
When the mux is released, It must own the connection to unsubcribe.
This patch must be backported to 2.0.
2019-09-18 11:20:55 +02:00
Christopher Faulet
86d144c74b MINOR: muxes/htx: Ignore pseudo header during message formatting
When an HTX message is formatted to an H1 or H2 message, pseudo-headers (with
header names starting by a colon (':')) are now ignored. In fact, for now, only
H2 messages have such headers, and the H2 mux already skips them when it creates
the HTX message. But in the futur, it may be useful to keep these headers in the
HTX message to help the message analysis or to do some processing during the
HTTP formatting. It would also be a good idea to have scopes for pseudo-headers
(:h1-, :h2-, :fcgi-...) to limit their usage to a specific mux.
2019-09-17 10:18:54 +02:00
Christopher Faulet
3e395632bf CLEANUP: mux-h2: Remove unused flag H2_SF_DATA_CHNK
Since the legacy HTTP mode has been removed, this flag is not necessary
anymore. Removing this flag, a test on the HTX message at the end of the
function h2c_decode_headers() has also been removed fixing the github
issue #244.

No backport needed.
2019-09-13 10:08:28 +02:00
Willy Tarreau
e7bbbca781 BUG/MEDIUM: mux-h2/trace: fix missing braces added with traces
Ilya reported in issue #242 that h2c_handle_priority() was having
unreachable code...  Obviously, I missed the braces around the "if",
leaving an unconditional return.

No backport is needed.
2019-08-30 15:03:58 +02:00
Willy Tarreau
fe1c908744 BUG/MEDIUM: mux-h2/trace: do not dereference h2c->conn after failed idle
In h2_detach(), if session_check_idle_conn() returns <0 we must not
dereference it since it has been freed.

No backport is needed.
2019-08-30 15:00:42 +02:00
Willy Tarreau
70b1e50feb MINOR: mux-h2/trace: report the connection pointer and state before FRAME_H
Initially we didn't report anything before FRAME_H but at least the
connection's pointer and its state are desirable.
2019-08-30 11:58:58 +02:00
Willy Tarreau
8795194f79 CLEANUP: mux-h2/trace: lower-case event names
I wanted to do it before pushing and forgot. It's easier to type lower-
case event names and more consistent with the "none" and "any" keywords.
2019-08-30 07:39:59 +02:00
Willy Tarreau
8fecec2839 CLEANUP: mux-h2/trace: reformat the "received" messages for better alignment
user-level traces are more readable when visually aligned. This is easily
done by writing "rcvd" instead of "received" to align with "sent" :

  $ socat - /tmp/sock1 <<< "show events buf0"
  [00|h2|0|mux_h2.c:2465] rcvd H2 request  : [1] H2 REQ: GET /?s=10k HTTP/2.0
  [00|h2|0|mux_h2.c:4563] sent H2 response : [1] H2 RES: HTTP/1.1 200
2019-08-30 07:39:59 +02:00
Willy Tarreau
c067a3ac8f MINOR: mux-h2/trace: report h2s->id before h2c->dsi for the stream ID
h2c->dsi is only for demuxing, and needed while decoding a new request.
But if we already have a valid stream ID (e.g. response or outgoing
request), we should use it instead. This avoids seeing [0] in front of
the responses at user level.
2019-08-30 07:39:59 +02:00
Willy Tarreau
17104d46be MINOR: mux-h2/trace: always report the h2c/h2s state and flags
There's no limitation to just "state" trace level anymore, we're
expected to always show these internal states at verbosity levels
above "clean".
2019-08-30 07:39:59 +02:00
Willy Tarreau
94f1dcf119 MINOR: mux-h2/trace: only decode the start-line at verbosity other than "minimal"
This is as documented in "trace h2 verbosity", level "minimal" only
features flags and doesn't perform any decoding at all, "simple" does,
just like "clean" which is the default for end uesrs.
2019-08-30 07:39:59 +02:00
Willy Tarreau
f7dd5191cd MINOR: mux-h2/trace: add a new verbosity level "clean"
The "clean" output will be suitable for user and proto-level output
where the internal stuff (state, pointers, etc) is not desired but
just the basic protocol elements.
2019-08-30 07:38:42 +02:00
Willy Tarreau
ab2ec45403 MINOR: mux-h2: add functions to convert an h2c/h2s state to a string
We need this all the time in traces, let's have it now. For the sake
of compact outputs, the strings are all 3-chars long. The "show fd"
output was improved to make use of this.
2019-08-30 07:10:46 +02:00
Willy Tarreau
7838a79bac MEDIUM: mux-h2/trace: add lots of traces all over the code
All functions of the h2 data path were updated to receive one or multiple
TRACE() calls, at least one pair of TRACE_ENTER()/TRACE_LEAVE(), and those
manipulating protocol elements have been improved to report frame types,
special state transitions or detected errors. Even with careful tests, no
performance impact was measured when traces are disabled.

They are not completely exploited yet, the callback function tries to
dump a lot about them, but still doesn't provide buffer dumps, nor does
it indicate the stream or connection error codes.

The first argument is always set to the connection when known. The second
argument is set to the h2s when known, sometimes a 3rd argument is set to
a buffer, generally the rxbuf or htx, and occasionally the 4th argument
points to an integer (number of bytes read/sent, error code).

Retrieving a 10kB object produces roughly 240 lines when at developer
level, 35 lines at data level, 27 at state level, and 10 at proto level
and 2 at user level.

For now the headers are not dumped, but the start line are emitted in
each direction at user level.

The patch is marked medium because it touches lots of places, though
it takes care not to change the execution path.
2019-08-29 18:22:12 +02:00
Willy Tarreau
db3cfff200 MINOR: mux-h2/trace: add the default decoding callback
The new function h2_trace() is called when relevant by the trace subsystem
in order to provide extra information about the trace being produced. It
can for example display the connection pointer, the stream pointer, etc.
It is declared in the trace source as the default callback as we expect
it to be versatile enough to enrich most traces.

In addition, for requests and responses, if we have a buffer and we can
decode it as an HTX buffer, we can extract user-friendly information from
the start line.
2019-08-29 18:19:11 +02:00
Willy Tarreau
12ae212837 MINOR: mux-h2/trace: register a new trace source with its events
For now the traces are not used. Supported events are categorized by
where the activity comes from (h2c, h2s, stream, etc), a direction
(send/recv/wake), and a list of possibilities for each of them (frame
types, errors, shut, ...). This results in ~50 different events that
try to cover a lot of possibilities when it's needed to filter on
something specific. Special events like protocol error are handled.
A few aggregate events like "rx_frame" or "tx_frame" are planed to
cover all frame types at once by being placed all the time with any
of the other ones.

We also state that the first argument is always the connection. This way
the trace subsystem will be able to safely retrieve some useful info, and
we'll still be able to get the h2c from there (conn->ctx) in a pretty
print function. The second argument will always be an h2s, and in order
to propose it for tracking, we add its description. We also define 4
verbosity levels, which seems more than enough.
2019-08-29 17:14:35 +02:00
Willy Tarreau
6386481cbb CLEANUP: mux-h2: move the demuxed frame check code in its own function
The frame check code in the demuxer was moved to its own function to
keep the demux function clean enough. This also simplifies the test
case as we can now simply call this function once in H2_CS_FRAME_P
state.
2019-08-07 14:25:20 +02:00
Willy Tarreau
30d05f3557 BUG/MINOR: mux-h2: always reset rcvd_s when switching to a new frame
In Patrick's trace it was visible that after a stream had been missed,
the next stream would receive a WINDOW_UPDATE with the first one's
credit added to its own. This makes sense because in case of error
h2c->rcvd_s is not reset. Given that this counter is per frame, better
reset it when starting to parse a new frame, it's easier and safer.

This must be backported as far as 1.8.
2019-08-06 15:49:51 +02:00
Willy Tarreau
e74679a9c6 BUG/MINOR: mux-h2: always send stream window update before connection's
In h2_process_mux() if we have some room and an attempt to send a window
update for the connection was pending, it's done first. But it's not done
for the stream, which will have for effect of postponing this attempt
till next pass into h2_process_demux(), at the risk of seeing the send
buffer full again. Let's always try to send both pending frames as soon
as possible.

This should be backported as far as 1.8.
2019-08-06 15:39:32 +02:00
Willy Tarreau
9fd5aa8ada BUG/MEDIUM: mux-h2: do not recheck a frame type after a state transition
Patrick Hemmer reported a rare case where the H2 mux emits spurious
RST_STREAM(STREAM_CLOSED) that are triggered by the send path and do
not even appear to be associated with a previous incoming frame, while
the send path never emits such a thing.

The problem is particularly complex (hence its rarity). What happens is
that when data are uploaded (POST) we must refill the sending stream's
window by sending a WINDOW_UPDATE message (and we must refill the
connection's too). But in a highly bidirectional traffic, it is possible
that the mux's buffer will be full and that there is no more room to build
this WINDOW_UPDATE frame. In this case the demux parser switches to the
H2_CS_FRAME_A state, noting that an "acknowledgement" is needed for the
current frame, and it doesn't change the current stream nor frame type.
But the stream's state was possibly updated (typically OPEN->HREM when
a DATA frame carried the ES flag).

Later the data can leave the buffer, wake up h2_io_cb(), which calls
h2_send() to send pending data, itself calling h2_process_mux() which
detects that there are unacked data in the connection's window so it
emits a WINDOW_UPDATE for the connection and resets the counter. so it
emits a WINDOW_UPDATE for the connection and resets the counter. Then
h2_process() calls h2_process_demux() which continues the processing
based on the current frame type and the current state H2_CS_FRAME_A.
Unfortunately the protocol compliance checks matching the frame type
against the current state are still present. These tests are designed
for new frames only, not for those in progress, but they are not
limited by frame types. Thus the current DATA frame is checked again
against the current stream state that is now HREM, and fails the test
with a STREAM_CLOSED error.

The quick and backportable solution consists in adding the test for
this ACK and bypass all these checks that were already validated prior
to the state transition. A better long-term solution would consist in
having a new state between H and P indicating the frame is new and
needs to be checked ("N" for new?) and apply the protocol tests only
in this state. In addition everywhere we decide to send a window
update, we should send a stream WU first if there are unacked data
for the current stream. Last, rcvd_s should always be reset when
transitioning to FRAME_H (and a BUGON for this in dev would help).

The bug will be way harder to trigger on 2.0 than on 1.8/1.9 because
we have a ring buffer for the connection so the buffer full situations
are extremely rare.

This fix must be backpored to all versions having H2 (as far as 1.8).

Special thanks to Patrick for providing exploitable traces.
2019-08-06 15:35:20 +02:00
Willy Tarreau
cfba9d6eaa BUG/MINOR: mux-h2: do not send REFUSED_STREAM on aborted uploads
If the server decides to close early, we don't want to send a
REFUSED_STREAM error but a CANCEL, so that the client doesn't want
to retry. The test in h2_do_shutw() was wrong for this as it would
handle the HLOC case like the case where nothing had been sent for
this stream, which is wrong. Now h2_do_shutw() does nothing in this
case and lets h2_do_shutr() decide.

Note that this partially undoes f983d00a1 ("BUG/MINOR: mux-h2: make
the do_shut{r,w} functions more robust against retries").

This must be backported to 2.0. The patch above was not backported to
1.9 for being too risky there, but if it eventually gets to it, this
one will be needed as well.
2019-08-06 10:32:02 +02:00
Willy Tarreau
082c45769b BUG/MINOR: mux-h2: use CANCEL, not STREAM_CLOSED in h2c_frt_handle_data()
There is a test on the existence of the conn_stream when receiving data,
to be sure to have somewhere to deliver it. Right now it responds with
STREAM_CLOSED, which is not correct since from an H2 point of view the
stream is not closed and a peer could be upset to see this. After some
analysis, it is important to keep this test to be sure not to fill the
rxbuf then stall the connection. Another option could be to modiffy
h2_frt_transfer_data() to silently discard any contents but the CANCEL
error code is designed exactly for this and to save the peer from
continuing to stream data that will be discarded, so better switch to
using this.

This must be backported as far as 1.8.
2019-08-06 10:15:49 +02:00
Willy Tarreau
231f616170 BUG/MINOR: mux-h2: don't refrain from sending an RST_STREAM after another one
The test in h2s_send_rst_stream() is excessive, it refrains from sending an
RST_STREAM if *the last frame* was an RST_STREAM, regardless of the stream
ID. In a context where both clients and servers abort a lot, it could happen
that one RST_STREAM is dropped from responses from time to time, causing
delays to the client.

This must be backported to 2.0, 1.9 and 1.8.
2019-08-06 10:04:55 +02:00
Willy Tarreau
1d4a0f8810 BUG/MEDIUM: mux-h2: split the stream's and connection's window sizes
The SETTINGS frame parser updates all streams' window for each
INITIAL_WINDOW_SIZE setting received on the connection (like h2spec
does in test 6.5.3), which can start to be expensive if repeated when
there are many streams (up to 100 by default). A quick test shows that
it's possible to parse only 35000 settings per second on a 3 GHz core
for 100 streams, which is rather small.

Given that window sizes are relative and may be negative, there's no
point in pre-initializing them for each stream and update them from
the settings. Instead, let's make them relative to the connection's
initial window size so that any change immediately affects all streams.
The only thing that remains needed is to wake up the streams that were
unblocked by the update, which is now done once at the end of
h2_process_demux() instead of once per setting. This now results in
5.7 million settings being processed per second, which is way better.

In order to keep the change small, the h2s' mws field was renamed to
"sws" for "stream window size", and an h2s_mws() function was added
to add it to the connection's initial window setting and determine the
window size to use when muxing. The h2c_update_all_ws() function was
renamed to h2c_unblock_sfctl() since it's now only used to unblock
previously blocked streams.

This needs to be backported to all versions till 1.8.
2019-08-02 13:43:33 +02:00
Willy Tarreau
9bc1c95855 BUG/MEDIUM: mux-h2: unbreak receipt of large DATA frames
Recent optimization in commit 4d7a88482 ("MEDIUM: mux-h2: don't try to
read more than needed") broke the receipt of large DATA frames because
it would unconditionally subscribe if there was some room left, thus
preventing any new rx from being done since subscription may only be
done once the end was reached, as indicated by ret == 0.

However, fixing this uncovered that in HTX mode previous versions might
occasionally be affected as well, when an available frame is the same
size as the maximum data that may fit into an HTX buffer, we may end
up reading that whole frame and still subscribe since it's still allowed
to receive, thus causing issues to read the next frame.

This patch will only work for 2.1-dev but a minor adaptation will be
needed for earlier versions (down to 1.9, where subscribe() was added).
2019-08-02 13:37:55 +02:00
Willy Tarreau
4d7a884827 MEDIUM: mux-h2: don't try to read more than needed
The h2_recv() loop was historically built around a loop to deal with
the callback model but this is not needed anymore, as it the upper
layer wants more data, it will simply try to read again. Right now
50% of the recvfrom() calls made over H2 return EAGAIN. With this
change it doesn't happen anymore. Note that the code simply consists
in breaking the loop, and reporting real data receipt instead of
always returning 1.

A test was made not to subscribe if we actually read data but it
doesn't change anything since we might be subscribed very early
already.
2019-07-31 16:18:25 +02:00
Christopher Faulet
4da05478e3 CLEANUP: mux-h2: Remove unused flags H2_SF_CHNK_*
Since the legacy HTTP code was removed, these flags are unused anymore.
2019-07-19 09:46:23 +02:00
Christopher Faulet
c985f6c5d8 MINOR: connection: Remove the multiplexer protocol PROTO_MODE_HTX
Since the legacy HTTP mode is disabled and no multiplexer relies on it anymore,
there is no reason to have 2 multiplexer protocols for the HTTP. So the protocol
PROTO_MODE_HTX was removed and all HTTP multiplexers use now PROTO_MODE_HTTP.
2019-07-19 09:18:27 +02:00
Christopher Faulet
9b79a1025d MEDIUM: mux-h2: Remove support of the legacy HTTP mode
Now the H2 multiplexer only works in HTX. Code relying on the legacy HTTP mode
was removed.
2019-07-19 09:18:27 +02:00
Christopher Faulet
192c6a23d4 MINOR: htx: Deduce the number of used blocks from tail and head values
<head> and <tail> fields are now signed 32-bits integers. For an empty HTX
message, these fields are set to -1. So the field <used> is now useless and can
safely be removed. To know if an HTX message is empty or not, we just compare
<head> against -1 (it also works with <tail>). The function htx_nbblks() has
been added to get the number of used blocks.
2019-07-19 09:18:27 +02:00
Christopher Faulet
6d36e1c282 MINOR: mux-h2: Don't adjust anymore the amount of data sent in h2_snd_buf()
Because the infinite forward is HTX aware, it is useless to tinker with the
number of bytes really sent. This was fixed long ago for the H1 and forgotten to
do so for the H2.
2019-07-19 09:18:27 +02:00
Christopher Faulet
4f09ec812a BUG/MEDIUM: mux-h2: Remove the padding length when a DATA frame size is checked
When a DATA frame is processed for a message with a content-length, we first
take care to not have a frame size that exceeds the remaining to
read. Otherwise, an error is triggered. But we must remove the padding length
from the frame size because the padding is not included in the announced
content-length.

This patch must be backported to 2.0 and 1.9.
2019-06-19 10:06:31 +02:00
Christopher Faulet
dd2a5620d5 BUG/MEDIUM: mux-h2: Reset padlen when several frames are demux
In the function h2_process_demux(), if several frames are parsed, the padding
length must be reset between each frame. Otherwise we may wrongly think a frame
has a padding block because the previous one was padded.

This patch must be backported to 2.0 and 1.9.
2019-06-19 10:06:31 +02:00
Willy Tarreau
b6563f4ac4 BUG/MEDIUM: mux-h2: properly account for the appended data in HTX
When commit 0350b90e3 ("MEDIUM: htx: make htx_add_data() never defragment
the buffer") was introduced, it made htx_add_data() actually be able to
add less data than it was asked for, and the callers must use the returned
value to know how much was added. The H2 code used to rely on the frame
length instead of the return value. A version of the code doing this was
written but is obviously not the one that got merged, resulting in breaking
large uploads or downloads when HTX would have instead defragmented the
buffer because the HTX side sees less contents than what the H2 side sees.

This patch fixes this again. No backport is needed.
2019-06-15 11:42:01 +02:00
Willy Tarreau
76c83826db BUG/MEDIUM: mux-h2: fix early close with option abortonclose
Olivier found that commit 99ad1b3e8 ("MINOR: mux-h2: stop relying on
CS_FL_REOS") managed to break abortonclose again with H2. What happens
is that while the CS_FL_REOS flag was set on some transitions to the
HREM state, it's not set on all and is in fact only set when the low
level connection is closed. So making the replacement condition match
the HREM and ERROR states is not correct and causes completely correct
requests to send advertise an early close of the connection layer while
only the stream's input is closed.

In order to avoid this, we now properly split the checks for the CLOSED
state and for the closed connection. This way there is no risk to set
the EOS flag too early on the connection.

No backport is needed.
2019-06-15 10:04:09 +02:00
Willy Tarreau
86eded6c69 CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
The function really only operates on tasklets, its arguments are always
tasklets cast as tasks to match the function's type, to be cast back to
a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(),
take a struct tasklet, and get rid of the undesired task casts.
2019-06-14 14:57:03 +02:00
Willy Tarreau
3c39a7d889 CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
2019-06-14 14:42:29 +02:00
Christopher Faulet
3b44c54129 MINOR: mux-h2: Forward clients scheme to servers checking start-line flags
By default, the scheme "https" is always used. But when an explicit scheme was
defined and when this scheme is "http", we use it in the request sent to the
server. This is done by checking flags of the start-line. If the flag
HTX_SL_F_HAS_SCHM is set, it means an explicit scheme was defined on the client
side. And if the flag HTX_SL_F_SCHM_HTTP is set, it means the scheme "http" was
used.
2019-06-14 11:13:32 +02:00
Christopher Faulet
e4ab11bb88 BUG/MINOR: http: Use the global value to limit the number of parsed headers
Instead of using the macro MAX_HTTP_HDR to limit the number of headers parsed
before throwing an error, we now use the custom global variable
global.tune.max_http_hdr.

This patch must be backported to 1.9.
2019-06-14 11:13:32 +02:00
Willy Tarreau
7348119fb2 BUG/MEDIUM: mux-h2: make sure the connection timeout is always set
There seems to be a tricky case in the H2 mux related to stream flow
control versus buffer a full situation : is a large response cannot
be entirely sent to the client due to the stream window being too
small, the stream is paused with the SFCTL flag. Then the upper
layer stream might get bored and expire this stream. It will then
shut it down first. But the shutdown operation might fail if the
mux buffer is full, resulting in the h2s being subscribed to the
deferred_shut event with the stream *not* added to the send_list
since it's blocked in SFCTL. In the mean time the upper layer completely
closes, calling h2_detach(). There we have a send_wait (the pending
shutw), the stream is marked with SFCTL so we orphan it.

Then if the client finally reads all the data that were clogging the
buffer, the send_list is run again, but our stream is not there. From
this point, the connection's stream list is not empty, the mux buffer
is empty, so the connection's timeout is not set. If the client
disappears without updating the stream's window, nothing will expire
the connection.

This patch makes sure we always keep the connection timeout updated.
There might be finer solutions, such as checking that there are still
living streams in the connection (i.e. streams not blocked in SFCTL
state), though this is not necessarily trivial nor useful, since the
client timeout is the same for the upper level stream and the connection
anyway.

This patch needs to be backported to 1.9 and 1.8 after some observation.
2019-06-07 08:47:44 +02:00
Christopher Faulet
54b5e214b0 MINOR: htx: Don't use end-of-data blocks anymore
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
2019-06-05 10:12:11 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Willy Tarreau
201840abf1 BUG/MEDIUM: mux-h2: don't refrain from offering oneself a used buffer
Usually when calling offer_buffer(), we don't expect to offer it to
ourselves. But with h2 we have the same buffer_wait for the two directions
so we can unblock the recv path when completing a send(), or we can unblock
part of the mux buffer after sending the first few buffers that we managed
to collect. Thus it is important to always accept to wake up any requester.

A few parts of this patch could possibly be backported but earlier versions
already have other issues related to low-buffer condition so it's not sure
it's worth taking the risk to make things worse.
2019-05-29 17:54:35 +02:00
Willy Tarreau
7f1265a238 BUG/MEDIUM: mux-h2: fix the conditions to end the h2_send() loop
The test for the mux alloc failure in h2_send() right after an attempt
at h2_process_mux() used to make sense as it tried to detect that this
latter failed to produce data. But now that we have a list of buffers,
it is a perfectly valid situation where there can still be data in the
buffer(s).

So now when we see this flag we only declare it's the last run on the
loop. In addition we need to make sure we break out of the loop on
snd_buf failure, or we'll loop indefinitely, for example when the buf
is full and we can't send.

No backport is needed.
2019-05-29 17:54:35 +02:00
Olivier Houchard
58d87f31f7 BUG/MEDIUM: h2: Don't forget to set h2s->cs to NULL after having free'd cs.
In h2c_frt_stream_new, if we failed to create the stream for some reason,
don't forget to set h2s->cs to NULL before calling h2s_destroy(), otherwise
h2s_destroy() will call h2s_close(), which will attempt to access
h2s->cs->flags if it's non-NULL.

This should be backported to 1.9.
2019-05-29 16:45:13 +02:00
Willy Tarreau
186e96ece0 MEDIUM: buffers: relax the buffer lock a little bit
In lock profiles it's visible that there is a huge contention on the
buffer lock. The reason is that when offer_buffers() is called, it
systematically takes the lock before verifying if there is any
waiter. However doing so doesn't protect against races since a
waiter can happen just after we release the lock as well. Similarly
in h2 we take the lock every time an h2c is going to be released,
even without checking that the h2c belongs to a wait list. These
two have now been addressed by verifying non-emptiness of the list
prior to taking the lock.
2019-05-28 17:25:21 +02:00
Willy Tarreau
0a7ef02074 MINOR: htx: make htx_add_data() return the transmitted byte count
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
2019-05-28 14:48:59 +02:00
Christopher Faulet
b75b5eaf26 MEDIUM: htx: 1xx messages are now part of the final reponses
1xx informational messages (all except 101) are now part of the HTTP reponse,
semantically speaking. These messages are not followed by an EOM anymore,
because a final reponse is always expected. All these parts can also be
transferred to the channel in same time, if possible. The HTX response analyzer
has been update to forward them in loop, as the legacy one.
2019-05-28 07:42:30 +02:00
Christopher Faulet
a61e97bcae MINOR: htx: Be sure to xfer all headers in one time in htx_xfer_blks()
In the function htx_xfer_blks(), we take care to transfer all headers in one
time. When the current block is a start-line, we check if there is enough space
to transfer all headers too. If not, and if the destination is empty, a parsing
error is reported on the source.

The H2 multiplexer is the only one to use this function. When a parsing error is
reported during the transfer, the flag CS_FL_EOI is also set on the conn_stream.
2019-05-28 07:42:12 +02:00
Christopher Faulet
2f6edc84a8 MINOR: mux-h2/htx: Support zero-copy when possible in h2_rcv_buf()
If the channel's buffer is empty and the message is small enough, we can swap
the H2S buffer with the channel one.
2019-05-28 07:42:12 +02:00
Christopher Faulet
8a9ad4c0e8 MINOR: mux-h2: Use the count value received from the SI in h2_rcv_buf()
Now, the SI calls h2_rcv_buf() with the right count value. So we can rely on
it. Unlike the H1 multiplexer, it is fairly easier for the H2 multiplexer
because the HTX message already exists, we only transfer blocks from the H2S to
the channel. And this part is handled by htx_xfer_blks().
2019-05-28 07:42:12 +02:00
Christopher Faulet
156852b613 BUG/MINOR: htx: Change htx_xfer_blk() to also count metadata
This patch makes the function more accurate. Thanks to the function
htx_get_max_blksz(), the transfer of data has been simplified. Note that now the
total number of bytes copied (metadata + payload) is returned. This slighly
change how the function is used in the H2 multiplexer.
2019-05-28 07:42:12 +02:00
Christopher Faulet
b77a1d26a4 MINOR: mux-h2/htx: Get the start-line from the head when HEADERS frame is built
in the H2 multiplexer, when a HEADERS frame is built before sending it, we have
the warranty the start-line is the head of the HTX message. It is safer to rely
on this fact than on the sl_pos value. For now, it's safe to use sl_pos in muxes
because HTTP 1xx messages are considered as full messages in HTX and only one
HTTP message can be stored at a time in HTX. But we are trying to handle 1xx
messages as a part of the reponse message. In this way, an HTTP reponse will be
the sum of all 1xx informational messages followed by the final response. So it
will be possible to have several start-line in the same HTX message. And the
sl_pos will point to the first unprocessed start-line from the analyzers point
of view.
2019-05-28 07:42:12 +02:00
Willy Tarreau
9c218e7521 MAJOR: mux-h2: switch to next mux buffer on buffer full condition.
Now when we fail to send because the mux buffer is full, before giving
up and marking MFULL, we try to allocate another buffer in the mux's
ring to try again. Thanks to this (and provided there are enough buffers
allocated to the mux's ring), a single stream picked in the send_list
cannot steal all the mux's room at once. For this, we expand the ring
size to 31 buffers as it seems to be optimal on benchmarks since it
divides the number of context switches by 3. It will inflate each H2
conn's memory by 1 kB.

The bandwidth is now much more stable. Prior to this, it a test on
h2->h1 with very large objects (1 GB), a few tens of connections and
a few tens of streams per connection would show a varying performance
between 34 and 95 Gbps on 2 cores/4 threads, with h2_snd_buf() stopped
on a buffer full condition between 300000 and 600000 times per second.
Now the performance is constantly between 88 and 96 Gbps. Measures show
that buffer full conditions are met around only 159 times per second
in this case, or rougly 2000 to 4000 times less often.
2019-05-26 11:33:19 +02:00
Willy Tarreau
60f62682b1 MINOR: mux-h2: report the mbuf's head and tail in "show fd"
It's useful to know how the mbuf spans over the whole area and to have
access to the first and last ones, so let's dump just this.
2019-05-26 11:33:18 +02:00
Willy Tarreau
bcc4595e57 CLEANUP: mux-h2: consistently use a local variable for the mbuf
This makes the code more readable and reduces the calls to br_tail().
In addition, all calls to h2_get_buf() are now made via this local
variable, which should significantly help for retries.
2019-05-26 10:52:47 +02:00
Willy Tarreau
41c4d6a2c5 MEDIUM: mux-h2: make the send() function iterate over all mux buffers
Now send() uses a loop to iterate over all buffers to be sent. These
buffers are released and deleted from the vector once completely sent.
If any buffer gets released, offer_buffers() is called to wake up some
waiters.
2019-05-26 10:52:25 +02:00
Willy Tarreau
2e3c000c1c MINOR: mux-h2: introduce h2_release_mbuf() to release all buffers in the mbuf ring
This function iterates over all buffers in the mbuf ring to release all
of them from the head to the tail.
2019-05-26 10:51:25 +02:00
Willy Tarreau
662fafc02b MEDIUM: mux-h2: make the conditions to send based on mbuf, not just its tail
This is in preparation for iterating over lists. First we need to always
check the buffer's head and not its tail.
2019-05-26 10:50:50 +02:00
Willy Tarreau
5133096df2 MEDIUM: mux-h2: replace all occurrences of mbuf with a buffer ring
For now it's only one buffer long so the head and tails are always the
same, thus it doesn't change what used to work. In short, br_tail(h2c->mbuf)
was inserted everywhere we used to have h2c->mbuf.
2019-05-26 10:50:18 +02:00
Willy Tarreau
455d5681b6 MEDIUM: mux-h2: avoid doing expensive buffer realigns when not absolutely needed
Transferring large objects over H2 sometimes shows unexplained performance
variations. A long analysis resulted in the following discovery. Often the
mux buffer looks like this :

    [ empty_head |    data     | empty_tail ]

Typical numbers are (very common) :
  - empty_head = 31
  - empty_tail = 16  (total free=47)
  - data = 16337
  - size = 16384
  - data to copy: 43

The reason for these holes are the blocking factors that are not always
the same in and out (due to keeping 9 bytes for the frame size, or the
56 bytes corresponding to the HTX header). This can easily happen 10000
times a second if the network bandwidth permits it!

In this case, while copying a DATA frame we find that the buffer has its
free space wrapped so we decide to realign it to optimize the copy. It's
possible that this practice stems from the code used to emit headers,
which do not support fragmentation and which had no other option left.
But it comes with two problems :
  - we don't check if the data fits, which results in a memcpy for nothing
  - we can move huge amounts of data to just copy a small block.

This patch addresses this two ways :
  - first, by not forcing a data realignment if what we have to copy does
    not fit, as this is totally pointless ;

  - second, by refusing to move too large data blocks. The threshold was
    set to 1 kB, because it may make sense to move 1 kB of data to copy
    a 15 kB one at once, which will leave as a single 16 kB block, but
    it doesn't make sense to mvoe 15 kB to copy just 1 kB. In all cases
    the data would fit and would just be split into two blocks, which is
    not very expensive, hence the low limit to 1 kB

With such changes, realignments are very rare, they show up around once
every 15 seconds at 60 Gbps, and look like this, resulting in a much more
stable bit rate :

  buf=0x7fe6ec0c3510,h=16333,d=35,s=16384 room=16349 in=16337

This patch should be safe for backporting to 1.9 if some performance
issues are reported there.
2019-05-25 20:31:53 +02:00
Christopher Faulet
316934d3c9 BUG/MINOR: mux-h2: Count EOM in bytes sent when a HEADERS frame is formatted
In HTX, when a HEADERS frame is formatted before sending it to the client or the
server, If an EOM is found because there is no body, we must count it in the
number bytes sent.

This patch must be backported to 1.9.
2019-05-24 09:10:46 +02:00
Olivier Houchard
f8338151a3 MINOR: h2: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Christopher Faulet
fa922f03a3 BUG/MEDIUM: mux-h2: Set EOI on the conn_stream during h2_rcv_buf()
Just like CS_FL_REOS previously, the CS_FL_EOI flag is abused as a proxy
for H2_SF_ES_RCVD. The problem is that this flag is consumed by the
application layer and is set immediately when an end of stream was met,
which is too early since the application must retrieve the rxbuf's
contents first. The effect is that some transfers are truncated (mostly
the first one of a connection in most tests).

The problem of mixing CS flags and H2S flags in the H2 mux is not new
(and is currently being addressed) but this specific one was emphasized
in commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") which was backported to 1.9. Note that other
flags, particularly CS_FL_REOS still need to be asynchronously reported,
though their impact seems more limited for now.

This patch makes sure that all internal uses of CS_FL_EOI are replaced
with a test on H2_SF_ES_RCVD (as there is a 1-to-1 equivalence) and that
CS_FL_EOI is only reported once the rxbuf is empty.

This should ideally be backported to 1.9 unless it causes too much
trouble due to the recent changes in this area, as 1.9 *seems* not
to be directly affected by this bug.
2019-05-14 15:47:57 +02:00
Willy Tarreau
99ad1b3e8c MINOR: mux-h2: stop relying on CS_FL_REOS
This flag was introduced early in 1.9 development (a3f7efe00) to report
the fact that the rxbuf that was present on the conn_stream was followed
by a shutr. Since then the rxbuf moved from the conn_stream to the h2s
(638b799b0) but the flag remained on the conn_stream. It is problematic
because some state transitions inside the mux depend on it, thus depend
on the CS, and as such have to test for its existence before proceeding.

This patch replaces the test on CS_FL_REOS with a test on the only
states that set this flag (H2_SS_CLOSED, H2_SS_HREM, H2_SS_ERROR).
The few places where the flag was set were removed (the flag is not
used by the data layer).
2019-05-14 15:47:57 +02:00
Willy Tarreau
4c688eb8d1 MINOR: mux-h2: add macros to check multiple stream states at once
At many places we need to test for several stream states at once, let's
have macros to make a bit mask from a state to ease this.
2019-05-14 15:47:57 +02:00
Willy Tarreau
f8fe3d63f0 CLEANUP: mux-h2: don't test for impossible CS_FL_REOS conditions
This flag is currently set when an incoming close was received, which
results in the stream being in either H2_SS_HREM, H2_SS_CLOSED, or
H2_SS_ERROR states, so let's remove the test for the OPEN and HLOC
cases.
2019-05-14 15:47:57 +02:00
Willy Tarreau
3cf69fe6b2 BUG/MINOR: mux-h2: make sure to honor KILL_CONN in do_shut{r,w}
If the stream closes and quits while there's no room in the mux buffer
to send an RST frame, next time it is attempted it will not lead to
the connection being closed because the conn_stream will have been
released and the KILL_CONN flag with it as well.

This patch reserves a new H2_SF_KILL_CONN flag that is copied from
the CS when calling shut{r,w} so that the stream remains autonomous
on this even when the conn_stream leaves.

This should ideally be backported to 1.9 though it depends on several
previous patches that may or may not be suitable for backporting. The
severity is very low so there's no need to insist in case of trouble.
2019-05-14 15:47:57 +02:00
Willy Tarreau
aebbe5ef72 MINOR: mux-h2: make h2s_wake_one_stream() not depend on temporary CS flags
In h2s_wake_one_stream() we used to rely on the temporary flags used to
adjust the CS to determine the new h2s state. This really is not convenient
and creates far too many dependencies. This commit just moves the same
condition to the places where the temporary flags were set so that we
don't have to rely on these anymore. Whether these are relevant or not
was not the subject of the operation, what matters was to make sure the
conditions to adjust the stream's state and the CS's flags remain the
same. Later it could be studied if these conditions are correct or not.
2019-05-14 15:47:57 +02:00
Willy Tarreau
13b6c2e8b3 MINOR: mux-h2: make h2s_wake_one_stream() the only function to deal with CS
h2s_wake_one_stream() has access to all the required elements to update
the connstream's flags and figure the necessary state transitions, so
let's move the conditions there from h2_wake_some_streams().
2019-05-14 15:47:57 +02:00
Willy Tarreau
234829111f MINOR: mux-h2: make h2_wake_some_streams() not depend on the CS flags
It's problematic to have to pass some CS flags to this function because
that forces some h2s state transistions to update them just in time
while some of them are supposed to only be updated during I/O operations.

As a first step this patch transfers the decision to pass CS_FL_ERR_PENDING
from the caller to the leaf function h2s_wake_one_stream(). It is easy
since this is the only flag passed there and it depends on the position of
the stream relative to the last_sid if it was set.
2019-05-14 15:47:57 +02:00
Willy Tarreau
c3b1183f57 MINOR: mux-h2: remove useless test on stream ID vs last in wake function
h2_wake_some_streams() first looks up streams whose IDs are greater than
or equal to last+1, then checks if the id is lower than or equal to last,
which by definition will never match. Let's remove this confusing leftover
from ancient code.
2019-05-14 15:47:57 +02:00
Willy Tarreau
f983d00a1c BUG/MINOR: mux-h2: make the do_shut{r,w} functions more robust against retries
These functions may fail to emit an RST or an empty DATA frame because
the mux is full or busy. Then they subscribe the h2s and try again.
However when doing so, they will already have marked the error state on
the stream and will not pass anymore through the sequence resulting in
the failed frame to be attempted to be sent again nor to the close to
be done, instead they will return a success.

It is important to only leave when the stream is already closed, but
to go through the whole sequence otherwise.

This patch should ideally be backported to 1.9 though it's possible that
the lack of the WANT_SHUT* flags makes this difficult or dangerous. The
severity is low enough to avoid this in case of trouble.
2019-05-14 11:13:06 +02:00
Willy Tarreau
8bdb5c9bb4 CLEANUP: connection: remove the handle field from the wait_event struct
It was only set and not consumed after the previous change. The reason
is that the task's context always contains the relevant information,
so there is no need for a second pointer.
2019-05-13 19:14:52 +02:00
Willy Tarreau
88bdba31fa CLEANUP: mux-h2: simply use h2s->flags instead of ret in h2_deferred_shut()
This one used to rely on the combined return statuses of the shutr/w
functions but now that we have the H2_SF_WANT_SHUT{R,W} flags we don't
need this anymore if we properly remove these flags after their operations
succeed. This is what this patch does.
2019-05-13 19:14:52 +02:00
Willy Tarreau
2c249ebc75 MINOR: mux-h2: add two H2S flags to report the need for shutr/shutw
Currently when a shutr/shutw fails due to lack of buffer space, we abuse
the wait_event's handle pointer to place up to two bits there in addition
to the original pointer. This pointer is not used for anything but this
and overall the intent becomes clearer with h2s flags than with these
two alien bits in the pointer, so let's use clean flags now.
2019-05-13 19:14:52 +02:00
Willy Tarreau
c234ae38f8 CLEANUP: mux-h2: use LIST_ADDED() instead of LIST_ISEMPTY() where relevant
Lots of places were using LIST_ISEMPTY() to detect if a stream belongs
to one of the send lists or to detect if a connection was already
waiting for a buffer or attached to an idle list. Since these ones are
not list heads but list elements, let's use LIST_ADDED() instead.
2019-05-13 19:14:52 +02:00
Willy Tarreau
4087346dab BUG/MAJOR: mux-h2: do not add a stream twice to the send list
In this long thread, Maciej Zdeb reported that the H2 mux was still
going through endless loops from time to time :

  https://www.mail-archive.com/haproxy@formilux.org/msg33709.html

What happens is the following :
- in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
  stream from the send_list
- then in h2_shutr() and h2_shutw(), we check if the list is empty before
  subscribing the element, which is true after the case above
- then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
  in the send_list, thus LIST_ADDQ() adds it a second time.

This patch adds a check of list emptiness before performing the LIST_ADDQ()
when the flow control window opens. Maciej reported that it reliably fixed
the problem for him.

As later discussed with Olivier, this fixes the consequence of the issue
rather than its cause. The root cause is that a stream should never be in
the send_list with a blocking flag set and the various places that can lead
to this situation must be revisited. Thus another fix is expected soon for
this issue, which will require some observation. In the mean time this one
is easy enough to validate and to backport.

Many thanks to Maciej for testing several versions of the patch, each
time providing detailed traces which allowed to nail the problem down.

This patch must be backported to 1.9.
2019-05-13 08:15:10 +02:00
Olivier Houchard
bfe2a83c24 BUG/MEDIUM: h2: Don't check send_wait to know if we're in the send_list.
When we have to stop sending due to the stream flow control, don't check
if send_wait is NULL to know if we're in the send_list, because at this
point it'll always be NULL, while we're probably in the list.
Use LIST_ISEMPTY(&h2s->list) instead.
Failing to do so mean we might be added in the send_list when flow control
allows us to emit again, while we're already in it.
While I'm here, replace LIST_DEL + LIST_INIT by LIST_DEL_INIT.

This should be backported to 1.9.
2019-05-10 15:06:54 +02:00
Olivier Houchard
d9986ed51e BUG/MEDIUM: h2: Make sure we set send_list to NULL in h2_detach().
In h2_detach(), if we still have a send_wait pointer, because we woke the
tasklet up, but it hasn't ran yet, explicitely set send_wait to NULL after
we removed the tasklet from the task list.
Failure to do so may lead to crashes if the h2s isn't immediately destroyed,
because we considered there were still something to send.

This should be backported to 1.9.
2019-05-09 13:26:48 +02:00
Willy Tarreau
201fe40653 BUG/MINOR: mux-h2: fix the condition to close a cs-less h2s on the backend
A typo was introduced in the following commit : 927b88ba0 ("BUG/MAJOR:
mux-h2: fix race condition between close on both ends") making the test
on h2s->cs never being done and h2c->cs being dereferenced without being
tested. This also confirms that this condition does not happen on this
side but better fix it right now to be safe.

This must be backported to 1.9.
2019-05-07 19:17:50 +02:00
Willy Tarreau
f656279347 CLEANUP: task: remove unneeded tests before task_destroy()
Since previous commit it's not needed anymore to test a task pointer
before calling task_destory() so let's just remove these tests from
the various callers before they become confusing. The function's
arguments were also documented. The same should probably be done
with tasklet_free() which involves a test in roughly half of the
call places.
2019-05-07 19:08:16 +02:00
Willy Tarreau
2135f91d18 BUG/MEDIUM: h2/htx: never leave a trailers block alone with no EOM block
If when receiving an H2 response we fail to add an EOM block after too
large a trailers block, we must not leave the trailers block alone as it
violates the internal assumptions by not being followed by an EOM, even
when an error is reported. We must then make sure the error will safely
be reported to upper layers and that no attempt will be made to forward
partial blocks.

This must be backported to 1.9.
2019-05-07 11:17:32 +02:00
Willy Tarreau
fb07b3f825 BUG/MEDIUM: mux-h2/htx: never wait for EOM when processing trailers
In message https://www.mail-archive.com/haproxy@formilux.org/msg33541.html
Patrick Hemmer reported an interesting bug affecting H2 and trailers.

The problem is that in order to close the stream we have to see the EOM
block, but nothing guarantees it will atomically be delivered with the
trailers block(s). So the code currently waits for it by returning zero
when it was not found, resulting in the caller (h2_snd_buf()) to loop
forever calling it again.

The current internal connection/connstream API doesn't allow a send
actor to notify its caller that it cannot process the data until it
gets more, so even returning zero will only lead to calls in loops
without any guarantee that any progress will be made.

Some late amendments to HTX already guaranteed the atomicity of the
trailers block during snd_buf(), which is currently ensured by the
fact that producers create exactly one such trailers block for all
trailers. So in practice we can only loop between trailers and EOM.

This patch changes the behaviour by making h2s_htx_make_trailers()
become atomic by not consuming the EOM block. This way either it finds
the end of trailers marker (empty line) or it fails. Once it sends the
trailers block, ES is set so the stream turns HLOC or CLOSED. Thanks
to previous patch "MEDIUM: mux-h2: discard contents that are to be sent
after a shutdown" is is now safe to interrupt outgoing data processing,
and the late EOM block will silently be discarded when the caller
finally sends it.

This is a bit tricky but should remain solid by design, and seems like
the only option we have that is compatible with 1.9, where it must be
backported along with the aforementioned patch.
2019-05-07 11:08:02 +02:00
Willy Tarreau
2b77848418 MEDIUM: mux-h2: discard contents that are to be sent after a shutdown
In h2_snd_buf() we discard any possible buffer contents requested to be
sent after a close or an error. But in practice we can extend this to
any case where the stream is locally half-closed since it means we will
never be able to send these data anymore.

For now it must not change anything, but it will be used by subsequent
patches to discard lone a HTX EOM block arriving after the trailers
block.
2019-05-07 11:08:02 +02:00
Willy Tarreau
aab1a60977 BUG/MEDIUM: h2/htx: always fail on too large trailers
In case a header frame carrying trailers just fits into the HTX buffer
but leaves no room for the EOM block, we used to return the same code
as the one indicating we're missing data. This could would result in
such frames causing timeouts instead of immediate clean aborts. Now
they are properly reported as stream errors (since the frame was
decoded and the compression context is still synchronized).

This must be backported to 1.9.
2019-05-07 11:08:02 +02:00
Willy Tarreau
5121e5d750 BUG/MINOR: mux-h2: rely on trailers output not input to turn them to empty data
When sending trailers, we may face an empty HTX trailers block or even
have to discard some of the headers there and be left with nothing to
send. RFC7540 forbids sending of empty HEADERS frames, so in this case
we turn to DATA frames (which is possible since after other DATA).

The code used to only check the input frame's contents to decide whether
or not to switch to a DATA frame, it didn't consider the possibility that
the frame only used to contain headers discarded later, thus it could still
emit an empty HEADERS frame in such a case. This patch makes sure that the
output frame size is checked instead to take the decision.

This patch must be backported to 1.9. In practice this situation is never
encountered since the discarded headers have really nothing to do in a
trailers block.
2019-05-07 11:07:59 +02:00
Willy Tarreau
97215ca284 BUG/MEDIUM: mux-h2: properly deal with too large headers frames
In h2c_decode_headers(), now that we support CONTINUATION frames, we
try to defragment all pending frames at once before processing them.
However if the first is exactly full and the second cannot be parsed,
we don't detect the problem and we wait for the next part forever due
to an incorrect check on exit; we must abort the processing as soon as
the current frame remains full after defragmentation as in this case
there is no way to make forward progress.

Thanks to Yves Lafon for providing traces exhibiting the problem.

This must be backported to 1.9.
2019-04-29 10:20:21 +02:00
Olivier Houchard
e179d0e88f MEDIUM: connections: Provide a xprt_ctx for each xprt method.
For most of the xprt methods, provide a xprt_ctx.  This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.
2019-04-18 14:56:24 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Olivier Houchard
998410a41b BUG/MEDIUM: h2: Revamp the way send subscriptions works.
Instead of abusing the SUB_CALL_UNSUBSCRIBE flag, revamp the H2 code a bit
so that it just checks if h2s->sending_list is empty to know if the tasklet
of the stream_interface has been waken up or not.
send_wait is now set to NULL in h2_snd_buf() (ideally we'd set it to NULL
as soon as we're waking the tasklet, but it can't be done, because we still
need it in case we have to remove the tasklet from the task list).
2019-04-15 19:27:57 +02:00
Olivier Houchard
9a0f559676 BUG/MEDIUM: h2: Make sure we're not already in the send_list in h2_subscribe().
In h2_subscribe(), don't add ourself to the send_list if we're already in it.
That may happen if we try to send and fail twice, as we're only removed
from the send_list if we managed to send data, to promote fairness.
Failing to do so can lead to either an infinite loop, or some random crashes,
as we'd get the same h2s in the send_list twice.

This should be backported to 1.9.
2019-04-15 19:27:57 +02:00
Olivier Houchard
0e0793715c BUG/MEDIUM: muxes: Make sure we unsubcribed when destroying mux ctx.
In the h1 and h2 muxes, make sure we unsubscribed before destroying the
mux context.
Failing to do so will lead in a segfault later, as the connection will
attempt to dereference its conn->send_wait or conn->recv_wait, which pointed
to the now-free'd mux context.

This was introduced by commit 39a96ee16e, so
should only be backported if that commit gets backported.
2019-04-15 19:27:57 +02:00
Christopher Faulet
61840e715f BUG/MEDIUM: muxes: Don't dereference mux context if null in release functions
When a mux context is released, we must be sure it exists before dereferencing
it. The bug was introduced in the commit 39a96ee16 ("MEDIUM: muxes: Be prepared
to don't own connection during the release").

No need to backport this patch, expect if the commit 39a96ee16 is backported
too.
2019-04-15 09:47:10 +02:00
Christopher Faulet
39a96ee16e MEDIUM: muxes: Be prepared to don't own connection during the release
This happens during mux upgrades. In such case, when the destroy() callback is
called, the connection points to a different mux's context than the one passed
to the callback. It means the connection is owned by another mux. The old mux is
then released but the connection is not closed.
2019-04-12 22:06:53 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Christopher Faulet
51f73eb11a MEDIUM: muxes: Add an optional input buffer during mux initialization
The mux's callback init() now take a pointer to a buffer as extra argument. It
must be used by the multiplexer as its input buffer. This buffer is always NULL
when a multiplexer is initialized with a fresh connection. But if a mux upgrade
is performed, it may be filled with existing data. Note that, for now, mux
upgrades are not supported. But this commit is mandatory to do so.
2019-04-12 22:06:53 +02:00
Christopher Faulet
e9b7072e9e MINOR: muxes: Rely on conn_is_back() during init to handle front/back conn
Instead of using the connection context to make the difference between a
frontend connection and a backend connection, we now rely on the function
conn_is_back().
2019-04-12 22:06:53 +02:00
Christopher Faulet
9f38f5aa80 MINOR: muxes: Add a flag to specify a multiplexer uses the HTX
A multiplexer must now set the flag MX_FL_HTX when it uses the HTX to structured
the data exchanged with channels. the muxes h1 and h2 set this flag. Of course,
for the mux h2, it is set on h2_htx_ops only.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9b579106fe MINOR: mux-h2: Add a mux_ops dedicated to the HTX mode
Instead of using the same mux_ops structure for the legacy HTTP mode and the HTX
mode, a dedicated mux_ops is now used for the HTX mode. Same callbacks are used
for both. But the flags may be different depending on the mode used.
2019-04-12 22:06:53 +02:00
Olivier Houchard
3ca18bf0bd BUG/MEDIUM: h2: Don't attempt to recv from h2_process_demux if we subscribed.
Modify h2c_restart_reading() to add a new parameter, to let it know if it
should consider if the buffer isn't empty when retrying to read or not, and
call h2c_restart_reading() using 0 as a parameter from h2_process_demux().
If we're leaving h2_process_demux() with a non-empty buffer, it means the
frame is incomplete, and we're waiting for more data, and if we already
subscribed, we'll be waken when more data are available.
Failing to do so means we'll be waken up in a loop until more data are
available.

This should be backported to 1.9.
2019-04-05 16:03:54 +02:00
Willy Tarreau
a27db38f12 BUG/MEDIUM: mux-h2: make sure to always notify streams of EOS condition
Recent commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") introduced a race which may manifest itself with
small connection counts on large objects and large server timeouts in
legacy mode. Sometimes h2s_close() is called while the data layer is
subscribed to read events but nothing in the chain can cause this wake-up
to happen and some streams stall for a while at the end of a transfer
until the server timeout strikes and ends the stream completes.

We need to wake the stream up if it's subscribed to rx events there,
which is what this patch does. When the patch above is backported to
1.9, this patch will also have to be backported.
2019-03-25 18:13:16 +01:00
Willy Tarreau
e73256fd2a BUG/MEDIUM: task/h2: add an idempotent task removal fucntion
Previous commit 3ea351368 ("BUG/MEDIUM: h2: Remove the tasklet from the
task list if unsubscribing.") uncovered an issue which needs to be
addressed in the scheduler's API. The function task_remove_from_task_list()
was initially designed to remove a task from the running tasklet list from
within the scheduler, and had to be used in h2 to abort pending I/O events.
However this function was not designed to be idempotent, occasionally
causing a double removal from the tasklet list, with the second doing
nothing but affecting the apparent tasks count and making haproxy use
100% CPU on some tests consisting in stopping the client during some
transfers. The h2_unsubscribe() function can sometimes be called upon
stream exit after an error where the tasklet was possibly already
removed, so it.

This patch does 2 things :
  - it renames task_remove_from_task_list() to
    __task_remove_from_tasklet_list() to discourage users from calling
    it. Also note the fix in the naming since it's a tasklet list and
    not a task list. This function is still uesd from the scheduler.
  - it adds a new, idempotent, task_remove_from_tasklet_list() function
    which does nothing if the task is already not in the tasklet list.

This patch will need to be backported where the commit above is backported.
2019-03-25 18:02:54 +01:00
Olivier Houchard
3ea3513689 BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.
In h2_unsubscribe(), if we unsubscribe on SUB_CALL_UNSUBSCRIBE, then remove
ourself from the sending_list, and remove the tasklet from the task list.
We're probably about to destroy the stream anyway, so we don't want the
tasklet to run, or to stay in the sending_list, or it could lead to a crash.

This should be backpored to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
afc7cb85c4 BUG/MEDIUM: h2: Follow the same logic in h2_deferred_shut than in h2_snd_buf.
In h2_deferred_shut(), don't just set h2s->send_wait to NULL, instead, use
the same logic as in h2_snd_buf() and only do so if we successfully sent data
(or if we don't want to send them anymore). Setting it to NULL can lead to
crashes.

This should be backported to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
fd1e96d2fb BUG/MEDIUM: h2: Use the new sending_list in h2s_notify_send().
In h2s_notify_send(), use the new sending_list instead of using the old
way of setting hs->send_wait to NULL, failing to do so may lead to crashes.

This should be backported to 1.9.
2019-03-25 14:34:26 +01:00
Olivier Houchard
01d4cb5339 BUG/MEDIUM: h2: only destroy the h2s if h2s->cs is NULL.
In h2_deferred_shut(), only attempt to destroy the h2s if h2s->cs is NULL.
h2s->cs being non-NULL means it's still referenced by the stream interface,
so it may try to use it later, and that could lead to a crash.

This should be backported to 1.9.
2019-03-25 13:35:02 +01:00
Christopher Faulet
87a8f353f1 CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL
Since the flag CF_SHUTR is no more set to mark the end of the message, these
flags become useless.

This patch should be backported to 1.9.
2019-03-25 06:55:23 +01:00
Christopher Faulet
63768a63d7 MEDIUM: mux-h2: Don't mix the end of the message with the end of stream
The H2 multiplexer now sets CS_FL_EOI when it receives a frame with the ES
flag. And when the H2 streams is closed, it set the flag CS_FL_REOS.

This patch should be backported to 1.9.
2019-03-25 06:26:30 +01:00
Christopher Faulet
3ab07c35b4 MINOR: mux-h2: Remove useless test on ES flag in h2_frt_transfer_data()
Same test is already performed in the caller function, h2c_frt_handle_data().

This patch should be backported to 1.9.
2019-03-22 18:06:17 +01:00
Olivier Houchard
d360ac60f4 BUG/MEDIUM: h2: Try to be fair when sending data.
On the send path, try to be fair, and make sure the first to attempt to
send data will actually be the first to send data when it's possible (ie
when the mux' buffer is not full anymore).
To do so, use a separate list element for the sending_list, and only remove
the h2s from the send_list/fctl_list if we successfully sent data. If we did
not, we'll keep our place in the list, and will be able to try again next time.

This should be backported to 1.9.
2019-03-22 18:05:03 +01:00
Willy Tarreau
749f5cab83 CLEANUP: mux-h2: add some comments to help understand the code
Some functions' roles and usage are far from being obvious, and diving
into this part each time requires deep concentration before starting to
understand who does what. Let's add a few comments which help figure
some of the useful pieces.
2019-03-21 19:19:36 +01:00
Willy Tarreau
8ab128c06a MINOR: mux-h2: copy small data blocks more often and reduce the number of pauses
We tend to refrain from sending data a bit too much in the H2 mux :
whenever there are pending data in the buffer and we try to copy
something larger than 1/4 of the buffer we prefer to pause. This
is suboptimal for medium-sized objects which have to send their
headers and later their data.

This patch slightly changes this by allowing a copy of a large block
if it fits at once and if the realign cost is small, i.e. the pending
data are small or the block fits in the contiguous area. Depending on
the object size this measurably improves the download performance by
between 1 and 10%, and possibly lowers the transfer latency for medium
objects.
2019-03-21 18:28:31 +01:00
Olivier Houchard
fd8bd4521a BUG/MEDIUM: mux-h2: Use the right list in h2_stop_senders().
In h2_stop_senders(), when we're about to move the h2s about to send back
to the send_list, because we know the mux is full, instead of putting them
all in the send_list, put them back either in the fctl_list or the send_list
depending on if they are waiting for the flow control or not. This also makes
sure they're inserted in their arrival order and not reversed.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Olivier Houchard
16ff261633 BUG/MEDIUM: mux-h2: Don't bother keeping the h2s if detaching and nothing to send.
In h2_detach(), don't bother keeping the h2s even if it was waiting for
flow control if we no longer are subscribed for receiving or sending, as
nobody will do anything once we can write in the mux, anyway. Failing to do
so may lead to h2s being kept opened forever.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Olivier Houchard
7a977431ca BUG/MEDIUM: mux-h2: Make sure we destroyed the h2s once shutr/shutw is done.
If we're waiting until we can send a shutr and/or a shutw, once we're done
and not considering sending anything, destroy the h2s, and eventually the
h2c if we're done with the whole connection, or it will never be done.

This should be backported to 1.9.
2019-03-21 18:28:31 +01:00
Christopher Faulet
203b2b0a5a MINOR: muxes: Report the Last read with a dedicated flag
For conveniance, in HTTP muxes (h1 and h2), the end of the stream and the end of
the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in HTTP analyzers, the EOS
really means a shutdown for read.

Most of time, for muxes h1 and h2, it works pretty well, especially because the
keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not everytime. For now,
client aborts are only reported if it happens before the end of the request. It
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. It is the
expected behaviour, excpect when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So to be short, for now, the abortonclose is broken.

In fact, it is a design problem and we have to rethink all channel's flags and
probably the conn-stream ones too. It is important to split EOS and EOM to not
loose information anymore. But it is not a small job and the refactoring will be
far from straightforward.

So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistant. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have the chance to handle the last read.

This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.
2019-03-18 15:50:23 +01:00
Christopher Faulet
35757d38ce MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was never sent
According to the H2 spec (see #8.1.4), setting the REFUSED_STREAM error code
is a way to indicate that the stream is being closed prior to any processing
having occurred, such as when a server-side H1 keepalive connection is closed
without sending anything (which differs from the regular error case since
haproxy doesn't even generate an error message). Any request that was sent on
the reset stream can be safely retried. So, when a stream is closed, if no
data was ever sent back (ie. the flag H2_SF_HEADERS_SENT is not set), we can
set the REFUSED_STREAM error code on the RST_STREAM frame.

This patch may be backported to 1.9.
2019-03-18 15:50:23 +01:00
Christopher Faulet
f02ca00a36 BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
This only happens for server streams because their id is assigned when the first
message is sent. If these streams are not woken up, some events can be lost
leading to frozen streams. For instance, it happens when a server closes its
connection before sending its preface.

This patch must be backported to 1.9.
2019-03-18 15:50:23 +01:00
Willy Tarreau
7196dd6071 MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR between h2s and buf on RX
In order to allow the H2 parser to report parsing errors, we must make
sure to always pass the HTX_FL_PARSING_ERROR flag from the h2s htx to
the conn_stream's htx.
2019-03-05 10:56:34 +01:00
Willy Tarreau
927b88ba00 BUG/MAJOR: mux-h2: fix race condition between close on both ends
A crash in H2 was reported in issue #52. It turns out that there is a
small but existing race by which a conn_stream could detach itself
using h2_detach(), not being able to destroy the h2s due to pending
output data blocked by flow control, then upon next h2s activity
(transfer_data or trailers parsing), an ES flag may need to be turned
into a CS_FL_REOS bit, causing a dereference of a NULL stream. This is
a side effect of the fact that we still have a few places which
incorrectly depend on the CS flags, while these flags should only be
set by h2_rcv_buf() and h2_snd_buf().

All candidate locations along this path have been secured against this
risk, but the code should really evolve to stop depending on CS anymore.

This fix must be backported to 1.9 and possibly partially to 1.8.
2019-03-04 08:17:12 +01:00
Willy Tarreau
0bbad6bb06 BUG/MEDIUM: h2: advertise to servers that we don't support push
The h2c_send_settings() function was initially made to serve on the
frontend. Here we don't need to advertise that we don't support PUSH
since we don't do that ourselves. But on the backend side it's
different because PUSH is enabled by default so we must announce that
we don't want the server to use it.

This must be backported to 1.9.
2019-02-26 16:07:27 +01:00
Willy Tarreau
67b8caefc9 BUG/MEDIUM: mux-h2/htx: send an empty DATA frame on empty HTX trailers
When chunked-encoding is used in HTX mode, a trailers HTX block is always
made due to the way trailers are currently implemented (verbatim copy of
the H1 representation). Because of this it's not possible to know when
processing data that we've reached the end of the stream, and it's up
to the function encoding the trailers (h2s_htx_make_trailers) to put the
end of stream. But when there are no trailers and only an empty HTX block,
this one cannot produce a HEADERS frame, thus it cannot send the END_STREAM
flag either, leaving the other end with an incomplete message, waiting for
either more data or some trailers. This is particularly visible with POST
requests where the server continues to wait.

What this patch does is transform the HEADERS frame into an empty DATA
frame when meeting an empty trailers block. It is possible to do this
because we've not sent any trailers so the other end is still waiting
for DATA frames. The check is made after attempting to encode the list
of headers, so as to minimize the specific code paths.

Thanks to Dragan Dosen for reporting the issue with a reproducer.

This fix must be backported to 1.9.
2019-02-21 18:22:26 +01:00
Willy Tarreau
a24b35ca18 MINOR: mux-h2: make the H2 MAX_FRAME_SIZE setting configurable
This creates a new tunable "tune.h2.max-frame-size" to adjust the
advertised max frame size. When not set it still defaults to the buffer
size. It is convenient to advertise sizes lower than the buffer size,
for example when using very large buffers.
2019-02-21 17:30:59 +01:00
Christopher Faulet
eaf0d2a936 MINOR: mux-h2: Set HTX extra value when possible
For now, this can be only done when a content-length is specified. In fact, it
is the same value than h2s->body_len, the remaining body length according to
content-length. Setting this field allows the fast forwarding at the channel
layer, improving significantly data transfer for big objects.

This patch may be backported to 1.9.
2019-02-19 16:26:14 +01:00
Christopher Faulet
0b46548a68 BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled
1xx responses does not work in HTTP2 when the HTX is enabled. First of all, when
a response is parsed, only one HEADERS frame is expected. So when an interim
response is received, the flag H2_SF_HEADERS_RCVD is set and the next HEADERS
frame (for another interim repsonse or the final one) is parsed as a trailers
one. Then when the response is sent, because an EOM block is found at the end of
the interim HTX response, the ES flag is added on the frame, closing too early
the stream. Here, it is a design problem of the HTX. Iterim responses are
considered as full messages, leading to some ambiguities when HTX messages are
processed. This will not be fixed now, but we need to keep it in mind for future
improvements.

To fix the parsing bug, the flag H2_MSGF_RSP_1XX is added when the response
headers are decoded. When this flag is set, an EOM block is added into the HTX
message, despite the fact that there is no ES flag on the frame. And we don't
set the flag H2_SF_HEADERS_RCVD on the corresponding H2S. So the next HEADERS
frame will not be parsed as a trailers one.

To fix the sending bug, the ES flag is not set on the frame when an interim
response is processed and the flag H2_SF_HEADERS_SENT is not set on the
corresponding H2S.

This patch must be backported to 1.9.
2019-02-19 16:26:14 +01:00
Christopher Faulet
fd74267264 BUG/MINOR: mux-h2: Don't add ":status" pseudo-header on trailers
It is a cut-paste bug. Pseudo-header fields MUST NOT appear in trailers.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Christopher Faulet
37070b2b15 BUG/MEDIUM: mux-h2/htx: Always set CS flags before exiting h2_rcv_buf()
It is especially important when some data are blocked in the RX buf and the
channel buffer is already full. In such case, instead of exiting the function
directly, we need to set right flags on the conn_stream. CS_FL_RCV_MORE and
CS_FL_WANT_ROOM must be set, otherwise, the stream-interface will subscribe to
receive events, thinking it is not blocked.

This bug leads to connection freeze when everything was received with some data
blocked in the RX buf and a channel full.

This patch must be backported to 1.9.
2019-02-18 16:25:06 +01:00
Willy Tarreau
053c15750b BUG/MEDIUM: mux-h2: always set :authority on request output
PiBa-NL reported that some servers don't fall back to the Host header when
:authority is absent. After studying all the combinations of Host and
:authority, it appears that we always have to send the latter, hence we
never need the former. In case of CONNECT method, the authority is retrieved
from the URI part, otherwise it's extracted from the Host field.

The tricky part is that we have to scan all headers for the Host header
before dumping other headers. This is due to the fact that we must emit
pseudo headers before other ones. One improvement could possibly be made
later in the request parser to search and emit the Host header immediately
if authority was not found. This would cost nothing on the vast marjority
of requests and make the lookup faster on output since Host would appear
first.

This fix must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Willy Tarreau
5be92ff23f BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
This is mandated by RFC7540 #8.3, these pseudo-headers must not be emitted
with the CONNECT method.

This must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Olivier Houchard
9c9da5ee89 MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_free().
conn_free() already removes the connection from any idle list, so there's no
need to do it in the mux code, just before calling conn_free().
2019-01-31 19:38:25 +01:00
Willy Tarreau
8694978892 BUG/MEDIUM: mux-h2: properly consider the peer's advertised max-concurrent-streams
Till now we used to only rely on tune.h2.max-concurrent-streams but if
a peer advertises a lower limit this can cause streams to be reset or
even the conection to be killed. Let's respect the peer's value for
outgoing streams.

This patch should be backported to 1.9, though it depends on the following
ones :
    BUG/MINOR: server: fix logic flaw in idle connection list management
    MINOR: mux-h2: max-concurrent-streams should be unsigned
    MINOR: mux-h2: make sure to only check concurrency limit on the frontend
    MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
2019-01-31 19:38:25 +01:00
Willy Tarreau
2e2083ae5b MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
We used not to take it into account because we only used the configured
parameter everywhere. This patch makes sure we can actually learn the
value advertised by the peer. We still enforce our own limit on top of
it however, to make sure we can actually limit resources or stream
concurrency in case of suboptimal server settings.
2019-01-31 19:38:25 +01:00