803 Commits

Author SHA1 Message Date
Willy Tarreau
96a10c24cf MINOR: mux-h2: fail stream creation more cleanly using RST_STREAM
The H2 demux only checks for too many streams in h2c_frt_stream_new(),
then refuses to create a new stream and causes the connection to be
aborted by sending a GOAWAY frame. This will also happen if any error
happens during the stream creation (e.g. memory allocation).

RFC7540#5.1.2 says that attempts to create streams in excess should
instead be dealt with using an RST_STREAM frame conveying either the
PROTOCOL_ERROR or REFUSED_STREAM reason (the latter being usable only
if it is guaranteed that the stream was not processed). In theory it
should not happen for well behaving clients, though it may if we
configure a low enough h2.max_concurrent_streams limit. This error
however may definitely happen on memory shortage.

Previously it was not possible to use RST_STREAM due to the fact that
the HPACK decompressor would be desynchronized. But now we first decode
and only then try to allocate the stream, so the decompressor remains
synchronized regardless of policy or resources issues.

With this patch we enforce stream termination with RST_STREAM and
REFUSED_STREAM if this protocol violation happens, as well as if there
is a temporary condition like a memory allocation issue. It will allow
a client to recover cleanly.

This could possibly be backported to 1.9. Note that this requires that
these five previous patches are merged as well :

    MINOR: h2: add a bit-based frame type representation
    MEDIUM: mux-h2: remove padlen during headers phase
    MEDIUM: mux-h2: decode HEADERS frames before allocating the stream
    MINOR: mux-h2: make h2c_send_rst_stream() use the dummy stream's error code
    MINOR: mux-h2: add a new dummy stream for the REFUSED_STREAM error code
2018-12-24 11:45:00 +01:00
Willy Tarreau
8d0d58bf6a MINOR: mux-h2: add a new dummy stream for the REFUSED_STREAM error code
This patch introduces a new dummy stream, h2_refused_stream, in CLOSED
status with the aforementioned error code. It will be usable to reject
unexpected extraneous streams.
2018-12-24 11:45:00 +01:00
Willy Tarreau
e6888fff75 MINOR: mux-h2: make h2c_send_rst_stream() use the dummy stream's error code
We currently have 2 dummy streams allowing us to send an RST_STREAM
message with an error code matching this one. However h2c_send_rst_stream()
still enforces the STREAM_CLOSED error code for these dummy streams,
ignoring their respective errcode fields which however are properly
set.

Let's make the function always use the stream's error code. This will
allow to create other dummy streams for different codes.
2018-12-24 11:45:00 +01:00
Willy Tarreau
5c8cafae39 MEDIUM: mux-h2: decode HEADERS frames before allocating the stream
It's hard to recover from a HEADERS frame decoding error after having
already created the stream, and it's not possible to recover from a
stream allocation error without dropping the connection since we can't
maintain the HPACK context, so let's decode it before allocating the
stream, into a temporary buffer that will then be offered to the newly
created stream.
2018-12-24 11:45:00 +01:00
Willy Tarreau
6fa380dbba MINOR: mux-h2: remove useless check for empty frame length in h2s_decode_headers()
This test for an empty frame was already performed in the callers, there
is no need for checking it again.
2018-12-24 11:45:00 +01:00
Willy Tarreau
3bf6918cb2 MEDIUM: mux-h2: remove padlen during headers phase
Three types of frames may be padded : DATA, HEADERS and PUSH_PROMISE.
Currently, each of these independently deals with padding and needs to
wait for and skip the initial padlen byte. Not only this complicates
frame processing, but it makes it very hard to process CONTINUATION
frames after a padded HEADERS frame, and makes it complicated to perform
atomic calls to h2s_decode_headers(), which are needed if we want to be
able to maintain the HPACK decompressor's context even when dropping
streams.

This patch takes a different approach : the padding is checked when
parsing the frame header, the padlen byte is waited for and parsed,
and the dpl value is updated with this padlen value. This will allow
the frame parsers to decide to overwrite the padding if needed when
merging adjacent frames.
2018-12-24 11:45:00 +01:00
Willy Tarreau
a875466243 BUG/MEDIUM: mux-h2: mark that we have too many CS once we have more than the max
Since commit f210191 ("BUG/MEDIUM: h2: don't accept new streams if
conn_streams are still in excess") we're refraining from reading input
frames if we've reached the limit of number of CS. The problem is that
it prevents such situations from working fine. The initial purpose was
in fact to prevent from reading new HEADERS frames when this happens,
and causes some occasional transfer hiccups and pauses with large
concurrencies.

Given that we now properly reject extraneous streams before checking
this value, we can be sure never to have too many streams, and that
any higher value is only caused by a scheduling reason and will go
down after the scheduler calls the code.

This fix must be backported to 1.9 and possibly to 1.8. It may be
tested using h2spec this way with an h2spec config :

  while :; do
    h2spec -o 5 -v -t -S -k -h 127.0.0.1 -p 4443 http2/5.1.2
  done
2018-12-24 08:13:16 +01:00
Willy Tarreau
c4ea04c2b6 BUG/MINOR: mux-h2: make empty HEADERS frame return a connection error
We were returning a stream error of type PROTOCOL_ERROR on empty HEADERS
frames, but RFC7540#4.2 stipulates that we should instead return a
connection error of type FRAME_SIZE_ERROR.

This may be backported to 1.9 and 1.8 though it's unlikely to have any
real life effect.
2018-12-23 10:02:38 +01:00
Willy Tarreau
97aaa67658 MINOR: mux-h2: only increase the connection window with the first update
Commit dc57236 ("BUG/MINOR: mux-h2: advertise a larger connection window
size") caused a WINDOW_UPDATE message to be sent early with the connection
to increase the connection's window size. It turns out that it causes some
minor trouble that need to be worked around :
  - varnishtest cannot transparently cope with the WU frames during the
    handshake, forcing all tests to explicitly declare the handshake
    sequence ;
  - some vtc scripts randomly fail if the WU frame is sent after another
    expected response frame, adding uncertainty to some tests ;
  - h2spec doesn't correctly identify these WU at the connection level
    that it believes are the responses to some purposely erroneous frames
    it sends, resulting in some errors being reported

None of these are a problem with real clients but they add some confusion
during troubleshooting.

Since the fix above was intended to increase the upload bandwidth, we
have another option which is to increase the window size with the first
WU frame sent for the connection. This way, no WU frame is sent until
one is really needed, and this first frame will adjust the window to
the maximum value. It will make the window increase slightly later, so
the client will experience the first round trip when uploading data,
but this should not be perceptible, and is not worth the extra hassle
needed to maintain our debugging abilities. As an extra bonus, a few
extra bytes are saved for each connection until the first attempt to
upload data.

This should possibly be backported to 1.9 and 1.8.
2018-12-23 09:49:04 +01:00
Willy Tarreau
47b515a462 BUG/MEDIUM: mux-h2: don't needlessly wake up the demux on short frames
In some situations, if too short a frame header is received, we may leave
h2_process_demux() waking up the task again without checking that we were
already subscribed.

In order to avoid this once for all, let's introduce an h2_restart_reading()
function which performs the control and calls the task up. This way we won't
needlessly wake the task up if it's already waiting for I/O.

Must be backported to 1.9.
2018-12-21 16:12:33 +01:00
Willy Tarreau
645b33d233 BUG/MEDIUM: mux-h2: Don't forget to quit the send list on error reports
Similar to last fix, we need to quit the send list when reporting an
error via the send side.

This should be backported to 1.9.
2018-12-20 15:35:57 +01:00
Olivier Houchard
f29cd5c8a8 BUG/MEDIUM: h2: Don't forget to quit the sending_list if SUB_CALL_UNSUBSCRIBE.
In mux_h2_unsubscribe, don't forget to leave the sending_list if
SUB_CALL_UNSUBSCRIBE was set. SUB_CALL_UNSUBSCRIBE means we were about
to be woken up for writing, unless the mux was too full to get more data.
If there's an unsubscribe call in the meanwhile, we should leave the list,
or we may be put back in the send_list.

This should be backported to 1.9.
2018-12-20 12:24:43 +01:00
Olivier Houchard
6dea2ee939 BUG/MEDIUM: h2: Don't wait for flow control if the connection had a shutr.
In h2_snd_buf(), if we couldn't send the data because of flow control, and
the connection got a shutr, then add CS_FL_ERROR (or CS_FL_ERR_PENDING). We
will never get any window update, so we will never be unlocked, anyway.

No backport is needed.
2018-12-19 18:35:40 +01:00
Willy Tarreau
fde287cc76 BUG/MINOR: mux-h2: make sure we check the conn_stream in early data
When dealing with early data we scan the list of stream to notify them.
We're not supposed to have h2s->cs == NULL here but it doesn't cost much
to make the scan more robust and verify it before notifying.

No backport is needed.
2018-12-19 18:33:16 +01:00
Willy Tarreau
ec988c7a0f CLEANUP: mux-h2: make use of cs_set_error()
It's cleaner than open-coding the conditions and error bits.
2018-12-19 18:13:52 +01:00
Willy Tarreau
f830f018cf BUG/MEDIUM: mux-h2: make use of h2s_alert() to report aborts
If we had no pending read, it could be complicated to report an
RST_STREAM to a sender since we used to only report it via the
rx side if subscribed. Similarly in h2_wake_some_streams() we
now try all methods, hoping to catch all possible events.

No backport is needed.
2018-12-19 18:13:52 +01:00
Willy Tarreau
8b2757c339 MINOR: mux-h2: add a new function h2s_alert() to call the data layer
In order to report an error to the data layer, we have different ways
depending on the situation. At a lot of places it's open-coded and not
always correct. Let's create a new function h2s_alert() to handle this
task. It tries to wake on recv() first, then on send(), then using
wake().
2018-12-19 18:13:48 +01:00
Willy Tarreau
7e094451d0 CLEANUP: mux-h2: implement h2s_notify_{send,recv} to report events to subscribers
Till now we had to open-code all the manipulation of the wait_event,
let's use standarized functions for this and reduce the risk of bugs.
2018-12-19 18:11:35 +01:00
Olivier Houchard
251064b02d BUG/MEDIUM: h2: Make sure we don't set CS_FL_ERROR if there's still data.
In the mux h2, make sure we set CS_FL_ERR_PENDING and wake the recv task,
instead of setting CS_FL_ERROR, if CS_FL_EOS is not set, so if there's
potentially still some data to be sent.
2018-12-19 17:28:54 +01:00
Olivier Houchard
9117780bfd BUG/MEDIUM: mux-h2: pass CS_FL_ERR_PENDING to h2_wake_some_streams()
Commiy 8519357c ("BUG/MEDIUM: mux-h2: report asynchronous errors in
h2_wake_some_streams()") addressed an issue with synchronous errors
but forgot to fix the call places to also pass CS_FL_ERR_PENDING
instead of CS_FL_ERROR.

No backport is needed.
2018-12-19 17:06:49 +01:00
Olivier Houchard
2f30883793 BUG/MEDIUM: H2: Make sure htx is set even on empty frames.
When transfering data, make sure htx is set even on empty frames, or we
will never add a HTX_BLK_EOM block.
2018-12-19 17:00:14 +01:00
Willy Tarreau
3d2ee55ebd CLEANUP: connection: rename conn->mux_ctx to conn->ctx
We most often store the mux context there but it can also be something
else while setting up the connection. Better call it "ctx" and know
that it's the owner's context than misleadingly call it mux_ctx and
get caught doing suspicious tricks.
2018-12-19 14:13:07 +01:00
Willy Tarreau
4f6516d677 CLEANUP: connection: rename subscription events values and event field
The SUB_CAN_SEND/SUB_CAN_RECV enum values have been confusing a few
times, especially when checking them on reading. After some discussion,
it appears that calling them SUB_RETRY_SEND/SUB_RETRY_RECV more
accurately reflects their purpose since these events may only appear
after a first attempt to perform the I/O operation has failed or was
not completed.

In addition the wait_reason field in struct wait_event which carries
them makes one think that a single reason may happen at once while
it is in fact a set of events. Since the struct is called wait_event
it makes sense that this field is called "events" to indicate it's the
list of events we're subscribed to.

Last, the values for SUB_RETRY_RECV/SEND were swapped so that value
1 corresponds to recv and 2 to send, as is done almost everywhere else
in the code an in the shutdown() call.
2018-12-19 14:09:21 +01:00
Willy Tarreau
567beb8a91 BUG/MEDIUM: mux-h2: make sure the demux also wakes streams up on errors
Today the demux only wakes a stream up after receiving some contents, but
not necessarily on close or error. Let's do it based on both error flags
and both EOS flags. With a bit of refinement we should be able to only do
it when the pending bits are there but not the static ones.

No backport is needed.
2018-12-18 16:52:44 +01:00
Willy Tarreau
a8519357c5 BUG/MEDIUM: mux-h2: report asynchronous errors in h2_wake_some_streams()
This function is called when dealing with a connection error or a GOAWAY
frame. It used to report a synchronous error instead of an asycnhronous
error, which can lead to data truncation since whatever is still available
in the rxbuf will be ignored. Let's correctly use CS_FL_ERR_PENDING instead
and only fall back to CS_FL_ERROR if CS_FL_EOS was already delivered.

No backport is needed.
2018-12-18 16:46:24 +01:00
Willy Tarreau
7ecb6f10a4 BUG/MEDIUM: mux-h2: make sure to report synchronous errors after EOS
If EOS has already been reported on the conn_stream, there won't be
any read anymore to turn ERR_PENDING into ERROR, so we have to do
report it directly.

No backport is needed.
2018-12-18 16:46:19 +01:00
Willy Tarreau
3af3771bf3 BUG/MINOR: mux-h2: don't report a fantom h2s in "show fd"
The h2s pointer was used to scan fctl lists prior to being used to scan
the send list by ID, so it could appear non-null eventhough the list is
empty, resulting in misleading information on empty connections.

No backport is needed.
2018-12-18 14:34:41 +01:00
Willy Tarreau
987c0633fa MINOR: mux-h2: report more h2c, last h2s and cs information on "show fd"
Most of the time when we issue "show fd" to dump a mux's state, it's
to figure why a transfer is frozen. Connection, stream and conn_stream
states are critical there. And most of the time when this happens there
is a single stream left in the H2 mux, so let's always dump the last
known stream on show fd, as most of the time it will be the one of
interest.
2018-12-18 11:03:11 +01:00
Willy Tarreau
cef5c8e2aa BUG/MEDIUM: mux-h2: restart demuxing as soon as demux data are available
Commit 7505f94f9 ("MEDIUM: h2: Don't use a wake() method anymore.")
changed the conditions to restart demuxing so that this happens as soon
as something is read. But similar to previous fix, at an end of stream
we may be woken up with nothing to read but data still available in the
demux buffer, so we must also use this as a valid condition for demuxing.

No backport is needed, this is purely 1.9.
2018-12-18 11:03:11 +01:00
Willy Tarreau
c5b1004fbe BUG/MEDIUM: mux-h2: also restart demuxing when data are pending in demux
Commit 082f559d3 ("BUG/MEDIUM: h2: restart demuxing after releasing
buffer space") tried to address a situation where transfers could stall
after a read, but the condition was not completely covered : some stalls
may still happen at end of stream because there's nothing anymore to
receive and the last data lie in the demux buffer. Thus we must also
consider this state as a valid condition to restart demuxing.

No backport is needed.
2018-12-18 11:03:11 +01:00
Olivier Houchard
71748cb91b BUG/MEDIUM: connection: Add a new CS_FL_ERR_PENDING flag to conn_streams.
Add a new flag to conn_streams, CS_FL_ERR_PENDING. This is to be set instead
of CS_FL_ERR in case there's still more data to be read, so that we read all
the data before closing.
2018-12-17 21:54:14 +01:00
Olivier Houchard
ffda58b546 BUG/MEDIUM: h2: Don't destroy the h2s if it still has a cs attached.
In h2_deferred_shut, if we're done sending the shutr/shutw, don't destroy
the h2s if it still has a conn_stream attached, or the conn_stream may try
to access it again.
2018-12-16 08:22:01 +01:00
Olivier Houchard
746fb772f1 MEDIUM: mux_h2: Always set CS_FL_NOT_FIRST for new conn_streams.
When creating new conn_streams, always set the CS_FL_NOT_FIRST flag. We
don't really care about being the first request for HTTP/2, this only
really makes sense for HTTP/1, and that way we can reuse connections.
2018-12-15 23:50:11 +01:00
Olivier Houchard
a4d4fdfaa3 MEDIUM: sessions: Don't keep an infinite number of idling connections.
In session, don't keep an infinite number of connection that can idle.
Add a new frontend parameter, "max-session-srv-conns" to set a max number,
with a default value of 5.
2018-12-15 23:50:10 +01:00
Olivier Houchard
f502aca5c2 MEDIUM: mux: provide the session to the init() and attach() method.
Instead of trying to get the session from the connection, which is not
always there, and of course there could be multiple sessions per connection,
provide it with the init() and attach() methods, so that we know the
session for each outgoing stream.
2018-12-15 23:50:09 +01:00
Olivier Houchard
8a78690229 MEDIUM: mux: Destroy the stream before trying to add the conn to the idle list.
In the mux_h1 and mux_h2, move the test to see if we should add the
connection in the idle list until after we destroyed the h1s/h2s, that way
later we'll be able to check if the connection has no stream at all, and if
it should be added to the server idling list.
2018-12-15 23:50:09 +01:00
Olivier Houchard
2c68a462e1 BUG/MEDIUM: h2: Don't forget to destroy the h2s after deferred shut.
If we had to defer shutr/shutw, and we're now done, destroy the h2s, or
nobody will do so.
2018-12-15 23:50:07 +01:00
Olivier Houchard
84cca66ea3 BUG/MEDIUM: htx: When performing zero-copy, start from the right offset.
When using zerocopy, start from the beginning of the data, not from the
beginning of the buffer, it may have contained headers, and so the data
won't start at the beginning of the buffer.
2018-12-14 17:02:11 +01:00
Willy Tarreau
c0960d1185 MINOR: mux_h1/h2: simplify the zero-copy Rx alignment
The transpory layer now respects buffer alignment, so we don't need to
cheat anymore pretending we have some data at the head, adjusting the
buffer's head is enough.
2018-12-14 10:59:15 +01:00
Willy Tarreau
e0f24ee149 MINOR: connection: realign empty buffers in muxes, not transport layers
For a long time we've been realigning empty buffers in the transport
layers, where the I/Os were performed based on callbacks. Doing so is
optimal for higher data throughput but makes it trickier to optimize
unaligned data, where mux_h1/h2 have to claim some data are present
in the buffer to force unaligned accesses to skip the frame's header
or the chunk header.

We don't need to do this anymore since the I/O calls are now always
performed from top to bottom, so it's only the mux's responsibility
to realign an empty buffer if it wants to.

In practice it doesn't change anything, it's just a convention, and
it will allow the code to be simplified in a next patch.
2018-12-14 10:51:23 +01:00
Olivier Houchard
44d59146a6 MEDIUM: htx: Try to take a connection over if it has no owner.
In the mux detach function, when using HTX, take the connection over if
it no longer has an owner (ie because the session that was the owner left).
It is done for legacy code in proto_http.c, but not for HTX.
Also when using HTX, in H2, try to add the connection back to idle_conns if
it was not already (ie we used to use all the available streams, and we're
freeing one). That too was done in proto_http.c.
2018-12-13 18:54:27 +01:00
Willy Tarreau
2a59e87735 MINOR: mux-h2: force reads to be HTX-aligned in HTX mode
H2 has a 9-byte frame header, and HTX has a 40-byte frame header.
By artificially advancing the Rx header and limiting the amount of
bytes read to protect the end of the buffer, we can make the data
payload perfectly aligned with HTX blocks and optimize the copy.
2018-12-12 11:52:45 +01:00
Willy Tarreau
98de12a5d1 MEDIUM: mux-h2: implement true zero-copy send of large HTX DATA blocks
This is similar to what was done for the H1 mux : when the mux's buffer
is empty and the htx area contains exactly one data block of the same
size as the requested count, and all window and frame size conditions are
satisfied, then it's possible to simply swap the caller's buffer with the
mux's output buffer and adjust offsets and length to match the entire
DATA HTX block in the middle. An H2 frame header has to be prepended
before the block but this always fits in an HTX frame header.

In this case we perform a true zero-copy operation from end-to-end. This
is the situation that happens all the time with large files. When using
HTX over H2 over TLS, this brings a 3% extra performance gain. TLS remains
a limiting factor here but the copy definitely has a cost. Also since
haproxy can now use H2 in clear, the savings can be higher.
2018-12-12 11:52:45 +01:00
Willy Tarreau
06ae84a8ac MINOR: mux-h2: avoid copying large blocks into full buffers
Due to blocking factor being different on H1 and H2, we regularly end
up with tails of data blocks that leave room in the mux buffer, making
it tempting to copy the pending frame into the remaining room left, and
possibly realigning the output buffer.

Here we check if the output buffer contains data, and prefer to wait
if either the current frame doesn't fit or if it's larger than 1/4 of
the buffer. This way upon next call, either a zero copy, or a larger
and aligned copy will be performed, taking the whole chunk at once.

Doing so increases the H2 bandwidth by slightly more than 1% on large
objects.
2018-12-12 11:52:45 +01:00
Willy Tarreau
dc572364c6 BUG/MINOR: mux-h2: advertise a larger connection window size
By default H2 uses a 65535 bytes window for the connection, and changing
it requires sending a WINDOW_UPDATE message. We only used to update the
window when receiving data, thus never increasing it further.

As reported by user klzgrad on the mailing list, this seriously limits
the upload bitrate, and will have an even higher impact on the backend
H2 connections to origin servers.

There is no technical reason for keeping this window so low, so let's
increase it to the maximum possible value (2G-1). We do this by
pretending we've already received that many data minus the maximum
data the client might already send (65535), so that an early
WINDOW_UPDATE message is sent right after the SETTINGS frame.

This should be backported to 1.8. This patch depends on previous
patch "BUG/MINOR: mux-h2: refrain from muxing during the preface".
2018-12-12 09:23:41 +01:00
Willy Tarreau
75a930affb BUG/MINOR: mux-h2: refrain from muxing during the preface
The condition to refrain from processing the mux was insufficient as it
would only handle the outgoing connections. In essence it is not that much
of a problem since we don't have streams yet on an incoming connetion. But
it prevents waiting for the end of the preface before sending an early
WINDOW_UPDATE message, thus causing the connections to fail in this case.

This must be backported to 1.8 with a few minor adaptations.
2018-12-12 09:23:41 +01:00
Willy Tarreau
afba57ae80 REORG: h1: merge types+proto into common/h1.h
These two files are self-contained and do not depend on other
layers, so let's remerge them together for easier manipulation.
2018-12-11 17:15:13 +01:00
Willy Tarreau
b96b77ed6e REORG: htx: merge types+proto into common/htx.h
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.

This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
2018-12-11 17:15:04 +01:00
Willy Tarreau
907998194b MEDIUM: mux-h2: make use of hpack_encode_path() to encode the path
The HTTP path encoding was open-coded with a HPACK byte matching the
"/" or "/index.html" paths. Let's make use of the new functions to
avoid this.
2018-12-11 09:07:02 +01:00
Willy Tarreau
7561bcbb36 MEDIUM: mux-h2: make use of hpack_encode_scheme() to encode the scheme
The HTTP scheme encoding was open-coded with a HPACK byte matching the
"https" scheme. Let's make use of the new functions to avoid this.
2018-12-11 09:07:02 +01:00