RFC7540#5.1 is pretty clear : "any frame other than WINDOW_UPDATE,
PRIORITY, or RST_STREAM in this state MUST be treated as a connection
error of type STREAM_CLOSED". Instead of dealing with this for each
and every frame type, let's do it once for all in the main demux loop.
RFC7540#5.1 is pretty clear : "any frame other than HEADERS or PRIORITY
in this state MUST be treated as a connection error". Instead of dealing
with this for each and every frame type, let's do it once for all in the
main demux loop.
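
As a rough illustration of such a centralized check, here is a minimal
sketch in C. The frame type values come from RFC 7540 section 6. The
names (demo_ss, demo_frame_allowed) are hypothetical. The text above only
says "this state"; mapping the first rule to half-closed (remote) and the
second to idle follows RFC 7540 section 5.1 and is an assumption about
which states are targeted. It only concerns frames addressed to a stream,
not to stream 0.

```c
#include <stdbool.h>

/* Frame types from RFC 7540 section 6 */
enum h2_ft {
    FT_DATA = 0x0, FT_HEADERS = 0x1, FT_PRIORITY = 0x2,
    FT_RST_STREAM = 0x3, FT_SETTINGS = 0x4, FT_PUSH_PROMISE = 0x5,
    FT_PING = 0x6, FT_GOAWAY = 0x7, FT_WINDOW_UPDATE = 0x8,
    FT_CONTINUATION = 0x9
};

/* Hypothetical subset of stream states, for illustration only */
enum demo_ss { DEMO_SS_IDLE, DEMO_SS_OPEN, DEMO_SS_HREM };

/* Returns true if a frame of type <ft> may be accepted on a stream in
 * state <st>, false if it must be turned into an error by the caller.
 */
static bool demo_frame_allowed(enum demo_ss st, enum h2_ft ft)
{
    switch (st) {
    case DEMO_SS_IDLE:
        return ft == FT_HEADERS || ft == FT_PRIORITY;
    case DEMO_SS_HREM:
        return ft == FT_WINDOW_UPDATE || ft == FT_PRIORITY ||
               ft == FT_RST_STREAM;
    default:
        return true; /* left to the per-frame handlers */
    }
}
```

Calling this once before dispatching to the frame handlers avoids
repeating the same test in each of them.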
The ID is respected, and only IDs greater than the advertised last_id
are woken up, with a CS_FL_ERROR flag to signal that the stream is
aborted. This is necessary for a browser to abort a download or to
reject a bad response that affects the connection's state.
Let's replace h2_wake_all_streams() with h2_wake_some_streams(), to
support signaling only streams by their ID (for GOAWAY frames) and
to pass the flags to add on the conn_stream.
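
The selective wake-up can be sketched as follows. This is not the real
h2_wake_some_streams(): the array-based storage, the demo_stream type and
the DEMO_FL_ERROR value are assumptions made for the sake of a small
self-contained example.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FL_ERROR 0x1u  /* stands in for CS_FL_ERROR */

struct demo_stream {
    int32_t id;          /* stream ID */
    unsigned int flags;  /* flags seen by the upper layer */
    int woken;           /* set when the stream was notified */
};

/* Wakes every stream whose ID is strictly greater than <last>, adding
 * <flags> to it. Passing last=0 wakes all streams. Returns the number
 * of streams woken up.
 */
static size_t demo_wake_some_streams(struct demo_stream *s, size_t n,
                                     int32_t last, unsigned int flags)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (s[i].id <= last)
            continue;
        s[i].flags |= flags;
        s[i].woken = 1;
        count++;
    }
    return count;
}
```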
When a stream sends a shutw, we send an empty DATA frame with the ES
flag set, except if no HEADERS were sent, in which case we send an
RST_STREAM instead. On shutr(1), used to abort a request, an RST_STREAM
frame is sent if the stream is OPEN, and the stream is then closed. Care
is taken to switch the stream's state accordingly and to avoid sending
an ES bit again or another RST once already done.
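
The shutw decision described above may be summarized by the sketch
below. The demo_h2s fields and the returned action codes are
hypothetical; the real code works on stream states and flags instead.

```c
enum demo_shut_action {
    DEMO_NOTHING,             /* ES or RST already sent */
    DEMO_SEND_EMPTY_DATA_ES,  /* empty DATA frame with END_STREAM */
    DEMO_SEND_RST             /* RST_STREAM */
};

struct demo_h2s {
    int headers_sent;  /* a HEADERS frame was already emitted */
    int es_sent;       /* END_STREAM was already sent */
    int rst_sent;      /* RST_STREAM was already sent */
};

/* Decides what to emit on shutw and records it so that nothing is
 * sent twice.
 */
static enum demo_shut_action demo_shutw(struct demo_h2s *s)
{
    if (s->es_sent || s->rst_sent)
        return DEMO_NOTHING;
    if (!s->headers_sent) {
        s->rst_sent = 1;
        return DEMO_SEND_RST;
    }
    s->es_sent = 1;
    return DEMO_SEND_EMPTY_DATA_ES;
}
```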
Data frames are received and transmitted. The per-connection and
per-stream amount of data to ACK is automatically updated. Each
DATA frame is ACKed because usually the downstream link is large
and the upstream one is small, so it seems better to waste a few
bytes every few kilobytes to maintain a low ACK latency and help
the sender keep the link busy. The connection's ACK however is
sent at the end of the demux loop and at the beginning of the mux
loop so that a single aggregated one is emitted (connection
windows tend to be much larger than stream windows).
A future improvement would consist in sending a single ACK for
multiple subsequent DATA frames of the same stream (possibly
interleaved with window update frames), but this is much trickier
as it also requires remembering the ID of the stream for which
DATA frames have to be sent.
Ideally in the near future we should chunk-encode the body sent
to HTTP/1 when there's no content length and when the request is
not a CONNECT. It's just uncertain whether it's the best option
or not for now.
When it is detected that the number of received bytes is > 0 on the
connection at the end of the demux call or before starting to process
pending output data, an attempt is made to send a WINDOW_UPDATE on
the connection. In case of failure, it's attempted again later.
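
A minimal sketch of this aggregated connection-level ACK, assuming a
simple byte counter on the connection. The demo_conn type and its
fields are hypothetical, and the actual frame emission is replaced by
counters so that the example stays self-contained.

```c
#include <stdint.h>

struct demo_conn {
    uint32_t rcvd_c;    /* bytes received but not yet ACKed */
    unsigned int acks;  /* number of WINDOW_UPDATE "emitted" */
    uint32_t last_ack;  /* increment carried by the last one */
};

/* Called for each DATA frame: only accounts for the bytes */
static void demo_on_data(struct demo_conn *c, uint32_t len)
{
    c->rcvd_c += len;
}

/* Called at the end of the demux loop and before the mux loop. Returns
 * 1 if an update was emitted. On failure (<can_send> is 0) the counter
 * is kept so that the next call retries with the full amount.
 */
static int demo_flush_conn_ack(struct demo_conn *c, int can_send)
{
    if (!c->rcvd_c || !can_send)
        return 0;
    c->last_ack = c->rcvd_c;
    c->acks++;
    c->rcvd_c = 0;
    return 1;
}
```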
For now we don't build a HEADERS frame with them, but at least we remove
them from the response so that the L7 chunk parser inside isn't blocked
on these (often two) remaining bytes that don't want to leave the buffer.
It also ensures that trailers delivered progressively will correctly be
skipped.
The H1 response data are processed (either following content-length or
chunks) and emitted as H2 DATA frames. In the case of content-length,
the maximum size permitted by the mux buffer, the max frame size, the
connection's window and the stream's window are used to determine the
frame size. For chunked encoding, the same limitation applies, but in
addition, each chunk leads to a distinct frame. This could be improved
in the future to aggregate chunks into larger frames.
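
The frame size selection amounts to taking the smallest of these limits.
The sketch below assumes that the 9-byte frame header must also fit in
the mux buffer and that a non-positive window blocks the emission; the
function name and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FRAME_HDR_LEN 9  /* size of an H2 frame header */

/* Returns the payload size of the next DATA frame, or 0 if nothing
 * can be sent for now. <body_left> is what remains of the body (or of
 * the current chunk), <buf_room> the free space in the mux buffer.
 */
static size_t demo_data_frame_size(size_t body_left, size_t buf_room,
                                   size_t max_frame_size,
                                   int32_t conn_win, int32_t strm_win)
{
    size_t sz = body_left;

    if (conn_win <= 0 || strm_win <= 0 || buf_room <= DEMO_FRAME_HDR_LEN)
        return 0;
    if (sz > buf_room - DEMO_FRAME_HDR_LEN)
        sz = buf_room - DEMO_FRAME_HDR_LEN;
    if (sz > max_frame_size)
        sz = max_frame_size;
    if (sz > (size_t)conn_win)
        sz = (size_t)conn_win;
    if (sz > (size_t)strm_win)
        sz = (size_t)strm_win;
    return sz;
}
```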
Streams blocked on the connection's flow control subscribe to the
connection's fctl_list to be woken up when the window opens again.
Streams blocked on their own flow control don't subscribe to anything,
they just sit waiting for window update frames to reopen the window.
The connection-close mode (without content-length) partially works thanks
to the fact that the SHUTW event leads to a close of the stream. In
practice an empty DATA frame should be sent in this case though.
This calls the h1 response parser and feeds the output through the hpack
encoder to produce stateless HPACK bytecode into an output chunk. For now
it's a bit naive but reasonably efficient.
The HPACK encoder relies on hpack_encode_header() so that the most common
response header fields are encoded based on the static header table. The
forbidden header field names (connection, proxy-connection, upgrade,
transfer-encoding, keep-alive) are dropped before calling the hpack
encoder.
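
Dropping the forbidden header fields can be done with a simple
case-insensitive lookup before calling the encoder. The helper names
below are hypothetical and the list is the one given above.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive string comparison, returns 1 if equal */
static int demo_ieq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* both must have reached the end */
}

/* Returns 1 if header field <name> must not be sent over H2 */
static int demo_hdr_forbidden(const char *name)
{
    static const char *const list[] = {
        "connection", "proxy-connection", "upgrade",
        "transfer-encoding", "keep-alive"
    };
    size_t i;

    for (i = 0; i < sizeof(list) / sizeof(list[0]); i++)
        if (demo_ieq(name, list[i]))
            return 1;
    return 0;
}
```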
A new flag (H2_CF_HEADERS_SENT) is set once such a frame is emitted. It
will be used to know if we can send an empty DATA+ES frame to use as a
shutdown() signal or if we have to use RST_STREAM.
The trash is already used by the hpack layer and for Huffman decoding,
it's unsafe to use here as a buffer and results in corrupted data. Use
a safely allocated trash instead.
This takes care of creating a new h2s and a new conn_stream when a
HEADERS frame arrives. The recv() callback from the data layer is then
called to extract the frame into the stream's buffer. It is verified
that the stream ID is strictly greater than the known max stream ID.
And the last_id is updated if the current request is properly converted.
The streams are created in open or half-closed(remote) states.
For now there are some limitations :
- frames without END_HEADERS are rejected (CONTINUATION not supported
yet, will require some more changes so that the stream processor
checks the H2 frame header by itself and steals the frames from the
connection)
- padding/stream_dep/priority are currently ignored
- limited error handling, could be improved
But at least the request is properly decoded, transcoded and processed.
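
The stream ID verification can be sketched as below. The demo_conn_ids
type is hypothetical, and the sketch records the ID immediately, whereas
the text above only updates it once the request is properly converted.
The odd ID requirement for client-initiated streams comes from RFC 7540
section 5.1.1.

```c
#include <stdint.h>

struct demo_conn_ids {
    int32_t max_id;  /* highest stream ID seen so far, -1 if none */
};

/* Returns 1 and records <id> if it may open a new stream, else 0 */
static int demo_accept_new_stream(struct demo_conn_ids *c, int32_t id)
{
    if (id <= 0 || !(id & 1))
        return 0;  /* client-initiated streams use odd IDs */
    if (id <= c->max_id)
        return 0;  /* must be strictly greater than the known max */
    c->max_id = id;
    return 1;
}
```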
If a stream is killed for whatever reason and it happens to be the one
currently blocking the connection, we must unblock the connection and
enable polling again so that it can attempt to make progress. This may
happen for example on upload timeout, where the demux is blocked due to
a full stream buffer, and the stream dies on server timeout and quits.
This does the very minimum required to release a stream and/or a connection
upon the stream's request. The only thing is that it doesn't kill the
connection unless it's already closed or in error or the stream ID reached
the one specified in the GOAWAY frame. We're supposed to arm a timer to
close after some idle timeout but it's not done.
For now we have nowhere to store partial header frames so we can't
handle CONTINUATION frames and we must reject them. In this case we
respond with a stream error of type INTERNAL_ERROR.
This one sends an RST_STREAM for a given stream, using the current
demux stream ID. It's also used to send RST_STREAM for streams which
have lost their CS part (ie were aborted).
Now they really increase the window size of connections and streams.
If a stream was not queued but requested to send, it means it was
flow-controlled so it's added again into the connection's send list.
The INITIAL_WINDOW_SIZE and MAX_FRAME_SIZE settings are now extracted
from the settings frame, assigned to the connection, and attempted to
be propagated to all existing streams as per the specification. In
practice clients rarely update the settings after sending the first
stream, so the propagation will rarely be used. The ACK is properly
sent after the frame is completely parsed.
The function h2_process_demux() now tries to parse the incoming bytes
to process as many streams as possible. For now it does nothing but
drop all incoming frames.
Instead of doing a special processing of the first SETTINGS frame, we
simply parse its header, check that it matches the expected frame type
and flags (ie no ACK), and switch to FRAME_P to parse it as any regular
frame. The regular frame parser will take care of decoding it.
An initial settings frame is emitted upon receipt of the connection
preface, which takes care of configured values. These settings are
only emitted when they differ from the protocol's default value :
- header_table_size (defaults to 4096)
- initial_window_size (defaults to 65535)
- max_concurrent_streams (defaults to unlimited)
- max_frame_size (defaults to 16384)
The max frame size is a copy of tune.bufsize. Clients will most often
reject values lower than 16384 and currently there's no trivial way to
check if H2 is going to be used at boot time.
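
Building a SETTINGS payload that only carries non-default values may
look like the sketch below. The setting identifiers and the 6-byte
encoding (16-bit identifier, 32-bit value) come from RFC 7540 section
6.5. The demo_cfg structure is hypothetical, and using 0 to represent
"unlimited" is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SET_HEADER_TABLE_SIZE      = 0x1,
    SET_MAX_CONCURRENT_STREAMS = 0x3,
    SET_INITIAL_WINDOW_SIZE    = 0x4,
    SET_MAX_FRAME_SIZE         = 0x5
};

struct demo_cfg {
    uint32_t header_table_size;
    uint32_t initial_window_size;
    uint32_t max_concurrent_streams;  /* 0 = unlimited (assumption) */
    uint32_t max_frame_size;
};

/* Appends one setting in network order, returns its size (6) */
static size_t demo_put_setting(uint8_t *out, uint16_t id, uint32_t val)
{
    out[0] = (uint8_t)(id >> 8);
    out[1] = (uint8_t)id;
    out[2] = (uint8_t)(val >> 24);
    out[3] = (uint8_t)(val >> 16);
    out[4] = (uint8_t)(val >> 8);
    out[5] = (uint8_t)val;
    return 6;
}

/* Writes the payload into <out> (at least 24 bytes), returns its length */
static size_t demo_build_settings(const struct demo_cfg *c, uint8_t *out)
{
    size_t len = 0;

    if (c->header_table_size != 4096)
        len += demo_put_setting(out + len, SET_HEADER_TABLE_SIZE,
                                c->header_table_size);
    if (c->initial_window_size != 65535)
        len += demo_put_setting(out + len, SET_INITIAL_WINDOW_SIZE,
                                c->initial_window_size);
    if (c->max_concurrent_streams != 0)
        len += demo_put_setting(out + len, SET_MAX_CONCURRENT_STREAMS,
                                c->max_concurrent_streams);
    if (c->max_frame_size != 16384)
        len += demo_put_setting(out + len, SET_MAX_FRAME_SIZE,
                                c->max_frame_size);
    return len;
}
```

With all values at their defaults the payload is empty, which is still a
valid SETTINGS frame.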
The send() callback calls h2_process_mux() which iterates over the list
of flow controlled streams first, then streams waiting for room in the
send_list. If a stream from the send_list ends up being flow controlled,
it is then moved to the fctl_list. This way we can maintain the most
accurate fairness by ensuring that flows are always processed in order
of arrival except when they're blocked by flow control, in which case
only the other ones may pass in front of them.
It's a bit tricky as we want to remove a stream from the active lists
if it doesn't block (ie it has no reason for staying there).
If the polling update function is called with RD_ENA while H2_CF_DEM_SFULL
indicates the demux had to block on a stream buffer full condition, we can
remove the flag and re-enable polling for receiving because this is the
indication that a consumer stream has made some room in the buffer. We
should probably improve this to ensure that h2s->id == h2c->dsi and avoid
trying to receive multiple times in a row for the wrong stream.
A conn_stream indicates its intent to send by setting the WR_ENA flag
and calling mux->update_poll(). There's no synchronous write so the only
way to emit a response from a stream is to proceed this way. The sender
h2s is then queued into the h2c's send_list if it was not yet queued.
Once the connection is ready, it will enter its send() callback to visit
writers, calling their data->send_cb() callback to complete the operation
using mux->snd_buf().
Also we enable polling if the mux contains data and wasn't enabled. This
may happen just after a response has been transmitted using chk_snd().
It likely is incomplete for now and should probably be refined.
The H2 preface is properly detected to switch to the settings state.
It's important to note that for now we don't send out a settings frame
so the operation is not complete yet.
For now it's only used to report immediate errors by announcing the
highest known stream-id on the mux's error path. The function may be
used both while processing a stream or directly in relation with the
connection. The wake() callback will automatically ask for send access
if an error is reported. The function should be usable for graceful
shutdowns as well by simply setting h2c->last_sid to the highest
acceptable stream-id (2^31-1) prior to calling the function.
A connection flag (H2_CF_GOAWAY_SENT) is set once the frame was
successfully sent. It will be usable to detect when it's safe to
close the connection.
Another flag (H2_CF_GOAWAY_FAILED) is set in case of unrecoverable
error while trying to send. It will also be used to know when it's safe
to close the connection.
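
For reference, a GOAWAY frame is a 9-byte frame header on stream 0
followed by the last stream ID and an error code (RFC 7540 section 6.8).
The sketch below only builds the bytes; the function names are
hypothetical and nothing here reflects how the mux queues the frame.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_GOAWAY_LEN 17  /* 9-byte header + 8-byte payload */

/* Writes a 32-bit value in network order */
static void demo_put_n32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Builds a GOAWAY frame without debug data into <out>, which must hold
 * at least DEMO_GOAWAY_LEN bytes. Returns the frame's total size.
 */
static size_t demo_build_goaway(uint8_t *out, uint32_t last_sid,
                                uint32_t err)
{
    out[0] = 0; out[1] = 0; out[2] = 8;  /* payload length */
    out[3] = 0x7;                        /* type: GOAWAY */
    out[4] = 0;                          /* no flags */
    demo_put_n32(out + 5, 0);            /* stream 0 */
    demo_put_n32(out + 9, last_sid & 0x7fffffffu);
    demo_put_n32(out + 13, err);
    return DEMO_GOAWAY_LEN;
}
```

A graceful shutdown would pass 0x7fffffff (2^31-1) as the last stream ID
with error code 0 (NO_ERROR).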
The rcv_buf() callback now calls h2_process_demux() after a recv() call
leaving some data in the buffer, and the snd_buf() callback calls
h2_process_mux() to try to process pending data from streams.
If some streams were blocked on flow control and the connection's
window was recently opened, or if some streams are waiting while
no block flag remains, we immediately want to try to send again.
This can happen if a recv() for a stream wants to send after the
send() loop has already been processed.
During h2_wake(), there are various situations that can lead to the
connection being closed :
- low-level connection error
- read0 received
- fatal error (ERROR2)
- failed to emit a GOAWAY
- empty stream list with max_id >= last_sid
In such cases, all streams are notified and we have to wait for all
streams to leave while doing nothing, or if the last stream is gone,
we can simply terminate the connection.
It's important to do this test there again because an error might arise
while trying to send a pending GOAWAY after the last stream for example,
thus there's possibly no way to get notified of a closing stream.
It happens that an H2 mux is totally unusable once the client has shut,
so we must consider this situation equivalent to the connection error,
and let the possible streams drain their data if needed then stop.
Now we start to set the flags to indicate that the response buffer is
being awaited or that it is full. This makes it possible to centralize
the polling management a little bit into the wake() callback.
In case of error, we wake all the streams up so that they are aware of
the nature of the event and are able to detach if needed.
Flag H2_CF_DEM_DALLOC is set when the demux buffer fails to be allocated
in the recv() callback, and is cleared when it succeeds.
Both flags H2_CF_MUX_MALLOC and H2_CF_DEM_MROOM are cleared when the mux
buffer allocation succeeds.
In both cases it will be up to the callers to report allocation failures.
This one will be used by the HEADERS frame handler and maybe later by
the PUSH frame handler. It creates a conn_stream in the mux's connection.
The created streams are inserted in the h2c's tree sorted by IDs. The
caller is expected to have verified that the stream doesn't exist yet.
It will be more convenient to always manipulate existing streams than
null pointers. Here we create one idle stream and one closed stream.
The idea is that we can easily point any stream to one of these states
in order to merge maintenance operations.
Functions h2_get_buf_n{16,32,64}() and h2_get_buf_bytes() respectively
extract a network-ordered 16/32/64 bit value from a possibly wrapping
buffer, or any arbitrary size. They're convenient to retrieve a PING
payload or to parse SETTINGS frames. Since they copy one byte at a time,
they will be less efficient than a memcpy-based implementation on large
blocks.
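
Reading a network-ordered value one byte at a time from a wrapping
buffer can be sketched as follows. The demo_ring type and the function
signature are assumptions; they are not those of the real
h2_get_buf_n32().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring buffer: <len> bytes of data start at <head> and
 * may wrap at <size>.
 */
struct demo_ring {
    const uint8_t *area;
    size_t size;
    size_t head;
    size_t len;
};

/* Reads a big-endian 32-bit value located <offset> bytes after the
 * head. Returns 1 on success, 0 if the buffer holds too little data.
 */
static int demo_get_n32(const struct demo_ring *b, size_t offset,
                        uint32_t *out)
{
    uint32_t v = 0;
    size_t i;

    if (offset > b->len || b->len - offset < 4)
        return 0;
    for (i = 0; i < 4; i++)
        v = (v << 8) | b->area[(b->head + offset + i) % b->size];
    *out = v;
    return 1;
}
```

The modulo on each byte is what makes the wrapping transparent, and also
why this approach is slower than a block copy on large areas.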
This function extracts the next frame header but doesn't consume it.
This allows detecting a stream-id change and performing a yielding
window update without losing information. The result is stored into a
temporary frame descriptor. We could also store the next frame header
into the connection but parsing the header again is much cheaper than
wasting bytes in the connection for a rare use case.
A function (h2_skip_frame_hdr()) is also provided to skip the parsed
header (always 9 bytes) and another one (h2_get_frame_hdr()) to do both
at once.
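
The peek/skip/get trio can be illustrated on a linear buffer. The frame
header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit
and a 31-bit stream ID) is the one from RFC 7540 section 4.1; the
demo_* types and function names are hypothetical and ignore buffer
wrapping for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_fh {
    uint32_t len;  /* payload length */
    uint8_t ft;    /* frame type */
    uint8_t ff;    /* frame flags */
    uint32_t sid;  /* stream ID */
};

struct demo_buf {
    const uint8_t *data;
    size_t len;
};

/* Decodes the next frame header without consuming it. Returns 0 if
 * fewer than 9 bytes are available.
 */
static int demo_peek_frame_hdr(const struct demo_buf *b, struct demo_fh *h)
{
    const uint8_t *p = b->data;

    if (b->len < 9)
        return 0;
    h->len = (uint32_t)p[0] << 16 | (uint32_t)p[1] << 8 | p[2];
    h->ft = p[3];
    h->ff = p[4];
    h->sid = ((uint32_t)p[5] << 24 | (uint32_t)p[6] << 16 |
              (uint32_t)p[7] << 8 | p[8]) & 0x7fffffffu;
    return 1;
}

/* Skips a header that was previously peeked */
static void demo_skip_frame_hdr(struct demo_buf *b)
{
    b->data += 9;
    b->len -= 9;
}

/* Does both at once */
static int demo_get_frame_hdr(struct demo_buf *b, struct demo_fh *h)
{
    if (!demo_peek_frame_hdr(b, h))
        return 0;
    demo_skip_frame_hdr(b);
    return 1;
}
```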
This function is called after preparing a frame, in order to update the
frame's size in the frame header. It takes the frame payload length in
argument.
It simply writes a 24-bit frame size into a buffer, making use of the
net_helper functions which try to optimize per platform (this is a
frequently used operation).
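
A portable equivalent, written without the net_helper functions, could
look like this. The function name is hypothetical.

```c
#include <stdint.h>

/* Writes the 24-bit payload length <len> in network order at the
 * beginning of the frame header pointed to by <frame>.
 */
static void demo_set_frame_size(uint8_t *frame, uint32_t len)
{
    frame[0] = (uint8_t)(len >> 16);
    frame[1] = (uint8_t)(len >> 8);
    frame[2] = (uint8_t)len;
}
```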
This one will store the error into the stream's errcode if it's neither
idle nor closed (since these ones are read-only) and switch its state to
H2_SS_ERROR. If a conn_stream is attached, it will be flagged with
CS_FL_ERROR.
A mux is busy when any stream id >= 0 is currently being handled
and the current stream's id doesn't match. When no stream is
involved (ie: demuxer), stream 0 is considered. This will be
necessary to know when it's possible to send frames.
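
The busy test can be expressed as a small predicate. The demo_mux type
and its msi field (the ID of the stream currently using the mux,
negative when there is none) are assumptions of this sketch.

```c
#include <stdint.h>

struct demo_mux {
    int32_t msi;  /* stream ID being handled by the mux, < 0 if none */
};

/* Returns 1 if the mux is busy for stream <sid>. The demuxer, which is
 * not tied to a stream, passes 0.
 */
static int demo_mux_busy(const struct demo_mux *m, int32_t sid)
{
    if (m->msi < 0)
        return 0;
    return m->msi != sid;
}
```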
A demux may be prevented from receiving for the following reasons :
- no receive buffer could be allocated
- the receive buffer is full
- a response is needed and the mux is currently being used by a stream
- a response is needed and some room could not be found in the mux
buffer (either full or waiting for allocation)
- the stream buffer is waiting for allocation
- the stream buffer is full
A mux may stop accepting data for the following reasons :
- the buffer could not be allocated
- the buffer is full
A stream may stop sending data to a mux for the following reasons :
  - the mux is busy processing another stream
  - the mux buffer lacks room (full or not allocated)
  - the mux's flow control prevents sending
  - the stream's flow control prevents sending
All these conditions were turned into flags for use by the respective
places.
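
As an illustration, the demux blocking conditions can be grouped under a
single mask so that one test tells whether receiving is allowed. The
names and bit values below are hypothetical and only mirror the list
above.

```c
#define DEMO_CF_DEM_DALLOC 0x01u  /* no receive buffer could be allocated */
#define DEMO_CF_DEM_DFULL  0x02u  /* receive buffer is full */
#define DEMO_CF_DEM_MBUSY  0x04u  /* response needed, mux used by a stream */
#define DEMO_CF_DEM_MROOM  0x08u  /* response needed, no room in mux buffer */
#define DEMO_CF_DEM_SALLOC 0x10u  /* stream buffer waiting for allocation */
#define DEMO_CF_DEM_SFULL  0x20u  /* stream buffer is full */

#define DEMO_CF_DEM_BLOCK_ANY 0x3fu  /* any of the above */

/* Returns 1 if the demux may receive, 0 if any blocking flag is set */
static int demo_demux_may_recv(unsigned int flags)
{
    return !(flags & DEMO_CF_DEM_BLOCK_ANY);
}
```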
The idea is that we may need a mux buffer for anything, ranging from
receiving to sending traffic. For now it's unclear where exactly the
calls will be placed so let's block both send and recv when a buffer
is missing, and re-enable both of them at the end. This will have to
be changed later.
This patch implements a very basic Rx buffer management. The mux needs
an rx buffer to decode the connection's stream. If this buffer is
available upon Rx events, we fill it with whatever input data are
available. Otherwise we try to allocate it and subscribe to the buffer
wait queue in case of failure. In such a situation, a function
"h2_dbuf_available()" will be called once a buffer may be allocated.
The buffer is released if it's still empty after recv().
The connection's h2c context is now allocated and initialized on mux
initialization, and released on mux destruction. Note that for now the
release() code is never called.
We need to deal with stream error notifications (RST_STREAM) as well as
internal reporting. The problem is that we don't know in which order
this will be done so we can't unilaterally decide to deallocate the
stream. In order to help, we add two extra stream states, H2_SS_ERROR
and H2_SS_RESET. The former mentions that the stream has an error pending
and the latter indicates that the error was already sent and that the
stream is now closed. It's equivalent to H2_SS_CLOSED except that in this
state we'll avoid sending new RST_STREAM as per RFC7540#5.4.2.
With this it will be possible to only detach or deallocate the h2s once
the stream is closed.
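
The role of the two extra states can be sketched as below. The enum
values, the has_cs field and the helper names are hypothetical; in
particular the release condition is only one possible reading of the
paragraph above.

```c
enum demo_ss {
    DEMO_SS_OPEN,
    DEMO_SS_ERROR,   /* an error is pending, RST_STREAM not sent yet */
    DEMO_SS_RESET,   /* RST_STREAM was sent, stream is now closed */
    DEMO_SS_CLOSED
};

struct demo_h2s {
    enum demo_ss st;
    int has_cs;      /* a conn_stream is still attached */
};

/* Returns 1 if an RST_STREAM still has to be emitted */
static int demo_must_send_rst(const struct demo_h2s *s)
{
    return s->st == DEMO_SS_ERROR;
}

/* To be called once the RST_STREAM frame was successfully sent */
static void demo_rst_sent(struct demo_h2s *s)
{
    s->st = DEMO_SS_RESET;
}

/* Returns 1 if the stream may be deallocated */
static int demo_can_release(const struct demo_h2s *s)
{
    return !s->has_cs &&
           (s->st == DEMO_SS_RESET || s->st == DEMO_SS_CLOSED);
}
```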
This describes an HTTP/2 stream with its relation to the connection
and to the conn_stream on the other side.
For now we also allocate request and response state for HTTP/1 because
the internal HTTP representation is HTTP/1 at the moment. Later this
should evolve towards a version-agnostic representation and this H1
message state will disappear.
It's important to consider that the streams are necessarily polarized
depending on h2c : if the connection is incoming, streams initiated by
the connection receive requests and send responses. Otherwise it's the
other way around. Such information is known during the connection
instantiation by h2c_frt_init() and will normally be reflected in the
stream ID (odd=demux from client, even=demux from server). The initial
H2_CS_PREFACE state will also depend on the direction. The current h2c
state machine doesn't allow for outgoing connections as it uses a single
state for both (rx state only). It should be the demux state only.
The h2c struct describes an H2 connection context and is assigned as the
mux's context. It has its own pool, allocated at boot time and released
after deinit().