35 Commits

Author SHA1 Message Date
Willy Tarreau
fd2658c0c6 BUG/MINOR: h2: reject again empty :path pseudo-headers
Since commit 92919f7fd5 ("MEDIUM: h2: make the request parser rebuild
a complete URI") we make sure to rebuild a complete URI. Unfortunately
the test for an empty :path pseudo-header that is mandated by #8.1.2.3
appened to be performed on the URI before this patch, which is never
empty anymore after being rebuilt, causing h2spec to complain :

  8. HTTP Message Exchanges
    8.1. HTTP Request/Response Exchange
      8.1.2. HTTP Header Fields
        8.1.2.3. Request Pseudo-Header Fields
          - 1: Sends a HEADERS frame with empty ":path" pseudo-header field
            -> The endpoint MUST respond with a stream error of type PROTOCOL_ERROR.
               Expected: GOAWAY Frame (Error Code: PROTOCOL_ERROR)
                         RST_STREAM Frame (Error Code: PROTOCOL_ERROR)
                         Connection closed
                 Actual: DATA Frame (length:0, flags:0x01, stream_id:1)

It's worth noting that this error doesn't trigger when calling h2spec
with a timeout as some scripts do, which explains why it wasn't detected
after the patch above.

This fixes one half of issue #471 and should be backported to 2.1.
2020-02-26 13:56:24 +01:00
Willy Tarreau
146f53ae7e BUG/MAJOR: h2: make header field name filtering stronger
Tim Düsterhus found that the amount of sanitization we perform on HTTP
header field names received in H2 is insufficient. Currently we reject
upper case letters as mandated by RFC7540#8.1.2, but section 10.3 also
requires that intermediaries translating streams to HTTP/1 further
refine the filtering to also reject invalid names (which means any name
that doesn't match a token). There is a small trick here which is that
the colon character used to start pseudo-header names doesn't match a
token, so pseudo-header names fall into that category, thus we have to
swap the pseudo-header name lookup with this check so that we only check
from the second character (past the ':') in case of pseudo-header names.

Another possibility could have been to perform this check only in the
HTX-to-H1 trancoder but doing would still expose the configured rules
and logs to such header names.

This fix must be backported as far as 1.8 since this bug could be
exploited and serve as the base for an attack. In 2.0 and earlier,
functions h2_make_h1_request() and h2_make_h1_trailers() must also
be adapted to sanitize requests coming in legacy mode.
2019-11-25 11:11:32 +01:00
Willy Tarreau
54f53ef7ce BUG/MAJOR: h2: reject header values containing invalid chars
Tim Düsterhus reported an annoying problem in the H2 decoder related to
an ambiguity in the H2 spec. The spec says in section 10.3 that HTTP/2
allows header field values that are not valid (since they're binary) and
at the same time that an H2 to H1 gateway must be careful to reject headers
whose values contain \0, \r or \n.

Till now, and for the sake of the ability to maintain end-to-end binary
transparency in H2-to-H2, the H2 mux wouldn't reject this since it does
not know what version will be used on the other side.

In theory we should in fact perform such a check when converting an HTX
header to H1. But this causes a problem as it means that all our rule sets,
sample fetches, captures, logs or redirects may still find an LF in a header
coming from H2. Also in 2.0 and older in legacy mode, the frames are instantly
converted to H1 and HTX couldn't help there. So this means that in practice
we must refrain from delivering such a header upwards, regardless of any
outgoing protocol consideration.

Applying such a lookup on all headers leaving the mux comes with a
significant performance hit, especially for large ones. A first attempt
was made at placing this into the HPACK decoder to refrain from learning
invalid literals but error reporting becomes more complicated. Additional
tests show that doing this within the HTX transcoding loop benefits from
the hot L1 cache, and that by skipping up to 8 bytes per iteration the
CPU cost remains within noise margin, around ~0.5%.

This patch must be backported as far as 1.8 since this bug could be
exploited and serve as the base for an attack. In 2.0 and earlier the
fix must also be added to functions h2_make_h1_request() and
h2_make_h1_trailers() to handle legacy mode. It relies on previous patch
"MINOR: ist: add ist_find_ctl()" to speed up the control bytes lookup.

All credits go to Tim for his detailed bug report and his initial patch.
2019-11-25 11:06:19 +01:00
Willy Tarreau
30ee1efe67 MEDIUM: h2: use the normalized URI encoding for absolute form requests
H2 strongly recommends that clients exclusively use the absolute form
for requests, which contains a scheme, an authority and a path, instead
of the old format involving the Host header and a path. Thus there is
no way to distinguish between a request intended for a proxy and an
origin request, and as such proxied requests are lost.

This patch makes sure to keep the encoding of all absolute form requests
so that the URI is kept end-to-end. If the scheme is http or https, there
is an uncertainty so the request is tagged as a normalized URI so that
the other end (H1) can decide to emit it in origin form as this is by far
the most commonly expected one, and it's certain that quite a number of
H1 setups are not ready to cope with absolute URIs.

There is a direct visible impact of this change, which is that the uri
sample fetch will now return absolute URIs (as they really come on the
wire) whenever these are used. It also means that default http logs will
report absolute URIs.

If a situation is once met where a client uses H2 to join an H1 proxy
with haproxy in the middle, then it will be trivial to add an option to
ask the H1 output to use absolute encoding for such requests.

Later we may be able to consider that the normalized URI is the default
output format and stop sending them in origin form unless an option is
set.

Now chaining multiple instances keeps the semantics as far as possible
along the whole chain :

 1) H1 to H1
  H1:"GET /"       --> H1:"GET /"       # log: /
  H1:"GET http://" --> H1:"GET http://" # log: http://
  H1:"GET ftp://"  --> H1:"GET ftp://"  # log: ftp://

 2) H2 to H1
  H2:"GET /"       --> H1:"GET /"       # log: /
  H2:"GET http://" --> H1:"GET /"       # log: http://
  H2:"GET ftp://"  --> H1:"GET ftp://"  # log: ftp://

 3) H1 to H2 to H2 to H1
  H1:"GET /"       --> H2:"GET /"       --> H2:"GET /"       --> H1:"GET /"
  H1:"GET http://" --> H2:"GET http://" --> H2:"GET http://" --> H1:"GET /"
  H1:"GET ftp://"  --> H2:"GET ftp://"  --> H2:"GET ftp://"  --> H1:"GET ftp://"

Thus there is zero loss on H1->H1, H1->H2 nor H2->H2, and H2->H1 is
normalized in origin format if ambiguous.
2019-10-09 11:10:19 +02:00
Willy Tarreau
1440fe8b4b MINOR: h2: report in the HTX flags when the request has an authority
The other side will need to know when to emit an authority or not. We
need to pass this information in the HTX flags.
2019-10-09 11:10:19 +02:00
Willy Tarreau
92919f7fd5 MEDIUM: h2: make the request parser rebuild a complete URI
Till now we've been producing path components of the URI and using the
:authority header only to be placed into the host part. But this practice
is not correct, as if we're used to convey H1 proxy requests over H2 then
over H1, the absolute URI is presented as a path on output, which is not
valid. In addition the scheme on output is not updated from the absolute
URI either.

Now the request parser will continue to deliver origin-form for request
received using the http/https schemes, but will use the absolute-form
when dealing with other schemes, by concatenating the scheme, the authority
and the path if it's not '*'.
2019-10-09 11:10:19 +02:00
Willy Tarreau
2be362c937 MINOR: h2: clarify the rules for how to convert an H2 request to HTX
The H2 request parsing is not trivial given that we have multiple
possible syntaxes. Mainly we can have :authority or not, and when
a CONNECT method is seen, :scheme and :path are missing. This mostly
updates the functions' comments and header index assignments to make
them less confusing. Functionally there is no change.
2019-10-09 11:05:31 +02:00
Christopher Faulet
5ed8353dcf CLEANUP: h2: Remove functions converting h2 requests to raw HTTP/1.1 ones
Because the h2 multiplexer only uses the HTX mode, following H2 functions were
removed :

  * h2_prepare_h1_reqline
  * h2_make_h1_request()
  * h2_make_h1_trailers()
2019-07-19 09:18:27 +02:00
Christopher Faulet
3e2638ee04 BUG/MEDIUM: htx: Fully update HTX message when the block value is changed
Everywhere the value length of a block is changed, calling the function
htx_set_blk_value_len(), the HTX message must be updated. But at many places,
because of the recent changes in the HTX structure, this update was only
partially done. tail_addr and head_addr values were not systematically updated.

In fact, the function htx_set_blk_value_len() was designed as an internal
function to the HTX API. And we used it from outside by convenience. But it is
really painfull and error prone to let the caller update the HTX message. So
now, we use the function htx_change_blk_value_len() wherever is possible. It
changes the value length of a block and updates the HTX message accordingly.

This patch must be backported to 2.0.
2019-06-18 10:02:05 +02:00
Christopher Faulet
0c6de00d7c BUG/MEDIUM: h2/htx: Update data length of the HTX when the cookie list is built
When an H2 request is converted into an HTX message, All cookie headers are
grouped into one, each value separated by a semicolon (;). To do so, we add the
header "cookie" with the first value and then we update the value by appending
other cookies. But during this operation, only the size of the HTX block is
updated. And not the data length of the whole HTX message.

It is an old bug and it seems to work by chance till now. But it may lead to
undefined behaviour by time to time.

This patch must be backported to 2.0 and 1.9
2019-06-17 11:44:51 +02:00
Christopher Faulet
a9a5c04c23 MINOR: h2: Set flags about the request's scheme on the start-line
The flag HTX_SL_F_HAS_SCHM is always set because H2 requests have always an
explicit scheme. Then, the pseudo-header ":scheme" is tested. If it is set to
"http", the flag HTX_SL_F_SCHM_HTTP is set. Otherwise, for all other cases, the
flag HTX_SL_F_SCHM_HTTPS is set. For now, it seems reasonable to have a fallback
on the scheme "https".
2019-06-14 11:13:32 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Christopher Faulet
33543e73a2 MINOR: h2/htx: Set hdrs_bytes on the SL when an HTX message is produced 2019-05-28 07:42:12 +02:00
Willy Tarreau
a1bd1faeeb BUILD: use inttypes.h instead of stdint.h
I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.

As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:

  https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html

Also it's even clearer here in the autoconf doc :

  https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html

  "The C99 standard says that inttypes.h includes stdint.h, so there's
   no need to include stdint.h separately in a standard environment.
   Some implementations have inttypes.h but not stdint.h (e.g., Solaris
   7), but we don't know of any implementation that has stdint.h but not
   inttypes.h"
2019-04-01 07:44:56 +02:00
Willy Tarreau
9255e7e971 BUG/MEDIUM: h2/htx: verify that :path doesn't contain invalid chars
While the legacy code converts h2 to h1 and provides some control over
what is passed, in htx mode there is no such control and it is possible
to pass control chars and linear white spaces in the path, which are
possibly reencoded differently once passed to the H1 side.

HTX supports parse error reporting using a special flag. Let's check
the correctness of the :path pseudo header and report any anomaly in
the HTX flag.

Thanks to Jérôme Magnin for reporting this bug with a working reproducer.

This fix must be backported to 1.9 along with the two previous patches
("MINOR: htx: unconditionally handle parsing errors in requests or
responses" and "MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR
between h2s and buf on RX").
2019-03-05 10:58:28 +01:00
Christopher Faulet
0b46548a68 BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled
1xx responses does not work in HTTP2 when the HTX is enabled. First of all, when
a response is parsed, only one HEADERS frame is expected. So when an interim
response is received, the flag H2_SF_HEADERS_RCVD is set and the next HEADERS
frame (for another interim repsonse or the final one) is parsed as a trailers
one. Then when the response is sent, because an EOM block is found at the end of
the interim HTX response, the ES flag is added on the frame, closing too early
the stream. Here, it is a design problem of the HTX. Iterim responses are
considered as full messages, leading to some ambiguities when HTX messages are
processed. This will not be fixed now, but we need to keep it in mind for future
improvements.

To fix the parsing bug, the flag H2_MSGF_RSP_1XX is added when the response
headers are decoded. When this flag is set, an EOM block is added into the HTX
message, despite the fact that there is no ES flag on the frame. And we don't
set the flag H2_SF_HEADERS_RCVD on the corresponding H2S. So the next HEADERS
frame will not be parsed as a trailers one.

To fix the sending bug, the ES flag is not set on the frame when an interim
response is processed and the flag H2_SF_HEADERS_SENT is not set on the
corresponding H2S.

This patch must be backported to 1.9.
2019-02-19 16:26:14 +01:00
Christopher Faulet
44af3cfca3 MINOR: h2/htx: Set the flag HTX_SL_F_BODYLESS for messages without body
This information is usefull to know if a body is expected or not, regardless the
presence or not of the header "Content-Length" and its value. Once the ES flag
is set on the header frame or when the content length is 0, we can safely add
the flag HTX_SL_F_BODYLESS on the HTX start-line.

Among other things, it will help the mux-h1 to know if it should add TE header
or not. It will also help the HTTP compression filter.

This patch must be backported to 1.9 because a bug fix depends on it.
2019-02-18 16:25:06 +01:00
Willy Tarreau
9c84d8299a MINOR: h2: add a generic frame checker
The new function h2_frame_check() checks the protocol limits for the
received frame (length, ID, direction) and returns a verdict made of
a connection error code. The purpose is to be able to validate any
frame regardless of the state and the ability to call the frame handler,
and to emit a GOAWAY early in this case.
2019-01-30 19:37:20 +01:00
Willy Tarreau
4790f7c907 MEDIUM: h2: always parse and deduplicate the content-length header
The header used to be parsed only in HTX but not in legacy. And even in
HTX mode, the value was dropped. Let's always parse it and report the
parsed value back so that we'll be able to store it in the streams.
2019-01-24 19:07:26 +01:00
Willy Tarreau
1e1f27c5c1 MINOR: h2: add h2_make_htx_trailers to turn H2 headers to HTX trailers
This function is usable to transform a list of H2 header fields to a
HTX trailers block. It takes care of rejecting forbidden headers and
pseudo-headers when performing the conversion. It also emits the
trailing CRLF that is currently needed in the HTX trailers block.
2019-01-03 18:45:38 +01:00
Willy Tarreau
9d953e7572 MINOR: h2: add h2_make_h1_trailers to turn H2 headers to H1 trailers
This function is usable to transform a list of H2 header fields to a
H1 trailers block. It takes care of rejecting forbidden headers and
pseudo-headers when performing the conversion.
2019-01-03 18:45:38 +01:00
Willy Tarreau
beefaee4f5 MEDIUM: h2: properly check and deduplicate the content-length header in HTX
When producing an HTX message, we can't rely on the next-level H1 parser
to check and deduplicate the content-length header, so we have to do it
while parsing a message. The algorithm is the exact same as used for H1
messages.
2018-12-19 13:08:08 +01:00
Willy Tarreau
164e061066 BUG/MEDIUM: h2: fix aggregated cookie length computation in HTX mode
Cyril Bonté reported a bug in the way the cookie length is computed
when aggregating multiple cookies : the first cookie name was counted
as part of the value length, causing random contents to be placed there,
possibly leading to bad requests.

No backport is needed.
2018-12-18 11:03:11 +01:00
Willy Tarreau
1329b5be71 MINOR: h2: add new functions to produce an HTX message from an H2 response
The new function h2_prepare_htx_stsline() produces an HTX response message
from an H2 response presented as a list of header fields.
2018-12-02 13:30:17 +01:00
Willy Tarreau
6deb4129de MINOR: h2: implement H2->HTX request header frame transcoding
Till now we could only produce an HTTP/1 request from a list of H2
request headers. Now the new function h2_make_htx_request() does the
same but using the HTX encoding instead, while respecting the H2
semantics. The code is not much different from the first version,
only the encoding differs.

For now it's not used.
2018-12-01 17:38:32 +01:00
Willy Tarreau
eba10f24b7 BUG/MEDIUM: h2: implement missing support for chunked encoded uploads
Upload requests not carrying a content-length nor tunnelling data must
be sent chunked-encoded over HTTP/1. The code was planned but for some
reason forgotten during the implementation, leading to such payloads to
be sent as tunnelled data.

Browsers always emit a content length in uploads so this problem doesn't
happen for most sites. However some applications may send data frames
after a request without indicating it earlier.

The only way to detect that a client will need to send data is that the
HEADERS frame doesn't hold the ES bit. In this case it's wise to look
for the content-length header. If it's not there, either we're in tunnel
(CONNECT method) or chunked-encoding (other methods).

This patch implements this.

The following request is sent using content-length :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPOST -T /large/file

and these ones using chunked-encoding :

    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T /large/file
    curl --http2 -sk https://127.0.0.1:4443/s2 -XPUT -T - < /dev/urandom

Thanks to Robert Samuel Newson for raising this issue with details.
This fix must be backported to 1.8.
2018-04-26 10:20:44 +02:00
Willy Tarreau
174b06a572 MINOR: h2: detect presence of CONNECT and/or content-length
We'll need this in order to support uploading chunks. The h2 to h1
converter checks for the presence of the content-length header field
as well as the CONNECT method and returns these information to the
caller. The caller indicates whether or not a body is detected for
the message (presence of END_STREAM or not). No transfer-encoding
header is emitted yet.
2018-04-26 10:15:14 +02:00
Willy Tarreau
637f64d565 BUG/MEDIUM: h2: do not accept upper case letters in request header names
This is explicitly forbidden by 7540#8.1.2, and may be used to bypass
some of the other filters, so they must be blocked early. It removes
another issue reported by h2spec.

To backport to 1.8.
2017-12-03 21:09:38 +01:00
Willy Tarreau
fe7c356be6 BUG/MEDIUM: h2: remove connection-specific headers from request
h2spec rightfully outlines that we used not to reject these ones, and
they may cause trouble if presented, especially "upgrade".

Must be backported to 1.8.
2017-12-03 21:09:18 +01:00
Willy Tarreau
520886990f BUG/MINOR: h2: reject response pseudo-headers from requests
At the moment there's only ":status". Let's block it early when parsing
the request. Otherwise it would be blocked by the HTTP/1 code anyway.
This silences another h2spec issue.

To backport to 1.8.
2017-12-03 21:08:43 +01:00
Willy Tarreau
d8d2ac75e8 BUG/MINOR: h2: the TE header if present may only contain trailers
h2spec reports this issue which has no side effect for now, but is
better cleared.

To backport to 1.8.
2017-12-03 21:08:42 +01:00
Willy Tarreau
cd4fe17a26 BUG/MINOR: h2: ":path" must not be empty
As reported by h2spec, the h2->h1 gateway doesn't verify that ":path"
is not empty. This is harmless since the H1 parser will reject such a
request, but better fix it anyway.

To backport to 1.8.
2017-12-03 21:08:41 +01:00
Willy Tarreau
811ad12414 BUG/MAJOR: h2: correctly check the request length when building an H1 request
Due to a typo in the request maximum length calculation, we count the
request path twice instead of counting it added to the method's length.
This has two effects, the first one being that a path cannot be larger
than half a buffer, and the second being that the method's length isn't
properly checked. Due to the way the temporary buffers are used internally,
it is quite difficult to meet this condition. In practice, the only
situation where this can cause a problem is when exactly one of either
the method or the path are compressed and the other ones is sent as a
literal.

Thanks to Yves Lafon for providing useful traces exhibiting this issue.

To be backported to 1.8.
2017-12-03 21:08:40 +01:00
Willy Tarreau
2fb986ccb8 BUG/MEDIUM: h2: always reassemble the Cookie request header field
The special case of the Cookie header field was overlooked in the
implementation, considering that most servers do handle cookie lists,
but as reported here on discourse it's not the case at all :

  https://discourse.haproxy.org/t/h2-cookie-header-splitted-header/1742

This patch fixes this by skipping all occurences of the Cookie header
in the request while building the H1 request, and then building a single
Cookie header with all values appended at once, according to what is
requested in RFC7540#8.1.2.5.

In order to build the list of values, the list struct is used as a linked
list (as there can't be more cookies than headers). This makes the list
walking quite efficient and ensures all values are quickly found without
having to rescan the list.

A test case provided by Lukas shows that it properly works :

 > GET /? HTTP/1.1
 > user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
 > accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 > accept-language: en-US,en;q=0.5
 > accept-encoding: gzip, deflate
 > referer: https://127.0.0.1:4443/?expectValue=1511294406
 > host: 127.0.0.1:4443

 < HTTP/1.1 200 OK
 < Server: nginx
 < Date: Tue, 21 Nov 2017 20:00:13 GMT
 < Content-Type: text/html; charset=utf-8
 < Transfer-Encoding: chunked
 < Connection: keep-alive
 < X-Powered-By: PHP/5.3.10-1ubuntu3.26
 < Set-Cookie: HAPTESTa=1511294413
 < Set-Cookie: HAPTESTb=1511294413
 < Set-Cookie: HAPTESTc=1511294413
 < Set-Cookie: HAPTESTd=1511294413
 < Content-Encoding: gzip

 > GET /?expectValue=1511294413 HTTP/1.1
 > user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
 > accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 > accept-language: en-US,en;q=0.5
 > accept-encoding: gzip, deflate
 > host: 127.0.0.1:4443
 > cookie: SERVERID=s1; HAPTESTa=1511294413; HAPTESTb=1511294413; HAPTESTc=1511294413; HAPTESTd=1511294413

Many thanks to @Nurza, @adrianw and @lukastribus for their helpful reports
and investigations here.
2017-11-21 21:13:36 +01:00
Willy Tarreau
f24ea8e45e MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
The current H2 to H1 protocol conversion presents some issues which will
require to perform some processing on certain headers before writing them
so it's not possible to convert HPACK to H1 on the fly.

Here we introduce a function which performs half of what hpack_decode_header()
used to do, which is to take a list of headers on input and emit the
corresponding request in HTTP/1.1 format. The code is the same and functions
were renamed to be prefixed with "h2" instead of "hpack", though it ends
up being simpler as the various HPACK-specific cases could be fused into
a single one (ie: add header).

Moving this part here makes a lot of sense as now this code is specific to
what is documented in HTTP/2 RFC 7540 and will be able to deal with special
cases related to H2 to H1 conversion enumerated in section 8.1.

Various error codes which were previously assigned to HPACK were never
used (aside being negative) and were all replaced by -1 with a comment
indicating what error was detected. The code could be further factored
thanks to this but this commit focuses on compatibility first.

This code is not yet used but builds fine.
2017-11-21 21:13:33 +01:00