66 Commits

Author SHA1 Message Date
Tim Duesterhus
7750850594 CLEANUP: Reapply ist.cocci with --include-headers-for-types --recursive-includes
Previous uses of `ist.cocci` did not add `--include-headers-for-types` and
`--recursive-includes` preventing Coccinelle seeing `struct ist` members of
other structs.

Reapply the patch with proper flags to further clean up the use of the ist API.

The command used was:

    spatch -sp_file dev/coccinelle/ist.cocci -in_place --include-headers --include-headers-for-types --recursive-includes --dir src/
2022-03-21 08:30:47 +01:00
Christopher Faulet
02c893332b BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted
If H1 headers are not fully received at once, the parsing is restarted a
last time when all headers are finally received. When this happens, the h1m
flags are sanitized to remove all value set during parsing.

But some flags where erroneously preserved. Among others, H1_MF_TE_CHUNKED
flag was not removed, what could lead to parsing error.

To fix the bug and make things easy, a mask has been added with all flags
that must be preserved. It will be more stable. This mask is used to
sanitize h1m flags.

This patch should fix the issue #1469. It must be backported to 2.5.
2021-12-02 09:46:29 +01:00
Tim Duesterhus
4c8f75fc31 CLEANUP: Apply ist.cocci
Make use of the new rules to use `istend()`.
2021-11-08 08:05:39 +01:00
Christopher Faulet
545fbba273 MINOR: h1: Change T-E header parsing to fail if chunked encoding is found twice
According to the RFC7230, "chunked" encoding must not be applied more than
once to a message body. To handle this case, h1_parse_xfer_enc_header() is
now responsible to fail when a parsing error is found. It also fails if the
"chunked" encoding is not the last one for a request.

To help the parsing, two H1 parser flags have been added: H1_MF_TE_CHUNKED
and H1_MF_TE_OTHER. These flags are set, respectively, when "chunked"
encoding and any other encoding are found. H1_MF_CHNK flag is used when
"chunked" encoding is the last one.
2021-09-28 16:21:25 +02:00
Christopher Faulet
631c7e8665 MEDIUM: h1: Force close mode for invalid uses of T-E header
Transfer-Encoding header is not supported in HTTP/1.0. However, softwares
dealing with HTTP/1.0 and HTTP/1.1 messages may accept it and transfer
it. When a Content-Length header is also provided, it must be
ignored. Unfortunately, this may lead to vulnerabilities (request smuggling
or response splitting) if an intermediary is only implementing
HTTP/1.0. Because it may ignore Transfer-Encoding header and only handle
Content-Length one.

To avoid any security issues, when Transfer-Encoding and Content-Length
headers are found in a message, the close mode is forced. The same is
performed for HTTP/1.0 message with a Transfer-Encoding header only. This
change is conform to what it is described in the last HTTP/1.1 draft. See
also httpwg/http-core#879.

Note that Content-Length header is also removed from any incoming messages
if a Transfer-Encoding header is found. However it is not true (not yet) for
responses generated by HAProxy.
2021-09-28 16:21:25 +02:00
Amaury Denoyelle
69294b20ac MINOR: http: use http uri parser for authority
Replace http_get_authority by the http_uri_parser API.

The new function is renamed http_parse_authority. Replace duplicated
scheme parsing code by http_parse_scheme invocation. A new
http_uri_parser state is declared to mark the authority parsing as done.
2021-07-08 17:11:17 +02:00
Amaury Denoyelle
aad333a9fc MEDIUM: h1: add a WebSocket key on handshake if needed
Add the header Sec-Websocket-Key when generating a h1 handshake websocket
without this header. This is the case when doing h2-h1 conversion.

The key is randomly generated and base64 encoded. It is stored on the session
side to be able to verify response key and reject it if not valid.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
c193823343 MEDIUM: h1: generate WebSocket key on response if needed
Add the Sec-Websocket-Accept header on a websocket handshake response.
This header may be missing if a h2 server is used with a h1 client.

The response key is calculated following the rfc6455. For this, the
handshake request key must be stored in the h1 session, as a new field
name ws_key. Note that this is only done if the message has been
prealably identified as a Websocket handshake request.
2021-01-28 16:37:14 +01:00
Amaury Denoyelle
18ee5c3eb0 MINOR: h1: reject websocket handshake if missing key
If a request is identified as a WebSocket handshake, it must contains a
websocket key header or else it can be reject, following the rfc6455.

A new flag H1_MF_UPG_WEBSOCKET is set on such messages.  For the request
te be identified as a WebSocket handshake, it must contains the headers:
  Connection: upgrade
  Upgrade: websocket

This commit is a compagnon of
"MEDIUM: h1: generate WebSocket key on response if needed" and
"MEDIUM: h1: add a WebSocket key on handshake if needed".

Indeed, it ensures that a WebSocket key is added only from a http/2 side
and not for a http/1 bogus peer.
2021-01-28 16:37:14 +01:00
Willy Tarreau
f278eec37a BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char
NetBSD apparently uses macros for tolower/toupper and complains about
the use of char for array subscripts. Let's properly cast all of them
to unsigned char where they are used.

This is needed to fix issue #729.
2020-07-05 21:50:02 +02:00
Ilya Shipitsin
47d17182f4 CLEANUP: assorted typo fixes in the code and comments
This is 10th iteration of typo fixes
2020-06-26 11:27:28 +02:00
Willy Tarreau
f1d32c475c REORG: include: move channel.h to haproxy/channel{,-t}.h
The files were moved with no change. The callers were cleaned up a bit
and a few of them had channel.h removed since not needed.
2020-06-11 10:18:58 +02:00
Willy Tarreau
5413a87ad3 REORG: include: move common/h1.h to haproxy/h1.h
The file was moved as-is. There was a wrong dependency on dynbuf.h
instead of buf.h which was addressed. There was no benefit to
splitting this between types and functions.
2020-06-11 10:18:57 +02:00
Willy Tarreau
0017be0143 REORG: include: split common/http-hdr.h into haproxy/http-hdr{,-t}.h
There's only one struct and 2 inline functions. It could have been
merged into http.h but that would have added a massive dependency on
the hpack parts for nothing, so better keep it this way since hpack
is already freestanding and portable.
2020-06-11 10:18:57 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Christopher Faulet
7032a3fd0a BUG/MEDIUM: h1: Don't compare host and authority if only h1 headers are parsed
When only request headers are parsed, the host header should not be compared to
the request authority because no start-line was parsed. Thus there is no
authority.

Till now this bug was hidden because this parsing mode was only used for the
response in the FCGI multiplexer. Since the HTTP checks refactoring, the request
headers may now also be parsed without the start-line.

This patch fixes the issue #610. It must be backported to 2.1.
2020-05-04 09:27:01 +02:00
Willy Tarreau
02ac950a11 CLEANUP: http/h1: rely on HA_UNALIGNED_LE instead of checking for CPU families
Now that we have flags indicating the CPU's capabilities, better use
them instead of missing some updates for new CPU families (ARMv8 was
missing there).
2020-02-21 16:32:57 +01:00
Christopher Faulet
1703478e2d BUG/MINOR: h1: Report the right error position when a header value is invalid
During H1 messages parsing, when the parser has finished to parse a full header
line, some tests are performed on its value, depending on its name, to be sure
it is valid. The content-length is checked and converted in integer and the host
header is also checked. If an error occurred during this step, the error
position must point on the header value. But from the parser point of view, we
are already on the start of the next header. Thus the effective reported
position in the error capture is the beginning of the unparsed header line. It
is a bit confusing when we try to figure out why a message is rejected.

Now, the parser state is updated to point on the invalid value. This way, the
error position really points on the right position.

This patch must be backported as far as 1.9.
2020-01-06 13:58:21 +01:00
Christopher Faulet
bc7c03eba3 BUG/MINOR: h1: Don't test the host header during response parsing
During the H1 message parsing, the host header is tested to be sure it matches
the request's authority, if defined. When there are multiple host headers, we
also take care they are all the same. Of course, these tests must only be
performed on the requests. A host header in a response has no special meaning.

This patch must be backported to 2.1.
2019-11-27 14:01:17 +01:00
Christopher Faulet
531b83e039 MINOR: h1: Reject requests if the authority does not match the header host
As stated in the RCF7230#5.4, a client must send a field-value for the header
host that is identical to the authority if the target URI includes one. So, now,
by default, if the authority, when provided, does not match the value of the
header host, an error is triggered. To mitigate this behavior, it is possible to
set the option "accept-invalid-http-request". In that case, an http error is
captured without interrupting the request parsing.
2019-10-14 22:28:50 +02:00
Christopher Faulet
497ab4f519 MINOR: h1: Reject requests with different occurrences of the header host
There is no reason for a client to send several headers host. It even may be
considered as a bug. However, it is totally invalid to have different values for
those. So now, in such case, an error is triggered during the request
parsing. In addition, when several headers host are found with the same value,
only the first instance is kept and others are skipped.
2019-10-14 22:28:50 +02:00
Christopher Faulet
84f06533e1 BUG/MINOR: h1: Properly reset h1m when parsing is restarted
Otherwise some processing may be performed twice. For instance, if the header
"Content-Length" is parsed on the first pass, when the parsing is restarted, we
skip it because we think another header with the same value was already seen. In
fact, it is currently the only existing bug that can be encountered. But it is
safer to reset all the h1m on restart to avoid any future bugs.

This patch must be backported to 2.0 and 1.9
2019-09-04 10:30:11 +02:00
Christopher Faulet
711ed6ae4a MAJOR: http: Remove the HTTP legacy code
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
2019-07-19 09:24:12 +02:00
Christopher Faulet
a51ebb7f56 MEDIUM: h1: Add an option to sanitize connection headers during parsing
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.

If this flag is set, during the parsing The function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_conection_header().

No need to backport this patch
2019-04-12 22:06:53 +02:00
Christopher Faulet
68b1bbd767 BUG/MEDIUM: h1: Get the h1m state when restarting the headers parsing
Since the commit 0f8fb6b7f ("MINOR: h1: make the H1 headers block parser able to
parse headers only"), when headers are not received in one time, a parsing error
is returned because the local state in the function h1_headers_to_hdr_list() was
not initialized with the previous one (in fact, it was not initialized at all).

So now, we start the parsing of headers with the state H1_MSG_HDR_FIRST when the
flag H1_MF_HDRS_ONLY is set. Otherwise, we always get it from the h1m.

This patch must be backported to 1.9.
2019-01-04 16:23:03 +01:00
Willy Tarreau
0f8fb6b7f9 MINOR: h1: make the H1 headers block parser able to parse headers only
Currently the H1 headers parser works for either a request or a response
because it starts from the start line. It is also able to resume its
processing when it was interrupted, but in this case it doesn't update
the list.

Make it support a new flag, H1_MF_HDRS_ONLY so that the caller can
indicate it's only interested in the headers list and not the start
line. This will be convenient to parse H1 trailers.
2019-01-04 10:48:03 +01:00
Willy Tarreau
afba57ae80 REORG: h1: merge types+proto into common/h1.h
These two files are self-contained and do not depend on other
layers, so let's remerge them together for easier manipulation.
2018-12-11 17:15:13 +01:00
Willy Tarreau
538746ad38 REORG: h1: move legacy http functions to http_msg.c
Now that h1 and legacy HTTP are two distinct things, there's no need
to keep the legacy HTTP parsers in h1.c since they're only used by
the legacy code in proto_http.c, and h1.h doesn't need to include
hdr_idx anymore. This concerns the following functions :

- http_parse_reqline();
- http_parse_stsline();
- http_msg_analyzer();
- http_forward_trailers();

All of these were moved to http_msg.c.
2018-12-11 17:15:13 +01:00
Christopher Faulet
25da9e34f1 MINOR: h1: Add the flag H1_MF_NO_PHDR to not add pseudo-headers during parsing
Some pseudo-headers are added during the headers parsing, mainly for the mux
H2. With this flag, it is possible to not add them. This avoid some boring
filtering in the mux H1.
2018-10-12 16:15:18 +02:00
Christopher Faulet
1dc2b49556 MINOR: h1: Change the union h1_sl to use indirect strings to store infos
Instead of using offsets relating to the parsed buffer to store start line
infos, we now use indirect strings. So now, these infos remain valid only if the
origin buffer remains untouched. But it's not a real problem because this union
is used during the parsing and never stored to a later use.
2018-10-12 16:14:57 +02:00
Christopher Faulet
ff08a92797 MINOR: h1: Add EOH marker during headers parsing
When headers parsing ends, a pseudo header with an empty name and an empty value
is added to the array of parsed headers to mark its end. It is convenient to
loop on this array, but not really useful if we want remove the last header or
add a new one, because we don't really know where is the last CRLF (the empty
line ending the headers block). So now, instead the name of this pseudo header
points on this last CRLF. Its length is still 0 and its value is still empty, so
loops on the array remains unchanged.
2018-10-12 16:08:27 +02:00
Christopher Faulet
2912f87443 BUG/MEDIUM: h1: Really skip all updates when incomplete messages are parsed
In h1_headers_to_hdr_list, when an incomplete message is parsed, all updates
must be skipped until the end of the message is found. Then the parsing is
restarted from the beginning. But not all updates were skipped, leading to
invalid rewritting or segfault.

No backport is needed.
2018-09-19 15:08:05 +02:00
Willy Tarreau
73373ab43a MEDIUM: h1: deduplicate the content-length header
Just like we used to do in proto_http, we now check that each and every
occurrence of the content-length header field and each of its values are
exactly identical, and we normalize the header to return the last value
of the first header with spaces trimmed.
2018-09-14 19:04:28 +02:00
Willy Tarreau
2557f6a3e2 MEDIUM: h1: better handle transfer-encoding vs content-length
The transfer-encoding header processing was a bit lenient in this part
because it was made to read messages already validated by haproxy. We
absolutely need to reinstate the strict processing defined in RFC7230
as is currently being done in proto_http.c. That is, transfer-encoding
presence alone is enough to cancel content-length, and must be
terminated by the "chunked" token, except in the response where we
can fall back to the close mode if it's not last.

For this we now use a specific parsing function which updates the
flags and we introduce a new flag H1_MF_XFER_ENC indicating that the
transfer-encoding header is present.

Last, if such a header is found, we delete all content-length header
fields found in the message.
2018-09-14 17:40:35 +02:00
Willy Tarreau
2ea6bb5c31 MINOR: h1: add headers to the list after controls, not before
This will ease removal/skipping of duplicates such as content-length.
2018-09-14 17:40:35 +02:00
Willy Tarreau
98f5cf7a59 MINOR: h1: parse the Connection header field
The new function h1_parse_connection_header() is called when facing a
connection header in the generic parser, and it will set up to 3 bits
in h1m->flags indicating if at least one "close", "keep-alive" or "upgrade"
tokens was seen.
2018-09-13 14:52:31 +02:00
Willy Tarreau
ba5fbca33f MINOR: h1: report in the h1m struct if the HTTP version is 1.1 or above
This will be needed for the mux to know how to process the Connection
header, and will save it from having to re-parse the request line since
it's captured on the fly.
2018-09-13 14:34:09 +02:00
Willy Tarreau
db72da0432 BUG/MINOR: h1: don't consider the status for each header
While it was possible to consider the status before parsing response
headers, it's wrong to do it for request headers and could lead to
random behaviours due to this status matching other fields instead.
Additionnally there is little to no value in doing this for each and
every new header field. It's much better to reset the content-length
at once in the callerwhen seeing such statuses (which currently is only
the H2 mux).

No backport is needed, this is purely 1.9.
2018-09-13 14:30:23 +02:00
Willy Tarreau
eb528db60b MINOR: h1: add H1_MF_TOLOWER to decide when to turn header names to lower case
The h1 parser used to systematically turn header field names to lower
case because it was designed for H2. Let's add a flag which is off by
default to condition this behaviour so that when using it from an H1
parser it will not affect the message.
2018-09-12 17:38:26 +02:00
Willy Tarreau
c2ab9f5163 MEDIUM: h1: implement the request parser as well
The original H1 request parsing code was reintroduced into the generic
H1 parser so that it can be used regardless of the direction. If the
parser is interrupted and restarts, it makes use of the H1_MF_RESP
flag to decide whether to re-parse a request or a response. While
parsing the request, the method is decoded and set into the start line
structure.
2018-09-12 17:38:25 +02:00
Willy Tarreau
11da5674c3 MINOR: h1: remove the HTTP status from the H1M struct
It has nothing to do there and is not used from there anymore, let's
get rid of it.
2018-09-12 17:38:25 +02:00
Willy Tarreau
001823c304 MEDIUM: h1: remove the useless H1_MSG_BODY state
This state was only a delimiter between headers and body but it now
causes more harm than good because it requires someone to change it.
Since the H1 parser knows if we're in DATA or CHUNK_SIZE, simply let
it set the right next state so that h1m->state constantly matches
what is expected afterwards.
2018-09-12 17:38:25 +02:00
Willy Tarreau
4c34c0e74a MEDIUM: h1: support partial message parsing
While it was not needed in the H2 mux which was reading full H1 messages
from the channel, it is mandatory for the H1 mux reading contents from
outside to be able to restart on a message. The problem is that the
headers are indexed on the fly, and it's not fun to have to store
everything between calls.

The solution here is to complete the first pass doing a partial restart,
and only once the end of message was found, to start over it again at
once, filling entries. This way there is a bounded number of passes on
the contents and no need to store an intermediary result anymore. Later
this principle could even be used to decide to completely drop an output
buffer to save memory.
2018-09-12 17:38:25 +02:00
Willy Tarreau
5384aac0cb MINOR: h1: make the message parser support a null <hdr> argument
This will allow some iterative calls to be made on incomplete messages
without having to store all the headers.
2018-09-12 17:38:25 +02:00
Willy Tarreau
4433c083ec MEDIUM: h1: let the caller pass the initial parser's state
This way the caller controls if it's the request or response which has
to be used, and it will allow to restart after an incomplete parsing.
2018-09-12 17:38:25 +02:00
Willy Tarreau
a41393fc61 MEDIUM: h1: make the parser support a pointer to a start line
This will allow the parser to fill some extra fields like the method or
status without having to store them permanently in the HTTP message. At
this point however the parser cannot restart from an interrupted read.
2018-09-12 17:38:25 +02:00
Willy Tarreau
9aec30557b MEDIUM: h1: consider err_pos before deciding to accept a header name or not
Till now the H1 parser made for H2 used to be lenient on invalid header
field names because they were supposed to be produced by haproxy. Now
instead we'll rely on err_pos to know how to act (ie: -2 == must block).
2018-09-12 17:38:25 +02:00
Willy Tarreau
801250e07d REORG: h1: create a new h1m_state
This is the *parsing* state of an HTTP/1 message. Currently the h1_state
is composite as it's made both of parsing and control (100SENT, BODY,
DONE, TUNNEL, ENDING etc). The purpose here is to have a purely H1 state
that can be used by H1 parsers. For now it's equivalent to h1_state.
2018-09-12 17:38:25 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
Willy Tarreau
950a8a6fde BUG/MINOR: h1: fix buffer shift after realignment
Commit 5e74b0b ("MEDIUM: h1: port to new buffer API.") introduced a
minor bug by which a buffer's head could stay shifted by the amount
of removed CRLF if it started with empty lines. This would cause the
second request (or response) not to work until it would receive a few
extra characters. This most only impacts requests sent by hand though.

This is purely 1.9, no backport is needed.
2018-09-06 10:48:15 +02:00