10599 Commits

Author SHA1 Message Date
Willy Tarreau
afba57ae80 REORG: h1: merge types+proto into common/h1.h
These two files are self-contained and do not depend on other
layers, so let's remerge them together for easier manipulation.
2018-12-11 17:15:13 +01:00
Willy Tarreau
30925659ef CLEANUP: h1: remove some occurrences of unneeded h1.h inclusions
Several places where h1.h was included didn't need it at all since
they in fact relied on the legacy HTTP definitions.
2018-12-11 17:15:13 +01:00
Willy Tarreau
538746ad38 REORG: h1: move legacy http functions to http_msg.c
Now that h1 and legacy HTTP are two distinct things, there's no need
to keep the legacy HTTP parsers in h1.c since they're only used by
the legacy code in proto_http.c, and h1.h doesn't need to include
hdr_idx anymore. This concerns the following functions :

- http_parse_reqline();
- http_parse_stsline();
- http_msg_analyzer();
- http_forward_trailers();

All of these were moved to http_msg.c.
2018-12-11 17:15:13 +01:00
Willy Tarreau
c5a4fd5c30 REORG: http: create http_msg.c to place there some legacy HTTP parts
Lots of HTTP code still uses struct http_msg. Not only is this code
still huge, but it's part of the legacy interface. Let's move most
of these functions to a separate file http_msg.c to make it more
visible which file relies on what. It's mostly symmetrical with
what is present in http_htx.c.

The function http_transform_header_str() which used to rely on two
function pointers to look up a header was simplified to rely on
two variants http_legacy_replace_{,full_}header(), making both
sides of the function much simpler.

No code was changed beyond these moves.
2018-12-11 17:15:13 +01:00
Willy Tarreau
b96b77ed6e REORG: htx: merge types+proto into common/htx.h
All the HTX definition is self-contained and doesn't really depend on
anything external since it's mostly a protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.

This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
2018-12-11 17:15:04 +01:00
Christopher Faulet
99a17a2d91 MEDIUM: cache: Require an explicit filter declaration if other filters are used
As for the compression filter, the cache filter must be explicitly declared
(using the filter keyword) if other filters than cache are used. It is mandatory
to explicitly define the filters order.

Documentation has been updated accordingly.
2018-12-11 17:09:31 +01:00
Christopher Faulet
afd819c54a MEDIUM: cache/compression: Add a way to safely combine compression and cache
This is only true for HTX proxies. On legacy HTTP proxy, if the compression and
the cache are both enabled, an error during HAProxy startup is triggered.

With the HTX, now you can use both in any order. If the compression is defined
before the cache, then the responses will be stored compressed. If the
compression is defined after the cache, then the responses will be stored
uncompressed. So in the latter case, when a response is served from the cache, it
will be compressed too like any response.
2018-12-11 17:09:31 +01:00
Christopher Faulet
f4a4ef7d7c MINOR: filters: Export the name of known filters
It could be useful to know if some filter is declared on a proxy or if it is
enabled on a stream.
2018-12-11 17:09:31 +01:00
Christopher Faulet
95220e2ed8 MINOR: cache: Improve and simplify the cache configuration check
To do so, a dedicated configuration has been added on cache filters. Before, the
cache filter configuration pointed directly to the cache it used. Now, it is the
dedicated structure cache_flt_conf. Store and use rules also point to this
structure. It is linked to the cache the filter must use. It also contains a
flags field. This will allow us to define the behavior of a cache filter when a
response is stored in the cache or delivered from it.

And now, store and use rules use a common parsing function. So if it does not
already exist, a filter is always created for both kinds of rules. The cache
filters configuration is checked using their check callback. In the postparser
function, we only check the caches configuration. This removes the loop on all
proxies in the postparser function.
2018-12-11 17:09:31 +01:00
Christopher Faulet
54a8d5a4a0 MEDIUM: cache/htx: Add the HTX support into the cache
The cache is now able to store and resend HTX messages. When an HTX message is
stored in the cache, the headers are prefixed with their block's info (a
uint32_t), containing its type and its length. Data, on their side, are stored
without any prefix. Only the value is copied in the cache. 2 fields have been
added in the structure cache_entry, hdrs_len and data_len, to know the size, in
the cache, of the headers part and the data part. If the message is chunked, the
trailers are also copied, the same way as data. When the HTX message is
recreated in the cache applet, the trailers size is obtained by subtracting the
headers length and the data length from the total object length.
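
The size bookkeeping above can be sketched as follows. The bit split of the
info word (4 bits of type, 28 bits of length), the helper names and the
entry_sizes structure are assumptions made for illustration; only hdrs_len
and data_len come from this commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packing of a block's info word: type in the top 4 bits,
 * length in the low 28 bits. The real HTX layout may differ.
 */
static inline uint32_t blk_info_pack(uint32_t type, uint32_t len)
{
	return (type << 28) | (len & 0x0FFFFFFFu);
}

static inline uint32_t blk_info_type(uint32_t info)
{
	return info >> 28;
}

static inline uint32_t blk_info_len(uint32_t info)
{
	return info & 0x0FFFFFFFu;
}

/* Stand-in for the two fields added to struct cache_entry */
struct entry_sizes {
	size_t hdrs_len; /* size of the prefixed headers part in the cache */
	size_t data_len; /* size of the raw data part in the cache */
};

/* Whatever is neither headers nor data in the stored object is trailers */
static inline size_t trailers_len(size_t total_len, const struct entry_sizes *e)
{
	return total_len - e->hdrs_len - e->data_len;
}
```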
2018-12-11 17:09:31 +01:00
Christopher Faulet
67658c9c9a MINOR: cache: Register the cache as a data filter only if response is cacheable
Instead of calling register_data_filter() when the stream analysis starts, we now
call it when we are sure the response is cacheable. It is done in the
http_headers callback, just before the body analysis, and only if the headers
have already been cached. And during the body analysis, if an error occurred or
if the response is too big, we unregister the cache immediately.

This patch may be backported in 1.8. It is not a bug but a significant
improvement.
2018-12-11 17:09:31 +01:00
Christopher Faulet
1f672c536d MINOR: cache/htx: Don't use the same cache on HTX and legacy HTTP proxies
It is not possible to mix the format of messages stored in a cache. So we reject
the configurations with a cache used by an HTX proxy and a legacy HTTP proxy at
the same time.
2018-12-11 17:09:31 +01:00
Christopher Faulet
c9df7f728f MINOR: compression: Rename the function check_legacy_http_comp_flt()
To not mix it up with the legacy HTTP representation, this function has been
renamed check_implicit_http_comp_flt().
2018-12-11 17:09:31 +01:00
William Lallemand
459e18e9e7 MINOR: cli: use pcli_flags for prompt activation
Instead of using a variable to activate the prompt, we just use a flag.
2018-12-11 17:05:40 +01:00
William Lallemand
ebf61804ef MEDIUM: cli: handle payload in CLI proxy
The CLI proxy was not handling payload. To do that, we needed to keep a
connection active on a server and to transfer each new line over that
connection until we receive an empty line.

The CLI proxy handles the payload in the same way that the CLI does.

Examples:

   $ echo -e "@1;add map #-1 <<\n$(cat data)\n" | socat /tmp/master-socket -

   $ socat /tmp/master-socket readline
   prompt
   master> @1
   25130> add map #-1 <<
   + test test
   + test2 test2
   + test3 test3
   +

   25130>
2018-12-11 17:05:36 +01:00
William Lallemand
3de09d5c7e BUG/MINOR: cli: wait for payload data even without prompt
During a payload transfer, we need to wait for the data even when we are
not in interactive mode. Indeed, the data could be received line per
line progressively instead of in one recv.

Previously the CLI was doing a SHUTW just after the first line if it was
not in interactive mode. We now check if we are in payload mode to do
a SHUTW.

Should be backported in 1.8.
2018-12-11 16:54:18 +01:00
William Lallemand
5f61068dbd MINOR: cli: implements 'quit' in the CLI proxy
Implements the 'quit' command. Works the same way as the CLI command.
2018-12-11 16:54:18 +01:00
William Lallemand
5b80fa2864 MINOR: cli: parse prompt command in the CLI proxy
Handle the prompt command. Works the same way as the CLI.
2018-12-11 16:54:18 +01:00
William Lallemand
bddd33af0b MEDIUM: cli: rework the CLI proxy parser
Rework the CLI proxy parser to look more like the CLI parser: corner
cases and escaping are handled the same way.

The parser now splits the commands in words instead of just handling
the prefixes.

It's easier to compare words and arguments of a command this way and to
parse internal commands that will be consumed directly by the CLI proxy.
2018-12-11 16:54:18 +01:00
Willy Tarreau
1a18b54142 REORG: connection: centralize the conn_set_{tos,mark,quickack} functions
There were a number of ugly setsockopt() calls spread all over
proto_http.c, proto_htx.c and hlua.c just to manipulate the front
connection's TOS, mark or TCP quick-ack. These ones entirely relied
on the connection, its existence, its control layer's presence, and
its addresses. Worse, inet_set_tos() was placed in proto_http.c,
exported and used from the two other ones, surrounded in #ifdefs.

This patch moves this code to connection.h and makes the other ones
rely on it without ifdefs.
2018-12-11 16:41:51 +01:00
Willy Tarreau
907998194b MEDIUM: mux-h2: make use of hpack_encode_path() to encode the path
The HTTP path encoding was open-coded with a HPACK byte matching the
"/" or "/index.html" paths. Let's make use of the new functions to
avoid this.
2018-12-11 09:07:02 +01:00
Willy Tarreau
7561bcbb36 MEDIUM: mux-h2: make use of hpack_encode_scheme() to encode the scheme
The HTTP scheme encoding was open-coded with a HPACK byte matching the
"https" scheme. Let's make use of the new functions to avoid this.
2018-12-11 09:07:02 +01:00
Willy Tarreau
bdabc3a25f MEDIUM: mux-h2: make use of hpack_encode_method() to encode the method
The HTTP method encoding was open-coded with raw HPACK bytes, which is
not suitable there. Let's make use of the new functions to avoid this.
2018-12-11 09:07:02 +01:00
Willy Tarreau
aafdf58333 MEDIUM: mux-h2: make use of standard HPACK encoding functions for the status
This way we don't open-code the HPACK status codes anymore in the H2
code. Special care was taken not to cause any slowdown as this code is
very sensitive.
2018-12-11 09:07:02 +01:00
Willy Tarreau
bad0a381d3 MINOR: hpack: move the length computation and encoding functions to .h
We'll need these functions from other inline functions, let's make them
accessible. len_to_bytes() was renamed to hpack_len_to_bytes() since it's
now exposed.
2018-12-11 09:06:46 +01:00
Willy Tarreau
2c3139489c MEDIUM: hpack: make it possible to encode any static header name
We used to have a series of well-known header fields that were looked
up, but most of them were not. The current model couldn't scale with
the addition of the new headers or pseudo-headers required to process
requests, resulting in their encoding being hard-coded in the caller.

This patch implements a quick lookup which retrieves any header from
the static table. A binary stream is made of header names prefixed by
lengths and indexes. These header names are sorted by length, then by
frequency, then by direction (preference for response), then by name,
and the lowest index of each is stored only in case of multiple
entries. A parallel length index table provides the index of the first
header for a given string. This allows focusing on the first few values
matching the same length.

Everything was made to limit the cache footprint. Interestingly, the
lookup ends up being slightly faster than the previous one, while
covering the 54 distinct headers instead of only 10.

A test with a curl request and a basic response showed that the request
size has dropped from 85 to 56 bytes and that the response size has
dropped from 197 to 170 bytes, thus we can now shave roughly 25-30 bytes
per message.
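
The length-indexed lookup can be sketched as below. This is a simplified
illustration, not the binary stream used by the patch: it holds only a few
names, sorted by length only, and the names static_hdr, first_by_len and
static_hdr_lookup() are made up for the example. The indexes are those of the
HPACK static table (RFC 7541, Appendix A).

```c
#include <stddef.h>
#include <string.h>

struct static_hdr {
	const char *name;
	unsigned char len; /* name length */
	unsigned char idx; /* index in the HPACK static table */
};

/* A handful of static table names, sorted by length */
static const struct static_hdr hdrs[] = {
	{ "age",    3, 21 },
	{ "via",    3, 60 },
	{ "date",   4, 33 },
	{ "etag",   4, 34 },
	{ "host",   4, 38 },
	{ "vary",   4, 59 },
	{ "cookie", 6, 32 },
	{ "server", 6, 54 },
};

#define NB_HDRS (sizeof(hdrs) / sizeof(hdrs[0]))
#define MAX_LEN 6

/* first_by_len[n]: position in hdrs[] of the first name of length n, or -1 */
static const signed char first_by_len[MAX_LEN + 1] = { -1, -1, -1, 0, 2, -1, 6 };

/* Returns the static table index of <name>, or 0 if it is not in the table */
static int static_hdr_lookup(const char *name, size_t len)
{
	size_t i;

	if (len > MAX_LEN || first_by_len[len] < 0)
		return 0;

	/* only visit the few entries sharing the same length */
	for (i = (size_t)first_by_len[len]; i < NB_HDRS && hdrs[i].len == len; i++) {
		if (memcmp(hdrs[i].name, name, len) == 0)
			return hdrs[i].idx;
	}
	return 0;
}
```

A name whose length has no entry is rejected without any string comparison,
and other names are compared only against candidates of the same length.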
2018-12-11 09:06:46 +01:00
Willy Tarreau
19ed92b47d MINOR: hpack: optimize header encoding for short names
For unknown fields, since we know that most of them are less than 127
characters, we don't need to go through the loop and can instead directly
emit the one-byte length encoding. This increases the request rate by
approximately 0.5%.
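
A minimal sketch of the idea, following the integer representation of
RFC 7541 section 5.1 with a 7-bit prefix. The function name encode_len7() is
hypothetical and this is not the code from the patch; the caller is assumed
to provide enough room (11 bytes cover any 64-bit length).

```c
#include <stddef.h>
#include <stdint.h>

/* Writes <len> as an HPACK integer with a 7-bit prefix (H bit left clear)
 * and returns the number of bytes written.
 */
static size_t encode_len7(uint8_t *out, size_t len)
{
	size_t pos = 0;

	if (len < 127) {
		/* fast path: the length fits in the prefix, no loop needed */
		out[0] = (uint8_t)len;
		return 1;
	}

	out[pos++] = 127; /* prefix saturated */
	len -= 127;
	while (len >= 128) {
		out[pos++] = (uint8_t)(0x80 | (len & 0x7F));
		len >>= 7;
	}
	out[pos++] = (uint8_t)len;
	return pos;
}
```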
2018-12-11 09:06:46 +01:00
Willy Tarreau
ac73ae0b83 MINOR: hpack: use ist2bin() to copy header names in hpack_encode_header()
memcpy() tends to be overkill to copy short strings, better use ist's
naive functions for this. This shows a consistent 1.2% performance
gain with h2load.
2018-12-11 09:06:46 +01:00
Willy Tarreau
1526f1942c MINOR: hpack: simplify the len to bytes conversion
The len-to-bytes conversion can be slightly simplified and optimized
by hardcoding a tree lookup. Just doing this increases by 1% the
request rate on H2. It could be made almost branch-free by using
fls() but it looks overkill for most situations since most headers
are very short.
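
The hardcoded tree can be sketched like this for a 7-bit prefix. The name
len_to_bytes_sketch() and the choice of returning 0 for lengths needing more
than 5 bytes are assumptions for the example.

```c
#include <stddef.h>

/* Number of bytes needed to encode <len> as an HPACK integer with a 7-bit
 * prefix, or 0 if it would take more than 5 bytes.
 */
static int len_to_bytes_sketch(size_t len)
{
	if (len < 127)
		return 1;                        /* most common case first */

	len -= 127;
	if (len < (1u << 14))
		return len < (1u << 7) ? 2 : 3;
	if (len < (1u << 21))
		return 4;
	if (len < (1u << 28))
		return 5;
	return 0;
}
```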
2018-12-11 09:06:46 +01:00
Willy Tarreau
7571015939 BUG/MINOR: hpack: fix off-by-one in header name encoding length calculation
In hpack_encode_header() there is a length check to verify that a literal
header name fits in the buffer, but there is an off-by-one in this length
check, which forgets the byte required to mark the encoding type (literal
without indexing). It should be harmless though as it cannot be triggered
since response headers passing through haproxy are limited by the reserve,
which is not the case of the output buffer.
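
For illustration only, a room check that accounts for the type byte could
look like the sketch below. The function name and its parameters are
hypothetical and the real check in hpack_encode_header() is structured
differently; the point is simply the leading 1 in the sum.

```c
#include <stddef.h>

/* Returns non-zero if a literal header without indexing fits in <room>:
 * 1 byte for the encoding type, then the encoded name length, the name,
 * the encoded value length and the value.
 */
static int literal_fits(size_t room, size_t nlen_bytes, size_t nlen,
                        size_t vlen_bytes, size_t vlen)
{
	return 1 + nlen_bytes + nlen + vlen_bytes + vlen <= room;
}
```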

This fix should be backported to 1.8.
2018-12-11 06:46:03 +01:00
Olivier Houchard
56b0348ea7 BUG/MEDIUM: mux-h2: Don't forget to set the CS_FL_EOS flag with htx.
When running with HTX, if we got an empty answer, don't forget to set
CS_FL_EOS, or the stream will never be destroyed.
2018-12-10 20:53:31 +01:00
Christopher Faulet
e97f3baa66 BUG/MEDIUM: htx: Always do a defrag if a block value is replaced by a bigger one
Otherwise, after such replaces, the HTX message appears to wrap but the head
block address is not necessarily the first one. So adding new blocks will
override data of old ones.
2018-12-10 20:51:41 +01:00
Christopher Faulet
f6ce9d61f9 BUG/MEDIUM: mux-h1: Don't loop on the headers parsing if the read0 was received
If a server sends part of headers and then closes its connection, the mux H1
remains blocked in an infinite loop trying to read more data to finish the
parsing of the message. The flag CS_FL_REOS is set on the conn_stream. But
because there are some data in the input buffer, CS_FL_EOS is never set.

To fix the bug, in h1_process_input, when CS_FL_REOS is set on the conn_stream,
we also set CS_FL_EOS if the input buffer is empty OR if the channel's buffer is
empty.
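
The condition can be sketched as follows. The flag values and the function
name promote_reos() are placeholders for the example and do not match the
real definitions.

```c
#include <stddef.h>

#define EX_CS_FL_REOS 0x1u /* placeholder value: read0 seen, data may remain */
#define EX_CS_FL_EOS  0x2u /* placeholder value: end of stream reported */

/* Once a read0 was seen, report the end of stream as soon as either the
 * mux input buffer or the channel's buffer is empty.
 */
static unsigned int promote_reos(unsigned int flags, size_t ibuf_data, size_t chn_data)
{
	if ((flags & EX_CS_FL_REOS) && (ibuf_data == 0 || chn_data == 0))
		flags |= EX_CS_FL_EOS;
	return flags;
}
```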
2018-12-10 20:50:59 +01:00
Christopher Faulet
cb55f485da BUG/MEDIUM: mux-h1: Add a BUSY mode to not loop on pipelined requests
When a request is fully processed, no more data are parsed until the response is
totally processed and a new transaction starts. But during this time, the mux is
trying to read more data and subscribes to read. If requests are pipelined, we
start to receive the next requests which will stay in the input buffer, leading
to a loop consuming all the CPU. This loop ends when the transaction ends. To
avoid this loop, the flag H1C_F_IN_BUSY has been added. It is set when the
request is fully parsed and unset when the transaction ends. Once set on H1C, it
blocks the reads. So the mux never tries to receive more data in this state.
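
The life cycle of the flag can be sketched as below. The structure, the flag
value and the function names are stand-ins for the example, not the mux-h1
code.

```c
#define EX_H1C_F_IN_BUSY 0x1u /* placeholder value */

struct ex_h1c {
	unsigned int flags;
};

/* The request was fully parsed: stop reading until the transaction ends */
static void ex_request_done(struct ex_h1c *h1c)
{
	h1c->flags |= EX_H1C_F_IN_BUSY;
}

/* The response was fully processed: reads are allowed again */
static void ex_transaction_end(struct ex_h1c *h1c)
{
	h1c->flags &= ~EX_H1C_F_IN_BUSY;
}

/* Checked before trying to receive or subscribing to read events */
static int ex_may_recv(const struct ex_h1c *h1c)
{
	return !(h1c->flags & EX_H1C_F_IN_BUSY);
}
```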
2018-12-10 20:50:19 +01:00
Christopher Faulet
de68b1351f BUG/MINOR: mux-h1: Fix conn_mode processing for headerless outgoing messages
Condition to process the connection mode on outgoing messages without
'Connection' header was wrong. It relied on the wrong H1M
state. H1_MSG_HDR_L2_LWS is only a possible state for messages with at least one
header. Now, to fix the bug, we just check the H1M state is not
H1_MSG_LAST_LF. So, we have the guarantee the EOH was not processed yet.
2018-12-10 20:49:12 +01:00
Willy Tarreau
ac77b6f441 BUG/MEDIUM: mux-h2: fix encoding of non-GET/POST methods
Jerome reported that outgoing H2 failed for methods different from GET
or POST. It turns out that the HPACK encoding is performed by hand in
the outgoing headers encoding function and that the data length was not
incremented to cover the literal method value, resulting in a corrupted
HEADERS frame.
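
A minimal sketch of a method encoder that accounts for the literal value in
the returned length, based on RFC 7541 (static index 2 is ":method GET",
index 3 is ":method POST"). The function name is hypothetical, methods of 127
bytes or more are simply rejected, and this is not the code from the fix.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encodes the :method pseudo-header into <out> and returns the number of
 * bytes written, or 0 if it does not fit.
 */
static size_t encode_method_sketch(uint8_t *out, size_t room, const char *meth, size_t len)
{
	if (len == 3 && memcmp(meth, "GET", 3) == 0) {
		if (room < 1)
			return 0;
		out[0] = 0x82; /* indexed field, static index 2 */
		return 1;
	}
	if (len == 4 && memcmp(meth, "POST", 4) == 0) {
		if (room < 1)
			return 0;
		out[0] = 0x83; /* indexed field, static index 3 */
		return 1;
	}

	/* literal without indexing, name taken from static index 2 */
	if (len >= 127 || room < 2 + len)
		return 0;
	out[0] = 0x02;
	out[1] = (uint8_t)len; /* raw string, one-byte length */
	memcpy(out + 2, meth, len);
	return 2 + len;        /* the length covers the literal value too */
}
```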

Admittedly this code should move to the generic HPACK code.

No backport is needed.
2018-12-10 11:08:04 +01:00
Olivier Houchard
ac1ce6f9b8 BUG/MEDIUM: connections: Remove error flags when retrying.
In connect_server(), when retrying to connect, remove the error flags from
the connection and the conn_stream, we're trying to connect again, anyway.
2018-12-08 21:56:07 +01:00
Olivier Houchard
eb2bbba547 BUG/MEDIUM: connection: Don't use the provided conn_stream if it was tried.
In connect_server(), don't attempt to reuse the conn_stream associated to
the stream_interface, if we already attempted a connection with it.
Using that conn_stream is only there for the cases where a connection and
a conn_stream were created ahead, mostly by http_proxy or by the LUA code.
If we already attempted to connect, that means we failed, and so we should
create a new connection.

No backport needed.
2018-12-08 18:13:46 +01:00
Christopher Faulet
ce85149629 BUG/MINOR: mux-h1: Remove the connection header when it is useless
When the connection mode can be deduced from the HTTP version, we remove the
redundant connection header. So "keep-alive" connection header is removed from
HTTP/1.1 messages and "close" connection header is removed from HTTP/1.0
messages.
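
The rule can be sketched as below, assuming a single-token header value. The
function names are made up for the example and the real mux-h1 code works on
its own message flags rather than on strings.

```c
#include <ctype.h>

/* Case-insensitive comparison of two NUL-terminated tokens */
static int tok_eq(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == *b;
}

/* <minor> is the minor version of an HTTP/1.x message. Returns non-zero if
 * the connection header only repeats the default mode of that version.
 */
static int conn_hdr_is_redundant(int minor, const char *value)
{
	if (minor >= 1)
		return tok_eq(value, "keep-alive"); /* default in HTTP/1.1 */
	return tok_eq(value, "close");              /* default in HTTP/1.0 */
}
```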
2018-12-08 15:44:58 +01:00
Willy Tarreau
e2778a43d4 BUILD: h2: mark the start line already checked to avoid warnings
Gcc 7 warns about a potential null pointer deref that cannot happen
since the start line block is guaranteed to be present in the functions
where it's dereferenced. Let's mark it as already checked.
2018-12-08 15:31:57 +01:00
Olivier Houchard
50d660c545 BUG/MEDIUM: h2: Don't try to chunk data when using HTX.
When we're using HTX, we don't have to generate chunk header/trailers, and
that ultimately leads to a crash when we try to access a buffer that
contains just chunk trailers.

This should not be backported.
2018-12-08 08:22:04 +01:00
Willy Tarreau
c706cd73a5 BUG/MEDIUM: htx: fix typo in htx_replace_stline() making it fail all the time
A typo in the block type check makes this function fail all the time,
which has impact on anything rewriting a start line (set-uri, set-path
etc).

No backport needed.
2018-12-07 17:12:22 +01:00
Jérôme Magnin
8657742092 MINOR: sample: add bc_http_major
This adds the sample fetch bc_http_major. It returns the backend connection's HTTP
version encoding, which may be 1 for HTTP/0.9 to HTTP/1.1 or 2 for HTTP/2.0. It is
based on the on-wire encoding, and not the version present in the request header.
2018-12-07 15:34:39 +01:00
Olivier Houchard
4468f1cacb BUG/MEDIUM: sample: Don't treat SMP_T_METH as SMP_T_STR.
In smp_dup(), don't consider a SMP_T_METH with an unknown method the same as
SMP_T_STR. The string and string length aren't stored at the same place.

This should be backported to 1.8.
2018-12-07 15:31:43 +01:00
Christopher Faulet
f061e422f7 BUG/MINOR: stream-int: Process read0 even if no data was received in si_cs_recv
The flag CS_FL_EOS can be set while no data was received. So the flag
CS_FL_RCV_MORE is not set. In this case, the read0 was never processed by the
stream interface. To be sure to process it, the test on CS_FL_RCV_MORE has been
moved after the one on CS_FL_EOS.
2018-12-07 14:57:58 +01:00
Christopher Faulet
5f50f5e606 MINOR: mux-h1: Set CS_FL_EOS when read0 is detected and no data are pending
In h1_process(), instead of setting CS_FL_REOS in this case, it is more accurate
to set CS_FL_EOS.
2018-12-07 14:57:58 +01:00
Willy Tarreau
2e754bff23 MINOR: htx: switch to case sensitive search of lower case header names
Now that we know that htx only contains lower case header names, there
is no need anymore for looking them up in a case-insensitive manner.

Note that http_find_header() still does it because header names to
compare against may come from everywhere there.
2018-12-07 13:25:59 +01:00
Willy Tarreau
c2a10d4b4c MINOR: h2: don't turn HTX header names to lower case anymore
Since HTX stores header names in lower case already, we don't need to
do it again anymore. This increased H2 performance by 2.7% on quick
tests, now making H2 over HTX about 5.5% faster than H2 over H1.
2018-12-07 13:25:59 +01:00
Willy Tarreau
ed00e345e2 MEDIUM: ist: always turn header names to lower case
HTTP/2 and above require header names to be lower cased, while HTTP/1
doesn't care. By making lower case the standard way to store header
names in HTX, we can significantly simplify all operations applying to
header names retrieved from HTX (including, but not limited to, lookups
and lower case checks which are not needed anymore).

As a side effect of replacing memcpy() with ist2bin_lc(), a small increase
of the request rate performance of about 0.5-1% was noticed on keep-alive
traffic, very likely due to memcpy() being overkill for tiny strings.
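
A naive byte-by-byte lower-casing copy in the spirit of the above could look
like this. The name copy_lc() is hypothetical; it is not the ist2bin_lc()
implementation and only handles ASCII letters.

```c
#include <stddef.h>

/* Copies <len> bytes from <src> to <dst>, turning ASCII upper case letters
 * to lower case, and returns the number of bytes copied. No trailing zero
 * is appended.
 */
static size_t copy_lc(char *dst, const char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		char c = src[i];

		if (c >= 'A' && c <= 'Z')
			c += 'a' - 'A';
		dst[i] = c;
	}
	return len;
}
```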

This trivial patch was marked medium because it may have a visible end-user
impact (e.g. non-HTTP compliant agent, etc).
2018-12-07 13:25:59 +01:00
Christopher Faulet
e6b39942d1 BUG/MEDIUM: mux-h1: Be sure to have a conn_stream to set CS_FL_REOS in h1_recv
In the commit 6a2d33481 ("BUG/MEDIUM: h1: Set CS_FL_REOS if we had a read0."),
we set the flag CS_FL_REOS on the conn_stream when a read0 is detected. But we
must be sure to have a conn_stream first.
2018-12-07 11:43:19 +01:00