When an error file was loaded, the flag HTX_SL_F_XFER_LEN was never set on the
HTX start line because of a bug. During the headers parsing, the flag
H1_MF_XFER_LEN is never set on the h1m. But it was the condition to set
HTX_SL_F_XFER_LEN on the HTX start-line. Instead, we must only rely on the flags
H1_MF_CLEN or H1_MF_CHNK.
Because of this bug, it was impossible to keep a connection alive for a response
generated by HAProxy. Now the flag HTX_SL_F_XFER_LEN is set when an error file
have a content length (chunked responses are unsupported at this stage) and the
connection may be kept alive if there is no connection header specified to
explicitly close it.
This patch must be backported to 2.0 and 1.9.
In HTTP, the request authority, if any, and the Host header must be identical
(excluding any userinfo subcomponent and its "@" delimiter). So now, during the
request analysis, when the Host header is updated, the start-line is also
updated. The authority of an absolute URI is changed accordingly. Symmetrically,
if the URI is changed, if it contains an authority, then then Host header is
also changed. In this latter case, the flags of the start-line are also updated
to reflect the changes on the URI.
Empty error files may be used to disable the sending of any message for specific
error codes. A common use-case is to use the file "/dev/null". This way the
default error message is overridden and no message is returned to the client. It
was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such
messages triggered an error.
This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.
Default HTTP error messages are stored in an array of chunks. And since the HTX
was added, these messages are also converted in HTX and stored in another
array. But now, the first array is not used anymore because the legacy HTTP mode
was removed.
So now, only the array with the HTX messages are kept. The other one was
removed.
<head> and <tail> fields are now signed 32-bits integers. For an empty HTX
message, these fields are set to -1. So the field <used> is now useless and can
safely be removed. To know if an HTX message is empty or not, we just compare
<head> against -1 (it also works with <tail>). The function htx_nbblks() has
been added to get the number of used blocks.
Since the HTX is the default mode for all proxies, HTTP and TCP, we must
initialize all HTX error messages for all HTX-aware proxies and not only for
HTTP ones. It is required to support HTTP upgrade for TCP proxies.
This patch must be backported to 2.0.
Since the commit b75b5eaf ("MEDIUM: htx: 1xx messages are now part of the final
reponses"), these messages are part of the response and should not contain
EOM. This block is skipped during responses parsing, but analyzers still add it
for "100-Continue" and "103-Eraly-Hints". It can also be added for error files
with 1xx status code.
Now, when HAProxy generate such transitional responses, it does not emit EOM
blocks. And informational messages are now forbidden in error files.
This patch must be backported to 2.0.
Everywhere the value length of a block is changed, calling the function
htx_set_blk_value_len(), the HTX message must be updated. But at many places,
because of the recent changes in the HTX structure, this update was only
partially done. tail_addr and head_addr values were not systematically updated.
In fact, the function htx_set_blk_value_len() was designed as an internal
function to the HTX API. And we used it from outside by convenience. But it is
really painfull and error prone to let the caller update the HTX message. So
now, we use the function htx_change_blk_value_len() wherever is possible. It
changes the value length of a block and updates the HTX message accordingly.
This patch must be backported to 2.0.
Instead of using the macro MAX_HTTP_HDR to limit the number of headers parsed
before throwing an error, we now use the custom global variable
global.tune.max_http_hdr.
This patch must be backported to 1.9.
In an HTX message, it may have 2 available rooms to store a new block. The first
one is between the blocks and their payload. Blocks are added starting from the
end of the buffer and their payloads are added starting from the begining. So
the first free room is between these 2 edges. The second one is at the begining
of the buffer, when we start to wrap to add new payloads. Once we start to use
this one, the other one is ignored until the next defragmentation of the HTX
message.
In theory, there is no problem. But in practice, some lacks in the HTX structure
force us to defragment too often HTX messages to always be in a known state. The
second free room is not tracked as it should do and the first one may be easily
corrupted when rewrites happen.
So to fix the problem and avoid unecessary defragmentation, the HTX structure
has been refactored. The front (the block's position of the first payload before
the blocks) is no more stored. Instead we keep the relative addresses of 3 edges:
* tail_addr : The start address of the free space in front of the the blocks
table
* head_addr : The start address of the free space at the beginning
* end_addr : The end address of the free space at the beginning
Here is the general view of the HTX message now:
head_addr end_addr tail_addr
| | |
V V V
+------------+------------+------------+------------+------------------+
| | | | | |
| PAYLOAD | Free space | PAYLOAD | Free space | Blocks area |
| ==> | 1 | ==> | 2 | <== |
+------------+------------+------------+------------+------------------+
<head_addr> is always lower or equal to <end_addr> and <tail_addr>. <end_addr>
is always lower or equal to <tail_addr>.
In addition;, to simplify everything, the blocks area are now contiguous. It
doesn't wrap anymore. So the head is always the block with the lowest position,
and the tail is always the one with the highest position.
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same. But
once all changes will be made on this part, this position will have to be used
by HTX analyzers, and only in the analysis context, to know where the analyse
should start.
When new blocks are added in an HTX message, if the first block position is not
defined, it is set. When the block pointed by it is removed, it is set to the
block following it. -1 remains the value to unset the position. the first block
position is unset when the HTX message is empty. It may also be unset on a
non-empty message, meaning every blocks were already analyzed.
From HTX analyzers point of view, this position is always set during headers
analysis. When they are waiting for a request or a response, if it is unset, it
means the analysis should wait. But once the analysis is started, and as long as
headers are not forwarded, it points to the message start-line.
As mentionned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to start
a loop on an HTX message.
The first block is the start-line, if defined. Otherwise it the head of the HTX
message. So now, during HTTP analysis, lookup are all done using the first block
instead of the head. Concretely, for now, it is the same because only one HTTP
message is stored at a time in an HTX message. 1xx informational messages are
handled separatly from the final reponse and from each other. But it will make
sense when the 1xx informational messages and the associated final reponse will
be stored in the same HTX message.
Since the HTX start-line is now referenced by position instead of by its payload
address, it is fairly easier to replace it. No need to search the rigth block to
find the start-line comparing the payloads address. It just enough to get the
block at the position sl_pos.
Now, we only return the start-line. If not found, NULL is returned. No lookup is
performed and the HTX message is no more updated. It is now the caller
responsibility to update the position of the start-line to the right value. So
when it is not found, i.e sl_pos is set to -1, it means the last start-line has
been already processed and the next one has not been inserted yet.
It is mandatory to rely on this kind of warranty to store 1xx informational
responses and final reponse in the same HTX message.
The head of an HTX message is heavily used whereas the wrap position is only
used when a block is added or removed. So it is more logical to store the head
position in the HTX message instead of the wrap one. The wrap position can be
easily deduced. To get it, the new function htx_get_wrap() may be used.
When tests are performed on the HTX mode during HAProxy startup, only HTTP
proxies are considered. It is important because, since the commit 1d2b586cd
("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is
enabled on all proxies by default. But for TCP proxies, it is "deactivated".
This patch must be backported to 1.9.
When a header is added or when a data block is added before another one, the
blocks position may be changed (but not their payloads position). For instance,
when a header is added, we move the block just before the EOH, if any. When the
payloads wraps, it is pretty annoying because we loose the last inserted
block. It is neither the tail nor the head. And it is not the front either.
It is a design problem. Waiting for fixing this problem, we force a
defragmentation in such case. Anyway, it should be pretty rare, so it's not
really critical.
This patch must be backported to 1.9.
http_find_stline() carefully verifies that it finds a start line otherwise
returns NULL when not found. But a few calling functions ignore this NULL
in return and dereference this pointer without checking. Let's add the
test where needed in the callers. If it turns out that over the long term
a start line is mandatory, then the test will be removed and the faulty
function will have to be simplified.
This must be backported to 1.9.
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.
This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
During startup, after the configuration parsing, all HTTP error messages
(errorloc, errorfile or default messages) are converted into HTX messages and
stored in dedicated buffers. We use it to return errors in the HTX analyzers
instead of using ugly OOB blocks.
Instead, we now use the htx_sl coming from the HTX message. It avoids to have
too H1 specific code in version-agnostic parts. Of course, the concept of the
start-line is higly influenced by the H1, but the structure htx_sl can be
adapted, if necessary. And many things depend on a start-line during HTTP
analyzis. Using the structure htx_sl also avoid boring conversions between HTX
version and H1 version.
The HTX start-line is now a struct. It will be easier to extend, if needed. Same
info can be found, of course. In addition it is now possible to set flags on
it. It will be used to set some infos about the message.
Some macros and functions have been added in proto/htx.h to help accessing
different parts of the start-line.
This file will host all functions to manipulate HTTP messages using the HTX
representation. Functions in this file will be able to be called from anywhere
and are mainly related to the HTTP semantics.