The current "ADD" vs "ADDQ" is confusing because when thinking in terms
of appending at the end of a list, "ADD" naturally comes to mind, but
here it does the opposite, it inserts. Several times already it's been
incorrectly used where ADDQ was expected, the latest of which was a
fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain
why we queue the stream at the head of the server list").
Let's use more explicit (but slightly longer) names now:
LIST_ADD -> LIST_INSERT
LIST_ADDQ -> LIST_APPEND
LIST_ADDED -> LIST_INLIST
LIST_DEL -> LIST_DELETE
The same is true for MT_LISTs, including their "TRY" variant.
LIST_DEL_INIT keeps its short name to encourage to use it instead of the
lazier LIST_DELETE which is often less safe.
The change is large (~674 non-comment entries) but is mechanical enough
to remain safe. No permutation was performed, so any out-of-tree code
can easily map older names to new ones.
The list doc was updated.
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
pool_alloc_dirty() is the version below pool_alloc() that never performs
the memory poisonning. It should only be called directly for very large
unstructured areas for which enabling memory poisonning would not bring
anything but could significantly hurt performance (e.g. buffers). Using
this function here will not provide any benefit and will hurt the ability
to debug.
It would be desirable to backport this, although it does not cause any
user-visible bug, it just complicates debugging.
The default proxy was passed as a variable to all parsers instead of a
const, which is not without risk, especially when some timeout parsers used
to make some int pointers point to the default values for comparisons. We
want to be certain that none of these parsers will modify the defaults
sections by accident, so it's important to mark this proxy as const.
This patch touches all occurrences found (89).
The EOM block may be removed. The HTX_FL_EOM flags is enough. Most of time,
to know if the end of the message is reached, we just need to have an empty
HTX message with HTX_FL_EOM flag set. It may also be detected when the last
block of a message with HTX_FL_EOM flag is manipulated.
Removing EOM blocks simplifies the HTX message filling. Indeed, there is no
more edge problems when the message ends but there is no more space to write
the EOM block. However, some part are more tricky. Especially the
compression filter or the FCGI mux. The compression filter must finish the
compression on the last DATA block. Before it was performed on the EOM
block, an extra DATA block with the checksum was added. Now, we must detect
the last DATA block to be sure to finish the compression. The FCGI mux on
its part must be sure to reserve the space for the empty STDIN record on the
last DATA block while this record was inserted on the EOM block.
The H2 multiplexer is probably the part that benefits the most from this
change. Indeed, it is now fairly easier to known when to set the ES flag.
The HTX documentaion has been updated accordingly.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.
They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.
This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.
Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
Half of the users of this include only need the type definitions and
not the manipulation macros nor the inline functions. Moves the various
types into mini-clist-t.h makes the files cleaner. The other one had all
its includes grouped at the top. A few files continued to reference it
without using it and were cleaned.
In addition it was about time that we'd rename that file, it's not
"mini" anymore and contains a bit more than just circular lists.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
Since the HTX mode is the only mode to process HTTP messages, the stream is
created for a uniq transaction. The keep-alive is handled at the mux level. So,
the compression filter can be initialized when the stream is created and
released with the stream. Concretly, .channel_start_analyze and
.channel_end_analyze callback functions are replaced by .attach and .detach
ones.
With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END
analysers for the compression filter.
During the payload filtering, the offset is relative to the head of the HTX
message and not its first index. This index is the position of the first block
to (re)start the HTTP analysis. It must be used during HTTP analysis but not
during the payload forwarding.
So, from the compression filter point of view, when we loop on the HTX blocks to
compress the response payload, we must start from the head of the HTX
message. To ease the loop, we use the function htx_find_offset().
This patch must be backported as far as 2.0. It depends on the commit "MINOR:
htx: Add a function to return a block at a specific an offset". So this one must
be backported first.
Since commit 27d93c3f94 ("BUG/MAJOR: compression/cache: Make it
really works with these both filters"), we no longer use section
deinit_comp_ctx.
This should fix github issue #441
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
The FCGI application handles all the configuration parameters used to format
requests sent to an application. The configuration of an application is grouped
in a dedicated section (fcgi-app <name>) and referenced in a backend to be used
(use-fcgi-app <name>). To be valid, a FCGI application must at least define a
document root. But it is also possible to set the default index, a regex to
split the script name and the path-info from the request URI, parameters to set
or unset... In addition, this patch also adds a FCGI filter, responsible for
all processing on a stream.
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.
In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
The legacy HTTP callbacks were removed (comp_http_data, comp_http_chunk_trailers
and comp_http_forward_data). Functions emitting compressed chunks of data for
the legacy HTTP mode were also removed. The state for the compression filter was
updated accordingly. The compression context and the algorigttm used to compress
data are the only useful information remaining.
Make HAProxy set the `Vary: Accept-Encoding` response header if it compressed
the server response.
Technically the `Vary` header SHOULD also be set for responses that would
normally be compressed based off the current configuration, but are not due
to a missing or invalid `Accept-Encoding` request header or due to the
maximum compression rate being exceeded.
Not setting the header in these cases does no real harm, though: An
uncompressed response might be returned by a Cache, even if a compressed
one could be retrieved from HAProxy. This increases the traffic to the end
user if the cache is unable to compress itself, but it saves another
roundtrip to HAProxy.
see the discussion on the mailing list: https://www.mail-archive.com/haproxy@formilux.org/msg34221.html
Message-ID: 20190617121708.GA2964@1wt.eu
A small issue remains: The User-Agent is not added to the `Vary` header,
despite being relevant to the response. Adding the User-Agent header would
make responses effectively uncacheable and it's unlikely to see a Mozilla/4
in the wild in 2019.
Add a reg-test to ensure the behaviour as described in this commit message.
see issue #121
Should be backported to all branches with compression (i.e. 1.6+).
The function htx_add_data_before() is buggy and cannot work. It first add a data
block and then move it before another one, passed in argument. The problem
happens when a defragmentation is done to add the new block. In this case, the
reference is no longer valid, because the blocks are rearranged. So, instead of
moving the new block before the reference, it is moved at the head of the HTX
message.
So this function has been removed. It was only used by the compression filter to
add a last data block before a TLR, EOT or EOM block. Now, the new function
htx_add_last_data() is used. It adds a last data block, after all others and
before any TLR, EOT or EOM block. Then, the next bock is get. It is the first
non-data block after data in the HTX message. The compression loop continues
with it.
This patch must be backported to 1.9.
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
The filters filtering HTX body, in the callback http_payload, must now loop on
an HTX message starting from the first block position. The offset passed as
parameter is relative to this position and not the head one. It is mandatory
because once filtered, data are now forwarded using the function
channel_htx_fwd_payload(). So the first block position is always updated.
When HTX support was added to HTTP compression, a set of counters was missed,
namely comp_in and comp_byp, resulting in no stats being available for compression.
This must be backported to 1.9.
RFC 7232 section 2.3.3 states:
> Note: Content codings are a property of the representation data,
> so a strong entity-tag for a content-encoded representation has to
> be distinct from the entity tag of an unencoded representation to
> prevent potential conflicts during cache updates and range
> requests. In contrast, transfer codings (Section 4 of [RFC7230])
> apply only during message transfer and do not result in distinct
> entity-tags.
Thus a strong ETag must be changed when compressing. Usually this is done
by converting it into a weak ETag, which represents a semantically, but not
byte-by-byte identical response. A conversion to a weak ETag still allows
If-None-Match to work.
This should be backported to 1.9 and might be backported to every supported
branch with compression.
Since the commit 9666720c8 ("BUG/MEDIUM: compression: Use the right buffer
pointers to compress input data"), the compression can be done twice. The first
time on the frontend and the second time on the backend. This may happen by
configuring the compression in a default section.
To fix the bug, when the response is checked to know if it should be compressed
or not, if the flag HTTP_MSGF_COMPRESSING is set, the compression is not
performed. It means it is already handled by a previous compression filter.
Thanks to Pieter (PiBa-NL) to report this bug.
This patch must be backported to 1.9.
In HTX, when the compression filter analyze the EOM, it flushes the compression
context and add the last block of compressed data. But, this block can be
empty. In this case, we must ignore it.
A bug was introduced when the buffers API was refactored. It was when wrapping
input data were compressed. the pointer b_peek(in, 0) was used instead of
"b_orig(in)". b_peek(in, 0) is in fact the same as b_head(in).
Caching the response with the compression enabled was totally broken. To fix the
problem, the compression must be done after caching the response. Otherwise it
needs to change the cache to store compressed and uncompressed objects for the
same ressource. So, because it is not possible for now, it is forbidden to
declare the compression filter before the cache one. To ease the configuration,
both can be implicitly declared (without "filter" keyword). The compression will
automatically be inserted after the cache.
Then, to make it works this way, the compression filter has been slighly
modified. Now, the response headers are updated after http-response rules
evaluations, instead of before. So, if the response contains a "Content-length"
header, it will be kept with the response stored in the cache. So this cached
response will be able to be served to clients not supporting the compression at
all.
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.
This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.