88 Commits

Author SHA1 Message Date
Ilya Shipitsin
46a030cdda CLEANUP: assorted typo fixes in the code and comments
This is 11th iteration of typo fixes
2020-07-06 14:34:32 +02:00
Willy Tarreau
6be7849f39 REORG: include: move cfgparse.h to haproxy/cfgparse.h
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.

There is nothing left in common/ now so this directory must not be used
anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
dfd3de8826 REORG: include: move stream.h to haproxy/stream{,-t}.h
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.

In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.

It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
2020-06-11 10:18:58 +02:00
Willy Tarreau
a264d960f6 REORG: include: move proxy.h to haproxy/proxy{,-t}.h
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c7babd8570 REORG: include: move filters.h to haproxy/filters{,-t}.h
Just a minor change, moved the macro definitions upwards. A few caller
files were updated since they didn't need to include it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
c2b1ff04e5 REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
2020-06-11 10:18:58 +02:00
Willy Tarreau
e6ce10be85 REORG: include: move sample.h to haproxy/sample{,-t}.h
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
2020-06-11 10:18:58 +02:00
Willy Tarreau
87735330d1 REORG: include: move http_htx.h to haproxy/http_htx{,-t}.h
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
2020-06-11 10:18:57 +02:00
Willy Tarreau
0a3bd3919e REORG: include: move compression.h to haproxy/compression{,-t}.h
No change was needed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
48fbcae07c REORG: tools: split common/standard.h into haproxy/tools{,-t}.h
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.

They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
2020-06-11 10:18:57 +02:00
Willy Tarreau
16f958c0e9 REORG: include: split common/htx.h into haproxy/htx{,-t}.h
Most of the file was a large set of HTX elements manipulation functions
and few types, so splitting them allowed to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
2020-06-11 10:18:57 +02:00
Willy Tarreau
cd72d8c981 REORG: include: split common/http.h into haproxy/http{,-t}.h
So the enums and structs were placed into http-t.h and the functions
into http.h. This revealed that several files were dependeng on http.h
but not including it, as it was silently inherited via other files.
2020-06-11 10:18:57 +02:00
Willy Tarreau
2741c8c4aa REORG: include: move common/buffer.h to haproxy/dynbuf{,-t}.h
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.

This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.

Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
2020-06-11 10:18:57 +02:00
Willy Tarreau
853b297c9b REORG: include: split mini-clist into haproxy/list and list-t.h
Half of the users of this include only need the type definitions and
not the manipulation macros nor the inline functions. Moves the various
types into mini-clist-t.h makes the files cleaner. The other one had all
its includes grouped at the top. A few files continued to reference it
without using it and were cleaned.

In addition it was about time that we'd rename that file, it's not
"mini" anymore and contains a bit more than just circular lists.
2020-06-11 10:18:56 +02:00
Willy Tarreau
4c7e4b7738 REORG: include: update all files to use haproxy/api.h or api-t.h if needed
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:

  - common/config.h
  - common/compat.h
  - common/compiler.h
  - common/defaults.h
  - common/initcall.h
  - common/tools.h

The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.

In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.

No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
2020-06-11 10:18:42 +02:00
Christopher Faulet
5e896510a8 MINOR: compression/filters: Initialize the comp filter when stream is created
Since the HTX mode is the only mode to process HTTP messages, the stream is
created for a uniq transaction. The keep-alive is handled at the mux level. So,
the compression filter can be initialized when the stream is created and
released with the stream. Concretly, .channel_start_analyze and
.channel_end_analyze callback functions are replaced by .attach and .detach
ones.

With this change, it is no longer necessary to call FLT_START_FE/BE and FLT_END
analysers for the compression filter.
2020-03-06 15:36:04 +01:00
Christopher Faulet
e6a62bf796 BUG/MEDIUM: compression/filters: Fix loop on HTX blocks compressing the payload
During the payload filtering, the offset is relative to the head of the HTX
message and not its first index. This index is the position of the first block
to (re)start the HTTP analysis. It must be used during HTTP analysis but not
during the payload forwarding.

So, from the compression filter point of view, when we loop on the HTX blocks to
compress the response payload, we must start from the head of the HTX
message. To ease the loop, we use the function htx_find_offset().

This patch must be backported as far as 2.0. It depends on the commit "MINOR:
htx: Add a function to return a block at a specific an offset". So this one must
be backported first.
2020-03-06 14:12:59 +01:00
William Dauchy
8602394040 CLEANUP: compression: remove unused deinit_comp_ctx section
Since commit 27d93c3f9428 ("BUG/MAJOR: compression/cache: Make it
really works with these both filters"), we no longer use section
deinit_comp_ctx.

This should fix github issue #441

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
2020-01-15 10:58:17 +01:00
Christopher Faulet
78fbb9f991 MEDIUM: fcgi-app: Add FCGI application and filter
The FCGI application handles all the configuration parameters used to format
requests sent to an application. The configuration of an application is grouped
in a dedicated section (fcgi-app <name>) and referenced in a backend to be used
(use-fcgi-app <name>). To be valid, a FCGI application must at least define a
document root. But it is also possible to set the default index, a regex to
split the script name and the path-info from the request URI, parameters to set
or unset...  In addition, this patch also adds a FCGI filter, responsible for
all processing on a stream.
2019-09-17 10:18:54 +02:00
Christopher Faulet
fc9cfe4006 REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.

In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
2019-07-19 09:24:12 +02:00
Christopher Faulet
711ed6ae4a MAJOR: http: Remove the HTTP legacy code
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
2019-07-19 09:24:12 +02:00
Christopher Faulet
89f2b16530 MEDIUM: compression: Remove code relying on the legacy HTTP mode
The legacy HTTP callbacks were removed (comp_http_data, comp_http_chunk_trailers
and comp_http_forward_data). Functions emitting compressed chunks of data for
the legacy HTTP mode were also removed. The state for the compression filter was
updated accordingly. The compression context and the algorigttm used to compress
data are the only useful information remaining.
2019-07-19 09:18:27 +02:00
Tim Duesterhus
721d686bd1 BUG/MEDIUM: compression: Set Vary: Accept-Encoding for compressed responses
Make HAProxy set the `Vary: Accept-Encoding` response header if it compressed
the server response.

Technically the `Vary` header SHOULD also be set for responses that would
normally be compressed based off the current configuration, but are not due
to a missing or invalid `Accept-Encoding` request header or due to the
maximum compression rate being exceeded.

Not setting the header in these cases does no real harm, though: An
uncompressed response might be returned by a Cache, even if a compressed
one could be retrieved from HAProxy. This increases the traffic to the end
user if the cache is unable to compress itself, but it saves another
roundtrip to HAProxy.

see the discussion on the mailing list: https://www.mail-archive.com/haproxy@formilux.org/msg34221.html
Message-ID: 20190617121708.GA2964@1wt.eu

A small issue remains: The User-Agent is not added to the `Vary` header,
despite being relevant to the response. Adding the User-Agent header would
make responses effectively uncacheable and it's unlikely to see a Mozilla/4
in the wild in 2019.

Add a reg-test to ensure the behaviour as described in this commit message.

see issue #121
Should be backported to all branches with compression (i.e. 1.6+).
2019-06-17 18:51:43 +02:00
Christopher Faulet
86bc8df955 BUG/MEDIUM: compression/htx: Fix the adding of the last data block
The function htx_add_data_before() is buggy and cannot work. It first add a data
block and then move it before another one, passed in argument. The problem
happens when a defragmentation is done to add the new block. In this case, the
reference is no longer valid, because the blocks are rearranged. So, instead of
moving the new block before the reference, it is moved at the head of the HTX
message.

So this function has been removed. It was only used by the compression filter to
add a last data block before a TLR, EOT or EOM block. Now, the new function
htx_add_last_data() is used. It adds a last data block, after all others and
before any TLR, EOT or EOM block. Then, the next bock is get. It is the first
non-data block after data in the HTX message. The compression loop continues
with it.

This patch must be backported to 1.9.
2019-06-11 14:05:25 +02:00
Christopher Faulet
54b5e214b0 MINOR: htx: Don't use end-of-data blocks anymore
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
2019-06-05 10:12:11 +02:00
Christopher Faulet
2d7c5395ed MEDIUM: htx: Add the parsing of trailers of chunked messages
HTTP trailers are now parsed in the same way headers are. It means trailers are
converted to K/V blocks followed by an end-of-trailer marker. For now, to make
things simple, the type for trailer blocks are not the same than for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type. Probably for the end-of marker too.
2019-06-05 10:12:11 +02:00
Christopher Faulet
ee847d45d0 MEDIUM: filters/htx: Filter body relatively to the first block
The filters filtering HTX body, in the callback http_payload, must now loop on
an HTX message starting from the first block position. The offset passed as
parameter is relative to this position and not the head one. It is mandatory
because once filtered, data are now forwarded using the function
channel_htx_fwd_payload(). So the first block position is always updated.
2019-05-28 07:42:33 +02:00
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Olivier Houchard
43da3430f1 MEDIUM: compression: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Willy Tarreau
ef6fd85623 BUG/MINOR: compression: properly report compression stats in HTX mode
When HTX support was added to HTTP compression, a set of counters was missed,
namely comp_in and comp_byp, resulting in no stats being available for compression.

This must be backported to 1.9.
2019-02-04 11:48:03 +01:00
Tim Duesterhus
b229f018ee BUG/MEDIUM: compression: Rewrite strong ETags
RFC 7232 section 2.3.3 states:

> Note: Content codings are a property of the representation data,
> so a strong entity-tag for a content-encoded representation has to
> be distinct from the entity tag of an unencoded representation to
> prevent potential conflicts during cache updates and range
> requests.  In contrast, transfer codings (Section 4 of [RFC7230])
> apply only during message transfer and do not result in distinct
> entity-tags.

Thus a strong ETag must be changed when compressing. Usually this is done
by converting it into a weak ETag, which represents a semantically, but not
byte-by-byte identical response. A conversion to a weak ETag still allows
If-None-Match to work.

This should be backported to 1.9 and might be backported to every supported
branch with compression.
2019-01-29 20:26:06 +01:00
Christopher Faulet
1d3613a031 BUG/MINOR: compression: Disable it if another one is already in progress
Since the commit 9666720c8 ("BUG/MEDIUM: compression: Use the right buffer
pointers to compress input data"), the compression can be done twice. The first
time on the frontend and the second time on the backend. This may happen by
configuring the compression in a default section.

To fix the bug, when the response is checked to know if it should be compressed
or not, if the flag HTTP_MSGF_COMPRESSING is set, the compression is not
performed. It means it is already handled by a previous compression filter.

Thanks to Pieter (PiBa-NL) to report this bug.

This patch must be backported to 1.9.
2019-01-08 11:31:56 +01:00
Christopher Faulet
d238ae3a9b BUG/MINOR: compression/htx: Don't add the last block of data if it is empty
In HTX, when the compression filter analyze the EOM, it flushes the compression
context and add the last block of compressed data. But, this block can be
empty. In this case, we must ignore it.
2018-12-21 15:33:26 +01:00
Christopher Faulet
c963eb2a1d BUG/MINOR: compression/htx: Don't compress responses with unknown body length
In HTX, if the body length of a response cannot be determined, we must not try
to compress it.
2018-12-21 15:33:16 +01:00
Christopher Faulet
b61481c710 MINOR: compression: Remove the thread_local variable buf_output
By doing a c_rew() at the right place, we can avoid to use this variable. This
slightly simplifly the compression for the legacy HTTP.
2018-12-19 13:45:53 +01:00
Christopher Faulet
9666720c83 BUG/MEDIUM: compression: Use the right buffer pointers to compress input data
A bug was introduced when the buffers API was refactored. It was when wrapping
input data were compressed. the pointer b_peek(in, 0) was used instead of
"b_orig(in)". b_peek(in, 0) is in fact the same as b_head(in).
2018-12-17 13:46:38 +01:00
Christopher Faulet
27d93c3f94 BUG/MAJOR: compression/cache: Make it really works with these both filters
Caching the response with the compression enabled was totally broken. To fix the
problem, the compression must be done after caching the response. Otherwise it
needs to change the cache to store compressed and uncompressed objects for the
same ressource. So, because it is not possible for now, it is forbidden to
declare the compression filter before the cache one. To ease the configuration,
both can be implicitly declared (without "filter" keyword). The compression will
automatically be inserted after the cache.

Then, to make it works this way, the compression filter has been slighly
modified. Now, the response headers are updated after http-response rules
evaluations, instead of before. So, if the response contains a "Content-length"
header, it will be kept with the response stored in the cache. So this cached
response will be able to be served to clients not supporting the compression at
all.
2018-12-15 23:50:07 +01:00
Willy Tarreau
30925659ef CLEANUP: h1: remove some occurrences of unneeded h1.h inclusions
Several places where h1.h was included didn't need it at all since
they in fact relied on the legacy HTTP definitions.
2018-12-11 17:15:13 +01:00
Willy Tarreau
b96b77ed6e REORG: htx: merge types+proto into common/htx.h
All the HTX definition is self-contained and doesn't really depend on
anything external since it's a mostly protocol. In addition, some
external similar files (like h2) also placed in common used to rely
on it, making it a bit awkward.

This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
2018-12-11 17:15:04 +01:00
Christopher Faulet
f4a4ef7d7c MINOR: filters: Export the name of known filters
It could be useful to know if some filter is declared on a proxy or if it is
enabled on a stream.
2018-12-11 17:09:31 +01:00
Christopher Faulet
c9df7f728f MINOR: compression: Rename the function check_legacy_http_comp_flt()
To not mix it up with the legacy HTTP representation, this function has been
rename check_implicit_http_comp_flt().
2018-12-11 17:09:31 +01:00
Christopher Faulet
27ba2dc6d6 MEDIUM: htx: Rework conversion from a buffer to an htx structure
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state. When the
caller can use the function htxbuf() to get the HTX message without any update
on the underlying buffer.
2018-12-05 17:10:16 +01:00
Christopher Faulet
6e54095d0a BUG/MINOR: flt_trace/compression: Use the right flag to add the HTX support
Of course, the flag FLT_CFG_FL_HTX must be used and not
STRM_FLT_FL_HAS_FILTERS. "Fortunately", these 2 flags have the same value, so
everything worked as expected.
2018-12-04 16:43:30 +01:00
Christopher Faulet
e6902cd57c MEDIUM: compression: Adapt to be compatible with the HTX representation
Functions analyzing request and response headers have been duplicated and
adapted to support HTX messages. The callback http_payload have been implemented
to handle the data compression itself. It loops on HTX blocks and replace
uncompressed value of DATA block by compressed one. Unlike the HTTP legacy
version, there is no chunk at all. So HTX version is significantly easier.
2018-12-01 17:37:27 +01:00
Willy Tarreau
8ceae72d44 MEDIUM: init: use initcall for all fixed size pool creations
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no more user of potentially uninitialized pool now.

It has been the opportunity to remove no less than 12 constructors
and 6 init functions.
2018-11-26 19:50:32 +01:00
Willy Tarreau
0108d90c6c MEDIUM: init: convert all trivial registration calls to initcalls
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :

- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
- flt_register_keywords
2018-11-26 19:50:32 +01:00
Joseph Herlant
942eea3f5c CLEANUP: Fix typos in the http subsystem
Fix typos in code comment of the http subsystem.
2018-11-18 22:26:42 +01:00
Willy Tarreau
ab813a4b05 REORG: http: move some header value processing functions to http.c
The following functions only deal with header field values and are agnostic
to the HTTP version so they were moved to http.c :

http_header_match2(), find_hdr_value_end(), find_cookie_value_end(),
extract_cookie_value(), parse_qvalue(), http_find_url_param_pos(),
http_find_next_url_param().

Those lacking the "http_" prefix were modified to have it.
2018-09-11 10:30:25 +02:00
Willy Tarreau
35b51c6e5b REORG: http: move the HTTP semantics definitions to http.h/http.c
It's a bit painful to have to deal with HTTP semantics for each protocol
version (H1 and H2), and working on the version-agnostic code further
emphasizes the problem.

This patch creates http.h and http.c which are agnostic to the version
in use, and which borrow a few parts from proto_http and from h1. For
example the once thought h1-specific h1_char_classes array is in fact
dictated by RFC7231 and is used to parse HTTP headers. A few changes
were made to a few files which were including proto_http.h while they
only needed http.h.

Certain string definitions pre-dated the introduction of indirect
strings (ist) so some were used to simplify the definition of the known
HTTP methods. The current lookup code saves 2 kB of a heavily used table
and is faster than the previous table based lookup (typ. 14 ns vs 16
before).
2018-09-11 10:30:25 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00