pool_alloc_dirty() is the lower-level variant of pool_alloc() that never
performs memory poisoning. It should only be called directly for very large,
unstructured areas for which enabling memory poisoning would not bring
anything but could significantly hurt performance (e.g. buffers). Using
this function here provides no benefit and hurts the ability to debug.
It would be desirable to backport this; although it does not cause any
user-visible bug, it complicates debugging.
This makes the code more readable and less prone to copy-paste errors.
In addition, it makes it possible to place some __builtin_constant_p()
predicates to trigger a link-time error in case the compiler knows that
the freed area is constant. It will also produce a compile-time error if
trying to free something that is not a regular pointer (e.g. a function).
The DEBUG_MEM_STATS macro now also defines an instance for ha_free()
so that all these calls can be checked.
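As an illustration, a minimal sketch of what such a macro can look like
(not the exact implementation); link_error_freeing_a_constant() is a
hypothetical undefined symbol used as a link-time trap:

    extern void link_error_freeing_a_constant(void); /* hypothetical, never defined */

    #define ha_free(x) do {                                              \
            typeof(x) __x = (x);                                         \
            if (__builtin_constant_p((x)) || __builtin_constant_p(*(x))) \
                    link_error_freeing_a_constant();                     \
            free(*__x);                                                  \
            *__x = NULL;                                                 \
    } while (0)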
178 occurrences were converted. The vast majority of them were handled
by the following Coccinelle script, with a few cases slightly refined by
hand to better deal with "&*x" or with long lines:
@ rule @
expression E;
@@
- free(E);
- E = NULL;
+ ha_free(&E);
It was verified that the resulting code is the same, modulo a handful of
cases where the compiler optimized the temporary variable holding the
copy of the pointer slightly differently.
A non-negligible number of {free(str);str=NULL;str_len=0;} sequences are
still present in the config part (mostly header names in proxies). These
should also be cleaned up for the same reasons, and probably be turned
into ist strings.
The EOM block may be removed. The HTX_FL_EOM flag is enough. Most of the
time, to know if the end of the message is reached, we just need to have an
empty HTX message with the HTX_FL_EOM flag set. It may also be detected when
the last block of a message with the HTX_FL_EOM flag is manipulated.
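For instance, a consumer can now detect the end of the message with a check
of this kind (illustrative sketch relying on the existing htx_is_empty()
helper):

    if ((htx->flags & HTX_FL_EOM) && htx_is_empty(htx)) {
            /* all data were received and consumed: end of message */
    }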
Removing EOM blocks simplifies the HTX message filling. Indeed, there are no
more edge problems when the message ends but there is no more space to write
the EOM block. However, some parts are trickier, especially the compression
filter and the FCGI mux. The compression filter must finish the compression
on the last DATA block. Before, it was performed on the EOM block and an
extra DATA block with the checksum was added. Now, we must detect the last
DATA block to be sure to finish the compression. The FCGI mux, for its part,
must be sure to reserve the space for the empty STDIN record on the last
DATA block, while this record used to be inserted on the EOM block.
The H2 multiplexer is probably the part that benefits the most from this
change. Indeed, it is now far easier to know when to set the ES flag.
The HTX documentation has been updated accordingly.
If a server varies on the accept-encoding header and it sends a response
with an encoding we do not know (see parse_encoding_value function), we
will not store it. This will prevent unexpected errors caused by
cache collisions that could happen in accept_encoding_hash_cmp.
This variable is only needed deeply nested in a single location and clang's
static analyzer complains about a dead initialization. Reduce the scope to
satisfy clang and the human reading the function.
This patch fixes GitHub Issue #988. Commit ce9e7b2521
was not sufficient, because it fell back to a hash comparison if the bitmap
of known encodings was not acceptable, instead of directly reporting that
the cached response is not compatible.
This patch also extends the reg-test to test the hash collision that was
mentioned in #988.
Vary handling is new in 2.4, so no backport is needed.
The accept-encoding normalizer now explicitly manages a subset of
encodings, which will all have their own bit in the encoding bitmap
stored in the cache entry. This way, two requests with the same primary
key will be served the same cache entry if they both explicitly accept
the stored response's encoding, even if their respective secondary keys
are not the same and do not match the stored response's one.
The actual hash of the accept-encoding will still be used if the
response's encoding is unmanaged.
The encoding matching and the encoding weight parsing are done for every
subpart of the accept-encoding values, and a bitmap of accepted
encodings is built for every request. It is then tested upon any stored
response that has the same primary key until one with an accepted
encoding is found.
The specific "identity" and "*" accept-encoding values are managed too.
When storing a response in the cache, we also parse the content-encoding
header in order to only set the response's corresponding encoding's bit
in its cache_entry encoding bitmap.
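The matching then boils down to a simple bitwise test; a hedged sketch,
where build_accept_encoding_bitmap() and the encoding_bitmap field are
hypothetical names:

    unsigned int req_encodings = build_accept_encoding_bitmap(htx); /* hypothetical */

    if (entry->encoding_bitmap & req_encodings) {
            /* the request explicitly accepts the stored response's encoding,
             * so this entry may be served despite differing secondary hashes */
    }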
This patch fixes GitHub issue #988.
It does not need to be backported.
The accept-encoding part of the secondary key (vary) was only built out
of the first occurrence of the header. So if a client had two
accept-encoding headers, gzip and br for instance, the key would have
been built out of the gzip string. So another client that only managed
gzip would have been sent the cached resource, even if it was a br resource.
The http_find_header function is now called directly by the normalizers
so that they can manage multiple headers if needed.
A request that has more than 16 encodings will be considered illegitimate
and its response will not be stored.
This fixes GitHub issue #987.
It does not need any backport.
If any of the secondary hash normalizing functions raises an error, the
secondary hash will be unusable. In this case, the response will not be
stored anymore.
This new option allows tuning the maximum number of simultaneous
entries with the same primary key in the cache (secondary entries).
When we try to store a response in the cache and there are already
max-secondary-entries live entries in the cache, the storage will
fail (but the response will still be sent to the client).
It defaults to 10 and has no upper bound.
The secondary entry counter cannot be updated without periodically going
over all the items of a duplicates list. In order to avoid doing it too
often and impacting the cache's performance, a timestamp is added to
the cache_entry. It stores the timestamp (with second precision) of
the last iteration over the list (actually the last call of the
clear_expired_duplicates function). This way, this function will not be
called more than once per second for a given duplicates list.
Add an arbitrary maximum number of secondary entries per primary hash
(10 for now) to the cache. This prevents the cache from being filled
with duplicates of the same resource.
This works thanks to an entry counter that is kept in one of the
duplicates of the list (the last one).
When an entry is added to the list, the ebtree implementation ensures
that it will be added to the end of the existing list, so the only thing
to do to keep the counter updated is to get the previous counter from
the second-to-last entry.
Likewise, when an entry is explicitly deleted, we update the counter
from the list's last item.
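A hedged sketch of the counter update when appending a duplicate, assuming
the eb32-keyed cache_entry and a hypothetical secondary_entries_count field:

    struct eb32_node *prev = eb32_prev_dup(&new_entry->eb);

    new_entry->secondary_entries_count = prev
            ? container_of(prev, struct cache_entry, eb)->secondary_entries_count + 1
            : 1;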
The cache entries are now added into the tree even when they are not
complete yet. If we realized while trying to add a response's payload
that the shctx was full, the entry was disabled through the
disable_cache_entry function, which cleared the key field of the entry's
node, but without actually removing it from the tree. So the shctx row
could be stolen from the entry and the row's content be rewritten while
a lookup in the tree would still find a reference to the old entry. This
caused a random crash in case of cache saturation and row reuse.
This patch adds the missing removal of the node from the tree next to
the reset of the key in disable_cache_entry.
This bug was introduced by commit 3243447 ("MINOR: cache: Add entry
to the tree as soon as possible").
It does not need to be backported.
The duplicated entries (in case of vary) were not taken into account by
the "show cache" command. They are now dumped too.
A new "vary" column is added to the output. It contains the complete
secondary key (in hex format).
In case of successful unsafe method on a stored resource, the cached entry
must be invalidated (see RFC7234#4.4).
A "non-error response" is one with a 2xx (Successful) or 3xx (Redirection)
status code.
This implies that the primary hash must now be calculated on requests
that have an unsafe method (POST or PUT for instance) so that we can
disable the corresponding entries when we process the response.
The Cache-Control max-age and s-maxage directives should be followed by
a positive numerical value (see RFC 7234#5.2.1.1). According to the
specs, a sender "should not" generate a quoted-string value but we will
still accept this format.
When a response has an Age header (filled in by another cache on the
message's path) that is greater than its defined maximum age (extracted
either from cache-control directives or an expires header), it is
already stale and should not be cached.
When many concurrent requests targeting the same resource were seen, the
cache could sometimes be filled by too many partial responses, making it
impossible to cache a single one of them. This happened because the
actual tree insertion happened only after all the payload of every
response was seen. So until then, every response was added to the cache
because none of the streams knew that a similar request/response was
already being processed.
This patch consists of adding the cache_entry as soon as possible in the
tree (right after the first packet) so that the other responses do not
get cached as well (if they have the same primary key).
A "complete" flag is also added to the cache_entry so that we know if
all the payload is already stored in the entry or if it is still being
processed.
Turn the "Accept-Encoding" value to lower case before processing it.
Calculate the CRC on every token instead of on a sorted concatenation of
them all (in order to avoid copying them), then XOR all the CRCs into a
single hash (while ignoring duplicates).
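A simplified sketch of this order-independent hashing, assuming a
hypothetical next_token() tokenizer and the existing hash_crc32() helper:

    unsigned int hash = 0, seen[16];
    int nseen = 0;
    struct ist tok;

    while (next_token(&value, &tok)) {               /* hypothetical tokenizer */
            unsigned int crc = hash_crc32(istptr(tok), istlen(tok));
            int i, dup = 0;

            for (i = 0; i < nseen; i++)
                    if (seen[i] == crc)
                            dup = 1;
            if (!dup && nseen < 16) {
                    seen[nseen++] = crc;
                    hash ^= crc;                     /* order-independent merge */
            }
    }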
Since commit 3d08236cb3 HAProxy can be trivially
crashed remotely by sending an `accept-encoding` HTTP request header that
contains 16 commas.
This is because the `values` array in `accept_encoding_normalizer` accepts only
16 entries and it is not verified whether the end is reached during looping.
Fix this issue by checking the length. This patch also simplifies the ist
processing in the loop, because it manually calculated offsets and lengths,
when the ist API exposes perfectly safe functions to advance and truncate ists.
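The shape of the fix, as a hedged sketch (values and count are assumed
local variables):

    while (istlen(value)) {
            if (count >= sizeof(values) / sizeof(values[0]))
                    return 1; /* too many encodings: refuse to normalize */
            /* parse the next token into values[count++], then advance
             * 'value' past the next comma using the ist API */
    }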
I wonder whether the accept_encoding_normalizer function is able to re-use some
existing function for parsing headers that may contain lists of values. I'll
leave this evaluation up to someone else, only patching the obvious crash.
This commit is 2.4-dev specific and was merged just a few hours ago. No
backport needed.
The cache section's process-vary option takes a 0 or 1 value to disable
or enable the vary processing.
When disabled, a response containing such a header will never be cached.
When enabled, we will calculate a preliminary hash over a subset of request
headers for all incoming requests (which might come with a CPU cost); it
will be used to build a secondary key for a given request (see RFC 7234#4.1).
The default value is 0 (disabled).
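An illustrative cache section enabling it:

    cache my-cache
        total-max-size 64
        process-vary 1   # compute the preliminary hash and process Vary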
Calculate a preliminary secondary key for every request we see so that
we can have a real secondary key if the response is cacheable and
contains a manageable Vary header.
The cache's ebtree is now allowed to have multiple entries with the same
primary key. Two of those entries will be distinguished thanks to
secondary keys stored in the cache_entry (based on hashes of a subset of
their headers).
When looking for an entry in the cache (cache_use), we still use the
primary key (built the same way as before), but in case of match, we
also need to check if the entry has a vary signature. If it has one, we
need to perform an extra check based on the newly built secondary key.
We will only be able to forge a response out of the cache if both the
primary and secondary keys match one of our entries. Otherwise the
request will be forwarded to the server.
The Vary functionality is based on a secondary key that needs to be
calculated for every request to which a server answers with a Vary
header. The Vary header, which can only be found in server responses,
determines which headers of the request need to be taken into account in
the secondary key. Since we do not want to have to store all the headers
of the request until we have the response, we will pre-calculate as many
sub-hashes as there are headers that we want to manage in a Vary
context. We will only focus on a subset of headers which are likely to
be mentioned in a Vary response (accept-encoding and referer for now).
Every managed header will have its own normalization function which is
in charge of transforming the header value into a core representation,
more robust to insignificant changes that could exist between multiple
clients. For instance, two accept-encoding values mentioning the same
encodings but in different orders should give the same hash.
This patch adds a function that parses a Vary header value and checks if
all the values belong to our supported subset. It also adds the
normalization functions for our two headers, as well as utility
functions that can prebuild a secondary key for a given request and
transform it into an actual secondary key after the vary signature is
determined from the response.
Return ERR_NONE instead of 0 on success for all config callbacks that should
return ERR_* codes. There is no functional change since ERR_NONE is a macro
equal to 0, but this makes the return value more explicit.
Do not cache responses that do not have an explicit expiration time
(s-maxage or max-age Cache-Control directives or Expires header) or a
validator (ETag or Last-Modified headers) anymore, as suggested in
RFC 7234#3.
The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as
not to change the behavior of the checkcache option.
The maxage and smaxage variables were inadvertently assigned the
Cache-Control s-maxage and max-age values respectively, when it should
have been the other way around.
This can be backported to all branches from 1.8 (included).
When no Cache-Control max-age or s-maxage information is present in a
cached response, we need to parse the Expires header value (RFC 7234#5.3).
An invalid Expires date value or a date earlier than the reception date
will make the cache_entry stale upon creation.
For now, the Cache-Control and Expires headers are parsed after the
insertion of the response in the cache, so even if the parsing of the
Expires header results in an already stale entry, the entry will exist in
the cache.
The res.cache_hit sample fetch returns a boolean which is true when the
HTTP response was built out of a cache. The cache's name is returned by
the res.cache_name sample fetch.
This resolves GitHub issue #900.
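An illustrative use exposing the cache status to clients:

    http-response set-header X-Cache-Status HIT if { res.cache_hit }
    http-response set-header X-Cache-Name %[res.cache_name] if { res.cache_hit }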
If a client sends a conditional request containing an If-Modified-Since
header (and no If-None-Match header), we try to compare the date with
the one stored in the cache entry (coming either from a Last-Modified
header, or a Date header, or corresponding to the first response's
reception time). If the stored date is earlier than or equal to the
request's one, we send a "304 Not Modified" response back. Otherwise, the
stored response is sent (through a 200 OK response).
This resolves GitHub issue #821.
In order to manage "If-Modified-Since" requests, we need to keep a
reference time for our cache entries (to which the conditional request's
date will be compared).
This reference is either extracted from the "Last-Modified" header, or
the "Date" header, or the reception time of the response (in decreasing
order of priority).
The date values are converted into seconds since epoch in order to ease
comparisons and to limit storage space.
Partial support of conditional HTTP requests. This commit adds the
support of the 'If-None-Match' header (see RFC 7232#3.2).
When a client specifies a list of ETags through one or more
'If-None-Match' headers, they are all compared to the one that might have
been stored in the corresponding http cache entry until one of them
matches.
If a match happens, a specific "304 Not Modified" response is
sent instead of the cached data. This response has all the stored
headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data
is sent.
Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is
valid and also receives a "304 Not Modified" response (RFC 7234#4.3.2).
This resolves a part of GitHub issue #821.
When sent by a server for a given resource, the ETag header is
stored in the corresponding cache entry (as any other header). So in
order to perform future ETag comparisons (for subsequent conditional
HTTP requests), we keep the length of the ETag and its offset
relative to the start of the cache_entry.
If no ETag header exists, the length and offset are zero.
During the config check, the post parsing is not performed. Thus, cache
filters are not fully initialized and their cache names are never released.
To be able to release them, a flag is now set when a cache filter is fully
initialized. On deinit, if the flag is not set, it means the cache name must
be freed.
The patch should fix #849. No backport needed.
[Cf: Tim is the patch author, but I added the commit message]
Using a duplicate cache name most likely is the result of a misgenerated
configuration. There is no good reason to allow this, as the duplicate
caches can't be referred to.
This commit resolves GitHub issue #820.
It can be argued whether this is a fix for a bug or not. I'm erring on the
side of caution and marking this as a "new feature". It can be considered for
backporting to 2.2, but for other branches the risk of accidentally breaking
some working (but non-ideal) configuration might be too large.
When the cache name is left out in 'filter cache' the error message refers
to a missing '<id>'. The name of the cache is called 'name' within the docs.
Adjust the error message for consistency.
The error message was introduced in 99a17a2d91.
This commit first appeared in 1.9, thus the patch must be backported to 1.9+.
The HTX_FL_EOI flag must now be set on an HTX message when no more data are
expected. Most of the time, it must be set before adding the EOM block. Thus,
if there is no space for the EOM block, we still know that all data were
received and pushed into the HTX message. The only exception is for the
HTTP replies (deny, return...). For these messages, the flag is set after all
blocks are pushed in the message, including the EOM block, because, on error,
we remove all inserted data.
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
Most of the files dealing with error reports have to include log.h in order
to access ha_alert(), ha_warning() etc. But while these functions don't
depend on anything, log.h depends on a lot of stuff because it deals with
log-formats and samples. As a result it's impossible not to embark long
dependencies when using ha_warning() or qfprintf().
This patch moves these low-level functions to errors.h, which already
defines the error codes used at the same places. About half of the users
of log.h could be adjusted, sometimes revealing other issues such as
missing tools.h. Interestingly the total preprocessed size shrunk by
4%.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types & proto included, which is not
surprising since this code is old and it used to be the norm a decade
ago. These places were cleaned.
The list.h include was missing for LIST_ADDQ(). A few unneeded includes of
action.h were removed from certain files.
This one still relies on applet.h and stick-table.h.
A few includes had to be added, namely list-t.h in the type file and
types/proxy.h in the proto file. actions.h was including http-htx.h
but didn't need it so it was dropped.
Most of the file was a large set of HTX elements manipulation functions
and a few types, so splitting them allowed us to further reduce dependencies
and shrink the build time. Doing so revealed that a few files (h2.c,
mux_pt.c) needed haproxy/buf.h and were previously getting it through
htx.h. They were fixed.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
This is where other imported components are located. All files which
used to directly include ebtree were touched to update their include
path so that "import/" is now prefixed before the ebtree-related files.
The ebtree.h file was slightly adjusted to read compiler.h from the
common/ subdirectory (this is the only change).
A build issue was encountered when eb32sctree.h is loaded before
eb32tree.h because only the former checks for the latter before
defining type u32. This was addressed by adding the reverse ifdef
in eb32tree.h.
No further cleanup was done yet in order to keep changes minimal.
parse_cache_flt() is the registered callback for the "cache" filter keyword.
It is only called when the "cache" keyword is found on a filter line, so it
is useless to test the filter name in the callback function.
This patch should fix issue #634. It may be backported as far as 1.9.
Since the HTX mode is the only mode to process HTTP messages, the stream is
created for a unique transaction. The keep-alive is handled at the mux level.
So, the cache filter can be initialized when the stream is created and
released with the stream. Concretely, the .channel_start_analyze and
.channel_end_analyze callback functions are replaced by .attach and .detach
ones.
With this change, it is no longer necessary to call the FLT_START_FE/BE and
FLT_END analysers for the cache filter.
During the payload filtering, the offset is relative to the head of the HTX
message and not its first index. This index is the position of the first block
to (re)start the HTTP analysis. It must be used during HTTP analysis but not
during the payload forwarding.
So, from the cache point of view, when we loop on the HTX blocks to cache the
response payload, we must start from the head of the HTX message. To ease the
loop, we use the function htx_find_offset().
This patch must be backported as far as 2.0. It depends on the commit "MINOR:
htx: Add a function to return a block at a specific offset". So this one must
be backported first.
Enabling strict aliasing fails on the cache's hash which is a series of
20 bytes cast as u32. And in practice it could even fail on some archs
if the http_txn didn't guarantee the hash was properly aligned. Let's
use read_u32() to read the value and write_u32() to set it, this makes
sure the compiler emits the correct code to access these and knows about
the intentional aliasing.
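A hedged sketch of the change, assuming the txn's cache_hash byte array:

    /* instead of casting the byte array: *(unsigned int *)txn->cache_hash */
    key = read_u32(txn->cache_hash);

    /* and to mark the object as not cached anymore */
    write_u32(txn->cache_hash, 0);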
We previously relied on chunk_cat(dst, b_fromist(src)) for this but it
is not reliable as the allocated buffer is inside the expression and
may be on a temporary stack. While it's possible to allocate stack space
for a struct and return a pointer to it, it's not possible to initialize
it from a temporary variable to prevent arguments from being evaluated
multiple times. Since this is only used to append an ist after a chunk,
let's instead have a chunk_istcat() function to perform exactly this
from a native ist.
The only call place (URI computation in the cache) was updated.
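A minimal sketch of such a helper, assuming the usual struct buffer fields
(area, data, size):

    static inline int chunk_istcat(struct buffer *chk, const struct ist src)
    {
            if (chk->data + src.len > chk->size)
                    return 0; /* does not fit */
            memcpy(chk->area + chk->data, src.ptr, src.len);
            chk->data += src.len;
            return 1;
    }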
When running haproxy -c, the cache parser is trying to allocate the size
of the cache. This can be a problem in an environment where the RAM is
limited.
This patch moves the cache allocation in the post_check callback which
is not executed during a -c.
This patch may be backported at least to 2.0 and 1.9. In 1.9, the callbacks
registration mechanism is not the same. So the patch will have to be adapted. No
need to backport it to 1.8, the code is probably too different.
The recent changes to address URI issues mixed with the recent fix to
stop caching absolute URIs have caused the cache not to cache H2 requests
anymore since these ones come with a scheme and authority. Let's unbreak
this by using absolute URIs all the time, now that we keep host and
authority in sync. So what is done now is that if we have an authority,
we take the whole URI as it is as the cache key. This covers H2 and H1
absolute requests. If no authority is present (most H1 origin requests),
then we prepend "https://" and the Host header. The reason for https://
is that most of the time we don't care about the scheme, but since about
all H2 clients use this scheme, at least we can share the cache between
H1 and H2.
No backport is needed since the breakage only affects 2.1-dev.
If a request contains an absolute URI and gets its Host header field
rewritten, or just the request's URI without touching the Host header
field, it can lead to different Host and authority parts. The cache
will always concatenate the Host and the path while a server behind
would instead ignore the Host and use the authority found in the URI,
leading to incorrect content possibly being cached.
Let's simply refrain from caching absolute requests for now, which
also matches what the comment at the top of the function says. Later
we can improve this by having a special handling of the authority.
This should be backported as far as 1.8.
This reverts commit 1263540fe8.
As discussed in issues #214 and #251, this is not the correct way to
cache CORS responses, since it relies on hacking the cache to cache
the OPTIONS method which is explicitly non-cacheable and for which
we cannot rely on any standard caching semantics (cache headers etc
are not expected there). Let's roll this back for now and keep that
for a more reliable and flexible CORS-specific solution later.
The FCGI application handles all the configuration parameters used to format
requests sent to an application. The configuration of an application is grouped
in a dedicated section (fcgi-app <name>) and referenced in a backend to be used
(use-fcgi-app <name>). To be valid, a FCGI application must at least define a
document root. But it is also possible to set the default index, a regex to
split the script name and the path-info from the request URI, parameters to set
or unset... In addition, this patch also adds a FCGI filter, responsible for
all processing on a stream.
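An illustrative configuration (names and paths are examples only):

    fcgi-app php-fpm
        docroot /var/www/app
        index index.php
        path-info ^(/.+\.php)(/.*)?$

    backend php-back
        mode http
        use-fcgi-app php-fpm
        server app 127.0.0.1:9000 proto fcgi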
HTTP responses with headers that impinge upon the reserve must not be
cached. Otherwise, there is no guarantee that there will be enough space to
add the "Age" header when such cached responses are delivered.
This patch must be backported to 2.0 and 1.9. For these versions, the same
must be done for the legacy HTTP mode.
In the cache, huge HTTP headers will use several shctx blocks. When a
response is returned from the cache, these headers must be properly copied
into the corresponding HTX message by updating the pointer where to copy a
header part.
This patch must be backported to 2.0 and 1.9.
Allow HAProxy to cache responses to OPTIONS HTTP requests.
This is useful in the use case of "Cross-Origin Resource Sharing" (cors)
to cache CORS responses from API servers.
Since HAProxy does not support Vary header for now, this would be only
useful for "access-control-allow-origin: *" use case.
The current HTTP cache hash contains only the Host header and the URL path.
That said, the request method should also be added to the mix to support
caching other request methods on the same URL, e.g. GET and OPTIONS.
Default HTTP error messages are stored in an array of chunks. And since the
HTX was added, these messages are also converted to HTX and stored in another
array. But now, the first array is not used anymore because the legacy HTTP
mode was removed.
So now, only the array with the HTX messages is kept. The other one was
removed.
The old module proto_http does not exist anymore. All code dedicated to the
HTTP analysis is now grouped in the file proto_htx.c. So, to finish the
polishing after removing the legacy HTTP code, the proto_htx.{c,h} files have
been moved to http_ana.{c,h}.
In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So most of the functions in proto_http.{c,h} were removed.
Only functions to deal with the HTTP transaction have been kept. Then, the
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP.
The structure hdr_ctx was also removed because it is now unused, just like
the unused states in the enum h1_state. Note that the memory pool "hdr_idx"
was removed and "http_txn" is now smaller.
The applet delivering cached objects based on the legacy HTTP code was
removed, as was the filter callback cache_store_http_forward_data(). And the
action analyzing the response coming from the server to store it in the
cache or not was purged of the legacy HTTP code.
The function http_calc_maxage() was not updated to be HTX aware. So the header
"Cache-Control" on the response was never parsed to find "max-age" or "s-maxage"
values.
This patch must be backported to 2.0 and 1.9.
Since the commit 8f3c256f7 ("MEDIUM: cache/htx: Always store info about HTX
blocks in the cache"), it is possible to read info about a data block without
sending anything. It is possible because we rely on the function
htx_add_data(), which will try to add data without any defragmentation. In
such a case, info about the data block is skipped but doesn't count in data
sent.
No need to backport this patch, except if the commit 8f3c256f7 is backported
too.
HTTP trailers are now parsed in the same way headers are. It means trailers
are converted to K/V blocks followed by an end-of-trailer marker. For now, to
make things simple, the type for trailer blocks is not the same as for header
blocks. But the aim is to make no difference between headers and trailers by
using the same type, probably for the end-of marker too.
It was only done for the headers (including the EOH marker): data were
prefixed by the info field of these blocks. The payload and the trailers of
the messages were stored raw. The total sizes of the headers and payload were
kept in the cached object state to help output formatting.
Now, info about each HTX block is stored in the cache. Only data are allowed
to be split. Otherwise, all blocks of an HTX message are handled the same
way, both when storing a message in the cache and when delivering it from the
cache. This will help the cache implementation be more robust to internal
changes in the HTX, especially for the upcoming parsing of trailers. There is
also no more need to keep extra info in the cached object state.
In order to later allow htx_add_data() to transmit partial blocks and
avoid defragmenting the buffer, we'll need to return the number of bytes
consumed. This first modification makes the function do this and its
callers take this into account. At the moment the function still works
atomically so it returns either the block size or zero. However all
call places have been adapted to consider any value between zero and
the block size.
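A hedged sketch of the adjusted calling convention:

    size_t sent = htx_add_data(htx, data);   /* may be less than istlen(data) */

    if (sent < istlen(data)) {
            /* partial copy (or zero): keep the remaining bytes and retry
             * once some room is available in the HTX message */
    }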
The filters filtering HTX body, in the callback http_payload, must now loop on
an HTX message starting from the first block position. The offset passed as
parameter is relative to this position and not the head one. It is mandatory
because once filtered, data are now forwarded using the function
channel_htx_fwd_payload(). So the first block position is always updated.
We don't store the start-line position anymore in the HTX message. Instead we
store the first block position to analyze. For now, it is almost the same.
But once all changes have been made on this part, this position will have to
be used by HTX analyzers, and only in the analysis context, to know where the
analysis should start.
When new blocks are added to an HTX message, if the first block position is
not defined, it is set. When the block it points to is removed, it is set to
the following block. -1 remains the value to unset the position. The first
block position is unset when the HTX message is empty. It may also be unset
on a non-empty message, meaning all blocks were already analyzed.
From the HTX analyzers' point of view, this position is always set during
headers analysis. When they are waiting for a request or a response, if it is
unset, it means the analysis should wait. But once the analysis is started,
and as long as headers are not forwarded, it points to the message
start-line.
As mentioned, outside the HTX analysis, no code must rely on the first block
position. So multiplexers and applets must always use the head position to
start a loop on an HTX message.
The first block is the start-line, if defined. Otherwise it is the head of
the HTX message. So now, during HTTP analysis, lookups are all done using the
first block instead of the head. Concretely, for now, it is the same because
only one HTTP message is stored at a time in an HTX message. 1xx
informational messages are handled separately from the final response and
from each other. But it will make sense when the 1xx informational messages
and the associated final response are stored in the same HTX message.
Now, we only return the start-line. If not found, NULL is returned. No lookup
is performed and the HTX message is no longer updated. It is now the caller's
responsibility to update the position of the start-line to the right value.
So when it is not found, i.e. sl_pos is set to -1, it means the last
start-line has already been processed and the next one has not been inserted
yet.
It is mandatory to rely on this kind of guarantee to store 1xx informational
responses and the final response in the same HTX message.
The struct http_cache_applet was fully declared at the beginning
instead of just doing a forward declaration using an extern modifier.
Some linkers report warnings about a redefined symbol since these
really are two complete declarations.
The proper way to do this is to use extern on the first one and to
have a full declaration later. However it's not permitted to have
both static and extern so the change done in commit 0f2229943
("CLEANUP: cache: don't export http_cache_applet anymore") has to
be partially undone.
This should be backported to 1.9 for sanity but has no effect on
most platforms. However on 1.9 the extern keyword must also be
added to include/types/cache.h.
In the cache applet (in HTX and legacy HTTP), when a cached object is sent to
a client, the request must be consumed. It is done at the end, after the
whole response was copied into the channel's buffer. But only the outgoing
data present at the time the applet is called are consumed. Then the applet
is closed. If a request with a huge body is sent, an error is triggered
because a SHUTW is caught on an unfinished request.
Now, we consume request data as soon as possible and we do it until the end. In
fact, we don't try to shutdown the request's channel for write anymore.
This patch must be backported to 1.9 after some observation period.
The body of a cached object must not be sent in response to a HEAD request.
This works for the legacy HTTP because the parsing is performed by HTTP
analyzers _AND_ because the connection is closed at the end of the
transaction, so the body is ignored. But the applet sends it. For the HTX,
the applet must skip the body explicitly.
This patch must be backported to 1.9.
Only responses to GET requests are stored in the cache. But there is no check
on the method during the lookup. So it is possible to retrieve an object from
the cache independently of the method, provided the key of the object
matches. Now, lookups are performed only for GET and HEAD requests.
This patch must be backported to 1.9.
When the function htx_add_stline() is used, this offset is automatically set
when necessary. But the HTX cache applet adds all header blocks of the responses
manually, including the start-line. So its offset must be explicitly set by the
applet.
When everything goes well, the HTTP analyzer http_wait_for_response() looks
for the start-line in the HTX messages, calling http_find_stline(). If
necessary, the start-line offset will also be automatically set during this
stage. So the bug of the HTX cache applet does not hurt most of the time.
But, when an error occurs, HTTP response analyzers can be bypassed. In such
cases, the start-line offset of cached responses remains unset.
Some parts of the code rely on the start-line offset to process HTX messages.
Among others, when H2 responses are sent to clients, the H2 multiplexer reads
the start-line without any check, because it _MUST_ always be there. If its
offset is not set, a NULL pointer is dereferenced, leading to a segfault.
The patch must be backported to 1.9.
The cache uses the first 32 bits of the uri's hash as the key to reference
the object in the cache. It makes a special case of the value zero to mean
that the object is not in the cache anymore. The problem is that when an
object hashes as zero, it's still inserted but the eb32_delete() call is
skipped, resulting in the object still being chained in the memory area
while the block has been reclaimed and used for something else. Then when
objects which were chained below it (technically any object since zero is
at the root) are deleted, the walk through the upper object may encounter
corrupted values where valid pointers were expected.
But while this should only happen statistically once in 4 billion, the
problem gets worse when the cache-use conditions don't match the cache-store
ones, because cache-store runs with an uninitialized key, which can create
objects that will never be found by the lookup code, or worse, entries with a
zero key preventing eviction of the tree node and resulting in a crash. It's
easy to accidentally end up with such a config because the request rules
generally can't be used to decide on the response:
http-request cache-use cache if { path_beg /images }
http-response cache-store cache
In this test, mixing traffic with /images/$RANDOM and /foo/$RANDOM will
result in random keys being inserted, some of them possibly being zero,
and crashes will quickly happen.
The fix consists in 1) always initializing the transaction's cache_hash
to zero, and 2) never storing a response for which the hash has not been
calculated, as indicated by the value zero.
It is worth noting that objects hashing as value zero will never be cached,
but given that there's only one chance in 4 billion that this happens,
this is totally harmless.
This fix must be backported to 1.9 and 1.8.
We need to check if any compression filter precedes the cache filter. This is
only possible when the compression is configured in the frontend while the cache
filter is configured on the backend (via a cache-store action or
explicitly). This case cannot be detected during HAProxy startup. So in such
cases, the cache is disabled.
The patch must be backported to 1.9.
It is only true for HTX streams. The legacy code relies on ci_putblk(), which
is already aware of the reserve. It is mandatory not to fill the reserve, to
let other filters analyse data. It is especially true for the compression
filter. It needs at least 20 bytes of free space, plus at most 5 bytes per
32kB block. So if the cache fully fills the channel's buffer, the compression
will not have enough space to do its job and it will block the data
forwarding, waiting for more free space. But if the buffer is completely
filled with input data (i.e. no outgoing data), the stream will be frozen
infinitely.
This patch must be backported to 1.9. It depends on the following patches:
* BUG/MEDIUM: cache/htx: Respect the reserve when cached objects are served
from the cache
* MINOR: channel/htx: Add HTX version for some helper functions
When a chunked object is served from the cache, if the trailers are not
pushed into the channel's buffer in one go, we still have to count them in
the total bytes written into the buffer.
This patch must be backported to 1.9.
This bug exists in the HTX code and in the legacy one. When the body length
is unknown, the applet hangs. For the legacy code, it hangs because the end
of the cached object is not correctly handled and the applet is never
recalled. For the HTX code, only the beginning of the response (the first
buffer) is sent, then the applet hangs. To work in HTX, the fast forwarding
must be correctly handled.
This patch must be backported to 1.9.
[cf: the patch adding the function channel_add_input must be backported with
this one. It does not exist in 1.8 because only responses with a C-L are cached.]
As long-time changes have accumulated over time, the exported functions
of the stream-interface were almost all prefixed "si_<something>" while
most private ones (mostly callbacks) were called "stream_int_<something>".
There were still a few confusing exceptions, which were addressed to
follow this scheme:
- stream_sock_read0(), only used internally, was renamed stream_int_read0()
and made static
- stream_int_notify() is only private and was made static
- stream_int_{check_timeouts,report_error,retnclose,register_handler,update}
were renamed si_<something>.
Now it is clearer, when checking one of these, whether it risks being used
outside or not.
The cache runs in an applet, so it delivers data into the input side
of the channel's buffer. Thus it must also abort feeding the buffer
as soon as CF_SHUTR is present, not just CF_SHUTW*, since these last
ones may only appear later. There doesn't seem to be an observable
side effect of this bug, the fix probably doesn't even need to be
backported.
The HTX-specific cache code uses HTX_CACHE_* states which overlap with
the legacy HTTP states. A typo in the error handling made the state
become HTTP_CACHE_END, which equals 3 and is the value for HTX_CACHE_EOD,
which explains why we were seeing a transition to trailers and memory
corruption.
No backport needed.
Caching the response with the compression enabled was totally broken. To fix
the problem, the compression must be done after caching the response.
Otherwise we would need to change the cache to store compressed and
uncompressed objects for the same resource. So, because it is not possible
for now, it is forbidden to declare the compression filter before the cache
one. To ease the configuration, both can be implicitly declared (without the
"filter" keyword). The compression will automatically be inserted after the
cache.
Then, to make it work this way, the compression filter has been slightly
modified. Now, the response headers are updated after the http-response rules
evaluation, instead of before. So, if the response contains a
"Content-Length" header, it will be kept with the response stored in the
cache. So this cached response will be able to be served to clients not
supporting the compression at all.
The cconf variable was not initialized before the first two possible
error exits before being freed, resulting in random crashes instead
of displaying an error message if the cache ID was missing from the
filter declaration.
No backport is needed, this is exclusively 1.9.
All the HTX definition is self-contained and doesn't really depend on
anything external since it's mostly a protocol. In addition, some similar
external files (like h2) also placed in common/ used to rely on it, making it
a bit awkward.
This patch moves the two htx.h files into a single self-contained one.
The historical dependency on sample.h could be also removed since it
used to be there only for http_meth_t which is now in http.h.
As for the compression filter, the cache filter must be explicitly declared
(using the "filter" keyword) if filters other than cache are used. It is
mandatory to explicitly define the filters' order.
Documentation has been updated accordingly.
This is only true for HTX proxies. On a legacy HTTP proxy, if the compression
and the cache are both enabled, an error is triggered during HAProxy startup.
With the HTX, you can now use both in any order. If the compression is
defined before the cache, then the responses will be stored compressed. If
the compression is defined after the cache, then the responses will be stored
uncompressed. So in the latter case, when a response is served from the
cache, it will be compressed too, like any other response.
To do so, a dedicated configuration has been added on cache filters. Before,
the cache filter's configuration pointed directly to the cache it used. Now,
it is the dedicated structure cache_flt_conf. Store and use rules also point
to this structure. It is linked to the cache the filter must use. It also
contains a flags field. This will allow us to define the behavior of a cache
filter when a response is stored in the cache or delivered from it.
And now, store and use rules use a common parsing function. So if it does not
already exist, a filter is always created for both kinds of rules. The cache
filters' configuration is checked using their check callback. In the
postparser function, we only check the caches' configuration. This removes
the loop on all proxies in the postparser function.
The cache is now able to store and resend HTX messages. When an HTX message
is stored in the cache, the headers are prefixed with their block's info (a
uint32_t) containing its type and its length. Data, on their side, are stored
without any prefix; only the value is copied into the cache. 2 fields have
been added to the structure cache_entry, hdrs_len and data_len, to know the
size, in the cache, of the headers part and the data part. If the message is
chunked, the trailers are also copied, the same way as data. When the HTX
message is recreated in the cache applet, the trailers' size is found by
removing the headers length and the data length from the total object length.
Instead of calling register_data_filter() when the stream analysis starts, we
now call it when we are sure the response is cacheable. It is done in the
http_headers callback, just before the body analysis, and only if the headers
have already been cached. And during the body analysis, if an error occurs or
if the response is too big, we unregister the cache immediately.
This patch may be backported to 1.8. It is not a bug fix but a significant
improvement.
It is not possible to mix the format of messages stored in a cache. So we
reject configurations where a cache is used by both an HTX proxy and a legacy
HTTP proxy at the same time.
This commit replaces the explicit pool creation that are made in
constructors with a pool registration. Not only this simplifies the
pools declaration (it can be done on a single line after the head is
declared), but it also removes references to pools from within
constructors. The only remaining create_pool() calls are those
performed in init functions after the config is parsed, so there
is no user of a potentially uninitialized pool anymore.
This was the opportunity to remove no less than 12 constructors
and 6 init functions.
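As an illustration, a pool declaration then fits on a single line next to its
type; a sketch with an assumed pool name and a declaration macro of this
kind:

    DECLARE_POOL(pool_head_cache_st, "cache_st", sizeof(struct cache_st));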
This switches explicit calls to various trivial registration methods for
keywords, muxes or protocols from constructors to INITCALL1 at stage
STG_REGISTER. All these calls have in common to consume a single pointer
and return void. Doing this removes 26 constructors. The following calls
were addressed :
- acl_register_keywords
- bind_register_keywords
- cfg_register_keywords
- cli_register_kw
- flt_register_keywords
- http_req_keywords_register
- http_res_keywords_register
- protocol_register
- register_mux_proto
- sample_register_convs
- sample_register_fetches
- srv_register_keywords
- tcp_req_conn_keywords_register
- tcp_req_cont_keywords_register
- tcp_req_sess_keywords_register
- tcp_res_cont_keywords_register
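A typical conversion looks like this sketch (the keyword list contents are
elided):

    static struct cfg_kw_list cfg_kws = {ILH, {
            /* ... keyword entries ... */
            { 0, NULL, NULL },
    }};

    INITCALL1(STG_REGISTER, cfg_register_keywords, &cfg_kws);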
Remaining calls to si_cant_put() were all for lack of room and were
turned to si_rx_room_blk(). A few places where SI_FL_RXBLK_ROOM was
cleared by hand were converted to si_rx_room_rdy().
The now unused si_cant_put() function was removed.
A number of calls to si_cant_put() were used in fact to request being
called back once a buffer is available. These ones are not needed anymore
since si_alloc_ibuf() already sets the SI_FL_RXBLK_BUFF flag when called
in appctx context. Those called with a foreign stream-int are simply turned
to si_rx_buff_blk().
Building on 32-bit gives this:
src/cache.c: In function 'http_action_store_cache':
src/cache.c:466:4: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c:467:5: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c: In function 'cache_channel_append_age_header':
src/cache.c:578:2: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
src/cache.c:579:3: warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
It's because of the definition below added in commit e7a770c ("MINOR:
cache: Add "Age" header."):
#define CACHE_ENTRY_MAX_AGE 2147483648
Just appending "U" to mark it unsigned is enough to fix it. This only
affects 1.9, no backport is needed.
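The corrected definition thus reads:

    #define CACHE_ENTRY_MAX_AGE 2147483648U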
It doesn't make sense to limit this code to applets, as any stream
interface can use it. Let's rename it by simply dropping the "applet_"
part of the name. No other change was made except updating the comments.
This patch makes the cache capable of adding an "Age" header as defined by
RFC 7234.
During the storage of new HTTP objects we memorize the ->eoh value and
the value of the "Age" header coming from the origin server.
This information may then be reused to return the cached HTTP objects
with a new "Age" header.
May be backported to 1.8.
With this patch we avoid parsing "max-object-size" with atoi() and we store
its value as an unsigned int to prevent bad implicit conversion issues,
especially when we compare it with other unsigned values (content length).
With this patch we check that shctx_init() does not return 0.
This is possible if the maxblocks argument, which is passed as an
int, is negative due to an implicit conversion.
Must be backported to 1.8.
With this patch we support cache sizes larger than 2047 (MB) and prevent
haproxy from crashing when "total-max-size" is parsed as a negative value by
atoi(). The limit at parsing time is 4095 MB (UINT_MAX >> 20).
May be backported to 1.8.
This patch adds "max-object-size" option to the cache to limit
the size in bytes of the HTTP objects to be cached. When not provided,
the maximum size of an HTTP object is a 256th of the cache size.
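An illustrative section (total-max-size is in megabytes, max-object-size in
bytes):

    cache my-cache
        total-max-size 64        # cache size in megabytes
        max-object-size 100000   # largest cacheable object, in bytes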
This patch makes the cache capable of storing HTTP objects larger than a
buffer. It makes use of the "block by block shared object allocation" new
shctx API.
A new pointer to struct shared_block has been added to the cache applet
context to memorize the next block to be used by the HTTP cache I/O handler
http_cache_io_handler() to emit the data. Another member, named "sent",
memorizes the number of bytes already sent by this handler. So, to send an
object from the cache, http_cache_io_handler() must be called until the
"sent" counter reaches the size of this object.
This patch makes shctx capable of storing objects in several parts,
each part being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.
A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about the "last_append"
pointer, which is used to memorize the last block used by
shctx_row_data_append() to store the data.
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions' prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time, which allowed
http_get_path_from_string() to be slightly simplified since it no longer
needs to look up the end pointer.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script:
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to make them take
size_t arguments instead of int in order to match the unsigned length used
here.