Commit Graph

59 Commits

Author SHA1 Message Date
Willy Tarreau
d8b8b5329e BUG/MAJOR: compression: initialize avail_in/next_in even during flush
For quite some time, a few users have been experiencing random crashes
when compressing with zlib, from versions 1.2.3 to 1.2.8 included.

Upon thorough investigation in zlib's deflate.c, it appeared obvious
that avail_in and next_in are used during the flush operation and need
to be initialized, while admittedly it's not obvious in the documentation.

By simply forcing both values to -1 it's possible to immediately reproduce
the exact crash that these users have been experiencing :

  (gdb) bt
  #0  0x00007fa73ce10c00 in __memcpy_sse2 () from /lib64/libc.so.6
  #1  0x00007fa73e0c5d49 in ?? () from /lib64/libz.so.1
  #2  0x00007fa73e0c68e0 in ?? () from /lib64/libz.so.1
  #3  0x00007fa73e0c73c7 in deflate () from /lib64/libz.so.1
  #4  0x00000000004dca1c in deflate_flush_or_finish (comp_ctx=0x7b6580, out=0x7fa73e5bd010, flag=2) at src/compression.c:808
  #5  0x00000000004dcb60 in deflate_flush (comp_ctx=0x7b6580, out=0x7fa73e5bd010) at src/compression.c:835
  #6  0x00000000004dbc50 in http_compression_buffer_end (s=0x7c0050, in=0x7c00a8, out=0x78adf0 <tmpbuf.24662>, end=0) at src/compression.c:249
  #7  0x000000000048bb5f in http_response_forward_body (s=0x7c0050, res=0x7c00a0, an_bit=1048576) at src/proto_http.c:7173
  #8  0x00000000004cce54 in process_stream (t=0x7bffd8) at src/stream.c:1939
  #9  0x0000000000427ddf in process_runnable_tasks () at src/task.c:238
  #10 0x0000000000419892 in run_poll_loop () at src/haproxy.c:1573
  #11 0x000000000041a4a5 in main (argc=4, argv=0x7fffcda38348) at src/haproxy.c:1933

Note that deflate_flush_or_finish() was involved in all reports.

The crash is very hard to reproduce when using regular traffic because it
requires that the combination of avail_in and next_in is inadequate so
that the memcpy() call reads out of bounds. But this can very likely
happen when the input buffer points to an area reused by another stream
when the flush has been interrupted due to a full output buffer. This
also explains why this report is recent, as dynamic buffer allocation
was introduced in 1.6.

Anyway it's not acceptable to call a function with a randomly set input
buffer. The deflate() function explicitly checks for the case where both
avail_in and next_in are null and doesn't use them in this case during a
flush, so this is the best solution.
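
A minimal sketch of that approach, assuming the fix simply clears both
fields before flushing (flush_sketch() is an illustrative name, not the
actual patch):

  #include <zlib.h>

  /* Sketch: never leave next_in/avail_in dangling from a previous call,
   * since deflate() may still look at them during a flush. */
  static int flush_sketch(z_stream *strm, int flag)
  {
      strm->next_in  = NULL; /* no input is consumed during a pure flush */
      strm->avail_in = 0;
      return deflate(strm, flag); /* flag is Z_SYNC_FLUSH or Z_FINISH */
  }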

Special thanks to Sasha Litvak, James Hartshorn and Paul Bauer for
reporting very useful stack traces which were critical to finding the
root cause of this bug.

This fix must be backported into 1.6 and 1.5, though 1.5 is less likely to
trigger this case given that it keeps its own buffers allocated all along
the session's life.
2016-08-08 16:57:48 +02:00
Vincent Bernat
02779b6263 CLEANUP: uniformize last argument of malloc/calloc
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:

@@
type T;
T *x;
@@

  x = malloc(
- sizeof(T)
+ sizeof(*x)
  )

@@
type T;
T *x;
@@

  x = calloc(1,
- sizeof(T)
+ sizeof(*x)
  )

When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
the first argument of calloc, not the last one:

@@
@@

  calloc(
+ 1,
  ...
- ,1
  )
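
For illustration, here is the combined effect of both patches on a
hypothetical call site (struct conn is a made-up example type):

  #include <stdlib.h>

  struct conn { int fd; };

  void example(void)
  {
      struct conn *c;

      c = malloc(sizeof(struct conn)); /* before: repeats the type */
      free(c);
      c = malloc(sizeof(*c));          /* after: uses the pointer name */
      free(c);

      c = calloc(sizeof(*c), 1);       /* before: "1" as the last argument */
      free(c);
      c = calloc(1, sizeof(*c));       /* after: "1" consistently first */
      free(c);
  }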
2016-04-03 14:17:42 +02:00
Christopher Faulet
3d97c90974 REORG: filters: Prepare creation of the HTTP compression filter
HTTP compression will be moved into a true filter. To prepare the ground, some
functions have been moved into a dedicated file. The idea is to keep everything
about compression algos in compression.c and everything related to filtering in
flt_http_comp.c.

For now, a header has been added to help during the transition. It will be
removed later.

Unused empty ACL keyword list was removed. The "compression" keyword
parser was moved from cfgparse.c to flt_http_comp.c.
2016-02-09 14:53:15 +01:00
Christopher Faulet
d7c9196ae5 MAJOR: filters: Add filters support
This patch adds the support of filters in HAProxy. The main idea is to have a
way to "easily" extend HAProxy by adding some "modules", called filters, that
will be able to change HAProxy's behavior in a programmatic way.

To do so, many entry points have been added in the code to let filters hook
into different steps of the processing. A filter must define a flt_ops structure
(see include/types/filters.h for details). This structure contains all the
available callbacks that a filter can define:

struct flt_ops {
       /*
        * Callbacks to manage the filter lifecycle
        */
       int  (*init)  (struct proxy *p);
       void (*deinit)(struct proxy *p);
       int  (*check) (struct proxy *p);

        /*
         * Stream callbacks
         */
        void (*stream_start)     (struct stream *s);
        void (*stream_accept)    (struct stream *s);
        void (*session_establish)(struct stream *s);
        void (*stream_stop)      (struct stream *s);

       /*
        * HTTP callbacks
        */
       int  (*http_start)         (struct stream *s, struct http_msg *msg);
       int  (*http_start_body)    (struct stream *s, struct http_msg *msg);
       int  (*http_start_chunk)   (struct stream *s, struct http_msg *msg);
       int  (*http_data)          (struct stream *s, struct http_msg *msg);
       int  (*http_last_chunk)    (struct stream *s, struct http_msg *msg);
       int  (*http_end_chunk)     (struct stream *s, struct http_msg *msg);
       int  (*http_chunk_trailers)(struct stream *s, struct http_msg *msg);
       int  (*http_end_body)      (struct stream *s, struct http_msg *msg);
       void (*http_end)           (struct stream *s, struct http_msg *msg);
       void (*http_reset)         (struct stream *s, struct http_msg *msg);
       int  (*http_pre_process)   (struct stream *s, struct http_msg *msg);
       int  (*http_post_process)  (struct stream *s, struct http_msg *msg);
       void (*http_reply)         (struct stream *s, short status,
                                   const struct chunk *msg);
};

To declare and use a filter, in the configuration, the "filter" keyword must be
used in a listener/frontend section:

  frontend test
    ...
    filter <FILTER-NAME> [OPTIONS...]

The filter referenced by the <FILTER-NAME> must declare a configuration parser
on its own name to fill the flt_ops and filter_conf fields in the proxy's
structure. An example will be provided later to make it perfectly clear.
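
As a purely hypothetical illustration (not the example announced above), a
minimal filter could fill this structure as follows; trace_* are placeholder
names and the registration step is omitted since it is not described here:

  #include <stdio.h>

  struct proxy;   /* opaque stand-ins; real types are in HAProxy headers */
  struct stream;

  /* reduced copy of flt_ops, trimmed to the callbacks used in this sketch */
  struct flt_ops_sketch {
      int  (*init)        (struct proxy *p);
      void (*deinit)      (struct proxy *p);
      void (*stream_start)(struct stream *s);
  };

  static int trace_init(struct proxy *p)
  {
      (void)p;
      printf("trace filter: proxy initialized\n");
      return 0;
  }

  static void trace_deinit(struct proxy *p) { (void)p; }

  static void trace_stream_start(struct stream *s)
  {
      (void)s;
      printf("trace filter: stream started\n");
  }

  /* how the structure would be filled; hooking it into the proxy's
   * filter_conf is left to the (not shown) configuration parser */
  static struct flt_ops_sketch trace_ops = {
      .init         = trace_init,
      .deinit       = trace_deinit,
      .stream_start = trace_stream_start,
  };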

For now, filters cannot be used in a backend section. But this is only a matter of
time. Documentation will also be added later. This is the first commit of a long
list about filters.

It is possible to have several filters on the same listener/frontend. These
filters are stored in an array of at most MAX_FILTERS elements (defined in
include/types/filters.h). Again, this will be replaced later by a list of
filters.

The filter API has been highly refactored. Main changes are:

* Now, HA supports an infinite number of filters per proxy. To do so, filters
  are stored in a list.

* Because filters are stored in a list, the filter state has been moved from the
  channel structure to the filter structure. This is cleaner because there is no
  more info about filters in the channel structure.

* It is possible to define filters on backends only. For such filters,
  stream_start/stream_stop callbacks are not called. Of course, it is possible
  to mix frontend and backend filters.

* Now, TCP streams are also filtered. All callbacks without the 'http_' prefix
  are called for all kinds of streams. In addition, 2 new callbacks were added to
  filter data exchanged through a TCP stream:

    - tcp_data: it is called when new data are available or when old unprocessed
      data are still waiting.

    - tcp_forward_data: it is called when some data can be consumed.

* New callbacks attached to channel were added:

    - channel_start_analyze: it is called when a filter is ready to process data
      exchanged through a channel. 2 new analyzers (a frontend and a backend)
      are attached to channels to call this callback. For a frontend filter, it
      is called before any other analyzer. For a backend filter, it is called
      when a backend is attached to a stream. So some processing cannot be
      filtered in that case.

    - channel_analyze: it is called before each analyzer attached to a channel,
      except the analyzers responsible for sending data.

    - channel_end_analyze: it is called when all other analyzers have finished
      their processing. A new analyzer is attached to channels to call this
      callback. For a TCP stream, this is always the last one called. For an HTTP
      one, the callback is called when a request/response ends, so it is called
      once for each request/response.

* 'session_established' callback has been removed. Everything that is done in
  this callback can be handled by 'channel_start_analyze' on the response
  channel.

* 'http_pre_process' and 'http_post_process' callbacks have been replaced by
  'channel_analyze'.

* 'http_start' callback has been replaced by 'http_headers'. This new one is
  called just before the headers are sent and the body is parsed.

* 'http_end' callback has been replaced by 'channel_end_analyze'.

* It is possible to set a forwarder for TCP channels. It was already possible to
  do it for HTTP ones.

* Forwarders can partially consume forwardable data. For this reason a new
  HTTP message state was added before HTTP_MSG_DONE : HTTP_MSG_ENDING.

Now all filters can define corresponding callbacks (http_forward_data
and tcp_forward_data). Each filter owns 2 offsets relative to buf->p, next and
forward, to track, respectively, input data already parsed but not forwarded yet
by the filter and parsed data considered as forwarded by the filter. At any time,
we have the guarantee that a filter cannot parse or forward more input than
previous ones. And, of course, it cannot forward more input than it has
parsed. Two macros have been added to retrieve these offsets: FLT_NXT and FLT_FWD.

In addition, 2 functions have been added to change the 'next size' and the
'forward size' of a filter. When a filter parses input data, it can alter these
data, so the size of these data can vary. This action has an effect on all
previous filters, which must be handled. To do so, the function
'filter_change_next_size' must be called, passing the size variation. In the
same spirit, if a filter alter forwarded data, it must call the function
'filter_change_forward_size'. 'filter_change_next_size' can be called in
'http_data' and 'tcp_data' callbacks and only these ones. And
'filter_change_forward_size' can be called in 'http_forward_data' and
'tcp_forward_data' callbacks and only these ones. The data changes are the
filter's responsibility, but with some limitations. It must not change already
parsed/forwarded data or data that previous filters have not parsed/forwarded
yet.

Because filters can be used on backends, when the backend is set for a
stream, we add the filters defined for this backend to the filter list of the
stream. But we must only do that when the backend and the frontend of the stream
are not the same. Otherwise the same filters are added a second time, leading
to undefined behavior.

The HTTP compression code had to be moved.

This simplifies the http_response_forward_body function. To do so, the way data
are forwarded has changed. Now, a filter (and only one) can forward data. In a
commit to come, this limitation will be removed to let all filters take part in
data forwarding. There are 2 new functions that filters should use to deal with
this feature:

 * flt_set_http_data_forwarder: This function sets the filter (using its id)
   that will forward data for the specified HTTP message. This is only possible
   if it was not already set by another filter _AND_ if no data were forwarded
   yet (msg->msg_state <= HTTP_MSG_BODY). It returns -1 if an error occurs.

 * flt_http_data_forwarder: This function returns the filter id that will
   forward data for the specified HTTP message. If there is no forwarder set, it
   returns -1.
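
A sketch of how a filter might use this pair; the argument order is assumed
from the description, and my_filter_id stands for the calling filter's id:

  struct http_msg;

  /* prototypes as described above; exact signatures are an assumption */
  int flt_set_http_data_forwarder(struct http_msg *msg, int filter_id);
  int flt_http_data_forwarder(struct http_msg *msg);

  /* try to become the data forwarder for a message, or back off */
  static int try_claim_forwarder(struct http_msg *msg, int my_filter_id)
  {
      if (flt_http_data_forwarder(msg) != -1)
          return 0; /* another filter already forwards the data */
      /* fails if data were already forwarded for this message */
      return flt_set_http_data_forwarder(msg, my_filter_id);
  }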

When an HTTP data forwarder is set for the response, the HTTP compression is
disabled. Of course, this is not definitive.
2016-02-09 14:53:15 +01:00
Thierry FOURNIER
136f9d34a9 MINOR: samples: rename union from "data" to "u"
The union name "data" is a little bit heavy to read in the source
code because we end up with "data.data.sint". The rename from "data" to "u"
makes it easier to read, as in "data.u.sint".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
8c542cac07 MEDIUM: samples: Use the "struct sample_data" in the "struct sample"
This patch removes the struct information stored both in the struct
sample_data and in the struct sample. Now, only the struct sample_data
contains data, and the struct sample uses the struct sample_data to store
its own data.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
07ee64ef4d MAJOR: sample: converts uint and sint in 64 bits signed integer
This patch removes the 32-bit unsigned integer and the 32-bit signed
integer. It replaces these types with a unique 64-bit signed type.

This makes integer usage easier and clarifies signed and unsigned use.
With the previous version, signed and unsigned were sometimes used in place
of one another, and sometimes a converter would lose the sign. For example,
divisions were processed as "unsigned": if one operand was negative, the
result was wrong.

Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.

There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
0786d05a04 MEDIUM: sample: change the prototype of sample-fetches functions
This patch removes the "opt" entry from the prototype of the
sample-fetches fucntions. This permits to remove some weight
in the prototype call.
2015-05-11 20:03:08 +02:00
Thierry FOURNIER
0a9a2b8cec MEDIUM: sample change the prototype of sample-fetches and converters functions
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetch and converter function prototypes.

This removes some weight from the prototype call.
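
Combined with the "opt" removal above, the prototype becomes quite small; a
sketch of the resulting shape, assuming the removed pointers are now
reachable through the sample itself:

  struct arg;
  struct stream;
  struct sample { struct stream *strm; /* ... plus sess, px, data ... */ };

  static int smp_fetch_example(const struct arg *args, struct sample *smp,
                               const char *kw, void *private)
  {
      struct stream *strm = smp->strm; /* was a dedicated parameter before */
      (void)args; (void)kw; (void)private; (void)strm;
      return 1; /* sample successfully fetched */
  }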
2015-05-11 20:01:42 +02:00
Willy Tarreau
d0d8da989b MINOR: stream: provide a few helpers to retrieve frontend, listener and origin
Expressions are quite long when using strm_sess(strm)->whatever, so let's
provide a few helpers : strm_fe(), strm_li(), strm_orig().
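
Their likely shape, sketched with reduced stand-in types (field names are
assumed):

  struct proxy;
  struct listener;

  struct session_sketch {
      struct proxy    *fe;       /* the frontend */
      struct listener *listener; /* the listener */
  };

  struct stream_sketch {
      struct session_sketch *sess;
  };

  static inline struct proxy *strm_fe(const struct stream_sketch *strm)
  {
      return strm->sess->fe;        /* replaces strm_sess(strm)->fe */
  }

  static inline struct listener *strm_li(const struct stream_sketch *strm)
  {
      return strm->sess->listener;  /* replaces strm_sess(strm)->listener */
  }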
2015-04-06 11:37:29 +02:00
Willy Tarreau
192252e2d8 MAJOR: sample: pass a pointer to the session to each sample fetch function
Many such functions need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.

So as of now, sample fetch functions will be called with this :

   - sess = NULL,  strm = NULL                     : never
   - sess = valid, strm = NULL                     : tcp-req connection
   - sess = valid, strm = valid, strm->txn = NULL  : tcp-req content
   - sess = valid, strm = valid, strm->txn = valid : http-req / http-res
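
In practice, the matrix translates into defensive checks at the top of a
fetch function; a sketch with stand-in types:

  struct http_txn;
  struct session;
  struct stream { struct http_txn *txn; };

  /* returns 0 when the sample is not available in the current rule set */
  static int smp_fetch_guard_sketch(struct session *sess, struct stream *strm)
  {
      (void)sess;
      if (!strm)
          return 0; /* tcp-req connection: only the session exists */
      if (!strm->txn)
          return 0; /* tcp-req content: no HTTP transaction yet */
      /* http-req / http-res: both strm and strm->txn are valid here */
      return 1;
  }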
2015-04-06 11:37:25 +02:00
Willy Tarreau
15e91e1b36 MAJOR: sample: don't pass l7 anymore to sample fetch functions
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.

The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
2015-04-06 11:35:53 +02:00
Willy Tarreau
eee5b51248 MAJOR: http: move http_txn out of struct stream
Now this one is dynamically allocated. It means that 280 bytes of memory
are saved per TCP stream, but more importantly that it will become
possible to remove the l7 pointer from fetches and converters since
it will be deduced from the stream and will support being null.

A lot of care was taken because it's easy to forget a test somewhere,
and the previous code used to always trust s->txn to be valid, but
all places seem to have been visited.

All HTTP fetch functions check the txn first so we shouldn't have any
issue there even when called from TCP. When branching from a TCP frontend
to an HTTP backend, the txn is properly allocated at the same time as the
hdr_idx.
2015-04-06 11:35:52 +02:00
Willy Tarreau
e36cbcb3b0 MEDIUM: stream: move the frontend's pointer to the session
Just like for the listener, the frontend is session-wide so let's move
it to the session. There are a lot of places which were changed but the
changes are minimal in fact.
2015-04-06 11:23:58 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved over there. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Willy Tarreau
418b8c0c41 MAJOR: compression: integrate support for libslz
This library is designed to emit a zlib-compatible stream with no
memory usage and to favor resource savings over compression ratio.
While zlib requires 256 kB of RAM per compression context (and can only
support 4000 connections per GB of RAM), the stateless compression
offered by libslz does not need to retain buffers between subsequent
calls. In theory this slightly reduces the compression ratio but in
practice it does not have that much of an effect since the zlib
window is limited to 32kB.

Libslz is available at :

      http://git.1wt.eu/web?p=libslz.git

It was designed for web compression and provides a lot of savings
over zlib in haproxy. Here are the preliminary results on a single
core of a core2-quad 3.0 GHz in 32-bit for only 300 concurrent
sessions visiting the home page of www.haproxy.org (76 kB) with
the default 16kB buffers :

          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS
zlib      237 Mbps    92 Mbps   145 Mbps   2.58     84M /  69M
slz       733 Mbps   380 Mbps   353 Mbps   1.93    5.9M / 4.2M

So while the compression ratio is lower, the bandwidth savings are
much more important due to the significantly lower compression cost
which allows even more data to be consumed from the servers. In the
example above, zlib became the bottleneck at 24% of the output
bandwidth. Also the difference in memory usage is obvious.

More tests run on a single core of a core i5-3320M, with 500 concurrent
users and the default 16kB buffers :

At 100% CPU (no limit) :
          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS  hits/s
zlib      480 Mbps   188 Mbps   292 Mbps   2.55     130M / 101M     744
slz      1700 Mbps   810 Mbps   890 Mbps   2.10    23.7M / 9.7M    2382

At 85% CPU (limited) :
          BW In      BW Out     BW Saved   Ratio   memory VSZ/RSS  hits/s
zlib     1240 Mbps   976 Mbps   264 Mbps   1.27     130M / 100M    1738
slz      1600 Mbps   976 Mbps   624 Mbps   1.64    23.7M / 9.7M    2210

The most important benefit really happens when the CPU usage is
limited by "maxcompcpuusage" or the BW limited by "maxcomprate" :
in order to preserve resources, haproxy throttles the compression
ratio until usage is within limits. Since slz is much cheaper, the
average compression ratio is much higher and the input bandwidth
is noticeably higher for one Gbps of output.

Other tests made with some reference files :

                           BW In     BW Out    BW Saved  Ratio  hits/s
daniels.html       zlib  1320 Mbps  163 Mbps  1157 Mbps   8.10    1925
                   slz   3600 Mbps  580 Mbps  3020 Mbps   6.20    5300

tv.com/listing     zlib   980 Mbps  124 Mbps   856 Mbps   7.90     310
                   slz   3300 Mbps  553 Mbps  2747 Mbps   5.97    1100

jquery.min.js      zlib   430 Mbps  180 Mbps   250 Mbps   2.39     547
                   slz   1470 Mbps  764 Mbps   706 Mbps   1.92    1815

bootstrap.min.css  zlib   790 Mbps  165 Mbps   625 Mbps   4.79     777
                   slz   2450 Mbps  650 Mbps  1800 Mbps   3.77    2400

So on top of saving a lot of memory, slz is constantly 2.5-3.5 times
faster than zlib and results in providing more savings for a fixed CPU
usage. For links smaller than 100 Mbps, zlib still provides a better
compression ratio, at the expense of a much higher CPU usage.

Larger input files provide slightly higher bandwidth for both libs, at
the expense of a bit more memory usage for zlib (it converges to 256kB
per connection).
2015-03-29 03:32:06 +02:00
Willy Tarreau
7b21877888 CLEANUP: compression: remove unused reset functions
It's unclear what purpose these functions used to serve; however, they
are not used anywhere, which is one good reason to remove them.
2015-03-28 22:08:25 +01:00
Willy Tarreau
9787efa97c MEDIUM: compression: split deflate_flush() into flush and finish
This function used to take a zlib-specific flag as an argument to indicate
whether a buffer flush or the end of contents was met; let's split it in two
so that we don't depend on zlib anymore.
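
The likely result of the split, sketched below; note that the backtrace in
the flush fix at the top of this log shows deflate_flush() calling
deflate_flush_or_finish() with flag=2, i.e. Z_SYNC_FLUSH:

  #include <zlib.h>

  struct comp_ctx;
  struct buffer;

  int deflate_flush_or_finish(struct comp_ctx *c, struct buffer *out,
                              int flag);

  static int deflate_flush(struct comp_ctx *c, struct buffer *out)
  {
      return deflate_flush_or_finish(c, out, Z_SYNC_FLUSH); /* mid-stream */
  }

  static int deflate_finish(struct comp_ctx *c, struct buffer *out)
  {
      return deflate_flush_or_finish(c, out, Z_FINISH);     /* end of stream */
  }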
2015-03-28 19:17:31 +01:00
Willy Tarreau
c91840aa33 MEDIUM: compression: add new "raw-deflate" compression algorithm
This algorithm is exactly the same as "deflate" without the zlib wrapper,
and used as an alternative when the browser wants "deflate". All major
browsers understand it and despite violating the standards, it is known
to work better than "deflate", at least on MSIE and some versions of
Safari. Do not use it in conjunction with "deflate", use either one or
the other since both react to the same Accept-Encoding token. Note that
the lack of Adler32 checksum makes it slightly faster.
2015-03-28 17:01:30 +01:00
Willy Tarreau
615105e7e8 MEDIUM: compression: add a distinction between UA- and config- algorithms
Thanks to MSIE/IIS, the "deflate" name is ambiguous. According to the RFC
it's a zlib-wrapped deflate stream, but IIS used to send only a raw deflate
stream, which is the only format MSIE understands for "deflate". The other
widely used browsers do support both formats. For this reason some people
prefer to emit a raw deflate stream on "deflate" to serve more users even
if that means violating the standards. Haproxy only follows the standard,
so they cannot do this.

This patch makes it possible to have one algorithm name in the configuration
and another one in the protocol. This will make it possible to have a new
configuration token to add a different algorithm so that users can decide if
they want a raw deflate or the standard one.
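
A sketch of what this implies for the algorithm descriptor (field names are
an assumption):

  /* one entry per supported algorithm */
  struct comp_algo_sketch {
      char *cfg_name;     /* name accepted in the configuration file */
      int   cfg_name_len;
      char *ua_name;      /* name matched against Accept-Encoding and
                           * advertised in Content-Encoding */
      int   ua_name_len;
      /* ... init/add_data/flush/finish/end callbacks ... */
  };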
2015-03-28 16:46:38 +01:00
Willy Tarreau
9f640a1eab CLEANUP: compression: statify all algo-specific functions
There's no reason for exporting identity_* nor deflate_*, they're only
used in the same file. Mark them static, it will make it easier to add
other algorithms.
2015-03-28 15:46:00 +01:00
Willy Tarreau
2aee2215c9 BUG/MINOR: compression: consider the expansion factor in init
When checking if the buffer is large enough, we used to rely on a fixed
size that was "apparently" enough. We need to consider the expansion
factor of deflate-encoded streams instead, which is of 5 bytes per 32kB.
The previous value was OK till 128kB buffers but became wrong past that.
It's totally harmless since we always keep the reserve when compressing,
so there's 1kB or so available, which is enough for buffers as large as
6.5 MB, but better fix the check anyway.
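
The corrected check, restated as arithmetic (a sketch of the math, not the
actual patch):

  /* deflate stored blocks may expand input by 5 bytes per started 32kB */
  static inline int deflate_margin(int buf_size)
  {
      return (buf_size + 32767) / 32768 * 5;
  }
  /* e.g. 16kB -> 5 bytes, 1MB -> 165 bytes, 6.5MB -> ~1kB, which is
   * about the reserve mentioned above */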

This fix could be backported into 1.5 since compression was added there.
2015-03-28 12:23:35 +01:00
Willy Tarreau
15530d28a4 MEDIUM: compression: don't send leading zeroes with chunk size
Till now we used to rely on a fixed maximum chunk size. Thanks to the last
commit we're now free to adjust the chunk's length before sending the
data, so we don't have to use 6 digits all the time anymore, and if
one wants buffers larger than 16 MB it is now possible.
2015-03-28 12:05:47 +01:00
Willy Tarreau
d328af5981 MEDIUM: compression: postpone buffer adjustments after compression
Till now we used to copy the pending outgoing data into the new buffer,
then compute the chunk size, then compress, then fix the chunk size,
then copy the remaining data into the destination buffer. If the
compression failed for whatever reason (eg: not enough input bytes
to push an extra block), this work still had to be performed for no
added value. It also presents the disadvantage of having to use a fixed
length to encode the chunk size.

Thanks to the body parser changes that went late into 1.5, the buffers
are not modified anymore during these operations. So this patch rearranges
operations so that they're more optimal :

1) init() prepares a new buffer and reserves space in it for pending
   outgoing data (no copy) and for chunk size
2) data are compressed
3) only if data were added to the buffer, then the old data are copied
   and the chunk size is set.

A few optimisations are still possible to go further :

  - decide whether we prefer to copy pending outgoing data from the
    old buffer to the new one, or pending incoming compressed data
    from the new one to the old one, based on the amount of outgoing
    data available. Given that pending outgoing data are rare and the
    operation could be complex in the presence of extra input data,
    it's probably better to ignore this one ;

  - compute the needed length for the chunk size. This would avoid
    sending lots of leading zeroes when not needed.
2015-03-28 11:42:29 +01:00
Thierry FOURNIER
f41a809dc9 MINOR: sample: add private argument to the struct sample_fetch
This private argument is added to prepare the integration
of the Lua fetches.
2015-02-28 23:12:31 +01:00
Willy Tarreau
e583ea583a MEDIUM: buffer: use b_alloc() to allocate and initialize a buffer
b_alloc() now allocates a buffer and initializes it to the size specified
in the pool minus the size of the struct buffer itself. This ensures that
callers do not need to care about buffer details anymore. Also this never
applies memory poisoning, which is slow and useless on buffers.
2014-12-24 23:47:32 +01:00
Willy Tarreau
474cf54a97 MINOR: buffer: reset a buffer in b_reset() and not channel_init()
We'll soon need to be able to switch buffers without touching the
channel, so let's move buffer initialization out of channel_init().
We had the same in compression.c.
2014-12-24 23:47:31 +01:00
Willy Tarreau
4f31fc2f28 BUG/MEDIUM: compression: correctly report zlib_mem
In zlib we track memory usage. The problem is that the way alloc_zlib()
and free_zlib() account for memory is different, resulting in variations
that can lead to negative zlib_mem being reported. The alloc() function
uses the requested size while the free() function uses the pool size. The
difference can happen when pools are shared with other pools of similar
size. The net effect is that zlib_mem can be reported negative with a
slowly decreasing count, and over the long term the limit will not be
enforced anymore.

The fix is simple : let's use the pool size in both cases, which is also
the exact value when it comes to memory usage.
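
A sketch of the resulting symmetry (alloc_zlib_sketch/free_zlib_sketch and
the pool calls are stand-in names; zlib_used_memory is assumed to be the
tracked counter):

  struct pool_head { unsigned int size; /* ... */ };

  static long zlib_used_memory;

  void *pool_alloc_sketch(struct pool_head *pool);
  void pool_free_sketch(struct pool_head *pool, void *area);

  static void *alloc_zlib_sketch(struct pool_head *pool)
  {
      void *area = pool_alloc_sketch(pool);
      if (area)
          zlib_used_memory += pool->size; /* was: the requested size */
      return area;
  }

  static void free_zlib_sketch(struct pool_head *pool, void *area)
  {
      pool_free_sketch(pool, area);
      zlib_used_memory -= pool->size;     /* same quantity on both sides */
  }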

This fix must be backported to 1.5.
2014-12-24 18:19:50 +01:00
Willy Tarreau
3ca5448828 BUG/MINOR: compression: correctly report incoming byte count
The fixes merged into 1.5-dev23 on compression resulted in the input
byte count not being correctly computed and always reported as zero.
2014-04-23 19:31:17 +02:00
Willy Tarreau
7f2f8d5cc3 MAJOR: http/compression: fix chunked-encoded response processing
Now that we have valid buffer offsets, we can use them to safely parse the
input and only forward when needed. Thus we can get rid of the
consumed_data accumulator, and the code now works both for chunked and
content-length, even with a server feeding one byte at a time (which
systematically broke the previous one).

It's worth noting that 0<CRLF> must always be sent after the end of data
(ie: chunk_len==0), and that the trailing CRLF is only sent in content-length
mode, because in chunked mode we'll have to pass trailers.
2014-04-22 23:15:28 +02:00
Willy Tarreau
c24715e5f7 MAJOR: http: don't update msg->sov anymore while processing the body
We used to have msg->sov updated for every chunk that was parsed. The issue
is that we want to be able to rewind after chunks were parsed in case we need
to redispatch a request and perform a new hash on the request or insert a
different server header name.

Currently, msg->sov and msg->next make parallel progress. We reached a point
where they're always equal because msg->next is initialized from msg->sov,
and has msg->sov's value subtracted from it each time msg->sov bytes are forwarded.
So we can now ensure that msg->sov can always be replaced by msg->next for
every state after HTTP_MSG_BODY where it is used as a position counter.

This allows us to keep msg->sov untouched whatever the number of chunks that
are parsed, as is needed to extract data from a POST request (eg: url_param).
However, we still need to know the starting position of the data relative to
the body, which differs by the chunk size length. We use msg->sol for this
since it's now always zero and unused in the body.

So with this patch, we have the following situation :

 - msg->sov = msg->eoh + msg->eol = size of the headers including last CRLF
 - msg->sol = length of the chunk size if any. So msg->sov + msg->sol = DATA.
 - msg->next corresponds to the byte being inspected based on the current
   state and is always >= msg->sov before starting to forward anything.

Since sov and next are updated in case of header rewriting, a rewind will
fix them both when needed. Of course, ->sol has no reason for changing in
such conditions, so it's fine to keep it relative to msg->sov.

In theory, even if a redispatch has to be performed, a transformation
occurring on the request would still work because the data moved would
still appear at the same place relative to buf->p.
2014-04-22 23:15:28 +02:00
Willy Tarreau
877e78dbef MAJOR: http: do not use msg->sol while processing messages or forwarding data
There are still some pending issues in the gzip compressor, and fixing
them requires a better handling of intermediate parsing states.

Another issue to deal with is the rewinding of a buffer during a redispatch
when a load balancing algorithm involves L7 data because the exact amount of
data to rewind is not clear. At the moment, this is handled by unwinding all
pending data, which cannot work in responses due to pipelining.

Last, having a first analysis which parses the body and another one which
restarts from where the parsing left off is wrong. Right now it only works
because we never both parse and transform in the same direction. But that
is wrong anyway.

In order to address the first issue, we'll have to use msg->eoh + msg->eol
to find the end of headers, and we still need to store the information about
the forwarded header length somewhere (msg->sol might be reused for this).

msg->sov may only be used for the start of data and not for subsequent chunks
if possible. This first implies that we stop sharing it with header length,
and stop using msg->sol there. In fact we don't need it already as it is
always zero when reaching the HTTP_MSG_BODY state. It was only updated to
reflect a copy of msg->sov.

So now, as a first step in that direction, this patch ensures that msg->sol
is never re-assigned after being set to zero and is not used anymore when
we're dealing with HTTP processing and forwarding. We'll later reuse it
differently but for now it's secured.

The patch does nothing magic, it only removes msg->sol everywhere it was
already zero and avoids setting it. In order to keep the sov-sol difference,
it now resets sov after forwarding data. In theory there's no problem here,
but the patch is still tagged major because that code is complex.
2014-04-22 23:15:28 +02:00
Thierry FOURNIER
7654c9ff44 MEDIUM: sample: Remove types SMP_T_CSTR and SMP_T_CBIN, replace it by SMP_F_CONST flags
The operations applied on types SMP_T_CSTR and SMP_T_STR are the same,
but the check code and the declarations are duplicated, because actions must
be declared for both SMP_T_C* and SMP_T_*. The declared actions and checks
are the same. This complicates the code. Only the "conv" functions can
change from "C*" to "*".

Now, if a function needs to modify an input string, it can call the new
function smp_dup(). This one duplicates the data into a trash buffer.
2014-03-17 18:06:07 +01:00
Willy Tarreau
4a4e6bca60 BUG/MEDIUM: compression: fix the output type of the compressor name
smp_fetch_res_comp_algo() returns the name of the compression algorithm
in use. The output type is set to SMP_T_STR instead of SMP_T_CSTR, which
causes any transformation to be performed without a cast. Fortunately,
the current converters do not overwrite a zero-sized area, so the result
is an empty string. Fix this to have SMP_T_CSTR instead so that the cast
is always performed using a copy before any transformation is done.
2014-03-11 16:23:05 +01:00
Willy Tarreau
ef38c39287 MEDIUM: sample: systematically pass the keyword pointer to the keyword
We have a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
  - fetch functions will solely rely on the keyword string
  - some other smp_fetch_* users do not have the pointer to the original
    keyword and were forced to pass NULL.

So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
2013-08-01 21:17:13 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
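
Sketched with a reduced list type (the exact pointer values are an
assumption; any non-zero initializer is enough to keep the object out of
BSS):

  struct list { struct list *n, *p; };

  /* Initialized List Head: non-zero, so gcc cannot move it to BSS */
  #define ILH { .n = (struct list *)1, .p = (struct list *)2 }

  /* before: a zero-initialized head could be truncated away along with
   * the variable-sized array; after: the object lands in the data section */
  static struct list dummy_keywords_head = ILH;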
2013-06-21 23:29:02 +02:00
Willy Tarreau
6d4e4e8dd2 MEDIUM: acl: remove a lot of useless ACLs that are equivalent to their fetches
The following 116 ACLs were removed because they're redundant with their
fetch function since the last commit, which allows the fetch function to be
used instead for types BOOL, INT and IP. Most places are now left with
an empty ACL keyword list that was not removed so that it's easier to
add other ACLs later.

always_false, always_true, avg_queue, be_conn, be_id, be_sess_rate, connslots,
nbsrv, queue, srv_conn, srv_id, srv_is_up, srv_sess_rate, res.comp, fe_conn,
fe_id, fe_sess_rate, dst_conn, so_id, wait_end, http_auth, http_first_req,
status, dst, dst_port, src, src_port, sc1_bytes_in_rate, sc1_bytes_out_rate,
sc1_clr_gpc0, sc1_conn_cnt, sc1_conn_cur, sc1_conn_rate, sc1_get_gpc0,
sc1_gpc0_rate, sc1_http_err_cnt, sc1_http_err_rate, sc1_http_req_cnt,
sc1_http_req_rate, sc1_inc_gpc0, sc1_kbytes_in, sc1_kbytes_out, sc1_sess_cnt,
sc1_sess_rate, sc1_tracked, sc1_trackers, sc2_bytes_in_rate,
sc2_bytes_out_rate, sc2_clr_gpc0, sc2_conn_cnt, sc2_conn_cur, sc2_conn_rate,
sc2_get_gpc0, sc2_gpc0_rate, sc2_http_err_cnt, sc2_http_err_rate,
sc2_http_req_cnt, sc2_http_req_rate, sc2_inc_gpc0, sc2_kbytes_in,
sc2_kbytes_out, sc2_sess_cnt, sc2_sess_rate, sc2_tracked, sc2_trackers,
sc3_bytes_in_rate, sc3_bytes_out_rate, sc3_clr_gpc0, sc3_conn_cnt,
sc3_conn_cur, sc3_conn_rate, sc3_get_gpc0, sc3_gpc0_rate, sc3_http_err_cnt,
sc3_http_err_rate, sc3_http_req_cnt, sc3_http_req_rate, sc3_inc_gpc0,
sc3_kbytes_in, sc3_kbytes_out, sc3_sess_cnt, sc3_sess_rate, sc3_tracked,
sc3_trackers, src_bytes_in_rate, src_bytes_out_rate, src_clr_gpc0,
src_conn_cnt, src_conn_cur, src_conn_rate, src_get_gpc0, src_gpc0_rate,
src_http_err_cnt, src_http_err_rate, src_http_req_cnt, src_http_req_rate,
src_inc_gpc0, src_kbytes_in, src_kbytes_out, src_sess_cnt, src_sess_rate,
src_updt_conn_cnt, table_avl, table_cnt, ssl_c_ca_err, ssl_c_ca_err_depth,
ssl_c_err, ssl_c_used, ssl_c_verify, ssl_c_version, ssl_f_version, ssl_fc,
ssl_fc_alg_keysize, ssl_fc_has_crt, ssl_fc_has_sni, ssl_fc_use_keysize,
2013-06-11 21:22:58 +02:00
Willy Tarreau
c5599e7c49 BUG/MEDIUM: compression: the deflate algorithm must use global settings as well
Global compression settings (windowsize and memlevel) were only considered
for the gzip algorithm but not the deflate algorithm. Since a single allocator
is used for both algos, if gzip was first initialized with memory parameters
smaller than the default, then initializing deflate afterwards with default
settings would result in overusing the small allocated areas.

To fix this, we make use of deflateInit2() for deflate_init() as well.

Thanks to Godbach for reporting this bug, introduced in 1.5-dev13 by commit
8b52bb38. No backport is needed.
2013-04-28 09:01:11 +02:00
Willy Tarreau
7f6fa69221 BUG/MINOR: fix unterminated ACL array in compression
Recent commit 727db8b4 was lacking a NULL ACL descriptor to terminate
the array, causing a random behaviour upon startup. No backport is needed.
2013-04-23 19:39:43 +02:00
William Lallemand
727db8b4ea MINOR: compression: acl "res.comp" and fetch "res.comp_algo"
Implements the "res.comp" ACL which is a boolean returning 1 when a
response has been compressed by HAProxy or 0 otherwise.

Implements the "res.comp_algo" fetch which contains the name of the
algorithm HAProxy used to compress the response.
2013-04-20 23:53:33 +02:00
William Lallemand
00bf1dee9c BUG/MEDIUM: compression: does not forward trailers
The commit bf3ae617 introduced a regression in the forwarding of
trailers in compression mode.
2012-11-23 11:12:33 +01:00
Willy Tarreau
55058a7c1e MINOR: stats: report HTTP compression stats per frontend and per backend
It was a bit frustrating to have no idea about the bandwidth saved by
HTTP compression. Now we have per-frontend and per-backend stats. The
stats on the HTTP interface are shown in a hover title in the "bytes out"
column if at least something was fed to the compressor. 3 new columns
appeared in the CSV stats output.
2012-11-22 01:07:40 +01:00
William Lallemand
072a2bf537 MINOR: compression: CPU usage limit
New option 'maxcompcpuusage' in the global section.
Sets the maximum CPU usage HAProxy can reach before stopping the
compression for new requests or decreasing the compression level of
current requests. It works like 'maxcomprate' but is based on the idle time.
2012-11-21 02:15:16 +01:00
William Lallemand
c71407657d BUG/MINOR: compression: dynamic level increase
When using the compression rate limit, the compression level didn't respect
the max compression level during a session because the test was done
on the wrong variable.
2012-11-21 02:15:16 +01:00
William Lallemand
e3a7d99062 MINOR: compression: report zlib memory usage
Show the memory usage and the max memory available for zlib.
The value stored is now the memory used instead of the remaining
available memory.
2012-11-21 02:15:16 +01:00
William Lallemand
8b52bb3878 MEDIUM: compression: use pool for comp_ctx
Use pool for comp_ctx, it is allocated during the comp_algo->init().
The allocation of comp_ctx is accounted for in the zlib_memory_available.
2012-11-21 01:56:47 +01:00
William Lallemand
bf3ae61789 MEDIUM: compression: don't compress when no data
This patch makes changes in the http_response_forward_body state
machine. It checks whether the compression algorithm has consumed data before
swapping the temporary and the input buffer. This prevents null-sized
zlib chunks.
2012-11-19 14:57:29 +01:00
Willy Tarreau
4690985fca BUG: compression: do not always increment the round counter on allocation failure
Zlib (at least 1.2 and 1.3) aborts when it fails to allocate the state, so we
must not count a round on this event. If the state allocation succeeds, it
then allocates all the 4 remaining areas at once.
2012-11-15 15:00:55 +01:00
Cyril Bonté
6162c43a0a BUILD: report zlib support in haproxy -vv
Compression algorithms are not always supported depending on build options.
"haproxy -vv" now reports if zlib is supported and lists compression algorithms
also supported.
2012-11-10 20:36:46 +01:00
Willy Tarreau
b1fbd050ec BUILD: compression: remove a build warning
gcc emits this warning while building free_zlib() :
  src/compression.c: In function `free_zlib':
  src/compression.c:403: warning: 'pool' might be used uninitialized in this function

This is not a bug as the pool cannot take other values, but let's
pre-initialize it to NULL to fix the warning.
2012-11-10 17:49:37 +01:00