haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-15 01:21:46 +02:00

Author	SHA1	Message	Date
Christopher Faulet	a39d8ad086	MINOR: mux-h1: Set hdrs_bytes on the SL when an HTX message is produced	2019-05-28 07:42:12 +02:00
Christopher Faulet	33543e73a2	MINOR: h2/htx: Set hdrs_bytes on the SL when an HTX message is produced	2019-05-28 07:42:12 +02:00
Christopher Faulet	05c083ca8d	MINOR: htx: Add a field to set the memory used by headers in the HTX start-line The field hdrs_bytes has been added in the structure htx_sl. It should be used to set how many bytes are held by all headers, from the start-line to the corresponding EOH block. It must be set to -1 if it is unknown.	2019-05-28 07:42:12 +02:00
Christopher Faulet	2f6edc84a8	MINOR: mux-h2/htx: Support zero-copy when possible in h2_rcv_buf() If the channel's buffer is empty and the message is small enough, we can swap the H2S buffer with the channel one.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9cdd5036f3	MINOR: stream-int: Don't use the flag CO_RFL_KEEP_RSV anymore in si_cs_recv() Because channel_recv_max() always returns the right value, for HTX and legacy streams, we don't need to set this flag. The multiplexers don't use it anymore.	2019-05-28 07:42:12 +02:00
Christopher Faulet	8a9ad4c0e8	MINOR: mux-h2: Use the count value received from the SI in h2_rcv_buf() Now, the SI calls h2_rcv_buf() with the right count value. So we can rely on it. Unlike the H1 multiplexer, it is fairly easier for the H2 multiplexer because the HTX message already exists, we only transfer blocks from the H2S to the channel. And this part is handled by htx_xfer_blks().	2019-05-28 07:42:12 +02:00
Christopher Faulet	30db3d737b	MEDIUM: mux-h1: Use the count value received from the SI in h1_rcv_buf() Now, the SI calls h1_rcv_buf() with the right count value. So we can rely on it. During the parsing, we now really respect this value to be sure to never exceed it. To do so, once headers are parsed, we should estimate the size of the HTX message before copying data.	2019-05-28 07:42:12 +02:00
Christopher Faulet	156852b613	BUG/MINOR: htx: Change htx_xfer_blk() to also count metadata This patch makes the function more accurate. Thanks to the function htx_get_max_blksz(), the transfer of data has been simplified. Note that now the total number of bytes copied (metadata + payload) is returned. This slightly changes how the function is used in the H2 multiplexer.	2019-05-28 07:42:12 +02:00
Christopher Faulet	a3f1550dfa	MEDIUM: http/htx: Perform analysis relatively to the first block The first block is the start-line, if defined. Otherwise it is the head of the HTX message. So now, during HTTP analysis, lookups are all done using the first block instead of the head. Concretely, for now, it is the same because only one HTTP message is stored at a time in an HTX message. 1xx informational messages are handled separately from the final response and from each other. But it will make sense when the 1xx informational messages and the associated final response are stored in the same HTX message.	2019-05-28 07:42:12 +02:00
Christopher Faulet	7b7d507a5b	MINOR: http/htx: Use sl_pos directly to replace the start-line Since the HTX start-line is now referenced by position instead of by its payload address, it is much easier to replace it. No need to search for the right block by comparing payload addresses. It is enough to get the block at the position sl_pos.	2019-05-28 07:42:12 +02:00
Christopher Faulet	297fbb45fe	MINOR: htx: Replace the function http_find_stline() by http_get_stline() Now, we only return the start-line. If not found, NULL is returned. No lookup is performed and the HTX message is no longer updated. It is now the caller's responsibility to update the position of the start-line to the right value. So when it is not found, i.e. sl_pos is set to -1, it means the last start-line has already been processed and the next one has not been inserted yet. It is mandatory to rely on this kind of guarantee to store 1xx informational responses and the final response in the same HTX message.	2019-05-28 07:42:12 +02:00
Christopher Faulet	b77a1d26a4	MINOR: mux-h2/htx: Get the start-line from the head when HEADERS frame is built In the H2 multiplexer, when a HEADERS frame is built before sending it, we have the guarantee that the start-line is the head of the HTX message. It is safer to rely on this fact than on the sl_pos value. For now, it's safe to use sl_pos in muxes because HTTP 1xx messages are considered as full messages in HTX and only one HTTP message can be stored at a time in HTX. But we are trying to handle 1xx messages as a part of the response message. In this way, an HTTP response will be the sum of all 1xx informational messages followed by the final response. So it will be possible to have several start-lines in the same HTX message. And sl_pos will point to the first unprocessed start-line from the analyzers' point of view.	2019-05-28 07:42:12 +02:00
Christopher Faulet	9c66b980fa	MINOR: htx: Store start-line block's position instead of address of its payload Nothing much to say. This change is just mandatory to consider 1xx informational messages as part of a response.	2019-05-28 07:42:12 +02:00
Christopher Faulet	28f29c7eea	MINOR: htx: Store the head position instead of the wrap one The head of an HTX message is heavily used whereas the wrap position is only used when a block is added or removed. So it is more logical to store the head position in the HTX message instead of the wrap one. The wrap position can be easily deduced. To get it, the new function htx_get_wrap() may be used.	2019-05-28 07:42:12 +02:00
Christopher Faulet	429b91d308	MINOR: htx: Remove the macro IS_HTX_SMP() and always use IS_HTX_STRM() instead The macro IS_HTX_SMP() is only used at a place, in a context where the stream always exists. So, we can remove it to use IS_HTX_STRM() instead.	2019-05-28 07:42:12 +02:00
Willy Tarreau	b01302f9ac	MEDIUM: config: now alert when two servers have the same name We've been emitting warnings for over 5 years (since 1.5-dev22) about configs accidentally carrying multiple servers with the same name in the same backend, and this starts to cause some real trouble in dynamic environments since it's still very difficult to accurately process a state-file and we still can't transport a server's name over the peers protocol because of this. It's about time to force users to fix their configs if they still haven't, given that there is zero technical justification for doing this, beyond the "yyp" (or copy-paste accident) when editing the config. The message remains as clear as before, indicating the file and lines of the conflict so that the user can easily fix it.	2019-05-27 19:31:06 +02:00
Willy Tarreau	c3b5958255	BUG/MEDIUM: threads: fix double-word CAS on non-optimized 32-bit platforms On armv7 haproxy doesn't work because of the fixes on the double-word CAS. There are two issues. The first one is that the last argument in case of dwcas is a pointer to the set of value and not a value ; the second is that it's not enough to cast the data as (void*) since it will be a single word. Let's fix this by using the pointers as an array of long. This was tested on i386, armv7, x86_64 and aarch64 and it is now fine. An alternate approach using a struct was attempted as well but it used to produce less optimal code. This fix must be backported to 1.9. This fixes github issue #105. Cc: Olivier Houchard <ohouchard@haproxy.com>	2019-05-27 17:40:59 +02:00
Willy Tarreau	bff005ae58	BUG/MEDIUM: queue: fix the tree walk in pendconn_redistribute. In pendconn_redistribute() we scan the queue using eb32_next() on the node we've just deleted, which is wrong since the node is not in the tree anymore, and it could dereference one node that has already been released by another thread. Note that we cannot use eb32_first() in the loop here instead because we need to skip pendconns having SF_FORCE_PRST. Instead, let's keep a copy of the next node before deleting it. In addition, the pendconn retrieved there is wrong, it uses &node as the pointer instead of node, resulting in very quick crashes when the server list is scanned. Fortunately this only happens when "option redispatch" is used in conjunction with "maxconn" on server lines, "cookie" for the stickiness, and when a server goes down with entries in its queue. This bug was introduced by commit 0355dabd7 ("MINOR: queue: replace the linked list with a tree") so the fix must be backported to 1.9.	2019-05-27 10:29:59 +02:00
Willy Tarreau	b6195ef2a6	BUG/MAJOR: lb/threads: make sure the avoided server is not full on second pass In fwrr_get_next_server(), we optionally pass a server to avoid. It usually points to the current server during a redispatch operation. If this server is usable, an "avoided" pointer is set and we continue to look for another server. If in the end no other server is found, then we fall back to this avoided one, which is still better than nothing. The problem that may arise with threads is that in the mean time, this avoided server might have received extra connections and might not be usable anymore. This causes it to be queued a second time in the "full" list and the loop to search for a server again, ending up on this one again and so on. This patch makes sure that we break out of the loop when we have to pick the avoided server. It's probably what the code intended to do as the current break statement causes fwrr_update_position() and fwrr_dequeue_srv() to be called again on the avoided server. It must be backported to 1.9 and 1.8, and seems appropriate for older versions though it's unclear what the impact of this bug might be there since the race doesn't exist and we're left with the double update of the server's position.	2019-05-27 10:29:59 +02:00
Willy Tarreau	d6a7850200	MINOR: cli/activity: add 3 general purpose counters in development mode The unused fd_del and fd_skip were being abused during debugging sessions as general purpose event counters. With their removal, let's officially have dedicated counters for such use cases. These counters are called "ctr0".."ctr2" and are listed at the end when DEBUG_DEV is set.	2019-05-27 07:03:38 +02:00
Willy Tarreau	394c9b4215	MINOR: cli/activity: remove "fd_del" and "fd_skip" from show activity These variables are never set anymore and were always reported as zero.	2019-05-27 06:59:14 +02:00
Ilya Shipitsin	0590f44254	BUILD: ssl: fix latest LibreSSL reg-test error starting with OpenSSL 1.0.0 recommended way to disable compression is using SSL_OP_NO_COMPRESSION when creating context. manipulations with SSL_COMP_get_compression_methods, sk_SSL_COMP_num are only required for OpenSSL < 1.0.0	2019-05-26 21:26:02 +02:00
Willy Tarreau	08e2b41e81	BUILD: connections: shut up gcc about impossible out-of-bounds warning Since commit 88698d9 ("MEDIUM: connections: Add a way to control the number of idling connections.") when building without threads, gcc complains that the operations made on the idle_orphan_conns[] list is out of bounds, which is always false since 1) <i> can only equal zero, and 2) given it's equal to <tid> we never even enter the loop. But as usual it thinks it knows better, so let's mask the origin of this <i> value to shut it up. Another solution consists in making <i> unsigned and adding an explicit range check.	2019-05-26 11:54:20 +02:00
Willy Tarreau	9c218e7521	MAJOR: mux-h2: switch to next mux buffer on buffer full condition. Now when we fail to send because the mux buffer is full, before giving up and marking MFULL, we try to allocate another buffer in the mux's ring to try again. Thanks to this (and provided there are enough buffers allocated to the mux's ring), a single stream picked in the send_list cannot steal all the mux's room at once. For this, we expand the ring size to 31 buffers as it seems to be optimal on benchmarks since it divides the number of context switches by 3. It will inflate each H2 conn's memory by 1 kB. The bandwidth is now much more stable. Prior to this, a test on h2->h1 with very large objects (1 GB), a few tens of connections and a few tens of streams per connection would show a varying performance between 34 and 95 Gbps on 2 cores/4 threads, with h2_snd_buf() stopped on a buffer full condition between 300000 and 600000 times per second. Now the performance is constantly between 88 and 96 Gbps. Measures show that buffer full conditions are met around only 159 times per second in this case, or roughly 2000 to 4000 times less often.	2019-05-26 11:33:19 +02:00
Willy Tarreau	60f62682b1	MINOR: mux-h2: report the mbuf's head and tail in "show fd" It's useful to know how the mbuf spans over the whole area and to have access to the first and last ones, so let's dump just this.	2019-05-26 11:33:18 +02:00
Willy Tarreau	bcc4595e57	CLEANUP: mux-h2: consistently use a local variable for the mbuf This makes the code more readable and reduces the calls to br_tail(). In addition, all calls to h2_get_buf() are now made via this local variable, which should significantly help for retries.	2019-05-26 10:52:47 +02:00
Willy Tarreau	41c4d6a2c5	MEDIUM: mux-h2: make the send() function iterate over all mux buffers Now send() uses a loop to iterate over all buffers to be sent. These buffers are released and deleted from the vector once completely sent. If any buffer gets released, offer_buffers() is called to wake up some waiters.	2019-05-26 10:52:25 +02:00
Willy Tarreau	2e3c000c1c	MINOR: mux-h2: introduce h2_release_mbuf() to release all buffers in the mbuf ring This function iterates over all buffers in the mbuf ring to release all of them from the head to the tail.	2019-05-26 10:51:25 +02:00
Willy Tarreau	662fafc02b	MEDIUM: mux-h2: make the conditions to send based on mbuf, not just its tail This is in preparation for iterating over lists. First we need to always check the buffer's head and not its tail.	2019-05-26 10:50:50 +02:00
Willy Tarreau	5133096df2	MEDIUM: mux-h2: replace all occurrences of mbuf with a buffer ring For now it's only one buffer long so the head and tails are always the same, thus it doesn't change what used to work. In short, br_tail(h2c->mbuf) was inserted everywhere we used to have h2c->mbuf.	2019-05-26 10:50:18 +02:00
Willy Tarreau	455d5681b6	MEDIUM: mux-h2: avoid doing expensive buffer realigns when not absolutely needed Transferring large objects over H2 sometimes shows unexplained performance variations. A long analysis resulted in the following discovery. Often the mux buffer looks like this : [ empty_head | data | empty_tail ] Typical numbers are (very common) : - empty_head = 31 - empty_tail = 16 (total free=47) - data = 16337 - size = 16384 - data to copy: 43 The reasons for these holes are the blocking factors that are not always the same in and out (due to keeping 9 bytes for the frame size, or the 56 bytes corresponding to the HTX header). This can easily happen 10000 times a second if the network bandwidth permits it! In this case, while copying a DATA frame we find that the buffer has its free space wrapped so we decide to realign it to optimize the copy. It's possible that this practice stems from the code used to emit headers, which do not support fragmentation and which had no other option left. But it comes with two problems : - we don't check if the data fits, which results in a memcpy for nothing - we can move huge amounts of data to just copy a small block. This patch addresses this two ways : - first, by not forcing a data realignment if what we have to copy does not fit, as this is totally pointless ; - second, by refusing to move too large data blocks. The threshold was set to 1 kB, because it may make sense to move 1 kB of data to copy a 15 kB one at once, which will leave a single 16 kB block, but it doesn't make sense to move 15 kB to copy just 1 kB. In all cases the data would fit and would just be split into two blocks, which is not very expensive, hence the low limit to 1 kB. With such changes, realignments are very rare, they show up around once every 15 seconds at 60 Gbps, and look like this, resulting in a much more stable bit rate : buf=0x7fe6ec0c3510,h=16333,d=35,s=16384 room=16349 in=16337 This patch should be safe for backporting to 1.9 if some performance issues are reported there.	2019-05-25 20:31:53 +02:00
Ilya Shipitsin	e242f3dfb8	BUG/MINOR: ssl_sock: Fix memory leak when disabling compression according to manpage: sk_TYPE_zero() sets the number of elements in sk to zero. It does not free sk so after this call sk is still valid. so we need to free all elements [wt: seems like it has been there forever and should be backported to all stable branches]	2019-05-25 07:45:55 +02:00
Christopher Faulet	b8fd4c031c	BUG/MINOR: htx: Remove a forgotten while loop in htx_defrag() Fortunately, this loop does nothing. Otherwise it would have led to an infinite loop. It was probably forgotten during a refactoring, in the early stage of the HTX. This patch must be backported to 1.9.	2019-05-24 09:11:10 +02:00
Christopher Faulet	f90c24d14c	BUG/MEDIUM: proto-htx: Not forward too much data when 1xx responses are handled When a 1xx response is processed, we forward it immediately. But another message may already be in the channel's buffer, waiting to be processed. This may be another 1xx response or the final one. So instead of forwarding everything, we must take care to only forward the processed 1xx response. This patch must be backported to 1.9.	2019-05-24 09:11:07 +02:00
Christopher Faulet	8e9e3ef15c	BUG/MINOR: mux-h1: Report EOI instead of EOS on parsing error or H2 upgrade When a parsing error occurs in the H1 multiplexer, we stop copying HTX blocks. So the error may be reported with an empty HTX message, for instance if the headers parsing failed. When it happens, the flag CS_FL_EOS is also set on the conn_stream. But it is an error. Most of the time, it is set on established connections, so it is not really an issue. But if it happens when the server connection is not fully established, the connection is shut down immediately and the stream-interface is switched from SI_ST_CON to SI_ST_DIS/CLO. So HTX analyzers have no chance to catch the error. Instead of setting CS_FL_EOS, it is much better to set CS_FL_EOI, which is the right flag to use. The same is also done on H2 upgrade. As a side effect of this fix, in the stream-interface code, we must now set the flag CF_READ_PARTIAL on the channel when the flag CF_EOI is set. It is a guarantee to wake up the stream when EOI is reported to the channel while no data are received. This patch must be backported to 1.9.	2019-05-24 09:11:01 +02:00
Christopher Faulet	316934d3c9	BUG/MINOR: mux-h2: Count EOM in bytes sent when a HEADERS frame is formatted In HTX, when a HEADERS frame is formatted before sending it to the client or the server, if an EOM is found because there is no body, we must count it in the number of bytes sent. This patch must be backported to 1.9.	2019-05-24 09:10:46 +02:00
Christopher Faulet	256b69a82d	BUG/MINOR: lua: Set right direction and flags on new HTTP objects When a LUA HTTP object is created using the current TXN object, it is important to also set the right direction and flags, using ones from the TXN object. This patch may be backported to all supported branches with the lua support. But, it seems to have no impact for now.	2019-05-24 09:07:57 +02:00
Christopher Faulet	55ae8a64e4	BUG/MEDIUM: spoe: Don't use the SPOE applet after releasing it In spoe_release_appctx(), the SPOE applet may be used after it was released to get its exit status code. Of course, HAProxy crashes when this happens. This patch must be backported to 1.9 and 1.8.	2019-05-24 09:07:30 +02:00
Christopher Faulet	08e6646460	BUG/MINOR: proto-htx: Try to keep connections alive on redirect As far as possible, we try to keep the connections alive on redirect. It's possible when the request has no body or when the request parsing is finished. No backport is needed.	2019-05-24 09:06:59 +02:00
Willy Tarreau	1713c03825	MINOR: stats: report the global output bit rate in human readable form The stats page now reports the per-process output bit rate and applies the usual conversions needed to turn the TCP payload rate to an Ethernet bit rate in order to give a reasonably accurate estimate of how far from interface saturation we are.	2019-05-23 12:31:51 +02:00
Willy Tarreau	7cf0e4517d	MINOR: raw_sock: report global traffic statistics Many times we've been missing per-process traffic statistics. While it didn't make sense in multi-process mode, with threads it does. Thus we now have a counter of bytes emitted by raw_sock, and a freq counter for these as well. However, freq_ctr are limited to 32 bits, and given that loads of 300 Gbps have already been reached over a loopback using splicing, we need to downscale this a bit. Here we're storing 1/32 of the byte rate, which gives a theoretical limit of 128 GB/s or ~1 Tbps, which is more than enough. Let's have fun re-reading this sentence in 2029 :-) The values can be read in "show info" output on the CLI.	2019-05-23 11:45:38 +02:00
Willy Tarreau	bc1b820606	BUILD: watchdog: condition it to USE_RT It's needed on Linux to have access to timerfd_*, and on FreeBSD this lib is needed as well, though not enabled in our default build. We can see later if it's OK to enable it, for now let's fix the build issues.	2019-05-23 10:20:55 +02:00
Willy Tarreau	02255b24df	BUILD: watchdog: use si_value.sival_int, not si_int for the timer's value Bah, the linux manpage suggests to use si_int but it's a fake, it's only a define on sigval.sival_int where sigval is defined as si_value. Let's use si_value.sival_int, at least it builds on both Linux and FreeBSD. It's likely that this code will have to be limited to a small subset of OSes if it causes difficulties like this.	2019-05-23 08:36:29 +02:00
Willy Tarreau	96d5195862	MEDIUM: config: deprecate the antique req* and rsp* commands These commands don't follow the same flow as the rest of the commands, each of them iterates over all header lines before switching to the next directive. In addition they make no distinction between start line and headers and can lead to unparsable rewrites which are very difficult to deal with internally. Most of them are still occasionally found in configurations, mainly because of the usual "we've always done this way". By marking them deprecated and emitting a warning and recommendation on first use of each of them, we will raise users' awareness regarding the cleaner, faster and more reliable alternatives. Some use cases of "reqrep" still appear from time to time for URL rewriting that is not so convenient with other rules. But at least users facing this requirement will explain their use case so that we can best serve them. Some discussion started on this subject in a thread linked to from github issue #100. The goal is to remove them in 2.1 since they require to reparse the result before indexing it and we don't want this hack to live long. The following directives were marked deprecated : -reqadd -reqallow -reqdel -reqdeny -reqiallow -reqidel -reqideny -reqipass -reqirep -reqitarpit -reqpass -reqrep -reqtarpit -rspadd -rspdel -rspdeny -rspidel -rspideny -rspirep -rsprep	2019-05-22 20:43:45 +02:00
Willy Tarreau	3844747536	CLEANUP: raw_sock: remove support for very old linux splice bug workaround We've been dealing with a workaround for a bug in splice that used to affect version 2.6.25 to 2.6.27.12 and which was fixed 10 years ago in kernel versions which are not supported anymore. Given that people who would use a kernel in such a range would face much more serious stability and security issues, it's about time to get rid of this workaround and of the ASSUME_SPLICE_WORKS build option used to disable it.	2019-05-22 20:02:15 +02:00
Willy Tarreau	e5733234f6	CLEANUP: build: rename some build macros to use the USE_* ones We still have quite a number of build macros which are mapped 1:1 to a USE_something setting in the makefile but which have a different name. This patch cleans this up by renaming them to use the USE_something one, allowing to clean up the makefile and make it more obvious when reading the code what build option needs to be added. The following renames were done : ENABLE_POLL -> USE_POLL ENABLE_EPOLL -> USE_EPOLL ENABLE_KQUEUE -> USE_KQUEUE ENABLE_EVPORTS -> USE_EVPORTS TPROXY -> USE_TPROXY NETFILTER -> USE_NETFILTER NEED_CRYPT_H -> USE_CRYPT_H CONFIG_HAP_CRYPT -> USE_LIBCRYPT CONFIG_HAP_NS -> USE_NS CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL	2019-05-22 19:47:57 +02:00
Willy Tarreau	823bda0eb7	BUILD: time: remove the test on _POSIX_C_SOURCE It seems it's not defined on FreeBSD while it's mentioned on Linux that clock_gettime() can be detected using this. Given that we also have the test for _POSIX_TIMERS>0 that should cover it well enough. If it breaks on other systems, we'll see. Report was here : https://github.com/haproxy/haproxy/runs/133866993	2019-05-22 19:14:59 +02:00
Willy Tarreau	082b62828d	BUG/MEDIUM: init/threads: provide per-thread alloc/free function callbacks We currently have the ability to register functions to be called early on thread creation and at thread deinitialization. It turns out this is not sufficient because certain such functions may use resources that are being allocated by the other ones, thus creating a race condition depending only on the linking order. For example the mworker needs to register a file descriptor while the pollers will reallocate the fd_updt[] array. Similarly logs and trashes may be used by some init functions while it's unclear whether they have been deduplicated. The same issue happens on deinit, if the fd_updt[] or trash is released before some functions finish to use them, we'll get into trouble. This patch creates a couple of early and late callbacks for per-thread allocation/freeing of resources. A few init functions were moved there, and the fd init code was split between the two (since it used to both allocate and initialize at once). This way the init/deinit sequence is expected to be safe now. This patch should be backported to 1.9 as at least the trash/log issue seems to be present. The run_thread_poll_loop() code is a bit different there as the mworker is not a callback, but it will have no effect and it's enough to drop the mworker changes. This bug was reported by Ilya Shipitsin in github issue #104.	2019-05-22 14:59:08 +02:00
Willy Tarreau	aabbe6a3bb	MINOR: WURFL: do not emit warnings when not configured At the moment the WURFL module emits 3 lines of warnings upon startup when it is not referenced in the configuration file, which is quite confusing. Let's make sure to keep it silent when not configured, as detected by the absence of the wurfl-data-file statement.	2019-05-22 14:01:22 +02:00
mbellomi	ae4fcf1e67	MINOR: WURFL: module version bump to 2.0 Make it version 2.0.	2019-05-22 12:06:42 +02:00
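
The realign decision described in commit 455d5681b6 can be sketched roughly as below. This is an illustration only, not the code from the patch: the names ring_buf, contig_room(), should_realign() and REALIGN_MAX_DATA are hypothetical, and the exact comparison against the threshold is an assumption. Only the 1 kB threshold and the sample numbers (size 16384, empty_head 31, data 16337, 43 bytes to copy) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the 1 kB threshold is taken from the commit message. */
#define REALIGN_MAX_DATA 1024

struct ring_buf {
    size_t size;  /* total storage size in bytes */
    size_t head;  /* offset of the first used byte */
    size_t data;  /* number of used bytes, starting at head */
};

/* Free bytes available in one piece right after the data. */
static size_t contig_room(const struct ring_buf *b)
{
    size_t end = b->head + b->data;

    if (end >= b->size)
        return b->size - b->data;  /* data reaches or wraps past the end: free space is one piece */
    return b->size - end;          /* only the tail hole is usable without splitting */
}

/* Return true when moving the data to offset 0 looks worthwhile before
 * copying <len> more bytes.
 */
static bool should_realign(const struct ring_buf *b, size_t len)
{
    assert(b->head < b->size && b->data <= b->size);

    if (len <= contig_room(b))
        return false;  /* already fits in one piece, nothing to gain */
    if (len > b->size - b->data)
        return false;  /* would not fit even once realigned: pointless memcpy */
    return b->data <= REALIGN_MAX_DATA;  /* refuse to move large amounts of data */
}
```

With the numbers quoted in the commit message, should_realign() returns false: the 43 bytes fit in the 47 free bytes, but 16337 bytes of data would have to be moved, so the copy is simply split across the two holes.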

7865 Commits