haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-20 20:21:01 +02:00

Author	SHA1	Message	Date
Willy Tarreau	33a09a5f2a	MINOR: stream-int: don't needlessly call tasklet_wakeup() in stream_int_chk_snd_conn() This one was added by commit 53216e7db ("MEDIUM: connections: Don't directly mess with the polling from the upper layers.") after the removal of the conditional cs_want_send() call. But after analysis it turned out that it's not needed since the si_cs_send() call will either succeed or subscribe.	2018-10-28 13:50:01 +01:00
Willy Tarreau	eafd8ebcfe	MEDIUM: stream-int: call si_cs_process() in stream_int_update_conn Calling si_cs_send() alone is always dangerous because it can result in the loss of an event if it manages to empty the buffer. Indeed, in this case it's critical to call si_chk_rcv() on the opposite stream-int. Given that si_cs_process() takes care of all this, let's call it instead. All this code could possibly be refined soon to avoid redoing the whole stream_int_notify() and do it only after a send(), but at the moment it's not important.	2018-10-28 13:48:06 +01:00
Willy Tarreau	85f890174a	MEDIUM: stream-int: make si_update() synchronize flag changes before the I/O With the new synchronous si_cs_send() at the end of process_stream(), we're seeing re-appear the I/O layer specific part of the stream interface which is supposed to deal with I/O event subscription. The only difference is that now we subscribe to I/Os only after having attempted (and failed) them. This patch brings a cleanup in this by reintroducing stream_int_update_conn() with the send code from process_stream(). However this alone would not be enough because the flags which are cleared afterwards would result in the loss of the possible events (write events only at the moment). So the flags clearing and stream-int state updates are also performed inside si_update() between the generic code and the I/O specific code. This definitely makes sense as after this call we can simply check again for channel and SI flag changes and decide to loop once again or not.	2018-10-28 13:47:00 +01:00
Willy Tarreau	0f8d3ab362	MEDIUM: stream: don't try to send first in process_stream() The rationale here is that we should never need to try to send() at the beginning of process_stream() because : - if something was pending, it's very unlikely that it was unblocked and not sent just between the last poll() and the wakeup instant. - if something pending was recently sent, then we don't have anything to send anymore. So at first glance it doesn't seem like there could be any valid case where trying to send before entering the function brings any benefit.	2018-10-28 13:47:00 +01:00
Willy Tarreau	18e066c2e7	MEDIUM: stream: always call si_cs_recv() after a failed buffer allocation If a buffer allocation failed, we have SI_FL_WAIT_ROOM set and c_size(buf) being zero. It's the only moment where we have a new opportunity to try to allocate this buffer. However we don't want to waste our time trying this if both are non-null since it indicates missing room without any changed condition.	2018-10-28 13:47:00 +01:00
Willy Tarreau	581abd3f99	MEDIUM: stream-int: replace channel_alloc_buffer() with si_alloc_ibuf() everywhere Well that's only 3 places (applet.c, stream_interface.c, hlua.c). This ensures we always clear SI_FL_WAIT_ROOM before setting it on failure, so that it is guaranteed that SI_FL_WAIT_ROOM always indicates a lack of room for doing an operation, including the inability to allocate a buffer for this.	2018-10-28 13:47:00 +01:00
Willy Tarreau	cda7f3f5c2	MINOR: stream: don't prune variables if the list is empty The vars_prune() and vars_init() functions involve locking while most of the time there is no variable at all in streams nor sessions. Let's check for emptiness before calling these functions. Simply doing this has increased the multithreaded performance from 1.5 to 5% depending on the workload.	2018-10-28 13:46:47 +01:00
Lukas Tribus	80512b186f	BUG/MINOR: only auto-prefer last server if lb-alg is non-deterministic While "option prefer-last-server" only applies to non-deterministic load balancing algorithms, 401/407 responses actually caused haproxy to prefer the last server unconditionally. As this breaks deterministic load balancing algorithms like uri, this patch applies the same condition here. Should be backported to 1.8 (together with "BUG/MINOR: only mark connections private if NTLM is detected").	2018-10-27 22:10:32 +02:00
Lukas Tribus	fd9b68c48e	BUG/MINOR: only mark connections private if NTLM is detected Instead of marking all connections that see a 401/407 response private (for connection reuse), this patch detects a RFC4559/NTLM authentication scheme and restricts the private setting to those connections. This is so we can reuse connections with 401/407 responses with deterministic load balancing algorithms later (which requires another fix). This fixes the problem reported here by Elliot Barlas : https://discourse.haproxy.org/t/unable-to-configure-load-balancing-per-request-over-persistent-connection/3144 Should be backported to 1.8.	2018-10-27 22:10:29 +02:00
Willy Tarreau	ede3d884fc	MEDIUM: channel: merge back flags CF_WRITE_PARTIAL and CF_WRITE_EVENT The behaviour of the flag CF_WRITE_PARTIAL was modified by commit 95fad5ba4 ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due to a situation where it could trigger an immediate wake up of the other side, both acting in loops via the FD cache. This loss has caused the need to introduce CF_WRITE_EVENT as commit c5a9d5bf, to replace it, but both flags express more or less the same thing and this distinction creates a lot of confusion and complexity in the code. Since the FD cache now acts via tasklets, the issue worked around in the first patch no longer exists, so it's more than time to kill this hack and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some write activity since we last left process_stream). This patch mostly reverts the two commits above. Only the part making use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of data upon connection setup was kept because it's more accurate and better suited.	2018-10-26 08:32:57 +02:00
Frédéric Lécaille	b80bc273a3	MINOR: shctx: Change max. object size type to unsigned int. This change is there to prevent implicit conversions when comparing shctx maximum object sizes with other unsigned values.	2018-10-26 04:54:40 +02:00
Frédéric Lécaille	4eba544e24	MINOR: cache: Avoid usage of atoi() when parsing "max-object-size". With this patch we avoid parsing "max-object-size" with atoi() and we store its value as an unsigned int to prevent bad implicit conversion issues, especially when we compare it with other unsigned values (content length).	2018-10-26 04:54:40 +02:00
Frédéric Lécaille	4c8aa117f9	BUG/MINOR: ssl: Wrong usage of shctx_init(). With this patch we check that shctx_init() does not return 0. Must be backported to 1.8.	2018-10-26 04:54:40 +02:00
Frédéric Lécaille	bc584494e6	BUG/MINOR: cache: Wrong usage of shctx_init(). With this patch we check that shctx_init() does not return 0. This is possible if the maxblocks argument, which is passed as an int, is negative due to an implicit conversion. Must be backported to 1.8.	2018-10-26 04:54:40 +02:00
Frédéric Lécaille	b9b8b6b6be	BUG/MINOR: cache: Crashes with "total-max-size" > 2047(MB). With this patch we support cache size larger than 2047 (MB) and prevent haproxy from crashing when "total-max-size" is parsed as negative values by atoi(). The limit at parsing time is 4095 MB (UINT_MAX >> 20). May be backported to 1.8.	2018-10-26 04:54:40 +02:00
Frédéric Lécaille	a2219f5e3b	MINOR: cache: Add "max-object-size" option. This patch adds "max-object-size" option to the cache to limit the size in bytes of the HTTP objects to be cached. When not provided, the maximum size of an HTTP object is a 256th of the cache size.	2018-10-24 04:40:03 +02:00
Frédéric Lécaille	b7838afe6f	MINOR: shctx: Add a maximum object size parameter. This patch adds a new parameter to shctx_init() function to be used to limit the size of each shared object, -1 value meaning "no limit".	2018-10-24 04:39:44 +02:00
Frédéric Lécaille	8df65ae5e2	MINOR: cache: Larger HTTP objects caching. This patch makes the cache capable of storing HTTP objects larger than a buffer. It makes use of the new "block by block shared object allocation" shctx API. A new pointer to struct shared_block has been added to the cache applet context to memorize the next block to be used by the HTTP cache I/O handler http_cache_io_handler() to emit the data. Another member, named "sent", memorizes the number of bytes already sent by this handler. So, to send an object from cache, http_cache_io_handler() must be called until the "sent" counter reaches the size of this object.	2018-10-24 04:37:12 +02:00
Frédéric Lécaille	0bec807e08	MINOR: shctx: Shared objects block by block allocation. This patch makes shctx capable of storing objects in several parts, each part being made of several blocks. There is no longer any need to walk through a row until reaching its end to append new blocks. A new pointer to a struct shared_block member, named last_reserved, has been added to struct shared_block to memorize the last block which was reserved by shctx_row_reserve_hot(). The same applies to the "last_append" pointer, which is used to memorize the last block used by shctx_row_data_append() to store the data.	2018-10-24 04:35:53 +02:00
Willy Tarreau	30f931ead2	BUG/MEDIUM: pools: fix the minimum allocation size Fred reported a random crash related to the pools. This was introduced by commit e18db9e98 ("MEDIUM: pools: implement a thread-local cache for pool entries") because the minimum pool item size should have been increased to 32 bytes to accommodate the 2 double-linked lists. No backport is needed.	2018-10-23 14:40:23 +02:00
Willy Tarreau	68ad3a42f7	MINOR: proxy: add a new option "http-use-htx" This option makes a proxy use only HTX-compatible muxes instead of the HTTP-compatible ones for HTTP modes. It must be set on both ends, this is checked at parsing time.	2018-10-23 10:22:36 +02:00
Christopher Faulet	955188d37d	BUG/MEDIUM: stream-int: don't set SI_FL_WAIT_ROOM on CF_READ_DONTWAIT With the previous connection model, when we purposely decided to stop receiving in order to avoid polling after a complete request was received for example, it was needed to set SI_FL_WAIT_ROOM to prevent receive polling from being re-armed. Now with the new subscription-based model there is no such thing anymore and there is no one to remove this flag either. Thus if a request takes more than one packet to come in or spans over too many packets, this flag will cause it to wait forever. Let's simply remove this flag now. This patch should not be backported since older versions still need this flag to be set here to stop receiving.	2018-10-23 10:22:36 +02:00
Christopher Faulet	66943a4903	CLEANUP: http: Remove the unused function http_find_header	2018-10-23 10:22:36 +02:00
Olivier Houchard	31f04e4416	MINOR: stream_interface: Avoid calling si_cs_send/recv if not needed. Don't bother calling si_cs_send and si_cs_recv if we're already subscribed, or if the output buffer is empty for si_cs_send.	2018-10-22 16:05:08 +02:00
Olivier Houchard	d846c267d5	MINOR: h2: Don't run tasks that are waiting to send if mux is full. We wake up all the streams waiting to send data when we have space available in the mux buffer. Doing so means we probably wake way too many streams, because after a few the buffer will probably be full instead. So keep a list of all the streams that are about to send data, and if we detect that the buffer is full, unschedule the tasks and put the streams back to the send_list.	2018-10-21 06:00:13 +02:00
Olivier Houchard	d7bd3e3c4c	MINOR: streams: Call tasklet_free() after si_release_endpoint(). Make sure we call tasklet_free() only after si_release_endpoint(), when the unsubscribe() method has been called, so that we're sure the mux won't attempt to access the tasklet.	2018-10-21 05:59:55 +02:00
Olivier Houchard	53216e7db9	MEDIUM: connections: Don't directly mess with the polling from the upper layers. Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv, from the upper layers. The polling is now directly handled by the connection layer, it is activated on subscribe(), and unactivated once we got the event and we woke the related task.	2018-10-21 05:58:40 +02:00
Olivier Houchard	81a15af6bc	MINOR: h2: Make sure to return 1 in h2_recv() when needed. In h2_recv(), return 1 if we have data available, or if h2_recv_allowed() failed, to be sure h2_process() is called. Also don't subscribe if our buffer is full.	2018-10-21 05:58:33 +02:00
Olivier Houchard	85b73e9427	BUG/MEDIUM: stream: Make sure polling is right on retry. When retrying to connect to a server, because the previous connection failed, make sure if we subscribed to the previous connection, the polling flags will be true for the new fd. No backport is needed.	2018-10-21 05:55:32 +02:00
Olivier Houchard	52b946686c	BUG/MEDIUM: h2: Close connection if no stream is left and GOAWAY was sent. When we're closing a stream, if there's no stream left and a goaway was sent, close the connection, there's no reason to keep it open. [wt: it's likely that this is needed in 1.8 as well, though it's unclear how to trigger this issue, some tests are needed]	2018-10-21 05:53:09 +02:00
Olivier Houchard	8b2c8a7894	BUILD: memory: fix free_list pointer declaration again for atomic CAS Similarly to what's been done in 7a6ad88b02d8b74c2488003afb1a7063043ddd2d, take into account that free_list is a void **, and so use a void ** too when attempting to do a CAS.	2018-10-21 05:44:38 +02:00
Willy Tarreau	4e7cc3381b	BUILD: compiler: rename __unreachable() to my_unreachable() Olivier reported that on FreeBSD __unreachable is already defined and causes build warnings. Let's rename it then.	2018-10-20 17:45:48 +02:00
Willy Tarreau	ed72d82827	MEDIUM: time: measure the time stolen by other threads The purpose is to detect if threads or processes are competing for the same CPU. This can happen when threads are incorrectly bound, or after a reload if the previous process still has an important activity. With threads this situation is problematic because a preempted thread holding a lock will block other ones waiting for this lock to be released. A first attempt consisted in measuring the cumulated lost time more precisely but the system's scheduler is smart enough to try to limit the thread preemption rate by mostly context switching during poll()'s blank periods, so most of the time lost is not seen. In essence this is good because it means a thread is not preempted with a lock held, and even regarding the rendez-vous point it cannot prevent the other ones from making progress. But still it happens tens to hundreds of times per second that a thread might be preempted, so it's still possible to detect that the situation is happening, thus it's interesting to measure and report its frequency. Each time we enter the poller, we check the CPU time spent working and see if we've lost time doing something else. To limit false positives, we're only interested in losses of 500 microseconds or more (i.e. half a clock tick on a 1 kHz system). If so, it indicates that some time was stolen by another thread or process. Note that we purposely store some sub-millisecond counters so that under heavy traffic with a 1 kHz clock, it's still possible to measure something without being subject to the risk of rounding errors (i.e. if exactly 1 ms is stolen it's possible that the time difference could often be slightly lower). This counter of lost CPU time slots time is reported in "show activity" in numbers of milliseconds of CPU lost per second, per 15s, and total over the process' life. By definition, the per-second counter cannot report values larger than 1000 per thread per second and the 15s one will be limited to 15000/s in the worst case, but it's possible that peak values exceed such thresholds after long pauses.	2018-10-19 08:51:59 +02:00
Willy Tarreau	ac6c8805be	BUILD: memory: fix pointer declaration for atomic CAS The calls to HA_ATOMIC_CAS() on the lockfree version of the pool allocator were mistakenly done on (void *) for the old value instead of (void **). While this has no impact on "recent" gcc, it does have one for gcc < 4.7 since the CAS was open coded and it's not possible to assign a temporary variable of type "void". No backport is needed, this only affects 1.9.	2018-10-18 16:12:28 +02:00
Willy Tarreau	7e9c4ae4de	MINOR: poller: move time and date computation out of the pollers By placing this code into time.h (tv_entering_poll() and tv_leaving_poll()) we can remove the logic from the pollers and prepare for extending this to offer more accurate time measurements.	2018-10-17 19:59:43 +02:00
Willy Tarreau	f37ba94768	MINOR: fd: centralize poll timeout computation in compute_poll_timeout() The 4 pollers all contain the same code used to compute the poll timeout. This is pointless, let's centralize this into fd.h. This also gets rid of the useless SCHEDULER_RESOLUTION macro which used to work around a very old linux 2.2 bug causing select() to wake up slightly before the timeout.	2018-10-17 19:59:43 +02:00
Olivier Houchard	33992267aa	MINOR: peers: use defines instead of enums to appease clang. Clang (rightfully) warns that we're trying to set chars to values >= 128. Use defines with hex values instead of an enum to address this.	2018-10-16 19:31:15 +02:00
Olivier Houchard	3332090a2d	MINOR: cfgparse: Write 130 and 128 as 0x82 and 0x80. Write 130 and 128 as 0x82 and 0x80, to avoid warnings about casting from int to size. "check_req" should probably be unsigned, but it's hard to do so.	2018-10-16 19:28:35 +02:00
Willy Tarreau	5dfb6c4cc9	CLEANUP: state-file: make the path concatenation code a bit more consistent There are as many ways to build the globalfilepathlen variable as branches in the if/then/else, creating lots of confusion. Address the most obvious parts, but some polishing definitely is still needed.	2018-10-16 19:26:12 +02:00
Olivier Houchard	17f8b90736	MINOR: server: Use memcpy() instead of strncpy(). Use memcpy instead of strncpy, strncpy buys us nothing, and gcc is being annoying.	2018-10-16 19:22:20 +02:00
Willy Tarreau	b059b894cd	BUILD: lua: silence some compiler warnings after WILL_LJMP These ones are on error paths that are properly handled by luaL_error() which does a longjmp() but the compiler cannot know it. By adding an __unreachable() statement in WILL_LJMP(), there is no ambiguity anymore. This may be backported to 1.8 but these previous patches are needed first : - BUILD: compiler: add a new statement "__unreachable()" - MINOR: lua: all functions calling lua_yieldk() may return - BUILD: lua: silence some compiler warnings about potential null derefs (#2)	2018-10-16 17:57:36 +02:00
Willy Tarreau	9635e03c41	MINOR: lua: all functions calling lua_yieldk() may return There was a mistake when tagging functions which always use longjmp and those which may use it in that all those supposed to call lua_yieldk() may return without calling longjmp. Thus they must not use WILL_LJMP() but MAY_LJMP(). It has zero impact on the code emitted as such, but prevents other fixes from being properly implemented : this was the cause of the previous failure with the __unreachable() calls. This may be backported to older versions. It may or may not apply well depending on the context, though the change simply consists in replacing "WILL_LJMP(hlua_yieldk" with "MAY_LJMP(hlua_yieldk", and same with the single call to lua_yieldk() in hlua_yieldk().	2018-10-16 17:56:20 +02:00
Willy Tarreau	e09101e8d9	BUILD: lua: silence some compiler warnings about potential null derefs (#2) Here we make sure that appctx is always taken from the unchecked value since we know it's an appctx, which explains why it's immediately dereferenced. A missing test was added to ensure that task_new() does not return a NULL. This may be backported to 1.8.	2018-10-16 17:39:05 +02:00
Willy Tarreau	526aed219f	Revert "BUILD: lua: silence some compiler warnings about potential null derefs" This reverts commit f1ffb39b614b0d9654c9450ac6e8c88cfc942784. It breaks Lua causing some timeouts. Removing the __unreachable() statement from WILL_LJMP() fixes it. It's very strange and unclear whether it's an issue with WILL_LJMP() not fulfilling its promise of not returning, if the code emitted with __unreachable() gets broken, or anything else. Let's revert this for now.	2018-10-16 17:32:55 +02:00
Willy Tarreau	a9c0252b2e	BUG/MEDIUM: threads: fix thread_release() at the end of the rendez-vous point There is a bug in this function used to release other threads. It leaves the current thread marked as harmless. If after this another thread does a thread_isolate(), but before the first one reaches poll(), the second thread will believe it's alone while it's not. This must be backported to 1.8 since the rendez-vous point was merged into 1.8.14.	2018-10-16 17:03:16 +02:00
Willy Tarreau	e18db9e984	MEDIUM: pools: implement a thread-local cache for pool entries Each thread now keeps the last ~512 kB of freed objects into a local cache. There are some heuristics involved so that a specific pool cannot use more than 1/8 of the total cache in number of objects. Tests have shown that 512 kB is an optimal size on a 24-thread test running on a dual-socket machine, resulting in an overall 7.5% performance increase and a cache miss ratio reducing from 19.2 to 17.7%. Anyway it seems pointless to keep more than an L2 cache, which probably explains why sizes between 256 and 512 kB are optimal. Cached objects appear in two lists, one per pool and one LRU to help with fair eviction. Currently there is no way to check each thread's cache state nor to flush it. This cache cannot be disabled and is enabled as soon as the lockless pools are enabled (i.e.: threads are enabled, no pool debugging is in use and the CPU supports a double word CAS).	2018-10-16 13:46:08 +02:00
Willy Tarreau	0a93b6413f	MINOR: pools: allocate most memory pools from an array For caching it will be convenient to have indexes associated with pools, without having to dereference the pool itself. One solution could consist in replacing all pool pointers with integers but this would limit the number of allocatable pools. Instead here we allocate the 32 first pools from a pre-allocated array whose base address is known so that it's trivial to convert a pool to an index in this array. Pools that cannot fit there will be allocated normally.	2018-10-16 10:29:26 +02:00
Willy Tarreau	8d8747abe0	OPTIM: tasks: group all tree roots per cache line Currently we have per-thread arrays of trees and counts, but these ones unfortunately share cache lines and are accessed very often. This patch moves the task-specific stuff into a structure taking a multiple of a cache line, and has one such per thread. Just doing this has reduced the cache miss ratio from 19.2% to 18.7% and increased the 12-thread test performance by 3%. It starts to become visible that we really need a process-wide per-thread storage area that would cover more than just these parts of the tasks. The code was arranged so that it's easy to move the pieces elsewhere if needed.	2018-10-15 19:06:13 +02:00
Willy Tarreau	b20aa9eef3	MAJOR: tasks: create per-thread wait queues Now we still have a main contention point with the timers in the main wait queue, but the vast majority of the tasks are pinned to a single thread. This patch creates a per-thread wait queue and queues a task to the local wait queue without any locking if the task is bound to a single thread (the current one) otherwise to the shared queue using locking. This significantly reduces contention on the wait queue. A test with 12 threads showed 11 ms spent in the WQ lock compared to 4.7 seconds in the same test without this change. The cache miss ratio decreased from 19.7% to 19.2% on the 12-thread test, and its performance increased by 1.5%. Another indirect benefit is that the average queue size is divided by the number of threads, which roughly removes log(nbthreads) levels in the tree and further speeds up lookups.	2018-10-15 19:04:40 +02:00
Willy Tarreau	87d54a9a6d	MEDIUM: fd/threads: only grab the fd's lock if the FD has more than one thread The vast majority of FDs are only seen by one thread. Currently the lock on FDs costs a lot because it's touched often, though there should be very little contention. This patch ensures that the lock is only grabbed if the FD is shared by more than one thread, since otherwise the situation is safe. Doing so resulted in a 15% performance boost on a 12-threads test.	2018-10-15 13:25:06 +02:00
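
The "total-max-size" fix above (b9b8b6b6be) mentions a 2047 MB crash threshold and a 4095 MB parsing limit (UINT_MAX >> 20). The following is only a rough sketch of that arithmetic, not HAProxy's actual code: it assumes a 32-bit int, assumes the negative value came from converting megabytes to bytes in a signed int, and uses hypothetical helper names (to_int32, naive_size_bytes, parse_total_max_size) written in Python purely to model the C integer behaviour.

```python
UINT_MAX = 0xFFFFFFFF        # assumption: 32-bit unsigned int
MAX_MB = UINT_MAX >> 20      # largest megabyte count whose byte size fits


def to_int32(value):
    """Wrap an integer the way a 32-bit signed int usually wraps in C."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def naive_size_bytes(text):
    """Model of the old behaviour (assumed): MB to bytes in a signed int."""
    return to_int32(int(text) << 20)


def parse_total_max_size(text):
    """Hypothetical stricter parser: digits only, bounded by MAX_MB."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("total-max-size expects a positive integer (MB)")
    megabytes = int(text)
    if megabytes > MAX_MB:
        raise ValueError("total-max-size is limited to %d MB" % MAX_MB)
    return megabytes << 20   # size in bytes, fits in an unsigned 32-bit int


if __name__ == "__main__":
    print(MAX_MB)                        # 4095
    print(naive_size_bytes("2047"))      # still positive
    print(naive_size_bytes("2048"))      # -2147483648, the wrapped value
    print(parse_total_max_size("4095"))  # 4293918720
```

Under these assumptions, 2048 MB is the first value whose byte count no longer fits in a signed 32-bit int, which matches the "> 2047(MB)" wording of the commit title, and 4095 MB is the last value that fits in an unsigned one.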


6804 Commits