haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-10 00:57:02 +02:00

Author	SHA1	Message	Date
Willy Tarreau	e01d11a75b	BUILD: http: properly mark some struct as extern http_known_methods, HTTP_100 and HTTP_103 were not declared extern and as such were multiply defined since they were in http.h. There was apparently no more side effect but it may depend on the platform and the linker. This needs to be backported to 1.9.	2019-03-29 21:00:22 +01:00
Willy Tarreau	a33d39a1b1	CLEANUP: task: only perform a LIST_DEL() when the list is not empty In tasklet_free() we unconditionally perform a LIST_DEL() even when the list is empty, let's move the LIST_DEL() inside the matching block.	2019-03-25 18:10:53 +01:00
Willy Tarreau	e73256fd2a	BUG/MEDIUM: task/h2: add an idempotent task removal function Previous commit `3ea351368` ("BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.") uncovered an issue which needs to be addressed in the scheduler's API. The function task_remove_from_task_list() was initially designed to remove a task from the running tasklet list from within the scheduler, and had to be used in h2 to abort pending I/O events. However this function was not designed to be idempotent, occasionally causing a double removal from the tasklet list, with the second doing nothing but affecting the apparent tasks count and making haproxy use 100% CPU on some tests consisting in stopping the client during some transfers. The h2_unsubscribe() function can sometimes be called upon stream exit after an error where the tasklet was possibly already removed, so it must tolerate a second removal. This patch does 2 things : - it renames task_remove_from_task_list() to __task_remove_from_tasklet_list() to discourage users from calling it. Also note the fix in the naming since it's a tasklet list and not a task list. This function is still used from the scheduler. - it adds a new, idempotent, task_remove_from_tasklet_list() function which does nothing if the task is already not in the tasklet list. This patch will need to be backported where the commit above is backported.	2019-03-25 18:02:54 +01:00
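The difference between the raw removal and the idempotent one described in e73256fd2a can be illustrated with a minimal sketch. This is not HAProxy's code: the `tl_*` names, the global list head and the counter are hypothetical stand-ins, and the sketch is single-threaded. It only shows why a guarded removal keeps the apparent task count correct when called twice.

```c
/* Hypothetical stand-ins for a tasklet list and its element counter. */
struct tl_node {
    struct tl_node *n;   /* next */
    struct tl_node *p;   /* previous */
};

static struct tl_node tl_head = { &tl_head, &tl_head };
static int tl_count;     /* plays the role of the apparent tasks count */

/* A detached element points to itself. */
static void tl_node_init(struct tl_node *e)
{
    e->n = e->p = e;
}

/* Append <e> at the tail of the list and account for it. */
static void tl_queue(struct tl_node *e)
{
    e->p = tl_head.p;
    e->n = &tl_head;
    tl_head.p->n = e;
    tl_head.p = e;
    tl_count++;
}

/* Raw removal: always unlinks and always decrements the counter, so a
 * second call on the same element would skew the count.
 */
static void tl_remove_raw(struct tl_node *e)
{
    e->n->p = e->p;
    e->p->n = e->n;
    tl_node_init(e);
    tl_count--;
}

/* Idempotent removal: does nothing if <e> is already detached. */
static void tl_remove(struct tl_node *e)
{
    if (e->n == e)
        return;
    tl_remove_raw(e);
}
```

Calling `tl_remove()` twice on the same element leaves `tl_count` unchanged after the first call, whereas calling `tl_remove_raw()` twice would decrement it twice.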
Christopher Faulet	87a8f353f1	CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL Since the flag CF_SHUTR is no more set to mark the end of the message, these flags become useless. This patch should be backported to 1.9.	2019-03-25 06:55:23 +01:00
Christopher Faulet	297d3e2e0f	MINOR: channel: Report EOI on the input channel if it was reached in the mux The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set on the corresponding conn_stream. In addition, if a read activity is reported when this flag is set, the stream is woken up. This patch should be backported to 1.9.	2019-03-25 06:24:43 +01:00
Christopher Faulet	5311a9255d	MINOR: connection: add new flag to mark end of input (EOI) Since the beginning, in the H2 multiplexer, when the end of a message is reached, the flag CS_FL_(R)EOS is set on the conn_stream to notify the upper layer that all data were received and consumed and that no more are expected. The stream-interface converts it into a shutdown read. But it leads to some ambiguities with the real shutr. Once it was reported at the end of the message, there is no way to report it when the read0 is received. For this reason, aborts after the message was fully received cannot be reported. And on the channel side, it is hard to tell the difference between a shutr because the end of the message was reached and a shutr because of an abort. For these reasons, there is now a flag to mark the end of the message. It is called CS_FL_EOI (end-of-input) because it is only used on the receipt path. This flag is only declared and not used yet. This patch will be used by future bug fixes and will have to be backported to 1.9.	2019-03-25 06:24:25 +01:00
Willy Tarreau	0f22299435	CLEANUP: cache: don't export http_cache_applet anymore This one can become static since it's not used by http/htx anymore.	2019-03-19 09:58:35 +01:00
Christopher Faulet	3a78aa6e95	BUG/MINOR: stats: Fully consume large requests in the stats applet In the stats applet (in HTX and legacy HTTP), after a response is fully sent to a client, the request is consumed. It is done at the end, after all the response was copied into the channel's buffer. But only outgoing data at the time the applet is called are consumed. Then the applet is closed. If a request with a huge body is sent, an error is triggered because a SHUTW is caught for an unfinished request. Now, we consume request data until the end. In fact, we don't try to shutdown the request's channel for write anymore. This patch must be backported to 1.9 after some observation period. It should probably be backported in prior versions too. But honestly, with refactoring on the connection layer and the stream interface in 1.9, it is probably safer to not do so.	2019-03-19 09:49:29 +01:00
Willy Tarreau	679bba13f7	MINOR: init: report the list of optionally available services It's never easy to guess what services are built in. We currently have the prometheus exporter in contrib/ which is the only extension for now. Let's enumerate all available ones just like we do for filters and pollers.	2019-03-19 08:08:10 +01:00
Christopher Faulet	203b2b0a5a	MINOR: muxes: Report the Last read with a dedicated flag For convenience, in HTTP muxes (h1 and h2), the end of the stream and the end of the message are reported the same way to the stream, by setting the flag CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for read is reported on the channel side. This is historical. With the legacy HTTP layer, because the parsing is done by the stream in HTTP analyzers, the EOS really means a shutdown for read. Most of the time, for muxes h1 and h2, it works pretty well, especially because the keep-alive is handled by the muxes. The stream is only used for one transaction. So mixing EOS and EOM is good enough. But not every time. For now, client aborts are only reported if it happens before the end of the request. It is an error and it is properly handled. But because the EOS was already reported, client aborts after the end of the request are silently ignored. Eventually an error can be reported when the response is sent to the client, if the sending fails. Otherwise, if the server does not reply fast enough, an error is reported when the server timeout is reached. It is the expected behaviour, except when the option abortonclose is set. In this case, we must report an error when the client aborts. But as said before, this event can be ignored. So to be short, for now, the abortonclose is broken. In fact, it is a design problem and we have to rethink all channel's flags and probably the conn-stream ones too. It is important to split EOS and EOM to not lose information anymore. But it is not a small job and the refactoring will be far from straightforward. So for now, temporary flags are introduced. When the last read is received, the flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag SI_FL_READ_NULL on the stream interface. Both flags are persistent. And to be sure to wake the stream, the event CF_READ_NULL is reported. So the stream will always have the chance to handle the last read. This patch must be backported to 1.9 because it will be used by another patch to fix the option abortonclose.	2019-03-18 15:50:23 +01:00
Christopher Faulet	2b9b6784b9	MINOR: stats: Move stuff about the stats status codes in stats files The status codes definition (STAT_STATUS_*) and their string representation (stat_status_codes) have been moved in stats files. There is no reason to keep them in proto_http files.	2019-03-15 14:34:59 +01:00
Christopher Faulet	3c2ecf75c8	MINOR: stats: Add the status code STAT_STATUS_IVAL to handle invalid requests This patch must be backported to 1.9 because a bug fix depends on it.	2019-03-15 14:34:52 +01:00
Olivier Houchard	1d7f37a2cb	BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq. In task_unlink_rq, to decide if we should lock the global runqueue lock, use the TASK_GLOBAL flag instead of relying on t->thread_mask being tid_bit, as it could be so while still being in the global runqueue if another thread woke that task for us. This should be backported to 1.9.	2019-03-14 16:19:11 +01:00
Olivier Houchard	237985b228	MEDIUM: connections: Use _HA_ATOMIC_* Use _HA_ATOMIC_ instead of HA_ATOMIC_ because we know we don't need barriers	2019-03-14 15:55:15 +01:00
Olivier Houchard	9f8d821a55	MEDIUM: list: Use _HA_ATOMIC_* Use _HA_ATOMIC_ instead of HA_ATOMIC_ because we know we don't need barriers.	2019-03-14 15:55:15 +01:00
Olivier Houchard	17fbb4eb3f	MEDIUM: list: Remove useless barriers. Don't bother forcing a barrier after using HA_ATOMIC_XCHG if we're about to check the returned value anyway.	2019-03-14 15:55:15 +01:00
Willy Tarreau	b0cef35b09	BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED() Injecting on a saturated listener started to exhibit some deadlocks again between LIST_POP_LOCKED() and LIST_DEL_LOCKED(). Olivier found it was due to a leftover from a previous debugging session. This patch fixes it. This will have to be backported if the other LIST_*_LOCKED() patches are backported.	2019-03-13 14:15:54 +01:00
Willy Tarreau	df23c0ce45	MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn Some packages used to rely on DEFAULT_MAXCONN to set the default global maxconn value to use regardless of the initial ulimit. The recent changes made the lowest bound set to 100 so that it is compatible with almost any environment. Now that DEFAULT_MAXCONN is not needed for anything else, we can use it for the lowest bound set when maxconn is not configured. This way it retains its original purpose of setting the default maxconn value even though most of the time the effective value will be higher thanks to the automatic computation based on "ulimit -n".	2019-03-13 10:10:49 +01:00
Willy Tarreau	ca783d4ee6	MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places This entry was still set to 2000 but never used anymore. The only places where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so let's turn these ones to SYSTEM_MAXCONN and remove the default value for DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.	2019-03-13 10:10:25 +01:00
Olivier Houchard	20872763dd	MEDIUM: memory: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:38 +01:00
Olivier Houchard	4c28328572	MEDIUM: task: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	aa4d71a7fe	MEDIUM: server: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	11ecfd1c01	MEDIUM: proxy: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	d5f9b19196	MEDIUM: freq_ctr: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	d360879fb5	MEDIUM: fd: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	8beb27e9ce	MEDIUM: xref: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	a2735340fb	MEDIUM: applets: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:37 +01:00
Olivier Houchard	d2b5d16187	MEDIUM: various: Use __ha_barrier_atomic* when relevant. When protecting data modified by atomic operations, use __ha_barrier_atomic* to avoid unneeded barriers on x86.	2019-03-11 17:02:37 +01:00
Olivier Houchard	d0c3b8894a	MINOR: threads: Add macros to do atomic operation with no memory barrier. Add variants of the HA_ATOMIC* macros, prefixed with a _, that do the atomic operation with no barrier generated by the compiler. It is expected the developer adds barriers manually if needed.	2019-03-11 17:02:37 +01:00
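The split between a barrier-free atomic operation and an explicitly placed barrier, as described in d0c3b8894a, can be sketched with standard C11 atomics. This is only an analogy under stated assumptions: HAProxy's `_HA_ATOMIC_*` and `__ha_barrier_*` macros are not reproduced here, and every `*_sketch` name below is hypothetical.

```c
#include <stdatomic.h>

static atomic_int counter_sketch;   /* updated with no ordering constraint */
static atomic_int ready_sketch;     /* flag guarding payload_sketch */
static int payload_sketch;          /* plain data published via the flag */

/* Atomic add with no barrier requested from the compiler. Returns the new
 * value of the counter.
 */
static int add_relaxed_sketch(int v)
{
    return atomic_fetch_add_explicit(&counter_sketch, v,
                                     memory_order_relaxed) + v;
}

/* Writer: the barrier is added by hand, between the data store and the
 * relaxed flag store, so the payload is visible before the flag.
 */
static void publish_sketch(int value)
{
    payload_sketch = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready_sketch, 1, memory_order_relaxed);
}

/* Reader: relaxed flag load, then a manual acquire barrier before reading
 * the payload. Returns 1 and fills <out> if data was published, else 0.
 */
static int consume_sketch(int *out)
{
    if (!atomic_load_explicit(&ready_sketch, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = payload_sketch;
    return 1;
}
```

The counter needs no barrier because nothing else depends on its ordering, while the flag/payload pair is exactly the case where the developer is expected to add one manually.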
Olivier Houchard	113537967c	MEDIUM: threads: Use __ATOMIC_SEQ_CST when using the newer atomic API. When using the new __atomic* API, ask the compiler to generate barriers. A variant of those functions that don't generate barriers will be added later. Before that, using HA_ATOMIC* would not generate any barrier, and some parts of the code should be reviewed and missing barriers should be added. This should probably be backported to 1.8 and 1.9.	2019-03-11 17:02:37 +01:00
Olivier Houchard	9abcf6ef9a	MINOR: threads: Implement __ha_barrier_atomic*. Implement __ha_barrier functions to be used when trying to protect data modified by atomic operations (except when using HA_ATOMIC_STORE). On Intel, atomic operations use either the LOCK prefix or xchg, and both act as a full barrier, so there's no need to add an extra barrier.	2019-03-11 17:02:37 +01:00
Olivier Houchard	92fce85d03	MINOR: fd: Remove debugging code. Remove a debugging test, and call to abort, it's no longer needed.	2019-03-08 16:05:25 +01:00
Willy Tarreau	1e56c70cc9	OPTIM: task: limit the impact of memory barriers in task_remove_from_task_list() In this function we end up with successive locked operations then a store barrier, and in addition the compiler has to emit less efficient code due to a longer jump. There's no need for absolutely updating the tasks_run_queue counter before clearing the task's leaf pointer, so let's swap the two operations and benefit from a single barrier as much as possible. This code is on the hot path and shows about half a percent of improvement with 8 threads.	2019-03-07 18:44:12 +01:00
Willy Tarreau	0cf33176bd	MINOR: listener: move thr_idx from the bind_conf to the listener Tests show that it's slightly faster to have this field in the listener. The cache walk patterns are under heavy stress and having only this field written to in the bind_conf was wasting a cache line that was heavily read. Let's move this close to the other entries already written to in the listener. Warning, the position does have an impact on peak performance.	2019-03-07 14:08:26 +01:00
Willy Tarreau	9f1d4e7f7f	CLEANUP: listener: remove old thread bit mapping Now that the P2C algorithm for the accept queue is removed, we don't need to map a number to a thread bit anymore, so let's remove all these fields which are taking quite some space for no reason.	2019-03-07 13:59:04 +01:00
Willy Tarreau	d87a67f9bc	MINOR: tools: implement my_flsl() We already have my_ffsl() to find the lowest bit set in a word, and this patch implements the search for the highest bit set in a word. On x86 it uses the bsr instruction and on other architectures it uses an efficient implementation.	2019-03-07 13:48:04 +01:00
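A portable search for the highest bit set, in the spirit of d87a67f9bc, might look like the sketch below. The name `flsl_sketch` and the return convention (1-based position, 0 for an empty word, mirroring `ffs()`) are assumptions, not taken from the commit, and this simple loop is not the optimized implementation the commit refers to.

```c
/* Return the 1-based position of the highest bit set in <a>, or 0 if no
 * bit is set. Assumed convention, chosen to mirror ffs().
 */
static unsigned int flsl_sketch(unsigned long a)
{
    unsigned int pos = 0;

    /* Shift right until the word is empty, counting the steps. */
    while (a) {
        a >>= 1;
        pos++;
    }
    return pos;
}
```

On x86 the same result can be obtained from a single bit-scan instruction, which is what makes a dedicated helper worthwhile on hot paths.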
Willy Tarreau	fc630bd373	MINOR: listener: improve incoming traffic distribution By picking two randoms following the P2C algorithm, we seldom observe asymmetric loads on bursts of small session counts. This is typically what makes h2load take a bit of time to complete the last 100% because if a thread gets two connections while the other ones only have one, it takes twice the time to complete its work. This patch proposes a modification of the p2c algorithm which seems more suitable to this case : it mixes a rotating index with a random. This way, we're certain that all threads are consulted in turn and at the same time we're not forced to use the ones we're giving a chance. This significantly increases the traffic rate. Now h2load shows faster completion and the average request rates on H2 and the TLS resume rate increases by a bit more than 5% compared to pure p2c. The index was placed into the struct bind_conf because 1) it's faster there and it's the best place to optimally distribute traffic among a group of listeners. It's the only runtime-modified element there and it will be quite cache-hot.	2019-03-07 13:48:04 +01:00
Willy Tarreau	b238b12e98	MINOR: task: use LIST_DEL_INIT() to remove a task from the queue By using LIST_DEL_INIT() instead of LIST_DEL()+LIST_INIT() we manage to bump the peak connection rate by no less than 3% on 8 threads. The perf top profile shows much less contention in this area which suffered from the second reload.	2019-03-07 11:45:44 +01:00
Willy Tarreau	c5bd311b2a	MINOR: lists: add a LIST_DEL_INIT() macro It turns out that we call LIST_DEL+LIST_INIT very frequently and that the compiler doesn't know what pointers get modified in the e->n->p and e->p->n dance, so when LIST_INIT() is called, it reloads these pointers, which is quite a bit of a mess in terms of performance. This patch adds LIST_DEL_INIT() to perform the two operations at once using local temporary variables so that the compiler knows these pointers are left unaffected.	2019-03-07 11:45:44 +01:00
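The idea behind LIST_DEL_INIT() in c5bd311b2a can be sketched as follows. This is a simplified illustration using inline functions rather than HAProxy's actual macros; the lower-case names are hypothetical, and only the `n`/`p` field names come from the commit message.

```c
/* Minimal circular doubly-linked list element: n = next, p = previous. */
struct list {
    struct list *n;
    struct list *p;
};

/* A detached element (or an empty head) points to itself. */
static inline void list_init(struct list *e)
{
    e->n = e->p = e;
}

/* Append <e> at the tail of the list headed by <head>. */
static inline void list_addq(struct list *head, struct list *e)
{
    e->p = head->p;
    e->n = head;
    head->p->n = e;
    head->p = e;
}

/* Unlink <e> and reinitialize it in one step. The neighbours are loaded
 * once into local variables, so the compiler does not have to reload
 * e->n and e->p after the stores through next->p and prev->n.
 */
static inline void list_del_init(struct list *e)
{
    struct list *next = e->n;
    struct list *prev = e->p;

    next->p = prev;
    prev->n = next;
    e->n = e->p = e;
}

/* Return non-zero if <e> is detached (or, for a head, if the list is empty). */
static inline int list_is_empty(const struct list *e)
{
    return e->n == e;
}
```

Because the element is left pointing to itself, calling `list_del_init()` again on it is harmless in this single-threaded sketch.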
Frédéric Lécaille	5f33f85ce8	MINOR: sample: Extract some protocol buffers specific code. We move the code responsible for parsing protocol buffers messages inside gRPC messages from sample.c to include/proto/protocol_buffers.h so that it can be reused to cascade the "ungrpc" converter.	2019-03-06 15:36:02 +01:00
Frédéric Lécaille	756d97f205	MINOR: sample: Rework gRPC converter code. From now on, "ungrpc" may take a second optional argument to provide the protocol buffers types used to encode the field value to be extracted. When absent the field value is extracted as a binary sample which may then be followed by other converters like "hex" which takes binary as input sample. When this second argument is a type which does not match the one found by "ungrpc", this field is considered as not found even if present. With this patch we also remove the useless "varint" and "svarint" converters. Update the documentation about "ungrpc" converters.	2019-03-05 11:04:23 +01:00
Frédéric Lécaille	7c93e88d0c	MINOR: sample: Code factorization "ungrpc" converter. Parsing protocol buffer fields always consists in skipping the field if the field is not found or storing the field value if found. So, with this patch we factorize a little bit the code for the "ungrpc" converter.	2019-03-05 11:03:53 +01:00
Willy Tarreau	967de20a43	BUG/MEDIUM: list: fix again LIST_ADDQ_LOCKED Well, that's becoming embarrassing. Now this fixes commit `4ef6801c` ("BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last element") which itself tried to fix commit `285192564`. This fix only works under low contention and was tested with the listener's queue. With the idle conns it's obvious that it's still wrong since adding more than one element to the list leaves a LLIST_BUSY pointer into the list's head. This was visible when accumulating idle connections in a server's list. This new version of the fix almost goes back to the original code, except that since then we addressed issues with expectedly idempotent operations that were not. Now the code has been verified on paper again and has survived 300 million connections spread over 4 threads. This will have to be backported if the commit above is backported.	2019-03-04 14:09:22 +01:00
Willy Tarreau	bf6964007a	MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max values Let's keep a copy of these initial values. They will be useful to compute automatic maxconn, as well as to restore proper limits when doing an execve() on external checks.	2019-03-01 10:40:30 +01:00
Frédéric Lécaille	645635da84	MINOR: peers: Add a message for heartbeat. This patch implements peer heartbeat feature to prevent any haproxy peer from reconnecting too often, consuming sockets for nothing. To do so, we add PEER_MSG_CTRL_HEARTBEAT new message to PEER_MSG_CLASS_CONTROL peers control class of messages. A ->heartbeat field is added to peer structs to store the heartbeat timeout value which is handled by the same function as for ->reconnect to control the session timeouts. A 2-byte heartbeat message is sent every 3s when no updates have to be sent. This way, the peer which receives such a message is sure the remote peer is still alive. So, it resets the ->reconnect peer session timeout to its initial value (5s). This prevents any reconnection to an already connected alive peer.	2019-03-01 09:33:26 +01:00
Willy Tarreau	c8d5b95e6d	MEDIUM: config: don't enforce a low frontend maxconn value anymore Historically the default frontend's maxconn used to be quite low (2000), which was sufficient two decades ago but often proved to be a problem when users had purposely set the global maxconn value but forgot to set the frontend's. There is no point in keeping this arbitrary limit for frontends : when the global maxconn is lower, it's already too high and when the global maxconn is much higher, it becomes a limiting factor which causes trouble in production. This commit allows the value to be set to zero, which becomes the new default value, to mean it's not directly limited, or in fact it's set to the global maxconn. Since this operation used to be performed before computing a possibly automatic global maxconn based on memory limits, the calculation of the maxconn value and its propagation to the backends' fullconn has now moved to a dedicated function, proxy_adjust_all_maxconn(), which is called once the global maxconn is stabilized. This comes with two benefits : 1) a configuration missing "maxconn" in the defaults section will not limit itself to a magically hardcoded value but will scale up to the global maxconn ; 2) when the global maxconn is not set and memory limits are used instead, the frontends' maxconn automatically adapts, and the backends' fullconn as well.	2019-02-28 17:05:32 +01:00
Willy Tarreau	e2711c7bd6	MINOR: listener: introduce listener_backlog() to report the backlog value In an attempt to try to provide automatic maxconn settings, we need to decorrelate a listener's backlog and maxconn so that these values can be independent. This introduces a listener_backlog() function which retrieves the backlog value from the listener's backlog, the frontend's, the listener's maxconn, the frontend's or falls back to 1024. This corresponds to what was done in cfgparse.c to force a value there except the last fallback which was not set since the frontend's maxconn is always known.	2019-02-28 17:05:29 +01:00
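The fallback order described for listener_backlog() in e2711c7bd6 can be sketched as below. The structures and field names are hypothetical simplifications, and treating 0 as "not configured" is an assumption; only the order of the lookups and the final 1024 come from the commit message.

```c
/* Hypothetical, simplified stand-ins for the frontend and listener. */
struct frontend_sketch {
    int backlog;    /* 0 = not configured (assumption) */
    int maxconn;    /* 0 = not configured (assumption) */
};

struct listener_sketch {
    int backlog;
    int maxconn;
    const struct frontend_sketch *fe;
};

/* Return the first configured value in this order: listener's backlog,
 * frontend's backlog, listener's maxconn, frontend's maxconn, then 1024.
 */
static int backlog_sketch(const struct listener_sketch *l)
{
    if (l->backlog)
        return l->backlog;
    if (l->fe->backlog)
        return l->fe->backlog;
    if (l->maxconn)
        return l->maxconn;
    if (l->fe->maxconn)
        return l->fe->maxconn;
    return 1024;
}
```

Keeping the lookup in one function means the backlog and maxconn settings can later be changed independently without touching the callers.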
Willy Tarreau	4ef6801cd4	BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last element As seen with Olivier, in the end the fix in commit `285192564` ("BUG/MEDIUM: list: fix LIST_POP_LOCKED's removal of the last pointer") is wrong, the code there was right but the bug was triggered by another bug in LIST_ADDQ_LOCKED() which doesn't properly update the list's head by inserting in the wrong order. This will have to be backported if the commit above is backported.	2019-02-28 16:51:28 +01:00
Willy Tarreau	01abd02508	BUG/MEDIUM: listener: use a self-locked list for the dequeue lists There is a very difficult to reproduce race in the listener's accept code, which is much easier to reproduce once connection limits are properly enforced. It's an ABBA lock issue : - the following functions take l->lock then lq_lock : disable_listener, pause_listener, listener_full, limit_listener, do_unbind_listener - the following ones take lq_lock then l->lock : resume_listener, dequeue_all_listener This is because __resume_listener() only takes the listener's lock and expects to be called with lq_lock held. The problem can easily happen when listener_full() and limit_listener() are called a lot while in parallel another thread releases sessions for the same listener using listener_release() which in turn calls resume_listener(). This scenario is more prevalent in 2.0-dev since the removal of the accept lock in listener_accept(). However in 1.9 and before, a different but extremely unlikely scenario can happen : thread1 thread2 ............................ enter listener_accept() limit_listener() ............................ long pause before taking the lock session_free() dequeue_all_listeners() lock(lq_lock) [1] ............................ try_lock(l->lock) [2] __resume_listener() spin_lock(l->lock) =>WAIT[2] ............................ accept() l->accept() nbconn==maxconn => listener_full() state==LI_LIMITED => lock(lq_lock) =>DEADLOCK[1]! In practice it is almost impossible to trigger it because it requires to limit both on the listener's maxconn and the frontend's rate limit, at the same time, and to release the listener when the connection rate goes below the limit between poll() returns the FD and the lock is taken (a few nanoseconds). But maybe with threads competing on the same core it has more chances to appear. 
This patch removes the lq_lock and replaces it with a lockless queue for the listener's wait queue (well, technically speaking a self-locked queue) brought by commit `a8434ec14` ("MINOR: lists: Implement locked variations.") and its few subsequent fixes. This relieves us from the need of the lq_lock and removes the deadlock. It also gets rid of the distinction between __resume_listener() and resume_listener() since the only difference was the lq_lock. All listener removals from the list are now unconditional to avoid races on the state. It's worth noting that the list used to never be initialized and that it used to work only thanks to the state tests, so the initialization has now been added. This patch must carefully be backported to 1.9 and very likely 1.8. It is mandatory to be careful about replacing all manipulations of l->wait_queue, global.listener_queue and p->listener_queue.	2019-02-28 16:08:54 +01:00
Willy Tarreau	c912f94b57	MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED Since LIST_DEL_LOCKED() and LIST_POP_LOCKED() now automatically reinitialize the removed element, there's no need for keeping this LIST_INIT() call in the idle connection code.	2019-02-28 16:08:54 +01:00
Willy Tarreau	4c747e86cd	MINOR: list: make the delete and pop operations idempotent These operations previously used to return a "locked" element, which is a constraint when multiple threads try to delete the same element, because the second one will block indefinitely. Instead, let's make sure that both LIST_DEL_LOCKED() and LIST_POP_LOCKED() always reinitialize the element after deleting it. This ensures that the second thread will immediately unblock and succeed with the removal. It also secures the pop vs delete competition that may happen when trying to remove an element that's about to be dequeued.	2019-02-28 16:03:29 +01:00
Willy Tarreau	690d2ad4d2	BUG/MEDIUM: list: add missing store barriers when updating elements and head Commit `a8434ec14` ("MINOR: lists: Implement locked variations.") introduced locked lists which use the elements pointers as locks for concurrent operations. Under heavy stress the lists occasionally fail. The cause is a missing barrier at some points when updating the list element and the head : nothing prevents the compiler (or CPU) from updating the list head first before updating the element, making another thread jump to a wrong location. This patch simply adds the missing barriers before these two operations. This will have to be backported if the commit above is backported.	2019-02-28 15:59:31 +01:00
Willy Tarreau	285192564d	BUG/MEDIUM: list: fix LIST_POP_LOCKED's removal of the last pointer There was a typo making the last updated pointer be the pre-last element's prev instead of the last's prev element. It didn't show up during early tests because the contention is very rare on this one and it's implicitly recovered when updating the pointers to go to the next element, but it was clearly visible in the listener_accept() tests by having all threads block on LIST_POP_LOCKED() with n==p==LLIST_BUSY. This will have to be backported if commit `a8434ec14` ("MINOR: lists: Implement locked variations.") is backported.	2019-02-28 15:59:31 +01:00
Willy Tarreau	bd20ad5874	BUG/MEDIUM: list: fix the rollback on addq in the locked lists Commit `a8434ec14` ("MINOR: lists: Implement locked variations.") introduced locked lists which use the elements pointers as locks for concurrent operations. A copy-paste typo in LIST_ADDQ_LOCKED() causes corruption in the list in case the next pointer is already held, as it restores the previous pointer into the next one. It may impact the server pools. This will have to be backported if the commit above is backported.	2019-02-28 15:10:15 +01:00
Willy Tarreau	149ab779cc	MAJOR: threads: enable one thread per CPU by default Threads have long matured by now, still for most users their usage is not trivial. It's about time to enable them by default on platforms where we know the number of CPUs bound. This patch does this, it counts the number of CPUs the process is bound to upon startup, and enables as many threads by default. Of course, "nbthread" still overrides this, but if it's not set the default behaviour is to start one thread per CPU. The default number of threads is reported in "haproxy -vv". Simply using "taskset -c" is now enough to adjust this number of threads so that there is no more need for playing with cpu-map. And thanks to the previous patches on the listener, the vast majority of configurations will not need to duplicate "bind" lines with the "process x/y" statement anymore either, so a simple config will automatically adapt to the number of processors available.	2019-02-27 14:51:50 +01:00
Willy Tarreau	7ac908bf8c	MINOR: config: add global tune.listener.multi-queue setting tune.listener.multi-queue { on | off } Enables ('on') or disables ('off') the listener's multi-queue accept which spreads the incoming traffic to all threads a "bind" line is allowed to run on instead of taking them for itself. This provides a smoother traffic distribution and scales much better, especially in environments where threads may be unevenly loaded due to external activity (network interrupts colliding with one thread for example). This option is enabled by default, but it may be forcefully disabled for troubleshooting or for situations where it is estimated that the operating system already provides a good enough distribution and connections are extremely short-lived.	2019-02-27 14:27:07 +01:00
Willy Tarreau	8a03408d81	MINOR: activity: add accept queue counters for pushed and overflows It's important to monitor the accept queues to know if some incoming connections had to be handled by their originating thread due to an overflow. It's also important to be able to confirm thread fairness. This patch adds "accq_pushed" to activity reporting, which reports the number of connections that were successfully pushed into each thread's queue, and "accq_full", which indicates the number of connections that couldn't be pushed because the thread's queue was full.	2019-02-27 14:27:07 +01:00
Willy Tarreau	1efafce61f	MINOR: listener: implement multi-queue accept for threads There is one point where we can migrate a connection to another thread without taking risk, it's when we accept it : the new FD is not yet in the fd cache and no task was created yet. It's still possible to assign it a different thread than the one which accepted the connection. The only requirement for this is to have one accept queue per thread and their respective processing tasks that have to be woken up each time an entry is added to the queue. This is a multiple-producer, single-consumer model. Entries are added at the queue's tail and the processing task is woken up. The consumer picks entries at the head and processes them in order. The accept queue contains the fd, the source address, and the listener. Each entry of the accept queue was rounded up to 64 bytes (one cache line) to avoid cache aliasing because tests have shown that otherwise performance suffers a lot (5%). A test has shown that it's important to have at least 256 entries for the rings, as at 128 it's still possible to fill them often at high loads on small thread counts. The processing task does almost nothing except calling the listener's accept() function and updating the global session and SSL rate counters just like listener_accept() does on synchronous calls. At this point the accept queue is implemented but not used.	2019-02-27 14:27:07 +01:00
Willy Tarreau	b2b50a7784	MINOR: listener: pre-compute some thread counts per bind_conf In order to quickly pick a thread ID when accepting a connection, we'll need to know certain pre-computed values derived from the thread mask, which are counts of bits per position multiples of 1, 2, 4, 8, 16 and 32. In practice it is sufficient to compute only the 4 first ones and store them in the bind_conf. We update the count every time the bind_thread value is adjusted. The fields in the bind_conf struct have been moved around a little bit to make it easier to group all thread bit values into the same cache line. The function used to return a thread number is bind_map_thread_id(), and it maps a number between 0 and 31/63 to a thread ID between 0 and 31/63, starting from the left.	2019-02-27 14:27:07 +01:00
Willy Tarreau	f3241115e7	MINOR: tools: implement functions to look up the nth bit set in a mask Function mask_find_rank_bit() returns the bit position in mask <m> of the nth bit set of rank <r>, between 0 and LONGBITS-1 included, starting from the left. For example ranks 0,1,2,3 for mask 0x55 will be 6, 4, 2 and 0 respectively. This algorithm is based on a popcount variant and is described here : https://graphics.stanford.edu/~seander/bithacks.html.	2019-02-27 14:27:07 +01:00
Willy Tarreau	9e85318417	MINOR: listener: maintain a per-thread count of the number of connections on a listener Having this information will help us improve thread-level distribution of incoming traffic.	2019-02-27 14:27:07 +01:00
Willy Tarreau	a36b324777	MEDIUM: listener: keep a single thread-mask and warn on "process" misuse Now that nbproc and nbthread are exclusive, we can still provide more detailed explanations about what we've found in the config when a bind line appears on multiple threads and processes at the same time, then ignore the setting. This patch reduces the listener's thread mask to a single mask instead of an array of masks per process. Now we have only one thread mask and one process mask per bind-conf. This removes ~504 bytes of RAM per bind-conf and will simplify handling of thread masks. If a "bind" line only refers to process numbers not found by its parent frontend or not covered by the global nbproc directive, or to a thread not covered by the global nbthread directive, a warning is emitted saying what will be used instead.	2019-02-27 14:27:07 +01:00
Olivier Houchard	db64489aac	BUG/MEDIUM: lists: Properly handle the case we're removing the first elt. In LIST_DEL_LOCKED(), initialize p2 to NULL, and only attempt to set it back to its previous value if we had a previous element, and thus p2 is non-NULL.	2019-02-26 18:47:59 +01:00
Olivier Houchard	9ea5d361ae	MEDIUM: servers: Reorganize the way idle connections are cleaned. Instead of having one task per thread and per server that cleans the idling connections, have only one global task for all servers. That task parses all the servers that currently have idling connections, and removes half of them, to put them in a per-thread list of connections to kill. For each thread that does have connections to kill, wake a task to do so, so that the cleaning will be done in the context of said thread.	2019-02-26 18:17:32 +01:00
Olivier Houchard	7f1bc31fee	MEDIUM: servers: Used a locked list for idle_orphan_conns. Use the locked macros when manipulating idle_orphan_conns, so that other threads can remove elements from it. It will be useful later to avoid having a task per server and per thread to cleanup the orphan list.	2019-02-26 18:17:32 +01:00
Olivier Houchard	a8434ec146	MINOR: lists: Implement locked variations. Implement LIST_ADD_LOCKED(), LIST_ADDQ_LOCKED(), LIST_DEL_LOCKED() and LIST_POP_LOCKED(). LIST_ADD_LOCKED, LIST_ADDQ_LOCKED and LIST_DEL_LOCKED work the same as LIST_ADD, LIST_ADDQ and LIST_DEL, except before any manipulation it locks the relevant elements of the list, so it's safe to manipulate the list with multiple threads. LIST_POP_LOCKED() removes the first element from the list, and returns its data.	2019-02-26 18:17:32 +01:00
Frédéric Lécaille	1fceee8316	MINOR: http_fetch: add "req.ungrpc" sample fetch for gRPC. This patch implements "req.ungrpc" sample fetch method to decode and parse a gRPC request. It takes only one argument: a protocol buffers field number to identify the protocol buffers message number to be looked up. This argument is a sort of path in dotted notation to the terminal field number to be retrieved. ex: req.ungrpc(1.2.3.4) This sample fetch catches the data in raw mode, without interpreting it. Some protocol buffers specific converters may be used to convert the data to the correct type.	2019-02-26 16:27:05 +01:00
Frédéric Lécaille	3a463c92cf	MINOR: arg: Add support for ARGT_PBUF_FNUM arg type. This new argument type is used to parse Protocol Buffers field number with dotted notation (e.g: 1.2.3.4).	2019-02-26 16:27:05 +01:00
Frédéric Lécaille	3b71716685	MINOR: standard: Add a function to parse uints (dotted notation). This function is useful to parse strings made of unsigned integers and to allocate a C array of unsigned integers from there. For instance this function allocates this array { 1, 2, 3, 4, } from this string: "1.2.3.4".	2019-02-26 16:27:05 +01:00
Christopher Faulet	c6827d52c1	MINOR: channel/htx: Add function to skip output bytes from an HTX channel It is the HTX version of co_skip(). Internally, it uses the function htx_drain(). It will be used by other commits to fix bugs, so it must be backported to 1.9.	2019-02-26 14:04:23 +01:00
Christopher Faulet	549822f0a1	MINOR: htx: Add function to drain data from an HTX message The function htx_drain() can now be used to drain data from an HTX message. It will be used by other commits to fix bugs, so it must be backported to 1.9.	2019-02-26 14:04:23 +01:00
Christopher Faulet	729b5b308c	BUG/MINOR: channel: Set CF_WROTE_DATA when outgoing data are skipped In co_skip(), the flag CF_WRITE_PARTIAL is set on the channel. The flag CF_WROTE_DATA must also be set to notify the channel that some data were sent. This patch must be backported to 1.9.	2019-02-26 14:04:23 +01:00
Richard Russo	bc9d9844d5	BUG/MAJOR: fd/threads, task/threads: ensure all spin locks are unlocked Calculate if the fd or task should be locked once, before locking, and reuse the calculation when determining when to unlock. Fixes a race condition added in `87d54a9a` for fds, and `b20aa9ee` for tasks, released in 1.9-dev4. When one thread modifies thread_mask to be a single thread for a task or fd while a second thread has locked or is waiting on a lock for that task or fd, the second thread will not unlock it. For FDs, this is observable when a listener is polled by multiple threads, and is closed while those threads have events pending. For tasks, this seems possible, where task_set_affinity is called, but I did not observe it. This must be backported to 1.9.	2019-02-25 16:16:36 +01:00
Willy Tarreau	2d7f81b809	MINOR: fd: add a new my_closefrom() function to close all FDs This is a naive implementation of closefrom() which closes all FDs starting from the one passed in argument. closefrom() is not provided on all operating systems, and other versions will follow.	2019-02-21 22:19:17 +01:00
Olivier Houchard	f131481a0a	BUG/MEDIUM: servers: Add a per-thread counter of idle connections. Add a per-thread counter of idling connections, and use it to determine how many connections we should kill after the timeout, instead of using the global counter, or we're likely to just kill most of the connections. This should be backported to 1.9.	2019-02-21 19:07:45 +01:00
Olivier Houchard	e737103173	BUG/MEDIUM: servers: Use atomic operations when handling curr_idle_conns. Use atomic operations when dealing with srv->curr_idle_conns, as it's shared between threads, otherwise we could get inconsistencies. This should be backported to 1.9.	2019-02-21 19:07:19 +01:00
Christopher Faulet	0b46548a68	BUG/MEDIUM: h2/htx: Correctly handle interim responses when HTX is enabled 1xx responses do not work in HTTP2 when the HTX is enabled. First of all, when a response is parsed, only one HEADERS frame is expected. So when an interim response is received, the flag H2_SF_HEADERS_RCVD is set and the next HEADERS frame (for another interim response or the final one) is parsed as a trailers one. Then when the response is sent, because an EOM block is found at the end of the interim HTX response, the ES flag is added on the frame, closing the stream too early. Here, it is a design problem of the HTX. Interim responses are considered as full messages, leading to some ambiguities when HTX messages are processed. This will not be fixed now, but we need to keep it in mind for future improvements. To fix the parsing bug, the flag H2_MSGF_RSP_1XX is added when the response headers are decoded. When this flag is set, an EOM block is added into the HTX message, despite the fact that there is no ES flag on the frame. And we don't set the flag H2_SF_HEADERS_RCVD on the corresponding H2S. So the next HEADERS frame will not be parsed as a trailers one. To fix the sending bug, the ES flag is not set on the frame when an interim response is processed and the flag H2_SF_HEADERS_SENT is not set on the corresponding H2S. This patch must be backported to 1.9.	2019-02-19 16:26:14 +01:00
Olivier Houchard	9efa7b8ba8	BUILD/MEDIUM: initcall: Fix build on MacOS. MacOS syntax for sections is a bit different, so implement it. (see issue #42). This should be backported to 1.9.	2019-02-15 14:32:35 +01:00
Frédéric Lécaille	76d2cef0c2	BUG/MEDIUM: peers: Missing peer initializations. Initialize ->srv peer field for all the peers, the local peer included. Indeed, a haproxy process needs to connect to the local peer of a remote process. Furthermore, when a "peer" or "server" line is parsed by parse_server() the address must be copied to ->addr field of the peer object only if this address has been also parsed by parse_server(). This is not the case if this address belongs to the local peer and is provided on a "server" line. After having parsed the "peer" or "server" lines of a peers section, the ->srv part of all the peers must be initialized for SSL, if enabled. Same thing for the binding part. Revert `1417f0b` commit which is no longer required. No backport is needed, this is purely 2.0.	2019-02-12 19:49:22 +01:00
Ben51Degrees	4ddf59d070	MEDIUM: 51d: Enabled multi threaded operation in the 51Degrees module. The existing threading flag in the 51Degrees API (FIFTYONEDEGREES_NO_THREADING) has now been mapped to the HAProxy threading flag (USE_THREAD), and the 51Degrees module code has been made thread safe. In Pattern, the cache is now locked with a spin lock from hathreads.h using a new label 'OTHER_LOCK'. The workset pool is now created with the same size as the number of threads to avoid any time waiting on a workset. In Hash Trie, the global device offsets structure is only used in single threaded operation. Multi threaded operation creates a new offsets structure in each thread.	2019-02-08 21:29:23 +01:00
Willy Tarreau	1417f0b5dc	BUG/MEDIUM: peers: check that p->srv actually exists before using p->srv->use_ssl Commit `1055e687a` ("MINOR: peers: Make outgoing connection to SSL/TLS peers work.") introduced an "srv" field in the peers, which points to the equivalent server to hold SSL settings. This one is not set when the peer is local so we must always test it before testing p->srv->use_ssl otherwise haproxy dies during reloads. No backport is needed, this is purely 2.0.	2019-02-08 10:22:31 +01:00
Willy Tarreau	ff9c9140f4	MINOR: config: make MAX_PROCS configurable at build time For some embedded systems, it's pointless to have 32- or even 64- large arrays of processes when it's known that much fewer processes will be used in the worst case. Let's introduce this MAX_PROCS define which contains the highest number of processes allowed to run at once. It still defaults to LONGBITS but may be lowered.	2019-02-07 15:10:19 +01:00
Willy Tarreau	980855bd95	BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end This also depends on the nbthread count, so it must only be performed after parsing the whole config file. As a side effect, this removes some code duplication between servers and server-templates. This must be backported to 1.9.	2019-02-07 15:08:13 +01:00
Willy Tarreau	2415727a00	MINOR: global: add proc_mask() and thread_mask() These two functions return either all_{proc,threads}_mask, or the argument. This is used to default to all_proc_mask or all_threads_mask when not set on bind_conf or proxies.	2019-02-04 05:09:15 +01:00
Willy Tarreau	a38a7175b1	MINOR: config: keep an all_proc_mask like we have all_threads_mask This simplifies some mask comparisons at various places where nbits(global.nbproc) was used.	2019-02-04 05:09:15 +01:00
Willy Tarreau	cafa56ecd6	MINOR: tools: improve the popcount() operation We'll call popcount() more often so better use a parallel method than an iterative one. One optimal design is proposed at the site below. It requires a fast multiplication though, but even without it will still be faster than the iterative one, and all relevant 64 bit platforms do have a multiply unit. https://graphics.stanford.edu/~seander/bithacks.html	2019-02-04 05:09:15 +01:00
Willy Tarreau	4ed84c96cf	OPTIM: listener: optimize cache-line packing for struct listener Some unused fields were placed early and some important ones were on the second cache line. Let's move the proto_list and name closer to the end of the structure to bring accept() and default_target() into the first cache line.	2019-02-04 05:09:14 +01:00
Willy Tarreau	da9e939f3c	CLEANUP: threads: fix misleading comment about all_threads_mask This variable changed a bit after 1.8, it's never zero anymore.	2019-02-02 17:48:39 +01:00
Olivier Houchard	dc21ff778b	MINOR: debug: Add an option that causes random allocation failures. When compiling with DEBUG_FAIL_ALLOC, add a new option, tune.fail-alloc, that gives the percentage of chances an allocation fails. This is useful to check that allocation failures are always handled gracefully.	2019-01-31 19:38:25 +01:00
Olivier Houchard	ff5dd74e25	MINOR: xref: Add missing barriers. Add a few missing barriers in the xref code, it's unlikely to be a problem for x86, but may be on architectures with weak memory ordering.	2019-01-31 19:38:25 +01:00
Willy Tarreau	00f18a36b6	BUG/MINOR: server: fix logic flaw in idle connection list management With variable connection limits, it's not possible to accurately determine whether the mux is still in use by comparing usage and max to be equal due to the fact that one determines the capacity and the other one takes care of the context. This can cause some connections to be dropped before they reach their stream ID limit. It seems it could also cause some connections to be terminated with streams still alive if the limit was reduced to match the newly computed avail_streams() value, though this cannot yet happen with existing muxes. Instead let's switch to usage reports and simply check whether connections are both unused and available before adding them to the idle list. This should be backported to 1.9.	2019-01-31 19:38:25 +01:00
Willy Tarreau	51d0a7e54c	MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection This is the equivalent of SI_FL_KILL_CONN but for the connstreams. It will be set by the stream-interface during the various shutdown operations.	2019-01-31 19:38:25 +01:00
Willy Tarreau	0f9cd7b196	MINOR: stream-int: add a new flag to mention that we want the connection to be killed The new flag SI_FL_KILL_CONN is now set by the rare actions which deliberately want the whole connection (and not just the stream) to be killed. This is only used for "tcp-request content reject", "tcp-response content reject", "tcp-response content close" and "http-request reject". The purpose is to disambiguate the close from a regular shutdown. This will be used by the next patches.	2019-01-31 19:38:25 +01:00
Olivier Houchard	8788b4111c	BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE. If we're adding a connection to the server orphan idle list, don't forget to remove the CO_FL_SESS_IDLE flag, or we will assume later it's still attached to a session. This should be backported to 1.9.	2019-01-31 19:38:25 +01:00
Willy Tarreau	e5fcfbed5c	MINOR: htx: never check for null htx pointer in htx_is_{,not_}empty() The previous patch clarifies the fact that the htx pointer is never null along all the code. This test for a null will never match, didn't catch the pointer 1 before the fix for b_is_null(), but it confuses the compiler letting it think that any dereferences made to this pointer after this test could actually mean we're dereferencing a null. Let's now drop this test. This saves us from having to add impossible tests everywhere to avoid the warning. This should be backported to 1.9 if the b_is_null() patch is backported.	2019-01-31 08:07:17 +01:00
Willy Tarreau	245d189cce	DOC: htx: make it clear that htxbuf() and htx_from_buf() always return valid pointers Update the comments above htxbuf() and htx_from_buf() to make it clear that they always return valid htx pointers so that callers know they do not have to test them. This is only true after the fix on b_is_null() which was the only known corner case. This should be backported to 1.9 if the b_is_null() patch is backported.	2019-01-31 08:07:17 +01:00
Olivier Houchard	203d735cac	BUG/MEDIUM: buffer: Make sure b_is_null handles buffers waiting for allocation. In b_is_null(), make sure we return 1 if the buffer is waiting for its allocation, as users assume there's memory allocated if b_is_null() returns 0. The indirect impact of not having this was that htxbuf() would not match b_is_null() for a buffer waiting for an allocation, and would thus return the value 1 for the htx pointer, causing various crashes under low memory condition. Note that this patch makes gcc versions 6 and above report two null-deref warnings in proto_htx.c since htx_is_empty() continues to check for a null pointer without knowing that this is protected by the test on b_is_null(). This is addressed by the following patches. This should be backported to 1.9.	2019-01-31 08:07:17 +01:00
Willy Tarreau	9c84d8299a	MINOR: h2: add a generic frame checker The new function h2_frame_check() checks the protocol limits for the received frame (length, ID, direction) and returns a verdict made of a connection error code. The purpose is to be able to validate any frame regardless of the state and the ability to call the frame handler, and to emit a GOAWAY early in this case.	2019-01-30 19:37:20 +01:00
Willy Tarreau	13afcb7ab3	BUG/MINOR: task: fix possibly missed event in inter-thread wakeups There's a very small but existing uncertainty window when waking another thread up where it is possible for task_wakeup() not to wake the other task up because it's still running while this one is in the process of finishing and loses its TASK_RUNNING flag. In this case the wakeup will be missed. The problem is that we have a single flag to store 3 states, since the transition from running to sleeping isn't atomic. Thus we need to have another flag to cover this part. This patch introduces TASK_QUEUED to mention that the task is already in the run queue, running or not. This bit will be removed while TASK_RUNNING is kept once dequeued, and will be used when removing TASK_RUNNING to check if the task has been requeued. It might be possible to slightly improve this but the occurrence rate is quite low and we don't really need to complexify the scheduler to optimize for a rare case. The impact with the current code is very low since we have few inter-thread wakeups. Most of them are caused by checks killing sessions. This must be backported to 1.9.	2019-01-28 15:03:04 +01:00
Willy Tarreau	f5809cde7a	MINOR: threads: make MAX_THREADS configurable at build time There's some value in being able to limit MAX_THREADS, either to save precious resources in embedded environments, or to protect certain deployments against accidentally incorrect settings. With this patch, if MAX_THREADS is defined at build time, it will be used. However, given that LONGBITS is not a macro but is defined according to sizeof(long), we can't check the value range at build time and instead we need to perform the check at early boot time. However, the compiler is able to optimize away the constant comparisons and doesn't even emit the check code when values are correct. The output message regarding threading support was improved to report the number of threads.	2019-01-26 13:37:48 +01:00
Willy Tarreau	c9a82e48bf	MINOR: cfgparse: make the process/thread parser support a maximum value It was hard-wired to LONGBITS, let's make it configurable depending on the context (threads, processes).	2019-01-26 13:25:14 +01:00
Willy Tarreau	4790f7c907	MEDIUM: h2: always parse and deduplicate the content-length header The header used to be parsed only in HTX but not in legacy. And even in HTX mode, the value was dropped. Let's always parse it and report the parsed value back so that we'll be able to store it in the streams.	2019-01-24 19:07:26 +01:00
Willy Tarreau	bf66bd1b8b	MEDIUM: stream-int: always mark pending outgoing SI_ST_CON Before the first send() attempt, we should be in SI_ST_CON, not SI_ST_EST, since we have not yet attempted to send and we are allowed to retry. This is particularly important with complex outgoing muxes which can fail during the first send attempt (e.g. failed stream ID allocation). It only requires that sess_update_st_con_tcp() knows about this possibility, as we must not forcefully close a reused connection when facing an error in this case, this will be handled later. This may be backported to 1.9 with care after some observation period.	2019-01-24 19:06:43 +01:00
Willy Tarreau	9c538e01c2	MINOR: server: add a max-reuse parameter Some servers may wish to limit the total number of requests they execute over a connection because some of their components might leak resources. In HTTP/1 it was easy, they just had to emit a "connection: close" header field with the last response. In HTTP/2, it's less easy because the info is not always shared with the component dealing with the H2 protocol and it could be harder to advertise a GOAWAY with a stream limit. This patch provides a solution to this by adding a new "max-reuse" parameter to the server keyword. This parameter indicates how many times an idle connection may be reused for new requests. The information is made available and the underlying muxes will be able to use it at will. This patch should be backported to 1.9.	2019-01-24 19:06:43 +01:00
Willy Tarreau	1e7d444eec	BUG/MINOR: hpack: return a compression error on invalid table size updates RFC7541#6.3 mandates that an error is reported when a dynamic table size update announces a size larger than the one configured with settings. This is tested by h2spec using test "hpack/6.3/1". This must be backported to 1.9 and possibly 1.8 as well.	2019-01-24 15:27:06 +01:00
Willy Tarreau	71c3811589	MINOR: h2: declare new sets of frame types This patch adds H2_FT_HDR_MASK to group all frame types carrying headers information, and H2_FT_LATE_MASK to group frame types allowed to arrive after a stream was closed.	2019-01-24 15:27:06 +01:00
Frédéric Lécaille	355b2033ec	MINOR: cfgparse: SSL/TLS binding in "peers" sections. Make "bind" keyword be supported in "peers" sections. All "bind" settings are supported on this line. Add "default-bind" option to parse the binding options except the bind address. Do not parse the bind address for local peers on "server" lines anymore. Do not use list_for_each_entry() anymore to set the "peers" section listener parameters because there is only one listener per "peers" section. May be backported to 1.5 and newer.	2019-01-18 14:26:21 +01:00
Frédéric Lécaille	1055e687a2	MINOR: peers: Make outgoing connection to SSL/TLS peers work. This patch adds a pointer to a struct server to the peer structure which is initialized after having parsed a remote "peer" line. After having parsed all peers sections we run ->prepare_srv to initialize all SSL/TLS stuff of the remote peer (or server). Remaining thing to do to completely support peer protocol over SSL/TLS: make "bind" keyword be supported in "peers" sections to make SSL/TLS incoming connections to local peers work. May be backported to 1.5 and newer.	2019-01-18 14:26:21 +01:00
Tim Duesterhus	8b87c01c4d	BUG/MINOR: stick_table: Prevent conn_cur from underflowing When using the peers feature a race condition could prevent a connection from being properly counted. When this connection exits it is being "uncounted" nonetheless, leading to a possible underflow (-1) of the conn_cur stick table entry in the following scenario : - Connect to peer A (A=1, B=0) - Peer A sends 1 to B (A=1, B=1) - Kill connection to A (A=0, B=1) - Connect to peer B (A=0, B=2) - Peer A sends 0 to B (A=0, B=0) - Peer B sends 0/2 to A (A=?, B=0) - Kill connection to B (A=?, B=-1) - Peer B sends -1 to A (A=-1, B=-1) This fix may be backported to all supported branches.	2019-01-15 15:34:49 +01:00
Willy Tarreau	0cac26cd88	MEDIUM: backend: move all LB algo parameters into an union Since all of them are exclusive, let's move them to an union instead of eating memory with the sum of all of them. We're using a transparent union to limit the code changes. Doing so reduces the struct lbprm from 392 bytes to 372, and thanks to these changes, the struct proxy is now down to 6480 bytes vs 6624 before the changes (144 bytes saved per proxy).	2019-01-14 19:33:17 +01:00
Willy Tarreau	76e84f5091	MINOR: backend: move hash_balance_factor out of chash This one is a proxy option which can be inherited from defaults even if the LB algo changes. Move it out of the lb_chash struct so that we don't need to keep anything separate between these structs. This will allow us to merge them into an union later. It even takes less room now as it fills a hole and removes another one.	2019-01-14 19:33:17 +01:00
Willy Tarreau	a9a7249966	MINOR: backend: remap the balance uri settings to lbprm.arg_opt{1,2,3} The algo-specific settings move from the proxy to the LB algo this way : - uri_whole => arg_opt1 - uri_len_limit => arg_opt2 - uri_dirs_depth1 => arg_opt3	2019-01-14 19:33:17 +01:00
Willy Tarreau	9fed8586b5	MINOR: backend: make the header hash use arg_opt1 for use_domain_only This is only a boolean extra arg. Let's map it to arg_opt1 and remove hh_match_domain from struct proxy.	2019-01-14 19:33:17 +01:00
Willy Tarreau	20e68378f1	MINOR: backend: add new fields in lbprm to store more LB options Some algorithms require a few extra options (up to 3). Let's provide some room in lbprm to store them, and make sure they're passed from defaults to backends.	2019-01-14 19:33:17 +01:00
Willy Tarreau	484ff07691	MINOR: backend: make headers and RDP cookie also use arg_str/len These ones used to rely on separate variables called hh_name/hh_len but they are exclusive with the former. Let's use the same variable which becomes a generic argument name and length for the LB algorithm.	2019-01-14 19:33:17 +01:00
Willy Tarreau	4c03d1c9b6	MINOR: backend: move url_param_name/len to lbprm.arg_str/len This one is exclusively used by LB parameters, when using URL param hashing. Let's move it to the lbprm struct under a more generic name.	2019-01-14 19:33:17 +01:00
Emeric Brun	9e7547740c	MINOR: ssl: add support of aes256 bits ticket keys on file and cli. OpenSSL switched from aes128 to aes256 in May 2016 to compute the tls ticket secrets used by default. But HAProxy still handled only 128 bits keys for both tls key file and CLI. This patch permits the user to set aes256 keys through the CLI or the key file (80 bytes encoded in base64) in the same way that aes128 keys were handled (48 bytes encoded in base64): - first 16 bytes for the key name - next 16/32 bytes for the aes 128/256 bits key - last 16/32 bytes for hmac 128/256 bits Both sizes are now supported (but keys from the same file must be of the same size and can be updated via CLI only using a key of the same size). Note: This feature needs the fix "dec func ignores padding for output size checking."	2019-01-14 19:32:58 +01:00
Olivier Houchard	c98aa1f182	MINOR: checks: Store the proxy in checks. Instead of assuming we have a server, store the proxy directly in struct check, and use it instead of s->server. This should be a no-op for now, but will be useful later when we change mail checks to avoid having a server. This should be backported to 1.9.	2019-01-14 11:15:11 +01:00
Willy Tarreau	762475e1f9	BUG/MEDIUM: connection: properly unregister the mux on failed initialization When mux->init() fails, session_free() will call it again to unregister it while it was already done, resulting in null derefs or use-after-free. This typically happens on out-of-memory conditions during H1 or H2 connection or stream allocation. This fix must be backported to 1.9.	2019-01-10 19:47:43 +01:00
Christopher Faulet	f7ed195ac8	MINOR: channel/htx: Add the HTX version of channel_truncate/erase The function channel_htx_truncate() can now be used on HTX buffer to truncate all incoming data, keeping outgoing one intact. This function relies on the function channel_htx_erase() and htx_truncate(). This patch may be backported to 1.9. If so, the patch "MINOR: channel/htx: Add the HTX version of channel_truncate()" must also be backported.	2019-01-08 12:06:55 +01:00
Christopher Faulet	00cf697215	MINOR: htx: Add a function to truncate all blocks after a specific offset This function will be used to truncate all incoming data in a channel, keeping outgoing ones. This may be backported to 1.9.	2019-01-08 12:06:55 +01:00
Christopher Faulet	5811db0043	MINOR: channel/htx: Add HTX version for some helper functions HTX versions for functions to test the free space in input against the reserve have been added. Now, on HTX streams, following functions can be used: * channel_htx_may_recv * channel_htx_recv_limit * channel_htx_recv_max * channel_htx_full This patch must be backported in 1.9 because it will be used by a further patch to fix a bug.	2019-01-07 16:32:05 +01:00
Christopher Faulet	8564c1f04b	MINOR: htx: Add a helper function to get the max space usable for a block This patch must be backported in 1.9 because it will be used by a further patch to fix a bug.	2019-01-07 16:32:02 +01:00
Willy Tarreau	909b9d852b	BUILD: add a new file "version.c" to carry version updates While testing fixes, it's sometimes confusing to rebuild only one C file (e.g. a mux) and not to have the correct commit ID reported in "haproxy -v" nor on the stats page. This patch adds a new "version.c" file which is always rebuilt. It's very small and contains only 3 variables derived from the various version strings. These variables are used instead of the macros at the few places showing the version. This way the output version of the running code is always correct for the parts that were rebuilt.	2019-01-04 18:20:32 +01:00
Olivier Houchard	f1b11e2d16	MINOR: connections: Remove a stale comment. Remove the comment that claims 0x40000000 is unused, it's not true anymore.	2019-01-04 17:26:47 +01:00
Willy Tarreau	0f8fb6b7f9	MINOR: h1: make the H1 headers block parser able to parse headers only Currently the H1 headers parser works for either a request or a response because it starts from the start line. It is also able to resume its processing when it was interrupted, but in this case it doesn't update the list. Make it support a new flag, H1_MF_HDRS_ONLY so that the caller can indicate it's only interested in the headers list and not the start line. This will be convenient to parse H1 trailers.	2019-01-04 10:48:03 +01:00
Willy Tarreau	1e1f27c5c1	MINOR: h2: add h2_make_htx_trailers to turn H2 headers to HTX trailers This function is usable to transform a list of H2 header fields to a HTX trailers block. It takes care of rejecting forbidden headers and pseudo-headers when performing the conversion. It also emits the trailing CRLF that is currently needed in the HTX trailers block.	2019-01-03 18:45:38 +01:00
Willy Tarreau	52610e905d	MINOR: htx: add a new function to add a block without filling it htx_add_blk_type_size() creates a block of a specified type and size and returns it. The caller can then fill it.	2019-01-03 18:45:38 +01:00
Willy Tarreau	9d953e7572	MINOR: h2: add h2_make_h1_trailers to turn H2 headers to H1 trailers This function is usable to transform a list of H2 header fields to a H1 trailers block. It takes care of rejecting forbidden headers and pseudo-headers when performing the conversion.	2019-01-03 18:45:38 +01:00
Willy Tarreau	59884a646c	MINOR: lb: allow redispatch when using consistent hash Redispatch traditionally only worked for cookie based persistence. Adding redispatch support for consistent hash based persistence - also update docs. Reported by Oskar Stenman on discourse: https://discourse.haproxy.org/t/balance-uri-consistent-hashing-redispatch-3-not-redispatching/3344 Should be backported to 1.8. Cc: Lukas Tribus <lukas@ltri.eu>	2019-01-02 20:22:17 +01:00
Christopher Faulet	e64582929f	MINOR: channel: Add the function channel_add_input This function must be called when new incoming data are pushed in the channel's buffer. It updates the channel state and takes care of the fast forwarding by consuming the right amount of data and decrementing "->to_forward" accordingly when necessary. In fact, this patch just moves a part of ci_putblk in a dedicated function. This patch must be backported to 1.9.	2019-01-02 20:12:44 +01:00
Olivier Houchard	a2dbeb22fc	MEDIUM: sessions: Keep track of which connections are idle. Instead of keeping track of the number of connections we're responsible for, keep track of the number of those connections that we currently consider idle (i.e. that we are not using, though they may be in use by other sessions). That way we can actually reuse connections when we have more connections than the configured maximum.	2018-12-28 19:16:03 +01:00
Olivier Houchard	351411facd	BUG/MAJOR: sessions: Use an unlimited number of servers for the conn list. When a session adds a connection to its connection list, we used to remove connections for another server if there was not enough room for our server. This can't work, because those lists are now the list of connections we're responsible for, not just the idle connections. To fix this, allow for an unlimited number of servers: instead of using an array, we're now using a linked list.	2018-12-28 16:33:13 +01:00
Olivier Houchard	09e498f1a1	BUG/MEDIUM: tasks: Decrement tasks_run_queue in tasklet_free(). If the tasklet is in the list, don't forget to decrement tasks_run_queue in tasklet_free(). This should be backported to 1.9.	2018-12-24 14:04:55 +01:00
Willy Tarreau	f48919aafb	MINOR: buffers: add a new b_move() function This function will be used to move parts of a buffer to another place in the same buffer, even if the parts overlap. In order to keep things under reasonable control, it only uses a length and absolute offsets for the source and destination, and doesn't consider head nor data.	2018-12-24 11:45:00 +01:00
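As an illustration of the b_move() entry above, here is a minimal sketch of a move driven only by a length and absolute source and destination offsets. It is not HAProxy's implementation: the name flat_move() and its parameters are hypothetical, and it assumes a flat storage area that does not wrap. Overlapping ranges are safe here only because the sketch relies on memmove().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not HAProxy's b_move(): moves <len> bytes from
 * absolute offset <src> to absolute offset <dst> inside <area>, which
 * is <size> bytes long. Assumption: the area is flat and never wraps.
 */
static void flat_move(char *area, size_t size, size_t src, size_t dst, size_t len)
{
	/* both the source and the destination ranges must fit in the area */
	assert(len <= size && src <= size - len && dst <= size - len);

	/* memmove() is well defined when source and destination overlap */
	memmove(area + dst, area + src, len);
}
```

For example, moving 5 bytes from offset 0 to offset 2 in an 8-byte area shifts the block forward even though the two ranges share 3 bytes.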
Willy Tarreau	deab244dc1	MINOR: h2: add a bit-based frame type representation This will ease checks among sets of frames.	2018-12-24 11:45:00 +01:00
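To show why a bit-based representation eases checks among sets of frames, the sketch below maps each HTTP/2 frame type code (values from RFC 7540, section 6) to one bit, so that membership in a set becomes a single mask test. All identifiers (the FT_* names, FT_BIT, FT_HDR_MASK, ft_in_set) are hypothetical and are not taken from the HAProxy sources.

```c
#include <stdint.h>

/* HTTP/2 frame type codes from RFC 7540, section 6.
 * The enum and macro names are hypothetical, chosen for this sketch.
 */
enum ft {
	FT_DATA          = 0,
	FT_HEADERS       = 1,
	FT_PRIORITY      = 2,
	FT_RST_STREAM    = 3,
	FT_SETTINGS      = 4,
	FT_PUSH_PROMISE  = 5,
	FT_PING          = 6,
	FT_GOAWAY        = 7,
	FT_WINDOW_UPDATE = 8,
	FT_CONTINUATION  = 9,
};

/* one bit per frame type */
#define FT_BIT(t) (1U << (t))

/* example set: the frame types that carry a header block fragment */
#define FT_HDR_MASK (FT_BIT(FT_HEADERS) | FT_BIT(FT_PUSH_PROMISE) | FT_BIT(FT_CONTINUATION))

/* returns non-zero if frame type <type> belongs to the set <mask> */
static int ft_in_set(unsigned int type, uint32_t mask)
{
	/* types above 31 have no bit in a 32-bit mask and are never in a set */
	return type < 32 && (mask & FT_BIT(type)) != 0;
}
```

With this layout, a test such as "is this a HEADERS, PUSH_PROMISE or CONTINUATION frame" costs one shift and one AND instead of a chain of comparisons.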
Willy Tarreau	fba74ea7b0	[RELEASE] Released version 2.0-dev0 Released version 2.0-dev0 with the following main changes : - BUG/MAJOR: connections: Close the connection before freeing it. - REGTEST: Require the option LUA to run lua tests - REGTEST: script: Process script arguments before everything else - REGTEST: script: Evaluate the varnishtest command to allow quoted parameters - REGTEST: script: Add the option --clean to remove previous log directories - REGTEST: script: Add the option --debug to show logs on standard output - REGTEST: script: Add the option --keep-logs to keep all log directories - REGTEST: script: Add the option --use-htx to enable the HTX in regtests - REGTEST: script: Print only errors in the results report - REGTEST: Add option to use HTX prefixed by the macro 'no-htx' - REGTEST: Make reg-tests target support argument. - REGTEST: Fix a typo about barrier type. - REGTEST: Be less Linux specific with a syslog regex. - REGTEST: Missing enclosing quotes for ${tmpdir} macro. - REGTEST: Exclude freebsd target for some reg tests. - BUG/MEDIUM: h2: Don't forget to quit the sending_list if SUB_CALL_UNSUBSCRIBE. - BUG/MEDIUM: mux-h2: Don't forget to quit the send list on error reports - BUG/MEDIUM: dns: Don't prevent reading the last byte of the payload in dns_validate_response() - BUG/MEDIUM: dns: overflowed dns name start position causing invalid dns error - BUG/MINOR: compression/htx: Don't compress responses with unknown body length - BUG/MINOR: compression/htx: Don't add the last block of data if it is empty - MEDIUM: mux_h1: Implement h1_show_fd. - REGTEST: script: Add support of alternatives in required options list - REGTEST: Add a basic test for the compression - BUG/MEDIUM: mux-h2: don't needlessly wake up the demux on short frames - REGTEST: A basic test for "http-buffer-request" - BUG/MEDIUM: server: Also copy "check-sni" for server templates. - MINOR: ssl: Add ssl_sock_set_alpn(). - MEDIUM: checks: Add check-alpn.	2018-12-22 11:20:35 +01:00
Olivier Houchard	921501443b	MEDIUM: checks: Add check-alpn. Add a way to configure the ALPN used by check, with a new "check-alpn" keyword. By default, the checks will use the server ALPN, but it may not be convenient, for instance because the server may use HTTP/2, while checks are unable to do HTTP/2 yet.	2018-12-21 19:54:16 +01:00
Olivier Houchard	ab28a320aa	MINOR: ssl: Add ssl_sock_set_alpn(). Add a new function, ssl_sock_set_alpn(), to be able to change the ALPN for a connection, instead of relying on the one defined in the SSL_CTX.	2018-12-21 19:53:30 +01:00
Olivier Houchard	8ab8a6eee5	BUG/MAJOR: connections: Close the connection before freeing it. In si_release_endpoint(), if the end point is a connection, because we don't know which mux to use, make sure we close the connection before freeing it, or else we'd have an fd left for polling, which would point to a now freed connection. This should be backported to 1.9.	2018-12-20 06:03:14 +01:00
Willy Tarreau	e9f4301f0f	MINOR: connection: add cs_set_error() to set the error bits Depending on the CS_FL_EOS status, we either set CS_FL_ERR_PENDING or CS_FL_ERROR at various places. Let's have a generic function to do this.	2018-12-19 18:13:52 +01:00
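The choice described in the cs_set_error() entry above can be sketched as follows. This is an illustration under stated assumptions, not the HAProxy code: the flag values, the struct and the function name are hypothetical, and the direction of the test is inferred from the CS_FL_ERR_PENDING entry further down the list (the pending variant is used while data may still remain to be read, so the definitive error is assumed to apply once end of stream has been seen).

```c
/* Hypothetical flag values; only the flag names come from the commit messages. */
#define CS_FL_EOS          0x01u  /* end of stream already reported */
#define CS_FL_ERR_PENDING  0x02u  /* error seen, data may remain to be read */
#define CS_FL_ERROR        0x04u  /* definitive error */

/* Hypothetical stand-in for a conn_stream, reduced to its flags. */
struct cs_sketch {
	unsigned int flags;
};

/* Assumption: once CS_FL_EOS is set nothing more can be read, so the
 * error is reported at once; otherwise it is only marked as pending.
 */
static void cs_set_error_sketch(struct cs_sketch *cs)
{
	if (cs->flags & CS_FL_EOS)
		cs->flags |= CS_FL_ERROR;
	else
		cs->flags |= CS_FL_ERR_PENDING;
}
```

Centralizing the test in one helper means the various call sites no longer need to repeat the CS_FL_EOS check themselves.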
Willy Tarreau	14bfe9af12	CLEANUP: stream-int: consistently call the si/stream_int functions As changes have accumulated over time, the exported functions of the stream-interface were almost all prefixed "si_<something>" while most private ones (mostly callbacks) were called "stream_int_<something>". There were still a few confusing exceptions, which were addressed to follow this scheme : - stream_sock_read0(), only used internally, was renamed stream_int_read0() and made static - stream_int_notify() is only private and was made static - stream_int_{check_timeouts,report_error,retnclose,register_handler,update} were renamed si_<something>. Now it is clearer when checking one of these if it risks to be used outside or not.	2018-12-19 15:25:43 +01:00
Willy Tarreau	94031d30d7	MINOR: connection: remove an unwelcome dependency on struct stream There was a reference to struct stream in conn_free() for the case where we're freeing a connection that doesn't have a mux attached. For now we know it's always a stream, and we only need to do it to put a NULL in s->si[1].end. Let's do it better by storing the pointer to si[1].end in the context and specifying that this pointer is always nulled if the mux is null. This way it allows a connection to detach itself from wherever it's being used. Maybe we could even get rid of the condition on the mux.	2018-12-19 14:36:29 +01:00
Willy Tarreau	3d2ee55ebd	CLEANUP: connection: rename conn->mux_ctx to conn->ctx We most often store the mux context there but it can also be something else while setting up the connection. Better call it "ctx" and know that it's the owner's context than misleadingly call it mux_ctx and get caught doing suspicious tricks.	2018-12-19 14:13:07 +01:00
Willy Tarreau	4f6516d677	CLEANUP: connection: rename subscription events values and event field The SUB_CAN_SEND/SUB_CAN_RECV enum values have been confusing a few times, especially when checking them on reading. After some discussion, it appears that calling them SUB_RETRY_SEND/SUB_RETRY_RECV more accurately reflects their purpose since these events may only appear after a first attempt to perform the I/O operation has failed or was not completed. In addition the wait_reason field in struct wait_event which carries them makes one think that a single reason may happen at once while it is in fact a set of events. Since the struct is called wait_event it makes sense that this field is called "events" to indicate it's the list of events we're subscribed to. Last, the values for SUB_RETRY_RECV/SEND were swapped so that value 1 corresponds to recv and 2 to send, as is done almost everywhere else in the code and in the shutdown() call.	2018-12-19 14:09:21 +01:00
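The point made in the entry above, that the field holds a set of events and not a single reason, can be sketched like this. The values 1 for recv and 2 for send come from the commit message; the struct and helper names are hypothetical and the struct is reduced to the one field under discussion.

```c
/* Values stated in the commit message: 1 for recv, 2 for send. */
enum {
	SUB_RETRY_RECV = 0x01,
	SUB_RETRY_SEND = 0x02,
};

/* Hypothetical stand-in for struct wait_event, keeping only "events". */
struct wait_event_sketch {
	int events; /* set of SUB_RETRY_* events currently subscribed to */
};

/* subscribe to one or more events; existing subscriptions are kept */
static void we_subscribe(struct wait_event_sketch *we, int ev)
{
	we->events |= ev;
}

/* drop one or more events; unrelated subscriptions are kept */
static void we_unsubscribe(struct wait_event_sketch *we, int ev)
{
	we->events &= ~ev;
}

/* returns non-zero if all events in <ev> are subscribed */
static int we_has(const struct wait_event_sketch *we, int ev)
{
	return (we->events & ev) == ev;
}
```

Because both bits can be set at once, a name such as "events" describes the field better than "wait_reason", which suggests a single value.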
Willy Tarreau	beefaee4f5	MEDIUM: h2: properly check and deduplicate the content-length header in HTX When producing an HTX message, we can't rely on the next-level H1 parser to check and deduplicate the content-length header, so we have to do it while parsing a message. The algorithm is the exact same as used for H1 messages.	2018-12-19 13:08:08 +01:00
Willy Tarreau	d5e3c71208	MINOR: objtype: report a few missing types in names and base pointers Types DNS_SRVRQ and CS were not referenced in the type to string conversions, causing possibly misleading outputs in session dumps. Now instead of showing "NONE" for unknown or invalid type names, we display "!INVAL!" to clear the confusion that may exist in case of memory corruption for example.	2018-12-18 16:31:10 +01:00
Olivier Houchard	71748cb91b	BUG/MEDIUM: connection: Add a new CS_FL_ERR_PENDING flag to conn_streams. Add a new flag to conn_streams, CS_FL_ERR_PENDING. This is to be set instead of CS_FL_ERROR in case there's still more data to be read, so that we read all the data before closing.	2018-12-17 21:54:14 +01:00
Willy Tarreau	bce4d8a37d	MINOR: debug: make the ABORT_NOW macro use a volatile int Similar to previous commit, let's make the macro use a volatile when dereferencing NULL so that clang doesn't optimize it away.	2018-12-16 08:17:23 +01:00
Olivier Houchard	51e474136b	MINOR: pools: Cast to volatile int * instead of int *. When using DEBUG_MEMORY_POOLS, when we want to crash, instead of using *(int *)0 = 0, use *(volatile int *)0 = 0, or clang will just translate it to a nop, instead of dereferencing 0.	2018-12-16 08:15:16 +01:00

3452 Commits