haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-09 08:37:04 +02:00

Author	SHA1	Message	Date
Willy Tarreau	deccd1116d	MEDIUM: mux: make mux->snd_buf() take the byte count in argument This way the mux doesn't need to modify the buffer's metadata anymore nor to know the output's size. The mux->snd_buf() function now takes a const buffer and it's up to the caller to update the buffer's state. The return type was updated to return a size_t to comply with the count argument.	2018-07-19 16:23:41 +02:00
Willy Tarreau	787db9a6a4	MEDIUM: connection: make xprt->snd_buf() take the byte count in argument This way the senders don't need to modify the buffer's metadata anymore nor to know about the output's split point. This way the functions can take a const buffer and it's clearer who's in charge of updating the buffer after a send. That's why the buffer realignment is now performed by the caller of the transport's snd_buf() functions. The return type was updated to return a size_t to comply with the count argument.	2018-07-19 16:23:41 +02:00
Willy Tarreau	55f3ce1c91	MINOR: buffer: make b_getblk_nc() take size_t for the block sizes Till now we used to reimplement it using ints to limit external changes but we must adjust it and the various users to switch to size_t.	2018-07-19 16:23:41 +02:00
Willy Tarreau	206ba834ef	MINOR: buffer: make b_getblk_nc() take const pointers Now that there are no more users requiring to modify the buffer anymore, switch these ones to const char and const buffer. This will make it more obvious next time send functions are tempted to modify the buffer's output count. Minor adaptations were necessary at a few call places which were using char due to the function's previous prototype.	2018-07-19 16:23:41 +02:00
Willy Tarreau	5d7d1bbd0e	MINOR: buffer: get rid of b_end() and b_to_end() These ones are not used anymore.	2018-07-19 16:23:41 +02:00
Willy Tarreau	f40e68227b	MINOR: h1: make h1_measure_trailers() use an offset and a count This will be needed by the H2 encoder to restart after wrapping.	2018-07-19 16:23:41 +02:00
Willy Tarreau	84d6b7af87	MINOR: h1: make h1_parse_chunk_size() not depend on b_ptr() anymore It's similar to the previous commit so that the function doesn't rely on buf->p anymore.	2018-07-19 16:23:41 +02:00
Willy Tarreau	c0973c6742	MINOR: h1: make h1_skip_chunk_crlf() not depend on b_ptr() anymore It now takes offsets relative to the buffer's head. It's up to the callers to add this offset which corresponds to the buffer's output size.	2018-07-19 16:23:41 +02:00
Willy Tarreau	7314be8e2c	MINOR: h1: make h1_measure_trailers() take the byte count in argument The principle is that it should not have to take this value from the buffer itself anymore.	2018-07-19 16:23:40 +02:00
Willy Tarreau	e5f12ce7f2	MINOR: buffer: replace bi_del() and bo_del() with b_del() Till now the callers had to know which one to call for specific use cases. Let's fuse them now since a single one will remain after the API migration. Given that bi_del() may only be used where o==0, just combine the two tests by first removing output data then only input.	2018-07-19 16:23:40 +02:00
Willy Tarreau	a1f78fb652	MINOR: buffer: replace bo_getblk_nc() with b_getblk_nc() which takes an offset This will be important so that we can parse a buffer without touching it. Now we indicate where from the buffer's head we plan to start to copy, and for how many bytes. This will be used by send functions to loop at the end of the buffer without having to update the buffer's output byte count.	2018-07-19 16:23:40 +02:00
Willy Tarreau	90ed3836db	MINOR: buffer: replace bo_getblk() with direction agnostic b_getblk() This new function limits itself to the amount of data available in the buffer and doesn't care about the direction anymore. It's only called from co_getblk() which already checks that no more than the available output bytes is requested.	2018-07-19 16:23:40 +02:00
Willy Tarreau	e4d5a036ed	MINOR: buffer: merge b{i,o}_contig_space() These ones were merged into a single b_contig_space() that covers both (the bo_ case was a simplified version of the other one). The function doesn't use ->i nor ->o anymore.	2018-07-19 16:23:40 +02:00
Willy Tarreau	0e11d59af6	MINOR: buffer: remove bo_contig_data() The two call places now make use of b_contig_data(0) and check by themselves that the returned size is no larger than the scheduled output data.	2018-07-19 16:23:40 +02:00
Willy Tarreau	8f9c72d301	MINOR: buffer: remove bi_end() It was replaced by ci_tail() when the channel is known, or b_tail() in other cases.	2018-07-19 16:23:40 +02:00
Willy Tarreau	41e38ac0ee	MINOR: buffer: remove bo_end() It was replaced by either b_tail() when the buffer has no input data, or b_peek(b, b->o).	2018-07-19 16:23:40 +02:00
Willy Tarreau	89faf5d7c3	MINOR: buffer: remove bo_ptr() It was replaced by co_head() when a channel was known, otherwise b_head().	2018-07-19 16:23:40 +02:00
Willy Tarreau	dda2e41881	MINOR: buffer: remove bi_ptr() It's now been replaced by b_head() when b->o is null, ci_head() when the channel is known, or b_peek(b, b->o) in other situations.	2018-07-19 16:23:40 +02:00
Willy Tarreau	7194d3cc3b	MINOR: buffer: split bi_contig_data() into ci_contig_data() and b_contig_data() This function was sometimes used from a channel and sometimes from a buffer. In both cases it requires knowledge of the size of the output data (to skip them). Here the split ensures the channel can deal with this point, and that other places not having output data can continue to work.	2018-07-19 16:23:40 +02:00
Willy Tarreau	d55fe397a0	MINOR: buffer: remove bi_getblk() and bi_getblk_nc() These ones were relying on bi_ptr() and are not used. They may be reimplemented later in the channel if needed.	2018-07-19 16:23:40 +02:00
Willy Tarreau	aa7af7213d	MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps() And remove the unused function.	2018-07-19 16:23:40 +02:00
Willy Tarreau	bcbd39370f	MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew} These ones manipulate the output data count which will be specific to the channel soon, so prepare the call points to use the channel only. The b_* functions are now unused and were removed.	2018-07-19 16:23:40 +02:00
Willy Tarreau	c0a51c51b1	MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code Since all call places can use the trash now, this is not needed anymore.	2018-07-19 16:23:40 +02:00
Willy Tarreau	fd8d42f496	MEDIUM: channel: make channel_slow_realign() take a swap buffer The few call places where it's used can use the trash as a swap buffer, which is made for this exact purpose. This way we can rely on the generic b_slow_realign() call.	2018-07-19 16:23:40 +02:00
Willy Tarreau	4cf1300e6a	MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign() Where relevant, the channel version is used instead. The buffer version was ported to be more generic and now takes a swap buffer and the output byte count to know where to set the alignment point. The H2 mux still uses buffer_slow_realign() with buf->o but it will change later.	2018-07-19 16:23:40 +02:00
Willy Tarreau	d5b343bf9e	MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign() This patch removes buffer_realign() and replaces it with c_realign_if_empty() instead.	2018-07-19 16:23:40 +02:00
Willy Tarreau	08d5ac8f27	MINOR: channel: add a few basic functions for the new buffer API This adds : - c_orig() : channel buffer's origin - c_size() : channel buffer's size - c_wrap() : channel buffer's wrapping location - c_data() : channel buffer's total data count - c_room() : room left in channel buffer's - c_empty() : true if channel buffer is empty - c_full() : true if channel buffer is full - c_ptr() : pointer to an offset relative to input data in the buffer - c_adv() : advances the channel's buffer (bytes become part of output) - c_rew() : rewinds the channel's buffer (output bytes not output anymore) - c_realign_if_empty() : realigns the buffer if it's empty - co_data() : # of output data - co_head() : beginning of output data - co_tail() : end of output data - ci_data() : # of input data - ci_head() : beginning of input data - ci_tail() : end of input data - ci_stop() : location after ci_tail() - ci_next() : pointer to next input byte And for the ci_* / co_* functions above, the "__*" variants which disable wrapping checks, and the "_ofs" variants which return an offset relative to the buffer's origin instead.	2018-07-19 16:23:39 +02:00
Willy Tarreau	f17f19f1a7	MINOR: buffer: introduce b_realign_if_empty() Many places deal with buffer realignment after data removal. The method is always the same : if the buffer is empty, set its pointer to the origin. Let's have a function for this so that we have less code to change with the new API.	2018-07-19 16:23:39 +02:00
Olivier Houchard	a04e40d578	MINOR: buffer: Add b_set_data(). Add a new function that lets you set the amount of input in a buffer. For now it extends/truncates b->i except if the total length is below b->o in which case it clears i and adjusts o.	2018-07-19 16:23:39 +02:00
Olivier Houchard	09138ecc49	MINOR: buffer: Introduce b_sub(), b_add(), and bo_add() Instead of doing b->i -= directly, introduce b_sub(), that does the job, to make it easier to switch to the future API. Also add b_add(), that increases b->i, instead of using it directly, and bo_add(), that does increase b->o.	2018-07-19 16:23:39 +02:00
Willy Tarreau	bbc68df330	MINOR: buffer: add a few basic functions for the new API Here's the list of newly introduced functions : - b_data(), returning the total amount of data in the buffer (currently i+o) - b_orig(), returning the origin of the storage area, that is, the place of position 0. - b_wrap(), pointer to wrapping point (currently data+size) - b_size(), returning the size of the buffer - b_room(), returning the amount of bytes left available - b_full(), returning true if the buffer is full, otherwise false - b_stop(), pointer to end of data mark (currently p+i), used to compute distances or a stop pointer for a loop. - b_peek(), this one will help make the transition to the new buffer model. It returns a pointer to a position in the buffer known from an offset relative to the beginning of the data in the buffer. Thus, we can replace the following occurrences : bo_ptr(b) => b_peek(b, 0); bo_end(b) => b_peek(b, b->o); bi_ptr(b) => b_peek(b, b->o); bi_end(b) => b_peek(b, b->i + b->o); b_ptr(b, ofs) => b_peek(b, b->o + ofs); - b_head(), pointer to the beginning of data (currently bo_ptr()) - b_tail(), pointer to first free place (currently bi_ptr()) - b_next() / b_next_ofs(), pointer to the next byte, taking wrapping into account. - b_dist(), returning the distance between two pointers belonging to a buffer - b_reset(), which resets the buffer - b_space_wraps(), indicating if the free space wraps around the buffer - b_almost_full(), indicating if 3/4 or more of the buffer are used Some of these are provided with the unchecked variants using the "__" prefix, or with the "_ofs" suffix indicating they return a relative position to the buffer's origin instead of a pointer. Cc: Olivier Houchard <ohouchard@haproxy.com>	2018-07-19 16:23:39 +02:00
Willy Tarreau	506a29ac6e	MINOR: buffer: switch buffer sizes and offsets to size_t Passing unsigned ints everywhere is painful, and will cause some headache later when we'll want to integrate better with struct ist which already uses size_t. Let's switch buffers to use size_t instead.	2018-07-19 16:23:39 +02:00
Willy Tarreau	41806d1c52	MINOR: buffer: implement a new file for low-level buffer manipulation functions The buffer code currently depends on pools and other stuff and is not really autonomous anymore. The rewrite of the new API is an opportunity to clean this up. This patch creates a new file (buf.h) which does not depend on other elements and which will only contain what is needed to perform the most basic buffer operations. The new API will be introduced in this file and the conversion will be finished once buffer.h is empty. The definition of struct buffer was moved to this new file, using more explicit stdint types for the sizes and offsets. Most new functions will be implemented in two variants : __b_something() : unchecked variant, no wrapping is expected b_something() : wrapping-checked variant This way callers will be able to select which one to use depending on the use cases.	2018-07-19 16:23:39 +02:00
Olivier Houchard	9ddaf794a8	MINOR: tasklet: Set process to NULL. Some consumers expect the process to be NULL when a tasklet is created, so do so.	2018-07-19 16:23:08 +02:00
Willy Tarreau	17b4aa1adc	BUG/MINOR: ssl: properly ref-count the tls_keys entries Commit `200b0fa` ("MEDIUM: Add support for updating TLS ticket keys via socket") introduced support for updating TLS ticket keys from the CLI, but missed a small corner case : if multiple bind lines reference the same tls_keys file, the same reference is used (as expected), but during the clean shutdown, it will lead to a double free when destroying the bind_conf contexts since none of the lines knows if others still use it. The impact is very low however, mostly a core and/or a message in the system's log upon old process termination. Let's introduce some basic refcounting to prevent this from happening, so that only the last bind_conf frees it. Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting the same issue with an easy reproducer. This fix needs to be backported from 1.6 to 1.8.	2018-07-18 08:59:50 +02:00
Baptiste Assmann	8e2d9430c0	MINOR: dns: new DNS options to allow/prevent IP address duplication By default, HAProxy's DNS resolution at runtime ensure that there is no IP address duplication in a backend (for servers being resolved by the same hostname). There are a few cases where people want, on purpose, to disable this feature. This patch introduces a couple of new server side options for this purpose: "resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".	2018-07-12 17:56:44 +02:00
Dave Chiluk	8618a6a5e2	MINOR: Some spelling cleanup in the comments. Signed-off-by: Dave Chiluk <chiluk+haproxy@indeed.com>	2018-06-21 20:43:52 +02:00
Olivier Houchard	dcd6f3a597	MINOR: tasks: Make sure we correctly init and deinit a tasklet. Up until now, a tasklet couldn't be free'd while it was in the list, it is no longer the case, so make sure we remove it from the list before freeing it. To do so, we have to make sure we correctly initialize it, so use LIST_INIT, instead of setting the pointers to NULL.	2018-06-14 18:57:13 +02:00
William Lallemand	6e1796e85d	BUG/MINOR: signals: ha_sigmask macro for multithreading The behavior of sigprocmask in a multithreaded environment is undefined. The new macro ha_sigmask() calls either pthread_sigmask() or sigprocmask(), depending on whether haproxy was built with thread support or not. This should be backported to 1.8.	2018-06-08 18:24:53 +02:00
Olivier Houchard	b1ca58b245	MINOR: tasks: Don't define rqueue if we're building without threads. To make sure we don't inadvertently insert task in the global runqueue, while only the local runqueue is used without threads, make its definition and usage conditional on USE_THREAD.	2018-06-06 16:35:12 +02:00
Olivier Houchard	e13ab8b3c6	BUG/MEDIUM: tasks: Use the local runqueue when building without threads. When building without threads enabled, instead of just using the global runqueue, just use the local runqueue associated with the only thread, as that's what is now expected for a single thread in process_runnable_tasks(). This should fix haproxy when built without threads.	2018-06-06 16:34:52 +02:00
Willy Tarreau	10d81b8757	MINOR: applet: assign the same nice value to a new appctx as its owner task When an applet is created, let's assign it the same nice value as the task of the stream which owns it. It ensures that fairness is properly propagated to applets, and that the CLI can regain a low latency behaviour again. Huge differences have been seen under extreme loads, with the CLI being called every 200 microseconds instead of 11 milliseconds.	2018-06-05 11:18:21 +02:00
David Carlier	caa8a37ffe	MINOR: task: Fix a compiler warning by adding a cast. When calling HA_ATOMIC_CAS with a pointer as the target, the compiler expects a pointer as the new value, so give it one by casting 0x1 to (void *).	2018-06-04 17:43:12 +02:00
Thierry FOURNIER	9d5422a4b7	MINOR: task/notification: Is notifications registered ? This function returns true if some notifications are registered. This function is useful for the following patch: BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock. It should be backported to 1.6, 1.7 and 1.8.	2018-05-31 10:58:41 +02:00
Olivier Houchard	09eeb7684d	BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue. Don't forget to increase tasks_run_queue when we're adding a task to the tasklet list, and to decrease it when we remove a task from a runqueue, or its value won't be accurate, and could lead to tasks not being executed when put in the global run queue. 1.9-dev only, no backport is needed.	2018-05-28 15:20:55 +02:00
Tim Duesterhus	3fd1973d37	MINOR: http: Log warning if (add|set)-header fails This patch adds a warning if an http-(request|response) (add|set)-header rewrite fails to change the respective header in a request or response. This usually happens when tune.maxrewrite is not sufficient to hold all the headers that should be added.	2018-05-28 14:53:59 +02:00
Olivier Houchard	673867c357	MAJOR: applets: Use tasks, instead of rolling our own scheduler. There's no real reason to have a specific scheduler for applets anymore, so nuke it and just use tasks. This comes with some benefits, the first one being that applets cannot induce high latencies anymore since they share nice values with other tasks. Later it will be possible to configure the applets' nice value. The second benefit is that the applet scheduler was not very thread-friendly, having a big lock around it in prevision of this change. Thus applet-intensive workloads should now scale much better with threads. Some more improvement is possible now : some applets also use a task to handle timers and timeouts. These ones could now be simplified to use only one task.	2018-05-26 20:03:30 +02:00
Olivier Houchard	1599b80360	MINOR: tasks: Make the number of tasks to run at once configurable. Instead of hardcoding 200, make the number of tasks to be run configurable using tune.runqueue-depth. 200 is still the default.	2018-05-26 20:03:24 +02:00
Olivier Houchard	b0bdae7b88	MAJOR: tasks: Introduce tasklets. Introduce tasklets, lightweight tasks. They have no notion of priority, they are just run as soon as possible, and will probably be used for I/O later. For the moment they're used to replace the temporary thread-local list that was used in the scheduler. The first part of the struct is common with tasks so that tasks can be cast to tasklets and queued in this list. Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that it cannot accidentally be confused as not in the queue. Pure tasklets are identifiable by their nice value of -32768 (which is normally not possible).	2018-05-26 20:03:19 +02:00
Olivier Houchard	f6e6dc12cd	MAJOR: tasks: Create a per-thread runqueue. A lot of tasks are run on one thread only, so instead of having them all in the global runqueue, create a per-thread runqueue which doesn't require any locking, and add all tasks belonging to only one thread to the corresponding runqueue. The global runqueue is still used for non-local tasks, and is visited by each thread when checking its own runqueue. The nice parameter is thus used both in the global runqueue and in the local ones. The rare tasks that are bound to multiple threads will have their nice value used twice (once for the global queue, once for the thread-local one).	2018-05-26 19:27:29 +02:00
Olivier Houchard	9f6af33222	MINOR: tasks: Change the task API so that the callback takes 3 arguments. In preparation for thread-specific runqueues, change the task API so that the callback takes 3 arguments, the task itself, the context, and the state, those were retrieved from the task before. This will allow these elements to change atomically in the scheduler while the application uses the copied value, and even to have NULL tasks later.	2018-05-26 19:23:57 +02:00
Willy Tarreau	0cd82e883e	BUG/BUILD: threads: unbreak build without threads A few users reported that building without threads was accidentally broken after commit `6b96f72` ("BUG/MEDIUM: pollers: Use a global list for fd shared between threads.") due to all_threads_mask not being defined. It's OK to set it to zero as other code parts do when threads are enabled but only one thread is used. This needs to be backported to 1.8.	2018-05-23 19:54:43 +02:00
Thierry Fournier	d5b073cf1f	MINOR: lua: Improve error message The function hlua_ctx_resume now returns fewer text messages and more error codes. These error codes allow the caller to return an appropriate message to the user.	2018-05-22 18:57:46 +02:00
Christopher Faulet	68db0235fd	CLEANUP: spoe: Remove unused variables in the agent structure applets_act and applets_idle were used for debugging purposes. Now, these values are part of the agent's counters.	2018-05-18 15:04:46 +02:00
Olivier Houchard	cb92f5cae4	MINOR: pollers: move polled_mask outside of struct fdtab. The polled_mask is only used in the pollers, and removing it from the struct fdtab makes it fit in one 64B cacheline again, on a 64bits machine, so make it a separate array.	2018-05-06 06:27:34 +02:00
Olivier Houchard	6b96f7289c	BUG/MEDIUM: pollers: Use a global list for fd shared between threads. With the old model, any fd shared by multiple threads, such as listeners or dns sockets, would only be updated on one thread, so that could lead to missed events, or spurious wakeups. To avoid this, add a global list for fd that are shared, using the same implementation as the fd cache, and only remove entries from this list when every thread has updated its poller. [wt: this will need to be backported to 1.8 but differently so this patch must not be backported as-is]	2018-05-06 06:27:09 +02:00
Olivier Houchard	6a2cf8752c	MINOR: fd: Make the lockless fd list work with multiple lists. Modify fd_add_to_fd_list() and fd_rm_from_fd_list() so that they take an offset in the fdtab to the list entry, instead of hardcoding the fd cache, so we can use them with other lists.	2018-05-06 06:25:49 +02:00
Olivier Houchard	9b36cb4a41	BUG/MEDIUM: task: Don't free a task that is about to be run. While running a task, we may try to delete and free a task that is about to be run, because it's part of the local tasks list, or because rq_next points to it. So flag any task that is in the local tasks list to be deleted, instead of run, by setting t->process to NULL, and re-make rq_next a global, thread-local variable, that is modified if we attempt to delete that task. Many thanks to PiBa-NL for reporting this and analysing the problem. This should be backported to 1.8.	2018-05-04 20:11:04 +02:00
Willy Tarreau	760e81d356	MINOR: backend: implement random-based load balancing For large farms where servers are regularly added or removed, picking a random server from the pool can ensure faster load transitions than when using round-robin and less traffic surges on the newly added servers than when using leastconn. This commit introduces "balance random". It internally uses a random as the key to the consistent hashing mechanism, thus all features available in consistent hashing such as weights and bounded load via hash-balance-factor are usable. It is extremely convenient because one common concern when using random is what happens when a server is hammered a bit too much. Here that can trivially be avoided, like in the configuration below : backend bk0 balance random hash-balance-factor 110 server-template s 1-100 127.0.0.1:8000 check inter 1s Note that while "balance random" internally relies on a hash algorithm, it holds the same properties as round-robin and as such is compatible with reusing an existing server connection with "option prefer-last-server".	2018-05-03 07:20:40 +02:00
Tim Duesterhus	e2b10bf491	MINOR: http: Add support for 421 Misdirected Request This makes haproxy aware of HTTP 421 Misdirected Request, which is defined in RFC 7540, section 9.1.2.	2018-04-28 07:03:39 +02:00
Aurélien Nephtali	abbf607105	MEDIUM: cli: Add payload support In order to use arbitrary data in the CLI (multiple lines or group of words that must be considered as a whole, for example), it is now possible to add a payload to the commands. To do so, the first line needs to end with a special pattern: <<\n. Everything that follows will be left untouched by the CLI parser and will be passed to the commands parsers. Per-command support will need to be added to take advantage of this feature. Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>	2018-04-26 14:19:33 +02:00
Willy Tarreau	174b06a572	MINOR: h2: detect presence of CONNECT and/or content-length We'll need this in order to support uploading chunks. The h2 to h1 converter checks for the presence of the content-length header field as well as the CONNECT method and returns this information to the caller. The caller indicates whether or not a body is detected for the message (presence of END_STREAM or not). No transfer-encoding header is emitted yet.	2018-04-26 10:15:14 +02:00
Olivier Houchard	302f9ef055	BUG/MEDIUM: connection: Make sure we have a mux before calling detach(). In some cases, we call cs_destroy() very early, so early the connection doesn't yet have a mux, so we can't call mux->detach(). In this case, just destroy the associated connection. This should be backported to 1.8.	2018-04-13 16:02:21 +02:00
Christopher Faulet	48aa13f286	BUG/MEDIUM: threads: Fix the max/min calculation because of name clashes With gcc < 4.7, when HAProxy is built with threads, the macros HA_ATOMIC_CAS/XCHG/STORE rely on the legacy __sync builtins. These macros are slightly more complicated than the versions relying on the '__atomic' builtins. Internally, some local variables are defined, prefixed with '__' to avoid name clashes with the caller. On the other hand, the macros HA_ATOMIC_UPDATE_MIN/MAX call HA_ATOMIC_CAS. Some local variables are also defined in these macros, following the same naming rule as above. The problem is that the '__new' variable is used both in HA_ATOMIC_UPDATE_MIN/MAX and in HA_ATOMIC_CAS. Obviously, the behaviour is undefined because '__new' in HA_ATOMIC_CAS is left uninitialized. Unfortunately gcc fails to detect this error. To fix the problem, all variables internal to macros are now suffixed with the name of the macro to avoid clashes (for instance, '__new_cas' in HA_ATOMIC_CAS). This patch must be backported to 1.8.	2018-04-10 11:07:56 +02:00
Christopher Faulet	caf2feca62	MINOR: spoe: Add counters to log info about SPOE agents In addition to metrics about time spent in the SPOE, the following counters have been added: * applets : number of SPOE applets. * idles : number of idle applets. * nb_sending : number of streams waiting to send data. * nb_waiting : number of streams waiting for an ack. * nb_processed : number of events/groups processed by the SPOE (from the stream point of view). * nb_errors : number of errors during the processing (from the stream point of view). Log messages have been updated to report these counters. The following pattern has been added at the end of the log message: ... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>	2018-04-05 15:13:54 +02:00
Christopher Faulet	7250b8fb5c	MINOR: spoe: Add loggers dedicated to the SPOE agent Now it is possible to configure a logger in a spoe-agent section using a "log" line, as for a proxy. "no log", "log global" and "log <address> ..." syntaxes are supported.	2018-04-05 15:13:54 +02:00
Christopher Faulet	28ac099907	MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries With "log global" line, the global list of loggers are copied into the proxy's struct. The list coming from the default section is also copied when a frontend or a backend section is parsed. So it is possible to have duplicate entries in the proxy's list. For instance, with this following config, all messages will be logged twice: global log 127.0.0.1 local0 debug daemon defaults mode http log global option httplog frontend front-http log global bind *:8888 default_backend back-http backend back-http server www 127.0.0.1:8000	2018-04-05 15:13:54 +02:00
Christopher Faulet	4b0b79dd56	MINOR: log: move 'log' keyword parsing in dedicated function Now, the function parse_logsrv should be used to parse a "log" line. This function will update the list of loggers passed in argument. It can release all log servers when "no log" line was parsed (by the caller) or it can parse "log global" or "log <address> ... " lines. It takes care of checking the caller context (global or not) to prohibit "log global" usage in the global section.	2018-04-05 15:13:54 +02:00
Christopher Faulet	36bda1cd4a	MINOR: spoe: Add options to store processing times in variables "set-process-time" and "set-total-time" options have been added to store processing times in the transaction scope, at each event and group processing, the current one and the total one. So it is possible to get them. TODO: documentation	2018-04-05 15:13:54 +02:00
Christopher Faulet	b2dd1e034c	MINOR: spoe: Add metrics to know time spent in the SPOE The following metrics are added for each event or group of messages processed in the SPOE: * processing time: the delay to process the event or the group. From the stream point of view, it is the latency added by the SPOE processing. * request time : It is the encoding time. It includes ACLs processing, if any. For fragmented frames, it is the sum of all fragments. * queue time : the delay before the request gets out of the sending queue. For fragmented frames, it is the sum of all fragments. * waiting time: the delay before the response is received. No fragmentation supported here. * response time: the delay to process the response. No fragmentation supported here. * total time: (unused for now). It is the sum of all events or groups processed by the SPOE for a specific thread. Log messages have been updated. Before, only errors were logged (status_code != 0). Now every processing is logged, following this format: SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUS-CODE reqT/qT/wT/resT/pT where: AGENT is the agent name TYPE is EVENT or GROUP NAME is the event or the group name STREAM-ID is an integer, the unique id of the stream STATUS-CODE is the processing's status code reqT/qT/wT/resT/pT are the delays described above For all these delays, -1 means the processing was interrupted before the end. So -1 for the queue time means the request was never dequeued. For fragmented frames it is harder to know when the interruption happened. For now, messages are logged using the same logger as the backend of the stream which initiated the request.	2018-04-05 15:13:53 +02:00
Olivier Houchard	8ef1a6b0d8	BUG/MINOR: fd: Don't clear the update_mask in fd_insert. Clearing the update_mask bit in fd_insert may lead to duplicate insertion of fd in fd_updt, that could lead to a write past the end of the array. Instead, make sure the update_mask bit is cleared by the pollers no matter what. This should be backported to 1.8. [wt: warning: 1.8 doesn't have the lockless fdcache changes and will require some careful changes in the pollers]	2018-04-03 19:38:15 +02:00
Willy Tarreau	b011d8f4c4	MINOR: mux: add a "show_fd" function to dump debugging information for "show fd" This function will be called from the CLI's "show fd" command to append some extra mux-specific information that only the mux handler can decode. This is supposed to help collect various hints about what is happening when facing certain anomalies.	2018-03-30 14:41:19 +02:00
Willy Tarreau	4037a3f904	MINOR: cli/threads: make "show fd" report thread_sync_io_handler instead of "unknown" The output was confusing when the sync point's dummy handler was shown. This patch should be backported to 1.8 to help with troubleshooting.	2018-03-28 18:06:47 +02:00
Emmanuel Hocdet	4952985b71	REORG: compact "struct server" Move use_ssl (bool value) in "struct server" hole.	2018-03-21 05:04:01 +01:00
Emmanuel Hocdet	4399c75f6c	MINOR: proxy-v2-options: add crc32c This patch adds option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2. It computes the checksum of the proxy protocol v2 header as described in "doc/proxy-protocol.txt".	2018-03-21 05:04:01 +01:00
Emmanuel Hocdet	6afd898988	MINOR: hash: add new function hash_crc32c This function will be used to perform CRC32c computations. This is required to compute proxy protocol v2 CRC32C tlv (PP2_TYPE_CRC32C).	2018-03-21 05:04:01 +01:00
Willy Tarreau	26fb5d8449	BUG/MEDIUM: fd/threads: ensure the fdcache_mask always reflects the cache contents Commit `4815c8c` ("MAJOR: fd/threads: Make the fdcache mostly lockless.") made the fd cache lockless, but after a few iterations, a subtle part was lost, consisting in setting the bit on the fd_cache_mask immediately when adding an event. Now it was done only when the cache started to process events, but the problem it causes is that fd_cache_mask isn't reliable anymore as an indicator of presence of events to be processed with no delay outside of fd_process_cached_events(). This results in some spurious delays when processing inter-thread wakeups between tasks. Just restoring the flag when the event is added is enough to fix the problem. Kudos to Christopher for spotting this one! No backport is needed as this is only in the development version.	2018-03-20 19:14:24 +01:00
Christopher Faulet	5cd4bbd7ab	BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management The management of the servers and the proxies queues was not thread-safe at all. First, the accesses to <strm>->pend_pos were not protected. So it was possible to release it on a thread (for instance because the stream is released) and to use it at the same time on another one (because we redispatch pending connections for a server). Then, the accesses to stream's information (flags and target) from anywhere is forbidden. To be safe, the stream's state must always be updated in the context of process_stream. So to fix these issues, the queue module has been refactored. A lock has been added in the pendconn structure. And now, when we try to dequeue a pending connection, we start by unlinking it from the server/proxy queue and we wake up the stream. Then, it is the stream's responsibility to really dequeue it (or release it). This way, we are sure that only the stream can create and release its <pend_pos> field. However, be careful. This new implementation should be thread-safe (hopefully...). But it is not optimal and in some situations, it could be really slower in multi-threaded mode than in single-threaded one. The problem is that, when we try to dequeue pending connections, we process it from the older one to the newer one independently of the thread's affinity. So we need to wait for the other threads' wakeup to really process them. If threads are blocked in the poller, this will add a significant latency. This problem happens when maxconn values are very low. This patch must be backported in 1.8.	2018-03-19 10:03:06 +01:00
Christopher Faulet	510c0d67ef	BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled When a listener is temporarily disabled, we start by locking it and then we call .pause callback of the underlying protocol (tcp/unix). For TCP listeners, this is not a problem. But listeners bound on a unix socket are in fact closed instead. So .pause callback relies on unbind_listener function to do its job. Unfortunately, unbind_listener holds the listener's lock and then calls an internal function to unbind it. So, there is a deadlock here. This happens during a reload. To fix the problem, the function do_unbind_listener, which is lockless, is now exported and is called when a listener bound on a unix socket is temporarily disabled. This patch must be backported in 1.8.	2018-03-16 11:19:07 +01:00
Willy Tarreau	c41b3e8dff	DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers() This one is only used to compare pointers and NULL is permitted though this is far from being clear.	2018-03-08 18:33:48 +01:00
Emmanuel Hocdet	253c3b7516	MINOR: connection: add proxy-v2-options authority This patch adds option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS connection was negotiated. In this case, authority corresponds to the sni.	2018-03-01 11:38:32 +01:00
Emmanuel Hocdet	fa8d0f1875	MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key This patch implements proxy protocol v2 options related to crypto information: ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and cert-key (PP2_SUBTYPE_SSL_KEY_ALG).	2018-03-01 11:38:28 +01:00
Emmanuel Hocdet	283e004a85	MINOR: ssl: add ssl_sock_get_cert_sig function ssl_sock_get_cert_sig can be used to report cert signature short name to log and ppv2 (RSA-SHA256).	2018-03-01 11:34:08 +01:00
Emmanuel Hocdet	96b7834e98	MINOR: ssl: add ssl_sock_get_pkey_algo function ssl_sock_get_pkey_algo can be used to report pkey algorithm to log and ppv2 (RSA2048, EC256,...). Extracting pkey information is not free in ssl api (lock/alloc/free): haproxy can use the pkey information computed in load_certificate. Store and use this information in a SSL ex_data when available, compute it if not (SSL multicert bundled and generated cert).	2018-03-01 11:34:05 +01:00
Emmanuel Hocdet	ddc090bc55	MINOR: ssl: extract full pkey info in load_certificate Private key information is used in switchctx to implement native multicert selection (ecdsa/rsa/anonymous). This patch extracts and stores full pkey information: dsa type and pkey size in bits. This can be used for switchctx or to report pkey information in ppv2 and log.	2018-03-01 11:33:18 +01:00
Christopher Faulet	ca6ef50661	BUG/MEDIUM: buffer: Fix the wrapping case in bi_putblk When the block of data needs to be split to support the wrapping, the start of the second block of data was wrong. We must be sure to skip data copied during the first memcpy. This patch must be backported to 1.8.	2018-02-27 15:45:03 +01:00
Christopher Faulet	b2b279464c	BUG/MEDIUM: buffer: Fix the wrapping case in bo_putblk When the block of data needs to be split to support the wrapping, the start of the second block of data was wrong. We must be sure to skip data copied during the first memcpy. This patch must be backported to 1.8, 1.7, 1.6 and 1.5.	2018-02-27 15:45:03 +01:00
Yves Lafon	95317289e9	MINOR: stats: display the number of threads in the statistics. Add the nbthread global variable to the output, matching nbproc. This may be backported to 1.8	2018-02-26 11:53:46 +01:00
Willy Tarreau	364d745106	MINOR: debug/pools: make DEBUG_UAF also detect underflows Since we use padding before the allocated page, it's trivial to place the allocated address there and see if it gets mangled once we release it. This may be backported to stable releases already using DEBUG_UAF.	2018-02-22 14:18:45 +01:00
Willy Tarreau	5a9cce4653	BUG/MINOR: debug/pools: properly handle out-of-memory when building with DEBUG_UAF Commit `158fa75` ("MINOR: pools: implement DEBUG_UAF to detect use after free") implemented pool use-after-free detection, but the mmap() return value isn't properly checked, preventing the call to pool_alloc_area() from returning NULL. So on out-of-memory a mangled pointer is returned, causing a crash on the pool_alloc() site instead of forcing a GC. It doesn't affect regular operations however, just complicates complex bug investigations. This fix should be backported to 1.8 and to 1.7.	2018-02-22 14:18:45 +01:00
Willy Tarreau	f161d0f51e	BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs Since commit `cf975d4` ("MINOR: pools/threads: Implement lockless memory pools."), we support lockless pools. However the parts dedicated to detecting use-after-free are not present in this part, making DEBUG_UAF useless in this situation. The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such a compatible architecture is detected, and when pool debugging is not requested, then makes use of this everywhere in pools and buffers functions. This way enabling DEBUG_UAF will automatically disable the lockless version. No backport is needed as this is purely 1.9-dev.	2018-02-22 14:18:45 +01:00
Tim Duesterhus	5e64286bab	CLEANUP: standard: Fix typo in IPv6 mask example IPv6 addresses with two double colons are invalid. This typo was introduced in commit `471851713a`.	2018-02-21 05:07:35 +01:00
Tim Duesterhus	05f6a43bd4	CLEANUP: pools: Remove unused end label in memory.h This removes the end label from memory.h. The labels are unused as of `cf975d46bc` which is unreleased (and incidentally the first commit containing those labels, thus they never have been used).	2018-02-20 08:30:13 +01:00
Christopher Faulet	16f45c87d5	BUG/MINOR: ssl/threads: Make management of the TLS ticket keys files thread-safe A TLS ticket keys file can be updated on the CLI and used at the same time. So we need to protect it to be sure all accesses are thread-safe. Because updates are infrequent, a R/W lock has been used. This patch must be backported in 1.8	2018-02-19 14:15:38 +01:00
David Carlier	4ee76d0281	BUILD/MINOR: memory: stdint is needed for uintptr_t stdint.h is needed on OpenBSD for uintptr_t type.	2018-02-19 07:58:50 +01:00
Willy Tarreau	41ccb194d1	BUG/MEDIUM: threads: fix the double CAS implementation for ARMv7 Commit `f61f0cb` ("MINOR: threads: Introduce double-width CAS on x86_64 and arm.") introduced the double CAS. But the ARMv7 version is bogus, it uses the value of the pointers instead of dereferencing them. When lucky, it simply doesn't build due to impossible registers combinations. Otherwise it will immediately crash at run time when facing traffic. No backport is needed, this bug was introduced in 1.9-dev.	2018-02-14 14:16:28 +01:00
Willy Tarreau	4cc67a2782	MINOR: fd: move the fd_{add_to,rm_from}_fdlist functions to fd.c There's no point in inlining these huge functions, better move them to real functions in fd.c.	2018-02-05 17:19:40 +01:00
Willy Tarreau	4d84186337	MEDIUM: fd: make updt_fd_polling() use atomics It only needed a test-and-set and an atomic increment so we can take it out of the fd lock now.	2018-02-05 16:02:22 +01:00
Willy Tarreau	1b76a6d1a6	CLEANUP: fd: remove the now unused fd_compute_new_polled_status() function It's not used anymore since the new state is calculated on the fly during every update. Let's remove this function.	2018-02-05 16:02:22 +01:00
Willy Tarreau	7ac0e35f23	MAJOR: fd: compute the new fd polling state out of the fd lock Each fd_{may\|cant\|stop\|want}_{recv\|send} function sets or resets a single bit at once, then recomputes the need for updates, and then the new cache state. Later, pollers will compute the new polling state based on the resulting operations here. In fact the conditions are so simple that they can be performed by a single "if", or sometimes even optimized away. This means that in practice a simple compare-and-swap operation is often enough to set the new value including the new polling state, and that only the cache and fdupdt have to be performed under the lock. Better, for the most common operations (fd_may_{recv,send}, used by the pollers), a simple atomic OR is needed. This patch does this for the fd_* functions above and it doesn't yet remove the now useless fd_compute_new_polled_status() because it's still used by other pollers. A pure connection rate test shows a 1% performance increase.	2018-02-05 16:02:22 +01:00
Olivier Houchard	1256836ebf	MEDIUM: fd/threads: Make sure we don't miss a fd cache entry. An fd cache entry might be removed and added at the end of the list, while another thread is parsing it, if that happens, we may miss fd cache entries, to avoid that, add a new field in the struct fdtab, "added_mask", which contains a mask for potentially affected threads, if it is set, the corresponding thread will set its bit in fd_cache_mask, to avoid waiting in poll while it may have more work to do.	2018-02-05 16:02:22 +01:00
Olivier Houchard	4815c8cbfe	MAJOR: fd/threads: Make the fdcache mostly lockless. Create a local, per-thread, fdcache, for file descriptors that only belongs to one thread, and make the global fd cache mostly lockless, as we can get a lot of contention on the fd cache lock.	2018-02-05 16:02:22 +01:00
Olivier Houchard	cf975d46bc	MINOR: pools/threads: Implement lockless memory pools. On CPUs that support a double-width compare-and-swap, implement lockless pools.	2018-02-05 16:02:22 +01:00
Willy Tarreau	5266b3e12d	MINOR: threads: add test and set/reset operations This just adds a set of naive bts/btr operations based on OR/AND. Later it could rely on pl_bts/btr to use arch-specific versions if needed.	2018-02-05 14:24:50 +01:00
Olivier Houchard	f61f0cb95f	MINOR: threads: Introduce double-width CAS on x86_64 and arm. Introduce double-width compare-and-swap on arches that support it, right now x86_64, arm, and aarch64. Also introduce functions to do memory barriers.	2018-02-05 14:24:50 +01:00
Olivier Houchard	928fbfa8b7	MINOR: compiler: introduce offsetof(). Add an offsetof() macro, if it is not there already.	2018-02-05 14:24:50 +01:00
Olivier Houchard	6fa63d9852	MINOR: early data: Don't rely on CO_FL_EARLY_DATA to wake up streams. Instead of looking for CO_FL_EARLY_DATA to know if we have to try to wake up a stream, because it is waiting for a SSL handshake, instead add a new conn_stream flag, CS_FL_WAIT_FOR_HS. This way we don't have to rely on CO_FL_EARLY_DATA, and we will only wake streams that are actually waiting.	2018-02-05 14:24:50 +01:00
Christopher Faulet	b077cdc012	MEDIUM: spoe: Use an ebtree to manage idle applets Instead of using a list of applets with idle ones in front, we now use an ebtree. Applets in the tree are idle by definition. And the key is the applet's weight. When a new frame is queued, the first idle applet (with the lowest weight) is woken up and its weight is increased by one. And when an applet sends a frame to a SPOA, its weight is decremented by one. This is empirical, but it should avoid overusing a small number of applets and improve the balancing between idle applets.	2018-02-02 16:00:32 +01:00
Christopher Faulet	8f82b203d5	MINOR: spoe: Count the number of frames waiting for an ack for each applet So it is easier to respect the max_fpa value. This is no more the maximum frames processed by an applet at each loop but the maximum frames waiting for an ack for a specific applet. The function spoe_handle_processing_appctx has been rewritten accordingly.	2018-02-02 16:00:32 +01:00
Christopher Faulet	6f9ea4f87b	MINOR: spoe: Replace sending_rate by a frequency counter sending_rate was a counter used to evaluate the SPOE capacity to process frames. Because it was not really accurate, it has been replaced by a frequency counter representing the number of frames handled by the SPOE per second. We just check this counter is higher than the number of streams waiting for a reply. If not, a new applet is created.	2018-02-02 16:00:32 +01:00
Christopher Faulet	fce747bbaa	MINOR: spoe: Always link a SPOE context with the applet processing it This was already done for fragmented frames. Now, this is true for all frames.	2018-02-02 16:00:32 +01:00
Christopher Faulet	420977903b	MINOR: spoe: Remove check on min_applets number when a SPOE context is queued The calculation of a minimal number of active applets was really empirical and finally useless. On heavy load, there are always many active applets (most of time, more than the minimal required) and when the load is low, there is no reason to keep unused applets opened. Because of this change, the flag SPOE_APPCTX_FL_PERSIST is now unused. So it has been removed.	2018-02-02 16:00:32 +01:00
Frédéric Lécaille	6778b27542	MINOR: stick-tables: Adds support for new "gpc1" and "gpc1_rate" counters. Implement exactly the same code as this has been done for "gpc0" and "gpc0_rate" counters.	2018-01-31 09:40:05 +01:00
Christopher Faulet	f51bac2ba8	BUG/MINOR: threads: Update labels array because of changes in lock_label enum Recent changes to the enum were not synchronized with the lock debugging code. Now we use a switch/case instead of an array so that the compiler throws a warning if there is any inconsistency. To be backported to 1.8 (at least to add the START entry).	2018-01-30 14:35:24 +01:00
Willy Tarreau	a9786b6f04	MINOR: fd: pass the iocb and owner to fd_insert() fd_insert() is currently called just after setting the owner and iocb, but proceeding like this prevents the operation from being atomic and requires a lock to protect the maxfd computation in another thread from meeting an incompletely initialized FD and computing a wrong maxfd. Fortunately for now all fdtab[].owner are set before calling fd_insert(), and the first lock in fd_insert() enforces a memory barrier so the code is safe. This patch moves the initialization of the owner and iocb to fd_insert() so that the function will be able to properly arrange its operations and remain safe even when modified to become lockless. There's no other change beyond the internal API.	2018-01-29 16:07:25 +01:00
Willy Tarreau	82b37d74d2	MEDIUM: fd: use atomic ops for hap_fd_{clr,set} and remove poll_lock Now that we can use atomic ops to set/clear an fd occurrence in an fd_set, we don't need the poll_lock anymore. Let's remove it.	2018-01-29 16:03:15 +01:00
Willy Tarreau	322e6c7e73	MINOR: fd: move the hap_fd_{clr,set,isset} functions to fd.h These functions were created for poll() in 1.5-dev18 (commit `80da05a4`) to replace the previous FD_{CLR,SET,ISSET} that were shared with select() because some libcs enforce a limit on FD_SET. But FD_SET doesn't seem to be universally MT-safe, requiring locks in the select() code that are not needed in the poll code. So let's move back to the initial situation where we used to only use bit fields, since that has been in use since day one without a problem, and let's use these hap_fd_* functions instead of FD_*. This patch only moves the functions to fd.h and revives hap_fd_isset() that was recently removed to kill an "unused" warning.	2018-01-29 16:03:15 +01:00
Willy Tarreau	745c60eac6	CLEANUP: fd: remove the unused "new" field This field has been unused since 1.6, it's only updated and never tested. Let's remove it.	2018-01-29 16:02:59 +01:00
Willy Tarreau	f2b5c99b4c	CLEANUP: fd/threads: remove the now unused fdtab_lock It was only used to protect maxfd computation and is not needed anymore.	2018-01-29 15:25:35 +01:00
Willy Tarreau	173d9951e2	MEDIUM: polling: start to move maxfd computation to the pollers Since only select() and poll() still make use of maxfd, let's move its computation right there in the pollers themselves, and only during each fd update pass. The computation doesn't need a lock anymore, only a few atomic ops. It will be accurate, be done much less often and will not be required anymore in the FD's fast path. This provides a small performance increase of about 1% in connection rate when using epoll since we get rid of this computation which was performed under a lock.	2018-01-29 15:22:57 +01:00
Frédéric Lécaille	a41d531e4e	MINOR: config: Enable tracking of up to MAX_SESS_STKCTR stick counters. This patch really adds support for up to MAX_SESS_STKCTR stick counters.	2018-01-29 13:53:56 +01:00
Tim Duesterhus	471851713a	MINOR: standard: Add str2mask6 function This new function mirrors the str2mask() function for IPv4 addresses. This commit is in preparation to support ARGT_MSK6.	2018-01-25 22:25:40 +01:00
Tim Duesterhus	92bb034209	CLEANUP: Fix typo in ARGT_MSK6 comment The incorrect comment was introduced in commit: `2ac5718dbd` v1.5-dev9 is the first tag containing this comment, the fix should be backported to haproxy 1.5 and newer.	2018-01-25 22:25:40 +01:00
Willy Tarreau	1605c7ae61	BUG/MEDIUM: threads/mworker: fix a race on startup Marc Fournier reported an interesting case when using threads with the master-worker mode : sometimes, a listener would have its FD closed during startup. Sometimes it could even be health checks seeing this. What happens is that after the threads are created, and the pollers enabled on each threads, the master-worker pipe is registered, and at the same time a close() is performed on the write side of this pipe since the children must not use it. But since this is replicated in every thread, what happens is that the first thread closes the pipe, thus releases the FD, and the next thread starting a listener in parallel gets this FD reassigned. Then another thread closes the FD again, which this time corresponds to the listener. It can also happen with the health check sockets if they're started early enough. This patch splits the mworker_pipe_register() function in two, so that the close() of the write side of the FD is performed very early after the fork() and long before threads are created (we don't need to delay it anyway). Only the pipe registration is done in the threaded code since it is important that the pollers are properly allocated for this. The mworker_pipe_register() function now takes care of registering the pipe only once, and this is guaranteed by a new surrounding lock. The call to protocol_enable_all() looks fragile in theory since it scans the list of proxies and their listeners, though in practice all threads scan the same list and take the same locks for each listener so it's not possible that any of them escapes the process and finishes before all listeners are started. And the operation is idempotent. This fix must be backported to 1.8. Thanks to Marc for providing very detailed traces clearly showing the problem.	2018-01-23 19:18:57 +01:00
Willy Tarreau	c9c8378c2b	MINOR: fd: add a bitmask to indicate that an FD is known by the poller Some pollers like epoll() need to know if the fd is already known or not in order to compute the operation to perform (add, mod, del). For now this is performed based on the difference between the previous FD state and the new state but this will not be usable anymore once threads become responsible for their own polling. Here we come with a different approach : a bitmask is stored with the fd to indicate which pollers already know it, and the pollers will be able to simply perform the add/mod/del operations based on this bit combined with the new state. This patch only adds the bitmask declaration and initialization, it is not yet used. It will be needed by the next two fixes and will need to be backported to 1.8.	2018-01-23 15:42:57 +01:00
Willy Tarreau	ebc78d78a2	BUG/MEDIUM: fd: maintain a per-thread update mask Since the fd update tables are per-thread, we need to have a bit per thread to indicate whether an update exists, otherwise this can lead to lost update events every time multiple threads want to update the same FD. In practice for now, it only happens at start time when listeners are enabled and ask for polling after facing their first EAGAIN. But since the pollers are still shared, a lost event is still recovered by a neighbor thread. This will not reliably work anymore with per-thread pollers, where it has been observed a few times on startup that a single-threaded listener would not always accept incoming connections upon startup. It's worth noting that during this code review it appeared that the "new" flag in the fdtab isn't used anymore. This fix should be backported to 1.8.	2018-01-23 15:41:19 +01:00
Christopher Faulet	69553fe62c	MINOR: threads/fd: Use a bitfield to know if there are FDs for a thread in the FD cache A bitfield has been added to know if there are some FDs processable by a specific thread in the FD cache. When a FD is inserted in the FD cache, the bits corresponding to its thread_mask are set. On each thread, the bitfield is updated when the FD cache is processed. If there is no FD processed, the thread is removed from the bitfield by unsetting its tid_bit. Note that this bitfield is updated but not checked in fd_process_cached_events. So, when this function is called, the FDs cache is always processed. [wt: should be backported to 1.8 as it will help fix a design limitation]	2018-01-23 15:39:10 +01:00
Willy Tarreau	d80cb4ee13	MINOR: global: add some global activity counters to help debugging A number of counters have been added at special places helping better understanding certain bug reports. These counters are maintained per thread and are shown using "show activity" on the CLI. The "clear counters" commands also reset these counters. The output is sent as a single write(), which currently produces up to about 7 kB of data for 64 threads. If more counters are added, it may be necessary to write into multiple buffers, or to reset the counters. To backport to 1.8 to help collect more detailed bug reports.	2018-01-23 15:38:33 +01:00
Willy Tarreau	421f02e738	MINOR: threads: add a MAX_THREADS define instead of LONGBITS This one allows not to inflate some structures when threads are disabled. Now struct global is 1.4 kB instead of 33 kB. Should be backported to 1.8 for ease of backporting of upcoming patches.	2018-01-23 15:28:20 +01:00
Willy Tarreau	f4571a027f	MINOR: global/threads: move cpu_map at the end of the global struct The "thread" part is 32kB long, better move it at the end of the structure since it's only used during initialization, to keep the rest grouped together. Should be backported to 1.8 to ease backporting of upcoming patches, no functional impact.	2018-01-23 15:27:52 +01:00
Christopher Faulet	336d3ef0e7	MINOR: spoe: add register-var-names directive in spoe-agent configuration In addition to "option force-set-var", recently added, this directive can be used to selectively register unknown variable names, without totally relaxing their registration during the runtime, like "option force-set-var" does. So there is no way for a malicious agent to exhaust memory by defining too many variable names. On the other hand, you need to enumerate all variable names. This could be painful in some circumstances. Remember, this directive is only useful when the variable names are not referenced anywhere in the HAProxy configuration or the SPOE one. Thanks to Etienne Carrière for his help on this part.	2018-01-15 13:47:27 +01:00
David Carlier	ec5e84552a	BUILD/MINOR: ancient gcc versions atomic fix Commit `1a69af6d38` introduced code for atomic prior to 4.7. Unfortunately clang uses those constants as well, which is misleading.	2018-01-11 15:31:07 +01:00
Willy Tarreau	1a69af6d38	MINOR: hathreads: add support for gcc < 4.7 Till now the use of __atomic_* gcc builtins required gcc >= 4.7. Since some supported and quite common operating systems like CentOS 6 still come with older versions (4.4) and the mapping to the older builtins is reasonably simple, let's implement it. This code is only used for gcc < 4.7. It has been quickly tested on a machine using gcc 4.4.4 and provided expected results. This patch should be backported to 1.8.	2018-01-10 07:51:56 +01:00
Olivier Houchard	2ec2db9725	MINOR: dns: Handle SRV record weight correctly. A SRV record weight can range from 0 to 65535, while haproxy weight goes from 0 to 256, so we have to divide it by 256 before handing it to haproxy. Also, a SRV record with a weight of 0 doesn't mean the server shouldn't be used, so use a minimum weight of 1. This should probably be backported to 1.8.	2018-01-09 15:43:11 +01:00
Olivier Houchard	e2a34967a9	CLEANUP: rbtree: remove Remove the rbtree implementation. It's not used, it's not even connected to the build, and we probably have no use for it .	2018-01-05 10:56:32 +01:00
Willy Tarreau	3083276187	MINOR: h2: add a function to report pseudo-header names For debugging we need to be able to dump pseudo headers when we know their name, let's put this there as we already have the other way around.	2017-12-30 17:17:07 +01:00
Willy Tarreau	a48c141f44	BUG/MAJOR: connection: refine the situations where we don't send shutw() Since commit `f9ce57e` ("MEDIUM: connection: make conn_sock_shutw() aware of lingering"), we refrain from performing the shutw() on the socket if there is no lingering risk. But there is a problem with this in tunnel and in TCP modes where a client is explicitly allowed to send a shutw to the server, even though it is risky. Not doing it creates this situation reported by Ricardo Fraile and diagnosed by Christopher : a typical HTTP client (eg: curl) connecting via the config below to an HTTP server would receive its response, immediately close while the server remains in keep-alive mode. The shutr() received by haproxy from the client is "propagated" to the server side but not acted upon because fdtab[fd].linger_risk is set, so we expect that the next close will immediately complete this operation. listen proxy-tcp bind 127.0.0.1:8888 mode tcp timeout connect 5s timeout server 10s timeout client 10s server server1 127.0.0.1:8000 But since the whole stream will not end until the server closes in turn, the server doesn't close and haproxy expires on server timeout. This problem has already struck by waking up an older bug and was partially fixed with commit `8059351` ("BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses") though it was not enough. The problem is that linger_risk is not suited here. In fact we need to know whether or not it is desired to close normally or silently, and whether or not a shutr() has already been received on this connection. This is the approach this patch takes, and it solves the problem for the various difficult modes (tcp, http-server-close, pretend-keepalive). This fix needs to be backported to 1.8. Many thanks to Ricardo for providing very detailed traces and configurations.	2017-12-22 18:54:05 +01:00
Willy Tarreau	0ad8e0dfea	MINOR: http: add a function to check request's cache-control header field The new function check_request_for_cacheability() is used to check if a request may be served from the cache, and/or allows the response to be stored into the cache. For this it checks the cache-control and pragma header fields, and adjusts the existing TX_CACHEABLE and a new TX_CACHE_IGNORE flags. For now, just like its response side counterpart, it only checks the first value of the header field. These functions should be reworked to improve their parsers and validate all elements.	2017-12-22 17:56:17 +01:00
Willy Tarreau	984fca9363	MINOR: stream-int: set flag SI_FL_CLEAN_ABRT when mux supports clean aborts By copying the info in the stream interface that the mux cleanly reports aborts, we'll have the ability to check this flag wherever needed regardless of the presence of a mux or not.	2017-12-20 16:56:32 +01:00
Willy Tarreau	28f1cb9da2	MINOR: mux: add flags to describe a mux's capabilities This new field will be used to describe certain properties of some muxes. For now we only add MX_FL_CLEAN_ABRT to indicate that a mux is able to unambiguously report aborts using CS_FL_ERROR contrary to others who may only report it via a read0. This will be used to improve handling of the abortonclose option with H2. Other flags may come later to report multiplexing capabilities or not, support of client/server sides etc.	2017-12-20 16:31:30 +01:00
Etienne Carriere	aec8989e53	MINOR: spoe: add force-set-var option in spoe-agent configuration For security reasons, the spoe filter was only able to change values of existing variables. In specific cases (ex : with LUA code), the names of variables are unknown at the configuration parsing phase. The force-set-var option can be enabled to register all variables.	2017-12-20 08:55:18 +01:00
Willy Tarreau	3c8294b607	MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data Due to the nature of multiplexed protocols, it will often happen that some operations are only performed on full frames, preventing any partial operation from being performed. HTTP/2 is one such example. The current MUX API causes a problem here because the rcv_buf() function has no way to let the stream layer know that some data could not be read due to a lack of room in the buffer, but that data are definitely present. The problem with this is that the stream layer might not know it needs to call the function again after it has made some room. And if the frame in the buffer is not followed by any other, nothing will move anymore. This patch introduces a new conn_stream flag CS_FL_RCV_MORE whose purpose is to indicate on the stream that more data than what was received are already available for reading as soon as more room will be available in the buffer. This patch doesn't make use of this flag yet, it only declares it. It is expected that other similar flags may come in the future, such as reports of pending end of stream, errors or any such event that might save the caller from having to poll, or simply let it know that it can take some actions after having processed data.	2017-12-10 21:13:25 +01:00
Thierry FOURNIER	cb14688496	BUG/MEDIUM: lua/notification: memory leak The thread patches added a refcount for notifications. The notifications are used with the Lua cosocket. These refcounts free the notifications when the session is cleared. A Lua task has no session, so its notifications are never cleared. This patch adds a garbage collector for signals. The garbage collector just cleans the notifications for which the end point is disconnected. This patch should be backported in 1.8	2017-12-10 19:38:58 +01:00
Thierry FOURNIER	d5b79835f8	DOC: notifications: add precisions about thread usage Clarify the terms of use of the notification functions.	2017-12-10 19:38:55 +01:00
Emeric Brun	ece0c334bd	BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically. The number of async fd is computed considering the maxconn, the number of sides using ssl and the number of engines using async mode. This patch should be backported on haproxy 1.8	2017-12-06 14:17:41 +01:00
Willy Tarreau	6c71e4696b	BUG/MAJOR: hpack: don't pretend large headers fit in empty table In hpack_dht_make_room(), we try to fulfill this rule from RFC7541#4.4 : "It is not an error to attempt to add an entry that is larger than the maximum size; an attempt to add an entry larger than the maximum size causes the table to be emptied of all existing entries and results in an empty table." Unfortunately it is not consistent with the way it's used in hpack_dht_insert() as this last one will consider a success as a confirmation it can copy the header into the table, and a failure as an indexing error. This results in the two following issues : - if a client sends too large a header into an empty table, this header may overflow the table. Fortunately, most clients send small headers like :authority first, and never mark headers that don't fit into the table as indexable since it is counter-productive ; - if a client sends too large a header into a populated table, the operation fails after the table is totally flushed and the request is not processed. This patch fixes the two issues at once : - a header not fitting into an empty table is always a sign that it will never fit ; - not fitting into the table is not an error Thanks to Yves Lafon for reporting detailed traces demonstrating this issue. This fix must be backported to 1.8.	2017-12-04 18:06:51 +01:00
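The rule quoted from RFC 7541 section 4.4 can be sketched as follows. This is not the code of hpack_dht_make_room() or hpack_dht_insert(): the table is reduced to a FIFO of entry sizes, and every name here (`struct dht`, `dht_insert()`, `DHT_MAX_ENTRIES`) is hypothetical.

```c
#include <stddef.h>

#define DHT_MAX_ENTRIES 64  /* arbitrary bound for this sketch */

/* Hypothetical, simplified dynamic table: a FIFO of entry sizes. */
struct dht {
	size_t max_size;                /* negotiated maximum table size */
	size_t used;                    /* sum of the sizes of stored entries */
	size_t sizes[DHT_MAX_ENTRIES];  /* oldest entry first */
	size_t count;                   /* number of stored entries */
};

/* Drops the oldest entry. The caller guarantees that count > 0. */
static void dht_evict_oldest(struct dht *t)
{
	size_t i;

	t->used -= t->sizes[0];
	for (i = 1; i < t->count; i++)
		t->sizes[i - 1] = t->sizes[i];
	t->count--;
}

/* Returns 1 if the entry was stored, 0 if it was not indexed. Neither
 * outcome is an error: an entry larger than the whole table simply
 * leaves the table empty.
 */
static int dht_insert(struct dht *t, size_t entry_size)
{
	if (entry_size > t->max_size) {
		/* it will never fit: flush everything and store nothing */
		t->used = 0;
		t->count = 0;
		return 0;
	}

	/* evict the oldest entries until the new one fits */
	while (t->count &&
	       (t->used + entry_size > t->max_size || t->count == DHT_MAX_ENTRIES))
		dht_evict_oldest(t);

	t->sizes[t->count++] = entry_size;
	t->used += entry_size;
	return 1;
}
```

The point of the two return values is that the caller keeps processing the request in both cases, and only copies the header into the table when the function reports that it was stored.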
Willy Tarreau	d85ba4e092	BUG/MINOR: hpack: reject invalid header index If the hpack decoder sees an invalid header index, it emits value "### ERR ###" that was used during debugging instead of rejecting the block. This is harmless, and was detected by h2spec. To backport to 1.8.	2017-12-03 21:08:39 +01:00
Emeric Brun	0fed0b0a38	BUG/MEDIUM: peers: fix some track counter rules dont register entries for sync. This bug was introduced with: 'MEDIUM: threads/stick-tables: handle multithreads on stick tables' The API was reviewed to handle stick table entry updates asynchronously and the caller must now call a 'stktable_touch_' function each time the content of an entry is modified to register the entry to be synced. A call to stktable_touch_ was missing, so entries were not propagated to remote peers (or to the local one during reload)	2017-11-29 19:16:22 +01:00
Willy Tarreau	ec7464726f	BUILD: checks: don't include server.h server.h needs checks.h since it references the struct check, but depending on the include order it will fail if checks.h is included first due to this one including server.h in turn while it doesn't need it.	2017-11-29 10:54:05 +01:00
Willy Tarreau	b306650c2a	[RELEASE] Released version 1.9-dev0 Released version 1.9-dev0 with the following main changes : - BUG/MEDIUM: stream: don't automatically forward connect nor close - BUG/MAJOR: stream: ensure analysers are always called upon close - BUG/MINOR: stream-int: don't try to read again when CF_READ_DONTWAIT is set - MEDIUM: mworker: Add systemd `Type=notify` support - BUG/MEDIUM: cache: free callback to remove from tree - CLEANUP: cache: remove unused struct - MEDIUM: cache: enable the HTTP analysers - CLEANUP: cache: remove wrong comment - MINOR: threads/atomic: rename local variables in macros to avoid conflicts - MINOR: threads/plock: rename local variables in macros to avoid conflicts - MINOR: threads/atomic: implement pl_mb() in asm on x86 - MINOR: threads/atomic: implement pl_bts() on non-x86 - MINOR: threads/build: atomic: replace the few inlines with macros - BUILD: threads/plock: fix a build issue on Clang without optimization - BUILD: ebtree: don't redefine types u32/s32 in scope-aware trees - BUILD: compiler: add a new type modifier __maybe_unused - BUILD: h2: mark some inlined functions "unused" - BUILD: server: check->desc always exists - BUG/MEDIUM: h2: properly report connection errors in headers and data handlers - MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list - MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers - BUG/MEDIUM: h2: always reassemble the Cookie request header field - BUG/MINOR: systemd: ignore daemon mode - CONTRIB: spoa_example: allow to compile outside HAProxy. 
- CONTRIB: spoa_example: remove bref, wordlist, cond_wordlist - CONTRIB: spoa_example: remove last dependencies on type "sample" - CONTRIB: spoa_example: remove SPOE enums that are useless for clients - CLEANUP: cache: reorder includes - MEDIUM: shctx: use unsigned int for len and block_count - MEDIUM: cache: "show cache" on the cli - BUG/MEDIUM: cache: use key=0 as a condition for freeing - BUG/MEDIUM: cache: refcount forbids to free the objects - BUG/MEDIUM: cache fix cli_kws structure - BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks - BUG/MINOR: ssl: Always start the handshake if we can't send early data. - MINOR: ssl: Don't disable early data handling if we could not write. - MINOR: pools: prepare functions to override malloc/free in pools - MINOR: pools: implement DEBUG_UAF to detect use after free - BUG/MEDIUM: threads/time: fix time drift correction - BUG/MEDIUM: threads/time: maintain a common time reference between all threads - MINOR: sample: Add "thread" sample fetch - BUG/MINOR: Use crt_base instead of ca_base when crt is parsed on a server line - BUG/MINOR: stream: fix tv_request calculation for applets - BUG/MAJOR: h2: always remove a stream from the send list before freeing it - BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock - MINOR: ssl: Handle reading early data after writing better. - MINOR: mux: Make sure every string is woken up after the handshake. 
- MEDIUM: cache: store sha1 for hashing the cache key - MINOR: http: implement the "http-request reject" rule - MINOR: h2: send RST_STREAM before GOAWAY on reject - MEDIUM: h2: don't gracefully close the connection anymore on Connection: close - MINOR: h2: make use of client-fin timeout after GOAWAY - MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2 - MINOR: ssl: Handle early data with BoringSSL - BUG/MEDIUM: stream: always release the stream-interface on abort - BUG/MEDIUM: cache: free ressources in chn_end_analyze - MINOR: cache: move the refcount decrease in the applet release - BUG/MINOR: listener: Allow multiple "process" options on "bind" lines - MINOR: config: Support a range to specify processes in "cpu-map" parameter - MINOR: config: Slightly change how parse_process_number works - MINOR: config: Export parse_process_number and use it wherever it's applicable - MINOR: standard: Add my_ffsl function to get the position of the bit set to one - MINOR: config: Add auto-increment feature for cpu-map - MINOR: config: Support partial ranges in cpu-map directive - MINOR:: config: Remove thread-map directive - MINOR: config: Add the threads support in cpu-map directive - MINOR: config: Add threads support for "process" option on "bind" lines - MEDIUM: listener: Bind listeners on a thread subset if specified - CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif - CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning - MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list" - CLEANUP: pools: rename all pool functions and pointers to remove this "2" - DOC: update the roadmap file with the latest changes merged in 1.8 - DOC: fix mangled version in peers protocol documentation - DOC: add initial peers protovol v2.0 documentation. 
- DOC: mention William as maintainer of the cache and master-worker - DOC: add Christopher and Emeric as maintainers of the threads - MINOR: cache: replace a fprint() by an abort() - MEDIUM: cache: max-age configuration keyword - DOC: explain HTTP2 timeout behavior - DOC: cache: configuration and management - MAJOR: mworker: exits the master on failure - BUG/MINOR: threads: don't drop "extern" on the lock in include files - MINOR: task: keep a pointer to the currently running task - MINOR: task: align the rq and wq locks - MINOR: fd: cache-align fdtab and fdcache locks - MINOR: buffers: cache-align buffer_wq_lock - CLEANUP: server: reorder some fields in struct server to save 40 bytes - CLEANUP: proxy: slightly reorder the struct proxy to reduce holes - CLEANUP: checks: remove 16 bytes of holes in struct check - CLEANUP: cache: more efficiently pack the struct cache - CLEANUP: fd: place the lock at the beginning of struct fdtab - CLEANUP: pools: align pools on a cache line - DOC: config: add a few bits about how to configure HTTP/2 - BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm() - BUILD: Makefile: reorder object files by size	2017-11-26 19:50:17 +01:00
Willy Tarreau	103e5663c8	BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm() pendconn_get_next_strm() is called from process_srv_queue() under the server lock, and calls stream_add_srv_conn() with this lock held, while the latter tries to take it again. This results in a deadlock when a server's maxconn is reached and haproxy is built with thread support.	2017-11-26 18:50:30 +01:00
Willy Tarreau	1ca1b70cf9	CLEANUP: pools: align pools on a cache line There are just a few pools, and they're stressed a lot, so it makes sense to dedicate them a cache line to avoid contention and to place the lock at the beginning.	2017-11-26 11:10:53 +01:00
Willy Tarreau	5809052ae1	CLEANUP: fd: place the lock at the beginning of struct fdtab The struct is not cache line aligned but at least, every time the lock will appear in the same cache line as the fd it will benefit from being accessed first. This improves the performance by about 2% on fd-intensive workloads with 4 threads.	2017-11-26 11:10:53 +01:00
Willy Tarreau	08eaa78739	CLEANUP: checks: remove 16 bytes of holes in struct check These ones were easily recovered by swapping two members.	2017-11-26 11:10:52 +01:00
Willy Tarreau	a51108443e	CLEANUP: proxy: slightly reorder the struct proxy to reduce holes 16 bytes were recovered from the struct doing minimal reordering.	2017-11-26 11:10:52 +01:00
Willy Tarreau	d7e33bbe2f	CLEANUP: server: reorder some fields in struct server to save 40 bytes In 1.8 many holes were introduced in struct server, so let's slightly reorder a few fields to plug most of them. This saves 40 bytes in the struct.	2017-11-26 11:10:52 +01:00
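The three struct cleanups above all rely on the same effect: members that are smaller than the alignment of their neighbours leave padding holes, and grouping them closes the holes. A minimal sketch with made-up structs (these are not the real struct check, struct proxy or struct server):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with holes: each 1-byte member is followed by
 * padding so that the next member (or the struct end) stays aligned.
 */
struct padded {
	uint8_t  state;    /* 1 byte, then padding before the pointer */
	void    *ctx;
	uint8_t  retries;  /* 1 byte, then tail padding */
};

/* Same members, reordered so the two small ones share one slot. */
struct packed {
	void    *ctx;
	uint8_t  state;
	uint8_t  retries;
};

/* Number of bytes recovered by the reordering on the current ABI. */
static size_t bytes_saved(void)
{
	return sizeof(struct padded) - sizeof(struct packed);
}
```

On a typical 64-bit ABI the first layout takes 24 bytes and the second 16. A tool such as pahole reports these holes directly on a compiled object.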
Willy Tarreau	8b94969054	MINOR: fd: cache-align fdtab and fdcache locks These locks are highly contended, let's not make them share cache lines.	2017-11-26 11:10:51 +01:00
Willy Tarreau	53bae85b8e	BUG/MINOR: threads: don't drop "extern" on the lock in include files Commit `9dcf9b6` ("MINOR: threads: Use __decl_hathreads to declare locks") accidentally lost a few "extern" in certain lock declarations, possibly causing certain entries to be declared at multiple places. Apparently it hasn't caused any harm though. The offending ones were : - fdtab_lock - fdcache_lock - poll_lock - buffer_wq_lock	2017-11-26 11:10:50 +01:00
William Lallemand	4cfede87a3	MAJOR: mworker: exits the master on failure This patch changes the behavior of the master during the exit of a worker. When a worker exits with an error code, for example in the case of a segfault, all workers are now killed and the master leaves. If you don't want this behavior you can use the option "master-worker no-exit-on-failure".	2017-11-24 22:48:27 +01:00
Willy Tarreau	bafbe01028	CLEANUP: pools: rename all pool functions and pointers to remove this "2" During the migration to the second version of the pools, the new functions and pool pointers were all called "pool_something2()" and "pool2_something". Now there's no more pool v1 code and it's a real pain to still have to deal with this. Let's clean this up now by removing the "2" everywhere, and by renaming the pool heads "pool_head_something".	2017-11-24 17:49:53 +01:00
Olivier Houchard	fbc74e8556	MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list" Rename the global variable "proxy" to "proxies_list". There's been multiple proxies in haproxy for quite some time, and "proxy" is a potential source of bugs, a number of functions have a "proxy" argument, and some code used "proxy" when it really meant "px" or "curproxy". It worked by pure luck, because it usually happened while parsing the config, and thus "proxy" pointed to the currently parsed proxy, but we should probably not rely on this. [wt: some of these are definitely fixes that are worth backporting]	2017-11-24 17:21:27 +01:00
Christopher Faulet	767a84bcc0	CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning	2017-11-24 17:19:12 +01:00
Christopher Faulet	c644fa9bf5	MINOR: config: Add threads support for "process" option on "bind" lines It is now possible on a "bind" line (or a "stats socket" line) to specify the thread set allowed to process listener's connections. For instance: # HTTPS connections will be processed by all threads but the first and HTTP # connection will be processed on the first thread. bind :80 process 1/1 bind :443 ssl crt mycert.pem process 1/2-	2017-11-24 15:38:50 +01:00
Christopher Faulet	cb6a94510d	MINOR: config: Add the threads support in cpu-map directive Now, it is possible to bind CPU at the thread level instead of the process level by defining a thread set in "cpu-map" directives. Thus, its format is now: cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>... where <process-set> and <thread-set> must follow the format: all | odd | even | number[-[number]] Having a process range and a thread range in same time with the "auto:" prefix is not supported. Only one range is supported, the other one must be a fixed number. But it is allowed when there is no "auto:" prefix. Because it is possible to define a mapping for a process and another for a thread on this process, threads will be bound on the intersection of their mapping and the one of the process on which they are attached. If the intersection is null, no specific binding will be set for the threads.	2017-11-24 15:38:50 +01:00
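The intersection rule described in the commit above can be sketched with plain bitmasks, one bit per CPU. The function name and the convention that a return value of 0 means "apply no specific binding" are assumptions for illustration only.

```c
/* Returns the set of CPUs a thread should be bound to, given the CPU
 * set mapped to its process and the CPU set mapped to the thread
 * itself (bit N set means CPU N is allowed). A result of 0 means the
 * intersection is empty and no specific binding should be applied.
 */
static unsigned long thread_cpu_binding(unsigned long process_cpus,
                                        unsigned long thread_cpus)
{
	return process_cpus & thread_cpus;
}
```

For example, a process mapped to CPUs 0-3 with a thread mapped to CPUs 2-5 yields CPUs 2-3.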
Christopher Faulet	26028f6209	MINOR: config: Add auto-increment feature for cpu-map The prefix "auto:" can be added before the process set to let HAProxy automatically bind a process to a CPU by incrementing process and CPU sets. To be valid, both sets must have the same size. No matter the declaration order of the CPU sets, it will be bound from the lower to the higher bound. Examples: # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1 # and so on. cpu-map auto:1-4 0-3 cpu-map auto:1-4 0-1 2-3 cpu-map auto:1-4 3 2 1 0 # bind each process to exactly one CPU using all/odd/even keyword cpu-map auto:all 0-63 cpu-map auto:even 0-31 cpu-map auto:odd 32-63 # invalid cpu-map because process and CPU sets have different sizes. cpu-map auto:1-4 0 # invalid cpu-map auto:1 0-3 # invalid	2017-11-24 15:38:49 +01:00
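A sketch of the "auto:" pairing rule, assuming both sets have already been parsed into bitmasks. This is not the configuration parser's code; `cpu_map_auto()`, `count_bits()` and `MAX_PROCS` are hypothetical names. Because the sets are masks, the declaration order of the CPUs is irrelevant and the pairing always goes from the lowest to the highest bit.

```c
#include <limits.h>

#define MAX_PROCS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Counts the bits set in <v>. */
static int count_bits(unsigned long v)
{
	int n = 0;

	while (v) {
		v &= v - 1;  /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* <procs> has bit (n-1) set for process n, <cpus> has bit c set for
 * CPU c. On success, cpu_of_proc[n-1] holds the CPU assigned to
 * process n (or -1 if the process is not in the set) and 0 is
 * returned. Returns -1 when the two sets have different sizes.
 */
static int cpu_map_auto(unsigned long procs, unsigned long cpus,
                        int cpu_of_proc[MAX_PROCS])
{
	int p, c = 0;

	if (count_bits(procs) != count_bits(cpus))
		return -1;

	for (p = 0; p < MAX_PROCS; p++) {
		cpu_of_proc[p] = -1;
		if (!(procs & (1UL << p)))
			continue;
		/* equal sizes guarantee that a CPU bit remains at or after c */
		while (!(cpus & (1UL << c)))
			c++;
		cpu_of_proc[p] = c++;
	}
	return 0;
}
```

With masks equivalent to `cpu-map auto:1-4 0-3`, process 1 gets CPU 0 and process 4 gets CPU 3, while the two invalid examples above are rejected by the size check.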
Christopher Faulet	ff8131861f	MINOR: standard: Add my_ffsl function to get the position of the bit set to one	2017-11-24 15:38:49 +01:00
Christopher Faulet	f1f0c5f591	MINOR: config: Export parse_process_number and use it wherever it's applicable This function is used when "bind-process" directive is parsed and when "process" parameter on a "bind" or a "stats socket" line is parsed.	2017-11-24 15:38:49 +01:00
William Lallemand	f528fff46b	MEDIUM: cache: store sha1 for hashing the cache key The cache was relying on the txn->uri for creating its key, which was a big problem when there was no log activated. This patch does a sha1 of the host + uri, and stores it in the txn. When an object is stored, the eb32node uses the first 32 bits of the hash as a key, and the whole hash is stored in the cache entry. During a lookup, the truncated hash is used, and when it matches an entry we check the real sha1.	2017-11-23 20:20:04 +01:00
Olivier Houchard	90084a133d	MINOR: ssl: Handle reading early data after writing better. It can happen that we want to read early data, write some, and then continue reading them. To do so, we can't reuse tmp_early_data to store the amount of data sent, so introduce a new member. If we read early data, then ssl_sock_to_buf() is now solely responsible for getting back to the handshake, to make sure we don't miss any early data.	2017-11-23 19:35:28 +01:00
Willy Tarreau	158fa75811	MINOR: pools: implement DEBUG_UAF to detect use after free This code has been used successfully a few times in the past to detect that a pool was used after being freed. Its main goal is to allocate a full page for each object so that they are always released individually and unmapped from memory. This way if any part of the code references the object after it was freed and before it is reallocated, a segv occurs at the exact offending location. It does a few extra things such as writing to the memory area before freeing to detect double-frees and free of read-only areas, and placing the data at the end of the page instead of the beginning so that out of bounds accesses are easier to spot. The amount of memory used with this is huge (about 10 times the regular usage) but it can be useful sometimes.	2017-11-22 19:43:57 +01:00
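The two-step lookup described in the commit above (truncated key first, full digest second) can be sketched as below. The digest is taken as an input rather than computed, and a linear scan stands in for the eb32 tree used by the real cache. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LEN 20  /* size of a SHA-1 digest in bytes */

/* Hypothetical, reduced cache entry. */
struct cache_entry {
	uint32_t key;                  /* first 32 bits of the digest */
	unsigned char hash[HASH_LEN];  /* full digest, checked on a key match */
};

/* Builds the 32-bit index key from the first four digest bytes. */
static uint32_t hash_to_key(const unsigned char hash[HASH_LEN])
{
	uint32_t key;

	memcpy(&key, hash, sizeof(key));
	return key;
}

/* Looks <hash> up: the cheap truncated key filters the candidates, and
 * the full digest confirms the match, so two objects that collide on
 * the first 32 bits are never confused.
 */
static const struct cache_entry *cache_lookup(const struct cache_entry *entries,
                                              size_t count,
                                              const unsigned char hash[HASH_LEN])
{
	uint32_t key = hash_to_key(hash);
	size_t i;

	for (i = 0; i < count; i++) {
		if (entries[i].key != key)
			continue;
		if (memcmp(entries[i].hash, hash, HASH_LEN) == 0)
			return &entries[i];
	}
	return NULL;
}
```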
Willy Tarreau	f13322ede1	MINOR: pools: prepare functions to override malloc/free in pools This will be useful to add some debugging capabilities. For now it changes nothing.	2017-11-22 19:27:44 +01:00
William Lallemand	111bfef33c	MEDIUM: shctx: use unsigned int for len and block_count Allows bigger objects to be cached in the shctx. The first implementation only stored small SSL sessions, but we want to store bigger HTTP responses.	2017-11-21 21:35:04 +01:00
Willy Tarreau	59a10fb53d	MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers The current H2 to H1 protocol conversion presents some issues which will require to perform some processing on certain headers before writing them so it's not possible to convert HPACK to H1 on the fly. This commit modifies the headers decoding so that it now works in two phases : hpack_decode_headers() only decodes the HPACK stream in the HEADERS frame and puts the result into a list. Headers which require storage (huffman-compressed or from the dynamic table) are stored in a chunk allocated by the H2 demuxer. Then once the headers are properly decoded into this list, h2_make_h1_request() is called with this list to produce the HTTP/1.1 request into the destination buffer. The list necessarily enforces a limit. Here we use 2*MAX_HTTP_HDR, which means that we can have as many individual cookies as we have regular headers if a client decides to break their cookies into multiple values. This seems reasonable and will allow the H1 parser to decide whether it's too much or not. Thus the output stream is not produced on the fly anymore and this will permit to deal with certain corner cases like repairing the Cookie header (which for now is not done). In order to limit header duplication and parsing, the known pseudo headers continue to be passed by their index : the name element in the list then has a NULL pointer and the value is the pseudo header's index. Given that these ones represent about half of the incoming requests and need to be found quickly, it maintains an acceptable level of performance. The code was significantly reduced by doing this because the original code had to deal with HPACK and H1 combinations (eg: index vs not indexed, etc) and now the HPACK decoding is totally focused on the decompression, and the H1 encoding doesn't have to deal with the issue of wrapping input for example. One bug was addressed here (though it couldn't happen at the moment).
The H2 demuxer used to detect a failure to write the request into the H1 buffer and would then detect if the output buffer wraps, realign it and try again. The problem by doing so was that the HPACK context was already modified and not rewindable. Thus the size check is now performed first and a failure is reported if it doesn't fit.	2017-11-21 21:13:36 +01:00
Willy Tarreau	f24ea8e45e	MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list The current H2 to H1 protocol conversion presents some issues which will require to perform some processing on certain headers before writing them so it's not possible to convert HPACK to H1 on the fly. Here we introduce a function which performs half of what hpack_decode_header() used to do, which is to take a list of headers on input and emit the corresponding request in HTTP/1.1 format. The code is the same and functions were renamed to be prefixed with "h2" instead of "hpack", though it ends up being simpler as the various HPACK-specific cases could be fused into a single one (ie: add header). Moving this part here makes a lot of sense as now this code is specific to what is documented in HTTP/2 RFC 7540 and will be able to deal with special cases related to H2 to H1 conversion enumerated in section 8.1. Various error codes which were previously assigned to HPACK were never used (aside being negative) and were all replaced by -1 with a comment indicating what error was detected. The code could be further factored thanks to this but this commit focuses on compatibility first. This code is not yet used but builds fine.	2017-11-21 21:13:33 +01:00
Willy Tarreau	dbd25fc75a	BUILD: compiler: add a new type modifier __maybe_unused While gcc only emits warnings about unused static functions, Clang also emits such a warning when the functions are inlined. This is a bit annoying at certain places where functions are provided to manipulate multiple data types and are not yet used. Let's have a type modifier "__maybe_unused" which sets the "unused" attribute like the Linux kernel does. It's elegant as it allows the code author to indicate that it knows that this element might be unused. It works on variables as well, which is convenient to remove ifdefs around local variables in certain functions, but doesn't work on labels.	2017-11-20 21:27:27 +01:00
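A minimal sketch of such a modifier, assuming a compiler that supports the GCC-style `unused` attribute (GCC and Clang do). The macro name comes from the commit message above. The two functions are made-up examples.

```c
/* Marks a function or variable as possibly unused so the compiler does
 * not warn about it. Guarded in case the toolchain already defines it.
 */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/* An inline helper that no caller may use yet: without the modifier,
 * Clang would warn about it.
 */
static inline __maybe_unused int clamp_to_byte(int v)
{
	if (v < 0)
		return 0;
	if (v > 255)
		return 255;
	return v;
}

/* The modifier also works on a local variable, which avoids wrapping
 * its declaration in #ifdef when only debug code reads it.
 */
int scale(int v)
{
	__maybe_unused int original = v;

	return v * 2;
}
```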
Willy Tarreau	2532bd2f81	BUILD: threads/plock: fix a build issue on Clang without optimization [ plock commit 4c53fd3a0b2b1892817cebd0db012a52f4087850 ] Pieter Baauw reported a build issue affecting haproxy after plock was included. It happens that expressions of the form : if ((const) ? (expr1) : (expr2)) do_something() always produce code for both expr1 and expr2 on Clang when building without optimization. The resulting asm code is even funny, basically doing : mov reg, 1 cmp reg, 1 ... This causes our sizeof() tests to fail to build because we purposely dereference a fake function that reports the location and nature of the inconsistency, but this fake function appears in the object code despite all conditions being there to avoid it. However the compiler is still smart enough to optimize away code doing if (const) do_something() So we simply repeat the condition before do_something(), and the dummy function is not referenced anymore unless really required.	2017-11-20 21:06:35 +01:00
Willy Tarreau	b5f271555e	MINOR: threads/build: atomic: replace the few inlines with macros [ plock commit 61e255286ae32e83e1a3174dd7c49eda99880a8b] There are a few inlines such as pl_barrier() and pl_cpu_relax() which are used a lot. Unfortunately, while building test code at -O0, inlining is disabled and these ones are called a lot and show up a lot in any profile, are traced into when single-stepping with a debugger, etc, thus they are polluting the landscape. Since they're single-asm statements, there is no reason for not turning them into macros. The result becomes fairly visible here at -O0 : $ size latency.inline latency.macro text data bss dec hex filename 11431 692 656 12779 31eb treelock.inline 10967 692 656 12315 301b treelock.macro And it was verified that regularly optimized code remains strictly identical.	2017-11-20 21:06:35 +01:00
Willy Tarreau	d0d8ba59d3	MINOR: threads/atomic: implement pl_bts() on non-x86 [ plock commit da17ba320aad3a8faf08e36fca604de9cad21fdd ] This one was missing, it can be done using sync_fetch_and_or().	2017-11-20 21:06:03 +01:00
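A sketch of a bit-test-and-set built on the GCC/Clang builtin `__sync_fetch_and_or()`, which is the approach the commit above names. The function name and the choice to return 0 or 1 are assumptions. This is not the plock implementation of pl_bts().

```c
/* Atomically sets bit <bit> in *ptr and returns the previous value of
 * that bit (0 or 1). <bit> must be smaller than the number of bits in
 * an unsigned long.
 */
static inline int bit_test_and_set(unsigned long *ptr, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	/* the builtin returns the value held before the OR was applied */
	return (__sync_fetch_and_or(ptr, mask) & mask) != 0;
}
```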
Willy Tarreau	01b8398b9e	MINOR: threads/atomic: implement pl_mb() in asm on x86 [ plock commit 44081ea493dd78dab48076980e881748e9b33db5 ] Older compilers (eg: gcc 3.4) don't provide __sync_synchronize() so let's do it by hand on this platform.	2017-11-20 20:45:47 +01:00
Willy Tarreau	f7ba77eb80	MINOR: threads/plock: rename local variables in macros to avoid conflicts [ plock commit b155d5c762fb9a9793911881f80e61faa6b0e889 ] Local variables "l", "i" and "ret" were renamed "__pl_l", "__pl_i" and "__pl_r" respectively, to limit the risk of conflicts with existing variables in application code.	2017-11-20 20:45:43 +01:00
Willy Tarreau	98409e34ca	MINOR: threads/atomic: rename local variables in macros to avoid conflicts [ plock commit bfac5887ebabb8ef753b0351f162265767eb219b ] Local variable "t" was renamed "__pl_t" to limit the risk of conflicts with existing variables in application code.	2017-11-20 20:45:38 +01:00
William Lallemand	71bd11a1f3	MEDIUM: cache: enable the HTTP analysers Enable the same analysers as the stats applet. Allows keepalive and termination flags to work.	2017-11-20 19:22:27 +01:00
William Lallemand	44e259c0b7	CLEANUP: cache: remove unused struct Remove unused structure which remain from old dev.	2017-11-20 19:22:27 +01:00
Tim Duesterhus	d6942c8297	MEDIUM: mworker: Add systemd `Type=notify` support This patch adds support for `Type=notify` to the systemd unit. Supporting `Type=notify` improves both starting as well as reloading of the unit, because systemd will be told when the action has completed. See this quote from `systemd.service(5)`: > Note however that reloading a daemon by sending a signal (as with the > example line above) is usually not a good choice, because this is an > asynchronous operation and hence not suitable to order reloads of > multiple services against each other. It is strongly recommended to > set ExecReload= to a command that not only triggers a configuration > reload of the daemon, but also synchronously waits for it to complete. By making systemd aware of a reload in progress it is able to wait until the reload actually succeeded. This patch introduces both a new `USE_SYSTEMD` build option which controls including the sd-daemon library as well as a `-Ws` runtime option which runs haproxy in master-worker mode with systemd support. When haproxy is running in master-worker mode with systemd support it will send status messages to systemd using `sd_notify(3)` in the following cases: - The master process forked off the worker processes (READY=1) - The master process entered the `mworker_reload()` function (RELOADING=1) - The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1) Change the unit file to specify `Type=notify` and replace master-worker mode (`-W`) with master-worker mode with systemd support (`-Ws`). Future evolutions of this feature could include making use of the `STATUS` feature of `sd_notify()` to send information about the number of active connections to systemd. This would require bidirectional communication between the master and the workers and thus is left for future work.	2017-11-20 18:39:41 +01:00
Olivier Houchard	e6060c5d87	MINOR: SSL: Store the ASN1 representation of client sessions. Instead of storing the SSL_SESSION pointer directly in the struct server, store the ASN1 representation, otherwise, session resumption is broken with TLS 1.3, when multiple outgoing connections want to use the same session.	2017-11-16 19:03:32 +01:00
Christopher Faulet	595d7b72a6	MINOR: applets: Use a bitfield to track applets activity per-thread A bitfield has been added to know if there are runnable applets for a thread. When an applet is woken up, the bits corresponding to its thread_mask are set. When all active applets for a thread have been processed, the thread is removed from active ones by unsetting its tid_bit from the bitfield.	2017-11-16 11:19:46 +01:00
Christopher Faulet	3911ee85df	MINOR: tasks: Use a bitfield to track tasks activity per-thread A bitfield has been added to know if there are runnable tasks for a thread. When a task is woken up, the bits corresponding to its thread_mask are set. When all tasks for a thread have been evaluated without any wakeup, the thread is removed from active ones by unsetting its tid_bit from the bitfield.	2017-11-16 11:19:46 +01:00
William Lallemand	75ea0a06b0	BUG/MEDIUM: mworker: does not close inherited FD At the end of the master initialisation, a call to protocol_unbind_all() was made, in order to close all the FDs. Unfortunately, this function also closes the inherited FDs (fd@), so upon reload the master wasn't able to reload a configuration with those FDs. The create_listeners() function now stores a flag to specify if the fd was inherited or not. Replace the protocol_unbind_all() by mworker_cleanlisteners() + deinit_pollers()	2017-11-15 19:53:33 +01:00
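The per-thread activity bitfield used by the two commits above can be sketched as follows. The terms `thread_mask` and `tid_bit` come from the commit messages. The global variable and the three helpers are hypothetical, and the GCC/Clang `__sync` builtins stand in for whatever atomic macros the project actually uses.

```c
/* One bit per thread: set while that thread may have runnable tasks.
 * Hypothetical name for this sketch.
 */
static unsigned long active_tasks_mask;

/* Called when a task is woken up: flags every thread allowed to run it. */
static void task_mark_runnable(unsigned long thread_mask)
{
	__sync_fetch_and_or(&active_tasks_mask, thread_mask);
}

/* Called by a thread once it has evaluated all of its tasks without
 * any new wakeup: removes the thread from the active ones.
 */
static void thread_mark_idle(unsigned long tid_bit)
{
	__sync_fetch_and_and(&active_tasks_mask, ~tid_bit);
}

/* Tells a thread whether it still has something to run. */
static int thread_has_work(unsigned long tid_bit)
{
	return (active_tasks_mask & tid_bit) != 0;
}
```

A thread can thus decide with a single mask test whether it may go to sleep in the poller.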
Willy Tarreau	9c1e15d8cd	MINOR: tools: emphasize the node being worked on in the tree dump Now we can show in dotted red the node being removed or surrounded in red a node having been inserted, and add a description on the graph related to the operation in progress for example.	2017-11-15 19:43:05 +01:00
Willy Tarreau	ed3cda02ae	MINOR: tools: add a function to dump a scope-aware tree to a file It emits a dump in DOT format for graphing purposes during debugging sessions. It's convenient to dump the run queue.	2017-11-15 16:07:15 +01:00
Christopher Faulet	99bca65f53	BUG/MEDIUM: standard: itao_str/idx and quote_str/idx must be thread-local This bug has an impact on the stats applet and easily leads to a crash of HAProxy. This is specific to threads, no backport is needed.	2017-11-14 18:11:57 +01:00
Christopher Faulet	e9a896e09e	BUG/MINOR: threads: tid_bit must be a unsigned long This is specific to threads, no backport is needed.	2017-11-14 18:11:28 +01:00
Christopher Faulet	fa5c812a6b	BUG/MINOR: buffers: Fix b_alloc_margin to be "fonctionnaly" thread-safe b_alloc_margin is, strictly speaking, thread-safe. It will not crash HAProxy. But its contract is not respected anymore in a multithreaded environment. In this function, we need to be sure to have <margin> buffers available in the pool after the allocation. So to have this guarantee, we must lock the memory pool during the whole operation. This also means we must call internal and lockless memory functions (prefixed with '__'). For the record, this patch fixes a pernicious bug that happens after a soft reload where some streams can be blocked indefinitely, waiting for a buffer in the buffer_wq list. This happens because, during a soft reload, pool_gc2 is called, making some calls to b_alloc_fast fail. This is specific to threads, no backport is needed.	2017-11-13 11:42:48 +01:00
Christopher Faulet	9dcf9b6f03	MINOR: threads: Use __decl_hathreads to declare locks This macro should be used to declare variables or struct members depending on the USE_THREAD compile option. It avoids the encapsulation of such declarations between #ifdef/#endif. It is used to declare all lock variables.	2017-11-13 11:38:17 +01:00
Willy Tarreau	387bd4f69f	CLEANUP: global: introduce variable pid_bit to avoid shifts with relative_pid At a number of places, bitmasks are used for process affinity and to map listeners to processes. Every time 1UL<<(relative_pid-1) is used. Let's create a "pid_bit" variable corresponding to this value to clean this up.	2017-11-10 19:08:14 +01:00
Willy Tarreau	28b55c6fed	CLEANUP: mux: remove the unused "release()" function In commit `53a4766` ("MEDIUM: connection: start to introduce a mux layer between xprt and data") we introduced a release() function which ends up never being used. Let's get rid of it now.	2017-11-10 16:43:05 +01:00
Willy Tarreau	aa39860aef	MINOR: tools: don't use unlikely() in hex2i() This small inline function causes some pain to the compiler when used inside other functions due to its use of the unlikely() hint for non-digits. It causes the letters to be processed far away in the calling function and makes the code less efficient. Removing these unlikely() hints has sped up chunk size parsing by around 5%.	2017-11-10 11:19:54 +01:00
Willy Tarreau	b15e3fefc9	BUG/MEDIUM: h1: ensure the chunk size parser can deal with full buffers The HTTP/1 code always has the reserve left available so the buffer is never full there. But with HTTP/2 we have to deal with full buffers, and it happens that the chunk size parser cannot tell the difference between a full buffer and an empty one since it compares the start and the stop pointer. Let's change this to instead deal with the number of bytes left to process. As a side effect, this code ends up being about 10% faster than the previous one, even on HTTP/1.	2017-11-10 11:17:08 +01:00
Christopher Faulet	c5a9d5bf23	BUG/MEDIUM: stream-int: Don't lose write's notifs when a stream is woken up When write activity is reported on a channel, it is important to keep this information for the stream because it takes part in the analyzers' triggering. When some data are written, the flag CF_WRITE_PARTIAL is set. It participates in the task's timeout updates and in the stream's waking. It is also used in the CF_MASK_ANALYSER mask to trigger channel analyzers. In the past, it was cleared by process_stream. Because of a bug (fixed in commit `95fad5ba4` ["BUG/MAJOR: stream-int: don't re-arm recv if send fails"]), it is now cleared before each send and in stream_int_notify. So it is possible to lose this information when process_stream is called, preventing analyzers from being called, and possibly leading to a stalled stream. Today, this happens in HTTP2 when you call the stats page or when you use the cache filter. In fact, this happens when the response is sent by an applet. In HTTP1, everything seems to work as expected. To fix the problem, we need to distinguish between the write activity reported to lower layers and the one reported to the stream. So the flag CF_WRITE_EVENT has been added to notify the stream of the write activity on a channel. It is set when a send succeeded and reset by process_stream. It is also used in CF_MASK_ANALYSER. Finally, it is checked in stream_int_notify to wake up a stream and in channel_check_timeouts. This bug is probably present in 1.7 but it seems to have no effect. So for now, there is no need to backport it.	2017-11-09 15:16:05 +01:00
Willy Tarreau	1b4cf9b754	BUG/MINOR: h1: make the HTTP/1 status code parser check for digits The H1 parser used by the H2 gateway was a bit lax and could validate non-numbers in the status code. Since it computes the code on the fly it's problematic, as "30:" is read as status code 310. Let's properly check that it's a number now. No backport needed.	2017-11-09 11:15:45 +01:00

2947 Commits