Commit Graph

7942 Commits

Author SHA1 Message Date
Willy Tarreau
81036f2738 MINOR: time: move the cpu, mono, and idle time to thread_info
These ones are useful across all threads and would be better placed
in struct thread_info than thread-local. There are very few users.
2019-05-20 21:14:14 +02:00
Willy Tarreau
8323a375bc MINOR: threads: add a thread-local thread_info pointer "ti"
Since we're likely to access this thread_info struct more frequently in
the future, let's reserve the thread-local symbol to access it directly
and avoid always having to combine thread_info and tid. This pointer is
set when tid is set.
2019-05-20 21:14:12 +02:00
Willy Tarreau
624dcbf41e MINOR: threads: always place the clockid in the struct thread_info
It will make the internal API easier to deal with if we always have it.
2019-05-20 21:13:01 +02:00
Willy Tarreau
5a6e2245fa REORG: threads: move the struct thread_info from global.h to hathreads.h
It doesn't make sense to keep this struct thread_info in global.h; it
makes it difficult to access its contents from hathreads.h. Let's move
it to hathreads.h where it ought to have been created.
2019-05-20 20:00:25 +02:00
Willy Tarreau
a9f9fc9e5b MINOR: debug: make ha_panic() report threads starting at 1
Internally they start at zero but everywhere (config, dumps) we show
them starting at 1, so let's fix the confusion.
2019-05-20 17:46:14 +02:00
Willy Tarreau
3710105945 MINOR: tools: provide a may_access() function and make dump_hex() use it
It's a bit too easy to crash by accident when using dump_hex() on any
area. Let's have a function to check first whether the memory may safely
be read. This one abuses the stat() syscall by checking whether it
returns EFAULT, in which case it means we're not allowed to read from
there. In other situations it may return other codes, or even succeed
if the file pointed to by the area exists. It's important not to abuse
it though, and as such it's tested only once per output line.
2019-05-20 16:59:37 +02:00
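
A minimal, self-contained sketch of the stat()-based probe described above (function name and heuristics are illustrative only, not HAProxy's actual code):

    #include <errno.h>
    #include <stdio.h>
    #include <sys/stat.h>

    /* Sketch: return 1 if <ptr> looks readable, 0 otherwise. stat()
     * dereferences its path argument in the kernel: if the page is not
     * mapped it fails with EFAULT instead of crashing the caller. Any
     * other outcome (success, ENOENT, ENAMETOOLONG, ...) means the
     * memory could at least be read.
     */
    static int may_access(const void *ptr)
    {
            struct stat buf;

            if (stat((const char *)ptr, &buf) == -1 && errno == EFAULT)
                    return 0;
            return 1;
    }

    int main(void)
    {
            char area[] = "hello";

            printf("readable: %d\n", may_access(area));
            return 0;
    }
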
Willy Tarreau
6bdf3e9b11 MINOR: debug/cli: add some debugging commands for developers
When haproxy is built with DEBUG_DEV, the following commands are added
to the CLI :

  debug dev close <fd>        : close this file descriptor
  debug dev delay [ms]        : sleep this long
  debug dev exec  [cmd] ...   : show this command's output
  debug dev exit  [code]      : immediately exit the process
  debug dev hex   <addr> [len]: dump a memory area
  debug dev log   [msg] ...   : send this msg to global logs
  debug dev loop  [ms]        : loop this long
  debug dev panic             : immediately trigger a panic
  debug dev tkill [thr] [sig] : send signal to thread

These are essentially aimed at helping developers trigger certain
conditions and are expected to be complemented over time.
2019-05-20 16:59:30 +02:00
Willy Tarreau
56131ca58e MINOR: debug: implement ha_panic()
This function dumps all existing threads using the thread dump mechanism
then aborts. This will be used by the lockup detection and by debugging
tools.
2019-05-20 16:51:30 +02:00
Willy Tarreau
9fc5dcbd71 MINOR: tools: add dump_hex()
This is used to dump a memory area into a buffer for debugging purposes.
2019-05-20 16:51:30 +02:00
Willy Tarreau
da5a63f8f1 CLEANUP: stream: remove an obsolete debugging test
The test consisted in checking that there was always a timeout on a
stream's task and was only enabled when built in development mode,
but 1) it is never tested and 2) if it had been tested it would have
been noticed that it triggers a bit too easily on the CLI. Let's get
rid of this old one.
2019-05-20 16:19:40 +02:00
Willy Tarreau
91e6df01fa MINOR: threads: add each thread's clockid into the global thread_info
This is the per-thread CPU runtime clock, it will be used to measure
the CPU usage of each thread and by the lockup detection mechanism. It
must only be retrieved at the beginning of run_thread_poll_loop() since
the thread must already have been started for this. But it must be done
before performing any per-thread initcall so that all thread init
functions have access to the clock ID.

Note that it could make sense to always have this clockid available even
in non-threaded situations and place the process' clock there instead.
But it would add portability issues which are currently easy to deal
with by disabling threads so it may not be worth it for now.
2019-05-20 11:42:25 +02:00
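
The underlying mechanism is the standard POSIX per-thread CPU clock; a hedged, self-contained sketch (thread_info fields and HAProxy's initcall ordering are not shown):

    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    /* Sketch: fetch the calling thread's CPU-time clock at the top of its
     * polling loop, then read it later to measure CPU usage.
     */
    static void *run_thread_poll_loop(void *arg)
    {
            clockid_t clock_id;
            struct timespec ts;

            /* must run once the thread exists, before any per-thread init
             * work that wants to read the clock
             */
            pthread_getcpuclockid(pthread_self(), &clock_id);

            /* ... per-thread initcalls and regular work would run here ... */

            clock_gettime(clock_id, &ts);
            printf("thread CPU time: %ld.%09ld s\n", (long)ts.tv_sec, ts.tv_nsec);
            return NULL;
    }

    int main(void)
    {
            pthread_t t;

            pthread_create(&t, NULL, run_thread_poll_loop, NULL);
            pthread_join(t, NULL);
            return 0;
    }

(build with -pthread)
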
Willy Tarreau
522cfbc1ea MINOR: init/threads: make the global threads an array of structs
This way we'll be able to store more per-thread information than just
the pthread pointer. The storage became an array of struct instead of
an allocated array since it's very small (typically 512 bytes) and not
worth the hassle of dealing with memory allocation on this. The array
was also renamed thread_info to make its intended usage more explicit.
2019-05-20 11:37:57 +02:00
Willy Tarreau
64a47b943c CLEANUP: memory: make the fault injection code use the OTHER_LOCK label
The mem_should_fail() function sets a lock while it's building its
messages, and when this was done there was no relevant label available
hence the confusing use of START_LOCK. Now OTHER_LOCK is available for
such use cases, so let's switch to this one instead as START_LOCK is
going to disappear.
2019-05-20 11:26:12 +02:00
Willy Tarreau
619a95f5ad MEDIUM: init/mworker: make the pipe register function a regular initcall
Now that we have the guarantee that init calls happen before any other
thread starts, we no longer need the workaround installed by commit
1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on startup") and we
can instead rely on a regular per-thread initcall for this function. It
will only be performed on worker thread #0, the other ones and the master
have nothing to do, just like in the original code that was only moved
to the function.
2019-05-20 11:26:12 +02:00
Willy Tarreau
3078e9f8e2 MINOR: threads/init: synchronize the threads startup
It's a bit dangerous to let threads initialize at different speeds on
startup. Some are still in their init functions while others are already
running. It was even subject to some race condition bugs like the one
fixed by commit 1605c7ae6 ("BUG/MEDIUM: threads/mworker: fix a race on
startup").

Here in order to secure all this, we take a very simplistic approach
consisting in using half of the rendez-vous point, which is made
exactly for this purpose : we first initialize the mask of the threads
requesting a rendez-vous to the mask of all threads, and we simply call
thread_release() once the init is complete. This guarantees that no
thread will go further than the initialization code during this time.

This could even safely be backported if any other issue related to an
init race was discovered in a stable release.
2019-05-20 11:26:12 +02:00
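
The effect can be illustrated with a generic equivalent using a POSIX barrier (purely illustrative; HAProxy reuses its own rendez-vous primitives rather than pthread barriers):

    #include <pthread.h>
    #include <stdio.h>

    #define NB_THREADS 4

    static pthread_barrier_t init_barrier;

    static void *run_thread_poll_loop(void *arg)
    {
            long tid = (long)arg;

            /* per-thread init code runs here, possibly at different speeds */
            printf("thread %ld initialized\n", tid + 1);

            /* nobody goes further until all threads have finished their init */
            pthread_barrier_wait(&init_barrier);

            /* ... the regular polling loop would start here ... */
            return NULL;
    }

    int main(void)
    {
            pthread_t th[NB_THREADS];

            pthread_barrier_init(&init_barrier, NULL, NB_THREADS);
            for (long i = 0; i < NB_THREADS; i++)
                    pthread_create(&th[i], NULL, run_thread_poll_loop, (void *)i);
            for (int i = 0; i < NB_THREADS; i++)
                    pthread_join(th[i], NULL);
            pthread_barrier_destroy(&init_barrier);
            return 0;
    }
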
William Lallemand
7b302d8dd5 MINOR: init: setenv HAPROXY_CFGFILES
Set the HAPROXY_CFGFILES environment variable which contains the list of
configuration files used to start haproxy, separated by semicolons.
2019-05-20 11:21:00 +02:00
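
A hedged sketch of how such a variable can be built and exported (names are illustrative, not the haproxy code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Sketch: join the configuration file names with ';' and export them. */
    static void export_cfgfiles(char **files, int count)
    {
            size_t len = 1;
            char *value;

            for (int i = 0; i < count; i++)
                    len += strlen(files[i]) + 1;   /* name + ';' (or '\0') */

            value = calloc(1, len);
            if (!value)
                    return;
            for (int i = 0; i < count; i++) {
                    if (i)
                            strcat(value, ";");
                    strcat(value, files[i]);
            }
            setenv("HAPROXY_CFGFILES", value, 1);
            free(value);
    }

    int main(void)
    {
            char *files[] = { "/etc/haproxy/haproxy.cfg", "/etc/haproxy/extra.cfg" };

            export_cfgfiles(files, 2);
            printf("HAPROXY_CFGFILES=%s\n", getenv("HAPROXY_CFGFILES"));
            return 0;
    }
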
Willy Tarreau
c7091d89ae MEDIUM: debug/threads: implement an advanced thread dump system
The current "show threads" command was too limited as it was not possible
to dump other threads' detailed states (e.g. their tasks). This patch
goes further by using thread signals so that each thread can dump its
own state in turn into a shared buffer provided by the caller. Threads
are synchronized using a mechanism very similar to the rendez-vous point
and using this method, each thread can safely dump any of its contents
and the caller can finally report the aggregated ones from the buffer.

It is important to keep in mind that the list of signal-safe functions
is limited, so we take care of only using chunk_printf() to write to a
pre-allocated buffer.

This mechanism is enabled by USE_THREAD_DUMP and is enabled by default
on Linux 2.6.28+. On other platforms it falls back to the previous
solution using the loop and the less precise dump.
2019-05-17 17:16:20 +02:00
Willy Tarreau
0ad46fa6f5 MINOR: stream: detach the stream from its own task on stream_free()
This makes sure that the stream is not visible from its own task just
before starting to free some of its components. This way we have the
guarantee that a stream found in a task list is totally valid and can
safely be dereferenced.
2019-05-17 17:16:20 +02:00
Willy Tarreau
01f3489752 MINOR: task: put barriers after each write to curr_task
This one may be watched by signal handlers; we don't want the compiler
to optimize its assignment away at the end of the loop and leave some
wandering pointers there.
2019-05-17 17:16:20 +02:00
Willy Tarreau
38171daf21 MINOR: thread: implement ha_thread_relax()
At some places we're using a painful ifdef to decide whether to use
sched_yield() or pl_cpu_relax() to relax in loops; this is hardly
exportable. Let's move this to ha_thread_relax() instead and use
this one only.
2019-05-17 17:16:20 +02:00
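
A simplified, self-contained sketch of such a wrapper (the USE_THREAD guard and the pl_cpu_relax() replacement below are illustrative, not HAProxy's exact code):

    #include <sched.h>

    /* Sketch: choose the relax primitive in one place instead of
     * sprinkling #ifdefs at every spin loop.
     */
    static inline void ha_thread_relax(void)
    {
    #ifdef USE_THREAD
            sched_yield();                              /* let other threads run */
    #elif defined(__x86_64__) || defined(__i386__)
            __asm__ volatile("rep; nop" ::: "memory");  /* PAUSE instruction */
    #else
            __asm__ volatile("" ::: "memory");          /* compiler barrier only */
    #endif
    }

    int main(void)
    {
            for (int i = 0; i < 3; i++)
                    ha_thread_relax();
            return 0;
    }
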
Willy Tarreau
20db9115dc BUG/MINOR: debug: don't check the call date on tasklets
Tasklets don't have a call date, so when a tasklet is cast into a task
and is present at the end of a page we run a risk of dereferencing
unmapped memory when dumping them in ha_task_dump(). This commit
simplifies the test and uses two distinct calls for tasklets and tasks.
No backport is needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
5cf64dd1bd MINOR: debug: make ha_thread_dump() and ha_task_dump() take a buffer
Instead of having them dump into the trash and initialize it, let's have
the caller initialize a buffer and pass it. This will be convenient to
dump multiple threads at once into a single buffer.
2019-05-17 17:16:20 +02:00
Willy Tarreau
14a1ab75d0 BUG/MINOR: debug: make ha_task_dump() actually dump the requested task
It used to only dump the current task, which isn't different for now
but the purpose clearly is to dump the requested task. No backport is
needed.
2019-05-17 17:16:20 +02:00
Willy Tarreau
231ec395c1 BUG/MINOR: debug: make ha_task_dump() always check the task before dumping it
For now it cannot happen since we're calling it from a task but it will
break with signals. No backport is needed.
2019-05-17 17:16:20 +02:00
Olivier Houchard
6db1699f77 BUG/MEDIUM: streams: Try to L7 retry before aborting the connection.
In htx_wait_for_response, in case of error, attempt an L7 retry before
aborting the connection if the TX_NOT_FIRST flag is set.
If we don't do that, we would never attempt L7 retries after the first
request, nor at all with HTTP/2, as with HTTP/2 that flag is always set.
2019-05-17 15:49:21 +02:00
Olivier Houchard
ce1a0292bf BUG/MEDIUM: streams: Don't use CF_EOI to decide if the request is complete.
In si_cs_send(), don't check CF_EOI on the request channel to decide if the
request is complete and if we should save the buffer to eventually attempt
L7 retries. The flag may not be set yet, and it may too be set to early,
before we're done modifying the buffer. Instead, get the msg, and make sure
its state is HTTP_MSG_DONE.
That way we will store the request buffer when sending it even in H2.
2019-05-17 15:49:21 +02:00
Willy Tarreau
4e2b646d60 MINOR: cli/debug: add a thread dump function
The new function ha_thread_dump() will dump debugging info about all known
threads. The dump of the current thread will contain a bit more info. The long-term goal
is to make it possible to use it in signal handlers to improve the accuracy
of some dumps.

The function dumps its output into the trash, and since it was trivial to
add, a new "show threads" command appeared on the CLI.
2019-05-16 18:06:45 +02:00
Willy Tarreau
58d9621fc8 MINOR: cli/activity: show the dumping thread ID starting at 1
Both the config and gdb report thread IDs starting at 1, so better do the
same in "show activity" to limit confusion. We also display the full
permitted range.

This could be backported to 1.9 since it was present there.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
3506dae342 MEDIUM: Make 'resolution_pool_size' directive fatal
This directive never appeared in a stable release and instead was
introduced and deprecated within 1.8-dev. While it technically could
be outright removed we detect it and error out for good measure.
2019-05-16 18:02:03 +02:00
Tim Duesterhus
10c6c16cde MEDIUM: Make 'option forceclose' actually warn
It has been deprecated since 315b39c391 (1.9-dev),
but only in the docs.

Make it warn when being used and remove it from the docs.
2019-05-16 18:02:03 +02:00
Christopher Faulet
c1f40dd492 BUG/MINOR: http_fetch: Rely on the smp direction for "cookie()" and "hdr()"
A regression was introduced in the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get
the channel depending on the keyword used") on the samples "cookie()" and
"hdr()". Unlike other samples manipulating the HTTP headers, these ones depend
on the sample direction. To fix the bug, these samples use now their own
functions. Depending on the sample direction, they call smp_fetch_cookie() and
smp_fetch_hdr() with the appropriate keyword.

Thanks to Yves Lafon for reporting this issue.

This patch must be backported wherever the commit 89dc49935 was backported. For
now, 1.9 and 1.8.
2019-05-16 11:31:28 +02:00
Olivier Houchard
35d116885d MINOR: connections: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
00b8f7c60b MINOR: h1: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
Olivier Houchard
f8338151a3 MINOR: h2: Use BUG_ON() to enforce rules in subscribe/unsubscribe.
It is not legal to subscribe if we're already subscribed, or to unsubscribe
if we did not subscribe, so instead of trying to handle those cases, just
assert that it's ok using the new BUG_ON() macro.
2019-05-14 18:18:25 +02:00
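
The three commits above rely on the BUG_ON() macro; a simplified, self-contained sketch of the idea (the real macro reports more context before stopping):

    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch: abort loudly when an invariant is violated. */
    #define BUG_ON(cond) do {                                        \
            if (cond) {                                              \
                    fprintf(stderr, "BUG_ON(%s) at %s:%d\n",         \
                            #cond, __FILE__, __LINE__);              \
                    abort();                                         \
            }                                                        \
    } while (0)

    struct wait_event { void *ctx; };

    static void subscribe(struct wait_event **sub, struct wait_event *es)
    {
            /* subscribing twice to a different event is a caller bug */
            BUG_ON(*sub && *sub != es);
            *sub = es;
    }

    int main(void)
    {
            struct wait_event ev = { 0 };
            struct wait_event *sub = NULL;

            subscribe(&sub, &ev);   /* legal: not subscribed yet */
            return 0;
    }
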
Christopher Faulet
fa922f03a3 BUG/MEDIUM: mux-h2: Set EOI on the conn_stream during h2_rcv_buf()
Just like CS_FL_REOS previously, the CS_FL_EOI flag is abused as a proxy
for H2_SF_ES_RCVD. The problem is that this flag is consumed by the
application layer and is set immediately when an end of stream is met,
which is too early since the application must retrieve the rxbuf's
contents first. The effect is that some transfers are truncated (mostly
the first one of a connection in most tests).

The problem of mixing CS flags and H2S flags in the H2 mux is not new
(and is currently being addressed) but this specific one was emphasized
in commit 63768a63d ("MEDIUM: mux-h2: Don't mix the end of the message
with the end of stream") which was backported to 1.9. Note that other
flags, particularly CS_FL_REOS still need to be asynchronously reported,
though their impact seems more limited for now.

This patch makes sure that all internal uses of CS_FL_EOI are replaced
with a test on H2_SF_ES_RCVD (as there is a 1-to-1 equivalence) and that
CS_FL_EOI is only reported once the rxbuf is empty.

This should ideally be backported to 1.9 unless it causes too much
trouble due to the recent changes in this area, as 1.9 *seems* not
to be directly affected by this bug.
2019-05-14 15:47:57 +02:00
Willy Tarreau
99ad1b3e8c MINOR: mux-h2: stop relying on CS_FL_REOS
This flag was introduced early in 1.9 development (a3f7efe00) to report
the fact that the rxbuf that was present on the conn_stream was followed
by a shutr. Since then the rxbuf moved from the conn_stream to the h2s
(638b799b0) but the flag remained on the conn_stream. It is problematic
because some state transitions inside the mux depend on it, thus depend
on the CS, and as such have to test for its existence before proceeding.

This patch replaces the test on CS_FL_REOS with a test on the only
states that set this flag (H2_SS_CLOSED, H2_SS_HREM, H2_SS_ERROR).
The few places where the flag was set were removed (the flag is not
used by the data layer).
2019-05-14 15:47:57 +02:00
Willy Tarreau
4c688eb8d1 MINOR: mux-h2: add macros to check multiple stream states at once
At many places we need to test for several stream states at once, let's
have macros to make a bit mask from a state to ease this.
2019-05-14 15:47:57 +02:00
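
A hedged sketch of what such macros can look like (names mirror the H2 states mentioned above, values are illustrative):

    /* Sketch: turn a stream state into a bit so that several states can be
     * tested with a single mask comparison.
     */
    enum h2_ss { H2_SS_IDLE, H2_SS_OPEN, H2_SS_HREM, H2_SS_HLOC,
                 H2_SS_ERROR, H2_SS_CLOSED };

    #define H2_SS_MASK(state)  (1 << (state))
    #define H2_SS_EOS_BITS     (H2_SS_MASK(H2_SS_HREM) | \
                                H2_SS_MASK(H2_SS_ERROR) | \
                                H2_SS_MASK(H2_SS_CLOSED))

    int main(void)
    {
            enum h2_ss st = H2_SS_HREM;

            /* one test instead of three equality checks */
            return (H2_SS_MASK(st) & H2_SS_EOS_BITS) ? 0 : 1;
    }
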
Willy Tarreau
f8fe3d63f0 CLEANUP: mux-h2: don't test for impossible CS_FL_REOS conditions
This flag is currently set when an incoming close was received, which
results in the stream being in either H2_SS_HREM, H2_SS_CLOSED, or
H2_SS_ERROR states, so let's remove the test for the OPEN and HLOC
cases.
2019-05-14 15:47:57 +02:00
Willy Tarreau
3cf69fe6b2 BUG/MINOR: mux-h2: make sure to honor KILL_CONN in do_shut{r,w}
If the stream closes and quits while there's no room in the mux buffer
to send an RST frame, next time it is attempted it will not lead to
the connection being closed because the conn_stream will have been
released and the KILL_CONN flag with it as well.

This patch reserves a new H2_SF_KILL_CONN flag that is copied from
the CS when calling shut{r,w} so that the stream remains autonomous
on this even when the conn_stream leaves.

This should ideally be backported to 1.9 though it depends on several
previous patches that may or may not be suitable for backporting. The
severity is very low so there's no need to insist in case of trouble.
2019-05-14 15:47:57 +02:00
Willy Tarreau
aebbe5ef72 MINOR: mux-h2: make h2s_wake_one_stream() not depend on temporary CS flags
In h2s_wake_one_stream() we used to rely on the temporary flags used to
adjust the CS to determine the new h2s state. This really is not convenient
and creates far too many dependencies. This commit just moves the same
condition to the places where the temporary flags were set so that we
don't have to rely on these anymore. Whether these are relevant or not
was not the subject of the operation; what mattered was to make sure the
conditions to adjust the stream's state and the CS's flags remain the
same. Later it could be studied whether these conditions are correct or not.
2019-05-14 15:47:57 +02:00
Willy Tarreau
13b6c2e8b3 MINOR: mux-h2: make h2s_wake_one_stream() the only function to deal with CS
h2s_wake_one_stream() has access to all the required elements to update
the connstream's flags and figure the necessary state transitions, so
let's move the conditions there from h2_wake_some_streams().
2019-05-14 15:47:57 +02:00
Willy Tarreau
234829111f MINOR: mux-h2: make h2_wake_some_streams() not depend on the CS flags
It's problematic to have to pass some CS flags to this function because
that forces some h2s state transitions to update them just in time
while some of them are supposed to only be updated during I/O operations.

As a first step this patch transfers the decision to pass CS_FL_ERR_PENDING
from the caller to the leaf function h2s_wake_one_stream(). It is easy
since this is the only flag passed there and it depends on the position of
the stream relative to the last_sid if it was set.
2019-05-14 15:47:57 +02:00
Willy Tarreau
c3b1183f57 MINOR: mux-h2: remove useless test on stream ID vs last in wake function
h2_wake_some_streams() first looks up streams whose IDs are greater than
or equal to last+1, then checks if the id is lower than or equal to last,
which by definition will never match. Let's remove this confusing leftover
from ancient code.
2019-05-14 15:47:57 +02:00
William Lallemand
920fc8bbe4 BUG/MINOR: mworker: use after free when the PID not assigned
Commit 4528611 ("MEDIUM: mworker: store the leaving state of a process")
introduced a bug in the mworker_env_to_proc_list() function.

This is very unlikely to occur since the PID should always be assigned.
It can probably happen if the environment variable is corrupted.

No backport needed.
2019-05-14 11:28:16 +02:00
Willy Tarreau
f983d00a1c BUG/MINOR: mux-h2: make the do_shut{r,w} functions more robust against retries
These functions may fail to emit an RST or an empty DATA frame because
the mux is full or busy. Then they subscribe the h2s and try again.
However, when doing so, they will already have marked the error state on
the stream and will no longer go through the sequence that would attempt
to send the failed frame again or perform the close; instead they will
return a success.

It is important to only leave when the stream is already closed, but
to go through the whole sequence otherwise.

This patch should ideally be backported to 1.9 though it's possible that
the lack of the WANT_SHUT* flags makes this difficult or dangerous. The
severity is low enough to avoid this in case of trouble.
2019-05-14 11:13:06 +02:00
Frédéric Lécaille
90a10aeb65 BUG/MINOR: log: Wrong log format initialization.
This patch fixes an issue introduced by commit 0bad840b
("MINOR: log: Extract some code to send syslog messages") which led to
wrong log format variable initializations, at least for the "short" and
"raw" formats. That commit missed the cases where, even if they are passed
to __do_send_log(), the syslog tag and syslog pid strings must not be used
to format the log message with "short" and "raw". This is done by
initializing the "tag_max" and "pid_max" variables (the lengths of the tag
and pid strings) to 0, then updating them to the lengths of the tag and pid
strings passed to __do_send_log() depending on the log format, and in every
case using these lengths for the iovec used to send() the log.

This bug is specific to 2.0.
2019-05-14 11:12:00 +02:00
Willy Tarreau
8bdb5c9bb4 CLEANUP: connection: remove the handle field from the wait_event struct
It was only set and not consumed after the previous change. The reason
is that the task's context always contains the relevant information,
so there is no need for a second pointer.
2019-05-13 19:14:52 +02:00
Willy Tarreau
88bdba31fa CLEANUP: mux-h2: simply use h2s->flags instead of ret in h2_deferred_shut()
This one used to rely on the combined return statuses of the shutr/w
functions but now that we have the H2_SF_WANT_SHUT{R,W} flags we don't
need this anymore if we properly remove these flags after their operations
succeed. This is what this patch does.
2019-05-13 19:14:52 +02:00
Willy Tarreau
2c249ebc75 MINOR: mux-h2: add two H2S flags to report the need for shutr/shutw
Currently when a shutr/shutw fails due to lack of buffer space, we abuse
the wait_event's handle pointer to place up to two bits there in addition
to the original pointer. This pointer is not used for anything but this
and overall the intent becomes clearer with h2s flags than with these
two alien bits in the pointer, so let's use clean flags now.
2019-05-13 19:14:52 +02:00
Willy Tarreau
c234ae38f8 CLEANUP: mux-h2: use LIST_ADDED() instead of LIST_ISEMPTY() where relevant
Lots of places were using LIST_ISEMPTY() to detect if a stream belongs
to one of the send lists or to detect if a connection was already
waiting for a buffer or attached to an idle list. Since these ones are
not list heads but list elements, let's use LIST_ADDED() instead.
2019-05-13 19:14:52 +02:00
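
The distinction matters because LIST_ISEMPTY() is only meaningful on a list head, while LIST_ADDED() tells whether an element is currently linked somewhere. A minimal illustration with simplified circular-list macros (a sketch, not the project's list.h):

    struct list { struct list *n, *p; };

    #define LIST_INIT(l)     ((l)->n = (l)->p = (l))
    #define LIST_ISEMPTY(h)  ((h)->n == (h))          /* for list heads   */
    #define LIST_ADDED(el)   ((el)->n != (el))        /* for list members */
    #define LIST_ADDQ(h, el) do {                     \
            (el)->p = (h)->p; (el)->n = (h);          \
            (h)->p->n = (el); (h)->p = (el);          \
    } while (0)

    int main(void)
    {
            struct list head, elem;

            LIST_INIT(&head);
            LIST_INIT(&elem);
            LIST_ADDQ(&head, &elem);
            /* elem is linked: LIST_ADDED(&elem) is true, whereas testing
             * LIST_ISEMPTY(&elem) here would be misleading
             */
            return LIST_ADDED(&elem) ? 0 : 1;
    }
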
William Lallemand
7e1770b151 BUG/MAJOR: ssl: segfault upon a heartbeat request
7b5fd1e ("MEDIUM: connections: Move some fields from struct connection
to ssl_sock_ctx.") introduced a bug in the heartbleed mitigation code.

Indeed the code used conn->ctx instead of conn->xprt_ctx for the ssl
context, resulting in a null dereference.
2019-05-13 16:03:44 +02:00
Tim Duesterhus
a6cc7e872a BUG/MINOR: vars: Fix memory leak in vars_check_arg
vars_check_arg previously leaked the string containing the variable
name:

Consider this config:

    frontend fe1
        mode http
        bind :8080
        http-request set-header X %[var(txn.host)]

Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:

    ==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
    ==7795==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7795==    by 0x4AA2AD: my_strndup (standard.c:2227)
    ==7795==    by 0x51FCC5: make_arg_list (arg.c:146)
    ==7795==    by 0x4CF095: sample_parse_expr (sample.c:897)
    ==7795==    by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
    ==7795==    by 0x4BBB62: parse_logformat_string (log.c:688)
    ==7795==    by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
    ==7795==    by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
    ==7795==    by 0x480383: readcfgfile (cfgparse.c:2089)
    ==7795==    by 0x47A081: init (haproxy.c:1581)
    ==7795==    by 0x4049F2: main (haproxy.c:2591)

This leak can be detected even in HAProxy 1.6; this patch thus should
be backported to all supported branches.

[Cf: This fix was reverted because the chunk's area was unconditionally
     released, making haproxy crash when SPOE was enabled. Now the chunk is
     released by calling chunk_destroy(). This function takes the chunk's
     size into account to decide whether to release it or not. It is the
     callers' responsibility to set the chunk's size or not.]
2019-05-13 11:09:12 +02:00
Christopher Faulet
bf9bcb0a00 MINOR: spoe: Set the argument chunk size to 0 when SPOE variables are checked
When SPOE variables are registered during HAProxy startup, the argument used to
call the function vars_check_arg() uses the trash area. To be sure it is never
released by the callee function, the size of the internal chunk (arg.data.str)
is set to 0. It is important to do so because, to fix a memory leak, this buffer
must be released by the function vars_check_arg().

This patch must be backported to 1.9.
2019-05-13 11:07:00 +02:00
Willy Tarreau
ce9bbf523c BUG/MINOR: htx: make sure to always initialize the HTTP method when parsing a buffer
smp_prefetch_htx() is used when trying to access the contents of an HTTP
buffer from the TCP rulesets. The method was not properly set in this
case, which causes the sample fetch methods relying on the method
to randomly fail.

Thanks to Tim Düsterhus for reporting this issue (#97).

This fix must be backported to 1.9.
2019-05-13 10:10:44 +02:00
Tim Duesterhus
04bcaa1f9f BUG/MINOR: peers: Fix memory leak in cfg_parse_peers
cfg_parse_peers previously leaked the contents of the `kws` string,
as it was unconditionally filled using bind_dump_kws, but only used
(and freed) within the error case.

Move the dumping into the error case to:
1. Ensure that the registered keywords are actually printed at least once.
2. Ensure that the contents of kws are not leaked.

This move allows narrowing the scope of `kws`, so this is done as well.

This bug was found using valgrind:

    ==28217== 590 bytes in 1 blocks are definitely lost in loss record 51 of 71
    ==28217==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==28217==    by 0x4AD4C7: indent_msg (standard.c:3676)
    ==28217==    by 0x47E962: cfg_parse_peers (cfgparse.c:700)
    ==28217==    by 0x480273: readcfgfile (cfgparse.c:2147)
    ==28217==    by 0x479D51: init (haproxy.c:1585)
    ==28217==    by 0x404A02: main (haproxy.c:2585)

with this super simple configuration:

    peers peers
    	bind :8081
    	server A

This bug has existed since the introduction of cfg_parse_peers in commit
355b2033ec (which was introduced for HAProxy
2.0, but marked as backportable). It should be backported to all branches
containing that commit.
2019-05-13 10:10:01 +02:00
Willy Tarreau
f7b0523425 Revert "BUG/MINOR: vars: Fix memory leak in vars_check_arg"
This reverts commit 6ea00195c4.

As found by Christopher, this fix is not correct due to the way args
are built at various places. For example some config or runtime parsers
will place a substring pointer there, and calling free() on it will
immediately crash the program. A quick audit of the code shows that
there are not that many users, but the way it's done requires to
properly set the string as a regular chunk (size=0 if free not desired,
then call chunk_destroy() at release time), and given that the size is
currently set to len+1 in all parsers, a deeper audit needs to be done
to figure the impacts of not setting it anymore.

Thus for now better leave this harmless leak which impacts only the
config parsing time.

This fix must be backported to all branches containing the fix above.
2019-05-13 10:10:01 +02:00
Willy Tarreau
4087346dab BUG/MAJOR: mux-h2: do not add a stream twice to the send list
In this long thread, Maciej Zdeb reported that the H2 mux was still
going through endless loops from time to time :

  https://www.mail-archive.com/haproxy@formilux.org/msg33709.html

What happens is the following :
- in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
  stream from the send_list
- then in h2_shutr() and h2_shutw(), we check if the list is empty before
  subscribing the element, which is true after the case above
- then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
  in the send_list, thus LIST_ADDQ() adds it a second time.

This patch adds a check of list emptiness before performing the LIST_ADDQ()
when the flow control window opens. Maciej reported that it reliably fixed
the problem for him.

As later discussed with Olivier, this fixes the consequence of the issue
rather than its cause. The root cause is that a stream should never be in
the send_list with a blocking flag set and the various places that can lead
to this situation must be revisited. Thus another fix is expected soon for
this issue, which will require some observation. In the mean time this one
is easy enough to validate and to backport.

Many thanks to Maciej for testing several versions of the patch, each
time providing detailed traces which allowed us to nail the problem down.

This patch must be backported to 1.9.
2019-05-13 08:15:10 +02:00
Willy Tarreau
6a38b3297c BUILD: threads: fix again the __ha_cas_dw() definition
These low-level asm implementations of a double CAS only exist for
certain architectures (x86_64, armv7, armv8). When threads are not
used, they are not defined, but since they were called directly from
a few locations, they were causing build issues on certain platforms
with threads disabled. This was addressed in commit f4436e1 ("BUILD:
threads: Add __ha_cas_dw fallback for single threaded builds") by
making it fall back to HA_ATOMIC_CAS() when threads are not defined,
but this actually made the situation worse by breaking other cases.

This patch fixes this by creating a high-level macro HA_ATOMIC_DWCAS()
which is similar to HA_ATOMIC_CAS() except that it's intended to work
on a double word, and which relies on the asm implementations when threads
are in use, and uses its own open-coded implementation when threads are
not used. The 3 call places relying on __ha_cas_dw() were updated to
use HA_ATOMIC_DWCAS() instead.

This change was tested on i586, x86_64, armv7, armv8 with and without
threads with gcc 4.7, armv8 with gcc 5.4 with and without threads, as
well as i586 with gcc-3.4 without threads. It will need to be backported
to 1.9 along with the fix above to fix build on armv7 with threads
disabled.
2019-05-11 18:13:29 +02:00
Willy Tarreau
295d614de1 CLEANUP: ssl: move all BIO_* definitions to openssl-compat
The following macros are now defined for openssl < 1.1 so that we
can remove the code performing direct access to the structures :

  BIO_get_data(), BIO_set_data(), BIO_set_init(), BIO_meth_free(),
  BIO_meth_new(), BIO_meth_set_gets(), BIO_meth_set_puts(),
  BIO_meth_set_read(), BIO_meth_set_write(), BIO_meth_set_create(),
  BIO_meth_set_ctrl(), BIO_meth_set_destroy()
2019-05-11 17:39:08 +02:00
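
For the simple accessors, such compatibility shims typically look like the sketch below for OpenSSL < 1.1, where the BIO structure is still public (illustrative only; the BIO_meth_* constructors need a bit more code and are omitted):

    #include <openssl/bio.h>

    #if OPENSSL_VERSION_NUMBER < 0x10100000L
    /* on 1.0.x the fields can be touched directly */
    #define BIO_get_data(b)      ((b)->ptr)
    #define BIO_set_data(b, v)   ((b)->ptr = (v))
    #define BIO_set_init(b, v)   ((b)->init = (v))
    #endif
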
Willy Tarreau
11b167167e CLEANUP: ssl: remove ifdef around SSL_CTX_get_extra_chain_certs()
Instead define this one in openssl-compat.h when
SSL_CTRL_GET_EXTRA_CHAIN_CERTS is not defined (which was the current
condition used in the ifdef).
2019-05-11 17:38:21 +02:00
Willy Tarreau
366a6987a7 CLEANUP: ssl: move the SSL_OP_* and SSL_MODE_* definitions to openssl-compat
These ones were defined in the middle of ssl_sock.c; better move them
to the include file where they're easier to find.
2019-05-11 17:37:44 +02:00
Tim Duesterhus
6ea00195c4 BUG/MINOR: vars: Fix memory leak in vars_check_arg
vars_check_arg previously leaked the string containing the variable
name:

Consider this config:

    frontend fe1
        mode http
        bind :8080
        http-request set-header X %[var(txn.host)]

Starting HAProxy and immediately stopping it by sending a SIGINT makes
Valgrind report this leak:

    ==7795== 9 bytes in 1 blocks are definitely lost in loss record 15 of 71
    ==7795==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7795==    by 0x4AA2AD: my_strndup (standard.c:2227)
    ==7795==    by 0x51FCC5: make_arg_list (arg.c:146)
    ==7795==    by 0x4CF095: sample_parse_expr (sample.c:897)
    ==7795==    by 0x4BA7D7: add_sample_to_logformat_list (log.c:495)
    ==7795==    by 0x4BBB62: parse_logformat_string (log.c:688)
    ==7795==    by 0x4E70A9: parse_http_req_cond (http_rules.c:239)
    ==7795==    by 0x41CD7B: cfg_parse_listen (cfgparse-listen.c:1466)
    ==7795==    by 0x480383: readcfgfile (cfgparse.c:2089)
    ==7795==    by 0x47A081: init (haproxy.c:1581)
    ==7795==    by 0x4049F2: main (haproxy.c:2591)

This leak can be detected even in HAProxy 1.6, this patch thus should
be backported to all supported branches.
2019-05-11 06:00:50 +02:00
Olivier Houchard
ddf0e03585 MINOR: streams: Introduce a new retry-on keyword, all-retryable-errors.
Add a new retry-on keyword, "all-retryable-errors", that activates retry
for all errors that are considered retryable.
This currently activates retry for "conn-failure", "empty-response",
"junk-respones", "response-timeout", "0rtt-rejected", "500", "502", "503" and
"504".
2019-05-10 18:05:35 +02:00
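
A hedged configuration sketch using the new keyword (backend name and server address are examples):

    backend be_app
        mode http
        retries 3
        retry-on all-retryable-errors
        server s1 192.168.0.10:8080
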
Olivier Houchard
602bf7d2ea MEDIUM: streams: Add a new http action, disable-l7-retry.
Add a new action for http-request, disable-l7-retry, that can be used to
disable any attempt at retrying the request (see retry-on) if it fails for
any reason other than a connection failure.
This is useful for example to make sure POST requests aren't retried.
2019-05-10 17:49:09 +02:00
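
A hedged configuration sketch combining this action with retry-on (names and addresses are examples; METH_POST is a predefined ACL):

    frontend fe_app
        mode http
        bind :8080
        # never replay requests that may not be idempotent
        http-request disable-l7-retry if METH_POST
        default_backend be_app
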
Olivier Houchard
ad26d8d820 BUG/MEDIUM: streams: Make sure SI_FL_L7_RETRY is set before attempting a retry.
In a few cases, we'd just check if the backend is configured to do retries,
and not if it's still allowed on the stream_interface.
The SI_FL_L7_RETRY flag could have been removed because we failed to allocate
a buffer, or because the request was too big to fit in a single buffer,
so make sure it's there before attempting a retry.
2019-05-10 17:48:59 +02:00
Olivier Houchard
bfe2a83c24 BUG/MEDIUM: h2: Don't check send_wait to know if we're in the send_list.
When we have to stop sending due to the stream flow control, don't check
if send_wait is NULL to know if we're in the send_list, because at this
point it'll always be NULL, while we're probably in the list.
Use LIST_ISEMPTY(&h2s->list) instead.
Failing to do so means we might be added to the send_list when flow control
allows us to emit again, while we're already in it.
While I'm here, replace LIST_DEL + LIST_INIT by LIST_DEL_INIT.

This should be backported to 1.9.
2019-05-10 15:06:54 +02:00
Christopher Faulet
132f7b496c BUG/MEDIUM: http: Use pointer to the beginning of input to parse message headers
In the legacy HTTP, when the message headers are parsed, in http_msg_analyzer(),
we must use the beginning of input and not the head of the buffer. Most of the
time, it will be the same pointer because there is no outgoing data when a new
message is received. But when a 1xx informational response is parsed, it is
forwarded and the parsing restarts immediately. In this case, we have outgoing
data when the next response is parsed.

This patch must be backported to 1.9.
2019-05-10 11:47:00 +02:00
Christopher Faulet
7a3367cca0 BUG/MINOR: stream: Attach the read side on the response as soon as possible
A backend stream-interface attached to a reused connection remains in the state
SI_ST_CONN until some data are sent to validate the connection. But when the
url_param algorithm is used to balance connections, no data are sent while the
connection is not established. So it is a chicken and egg situation.

To solve the problem, if no error is detected and when the request channel is
waiting for the connect(), we mark the read side as attached on the response
channel as soon as possible and we wake the request channel up once. This
happens in 2 places. The first one is right after the connect(), when the
stream-interface is still in state SI_ST_CON, in the function
sess_update_st_con_tcp(). The second one is when an applet is used instead of a
real connection to a server, in the function sess_prepare_conn_req(). In fact,
it is done when the backend stream-interface is set to the state SI_ST_EST.

This patch must be backported to 1.9.
2019-05-10 11:47:00 +02:00
Willy Tarreau
c125cef6da CLEANUP: ssl: make inclusion of openssl headers safe
It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around
ssl headers; it even results in some of them appearing in a random order
and multiple times just to benefit from an existing ifdef block. Let's
make these headers safe for inclusion when USE_OPENSSL is not defined,
they now perform the test themselves and do nothing if USE_OPENSSL is
not defined. This allows to remove no less than 8 such ifdef blocks
and make include blocks more readable.
2019-05-10 09:58:43 +02:00
Willy Tarreau
8d164dc568 CLEANUP: ssl: never include openssl/*.h outside of openssl-compat.h anymore
Since we're providing a compatibility layer for multiple OpenSSL
implementations and their derivatives, it is important that no C file
directly includes openssl headers but only passes via openssl-compat
instead. As a bonus this also gets rid of redundant complex rules for
inclusion of certain files (engines etc).
2019-05-10 09:36:42 +02:00
Willy Tarreau
9356dacd22 REORG: ssl: move some OpenSSL defines from ssl_sock to openssl-compat
Some defines like OPENSSL_VERSION or X509_getm_notBefore() have nothing
to do in ssl_sock and must move to openssl-compat.h so that they are
consistently shared by the whole code. A warning in the code was added
against wild additions of macros there.
2019-05-10 09:31:06 +02:00
Willy Tarreau
5599456ee2 REORG: ssl: move openssl-compat from proto to common
This way we can include it much earlier to cover types/ as well.
2019-05-10 09:19:50 +02:00
Willy Tarreau
df17e0e1a7 BUILD: ssl: fix libressl build again after aes-gcm-enc
Enabling aes-gcm-enc in the last commit (MINOR: ssl: enable aes_gcm_dec
on LibreSSL) uncovered a wrong condition on the define of the
EVP_CTRL_AEAD_SET_IVLEN macro which I forgot to add when making the
commit, resulting in breaking the libressl build again. In case libressl
later defines this macro, the test will have to change to a version
range instead.
2019-05-10 09:19:07 +02:00
Willy Tarreau
86a394e44d MINOR: ssl: enable aes_gcm_dec on LibreSSL
This one requires OpenSSL 1.0.1 and above, and libressl was forked from
1.0.1g and is compatible (build-tested). No need to exclude it anymore
from using this converter.
2019-05-09 14:26:40 +02:00
Willy Tarreau
5db847ab65 CLEANUP: ssl: remove 57 occurrences of useless tests on LIBRESSL_VERSION_NUMBER
They were all checks to comply with the advertised openssl version. Now
that libressl doesn't pretend to be a more recent openssl anymore, we
can simply rely on the regular openssl version tests without having to
deal with exceptions for libressl.
2019-05-09 14:26:39 +02:00
Willy Tarreau
1d158ab12d BUILD: ssl: make libressl use its own version numbers
LibreSSL causes lots of build issues by pretending to be OpenSSL 2.0.0,
and it requires lots of care for each #if added to cover any specific
OpenSSL features.

This commit addresses the problem by making LibreSSL only advertise the
version it forked from (1.0.1g) and by starting to use tests based on
its real version to enable features instead of working by exclusion.
2019-05-09 14:25:47 +02:00
Willy Tarreau
9a1ab08160 CLEANUP: ssl-sock: use HA_OPENSSL_VERSION_NUMBER instead of OPENSSL_VERSION_NUMBER
Most tests on OPENSSL_VERSION_NUMBER have become complex and break all
the time because this number is fake for some derivatives like LibreSSL.
This patch creates a new macro, HA_OPENSSL_VERSION_NUMBER, which will
carry the real openssl version defining the compatibility level, and
this version will be adjusted depending on the variants.
2019-05-09 14:25:43 +02:00
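
A hedged sketch of what such a macro can look like (the exact value chosen for LibreSSL is an assumption based on the 1.0.1g fork point mentioned above):

    #include <openssl/opensslv.h>

    /* Sketch: expose the real compatibility level rather than the fake
     * OPENSSL_VERSION_NUMBER advertised by some derivatives.
     */
    #ifndef HA_OPENSSL_VERSION_NUMBER
    # if defined(LIBRESSL_VERSION_NUMBER)
    #  define HA_OPENSSL_VERSION_NUMBER 0x1000107fL   /* OpenSSL 1.0.1g */
    # else
    #  define HA_OPENSSL_VERSION_NUMBER OPENSSL_VERSION_NUMBER
    # endif
    #endif
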
Willy Tarreau
affd1b980a BUILD: ssl: fix again a libressl build failure after the openssl FD leak fix
As with every single OpenSSL fix, the LibreSSL build broke again, this time
after commit 56996dabe ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on
reload"). A definitive solution will have to be found quickly. For now,
let's exclude libressl from the version test.

This patch must be backported to 1.9 since the fix above was already
backported there.
2019-05-09 13:55:33 +02:00
Olivier Houchard
d9986ed51e BUG/MEDIUM: h2: Make sure we set send_list to NULL in h2_detach().
In h2_detach(), if we still have a send_wait pointer because we woke the
tasklet up but it hasn't run yet, explicitly set send_wait to NULL after
we removed the tasklet from the task list.
Failure to do so may lead to crashes if the h2s isn't immediately destroyed,
because we would consider there was still something to send.

This should be backported to 1.9.
2019-05-09 13:26:48 +02:00
Christopher Faulet
6f3cb1801b MINOR: htx: Remove support for unused OOB HTX blocks
This type of block was introduced in the early design of the HTX and it is not
used anymore. So, just remove it.

This patch may be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
6177509eb7 MINOR: htx: Don't try to append a trailer block with the previous one
In H1 and H2, one and only one trailer block is emitted during the HTTP
parsing. So it is useless to try to append this block to the previous one,
as is done for data blocks.

This patch may be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
bc5770b91e MINOR: htx: Split on DATA blocks only when blocks are moved to an HTX message
When htx_xfer_blks() is called to move blocks from an HTX message to another
one, most blocks must be transferred atomically. But some may be split if
there is not enough space to move the whole block. This was true for DATA and TLR
blocks. But it is a bad idea to split trailers. During HTTP parsing, only one
TLR block is emitted. It simplifies the processing of trailers to keep the block
untouched.

This patch must be backported to 1.9 because some fixes may depend on it.
2019-05-07 22:16:41 +02:00
Christopher Faulet
cc5060217e BUG/MINOR: htx: Never transfer more than expected in htx_xfer_blks()
When the maximum free space available for data in the HTX message is compared to
the number of bytes to transfer, we must take into account the amount of data
already transferred. Otherwise we may move more data than expected.

This patch must be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
39593e6ae3 BUG/MINOR: mux-h1: Fix the parsing of trailers
Unlike other H1 parsing functions, the 3rd parameter of the function
h1_measure_trailers() is the maximum number of bytes to read. For other
functions, it is the relative offset at which to stop parsing.

This patch must be backported to 1.9.
2019-05-07 22:16:41 +02:00
Christopher Faulet
3b1d004d41 BUG/MEDIUM: spoe: Be sure the sample is found before setting its context
When a sample fetch is encoded, we use its context to set info about the
fragmentation. But if the sample is not found, the function sample_process()
returns NULL. So we must be sure the sample exists before setting its context.

This patch must be backported to 1.9 and 1.8.
2019-05-07 22:16:41 +02:00
Willy Tarreau
201fe40653 BUG/MINOR: mux-h2: fix the condition to close a cs-less h2s on the backend
A typo was introduced in the following commit: 927b88ba0 ("BUG/MAJOR:
mux-h2: fix race condition between close on both ends"), making the test
on h2s->cs never be performed and h2c->cs be dereferenced without being
tested. This also confirms that this condition does not happen on this
side, but better fix it right now to be safe.

This must be backported to 1.9.
2019-05-07 19:17:50 +02:00
William Lallemand
27edc4b915 MINOR: mworker: support a configurable maximum number of reloads
This patch implements a new global parameter for the master-worker mode.
When setting the mworker-max-reloads value, a worker receives a SIGTERM
if its number of reloads is greater than this value.
2019-05-07 19:09:01 +02:00
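
A hedged configuration sketch (the threshold value is an example):

    global
        master-worker
        # workers that survive more than 3 reloads receive a SIGTERM
        mworker-max-reloads 3
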
Willy Tarreau
f656279347 CLEANUP: task: remove unneeded tests before task_destroy()
Since the previous commit it's no longer needed to test a task pointer
before calling task_destroy(), so let's just remove these tests from
the various callers before they become confusing. The function's
arguments were also documented. The same should probably be done
with tasklet_free() which involves a test in roughly half of the
call places.
2019-05-07 19:08:16 +02:00
Dragan Dosen
7d61a33921 BUG/MEDIUM: stick-table: fix regression caused by a change in proxy struct
In commit 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as
proxies."), the ->table member of proxy struct was replaced by a pointer
that is not always checked and in some situations can cause a segfault,
eg. during reload or while using "show table" on CLI socket.

No backport is needed.
2019-05-07 14:56:59 +02:00
Rob Allen
56996dabe6 BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload
From OpenSSL 1.1.1, the default behaviour is to maintain open FDs to any
random devices that get used by the random number library. As a result,
those FDs leak when the master re-execs on reload; since those FDs are
not marked FD_CLOEXEC or O_CLOEXEC, they also get inherited by children.
Eventually both master and children run out of FDs.

OpenSSL 1.1.1 introduces a new function to control whether the random
devices are kept open. When clearing the keep-open flag, it also closes
any currently open FDs, so it can be used to clean-up open FDs too.
Therefore, a call to this function is made in mworker_reload prior to
re-exec.

The call is guarded by whether SSL is in use, because it will cause
initialisation of the OpenSSL random number library if that has not
already been done.

This should be backported to 1.9 and 1.8.
2019-05-07 14:11:55 +02:00
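
A hedged sketch of the call site (function and variable names are illustrative; the version guard excludes LibreSSL, which does not provide this API):

    #include <openssl/opensslv.h>
    #include <openssl/rand.h>
    #include <unistd.h>

    /* Sketch: before re-executing the master, ask OpenSSL >= 1.1.1 to stop
     * keeping its random-device FDs open (this also closes any already open
     * ones), so they are neither leaked nor inherited by children.
     */
    void reload_cleanup_and_exec(char *const argv[], int ssl_used)
    {
    #if OPENSSL_VERSION_NUMBER >= 0x10101000L && !defined(LIBRESSL_VERSION_NUMBER)
            if (ssl_used)
                    RAND_keep_random_devices_open(0);
    #endif
            execvp(argv[0], argv);
    }

    int main(void)
    {
            char *args[] = { "true", NULL };

            /* for illustration: clean up, then replace ourselves with "true" */
            reload_cleanup_and_exec(args, 1);
            return 1;   /* only reached if execvp() failed */
    }
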
Willy Tarreau
2135f91d18 BUG/MEDIUM: h2/htx: never leave a trailers block alone with no EOM block
If when receiving an H2 response we fail to add an EOM block after too
large a trailers block, we must not leave the trailers block alone as it
violates the internal assumptions by not being followed by an EOM, even
when an error is reported. We must then make sure the error will safely
be reported to upper layers and that no attempt will be made to forward
partial blocks.

This must be backported to 1.9.
2019-05-07 11:17:32 +02:00
Willy Tarreau
fb07b3f825 BUG/MEDIUM: mux-h2/htx: never wait for EOM when processing trailers
In message https://www.mail-archive.com/haproxy@formilux.org/msg33541.html
Patrick Hemmer reported an interesting bug affecting H2 and trailers.

The problem is that in order to close the stream we have to see the EOM
block, but nothing guarantees it will atomically be delivered with the
trailers block(s). So the code currently waits for it by returning zero
when it was not found, resulting in the caller (h2_snd_buf()) looping
forever, calling it again.

The current internal connection/connstream API doesn't allow a send
actor to notify its caller that it cannot process the data until it
gets more, so even returning zero will only lead to calls in loops
without any guarantee that any progress will be made.

Some late amendments to HTX already guaranteed the atomicity of the
trailers block during snd_buf(), which is currently ensured by the
fact that producers create exactly one such trailers block for all
trailers. So in practice we can only loop between trailers and EOM.

This patch changes the behaviour by making h2s_htx_make_trailers()
become atomic by not consuming the EOM block. This way either it finds
the end of trailers marker (empty line) or it fails. Once it sends the
trailers block, ES is set so the stream turns HLOC or CLOSED. Thanks
to the previous patch "MEDIUM: mux-h2: discard contents that are to be sent
after a shutdown", it is now safe to interrupt outgoing data processing,
and the late EOM block will silently be discarded when the caller
finally sends it.

This is a bit tricky but should remain solid by design, and seems like
the only option we have that is compatible with 1.9, where it must be
backported along with the aforementioned patch.
2019-05-07 11:08:02 +02:00
Willy Tarreau
2b77848418 MEDIUM: mux-h2: discard contents that are to be sent after a shutdown
In h2_snd_buf() we discard any possible buffer contents requested to be
sent after a close or an error. But in practice we can extend this to
any case where the stream is locally half-closed since it means we will
never be able to send these data anymore.

For now it must not change anything, but it will be used by subsequent
patches to discard a lone HTX EOM block arriving after the trailers
block.
2019-05-07 11:08:02 +02:00
Willy Tarreau
aab1a60977 BUG/MEDIUM: h2/htx: always fail on too large trailers
In case a header frame carrying trailers just fits into the HTX buffer
but leaves no room for the EOM block, we used to return the same code
as the one indicating we're missing data. This would result in
such frames causing timeouts instead of immediate clean aborts. Now
they are properly reported as stream errors (since the frame was
decoded and the compression context is still synchronized).

This must be backported to 1.9.
2019-05-07 11:08:02 +02:00
Willy Tarreau
5121e5d750 BUG/MINOR: mux-h2: rely on trailers output not input to turn them to empty data
When sending trailers, we may face an empty HTX trailers block or even
have to discard some of the headers there and be left with nothing to
send. RFC7540 forbids sending empty HEADERS frames, so in this case
we turn to DATA frames (which is possible since this comes after other DATA).

The code used to only check the input frame's contents to decide whether
or not to switch to a DATA frame; it didn't consider the possibility that
the frame might only contain headers that are discarded later, thus it could
still emit an empty HEADERS frame in such a case. This patch makes sure that
the output frame size is checked instead to take the decision.

This patch must be backported to 1.9. In practice this situation is never
encountered since the discarded headers have really nothing to do in a
trailers block.
2019-05-07 11:07:59 +02:00
Dragan Dosen
2674303912 MEDIUM: regex: modify regex_comp() to atomically allocate/free the my_regex struct
Now we atomically allocate the my_regex struct within function
regex_comp() and compile the regex or free both in case of failure. The
pointer to the allocated my_regex struct is returned directly. The
my_regex* argument to regex_comp() is removed.

Function regex_free() was modified so that it systematically frees the
my_regex entry. The function does nothing when called with a NULL as
argument (like free()). It avoids the existing risk of not properly
freeing the initialized area.

Other structures are also updated in order to be compatible (the ones
related to Lua and action rules).
2019-05-07 06:58:15 +02:00
Frédéric Lécaille
7fcc24d4ef MINOR: peers: Do not emit global stick-table names.
This commit "MINOR: stick-table: Add prefixes to stick-table names"
prepended the "peers" section name to stick-table names declared in such "peers"
sections followed by a '/' character.  This is not this name which must be sent
over the network to avoid collisions with stick-table name declared as backends.
As the '/' character is forbidden as first character of a backend name, we prefix
the stick-table names declared in peers sections only with a '/' character.
With such declarations:

    peers mypeers
       table t1

	backend t1
	   stick-table ... peers mypeers

at the peer protocol level, the "t1" declared as a stick-table in the "mypeers"
section is different from the "t1" stick-table declared as a backend.

In src/peers.c, only two modifications were required: use the ->nid stktable
struct member in place of ->id in peer_prepare_switchmsg() to prepare the
stick-table definition messages. Same thing in peer_treat_definemsg() to handle
stick-table definition messages.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
c02766a267 MINOR: stick-table: Add prefixes to stick-table names.
With this patch we add a prefix to stick-table names declared in "peers" sections,
concatenating the "peers" section name, followed by a '/' character, with
the stick-table name. Consequently, "peers" sections have their own
namespace for their stick-tables. Obviously, these stick-table names are not the
ones which should be sent over the network. So the following configurations must
be compatible and should make the A and B peers communicate over the peers protocol:

    # haproxy A config, old way stick-table declarations
    peers mypeers
        peer A ...
        peer B ...

    backend t1
        stick-table type string size 10m store gpc0 peers mypeers

    # haproxy B config, new way stick-table declarations
    peers mypeers
        peer A ...
        peer B ...
        table t1 type string size store gpc0 10m

This "network" name is stored in ->nid new field of stktable struct. The "local"
stktable-name is still stored in ->id.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
015e4d7d93 MINOR: stick-tables: Add peers process binding computing.
Add a list of proxies to all the stick-tables (the ->proxies_list struct stktable
member) so as to be able to compute the process bindings of the peers after the
configuration file has been parsed.
The proxies are added to the stick-tables they reference when parsing
"stick-table" lines in proxy sections, when checking the actions in
check_trk_action() and when resolving sample args for stick-tables,
without checking if they are duplicates. We only check that there is no loop.
Then, after everything has been parsed, we add the proxy bindings to the
bindings of the peers frontends whose stick-tables they reference.
2019-05-07 06:54:07 +02:00
Frédéric Lécaille
1b8e68e89a MEDIUM: stick-table: Stop handling stick-tables as proxies.
This patch adds the support for the "table" line parsing in "peers" sections
to declare stick-tables in such sections. This also prevents the user from having
to declare dummy backend sections with a single stick-table inside.
Even if still supported, this usage will become deprecated.

To do so, the ->table member of proxy struct which is a stktable struct is replaced
by a pointer to a stktable struct allocated at parsing time in src/cfgparse-listen.c
for the dummy stick-table backends and in src/cfgparse.c for "peers" sections.
This has an impact on the code for stick-table sample converters and on the stickiness
rules parsers which first store the name of the dummy before resolving the rules.
This patch replaces proxy_tbl_by_name() calls by stktable_find_by_name() calls
to look up stick-tables stored in the "stktable_by_name" ebtree at parsing time.
There is only one remaining place where proxy_tbl_by_name() is used: src/hlua.c.

At several places in the code we relied on the fact that the ->size member of a
stick-table was equal to zero to consider the stick-table present but not
configured; this does not make sense anymore as the ->table member of struct
proxy is from now on a pointer. These tests are replaced by a test on the
->table value itself.

In "peers" section we do not have to temporary store the name of the section the
stick-table are attached to because this name is obviously already known just after
having entered this "peers" section.

About the CLI stick-table I/O handler, the pointer to proxy struct is replaced by
a pointer to a stktable struct.
2019-05-07 06:54:06 +02:00
Frédéric Lécaille
d456aa4ac2 MINOR: config: Extract the code of "stick-table" line parsing.
With this patch we move the code responsible for parsing "stick-table"
lines into a new parse_stick_table() function in src/stick_table.c
so as to be able to parse "stick-table" lines elsewhere than in proxy sections.
We have also added a conf struct to the stktable struct to store the filename
and the line where the stick-table has been parsed, to help in
diagnosing and displaying any configuration issue.
2019-05-07 06:54:06 +02:00
Willy Tarreau
034c88cf03 MEDIUM: tcp: add the "tfo" option to support TCP fastopen on the server
This implements support for the new API which relies on a call to
setsockopt().
On systems that support it (currently, only Linux >= 4.11), this enables
using TCP fast open when connecting to a server.
Please note that you should use the retry-on "conn-failure", "empty-response"
and "response-timeout" keywords, or the request won't be retried on
failure.
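
A minimal configuration sketch (hypothetical backend name and address) combining
the new server option with the recommended retry-on keywords:

    backend be_fastopen
        mode http
        retry-on conn-failure empty-response response-timeout
        server app1 192.0.2.10:8080 tfo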

Co-authored-by: Olivier Houchard <ohouchard@haproxy.com>
2019-05-06 22:29:39 +02:00
Olivier Houchard
fdcb007ad8 MEDIUM: proto: Change the prototype of the connect() method.
The connect() method had 2 arguments, "data", that tells if there's pending
data to be sent, and "delack" that tells if we have to use a delayed ack
unconditionally, or if the backend is configured with tcp-smart-connect.
Turn that into one argument, "flags".
That way it'll be easier to provide more information to connect() without
adding extra arguments.
2019-05-06 22:12:57 +02:00
Olivier Houchard
4cd2af4e5d BUG/MEDIUM: ssl: Don't attempt to use early data with libressl.
Libressl doesn't yet provide early data, so don't put the CO_FL_EARLY_SSL_HS
on the connection if we're building with libressl, or the handshake will
never be done.
2019-05-06 15:20:42 +02:00
Ilya Shipitsin
54832b97c6 BUILD: enable several LibreSSL hacks, including
SSL_SESSION_get0_id_context is introduced in LibreSSL-2.7.0
async operations are not supported by LibreSSL
early data is not supported by LibreSSL
packet_length is removed from SSL struct in LibreSSL
2019-05-06 07:26:24 +02:00
Tim Duesterhus
473c283d95 CLEANUP: Remove appsession documentation
I was about to partly revert 294d0f08b3,
because there was no 'X' for 'appsession' in the keyword matrix, until
I checked the blame and realized that the feature does not exist any more.

Clearly the documentation is confusing here, the removal note is only
listed *below* the old documentation and the supported sections still
show 'backend' and 'listen'.

It's been 3.5 years and 4 releases (1.6, 1.7, 1.8 and 1.9), I guess
this can be removed from the documentation of future versions.
2019-05-06 07:15:08 +02:00
Willy Tarreau
55e2f5ad14 BUG/MINOR: logs/threads: properly split the log area upon startup
If logs were emitted before creating the threads, then the dataptr pointer
keeps a copy of the end of the log header. Then after the threads are
created, the headers are reallocated for each thread. However the end
pointer was not reset until the end of the first second, which may result
in logs emitted by multiple threads during the first second to be mangled,
or possibly in some cases to use a memory area that was reused for something
else. The fix simply consists in reinitializing the end pointers immediately
when the threads are created.

This fix must be backported to 1.9 and 1.8.
2019-05-05 10:16:13 +02:00
Willy Tarreau
4fc49a9aab BUG/MEDIUM: checks: make sure the warmup task takes the server lock
The server warmup task is used when a server uses the "slowstart"
parameter. This task affects the server's weight and maxconn, and may
dequeue pending connections from the queue. This must be done under
the server's lock, which was not the case.

This must be backported to 1.9 and 1.8.
2019-05-05 06:54:22 +02:00
Willy Tarreau
223995e8ca BUG/MINOR: stream: also increment the retry stats counter on L7 retries
It happens that the retries stats use their own counter and are not
derived from the stream interface, so we need to update it as well
when performing an L7 retry.

No backport is needed.
2019-05-04 10:40:00 +02:00
Olivier Houchard
e3249a98e2 MEDIUM: streams: Add a new keyword for retry-on, "junk-response"
Add a way to retry requests if we got a junk response from the server, i.e.
an incomplete response, or something that is not valid HTTP.
To do so, one can use the new "junk-response" keyword for retry-on.
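
A minimal sketch (hypothetical backend) showing the keyword combined with the
existing ones:

    backend be_app
        retry-on conn-failure empty-response junk-response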
2019-05-04 10:20:24 +02:00
Olivier Houchard
865d8392bb MEDIUM: streams: Add a way to replay failed 0rtt requests.
Add a new keyword for retry-on, 0rtt-rejected. If set, we will try to
replay requests for which we sent early data that got rejected by the
server.
If that option is set, we will attempt to use 0rtt if "allow-0rtt" is set
on the server line even if the client didn't send early data.
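
A minimal sketch (hypothetical names and address) of a server allowing 0-RTT
with replay of rejected early data:

    backend be_tls
        retry-on 0rtt-rejected conn-failure
        server srv1 192.0.2.20:443 ssl verify none allow-0rtt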
2019-05-04 10:20:24 +02:00
Olivier Houchard
a254a37ad7 MEDIUM: streams: Add the ability to retry a request on L7 failure.
When running in HTX mode, if we sent the request, but failed to get the
answer, either because the server just closed its socket, we hit a server
timeout, or we get a 404, 408, 425, 500, 501, 502, 503 or 504 error,
attempt to retry the request, exactly as if we just failed to connect to
the server.

To do so, add a new backend keyword, "retry-on".

It accepts a list of keywords, which can be "none" (never retry),
"conn-failure" (we failed to connect, or to do the SSL handshake),
"empty-response" (the server closed the connection without answering),
"response-timeout" (we timed out while waiting for the server response),
or "404", "408", "425", "500", "501", "502", "503" and "504".

The default is "conn-failure".
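
A minimal sketch (hypothetical backend name and address) using several of these
keywords:

    backend be_app
        mode http
        retries 3
        retry-on conn-failure empty-response response-timeout 503 504
        server app1 192.0.2.10:8080 check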
2019-05-04 10:19:56 +02:00
Olivier Houchard
f4bda993dd BUG/MEDIUM: streams: Don't add CF_WRITE_ERROR if early data were rejected.
In sess_update_st_con_tcp(), if we have an error on the stream_interface
because we tried to send early_data but failed, don't flag the request
channel as CF_WRITE_ERROR, or we will never reach the analyser that sends
back the 425 response.

This should be backported to 1.9.
2019-05-03 22:23:41 +02:00
Olivier Houchard
010941f876 BUG/MEDIUM: ssl: Use the early_data API the right way.
We can only read early data if we're a server, and write if we're a client,
so don't attempt to mix both.

This should be backported to 1.8 and 1.9.
2019-05-03 21:00:10 +02:00
Willy Tarreau
c40efc1919 MINOR: init/threads: make the threads array global
Currently the thread array is a local variable inside a function block
and there is no access to it from outside, which often complicates
debugging. Let's make it global and export it. Also the allocation
return is now checked.
2019-05-03 10:16:30 +02:00
Willy Tarreau
b4f7cc3839 MINOR: init/threads: remove the useless tids[] array
It's still obscure how we managed to initialize an array of integers
with values always equal to the index, just to retrieve the value
from an opaque pointer to the index instead of directly using it! I
suspect it's a leftover from the very early threading experiments.

This commit gets rid of this and simply passes the thread ID as the
argument to run_thread_poll_loop(), thus significantly simplifying the
few call places and removing the need to allocate and then free an
identity array.
2019-05-03 09:59:15 +02:00
Willy Tarreau
81492c989c MINOR: threads: flatten the per-thread cpu-map
When we initially experimented with threads and processes support, we
needed to implement arrays of threads per process for cpu-map, but this
is not needed anymore since we support either threads or processes.
Let's simply make the thread-based cpu-map per thread and not per
thread and per process since that's not used anymore. Doing so reduces
the global struct from 33kB to 1.5kB.
2019-05-03 09:46:45 +02:00
Olivier Houchard
a48237fd07 BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown.
When for some reason the session is not the owner of the connection anymore,
make sure we remove CO_FL_SESS_IDLE, even if we're about to call
conn->mux->destroy(), as the destroy may not destroy the connection
immediately if it's still in use.
This should be backported to 1.9.
2019-05-02 12:08:39 +02:00
Dragan Dosen
e99af978c8 BUG/MEDIUM: pattern: fix memory leak in regex pattern functions
The allocated regex is not freed properly and can cause a memory leak,
eg. when patterns are updated via CLI socket.

This patch should be backported to all supported versions.
2019-05-02 10:05:11 +02:00
Dragan Dosen
026ef570e1 BUG/MINOR: checks: free memory allocated for tasklets
The check->wait_list.task and agent->wait_list.task were not
freed properly on deinit().

This patch should be backported to 1.9.
2019-05-02 10:05:09 +02:00
Dragan Dosen
61302da0e7 BUG/MINOR: log: properly free memory on logformat parse error and deinit()
This patch may be backported to all supported versions.
2019-05-02 10:05:07 +02:00
Dragan Dosen
2a7c20f602 BUG/MINOR: haproxy: fix rule->file memory leak
When using the "use_backend" configuration directive, the configuration
file name stored as rule->file was not freed in some situations. This
was introduced in commit 4ed1c95 ("MINOR: http/conf: store the
use_backend configuration file and line for logs").

This patch should be backported to 1.9, 1.8 and 1.7.
2019-05-02 10:05:06 +02:00
Olivier Houchard
b51937ebaa BUG/MEDIUM: ssl: Don't pretend we can retry a recv/send if we got a shutr/w.
In ha_ssl_write() and ha_ssl_read(), don't pretend we can retry a read/write
if we got a shutr/shutw, or we will never properly shutdown the connection.
2019-05-01 17:37:33 +02:00
Ilya Shipitsin
0c50b1ecbb BUG/MEDIUM: servers: fix typo "src" instead of "srv"
When copying the settings for all servers when using server templates,
fix a typo, or we would never copy the length of the ALPN to be used for
checks.

This should be backported to 1.9.
2019-04-30 23:04:47 +02:00
Christopher Faulet
02f3cf19ed CLEANUP: config: Don't alter listener->maxaccept when nbproc is set to 1
This patch only removes a useless calculation on listener->maxaccept when nbproc
is set to 1. Indeed, the following formula has no effect in such a case:

  listener->maxaccept = (listener->maxaccept + nbproc - 1) / nbproc;

This patch may be backported as far as 1.5.
2019-04-30 15:28:29 +02:00
Christopher Faulet
6b02ab8734 MINOR: config: Test validity of tune.maxaccept during the config parsing
Only -1 and integers from 0 to INT_MAX are accepted. An error is
triggered during the config parsing for any other value.
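
For instance (a minimal global-section sketch):

    global
        tune.maxaccept -1    # -1 means no limit; only -1 and 0..INT_MAX are accepted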

This patch may be backported to all supported versions.
2019-04-30 15:28:29 +02:00
Christopher Faulet
102854cbba BUG/MEDIUM: listener: Fix how unlimited number of consecutive accepts is handled
There is a bug when global.tune.maxaccept is set to -1 (no limit). It is pretty
visible with one process (nbproc set to 1). The functions listener_accept() and
accept_queue_process() don't expect to handle negative maxaccept values. So
instead of accepting incoming connections without any limit, none are ever
accepted and HAProxy loops infinitely in the scheduler.

When there are 2 or more processes, the bug is a bit more subtle. The limit for
a listener is set to 1. So only one connection is accepted at a time by a given
listener. This happens because the listener's maxaccept value is an unsigned
integer. In check_config_validity(), it is first set to UINT_MAX (-1 cast to
an unsigned integer), and then some calculations on it lead to an integer
overflow.

To fix the bug, the listener's maxaccept value is now a signed integer. So, if a
negative value is set for global.tune.maxaccept, we keep it untouched for the
listener and no calculation is made on it. Then, in the listener code, this
signed value is cast to an unsigned one. It simplifies all tests instead of
dealing with negative values. So, it limits the number of connections accepted
at a time to UINT_MAX at most. But, honestly, it is not an issue.

This patch must be backported to 1.9 and 1.8.
2019-04-30 15:28:29 +02:00
Willy Tarreau
bc13bec548 MINOR: activity: report context switch counts instead of rates
It's not logical to report context switch rates per thread in show activity
because everything else is a counter and it's not even possible to compare
values. Let's only report counts. Further, this simplifies the scheduler's
code.
2019-04-30 14:55:18 +02:00
Willy Tarreau
49ee3b2f9a BUG/MAJOR: map/acl: real fix segfault during show map/acl on CLI
A previous commit 8d85aa44d ("BUG/MAJOR: map: fix segfault during
'show map/acl' on cli.") was provided to address a concurrency issue
between "show acl" and "clear acl" on the CLI. Sadly the code placed
there was copy-pasted without changing the element type (which was
struct stream in the original code) and not tested since the crash
is still present.

The reproducer is simple : load a large ACL file (e.g. geolocation
addresses), issue "show acl #0" in loops in one window and issue a
"clear acl #0" in the other one, haproxy crashes.

This fix was also tested with threads enabled and looks good since
the locking seems to work correctly in these areas though. It will
have to be backported as far as 1.6 since the commit above went
that far as well...
2019-04-30 11:50:59 +02:00
Frédéric Lécaille
d803e475e5 MINOR: log: Enable the log sampling and load-balancing feature.
This patch implements the sampling and load-balancing of log servers configured
with the new "sample" keyword implemented by this commit:
    'MINOR: log: Add "sample" new keyword to "log" lines'.
As the list of ranges used to sample the logs to balance is ordered, we only
have to maintain the ->curr_idx member of the smp_info struct, which is the index
of the sample, and check whether or not it belongs to the current range to decide
if we must send the log to the log server or not.
2019-04-30 09:25:09 +02:00
Frédéric Lécaille
d95ea2897e MINOR: log: Add "sample" new keyword to "log" lines.
This patch implements the parsing of the new optional "sample" keyword for "log"
lines, to be able to sample and balance the load of log messages between several
log destinations declared by "log" lines. This keyword must be followed by a list
of comma-separated ranges of indexes numbered from 1 to define the samples to be
used to balance the load of logs to send. This "sample" keyword must obviously be
used on "log" lines before the remaining optional keyword-less arguments. The list
of ranges must be followed by a colon character to separate it from the log
sampling size.

With the following configuration declarations:

   log stderr local0
   log 127.0.0.1:10001 sample 2-3,8-11:11 local0
   log 127.0.0.2:10002 sample 5:5 local0

in addition to being sent to stderr: for the second "log" line, every 11 logs
the logs #2 up to #3 would be sent to 127.0.0.1:10001, then the four logs #8 up
to #11 would be sent to the same log server, and so on periodically. Logs would
be sent to 127.0.0.2:10002 every 5 logs.

It is also possible to define the size of the sample with a value different of
the maximum of the high limits of the ranges, for instance as follows:

   log 127.0.0.1:10001 sample 2-3,8-11:15 local0

as before, the two logs #2 and #3 would be sent to 127.0.0.1:10001, then the
logs #8 up to #11, but in this case this would be done periodically every 15
messages.

Also note that the ranges must not overlap each other. This is to ease the
way the logs are periodically sent.
2019-04-30 09:25:09 +02:00
Christopher Faulet
85db3212b8 MINOR: spoe: Use the sample context to pass frag_ctx info during encoding
This simplifies the API and hides the details in the sample. This way, only
string and binary are aware of this info, because other types cannot be
partially encoded.

This patch may be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Kevin Zhu
f7f54280c8 BUG/MEDIUM: spoe: arg len encoded in previous frag frame but len changed
A fragmented arg will be fetched again at every encode time, and each fetch may
return a different result if SMP_F_MAY_CHANGE is set (for example res.payload),
but the length was already encoded in the first fragment of the frame, which
will cause the SPOA decoding to fail and waste resources.

This patch must be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Christopher Faulet
1907ccc2f7 BUG/MINOR: http: Call stream_inc_be_http_req_ctr() only one time per request
The function stream_inc_be_http_req_ctr() is called at the beginning of the
analysers AN_REQ_HTTP_PROCESS_FE/BE. It has an effect only on the backend. But we
must be careful to call it only once. If the processing of HTTP rules is
interrupted in the middle, when the analyser is resumed, we must not call it
again. Otherwise, the tracked counters of the backend are incremented several
times.

This bug was reported in github. See issue #74.

This fix should be backported as far as 1.6.
2019-04-29 16:01:47 +02:00
Willy Tarreau
97215ca284 BUG/MEDIUM: mux-h2: properly deal with too large headers frames
In h2c_decode_headers(), now that we support CONTINUATION frames, we
try to defragment all pending frames at once before processing them.
However if the first is exactly full and the second cannot be parsed,
we don't detect the problem and we wait for the next part forever due
to an incorrect check on exit; we must abort the processing as soon as
the current frame remains full after defragmentation as in this case
there is no way to make forward progress.

Thanks to Yves Lafon for providing traces exhibiting the problem.

This must be backported to 1.9.
2019-04-29 10:20:21 +02:00
David CARLIER
4de0eba848 MEDIUM: da: HTX mode support.
The DeviceAtlas module can now support both the legacy
mode and the new HTX one, with the known set of supported headers
for the latter.
2019-04-26 17:06:32 +02:00
David Carlier
0470d704a7 BUILD/MEDIUM: contrib: Dummy DeviceAtlas API.
Creating a "mocked" version mainly for testing purposes.
2019-04-26 17:06:32 +02:00
Willy Tarreau
4ad574fbe2 MEDIUM: streams: measure processing time and abort when detecting bugs
On some occasions we've had loops happening when processing actions
(e.g. a yield not being well understood) resulting in analysers being
called in loops until the analysis timeout without incrementing the
stream's call count, thus this type of bug cannot be caught by the
current protection system.

What this patch proposes is to start to measure the time spent in analysers
when profiling is enabled on the thread, in order to detect if a stream is
really misbehaving. In this case we measured the consumed CPU time, not the
wall clock time, so as not to be affected by possible noisy neighbours
sharing the same CPU. When more than 100ms are spent in an analyser, we
trigger the stream_dump_and_crash() function to report the anomaly.

The choice of 100ms comes from the fact that regular calls only take around
1 microsecond and it seems reasonable to accept a degradation factor of
100000, which covers very slow machines such as home gateways running on
sub-ghz processors, with extremely heavy configurations. Some complete
tests show that even this common bogus map_regm() entry supposedly designed
to extract a port from an IP:port entry does not trigger the timeout (25 ms
evaluation time for a 4kB header, exercise left to the reader to spot the
mistake) :

   ([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}).([0-9]{0,3}):([0-9]{0,5}) \5

However this one purposely designed to kill haproxy definitely dies as it
manages to completely freeze the whole process for more than one second
on a 4 GHz CPU for only 120 bytes in :

   (.{0,20})(.{0,20})(.{0,20})(.{0,20})(.{0,20})b \1

This protection will definitely help during the code stabilization period
and may possibly be left enabled later depending on reported issues or not.

If you've noticed that your workload is affected by this patch, please
report it as you have very likely found a bug. And in the mean time you
can turn profiling off to disable it.
2019-04-26 14:30:59 +02:00
Willy Tarreau
3d07a16f14 MEDIUM: stream/debug: force a crash if a stream spins over itself forever
If a stream is caught spinning over itself at more than 100000 loops per
second and for more than one second, the process will be aborted and the
offender reported on the console and logs. Typical figures usually are just
a few tens to hundreds per second over a very short time so there is a huge
margin here. Using even higher values could also work but there is the risk
of not being able to catch offenders if multiple ones start to bug at the
same time and share the load. This code should ideally be disabled for
stable releases, though in theory nothing should ever trigger it.
2019-04-26 13:16:14 +02:00
Willy Tarreau
dcb0e1d37d MEDIUM: appctx/debug: force a crash if an appctx spins over itself forever
If an appctx is caught spinning over itself at more than 100000 loops per
second and for more than one second, the process will be aborted and the
offender reported on the console and logs. Typical figures usually are just
a few tens to hundreds per second over a very short time so there is a huge
margin here. Using even higher values could also work but there is the risk
of not being able to catch offenders if multiple ones start to bug at the
same time and share the load. This code should ideally be disabled for
stable releases, though in theory nothing should ever trigger it.
2019-04-26 13:15:56 +02:00
Willy Tarreau
71c07ac65a MINOR: stream/debug: make a stream dump and crash function
During 1.9 development (and even a bit after) we've started to face a
significant number of situations where streams were abusively spinning
due to an uncaught error flag or complex conditions that couldn't be
correctly identified. Sometimes streams wake appctx up and conversely
as well. More importantly when this happens the only fix is to restart.

This patch adds a new function to report a serious error, some relevant
info and to crash the process using abort() so that a core dump is
available. The purpose will be for this function to be called in various
situations where the process is unfixable. It will help detect these
issues much earlier during development and may even help fixing test
platforms which are able to automatically restart when such a condition
happens, though this is not the primary purpose.

This patch only provides the function and doesn't use it yet.
2019-04-26 13:15:56 +02:00
Willy Tarreau
5e370daa52 BUG/MINOR: proto_http: properly reset the stream's call rate on keep-alive
The stream's call rate measurement was added by commit 2e9c1d296 ("MINOR:
stream: measure and report a stream's call rate in "show sess"") but it
forgot to reset it in case of HTTP keep-alive (legacy mode), resulting
in incorrect measurements.

No backport is needed, unless the patch above is backported.
2019-04-25 18:33:37 +02:00
Willy Tarreau
d5ec4bfe85 CLEANUP: standard: use proper const to addr_to_str() and port_to_str()
The input parameter was not marked const, making it painful for some calls.
2019-04-25 17:48:16 +02:00
Willy Tarreau
d2d3348acb MINOR: activity: enable automatic profiling turn on/off
Instead of having to manually turn task profiling on/off in the
configuration, by default it will work in "auto" mode, which
automatically turns on on any thread experiencing sustained loop
latencies over one millisecond averaged over the last 1024 samples.

This may happen with configs using lots of regex (think map_reg for
example, which is the lazy way to convert Apache's rewrite rules but
must not be abused), and such high latencies affect all the process
and the problem is most often intermittent (e.g. hitting a map which
is only used for certain host names).

Thus now by default, with profiling set to "auto", it remains off all
the time until something bad happens. This also helps better focus on
the issues when looking at the logs as well as in "show sess" output.
It automatically turns off when the average loop latency over the last
1024 calls goes below 990 microseconds (which typically takes a while
when in idle).
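
A minimal global-section sketch, assuming the global keyword is spelled
"profiling.tasks" as in the documentation:

    global
        profiling.tasks auto   # default: profiling kicks in only on high loop latency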

This patch could be backported to stable versions after a bit more
exposure, as it definitely improves observability and the ability to
quickly spot the culprit. In this case, previous patch ("MINOR:
activity: make the profiling status per thread and not global") must
also be taken.
2019-04-25 17:26:46 +02:00
Willy Tarreau
d9add3acc8 MINOR: activity: make the profiling status per thread and not global
In order to later support automatic profiling turn on/off, we need to
have it per-thread. We're keeping the global option to know whether to
turn it on or off, but the profiling status is now set per thread. We're
updating the status in activity_count_runtime() which is called before
entering poll(). The reason is that we'll extend this with run time
measurement when deciding to automatically turn it on or off.
2019-04-25 17:26:19 +02:00
Willy Tarreau
d636675137 BUG/MINOR: activity: always initialize the profiling variable
It happens it was only set if present in the configuration. It's
harmless anyway but can still cause doubts when comparing logs and
configurations so better correctly initialize it.

This should be backported to 1.9.
2019-04-25 17:26:19 +02:00
Willy Tarreau
22d63a24d9 MINOR: applet: measure and report an appctx's call rate in "show sess"
Very similarly to previous commit doing the same for streams, we now
measure and report an appctx's call rate. This will help catch applets
which do not consume all their data and/or which do not properly report
that they're waiting for something else. Some of them like peers might
theoretically be able to exhibit some occasional peaks when teaching a
full table to a nearby peer (e.g. the new replacement process), but
nothing close to what a bogus service can do so there is no risk of
confusion.
2019-04-24 16:04:23 +02:00
Willy Tarreau
2e9c1d2960 MINOR: stream: measure and report a stream's call rate in "show sess"
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
2019-04-24 16:04:23 +02:00
Willy Tarreau
0212fadd65 MINOR: tasks/activity: report the context switch and task wakeup rates
It's particularly useful to spot runaway tasks to see this. The context
switch rate covers all tasklet calls (tasks and I/O handlers) while the
task wakeups only covers tasks picked from the run queue to be executed.
High values there will indicate either intense traffic or a bug that
makes a task go wild.
2019-04-24 16:04:23 +02:00
Willy Tarreau
69b5a7f1a3 CLEANUP: task: report calls as unsigned in show sess
The "show sess" output used signed ints to report the number of calls,
which is confusing for runaway tasks where the call count can turn
negative.
2019-04-24 16:04:23 +02:00
Christopher Faulet
4904058661 BUG/MINOR: htx: Exclude TCP proxies when the HTX mode is handled during startup
When tests are performed on the HTX mode during HAProxy startup, only HTTP
proxies are considered. It is important because, since the commit 1d2b586cd
("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is
enabled on all proxies by default. But for TCP proxies, it is "deactivated".

This patch must be backported to 1.9.
2019-04-24 15:40:02 +02:00
Willy Tarreau
274ba67862 BUG/MAJOR: lb/threads: fix AB/BA locking issue in round-robin LB
An occasional divide by zero in the round-robin scheduler was addressed
in commit 9df86f997 ("BUG/MAJOR: lb/threads: fix insufficient locking on
round-robin LB") by grabing the server's lock in fwrr_get_server_from_group().

But it happens that this is not the correct approach as it introduces a
case of AB/BA deadlock reported by Maksim Kupriianov. This happens when
a server weight changes from/to zero while another thread extracts this
server from the tree. The reason is that the functions used to manipulate
the state work under the server's lock and grab the LB lock while the ones
used in LB work under the LB lock and grab the server's lock when needed.

This commit mostly reverts the changes above and instead further completes
the locking analysis performed on this code to identify areas that really
need to be protected by the server's lock, since this is the only algorithm
which happens to have this requirement. This audit showed that in fact all
locations which require the server's lock are already protected by the LB
lock. This was not noticed the first time due to the server's lock being
taken instead and due to some functions misleadingly using atomic ops to
modify server fields which are under the LB lock protection (these ones
were now removed).

The change consists in not taking the server's lock anymore here, and
instead making sure that the aforementioned function which used to
suffer from the server's weight becoming zero only uses a copy of the
weight which was preliminary verified to be non-null (when the weight
is null, the server will be removed from the tree anyway so there is
no need to recalculate its position).

With this change, the code survived an injection at 200k req/s split
on two servers with weights changing 50 times a second.

This commit must be backported to 1.9 only.
2019-04-24 14:23:40 +02:00
Olivier Houchard
a28454ee21 BUG/MEDIUM: ssl: Return -1 on recv/send if we got EAGAIN.
In ha_ssl_read()/ha_ssl_write(), if we couldn't send/receive data because
we got EAGAIN, return -1 and not 0, as older SSL versions expect that.
This should fix the problems with OpenSSL < 1.1.0.
2019-04-24 12:06:08 +02:00
Christopher Faulet
371723b0c2 BUG/MINOR: spoe: Don't systematically wakeup SPOE stream in the applet handler
This can lead to wakeups in a loop between the SPOE stream and the SPOE applets
waiting to receive agent messages (mainly AGENT-HELLO and AGENT-DISCONNECT).

This patch must be backported to 1.9 and 1.8.
2019-04-23 21:20:47 +02:00
Christopher Faulet
5e1a9d715e BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled
A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream:
Don't request a server connection if a shutw was scheduled"). Among other things,
it breaks the CLI when the shutr on the client side is handled with the client
data. Depending on the flag CF_SHUTW_NOW to not establish the server connection
when an error on the client side is detected is not the right way to fix the bug,
because this flag may be set without any error on the client side.

So instead, we abort the request where the error is handled and only when the
backend stream-interface is in the state SI_ST_INI. This way, there is no
ambiguity on the reason why the abort occurred. The stream-interface is also
switched to the state SI_ST_CLO.

This patch must be backported to 1.9. If the commit c9aecc8ff is backported to
previous versions, this one MUST also be backported. Otherwise, it MAY be
backported to versions older than 1.9, with caution.
2019-04-23 21:20:47 +02:00
Frédéric Lécaille
bed883abe8 BUG/MAJOR: stream: Missing DNS context initializations.
Fix some missing initializations which came with commit 333939c (MINOR: action:
new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of
streams which were allocated were not initialized by stream_new(). This led to
accesses to non-allocated memory when freeing these contexts with stream_free().
2019-04-23 20:24:11 +02:00
Frédéric Lécaille
0bad840b4d MINOR: log: Extract some code to send syslog messages.
This patch extracts the code of __send_log() responsible for sending a syslog
message to a syslog destination represented as a logsrv struct, into a new
__do_send_log() function. __send_log() calls __do_send_log() for each syslog
destination of a proxy after having prepared some of its parameters.
2019-04-23 14:16:51 +02:00
Baptiste Assmann
333939c2ee MINOR: action: new '(http-request|tcp-request content) do-resolve' action
The 'do-resolve' action is an http-request or tcp-request content action
which allows running DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
While the resolution is being performed, the request is paused.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).
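
A minimal configuration sketch (hypothetical names and addresses) of how the
action could be used to route a request to the resolved address:

    resolvers mydns
        nameserver dns1 192.0.2.53:53

    frontend fe_http
        mode http
        bind :8080
        http-request do-resolve(txn.dstip,mydns,ipv4) hdr(Host),lower
        use_backend be_error unless { var(txn.dstip) -m found }
        default_backend be_dynamic

    backend be_dynamic
        mode http
        http-request set-dst var(txn.dstip)
        server clear 0.0.0.0:0

    backend be_error
        mode http
        # no server here, so requests end up with a 503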

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
2019-04-23 11:41:52 +02:00
Baptiste Assmann
db4c8521ca MINOR: dns: move callback affection in dns_link_resolution()
In dns.c, dns_link_resolution(), each type of dns requester is managed
separately; that said, the callback function is assigned globally (and
points to server type callbacks only).
This design prevents the addition of new dns requester types, and this
patch aims at fixing this limitation: now, the callback setting is done
directly in the portion of code dedicated to each requester type.
2019-04-23 11:34:11 +02:00
Baptiste Assmann
dfd35fd71a MINOR: dns: dns_requester structures are now in a memory pool
dns_requester structure can be allocated at run time when servers get
associated to DNS resolution (this happens when SRV records are used in
conjunction with service discovery).
Well, this memory allocation is safer if managed in an HAProxy pool,
especially with the upcoming HTTP action which can perform DNS resolution
at runtime.

This patch moves the memory management of the dns_requester structure
into its own pool.
2019-04-23 11:33:48 +02:00
paulborile
7714b12604 MINOR: wurfl: enabled multithreading mode
The initially excluded multithreaded mode is now completely supported (libwurfl is fully MT safe).
Internal tests are now also run with multithreading enabled.
2019-04-23 11:00:23 +02:00
paulborile
bad132c384 CLEANUP: wurfl: removed deprecated methods
The last 2 major releases of libwurfl included a complete review of engine options, with
the result of deprecating many features. The patch removes unnecessary code and fixes
the documentation.
Can be backported to any version of haproxy.

[wt: must not be backported since it removes config keywords and would
 thus break existing configurations]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-04-23 11:00:23 +02:00
paulborile
59d50145dc BUILD: wurfl: build fix for 1.9/2.0 code base
This applies the required changes for the new buffer API that came in 1.9.
This patch must be backported to 1.9.
2019-04-23 11:00:23 +02:00
Willy Tarreau
b518823f1b MINOR: wurfl: indicate in haproxy -vv the wurfl version in use
It also explicitly mentions that the library is the dummy one when it
is detected.

We have this output now :

$ ./haproxy  -vv |grep -i wurfl
Built with WURFL support (dummy library version 1.11.2.100)
2019-04-23 11:00:23 +02:00
Willy Tarreau
b3cc9f2887 Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code"
This reverts commit 8e5e1e7bf0.

The following patches will fix this code and may be backported.
2019-04-23 10:34:43 +02:00
Emeric Brun
d0e095c2aa MINOR: ssl/cli: async fd io-handlers printable on show fd
This patch exports the async fd io-handlers and makes them printable
when doing a 'show fd' on the CLI.
2019-04-19 17:27:01 +02:00
Christopher Faulet
46451d6e04 MINOR: gcc: Fix a silly gcc warning in connect_server()
Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
f48552f2c1 BUG/MINOR: da: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
2db9dac4c8 BUG/MINOR: 51d: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
c54e4b053d BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled
If a shutdown for writes was performed on the client side (CF_SHUTW is set on
the request channel) while the server connection is still unestablished (the
stream-int is in the state SI_ST_INI), then it is aborted. It must also be
aborted when the shutdown for write is pending (only CF_SHUTW_NOW is
set). Otherwise, some errors on the request channel can be ignored, leaving the
stream in an undefined state.

This patch must be backported to 1.9. It may probably be backported to all
supported versions, but it is unclear if the bug is visible for older versions
than 1.9. So it is probably safer to wait for bug reports on these versions to
backport this patch.
2019-04-19 15:53:23 +02:00
Christopher Faulet
e84289e585 BUG/MEDIUM: thread/http: Add missing locks in set-map and add-acl HTTP rules
Locks are missing in the rules "http-request set-map" and "http-response
add-acl" when an acl or map update is performed. Pattern elements must be
locked.

This patch must be backported to 1.9 and 1.8. For the 1.8, the HTX part must be
ignored.
2019-04-19 15:53:23 +02:00
Baptiste Assmann
e1afd4fec6 MINOR: proto_tcp: tcp-request content: enable set-dst and set-dst-var
The set-dst and set-dst-var actions are available at both 'tcp-request
connection' and 'http-request' but not at the layer in the middle.
This patch fixes this omission and enables both set-dst and set-dst-var at
the 'tcp-request content' layer.
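
A minimal sketch (hypothetical map file and addresses) picking the outgoing
destination once request contents are being inspected:

    frontend fe_tcp
        mode tcp
        bind :4000
        tcp-request content set-dst src,map_ip(/etc/haproxy/dest.map)
        default_backend be_out

    backend be_out
        mode tcp
        server out 0.0.0.0:0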
2019-04-19 15:50:06 +02:00
Willy Tarreau
78c5eec949 BUG/MINOR: acl: properly detect pattern type SMP_T_ADDR
Since 1.6-dev4 with commit b2f8f087f ("MINOR: map: The map can return
IPv4 and IPv6"), maps can return both IPv4 and IPv6 addresses, which
is represented as SMP_T_ADDR at the output of the map converter. But
the ACL parser only checks for either SMP_T_IPV4 or SMP_T_IPV6 and
requires to see an explicit matching method specified. Given that it
uses the same pattern parser for both address families, it implicitly
is also compatible with SMP_T_ADDR, which ought to have been added
there.

This fix should be backported as far as 1.6.
2019-04-19 11:45:20 +02:00
Willy Tarreau
aa5801bcaa BUG/MEDIUM: maps: only try to parse the default value when it's present
Maps returning an IP address (e.g. map_str_ip) support an optional
default value which must be parsed. Unfortunately the parsing code does
not check for this argument's existence and unconditionally tries to
resolve the argument whenever the output is of type address, resulting
in segfaults at parsing time when no such argument is provided. This
patch adds the appropriate check.

This fix may be backported as far as 1.6.
2019-04-19 11:35:22 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case where clients don't do keepalive but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourselves with lots of idling connections, unusable for new sessions,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool-low-fd-ratio" and "pool-high-fd-ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connections in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
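
A rough global-section sketch, assuming the keyword spellings quoted above
(check the documentation for the exact names and defaults):

    global
        maxconn 100000
        pool-low-fd-ratio  20   # stop queuing idle connections past 20% of usable fds
        pool-high-fd-ratio 25   # start killing idle connections past 25% when a new one is needed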
2019-04-18 19:52:03 +02:00
Olivier Houchard
7c49d2e213 MINOR: fd: Add a counter of used fds.
Add a new counter, ha_used_fds, that let us know how many file descriptors
we're currently using.
2019-04-18 19:19:59 +02:00
Emeric Brun
0bbec0fa34 MINOR: peers: adds counters on show peers about tasks calls.
This patch adds a counter of calls on the orchestrator peers task
and a counter on the tasks linked to the applet i/o handler for
each peer.

Those two counters are useful to detect if a peer sync is active
or frozen.

This patch is related to the commit:
  "MINOR: peers: Add a new command to the CLI for peers."
and should be backported with it.
2019-04-18 18:24:25 +02:00
Olivier Houchard
66a7b3302a BUILD/medium: ssl: Fix build with OpenSSL < 1.1.0
Make sure it builds with OpenSSL < 1.1.0, a lot of the BIO_get/set methods
were introduced with OpenSSL 1.1.0, so fallback with the old way of doing
things if needed.
2019-04-18 15:58:58 +02:00
Olivier Houchard
a8955d57ed MEDIUM: ssl: provide our own BIO.
Instead of letting the OpenSSL code handle the file descriptor directly,
provide a custom BIO, that will use the underlying XPRT to send/recv data.
This will let us implement QUIC later, and probably clean the upper layer,
if/when the SSL code provides its own subscribe code, so that the upper layers
won't have to care if we're still waiting for the handshake to complete or not.
2019-04-18 14:56:24 +02:00
Olivier Houchard
e179d0e88f MEDIUM: connections: Provide a xprt_ctx for each xprt method.
For most of the xprt methods, provide a xprt_ctx.  This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.
2019-04-18 14:56:24 +02:00
Olivier Houchard
df35784600 MEDIUM: ssl: provide its own subscribe/unsubscribe function.
In order to prepare for the possibility of using different kinds of xprt
with ssl, make the ssl code provide its own subscribe and unsubscribe
functions; right now it just calls conn_subscribe and conn_unsubscribe.
2019-04-18 14:56:24 +02:00
Olivier Houchard
7b5fd1ec26 MEDIUM: connections: Move some fields from struct connection to ssl_sock_ctx.
Move xprt_st, tmp_early_data and sent_early_data from struct connection to
struct ssl_sock_ctx, as they are only used in the SSL code.
2019-04-18 14:56:24 +02:00
Olivier Houchard
66ab498f26 MEDIUM: ssl: Give ssl_sock its own context.
Instead of using directly a SSL * as xprt_ctx, give ssl_sock its own context.
It's useless for now, but will be useful later when we'll want to be able to
stack xprts.
2019-04-18 14:56:24 +02:00
Olivier Houchard
ed1a6a0d8a MEDIUM: tasks: Use __ha_barrier_store after modifying global_tasks_mask.
Now that we no longer use atomic operations to update global_tasks_mask,
as it's always modified while holding the TASK_RQ_LOCK, we have to use
__ha_barrier_store() instead of __ha_barrier_atomic_store() to ensure
any modification of global_tasks_mask is seen before modifying
active_tasks_mask.

This should be backported to 1.9.
2019-04-18 14:14:10 +02:00
Willy Tarreau
d83b6c1ab3 BUG/MINOR: mworker: disable busy polling in the master process
When enabling busy polling, we don't want the master to use it, or it
wastes a dedicated processor to this!

Must be backported to 1.9.
2019-04-18 11:34:41 +02:00
Olivier Houchard
1cfac37b65 MEDIUM: tasks: Don't account a destroyed task as a runned task.
In process_runnable_tasks(), if the task we're about to run has been
destroyed and should be freed, don't account for it in the number of tasks
we ran. We're only allowed a maximum number of tasks to run per call to
process_runnable_tasks(), and freeing one shouldn't take the slot of a
valid task.
2019-04-18 10:11:13 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error paths to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Willy Tarreau
03dd029a5b CLEANUP: task: remain consistent when using the task's handler
A pointer "process" is assigned the task's handler in
process_runnable_tasks(), we have no reason to use t->process
right after it is assigned.
2019-04-17 22:32:27 +02:00
Olivier Houchard
51205a1958 BUG/MEDIUM: applets: Don't use task_in_rq().
When deciding if we want to wake the task of an applet up, don't give up
if task_in_rq returns 1, as there's a race condition and another thread
may run it. Instead, always attempt to task_wakeup(), at worst the task
is already in the run queue, and nothing will happen.
2019-04-17 19:30:23 +02:00
Olivier Houchard
0c7a4b6371 MINOR: tasks: Don't set the TASK_RUNNING flag when adding in the tasklet list.
Now that TASK_QUEUED is enforced, there's no need to set TASK_RUNNING when
removing the task from the runqueue to add it to the tasklet list. The flag
will only be set right before we run the task.
2019-04-17 19:28:01 +02:00
Olivier Houchard
de82aeaa26 BUG/MEDIUM: tasks: Make sure we modify global_tasks_mask with the rq_lock.
When modifying global_tasks_mask, make sure we hold the rq_lock, or we might
remove the bit while it has been re-set by somebody else, and we may not
be woken when needed.
2019-04-17 19:28:01 +02:00
Willy Tarreau
b038007ae8 BUG/MEDIUM: tasks: Make sure we set TASK_QUEUED before adding a task to the rq.
Make sure we set TASK_QUEUED in every case before adding the task to the
run queue. task_wakeup() now checks if either TASK_QUEUED or TASK_RUNNING
is set, and if neither is set, add TASK_QUEUED and effectively add the task
to the runqueue.
No longer use __task_wakeup() anywhere except in task_wakeup(), always use
task_wakeup() instead.
With the old code, process_runnable_tasks() may re-add a task in the runqueue
without setting the TASK_QUEUED flag, and there were race conditions that could
lead to a task having the TASK_QUEUED flag but not in the runqueue, thus
being unschedulable.

This should be backported to 1.9.
2019-04-17 19:28:01 +02:00
Christopher Faulet
46575cd392 BUG/MINOR: http_fetch/htx: Use HTX versions if the proxy enables the HTX mode
Because the HTX is now the default mode for all proxies (HTTP and TCP), it is
better to match on the proxy options to know if the HTX is enabled or not. This
way, if a TCP proxy explicitly disables the HTX mode, the legacy version of HTTP
fetches will be used.

No backport needed except if the patch activating the HTX by default for all
proxies is backported.
2019-04-17 15:12:27 +02:00
Christopher Faulet
5ec8bcb021 BUG/MINOR: http_fetch/htx: Allow permissive sample prefetch for the HTX
As for smp_prefetch_http(), there is now a way to successfully perform a
prefetch in HTX, even if the message forwarding already begun. It is used for
the sample fetches "req.proto_http" and "method".

This patch must be backported to 1.9.
2019-04-17 15:12:27 +02:00
Christopher Faulet
89dc499359 BUG/MAJOR: http_fetch: Get the channel depending on the keyword used
All HTTP samples are buggy because the channel tested in the prefetch functions
(HTX and legacy HTTP) is chosen depending on the sample direction and not the
keyword really used. It means the request channel is used if the sample is
called during the request analysis and the response channel is used if it is
called during the response analysis, regardless the sample really called. For
instance, if you use the sample "req.ver" in an http-response rule, the response
channel will be prefetched because it is called during the response analysis,
while the request channel should have been used instead. So some assumptions on
the validity of the sample may be made on the wrong channel. It is the first
bug.

Then the same error is done in some samples themselves. So fetches are performed
on the wrong channel. For instance, the header extraction (req.fhdr, res.fhdr,
req.hdr, res.hdr...). If the sample "req.hdr" is used in an http-response rule,
then the matching is done on the response headers and not the request ones. It
is the second bug.

Finally, the last one but not the least, in some samples, the right channel is
used. But because the prefetch was done on the wrong one, this channel may be in
a undefined state. For instance, using the sample "req.ver" in an http-response
rule leads to a matching on a possibly released buffer.

To fix all these bugs, the right channel is now chosen in sample fetches, before
the prefetch. If the same function is used to fetch requests and responses
elements, then the keyword is used to choose the right one. This channel is then
used by the functions smp_prefetch_htx() and smp_prefetch_http(). Of course, it
is also used by the samples themselves to extract information.

This patch must be backported to all supported versions. For version 1.8 and
prior, it must be totally refactored. First because there is no HTX in these
versions. Then the buffers API has changed in HAProxy 1.9. The files
http_fetch.{c,h} don't exist in old versions.
2019-04-17 15:12:27 +02:00
Christopher Faulet
038ad8123b MINOR: mux-h1: Handle read0 during TCP splicing
It avoids a roundtrip with underlying I/O callbacks to do so. If a read0 is
handled at the end of h1_rcv_pipe(), the flag CS_FL_REOS is set on the
conn_stream. And if there is no data in the pipe, the flag CS_FL_EOS is also
set.

This patch may be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
e18777b79d BUG/MEDIUM: mux-h1: Enable TCP splicing to exchange data only
Use the TCP splicing only when the input parser is in the state H1_MSG_DATA or
H1_MSG_TUNNEL and don't transfer more than then known expected length for these
data (unlimited for the tunnel mode). In other states or when all data are
transferred, the TCP splicing is disabled.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
f7d5ff37e0 BUG/MEDIUM: mux-h1: Notify the stream waiting for TCP splicing if ibuf is empty
When a stream-interface want to use the TCP splicing to forward its data, it
notifies the mux h1. We will then flush the input buffer and don't read more
data. So the stream-interface will not be notified for read anymore, except if
an error or a read0 is detected. It is a problem every time the receive I/O
callback is called again. It happens when the pipe is full or when no data are
received on the pipe. It also happens when the input buffer is freshly
flushed. Because the TCP splicing is enabled, nothing is done in h1_recv() and
the stream-interface is never woken up. So, now, in h1_recv(), if the TCP
splicing is used and the input buffer is empty, the stream-interface is notified
for read.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
2f320ee59c BUG/MINOR: mux-h1: Don't switch the parser in busy mode if other side has done
There is no reason to switch the input parser into busy mode if all the output
has been processed.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
Christopher Faulet
91f77d5999 BUG/MINOR: mux-h1: Process input even if the input buffer is empty
It is required, at least, to add the EOM block and finish the message when the
TCP splicing was used to send all data. Otherwise, there is no way to finish the
parsing.

This patch must be backported to 1.9.
2019-04-17 14:52:31 +02:00
William Lallemand
74f0ec3894 BUG/MINOR: mworker: ensure that we still quits with SIGINT
Since the fix "BUG/MINOR: mworker: don't exit with an ambiguous value"
we are leaving with a EXIT_SUCCESS upon a SIGINT.

We still need to quit with a SIGINT when a worker leaves with a SIGINT.

This is done this way because vtest expect a 130 during the process
stop, haproxy without mworker returns a 130, so it should be the same in
mworker mode.

This should be backported in 1.9, with the previous patch ("BUG/MINOR:
mworker: don't exit with an ambiguous value").

Code has moved, mworker_catch_sigchld() is in haproxy.c.
2019-04-16 18:14:29 +02:00
William Lallemand
4cf4b33744 BUG/MINOR: mworker: don't exit with an ambiguous value
When the sigchld handler is called and waitpid() returns -1,
the behavior of waitpid() with the status variable is undefined.
It is not a good idea to exit with the value contained in it.

Since this exit path does not use the exitcode variable, it means that
this is an expected and successful exit.

This should be backported in 1.9, code has moved,
mworker_catch_sigchld() is in haproxy.c.
2019-04-16 18:14:29 +02:00
William Lallemand
32b6901550 BUG/MINOR: mworker: mworker_kill should apply on every children
Commit 3f12887 ("MINOR: mworker: don't use children variable anymore")
introduced a regression.

The previous behavior was to send a signal to every child, whether or
not they were former children. Instead of this, the signal was only sent to
the current children, so we no longer tried to kill -INT or -TERM all
processes during a reload.

No backport needed.
2019-04-16 18:14:29 +02:00
Willy Tarreau
85d0424b20 BUG/MINOR: listener/mq: correctly scan all bound threads under low load
When iterating on the CLI using "show activity" and no other load, it
was visible that the last thread was always skipped. This was caused by
the way the thread bits were walking : t1 was updated after t2 to make
sure it never equals t2 (thus it skips t2), and in case of a tie we
choose t1. This results in the chosen thread never to equal t2 unless
the other ones already have one connection. In addition to this, t2 was
recalulated upon each pass due to the fact that only the 31th bit was
looked at instead of looking at the t2'th bit.

This patch fixes this by updating t2 after t1 so that t1 is free to
walk over all positions under equal load. No measurable performance
gains are expected from this though, but it at least removes one
strange indicator which could lead to some suspicion.

No backport is needed.
2019-04-16 18:09:13 +02:00
Willy Tarreau
636848aa86 MINOR: init: add a "set-dumpable" global directive to enable core dumps
It's always a pain to get a core dump when enabling user/group setting
(which disables the dumpable flag on Linux), when using a chroot and/or
when haproxy is started by a service management tool which requires
complex operations to just raise the core dump limit.

This patch introduces a new "set-dumpable" global directive to work
around these troubles by doing the following :

  - remove file size limits     (equivalent of ulimit -f unlimited)
  - remove core size limits     (equivalent of ulimit -c unlimited)
  - mark the process dumpable again (equivalent of suid_dumpable=1)

Some of these will depend on the operating system. This way it becomes
much easier to retrieve a core file. Temporarily moving the chroot to
a user-writable place is generally enough.
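
A minimal global-section sketch (hypothetical user, group and chroot path)
showing where the directive is useful:

    global
        set-dumpable
        user haproxy
        group haproxy
        chroot /var/empty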
2019-04-16 14:31:23 +02:00
William Lallemand
482f9a9a2f MINOR: mworker: export HAPROXY_MWORKER=1 when running in mworker mode
Export HAPROXY_MWORKER=1 in an environment variable when running in
mworker mode.
2019-04-16 13:26:43 +02:00
William Lallemand
620072bc0d MINOR: cli: don't add a semicolon at the end of HAPROXY_CLI
Only add the semicolon when there are several CLIs in HAPROXY_CLI and
HAPROXY_MASTER_CLI.
2019-04-16 13:26:43 +02:00
William Lallemand
9a37fd0f19 MEDIUM: mworker/cli: export the HAPROXY_MASTER_CLI variable
It works the same way as the HAPROXY_CLI variable, it exports the
listeners addresses separated by semicolons.
2019-04-16 13:26:43 +02:00
William Lallemand
8f7069a389 CLEANUP: mworker: remove the type field in mworker_proc
Since the introduction of the options field, we can use it to store the
type of process.

type = 'm' is replaced by PROC_O_TYPE_MASTER
type = 'w' is replaced by PROC_O_TYPE_WORKER
type = 'e' is replaced by PROC_O_TYPE_PROG

The old values are still used in the HAPROXY_PROCESSES environment
variable to pass the information during a reload.
2019-04-16 13:26:43 +02:00
William Lallemand
bd3de3efb7 MEDIUM: mworker-prog: implements 'option start-on-reload'
This option is already the default, but its opposite 'no option
start-on-reload' allows the master to keep a previous instance of a
program and not start a new one upon a reload.

The old program will then appear as a current one in "show proc" and
could also trigger an exit-on-failure upon a segfault.
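
A minimal sketch (hypothetical program name and command) of a program section
keeping its old instance across reloads:

    program my-exporter
        command /usr/local/bin/my-exporter --listen 127.0.0.1:9101
        no option start-on-reload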
2019-04-16 13:26:43 +02:00
William Lallemand
4528611ed6 MEDIUM: mworker: store the leaving state of a process
Previously we were assuming that a process was in a leaving state when
its number of reloads was greater than 0. With mworker programs it's not
the case anymore, so we need to store a leaving state.
2019-04-16 13:26:43 +02:00
Willy Tarreau
9df86f997e BUG/MAJOR: lb/threads: fix insufficient locking on round-robin LB
Maksim Kupriianov reported very strange crashes in fwrr_update_position()
which didn't make sense because of an apparent divide overflow, except
that the value was not null in the core dump.

It happens that while the locking is correct in all the functions' call
graph, the uppermost one (fwrr_get_next_server()) incorrectly expected
that its target server was already locked when called. This stupid
assumption caused the server lock not to be held when calling the other
ones, explaining how it was possible to change the server's eweight by
calling srv_lb_commit_status() under the server lock yet collide with
its unprotected usage.

This commit makes sure that fwrr_get_server_from_group() retrieves a
locked server and that fwrr_get_next_server() is responsible for
unlocking the server before returning it. There is one subtlety in
this function which is that it builds a list of avoided servers that
were full while scanning the tree, and all of them are queued in a
full state so they must be unlocked upon return.

Many thanks to Maksim for providing detailed info that allowed narrowing
down this bug.

This fix must be backported to 1.9. In 1.8 the lock seems much wider
and changes to the server's state are performed under the rendez-vous
point, so it doesn't seem possible for this to happen there.
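
A self-contained sketch of the locking convention described above, using
plain pthread locks and simplified types instead of HAProxy's fwrr trees
and lock macros: full servers are kept locked while queued as "avoided",
and everything, including the chosen server, is unlocked before returning.

  #include <pthread.h>
  #include <stddef.h>

  struct srv {
      pthread_mutex_t lock;
      int full;
  };

  static struct srv *pick_server(struct srv *list, size_t n)
  {
      struct srv *avoided[16];
      size_t nb_avoided = 0;
      struct srv *chosen = NULL;
      size_t i;

      for (i = 0; i < n; i++) {
          struct srv *s = &list[i];

          pthread_mutex_lock(&s->lock);         /* examined under its own lock */
          if (s->full) {
              if (nb_avoided < 16)
                  avoided[nb_avoided++] = s;    /* stays locked while queued */
              else
                  pthread_mutex_unlock(&s->lock);
              continue;
          }
          chosen = s;
          break;
      }

      for (i = 0; i < nb_avoided; i++)          /* unlock all avoided servers */
          pthread_mutex_unlock(&avoided[i]->lock);
      if (chosen)
          pthread_mutex_unlock(&chosen->lock);  /* caller gets an unlocked server */
      return chosen;
  }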
2019-04-16 11:21:14 +02:00
Frédéric Lécaille
95679dc096 MINOR: peers: Add a new command to the CLI for peers.
Implements "show peers [peers section]" new CLI command to dump information
about the peers and their stick-tables to be synchronized and others internal.

May be backported as far as 1.5.
2019-04-16 09:58:40 +02:00
Willy Tarreau
6f7a02a381 BUILD: htx: fix a used uninitialized warning on is_cookie2
gcc-3.4 reports the following, which actually looks like a valid warning
when looking at the code; it's unclear why other compilers didn't notice it:

src/proto_htx.c: In function `htx_manage_server_side_cookies':
src/proto_htx.c:4266: warning: 'is_cookie2' might be used uninitialized in this function
2019-04-15 21:55:48 +02:00
Willy Tarreau
8de1df92a3 BUILD: do not specify "const" on functions returning structs or scalars
Older compilers (like gcc-3.4) warn about the use of "const" on functions
returning a struct, which makes sense since the returned value may only be copied:

  include/common/htx.h:233: warning: type qualifiers ignored on function return type

Let's simply drop "const" here.
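
For illustration, the pattern being removed and its fix (hypothetical names):

  struct box { int a, b; };

  /* old form: gcc-3.4 emits "type qualifiers ignored on function return type"
   * const struct box make_box(int a, int b);
   */

  /* fixed form: simply drop the useless const on the by-value return */
  static struct box make_box(int a, int b)
  {
      struct box r = { a, b };
      return r;
  }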
2019-04-15 21:55:48 +02:00
Willy Tarreau
0e492e2ad0 BUILD: address a few cases of "static <type> inline foo()"
Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.
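
A tiny illustration of the two forms (hypothetical function):

  /* warned about by old compilers such as gcc-3.4:
   * static int inline add1(int x) { return x + 1; }
   */

  /* portable form used by the patch: */
  static inline int add1(int x)
  {
      return x + 1;
  }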
2019-04-15 21:55:48 +02:00
Olivier Houchard
998410a41b BUG/MEDIUM: h2: Revamp the way send subscriptions work.
Instead of abusing the SUB_CALL_UNSUBSCRIBE flag, revamp the H2 code a bit
so that it just checks if h2s->sending_list is empty to know if the tasklet
of the stream_interface has been woken up or not.
send_wait is now set to NULL in h2_snd_buf() (ideally we'd set it to NULL
as soon as we're waking the tasklet, but it can't be done, because we still
need it in case we have to remove the tasklet from the task list).
2019-04-15 19:27:57 +02:00
Olivier Houchard
9a0f559676 BUG/MEDIUM: h2: Make sure we're not already in the send_list in h2_subscribe().
In h2_subscribe(), don't add ourselves to the send_list if we're already in it.
That may happen if we try to send and fail twice, as we're only removed
from the send_list if we managed to send data, to promote fairness.
Failing to do so can lead to either an infinite loop, or some random crashes,
as we'd get the same h2s in the send_list twice.

This should be backported to 1.9.
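
A generic, self-contained sketch of the guard added here (the real code
uses HAProxy's LIST_* macros on h2s and the send_list): only append an
element if it is not already linked, otherwise the same node ends up in
the list twice. Detached nodes are assumed to point to themselves, as
with LIST_INIT.

  struct node { struct node *next, *prev; };

  static int node_linked(const struct node *n)
  {
      return n->next != n;            /* detached nodes point to themselves */
  }

  static void append_once(struct node *head, struct node *n)
  {
      if (node_linked(n))
          return;                     /* already queued: adding twice corrupts the list */
      n->prev = head->prev;
      n->next = head;
      head->prev->next = n;
      head->prev = n;
  }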
2019-04-15 19:27:57 +02:00
Olivier Houchard
0e0793715c BUG/MEDIUM: muxes: Make sure we unsubscribed when destroying mux ctx.
In the h1 and h2 muxes, make sure we unsubscribed before destroying the
mux context.
Failing to do so will lead to a segfault later, as the connection will
attempt to dereference its conn->send_wait or conn->recv_wait, which pointed
to the now-freed mux context.

This was introduced by commit 39a96ee16e, so
should only be backported if that commit gets backported.
2019-04-15 19:27:57 +02:00
Willy Tarreau
e61828449c BUILD: cli/threads: fix build in single-threaded mode
Commit a8f57d51a ("MINOR: cli/activity: report the accept queue sizes
in "show activity"") broke the single-threaded build because the
accept-rings are not implemented there. Let's ifdef this out. Ideally
we should start to think about always having such elements initialized
even without threads to improve the test coverage.
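
The kind of guard added looks like the sketch below; USE_THREAD is the
macro defined for threaded builds, while the function name and body are
hypothetical placeholders:

  static void report_accept_queues(void)
  {
  #ifdef USE_THREAD
      /* accept rings only exist in threaded builds: walk and report them */
  #else
      /* single-threaded build: nothing to report */
  #endif
  }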
2019-04-15 18:55:31 +02:00
Willy Tarreau
3466e3cdcb BUILD: task/thread: fix single-threaded build of task.c
As expected, commit cde7902ac ("MEDIUM: tasks: improve fairness between
the local and global queues") broke the build with threads disabled,
and I forgot to rerun this test before committing. No backport is
needed.
2019-04-15 18:52:40 +02:00
Nenad Merdanovic
646b7741bc BUG/MEDIUM: map: Fix memory leak in the map converter
The allocated trash chunk is not freed properly and causes a memory leak,
exhibited as growth in the trash pool allocations. The bug was introduced
in commit 271022 (BUG/MINOR: map: fix map_regm with backref).

This should be backported to all branches where the above commit was
backported.
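
A generic illustration of the leak pattern fixed here, with plain
malloc/free standing in for HAProxy's trash-chunk pool: the temporary
buffer must be released on every exit path, including the early-return one.

  #include <stdlib.h>
  #include <string.h>

  static int convert(const char *in, char *out, size_t outlen)
  {
      char *tmp = malloc(outlen);

      if (!tmp)
          return 0;

      if (strlen(in) >= outlen) {   /* stand-in for the converter's failure path */
          free(tmp);                /* the missing release caused the leak */
          return 0;
      }

      strcpy(tmp, in);
      memcpy(out, tmp, strlen(tmp) + 1);
      free(tmp);
      return 1;
  }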
2019-04-15 09:53:46 +02:00
Willy Tarreau
c8da044b41 MINOR: tasks: restore the lower latency scheduling when niced tasks are present
In the past we used to reduce the number of tasks consulted at once when
some niced tasks were present in the run queue. This was dropped in 1.8
when the scheduler started to take batches. With the recent fixes it now
becomes possible to restore this behaviour which guarantees a better
latency between tasks when niced tasks are present. Thanks to this, with
the default number of 200 for tune.runqueue-depth, with a parasitic load
of 14000 requests per second, nice 0 gives 14000 rps, nice 1024 gives
12000 rps and nice -1024 gives 16000 rps. The amplitude widens if the
runqueue depth is lowered.
2019-04-15 09:50:56 +02:00
Willy Tarreau
2d1fd0a0d2 MEDIUM: tasks: only base the nice offset on the run queue depth
The offset calculated for the nice value used to be wrong for a long
time and got even worse when the improved multi-thread scheduler was
implemented, because it continued to rely on the run queue size, which
became irrelevant given that we extract tasks in batches, so the run
queue size follows a sawtooth shape.

However, the offset much better reflects insertion positions in the
queue, so it's worth dropping this rq_size component of the equation.
Last point, due to the batches made of runqueue-depth entries at once,
the higher the depth, the lower the effect of the nice setting, since
values are picked together in batches and placed into a list. An
intuitive approach consists in multiplying the nice value by the
batch size to allow tasks to participate in a different batch. And
experimentation shows that this works pretty well.

With a runqueue-depth of 16 and a parasitic load of 16000 requests
per second on 100 streams, nice 0 shows 16000 requests per second,
nice -1024 shows 22000 and nice 1024 shows 10000.

The difference is even bigger with a runqueue depth of 5. At 200
however it's much smoother (16000-22000).
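
A hedged sketch of the idea (names are illustrative, not the scheduler's
actual fields): scaling the nice value by the batch size moves a niced
task by whole batches in the queue ordering rather than by a few
positions inside one batch.

  /* nice is expected in the -1024..+1024 range */
  static unsigned int queue_key(unsigned int insertion_count, int nice,
                                unsigned int runqueue_depth)
  {
      return insertion_count + (unsigned int)(nice * (int)runqueue_depth);
  }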
2019-04-15 09:50:56 +02:00
Willy Tarreau
cde7902ac9 MEDIUM: tasks: improve fairness between the local and global queues
Tasks allowed to run on multiple threads, as well as those scheduled by
one thread to run on another one, pass through the global queue. The
local queues only see tasks scheduled by one thread to run on itself.
The tasks extracted from the global queue are transferred to the local
queue when they're picked by one thread. This causes a priority issue
because the global tasks experience a priority contest twice while the
local ones experience it only once. Thus if a task returns still
running, it's immediately reinserted into the local run queue and runs
much faster than the ones coming from the global queue.

Till 1.9 the tasks going through the global queue were mostly:
  - health checks initialization
  - queue management
  - listener dequeue/requeue

These ones are moderately sensitive to unfairness so it was not that
big an issue.

Since 2.0-dev2 with the multi-queue accept, tasks are scheduled to
remote threads on most accept() and it becomes fairly visible under
load that the accept slows down, even for the CLI.

This patch remedies this by consulting both the local and the global
run queues in parallel and by always picking the task whose deadline
is the earliest. This guarantees to maintain an excellent fairness
between the two queues and removes the cascade effect experienced
by the global tasks.

Now the CLI always continues to respond quickly even in presence of
expensive tasks running for a long time.

This patch may possibly be backported to 1.9 if some scheduling issues
are reported but at this time it doesn't seem necessary.
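
An illustrative stand-in for the selection rule (types and names are
hypothetical): peek at the next task of each queue and always pick the
one whose key is the earliest, using a wrapping comparison.

  #include <stddef.h>

  struct task_peek { unsigned int key; };

  static struct task_peek *pick_next(struct task_peek *local, struct task_peek *global)
  {
      if (!local)
          return global;
      if (!global)
          return local;
      /* earliest key first keeps both queues fair with each other */
      return ((int)(local->key - global->key) <= 0) ? local : global;
  }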
2019-04-15 09:50:56 +02:00
Willy Tarreau
24f382f555 CLEANUP: task: do not export rq_next anymore
This one hasn't been used anymore since the scheduler changes after 1.8,
but it kept being exported and maintained up to date even though it's
always reset when scanning the trees. Let's stop exporting and updating it.
2019-04-15 09:50:56 +02:00
Christopher Faulet
61840e715f BUG/MEDIUM: muxes: Don't dereference mux context if null in release functions
When a mux context is released, we must be sure it exists before dereferencing
it. The bug was introduced in the commit 39a96ee16 ("MEDIUM: muxes: Be prepared
to don't own connection during the release").

No need to backport this patch, except if the commit 39a96ee16 is backported
too.
2019-04-15 09:47:10 +02:00
Christopher Faulet
1d2b586cdd MAJOR: htx: Enable the HTX mode by default for all proxies
The legacy HTTP mode is no longer the default one. So now, by default, without
any option in your configuration, all proxies will use the HTX mode. The line
"option http-use-htx" in proxy sections is now useless, except to cancel the
legacy HTTP mode. To fall back to the legacy HTTP mode, you should use the line
"no option http-use-htx" explicitly.

Note that the reg-tests still work by default in legacy HTTP mode. The HTX will
be enabled by default in a future commit.
2019-04-12 22:06:53 +02:00
Christopher Faulet
0ef372a390 MAJOR: muxes/htx: Handle implicit upgrades from h1 to h2
The upgrade is performed when an H2 preface is detected while the first request
on a connection is parsed. The CS is destroyed by setting the EOS flag on it. A
special flag is added on the HTX message to warn the HTX analyzers that the
stream will be closed because of an upgrade. This way, no error and no log are
emitted. When the mux h1 is released, we create a mux h2 without any CS,
passing it the buffer with the unparsed H2 preface.
2019-04-12 22:06:53 +02:00
Christopher Faulet
bbe685452f MAJOR: proxy/htx: Handle mux upgrades from TCP to HTTP in HTX mode
It is now possible to upgrade TCP streams to HTX when an HTTP backend is set for
a TCP frontend (both with the HTX enabled). So concretely, in such a case, an
upgrade is performed from the mux pt to the mux h1. The current CS and the
channel's buffer are used to initialize the mux h1.
2019-04-12 22:06:53 +02:00
Christopher Faulet
eb7098035c MEDIUM: htx: Allow the option http-use-htx to be used on TCP proxies too
This will be mandatory to allow upgrades from TCP to HTTP in HTX. Of course, raw
buffers will still be used by default on TCP proxies, whether this option is set
or not. But if you want to handle mux upgrades from a TCP proxy, you must enable
the HTX on it and on all its backends.

There is only a small change in the lua code. Because TCP proxies can be HTX
aware, to exclude TCP services only for HTTP proxies, we must now also check the
mode (TCP/HTTP).
2019-04-12 22:06:53 +02:00
Christopher Faulet
39a96ee16e MEDIUM: muxes: Be prepared to don't own connection during the release
This happens during mux upgrades. In such a case, when the destroy() callback is
called, the connection points to a different mux's context than the one passed
to the callback. It means the connection is owned by another mux. The old mux is
then released but the connection is not closed.
2019-04-12 22:06:53 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context to destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Christopher Faulet
51f73eb11a MEDIUM: muxes: Add an optional input buffer during mux initialization
The mux's callback init() now takes a pointer to a buffer as an extra argument.
It must be used by the multiplexer as its input buffer. This buffer is always
NULL when a multiplexer is initialized with a fresh connection. But if a mux
upgrade is performed, it may be filled with existing data. Note that, for now,
mux upgrades are not supported. But this commit is mandatory to do so.
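
A hedged sketch of the extended init() shape with simplified stand-in
types (not HAProxy's real structures): a NULL input buffer means a fresh
connection, a non-NULL one carries bytes left over from the previous mux
during an upgrade.

  #include <stddef.h>

  struct buf_stub  { char *area; size_t data; };
  struct conn_stub { void *ctx; };

  static int mux_stub_init(struct conn_stub *conn, struct buf_stub *input)
  {
      if (input && input->data) {
          /* adopt the already-received bytes (e.g. an unparsed H2 preface)
           * as the mux's input buffer instead of starting empty
           */
      }
      conn->ctx = NULL;   /* would point to the freshly allocated mux context */
      return 0;
  }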
2019-04-12 22:06:53 +02:00
Christopher Faulet
e9b7072e9e MINOR: muxes: Rely on conn_is_back() during init to handle front/back conn
Instead of using the connection context to distinguish between a
frontend connection and a backend connection, we now rely on the function
conn_is_back().
2019-04-12 22:06:53 +02:00
Christopher Faulet
0f17a9b510 MINOR: filters/htx: Use stream flags instead of px mode to instantiate a filter
In the function flt_stream_add_filter(), if the HTX is enabled, before attaching
a filter to a stream, we test if the filter can handle it or not. If not, the
filter is ignored. Before, the proxy mode was tested. Now we test if the stream
is an HTX stream or not.
2019-04-12 22:06:53 +02:00
Christopher Faulet
eca8854555 MINOR: http_fetch/htx: Use stream flags instead of px mode in smp_prefetch_htx
In the function smp_prefetch_htx(), we must know if data in the channel's buffer
are structured or not. Before, the proxy mode was tested. Now we test if the
stream is an HTX stream or not. If yes, we know the HTX is used to structure
data in the channel's buffer.
2019-04-12 22:06:53 +02:00
Christopher Faulet
0e160ff5bb MINOR: stream: Set a flag when the stream uses the HTX
The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9f38f5aa80 MINOR: muxes: Add a flag to specify a multiplexer uses the HTX
A multiplexer must now set the flag MX_FL_HTX when it uses the HTX to structure
the data exchanged with channels. The muxes h1 and h2 set this flag. Of course,
for the mux h2, it is set on h2_htx_ops only.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9b579106fe MINOR: mux-h2: Add a mux_ops dedicated to the HTX mode
Instead of using the same mux_ops structure for the legacy HTTP mode and the HTX
mode, a dedicated mux_ops is now used for the HTX mode. The same callbacks are
used for both, but the flags may differ depending on the mode used.
2019-04-12 22:06:53 +02:00
Christopher Faulet
7f36636c21 BUG/MINOR: mux-h1: Handle the flag CS_FL_KILL_CONN during a shutdown read/write
This flag is used to explicitly kill the connection when the CS is closed. It
may be set by tcp rules. It must be respected by the mux-h1.

This patch must be backported to 1.9.
2019-04-12 22:06:53 +02:00
Christopher Faulet
14c91cfdf8 MINOR: mux-h1: Don't release the conn_stream anymore when h1s is destroyed
An H1 stream is destroyed when the conn_stream is detached or when the H1
connection is destroyed. In the first case, the CS is released by the caller. In
the second one, because the connection is closed, no CS is attached anymore. In
both cases, there is no reason to release the conn_stream in h1s_destroy().
2019-04-12 22:06:53 +02:00
Christopher Faulet
b992af00b6 MEDIUM: mux-h1: Simplify the connection mode management by sanitizing headers
Connection headers are now sanitized during the parsing and the formatting. This
means "close" and "keep-alive" values are always removed but the right flags are
set. This way, the client side and the server side are independent of each
other. On the input side, after the parsing, neither "close" nor "keep-alive"
values remain. So on the output side, if we find one of these values in a
connection header, it means it was explicitly added by HAProxy, so it overrides
the other rules, if applicable. Always sanitizing the output is also a way to
simplify the conditions to update the connection header. Concretely, only
additions of "close" or "keep-alive" values remain, depending on the case.

No need to backport this patch.
2019-04-12 22:06:53 +02:00
Christopher Faulet
a51ebb7f56 MEDIUM: h1: Add an option to sanitize connection headers during parsing
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.

If this flag is set, during the parsing, the function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_connection_header().

No need to backport this patch
2019-04-12 22:06:53 +02:00
Christopher Faulet
b829f4c726 MINOR: stats/htx: Don't add "Connection: close" header anymore in stats responses
On the client side, as far as possible, we will try to keep the connection
alive. So, in most cases, this header will be removed anyway, so it is better
not to add it at all. If the connection must finally be closed, the header will
be added by the mux h1.

No need to backport this patch.
2019-04-12 22:06:53 +02:00
Christopher Faulet
cdc90e9175 MINOR: mux-h1: Simplify handling of 1xx responses
Because of previous changes on http tunneling, the synchronization of the
transaction can be simplified. Only the check on intermediate messages remains
and it only concerns the response path.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease future backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
c62c2b9d92 BUG/MEDIUM: htx: Fix the process of HTTP CONNECT with h2 connections
In HTX, the HTTP tunneling does not work if h1 and h2 are mixed (an h1 client
sending requests to an h2 server or the opposite) because the h1 multiplexer
always adds an EOM before switching it to tunnel mode. The h2 multiplexer
interprets it as an end of stream, closing the stream as for any other
transaction.

To make it work again, we need to switch to the tunnel mode without emitting any
EOM block. Because of that, HTX analyzers have been updated to switch the
transaction to tunnel mode before the end of the message (because there is no
end of message...).

To be consistent, the protocol switching is also handled the same way even
though the 101 responses are not supported in h2.

This patch must be backported to 1.9.
2019-04-12 22:06:53 +02:00
Christopher Faulet
03b9d8ba4a MINOR: proto_htx: Don't adjust transaction mode anymore in HTX analyzers
Because the option http-tunnel is now ignored in HTX, there is no longer any
need to adjust the transaction mode in HTX analyzers. A channel can still be
switched to the tunnel mode for legitimate cases (HTTP CONNECT or switching
protocols). So the function htx_adjust_conn_mode() is now useless.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease future backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
6c9bbb2265 MEDIUM: htx: Deprecate the option 'http-tunnel' and ignore it in HTX
The option http-tunnel disables any HTTP processing past the first
transaction. In HTX, it works for full h1 transactions. As for the legacy HTTP,
it is a workaround, but it works. However, it is impossible to make it work with
an h2 connection. In such a case, it has no effect: the stream is closed at the
end of the transaction. So to avoid any inconsistencies between h1 and h2
connections, this option is now always ignored when the HTX is enabled. It is
also a good opportunity to deprecate an old and ugly option. A warning is
emitted during HAProxy startup to encourage users to remove this option.

Note that in legacy HTTP, this option only works with full h1 transactions
too. If an h2 connection is established on a frontend with this option enabled,
it will have no effect at all. But we keep it for the legacy HTTP for
compatibility purposes. It will be removed with the legacy HTTP.

In short, if you really (REALLY) have to use it, it will only work for
legacy HTTP frontends with H1 clients.

The documentation has been updated accordingly.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease future backports.
2019-04-12 22:06:53 +02:00
Christopher Faulet
f1449b785e BUG/MEDIUM: htx: Don't crush blocks payload when append is done on a data block
If there is a data block when a header block is added to an HTX message, its
payload will be inserted after the data block's payload, but its index will be
moved before the EOH block. So at this stage, if a new data block is added, we
will try to append its payload to the last data block (because it is also the
tail). Thus the payload of that header block will be crushed.

This cannot happen if the payloads wrap, thanks to the previous fix. But it does
happen when the tail is not the front too. So now, in this case, we add a new
block instead of appending.

This patch must be backported to 1.9.
2019-04-12 22:06:45 +02:00