haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-08 08:07:10 +02:00

Author	SHA1	Message	Date
Willy Tarreau	7ea393d95e	REORG: include: move connection.h to haproxy/connection{,-t}.h The type file is becoming a mess, half of it is for the proxy protocol, another good part describes conn_streams and mux ops, it would deserve being split again. At least it was reordered so that elements are easier to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved to compat.h as it's said to be the value for Linux.	2020-06-11 10:18:58 +02:00
Willy Tarreau	fc77454aff	REORG: include: move proto_tcp.h to haproxy/proto_tcp.h There was no type file. This one really is trivial. A few missing includes were added to satisfy the exported functions prototypes.	2020-06-11 10:18:58 +02:00
Willy Tarreau	e6ce10be85	REORG: include: move sample.h to haproxy/sample{,-t}.h This one is particularly tricky to move because everyone uses it and it depends on a lot of other types. For example it cannot include arg-t.h and must absolutely only rely on forward declarations to avoid dependency loops between vars -> sample_data -> arg. In order to address this one, it would be nice to split the sample_data part out of sample.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	762d7a5117	REORG: include: move frontend.h to haproxy/frontend.h There was no type file for this one, it only contains frontend_accept().	2020-06-11 10:18:57 +02:00
Willy Tarreau	0f6ffd652e	REORG: include: move fd.h to haproxy/fd{,-t}.h A few includes were missing in each file. A definition of struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was moved to defaults.h. Stdio used to be silently inherited from whatever path but it's needed for list_pollers() which takes a FILE* and which can thus not be forward-declared.	2020-06-11 10:18:57 +02:00
Willy Tarreau	7a00efbe43	REORG: include: move common/namespace.h to haproxy/namespace{,-t}.h The type was moved out as it's used by standard.h for netns_entry. Instead of just being a forward declaration when not used, it's an empty struct, which makes gdb happier (the resulting stripped executable is the same).	2020-06-11 10:18:57 +02:00
Willy Tarreau	6131d6a731	REORG: include: move common/net_helper.h to haproxy/net_helper.h No change was necessary.	2020-06-11 10:18:57 +02:00
Willy Tarreau	58017eef3f	REORG: include: move the BUG_ON() code to haproxy/bug.h This one used to be stored into debug.h but the debug tools got larger and require a lot of other includes, which can't use BUG_ON() anymore because of this. It does not make sense and instead this macro should be placed into the lower includes and given its omnipresence, the best solution is to create a new bug.h with the few surrounding macros needed to trigger bugs and place assertions anywhere. Another benefit is that it won't be required to add include <debug.h> anymore to use BUG_ON, it will automatically be covered by api.h. No less than 32 occurrences were dropped. The FSM_PRINTF macro was dropped since not used at all anymore (probably since 1.6 or so).	2020-06-11 10:18:56 +02:00
Willy Tarreau	8d36697dee	REORG: include: move base64.h, errors.h and hash.h from common to haproxy/ These ones do not depend on any other file. One used to include haproxy/api.h but that was solely for stddef.h.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Christopher Faulet	3ab504f5ff	BUG/MEDIUM: connection: Ignore PP2 unique ID for stream-less connections It is possible to send a unique ID when the PROXY protocol v2 is used. It relies on the stream to do so. So we must be sure to have a stream. Locally initiated connections may not be linked to a stream. For instance, outgoing connections created by health checks have no stream. Moreover, the stream is not retrieved for mux-less connections (this bug will be fixed in another commit). Unfortunately, in make_proxy_line_v2() function, the stream is not tested before generating the unique-id. This bug leads to a segfault when a health check is performed for a server with the PROXY protocol v2 and the unique-id option enabled. It also crashes for servers using SSL connections with alpn. The bug was introduced by the commit `cf6e0c8a8` ("MEDIUM: proxy_protocol: Support sending unique IDs using PPv2") This patch should fix the issue #640. It must be backported to the same versions as the commit above.	2020-05-26 17:36:01 +02:00
Willy Tarreau	119e50e0cc	MINOR: connection: add pp2-never-send-local to support old PP2 behavior A bug in the PROXY protocol v2 implementation was present in HAProxy up to version 2.1, causing it to emit a PROXY command instead of a LOCAL command for health checks. This is particularly minor but confuses some servers' logs. Sadly, the bug was discovered very late and revealed that some servers which possibly only tested their PROXY protocol implementation against HAProxy fail to properly handle the LOCAL command, and permanently remain in the "down" state when HAProxy checks them. When this happens, it is possible to enable this global option to revert to the older (bogus) behavior for the time it takes to contact the affected components' vendors and get them fixed. This option is disabled by default and acts on all servers having the "send-proxy-v2" statement. Older versions were reverted to the old behavior and should not attempt to be fixed by default again. However a variant of this patch could possibly be implemented to ask to explicitly send LOCAL if needed by some servers. More context here: https://www.mail-archive.com/haproxy@formilux.org/msg36890.html https://www.mail-archive.com/haproxy@formilux.org/msg37218.html	2020-05-22 13:55:32 +02:00
Christopher Faulet	14cd316a1f	MAJOR: checks: Use the best mux depending on the protocol for health checks When a tcp-check connect rule is evaluated, the mux protocol corresponding to the health-check is chosen. So for TCP based health-checks, the mux-pt is used. For HTTP based health-checks, the mux-h1 is used. The connection is marked as private to be sure to not reuse regular HTTP connection for health-checks. Connections reuse will be evaluated later. The functions evaluating HTTP send rules and expect rules have been updated to be HTX compliant. The main change for users is that HTTP health-checks are now stricter on the HTTP message format. While before, the HTTP formatting and parsing were minimalist, now messages should be well formatted.	2020-04-27 10:41:07 +02:00
Willy Tarreau	02c88036a6	BUG/MINOR: connection: always send address-less LOCAL PROXY connections Commit `7f26391bc5` ("BUG/MINOR: connection: make sure to correctly tag local PROXY connections") revealed that some implementations do not properly ignore addresses in LOCAL connections (at least Dovecot was spotted). More context information in the thread below: https://www.mail-archive.com/haproxy@formilux.org/msg36890.html The patch above was using LOCAL on top of local addresses in order to minimize the risk of breakage but revealed worse than a clean fix. So let's partially revert it and send pure LOCAL connections instead now. After a bit of observation, this patch should be progressively backported to stable branches. However if it reveals new breakage, the backport of the patch above will have to be reverted from stable branches while other products work on fixing their code based on the master branch.	2020-04-14 16:02:50 +02:00
Ilya Shipitsin	ce7b00f926	CLEANUP: assorted typo fixes in the code and comments This is the fifth iteration of typo fixes	2020-03-31 17:09:35 +02:00
Olivier Houchard	f0d4dff25c	MINOR: connections: Make the "list" element a struct mt_list instead of list. Make the "list" element a struct mt_list, and explicitly use list_from_mt_list to get a struct list * where it is used as such, so that mt_list_for_each_entry will be usable with it.	2020-03-19 22:07:33 +01:00
Olivier Houchard	dc2f2753e9	MEDIUM: servers: Split the connections into idle, safe, and available. Revamp the server connection lists. We now have 3 lists : - idle_conns, which contains idling connections - safe_conns, which contains idling connections that are safe to use even for the first request - available_conns, which contains connections that are not idling, but can still accept new streams (those are HTTP/2 or fastcgi, and are always considered safe).	2020-03-19 22:07:33 +01:00
Tim Duesterhus	2b7f6c22d8	CLEANUP: connection: Stop directly setting an ist's .ptr Instead replace the complete `ist` by the value returned from `ist2`. This was noticed during review of issue #549.	2020-03-14 18:31:58 +01:00
Tim Duesterhus	a8692f3fe0	CLEANUP: connection: Add blank line after declarations in PP handling This adds the missing blank lines in `make_proxy_line_v2` and `conn_recv_proxy`. It also adjusts the type of the temporary variable used for the return value of `recv` to be `ssize_t` instead of `int`.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	cf6e0c8a83	MEDIUM: proxy_protocol: Support sending unique IDs using PPv2 This patch adds the `unique-id` option to `proxy-v2-options`. If this option is set a unique ID will be generated based on the `unique-id-format` while sending the proxy protocol v2 header and stored as the unique id for the first stream of the connection. This feature is meant to be used in `tcp` mode. It works on HTTP mode, but might result in inconsistent unique IDs for the first request on a keep-alive connection, because the unique ID for the first stream is generated earlier than the others. Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made available in TCP mode.	2020-03-13 17:26:43 +01:00
Tim Duesterhus	d1b15b6e9b	MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections This patch reads a proxy protocol v2 provided unique ID and makes it available using the `fc_pp_unique_id` fetch.	2020-03-13 17:25:23 +01:00
Tim Duesterhus	ba837ec367	CLEANUP: proxy_protocol: Use `size_t` when parsing TLVs Change `int` to `size_t` for consistency.	2020-03-06 11:16:19 +01:00
Tim Duesterhus	488ee7fb6e	BUG/MAJOR: proxy_protocol: Properly validate TLV lengths This patch fixes PROXYv2 parsing when the payload of the TCP connection is fused with the PROXYv2 header within a single recv() call. Previously HAProxy ignored the PROXYv2 header length when attempting to parse the TLV, possibly interpreting the first byte of the payload as a TLV type. This patch adds proper validation. It ensures that: 1. TLV parsing stops when the end of the PROXYv2 header is reached. 2. TLV lengths cannot exceed the PROXYv2 header length. 3. The PROXYv2 header ends together with the last TLV, not allowing for "stray bytes" to be ignored. A reg-test was added to ensure proper behavior. This patch tries to find the sweet spot between a small and easily backportable one, and a cleaner one that's more easily adaptable to older versions, hence why it merges the "if" and "while" blocks which causes a reindent of the whole block. It should be used as-is for versions 1.9 to 2.1, the block about PP2_TYPE_AUTHORITY should be dropped for 2.0 and the block about CRC32C should be dropped for 1.8. This bug was introduced when TLV parsing was added. This happened in commit `b3e54fe387`. This commit was first released with HAProxy 1.6-dev1. A similar issue was fixed in commit `7209c204bd`. This patch must be backported to HAProxy 1.6+.	2020-03-06 11:11:22 +01:00
Willy Tarreau	6f95f6e111	OPTIM: connection: disable receiving on disabled events when the run queue is too high In order to save a lot on syscalls, we currently don't disable receiving on a file descriptor anymore if its handler was already woken up. But if the run queue is huge and the poller collects a lot of events, this causes excess wakeups which take CPU time which is not used to flush these tasklets. This patch simply considers the run queue size to decide whether or not to stop receiving. Tests show that by stopping receiving when the run queue reaches ~16 times its configured size, we can still hold maximal performance in extreme situations like maxpollevents=20k for runqueue_depth=2, and still totally avoid calling epoll_event under moderate load using default settings on keep-alive connections.	2020-03-04 19:29:12 +01:00
Willy Tarreau	8de5c4fa15	MEDIUM: connection: only call ->wake() for connect() without I/O We used to call ->wake() for any I/O event for which there was no subscriber. But this is a problem because this causes massive wake() storms since we disabled fd_stop_recv() to save syscalls. The only reason for the io_available condition is to detect that an asynchronous connect() just finished and will not be handled by any registered event handler. Since we now properly handle synchronous connects, we can detect this situation by the fact that we had a success on conn_fd_check() and no requested I/O took over.	2020-03-04 19:29:12 +01:00
Willy Tarreau	667fefdc90	BUG/MEDIUM: connection: stop polling for sending when the event is ready With commit `065a025610` ("MEDIUM: connection: don't stop receiving events in the FD handler") we disabled a number of fd_stop_* in conn_fd_handler(), in order to wait for their respective handlers to deal with them. But it is not correct to do that for the send direction, as we may very well have nothing to send. This is visible when connecting in TCP mode to a server with no data to send, there's nobody anymore to disable the polling for the send direction. And it is logical, on the recv() path we know the system has data to deliver and that some code will be in charge of it. On the send direction we simply don't know if it was the result of a successful connect() or if there is still something to send. In any case we almost never fill the network buffer on a single send() after being woken up by the system, so disabling the FD immediately or much later will not change the number of operations. No backport is needed, this is 2.2-dev.	2020-03-04 19:29:12 +01:00
Willy Tarreau	065a025610	MEDIUM: connection: don't stop receiving events in the FD handler The remaining epoll_ctl() calls are exclusively caused by the disagreement between conn_fd_handler() and the mux receiving the data: the fd handler wants to stop after having woken up the tasklet, then the mux after receiving data wants to receive again. Given that they don't happen in the same poll loop when there are many FDs, this causes a lot of state changes. As suggested by Olivier, if the task is already scheduled for running, we don't need to disable the event because it's in the run queue, poll() cannot stop, and reporting it again will be harmless. What might happen however is that a sampling-based poller like epoll() would report many times the same event and has trouble getting others behind. But if it would happen, it would still indicate the run queue has plenty of pending operations, so it would in fact only displace the problem from the poller to the run queue, which doesn't seem to be worse (and in fact we do support priorities while the poller does not). By doing this change, the keep-alive test with 1k conns and 100k reqs completely gets rid of the per-request epoll_ctl changes, while still not causing extra recvfrom() : $ ./h1load -n 100000 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 200000 sendto 1 200000 recvfrom 1 10762 epoll_wait 1 3664 epoll_ctl 1 1999 recvfrom -1 In close mode, it didn't change anything, we're still in the optimal case (2 epoll per connection) : $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 203764 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 6091 epoll_wait 1 2994 recvfrom -1	2020-02-28 16:17:09 +01:00
Willy Tarreau	7e59c0a5e1	MEDIUM: connection: make the subscribe() call able to wakeup if ready There's currently an internal API limitation at the connection layer regarding conn_subscribe(). We must not subscribe if we haven't yet met EAGAIN or such a condition, so we sometimes force ourselves to read in order to meet this condition and being allowed to call subscribe. But reading cannot always be done (e.g. at the end of a loop where we cannot afford to retrieve new data and start again) so we instead perform a tasklet_wakeup() of the requester's io_cb. This is what is done in mux_h1 for example. The problem with this is that it forces a new receive when we're not necessarily certain we need one. And if the FD is not ready and was already being polled, it's a useless wakeup. The current patch improves the connection-level subscribe() so that it really manipulates the polling if the FD is marked not-ready, but instead schedules the argument tasklet for a wakeup if the FD is ready. This guarantees we'll wake this tasklet up in any case once the FD is ready, either immediately or after polling. By doing so, a test on pure close mode shows we cut in half the number of epoll_ctl() calls and almost eliminate failed recvfrom(): $ ./h1load -n 100000 -r 1 -t 4 -c 1000 -T 20 -F 127.0.0.1:8001/?s=1k/t=20 before: 399464 epoll_ctl 1 200007 recvfrom 1 200000 sendto 1 100000 recvfrom -1 7508 epoll_wait 1 after: 205739 epoll_ctl 1 200000 sendto 1 200000 recvfrom 1 6084 epoll_wait 1 2651 recvfrom -1 On keep-alive there is no change however.	2020-02-28 16:17:09 +01:00
Willy Tarreau	7f26391bc5	BUG/MINOR: connection: make sure to correctly tag local PROXY connections As reported in issue #511, when sending an outgoing local connection (e.g. health check) we must set the "local" tag and not a "proxy" tag. The issue comes from historic support on v1 which required to steal the address on the outgoing connection for such ones, creating confusion in the v2 code which believes it sees the incoming connection. In order not to risk breaking existing setups which might rely on seeing the LB's address in the connection's source field, let's just change the connection type from proxy to local and keep the addresses. The protocol spec states that for local, the addresses must be ignored anyway. This problem has always existed, this can be backported as far as 1.5, though it's probably not a good idea to change such setups, thus maybe 2.0 would be more reasonable.	2020-02-25 10:31:37 +01:00
Willy Tarreau	1ac83af560	CLEANUP: connection: use read_u32() instead of a cast in the netscaler parser The netscaler protocol parser used to involve a few casts from char to (uint32_t*), let's properly use u32 for this instead.	2020-02-25 10:24:51 +01:00
Willy Tarreau	5d4d1806db	CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv} This marks the end of the transition from the connection polling states introduced in 1.5-dev12 and the subscriptions that arrived in 1.9. The socket layer can now safely use its FD while all upper layers rely exclusively on subscriptions. These old functions were removed. Some may deserve some renaming to improve clarity though. The single call to conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling() which already does the same.	2020-02-21 11:21:12 +01:00
Willy Tarreau	d1d14c3157	MINOR: connection: remove the last calls to conn_xprt_{want,stop}_* The last few calls to conn_xprt_{want,stop}_{recv,send} in the central connection code were replaced with their strictly exact equivalent fd_*, adding the call to conn_ctrl_ready() when it was missing.	2020-02-21 11:21:12 +01:00
Willy Tarreau	19bc201c9f	MEDIUM: connection: remove the intermediary polling state from the connection Historically we used to require that the connections held the desired polling states for the data layer and the socket layer. Then with muxes these were more or less merged into the transport layer, and now it happens that with all transport layers having their own state, the "transport layer state" as we have it in the connection (XPRT_RD_ENA, XPRT_WR_ENA) is only an exact copy of the underlying file descriptor state, but with a delay. All of this is causing some difficulties at many places in the code because there are still some locations which use the conn_want_* API to remain clean and only rely on connection, and count on a later collection call to conn_cond_update_polling(), while others need an immediate action and directly use the FD updates. Since our updates are now much cheaper, most of them being only an atomic test-and-set operation, and since our I/O callbacks are deferred, there's no benefit anymore in trying to "cache" the transient state change in the connection flags hoping to cancel them before they become an FD event. Better make such calls transparent indirections to the FD layer instead and get rid of the deferred operations which needlessly complicate the logic inside. This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE. A number of functions related to polling updates were either greatly simplified or removed. Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data were expected to be sent after a PROXY protocol or SOCKSv4 header. These ones were simply replaced with a check on the subscription which is where we ought to get the authoritative information from. Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts are the same. conn_stop_polling() and conn_xprt_stop_both() are the same as well. conn_cond_update_polling() only causes errors to stop polling. 
It also becomes way more obvious that muxes should not at all employ conn_xprt_{want|stop}_{recv,send}(), and that the call to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer is inappropriate, it ought to unsubscribe from reads instead. All of this definitely requires a serious cleanup.	2020-02-21 11:21:12 +01:00
Willy Tarreau	157788c7b1	BUG/MINOR: connection: correctly retry I/O on signals Issue #490 reports that there are a few bogus constructs of the famous "do { if (cond) continue; } while (0)" in the connection code, that are used to retry on I/O failures caused by receipt of a signal. Let's turn them into the more correct "while (1) { if (cond) continue; break }" instead. This may or may not be backported, it shouldn't have any visible effect.	2020-02-11 10:26:39 +01:00
William Dauchy	bd8bf67102	BUG/MINOR: connection: fix ip6 dst_port copy in make_proxy_line_v2 triggered by coverity; src_port is set earlier. this should fix github issue #467 Fixes: `7fec021537` ("MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed") This should be backported to 1.8. Signed-off-by: William Dauchy <w.dauchy@criteo.com> Reviewed-by: Tim Duesterhus <tim@bastelstu.be>	2020-01-28 13:02:58 +01:00
Willy Tarreau	49139cb914	MINOR: connection: don't check for CO_FL_SOCK_WR_SH too early in handshakes Just like with CO_FL_SOCK_RD_SH, we don't need to check for this flag too early because conn_sock_send() already does it. No error was lost so it was harmless, it was only useless code.	2020-01-23 19:01:37 +01:00
Willy Tarreau	d838fb840c	MINOR: connection: do not check for CO_FL_SOCK_RD_SH too early The handshake functions dedicated to proxy proto, netscaler and socks4 all check for this flag before proceeding. This is wrong, they must not do so and must instead perform the call to recv() then report the close. The reason for this is that the current construct managed to lose the CO_ER_CIP_EMPTY error code in case the connection was already shut, thus causing a race condition with some errors being reported correctly or as unknown depending on the timing.	2020-01-23 18:05:18 +01:00
Willy Tarreau	911db9bd29	MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE As mentioned in commit `c192b0ab95` ("MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of consistency on which flags are checked among L4/L6/HANDSHAKE depending on the code areas. A number of sample fetch functions only check for L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and many check both L4L6 and HANDSHAKE. This patch starts to make all of this more consistent by introducing a new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and reports whether the transport layer is ready or not. All inconsistent call places were updated to rely on this one each time the goal was to check for the readiness of the transport layer.	2020-01-23 16:34:26 +01:00
Willy Tarreau	c192b0ab95	MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_* Commit `477902bd2e` ("MEDIUM: connections: Get ride of the xprt_done callback.") broke the master CLI for a very obscure reason. It happens that short requests immediately terminated by a shutdown are properly received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not yet set on the connection since we've not passed again through conn_fd_handler() and it was not done in conn_complete_session(). While commit `a8a415d31a` ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session()") fixed the issue, such an accident may happen again as the root cause is deeper and actually comes down to the fact that CO_FL_CONNECTED is lazily set at various check points in the code but not every time we drop one wait bit. It is not the first time we face this situation. Originally this flag was used to detect the transition between WAIT_* and CONNECTED in order to call ->wake() from the FD handler. But since at least 1.8-dev1 with commit `7bf3fa3c23` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is always synchronized against the two others before being checked. Moreover, with the I/Os moved to tasklets, the decision to call the ->wake() function is performed after the I/Os in si_cs_process() and equivalent, which don't care about this transition either. So in essence, checking for CO_FL_CONNECTED has become a lazy way to check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always relies on someone else having synchronized it. This patch addresses it once and for all by killing this flag and only checking the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). 
This revealed a number of inconsistencies that were purposely not addressed here for the sake of bisectability: - while most places do check both L4+L6 and HANDSHAKE at the same time, some places like assign_server() or back_handle_st_con() and a few sample fetches looking for proxy protocol do check for L4+L6 but don't care about HANDSHAKE ; these ones will probably fail on TCP request session rules if the handshake is not complete. - some handshake handlers do validate that a connection is established at L4 but didn't clear CO_FL_WAIT_L4_CONN - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6 before declaring the mux ready while the snd_buf function also checks for the handshake's completion. Likely the former should validate the handshake as well and we should get rid of these extra tests in snd_buf. - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only later clear CO_FL_WAIT_L4_CONN. - xprt_handshake would set CO_FL_CONNECTED itself without actually clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if waiting for a pure Rx handshake. - most places in ssl_sock that were checking CO_FL_CONNECTED don't need to include the L4 check as an L6 check is enough to decide whether to wait for more info or not. It also becomes obvious when reading the test in si_cs_recv() that caused the failure mentioned above that once converted it doesn't make any sense anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete cannot happen since for CS_FL_EOS to be set, the other ones must have been validated. Some of these parts will still deserve further cleanup, and some of the observations above may induce some backports of potential bug fixes once totally analyzed in their context. The risk of breaking existing stuff is too high to blindly backport everything.	2020-01-23 14:41:37 +01:00
Olivier Houchard	a8a415d31a	BUG/MEDIUM: connections: Set CO_FL_CONNECTED in conn_complete_session(). We can't just assume conn_create_mux() will be called, and set CO_FL_CONNECTED, conn_complete_session() might be called synchronously if we're not using SSL, so we have no choice but to set CO_FL_CONNECTED in there. This should fix the recent breakage of the mcli reg tests.	2020-01-23 13:20:03 +01:00
Olivier Houchard	477902bd2e	MEDIUM: connections: Get ride of the xprt_done callback. The xprt_done_cb callback was used to defer some connection initialization until we're connected and the handshake are done. As it mostly consists of creating the mux, instead of using the callback, introduce a conn_create_mux() function, that will just call conn_complete_session() for frontend, and create the mux for backend. In h2_wake(), make sure we call the wake method of the stream_interface, as we no longer wakeup the stream task.	2020-01-22 18:56:05 +01:00
Olivier Houchard	1a9dbe58a6	BUG/MEDIUM: netscaler: Don't forget to allocate storage for conn->src/dst. In conn_recv_netscaler_cip(), don't forget to allocate conn->src and conn->dst, as those are now dynamically allocated. Not doing so results in getting a crash when using netscaler. This should fix github issue #460. This should be backported to 2.1.	2020-01-22 15:33:03 +01:00
Willy Tarreau	ee1a6fc943	MINOR: connection: make the last arg of subscribe() a struct wait_event* The subscriber used to be passed as a "void param" that was systematically cast to a struct wait_event. By now it appears clear that the subscribe() call at every layer is well defined and always takes a pointer to an event subscriber of type wait_event, so let's enforce this in the functions' prototypes, remove the intermediary variables used to cast it and clean up the comments to clarify what all these functions do in their context.	2020-01-17 18:30:37 +01:00
Willy Tarreau	7872d1fc15	MEDIUM: connection: merge the send_wait and recv_wait entries In practice all callers use the same wait_event notification for any I/O so instead of keeping specific code to handle them separately, let's merge them and it will allow us to create new events later.	2020-01-17 18:30:36 +01:00
Willy Tarreau	3381bf89e3	MEDIUM: connection: get rid of CO_FL_CURR_* flags These ones used to serve as a set of switches between CO_FL_SOCK_* and CO_FL_XPRT_*, and now that the SOCK layer is gone, they're always a copy of the last known CO_FL_XPRT_* ones that is resynchronized before I/O events by calling conn_refresh_polling_flags(), and that are pushed back to FDs when detecting changes with conn_xprt_polling_changes(). While these functions are not particularly heavy, what they do is totally redundant by now because the fd_want_*/fd_stop_*() actions already perform test-and-set operations to decide to create an entry or not, so they do the exact same thing that is done by conn_xprt_polling_changes(). As such it is pointless to call that one, and given that the only reason to keep CO_FL_CURR_* is to detect changes there, we can now remove them. Even if this does only save very few cycles, this removes a significant complexity that has been responsible for many bugs in the past, including the last one affecting FreeBSD. All tests look good, and no performance regressions were observed.	2020-01-17 17:45:12 +01:00
Willy Tarreau	0fbc318e24	CLEANUP: connection: merge CO_FL_NOTIFY_DATA and CO_FL_NOTIFY_DONE Both flags became equal in commit `82967bf9` ("MINOR: connection: adjust CO_FL_NOTIFY_DATA after removal of flags"), which already predicted the overlap between xprt_done_cb() and wake() after the removal of the DATA specific flags in 1.8. Let's simply remove CO_FL_NOTIFY_DATA since the "_DONE" version already covers everything and explains the intent well enough.	2019-12-27 16:38:47 +01:00
Willy Tarreau	cbcf77edb7	MINOR: connection: remove the double test on xprt_done_cb() The conn_fd_handler used to have one possible call to this function to notify about end of handshakes, and another one to notify about connection setup or error. But given that we're now only performing wakeup calls after connection validation, we don't need to keep two places to run this test since the conditions do not change in between. This patch merges the two tests into a single one and moves the CO_FL_CONNECTED test appropriately as well so that it's called even on the error path if needed.	2019-12-27 16:38:47 +01:00
Willy Tarreau	b2a7ab08a8	MINOR: connection: check for connection validation earlier In conn_fd_handler() we used to first give a chance to the send() callback to try to send data and validate the connection at the same time. But since 1.9 we do not call this callback anymore inline, it's scheduled. So let's validate the connection earlier so that all other decisions can be taken based on this confirmation. This may notably be useful to the xprt_done_cb() to know that the connection was properly validated.	2019-12-27 16:38:47 +01:00
Willy Tarreau	4970e5adb7	REORG: connection: move tcp_connect_probe() to conn_fd_check() The function is not TCP-specific at all, it covers all FD-based sockets so let's move this where other similar functions are, in connection.c, and rename it conn_fd_check().	2019-12-27 16:38:43 +01:00
Willy Tarreau	8081abe26a	CLEANUP: connection: conn->xprt is never NULL Let's remove this outdated test that's been there since 1.5. For quite some time now xprt hasn't been NULL anymore on an initialized connection.	2019-12-27 14:04:33 +01:00
Willy Tarreau	70ccb2cddf	BUG/MINOR: connection: only wake send/recv callbacks if the FD is active Since commit `c3df4507fa` ("MEDIUM: connections: Wake the upper layer even if sending/receiving is disabled.") the send/recv callbacks are called on I/O if the FD is ready and not just if it's active. This means that in some situations (e.g. send ready but nothing to send) we may needlessly enter the if() block, notice we're not subscribed, set io_available=1 and call the wake() callback even if we're just called for read activity. Better make sure we only do this when the FD is active in that direction. This may be backported as far as 2.0 though it should remain under observation for a few weeks first as the risk of harm by a mistake is higher than the trouble it should cause.	2019-12-27 14:04:33 +01:00
Willy Tarreau	ccf3f6d1d6	MEDIUM: connection: enable reading only once the connection is confirmed In order to address the absurd polling sequence described in issue #253, let's make sure we disable receiving on a connection until it's established. Previously with bottom-top I/Os, we were almost certain that a connection was ready when the first I/O was confirmed. Now we can enter various functions, including process_stream(), which will attempt to read something, will fail, and will then subscribe. But we don't want them to try to receive if we know the connection didn't complete. The first prerequisite for this is to mark the connection as not ready for receiving until it's validated. But we don't want to mark it as not ready for sending because we know that attempting I/Os later is extremely likely to work without polling. Once the connection is confirmed we re-enable recv readiness. In order for this event to be taken into account, the call to tcp_connect_probe() was moved earlier, between the attempt to send() and the attempt to recv(). This way if tcp_connect_probe() enables reading, we have a chance to immediately fall back to this and read the possibly pending data. Now the trace looks like the following. 
It's far from being perfect but we've already saved one recvfrom() and one epollctl(): epoll_wait(3, [], 200, 0) = 0 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 7 fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0 connect(7, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLOUT|EPOLLRDHUP, {u32=7, u64=7}}) = 0 epoll_wait(3, [{EPOLLOUT, {u32=7, u64=7}}], 200, 1000) = 1 connect(7, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("127.0.0.1")}, 16) = 0 getsockopt(7, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 sendto(7, "OPTIONS / HTTP/1.0\r\n\r\n", 22, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 22 epoll_ctl(3, EPOLL_CTL_MOD, 7, {EPOLLIN|EPOLLRDHUP, {u32=7, u64=7}}) = 0 epoll_wait(3, [{EPOLLIN|EPOLLRDHUP, {u32=7, u64=7}}], 200, 1000) = 1 getsockopt(7, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 getsockopt(7, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 recvfrom(7, "HTTP/1.0 200\r\nContent-length: 0\r\nX-req: size=22, time=0 ms\r\nX-rsp: id=dummy, code=200, cache=1, size=0, time=0 ms (0 real)\r\n\r\n", 16384, 0, NULL, NULL) = 126 close(7) = 0	2019-09-06 17:50:36 +02:00
Jerome Magnin	78891c7e71	BUILD: connection: silence gcc warning with extra parentheses Commit `8a4ffa0a` ("MINOR: send-proxy-v2: sends authority TLV according to TLV received") is missing parentheses around a variable assignment used as condition in an if statement, and gcc isn't happy about it.	2019-09-02 16:59:32 +02:00
Emmanuel Hocdet	8a4ffa0aab	MINOR: send-proxy-v2: sends authority TLV according to TLV received Since patch "7185b789", the authority TLV in a PROXYv2 header from a client connection is stored. This stored authority TLV should be taken into account when sending a PROXYv2 header, so that PROXYv2 can be chained without dropping it.	2019-08-31 12:28:33 +02:00
Geoff Simmons	7185b789f9	MINOR: connection: add the fc_pp_authority fetch -- authority TLV, from PROXYv2 Save the authority TLV in a PROXYv2 header from the client connection, if present, and make it available as fc_pp_authority. The fetch can be used, for example, to set the SNI for a backend TLS connection.	2019-08-28 17:16:20 +02:00
Willy Tarreau	ca79f59365	MEDIUM: connection: make sure all address producers allocate their address This commit places calls to sockaddr_alloc() at the places where an address is needed, and makes sure that the allocation is properly tested. This does not add too many error paths since connection allocations are already in the vicinity and share the same error paths. For the two cases where a clear_addr() was called, instead the address was not allocated.	2019-07-19 13:50:09 +02:00
Willy Tarreau	ff5d57b022	MINOR: connection: create a new pool for struct sockaddr_storage This pool will be used to allocate storage for source and destination addresses used in connections. Two functions sockaddr_{alloc,free}() were added and will have to be used everywhere an address is needed. These ones are safe for progressive replacement as they check that the existing pointer is set before replacing it. The pool is not yet used during allocation nor freeing. Also they operate on pointers to pointers so they will perform checks and replace values. The free one nulls the pointer.	2019-07-19 13:50:09 +02:00
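The pointer-to-pointer behaviour described in the entry above can be sketched as follows. This is only an illustration under assumptions: the `demo_sockaddr_alloc()` and `demo_sockaddr_free()` names are hypothetical, `calloc()`/`free()` stand in for the pool, and "checks that the existing pointer is set" is read here as "keep the address that is already there".

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical stand-in for an allocator working on a pointer to a pointer.
 * If *pp is already set, it is kept as-is; otherwise a zeroed address is
 * allocated. Returns the address, or NULL if the allocation failed.
 */
static struct sockaddr_storage *demo_sockaddr_alloc(struct sockaddr_storage **pp)
{
	assert(pp != NULL);
	if (*pp)
		return *pp;                 /* already allocated: nothing to do */
	*pp = calloc(1, sizeof(**pp));      /* plain heap instead of a pool */
	return *pp;
}

/* Releases the address if any, then nulls the caller's pointer so that a
 * second call is harmless.
 */
static void demo_sockaddr_free(struct sockaddr_storage **pp)
{
	assert(pp != NULL);
	free(*pp);                          /* free(NULL) is a no-op */
	*pp = NULL;
}
```

Because both helpers take the address of the caller's pointer, they can be introduced progressively: calling the allocator twice on the same field does not leak, and calling the free helper twice does not double-free.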
Willy Tarreau	226572f55f	MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to} This is in preparation for the switch to dynamic address allocation, let's migrate the code using the old fields to the pointers instead. Note that no extra check was added for now, the purpose is only to get the code to use the pointers and still work. In the proxy protocol message handling we make sure the addresses are properly allocated before declaring them unset.	2019-07-19 13:50:09 +02:00
Willy Tarreau	3c39a7d889	CLEANUP: connection: rename the wait_event.task field to .tasklet It's really confusing to call it a task because it's a tasklet and used in places where tasks and tasklets are used together. Let's rename it to tasklet to remove this confusion.	2019-06-14 14:42:29 +02:00
Olivier Houchard	03abf2d31e	MEDIUM: connections: Remove CONN_FL_SOCK* Now that the various handshakes come with their own XPRT, there's no need for the CONN_FL_SOCK* flags, and the conn_sock_want|stop functions, so garbage-collect them.	2019-06-05 18:03:38 +02:00
Olivier Houchard	fe50bfb82c	MEDIUM: connections: Introduce a handshake pseudo-XPRT. Add a new XPRT that is used when using non-SSL handshakes, such as proxy protocol or Netscaler, instead of taking care of it in conn_fd_handler(). This XPRT is installed when any of those is used, and it removes itself once the handshake is done. This should allow us to remove the distinction between CO_FL_SOCK* and CO_FL_XPRT*.	2019-06-05 18:03:38 +02:00
Olivier Houchard	ea8dd949e4	MEDIUM: ssl: Handle subscribe by itself. As the SSL code may have different needs than the upper layer, i.e. it may want to receive when the upper layer wants to write, instead of directly forwarding the subscribe to the underlying xprt, handle it ourselves. The SSL code will now remember any subscribe call, and wake the tasklet when it is ready for more I/O.	2019-06-05 18:03:38 +02:00
Olivier Houchard	c3df4507fa	MEDIUM: connections: Wake the upper layer even if sending/receiving is disabled. In conn_fd_handler(), if the fd is ready to send/recv, wake the upper layer even if we have CO_FL_ERROR, or if CO_FL_XPRT_RD_ENA/CO_FL_XPRT_WR_ENA isn't set. The only reason we should reach that point is if we had a shutw/shutr, and the upper layer may want to know about it, and is supposed to handle it anyway.	2019-06-05 18:03:38 +02:00
Willy Tarreau	694fcd0ee4	MINOR: connection: also stop receiving after a SOCKS4 response Just as is done in previous patch for all handshake handlers, also stop receiving after a SOCKS4 response was received. This one escaped the previous cleanup but must be done to keep the code safe.	2019-06-03 10:16:35 +02:00
Willy Tarreau	6499b9d996	BUG/MEDIUM: connection: fix multiple handshake polling issues Connection handshakes were rarely stacked on top of each other, but the recent experiments consisting in sending PROXY over SOCKS4 revealed a number of issues in these lower layers. First, each handler waiting for data MUST subscribe to recv events with __conn_sock_want_recv() and MUST unsubscribe from send events using __conn_sock_stop_send() to avoid any wake-up loop in case a previous sender has set this. Second, each handler waiting for sending MUST subscribe to send events with __conn_sock_want_send() and MUST unsubscribe from recv events using __conn_sock_stop_recv() to avoid any wake-up loop in case some data are available on the connection. Till now this was done at various random places, and in particular the cases where the FD was not ready for recv forgot to re-enable reading. Third, while senders can happily use conn_sock_send() which automatically handles EINTR, loops, and marks the FD as not ready with fd_cant_send(), there is no equivalent for recv so receivers facing EAGAIN MUST call fd_cant_recv() to enable polling. It could be argued that implementing an equivalent conn_sock_recv() function could be useful and more long-term proof than the current situation. Fourth, both types of handlers MUST unsubscribe from their respective events once they managed to do their job, and none may even play with __conn_xprt_*(). Here again this was lacking, and one surprising call to __conn_xprt_stop_recv() was present in the proxy protocol parser for TCP6 messages! Thanks to Alexander Liu for his help on this issue. This patch must be backported to 1.9 and possibly some older versions, though the SOCKS parts should be dropped.	2019-06-03 08:31:22 +02:00
Alexander Liu	2a54bb74cd	MEDIUM: connection: Upstream SOCKS4 proxy support Add the "socks4" and "check-via-socks4" server keywords. Implement the handshake with a SOCKS4 proxy server for TCP stream connections. See issue #82. The "SOCKS: A protocol for TCP proxy across firewalls" document can be found at "https://www.openssh.com/txt/socks4.protocol"; please refer to it. [wt: for now connecting to the SOCKS4 proxy over unix sockets is not supported, and mixing IPv4/IPv6 is discouraged; indeed, the control layer is unique for a connection and will be used both for connecting and for target address manipulation. As such it may for example report incorrect destination addresses in logs if the proxy is reached over IPv6]	2019-05-31 17:24:06 +02:00
Willy Tarreau	e5733234f6	CLEANUP: build: rename some build macros to use the USE_* ones We still have quite a number of build macros which are mapped 1:1 to a USE_something setting in the makefile but which have a different name. This patch cleans this up by renaming them to use the USE_something one, allowing to clean up the makefile and make it more obvious when reading the code what build option needs to be added. The following renames were done : ENABLE_POLL -> USE_POLL, ENABLE_EPOLL -> USE_EPOLL, ENABLE_KQUEUE -> USE_KQUEUE, ENABLE_EVPORTS -> USE_EVPORTS, TPROXY -> USE_TPROXY, NETFILTER -> USE_NETFILTER, NEED_CRYPT_H -> USE_CRYPT_H, CONFIG_HAP_CRYPT -> USE_LIBCRYPT, CONFIG_HAP_NS -> USE_NS, CONFIG_HAP_LINUX_SPLICE -> USE_LINUX_SPLICE, CONFIG_HAP_LINUX_TPROXY -> USE_LINUX_TPROXY, CONFIG_HAP_LINUX_VSYSCALL -> USE_LINUX_VSYSCALL	2019-05-22 19:47:57 +02:00
Olivier Houchard	35d116885d	MINOR: connections: Use BUG_ON() to enforce rules in subscribe/unsubscribe. It is not legal to subscribe if we're already subscribed, or to unsubscribe if we did not subscribe, so instead of trying to handle those cases, just assert that it's ok using the new BUG_ON() macro.	2019-05-14 18:18:25 +02:00
Willy Tarreau	c125cef6da	CLEANUP: ssl: make inclusion of openssl headers safe It's always a pain to have to stuff lots of #ifdef USE_OPENSSL around ssl headers, it even results in some of them appearing in a random order and multiple times just to benefit from an existing ifdef block. Let's make these headers safe for inclusion when USE_OPENSSL is not defined, they now perform the test themselves and do nothing if USE_OPENSSL is not defined. This allows to remove no less than 8 such ifdef blocks and make include blocks more readable.	2019-05-10 09:58:43 +02:00
Olivier Houchard	e179d0e88f	MEDIUM: connections: Provide a xprt_ctx for each xprt method. For most of the xprt methods, provide a xprt_ctx. This will be useful later when we'll want to be able to stack xprts. The init() method now has to create and provide the said xprt_ctx if needed.	2019-04-18 14:56:24 +02:00
Willy Tarreau	0ca24aa028	BUILD: connection: fix naming of ip_v field AIX defines ip_v as ip_ff.ip_fv in netinet/ip.h using a macro, and unfortunately we do have a local variable with such a name and which uses the same header file. Let's rename the variable to ip_ver to fix this.	2019-04-01 07:44:56 +02:00
Willy Tarreau	4f6516d677	CLEANUP: connection: rename subscription events values and event field The SUB_CAN_SEND/SUB_CAN_RECV enum values have been confusing a few times, especially when checking them on reading. After some discussion, it appears that calling them SUB_RETRY_SEND/SUB_RETRY_RECV more accurately reflects their purpose since these events may only appear after a first attempt to perform the I/O operation has failed or was not completed. In addition the wait_reason field in struct wait_event which carries them makes one think that a single reason may happen at once while it is in fact a set of events. Since the struct is called wait_event it makes sense that this field is called "events" to indicate it's the list of events we're subscribed to. Last, the values for SUB_RETRY_RECV/SEND were swapped so that value 1 corresponds to recv and 2 to send, as is done almost everywhere else in the code and in the shutdown() call.	2018-12-19 14:09:21 +01:00
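The point that "events" is a set rather than a single reason can be sketched as a small bitmask. This is an illustration only: the `demo_` names are hypothetical, and only the values 1 for recv and 2 for send are taken from the entry above.

```c
#include <assert.h>

/* Hypothetical mirror of the renamed values: 1 for recv, 2 for send. */
enum demo_sub_event {
	DEMO_SUB_RETRY_RECV = 1,
	DEMO_SUB_RETRY_SEND = 2,
};

/* Hypothetical subscriber: "events" holds the set of events subscribed to. */
struct demo_wait_event {
	int events;
};

/* Adds one or both events to the set. */
static void demo_subscribe(struct demo_wait_event *we, int event_type)
{
	assert((event_type & ~(DEMO_SUB_RETRY_RECV | DEMO_SUB_RETRY_SEND)) == 0);
	we->events |= event_type;
}

/* Removes one or both events from the set, leaving the others in place. */
static void demo_unsubscribe(struct demo_wait_event *we, int event_type)
{
	assert((event_type & ~(DEMO_SUB_RETRY_RECV | DEMO_SUB_RETRY_SEND)) == 0);
	we->events &= ~event_type;
}
```

With this layout a subscriber waiting for both directions simply has both bits set, which a field named "wait_reason" would not suggest.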
Jérôme Magnin	8657742092	MINOR: sample: add bc_http_major This adds the sample fetch bc_http_major. It returns the backend connection's HTTP version encoding, which may be 1 for HTTP/0.9 to HTTP/1.1 or 2 for HTTP/2.0. It is based on the on-wire encoding, and not the version present in the request header.	2018-12-07 15:34:39 +01:00
Willy Tarreau	8ceae72d44	MEDIUM: init: use initcall for all fixed size pool creations This commit replaces the explicit pool creation that are made in constructors with a pool registration. Not only this simplifies the pools declaration (it can be done on a single line after the head is declared), but it also removes references to pools from within constructors. The only remaining create_pool() calls are those performed in init functions after the config is parsed, so there is no more user of potentially uninitialized pool now. It has been the opportunity to remove no less than 12 constructors and 6 init functions.	2018-11-26 19:50:32 +01:00
Willy Tarreau	0108d90c6c	MEDIUM: init: convert all trivial registration calls to initcalls This switches explicit calls to various trivial registration methods for keywords, muxes or protocols from constructors to INITCALL1 at stage STG_REGISTER. All these calls have in common to consume a single pointer and return void. Doing this removes 26 constructors. The following calls were addressed : - acl_register_keywords - bind_register_keywords - cfg_register_keywords - cli_register_kw - flt_register_keywords - http_req_keywords_register - http_res_keywords_register - protocol_register - register_mux_proto - sample_register_convs - sample_register_fetches - srv_register_keywords - tcp_req_conn_keywords_register - tcp_req_cont_keywords_register - tcp_req_sess_keywords_register - tcp_res_cont_keywords_register - flt_register_keywords	2018-11-26 19:50:32 +01:00
Olivier Houchard	53216e7db9	MEDIUM: connections: Don't directly mess with the polling from the upper layers. Avoid using conn_xprt_want_send/recv, and totally nuke cs_want_send/recv, from the upper layers. The polling is now directly handled by the connection layer, it is activated on subscribe(), and unactivated once we got the event and we woke the related task.	2018-10-21 05:58:40 +02:00
Olivier Houchard	fa8aa867b9	MEDIUM: connections: Change struct wait_list to wait_event. When subscribing, we don't need to provide a list element, only the h2 mux needs it. So instead, Add a list element to struct h2s, and use it when a list is needed. This forces us to use the unsubscribe method, since we can't just unsubscribe by using LIST_DEL anymore. This patch is larger than it should be because it includes some renaming.	2018-10-11 15:34:39 +02:00
Olivier Houchard	83a0cd8a36	MINOR: connections: Introduce an unsubscribe method. As we don't know how subscriptions are handled, we can't just assume we can use LIST_DEL() to unsubscribe, so introduce a new method to mux and connections to do so.	2018-10-11 15:34:21 +02:00
Ilya Shipitsin	ca56fce8bd	BUG/MINOR: connection: avoid null pointer dereference in send-proxy-v2 found by coverity. [wt: this bug was introduced by commit `404d978` ("MINOR: add ALPN information to send-proxy-v2"). It might be triggered by a health check on a server using ppv2 or by an applet making use of such a server, if at all configurable]. This needs to be backported to 1.8.	2018-10-02 04:07:43 +02:00
Willy Tarreau	55e0da664e	BUILD: connection: silence a couple of null-deref build warnings at -Wextra These ones don't need to be checked either.	2018-09-20 11:42:15 +02:00
Olivier Houchard	7505f94f90	MEDIUM: h2: Don't use a wake() method anymore. Instead of having our wake() method called each time a fd event happens, just subscribe to recv/send events, and get our tasklet called when that happens. If any recv/send was possible, the equivalent of what h2_wake_cb() did will be done.	2018-09-12 17:37:55 +02:00
Olivier Houchard	af4021e680	MEDIUM: connections: Get rid of the recv() method. Remove the recv() method from mux and conn_stream. The goal is to always receive from the upper layers, instead of waiting for the connection layer. For now, recv() is still called from the wake() method, but that should change soon.	2018-09-12 17:37:55 +02:00
Olivier Houchard	4cf7fb148f	MEDIUM: connections/mux: Add a recv and a send+recv wait list. For struct connection, struct conn_stream, and for the h2 mux, add 2 new lists, one that handles waiters for recv, and one that handles waiters for recv and send. That way we can ask to subscribe for either recv or send.	2018-09-12 17:37:55 +02:00
Olivier Houchard	524344b4e0	MEDIUM: connections: Don't reset the polling flags in conn_fd_handler(). Resetting the polling flags at the end of conn_fd_handler() shouldn't be needed anymore, and it will create problem when we won't handle send/recv from conn_fd_handler() anymore.	2018-09-12 17:37:55 +02:00
Willy Tarreau	e215bba956	MINOR: connection: make conn_sock_drain() work for all socket families This patch improves the previous fix by implementing the socket draining code directly in conn_sock_drain() so that it always applies regardless of the protocol's family. Thus it gets rid of tcp_drain().	2018-08-24 14:45:46 +02:00
Willy Tarreau	b406b8708f	BUG/MEDIUM: connection: don't store recv() result into trash.data Cyril Bonté discovered that the proxy protocol randomly fails since commit `843b7cb` ("MEDIUM: chunks: make the chunk struct's fields match the buffer struct"). This is because we used to store recv()'s return code into trash.data which is now unsigned, so it never compares as negative against 0. Let's clean this up and test the result itself without storing it first. No backport is needed.	2018-08-22 05:28:32 +02:00
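The class of bug described in the entry above can be reproduced in isolation. The sketch below is not the HAProxy code: `struct demo_buf` and both helper names are hypothetical, and only the mechanism (a signed return value stored into an unsigned field before being tested) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical buffer whose length field is unsigned. */
struct demo_buf {
	size_t data;
};

/* Buggy pattern: the signed return value is stored into the unsigned field
 * first, so -1 turns into a huge positive value and the test is never true.
 */
static int demo_error_seen_after_store(ssize_t ret)
{
	struct demo_buf b;

	b.data = (size_t)ret;
	return b.data < 0;      /* always 0: an unsigned value is never negative */
}

/* Fixed pattern: test the signed result itself before storing anything. */
static int demo_error_seen_direct(ssize_t ret)
{
	return ret < 0;
}
```

Compilers can flag the first comparison as always false (gcc does so with -Wextra), which is one way this kind of regression may be caught early.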
Olivier Houchard	e1c6dbcd70	MINOR: connections/mux: Add the wait reason(s) to wait_list. Add a new element to the wait_list, that let us know which event(s) we are waiting on.	2018-08-16 17:29:53 +02:00
Olivier Houchard	ed0f207ef5	MINOR: connections: Get rid of txbuf. Remove txbuf from conn_stream. It is not used yet, and its only user will probably be the mux_h2, so it will be better suited in the struct h2s.	2018-08-16 17:29:51 +02:00
Olivier Houchard	511efeae7e	MINOR: connections: Make rcv_buf mandatory and nuke cs_recv(). Reintroduce h2_rcv_buf(), right now it just does what cs_recv() did, but should be modified later.	2018-08-16 17:23:44 +02:00
Christopher Faulet	32f61c0421	MINOR: mux: Unlink ALPN and multiplexers to rather speak of mux protocols Multiplexers are not necessarily associated to an ALPN. ALPN is a TLS extension, so it is not always defined or used. Instead, we now rather speak of multiplexer's protocols. So in this patch, there are no significant changes, some structures and functions are just renamed.	2018-08-08 09:54:22 +02:00
Christopher Faulet	063f786553	MINOR: conn_stream: add cs_send() as a default snd_buf() function This function is generic and is able to automatically transfer data from a buffer to the conn_stream's tx buffer. It does this automatically if the mux doesn't define another snd_buf() function. It cannot yet be used as-is with the conn_stream's txbuf without risking to lose data on close since conn_streams need to be orphaned for this.	2018-08-08 09:53:58 +02:00
Tim Duesterhus	7fec021537	MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed http-request set-src possibly creates a situation where src and dst are from different address families. Convert both addresses to IPv6 to avoid a PROXY UNKNOWN. This patch should be backported to haproxy 1.8.	2018-07-30 11:23:30 +02:00
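One way to bring an IPv4 address into the IPv6 family is the IPv4-mapped form (::ffff:a.b.c.d). The sketch below assumes that form; the entry above does not say which mapping the patch uses, and `demo_v4_to_v6()` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Builds the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4 address.
 * Both addresses are in network byte order, so the 4 bytes are copied as-is.
 */
static void demo_v4_to_v6(const struct in_addr *v4, struct in6_addr *v6)
{
	assert(v4 != NULL && v6 != NULL);
	memset(v6, 0, sizeof(*v6));             /* bytes 0..9 are zero */
	v6->s6_addr[10] = 0xff;                 /* bytes 10..11 are 0xffff */
	v6->s6_addr[11] = 0xff;
	memcpy(&v6->s6_addr[12], &v4->s_addr, 4); /* bytes 12..15: IPv4 address */
}
```

Once both src and dst are expressed in the same family, a single address block can describe the pair.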
Willy Tarreau	11c9aa424e	MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function This function is generic and is able to automatically transfer data from a conn_stream's rx buffer to the destination buffer. It does this automatically if the mux doesn't define another rcv_buf() function.	2018-07-20 19:21:43 +02:00
Olivier Houchard	910b2bc829	MEDIUM: connections/mux: Revamp the send direction. Totally nuke the "send" method, instead, the upper layer decides when it's time to send data, and if it's not possible, uses the new subscribe() method to be called when it can send data again.	2018-07-19 18:31:07 +02:00
Olivier Houchard	6ff2039d13	MINOR: connections/mux: Add a new "subscribe" method. Add a new "subscribe" method for connection, conn_stream and mux, so that upper layer can subscribe to them, to be called when the event happens. Right now, the only event implemented is "SUB_CAN_SEND", where the upper layer can register to be called back when it is possible to send data. The connection and conn_stream got a new "send_wait_list" entry, which required to move a few struct members around to maintain an efficient cache alignment (and actually this slightly improved performance).	2018-07-19 16:23:43 +02:00
Willy Tarreau	83061a820e	MAJOR: chunks: replace struct chunk with struct buffer Now all the code used to manipulate chunks uses a struct buffer instead. The functions are still called "chunk*", and some of them will progressively move to the generic buffer handling code as they are cleaned up.	2018-07-19 16:23:43 +02:00
Willy Tarreau	843b7cbe9d	MEDIUM: chunks: make the chunk struct's fields match the buffer struct Chunks are only a subset of a buffer (a non-wrapping version with no head offset). Despite this we still carry a lot of duplicated code between buffers and chunks. Replacing chunks with buffers would significantly reduce the maintenance efforts. This first patch renames the chunk's fields to match the name and types used by struct buffers, with the goal of isolating the code changes from the declaration changes. Most of the changes were made with spatch using this coccinelle script : @rule_d1@ typedef chunk; struct chunk chunk; @@ - chunk.str + chunk.area @rule_d2@ typedef chunk; struct chunk chunk; @@ - chunk.len + chunk.data @rule_i1@ typedef chunk; struct chunk chunk; @@ - chunk->str + chunk->area @rule_i2@ typedef chunk; struct chunk chunk; @@ - chunk->len + chunk->data Some minor updates to 3 http functions had to be performed to take size_t ints instead of ints in order to match the unsigned length here.	2018-07-19 16:23:43 +02:00
Emmanuel Hocdet	115df3e38e	MINOR: accept-proxy: support proxy protocol v2 CRC32c checksum When a proxy protocol v2 CRC32c TLV is received, check it before accepting the connection (as described in "doc/proxy-protocol.txt").	2018-03-21 05:04:01 +01:00
Emmanuel Hocdet	4399c75f6c	MINOR: proxy-v2-options: add crc32c This patch adds the crc32c option (PP2_TYPE_CRC32C) to proxy protocol v2. It computes the checksum of the proxy protocol v2 header as described in "doc/proxy-protocol.txt".	2018-03-21 05:04:01 +01:00
Emmanuel Hocdet	253c3b7516	MINOR: connection: add proxy-v2-options authority This patch adds the PP2_TYPE_AUTHORITY option to proxy protocol v2 when a TLS connection was negotiated. In this case, authority corresponds to the SNI.	2018-03-01 11:38:32 +01:00
Emmanuel Hocdet	fa8d0f1875	MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key This patch implements proxy protocol v2 options related to crypto information: ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and cert-key (PP2_SUBTYPE_SSL_KEY_ALG).	2018-03-01 11:38:28 +01:00
Emmanuel Hocdet	8c0c34b6e7	Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')" This reverts commit `82913e4f79`. TLV string value should not be null-terminated. This should be backported to 1.8.	2018-03-01 06:48:05 +01:00
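The revert above hinges on the TLV length counting only the string bytes. A minimal sketch of such an encoder follows, assuming the PROXY protocol v2 TLV layout (1 byte of type, 2 bytes of big-endian length, then the value); `demo_make_tlv()` is a hypothetical helper, not the function from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Writes a TLV carrying <str> into <dest> without the trailing '\0'.
 * Returns the number of bytes written, or 0 if <dest_len> is too small
 * or the string does not fit in a 16-bit length.
 */
static size_t demo_make_tlv(uint8_t *dest, size_t dest_len, uint8_t type, const char *str)
{
	size_t len;

	assert(dest != NULL && str != NULL);
	len = strlen(str);              /* no "+ 1": the value is not null-terminated */
	if (len > 0xffff || dest_len < 3 + len)
		return 0;
	dest[0] = type;
	dest[1] = (uint8_t)(len >> 8);  /* length, high byte */
	dest[2] = (uint8_t)(len & 0xff); /* length, low byte */
	memcpy(dest + 3, str, len);
	return 3 + len;
}
```

A receiver reading such a TLV has to rely on the length field alone and must not expect a terminator inside the value.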
Willy Tarreau	d80cb4ee13	MINOR: global: add some global activity counters to help debugging A number of counters have been added at special places helping better understanding certain bug reports. These counters are maintained per thread and are shown using "show activity" on the CLI. The "clear counters" commands also reset these counters. The output is sent as a single write(), which currently produces up to about 7 kB of data for 64 threads. If more counters are added, it may be necessary to write into multiple buffers, or to reset the counters. To backport to 1.8 to help collect more detailed bug reports.	2018-01-23 15:38:33 +01:00
Bertrand Jacquin	72fa1ec24e	MEDIUM: netscaler: add support for standard NetScaler CIP protocol It looks like two versions of the protocol exist as reported by Andreas Mahnke. This patch adds support for both legacy and standard CIP protocol according to NetScaler specifications.	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	a341a2f479	MEDIUM: netscaler: do not analyze original IP packet size The original information about the client is stored in the CIP encapsulated IP header, hence there is no need to consider the original IP packet length to determine if data are missing. Instead this change detects missing data if the remaining buffer is large enough to contain a minimal IP and TCP header and if the buffer has as much data as CIP is telling.	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	67de5a295c	MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header There is minimal gain in checking first the IP header length and then the TCP header length since we always want to capture information about both protocols. IPv4 length calculation was incorrect since IPv4 ip_len actually defines the total length of IPv4 header and following data.	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	43a66a96b3	BUG/MAJOR: netscaler: address truncated CIP header detection The buffer line is manually incremented in order to progress in the trash buffer but calculations are made omitting this manual offset. This leads to random packets being rejected with the following error: HTTP/1: Truncated NetScaler Client IP header received Instead, once the original IP header is found, use the IP header length without considering the CIP encapsulation.	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	c7cc69ac36	BUG/MEDIUM: netscaler: use the appropriate IPv6 header size IPv6 header has a fixed size of 40 bytes, not 20.	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	7d668f9e76	MINOR: netscaler: rename cip_len to clarify its usage cip_len was meant to be the length of the data encapsulated in the CIP protocol, i.e. the size of the IP and TCP headers	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	4b4c286bee	MINOR: netscaler: remove the use of cip_magic only used once	2017-12-20 07:04:07 +01:00
Bertrand Jacquin	b387591f32	MINOR: netscaler: respect syntax As per doc/coding-style.txt	2017-12-20 07:04:07 +01:00
Willy Tarreau	bafbe01028	CLEANUP: pools: rename all pool functions and pointers to remove this "2" During the migration to the second version of the pools, the new functions and pool pointers were all called "pool_something2()" and "pool2_something". Now there's no more pool v1 code and it's a real pain to still have to deal with this. Let's clean this up now by removing the "2" everywhere, and by renaming the pool heads "pool_head_something".	2017-11-24 17:49:53 +01:00
Emmanuel Hocdet	82913e4f79	BUG/MINOR: send-proxy-v2: string size must include ('\0') strlen() excludes the terminating null byte ('\0'), so add it.	2017-11-01 07:58:20 +01:00
Emmanuel Hocdet	571c7ac0a5	BUG/MINOR: send-proxy-v2: fix dest_len in make_tlv call Subtract already allocated size from buf_len.	2017-11-01 07:57:42 +01:00
Olivier Houchard	e2b40b9eab	MINOR: connection: introduce conn_stream This patch introduces a new struct conn_stream. It's the stream-side of a multiplexed connection. A pool is created and destroyed on exit. For now the conn_streams are not used at all.	2017-10-31 18:03:23 +01:00
Willy Tarreau	60ca10a372	MINOR: connection: report the major HTTP version from the MUX for logging (fc_http_major) A new sample fetch function reports either 1 or 2 for the on-wire encoding, to indicate if the request was received using the HTTP/1.x format or HTTP/2 format. Note that it reports the on-wire encoding, not the version presented in the request header. This will possibly have to evolve if it becomes necessary to report the encoding on the server side as well.	2017-10-31 18:03:23 +01:00
Willy Tarreau	2386be64ba	MINOR: connection: implement alpn registration of muxes Selecting a mux based on ALPN and the proxy mode will quickly become a pain. This commit provides new functions to register/lookup a mux based on the ALPN string and the proxy mode to make this easier. Given that we're not supposed to support a wide range of muxes, the lookup should not have any measurable performance impact.	2017-10-31 18:03:23 +01:00
Willy Tarreau	53a4766e40	MEDIUM: connection: start to introduce a mux layer between xprt and data For HTTP/2 and QUIC, we'll need to deal with multiplexed streams inside a connection. After quite a long brainstorming, it appears that the connection interface to the existing streams is appropriate just like the connection interface to the lower layers. In fact we need to have the mux layer in the middle of the connection, between the transport and the data layer. A mux can exist on two directions/sides. On the inbound direction, it instantiates new streams from incoming connections, while on the outbound direction it muxes streams into outgoing connections. The difference is visible on the mux->init() call : in one case, an upper context is already known (outgoing connection), and in the other case, the upper context is not yet known (incoming connection) and will have to be allocated by the mux. The session doesn't have to create the new streams anymore, as this is performed by the mux itself. This patch introduces this and creates a pass-through mux called "mux_pt" which is used for all new connections and which only calls the data layer's recv,send,wake() calls. One incoming stream is immediately created when init() is called on the inbound direction. There should not be any visible impact. Note that the connection's mux is purposely not set until the session is completed so that we don't accidentally run with the wrong mux. This must not cause any issue as the xprt_done_cb function is always called prior to using mux's recv/send functions.	2017-10-31 18:03:23 +01:00
Emmanuel Hocdet	404d978d40	MINOR: add ALPN information to send-proxy-v2 Send ALPN information in proxy-protocol-v2 if an ALPN has been negotiated.	2017-10-27 19:32:36 +02:00
Emmanuel Hocdet	01da571e21	MINOR: merge ssl_sock_get calls for log and ppv2 Merge ssl_sock_get_version and ssl_sock_get_proto_version. Change ssl_sock_get_cipher to be used in ppv2.	2017-10-27 19:32:36 +02:00
Emmanuel Hocdet	58118b43b1	MINOR: update proxy-protocol-v2 #define Report #define from doc/proxy-protocol.txt.	2017-10-27 19:32:36 +02:00
Willy Tarreau	916e12dcfb	MINOR: connection: add flag CO_FL_WILL_UPDATE to indicate when updates are granted In transport-layer functions (snd_buf/rcv_buf), it's very problematic never to know if polling changes made to the connection will be propagated or not. This has led to some conn_cond_update_polling() calls being placed at a few places to cover both the cases where the function is called from the upper layer and when it's called from the lower layer. With the arrival of the MUX, this becomes even more complicated, as the upper layer will not have to manipulate anything from the connection layer directly and will not have to push such updates directly either. But the snd_buf functions will need to see their updates committed when called from upper layers. The solution here is to introduce a connection flag set by the connection handler (and possibly any other similar place) indicating that the caller is committed to applying such changes on return. This way, the called functions will be able to apply such changes by themselves before leaving when the flag is not set, and the upper layer will not have to care about that anymore.	2017-10-25 15:52:41 +02:00
Olivier Houchard	1a0545f3d7	REORG: connection: rename CO_FL_DATA_* -> CO_FL_XPRT_* These flags are not exactly for the data layer, they instead indicate what is expected from the transport layer. Since we're going to split the connection between the transport and the data layers to insert a mux layer, it's important to have a clear idea of what each layer does. All function conn_data_* used to manipulate these flags were renamed to conn_xprt_*.	2017-10-22 09:54:15 +02:00
Willy Tarreau	8e3c6ce75a	MEDIUM: connection: get rid of data->init() which was not for data The ->init() callback of the connection's data layer was only used to complete the session's initialisation since sessions and streams were split apart in 1.6. The problem is that it creates a big confusion in the layers' roles as the session has to register a dummy data layer when waiting for a handshake to complete, then hand it off to the stream which will replace it. The real need is to notify that the transport has finished initializing. This should enable a better splitting between these layers. This patch thus introduces a connection-specific callback called xprt_done_cb() which informs about handshake successes or failures. With this, data->init() can disappear, CO_FL_INIT_DATA as well, and we don't need to register a dummy data->wake() callback to be notified of errors.	2017-08-30 07:04:04 +02:00
Willy Tarreau	585744bf2e	REORG/MEDIUM: connection: introduce the notion of connection handle Till now connections used to rely exclusively on file descriptors. It was planned in the past that alternative solutions would be implemented, leading to member "union t" presenting sock.fd only for now. With QUIC, the connection will need to continue to exist but will not rely on a file descriptor but a connection ID. So this patch introduces a "connection handle" which is either a file descriptor or a connection ID, to replace the existing "union t". We've now removed the intermediate "struct sock" which was never used. There is no functional change at all, though the struct connection was inflated by 32 bits on 64-bit platforms due to alignment.	2017-08-24 19:30:04 +02:00
Willy Tarreau	57ec32fb99	MINOR: connection: send data before receiving It's more efficient this way, as it allows to flush a send buffer before receiving data in the other one. This can lead to a slightly faster buffer recycling, thus slightly less memory and a small performance increase by using a hotter cache.	2017-06-27 14:38:02 +02:00
Willy Tarreau	2686dcad1e	CLEANUP: connection: remove unused CO_FL_WAIT_DATA Very early in the connection rework process leading to v1.5-dev12, commit `56a77e5` ("MEDIUM: connection: complete the polling cleanups") marked the end of use for this flag which since was never set anymore, but it continues to be tested. Let's kill it now.	2017-06-02 15:50:27 +02:00
Willy Tarreau	9fa1ee61cc	MEDIUM: connection: don't test for CO_FL_WAKE_DATA This flag is always set when we end up here, for each and every data layer (idle, stream-interface, checks), and continuing to test it leaves a big risk of forgetting to set it as happened once already before 1.5-dev13. It could make sense to backport this into stable branches as part of the connection flag fixes, after some cool down period.	2017-03-19 12:17:35 +01:00
Willy Tarreau	3c0cc49d30	BUG/MEDIUM: connection: ensure to always report the end of handshakes Despite the previous commit working fine on all tests, it's still not sufficient to completely address the problem. If the connection handler is called with an event validating an L4 connection but some handshakes remain (eg: accept-proxy), it will still wake the function up, which will not report the activity, and will not detect a change once the handshake is complete so it will not notify the ->wake() handler. In fact the only reason why the ->wake() handler is still called here is because after dropping the last handshake, we try to call ->recv() and ->send() in turn and change the flags in order to detect a data activity. But if for any reason the data layer is not interested in reading nor writing, it will not get these events. A cleaner way to address this is to call the ->wake() handler only on definitive status changes (shut, error), on real data activity, and on a complete connection setup, measured as CONNECTED with no more handshake pending. It could be argued that the handshake flags have to be made part of the condition to set CO_FL_CONNECTED but that would currently break a part of the health checks. Also a handshake could appear at any moment even after a connection is established so we'd lose the ability to detect a second end of handshake. 
For now the situation around CO_FL_CONNECTED is not clean : - session_accept() only sets CO_FL_CONNECTED if there's no pending handshake ; - conn_fd_handler() will set it once L4 and L6 are complete, which will do what session_accept() above refrained from doing even if an accept_proxy handshake is still pending ; - ssl_sock_infocbk() and ssl_sock_handshake() consider that a handshake performed with CO_FL_CONNECTED set is a renegotiation ; => they should instead filter on CO_FL_WAIT_L6_CONN - all ssl_fc_* sample fetch functions wait for CO_FL_CONNECTED before accepting to fetch information => they should also get rid of any pending handshake - smp_fetch_fc_rcvd_proxy() uses !CO_FL_CONNECTED instead of CO_FL_ACCEPT_PROXY - health checks (standard and tcp-checks) don't check for HANDSHAKE and may report a successful check based on CO_FL_CONNECTED while not yet done (eg: send buffer full on send_proxy). This patch aims at solving some of these side effects in a backportable way before this is reworked in depth : - we need to call ->wake() to report connection success, measure connection time, notify that the data layer is ready and update the data layer after activity ; this has to be done either if we switch from pending {L4,L6}_CONN to nothing with no handshakes left, or if we notice some handshakes were pending and are now done. - we document that CO_FL_CONNECTED exactly means "L4 connection setup confirmed at least once, L6 connection setup confirmed at least once or not necessary, all this regardless of any possibly remaining handshakes or future L6 negotiations". This patch also renames CO_FL_CONN_STATUS to the more explicit CO_FL_NOTIFY_DATA, and works around the previous flags trick consisting in setting an impossible combination of flags to notify the data layer, by simply clearing the current flags. This fix should be backported to 1.7, 1.6 and 1.5.	2017-03-19 12:06:18 +01:00
Willy Tarreau	7bf3fa3c23	BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer Matthias Fechner reported a regression in 1.7.3 brought by the backport of commit `819efbf` ("BUG/MEDIUM: tcp: don't poll for write when connect() succeeds"), causing some connections to fail to establish once in a while. While this commit itself was a fix for a bad sequencing of connection events, it in fact unveiled a much deeper bug going back to the connection rework era in v1.5-dev12 : `8f8c92f` ("MAJOR: connection: add a new CO_FL_CONNECTED flag"). It's worth noting that in a lab reproducing a similar environment as Matthias' about only 1 every 19000 connections exhibit this behaviour, making the issue not so easy to observe. A trick to make the problem more observable consists in disabling non-blocking mode on the socket before calling connect() and re-enabling it later, so that connect() always succeeds. Then it becomes 100% reproducible. The problem is that this CO_FL_CONNECTED flag is tested after deciding to call the data layer (typically the stream interface but might be a health check as well), and that the decision to call the data layer relies on a change of one of the flags covered by the CO_FL_CONN_STATE set, which is made of CO_FL_CONNECTED among others. Before the fix above, this bug couldn't appear with TCP but it could appear with Unix sockets. Indeed, connect() was always considered blocking so the CO_FL_WAIT_L4_CONN connection flag was always set, and polling for write events was always enabled. This used to guarantee that the conn_fd_handler() could detect a change among the CO_FL_CONN_STATE flags. 
Now with the fix above, if a connect() immediately succeeds for non-ssl connection with send-proxy enabled, and no data in the buffer (thus TCP mode only), the CO_FL_WAIT_L4_CONN flag is not set, the lack of data in the buffer doesn't enable polling flags for the data layer, the CO_FL_CONNECTED flag is not set due to send-proxy still being pending, and once send-proxy is done, its completion doesn't cause the data layer to be woken up due to the fact that CO_FL_CONNECTED is still not present and that the CO_FL_SEND_PROXY flag is not watched in CO_FL_CONN_STATE. Then no progress is made when data are received from the client (and attempted to be forwarded), because a CF_WRITE_NULL (or CF_WRITE_PARTIAL) flag is needed for the stream-interface state to turn from SI_ST_CON to SI_ST_EST, allowing ->chk_snd() to be called when new data arrive. And the only way to set this flag is to call the data layer of course. After the connect timeout, the connection gets killed and if in the meantime some data have accumulated in the buffer, the retry will succeed. This patch fixes this situation by simply placing the update of CO_FL_CONNECTED where it should have been, before the check for a flag change needed to wake up the data layer and not after. This fix must be backported to 1.7, 1.6 and 1.5. Versions not having the patch above are still affected for unix sockets. Special thanks to Matthias Fechner who provided a very detailed bug report with a bisection designating the faulty patch, and to Olivier Houchard for providing full access to a pretty similar environment where the issue could first be reproduced.	2017-03-14 22:04:06 +01:00
Emeric Brun	4f60301235	MINOR: connection: add sample fetch "fc_rcvd_proxy" fc_rcvd_proxy : boolean Returns true if the client initiated the connection with a PROXY protocol header. A flag is added on the struct connection if a PROXY header is successfully parsed.	2017-01-06 11:59:17 +01:00
Willy Tarreau	13e1410f8a	MINOR: connection: add a minimal transport layer registration system There are still a lot of #ifdef USE_OPENSSL in the code (still 43 occurrences) because we never know if we can directly access ssl_sock or not. This patch attacks the problem differently by providing a way for transport layers to register themselves and for users to retrieve the pointer. Unregistered transport layers will point to NULL so it will be easy to check if SSL is registered or not. The mechanism is very inexpensive as it relies on a two-entries array of pointers, so the performance will not be affected.	2016-12-22 23:26:38 +01:00
David Carlier	3015a2eebd	CLEANUP: connection: using internal struct to hold source and dest port. Originally, tcphdr's source and dest from Linux were used to get the source and destination ports, which led to a build issue on BSD OSes. To avoid side problems related to network headers, we just use an internal struct as we need only those two fields.	2016-07-05 14:43:05 +02:00
Bertrand Jacquin	93b227db95	MINOR: listener: add the "accept-netscaler-cip" option to the "bind" keyword When NetScaler application switch is used as L3+ switch, information regarding the original IP and TCP headers is lost as a new TCP connection is created between the NetScaler and the backend server. NetScaler provides a feature to insert in the TCP data the original data that can then be consumed by the backend server. Specifications and documentation from NetScaler: https://support.citrix.com/article/CTX205670 https://www.citrix.com/blogs/2016/04/25/how-to-enable-client-ip-in-tcpip-option-of-netscaler/ When CIP is enabled on the NetScaler, then a TCP packet is inserted just after the TCP handshake. This is composed as: - CIP magic number : 4 bytes Both sender and receiver have to agree on a magic number so that they both handle the incoming data as a NetScaler Client IP insertion packet. - Header length : 4 bytes Defines the length of the remaining data. - IP header : >= 20 bytes if IPv4, 40 bytes if IPv6 Contains the header of the last IP packet sent by the client during TCP handshake. - TCP header : >= 20 bytes Contains the header of the last TCP packet sent by the client during TCP handshake.	2016-06-20 23:02:47 +02:00
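The CIP layout described in the commit above (a 4-byte magic number, then a 4-byte header length, then the IP and TCP headers) can be illustrated with a minimal sketch. This is not HAProxy's parser: the names `cip_prefix`, `read_be32` and `cip_parse_prefix` are hypothetical, and network byte order for both fields is an assumption that the commit message does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the first 8 bytes of a CIP packet. */
struct cip_prefix {
    uint32_t magic;   /* value agreed on by sender and receiver */
    uint32_t hdr_len; /* length of the IP + TCP headers that follow */
};

/* Read a 32-bit big-endian value (byte order is an assumption here). */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 on success, 0 if more data is needed, -1 on a malformed prefix. */
static int cip_parse_prefix(const uint8_t *buf, size_t len,
                            uint32_t expected_magic, struct cip_prefix *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 8)
        return 0; /* magic + length not fully received yet */

    out->magic = read_be32(buf);
    out->hdr_len = read_be32(buf + 4);

    if (out->magic != expected_magic)
        return -1; /* not a CIP packet for this listener */

    /* Smallest case: 20-byte IPv4 header + 20-byte TCP header. */
    if (out->hdr_len < 40)
        return -1;

    return 1;
}
```

A caller would then wait until `8 + hdr_len` bytes are available before reading the encapsulated IP and TCP headers.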
Vincent Bernat	6e61589573	BUG/MAJOR: fix listening IP address storage for frontends When compiled with GCC 6, the IP address specified for a frontend was ignored and HAProxy was listening on all addresses instead. This is caused by an incomplete copy of a "struct sockaddr_storage". With the GNU Libc, "struct sockaddr_storage" is defined as this: struct sockaddr_storage { sa_family_t ss_family; unsigned long int __ss_align; char __ss_padding[(128 - (2 * sizeof (unsigned long int)))]; }; Doing an aggregate copy (ss1 = ss2) is different than using memcpy(): only members of the aggregate have to be copied. Notably, padding can be or not be copied. In GCC 6, some optimizations use this fact and if a "struct sockaddr_storage" contains a "struct sockaddr_in", the port and the address are part of the padding (between sa_family and __ss_align) and can be not copied over. Therefore, we replace any aggregate copy by a memcpy(). There is another place using the same pattern. We also fix a function receiving a "struct sockaddr_storage" by copy instead of by reference. Since it only needs a read-only copy, the function is converted to request a reference.	2016-05-19 10:43:24 +02:00
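The memcpy()-based copy described in the commit above can be sketched as follows. The helper names `copy_sockaddr` and `set_ipv4` are hypothetical and only illustrate the pattern; they are not the functions touched by the actual patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: copy a socket address byte for byte.
 * memcpy() copies the whole object, including bytes that an aggregate
 * assignment (*dst = *src) is allowed to treat as padding. */
static void copy_sockaddr(struct sockaddr_storage *dst,
                          const struct sockaddr_storage *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
}

/* Hypothetical helper: store an IPv4 address and port in a sockaddr_storage. */
static void set_ipv4(struct sockaddr_storage *ss, const char *ip,
                     unsigned short port)
{
    struct sockaddr_in in4;

    memset(&in4, 0, sizeof(in4));
    in4.sin_family = AF_INET;
    in4.sin_port = htons(port);
    inet_pton(AF_INET, ip, &in4.sin_addr);

    /* The sockaddr_in overlays the start of the storage area. */
    memset(ss, 0, sizeof(*ss));
    memcpy(ss, &in4, sizeof(in4));
}
```

Taking `src` by const pointer also mirrors the second part of the fix, where a function that received the structure by copy was converted to take a reference.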
Willy Tarreau	7a798e5d6b	CLEANUP: fix inconsistency between fd->iocb, proto->accept and accept() There's quite some inconsistency in the internal API. listener_accept() which is the main accept() function returns void but is declared as int in the include file. It's assigned to proto->accept() for all stream protocols where an int is expected but the result is never checked (nor is it documented by the way). This proto->accept() is in turn assigned to fd->iocb() which is supposed to return an int composed of FD_WAIT_* flags, but which is never checked either. So let's fix all this mess : - nobody checks accept()'s return - nobody checks iocb()'s return - nobody sets a return value => let's mark all these functions void and keep the current ones intact. Additionally we now include listener.h from listener.c to ensure we won't silently hide this incoherency in the future. Note that this patch could/should be backported to 1.6 and even 1.5 to simplify debugging sessions.	2016-04-14 11:18:22 +02:00
David CARLIER	42ff05e2d3	CLEANUP: connection: fix double negation on memcmp() Nothing harmful in here, just clarify that it applies to the whole expression.	2016-03-24 11:25:46 +01:00
KOVACS Krisztian	7209c204bd	BUG/MAJOR: connection: fix TLV offset calculation for proxy protocol v2 parsing Until now, the code assumed that it can get the offset to the first TLV header just by subtracting the length of the TLV part from the length of the complete buffer. However, if the buffer contains actual data after the header, this computation is flawed and leads to haproxy trying to parse TLV headers from the proxied data. This change fixes this by making sure that the offset to the first TLV header is calculated based from the start of the buffer -- simply by adding the size of the proxy protocol v2 header plus the address family-dependent size of the address information block.	2015-07-03 17:05:20 +02:00
Willy Tarreau	87b09668be	REORG/MAJOR: session: rename the "session" entity to "stream" With HTTP/2, we'll have to support multiplexed streams. A stream is in fact the largest part of what we currently call a session, it has buffers, logs, etc. In order to catch any error, this commit removes any reference to the struct session and tries to rename most "session" occurrences in function names to "stream" and "sess" to "strm" when that's related to a session. The files stream.{c,h} were added and session.{c,h} removed. The session will be reintroduced later and a few parts of the stream will progressively be moved over there. It will more or less contain only what we need in an embryonic session. Sample fetch functions and converters will have to change a bit so that they'll use an L5 (session) instead of what's currently called "L4" which is in fact L6 for now. Once all changes are completed, we should see approximately this : L7 - http_txn L6 - stream L5 - session L4 - connection | applet There will be at most one http_txn per stream, and a same session will possibly be referenced by multiple streams. A connection will point to a session and to a stream. The session will hold all the information we need to keep even when we don't yet have a stream. Some more cleanup is needed because some code was already far from being clean. The server queue management still refers to sessions at many places while comments talk about connections. This will have to be cleaned up once we have a server-side connection pool manager. Stream flags "SN_*" still need to be renamed, it doesn't seem like any of them will need to move to the session.	2015-04-06 11:23:56 +02:00
Willy Tarreau	d85c48589a	REORG: connection: move conn_drain() to connection.c and rename it It's now called conn_sock_drain() to make it clear that it only reads at the sock layer and not at the data layer. The function was too big to remain inlined and it's used at a few places where size counts.	2015-03-13 00:42:48 +01:00
Willy Tarreau	ff3e648812	MINOR: connection: implement conn_sock_send() This function is an equivalent to send() which operates over a connection instead of a file descriptor. It checks that the control layer is ready and that it's allowed to send. It automatically enables polling if it cannot send. It simplifies the return checks by returning zero in all cases where it cannot send so that the caller only has to care about negative values indicating errors.	2015-03-13 00:04:49 +01:00
KOVACS Krisztian	b3e54fe387	MAJOR: namespace: add Linux network namespace support This patch makes it possible to create binds and servers in separate namespaces. This can be used to proxy between multiple completely independent virtual networks (with possibly overlapping IP addresses) and a non-namespace-aware proxy implementation that supports the proxy protocol (v2). The setup is something like this: net1 on VLAN 1 (namespace 1) -\ net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0) net3 on VLAN 3 (namespace 3) -/ The proxy is configured to make server connections through haproxy and sending the expected source/target addresses to haproxy using the proxy protocol. The network namespace setup on the haproxy node is something like this: = 8< = $ cat setup.sh ip netns add 1 ip link add link eth1 type vlan id 1 ip link set eth1.1 netns 1 ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1 ip netns exec 1 ip link set eth1.$id up ... = 8< = = 8< = $ cat haproxy.cfg frontend clients bind 127.0.0.1:50022 namespace 1 transparent default_backend scb backend server mode tcp server server1 192.168.122.4:2222 namespace 2 send-proxy-v2 = 8< = A bind line creates the listener in the specified namespace, and connections originating from that listener also have their network namespace set to that of the listener. A server line either forces the connection to be made in a specified namespace or may use the namespace from the client-side connection if that was set. For more documentation please read the documentation included in the patch itself. Signed-off-by: KOVACS Tamas <ktamas@balabit.com> Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.com>	2014-11-21 07:51:57 +01:00
KOVACS Krisztian	efd3aa9341	BUG/MEDIUM: connection: sanitize PPv2 header length before parsing address information Previously, if hdr_v2->len was less than the length of the protocol specific address information we could have read after the end of the buffer and initialize the sockaddr structure with junk. Signed-off-by: KOVACS Krisztian <hidden@balabit.com> [WT: this is only tagged medium since proxy protocol is only used from trusted sources] This must be backported to 1.5.	2014-11-21 07:45:17 +01:00
Dave McCowan	328fb58d74	MEDIUM: connection: add new bit in Proxy Protocol V2 There are two sample commands to get information about the presence of a client certificate. ssl_fc_has_crt is true if there is a certificate present in the current connection ssl_c_used is true if there is a certificate present in the session. If a session has stopped and resumed, then ssl_c_used could be true, while ssl_fc_has_crt is false. In the client byte of the TLS TLV of Proxy Protocol V2, there is only one bit to indicate whether a certificate is present on the connection. The attached patch adds a second bit to indicate the presence for the session. This maintains backward compatibility. [wt: this should be backported to 1.5 to help maintain compatibility between versions]	2014-08-23 07:35:29 +02:00
Willy Tarreau	3b9a0c9d4d	BUG/MEDIUM: connection: fix proxy v2 header again! Last commit `77d1f01` ("BUG/MEDIUM: connection: fix memory corruption when building a proxy v2 header") was wrong, using &cn_trash instead of cn_trash resulting in a warning and the client's SSL cert CN not being stored at the proper location. Thanks to Lukas Tribus for spotting this quickly. This should be backported to 1.5 after the patch above is backported.	2014-07-19 06:37:33 +02:00
Dave McCowan	77d1f0143e	BUG/MEDIUM: connection: fix memory corruption when building a proxy v2 header Use temporary trash chunk, instead of global trash chunk in make_proxy_line_v2() to avoid memory overwrite. This fix must also be backported to 1.5.	2014-07-17 21:00:53 +02:00
Emeric Brun	0abf836ecb	BUG/MINOR: ssl: Fix external function in order not to return a pointer on an internal trash buffer. 'ssl_sock_get_common_name' applied to a connection was also renamed 'ssl_sock_get_remote_common_name'. Currently, this function is only used with protocol PROXYv2 to retrieve the client certificate's common name. A further usage could be to retrieve the server certificate's common name on an outgoing connection.	2014-06-24 22:39:16 +02:00
Willy Tarreau	7799267f43	MEDIUM: connection: add support for proxy protocol v2 in accept-proxy The "accept-proxy" statement of bind lines was still limited to version 1 of the protocol, while send-proxy-v2 is now available on the server lines. This patch adds support for parsing v2 of the protocol on incoming connections. The v2 header is automatically recognized so there is no need for a new option.	2014-06-14 11:46:03 +02:00
Willy Tarreau	8fccfa256e	CLEANUP: connection: merge proxy proto v2 header and address block This is in order to simplify the PPv2 header parsing code to look more like the one provided as an example in the spec. No code change was performed beyond just merging the proxy_addr union into the proxy_hdr_v2 struct.	2014-06-14 11:46:02 +02:00
Willy Tarreau	4c20d29c29	BUG/MINOR: connection: make proxy protocol v1 support the UNKNOWN protocol If haproxy receives a connection over a unix socket and forwards it to another haproxy instance using proxy protocol v1, it sends an UNKNOWN protocol, which is rejected by the other side. Make the receiver accept the UNKNOWN protocol as per the spec, and only use the local connection's address for this.	2014-06-14 11:46:02 +02:00


311 Commits