haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-09 08:37:04 +02:00

Author	SHA1	Message	Date
Willy Tarreau	f9ce57e86c	MEDIUM: connection: make conn_sock_shutw() aware of lingering Instead of having to manually handle lingering outside, let's make conn_sock_shutw() check for it before calling shutdown(). We simply don't want to emit the FIN if we're going to reset the connection due to lingering. It's particularly important for silent-drop where it's absolutely mandatory that no packet leaves the machine.	2017-10-22 09:54:16 +02:00
Olivier Houchard	1a0545f3d7	REORG: connection: rename CO_FL_DATA_* -> CO_FL_XPRT_* These flags are not exactly for the data layer, they instead indicate what is expected from the transport layer. Since we're going to split the connection between the transport and the data layers to insert a mux layer, it's important to have a clear idea of what each layer does. All function conn_data_* used to manipulate these flags were renamed to conn_xprt_*.	2017-10-22 09:54:15 +02:00
Willy Tarreau	06d80a9a9c	REORG: channel: finally rename the last bi_* / bo_* functions For HTTP/2 we'll need some buffer-only equivalent functions to some of the ones applying to channels and still squatting the bi_* / bo_* namespace. Since these names have kept being misleading for quite some time now and are really getting annoying, it's time to rename them. This commit will use "ci/co" as the prefix (for "channel in", "channel out") instead of "bi/bo". The following ones were renamed : bi_getblk_nc, bi_getline_nc, bi_putblk, bi_putchr, bo_getblk, bo_getblk_nc, bo_getline, bo_getline_nc, bo_inject, bi_putchk, bi_putstr, bo_getchr, bo_skip, bi_swpbuf	2017-10-19 15:01:08 +02:00
Willy Tarreau	4ac4928718	BUG/MINOR: stream-int: don't set MSG_MORE on SHUTW_NOW without AUTO_CLOSE Since around 1.5-dev12, we've been setting MSG_MORE on send() on various conditions, including the fact that SHUTW_NOW is present, but we don't check that it's accompanied with AUTO_CLOSE. The result is that on requests immediately followed by a close (where AUTO_CLOSE is not set), the request gets delayed in the TCP stack before being sent to the server. This is visible with the H2 code where the end-of-stream flag is set on requests, but probably happens when a POLL_HUP is detected along with the request. The (lack of) presence of option abortonclose has no effect here since we never send the SHUTW along with the request. This fix can be backported to 1.7, 1.6 and 1.5.	2017-10-17 16:38:21 +02:00
Bin Wang	95fad5ba4b	BUG/MAJOR: stream-int: don't re-arm recv if send fails When 1) HAProxy is configured to enable splice in both directions and 2) after some high load, there are 2 input channels with their socket buffer being non-empty and pipe being full at the same time, sitting in `fd_cache` without any other fds, the 2 channels will repeatedly be stopped for receiving (pipe full) and woken up for receiving (data in socket), thus moving out of and back into `fd_cache` and making their fds swap locations in `fd_cache`. There is a `if (entry < fd_cache_num && fd_cache[entry] != fd) continue;` statement in `fd_process_cached_events` to prevent frequent polling, but since the only 2 fds are constantly swapping location, `fd_cache[entry] != fd` will always hold true, thus HAProxy can't make any progress. The root cause of the issue is dual : - there is a single fd_cache, for next events and for the ones being processed, while using two distinct arrays would avoid the problem. - the write side of the stream interface wakes the read side up even when it couldn't write, and this one really is a bug. Due to CF_WRITE_PARTIAL not being cleared during fast forwarding, a failed send() attempt will still cause ->chk_rcv() to be called on the other side, re-creating an entry for its connection fd in the cache, causing the same sequence to be repeated indefinitely without any opportunity to make progress. CF_WRITE_PARTIAL used to be used for what is present in these tests : check if a recent write operation was performed. It's part of the CF_WRITE_ACTIVITY set and is tested to check if timeouts need to be updated. It's also used to detect if a failed connect() may be retried. What this patch does is use CF_WROTE_DATA() to check for a successful write for connection retransmits, and to clear CF_WRITE_PARTIAL before preparing to send in stream_int_notify(). This way, timeouts are still updated each time a write succeeds, but chk_rcv() won't be called anymore after a failed write.
It seems the fix is required all the way down to 1.5. Without this patch, the only workaround at this point is to disable splicing in at least one direction. Strictly speaking, splicing is not absolutely required, as regular forwarding could theoretically cause the issue to happen if the timing is appropriate, but in practice it appears impossible to reproduce it without splicing, and even with splicing it may vary. The following config manages to reproduce it after a few attempts (haproxy going 100% CPU and having to be killed) : global maxpipes 50000 maxconn 10000 listen srv1 option splice-request option splice-response bind :8001 server s1 127.0.0.1:8002 server$ tcploop 8002 L N20 A R10 S1000000 R10 S1000000 R10 S1000000 R10 S1000000 R10 S1000000 client$ tcploop 8001 N20 C T S1000000 R10 J	2017-10-05 11:20:16 +02:00
Willy Tarreau	bbae3f0170	MEDIUM: connection: remove useless flag CO_FL_DATA_WR_SH After careful inspection, this flag is set at exactly two places : - once in the health-check receive callback after receipt of a response - once in the stream interface's shutw() code where CF_SHUTW is always set on chn->flags The flag was checked in the checks before deciding to send data, but when it is set, the wake() callback immediately closes the connection so the CO_FL_SOCK_WR_SH flag is also set. The flag was also checked in si_conn_send(), but checking the channel's flag instead is enough and even reveals that one check involving it could never match. So it's time to remove this flag and replace its check with a check of CF_SHUTW in the stream interface. This way each layer is responsible for its shutdown, this will ease insertion of the mux layer.	2017-08-30 10:05:49 +02:00
Willy Tarreau	54e917cfa1	MEDIUM: connection: remove useless flag CO_FL_DATA_RD_SH This flag is both confusing and wrong. It is supposed to report the fact that the data layer has received a shutdown, but in fact this is reported by CO_FL_SOCK_RD_SH which is set by the transport layer after this condition is detected. The only case where the flag above is set is in the stream interface where CF_SHUTR is also set on the receiving channel. In addition, it was checked in the health checks code (while never set) and was always tested jointly with CO_FL_SOCK_RD_SH everywhere, except in conn_data_read0_pending() which incorrectly doesn't match the second time it's called and is fortunately protected by an extra check on (ic->flags & CF_SHUTR). This patch gets rid of the flag completely. Now conn_data_read0_pending() accurately reports the fact that the transport layer has detected the end of the stream, regardless of the fact that this state was already consumed, and the stream interface watches ic->flags&CF_SHUTR to know if the channel was already closed by the upper layer (which it already used to do). The now unused conn_data_read0() function was removed.	2017-08-30 08:18:50 +02:00
Willy Tarreau	8ff5a8d87f	BUG/MINOR: stream-int: don't check the CO_FL_CURR_WR_ENA flag The stream interface chk_snd() code checks if the connection has already subscribed to write events in order to avoid attempting a useless write() which will fail. But it used to check both the CO_FL_CURR_WR_ENA and the CO_FL_DATA_WR_ENA flags, while the former may only be present without the latter if either the other side just disabled writing and did not synchronize yet (which is harmless) or if it's currently performing a handshake, which is being checked by the next condition and will be better dealt with by properly subscribing to the data events. This code was added back in 1.5-dev20 to limit the number of useless calls to splice() but both flags were checked at once while only CO_FL_DATA_WR_ENA was needed. This bug seems to have no impact other than making code changes more painful. This fix may be backported down to 1.5 though is unlikely to be needed there.	2017-08-30 07:03:34 +02:00
Emeric Brun	2802b07d97	BUG/MAJOR: applet: fix a freeze if data is immediately forwarded. Regression introduced by 'MAJOR: applet scheduler rework' (1.8-dev only). The fix consists in re-enabling the appctx immediately from the applet wake cb if process_stream is not pending in the run queue, the applet wants to perform a put or a get, and the WAIT_ROOM flag was removed by stream_int_notify.	2017-06-30 14:57:24 +02:00
Emeric Brun	c730606879	MAJOR: applet: applet scheduler rework. In order to allow appctx_wakeup to be called on a running task (from within the task handler itself and, in the future, from another thread), the appctx is considered paused by default after running the handler. The handler should explicitly call appctx_wakeup to be called again. When appctx_free is called on a running handler, the real free is postponed to the end of the handler's processing.	2017-06-27 14:38:02 +02:00
Willy Tarreau	2686dcad1e	CLEANUP: connection: remove unused CO_FL_WAIT_DATA Very early in the connection rework process leading to v1.5-dev12, commit `56a77e5` ("MEDIUM: connection: complete the polling cleanups") marked the end of use for this flag which since was never set anymore, but it continues to be tested. Let's kill it now.	2017-06-02 15:50:27 +02:00
Hongbo Long	e39683c4d4	BUG/MEDIUM: stream: fix client-fin/server-fin handling A TCP half-closed connection can cause 100% CPU on expiration. First reproduced with this haproxy configuration : global tune.bufsize 10485760 defaults timeout server-fin 90s timeout client-fin 90s backend node2 mode tcp timeout server 900s timeout connect 10s server def 127.0.0.1:3333 frontend fe_api mode tcp timeout client 900s bind :1990 use_backend node2 I.e. with timeout server-fin shorter than timeout server, the backend server sends data, this packet is left in the haproxy buffer, the backend server then sends a FIN packet, and haproxy receives the FIN packet. At this time the session information is as follows: 0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2 srv=def ts=08 age=1s calls=3 rq[f=848000h,i=0,an=00h,rx=14m58s,wx=,ax=] rp[f=8004c020h,i=0,an=00h,rx=,wx=14m58s,ax=] s0=[7,0h,fd=6,ex=] s1=[7,18h,fd=7,ex=] exp=14m58s rp has set the CF_SHUTR state. Next, the client sends the FIN packet, and the session information is as follows: 0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2 srv=def ts=08 age=38s calls=4 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=] rp[f=8004c020h,i=0,an=00h,rx=1m11s,wx=14m21s,ax=] s0=[7,0h,fd=6,ex=] s1=[9,10h,fd=7,ex=] exp=1m11s After waiting 90s, the session information is as follows: 0x2373470: proto=tcpv4 src=127.0.0.1:39513 fe=fe_api be=node2 srv=def ts=04 age=4m11s calls=718074391 rq[f=84a020h,i=0,an=00h,rx=,wx=,ax=] rp[f=8004c020h,i=0,an=00h,rx=?,wx=10m49s,ax=] s0=[7,0h,fd=6,ex=] s1=[9,10h,fd=7,ex=] exp=?
run(nice=0) CPU information: 6899 root 20 0 112224 21408 4260 R 100.0 0.7 3:04.96 haproxy The large buffer is set to ensure that there is data left in the haproxy buffer when haproxy receives the FIN packet and sets the CF_SHUTR flag. Once the CF_SHUTR flag has been set, the following code does not clear the timeout, causing 100% CPU: stream.c:process_stream: if (unlikely((res->flags & (CF_SHUTR|CF_READ_TIMEOUT)) == CF_READ_TIMEOUT)) { if (si_b->flags & SI_FL_NOHALF) si_b->flags |= SI_FL_NOLINGER; si_shutr(si_b); } Once the read side has been closed, setting the read timeout does not make sense. Yet with or without CF_SHUTR, the read timeout is set: if (tick_isset(s->be->timeout.serverfin)) { res->rto = s->be->timeout.serverfin; res->rex = tick_add(now_ms, res->rto); } After discussion on the mailing list, setting half-closed timeouts the hard way here doesn't make sense. They should be set only at the moment the shutdown() is performed. It will also solve a special case which was already reported of some half-closed timeouts not working when the shutw() is performed directly at the stream-interface layer (no analyser involved). Since the stream interface layer cannot know the timeout values, we'll have to store them directly in the stream interface so that they are used upon shutw(). This patch does this, fixing the problem. An easier reproducer to validate the fix is to keep the huge buffer and shorten all timeouts, then call it under tcploop server and client, and wait 3 seconds to see haproxy run at 100% CPU : global tune.bufsize 10485760 listen px bind :1990 timeout client 90s timeout server 90s timeout connect 1s timeout server-fin 3s timeout client-fin 3s server def 127.0.0.1:3333 $ tcploop 3333 L W N20 A P100 F P10000 & $ tcploop 127.0.0.1:1990 C S10000000 F	2017-03-21 15:04:43 +01:00
Willy Tarreau	52821e2737	BUG/MAJOR: stream-int: do not depend on connection flags to detect connection Recent fix `7bf3fa3` ("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data layer") marked an end to a fragile situation where the absence of CO_FL_{CONNECTED,L4,L6}* flags is used to mark the completion of a connection setup. The problem is that by setting the CO_FL_CONNECTED flag earlier, we can indeed call the ->wake() function from conn_fd_handler but the stream-interface's wake function needs to see CO_FL_CONNECTED unset to detect that a connection has just been established, so if there's no pending data in the buffer, the connection times out. The other ->wake() functions (health checks and idle connections) don't do this though. So instead of trying to detect a subtle change in connection flags, let's simply rely on the stream-interface's state and validate that the connection is properly established and that handshakes are completed before reporting the WRITE_NULL indicating that a pending connection was just completed. This patch passed all tests of handshake and non-handshake combinations, with synchronous and asynchronous connect() and should be safe for backport to 1.7, 1.6 and 1.5 when the fix above is already present.	2017-03-19 12:00:04 +01:00
Willy Tarreau	8cf9c8e663	BUG/MINOR: stream-int: automatically release SI_FL_WAIT_DATA on SHUTW_NOW While developing an experimental applet performing only one read per full line, it appeared that it would be woken up for the client's close, not read all data (missing LF), then wait for a subsequent call, and would only be woken up on client timeout to finish the read. The reason is that we preset SI_FL_WAIT_DATA in the stream-interface's flags to avoid a fast loop, but there's nothing which can remove this flag until there's a read operation. We must definitely remove it in stream_int_notify() each time we're called with CF_SHUTW_NOW because we know there will be no more subsequent read and we don't want an applet which keeps the WANT_GET flag to block on this. This fix should be backported to 1.7 and 1.6 though it's uncertain whether cli, peers, lua or spoe really are affected there.	2016-12-14 16:48:16 +01:00
Christopher Faulet	a73e59b690	BUG/MAJOR: Fix how the list of entities waiting for a buffer is handled When an entity tries to get a buffer, if it cannot be allocated, for example because the number of buffers which may be allocated per process is limited, this entity is added to a list (called <buffer_wq>) and waits for an available buffer. Historically, the <buffer_wq> list was logically attached to streams because they were the only entities likely to be added to it. Now, applets can also be waiting for a free buffer. And with filters, we could imagine having other entities waiting for a buffer. So it makes sense to have a generic list. Anyway, with the current design there is a bug. When an applet fails to get a buffer, it will wait. But we add the stream attached to the applet to <buffer_wq>, instead of the applet itself. So when a buffer is available, we wake up the stream and not the waiting applet. So it is possible to have waiting applets that are never woken up. So, now, <buffer_wq> is independent from streams. And we really add the waiting entity to <buffer_wq>. To be generic, the entity is responsible for defining the callback used to wake it up. In addition, applets will still request an input buffer when they become active. But they will not be put to sleep anymore if no buffer is available. So it is the responsibility of the applet I/O handler to check if this buffer is allocated or not. This way, an applet can decide if this buffer is required or not and can do additional processing if not. [wt: backport to 1.7 and 1.6]	2016-12-12 19:11:04 +01:00
Willy Tarreau	796c5b7997	OPTIM: stream-int: don't disable polling anymore on DONT_READ Commit `5fddab0` ("OPTIM: stream_interface: disable reading when CF_READ_DONTWAIT is set") improved the connection layer's efficiency back in 1.5-dev13 by avoiding successive read attempts on an active FD. But by disabling this on a polled FD, it causes an unpleasant side effect which is that the FD that was subscribed to polling is suddenly stopped and may need to be re-enabled once the kernel starts to slow down on data eviction (eg: saturated server at the other end, bursty traffic caused by too large maxpollevents). This behaviour is observable with persistent connections when there is a large enough connection count so that there's no data in the early connection and polling is required, because there are then up to 4 epoll_ctl() calls per request. It's important that the server is slower than haproxy to cause some delays when reading response. The current connection layer as designed in 1.6 with the FD cache doesn't require this trick anymore, though it still benefits from it when it saves an FD from being uselessly polled. But compared to the increased cost of enabling and disabling poll all the time, it's still better to disable it. In some cases it's possible to observe a performance increase as high as 30% by avoiding this epoll_ctl() dance. In the end we only want to disable it when the FD is speculatively read and not when it's polled. For this we introduce a new function __conn_data_done_recv() which is used to indicate that we're done with recv() and not interested in new attempts. If/when we later support event-triggered epoll, this function will have to change a bit to do the same even in the polled case. 
A quick test with keep-alive requests run on a dual-core / dual-thread Atom shows a significant improvement : single process, 0 bytes : before: Requests per second: 12243.20 [#/sec] (mean) after: Requests per second: 13354.54 [#/sec] (mean) single process, 4k : before: Requests per second: 9639.81 [#/sec] (mean) after: Requests per second: 10991.89 [#/sec] (mean) dual process, 0 bytes (unstable) : before: Requests per second: 16900-19800 ~ 17600 [#/sec] (mean) after: Requests per second: 18600-21400 ~ 20500 [#/sec] (mean)	2016-12-05 13:49:57 +01:00
Willy Tarreau	8e0bb0ae16	MINOR: connection: add names for transport and data layers This makes debugging easier and avoids having to put ugly checks against certain well-known internal struct pointers.	2016-11-24 16:58:12 +01:00
Willy Tarreau	958f0742a2	BUG/MEDIUM: stream-int: avoid double-call to applet->release While the SI_ST_DIS state is set after doing the close on a connection, it was set before calling release on an applet. Applets have no internal flags contrary to connections, so they have no way to detect they were already released. Because of this it happened that applets were closed twice, once via si_applet_release() and once via si_release_endpoint() at the end of a transaction. The CLI applet could perform a double free in this case, though the situation to cause it is quite hard because it requires that the applet is stuck on output in states that produce very few data. In order to solve this, we now assign the SI_ST_DIS state after calling ->release, and we refrain from doing so if the state is already assigned. This makes applets work much more like connections and definitely avoids this double release. In the future it might be worth making applets have their own flags like connections to carry their own state regardless of the stream interface's state, especially when dealing with connection reuse. No backport is needed since this issue was caused by the rearchitecture in 1.6.	2015-09-25 21:16:03 +02:00
Willy Tarreau	eca572fe7f	BUG/MEDIUM: applet: fix reporting of broken write situation If an applet tries to write to a closed connection, it hangs forever. This results in some "get map" commands on the CLI to leave orphaned connections alive. Now the applet wakeup function detects that the applet still wants to write while the channel is closed for reads, which is the equivalent to the common "broken pipe" situation. In this case, an error is reported on the stream interface, just as it happens with connections trying to perform a send() in a similar situation. With this fix the stats socket is properly released.	2015-09-25 21:16:02 +02:00
Willy Tarreau	aa977ba205	MINOR: stream-int: rename si_applet_done() to si_applet_wake_cb() This function is a callback made only for calls from the applet handler. Rename it to remove confusion. It's currently called from the Lua code but that's not correct, we should call the notify and update functions instead otherwise it will not enable the applet again.	2015-09-25 21:16:02 +02:00
Willy Tarreau	335520305c	MEDIUM: stream-int: completely remove stream_int_update_embedded() This one is not needed anymore as what it used to do is either completely covered by the new stream_int_notify() function, or undesired and inherited from the past as a side effect of introducing the connections. This update is theoretically never called since it's assigned only when nothing is connected to the stream interface. However a test has been added to si_update() to stay safe if some foreign code decides to call si_update() in unsafe situations.	2015-09-25 21:16:02 +02:00
Willy Tarreau	651e18292d	MEDIUM: stream-int: use the same stream notification function for applets and conns The code to report completion after a connection update or an applet update was almost the same since applets stole it from the connection. But the differences made them hard to maintain and prevented the creation of new functions doing only one part of the work. This patch replaces the common code from the si_conn_wake_cb() and si_applet_wake_cb() with a single call to stream_int_notify() which only notifies the stream (si+channels+task) from the outside. No functional change was made beyond this.	2015-09-25 21:16:02 +02:00
Willy Tarreau	615f28bec1	MINOR: stream-int: implement the stream_int_notify() function stream_int_notify() was taken from the common part between si_conn_wake_cb() and si_applet_done(). It is designed to report activity to a stream from outside its handler. It'll generally be used by lower layers to report I/O completion but may also be used by remote streams if the buffer processing is shared.	2015-09-25 21:16:02 +02:00
Willy Tarreau	ea3cc48d64	MEDIUM: stream-int: clean up the conditions to enable reading in si_conn_wake_cb The condition to release the SI_FL_WAIT_ROOM flag was abnormally complicated because it was inherited from 6 years ago before we used to check for the buffer's emptiness. The CF_READ_PARTIAL flag had to be removed, and the complex test was replaced with a simpler one checking if some data were moved out or not. The reason behind this change is to have a condition compatible with both connections and applets, as applets currently don't work very well in this area. Specifically, some optimizations on the applet side cause them not to release the flag above until the buffer is empty, which may prevent applets from talking together (eg: peers over large haproxy buffers and small kernel buffers).	2015-09-25 18:07:16 +02:00
Willy Tarreau	388a2385a5	MINOR: stream-int: move the applet_pause call out of the stream updates It's just to split the part dealing with the stream update and the part dealing with the applet update in si_applet_done().	2015-09-25 18:07:16 +02:00
Willy Tarreau	cbc32601a6	MINOR: stream-int: export stream_int_update_* Not only these functions were not static, but we'll also want to export them.	2015-09-25 18:07:16 +02:00
Willy Tarreau	5d5b2fecac	MEDIUM: stream-int: call stream_int_update() from si_update() Now the call to stream_int_update() is moved to si_update(), which is exclusively called from the stream, so that the socket layer may be updated without updating the stream layer. This will later permit to call it individually from other places (other tasks or applets for example).	2015-09-25 18:07:16 +02:00
Willy Tarreau	452c7d5d93	MEDIUM: stream-int: factor out the stream update functions Now that we have a generic stream_int_update() function, we can replace the equivalent part in stream_int_update_conn() and stream_int_update_applet() to avoid code duplication. There is no functional change, as the code is the same but split in two functions for each call.	2015-09-25 18:07:16 +02:00
Willy Tarreau	25f1310f33	MINOR: stream-int: implement a new stream_int_update() function This function is designed to be called from within the stream handler to update the channels' expiration timers and the stream interface's flags based on the channels' flags. It needs to be called only once after the channels' flags have settled down, and before they are cleared, though it doesn't harm to call it as often as desired (it just slightly hurts performance). It must not be called from outside of the stream handler, as what it does will be used to compute the stream task's expiration. The code was taken directly from stream_int_update_applet() and stream_int_update_conn() which had exactly the same one except for applet-specific or connection-specific status update.	2015-09-25 18:07:16 +02:00
Willy Tarreau	2f4e702031	MEDIUM: stream-int: split stream_int_update_conn() into si- and conn-specific parts The purpose is to separate the connection-specific parts so that the stream-int specific one can be factored out. There's no functional change here, only code displacement.	2015-09-25 18:07:16 +02:00
Willy Tarreau	c4b56e4470	MINOR: stream-int: use si_release_endpoint() to close idle conns We don't want to open-code the connection close code in si_idle_conn_wake_cb() because we need to centralize some controls.	2015-09-24 11:57:34 +02:00
Thierry FOURNIER	5bc2cbf8f4	CLEANUP: typo: bad indent A space alignment remains in the stream_interface.c file	2015-09-10 21:16:55 +02:00
Willy Tarreau	323a2d925c	MEDIUM: stream-int: queue idle connections at the server Now we get a per-server list of all idle connections. That way we'll be able to reclaim them upon shortage later.	2015-08-06 11:06:25 +02:00
Willy Tarreau	7a08d3b2d7	CLEANUP: stream-int: remove stream_int_unregister_handler() and si_detach() The former was not used anymore and the latter was only used by the former. They were only aliases to other existing functions anyway.	2015-07-19 18:48:20 +02:00
Willy Tarreau	a9ff5e64c1	CLEANUP: stream-int: fix a few outdated comments about stream_int_register_handler() They were not updated after the infrastructure change.	2015-07-19 18:46:30 +02:00
Willy Tarreau	0b1a4541dc	MEDIUM: stream-int: pause the appctx if the task is woken up If we're going to call the task we don't need to call the appctx anymore since the task may decide differently in the end and will do the proper thing using ->update(). This reduces one wake up call per session and may go down to half in case of high concurrency (scheduling races).	2015-04-23 17:56:17 +02:00
Willy Tarreau	fe127937a8	MEDIUM: applet: make the applets only use si_applet_{cant|want|stop}_{get|put} The applets don't fiddle with SI_FL_WAIT_ROOM anymore, instead they indicate what they want, possibly that they failed (eg: WAIT_ROOM), and it's done() / update() which finally updates the WAIT_* flags according to the channels' and stream interface's states. This solves the issue of the pauses during a "show sess" without creating busy loops.	2015-04-23 17:56:17 +02:00
Willy Tarreau	563cc37609	MAJOR: stream: use a regular ->update for all stream interfaces Now si->update() is used to update any type of stream interface, whether it's an applet, a connection or even nothing. We don't call si_applet_call() anymore at the end of the resync and we don't have the risk that the stream's task is reinserted into the run queue, which makes the code a bit simpler. The stream_int_update_applet() function was simplified to ensure that it remained compatible with this standardized calling convention. It was almost copy-pasted from the update code dedicated to connections. Just like for si_applet_done(), it seems that it should be possible to merge the two functions except that it would require some slow operations, except maybe if the type of end point is tested inside the update function itself.	2015-04-23 17:56:16 +02:00
Willy Tarreau	828824af05	MAJOR: applet: now call si_applet_done() instead of si_update() in I/O handlers The applet I/O handlers now rely on si_applet_done() which itself decides to wake up or sleep the appctx. Now it becomes critical that applet handlers properly call this on every exit path so that the appctx is removed from the active list after I/O have been handled. One such call was added to the Lua socket handler. It used to work without it probably because the main task is woken up by the parent task but now it's needed.	2015-04-23 17:56:16 +02:00
Willy Tarreau	e5f8649102	MEDIUM: stream-int: add a new function si_applet_done() This is the equivalent of si_conn_wake() but for applets. It will be called after changes to the stream interface are brought by the applet I/O handler. Ultimately it will release buffers and may even wake the stream's task up if some important changes are detected. It would be nice to be able to merge it with the connection's wake function since it mostly manipulates the stream interface, but there are minor differences (such as how to enable/disable polling on a fd vs applet) and some specificities to applets (eg: don't wake the applet up until the output is empty) which would require abstract functions which would slow down everything.	2015-04-23 17:56:16 +02:00
Willy Tarreau	d45b9f8991	REORG: stream-int: create si_applet_ops dedicated to applets These functions are dedicated to applets so that we don't use the default ones anymore in this case.	2015-04-23 17:56:16 +02:00
Willy Tarreau	3057645b37	CLEANUP: applet: rename struct si_applet to applet Since this one does not depend on stream_interface anymore, remove the "si_" prefix.	2015-04-23 17:56:16 +02:00
Willy Tarreau	8a8d83b85c	REORG: applet: move the applet definitions out of stream_interface We're tidying the definitions so that appctx lives on its own. A new set of applet.h files has been added for this purpose.	2015-04-23 17:56:16 +02:00
Willy Tarreau	a7513f5d00	MINOR: stream-int: make appctx_new() take the applet in argument Doing so simplifies the initialization of a new appctx. We don't need appctx_set_applet() anymore.	2015-04-06 11:37:32 +02:00
Willy Tarreau	87b09668be	REORG/MAJOR: session: rename the "session" entity to "stream" With HTTP/2, we'll have to support multiplexed streams. A stream is in fact the largest part of what we currently call a session, it has buffers, logs, etc. In order to catch any error, this commit removes any reference to the struct session and tries to rename most "session" occurrences in function names to "stream" and "sess" to "strm" when that's related to a session. The files stream.{c,h} were added and session.{c,h} removed. The session will be reintroduced later and a few parts of the stream will progressively be moved over there. It will more or less contain only what we need in an embryonic session. Sample fetch functions and converters will have to change a bit so that they'll use an L5 (session) instead of what's currently called "L4" which is in fact L6 for now. Once all changes are completed, we should see approximately this : L7 - http_txn L6 - stream L5 - session L4 - connection | applet There will be at most one http_txn per stream, and a same session will possibly be referenced by multiple streams. A connection will point to a session and to a stream. The session will hold all the information we need to keep even when we don't yet have a stream. Some more cleanup is needed because some code was already far from being clean. The server queue management still refers to sessions at many places while comments talk about connections. This will have to be cleaned up once we have a server-side connection pool manager. Stream flags "SN_*" still need to be renamed, it doesn't seem like any of them will need to move to the session.	2015-04-06 11:23:56 +02:00
Willy Tarreau	6b5a9c23ce	CLEANUP: stream-int: remove inclusion of fd.h that is not used anymore That's a historic achievement, stream_interface.c doesn't manipulate any file descriptor anymore. It only relies on connections or applets.	2015-03-13 00:46:47 +01:00
Willy Tarreau	d85c48589a	REORG: connection: move conn_drain() to connection.c and rename it It's now called conn_sock_drain() to make it clear that it only reads at the sock layer and not at the data layer. The function was too big to remain inlined and it's used at a few places where size counts.	2015-03-13 00:42:48 +01:00
Willy Tarreau	f31fb07958	MEDIUM: connection: make conn_drain() perform more controls Currently si_idle_conn_null_cb() has to perform some low-level checks over the file descriptor and the connection configuration that should only belong to conn_drain(). Let's move these controls there. The function now automatically checks for errors and hangups on the file descriptor for example, and disables recv polling if there's no drain function at the control layer.	2015-03-13 00:32:20 +01:00
Willy Tarreau	0a03c0f022	MEDIUM: stream-int: make conn_si_send_proxy() use conn_sock_send() This substantially simplifies the code as we don't need to handle the file descriptors anymore nor the specific error codes from send().	2015-03-13 00:09:30 +01:00
Willy Tarreau	1398aa19d8	MEDIUM: stream-int: replace xprt->shutw calls with conn_data_shutw() Now that the connection performs the correct controls when shutting down, use that in the few places where conn->xprt->shutw() was called. The calls were split between conn_data_shutw() and conn_data_shutw_hard() depending on the argument. Since the connection flags are updated, we don't need to call conn_data_stop_send() anymore, instead we just have to call conn_cond_update_polling().	2015-03-12 23:04:07 +01:00
Willy Tarreau	4dfd54f26a	MINOR: stream-int: use conn_sock_shutw() to shutdown a connection Stop calling shutdown() on the connection's fd. Note, this also seems to fix a bug which was harmless, but which consisted in not marking the connection as shutdown at the socket level until the other side was shut as well.	2015-03-12 22:44:53 +01:00
Willy Tarreau	1140512f76	CLEANUP: stream-int: remove a redundant clearing of the linger_risk flag In stream_sock_read0(), we used to clear this flag. But the only case where stream_sock_read0() is called is in reaction to a conn_sock_read0() event coming from the lower layers, which already clears this flag. So let's remove this duplicate one and clear one of the few remaining layering violations in this area.	2015-03-12 22:32:27 +01:00
Willy Tarreau	78955f4c8b	MEDIUM: session: simplify receive buffer allocator to only use the channel Now that we can get the session from the channel, let's simplify the prototype of session_alloc_recv_buffer() to only require the channel. Both the caller and the function are now simplified.	2015-03-11 20:41:47 +01:00
Willy Tarreau	afc8a22ad7	CLEANUP: stream-int: limit usage of si_ic/si_oc As much as possible, we copy the result of this function into a local variable to avoid having to check the flag all the time.	2015-03-11 20:41:47 +01:00
Willy Tarreau	50fe03be78	CLEANUP: stream-int: add si_opposite() to find the other stream interface At a few places we need to find one stream interface from the other one. Instead of passing via the channel, we simply use the session as an intermediary, which simply results in applying an offset to the pointer.	2015-03-11 20:41:47 +01:00
Willy Tarreau	4e4292b9af	CLEANUP: stream-int: add si_ib/si_ob to dereference the buffers This makes the code cleaner and is more intuitive to use.	2015-03-11 20:41:46 +01:00
Willy Tarreau	07373b8660	MEDIUM: stream-int: use si_task() to retrieve the task from the stream int We go back to the session to get the owner. Here again it's very easy and is just a matter of relative offsets. Since the owner always exists and always points to the session's task, we can remove some unneeded tests.	2015-03-11 20:41:46 +01:00
Willy Tarreau	2bb4a96f8f	REORG/MEDIUM: stream-int: introduce si_ic/si_oc to access channels We'll soon remove direct references to the channels from the stream interface since everything belongs to the same session, so let's first not dereference si->ib / si->ob anymore and use macros instead.	2015-03-11 20:41:46 +01:00
Willy Tarreau	319f745ba0	MINOR: channel: rename bi_erase() to channel_truncate() It applies to the channel and it doesn't erase outgoing data, only pending unread data, which is strictly equivalent to what recv() does with MSG_TRUNC, so that new name is more accurate and intuitive.	2015-01-14 20:32:59 +01:00
Willy Tarreau	b5051f8742	MINOR: channel: rename bi_avail() to channel_recv_max() This name more accurately reminds that it applies to a channel and not to a buffer, and that what is returned may be used as a max number of bytes to pass to recv().	2015-01-14 20:26:54 +01:00
Willy Tarreau	3889fffe92	MINOR: channel: rename channel_full() to !channel_may_recv() This function's name was poorly chosen and is confusing to the point of being suspiciously used at some places. The operations it does always consider the ability to forward pending input data before receiving new data. This is not obvious at all, especially at some places where it was used when consuming outgoing data to know if the buffer has any chance to ever get the missing data. The code needs to be re-audited with that in mind. Care must be taken with existing code since the polarity of the function was switched with the renaming.	2015-01-14 18:41:33 +01:00
Willy Tarreau	56efc4896b	OPTIM: stream-int: try to send pending spliced data This is the equivalent of `eb9fd51` ("OPTIM: stream_sock: reduce the amount of in-flight spliced data") whose purpose is to try to immediately send spliced data if available.	2014-12-24 23:47:33 +01:00
Willy Tarreau	9b20c55562	MEDIUM: stream-int: support splicing from applets If we want to splice from applets, we must check the pipe before clearing SI_FL_WAIT_ROOM.	2014-12-24 23:47:33 +01:00
Willy Tarreau	10fc09e872	MAJOR: session: only allocate buffers when needed A session doesn't need buffers all the time, especially when they're empty. With this patch, we don't allocate buffers anymore when the session is initialized, we only allocate them in two cases : during process_session(), and during I/O operations. During process_session(), we try hard to allocate both buffers at once so that we know for sure that a started operation can complete. Indeed, a previous version of this patch used to allocate one buffer at a time, but it can result in a deadlock when all buffers are allocated for requests for example, and there's no buffer left to emit error responses. Here, if any of the buffers cannot be allocated, the whole operation is cancelled and the session is added at the tail of the buffer wait queue. At the end of process_session(), a call to session_release_buffers() is done so that we can offer unused buffers to other sessions waiting for them. For I/O operations, we only need to allocate a buffer on the Rx path. For this, we only allocate a single buffer but ensure that at least two are available to avoid the deadlock situation. In case buffers are not available, SI_FL_WAIT_ROOM is set on the stream interface and the session is queued. Unused buffers resulting either from a successful send() or from an unused read buffer are offered to pending sessions during the ->wake() callback.	2014-12-24 23:47:33 +01:00
Willy Tarreau	bf883e0aa7	MAJOR: session: implement a wait-queue for sessions who need a buffer When a session_alloc_buffers() fails to allocate one or two buffers, it subscribes the session to buffer_wq, and waits for another session to release buffers. It's then removed from the queue and woken up with TASK_WAKE_RES, and can attempt its allocation again. We decide to try to wake as many waiters as we release buffers so that if we release 2 and two waiters need only one, they both have their chance. We must never come to the situation where we don't wake enough tasks up. It's common to release buffers after the completion of an I/O callback, which can happen even if the I/O could not be performed due to half a failure on memory allocation. In this situation, we don't want to move out of the wait queue the session that was just added, otherwise it will never get any buffer. Thus, we only force ourselves out of the queue when freeing the session. Note: at the moment, since session_alloc_buffers() is not used, no task is subscribed to the wait queue.	2014-12-24 23:47:33 +01:00
Willy Tarreau	a69fc9f803	BUG/MAJOR: stream-int: properly check the memory allocation return In stream_int_register_handler(), we call si_alloc_appctx(si) but as a mistake, instead of checking the return value for a NULL, we test <si>. This bug was discovered under extreme memory contention (memory for only two buffers with 500 connections waiting) and after 3 million failed connections. While it was very hard to produce it, the fix is tagged major because in theory it could happen when haproxy runs with a very low "-m" setting preventing from allocating just the few bytes needed for an appctx. But most users will never be able to trigger it. The fix was confirmed to address the bug. This fix must be backported to 1.5.	2014-12-23 11:22:39 +01:00
Willy Tarreau	9dc1c61c43	BUG/CRITICAL: http: don't update msg->sov once data start to leave the buffer Commit `bb2e669` ("BUG/MAJOR: http: correctly rewind the request body after start of forwarding") was incorrect/incomplete. It used to rely on CF_READ_ATTACHED to stop updating msg->sov once data start to leave the buffer, but this is unreliable because since commit `a6eebb3` ("[BUG] session: clear BF_READ_ATTACHED before next I/O") merged in 1.5-dev1, this flag is only ephemeral and is cleared once all analysers have seen it. So we can start updating msg->sov again each time we pass through this place with new data. With a sufficiently large amount of data, it is possible to make msg->sov wrap and validate the if() condition at the top, causing the buffer to advance by about 2GB and crash the process. Note that the offset cannot be controlled by the attacker because it is a sum of millions of small random sizes depending on how many bytes were read by the server and how many were left in the buffer, only because of the speed difference between reading and writing. Also, nothing is written, the invalid pointer resulting from this operation is only read. Many thanks to James Dempsey for reporting this bug and to Chris Forbes for narrowing down the faulty area enough to make its root cause analysable. This fix must be backported to haproxy 1.5.	2014-09-02 16:48:54 +02:00
David S	afb768340c	MEDIUM: connection: Implement an extended PROXY Protocol V2 This commit modifies the PROXY protocol V2 specification to support headers longer than 255 bytes allowing for optional extensions. It implements the PROXY protocol V2 which is a binary representation of V1. This will make parsing more efficient for clients who will know in advance exactly how many bytes to read. Also, it defines and implements some optional PROXY protocol V2 extensions to send information about downstream SSL/TLS connections. Support for PROXY protocol V1 remains unchanged.	2014-05-09 08:25:38 +02:00
Willy Tarreau	7e3127391f	MINOR: config: make the stream interface idle timer user-configurable The new tune.idletimer value allows one to set a different value for idle stream detection. The default value remains set to one second. It is possible to disable it using zero, and to change the default value at build time using DEFAULT_IDLE_TIMER.	2014-02-12 16:36:12 +01:00
Willy Tarreau	c5890e66cd	MEDIUM: stream-int: automatically disable CF_STREAMER flags after idle Disabling the streamer flags after an idle period will help TCP proxies to better adapt to the streams they're forwarding, especially with SSL where this will allow the SSL sender to use smaller records. This is typically used to optimally relay HTTP and derivatives such as SPDY or HTTP/2 in pure TCP mode when haproxy is used as an SSL offloader. This idea was first proposed by Ilya Grigorik on the haproxy mailing list, and his tests seem to confirm the improvement : https://www.mail-archive.com/haproxy@formilux.org/msg12576.html	2014-02-12 11:46:03 +01:00
Willy Tarreau	7bed945be0	OPTIM: ssl: implement dynamic record size adjustment By having the stream interface pass the CF_STREAMER flag to the snd_buf() primitive, we're able to tell the send layer whether we're sending large chunks or small ones. We use this information in SSL to adjust the max record dynamically. This results in small chunks respecting tune.ssl.maxrecord at the beginning of a transfer or for small transfers, with an automatic switch to full records if the exchanges last long. This allows the receiver to parse HTML contents on the fly without having to retrieve 16kB of data, which is even more important with small initcwnd since the receiver does not need to wait for round trips to start fetching new objects. However, sending large files still produces large chunks. For example, with tune.ssl.maxrecord = 2859, we see 5 write(2885) sent in two segments each and 6 write(16421). This idea was first proposed on the haproxy mailing list by Ilya Grigorik.	2014-02-06 11:37:29 +01:00
Willy Tarreau	1049b1f551	MEDIUM: connection: don't use real send() flags in snd_buf() This prevents us from passing other useful info and requires the upper levels to know these flags. Let's use a new flags category instead : CO_SFL_*. For now, only MSG_MORE has been remapped.	2014-02-06 11:37:29 +01:00
Willy Tarreau	798c3c9c41	MINOR: stream-interface: no need to call fd_stop_both() on error We don't need to call fd_stop_both() since we already call conn_cond_update_polling() which will do it. This call was introduced by commit `d29a066` ("BUG/MAJOR: connection: always recompute polling status upon I/O").	2014-01-26 00:42:31 +01:00
Willy Tarreau	708e717251	MEDIUM: stream-interface: the polling flags must always be updated in chk_snd_conn We used to only update the polling flags in data phase, but after that we could update other flags. It does not seem possible to trigger a bug here but it's not very safe either. Better always keep them up to date.	2014-01-26 00:42:30 +01:00
Willy Tarreau	fd803bb4d7	MEDIUM: connection: add check for readiness in I/O handlers The recv/send callbacks must check for readiness themselves instead of having their callers do it. This will strengthen the test and will also ensure we never refrain from calling a handshake handler because a direction is being polled while the other one is ready.	2014-01-26 00:42:30 +01:00
Willy Tarreau	e1f50c4b02	MEDIUM: connection: remove conn_{data,sock}_poll_{recv,send} We simply remove these functions and replace their calls with the appropriate ones : if we're in the data phase, we can simply report wait on the FD ; if we're in the socket phase, we may also have to signal the desire to read/write on the socket because it might not be active yet.	2014-01-26 00:42:30 +01:00
Willy Tarreau	310987a038	MAJOR: connection: remove the CO_FL_WAIT_{RD,WR} flags These flags were used to report the readiness of the file descriptor. Now this readiness is directly checked at the file descriptor itself. This removes the need for constantly synchronizing updates between the file descriptor and the connection and ensures that all layers share the same level of information. For now, the readiness is updated in conn_{sock,data}_poll_* by directly touching the file descriptor. This must move to the lower layers instead so that these functions can disappear as well. In this state, the change works but is incomplete. It's sensible enough to avoid making it more complex. Now the sock/data updates become much simpler because they just have to enable/disable access to a file descriptor and not to care anymore about its readiness.	2014-01-26 00:42:30 +01:00
Willy Tarreau	e6300be8f8	BUG/MEDIUM: stream-interface: don't wake the task up before end of transfer Recent commit `d7ad9f5` ("MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes") was not correct. It used to wake up the task as soon as there was some write activity and the flag was set, even if there were still some data to be forwarded. This resulted in process_session() being called a lot when transferring chunk-encoded HTTP responses made of very large chunks. The purpose of the flag is to wake up only a task waiting for some room and not the other ones, so it's totally counter-productive to wake it up as long as there are data to forward because the task will not be allowed to write anyway. Also, the commit above was taking some risks by not considering certain events anymore (eg: state != SI_ST_EST). While such events are not used at the moment, if some new features were developed in the future relying on these, it would be better that they could be notified when subscribing to the WAKE_WRITE event, so let's restore the condition.	2014-01-25 22:28:22 +01:00
Willy Tarreau	46be2e5039	MEDIUM: connection: update callers of ctrl->drain() to use conn_drain() Now we can more safely rely on the connection state to decide how to drain and what to do when data are drained. Callers don't need to manipulate the file descriptor's state anymore. Note that it also removes the need for the fix `ea90063` ("BUG/MEDIUM: stream-int: fix the keep-alive idle connection handler") since conn_drain() correctly sets the polling flags.	2014-01-20 22:27:17 +01:00
Willy Tarreau	7f4bcc312d	MINOR: protocol: improve the proto->drain() API It was not possible to know if the drain() function had hit an EAGAIN, so now we change the API of this function to return : < 0 if EAGAIN was met, = 0 if some data remain, > 0 if a shutdown was received.	2014-01-20 22:27:16 +01:00
Willy Tarreau	d7ad9f5b0d	MAJOR: channel: add a new flag CF_WAKE_WRITE to notify the task of writes Since commit `6b66f3e` ([MAJOR] implement autonomous inter-socket forwarding) introduced in 1.3.16-rc1, we've been relying on a stupid mechanism to wake up the task after a write, which was an exact copy-paste of the reader side. The principle was that if we empty a buffer and there's no forwarding scheduled or if the producer is not in a connected state, then we wake the task up. That does not make any sense. It happens to wake up too late sometimes (eg, when the request analyser waits for some room in the buffer to start to work), and leads to unneeded wakeups in client-side keep-alive, because the task is woken up when the response is sent, while the analysers are simply waiting for a new request. In order to fix this, we introduce a new channel flag : CF_WAKE_WRITE. It is designed so that an analyser can explicitly request being notified when some data were written. It is used only when the HTTP request or response analysers need to wait for more room in the buffers. It is automatically cleared upon wake up. The flag is also automatically set by the functions which try to write into a buffer from an applet when they fail (bi_putblk() etc...). That allows us to remove the stupid condition above and avoid some wakeups. In http-server-close and in http-keep-alive modes, this reduces from 4 to 3 the average number of wakeups per request, and increases the overall performance by about 1.5%.	2013-12-31 18:37:36 +01:00
Willy Tarreau	61f7f0a959	BUG/MINOR: stream-int: do not clear the owner upon unregister Since the applet rework and the removal of the inter-task applets, we must not clear the stream-interface's owner task anymore otherwise we risk a crash when maintaining keep-alive with an applet. This is not possible right now so there is no impact yet, but this bug is not easy to track down. No backport is needed.	2013-12-28 21:33:37 +01:00
Willy Tarreau	ea90063cbc	BUG/MEDIUM: stream-int: fix the keep-alive idle connection handler Commit `2737562` (MEDIUM: stream-int: implement a very simplistic idle connection manager) implemented an idle connection handler. In the case where all data is drained from the server, it fails to disable polling, resulting in a busy spinning loop. Thanks to Sander Klein and Guillaume Castagnino for reporting this bug. No backport is needed.	2013-12-17 14:21:48 +01:00
Willy Tarreau	2737562e43	MEDIUM: stream-int: implement a very simplistic idle connection manager Idle connections are not monitored right now. So if a server closes after a response without advertising it, it won't be detected until a next request wants to use the connection. This is a bit problematic because it unnecessarily maintains file descriptors and sockets in an idle state. This patch implements a very simple idle connection manager for the stream interface. It presents itself as an I/O callback. The HTTP engine enables it when it recycles a connection. If a close or an error is detected on the underlying socket, it tries to drain as much data as possible from the socket, detect the close and responds with a close as well, then detaches from the stream interface.	2013-12-17 00:00:28 +01:00
Willy Tarreau	ad38acedaa	MEDIUM: connection: centralize handling of nolinger in fd management Right now we see many places doing their own setsockopt(SO_LINGER). Better only do it just before the close() in fd_delete(). For this we add a new flag on the file descriptor, indicating if it's safe or not to linger. If not (eg: after a connect()), then the setsockopt() call is automatically performed before a close(). The flag automatically turns to safe when receiving a read0.	2013-12-16 02:23:52 +01:00
Willy Tarreau	d02cdd23be	MINOR: connection: add simple functions to report connection readiness conn_xprt_ready() reports if the transport layer is ready. conn_ctrl_ready() reports if the control layer is ready. The stream interface uses si_conn_ready() to report that the underlying connection is ready. This will be used for connection reuse in keep-alive mode.	2013-12-16 02:23:52 +01:00
Willy Tarreau	0a23bcb8be	MAJOR: stream-interface: dynamically allocate the applet context From now on, a call to stream_int_register_handler() causes a call to si_alloc_appctx() and returns an initialized appctx for the current stream interface. If one was previously allocated, it is released. If the stream interface was attached to a connection, it is released as well. The appctx are allocated from the same pools as the connections, because they're substantially smaller in size, and we can't have both a connection and an appctx on an interface at any moment. In case of memory shortage, the call may return NULL, which is already handled by all consumers of stream_int_register_handler(). The field appctx was removed from the stream interface since we only rely on the endpoint now. On 32-bit, the stream_interface size went down from 108 to 44 bytes. On 64-bit, it went down from 144 to 64 bytes. This represents a memory saving of 160 bytes per session. It seems that a later improvement could be to move the call to stream_int_register_handler() to session.c for most cases.	2013-12-09 15:40:23 +01:00
Willy Tarreau	1fbe1c9ec8	MEDIUM: stream-int: return the allocated appctx in stream_int_register_handler() The task returned by stream_int_register_handler() is never used, however we always need to access the appctx afterwards. So make it return the appctx instead. We already plan for it to fail, which is the reason for the addition of a few tests and the possibility for the HTTP analyser to return a status code 500.	2013-12-09 15:40:23 +01:00
Willy Tarreau	57cd3e46b9	MEDIUM: connection: merge the send_proxy and local_send_proxy calls We used to have two very similar functions for sending a PROXY protocol line header. The reason is that the default one relies on the stream interface to retrieve the other end's address, while the "local" one performs a local address lookup and sends that instead (used by health checks). Now that the send_proxy_ofs is stored in the connection and not the stream interface, we can make the local_send_proxy rely on it and support partial sends. This also simplifies the code by removing the local_send_proxy function, making health checks use send_proxy_ofs, resulting in the removal of the CO_FL_LOCAL_SPROXY flag, and the associated test in the connection handler. The other flag, CO_FL_SI_SEND_PROXY was renamed without the "SI" part so that it is clear that it is not dedicated anymore to a usage with a stream interface.	2013-12-09 15:40:23 +01:00
Willy Tarreau	b8020cefed	MEDIUM: connection: move the send_proxy offset to the connection Till now the send_proxy_ofs field remained in the stream interface, but since the dynamic allocation of the connection, it makes a lot of sense to move that into the connection instead of the stream interface, since it will not be statically allocated for each session. Also, it turns out that moving it to the connection fills an alignment hole on 64 bit architectures so it does not consume more memory, and removing it from the stream interface was an opportunity to correctly reorder fields and reduce the stream interface's size from 160 to 144 bytes (-10%). This is 32 bytes saved per session.	2013-12-09 15:40:23 +01:00
Willy Tarreau	32e3c6a607	MAJOR: stream interface: dynamically allocate the outgoing connection The outgoing connection is now allocated dynamically upon the first attempt to touch the connection's source or destination address. If this allocation fails, we fail on SN_ERR_RESOURCE. As we didn't use si->conn anymore, it was removed. The endpoints are released upon session_free(), on the error path, and upon a new transaction. That way we are able to carry the existing server's address across retries. The stream interfaces are not initialized anymore before session_complete(), so we could even think about allocating them dynamically as well, though that would not provide much savings. The session initialization now makes use of conn_new()/conn_free(). This slightly simplifies the code and makes it more logical. The connection initialization code is now shorter by about 120 bytes because it's done at once, allowing the compiler to remove all redundant initializations. The si_attach_applet() function now takes care of first detaching the existing endpoint, and it is called from stream_int_register_handler(), so we can safely remove the calls to si_release_endpoint() in the application code around this call. A call to si_detach() was made upon stream_int_unregister_handler() to ensure we always free the allocated connection if one was allocated in parallel to setting an applet (eg: detect HTTP proxy while proceeding with stats maybe).	2013-12-09 15:40:23 +01:00
Willy Tarreau	2a6e8802c0	MEDIUM: stream-interface: introduce si_attach_conn to replace si_prepare_conn si_prepare_conn() is not appropriate in our case as it both initializes and attaches the connection to the stream interface. Due to the asymmetry between accept() and connect(), it causes some fields such as the control and transport layers to be reinitialized. Now that we can separately initialize these fields using conn_prepare(), let's break this function to only attach the connection to the stream interface. Also, by analogy, si_prepare_none() was renamed si_detach(), and si_prepare_applet() was renamed si_attach_applet().	2013-12-09 15:40:23 +01:00
Willy Tarreau	f79c8171b2	MAJOR: connection: add two new flags to indicate readiness of control/transport Currently the control and transport layers of a connection are supposed to be initialized when their respective pointers are not NULL. This will not work anymore when we plan to reuse connections, because there is an asymmetry between the accept() side and the connect() side : - on accept() side, the fd is set first, then the ctrl layer then the transport layer ; upon error, they must be undone in the reverse order, then the FD must be closed. The FD must not be deleted if the control layer was not yet initialized ; - on the connect() side, the fd is set last and there is no reliable way to know if it has been initialized or not. In practice it's initialized to -1 first but this is hackish and supposes that local FDs only will be used forever. Also, there are even fewer solutions for keeping track of the transport layer's state. Also it is possible to support delayed close() when something (eg: logs) tracks some information requiring the transport and/or control layers, making it even more difficult to clean them. So the proposed solution is to add two flags to the connection : - CO_FL_CTRL_READY is set when the control layer is initialized (fd_insert) and cleared after it's released (fd_delete). - CO_FL_XPRT_READY is set when the transport layer is initialized (xprt->init) and cleared after it's released (xprt->close). The functions have been adapted to rely on this and not on the pointers anymore. conn_xprt_close() was unused and dangerous : it did not close the control layer (eg: the socket itself) but still marks the transport layer as closed, preventing any future call to conn_full_close() from finishing the job. The problem comes from conn_full_close() in fact. It needs to close the xprt and ctrl layers independently. After that we're still having an issue : we don't know based on ->ctrl alone whether the fd was registered or not. For this we use the two new flags CO_FL_XPRT_READY and CO_FL_CTRL_READY. We now rely on this and not on conn->xprt nor conn->ctrl anymore to decide what remains to be done on the connection. In order not to miss some flag assignments, we introduce conn_ctrl_init() to initialize the control layer, register the fd using fd_insert() and set the flag, and conn_ctrl_close() which unregisters the fd and removes the flag, but only if the transport layer was closed. Similarly, at the transport layer, conn_xprt_init() calls ->init and sets the flag, while conn_xprt_close() checks the flag, calls ->close and clears the flag, regardless of xprt_ctx or xprt_st. This also ensures that the ->init and the ->close functions are called only once each and in the correct order. Note that conn_xprt_close() does nothing if the transport layer is still tracked. conn_full_close() now simply calls conn_xprt_close() then conn_ctrl_close() in turn, which do nothing if CO_FL_XPRT_TRACKED is set. In order to handle the error path, we also provide conn_force_close() which ignores CO_FL_XPRT_TRACKED and closes the transport and the control layers in turn. All relevant instances of fd_delete() have been replaced with conn_force_close(). Now we always know what state the connection is in and we can expect to split its initialization.	2013-12-09 15:40:23 +01:00
Willy Tarreau	b363a1f469	MAJOR: stream-int: stop using si->conn and use si->end instead The connection will only remain there as a pre-allocated entity whose goal is to be placed in ->end when establishing an outgoing connection. All connection initialization can be made on this connection, but all information retrieved should be applied to the end point only. This change is huge because there were many users of si->conn. Now the only users are those who initialize the new connection. The difficulty appears in a few places such as backend.c, proto_http.c, peers.c where si->conn is used to hold the connection's target address before assigning the connection to the stream interface. This is why we have to keep si->conn for now. A future improvement might consist in dynamically allocating the connection when it is needed.	2013-12-09 15:40:22 +01:00
Willy Tarreau	cf644ed37a	MEDIUM: stream-int: make ->end point to the connection or the appctx The long-term goal is to have a context for applets as an alternative to the connection and not as a complement. At the moment, the context is still stored into the stream interface, and we only put a pointer to the applet's context in si->end, initialize the context with object type OBJ_TYPE_APPCTX, and this allows us not to allocate an entry when deciding to switch to an applet. A special care is taken to never dereference si->conn anymore when dealing with an applet. That's why it's important that si->end is always set to the proper type : si->end == NULL => not connected to anything ; si->end == OBJ_TYPE_APPCTX => connected to an applet ; si->end == OBJ_TYPE_CONN => real connection (server, proxy, ...). The session management code used to check the applet from the connection's target. Now it uses the stream interface's end point and does not touch the connection at all. Similarly, we stop checking the connection's addresses and file descriptors when reporting the applet's status in the stats dump.	2013-12-09 15:40:22 +01:00
Willy Tarreau	4a59f2f954	MAJOR: stream interface: remove the ->release function pointer Since last commit, we now have a pointer to the applet in the applet context. So we don't need the si->release function pointer anymore, it can be extracted from applet->applet.release. At many places, the ->release function was still tested for real connections while it is only limited to applets, so most of them were simply removed. For the remaining valid uses, a new inline function si_applet_release() was added to simplify the check and the call.	2013-12-09 15:40:22 +01:00
Willy Tarreau	7d67d7b9e5	MINOR: stream-int: add a new pointer to the end point The end point will correspond to either an applet context or a connection, depending on the object type. For now the pointer remains null.	2013-12-09 15:40:22 +01:00
Willy Tarreau	372d6708fb	MINOR: stream-int: split si_prepare_embedded into si_prepare_none and si_prepare_applet si_prepare_embedded() was used both to attach an applet and to detach anything from a stream interface. Split it into si_prepare_none() to detach and si_prepare_applet() to attach an applet. si->conn->target is now assigned from within these two functions instead of their respective callers.	2013-12-09 15:40:22 +01:00
Willy Tarreau	6fe1541285	MINOR: stream-int: make the shutr/shutw functions void This is to be more consistent with the other functions. The only reason why these functions used to return a value was to let the caller adjust polling by itself, but now their only callers were the si_shutr()/si_shutw() inline functions. Now these functions do not depend anymore on the connection. The connection variants of these functions now call conn_data_stop_recv()/conn_data_stop_send() before returning, in order not to require a return code anymore. The applet version does not need this at all.	2013-12-09 15:40:22 +01:00
Willy Tarreau	8b3d7dfd7c	MEDIUM: stream-int: split the shutr/shutw functions between applet and conn These functions induce a lot of ifs everywhere because they consider two different cases, one which is where the connection exists and has a file descriptor, and the other one which is the default case where at most an applet has to be notified. Let's have them in si_ops and automatically decide which one to use. The connection shutdown sequence has been slightly simplified, and we now clear the flags at the end. Also we remove SHUTR_NOW after a shutw with nolinger, as it's cleaner not to keep it.	2013-12-09 15:40:22 +01:00
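
The readiness-flag commit in this listing (the entry dated 2013-12-09 15:40:23, introducing CO_FL_XPRT_READY and CO_FL_CTRL_READY) describes a guarded init/close sequence that may be easier to follow as code. What follows is only a minimal sketch of that description, not the HAProxy source: the function and flag names come from the commit message, but the flag values, the struct conn_sketch and struct xprt_ops_sketch types, the fd_registered field (standing in for fd_insert()/fd_delete()) and the call counters are all assumptions made for illustration.

```c
#include <stddef.h>

/* Flag names are from the commit message; the values are hypothetical. */
enum {
    CO_FL_CTRL_READY   = 0x1,  /* control layer initialized, fd registered */
    CO_FL_XPRT_READY   = 0x2,  /* transport layer initialized */
    CO_FL_XPRT_TRACKED = 0x4,  /* transport layer still tracked elsewhere */
};

struct conn_sketch;

/* Hypothetical stand-in for the transport layer's ->init and ->close. */
struct xprt_ops_sketch {
    int  (*init)(struct conn_sketch *c);
    void (*close)(struct conn_sketch *c);
};

/* Hypothetical stand-in for the connection structure. */
struct conn_sketch {
    unsigned int flags;
    const struct xprt_ops_sketch *xprt;
    int fd_registered;  /* models fd_insert() / fd_delete() */
    int init_calls;     /* counters used only to observe the behaviour */
    int close_calls;
};

/* Register the fd and set the flag, at most once. */
static void conn_ctrl_init(struct conn_sketch *c)
{
    if (c->flags & CO_FL_CTRL_READY)
        return;
    c->fd_registered = 1;
    c->flags |= CO_FL_CTRL_READY;
}

/* Unregister the fd, but only once the transport layer has been closed. */
static void conn_ctrl_close(struct conn_sketch *c)
{
    if ((c->flags & (CO_FL_XPRT_READY | CO_FL_CTRL_READY)) == CO_FL_CTRL_READY) {
        c->fd_registered = 0;
        c->flags &= ~CO_FL_CTRL_READY;
    }
}

/* Call ->init once and set the flag on success. */
static int conn_xprt_init(struct conn_sketch *c)
{
    int ret = 0;

    if (c->flags & CO_FL_XPRT_READY)
        return 0;
    if (c->xprt && c->xprt->init)
        ret = c->xprt->init(c);
    if (ret >= 0)
        c->flags |= CO_FL_XPRT_READY;
    return ret;
}

/* Call ->close once; do nothing while the transport is still tracked. */
static void conn_xprt_close(struct conn_sketch *c)
{
    if ((c->flags & (CO_FL_XPRT_READY | CO_FL_XPRT_TRACKED)) == CO_FL_XPRT_READY) {
        if (c->xprt && c->xprt->close)
            c->xprt->close(c);
        c->flags &= ~CO_FL_XPRT_READY;
    }
}

/* Normal path: transport first, then control. */
static void conn_full_close(struct conn_sketch *c)
{
    conn_xprt_close(c);
    conn_ctrl_close(c);
}

/* Error path: ignore CO_FL_XPRT_TRACKED and close both layers. */
static void conn_force_close(struct conn_sketch *c)
{
    if ((c->flags & CO_FL_XPRT_READY) && c->xprt && c->xprt->close)
        c->xprt->close(c);
    if (c->flags & CO_FL_CTRL_READY)
        c->fd_registered = 0;
    c->flags &= ~(CO_FL_XPRT_READY | CO_FL_CTRL_READY);
}

/* Demo transport that only counts calls. */
static int  demo_init(struct conn_sketch *c)  { c->init_calls++; return 0; }
static void demo_close(struct conn_sketch *c) { c->close_calls++; }
static const struct xprt_ops_sketch demo_xprt = { demo_init, demo_close };
```

In this sketch, calling conn_xprt_init() twice runs ->init only once, conn_full_close() leaves a tracked connection untouched, and conn_force_close() closes both layers regardless of tracking, which matches the ordering guarantees the commit message claims.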

271 Commits