haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-07 07:37:02 +02:00

Author	SHA1	Message	Date
Willy Tarreau	e91ffd093e	BUG/MAJOR: tcp: only call registered actions when they're registered Commit `cc87a11` ("MEDIUM: tcp: add register keyword system.") introduced the registration of new keywords for TCP rulesets. Unfortunately it replaced the "accept" action with an unconditionnal call to the rule's action function, resulting in an immediate segfault when using the "accept" action in a TCP ruleset. This bug reported by Baptiste Assmann was introduced in 1.6-dev1, no backport is needed.	2015-04-24 10:13:18 +02:00
Willy Tarreau	152b81e7b2	BUG/MAJOR: tcp/http: fix current_rule assignment when restarting over a ruleset Commit `bc4c1ac` ("MEDIUM: http/tcp: permit to resume http and tcp custom actions") introduced the ability to interrupt and restart processing in the middle of a TCP/HTTP ruleset. But it doesn't do it in a consistent way : it checks current_rule_list, immediately dereferences current_rule, which is only set in certain cases and never cleared. So that broke the tcp-request content rules when the processing was interrupted due to missing data, because current_rule was not yet set (segfault) or could have been inherited from another ruleset if it was used in a backend (random behaviour). The proper way to do it is to always set current_rule before dereferencing it. But we don't want to set it for all rules because we don't want any action to provide a checkpointing mechanism. So current_rule is set to NULL before entering the loop, and only used if not NULL and if current_rule_list matches the current list. This way they both serve as a guard for the other one. This fix also makes the current rule point to the rule instead of its list element, as it's much easier to manipulate. No backport is needed, this is 1.6-specific.	2015-04-20 13:46:20 +02:00
Willy Tarreau	e73ef85a63	MAJOR: tcp: make tcp_exec_req_rules() only rely on the session It passes a NULL wherever a stream was needed (acl_exec_cond() and action_ptr mainly). It can still track the connection rate correctly and block based on ACLs.	2015-04-06 11:37:31 +02:00
Willy Tarreau	70f454e8fa	MEDIUM: proto_tcp: track the session's counters in the connection ruleset The tcp-request connection ruleset now only tracks session counters and not stream counters. Thus it does not need access to the stream anymore.	2015-04-06 11:37:31 +02:00
Willy Tarreau	192252e2d8	MAJOR: sample: pass a pointer to the session to each sample fetch function Many such function need a session, and till now they used to dereference the stream. Once we remove the stream from the embryonic session, this will not be possible anymore. So as of now, sample fetch functions will be called with this : - sess = NULL, strm = NULL : never - sess = valid, strm = NULL : tcp-req connection - sess = valid, strm = valid, strm->txn = NULL : tcp-req content - sess = valid, strm = valid, strm->txn = valid : http-req / http-res	2015-04-06 11:37:25 +02:00
Willy Tarreau	15e91e1b36	MAJOR: sample: don't pass l7 anymore to sample fetch functions All of them can now retrieve the HTTP transaction if it exists from the stream and be sure to get NULL there when called with an embryonic session. The patch is a bit large because many locations were touched (all fetch functions had to have their prototype adjusted). The opportunity was taken to also uniformize the call names (the stream is now always "strm" instead of "l4") and to fix indent where it was broken. This way when we later introduce the session here there will be less confusion.	2015-04-06 11:35:53 +02:00
Willy Tarreau	eee5b51248	MAJOR: http: move http_txn out of struct stream Now this one is dynamically allocated. It means that 280 bytes of memory are saved per TCP stream, but more importantly that it will become possible to remove the l7 pointer from fetches and converters since it will be deduced from the stream and will support being null. A lot of care was taken because it's easy to forget a test somewhere, and the previous code used to always trust s->txn for being valid, but all places seem to have been visited. All HTTP fetch functions check the txn first so we shouldn't have any issue there even when called from TCP. When branching from a TCP frontend to an HTTP backend, the txn is properly allocated at the same time as the hdr_idx.	2015-04-06 11:35:52 +02:00
Willy Tarreau	cb7dd015be	MEDIUM: http: move header captures from http_txn to struct stream The header captures are now general purpose captures since tcp rules can use them to capture various contents. That removes a dependency on http_txn that appeared in some sample fetch functions and in the order by which captures and http_txn were allocated. Interestingly the reset of the header captures were done at too many places as http_init_txn() used to do it while it was done previously in every call place.	2015-04-06 11:35:52 +02:00
Willy Tarreau	9ad7bd48d2	MEDIUM: session: use the pointer to the origin instead of s->si[0].end When s->si[0].end was dereferenced as a connection or anything in order to retrieve information about the originating session, we'll now use sess->origin instead so that when we have to chain multiple streams in HTTP/2, we'll keep accessing the same origin.	2015-04-06 11:34:29 +02:00
Willy Tarreau	e36cbcb3b0	MEDIUM: stream: move the frontend's pointer to the session Just like for the listener, the frontend is session-wide so let's move it to the session. There are a lot of places which were changed but the changes are minimal in fact.	2015-04-06 11:23:58 +02:00
Willy Tarreau	fb0afa77c9	MEDIUM: stream: move the listener's pointer to the session The listener is session-specific, move it there.	2015-04-06 11:23:57 +02:00
Willy Tarreau	e7dff02dd4	REORG/MEDIUM: stream: rename stream flags from SN_* to SF_* This is in order to keep things consistent.	2015-04-06 11:23:57 +02:00
Willy Tarreau	87b09668be	REORG/MAJOR: session: rename the "session" entity to "stream" With HTTP/2, we'll have to support multiplexed streams. A stream is in fact the largest part of what we currently call a session, it has buffers, logs, etc. In order to catch any error, this commit removes any reference to the struct session and tries to rename most "session" occurrences in function names to "stream" and "sess" to "strm" when that's related to a session. The files stream.{c,h} were added and session.{c,h} removed. The session will be reintroduced later and a few parts of the stream will progressively be moved overthere. It will more or less contain only what we need in an embryonic session. Sample fetch functions and converters will have to change a bit so that they'll use an L5 (session) instead of what's currently called "L4" which is in fact L6 for now. Once all changes are completed, we should see approximately this : L7 - http_txn L6 - stream L5 - session L4 - connection \| applet There will be at most one http_txn per stream, and a same session will possibly be referenced by multiple streams. A connection will point to a session and to a stream. The session will hold all the information we need to keep even when we don't yet have a stream. Some more cleanup is needed because some code was already far from being clean. The server queue management still refers to sessions at many places while comments talk about connections. This will have to be cleaned up once we have a server-side connection pool manager. Stream flags "SN_*" still need to be renamed, it doesn't seem like any of them will need to move to the session.	2015-04-06 11:23:56 +02:00
Willy Tarreau	73796535a9	REORG/MEDIUM: channel: only use chn_prod / chn_cons to find stream-interfaces The purpose of these two macros will be to pass via the session to find the relevant stream interfaces so that we don't need to store the ->cons nor ->prod pointers anymore. Currently they're only defined so that all references could be removed. Note that many places need a second pass of clean up so that we don't have any chn_prod(&s->req) anymore and only &s->si[0] instead, and conversely for the 3 other cases.	2015-03-11 20:41:47 +01:00
Willy Tarreau	22ec1eadd0	REORG/MAJOR: move session's req and resp channels back into the session The channels were pointers to outside structs and this is not needed anymore since the buffers have moved, but this complicates operations. Move them back into the session so that both channels and stream interfaces are always allocated for a session. Some places (some early sample fetch functions) used to validate that a channel was NULL prior to dereferencing it. Now instead we check if chn->buf is NULL and we force it to remain NULL until the channel is initialized.	2015-03-11 20:41:46 +01:00
Thierry FOURNIER	bc4c1ac6ad	MEDIUM: http/tcp: permit to resume http and tcp custom actions Later, the processing of some actions needs to be interrupted and resumed later. This patch permit to resume the actions. The actions that needs to run with the resume mode are not yet avalaible. It will be soon with Lua patches. So the code added by this patch is untestable for the moment. The list of "tcp_exec_req_rules" cannot resme because is called by the unresumable function "accept_session".	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	cc87a11842	MEDIUM: tcp: add register keyword system. This patch introduces an action keyword registration system for TCP rulesets similar to what is available for HTTP rulesets. This sytem will be useful with lua.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	f41a809dc9	MINOR: sample: add private argument to the struct sample_fetch The add of this private argument is to prepare the integration of the lua fetchs.	2015-02-28 23:12:31 +01:00
Willy Tarreau	2af207a5f5	MEDIUM: tcp: implement tcp-ut bind option to set TCP_USER_TIMEOUT On Linux since 2.6.37, it's possible to set the socket timeout for pending outgoing data, with an accuracy of 1 millisecond. This is pretty handy to deal with dead connections to clients and or servers. For now we only implement it on the frontend side (bind line) so that when a client disappears from the net, we're able to quickly get rid of its connection and possibly release a server connection. This can be useful with long-lived connections where an application level timeout is not suited because long pauses are expected (remote terminals, connection pools, etc). Thanks to Thijs Houtenbos and John Eckersberg for the suggestion.	2015-02-04 00:54:40 +01:00
Willy Tarreau	529c13933b	BUG/MAJOR: namespaces: conn->target is not necessarily a server create_server_socket() used to dereference objt_server(conn->target), but if the target is not a server (eg: a proxy) then it's NULL and we get a segfault. This can be reproduced with a proxy using "dispatch" with no server, even when namespaces are disabled, because that code is not #ifdef'd. The fix consists in first checking if the target is a server. This fix does not need to be backported, this is 1.6-only.	2014-12-24 13:47:55 +01:00
KOVACS Krisztian	b3e54fe387	MAJOR: namespace: add Linux network namespace support This patch makes it possible to create binds and servers in separate namespaces. This can be used to proxy between multiple completely independent virtual networks (with possibly overlapping IP addresses) and a non-namespace-aware proxy implementation that supports the proxy protocol (v2). The setup is something like this: net1 on VLAN 1 (namespace 1) -\ net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0) net3 on VLAN 3 (namespace 3) -/ The proxy is configured to make server connections through haproxy and sending the expected source/target addresses to haproxy using the proxy protocol. The network namespace setup on the haproxy node is something like this: = 8< = $ cat setup.sh ip netns add 1 ip link add link eth1 type vlan id 1 ip link set eth1.1 netns 1 ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1 ip netns exec 1 ip link set eth1.$id up ... = 8< = = 8< = $ cat haproxy.cfg frontend clients bind 127.0.0.1:50022 namespace 1 transparent default_backend scb backend server mode tcp server server1 192.168.122.4:2222 namespace 2 send-proxy-v2 = 8< = A bind line creates the listener in the specified namespace, and connections originating from that listener also have their network namespace set to that of the listener. A server line either forces the connection to be made in a specified namespace or may use the namespace from the client-side connection if that was set. For more documentation please read the documentation included in the patch itself. Signed-off-by: KOVACS Tamas <ktamas@balabit.com> Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.com>	2014-11-21 07:51:57 +01:00
Willy Tarreau	5e0d0e046a	BUG/MEDIUM: tcp: don't use SO_ORIGINAL_DST on non-AF_INET sockets There's an issue when using SO_ORIGINAL_DST to retrieve the original destination of a connection's address before being translated by Netfilter's DNAT/REDIRECT or the old TPROXY. SO_ORIGINAL_DST is able to retrieve an IPv4 address when the original destination was IPv4 mapped into IPv6. At first glance it's not a big deal, but it is for logging and for the proxy protocol, because we then have two different address families for the source and destination. In this case, the proxy protocol correctly detects the issue and emits "UNKNOWN". In order to fix this, we perform getsockname() first, and only if the address family is AF_INET, then we perform the getsockopt() call. This fix must be backported to 1.5, and probably even to 1.4 and 1.3.	2014-10-29 21:46:01 +01:00
Willy Tarreau	fb20e4668d	BUG/MEDIUM: tcp: fix outgoing polling based on proxy protocol During a tcp connection setup in tcp_connect_server(), we check if there are pending data to start polling for writes immediately. We also use the same test to know if we can disable the quick ack and merge the first data packet with the connection's ACK. This last case is also valid for the proxy protocol. The problem lies in the way it's done, as the "data" variable is improperly completed with the presence of the proxy protocol, resulting in the connection being polled for data writes if the proxy protocol is enabled. It's not a big issue per se, except that the proxy protocol uses the fact that we're polling for data to know if it can use MSG_MORE. This causes no problem on HTTP/HTTPS, but with banner protocols, it introduces a 200ms delay if the server waits for the PROXY header. This has been caused by the connection management changes introduced in 1.5-dev12, specifically commit `a1a7474` ("MEDIUM: proxy-proto: don't use buffer flags in conn_si_send_proxy()"), so this fix must be backported to 1.5.	2014-10-24 12:09:12 +02:00
Willy Tarreau	e1cfc1f2b4	BUG/MINOR: config: do not accept more track-sc than configured MAX_SESS_STKCTR allows one to define the number of stick counters that can be used in parallel in track-sc* rules. The naming of this macro creates some confusion because the value there is sometimes used as a max instead of a count, and the config parser accepts values from 0 to MAX_SESS_STKCTR and the processing ignores anything tracked on the last one. This means that by default, track-sc3 is allowed and ignored. This fix must be backported to 1.5 where the problem there only affects TCP rules.	2014-10-17 11:53:05 +02:00
Willy Tarreau	3986b9c140	MEDIUM: config: report it when tcp-request rules are misplaced A config where a tcp-request rule appears after an http-request rule might seem valid but it is not. So let's report a warning about this since this case is hard to detect by the naked eye.	2014-09-16 15:43:24 +02:00
Willy Tarreau	6bcb0a84e7	BUG/MAJOR: tcp: fix a possible busy spinning loop in content track-sc* As a consequence of various recent changes on the sample conversion, a corner case has emerged where it is possible to wait forever for a sample in track-sc*. The issue is caused by the fact that functions relying on sample_process() don't all exactly work the same regarding the SMP_F_MAY_CHANGE flag and the output result. Here it was possible to wait forever for an output sample from stktable_fetch_key() without checking the SMP_OPT_FINAL flag. As a result, if the client connects and closes without sending the data and haproxy expects a sample which is capable of coming, it will ignore this impossible case and will continue to wait. This change adds control for SMP_OPT_FINAL before waiting for extra data. The various relevant functions have been better documented regarding their output values. This fix must be backported to 1.5 since it appeared there.	2014-07-30 08:56:35 +02:00
Willy Tarreau	092d865c53	MEDIUM: listener: implement a per-protocol pause() function In order to fix the abstact socket pause mechanism during soft restarts, we'll need to proceed differently depending on the socket protocol. The pause_listener() function already supports some protocol-specific handling for the TCP case. This commit makes this cleaner by adding a new ->pause() function to the protocol struct, which, if defined, may be used to pause a listener of a given protocol. For now, only TCP has been adapted, with the specific code moved from pause_listener() to tcp_pause_listener().	2014-07-08 01:13:34 +02:00
Willy Tarreau	1b71eb581e	BUG/MEDIUM: counters: fix track-sc* to wait on unstable contents I've been facing multiple configurations which involved track-sc* rules in tcp-request content without the "if ..." to force it to wait for the contents, resulting in random behaviour with contents sometimes retrieved and sometimes not. Reading the doc doesn't make it clear either that the tracking will be performed only if data are already there and that waiting on an ACL is the only way to avoid this. Since this behaviour is not natural and we now have the ability to fix it, this patch ensures that if input data are still moving, instead of silently dropping them, we naturally wait for them to stabilize up to the inspect-delay. This way it's not needed anymore to implement an ACL-based condition to force to wait for data, eventhough the behaviour is not changed for when an ACL is present. The most obvious usage will be when track-sc is followed by any HTTP sample expression, there's no need anymore for adding "if HTTP". It's probably worth backporting this to 1.5 to avoid further configuration issues. Note that it requires previous patch.	2014-06-25 17:26:54 +02:00
Willy Tarreau	b5975defba	MINOR: stick-table: make stktable_fetch_key() indicate why it failed stktable_fetch_key() does not indicate whether it returns NULL because the input sample was not found or because it's unstable. It causes trouble with track-sc* rules. Just like with sample_fetch_string(), we want it to be able to give more information to the caller about what it found. Thus, now we use the pointer to a sample passed by the caller, and fill it with the information we have about the sample. That way, even if we return NULL, the caller has the ability to check whether a sample was found and if it is still changing or not.	2014-06-25 17:17:53 +02:00
Willy Tarreau	18bf01e900	MEDIUM: tcp: add a new tcp-request capture directive This new directive captures the specified fetch expression, converts it to text and puts it into the next capture slot. The capture slots are shared with header captures so that it is possible to dump all captures at once or selectively in logs and header processing. The purpose is to permit logs to contain whatever payload is found in a request, for example bytes at a fixed location or the SNI of forwarded SSL traffic.	2014-06-13 16:45:53 +02:00
Willy Tarreau	9cf8d3f46b	MINOR: protocols: use is_inet_addr() when only INET addresses are desired We used to have is_addr() in place to validate sometimes the existence of an address, sometimes a valid IPv4 or IPv6 address. Replace them carefully so that is_inet_addr() is used wherever we can only use an IPv4/IPv6 address.	2014-05-10 01:26:37 +02:00
Thierry FOURNIER	eeaa951726	MINOR: configuration: File and line propagation This patch permits to communicate file and line of the configuration file at the configuration parser.	2014-03-17 18:06:08 +01:00
Thierry FOURNIER	0d6ba513a5	MINOR: pattern: store configuration reference for each acl or map pattern. This patch permit to add reference for each pattern reference. This is useful to identify the acl listed.	2014-03-17 18:06:07 +01:00
Lukas Tribus	7640e72a31	MINOR: set IP_FREEBIND on IPv6 sockets in transparent mode Lets set IP_FREEBIND on IPv6 sockets as well, this works since Linux 3.3 and doesn't require CAP_NET_ADMIN privileges (IPV6_TRANSPARENT does). This allows unprivileged users to bind to non-local IPv6 addresses, which can be useful when setting up the listening sockets or when connecting to backend servers with a specific, non-local source IPv6 address (at that point we usually dropped root privileges already).	2014-03-03 21:31:10 +01:00
Willy Tarreau	cc08d2c9ff	MEDIUM: counters: stop relying on session flags at all Till now, we had one flag per stick counter to indicate if it was tracked in a backend or in a frontend. We just had to add another flag per stick-counter to indicate if it relies on contents or just connection. These flags are quite painful to maintain and tend to easily conflict with other flags if their number is changed. The correct solution consists in moving the flags to the stkctr struct itself, but currently this struct is made of 2 pointers, so adding a new entry there to store only two bits will cause at least 16 more bytes to be eaten per counter due to alignment issues, and we definitely don't want to waste tens to hundreds of bytes per session just for things that most users don't use. Since we only need to store two bits per counter, an intermediate solution consists in replacing the entry pointer with a composite value made of the original entry pointer and the two flags in the 2 unused lower bits. If later a need for other flags arises, we'll have to store them in the struct. A few inline functions have been added to abstract the retrieval and assignment of the pointers and flags, resulting in very few changes. That way there is no more dependence on the number of stick-counters and their position in the session flags.	2014-01-28 23:34:45 +01:00
Willy Tarreau	f3338349ec	BUG/MEDIUM: counters: flush content counters after each request One year ago, commit `5d5b5d8` ("MEDIUM: proto_tcp: add support for tracking L7 information") brought support for tracking L7 information in tcp-request content rules. Two years earlier, commit `0a4838c` ("[MEDIUM] session-counters: correctly unbind the counters tracked by the backend") used to flush the backend counters after processing a request. While that earliest patch was correct at the time, it became wrong after the second patch was merged. The code does what it says, but the concept is flawed. "TCP request content" rules are evaluated for each HTTP request over a single connection. So if such a rule in the frontend decides to track any L7 information or to track L4 information when an L7 condition matches, then it is applied to all requests over the same connection even if they don't match. This means that a rule such as : tcp-request content track-sc0 src if { path /index.html } will count one request for index.html, and another one for each of the objects present on this page that are fetched over the same connection which sent the initial matching request. Worse, it is possible to make the code do stupid things by using multiple counters: tcp-request content track-sc0 src if { path /foo } tcp-request content track-sc1 src if { path /bar } Just sending two requests first, one with /foo, one with /bar, shows twice the number of requests for all subsequent requests. Just because both of them persist after the end of the request. So the decision to flush backend-tracked counters was not the correct one. In practice, what is important is to flush countent-based rules since they are the ones evaluated for each request. Doing so requires new flags in the session however, to keep track of which stick-counter was tracked by what ruleset. A later change might make this easier to maintain over time. This bug is 1.5-specific, no backport to stable is needed.	2014-01-28 21:40:28 +01:00
Willy Tarreau	3c72872da1	CLEANUP: connection: use conn_ctrl_ready() instead of checking the flag It's easier and safer to rely on conn_ctrl_ready() everywhere than to check the flag itself. It will also simplify adding extra checks later if needed. Some useless controls for !ctrl have been removed, as the CTRL_READY flag itself guarantees ctrl is set.	2014-01-26 00:42:31 +01:00
Willy Tarreau	fd803bb4d7	MEDIUM: connection: add check for readiness in I/O handlers The recv/send callbacks must check for readiness themselves instead of having their callers do it. This will strengthen the test and will also ensure we never refrain from calling a handshake handler because a direction is being polled while the other one is ready.	2014-01-26 00:42:30 +01:00
Willy Tarreau	e1f50c4b02	MEDIUM: connection: remove conn_{data,sock}_poll_{recv,send} We simply remove these functions and replace their calls with the appropriate ones : - if we're in the data phase, we can simply report wait on the FD - if we're in the socket phase, we may also have to signal the desire to read/write on the socket because it might not be active yet.	2014-01-26 00:42:30 +01:00
Willy Tarreau	f817e9f473	MAJOR: polling: rework the whole polling system This commit heavily changes the polling system in order to definitely fix the frequent breakage of SSL which needs to remember the last EAGAIN before deciding whether to poll or not. Now we have a state per direction for each FD, as opposed to a previous and current state previously. An FD can have up to 8 different states for each direction, each of which being the result of a 3-bit combination. These 3 bits indicate a wish to access the FD, the readiness of the FD and the subscription of the FD to the polling system. This means that it will now be possible to remember the state of a file descriptor across disable/enable sequences that generally happen during forwarding, where enabling reading on a previously disabled FD would result in forgetting the EAGAIN flag it met last time. Several new state manipulation functions have been introduced or adapted : - fd_want_{recv,send} : enable receiving/sending on the FD regardless of its state (sets the ACTIVE flag) ; - fd_stop_{recv,send} : stop receiving/sending on the FD regardless of its state (clears the ACTIVE flag) ; - fd_cant_{recv,send} : report a failure to receive/send on the FD corresponding to EAGAIN (clears the READY flag) ; - fd_may_{recv,send} : report the ability to receive/send on the FD as reported by poll() (sets the READY flag) ; Some functions are used to report the current FD status : - fd_{recv,send}_active - fd_{recv,send}_ready - fd_{recv,send}_polled Some functions were removed : - fd_ev_clr(), fd_ev_set(), fd_ev_rem(), fd_ev_wai() The POLLHUP/POLLERR flags are now reported as ready so that the I/O layers knows it can try to access the file descriptor to get this information. In order to simplify the conditions to add/remove cache entries, a new function fd_alloc_or_release_cache_entry() was created to be used from pollers while scanning for updates. The following pollers have been updated : ev_select() : done, built, tested on Linux 3.10 ev_poll() : done, built, tested on Linux 3.10 ev_epoll() : done, built, tested on Linux 3.10 & 3.13 ev_kqueue() : done, built, tested on OpenBSD 5.2	2014-01-26 00:42:30 +01:00
Willy Tarreau	9ce7013429	MEDIUM: tcp: report connection error at the connection level Now when a connection error happens, it is reported in the connection so that upper layers know exactly what happened. This is particularly useful with health checks and resources exhaustion.	2014-01-24 16:15:04 +01:00
Willy Tarreau	3bd3e57a9b	MEDIUM: tcp: report in tcp_drain() that lingering is already disabled on close When an incoming shutdown or error is detected, we know that we can safely close without disabling lingering. Do it in tcp_drain() so that we don't have to do it from each and every caller.	2014-01-20 22:27:17 +01:00
Willy Tarreau	7f4bcc312d	MINOR: protocol: improve the proto->drain() API It was not possible to know if the drain() function had hit an EAGAIN, so now we change the API of this function to return : < 0 if EAGAIN was met = 0 if some data remain > 0 if a shutdown was received	2014-01-20 22:27:16 +01:00
Willy Tarreau	ad38acedaa	MEDIUM: connection: centralize handling of nolinger in fd management Right now we see many places doing their own setsockopt(SO_LINGER). Better only do it just before the close() in fd_delete(). For this we add a new flag on the file descriptor, indicating if it's safe or not to linger. If not (eg: after a connect()), then the setsockopt() call is automatically performed before a close(). The flag automatically turns to safe when receiving a read0.	2013-12-16 02:23:52 +01:00
Willy Tarreau	975c1784c8	MINOR: sample: make sample_parse_expr() use memprintf() to report parse errors Doing so ensures that we're consistent between all the functions in the whole chain. This is important so that we can extract the argument parsing from this function.	2013-12-12 23:16:54 +01:00
Willy Tarreau	57cd3e46b9	MEDIUM: connection: merge the send_proxy and local_send_proxy calls We used to have two very similar functions for sending a PROXY protocol line header. The reason is that the default one relies on the stream interface to retrieve the other end's address, while the "local" one performs a local address lookup and sends that instead (used by health checks). Now that the send_proxy_ofs is stored in the connection and not the stream interface, we can make the local_send_proxy rely on it and support partial sends. This also simplifies the code by removing the local_send_proxy function, making health checks use send_proxy_ofs, resulting in the removal of the CO_FL_LOCAL_SPROXY flag, and the associated test in the connection handler. The other flag, CO_FL_SI_SEND_PROXY was renamed without the "SI" part so that it is clear that it is not dedicated anymore to a usage with a stream interface.	2013-12-09 15:40:23 +01:00
Willy Tarreau	1ec74bf660	MINOR: connection: check for send_proxy during the connect(), not the SI It's cleaner to check for a pending send_proxy_ofs while establishing the connection (which already checks it anyway) and not in the stream interface.	2013-12-09 15:40:23 +01:00
Willy Tarreau	f79c8171b2	MAJOR: connection: add two new flags to indicate readiness of control/transport Currently the control and transport layers of a connection are supposed to be initialized when their respective pointers are not NULL. This will not work anymore when we plan to reuse connections, because there is an asymmetry between the accept() side and the connect() side : - on accept() side, the fd is set first, then the ctrl layer then the transport layer ; upon error, they must be undone in the reverse order, then the FD must be closed. The FD must not be deleted if the control layer was not yet initialized ; - on the connect() side, the fd is set last and there is no reliable way to know if it has been initialized or not. In practice it's initialized to -1 first but this is hackish and supposes that local FDs only will be used forever. Also, there are even less solutions for keeping trace of the transport layer's state. Also it is possible to support delayed close() when something (eg: logs) tracks some information requiring the transport and/or control layers, making it even more difficult to clean them. So the proposed solution is to add two flags to the connection : - CO_FL_CTRL_READY is set when the control layer is initialized (fd_insert) and cleared after it's released (fd_delete). - CO_FL_XPRT_READY is set when the control layer is initialized (xprt->init) and cleared after it's released (xprt->close). The functions have been adapted to rely on this and not on the pointers anymore. conn_xprt_close() was unused and dangerous : it did not close the control layer (eg: the socket itself) but still marks the transport layer as closed, preventing any future call to conn_full_close() from finishing the job. The problem comes from conn_full_close() in fact. It needs to close the xprt and ctrl layers independantly. After that we're still having an issue : we don't know based on ->ctrl alone whether the fd was registered or not. For this we use the two new flags CO_FL_XPRT_READY and CO_FL_CTRL_READY. We now rely on this and not on conn->xprt nor conn->ctrl anymore to decide what remains to be done on the connection. In order not to miss some flag assignments, we introduce conn_ctrl_init() to initialize the control layer, register the fd using fd_insert() and set the flag, and conn_ctrl_close() which unregisters the fd and removes the flag, but only if the transport layer was closed. Similarly, at the transport layer, conn_xprt_init() calls ->init and sets the flag, while conn_xprt_close() checks the flag, calls ->close and clears the flag, regardless xprt_ctx or xprt_st. This also ensures that the ->init and the ->close functions are called only once each and in the correct order. Note that conn_xprt_close() does nothing if the transport layer is still tracked. conn_full_close() now simply calls conn_xprt_close() then conn_full_close() in turn, which do nothing if CO_FL_XPRT_TRACKED is set. In order to handle the error path, we also provide conn_force_close() which ignores CO_FL_XPRT_TRACKED and closes the transport and the control layers in turns. All relevant instances of fd_delete() have been replaced with conn_force_close(). Now we always know what state the connection is in and we can expect to split its initialization.	2013-12-09 15:40:23 +01:00
Willy Tarreau	b363a1f469	MAJOR: stream-int: stop using si->conn and use si->end instead The connection will only remain there as a pre-allocated entity whose goal is to be placed in ->end when establishing an outgoing connection. All connection initialization can be made on this connection, but all information retrieved should be applied to the end point only. This change is huge because there were many users of si->conn. Now the only users are those who initialize the new connection. The difficulty appears in a few places such as backend.c, proto_http.c, peers.c where si->conn is used to hold the connection's target address before assigning the connection to the stream interface. This is why we have to keep si->conn for now. A future improvement might consist in dynamically allocating the connection when it is needed.	2013-12-09 15:40:22 +01:00
Willy Tarreau	26f4a04744	MEDIUM: connection: set the socket shutdown flags on socket errors When we get a hard error from a syscall indicating the socket is dead, it makes sense to set the CO_FL_SOCK_WR_SH and CO_FL_SOCK_RD_SH flags to indicate that the socket may not be used anymore. It will ease the error processing in health checks where the state of socket is very important. We'll also be able to avoid some setsockopt(nolinger) after an error. For now, the rest of the code is not impacted because CO_FL_ERROR is always tested prior to these flags.	2013-12-04 23:50:36 +01:00
Willy Tarreau	f12a20ebce	BUG/MINOR: tcp: check that no error is pending during a connect probe The tcp_connect_probe() function may be called upon I/O activity when no recv/send callbacks were called (eg: recv not possible, nothing to send). It only relies on connect() to observe the connection establishment progress but that does not work when some network errors are pending on the socket (eg: a delayed connection refused). For this reason we need to run a getsockopt() in the case where the poller reports FD_POLL_ERR on the socket. We use this opportunity to update errno so that the conn->data->wake() function has all relevant info when it sees CO_FL_ERROR. At the moment no code is impacted by this bug because recv polling is always enabled during a connect, so recvfrom() always sees the error first. But this may change with the health check cleanup. No backport is needed.	2013-12-04 23:50:10 +01:00
Willy Tarreau	0cba607400	MINOR: acl/pattern: use types different from int to clarify who does what. We now have the following enums and all related functions return them and consume them : enum pat_match_res { PAT_NOMATCH = 0, /* sample didn't match any pattern / PAT_MATCH = 3, / sample matched at least one pattern / }; enum acl_test_res { ACL_TEST_FAIL = 0, / test failed / ACL_TEST_MISS = 1, / test may pass with more info / ACL_TEST_PASS = 3, / test passed / }; enum acl_cond_pol { ACL_COND_NONE, / no polarity set yet / ACL_COND_IF, / positive condition (after 'if') / ACL_COND_UNLESS, / negative condition (after 'unless') */ }; It's just in order to avoid doubts when reading some code.	2013-12-02 23:31:33 +01:00
Thierry FOURNIER	a65b343eee	MEDIUM: pattern: rename "acl" prefix to "pat" This patch just renames functions, types and enums. No code was changed. A significant number of files were touched, especially the ACL arrays, so it is likely that some external patches will not apply anymore. One important thing is that we had to split ACL_PAT_* into two groups : - ACL_TEST_{PASS\|MISS\|FAIL} - PAT_{MATCH\|UNMATCH} A future patch will enforce enums on all these places to avoid confusion.	2013-12-02 23:31:33 +01:00
Willy Tarreau	0bb166be5e	MINOR: tcp: don't use tick_add_ifset() when timeout is known to be set These two useless tests propably result from a copy-paste. The test is performed in the condition to enter the block.	2013-11-04 18:12:20 +01:00
Willy Tarreau	44778ad87d	BUG/MEDIUM: tcp: do not skip tracking rules on second pass The track-sc* tcp rules are bogus. The test to verify if the tracked counter was already assigned is performed in the same condition as the test for the action. The effect is that a rule which tracks a counter that is already being tracked is implicitly converted to an accept because the default rule is an accept. This bug only affects 1.5-dev releases.	2013-10-30 19:29:21 +01:00
Willy Tarreau	cc1e04b1e8	MINOR: tcp: add new "close" action for tcp-response This new action immediately closes the connection with the server when the condition is met. The first such rule executed ends the rules evaluation. The main purpose of this action is to force a connection to be finished between a client and a server after an exchange when the application protocol expects some long time outs to elapse first. The goal is to eliminate idle connections which take signifiant resources on servers with certain protocols.	2013-09-11 23:28:51 +02:00
Willy Tarreau	b4c8493a9f	MINOR: session: make the number of stick counter entries more configurable In preparation of more flexibility in the stick counters, make their number configurable. It still defaults to 3 which is the minimum accepted value. Changing the value alone is not sufficient to get more counters, some bitfields still need to be updated and the TCP actions need to be updated as well, but this update tries to be easier, which is nice for experimentation purposes.	2013-08-01 21:17:14 +02:00
Willy Tarreau	ef38c39287	MEDIUM: sample: systematically pass the keyword pointer to the keyword We're having a lot of duplicate code just because of minor variants between fetch functions that could be dealt with if the functions had the pointer to the original keyword, so let's pass it as the last argument. An earlier version used to pass a pointer to the sample_fetch element, but this is not the best solution for two reasons : - fetch functions will solely rely on the keyword string - some other smp_fetch_* users do not have the pointer to the original keyword and were forced to pass NULL. So finally we're passing a pointer to the keyword as a const char *, which perfectly fits the original purpose.	2013-08-01 21:17:13 +02:00
Willy Tarreau	dc13c11c1e	BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS Benoit Dolez reported a failure to start haproxy 1.5-dev19. The process would immediately report an internal error with missing fetches from some crap instead of ACL names. The cause is that some versions of gcc seem to trim static structs containing a variable array when moving them to BSS, and only keep the fixed size, which is just a list head for all ACL and sample fetch keywords. This was confirmed at least with gcc 3.4.6. And we can't move these structs to const because they contain a list element which is needed to link all of them together during the parsing. The bug indeed appeared with 1.5-dev19 because it's the first one to have some empty ACL keyword lists. One solution is to impose -fno-zero-initialized-in-bss to everyone but this is not really nice. Another solution consists in ensuring the struct is never empty so that it does not move there. The easy solution consists in having a non-null list head since it's not yet initialized. A new "ILH" list head type was thus created for this purpose : create an Initialized List Head so that gcc cannot move the struct to BSS. This fixes the issue for this version of gcc and does not create any burden for the declarations.	2013-06-21 23:29:02 +02:00
Willy Tarreau	be4a3eff34	MEDIUM: counters: use sc0/sc1/sc2 instead of sc1/sc2/sc3 It was a bit inconsistent to have gpc start at 0 and sc start at 1, so make sc start at zero like gpc. No previous release was issued with sc3 anyway, so no existing setup should be affected.	2013-06-17 15:04:07 +02:00
Willy Tarreau	6d4e4e8dd2	MEDIUM: acl: remove a lot of useless ACLs that are equivalent to their fetches The following 116 ACLs were removed because they're redundant with their fetch function since last commit which allows the fetch function to be used instead for types BOOL, INT and IP. Most places are now left with an empty ACL keyword list that was not removed so that it's easier to add other ACLs later. always_false, always_true, avg_queue, be_conn, be_id, be_sess_rate, connslots, nbsrv, queue, srv_conn, srv_id, srv_is_up, srv_sess_rate, res.comp, fe_conn, fe_id, fe_sess_rate, dst_conn, so_id, wait_end, http_auth, http_first_req, status, dst, dst_port, src, src_port, sc1_bytes_in_rate, sc1_bytes_out_rate, sc1_clr_gpc0, sc1_conn_cnt, sc1_conn_cur, sc1_conn_rate, sc1_get_gpc0, sc1_gpc0_rate, sc1_http_err_cnt, sc1_http_err_rate, sc1_http_req_cnt, sc1_http_req_rate, sc1_inc_gpc0, sc1_kbytes_in, sc1_kbytes_out, sc1_sess_cnt, sc1_sess_rate, sc1_tracked, sc1_trackers, sc2_bytes_in_rate, sc2_bytes_out_rate, sc2_clr_gpc0, sc2_conn_cnt, sc2_conn_cur, sc2_conn_rate, sc2_get_gpc0, sc2_gpc0_rate, sc2_http_err_cnt, sc2_http_err_rate, sc2_http_req_cnt, sc2_http_req_rate, sc2_inc_gpc0, sc2_kbytes_in, sc2_kbytes_out, sc2_sess_cnt, sc2_sess_rate, sc2_tracked, sc2_trackers, sc3_bytes_in_rate, sc3_bytes_out_rate, sc3_clr_gpc0, sc3_conn_cnt, sc3_conn_cur, sc3_conn_rate, sc3_get_gpc0, sc3_gpc0_rate, sc3_http_err_cnt, sc3_http_err_rate, sc3_http_req_cnt, sc3_http_req_rate, sc3_inc_gpc0, sc3_kbytes_in, sc3_kbytes_out, sc3_sess_cnt, sc3_sess_rate, sc3_tracked, sc3_trackers, src_bytes_in_rate, src_bytes_out_rate, src_clr_gpc0, src_conn_cnt, src_conn_cur, src_conn_rate, src_get_gpc0, src_gpc0_rate, src_http_err_cnt, src_http_err_rate, src_http_req_cnt, src_http_req_rate, src_inc_gpc0, src_kbytes_in, src_kbytes_out, src_sess_cnt, src_sess_rate, src_updt_conn_cnt, table_avl, table_cnt, ssl_c_ca_err, ssl_c_ca_err_depth, ssl_c_err, ssl_c_used, ssl_c_verify, ssl_c_version, ssl_f_version, ssl_fc, ssl_fc_alg_keysize, ssl_fc_has_crt, ssl_fc_has_sni, ssl_fc_use_keysize,	2013-06-11 21:22:58 +02:00
Willy Tarreau	4f0d919bd4	MEDIUM: tcp: add "tcp-request connection expect-proxy layer4" This configures the client-facing connection to receive a PROXY protocol header before any byte is read from the socket. This is equivalent to having the "accept-proxy" keyword on the "bind" line, except that using the TCP rule allows the PROXY protocol to be accepted only for certain IP address ranges using an ACL. This is convenient when multiple layers of load balancers are passed through by traffic coming from public hosts.	2013-06-11 20:40:55 +02:00
Willy Tarreau	2b57cb8f30	MEDIUM: protocol: implement a "drain" function in protocol layers Since commit `cfd97c6f` was merged into 1.5-dev14 (BUG/MEDIUM: checks: prevent TIME_WAITs from appearing also on timeouts), some valid health checks sometimes used to show some TCP resets. For example, this HTTP health check sent to a local server : 19:55:15.742818 IP 127.0.0.1.16568 > 127.0.0.1.8000: S 3355859679:3355859679(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7> 19:55:15.742841 IP 127.0.0.1.8000 > 127.0.0.1.16568: S 1060952566:1060952566(0) ack 3355859680 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7> 19:55:15.742863 IP 127.0.0.1.16568 > 127.0.0.1.8000: . ack 1 win 257 19:55:15.745402 IP 127.0.0.1.16568 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257 19:55:15.745488 IP 127.0.0.1.8000 > 127.0.0.1.16568: FP 1:146(145) ack 23 win 257 19:55:15.747109 IP 127.0.0.1.16568 > 127.0.0.1.8000: R 23:23(0) ack 147 win 257 After some discussion with Chris Huang-Leaver, it appeared clear that what we want is to only send the RST when we have no other choice, which means when the server has not closed. So we still keep SYN/SYN-ACK/RST for pure TCP checks, but don't want to see an RST emitted as above when the server has already sent the FIN. The solution against this consists in implementing a "drain" function at the protocol layer, which, when defined, causes as much as possible of the input socket buffer to be flushed to make recv() return zero so that we know that the server's FIN was received and ACKed. On Linux, we can make use of MSG_TRUNC on TCP sockets, which has the benefit of draining everything at once without even copying data. On other platforms, we read up to one buffer of data before the close. If recv() manages to get the final zero, we don't disable lingering. Same for hard errors. Otherwise we do. In practice, on HTTP health checks we generally find that the close was pending and is returned upon first recv() call. The network trace becomes cleaner : 19:55:23.650621 IP 127.0.0.1.16561 > 127.0.0.1.8000: S 3982804816:3982804816(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7> 19:55:23.650644 IP 127.0.0.1.8000 > 127.0.0.1.16561: S 4082139313:4082139313(0) ack 3982804817 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7> 19:55:23.650666 IP 127.0.0.1.16561 > 127.0.0.1.8000: . ack 1 win 257 19:55:23.651615 IP 127.0.0.1.16561 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257 19:55:23.651696 IP 127.0.0.1.8000 > 127.0.0.1.16561: FP 1:146(145) ack 23 win 257 19:55:23.652628 IP 127.0.0.1.16561 > 127.0.0.1.8000: F 23:23(0) ack 147 win 257 19:55:23.652655 IP 127.0.0.1.8000 > 127.0.0.1.16561: . ack 24 win 257 This change should be backported to 1.4 which is where Chris encountered this issue. The code is different, so probably the tcp_drain() function will have to be put in the checks only.	2013-06-10 20:33:23 +02:00
Willy Tarreau	e25c917af8	MEDIUM: counters: add support for tracking a third counter We're often missin a third counter to track base, src and base+src at the same time. Here we introduce track_sc3 to have this third counter. It would be wise not to add much more counters because that slightly increases the session size and processing time though the real issue is more the declaration of the keywords in the code and in the doc.	2013-05-29 00:37:16 +02:00
Willy Tarreau	d5ca9abb0d	MINOR: counters: make it easier to extend the amount of tracked counters By properly affecting the flags and values, it becomes easier to add more tracked counters, for example for experimentation. It also slightly reduces the code and the number of tests. No counters were added with this patch.	2013-05-28 17:43:40 +02:00
Pieter Baauw	1eb7592bba	MINOR: tproxy: add support for OpenBSD OpenBSD uses (SOL_SOCKET, SO_BINDANY) to enable transparent proxy on a socket. This patch adds support for the relevant setsockopt() calls.	2013-05-11 08:03:50 +02:00
Pieter Baauw	ff30b6667b	MINOR: tproxy: add support for FreeBSD FreeBSD uses (IPPROTO_IP, IP_BINDANY) and (IPPROTO_IPV6, IPV6_BINDANY) to enable transparent proxy on a socket. This patch adds support for the relevant setsockopt() calls.	2013-05-11 08:03:43 +02:00
Pieter Baauw	d551fb5a8d	REORG: tproxy: prepare the transparent proxy defines for accepting other OSes This patch does not change the logic of the code, it only changes the way OS-specific defines are tested. At the moment the transparent proxy code heavily depends on Linux-specific defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT" which is set every time the defines used by transparent proxy are present. This also means that with an up-to-date libc, it should not be necessary anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags will automatically be detected. The CTTPROXY flags still remain separate because this older API doesn't work the same way. A new line has been added in the version output for haproxy -vv to indicate what transparent proxy support is available.	2013-05-11 08:03:37 +02:00
Willy Tarreau	33c60dece5	MINOR: tcp: report the erroneous word in tcp-request track* We used to report the word after the fetch.	2013-04-12 08:26:32 +02:00
Willy Tarreau	deaec2fda3	BUG/MINOR: tcp: fix error reporting for TCP rules tcp-request swapped two output words in the error message, making it meaningless.	2013-04-11 17:24:53 +02:00
Willy Tarreau	a4312fa28e	MAJOR: sample: maintain a per-proxy list of the fetch args to resolve While ACL args were resolved after all the config was parsed, it was not the case with sample fetch args because they're almost everywhere now. The issue is that ACLs now solely rely on sample fetches, so their args resolving doesn't work anymore. And many fetches involving a server, a proxy or a userlist don't work at all. The real issue is that at the bottom layers we have no information about proxies, line numbers, even ACLs in order to report understandable errors, and that at the top layers we have no visibility over the locations where fetches are referenced (think log node). After failing multiple unsatisfying solutions attempts, we now have a new concept of args list. The principle is that every proxy has a list head which contains a number of indications such as the config keyword, the context where it's used, the file and line number, etc... and a list of arguments. This list head is of the same type as the elements, so it serves as a template for adding new elements. This way, it is filled from top to bottom by the callers with the information they have (eg: line numbers, ACL name, ...) and the lower layers just have to duplicate it and add an element when they face an argument they cannot resolve yet. Then at the end of the configuration parsing, a loop passes over each proxy's list and resolves all the args in sequence. And this way there is all necessary information to report verbose errors. The first immediate benefit is that for the first time we got very precise location of issues (arg number in a keyword in its context, ...). Second, in order to do this we had to parse log-format and unique-id-format a bit earlier, so that was a great opportunity for doing so when the directives are encountered (unless it's a default section). This way, the recorded line numbers for these args are the ones of the place where the log format is declared, not the end of the file. Userlists report slightly more information now. They're the only remaining ones in the ACL resolving function.	2013-04-03 02:13:02 +02:00
Willy Tarreau	93fddf1dbc	MEDIUM: acl: have a pointer to the keyword name in acl_expr The acl_expr struct used to hold a pointer to the ACL keyword. But since we now have all the relevant pointers, we don't need that anymore, we just need the pointer to the keyword as a string in order to return warnings and error messages. So let's change this in order to remove the dependency on the acl_keyword struct from acl_expr. During this change, acl_cond_kw_conflicts() used to return a pointer to an ACL keyword but had to be changed to return a const char* for the same reason.	2013-04-03 02:13:01 +02:00
Willy Tarreau	d86e29d2a1	CLEANUP: acl: remove unused references to ACL_USE_* Now that acl->requires is not used anymore, we can remove all references to it as well as all ACL_USE_* flags.	2013-04-03 02:13:00 +02:00
Willy Tarreau	a91d0a583c	MAJOR: acl: convert all ACL requires to SMP use+val instead of ->requires The ACLs now use the fetch's ->use and ->val to decide upon compatibility between the place where they are used and where the information are fetched. The code is capable of reporting warnings about very fine incompatibilities between certain fetches and an exact usage location, so it is expected that some new warnings will be emitted on some existing configurations. Two degrees of detection are provided : - detecting ACLs that never match - detecting keywords that are ignored All tests show that this seems to work well, though bugs are still possible.	2013-04-03 02:13:00 +02:00
Willy Tarreau	25320b2906	MEDIUM: proxy: remove acl_requires and just keep a flag "http_needed" Proxy's acl_requires was a copy of all bits taken from ACLs, but we'll get rid of ACL flags and only rely on sample fetches soon. The proxy's acl_requires was only used to allocate an HTTP context when needed, and was even forced in HTTP mode. So better have a flag which exactly says what it's supposed to be used for.	2013-04-03 02:13:00 +02:00
Willy Tarreau	c48c90dfa5	MAJOR: acl: remove the arg_mask from the ACL definition and use the sample fetch's Now that ACLs solely rely on sample fetch functions, make them use the same arg mask. All inconsistencies have been fixed separately prior to this patch, so this patch almost only adds a new pointer indirection and removes all references to ARG*() in the definitions. The parsing is still performed by the ACL code though.	2013-04-03 02:12:58 +02:00
Willy Tarreau	8ed669b12a	MAJOR: acl: make all ACLs reference the fetch function via a sample. ACL fetch functions used to directly reference a fetch function. Now that all ACL fetches have their sample fetches equivalent, we can make ACLs reference a sample fetch keyword instead. In order to simplify the code, a sample keyword name may be NULL if it is the same as the ACL's, which is the most common case. A minor change appeared, http_auth always expects one argument though the ACL allowed it to be missing and reported as such afterwards, so fix the ACL to match this. This is not really a bug.	2013-04-03 02:12:58 +02:00
Willy Tarreau	d4c33c8889	MEDIUM: samples: move payload-based fetches and ACLs to their own file The file acl.c is a real mess, it both contains functions to parse and process ACLs, and some sample extraction functions which act on buffers. Some other payload analysers were arbitrarily dispatched to proto_tcp.c. So now we're moving all payload-based fetches and ACLs to payload.c which is capable of extracting data from buffers and rely on everything that is protocol-independant. That way we can safely inflate this file and only use the other ones when some fetches are really specific (eg: HTTP, SSL, ...). As a result of this cleanup, the following new sample fetches became available even if they're not really useful : always_false, always_true, rep_ssl_hello_type, rdp_cookie_cnt, req_len, req_ssl_hello_type, req_ssl_sni, req_ssl_ver, wait_end The function 'acl_fetch_nothing' was wrong and never used anywhere so it was removed. The "rdp_cookie" sample fetch used to have a mandatory argument while it was optional in ACLs, which are supposed to iterate over RDP cookies. So we're making it optional as a fetch too, and it will return the first one.	2013-04-03 02:12:57 +02:00
Willy Tarreau	80aca90ad2	MEDIUM: samples: use new flags to describe compatibility between fetches and their usages Samples fetches were relying on two flags SMP_CAP_REQ/SMP_CAP_RES to describe whether they were compatible with requests rules or with response rules. This was never reliable because we need a finer granularity (eg: an HTTP request method needs to parse an HTTP request, and is available past this point). Some fetches are also dependant on the context (eg: "hdr" uses request or response depending where it's involved, causing some abiguity). In order to solve this, we need to precisely indicate in fetches what they use, and their users will have to compare with what they have. So now we have a bunch of bits indicating where the sample is fetched in the processing chain, with a few variants indicating for some of them if it is permanent or volatile (eg: an HTTP status is stored into the transaction so it is permanent, despite being caught in the response contents). The fetches also have a second mask indicating their validity domain. This one is computed from a conversion table at registration time, so there is no need for doing it by hand. This validity domain consists in a bitmask with one bit set for each usage point in the processing chain. Some provisions were made for upcoming controls such as connection-based TCP rules which apply on top of the connection layer but before instantiating the session. Then everywhere a fetch is used, the bit for the control point is checked in the fetch's validity domain, and it becomes possible to finely ensure that a fetch will work or not. Note that we need these two separate bitfields because some fetches are usable both in request and response (eg: "hdr", "payload"). So the keyword will have a "use" field made of a combination of several SMP_USE_* values, which will be converted into a wider list of SMP_VAL_* flags. The knowledge of permanent vs dynamic information has disappeared for now, as it was never used. Later we'll probably reintroduce it differently when dealing with variables. Its only use at the moment could have been to avoid caching a dynamic rate measurement, but nothing is cached as of now.	2013-04-03 02:12:56 +02:00
Willy Tarreau	e0db1e8946	MEDIUM: acl: remove flag ACL_MAY_LOOKUP which is improperly used This flag is used on ACL matches that support being looking up patterns in trees. At the moment, only strings and IPs support tree-based lookups, but the flag is randomly set also on integers and binary data, and is not even always set on strings nor IPs. Better get rid of this mess by only relying on the matching function to decide whether or not it supports tree-based lookups, this is safer and easier to maintain.	2013-04-03 02:12:56 +02:00
Willy Tarreau	40aa070c51	MAJOR: listener: support inheriting a listening fd from the parent Using the address syntax "fd@<num>", a listener may inherit a file descriptor that the caller process has already bound and passed as this number. The fd's socket family is detected using getsockname(), and the usual initialization is performed through the existing code for that family, but the socket creation is skipped. Whether the parent has performed the listen() call or not is not important as this is detected. For UNIX sockets, we immediately clear the path after preparing a socket so that we never remove it in case an abort would happen due to a late error during startup.	2013-03-11 01:30:01 +01:00
Lukas Tribus	0defb90784	DOC: tfo: bump required kernel to linux-3.7 Support for server side TFO was actually introduced in linux-3.7, linux-3.6 just has client support. This patch fixes documentation and a code comment about the kernel requirement. It also fixes a wrong tfo related code comment in src/proto_tcp.c.	2013-02-14 00:03:04 +01:00
Willy Tarreau	8ab505bdef	CLEANUP: tcp/unix: remove useless NULL check in {tcp,unix}_bind_listener() errmsg may only be NULL if errlen is zero. Clarify this in the comment too. Reported-by: Dinko Korunic <dkorunic@reflected.net>	2013-01-24 16:19:18 +01:00
Willy Tarreau	d486ef5045	BUG/MINOR: connection: remove a few synchronous calls to polling updates There were a few synchronous calls to polling updates in some functions called from the connection handler. These ones are not needed and should be replaced by more efficient and more debugable asynchronous calls.	2012-12-10 17:03:52 +01:00
Willy Tarreau	b54b6ca483	BUG/MINOR: proto_tcp: bidirectional fetches not supported anymore in track-sc1/2 Sample fetch capabilities indicate when the fetch may be used and not what it requires, so we need to check if a fetch is compatible with the direction we want and not if it works backwards.	2012-12-09 17:04:41 +01:00
Willy Tarreau	598718a7ab	BUG/MINOR: proto_tcp: fix parsing of "table" in track-sc1/2 Recent commit `5d5b5d8e` left the "table" argument in the list of arguments to parse.	2012-12-09 16:57:27 +01:00
Willy Tarreau	20d46a5a95	CLEANUP: session: use an array for the stick counters The stick counters were in two distinct sets of struct members, causing some code to be duplicated. Now we use an array, which enables some processing to be performed in loops. This allowed the code to be shrunk by 700 bytes.	2012-12-09 15:57:16 +01:00
Willy Tarreau	5d5b5d8eaf	MEDIUM: proto_tcp: add support for tracking L7 information Until now it was only possible to use track-sc1/sc2 with "src" which is the IPv4 source address. Now we can use track-sc1/sc2 with any fetch as well as any transformation type. It works just like the "stick" directive. Samples are automatically converted to the correct types for the table. Only "tcp-request content" rules may use L7 information, and such information must already be present when the tracking is set up. For example it becomes possible to track the IP address passed in the X-Forwarded-For header. HTTP request processing now also considers tracking from backend rules because we want to be able to update the counters even when the request was already parsed and tracked. Some more controls need to be performed (eg: samples do not distinguish between L4 and L6).	2012-12-09 14:08:47 +01:00
Willy Tarreau	a4380b4f15	CLEANUP: proto_tcp: use the same code to bind servers and backends The tproxy and source binding code has now be factored out for servers and backends. A nice effect is that the code now supports having backends use source port ranges, though the config does not support it yet. This change has reduced the executable by around 700 bytes.	2012-12-09 10:05:37 +01:00
Willy Tarreau	ef9a360555	MEDIUM: connection: introduce "struct conn_src" for servers and proxies Both servers and proxies share a common set of parameters for outgoing connections, and since they're not stored in a similar structure, a lot of code is duplicated in the connection setup, which is one sensible area. Let's first define a common struct for these settings and make use of it. Next patches will de-duplicate code. This change also fixes a build breakage that happens when USE_LINUX_TPROXY is not set but USE_CTTPROXY is set, which seem to be very unlikely considering that the issue was introduced almost 2 years ago an never reported.	2012-12-09 10:04:39 +01:00
Willy Tarreau	b1719517b7	BUG/MEDIUM: tcp: process could theorically crash on lack of source ports When connect() fails with EAGAIN or EADDRINUSE, an error message is sent to logs and uses srv->id to indicate the server name (this is very old code). Since version 1.4, it is possible to have srv == NULL, so the message could cause a crash when connect() returns EAGAIN or EADDRINUSE. However in practice this does not happen because on lack of source ports, EADDRNOTAVAIL is returned instead, so this code is never called. This fix consists in not displaying the server name anymore, and in adding the test for EADDRNOTAVAIL. Also, the log level was lowered from LOG_EMERG to LOG_ERR in order not to spam all consoles when source ports are missing for a given target. This fix should be backported to 1.4.	2012-12-08 23:07:33 +01:00
Willy Tarreau	fc8f1f0382	BUG/MINOR: tcp: set the ADDR_TO_SET flag on outgoing connections tcp_connect_server() resets all of the connection's flags. This means that an outgoing connection does not have the ADDR_TO_SET flag eventhough the address is set. The first impact is that logging the outgoing address or displaying it on the CLI while dumping sessions will result in an extra call to getpeername(). But there is a nastier impact. If such a lookup happens after the first connect() attempt and this one fails, the destination address is corrupted by the call to getsockname(), and subsequent connection retries will fail with socket errors. For now we fix this by making tcp_connect_server() set the flag. But we'll soon need a function to initialize an outgoing connection with appropriate address and flags before calling the connect() function.	2012-12-08 18:53:44 +01:00
Willy Tarreau	77e3af9e6f	MINOR: tcp: add support for the "v4v6" bind option Commit `9b6700f` added "v6only". As suggested by Vincent Bernat, it is sometimes useful to have the opposite option to force binding to the two protocols when the system is configured to bind to v6 only by default. This option does exactly this. v6only still has precedence.	2012-11-24 15:07:23 +01:00
Willy Tarreau	9b6700f673	MINOR: tcp: add support for the "v6only" bind option This option forces a socket to bind to IPv6 only when it uses the default address (eg: ":::80").	2012-11-24 12:20:28 +01:00
Willy Tarreau	f0837b259b	MEDIUM: tcp: add explicit support for delayed ACK in connect() Commit `24db47e0` tried to improve support for delayed ACK upon connect but it was incomplete, because checks with the proxy protocol would always enable polling for data receive and there was no way of distinguishing data polling and delayed ack. So we add a distinct delack flag to the connect() function so that the caller decides whether or not to use a delayed ack regardless of pending data (eg: when send-proxy is in use). Doing so covers all combinations of { (check with data), (sendproxy), (smart-connect) }.	2012-11-24 10:24:27 +01:00
Willy Tarreau	24db47e0cc	MEDIUM: checks: avoid waking the application up for pure TCP checks Pure TCP checks only use the SYN/ACK in return to a SYN. By forcing the system to use delayed ACKs, it is possible to send an RST instead of the ACK and thus ensure that the application will never be needlessly woken up. This avoids error logs or counters on checked components since the application is never made aware of this connection which dies in the network stack.	2012-11-23 14:18:39 +01:00
Willy Tarreau	6b0a850503	BUG/MEDIUM: checks: mark the check as stopped after a connect error Health checks currently still use the connection's fd to know whether a check is running (this needs to change). When a health check immediately fails during connect() because of a lack of local resource (eg: port), we failed to unset the fd, so each time the process_chk woken up after such an error, it believed a check was still running and used to close the fd again instead of starting a new check. This could result in other connections being closed because they were assigned the same fd value. The bug is only marked medium because when this happens, the system is already in a bad state. A comment was added above tcp_connect_server() to clarify that the fd is not valid on error.	2012-11-23 09:03:29 +01:00
Willy Tarreau	3fdb366885	MAJOR: connection: replace struct target with a pointer to an enum Instead of storing a couple of (int, ptr) in the struct connection and the struct session, we use a different method : we only store a pointer to an integer which is stored inside the target object and which contains a unique type identifier. That way, the pointer allows us to retrieve the object type (by dereferencing it) and the object's address (by computing the displacement in the target structure). The NULL pointer always corresponds to OBJ_TYPE_NONE. This reduces the size of the connection and session structs. It also simplifies target assignment and compare. In order to improve the generated code, we try to put the obj_type element at the beginning of all the structs (listener, server, proxy, si_applet), so that the original and target pointers are always equal. A lot of code was touched by massive replaces, but the changes are not that important.	2012-11-12 00:42:33 +01:00
Willy Tarreau	f2943dccd0	MAJOR: session: detach the connections from the stream interfaces We will need to be able to switch server connections on a session and to keep idle connections. In order to achieve this, the preliminary requirement is that the connections can survive the session and be detached from them. Right now they're still allocated at exactly the same place, so when there is a session, there are always 2 connections. We could soon improve on this by allocating the outgoing connection only during a connect(). This current patch touches a lot of code and intentionally does not change any functionnality. Performance tests show no regression (even a very minor improvement). The doc has not yet been updated.	2012-10-26 20:15:20 +02:00
Willy Tarreau	5f2877a7dd	BUG/MEDIUM: tcp: transparent bind to the source only when address is set Thomas Heil reported that health checks did not work anymore when a backend or server has "usesrc clientip". This is because the source address is not set and tcp_bind_socket() tries to bind to that address anyway. The solution consists in explicitly clearing the source address in the checks and to make tcp_bind_socket() avoid binding when the address is not set. This also has an indirect benefit that a useless bind() syscall will be avoided when using "source 0.0.0.0 usesrc clientip" in health checks.	2012-10-26 20:04:27 +02:00
Willy Tarreau	9b28e03b66	MAJOR: channel: replace the struct buffer with a pointer to a buffer With this commit, we now separate the channel from the buffer. This will allow us to replace buffers on the fly without touching the channel. Since nobody is supposed to keep a reference to a buffer anymore, doing so is not a problem and will also permit some copy-less data manipulation. Interestingly, these changes have shown a 2% performance increase on some workloads, probably due to a better cache placement of data.	2012-10-13 09:07:52 +02:00
Willy Tarreau	697d85045a	CLEANUP: tcp: use 'chn' instead of 'buf' or 'b' for channel pointer names Same as previous patches, avoid confusion in local variable names.	2012-10-12 23:53:39 +02:00
Willy Tarreau	1c862c5920	MEDIUM: tcp: enable TCP Fast Open on systems which support it If TCP_FASTOPEN is defined, then the "tfo" option is supported on "bind" lines to enable TCP Fast Open (linux >= 3.6).	2012-10-05 16:22:35 +02:00
Willy Tarreau	f7bc57ca6e	REORG: connection: rename the data layer the "transport layer" While working on the changes required to make the health checks use the new connections, it started to become obvious that some naming was not logical at all in the connections. Specifically, it is not logical to call the "data layer" the layer which is in charge for all the handshake and which does not yet provide a data layer once established until a session has allocated all the required buffers. In fact, it's more a transport layer, which makes much more sense. The transport layer offers a medium on which data can transit, and it offers the functions to move these data when the upper layer requests this. And it is the upper layer which iterates over the transport layer's functions to move data which should be called the data layer. The use case where it's obvious is with embryonic sessions : an incoming SSL connection is accepted. Only the connection is allocated, not the buffers nor stream interface, etc... The connection handles the SSL handshake by itself. Once this handshake is complete, we can't use the data functions because the buffers and stream interface are not there yet. Hence we have to first call a specific function to complete the session initialization, after which we'll be able to use the data functions. This clearly proves that SSL here is only a transport layer and that the stream interface constitutes the data layer. A similar change will be performed to rename app_cb => data, but the two could not be in the same commit for obvious reasons.	2012-10-04 22:26:09 +02:00
Cyril Bonté	3aaba440a2	BUILD: fix compilation error with DEBUG_FULL Recent changes in structures broke the compilation when using DEBUG_FULL. Let's update apply the changes also to the variables used in DPRINTF calls.	2012-09-24 20:36:39 +02:00
Willy Tarreau	eb6cead1de	MINOR: standard: make memprintf() support a NULL destination Doing so removes many checks that were systematically made because the callees don't know if the caller passed a valid pointer.	2012-09-24 10:53:16 +02:00
Willy Tarreau	4348fad1c1	MAJOR: listeners: use dual-linked lists to chain listeners with frontends Navigating through listeners was very inconvenient and error-prone. Not to mention that listeners were linked in reverse order and reverted afterwards. In order to definitely get rid of these issues, we now do the following : - frontends have a dual-linked list of bind_conf - frontends have a dual-linked list of listeners - bind_conf have a dual-linked list of listeners - listeners have a pointer to their bind_conf This way we can now navigate from anywhere to anywhere and always find the proper bind_conf for a given listener, as well as find the list of listeners for a current bind_conf.	2012-09-20 16:48:07 +02:00
Willy Tarreau	28a47d6408	MINOR: config: pass the file and line to config keyword parsers This will be needed when we need to create bind config settings.	2012-09-18 20:02:48 +02:00
Willy Tarreau	51fb7651c4	MINOR: listener: add a scope field in the bind keyword lists This scope is used to report what the keywords are used for (eg: TCP, UNIX, ...). It is now reported by bind_dump_kws().	2012-09-18 18:27:14 +02:00
Willy Tarreau	4479124cda	MEDIUM: config: move the "bind" TCP parameters to proto_tcp Now proto_tcp.c is responsible for the 4 settings it handles : - defer-accept - interface - mss - transparent These ones do not need to be handled in cfgparse anymore. If support for a setting is disabled by a missing build option, then cfgparse correctly reports : [ALERT] 255/232700 (2701) : parsing [echo.cfg:114] : 'bind' : 'transparent' option is not implemented in this version (check build options).	2012-09-15 22:33:16 +02:00
Willy Tarreau	d1d5454180	REORG: split "protocols" files into protocol and listener It was becoming confusing to have protocols and listeners in the same files, split them.	2012-09-15 22:29:32 +02:00
Willy Tarreau	184636e3e7	BUG: tcp: close socket fd upon connect error When the data layer fails to initialize (eg: out of memory for SSL), we must close the socket fd we just allocated.	2012-09-06 14:04:41 +02:00
Willy Tarreau	dd2f85eb3b	CLEANUP: includes: fix includes for a number of users of fd.h It appears that fd.h includes a number of unneeded files and was included from standard.h, and as such served as an intermediary to provide almost everything to everyone. By removing its useless includes, a long dependency chain broke but could easily be fixed.	2012-09-03 20:49:14 +02:00
Willy Tarreau	40ff59d820	CLEANUP: fd: remove fdtab->flags These flags were added for TCP_CORK. They were only set at various places but never checked by any user since TCP_CORK was replaced with MSG_MORE. Simply get rid of this now.	2012-09-03 20:49:14 +02:00
Willy Tarreau	15678efc45	MEDIUM: connection: add an ->init function to data layer SSL need to initialize the data layer before proceeding with data. At the moment, this data layer is automatically initialized from itself, which will not be possible once we extract connection from sessions since we'll only create the data layer once the handshake is finished. So let's have the application layer initialize the data layer before using it.	2012-09-03 20:47:34 +02:00
Willy Tarreau	64ee491309	MINOR: tcp: replace tcp_src_to_stktable_key with addr_to_stktable_key Make it more obvious that this function does not depend on any knowledge of the session. This is important to plan for TCP rules that can run on connection without any initialized session yet.	2012-09-03 20:47:34 +02:00
Willy Tarreau	14f8e86da5	MEDIUM: proto_tcp: remove any dependence on stream_interface The last uses of the stream interfaces were in tcp_connect_server() and could easily and more appropriately be moved to its callers, si_connect() and connect_server(), making a lot more sense. Now the function should theorically be usable for health checks. It also appears more obvious that the file is split into two distinct parts : - the protocol layer used at the connection level - the tcp analysers executing tcp-* rules and their samples/acls.	2012-09-03 20:47:34 +02:00
Willy Tarreau	93b0f4f6c6	MEDIUM: stream_interface: remove CAP_SPLTCP/CAP_SPLICE flags These ones are implicitly handled by the connection's data layer, no need to rely on them anymore and reaching them maintains undesired dependences on stream-interface.	2012-09-03 20:47:34 +02:00
Willy Tarreau	986a9d2d12	MAJOR: connection: move the addr field from the stream_interface We need to have the source and destination addresses in the connection. They were lying in the stream interface so let's move them. The flags SI_FL_FROM_SET and SI_FL_TO_SET have been moved as well. It's worth noting that tcp_connect_server() almost does not use the stream interface anymore except for a few flags. It has been identified that once we detach the connection from the SI, it will probably be needed to keep a copy of the server-side addresses in the SI just for logging purposes. This has not been implemented right now though.	2012-09-03 20:47:34 +02:00
Willy Tarreau	3cefd521fa	REORG: connection: move the target pointer from si to connection The target is per connection and is directly used by the connection, so we need it there. It's not needed anymore in the SI however.	2012-09-03 20:47:34 +02:00
Willy Tarreau	8263d2b259	CLEANUP: channel: use "channel" instead of "buffer" in function names This is a massive rename of most functions which should make use of the word "channel" instead of the word "buffer" in their names. In concerns the following ones (new names) : unsigned long long channel_forward(struct channel buf, unsigned long long bytes); static inline void channel_init(struct channel buf) static inline int channel_input_closed(struct channel buf) static inline int channel_output_closed(struct channel buf) static inline void channel_check_timeouts(struct channel b) static inline void channel_erase(struct channel buf) static inline void channel_shutr_now(struct channel buf) static inline void channel_shutw_now(struct channel buf) static inline void channel_abort(struct channel buf) static inline void channel_stop_hijacker(struct channel buf) static inline void channel_auto_connect(struct channel buf) static inline void channel_dont_connect(struct channel buf) static inline void channel_auto_close(struct channel buf) static inline void channel_dont_close(struct channel buf) static inline void channel_auto_read(struct channel buf) static inline void channel_dont_read(struct channel buf) unsigned long long channel_forward(struct channel *buf, unsigned long long bytes) Some functions provided by channel.[ch] have kept their "buffer" name because they are really designed to act on the buffer according to some information gathered from the channel. They have been moved together to the same place in the file for better readability but they were not changed at all. The "buffer" memory pool was also renamed "channel".	2012-09-03 20:47:33 +02:00
Willy Tarreau	03cdb7c678	CLEANUP: channel: usr CF_/CHN_ prefixes instead of BF_/BUF_ Get rid of these confusing BF_* flags. Now channel naming should clearly be used everywhere appropriate. No code was changed, only a renaming was performed. The comments about channel operations was updated.	2012-09-03 20:47:33 +02:00
Willy Tarreau	3bf1b2b816	MAJOR: channel: stop relying on BF_FULL to take action This flag is quite complex to get right and updating it everywhere is a major pain, especially since the buffer/channel split. This is the first step of getting rid of it. Instead now it's dynamically computed whenever needed.	2012-09-03 20:47:33 +02:00
Willy Tarreau	8e21bb9e52	MAJOR: channel: remove the BF_OUT_EMPTY flag This flag was very problematic because it was composite in that both changes to the pipe or to the buffer had to cause this flag to be updated, which is not always simple (eg: there may not even be a channel attached to a buffer at all). There were not that many users of this flags, mostly setters. So the flag got replaced with a macro which reports whether the channel is empty or not, by checking both the pipe and the buffer. One part of the change is sensible : the flag was also part of BF_MASK_STATIC, which is used by process_session() to rescan all analysers in case the flag's status changes. At first glance, none of the analysers seems to change its mind base on this flag when it is subject to change, so it seems fine not to add variation checks here. Otherwise it's possible that checking the buffer's output size is more useful than checking the flag's replacement.	2012-09-03 20:47:32 +02:00
Willy Tarreau	c7e4238df0	REORG: buffers: split buffers into chunk,buffer,channel Many parts of the channel definition still make use of the "buffer" word.	2012-09-03 20:47:32 +02:00
Willy Tarreau	96199b1016	MAJOR: stream-interface: restore splicing mechanism The splicing is now provided by the data-layer rcv_pipe/snd_pipe functions which in turn are called by the stream interface's recv and send callbacks. The presence of the rcv_pipe/snd_pipe functions is used to attest support for splicing at the data layer. It looks like the stream-interface's SI_FL_CAP_SPLICE flag does not make sense anymore as it's used as a proxy for the pointers above. It also appears that we call chk_snd() from the recv callback and then try to call it again in update_conn(). It is very likely that this last function will progressively slip into the recv/send callbacks in order to avoid duplicate check code. The code works right now with and without splicing. Only raw_sock provides support for it and it is automatically selected when the various splice options are set. However it looks like splice-auto doesn't enable it, which possibly means that the streamer detection code does not work anymore, or that it's only called at a time where it's too late to enable splicing (in process_session).	2012-09-03 20:47:31 +02:00
Willy Tarreau	75bf2c925f	REORG: sock_raw: rename the files raw_sock* The "raw_sock" prefix will be more convenient for naming functions as it will be prefixed with the data layer and suffixed with the data direction. So let's rename the files now to avoid any further confusion. The #include directive was also removed from a number of files which do not need it anymore.	2012-09-02 21:54:56 +02:00
Willy Tarreau	572bf9095d	REORG/MAJOR: extract "struct buffer" from "struct channel" At the moment, the struct is still embedded into the struct channel, but all the functions have been updated to use struct buffer only when possible, otherwise struct channel. Some functions would likely need to be splitted between a buffer-layer primitive and a channel-layer function. Later the buffer should become a pointer in the struct buffer, but doing so requires a few changes to the buffer allocation calls.	2012-09-02 21:54:56 +02:00
Willy Tarreau	7421efb85f	REORG/MAJOR: use "struct channel" instead of "struct buffer" This is a massive rename. We'll then split channel and buffer. This change needs a lot of cleanups. At many locations, the parameter or variable is still called "buf" which will become ambiguous. Also, the "struct channel" is still defined in buffers.h.	2012-09-02 21:54:55 +02:00
Willy Tarreau	afad0e0f80	MAJOR: make use of conn_{data\|sock}_{poll\|stop\|want}* in connection handlers This is a second attempt at getting rid of FD_WAIT_. Now the situation is much better since native I/O handlers can directly manipulate the FD using fd_{poll\|want\|stop}_ and the connection handlers manipulate connection-level flags using the conn_{data\|sock}_* equivalent. Proceeding this way ensures that the connection flags always reflect the reality even after data<->handshake switches.	2012-09-02 21:53:12 +02:00
Willy Tarreau	f9dabecd03	MEDIUM: connection: make use of the new polling functions Now the connection handler, the handshake callbacks and the I/O callbacks make use of the connection-layer polling functions to enable or disable polling on a file descriptor. Some changes still need to be done to avoid using the FD_WAIT_* constants.	2012-09-02 21:53:11 +02:00
Willy Tarreau	49b046dddf	MAJOR: fd: replace all EV_FD_* macros with new fd__ inline calls These functions have a more explicity meaning and will offer provisions for explicit polling. EV_FD_ISSET() has been left for now as it is still in use in checks.	2012-09-02 21:53:11 +02:00
Willy Tarreau	0b0c097a3a	MINOR: rearrange tcp_connect_probe() and fix wrong return codes Sometimes we returned the need for polling while it was not needed. Remove some of the spaghetti in the function.	2012-09-02 21:53:09 +02:00
Willy Tarreau	8f8c92fe93	MAJOR: connection: add a new CO_FL_CONNECTED flag This new flag is used to indicate that the connection was already connected. It can be used by I/O handlers to know that a connection has just completed. It is used by stream_sock_update_conn(), allowing the sock_opt handlers not to manipulate the SI timeout nor the BF_WRITE_NULL flag anymore.	2012-09-02 21:53:09 +02:00
Willy Tarreau	3c55ec2020	MEDIUM: stream_interface: centralize the SI_FL_ERR management It's better to have only stream_sock_update_conn() handle the conversion of the CO_FL_ERROR flag to SI_FL_ERR than having it in each and every I/O callback.	2012-09-02 21:53:09 +02:00
Willy Tarreau	239d7189fc	MEDIUM: stream_interface: pass connection instead of fd in sock_ops The sock_ops I/O callbacks made use of an FD till now. This has become inappropriate and the struct connection is much more useful. It also fixes the race condition introduced by previous change.	2012-09-02 21:53:08 +02:00
Willy Tarreau	fd31e53139	MAJOR: remove the stream interface and task management code from sock_* The socket data layer code must only focus on moving data between a socket and a buffer. We need a special stream interface handler to update the stream interface and the file descriptor status. At the moment the code works but suffers from a race condition caused by its API : the read/write callbacks still make use of the fd instead of using the connection. And when a double shutdown is performed, a call to ->write() after ->read() processed an error results in dereferencing a NULL fdtab[]->owner. This is only a temporary issue which doesn't need to be fixed now since this will automatically go away when the functions change to use the connection instead.	2012-09-02 21:53:08 +02:00
Willy Tarreau	076be25ab8	CLEANUP: remove the now unused fdtab direct I/O callbacks They were all left to NULL since last commit so we can safely remove them all now and remove the temporary dual polling logic in pollers.	2012-09-02 21:51:29 +02:00
Willy Tarreau	2da156fe5e	MAJOR: tcp: remove the specific I/O callbacks for TCP connection probes Use a single tcp_connect_probe() instead of tcp_connect_write() and tcp_connect_read(). We call this one only when no data layer function have been processed, so this is a fallback to test for completion of a connection attempt. With this done, we don't have the need for any direct I/O callback anymore. The function still relies on ->write() to wake the stream interface up, so it's not finished.	2012-09-02 21:51:29 +02:00
Willy Tarreau	2c6be84b3a	MEDIUM: connection: extract the send_proxy callback from proto_tcp This handshake handler must be independant, so move it away from proto_tcp. It has a dedicated connection flag. It is tested before I/O handlers and automatically removes the CO_FL_WAIT_L4_CONN flag upon success. It also sets the BF_WRITE_NULL flag on the stream interface and stops the SI timeout. However it does not perform the task_wakeup(), and relies on the data handler to do so for now. The SI wakeup will have to be moved elsewhere anyway.	2012-09-02 21:51:28 +02:00
Willy Tarreau	8018471f44	MINOR: fd: make fdtab->owner a connection and not a stream_interface anymore It is more convenient with a connection here and will abstract stream_interface more easily.	2012-09-02 21:51:28 +02:00
Willy Tarreau	d2274c6536	MAJOR: connection: replace direct I/O callbacks with the connection callback Almost all direct I/O callbacks have been changed to use the connection callback instead. Only the TCP connection validation remains.	2012-09-02 21:51:28 +02:00
Willy Tarreau	aece46a44d	MEDIUM: protocols: use the generic I/O callback for accept callbacks This one is used only on read events, and it was easy to convert to use the new I/O callback.	2012-09-02 21:51:27 +02:00
Willy Tarreau	4e6049e553	MINOR: fd: add a new I/O handler to fdtab This one will eventually replace both cb[] handlers. At the moment it is not used yet.	2012-09-02 21:51:27 +02:00
Willy Tarreau	505e34a36d	MAJOR: get rid of fdtab[].state and use connection->flags instead fdtab[].state was only used to know whether a connection was in progress or an error was encountered. Instead we now use connection->flags to store a flag for both. This way, connection management will be able to update the connection status on I/O.	2012-09-02 21:51:26 +02:00
Willy Tarreau	ed8f614078	REORG/MEDIUM: fd: get rid of FD_STLISTEN This state was only used so that ev_sepoll did not match FD_STERROR, which changed in previous patch. We can now safely remove this state.	2012-09-02 21:51:25 +02:00
David du Colombier	65c1796c4a	MINOR: IPv6 support for transparent proxy Set socket option IPV6_TRANSPARENT on binding to enable transparent proxy on IPv6. This option is available from Linux 2.6.37.	2012-07-31 07:53:42 +02:00
Willy Tarreau	96596aeead	MEDIUM: fd/si: move peeraddr from struct fdinfo to struct connection The destination address is purely a connection thing and not an fd thing. It's also likely that later the address will be stored into the connection and linked to by the SI. struct fdinfo only keeps the pointer to the port range and the local port for now. All of this also needs to move to the connection but before this the release of the port range must move from fd_delete() to a new function dedicated to the connection.	2012-06-08 22:59:52 +02:00
Willy Tarreau	fb7508aefb	REORG/MINOR: stream_interface: move si->fd to struct connection The socket fd is used only when in socket mode and with a connection.	2012-05-21 16:47:54 +02:00
Willy Tarreau	73b013b070	MINOR: stream_interface: introduce a new "struct connection" type We start to move everything needed to manage a connection to a special entity "struct connection". We have the data layer operations and the control operations there. We'll also have more info in the future such as file descriptors and applet contexts, so that in the end it becomes detachable from the stream interface, which will allow connections to be reused between sessions. For now on, we start with minimal changes.	2012-05-21 16:31:45 +02:00
Willy Tarreau	a190d591fc	REORG: move the send-proxy code to tcp_connect_write() It is much better and more efficient to consider that the send-proxy feature is part of the protocol layer than part of the data layer. Now the connection is considered established once the send-proxy line has been sent. This way the data layer doesn't have to care anymore about this specific part. The tcp_connect_write() function now automatically calls the data layer write() function once the connection is established, which saves calls to epoll_ctl/epoll_wait/process_session. It's starting to look more and more obvious that tcp_connect_read() and tcp_connect_write() are not TCP-specific but only socket-specific and as such should probably move, along with some functions from protocol.c, to a socket-specific file (eg: stream_sock). It would be nice to be able to support autonomous listeners to parse the proxy protocol before accepting a connection, so that we get rid of it at the session layer and to support using these informations in the tcp-request connection rules.	2012-05-20 18:35:19 +02:00
Willy Tarreau	8ae52cb144	BUG/MINOR: stop connect timeout when connect succeeds If the connect succeeds exactly at the same millisecond as the connect timeout is supposed to strike, the timeout is still considered while data may have already be sent. This results in a new connection attempt with no data and with the response being lost. Note that in practice the only real-world situation where this is observed is when connect timeouts are extremely low, too low for safe operations. This bug was encountered with a 1ms connect timeout. It is also present on 1.4 and needs to be fixed there too.	2012-05-20 10:38:46 +02:00
Willy Tarreau	be0688c64d	MEDIUM: stream_interface: remove the si->init Calling the init() function in sess_establish was a bad idea, it is too late to allow it to fail on lack of resource and does not help at all. Remove it for now before it's used.	2012-05-18 15:15:26 +02:00
Willy Tarreau	b147a8382a	CLEANUP: fd: remove unused cb->b pointers in the struct fdtab These pointers were used to hold pointers to buffers in the past, but since we introduced the stream interface, they're no longer used but they were still sometimes set. Removing them shrink the struct fdtab from 32 to 24 bytes on 32-bit machines, and from 52 to 36 bytes on 64-bit machines, which is a significant saving. A quick tests shows a steady 0.5% performance gain, probably due to the better cache efficiency.	2012-05-13 00:35:44 +02:00
Willy Tarreau	eeda90e68c	MAJOR: fd: remove the need for the socket layer to recheck the connection Up to now, if an outgoing connection had no data to send, the socket layer had to perform a connect() again to check for establishment. This is not acceptable for SSL, and will cause problems with socketpair(). Some socket layers will also need an initializer before sending data (eg: SSL). The solution consists in moving the connect() test to the protocol layer (eg: TCP) and to make it hold the fd->write callback until the connection is validated. At this point, it will switch the write callback to the socket layer's write function. In fact we need to hold both read and write callbacks to ensure the socket layer is never called before being initialized. This intermediate callback is used only if there is a socket init function or if there are no data to send. The socket layer does not have any code to check for connection establishment anymore, which makes sense.	2012-05-11 20:18:26 +02:00
Willy Tarreau	59b9479667	BUG/MEDIUM: stream_interface: restore get_src/get_dst Commit e164e7a removed get_src/get_dst setting in the stream interfaces but forgot to set it in proto_tcp. Get the feature back because we need it for logging, transparent mode, ACLs etc... We now rely on the stream interface direction to know what syscall to use. One benefit of doing it this way is that we don't use getsockopt() anymore on outgoing stream interfaces nor on UNIX sockets.	2012-05-11 16:48:10 +02:00
Willy Tarreau	c63190d429	REORG: use the name sock_raw instead of stream_sock We'll soon have an SSL socket layer, and in order to ease the difference between the two, we use the name "sock_raw" to designate the one which directly talks to the sockets without any conversion.	2012-05-11 14:23:52 +02:00
Willy Tarreau	0a3dd74c9c	MEDIUM: cfgparse: use the new error reporting framework for remaining cfg_keywords All keywords registered using a cfg_kw_list now make use of the new error reporting framework. This allows easier and more precise error reporting without having to deal with complex buffer allocation issues.	2012-05-08 21:28:17 +02:00
Willy Tarreau	22bca61404	MEDIUM: proto_tcp: remove src6 and dst6 pattern fetch methods These methods have been superseded by src and dst which support multiple families. There is no point keeping them since they appeared in a development version anyway. For configurations using "src6", please use "src" instead. For "dst6", use "dst" instead.	2012-05-08 21:28:15 +02:00
Willy Tarreau	bbebbbff83	REORG/MEDIUM: move the default accept function from sockstream to protocols.c The previous sockstream_accept() function uses nothing from sockstream, and is totally irrelevant to stream interfaces. Move this to the protocols.c file which handles listeners and protocols, and call it listener_accept(). It now makes much more sense that the code dealing with listen() also handles accept() and passes it to upper layers.	2012-05-08 21:28:15 +02:00
Willy Tarreau	26d8c59f0b	REORG/MEDIUM: replace stream interface protocol functions by a proto pointer The stream interface now makes use of the socket protocol pointer instead of the direct functions.	2012-05-08 21:28:15 +02:00
Willy Tarreau	1b79bdee26	REORG/MEDIUM: move protocol->{read,write} to sock_ops The protocol must not set the read and write callbacks, they're specific to the socket layer. Move them to sock_ops instead.	2012-05-08 21:28:14 +02:00
Willy Tarreau	cd3b094618	REORG: rename "pattern" files They're now called "sample" everywhere to match their description.	2012-05-08 20:57:21 +02:00
Willy Tarreau	1278578487	REORG: use the name "sample" instead of "pattern" to designate extracted data This is mainly a massive renaming in the code to get it in line with the calling convention. Next patch will rename a few files to complete this operation.	2012-05-08 20:57:20 +02:00
Willy Tarreau	b7451bb660	MEDIUM: acl: report parsing errors to the caller All parsing errors were known but impossible to return. Now by making use of memprintf(), we're able to build meaningful error messages that the caller can display.	2012-05-08 20:57:20 +02:00
Willy Tarreau	0d5fe144a1	MINOR: proto_tcp: validate arguments of payload and payload_lv ACLs Now it's possible to control arguments, so let's do it.	2012-05-08 20:57:19 +02:00
Willy Tarreau	d6281ae046	MEDIUM: pattern: use smp_fetch_rdp_cookie instead of the pattern specific version pattern_fetch_rdp_cookie() is useless now since it only used to add controls on top of smp_fetch_rdp_cookie() which have now been integrated into the pattern subsystem. Let's remove it.	2012-05-08 20:57:18 +02:00
Willy Tarreau	82ea800b0f	CLEANUP: pattern: ensure that payload and payload_lv always stay in the buffer A test was already performed which worked by pure luck due to integer types, otherwise it would have been possible to start checking for an offset out of the buffer's bounds if the buffer size was large enough to allow an integer wrap. Let's perform explicit checks and use unsigned ints for offsets instead of risking being hit later.	2012-05-08 20:57:18 +02:00
Willy Tarreau	0ce3aa0c66	MEDIUM: acl: implement payload and payload_lv These ones were easy to adapt to ACL usage and may really be useful, so let's make them available right now. It's likely that some extension such as regex, string-to-IP and raw IP matching will be implemented in the near future.	2012-05-08 20:57:17 +02:00
Willy Tarreau	4a12981c68	MEDIUM: acl/pattern: factor out the src/dst address fetches Since pattern_process() is able to automatically cast returned types into expected types, we can safely use the sample functions to fetch addresses whatever their family. The lowest castable type must be declared with the keyword so that config checks pass. Right now this means that src/dst use the same fetch function for ACLs and patterns. src6/dst6 have been kept so that configs which explicitly rely on v6 are properly checked.	2012-05-08 20:57:17 +02:00
Willy Tarreau	25c1ebc0c9	MEDIUM: acl/pattern: start merging common sample fetch functions src_port, dst_port and url_param have converged between ACLs and patterns. This means that src_port is now available in patterns and that urlp_* has been added to ACLs. Some code has moved to accommodate for static function definitions, but there were little changes.	2012-05-08 20:57:17 +02:00
Willy Tarreau	32a6f2e572	MEDIUM: acl/pattern: use the same direction scheme Patterns were using a bitmask to indicate if request or response was desired in fetch functions and keywords. ACLs were using a bitmask in fetch keywords and a single bit in fetch functions. ACLs were also using an ACL_PARTIAL bit in fetch functions indicating that a non-final fetch was performed, which was an abuse of the existing direction flag. The change now consists in using : - a capabilities field for fetch keywords => SMP_CAP_REQ/RES to indicate if a keyword supports requests, responses, both, etc... - an option field for fetch functions to indicate what the caller expects (request/response, final/non-final) The ACL_PARTIAL bit was reversed to get SMP_OPT_FINAL as it's more explicit to know we're working on a final buffer than on a non-final one. ACL_DIR_* were removed, as well as PATTERN_FETCH_*. L4 fetches were improved to support being called on responses too since they're still available. The <dir> field of all fetch functions was changed to <opt> which is now unsigned. The patch is large but mostly made of cosmetic changes to accomodate this, as almost no logic change happened.	2012-05-08 20:57:17 +02:00
Willy Tarreau	9fb4bc7f43	MINOR: tcp: replace acl_fetch_rdp_cookie with smp_fetch_rdp_cookie The former was only a wrapper to the second, let's remove it now that the calling convention is exactly the same. This is the first function to be unified between ACLs and samples.	2012-05-08 20:57:16 +02:00
Willy Tarreau	24e32d8c6b	MEDIUM: acl: replace acl_expr with args in acl fetch_* functions Having the args everywhere will make it easier to share fetch functions between patterns and ACLs. The only place where we could have needed the expr was in the http_prefetch function which can do well without.	2012-05-08 20:57:16 +02:00
Willy Tarreau	32389b7d04	MEDIUM: acl/pattern: switch rdp_cookie functions stack up-down Previously, both pattern, backend and persist_rdp_cookie would build fake ACL expressions to fetch an RDP cookie by calling acl_fetch_rdp_cookie(). Now we switch roles. The RDP cookie fetch function is provided as a sample fetch function that all others rely on, including ACL. The code is exactly the same, only the args handling moved from expr->args to args. The code was moved to proto_tcp.c, but probably that a dedicated file would be more suited to content handling.	2012-05-08 20:57:16 +02:00
Willy Tarreau	b8c8f1f611	MEDIUM: pattern: retrieve the sample type in the sample, not in the keyword description We need the pattern fetchers and converters to correctly set the output type so that they can be used by ACL fetchers. By using the sample type instead of the keyword type, we also open the possibility to create some multi-type pattern fetch methods later (eg: "src" being v4/v6). Right now the type in the keyword is used to validate the configuration.	2012-05-08 20:57:16 +02:00
Willy Tarreau	342acb4775	MEDIUM: pattern: integrate pattern_data into sample and use sample everywhere Now there is no more reference to union pattern_data. All pattern fetch and conversion functions now make use of the common sample type. Note: none of them adjust the type right now so it's important to do it next otherwise we would risk sharing such functions with ACLs and seeing them fail.	2012-05-08 20:57:15 +02:00
Willy Tarreau	f853c46bc3	MEDIUM: pattern/acl: get rid of temp_pattern in ACLs This one is not needed anymore as we can return the data and its type in the sample provided by the caller. ACLs now always return the proper type. BOOL is already returned when the result is expected to be processed as a boolean. temp_pattern has been unexported now.	2012-05-08 20:57:14 +02:00
Willy Tarreau	3740635b88	MAJOR: acl: make use of the new sample struct and get rid of acl_test This change is invasive in lines of code but not much in terms of functionalities as it's mainly a replacement of struct acl_test with struct sample.	2012-05-08 20:57:14 +02:00
Willy Tarreau	422aa0792d	MEDIUM: pattern: add new sample types to replace pattern types The new sample types are necessary for the acl-pattern convergence. These types are boolean and signed int. Some types were renamed for less ambiguity (ip->ipv4, integer->uint).	2012-05-08 20:57:14 +02:00
Willy Tarreau	21d68a6895	MEDIUM: pattern: add an argument validation callback to pattern descriptors This is used to validate that arguments are coherent. For instance, payload_lv expects that the last arg (if any) is not more negative than the sum of the first two. The error is reported if any.	2012-05-08 20:57:13 +02:00
Willy Tarreau	9fcb984b17	MEDIUM: pattern: use the standard arg parser We don't need the pattern-specific args parsers anymore, make use of the common parser instead. We still need to improve this by adding a validation function to report abnormal argument values or combinations. We don't report precise parsing errors yet but this was not previously done either.	2012-05-08 20:57:13 +02:00
Willy Tarreau	f995410355	MEDIUM: pattern: get rid of arg_i in all functions making use of arguments arg_i was almost unused, and since we migrated to use struct arg everywhere, the rare cases where arg_i was needed could be replaced by switching to arg->type = ARGT_STOP.	2012-05-08 20:57:12 +02:00
Willy Tarreau	ecfb8e8ff9	MEDIUM: pattern: replace type pattern_arg with type arg arg is more complete than pattern_arg since it also covers ACL args, so let's use this one instead.	2012-05-08 20:57:12 +02:00
Willy Tarreau	61612d49a7	MAJOR: acl: store the ACL argument types in the ACL keyword declaration The types and minimal number of ACL keyword arguments are now stored in their declaration. This will allow many more fantasies if some ACL use several arguments or types. Doing so required to rework all ACL keyword declarations to add two parameters. So this was a good opportunity for a general cleanup and to sort all entries in alphabetical order. We still have two pending issues : - parse_acl_expr() checks for errors but has no way to report them to the user ; - the types of some arguments are still not resolved and kept as strings (eg: ARGT_FE/BE/TAB) for compatibility reasons, which must be resolved in acl_find_targets()	2012-05-08 20:57:11 +02:00
Willy Tarreau	34db108423	MAJOR: acl: make use of the new argument parsing framework The ACL parser now uses the argument parser to build a typed argument list. Right now arguments are all strings and only one argument is supported since this is what ACLs currently support.	2012-05-08 20:57:11 +02:00
Willy Tarreau	89fa706d39	MAJOR: buffers: replace buf->w with buf->p - buf->o This change introduces the buffer's base pointer, which is the limit between incoming and outgoing data. It's the point where the parsing should start from. A number of computations have already been greatly simplified, but more simplifications are expected to come from the removal of buf->r. The changes appear good and have revealed occasional improper use of some pointers. It is possible that this patch has introduced bugs or revealed some, although preliminary testings tend to indicate that everything still works as it should.	2012-05-08 12:28:10 +02:00
Willy Tarreau	02d6cfc1d7	MAJOR: buffer: replace buf->l with buf->{o+i} We don't have buf->l anymore. We have buf->i for pending data and the total length is retrieved by adding buf->o. Some computation already become simpler. Despite extreme care, bugs are not excluded. It's worth noting that msg->err_pos as set by HTTP request/response analysers becomes relative to pending data and not to the beginning of the buffer. This has not been completed yet so differences might occur when outgoing data are left in the buffer.	2012-05-08 12:28:10 +02:00
Willy Tarreau	2e046c6017	MAJOR: buffer rework: replace ->send_max with ->o This is the first minor step of the buffer rework. It's only renaming, it should have no impact.	2012-04-30 11:57:00 +02:00
Willy Tarreau	9b061e3320	MEDIUM: stream_sock: add a get_src and get_dst callback and remove SN_FRT_ADDR_SET These callbacks are used to retrieve the source and destination address of a socket. The address flags are not hold on the stream interface and not on the session anymore. The addresses are collected when needed. This still needs to be improved to store the IP and port separately so that it is not needed to perform a getsockname() when only the IP address is desired for outgoing traffic.	2012-04-07 18:03:52 +02:00
Aman Gupta	d94991d236	CLEANUP: Fix some minor whitespace issues	2012-04-07 09:56:14 +02:00
Aman Gupta	0bc0c2426c	MINOR: Add TO/FROM_SET flags to struct stream_interface [WT: it will make sense to remove SN_FRT_ADDR_SET and to use these flags everywhere instead ]	2012-04-07 09:17:26 +02:00
Aman Gupta	ceafb4aa92	CLEANUP: Fix some minor typos	2012-04-05 10:39:45 +02:00
William Lallemand	b7ff6a3a36	MEDIUM: log-format: backend source address %Bi %Bp %Bi return the backend source IP %Bp return the backend source port Add a function pointer in logformat_type to do additional configuration during the log-format variable parsing.	2012-03-12 15:50:52 +01:00
Willy Tarreau	664092ccc1	MEDIUM: acl: use temp_pattern to store any string-type information Now strings and data blocks are stored in the temp_pattern's chunk and matched against this one. The rdp_cookie currently makes extensive use of acl_fetch_rdp_cookie() and will be a good candidate for the initial rework so that ACLs use the patterns framework and not the other way around.	2011-12-30 17:33:26 +01:00
Willy Tarreau	f4362b3e3b	MEDIUM: acl: use temp_pattern to store any address-type information IPv4 and IPv6 addresses are now stored into temp_pattern instead of the dirty hack consisting into storing them into the consumer's target address. Some refactoring should now be possible since the methods used to fetch source and destination addresses are similar between patterns and ACLs.	2011-12-30 17:33:26 +01:00
Willy Tarreau	a5e375646c	MEDIUM: acl: use temp_pattern to store any integer-type information All ACL fetches which return integer value now store the result into the temporary pattern struct. All ACL matches which rely on integer also get their value there. Note: the pattern data types are not set right now.	2011-12-30 17:33:26 +01:00
Willy Tarreau	5dc1e98905	BUG: proto_tcp: don't try to bind to a foreign address if sin_family is unknown This is 1.5-specific. It causes issues with transparent source binding involving hdr_ip. We must not try to bind() to a foreign address when the family is not set, and we must set the family when an address is set.	2011-12-30 17:33:24 +01:00
Willy Tarreau	f6f8225390	BUG: tcp: option nolinger does not work on backends Daniel Rankov reported that "option nolinger" is inefficient on backends. The reason is that it is set on the file descriptor only, which does not prevent haproxy from performing a clean shutdown() before closing. We must set the flag on the stream_interface instead if we want an RST to be emitted upon active close.	2011-11-30 18:06:23 +01:00
Willy Tarreau	6471afb43d	MINOR: remove the client/server side distinction in SI addresses Stream interfaces used to distinguish between client and server addresses because they were previously of different types (sockaddr_storage for the client, sockaddr_in for the server). This is not the case anymore, and this distinction is confusing at best and has caused a number of regressions to be introduced in the process of converting everything to full-ipv6. We can now remove this and have a much cleaner code.	2011-09-23 10:54:59 +02:00
Willy Tarreau	631f01c2f1	[MINOR] make use of addr_to_str() and get_host_port() to replace many inet_ntop() Many inet_ntop calls were partially right, which was hard to detect given the complex combinations. Some of them were relying on the listener's proto instead of the address itself, which could have been different when dealing with an accept-proxy connection. The new addr_to_str() function does the dirty job and returns the family, which makes it particularly suited to calls from switch/case statements. A large number of if/else statements were removed and the stats output could even be cleaned up in the case of session dump. As a side effect of doing this, the resulting code is smaller by almost 1kB. All changed parts have been tested and provided expected output.	2011-09-05 00:54:36 +02:00
Willy Tarreau	86ad42c5b7	[MINOR] make use of set_host_port() and get_host_port() to get rid of family mismatches This also simplifies the code and makes it more auditable.	2011-09-05 00:54:35 +02:00
Simon Horman	6c54d8b63b	[MINOR] Consistently use error in tcp_parse_tcp_req() It seems to me that without this change tcp_parse_tcp_req() may leak memory.	2011-07-18 10:21:23 +02:00
Simon Horman	de072bd8ff	[CLEANUP] Remove unnecessary casts There is no need to cast when going to or from void *	2011-06-25 21:14:10 +02:00
Simon Horman	ab814e0a6b	[MINOR] Add rdp_cookie pattern fetch function This pattern fetch function extracts the value of the rdp cookie <name> as a string and uses this value to match. This enables implementation of persistence based on the mstshash cookie. This is typically done if there is no msts cookie present. This differs from "balance rdp-cookie" in that any balancing algorithm may be used and thus the distribution of clients to backend servers is not linked to a hash of the RDP cookie. It is envisaged that using a balancing algorithm such as "balance roundrobin" or "balance leastconnect" will lead to a more even distribution of clients to backend servers than the hash used by "balance rdp-cookie". Example : listen tse-farm bind 0.0.0.0:3389 # wait up to 5s for an RDP cookie in the request tcp-request inspect-delay 5s tcp-request content accept if RDP_COOKIE # apply RDP cookie persistence persist rdp-cookie # Persist based on the mstshash cookie # This is only useful makes sense if # balance rdp-cookie is not used stick-table type string size 204800 stick on rdp_cookie(mstshash) server srv1 1.1.1.1:3389 server srv1 1.1.1.2:3389	2011-06-25 21:07:02 +02:00
Willy Tarreau	96dd079b49	[BUG] proto_tcp: fix address binding on remote source Mark Brooks reported that commit 1b4b7c broke tproxy in 1.5-dev6. Nick Chalk tracked the issue down to a missing address family setting in tcp_bind_socket() which resulted in a failure to use get_addr_len(). This issue is 1.5-specific.	2011-04-19 07:20:57 +02:00
Willy Tarreau	1b4b7ce6dd	[BUG] stream_sock: use get_addr_len() instead of sizeof() on sockaddr_storage John Helliwell reported a runtime issue on Solaris since 1.5-dev5. Traces show that connect() returns EINVAL, which means the socket length is not appropriate for the family. Solaris does not like being called with sizeof and needs the address family's size on sockaddr_storage. The fix consists in adding a get_addr_len() function which returns the socket's address length based on its family. Tests show that this works for both IPv4 and IPv6 addresses.	2011-04-05 16:56:50 +02:00
David du Colombier	4f92d32004	[MEDIUM] IPv6 support for stick-tables Since IPv6 is a different type than IPv4, the pattern fetch functions src6 and dst6 were added. IPv6 stick-tables can also fetch IPv4 addresses with src and dst. In this case, the IPv4 addresses are mapped to their IPv6 counterpart, according to RFC 4291.	2011-03-29 01:09:14 +02:00
Willy Tarreau	6f831b446c	[BUILD] proto_tcp: fix build issue with CTTPROXY Recent sockaddr_storage changes broke the almost unused cttproxy code. Fix is obvious.	2011-03-20 14:03:54 +01:00
Willy Tarreau	7d0aaf39d1	[MEDIUM] stats: split frontend and backend stats It's very annoying that frontend and backend stats are merged because we don't know what we're observing. For instance, if a "listen" instance makes use of a distinct backend, it's impossible to know what the bytes_out means. Some points take care of not updating counters twice if the backend points to the frontend, indicating a "listen" instance. The thing becomes more complex when we try to add support for server side keep-alive, because we have to maintain a pointer to the backend used for last request, and to update its stats. But we can't perform such comparisons anymore because the counters will not match anymore. So in order to get rid of this situation, let's have both frontend AND backend stats in the "struct proxy". We simply update the relevant ones during activity. Some of them are only accounted for in the backend, while others are just for frontend. Maybe we can improve a bit on that later, but the essential part is that those counters now reflect what they really mean.	2011-03-13 22:00:23 +01:00
David du Colombier	6f5ccb1589	[MEDIUM] add internal support for IPv6 server addresses This patch turns internal server addresses to sockaddr_storage to store IPv6 addresses, and makes the connect() function use it. This code already works but some caveats with getaddrinfo/gethostbyname still need to be sorted out while the changes had to be merged at this stage of internal architecture changes. So for now the config parser will not emit an IPv6 address yet so that user experience remains unchanged. This change should have absolutely zero user-visible effect, otherwise it's a bug introduced during the merge, that should be reported ASAP.	2011-03-13 22:00:12 +01:00
Willy Tarreau	ac82540c35	[MEDIUM] stream_interface: store the target pointer and type When doing a connect() on a stream interface, some information is needed from the server and from the backend. In some situations, we don't have a server and only a backend (eg: peers). In other cases, we know we have an applet and we don't want to connect to anything, but we'd still like to have the info about the applet being used. For this, we now store a pointer to the "target" into the stream interface. The target describes what's on the other side before trying to connect. It can be a server, a proxy or an applet for now. Later we'll probably have descriptors for multiple-stage chains so that the final information may still be found. This will help removing many specific cases in the code. It already made it possible to remove the "srv" and "be" parameters to tcpv4_connect_server().	2011-03-10 23:32:15 +01:00
Willy Tarreau	f153686a71	[REORG] tcp: make tcpv4_connect_server() take the target address from the SI The address is now available in the stream interface, no need to pass it by argument.	2011-03-10 23:32:15 +01:00
Willy Tarreau	957c0a5845	[REORG] session: move client and server address to the stream interface This will be needed very soon for the keep-alive.	2011-03-10 23:32:14 +01:00
Willy Tarreau	48a7e72c5d	[MINOR] tcp: add support for dynamic MSS setting By passing a negative value to the "mss" argument of "bind" lines, it becomes possible to subtract this value to the MSS advertised by the client, which results in segments smaller than advertised. The effect is useful with some TCP stacks which ACK less often when segments are not full, because they only ACK every other full segment as suggested by RFC1122. NOTE: currently this has no effect on Linux kernel 2.6, a kernel patch is still required to change the MSS of established connections.	2010-12-30 09:50:23 +01:00
Emeric	f2d7caedd1	[MINOR] Add pattern's fetchs payload and payload_lv	2010-11-11 09:29:07 +01:00
Emeric Brun	485479d8e9	[MEDIUM] Create new protected pattern types CONSTSTRING and CONSTDATA to force memcpy if data from protected areas need to be manipulated. Enhance pattern convs and fetch argument parsing, now fetchs and convs callbacks used typed args. Add more details on error messages on parsing pattern expression function. Update existing pattern convs and fetchs to new proto. Create stick table key type "binary". Manage Truncation and padding if pattern's fetch-converted result don't match table key size.	2010-11-11 09:29:07 +01:00
Emeric Brun	97679e7901	[MEDIUM] Implement tcp inspect response rules	2010-11-11 09:28:18 +01:00
Cyril Bont�	43ba1b331c	[MINOR] startup: print the proxy socket which caused an error Add the address and port to the error message of the proxy socket that caused the error. This can be helpful when several listening addresses are used in a proxy. Edit: since we now also support unix sockets (which already report their path), better move the address reporting to proto_tcp.c by analogy. -Willy	2010-11-11 09:26:28 +01:00
Emeric Brun	f769f51af6	[MINOR] Enhance controls of socket's family on acls and pattern fetch	2010-11-09 15:59:42 +01:00
Emeric Brun	cf20bf1c1c	[MEDIUM] Enhance message errors management on binds	2010-11-05 10:34:07 +01:00
emeric	8aa6b3762c	[BUG] proto_tcp: potential bug on pattern fetch dst and dport Pattern fetches relying on destination address must first fetch the address if it has not been done yet. (cherry picked from commit 21abf441feb318b2ccd7df590fd89e9e824627f6)	2010-10-30 19:04:35 +02:00
Willy Tarreau	b824b002cd	[MEDIUM] tcp-request : don't wait for inspect-delay to expire when the buffer is full If a request buffer is full, there's no point waiting for the timeout to expire, the contents will not change.	2010-10-30 19:04:31 +02:00
Willy Tarreau	fb024dc1c9	[BUG] conf: add tcp-request content rules to the correct list Due to the change in commit 68c03, the tcp-request content rules were unfortunately being added to the request rules.	2010-08-20 13:35:41 +02:00
Willy Tarreau	0a4838cd31	[MEDIUM] session-counters: correctly unbind the counters tracked by the backend In case of HTTP keepalive processing, we want to release the counters tracked by the backend. Till now only the second set of counters was released, while it could have been assigned by the frontend, or the backend could also have assigned the first set. Now we reuse to unused bits of the session flags to mark which stick counters were assigned by the backend and to release them as appropriate.	2010-08-10 18:04:16 +02:00
Willy Tarreau	56123282ef	[MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters" The assumption that there was a 1:1 relation between tracked counters and the frontend/backend role was wrong. It is perfectly possible to track the track-fe-counters from the backend and the track-be-counters from the frontend. Thus, in order to reduce confusion, let's remove this useless {fe,be} reference and simply use {1,2} instead. The keywords have also been renamed in order to limit confusion. The ACL rule action now becomes "track-sc{1,2}". The ACLs are now "sc{1,2}_" instead of "trk{fe,be}_". That means that we can reasonably document "sc1" and "sc2" (sticky counters 1 and 2) as sort of patterns that are available during the whole session's life and use them just like any other pattern.	2010-08-10 18:04:15 +02:00
Willy Tarreau	68c03aba9e	[MEDIUM] config: replace 'tcp-request <action>' with "tcp-request connection" It began to be problematic to have "tcp-request" followed by an immediate action, as sometimes it was a keyword indicating a hook or setting ("content" or "inspect-delay") and sometimes it was an action. Now the prefix for connection-level tcp-requests is "tcp-request connection" and the ones processing contents remain "tcp-request contents". This has allowed a nice simplification of the config parser and to clean up the doc a bit. Also now it's a bit more clear why tcp-request connection are not allowed in backends.	2010-08-10 18:04:15 +02:00
Willy Tarreau	d1f9652d90	[MEDIUM] tcp: accept the "track-counters" in "tcp-request content" rules Doing so allows us to track counters from backends or depending on contents. For instance, it now becomes possible to decide to track a connection based on a Host header if enough time is granted to parse the HTTP request. It is also possible to just track frontend counters in the frontend and unconditionally track backend counters in the backend without having to write complex rules. The first track-fe-counters rule executed is used to track counters for the frontend, and the first track-be-counters rule executed is used to track counters for the backend. Nothing prevents a frontend from setting a track-be rule nor a backend from setting a track-fe rule. In fact these rules are arbitrarily split between FE and BE with no dependencies.	2010-08-10 18:04:15 +02:00
Willy Tarreau	f059a0f63a	[MAJOR] session-counters: split FE and BE track counters Having a single tracking pointer for both frontend and backend counters does not work. Instead let's have one for each. The keyword has changed to "track-be-counters" and "track-fe-counters", and the ACL "trk_" changed to "trkfe_" and "trkbe_*".	2010-08-10 18:04:15 +02:00
Willy Tarreau	8b22a71a4d	[MEDIUM] session: move counter ACL fetches from proto_tcp It was not normal to have counter fetches in proto_tcp.c. The only reason was that the key based on the source address was fetched there, but now we have split the key extraction and data processing, we must move that to a more appropriate place. Session seems OK since the counters are all manipulated from here. Also, since we're precisely counting number of connections with these ACLs, we rename them src_conn_cnt and src_updt_conn_cnt. This is not a problem right now since no version was emitted with these keywords.	2010-08-10 18:04:12 +02:00
Willy Tarreau	9ba2dcc86c	[MAJOR] session: add track-counters to track counters related to the session This patch adds the ability to set a pointer in the session to an entry in a stick table which holds various counters related to a specific pattern. Right now the syntax matches the target syntax and only the "src" pattern can be specified, to track counters related to the session's IPv4 source address. There is a special function to extract it and convert it to a key. But the goal is to be able to later support as many patterns as for the stick rules, and get rid of the specific function. The "track-counters" directive may only be set in a "tcp-request" statement right now. Only the first one applies. Probably that later we'll support multi-criteria tracking for a single session and that we'll have to name tracking pointers. No counter is updated right now, only the refcount is. Some subsequent patches will have to bring that feature.	2010-08-10 18:04:12 +02:00
Willy Tarreau	171819b5d7	[MINOR] tcp: src_count acl does not have a permanent result This ACL's count can change along the session's life because it depends on other sessions' activity. Switch it to volatile since any session could appear while evaluating the ACLs.	2010-08-10 18:04:11 +02:00
Willy Tarreau	fb35620e87	[MEDIUM] session: support "tcp-request content" rules in backends Sometimes it's necessary to be able to perform some "layer 6" analysis in the backend. TCP request rules were not available till now, although documented in the diagram. Enable them in backend now.	2010-08-10 14:10:58 +02:00
Willy Tarreau	f535683123	[BUG] config: report the correct proxy type in tcp-request errors A copy-paste typo caused a wrong proxy's type to be reported in case of parsing errors.	2010-06-14 18:40:26 +02:00
Willy Tarreau	6a984fa7c1	[CLEANUP] proto_tcp: make the config parser a little bit more flexible We'll need to let the tcp-request parser able to delegate parsing of track-counters to a commun function, let's prepare it.	2010-06-14 16:44:27 +02:00
Willy Tarreau	cb18364ca7	[MEDIUM] stick_table: separate storage and update of session entries When an entry already exists, we just need to update its expiration timer. Let's have a dedicated function for that instead of spreading open code everywhere. This change also ensures that an update of an existing sticky session really leads to an update of its expiration timer, which was apparently not the case till now. This point needs to be checked in 1.4.	2010-06-14 15:10:26 +02:00
Willy Tarreau	a975b8f381	[MINOR] tcp: add per-source connection rate limiting This change makes use of the stick-tables to keep track of any source address activity. Two ACLs make it possible to check the count of an entry or update it and act accordingly. The typical usage will be to reject a TCP request upon match of an excess value.	2010-06-14 15:10:25 +02:00
Willy Tarreau	2799e98a36	[MINOR] frontend: count denied TCP requests separately It's very disturbing to see the "denied req" counter increase without any other session counter moving. In fact, we can't count a rejected TCP connection as "denied req" as we have not yet instanciated any session at all. Let's use a new counter for that.	2010-06-14 10:53:20 +02:00
Willy Tarreau	a5c0ab200b	[MEDIUM] frontend: check for LI_O_TCP_RULES in the listener The new LI_O_TCP_RULES listener option indicates that some TCP rules must be checked upon accept on this listener. It is now checked by the frontend and the L4 rules are evaluated only in this case. The flag is only set when at least one tcp-req rule is present in the frontend. The L4 rules check function has now been moved to proto_tcp.c where it ought to be.	2010-06-14 10:53:13 +02:00
Willy Tarreau	1a68794418	[MEDIUM] config: parse tcp layer4 rules (tcp-request accept/reject) These rules currently only support the "accept" and "reject" actions. They will apply on pure layer 4 and will not support any content.	2010-06-14 10:53:12 +02:00
Willy Tarreau	eb472685cb	[MEDIUM] separate protocol-level accept() from the frontend's For a long time we had two large accept() functions, one for TCP sockets instanciating proxies, and another one for UNIX sockets instanciating the stats interface. A lot of code was duplicated and both did not work exactly the same way. Now we have a stream_sock layer accept() called for either TCP or UNIX sockets, and this function calls the frontend-specific accept() function which does the rest of the frontend-specific initialisation. Some code is still duplicated (session & task allocation, stream interface initialization), and might benefit from having an intermediate session-level accept() callback to perform such initializations. Still there are some minor differences that need to be addressed first. For instance, the monitor nets should only be checked for proxies and not for other connection templates. Last, we renamed l->private as l->frontend. The "private" pointer in the listener is only used to store a frontend, so let's rename it to eliminate this ambiguity. When we later support detached listeners (eg: FTP), we'll add another field to avoid the confusion.	2010-06-14 10:53:11 +02:00
Willy Tarreau	03fa5df64a	[CLEANUP] rename client -> frontend The 'client.c' file now only contained frontend-specific functions, so it has naturally be renamed 'frontend.c'. Same for client.h. This has also been an opportunity to remove some cross references from files that should not have depended on it. In the end, this file should contain a protocol-agnostic accept() code, which would initialize a session, task, etc... based on an accept() from a lower layer. Right now there are still references to TCP.	2010-06-14 10:53:10 +02:00
Willy Tarreau	645513ade8	[CLEANUP] client: move some ACLs away to their respective locations Some ACLs in the client ought to belong to proto_tcp, or protocols. This file should only contain frontend-specific information and will be renamed that way in next commit.	2010-06-14 10:53:10 +02:00
Willy Tarreau	44b90cc4d8	[CLEANUP] tcp: move some non tcp-specific layer6 processing out of proto_tcp Some functions which act on generic buffer contents without being tcp-specific were historically in proto_tcp.c. This concerns ACLs and RDP cookies. Those have been moved away to more appropriate locations. Ideally we should create some new files for each layer6 protocol parser. Let's do that later.	2010-06-14 10:53:09 +02:00
Willy Tarreau	06457871a4	[CLEANUP] acl: use 'L6' instead of 'L4' in ACL flags relying on contents Just like we do on health checks, we should consider that ACLs that make use of buffer data are layer 6 and not layer 4, because we'll soon have to distinguish between pure layer 4 ACLs (without any buffer) and these ones.	2010-06-14 10:53:09 +02:00
Willy Tarreau	23968d898a	[BUG] tcp: dropped connections must be counted as "denied" not "failed" This probably was a copy-paste typo from the initial tcp-request feature. This must be backported to 1.4 and possibly 1.3.	2010-05-28 18:10:31 +02:00
Willy Tarreau	c4262961f8	[MEDIUM] acl: add tree-based lookups of exact strings Now if some ACL patterns are loaded from a file and the operation is an exact string match, the data will be arranged in a tree, yielding a significant performance boost on large data sets. Note that this only works when case is sensitive. A new dedicated function, acl_lookup_str(), has been created for this matching. It is called for every possible input data to test and it looks the tree up for the data. Since the keywords are loosely typed, we would have had to add a new columns to all keywords to adjust the function depending on the type. Instead, we just compare on the match function. We call acl_lookup_str() when we could use acl_match_str(). The tree lookup is performed first, then the remaining patterns are attempted if the tree returned nothing. A quick test shows that when matching a header against a list of 52000 network names, haproxy uses 68% of one core on a core2-duo 3.2 GHz at 42000 requests per second, versus 66% without any rule, which means only a 2% CPU increase for 52000 rules. Doing the same test without the tree leads to 100% CPU at 6900 requests/s. Also it was possible to run the same test at full speed with about 50 sets of 52000 rules without any measurable performance drop.	2010-05-13 21:37:45 +02:00
Willy Tarreau	090466c91a	[MINOR] add new tproxy flags for dynamic source address binding This patch adds a new TPROXY bind type, TPROXY_DYN, to indicate to the TCP connect function that we want to bind to the address passed in argument.	2010-03-30 09:59:44 +02:00
Willy Tarreau	b1d67749db	[MEDIUM] backend: move the transparent proxy address selection to backend The transparent proxy address selection was set in the TCP connect function which is not the most appropriate place since this function has limited access to the amount of parameters which could produce a source address. Instead, now we determine the source address in backend.c:connect_server(), right after calling assign_server_address() and we assign this address in the session and pass it to the TCP connect function. This cannot be performed in assign_server_address() itself because in some cases (transparent mode, dispatch mode or http_proxy mode), we assign the address somewhere else. This change will open the ability to bind to addresses extracted from many other criteria (eg: from a header).	2010-03-30 09:59:43 +02:00
Willy Tarreau	ef6494cb8c	[CLEANUP] config: use build_acl_cond() instead of parse_acl_cond() This allows to clean up the code a little bit by moving some of the ACL internals out of the config parser.	2010-01-28 17:12:36 +01:00
Willy Tarreau	e803de2c6b	[MINOR] add the ability to force kernel socket buffer size. Sometimes we need to be able to change the default kernel socket buffer size (recv and send). Four new global settings have been added for this : - tune.rcvbuf.client - tune.rcvbuf.server - tune.sndbuf.client - tune.sndbuf.server Those can be used to reduce kernel memory footprint with large numbers of concurrent connections, and to reduce risks of write timeouts with very slow clients due to excessive kernel buffering.	2010-01-22 11:49:41 +01:00
Willy Tarreau	7c3c54177a	[MAJOR] buffers: automatically compute the maximum buffer length We used to apply a limit to each buffer's size in order to leave some room to rewrite headers, then we used to remove this limit once the session switched to a data state. Proceeding that way becomes a problem with keepalive because we have to know when to stop reading too much data into the buffer so that we can leave some room again to process next requests. The principle we adopt here consists in only relying on to_forward+send_max. Indeed, both of those data define how many bytes will leave the buffer. So as long as their sum is larger than maxrewrite, we can safely fill the buffers. If they are smaller, then we refrain from filling the buffer. This means that we won't risk to fill buffers when reading last data chunk followed by a POST request and its contents. The only impact identified so far is that we must ensure that the BF_FULL flag is correctly dropped when starting to forward. Right now this is OK because nobody inflates to_forward without using buffer_forward().	2009-12-22 10:06:34 +01:00
Krzysztof Piotr Oledzki	97f07b832f	[MEDIUM] Decrease server health based on http responses / events, version 3 Implement decreasing health based on observing communication between HAProxy and servers. Changes in this version 2: - documentation - close race between a started check and health analysis event - don't force fastinter if it is not set - better names for options - layer4 support Changes in this version 3: - add stats - port to the current 1.4 tree	2009-12-16 00:29:27 +01:00
Willy Tarreau	8d5d77efc3	[OPTIM] move some rarely used fields out of fdtab Some rarely information are stored in fdtab, making it larger for no reason (source port ranges, remote address, ...). Such information lie there because the checks can't find them anywhere else. The goal will be to move these information to the stream interface once the checks make use of it. For now, we move them to an fdinfo array. This simple change might have improved the cache hit ratio a little bit because a 0.5% of performance increase has measured.	2009-10-18 08:17:33 +02:00
Willy Tarreau	cb6cd43725	[MINOR] tcp: add support for the defer_accept bind option This can ensure that data is readily available on a socket when we accept it, but a bug in the kernel ignores the timeout so the socket can remain pending as long as the client does not talk. Use with care.	2009-10-13 07:34:14 +02:00
Krzysztof Piotr Oledzki	aeebf9ba65	[MEDIUM] Collect & provide separate statistics for sockets, v2 This patch allows to collect & provide separate statistics for each socket. It can be very useful if you would like to distinguish between traffic generate by local and remote users or between different types of remote clients (peerings, domestic, foreign). Currently no "Session rate" is supported, but adding it should be possible if we found it useful.	2009-10-04 18:56:02 +02:00
Krzysztof Piotr Oledzki	052d4fd07d	[CLEANUP] Move counters to dedicated structures Move counters from "struct proxy" and "struct server" to "struct pxcounters" and "struct svcounters". This patch should make no functional change.	2009-10-04 18:32:39 +02:00
Willy Tarreau	520d95e42b	[MAJOR] buffers: split BF_WRITE_ENA into BF_AUTO_CONNECT and BF_AUTO_CLOSE The BF_WRITE_ENA buffer flag became very complex to deal with, because it was used to : - enable automatic connection - enable close forwarding - enable data forwarding The last point was not very true anymore since we introduced ->send_max, but still the test remained everywhere. This was causing issues such as impossibility to connect without forwarding data, impossibility to prevent closing when data was forwarded, etc... This patch clarifies the situation by getting rid of this multi-purpose flag and replacing it with : - data forwarding based only on ->send_max \|\| ->pipe ; - a new BF_AUTO_CONNECT flag to allow automatic connection and only that ; - ability to perform an automatic connection when ->send_max or ->pipe indicate that data is waiting to leave the buffer ; - a new BF_AUTO_CLOSE flag to let the producer automatically set the BF_SHUTW_NOW flag when it gets a BF_SHUTR. During this cleanup, it was discovered that some tests were performed twice, or that the BF_HIJACK flag was still tested, which is not needed anymore since ->send_max replcaed it. These places have been fixed too. These cleanups have also revealed a few areas where the other flags such as BF_EMPTY are not cleanly used. This will be an opportunity for a second patch.	2009-09-19 21:14:54 +02:00
Dmitry Sivachenko	caf58986fb	[BUILD] compilation of haproxy-1.4-dev2 on FreeBSD Please consider the following patches. They are required to compile haproxy-1.4-dev2 on FreeBSD. Summary: 1) include <sys/types.h> before <netinet/tcp.h> 2) Use IPPROTO_TCP instead of SOL_TCP (they are both defined as 6, TCP protocol number)	2009-08-30 14:45:19 +02:00
Willy Tarreau	9650f37628	[MEDIUM] move connection establishment from backend to the SI. The connection establishment was completely handled by backend.c which normally just handles LB algos. Since it's purely TCP, it must move to proto_tcp.c. Also, instead of calling it directly, we now call it via the stream interface, which will later help us unify session handling.	2009-08-16 17:46:15 +02:00
Willy Tarreau	c9fce2fee8	[BUILD] fix build for systems without SOL_TCP Andrew Azarov reported that haproxy-1.4-dev1 does not build under FreeBSD 7.2 because SOL_TCP is not defined. So add a check for its definition before using it. This only impacts network optimisations anyway.	2009-08-16 14:13:47 +02:00
Willy Tarreau	606ad73e73	[BUG] config: tcp-request content only accepts "if" or "unless" As reported by Maik Broemme, if something different from "if" or "unless" was specified after "tcp-request content accept", the condition would silently remain void. The parser must obviously complain since this typically corresponds to a forgotten "if".	2009-07-14 21:17:05 +02:00
Willy Tarreau	1a211943f6	[MINOR] acl: don't complain anymore when using L7 acls in TCP Since TCP can now check contents using L7 acls, we must not complain anymore.	2009-07-14 13:53:17 +02:00
Emeric Brun	647caf1ebc	[MEDIUM] add support for RDP cookie persistence The new statement "persist rdp-cookie" enables RDP cookie persistence. The RDP cookie is then extracted from the RDP protocol, and compared against available servers. If a server matches the RDP cookie, then it gets the connection.	2009-07-14 12:50:40 +02:00
Emeric Brun	bede3d0ef4	[MINOR] acl: add support for matching of RDP cookies The RDP protocol is quite simple and documented, which permits an easy detection and extraction of cookies. It can be useful to match the MSTS cookie which can contain the username specified by the client.	2009-07-14 12:50:39 +02:00
Willy Tarreau	51d5dad90a	[MINOR] allow TCP inspection rules to make use of HTTP ACLs Since we can call the HTTP parser from TCP inspection rules, it makes sense to be able to use the HTTP ACLs with it. That way, we can decide from a TCP frontend to take a switching decision based on full layer7 decoding. This might be useful to perform layer7 content switching from a layer4 frontend in fact. For instance, we might want to be able to detect http/https on a frontend, but still switch to backend X or Y depending on the Host header. Note that it is mandatory to wait for an HTTP request otherwise the ACLs will randomly match.	2009-07-12 10:10:05 +02:00
Willy Tarreau	a9fb08317f	[MINOR] report in the proxies the requirements for ACLs This patch propagates the ACL conditions' "requires" bitfield to the proxies. This makes it possible to know exactly what a proxy might have to support for any request, which helps knowing whether we have to allocate some space for certain types of structures or not (eg: the hdr_idx struct). The concept might be extended to a lot more types of information, such as detecting whether we need to allocate some space for some request ACLs which need a result in the response, etc...	2009-07-10 23:09:39 +02:00
Willy Tarreau	3a816293e9	[MEDIUM] session: tell analysers what bit they were called for Some stream analysers might become generic enough to be called for several bits. So we cannot have the analyser bit hard coded into the analyser itself. Let's make the caller inform the callee.	2009-07-07 10:55:49 +02:00
Willy Tarreau	5d707e1aaa	[MEDIUM] stream_sock: don't close prematurely when nolinger is set When the nolinger option is used, we must not close too fast because some data might be left unsent. Instead we must proceed with a normal shutdown first, then a close. Also, we want to avoid merging FIN with the last segment if nolinger is set, because if that one gets lost, there is no chance for it to be retransmitted.	2009-06-28 11:09:07 +02:00
Willy Tarreau	be1b91842a	[MEDIUM] add support for TCP MSS adjustment for listeners Sometimes it can be useful to limit the advertised TCP MSS on incoming connections, for instance when requests come through a VPN or when the system is running with jumbo frames enabled. Passing the "mss <value>" arguments to a "bind" line will set the value. This works under Linux >= 2.6.28, and maybe a few earlier ones, though due to an old kernel bug most of earlier versions will probably ignore it. It is also possible that some other OSes will support this.	2009-06-14 18:48:19 +02:00
Willy Tarreau	fb14edc215	[MEDIUM] stream_sock: implement tcp-cork for use during shutdowns on Linux Setting TCP_CORK on a socket before sending the last segment enables automatic merging of this segment with the FIN from the shutdown() call. Playing with TCP_CORK is not easy though as we have to track the status of the TCP_NODELAY flag since both are mutually exclusive. Doing so saves one more packet per session and offers about 5% more performance. There is no reason not to do it, so there is no associated option.	2009-06-14 15:24:37 +02:00
Willy Tarreau	9ea05a790f	[MEDIUM] implement option tcp-smart-accept at the frontend This option disables TCP quick ack upon accept. It is also automatically enabled in HTTP mode, unless the option is explicitly disabled with "no option tcp-smart-accept". This saves one packet per connection which can bring reasonable amounts of bandwidth for servers processing small requests.	2009-06-14 12:07:01 +02:00
Willy Tarreau	8e80e0bc4c	[BUG] fix parser crash on unconditional tcp content rules Since 1.3.17, a config containing one of the following lines would crash the parser : tcp content reject tcp content accept This is because a check is performed on the condition which is not specified. The obvious fix consists in checkinf for a condition first.	2009-05-10 12:22:39 +02:00
Willy Tarreau	61d188920e	[MINOR] improve reporting of misplaced acl/reqxxx rules Now we can detect improper ordering of "block", "reqxxx", "reqadd", "redirect" and "use_backend", and warn the user accordingly.	2009-03-31 10:49:21 +02:00
Willy Tarreau	86ef7dc98d	[MINOR] tcp_request: let the caller take care of errors and timeouts tcp_request is not meant to decide how an error or a timeout has to be handled. It must just apply it rules. Now that the error checks have been added to the session, we don't need to check them anymore in tcp_request_inspect(), which will only consider the shutdown which may be the result of such an error. That makes a lot more sense since tcp_request is not really waiting for a request.	2009-03-15 22:55:47 +01:00
Willy Tarreau	5af24efee9	[CLEANUP] config: catch and report some possibly wrong rule ordering There are some configurations in which redirect rules are declared after use_backend rules. We can also find "block" rules after any of these ones. The processing sequence is : - block - redirect - use_backend So as of now we try to detect wrong ordering to warn the user about a possibly undesired behaviour.	2009-03-15 15:23:16 +01:00
Willy Tarreau	d869b24119	[MINOR] tcp-inspect: permit the use of no-delay inspection Sometimes it may make sense to be able to immediately apply a verdict without waiting at all. It was not possible because no inspect-delay meant no inspection at all. This is now fixed.	2009-03-15 14:43:58 +01:00
Willy Tarreau	604e83097f	[BUG] interface binding: length must include the trailing zero The interface length passed to the setsockopt(SO_BINDTODEVICE) must include the trailing \0. Otherwise it will randomly fail.	2009-03-06 00:48:23 +01:00
Willy Tarreau	5e6e204d1c	[MINOR] add support for bind interface name By appending "interface <name>" to a "bind" line, it is now possible to specifically bind to a physical interface name. Note that this currently only works on Linux and requires root privileges.	2009-02-04 17:19:29 +01:00
Willy Tarreau	03d60bbaf9	[OPTIM] buffer: replace rlim by max_len In the buffers, the read limit used to leave some place for header rewriting was set by a pointer to the end of the buffer. Not only this required subtracts at every place in the code, but this will also soon not be usable anymore when we want to support keepalive. Let's replace this with a length limit, comparable to the buffer's length. This has also sightly reduced the code size.	2009-01-09 11:14:39 +01:00
Willy Tarreau	b5654f6ff4	[MINOR] move the listener reference from fd to session The listener referenced in the fd was only used to check the listener state upon session termination. There was no guarantee that the FD had not been reassigned by the moment it was processed, so this was a bit racy. Having it in the session is more robust.	2008-12-07 16:45:10 +01:00
Willy Tarreau	edcf6687d6	[MEDIUM] extract TCP request processing from HTTP The TCP analyser has moved to proto_tcp.c. Breaking the function has required finer use of the return value and adding some tests to process_session().	2008-11-30 23:15:34 +01:00
Willy Tarreau	dded32defa	[MINOR] replace client_retnclose() with stream_int_retnclose() This makes more sense to return a message to a stream interface than to a session. senddata.{c,h} have been removed.	2008-11-30 19:48:07 +01:00
Willy Tarreau	eabf313df2	[MINOR] change type of fdtab[]->owner to void* The owner of an fd was initially a task but this was sometimes casted to a (struct listener ). We'll soon need more types, so void is more appropriate.	2008-11-02 10:19:08 +01:00
Willy Tarreau	c7e961e5f7	[BUILD] fix warning in proto_tcp.c with gcc >= 4 signedness issues.	2008-08-17 17:13:47 +02:00
Willy Tarreau	dd64f8d394	[MEDIUM] acl: when possible, report the name and requirements of ACLs in warnings When an ACL is referenced at a wrong place (eg: response during request, layer7 during layer4), try to indicate precisely the name and requirements of this ACL. Only the first faulty ACL is returned. A small change consisting in iterating that way may improve reports : cap = ACL_USE_any_unexpected while ((acl=cond_find_require(cond, cap))) { warning() cap &= ~acl->requires; } This will report the first ACL of each unsupported type. But doing so will mangle the error reporting a lot, so we need to rework error reports first.	2008-08-03 09:41:05 +02:00
Willy Tarreau	0ceba5af74	[MEDIUM] acl: set types on all currently known ACL verbs All currently known ACL verbs have been assigned a type which makes it possible to detect inconsistencies, such as response values used in request rules.	2008-07-25 19:31:03 +02:00
Willy Tarreau	ec6c5df018	[CLEANUP] remove many #include <types/xxx> from C files It should be stated as a rule that a C file should never include types/xxx.h when proto/xxx.h exists, as it gives less exposure to declaration conflicts (one of which was caught and fixed here) and it complicates the file headers for nothing. Only types/global.h, types/capture.h and types/polling.h have been found to be valid includes from C files.	2008-07-16 10:30:42 +02:00
Willy Tarreau	284648e079	[CLEANUP] remove unused include/types/client.h This file is not used anymore.	2008-07-16 10:30:40 +02:00
Willy Tarreau	655e26af24	[MINOR] acl: add req_ssl_ver in TCP, to match an SSL version This new keyword matches an dotted version mapped into an integer. It permits to match an SSL message protocol version just as if it was an integer, so that it is easy to map ranges, like this : acl obsolete_ssl req_ssl_ver lt 3 acl correct_ssl req_ssl_ver 3.0-3.1 acl invalid_ssl req_ssl_ver gt 3.1 Both SSLv2 hello messages and SSLv3 messages are supported. The test tries to be strict enough to avoid being easily fooled. In particular, it waits for as many bytes as announced in the message header if this header looks valid (bound to the buffer size). The same decoder will be usable with minor changes to check the response messages.	2008-07-16 10:30:06 +02:00
Willy Tarreau	b686644ad8	[MAJOR] implement tcp request content inspection Some people need to inspect contents of TCP requests before deciding to forward a connection or not. A future extension of this demand might consist in selecting a server farm depending on the protocol detected in the request. For this reason, a new state CL_STINSPECT has been added on the client side. It is immediately entered upon accept() if the statement "tcp-request inspect-delay <xxx>" is found in the frontend configuration. Haproxy will then wait up to this amount of time trying to find a matching ACL, and will either accept or reject the connection depending on the "tcp-request content <action> {if\|unless}" rules, where <action> is either "accept" or "reject". Note that it only waits that long if no definitive verdict can be found earlier. That generally implies calling a fetch() function which does not have enough information to decode some contents, or a match() function which only finds the beginning of what it's looking for. It is only at the ACL level that partial data may be processed as such, because we need to distinguish between MISS and FAIL before applying the term negation. Thus it is enough to add "\| ACL_PARTIAL" to the last argument when calling acl_exec_cond() to indicate that we expect ACL_PAT_MISS to be returned if some data is missing (for fetch() or match()). This is the only case we may return this value. For this reason, the ACL check in process_cli() has become a lot simpler. A new ACL "req_len" of type "int" has been added. Right now it is already possible to drop requests which talk too early (eg: for SMTP) or which don't talk at all (eg: HTTP/SSL). Also, the acl fetch() functions have been extended in order to permit reporting of missing data in case of fetch failure, using the ACL_TEST_F_MAY_CHANGE flag. The default behaviour is unchanged, and if no rule matches, the request is accepted. As a side effect, all layer 7 fetching functions have been cleaned up so that they now check for the validity of the layer 7 pointer before dereferencing it.	2008-07-16 10:29:07 +02:00
Willy Tarreau	d6f087ea1c	[BUG] fix truncated responses with sepoll Due to the way Linux delivers EPOLLIN and EPOLLHUP, a closed connection received after some server data sometimes results in truncated responses if the client disconnects before server starts to respond. The reason is that the EPOLLHUP flag is processed as an indication of end of transfer while some data may remain in the system's socket buffers. This problem could only be triggered with sepoll, although nothing should prevent it from happening with normal epoll. In fact, the work factoring performed by sepoll increases the risk that this bug appears. The fix consists in making FD_POLL_HUP and FD_POLL_ERR sticky and that they are only checked if FD_POLL_IN is not set, meaning that we have read all pending data. That way, the problem is definitely fixed and sepoll still remains about 17% faster than epoll since it can take into account all information returned by the kernel.	2008-01-18 17:20:13 +01:00
Willy Tarreau	e8c66afd41	[MEDIUM] fix server health checks source address selection The source address selection for health checks did not consider the new transparent proxy method. Rely on the same unified function as the other connect() calls. This patch also fixes a bug by which the proxy's source address was ignored if cttproxy was used.	2008-01-13 18:40:14 +01:00
Willy Tarreau	0a45989de3	[MINOR] add transparent proxy support for balabit's Tproxy v4 Balabit's TPROXY version 4 which replaces CTTPROXY provides a similar API to the previous proxy, but relies on IP_FREEBIND instead of IP_TRANSPARENT. Let's add it.	2008-01-13 17:37:16 +01:00
Willy Tarreau	b1e52e8c44	[MEDIUM] support fully transparent proxy on Linux (USE_LINUX_TPROXY) Using some Linux kernel patches, it is possible to redirect non-local traffic to local sockets when IP forwarding is enabled. In order to enable this option, we introduce the "transparent" option keyword on the "bind" command line. It will make the socket reachable by remote sources even if the destination address does not belong to the machine.	2008-01-13 14:49:51 +01:00
Willy Tarreau	c73ce2b111	[MINOR] add support for the "backlog" parameter Add the "backlog" parameter to frontends, to give hints to the system about the approximate listen backlog desired size. In order to protect against SYN flood attacks, one solution is to increase the system's SYN backlog size. Depending on the system, sometimes it is just tunable via a system parameter, sometimes it is not adjustable at all, and sometimes the system relies on hints given by the application at the time of the listen() syscall. By default, HAProxy passes the frontend's maxconn value to the listen() syscall. On systems which can make use of this value, it can sometimes be useful to be able to specify a different value, hence this backlog parameter.	2008-01-06 10:55:10 +01:00
Willy Tarreau	e6b989479c	[MAJOR] create proto_tcp and move initialization of proxy listeners Proxy listeners were very special and not very easy to manipulate. A proto_tcp file has been created with all that is required to manage TCPv4/TCPv6 as raw protocols, and provide generic listeners. The code of start_proxies() and maintain_proxies() now looks less like spaghetti. Also, event_accept will need a serious lifting in order to use more of the information provided by the listener.	2007-11-04 22:42:49 +01:00

... 4 5 6 7 8 ...

547 Commits