Commit Graph

1158 Commits

Author SHA1 Message Date
Willy Tarreau
436d9ed808 [REORG] http: move HTTP error codes back to proto_http.h
This one was left isolated in its own file. It probably is a leftover
from the 1.2->1.3 split.
2011-05-11 16:31:43 +02:00
Willy Tarreau
c9f6011760 [BUG] TCP source tracking was broken with IPv6 changes
John Helliwell reported a bug when using TCP source address
tracking on Solaris. The bug was introduced in haproxy 1.5-dev5.
2011-04-07 10:53:25 +02:00
Willy Tarreau
1b4b7ce6dd [BUG] stream_sock: use get_addr_len() instead of sizeof() on sockaddr_storage
John Helliwell reported a runtime issue on Solaris since 1.5-dev5. Traces
show that connect() returns EINVAL, which means the socket length is not
appropriate for the family. Solaris does not like being called with sizeof
and needs the address family's size on sockaddr_storage.

The fix consists in adding a get_addr_len() function which returns the
socket's address length based on its family. Tests show that this works
for both IPv4 and IPv6 addresses.
2011-04-05 16:56:50 +02:00
David du Colombier
4f92d32004 [MEDIUM] IPv6 support for stick-tables
Since IPv6 is a different type than IPv4, the pattern fetch functions
src6 and dst6 were added. IPv6 stick-tables can also fetch IPv4 addresses
with src and dst. In this case, the IPv4 addresses are mapped to their
IPv6 counterpart, according to RFC 4291.
2011-03-29 01:09:14 +02:00
David du Colombier
11bcb6c4f5 [MEDIUM] IPv6 support for syslog 2011-03-28 18:45:15 +02:00
Willy Tarreau
0bc3493d2c [OPTIM] buffers: uninline buffer_forward()
Since the latest additions to buffer_forward(), it became too large for
inlining, so let's uninline it. The code size drops by 3kB. Should be
backported to 1.4 too.
2011-03-28 16:25:58 +02:00
Willy Tarreau
d8ee85a0a3 [BUG] http: fix content-length handling on 32-bit platforms
Despite much care around handling the content-length as a 64-bit integer,
forwarding was broken on 32-bit platforms due to the 32-bit nature of
the ->to_forward member of the "buffer" struct. The issue is that this
member is declared as a long, so while it works OK on 64-bit platforms,
32-bit truncate the content-length to the lower 32-bits.

One solution could consist in turning to_forward to a long long, but it
is used a lot in the critical path, so it's not acceptable to perform
all buffer size computations on 64-bit there.

The fix consists in changing the to_forward member to a strict 32-bit
integer and ensure in buffer_forward() that only the amount of bytes
that can fit into it is considered. Callers of buffer_forward() are
responsible for checking that their data were taken into account. We
arbitrarily ensure we never consider more than 2G at once.

That's the way it was intended to work on 32-bit platforms except that
it did not.

This issue was tracked down hard at Exosec with Bertrand Jacquin,
Thierry Fournier and Julien Thomas. It remained undetected for a long
time because files larger than 4G are almost always transferred in
chunked-encoded format, and most platforms dealing with huge contents
these days run on 64-bit.

The bug affects all 1.5 and 1.4 versions, and must be backported.
2011-03-28 16:25:16 +02:00
Willy Tarreau
d3db94399f [MINOR] tools: add two macros MID_RANGE and MAX_RANGE
Those will be used later, they return the largest and middle integer
possible for a given variable or type.
2011-03-28 15:55:43 +02:00
Willy Tarreau
fab5a43726 [MEDIUM] config: rework the IPv4/IPv6 address parser to support host-only addresses
The parser now distinguishes between pure addresses and address:port. This
is useful for some config items where only an address is required.

Raw IPv6 addresses are now parsed, but IPv6 host name resolution is still not
handled (gethostbyname does not resolve IPv6 names to addresses).
2011-03-23 19:01:18 +01:00
David du Colombier
64e9c90e69 [BUG] standard: is_addr return value for IPv4 was inverted 2011-03-22 14:39:16 +01:00
Willy Tarreau
5ab04ec47c [MEDIUM] server: add support for the "send-proxy" option
This option enables use of the PROXY protocol with the server, which
allows haproxy to transport original client's address across multiple
architecture layers.
2011-03-20 11:53:50 +01:00
Willy Tarreau
b22e55bc8f [MEDIUM] stream_sock: add support for sending the proxy protocol header line
Upon connection establishment, stream_sock is now able to send a PROXY
line before sending any data. Since it's possible that the buffer is
already full, and we don't want to allocate a block for that line, we
compute it on-the-fly when we need it. We just store the offset from
which to (re-)send from the end of the line, since it's assumed that
multiple outputs of the same proxy line will be strictly equivalent. In
practice, one call is enough. We just make sure to handle the case where
the first send() would indicate an incomplete output, eventhough it's
very unlikely to ever happen.
2011-03-20 10:16:46 +01:00
Willy Tarreau
a73fcaf424 [MINOR] frontend: add a make_proxy_line function
This function will build a PROXY protocol line header from two addresses
(IPv4 or IPv6). AF_UNIX family will be reported as UNKNOWN.
2011-03-20 10:15:22 +01:00
Willy Tarreau
ff011f26e9 [REORG] http: move the http-request rules to proto_http
And also rename "req_acl_rule" "http_req_rule". At the beginning that
was a bit confusing to me, especially the "req_acl" list which in fact
holds what we call rules. After some digging, it appeared that some
part of the code is 100% HTTP and not just related to authentication
anymore, so let's move that part to HTTP and keep the auth-only code
in auth.c.
2011-03-13 22:00:24 +01:00
Willy Tarreau
7d0aaf39d1 [MEDIUM] stats: split frontend and backend stats
It's very annoying that frontend and backend stats are merged because we
don't know what we're observing. For instance, if a "listen" instance
makes use of a distinct backend, it's impossible to know what the bytes_out
means.

Some points take care of not updating counters twice if the backend points
to the frontend, indicating a "listen" instance. The thing becomes more
complex when we try to add support for server side keep-alive, because we
have to maintain a pointer to the backend used for last request, and to
update its stats. But we can't perform such comparisons anymore because
the counters will not match anymore.

So in order to get rid of this situation, let's have both frontend AND
backend stats in the "struct proxy". We simply update the relevant ones
during activity. Some of them are only accounted for in the backend,
while others are just for frontend. Maybe we can improve a bit on that
later, but the essential part is that those counters now reflect what
they really mean.
2011-03-13 22:00:23 +01:00
David du Colombier
6f5ccb1589 [MEDIUM] add internal support for IPv6 server addresses
This patch turns internal server addresses to sockaddr_storage to
store IPv6 addresses, and makes the connect() function use it. This
code already works but some caveats with getaddrinfo/gethostbyname
still need to be sorted out while the changes had to be merged at
this stage of internal architecture changes. So for now the config
parser will not emit an IPv6 address yet so that user experience
remains unchanged.

This change should have absolutely zero user-visible effect, otherwise
it's a bug introduced during the merge, that should be reported ASAP.
2011-03-13 22:00:12 +01:00
Willy Tarreau
827aee913f [MAJOR] session: remove the ->srv pointer from struct session
This one has been removed and is now totally superseded by ->target.
To get the server, one must use target_srv(&s->target) instead of
s->srv now.

The function ensures that non-server targets still return NULL.
2011-03-10 23:32:17 +01:00
Willy Tarreau
9e000c6ec8 [CLEANUP] stream_interface: use inline functions to manipulate targets
The connection target involves a type and a union of pointers, let's
make the code cleaner using simple wrappers.
2011-03-10 23:32:17 +01:00
Willy Tarreau
3d80d911aa [MEDIUM] session: remove s->prev_srv which is not needed anymore
s->prev_srv is used by assign_server() only, but all code paths leading
to it now take s->prev_srv from the existing s->srv. So assign_server()
can do that copy into its own stack.

If at one point a different srv is needed, we still have a copy of the
last server on which we failed a connection attempt in s->target.
2011-03-10 23:32:16 +01:00
Willy Tarreau
664beb8610 [MINOR] session: add a pointer to the new target into the session
When dealing with HTTP keep-alive, we'll have to know if we can reuse
an existing connection. For that, we'll have to check if the current
connection was made on the exact same target (referenced in the stream
interface).

Thus, we need to first assign the next target to the session, then
copy it to the stream interface upon connect(). Later we'll check for
equivalence between those two operations.
2011-03-10 23:32:16 +01:00
Willy Tarreau
f5ab69aad9 [MINOR] proxy: add PR_O2_DISPATCH to detect dispatch mode
Till now we used the fact that the dispatch address was not null to use
the dispatch mode. This is very unconvenient, so let's have a dedicated
option.
2011-03-10 23:32:16 +01:00
Willy Tarreau
295a837726 [REORG] session: move the data_ctx struct to the stream interface's applet
This is in fact where those parts belong to. The old data_state was replaced
by applet.state and is now initialized when the applet is registered. It's
worth noting that the applet does not need to know the session nor the
buffer anymore since everything is brought by the stream interface.

It is possible that having a separate applet struct would simplify the
code but that's not a big deal.
2011-03-10 23:32:16 +01:00
Willy Tarreau
5ec29ffa42 [CLEANUP] stats: make all dump functions only rely on the stream interface
This will be needed to move the applet-specific data out of the session.
2011-03-10 23:32:16 +01:00
Willy Tarreau
75581aebb0 [CLEANUP] session: remove data_source from struct session
This one was only used for logging purposes, it's not needed
anymore.
2011-03-10 23:32:15 +01:00
Willy Tarreau
7c0a151a2e [CLEANUP] stream_interface: remove the applet.handler pointer
Now that we have the target pointer and type in the stream interface,
we don't need the applet.handler pointer anymore. That makes the code
somewhat cleaner because we know we're dealing with an applet by checking
its type instead of checking the pointer is not null.
2011-03-10 23:32:15 +01:00
Willy Tarreau
ac82540c35 [MEDIUM] stream_interface: store the target pointer and type
When doing a connect() on a stream interface, some information is needed
from the server and from the backend. In some situations, we don't have
a server and only a backend (eg: peers). In other cases, we know we have
an applet and we don't want to connect to anything, but we'd still like
to have the info about the applet being used.

For this, we now store a pointer to the "target" into the stream interface.
The target describes what's on the other side before trying to connect. It
can be a server, a proxy or an applet for now. Later we'll probably have
descriptors for multiple-stage chains so that the final information may
still be found.

This will help removing many specific cases in the code. It already made
it possible to remove the "srv" and "be" parameters to tcpv4_connect_server().
2011-03-10 23:32:15 +01:00
Willy Tarreau
f153686a71 [REORG] tcp: make tcpv4_connect_server() take the target address from the SI
The address is now available in the stream interface, no need to pass it by
argument.
2011-03-10 23:32:15 +01:00
Willy Tarreau
957c0a5845 [REORG] session: move client and server address to the stream interface
This will be needed very soon for the keep-alive.
2011-03-10 23:32:14 +01:00
Willy Tarreau
be5ea19188 [REORG] stream_interface: split the struct members in 3 parts
Those 3 parts are the buffer side, the remote side and the communication
functions. This change has no functional effect but is needed to proceed
further.
2011-03-10 23:32:14 +01:00
Willy Tarreau
bc4af0573c [REORG] stream_interface: move the st0, st1 and private members to the applet
Those fields are only used by the applets, so let's move them to the
struct.
2011-03-10 23:32:14 +01:00
Willy Tarreau
b24281b0ff [MINOR] stream_interface: make use of an applet descriptor for IO handlers
I/O handlers are still delicate to manipulate. They have no type, they're
just raw functions which have no knowledge of themselves. Let's have them
declared as applets once for all. That way we can have multiple applets
share the same handler functions and we can store their names there. When
we later need to add more parameters (eg: usage stats), we'll be able to
do so in the applets themselves.

The CLI functions has been prefixed with "cli" instead of "stats" as it's
clearly what is going on there.

The applet descriptor in the stream interface should get all the applet
specific data (st0, ...) but this will be done in the next patch so that
we don't pollute this one too much.
2011-03-10 23:32:14 +01:00
Willy Tarreau
124d99181c [BUG] http: fix computation of message body length after forwarding has started
Till now, the forwarding code was making use of the hdr_content_len member
to hold the size of the last chunk parsed. As such, it was reset after being
scheduled for forwarding. The issue is that this entry was reset before the
data could be viewed by backend.c in order to parse a POST body, so the
"balance url_param check_post" did not work anymore.

In order to fix this, we need two things :
  - the chunk size (reset upon every forward)
  - the total body size (not reset)

hdr_content_len was thus replaced by the former (hence the size of the patch)
as it makes more sense to have it stored that way than the way around.

This patch should be backported to 1.4 with care, considering that it affects
the forwarding code.
2011-03-01 20:30:48 +01:00
Willy Tarreau
930bd6b763 [MINOR] acl: add ability to check for internal response-only parameters
Some parameters are known to the response only (eg: the server which processed
it). Let's have a flag for that.
2011-02-23 15:32:20 +01:00
Rauf Kuliyev
38b4156a69 [MINOR] checks: add PostgreSQL health check
I have written a small patch to enable a correct PostgreSQL health check
It works similar to mysql-check with the very same parameters.

E.g.:
listen pgsql 127.0.0.1:5432
   mode tcp
   option pgsql-check user pgsql
   server masterdb pgsql.server.com:5432 check inter 10000
2011-01-04 15:14:13 +01:00
Kevinm
48936af9a2 [MINOR] log: ability to override the syslog tag
One of the requirements we have is to run multiple instances of haproxy on a
single host; this is so that we can split the responsibilities (and change
permissions) between product teams. An issue we ran up against is how we
would distinguish between the logs generated by each instance. The solution
we came up with (please let me know if there is a better way) is to override
the application tag written to syslog. We can then configure syslog to write
these to different files.

I have attached a patch adding a global option 'log-tag' to override the
default syslog tag 'haproxy' (actually defaults to argv[0]).
2010-12-30 11:43:36 +01:00
Joe Williams
df5b38fac1 [MINOR] log: add support for passing the forwarded hostname
Haproxy does not include the hostname rather the IP of the machine in
the syslog headers it sends. Unfortunately this means that for each log
line rsyslog does a reverse dns on the client IP and in the case of
non-routable IPs one gets the public hostname not the internal one.

While this is valid according to RFC3164 as one might imagine this is
troublsome if you have some machines with public IPs, internal IPs, no
reverse DNS entries, etc and you want a standardized hostname based log
directory structure. The rfc says the preferred value is the hostname.

This patch adds a global "log-send-hostname" statement which accepts an
optional string to force the host name. If unset, the local host name
is used.
2010-12-29 17:05:48 +01:00
Willy Tarreau
0499e3575c [BUG] http: analyser optimizations broke pipelining
HTTP pipelining currently needs to monitor the response buffer to wait
for some free space to be able to send a response. It was not possible
for the HTTP analyser to be called based on response buffer activity.
Now we introduce a new buffer flag BF_WAKE_ONCE which is set when the
HTTP request analyser is set on the response buffer and some activity
is detected. This is not clean at all but once of the only ways to fix
the issue before we make it possible to register events for analysers.

Also it appeared that one realign condition did not cover all cases.
2010-12-17 07:15:57 +01:00
Willy Tarreau
10479e4bac [MINOR] stats: add global event ID and count
This counter will help quickly spot whether there are new errors or not.
It is also assigned to each capture so that a script can keep trace of
which capture was taken when.
2010-12-12 14:00:34 +01:00
Willy Tarreau
078272e115 [MINOR] stats: report HTTP message state and buffer flags in error dumps
Debugging parsing errors can be greatly improved if we know what the parser
state was and what the buffer flags were (especially for closed inputs/outputs
and full buffers). Let's add that to the error snapshots.
2010-12-12 12:46:33 +01:00
Willy Tarreau
798a39cdc9 [MEDIUM] hash: add support for an 'avalanche' hash-type
When the number of servers is a multiple of the size of the input set,
map-based hash can be inefficient. This typically happens with 64
servers when doing URI hashing. The "avalanche" hash-type applies an
avalanche hash before performing a map lookup in order to smooth the
distribution. The result is slightly less smooth than the map for small
numbers of servers, but still better than the consistent hashing.
2010-11-29 07:28:16 +01:00
Willy Tarreau
4c14eaa0d4 [CLEANUP] hash: move the avalanche hash code globally available
We'll use this hash at other places, let's make it globally available.
The function has also been renamed because its "chash_hash" name was
not appropriate.
2010-11-29 07:28:16 +01:00
Willy Tarreau
b695a6e5fa [BUILD] pattern: use 'int' instead of 'int32_t'
Ross West reported that int32_t breaks compilation on FreeBSD. Since an
int is 32-bit on all supported platforms and we already rely on that,
change the type.
2010-11-14 14:24:27 +01:00
Emeric Brun
32da3c40db [MEDIUM] Manage peers section parsing and stick table registration on peers. 2010-11-11 09:29:08 +01:00
Emeric Brun
2b920a1af1 [MAJOR] Add new files src/peer.c, include/proto/peers.h and include/types/peers.h for sync stick table management
Add cmdline option -L to configure local peer name
2010-11-11 09:29:08 +01:00
Emeric Brun
85e77c7f0d [MEDIUM] Create updates tree on stick table to manage sync. 2010-11-11 09:29:08 +01:00
Emeric Brun
485479d8e9 [MEDIUM] Create new protected pattern types CONSTSTRING and CONSTDATA to force memcpy if data from protected areas need to be manipulated.
Enhance pattern convs and fetch argument parsing, now fetchs and convs callbacks used typed args.
Add more details on error messages on parsing pattern expression function.
Update existing pattern convs and fetchs to new proto.
Create stick table key type "binary".
Manage Truncation and padding if pattern's fetch-converted result don't match table key size.
2010-11-11 09:29:07 +01:00
Emeric Brun
97679e7901 [MEDIUM] Implement tcp inspect response rules 2010-11-11 09:28:18 +01:00
Emeric Brun
c89a57284a [BUG] stick table entries expire on counters updates/read or show table, even if there is no "expire" parameter 2010-11-11 09:28:18 +01:00
Willy Tarreau
17f449b214 [MINOR] move MAXPATHLEN definition to compat.h
MAXPATHLEN may be used at other places, it's unconvenient to have it
redefined in a few files. Also, since checking it requires including
sys/param.h, some versions of it cause a macro declaration conflict
with MIN/MAX which are defined in tools.h. The solution consists in
including sys/param.h in both files so that we ensure it's loaded
before the macros are defined and MAXPATHLEN is checked.
2010-11-11 09:21:53 +01:00
Emeric Brun
ed76092e10 [MEDIUM] Add supports of bind on unix sockets. 2010-11-09 15:59:42 +01:00
Emeric Brun
cf20bf1c1c [MEDIUM] Enhance message errors management on binds 2010-11-05 10:34:07 +01:00
Willy Tarreau
8b0cbf9969 [MINOR] frontend: add a new analyser to parse a proxied connection
The introduction of a new PROXY protocol for proxied connections requires
an early analyser to decode the incoming connection and set the session
flags accordingly.

Some more work is needed, among which setting a flag on the session to
indicate it's proxied, and copying the original parameters for later
comparisons with new ACLs (eg: real_src, ...).
2010-10-30 19:04:38 +02:00
Willy Tarreau
74172757c7 [MINOR] standard: change arg type from const char* to char*
inetaddr_host_lim_ret() used to make use of const char** for some
args, but that make it impossible ot use char** due to the way
controls are made by gcc. So let's change that.
2010-10-30 19:04:37 +02:00
Willy Tarreau
4ec83cd939 [MINOR] standard: add read_uint() to parse a delimited unsigned integer
This function parses an integer and returns it along with the pointer to the
next char not part of the number.
2010-10-30 19:04:37 +02:00
Willy Tarreau
8a95691ae8 [MINOR] listener: add the "accept-proxy" option to the "bind" keyword
This option will enable the AN_REQ_DECODE_PROXY analyser on the requests
that come from those listeners.
2010-10-30 19:04:37 +02:00
Willy Tarreau
6e595772ad [MINOR] buffers: add a new request analyser flag for PROXY mode
Since it must be the first analyser, the other flags have been renumbered.
2010-10-30 19:04:37 +02:00
Willy Tarreau
ba4c5be880 [MINOR] cookie: add support for the "preserve" option
This option makes haproxy preserve any persistence cookie emitted by
the server, which allows the server to change it or to unset it, for
instance, after a logout request.
(cherry picked from commit 52e6d75374c7900c1fe691c5633b4ae029cae8d5)
2010-10-30 19:04:36 +02:00
Cyril Bont
474be415af [MEDIUM] stats: add an admin level
The stats web interface must be read-only by default to prevent security
holes. As it is now allowed to enable/disable servers, a new keyword
"stats admin" is introduced to activate this admin level, conditioned by ACLs.
(cherry picked from commit 5334bab92ca7debe36df69983c19c21b6dc63f78)
2010-10-30 19:04:34 +02:00
Cyril Bont
70be45dbdf [MEDIUM] enable/disable servers from the stats web interface
Based on a patch provided by Judd Montgomery, it is now possible to
enable/disable servers from the stats web interface. This allows to select
several servers in a backend and apply the action to them at the same time.

Currently, there are 2 known limitations :
- The POST data are limited to one packet
  (don't alter too many servers at a time).
- Expect: 100-continue is not supported.
(cherry picked from commit 7693948766cb5647ac03b48e782cfee2b1f14491)
2010-10-30 19:04:34 +02:00
Willy Tarreau
f64d1410fc [MEDIUM] cookie: check for maxidle and maxlife for incoming dated cookies
If a cookie comes in with a first or last date, and they are configured on
the backend, they're checked. If a date is expired or too far in the future,
then the cookie is ignored and the specific reason appears in the cookie
field of the logs.
(cherry picked from commit faa3019107eabe6b3ab76ffec9754f2f31aa24c6)
2010-10-30 19:04:33 +02:00
Willy Tarreau
c01062bead [MINOR] add encode/decode function for 30-bit integers from/to base64
These functions only require 5 chars to encode 30 bits, and don't expect
any padding. They will be used to encode dates in cookies.
(cherry picked from commit a7e2b5fc4612994c7b13bcb103a4a2c3ecd6438a)
2010-10-30 19:04:33 +02:00
Willy Tarreau
f1348310e8 [MEDIUM] cookie: reassign set-cookie status flags to store more states
The set-cookie status flags were not very handy and limited. Reorder
them to save some room for additional values and add the "U" flags
(for Updated expiration date) that will be used with expirable cookies
in insert mode.
(cherry picked from commit 5bab52f821bb0fa99fc48ad1b400769e66196ece)
2010-10-30 19:04:33 +02:00
Willy Tarreau
b761ec4c94 [MINOR] cookie: add the expired (E) and old (O) flags for request cookies
These flags will indicate the cookie status when an expiration date is
set.
(cherry picked from commit 3f0f0e4583a432d34b75bc7b9dd2c756b4e181a7)
2010-10-30 19:04:33 +02:00
Willy Tarreau
92954fdf2e [MINOR] http: make some room in the transaction flags to extend cookies
We'll need one more bit to store and report the request cookie's status.
Doing this required moving a few bits around. However, now in 1.4 all bits
are used, there's no room left.

Cookie flags will need
(cherry picked from commit 09ebca0413c43620ddc375b5b4ab31a25d47b3f4)
2010-10-30 19:04:32 +02:00
Willy Tarreau
bca9969daf [MEDIUM] cookie: support client cookies with some contents appended to their value
In all cookie persistence modes but prefix, we now support cookies whose
value is suffixed with some contents after a vertical bar ('|'). This will
be used to pass an optional expiration date. So as of now we only consider
the part of the cookie value which is used before the vertical bar.
(cherry picked from commit a4486bf4e5b03b5a980d03fef799f6407b2c992d)
2010-10-30 19:04:32 +02:00
Willy Tarreau
3193685865 [MINOR] cookie: add options "maxidle" and "maxlife"
Add two new arguments to the "cookie" keyword, to be able to
fix a max idle and max life on them. Right now only the parameter
parsing is implemented.
(cherry picked from commit 9ad5dec4c3bb8f29129f292cb22d3fc495fcc98a)
2010-10-30 19:04:32 +02:00
Willy Tarreau
43961d523f [MINOR] global: add "tune.chksize" to change the default check buffer size
HTTP content-based health checks will be involved in searching text in pages.
Some pages may not fit in the default buffer (16kB) and sometimes it might be
desired to have larger buffers in order to find patterns. Running checks on
smaller URIs is always preferred of course.
(cherry picked from commit 043f44aeb835f3d0b57626c4276581a73600b6b1)
2010-10-30 19:04:32 +02:00
Willy Tarreau
bd741540d2 [MEDIUM] checks: add support for HTTP contents lookup
This patch adds the "http-check expect [r]{string,status}" statements
which enable health checks based on whether the response status or body
to an HTTP request contains a string or matches a regex.

This probably is one of the oldest patches that remained unmerged. Over
the time, several people have contributed to it, among which FinalBSD
(first and second implementations), Nick Chalk (port to 1.4), Anze
Skerlavaj (tests and fixes), Cyril Bont (general fixes), and of course
myself for the final fixes and doc during integration.

Some people already use an old version of this patch which has several
issues, among which the inability to search for a plain string that is
not at the beginning of the data, and the inability to look for response
contents that are provided in a second and subsequent recv() calls. But
since some configs are already deployed, it was quite important to ensure
a 100% compatible behaviour on the working cases.

Thus, that patch fixes the issues while maintaining config compatibility
with already deployed versions.

(cherry picked from commit b507c43a3ce9a8e8e4b770e52e4edc20cba4c37f)
2010-10-30 19:04:31 +02:00
Gabor Lekeny
b4c81e4c81 [MINOR] checks: add support for LDAPv3 health checks
This patch provides a new "option ldap-check" statement to enable
server health checks based on LDAPv3 bind requests.
(cherry picked from commit b76b44c6fed8a7ba6f0f565dd72a9cb77aaeca7c)
2010-10-30 19:04:31 +02:00
Willy Tarreau
74b08c9ab7 [MEDIUM] buffers: rework the functions to exchange between SI and buffers
There was no consistency between all the functions used to exchange data
between a buffer and a stream interface. Also, the functions used to send
data to a buffer did not consider the possibility that the buffer was
shutdown for read.

Now the functions are called buffer_{put,get}_{char,block,chunk,string}.

The old buffer_feed* functions have been left available for existing code
but marked deprecated.
2010-09-08 17:04:31 +02:00
Willy Tarreau
af7ad00a99 [MINOR] support a global jobs counter
This counter is incremented for each incoming connection and each active
listener, and is used to prevent haproxy from stopping upon SIGUSR1. It
will thus be possible for some tasks in increment this counter in order
to prevent haproxy from dying until they have completed their job.
2010-08-31 15:39:26 +02:00
Willy Tarreau
d0807c3c60 [MEDIUM] signals: support redistribution of signal zero when stopping
Signal zero is never delivered by the system. However having a signal to
which functions and tasks can subscribe to be notified of a stopping event
is useful. So this patch does two things :
  1) allow signal zero to be delivered from any function of signal handler
  2) make soft_stop() deliver this signal so that tasks can be notified of
     a stopping condition.
2010-08-27 18:26:11 +02:00
Willy Tarreau
24f4efa670 [MEDIUM] signals: add support for registering functions and tasks
The two new functions below make it possible to register any number
of functions or tasks to a system signal. They will be called in the
registration order when the signal is received.

    struct sig_handler *signal_register_fct(int sig, void (*fct)(struct sig_handler *), int arg);
    struct sig_handler *signal_register_task(int sig, struct task *task, int reason);
2010-08-27 18:00:40 +02:00
Willy Tarreau
08c4b79f9a [CLEANUP] reference product branch 1.5
It still appeared as 1.4 on the stats links.
2010-08-27 11:09:17 +02:00
Willy Tarreau
bb545b4cfc [MINOR] startup: don't wait for nothing when no old pid remains
In case of binding failure during startup, we wait for some time sending
signals to old pids so that they release the ports we need. But if there
aren't any old pids anymore, it's useless to wait, we prefer to fail fast.
Along with this change, we now have the number of old pids really found
in the nb_oldpids variable.
2010-08-25 12:58:59 +02:00
Willy Tarreau
0a4838cd31 [MEDIUM] session-counters: correctly unbind the counters tracked by the backend
In case of HTTP keepalive processing, we want to release the counters tracked
by the backend. Till now only the second set of counters was released, while
it could have been assigned by the frontend, or the backend could also have
assigned the first set. Now we reuse to unused bits of the session flags to
mark which stick counters were assigned by the backend and to release them as
appropriate.
2010-08-10 18:04:16 +02:00
Willy Tarreau
56123282ef [MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters"
The assumption that there was a 1:1 relation between tracked counters and
the frontend/backend role was wrong. It is perfectly possible to track the
track-fe-counters from the backend and the track-be-counters from the
frontend. Thus, in order to reduce confusion, let's remove this useless
{fe,be} reference and simply use {1,2} instead. The keywords have also been
renamed in order to limit confusion. The ACL rule action now becomes
"track-sc{1,2}". The ACLs are now "sc{1,2}_*" instead of "trk{fe,be}_*".

That means that we can reasonably document "sc1" and "sc2" (sticky counters
1 and 2) as sort of patterns that are available during the whole session's
life and use them just like any other pattern.
2010-08-10 18:04:15 +02:00
Willy Tarreau
f6efda1189 [MEDIUM] session counters: automatically remove expired entries.
When a ref_cnt goes down to zero and the entry is expired, remove it.
2010-08-10 18:04:15 +02:00
Willy Tarreau
f059a0f63a [MAJOR] session-counters: split FE and BE track counters
Having a single tracking pointer for both frontend and backend counters
does not work. Instead let's have one for each. The keyword has changed
to "track-be-counters" and "track-fe-counters", and the ACL "trk_*"
changed to "trkfe_*" and "trkbe_*".
2010-08-10 18:04:15 +02:00
Willy Tarreau
4f3f01fa39 [MEDIUM] stats: add the ability to dump table entries matching criteria
It is now possible to dump some select table entries based on criteria
which apply to the stored data. This is enabled by appending the following
options to the end of the "show table" statement :

  data.<data_type> {eq|ne|lt|gt|le|ge} <value>

For intance :

  show table http_proxy data.conn_rate gt 5
  show table http_proxy data.gpc0 ne 0

The compare applies to the integer value as it would be displayed, and
operates on signed long long integers.
2010-08-10 18:04:14 +02:00
Willy Tarreau
3b9c6e053e [MEDIUM] stick-table: make use of generic types for stored data
It's a bit cumbersome to have to know all possible storable types
from the stats interface. Instead, let's have generic types for
all data, which will facilitate their manipulation.
2010-08-10 18:04:14 +02:00
Willy Tarreau
795e4a93dd [CLEANUP] stick-table: declare stktable_data_types as extern
It's fortunate we did not get hit by this stupid declaration !
2010-08-10 18:04:14 +02:00
Willy Tarreau
69f58c8058 [MEDIUM] stats: add "show table [<name>]" to dump a stick-table
It is now possible to dump a table's contents with keys, expire,
use count, and various data using the command above on the stats
socket.

"show table" only shows main table stats, while "show table <name>"
dumps table contents, only if the socket level is admin.
2010-08-10 18:04:14 +02:00
Willy Tarreau
da7ff64aa9 [MEDIUM] session-counters: add HTTP req/err tracking
This patch adds support for the following session counters :
  - http_req_cnt : HTTP request count
  - http_req_rate: HTTP request rate
  - http_err_cnt : HTTP request error count
  - http_err_rate: HTTP request error rate

The equivalent ACLs have been added to check the tracked counters
for the current session or the counters of the current source.
2010-08-10 18:04:14 +02:00
Willy Tarreau
c3bd972cda [MINOR] session-counters: add a general purpose counter (gpc0)
This counter may be used to track anything. Two sets of ACLs are available
to manage it, one gets its value, and the other one increments its value
and returns it. In the second case, the entry is created if it did not
exist.

Thus it is possible for example to mark a source as being an abuser and
to keep it marked as long as it does not wait for the entry to expire :

	# The rules below use gpc0 to track abusers, and reject them if
	# a source has been marked as such. The track-counters statement
	# automatically refreshes the entry which will not expire until a
	# 1-minute silence is respected from the source. The second rule
	# evaluates the second part if the first one is true, so GPC0 will
	# be increased once the conn_rate is above 100/5s.
	stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
	tcp-request track-counters src
	tcp-request reject if { trk_get_gpc0 gt 0 }
	tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}

Alternatively, it is possible to let the entry expire even in presence of
traffic by swapping the check for gpc0 and the track-counters statement :

	stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
	tcp-request reject if { src_get_gpc0 gt 0 }
	tcp-request track-counters src
	tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}

It is also possible not to track counters at all, but entry lookups will
then be performed more often :

	stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
	tcp-request reject if { src_get_gpc0 gt 0 }
	tcp-request reject if { src_conn_rate gt 100 } { src_inc_gpc0 gt 0}

The '0' at the end of the counter name is there because if we find that more
counters may be useful, other ones will be added.
2010-08-10 18:04:14 +02:00
Willy Tarreau
1f7e925d6a [MINOR] stktable: add a stktable_update_key() function
This function looks up a key, updates its expiration date, or creates
it if it was not found. acl_fetch_src_updt_conn_cnt() was updated to
make use of it.
2010-08-10 18:04:14 +02:00
Willy Tarreau
6c59e0a942 [MEDIUM] session counters: add bytes_in_rate and bytes_out_rate counters
These counters maintain incoming and outgoing byte rates in a stick-table,
over a period which is defined in the configuration (2 ms to 24 days).
They can be used to detect service abuse and enforce a certain bandwidth
limits per source address for instance, and block if the rate is passed
over. Since 32-bit counters are used to compute the rates, it is important
not to use too long periods so that we don't have to deal with rates above
4 GB per period.

Example :
    # block if more than 5 Megs retrieved in 30 seconds from a source.
    stick-table type ip size 200k expire 1m store bytes_out_rate(30s)
    tcp-request track-counters src
    tcp-request reject if { trk_bytes_out_rate gt 5000000 }

    # cause a 15 seconds pause to requests from sources in excess of 2 megs/30s
    tcp-request inspect-delay 15s
    tcp-request content accept if { trk_bytes_out_rate gt 2000000 } WAIT_END
2010-08-10 18:04:13 +02:00
Willy Tarreau
91c43d7fe4 [MEDIUM] session counters: add conn_rate and sess_rate counters
These counters maintain incoming connection rates and session rates
in a stick-table, over a period which is defined in the configuration
(2 ms to 24 days). They can be used to detect service abuse and
enforce a certain accept rate per source address for instance, and
block if the rate is passed over.

Example :
	# block if more than 50 requests per 5 seconds from a source.
	stick-table type ip size 200k expire 1m store conn_rate(5s),sess_rate(5s)
	tcp-request track-counters src
	tcp-request reject if { trk_conn_rate gt 50 }

	# cause a 3 seconds pause to requests from sources in excess of 20 requests/5s
	tcp-request inspect-delay 3s
	tcp-request content accept if { trk_sess_rate gt 20 } WAIT_END
2010-08-10 18:04:13 +02:00
Willy Tarreau
ac78288eaf [MEDIUM] stick-tables: add stored data argument type checking
We're now able to return errors based on the validity of an argument
passed to a stick-table store data type. We also support ARG_T_DELAY
to pass delays to stored data types (eg: for rate counters).
2010-08-10 18:04:13 +02:00
Willy Tarreau
888617dc3b [MEDIUM] stick-tables: add support for arguments to data_types
Some data types will require arguments (eg: period for a rate counter).
This patch adds support for such arguments between parenthesis in the
"store" directive of the stick-table statement. Right now only integers
are supported.
2010-08-10 18:04:13 +02:00
Willy Tarreau
f4d17d9071 [MEDIUM] session: add a counter on the cumulated number of sessions
Sessions are like connections but they have been accepted by L4 rules
and really became sessions.
2010-08-10 18:04:13 +02:00
Willy Tarreau
e348793696 [MEDIUM] session-counters: automatically update tracked connection count
When a session tracks a counter, automatically increase the cumulated
connection count. This makes src_updt_conn_cnt() almost useless. In
fact it might still be used to update different tables.
2010-08-10 18:04:12 +02:00
Willy Tarreau
855e4bbcc7 [MEDIUM] session: add data in and out volume counters
The new "bytes_in_cnt" and "bytes_out_cnt" session counters have been
added. They're automatically updated when session counters are updated.
They can be matched with the "src_kbytes_in" and "src_kbytes_out" ACLs
which apply to the volume per source address. This can be used to deny
access to service abusers.
2010-08-10 18:04:12 +02:00
Willy Tarreau
38285c18f4 [MEDIUM] session: add concurrent connections counter
The new "conn_cur" session counter has been added. It is automatically
updated upon "track XXX" directives, and the entry is touched at the
moment we increment the value so that we don't consider further counter
updates as real updates, otherwise we would end up updating upon completion,
which may not be desired. Probably that some other event counters (eg: HTTP
requests) will have to be updated upon each event though.

This counter can be matched against current session's source address using
the "src_conn_cur" ACL.
2010-08-10 18:04:12 +02:00
Willy Tarreau
8fb12c4b61 [MINOR] stick-table: use suffix "_cnt" for cumulated counts
The "_cnt" suffix is already used by ACLs to count various data,
so it makes sense to use the same one in "conn_cnt" instead of
"conn_cum" to count cumulated connections.

This is not a problem because no version was emitted with those
keywords.

Thus we'll try to stick to the following rules :

  xxxx_cnt : cumulated event count for criterion xxxx
  xxxx_cur : current number of concurrent entries for criterion xxxx
  xxxx_rate: event rate for criterion xxxx
2010-08-10 18:04:12 +02:00
Willy Tarreau
4a0347add0 [MINOR] stick-table: provide a table lookup function
We'll often need to lookup a table by its name. This will change
in the future once we can resolve these names on startup.
2010-08-10 18:04:12 +02:00
Willy Tarreau
9ba2dcc86c [MAJOR] session: add track-counters to track counters related to the session
This patch adds the ability to set a pointer in the session to an
entry in a stick table which holds various counters related to a
specific pattern.

Right now the syntax matches the target syntax and only the "src"
pattern can be specified, to track counters related to the session's
IPv4 source address. There is a special function to extract it and
convert it to a key. But the goal is to be able to later support as
many patterns as for the stick rules, and get rid of the specific
function.

The "track-counters" directive may only be set in a "tcp-request"
statement right now. Only the first one applies. Probably that later
we'll support multi-criteria tracking for a single session and that
we'll have to name tracking pointers.

No counter is updated right now, only the refcount is. Some subsequent
patches will have to bring that feature.
2010-08-10 18:04:12 +02:00
Willy Tarreau
591fedc2c3 [MEDIUM] buffer: make buffer_feed* support writing non-contiguous chunks
The buffer_feed* functions that are used to send data to buffers did only
support sending contiguous chunks while they're relying on memcpy(). This
patch improves on this by making them able to write in two chunks if needed.
Thus, the buffer_almost_full() function has been improved to really consider
the remaining space and not just what can be written at once.
2010-08-10 17:48:57 +02:00
Willy Tarreau
fb35620e87 [MEDIUM] session: support "tcp-request content" rules in backends
Sometimes it's necessary to be able to perform some "layer 6" analysis
in the backend. TCP request rules were not available till now, although
documented in the diagram. Enable them in backend now.
2010-08-10 14:10:58 +02:00
Willy Tarreau
5b18020201 [MINOR] tools: add a get_std_op() function to parse operators
We already have several places where we use operators to compare
values. Each time the parsing is done again. Let's have a central
function for this.
2010-08-10 14:03:25 +02:00
Willy Tarreau
f25fbad356 [MINOR] errors: provide new status codes for config parsing functions
Some config parsing functions need to return composite status codes
when they rely on other functions. Let's provide a few such codes
for general use and extend them later.
2010-08-10 14:01:15 +02:00
Willy Tarreau
2970b0bedf [MINOR] freq_ctr: add new types and functions for periods different from 1s
Some freq counters will have to work on periods different from 1 second.
The original freq counters rely on the period to be exactly one second.
The new ones (freq_ctr_period) let the user define the period in ticks,
and all computations are operated over that period. When reading a value,
it indicates the amount of events over that period too.
2010-08-10 14:01:09 +02:00
Willy Tarreau
f0d9eecc52 [MINOR] tools: add a fast div64_32 function
We'll need to divide 64 bits by 32 bits with new frequency counters.
Gcc does not know when it can safely do that, but the way we build
our operations let us be sure. So let's provide an optimised version
for that purpose.
2010-08-10 13:59:56 +02:00
Willy Tarreau
258a14b7d7 [MINOR] proxy: add a "parent" member to the structure
This member will be used later when frontends are created on the
fly by some tasks. It will also be usable later if we need to
support multiple config instances for example.
2010-07-13 16:24:48 +02:00
Willy Tarreau
0bd05eaf24 [MEDIUM] stream-interface: add a ->release callback
When a connection is closed on a stream interface, some iohandlers
will need to be informed in order to release some resources. This
normally happens upon a shutr+shutw. It is the equivalent of the
fd_delete() call which is done for real sockets, except that this
time we release internal resources.

It can also be used with real sockets because it does not cost
anything else and might one day be useful.
2010-07-13 16:06:23 +02:00
Willy Tarreau
acf9577350 [MINOR] config: provide a function to quote args in a more friendly way
The quote_arg() function can be used to quote an argument or indicate
"end of line" if it's null or empty. It should be useful to more precisely
report location of problems in the configuration.
2010-06-14 19:09:21 +02:00
Willy Tarreau
5214be1b22 [MINOR] session: add a pointer to the tracked counters for the source
We'll have to keep counters of various criteria specific to the session's
source. When we get one, keep a pointer to it in the session.
2010-06-14 15:32:18 +02:00
Willy Tarreau
e7f3d7ab9f [MEDIUM] stick-tables: add a reference counter to each entry
We'll soon have to maintain links from sessions to entries, so let's
add a refcount in entries to avoid purging them if it's not null.
2010-06-14 15:10:26 +02:00
Willy Tarreau
cb18364ca7 [MEDIUM] stick_table: separate storage and update of session entries
When an entry already exists, we just need to update its expiration
timer. Let's have a dedicated function for that instead of spreading
open code everywhere.

This change also ensures that an update of an existing sticky session
really leads to an update of its expiration timer, which was apparently
not the case till now. This point needs to be checked in 1.4.
2010-06-14 15:10:26 +02:00
Willy Tarreau
41883e2041 [MINOR] stick_table: export the stick_table_key
This one is huge and will be needed by other portions of code for various
data lookups. Let's not have them allocate it in the stack.
2010-06-14 15:10:25 +02:00
Willy Tarreau
13c29dee21 [MEDIUM] stick_table: move the server ID to a generic data type
The server ID is now stored just as any other data type. It is only
allocated if needed and is manipulated just like the other ones.
2010-06-14 15:10:25 +02:00
Willy Tarreau
68129b90eb [MINOR] stick_table: provide functions to return stksess data from a type
This function does the indirection job in the table to find the pointer
to the real data matching the requested type.
2010-06-14 15:10:25 +02:00
Willy Tarreau
f16d2b8c1b [MEDIUM] stick_table: don't overwrite data when storing an entry
Till now sticky sessions only held server IDs. Now there are other
data types so it is not acceptable anymore to overwrite the server ID
when writing something. The server ID must then only be written from
the caller when appropriate. Doing this has also led to separate
lookup and storage.
2010-06-14 15:10:24 +02:00
Willy Tarreau
69b870f862 [MINOR] stick_table: add support for "conn_cum" data type.
This one can be parsed on the "stick-table" after with the "store"
keyword. It will hold the number of connections matching the entry,
for use with ACLs or anything else.
2010-06-14 15:10:24 +02:00
Willy Tarreau
08d5f98294 [MEDIUM] stick_table: add room for extra data types
The stick_tables will now be able to store extra data for a same key.
A limited set of extra data types will be defined and for each of them
an offset in the sticky session will be assigned at startup time. All
of this information will be stored in the stick table.

The extra data types will have to be specified after the new "store"
keyword of the "stick-table" directive, which will reserve some space
for them.
2010-06-14 15:10:24 +02:00
Willy Tarreau
f0b38bfc33 [CLEANUP] stick_table: move pattern to key functions to stick_table.c
pattern.c depended on stick_table while in fact it should be the opposite.
So we move from pattern.c everything related to stick_tables and invert the
dependency. That way the code becomes more logical and intuitive.
2010-06-14 15:10:24 +02:00
Willy Tarreau
86257dc116 [CLEANUP] stick_table: rename some stksess struct members to avoid confusion
The name 'exps' and 'keys' in struct stksess was confusing because it was
the same name as in the table which holds all of them, while they only hold
one node each. Remove the trailing 's' to more clearly identify who's who.
2010-06-14 15:10:23 +02:00
Willy Tarreau
393379c3e0 [MINOR] stick_table: add support for variable-sized data
Right now we're only able to store a server ID in a sticky session.
The goal is to be able to store anything whose size is known at startup
time. For this, we store the extra data before the stksess pointer,
using a negative offset. It will then be easy to cumulate multiple
data provided they each have their own offset.
2010-06-14 15:10:23 +02:00
Willy Tarreau
f8f33284bd [BUILD] memory: add a few missing parenthesis to the pool management macros
These missing ones caused a build error when a macro was called with
operations as the argument.
2010-06-14 15:10:23 +02:00
Willy Tarreau
aea940eb23 [CLEANUP] stick_table: add/clarify some comments 2010-06-14 15:10:23 +02:00
Willy Tarreau
2799e98a36 [MINOR] frontend: count denied TCP requests separately
It's very disturbing to see the "denied req" counter increase without
any other session counter moving. In fact, we can't count a rejected
TCP connection as "denied req" as we have not yet instanciated any
session at all. Let's use a new counter for that.
2010-06-14 10:53:20 +02:00
Willy Tarreau
b36b4244a2 [MINOR] session: differenciate between accepted connections and received connections
Now we're able to reject connections very early, so we need to use a
different counter for the connections that are received and the ones
that are accepted and converted into sessions, so that the rate limits
can still apply to the accepted ones. The session rate must still be
used to compute the rate limit, so that we can reject undesired traffic
without affecting the rate.
2010-06-14 10:53:19 +02:00
Willy Tarreau
1d315eaa24 [MINOR] buffer: refine the flags that may wake an analyser up.
Analysers don't care (and must not care) about a few flags such as
BF_AUTO_CLOSE or BF_AUTO_CONNECT, so those flags should not be listed
in the BF_MASK_STATIC bitmask.

We should also recheck if some buffer flags should be ignored or not
in process_session() when deciding if we must loop again or not.
2010-06-14 10:53:18 +02:00
Willy Tarreau
decd14d298 [MEDIUM] stats: rely on the standard session_accept() function
The stats' accept() function is now ridiculously small. It could
even be reduced by moving some parts to the common accept code.
2010-06-14 10:53:18 +02:00
Willy Tarreau
81f9aa3bf2 [MAJOR] frontend: split accept() into frontend_accept() and session_accept()
A new function session_accept() is now called from the lower layer to
instanciate a new session. Once the session is instanciated, the upper
layer's frontent_accept() is called. This one can be service-dependant.

That way, we have a 3-phase accept() sequence :
  1) protocol-specific, session-less accept(), which is pointed to by
     the listener. It defaults to the generic stream_sock_accept().
  2) session_accept() which relies on a frontend but not necessarily
     for use in a proxy (eg: stats or any future service).
  3) frontend_accept() which performs the accept for the service
     offerred by the frontend. It defaults to frontend_accept() which
     is really what is used by a proxy.

The TCP/HTTP proxies have been moved to this mode so that we can now rely on
frontend_accept() for any type of session initialization relying on a frontend.

The next step will be to convert the stats to use the same system for the stats.
2010-06-14 10:53:17 +02:00
Willy Tarreau
f229eb8f8f [MINOR] proxy: add an accept() callback for the application layer
This will be used by the application layer for setting accept callbacks.
2010-06-14 10:53:17 +02:00
Willy Tarreau
ee28de0a12 [MEDIUM] session: move the conn_retries attribute to the stream interface
The conn_retries still lies in the session and its initialization depends
on the backend when it may not yet be known. Let's first move it to the
stream interface.
2010-06-14 10:53:16 +02:00
Willy Tarreau
a8f55d5473 [MEDIUM] backend: initialize the server stream_interface upon connect()
It's not normal to initialize the server-side stream interface from the
accept() function, because it may change later. Thus, we introduce a new
stream_sock_prepare_interface() function which is called just before the
connect() and which sets all of the stream_interface's callbacks to the
default ones used for real sockets. The ->connect function is also set
at the same instant so that we can easily add new server-side protocols
soon.
2010-06-14 10:53:15 +02:00
Willy Tarreau
ace495e468 [CLEANUP] buffer->cto is not used anymore
The connection timeout stored in the buffer has not been used since the
stream interface were introduced. Let's get rid of it as it's one of the
things that complicate factoring of the accept() functions.
2010-06-14 10:53:14 +02:00
Willy Tarreau
de3041d443 [MINOR] frontend: only check for monitor-net rules if LI_O_CHK_MONNET is set
We can disable the monitor-net rules on a listener if this flag is not
set in the listener's options. This will be useful when we don't want
to check that fe->addr is set or not for non-TCP frontends.
2010-06-14 10:53:13 +02:00
Willy Tarreau
a5c0ab200b [MEDIUM] frontend: check for LI_O_TCP_RULES in the listener
The new LI_O_TCP_RULES listener option indicates that some TCP rules
must be checked upon accept on this listener. It is now checked by
the frontend and the L4 rules are evaluated only in this case. The
flag is only set when at least one tcp-req rule is present in the
frontend.

The L4 rules check function has now been moved to proto_tcp.c where
it ought to be.
2010-06-14 10:53:13 +02:00
Willy Tarreau
ab786194f0 [MINOR] proxy: add a list to hold future layer 4 rules
This list will be evaluated right after the accept() call.
2010-06-14 10:53:11 +02:00
Willy Tarreau
eb472685cb [MEDIUM] separate protocol-level accept() from the frontend's
For a long time we had two large accept() functions, one for TCP
sockets instanciating proxies, and another one for UNIX sockets
instanciating the stats interface.

A lot of code was duplicated and both did not work exactly the same way.

Now we have a stream_sock layer accept() called for either TCP or UNIX
sockets, and this function calls the frontend-specific accept() function
which does the rest of the frontend-specific initialisation.

Some code is still duplicated (session & task allocation, stream interface
initialization), and might benefit from having an intermediate session-level
accept() callback to perform such initializations. Still there are some
minor differences that need to be addressed first. For instance, the monitor
nets should only be checked for proxies and not for other connection templates.

Last, we renamed l->private as l->frontend. The "private" pointer in
the listener is only used to store a frontend, so let's rename it to
eliminate this ambiguity. When we later support detached listeners
(eg: FTP), we'll add another field to avoid the confusion.
2010-06-14 10:53:11 +02:00
Willy Tarreau
03fa5df64a [CLEANUP] rename client -> frontend
The 'client.c' file now only contained frontend-specific functions,
so it has naturally be renamed 'frontend.c'. Same for client.h. This
has also been an opportunity to remove some cross references from
files that should not have depended on it.

In the end, this file should contain a protocol-agnostic accept()
code, which would initialize a session, task, etc... based on an
accept() from a lower layer. Right now there are still references
to TCP.
2010-06-14 10:53:10 +02:00
Willy Tarreau
44b90cc4d8 [CLEANUP] tcp: move some non tcp-specific layer6 processing out of proto_tcp
Some functions which act on generic buffer contents without being
tcp-specific were historically in proto_tcp.c. This concerns ACLs
and RDP cookies. Those have been moved away to more appropriate
locations. Ideally we should create some new files for each layer6
protocol parser. Let's do that later.
2010-06-14 10:53:09 +02:00
Willy Tarreau
06457871a4 [CLEANUP] acl: use 'L6' instead of 'L4' in ACL flags relying on contents
Just like we do on health checks, we should consider that ACLs that make
use of buffer data are layer 6 and not layer 4, because we'll soon have
to distinguish between pure layer 4 ACLs (without any buffer) and these
ones.
2010-06-14 10:53:09 +02:00
Willy Tarreau
0b1cd94c8b [MINOR] acl: add srv_is_up() to check that a specific server is up or not
This ACL was missing in complex setups where the status of a remote site
has to be considered in switching decisions. Until there, using a server's
status in an ACL required to have a dedicated backend, which is a bit heavy
when multiple servers have to be monitored.
2010-05-16 22:18:27 +02:00
Willy Tarreau
e56cda9a6a [MEDIUM] acl: add ability to insert patterns in trees
The code is now ready to support loading pattern from filesinto trees. For
that, it will be required that the ACL keyword has a flag ACL_MAY_LOOKUP
and that the expr is case sensitive. When that is true, the pattern will
have a flag ACL_PAT_F_TREE_OK to indicate that it is possible to feed the
tree instead of a usual pattern if the parsing function is able to do this.
The tree's root is pre-initialized in the pattern's value so that the
function can easily find it. At that point, if the parsing function decides
to use the tree, it just sets ACL_PAT_F_TREE in the return flags so that
the caller knows the tree has been used and the pattern can be recycled.

That way it will be possible to load some patterns into the tree when it
is compatible, and other ones as linear linked lists. A good example of
this might be IPv4 network entries : right now we support holes in masks,
but this very rare feature is not compatible with binary lookup in trees.
So the parser will be able to decide itself whether the pattern can go to
the tree or not.
2010-05-13 21:37:41 +02:00
Willy Tarreau
438b0f3f0f [MINOR] acl trees: add flags and union members to store values in trees
If we want to be able to match ACLs against a lot of possible values, we
need to put those values in trees. That will only work for exact matches,
which is normally just what is needed.

Right now, only IPv4 and string matching are planned, but others might come
later.
2010-05-12 16:59:59 +02:00
Cyril Bont
47fdd8e993 [MINOR] add the "ignore-persist" option to conditionally ignore persistence
This is used to disable persistence depending on some conditions (for
example using an ACL matching static files or a specific User-Agent).
You can see it as a complement to "force-persist".

In the configuration file, the force-persist/ignore-persist declaration
order define the rules priority.

Used with the "appsesion" keyword, it can also help reducing memory usage,
as the session won't be hashed the persistence is ignored.
2010-04-25 22:37:14 +02:00
Willy Tarreau
8a8e1d99cb [MINOR] http: make it possible to pretend keep-alive when doing close
Some servers do not completely conform with RFC2616 requirements for
keep-alive when they receive a request with "Connection: close". More
specifically, they don't bother using chunked encoding, so the client
never knows whether the response is complete or not. One immediately
visible effect is that haproxy cannot maintain client connections alive.
The second issue is that truncated responses may be cached on clients
in case of network error or timeout.

scar Fras Barranco reported this issue on Tomcat 6.0.20, and
Patrik Nilsson with Jetty 6.1.21.

Cyril Bont proposed this smart idea of pretending we run keep-alive
with the server and closing it at the last moment as is already done
with option forceclose. The advantage is that we only change one
emitted header but not the overall behaviour.

Since some servers such as nginx are able to close the connection
very quickly and save network packets when they're aware of the
close negociation in advance, we don't enable this behaviour by
default.

"option http-pretend-keepalive" will have to be used for that, in
conjunction with "option http-server-close".
2010-04-05 16:26:34 +02:00
Willy Tarreau
bce7088275 [MEDIUM] add ability to connect to a server from an IP found in a header
Using get_ip_from_hdr2() we can look for occurrence #X or #-X and
extract the IP it contains. This is typically designed for use with
the X-Forwarded-For header.

Using "usesrc hdr_ip(name,occ)", it becomes possible to use the IP address
found in <name>, and possibly specify occurrence number <occ>, as the
source to connect to a server. This is possible both in a server and in
a backend's source statement. This is typically used to use the source
IP previously set by a upstream proxy.
2010-03-30 10:39:43 +02:00
Willy Tarreau
090466c91a [MINOR] add new tproxy flags for dynamic source address binding
This patch adds a new TPROXY bind type, TPROXY_DYN, to indicate to the
TCP connect function that we want to bind to the address passed in
argument.
2010-03-30 09:59:44 +02:00
Willy Tarreau
d54bbdce87 [MINOR] add very fast IP parsing functions
Those functions were previouly used in my firewall log parser,
and are particularly suited for use with http headers.
2010-03-30 09:59:44 +02:00
Willy Tarreau
b1d67749db [MEDIUM] backend: move the transparent proxy address selection to backend
The transparent proxy address selection was set in the TCP connect function
which is not the most appropriate place since this function has limited
access to the amount of parameters which could produce a source address.

Instead, now we determine the source address in backend.c:connect_server(),
right after calling assign_server_address() and we assign this address in
the session and pass it to the TCP connect function. This cannot be performed
in assign_server_address() itself because in some cases (transparent mode,
dispatch mode or http_proxy mode), we assign the address somewhere else.

This change will open the ability to bind to addresses extracted from many
other criteria (eg: from a header).
2010-03-30 09:59:43 +02:00
Willy Tarreau
07a5490881 [CLEANUP] proxy: move PR_O_SSL3_CHK to options2 to release one flag
We'll need another flag in the 'options' member close to PR_O_TPXY_*,
and all are used, so let's move this easy one to options2 (which are
already used for SQL checks).
2010-03-30 09:59:43 +02:00
Willy Tarreau
e45997661b [MEDIUM] session: better fix for connection to servers with closed input
The following patch fixed an issue but brought another one :
  296897 [MEDIUM] connect to servers even when the input has already been closed

The new issue is that when a connection is inspected and aborted using
TCP inspect rules, now it is sent to the server before being closed. So
that test is not satisfying. A probably better way is not to prevent a
connection from establishing if only BF_SHUTW_NOW is set but BF_SHUTW
is not. That way, the BF_SHUTW flag is not set if the request has any
data pending, which still fixes the stats issue, but does not let any
empty connection pass through.

Also, as a safety measure, we extend buffer_abort() to automatically
disable the BF_AUTO_CONNECT flag. While it appears to always be OK,
it is by pure luck, so better safe than sorry.
2010-03-21 23:31:42 +01:00
Nick Chalk
57b1bf7785 [MEDIUM] checks: support multi-packet health check responses
We are seeing both real servers repeatedly going on- and off-line with
a period of tens of seconds. Packet tracing, stracing, and adding
debug code to HAProxy itself has revealed that the real servers are
always responding correctly, but HAProxy is sometimes receiving only
part of the response.

It appears that the real servers are sending the test page as three
separate packets. HAProxy receives the contents of one, two, or three
packets, apparently randomly. Naturally, the health check only
succeeds when all three packets' data are seen by HAProxy. If HAProxy
and the real servers are modified to use a plain HTML page for the
health check, the response is in the form of a single packet and the
checks do not fail.

(...)
I've added buffer and length variables to struct server, and allocated
space with the rest of the server initialisation.

(...)
It seems to be working fine in my tests, and handles check responses
that are bigger than the buffer.
2010-03-16 22:57:26 +01:00
Cyril Bont
8f27090b84 [CLEANUP] product branch update
today I've noticed that the stats page still displays v1.3 in the
"Updates" link, due to the PRODUCT_BRANCH value in version.h, then
it's maybe time to send you the result (notice that the patch updates
PRODUCT_BRANCH to "1.4").

--
Cyril Bont
2010-03-12 06:45:26 +01:00
Willy Tarreau
66dc20a17b [MINOR] stats socket: add show sess <id> to dump details about a session
When trying to spot some complex bugs, it's often needed to access
information on stuck sessions, which is quite difficult. This new
command helps one get detailed information about a session, with
flags, timers, states, etc... The buffer data are not dumped yet.
2010-03-05 17:58:04 +01:00
Willy Tarreau
ae52678444 [STATS] count transfer aborts caused by client and by server
Often we need to understand why some transfers were aborted or what
constitutes server response errors. With those two counters, it is
now possible to detect an unexpected transfer abort during a data
phase (eg: too short HTTP response), and to know what part of the
server response errors may in fact be assigned to aborted transfers.
2010-03-04 20:34:23 +01:00
Willy Tarreau
0e996c681f [BUILD] includes order breaks OpenBSD build
Jeff Buchbinder reported that OpenBSD build broke on compat.h,
and that this patch fixes the issue.
2010-02-26 22:00:19 +01:00
Willy Tarreau
8096de9a99 [MEDIUM] http: revert to use a swap buffer for realignment
The bounce realign function was algorithmically good but as expected
it was not cache-friendly. Using it with large requests caused so many
cache thrashing that the function itself could drain 70% of the total
CPU time for only 0.5% of the calls !

Revert back to a standard memcpy() using a specially allocated swap
buffer. We're now back to 2M req/s on pipelined requests.
2010-02-26 11:12:27 +01:00
Willy Tarreau
2465779459 [STATS] separate frontend and backend HTTP stats
It is wrong to merge FE and BE stats for a proxy because when we consult a
BE's stats, it reflects the FE's stats eventhough the BE has received no
traffic. The most common example happens with listen instances, where the
backend gets credited for all the trafic even when a use_backend rule makes
use of another backend.
2010-02-26 10:30:28 +01:00
Willy Tarreau
d9b587f260 [STATS] report HTTP requests (total and rate) in frontends
Now that we support keep-alive, it's important to report a separate
counter for requests. Right now it just appears in the CSV output.
2010-02-26 10:05:55 +01:00
Willy Tarreau
b97f199d4b [MEDIUM] http: don't use trash to realign large buffers
The trash buffer may now be smaller than a buffer because we can tune
it at run time. This causes a risk when we're trying to use it as a
temporary buffer to realign unaligned requests, because we may have to
put up to a full buffer into it.

Instead of doing a double copy, we're now relying on an open-coded
bouncing copy algorithm. The principle is that we move one byte at
a time to its final place, and if that place also holds a byte, then
we move it too, and so on. We finish when we've moved all the buffer.
It limits the number of memory accesses, but since it proceeds one
byte at a time and with random walk, it's not cache friendly and
should be slower than a double copy. However, it's only used in
extreme situations and the difference will not be noticeable.

It has been extensively tested and works reliably.
2010-02-25 23:54:31 +01:00
Krzysztof Piotr Oledzki
329f74d463 [BUG] uri_auth: do not attemp to convert uri_auth -> http-request more than once
Bug reported by Laurent Dolosor.
2010-02-23 12:36:10 +01:00
Krzysztof Piotr Oledzki
15f0ac4829 [BUG] uri_auth: ST_SHLGNDS should be 0x00000008 not 0x0000008 2010-02-23 12:36:10 +01:00
Willy Tarreau
b4c06b7be6 [BUILD] auth: don't use unnamed unions
unnamed unions are not compatible with older compilers (eg: gcc 2.95) so
name it "u" instead.
2010-02-02 11:28:20 +01:00
Willy Tarreau
9cc670f7d9 [CLEANUP] config: use build_acl_cond() to simplify http-request ACL parsing
Now that we have this new function to make your life better, use it.
2010-02-01 10:43:44 +01:00
Cyril Bont
cd19e51b05 [MEDIUM] add a maintenance mode to servers
This is a first attempt to add a maintenance mode on servers, using
the stat socket (in admin level).

It can be done with the following command :
   - disable server <backend>/<server>
   - enable  server <backend>/<server>

In this mode, no more checks will be performed on the server and it
will be marked as a special DOWN state (MAINT).

If some servers were tracking it, they'll go DOWN until the server
leaves the maintenance mode. The stats page and the CSV export also
display this special state.

This can be used to disable the server in haproxy before doing some
operations on this server itself. This is a good complement to the
"http-check disable-on-404" keyword and works in TCP mode.
2010-01-31 23:33:18 +01:00
Krzysztof Piotr Oledzki
8c8bd4593c [MAJOR] use the new auth framework for http stats
Support the new syntax (http-request allow/deny/auth) in
http stats.

Now it is possible to use the same syntax is the same like in
the frontend/backend http-request access control:
 acl src_nagios src 192.168.66.66
 acl stats_auth_ok http_auth(L1)

 stats http-request allow if src_nagios
 stats http-request allow if stats_auth_ok
 stats http-request auth realm LB

The old syntax is still supported, but now it is emulated
via private acls and an aditional userlist.
2010-01-31 19:14:09 +01:00
Krzysztof Piotr Oledzki
f9423ae43a [MINOR] acl: add http_auth and http_auth_group
Add two acls to match http auth data:
 acl <name> http_auth(userlist)
 acl <name> http_auth_hroup(userlist) group1 group2 (...)
2010-01-31 19:14:09 +01:00
Krzysztof Piotr Oledzki
59bb218b86 [MINOR] http-request: allow/deny/auth support for frontend/backend/listen
Use the generic auth framework to control access to frontends/backends/listens
2010-01-31 19:14:08 +01:00
Krzysztof Piotr Oledzki
d7528e5a60 [MINOR] add ACL_TEST_F_NULL_MATCH
Add ACL_TEST_F_NULL_MATCH so expr->kw->match can be called
even if expr->patterns is empty.
2010-01-31 19:14:07 +01:00
Krzysztof Piotr Oledzki
961050465e [MINOR] generic auth support with groups and encrypted passwords
Add generic authentication & authorization support.

Groups are implemented as bitmaps so the count is limited to
sizeof(int)*8 == 32.

Encrypted passwords are supported with libcrypt and crypt(3), so it is
possible to use any method supported by your system. For example modern
Linux/glibc instalations support MD5/SHA-256/SHA-512 and of course classic,
DES-based encryption.
2010-01-31 19:14:07 +01:00
Krzysztof Piotr Oledzki
fccbdc8421 [MINOR] Base64 decode
Implement Base64 decoding with a reverse table.

The function accepts and decodes classic base64 strings, which
can be composed from many streams as long each one is properly
padded, for example: SGVsbG8=IEhBUHJveHk=IQ==
2010-01-31 19:14:07 +01:00
Willy Tarreau
fdb563c06f [MEDIUM] http: add support for conditional response header rewriting
Just as for the req* rules, we can now condition rsp* rules with ACLs.
ACLs match on response, so volatile request information cannot be used.
A warning is emitted if a configuration contains such an anomaly.
2010-01-31 15:43:27 +01:00
Willy Tarreau
6c123b15cb [MEDIUM] http: make the request filter loop check for optional conditions
From now on, if request filters have ACLs defined, these ACLs will be
evaluated to condition the filter. This will be used to conditionally
remove/rewrite headers based on ACLs.
2010-01-28 20:22:06 +01:00
Willy Tarreau
f4f04125d4 [MINOR] prepare req_*/rsp_* to receive a condition
It will be very handy to be able to pass conditions to req_* and rsp_*.
For now, we just add the pointer to the condition in the affected
structs.
2010-01-28 18:10:50 +01:00
Willy Tarreau
f1e98b8628 [CLEANUP] config: use warnif_cond_requires_resp() to check for bad ACLs
Factor out some repetitive copy-pasted code to check for request ACLs
validity.
2010-01-28 17:59:39 +01:00
Willy Tarreau
2bbba415d7 [MINOR] acl: add build_acl_cond() to make it easier to add ACLs in config
This function automatically builds a rule, considering the if/unless
statements, and automatically updates the proxy's acl_requires, the
condition's file and line.
2010-01-28 16:48:33 +01:00
Willy Tarreau
ef78104947 [MINOR] checks: add the server's status in the checks
Now a server can check the contents of the header X-Haproxy-Server-State
to know how haproxy sees it. The same values as those reported in the stats
are provided :
  - up/down status + check counts
  - throttle
  - weight vs backend weight
  - active sessions vs backend sessions
  - queue length
  - haproxy node name
2010-01-27 20:16:12 +01:00
Willy Tarreau
e9d8788fdd [MINOR] checks: make the HTTP check code add the CRLF itself
Currently we cannot easily add headers nor anything to HTTP checks
because the requests are pre-formatted with the last CRLF. Make the
check code add the CRLF itself so that we can later add useful info.
2010-01-27 20:16:12 +01:00
Willy Tarreau
9e92d327f7 [MINOR] pattern: add support for argument parsers for converters
Some converters will need one or several arguments. It's not possible
to write a simple generic parser for that, so let's add the ability
for each converter to support its own argument parser, and call it
to get the arguments when it's specified. If unspecified, the arguments
are passed unmodified as string+len.
2010-01-26 18:01:35 +01:00
Willy Tarreau
2937c0dd20 [MINOR] standard: str2mask: string to netmask converter
This function converts a dotted or CIDR value to a netmask.
2010-01-26 17:36:17 +01:00
Willy Tarreau
1a51b6342e [MINOR] pattern: make the converter more flexible by supporting void* and int args
The pattern type converters currently support a string arg and a length.
Sometimes we'll prefer to pass them a list or a structure. So let's convert
the string and length into a generic void* and int that each converter may
use as it likes.
2010-01-26 17:17:56 +01:00
Willy Tarreau
88d349d25d [MEDIUM] http: add support for Proxy-Connection header
Despite what is explicitly stated in HTTP specifications,
browsers still use the undocumented Proxy-Connection header
instead of the Connection header when they connect through
a proxy. As such, proxies generally implement support for
this stupid header name, breaking the standards and making
it harder to support keep-alive between clients and proxies.

Thus, we add a new "option http-use-proxy-header" to tell
haproxy that if it sees requests which look like proxy
requests, it should use the Proxy-Connection header instead
of the Connection header.
2010-01-25 12:48:26 +01:00
Willy Tarreau
4de9149f87 [MINOR] add the "force-persist" statement to force persistence on down servers
This is used to force access to down servers for some requests. This
is useful when validating that a change on a server correctly works
before enabling the server again.
2010-01-22 19:10:05 +01:00
Willy Tarreau
e803de2c6b [MINOR] add the ability to force kernel socket buffer size.
Sometimes we need to be able to change the default kernel socket
buffer size (recv and send). Four new global settings have been
added for this :
   - tune.rcvbuf.client
   - tune.rcvbuf.server
   - tune.sndbuf.client
   - tune.sndbuf.server

Those can be used to reduce kernel memory footprint with large numbers
of concurrent connections, and to reduce risks of write timeouts with
very slow clients due to excessive kernel buffering.
2010-01-22 11:49:41 +01:00
Willy Tarreau
bbf0b37f6c [MAJOR] http: rework request Connection header handling
We need to improve Connection header handling in the request for it
to support the upcoming keep-alive mode. Now we have two flags which
keep in the session the information about the presence of a
Connection: close and a Connection: keep-alive headers in the initial
request, as well as two others which keep the current state of those
headers so that we don't have to parse them again. Knowing the initial
value is essential to know when the client asked for keep-alive while
we're forcing a close (eg in server-close mode). Also the Connection
request parser is now able to automatically remove single header values
at the same time they are parsed. This provides greater flexibility and
reliability.

All combinations of listen/front/back in all modes and with both
1.0 and 1.1 have been tested.
2010-01-22 11:49:35 +01:00
Willy Tarreau
348238b3a9 [MINOR] tools: add a "word_match()" function to match words and ignore spaces
Some header values might be delimited with spaces, so it's not enough to
compare "close" or "keep-alive" with strncasecmp(). Use word_match() for
that.
2010-01-18 19:51:39 +01:00
Willy Tarreau
68085d8cfb [MINOR] http: add http_remove_header2() to remove a header value.
Calling this function after http_find_header2() automatically deletes
the current value of the header, and removes the header itself if the
value is the only one. The context is automatically adjusted for a
next call to http_find_header2() to return the next header. No other
change nor test should be made on the transient context though.
2010-01-18 19:51:33 +01:00
Willy Tarreau
050737f798 [BUILD] remove a warning in standard.h on AIX 2010-01-14 11:40:12 +01:00
Emeric Brun
1d33b2965e [MEDIUM] Add stick and store rules analysers. 2010-01-12 16:01:24 +01:00
Emeric Brun
b982a3d23a [MEDIUM] Add stick table configuration and init. 2010-01-12 16:01:24 +01:00
Emeric Brun
107ca30d54 [MEDIUM] Add pattern fetch management types and functions 2010-01-12 16:01:19 +01:00
Emeric Brun
3bd697e071 [MEDIUM] Add stick table (persistence) management functions and types 2010-01-12 11:23:15 +01:00
Emeric Brun
39132b2165 [MINOR] Add function to parse a size in configuration 2010-01-12 11:23:15 +01:00
Hervé COMMOWICK
698ae00fc2 [MINOR] add option "mysql-check" to use MySQL health checks
This patch adds support for MySQL health checks. Those are
enabled using the new option "mysql-check".
2010-01-12 10:37:39 +01:00
Willy Tarreau
b16a5746b7 [MINOR] http: add a separate "http-keep-alive" timeout
This one is used to wait for next request after a response was sent
to the client.
2010-01-10 14:46:16 +01:00
Willy Tarreau
fcffa6911c [MINOR] http: differentiate waiting for new request and waiting for a complete requst
While waiting in a keep-alive state for a request, we want to silently
close if we don't get anything. However if we get a partial request it's
different because that means the client has started to send something.
This requires a new transaction flag. It will be used to implement a
distinct timeout for keep-alive and requests.
2010-01-10 14:24:53 +01:00
Willy Tarreau
520bbb2b85 [OPTIM] reorder http_txn to optimize cache lines placement
This re-ordering brings about 3% of performance boost on x86_64
on pipeline intensive requests, which means it mainly benefits
the parsers.
2010-01-10 11:31:22 +01:00
Willy Tarreau
a3377eeeff [MINOR] http: move appsession 'sessid' from session to http_txn
This change, suggested by Cyril Bont, makes a lot of sense and
would have made it obvious that sessid was not properly initialized
while switching to keep-alive. The code is now cleaner.
2010-01-10 10:49:11 +01:00
Willy Tarreau
148d099406 [BUG] stream_interface: fix retnclose and remove cond_close
The stream_int_cond_close() function was added to preserve the
contents of the response buffer because stream_int_retnclose()
was buggy. It flushed the response instead of flushing the
request. This caused issues with pipelined redirects followed
by error messages which ate the previous response.

This might even have caused object truncation on pipelined
requests followed by an error or by a server redirection.

Now that this is fixed, simply get rid of the now useless
function.
2010-01-10 10:21:21 +01:00
Willy Tarreau
81e3b4f48d [MINOR] http redirect: add the ability to append a '/' to the URL
Sometimes it can be desired to return a location which is the same
as the request with a slash appended when there was not one in the
request. A typical use of this is for sending a 301 so that people
don't reference links without the trailing slash. The name of the
new option is "append-slash" and it can be used on "redirect"
statements in prefix mode.
2010-01-10 00:42:19 +01:00
Willy Tarreau
962c3f4aab [MEDIUM] http: fix handling of message pointers
Some message pointers were not usable once the message reached the
HTTP_MSG_DONE state. This is the case for ->som which points to the
body because it is needed to parse chunks. There is one case where
we need the beginning of the message : server redirect. We have to
call http_get_path() after the request has been parsed. So we rely
on ->sol without counting on ->som. In order to achieve this, we're
making ->rq.{u,v} relative to the beginning of the message instead
of the buffer. That simplifies the code and makes it cleaner.

Preliminary tests show this is OK.
2010-01-10 00:15:35 +01:00
Willy Tarreau
90deb18916 [MEDIUM] http: make safer use of the DONT_READ and AUTO_CLOSE flags
Several HTTP analysers used to set those flags to values that
were useful but without considering the possibility that they
were not called again to clean what they did. First, replace
direct flag manipulation with more explicit macros. Second,
enforce a rule stating that any buffer which changes one of
these flags from the default must restore it after completion,
so that other analysers see correct flags.

With both this fix and the previous one about analyser bits,
we should not see any more stuck sessions.
2010-01-07 00:20:41 +01:00
Krzysztof Piotr Oledzki
c6df066980 [MEDIUM] default-server support
This patch implements default-server support allowing to change
default server options. It can be used in [defaults] or [backend]/[listen]
sections. Currently the following options are supported:

 - error-limit
 - fall
 - inter
 - fastinter
 - downinter
 - maxconn
 - maxqueue
 - minconn
 - on-error
 - port
 - rise
 - slowstart
 - weight
2010-01-06 00:28:06 +01:00
Krzysztof Piotr Oledzki
15514c21a2 [MINOR]: stats: add show-legends to report additional informations
Supported informations, available via "tr/td title":
  - cap: capabilities (proxy)
  - mode: one of tcp, http or health (proxy)
  - id: SNMP ID (proxy, socket, server)
  - IP (socket, server)
  - cookie (backend, server)
2010-01-06 00:28:06 +01:00
Emeric Brun
3a7fce5383 [BUILD] warning ultoa_r returns char *
ultoa_r modifies its output, it returns a char *.
2010-01-05 23:47:00 +01:00
Emeric Brun
129c59014c [BUILD] warning in stream_interface.h
On some platforms, gcc complains about struct sockaddr.
2010-01-05 23:46:08 +01:00
Willy Tarreau
610ecceef9 [MAJOR] http: fix again the forward analysers
There were still several situations leading to CLOSE_WAIT sockets
remaining there forever because some complex transitions were
obviously not caught due to the impossibility to resync changes
between the request and response FSMs.

This patch now centralizes the global transaction state and feeds
it from both request and response transitions. That way, whoever
finishes first, there will be no issue for converging to the correct
state.

Some heavy use of the new debugging function has helped a lot. Maybe
those calls could be removed after some time. First tests are very
positive.
2010-01-04 21:15:02 +01:00
Willy Tarreau
477ecd8627 [MEDIUM] config: remove the limitation of 10 config files
Now we use a linked list, there is no limit anymore.
2010-01-03 21:22:14 +01:00
Willy Tarreau
deb9ed8f60 [MEDIUM] config: remove the limitation of 10 reqadd/rspadd statements
Now we use a linked list, there is no limit anymore.
2010-01-03 21:22:14 +01:00
Willy Tarreau
2be3939416 [MINOR] http: don't wait for sending requests to the server
By default we automatically wait for enough data to fill large
packets if buf->to_forward is not null. This causes a problem
with POST/Expect requests which have a data size but no data
immediately available. Instead of causing noticeable delays on
such requests, simply add a flag to disable waiting when sending
requests.
2010-01-03 17:24:51 +01:00
Willy Tarreau
face839296 [OPTIM] http: set MSG_MORE on response when a pipelined request is pending
Many times we see a lot of short responses in HTTP (typically 304 on a
reload). It is a waste of network bandwidth to send that many small packets
when we know we can merge them. When we know that another HTTP request is
following a response, we set BF_EXPECT_MORE on the response buffer, which
will turn MSG_MORE on exactly once. That way, multiple short responses can
leave pipelined if their corresponding requests were also pipelined.
2010-01-03 11:37:54 +01:00
Willy Tarreau
b608feb82a [MAJOR] http: add support for option http-server-close
This option enables HTTP keep-alive on the client side and close mode
on the server side. This offers the best latency on the slow client
side, and still saves as many resources as possible on the server side
by actively closing connections. Pipelining is supported on both requests
and responses, though there is currently no reason to get pipelined
responses.
2010-01-02 22:47:18 +01:00
Willy Tarreau
4c283dce4b [MINOR] stream_sock: add SI_FL_NOLINGER for faster close
This new flag may be set by any user on a stream interface to tell
the underlying protocol that there is no need for lingering on the
socket since we know the other side either received everything or
does not care about what we sent.

This will typically be used with forced server close in HTTP mode,
where we want to quickly close a server connection after receiving
its response. Otherwise the system would prevent us from reusing
the same port for some time.
2009-12-29 14:36:34 +01:00
Willy Tarreau
5523b32cc6 [MEDIUM] http: add two more states for the closing period
HTTP_MSG_CLOSING and HTTP_MSG_CLOSED are needed to know when it
is safe to close a connection without risking to destroy pending
data.
2009-12-29 12:05:52 +01:00
Willy Tarreau
d98cf93395 [MAJOR] http: implement body parser
The body parser will be used in close and keep-alive modes. It follows
the stream to keep in sync with both the request and the response message.
Both chunked transfer-coding and content-length are supported according to
RFC2616.

The multipart/byterange encoding has not yet been implemented and if not
seconded by any of the two other ones, will be forwarded till the close,
as requested by the specification.

Both the request and the response analysers converge into an HTTP_MSG_DONE
state where it will be possible to force a close (option forceclose) or to
restart with a fresh new transaction and maintain keep-alive.

This change is important. All tests are OK but any possible behaviour
change with "option httpclose" might find its root here.
2009-12-27 22:54:55 +01:00
Willy Tarreau
5d881d0f3a [MINOR] new function stream_int_cond_close()
This one will be used to conditionally send a message upon a
close on a stream interface. It will not overwrite any existing
data.
2009-12-27 22:51:06 +01:00
Willy Tarreau
1d3bcce4dd [BUG] http: offsets are relative to the buffer, not to ->som
Some wrong operations were performed on buffers, assuming the
offsets were relative to the beginning of the request while they
are relative to the beginning of the buffer. In practice this is
not yet an issue since both are the same... until we add support
for keep-alive.
2009-12-27 15:50:06 +01:00
Willy Tarreau
d21e01c63f [MINOR] buffers: add buffer_ignore() to skip some bytes
This simple function will be used to skip blanks at the beginning of a
request or response.
2009-12-27 15:45:38 +01:00
Willy Tarreau
e8e785bb85 [MEDIUM] http: add a new transaction flags indicating if we know the transfer length
It's not enough to know if the connection will be in CLOSE or TUNNEL mode,
we still need to know whether we want to read a full message to a known
length or read it till the end just as in TUNNEL mode. Some updates to the
RFC clarify slightly better the corner cases, in particular for the case
where a non-chunked encoding is used last.

Now we also take care of adding a proper "connection: close" to messages
whose size could not be determined.
2009-12-26 16:29:04 +01:00
Willy Tarreau
0394594b06 [MINOR] http: introduce a new synchronisation state : HTTP_MSG_DONE
This state indicates that an HTTP message (request or response) is
complete. This will be used to know when we can re-initialize a
new transaction. Right now we only switch to it after the end of
headers if there is no data. When other analysers are implemented,
we can switch to this state too.

The condition to reuse a connection is when the response finishes
after the request. This will have to be checked when setting the
state.
2009-12-22 16:50:27 +01:00
Willy Tarreau
0937bc43cf [MINOR] http: move the http transaction init/cleanup code to proto_http
This code really belongs to the http part since it's transaction-specific.
This will also make it easier to later reinitialize a transaction in order
to support keepalive.
2009-12-22 15:03:09 +01:00
Willy Tarreau
7c3c54177a [MAJOR] buffers: automatically compute the maximum buffer length
We used to apply a limit to each buffer's size in order to leave
some room to rewrite headers, then we used to remove this limit
once the session switched to a data state.

Proceeding that way becomes a problem with keepalive because we
have to know when to stop reading too much data into the buffer
so that we can leave some room again to process next requests.

The principle we adopt here consists in only relying on to_forward+send_max.
Indeed, both of those data define how many bytes will leave the buffer.
So as long as their sum is larger than maxrewrite, we can safely
fill the buffers. If they are smaller, then we refrain from filling
the buffer. This means that we won't risk to fill buffers when
reading last data chunk followed by a POST request and its contents.

The only impact identified so far is that we must ensure that the
BF_FULL flag is correctly dropped when starting to forward. Right
now this is OK because nobody inflates to_forward without using
buffer_forward().
2009-12-22 10:06:34 +01:00
Willy Tarreau
5b15447672 [MAJOR] http: completely process the "connection" header
Up to now, we only had a flag in the session indicating if it had to
work in "connection: close" mode. This is not at all compatible with
keep-alive.

Now we ensure that both sides of a connection act independantly and
only relative to the transaction. The HTTP version of the request
and response is also correctly considered. The connection already
knows several modes :
  - tunnel (CONNECT or no option in the config)
  - keep-alive (when permitted by configuration)
  - server-close (close the server side, not the client)
  - close (close both sides)

This change carefully detects all situations to find whether a request
can be fully processed in its mode according to the configuration. Then
the response is also checked and tested to fix corner cases which can
happen with different HTTP versions on both sides (eg: a 1.0 client
asks for explicit keep-alive, and the server responds with 1.1 without
a header).

The mode is selected by a capability elimination algorithm which
automatically focuses on the least capable agent between the client,
the frontend, the backend and the server. This ensures we won't get
undesired situtations where one of the 4 "agents" is not able to
process a transaction.

No "Connection: close" header will be added anymore to HTTP/1.0 requests
or responses since they're already in close mode.

The server-close mode is still not completely implemented. The response
needs to be rewritten as keep-alive before being sent to the client if
the connection was already in server-close (which implies the request
was in keep-alive) and if the response has a content-length or a
transfer-encoding (but only if client supports 1.1).

A later improvement in server-close mode would probably be to detect
some situations where it's interesting to close the response (eg:
redirections with remote locations). But even then, the client might
close by itself.

It's also worth noting that in tunnel mode, no connection header is
affected in either direction. A tunnelled connection should theorically
be notified at the session level, but this is useless since by definition
there will not be any more requests on it. Thus, we don't need to add a
flag into the session right now.
2009-12-22 09:52:43 +01:00
Krzysztof Piotr Oledzki
97f07b832f [MEDIUM] Decrease server health based on http responses / events, version 3
Implement decreasing health based on observing communication between
HAProxy and servers.

Changes in this version 2:
 - documentation
 - close race between a started check and health analysis event
 - don't force fastinter if it is not set
 - better names for options
 - layer4 support

Changes in this version 3:
 - add stats
 - port to the current 1.4 tree
2009-12-16 00:29:27 +01:00
Willy Tarreau
d0f06fc4b2 [MINOR] http: detect tunnel mode and set it in the session
In order to support keepalive, we'll have to differentiate
normal sessions from tunnel sessions, which are the ones we
don't want to analyse further.

Those are typically the CONNECT requests where we don't care
about any form of content-length, as well as the requests
which are forwarded on non-close and non-keepalive proxies.
2009-11-30 12:19:56 +01:00
Cyril Bonté
b21570ae0f [MEDIUM] appsession: add "len", "prefix" and "mode" options
To sum up :
- len : it's now the max number of characters for the value, preventing
  garbaged results.
- a new option "prefix" is added, this allows to use dynamic cookie
  names (e.g. ASPSESSIONIDXXX).

Previously in the thread, I wanted to use the value found with
"capture cookie" but when i started to update the documentation, I
found this solution quite weird. I've made a small rework to not
depend on "capture cookie".

- There's the posssiblity to define the URL parser mode (path parameters
  or query string).
2009-11-30 11:31:53 +01:00
Willy Tarreau
fa355d4a51 [MINOR] http: keep pointer to beginning of data
We now set msg->col and msg->sov to the first byte of non-header.
They will be used later when parsing chunks. A new macro was added
to perform size additions on an http_msg in order to limit the risks
of copy-paste in the long term.

During this operation, it appeared that the http_msg struct was not
optimal on 64-bit, so it was re-ordered to fill the holes.
2009-11-29 18:12:29 +01:00
Willy Tarreau
655dce90d4 [MINOR] http: create new MSG_BODY sub-states
An HTTP message can be decomposed into several sub-states depending
on the transfer-encoding. We'll have to keep these state information
while parsing chunks, so we must extend the values. In order not to
change everything, we'll now consider that anything >= MSG_BODY is
the body, and that the value indicates the precise state. The
MSG_ERROR status which was greater than MSG_BODY was moved for this.
2009-11-08 13:10:58 +01:00
Willy Tarreau
da3b7c31f7 [MINOR] tools: add hex2i() function to convert hex char to int 2009-11-02 20:12:52 +01:00
Alex Williams
96532db923 [MINOR] server tracking: don't care about the tracked server's mode
Right now, an HTTP server cannot track a TCP server and vice-versa.
This patch enables proxy tracking without relying on the proxy's mode
(tcp/http/health). It only requires a matching proxy name to exist. The
original function was renamed to findproxy_mode().
2009-11-02 11:08:00 +01:00
Willy Tarreau
cc05fba613 [BUG] definitely fix regparm issues between haproxy core and ebtree
It's a pain to enable regparm because ebtree is built in its corner
and does not depend on the rest of the config. This causes no problem
except that if the regparm settings are not exactly similar, then we
can get inconsistent function interfaces and crashes.

One solution realized in this patch consists in externalizing all
compiler settings and changing CONFIG_XXX_REGPARM into CONFIG_REGPARM
so that we ensure that any sub-component uses the same setting. Since
ebtree used a value here and not a boolean, haproxy's config has been
set to use a number too. Both haproxy's core and ebtree currently use
the same copy of the compiler.h file. That way we don't have any issue
anymore when one setting changes somewhere.
2009-10-27 21:53:58 +01:00
Willy Tarreau
3e79479348 [CLEANUP] ebtree: remove old unused files 2009-10-26 21:15:10 +01:00
Willy Tarreau
45cb4fb640 [MEDIUM] build: switch ebtree users to use new ebtree version
All files referencing the previous ebtree code were changed to point
to the new one in the ebtree directory. A makefile variable (EBTREE_DIR)
is also available to use files from another directory.

The ability to build the libebtree library temporarily remains disabled
because it can have an impact on some existing toolchains and does not
appear worth it in the medium term if we add support for multi-criteria
stickiness for instance.
2009-10-26 21:10:04 +01:00
Willy Tarreau
8e89b84848 [MINOR] http: remove the last call to stream_int_return
And remove the now unused function itself too.
2009-10-18 23:56:35 +02:00
Willy Tarreau
b37c27e28f [MAJOR] http: create the analyser which waits for a response
The code part which waits for an HTTP response has been extracted
from the old function. We now have two analysers and the second one
may re-enable the first one when an 1xx response is encountered.
This has been tested and works.

The calls to stream_int_return() that were remaining in the wait
analyser have been converted to stream_int_retnclose().
2009-10-18 23:15:41 +02:00
Willy Tarreau
3667d5d0b6 [MINOR] http: add new transaction flags for keep-alive and content-length
We'll need to store the keep-alive status as well as content-length
and/or transfer-encoding status.
2009-10-18 19:50:43 +02:00
Cyril Bont
bf47aeb946 [MEDIUM] appsession: add the "request-learn" option
This patch has 2 goals :

1. I wanted to test the appsession feature with a small PHP code,
using PHPSESSID. The problem is that when PHP gets an unknown session
id, it creates a new one with this ID. So, when sending an unknown
session to PHP, persistance is broken : haproxy won't see any new
cookie in the response and will never attach this session to a
specific server.

This also happens when you restart haproxy : the internal hash becomes
empty and all sessions loose their persistance (load balancing the
requests on all backend servers, creating a new session on each one).
For a user, it's like the service is unusable.

The patch modifies the code to make haproxy also learn the persistance
from the client : if no session is sent from the server, then the
session id found in the client part (using the URI or the client cookie)
is used to associated the server that gave the response.

As it's probably not a feature usable in all cases, I added an option
to enable it (by default it's disabled). The syntax of appsession becomes :

  appsession <cookie> len <length> timeout <holdtime> [request-learn]

This helps haproxy repair the persistance (with the risk of losing its
session at the next request, as the user will probably not be load
balanced to the same server the first time).

2. This patch also tries to reduce the memory usage.
Here is a little example to explain the current behaviour :
- Take a Tomcat server where /session.jsp is valid.
- Send a request using a cookie with an unknown value AND a path
  parameter with another unknown value :

  curl -b "JSESSIONID=12345678901234567890123456789012" http://<haproxy>/session.jsp;jsessionid=00000000000000000000000000000001

(I know, it's unexpected to have a request like that on a live service)
Here, haproxy finds the URI session ID and stores it in its internal
hash (with no server associated). But it also finds the cookie session
ID and stores it again.

- As a result, session.jsp sends a new session ID also stored in the
  internal hash, with a server associated.

=> For 1 request, haproxy has stored 3 entries, with only 1 which will be usable

The patch modifies the behaviour to store only 1 entry (maximum).
2009-10-18 11:56:26 +02:00
Willy Tarreau
f1ba4b3de5 [MAJOR] buffer: flag BF_DONT_READ to disable reads when not required
When processing a GET or HEAD request in close mode, we know we don't
need to read anything anymore on the socket, so we can disable it.
Doing this can save up to 40% of the recv calls, and half of the
epoll_ctl calls.

For this we need a buffer flag indicating that we're not interesting in
reading anymore. Right now, this flag also disables both polled reads.
We might benefit from disabling only speculative reads, but we will need
at least this flag when we want to support keepalive anyway.

Currently we don't disable the flag on completion, but it does not
matter as we close ASAP when performing the shutw().
2009-10-18 08:52:24 +02:00
Willy Tarreau
b48b323223 [MEDIUM] fd: merge fd_list into fdtab
The fd_list[] used by sepoll was indexed on the fd number and was only
used to store the equivalent of an integer. Changing it to be merged
with fdtab reduces the number of pointer computations, the code size
and some initialization steps. It does not harm other pollers much
either, as only one integer was added to the fdtab array.
2009-10-18 08:20:26 +02:00
Willy Tarreau
8d5d77efc3 [OPTIM] move some rarely used fields out of fdtab
Some rarely information are stored in fdtab, making it larger for no
reason (source port ranges, remote address, ...). Such information
lie there because the checks can't find them anywhere else. The goal
will be to move these information to the stream interface once the
checks make use of it.

For now, we move them to an fdinfo array. This simple change might
have improved the cache hit ratio a little bit because a 0.5% of
performance increase has measured.
2009-10-18 08:17:33 +02:00
Krzysztof Piotr Oledzki
5fb1882514 [MINOR] Collect & provide http response codes received from servers
Additional data is provided on both html & csv stats:
 - html: when passing a mouse over Sessions -> Total (servers, backends)
 - cvs: by 6 additional fields (hrsp_1xx, hrsp_2xx, hrsp_3xx, hrsp_4xx, hrsp_5xx, hspr_other)

Patch inspired by:
 http://www.formilux.org/archives/haproxy/0910/2528.html
 http://www.formilux.org/archives/haproxy/0910/2529.html
2009-10-14 21:49:53 +02:00
Willy Tarreau
cb6cd43725 [MINOR] tcp: add support for the defer_accept bind option
This can ensure that data is readily available on a socket when
we accept it, but a bug in the kernel ignores the timeout so the
socket can remain pending as long as the client does not talk.
Use with care.
2009-10-13 07:34:14 +02:00
Willy Tarreau
4e33d8677a [OPTIM] stats: check free space before trying to print
This alone makes a typical HTML stats dump consume 10% CPU less,
because we avoid doing complex printf calls to drop them later.
Only a few common cases have been checked, those which are very
likely to run for nothing.
2009-10-11 23:35:10 +02:00
Willy Tarreau
ea1f5fe28a [MINOR] stats: use a dedicated state to output static data
It is a bit expensive and complex to use to call buffer_feed()
directly from the request parser, and there are risks that some
output messages are lost in case of buffer full. Since most of
these messages are static, let's have a state dedicated to print
these messages and store them in a specific area shared with the
stats in the session. This both reduces code size and risks of
losing output data.
2009-10-11 23:12:51 +02:00
Krzysztof Piotr Oledzki
f7089f5852 [MINOR] Capture & display more data from health checks, v2
Capture & display more data from health checks, like
strerror(errno) for L4 failed checks or a first line
from a response for L7 successes/failed checks.

Non ascii or control characters are masked with
chunk_htmlencode() (html stats) or chunk_asciiencode() (logs).
2009-10-10 21:51:16 +02:00
Krzysztof Piotr Oledzki
ba8d7d3916 [MINOR] Add chunk_htmlencode and chunk_asciiencode
Add two functions to encode input chunk replacing
non-printable, non ascii or special characters
with:
 "&#%u;"  - chunk_htmlencode
 "<%02X>" - chunk_asciiencode

Above functions should be used when adding strings, received
from possible unsafe sources, to html stats or logs.
2009-10-10 21:51:16 +02:00
Krzysztof Piotr Oledzki
3d5562b96c [MINOR] Add cut_crlf(), ltrim(), rtrim() and alltrim()
Add four generic functions.
2009-10-10 21:51:16 +02:00
Willy Tarreau
975c50b838 [MINOR] add the "initial weight" to the server struct.
This one will be used when changing weights.
2009-10-10 19:34:06 +02:00
Willy Tarreau
f395017227 [MINOR] proxy: provide function to retrieve backend/server pointers
int get_backend_server(const char *bk_name, const char *sv_name,
                       struct proxy **bk, struct server **sv);

This function scans the list of backends and servers to retrieve the first
backend and the first server with the given names, and sets them in both
parameters. It returns zero if either is not found, or non-zero and sets
the ones it did not found to NULL. If a NULL pointer is passed for the
backend, only the pointer to the server will be updated.
2009-10-10 18:36:25 +02:00
Willy Tarreau
9bcc91e80e [MINOR] buffers: add buffer_feed2() and make buffer_feed() measure string length
It's inconvenient to always have to compute string lengths when calling
buffer_feed(), so change that.
2009-10-10 18:01:44 +02:00
Willy Tarreau
6162db2a81 [MEDIUM] add access restrictions to the stats socket
The stats socket can now run at 3 different levels :
  - user
  - operator (default one)
  - admin

These levels are used to restrict access to some information
and commands. Only the admin can clear all stats. A user cannot
clear anything nor access sensible data such as sessions or
errors.
2009-10-10 17:13:00 +02:00
Willy Tarreau
6b2e11be1e [MEDIUM] backend: implement consistent hashing variation
Consistent hashing provides some interesting advantages over common
hashing. It avoids full redistribution in case of a server failure,
or when expanding the farm. This has a cost however, the hashing is
far from being perfect, as we associate a server to a request by
searching the server with the closest key in a tree. Since servers
appear multiple times based on their weights, it is recommended to
use weights larger than approximately 10-20 in order to smoothen
the distribution a bit.

In some cases, playing with weights will be the only solution to
make a server appear more often and increase chances of being picked,
so stats are very important with consistent hashing.

In order to indicate the type of hashing, use :

   hash-type map-based      (default, old one)
   hash-type consistent     (new one)

Consistent hashing can make sense in a cache farm, in order not
to redistribute everyone when a cache changes state. It could also
probably be used for long sessions such as terminal sessions, though
that has not be attempted yet.

More details on this method of hashing here :
  http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
2009-10-09 07:17:58 +02:00
Krzysztof Piotr Oledzki
6f61b21524 [BUG] Fix NULL pointer dereference in stats_check_uri_auth(), v2
Recent "struct chunk rework" introduced a NULL pointer dereference
and now haproxy segfaults if auth is required for stats but not found.

The reason is that size_t cannot store negative values, but current
code assumes that "len < 0" == uninitialized.

This patch fixes it.
2009-10-04 23:44:45 +02:00
Willy Tarreau
ac68c5d92c [OPTIM] counters: move some max numbers to the counters struct
There are a few remaining max values that need to move to counters.
Also, the counters are more often used than some config information,
so get them closer to the other useful struct members for better cache
efficiency.
2009-10-04 23:26:19 +02:00
Willy Tarreau
53fb4ae261 [MEDIUM] config: automatically find unused IDs for proxies, servers and listeners
Until now it was required that every custom ID was above 1000 in order to
avoid conflicts. Now we have the list of all assigned IDs and can automatically
pick the first unused one. This means that it is perfectly possible to interleave
automatic IDs with persistent IDs and the parser will automatically allocate
unused values starting with 1.
2009-10-04 23:04:08 +02:00
Willy Tarreau
482b00d1b4 [MINOR] tools: add a new get_next_id() function
This function returns the next unused key in a tree. This will be
used to find spare IDs.
2009-10-04 22:48:42 +02:00
Willy Tarreau
88922354fb [MINOR] config: add pointer to file name in block/redirect/use_backend/monitor rules
Those conditions already referenced the config line, but not the file.
2009-10-04 22:02:50 +02:00
Willy Tarreau
90a570f025 [MINOR] config: reference file and line with any listener/proxy/server declaration
Those will be used later for cross-references of conflicts or errors.
2009-10-04 21:14:56 +02:00
Krzysztof Piotr Oledzki
aeebf9ba65 [MEDIUM] Collect & provide separate statistics for sockets, v2
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generate by local and remote users or between different types of remote
clients (peerings, domestic, foreign).

Currently no "Session rate" is supported, but adding it should be possible
if we found it useful.
2009-10-04 18:56:02 +02:00
Krzysztof Piotr Oledzki
052d4fd07d [CLEANUP] Move counters to dedicated structures
Move counters from "struct proxy" and "struct server"
to "struct pxcounters" and "struct svcounters".

This patch should make no functional change.
2009-10-04 18:32:39 +02:00
Krzysztof Piotr Oledzki
8d06b8b8db [MINOR] Introduce include/types/counters.h
This patch introduces include/types/counters.h that
will be used to split couters from other structures
and to create statistics for listeners.
2009-10-04 18:32:12 +02:00
Willy Tarreau
b0c9bc4f95 [MEDIUM] stats: make HTTP stats use an I/O handler
Doing this, we can remove the last BF_HIJACK user and remove
produce_content(). s->data_source could also be removed but
it is currently used to detect if the stats or a server was
used.
2009-10-04 15:56:38 +02:00
Willy Tarreau
65671abd32 [MINOR] remove now obsolete ana_state from the session struct
This one is not used anymore.
2009-10-04 14:24:59 +02:00
Willy Tarreau
f5a885fd28 [MEDIUM] stats: don't use s->ana_state anymore
The stats handler used to store internal states in s->ana_state. Now
we only rely on si->st0 in which we can store as many states as we
have possible outputs. This cleans up the stats code a lot and makes
it more maintainable. It has also reduced code size by a few hundred
bytes.
2009-10-04 14:22:18 +02:00
Willy Tarreau
24955a1000 [MINOR] stats: make stats_dump_raw_to_buffer() use buffer_feed_chunk
Same as previous change. A remaining call to stats_dump_proxy()
still prevents us from completing the update.
2009-10-04 12:17:54 +02:00
Willy Tarreau
7e72a8faf2 [MINOR] stats_dump_sess_to_buffer: use buffer_feed_chunk()
same as previous patch for this function.
2009-10-04 11:00:11 +02:00
Willy Tarreau
61b347342c [MINOR] stats_dump_errors_to_buffer: use buffer_feed_chunk()
We can simplify the code in the stats functions using buffer_feed_chunk()
instead of buffer_write_chunk(). Let's start with this function. This
patch also fixed an issue where we could dump past the end of the capture
buffer if it is shorter than the captured request.
2009-10-04 11:00:11 +02:00
Willy Tarreau
33b230b34a [BUG] stats: don't call buffer_shutw(), but ->shutw() instead
Calling buffer_shutw() marks the buffer as closed but if it was already
closed in the other direction, the stream interface is not marked as
closed, causing infinite loops.

We took this opportunity to completely remove buffer_shutw() and buffer_shutr()
which have no reason to be used at all and which will always cause trouble
when directly called. The stats occurrence was the last one.
2009-10-04 09:19:36 +02:00
Willy Tarreau
f27b5ea8dc [MEDIUM] new option "independant-streams" to stop updating read timeout on writes
By default, when data is sent over a socket, both the write timeout and the
read timeout for that socket are refreshed, because we consider that there is
activity on that socket, and we have no other means of guessing if we should
receive data or not.

While this default behaviour is desirable for almost all applications, there
exists a situation where it is desirable to disable it, and only refresh the
read timeout if there are incoming data. This happens on sessions with large
timeouts and low amounts of exchanged data such as telnet session. If the
server suddenly disappears, the output data accumulates in the system's
socket buffers, both timeouts are correctly refreshed, and there is no way
to know the server does not receive them, so we don't timeout. However, when
the underlying protocol always echoes sent data, it would be enough by itself
to detect the issue using the read timeout. Note that this problem does not
happen with more verbose protocols because data won't accumulate long in the
socket buffers.

When this option is set on the frontend, it will disable read timeout updates
on data sent to the client. There probably is little use of this case. When
the option is set on the backend, it will disable read timeout updates on
data sent to the server. Doing so will typically break large HTTP posts from
slow lines, so use it with caution.
2009-10-03 22:01:18 +02:00
Willy Tarreau
9757a38feb [MEDIUM] backend: introduce the "static-rr" LB algorithm
The "static-rr" is just the old round-robin algorithm. It is still
in use when a hash algorithm is used and the data to hash is not
present, but it was impossible to configure it explicitly. This one
is cheaper in terms of CPU and supports unlimited numbers of servers,
so it makes sense to be able to use it.
2009-10-03 18:41:19 +02:00
Willy Tarreau
f3e49f9521 [MINOR] backend: separate declarations of LB algos from their lookup method
LB algo macros were composed of the LB algo by itself without any indication
of the method to use to look up a server (the lb function itself). This
method was implied by the LB algo, which was not very convenient to add
more algorithms. Now we have several fields in the LB macros, some to
describe what to look for in the requests, some to describe how to transform
that (kind of algo) and some to describe what lookup function to use.

The next patch will make it possible to factor out some code for all algos
which rely on a map.
2009-10-03 18:41:18 +02:00
Willy Tarreau
5b4c2b58fe [CLEANUP] proxy: move last lb-specific bits to their respective files
The lbprm structure has moved to backend.h, where it should be, and
all algo-specific types and declarations have moved to their specific
files. The proxy struct is now much more readable.
2009-10-03 18:41:18 +02:00
Krzysztof Piotr Oledzki
48cb2aed5a [MINOR] add "description", "node" and show-node"/"show-desc", remove "node-name", v2
This patch implements "description" (proxy and global) and "node" (global)
options, removes "node-name" and adds "show-node" & "show-desc" options
for "stats". It also changes the way the header lines (with proxy name) and
the statistics are displayed, so stats no longer look so clumsy with very
long names.

Instead of "node-name" it is possible to use show-node/show-desc with
an optional parameter that overrides a default node/description.

backend cust-0045
        # report specific values for this customer
        stats show-node Europe
        stats show-desc Master node for Europe, Asia, Africa
2009-10-03 07:10:14 +02:00
Willy Tarreau
39c9ba72a7 [MINOR] lb_map: reorder code in order to ease integration of new hash functions
We need to remove hash map accesses out of backend.c if we want to
later support new hash methods. This patch separates the hash computation
method from the server lookup. It leaves the lookup function to lb_map.c
and calls it with the result of the hash.
2009-10-01 21:11:15 +02:00
Willy Tarreau
f89c1873f8 [CLEANUP] backend: move LB algos to individual files
It was becoming painful to have all the LB algos in backend.c.
Let's move them to their own files. A few hashing functions still
need be broken in two parts, one for the contents and one for the
map position.
2009-10-01 11:19:37 +02:00
Willy Tarreau
78ff5d0a9e [MINOR] include time.h from freq_ctr.h as is uses "now". 2009-10-01 11:05:26 +02:00
Krzysztof Piotr Oledzki
213014e587 [MEDIUM] Health check reporting code rework + health logging, v3
This patch adds health logging so it possible to check what
was happening before a crash. Failed healt checks are logged if
server is UP and succeeded healt checks if server is DOWN,
so the amount of additional information is limited.

I also reworked the code a little:

 - check_status_description[] and check_status_info[] is now
   joined into check_statuses[]

 - set_server_check_status updates not only s->check_status and
   s->check_duration but also s->result making the code simpler

Changes in v3:
 - for now calculate and use local versions of health/rise/fall/state,
   it is a slow path, no harm should be done. One day we may centralize
   processing of the checks and remove the duplicated code.
 - also log checks that are restoring current state
 - use "conditionally succeeded" for 404 with disable-on-404
2009-10-01 10:17:37 +02:00
Krzysztof Piotr Oledzki
78abe618a8 [MAJOR] struct chunk rework
Add size to struct chunk and simplify the code as there is
no longer required to pass sizeof in chunk_printf().
2009-10-01 10:17:37 +02:00
Willy Tarreau
ca7d4b98d4 [MINOR] backend: uninline some LB functions
There is no reason to inline functions which are used to grab a server
depending on an LB algo. They are large and used at several places.
Uninlining them saves 400 bytes of code.
2009-10-01 09:21:55 +02:00
Willy Tarreau
c5d9c80182 [MINOR] backend: export some functions to recount servers
Those functions will be used by new LB algorithms.
2009-10-01 09:17:05 +02:00
Willy Tarreau
9a42c0d771 [MEDIUM] stats: replace the stats socket analyser with an SI applet
We can get rid of the stats analyser by moving all the stats code
to a stream interface applet. Above being cleaner, it provides new
advantages such as the ability to process requests and responses
from the same function and work only with simple state machines.
There's no need for any hijack hack anymore.

The direct advantage for the user are the interactive mode and the
ability to chain several commands delimited by a semi-colon. Now if
the user types "prompt", he gets a prompt from which he can send
as many requests as he wants. All outputs are terminated by a
blank line followed by a new prompt, so this can be used from
external tools too.

The code is not very clean, it needs some rework, but some part
of the dirty parts are due to the remnants of the hijack mode used
in the old functions we call.

The old AN_REQ_STATS_SOCK analyser flag is now unused and has been
removed.
2009-09-23 23:52:17 +02:00
Willy Tarreau
eecc8ee673 [MINOR] add a ->private member to the stream_interface
iohandlers will need to store some form of context and for this will
need a way to find their call context. We add the ->private as well
as ->st0 and ->st1 for that purpose. Most likely ->private will be
initialized to the current session and ->st0 and ->st1 will be used
to maintain any form of internal state between calls.
2009-09-23 23:52:16 +02:00
Willy Tarreau
fb90d94d7a [MINOR] stream_interface: add functions to support running as internal/external tasks
It will soon be necessary to have stream interfaces running as part of
the current task, or as independant tasks. For instance when we want to
implement compression or SSL. It will also be used for applets running
as stream interfaces.

These new functions are used to perform exactly that. Note that it's
still not easy to write a simple echo applet and more functions will
likely be needed.
2009-09-23 23:52:15 +02:00
Willy Tarreau
b029f8cd7d [MINOR] stream_interface: add iohandler callback
When stream interfaces will embedded applets running as part as their
holding task, we'll need a new callback to process them from the
session processor.
2009-09-23 23:52:15 +02:00
Willy Tarreau
89f7ef295d [MINOR] stream_interface: add SI_FL_DONT_WAKE flag
We had to add a new stream_interface flag : SI_FL_DONT_WAKE. This flag
is used to indicate that a stream interface is being updated and that
no wake up should be sent to its owner. This will be required for tasks
embedded into stream interfaces. Otherwise, we could have the
owner task send wakeups to itself during status updates, thus
preventing the state from converging. As long as a stream_interface's
status is being monitored and adjusted, there is no reason to wake it
up again, as we know its changes will be seen and considered.
2009-09-23 23:52:14 +02:00
Willy Tarreau
2e1dd3d213 [BUG] fix buffer_skip() and buffer_si_getline() to correctly handle wrap-arounds
Those two functions did not correctly deal with full buffers and/or
buffers that wrapped around. Buffer_skip() was even able to incorrectly
set buf->w further than the end of buffer if its len argument was wrong,
and buffer_si_getline() was able to incorrectly return a length larger
than the effective buffer data available.
2009-09-23 23:52:14 +02:00
Willy Tarreau
fb0e9209a9 [MINOR] ensure that buffer_feed() and buffer_skip() set BF_*_PARTIAL
It's important that these functions set these flags themselves, otherwise
the callers will always have to do this, and there is no valid reason for
not doing it.
2009-09-23 23:50:57 +02:00
Krzysztof Piotr Oledzki
0960541e49 [MEDIUM] Collect & show information about last health check, v3
Collect information about last health check result,
including L7 code if possible (for example http or smtp
return code) and time took to finish last check.

Health check info is provided on both stats pages (html & csv)
and logged when a server is marked UP or DOWN. Currently active
check are marked with an asterisk, but only in html mode.

Currently there are 14 status codes:
  UNK     -> unknown

  INI     -> initializing
  SOCKERR -> socket error

  L4OK    -> check passed on layer 4, no upper layers testing enabled
  L4TOUT  -> layer 1-4 timeout
  L4CON   -> layer 1-4 connection problem, for example "Connection refused"
              (tcp rst) or "No route to host" (icmp)

  L6OK    -> check passed on layer 6
  L6TOUT  -> layer 6 (SSL) timeout
  L6RSP   -> layer 6 invalid response - protocol error

  L7OK    -> check passed on layer 7
  L7OKC   -> check conditionally passed on layer 7, for example
               404 with disable-on-404
  L7TOUT  -> layer 7 (HTTP/SMTP) timeout
  L7RSP   -> layer 7 invalid response - protocol error
  L7STS   -> layer 7 response error, for example HTTP 5xx
2009-09-23 23:15:36 +02:00
Willy Tarreau
2d028db75a [BUG] buffers: buffer_forward() must not always clear BF_OUT_EMPTY
This flag was unconditionally cleared, which is wrong because we
can enable forwarding on an empty buffer.
2009-09-21 06:25:15 +02:00
Willy Tarreau
269358db93 [BUILD] stream_interface: fix conflicting declaration
stream_int_check_timeouts was declared void while it's an int.
2009-09-21 06:24:42 +02:00
Willy Tarreau
31971e536a [MEDIUM] add support for infinite forwarding
In TCP, we don't want to forward chunks of data, we want to forward
indefinitely. This patch introduces a special value for the amount
of data to be forwarded. When buffer_forward() is called with
BUF_INFINITE_FORWARD, it configures the buffer to never stop
forwarding until the end.
2009-09-20 12:07:52 +02:00
Willy Tarreau
ba0b63d2c7 [MAJOR] buffers: fix the BF_EMPTY flag's meaning
The BF_EMPTY flag was once used to indicate an empty buffer. However,
it was used half the time as meaning the buffer is empty for the reader,
and half the time as meaning there is nothing left to send.

"nothing to send" is only indicated by "->send_max=0 && !pipe". Once
we fix this, we discover that the flag is not used anymore. So the
flags has been renamed BF_OUT_EMPTY and means exactly the condition
above, ie, there is nothing to send.

Doing so has allowed us to remove some unused tests for emptiness,
but also to uncover a certain amount of situations where the flag
was not correctly set or tested.
2009-09-20 08:17:45 +02:00
Willy Tarreau
520d95e42b [MAJOR] buffers: split BF_WRITE_ENA into BF_AUTO_CONNECT and BF_AUTO_CLOSE
The BF_WRITE_ENA buffer flag became very complex to deal with, because
it was used to :
  - enable automatic connection
  - enable close forwarding
  - enable data forwarding

The last point was not very true anymore since we introduced ->send_max,
but still the test remained everywhere. This was causing issues such as
impossibility to connect without forwarding data, impossibility to prevent
closing when data was forwarded, etc...

This patch clarifies the situation by getting rid of this multi-purpose
flag and replacing it with :
  - data forwarding based only on ->send_max || ->pipe ;
  - a new BF_AUTO_CONNECT flag to allow automatic connection and only
    that ;
  - ability to perform an automatic connection when ->send_max or ->pipe
    indicate that data is waiting to leave the buffer ;
  - a new BF_AUTO_CLOSE flag to let the producer automatically set the
    BF_SHUTW_NOW flag when it gets a BF_SHUTR.

During this cleanup, it was discovered that some tests were performed
twice, or that the BF_HIJACK flag was still tested, which is not needed
anymore since ->send_max replcaed it. These places have been fixed too.

These cleanups have also revealed a few areas where the other flags
such as BF_EMPTY are not cleanly used. This will be an opportunity for
a second patch.
2009-09-19 21:14:54 +02:00
Willy Tarreau
c77e761968 [MINOR] buffers: inline buffer_si_putchar()
By inlining this function and slightly reordering it, we can double
the getchar/putchar test throughput, and reduce its footprint by about
40 bytes. Also, it was the only non-inlined char-based function, which
now makes it more consistent this time.
2009-09-19 16:34:18 +02:00
Willy Tarreau
9cb8daa203 [MINOR] buffers: add buffer_cut_tail() to cut only unsent data
This function is used to cut the "tail" of a buffer, which means strip it
to the length of unsent data only, and kill any remaining unsent data. Any
scheduled forwarding is stopped. This is mainly to be used to send error
messages after existing data. It does the same as buffer_erase() for buffers
without pending outgoing data.
2009-09-19 14:53:47 +02:00
Willy Tarreau
91aa577b1f [BUG] buffer_forward() would not correctly consider data already scheduled
The computations in buffer_forward() were only valid if buffer_forward()
was used on a buffer which had no more data scheduled for forwarding.
This is always the case right now so this bug is not yet triggered but
it will soon be. Now we correctly discount the bytes to be forwarded
from the data already present in the buffer.
2009-09-19 14:53:47 +02:00
Willy Tarreau
36a5c5389d [MINOR] buffers: provide buffer_si_putchar() to send a char from a stream interface
This function works like a traditional putchar() except that it
can return 0 if the output buffer is full.

Now a basic character-based echo function would look like this, from
a stream interface :

	while (1) {
		c = buffer_si_peekchar(req);
		if (c < 0)
			break;
		if (!buffer_si_putchar(res, c)) {
			si->flags |= SI_FL_WAIT_ROOM;
			break;
		}
		buffer_skip(req, 1);
		req->flags |= BF_WRITE_PARTIAL;
		res->flags |= BF_READ_PARTIAL;
	}
2009-09-19 14:53:47 +02:00
Willy Tarreau
4fe7a2ec6c [MINOR] buffers: add peekchar and peekline functions for stream interfaces
The buffer_si_peekline() function is sort of a fgets() to be used from a
stream interface. It returns a complete line whenever possible, and does
not update the buffer's pointer, so that the reader is free to consume
what it wants to.

buffer_si_peekchar() only returns one character, and also needs a call
to buffer_skip() once the character is definitely consumed.
2009-09-19 14:53:47 +02:00
Willy Tarreau
aeac31979e [MEDIUM] buffers: provide new buffer_feed*() function
This functions act like their buffer_write*() counter-parts,
except that they're specifically designed to be used from a
stream interface handler, as they carefully check size limits
and automatically advance the read pointer depending on the
to_forward attribute.

buffer_feed_chunk() is an inline calling buffer_feed() as both
are the sames. For this reason, buffer_write_chunk() has also
been turned into an inline which calls buffer_write().
2009-09-19 14:53:46 +02:00
Willy Tarreau
2b7addc833 [MINOR] buffers: provide more functions to handle buffer data
buffer_contig_space(), buffer_contig_data() and buffer_skip()
provide easy methods to extract/insert data from/into a buffer.

buffer_write() and buffer_write_chunk() currently do not check
max_len nor to_forward, so they will quickly become embarrassing
to use or will need an equivalent. The reason is that they are
used to build error messages which currently are not subject to
analysis.
2009-09-19 14:53:46 +02:00
Willy Tarreau
418fd4722a [MAJOR] buffers: fix misuse of the BF_SHUTW_NOW flag
This flag was incorrectly used as meaning "close immediately",
while it needs to say "close ASAP". ASAP here means when unsent
data pending in the buffer are sent. This helps cleaning up some
dirty tricks where the buffer output was checking the BF_SHUTR
flag combined with EMPTY and other such things. Now we have a
clearly defined semantics :

  - producer sets SHUTR and *may* set SHUTW_NOW if WRITE_ENA is
    set, otherwise leave it to the session processor to set it.
  - consumer only checks SHUTW_NOW to decide whether or not to
    call shutw().

This also induced very minor changes at some locations which were
not protected against buffer changes while the SHUTW_NOW flag was
set. Now we prevent send_max from changing when the flag is set.

Several tests have been run without any unexpected behaviour detected.

Some more cleanups are needed, as it clearly appears that some tests
could be removed with stricter semantics.
2009-09-19 14:53:46 +02:00
Willy Tarreau
106f979bbd [MINOR] acl: add support for hdr_ip to match IP addresses in headers
For x-forwarded-for and such headers, it's sometimes needed to match
based on network addresses. Let's use hdr_ip() for that.
2009-09-19 14:47:49 +02:00
Willy Tarreau
6db06d3870 [MEDIUM] remove TCP_CORK and make use of MSG_MORE instead
send() supports the MSG_MORE flag on Linux, which does the same
as TCP_CORK except that we don't have to remove TCP_NODELAY before
and we don't need any syscall to set/remove it. This can save up
to 4 syscalls around a send() (two for setting it, two for removing
it), and it's much cleaner since it is not persistent. So make use
of it instead.
2009-08-19 11:29:44 +02:00
Willy Tarreau
d6d06909da [CLEANUP] remove ifdef MSG_NOSIGNAL and define it instead
ifdefs are really annoying in the code. Define MSG_NOSIGNAL to zero
when undefined and remove associated ifdefs.
2009-08-19 11:25:08 +02:00
Willy Tarreau
dc85b39db7 [MEDIUM] stream_interface: add and use ->update function to resync
We used to call stream_sock_data_finish() directly at the end of
a session update, but if we want to support non-socket interfaces,
we need to have this function configurable. Now we access it via
->update().
2009-08-18 07:38:19 +02:00
Willy Tarreau
27a674efb8 [MEDIUM] make it possible to change the buffer size in the configuration
The new tune.bufsize and tune.maxrewrite global directives allow one to
change the buffer size and the maxrewrite size. Right now, setting bufsize
too low will block stats sockets which will not be able to write at all.
An error checking must be added to buffer_write_chunk() so that if it
cannot write its message to an empty buffer, it causes the caller to abort.
2009-08-17 22:56:56 +02:00
Willy Tarreau
a07a34eb24 [MEDIUM] replace BUFSIZE with buf->size in computations
The first step towards dynamic buffer size consists in removing
all static definitions of the buffer size. Instead, we store a
buffer's size in itself. Right now they're all preinitialized
to BUFSIZE, but we will change that.
2009-08-16 23:27:46 +02:00
Willy Tarreau
4e5b8287a6 [MEDIUM] set rep->analysers from fe and be analysers
sess_establish() used to resort to protocol-specific guesses
in order to set rep->analysers. This is no longer needed as it
gets set from the frontend and the backend as a copy of what
was defined in the configuration.
2009-08-16 22:57:50 +02:00
Willy Tarreau
c1a2167e9d [MINOR] cleanup set_session_backend by using pre-computed analysers
Analyser bitmaps are now stored in the frontend and backend, and
combined at configuration time. That way, set_session_backend()
does not need to perform any protocol-specific combinations.
2009-08-16 22:37:44 +02:00
Willy Tarreau
2c9f5b130f [MINOR] move the initial task's nice value to the listener
Since the listener is the one indicating what analyser and session
handlers to call, it makes sense that it also sets the task's nice
value. This also helps getting rid of the last trace of the stats
in the proto_uxst file.
2009-08-16 19:36:56 +02:00
Willy Tarreau
5ca791da8d [CLEANUP] move remaining stats sockets code to dumpstats
The remains of the stats socket code has nothing to do in proto_uxst
anymore and must move to dumpstats. The code is much cleaner and more
structured. It was also an opportunity to rename AN_REQ_UNIX_STATS
as AN_REQ_STATS_SOCK as the stats socket is no longer unix-specific
either.

The last item refering to stats in proto_uxst is the setting of the
task's nice value which should in fact come from the listener.
2009-08-16 19:35:36 +02:00
Willy Tarreau
8e13d7492d [CLEANUP] unix: remove uxst_process_session()
This one is not used anymore.
2009-08-16 19:34:23 +02:00
Willy Tarreau
104eb36f26 [MEDIUM] make the unix stats sockets use the generic session handler
process_session() is now ready to handle unix stats sockets. This
first step works and old code has not been removed. A cleanup is
required. The stats handler is not unix socket-centric anymore and
should move to dumpstats.c.
2009-08-16 19:33:51 +02:00
Willy Tarreau
89a6313c34 [MEDIUM] make the global stats socket part of a frontend
Creating a frontend for the global stats socket will help merge
unix sockets management with the other socket management. Since
frontends are huge structs, we only allocate it if required.
2009-08-16 19:31:51 +02:00
Willy Tarreau
9650f37628 [MEDIUM] move connection establishment from backend to the SI.
The connection establishment was completely handled by backend.c which
normally just handles LB algos. Since it's purely TCP, it must move to
proto_tcp.c. Also, instead of calling it directly, we now call it via
the stream interface, which will later help us unify session handling.
2009-08-16 17:46:15 +02:00
Willy Tarreau
b55932ddaf [MEDIUM] remove old experimental tcpsplice option
This Linux-specific option was never really used in production and
has since been superseded by new splicing options brought by recent
Linux kernels.

It caused several particular cases in the code because the kernel
would take care of the session without haproxy being able to do
anything on it, which became hard to handle in the new architecture.

Let's simply get rid of it now that there is a replacement available.
2009-08-16 13:20:32 +02:00
Willy Tarreau
1d45b7cbae [MINOR] stats: add a new node-name setting
The new "node-name" stats setting enables reporting of a node ID on
the stats page. It is possible to return the system's host name as
well as a specific name.
2009-08-16 10:29:18 +02:00
Willy Tarreau
3ad6a7640b [MINOR] export the hostname variable so that all the code can access it
The hostname variable will be used later, export it.
2009-08-16 10:08:02 +02:00
Emeric Brun
3a058f3091 [MINOR] add a new CLF log format
Appending the "clf" word after "option httplog" turns the HTTP log
format into a CLF format, more suited for certain tools.
2009-07-14 12:50:40 +02:00
Emeric Brun
647caf1ebc [MEDIUM] add support for RDP cookie persistence
The new statement "persist rdp-cookie" enables RDP cookie
persistence. The RDP cookie is then extracted from the RDP
protocol, and compared against available servers. If a server
matches the RDP cookie, then it gets the connection.
2009-07-14 12:50:40 +02:00
Emeric Brun
736aa238a3 [MEDIUM] add support for RDP cookie load-balancing
This patch adds support for hashing RDP cookies in order to
use them as a load-balancing key. The new "rdp-cookie(name)"
load-balancing metric has to be used for this. It is still
mandatory to wait for an RDP cookie in the frontend, otherwise
it will randomly work.
2009-07-14 12:50:39 +02:00
Emeric Brun
bede3d0ef4 [MINOR] acl: add support for matching of RDP cookies
The RDP protocol is quite simple and documented, which permits
an easy detection and extraction of cookies. It can be useful
to match the MSTS cookie which can contain the username specified
by the client.
2009-07-14 12:50:39 +02:00
Willy Tarreau
bedb9bad67 [MINOR] prepare callers of session_set_backend to handle errors
session_set_backend will soon have to allocate areas for HTTP
headers. We must ensure that the callers can handle an allocation
error.
2009-07-12 08:36:24 +02:00
Willy Tarreau
a9fb08317f [MINOR] report in the proxies the requirements for ACLs
This patch propagates the ACL conditions' "requires" bitfield
to the proxies. This makes it possible to know exactly what a
proxy might have to support for any request, which helps knowing
whether we have to allocate some space for certain types of
structures or not (eg: the hdr_idx struct).

The concept might be extended to a lot more types of information,
such as detecting whether we need to allocate some space for some
request ACLs which need a result in the response, etc...
2009-07-10 23:09:39 +02:00
Willy Tarreau
1d0dfb155d [MAJOR] http: complete splitting of the remaining stages
The HTTP processing has been splitted into 7 steps, one of which
is not anymore HTTP-specific (content-switching). That way, it
becomes possible to use "use_backend" rules in TCP mode. A new
"use_server" directive should follow soon.
2009-07-07 15:10:31 +02:00
Willy Tarreau
3a816293e9 [MEDIUM] session: tell analysers what bit they were called for
Some stream analysers might become generic enough to be called
for several bits. So we cannot have the analyser bit hard coded
into the analyser itself. Let's make the caller inform the callee.
2009-07-07 10:55:49 +02:00
Willy Tarreau
d787e6648c [MEDIUM] http: split request waiter from request processor
We want to split several steps in HTTP processing so that
we can call individual analysers depending on what processing
we want to perform. The first step consists in splitting the
part that waits for a request from the rest.
2009-07-07 10:14:51 +02:00
Willy Tarreau
dc340a900d [MEDIUM] splice: set the capability on each stream_interface
The splice code did not consider compatibility between both ends
of the connection. Now we set different capabilities on each
stream interface, depending on what the protocol can splice to/from.
Right now, only TCP is supported. Thanks to this, we're now able to
automatically detect when splice() is not implemented and automatically
disable it on one end instead of reporting errors to the upper layer.
2009-06-28 23:10:19 +02:00
Willy Tarreau
5d707e1aaa [MEDIUM] stream_sock: don't close prematurely when nolinger is set
When the nolinger option is used, we must not close too fast because
some data might be left unsent. Instead we must proceed with a normal
shutdown first, then a close. Also, we want to avoid merging FIN with
the last segment if nolinger is set, because if that one gets lost,
there is no chance for it to be retransmitted.
2009-06-28 11:09:07 +02:00
Willy Tarreau
5d01a63b78 [MEDIUM] config: support loading multiple configuration files
We now support up to 10 distinct configuration files. They are
all loaded in the order defined by -f <file1> -f <file2> ...

This can be useful in order to store global, private, public,
etc... configurations in distinct files.
2009-06-23 08:17:17 +02:00
Willy Tarreau
915e1ebe63 [MEDIUM] config: split parser and checker in two functions
This is a first step towards support of multiple configuration files.
Now readcfgfile() only reads a file in memory and performs very minimal
parsing. The checks are performed afterwards.
2009-06-23 08:17:17 +02:00
Willy Tarreau
c9fe4562c2 [MINOR] make DEFAULT_MAXCONN user-configurable at build time
The only way to set this previously was to set SYSTEM_MAXCONN
which serves a different purpose.
2009-06-15 16:34:03 +02:00
Willy Tarreau
be1b91842a [MEDIUM] add support for TCP MSS adjustment for listeners
Sometimes it can be useful to limit the advertised TCP MSS on
incoming connections, for instance when requests come through
a VPN or when the system is running with jumbo frames enabled.

Passing the "mss <value>" arguments to a "bind" line will set
the value. This works under Linux >= 2.6.28, and maybe a few
earlier ones, though due to an old kernel bug most of earlier
versions will probably ignore it. It is also possible that some
other OSes will support this.
2009-06-14 18:48:19 +02:00
Willy Tarreau
d88edf2e52 [MEDIUM] implement tcp-smart-connect option at the backend
This new option enables combining of request buffer data with
the initial ACK of an outgoing TCP connection. Doing so saves
one packet per connection which is quite noticeable on workloads
mostly consisting in small objects. The option is not enabled by
default.
2009-06-14 15:48:17 +02:00
Willy Tarreau
fb14edc215 [MEDIUM] stream_sock: implement tcp-cork for use during shutdowns on Linux
Setting TCP_CORK on a socket before sending the last segment enables
automatic merging of this segment with the FIN from the shutdown()
call. Playing with TCP_CORK is not easy though as we have to track
the status of the TCP_NODELAY flag since both are mutually exclusive.
Doing so saves one more packet per session and offers about 5% more
performance.

There is no reason not to do it, so there is no associated option.
2009-06-14 15:24:37 +02:00
Willy Tarreau
9ea05a790f [MEDIUM] implement option tcp-smart-accept at the frontend
This option disables TCP quick ack upon accept. It is also
automatically enabled in HTTP mode, unless the option is
explicitly disabled with "no option tcp-smart-accept".

This saves one packet per connection which can bring reasonable
amounts of bandwidth for servers processing small requests.
2009-06-14 12:07:01 +02:00
Willy Tarreau
84b57dae4a [MINOR] config: track "no option"/"option" changes
Sometimes we would want to implement implicit default options,
but for this we need to be able to disable them, which requires
to keep track of "no option" settings. With this change, an option
explicitly disabled in a defaults section will still be seen as
explicitly disabled. There should be no regression as nothing makes
use of this yet.
2009-06-14 11:10:45 +02:00
Willy Tarreau
c6f4ce8fc4 [MEDIUM] add support for binding to source port ranges during connect
Some users are already hitting the 64k source port limit when
connecting to servers. The system usually maintains a list of
unused source ports, regardless of the source IP they're bound
to. So in order to go beyond the 64k concurrent connections, we
have to manage the source ip:port lists ourselves.

The solution consists in assigning a source port range to each
server and use a free port in that range when connecting to that
server, either for a proxied connection or for a health check.
The port must then be put back into the server's range when the
connection is closed.

This mechanism is used only when a port range is specified on
a server. It makes it possible to reach 64k connections per
server, possibly all from the same IP address. Right now it
should be more than enough even for huge deployments.
2009-06-10 12:23:32 +02:00
Willy Tarreau
13a34bd110 [MINOR] compute the max of sessions/s on fe/be/srv
Some users want to keep the max sessions/s seen on servers, frontends
and backends for capacity planning. It's easy to grab it while the
session count is updated, so let's keep it.
2009-05-10 18:52:49 +02:00
Willy Tarreau
f7edefa413 [MINOR] implement per-logger log level limitation
Some people are using haproxy in a shared environment where the
system logger by default sends alert and emerg messages to all
consoles, which happens when all servers go down on a backend for
instance. These people can not always change the system configuration
and would like to limit the outgoing messages level in order not to
disturb the local users.

The addition of an optional 4th field on the "log" line permits
exactly this. The minimal log level ensures that all outgoing logs
will have at least this level. So the logs are not filtered out,
just set to this level.
2009-05-10 17:20:05 +02:00
Benoit
affb481f1a [MEDIUM] add support for "balance hdr(name)"
There is a patch made by me that allow for balancing on any http header
field.

[WT:
  made minor changes:
  - turned 'balance header name' into 'balance hdr(name)' to match more
    closely the ACL syntax for easier future convergence
  - renamed the proxy structure fields header_* => hh_*
  - made it possible to use the domain name reduction to any header, not
    only "host" since it makes sense to do it with other ones.
  Otherwise patch looks good.
/WT]
2009-05-10 15:50:15 +02:00
Willy Tarreau
946ba59190 [MINOR] standard: provide a new 'my_strndup' function
This function is only offered by GNU extensions and is sometimes
useful during configuration parsing.
2009-05-10 15:41:18 +02:00
Willy Tarreau
c9bd0cc224 [MINOR] add options dontlog-normal and log-separate-errors
Some big traffic sites have trouble dealing with logs and tend to
disable them. Here are two new options to help cope with massive
logs.

  - dontlog-normal only disables logging for 100% successful
    connections, other ones will still be logged

  - log-separate-errors will cause non-100% successful connections
    to be logged at level "err" instead of level "info" so that a
    properly configured syslog daemon can send them to a different
    file for longer conservation.
2009-05-10 11:57:02 +02:00
Willy Tarreau
8f38bd0497 [MINOR] add basic signal handling functions
These functions will be used to deliver asynchronous signals in order
to make the signal handling functions more robust. The goal is to keep
the same interface to signal handlers.
2009-05-10 09:24:23 +02:00
Maik Broemme
2850cb42b6 [MINOR] add X-Original-To: header
I have attached a patch which will add on every http request a new
header 'X-Original-To'. If you have HAProxy running in transparent mode
with a big number of SQUID servers behind it, it is very nice to have
the original destination ip as a common header to make decisions based
on it.

The whole thing is configurable with a new option 'originalto'. I have
updated the sourcecode as well as the documentation. The 'haproxy-en.txt'
and 'haproxy-fr.txt' files are untouched, due to lack of my french
language knowledge. ;)

Also the patch adds this header for IPv4 only. I haven't any IPv6 test
environment running here and don't know if getsockopt() with SO_ORIGINAL_DST
will work on IPv6. If someone knows it and wants to test it I can modify
the diff. Feel free to ask me questions or things which should be changed. :)

--Maik
2009-05-01 16:22:33 +02:00
Willy Tarreau
3b88d441e9 [MINOR] switch all stat counters to 64-bit
The byte counters have long been 64-bit to avoid overflows. But with
several sites nowadays, we see session counters wrap around every 10-days
or so. So it was the moment to switch counters to 64-bit, including
error and warning counters which can theorically rise as fast as session
counters even if in practice there is very low risk.

The performance impact should not be noticeable since those counters are
only updated once per session. The stats output have been carefully checked
for proper types on both 32- and 64-bit platforms.
2009-04-11 20:44:08 +02:00
Willy Tarreau
40d2516371 [BUILD] add format(printf) to printf-like functions
Doing this helps catching warnings about wrong output formats.
2009-04-03 12:01:47 +02:00
Willy Tarreau
4076a15255 [MEDIUM] http: capture invalid requests/responses even if accepted
It's useful to be able to accept an invalid header name in a request
or response but still be able to monitor further such errors. Now,
when an invalid request/response is received and accepted due to
an "accept-invalid-http-{request|response}" option, the invalid
request will be captured for later analysis with "show errors" on
the stats socket.
2009-04-02 21:36:37 +02:00
Willy Tarreau
32a4ec0ed7 [MEDIUM] http: add options to ignore invalid header names
Sometimes it is required to let invalid requests pass because
applications sometimes take time to be fixed and other servers
do not care. Thus we provide two new options :

     option accept-invalid-http-request  (for the frontend)
     option accept-invalid-http-response (for the backend)

When those options are set, invalid requests or responses do
not cause a 403/502 error to be generated.
2009-04-02 21:36:34 +02:00
Willy Tarreau
61d188920e [MINOR] improve reporting of misplaced acl/reqxxx rules
Now we can detect improper ordering of "block", "reqxxx", "reqadd",
"redirect" and "use_backend", and warn the user accordingly.
2009-03-31 10:49:21 +02:00
Willy Tarreau
e7239b5152 [MINOR] implement ulltoh() to write HTML-formatted numbers
This function sets CSS letter spacing after each 3rd digit. The page must
create a class "rls" (right letter spacing) with style "letter-spacing: 0.3em"
in order to use it.
2009-03-29 13:41:58 +02:00
Willy Tarreau
3884cbaae6 [MINOR] show sess: report number of calls to each task
For debugging purposes, it can be useful to know how many times each
task has been called.
2009-03-28 17:54:35 +01:00
Willy Tarreau
1b194fe03e [OPTIM] buffer: new BF_READ_DONTWAIT flag reduces EAGAIN rates
When the reader does not expect to read lots of data, it can
set BF_READ_DONTWAIT on the request buffer. When it is set,
the stream_sock_read callback will not try to perform multiple
reads, it will return after only one, and clear the flag.
That way, we can immediately return when waiting for an HTTP
request without trying to read again.

On pure request/responses schemes such as monitor-uri or
redirects, this has completely eliminated the EAGAIN occurrences
and the epoll_ctl() calls, resulting in a performance increase of
about 10%. Similar effects should be observed once we support
HTTP keep-alive since we'll immediately disable reads once we
get a full request.
2009-03-21 21:57:30 +01:00
Willy Tarreau
6f4a82c7af [OPTIM] stream_sock: don't retry to read after a large read
If we get very large data at once, it's almost certain that it's
worthless trying to read again, because we got everything we could
get.

Doing this has made all -EAGAIN disappear from splice reads. The
threshold has been put in the global tunable structures so that if
we one day want to make it accessible from user config, it will be
easy to do so.
2009-03-21 20:43:57 +01:00
Willy Tarreau
c7bdf09f9f [MINOR] stats: report number of tasks (active and running)
It may be useful for statistics purposes to report the number of
tasks.
2009-03-21 18:33:52 +01:00
Willy Tarreau
a461318f97 [MINOR] task: keep a task count and clean up task creators
It's sometimes useful at least for statistics to keep a task count.
It's easy to do by forcing the rare task creators to always use the
same functions to create/destroy a task.
2009-03-21 18:13:21 +01:00
Willy Tarreau
26ca34e66e [BUG] scheduler: fix improper handling of duplicates __task_queue()
The top of a duplicate tree is not where bit == -1 but at the most
negative bit. This was causing tasks to be queued in reverse order
within duplicates. While this is not dramatic, it's incorrect and
might lead to longer than expected duplicate depths under some
circumstances.
2009-03-21 12:57:06 +01:00
Willy Tarreau
e35c94a748 [MEDIUM] scheduler: get rid of the 4 trees thanks and use ebtree v4.1
Since we're now able to search from a precise expiration date in
the timer tree using ebtree 4.1, we don't need to maintain 4 trees
anymore. Not only does this simplify the code a lot, but it also
ensures that we can always look 24 days back and ahead, which
doubles the ability of the previous scheduler. Indeed, while based
on absolute values, the timer tree is now relative to <now> as we
can always search from <now>-31 bits.

The run queue uses the exact same principle now, and is now simpler
and a bit faster to process. With these changes alone, an overall
0.5% performance gain was observed.

Tests were performed on the few wrapping cases and everything works
as expected.
2009-03-21 10:25:14 +01:00
Willy Tarreau
5804434a0f [MINOR] update ebtree to version 4.1
Ebtree version 4.1 brings lookup by ranges. This will be useful for
the scheduler.
2009-03-21 10:23:36 +01:00
Willy Tarreau
844553303d [BUG] session: errors were not reported in termination flags in TCP mode
In order to get termination flags properly updated, the session was
relying a bit too much on http_return_srv_error() which is http-centric.

A generic srv_error function was implemented in the session in order to
catch all connection abort situations. It was then noticed that a request
abort during a connection attempt was not reported, which is now fixed.

Read and write errors/timeouts were not logged either. It was necessary
to add those tests at 4 new locations.

Now it looks like everything is correctly logged. Most likely some error
checking code could now be removed from some analysers.
2009-03-15 22:34:05 +01:00
Willy Tarreau
5af24efee9 [CLEANUP] config: catch and report some possibly wrong rule ordering
There are some configurations in which redirect rules are declared
after use_backend rules. We can also find "block" rules after any
of these ones. The processing sequence is :
  - block
  - redirect
  - use_backend

So as of now we try to detect wrong ordering to warn the user about
a possibly undesired behaviour.
2009-03-15 15:23:16 +01:00
Willy Tarreau
ff01a21ebe [MINOR] cfgparse: some cleanups in the consistency checks
Check for servers in health mode, for health mode in pure-backends.
Some code have been refactored for better organization.
2009-03-15 13:46:16 +01:00
Willy Tarreau
e8a28bf165 [MINOR] buffers: implement buffer_flush()
This function will flush the buffer's data, which means that all data
remaining in the buffer will be scheduled for sending.
2009-03-08 21:12:04 +01:00
Willy Tarreau
6f0aa476bd [CLEANUP] buffer_flush() was misleading, rename it as buffer_erase 2009-03-08 20:33:29 +01:00
Willy Tarreau
531cf0cf8d [OPTIM] task: reduce the number of calls to task_queue()
Most of the time, task_queue() will immediately return. By extracting
the preliminary checks and putting them in an inline function, we can
significantly reduce the number of calls to the function itself, and
most of the tests can be optimized away due to the caller's context.

Another minor improvement in process_runnable_tasks() consisted in
taking benefit from the processor's branch prediction unit by making
a special case of the process_session() callback which is by far the
most common one.

All this improved performance by about 1%, mainly during the call
from process_runnable_tasks().
2009-03-08 16:35:27 +01:00
Willy Tarreau
d0a201b35c [CLEANUP] task: distinguish between clock ticks and timers
Timers are unsigned and used as tree positions. Ticks are signed and
used as absolute date within current time frame. While the two are
normally equal (except zero), it's important not to confuse them in
the code as they are not interchangeable.

We add two inline functions to turn each one into the other.

The comments have also been moved to the proper location, as it was
not easy to understand what was a tick and what was a timer unit.
2009-03-08 15:58:07 +01:00
Willy Tarreau
26c250683f [MEDIUM] minor update to the task api: let the scheduler queue itself
All the tasks callbacks had to requeue the task themselves, and update
a global timeout. This was not convenient at all. Now the API has been
simplified. The tasks callbacks only have to update their expire timer,
and return either a pointer to the task or NULL if the task has been
deleted. The scheduler will take care of requeuing the task at the
proper place in the wait queue.
2009-03-08 09:38:41 +01:00
Willy Tarreau
4726f53794 [OPTIM] task: don't unlink a task from a wait queue when waking it up
In many situations, we wake a task on an I/O event, then queue it
exactly where it was. This is a real waste because we delete/insert
tasks into the wait queue for nothing. The only reason for this is
that there was only one tree node in the task struct.

By adding another tree node, we can have one tree for the timers
(wait queue) and one tree for the priority (run queue). That way,
we can have a task both in the run queue and wait queue at the
same time. The wait queue now really holds timers, which is what
it was designed for.

The net gain is at least 1 delete/insert cycle per session, and up
to 2-3 depending on the workload, since we save one cycle each time
the expiration date is not changed during a wake up.
2009-03-08 07:59:18 +01:00
Willy Tarreau
79584225e5 [OPTIM] rate-limit: cleaner behaviour on low rates and reduce consumption
The rate-limit was applied to the smoothed value which does a special
case for frequencies below 2 events per period. This caused irregular
limitations when set to 1 session per second.

The proper way to handle this is to compute the number of remaining
events that can occur without reaching the limit. This is what has
been added. It also has the benefit that the frequency calculation
is now done once when entering event_accept(), before the accept()
loop, and not once per accept() loop anymore, thus saving a few CPU
cycles during very high loads.

With this fix, rate limits of 1/s are perfectly respected.
2009-03-06 09:18:27 +01:00
Willy Tarreau
3a7d20781d [MEDIUM] implement "rate-limit sessions" for the frontend
The new "rate-limit sessions" statement sets a limit on the number of
new connections per second on the frontend. As it is extremely accurate
(about 0.1%), it is efficient at limiting resource abuse or DoS.
2009-03-05 23:48:25 +01:00
Willy Tarreau
7f062c4193 [MEDIUM] measure and report session rate on frontend, backends and servers
With this change, all frontends, backends, and servers maintain a session
counter and a timer to compute a session rate over the last second. This
value will be very useful because it varies instantly and can be used to
check thresholds. This value is also reported in the stats in a new "rate"
column.
2009-03-05 18:43:00 +01:00
Willy Tarreau
755905857a [MINOR] add curr_sec_ms and curr_sec_ms_scaled for current second.
Several algorithms will need to know the millisecond value within
the current second. Instead of doing a divide every time it is needed,
it's better to compute it when it changes, which is when now and now_ms
are recomputed.

curr_sec_ms_scaled is the same multiplied by 2^32/1000, which will be
useful to compute some ratios based on the position within last second.
2009-03-05 16:56:16 +01:00
Willy Tarreau
776cd87e32 [MINOR] time: add __usec_to_1024th to convert usecs to 1024th of second
This function performs a fast conversion from usec to 1024th of a second,
and will be useful for many fast sub-second computations.
2009-03-05 00:34:01 +01:00
Willy Tarreau
74808cb907 [MEDIUM] implement error dump on unix socket with "show errors"
The new "show errors" command sent on a unix socket will dump
all captured request and response errors for all proxies. It is
also possible to bound the log to frontends and backends whose
ID is passed as an optional parameter.

The output provides information about frontend, backend, server,
session ID, source address, error type, and error position along
with a complete dump of the request or response which has caused
the error.

If a new error scratches the one currently being reported, then
the dump is aborted with a warning message, and processing goes
on to next error.
2009-03-04 15:53:18 +01:00
Willy Tarreau
f073a83b1d [MEDIUM] store a complete dump of request and response errors in proxies
Each proxy instance, either frontend or backend, now has some room
dedicated to storing a complete dated request or response in case
of parsing error. This will make it possible to consult errors in
order to find the exact cause, which is particularly important for
troubleshooting faulty applications.
2009-03-04 10:26:38 +01:00
Willy Tarreau
0b9c02c861 [MEDIUM] implement bind-process to limit service presence by process
The "bind-process" keyword lets the admin select which instances may
run on which process (in multi-process mode). It makes it easier to
more evenly distribute the load across multiple processes by avoiding
having too many listen to the same IP:ports.
2009-02-04 22:05:05 +01:00
Willy Tarreau
c76721da57 [MEDIUM] add support for source interface binding at the server level
Add support for "interface <name>" after the "source" statement on
the server line.
2009-02-04 20:20:58 +01:00
Willy Tarreau
d53f96b3f0 [MEDIUM] add support for source interface binding
Specifying "interface <name>" after the "source" statement allows
one to bind to a specific interface for proxy<->server traffic.

This makes it possible to use multiple links to reach multiple
servers, and to force traffic to pass via an interface different
from the one the system would have chosen based on the routing
table.
2009-02-04 18:46:54 +01:00
Willy Tarreau
5e6e204d1c [MINOR] add support for bind interface name
By appending "interface <name>" to a "bind" line, it is now possible
to specifically bind to a physical interface name. Note that this
currently only works on Linux and requires root privileges.
2009-02-04 17:19:29 +01:00
Willy Tarreau
3ab68cf0ae [MEDIUM] splice: add the global "nosplice" option
Setting "nosplice" in the global section will disable the use of TCP
splicing (both tcpsplice and linux 2.6 splice). The same will be
achieved using the "-dS" parameter on the command line.
2009-01-25 16:03:28 +01:00
Willy Tarreau
43b78999ec [MEDIUM] move global tuning options to the global structure
The global tuning options right now only concern the polling mechanisms,
and they are not in the global struct itself. It's not very practical to
add other options so let's move them to the global struct and remove
types/polling.h which was not used for anything else.
2009-01-25 15:42:27 +01:00
Willy Tarreau
3eba98aa57 [MEDIUM] splice: make use of pipe pools
Using pipe pools makes pipe management a lot easier. It also allows to
remove quite a bunch of #ifdefs in areas which depended on the presence
or not of support for kernel splicing.

The buffer now holds a pointer to a pipe structure which is always NULL
except if there are still data in the pipe. When it needs to use that
pipe, it dynamically allocates it from the pipe pool. When the data is
consumed, the pipe is immediately released.

That way, there is no need anymore to care about pipe closure upon
session termination, nor about pipe creation when trying to use
splice().

Another immediate advantage of this method is that it considerably
reduces the number of pipes needed to use splice(). Tests have shown
that even with 0.2 pipe per connection, almost all sessions can use
splice(), because the same pipe may be used by several consecutive
calls to splice().
2009-01-25 13:56:13 +01:00
Willy Tarreau
982b6e37e4 [MEDIUM] introduce pipe pools
A new data type has been added : pipes. Some pre-allocated empty pipes
are maintained in a pool for users such as splice which use them a lot
for very short times.

Pipes are allocated using get_pipe() and released using put_pipe().
Pipes which are released with pending data are immediately killed.
The struct pipe is small (16 to 20 bytes) and may even be further
reduced by unifying ->data and ->next.

It would be nice to have a dedicated cleanup task which would watch
for the pipes usage and destroy a few of them from time to time.
2009-01-25 13:49:53 +01:00
Willy Tarreau
259de1b702 [MINOR] introduce structures required to support Linux kernel splicing
When CONFIG_HAP_LINUX_SPLICE is defined, the buffer structure will be
slightly enlarged to support information needed for kernel splicing
on Linux.

A first attempt consisted in putting this information into the stream
interface, but in the long term, it appeared really awkward. This
version puts the information into the buffer. The platform-dependant
part is conditionally added and will only enlarge the buffers when
compiled in.

One new flag has also been added to the buffers: BF_KERN_SPLICING.
It indicates that the application considers it is appropriate to
use splicing to forward remaining data.
2009-01-18 21:56:21 +01:00
Willy Tarreau
66aa61f76b [MEDIUM] splice: add configuration options and set global.maxpipes
Three new options have been added when CONFIG_HAP_LINUX_SPLICE is
set :
  - splice-request
  - splice-response
  - splice-auto

They are used to enable splicing per frontend/backend. They are also
supported in defaults sections. The "splice-auto" option is meant to
automatically turn splice on for buffers marked as fast streamers.
This should save quite a bunch of file descriptors.

It was required to add a new "options2" field to the proxy structure
because the original "options" is full.

When global.maxpipes is not set, it is automatically adjusted to
the max of the sums of all frontend's and backend's maxconns for
those which have at least one splice option enabled.
2009-01-18 21:44:07 +01:00
Willy Tarreau
3ec79b9c42 [MINOR] global.maxpipes: add the ability to reserve file descriptors for pipes
This will be needed to use linux's splice() syscall.
2009-01-18 20:39:42 +01:00
Willy Tarreau
03d60bbaf9 [OPTIM] buffer: replace rlim by max_len
In the buffers, the read limit used to leave some place for header
rewriting was set by a pointer to the end of the buffer. Not only
this required subtracts at every place in the code, but this will
also soon not be usable anymore when we want to support keepalive.

Let's replace this with a length limit, comparable to the buffer's
length. This has also sightly reduced the code size.
2009-01-09 11:14:39 +01:00
Willy Tarreau
0abebcc0fb [MEDIUM] i/o: rework ->to_forward and ->send_max
The way the buffers and stream interfaces handled ->to_forward was
really not handy for multiple reasons. Now we've moved its control
to the receive-side of the buffer, which is also responsible for
keeping send_max up to date. This makes more sense as it now becomes
possible to send some pre-formatted data followed by forwarded data.

The following explanation has also been added to buffer.h to clarify
the situation. Right now, tests show that the I/O is behaving extremely
well. Some work will have to be done to adapt existing splice code
though.

/* Note about the buffer structure

   The buffer contains two length indicators, one to_forward counter and one
   send_max limit. First, it must be understood that the buffer is in fact
   split in two parts :
     - the visible data (->data, for ->l bytes)
     - the invisible data, typically in kernel buffers forwarded directly from
       the source stream sock to the destination stream sock (->splice_len
       bytes). Those are used only during forward.

   In order not to mix data streams, the producer may only feed the invisible
   data with data to forward, and only when the visible buffer is empty. The
   consumer may not always be able to feed the invisible buffer due to platform
   limitations (lack of kernel support).

   Conversely, the consumer must always take data from the invisible data first
   before ever considering visible data. There is no limit to the size of data
   to consume from the invisible buffer, as platform-specific implementations
   will rarely leave enough control on this. So any byte fed into the invisible
   buffer is expected to reach the destination file descriptor, by any means.
   However, it's the consumer's responsibility to ensure that the invisible
   data has been entirely consumed before consuming visible data. This must be
   reflected by ->splice_len. This is very important as this and only this can
   ensure strict ordering of data between buffers.

   The producer is responsible for decreasing ->to_forward and increasing
   ->send_max. The ->to_forward parameter indicates how many bytes may be fed
   into either data buffer without waking the parent up. The ->send_max
   parameter says how many bytes may be read from the visible buffer. Thus it
   may never exceed ->l. This parameter is updated by any buffer_write() as
   well as any data forwarded through the visible buffer.

   The consumer is responsible for decreasing ->send_max when it sends data
   from the visible buffer, and ->splice_len when it sends data from the
   invisible buffer.

   A real-world example consists in part in an HTTP response waiting in a
   buffer to be forwarded. We know the header length (300) and the amount of
   data to forward (content-length=9000). The buffer already contains 1000
   bytes of data after the 300 bytes of headers. Thus the caller will set
   ->send_max to 300 indicating that it explicitly wants to send those data,
   and set ->to_forward to 9000 (content-length). This value must be normalised
   immediately after updating ->to_forward : since there are already 1300 bytes
   in the buffer, 300 of which are already counted in ->send_max, and that size
   is smaller than ->to_forward, we must update ->send_max to 1300 to flush the
   whole buffer, and reduce ->to_forward to 8000. After that, the producer may
   try to feed the additional data through the invisible buffer using a
   platform-specific method such as splice().
 */
2009-01-09 10:15:03 +01:00
Willy Tarreau
dcef33fa9b [MINOR] add the splice_len member to the buffer struct in preparation of splice support
In preparation of splice support, let's add the splice_len member
to the buffer struct. An earlier implementation made it conditional,
which made the whole logics very complex due to a large number of
ifdefs.

Now BF_EMPTY is only set once both buf->l and buf->splice_len are
null. Splice_len is initialized to zero during buffer creation and
is currently not changed, so the whole logics remains unaffected.

When splice gets merged, splice_len will reflect the number of bytes
in flight out of the buffer but not yet sent, typically in a pipe for
the Linux case.
2009-01-09 10:15:02 +01:00
Willy Tarreau
6b66f3e4f6 [MAJOR] implement autonomous inter-socket forwarding
If an analyser sets buf->to_forward to a given value, that many
data will be forwarded between the two stream interfaces attached
to a buffer without waking the task up. The same applies once all
analysers have been released. This saves a large amount of calls
to process_session() and a number of task_dequeue/queue.
2009-01-09 10:15:02 +01:00
Willy Tarreau
3ffeba1f67 [MEDIUM] enable inter-stream_interface wakeup calls
By letting the producer tell the consumer there is data to check,
and the consumer tell the producer there is some space left again,
we can cut in half the number of session wakeups.

This is also an important starting point for future splicing support.
2008-12-28 11:09:02 +01:00
Willy Tarreau
b0ef735c71 [MINOR] add flags to indicate when a stream interface is waiting for space/data
It will soon be required to know when a stream interface is waiting for
buffer data or buffer room. Let's add two flags for that.
2008-12-28 11:08:03 +01:00
Willy Tarreau
86491c3164 [MEDIUM] indicate when we don't care about read timeout
Sometimes we don't care about a read timeout, for instance, from the
client when waiting for the server, but we still want the client to
be able to read.

Till now it was done by articially forcing the read timeout to ETERNITY.
But this will cause trouble when we want the low level stream sock to
communicate without waking the session up. So we add a BF_READ_NOEXP
flag to indicate that when the read timeout is to be set, it might
have to be set to ETERNITY.

Since BF_READ_ENA was not used, we replaced this flag.
2008-12-28 11:06:40 +01:00
Willy Tarreau
dd80c6f92d [MEDIUM] don't report buffer timeout when there is I/O activity
We don't want to report a buffer timeout if there was I/O activity
for the same events. That way we'll not have to always re-arm timeouts
on I/O, without the fear of a timeout triggering too fast.
2008-12-28 10:58:52 +01:00
Willy Tarreau
f890dc9003 [MEDIUM] add a send limit to a buffer
For keep-alive, line-mode protocols and splicing, we will need to
limit the sender to process a certain amount of bytes. The limit
is automatically set to the buffer size when analysers are detached
from the buffer.
2008-12-28 10:58:52 +01:00
Willy Tarreau
922a806075 [BUG] do not dequeue the backend's pending connections on a dead server
Kai Krueger found that previous patch was incomplete, because there is
an unconditionnal call to process_srv_queue() in session_free() which
still causes a dead server to consume pending connections from the
backend.

This call was made unconditionnal so that we don't leave unserved
connections in the server queue, for instance connections coming
in with "option persist" which can bypass the server status check.
However, the server must not touch the backend's queue if it is down.

Another fear was that some connections might remain unserved when
the server is using a dynamic maxconn if the number of connections
to the backend is too low. Right now, srv_dynamic_maxconn() ensures
this cannot happen, so the call can remain conditionnal.

The fix consists in allowing a server to process it own queue whatever
its state, but not to touch the backend's queue if it is down. Its
queue should normally be empty when the server is down because it is
redistributed when the server goes down. The only remaining cases are
precisely the persistent connections with "option persist" set, coming
in after the queue has been redispatched. Those ones must still be
processed when a connection terminates.
(cherry picked from commit cd485c4480)
2008-12-07 23:51:12 +01:00
Willy Tarreau
07dc95abf1 [BUG] do not dequeue requests on a dead server
Kai Krueger reported a problem when a server goes down with active
connections. A lot of connections were drained by that server. Kai
did an amazing job at tracking this bug down to the dequeuing
mechanism which forgets to check the server state before allowing
a request to be sent to a server.

The problem occurs more often with long requests, which have a chance
to complete after the server is completely marked down, and to find
requests in the global queue which have not yet been fetched by other
servers.

The fix consists in ensuring that a server is up before sending it
any new request from the queue.
(cherry picked from commit 80b286a064)
(cherry picked from commit 2e5e0d2853f059a1d09dc81fdbbad9fd03124a98)
2008-12-07 23:49:07 +01:00
Willy Tarreau
0140f2553c [MINOR] redirect: add support for "set-cookie" and "clear-cookie"
It is now possible to set or clear a cookie during a redirection. This
is useful for logout pages, or for protecting against some DoSes. Check
the documentation for the options supported by the "redirect" keyword.

(cherry-picked from commit 4af993822e880d8c932f4ad6920db4c9242b0981)
2008-12-07 23:46:38 +01:00
Willy Tarreau
79da4697ca [MINOR] redirect: add support for the "drop-query" option
If "drop-query" is present on a "redirect" line using the "prefix" mode,
then the returned Location header will be the request URI without the
query-string. This may be used on some login/logout pages, or when it
must be decided to redirect the user to a non-secure server.

(cherry-picked from commit f2d361ccd73aa16538ce767c766362dd8f0a88fd)
2008-12-07 23:42:01 +01:00
Willy Tarreau
da250db376 [BUG] ensure that listeners from disabled proxies are correctly unbound.
There is a problem when an instance is marked "disabled". Its ports are
still bound but will not be unbound upon termination. This causes processes
to accumulate during soft restarts, and might even cause failures to restart
new ones due to the inability to bind to the same port.

The ideal solution would be to bind all ports at the end of the configuration
parsing. An acceptable workaround is to unbind all listeners of disabled
proxies. This is what the current patch does.
(cherry picked from commit a944218e9c)
(cherry picked from commit 8cfebbb82b87345bade831920177077e7d25840a)
2008-12-07 23:33:25 +01:00
Willy Tarreau
3dfe6cd095 [MEDIUM] add support for "show sess" in unix stats socket
It is now possible to list all known sessions by issuing "show sess"
on the unix stats socket. The format is not much evolved but it is
very useful for debugging.

The doc has been updated to reflect the new keyword.
2008-12-07 22:41:17 +01:00
Willy Tarreau
62e4f1dedd [MINOR] add back-references to sessions for later use by a dumper.
This is the first step in implementing a session dump tool.
A session dump will need restart points. It will be necessary for
it to get references to sessions which can be moved when the session
dies.

The principle is not that complex : when a session ends, it looks for
any potential back-references. If it finds any, then it moves them to
the next session in the list. The dump function will of course have
to restart from that new point.
2008-12-07 21:57:02 +01:00
Willy Tarreau
bc04ce7cd9 [MINOR] add a new back-reference type : struct bref
This type will be used to maintain back-references to items which
are subject to move between accesses. Typical usage includes session
removal during a listing.
2008-12-07 20:00:15 +01:00
Willy Tarreau
0a46489228 [MINOR] slightly rebalance stats_dump_{raw,http}
Both should process the response buffer equally. They now both
clear the hijack bit once done, and both receive a pointer to
the response buffer in their arguments.
2008-12-07 18:30:00 +01:00
Willy Tarreau
01bf8675ed [MEDIUM] reference the current hijack function in the buffer itself
Instead of calling a hard-coded function to produce data, let's
reference this function into the buffer and call it from there
when BF_HIJACK is set. This goes in the direction of more generic
session management code.
2008-12-07 18:03:29 +01:00
Willy Tarreau
b5654f6ff4 [MINOR] move the listener reference from fd to session
The listener referenced in the fd was only used to check the
listener state upon session termination. There was no guarantee
that the FD had not been reassigned by the moment it was processed,
so this was a bit racy. Having it in the session is more robust.
2008-12-07 16:45:10 +01:00
Willy Tarreau
7e5067d459 [MEDIUM] remove cli_fd, srv_fd, cli_state and srv_state from the session
Those were previously used by the unix sockets only, and could be
removed.
2008-12-07 16:27:56 +01:00
Willy Tarreau
b1356cf4e4 [MAJOR] make unix sockets work again with stats
The unix protocol handler had not been updated during the last
stream_sock changes. This has been done now. There is still a
lot of duplicated code between session.c and proto_uxst.c due
to the way the session is handled. Session.c relies on the existence
of a frontend while it does not exist here.

It is easier to see the difference between the stats part (placed
in dumpstats.c) and the unix-stream part (in proto_uxst.c).

The hijacking function still needs to be dynamically set into the
response buffer, and some cleanup is still required, then all those
changes should be forward-ported to the HTTP part. Adding support
for new keywords should not cause trouble now.
2008-12-07 16:06:43 +01:00
Willy Tarreau
ff8d42ea68 [MINOR] add an analyser state in struct session
It will be very convenient to have an analyser state in the session.
It will always be initialized to zero. The analysers can make use of
it, but must reset it to zero when they leave.
2008-12-07 14:37:09 +01:00
Willy Tarreau
3bc13774e1 [MINOR] pre-set analyser flags on the listener at registration time
In order to achieve more generic accept() code, we can set the request
analysers at the listener registration time. It's better than doing it
during accept(), and allows more code reuse.
2008-12-07 11:50:35 +01:00
Willy Tarreau
70cb633e2c [MINOR] add an analyser code for UNIX stats request
The UNIX stats socket will be analysed like any other protocol. Add
an analyser for it.
2008-12-07 11:28:08 +01:00
Willy Tarreau
e43d42490a [MINOR] declare process_session in session.h, not proto_http.h 2008-12-01 01:35:40 +01:00
Willy Tarreau
59234e91c2 [MEDIUM] rename process_request to http_process_request
Now the function only does HTTP request and nothing else. Also pass
the request buffer to it.
2008-11-30 23:51:27 +01:00
Willy Tarreau
d34af78a34 [MEDIUM] move the HTTP request body analyser out of process_request().
A new function http_process_request_body() has been created to process
the request body. Next step is now to clean up process_request().
2008-11-30 23:36:37 +01:00
Willy Tarreau
60b85b0694 [MEDIUM] extract the HTTP tarpit code from process_request().
The tarpit is now an autonomous independant analyser.
2008-11-30 23:28:40 +01:00
Willy Tarreau
edcf6687d6 [MEDIUM] extract TCP request processing from HTTP
The TCP analyser has moved to proto_tcp.c. Breaking the function
has required finer use of the return value and adding some tests
to process_session().
2008-11-30 23:15:34 +01:00
Willy Tarreau
b025325274 [MINOR] stream_sock_data_finish() should not expose fd
stream_sock_data_finish was still using a file descriptor as only
argument, while a stream interface is preferred. This is now fixed.
2008-11-30 21:37:12 +01:00
Willy Tarreau
0cac36f415 [MEDIUM] make the http server error function a pointer in the session
It was a bit awkward to have session.c call return_srv_error() for
HTTP error messages related to servers. The function has been adapted
to be passed a pointer to the faulty stream interface, and is now a
pointer in the session. It is possible that in the future, it will
become a callback in the stream interface itself.
2008-11-30 20:44:17 +01:00
Willy Tarreau
2d3d94cf23 [MINOR] replace srv_close_with_err() with http_server_error()
The new function looks like the previous one except that it operates
at the stream interface level and assumes an already closed SI.

Also remove some old unused occurrences of srv_close_with_err().
2008-11-30 20:28:57 +01:00
Willy Tarreau
dded32defa [MINOR] replace client_retnclose() with stream_int_retnclose()
This makes more sense to return a message to a stream interface
than to a session.

senddata.{c,h} have been removed.
2008-11-30 19:48:07 +01:00
Willy Tarreau
81acfab4fd [MINOR] replace the ambiguous client_return function by stream_int_return
This one applies to a stream interface, which makes more sense.
2008-11-30 19:22:53 +01:00
Willy Tarreau
a5555ec68a [MINOR] call session->do_log() for logging
In order to avoid having to call per-protocol logging function directly
from session.c, it's better to assign the logging function when the session
is created. This also eliminates a test when the function is needed, and
opens the way to more complete logging functions.
2008-11-30 19:02:32 +01:00
Willy Tarreau
55a8d0e1bb [CLEANUP] move the session-related functions to session.c
proto_http.c was not suitable for session-related processing, it was
just convenient for the tranformation.

Some more splitting must occur: process_request/response in proto_http.c
must be split again per protocol, and the caller must run a list.

Some functions should be directly attached to the session or the buffer
(eg: perform_http_redirect, return_srv_error, http_sess_log).
2008-11-30 18:47:21 +01:00
Willy Tarreau
fe3718ab79 [MAJOR] complete layer4/7 separation
All the processing has now completely been split in layers. As of
now, everything is still in process_session() which is not the right
place, but the code sequence works. Timeouts, retries, errors, all
work.

The shutdown sequence has been strictly applied: BF_SHUTR/BF_SHUTW
are only assigned by lower layers. Upper layers can only indicate
their wish to close using BF_SHUTR_NOW and BF_SHUTW_NOW.

When a shutdown is performed on a stream interface, the buffer flags
are updated accordingly and re-checked by upper layers. A lot of care
has been taken to ensure that aborts during intermediate connection
setups are correctly handled and shutdowns correctly propagated to
both buffers.

A future evolution would consist in ensuring that BF_SHUT?_NOW may
be set at any time, and applies only when the buffer is empty. This
might help with error messages, but might complicate the processing
of data remaining in buffers.

Some useless buffer flag combinations have been removed.

Stat counters are still broken (eg: per-server total number of sessions).

Error messages should be delayed to the close instant and be produced by
protocol.

Many functions must now move to proper locations.
2008-11-30 18:14:12 +01:00
Willy Tarreau
f54f8bdd8d [MINOR] maintain a global session list in order to ease debugging
Now the global variable 'sessions' will be a dual-linked list of all
known sessions. The list element is set at the beginning of the session
so that it's easier to follow them all with gdb.
2008-11-23 19:53:55 +01:00
Willy Tarreau
0a5d5ddeb9 [MEDIUM] remove stream_sock_update_data()
Two new functions are used instead : buffer_check_{shutr,shutw}.
It is indeed more adequate to check for new closures only when the
buffer reports them.

Several remaining unclosed connections were detected after a test,
even before this patch, so a bug remains. To reproduce, try the
following during 30 seconds :

  inject30l4 -n 20000 -l -t 1000 -P 10 -o 4 -u 100 -s 100 -G 127.0.0.1:8000/
2008-11-23 19:31:35 +01:00
Willy Tarreau
74ab2ac7b0 [MEDIUM] stream_interface: added a DISconnected state between CON/EST and CLO
There were rare situations where it was not easy to detect that a failed
session attempt had occurred and needed some server cleanup. In particular,
client aborts sometimes lead to session leaks on the server side.

A new state "SI_ST_DIS" (disconnected) has been introduced for this. When
a session has been closed at a stream interface but the server cleanup has
not occurred, this state is entered instead of CLO. The cleanup is then
performed there and the state goes to CLO.

A new diagram has been added to show possible stream_interface state
transitions that can occur in a stream-sock. It makes debugging easier.
2008-11-23 17:23:07 +01:00
Willy Tarreau
1e62de615b [MEDIUM] add the SN_CURR_SESS flag to the session to track open sessions
It is quite hard to track when the current session has already been counted
or discounted from the server's total number of established sessions. For
this reason, we introduce a new session flag, SN_CURR_SESS, which indicates
if the current session is one of those reported by the server or not. It
simplifies session accounting and makes it far more robust. It also makes
it possible to perform a last-minute cleanup during session_free().

Right now, with this fix and a few more buffer transitions fixes, no session
were found to remain after a test.
2008-11-11 20:26:58 +01:00
Willy Tarreau
cff6411f9a [MAJOR] add a connection error state to the stream_interface
Tracking connection status changes was hard, and some code was
redundant. A new SI_ST_CER state was added to the stream interface
to indicate a past connection error, and an SI_FL_ERR flag was
added to report past I/O error. The stream_sock code does not set
the connection to SI_ST_CLO anymore in case of I/O error, it's
the upper layer which does it. This makes it possible to know
exactly when the file descriptors are allocated.

The new SI_ST_CER state permitted to split tcp_connection_status()
in two parts, one processing SI_ST_CON and the other one SI_ST_CER.
Synchronous connection errors now make use of this last state, hence
eliminating duplicate code.

Some ib<->ob copy paste errors were found and fixed, and all entities
setting SI_ST_CLO also shut the buffers down.

Some of these stream_interface specific functions and structures
have migrated to a new stream_interface.c file.

Some types of errors are still not detected by the buffers. For
instance, let's assume the following scenario in one single pass
of process_session: a connection sits in SI_ST_TAR state during
a retry. At TAR expiration, a new connection attempt is made, the
connection is obtained and srv->cur_sess is increased. Then the
buffer timeout is fires and everything is cleared, the new state
becomes SI_ST_CLO. The cleaning code checks that previous state
was either SI_ST_CON or SI_ST_EST to release the connection. But
that's wrong because last state is still SI_ST_TAR. So the
server's connection count does not get decreased.

This means that prev_state must not be used, and must be replaced
by some transition detection instead of level detection.

The following debugging line was useful to track state changes :

  fprintf(stderr, "%s:%d: cs=%d ss=%d(%d) rqf=0x%08x rpf=0x%08x\n", __FUNCTION__, __LINE__,
          s->si[0].state, s->si[1].state, s->si[1].err_type, s->req->flags, s-> rep->flags);
2008-11-03 06:26:53 +01:00
Willy Tarreau
efb453c259 [MAJOR] migrate the connection logic to stream interface
The connection setup code has been refactored in order to
make it run only on low level (stream interface). Several
complicated functions have been removed from backend.c,
and we now have sess_update_stream_int() to manage
an assigned connection, sess_prepare_conn_req() to assign a
server to a connection request, perform_http_redirect() to
redirect instead of connecting to server, and return_srv_error()
to return connection error status messages.

The stream_interface status changes are checked before adjusting
buffer flags, so that the buffers can be informed about this lower
level update.

A new connection is initiated by changing si->state from SI_ST_INI
to SI_ST_REQ.

The code seems to work but is awfully dirty. Some functions need
to be moved, and the layering is not yet quite clear.

A lot of dead old code has simply been removed.
2008-11-02 10:19:10 +01:00
Willy Tarreau
d7704b5343 [MINOR] add an expiration flag to the stream_sock_interface
This expiration flag is used to indicate that the timer has
expired without having to check it everywhere.
2008-11-02 10:19:10 +01:00
Willy Tarreau
3c6ab2e28d [MEDIUM] use buffer_check_timeouts instead of stream_sock_check_timeouts()
It's more appropriate to use buffer_check_timeouts() to check for buffer
timeouts and si->shutw/shutr to shutdown the stream interfaces.
2008-11-02 10:19:10 +01:00
Willy Tarreau
2eb52f012d [MINOR] add buffer_check_timeouts() to check what timeouts have fired.
This new function sets the BF_*_TIMEOUT flags when a buffer timeout
has expired.
2008-11-02 10:19:10 +01:00
Willy Tarreau
534c5a2224 [OPTIM] add compiler hints in tick_is_expired()
adding those two unlikely() reduces the number of branches taken in
the common path and the size of the code.
2008-11-02 10:19:09 +01:00
Willy Tarreau
3537467679 [MEDIUM] move QUEUE and TAR timers to stream interfaces
It was not practical to have QUEUE and TAR timers in buffers, as they caused
triggering of the timeout flags. Move them to the stream interface where they
belong.
2008-11-02 10:19:09 +01:00
Willy Tarreau
4ffd51a848 [MEDIUM] process_session: make use of the new buffer flags
Now we have almost two distinct parts between tcp and http.
Only the connection establishment code still requires some
resynchronization, the rest does not.
2008-11-02 10:19:09 +01:00
Willy Tarreau
9a2d15429d [MEDIUM] buffers: add BF_READ_ATTACHED and BF_ANA_TIMEOUT
Those two flags will be used to wake up analysers only when
needed.
2008-11-02 10:19:09 +01:00
Willy Tarreau
48adac5db9 [MEDIUM] stream interface: add the ->shutw method as well as in and out buffers
Those entries were really needed for cleaner and better code. Using them
has permitted to automatically close a file descriptor during a shut write,
reducing by 20% the number of calls to process_session() and derived
functions.

Process_session() does not need to know the file descriptor anymore, though
it still remains very complicated due to the special case for the connect
mode.
2008-11-02 10:19:08 +01:00
Willy Tarreau
e5ed406715 [MAJOR] make stream sockets aware of the stream interface
As of now, a stream socket does not directly wake up the task
but it does contact the stream interface which itself knows the
task. This allows us to perform a few cleanups upon errors and
shutdowns, which reduces the number of calls to data_update()
from 8 per session to 2 per session, and make all the functions
called in the process_session() loop completely swappable.

Some improvements are required. We need to provide a shutw()
function on stream interfaces so that one side which closes
its read part on an empty buffer can propagate the close to
the remote side.
2008-11-02 10:19:08 +01:00
Willy Tarreau
eabf313df2 [MINOR] change type of fdtab[]->owner to void*
The owner of an fd was initially a task but this was sometimes
casted to a (struct listener *). We'll soon need more types,
so void* is more appropriate.
2008-11-02 10:19:08 +01:00
Willy Tarreau
fdccded0e8 [MEDIUM] indicate a reason for a task wakeup
It's very frequent to require some information about the
reason why a task is running. Some flags have been added
so that a task now knows if it got woken up due to I/O
completion, timeout, etc...
2008-11-02 10:19:08 +01:00
Willy Tarreau
75cf17ee30 [OPTIM] force inlining of large functions with gcc >= 3
GCC 3 and above do not inline large functions, which is a problem
with ebtree where most core functions are inlined.

This simple patch has both reduced code size and increased speed.
It should be back-ported to ebtree.
2008-11-02 10:19:08 +01:00
Willy Tarreau
4df8206832 [OPTIM] reduce the number of calls to task_wakeup()
A test has shown that more than 16% of the calls to task_wakeup()
could be avoided because the task is already woken up. So make it
inline and move the test to the inline part.
2008-11-02 10:19:07 +01:00
Willy Tarreau
3da77c5abd [MINOR] re-arrange buffer flags and rename some of them
The buffer flags became a big bazaar. Re-arrange them
so that their names are more explicit and so that they
are more easily readable in hex form. Some aggregates
have also been adjusted.
2008-11-02 10:19:07 +01:00
Willy Tarreau
72b179a53c [MEDIUM] reintroduce BF_HIJACK with produce_content
The stats dump are back. Even very large config files with
5000 servers work fast and well. The SN_SELF_GEN flag has
completely been removed.
2008-11-02 10:19:06 +01:00
Willy Tarreau
3a16b2c9cd [MEDIUM] split stream_sock_process_data
It was a waste to constantly update the file descriptor's status
and timeouts during a flags update. So stream_sock_process_data
has been slit in two parts :
  stream_sock_data_update()  => computes updated flags
  stream_sock_data_finish()  => computes timeouts

Only the first one is called during flag updates. The second one
is only called upon completion. The number of calls to fd_set/fd_clr
has now significantly dropped.

Also, it's useless to check for errors and timeouts in the
process_session() loop, it's enough to check for them at the
beginning.
2008-11-02 10:19:06 +01:00
Willy Tarreau
f9839bdffe [MAJOR] make the client side use stream_sock_process_data()
The client side now relies on stream_sock_process_data(). One
part has not yet been re-implemented, it concerns the calls
to produce_content().

process_session() has been adjusted to correctly check for
changing bits in order not to call useless functions too many
times.

It already appears that stream_sock_process_data() should be
split so that the timeout computations are only performed at
the exit of process_session().
2008-11-02 10:19:06 +01:00
Willy Tarreau
2d2127989c [MEDIUM] stream_sock_process_data moved to stream_sock.c
The old temporary process_srv_data function moved to stream_sock.c.
2008-11-02 10:19:05 +01:00
Willy Tarreau
fa7e10251d [MAJOR] rework of the server FSM
srv_state has been removed from HTTP state machines, and states
have been split in either TCP states or analyzers. For instance,
the TARPIT state has just become a simple analyzer.

New flags have been added to the struct buffer to compensate this.
The high-level stream processors sometimes need to force a disconnection
without touching a file-descriptor (eg: report an error). But if
they touched BF_SHUTW or BF_SHUTR, the file descriptor would not
be closed. Thus, the two SHUT?_NOW flags have been added so that
an application can request a forced close which the stream interface
will be forced to obey.

During this change, a new BF_HIJACK flag was added. It will
be used for data generation, eg during a stats dump. It
prevents the producer on a buffer from sending data into it.

  BF_SHUTR_NOW  /* the producer must shut down for reads ASAP  */
  BF_SHUTW_NOW  /* the consumer must shut down for writes ASAP */
  BF_HIJACK     /* the producer is temporarily replaced        */

BF_SHUTW_NOW has precedence over BF_HIJACK. BF_HIJACK has
precedence over BF_MAY_FORWARD (so that it does not need it).

New functions buffer_shutr_now(), buffer_shutw_now(), buffer_abort()
are provided to manipulate BF_SHUT* flags.

A new type "stream_interface" has been added to describe both
sides of a buffer. A stream interface has states and error
reporting. The session now has two stream interfaces (one per
side). Each buffer has stream_interface pointers to both
consumer and producer sides.

The server-side file descriptor has moved to its stream interface,
so that even the buffer has access to it.

process_srv() has been split into three parts :
  - tcp_get_connection() obtains a connection to the server
  - tcp_connection_failed() tests if a previously attempted
    connection has succeeded or not.
  - process_srv_data() only manages the data phase, and in
    this sense should be roughly equivalent to process_cli.

Little code has been removed, and a lot of old code has been
left in comments for now.
2008-11-02 10:19:04 +01:00
Willy Tarreau
ffab5b4ab0 [MEDIUM] merge inspect_exp and txn->exp into request buffer
Since we may have several analysers on a buffer, it's more
convenient to have the analyser timeout attached to the
buffer itself.
2008-08-17 18:03:28 +02:00
Willy Tarreau
c7e961e5f7 [BUILD] fix warning in proto_tcp.c with gcc >= 4
signedness issues.
2008-08-17 17:13:47 +02:00
Willy Tarreau
61eadc028f [BUG] regparm is broken on gcc < 3
Gcc < 3 does not consider regparm declarations for function pointers.
This causes big trouble at least with pollers (and with any function
pointer after all). Disable CONFIG_HAP_USE_REGPARM for gcc < 3.
2008-08-17 17:06:37 +02:00
Willy Tarreau
2df28e8110 [MEDIUM] session: move the analysis bit field to the buffer
It makes more sense to store the list of analysers in the buffer
than in the session since they are precisely plugged onto one
buffer.
2008-08-17 15:20:19 +02:00
Willy Tarreau
26ed74dadc [MEDIUM] use buffer->wex instead of buffer->cex for connect timeout
It's a shame not to use buffer->wex for connection timeouts since by
definition it cannot be used till the connection is not established.
Using it instead of ->cex also makes the buffer processing more
symmetric.
2008-08-17 12:11:14 +02:00